Evaluating Protein-Protein Networks and Interactions with Rate-Distortion Techniques

Background and Question
Real-world networks are complex, comprising vast webs of interconnected elements that perform a diverse array of social and biological functions. Common to many networks is the pressure to be efficiently compressed, either in the brain or in genetic code (1). Biological networks among molecular and cellular components are encoded at various scales in genetic material (2-5), and evolution uses these encodings to propagate network topologies in surviving species. To answer the question of what makes one network more compressible than another, we adapt tools from information theory to quantify the compressibility of a network.

There are two ways to reduce the amount of information needed to represent a sequence: lossless and lossy compression. With lossless compression, all of the original data can be restored exactly after decompression. Our framework instead uses lossy compression, in which some detail is discarded to reduce the overall size while keeping the resulting distortion small, which is important for large-scale biological networks. Several studies have examined directed networks; however, few use real-world biological data, so little is known about the compressibility of complex biological networks. Potential applications include identifying disease genes and drug targets, detecting dysregulated pathways, and characterizing cell physiology in normal and disease states.

Student Name
Michael Pham
Faculty Mentor
Yunan Luo