
Clustering of Interaction Network
• Definition
  ◦ Process to detect densely connected sub-graphs
  ◦ Determines protein complexes or functional modules
• Difficulties
  ◦ Noisy data (too many false positives or false negatives)
  ◦ Cannot be solved by traditional clustering techniques
    ▪ Difficult to define the pair-wise distance between proteins in the network
    ▪ Protein complexes may overlap
  ◦ Disparate sources of data
    ▪ Different reliabilities: 17%~50%
    ▪ Small overlaps: <17%
University at Buffalo The State University of New York

Protein Interaction Network
• Undirected, unweighted graph
  ◦ Node represents protein, edge represents interaction
• Example of Yeast protein interaction network
• Importance
  ◦ Provides a global view of cellular organization and biological functions
  ◦ Applicable to systematic approaches for functional knowledge discovery
• Problem
  ◦ Large scale
  ◦ Complex connectivity

Structural Property
• Small-world Phenomenon (Watts & Strogatz)
  ◦ Networks that lie between regular and random networks
  ◦ Higher average clustering coefficient than expected by random chance
  ◦ Significantly small average shortest path length
• Scale-free Distribution (Barabasi & Albert)
  ◦ Network growth by preferential attachment
  ◦ Power law degree distribution: a few high degree nodes, many low degree nodes
  ◦ Clustering coefficient distribution independent of degree

Protein Interaction Database        DIP      MIPS
density                             0.0015   (single value in source)
average clustering coefficient      0.2283   0.2878
average shortest path length        4.14     4.43
degree distribution (γ)             1.77     1.64

Callouts: high modularity; hub existence
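The three table quantities (density, average clustering coefficient, average shortest path length) can be computed directly from an adjacency structure. The following is a minimal sketch in plain Python, assuming an undirected, unweighted graph stored as a dict of neighbor sets; the function names and the toy graph are illustrative and do not come from the slides.

```python
from collections import deque
from itertools import combinations

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge counted twice
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def clustering_coefficient(adj, v):
    """Fraction of neighbor pairs of v that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

def average_shortest_path_length(adj):
    """Mean hop distance over all reachable ordered pairs (BFS from each node)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

# Toy graph (hypothetical): a triangle A-B-C with a pendant node D.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(density(toy), average_clustering(toy), average_shortest_path_length(toy))
```

The all-pairs BFS is quadratic in network size, which is acceptable for a network of a few thousand proteins but would need a sampled estimate on much larger graphs.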

Conventional Graph Clustering Approaches
• Density-based Clustering
  ◦ Finding densely connected sub-graphs (e.g. Maximal clique algorithm)
• Hierarchical Clustering
  ◦ Top-down approach: iteratively partitioning a graph (e.g. Minimum cut algorithm)
  ◦ Bottom-up approach: iteratively merging nodes (e.g. Node merging by common neighbors)
• Problems
  ◦ Computationally inefficient
  ◦ Unable to detect overlapping clusters
  ◦ Discard sparsely connected nodes

Functional Influence Model
• Functional Flow
  ◦ Treat each protein of known functional annotation as a ‘source’ of ‘functional flow’ for that function
  ◦ Simulate the spread of this functional flow through the neighborhoods surrounding the sources with a random walk
  ◦ ‘Functional score’: the amount of ‘flow’ that the protein has received for that function
[Figure: nodes u and v, labeled Func(a)]

Functional Influence
• Functional influence based on distance
• Weibull Distribution
  ◦ Curve fitting, where d is the distance between two nodes
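The fitted curve itself is not reproduced in the text of this slide. For orientation, the standard two-parameter Weibull density is given below, with the distance d as its argument. The shape parameter k and scale parameter λ are generic symbols here; whether the authors fitted this density directly or a rescaled variant is an assumption.

```latex
F(d; k, \lambda) =
\begin{cases}
\dfrac{k}{\lambda}\left(\dfrac{d}{\lambda}\right)^{k-1} e^{-(d/\lambda)^{k}}, & d \ge 0,\\[6pt]
0, & d < 0,
\end{cases}
\qquad k > 0,\ \lambda > 0 .
```

For k > 1 the curve rises to a single peak and then decays, so influence would be strongest at a short, nonzero distance and fade for distant nodes.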

Functional Influence Model
• Information Flow Simulation
  ◦ Computation of the functional influence inf_s(x) of s on x ∈ V, based on shortest paths
  ◦ Input: a weighted interaction network and a source node s
  ◦ Output: functional influence pattern of s
• Measurements
  ◦ PathRatio is the natural “aging” or “losing” of information propagation in the network. SPath(s, y) is the set of all shortest paths between node s and node y. PR(s, y) is the PathRatio between node s and node y.
  ◦ PathStrength PS(P) measures the strength of path P using the weights on the edges along P.

Framework of functional influence simulation
• Algorithm
  1. Initialize inf(s).
  2. Compute the initial flow I(s → y), where F(d) is the functional distribution model, d is the distance between node s and node y, and PR(s, y) is the Path Resistance between node s and node y.
  3. Update inf(y).
  4. Repeat step 3 for every node in the network.
  5. Finally, the functional profile is generated for every node in the network.
• Notation
  ◦ Inf(s) is the initial functional influence from node s.
  ◦ Inf_s(y) is the functional influence received by node y from node s.
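The update formulas for steps 2 and 3 are not preserved in the slide text, so the following is only a hedged sketch of the overall loop, not the authors' method. It assumes hop-count distance from a breadth-first search (the slides describe a weighted network), a Weibull-shaped F(d) with arbitrary parameters, and a product form Inf(s) · F(d) · PR(s, y) with a placeholder PR that returns 1. All names are hypothetical.

```python
import math
from collections import deque

def weibull_pdf(d, shape=1.5, scale=2.0):
    """Standard Weibull density; shape and scale are arbitrary illustration values."""
    if d <= 0:
        return 0.0
    z = d / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def bfs_distances(adj, s):
    """Hop distance from s to every reachable node."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def influence_pattern(adj, s, inf_s=1.0, F=weibull_pdf,
                      path_ratio=lambda s, y: 1.0):
    """Assumed form: Inf_s(y) = Inf(s) * F(d(s, y)) * PR(s, y)."""
    pattern = {s: inf_s}                      # step 1: initialize inf(s)
    for y, d in bfs_distances(adj, s).items():
        if y != s:                            # steps 2-4: one flow per reachable node
            pattern[y] = inf_s * F(d) * path_ratio(s, y)
    return pattern                            # step 5: influence pattern of s

# Hypothetical chain a - b - c plus an isolated node z.
g = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "z": set()}
print(influence_pattern(g, "a"))
```

Running `influence_pattern` once per source node and stacking the results would give each node a vector of received influences, which is one plausible reading of the "functional profile" in step 5.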

Functional Module Detection (FMD)

Flowchart for functional module detection

Functional Modularity Detection
• Experimental Data
  ◦ DIP (4935 proteins, 14162 interactions)
• Evaluation
  ◦ Functional categories and annotations from MIPS
  ◦ Hyper-geometric p-value
• Result
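The hyper-geometric p-value used for evaluation measures how unlikely it is that a detected module contains at least k proteins from a functional category by chance. A minimal sketch follows, assuming the usual upper-tail form; the argument names are illustrative, and the example counts other than the 4935 DIP proteins are made up.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """P(X >= k) when drawing n proteins without replacement from N,
    of which K carry the functional annotation.

    N: proteins in the network
    K: proteins annotated with the category
    n: size of the detected module
    k: module members annotated with the category
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical module: 20 proteins, 12 of them in a category of 150 proteins.
p = hypergeom_pvalue(N=4935, K=150, n=20, k=12)
print(f"p-value = {p:.3e}")
```

A smaller p-value indicates stronger enrichment of the category in the module. When many categories are tested per module, a multiple-testing correction is normally applied on top of this raw value.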

Computational Epidemiology
• Computational Epidemiology
  ◦ A multidisciplinary field that uses computational techniques to develop tools and models that aid epidemiologists in their study of the spread of diseases
• Steps
  1. Developing a virus spread and containment response model
  2. Understanding virus spread and identifying critical properties
  3. Applying these findings to real infectious virus spread
  4. Analyzing results of the containment strategy (death toll vs. strategies)

Virus Spread Network Model
• What do nodes and edges represent in the virus spread network model?
  ◦ Node
    ▪ Person (community network)
    ▪ Town or place (road network)
  ◦ Edge
    ▪ Interaction (community network)
    ▪ Pathway (road network)
• Weight of nodes and edges
  ◦ Changes over time t according to the virus spread dynamics model
  ◦ Node weight: status of health (0 ~ 1)
  ◦ Edge weight: status of strength (0 ~ 1)

Model Scheme
• Spread Model
  ◦ Spreading phase: edges that are in the region of spreading will be damaged
• Defense Model
  ◦ Signaling and propagation phase: nodes that have a certain number of damaged edges will send signals to neighbor nodes
  ◦ Defense action phase: nodes that have received a certain level of signals from neighbor nodes will remove all of their edges
[Figure panels: virus progression to neighbor nodes; signaling alarms to neighbor nodes from an infected neighbor node; culling nodes to prevent virus progression]
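The two defense phases can be sketched as a single step over an adjacency structure. This is an illustration under stated assumptions: the slide only says "a certain number" of damaged edges and "a certain level" of signals, so `damage_threshold` and `signal_threshold` are hypothetical parameters, as are the function name and the data layout.

```python
def defense_step(adj, damaged, damage_threshold=1, signal_threshold=2):
    """One round of signaling followed by culling.

    adj:     dict mapping node -> set of neighbor nodes (undirected)
    damaged: set of frozenset({u, v}) edges damaged by the spread phase
    Returns (new_adj, culled_nodes).
    """
    # Signaling and propagation phase: a node with enough damaged edges
    # sends one signal to each of its neighbors.
    signals = {v: 0 for v in adj}
    for v, nbrs in adj.items():
        n_damaged = sum(1 for w in nbrs if frozenset((v, w)) in damaged)
        if n_damaged >= damage_threshold:
            for w in nbrs:
                signals[w] += 1

    # Defense action phase: a node that received enough signals
    # removes all of its edges.
    culled = {v for v, s in signals.items() if s >= signal_threshold}
    new_adj = {
        v: (set() if v in culled else {w for w in nbrs if w not in culled})
        for v, nbrs in adj.items()
    }
    return new_adj, culled

# Hypothetical network: triangle a-b-c with d attached to c; edge a-b is damaged.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
new_adj, culled = defense_step(adj, damaged={frozenset(("a", "b"))})
print(culled, new_adj)
```

In the toy network, node c is warned by both a and b and cuts itself off, which also isolates d: containment protects nodes behind the cut at the cost of connectivity.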

Spread Model
• Spreading Model
  ◦ Simulating disease spreading
  ◦ Damaging nodes and edges that are within the virus spread radius from the center
  ◦ Virus spread by r(t)
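The radius-based spreading phase can be sketched as follows. The form of r(t) is not given in the slide text, so a linear growth r(t) = rate · t is assumed purely for illustration, and an edge is treated as damaged when either endpoint lies within the radius. Node coordinates, names and parameters are hypothetical.

```python
import math

def spread_radius(t, rate=1.0):
    """Assumed spread radius r(t); the real dynamics model is not specified."""
    return rate * t

def damage_edges(positions, edges, center, t, rate=1.0):
    """Return the set of edges with at least one endpoint inside radius r(t).

    positions: dict mapping node -> (x, y) coordinates
    edges:     iterable of (u, v) tuples
    center:    (x, y) origin of the outbreak
    """
    r = spread_radius(t, rate)
    damaged = set()
    for u, v in edges:
        if (math.dist(positions[u], center) <= r
                or math.dist(positions[v], center) <= r):
            damaged.add((u, v))
    return damaged

# Hypothetical road segment along the x-axis with the outbreak at node a.
pos = {"a": (0, 0), "b": (1, 0), "c": (3, 0), "d": (5, 0)}
roads = [("a", "b"), ("b", "c"), ("c", "d")]
for t in (1, 3):
    print(t, damage_edges(pos, roads, center=(0, 0), t=t))
```

Replacing `spread_radius` with a fitted growth curve changes only one function, which keeps the damage rule separate from the dynamics model.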

Defense Model
• Defense Model
  ◦ Simulating the defense system against disease spreading, and message spreading
  ◦ Culling interactions from damaged nodes in order to stop spreading (edge culling shown in green circles)

Problem / Solution Approach
• Which element of the virus spread system has the greatest impact on a containment campaign?
  → Identifying the critical elements of the system by computational modeling and stochastic simulation.
• How can an effective containment campaign be planned to minimize damage from virus spread?
  → Mining the best combination of critical parameters under given conditions.
[Diagram labels: Parameters; Simulation & Analysis; Critical parameter]

Application
• Virus spread simulation on the road network of the city of Oldenburg, Germany
  ◦ Green edges: healthy edges
  ◦ Red edges: edges damaged by the spread process
  ◦ Blue edges: edges damaged by the defense process
[Figure panels: Uncontrolled = 0.02; Intermediate = 0.12; Controlled = 0.22]

Osteoporosis
• Osteoporosis
  ◦ Definition: “a systemic skeletal disease characterized by low bone mass and micro-architectural deterioration of bone tissue leading to enhanced bone fragility and a consequent increase in fracture risk”
  ◦ 25 million people in the United States are affected.
  ◦ $10 billion is spent on medical charges, including rehabilitation and treatment facilities.
  ◦ Research funding will be $200 billion by the year 2040.
[Figure panels: Normal; Osteoporosis]

Challenges
• Diagnosis of Osteoporosis?
  ◦ The traditional method of evaluating bone strength is by assessing bone mineral density (BMD).
• Limitations of BMD
  ◦ A major limitation of BMD is that it incompletely reflects variation in bone strength.
  ◦ Other factors, such as bone microarchitecture, contribute substantially to bone strength.
  ◦ By evaluating bone microstructure we can improve determination of bone quality and strength.
→ Computational Model on Bone Microstructure

Computational Model on Bone Microstructure
• Questions
  ◦ What is a better way to evaluate bone strength?
  ◦ How can we identify fragile locations of the bone structure?
  ◦ Why not approach this problem from a new direction?
    ▪ Consider the problem from a structural point of view.
• Graph-based approach to bone microstructure
  ◦ Bone microstructure contributes to bone strength.
  ◦ Rod-like mineral fibers are represented by edges in a graph.
  ◦ The approach supports quantitative assessment of bone mineral density and bone micro-architecture.

Model Approach
• Bone is not a uniformly solid material, but rather has some spaces between its hard elements.
• Designing a network approach model for the bone microstructure.
• Quantitative assessment of bone mineral density could be successfully done with this approach.

Bone Network Model
• Creating Bone Network
  ◦ Femur bone images from patients with osteoporosis, acquired by DXA scan
  ◦ By image profiling on the DXA scan image, we create a bone network based on the bone density.
• What do nodes and edges represent in the bone network model?
  ◦ Node: fiber binding point for bone cell movements and biochemical interactions
  ◦ Edge: a group of mineralized fibers
• Weight of nodes and edges
  ◦ Node weight: average weight of directly connected edges
  ◦ Edge weight: strength status of mineralized fibers
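The node-weight rule (average weight of directly connected edges) is simple to state in code. The sketch below assumes edge weights are stored in a dict keyed by node pairs; the node labels and weight values are made up for illustration.

```python
def node_weights(edge_weights):
    """Node weight = mean weight of the edges incident to that node.

    edge_weights: dict mapping (u, v) -> strength of the mineralized fibers
    """
    sums, counts = {}, {}
    for (u, v), w in edge_weights.items():
        for node in (u, v):
            sums[node] = sums.get(node, 0.0) + w
            counts[node] = counts.get(node, 0) + 1
    return {node: sums[node] / counts[node] for node in sums}

# Hypothetical three-node fragment of a bone network.
fibers = {("n1", "n2"): 0.8, ("n2", "n3"): 0.4}
print(node_weights(fibers))
```

Because node weights are derived from edge weights, they only need recomputing for the two endpoints of an edge whose strength changes.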

Problem / Solution Approach
• What alternatives to Bone Mineral Density (BMD) exist for determining the strength of bone?
  → Designing a computational model of bone microstructure.
• How can we identify fragile locations of the bone structure?
  → Creating algorithms for mining weak locations from a computational model of bone microstructure.
[Diagram labels: Human Bone; Bone Model]

Identifying Critical Locations
• Information Propagation Model
  ◦ An algorithm to find critical edges in the bone network
  ◦ Measuring the quantity of stress energy in each edge
  ◦ Cutting the most critical edge by the Information Propagation Model
  ◦ Run iteratively to find the next critical edges
  ◦ Stops when the network first becomes disconnected
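The iterate-and-cut loop can be sketched as below. The Information Propagation Model and its stress-energy measure are not specified in the slides, so this sketch swaps in a much simpler stand-in: the weakest remaining edge (lowest weight) is treated as the most critical. Only the loop structure and the stopping rule follow the slide; all names and values are hypothetical.

```python
from collections import deque

def is_connected(nodes, edges):
    """True if every node is reachable from an arbitrary start node."""
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

def find_critical_edges(edge_weights):
    """Cut edges one at a time until the network first becomes disconnected.

    Stand-in criticality: the edge with the lowest weight is cut first.
    Returns the cut edges in the order they were removed.
    """
    remaining = dict(edge_weights)
    nodes = {n for edge in remaining for n in edge}
    cut = []
    while remaining and is_connected(nodes, remaining):
        critical = min(remaining, key=remaining.get)
        cut.append(critical)
        del remaining[critical]
    return cut

# Hypothetical network: triangle a-b-c with d attached to c.
bone = {("a", "b"): 0.9, ("b", "c"): 0.2, ("a", "c"): 0.5, ("c", "d"): 0.7}
print(find_critical_edges(bone))
```

Swapping the `min(...)` selection for a stress-based score would bring the sketch closer to the described method while leaving the stopping rule unchanged.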

Conclusions
• Various applications are generating data very rapidly and in great volume, demanding data mining approaches.
• Network-based approaches look promising for solving complex problems.
• This research requires close collaboration among multidisciplinary groups.
• Semi-supervised approaches that integrate domain knowledge into data mining tools are important to the success of the research.