A PARALLEL ALGORITHM FOR EXTRACTING TRANSCRIPTIONAL REGULATORY NETWORK MOTIFS Fu Rong Wu


OUTLINE
  • Preliminary
  • Previous Work
  • Method
  • Experimental Result
  • Conclusion

BIOLOGICAL MOTIFS
  • Sequence motif: a sequence pattern of nucleotides in a DNA sequence or of amino acids in a protein
  • Structural motif: a pattern in a protein structure formed by the spatial arrangement of amino acids
  • Network motif: a pattern (sub-graph) that recurs within a network much more often than expected at random

TRANSCRIPTIONAL REGULATORY NETWORK
Describes the interactions between transcription factor proteins and the genes that they regulate.

BIOLOGICAL NETWORK MOTIFS EXAMPLE
  • Autoregulation (AR)
  • Feed Forward Loops (FFL)
  • Bi-Fan
  • Regulating and Regulated Feedback Loops (RFL)
  • Diamond
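To make two of the motif types above concrete, here is a minimal Python sketch that detects autoregulation (a self-loop) and feed-forward loops (edges x→y, y→z and x→z) in a small directed edge list. The function names and the toy network are assumptions made for illustration; they are not taken from the slides.

```python
from itertools import permutations


def find_autoregulation(edges):
    """Return the nodes that regulate themselves (self-loops), sorted."""
    return sorted({u for u, v in edges if u == v})


def find_feed_forward_loops(edges):
    """Return every (x, y, z) with edges x->y, y->z and x->z."""
    # Self-loops are ignored here; they are handled by find_autoregulation.
    edge_set = {(u, v) for u, v in edges if u != v}
    nodes = sorted({n for edge in edge_set for n in edge})
    return [
        (x, y, z)
        for x, y, z in permutations(nodes, 3)
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set
    ]


# Hypothetical toy network (not data from the slides).
toy_edges = [("A", "A"), ("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
print(find_autoregulation(toy_edges))
print(find_feed_forward_loops(toy_edges))
```

Checking all ordered triples is only practical for tiny graphs; it is used here because it mirrors the definition directly.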


PREVIOUS WORK
  • Exhaustive search algorithm
      - Runtime increases dramatically for subgraphs of size ≥ 4.
      - Impractical for finding high-order motifs because of its time complexity.
  • Random sampling algorithm
      - Improves the running time.
      - Only estimates the frequency of subgraphs; it cannot provide an exact solution.
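The cost of exhaustive search can be illustrated with a deliberately naive Python sketch: test every k-node subset for connectivity. This is an assumed brute-force baseline written for illustration, not the specific exhaustive algorithm the slides refer to; the graph is treated as undirected and the toy data is hypothetical.

```python
from itertools import combinations
from math import comb


def is_connected(nodes, adj):
    """Check whether the induced subgraph on `nodes` is connected."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & nodes:  # only follow edges that stay inside the subset
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def exhaustive_connected_subgraphs(adj, k):
    """Return every k-node subset of the graph that induces a connected subgraph."""
    return [c for c in combinations(sorted(adj), k) if is_connected(c, adj)]


# Hypothetical undirected path graph 1-2-3-4-5 (not data from the slides).
toy_adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
print(exhaustive_connected_subgraphs(toy_adj, 3))

# Number of candidate subsets the naive search must test for a
# hypothetical 100-node graph as k grows.
for k in range(3, 7):
    print(k, comb(100, k))
```

The number of candidate subsets grows as C(n, k), which is why this style of search becomes impractical as the motif size increases.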


METHOD
Goal: find motifs in a given graph G(V, E)
  • One master processor
      - Sorts all nodes by degree
      - Partitions the nodes among the slave processors
  • Slave processors
      - Find neighborhoods in the network
      - Find subgraphs within each neighborhood
      - Gather the subgraph sets at the master processor
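The master's two steps can be sketched in plain Python as follows. The slides do not state the sort direction or the partitioning rule, so the descending order and round-robin assignment below are assumptions, as are the function names and the toy graph. A real implementation would distribute the parts over MPI rather than return lists.

```python
def sort_nodes_by_degree(adj):
    """Sort nodes by descending degree (assumed order), breaking ties by label."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))


def partition_round_robin(sorted_nodes, num_slaves):
    """Deal the sorted nodes out to the slaves one at a time (assumed scheme).

    Round-robin over a degree-sorted list spreads high-degree nodes,
    which tend to have the largest neighborhoods, across the slaves.
    """
    return [sorted_nodes[i::num_slaves] for i in range(num_slaves)]


# Hypothetical toy graph as an adjacency map (not data from the slides).
toy_adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
order = sort_nodes_by_degree(toy_adj)
parts = partition_round_robin(order, num_slaves=2)
print(order)
print(parts)
```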

FINDING NEIGHBORHOODS FROM A NETWORK


REVIEW OF BFS
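As a reminder of how breadth-first search works, here is a minimal Python sketch that returns both the distance of each node from the source and the parent pointers that define the BFS tree. The function name and the toy graph are assumptions for illustration; the graph on the original slides is not reproduced here.

```python
from collections import deque


def bfs_tree(adj, source):
    """Run BFS from `source`; return (distance map, parent map)."""
    dist = {source: 0}
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        # Visit neighbours in sorted order so the resulting tree is deterministic.
        for w in sorted(adj[u]):
            if w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent


# Hypothetical undirected toy graph (not the graph from the slides).
toy_adj = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3, 5}, 5: {4}}
dist, parent = bfs_tree(toy_adj, 1)
print(dist)
print(parent)
```

The parent map is the BFS tree: each node points to the node from which it was first discovered.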


EXAMPLE OF BFS TREE


ALGORITHM 1 NBR(G, V)


EXAMPLE OF ALGORITHM 1
(a) A graph G with 8 nodes labeled from 1 to 8
(b) The neighborhood of node 1 in G for motif size k = 4, denoted Nbr(1)
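The body of Algorithm 1 is not included in this transcript, so the following Python sketch is an assumption about what a neighborhood for motif size k could look like: all nodes within k - 1 hops of the root, found by a depth-limited BFS. The bound follows from the fact that any connected k-node subgraph containing the root lies within k - 1 hops of it. The function name and the toy graph are hypothetical.

```python
from collections import deque


def neighborhood(adj, root, k):
    """Return the nodes within k - 1 hops of `root` (assumed definition of Nbr)."""
    max_depth = k - 1
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if dist[u] == max_depth:
            continue  # do not expand beyond the depth limit
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)


# Hypothetical path graph 1-2-3-4-5-6 (not the 8-node graph from the slide).
toy_adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
print(sorted(neighborhood(toy_adj, 1, 4)))
```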

EXAMPLE FOR ALGORITHM 2


EXAMPLE FOR ALGORITHM 3 Subgraph from (c)



EXPERIMENTAL RESULT
  • The cluster has 32 machines with two 2.4 GHz processors.
  • The programs are written in C with the MPI library.

EXPERIMENTAL RESULT
  • Real data set: interactions between transcription factors and operons in an E. coli network from the RegulonDB database.
  • Each protein complex of a transcription factor or a gene is represented by a node.

EXPERIMENTAL RESULT
Precision / Recall
Given the numbers of true positives (TP), false positives (FP) and false negatives (FN):
  Recall = TP / (TP + FN)
  Precision = TP / (TP + FP)
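The two formulas translate directly into code. The sketch below uses hypothetical counts, not results from the experiments, and returns 0.0 when a denominator is zero, which is a convention chosen here rather than something the slides specify.

```python
def precision_recall(tp, fp, fn):
    """Compute (precision, recall) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only.
precision, recall = precision_recall(tp=8, fp=2, fn=0)
print(precision, recall)
```

An exact enumeration method misses no true motifs, so FN = 0 and recall is 1; a sampling method can miss some, which lowers recall.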

EXPERIMENTAL RESULT
For k = 6:
  • Total number: 15747
  • Motif number: 22532584

EXPERIMENTAL RESULT



CONCLUSION
  • This parallel algorithm accurately finds all high-order network motifs with a fast running time.
  • High-order motifs provide important information on biological system design.