CMU SCS Large Graph Algorithms Christos Faloutsos CMU
- Slides: 31
Large Graph Algorithms
Christos Faloutsos, CMU
Akoglu, Leman; Chau, Polo; Kang, U; McGlohon, Mary; Prakash, Aditya; Tong, Hanghang; Tsourakakis, Babis
OpenCirrus'10, C. Faloutsos (CMU)
Graphs - why should we care?
Figures: Internet Map [lumeta.com], Food Web [Martinez '91], Protein Interactions [genomebiology.com], Friendship Network [Moody '01]
ICDM-LDMTA 2009, C. Faloutsos
Graphs - why should we care?
• IR: bipartite graphs (doc-terms), documents D1 ... DN and terms T1 ... TM
• web: hyper-text graph
• Social networking sites (Facebook, Twitter)
• Users posing and answering questions
• Click-streams (user-page bipartite graph)
• ... and more: any M:N db relationship
Our goal: One-stop solution for mining huge graphs: PEGASUS project (PEta GrAph mining System)
• www.cs.cmu.edu/~pegasus
• Open-source code and papers
Outline – Algorithms & results
Table rows: Degree Distr., Pagerank, Diameter/ANF, Conn. Comp, Triangles, Visualization
Table columns: Centralized, Hadoop/PEGASUS
Cell entries as extracted (row/column placement not recoverable): old, old, old, DONE, STARTED
HADI for diameter estimation
• Radius Plots for Mining Tera-byte Scale Graphs. U Kang, Charalampos Tsourakakis, Ana Paula Appel, Christos Faloutsos, Jure Leskovec, SDM'10
• Naively: diameter needs O(N**2) space and up to O(N**3) time: prohibitive (N ~ 1B)
• Our HADI: linear on E (~10B)
  – Near-linear scalability wrt # machines
  – Several optimizations -> 5x faster
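To make the idea of a per-node radius concrete, here is a minimal single-machine sketch. It is not HADI itself: it assumes a small undirected graph held in memory and keeps exact reachability sets, where a scalable method would have to use compact approximate counters and a distributed runtime. The names `neighborhood_sizes`, `radius`, and `diameter`, and the adjacency-list format, are hypothetical. Each round, every node merges its neighbors' reachable sets, so round h gives the number of nodes within h hops; a node's radius is the first h at which that count reaches a chosen fraction of its final value.

```python
from typing import Dict, List, Set

Graph = Dict[int, List[int]]  # undirected adjacency lists (hypothetical toy format)


def neighborhood_sizes(adj: Graph, max_hops: int = 64) -> Dict[int, List[int]]:
    """For every node v, return [N(0), N(1), ...]: how many nodes lie within h hops of v."""
    reach: Dict[int, Set[int]] = {v: {v} for v in adj}
    sizes: Dict[int, List[int]] = {v: [1] for v in adj}
    for _ in range(max_hops):
        # One round: every node absorbs what its neighbors could reach so far.
        new_reach = {v: reach[v].union(*(reach[u] for u in adj[v])) for v in adj}
        if all(len(new_reach[v]) == len(reach[v]) for v in adj):
            break  # no set grew, so every neighborhood is complete
        reach = new_reach
        for v in adj:
            sizes[v].append(len(reach[v]))
    return sizes


def radius(sizes_v: List[int], fraction: float = 1.0) -> int:
    """Smallest h whose neighborhood covers `fraction` of the node's final neighborhood."""
    target = fraction * sizes_v[-1]
    return next(h for h, n in enumerate(sizes_v) if n >= target)


def diameter(adj: Graph, fraction: float = 1.0) -> int:
    """Largest radius over all nodes (exact diameter when fraction is 1.0)."""
    sizes = neighborhood_sizes(adj)
    return max(radius(s, fraction) for s in sizes.values())


# Toy example: a path 0 - 1 - 2 - 3 - 4.
path: Graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(diameter(path), radius(neighborhood_sizes(path)[2]))
```

The exact sets are what make the naive approach quadratic in space; the sketch only illustrates the iteration pattern, not the cost profile claimed on the slide.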
YahooWeb graph (120Gb, 1.4B nodes, 6.6B edges)
• Largest publicly available graph ever studied.
[Figure: radius plot; x-axis: Radius, y-axis: Count; annotations: "???" and "19+? [Barabasi+]"]
YahooWeb graph (120Gb, 1.4B nodes, 6.6B edges)
• Effective diameter: surprisingly small.
• Multi-modality: probably a mixture of cores.
Radius Plot of GCC of YahooWeb.
Running time - Kronecker and Erdős-Rényi graphs with billions of edges.
Generalized Iterated Matrix Vector Multiplication (GIMV)
PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations. U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos. ICDM 2009, Miami, Florida, USA. Best Application Paper (runner-up).
Generalized Iterated Matrix Vector Multiplication (GIMV)
Matrix-vector multiplication (iterated):
• PageRank
• proximity (RWR)
• Diameter
• Connected components
• (eigenvectors, Belief Prop., …)
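One way to see how a single iterated matrix-vector primitive can cover several of these tasks is the single-machine sketch below. It assumes the three-operation decomposition (combine2, combineAll, assign) as I recall it from the PEGASUS paper cited on the previous slide; the function names, the edge-list format, and the toy graphs are hypothetical, and nothing here uses Hadoop. Swapping the three operations turns the same loop into PageRank or into connected components.

```python
from typing import Callable, List, Sequence, Tuple

Entry = Tuple[int, int, float]  # (row i, column j, matrix value m_ij)


def gimv(entries: Sequence[Entry], v: List[float],
         combine2: Callable, combine_all: Callable, assign: Callable,
         max_iter: int = 100, tol: float = 1e-12) -> List[float]:
    """Iterate v <- M (x) v, where (x) is defined by the three supplied operations."""
    for _ in range(max_iter):
        partial: List[list] = [[] for _ in v]
        for i, j, m in entries:
            partial[i].append(combine2(m, v[j]))       # per-entry product
        new_v = [assign(v[i], combine_all(partial[i]))  # per-row reduce, then update
                 for i in range(len(v))]
        if max(abs(a - b) for a, b in zip(new_v, v)) <= tol:
            return new_v
        v = new_v
    return v


def pagerank(n: int, links: Sequence[Tuple[int, int]], c: float = 0.85) -> List[float]:
    """PageRank as GIM-V; assumes every node has at least one out-link."""
    out_deg = [0] * n
    for src, _ in links:
        out_deg[src] += 1
    entries = [(dst, src, 1.0 / out_deg[src]) for src, dst in links]
    return gimv(entries, [1.0 / n] * n,
                combine2=lambda m, vj: c * m * vj,
                combine_all=lambda xs: (1.0 - c) / n + sum(xs),
                assign=lambda old, new: new)


def connected_components(n: int, edges: Sequence[Tuple[int, int]]) -> List[float]:
    """Label every node with the smallest node id in its component."""
    entries = [(i, j, 1) for i, j in edges] + [(j, i, 1) for i, j in edges]
    return gimv(entries, list(range(n)),
                combine2=lambda m, vj: m * vj,
                combine_all=lambda xs: min(xs, default=float("inf")),
                assign=min)


print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))
print(pagerank(3, [(0, 1), (1, 2), (2, 0)]))
```

Keeping the loop generic is the design point: only the three small operations change from one algorithm to the next, which is what makes a shared, optimized runtime plausible.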
Example: GIM-V At Work
• Connected Components
[Figure: x-axis: Size, y-axis: Count]
Example: GIM-V At Work
• Connected Components
[Figure: x-axis: Size, y-axis: Count; annotations: "300-size cmpt x 500. Why?" and "1100-size cmpt x 65. Why?"]
Example: GIM-V At Work
• Connected Components
[Figure: x-axis: Size, y-axis: Count; annotation: suspicious financial-advice sites (not existing now)]
Triangles
• Real social networks have a lot of triangles
ASONAM 2009, C. Faloutsos
Triangles
• Real social networks have a lot of triangles
  – Friends of friends are friends
• Q1: how to compute quickly?
• Q2: Any patterns?
Triangles: Computations [Tsourakakis ICDM 2008]
Q: Can we do that quickly?
Triangles are expensive to compute (3-way join; several approx. algos)
Triangles: Computations [Tsourakakis ICDM 2008]
But: triangles are expensive to compute (3-way join; several approx. algos)
Q: Can we do that quickly?
A: Yes! $\#\text{triangles} = \frac{1}{6} \sum_i \lambda_i^3$ (and, because of skewness, we only need the top few eigenvalues!)
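The formula can be checked on a toy graph. The sketch below assumes the eigenvalues are those of the adjacency matrix of a simple undirected graph, so that the sum of their cubes equals trace(A^3), which counts every triangle six times. The eigenvalue routine is a plain cyclic Jacobi rotation method written with the standard library only; it is a stand-in chosen for self-containment, not the solver used in the cited work, and all function names are hypothetical. The optional `top_k` argument keeps only the largest-magnitude eigenvalues, mirroring the "top few eigenvalues" remark.

```python
import math
from itertools import combinations
from typing import List, Optional

Matrix = List[List[float]]


def jacobi_eigenvalues(a: Matrix, max_sweeps: int = 100, tol: float = 1e-18) -> List[float]:
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations (input is not modified)."""
    n = len(a)
    a = [[float(x) for x in row] for row in a]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(t, 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def triangles_from_eigenvalues(adj: Matrix, top_k: Optional[int] = None) -> float:
    """(1/6) * sum of cubed eigenvalues; with top_k set, an approximation from the largest ones."""
    eig = sorted(jacobi_eigenvalues(adj), key=abs, reverse=True)
    if top_k is not None:
        eig = eig[:top_k]
    return sum(lam ** 3 for lam in eig) / 6.0


def triangles_brute_force(adj: Matrix) -> int:
    """Reference count: test every node triple."""
    n = len(adj)
    return sum(1 for i, j, k in combinations(range(n), 3)
               if adj[i][j] and adj[j][k] and adj[i][k])


# Toy graph: a 4-clique (nodes 0-3) plus node 4 attached to node 0.
A = [[0, 1, 1, 1, 1],
     [1, 0, 1, 1, 0],
     [1, 0 + 1, 0, 1, 0],
     [1, 1, 1, 0, 0],
     [1, 0, 0, 0, 0]]
print(triangles_brute_force(A), triangles_from_eigenvalues(A), triangles_from_eigenvalues(A, top_k=1))
```

With all eigenvalues the two counts agree up to rounding; how close the `top_k` estimate gets depends on how skewed the spectrum of the particular graph is.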
Triangles: Computations [Tsourakakis ICDM 2008]
[Figure] 1000x+ speed-up, high accuracy
Triangles
• Easy to implement on Hadoop: it only needs eigenvalues (working on it, using Lanczos)
Triangle Law: #1 [Tsourakakis ICDM 2008]
[Figure panels: HEP-TH, ASN, Epinions]
X-axis: # of triangles a node participates in
Y-axis: count of such nodes
Triangle Law: #2 [Tsourakakis ICDM 2008]
[Figure panels: Reuters, SN, Epinions]
X-axis: degree
Y-axis: mean # triangles
Notice: slope ~ degree exponent (insets)
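The quantities on the axes of the two triangle-law plots can be computed directly on a small graph. The sketch below is a brute-force illustration under stated assumptions: an undirected simple graph stored as a dictionary of neighbor sets, and hypothetical function names. It produces the count of nodes per triangle-participation value (law #1) and the mean number of triangles per degree (law #2); on real data both would be plotted on log-log axes.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # undirected simple graph as neighbor sets (toy format)


def triangles_per_node(adj: Graph) -> Dict[int, int]:
    """Number of triangles each node participates in: adjacent pairs among its neighbors."""
    return {v: sum(1 for a, b in combinations(sorted(adj[v]), 2) if b in adj[a])
            for v in adj}


def participation_histogram(tri: Dict[int, int]) -> Dict[int, int]:
    """Law #1 data: triangle count -> number of nodes with that count."""
    return dict(Counter(tri.values()))


def mean_triangles_by_degree(adj: Graph, tri: Dict[int, int]) -> Dict[int, float]:
    """Law #2 data: degree -> mean number of triangles over nodes of that degree."""
    by_degree = defaultdict(list)
    for v in adj:
        by_degree[len(adj[v])].append(tri[v])
    return {d: sum(ts) / len(ts) for d, ts in sorted(by_degree.items())}


# Toy graph: a 4-clique (nodes 0-3) plus node 4 attached to node 0.
g: Graph = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
tri = triangles_per_node(g)
print(participation_histogram(tri))
print(mean_triangles_by_degree(g, tri))
```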
Visualization: ShiftR
• Supporting Ad Hoc Sensemaking: Integrating Cognitive, HCI, and Data Mining Approaches. Aniket Kittur, Duen Horng ('Polo') Chau, Christos Faloutsos, Jason I. Hong. Sensemaking Workshop at CHI 2009, April 4-5, Boston, MA, USA.
Conclusions
One-stop shopping for large graph mining:
• www.cs.cmu.edu/~pegasus
Akoglu, Leman; Tsourakakis, Babis; Kang, U; Chau, Polo; McGlohon, Mary
THANKS: NSF, Yahoo (M45), LLNL