Fast Exact Graph Diameter Computation with Vertex Programming

Fast, Exact Graph Diameter Computation with Vertex Programming
Vertex-Centric Computing for Large Scale Graph Analytics
Corey Pennycuff and Tim Weninger
SIGKDD Workshop on High Performance Graph Mining
August 10, 2015

Dijkstra’s Single Source Shortest Path
[Figure: example graph on vertices A to G with source A (distance 0) and a table of shortest-path distances.]
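
For reference, the sequential baseline named on this slide can be sketched in a few lines. This is a minimal illustration using Python's standard heapq module; the graph example_graph and its weights are hypothetical and are not the graph drawn on the slide.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source; graph maps vertex -> [(neighbour, weight)]."""
    dist = {source: 0}
    heap = [(0, source)]                      # priority queue of (distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                          # stale queue entry, already improved
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical weighted, undirected graph (not the one on the slide).
example_graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("A", 1), ("C", 2)],
    "C": [("A", 4), ("B", 2), ("D", 1)],
    "D": [("C", 1)],
}
distances = dijkstra(example_graph, "A")
```

The whole graph and the priority queue live in one process here, which is the constraint the following slides on medium and bigger graphs address.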

Medium Graphs
• 4 million nodes
• 200 million edges

Bigger Graphs
Solution: Hadoop
[Figure: Hadoop pipeline: data → mappers → shuffle and sort → reducers → result, with DISK marked at every stage.]

Graph Diameter
• HADI
• Reverse Cuthill-McKee
• Random BFS

Bulk Synchronous Parallel (BSP)
Created in 1990 by Les Valiant and Bill McColl at Oxford
[Figure: data is read from disk, kept in memory across Superstep 0 to Superstep 3 with a barrier between supersteps, and the result is written at the end.]

Graph Analytics with BSP
Requires the programmer to “think like a vertex”
[Figure: example graph with vertices A to F.]

The Vertex
Each vertex can:
• Receive messages from the previous superstep
• Modify its value/datum
• Send messages

BSP Single Source Shortest Path
[Figure: example graph with vertices A to G.]

    compute(MessageIterator* msgs) {
        // Keep the smallest distance received in this superstep.
        bool changed = false;
        foreach (msg : msgs) {
            if (msg < datum) {
                datum = msg;
                changed = true;
            }
        }
        if (changed) {
            // Propagate the improved distance along every out-edge.
            foreach (edge : GetOutEdgeIterator()) {
                sendMessageTo(edge.dest, datum + edge.weight);
            }
        } else {
            voteToHalt();
        }
    }
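
The vertex program on this slide depends on a BSP framework to deliver messages and enforce barriers. The sketch below simulates that loop in a single Python process so the logic can be run directly. The function bsp_sssp, the inbox/outbox dictionaries, and the graph weighted_edges are illustrative assumptions, not part of any framework API, and seeding the source with a message of 0 stands in for the master shown on the later slides.

```python
import math
from collections import defaultdict

def bsp_sssp(edges, source):
    """Simulate the compute() vertex program; edges maps vertex -> [(dest, weight)]."""
    datum = {v: math.inf for v in edges}      # per-vertex value, initially "unreached"
    inbox = {source: [0]}                     # assumed: the master seeds the source with 0
    while inbox:                              # the run ends when no messages are in flight
        outbox = defaultdict(list)
        for v, msgs in inbox.items():         # one compute() call per vertex with messages
            best = min(msgs)
            if best < datum[v]:               # "changed" branch of compute()
                datum[v] = best
                for dest, weight in edges[v]:
                    outbox[dest].append(best + weight)
            # otherwise the vertex votes to halt and sends nothing
        inbox = outbox                        # barrier: messages arrive in the next superstep
    return datum

# Hypothetical weighted, undirected graph used only for illustration.
weighted_edges = {
    "A": [("B", 1), ("C", 4)],
    "B": [("A", 1), ("C", 2)],
    "C": [("A", 4), ("B", 2), ("D", 1)],
    "D": [("C", 1)],
}
bsp_distances = bsp_sssp(weighted_edges, "A")
```

In this run vertex C is first set to 4 and later corrected to 3 when the shorter route through B arrives, which shows why a vertex keeps reacting to messages rather than settling after its first update.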

Dijkstra’s Single Source Shortest Path
[Figure: Superstep 0. The master seeds source vertex A with distance 0; the distance table shows A = 0 and no values for B to G.]
[Figure: Superstep 1. The distance table shows A = 0, B = 1, C = 1 and a value of 2 among D, E, F.]
[Figure: Superstep 2. The distance table shows A = 0, B = 1, C = 1 and a value of 2 among D, E, F.]

Supersteps − 1 = Node Eccentricity
[Figure: example graph with vertices A to G and source A at distance 0, with the final distance table.]
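
The relation on this slide can be checked with a small simulation. The sketch below makes several assumptions: the graph is unweighted, undirected, connected, and has at least two vertices; supersteps are numbered from 0; and "Supersteps" is read as the index of the final superstep, the one in which no vertex changes its value. The function bsp_bfs and the edge list are hypothetical.

```python
from collections import defaultdict

def bsp_bfs(adj, source):
    """Run BSP shortest paths with unit weights; return (dist, index of last superstep)."""
    dist = {}
    inbox = {source: [0]}                     # superstep 0: the source receives distance 0
    superstep = -1
    while inbox:
        superstep += 1
        outbox = defaultdict(list)
        for v, msgs in inbox.items():
            best = min(msgs)
            if v not in dist or best < dist[v]:
                dist[v] = best
                for n in adj[v]:
                    outbox[n].append(best + 1)
            # unchanged vertices vote to halt
        inbox = outbox
    return dist, superstep

# Hypothetical undirected graph on vertices A to G (not the slide's edge set).
edge_list = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"),
             ("D", "E"), ("E", "F"), ("E", "G")]
adj = defaultdict(list)
for u, v in edge_list:
    adj[u].append(v)
    adj[v].append(u)

dist, last_superstep = bsp_bfs(adj, "A")
eccentricity = last_superstep - 1             # the slide's "Supersteps − 1"
```

The extra superstep exists because the farthest vertices still send messages after being reached, and one more round is needed for their neighbours to see those messages and halt.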

Diameter Measurement
[Figure: example graph with vertices A to G, with lists of vertex identifiers (for example "C G D B E A F") shown beside the vertices.]
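
One plausible reading of this slide is that every vertex floods its own identifier, so that all single-source searches run in the same supersteps and the diameter is the largest eccentricity observed. The sketch below follows that reading and should not be taken as the authors' implementation. It assumes an unweighted, undirected, connected, non-empty graph; bsp_diameter, the seen sets, and the edge list are hypothetical names.

```python
from collections import defaultdict

def bsp_diameter(adj):
    """Flood vertex ids from every vertex at once; return (diameter, eccentricities)."""
    seen = {v: {v} for v in adj}              # ids each vertex has already received
    ecc = {v: 0 for v in adj}                 # last superstep in which a new id arrived
    inbox = {v: set() for v in adj}
    for v in adj:                             # superstep 0: send own id to every neighbour
        for n in adj[v]:
            inbox[n].add(v)
    superstep = 0
    while any(inbox.values()):
        superstep += 1
        outbox = {v: set() for v in adj}
        for v, ids in inbox.items():
            new_ids = ids - seen[v]
            if new_ids:                       # an id first seen at superstep k is k hops away
                seen[v] |= new_ids
                ecc[v] = superstep
                for n in adj[v]:
                    outbox[n] |= new_ids      # forward only ids that were new here
        inbox = outbox
    return max(ecc.values()), ecc

# Hypothetical undirected graph on vertices A to G (not the slide's edge set).
pairs = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G")]
graph = defaultdict(list)
for u, v in pairs:
    graph[u].append(v)
    graph[v].append(u)

diameter, eccentricities = bsp_diameter(graph)
```

In this sketch each vertex ends up storing the identifier of every other vertex, so memory grows quadratically with the number of vertices. It also counts hops rather than edge weights, in line with the limitation to unweighted graphs stated on the next slide.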

Limitations
• Must be synchronous
• Designed for unweighted graphs

Performance Results ER-Graphs (p=32%)

Performance Results SF-Graphs (k=3)

Performance Results Real World Graphs

Thank you