CS 240A: Breadth-first search in Cilk++
Thanks to Charles E. Leiserson for some of these slides
Breadth First Search
∙ Level-by-level graph traversal
∙ Serial complexity: Θ(m+n)
Graph: G(V, E)
  V: Set of vertices (size n)
  E: Set of edges (size m)
[Figure: example graph with 16 vertices, labeled 1 to 12 and 16 to 19]
[Figure sequence: BFS on the example graph, starting from vertex 1 at Level 1; each successive slide highlights the next frontier, Level 2 through Level 6]
Breadth First Search
∙ Who is parent(19)?
  § If we use a queue for expanding the frontier?
  § Does it actually matter?
[Figure: the example graph with Levels 1 through 6 labeled]
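The queue question can be explored with a small serial BFS. The following is a minimal sketch in plain C++, not code from the slides; the names `Graph`, `BfsResult`, and `serial_bfs` are assumptions made for illustration, and vertices are numbered from 0. With a FIFO queue, the parent of a vertex is whichever neighbor on the previous level is dequeued first. Any neighbor on the previous level would be an equally valid parent, and the level of each vertex does not depend on that choice.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Adjacency-list graph: adj[u] lists the neighbors of vertex u (0-based ids).
using Graph = std::vector<std::vector<int>>;

struct BfsResult {
    std::vector<int> level;   // level[v]: BFS level of v (root is 1), 0 if unreached
    std::vector<int> parent;  // parent[v]: BFS-tree parent, -1 for root or unreached
};

// Serial BFS with a FIFO queue. Each vertex is enqueued at most once and each
// adjacency list is scanned once, which gives the Theta(m + n) serial cost.
BfsResult serial_bfs(const Graph& adj, int root) {
    const int n = static_cast<int>(adj.size());
    assert(root >= 0 && root < n);
    BfsResult r{std::vector<int>(n, 0), std::vector<int>(n, -1)};
    std::queue<int> q;
    r.level[root] = 1;
    q.push(root);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int v : adj[u]) {
            if (r.level[v] == 0) {            // v has not been discovered yet
                r.level[v] = r.level[u] + 1;
                r.parent[v] = u;              // first discoverer in queue order wins
                q.push(v);
            }
        }
    }
    return r;
}
```

On a 4-cycle with edges 0-1, 0-2, 1-3, 2-3 and root 0, vertex 3 has two candidate parents (1 and 2). This sketch picks 1 because vertex 1 is dequeued first; picking 2 would give an equally valid BFS tree with the same levels.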
Parallel BFS
∙ Way #1: A custom reducer
  § Bag<T> has an associative reduce function that merges two sets
  § operator+=(Vertex & rhs) also marks rhs “visited”

    void BFS(Graph *G, Vertex root) {
      Bag<Vertex> frontier(root);
      while ( ! frontier.isEmpty() ) {
        cilk::hyperobject< Bag<Vertex> > succbag;
        cilk_for (int i = 0; i < frontier.size(); i++) {
          for ( Vertex v in frontier[i].adjacency() ) {
            if ( v.unvisited() )
              succbag() += v;
          }
        }
        frontier = succbag.getValue();
      }
    }
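To make the reducer idea concrete without a Cilk++ toolchain, here is a hedged serial emulation in plain C++. `Bag`, `bfs_levels_with_bags`, and the `chunks` parameter are hypothetical names, not part of the slides or of any Cilk library. Each chunk of the frontier fills its own private bag (playing the role of a reducer view), and the views are merged afterwards with an associative `merge`. The chunks run one after another here; in a real parallel run, marking a vertex visited would itself need care.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Graph = std::vector<std::vector<int>>;

// Hypothetical stand-in for the slide's Bag<T>: an unordered collection whose
// merge operation is associative, which is what a reducer requires.
struct Bag {
    std::vector<int> items;
    void insert(int v) { items.push_back(v); }
    void merge(Bag&& other) {
        items.insert(items.end(), other.items.begin(), other.items.end());
    }
};

// Level-synchronous BFS. The frontier is cut into `chunks` pieces; each piece
// fills a private bag, and the bags are merged to form the next frontier.
// Returns level[v] with the root at level 1 and 0 for unreached vertices.
std::vector<int> bfs_levels_with_bags(const Graph& adj, int root, std::size_t chunks) {
    assert(chunks > 0 && root >= 0 && static_cast<std::size_t>(root) < adj.size());
    std::vector<int> level(adj.size(), 0);
    std::vector<int> frontier{root};
    level[root] = 1;
    for (int depth = 2; !frontier.empty(); ++depth) {
        std::vector<Bag> views(chunks);
        for (std::size_t c = 0; c < chunks; ++c) {
            std::size_t lo = c * frontier.size() / chunks;
            std::size_t hi = (c + 1) * frontier.size() / chunks;
            for (std::size_t i = lo; i < hi; ++i) {
                for (int v : adj[frontier[i]]) {
                    if (level[v] == 0) {     // unvisited
                        level[v] = depth;    // inserting also marks "visited"
                        views[c].insert(v);
                    }
                }
            }
        }
        Bag next;
        for (Bag& view : views) next.merge(std::move(view));  // the reduce step
        frontier = std::move(next.items);
    }
    return level;
}
```

Because `merge` is associative and the bag is unordered, the set of vertices in the next frontier does not depend on how the frontier was split, so the computed levels are the same for any value of `chunks`.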
Parallel BFS
∙ Way #2: Concurrent writes + List reducer

    void BFS(Graph *G, Vertex root) {
      list<Vertex> frontier(root);
      Vertex * parent = new Vertex[n];
      while ( ! frontier.isEmpty() ) {
        cilk_for (int i = 0; i < frontier.size(); i++) {
          for ( Vertex v in frontier[i].adjacency() ) {
            if ( ! v.visited() )
              parent[v] = frontier[i];   // An intentional data race
          }
        }
        ...   // How to generate the new frontier?
      }
    }
Parallel BFS
∙ Run cilk_for loop again
  § The !v.visited() check is not necessary. Why?

    void BFS(Graph *G, Vertex root) {
      ...
      while ( ! frontier.isEmpty() ) {
        ...
        hyperobject< reducer_list_append<Vertex> > succlist;
        cilk_for (int i = 0; i < frontier.size(); i++) {
          for ( Vertex v in frontier[i].adjacency() ) {
            if ( parent[v] == frontier[i] ) {
              succlist.push_back(v);
              v.visit();   // Mark “visited”
            }
          }
        }
        frontier = succlist.getValue();
      }
    }
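The two passes of Way #2 can be tried out serially. This is a sketch under stated assumptions, not the slides' implementation: `expand_level` and `bfs_parents` are hypothetical names, the graph is assumed to be simple (no repeated neighbors in an adjacency list), and the loops run one after another instead of under cilk_for. In standard C++ a truly concurrent version would need something like atomic stores for `parent[v]`, since unsynchronized writes are undefined behavior. One way to see why the second pass needs no visited check: `parent[v] == u` can only hold if `u` wrote it during the first pass of this level, and that write only happens for unvisited `v`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;

// Expands one BFS level in two passes and returns the next frontier.
std::vector<int> expand_level(const Graph& adj, const std::vector<int>& frontier,
                              std::vector<int>& parent, std::vector<bool>& visited) {
    // Pass 1: every frontier vertex claims its unvisited neighbors. Several
    // vertices may write parent[v]; whichever write lands last is kept.
    for (int u : frontier)
        for (int v : adj[u])
            if (!visited[v]) parent[v] = u;

    // Pass 2: only the vertex that won parent[v] appends v, so v is added once.
    std::vector<int> next;
    for (int u : frontier)
        for (int v : adj[u])
            if (parent[v] == u) {
                next.push_back(v);
                visited[v] = true;  // mark "visited"
            }
    return next;
}

// Runs the level expansion until the frontier is empty; returns the parent
// array, with -1 for the root and for unreached vertices.
std::vector<int> bfs_parents(const Graph& adj, int root) {
    assert(root >= 0 && static_cast<std::size_t>(root) < adj.size());
    std::vector<int> parent(adj.size(), -1);
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> frontier{root};
    visited[root] = true;
    while (!frontier.empty())
        frontier = expand_level(adj, frontier, parent, visited);
    return parent;
}
```

On the 4-cycle with edges 0-1, 0-2, 1-3, 2-3 and root 0, both 1 and 2 write `parent[3]` in the first pass. In this serial order the write from 2 lands last, so only vertex 2 appends 3 to the next frontier.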
Parallel BFS
∙ Each level is explored with Θ(1) span
∙ Graph G has at most d, at least d/2 levels
  § Depending on the location of root
  § d = diameter(G)
Work: T_1(n) = Θ(m+n)
Span: T_∞(n) = Θ(d)
Parallelism: T_1(n) / T_∞(n) = Θ((m+n)/d)
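As a worked illustration of these bounds, consider a k × k grid graph. The grid is an assumed example chosen here, not one taken from the slides. It has n = k² vertices, m = 2k(k−1) edges, and diameter d = 2(k−1), so the parallelism grows only like the square root of n:

```latex
\begin{align*}
T_1(n) &= \Theta(m+n) = \Theta\bigl(2k(k-1) + k^2\bigr) = \Theta(k^2) \\
T_\infty(n) &= \Theta(d) = \Theta\bigl(2(k-1)\bigr) = \Theta(k) \\
\frac{T_1(n)}{T_\infty(n)} &= \Theta\!\left(\frac{m+n}{d}\right) = \Theta(k) = \Theta(\sqrt{n})
\end{align*}
% Numeric instance: k = 1000 gives n = 10^6, m = 1\,998\,000, d = 1998,
% so (m+n)/d = 2\,998\,000 / 1998 \approx 1500.
```

A graph of the same size with a smaller diameter would give proportionally more parallelism under the same formula.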
Parallel BFS Caveats
∙ d is usually small
∙ d = lg(n) for scale-free graphs
  § But the degrees are not bounded
∙ Parallel scaling will be memory-bound
∙ Lots of burdened parallelism
  § Loops are skinny
  § Especially near the root and leaves of the BFS tree
∙ You are not “expected” to parallelize the BFS part of Homework #4
  § You may do it for extra credit though