Locality Sensitive Distributed Computing
Exercise Set 1
David Peleg, Weizmann Institute

Exercises
• Basic complexity considerations
• Global function computation + pipeline
• Termination detection for Dijkstra's BFS
• Fast DFS
• Tgap in synchronizers α and β
• 3-coloring bounded-degree graphs
• MIS / coloring on unoriented rings
Basic complexity issues
1. Prove or disprove: In a graph G(V, E), if there are at least k edge-disjoint paths of length d between the nodes v and w, then it is possible to send m messages from v to w in time O(d + m/k).
Exercises (cont)
2. Prove or disprove: In a graph G(V, E), if dist(v, w) = k and there are k^2 edge-disjoint paths between the nodes v and w, then it is possible to send k^2 messages from v to w in time O(k).
Global function computation
Goal: Compute a global function f(X_{v_1}, ..., X_{v_n}), where each node v holds an input X_v.
Semigroup function f:
1. Well-defined for any input subset
2. Associative and commutative
Efficiently computable on a tree T by convergecast.
Global function computation
During the process: the value sent upwards by each v in T = the value of the function on the inputs of its subtree T_v:
f_v = f(Y_v), where Y_v = { X_w | w ∈ T_v }
Converge(f, X) process
• Leaf v sends X_v to its parent
• Intermediate v with k children w_1, ..., w_k
  - receives values f_{w_i} = f(Y_{w_i}) from all children
  - applies f to obtain f_v = f(X_v, f_{w_1}, ..., f_{w_k})
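The Converge(f, X) process can be sketched as a small sequential simulation. This is only an illustration under assumptions: the tree is given as a hypothetical `children` dictionary, `f` is passed as a binary operator (which suffices because f is associative and commutative), and the example tree and inputs are made up rather than taken from the slides.

```python
from functools import reduce
import operator


def converge(children, inputs, f, root):
    """Simulate Converge(f, X) on a rooted tree.

    children: dict mapping a node to the list of its children (leaves may be absent)
    inputs:   dict mapping each node v to its input X_v
    f:        binary semigroup operator (associative and commutative)
    Returns a dict holding f_v = f(Y_v) for every node v.
    """
    value = {}

    def up(v):
        # Collect X_v together with the values f_w received from the children.
        received = [up(w) for w in children.get(v, [])]
        value[v] = reduce(f, received, inputs[v])
        return value[v]  # this is what v would send to its parent

    up(root)
    return value


# Hypothetical 5-node tree: r has children a and b; a has children c and d.
children = {"r": ["a", "b"], "a": ["c", "d"]}
inputs = {"r": 6, "a": 8, "b": 13, "c": 13, "d": 29}

sums = converge(children, inputs, operator.add, "r")
maxima = converge(children, inputs, max, "r")
```

Here `sums["a"]` is the sum over the subtree of a, and `sums["r"]` is the global sum; swapping in `max` gives the corresponding maxima.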
Example: Addition
[figure: convergecast of sums on a tree; node labels include 105, 50, 49, 29, 13]
Example: Maximum
[figure: tree with the values to be filled in, each marked "?"]
Global function computation
Claim: Assume f(Y) is represented in O(p) bits for every input set Y. On a tree T:
• Message(Converge(f)) = ?
• Time(Converge(f)) = ?
Pipelining
Separate broadcast / convergecast operations can be efficiently pipelined.
Example: Pipelining 3 Converge(max) operations to get M_i = max { X_i(v) | v leaf } for i = 1, 2, 3
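To experiment with pipelining, the following is a rough synchronous simulation, not the lecture's formal model. It assumes that in each round a node may send one value to its parent, that operations are forwarded in index order, and (unlike the slide, where only leaves hold inputs) that every node holds k inputs. The names `pipelined_converge`, `children`, `parent` and the example path are hypothetical.

```python
def pipelined_converge(children, parent, inputs, root, k, f=max):
    """Run k convergecasts of f in a pipeline; return (results at root, rounds).

    parent:  dict mapping every non-root node to its parent
    inputs:  dict mapping each node to a list of k input values
    """
    nodes = list(parent) + [root]
    acc = {v: list(inputs[v]) for v in nodes}   # running f-value per operation
    heard = {v: [0] * k for v in nodes}         # children heard from, per operation
    nxt = {v: 0 for v in nodes}                 # next operation v sends upward
    rounds = 0
    while any(nxt[v] < k for v in parent):
        rounds += 1
        # Decide all sends from the state at the start of the round.
        sends = []
        for v in parent:
            i = nxt[v]
            if i < k and heard[v][i] == len(children.get(v, [])):
                sends.append((v, i, acc[v][i]))
        # Deliver the messages at the end of the round.
        for v, i, val in sends:
            p = parent[v]
            acc[p][i] = f(acc[p][i], val)
            heard[p][i] += 1
            nxt[v] += 1
    return acc[root], rounds


# Hypothetical path r - a - b with k = 3 operations.
children = {"r": ["a"], "a": ["b"]}
parent = {"a": "r", "b": "a"}
inputs = {"r": [1, 5, 2], "a": [4, 0, 9], "b": [3, 7, 1]}
result, rounds = pipelined_converge(children, parent, inputs, "r", 3)
```

Varying k and the tree depth and watching how `rounds` changes is one way to form a conjecture before proving a bound.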
Pipelining
Lemma: k global semigroup functions can be computed on a tree in time ?
Level-synchronized BFS (Dijkstra)
Q. Prove the tightness of the message complexity analysis of Dijkstra's algorithm by establishing the following:
Lower bound: For integers n and 1 ≤ D ≤ n-1, there exists an n-node, D-diameter graph G = (V, E) on which the execution of Dijkstra's algorithm requires Ω(nD + |E|) messages.
Level-synchronized BFS (Dijkstra) Termination detection: Modify the Distributed Dijkstra algorithm so that the root can tell when the process is completed (and the entire graph is spanned by the constructed BFS tree)
Distributed Depth-First Search DFS: Search process on G, traversing all vertices, progressing over edges, with preference to visiting new vertices
Distributed Depth-First Search
DFS algorithm
• Search starts at origin v_0
• Whenever search reaches vertex v:
  - If v has neighbors not visited so far, then visit one of them next.
  - Else return to the vertex from which v was visited first
  - If v = v_0 then end
Distributed Depth-First Search
Fact: The DFS process visits every vertex in G.
The search defines a DFS tree, with v_0 as root:
• v's parent = node from which v was visited first
Sequential time complexity = O(|E|)
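The DFS rules and the resulting DFS tree can be sketched sequentially as follows. This is a minimal illustration assuming the graph is given as a hypothetical adjacency-list dictionary `adj`; per-vertex iterators are used so that each adjacency list is scanned only once, in line with the O(|E|) sequential bound.

```python
def dfs_tree(adj, v0):
    """Return (parent, order): the DFS tree as parent pointers and the visit order."""
    parent = {v0: None}
    order = [v0]
    scan = {v: iter(adj[v]) for v in adj}  # resume point in each adjacency list
    v = v0
    while True:
        # Look for a neighbor of v that has not been visited so far.
        w = next((u for u in scan[v] if u not in parent), None)
        if w is not None:
            parent[w] = v        # w is visited first from v
            order.append(w)
            v = w
        elif v == v0:
            return parent, order  # back at the origin with nothing left: end
        else:
            v = parent[v]         # return to the vertex from which v was visited first


# Hypothetical 4-vertex example graph.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
parent, order = dfs_tree(adj, 0)
```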
Direct distributed implementation - Completely sequential: one activity locus at any time. - Control carried via single message (“token”) traversing G in depth-first fashion Note: For v to know whether neighbor w was visited or not, it must send message over edge (v, w)
Direct distributed implementation
Every edge must be explored ⇒ both time and message complexities = Θ(|E|)
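The cost of the direct implementation can be observed with a small token simulation. This sketch assumes one particular variant, which may differ in detail from the one in the lecture: the token holder probes an unused incident edge, and an already visited neighbor bounces the token straight back. The names `token_dfs` and `used` are hypothetical.

```python
def token_dfs(adj, v0):
    """Simulate a single-token DFS; return (parent, number of messages sent)."""
    visited = {v0}
    parent = {v0: None}
    used = {v: set() for v in adj}  # neighbors whose edge already carried the token
    messages = 0
    v = v0
    while True:
        w = next((u for u in adj[v] if u not in used[v]), None)
        if w is None:
            if v == v0:
                return parent, messages
            messages += 1            # token returns to the parent
            v = parent[v]
            continue
        used[v].add(w)
        used[w].add(v)
        messages += 1                # token travels v -> w
        if w in visited:
            messages += 1            # w was already visited and bounces the token back
        else:
            visited.add(w)
            parent[w] = v
            v = w


# Hypothetical graph with 4 edges: (0,1), (0,2), (1,2), (2,3).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
parent, messages = token_dfs(adj, 0)
```

In this variant every edge carries the token exactly twice, which matches the Θ(|E|) behaviour stated on the slide.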
Exercise Q. • Modify the DFS algorithm to allow the traveler to complete the tour (visiting all nodes) faster than O(|E|). • Analyze the time complexity of the modified algorithm and prove your bound.
Synchronizers
Consider a 15-processor asynchronous network G(V, E), V = {0, ..., 14}, constantly running a synchronizer. v, v' = nodes in G. At time t, pulse counter P_v = 27. What is the range of possible pulse numbers P_{v'} in the following cases:
Synchronizers (cont)
1. G = ring (with nodes arranged in order), v = 11, v' = 2, synchronizer used = α.
2. G = full balanced binary tree (4 levels), v = root, v' = one of the leaves, synchronizer used = β.
3. The same as in (2), except both v and v' are leaves.
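Synchronizers (cont)
4. Synchronizer used = γ.
5. Clusters, spanning tree of each cluster, inter-cluster edges and locations of v, v' are as follows:
[figure: cluster partition with spanning trees, inter-cluster edges, and the locations of v and v']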
Synchronizer gaps
Gap of synchronizer ν:
T_gap(ν) = max_{v,p} { t(v, p+1) - t(v, p) }
(the maximum length of time that some processor v stays in some pulse p)
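As a concrete reading of the definition, the gap can be computed from a table of pulse times. The sketch assumes that t(v, p) denotes the time at which processor v enters pulse p, and the timestamps below are invented purely for illustration.

```python
def t_gap(t):
    """Return max over v, p of t(v, p+1) - t(v, p).

    t: dict mapping each processor v to the list [t(v, 0), t(v, 1), ...]
       of times at which v enters successive pulses (assumed interpretation).
    """
    return max(
        times[p + 1] - times[p]
        for times in t.values()
        for p in range(len(times) - 1)
    )


# Hypothetical pulse-entry times for three processors.
pulse_times = {
    "u": [0, 2, 5, 6],
    "v": [0, 3, 4, 9],
    "w": [1, 2, 3, 4],
}
gap = t_gap(pulse_times)
```

In this made-up trace the longest stay is that of v in pulse 2, from time 4 to time 9.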
Synchronizer gaps (cont)
1. How large could Tgap be for:
   a. synchronizer α
   b. synchronizer β
   c. synchronizer γ
2. For synchronizer α, design a scenario realizing the worst-case waiting time, Tgap(α)
3-coloring bounded-degree graphs
Goal: Color an arbitrary bounded-degree graph G (Δ(G) = O(1)) with Δ+1 colors in time O(log* n)
Consistent orientation and MIS
Consistent orientation: the ring edges are oriented in a consistent manner (each node identifies its "left" and "right" neighbors, and each edge e = (v, w) is marked as going "left" by one of its endpoints and as going "right" by the other).
We have seen: Given an MIS on the ring, the nodes can be colored with 3 colors in a single round.
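One possible single-round rule on a consistently oriented ring is sketched below; it is an assumption and may not be the exact algorithm from the lecture. MIS nodes take color 1, a non-MIS node whose left neighbor is in the MIS takes color 2, and every other node takes color 3. The ring is modelled as a list indexed 0..n-1, with node i-1 (mod n) as the left neighbor of node i.

```python
def color_from_mis(in_mis):
    """3-color an oriented ring given an MIS, using only each node's left neighbor.

    in_mis: list of booleans; in_mis[i] is True iff node i belongs to the MIS.
    """
    n = len(in_mis)
    colors = []
    for i in range(n):
        if in_mis[i]:
            colors.append(1)
        elif in_mis[(i - 1) % n]:   # left neighbor is in the MIS
            colors.append(2)
        else:                        # left neighbor is not in the MIS
            colors.append(3)
    return colors


def is_proper(colors):
    """Check that adjacent ring nodes received different colors."""
    n = len(colors)
    return all(colors[i] != colors[(i + 1) % n] for i in range(n))


# Hypothetical 5-node ring with MIS {0, 3}.
ring_mis = [True, False, False, True, False]
ring_colors = color_from_mis(ring_mis)
```

The rule relies on maximality: two adjacent non-MIS nodes always sit between two MIS nodes, so exactly one of them sees an MIS node on its left.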
Consistent orientation and MIS
(a) Show that the algorithm might fail if the ring does not enjoy consistent orientation.
(b) Prove that on an anonymous ring (no IDs) without a consistent orientation, it is impossible to deterministically 3-color the nodes, even given an MIS.
(c) Prove that on a non-anonymous ring without a consistent orientation, it is still possible to deterministically 3-color the vertices in a constant number of rounds given an MIS. (Try to use the smallest number of rounds.)
Consistent orientation and MIS
(d) What lower bound can be proved for the number of rounds required for computing an MIS on a ring without consistent orientation? Specify precise constants (not only an asymptotic lower bound).
(e) What is the smallest n for which the lower bound you got in the previous question is greater than 1? What is the smallest n for which the lower bound on 3-coloring is greater than 1?