Minimum Spanning Trees

Minimum Spanning Trees
- Spanning subgraph
  - Subgraph of a graph G containing all the vertices of G
- Spanning tree
  - Spanning subgraph that is itself a (free) tree
- Minimum spanning tree (MST)
  - Spanning tree of a weighted graph with minimum total edge weight
- Applications
  - Communications networks
  - Transportation networks
[Figure: weighted graph on airports ORD, DEN, PIT, STL, DCA, DFW, ATL with edge weights 1 to 10]
© 2010 Goodrich, Tamassia

Cycle Property
- Cycle property:
  - Let T be a minimum spanning tree of a weighted graph G
  - Let e be an edge of G that is not in T, and let C be the cycle formed by e with T
  - For every edge f of C, weight(f) ≤ weight(e)
- Proof:
  - By contradiction
  - If weight(f) > weight(e), we can get a spanning tree of smaller weight by replacing f with e
[Figure: the same weighted graph shown twice, with cycle C, tree edge f, and non-tree edge e marked. Caption: Replacing f with e yields a better spanning tree]
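The cycle property can be checked mechanically on a small instance. The sketch below is only an illustration: the helper names `tree_path_edges` and `satisfies_cycle_property` and the 4-vertex graph are hypothetical, and the tree is assumed to span both endpoints of e.

```python
from collections import defaultdict

def tree_path_edges(tree_edges, src, dst):
    """Return the edges (u, v, w) on the unique src-dst path in a tree."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Depth-first search that remembers how each vertex was reached.
    parent = {src: None}
    stack = [src]
    while stack:
        x = stack.pop()
        for y, w in adj[x]:
            if y not in parent:
                parent[y] = (x, w)
                stack.append(y)
    # Walk back from dst to src to collect the path edges.
    path = []
    node = dst
    while parent[node] is not None:
        prev, w = parent[node]
        path.append((prev, node, w))
        node = prev
    return path

def satisfies_cycle_property(tree_edges, non_tree_edge):
    """True if every tree edge f on the cycle closed by e has weight(f) <= weight(e)."""
    u, v, w_e = non_tree_edge
    return all(w_f <= w_e for _, _, w_f in tree_path_edges(tree_edges, u, v))

# Hypothetical example: T is a path a-b-c-d, and e joins a and d.
T = [("a", "b", 1), ("b", "c", 2), ("c", "d", 3)]
print(satisfies_cycle_property(T, ("a", "d", 4)))  # True
print(satisfies_cycle_property(T, ("a", "d", 2)))  # False
```

In the second call the tree edge (c, d) of weight 3 is heavier than e, so swapping it for e would give a lighter spanning tree, which is exactly the contradiction used in the proof.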

Partition Property
- Partition property:
  - Consider a partition of the vertices of G into subsets U and V
  - Let e be an edge of minimum weight across the partition
  - There is a minimum spanning tree of G containing edge e
- Proof:
  - Let T be an MST of G
  - If T does not contain e, consider the cycle C formed by e with T and let f be an edge of C, other than e, across the partition
  - By the cycle property, weight(f) ≤ weight(e)
  - Since e has minimum weight across the partition, weight(e) ≤ weight(f)
  - Thus, weight(f) = weight(e)
  - We obtain another MST by replacing f with e
[Figure: the same weighted graph shown twice, split into vertex sets U and V, with crossing edges e and f marked. Caption: Replacing f with e yields another MST]
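As a small illustration of the statement, the following sketch picks a minimum-weight edge across a given partition. The function name `min_crossing_edge` and the example graph are assumptions made for this sketch, and the second side of the partition is taken to be every vertex not in U.

```python
def min_crossing_edge(edges, U):
    """Return a minimum-weight edge (u, v, w) with exactly one endpoint in U."""
    crossing = [(u, v, w) for u, v, w in edges if (u in U) != (v in U)]
    return min(crossing, key=lambda e: e[2])

# Hypothetical 4-vertex graph given as (u, v, weight) triples.
edges = [("a", "b", 4), ("b", "c", 2), ("c", "d", 5), ("a", "d", 3), ("b", "d", 7)]
print(min_crossing_edge(edges, {"a", "b"}))  # ('b', 'c', 2)
```

By the partition property, some MST of this graph contains the edge (b, c); here the MST {(b, c), (a, d), (a, b)} of total weight 9 does.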

Kruskal’s Algorithm
- Maintain a partition of the vertices into clusters
  - Initially, single-vertex clusters
  - Keep an MST for each cluster
  - Merge “closest” clusters and their MSTs
- A priority queue stores the edges outside clusters
  - Key: weight
  - Element: edge
- At the end of the algorithm
  - One cluster and one MST

Algorithm KruskalMST(G)
    for each vertex v in G do
        Create a cluster consisting of v
    let Q be a priority queue.
    Insert all edges into Q
    T ← ∅
    {T is the union of the MSTs of the clusters}
    while T has fewer than n - 1 edges do
        e ← Q.removeMin().getValue()
        [u, v] ← G.endVertices(e)
        A ← getCluster(u)
        B ← getCluster(v)
        if A ≠ B then
            Add edge e to T
            mergeClusters(A, B)
    return T

Example
[Figure: four successive snapshots of a weighted graph on vertices A through H with edge weights 1 to 11]

Example (contd.)
[Figure: four further snapshots of the same graph on vertices A through H; arrows between snapshots are labeled "two steps" and "four steps"]

Data Structure for Kruskal’s Algorithm
- The algorithm maintains a forest of trees
- A priority queue extracts the edges by increasing weight
- An edge is accepted if it connects distinct trees
- We need a data structure that maintains a partition, i.e., a collection of disjoint sets, with operations:
  - makeSet(u): create a set consisting of u
  - find(u): return the set storing u
  - union(A, B): replace sets A and B with their union

Recall of List-based Partition
- Each set is stored in a sequence
- Each element has a reference back to the set
  - Operation find(u) takes O(1) time and returns the set of which u is a member
  - In operation union(A, B), we move the elements of the smaller set to the sequence of the larger set and update their references
  - The time for operation union(A, B) is O(min(|A|, |B|))
- Whenever an element is moved, it goes into a set of size at least double, hence each element is moved at most log n times
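A minimal sketch of such a list-based partition follows. The class name `ListPartition`, the snake_case method names, and the use of Python lists as the sequences are assumptions of this sketch, not part of the slides.

```python
class ListPartition:
    """List-based partition: each element keeps a reference to the list storing it."""

    def __init__(self):
        self._set_of = {}  # element -> the Python list acting as its set

    def make_set(self, u):
        self._set_of[u] = [u]
        return self._set_of[u]

    def find(self, u):
        # O(1): follow the back-reference stored with the element.
        return self._set_of[u]

    def union(self, a, b):
        # Move the elements of the smaller list into the larger one,
        # which costs O(min(|A|, |B|)) reference updates.
        if a is b:
            return a
        if len(a) < len(b):
            a, b = b, a
        for x in b:
            self._set_of[x] = a
        a.extend(b)
        return a

p = ListPartition()
for x in "abcd":
    p.make_set(x)
p.union(p.find("a"), p.find("b"))
p.union(p.find("c"), p.find("a"))
print(p.find("c") is p.find("b"))  # True
print(sorted(p.find("a")))         # ['a', 'b', 'c']
```

Sets are compared by identity (`is`), which matches the slide's view of find(u) returning the set object itself.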

Partition-Based Implementation
- Partition-based version of Kruskal’s Algorithm
  - Cluster merges as unions
  - Cluster locations as finds
- Running time O((n + m) log n)
  - PQ operations O(m log n)
  - UF operations O(n log n)

Algorithm KruskalMST(G)
    Initialize a partition P
    for each vertex v in G do
        P.makeSet(v)
    let Q be a priority queue.
    Insert all edges into Q
    T ← ∅
    {T is the union of the MSTs of the clusters}
    while T has fewer than n - 1 edges do
        e ← Q.removeMin().getValue()
        [u, v] ← G.endVertices(e)
        A ← P.find(u)
        B ← P.find(v)
        if A ≠ B then
            Add edge e to T
            P.union(A, B)
    return T
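The pseudocode can be sketched in Python roughly as follows. This is one possible rendering under stated assumptions: the graph is passed as a vertex list plus (u, v, weight) triples, the standard-library `heapq` module stands in for the priority queue, and the partition is inlined as a dictionary of Python lists. The function name `kruskal_mst` is hypothetical.

```python
import heapq

def kruskal_mst(vertices, edges):
    """Return the MST edges of a connected graph given as (u, v, weight) triples."""
    # Partition: every vertex starts in its own cluster (makeSet).
    cluster = {v: [v] for v in vertices}
    # Priority queue keyed by weight; index i breaks ties without comparing vertices.
    heap = [(w, i, u, v) for i, (u, v, w) in enumerate(edges)]
    heapq.heapify(heap)
    tree = []
    while heap and len(tree) < len(cluster) - 1:
        w, _, u, v = heapq.heappop(heap)   # removeMin
        a, b = cluster[u], cluster[v]      # find(u), find(v)
        if a is not b:
            tree.append((u, v, w))
            if len(a) < len(b):            # union: relabel the smaller cluster
                a, b = b, a
            for x in b:
                cluster[x] = a
            a.extend(b)
    return tree

# Hypothetical 4-vertex example graph.
edges = [("a", "b", 4), ("b", "c", 2), ("c", "d", 5), ("a", "d", 3), ("b", "d", 7)]
mst = kruskal_mst(["a", "b", "c", "d"], edges)
print(mst)                        # [('b', 'c', 2), ('a', 'd', 3), ('a', 'b', 4)]
print(sum(w for _, _, w in mst))  # 9
```

The extra `heap and` guard in the loop is a defensive addition for disconnected inputs; the slide's pseudocode assumes a connected graph.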

Prim-Jarnik’s Algorithm
- Similar to Dijkstra’s algorithm
- We pick an arbitrary vertex s and we grow the MST as a cloud of vertices, starting from s
- We store with each vertex v a label d(v) representing the smallest weight of an edge connecting v to a vertex in the cloud
- At each step:
  - We add to the cloud the vertex u outside the cloud with the smallest distance label
  - We update the labels of the vertices adjacent to u

Prim-Jarnik’s Algorithm (cont.)
- A heap-based adaptable priority queue with location-aware entries stores the vertices outside the cloud
  - Key: distance
  - Value: vertex
  - Recall that method replaceKey(l, k) changes the key of entry l
- We store three labels with each vertex:
  - Distance
  - Parent edge in MST
  - Entry (locator) in priority queue

Algorithm PrimJarnikMST(G)
    Q ← new heap-based priority queue
    s ← a vertex of G
    for all v ∈ G.vertices()
        if v = s
            setDistance(v, 0)
        else
            setDistance(v, ∞)
        setParent(v, ∅)
        l ← Q.insert(getDistance(v), v)
        setLocator(v, l)
    while ¬Q.isEmpty()
        l ← Q.removeMin()
        u ← l.getValue()
        for all e ∈ G.incidentEdges(u)
            z ← G.opposite(u, e)
            r ← weight(e)
            {only vertices still in Q, i.e., outside the cloud, are updated}
            if z is in Q and r < getDistance(z)
                setDistance(z, r)
                setParent(z, e)
                Q.replaceKey(getLocator(z), r)
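A rough Python sketch of the same idea is given below. Python's `heapq` has no replaceKey operation, so this sketch swaps in a common substitute: it pushes a fresh entry whenever a label improves and skips stale entries when they are popped. That substitution, the function name `prim_jarnik_mst`, and the edge-triple input format are assumptions of the sketch rather than the slide's adaptable priority queue.

```python
import heapq
from collections import defaultdict

def prim_jarnik_mst(vertices, edges, s):
    """Return MST edges (parent, vertex, weight) of a connected graph, grown from s."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = {v: float("inf") for v in vertices}  # distance labels d(v)
    parent = {v: None for v in vertices}        # parent vertex in the MST
    dist[s] = 0
    in_cloud = set()
    heap = [(0, s)]
    while heap:
        _, u = heapq.heappop(heap)
        if u in in_cloud:
            continue  # stale entry: u already joined the cloud with a smaller key
        in_cloud.add(u)
        for z, r in adj[u]:
            # Only vertices outside the cloud may have their label lowered.
            if z not in in_cloud and r < dist[z]:
                dist[z] = r
                parent[z] = u
                heapq.heappush(heap, (r, z))  # stands in for replaceKey
    return [(parent[v], v, dist[v]) for v in vertices if parent[v] is not None]

# Hypothetical 4-vertex example graph.
edges = [("a", "b", 4), ("b", "c", 2), ("c", "d", 5), ("a", "d", 3), ("b", "d", 7)]
mst = prim_jarnik_mst(["a", "b", "c", "d"], edges, "a")
print(mst)                        # [('a', 'b', 4), ('b', 'c', 2), ('a', 'd', 3)]
print(sum(w for _, _, w in mst))  # 9
```

With lazy deletion the heap can hold up to O(m) entries, so each operation costs O(log m) = O(log n) and the overall bound stays O((n + m) log n).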

Example
[Figure: four successive snapshots of a weighted graph on vertices A through F, with the current distance label shown next to each vertex]

Analysis
- Graph operations
  - Method incidentEdges is called once for each vertex
- Label operations
  - We set/get the distance, parent and locator labels of vertex z O(deg(z)) times
  - Setting/getting a label takes O(1) time
- Priority queue operations
  - Each vertex is inserted once into and removed once from the priority queue, where each insertion or removal takes O(log n) time
  - The key of a vertex w in the priority queue is modified at most deg(w) times, where each key change takes O(log n) time
- Prim-Jarnik’s algorithm runs in O((n + m) log n) time provided the graph is represented by the adjacency list structure
  - Recall that Σ_v deg(v) = 2m
- The running time is O(m log n) since the graph is connected (m ≥ n - 1)
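The bound can be assembled term by term from the bullets above. The derivation below is a sketch; the notation T(n, m) for the total running time is introduced here for convenience and does not appear on the slide.

```latex
\begin{align*}
T(n, m) &= \underbrace{n \cdot O(\log n)}_{\text{insertions}}
         + \underbrace{n \cdot O(\log n)}_{\text{removals}}
         + \underbrace{\sum_{v} \deg(v) \cdot O(\log n)}_{\text{key changes}}
         + \underbrace{\sum_{v} O(\deg(v))}_{\text{label operations}} \\
        &= O(n \log n) + O(2m \log n) + O(m) \\
        &= O((n + m) \log n).
\end{align*}
% A connected graph has m >= n - 1, so n + m <= 2m + 1 and the bound simplifies:
\[
O((n + m) \log n) = O(m \log n).
\]
```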

Borůvka’s Algorithm (Exercise)
- Like Kruskal’s Algorithm, Borůvka’s algorithm grows many clusters at once and maintains a forest T
- Each iteration of the while loop at least halves the number of connected components in forest T
- The running time is O(m log n)

Algorithm BoruvkaMST(G)
    T ← V {just the vertices of G}
    while T has fewer than n - 1 edges do
        for each connected component C in T do
            Let edge e be the smallest-weight edge from C to another component in T
            if e is not already in T then
                Add edge e to T
    return T
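One way to sketch the exercise in Python is shown below. It assumes a connected graph given as (u, v, weight) triples and breaks weight ties by edge index so that the edges chosen in a round cannot close a cycle; that tie-breaking rule, the function name `boruvka_mst`, and the naive component relabeling are assumptions of this sketch, not part of the slide. The relabeling step is deliberately simple and does not achieve the O(m log n) bound.

```python
def boruvka_mst(vertices, edges):
    """Return MST edges of a connected graph; ties are broken by edge index."""
    comp = {v: v for v in vertices}  # component label of each vertex
    n = len(comp)
    tree = []
    while len(tree) < n - 1:
        # cheapest[c] = (weight, index) of the lightest edge leaving component c
        cheapest = {}
        for i, (u, v, w) in enumerate(edges):
            cu, cv = comp[u], comp[v]
            if cu == cv:
                continue  # edge lies inside one component
            for c in (cu, cv):
                if c not in cheapest or (w, i) < cheapest[c]:
                    cheapest[c] = (w, i)
        if not cheapest:
            break  # no crossing edges left: the graph is disconnected
        # An edge picked by both of its components appears only once in the set.
        for w, i in set(cheapest.values()):
            u, v, _ = edges[i]
            cu, cv = comp[u], comp[v]
            if cu != cv:
                tree.append((u, v, w))
                for x in comp:  # naive O(n) relabeling of the absorbed component
                    if comp[x] == cv:
                        comp[x] = cu
    return tree

# Hypothetical 4-vertex example graph.
edges = [("a", "b", 4), ("b", "c", 2), ("c", "d", 5), ("a", "d", 3), ("b", "d", 7)]
mst = boruvka_mst(["a", "b", "c", "d"], edges)
print(sum(w for _, _, w in mst))  # 9
```

On this example the first round adds (b, c) and (a, d), leaving two components, and the second round joins them with (a, b), so the number of components halves in each round.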