Minimum Spanning Trees Featuring Disjoint Sets
HKOI Training 2006
Liu Chi Man (cx)
25 Mar 2006

Prerequisites
  • Asymptotic complexity
  • Set theory
  • Elementary graph theory
  • Priority queues (or heaps)

Graphs
  • A graph is a set of vertices and a set of edges
  • G = (V, E)
  • Number of vertices = |V|
  • Number of edges = |E|
  • We assume a simple graph, so |E| = O(|V|^2)

Roadmap
  • What is a tree?
  • Disjoint sets
  • Minimum spanning trees
  • Various tree topics

What is a Tree?

Trees in graph theory
  • In graph theory, a tree is an acyclic, connected graph
      • Acyclic means “without cycles”

Properties of trees
  • |E| = |V| - 1
      • |E| = Θ(|V|)
  • Between any pair of vertices, there is a unique path
  • Adding an edge between a pair of non-adjacent vertices creates exactly one cycle
  • Removing an edge from the tree breaks the tree into two smaller trees

Definition?
  • The following four conditions are equivalent:
      • G is connected and acyclic
      • G is connected and |E| = |V| - 1
      • G is acyclic and |E| = |V| - 1
      • Between any pair of vertices in G, there exists a unique path
  • G is a tree if at least one of the above conditions is satisfied
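
The second condition (connected and |E| = |V| - 1) is usually the easiest one to test in a program. The following is a minimal sketch in Python, not part of the original slides. It assumes a simple undirected graph whose vertices are labelled 0 to num_vertices - 1, and it treats the empty graph as not a tree. The function name is_tree and its parameters are illustrative choices.

```python
from collections import deque


def is_tree(num_vertices, edges):
    """Return True if the simple undirected graph is a tree.

    Assumptions (illustrative, not from the slides): vertices are
    labelled 0 .. num_vertices - 1 and `edges` is a list of (u, v) pairs.
    """
    # A tree needs exactly |V| - 1 edges; the empty graph is rejected here.
    if num_vertices == 0 or len(edges) != num_vertices - 1:
        return False

    adjacent = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacent[u].append(v)
        adjacent[v].append(u)

    # Breadth-first search from vertex 0 to test connectivity.
    seen = [False] * num_vertices
    seen[0] = True
    reached = 1
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adjacent[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                queue.append(v)
    return reached == num_vertices
```

Checking the edge count first means the search only runs on graphs with |V| - 1 edges, so the whole test takes O(|V|) time.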

Other properties of trees
  • Bipartite
  • Planar
  • A tree with at least two vertices has at least two leaves (vertices of degree 1)

The Union-Find problem
  • N balls initially, each ball in its own bag
      • Label the balls 1, 2, 3, ..., N
  • Two kinds of operations:
      • Pick two bags, put all balls in these bags into a new bag (Union)
      • Given a ball, find the bag containing it (Find)

The Union-Find problem
  • An example with 4 balls
  • Initial: {1}, {2}, {3}, {4}
  • Union {1}, {3} → {1, 3}, {2}, {4}
  • Find 3. Answer: {1, 3}
  • Union {4}, {1, 3} → {1, 3, 4}, {2}
  • Find 2. Answer: {2}
  • Find 1. Answer: {1, 3, 4}

Disjoint sets
  • Disjoint-set data structures can be used to solve the union-find problem
  • Each bag has its own representative ball
      • {1, 3, 4} is represented by ball 3 (for example)
      • {2} is represented by ball 2

Implementation 1: Naive arrays
  • Bag[x] := representative of the bag containing x
  • <O(N), O(1)>
      • Union takes O(N) and Find takes O(1)
  • Slight modifications give <O(U), O(1)>
      • U is the size of the union
  • Worst case: O(MN) for M operations

Implementation 1: Naive arrays
  • How to union Bag[x] and Bag[y]?
      • Z := Bag[x]
      • For each ball v in bag Z do Bag[v] := Bag[y]
  • Can I update the balls in Bag[y] instead?
  • Rule: Update the balls in the smaller bag
      • O(M lg N) for M union operations
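
The "update the smaller bag" rule can be sketched as follows. This is a minimal Python illustration, not the author's code. The slides only mention the Bag array; the members list that records which balls each representative holds is an assumption added here so that a union touches only the balls of the smaller bag. The class and attribute names are hypothetical.

```python
class BagArray:
    """Union-find with a plain array and the "update the smaller bag" rule."""

    def __init__(self, n):
        # Balls are labelled 1..n as on the slides; index 0 is unused.
        # bag[x] is the representative of the bag containing x.
        self.bag = list(range(n + 1))
        # members[r] lists the balls whose representative is r
        # (an extra structure assumed for this sketch).
        self.members = [[x] for x in range(n + 1)]

    def find(self, x):
        # O(1): the representative is stored directly.
        return self.bag[x]

    def union(self, x, y):
        a, b = self.bag[x], self.bag[y]
        if a == b:
            return
        # Make `a` the smaller bag, then relabel only its balls.
        if len(self.members[a]) > len(self.members[b]):
            a, b = b, a
        for v in self.members[a]:
            self.bag[v] = b
        self.members[b].extend(self.members[a])
        self.members[a] = []
```

Each time a ball is relabelled, the bag it lands in is at least twice the size of the one it left, so no ball is relabelled more than lg N times.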

Implementation 2: Forest
  • A forest is a collection of trees
  • Each bag is represented by a rooted tree, with the root being the representative ball
  • [Figure] Example: two bags, {1, 3, 5} and {2, 4, 6, 7}.

Implementation 2: Forest
  • Find(x)
      • Traverse from x up to the root
  • Union(x, y)
      • Merge the two trees containing x and y

Implementation 2: Forest
  • [Figure: worked example on four balls, showing the forest after each step: Initial, Union 1 3, Union 2 4, Find 4, Union 1 4, Find 4]

Implementation 2: Forest
  • How to represent the trees?
      • Leftmost-Child-Right-Sibling (LCRS)? Too complicated
      • Parent array
          • Parent[x] := parent of x
          • If x is a tree root, set Parent[x] := x
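
A minimal Python sketch of the parent-array forest, before any improvement is applied. The function names make_forest, find and union are illustrative. Which root ends up on top in union is an arbitrary choice made here, not something the slides specify.

```python
def make_forest(n):
    # parent[x] == x marks x as a root. Balls are 1..n; index 0 is unused.
    return list(range(n + 1))


def find(parent, x):
    # Traverse from x up to the root.
    while parent[x] != x:
        x = parent[x]
    return x


def union(parent, x, y):
    # Merge the two trees containing x and y.
    root_x, root_y = find(parent, x), find(parent, y)
    if root_x != root_y:
        parent[root_x] = root_y  # arbitrary choice of new root
```

With this version a sequence of unions can build a single long chain, so one Find may cost O(N).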

Implementation 2: Forest
  • The worst case is still O(MN) for M operations
      • What is the worst case?
  • Improvements
      • Union-by-rank
      • Path compression

Union-by-rank
  • We should avoid tall trees
  • On a union, the root of the taller tree becomes the new root
  • So, keep track of tree heights (ranks)
  • [Figure: a “Good” union and a “Bad” union]

Path compression
  • See also the solution for Symbolic Links (HKOI 2005 Senior Final)
  • Find(x): traverse from x up to the root
  • Compress the x-to-root path at the same time

Path compression
  • [Figure: Find(4) on an example tree, before and after compression; the root is 3]

U-by-rank + Path compression
  • We ignore the effect of path compression on tree heights to simplify U-by-rank
  • U-by-rank alone gives O(M lg N)
  • U-by-rank + path compression gives O(M α(N))
      • α: inverse Ackermann function
      • α(N) ≤ 5 for practically large N
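
Both improvements together can be sketched as below. This is a minimal Python illustration under stated assumptions, not a definitive implementation. The class name DisjointSets is hypothetical, find uses two passes rather than recursion to avoid deep call stacks, and ranks are never lowered after compression, in line with the simplification noted above.

```python
class DisjointSets:
    """Parent-array forest with union-by-rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n + 1))  # balls 1..n; index 0 is unused
        self.rank = [0] * (n + 1)         # upper bound on each tree's height

    def find(self, x):
        # First pass: locate the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every vertex on the x-to-root path at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, x, y):
        """Merge the bags of x and y. Return False if already together."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return False
        # The root of the taller tree becomes the new root.
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1
        return True
```

Returning a boolean from union is a convenience for callers that need to know whether two vertices were already connected.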

Minimum spanning trees
  • Given a connected graph G = (V, E), a spanning tree of G is a graph T such that
      • T is a subgraph of G
      • T is a tree
      • T contains every vertex of G
  • A connected graph must have at least one spanning tree

Minimum spanning trees
  • Given a weighted connected graph G, a minimum spanning tree T* of G is a spanning tree of G with minimum total edge weight
  • Application: Minimizing the total length of wires needed to connect up a collection of computers

Minimum spanning trees
  • Two algorithms
      • Kruskal’s algorithm
      • Prim’s algorithm

Kruskal’s algorithm
  • Greedily choose edges in ascending order of weight, while preventing cycles

Kruskal’s algorithm
  • Algorithm
      • T is an empty set
      • Sort the edges in G by their weights
      • For each edge e (in ascending weight) do
          • If T ∪ {e} is acyclic then
              • Add e to T
      • Return T

Kruskal’s algorithm
  • How to detect a cycle?
      • Depth-first search (DFS)
          • O(V) per check
          • O(VE) overall
      • Disjoint set
          • Vertices are balls, connected components are bags

Kruskal’s algorithm
  • Algorithm (using disjoint-set)
      • T is an empty set
      • Create bags {1}, {2}, …, {V}
      • Sort the edges in G by their weights
      • For each edge e (in ascending weight) do
          • Suppose e connects vertices x and y
          • If Find(x) ≠ Find(y) then
              • Add e to T, then Union(Find(x), Find(y))
      • Return T
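
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions: edges are given as (weight, x, y) tuples with vertices labelled 1 to num_vertices, and the function name kruskal is hypothetical. To keep the block short, the embedded disjoint set uses path halving (a simple variant of path compression) and omits union-by-rank.

```python
def kruskal(num_vertices, edges):
    """Return the MST edges of a connected graph as (weight, x, y) tuples.

    Assumptions (illustrative): vertices are 1..num_vertices and `edges`
    is a list of (weight, x, y) tuples.
    """
    # Create bags {1}, {2}, ..., {V}; index 0 is unused.
    parent = list(range(num_vertices + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    # Tuples sort by weight first, so this visits edges in ascending weight.
    for weight, x, y in sorted(edges):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:       # e joins two different components
            tree.append((weight, x, y))
            parent[root_x] = root_y
    return tree
```

A usage example on a small graph with four vertices:

```python
edges = [(1, 1, 2), (4, 1, 3), (3, 2, 3), (2, 3, 4), (5, 2, 4)]
mst = kruskal(4, edges)
total = sum(weight for weight, _, _ in mst)  # total weight of the tree
```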

Kruskal’s algorithm
  • The improved time complexity is O(E lg V)
  • The bottleneck is sorting

Prim’s algorithm
  • In Kruskal’s algorithm, the MST-in-progress is scattered around the graph
  • Prim’s algorithm grows the MST from a “seed”
  • Prim’s algorithm iteratively chooses the lightest grow-able edge
      • A grow-able edge connects a grown vertex and a non-grown vertex

Prim’s algorithm
  • Algorithm
      • Let seed be any vertex, and Grown := {seed}
      • Initially T is an empty set
      • Repeat |V|-1 times
          • Let e = (x, y) be the lightest grow-able edge
          • Add e to T
          • Add x and y to Grown
      • Return T

Prim’s algorithm
  • How to find the lightest grow-able edge?
      • Check all (grown, non-grown) vertex pairs
          • Too slow
      • Each non-grown vertex x keeps a value nearest[x], which is the weight of the lightest edge connecting x to some grown vertex
          • nearest[x] = ∞ if no such edge

Prim’s algorithm
  • How to use nearest?
      • Grow the vertex (x) with the minimum nearest-value
          • Which edge? Keep track of it!
      • Since x has just been grown, we need to update the nearest-values of all non-grown vertices
          • Only need to consider edges incident to x
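
The nearest-value idea can be sketched in Python as below. This is a minimal illustration, assuming the graph is given as a symmetric weight matrix with math.inf where there is no edge, vertices labelled 0 to n - 1, and vertex 0 as the seed. The names prim_dense and via are hypothetical; via[x] is the "keep track of it" bookkeeping that remembers which grown vertex gives nearest[x].

```python
import math


def prim_dense(weight):
    """Return MST edges as (weight, grown_vertex, new_vertex) tuples.

    Assumptions (illustrative): `weight` is an n x n symmetric matrix with
    math.inf for missing edges, and vertex 0 is the seed.
    """
    n = len(weight)
    grown = [False] * n
    nearest = [math.inf] * n  # lightest edge weight to some grown vertex
    via = [None] * n          # grown endpoint of that lightest edge

    grown[0] = True
    for v in range(1, n):
        if weight[0][v] < nearest[v]:
            nearest[v], via[v] = weight[0][v], 0

    tree = []
    for _ in range(n - 1):
        # Grow the non-grown vertex with the minimum nearest-value.
        x = min((v for v in range(n) if not grown[v]),
                key=lambda v: nearest[v])
        if nearest[x] == math.inf:
            raise ValueError("graph is not connected")
        grown[x] = True
        tree.append((nearest[x], via[x], x))
        # Only edges incident to x can improve a nearest-value.
        for v in range(n):
            if not grown[v] and weight[x][v] < nearest[v]:
                nearest[v], via[v] = weight[x][v], x
    return tree
```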

Prim’s algorithm
  • Try to program Prim’s algorithm
  • You may find that it’s very similar to Dijkstra’s algorithm for finding shortest paths!
      • Almost only a one-line difference

Prim’s algorithm
  • Per round...
      • Finding minimum nearest-value: O(V)
      • Updating nearest-values: O(V) (Overall O(E))
  • Overall: O(V^2 + E) = O(V^2) time
  • Using a binary heap,
      • O(lg V) per Finding minimum
      • O(lg V) per Updating
      • Overall: O(E lg V) time
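
A binary-heap version might look like the sketch below, written in Python with the standard heapq module. It differs from the description above in one respect: heapq has no operation for lowering a key in place, so this sketch pushes a fresh entry for every candidate edge and skips stale entries when they are popped. That is a substitute technique (often called lazy deletion), giving O(E lg E) = O(E lg V) time on a simple graph. The adjacency-list format and the name prim_heap are assumptions.

```python
import heapq


def prim_heap(num_vertices, adjacency, seed=0):
    """Return MST edges as (weight, grown_vertex, new_vertex) tuples.

    Assumptions (illustrative): vertices are 0..num_vertices-1 and
    adjacency[x] is a list of (weight, y) pairs for each edge x-y.
    """
    grown = [False] * num_vertices
    grown[seed] = True
    # Heap of candidate grow-able edges, keyed by weight.
    heap = [(w, seed, y) for w, y in adjacency[seed]]
    heapq.heapify(heap)

    tree = []
    while heap and len(tree) < num_vertices - 1:
        w, x, y = heapq.heappop(heap)
        if grown[y]:
            continue  # stale entry: y was already grown via a lighter edge
        grown[y] = True
        tree.append((w, x, y))
        for w2, z in adjacency[y]:
            if not grown[z]:
                heapq.heappush(heap, (w2, y, z))
    return tree
```

If the graph is not connected, the returned list has fewer than num_vertices - 1 edges, which the caller can check.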

MST Extensions
  • Second-best MST
      • We don’t want the best!
  • Online MST
      • See IOI 2003 Path Maintenance
  • Minimum bottleneck spanning tree
      • The bottleneck of a spanning tree is the weight of its maximum-weight edge
      • An algorithm that runs in O(V+E) exists

MST Extensions (NP-Hard)
  • Minimum Steiner Tree
      • No need to connect all vertices, but at least a given subset B ⊆ V
  • Degree-bounded MST
      • Every vertex of the spanning tree must have degree not greater than a given value K
  • For a discussion of NP-hardness, please attend [Talk] Introduction to Complexity Theory on 3 June

Various tree topics (List)
  • Center, eccentricity, radius, diameter
  • Tree isomorphism
      • Canonical representation
  • Prüfer code
  • Lowest common ancestor (LCA)
  • Counting spanning trees

Supplementary readings
  • Advanced:
      • Disjoint set forest (Lecture slides)
      • Prim’s algorithm
      • Kruskal’s algorithm
      • Center and diameter
  • Post-advanced (so-called Beginners):
      • Lowest common ancestor
      • Maximum branching