Introduction to Graphs and Breadth-First Search


Graphs: what are they? • Representations of pairwise relationships • Collections of objects under some specified relationship

Graphs: what are they mathematically? • A graph G is a pair (V, E) • V is a set of vertices (nodes) • E is a set of pairs (a, b) with a, b ∈ V • V is the set of relatable objects • E is the set of relationships

A Visual Example • G = ( {1, 2, 3, 4, 5}, {(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)} ) • [Figure: drawing of G with vertices 1 through 5]
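The pair (V, E) maps directly onto two sets. The following is a minimal sketch of the example graph, assuming undirected edges are stored as frozensets so that (a, b) and (b, a) are the same edge; the helper names `is_valid_graph` and `related` are illustrative and not part of the slides.

```python
# Vertices and edges of the example graph, stored as plain Python sets.
V = {1, 2, 3, 4, 5}
# Assumption: each undirected edge is a frozenset, so {a, b} == {b, a}.
E = {frozenset(e) for e in [(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)]}
G = (V, E)


def is_valid_graph(vertices, edges):
    """Return True if every endpoint of every edge is a vertex."""
    return all(endpoint in vertices for edge in edges for endpoint in edge)


def related(a, b, edges=E):
    """Return True if a and b are joined by an edge."""
    return frozenset((a, b)) in edges
```

For example, `related(2, 1)` holds because (1, 2) is an edge, while `related(3, 5)` does not.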

Directed Graphs • In a directed graph • (a, b) ∈ E does not imply (b, a) ∈ E • Undirected graphs are a special case • (a, b) ∈ E if and only if (b, a) ∈ E • Visually, directed graphs are drawn with arrows

Directed Graph Example • G = ( {1, 2, 3, 4, 5}, { (1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1) } ) • [Figure: drawing of G with an arrow for each edge]
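The symmetry condition that separates undirected from directed graphs can be checked mechanically. This sketch assumes edges are stored as ordered tuples; `is_undirected` and `symmetric_closure` are hypothetical helper names.

```python
def is_undirected(edges):
    """True if (a, b) in edges implies (b, a) in edges for every edge."""
    return all((b, a) in edges for (a, b) in edges)


def symmetric_closure(edges):
    """Add the reverse of every edge, giving an undirected edge set."""
    return edges | {(b, a) for (a, b) in edges}


# Edge set of the directed example graph.
directed_E = {(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)}
```

The example edge set fails the check, since (1, 5) is present but (5, 1) is not; its symmetric closure passes.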

Weighted Graphs • Have weights associated with edges • Can be directed or undirected • In a directed graph, the weight of (a, b) need not be related to the weight of (b, a)

Weighted Graph Example • G = ( {1, 2, 3, 4, 5}, { (1, 5, -5), (2, 1, 3.2), (2, 3, 42), (3, 2, π), (2, 4, 777), (4, 1, 666) } ) • [Figure: drawing of G with each arrow labeled by its weight]
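One way to hold the weighted example in code is a dictionary keyed by ordered edge. This is only a sketch: the `weight` helper and its use of infinity for a missing edge are assumptions made for illustration.

```python
import math

# Weight of each directed edge in the example graph.
weights = {
    (1, 5): -5,
    (2, 1): 3.2,
    (2, 3): 42,
    (3, 2): math.pi,
    (2, 4): 777,
    (4, 1): 666,
}


def weight(a, b, default=math.inf):
    """Weight of edge (a, b); `default` (an assumption) if no such edge."""
    return weights.get((a, b), default)
```

Note that (2, 3) and (3, 2) are both edges but carry unrelated weights, 42 and π.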

Graph Representation • How to represent in memory? • Two common ways: • Adjacency Lists • Adjacency Matrix

Adjacency Lists • Compact for sparse graphs, where |E| << |V|^2 • Stores the graph as an array of |V| lists • Each vertex v has a list of the vertices adjacent to v in G

Undirected Graph Example • G = ( {1, 2, 3, 4, 5}, {(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)} ) • Adjacency lists:
    1: 2, 4, 5
    2: 1, 3, 4
    3: 2
    4: 1, 2
    5: 1

Directed Graph Example • G = ( {1, 2, 3, 4, 5}, { (1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1) } ) • Adjacency lists:
    1: 5
    2: 1, 3, 4
    3: 2
    4: 1
    5: (empty)
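The adjacency lists for both example graphs can be built from the edge sets with a short routine. This is a sketch that uses a dictionary of lists in place of an array; the function name `adjacency_lists` and the sorting of each list are choices made here for readability.

```python
def adjacency_lists(vertices, edges, directed=False):
    """Build {vertex: sorted list of adjacent vertices} from an edge list."""
    adj = {v: [] for v in sorted(vertices)}
    for a, b in edges:
        adj[a].append(b)
        if not directed:
            # An undirected edge appears in the lists of both endpoints.
            adj[b].append(a)
    for v in adj:
        adj[v].sort()
    return adj


V = {1, 2, 3, 4, 5}
undirected_adj = adjacency_lists(V, [(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)])
directed_adj = adjacency_lists(
    V, [(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)], directed=True
)
```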

Adjacency Lists Wrap Up • Sum of list lengths for undirected • 2|E| • For some apps, could optimize to |E| • Sum of list lengths for directed • |E| • Weighted graphs: left as exercise
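The list-length sums can be confirmed on the two example graphs. For the weighted case, one possible answer (an assumption here, not the slides' own solution) is to store (neighbor, weight) pairs in each list.

```python
import math

# Adjacency lists of the undirected and directed example graphs.
undirected_adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
directed_adj = {1: [5], 2: [1, 3, 4], 3: [2], 4: [1], 5: []}

# Assumed layout for weighted graphs: each entry is (neighbor, weight).
weighted_adj = {
    1: [(5, -5)],
    2: [(1, 3.2), (3, 42), (4, 777)],
    3: [(2, math.pi)],
    4: [(1, 666)],
    5: [],
}


def total_list_length(adj):
    """Sum of the lengths of all adjacency lists."""
    return sum(len(neighbors) for neighbors in adj.values())
```

The undirected graph has 5 edges and a total length of 10; the directed graph has 6 edges and a total length of 6.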

Adjacency Matrix • Often less memory for dense graphs • Faster check for edge existence • Mathematically: • M is a |V| x |V| matrix • Rows and columns are indexed by vertices • M(i, j) = 1 if (i, j) ∈ E, 0 otherwise

Undirected Graph Example • G = ( {1, 2, 3, 4, 5}, {(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)} ) • Adjacency matrix (row i, column j):
       1  2  3  4  5
    1  0  1  0  1  1
    2  1  0  1  1  0
    3  0  1  0  0  0
    4  1  1  0  0  0
    5  1  0  0  0  0

Directed Graph Example • G = ( {1, 2, 3, 4, 5}, { (1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1) } ) • Adjacency matrix (row i, column j):
       1  2  3  4  5
    1  0  0  0  0  1
    2  1  0  1  1  0
    3  0  1  0  0  0
    4  1  0  0  0  0
    5  0  0  0  0  0
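The matrices for the two example graphs follow from the definition M(i, j) = 1 if (i, j) is an edge. A minimal sketch, assuming vertices are numbered 1 to n and stored at index i - 1; the names `adjacency_matrix` and `has_edge` are illustrative.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n x n 0/1 matrix; vertex i lives at row/column i - 1."""
    M = [[0] * n for _ in range(n)]
    for a, b in edges:
        M[a - 1][b - 1] = 1
        if not directed:
            # Undirected graphs give a symmetric matrix.
            M[b - 1][a - 1] = 1
    return M


def has_edge(M, a, b):
    """Constant-time edge existence check."""
    return M[a - 1][b - 1] == 1


undirected_M = adjacency_matrix(5, [(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)])
directed_M = adjacency_matrix(
    5, [(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)], directed=True
)
```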

Adjacency Matrix Wrap Up • Size is always |V|^2 • If |E| is close to |V|^2, can be more efficient because an edge is 1 bit instead of 4 bytes for a pointer • Weighted graphs: use the weight instead of 0's and 1's
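For the weighted case, the 0/1 entries are replaced by weights. One detail the matrix then has to settle is how to mark a missing edge, because 0 may be a legitimate weight. The sketch below assumes infinity as that marker; `NO_EDGE` and `weighted_matrix` are hypothetical names.

```python
import math

# Assumption: infinity marks "no edge", since 0 could be a real weight.
NO_EDGE = math.inf


def weighted_matrix(n, weighted_edges):
    """Build an n x n matrix holding the weight of each directed edge."""
    M = [[NO_EDGE] * n for _ in range(n)]
    for a, b, w in weighted_edges:
        M[a - 1][b - 1] = w
    return M


# The weighted example graph from earlier in the deck.
weighted_M = weighted_matrix(
    5,
    [(1, 5, -5), (2, 1, 3.2), (2, 3, 42), (3, 2, math.pi), (2, 4, 777), (4, 1, 666)],
)
```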

Breadth-first Search • Problem: For a given graph G and a specified source vertex s in G, find all vertices v that are reachable from s and determine a shortest path in G from s to each such v.

How BFS works • Constructs a breadth-first tree • Root is s • The tree path from s to v is a shortest path from s to v in G

The BFS algorithm • Assigns a color to each node • white = vertex has not been reached • gray = vertex is in the BFS frontier • black = vertex and ALL of its neighbors have been processed

BFS algorithm (cont. ) • Computes d[v] for each v • Shortest distance from s to v in G • Computes p[v] for each v • Predecessor of v in the breadth-first tree

BFS Pseudo Code (initialization)
    for each vertex v in V
        color[v] = white
        d[v] = INFINITY
        p[v] = NULL
    color[s] = gray
    d[s] = 0
    Queue.clear()
    Queue.put(s)

BFS Pseudo Code (tree construction)
    while (!Queue.empty())
        u = Queue.get()
        for each v adjacent to u
            if (color[v] == white)
                color[v] = gray
                d[v] = d[u] + 1
                p[v] = u
                Queue.put(v)
        color[u] = black
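The two pseudocode slides translate almost line for line into Python. This is a sketch rather than a reference implementation: it assumes the graph is given as a dictionary of adjacency lists, uses `collections.deque` as the queue, and represents INFINITY and NULL as `math.inf` and `None`.

```python
from collections import deque
import math

WHITE, GRAY, BLACK = "white", "gray", "black"


def bfs(adj, s):
    """Return (d, p): distances from s and breadth-first tree predecessors."""
    # Initialization.
    color = {v: WHITE for v in adj}
    d = {v: math.inf for v in adj}
    p = {v: None for v in adj}
    color[s] = GRAY
    d[s] = 0
    queue = deque([s])

    # Tree construction.
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == WHITE:
                color[v] = GRAY
                d[v] = d[u] + 1
                p[v] = u
                queue.append(v)
        color[u] = BLACK
    return d, p


# The undirected and directed example graphs as adjacency lists.
undirected_adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
directed_adj = {1: [5], 2: [1, 3, 4], 3: [2], 4: [1], 5: []}
d, p = bfs(undirected_adj, 1)
```

On the undirected example with s = 1, vertices 2, 4, and 5 end up at distance 1 and vertex 3 at distance 2 with predecessor 2. On the directed example with s = 1, only vertex 5 is reachable, and every other distance stays at infinity.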

Correctness of BFS • Definition 1. b(s, v) is the minimum number of edges in any path from s to v. If there is no path from s to v, then b(s, v) = INFINITY. b(s, v) is the shortest-path distance. • Lemma 1. Let G = (V, E), s in V. For any edge (u, v) in E, b(s, v) <= b(s, u) + 1

Proof of Lemma 1 • If u is reachable from s, so is v. A shortest path from s to v cannot be longer than a shortest path from s to u followed by the edge (u, v), thus the inequality holds. If u is not reachable, then b(s, u) = INFINITY, so the inequality holds.
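Lemma 1 can be spot-checked on the directed example graph. The sketch below computes b(s, v) by repeated edge relaxation (a technique swapped in here so the check does not depend on BFS itself) and then tests the inequality on every edge; the function names are hypothetical.

```python
import math


def shortest_edge_counts(vertices, edges, s):
    """Compute b(s, v) for all v by relaxing every edge |V| - 1 times."""
    b = {v: math.inf for v in vertices}
    b[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v in edges:
            if b[u] + 1 < b[v]:
                b[v] = b[u] + 1
    return b


def lemma1_holds(vertices, edges, s):
    """Check b(s, v) <= b(s, u) + 1 for every edge (u, v)."""
    b = shortest_edge_counts(vertices, edges, s)
    return all(b[v] <= b[u] + 1 for u, v in edges)


V = [1, 2, 3, 4, 5]
E = [(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)]
```

The unreachable case is covered by floating-point infinity, since `math.inf <= math.inf + 1` evaluates to true.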

Lemma 2 • Upon termination, the BFS algorithm computes d[v] for every vertex and d[v] >= b(s, v)

Proof of Lemma 2 • By induction on the number i of enqueue operations. • For i = 1 (s is enqueued), • d[s] = 0 = b(s, s) • d[v] = INFINITY >= b(s, v) for all v != s • For i = n, consider a white vertex v discovered from u. By induction, d[u] >= b(s, u). Then d[v] = d[u]+1 >= b(s, u)+1 >= b(s, v), where the last step is Lemma 1

Lemma 3 • At all times during execution of BFS • the queue contains vertices (v_1, v_2, …, v_r) such that • d[v_1] <= d[v_2] <= … <= d[v_r] • d[v_r] <= d[v_1] + 1
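The invariant of Lemma 3 can be observed by asserting it after every queue operation. In this sketch the color array is dropped and a vertex counts as white while its distance is still infinite, which is an equivalent test; `bfs_checking_queue` is a hypothetical name.

```python
from collections import deque
import math


def bfs_checking_queue(adj, s):
    """Run BFS from s, asserting the Lemma 3 invariant after each queue op."""
    d = {v: math.inf for v in adj}
    d[s] = 0
    queue = deque([s])

    def check():
        ds = [d[v] for v in queue]
        # Distances are non-decreasing from head to tail.
        assert ds == sorted(ds)
        # Tail distance exceeds head distance by at most 1.
        assert not ds or ds[-1] <= ds[0] + 1

    check()
    while queue:
        u = queue.popleft()
        check()
        for v in adj[u]:
            if d[v] == math.inf:  # v is still white
                d[v] = d[u] + 1
                queue.append(v)
                check()
    return d


undirected_adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
```

Running `bfs_checking_queue(undirected_adj, 1)` completes without an assertion failure, which is consistent with the lemma on this graph but of course not a proof.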

Proof of Lemma 3 • By induction on the number i of queue operations. • For i = 1, the queue contains only s, so the hypothesis holds • For i = n • After dequeueing v_1: • d[v_r] <= d[v_1]+1 and d[v_1] <= d[v_2], so d[v_r] <= d[v_2]+1 and the hypothesis holds • After enqueueing v_(r+1) while scanning the just-dequeued vertex v_1: • d[v_(r+1)] = d[v_1]+1 >= d[v_r] • d[v_(r+1)] = d[v_1]+1 <= d[v_2]+1, since d[v_1] <= d[v_2] • Since v_2 is the new head of the queue, the hypothesis holds

Corollary (4) to Lemma 3 • If vertices u and v are enqueued during execution of BFS and u is enqueued before v, then d[u] <= d[v]

Theorem 5 • Given G = (V, E) and s • BFS discovers every v reachable from s • Upon termination, d[v] = b(s, v) • Moreover, for v reachable from s • One of the shortest paths from s to v is a shortest path from s to p[v], followed by the edge (p[v], v).

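Proof of Theorem 5 • By contradiction. • Assume some vertex is assigned d[v] != b(s, v), and let v be such a vertex with minimum b(s, v). By Lemma 2, d[v] >= b(s, v), so d[v] > b(s, v). • v must be reachable from s, else b(s, v) = INFINITY >= d[v] • Let u be the vertex immediately preceding v on a shortest path from s to v • b(s, v) = b(s, u)+1 = d[u]+1, since b(s, u) < b(s, v) and the choice of v give d[u] = b(s, u) • This would mean d[v] > d[u]+1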

Proof completion • d[v] > d[u]+1 cannot happen! • Look at when BFS dequeues u • v is either white, black, or gray • If v is black, it was already removed from the queue, so by Corollary 4, d[v] <= d[u] • If v is gray, it was made gray when some other vertex w was dequeued before u, so d[v] = d[w]+1 <= d[u]+1 (by Corollary 4) • If v is white, then the code sets d[v] = d[u]+1 • Every case contradicts d[v] > d[u]+1

BFS Wrap Up • So, d[v]=b(s, v) for all v in V • All reachable vertices discovered, else d = INFINITY • If p[v]=u, then d[v]=d[u]+1, so one of the shortest paths from s to v takes path from s to u then (u, v)
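Because each p[v] is the previous vertex on some shortest path, a full shortest path can be read off by following predecessors back to s. A minimal sketch, assuming p is stored as a dictionary with `None` standing for NULL; the function name `path_to` is illustrative, and the predecessor values below are those BFS yields on the undirected example graph with s = 1.

```python
def path_to(p, s, v):
    """Return a shortest path [s, ..., v], or None if v was never reached."""
    if v != s and p[v] is None:
        return None  # v has no predecessor, so it is unreachable from s
    path = [v]
    while path[-1] != s:
        path.append(p[path[-1]])
    path.reverse()
    return path


# Predecessors for the undirected example graph, source s = 1.
p = {1: None, 2: 1, 3: 2, 4: 1, 5: 1}
```

Here `path_to(p, 1, 3)` walks 3, 2, 1 and returns the path in forward order.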

Applications for Graphs • Link structure of a website • Problems in travel, biology, etc. • Network representation • Solution space: • EXAMPLE: Sudoku