Chapter 6 Transform and Conquer

• Basic Idea
• Instance simplification
   - Presorting
      - Search with presorting
      - Element uniqueness with presorting
   - Gaussian elimination
• Representation change
   - Binary Search Trees
   - Heaps
   - Horner’s rule for polynomial evaluation
• Problem reduction
• Conclusion

Basic Idea
This group of techniques solves a problem by a transformation
• to a simpler/more convenient instance of the same problem (instance simplification)
• to a different representation of the same instance (representation change)
• to a different problem for which an algorithm is already available (problem reduction)
Contrast with decrease-and-conquer, which reduces a problem instance to a smaller instance of the same problem and extends its solution.

Instance Simplification - Presorting
Many problems involving lists are easier when the list is sorted:
• searching
• computing the median (selection problem)
• checking if all elements are distinct (element uniqueness)
Also:
• Topological sorting helps in solving some problems for dags.
• Presorting is used in many geometric algorithms.
• Efficiency of algorithms involving sorting depends on the efficiency of sorting.

How fast can we sort?
• About n log₂ n comparisons are necessary in the worst case to sort a list of size n by any comparison-based algorithm: Ω(n log n).
• About n log₂ n comparisons are sufficient to sort an array of size n: O(n log n).
• Comparison-based sorting is therefore optimal at Θ(n log n).

Searching with presorting
ALGORITHM SequentialSearch(A[0..n-1], K)
   // Searches for a given K in A[0..n-1]
   i <- 0
   while i < n and A[i] ≠ K do
      i <- i + 1
   if i < n return i
   else return -1
Presorting-based algorithm:
   Stage 1 (transform): Sort the array by an efficient sorting algorithm.
   Stage 2 (conquer): Apply binary search.
Efficiency: Θ(n log n) + O(log n) = Θ(n log n)
For a single search this is worse than sequential search, which is O(n); presorting pays off when the same list is searched many times.
Why do we have our dictionaries, telephone directories, etc. sorted?
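The two stages above can be sketched in Python. This is a minimal illustration, not the slide's own code: the names `binary_search` and `presort_search` are made up here, and the returned index refers to the sorted copy, not to the original array.

```python
def binary_search(sorted_a, key):
    """Return an index of key in sorted_a, or -1 if key is absent."""
    lo, hi = 0, len(sorted_a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_a[mid] == key:
            return mid
        if sorted_a[mid] < key:
            lo = mid + 1          # key can only be in the right half
        else:
            hi = mid - 1          # key can only be in the left half
    return -1


def presort_search(a, key):
    """Stage 1: sort a copy of a. Stage 2: binary search in the copy."""
    s = sorted(a)                 # transform, Θ(n log n)
    return binary_search(s, key)  # conquer, O(log n)


if __name__ == "__main__":
    print(presort_search([7, 3, 9, 1], 9))
```

If many keys are looked up in the same list, the sorted copy would be built once and reused, which is where the approach starts to beat repeated sequential search.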

Element uniqueness with presorting
ALGORITHM UniqueElements(A[0..n-1])
   // Determines whether all elements are distinct.
   for i <- 0 to n-2 do
      for j <- i+1 to n-1 do
         if A[i] = A[j] return false
   return true
• Brute force algorithm
   Compare all pairs of elements.
   Efficiency: O(n²)
• Presorting-based algorithm
   Stage 1: Sort by an efficient sorting algorithm.
   Stage 2: Scan the array to check pairs of adjacent elements.
   Efficiency: Θ(n log n) + O(n) = Θ(n log n)
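A short Python sketch of the presorting-based version follows. The function name `all_distinct` is an assumption for illustration; the built-in `sorted` plays the role of the "efficient sorting algorithm".

```python
def all_distinct(a):
    """Return True if no value occurs twice in a."""
    s = sorted(a)                                   # Stage 1: sort
    # Stage 2: after sorting, equal elements must be adjacent.
    return all(s[i] != s[i + 1] for i in range(len(s) - 1))


if __name__ == "__main__":
    print(all_distinct([5, 6, 8, 3, 2, 4, 7]))
    print(all_distinct([5, 6, 8, 3, 5]))
```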

Gaussian elimination
Stage 1: Transform the matrix by forward elimination.
Stage 2: Solve the resulting upper-triangular system by backward substitution.
Class Ex.
   2x1 − 4x2 + x3 = 6
   3x1 −  x2 + x3 = 11
    x1 +  x2 − x3 = −3
Forward elimination (augmented matrix)
   2    −4     1     |    6
   3    −1     1     |   11       row 2 − (3/2)*row 1
   1     1    −1     |   −3       row 3 − (1/2)*row 1
gives
   2    −4     1     |    6
   0     5    −1/2   |    2
   0     3    −3/2   |   −6       row 3 − (3/5)*row 2
gives
   2    −4     1     |    6
   0     5    −1/2   |    2
   0     0    −6/5   |  −36/5
Backward substitution
   x3 = (−36/5) / (−6/5) = 6
   x2 = (2 + (1/2)*6) / 5 = 1
   x1 = (6 − 6 + 4*1) / 2 = 2

System of n equations in n unknowns

Upper triangular coefficient matrix

Efficiency of Gaussian Elimination
Stage 1: Reduction to an upper-triangular matrix
   for i ← 1 to n-1 do
      for j ← i+1 to n do
         temp ← A[j, i] / A[i, i]      // compute the multiplier once, before A[j, i] is overwritten
         for k ← i to n+1 do
            A[j, k] ← A[j, k] - A[i, k] * temp
Stage 2: Back substitutions
   for j ← n down to 1 do
      t ← 0
      for k ← j+1 to n do
         t ← t + A[j, k] * x[k]
      x[j] ← (A[j, n+1] - t) / A[j, j]
Efficiency: Θ(n³) + Θ(n²) = Θ(n³)
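The two stages can be tried out in Python as sketched below. Assumptions made for this illustration: the function name `gaussian_solve`, exact arithmetic via `fractions.Fraction`, 0-based indices, and pivots that are never zero (no row exchanges or partial pivoting are attempted).

```python
from fractions import Fraction


def gaussian_solve(a, b):
    """Solve a x = b for a square matrix a, assuming nonzero pivots."""
    n = len(a)
    # Build the augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(a, b)]

    # Stage 1: forward elimination to upper-triangular form.
    for i in range(n - 1):
        for j in range(i + 1, n):
            factor = m[j][i] / m[i][i]      # multiplier, computed once per row
            for k in range(i, n + 1):
                m[j][k] -= m[i][k] * factor

    # Stage 2: back substitution.
    x = [Fraction(0)] * n
    for j in range(n - 1, -1, -1):
        t = sum(m[j][k] * x[k] for k in range(j + 1, n))
        x[j] = (m[j][n] - t) / m[j][j]
    return x


if __name__ == "__main__":
    # The class exercise system from the earlier slide.
    print(gaussian_solve([[2, -4, 1], [3, -1, 1], [1, 1, -1]], [6, 11, -3]))
```

On the class exercise this reproduces x1 = 2, x2 = 1, x3 = 6. The three nested loops of Stage 1 are where the Θ(n³) term comes from.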

LU Decomposition Matrix A Forward elimination Matrix L Matrix U

LU Decomposition

Matrix Inverse If a matrix A does not have an inverse, it is called singular. A matrix is singular if and only if one of its rows is a linear combination (a sum of some multiples) of the other rows. Apply Gaussian elimination. If it yields an upper-triangular matrix with one or more zeros on the main diagonal, the matrix is singular; otherwise, it is nonsingular.

Determinant
• Adding a scalar multiple of one row to another row does not change the value of the determinant.
• Interchanging any pair of rows of a matrix multiplies its determinant by −1.
• The determinant of an upper-triangular matrix is equal to the product of the elements on its main diagonal.

Using Gaussian elimination, we can compute the determinant of an n × n matrix in cubic time.
• B is obtained from A by adding −1/2 × the first row to the second row: det(A) = det(B).
• C is obtained from B by adding the first row to the third row: det(C) = det(B).
• D is obtained from C by exchanging the second and third rows: det(D) = −det(C).
• det(D) = (−2) · 2 · 4.5 = −18.
• det(A) = −det(D) = +18.
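The three determinant properties translate into a short routine. The sketch below is illustrative only: the name `determinant` and the test matrices are assumptions (the matrix A of the slide's example is not reproduced in the text), and exact `Fraction` arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def determinant(a):
    """Determinant of a square matrix via Gaussian elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] for row in a]
    sign = 1
    for i in range(n):
        # Find a row at or below i with a nonzero entry in column i.
        p = next((r for r in range(i, n) if m[r][i] != 0), None)
        if p is None:
            return Fraction(0)            # zero pivot column: singular matrix
        if p != i:
            m[i], m[p] = m[p], m[i]       # a row exchange flips the sign
            sign = -sign
        for j in range(i + 1, n):
            factor = m[j][i] / m[i][i]    # adding a row multiple keeps det
            for k in range(i, n):
                m[j][k] -= m[i][k] * factor
    det = Fraction(sign)
    for i in range(n):
        det *= m[i][i]                    # product of the main diagonal
    return det


if __name__ == "__main__":
    # Coefficient matrix of the earlier class exercise.
    print(determinant([[2, -4, 1], [3, -1, 1], [1, 1, -1]]))
```

For the coefficient matrix of the earlier class exercise no exchange is needed and the diagonal after elimination is 2, 5, −6/5, so the result is −12.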

Searching Problem
A bag is a multiset, i.e., duplicates are allowed.
Given a bag S of integers and a search key K, find an occurrence of K in S, if any.
Transform the bag into a tree structure. Then try to locate the key in the tree.

Binary search tree
(Figure: node i with a left subtree of keys < i and a right subtree of keys > i.)
All keys stored in the left subtree of node i are smaller than i. All keys in the right subtree are greater.
In-order traversal yields a sorted list of the keys in the tree.
                Average     Worst case
   Space        O(n)        O(n)
   Search       O(log n)    O(n)
   Insert       O(log n)    O(n)
   Delete       O(log n)    O(n)
How to avoid the worst case?
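A minimal, unbalanced binary search tree in Python illustrates the properties above. The names `Node`, `insert`, `search`, and `inorder` are chosen for this sketch, and sending duplicate keys to the right subtree is an assumption made so that a bag can be stored.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key and return the (possibly new) root of the subtree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)   # assumption: duplicates go right
    return root


def search(root, key):
    """Return a node holding key, or None if key is not in the tree."""
    while root is not None and root.key != key:
        root = root.left if key < root.key else root.right
    return root


def inorder(root):
    """In-order traversal: left subtree, node, right subtree."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


if __name__ == "__main__":
    root = None
    for k in [5, 6, 8, 3, 2, 4, 7]:
        root = insert(root, k)
    print(inorder(root))
```

Inserting already sorted keys into this tree produces a chain of height n − 1, which is the O(n) worst case in the table.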

AVL tree
Balance factor of a node = the height of its left subtree − the height of its right subtree
An AVL tree is a binary search tree in which, for every node, the balance factor is either −1, 0, or +1.

R-rotation, L-rotation
Right side is light. Rotate toward the light side.
Right Rotation
• Current root Q rotates down to the right.
• The middle guy B gets attached to the former root Q.
After rotation, the binary search tree property is still preserved:
• Before rotation: P < Q. After rotation: P < Q.
• Before rotation: P < B. After rotation: P < B.
• Before rotation: B < Q. After rotation: B < Q.
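The right rotation described above, and its mirror image, can be written as two small pointer updates. This is a sketch under assumed names (`Node`, `rotate_right`, `rotate_left`); P is the left child of Q, and B is the right subtree of P.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(q):
    """Q rotates down to the right; its left child P becomes the new root."""
    p = q.left
    q.left = p.right      # the middle subtree B is attached to the former root Q
    p.right = q
    return p


def rotate_left(p):
    """Mirror image: P rotates down to the left; its right child Q moves up."""
    q = p.right
    p.right = q.left      # the middle subtree B is attached to the former root P
    q.left = p
    return q


if __name__ == "__main__":
    tree = Node("Q", Node("P", Node("A"), Node("B")), Node("C"))
    new_root = rotate_right(tree)
    print(new_root.key, new_root.right.key, new_root.right.left.key)
```

Each rotation changes a constant number of links, so it costs O(1) regardless of the tree size.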

Inserting a new node upsets the balance. Rotate to re-balance.

L-rotation

Class Ex. Perform left rotation on

Class Ex. Answer

RL-rotation RL(a): Perform a right rotation on the right subtree at c, R(c). Now we're ready for a left rotation at root a, L(a). Class Ex. Try LR(3).

LR-rotation
Ans: r = 3, c = 1, g = 2.
In general, for LR(r): T2 is the (R) middle guy. T3 is the (L) middle guy.

AVL tree for the list 5, 6, 8, 3, 2, 4, 7
If there are several nodes with the ±2 balance, the rotation is done for the tree rooted at the unbalanced node that is the closest to the newly inserted leaf.
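The insertion procedure can be sketched in Python as follows. All names are assumptions for illustration, heights are stored in the nodes with the empty tree at height −1, and the recursion rebalances on the way back up, so the unbalanced node closest to the new leaf is the first one rotated.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.height = 0                      # a single node has height 0


def height(node):
    return node.height if node else -1       # empty tree: height -1


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    return height(node.left) - height(node.right)


def rotate_right(q):
    p = q.left
    q.left, p.right = p.right, q
    update(q)
    update(p)
    return p


def rotate_left(p):
    q = p.right
    p.right, q.left = q.left, p
    update(p)
    update(q)
    return q


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    b = balance(node)
    if b == 2:                                # left subtree too tall
        if balance(node.left) < 0:            # LR case: rotate the child left first
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if b == -2:                               # right subtree too tall
        if balance(node.right) > 0:           # RL case: rotate the child right first
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def level_order(root):
    out, queue = [], ([root] if root else [])
    while queue:
        node = queue.pop(0)
        out.append(node.key)
        queue.extend(c for c in (node.left, node.right) if c)
    return out


if __name__ == "__main__":
    root = None
    for k in [5, 6, 8, 3, 2, 4, 7]:
        root = insert(root, k)
    print(level_order(root))
```

For the list 5, 6, 8, 3, 2, 4, 7 the sketch ends with root 5, children 3 and 7, and leaves 2, 4, 6, 8.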

AVL tree for the list 5, 6, 8, 3, 2, | 4, 7
r = 6, c = 3, g = 5. T1 = 2, T2 = 4, T3 is empty, T4 = 8.
Class Ex. Show the missing step.

4 situations, 4 rotations
• R-rotation: right side is light.
• L-rotation: left side is light.
• LR-rotation: 1st, left side is light; 2nd, right side is light.
• RL-rotation: 1st, right side is light; 2nd, left side is light.

Class Ex. Build AVL tree for the list 2, 9, 7, 6, 5, 8.

Analysis of AVL trees
• Height bound: h < 1.4404 log₂(n + 2) − 1.3277
• Average height: 1.01 log₂ n + 0.1 for large n
• Search and insertion are O(log n)
• Deletion is more complicated but is also O(log n)
• Disadvantages: frequent rotations

Complete binary tree 1. Every level, except possibly the last, is completely filled. 2. All nodes are as far left as possible.

Heap
A heap is a binary tree with keys assigned to its nodes, one key per node.
1. The shape property: the binary tree is complete.
2. The parental dominance property: the key in each node is greater than or equal to the keys in its children.
(Figure: three example trees; only the first is a heap.)

1-d array representation
• A 1-d array can be interpreted as a binary tree (positions numbered from 1, top to bottom and left to right).
• For the parent at index i, its children are at positions 2i and 2i + 1.
• About half of the array holds parental nodes: they occupy the first ⌊n/2⌋ positions.
• i gets bigger as you go down the tree.

Bottom-up heap construction
Step 0: Initialize the structure with keys in the order given.
Step 1: Starting with the last (rightmost) parental node, fix the heap rooted at it. You may need to keep going down and exchanging it with its larger child until the heap condition holds.
Step 2: Repeat Step 1 for the preceding parental node.
Inputs: 2 9 7 6 5 8
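Steps 0 to 2 can be sketched in Python as below. The function name `heap_bottom_up` is an assumption; the list is padded at index 0 so that the children of position k sit at 2k and 2k + 1, as on the array-representation slide.

```python
def heap_bottom_up(keys):
    """Build a max-heap from keys; returns the heap as a plain list."""
    n = len(keys)
    h = [None] + list(keys)          # Step 0: keys in the order given, 1-based
    for i in range(n // 2, 0, -1):   # from the last parent back to the root
        k, v = i, h[i]
        while 2 * k <= n:            # while position k has at least one child
            j = 2 * k
            if j < n and h[j] < h[j + 1]:
                j += 1               # j now points to the larger child
            if v >= h[j]:
                break                # heap condition holds at k
            h[k] = h[j]              # move the larger child up
            k = j                    # continue one level down
        h[k] = v
    return h[1:]


if __name__ == "__main__":
    print(heap_bottom_up([2, 9, 7, 6, 5, 8]))
```

For the input 2 9 7 6 5 8 this yields 9 6 8 2 5 7.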

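2 perspectives: tree & array
(Figure: the construction for 2 9 7 6 5 8 shown both as trees and as arrays.)
Array contents during the construction:
   2 9 7 6 5 8     keys in the order given
   2 9 8 6 5 7     parent at position 3: 7 exchanged with 8
   9 2 8 6 5 7     parent at position 1: 2 exchanged with 9
   9 6 8 2 5 7     2 goes further down: exchanged with 6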

Class Ex. Try 5, 6, 8, 3, 2, 4, 7.
Step 0: Initialize the structure with keys in the order given.
Step 1: Starting with the last (rightmost) parental node, fix the heap rooted at it. If it doesn’t satisfy the heap condition, keep exchanging values with its larger child and head down until the heap condition holds.
Step 2: Repeat Step 1 for the preceding parental node.

• Check one parent, node i, at a time, from the last parent to the first parent.
• The current subtree rooted at i is not yet a heap.
• j points to its child. If j = n, j points to the last child.
• Otherwise, j points to the larger child.
• Exchange keys (values). Go down in the subtree to the next k.

Time complexity
Bottom-up heap construction runs in O(n) time.

Insert a key into the heap
1. Attach a new node with key K in it after the last leaf of the existing heap.
2. Promote K up to its appropriate place in the new heap.
Time efficiency of insertion is O(log n).
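The two insertion steps can be sketched as follows. Assumptions of this illustration: the name `heap_insert`, a max-heap stored in a 0-based Python list (so the parent of position i is at (i − 1) // 2 instead of ⌊i/2⌋), and in-place modification.

```python
def heap_insert(heap, key):
    """Insert key into a max-heap stored in a 0-based list, in place."""
    heap.append(key)                      # 1. attach K after the last leaf
    i = len(heap) - 1
    while i > 0:                          # 2. promote K while it beats its parent
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


if __name__ == "__main__":
    h = [9, 6, 8, 2, 5, 7]
    heap_insert(h, 10)
    print(h)
```

The key climbs at most one level per iteration, so the number of exchanges is bounded by the height of the heap, which gives the O(log n) bound.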

Delete the root’s key
1. Exchange the root’s key with the last key K of the heap.
2. Decrease the heap’s size by 1.
3. Demote K down to its appropriate place in the new heap.
Time efficiency: O(log n).

Heapsort
1. Construct a heap for a given list of n keys: O(n).
2. Repeat the operation of root removal n − 1 times: O(n) × O(log n) = O(n log n).
Empirical analysis shows that heapsort runs more slowly than quicksort but is competitive with mergesort.
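Both stages fit in a short Python sketch. The names `sift_down` and `heapsort` are assumptions, the heap is a 0-based list (children of k at 2k + 1 and 2k + 2), and the function returns a sorted copy instead of sorting in place.

```python
def sift_down(h, k, n):
    """Demote h[k] within h[0:n] until parental dominance holds."""
    v = h[k]
    while 2 * k + 1 < n:                 # while position k has a child
        j = 2 * k + 1
        if j + 1 < n and h[j] < h[j + 1]:
            j += 1                       # pick the larger child
        if v >= h[j]:
            break
        h[k] = h[j]
        k = j
    h[k] = v


def heapsort(keys):
    """Return the keys in nondecreasing order."""
    h = list(keys)
    n = len(h)
    for i in range(n // 2 - 1, -1, -1):  # Stage 1: bottom-up heap construction
        sift_down(h, i, n)
    for end in range(n - 1, 0, -1):      # Stage 2: n - 1 root removals
        h[0], h[end] = h[end], h[0]      # exchange the root with the last key
        sift_down(h, 0, end)             # heap size shrinks to `end`
    return h


if __name__ == "__main__":
    print(heapsort([2, 9, 7, 6, 5, 8]))
```

Each removed maximum lands just past the end of the shrinking heap, so the array fills with sorted keys from the right.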

Class Ex. Perform heapsort on the heap from last ex. 1. Exchange the root’s key with the last key K of the heap. 2. Decrease the heap’s size by 1. 3. Demote K down to its appropriate place in the new heap.

Horner’s Rule
p(x) = 2x⁴ − x³ + 3x² + x − 5
     = x(2x³ − x² + 3x + 1) − 5
     = x(x(2x² − x + 3) + 1) − 5
     = x(x(x(2x − 1) + 3) + 1) − 5
• Evaluating the last form is faster than evaluating the first.
• n multiplications and n additions (subtractions) to evaluate an n-degree polynomial at a given point.
Class Ex. Let x = 3. Calculate p(x) by the first formula. Calculate p(x) by the last formula.

Answer
Coefficients 2, -1, 3, 1, -5 at x = 3:
    2:  2
   -1:  3 * 2 + (-1) = 5
    3:  3 * 5 + 3 = 18
    1:  3 * 18 + 1 = 55
   -5:  3 * 55 + (-5) = 160
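The table corresponds to a single loop over the coefficients. A minimal Python sketch, assuming the name `horner` and a coefficient list ordered from the highest degree down to the constant term:

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x; coeffs run from highest degree to lowest."""
    p = coeffs[0]
    for c in coeffs[1:]:
        p = x * p + c        # one multiplication and one addition per coefficient
    return p


if __name__ == "__main__":
    # p(x) = 2x^4 - x^3 + 3x^2 + x - 5 at x = 3
    print(horner([2, -1, 3, 1, -5], 3))
```

The intermediate values of `p` are exactly the right-hand column of the table: 2, 5, 18, 55, 160.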

Problem reduction: Least common multiple
• Transform the problem into a different problem for which an algorithm is already available.
• Analyze the combined time of the transformation and solving the other problem.
Example: transform the lcm problem to the gcd problem, using lcm(m, n) = m · n / gcd(m, n).
lcm(24, 60) = 24 · 60 / gcd(24, 60) = 1440 / 12 = 120.
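The reduction is a one-liner once a gcd routine is available. In this sketch the function name `lcm` is an assumption, `math.gcd` from the standard library stands in for Euclid's algorithm, and both arguments are assumed to be positive integers.

```python
from math import gcd


def lcm(m, n):
    """Least common multiple of two positive integers via the gcd."""
    return m * n // gcd(m, n)


if __name__ == "__main__":
    print(lcm(24, 60))
```

The running time is dominated by the gcd computation; the transformation itself costs one multiplication and one division.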

Adjacency matrix

Minimization and maximization problems

Linear Programming
Consider a university that needs to invest $100 million.
• 3 types of investments: stocks, bonds, and cash.
• Expected annual returns: 10%, 7%, and 3%, respectively.
• The amount invested in stocks is to be no more than one-third of the money invested in bonds.
• At least 25% of the total amount invested in stocks and bonds must be invested in cash.
How to maximize the return?
Let x, y, and z be the amounts (in millions of dollars) invested in stocks, bonds, and cash, respectively.
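With x, y, and z as defined above, the stated conditions translate directly into a linear program. The formulation below is a sketch derived only from the sentences on this slide; the nonnegativity constraints are an added assumption that amounts invested cannot be negative.

```latex
\begin{aligned}
\text{maximize}\quad   & 0.10x + 0.07y + 0.03z \\
\text{subject to}\quad & x + y + z = 100, \\
                       & x \le \tfrac{1}{3}\,y, \\
                       & z \ge 0.25\,(x + y), \\
                       & x \ge 0,\quad y \ge 0,\quad z \ge 0.
\end{aligned}
```

The objective and every constraint are linear in x, y, and z, which is what makes this a linear programming instance.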

Fractional knapsack problem
Given a knapsack of capacity W and n items of weights w1, . . . , wn and values v1, . . . , vn, find the most valuable subset of the items that fits into the knapsack; any fraction of an item may be taken.
Let xj, j = 1, . . . , n, be a variable representing the fraction of item j taken into the knapsack.

Discrete knapsack problem
Given a knapsack of capacity W and n items of weights w1, . . . , wn and values v1, . . . , vn, find the most valuable subset of the items that fits into the knapsack.
Let xj, j = 1, . . . , n, be a 0-1 variable: xj = 1 if item j is taken into the knapsack and xj = 0 otherwise.
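The two knapsack versions differ only in the values the variables may take. As a sketch in the notation of these slides (W, w_j, v_j, x_j), the continuous version is a linear program and the discrete version is an integer linear program:

```latex
\begin{aligned}
\text{maximize}\quad   & \sum_{j=1}^{n} v_j x_j \\
\text{subject to}\quad & \sum_{j=1}^{n} w_j x_j \le W, \\
\text{fractional:}\quad & 0 \le x_j \le 1 \quad \text{for } j = 1, \dots, n, \\
\text{discrete:}\quad   & x_j \in \{0, 1\} \quad \text{for } j = 1, \dots, n.
\end{aligned}
```

Only one of the last two lines applies in each version; the objective and the capacity constraint are shared.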

Travelling salesman problem as integer linear programming
• Let cij be the distance from city i to city j.
• For i = 0, . . . , n, let ui be a dummy variable.
Constraints:
• Each city is arrived at from exactly one other city.
• From each city there is a departure to exactly one other city.
• There is only a single tour covering all cities, and not two or more disjoint tours that only collectively cover all cities.

Peasant, wolf, goat, and cabbage What is the minimum number of crossings needed? How many such solutions are there?
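One way to explore both questions is to reduce the puzzle to a shortest-path problem on a state-space graph and run breadth-first search. The sketch below assumes the usual rules (the boat carries the peasant and at most one passenger; the wolf may not be left alone with the goat, nor the goat with the cabbage); the state encoding and function names are made up for this illustration.

```python
from collections import deque

# A state is (peasant, wolf, goat, cabbage); 0 = starting bank, 1 = far bank.
START, GOAL = (0, 0, 0, 0), (1, 1, 1, 1)


def safe(state):
    p, w, g, c = state
    if w == g != p:          # wolf and goat together without the peasant
        return False
    if g == c != p:          # goat and cabbage together without the peasant
        return False
    return True


def moves(state):
    """Yield every safe state reachable in one crossing."""
    p = state[0]
    for i in range(4):                   # i == 0: the peasant crosses alone
        if state[i] == p:                # a passenger must be on his bank
            nxt = list(state)
            nxt[0] = 1 - p
            nxt[i] = 1 - p
            nxt = tuple(nxt)
            if safe(nxt):
                yield nxt


def shortest_crossings():
    """Return (minimum number of crossings, number of shortest solutions)."""
    dist, ways = {START: 0}, {START: 1}
    queue = deque([START])
    while queue:
        s = queue.popleft()
        for t in moves(s):
            if t not in dist:            # first visit: a shortest path to t
                dist[t] = dist[s] + 1
                ways[t] = ways[s]
                queue.append(t)
            elif dist[t] == dist[s] + 1: # another equally short path to t
                ways[t] += ways[s]
    return dist[GOAL], ways[GOAL]


if __name__ == "__main__":
    print(shortest_crossings())
```

Running the sketch is one way to check an answer worked out by hand on the state-space graph.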

Summary
• Transform-and-conquer is the fourth general algorithm design (and problem solving) strategy discussed in the book.
• A heap is an essentially complete binary tree with keys (one per node) satisfying the parental dominance requirement.
• AVL trees are binary search trees that are always balanced to the extent possible for a binary tree. The balance is maintained by transformations of four types called rotations.
• Gaussian elimination, an algorithm for solving systems of linear equations, is a principal algorithm in linear algebra. It solves a system by transforming it to an equivalent system with an upper-triangular coefficient matrix, which is easy to solve by back substitutions.
• Horner’s rule is an optimal algorithm for polynomial evaluation without coefficient preprocessing.
• Linear programming concerns optimizing a linear function of several variables subject to constraints in the form of linear equations and linear inequalities.
• Integer linear programming is more difficult.