Chapter 2: The Complexity of Algorithms and the Lower Bounds of Problems
The goodness of an algorithm
- Time complexity (more important)
- Space complexity
- For a parallel algorithm: the time-processor product
- For a VLSI circuit: the area-time measures (AT, AT^2)

Measuring the goodness of an algorithm
- Time complexity of an algorithm
  - efficient (algorithm)
  - worst-case, average-case, amortized

Measuring the difficulty of a problem
- NP-complete?
- Undecidable?
- Is the algorithm the best possible?
  - optimal (algorithm)
- We can use the number of comparisons to measure a sorting algorithm.

Asymptotic notations
- Def: f(n) = O(g(n)) ("at most") iff there exist constants c and n0 such that
  |f(n)| <= c|g(n)| for all n >= n0.
  - e.g. f(n) = 3n^2 + 2, g(n) = n^2: with n0 = 2, c = 4, f(n) = O(n^2)
  - e.g. f(n) = n^3 + n = O(n^3)
  - e.g. f(n) = 3n^2 + 2 = O(n^3) or O(n^100)

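The example above can be checked numerically; a minimal sketch (the constants c = 4 and n0 = 2 are taken from the slide):

```python
# Checking the Big-O example: with c = 4 and n0 = 2,
# f(n) = 3n^2 + 2 satisfies f(n) <= c * n^2 for all n >= n0.
def f(n):
    return 3 * n * n + 2

# The bound fails at n = 1 (5 > 4), which is why n0 = 2 is needed.
print(f(1) <= 4 * 1 * 1)                               # False
print(all(f(n) <= 4 * n * n for n in range(2, 1000)))  # True
```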
- Def: f(n) = Ω(g(n)) ("at least", "lower bound") iff there exist c and n0 such
  that |f(n)| >= c|g(n)| for all n >= n0.
  - e.g. f(n) = 3n^2 + 2 = Ω(n^2) or Ω(n)
- Def: f(n) = Θ(g(n)) iff there exist c1, c2, and n0 such that
  c1|g(n)| <= |f(n)| <= c2|g(n)| for all n >= n0.
  - e.g. f(n) = 3n^2 + 2 = Θ(n^2)
- Def: f(n) = o(g(n)) iff lim_{n→∞} f(n)/g(n) = 1.
  - e.g. f(n) = 3n^2 + n = o(3n^2)

Time Complexity Functions

  function   n = 10     n = 10^2    n = 10^3   n = 10^4
  log2 n     3.3        6.6         10         13.3
  n          10         10^2        10^3       10^4
  n log2 n   0.33x10^2  0.7x10^3    10^4       1.3x10^5
  n^2        10^2       10^4        10^6       10^8
  2^n        1024       1.3x10^30   >10^100    >10^100
  n!         3x10^6     >10^100     >10^100    >10^100

Common computing time functions
- O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3), O(2^n), O(n!), O(n^n)
- Exponential algorithm: e.g. O(2^n)
- Polynomial algorithm: e.g. O(n^2), O(n log n)
- Algorithm A: O(n^3); algorithm B: O(n)
  - Should algorithm B always run faster than A? No!
  - It is guaranteed only when n is large enough.

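The last point can be illustrated with hypothetical constant factors (the factor 100 below is an assumption, not from the slides): an O(n) algorithm with a large constant loses to an O(n^3) algorithm for small n.

```python
# Sketch with assumed constants: cost of A is n^3, cost of B is 100n.
def cost_A(n):   # O(n^3), small constant factor
    return n ** 3

def cost_B(n):   # O(n), large (hypothetical) constant factor
    return 100 * n

# B is slower for small n but wins once n is large enough.
print(cost_A(5), cost_B(5))    # 125 vs 500: A is cheaper
print(cost_A(20), cost_B(20))  # 8000 vs 2000: B is cheaper
```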
Analysis of algorithms
- Best case: easiest to analyze
- Worst case
- Average case: hardest to analyze

Straight insertion sort
input: 7, 5, 1, 4, 3
       5, 7, 1, 4, 3
       1, 5, 7, 4, 3
       1, 4, 5, 7, 3
       1, 3, 4, 5, 7

Algorithm 2.1 Straight Insertion Sort
Input: x1, x2, ..., xn
Output: the sorted sequence of x1, x2, ..., xn

For j := 2 to n do
Begin
  i := j - 1
  x := xj
  While i > 0 and x < xi do
  Begin
    x(i+1) := xi
    i := i - 1
  End
  x(i+1) := x
End

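A runnable Python version of Algorithm 2.1, a sketch using 0-based indexing instead of the slides' 1-based indexing:

```python
# Straight insertion sort: insert xs[j] into the sorted prefix xs[0..j-1].
def insertion_sort(xs):
    xs = list(xs)                  # work on a copy
    for j in range(1, len(xs)):
        x = xs[j]                  # element being inserted (the "temporary")
        i = j - 1
        while i >= 0 and x < xs[i]:
            xs[i + 1] = xs[i]      # shift larger elements one slot right
            i -= 1
        xs[i + 1] = x
    return xs

print(insertion_sort([7, 5, 1, 4, 3]))  # [1, 3, 4, 5, 7]
```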
Inversion table
- (a1, a2, ..., an): a permutation of {1, 2, ..., n}
- (d1, d2, ..., dn): the inversion table of (a1, a2, ..., an)
- dj: the number of elements to the left of j that are greater than j
- e.g. permutation (7 5 1 4 3 2 6) has inversion table (2 4 3 2 1 1 0)
- e.g. permutation (7 6 5 4 3 2 1) has inversion table (6 5 4 3 2 1 0)

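The definition can be sketched directly in code; d[j] counts the elements appearing to the left of the value j that are greater than j:

```python
# Compute the inversion table of a permutation of {1, ..., n}.
def inversion_table(perm):
    pos = {v: i for i, v in enumerate(perm)}   # value -> position
    n = len(perm)
    return [sum(1 for v in perm[:pos[j]] if v > j) for j in range(1, n + 1)]

print(inversion_table([7, 5, 1, 4, 3, 2, 6]))  # [2, 4, 3, 2, 1, 1, 0]
print(inversion_table([7, 6, 5, 4, 3, 2, 1]))  # [6, 5, 4, 3, 2, 1, 0]
```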
Analysis of the number of movements
- M: the number of data movements in straight insertion sort
- e.g. inserting 4 into 1 5 7 (via a temporary variable): 4 is moved to the
  temporary, 5 and 7 are shifted right, and 4 is moved back; here d = 2,
  so 2 + 2 movements.
- In general, inserting xj takes 2 + dj movements, so
  M = Σ_{j=2}^{n} (2 + dj)

Analysis by inversion table
- Best case: already sorted, di = 0 for 1 <= i <= n
  M = Σ_{j=2}^{n} 2 = 2(n-1) = O(n)
- Worst case: reversely sorted
  d1 = n-1, d2 = n-2, ..., di = n-i, ..., dn = 0
  M = Σ_{j} (2 + dj) = 2(n-1) + n(n-1)/2 = O(n^2)

- Average case: xj is being inserted into the sorted sequence x1 x2 ... x(j-1)
  (e.g. inserting 5 into 1 4 7)
  - the probability that xj is the largest: 1/j, taking 2 data movements
  - the probability that xj is the second largest: 1/j, taking 3 data movements
  - ...
  - average number of movements for inserting xj:
    (2 + 3 + ... + (j+1)) / j = (j+3)/2
  - M = Σ_{j=2}^{n} (j+3)/2 = O(n^2)

Analysis of the number of exchanges
- Method 1 (straightforward): xj is being inserted into the sorted sequence
  x1 x2 ... x(j-1)
  - If xj is the kth largest (1 <= k <= j), it takes (k-1) exchanges.
  - e.g. 1 5 7 4 → 1 5 4 7 → 1 4 5 7
  - average number of exchanges required for inserting xj:
    Σ_{k=1}^{j} (k-1)/j = (j-1)/2

- Average number of exchanges for the whole sort:
  Σ_{j=2}^{n} (j-1)/2 = n(n-1)/4 = O(n^2)

Method 2: with inversion table and generating function
- In(k): the number of permutations of n numbers which have exactly k inversions

  n\k   0   1   2   3   4   5   6
  1     1   0   0   0   0   0   0
  2     1   1   0   0   0   0   0
  3     1   2   2   1   0   0   0
  4     1   3   5   6   5   3   1

Assume we have I3(k), 0 <= k <= 3. We calculate I4(k).
(1) a1 a2 a3 a4 with a4 the largest: adds 0 inversions → G3(z)
(2) a1 a2 a3 a4 with a4 the second largest: adds 1 inversion → z·G3(z)
(3) a1 a2 a3 a4 with a4 the third largest: adds 2 inversions → z^2·G3(z)
(4) a1 a2 a3 a4 with a4 the smallest: adds 3 inversions → z^3·G3(z)

  case    I4(0)  I4(1)  I4(2)  I4(3)  I4(4)  I4(5)  I4(6)
  1       I3(0)  I3(1)  I3(2)  I3(3)
  2              I3(0)  I3(1)  I3(2)  I3(3)
  3                     I3(0)  I3(1)  I3(2)  I3(3)
  4                            I3(0)  I3(1)  I3(2)  I3(3)
  total   1      3      5      6      5      3      1

- Generating function for In(k):
  Gn(z) = Σ_{k} In(k) z^k
- For n = 4:
  G4(z) = (1 + z + z^2 + z^3) G3(z)
- In general:
  Gn(z) = (1 + z + ... + z^(n-1)) G(n-1)(z)
        = (1)(1 + z)(1 + z + z^2) ... (1 + z + ... + z^(n-1))

- Pn(k): the probability that a permutation of n numbers has exactly k inversions
- Generating function for Pn(k):
  gn(z) = Σ_{k} Pn(k) z^k = Gn(z) / n!

Binary search
- sorted sequence: 1 4 5 7 9 10 12 15  (search for 9)
  step 1, step 2, step 3
- best case: 1 step = O(1)
- worst case: (⌊log2 n⌋ + 1) steps = O(log n)
- average case: O(log n) steps

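A sketch of binary search over a sorted list, returning both the index found and the number of steps (probes) used:

```python
# Binary search: halve the search range each step.
def binary_search(xs, target):
    lo, hi, steps = 0, len(xs) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid, steps
        if xs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps               # unsuccessful search

seq = [1, 4, 5, 7, 9, 10, 12, 15]
print(binary_search(seq, 7))   # (3, 1): the middle element, found in 1 step
print(binary_search(seq, 9))   # (4, 3): found in 3 steps, as on the slide
```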
- n cases for a successful search, n+1 cases for an unsuccessful search
- Average number of comparisons done in the binary decision tree:
  A(n) = (1/(2n+1)) [ Σ_{i=1}^{k} i·2^(i-1) + k(n+1) ], where k = ⌊log2 n⌋ + 1
- As n becomes very large,
  A(n) ≈ k = ⌊log2 n⌋ + 1 = O(log n)

Straight selection sort
7 5 1 4 3
1 5 7 4 3
1 3 7 4 5
1 3 4 7 5
1 3 4 5 7
- Only consider the number of changes of the flag which is used for selecting
  the smallest number in each iteration.
- best case: O(1)
- worst case: O(n^2)
- average case: O(n log n)

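A sketch of straight selection sort that also counts how many times the "flag" (the index of the current minimum) changes during the scans, the quantity analyzed above:

```python
# Selection sort: in each iteration, scan for the smallest remaining
# element; count every change of the flag m.
def selection_sort(xs):
    xs = list(xs)
    flag_changes = 0
    n = len(xs)
    for i in range(n - 1):
        m = i                        # flag: position of the current minimum
        for j in range(i + 1, n):
            if xs[j] < xs[m]:
                m = j                # the flag changes
                flag_changes += 1
        xs[i], xs[m] = xs[m], xs[i]
    return xs, flag_changes

print(selection_sort([7, 5, 1, 4, 3]))
print(selection_sort([1, 2, 3, 4, 5]))  # already sorted: 0 flag changes
```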
Quicksort
- Split the list by an element: smaller elements are moved to its left,
  larger elements to its right.
- Recursively apply the same procedure to the two sublists.

Best case of quicksort
- Best case: O(n log n)
- The list is split into two sublists of almost equal size.
- log n rounds are needed.
- In each round, n comparisons (ignoring the element used to split) are required.

Worst case of quicksort
- Worst case: O(n^2)
- In each round, the number used to split is either the smallest or the largest.

Average case of quicksort
- Average case: O(n log n)

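A minimal quicksort sketch (not in-place), using the first element as the splitting element in line with the description above:

```python
# Quicksort: split by a pivot, recurse on the two sublists.
def quicksort(xs):
    if len(xs) <= 1:
        return list(xs)
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([7, 5, 1, 4, 3]))  # [1, 3, 4, 5, 7]
```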
2-D ranking finding
- Def: Let A = (a1, a2), B = (b1, b2). A dominates B iff a1 > b1 and a2 > b2.
- Def: Given a set S of n points, the rank of a point x is the number of
  points dominated by x.
- e.g. for the five points A, B, C, D, E of the figure:
  rank(A) = 0, rank(B) = 1, rank(C) = 1, rank(D) = 3, rank(E) = 0

- Straightforward algorithm: compare all pairs of points: O(n^2)

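The straightforward O(n^2) algorithm is a direct transcription of the definition. The coordinates below are hypothetical, chosen only to reproduce the ranks in the slide's example:

```python
# Rank of each point = number of points it dominates; compare all pairs.
def ranks(points):
    def dominates(a, b):
        return a[0] > b[0] and a[1] > b[1]
    return [sum(1 for q in points if dominates(p, q)) for p in points]

# Hypothetical coordinates giving rank(A)=0, rank(B)=1, rank(C)=1,
# rank(D)=3, rank(E)=0 as in the example.
pts = {'A': (2, 1), 'B': (3, 3), 'C': (4, 2), 'D': (5, 4), 'E': (1, 5)}
print(dict(zip(pts, ranks(list(pts.values())))))
```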
Divide-and-conquer 2-D ranking finding
- Step 1: Split the points along the median line L into A and B.
- Step 2: Find the ranks of the points in A and in B, recursively.
- Step 3: Sort the points in A and B according to their y-values.
  Update the ranks of the points in B.

Lower bound
- Def: A lower bound of a problem is the least time complexity required for
  any algorithm which can be used to solve this problem.
  - worst case lower bound
  - average case lower bound
- The lower bound for a problem is not unique.
  - e.g. Ω(1), Ω(n), Ω(n log n) are all lower bounds for sorting.
  - (Ω(1) and Ω(n) are trivial.)

- At present, if the highest known lower bound of a problem is Ω(n log n) and
  the time complexity of the best algorithm is O(n^2):
  - We may try to find a higher lower bound.
  - We may try to find a better algorithm.
  - Both the lower bound and the algorithm may be improved.
- If the present lower bound is Ω(n log n) and there is an algorithm with time
  complexity O(n log n), then the algorithm is optimal.

The worst case lower bound of sorting
6 permutations for 3 data elements a1 a2 a3:
  1 2 3
  1 3 2
  2 1 3
  2 3 1
  3 1 2
  3 2 1

Straight insertion sort:
- input data: (2, 3, 1)
  (1) a1 : a2
  (2) a2 : a3, swap a2 and a3
  (3) a1 : a2, swap a1 and a2
- input data: (2, 1, 3)
  (1) a1 : a2, swap a1 and a2
  (2) a2 : a3

Decision tree for straight insertion sort

Decision tree for bubble sort

Lower bound of sorting
- To find the lower bound, we have to find the smallest possible depth of the
  binary decision tree.
- n! distinct permutations → n! leaf nodes in the binary decision tree
- A balanced tree has the smallest depth: ⌈log(n!)⌉ = Ω(n log n)
- Lower bound for sorting: Ω(n log n)
- (See the next page.)

Method 1:
  log(n!) = log 2 + log 3 + ... + log n
          >= log(n/2) + log(n/2 + 1) + ... + log n
          >= (n/2) log(n/2)
          = Ω(n log n)

Method 2:
- Stirling approximation: n! ≈ √(2πn) (n/e)^n
  log(n!) ≈ (1/2) log(2πn) + n log n - n log e = Ω(n log n)

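A quick numeric sanity check of the claim log(n!) = Θ(n log n), using Python's `math.lgamma` to evaluate log2(n!) without overflow: the ratio log2(n!) / (n log2 n) climbs toward 1 as n grows.

```python
# Ratio of log2(n!) to n*log2(n); it approaches 1 (from below) as n grows.
import math

def ratio(n):
    log2_factorial = math.lgamma(n + 1) / math.log(2)   # log2(n!)
    return log2_factorial / (n * math.log2(n))

for n in (10, 100, 1000, 10000):
    print(n, round(ratio(n), 3))
```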
Heapsort — an optimal sorting algorithm
- A heap: the key of every parent is >= the keys of its sons.

- Output the maximum and restore the heap.
- Heapsort:
  - Phase 1: construction
  - Phase 2: output

Phase 1: construction
- input data: 4, 37, 26, 15, 48
- restore the subtree rooted at A(2)
- restore the tree rooted at A(1)

Phase 2: output

Implementation
- using a linear array, not a binary tree
  - The sons of A(h) are A(2h) and A(2h+1).
- time complexity: O(n log n)

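The two phases can be sketched on a 1-based linear array, with the sons of A(h) at A(2h) and A(2h+1) exactly as above (index 0 is left unused):

```python
# Heapsort on a 1-based array: build a max-heap, then repeatedly move
# the maximum to the end and restore the remaining heap.
def heapsort(data):
    a = [None] + list(data)
    n = len(data)

    def restore(h, size):
        # Sift a[h] down until the subtree rooted at h is a max-heap.
        while 2 * h <= size:
            s = 2 * h                          # left son
            if s + 1 <= size and a[s + 1] > a[s]:
                s += 1                         # right son is larger
            if a[h] >= a[s]:
                break
            a[h], a[s] = a[s], a[h]
            h = s

    for h in range(n // 2, 0, -1):             # Phase 1: construction
        restore(h, n)
    for size in range(n, 1, -1):               # Phase 2: output
        a[1], a[size] = a[size], a[1]          # move the maximum to the end
        restore(1, size - 1)
    return a[1:]

print(heapsort([4, 37, 26, 15, 48]))  # [4, 15, 26, 37, 48]
```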
Time complexity of Phase 1 (construction):
  restoring a subtree whose root is on level L costs at most (d - L) steps,
  so the total is Σ_{L=0}^{d-1} 2^L (d - L) = O(n), where d = ⌊log2 n⌋

Time complexity of Phase 2 (output):
  the maximum is output n - 1 times, each followed by a restore of at most
  ⌊log2 i⌋ levels, giving O(n log n)

Average case lower bound of sorting
- By the binary decision tree.
- The average time complexity of a sorting algorithm:
  (the external path length of the binary decision tree) / n!
- The external path length is minimized iff the tree is balanced
  (all leaf nodes on level d or level d-1).

Computing the minimum external path length
1. Depth of a balanced binary tree with c leaf nodes: d = ⌈log c⌉.
   Leaf nodes can appear only on level d or d-1.
2. x1 leaf nodes on level d-1, x2 leaf nodes on level d:
     x1 + x2 = c
     x1 + x2/2 = 2^(d-1)
   => x1 = 2^d - c
      x2 = 2(c - 2^(d-1))

3. External path length:
     M = x1(d-1) + x2·d
       = (2^d - c)(d-1) + 2(c - 2^(d-1))·d
       = c(d-1) + 2(c - 2^(d-1)),  where d-1 = ⌊log c⌋
       = c⌊log c⌋ + 2(c - 2^⌊log c⌋)
4. c = n!:
     M = n!⌊log n!⌋ + 2(n! - 2^⌊log n!⌋)
     M/n! = ⌊log n!⌋ + c',  0 <= c' <= 1
          = Ω(n log n)
   Average case lower bound of sorting: Ω(n log n)

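The closed form above can be evaluated numerically; a small sketch checking that for c = n! leaves the minimum average external path length stays within 1 of log2(n!):

```python
# Minimum average external path length of a balanced binary tree with
# c leaves, using the x1/x2 formulas derived above.
import math

def min_avg_external_path(c):
    d = math.ceil(math.log2(c))          # depth of the balanced tree
    x1 = 2 ** d - c                      # leaves on level d-1
    x2 = 2 * (c - 2 ** (d - 1))          # leaves on level d
    return (x1 * (d - 1) + x2 * d) / c

for n in (3, 4, 5):
    c = math.factorial(n)
    print(n, min_avg_external_path(c), math.log2(c))
```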
Quicksort & Heapsort
- Quicksort is optimal in the average case (O(n log n) on average).
- (i) The worst case time complexity of heapsort is O(n log n).
  (ii) The average case lower bound is Ω(n log n).
  => The average case time complexity of heapsort is O(n log n).
  => Heapsort is optimal in the average case.

Improving a lower bound through oracles
- Problem P: merge two sorted sequences A and B with lengths m and n.
- Conventional 2-way merging:
    2 3 5 6
    1 4 7 8
- Complexity: at most m + n - 1 comparisons

(1) Binary decision tree:
- There are C(m+n, m) possible interleavings → C(m+n, m) leaf nodes in the
  decision tree.
- The lower bound for merging: ⌈log C(m+n, m)⌉ <= m + n - 1
  (the cost of conventional merging)

- When m = n, using the Stirling approximation:
  ⌈log C(2m, m)⌉ ≈ 2m - (1/2) log m
- The conventional merging algorithm needs 2m - 1 comparisons, so the
  decision-tree lower bound is not tight; a better lower bound is needed to
  show the algorithm optimal.

(2) Oracle:
- The oracle tries its best to cause the algorithm to work as hard as possible
  (it supplies a very hard data set).
- Two sorted sequences:
  A: a1 < a2 < ... < am
  B: b1 < b2 < ... < bm
- The very hard case: a1 < b1 < a2 < b2 < ... < am < bm

- We must compare:
  a1 : b1, b1 : a2, a2 : b2, ..., b(m-1) : am, am : bm
- Otherwise, we may get a wrong result for some input data.
  e.g. if b1 and a2 are not compared, we cannot distinguish
  a1 < b1 < a2 < b2 < ... < am < bm  from
  a1 < a2 < b1 < b2 < ... < am < bm
- Thus, at least 2m - 1 comparisons are required.
- The conventional merging algorithm is optimal for m = n.

Finding lower bounds by problem transformation
- Problem A reduces to problem B (A ∝ B) iff A can be solved by using any
  algorithm which solves B.
- If A ∝ B, B is more difficult.
- Note: if T(tr1) + T(tr2) < T(B), where tr1 transforms an instance of A into
  an instance of B and tr2 transforms the answer back, then
  T(A) <= T(tr1) + T(tr2) + T(B) = O(T(B))

The lower bound of the convex hull problem
- sorting ∝ convex hull
      A          B
- an instance of A: (x1, x2, ..., xn)
  ↓ transformation
  an instance of B: {(x1, x1^2), (x2, x2^2), ..., (xn, xn^2)}
- assume: x1 < x2 < ... < xn

- If the convex hull problem can be solved, we can also solve the sorting
  problem.
- The lower bound of sorting is Ω(n log n), so the lower bound of the convex
  hull problem is Ω(n log n).

The lower bound of the Euclidean minimal spanning tree (MST) problem
- sorting ∝ Euclidean MST
      A          B
- an instance of A: (x1, x2, ..., xn)
  ↓ transformation
  an instance of B: {(x1, 0), (x2, 0), ..., (xn, 0)}
- Assume x1 < x2 < x3 < ... < xn; then there is an edge between (xi, 0) and
  (x(i+1), 0) in the MST, where 1 <= i <= n-1.

- If the Euclidean MST problem can be solved, we can also solve the sorting
  problem.
- The lower bound of sorting is Ω(n log n), so the lower bound of the
  Euclidean MST problem is Ω(n log n).