Mergesort, Analysis of Algorithms
John von Neumann and ENIAC (1945)
Why Does It Matter?

Run time (nanoseconds):

                          1.3 N^3       10 N^2        47 N log2 N    48 N
Time to solve a
problem of size
  1,000                   1.3 seconds   10 msec       0.4 msec       0.048 msec
  10,000                  22 minutes    1 second      6 msec         0.48 msec
  100,000                 15 days       1.7 minutes   78 msec        4.8 msec
  million                 41 years      2.8 hours     0.94 seconds   48 msec
  10 million              41 millennia  1.7 weeks     11 seconds     0.48 seconds

Max size problem
solved in one
  second                  920           10,000        1 million      21 million
  minute                  3,600         77,000        49 million     1.3 billion
  hour                    14,000        600,000       2.4 billion    76 billion
  day                     41,000        2.9 million   50 billion     1,800 billion

N multiplied by 10,
time multiplied by        1,000         100           10+            10
Orders of Magnitude

Seconds   Equivalent
1         1 second
10        10 seconds
10^2      1.7 minutes
10^3      17 minutes
10^4      2.8 hours
10^5      1.1 days
10^6      1.6 weeks
10^7      3.8 months
10^8      3.1 years
10^9      3.1 decades
10^10     3.1 centuries
...       forever
10^21     age of universe

Meters Per Second   Imperial Units    Example
10^-10              1.2 in / decade   Continental drift
10^-8               1 ft / year       Hair growing
10^-6               3.4 in / day      Glacier
10^-4               1.2 ft / hour     Gastro-intestinal tract
10^-2               2 ft / minute     Ant
1                   2.2 mi / hour     Human walk
10^2                220 mi / hour     Propeller airplane
10^4                370 mi / min      Space shuttle
10^6                620 mi / sec      Earth in galactic orbit
10^8                62,000 mi / sec   1/3 speed of light

Powers of 2: 2^10 ~ thousand, 2^20 ~ million, 2^30 ~ billion
Impact of Better Algorithms

Example 1: N-body simulation.
- Simulate gravitational interactions among N bodies.
  - physicists want N = # atoms in universe
- Brute force method: N^2 steps.
- Appel (1981). N log N steps, enables new research.

Example 2: Discrete Fourier Transform (DFT).
- Breaks down waveforms (sound) into periodic components.
  - foundation of signal processing: CD players, JPEG, analyzing astronomical data, etc.
- Grade school method: N^2 steps.
- Runge-König (1924), Cooley-Tukey (1965). FFT algorithm: N log N steps, enables new technology.
Mergesort (divide-and-conquer)
- Divide array into two halves.

A L G O R I T H M S
A L G O R   I T H M S   divide
Mergesort (divide-and-conquer)
- Divide array into two halves.
- Recursively sort each half.

A L G O R   I T H M S   divide
A G L O R   H I M S T   sort
Mergesort (divide-and-conquer)
- Divide array into two halves.
- Recursively sort each half.
- Merge two halves to make sorted whole.

A L G O R   I T H M S   divide
A G L O R   H I M S T   sort
A G H I L M O R S T     merge
Mergesort Analysis

How long does mergesort take?
- Bottleneck = merging (and copying).
  - merging two files of size N/2 requires N comparisons
- T(N) = comparisons to mergesort N elements.
  - to make analysis cleaner, assume N is a power of 2

Claim. T(N) = N log2 N.
- Note: same number of comparisons for ANY file.
  - even if already sorted
- We'll prove the claim several different ways to illustrate standard techniques.
Proof by Picture of Recursion Tree

[Recursion tree: the root T(N) does N work to merge; level 1 has two
nodes T(N/2) doing 2(N/2) work; level 2 has four nodes T(N/4) doing
4(N/4) work; level k has 2^k nodes T(N/2^k) doing 2^k (N/2^k) = N work;
the bottom level has N/2 nodes T(2) doing (N/2)(2) = N work. With
log2 N levels of N work each, the total is N log2 N.]
Proof by Telescoping

Claim. T(N) = N log2 N (when N is a power of 2).

Proof. For N > 1, divide the recurrence T(N) = 2 T(N/2) + N through by N:

  T(N)/N = T(N/2)/(N/2) + 1
         = T(N/4)/(N/4) + 1 + 1
         = ...
         = T(N/N)/(N/N) + log2 N
         = log2 N.

Hence T(N) = N log2 N.
Mathematical Induction

Mathematical induction.
- Powerful and general proof technique in discrete mathematics.
- To prove a theorem true for all integers N >= 0:
  - Base case: prove it true for N = 0.
  - Induction hypothesis: assume it is true for arbitrary N.
  - Induction step: show it is true for N + 1.

Claim: 0 + 1 + 2 + 3 + ... + N = N(N+1) / 2 for all N >= 0.

Proof: (by mathematical induction)
- Base case (N = 0): 0 = 0(0+1) / 2.
- Induction hypothesis: assume 0 + 1 + 2 + ... + N = N(N+1) / 2.
- Induction step:
  0 + 1 + ... + N + (N+1) = (0 + 1 + ... + N) + (N+1)
                          = N(N+1)/2 + (N+1)
                          = (N+2)(N+1)/2.
Proof by Induction

Claim. T(N) = N log2 N (when N is a power of 2).

Proof. (by induction on N)
- Base case: N = 1. T(1) = 0 = 1 log2 1.
- Inductive hypothesis: T(N) = N log2 N.
- Goal: show that T(2N) = 2N log2 (2N).

  T(2N) = 2 T(N) + 2N
        = 2 N log2 N + 2N
        = 2 N (log2 (2N) - 1) + 2N
        = 2 N log2 (2N).
Proof by Induction

What if N is not a power of 2?
- T(N) satisfies the following recurrence:

  T(1) = 0,  T(N) = T(⌈N/2⌉) + T(⌊N/2⌋) + N.

Claim. T(N) <= N ⌈log2 N⌉.

Proof. See supplemental slides.
Computational Complexity

Framework to study efficiency of algorithms. Example = sorting.
- MACHINE MODEL = count fundamental operations.
  - count number of comparisons
- UPPER BOUND = algorithm to solve the problem (worst-case).
  - N log2 N from mergesort
- LOWER BOUND = proof that no algorithm can do better.
  - N log2 N - N log2 e
- OPTIMAL ALGORITHM: lower bound ~ upper bound.
  - mergesort
Decision Tree

[Decision tree for sorting a1, a2, a3. The root compares a1 < a2.
On YES, compare a2 < a3: YES prints a1, a2, a3; NO compares a1 < a3
(YES prints a1, a3, a2; NO prints a3, a1, a2). On NO, compare a1 < a3:
YES prints a2, a1, a3; NO compares a2 < a3 (YES prints a2, a3, a1;
NO prints a3, a2, a1).]
Comparison Based Sorting Lower Bound

Theorem. Any comparison based sorting algorithm must use Ω(N log2 N) comparisons.

Proof. Worst case is dictated by the tree height h.
- N! different orderings.
- One (or more) leaves corresponding to each ordering.
- A binary tree with N! leaves must have height

  h >= log2 (N!) >= log2 (N/e)^N = N log2 N - N log2 e.

  (The middle step uses Stirling's formula.)

Food for thought. What if we don't use comparisons? Stay tuned for radix sort.
Extra Slides
Proof by Induction Claim. T(N) N log 2 N. Proof. (by induction on N) n n n Base case: N = 1. Define n 1 = N / 2 , n 2 = N / 2. Induction step: assume true for 1, 2, . . . , N – 1. 18
Implementing Mergesort

mergesort (see Sedgewick Program 8.3)

Item aux[MAXN];    /* uses scratch array */

void mergesort(Item a[], int left, int right)
{
    int mid = (right + left) / 2;
    if (right <= left) return;
    mergesort(a, left, mid);
    mergesort(a, mid + 1, right);
    merge(a, left, mid, right);
}
Implementing Mergesort

merge (see Sedgewick Program 8.2)

void merge(Item a[], int left, int mid, int right)
{
    int i, j, k;

    /* copy to temporary array */
    for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
    for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];

    /* merge two sorted sequences */
    for (k = left; k <= right; k++)
        if (ITEMless(aux[i], aux[j]))
            a[k] = aux[i++];
        else
            a[k] = aux[j--];
}
Profiling Mergesort Empirically

Mergesort prof.out (execution counts in <angle brackets>):

void merge(Item a[], int left, int mid, int right)
<999>{
    int i, j, k;
    for (<999>i = mid+1; <6043>i > left; <5044>i--)
        <5044>aux[i-1] = a[i-1];
    for (<999>j = mid; <5931>j < right; <4932>j++)
        <4932>aux[right+mid-j] = a[j+1];
    for (<999>k = left; <10975>k <= right; <9976>k++)
        if (<9976>ITEMless(aux[i], aux[j]))
            <4543>a[k] = aux[i++];
        else
            <5433>a[k] = aux[j--];
<999>}

void mergesort(Item a[], int left, int right)
<1999>{
    int mid = <1999>(right + left) / 2;
    if (<1999>right <= left) return<1000>;
    <999>mergesort(a, aux, left, mid);
    <999>mergesort(a, aux, mid+1, right);
    <999>merge(a, aux, left, mid, right);
<1999>}

Striking feature: all numbers SMALL!

# comparisons: theory ~ N log2 N = 9,966; actual = 9,976.
Sorting Analysis Summary

Running time estimates:
- Home pc executes 10^8 comparisons/second.
- Supercomputer executes 10^12 comparisons/second.

Insertion Sort (N^2):
  computer   thousand   million     billion
  home       instant    2.8 hours   317 years
  super      instant    1 second    1.6 weeks

Mergesort (N log N):
  computer   thousand   million   billion
  home       instant    1 sec     18 min
  super      instant    instant   instant

Quicksort (N log N):
  computer   thousand   million   billion
  home       instant    0.3 sec   6 min
  super      instant    instant   instant

Lesson 1: good algorithms are better than supercomputers.
Lesson 2: great algorithms are better than good ones.