Lecture 25 Selection sort reviewed Insertion sort reviewed

Lecture 25
• Selection sort, reviewed
• Insertion sort, reviewed
• Merge sort
• Running time of merge sort, 2 ways to look at it
• Quicksort
• Course evaluations

Selection sort
• for k = 1 to n-1
  – find kth smallest item
  – swap it with the one in the kth position
• number of comparisons
  – n-1 for k=1
  – n-2 for k=2
  – …
  – 1 for k=n-1
• total: n(n-1)/2
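The loop above can be sketched in Java as follows. This is only an illustration, not code from the lecture: the class name SelectionSortSketch and the comparison counter are assumptions added so the n(n-1)/2 total can be checked, and positions are 0-based, so k runs from 0 to n-2 instead of 1 to n-1.

```java
import java.util.Arrays;

// Hypothetical sketch of selection sort; names are illustrative only.
class SelectionSortSketch {
    /** Sorts x in place and returns the number of element comparisons made. */
    static long selectionSort(double[] x) {
        long comparisons = 0;
        int n = x.length;
        for (int k = 0; k < n - 1; k++) {
            // Find the smallest item among positions k..n-1.
            int smallest = k;
            for (int i = k + 1; i < n; i++) {
                comparisons++;
                if (x[i] < x[smallest]) {
                    smallest = i;
                }
            }
            // Swap it with the item in position k.
            double t = x[k];
            x[k] = x[smallest];
            x[smallest] = t;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        double[] x = {5, 2, 9, 1, 7};
        long c = selectionSort(x);
        System.out.println(Arrays.toString(x) + " comparisons=" + c);
    }
}
```

For n = 5 the counter comes out as 4 + 3 + 2 + 1 = 10 = n(n-1)/2, whatever the input order.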

Insertion sort
• for k = 2 to n
  – move kth item into sorted order with respect to the first k-1 items, which are already sorted
  – involves moving sorted items over: can do this at the same time as finding where kth item has to go
• number of comparisons: worst case
  – 1 for k=2
  – 2 for k=3
  – …
  – n-1 for k=n
• total: n(n-1)/2
• on average, about n(n-1)/4, but involves more moving of data than selection sort
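A minimal Java sketch of the idea above, shifting sorted items over while searching for the insertion point. The class name InsertionSortSketch and the comparison counter are assumptions for illustration, and indices are 0-based.

```java
import java.util.Arrays;

// Hypothetical sketch of insertion sort; names are illustrative only.
class InsertionSortSketch {
    /** Sorts x in place and returns the number of element comparisons made. */
    static long insertionSort(double[] x) {
        long comparisons = 0;
        for (int k = 1; k < x.length; k++) {
            double item = x[k];
            int i = k - 1;
            // Move larger sorted items one place right while looking
            // for the position where item belongs.
            while (i >= 0) {
                comparisons++;
                if (x[i] <= item) {
                    break;
                }
                x[i + 1] = x[i];
                i--;
            }
            x[i + 1] = item;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        double[] worst = {5, 4, 3, 2, 1};   // reverse order: worst case
        long c = insertionSort(worst);
        System.out.println(Arrays.toString(worst) + " comparisons=" + c);
    }
}
```

On reverse-ordered input of length 5 this makes 1 + 2 + 3 + 4 = 10 comparisons, the worst case; on input that is already sorted it makes only n-1.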

Bubble sort
• Pass through data, swapping neighbors that are out of order
• After first pass, largest element has “sunk” to bottom
• Repeat until no more swaps are needed, considering one less pair each time
• Cost: n-1 comparisons for first pass, n-2 for 2nd pass, etc.
• May need n-1 passes, so also “O(n^2)”
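The passes described above might look like this in Java. This is a sketch under stated assumptions: the class name BubbleSortSketch, the variable last, and the comparison counter are illustrative and not from the slides.

```java
import java.util.Arrays;

// Hypothetical sketch of bubble sort; names are illustrative only.
class BubbleSortSketch {
    /** Sorts x in place and returns the number of element comparisons made. */
    static long bubbleSort(double[] x) {
        long comparisons = 0;
        int last = x.length - 1;      // right end of the last pair to consider
        boolean swapped = true;
        while (swapped && last > 0) {
            swapped = false;
            for (int i = 0; i < last; i++) {
                comparisons++;
                if (x[i] > x[i + 1]) {
                    double t = x[i];
                    x[i] = x[i + 1];
                    x[i + 1] = t;
                    swapped = true;
                }
            }
            last--;                   // largest remaining item is now in place
        }
        return comparisons;
    }

    public static void main(String[] args) {
        double[] x = {5, 4, 3, 2, 1};
        long c = bubbleSort(x);
        System.out.println(Arrays.toString(x) + " comparisons=" + c);
    }
}
```

Reverse-ordered input needs all n-1 passes (10 comparisons for n = 5), while already sorted input stops after a single pass of n-1 comparisons.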

Merge sort
• Recursive
• If only one item, return
• Sort first half by merge sort
• Sort second half by merge sort
• Merge the results

Keys to efficiency
• Recursion: reduce problem to two problems of half the size
• Also known as: divide and conquer
• Merge step: requires at most n comparisons, where n is number of items to be merged
• Need two arrays, but no more
• Do not construct arrays inside the recursive call!!! (The comprehensive edition of Liang does this)
• How should we implement this in Java?
• public static void sort(what parameters?)
• We cannot pass “half an array”, so what to do?

Details for sort

public static void sort(int left, int right, double[] x, double[] temp){
    if(left == right){
        return;
    }
    int mid = (left+right)/2;
    sort(left, mid, x, temp);          // sort first half of x using temp
    sort(mid+1, right, x, temp);       // sort second half of x using temp
    merge(left, mid, right, x, temp);  // merge them into temp
    copy(left, right, temp, x);        // copy temp back to x
}

Details for merge(left, mid, right, x, temp)
• Keep 3 ints: i, j, k
• i initialized to left
• j initialized to mid+1
• k initialized to left
• advance until i > mid or j > right
• copy the rest of one or the other half
• cute trick: 2 additional while loops
• another trick: do nothing if x[mid] <= x[mid+1]
• for this reason, better move the copy inside merge
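Putting the pieces together, one possible Java version of sort and merge is sketched below. It follows the bullets above (three indices, two extra while loops, the early return when x[mid] <= x[mid+1], and the copy moved inside merge), but it is not the lecture's own code: the class name MergeSortSketch and the one-argument convenience method sort(double[]) are assumptions.

```java
import java.util.Arrays;

// Hypothetical sketch of merge sort with a single shared temp array.
class MergeSortSketch {
    /** Convenience entry point (an assumption): allocates temp once, outside the recursion. */
    static void sort(double[] x) {
        if (x.length < 2) {
            return;
        }
        double[] temp = new double[x.length];
        sort(0, x.length - 1, x, temp);
    }

    static void sort(int left, int right, double[] x, double[] temp) {
        if (left >= right) {
            return;                          // one item: nothing to do
        }
        int mid = left + (right - left) / 2; // rounds down
        sort(left, mid, x, temp);            // sort first half
        sort(mid + 1, right, x, temp);       // sort second half
        merge(left, mid, right, x, temp);    // merge, including the copy back
    }

    static void merge(int left, int mid, int right, double[] x, double[] temp) {
        if (x[mid] <= x[mid + 1]) {
            return;                          // halves already in order: no merge, no copy
        }
        int i = left;                        // next unread item in first half
        int j = mid + 1;                     // next unread item in second half
        int k = left;                        // next free slot in temp
        while (i <= mid && j <= right) {
            if (x[i] <= x[j]) {
                temp[k++] = x[i++];
            } else {
                temp[k++] = x[j++];
            }
        }
        while (i <= mid) {                   // rest of first half, if any
            temp[k++] = x[i++];
        }
        while (j <= right) {                 // rest of second half, if any
            temp[k++] = x[j++];
        }
        for (k = left; k <= right; k++) {    // copy merged run back into x
            x[k] = temp[k];
        }
    }

    public static void main(String[] args) {
        double[] x = {8, 3, 5, 1, 9, 2, 7};
        sort(x);
        System.out.println(Arrays.toString(x));
    }
}
```

Only one of the two trailing while loops ever does any work, which is why no test is needed to decide which half has items left over.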

Number of comparisons
• Let C(n) = number of comparisons to sort array of length n by merge sort
• Clearly, C(n) = 2 C(n/2) + (n-1), if n is even and n > 0
• or a little less, if we program merge efficiently
• C(1) = 0
• C(2) = 1
• C(4) = 5
• C(8) = 17
• C(16) = 49
• Hard to see a pattern, but when we double n, the number of comparisons is also doubled, plus an extra n-1 comparisons

Let’s simplify this slightly
• Count the merge as n comparisons, instead of n-1
• C(n) = 2 C(n/2) + n, if n is even and n > 0
• C(1) = 0
• C(2) = 2
• C(4) = 8
• C(8) = 24
• C(16) = 64
• Do you see a pattern now?
• Assume n is a power of 2, say n = 2^k
• By inspection, C(n) = k n
• In other words, C(n) = n log_2(n)
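The two recurrences can be evaluated directly to check the pattern. The Java sketch below is illustrative only: the class name MergeRecurrenceSketch, the method names, and the exactMerge flag (true charges n-1 per merge, false charges n) are assumptions, and n is assumed to be a power of 2.

```java
// Hypothetical sketch: evaluate C(n) = 2 C(n/2) + merge cost for n a power of 2.
class MergeRecurrenceSketch {
    /** C(n) with merge cost n-1 (exactMerge) or n (simplified). Assumes n is a power of 2. */
    static long comparisons(int n, boolean exactMerge) {
        if (n == 1) {
            return 0;
        }
        long mergeCost = exactMerge ? n - 1 : n;
        return 2 * comparisons(n / 2, exactMerge) + mergeCost;
    }

    /** n * log_2(n) for n = 2^k, computed exactly as k * n. */
    static long nLog2n(int n) {
        int k = Integer.numberOfTrailingZeros(n);   // k such that n = 2^k
        return (long) k * n;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 16; n *= 2) {
            System.out.println("n=" + n
                    + " exact=" + comparisons(n, true)
                    + " simplified=" + comparisons(n, false)
                    + " n*log2(n)=" + nLog2n(n));
        }
    }
}
```

The simplified count matches k n for every power of 2 tried, and the n-1 version reproduces the values 0, 1, 5, 17, 49 listed earlier.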

Proof by induction on k
• Base case (k = 0): C(1) = 1 log_2(1) = 0
• Inductive hypothesis: suppose true for k-1: C(2^(k-1)) = (k-1) 2^(k-1)
• Just like: assume the recursive magic works
• Want to prove true for k, that is, for n = 2^k
• C(2^k) = 2 C(2^(k-1)) + 2^k
•        = 2 (k-1) 2^(k-1) + 2^k
•        = (k-1) 2^k + 2^k
•        = k 2^k

Proof that all horses have same color
• proof by induction on number of horses
• if n=1, true
• inductive hypothesis: suppose true for n-1: if you have n-1 horses, they are all same color
• given n horses in a corral, take 1 out: rest must have same color
• put it back and take a different one out: rest must have same color
• therefore they are all the same color
• what is wrong with this?

An easier way to think about merge sort
• 1 call to sort array of length n
• 2 calls to sort array of length n/2
• 4 calls to sort array of length n/4
• 8 calls to sort array of length n/8
• …
• n calls to sort array of length 1
• at each level, the merge operations take a total of at most n comparisons
• since the number of levels is log_2 n, the total number of comparisons is at most n log_2 n

What if n is not a power of 2?
• Does not really matter: mid = (left + right)/2 rounds down, which is fine
• Number of comparisons is still approximately n log_2 n (which is not an integer)
• We say running time is O(n log_2 n)
• Base of logarithm doesn’t matter much, because log_10 n = log_2 n / log_2 10

Quicksort
• Similar recursive idea, but avoids the need for 2 arrays
• Chooses a pivot element, then divides array into two parts, one with elements ≤ pivot, and one with elements > pivot
• Possible ways to choose pivot: see next page
• To partition the array, loop from left to find first element > pivot, and loop from right to find first element <= pivot
• Swap them and repeat
• Place pivot in right place
• Make two recursive calls
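The partition steps above can be sketched in Java as follows. This is one possible reading of the bullets, not the lecture's code: the class name QuicksortSketch, the helper names, and the choice of the first position as pivot are assumptions made to keep the example short.

```java
import java.util.Arrays;

// Hypothetical sketch of quicksort using the first item as pivot.
class QuicksortSketch {
    static void quicksort(double[] x) {
        quicksort(x, 0, x.length - 1);
    }

    static void quicksort(double[] x, int left, int right) {
        if (left >= right) {
            return;                       // zero or one item
        }
        int p = partition(x, left, right);
        quicksort(x, left, p - 1);        // items <= pivot
        quicksort(x, p + 1, right);       // items > pivot
    }

    /** Partitions x[left..right] around x[left]; returns the pivot's final index. */
    static int partition(double[] x, int left, int right) {
        double pivot = x[left];
        int i = left + 1;
        int j = right;
        while (true) {
            while (i <= j && x[i] <= pivot) {
                i++;                      // from left: first element > pivot
            }
            while (i <= j && x[j] > pivot) {
                j--;                      // from right: first element <= pivot
            }
            if (i > j) {
                break;
            }
            swap(x, i, j);                // swap them and repeat
            i++;
            j--;
        }
        swap(x, left, j);                 // place pivot in its final position
        return j;
    }

    static void swap(double[] x, int a, int b) {
        double t = x[a];
        x[a] = x[b];
        x[b] = t;
    }

    public static void main(String[] args) {
        double[] x = {4, 9, 1, 7, 4, 2, 8};
        quicksort(x);
        System.out.println(Arrays.toString(x));
    }
}
```

With the first position as pivot, input that is already sorted splits into parts of size 0 and n-1 at every step, which is the motivation for the other pivot choices.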

How to choose the pivot
• Goal: want to choose pivot to divide array approximately in half, but want to do this fast
• Ideas?
• first position
• middle position
• average value
• median value
• randomly chosen
• median of items in first, middle and last positions
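As an illustration of the last idea in the list, the Java sketch below returns the index of the median of the items in the first, middle and last positions. The class and method names are hypothetical; a quicksort could swap the item at the returned index into whatever position its partition step expects the pivot.

```java
// Hypothetical sketch of median-of-three pivot selection.
class PivotChoiceSketch {
    /** Index (left, mid, or right) holding the median of x[left], x[mid], x[right]. */
    static int medianOfThreeIndex(double[] x, int left, int right) {
        int mid = left + (right - left) / 2;
        double a = x[left];
        double b = x[mid];
        double c = x[right];
        if ((a <= b && b <= c) || (c <= b && b <= a)) {
            return mid;       // middle item lies between the other two
        }
        if ((b <= a && a <= c) || (c <= a && a <= b)) {
            return left;      // first item lies between the other two
        }
        return right;         // otherwise the last item is the median
    }

    public static void main(String[] args) {
        double[] x = {2, 9, 5, 1, 7};
        int p = medianOfThreeIndex(x, 0, x.length - 1);
        System.out.println("pivot index=" + p + " value=" + x[p]);
    }
}
```

This costs only a handful of comparisons, and on already sorted input it picks the true middle item.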

Number of comparisons for quicksort
• Assuming arrays divided approximately in half each time, same as merge sort
• Advantage: only one array
• Disadvantage: items with equal values may end up interchanged from their original order
• Doesn’t matter for primitive types, but may matter for objects
• For this reason the Arrays.sort methods use quicksort for primitive types and merge sort for Comparable objects
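Why the order of equal items matters for objects can be seen with a small Java example. It relies on the documented guarantee that Arrays.sort on an object array is stable; the class name StableSortSketch, the method name, and the sample strings are made up for illustration. Each string is sorted by its first letter only, so "a1", "a2", "a3" count as equal keys.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: a stable sort keeps equal-keyed objects in their original order.
class StableSortSketch {
    /** Returns a copy of items sorted by first character only. */
    static String[] sortByFirstLetter(String[] items) {
        String[] copy = Arrays.copyOf(items, items.length);
        // Arrays.sort on objects is stable: ties keep their input order.
        Arrays.sort(copy, Comparator.comparing((String s) -> s.charAt(0)));
        return copy;
    }

    public static void main(String[] args) {
        String[] items = {"b1", "a1", "b2", "a2", "a3"};
        System.out.println(Arrays.toString(sortByFirstLetter(items)));
    }
}
```

The items starting with "a" come out as a1, a2, a3 and those starting with "b" as b1, b2, the same relative order as in the input. A sort that may interchange equal items could not promise this.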

Other sorts
• Heapsort: slightly slower than quicksort on average, but number of comparisons is O(n log_2 n) even in worst case, like merge sort
• Radix sort

Big O Notation
• We say a function f(n) is O(g(n)) if there is a constant c so that f(n) ≤ c g(n) for all sufficiently large n
• Thus we say the number of comparisons for merge sort is O(n log n)
• We don’t need to write the base
• Which of these functions are O of the others:
• log n, 2 n, 10 n, n^2, n^3, n, n log n