
Inventing A Really Bad Sort: It Seemed Like A Good Idea At The Time
Jim Huggins
Kettering University
jhuggins@kettering.edu
http://www.kettering.edu/~jhuggins

Bless me, Father Knuth, for I have sinned …

The Set-Up
Writing questions for take-home exam for Advanced Algorithms course (sophomore)
Desired: an algorithm which:
• does something useful
• is simple to analyze
• hasn’t been done before on the web

Inspiration: StoogeSort
    public void StoogeSort(int[] arr, int start, int stop) {
        if (arr[start] > arr[stop]) {
            int swap = arr[start];
            arr[start] = arr[stop];
            arr[stop] = swap;
        }
        if (start+1 >= stop) return;
        int third = (stop - start + 1) / 3;
        StoogeSort(arr, start, stop-third);   // First two-thirds
        StoogeSort(arr, start+third, stop);   // Last two-thirds
        StoogeSort(arr, start, stop-third);   // First two-thirds
    }
Comparison count: T(n) = 3 T(⅔n) + 1; T(0) = 0, T(1) = 0, T(2) = 1
T(n) = Θ(n^(log_{3/2} 3)) ≈ Θ(n^2.7)
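The jump from the recurrence to the bound can be filled in with the Master Theorem. The following is a sketch of that step, not taken from the slide; the symbols a, b, f(n), and ε are the usual Master Theorem names, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
T(n) &= 3\,T\!\left(\tfrac{2}{3}n\right) + 1,
  \qquad a = 3,\quad b = \tfrac{3}{2},\quad f(n) = \Theta(1) \\
\log_b a &= \log_{3/2} 3 = \frac{\ln 3}{\ln 1.5} \approx 2.71 \\
f(n) &= O\!\left(n^{\log_b a - \varepsilon}\right) \text{ for some } \varepsilon > 0
  \;\Longrightarrow\;
  T(n) = \Theta\!\left(n^{\log_{3/2} 3}\right) \approx \Theta\!\left(n^{2.71}\right)
\end{align*}
```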

My Idea: GoofySort
    public void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop-1);        // first n-1 items
        if (array[stop-1] > array[stop]) {      // swap last item
            int swap = array[stop-1];
            array[stop-1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop-1);    // first n-1 items again
        }
    }
An added bonus: different behavior in best case, worst case. (More questions!)

Analysis of GoofySort: Best Case
    public void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop-1);
        if (array[stop-1] > array[stop]) {      // best case: false
            int swap = array[stop-1];
            array[stop-1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop-1);
        }
    }
Comparison count: T(n) = T(n-1) + 1, T(1) = 0
T(n) = O(n)

Analysis of GoofySort: Worst Case
    public void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop-1);
        if (array[stop-1] > array[stop]) {      // worst case: true
            int swap = array[stop-1];
            array[stop-1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop-1);
        }
    }
Comparison count: T(n) = 2 T(n-1) + 1, T(1) = 0
T(n) = O(2^n)

“Beware of bugs in the above code; I have only proved it correct, not tried it.”

And so, to avoid embarrassment …
• Coded the algorithm in Java
• Tested with a variety of random inputs
• Tested with a variety of list sizes: 20, 30, 40, …
• And it all works! Great!
(What could possibly go wrong?)

Actual Student Answers:
• T(n) = O(n^3)
• T(n) = T(n-1) + O(n^2) = O(n^3)
• T(n) = T(n-1) + Σ_{i=1}^{n} i = ?
• T(n) = O(2^n)
  – One bright student, at least!

Preparing to hand them back…
• Preparing my rant …
  – “you completely missed the point”
  – “we did this in class …”
  – “I even tested this on lots of inputs...”
• And then I remember:
  – If this is really exponential time, how did I run it on an input of size 40?
  – @#@!. What if I’m wrong and they’re right?

Bentley: Three Beautiful Quicksorts
Paraphrasing: “If you double the input size, and the instruction count quadruples, you’ve got a quadratic algorithm.”
(Watch the Google Tech Talk … it’s neat.)

Racing to the computer
Take the average over 100 random runs …
    n       10     20     40     80
    T(n)    87     761    6609   54907
    ratio          8.74   8.68   8.31
@#$!. It looks like it’s cubic!
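The doubling experiment can be reproduced with an instrumented copy of GoofySort. This is a minimal sketch, not the author's actual test harness: the class name `GoofySortDoubling`, the `comparisons` counter, the helper `averageComparisons`, and the fixed random seed are all assumptions made for illustration.

```java
import java.util.Random;

class GoofySortDoubling {
    // Running total of element comparisons (hypothetical instrumentation).
    static long comparisons = 0;

    // GoofySort as shown on the slides, with one counter increment added.
    static void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop - 1);
        comparisons++;
        if (array[stop - 1] > array[stop]) {
            int swap = array[stop - 1];
            array[stop - 1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop - 1);
        }
    }

    // Average comparison count over `runs` random arrays of length n.
    static double averageComparisons(int n, int runs, Random rng) {
        long total = 0;
        for (int r = 0; r < runs; r++) {
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = rng.nextInt();
            comparisons = 0;
            goofySort(a, 0, n - 1);
            total += comparisons;
        }
        return (double) total / runs;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed is an assumption, for repeatability
        double previous = 0;
        for (int n = 10; n <= 80; n *= 2) {
            double avg = averageComparisons(n, 100, rng);
            if (previous > 0) {
                System.out.printf("n=%d  T(n)=%.0f  ratio=%.2f%n", n, avg, avg / previous);
            } else {
                System.out.printf("n=%d  T(n)=%.0f%n", n, avg);
            }
            previous = avg;
        }
    }
}
```

Following Bentley's rule of thumb, a ratio near 4 on each doubling would suggest quadratic growth and a ratio near 8 would suggest cubic growth. Exact averages depend on the random inputs, so they need not match the slide's figures digit for digit.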

How could this be cubic?
    public void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop-1);
        if (array[stop-1] > array[stop]) {
            int swap = array[stop-1];
            array[stop-1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop-1);
        }
    }
The first recursive call is always bad; the array could be completely unordered.
The second recursive call is always good; all but the last item are ordered.

How badly cubic?
• What’s the worst case input?
  – Reverse sorted, right?
• At this point, I don’t trust myself, so …
  – Generate all permutations on a list of size n
    • We just covered this in class! (Lucky this is an algorithms course!)
  – Verified: worst case happens in reverse order
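The exhaustive check described above can be sketched as follows. This is an illustration under assumptions, not the author's program: the class name `GoofySortWorstCase` and the helpers `countComparisons`, `maxOverPermutations`, `identity`, and `reversed` are hypothetical, and the permutation generator is a plain swap-based backtracking search.

```java
class GoofySortWorstCase {
    // Running total of element comparisons (hypothetical instrumentation).
    static long comparisons = 0;

    // GoofySort as shown on the slides, with one counter increment added.
    static void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop - 1);
        comparisons++;
        if (array[stop - 1] > array[stop]) {
            int swap = array[stop - 1];
            array[stop - 1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop - 1);
        }
    }

    // Counts comparisons on a copy, so the caller's array is left untouched.
    static long countComparisons(int[] input) {
        int[] copy = input.clone();
        comparisons = 0;
        goofySort(copy, 0, copy.length - 1);
        return comparisons;
    }

    // Tries every arrangement of perm[k..] and returns the largest count seen.
    static long maxOverPermutations(int[] perm, int k) {
        if (k == perm.length) return countComparisons(perm);
        long best = 0;
        for (int i = k; i < perm.length; i++) {
            int t = perm[k]; perm[k] = perm[i]; perm[i] = t;   // choose element i
            best = Math.max(best, maxOverPermutations(perm, k + 1));
            t = perm[k]; perm[k] = perm[i]; perm[i] = t;       // undo the choice
        }
        return best;
    }

    static int[] identity(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i + 1;
        return a;
    }

    static int[] reversed(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = n - i;
        return a;
    }

    public static void main(String[] args) {
        // n! permutations, so keep n small.
        for (int n = 1; n <= 8; n++) {
            long max = maxOverPermutations(identity(n), 0);
            long rev = countComparisons(reversed(n));
            System.out.printf("n=%d  max over all permutations=%d  reverse order=%d%n", n, max, rev);
        }
    }
}
```

If the two columns agree for every n tried, reverse order attains the worst case at those sizes. That is evidence for the claim, not a proof of it.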

So, what’s the worst-case time?
• Do a bunch of input sizes …
• … and putz around with a calculator …
• The closed formula appears to be: T(n) = n(n-1)(n-2)/6 + (n-1)
• So this does appear to be cubic after all.
(Now, how do I prove it?)
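The conjectured formula can be checked mechanically against measured counts instead of with a calculator. A minimal sketch, assuming reverse-sorted input is the worst case as the slides report; the class name `GoofySortClosedForm` and the helpers `worstCaseCount` and `closedForm` are hypothetical names introduced here.

```java
class GoofySortClosedForm {
    // Running total of element comparisons (hypothetical instrumentation).
    static long comparisons = 0;

    // GoofySort as shown on the slides, with one counter increment added.
    static void goofySort(int[] array, int start, int stop) {
        if (start >= stop) return;
        goofySort(array, start, stop - 1);
        comparisons++;
        if (array[stop - 1] > array[stop]) {
            int swap = array[stop - 1];
            array[stop - 1] = array[stop];
            array[stop] = swap;
            goofySort(array, start, stop - 1);
        }
    }

    // Measured comparison count on the reverse-sorted input n, n-1, ..., 1.
    static long worstCaseCount(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = n - i;
        comparisons = 0;
        goofySort(a, 0, n - 1);
        return comparisons;
    }

    // The conjectured closed formula from the slide, intended for n >= 1.
    static long closedForm(long n) {
        return n * (n - 1) * (n - 2) / 6 + (n - 1);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 40; n++) {
            long measured = worstCaseCount(n);
            long predicted = closedForm(n);
            String verdict = (measured == predicted) ? "ok" : "MISMATCH";
            System.out.printf("n=%d  measured=%d  formula=%d  %s%n", n, measured, predicted, verdict);
        }
    }
}
```

Agreement over a range of sizes supports the formula but does not prove it.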

A quick overview of the proof
• T(n) = 1 + T(n-1) + GT(n-1); T(1) = 0
• GT(n) = (n-1) + GT(n-1); GT(1) = 0 … GT(n) = n(n-1)/2
• T(n) = 1 + T(n-1) + (n-1)(n-2)/2; T(1) = 0 … T(n) = (n-1) + n(n-1)(n-2)/6
(see me for the full proof … it’s not that bad)
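The elided steps can be filled in by unrolling both recurrences. This is one possible derivation, not necessarily the author's full proof. It reads GT(n) as the worst-case cost of a "good" call, where all but the last of the n items are already ordered; the slide does not define GT, so that reading is an assumption. The last sum uses the hockey-stick identity for binomial coefficients, and the `align*` environment assumes the amsmath package.

```latex
\begin{align*}
GT(n) &= \sum_{k=2}^{n} (k-1) = \frac{n(n-1)}{2} \\
T(n)  &= \sum_{k=2}^{n} \left( 1 + GT(k-1) \right)
       = (n-1) + \sum_{k=2}^{n} \frac{(k-1)(k-2)}{2} \\
      &= (n-1) + \sum_{k=2}^{n} \binom{k-1}{2}
       = (n-1) + \binom{n}{3} \\
      &= (n-1) + \frac{n(n-1)(n-2)}{6} = \Theta(n^3)
\end{align*}
```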

Aftermath
• Deep apologies to the students
  – They were gracious
• Q: “How did y’all know it was cubic?” A: “We ran it and it looked cubic.”
  – They had no idea how to proceed … so they did the empirical analysis first to find the “right” answer!

“Beware of bugs in the above code; I have only proved it correct, not tried it.”

“Beware of bugs in the above analysis; I have only proved it correct, not verified it empirically.”