Median/Order Statistics Algorithms











- Slides: 13

Median/Order Statistics Algorithms
- Minimum and Maximum
- Selection in expected linear time
- Selection in worst-case linear time

Minimum and Maximum
- How many comparisons are sufficient to find the minimum or the maximum?
- How many comparisons are sufficient to find both the minimum AND the maximum?
- Show that $n + \lceil \log_2 n \rceil - 2$ comparisons are sufficient to find the second minimum (and the minimum)
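
One way to see the second-minimum bound is a knockout tournament: the minimum wins every match it plays, and the second minimum must be one of the items that lost directly to it. The sketch below is an illustration, not taken from the slides. The function name `min_and_second_min` and the comparison counter are hypothetical, and it assumes at least two distinct values.

```python
def min_and_second_min(items):
    """Return (minimum, second minimum, comparisons used).

    Hypothetical helper: assumes at least two items with distinct values.
    """
    if len(items) < 2:
        raise ValueError("need at least two items")
    comparisons = 0
    # Each entry pairs a value with the list of values it has beaten so far.
    current = [(x, []) for x in items]
    while len(current) > 1:
        winners = []
        for i in range(0, len(current) - 1, 2):
            (a, a_beaten), (b, b_beaten) = current[i], current[i + 1]
            comparisons += 1
            if a < b:
                a_beaten.append(b)
                winners.append((a, a_beaten))
            else:
                b_beaten.append(a)
                winners.append((b, b_beaten))
        if len(current) % 2 == 1:
            winners.append(current[-1])  # the odd item out advances unopposed
        current = winners
    minimum, beaten = current[0]
    # The second minimum lost only to the minimum, so it is among `beaten`,
    # which holds at most ceil(log2 n) items (one per round).
    second = beaten[0]
    for x in beaten[1:]:
        comparisons += 1
        if x < second:
            second = x
    return minimum, second, comparisons
```

For example, `min_and_second_min([17, 12, 6, 23, 19, 8, 5, 10])` returns `(5, 6, 9)`: 7 comparisons for the tournament plus 2 among the three items that lost to 5, matching 8 + 3 - 2 = 9.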

Selection (Median) Problem
- How quickly can we find the median (or, in general, the kth smallest element) of an unsorted list of numbers?
- Two approaches
  - Quicksort partition algorithm: expected $\Theta(n)$ time, but $\Omega(n^2)$ time in the worst case
  - Deterministic algorithm: $\Theta(n)$ time in the worst case
![Quicksort Approach](https://slidetodoc.com/presentation_image_h2/a76c00ebc45d0655bc31ec5d9c11530d/image-4.jpg)
Quicksort Approach
- int Select(int A[], k, low, high)
  - Choose a pivot item
  - Determine the rank of the pivot element in the current partition
    - Compare all items to this pivot element
  - If the pivot is the kth item, return the pivot
  - Else update low and high and recurse on the partition that contains the kth item
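
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the slides' own code: the pivot is always the last item of the current range, the partition is the Lomuto scheme, indices are 0-based while `k` is a 1-based rank, and the recursion is written as a loop. The helper name `partition` is hypothetical.

```python
def partition(a, low, high):
    """Lomuto partition of a[low..high] around a[high]; return the pivot's final index."""
    pivot = a[high]  # assumption: the last item of the range is the pivot
    store = low
    for i in range(low, high):
        if a[i] < pivot:  # every other item is compared to the pivot once
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[high] = a[high], a[store]
    return store


def select(a, k, low=0, high=None):
    """Return the k-th smallest item of a (k = 1 is the minimum); rearranges a in place."""
    if high is None:
        high = len(a) - 1
    if not low + 1 <= k <= high + 1:
        raise ValueError("k must be a rank inside a[low..high]")
    while True:
        p = partition(a, low, high)
        rank = p + 1  # rank of the pivot within the whole array
        if rank == k:
            return a[p]
        if k < rank:
            high = p - 1  # the k-th item lies left of the pivot
        else:
            low = p + 1   # the k-th item lies right of the pivot
```

Calling `select([17, 12, 6, 23, 19, 8, 5, 10], 5)` returns `12`. The intermediate array states depend on the partition scheme, so they need not match a trace produced with a different partition routine.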

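Example: k = 5, A = 17 12 6 23 19 8 5 10

| low | high | A[low..high] after partitioning | rank of pivot |
|---|---|---|---|
| 1 | 8 | 6 8 5 10 17 12 23 19 | 4 |
| 5 | 8 | 17 12 19 23 | 7 |
| 5 | 6 | 12 17 | 5 (found) |

The pivot of rank 5 = k is the answer: 12.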

Probabilistic Analysis
- Assume each of the n! permutations is equally likely
- Modify the earlier indicator-variable analysis of quicksort to handle this k-selection problem
- What is the probability that the ith smallest item is compared to the jth smallest item?
  - If k is contained in (i..j)?
  - If k ≤ i?
  - If k ≥ j?

Cases where (i..j) does not contain k
- Case k ≥ j:

  $$\sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \frac{2}{k-i+1} = \sum_{i=1}^{k-1} \frac{2(k-i)}{k-i+1} = \sum_{i=1}^{k-1} \frac{2i}{i+1} = 2 \sum_{i=1}^{k-1} \frac{i}{i+1} \le 2(k-1)$$

  (the third sum replaces k-i with i)
- Case k ≤ i:

  $$\sum_{j=k+1}^{n} \sum_{i=k}^{j-1} \frac{2}{j-k+1} = \sum_{j=k+1}^{n} \frac{2(j-k)}{j-k+1} = \sum_{j=1}^{n-k} \frac{2j}{j+1} = 2 \sum_{j=1}^{n-k} \frac{j}{j+1} \le 2(n-k)$$

  (the third sum replaces j-k with j and changes the bounds)
- Total for both cases is $\le 2(k-1) + 2(n-k) = 2n - 2$

Case where (i..j) contains k
- At most 1 interval of size 3 contains k
  - i = k-1, j = k+1
- At most 2 intervals of size 4 contain k
  - i = k-1, j = k+2 and i = k-2, j = k+1
- In general, at most q-2 intervals of size q contain k, and the endpoints of each are compared with probability 2/q
- Thus we get $\sum_{q=3}^{n} (q-2) \cdot \frac{2}{q} \le \sum_{q=3}^{n} 2 = 2(n-2)$
- Summing all cases together, the expected number of comparisons is at most $(2n-2) + 2(n-2) = 4n - 6$, which is less than $4n$
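
The three cases can be checked numerically by summing the comparison probability over every pair i < j. The sketch below assumes the per-pair probabilities used in the sums above, which combine into the single formula 2 / (max(j, k) - min(i, k) + 1). The function name `expected_comparisons` is hypothetical.

```python
def expected_comparisons(n, k):
    """Expected number of comparisons to select rank k among n distinct items.

    Sums, over all pairs i < j, the probability that the i-th and j-th
    smallest items are compared:
      k >= j     -> 2 / (k - i + 1)
      k <= i     -> 2 / (j - k + 1)
      i < k < j  -> 2 / (j - i + 1)
    All three are 2 / (max(j, k) - min(i, k) + 1).
    """
    total = 0.0
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            span = max(j, k) - min(i, k) + 1
            total += 2.0 / span
    return total


if __name__ == "__main__":
    # Print the exact expectation next to the 4n bound for a few medians.
    for n in (10, 100, 1000):
        k = (n + 1) // 2
        print(n, k, round(expected_comparisons(n, k), 2), 4 * n)
```

With two items the function returns exactly 1.0, as expected, since the single pair is always compared.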

Best case, Worst case
- Best-case running time?
- What happens in the worst case?
  - Pivot element chosen is always what?
  - This leads to comparing all possible pairs
  - This leads to $\Theta(n^2)$ comparisons

Deterministic O(n) approach
- Need to guarantee a good pivot element while doing O(n) work to find the pivot element
- int Select(int A[], k, low, high)
  - Choosing pivot element
    - Divide into groups of 5
    - For each group of 5, find that group’s median
    - Use median of the medians as pivot element
  - Determine rank of pivot element
    - Compare some remaining items directly to median
  - Update low and high and recurse on partition that contains kth item (or return kth item if it is pivot)
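
A compact sketch of this median-of-medians selection is shown below. It departs from the outline in two labeled ways: it builds new lists instead of updating low and high in place, and it compares every item to the pivot rather than only the items whose position relative to the pivot is still unknown. The names `median_of_five` and `mom_select` are hypothetical.

```python
def median_of_five(group):
    """Median (lower median for even sizes) of a group of at most 5 items."""
    return sorted(group)[(len(group) - 1) // 2]


def mom_select(items, k):
    """Return the k-th smallest item (k = 1 is the minimum) using a median-of-medians pivot."""
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Choose the pivot: the median of the medians of the groups of 5.
    medians = [median_of_five(items[i:i + 5]) for i in range(0, len(items), 5)]
    pivot = mom_select(medians, (len(medians) + 1) // 2)
    # Determine the pivot's rank (simplification: compare all items to it).
    smaller = [x for x in items if x < pivot]
    larger = [x for x in items if x > pivot]
    equal = len(items) - len(smaller) - len(larger)
    # Return the pivot or recurse on the side that contains the k-th item.
    if k <= len(smaller):
        return mom_select(smaller, k)
    if k <= len(smaller) + equal:
        return pivot
    return mom_select(larger, k - len(smaller) - equal)
```

For instance, `mom_select([17, 12, 6, 23, 19, 8, 5, 10], 5)` returns `12`.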

Guarantees on the pivot element
- Median of medians is guaranteed to be smaller than all the red colored items
  - Why?
  - How many red items are there?
- Likewise, median of medians is guaranteed to be larger than the blue colored items
- Thus median of medians is in the range:
- What elements do we need to compare to pivot to determine its rank?
  - How many of these are there?
![Analysis of number of comparisons](https://slidetodoc.com/presentation_image_h2/a76c00ebc45d0655bc31ec5d9c11530d/image-12.jpg)
Analysis of number of comparisons
- int Select(int A[], k, low, high)
  - Choosing pivot element
    - For each group of 5, find that group’s median: $c_1 n/5$ comparisons, where $c_1$ is the cost of finding the median of 5
    - Find the median of the medians: recurse on a problem of size $n/5$
  - Compare remaining items directly to median: $c_2 n$ comparisons
  - Recurse on correct partition: recurse on a problem of size at most $7n/10$
- $T(n) =$

Solving recurrence relation
- $T(n) = T(7n/10) + T(n/5) + O(n)$
  - Key observation: $7/10 + 1/5 = 9/10 < 1$
- Prove $T(n) \le cn$ for some constant $c$ by induction on $n$, writing the $O(n)$ term as at most $dn$
- $T(n) \le 7cn/10 + cn/5 + dn$
- $= 9cn/10 + dn$
- Need $9cn/10 + dn \le cn$
- Thus $c/10 \ge d$, i.e. $c \ge 10d$
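
The bound c ≥ 10d can be sanity-checked by evaluating the recurrence directly. The sketch below makes assumptions that the slides do not state: subproblem sizes are rounded down, T(0) = 0, and the linear term is exactly D·n with a hypothetical constant D = 1.

```python
from functools import lru_cache

D = 1  # hypothetical constant for the linear term d*n


@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = T(floor(7n/10)) + T(floor(n/5)) + D*n, with T(0) = 0."""
    if n == 0:
        return 0
    return T(7 * n // 10) + T(n // 5) + D * n


if __name__ == "__main__":
    # The ratio T(n) / (D*n) should never exceed c / d = 10.
    worst = max(T(n) / (D * n) for n in range(1, 100_000))
    print("largest observed T(n) / (D*n):", worst)
```

Because the two subproblem sizes sum to at most 9n/10, the cost per level of recursion shrinks geometrically, which is why the total stays linear.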