Data Structures: Selection. Haim Kaplan & Uri Zwick, December 2013
Selection: Given n items, each with a key that belongs to a totally ordered domain, select the item with the k-th largest key. The item with the (n/2)-th largest key is called the median. The k-th largest item is also called the k-th order statistic. Can we do it faster than sorting?
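As a baseline for the question above, selection can always be done by sorting first. The sketch below is illustrative only; the function name `kth_largest` and the 1-based convention for k are assumptions, not taken from the slides.

```python
def kth_largest(items, k):
    """Return the item with the k-th largest key (k is 1-based).

    Baseline only: sorting costs O(n log n) comparisons, which is the
    bound the selection algorithms in this deck try to beat.
    """
    if not 1 <= k <= len(items):
        raise ValueError("k must be between 1 and len(items)")
    return sorted(items, reverse=True)[k - 1]


print(kth_largest([6, 2, 9, 5, 1], 3))
```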
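Quick-select (adapted from Sedgewick's Algorithms in Java): Partition the array around a pivot and continue in the part that contains position k, until the items to the left of A[k] are ≤ A[k] and the items to the right of A[k] are ≥ A[k].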
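The slide's code did not survive, so here is a minimal sketch of quick-select in Python rather than Java. The random pivot choice, the Lomuto-style `partition` helper, and the 0-based ascending rank k are assumptions of this sketch and may differ from Sedgewick's version.

```python
import random


def partition(a, lo, hi):
    """Partition a[lo..hi] around a random pivot; return the pivot's final index."""
    p = random.randint(lo, hi)
    a[p], a[hi] = a[hi], a[p]          # move the pivot to the end
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:               # one key comparison per item
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]          # put the pivot in its final place
    return i


def quick_select(a, k):
    """Rearrange a in place so that a[k] holds the item of rank k (0-based, ascending).

    On return, a[:k] <= a[k] <= a[k+1:] holds elementwise.
    """
    lo, hi = 0, len(a) - 1
    while lo < hi:
        i = partition(a, lo, hi)
        if i == k:
            break
        if i < k:
            lo = i + 1                 # rank k lies to the right of the pivot
        else:
            hi = i - 1                 # rank k lies to the left of the pivot
    return a[k]
```

Under the k-th largest convention used earlier in the deck, the k-th largest of n items (k 1-based) would be `quick_select(a, n - k)` in this sketch.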
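Fredman's analysis (2013): Let n_i be the number of items remaining after i partitioning steps. The probability that n_{i+1} ≤ (3/4) n_i is at least 1/2. The expected number of comparisons needed to get from n_i to the first n_j with n_j ≤ (3/4) n_i is at most 2 n_i. The total expected number of comparisons is therefore at most 2n (1 + 3/4 + (3/4)^2 + ...) = 8n.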
Exact analysis [Knuth 1971]: P2C2E (slightly more complicated than the analysis of quicksort).
Approximate median by sampling: Suppose that we only want an item whose rank is close to n/2 (rank = index in sorted order). Choose a random sample of size s. Find the median m of the sample. With high probability, the rank of m in the original set lies in a range around n/2.
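The sampling idea can be sketched in a few lines. The function name `approx_median`, sampling without replacement, and the choice of the sample's upper median are assumptions of this sketch.

```python
import random


def approx_median(items, s):
    """Return the median of a random sample of size s drawn from items.

    The result is always an item of the input; a larger s makes its
    rank in the full set concentrate more tightly around n/2.
    """
    sample = random.sample(items, s)   # s distinct positions, without replacement
    sample.sort()
    return sample[len(sample) // 2]


data = list(range(100_001))
random.shuffle(data)
m = approx_median(data, 1001)
# data holds 0..100000, so each value equals its own 0-based rank.
print("rank of sample median:", m, "exact median rank:", len(data) // 2)
```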
Exact median via sampling [Floyd-Rivest (1975)]: Choose a random sample of size n^(3/4).
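Only the sample size survives from these slides, so the following is a hedged sketch of how a sample of size n^(3/4) can yield the exact median: bracket the median between two sample items, then sort only the items inside the bracket. The bracket offset of sqrt(n) sample positions, the 4·n^(3/4) size cutoff, the retry loop, and the small-input fallback are all assumptions of this sketch, not details taken from the slides. It assumes distinct keys.

```python
import math
import random


def median_by_sampling(items):
    """Return the item of rank ceil(n/2) (1-based, ascending); keys assumed distinct."""
    n = len(items)
    k = (n + 1) // 2                       # target rank, 1-based
    if n < 100:                            # small inputs: just sort (assumed cutoff)
        return sorted(items)[k - 1]
    s = int(n ** 0.75)                     # sample size n^(3/4)
    slack = int(math.sqrt(n))              # assumed bracket half-width in the sample
    while True:
        sample = sorted(random.sample(items, s))
        a = sample[max(s // 2 - slack, 0)]         # lower bracket
        b = sample[min(s // 2 + slack, s - 1)]     # upper bracket
        below = sum(1 for x in items if x < a)     # items strictly below the bracket
        middle = [x for x in items if a <= x <= b]
        # Succeed only if the target rank falls inside a small bracket;
        # otherwise draw a fresh sample and try again.
        if below < k <= below + len(middle) and len(middle) <= 4 * s:
            middle.sort()
            return middle[k - below - 1]
```

Most of the work is the two linear passes over `items`; the sorts touch only the sample and the bracketed items, which are much smaller than n when the bracket succeeds.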
Deterministic linear time selection [Blum, Floyd, Pratt, Rivest, and Tarjan (1973)]
Split the items into 5-tuples.
Find the median of each 5-tuple.
Find the median of the medians (by a recursive call).
Use the median of the medians, x, as pivot: items ≤ x go to one side of x and items ≥ x to the other.
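The steps above can be sketched as follows. This is a simple list-based illustration, not an in-place or comparison-optimal implementation; the name `select`, the 0-based ascending rank k, and the base-case cutoff of 10 items are assumptions of this sketch.

```python
def select(items, k):
    """Return the item of rank k (0-based, ascending) using median-of-medians pivots."""
    if len(items) <= 10:                       # small case: sort directly
        return sorted(items)[k]

    # Split into 5-tuples (the last group may be shorter) and take each median.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]

    # Median of the medians, found by a recursive call.
    x = select(medians, len(medians) // 2)

    # Use x as the pivot.
    smaller = [v for v in items if v < x]
    equal = [v for v in items if v == x]
    larger = [v for v in items if v > x]

    if k < len(smaller):
        return select(smaller, k)
    if k < len(smaller) + len(equal):
        return x
    return select(larger, k - len(smaller) - len(equal))


data = [(7 * i) % 101 for i in range(101)]     # a permutation of 0..100
print(select(data, 50))
```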
Analysis: Counting comparisons, by induction on n. Induction basis: easily verified for 2 ≤ n < 10. Induction step: n ≥ 10.
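A common textbook form of the induction step can be sketched as below. It ignores rounding and uses an unspecified constant c for the linear work per call; both are assumptions of this sketch, and the slides' own constants may differ.

```latex
% At least half of the n/5 group medians are <= x, and each of those groups
% contributes 3 items <= x, so about 3n/10 items are <= x (and, symmetrically,
% about 3n/10 items are >= x). The call after partitioning therefore sees at
% most about 7n/10 items.
T(n) \le T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{7n}{10}\right) + cn

% Inductive hypothesis T(m) \le Cm for all m < n:
T(n) \le \tfrac{Cn}{5} + \tfrac{7Cn}{10} + cn
      = \tfrac{9}{10}\,Cn + cn
      \le Cn \quad \text{whenever } C \ge 10c .
```

The argument works because the two subproblem sizes sum to a fraction 1/5 + 7/10 = 9/10 < 1 of n.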
Some improvements: The median of 5 items can be found using 6 comparisons. The pivot x should be compared to only 2 items in each 5-tuple. Many other improvements are possible.
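One way to reach 6 comparisons for the median of 5 is sketched below. It is a standard elimination scheme and not necessarily the one the slides have in mind. The function name `median_of_5` and the hypothetical `less` parameter (included so that comparisons can be counted) are assumptions, and keys are assumed distinct.

```python
def median_of_5(a, b, c, d, e, less=lambda x, y: x < y):
    """Return the median of five distinct keys using exactly 6 calls to less."""
    if less(b, a):                      # 1: order the first pair, a < b
        a, b = b, a
    if less(d, c):                      # 2: order the second pair, c < d
        c, d = d, c
    if less(c, a):                      # 3: make a the smaller of the two pair minima
        a, b, c, d = c, d, a, b
    # Now a < b and a < c < d, so a is below three items and cannot be the
    # median. Drop it and pair e with b; the median of the five is the
    # second smallest of the four remaining items.
    a = e
    if less(b, a):                      # 4: order the new pair, a < b
        a, b = b, a
    if less(c, a):                      # 5: make a the smallest of the four
        a, b, c, d = c, d, a, b
    # a is the smallest of the four and c < d, so the second smallest is min(b, c).
    return c if less(c, b) else b       # 6


print(median_of_5(6, 2, 9, 5, 1))
```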
“Master Theorem” for recurrence relations. Many generalizations exist.
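For reference, the theorem is usually stated in the following textbook form; the slide's own formulation may differ, so the notation here (a, b, f, ε) is an assumption of this sketch.

```latex
% Recurrence, with constants a \ge 1 and b > 1:
T(n) = a\,T\!\left(\tfrac{n}{b}\right) + f(n)

% Case 1: f(n) = O\!\left(n^{\log_b a - \varepsilon}\right) for some \varepsilon > 0
T(n) = \Theta\!\left(n^{\log_b a}\right)

% Case 2: f(n) = \Theta\!\left(n^{\log_b a}\right)
T(n) = \Theta\!\left(n^{\log_b a} \log n\right)

% Case 3: f(n) = \Omega\!\left(n^{\log_b a + \varepsilon}\right) for some \varepsilon > 0,
% and a\,f(n/b) \le c\,f(n) for some constant c < 1 and all large n
T(n) = \Theta\!\left(f(n)\right)
```

A recurrence with two different subproblem sizes, such as T(n) ≤ T(n/5) + T(7n/10) + O(n), does not fit this form directly and is handled by generalizations such as the Akra-Bazzi method.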