Programming Data Structures and Algorithms Sorting Anton Biasizzo
Programming, Data Structures and Algorithms (Sorting) Anton Biasizzo Programming, Data Structures and Algorithms (Sorting) Slide 1/36
Preliminaries
- The problem: sorting an array of elements.
- We assume that the array contains only integers.
- We assume that N is the number of elements passed to the sorting algorithm.
- For some algorithms it is convenient to place a sentinel in position 0, so the array ranges from 0 to N.
- The actual data starts at position 1.
- The only operations allowed on the input data are comparisons and assignments.
- This is known as comparison-based sorting.
Insertion sort
- Insertion sort is one of the simplest sorting algorithms.
- It consists of N-1 passes.
- For passes p = 2 through N, insertion sort ensures that the elements in positions 1 through p are in sorted order.
- It makes use of the fact that the elements in positions 1 through p-1 are already in sorted order.
- Algorithm:
  - In pass p, we move the p-th element left until its correct place is found among the first p elements.
  - The sentinel in a[0] terminates the loop in the event that an element is moved all the way to the front.
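Insertion sort
- Insertion sort:

    /* input_type and MIN_DATA are assumed to be defined elsewhere;
       MIN_DATA must not be larger than any element of the input. */
    void insertion_sort(input_type a[], unsigned int n)
    {
        unsigned int j, p;
        input_type tmp;

        a[0] = MIN_DATA;                    /* sentinel */
        for (p = 2; p <= n; p++) {
            tmp = a[p];
            /* shift larger elements one position to the right */
            for (j = p; tmp < a[j-1]; j--)
                a[j] = a[j-1];
            a[j] = tmp;                     /* drop the element into the gap */
        }
    }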
Insertion sort analysis
- Because of the nested loops, each of which can take N iterations, insertion sort is O(N^2).
- The comparison in the inner for loop is executed at most p times in pass p, for a total of at most Σ_{p=2}^{N} p = 2 + 3 + ... + N = Θ(N^2) comparisons.
- This bound is tight, because input in reverse order actually achieves it.
- If the input is pre-sorted, the running time is O(N), because the test of the inner for loop fails immediately.
- The average-case running time of insertion sort is Θ(N^2).
Lower bound for simple sorting algorithms
- An inversion in an array of numbers is any ordered pair (i, j) having the property that i < j but a[i] > a[j].
- The number of inversions is exactly the number of swaps that insertion sort has to perform.
- A sorted array has no inversions.
- Swapping two adjacent elements that are out of place removes exactly one inversion.
- Besides the swaps, insertion sort does O(N) other work.
- The running time of insertion sort is O(I + N), where I is the number of inversions in the original array.
- Insertion sort runs in linear time if the number of inversions is O(N).
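The link between inversions and insertion sort can be checked directly. The following is only an illustrative sketch, not code from the slides: `count_inversions` and `insertion_sort_shifts` are hypothetical helper names, and the arrays are 0-based without a sentinel to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Count pairs (i, j) with i < j and a[i] > a[j] by brute force, O(N^2). */
size_t count_inversions(const int a[], size_t n)
{
    assert(a != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (a[i] > a[j])
                count++;
    return count;
}

/* Insertion sort on a 0-based array (bounds check instead of a sentinel).
   Returns the number of one-position moves; each move removes exactly
   one inversion, so the result equals the initial inversion count. */
size_t insertion_sort_shifts(int a[], size_t n)
{
    assert(a != NULL || n == 0);
    size_t shifts = 0;
    for (size_t p = 1; p < n; p++) {
        int tmp = a[p];
        size_t j = p;
        while (j > 0 && tmp < a[j - 1]) {
            a[j] = a[j - 1];
            j--;
            shifts++;
        }
        a[j] = tmp;
    }
    return shifts;
}
```

For the array 34, 8, 64, 51, 32, 21 both functions report 9, and the array has no inversions once it is sorted.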
Lower bound for simple sorting algorithms
- The average number of inversions in an array of N distinct numbers is N(N-1)/4.
- Proof:
  - For any list L of numbers, consider L_r, the list in reverse order.
  - Consider any pair of two numbers (i, j) in the list, with j > i. In exactly one of L and L_r this ordered pair represents an inversion.
  - The total number of these pairs in L and L_r is N(N-1)/2.
  - An average list has half of this amount: N(N-1)/4.
- Consequently, insertion sort is quadratic on average.
- Any algorithm that sorts by exchanging adjacent elements requires Ω(N^2) time on average.
- Proof:
  - Initially, the average number of inversions is N(N-1)/4 = Ω(N^2).
  - Each swap removes only one inversion, so Ω(N^2) swaps are required.
Shellsort
- Named after its inventor, Donald Shell.
- Breaks the quadratic barrier:
  - It compares distant elements.
  - The distance is decreased in each phase.
  - In the last phase adjacent elements are compared.
- Shellsort uses a sequence h_1, h_2, ..., h_t, called the increment sequence.
- Any sequence is suitable as long as h_t = 1.
- The performance of the algorithm depends on the sequence.
- In the k-th phase, using increment h_k, all elements spaced h_k apart are sorted.
- After a phase using increment h_k, we have A[i] ≤ A[i + h_k] for every i, and the array is said to be h_k-sorted.
- An h_k-sorted array that is then h_{k+1}-sorted remains h_k-sorted.
Shellsort
  Original      81 94 11 96 12 35 17 95 28 58 41 75 15
  After 5-sort  35 17 11 28 12 41 75 15 96 58 81 94 95
  After 3-sort  28 12 11 35 15 41 58 17 94 75 81 96 95
  After 1-sort  11 12 15 17 28 35 41 58 75 81 94 95 96
- Strategy for an h_k-sort:
  - For each position i = h_k + 1, h_k + 2, ..., N, place the element in the correct spot among positions i, i - h_k, i - 2h_k, ...
- An h_k-sort performs an insertion sort on h_k independent subarrays.
- A common (but poor) choice of increment sequence is h_1 = ⌊N/2⌋, h_{k+1} = ⌊h_k/2⌋ (Shell's increments).
- The increment sequence significantly affects the performance of Shellsort.
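The h_k-sort strategy can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the slides' own code: the array is 0-based, the function name `shellsort` is chosen here, and Shell's increments N/2, N/4, ..., 1 are used. For N = 13 these increments are 6, 3, 1 rather than the 5, 3, 1 of the example table.

```c
#include <assert.h>
#include <stddef.h>

/* Shellsort on a 0-based array using Shell's increments N/2, N/4, ..., 1. */
void shellsort(int a[], size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t h = n / 2; h > 0; h /= 2) {
        /* One phase: an insertion sort with stride h (an h-sort). */
        for (size_t i = h; i < n; i++) {
            int tmp = a[i];
            size_t j = i;
            while (j >= h && tmp < a[j - h]) {
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = tmp;
        }
    }
}
```

The last phase (h = 1) is an ordinary insertion sort, which is why any increment sequence ending in 1 sorts correctly; the earlier phases only reduce the work left for it.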
Analysis of Shellsort
- The average-case analysis of Shellsort is a long-standing open problem (except for the most trivial increment sequences).
- The worst-case running time using Shell's increments is Θ(N^2). Proof:
- Determine a lower bound Ω(N^2) and an upper bound O(N^2).
- Determine the lower bound by constructing a bad case:
  - Let N be a power of 2, so that all the increments are even except the last one, h_t = 1.
  - Let the N/2 largest elements be in even positions and the N/2 smallest elements be in odd positions.
  - Since all increments except the last are even, the N/2 largest elements are still in even positions and the N/2 smallest elements are still in odd positions when we come to the last pass.
  - The i-th smallest element (i ≤ N/2) is in position 2i-1 before the last pass begins and has to be moved i-1 places.
  - Placing the N/2 smallest elements in their correct positions therefore requires at least Σ_{i=1}^{N/2} (i - 1) = Ω(N^2) work.
Analysis of Shellsort
  Original      1  9  2 10  3 11  4 12  5 13  6 14  7 15  8 16
  After 8-sort  1  9  2 10  3 11  4 12  5 13  6 14  7 15  8 16
  After 4-sort  1  9  2 10  3 11  4 12  5 13  6 14  7 15  8 16
  After 2-sort  1  9  2 10  3 11  4 12  5 13  6 14  7 15  8 16
  After 1-sort  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
- Determine the upper bound:
  - A pass with increment h_k consists of h_k insertion sorts of about N/h_k elements each.
  - Since insertion sort is quadratic, the total cost of the pass is O(h_k (N/h_k)^2) = O(N^2/h_k).
  - Summing over all passes gives a total bound of O(Σ_{k=1}^{t} N^2/h_k) = O(N^2 Σ_{k=1}^{t} 1/h_k).
  - The increments form a geometric series with common ratio 2 and smallest term h_t = 1, hence Σ_{k=1}^{t} 1/h_k < 2.
  - The resulting total bound is O(N^2).
Analysis of Shellsort
- The problem with Shell's increments: they are not necessarily relatively prime, so some increments can have little effect.
- Hibbard suggested a slightly different increment sequence of the form 1, 3, 7, ..., 2^k - 1.
- Key difference: consecutive increments have no common factors.
- The worst-case running time of Shellsort using Hibbard's increments is Θ(N^{3/2}).
- Based on simulations, the average-case running time of Shellsort using Hibbard's increments is thought to be O(N^{5/4}).
- Sedgewick proposed several increment sequences that give an O(N^{4/3}) worst-case running time; the average running time is conjectured to be O(N^{7/6}).
- One of these sequences is {1, 5, 19, 41, 109, ...}, in which the terms are either of the form 9·4^i - 9·2^i + 1 or 4^i - 3·2^i + 1.
- These sequences are most easily implemented by placing their values in an array.
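As a hedged sketch of how the sequence {1, 5, 19, 41, 109, ...} might be placed in an array, the two formulas can be interleaved: the first with i = 0, 1, 2, ... and the second with i = 2, 3, 4, ... The names `sedgewick_increments`, `shellsort_sedgewick` and `MAX_INCREMENTS` are assumptions made for this example, and the generator deliberately stops below 2^30 so that the arithmetic fits in 32 bits.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INCREMENTS 28   /* all terms of the sequence below 2^30 */

/* Store, in increasing order, the increments that do not exceed n.
   Returns the number of increments written (at most cap). */
size_t sedgewick_increments(unsigned long inc[], size_t cap, unsigned long n)
{
    assert(inc != NULL && cap > 0);
    size_t count = 0;
    for (unsigned i = 0; i < 14 && count < cap; i++) {
        unsigned long p2 = 1UL << i;                  /* 2^i */
        unsigned long a = 9 * p2 * p2 - 9 * p2 + 1;   /* 9*4^i - 9*2^i + 1 */
        if (a > n)
            break;
        inc[count++] = a;
        unsigned long q2 = 4 * p2;                    /* 2^(i+2) */
        unsigned long b = q2 * q2 - 3 * q2 + 1;       /* 4^(i+2) - 3*2^(i+2) + 1 */
        if (b > n || count == cap)
            break;
        inc[count++] = b;
    }
    return count;
}

/* Shellsort on a 0-based array, running the increments from largest to 1. */
void shellsort_sedgewick(int a[], size_t n)
{
    unsigned long inc[MAX_INCREMENTS];
    size_t t = sedgewick_increments(inc, MAX_INCREMENTS, (unsigned long)n);
    while (t > 0) {
        size_t h = (size_t)inc[--t];
        for (size_t i = h; i < n; i++) {
            int tmp = a[i];
            size_t j = i;
            while (j >= h && tmp < a[j - h]) {
                a[j] = a[j - h];
                j -= h;
            }
            a[j] = tmp;
        }
    }
}
```

In practice the values would more likely be written out as a constant table; generating them here only shows that the two formulas really do produce the listed sequence.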
Heapsort
- Priority queues can be used to sort in O(N log N) time.
- Heapsort gives the best running-time growth rate seen so far.
- In practice it is slower than Shellsort with Sedgewick's increment sequence.
- Sort strategy:
  - Build a binary heap of N elements; this takes O(N) time.
  - Perform N DeleteMin operations; the elements leave the heap smallest first, in sorted order.
  - By recording these elements in a second array and then copying the array back, we sort N elements.
  - Each DeleteMin requires O(log N) time.
  - The total running time is O(N log N).
Heapsort
- A clever way to avoid the additional array is to use the fact that after each DeleteMin the heap shrinks by 1, so the cell that was last in the heap can store the element that was just deleted (the minimum).
- This way we get an array sorted in decreasing order.
- To sort in increasing order, change the ordering property of the heap so that the parent has a larger key than the child (max-heap) and use DeleteMax.
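The in-place variant can be sketched as follows. This is an illustrative version under assumptions of its own: a 0-based array (so the children of cell i are 2i+1 and 2i+2), a max-heap, and the hypothetical names `percolate_down` and `heapsort_ints`.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the max-heap property for the subtree rooted at i,
   looking only at the first n cells of the 0-based array a. */
static void percolate_down(int a[], size_t i, size_t n)
{
    int tmp = a[i];
    size_t child;
    for (; 2 * i + 1 < n; i = child) {
        child = 2 * i + 1;                          /* left child */
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                                /* larger of the two */
        if (tmp < a[child])
            a[i] = a[child];
        else
            break;
    }
    a[i] = tmp;
}

void heapsort_ints(int a[], size_t n)
{
    assert(a != NULL || n == 0);
    /* Phase 1: build a max-heap in O(N). */
    for (size_t i = n / 2; i-- > 0; )
        percolate_down(a, i, n);
    /* Phase 2: each DeleteMax puts the maximum into the cell
       that the shrinking heap has just given up. */
    for (size_t last = n; last-- > 1; ) {
        int max = a[0];
        a[0] = a[last];
        a[last] = max;
        percolate_down(a, 0, last);
    }
}
```

Because the maximum is written to the end of the array each time, the result is in increasing order and no second array is needed.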
Analysis of Heapsort
- Building the heap uses at most 2N comparisons.
- In the second phase, the i-th DeleteMax uses at most 2 log i comparisons, for a total of at most 2N log N - O(N) comparisons.
- Consequently, in the worst case, heapsort uses at most 2N log N - O(N) comparisons.
- Experiments have shown that the performance of heapsort is extremely consistent: on average it uses only slightly fewer comparisons than the worst-case bound suggests.
- Heapsort's average running time is hard to estimate because successive DeleteMax operations destroy the heap's randomness.
Mergesort
- Mergesort runs in O(N log N) worst-case time.
- It is a good example of a recursive algorithm.
- The fundamental operation is merging two sorted lists. Because the lists are sorted, this can be done in one pass.
- The basic merging algorithm takes:
  - two input arrays A and B, and an output array C,
  - three counters (aptr, bptr, and cptr) for the corresponding arrays, which are initially set to the beginning of the arrays.
- The smaller of A[aptr] and B[bptr] is copied to the next entry C[cptr], and the appropriate counters are advanced.
Mergesort
- When one array is exhausted, the remainder of the other array is copied to the output array.
Mergesort
- The time to merge two sorted lists is clearly linear, because at most N-1 comparisons are made.
  - Every comparison adds an element to C, except the last one, which adds two.
- The mergesort algorithm is easy to describe:
  - If N = 1, there is only one element to sort and the array is already sorted.
  - Otherwise, recursively mergesort the first and the second half of the array, and then merge the two sorted halves.
- This algorithm is a classic divide-and-conquer strategy. The problem is divided into smaller problems, which are solved recursively. The conquer phase consists of patching together the answers.
- If the temporary array is declared locally, there could be log N temporary arrays active at once, which is dangerous on a machine with small memory.
- Only one temporary array is needed.
- The algorithm can be rewritten without recursion.
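A minimal sketch of the recursive algorithm with a single temporary array, allocated once by a driver routine, might look like this. The names `merge`, `msort` and `mergesort_ints`, the 0-based half-open ranges, and the error code are assumptions of this example rather than anything given on the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Merge the sorted runs a[left..center-1] and a[center..right-1]
   through tmp, then copy the merged run back into a. */
static void merge(int a[], int tmp[], size_t left, size_t center, size_t right)
{
    size_t aptr = left, bptr = center, cptr = left;
    while (aptr < center && bptr < right)
        tmp[cptr++] = (a[aptr] <= a[bptr]) ? a[aptr++] : a[bptr++];
    while (aptr < center)               /* copy the rest of the first run */
        tmp[cptr++] = a[aptr++];
    while (bptr < right)                /* copy the rest of the second run */
        tmp[cptr++] = a[bptr++];
    for (size_t i = left; i < right; i++)
        a[i] = tmp[i];
}

/* Recursively sort the half-open range a[left..right-1]. */
static void msort(int a[], int tmp[], size_t left, size_t right)
{
    if (right - left < 2)
        return;                         /* zero or one element: sorted */
    size_t center = left + (right - left) / 2;
    msort(a, tmp, left, center);
    msort(a, tmp, center, right);
    merge(a, tmp, left, center, right);
}

/* Driver: returns 0 on success, -1 if the temporary array
   cannot be allocated. */
int mergesort_ints(int a[], size_t n)
{
    assert(a != NULL || n == 0);
    if (n < 2)
        return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    msort(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Allocating `tmp` once in the driver, instead of inside `merge`, is what keeps the extra memory at N cells regardless of the recursion depth.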
Analysis of Mergesort
- Let us assume that N is a power of 2.
- For N = 1 the time to mergesort is constant, which we denote by 1.
- The time to mergesort N elements is equal to the time to do two recursive mergesorts of size N/2, plus the time to merge, which is linear:
    T(1) = 1
    T(N) = 2T(N/2) + N
- Dividing the second equation by N we get
    T(N)/N = T(N/2)/(N/2) + 1
- The same equation holds for every smaller power of 2:
    T(N/2)/(N/2) = T(N/4)/(N/4) + 1
    T(N/4)/(N/4) = T(N/8)/(N/8) + 1
    ...
- And finally
    T(2)/2 = T(1)/1 + 1
Analysis of Mergesort
- After all the equations are added, the intermediate terms cancel and we get
    T(N)/N = T(1)/1 + log N
  because there are log N equations.
- T(N) = N log N + N = O(N log N)
- Although mergesort's running time is O(N log N), it is rarely used for main-memory sorts.
- It needs linear extra memory, and the additional work spent copying to and from the temporary array slows the algorithm down considerably.
Quicksort
- Quicksort is the fastest known sorting algorithm used in practice.
- Its average running time is O(N log N).
- It has O(N^2) worst-case performance, but the worst case can be made very unlikely with little effort.
- Like mergesort, quicksort is a recursive divide-and-conquer algorithm.
- The basic steps to sort an array S are:
  - If the number of elements in S is 0 or 1, then return.
  - Pick an element v in S. It is called the pivot.
  - Partition S - {v} into two disjoint groups: S_1 is the set of elements that are smaller than or equal to the pivot, and S_2 is the set of elements that are larger than or equal to the pivot.
  - Return { quicksort(S_1), v, quicksort(S_2) }.
Quicksort
- The algorithm does not specify what to do with elements equal to the pivot; this becomes a design decision.
- Why is quicksort better than mergesort?
  - Like mergesort, it recursively solves two sub-problems and requires linear additional work.
  - The sub-problems are not guaranteed to be of equal size (potentially bad).
  - The partitioning step can be performed in place and very efficiently.
Picking the Pivot
- A wrong way:
  - A popular choice is to use the first element as the pivot.
  - This is acceptable if the input is random.
  - If the input is pre-sorted or in reverse order, then all the elements go into S_1 or all go into S_2, throughout the recursive calls. In that case quicksort takes quadratic time.
  - Pre-sorted input is quite frequent in practice.
- A safe manoeuvre:
  - Pick the pivot randomly.
  - This depends on the random number generator (which might be poor).
  - Random number generation is expensive.
- Median-of-three partitioning:
  - The median of a group of N numbers is the ⌈N/2⌉-th largest number.
  - The best choice of pivot is the median of the complete file, but it is hard to calculate.
  - A good estimate is the median of three elements: the left, right, and center elements.
Partitioning strategy
- Several partitioning strategies are used in practice.
- A partitioning strategy that yields good results:
  - Get the pivot out of the way by swapping it with the last element.
  - i starts at the first element and j starts at the next-to-last element.
  - While i is to the left of j, we move i right, skipping over elements smaller than the pivot, and we move j left, skipping over elements larger than the pivot.
    - When i and j have stopped, i is pointing at a large element and j is pointing at a small element.
    - If i is to the left of j, those elements are swapped.
  - When i and j have crossed, the pivot is swapped with the element that i points to.
Partitioning strategy
  Original     8 1 4 9 6 3 5 2 7 0
  Pick pivot   8 1 4 9 0 3 5 2 7 6   (pivot 6 swapped with the last element)
  1st stop     8 1 4 9 0 3 5 2 7 6   (i at 8, j at 2)
  1st swap     2 1 4 9 0 3 5 8 7 6
  2nd stop     2 1 4 9 0 3 5 8 7 6   (i at 9, j at 5)
  2nd swap     2 1 4 5 0 3 9 8 7 6
  3rd stop     2 1 4 5 0 3 9 8 7 6   (i at 9, j at 3: the pointers have crossed)
  Swap pivot   2 1 4 5 0 3 6 8 7 9
- How to handle elements that are equal to the pivot:
  - If both pointers stop, there are extra swaps, but the pointers cross in the middle.
  - If the pointers skip equal elements, no swaps are performed; however, the pivot ends up in the next-to-last position, which gives O(N^2) running time if all elements are equal.
Partitioning strategy
- For small files (N ≤ 20) quicksort does not perform as well as insertion sort.
- Because quicksort is recursive, these cases occur frequently.
- Use another sorting algorithm, such as insertion sort, for small files.
- If the pivot is the median of the left, right, and center elements, we can put these three elements in order and hide the pivot.
  - The left and right elements then act as sentinels.
  - The starting point of both pointers can be moved by one.
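Putting median-of-three, the hidden pivot, the sentinels, and the small-file cutoff together gives roughly the following sketch. It is one possible arrangement under assumptions made here: 0-based arrays, both pointers stopping on elements equal to the pivot, a cutoff value of 10 (the slides only say N ≤ 20), and the hypothetical names `median3`, `qsort_range` and `quicksort_ints`.

```c
#include <assert.h>
#include <stddef.h>

#define CUTOFF 10   /* assumed threshold for switching to insertion sort */

static void swap_ints(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Insertion sort on the inclusive range a[left..right]. */
static void insertion_sort_range(int a[], size_t left, size_t right)
{
    for (size_t p = left + 1; p <= right; p++) {
        int tmp = a[p];
        size_t j = p;
        while (j > left && tmp < a[j - 1]) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = tmp;
    }
}

/* Order a[left], a[center], a[right], then hide the median (the pivot)
   in a[right-1]. a[left] and a[right] now act as sentinels. */
static int median3(int a[], size_t left, size_t right)
{
    size_t center = left + (right - left) / 2;
    if (a[center] < a[left])  swap_ints(&a[left], &a[center]);
    if (a[right] < a[left])   swap_ints(&a[left], &a[right]);
    if (a[right] < a[center]) swap_ints(&a[center], &a[right]);
    swap_ints(&a[center], &a[right - 1]);
    return a[right - 1];
}

static void qsort_range(int a[], size_t left, size_t right)
{
    if (right - left + 1 <= CUTOFF) {       /* small file */
        insertion_sort_range(a, left, right);
        return;
    }
    int pivot = median3(a, left, right);
    size_t i = left, j = right - 1;         /* both moved in by one */
    for (;;) {
        while (a[++i] < pivot) { }          /* stops at a[right-1] at the latest */
        while (a[--j] > pivot) { }          /* stops at a[left] at the latest */
        if (i < j)
            swap_ints(&a[i], &a[j]);
        else
            break;
    }
    swap_ints(&a[i], &a[right - 1]);        /* restore the pivot */
    qsort_range(a, left, i - 1);
    qsort_range(a, i + 1, right);
}

void quicksort_ints(int a[], size_t n)
{
    assert(a != NULL || n == 0);
    if (n > 1)
        qsort_range(a, 0, n - 1);
}
```

Stopping on equal elements costs some extra swaps but keeps the two parts balanced when many keys are equal, which is the trade-off described for elements equal to the pivot.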
Analysis of Quicksort
- Like mergesort, quicksort is recursive.
- T(0) = T(1) = 1
- The running time of quicksort is equal to the running time of the two recursive calls plus the linear time spent in partitioning:
    T(N) = T(i) + T(N-i-1) + cN
  where i is the number of elements in S_1.
- Worst-case analysis (the pivot is always the smallest element, so i = 0; we ignore T(0) = 1):
  - T(N) = T(N-1) + cN, N > 1
  - Applying this equation repeatedly gives T(N) = T(1) + c Σ_{i=2}^{N} i = O(N^2)
Analysis of Quicksort
- Best-case analysis:
  - The pivot is in the middle.
  - T(N) = 2T(N/2) + cN, N > 1
  - Dividing by N gives T(N)/N = T(N/2)/(N/2) + c, and likewise for N/2, N/4, ..., 2.
  - We add all these equations and get T(N)/N = T(1)/1 + c log N
  - Which yields T(N) = cN log N + N = O(N log N)
Selection Problem
- Selection problem: find the k-th largest (smallest) element.
- Using a priority queue, we can find it in O(N + k log N) time. Finding the median this way requires O(N log N) time.
- Since we can sort the array in O(N log N) time, we expect to obtain a better time bound for selection.
- Quickselect algorithm (|S| denotes the number of elements in S):
  - If |S| = 1 (then k = 1), return the element of S as the answer.
  - Pick a pivot element v.
  - Partition S - {v} into S_1 and S_2, as in quicksort.
  - If k ≤ |S_1|, then the k-th smallest element is in S_1; return quickselect(S_1, k).
  - If k = 1 + |S_1|, then the pivot is the k-th smallest element.
  - Otherwise, the k-th smallest element is in S_2; return quickselect(S_2, k - |S_1| - 1).
- The worst-case running time is O(N^2); however, the average-case running time is O(N).
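The quickselect steps can be sketched iteratively, since only one of the two parts is ever examined further. This example makes simplifying assumptions that are not in the slides: k is 1-based, the array is 0-based and is reordered in place, the middle element is taken as the pivot, and a simple one-directional partition replaces the two-pointer scheme described for quicksort.

```c
#include <assert.h>
#include <stddef.h>

static void swap_ints(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Return the k-th smallest element (k is 1-based) of a[0..n-1].
   The array is reordered in the process. */
int quickselect(int a[], size_t n, size_t k)
{
    assert(a != NULL && k >= 1 && k <= n);
    size_t left = 0, right = n - 1, target = k - 1;
    while (left < right) {
        /* Pick the middle element as the pivot and move it to the end. */
        size_t mid = left + (right - left) / 2;
        swap_ints(&a[mid], &a[right]);
        int pivot = a[right];
        /* Partition: elements smaller than the pivot go to the front (S_1). */
        size_t store = left;
        for (size_t i = left; i < right; i++)
            if (a[i] < pivot)
                swap_ints(&a[i], &a[store++]);
        swap_ints(&a[store], &a[right]);    /* pivot reaches its final index */
        if (target == store)
            return a[store];                /* the pivot is the answer */
        if (target < store)
            right = store - 1;              /* continue in S_1 */
        else
            left = store + 1;               /* continue in S_2 */
    }
    return a[left];                         /* one element left: |S| = 1 */
}
```

Working with the absolute index `target` avoids the k - |S_1| - 1 adjustment from the recursive description; the two formulations select the same element.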
A General Lower Bound for Sorting
- We have O(N log N) algorithms for sorting, but can we do better?
- Any algorithm that uses only comparisons requires Ω(N log N) comparisons in the worst case.
- The same bound can be proven for the average case.
- A decision tree is an abstraction used to prove lower bounds:
  - It is a binary tree.
  - Each node represents the set of possible orderings of the elements that are consistent with the comparisons made so far.
  - The results of the comparisons are the tree edges.
A General Lower Bound for Sorting
- A decision tree for three-element insertion sort (figure not reproduced).
A General Lower Bound for Sorting
- Every algorithm that sorts by using only comparisons can be represented by a decision tree.
- The maximum number of comparisons used by the algorithm is equal to the depth of the deepest leaf.
  - In our case (three-element insertion sort) the algorithm uses three comparisons in the worst case.
- The average number of comparisons used is equal to the average depth of the leaves.
- Lemma 1: Let T be a binary tree of depth d. Then T has at most 2^d leaves.
- Proof: By induction:
  - If d = 0, there is at most one leaf.
  - Otherwise, we have a root, which cannot be a leaf, and a left and a right subtree, each of depth at most d-1. By the induction hypothesis each of them has at most 2^{d-1} leaves, giving a total of at most 2^d leaves.
A General Lower Bound for Sorting
- Lemma 2: A binary tree with L leaves must have depth at least ⌈log L⌉.
- Proof: Follows from the previous lemma.
- Theorem 1: Any sorting algorithm that uses only comparisons between elements requires at least ⌈log(N!)⌉ comparisons in the worst case.
- Proof: A decision tree that sorts N elements must have N! leaves. The theorem follows from the previous lemma.
- Theorem 2: Any sorting algorithm that uses only comparisons between elements requires Ω(N log N) comparisons.
A General Lower Bound for Sorting
- Proof: From the previous theorem, log(N!) comparisons are required.
    log(N!) = log(N (N-1) (N-2) ··· 2 · 1)
            = log N + log(N-1) + log(N-2) + ··· + log 2 + log 1
            ≥ log N + log(N-1) + log(N-2) + ··· + log(N/2)
            ≥ (N/2) log(N/2)
            = (N/2) log N - N/2
            = Ω(N log N)
Bucket Sort
- In special cases it is possible to sort in linear time.
- The input consists only of positive integers smaller than M.
- Algorithm:
  - An array Count of size M is initialized to 0 (Count has M cells, or buckets, which are initially empty).
  - When A_i is read, increment Count[A_i] by 1.
  - After all the input is read, scan the Count array and print the index of each nonempty cell, once for every time it was counted.
- Does the algorithm violate the lower bound?
- No: it does not use simple comparisons.
- It essentially performs an M-way comparison in unit time.
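The counting scheme can be sketched as below. The details are assumptions of this example rather than the slides' specification: the function is called `bucket_sort`, it writes the sorted values back into the input array instead of printing them, it also accepts the value 0, and it reports an allocation failure through its return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sort a[0..n-1], where every value satisfies a[i] < m, in O(N + M) time.
   Returns 0 on success, -1 if the Count array cannot be allocated. */
int bucket_sort(unsigned a[], size_t n, unsigned m)
{
    size_t *count = calloc(m ? m : 1, sizeof *count);  /* all buckets empty */
    if (count == NULL)
        return -1;
    for (size_t i = 0; i < n; i++) {
        assert(a[i] < m);               /* precondition on the input range */
        count[a[i]]++;
    }
    /* Scan the buckets in index order and emit each index
       as many times as it was counted. */
    size_t out = 0;
    for (unsigned v = 0; v < m; v++)
        for (size_t c = count[v]; c > 0; c--)
            a[out++] = v;
    free(count);
    return 0;
}
```

No two input elements are ever compared with each other; indexing `count[a[i]]` is the M-way decision that takes the algorithm outside the comparison-based model.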
- Slides: 36