Sorting And Searching CSE 116 A, B 1/9/2022 B. Ramamurthy 1
Introduction
The problem of sorting a collection of keys to produce an ordered collection is one of the richest in computer science. The richness derives from the fact that there are a number of ways of solving the sorting problem. Sorting is a fascinating operation that provides a model for investigating many problems in computer science, for example: analysis and comparison of algorithms, divide and conquer, space-time tradeoffs, and recursion.
Sorting
Comparison based: selection sort, insertion sort (with online animations)
Divide and conquer: merge sort, quick sort (with animations from the supplements of your textbook)
Priority queue/heap based: heap sort
Assume ascending order for all our discussion.
Selection Sort
Select the smallest element and place it in the first location (by exchanging the contents of the two locations), find the next smallest and place it in the next location, and so on. We will look at an example, an algorithm, an analysis of the algorithm, and a code implementation.
Selection Sort: Example
10  8  6  2 16  4
 2  8  6 10 16  4
 2  4  6 10 16  8
 2  4  6  8 16 10
 2  4  6  8 10 16   (sorted array)
Selection Sort Pseudo Code
1. Let cursor be a pointer to the first element of an n-element list to be sorted.
2. Repeat until cursor has passed the last-but-one element:
   2.1 Search for the smallest element in the list starting from the cursor. Let it be at location target.
   2.2 Exchange the elements at cursor and target.
   2.3 Update cursor to point to the next element.
Sort Analysis
1. Let el be the list; n be the number of elements; cursor = 0; target = 0;
2. while (cursor < n-1)
   2.1.1 target = cursor;
   2.1.2 for (i = cursor+1; i < n; i++)
             if (el[i] < el[target]) target = i;
   2.2 exchange(el[target], el[cursor]); // takes 3 assignments
   2.3 cursor++;
Operation count: $(n-1) \cdot 4 + 2\,(n + (n-1) + (n-2) + \dots + 1) = 4n - 4 + 2 \cdot \frac{n(n+1)}{2} = 4n - 4 + n^2 + n = n^2 + 5n - 4$
This has the form $An^2 + Bn + C$ with $A = 1$, $B = 5$, $C = -4$. For large n, drop the constant, the lower-order term in n, and the multiplicative constant to get the big-O notation: $O(n^2)$, a quadratic sort.
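The analyzed loop can be written out directly. The following is a minimal Python sketch, assuming a Python list in place of the array el; the function name selection_sort is chosen here for illustration and does not come from the slides.

```python
def selection_sort(el):
    """Sort the list el in place in ascending order and return it."""
    n = len(el)
    cursor = 0
    while cursor < n - 1:
        # Find the smallest element in el[cursor:].
        target = cursor
        for i in range(cursor + 1, n):
            if el[i] < el[target]:
                target = i
        # Exchange the smallest element with the one at the cursor.
        el[cursor], el[target] = el[target], el[cursor]
        cursor += 1
    return el


print(selection_sort([10, 8, 6, 2, 16, 4]))  # [2, 4, 6, 8, 10, 16]
```

The inner for loop always scans the whole unsorted remainder, which is why the running time is quadratic even when the input is already sorted.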
Insertion Sort
10  8  6  2 16  4   (unsorted array)
10                  (trivially sorted)
 8 10               (insert 8)
 6  8 10            (insert 6)
 2  6  8 10         (insert 2)
 2  6  8 10 16      (insert 16)
 2  4  6  8 10 16   (insert 4)
Insertion Sort Pseudo Code
1. A single element is trivially sorted; start with the first element.
2. Repeat for the second to the nth element of the list:
   2.1 cursor = next location;
   2.2 Save list[cursor], then find the location to insert it by comparing and shifting; let the location be target;
   2.3 Insert: list[target] = saved value of list[cursor]
Insertion Sort Analysis
1. cursor = 0;
2. while (cursor < n-1)
   2.1 cursor = cursor + 1;
   2.2 temp = list[cursor]; // save element to be inserted
   2.2.2 j = cursor; // find location
   2.2.3 while (j > 0 && list[j-1] > temp)
             list[j] = list[j-1]; // shift right
             j = j - 1;
         // assert: location found or hit left end (j = 0)
   2.3 list[j] = temp;
Worst case: $O(n^2)$, quadratic
Best case: linear (when the list is already sorted)
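As a runnable counterpart to the steps above, here is a short Python sketch. It assumes a Python list and uses a for loop over the cursor positions 1 to n-1; the name insertion_sort is illustrative only.

```python
def insertion_sort(lst):
    """Sort lst in place in ascending order and return it."""
    for cursor in range(1, len(lst)):
        temp = lst[cursor]  # save the element to be inserted
        j = cursor
        # Shift larger elements one place right until the slot is found.
        while j > 0 and lst[j - 1] > temp:
            lst[j] = lst[j - 1]
            j -= 1
        # Either the location was found or we hit the left end (j == 0).
        lst[j] = temp
    return lst


print(insertion_sort([10, 8, 6, 2, 16, 4]))  # [2, 4, 6, 8, 10, 16]
```

On an already sorted list the while condition fails immediately for every cursor position, which gives the linear best case.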
Merge Sort
Divide and conquer:
(divide) Divide the list into two sublists S1 and S2.
(recurse) Sort S1 and S2 by divide and conquer.
(conquer) Merge the sorted S1 and S2.
O(n log n) algorithm
Merge Sort (S)
mergeSort(S):
if S.size() > 1
   1. (S1, S2) = partition(S, n/2)
   2. mergeSort(S1);
   3. mergeSort(S2);
   4. S = merge(S1, S2)
Let's look at examples.
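A compact Python sketch of this recursion follows. It is an illustration under assumptions: it returns a new list rather than updating S, and it borrows heapq.merge from the Python standard library as a stand-in for the merge step instead of writing the merge out by hand.

```python
from heapq import merge  # standard-library merge of already sorted iterables


def merge_sort(s):
    """Return a new list with the elements of s in ascending order."""
    if len(s) <= 1:
        return list(s)  # zero or one element is already sorted
    mid = len(s) // 2           # partition(S, n/2)
    s1 = merge_sort(s[:mid])    # sort the first part
    s2 = merge_sort(s[mid:])    # sort the second part
    return list(merge(s1, s2))  # merge the two sorted parts


print(merge_sort([10, 8, 6, 2, 16, 4]))  # [2, 4, 6, 8, 10, 16]
```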
Example: partition
S = 10 8 6 2 16 4
   S1 = 10 8 6
      S11 = 10
      S12 = 8 6
         S121 = 8
         S122 = 6
   S2 = 2 16 4
      S21 = 2
      S22 = 16 4
         S221 = 16
         S222 = 4
Example: merge
S121 = 8,      S122 = 6      ->  S12 = 6 8
S11  = 10,     S12  = 6 8    ->  S1  = 6 8 10
S221 = 16,     S222 = 4      ->  S22 = 4 16
S21  = 2,      S22  = 4 16   ->  S2  = 2 4 16
S1   = 6 8 10, S2   = 2 4 16 ->  S   = 2 4 6 8 10 16
Algorithm: list merge(s1, s2)
1. s = empty list
2. while (!s1.empty() && !s2.empty())
   // Assert: both lists are non-empty
   2.1 if s1.firstElem() < s2.firstElem()
           s.insertLast(s1.removeFirst());
       else
           s.insertLast(s2.removeFirst());
3. // Assert: s1 is empty or s2 is empty
   3.1 while (!s1.empty())
           s.insertLast(s1.removeFirst());
   3.2 while (!s2.empty())
           s.insertLast(s2.removeFirst());
4. return s;
Cost: 2*k for the first loop plus n-k for the remaining loops gives n+k; with k = c*n this is n + c*n = (c+1)*n, which is O(n).
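The merge algorithm maps closely onto Python's collections.deque, whose popleft() plays the role of removeFirst() and whose truth value plays the role of !empty(). The sketch below is one possible rendering; the name merge_lists is an assumption made for this example.

```python
from collections import deque


def merge_lists(s1, s2):
    """Merge two ascending sequences into one ascending list."""
    s1, s2 = deque(s1), deque(s2)  # copies, so the inputs are not modified
    s = []
    # Both queues are non-empty: move the smaller front element to s.
    while s1 and s2:
        if s1[0] < s2[0]:
            s.append(s1.popleft())
        else:
            s.append(s2.popleft())
    # At most one of the queues still has elements; append what is left.
    while s1:
        s.append(s1.popleft())
    while s2:
        s.append(s2.popleft())
    return s


print(merge_lists([6, 8, 10], [2, 4, 16]))  # [2, 4, 6, 8, 10, 16]
```

Each element is removed from a queue exactly once and appended exactly once, which is where the linear cost comes from.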
Analysis of merge sort
The running time is the time spent merging the nodes at each level of the recursion tree.
Number of levels: 1 + ceiling(log n)
Since the time spent at each of the levels is O(n), we have the following result: merge sort sorts a list of size n in O(n log n) time in the worst case.
Quick Sort
Recursive sort; divide and conquer
Divide: select an element called the pivot; typically the last or first element is chosen to be the pivot. Partition the list S into three lists:
   L: elements in S less than the pivot
   E: elements in S equal to the pivot (a single element for a list of distinct elements)
   G: elements in S greater than the pivot
Recurse: recursively quick sort the lists L and G.
Conquer: form the sorted list by concatenating L, E and G.
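The three-list formulation translates almost word for word into Python. This is a sketch rather than an efficient implementation: it assumes the last element as the pivot and builds new lists at every level instead of sorting in place. The names quick_sort, less, equal and greater are illustrative.

```python
def quick_sort(s):
    """Return a new list with the elements of s in ascending order."""
    if len(s) <= 1:
        return list(s)
    pivot = s[-1]  # assumption: the last element is chosen as the pivot
    less = [x for x in s if x < pivot]     # L
    equal = [x for x in s if x == pivot]   # E
    greater = [x for x in s if x > pivot]  # G
    # Recurse on L and G, then concatenate L, E and G.
    return quick_sort(less) + equal + quick_sort(greater)


print(quick_sort([10, 8, 6, 2, 16, 4]))  # [2, 4, 6, 8, 10, 16]
```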
Quicksort Example: partition around pivot
(Figure: tree of sublists produced by repeatedly partitioning around the pivot; empty sublists are shown as null.)
Quicksort Example: concatenate {L}{pivot}{G}
(Figure: the same tree read bottom-up, with each sorted L, pivot and G concatenated into the sorted list.)
Quicksort
Worst case: when the list is already sorted (with the first or last element chosen as the pivot). Let $s_i$ be the sum of the sizes of all the nodes to be sorted at level i. In the worst case the number of levels is n: $s_0 = n$; $s_1 = n - 1$, since every element skews to one side; $s_2 = n - 2$; and so on.
Worst-case running time: $O(n + (n-1) + (n-2) + \dots + 1) = O(n^2)$
Best case: O(n log n)
Heap: Definition
A heap is a loosely ordered complete binary tree.
A heap is a complete binary tree with values stored in its nodes such that no child has a value greater than the value of its parent.
A heap is a complete binary tree:
1. that is empty, or
2a. whose root contains a search key greater than or equal to the search keys of both its children, and
2b. whose left subtree and right subtree are heaps.
Types of heaps
Heaps can be “max” heaps or “min” heaps. The definition above was for a “max” heap.
In a max-heap the root is greater than or equal to the values in its left and right children.
In a min-heap the root is smaller than or equal to the values in its left and right children.
ADT Heap
createHeap()
destroyHeap()
empty()
heapInsert(Object newItem)
Object heapDelete() // always the root
Example
Consider the data set: 6 3 5 9 2 10
1. Implement it as a complete binary tree.
2. Heap the left subtree.
3. Heap the right subtree.
4. Heap the root, left node and right node of the root.
Note: when heaping, choose the larger of the two children to move up for a “max” heap.
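Example (trees written level by level, levels separated by |)
1. Complete binary tree:   6 | 3 5 | 9 2 10
2. Heap the left subtree:  6 | 9 5 | 3 2 10
3. Heap the right subtree: 6 | 9 10 | 3 2 5
4. Heap the root:          10 | 9 6 | 3 2 5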
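The heaping steps of this example can be sketched in Python on an array that stores the tree level by level. This is an illustration under assumptions: the names sift_down and build_max_heap are invented here, and the loop heaps the subtrees from the last parent back to the root, so it handles the right subtree before the left one; the resulting heap for this data set is the same.

```python
def sift_down(heap, i, n):
    """Move heap[i] down until neither child is larger (max-heap)."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return
        # Move the larger child up and continue from its old position.
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def build_max_heap(items):
    """Return a level-order array holding a max-heap of items."""
    heap = list(items)
    n = len(heap)
    for i in range(n // 2 - 1, -1, -1):  # last parent down to the root
        sift_down(heap, i, n)
    return heap


print(build_max_heap([6, 3, 5, 9, 2, 10]))  # [10, 9, 6, 3, 2, 5]
```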
Delete Root Item
In a max heap the root item is the largest and is the one chosen for deletion.
1. After deletion of the root, two disjoint heaps result.
2. Place the last node as the root to form a semiheap.
3. Use trickle-down to form a heap.
Running time: 3*log N + 1 = O(log N)
Consider the heap example discussed above. Delete the root item.
Insert An Item
1. Insert a node as the last node.
2. Trickle up (repeat for various levels) to form a heap.
Consider inserting 15 into the heap of the “delete” example.
Insert is also an O(log N) operation.
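Insert Node: Example (trees written level by level, levels separated by |)
Heap after the delete: 9 | 5 6 | 3 2
1. Insert 15 as the last node: 9 | 5 6 | 3 2 15
2. Trickle up:                 9 | 5 15 | 3 2 6
   Repeat:                     15 | 5 9 | 3 2 6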
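Deleting the root and inserting an item can both be sketched on a level-order Python list. The function names heap_insert and heap_delete below echo the ADT operations but are otherwise assumptions, and heap_delete assumes a non-empty max-heap.

```python
def heap_insert(heap, item):
    """Insert item into a max-heap stored level by level in a list."""
    heap.append(item)  # insert as the last node
    child = len(heap) - 1
    while child > 0:   # trickle up
        parent = (child - 1) // 2
        if heap[child] <= heap[parent]:
            break
        heap[child], heap[parent] = heap[parent], heap[child]
        child = parent


def heap_delete(heap):
    """Remove and return the root of a non-empty max-heap."""
    root = heap[0]
    last = heap.pop()  # detach the last node
    if heap:
        heap[0] = last  # place it at the root, forming a semiheap
        i, n = 0, len(heap)
        while True:     # trickle down
            largest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and heap[c] > heap[largest]:
                    largest = c
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return root


heap = [10, 9, 6, 3, 2, 5]
print(heap_delete(heap), heap)  # 10 [9, 5, 6, 3, 2]
heap_insert(heap, 15)
print(heap)                     # [15, 5, 9, 3, 2, 6]
```

Both loops follow a single path between the root and a leaf, so each operation touches at most one node per level.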
Priority Queue implemented as a heap
Priority queue inserts are done according to some criterion known as “priority”. The insert location is determined by priority, and deletes are made at the head of the queue.
PQueue
   Methods: PQueue constructor, PQueueInsert, PQueueDelete
   Data: Heap pQ;
Heap Sort
Heapsort:
1. Represent the elements as a min-heap.
2. Delete the item at the root and add it to the result, as it is the smallest.
3. Re-heap and repeat step 2 until one item is left.
4. Delete the last item and add it to the result.
O(N*log N) in the worst case!
Heap sort: Example
(Figure: the values 10, 8, 6, 2, 16, 4 arranged as a complete binary tree and converted to a min-heap.)
Let's do the rest by hand.
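To check a hand-worked result, the heapsort steps can be sketched with Python's standard heapq module, which maintains a min-heap inside an ordinary list. The function name heap_sort is chosen for this example; the sketch builds a separate result list rather than sorting in place.

```python
import heapq


def heap_sort(items):
    """Return the elements of items in ascending order using a min-heap."""
    heap = list(items)
    heapq.heapify(heap)  # step 1: represent the elements as a min-heap
    result = []
    while heap:
        # Steps 2-4: delete the root (the smallest item) and add it to
        # the result; heappop restores the heap after each deletion.
        result.append(heapq.heappop(heap))
    return result


print(heap_sort([10, 8, 6, 2, 16, 4]))  # [2, 4, 6, 8, 10, 16]
```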
Heap Sort in place
Since a heap is a complete binary tree, the nodes can be stored in contiguous storage such as an array. Assuming the root is at index 0:
Parent of node(j) is node((j-1)/2), using integer division
Left child of node(j) is node(2j+1) if 2j+1 < n
Right child of node(j) is node(2(j+1)) if 2(j+1) < n
Adding a last leaf is equivalent to adding an element as the last element of the contiguous storage.
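The index arithmetic for a root stored at index 0 is easy to verify with a few helper functions. The names below are hypothetical; a child that would fall outside the n stored elements is reported as None.

```python
def parent(j):
    """Index of the parent of node j (j > 0), using integer division."""
    return (j - 1) // 2


def left_child(j, n):
    """Index of the left child of node j, or None if it does not exist."""
    c = 2 * j + 1
    return c if c < n else None


def right_child(j, n):
    """Index of the right child of node j, or None if it does not exist."""
    c = 2 * (j + 1)
    return c if c < n else None


heap = [10, 9, 6, 3, 2, 5]  # the tree 10 | 9 6 | 3 2 5 stored level by level
n = len(heap)
print(left_child(0, n), right_child(0, n))  # 1 2
print(left_child(2, n), right_child(2, n))  # 5 None
print(parent(5))                            # 2
```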
Priority Queue (PQ)
A queue that maintains a list of elements according to some priority. The element at the queue front is the one removed; a new element is added as the last element.
For a linear-list PQ: insert is O(n), since we have to search through the list to find the right place; strict ordering.
For a heap PQ: insert is O(log n), since we insert the element at a leaf and call siftUp(), which is O(log n); loose ordering.
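The two designs can be compared side by side in a small Python sketch. The class names ListPQ and HeapPQ are invented for this example, and it assumes that a smaller value means a higher priority. The list version uses the standard bisect module: the position is found by binary search, but shifting the later elements still makes the insert O(n).

```python
import bisect
import heapq


class ListPQ:
    """Priority queue kept as a fully sorted list (strict ordering)."""

    def __init__(self):
        self._items = []

    def insert(self, item):
        # O(n): elements after the insertion point are shifted right.
        bisect.insort(self._items, item)

    def delete(self):
        return self._items.pop(0)  # front of the queue: smallest item


class HeapPQ:
    """Priority queue kept as a min-heap (loose ordering)."""

    def __init__(self):
        self._heap = []

    def insert(self, item):
        heapq.heappush(self._heap, item)  # O(log n): add a leaf, sift up

    def delete(self):
        return heapq.heappop(self._heap)  # O(log n): remove the root


for pq in (ListPQ(), HeapPQ()):
    for x in (5, 1, 3):
        pq.insert(x)
    print([pq.delete() for _ in range(3)])  # [1, 3, 5] for both
```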
Heap and Priority Queue
Heap and Priority Queue (code)
The heap will have other support methods: root(), parent(), rightChild(), leftChild(), etc.
Homework: To reinforce the concepts, implement the heap and implement the PQ using the class diagram given. Adapt the code in your textbook, pp. 312-313.
Quicksort in place
quickSort(T, left, right)
1. if left < right
   1.1 pivot = partition(T, left, right); // assert: T is partially sorted during partition
   1.2 quickSort(T, left, pivot - 1);
   1.3 quickSort(T, pivot + 1, right);
2. return;
Partition(T, first, last)
1. Initialize: i = first; j = last-1;
2. while (i <= j)
   2.1 Decrease j: while (i <= j) if T(j) >= T(last), j = j - 1 else break;
   2.2 Increase i: while (j >= i) if T(i) < T(last), i = i + 1 else break;
   2.3 Exchange/Done:
   2.4 If j > i, exchange(T(i), T(j));
   2.5 else
       2.5.1 exchange(T(i), T(last));
       2.5.2 return i; // done; i is the position of the pivot between the partitions
Example: 85 24 63 45 17 31 96 50
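The example data can be run through a Python sketch of in-place quicksort. The sketch assumes the last element of each range as the pivot: j moves left past elements greater than or equal to the pivot, i moves right past elements smaller than the pivot, and when the two cross the pivot is exchanged into position i. The function names are illustrative.

```python
def partition(t, first, last):
    """Partition t[first..last] around the pivot t[last]; return its index."""
    pivot = t[last]
    i, j = first, last - 1
    while True:
        while i <= j and t[j] >= pivot:  # decrease j
            j -= 1
        while i <= j and t[i] < pivot:   # increase i
            i += 1
        if j > i:
            t[i], t[j] = t[j], t[i]      # both are on the wrong side
        else:
            t[i], t[last] = t[last], t[i]  # put the pivot in place
            return i


def quick_sort_in_place(t, left=0, right=None):
    """Sort t[left..right] in place in ascending order and return t."""
    if right is None:
        right = len(t) - 1
    if left < right:
        p = partition(t, left, right)
        quick_sort_in_place(t, left, p - 1)
        quick_sort_in_place(t, p + 1, right)
    return t


data = [85, 24, 63, 45, 17, 31, 96, 50]
print(partition(list(data), 0, len(data) - 1))  # 4
print(quick_sort_in_place(data))  # [17, 24, 31, 45, 50, 63, 85, 96]
```

After the first partition the pivot 50 sits at index 4, with 31 24 17 45 to its left and 85 96 63 to its right.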
Comparison
Small data set: selection sort
Insert into an already sorted list: insertion sort
Medium-size data set to be sorted in place: quick sort
Very large data set on disk/tape: merge sort
Summary of Sorting Algorithms (Goodrich and Tamassia)
Algorithm        Time          Notes
selection-sort   $O(n^2)$      slow; in-place; for small data sets (< 1K)
insertion-sort   $O(n^2)$      slow; in-place; for small data sets (< 1K)
heap-sort        O(n log n)    fast; in-place; for large data sets (1K to 1M)
merge-sort       O(n log n)    fast; sequential data access; for huge data sets (> 1M)