CS 3343: Analysis of Algorithms Heap sort and priority queue 2/23/2021 1


Heap sort
• Another Θ(n log n) sorting algorithm
• In practice quick sort wins
• The heap data structure and its variants are very useful for many algorithms

Selection sort
[Figure: array with a sorted region; repeatedly find the minimum of the unsorted region]

Selection sort
[Figure: array with a sorted region; repeatedly find the maximum of the unsorted region]

Selection sort
SelectionSort(A[1..n])
    for (i = n; i > 0; i--)
        index = max_element(A[1..i])
        swap(A[i], A[index]);
    end
• What's the time complexity?
• If max_element takes Θ(n), selection sort takes $\sum_{i=1}^{n} i = \Theta(n^2)$
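The SelectionSort pseudocode can be sketched in Python roughly as follows. This is a minimal illustration, not the course's reference code; it assumes 0-based lists, and the name `selection_sort` is chosen here for the example.

```python
def selection_sort(a):
    """Sort the list a in place by repeatedly moving the maximum to the end."""
    for i in range(len(a) - 1, 0, -1):
        # Index of the largest element in a[0..i]; this scan costs Θ(i).
        index = max(range(i + 1), key=a.__getitem__)
        a[i], a[index] = a[index], a[i]
    return a


print(selection_sort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
```

The linear scan for the maximum is the cost that heap sort replaces with an O(log n) retrieval.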

Heap
• A heap is a data structure that allows you to quickly retrieve the largest (or smallest) element
• It takes time Θ(n) to build the heap
• If you need to retrieve the largest element, second largest, third largest…, in the long run the time taken to build the heap will be rewarded

Idea of heap sort
HeapSort(A[1..n])
    Build a heap from A
    For i = n down to 1
        Retrieve largest element from heap
        Put element at end of A
        Reduce heap size by one
    end
Key:
1. Build a heap in linear time
2. Retrieve largest element (and make it ready for next retrieval) in O(log n) time
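To see the two key steps in isolation, here is a small sketch that borrows Python's standard `heapq` module instead of a hand-written heap. `heapq` is a min-heap, so the values are negated to get "largest first"; that trick, and the function name `heap_sort_idea`, are assumptions of this example and only work for numeric keys.

```python
import heapq


def heap_sort_idea(values):
    """Return a sorted copy of numeric values using build + repeated retrieval."""
    heap = [-v for v in values]   # negate so the min-heap yields the largest value
    heapq.heapify(heap)           # step 1: build the heap in linear time
    out = [None] * len(values)
    for i in range(len(values) - 1, -1, -1):
        out[i] = -heapq.heappop(heap)   # step 2: retrieve largest in O(log n)
    return out


print(heap_sort_idea([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
```

Unlike the in-place version developed in the following slides, this sketch uses a separate output list.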

Heaps
• A heap can be seen as a complete binary tree
• A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible
[Figure: example heap with root 16 drawn as a complete binary tree; the figure carries the label "Perfect binary tree"]

Heaps
• In practice, heaps are usually implemented as arrays:
A = 16 14 10 8 7 9 3 2 4 1
[Figure: the same heap drawn as a tree]

Heaps
• To represent a complete binary tree as an array:
  – The root node is A[1]
  – Node i is A[i]
  – The parent of node i is A[i/2] (note: integer divide)
  – The left child of node i is A[2i]
  – The right child of node i is A[2i + 1]
A = 16 14 10 8 7 9 3 2 4 1
[Figure: the same heap drawn as a tree]

Referencing Heap Elements
• So…
Parent(i) { return i/2; }
Left(i) { return 2*i; }
Right(i) { return 2*i + 1; }
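These index helpers translate directly to Python. To keep the slides' 1-based numbering, the sketch below leaves slot 0 of the list unused; that padding is an assumption of this example (with 0-based arrays the formulas become `(i - 1) // 2`, `2 * i + 1` and `2 * i + 2`).

```python
def parent(i):
    return i // 2          # integer divide


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


# Slot 0 is a dummy so that the root sits at index 1, as on the slides.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

i = 2                      # the node holding 14
print(A[parent(i)], A[left(i)], A[right(i)])
```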

Heap Height
• Definitions:
  – The height of a node in the tree = the number of edges on the longest downward path to a leaf
  – The height of a tree = the height of its root
[Figure: the example heap with node heights labeled, from h = 3 at the root (16) down to h = 0 at the leaves]
• What is the height of an n-element heap? Why?
• ⌊log₂ n⌋. Basic heap operations take at most time proportional to the height of the heap

The Heap Property
• Heaps also satisfy the heap property: A[Parent(i)] ≥ A[i] for all nodes i > 1
  – In other words, the value of a node is at most the value of its parent
  – The value of a node should be greater than or equal to both its left and right children
    • And all of its descendants
  – Where is the largest element in a heap stored?
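The heap property is easy to check mechanically. A minimal sketch, assuming a 0-based Python list (so the parent of index i is `(i - 1) // 2`); the name `is_max_heap` is made up for this example.

```python
def is_max_heap(a):
    """Return True if every non-root element is at most its parent."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


print(is_max_heap([16, 14, 10, 8, 7, 9, 3, 2, 4, 1]))
print(is_max_heap([16, 4, 10, 14, 7, 9, 3, 2, 8, 1]))
```

The second list fails because the node holding 4 has a child holding 14.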

Are they heaps?
[Figure: two example trees, each shown with its array representation]
• Violation of the heap property: a node has a value less than one of its children
• How to find that? How to resolve that?

Heap Operations: Heapify()
• Heapify(): maintain the heap property
  – Given: a node i in the heap with children l and r
  – Given: two subtrees rooted at l and r, assumed to be heaps
  – Problem: The subtree rooted at i may violate the heap property
  – Action: let the value of the parent node “sift down” so the subtree at i satisfies the heap property
    • Fix up the relationship between i, l, and r recursively

Heap Operations: Heapify()
Heapify(A, i)
{ // precondition: subtrees rooted at l and r are heaps
    l = Left(i); r = Right(i);
    // Among A[l], A[i], A[r], which one is largest?
    if (l <= heap_size(A) && A[l] > A[i])
        largest = l;
    else
        largest = i;
    if (r <= heap_size(A) && A[r] > A[largest])
        largest = r;
    if (largest != i) { // If violation, fix it.
        Swap(A, i, largest);
        Heapify(A, largest);
    }
} // postcondition: subtree rooted at i is a heap
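A Python rendering of Heapify() might look like the sketch below. It assumes a 0-based list, so the children of i are `2*i + 1` and `2*i + 2`, and it passes `heap_size` as an explicit argument instead of storing it on the array; both choices are assumptions of this example.

```python
def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap.

    Precondition: the subtrees rooted at the children of i are max-heaps.
    """
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        # Violation: swap with the larger child and continue below.
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, heap_size)


a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
max_heapify(a, 1, len(a))   # index 1 holds the out-of-place 4
print(a)
```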

Heapify() Example
[Figure: tree view of each step; node 2 (value 4) sifts down]
• Start, node 2 violates the heap property: A = 16 4 10 14 7 9 3 2 8 1
• Swap 4 with its larger child 14: A = 16 14 10 4 7 9 3 2 8 1
• Swap 4 with its larger child 8: A = 16 14 10 8 7 9 3 2 4 1

Analyzing Heapify(): Informal
• Aside from the recursive call, what is the running time of Heapify()?
• How many times can Heapify() recursively call itself?
• What is the worst-case running time of Heapify() on a heap of size n?

Analyzing Heapify(): Formal
• Fixing up relationships between i, l, and r takes Θ(1) time
• If the heap at i has n elements, how many elements can the subtrees at l or r have?
  – Draw it
• Answer: 2n/3 (worst case: bottom row 1/2 full)
• So time taken by Heapify() is given by T(n) ≤ T(2n/3) + Θ(1)

Analyzing Heapify(): Formal
• So in the worst case we have T(n) = T(2n/3) + Θ(1)
• By case 2 of the Master Theorem, T(n) = O(lg n)
• Thus, Heapify() takes logarithmic time
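For reference, the Master Theorem step can be written out as follows. This is a sketch using the standard form T(n) = a T(n/b) + f(n); the symbols a, b and f are the usual Master Theorem names, not notation from the slides.

```latex
\begin{align*}
T(n) &= a\,T(n/b) + f(n), \qquad a = 1,\quad b = 3/2,\quad f(n) = \Theta(1) \\
n^{\log_b a} &= n^{\log_{3/2} 1} = n^{0} = 1 \\
f(n) &= \Theta\!\left(n^{\log_b a}\right) \;\Longrightarrow\; \text{case 2 applies} \\
T(n) &= \Theta\!\left(n^{\log_b a} \lg n\right) = \Theta(\lg n)
\end{align*}
```

Since the recurrence for Heapify() is really an inequality, T(n) ≤ T(2n/3) + Θ(1), the conclusion for the algorithm is the upper bound O(lg n).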

Heap Operations: BuildHeap()
• We can build a heap in a bottom-up manner by running Heapify() on successive subarrays
  – Fact: for an array of length n, all elements in range A[⌊n/2⌋ + 1 .. n] are heaps (Why?)
  – So:
    • Walk backwards through the array from ⌊n/2⌋ to 1, calling Heapify() on each node.
    • Order of processing guarantees that the children of node i are heaps when i is processed

• Fact: for an array of length n, all elements in range A[⌊n/2⌋ + 1 .. n] are heaps (Why?)

Heap size    # leaves    # internal nodes
1            1           0
2            1           1
3            2           1
4            2           2
5            3           2

• 0 <= # leaves - # internal nodes <= 1
• # of internal nodes = ⌊n/2⌋

BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
    heap_size(A) = length(A);
    for (i = ⌊length(A)/2⌋ downto 1)
        Heapify(A, i);
}
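A possible Python version of BuildHeap(), again assuming a 0-based list, in which the last internal node is at index `len(a) // 2 - 1`. The helper `sift_down` is an iterative stand-in for Heapify(); both function names are chosen for this sketch.

```python
def sift_down(a, i, heap_size):
    """Iterative Heapify: move a[i] down until its subtree is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Turn the list a into a max-heap in place, bottom-up."""
    n = len(a)
    # Leaves are already heaps, so start at the last internal node.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    return a


print(build_max_heap([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
```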

BuildHeap() Example
• Work through example A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
[Figure: tree view of each step]
• Start: A = 4 1 3 2 16 9 10 14 8 7
• i = 5 (value 16): already larger than its child 7, no change
• i = 4 (value 2): swap with 14: A = 4 1 3 14 16 9 10 2 8 7
• i = 3 (value 3): swap with 10: A = 4 1 10 14 16 9 3 2 8 7
• i = 2 (value 1): swap with 16: A = 4 16 10 14 1 9 3 2 8 7
  – then swap 1 with 7: A = 4 16 10 14 7 9 3 2 8 1
• i = 1 (value 4): swap with 16: A = 16 4 10 14 7 9 3 2 8 1
  – then swap 4 with 14: A = 16 14 10 4 7 9 3 2 8 1
  – then swap 4 with 8: A = 16 14 10 8 7 9 3 2 4 1

Analyzing BuildHeap()
• Each call to Heapify() takes O(lg n) time
• There are O(n) such calls (specifically, ⌊n/2⌋)
• Thus the running time is O(n lg n)
  – Is this a correct asymptotic upper bound?
  – Is this an asymptotically tight bound?
• A tighter bound is O(n)
  – How can this be? Is there a flaw in the above reasoning?

Analyzing BuildHeap(): Tight
• To Heapify() a subtree takes O(h) time where h is the height of the subtree
  – h = O(lg m), m = # nodes in subtree
  – The height of most subtrees is small
• Fact: an n-element heap has at most $\lceil n/2^{h+1} \rceil$ nodes of height h (why?)
• Therefore T(n) = O(n)

• Fact: an n-element heap has at most $\lceil n/2^{h+1} \rceil$ nodes of height h (why?)
• $\lceil n/2 \rceil$ leaf nodes (h = 0): $f(0) = \lceil n/2 \rceil$
• $f(1) \le \lfloor (\lceil n/2 \rceil + 1)/2 \rfloor = \lceil n/4 \rceil$
• The above fact can be proved using induction
• Assume $f(h) \le \lceil n/2^{h+1} \rceil$
• $f(h+1) \le \lfloor (f(h) + 1)/2 \rfloor \le \lceil n/2^{h+2} \rceil$

Appendix A.8
[Equation: summation of the Heapify() costs over all heights]
Therefore, building a heap takes Θ(n) time!!
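The summation behind this conclusion can be reconstructed as below. It is a sketch that combines the two facts stated earlier (Heapify() costs O(h) at height h, and there are at most ⌈n/2^{h+1}⌉ nodes of height h) with the standard power-series identity for Σ h x^h, which is presumably the formula the slide cites as Appendix A.8.

```latex
\begin{align*}
T(n) &\le \sum_{h=0}^{\lfloor \lg n \rfloor}
        \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
      = O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right) \\
\sum_{h=0}^{\infty} h x^{h} &= \frac{x}{(1-x)^{2}} \quad (|x| < 1),
      \qquad\text{so}\qquad
      \sum_{h=0}^{\infty} \frac{h}{2^{h}} = \frac{1/2}{(1 - 1/2)^{2}} = 2 \\
T(n) &= O(2n) = O(n)
\end{align*}
```

Together with the Ω(n) cost of touching about n/2 internal nodes, this gives Θ(n).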

Heapsort
• Given BuildHeap(), an in-place sorting algorithm is easily constructed:
  – Maximum element is at A[1]
  – Discard by swapping with element at A[n]
    • Decrement heap_size[A]
    • A[n] now contains correct value
  – Restore heap property at A[1] by calling Heapify()
  – Repeat, always swapping A[1] for A[heap_size(A)]

Heapsort(A)
{
    BuildHeap(A);
    for (i = length(A) downto 2) {
        Swap(A[1], A[i]);
        heap_size(A) -= 1;
        Heapify(A, 1);
    }
}
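Putting the pieces together, an in-place heapsort in Python could be sketched like this. It assumes 0-based indexing and tracks the shrinking heap with the loop variable `end` instead of a `heap_size` field; the function names are chosen for the example.

```python
def sift_down(a, i, heap_size):
    """Iterative Heapify restricted to a[0:heap_size]."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a):
    """Sort the list a in place in ascending order."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # BuildHeap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]        # move current maximum into place
        sift_down(a, 0, end)               # heap now occupies a[0:end]
    return a


print(heapsort([4, 1, 3, 2, 16, 9, 10, 14, 8, 7]))
```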

Heapsort Example
• Work through example A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
[Figure: tree view of each step; elements after the bar "|" are already in their final sorted position]
• Start: A = 4 1 3 2 16 9 10 14 8 7
• First: build a heap: A = 16 14 10 8 7 9 3 2 4 1
• Swap last and first; last element sorted: A = 1 14 10 8 7 9 3 2 4 | 16
• Restore heap on remaining unsorted elements (Heapify): A = 14 8 10 4 7 9 3 2 1 | 16
• Repeat: swap new last and first: A = 1 8 10 4 7 9 3 2 | 14 16
• Restore heap: A = 10 8 9 4 7 1 3 2 | 14 16
• Repeat: A = 9 8 3 4 7 1 2 | 10 14 16
• Repeat: A = 8 7 3 4 2 1 | 9 10 14 16
• Repeat until done: A = 1 2 3 4 7 8 9 10 14 16

Analyzing Heapsort
• The call to BuildHeap() takes O(n) time
• Each of the n - 1 calls to Heapify() takes O(lg n) time
• Thus the total time taken by HeapSort()
  = O(n) + (n - 1) O(lg n)
  = O(n) + O(n lg n)
  = O(n lg n)

Analyzing Heapsort (more accurately)
• The call to BuildHeap() takes Θ(n) time
• Each of the n - 1 calls to Heapify() takes Θ(lg i) time for i = n downto 2.
• Thus the total time taken by HeapSort()
  $= \Theta(n) + \sum_{2 \le i \le n} \log i$
  $= \Theta(n) + \log \prod_{2 \le i \le n} i$
  $= \Theta(n) + \log(n!)$
  $= \Theta(n) + \Theta(n \lg n)$
  $= \Theta(n \lg n)$

Comparison

              Time complexity                            Stable?    In-place?
Merge sort    Θ(n log n)                                 Yes        No
Quick sort    Θ(n log n) expected, Θ(n^2) worst case     No         Yes
Heap sort     Θ(n log n)                                 No         Yes

Priority Queues
• Heapsort is a nice algorithm, but in practice Quicksort usually wins
• The heap data structure is incredibly useful for implementing priority queues
  – A data structure for maintaining a set S of elements, each with an associated value or key
  – Supports the operations Insert(), Maximum(), ExtractMax(), ChangeKey()
• What might a priority queue be useful for?

Your personal travel destination list
• You have a list of places that you want to visit, each with a preference score
• Always visit the place with the highest score
• Remove a place after visiting it
• You frequently add more destinations
• You may change the score for a place when you have more information
• What’s the best data structure?

Priority Queue Operations
• Insert(S, x) inserts the element x into set S
• Maximum(S) returns the element of S with the maximum key
• ExtractMax(S) removes and returns the element of S with the maximum key
• ChangeKey(S, i, key) changes the key for element i to something else
• How could we implement these operations using a heap?

Implementing Priority Queues
HeapMaximum(A)
{
    return A[1];
}

Implementing Priority Queues
HeapExtractMax(A)
{
    if (heap_size[A] < 1) { error; }
    max = A[1];
    A[1] = A[heap_size[A]];
    heap_size[A]--;
    Heapify(A, 1);
    return max;
}
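In Python, HeapExtractMax() might be sketched as follows. The heap is assumed to be a 0-based list whose length is the heap size, so "decrement heap_size" becomes `list.pop()`; the function names are chosen for this example.

```python
def sift_down(heap, i):
    """Iterative Heapify over the whole list."""
    n = len(heap)
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < n and heap[l] > heap[largest]:
            largest = l
        if r < n and heap[r] > heap[largest]:
            largest = r
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def heap_extract_max(heap):
    """Remove and return the largest element of a max-heap."""
    if not heap:
        raise IndexError("extract from an empty heap")
    top = heap[0]
    last = heap.pop()          # shrink the heap by one
    if heap:
        heap[0] = last         # move the last element to the root
        sift_down(heap, 0)     # and restore the heap property
    return top


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(heap_extract_max(h), h)
```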

HeapExtractMax Example
[Figure: tree view of each step]
• Start: A = 16 14 10 8 7 9 3 2 4 1
• Swap first and last, then remove last: A = 1 14 10 8 7 9 3 2 4 | 16
• Heapify: A = 14 8 10 4 7 9 3 2 1 | 16

Implementing Priority Queues
HeapChangeKey(A, i, key)
{
    if (key <= A[i]) { // decrease key
        A[i] = key;
        Heapify(A, i); // sift down
    } else { // increase key
        A[i] = key;
        // bubble up
        while (i > 1 && A[Parent(i)] < A[i]) {
            Swap(A[i], A[Parent(i)]);
            i = Parent(i);
        }
    }
}
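A Python sketch of HeapChangeKey(), assuming a 0-based list (parent of i is `(i - 1) // 2`). Note that the bubble-up loop has to move `i` to the parent after each swap, otherwise it would never terminate; the function names are chosen for this example.

```python
def sift_down(heap, i):
    """Iterative Heapify over the whole list."""
    n = len(heap)
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < n and heap[l] > heap[largest]:
            largest = l
        if r < n and heap[r] > heap[largest]:
            largest = r
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def heap_change_key(heap, i, key):
    """Set heap[i] to key and restore the max-heap property."""
    old = heap[i]
    heap[i] = key
    if key <= old:
        sift_down(heap, i)                 # decreased: may need to move down
    else:
        # increased: bubble up while larger than the parent
        while i > 0 and heap[(i - 1) // 2] < heap[i]:
            p = (i - 1) // 2
            heap[i], heap[p] = heap[p], heap[i]
            i = p


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_change_key(h, 3, 15)   # index 3 is the slides' 4th element
print(h)
```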

HeapChangeKey Example
HeapChangeKey(A, 4, 15): change the key value of the 4th element to 15
[Figure: tree view of each step]
• Start: A = 16 14 10 8 7 9 3 2 4 1
• Set A[4] = 15: A = 16 14 10 15 7 9 3 2 4 1
• Bubble up past 14: A = 16 15 10 14 7 9 3 2 4 1

Implementing Priority Queues
HeapInsert(A, key)
{
    heap_size[A]++;
    i = heap_size[A];
    A[i] = -∞;
    HeapChangeKey(A, i, key);
}
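HeapInsert() can be sketched in Python as below. Because the placeholder key is -∞, the ChangeKey call always takes the "increase key" branch, so this example inlines just the bubble-up loop; that simplification, the 0-based indexing, and the name `heap_insert` are assumptions of the sketch.

```python
def heap_insert(heap, key):
    """Insert key into a max-heap stored in a 0-based list."""
    heap.append(float("-inf"))     # new last leaf; -inf keeps the heap valid
    i = len(heap) - 1
    heap[i] = key                  # increase the key to its real value
    # bubble up while larger than the parent
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        p = (i - 1) // 2
        heap[i], heap[p] = heap[p], heap[i]
        i = p


h = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heap_insert(h, 17)
print(h)
```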

HeapInsert Example
HeapInsert(A, 17)
[Figure: tree view of each step]
• Start: A = 16 14 10 8 7 9 3 2 4 1
• Append -∞ (-∞ makes it a valid heap): A = 16 14 10 8 7 9 3 2 4 1 -∞
• Now call HeapChangeKey: A = 16 14 10 8 7 9 3 2 4 1 17
• After bubbling up: A = 17 16 10 8 14 9 3 2 4 1 7

• Heapify: Θ(log n)
• BuildHeap: Θ(n)
• HeapSort: Θ(n log n)

• HeapMaximum: Θ(1)
• HeapExtractMax: Θ(log n)
• HeapChangeKey: Θ(log n)
• HeapInsert: Θ(log n)

If we use a sorted array / linked list
• Sort: Θ(n log n)
• Afterwards:
  – arrayMaximum: Θ(1)
  – arrayExtractMax: Θ(n) or Θ(1)
  – arrayChangeKey: Θ(n)
  – arrayInsert: Θ(n)