CS 146 Data Structures and Algorithms July 9


CS 146: Data Structures and Algorithms
July 9 Class Meeting

Department of Computer Science
San Jose State University
Summer 2015
Instructor: Ron Mak
www.cs.sjsu.edu/~mak

Insertion Sort

o One of the simplest and most intuitive algorithms.
    n The way you would manually sort a deck of cards.
o Make N-1 passes over the list of data.
    n For pass p = 1 through N-1, the algorithm guarantees that, when the pass begins, the data in positions 0 through p-1 are already sorted.

Computer Science Dept. Summer 2015: July 9 CS 146: Data Structures and Algorithms © R. Mak

Insertion Sort

    public static <AnyType extends Comparable<? super AnyType>>
    void insertionSort(AnyType[] a)
    {
        int j;

        for (int p = 1; p < a.length; p++) {
            AnyType tmp = a[p];

            // Does this value belong in the sorted part?
            for (j = p; j > 0 && tmp.compareTo(a[j-1]) < 0; j--) {
                // Slide values in the sorted part of the list one to the
                // right to make room for a new member of the sorted part.
                a[j] = a[j-1];
            }

            a[j] = tmp;
        }
    }

o The inner for loop terminates quickly if the tmp value does not need to be inserted too far into the sorted part.
    n The entire sort finishes quickly if the data is nearly sorted: O(N).

Shellsort

o Like insertion sort, except we compare values that are h elements apart in the list.
    n h diminishes after completing a pass, for example, 5, 3, and 1.
o The final value of h must be 1, so the final pass is a regular insertion sort.
o The previous passes get the array “nearly sorted” quickly.

Shellsort, cont’d

o After each pass, the array is said to be h_k-sorted, where h_k is the increment used in that pass.
    n Examples: 5-sorted, 3-sorted, etc.
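
To make the definition concrete: an array is h-sorted when a[i-h] <= a[i] holds for every valid index i. The following is a minimal sketch of a checker for that property. The class name HSortedCheck and the method isHSorted are hypothetical helpers introduced here for illustration; they are not part of the slides.

```java
// Hypothetical helper (not from the slides): tests the h-sorted property.
class HSortedCheck {
    /** Returns true if a[i - h] <= a[i] for every i in [h, a.length). Assumes h >= 1. */
    static <T extends Comparable<? super T>> boolean isHSorted(T[] a, int h) {
        for (int i = h; i < a.length; i++) {
            if (a[i - h].compareTo(a[i]) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Integer[] a = {1, 0, 3, 2, 5, 4};
        // Elements two apart are in order, but adjacent elements are not.
        System.out.println("2-sorted: " + isHSorted(a, 2));
        System.out.println("1-sorted: " + isHSorted(a, 1));
    }
}
```

An array that is 1-sorted is simply sorted, which is why the final pass of Shellsort must use h = 1.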

Shellsort

    public static <AnyType extends Comparable<? super AnyType>>
    void shellsort(AnyType[] a)
    {
        int j;

        // Use the (suboptimal) sequence for h which starts at half the
        // list length and is halved for each subsequent pass.
        for (int h = a.length/2; h > 0; h /= 2) {
            for (int i = h; i < a.length; i++) {
                AnyType tmp = a[i];

                for (j = i; j >= h && tmp.compareTo(a[j-h]) < 0; j -= h) {
                    a[j] = a[j-h];
                }

                a[j] = tmp;
            }
        }
    }

Insertion Sort vs. Shellsort

o Insertion sort is slow because it swaps only adjacent values.
o A value may have to travel a long way through the array during a pass, one element at a time, to arrive at its proper place in the sorted part of the array.

Insertion Sort vs. Shellsort, cont’d

o Shellsort is able to move a value a longer distance (h) without making the value travel through the intervening values.
o Early passes with large h make it easier for later passes with smaller h to sort.
o The final value of h = 1 is a simple insertion sort.
o Choosing a good increment sequence for h can produce a 25% speedup of the sort.
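
As one illustration of a different increment sequence, the sketch below uses Hibbard's increments (1, 3, 7, ..., 2^k - 1) instead of repeated halving of the list length. The choice of Hibbard's sequence and the class name ShellsortHibbard are assumptions made here for the example; the slides do not say which sequence they have in mind, and no speedup is claimed for this particular code.

```java
// Sketch only: Shellsort with Hibbard's increments, chosen here as an example sequence.
class ShellsortHibbard {
    static <T extends Comparable<? super T>> void sort(T[] a) {
        // Find the largest increment of the form 2^k - 1 that is smaller than a.length.
        int h = 1;
        while (2 * h + 1 < a.length) {
            h = 2 * h + 1;
        }

        // Integer division maps 2^k - 1 to 2^(k-1) - 1, so the last pass uses h = 1.
        for (; h > 0; h /= 2) {
            for (int i = h; i < a.length; i++) {
                T tmp = a[i];
                int j;
                for (j = i; j >= h && tmp.compareTo(a[j - h]) < 0; j -= h) {
                    a[j] = a[j - h];
                }
                a[j] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        Integer[] a = {6, 1, 4, 9, 0, 3, 5, 2, 7, 8};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

The inner two loops are unchanged from the halving version; only the way h is initialized differs, which makes it easy to experiment with other sequences.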

Heapsort

o Heapsort is based on using a priority queue.
    n Which we implement as a binary heap.
    n Which we implement using an underlying array.
o To sort N values into increasing order:
    n Build a min heap.
        o O(N) time
    n Do N deletions to get the values in order.
        o Each deletion takes O(log N) time, so total O(N log N) time.

Heapsort, cont’d

o But where to put the sorted values?
o Append them to the end of the underlying array as values are being deleted one by one.
    n Each deletion shrinks the heap by one, which frees the last slot of the heap for the deleted value.
    n Filling the freed slots this way with a min heap leaves the array in decreasing order; using a max heap instead leaves it in increasing order.
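
The idea of storing each deleted value in the slot freed at the end of the array can be sketched as follows. This version uses a max heap so that the array ends up in increasing order; that choice, along with the names HeapsortSketch and siftDown, is an assumption made for this example rather than code from the slides.

```java
// Sketch only: in-place heapsort using a max heap stored in the array itself.
class HeapsortSketch {
    static <T extends Comparable<? super T>> void heapsort(T[] a) {
        int n = a.length;

        // Build the heap in O(N): sift down every non-leaf node, last one first.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }

        // Delete the maximum N-1 times, storing it in the slot the heap just gave up.
        for (int end = n - 1; end > 0; end--) {
            T max = a[0];
            a[0] = a[end];
            a[end] = max;
            siftDown(a, 0, end); // the heap now occupies a[0..end-1]
        }
    }

    /** Restores the max-heap property for the subtree rooted at i within a[0..size-1]. */
    private static <T extends Comparable<? super T>> void siftDown(T[] a, int i, int size) {
        T tmp = a[i];
        int child;
        for (; 2 * i + 1 < size; i = child) {
            child = 2 * i + 1;
            // Pick the larger of the two children.
            if (child + 1 < size && a[child + 1].compareTo(a[child]) > 0) {
                child++;
            }
            if (a[child].compareTo(tmp) > 0) {
                a[i] = a[child];
            } else {
                break;
            }
        }
        a[i] = tmp;
    }
}
```

No second array is needed: the sorted region grows from the right end while the heap shrinks from the same end.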

Mergesort

o Divide and conquer!
o Divide
    n Split the list of values into two halves.
    n Recursively sort each of the two halves.
o Conquer
    n Merge the two sorted sublists back into a single sorted list.
o Nearly the optimal number of comparisons.

Mergesort

    public static <AnyType extends Comparable<? super AnyType>>
    void mergeSort(AnyType[] a)
    {
        AnyType[] tmpArray = (AnyType[]) new Comparable[a.length];
        mergeSort(a, tmpArray, 0, a.length - 1);
    }

    private static <AnyType extends Comparable<? super AnyType>>
    void mergeSort(AnyType[] a, AnyType[] tmpArray, int left, int right)
    {
        if (left < right) {
            int center = (left + right)/2;
            mergeSort(a, tmpArray, left, center);
            mergeSort(a, tmpArray, center+1, right);
            merge(a, tmpArray, left, center+1, right);
        }
    }

Mergesort

    private static <AnyType extends Comparable<? super AnyType>>
    void merge(AnyType[] a, AnyType[] tmpArray,
               int leftPos, int rightPos, int rightEnd)
    {
        int leftEnd = rightPos - 1;
        int tmpPos = leftPos;
        int numElements = rightEnd - leftPos + 1;

        // Do the merge.
        while (leftPos <= leftEnd && rightPos <= rightEnd) {
            if (a[leftPos].compareTo(a[rightPos]) <= 0) {
                tmpArray[tmpPos++] = a[leftPos++];
            }
            else {
                tmpArray[tmpPos++] = a[rightPos++];
            }
        }

        ...
    }

Mergesort

    private static <AnyType extends Comparable<? super AnyType>>
    void merge(AnyType[] a, AnyType[] tmpArray,
               int leftPos, int rightPos, int rightEnd)
    {
        ...

        // Copy the rest of the first half.
        while (leftPos <= leftEnd) {
            tmpArray[tmpPos++] = a[leftPos++];
        }

        // Copy the rest of the second half.
        while (rightPos <= rightEnd) {
            tmpArray[tmpPos++] = a[rightPos++];
        }

        // Copy from the temporary array back into the original.
        for (int i = 0; i < numElements; i++, rightEnd--) {
            a[rightEnd] = tmpArray[rightEnd];
        }
    }

Analysis of Mergesort

o How long does it take mergesort to run?
    n Let T(N) be the time to sort N values.
    n It takes a constant 1 if N = 1.
    n It takes T(N/2) to sort each half.
    n It takes N to do the merge.
o Therefore, we have a recurrence relation:

\[
T(N) = \begin{cases} 1 & \text{if } N = 1 \\ 2\,T(N/2) + N & \text{if } N > 1 \end{cases}
\]

Analysis of Mergesort

o Solve:

\[
T(N) = \begin{cases} 1 & \text{if } N = 1 \\ 2\,T(N/2) + N & \text{if } N > 1 \end{cases}
\]

o Assume N is a power of 2.
o Divide both sides by N:

\[
\frac{T(N)}{N} = \frac{T(N/2)}{N/2} + 1
\]

o Telescope: Since the equation is valid for any N that’s a power of 2, successively replace N by N/2:

\[
\frac{T(N/2)}{N/2} = \frac{T(N/4)}{N/4} + 1, \qquad
\frac{T(N/4)}{N/4} = \frac{T(N/8)}{N/8} + 1, \qquad \ldots, \qquad
\frac{T(2)}{2} = \frac{T(1)}{1} + 1
\]

o Add together, and many convenient cancellations will occur.

Analysis of Mergesort

\[
\frac{T(N)}{N} = \frac{T(1)}{1} + \log_2 N
\]

since there are log N 1’s, one for each equation in the telescoped sum. Multiply through by N:

\[
T(N) = N \log_2 N + N
\]

o And so mergesort runs in O(N log N) time.
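
As a sanity check on the analysis, the recurrence T(1) = 1, T(N) = 2T(N/2) + N can be evaluated directly and compared with the closed form N log2 N + N for powers of 2. The sketch below does exactly that; the class and method names are hypothetical and exist only for this check.

```java
// Sketch only: numerically compares the mergesort recurrence with its closed form.
class MergesortRecurrence {
    /** T(1) = 1 and T(N) = 2 T(N/2) + N, defined here for N a power of 2. */
    static long t(long n) {
        return (n == 1) ? 1 : 2 * t(n / 2) + n;
    }

    /** Closed form N log2(N) + N, valid for N a power of 2. */
    static long closedForm(long n) {
        long log2 = Long.numberOfTrailingZeros(n); // exact log2 for powers of 2
        return n * log2 + n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println(n + ": recurrence = " + t(n) + ", closed form = " + closedForm(n));
        }
    }
}
```

For example, T(4) = 2 T(2) + 4 = 2 (2 · 1 + 2) + 4 = 12, which agrees with 4 · 2 + 4 = 12.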

Mergesort for Linked Lists

o Mergesort does not rely on random access to the values in the list.
o Therefore, it is well-suited for sorting linked lists.

Mergesort for Linked Lists, cont’d

o How do we split a linked list into two sublists?
    n Splitting it at the midpoint is not efficient.
o Idea: Iterate down the list and assign the nodes alternating between the two sublists.
o Merging two sorted sublists should be easy.
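
The alternating split and the merge can be sketched as below for a singly linked list of int values. The Node class and all method names are hypothetical, introduced only for this example. Note one consequence of this particular sketch: because nodes are pushed onto the front of each sublist, equal values may change their relative order, so the sort is not stable.

```java
// Sketch only: mergesort on a hand-rolled singly linked list.
class LinkedListMergesort {
    static final class Node {
        int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    static Node sort(Node head) {
        if (head == null || head.next == null) {
            return head; // zero or one node: already sorted
        }

        // Split: deal the nodes alternately onto two sublists in one pass.
        // The order within each sublist does not matter because both get sorted.
        Node first = null;
        Node second = null;
        boolean toFirst = true;
        while (head != null) {
            Node next = head.next;
            if (toFirst) { head.next = first;  first = head; }
            else         { head.next = second; second = head; }
            toFirst = !toFirst;
            head = next;
        }

        return merge(sort(first), sort(second));
    }

    /** Merges two sorted lists by relinking nodes; no random access is required. */
    static Node merge(Node a, Node b) {
        Node dummy = new Node(0, null);
        Node tail = dummy;
        while (a != null && b != null) {
            if (a.value <= b.value) { tail.next = a; a = a.next; }
            else                    { tail.next = b; b = b.next; }
            tail = tail.next;
        }
        tail.next = (a != null) ? a : b; // append whatever remains
        return dummy.next;
    }
}
```

Both the split and the merge walk the lists strictly front to back, which is why the approach needs no midpoint search and no temporary array.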

Break

Partitioning a List of Values

o Are there better ways to partition (split) a list of values other than down the middle?

Partitioning a List of Values, cont’d

o Pick an arbitrary “pivot value” in the list.
o Move all the values less than the pivot value into one sublist.
o Move all the values greater than the pivot value into the other sublist.
o Now the pivot value is in its “final resting place”.
    n It’s in the correct position for the sorted list.
o Recursively sort the two sublists.
    n The pivot value doesn’t move.
o Challenge: Find a good pivot value.

[Figure from the textbook by Mark Allen Weiss. (c) 2006 Pearson Education, Inc. All rights reserved. 0-13-257627-9]

Partition a List Using a Pivot

o Given a list, pick an element to be the pivot.
    n There are various strategies to pick the pivot.
    n The simplest is to pick the first element of the list.
o First get the chosen pivot value “out of the way” by swapping with the value currently at the right end.

    Before the swap (pivot 6 is first):    6 1 4 9 0 3 5 2 7 8
    After the swap (pivot 6 is last):      8 1 4 9 0 3 5 2 7 6

Partition a List Using a Pivot, cont’d

    8 1 4 9 0 3 5 2 7 6

o Goal: Move all values < pivot to the left part of the list and all values > pivot to the right part of the list.

Partition a List Using a Pivot, cont’d

o Set index i to the left end of the list and index j to one from the right end.

    8 1 4 9 0 3 5 2 7 6    (i at 8, j at 7)

o While i < j:
    n Move i right, skipping over values < pivot.
        o Stop i when it reaches a value ≥ pivot.
    n Move j left, skipping over values > pivot.
        o Stop j when it reaches a value ≤ pivot.
    n After both i and j have stopped, swap the values at i and j.

Partition a List Using a Pivot, cont’d

    Move j:                                 8 1 4 9 0 3 5 2 7 6    (i at 8, j at 2)
    Swap:                                   2 1 4 9 0 3 5 8 7 6    (i at 2, j at 8)
    Move i and j:                           2 1 4 9 0 3 5 8 7 6    (i at 9, j at 5)
    Swap:                                   2 1 4 5 0 3 9 8 7 6    (i at 5, j at 9)
    Move i and j. They’ve crossed!          2 1 4 5 0 3 9 8 7 6    (j at 3, i at 9)
    Swap the pivot with the ith element:    2 1 4 5 0 3 6 8 7 9    (j at 3, i at 6)

Now the list is properly partitioned for quicksort!
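
The trace above can be reproduced with a few lines of code. The sketch below works on a plain int array and assumes, as in the slides, that the first element is chosen as the pivot and parked at the right end before partitioning. The class name PartitionTrace and its helper methods are hypothetical.

```java
import java.util.Arrays;

// Sketch only: reproduces the partitioning walk-through on the example list.
class PartitionTrace {
    /** Partitions a[left..right]; the pivot must already be at a[right]. Returns its final index. */
    static int partition(int[] a, int left, int right) {
        int pivot = a[right];
        int i = left - 1;
        int j = right;

        while (i < j) {
            do { i++; } while (a[i] < pivot);               // a[right] == pivot stops i at the latest
            do { j--; } while (j >= left && a[j] > pivot);  // guard keeps j inside the sublist
            if (i < j) {
                swap(a, i, j);
            }
        }

        swap(a, i, right); // put the pivot into its final position
        return i;
    }

    static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {6, 1, 4, 9, 0, 3, 5, 2, 7, 8};
        swap(a, 0, a.length - 1);                 // first element is the pivot; move it out of the way
        int p = partition(a, 0, a.length - 1);
        System.out.println("pivot index " + p + ": " + Arrays.toString(a));
    }
}
```

Running the same steps by hand gives the final arrangement 2 1 4 5 0 3 6 8 7 9 with the pivot 6 at index 6, matching the last row of the trace.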

Sorting Statistics

    public class Stats
    {
        long moves;
        long compares;
        long time;
        ...
    }

Quicksort

One of the most elegant and useful algorithms in computer science.

o A fast divide-and-conquer sorting algorithm.
    n A very tight and highly optimized inner loop.
    n Looks like magic in animation.
o Average running time is O(N log N).
o Worst-case running time is O(N^2).
    n The worst case can be made very unlikely to occur.
o Basic idea:
    n Partition the list using a pivot.
    n Recursively sort the two sublists.
o Sounds like mergesort, but does not require merging or a temporary array.

Quicksort Pivot Strategy

o Quicksort is a fragile algorithm!
    n It is sensitive to picking a good pivot.
    n Attempts to improve the algorithm can break it.
o Simplest pivot strategy: Pick the first element of the list.
    n Worst strategy if the list is already sorted.
    n Running time O(N^2).
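
To see the quadratic behavior, one can count comparisons while quicksorting an already sorted array with the first element as the pivot. The sketch below is a simplified int-array version written for this purpose; the class FirstPivotWorstCase, its static counter, and the method names are assumptions of the example, not the course's Stats-based code.

```java
// Sketch only: counts comparisons for first-element-pivot quicksort.
class FirstPivotWorstCase {
    static long compares;

    static long countCompares(int[] a) {
        compares = 0;
        quicksort(a, 0, a.length - 1);
        return compares;
    }

    private static void quicksort(int[] a, int left, int right) {
        if (left >= right) {
            return;
        }
        swap(a, left, right);          // pivot is the first element; park it at the right end
        int pivot = a[right];
        int i = left - 1;
        int j = right;
        while (i < j) {
            do { i++; compares++; } while (a[i] < pivot);
            do { j--; compares++; } while (j >= left && a[j] > pivot);
            if (i < j) {
                swap(a, i, j);
            }
        }
        swap(a, i, right);             // restore the pivot's position
        quicksort(a, left, i - 1);
        quicksort(a, i + 1, right);
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        for (int n = 250; n <= 2000; n *= 2) {
            int[] sorted = new int[n];
            for (int k = 0; k < n; k++) sorted[k] = k;
            System.out.println("N = " + n + ", compares = " + countCompares(sorted));
        }
    }
}
```

On sorted input every partition peels off just the pivot, so the sublist shrinks by one element per call and the comparison count roughly quadruples each time N doubles. The recursion is also N levels deep, so very large sorted inputs may overflow the call stack in this sketch.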

First Element Pivot Strategy

    public interface PivotStrategy
    {
        public Integer choosePivot(Integer[] a, int left, int right, Stats stats);
    }

    public class PivotFirst implements PivotStrategy
    {
        public Integer choosePivot(Integer[] a, int left, int right, Stats stats)
        {
            // Pivot is the first element. Swap it with the right.
            Utilities.swapReferences(a, left, right);
            stats.moves += 2;

            return a[right];
        }
    }

Demo

Median-of-Three Pivot Strategy

o A good pivot value would be the median value of the list.
    n The median of a list of unsorted numbers is nontrivial to compute.
o Compromise:
    n Examine the two values at the ends of the list and the value at the middle position of the list.
    n Choose the value that’s in between the other two.

Median-of-Three Pivot Strategy, cont’d

    public class PivotMedianOfThree implements PivotStrategy
    {
        public Integer choosePivot(Integer[] a, int left, int right, Stats stats)
        {
            int center = (left + right)/2;

            // Order the left, center, and right elements.
            if (a[center].compareTo(a[left]) < 0) {
                Utilities.swapReferences(a, left, center);
                stats.moves += 2;
            }
            if (a[right].compareTo(a[left]) < 0) {
                Utilities.swapReferences(a, left, right);
                stats.moves += 2;
            }
            if (a[right].compareTo(a[center]) < 0) {
                Utilities.swapReferences(a, center, right);
                stats.moves += 2;
            }

            stats.compares += 3;

            // (the method continues on the next slide)

Median-of-Three Pivot Strategy, cont’d

            // Pivot is the center element. Swap it with the right.
            Utilities.swapReferences(a, center, right);
            stats.moves += 2;

            return a[right];
        }
    }

Quicksort Recursion

    private Stats quicksort(Integer[] a, int left, int right)
    {
        Stats stats = new Stats();

        if (left <= right) {
            Integer pivot = pivotStrategy.choosePivot(a, left, right, stats);
            int p = partition(a, left, right, pivot, stats);

            Stats stats1 = quicksort(a, left, p-1);   // Sort small elements
            Stats stats2 = quicksort(a, p+1, right);  // Sort large elements

            stats.moves    += (stats1.moves    + stats2.moves);
            stats.compares += (stats1.compares + stats2.compares);
        }

        return stats;
    }

Quicksort Partitioning

    private int partition(Integer[] a, int left, int right,
                          Integer pivot, Stats stats)
    {
        int i = left-1;
        int j = right;

        while (i < j) {
            // Move i to the right.
            do {
                i++;
                stats.compares++;
            } while ((i <= right) && a[i].compareTo(pivot) < 0);

            // Move j to the left.
            do {
                j--;
                stats.compares++;
            } while ((j >= left) && a[j].compareTo(pivot) > 0);

            // Swap.
            if (i < j) {
                Utilities.swapReferences(a, i, j);
                stats.moves += 2;
            }
        }

        // (the method continues on the next slide)

Quicksort Partitioning, cont’d

        // Restore the pivot’s position.
        Utilities.swapReferences(a, i, right);
        stats.moves += 2;

        return i;
    }

Mergesort vs. Quicksort

o In the standard Java library:
    n Mergesort is used to sort arrays of object types.
        o It uses the lowest number of comparisons.
        o Comparing objects can be slow in Java for objects that implement the Comparable interface.
    n Quicksort is used to sort arrays of primitive types.
o In the standard C++ library:
    n Quicksort is used for the generic sort.
        o Copying large objects can be expensive.
        o Comparing objects can be cheap if the compiler can generate optimized code to do comparisons inline.
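
In Java, the choice between the two families is made by overload resolution on java.util.Arrays.sort. The sketch below simply calls both overloads. According to the JDK documentation for recent releases, the primitive overload is a dual-pivot quicksort and the object overload is a stable mergesort variant, although the exact algorithms are implementation details that can change between versions. The class LibrarySortDemo and its method names are hypothetical.

```java
import java.util.Arrays;

// Sketch only: the same call name selects different algorithms by element type.
class LibrarySortDemo {
    /** Sorts a copy using the primitive overload Arrays.sort(int[]). */
    static int[] sortedPrimitives(int[] values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    /** Sorts a copy using the object overload Arrays.sort(Object[]), which relies on Comparable. */
    static Integer[] sortedObjects(Integer[] values) {
        Integer[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        int[] primitives = {6, 1, 4, 9, 0, 3, 5, 2, 7, 8};
        Integer[] objects = {6, 1, 4, 9, 0, 3, 5, 2, 7, 8};
        System.out.println(Arrays.toString(sortedPrimitives(primitives)));
        System.out.println(Arrays.toString(sortedObjects(objects)));
    }
}
```

Stability only matters for objects, where two elements can compare as equal yet still be distinguishable; for primitives, equal values are interchangeable.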

Quicksort

o Quicksort doesn’t do well for very short lists.
o When a sublist becomes too small, use another algorithm to sort the sublist such as insertion sort.
    n The textbook uses a cutoff of size 10 for a sublist.
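
A cutoff can be added with a single test at the top of the recursion. The sketch below works on int arrays, uses a median-of-three pivot, and hands sublists of at most CUTOFF elements to insertion sort. The class name, the constant name, and the int-array simplification are assumptions of this example; only the cutoff value of 10 comes from the slide.

```java
// Sketch only: quicksort that switches to insertion sort for small sublists.
class QuicksortWithCutoff {
    static final int CUTOFF = 10; // sublist size at or below which insertion sort takes over

    static void sort(int[] a) {
        quicksort(a, 0, a.length - 1);
    }

    private static void quicksort(int[] a, int left, int right) {
        if (right - left + 1 <= CUTOFF) {
            insertionSort(a, left, right);
            return;
        }

        // Median of three: order left, center, right, then park the median at the right end.
        int center = left + (right - left) / 2;
        if (a[center] < a[left])  swap(a, left, center);
        if (a[right] < a[left])   swap(a, left, right);
        if (a[right] < a[center]) swap(a, center, right);
        swap(a, center, right);
        int pivot = a[right];

        int i = left - 1;
        int j = right;
        while (i < j) {
            do { i++; } while (a[i] < pivot);
            do { j--; } while (j >= left && a[j] > pivot);
            if (i < j) swap(a, i, j);
        }
        swap(a, i, right); // restore the pivot's position

        quicksort(a, left, i - 1);
        quicksort(a, i + 1, right);
    }

    /** Insertion sort restricted to a[left..right]. */
    private static void insertionSort(int[] a, int left, int right) {
        for (int p = left + 1; p <= right; p++) {
            int tmp = a[p];
            int j;
            for (j = p; j > left && tmp < a[j - 1]; j--) {
                a[j] = a[j - 1];
            }
            a[j] = tmp;
        }
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }
}
```

The best cutoff depends on the machine and the element type, so treating CUTOFF as a tunable constant and measuring is a reasonable design choice.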

Sorting Animations

o https://www.cs.usfca.edu/~galles/visualization/ComparisonSort.html
o http://www.sorting-algorithms.com