ECE 250 Algorithms and Data Structures Insertion sort

  • Slides: 31
Download presentation
ECE 250 Algorithms and Data Structures Insertion sort Douglas Wilhelm Harder, M. Math. LEL

ECE 250 Algorithms and Data Structures Insertion sort Douglas Wilhelm Harder, M. Math. LEL Department of Electrical and Computer Engineering University of Waterloo, Ontario, Canada ece. uwaterloo. ca dwharder@alumni. uwaterloo. ca © 2006 -2013 by Douglas Wilhelm Harder. Some rights reserved.

Insertion sort 2 Outline This topic discusses the insertion sort We will discuss: –

Insertion sort 2 Outline This topic discusses the insertion sort We will discuss: – the algorithm – an example – run-time analysis • worst case • average case • best case

Insertion sort 3 8. 2 Background Consider the following observations: – A list with

Insertion sort 3 8. 2 Background Consider the following observations: – A list with one element is sorted – In general, if we have a sorted list of k items, we can insert a new item to create a sorted list of size k + 1

Insertion sort 4 Background 8. 2 For example, consider this sorted array containing of

Insertion sort 4 Background 8. 2 For example, consider this sorted array containing of eight sorted entries 5 7 12 19 21 26 33 40 14 9 18 21 2 Suppose we want to insert 14 into this array leaving the resulting array sorted

Insertion sort 5 8. 2 Background Starting at the back, if the number is

Insertion sort 5 8. 2 Background Starting at the back, if the number is greater than 14, copy it to the right – Once an entry less than 14 is found, insert 14 into the resulting vacancy

Insertion sort 6 The Algorithm 8. 2. 1 For any unsorted list: – Treat

Insertion sort 6 The Algorithm 8. 2. 1 For any unsorted list: – Treat the first element as a sorted list of size 1 Then, given a sorted list of size k – 1 – Insert the kth item in the unsorted list into it into the sorted list – The sorted list is now of size k + 1

Insertion sort 7 The Algorithm 8. 2. 1 Recall the five sorting techniques: –

Insertion sort 7 The Algorithm 8. 2. 1 Recall the five sorting techniques: – Insertion – Exchange – Selection – Merging – Distribution Clearly insertion falls into the first category

Insertion sort 8 8. 2. 2 The Algorithm Code for this would be: for

Insertion sort 8 8. 2. 2 The Algorithm Code for this would be: for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 9 8. 2. 2 Implementation and Analysis This would be embedded in

Insertion sort 9 8. 2. 2 Implementation and Analysis This would be embedded in a function call such as template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 10 8. 2. 3 Implementation and Analysis Let’s do a run-time analysis

Insertion sort 10 8. 2. 3 Implementation and Analysis Let’s do a run-time analysis of this code template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 11 8. 2. 3 Implementation and Analysis The Q(1)-initialization of the outer

Insertion sort 11 8. 2. 3 Implementation and Analysis The Q(1)-initialization of the outer for-loop is executed once template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 12 8. 2. 3 Implementation and Analysis This Q(1)- condition will be

Insertion sort 12 8. 2. 3 Implementation and Analysis This Q(1)- condition will be tested n times at which point it fails template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 13 8. 2. 3 Implementation and Analysis Thus, the inner for-loop will

Insertion sort 13 8. 2. 3 Implementation and Analysis Thus, the inner for-loop will be executed a total of n – 1 times template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 14 8. 2. 3 Implementation and Analysis In the worst case, the

Insertion sort 14 8. 2. 3 Implementation and Analysis In the worst case, the inner for-loop is executed a total of k times template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 15 8. 2. 3 Implementation and Analysis The body of the inner

Insertion sort 15 8. 2. 3 Implementation and Analysis The body of the inner for-loop runs in Q(1) in either case template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } } } Thus, the worst-case run time } is

Insertion sort 16 8. 2. 3 Implementation and Analysis Problem: we may break out

Insertion sort 16 8. 2. 3 Implementation and Analysis Problem: we may break out of the inner loop… template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 17 8. 2. 3 Implementation and Analysis Recall: each time we perform

Insertion sort 17 8. 2. 3 Implementation and Analysis Recall: each time we perform a swap, we remove an inversion template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 18 8. 2. 3 Implementation and Analysis As soon as a pair

Insertion sort 18 8. 2. 3 Implementation and Analysis As soon as a pair array[j - 1] <= array[j], we are finished template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } }

Insertion sort 19 8. 2. 3 Implementation and Analysis Thus, the body is run

Insertion sort 19 8. 2. 3 Implementation and Analysis Thus, the body is run only as often as there are inversions template <typename Type> void insertion_sort( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { for ( int j = k; j > 0; --j ) { if ( array[j - 1] > array[j] ) { std: : swap( array[j - 1], array[j] ); } else { // As soon as we don't need to swap, the (k + 1)st // is in the correct location break; } } If the number of inversions is d, the run time is Q(n + d)

Insertion sort 20 8. 2. 3 Consequences of Our Analysis A random list will

Insertion sort 20 8. 2. 3 Consequences of Our Analysis A random list will have d = O(n 2) inversions – The average random list has d = Q(n 2) inversions – Insertion sort, however, will run in Q(n) time whenever d = O(n) Other benefits: – The algorithm is easy to implement – Even in the worst case, the algorithm is fast for small problems – Considering these run times, Approximate it appears to be approximately Size Time (ns) 10 instructions per inversion 8 175 16 750 32 2700 64 8000

Insertion sort 21 8. 2. 3 Consequences of Our Analysis Unfortunately, it is not

Insertion sort 21 8. 2. 3 Consequences of Our Analysis Unfortunately, it is not very useful in general: – Sorting a random list of size 223 ≈ 8 000 would require approximately one day Doubling the size of the list quadruples the required run time – An optimized quick sort requires less than 4 s on a list of the above size

Insertion sort 22 Consequences of Our Analysis 8. 2. 3 The following table summarizes

Insertion sort 22 Consequences of Our Analysis 8. 2. 3 The following table summarizes the run-times of insertion sort Case Worst Average Best Run Time Q(n 2) O(d + n) Q(n) Comments Reverse sorted Slow if d = w(n) Very few inversions: d = O(n)

Insertion sort 23 8. 2. 4 The Algorithm Now, swapping is expensive, so we

Insertion sort 23 8. 2. 4 The Algorithm Now, swapping is expensive, so we could just temporarily assign the new entry – this reduces assignments by a factor of 3 – speeds up the algorithm by a factor of two

Insertion sort 24 Implementation and Analysis 8. 2. 4 A reasonably good* implementation template

Insertion sort 24 Implementation and Analysis 8. 2. 4 A reasonably good* implementation template <typename Type> void insertion( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { Type tmp = array[k]; for ( int j = k; k > 0; --j ) { if ( array[j - 1] > tmp ) { array[j] = array[j - 1]; } else { array[j] = tmp; goto finished; } } array[0] = tmp; finished: ; // only executed if tmp < array[0] // empty statement } } * we could do better with pointers

Insertion sort A Comment about Using goto 8. 2. 5 Why use goto? Isn’t

Insertion sort A Comment about Using goto 8. 2. 5 Why use goto? Isn’t goto “evil”? ! template <typename Type> void insertion( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { Type tmp = array[k]; for ( int j = k; k > 0; --j ) { if ( array[j - 1] > tmp ) { array[j] = array[j - 1]; } else { array[j] = tmp; goto finished; } } array[0] = tmp; finished: ; } } // only executed if tmp < array[0] // empty statement 25

Insertion sort 8. 2. 5 A Comment about Using goto Even xkcd knows it

Insertion sort 8. 2. 5 A Comment about Using goto Even xkcd knows it is evil! http: //xkcd. com/292/ 26

Insertion sort A Comment about Using goto 8. 2. 5 Unfortunately, any other fix

Insertion sort A Comment about Using goto 8. 2. 5 Unfortunately, any other fix appears to be less than elegant… template <typename Type> void insertion( Type *const array, int const n ) { for ( int k = 1; k < n; ++k ) { Type tmp = array[k]; for ( int j = k; k > 0; --j ) { if ( array[j - 1] > tmp ) { array[j] = array[j - 1]; } else { array[j] = tmp; break; } } if ( array[0] > tmp ) { array[0] = tmp; // only executed if tmp < array[0] } } } 27

Insertion sort 8. 2. 5 A Comment about Using goto The excessive use of

Insertion sort 8. 2. 5 A Comment about Using goto The excessive use of the goto statement resulted in spaghetti code – Edsger Dijkstra’s 1968 commentary “Go To Statement Considered Harmful” started what (against Dijkstra’s better judgment) took on the form of a religious crusade against using goto – Donald Knuth gives a more reasoned approach in his 1972 paper “Structured Programming with go to Statements” Unfortunately, in 2011, one student saw this and immediately used a goto that resulted in spaghetti code and claimed it was “reasonable” – In general, use a goto if and only if it is going forward to an obvious point in the routine 28

Insertion sort 29 Summary Insertion Sort: – Insert new entries into growing sorted lists

Insertion sort 29 Summary Insertion Sort: – Insert new entries into growing sorted lists – Run-time analysis • Actual and average case run time: • Detailed analysis: • Best case (O(n) inversions): – Memory requirements: Q(1) O(n 2) Q(n + d) Q(n)

Insertion sort 30 References [1] [2] [3] [4] [5] Donald E. Knuth, The Art

Insertion sort 30 References [1] [2] [3] [4] [5] Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, 2 nd Ed. , Addison Wesley, 1998, § 5. 2. 1, p. 80 -82. Cormen, Leiserson, and Rivest, Introduction to Algorithms, Mc. Graw Hill, 1990, p. 2 -4, 6 -9. Weiss, Data Structures and Algorithm Analysis in C++, 3 rd Ed. , Addison Wesley, § 8. 2, p. 262 -5. Edsger Dijkstra, Go To Statement Considered Harmful, Communications of the ACM 11 (3), pp. 147– 148, 1968.   Donald Knuth, Structured Programming with Goto Statements, Computing Surveys 6 (4): pp. 261– 301, 1972.  

Insertion sort 31 References Wikipedia, http: //en. wikipedia. org/wiki/Insertion_sort [1] [2] [3] [4] [5]

Insertion sort 31 References Wikipedia, http: //en. wikipedia. org/wiki/Insertion_sort [1] [2] [3] [4] [5] Donald E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching, 2 nd Ed. , Addison Wesley, 1998, § 5. 2. 1, p. 80 -82. Cormen, Leiserson, and Rivest, Introduction to Algorithms, Mc. Graw Hill, 1990, p. 2 -4, 6 -9. Weiss, Data Structures and Algorithm Analysis in C++, 3 rd Ed. , Addison Wesley, § 8. 2, p. 262 -5. Edsger Dijkstra, Go To Statement Considered Harmful, Communications of the ACM 11 (3), pp. 147– 148, 1968.   Donald Knuth, Structured Programming with Goto Statements, Computing Surveys 6 (4): pp. 261– 301, 1972.   These slides are provided for the ECE 250 Algorithms and Data Structures course. The material in it reflects Douglas W. Harder’s best judgment in light of the information available to him at the time of preparation. Any reliance on these course slides by any party for any other purpose are the responsibility of such parties. Douglas W. Harder accepts no responsibility for damages, if any, suffered by any party as a result of decisions made or actions based on these course slides for any other purpose than that for which it was intended.