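Sorting & Searching
蔡文能 (tsaiwn@csie.nctu.edu.tw)
Department of Computer Science and Information Engineering, National Chiao Tung University
Introduction to Computer Science, 蔡文能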

Problem Solving Steps
1. Understand the problem
2. Get an idea
3. Formulate the algorithm and represent it as a program (or pseudocode)
4. Evaluate the program
   1. For accuracy
   2. For its potential as a tool for solving other problems

Sorting
• Take a set of items whose order is unknown
• Return the same items in order
• For instance:
  – Sorting names alphabetically
  – Sorting by score in descending order
  – Sorting by height in ascending order
• Issues of interest:
  – Running time in the worst case and in the average and other cases
  – Space requirements

Common Simple Sorting Techniques
• Insertion Sort
• Selection Sort
• Bubble Sort (sibling exchange sort: compare and exchange adjacent elements)
• Other sorting techniques
  – Quick Sort, Heap Sort, Merge Sort, ...
  – Shell Sort, Fibonacci Sort

Insertion Sort
[Figure: sorting the list Fred, Alice, David, Bill, and Carol alphabetically]

[Figure: sorting the list Fred, Alice, David, Bill, and Carol alphabetically (continued)]

[Figure: sorting the list Fred, Alice, David, Bill, and Carol alphabetically (continued)]

The insertion sort algorithm expressed in pseudocode
Key idea: keep part of the array sorted at all times

Insertion Sort in C (ascending order)
Array indices run 0, 1, ..., nox-1.

void sort(double x[], int nox)
{
    int n = 1, k;
    double tmp;                 /* C arrays start at index 0 */
    while (n <= nox - 1) {      /* x[0..n-1] is already sorted */
        k = n - 1;
        tmp = x[n];             /* save the element to insert in tmp first */
        /* short-circuit (lazy) evaluation: x[k] is read only when k >= 0 */
        while (k >= 0 && x[k] > tmp) {
            x[k + 1] = x[k];    /* copy the earlier element one slot to the right */
            --k;
        }
        x[k + 1] = tmp;
        ++n;                    /* check next element */
    }
}

Test the Sort Algorithm

#include <stdio.h>

double y[] = {15, 38, 12, 75, 20, 66, 49, 58};

void pout(double *, int);
void sort(double *, int);

int main(void)
{
    printf("Before sort:\n");
    pout(y, 8);
    sort(y, sizeof(y) / sizeof(double));
    printf("After sort:\n");
    pout(y, 8);
    return 0;
}

void pout(double *p, int n)
{
    int i;
    for (i = 0; i <= n - 1; ++i) {
        printf("%7.2f ", p[i]);
    }
    printf("\n");
}

Output:
Before sort:
  15.00   38.00   12.00   75.00   20.00   66.00   49.00   58.00
After sort:
  12.00   15.00   20.00   38.00   49.00   58.00   66.00   75.00

Insertion Sort Summary
• Best case: already sorted, O(n)
• Worst case (input in exactly reverse order):
  – Number of comparisons: O(n^2)
  – Number of exchanges: O(n^2)
• Space: no external storage needed
• In practice, good for small sets (fewer than 30 items)
• Very efficient on nearly sorted inputs
• To reduce the number of data exchanges, use Selection Sort
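The best-case and worst-case counts in this summary can be checked empirically. The following is a minimal sketch, not the lecture's code: `sort_stats` and `insertion_sort_counted` are hypothetical names introduced here, and the function is an ordinary ascending insertion sort with two counters added.

```c
/* Hypothetical helper type: tallies the work done by one sort call. */
typedef struct {
    long comparisons;   /* number of element-to-element comparisons */
    long moves;         /* number of elements shifted one slot right */
} sort_stats;

/* Ascending insertion sort that also counts comparisons and shifts. */
sort_stats insertion_sort_counted(double x[], int nox)
{
    sort_stats st = {0, 0};
    for (int n = 1; n < nox; ++n) {
        double tmp = x[n];          /* element to insert into x[0..n-1] */
        int k = n - 1;
        while (k >= 0) {
            ++st.comparisons;
            if (x[k] <= tmp) break; /* found the insertion point */
            x[k + 1] = x[k];
            ++st.moves;
            --k;
        }
        x[k + 1] = tmp;
    }
    return st;
}
```

On five elements that are already sorted, this sketch reports 4 comparisons and 0 moves (linear work); on five elements in reverse order it reports 10 comparisons and 10 moves, which is n(n-1)/2 for n = 5.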

Selection Sort (ascending order); array indices run from 0 to n-1
Each pass selects the smallest of the remaining elements.

void sort(double x[], int nox)
{
    int i, k, candt;
    double tmp;
    for (i = 0; i < nox - 1; ++i) {
        candt = i;                      /* assume this is our candidate */
        for (k = i + 1; k <= nox - 1; ++k) {
            if (x[k] < x[candt]) candt = k;   /* that is it */
        }
        tmp = x[i]; x[i] = x[candt]; x[candt] = tmp;   /* element i is now in its final position */
    }
}

SelectSort(array A, length n): another version of selection sort
Array indices run from 0 to n-1. The element selected in each pass goes to the end (position i).

1. for i ← n-1 to 1                   // note we are going down
2.     largest_index ← 0              // assume the 0-th is the largest
3.     for j ← 1 to i                 // loop finds the max in [0..i]
4.         if A[j] > A[largest_index]
5.             largest_index ← j
6.     next j
7.     swap(A[i], A[largest_index])   // put max in i
8. next i
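The pseudocode above translates almost line for line into C. This is a sketch under the assumption that the elements are `double` values, as in the lecture's other C examples; the function name `selection_sort_max` is introduced here for illustration.

```c
/* Selection sort that moves the largest remaining element to the end. */
void selection_sort_max(double a[], int n)
{
    for (int i = n - 1; i >= 1; --i) {      /* going down */
        int largest = 0;                    /* assume a[0] is the largest */
        for (int j = 1; j <= i; ++j) {      /* find the max in a[0..i] */
            if (a[j] > a[largest]) largest = j;
        }
        double tmp = a[i];                  /* put the max in position i */
        a[i] = a[largest];
        a[largest] = tmp;
    }
}
```

Each outer iteration performs exactly one swap, so the number of exchanges is n-1 regardless of the input order.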

Selection Sort Summary
• Best case: already sorted
  – Passes: n-1
  – Comparisons in pass k: n-k
  – Number of comparisons: (n-1)+(n-2)+…+1 = O(n^2)
• Worst case: O(n^2)
• Space: no external storage needed
• Very few exchanges:
  – Always n-1 (better than Bubble Sort)

Bubble Sort v1 (ascending order); array indices run from 0 to n-1
Also called sibling exchange sort: compare and exchange adjacent elements.

void sort(double x[], int nox)
{
    int i, k;
    double tmp;
    for (i = nox - 1; i >= 1; --i) {
        for (k = 0; k < i; ++k) {
            if (x[k] > x[k + 1]) {   /* left is larger than right: exchange needed */
                tmp = x[k]; x[k] = x[k + 1]; x[k + 1] = tmp;
            }
        } // for k
    } // for i
}

Bubble Sort v1 example (1/2)
[Figure: pass 1 on 15 38 12 75 20 66 49 58; the larger value of each adjacent pair moves to the right]
Pass 1: 7 comparisons
After pass 1, 75 is in its final position.

Bubble Sort v1 example (2/2)
(original)                                   15 38 12 75 20 66 49 58
After pass 1, 75 is in its final position:   15 12 38 20 66 49 58 75
After pass 2, 66 is in its final position:   12 15 20 38 49 58 66 75
Pass 3?                                      12 15 20 38 49 58 66 75   (sorted)
No exchanges happened in that pass. Do we still need another pass?

Bubble Sort v1 Features
• Time complexity in the worst case (input in reverse order):
  – Passes: n-1 passes are needed
  – Comparisons in pass k: n-k
  – Total number of comparisons: (n-1)+(n-2)+(n-3)+…+1 = n(n-1)/2 = n^2/2 - n/2 = O(n^2)
• Space: no auxiliary storage needed
• Best case (already sorted): still O(n^2)
  – Many redundant passes with no swaps
  – Can be improved by using a flag
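The comparison total above is an arithmetic series. As a worked check, written in LaTeX and using the 8-element array from the lecture's example as the instance:

```latex
\begin{align*}
\sum_{k=1}^{n-1} (n-k) &= (n-1) + (n-2) + \cdots + 1 = \frac{n(n-1)}{2} = \frac{n^2}{2} - \frac{n}{2} \\
n = 8:\quad 7 + 6 + 5 + 4 + 3 + 2 + 1 &= \frac{8 \cdot 7}{2} = 28
\end{align*}
```

Version 1 therefore always performs 28 comparisons on 8 elements, whether or not the input is already sorted.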

Bubble Sort v2 (improved bubble sort); array indices run from 0 to n-1

void sort(double x[], int nox)
{
    int i, k, flag;
    double tmp;
    for (i = nox - 1; i >= 1; --i) {
        flag = 0;                    /* assume no exchange in this pass */
        for (k = 0; k < i; ++k) {
            if (x[k] > x[k + 1]) {   /* exchange needed */
                tmp = x[k]; x[k] = x[k + 1]; x[k + 1] = tmp;
                flag = 1;
            } // if
        } // for k
        if (flag == 0) break;        /* no exchange in this pass, so no more passes are needed */
    } // for i
}

Bubble Sort v2 Features
• Best case: already sorted
  – O(n): one pass
• Total number of exchanges
  – Best case: 0
  – Worst case: O(n^2) (when the data is in reverse order)
• Lots of exchanges: a problem with large data items

Another version of Bubble Sort (in pseudocode)
BubbleSort(array A[ ], int n); array indices run from 0 to n-1

1.  i ← n-1
2.  quit ← false
3.  while (i > 0 AND NOT quit)        // note: going down
4.      quit ← true
5.      for j = 1 to i                // loop does swaps in [0..i]
6.          if (A[j-1] > A[j]) {
7.              swap(A[j-1], A[j])    // moves the max toward position i
8.              quit ← false }
9.      next j
10.     i ← i-1
11. wend
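The pseudocode above can be sketched in C as follows. The function name `bubble_sort_passes` and its return value (the number of passes actually made) are additions for illustration, not part of the lecture's version; they make the effect of the early exit visible.

```c
/* Bubble sort with early exit; returns how many passes were made. */
int bubble_sort_passes(double a[], int n)
{
    int passes = 0;
    int i = n - 1;
    int quit = 0;                       /* false */
    while (i > 0 && !quit) {            /* going down */
        quit = 1;                       /* assume this pass swaps nothing */
        ++passes;
        for (int j = 1; j <= i; ++j) {  /* compares pairs within a[0..i] */
            if (a[j - 1] > a[j]) {
                double tmp = a[j - 1];
                a[j - 1] = a[j];
                a[j] = tmp;
                quit = 0;               /* a swap happened: keep going */
            }
        }
        --i;
    }
    return passes;
}
```

On the array 15, 38, 12, 75, 20, 66, 49, 58 the sketch makes 3 passes (the third pass finds nothing to swap and stops the loop); on input that is already sorted it makes a single pass.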

Selection Sort vs. Bubble Sort
• Selection sort:
  – More comparisons than bubble sort in the best case
    • Always O(n^2) comparisons: n(n-1)/2
  – But fewer exchanges: O(n)
  – Good for small sets, cheap comparisons, and large items
• Bubble sort v2:
  – Many exchanges: O(n^2) in the worst case
  – O(n) on sorted input (best case): only one pass

Quick Sort (1/6)
Algorithm quick_sort(array A, from, to)
Input: from - pointer to the starting position of array A
       to - pointer to the end position of array A
Output: sorted array A'
1. Choose any one element as the pivot;
2. Find the first element a = A[i] larger than or equal to the pivot, scanning from A[from] to A[to];
3. Find the first element b = A[j] smaller than or equal to the pivot, scanning from A[to] to A[from];
4. If i < j then exchange a and b;
5. Repeat steps 2 to 4 until j <= i;
6. If from < j then recursively call quick_sort(A, from, j);
7. If i < to then recursively call quick_sort(A, i, to);

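Quick Sort (2/6)
• Quick sort main idea (choose 5 as the pivot; i scans from the "from" end, j scans from the "to" end):

  1st step:  3 9 1 6 5 4 8 2 10 7
  2nd step:  3 2 1 6 5 4 8 9 10 7
  3rd step:  3 2 1 4 5 6 8 9 10 7

  Every integer to the left of 5 is smaller than any integer to the right of 5;
  every integer to the right of 5 is greater than any integer to the left of 5.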

Quick Sort (3/6)
[Figure: steps 4 through 8 of the same example. The parts to the left and to the right of the pivot are each partitioned again around their own pivots (marked from, pivot, to) until the array is sorted.]

Quick Sort (4/6)

public class QuickSorter { // a Java function must be in a class
    public static void sort(int[] a, int from, int to) {
        if ((a == null) || (a.length < 2) || (from >= to)) return;
        int i = from, j = to;
        int pivot = a[(from + to) / 2];
        do {
            while (a[i] < pivot) i++;   // stops at the pivot value at the latest
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
                i++; j--;
            }
        } while (i <= j);
        if (from < j) sort(a, from, j);
        if (i < to) sort(a, i, to);
    }
}

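Quick Sort (5/6)
Trace of the first partition (pivot 14, taken from the middle and moved to the end):

3, 4, 6, 1, 10, 9, 5, 20, 19, 14, 12, 2, 15, 21, 13, 18, 17, 8, 16, 1    (original)
3, 4, 6, 1, 10, 9, 5, 20, 19, 1, 12, 2, 15, 21, 13, 18, 17, 8, 16, 14    (pivot 14 exchanged with the last element)
3, 4, 6, 1, 10, 9, 5, 8, 19, 1, 12, 2, 15, 21, 13, 18, 17, 20, 16, 14    (i at 20, j at 8: exchanged)
3, 4, 6, 1, 10, 9, 5, 8, 13, 1, 12, 2, 15, 21, 19, 18, 17, 20, 16, 14    (i at 19, j at 13: exchanged)
3, 4, 6, 1, 10, 9, 5, 8, 13, 1, 12, 2, 14, 21, 19, 18, 17, 20, 16, 15    (i and j meet at 15; pivot exchanged into place)
3, 4, 6, 1, 10, 9, 5, 8, 13, 1, 12, 2                                     (left part, to be sorted recursively)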

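Quick Sort (6/6)

/* Note: this name clashes with qsort() in <stdlib.h>; rename it if that header is included. */
void qsort(int a[], int from, int to)
{
    int n = to - from + 1;
    if ((n < 2) || (from >= to)) return;
    int k = (from + to) / 2;
    int tmp = a[to]; a[to] = a[k]; a[k] = tmp;   /* move the pivot to the end */
    int pivot = a[to];
    int i = from, j = to - 1;
    while (i < j) {
        while ((i < j) && (a[i] < pivot)) i++;
        while ((i < j) && (a[j] >= pivot)) j--;
        if (i < j) { tmp = a[i]; a[i] = a[j]; a[j] = tmp; }
    }
    if (a[i] < pivot) i++;   /* a[i] belongs to the left part, so the pivot goes one slot further right */
    tmp = a[i]; a[i] = a[to]; a[to] = tmp;       /* exchange: put the pivot in its final place */
    if (from < i - 1) qsort(a, from, i - 1);
    if (i < to) qsort(a, i + 1, to);
}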

qsort( ) in the C Library
• There is a library function for quick sort in the C language.
• #include <stdlib.h>
  void qsort(void *base, size_t num, size_t size,
             int (*comp_func)(const void *, const void *));
  – void *base: a pointer to the array to be sorted
  – size_t num: the number of elements
  – size_t size: the size of each element
  – int (*comp_func)(…): a pointer to the function used to compare two elements
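A short usage sketch for the library function follows. The names `cmp_double` and `sort_doubles` are hypothetical helpers written for this example; only `qsort` itself comes from `<stdlib.h>`. The comparison function must return a negative value, zero, or a positive value when its first argument is less than, equal to, or greater than its second.

```c
#include <stdlib.h>

/* Comparison callback for ascending order of doubles. */
static int cmp_double(const void *pa, const void *pb)
{
    double a = *(const double *)pa;
    double b = *(const double *)pb;
    return (a > b) - (a < b);   /* -1, 0, or 1; avoids overflow and truncation */
}

/* Sort n doubles in ascending order with the library qsort(). */
void sort_doubles(double *x, size_t n)
{
    qsort(x, n, sizeof x[0], cmp_double);
}
```

Returning `(a > b) - (a < b)` rather than `a - b` is a deliberate choice here: converting a `double` difference to `int` would truncate small differences to 0.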

Quick sort is NOT stable
• What is a stable sort?
  – A sorting algorithm is stable if, whenever two records R and S have the same key and R appears before S in the original list, R also appears before S in the sorted list.
  – Stable sorting algorithms maintain the relative order of records with equal keys. (http://en.wikipedia.org/wiki/Sorting_algorithm)
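To make the definition concrete, here is a small sketch of a sort that is stable: insertion sort applied to records. The `record` type, its `tag` field, and the function name are assumptions made up for this illustration; the tag only serves to tell records with equal keys apart.

```c
/* Hypothetical record: sorted by key; tag identifies the original record. */
typedef struct {
    int key;
    char tag;
} record;

/* Insertion sort by key. The strict '>' never moves a record past
   another record with an equal key, which is what makes it stable. */
void stable_insertion_sort(record r[], int n)
{
    for (int i = 1; i < n; ++i) {
        record tmp = r[i];
        int k = i - 1;
        while (k >= 0 && r[k].key > tmp.key) {
            r[k + 1] = r[k];
            --k;
        }
        r[k + 1] = tmp;
    }
}
```

Sorting (2,a), (1,b), (2,c), (1,d) with this sketch gives (1,b), (1,d), (2,a), (2,c): within each key the tags keep their original order. Changing the test to `>=` would still sort by key but would no longer be stable.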

Merge Sort (1/3)
Merging means combining two or more ordered sequences into a single ordered sequence. For example, we can merge the two sequences 503, 703, 765 and 087, 512, 677 to obtain the sequence 087, 503, 512, 677, 703, 765. A simple way to accomplish this is to compare the smallest remaining item of each sequence, output the smaller one, and then repeat the same process.
[Figure: step-by-step merge of 503, 703, 765 with 087, 512, 677]

Merge Sort (2/3)
Algorithm Merge(s1, s2)
Input: two sorted sequences: s1 = x1 x2 ... xm and s2 = y1 y2 ... yn
Output: a sorted sequence: z1 z2 ... zm+n
1. [initialize] i := 1, j := 1, k := 1;
2. [find smaller] if xi ≤ yj goto step 3, otherwise goto step 5;
3. [output xi] zk := xi, k := k+1, i := i+1. If i ≤ m, goto step 2;
4. [transmit yj ... yn] zk, ..., zm+n := yj, ..., yn. Terminate the algorithm;
5. [output yj] zk := yj, k := k+1, j := j+1. If j ≤ n, goto step 2;
6. [transmit xi ... xm] zk, ..., zm+n := xi, ..., xm. Terminate the algorithm;
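The goto-style steps above can be sketched with structured loops in C. This is an illustrative version, assuming 0-based arrays of `double` and a caller-provided output array `z` with room for m+n elements; the function name `merge` and its parameter order are choices made for this example.

```c
/* Merge sorted x[0..m-1] and sorted y[0..n-1] into z[0..m+n-1]. */
void merge(const double x[], int m, const double y[], int n, double z[])
{
    int i = 0, j = 0, k = 0;
    while (i < m && j < n) {            /* steps 2, 3, 5: output the smaller head */
        if (x[i] <= y[j]) z[k++] = x[i++];
        else              z[k++] = y[j++];
    }
    while (i < m) z[k++] = x[i++];      /* step 6: transmit the rest of x */
    while (j < n) z[k++] = y[j++];      /* step 4: transmit the rest of y */
}
```

Taking from `x` when the two heads are equal (`<=`) keeps the merge stable, matching the test in step 2.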

Merge Sort (3/3)
Algorithm Merge-sorting(s)
Input: a sequence s = <x1, ..., xm>
Output: a sorted sequence.
1. If |s| = 1, then return s;
2. k := ⌊m/2⌋;
3. s1 := Merge-sorting(x1, ..., xk);
4. s2 := Merge-sorting(xk+1, ..., xm);
5. return(Merge(s1, s2));
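A self-contained C sketch of the recursive algorithm above follows. It differs from the pseudocode in ways that are assumptions of this example: it sorts an array of `double` in place, uses one temporary buffer allocated with `malloc`, treats ranges of length 0 or 1 as already sorted, and returns -1 if the allocation fails. All function names are introduced here.

```c
#include <stdlib.h>
#include <string.h>

/* Merge sorted a[lo..mid-1] and a[mid..hi-1] through tmp, back into a. */
static void merge_halves(double a[], double tmp[], int lo, int mid, int hi)
{
    int i = lo, j = mid, k = lo;
    while (i < mid && j < hi) tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (size_t)(hi - lo) * sizeof a[0]);
}

/* Sort the half-open range a[lo..hi-1]. */
static void merge_sort_range(double a[], double tmp[], int lo, int hi)
{
    if (hi - lo < 2) return;            /* 0 or 1 element: already sorted */
    int mid = lo + (hi - lo) / 2;       /* split point, k = floor(m/2) */
    merge_sort_range(a, tmp, lo, mid);  /* s1 */
    merge_sort_range(a, tmp, mid, hi);  /* s2 */
    merge_halves(a, tmp, lo, mid, hi);  /* Merge(s1, s2) */
}

/* Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int merge_sort(double a[], int n)
{
    if (n < 2) return 0;
    double *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL) return -1;
    merge_sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Allocating the buffer once and passing it down avoids one allocation per recursive call; the extra O(n) storage is the usual price of merge sort on arrays.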

Binary Search

[Figure: Binary Search Algorithm]

[Figure: Binary Search Algorithm in Pseudocode]
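Since the pseudocode figure did not survive, here is a minimal sketch of binary search in C. It is an iterative variant written for illustration and is not necessarily the form shown on the original slide; it assumes the list is an array of C strings already sorted in `strcmp` order, and the function name and return convention (index, or -1 if absent) are choices made here.

```c
#include <string.h>

/* Search sorted list[0..n-1] for target; return its index or -1. */
int binary_search(const char *const list[], int n, const char *target)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;       /* middle entry of the remaining range */
        int c = strcmp(target, list[mid]);
        if (c == 0) return mid;             /* found */
        if (c < 0) hi = mid - 1;            /* continue in the first half */
        else       lo = mid + 1;            /* continue in the second half */
    }
    return -1;                              /* range is empty: not in the list */
}
```

With the hypothetical sorted list Alice, Bill, Carol, David, Fred, searching for Bill returns index 1 and searching for David returns index 3. Each iteration halves the remaining range, which is the source of the logarithmic running time.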

[Figure: Searching for Bill]

[Figure: Searching for David]

[Figure: Searching for David (continued)]

Software Efficiency
• Measured as the number of instructions executed
• Notation for efficiency classes
  – O(?)
  – Θ(?)
• Best, worst, and average case

Asymptotic Upper Bound (Big O)
• f(n) ≤ c·g(n) for all n ≥ n0, for some constants c > 0 and n0
• g(n) is called an asymptotic upper bound of f(n).
• We write f(n) = O(g(n)).
• It reads "f(n) equals big oh of g(n)".
[Figure: f(n) stays below c·g(n) for all n to the right of n0]

Asymptotic Lower Bound (Big Omega)
• f(n) ≥ c·g(n) for all n ≥ n0, for some constants c > 0 and n0
• g(n) is called an asymptotic lower bound of f(n).
• We write f(n) = Ω(g(n)).
• It reads "f(n) equals big omega of g(n)".
[Figure: f(n) stays above c·g(n) for all n to the right of n0]

Asymptotically Tight Bound (Big Theta)
• f(n) = O(g(n)) and f(n) = Ω(g(n))
• g(n) is called an asymptotically tight bound of f(n).
• We write f(n) = Θ(g(n)).
• It reads "f(n) equals theta of g(n)".
[Figure: f(n) stays between c1·g(n) and c2·g(n) for all n to the right of n0]
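As a worked instance of the tight-bound definition, take the comparison count n(n-1)/2 that appears in the sorting summaries. The constants c1 = 1/4, c2 = 1/2, and n0 = 2 below are one valid choice among many, picked for this example:

```latex
\begin{align*}
f(n) &= \frac{n(n-1)}{2}, \qquad g(n) = n^2 \\
\frac{n(n-1)}{2} &\le \frac{1}{2}\,n^2 \quad \text{for all } n \ge 1
  && \Rightarrow\ f(n) = O(n^2) \\
\frac{n(n-1)}{2} &\ge \frac{1}{4}\,n^2 \iff n^2 - 2n \ge 0 \iff n \ge 2
  && \Rightarrow\ f(n) = \Omega(n^2) \\
\frac{1}{4}\,n^2 &\le f(n) \le \frac{1}{2}\,n^2 \quad \text{for all } n \ge 2
  && \Rightarrow\ f(n) = \Theta(n^2)
\end{align*}
```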

[Figure: Insertion Sort in Worst Case]

[Figure: Worst-Case Analysis: Insertion Sort]

Time complexity of Mergesort
• Takes roughly n·log₂ n comparisons.
• Without the shortcut, there is no best or worst case.
• With the optional shortcut, the best case is when the array is already sorted: it takes only (n-1) comparisons.

[Figure: Worst-Case Analysis: Binary Search]

Big-theta Notation
• Identification of the shape of the graph representing the resources required with respect to the size of the input data
  – Normally based on the worst-case analysis
  – Insertion sort: Θ(n^2)
  – Binary search: Θ(log n)

Formal Definition
• Θ(n^2): complexity is k·n^2 + o(n^2)
  – f(n)/n^2 → k as n → ∞
• o(n^2): functions that grow slower than n^2
  – f(n)/n^2 → 0 as n → ∞
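A worked instance of these limits, using the worst-case comparison count n(n-1)/2 from the sorting summaries as f(n) (the choice of example is made here for illustration):

```latex
\begin{align*}
f(n) &= \frac{n(n-1)}{2} = \frac{1}{2}\,n^2 - \frac{n}{2} \\
\lim_{n \to \infty} \frac{f(n)}{n^2} &= \lim_{n \to \infty} \left( \frac{1}{2} - \frac{1}{2n} \right) = \frac{1}{2}
  \quad\Rightarrow\quad k = \frac{1}{2} \\
\lim_{n \to \infty} \frac{-n/2}{n^2} &= \lim_{n \to \infty} \left( -\frac{1}{2n} \right) = 0
  \quad\Rightarrow\quad -\frac{n}{2} = o(n^2)
\end{align*}
```

So f(n) has the form k·n^2 + o(n^2) with k = 1/2, which places it in Θ(n^2).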

Thank You! Thanks for coming.
蔡文能 (tsaiwn@csie.nctu.edu.tw)