CS 4402 Parallel Computing Lecture 9 Sorting Algorithms


Sorting Algorithms (2)
- Compare and Exchange Operation
- Compare and Exchange Sorting

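Compare and Exchange Operation
The operation takes place between two processors, rank1 and rank2, with rank1 < rank2. Each processor keeps a sorted sub-array a = (a[i], i = 0, 1, …, n-1). After the exchange, rank1 holds the n smallest elements and rank2 the n largest.

    if (rank == rank1) {
        /* lower rank: send first, then receive, and keep the smaller half */
        MPI_Send(a, n, MPI_INT, rank2, tag1, MPI_COMM_WORLD);
        MPI_Recv(b, n, MPI_INT, rank2, tag2, MPI_COMM_WORLD, &status);
        c = merge(n, a, n, b);
        for (i = 0; i < n; i++) a[i] = c[i];
    }
    if (rank == rank2) {
        /* higher rank: receive first, so the two blocking sends cannot deadlock,
           then keep the larger half */
        MPI_Recv(b, n, MPI_INT, rank1, tag1, MPI_COMM_WORLD, &status);
        MPI_Send(a, n, MPI_INT, rank1, tag2, MPI_COMM_WORLD);
        c = merge(n, a, n, b);
        for (i = 0; i < n; i++) a[i] = c[i + n];
    }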
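The local part of the operation (merge, then keep one half) can be sketched without MPI. This is a minimal sketch: the slide only names merge(), so its body here is an assumption, and keep_half() is a hypothetical helper introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Merge two sorted arrays a (n elements) and b (m elements) into a newly
   allocated sorted array of n + m elements. Requires n + m > 0.
   The caller frees the result. */
int *merge(int n, const int *a, int m, const int *b)
{
    assert(n >= 0 && m >= 0 && n + m > 0);
    int *c = malloc((size_t)(n + m) * sizeof *c);
    int i = 0, j = 0, k = 0;
    assert(c != NULL);
    while (i < n && j < m)
        c[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < n) c[k++] = a[i++];
    while (j < m) c[k++] = b[j++];
    return c;
}

/* Hypothetical helper: a is this processor's sorted block and b is the
   partner's sorted block (n elements each). If keep_low is nonzero, a ends
   up with the n smallest elements, otherwise with the n largest. */
void keep_half(int n, int *a, const int *b, int keep_low)
{
    int *c = merge(n, a, n, b);
    int offset = keep_low ? 0 : n;
    for (int i = 0; i < n; i++)
        a[i] = c[i + offset];
    free(c);
}
```

In the MPI version, rank1 would call keep_half(n, a, b, 1) and rank2 would call keep_half(n, a, b, 0) once b has been received.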

Compare and Exchange Operation
Complexity?
- What amount of computation is being used?
- What amount of communication takes place?
Can you find arguments to prove that this is optimal or efficient?
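One possible way to start answering these questions is a rough cost estimate for a single compare and exchange on blocks of n elements. The symbols t_s (message start-up time) and t_w (transfer time per element) are assumptions introduced here, not taken from the lecture, and the estimate assumes each processor performs its send and its receive one after the other.

```latex
\begin{align*}
T_{\text{comp}} &\le (2n - 1)\ \text{comparisons} + n\ \text{copies} = O(n) \\
T_{\text{comm}} &= 2\,(t_s + n\,t_w)
\end{align*}
```

Both processors merge all 2n elements although each keeps only n of them, which is one place to look when arguing about efficiency.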

Compare and Exchange Algorithms
Step 1. The array is scattered onto p sub-arrays.
Step 2. Each processor sorts its sub-array. From then on the processors keep their sub-arrays sorted.
Step 3. While the array is not sorted, compare and exchange between some pairs of processors as needed.
Step 4. Gather the sub-arrays to restore a sorted array.

Bubble Sort (figure slides, images not available)

Odd-Even Sort
1. Scatter the array onto processors.
2. Sort each sub-array aa.
3. Repeat for step = 0, 1, 2, …, p-1 (a processor without the required neighbor skips the round):
    if (step is odd) {
        if (rank is odd)  exchange(aa, n/size, rank, rank+1);
        if (rank is even) exchange(aa, n/size, rank-1, rank);
    }
    if (step is even) {
        if (rank is even) exchange(aa, n/size, rank, rank+1);
        if (rank is odd)  exchange(aa, n/size, rank-1, rank);
    }
4. Gather the sub-arrays back to root.
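The rounds can be checked on one machine with a sequential simulation in which block r of a single array plays the role of processor r. This is only a sketch under assumptions: odd_even_sort_sim() and exchange_blocks() are hypothetical names, and qsort() stands in for the local sort.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *x, const void *y)
{
    double a = *(const double *)x, b = *(const double *)y;
    return (a > b) - (a < b);
}

/* Compare and exchange between two sorted blocks of m elements:
   afterwards lo holds the m smallest and hi the m largest, both sorted. */
static void exchange_blocks(int m, double *lo, double *hi)
{
    double *c = malloc(2 * (size_t)m * sizeof *c);
    int i = 0, j = 0, k = 0;
    assert(c != NULL);
    while (i < m && j < m) c[k++] = (lo[i] <= hi[j]) ? lo[i++] : hi[j++];
    while (i < m) c[k++] = lo[i++];
    while (j < m) c[k++] = hi[j++];
    memcpy(lo, c, (size_t)m * sizeof *c);
    memcpy(hi, c + m, (size_t)m * sizeof *c);
    free(c);
}

/* Sequential simulation of odd-even sort: array has p * m elements and
   block r (elements r*m .. r*m + m - 1) stands for processor r. */
void odd_even_sort_sim(int p, int m, double *array)
{
    assert(p > 0 && m > 0);
    for (int r = 0; r < p; r++)              /* step 2: local sorts */
        qsort(array + r * m, (size_t)m, sizeof *array, cmp_double);
    for (int step = 0; step < p; step++)     /* step 3: p rounds */
        for (int r = 0; r + 1 < p; r++)
            if ((step + r) % 2 == 0)         /* r is the left partner this round */
                exchange_blocks(m, array + r * m, array + (r + 1) * m);
}
```

Within one round the selected pairs (r, r+1) are disjoint, so in the parallel program they can all run at the same time.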

Odd-Even Sort
Simple remarks:
- Odd-Even Sort uses size rounds of exchange.
- Odd-Even Sort keeps all processors busy, or almost all.
- The complexity is given by:
  - Scatter and Gather of the array: n/size elements per processor
  - Sorting the local array: n/size elements
  - Compare and Exchange process: size rounds, each involving n/size elements

    if (rank == 0) {
        /* the root creates n random values in [0, m] */
        array = (double *) calloc(n, sizeof(double));
        srand((unsigned) time(NULL) + rank);
        for (x = 0; x < n; x++)
            array[x] = ((double) rand() / RAND_MAX) * m;
    }

    /* every processor receives and sorts n/size elements */
    MPI_Scatter(array, n/size, MPI_DOUBLE, a, n/size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    merge_sort(n/size, a);

    /* size rounds of compare and exchange with alternating neighbors */
    for (i = 0; i < size; i++) {
        if ((i + rank) % 2 == 0) {
            if (rank < size - 1) exchange(n/size, a, rank, rank + 1, MPI_COMM_WORLD);
        } else {
            if (rank > 0) exchange(n/size, a, rank - 1, rank, MPI_COMM_WORLD);
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }

    /* collect the sorted blocks at the root and print them */
    MPI_Gather(a, n/size, MPI_DOUBLE, array, n/size, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        for (x = 0; x < n; x++)
            printf("Output : %f\n", array[x]);
    }

Comments on Odd-Even
Features of the algorithm:
- Simple and quite efficient.
- In p steps of compare and exchange the array is sorted. Why?
- The number of steps can be reduced by testing whether the array is already sorted, but it remains O(p).
- C&E operations take place only between neighbors. Can we do C&E operations between other processors?

Odd-Even Sort Complexity
- Stage 0. To sort out the scattered array
- Stage 1. Odd-Even for p levels
- Scatter and Gather
- Total computation complexity
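The cost expressions for these items are not present in the text. A hedged estimate, assuming a comparison-based local sort, a linear-time merge in every compare and exchange, and a message cost of t_s + k t_w for k elements (t_s and t_w are symbols introduced here, not taken from the lecture), could read:

```latex
\begin{align*}
T_{\text{stage 0}} &= O\!\left(\tfrac{n}{p}\log\tfrac{n}{p}\right) \\
T_{\text{stage 1}} &= p \cdot O\!\left(\tfrac{n}{p}\right)
                      + p \cdot 2\left(t_s + \tfrac{n}{p}\,t_w\right)
                    = O(n) + 2p\,t_s + 2n\,t_w \\
V_{\text{scatter+gather}} &= 2n \ \text{elements moved through the root} \\
T_{\text{computation}} &= O\!\left(\tfrac{n}{p}\log\tfrac{n}{p} + n\right)
\end{align*}
```

Under these assumptions the O(n) term of the exchange rounds dominates once p grows, which limits the achievable speedup.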

isSorted(n, a, comm)
The parallel routine int isSorted(int n, double *a, MPI_Comm comm):
1. Tests whether the local arrays of all processors are in order across processors.
2. That is, rank1 < rank2 implies elements of rank1 ≤ elements of rank2.
3. If the answer is yes then no further exchange is needed.
How to do it?
1. The test is done at the root.
2. The test is done collectively by all processors.

isSorted(n, a, comm) – Strategy 1
The test is done collectively by all processors.
1. Send last = a[n-1] to the right processor (rank+1), if there is one.
2. Receive last from the left processor (rank-1), if there is one.
3. If the received last > a[0] then answer = 0, otherwise answer = 1.
4. All-reduce answer with MPI_Allreduce using MPI_MIN.
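The logic of this strategy can be traced with a sequential simulation in which block r of one array stands for processor r. It is a sketch, not the MPI routine itself: is_sorted_collective_sim() is a hypothetical name, the read of the left block's last element replaces the send and receive, and the running minimum replaces the MIN reduction.

```c
#include <assert.h>

/* Sequential simulation of the collective test. blocks holds p sorted
   blocks of m elements each; block r stands for processor r.
   Returns 1 if the blocks are in order across processors, 0 otherwise. */
int is_sorted_collective_sim(int p, int m, const double *blocks)
{
    int global = 1;                        /* neutral start for the MIN reduction */
    assert(p > 0 && m > 0);
    for (int r = 0; r < p; r++) {
        int answer = 1;                    /* local answer of processor r */
        if (r > 0) {
            /* value that processor r would receive from its left neighbor */
            double last = blocks[(r - 1) * m + (m - 1)];
            if (last > blocks[r * m])      /* last > a[0] */
                answer = 0;
        }
        if (answer < global)               /* reduction with MIN */
            global = answer;
    }
    return global;
}
```

Because the reduction uses MIN, a single processor reporting 0 is enough to make the global answer 0 on every processor.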

isSorted(n, a, comm) – Strategy 2
The test is done at the root.
1. Gather the first elements to the root.
2. Gather the last elements to the root.
3. If rank == 0 then, for i = 0, 1, …, size-2, test if last[i] > first[i+1]; if so the answer is 0.
4. Broadcast the answer.
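The check performed at the root in step 3 is small enough to sketch directly. Here first and last are assumed to be the arrays filled by the two gathers (one entry per processor), and is_sorted_at_root() is a hypothetical name.

```c
#include <assert.h>

/* Root-side test: first[i] and last[i] are the first and last elements of
   the sorted block held by processor i. Returns 1 if no block boundary is
   out of order, 0 otherwise. */
int is_sorted_at_root(int size, const double *first, const double *last)
{
    assert(size > 0);
    for (int i = 0; i + 1 < size; i++)     /* size-1 boundary tests */
        if (last[i] > first[i + 1])
            return 0;
    return 1;
}
```

In the MPI routine only the root would run this function, and the returned value is what gets broadcast in step 4.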

Shell Sort
It is based on the notion of a “shell” (group) of consecutive processors.
- C&E takes place between the processors at mirrored positions in the shell: the first with the last, the second with the second to last, and so on.
- The shell is then divided into 2.

    (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15)
    (0 1 2 3 4 5 6 7) (8 9 10 11 12 13 14 15)                           #(shell) = p/2
    (0 1 2 3) (4 5 6 7) (8 9 10 11) (12 13 14 15)                       #(shell) = p/4
    (0 1) (2 3) (4 5) (6 7) (8 9) (10 11) (12 13) (14 15)               #(shell) = p/8

- There are log(p) levels of division.
For the level l:
- there are pow(2, l) shells, each of size p/pow(2, l);
- the shell k contains the processors k*p/pow(2, l), …, (k+1)*p/pow(2, l) - 1.
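The pairing inside a shell can be computed from the rank alone. The sketch below assumes that p is a power of two, that levels are numbered from l = 0 (one shell of size p), and that "equally extreme" means mirrored positions within the shell; shell_partner() is a hypothetical helper.

```c
#include <assert.h>

/* Partner of processor rank at division level l, for p a power of two and
   0 <= l < log2(p). Shells have p / 2^l consecutive processors and each
   processor is paired with its mirror image inside its shell. */
int shell_partner(int rank, int p, int l)
{
    int shell_size = p >> l;               /* p / pow(2, l) */
    assert(shell_size >= 2 && rank >= 0 && rank < p);
    int k = rank / shell_size;             /* index of the shell holding rank */
    int first = k * shell_size;            /* lowest rank in shell k */
    int last = first + shell_size - 1;     /* highest rank in shell k */
    return first + last - rank;
}
```

With p = 16, level 0 pairs 0 with 15, level 1 pairs 3 with 4 and 9 with 14, and level 3 pairs immediate neighbors such as 10 and 11. The lower rank of each pair would keep the smaller half of the merged data.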

Shell Sort is based on two stages:
Stage 1. Divide the shells: for l = 0, 1, 2, …, log(p)-1
- exchange in parallel between extreme processors in each shell.
Stage 2. Odd-Even: for l = 0, 1, 2, …, p-1
- if rank and l are both even then exchange in parallel between rank and rank+1
- if rank and l are both odd then exchange in parallel between rank and rank+1
- test “array sorted” and stop as soon as the test succeeds.

Shell Sort Complexity
- Stage 0. To sort out the scattered array
- Stage 1. Odd-Even for l levels
  The catch: the average value of l is in this case O(log^2(p)), so that on average the shell sort can be …
- Scatter and Gather
- Total computation complexity

Complexity Comparison for Parallel Sorting
- Odd-Even Sort
- Shell Sort
- Merge Sort

Assignment
Description: Write an MPI program to sort an array:
1. Use an MPI method to compare and exchange.
2. Use an MPI method to test isSorted().
3. Use the odd-even sort.
4. Evaluate the performance of the program in a readme.doc.
General Points:
1. It is for 10% of the marks.
2. Deadline on Monday 2/12/2013 at 5 pm.
3. The following elements must be submitted by email to j.horan@4c.ucc.ie:
   1. The C program, named with your name and student number, e.g. Sabin.Tabirca_11111.c.
   2. The Makefile.
   3. Readme.doc, in which you have 1) to give your student details and 2) to state the performance results.