PRAM ALGORITHMS-2 3/12/2013 Computer Engg, IIT(BHU)

Merging
● Serial algorithm: O(n) time
● Let X1 = k[1:m] and X2 = k[m+1:2m] be sorted sequences
● O(log n) parallel algorithm
  – Merging reduces to finding the rank of each key k in X1 ∪ X2
  – Let k be in X1 and let its rank in X1 be j
  – The rank q of k in X2 is calculated in O(log n) time using binary search
  – Total rank of k = j + q
  – Uses m CREW PRAM processors
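The ranking idea can be sketched sequentially in Python. This is a minimal illustration, not the slide's PRAM code: each loop iteration stands in for one processor, the function name `merge_by_ranking` is hypothetical, and breaking ties by using `bisect_left` for keys of X1 and `bisect_right` for keys of X2 is an assumption added so that equal keys get distinct ranks.

```python
from bisect import bisect_left, bisect_right

def merge_by_ranking(x1, x2):
    """Merge two sorted lists by computing each key's final rank."""
    out = [None] * (len(x1) + len(x2))
    # A key at index j of x1 has j keys of x1 before it; binary search
    # counts the keys of x2 that are strictly smaller.
    for j, key in enumerate(x1):
        out[j + bisect_left(x2, key)] = key
    # Keys of x2 count the keys of x1 that are smaller or equal, so a
    # tie between x1 and x2 never maps to the same output slot.
    for j, key in enumerate(x2):
        out[j + bisect_right(x1, key)] = key
    return out

print(merge_by_ranking([1, 4, 6, 9], [2, 3, 7, 8]))
```

Every key writes to its own output slot, so the iterations are independent of each other, which is what lets the PRAM version run them all at once.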

Odd-Even Merge
● Algorithm
  – Step 1: If m = 1, merging is trivial (one comparison)
  – Step 2: Partition X1 and X2 into their odd- and even-indexed parts (X1 odd, X2 odd, X1 even, X2 even)
  – Step 3: Recursively merge X1 odd with X2 odd to get L1, and X1 even with X2 even to get L2
  – Step 4: Shuffle L1 and L2 to form L = l[1], l[m+1], l[2], l[m+2], …, l[m], l[2m], then compare-exchange each pair (l[m+i], l[i+1]) for 1 <= i < m
● O(log m) time using 2m EREW PRAM processors
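The four steps can be traced with a short sequential sketch. It assumes both inputs are sorted and have the same length m, with m a power of two; the function name `odd_even_merge` is hypothetical, and the final loop stands in for the compare-exchange operations that the PRAM performs simultaneously.

```python
def odd_even_merge(x1, x2):
    """Merge two sorted lists of equal power-of-two length m."""
    m = len(x1)
    assert m == len(x2) and m > 0 and m & (m - 1) == 0
    if m == 1:                                  # Step 1: one comparison
        return [min(x1[0], x2[0]), max(x1[0], x2[0])]
    # Steps 2-3: 1-based odd positions are 0-based even indices.
    l1 = odd_even_merge(x1[0::2], x2[0::2])     # odd parts  -> L1
    l2 = odd_even_merge(x1[1::2], x2[1::2])     # even parts -> L2
    # Step 4: shuffle L1 and L2 ...
    merged = [None] * (2 * m)
    merged[0::2] = l1
    merged[1::2] = l2
    # ... then compare-exchange the i-th key of L2 with the (i+1)-th of L1.
    for i in range(1, 2 * m - 1, 2):
        if merged[i] > merged[i + 1]:
            merged[i], merged[i + 1] = merged[i + 1], merged[i]
    return merged

print(odd_even_merge([1, 4, 6, 9], [2, 3, 7, 8]))
```

For the inputs shown, L1 = [1, 2, 6, 7] and L2 = [3, 4, 8, 9]; the shuffle gives 1, 3, 2, 4, 6, 8, 7, 9, and two of the three compare-exchanges swap their pair.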

Merging - Work Optimal Algorithm
● Divide the problem into O(m/log m) subproblems
● Each subproblem has length O(log m)
● Each subproblem is solved by one processor in O(log m) time
● Processors used: 2m/log m
● Work done: O(m)

Merging - Work Optimal Algorithm: Division
  – Partition X1 into M = m/log m parts A1 to AM, each with about log m keys
  – Let li be the largest key of Ai; do a binary search for each li in X2
  – These positions partition X2 into B1 to BM (Bi corresponds to Ai)
  – It now suffices to merge each Ai with its Bi
  – Bi may be long, so partition Bi into ⌈|Bi|/log m⌉ parts, each of size at most log m
  – Assign one processor to each part; it first finds the corresponding subpart of Ai in O(log m) time
  – Now merge these pairs of subparts, each of length at most O(log m)
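A sequential sketch of the first level of this partitioning is given below. It is an illustration under assumptions rather than the slide's algorithm in full: the name `partitioned_merge` is hypothetical, `heapq.merge` stands in for the single processor that merges each pair, and the second-level split of an oversized Bi is omitted for brevity.

```python
from bisect import bisect_right
from heapq import merge
from math import ceil, log2

def partitioned_merge(x1, x2):
    """Merge sorted x1 and x2 by pairing blocks Ai of x1 with slices Bi of x2."""
    m = len(x1)
    if m == 0:
        return list(x2)
    block = max(1, ceil(log2(m)))        # each Ai holds about log m keys
    out, lo = [], 0
    for start in range(0, m, block):
        a = x1[start:start + block]
        if start + block >= m:
            hi = len(x2)                 # the last Bi takes whatever is left
        else:
            hi = bisect_right(x2, a[-1]) # binary search for li in X2
        # (Ai, Bi) is an independent subproblem; all pairs could run at once.
        out.extend(merge(a, x2[lo:hi]))
        lo = hi
    return out

x1 = [1, 4, 6, 9, 12, 15, 20, 22]
x2 = [2, 3, 7, 8, 10, 30, 31, 40]
print(partitioned_merge(x1, x2))
```

Because every key in Ai and Bi is at most li, and every key in later blocks is at least li, concatenating the merged pairs in order yields the merged sequence.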

Merging - O(log log m) Time Algorithm
● Similar to the previous algorithm
● Step 1: Partition X1 into M = √m equal parts A1 to AM. As in the previous algorithm, divide X2 into the corresponding parts B1 to BM.
● Step 2: Merge Ai with Bi. Divide Bi into ⌈|Bi|/√m⌉ parts, each of length at most √m. The partitioning takes O(1) time using m^ε-ary search, and the resulting subproblems of size O(√m) are merged recursively.
● Number of processors: Σ (i = 1 to M) √m ⌈|Bi|/√m⌉ <= 2m
● Time: O(log log m), from T(m) = T(O(√m)) + O(1)
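The recurrence T(m) = T(√m) + O(1) can be unrolled numerically to see why it solves to O(log log m): taking a square root halves log m, so the recursion depth is about log2(log2 m). The helper below is a hypothetical illustration that only counts levels; it performs no merging.

```python
from math import isqrt

def recursion_depth(m):
    """Count how many times m can be replaced by sqrt(m) before m <= 2."""
    depth = 0
    while m > 2:
        m = isqrt(m)   # each level works on subproblems of size about sqrt(m)
        depth += 1
    return depth

# 65536 -> 256 -> 16 -> 4 -> 2
print(recursion_depth(2 ** 16))
```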

Sorting - Odd-Even Merge Sort
● Step 1: If n <= 1, return X
● Step 2: Let X = k[1] to k[n]. Partition the input into two halves: X1' = k[1] to k[n/2] and X2' = k[n/2+1] to k[n]
● Step 3: Allocate n/2 processors to sort X1' recursively; let X1 be the result. At the same time, sort X2' recursively to get X2
● Step 4: Merge X1 and X2 using the work optimal algorithm
● T(n) = O(log^2 n), from T(n) = T(n/2) + O(log n)
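The sketch below follows the recursive structure and also tracks a simple model of the parallel running time. It rests on stated assumptions: `heapq.merge` is a sequential stand-in for the parallel merge of Step 4, each merge of n keys is charged log2(n) time units, and the two recursive calls are charged the maximum of their times because they run side by side. The function name is hypothetical.

```python
from heapq import merge
from math import log2

def merge_sort_with_parallel_time(keys):
    """Return (sorted keys, modelled parallel time T(n) = T(n/2) + log2 n)."""
    n = len(keys)
    if n <= 1:                                   # Step 1
        return list(keys), 0
    left, t_left = merge_sort_with_parallel_time(keys[:n // 2])    # Steps 2-3
    right, t_right = merge_sort_with_parallel_time(keys[n // 2:])
    merged = list(merge(left, right))            # Step 4 (stand-in merge)
    # Both halves are sorted at the same time, so only the slower one counts.
    return merged, max(t_left, t_right) + log2(n)

print(merge_sort_with_parallel_time([7, 3, 8, 1, 6, 2, 5, 4]))
```

For n = 8 the modelled time is 1 + 2 + 3 = 6, that is log n (log n + 1)/2, which grows as log^2 n.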

Sorting - Preparata's Algorithm
● Step 1: If n is small, sort using any algorithm
● Step 2: Partition the n keys into log n equal parts. Sort each part recursively, in parallel, assigning n processors to each part. Let S1 to S(log n) be the sorted sequences
● Step 3: Merge Si with Sj for every pair i, j in parallel, assigning n/log n processors to each pair; this takes O(log log n) time. We then have the rank of each key in each part
● Step 4: Allocate log n processors to each key to sum its ranks over all parts; the sum is the key's rank in the sorted output

Sorting - Preparata's Algorithm
● Uses n log n CREW PRAM processors
● Recursive sorting (Step 2): T(n/log n)
● Merging and rank summation (Steps 3 and 4): O(log log n) time
● T(n) = O(log n), from T(n) = T(n/log n) + O(log log n)
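A sequential sketch of the rank-summing idea follows. It is not a PRAM implementation: the name `rank_sort` is hypothetical, the cutoff of four keys for "small" is an arbitrary choice, binary search replaces the pairwise merges as the way to obtain a key's rank in another part, and the tie-breaking rule (earlier parts count equal keys, later parts do not) is an assumption added so that duplicates receive distinct ranks.

```python
from bisect import bisect_left, bisect_right
from math import ceil, log2

def rank_sort(keys):
    """Sort by splitting into about log n parts and summing per-part ranks."""
    n = len(keys)
    if n <= 4:                                   # Step 1: small input
        return sorted(keys)
    size = ceil(n / int(log2(n)))                # about log n parts
    parts = [rank_sort(keys[i:i + size]) for i in range(0, n, size)]  # Step 2
    out = [None] * n
    for p, part in enumerate(parts):
        for j, key in enumerate(part):
            rank = j                             # rank inside its own part
            for q, other in enumerate(parts):    # Steps 3-4: add other ranks
                if q < p:
                    rank += bisect_right(other, key)
                elif q > p:
                    rank += bisect_left(other, key)
            out[rank] = key
    return out

print(rank_sort([9, 3, 7, 3, 1, 8, 2, 6, 5, 4, 0, 7]))
```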

Graph Problems
● Let G(V, E) be a directed graph with n vertices
● M(i, j) = 0 if i = j or there is a directed edge from i to j (and a positive value such as 1 otherwise)
● m(i, i) = 0 for every i
● For every i != j, m(i, j) = min { M(i[0], i[1]) + M(i[1], i[2]) + … + M(i[k-1], i[k]) }, where i[0] = i, i[k] = j, and the minimum is taken over all sequences of vertices i[0], i[1], …, i[k]

Graph Problems: Algorithm for calculating m
● m[i, j] := M[i, j] for 1 <= i, j <= n in parallel
● For r := 1 to log n do {
  – Step 1: In parallel set q[i, j, k] := m[i, j] + m[j, k] for 1 <= i, j, k <= n
  – Step 2: In parallel set m[i, j] := min { q[i, 1, j], …, q[i, n, j] } for 1 <= i, j <= n
  }
● O(log n) time using O(n^(3+ε)) CRCW PRAM processors
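The loop can be simulated sequentially as repeated squaring under (min, +). In this sketch the name `min_path_matrix` is hypothetical, vertices are numbered from 0, the input is assumed to have a zero diagonal, and the nested comprehension plays the role of both parallel steps: the sum inside `min` is q[i, j, k] and the `min` over j is Step 2.

```python
from math import ceil, log2

def min_path_matrix(M):
    """Compute m from M with about log2(n) rounds of (min, +) squaring."""
    n = len(M)
    m = [row[:] for row in M]
    for _ in range(max(1, ceil(log2(n)))):
        # Every new entry is computed from the previous round's matrix.
        m = [[min(m[i][j] + m[j][k] for j in range(n)) for k in range(n)]
             for i in range(n)]
    return m

# Directed chain 0 -> 1 -> 2 -> 3: entry 0 marks an edge, 1 marks no edge.
M = [[0, 0, 1, 1],
     [1, 0, 0, 1],
     [1, 1, 0, 0],
     [1, 1, 1, 0]]
closure = min_path_matrix(M)
print(closure[0][3], closure[3][0])
```

After round r, m[i][k] accounts for every path with at most 2^r edges, which is why about log n rounds suffice.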

Graph Problems
● Transitive closure: given by m (a path from i to j exists exactly when m(i, j) = 0)
● Connected components: nodes i and j are in the same component if m(i, j) = 0
● Minimum spanning tree:
  – O(log n) time using O(n^(5+ε)) CRCW processors
  – As in Kruskal's algorithm, first sort the edges by weight, in parallel, using n^2 processors
  – Let e1 to en be the sorted edges
  – For each ei, find the transitive closure of the graph formed by e1 to e(i-1); ei is in the spanning tree if its endpoints are not already in the same connected component
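The spanning tree test can be illustrated with a sequential sketch in which each edge is judged independently from the prefix of edges before it, as it would be on the PRAM. The names `connected` and `prefix_kruskal` are hypothetical, edges are assumed to be (weight, u, v) tuples over vertices 0 to n-1, and a depth-first search stands in for the transitive closure computation.

```python
def connected(n, edges, s, t):
    """Return True if s and t are joined by the undirected edges given."""
    adj = {v: [] for v in range(n)}
    for _, u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return t in seen

def prefix_kruskal(n, edges):
    """Keep edge e_i iff its endpoints are not connected by e_1..e_(i-1)."""
    ordered = sorted(edges)
    # Each test depends only on the sorted prefix, so all tests are independent.
    return [e for i, e in enumerate(ordered)
            if not connected(n, ordered[:i], e[1], e[2])]

print(prefix_kruskal(4, [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)]))
```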

Alternative transitive closure
● M = I + A + A^2 + … + A^(n-1)
● M = (I + A)^n under Boolean arithmetic
● x^n can be calculated in log n steps by repeated squaring
● Matrix multiplication can be done in O(log n) time using n^3/log n CREW PRAM processors
● So the closure takes O(log^2 n) time using n^3/log n CREW PRAM processors
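A sequential sketch of the repeated-squaring computation is shown below, using Boolean "or" for addition and "and" for multiplication. The function name is hypothetical and vertices are numbered from 0. The loop stops once the exponent reaches n - 1, since higher powers of I + A add no new reachable pairs.

```python
def transitive_closure(adj):
    """Boolean closure of a 0/1 adjacency matrix via squaring of I + A."""
    n = len(adj)
    # Start from I + A: every vertex reaches itself and its out-neighbours.
    m = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    power = 1
    while power < n - 1:
        # One Boolean matrix product squares the current power of I + A.
        m = [[any(m[i][k] and m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        power *= 2
    return m

# Directed chain 0 -> 1 -> 2 -> 3.
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
reach = transitive_closure(A)
print(reach[0][3], reach[3][0])
```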

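All Pair Shortest Path
● Serial O(n^3) algorithm
  – A^k(i, j) is the length of the shortest path from i to j whose intermediate vertices are all numbered at most k
  – A^k(i, j) = min { A^(k-1)(i, j), A^(k-1)(i, k) + A^(k-1)(k, j) }
● Parallel
  – Same as the previous slide (computing a matrix power by repeated squaring)
  – Use matrix multiplication in which min takes the place of addition and addition takes the place of multiplication
  – O(log^2 n) time using n^3/log n CREW PRAM processors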
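The serial recurrence can be written out directly. The sketch below updates a single matrix in place, which is the usual way to implement this recurrence; the function name is hypothetical, vertices are numbered from 0, and `float("inf")` is assumed to mark a missing edge.

```python
INF = float("inf")

def all_pairs_shortest_paths(weights):
    """Apply A^k(i,j) = min(A^(k-1)(i,j), A^(k-1)(i,k) + A^(k-1)(k,j))."""
    n = len(weights)
    dist = [row[:] for row in weights]
    for k in range(n):              # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist

W = [[0, 3, INF, 7],
     [8, 0, 2, INF],
     [5, INF, 0, 1],
     [2, INF, INF, 0]]
for row in all_pairs_shortest_paths(W):
    print(row)
```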

Convex Hull
● Planar convex hull
● Given a set of points sorted by x-coordinate, find the smallest convex polygon that contains all the points

Divide and Conquer
● In this method, two separate hulls are created, one for the leftmost half of the points and one for the rightmost half.
● To divide the points into halves, sort by x-coordinate and find the median. If there is an odd number of points, the leftmost half gets the extra point.

Divide and Conquer
● Recursively find the convex hull for the left set of points and for the right set of points. This gives hull A and hull B.
● Stitch the two hulls together to form the hull of the entire set.

Convex Hull
● Overall approach:
  – Take the set of points and divide it into two halves
  – Assume that the recursive calls compute the convex hulls of the two halves
  – Conquer stage: take the two convex hulls and merge them to obtain the convex hull of the entire set
● Complexity:
  – T(n) = T(n/2) + O(1)
  – O(log n) time using n processors

Implementation
● To stitch the hulls together, the upper and lower tangent lines must be found.
● To find the lower tangent, start with the rightmost point in hull A (the left hull) and the leftmost point in hull B (the right hull). While the line segment between these two points is not the lower tangent line of both hulls, determine which of the two points is not yet at its lower tangent point.

Implementation
● While A is not at its lower tangent point, move around hull A clockwise, that is, A = A - 1.
● While B is not at its lower tangent point, move around hull B counterclockwise, that is, B = B + 1.
● When the upper and lower tangent lines are found, march around hull A and hull B, deleting the points that now lie inside the merged convex hull.
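The lower tangent walk can be sketched as follows. The sketch makes several assumptions that the slides do not state: each hull is a list of (x, y) tuples in counterclockwise order, hull A lies strictly to the left of hull B, and no three points are collinear. The names `cross` and `lower_tangent` are hypothetical. A neighbour lies below the line from a to b exactly when the cross product is negative.

```python
def cross(o, p, q):
    """Positive if q is to the left of the directed line o -> p."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

def lower_tangent(hull_a, hull_b):
    """Walk to the lower tangent of two x-separated counterclockwise hulls."""
    na, nb = len(hull_a), len(hull_b)
    i = max(range(na), key=lambda t: hull_a[t][0])   # rightmost point of A
    j = min(range(nb), key=lambda t: hull_b[t][0])   # leftmost point of B
    moved = True
    while moved:
        moved = False
        # Clockwise neighbour of a below the line: step A = A - 1.
        while cross(hull_a[i], hull_b[j], hull_a[(i - 1) % na]) < 0:
            i = (i - 1) % na
            moved = True
        # Counterclockwise neighbour of b below the line: step B = B + 1.
        while cross(hull_a[i], hull_b[j], hull_b[(j + 1) % nb]) < 0:
            j = (j + 1) % nb
            moved = True
    return hull_a[i], hull_b[j]

A = [(0, 0), (2, -1), (3, 1), (1, 2)]
B = [(5, 1), (6, -2), (8, 0), (7, 3)]
print(lower_tangent(A, B))
```

The upper tangent is found by the mirror-image walk, with the directions and the sign of the test reversed.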

Efficiency
● On a single processor, the divide and conquer algorithm also has time complexity O(n log n).
● This algorithm is often used for cases in three dimensions.
● A technicality of the divide and conquer method lies in how many points are in the set. If the set has fewer than four points, there is no need to sort the points; for these special cases, determine the hull separately.
● A downside of this algorithm is the overhead associated with recursive function calls.