
Pattern Programming: Seeds Framework, Workpool, Assignment 1
ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson, 2013. Jan 15, 2014. PatternProg-2

Seeds Workpool: DiffuseData, Compute, and GatherData Methods
[Figure: the master runs DiffuseData and GatherData; the slaves run Compute.]
  • DiffuseData (master): creates a DataMap d and returns d to each slave.
  • Compute (slaves): receives d as its Data argument (DataMap input) and creates a DataMap output.
  • GatherData (master): receives output as its Data argument and accumulates it in the private variable total (the answer).
Note: the DiffuseData, Compute, and GatherData method names start with a capital letter, although method names should not!

Objects sent between master and slaves are identified by a key, based upon using a Java HashMap.
See http://doc.java.sun.com/DocWeb/api/java.util.HashMap
For implementation convenience there are two classes:
  • Data class: used to pass data between master and slaves. (A "segment" number keeps track of packets as they go from one method to another.)
  • DataMap class: used inside the DiffuseData, Compute, and GatherData methods. It stores each object under a key, in the manner of a HashMap.
DataMap is a subclass of Data, which allows casting between the two.

DataMap methods (used inside the DiffuseData, Compute, and GatherData methods)
  • put(key, data): puts data into the DataMap, identified by key.
  • get(key): gets the stored data identified by key.
key is usually a String, a programmer-chosen name for the data. data is often a wrapper object for a primitive data type, such as Integer or Long. key and data are actually of the Object class.
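The put/get usage described above can be tried without the framework. The following is a minimal sketch that substitutes java.util.HashMap for the Seeds DataMap class; the class name KeyedDataSketch and the methods pack and unpack are hypothetical names chosen for illustration, not part of Seeds.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the key/value usage described above.
// java.util.HashMap is used here in place of the Seeds DataMap class (an assumption).
class KeyedDataSketch {
    // Stores a value under a programmer-chosen String key, as DiffuseData would.
    static Map<String, Object> pack(long seed) {
        Map<String, Object> d = new HashMap<>();
        d.put("seed", seed);              // the long is autoboxed to a Long
        return d;
    }

    // Retrieves the value by key; a cast is needed because values are stored as Object.
    static long unpack(Map<String, Object> d) {
        return (Long) d.get("seed");
    }

    public static void main(String[] args) {
        Map<String, Object> d = pack(42L);
        System.out.println(unpack(d));    // prints 42
    }
}
```

Because the values are held as Object, the receiving side has to know the type stored under each key and cast accordingly, which is why the Seeds examples cast the result of get.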

Note: DiffuseData and Compute return a Data object.
    // segment is supplied by the framework, which uses it to keep track of where to put results
    public Data DiffuseData (int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("name_of_inputdata", inputData);
        return d;
    }
    public Data Compute (Data data) {
        // data, produced by DiffuseData(), is cast into a DataMap
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        // output is returned to GatherData()
        DataMap<String, Object> output = new DataMap<String, Object>();
        inputData = input.get("name_of_inputdata");
        …   // computation
        output.put("name_of_results", results);   // to return to GatherData()
        return output;
    }
    // segment and dat are supplied by the framework
    public void GatherData (int segment, Data dat) {
        DataMap<String, Object> out = (DataMap<String, Object>) dat;
        outdata = out.get("name_of_results");
        result …   // aggregate outdata from all the worker nodes; result is a private variable
    }

Question: Will a class field modified in the DiffuseData or GatherData methods be updated with the same values in the Compute method?
Answer: No. DiffuseData and GatherData run on the master, while Compute runs on the slaves, so the methods are running on different JVMs (and different nodes).
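The answer above can be illustrated on a single machine. The sketch below assumes that a module object reaches a slave as a serialized copy; the slides only state that the methods run on different JVMs, so the serialization round trip is an assumption used to mimic that separation. FieldCopySketch and shipToWorker are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a module whose field is changed on a worker.
class FieldCopySketch implements Serializable {
    private static final long serialVersionUID = 1L;
    long total = 0;

    // Serializes and deserializes the object, mimicking a copy sent to another JVM.
    static FieldCopySketch shipToWorker(FieldCopySketch m) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(m);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FieldCopySketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FieldCopySketch master = new FieldCopySketch();
        FieldCopySketch worker = shipToWorker(master);
        worker.total = 100;                  // change made "on the worker"
        System.out.println(master.total);    // prints 0: the master's field is untouched
    }
}
```

This is why results have to travel back through the object returned by Compute rather than through class fields.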

Other methods called by the framework
    public void initializeModule(String[] args) {
        …   // initialize private variables
        datacount = … ;
    }
    public int getDataCount() {
        // set to the number of data items to be processed
        return datacount;
    }

User methods used in the Bootstrap class
Apart from the methods that start and stop the framework pattern, additional methods can be specified by the programmer in the Workpool class and invoked in the Bootstrap class. Typically a method is invoked that produces the final result.
Example:
    public double getPi() {
        // returns the value of pi based on all workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }

Workpool Pattern 1. Embarrassingly Parallel Computation: Monte Carlo π

Monte Carlo Methods
A so-called "embarrassingly parallel" computation, as it decomposes into obviously independent tasks that can be done in parallel without any inter-task communication during the computation. Monte Carlo methods use random selections. To parallelize Monte Carlo code, one must address the best way to generate random numbers in parallel.

Calculate π using the Monte Carlo method
A circle is formed within a 2 x 2 square. The ratio of the area of the circle to the area of the square is given by:
$$\frac{\text{Area of circle}}{\text{Area of square}} = \frac{\pi (1)^2}{2 \times 2} = \frac{\pi}{4}$$
Points within the square are chosen randomly, and a score is kept of how many points happen to lie within the circle. The fraction of points within the circle will be $\pi/4$, given a sufficient number of randomly selected samples.

One quadrant can be described by the integral:
$$\int_0^1 \sqrt{1 - x^2}\, dx = \frac{\pi}{4}$$
Random pairs of numbers, $(x_r, y_r)$, are generated, each between 0 and 1. A pair is counted as in the circle if
$$y_r \le \sqrt{1 - x_r^2}, \quad \text{that is,} \quad x_r^2 + y_r^2 \le 1$$
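The quadrant test above can be checked sequentially before any framework is involved. This is a minimal sketch, not the Seeds module; SequentialPiSketch and estimatePi are hypothetical names, and the sample count and seed are arbitrary choices.

```java
import java.util.Random;

class SequentialPiSketch {
    // Estimates pi from `samples` random points in the unit square,
    // counting those that fall inside the quarter circle x^2 + y^2 <= 1.
    static double estimatePi(long samples, long seed) {
        Random r = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = r.nextDouble();   // uniform in [0, 1)
            double y = r.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        // The fraction inside approximates pi/4, so scale by 4.
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(1_000_000L, 12345L));
    }
}
```

The estimate converges slowly: the error shrinks roughly with the square root of the number of samples, which is part of what makes spreading the samples over many workers attractive.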

Alternative (better) Monte Carlo method (not used here)
Generate random values of x to compute f(x), and sum the values of f(x):
$$\text{Area} = \int_{x_1}^{x_2} f(x)\, dx = \lim_{N \to \infty} \frac{1}{N} \sum_{r=1}^{N} f(x_r)\,(x_2 - x_1)$$
where $x_r$ are randomly generated values of x between $x_1$ and $x_2$. The Monte Carlo method is very useful if the function cannot be integrated numerically (perhaps because it has a large number of variables).
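The summation above translates directly into code. The sketch below is one possible reading of it, assuming uniformly distributed samples from java.util.Random; MonteCarloIntegralSketch and integrate are hypothetical names. As a usage example it integrates the quadrant function $\sqrt{1 - x^2}$ over [0, 1] and multiplies by 4 to approximate π.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

class MonteCarloIntegralSketch {
    // Approximates the integral of f over [x1, x2] as
    // (x2 - x1) times the mean of f at n uniformly random points.
    static double integrate(DoubleUnaryOperator f, double x1, double x2, int n, long seed) {
        Random r = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double xr = x1 + (x2 - x1) * r.nextDouble();   // random x in [x1, x2)
            sum += f.applyAsDouble(xr);
        }
        return (x2 - x1) * sum / n;
    }

    public static void main(String[] args) {
        double quadrant = integrate(x -> Math.sqrt(1.0 - x * x), 0.0, 1.0, 1_000_000, 7L);
        System.out.println(4.0 * quadrant);   // an approximation of pi
    }
}
```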

Workpool implementation
[Figure: the master (source/sink) connected to the slaves (compute nodes).]
  • DiffuseData (master): sends a starting seed for the random sequence to each slave.
  • Compute (slaves): returns the number of points, out of 1000 random points, that lie inside the arc of the circle.
  • GatherData (master): aggregates the answers.

Seeds Monte Carlo code: MonteCarloPiModule.java
DiffuseData method (required to be implemented)
    public Data DiffuseData (int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("seed", R.nextLong());
        return d;   // returns a random seed for each job unit
    }

Compute method (required to be implemented)
    public Data Compute (Data data) {
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        DataMap<String, Object> output = new DataMap<String, Object>();
        Long seed = (Long) input.get("seed");   // get random seed
        Random r = new Random();
        r.setSeed(seed);
        Long inside = 0L;
        for (int i = 0; i < DoubleDataSize; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            double dist = x * x + y * y;
            if (dist <= 1.0) {
                ++inside;
            }
        }
        output.put("inside", inside);   // to return to GatherData()
        return output;
    }

GatherData method (required to be implemented)
    public void GatherData (int segment, Data dat) {
        DataMap<String, Object> out = (DataMap<String, Object>) dat;
        Long inside = (Long) out.get("inside");
        total += inside;   // aggregate answer from all the worker nodes
    }

getDataCount method (required to be implemented)
    public int getDataCount() {
        return random_samples;
    }
Sets the number of data "envelopes" sent from the master by DiffuseData to the slaves, in this case the number of "seeds". (The number of physical slave processors might be different and is determined by the compute resources.)
Initialized in:
    initializeModule(…) {
        random_samples = 3000;
    }

Method to compute the π result (used in the bootstrap module)
    public double getPi() {
        // returns the value of pi based on all workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }

Complete module class
    package edu.uncc.grid.example.workpool;
    import java.util.Random;
    import java.util.logging.Level;
    import edu.uncc.grid.pgaf.datamodules.DataMap;
    import edu.uncc.grid.pgaf.interfaces.basic.Workpool;
    import edu.uncc.grid.pgaf.p2p.Node;

    public class MonteCarloPiModule extends Workpool {
        private static final long serialVersionUID = 1L;
        private static final int DoubleDataSize = 1000;
        double total;
        int random_samples;
        Random R;

        public MonteCarloPiModule() {
            R = new Random();
        }
        public void initializeModule(String[] args) {
            total = 0;
            Node.getLog().setLevel(Level.WARNING);   // reduce verbosity for logging
            random_samples = 3000;                   // set number of random samples
        }
        public Data DiffuseData (int segment) {
            DataMap<String, Object> d = new DataMap<String, Object>();
            d.put("seed", R.nextLong());
            return d;   // returns a random seed for each job unit
        }
        public Data Compute (Data data) {
            // input gets the data produced by DiffuseData()
            DataMap<String, Object> input = (DataMap<String, Object>) data;
            DataMap<String, Object> output = new DataMap<String, Object>();
            Long seed = (Long) input.get("seed");   // get random seed
            Random r = new Random();
            r.setSeed(seed);
            Long inside = 0L;
            for (int i = 0; i < DoubleDataSize; i++) {
                double x = r.nextDouble();
                double y = r.nextDouble();
                double dist = x * x + y * y;
                if (dist <= 1.0) {
                    ++inside;
                }
            }
            output.put("inside", inside);   // store partial answer to return to GatherData()
            return output;                  // output will emit the partial answers done by this method
        }
        public void GatherData (int segment, Data dat) {
            DataMap<String, Object> out = (DataMap<String, Object>) dat;
            Long inside = (Long) out.get("inside");
            total += inside;   // aggregate answer from all the worker nodes
        }
        public double getPi() {
            // returns the value of pi based on the job done by all the workers
            double pi = (total / (random_samples * DoubleDataSize)) * 4;
            return pi;
        }
        public int getDataCount() {
            return random_samples;
        }
    }

Bootstrap class: RunMonteCarloPiModule.java
Deploys the framework and runs the code.
    package edu.uncc.grid.example.workpool;
    import java.io.IOException;
    import net.jxta.pipe.PipeID;
    import edu.uncc.grid.pgaf.Anchor;
    import edu.uncc.grid.pgaf.Operand;
    import edu.uncc.grid.pgaf.Seeds;
    import edu.uncc.grid.pgaf.p2p.Types;

    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MonteCarloPiModule pi = new MonteCarloPiModule();
                Seeds.start(args[0], false);
                PipeID id = Seeds.startPattern(
                    new Operand((String[]) null,
                                new Anchor(args[1], Types.DataFlowRoll.SINK_SOURCE),
                                pi));
                System.out.println(id.toString());
                Seeds.waitOnPattern(id);
                System.out.println("The result is: " + pi.getPi());
                Seeds.stop();
            } catch (SecurityException e) {
                …
            }
        }
    }

Discussion
  • Does anyone see a flaw in the code? (Clue: random number generation.)
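One concern that is often raised with this kind of code, which may or may not be the flaw the slide has in mind, is that seeds drawn from a single java.util.Random give no guarantee that the resulting sequences are statistically independent of one another. The sketch below shows one alternative from the Java standard library, java.util.SplittableRandom, whose split() method is designed to produce generators for separate tasks. It runs locally and is not Seeds code; IndependentStreamsSketch, countInside, and estimatePi are hypothetical names.

```java
import java.util.SplittableRandom;

class IndependentStreamsSketch {
    // Counts random points that fall inside the quarter circle, using the given generator.
    static long countInside(SplittableRandom r, int points) {
        long inside = 0;
        for (int i = 0; i < points; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        return inside;
    }

    // Gives every task its own generator split from one root generator,
    // then combines the per-task counts into a single estimate of pi.
    static double estimatePi(int tasks, int pointsPerTask, long rootSeed) {
        SplittableRandom root = new SplittableRandom(rootSeed);
        long total = 0;
        for (int t = 0; t < tasks; t++) {
            total += countInside(root.split(), pointsPerTask);
        }
        return 4.0 * total / ((double) tasks * pointsPerTask);
    }

    public static void main(String[] args) {
        System.out.println(estimatePi(3000, 1000, 1L));
    }
}
```

The task count of 3000 and the 1000 points per task mirror the values used in the module above; in a distributed setting the per-task generator state would still have to be conveyed to each worker in some serializable form.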

Workpool pattern
Matrix addition and multiplication are very easy to parallelize, as each result value is independent of the other result values.

Matrix Addition, C = A + B
Add corresponding elements of each matrix to form the elements of the result matrix. Given elements of A as $a_{i,j}$ and elements of B as $b_{i,j}$, each element of C is computed as:
$$c_{i,j} = a_{i,j} + b_{i,j}$$
[Figure: matrices A and B added to form C.]
Easy to parallelize: each processor computes one C element or a group of C elements.

Workpool Implementation
Slave computation: adds one row of A to one row of B to create one row of C (rather than each slave adding single elements).
[Figure: one row of A added to one row of B to give one row of C.]
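The row-at-a-time decomposition described above can be sketched sequentially, with each call to addRows standing in for the work one slave would do. RowAddSketch, addRows, and add are hypothetical names, and the matrices are the initial values used in the module's initMatrices method.

```java
import java.util.Arrays;

class RowAddSketch {
    // The unit of work for one slave: add one row of A to one row of B.
    static int[] addRows(int[] rowA, int[] rowB) {
        int[] rowC = new int[rowA.length];
        for (int i = 0; i < rowA.length; i++) {
            rowC[i] = rowA[i] + rowB[i];
        }
        return rowC;
    }

    // Builds C row by row; each iteration is independent of the others.
    static int[][] add(int[][] a, int[][] b) {
        int[][] c = new int[a.length][];
        for (int row = 0; row < a.length; row++) {
            c[row] = addRows(a[row], b[row]);
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        int[][] b = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        // prints [[4, 10, 16], [6, 8, 18], [2, 10, 4]]
        System.out.println(Arrays.deepToString(add(a, b)));
    }
}
```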

Workpool implementation
[Figure: the master (source/sink) connected to the slaves (compute nodes), one slave for each row.]
  • The master sends one row of A and one row of B to each slave.
  • Each slave returns one row of C.
The following example uses 3 x 3 arrays and 3 slaves.

MatrixAddModule.java (continues on several slides)
    package edu.uncc.grid.example.workpool;
    import …
    public class MatrixAddModule extends Workpool {
        private static final long serialVersionUID = 1L;
        int[][] matrixA;
        int[][] matrixB;
        int[][] matrixC;
        public MatrixAddModule() {
            matrixC = new int[3][3];   // in this example matrices are 3 x 3
        }
        public void initMatrices() {
            // some initial values
            matrixA = new int[][]{{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
            matrixB = new int[][]{{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        }
        // required method: number of data objects (slaves)
        public int getDataCount() {
            return 3;
        }
        public void initializeModule(String[] args) {
            Node.getLog().setLevel(Level.WARNING);
        }

DiffuseData method
    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] rowB = new int[3];
        // the DataMap d returned holds pairs of a String key and an associated array
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int k = segment;   // segment variable used to select rows
        // copy one row of A and one row of B into rowA and rowB, to be sent to slaves
        for (int i = 0; i < 3; i++) {
            rowA[i] = matrixA[k][i];
            rowB[i] = matrixB[k][i];
        }
        // rowA and rowB are put in the DataMap d to send to slaves
        d.put("rowA", rowA);
        d.put("rowB", rowB);
        return d;
    }

Compute method
    public Data Compute(Data data) {
        int[] rowC = new int[3];
        DataMap<String, int[]> input = (DataMap<String, int[]>) data;
        DataMap<String, int[]> output = new DataMap<String, int[]>();
        // get the two rows from the data received
        int[] rowA = (int[]) input.get("rowA");
        int[] rowB = (int[]) input.get("rowB");
        // add the rows
        for (int i = 0; i < 3; i++) {
            rowC[i] = rowA[i] + rowB[i];
        }
        // put the result row into output, with a key, to be sent back to the master
        output.put("rowC", rowC);
        return output;
    }

GatherData method (note the segment variable and the Data from the slave)
    public void GatherData(int segment, Data dat) {
        DataMap<String, int[]> out = (DataMap<String, int[]>) dat;
        int[] rowC = (int[]) out.get("rowC");   // get the C row sent from the slave
        // place the row into the result matrix; the segment variable
        // associated with the Data is used to choose the correct row
        for (int i = 0; i < 3; i++) {
            matrixC[segment][i] = rowC[i];
        }
    }

Bootstrap class: RunMatrixAddModule.java
In this example the path to Seeds and the local host name are command line arguments.
    package edu.uncc.grid.example.workpool;
    import …
    public class RunMatrixAddModule {
        public static void main (String[] args) {
            try {
                long start = System.currentTimeMillis();
                Seeds.start(args[0], false);
                MatrixAddModule m = new MatrixAddModule();
                m.initMatrices();
                PipeID id = Seeds.startPattern(
                    new Operand((String[]) null,
                                new Anchor(args[1], Types.DataFlowRoll.SINK_SOURCE),
                                m));
                Seeds.waitOnPattern(id);
                m.printResult();
                Seeds.stop();
                long stop = System.currentTimeMillis();
                double time = (double) (stop - start) / 1000.0;
                System.out.println("Execution time = " + time);
                …

Matrix Multiplication, C = A * B
Multiplication of two matrices, A and B, produces matrix C, whose elements $c_{i,j}$ ($0 \le i < n$, $0 \le j < m$) are computed as follows:
$$c_{i,j} = \sum_{k=0}^{l-1} a_{i,k}\, b_{k,j}$$
where A is an n x l matrix and B is an l x m matrix.

Parallelizing Matrix Multiplication
Assume throughout that the matrices are square (n x n matrices). Sequential code to compute A x B could simply be:
    for (i = 0; i < n; i++)           // for each row of A
        for (j = 0; j < n; j++) {     // for each column of B
            c[i][j] = 0;
            for (k = 0; k < n; k++)
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
        }
Requires $n^3$ multiplications and $n^3$ additions, giving a sequential time complexity of $O(n^3)$. Very easy to parallelize, as each result is independent.

Matrix Multiplication, C = A * B
One slave computes one element of the result in the workpool implementation.

Workpool implementation
[Figure: the master (source/sink) connected to the slaves (compute nodes), one slave for each element of the result.]
  • The master sends one row of A and one column of B to each slave.
  • Each slave returns one element of C.
The following example uses 3 x 3 arrays and 9 slaves.

MatrixAddModule.java (continues on several slides)
    package edu.uncc.grid.example.workpool;
    import …
    public class MatrixAddModule extends Workpool {
        private static final long serialVersionUID = 1L;
        int[][] matrixA;
        int[][] matrixB;
        int[][] matrixC;
        public MatrixAddModule() {
            matrixC = new int[3][3];   // in this example matrices are 3 x 3
        }
        public void initMatrices() {
            // some initial values
            matrixA = new int[][]{{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
            matrixB = new int[][]{{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        }
        // required method: number of data objects (slaves)
        public int getDataCount() {
            return 9;
        }
        public void initializeModule(String[] args) {
            Node.getLog().setLevel(Level.WARNING);
        }

Note on mapping rows and columns to segments
    segment   Arow   Bcol
       0        0      0
       1        0      1
       2        0      2
       3        1      0
       4        1      1
       5        1      2
       6        2      0
       7        2      1
       8        2      2

    int Arow = segment / 3;
    int Bcol = segment % 3;
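The mapping above generalizes from 3 to any n x n result matrix. The following sketch is an illustration only; SegmentMapSketch and its methods row, col, and segment are hypothetical names, and the inverse mapping is added just to show that the scheme loses no information.

```java
class SegmentMapSketch {
    // Row of A selected by a segment number, for an n x n result matrix.
    static int row(int segment, int n) {
        return segment / n;
    }

    // Column of B selected by a segment number.
    static int col(int segment, int n) {
        return segment % n;
    }

    // Inverse mapping: the segment number that handles element (row, col).
    static int segment(int row, int col, int n) {
        return row * n + col;
    }

    public static void main(String[] args) {
        int n = 3;
        // Prints the same table as above, one segment per line.
        for (int s = 0; s < n * n; s++) {
            System.out.println("segment " + s + ": Arow " + row(s, n) + ", Bcol " + col(s, n));
        }
    }
}
```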

DiffuseData method
    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] colB = new int[3];
        // the DataMap d returned holds pairs of a String key and an associated array
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        // segment variable used to select the row of A and the column of B
        int a = segment / 3, b = segment % 3;
        // copy one row of A and one column of B into rowA and colB, to be sent to slaves
        for (int i = 0; i < 3; i++) {
            rowA[i] = matrixA[a][i];
            colB[i] = matrixB[i][b];
        }
        // rowA and colB are put in the DataMap d to send to slaves
        d.put("rowA", rowA);
        d.put("colB", colB);
        return d;
    }

Compute method
    public Data Compute(Data data) {
        DataMap<String, int[]> input = (DataMap<String, int[]>) data;
        DataMap<String, Integer> output = new DataMap<String, Integer>();
        // get the row of A and the column of B from the data received
        int[] rowA = (int[]) input.get("rowA");
        int[] colB = (int[]) input.get("colB");
        // matrix multiplication, one result
        int out = 0;
        for (int i = 0; i < 3; i++) {
            out += rowA[i] * colB[i];
        }
        // put the result into output, with a key, to be sent back to the master
        output.put("out", out);
        return output;
    }

GatherData method (note the segment variable and the Data from the slave)
    public void GatherData(int segment, Data dat) {
        DataMap<String, Integer> out = (DataMap<String, Integer>) dat;
        int answer = out.get("out");   // get the result sent from the slave*
        // the segment variable associated with the Data is used to choose the correct element
        int a = segment / 3, b = segment % 3;
        matrixC[a][b] = answer;        // place the element into the result matrix
    }
* A cast from Integer to int is not necessary.
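The three methods above can be exercised together without a cluster. The sketch below is a local, single-JVM simulation that calls the three steps in a loop over the nine segments; it uses java.util.HashMap in place of DataMap and is not the Seeds API. LocalWorkpoolSketch and its lowercase method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class LocalWorkpoolSketch {
    static final int N = 3;
    static final int[][] A = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
    static final int[][] B = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};

    // Master side: package one row of A and one column of B for a segment.
    static Map<String, Object> diffuseData(int segment) {
        int row = segment / N, col = segment % N;
        int[] colB = new int[N];
        for (int i = 0; i < N; i++) {
            colB[i] = B[i][col];
        }
        Map<String, Object> d = new HashMap<>();
        d.put("rowA", A[row].clone());
        d.put("colB", colB);
        return d;
    }

    // Slave side: dot product of the received row and column.
    static Map<String, Object> compute(Map<String, Object> input) {
        int[] rowA = (int[]) input.get("rowA");
        int[] colB = (int[]) input.get("colB");
        int out = 0;
        for (int i = 0; i < N; i++) {
            out += rowA[i] * colB[i];
        }
        Map<String, Object> output = new HashMap<>();
        output.put("out", out);
        return output;
    }

    // Master side: place the element where the segment number says it belongs.
    static void gatherData(int segment, Map<String, Object> dat, int[][] c) {
        c[segment / N][segment % N] = (Integer) dat.get("out");
    }

    // Runs every segment in turn; a real workpool would run compute on the slaves.
    static int[][] run() {
        int[][] c = new int[N][N];
        for (int segment = 0; segment < N * N; segment++) {
            gatherData(segment, compute(diffuseData(segment)), c);
        }
        return c;
    }

    public static void main(String[] args) {
        // prints [[27, 70, 77], [27, 76, 78], [19, 35, 57]]
        System.out.println(Arrays.deepToString(run()));
    }
}
```

Because each segment's result is placed using only its own segment number, the order in which results arrive does not matter, which is the property the workpool relies on.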

Questions