Parallel Programming with PVM Prof. Sivarama Dandamudi School of Computer Science Carleton University


Parallel Algorithm Models
• Five basic models
  • Data parallel model
  • Task graph model
  • Work pool model
  • Master-slave model
  • Pipeline model
• Hybrid models
Carleton University © S. Dandamudi

Parallel Algorithm Models (cont’d)
• Data parallel model
  • One of the simplest of all the models
  • Tasks are statically mapped onto processors
  • Each task performs a similar operation on different data
    • Called the data parallelism model
  • Work may be done in phases
    • Operations in different phases may be different
  • Ex: Matrix multiplication

Parallel Algorithm Models (cont’d)
• Data parallel model

    [ A11 A12 ]   [ B11 B12 ]   [ C11 C12 ]
    [ A21 A22 ] . [ B21 B22 ] = [ C21 C22 ]

  Decomposition into four tasks:
  • Task 1: C11 = A11 B11 + A12 B21
  • Task 2: C12 = A11 B12 + A12 B22
  • Task 3: C21 = A21 B11 + A22 B21
  • Task 4: C22 = A21 B12 + A22 B22

Parallel Algorithm Models (cont’d)
• Task graph model
  • Parallel algorithm is viewed as a task-dependency graph
  • Called the task parallelism model
  • Typically used for tasks that have large amounts of data
  • Static mapping is used to optimize data movement cost
    • Locality-based mapping is important
  • Ex: Divide-and-conquer algorithms, parallel quicksort

Parallel Algorithm Models (cont’d)
• Task parallelism (figure: task-dependency graph)

Parallel Algorithm Models (cont’d)
• Work pool model
  • Dynamic mapping of tasks onto processors
    • Important for load balancing
  • Used on message-passing systems
    • When the data associated with a task is relatively small
  • Granularity of tasks
    • Too small: overhead in accessing tasks can increase
    • Too big: load imbalance
  • Ex: Parallelization of loops by chunk scheduling

Parallel Algorithm Models (cont’d)
• Master-slave model
  • One or more master processes generate work and allocate it to worker processes
  • Also called the manager-worker model
  • Suitable for both shared-memory and message-passing systems
  • Master can potentially become a bottleneck
  • Granularity of tasks is important

Parallel Algorithm Models (cont’d)
• Pipeline model
  • A stream of data passes through a series of processes
  • Each process performs some task on the data
  • Also called the stream parallelism model
  • Uses a producer-consumer relationship
    • Overlapped execution
  • Useful in applications such as database query processing
  • Potential problem
    • One process can delay the whole pipeline

Parallel Algorithm Models (cont’d)
• Pipeline model
  • Pipelined processing can avoid writing temporary results on disk and reading them back
  (figure: a pipeline of query operators over relations R1–R5)

Parallel Algorithm Models (cont’d)
• Hybrid models
  • Possible to use multiple models
    • Hierarchical: different models at different levels
    • Sequential: different models in different phases
  • Ex: Major computation may use the task graph model
    • Each node of the graph may use the data parallel or pipeline model

PVM
• Parallel Virtual Machine
• Collaborative effort
  • Oak Ridge National Lab, University of Tennessee, Emory University, and Carnegie Mellon University
• Began in 1989
  • Version 1.0 was used internally
• Version 2.0 released in March 1991
• Version 3.0 in February 1993

PVM (cont’d)
• Targeted for heterogeneous network computing
  • Different architectures
  • Data formats
  • Computational speeds
  • Machine loads
  • Network load

PVM Calls
• Process control

    int tid = pvm_mytid(void)

  • Returns the tid of the calling process
  • Can be called multiple times

    int info = pvm_exit(void)

  • Does not kill the process
  • Tells the local pvmd that this process is leaving PVM
  • info < 0 indicates an error (e.g., pvmd not responding)

PVM Calls (cont’d)
• Process control

    int numt = pvm_spawn(char *task,   /* executable file name */
                         char **argv,  /* arguments to task (NULL terminated) */
                         int flag,     /* spawn options */
                         char *where,  /* specific host or architecture (PVM_ARCH) */
                         int ntask,
                         int *tids)

  • Starts ntask copies of the executable file task

PVM Calls (cont’d)
• flag specifies options

    Value   Option           Meaning
    0       PvmTaskDefault   PVM chooses where to spawn
    1       PvmTaskHost      where specifies a host
    2       PvmTaskArch      where specifies an architecture
    4       PvmTaskDebug     Starts tasks under a debugger

PVM Calls (cont’d)
• Process control

    int info = pvm_kill(int tid)

  • Kills the PVM task identified by tid
  • Does not kill the calling task
    • To kill the calling task, first call pvm_exit(), then exit()
  • Writes to the file /tmp/pvml.<uid>

PVM Calls (cont’d)
• Information

    int tid = pvm_parent(void)

  • Returns the tid of the process that spawned the calling task
  • Returns PvmNoParent if the task was not created by pvm_spawn()

PVM Calls (cont’d)
• Information

    int info = pvm_config(int *nhost, int *narch,
                          struct pvmhostinfo **hostp)

  • Returns nhost = number of hosts
  • Returns narch = number of different data formats

PVM Calls (cont’d)
• Message sending
  • Involves three steps
    • Send buffer must be initialized: use pvm_initsend()
    • Message must be packed: use pvm_pk*()
      • Several pack routines are available
    • Send the message: use pvm_send()

PVM Calls (cont’d)
• Message sending

    int bufid = pvm_initsend(int encoding)

  • Called before packing a new message into the buffer
  • Clears the send buffer and creates a new one for packing a new message
  • bufid = new buffer id

PVM Calls (cont’d)
• encoding can have three options:
  • PvmDataDefault
    • XDR encoding is used by default
    • Useful for heterogeneous architectures
  • PvmDataRaw
    • No encoding is done
    • Messages sent in their original form
  • PvmDataInPlace
    • No buffer copying
    • Buffer should not be modified until sent

PVM Calls (cont’d)
• Packing data
  • Several routines are available (one for each data type)
  • Each takes three arguments

    int info = pvm_pkbyte(char *cp, int nitem, int stride)

  • nitem = # items to be packed
  • stride = stride in elements

PVM Calls (cont’d)
• Packing data

    pvm_pkint     pvm_pklong
    pvm_pkfloat   pvm_pkdouble
    pvm_pkshort

• Pack string routine requires only the NULL-terminated string pointer

    pvm_pkstr(char *cp)

PVM Calls (cont’d)
• Sending data

    int info = pvm_send(int tid, int msgtag)

  • Sends the message in the packed buffer to task tid
  • Message is tagged with msgtag
  • Message tags are useful to distinguish different types of messages

PVM Calls (cont’d)
• Sending data (multicast)

    int info = pvm_mcast(int *tids, int ntask, int msgtag)

  • Sends the message in the packed buffer to all tasks in the tids array (except itself)
  • tids array length is given by ntask

PVM Calls (cont’d)
• Receiving data
  • Two steps
    • Receive data
    • Unpack it
  • Two versions
    • Blocking: waits until the message arrives
    • Non-blocking: does not wait

PVM Calls (cont’d)
• Receiving data
  • Blocking receive

    int info = pvm_recv(int tid, int msgtag)

  • Waits until a message with msgtag has arrived from task tid
  • Wildcard value (-1) is allowed for both msgtag and tid

PVM Calls (cont’d)
• Receiving data
  • Non-blocking receive

    int info = pvm_nrecv(int tid, int msgtag)

  • If no message with msgtag has arrived from task tid, returns bufid = 0
  • Otherwise, behaves like the blocking receive

PVM Calls (cont’d)
• Receiving data
  • Probing for a message

    int info = pvm_probe(int tid, int msgtag)

  • If no message with msgtag has arrived from task tid, returns bufid = 0
  • Otherwise, returns a bufid for the message
    • Does not receive the message

PVM Calls (cont’d)
• Unpacking data (similar to packing routines)

    pvm_upkint     pvm_upklong
    pvm_upkfloat   pvm_upkdouble
    pvm_upkshort   pvm_upkbyte

• Unpack string routine requires only the NULL-terminated string pointer

    pvm_upkstr(char *cp)

PVM Calls (cont’d)
• Buffer information
  • Useful to find the size of the received message

    int info = pvm_bufinfo(int bufid, int *bytes,
                           int *msgtag, int *tid)

  • Returns msgtag, source tid, and size in bytes

Example
• Finds the sum of the elements of a given vector
• Vector size is given as input
• The program can be run on a PVM with up to 10 nodes
  • Can be modified by changing a constant
• Vector size is assumed to be evenly divisible by the number of nodes in the PVM
  • Easy to remove this restriction
• Master (vecsum.c) and slave (vecsum_slave.c) programs

Example (cont’d)
vecsum.c

    #include <stdio.h>
    #include <sys/time.h>
    #include "pvm3.h"

    #define MAX_SIZE 250000  /* max. vector size */
    #define NPROCS   10      /* max. number of PVM nodes */

Example (cont’d)

    main()
    {
        int cc, tid[NPROCS];
        long vector[MAX_SIZE];
        double sum = 0, partial_sum;  /* partial sum received from slaves */
        long i, vector_size;

Example (cont’d)

        int nhost,  /* actual # of hosts in PVM */
            size;   /* size of vector to be distributed */
        struct timeval start_time, finish_time;
        long sum_time;

Example (cont’d)

        printf("Vector size = ");
        scanf("%ld", &vector_size);

        for (i = 0; i < vector_size; i++)  /* initialize vector */
            vector[i] = i;

        gettimeofday(&start_time, (struct timezone *)0);  /* start time */

Example (cont’d)

        tid[0] = pvm_mytid();  /* establish my tid */

        /* get # of hosts using pvm_config() */
        pvm_config(&nhost, (int *)0, (struct pvmhostinfo **)0);

        size = vector_size / nhost;  /* size of vector to send to slaves */

Example (cont’d)

        if (nhost > 1)
            pvm_spawn("vecsum_slave", (char **)0, 0, "",
                      nhost - 1, &tid[1]);

        for (i = 1; i < nhost; i++) {  /* distribute data to slaves */
            pvm_initsend(PvmDataDefault);
            pvm_pklong(&vector[i*size], size, 1);
            pvm_send(tid[i], 1);
        }

Example (cont’d)

        for (i = 0; i < size; i++)  /* perform local sum */
            sum += vector[i];

        for (i = 1; i < nhost; i++) {  /* collect partial sums from slaves */
            pvm_recv(-1, 2);
            pvm_upkdouble(&partial_sum, 1, 1);
            sum += partial_sum;
        }

Example (cont’d)

        gettimeofday(&finish_time, (struct timezone *)0);  /* finish time */

        /* elapsed time in microseconds */
        sum_time = (finish_time.tv_sec - start_time.tv_sec) * 1000000
                   + finish_time.tv_usec - start_time.tv_usec;

Example (cont’d)

        printf("Sum = %lf\n", sum);
        printf("Sum time on %d hosts = %lf sec\n",
               nhost, (double)sum_time / 1000000);

        pvm_exit();
    }

Example (cont’d)
vecsum_slave.c

    #include "pvm3.h"

    #define MAX_SIZE 250000

    main()
    {
        int ptid, bufid, vector_bytes;
        long vector[MAX_SIZE];
        double sum = 0;
        int i;

Example (cont’d)

        ptid = pvm_parent();  /* find parent tid */

        bufid = pvm_recv(ptid, 1);  /* receive data from master */

        /* use pvm_bufinfo() to find the number of bytes received */
        pvm_bufinfo(bufid, &vector_bytes, (int *)0, (int *)0);

Example (cont’d)

        pvm_upklong(vector, vector_bytes/sizeof(long), 1);  /* unpack */

        for (i = 0; i < vector_bytes/sizeof(long); i++)  /* local summation */
            sum += vector[i];

Example (cont’d)

        pvm_initsend(PvmDataDefault);  /* send sum to master */
        pvm_pkdouble(&sum, 1, 1);
        pvm_send(ptid, 2);  /* use msg type 2 for partial sum */

        pvm_exit();
    }