Parallel Programming
• Parallel or concurrent programming has been around since the 1960s, during which time-sharing systems became popular. Until time-sharing was created, all computer jobs were run in batch mode: a batch of jobs would be available for execution, a single job would execute until completion, and then another job from the batch would be chosen to execute.
• Time-sharing was used to allow human-computer interaction by giving each of a number of users time-slices in which they could have exclusive access to the machine. In early systems, these time-slices were allocated in round-robin order and were relatively small (some fraction of a second). This gave the impression that each user had complete access to a machine that (when shared by n users) seemed to be 1/n times as fast as the actual machine in use.
• Time-sharing represents a form of multiprogramming (running multiple programs on a single physical processor), which is distinct from multiprocessing, in which multiple physical processors are employed by a computing system. Distributed systems comprise a number of separate computers (either uni- or multiprocessors) that are loosely coupled via some communication network.
Programming Languages and Parallel Processing
• Programming languages have been used to express algorithms to solve problems presented by parallel processing systems.
• Programming languages are used to write the operating systems that implement parallel programming.
• Programming languages have been used to harness the capabilities of multiprocessors in expressive and comprehensible ways.
• Programming languages have been used to implement and express communication across computer networks.
Parallel Processing
• A process is an instance of a program segment that has been scheduled for independent execution.
• Any system that supports the execution of multiple processes at a single time is a parallel processing system.
• A system in which parallel processing is simulated on a uniprocessor is known as a pseudoparallel or multiprogramming system.
• A process in a multiprogramming system may exist in any of several states:
– executing
– blocked (waiting for exclusive access to resources or some other information necessary to continue operation)
– waiting (ready to run when access to the processor is granted)
Heavy- vs. Lightweight Processes
• A heavyweight process is one that has exclusive ownership of all resources necessary for it to carry out its own computation. In particular, a heavyweight process typically contains both text (code) and data memory segments, and a program stack.
• A lightweight process shares some of these resources with other processes and is not considered to be their owner. Such a process is often referred to as a thread; it may share its text and data segments with another process but has its own program stack (see the sketch below).
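As a rough illustration in Java (the class and field names here are invented for this sketch), the two threads below execute the same code and can both read the shared message field, while the local variable greeting lives on each thread's own stack:

import static java.lang.System.out;

// Sketch only: threads share the process's code and data,
// but each thread has its own stack for local variables.
public class ThreadSharingSketch {
    static String message = "shared between threads";    // shared data

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            String greeting = Thread.currentThread().getName();  // lives on this thread's stack
            out.println(greeting + " sees: " + message);          // shared data visible to both
        };
        Thread t1 = new Thread(task, "thread-1");
        Thread t2 = new Thread(task, "thread-2");
        t1.start(); t2.start();
        t1.join();  t2.join();
    }
}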
Flynn’s Taxonomy of Processors
• SISD: Single Instruction stream, Single Data stream. This is the typical sequential computer.
• SIMD: Single Instruction stream, Multiple Data stream. This is the kind of parallelism used in what are referred to as massively parallel processors, such as the Lockheed-Martin PAL and GAPP architectures and the Thinking Machines Connection Machine.
• MIMD: Multiple Instruction stream, Multiple Data stream. This is the kind of parallelism exhibited by typical multiprocessors and distributed processing systems. Each process uses its own data stream (even if they happen to share the same memory device).
• MISD: Multiple Instruction stream, Single Data stream. I am not aware of anyone who claims to be doing this, although you can think of agent-based computing, in which a single collection of data is manipulated by a large number of processes, as falling into this category.
Synchronizing Activities
• In general, people typically think of MIMD when they think of parallel computing.
• To have an interesting MIMD computation, the multiple instruction streams must communicate information between each other. Two general methods are available to do this:
– Shared memory (making these processes hybrid MIMD/MISD processes)
– Message passing
• Shared memory and message passing both require us to solve the problem of synchronization:
– In shared memory, we must avoid race conditions, in which two processes race to access and modify the same unit of memory (illustrated in the sketch below).
– In both shared memory and message passing systems we must avoid deadlock (in which a group of blocked processes wait for each other to complete), starvation (in which a process waits forever), and livelock (in which a group of processes enters an endless repetitive state transition sequence).
• Various methods of synchronization have been supported by programming languages and systems over the years.
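A minimal Java sketch of a shared-memory race condition (the class and field names are made up for illustration): both threads perform unsynchronized read-modify-write updates on the same field, so increments can be lost and the final count is usually less than 200000.

// Sketch only: demonstrates lost updates caused by a race condition.
public class RaceConditionSketch {
    static int count = 0;                        // shared, unprotected data

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                count++;                         // not atomic: read, increment, write
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join();  t2.join();
        System.out.println("count = " + count);  // typically < 200000 due to lost updates
    }
}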
Semaphores
• The use of semaphores for mutual exclusion was introduced by Dijkstra (Co-operating Sequential Processes, 1968). His motivation was the use of semaphores in railways to control access to a shared section of track. The operations used to signal the desire for access to the track were P and V, which Louden and many others nowadays dub Delay and Signal.
– P: prolagen (a Dijkstra neologism for the Dutch phrase proberen te verlagen, or try to decrease)
– V: verhogen (increase)
• The role of the single section of track is usually played by some physical object (memory region, peripheral device, functional unit), and the code that accesses it is referred to as a critical region.
• The semaphore is a unit of memory that can store an integer value. To get access to the critical region, a process signals its intent by executing P(S), or Delay(S): if the semaphore S has a value greater than 0, it is decremented by 1; if S is 0, the executing process is delayed until S can be decremented. At the end of the critical region, the process must execute V(S), or Signal(S), which increments S by 1.
• Clearly, access to the semaphore must be atomic: a Delay or Signal call must not be interrupted.
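As a sketch, using Java's java.util.concurrent.Semaphore as the semaphore implementation (the surrounding class and method names are invented for illustration), Delay and Signal map onto acquire() and release():

import java.util.concurrent.Semaphore;

// Sketch only: Delay(S) ~ S.acquire(), Signal(S) ~ S.release().
public class CriticalRegionSketch {
    private static final Semaphore S = new Semaphore(1);  // binary semaphore, initially 1
    private static int shared = 0;                         // resource protected by S

    static void increment() throws InterruptedException {
        S.acquire();     // Delay(S): block if S is 0, otherwise decrement S
        shared++;        // critical region: at most one thread executes this at a time
        S.release();     // Signal(S): increment S, possibly waking a delayed thread
    }
}

Guarding the increments from the race-condition sketch above with this Delay/Signal pair would restore the expected final count.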
Problems with Semaphores
• Semaphores do not lend themselves to easy use. The programmer must ensure that each critical region is bounded by correct semaphore usage:
Delay(S); {critical region} Signal(S);
not
Signal(S); {critical region} Delay(S);
or anything else equally evil. One common defensive idiom is sketched below.
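One defensive idiom (a sketch only, not from the original slides; the names are invented) is to bracket the critical region with try/finally so that Signal is always executed, even if the region throws an exception:

import java.util.concurrent.Semaphore;

// Sketch only: the release (Signal) is guaranteed to run after the acquire (Delay).
public class SafeRegion {
    private final Semaphore s = new Semaphore(1);

    public void guarded(Runnable criticalRegion) throws InterruptedException {
        s.acquire();                 // Delay(S)
        try {
            criticalRegion.run();    // the protected work
        } finally {
            s.release();             // Signal(S) always executes, even on exception
        }
    }
}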
Monitors
• Introduced by Hoare and Brinch Hansen, a monitor is an abstract data type that provides mutual exclusion for each of its operations.
– At most one process can be executing any of the monitor’s operations at any time.
– The monitor maintains a wait-queue of processes that are awaiting execution of each of its operations.
• Java incorporates synchronized methods, which implement the kind of exclusion that a monitor provides (see the sketch below).
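A minimal sketch of a monitor-like class in Java (BoundedCounter is an invented name): synchronized methods give mutual exclusion on the instance, and wait/notifyAll play the role of the monitor's wait queue.

// Sketch only: at most one thread may be inside any synchronized
// method of the same BoundedCounter instance at a time.
public class BoundedCounter {
    private int value = 0;

    public synchronized void increment() {    // mutual exclusion on 'this'
        value++;
        notifyAll();                           // wake threads waiting on this monitor
    }

    public synchronized void awaitAtLeast(int n) throws InterruptedException {
        while (value < n) {                    // recheck the condition after each wakeup
            wait();                            // release the monitor and wait
        }
    }

    public synchronized int get() {
        return value;
    }
}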
Message Passing
• Message passing involves two primitives, send and receive, that are used to communicate a (possibly empty) data message between two processes.
• Send and receive may or may not be required to name the specific target process.
• Send and receive may either block or not block (see the sketch below).
• Greg Andrews’ SR programming language provided an exhaustive examination of the possibilities offered by send and receive.
• Hoare’s CSP (Communicating Sequential Processes) was the prototypical message-passing framework, implemented in the Occam language.
• Java’s Remote Method Invocation (RMI) supports distributed computation using interprocess communication.
• MPI, CORBA, and COM also provide interprocess communication capabilities.
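Java has no built-in send/receive primitives for local threads, but as a rough illustration a BlockingQueue can stand in for a message channel: put() behaves like a blocking send and take() like a blocking receive (the class and variable names below are invented for this sketch).

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch only: one thread sends a message, the main thread receives it.
public class MessagePassingSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> channel = new ArrayBlockingQueue<>(1);

        Thread sender = new Thread(() -> {
            try {
                channel.put("hello");            // send: blocks if the channel is full
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        sender.start();
        String msg = channel.take();             // receive: blocks until a message arrives
        System.out.println("received: " + msg);
        sender.join();
    }
}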