Pipeline And Vector Processing

Parallel Processing
Execution of concurrent events in the computing process to achieve faster computational speed.
The purpose of parallel processing is to speed up the computer's processing capability and increase its throughput, that is, the amount of processing that can be accomplished during a given interval of time. The amount of hardware increases with parallel processing, and with it the cost of the system. However, technological developments have reduced hardware costs to the point where parallel processing techniques are economically feasible.

Parallel processing according to levels of complexity
  • At the lower level: serial shift registers vs. parallel-load registers
  • At the higher level: a multiplicity of functional units that perform identical or different operations simultaneously
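The lower-level contrast can be made concrete with a small sketch. It assumes a simple model in which a serial shift register accepts one bit per clock pulse and a parallel-load register accepts the whole word in one pulse; the function names are illustrative only.

```python
def serial_load(bits):
    """Shift bits into a register one per clock pulse; return (register, pulses)."""
    register = [0] * len(bits)
    pulses = 0
    for b in bits:
        register = register[1:] + [b]  # shift by one place, new bit enters at the end
        pulses += 1
    return register, pulses


def parallel_load(bits):
    """Load every bit of the word in the same clock pulse."""
    return list(bits), 1


word = [1, 0, 1, 1, 0, 0, 1, 0]
serial_reg, serial_pulses = serial_load(word)
parallel_reg, parallel_pulses = parallel_load(word)
print(serial_reg == parallel_reg, serial_pulses, parallel_pulses)  # True 8 1
```

Both registers end up holding the same word; the difference is only in how many clock pulses the transfer takes.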

Parallel Computers

SISD COMPUTER SYSTEMS

Von Neumann Architecture

MISD COMPUTER SYSTEMS

SIMD COMPUTER SYSTEMS

MIMD COMPUTER SYSTEMS

PIPELINING
A technique of decomposing a sequential process into suboperations, with each subprocess being executed in a special dedicated segment that operates concurrently with all other segments. A pipeline can be visualized as a collection of processing segments through which binary information flows. The name “pipeline” implies a flow of information analogous to an industrial assembly line.
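The definition above can be illustrated with a minimal simulation. It assumes a three-segment pipeline that computes A[i] * B[i] + C[i]; the register names R1 to R5 and the choice of operation are assumptions made for this sketch, not something stated in the text.

```python
def run_pipeline(A, B, C):
    """Simulate a three-segment pipeline computing A[i] * B[i] + C[i].

    All registers are updated together on each clock pulse, using the values
    they held before the pulse, so the segments operate concurrently.
    """
    n = len(A)
    R1 = R2 = R3 = R4 = R5 = None
    results = []
    clock = 0
    while len(results) < n:
        clock += 1
        new_R5 = R3 + R4 if R3 is not None else None            # segment 3: add
        new_R3 = R1 * R2 if R1 is not None else None            # segment 2: multiply
        new_R4 = C[clock - 2] if 2 <= clock <= n + 1 else None  # segment 2: load C[i]
        new_R1 = A[clock - 1] if clock <= n else None           # segment 1: load A[i]
        new_R2 = B[clock - 1] if clock <= n else None           # segment 1: load B[i]
        R1, R2, R3, R4, R5 = new_R1, new_R2, new_R3, new_R4, new_R5
        if R5 is not None:
            results.append(R5)
    return results, clock


results, clocks = run_pipeline([1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12])
print(results, clocks)  # [14, 22, 32, 44] 6
```

Four operand sets pass through three segments in 6 clock pulses rather than 12, because once the pipeline is full a new result emerges on every pulse.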

Example of the Pipeline Organization

OPERATIONS IN EACH PIPELINE STAGE

GENERAL PIPELINE

Cont.

Speedup ratio of pipeline
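As a hedged sketch, the speedup ratio is usually defined in textbooks as S = n·t_n / ((k + n − 1)·t_p), where k is the number of segments, t_p the clock period, n the number of tasks, and t_n the time a nonpipelined unit needs per task. The symbols, the default t_n = k·t_p, and the numbers below are assumptions for illustration.

```python
def pipeline_speedup(k, n, t_p, t_n=None):
    """Speedup of a k-segment pipeline over a nonpipelined unit for n tasks.

    t_p is the pipeline clock period; t_n is the nonpipelined time per task
    (assumed equal to k * t_p when not given).
    """
    if t_n is None:
        t_n = k * t_p
    nonpipelined_time = n * t_n
    pipelined_time = (k + n - 1) * t_p  # k pulses to fill, then one result per pulse
    return nonpipelined_time / pipelined_time


speedup = pipeline_speedup(k=4, n=100, t_p=20)
print(round(speedup, 2))  # 3.88
```

As n grows, the ratio approaches t_n / t_p, which equals k under the assumption t_n = k·t_p.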

Cont.

PIPELINE AND MULTIPLE FUNCTION UNITS

Cont.

ARITHMETIC PIPELINE

Cont. See the example in P. 310

INSTRUCTION CYCLE

INSTRUCTION PIPELINE

INSTRUCTION EXECUTION IN A 4-STAGE PIPELINE

Pipeline

Space time diagram

MAJOR HAZARDS IN PIPELINED EXECUTION
  • Structural hazards (resource conflicts): the hardware resources required by instructions in simultaneous overlapped execution cannot be met.
  • Data hazards (data dependency conflicts): an instruction scheduled to be executed in the pipeline requires the result of a previous instruction, which is not yet available.
  • Control hazards (branch difficulties): branches and other instructions that change the PC delay the fetch of the next instruction.
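The data-hazard case can be sketched as a simple check over an instruction list. This is a simplified illustration: the `Instr` tuple, the register names, and the `window` parameter (how many following instructions may still see a result that has not been written back) are all hypothetical, and intervening writes to the same register are ignored.

```python
from collections import namedtuple

# text: assembly-like label, dest: register written, srcs: registers read
Instr = namedtuple("Instr", "text dest srcs")


def find_raw_hazards(program, window=2):
    """Return (producer_index, consumer_index, register) for read-after-write pairs."""
    hazards = []
    for j, consumer in enumerate(program):
        for i in range(max(0, j - window), j):
            producer = program[i]
            if producer.dest is not None and producer.dest in consumer.srcs:
                hazards.append((i, j, producer.dest))
    return hazards


program = [
    Instr("ADD R1, R2, R3", "R1", ("R2", "R3")),
    Instr("SUB R4, R1, R5", "R4", ("R1", "R5")),
    Instr("AND R6, R7, R8", "R6", ("R7", "R8")),
    Instr("OR  R9, R4, R1", "R9", ("R4", "R1")),
]
hazards = find_raw_hazards(program)
print(hazards)  # [(0, 1, 'R1'), (1, 3, 'R4')]
```

The SUB needs R1 before the ADD has produced it, and the OR needs R4 from the SUB; the OR's use of R1 falls outside the assumed two-instruction window and is not reported.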

Data hazards Control hazards

STRUCTURAL HAZARDS
Occur when some resource has not been duplicated enough to allow all combinations of instructions in the pipeline to execute.
Example: with one memory port, a data fetch and an instruction fetch cannot be initiated in the same clock cycle.
  • Two loads with a one-port memory: the pipeline is stalled for a structural hazard.
  • A two-port memory serves the same accesses without a stall.
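The one-port example can be modelled roughly as follows. The sketch assumes a five-stage pipeline in which a load performs its data access three cycles after it is fetched (`mem_offset=3`); that offset, the function name, and the instruction mix are assumptions for illustration.

```python
def fetch_cycles(is_load, ports, mem_offset=3):
    """Return the clock cycle in which each instruction is fetched.

    is_load[i] is True when instruction i is a load. With one memory port,
    an instruction fetch is delayed while an earlier load uses the port.
    """
    busy = set()  # cycles in which a load's data access occupies the port
    fetched = []
    cycle = 0
    for load in is_load:
        cycle += 1
        if ports == 1:
            while cycle in busy:  # port taken by a data access: stall the fetch
                cycle += 1
        fetched.append(cycle)
        if load:
            busy.add(cycle + mem_offset)
    return fetched


mix = [True, True, False, False, False, False]  # two loads, then four other instructions
one_port = fetch_cycles(mix, ports=1)
two_port = fetch_cycles(mix, ports=2)
print(one_port)  # [1, 2, 3, 6, 7, 8]
print(two_port)  # [1, 2, 3, 4, 5, 6]
```

With one port, the fourth instruction cannot be fetched in cycles 4 and 5 because the two loads are using the memory, so two stall cycles appear; with two ports the fetches proceed back to back.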

DATA HAZARDS

FORWARDING HARDWARE

INSTRUCTION SCHEDULING

CONTROL HAZARDS

VECTOR PROCESSING
There is a class of computational problems that are beyond the capabilities of a conventional computer. These problems are characterized by the fact that they require a vast number of computations that would take a conventional computer days or even weeks to complete.

VECTOR PROGRAMMING

VECTOR INSTRUCTIONS

Matrix Multiplication
The multiplication of two n × n matrices consists of $n^2$ inner products or $n^3$ multiply-add operations.
Example: product of two 3 × 3 matrices
$$c_{11} = a_{11} b_{11} + a_{12} b_{21} + a_{13} b_{31}$$
This requires 3 multiplications and 3 additions (with $c_{11}$ initialized to 0). The total number of multiply-add operations required to compute the matrix product is 9 × 3 = 27.
In general, the inner product consists of the sum of k product terms of the form
$$C = A_1 B_1 + A_2 B_2 + A_3 B_3 + \cdots + A_k B_k$$
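The operation counts above can be checked with a short sketch that multiplies two square matrices with plain loops and counts the inner products and multiply-add steps as it goes. The function name and the sample matrices are illustrative.

```python
def matmul_count(A, B):
    """Multiply two n x n matrices; return (C, inner_products, multiply_adds)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    inner_products = 0
    multiply_adds = 0
    for i in range(n):
        for j in range(n):
            inner_products += 1               # one inner product per element of C
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]  # one multiply-add
                multiply_adds += 1
    return C, inner_products, multiply_adds


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C, inner_products, multiply_adds = matmul_count(A, I)
print(inner_products, multiply_adds)  # 9 27
```

For n = 3 the counts are 3² = 9 inner products and 3³ = 27 multiply-adds, matching the example.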

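$$
\begin{aligned}
C ={} & A_1 B_1 + A_5 B_5 + A_9 B_9 + A_{13} B_{13} + \cdots \\
      & + A_2 B_2 + A_6 B_6 + A_{10} B_{10} + A_{14} B_{14} + \cdots \\
      & + A_3 B_3 + A_7 B_7 + A_{11} B_{11} + A_{15} B_{15} + \cdots \\
      & + A_4 B_4 + A_8 B_8 + A_{12} B_{12} + A_{16} B_{16} + \cdots
\end{aligned}
$$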
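The grouping in this sum takes every fourth product term into the same row. One common reading, assumed here, is that a pipelined multiply-add unit with a four-segment adder accumulates four partial sums in turn and adds them together at the end. A minimal sketch of that accumulation order follows; `lanes` is a hypothetical parameter.

```python
def interleaved_inner_product(A, B, lanes=4):
    """Accumulate the inner product in `lanes` partial sums, then combine them.

    Product term i (0-based) is added to partial sum i % lanes, which matches
    the grouping A1B1 + A5B5 + ..., A2B2 + A6B6 + ..., and so on.
    """
    partial = [0] * lanes
    for i, (a, b) in enumerate(zip(A, B)):
        partial[i % lanes] += a * b
    return sum(partial), partial


A = [1, 2, 3, 4, 5, 6, 7, 8]
B = [1, 1, 1, 1, 1, 1, 1, 1]
total, partial = interleaved_inner_product(A, B)
print(partial, total)  # [6, 8, 10, 12] 36
```

The result equals the ordinary inner product; only the order of the additions changes, which is what lets consecutive additions overlap in a pipelined adder.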

VECTOR INSTRUCTION FORMAT

MULTIPLE MEMORY MODULE AND INTERLEAVING

ARRAY PROCESSOR

attached array processor with host computer

SIMD array processor Organization

Don’t forget, try to solve the questions of the chapter