Chapter One: Introduction to Pipelined Processors


Händler's Classification
• Based on the level of processing, pipelined processors can be classified as:
  1. Arithmetic Pipelining
  2. Instruction Pipelining
  3. Processor Pipelining

Arithmetic Pipelining
• The arithmetic logic units of a computer can be segmented for pipelined operations in various data formats.
• Example: Star 100
  – It has two pipelines where arithmetic operations are performed.
  – First: floating-point adder and multiplier
  – Second: multifunctional
    • All scalar instructions
    • Floating-point adder, multiplier and divider
  – Both pipelines are 64-bit and can be split into four 32-bit pipelines at the cost of precision.

[Figure: Star 100 architecture]

Instruction Pipelining
• The execution of a stream of instructions can be pipelined by overlapping the execution of the current instruction with the fetch, decode and operand fetch of subsequent instructions.
• It is also called instruction look-ahead.

Example: 8086
• The organization of the 8086 into a separate Bus Interface Unit (BIU) and Execution Unit (EU) allows the fetch and execute cycles to overlap. This is called pipelining.
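The overlap described above can be illustrated with a small timing sketch. The stage names come from the text; the one-cycle-per-stage timing, the absence of stalls, and the function names `schedule` and `total_cycles` are assumptions made for illustration, and this is not a model of the 8086 itself.

```python
STAGES = ["fetch", "decode", "operand fetch", "execute"]  # stage names from the text


def schedule(num_instructions, stages=STAGES):
    """Return {instruction index: {stage: cycle}} for an ideal pipeline.

    Assumption: every stage takes one cycle and nothing ever stalls, so
    instruction i enters stage s in cycle i + s.
    """
    return {
        i: {stage: i + s for s, stage in enumerate(stages)}
        for i in range(num_instructions)
    }


def total_cycles(num_instructions, num_stages=len(STAGES), pipelined=True):
    """Cycles needed to finish all instructions under the same assumptions."""
    if num_instructions == 0:
        return 0
    if pipelined:
        # The first instruction fills the pipe, then one completes per cycle.
        return num_stages + num_instructions - 1
    return num_stages * num_instructions


if __name__ == "__main__":
    for i, timing in schedule(3).items():
        print(f"I{i}: {timing}")
    print(total_cycles(3), "cycles pipelined,",
          total_cycles(3, pipelined=False), "cycles without overlap")
```

Under these idealized assumptions, n instructions on a k-stage pipeline finish in k + n - 1 cycles instead of k × n.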

Processor Pipelining
• This refers to the processing of the same data stream by a cascade of processors, each of which performs a specific task.
• The data stream passes through the first processor, with the results stored in a memory block that is also accessible by the second processor.
• The second processor then passes its refined results to the third, and so on.
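A minimal sketch of such a cascade follows. Each "processor" is modelled as a plain Python function and each shared memory block as a queue; the name `run_cascade`, the one-item-per-step timing and the example tasks are assumptions for illustration only.

```python
from collections import deque


def run_cascade(data_stream, tasks):
    """Simulate a cascade of processors linked by shared memory blocks.

    blocks[k] is the memory block read by processor k and written by
    processor k - 1; blocks[-1] collects the final results.
    Returns (results, number of time steps taken).
    """
    blocks = [deque(data_stream)] + [deque() for _ in tasks]
    expected = len(blocks[0])
    steps = 0
    while len(blocks[-1]) < expected:
        # Visit processors from last to first so that an item advances
        # by exactly one processor per time step.
        for k in reversed(range(len(tasks))):
            if blocks[k]:
                blocks[k + 1].append(tasks[k](blocks[k].popleft()))
        steps += 1
    return list(blocks[-1]), steps


if __name__ == "__main__":
    # Three hypothetical tasks standing in for three processors.
    tasks = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(run_cascade([1, 2, 3, 4], tasks))
```

Once the cascade is full, all processors work on different items of the stream in the same time step, which is where the speed-up comes from.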


Li and Ramamoorthy's Classification
• According to pipeline configurations and control strategies, Li and Ramamoorthy classify pipelines under three schemes:
  – Unifunction vs. multifunction pipelines
  – Static vs. dynamic pipelines
  – Scalar vs. vector pipelines

Unifunction vs. Multifunction Pipelines

Unifunctional Pipelines
• A pipeline unit with a fixed and dedicated function is called unifunctional.
• Example: Cray-1 (supercomputer, 1976)
• It has 12 unifunctional pipelines, described in four groups:
  – Address Functional Units
    • Address Add Unit
    • Address Multiply Unit
  – Scalar Functional Units
    • Scalar Add Unit
    • Scalar Shift Unit
    • Scalar Logical Unit
    • Population/Leading Zero Count Unit
  – Vector Functional Units
    • Vector Add Unit
    • Vector Shift Unit
    • Vector Logical Unit
  – Floating Point Functional Units
    • Floating Point Add Unit
    • Floating Point Multiply Unit
    • Reciprocal Approximation Unit

[Figures: Cray-1 architecture]

Multifunctional Pipelines
• A multifunction pipe may perform different functions, either at different times or at the same time, by interconnecting different subsets of stages in the pipeline.
• Example: 4X TI-ASC (supercomputer, 1973)

4X TI-ASC
• It has four multifunction pipeline processors, each of which is reconfigurable for a variety of arithmetic or logic operations at different times.
• Its central processor comprises nine units:
  – one instruction processing unit (IPU)
  – four memory buffer units
  – four arithmetic units
• Thus it provides four parallel execution pipelines below the IPU.
• Any mixture of scalar and vector instructions can be executed simultaneously in the four pipes.

[Figure: architecture overview of the 4X TI-ASC]

Static vs. Dynamic Pipelines

Static Pipeline
• It may assume only one functional configuration at a time.
• It can be either unifunctional or multifunctional.
• Static pipelines are preferred when instructions of the same type are to be executed continuously.
• A unifunction pipe must be static.

Dynamic Pipeline
• It permits several functional configurations to exist simultaneously.
• A dynamic pipeline must be multifunctional.
• The dynamic configuration requires more elaborate control and sequencing mechanisms than static pipelining.
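The two constraints stated above (a unifunction pipe must be static; a dynamic pipeline must be multifunctional) leave three legal combinations out of four. The sketch below only encodes those rules; the class `PipelineKind` and the helper `valid_kinds` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineKind:
    multifunction: bool  # False means unifunction
    dynamic: bool        # False means static

    def is_valid(self):
        """A dynamic pipeline must be multifunction, so unifunction implies static."""
        return self.multifunction or not self.dynamic


def valid_kinds():
    """Enumerate every combination that satisfies the rule above."""
    candidates = (
        PipelineKind(m, d) for m in (False, True) for d in (False, True)
    )
    return [kind for kind in candidates if kind.is_valid()]


if __name__ == "__main__":
    for kind in valid_kinds():
        print(kind)
```

The only combination rejected is "unifunction and dynamic".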

Scalar vs. Vector Pipelines

Scalar Pipeline
• It processes a sequence of scalar operands under the control of a DO loop.
• Instructions in a small DO loop are often prefetched into the instruction buffer.
• The required scalar operands are moved into a data cache to continuously supply the pipeline with operands.
• Example: IBM System/360 Model 91

IBM System/360 Model 91
• In this computer, buffering plays a major role.
• Instruction fetch buffering:
  – Provides the capacity to hold program loops of meaningful size.
  – Upon encountering a loop that fits, the buffer locks onto the loop, and subsequent branching requires less time.
• Operand fetch buffering:
  – Provides a queue into which storage can dump operands and from which execution units can fetch operands.
  – This improves operand fetching for storage-to-register and storage-to-storage instruction types.

[Figure: architecture overview of the IBM System/360 Model 91]

Vector Pipelines
• They are specially designed to handle vector instructions over vector operands.
• Computers having vector instructions are called vector processors.
• The design of a vector pipeline is expanded from that of a scalar pipeline.
• The handling of vector operands in vector pipelines is under firmware and hardware control.
• Example: Cray-1

Linear pipeline (Static & Unifunctional)
• In a linear pipeline, data flows from one stage to the next, every stage is used exactly once per computation, and the pipeline performs a single functional evaluation.

Non-linear pipeline
• In a floating-point adder, stages (2) and (4) each need a shift register.
• The same shift register can serve both, which leaves only 3 stages.
• This requires feedback from the third stage to the second stage.
• Further, the same pipeline can be used to perform fixed-point addition.
• A pipeline with feed-forward and/or feedback connections is called non-linear.

Example: 3-stage non-linear pipeline
[Figure: 3-stage non-linear pipeline with one input, stages Sa, Sb and Sc, and two outputs, Output A and Output B]
• It has 3 stages, Sa, Sb and Sc, and latches.
• Multiplexers (crossed circles) can take more than one input and pass one of the inputs to the output.
• The outputs of the stages are tapped and used for feedback and feed-forward.

3-stage non-linear pipeline
• The above pipeline can perform a variety of functions.
• Each functional evaluation can be represented by a particular sequence of stage usage.
• Some examples are:
  1. Sa, Sb, Sc
  2. Sa, Sb, Sc, Sa
  3. Sa, Sc, Sb, Sa, Sb, Sc

Reservation Table
• Each functional evaluation can be represented using a diagram called a Reservation Table (RT).
• It is the space-time diagram of a pipeline corresponding to one functional evaluation.
• X axis: time units
• Y axis: stages
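A reservation table can be derived mechanically from a stage-usage sequence. The sketch below assumes the function visits exactly one stage per time unit, in the order listed; the function names `reservation_table` and `render` and the dictionary representation are illustrative choices, not part of the original material.

```python
def reservation_table(sequence, stages=("Sa", "Sb", "Sc")):
    """Map each stage to the list of time units in which it is used.

    Assumption: one stage is visited per time unit, in sequence order.
    """
    table = {stage: [] for stage in stages}
    for t, stage in enumerate(sequence):
        table[stage].append(t)
    return table


def render(table, mark="X"):
    """Format the table as text: one row per stage, one column per time unit."""
    width = 1 + max((t for times in table.values() for t in times), default=-1)
    lines = ["     " + " ".join(f"{t:>2}" for t in range(width))]
    for stage, times in table.items():
        cells = " ".join(f"{mark if t in times else '.':>2}" for t in range(width))
        lines.append(f"{stage:<5}" + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # The six-step sequence from the examples listed earlier.
    table = reservation_table(["Sa", "Sc", "Sb", "Sa", "Sb", "Sc"])
    print(render(table, mark="B"))
```

Rows are stages and columns are time units, matching the Y and X axes described above.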

Reservation Table
• For the sequence Sa, Sb, Sc, Sa, called function A, we have:
[Reservation table for function A: rows Sa, Sb and Sc; columns for time units 0 to 5; used cells marked A]

Reservation Table
• For the sequence Sa, Sc, Sb, Sa, Sb, Sc, called function B, we have:

  Time    0   1   2   3   4   5
  Sa      B           B
  Sb              B       B
  Sc          B               B

[Figure: the 3-stage non-linear pipeline next to an empty reservation table with rows Sa, Sb, Sc and time columns 0 to 5]

Function A
[Figure sequence: the reservation table for function A is filled in one mark at a time, from time unit 0 to time unit 5]

Function B
[Figure sequence: the reservation table for function B (Sa, Sc, Sb, Sa, Sb, Sc) is filled in one mark at a time, from time unit 0 to time unit 5]

Reservation Table
• After a function is started, its stages need to be reserved in the corresponding time units.
• Each function supported by a multifunction pipeline is represented by a different RT.
• The time taken for a function evaluation, in units of the clock period, is the compute time. (For A and B, it is 6.)
• More than one mark in the same row means a stage is used more than once.
• More than one mark in the same column means more than one stage is used at the same time.
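These properties can be read off a reservation table programmatically. The sketch below represents a table as a mapping from stage name to the time units in which the stage is used; that representation and the helper names are assumptions for illustration. `TABLE_B` encodes function B (Sa, Sc, Sb, Sa, Sb, Sc), assuming one stage per time unit.

```python
def compute_time(table):
    """Clock periods spanned by one evaluation (last marked column + 1)."""
    return 1 + max(t for times in table.values() for t in times)


def reused_stages(table):
    """Stages with more than one mark in their row."""
    return [stage for stage, times in table.items() if len(times) > 1]


def parallel_time_units(table):
    """Time units with more than one mark in their column."""
    counts = {}
    for times in table.values():
        for t in times:
            counts[t] = counts.get(t, 0) + 1
    return sorted(t for t, n in counts.items() if n > 1)


# Function B from the text, assuming one stage per time unit.
TABLE_B = {"Sa": [0, 3], "Sb": [2, 4], "Sc": [1, 5]}

if __name__ == "__main__":
    print("compute time:", compute_time(TABLE_B))
    print("stages used more than once:", reused_stages(TABLE_B))
    print("time units with parallel stage use:", parallel_time_units(TABLE_B))
```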

Multifunction pipelines
• The hardware of a multifunction pipeline should be reconfigurable.
• A multifunction pipeline can be static or dynamic.
• Static:
  – It is initially configured for one functional evaluation.
  – For another function, the pipeline needs to be drained and reconfigured.
  – It cannot accept two inputs for different functions at the same time.
• Dynamic:
  – It can perform different functional evaluations at the same time.
  – It is difficult to control, because we must be sure that there is no conflict in the usage of stages.
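One way to picture the conflict check is to overlay two reservation tables, shifted by the delay between the two initiations, and look for a stage that both need in the same time unit. This is a minimal sketch under that reading; the function name `conflicts` and the table representation are assumptions. Only function B's table is used here, since it follows directly from its stage sequence, but the same check applies to the tables of two different functions.

```python
def conflicts(table_first, table_second, delay):
    """Return (stage, time) pairs where two evaluations need the same stage.

    table_first starts at time 0; table_second starts `delay` time units
    later. Each table maps a stage name to the time units it is used in.
    """
    clashes = []
    for stage, times in table_first.items():
        later = {t + delay for t in table_second.get(stage, [])}
        clashes.extend((stage, t) for t in sorted(later.intersection(times)))
    return clashes


# Function B from the text (Sa, Sc, Sb, Sa, Sb, Sc), one stage per time unit.
TABLE_B = {"Sa": [0, 3], "Sb": [2, 4], "Sc": [1, 5]}

if __name__ == "__main__":
    for delay in range(1, 6):
        print(delay, conflicts(TABLE_B, TABLE_B, delay))
```

An empty result means the second evaluation can be started after that delay without two inputs contending for a stage.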