Advanced Computer Architecture Lecture 4 Pipelined Processor Principle
- Slides: 21
Principle of pipelining
Processing of a sequence of instructions using a basic pipeline
Pipelined and unpipelined processing
General structure of pipelines
Pipeline Performance Measures
- Cycle time: tc
  - is determined by the worst-case processing time of the longest stage
- Repetition rate: R
  - the shortest possible time interval between subsequent independent instructions in the pipeline
Pipeline Performance Measures (Example)
- Performance potential of a pipeline: P
  - P = 1/(R * tc)
- PowerPC 603, FP double-precision multiply
  - e.g. R = 2, tc = 12 ns
  - P = 1/(R * tc) = 1/(2 * 12 ns) ≈ 41.7 MFLOPS
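The formula P = 1/(R * tc) can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name and the choice of seconds as the unit are assumptions made for illustration, and R is taken to be a count of cycles.

```python
def performance_potential(repetition_rate: int, cycle_time_s: float) -> float:
    """Return P = 1 / (R * tc) in operations per second.

    repetition_rate: R, assumed to be expressed in cycles.
    cycle_time_s:    tc, in seconds.
    """
    if repetition_rate <= 0 or cycle_time_s <= 0:
        raise ValueError("R and tc must be positive")
    return 1.0 / (repetition_rate * cycle_time_s)


# Values from the slide: R = 2, tc = 12 ns.
p = performance_potential(2, 12e-9)
print(f"{p / 1e6:.1f} MFLOPS")  # prints "41.7 MFLOPS"
```

Halving either R or tc doubles P, which is why both a shorter cycle and a repetition rate of 1 are design goals.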
Performance: RAW-dependent
- Latency:
  - specifies the amount of time that the result of a particular instruction takes to become available in the pipeline for a subsequent dependent instruction.
- Define-use latency (10 to 100 cycles)
  - mul r1, r2, r3
  - add r5, r1, r4
Performance: RAW-dependent
- Load-use latency (1 to 3 cycles)
  - load r1, x
  - add r5, r1, r2
- Stalled: with a latency of n cycles, the immediately following RAW-dependent instruction has to be stalled in the pipeline for n-1 cycles
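The n-1 rule can be illustrated with a toy issue-time calculation. This is a simplified sketch, not a pipeline simulator: it assumes in-order issue of one instruction per cycle and only models a dependence on the immediately preceding instruction. The function names and the tuple layout are hypothetical.

```python
def stall_cycles(latency: int) -> int:
    """Stall cycles for an immediately following RAW-dependent instruction."""
    if latency < 1:
        raise ValueError("latency must be at least one cycle")
    return latency - 1


def issue_cycles(program):
    """Return the cycle in which each instruction is issued.

    program: list of (text, latency, depends_on_previous) tuples.
    Assumes in-order issue, one instruction per cycle when not stalled.
    """
    cycles = []
    for index, (_text, _latency, depends_on_previous) in enumerate(program):
        if index == 0:
            cycles.append(0)
            continue
        previous_latency = program[index - 1][1]
        stall = stall_cycles(previous_latency) if depends_on_previous else 0
        cycles.append(cycles[-1] + 1 + stall)
    return cycles


# Load-use example from the slide, assuming a load-use latency of 3 cycles.
program = [
    ("load r1, x", 3, False),
    ("add r5, r1, r2", 1, True),
]
print(issue_cycles(program))  # prints [0, 3]: the add is stalled for 2 cycles
```

With a latency of 1 cycle the dependent instruction issues in the very next cycle and no stall occurs.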
Improve Performance
- There is a difference between
  - additions/subtractions and multiplications
  - integer and double-precision operations
Design space of pipelines
- Key aspects of the design space of pipelines
Basic layout of a pipeline
- Design space of the overall stage layout
Increasing parallelism by raising the number of pipeline stages
Eight-stage pipeline
Problems arise for more stages
- Data and control dependencies occur more frequently
  - instructions stall and wait for data
  - the pipe must be reloaded in case of a branch
- Subtasks become less balanced (in execution time)
  - cycle time is determined by the worst-case processing time of the longest stage
- In most cases
  - 5-10 stages
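The cost of unbalanced stages can be made concrete with a small calculation. The sketch below is an idealized model under stated assumptions: no stalls, no latch overhead, and an unpipelined design that takes the sum of all stage delays per instruction. The stage delays are made-up illustrative numbers, not measurements.

```python
def cycle_time(stage_delays):
    """Cycle time is set by the slowest stage."""
    return max(stage_delays)


def speedup(stage_delays, instruction_count):
    """Ideal speedup of a pipeline over unpipelined execution.

    Unpipelined: every instruction takes sum(stage_delays).
    Pipelined:   (stages + instructions - 1) cycles of length max(stage_delays).
    """
    unpipelined = instruction_count * sum(stage_delays)
    cycles = len(stage_delays) + instruction_count - 1
    return unpipelined / (cycles * cycle_time(stage_delays))


balanced = [2, 2, 2, 2, 2]    # ns per stage, hypothetical
unbalanced = [1, 2, 4, 2, 1]  # same total work, hypothetical

print(round(speedup(balanced, 1000), 2))    # prints 4.98
print(round(speedup(unbalanced, 1000), 2))  # prints 2.49
```

Both pipelines do 10 ns of work per instruction in five stages, but the unbalanced one runs at the pace of its 4 ns stage and so gains only about half the speedup.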
Pipelines, e.g. DEC 21064
Layout of the stage sequence
Bypasses (data forwarding in RAW)
Unless special arrangements are made,
- the result of the instruction is written into the register file, or into memory,
- and then it is fetched from there as a source operand.
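The idea of a bypass can be sketched as an operand read that prefers a forwarded result over the register file. This is a toy model for illustration only: the `read_operand` helper, the dictionary register file, and the single bypass slot are all assumptions, and real hardware does this with multiplexers in front of the execution unit inputs.

```python
def read_operand(register, register_file, bypass=None):
    """Return the value of a source operand.

    bypass: optional (register_name, value) pair holding a result that has
    been computed but not yet written back to the register file.
    """
    if bypass is not None and bypass[0] == register:
        return bypass[1]  # forwarded directly from the producing unit
    return register_file[register]


# mul r1, r2, r3 followed by add r5, r1, r4 (the define-use example).
register_file = {"r1": 0, "r2": 6, "r3": 7, "r4": 1}

# The multiply result exists, but r1 in the register file is still stale.
product = register_file["r2"] * register_file["r3"]
bypass = ("r1", product)

forwarded = (read_operand("r1", register_file, bypass)
             + read_operand("r4", register_file, bypass))
stale = read_operand("r1", register_file) + read_operand("r4", register_file)

print(forwarded, stale)  # prints "43 1"
```

Without the bypass, the add would either read the stale r1 or have to wait until the multiply result is written back and fetched again.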
Principle of bypassing in define-use and load-use conflicts
Possibilities for the timing of pipeline operation
Dependency Resolution (Pipeline Hazards)