CSE 431 Computer Architecture Fall 2005 Lecture 06

CSE 431 Computer Architecture, Fall 2005
Lecture 06: Basic MIPS Pipelining Review
Mary Jane Irwin (www.cse.psu.edu/~mji)
www.cse.psu.edu/~cg431
[Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, UCB]
CSE 431 L06 Basic MIPS Pipelining.1, Irwin, PSU, 2005

Review: Single Cycle vs. Multiple Cycle Timing

Single Cycle Implementation:
[Figure: Clk timeline with lw in Cycle 1 and sw in Cycle 2; the unused part of the sw cycle is marked "Waste".]

Multiple Cycle Implementation:
[Figure: Clk timeline over Cycles 1 to 10. lw: IFetch, Dec, Exec, Mem, WB; sw: IFetch, Dec, Exec, Mem; R-type: IFetch.]

The multicycle clock is slower than 1/5th of the single cycle clock due to stage register overhead.

How Can We Make It Even Faster?

- Split the multiple instruction cycle into smaller and smaller steps
  - There is a point of diminishing returns where as much time is spent loading the state registers as doing the work
- Start fetching and executing the next instruction before the current one has completed
  - Pipelining: (all?) modern processors are pipelined for performance
  - Remember the performance equation: CPU time = CPI * CC * IC
- Fetch (and execute) more than one instruction at a time
  - Superscalar processing: stay tuned
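The performance equation can be tried out with a few lines of Python. This is only a sketch: the function name `cpu_time` and all the sample numbers are illustrative assumptions, not values from the lecture.

```python
def cpu_time(instruction_count, cpi, clock_cycle_ns):
    """CPU time = IC * CPI * CC, returned in nanoseconds."""
    return instruction_count * cpi * clock_cycle_ns


# Hypothetical workload of 1,000 instructions on two hypothetical designs
# that share the same clock cycle but differ in cycles per instruction.
multicycle = cpu_time(1000, cpi=4.0, clock_cycle_ns=2.0)
pipelined = cpu_time(1000, cpi=1.0, clock_cycle_ns=2.0)  # ideal CPI of 1

print(multicycle, pipelined, multicycle / pipelined)
```

With the clock cycle and instruction count held fixed, the ratio of the two CPU times is just the ratio of the CPIs, which is why pipelining targets CPI.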

A Pipelined MIPS Processor

- Start the next instruction before the current one has completed
  - improves throughput: total amount of work done in a given time
  - instruction latency (execution time, delay time, response time: the time from the start of an instruction to its completion) is not reduced

[Figure: pipelined execution of lw, sw, and an R-type instruction over Cycles 1 to 8; each instruction passes through IFetch, Dec, Exec, Mem, WB.]

  - clock cycle (pipeline stage time) is limited by the slowest stage
  - for some instructions, some stages are wasted cycles

Single Cycle, Multiple Cycle, vs. Pipeline

Single Cycle Implementation:
[Figure: Clk timeline with lw in Cycle 1 and sw in Cycle 2; the unused part of the sw cycle is marked "Waste".]

Multiple Cycle Implementation:
[Figure: Clk timeline over Cycles 1 to 10. lw: IFetch, Dec, Exec, Mem, WB; sw: IFetch, Dec, Exec, Mem; R-type: IFetch.]

Pipeline Implementation:
[Figure: lw, sw, and R-type overlapped, each passing through IFetch, Dec, Exec, Mem, WB.]
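The comparison can be approximated numerically. The sketch below is not the lecture's own calculation: it assumes every stage takes a hypothetical 200 ps, ignores state register overhead, and assumes the usual multicycle step counts (5 for lw, 4 for sw and R-type).

```python
STAGES = ["IFetch", "Dec", "Exec", "Mem", "WB"]
STAGE_TIME_PS = 200  # hypothetical, identical for every stage
MULTICYCLE_STEPS = {"lw": 5, "sw": 4, "R-type": 4}  # assumed step counts


def single_cycle_time(program):
    # One long cycle per instruction, sized for the longest instruction (lw).
    return len(program) * len(STAGES) * STAGE_TIME_PS


def multicycle_time(program):
    # Each instruction uses only the short cycles it needs.
    return sum(MULTICYCLE_STEPS[op] for op in program) * STAGE_TIME_PS


def pipelined_time(program):
    # The first instruction fills the pipe; each later one adds one cycle.
    return (len(STAGES) + len(program) - 1) * STAGE_TIME_PS


program = ["lw", "sw", "R-type"]
print(single_cycle_time(program))  # 3000
print(multicycle_time(program))    # 2600
print(pipelined_time(program))     # 1400
```

Under these assumptions the pipelined version finishes the three instructions in 7 short cycles, while the multicycle version needs 13.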

MIPS Pipeline Datapath Modifications

- What do we need to add/modify in our MIPS datapath?
  - State registers between each pipeline stage to isolate them

Stages: IF: IFetch, ID: Dec, EX: Execute, MEM: MemAccess, WB: WriteBack

[Figure: pipelined datapath with PC, Instruction Memory, Register File, Sign Extend, Shift left 2, two adders, ALU, and Data Memory, separated by the IFetch/Dec, Dec/Exec, Exec/Mem, and Mem/WB state registers and driven by the System Clock.]

Pipelining the MIPS ISA

- What makes it easy
  - all instructions are the same length (32 bits)
    - can fetch in the 1st stage and decode in the 2nd stage
  - few instruction formats (three) with symmetry across formats
    - can begin reading register file in 2nd stage
  - memory operations can occur only in loads and stores
    - can use the execute stage to calculate memory addresses
  - each MIPS instruction writes at most one result (i.e., changes the machine state) and does so near the end of the pipeline (MEM and WB)
- What makes it hard
  - structural hazards: what if we had only one memory?
  - control hazards: what about branches?
  - data hazards: what if an instruction’s input operands depend on the output of a previous instruction?

Graphically Representing MIPS Pipeline

[Figure: one instruction drawn as the stage sequence IM, Reg, ALU, DM, Reg.]

- Can help with answering questions like:
  - How many cycles does it take to execute this code?
  - What is the ALU doing during cycle 4?
  - Is there a hazard, why does it occur, and how can it be fixed?
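The first two questions can be answered mechanically from the diagram. Here is a rough Python sketch, assuming an ideal five-stage pipeline with no stalls; the helper names and the instruction labels are made up for illustration.

```python
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]


def stage_index(instr_index, cycle):
    """Stage position (0-based) of a 0-based instruction in a 1-based cycle, or None."""
    pos = (cycle - 1) - instr_index
    return pos if 0 <= pos < len(STAGES) else None


def total_cycles(num_instructions):
    return len(STAGES) + num_instructions - 1


def draw(names):
    """Return a text pipeline diagram, one row per instruction."""
    rows = []
    for i, name in enumerate(names):
        cells = []
        for cycle in range(1, total_cycles(len(names)) + 1):
            pos = stage_index(i, cycle)
            cells.append((STAGES[pos] if pos is not None else "").ljust(4))
        rows.append(name.ljust(8) + "".join(cells).rstrip())
    return "\n".join(rows)


def instruction_in_alu(names, cycle):
    """Name of the instruction occupying the ALU stage in the given cycle."""
    alu = STAGES.index("ALU")
    for i, name in enumerate(names):
        if stage_index(i, cycle) == alu:
            return name
    return None


names = ["Inst 0", "Inst 1", "Inst 2"]
print(draw(names))
print(total_cycles(len(names)))      # 7
print(instruction_in_alu(names, 4))  # Inst 1
```

The third question (hazards) needs register information as well as stage timing, so it is not covered by this sketch.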

Why Pipeline? For Performance!

[Figure: pipeline diagram, time (clock cycles) against instruction order, for Inst 0 through Inst 4, each passing through IM, Reg, ALU, DM, Reg; the initial cycles are labeled "Time to fill the pipeline".]

Once the pipeline is full, one instruction is completed every cycle, so CPI = 1.
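How much the fill time matters depends on how long the code runs. The following Python sketch assumes an ideal five-stage pipeline with no stalls; the function names are illustrative.

```python
def cycles_to_run(num_instructions, depth=5):
    """Cycles needed when the pipe starts empty: (depth - 1) to fill, then one per instruction."""
    return (depth - 1) + num_instructions


def effective_cpi(num_instructions, depth=5):
    return cycles_to_run(num_instructions, depth) / num_instructions


for n in (5, 100, 10000):
    print(n, cycles_to_run(n), effective_cpi(n))
```

For the five instructions in the figure the run takes 9 cycles, so the effective CPI is well above 1; it approaches 1 only as the instruction count grows.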

Can Pipelining Get Us Into Trouble?

- Yes: Pipeline Hazards
  - structural hazards: attempt to use the same resource by two different instructions at the same time
  - data hazards: attempt to use data before it is ready
    - An instruction’s source operand(s) are produced by a prior instruction still in the pipeline
  - control hazards: attempt to make a decision about program control flow before the condition has been evaluated and the new PC target address calculated
    - branch instructions
- Can always resolve hazards by waiting
  - pipeline control must detect the hazard
  - and take action to resolve hazards

A Single Memory Would Be a Structural Hazard

[Figure: pipeline diagram, time (clock cycles) against instruction order, for lw, Inst 1, Inst 2, Inst 3, Inst 4 sharing a single Mem; lw is reading data from memory in the same cycle in which a later instruction is reading its instruction from memory.]

- Fix with separate instr and data memories (I$ and D$)

How About Register File Access?

[Figure: pipeline diagram, time (clock cycles) against instruction order, for add $1, ; Inst 1; Inst 2; add $2, $1, ; annotated with the clock edge that controls register writing and the clock edge that controls loading of pipeline state registers.]

Fix register file access hazard by doing reads in the second half of the cycle and writes in the first half.

Register Usage Can Cause Data Hazards

- Dependencies backward in time cause hazards

[Figure: pipeline diagram for the instruction sequence below.]

add $1,
sub $4, $1, $5
and $6, $1, $7
or  $8, $1, $9
xor $4, $1, $5

- Read before write data hazard
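Detecting these dependencies amounts to comparing each instruction's source registers with the destinations of the instructions just ahead of it. The Python sketch below makes several assumptions: a five-stage pipeline without forwarding, a register file that writes in the first half of the cycle and reads in the second half (so only the two preceding instructions matter), and a tuple encoding of instructions invented for this example. The add's source operands are elided on the slide and are left empty here.

```python
# (opcode, destination register, source registers)
program = [
    ("add", "$1", ()),             # source operands elided on the slide
    ("sub", "$4", ("$1", "$5")),
    ("and", "$6", ("$1", "$7")),
    ("or",  "$8", ("$1", "$9")),
    ("xor", "$4", ("$1", "$5")),
]


def raw_hazards(instrs, window=2):
    """List (producer, consumer, register) triples whose dependence goes backward in time.

    window=2 reflects the assumption that a consumer three or more
    instructions behind the producer reads the register file in the same
    cycle as, or after, the producer's write back.
    """
    hazards = []
    for j, (consumer, _, sources) in enumerate(instrs):
        for i in range(max(0, j - window), j):
            producer, dest, _ = instrs[i]
            if dest in sources:
                hazards.append((producer, consumer, dest))
    return hazards


print(raw_hazards(program))
# [('add', 'sub', '$1'), ('add', 'and', '$1')]
```

Under these assumptions only sub and and read $1 too early; or and xor read it in or after the cycle in which add writes it back.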

Loads Can Cause Data Hazards

- Dependencies backward in time cause hazards

[Figure: pipeline diagram for the instruction sequence below.]

lw  $1, 4($2)
sub $4, $1, $5
and $6, $1, $7
or  $8, $1, $9
xor $4, $1, $5

- Load-use data hazard

One Way to “Fix” a Data Hazard

[Figure: pipeline diagram for add $1, ; stall; sub $4, $1, $5; and $6, $1, $7.]

Can fix data hazard by waiting (stall), but this impacts CPI.

Another Way to “Fix” a Data Hazard

[Figure: pipeline diagram for the instruction sequence below, with results forwarded between stages.]

add $1,
sub $4, $1, $5
and $6, $1, $7
or  $8, $1, $9
xor $4, $1, $5

Fix data hazards by forwarding results as soon as they are available to where they are needed.

Forwarding with Load-use Data Hazards

[Figure: pipeline diagram for the instruction sequence below.]

lw  $1, 4($2)
sub $4, $1, $5
and $6, $1, $7
or  $8, $1, $9
xor $4, $1, $5

- Will still need one stall cycle even with forwarding
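The stall counts for the two fixes can be written down as a small rule. This Python sketch assumes the classic five-stage pipeline, full forwarding into the ALU inputs when forwarding is enabled, and a register file that writes in the first half of the cycle and reads in the second half; `stalls_needed` is a hypothetical helper, not part of any tool.

```python
def stalls_needed(producer_op, distance, forwarding):
    """Stall cycles for a consumer `distance` instructions after its producer.

    distance == 1 means the consumer immediately follows the producer.
    """
    if forwarding:
        # ALU results can be forwarded in time for the next instruction;
        # load data only exists after the Mem stage, costing one cycle.
        gap = 1 if producer_op == "lw" else 0
    else:
        # Without forwarding, the consumer's register read must wait for the
        # producer's write back: two intervening cycles are required.
        gap = 2
    return max(0, gap - (distance - 1))


print(stalls_needed("add", 1, forwarding=False))  # 2
print(stalls_needed("add", 1, forwarding=True))   # 0
print(stalls_needed("lw", 1, forwarding=True))    # 1
print(stalls_needed("lw", 2, forwarding=True))    # 0
```

The third call corresponds to the lw/sub pair on this slide: forwarding removes all but one stall cycle.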

Branch Instructions Cause Control Hazards

- Dependencies backward in time cause hazards

[Figure: pipeline diagram, instruction order beq, lw, Inst 3, Inst 4, each passing through IM, Reg, ALU, DM, Reg.]

One Way to “Fix” a Control Hazard

[Figure: pipeline diagram, instruction order beq, stall, stall, lw, Inst 3.]

Fix branch hazard by waiting (stall), but this affects CPI.
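The cost of stalling on branches can be estimated with a one-line model. The Python sketch below assumes a base CPI of 1 and treats the branch fraction and the stall penalty as hypothetical inputs; neither number comes from the lecture.

```python
def cpi_with_branch_stalls(branch_fraction, stall_cycles, base_cpi=1.0):
    """Average CPI when every branch costs a fixed number of stall cycles."""
    return base_cpi + branch_fraction * stall_cycles


# Hypothetical mix: one instruction in four is a branch, two stalls per branch.
print(cpi_with_branch_stalls(0.25, 2))  # 1.5
```

Even a modest branch frequency moves the average CPI noticeably away from 1, which motivates cheaper ways of handling branches than always stalling.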

Corrected Datapath to Save RegWrite Addr

- Need to preserve the destination register address in the pipeline state registers

[Figure: pipelined datapath with PC, Instruction Memory, Register File, Sign Extend, Shift left 2, two adders, ALU, and Data Memory, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB state registers.]

MIPS Pipeline Control Path Modifications

- All control signals can be determined during Decode
  - and held in the state registers between pipeline stages

[Figure: pipelined datapath as before, with a Control unit added and the IF/ID, ID/EX, EX/MEM, and MEM/WB state registers carrying the control signals.]

Other Pipeline Structures Are Possible

- What about the (slow) multiply operation?
  - Make the clock twice as slow or …
  - let it take two cycles (since it doesn’t use the DM stage)

[Figure: pipeline stages IM, Reg, ALU, DM, Reg, with MUL labeled.]

- What if the data memory access is twice as slow as the instruction memory?
  - make the clock twice as slow or …
  - let data memory access take two cycles (and keep the same clock rate)

[Figure: pipeline stages IM, Reg, ALU, DM1, DM2, Reg.]

Sample Pipeline Alternatives

- ARM7
  - IM: PC update, IM access
  - Reg: decode, reg access
  - EX: ALU op, DM access, shift/rotate, commit result (write back)
- StrongARM-1
  - IM, Reg, ALU, DM, Reg
- XScale
  - IM1: PC update, BTB access, start IM access
  - IM2: IM access
  - Reg: decode, reg 1 access
  - SHFT: shift/rotate, reg 2 access
  - ALU: ALU op
  - DM1: start DM access, exception
  - DM2: DM write, reg write

Summary

- All modern day processors use pipelining
- Pipelining doesn’t help latency of a single task; it helps throughput of the entire workload
- Potential speedup: a CPI of 1 and a fast CC
- Pipeline rate limited by slowest pipeline stage
  - Unbalanced pipe stages make for inefficiencies
  - The time to “fill” the pipeline and the time to “drain” it can impact speedup for deep pipelines and short code runs
- Must detect and resolve hazards
  - Stalling negatively affects CPI (makes CPI greater than the ideal of 1)
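The effect of unbalanced stages and of fill time on speedup can be illustrated with a short calculation. In this Python sketch the stage latencies are hypothetical values chosen for the example, state register overhead is ignored, and the unpipelined machine is assumed to spend the sum of all stage latencies on every instruction.

```python
# Hypothetical stage latencies in picoseconds (not taken from the lecture).
STAGE_LATENCY_PS = {"IFetch": 200, "Dec": 100, "Exec": 200, "Mem": 200, "WB": 100}


def pipelined_clock_ps(latencies):
    # The clock must accommodate the slowest stage.
    return max(latencies.values())


def unpipelined_time_ps(latencies):
    return sum(latencies.values())


def steady_state_speedup(latencies):
    return unpipelined_time_ps(latencies) / pipelined_clock_ps(latencies)


def speedup_for_run(latencies, num_instructions):
    """Speedup for a finite run, including the cycles spent filling the pipe."""
    depth = len(latencies)
    unpipelined = num_instructions * unpipelined_time_ps(latencies)
    pipelined = (depth - 1 + num_instructions) * pipelined_clock_ps(latencies)
    return unpipelined / pipelined


print(steady_state_speedup(STAGE_LATENCY_PS))  # 4.0
print(speedup_for_run(STAGE_LATENCY_PS, 3))
print(speedup_for_run(STAGE_LATENCY_PS, 1000))
```

With these latencies a five-stage pipeline reaches a speedup of at most 4 rather than 5, and short runs fall further below that because of the fill time.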

Next Lecture and Reminders

- Next lecture
  - Overcoming data hazards
    - Reading assignment: PH, Chapter 6.4-6.5
- Reminders
  - HW2 due September 29th
  - SimpleScalar tutorials scheduled
    - Thursday, Sept 22, 5:30-6:30 pm in 218 IST
  - Evening midterm exam scheduled
    - Tuesday, October 18th, 20:15 to 22:15, Location 113 IST
    - You should have let me know by now if you have a conflict