Pipelining and Hazards Prof Hakim Weatherspoon CS 3410
Pipelining and Hazards Prof. Hakim Weatherspoon CS 3410, Spring 2015 Computer Science Cornell University See P&H Chapter: 4.6-4.8
Announcements
Prelim next week Tuesday at 7:30. Go to location based on netid:
• [a-g]* → MRS146: Morrison Hall 146
• [h-l]* → RRB125: Riley-Robb Hall 125
• [m-n]* → RRB105: Riley-Robb Hall 105
• [o-s]* → MVRG71: M Van Rensselaer Hall G71
• [t-z]* → MVRG73: M Van Rensselaer Hall G73
Prelim reviews: TODAY, Tue, Feb 24 @ 7:30 pm in Olin 255; Sat, Feb 28 @ 7:30 pm in Upson B17
Prelim conflicts: Contact Deniz Altinbuken <deniz@cs.cornell.edu>
Announcements
Prelim 1:
• Time: We will start at 7:30 pm sharp, so come early
• Location: on previous slide
• Closed Book
  • Cannot use electronic device or outside material
• Practice prelims are online in CMS
• Material covered: everything up to end of this week
  • Everything up to and including data hazards
  • Appendix B (logic, gates, FSMs, memory, ALUs)
  • Chapter 4 (pipelined [and non] MIPS processor with hazards)
  • Chapter 2 (Numbers / Arithmetic, simple MIPS instructions)
  • Chapter 1 (Performance)
  • HW 1, Lab 0, Lab 1, Lab 2, C-Lab 0, C-Lab 1
Goals for Today RISC and Pipelined Processor: Putting it all together Data Hazards • Data dependencies • Problem, detection, and solutions – (delaying, stalling, forwarding, bypass, etc) • Hazard detection unit • Forwarding unit Next time • Control Hazards What is the next instruction to execute if a branch is taken? Not taken?
MIPS Design Principles Simplicity favors regularity • 32 bit instructions Smaller is faster • Small register file Make the common case fast • Include support for constants Good design demands good compromises • Support for different type of interpretations/classes
Recall: MIPS instruction formats
All MIPS instructions are 32 bits long and use one of 3 formats:
R-type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | func (6 bits)
I-type: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
J-type: op (6 bits) | immediate (target address) (26 bits)
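The field layout above can be sketched as plain bit slicing. This is a minimal illustration, not part of the original slides: the function name `decode` and the constant `ADD_R3_R1_R2` are made up for the example, and the encoding uses the standard MIPS `add` function code 0x20.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into every field any format uses.

    The caller picks the fields that apply to the format (R, I, or J).
    """
    return {
        "op": (word >> 26) & 0x3F,       # bits 31-26, all formats
        "rs": (word >> 21) & 0x1F,       # bits 25-21, R and I
        "rt": (word >> 16) & 0x1F,       # bits 20-16, R and I
        "rd": (word >> 11) & 0x1F,       # bits 15-11, R only
        "shamt": (word >> 6) & 0x1F,     # bits 10-6,  R only
        "func": word & 0x3F,             # bits 5-0,   R only
        "imm16": word & 0xFFFF,          # bits 15-0,  I only
        "target26": word & 0x3FFFFFF,    # bits 25-0,  J only
    }


# Hypothetical example word: add r3, r1, r2 (R-type, op = 0, func = 0x20)
ADD_R3_R1_R2 = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | (0 << 6) | 0x20

fields = decode(ADD_R3_R1_R2)
print(fields["rs"], fields["rt"], fields["rd"])
```

Because every format keeps op, rs, and rt in the same bit positions, the decode stage can read the register file before it knows which format it is looking at.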
Recall: MIPS Instruction Types Arithmetic/Logical • R-type: result and two source registers, shift amount • I-type: 16-bit immediate with sign/zero extension Memory Access • load/store between registers and memory • word, half-word and byte operations Control flow • conditional branches: pc-relative addresses • jumps: fixed offsets, register absolute
Recall: MIPS Instruction Types Arithmetic/Logical • ADD, ADDU, SUBU, AND, OR, XOR, NOR, SLTU • ADDI, ADDIU, ANDI, ORI, XORI, LUI, SLL, SRL, SLLV, SRAV, SLTIU • MULT, DIV, MFLO, MTLO, MFHI, MTHI Memory Access • LW, LH, LB, LHU, LBU, LWL, LWR • SW, SH, SB, SWL, SWR Control flow • BEQ, BNE, BLEZ, BLTZ, BGEZ, BGTZ • J, JR, JALR, BEQL, BNEL, BLEZL, BGTZL Special • LL, SC, SYSCALL, BREAK, SYNC, COPROC
Pipelining Principle: Throughput increased by parallel execution Balanced pipeline very important Else slowest stage dominates performance Pipelining: • Identify pipeline stages • Isolate stages from each other • Resolve pipeline hazards (this and next lecture)
Basic Pipeline Five stage “RISC” load-store architecture 1. Instruction fetch (IF) – get instruction from memory, increment PC 2. Instruction Decode (ID) – translate opcode into control signals and read registers 3. Execute (EX) – perform ALU operation, compute jump/branch targets 4. Memory (MEM) – access memory if needed 5. Writeback (WB) – update register file
Pipelined Implementation • Each instruction goes through the 5 stages • Each stage takes one clock cycle • So slowest stage determines clock cycle time
Time Graphs
Clock cycle:  1    2    3    4    5    6    7    8    9
add           IF   ID   EX   MEM  WB
lw                 IF   ID   EX   MEM  WB
iClicker The pipeline achieves A) Latency: 1, throughput: 1 instr/cycle B) Latency: 5, throughput: 1 instr/cycle C) Latency: 1, throughput: 1/5 instr/cycle D) Latency: 5, throughput: 5 instr/cycle E) None of the above
Time Graphs
Clock cycle:  1    2    3    4    5    6    7    8    9
add           IF   ID   EX   MEM  WB
lw                 IF   ID   EX   MEM  WB
Latency: 5 cycles
Throughput: 1 instr/cycle
Concurrency: 5
CPI = 1
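The latency and throughput figures above follow from simple counting. The sketch below assumes an ideal pipeline with no hazards or stalls; the helper name `pipelined_cycles` is invented for illustration.

```python
def pipelined_cycles(n_instructions, n_stages=5):
    """Cycles to finish n instructions on an ideal (hazard-free) pipeline.

    The first instruction needs n_stages cycles (its latency); every later
    instruction completes one cycle after the one before it.
    """
    return n_stages + n_instructions - 1


# The two-instruction time graph (add, lw) finishes in cycle 6.
print(pipelined_cycles(2))

# For long instruction streams the cycles-per-instruction ratio approaches 1.
n = 1000
print(pipelined_cycles(n) / n)
```

Each instruction still takes 5 cycles from fetch to writeback; pipelining improves throughput, not the latency of a single instruction.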
Pipelined Implementation • Each instruction goes through the 5 stages • Each stage takes one clock cycle • So slowest stage determines clock cycle time • Stages must share information. How? • Add pipeline registers (flip-flops) to pass results between different stages
Pipelined Processor [Figure: single-cycle datapath divided into Fetch, Decode, Execute, Memory, and WB stages: PC with +4 adder and new pc, instruction memory, control, register file, immediate extend, ALU with jump/branch target computation, and data memory (addr, din, dout).]
A Pipelined Processor [Figure: the same datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB inserted between the Instruction Fetch, Instruction Decode, Execute, Memory, and Write-Back stages; each pipeline register carries data values (A, B, D, M, imm), the instruction/PC, and control (ctrl) signals.]
Pipelined Implementation • Each instruction goes through the 5 stages • Each stage takes one clock cycle • So slowest stage determines clock cycle time • Stages must share information. How? • Add pipeline registers (flip-flops) to pass results between different stages And is this it? Not quite….
Hazards 3 kinds • Structural hazards – Multiple instructions want to use same unit • Data hazards – Results of instruction needed before ready • Control hazards – Don’t know which side of branch to take Will get back to this First, how to pipeline when no hazards
Example: Sample Code (Simple)
add  r3, r1, r2;
nand r6, r4, r5;
lw   r4, 20(r2);
add  r5, r2, r5;
sw   r7, 12(r3);
Example: Sample Code (Simple)
Assume eight-register machine
Run the following code on a pipelined datapath
add  r3 r1 r2    ; reg 3 = reg 1 + reg 2
nand r6 r4 r5    ; reg 6 = ~(reg 4 & reg 5)
lw   r4 20(r2)   ; reg 4 = Mem[reg 2 + 20]
add  r5 r2 r5    ; reg 5 = reg 2 + reg 5
sw   r7 12(r3)   ; Mem[reg 3 + 12] = reg 7
Slides thanks to Sally McKee
[Figure: five-stage pipelined datapath used in the walkthrough: PC and +4 adder, instruction memory, register file (R0-R7), sign extend, ALU, data memory, muxes, and pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB.]
Time: 0 (Initial State). PC = 0; R0-R7 = 0, 36, 9, 12, 18, 7, 41, 22; every pipeline register holds a nop. At time 1, fetch add r3 r1 r2.
Time: 1. Fetch: add 3 1 2.
Time: 2. Fetch: nand 6 4 5. add 3 1 2 is in ID and reads R1 = 36, R2 = 9 (dest 3).
Time: 3. Fetch: lw 4 20(2). nand is in ID and reads R4 = 18, R5 = 7 (dest 6); add 3 1 2 is in EX, ALU result 36 + 9 = 45.
Time: 4. Fetch: add 5 2 5. lw is in ID and reads R2 = 9, offset 20 (dest 4); nand is in EX, ALU result -3; add 3 1 2 is in MEM.
Time: 5. Fetch: sw 7 12(3). add 5 2 5 is in ID and reads R2 = 9, R5 = 7 (dest 5); lw is in EX, address 9 + 20 = 29; nand is in MEM; add 3 1 2 is in WB and writes R3 = 45.
Time: 6. No more instructions. sw is in ID and reads R3 = 45, R7 = 22, offset 12; add 5 2 5 is in EX, ALU result 16; lw is in MEM and loads Mem[29] = 99; nand is in WB and writes R6 = -3.
Time: 7. No more instructions. sw is in EX, address 45 + 12 = 57; add 5 2 5 is in MEM; lw is in WB and writes R4 = 99.
Time: 8. No more instructions. sw is in MEM and stores 22 to Mem[57]; add 5 2 5 is in WB and writes R5 = 16.
Time: 9. No more instructions. sw is in WB (no register write). Final R0-R7 = 0, 36, 9, 45, 99, 16, -3, 22.
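One way to cross-check the walkthrough is to run the same five instructions one at a time, since a correct pipeline must produce the same architectural state as sequential execution. The sketch below is not from the slides; it assumes the initial register values shown at time 0 and the single memory word Mem[29] = 99 that the lw reads, and the names `run_sequential`, `INITIAL_REGS`, and `INITIAL_MEM` are illustrative.

```python
def run_sequential(regs, mem):
    """Execute the sample code one instruction at a time (no pipelining)."""
    regs = list(regs)   # copy so the caller's state is untouched
    mem = dict(mem)
    regs[3] = regs[1] + regs[2]        # add  r3 r1 r2
    regs[6] = ~(regs[4] & regs[5])     # nand r6 r4 r5
    regs[4] = mem[regs[2] + 20]        # lw   r4 20(r2)
    regs[5] = regs[2] + regs[5]        # add  r5 r2 r5
    mem[regs[3] + 12] = regs[7]        # sw   r7 12(r3)
    return regs, mem


# Initial state taken from the time-0 slide; Mem[29] is the word the lw loads.
INITIAL_REGS = [0, 36, 9, 12, 18, 7, 41, 22]
INITIAL_MEM = {29: 99}

final_regs, final_mem = run_sequential(INITIAL_REGS, INITIAL_MEM)
print(final_regs)      # matches the register file shown at time 9
print(final_mem[57])   # value stored by the sw
```

The pipelined run matches because this particular code has no data hazards: every source register is written back before a later instruction reads it in ID.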
Takeaway Pipelining is a powerful technique to mask latencies and increase throughput • Logically, instructions execute one at a time • Physically, instructions execute in parallel – Instruction level parallelism Abstraction promotes decoupling • Interface (ISA) vs. implementation (Pipeline)
Hazards See P&H Chapter: 4.7-4.8
Hazards 3 kinds • Structural hazards – Multiple instructions want to use same unit • Data hazards – Results of instruction needed before ready • Control hazards – Don’t know which side of branch to take
Next Goal What about data dependencies (also known as a data hazard in a pipelined processor)? i.e. add r3, r1, r2 sub r5, r3, r4 Need to detect and then fix such hazards
Data Hazards • register file reads occur in stage 2 (ID) • register file writes occur in stage 5 (WB) • next instructions may read values about to be written – i.e., an instruction may need values that are being computed further down the pipeline – in fact, this is quite common
Data Hazards
Clock cycle:      1    2    3    4    5    6    7    8    9
add r3, r1, r2    IF   ID   EX   MEM  WB
sub r5, r3, r4         IF   ID   EX   MEM  WB
lw  r6, 4(r3)               IF   ID   EX   MEM  WB
or  r5, r3, r5                   IF   ID   EX   MEM  WB
sw  r6, 12(r3)                        IF   ID   EX   MEM  WB
iClicker
add r3, r1, r2
sub r5, r3, r4
lw  r6, 4(r3)
or  r5, r3, r5
sw  r6, 12(r3)
How many data hazards due to r3 only? A) 1 B) 2 C) 3 D) 4 E) 5
Data Hazards (r3 = 10 before the add writes back, r3 = 20 after)
Clock cycle:      1    2    3    4    5    6    7    8    9
add r3, r1, r2    IF   ID   EX   MEM  WB
sub r5, r3, r4         IF   ID   EX   MEM  WB                  ID reads r3 = 10
lw  r6, 4(r3)               IF   ID   EX   MEM  WB             ID reads r3 = 10
or  r5, r3, r5                   IF   ID   EX   MEM  WB        ID reads r3 = 10
sw  r6, 12(r3)                        IF   ID   EX   MEM  WB   ID reads r3 = 20 (OK)
Data Hazards • register file reads occur in stage 2 (ID) • register file writes occur in stage 5 (WB) • next instructions may read values about to be written i.e. add r3, r1, r2 sub r5, r3, r4 How to detect?
Detecting Data Hazards [Figure: pipelined datapath in which the ID stage compares the source register numbers in IF/ID (Ra, Rb) against the destination register Rd held in ID/EX, EX/MEM, and MEM/WB.] Condition shown: IF/ID.Ra ≠ 0 && (IF/ID.Ra == ID/Ex.Rd || IF/ID.Ra == Ex/M.Rd || IF/ID.Ra == M/W.Rd)
Data Hazards • register file reads occur in stage 2 (ID) • register file writes occur in stage 5 (WB) • next instructions may read values about to be written How to detect? Logic in ID stage: stall = (IF/ID. Ra != 0 && (IF/ID. Ra == ID/EX. Rd || IF/ID. Ra == EX/M. Rd || IF/ID. Ra == M/WB. Rd)) || (same for Rb)
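The ID-stage stall condition above translates directly into code. This is a minimal sketch: the function names are invented, register numbers are plain integers, and it assumes a nop in a later stage carries Rd = 0 so that it never matches.

```python
def source_conflicts(src, id_ex_rd, ex_mem_rd, mem_wb_rd):
    """True if register `src` is about to be written by an older instruction."""
    return src != 0 and src in (id_ex_rd, ex_mem_rd, mem_wb_rd)


def must_stall(ra, rb, id_ex_rd, ex_mem_rd, mem_wb_rd):
    """Stall signal for the instruction in ID (same check for Ra and Rb)."""
    return (source_conflicts(ra, id_ex_rd, ex_mem_rd, mem_wb_rd)
            or source_conflicts(rb, id_ex_rd, ex_mem_rd, mem_wb_rd))


# sub r5, r3, r4 in ID while add r3, r1, r2 is in EX: hazard on r3.
print(must_stall(ra=3, rb=4, id_ex_rd=3, ex_mem_rd=0, mem_wb_rd=0))

# No older instruction writes r1 or r2: no stall.
print(must_stall(ra=1, rb=2, id_ex_rd=3, ex_mem_rd=0, mem_wb_rd=0))
```

The `src != 0` test matters because register 0 is hardwired to zero in MIPS, so a "write" to it never creates a real dependency.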
Detecting Data Hazards [Figure: pipelined datapath with a "detect hazard" unit in the ID stage that compares Ra and Rb from IF/ID against Rd in ID/EX, EX/MEM, and MEM/WB.] Instruction stream: add r3, r1, r2; sub r5, r3, r5; or r6, r3, r4; add r6, r3, r8
Takeaway Data hazards occur when an operand (register) depends on the result of a previous instruction that may not be computed yet. A pipelined processor needs to detect data hazards.
Next Goal What to do if data hazard detected?
Resolving Data Hazards What to do if data hazard detected? A) Wait/Stall B) Reorder in Software (SW) C) Forward/Bypass D) All the above E) None. We will use some other method
Resolving Data Hazards What to do if data hazard detected? A) Wait/Stall Discuss today B) Reorder in Software (SW) C) Forward/Bypass D) All the above E) None. We will use some other method
Next Goal What to do if data hazard detected? Options • Nothing – Change the ISA to match implementation • Stall – Pause current and subsequent instructions till safe • Forward/bypass – Forward data value to where it is needed
Stalling How to stall an instruction in ID stage • prevent IF/ID pipeline register update – stalls the ID stage instruction • convert ID stage instr into nop for later stages – innocuous “bubble” passes through pipeline • prevent PC update – stalls the next (IF stage) instruction
Detecting Data Hazards [Figure: the same datapath with stall controls added. If the hazard unit detects a hazard, it sets WE=0 on the PC and on the IF/ID register, and sets MemWr=0 and RegWr=0 in the control bits passed to ID/EX.] Instruction stream: add r3, r1, r2; sub r5, r3, r5; or r6, r3, r4; add r6, r3, r8
Stalling
Clock cycle:      1    2    3    4    5    6    7    8
add r3, r1, r2
sub r5, r3, r5
or  r6, r3, r4
add r6, r3, r8
Stalling (r3 = 10 before the add writes back, r3 = 20 after)
Clock cycle:      1    2    3    4    5    6    7    8    9
add r3, r1, r2    IF   ID   Ex   M    W
sub r5, r3, r5         IF   ID   ID   ID   ID   Ex   M    W      (3 stalls)
or  r6, r3, r4              IF   IF   IF   IF   ID   Ex   M
add r6, r3, r8                                  IF   ID   Ex
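The stall timing above can be reproduced with a small scheduler. This is a sketch under the slide's assumptions: no forwarding and no register file bypass, so an instruction may leave ID only in a cycle after every producer of its sources has completed WB. The function name and the `(dest, src_a, src_b)` tuple encoding are made up for the example.

```python
def schedule_with_stalls(program):
    """Return the cycle in which each instruction completes ID.

    program: list of (dest, src_a, src_b) register numbers.
    Model: five stages, in-order, stall in ID while a source register
    matches Rd of an instruction in EX, MEM, or WB.
    """
    wb_cycle_of = {}   # register -> WB cycle of its most recent writer
    id_cycles = []
    prev_id = 1        # so the first instruction (IF in cycle 1) has ID in cycle 2
    for dest, *sources in program:
        earliest = prev_id + 1                  # cannot pass the older instruction
        for src in sources:
            if src != 0 and src in wb_cycle_of:
                earliest = max(earliest, wb_cycle_of[src] + 1)
        id_cycles.append(earliest)
        if dest != 0:
            wb_cycle_of[dest] = earliest + 3    # ID -> EX -> MEM -> WB
        prev_id = earliest
    return id_cycles


PROGRAM = [
    (3, 1, 2),   # add r3, r1, r2
    (5, 3, 5),   # sub r5, r3, r5
    (6, 3, 4),   # or  r6, r3, r4
    (6, 3, 8),   # add r6, r3, r8
]
id_cycles = schedule_with_stalls(PROGRAM)
print(id_cycles)
# Without the hazard sub would complete ID in cycle 3; the difference is the stall count.
print(id_cycles[1] - 3)
```

Only the sub pays the stall cycles directly; the or and the second add are delayed behind it, and by the time they reach ID the value of r3 is already in the register file.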
Stalling [Figure, stall cycle 1: or r6, r3, r4 in IF, sub r5, r3, r5 in ID, add r3, r1, r2 in Ex. The hazard unit asserts /stall, so WE=0 on the PC and IF/ID, and a nop (MemWr=0, RegWr=0) is sent into ID/Ex.] NOP = If(IF/ID.rA ≠ 0 && (IF/ID.rA == ID/Ex.Rd || IF/ID.rA == Ex/M.Rd || IF/ID.rA == M/W.Rd))
Stalling [Figure, stall cycle 2: or still in IF, sub still in ID, a nop in Ex, add r3, r1, r2 in M. /stall remains asserted and another nop (MemWr=0, RegWr=0) is sent into ID/Ex.] NOP = If(IF/ID.rA ≠ 0 && (IF/ID.rA == ID/Ex.Rd || IF/ID.rA == Ex/M.Rd || IF/ID.rA == M/W.Rd))
Stalling [Figure, stall cycle 3: or still in IF, sub still in ID, nops in Ex and M, add r3, r1, r2 in W. /stall remains asserted and a third nop (MemWr=0, RegWr=0) is sent into ID/Ex.] NOP = If(IF/ID.rA ≠ 0 && (IF/ID.rA == ID/Ex.Rd || IF/ID.rA == Ex/M.Rd || IF/ID.rA == M/W.Rd))
Takeaway Data hazards occur when an operand (register) depends on the result of a previous instruction that may not be computed yet. A pipelined processor needs to detect data hazards. Stalling, preventing a dependent instruction from advancing, is one way to resolve data hazards. Stalling introduces NOPs (“bubbles”) into a pipeline. Introduce NOPs by (1) preventing the PC from updating, (2) preventing the IF/ID pipeline register from changing, and (3) preventing writes to memory and the register file. Bubbles in the pipeline significantly decrease performance.
Next Goal: Resolving Data Hazards via Forwarding What to do if data hazard detected? A) Wait/Stall B) Reorder in Software (SW) C) Forward/Bypass
Forwarding bypasses some pipelined stages, forwarding a result to a dependent instruction operand (register). Three types of forwarding/bypass • Forwarding from Ex/Mem registers to Ex stage (M → Ex) • Forwarding from Mem/WB register to Ex stage (W → Ex) • Register File Bypass
Forwarding Datapath [Figure: pipelined datapath with the "detect hazard" unit in ID and a "forward unit" in Ex; muxes in front of the ALU inputs select between the register values in ID/Ex and the results held in Ex/Mem and Mem/WB.] Three types of forwarding/bypass • Forwarding from Ex/Mem registers to Ex stage (M → Ex) • Forwarding from Mem/WB register to Ex stage (W → Ex) • Register File Bypass
Forwarding Datapath 1 Ex/MEM to EX Bypass • EX needs ALU result that is still in MEM stage • Resolve: Add a bypass from EX/MEM.D to start of EX How to detect? Logic in Ex Stage: forward = (Ex/M.WE && Ex/M.Rd != 0 && ID/Ex.Ra == Ex/M.Rd) || (same for Rb)
Forwarding Datapath 1 [Figure: datapath with the Ex/Mem result fed back to the ALU input.]
add r3, r1, r2    IF   ID   Ex   M    W
sub r5, r3, r1         IF   ID   Ex   M    W
Forwarding Datapath 2 Mem/WB to EX Bypass • EX needs value being written by WB • Resolve: Add bypass from WB final value to start of EX How to detect? Logic in Ex Stage: forward = (M/WB.WE && M/WB.Rd != 0 && ID/Ex.Ra == M/WB.Rd && not (Ex/M.WE && Ex/M.Rd != 0 && ID/Ex.Ra == Ex/M.Rd)) || (same for Rb) Check pg. 311
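The two forwarding conditions can be combined into one selector per ALU operand. The sketch below is illustrative only: the function name and the string labels are invented, and it returns which source the ALU input mux would pick. Checking Ex/M first implements the "not (Ex/M match)" clause, so the newest value wins when both stages write the same register.

```python
def forward_select(src, ex_m_we, ex_m_rd, m_wb_we, m_wb_rd):
    """Choose the source for one ALU operand of the instruction in Ex.

    Returns "EX/MEM" or "MEM/WB" when a bypass applies, else "REG"
    (use the value read from the register file in ID).
    """
    if ex_m_we and ex_m_rd != 0 and src == ex_m_rd:
        return "EX/MEM"      # result of the immediately preceding instruction
    if m_wb_we and m_wb_rd != 0 and src == m_wb_rd:
        return "MEM/WB"      # result from two instructions back
    return "REG"


# add r3,...; sub r5, r3, r1: sub in Ex, add in MEM.
print(forward_select(3, True, 3, False, 0))

# Two older writers of r3 in flight: the younger (Ex/M) value must win.
print(forward_select(3, True, 3, True, 3))
```

The same function is evaluated twice per cycle, once for Ra and once for Rb.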
Forwarding Datapath 2 [Figure: datapath with the Mem/WB result fed back to the ALU input.]
add r3, r1, r2    IF   ID   Ex   M    W
sub r5, r3, r1         IF   ID   Ex   M    W
or  r6, r3, r4              IF   ID   Ex   M    W
Register File Bypass • Reading a value that is currently being written Detect: ((Ra == MEM/WB. Rd) or (Rb == MEM/WB. Rd)) and (WB is writing a register) Resolve: Add a bypass around register file (WB to ID) Better: (Hack) just negate register file clock – writes happen at end of first half of each clock cycle – reads happen during second half of each clock cycle
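The write-in-first-half, read-in-second-half trick can be modeled by ordering the two operations inside one simulated cycle. This is a behavioral sketch, not a hardware description; the class and method names are hypothetical.

```python
class RegisterFile:
    """Register file whose write happens before its reads within a cycle."""

    def __init__(self, n_regs=32):
        self.regs = [0] * n_regs

    def cycle(self, write_enable, rd, value, ra, rb):
        # First half of the clock cycle: WB stage writes its result.
        if write_enable and rd != 0:      # register 0 stays zero
            self.regs[rd] = value
        # Second half: ID stage reads, so it sees the value just written.
        return self.regs[ra], self.regs[rb]


rf = RegisterFile()
rf.regs[3] = 10
# WB writes r3 = 20 in the same cycle that ID reads r3 and r1.
print(rf.cycle(write_enable=True, rd=3, value=20, ra=3, rb=1))
```

With this ordering, the M/WB.Rd comparison can be dropped from the stall condition, because an instruction in ID already reads the value that WB writes in the same cycle.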
Register File Bypass [Figure: datapath with a bypass around the register file from WB to ID.]
add r3, r1, r2    IF   ID   Ex   M    W
sub r5, r3, r1         IF   ID   Ex   M    W
or  r6, r3, r4              IF   ID   Ex   M    W
add r6, r3, r8                   IF   ID   Ex   M    W
Forwarding Example (r3 = 10 before the add writes back, r3 = 20 after)
Clock cycle:      1    2    3    4    5    6    7    8
add r3, r1, r2    IF   ID   Ex   M    W
sub r5, r3, r5         IF   ID   Ex   M    W
or  r6, r3, r4              IF   ID   Ex   M    W
add r6, r3, r8                   IF   ID   Ex   M    W
Forwarding Example 2
Clock cycle:      1    2    3    4    5    6    7    8
add r3, r1, r2    IF   ID   Ex   M    W
sub r5, r3, r4         IF   ID   Ex   M    W
lw  r6, 4(r3)
or  r5, r3, r5
sw  r6, 12(r3)
Takeaway Data hazards occur when an operand (register) depends on the result of a previous instruction that may not be computed yet. A pipelined processor needs to detect data hazards. Stalling, preventing a dependent instruction from advancing, is one way to resolve data hazards. Stalling introduces NOPs (“bubbles”) into a pipeline. Introduce NOPs by (1) preventing the PC from updating, (2) preventing the IF/ID pipeline register from changing, and (3) preventing writes to memory and the register file. Bubbles (nops) in the pipeline significantly decrease performance. Forwarding bypasses some pipelined stages, forwarding a result to a dependent instruction operand (register). Better performance than stalling.
Data Hazard Recap Stall • Pause current and all subsequent instructions Forward/Bypass • Try to steal correct value from elsewhere in pipeline • Otherwise, fall back to stalling or require a delay slot Tradeoffs?