Branch Hazards

- In our pipeline, the decision about whether a branch instruction is taken is not made until the MEM stage.
- This is known as a control hazard.
- Stalling until a control hazard is resolved is very inefficient.
- Instead, the outcome can be predicted:
  - assume the branch is not taken
  - predict the outcome of the branch dynamically
- We also look at ways to reduce the delay of branches.

Branch Hazards
[Figure: pipeline diagram plotting program execution order (in instructions) against time (in clock cycles, CC 1 to CC 9). Instruction sequence shown, with addresses:]
40 beq $1, $3, 7
44 and $12, $5
48 or $13, $6, $2
52 add $14, $2
72 lw $4, 50($7)

Branch not taken
- Assume the branch is not taken and continue to fetch instructions until the control dependence is resolved:
  - if the branch is not taken, there is no penalty
  - if the branch is taken, the instructions after the branch in the pipeline must be discarded
  - if the prediction is right 50% of the time, the cost of control hazards is halved
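
The "cost halved" claim can be checked with a small calculation. This is a minimal sketch: the function name, the 20% branch frequency and the 3-cycle penalty are illustrative assumptions, not figures from the slides.

```python
def branch_stall_cpi(branch_fraction, mispredict_rate, penalty_cycles):
    """Average stall cycles added per instruction by control hazards."""
    return branch_fraction * mispredict_rate * penalty_cycles

# Assumed workload: 20% of instructions are branches, and a wrongly
# fetched path costs 3 cycles (hypothetical numbers for illustration).
always_stall = branch_stall_cpi(0.20, 1.0, 3)       # pay the penalty on every branch
predict_not_taken = branch_stall_cpi(0.20, 0.5, 3)  # prediction right half the time

print(f"always stall:      {always_stall:.2f} extra cycles per instruction")
print(f"predict not taken: {predict_not_taken:.2f} extra cycles per instruction")
```

Because the penalty is paid only on wrong guesses, the added cost scales linearly with the misprediction rate.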

Reducing the Delay of Branches
- The cost of mispredicted branches can be reduced by resolving them earlier in the pipeline:
  - currently branches are resolved in the MEM stage
  - many MIPS implementations move this to the decode (ID) stage
- To do so we have to:
  - move the branch address calculation to the ID stage
  - make the branch decision in the same stage
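
The two pieces of work that move into the ID stage can be sketched in software. The helper names below are hypothetical; the arithmetic assumes the standard MIPS rule that a branch offset counts words relative to the instruction after the branch.

```python
def sign_extend_16(value):
    """Interpret a 16-bit immediate field as a signed two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, imm16):
    """Branch address calculation: (PC + 4) + (sign-extended offset << 2)."""
    return (pc + 4) + (sign_extend_16(imm16) << 2)

def beq_taken(rs_value, rt_value):
    """Branch decision for beq: a simple equality test on the two registers."""
    return rs_value == rt_value

# beq $1, $3, 7 located at address 40
target = branch_target(40, 7)
print(target)  # 72
```

With an offset of 7, the beq at address 40 targets address 72, which is the address of the lw instruction in the earlier pipeline diagram.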

Reducing Delay of Branches
[Figure: pipelined datapath with the branch resolved in the ID stage. Labelled components: IF.Flush signal, hazard detection unit, control, pipeline registers IF/ID, ID/EX, EX/MEM and MEM/WB, PC, instruction memory, registers, register comparator (=), sign extend, shift left 2, ALU, forwarding unit, data memory.]

Dynamic Branch Prediction
- Look at branch history to predict whether a branch will be taken.
- One approach is to use a branch prediction buffer:
  - a small memory indexed by the low-order bits of the branch instruction's address
  - each location contains a bit recording whether the branch was taken last time
- This scheme is likely to mispredict twice when a branch that is almost always taken is not taken: once on the not-taken outcome itself, and again on the next execution, because the bit has been flipped.
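
The double misprediction can be reproduced with a small simulation. This is a sketch under assumptions: the class and function names are hypothetical, the buffer has 16 entries, entries start as "not taken", and the two byte-offset bits of the word-aligned address are dropped before indexing.

```python
class OneBitPredictor:
    """Branch prediction buffer with one history bit per entry."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.table = [False] * (1 << index_bits)  # False = predict not taken

    def _index(self, pc):
        # Assumption: instructions are word aligned, so skip the low 2 bits.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)]

    def update(self, pc, taken):
        # Remember only what the branch did last time.
        self.table[self._index(pc)] = taken

def count_mispredictions(predictor, pc, outcomes):
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses

# A loop branch taken 9 times and then not taken, with the loop run 3 times.
outcomes = ([True] * 9 + [False]) * 3
misses = count_mispredictions(OneBitPredictor(), 40, outcomes)
print(misses)  # 6: two mispredictions per run of the loop
```

Each loop exit flips the bit, so the first iteration of the next run is mispredicted as well.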

Two-bit prediction schemes
- A prediction must be wrong twice before it is changed.
[Figure: state diagram with "Predict taken" and "Predict not taken" states; transitions between states are labelled "Taken" and "Not taken".]
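
One common way to realise a two-bit scheme is a saturating counter per buffer entry. The sketch below assumes that realisation, which may differ in detail from the state diagram on the slide; the names, the 16-entry table and the initial "strongly taken" state are illustrative assumptions.

```python
class TwoBitPredictor:
    """Two-bit saturating counters: 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self, index_bits=4, initial=3):
        self.mask = (1 << index_bits) - 1
        self.counters = [initial] * (1 << index_bits)  # assumed start: strongly taken

    def _index(self, pc):
        # Assumption: instructions are word aligned, so skip the low 2 bits.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

def count_mispredictions(predictor, pc, outcomes):
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses

# A loop branch taken 9 times and then not taken, with the loop run 3 times.
outcomes = ([True] * 9 + [False]) * 3
misses = count_mispredictions(TwoBitPredictor(), 40, outcomes)
print(misses)  # 3: only the loop exit itself is mispredicted
```

A single not-taken outcome moves the counter from 3 to 2, which still predicts taken, so the first iteration of the next run is predicted correctly.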

Superscalar and Dynamic Pipelining
- Three main approaches to making a processor go even faster:
  - Super-pipelining: break the pipeline into more stages so that the clock cycle time (CCT) can be decreased
    - e.g. Alpha pipeline with nine stages
  - Superscalar: replicate internal components so that multiple instructions can be launched per clock cycle
  - Dynamic pipelining: hardware schedules instructions so as to avoid having to stall the pipeline

lw   $t0, 20($s2)
addu $t1, $t0, $t2
sub  $s4, $s4, $t3
slti $t5, $s4, 20

Superscalar MIPS
- Assume 2 instructions can be issued per clock cycle:
  - one integer ALU operation or branch
  - one load or store
- Assume instruction pairs are aligned on a 64-bit boundary
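
The issue restrictions above can be expressed as a simple check. This is a sketch only: the opcode sets are a small illustrative subset, the function names are hypothetical, the check ignores data dependences between the two instructions, and it accepts the two instruction classes in either order, which the slide does not specify.

```python
# Illustrative subsets of MIPS opcodes, not a complete classification.
ALU_OR_BRANCH = {"add", "addu", "sub", "and", "or", "slt", "slti", "beq", "bne"}
LOAD_OR_STORE = {"lw", "sw"}

def is_pair_aligned(pc):
    """A pair of 32-bit instructions starts on a 64-bit (8-byte) boundary."""
    return pc % 8 == 0

def can_dual_issue(opcode_a, opcode_b):
    """True if the pair has one ALU/branch and one load/store instruction."""
    classes = set()
    for opcode in (opcode_a, opcode_b):
        if opcode in ALU_OR_BRANCH:
            classes.add("alu_or_branch")
        elif opcode in LOAD_OR_STORE:
            classes.add("load_or_store")
        else:
            return False  # unknown opcode: be conservative
    return classes == {"alu_or_branch", "load_or_store"}

print(can_dual_issue("addu", "lw"))   # True
print(can_dual_issue("addu", "sub"))  # False: both need the ALU slot
print(is_pair_aligned(40), is_pair_aligned(44))  # True False
```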

Superscalar MIPS
Instruction type    Pipeline stages
ALU or branch       IF  ID  EX  MEM WB
Load or store       IF  ID  EX  MEM WB
ALU or branch           IF  ID  EX  MEM WB
Load or store           IF  ID  EX  MEM WB

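Superscalar MIPS
[Figure: two-issue superscalar datapath. Labelled components: PC with multiplexer inputs including the constant 40000040, instruction memory, registers, two sign-extend units, two ALUs (the second computing the data memory address), data memory with write data input.]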

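[Figure: dynamically scheduled pipeline. An instruction fetch and decode unit issues instructions in order to a set of reservation stations; each reservation station feeds a functional unit (integer, ..., floating point, load/store), and the functional units execute out of order; a commit unit then commits results in order.]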