Computer Architecture: A Constructive Approach
Branch Direction Prediction – Pipeline Integration

Joel Emer
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology

April 23, 2012
http://csg.csail.mit.edu/6.S078

NA pred with decode feedback

[Figure: pipeline diagram with stages Fetch (F), Decode (D), Reg Read (R), Execute (X), Memory (M), Writeback (W); labeled blocks: Next Address Prediction, Direction Prediction]

Direction prediction recipe

Execute
  • Send redirects on mispredicts (unchanged)
  • Send direction prediction training
Decode
  • Check if next address matches direction pred
  • Send redirect if different (update naPred)
Fetch
  • Generate prediction
  • Learn from feedback
  • Accept redirects from later stages

Epoch management recipe

Execute
  • On exec epoch mismatch: poison instruction
  • Otherwise,
      - On mispredict: change exec epoch and redirect
Decode
  • On new exec epoch: update local exec/decode epochs
  • Otherwise,
      - On decode epoch mismatch: drop instruction
  • If not dropped,
      - On next addr mispredict: change decode epoch and redirect
Fetch
  • On exec redirect: update local exec epoch
  • On decode redirect: if for current exec epoch then update local decode epoch

April 18, 2012

Add direction feedback

    typedef struct {
      Bool    correct;
      NaInfo  naPredInfo;
      Addr    nextAddr;
      DirInfo dirPredInfo;
      Bool    taken;
    } Feedback deriving (Bits, Eq);

• Feedback needs information for training direction predictor

    // Decode feedback is tagged with the execute epoch and the decode epoch
    FIFOF#(Tuple3#(Epoch, Epoch, Feedback)) decFeedback <- mkFIFOF;
    // Execute feedback is tagged with the execute epoch only
    FIFOF#(Tuple2#(Epoch, Feedback)) execFeedback <- mkFIFOF;

Execute (branch analysis)

    // after executing instruction...
    let nextEeEpoch = eeEpoch;
    let cond = execData.execInst.cond;
    let nextPc = cond ? execData.execInst.addr : execData.pc + 4;
    // Note: nextAddrPred may have been reset in decode
    let correctPred = (nextPc == execData.nextAddrPred);
    if (!correctPred) nextEeEpoch = nextEeEpoch + 1;
    eeEpoch <= nextEeEpoch;
    // Always send feedback
    execFeedback.enq(tuple2(nextEeEpoch,
        Feedback{correct:     correctPred,
                 taken:       cond,
                 dirPredInfo: execData.dirPredInfo,
                 naPredInfo:  execData.naPredInfo,
                 nextAddr:    nextPc}));
    // enqueue instruction to next stage
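
The branch analysis above can be modelled in a few lines of Python as a sanity check. This is a behavioural sketch, not the course's Bluespec: `execute_branch` and its dictionary keys are hypothetical names, and a Python integer stands in for the fixed-width `Epoch` type, so wrap-around is not modelled.

```python
def execute_branch(ee_epoch, pc, taken, target, next_addr_pred):
    """Model of the execute-stage check: returns (new_epoch, feedback)."""
    next_pc = target if taken else pc + 4
    correct = next_pc == next_addr_pred
    # A mispredict starts a new execute epoch; feedback is sent either way.
    new_epoch = ee_epoch if correct else ee_epoch + 1
    feedback = {"correct": correct, "taken": taken, "nextAddr": next_pc}
    return new_epoch, feedback


# Taken branch at 0x100 to 0x200, while fetch predicted fall-through (0x104).
epoch, fb = execute_branch(0, 0x100, True, 0x200, 0x104)
print(epoch, fb["correct"], hex(fb["nextAddr"]))  # 1 False 0x200
```

Note that the feedback carries the new epoch even on a correct prediction, which is what lets fetch train on every executed instruction.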

Decode with mispredict detect

    rule doDecode;
      let decData = newDecData(fr.first);
      // Determine if epoch of incoming instruction is on good path:
      // new exec epoch, or same dec epoch
      let correctPath = (decData.execEpoch != deEpoch)
                     || (decData.decEpoch == ddEpoch);
      let instResp = decData.fInst.instResp;
      let pcPlus4 = decData.pc + 4;

      if (correctPath) begin
        decData.decInst = decode(instResp, pcPlus4);
        let target = knownTargetAddr(decData.decInst);
        let brClass = getBrClass(decData.decInst);
        let predTarget = decData.nextAddrPred;
        let predDir = decData.dirPred;

        // Calculate target as best as decode can
        let decodedTarget = case (brClass)
            NonBranch:   pcPlus4;
            UncondKnown: target;
            CondBranch:  (predDir ? target : pcPlus4);
            default:     decData.nextAddrPred;
          endcase;

        // Wrong next addr?
        if (decodedTarget != predTarget) begin
          decData.decEpoch = decData.decEpoch + 1;  // New dec epoch
          decData.nextAddrPred = decodedTarget;     // Tell exec addr of next instruction!
          // Send feedback
          decFeedback.enq(
            tuple3(decData.execEpoch, decData.decEpoch,
                   Feedback{correct:     False,
                            naPredInfo:  decData.naPredInfo,
                            nextAddr:    decodedTarget,
                            dirPredInfo: decData.dirPredInfo,
                            taken:       decData.takenPred}));
        end
        dr.enq(decData);  // Enqueue to next stage on correct path
      end // of correct path
      else begin // incorrect path
        // Preserve current epoch if instruction on incorrect path
        decData.decEpoch = ddEpoch;
        decData.execEpoch = deEpoch;
      end

      // decData.*Epoch have been set properly so we always save them
      ddEpoch <= decData.decEpoch;
      deEpoch <= decData.execEpoch;
      fr.deq;
    endrule
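
The epoch bookkeeping in decode is easy to get wrong, so a small Python model of one firing of the rule may help. It is a sketch under assumptions: `decode_step` and the dictionary keys are hypothetical, the decoded target is passed in rather than computed from the instruction, and epochs are unbounded Python integers rather than a fixed-width `Epoch`.

```python
def decode_step(inst, de_epoch, dd_epoch, decoded_target):
    """Model one firing of the decode rule.

    inst: dict with execEpoch, decEpoch, nextAddrPred (as sent by fetch).
    Returns (forwarded, redirect_addr_or_None, new_de_epoch, new_dd_epoch).
    """
    correct_path = (inst["execEpoch"] != de_epoch
                    or inst["decEpoch"] == dd_epoch)
    if not correct_path:
        # Wrong-path instruction: drop it and keep the local epochs.
        return False, None, de_epoch, dd_epoch
    redirect = None
    dec_epoch = inst["decEpoch"]
    if decoded_target != inst["nextAddrPred"]:
        dec_epoch += 1              # new decode epoch
        redirect = decoded_target   # address sent back to fetch
    # Local epochs always follow the (possibly updated) instruction epochs.
    return True, redirect, inst["execEpoch"], dec_epoch


inst = {"execEpoch": 0, "decEpoch": 0, "nextAddrPred": 0x104}
print(decode_step(inst, 0, 0, 0x200))  # mispredicted next address
print(decode_step(inst, 0, 1, 0x104))  # stale instruction after that redirect
```

The first call forwards the instruction and bumps the decode epoch; the second shows that an instruction still carrying the old decode epoch is dropped.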

Integration into Fetch

    rule doFetch();
      function Action enqInst();
        action
          let d <- mem.side(MemReq{op: Ld, addr: fetchPc, data: ?});
          match {.nAddrPred, .naPredInfo} <- naPred.predict(fetchPc);
          match {.dirPred, .dirPredInfo} <- dirPred.predict(fetchPc);
          FBundle fInst = FBundle{instResp: d};
          FData fData = FData{pc: fetchPc, fInst: fInst, inum: iNum,
                              execEpoch: feEpoch,
                              naPredInfo: naPredInfo, nextAddrPred: nAddrPred,
                              dirPredInfo: dirPredInfo, dirPred: dirPred};
          iNum <= iNum + 1;
          fetchPc <= nAddrPred;
          fr.enq(fData);
        endaction
      endfunction

Handling redirect from execute

      if (execFeedback.notEmpty) begin
        match {.execEpoch, .fb} = execFeedback.first;
        execFeedback.deq;
        if (!fb.correct) begin
          // Train and repair on redirect
          dirPred.repair(fb.dirPredInfo, fb.taken);
          dirPred.train(fb.dirPredInfo, fb.taken);
          naPred.repair(fb.naPredInfo, fb.nextAddr);
          naPred.train(fb.naPredInfo, fb.nextAddr);
          feEpoch <= execEpoch;
          fetchPc <= fb.nextAddr;
        end
        else begin
          // Just train on correct prediction
          dirPred.train(fb.dirPredInfo, fb.taken);
          naPred.train(fb.naPredInfo, fb.nextAddr);
          enqInst;
        end
      end

Handling redirect from decode

      else if (decFeedback.notEmpty) begin
        decFeedback.deq;
        match {.execEpoch, .decEpoch, .fb} = decFeedback.first;
        if (execEpoch == feEpoch) begin
          if (!fb.correct) begin // epoch unchanged
            // Just repair, never train on feedback from decode
            fdEpoch <= decEpoch;
            dirPred.repair(fb.dirPredInfo, fb.taken);
            naPred.repair(fb.naPredInfo, fb.nextAddr);
            fetchPc <= fb.nextAddr;
          end
          else // dec feedback on correct prediction
            enqInst;
        end
        else // dec feedback, but fetch is in new exec epoch
          enqInst;
      end
      else // no feedback
        enqInst;
    endrule
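
The priority among the three cases in fetch (execute feedback, decode feedback, plain fetch) can be summarised by the following Python sketch. It is illustrative only: `fetch_step` and the state keys are hypothetical, predictor training and repair are left out, and advancing the PC by 4 stands in for `enqInst`, which in the rule above uses the next-address prediction.

```python
from collections import deque


def fetch_step(st, exec_fb, dec_fb):
    """One fetch cycle; `st` holds fetchPc, feEpoch and fdEpoch.

    Returns "redirect" if fetchPc was overwritten by feedback, else "fetch".
    """
    if exec_fb:                          # execute feedback has priority
        epoch, fb = exec_fb.popleft()
        if not fb["correct"]:
            st["feEpoch"] = epoch
            st["fetchPc"] = fb["nextAddr"]
            return "redirect"
    elif dec_fb:
        exec_epoch, dec_epoch, fb = dec_fb.popleft()
        # Decode feedback from an older execute epoch is stale: ignore it.
        if exec_epoch == st["feEpoch"] and not fb["correct"]:
            st["fdEpoch"] = dec_epoch
            st["fetchPc"] = fb["nextAddr"]
            return "redirect"
    st["fetchPc"] += 4                   # stand-in for enqInst
    return "fetch"


state = {"fetchPc": 0x100, "feEpoch": 1, "fdEpoch": 0}
stale = deque([(0, 1, {"correct": False, "nextAddr": 0x400})])
print(fetch_step(state, deque(), stale), hex(state["fetchPc"]))  # fetch 0x104
```

In the usage line the decode redirect was issued under execute epoch 0, but fetch is already in epoch 1, so the redirect is discarded and fetch continues.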

Immediate update issues

If the direction predictor does not update immediately on predictions, things are easy. But if the predictor updates immediately, we will predict and update the predictor on non-branches.

Possible solutions:
  • Move direction prediction to decode, so we know not to update on non-branches. But this makes timing more critical.
  • Simply use the direction predictor even on non-branch instructions.
      - Note: for superscalar issue designs this is a less significant problem.

Note: In the lab code we communicate the branch type of each instruction to allow training and repair to decide whether to perform updates based on instruction type.

Predictor Primitive

Indexed table holding values

Operations
  • Predict
  • Update

[Figure: table P of a given Width and Depth, with inputs Index (I) and Update (U) and output Prediction]

Algebraic notation

    Prediction = P[Width, Depth](Index; Update)

October 24, 2011 http://csg.csail.mit.edu/6.s078
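
One way to read the notation P[Width, Depth](Index; Update) is as a small table object with a read port and a write port. The Python class below is a sketch of that reading; the class name, the modulo indexing and the zero initial value are assumptions, not part of the slides.

```python
class PredictorTable:
    """Indexed table of `depth` entries, each `width` bits wide."""

    def __init__(self, width, depth, init=0):
        self.mask = (1 << width) - 1
        self.depth = depth
        self.table = [init & self.mask] * depth

    def predict(self, index):
        # Read the entry selected by the index (wrapped to the table depth).
        return self.table[index % self.depth]

    def update(self, index, value):
        # Write a new value, truncated to the entry width.
        self.table[index % self.depth] = value & self.mask


p = PredictorTable(width=2, depth=4)
p.update(5, 7)          # index 5 wraps to entry 1; 7 truncates to 2 bits
print(p.predict(1))     # 3
```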

One-bit Predictor

Simple temporal prediction

[Figure: 1-bit table P indexed (I) by PC and updated (U) with Taken; output Prediction]

    A21064(PC; T) = P[1, 2K](PC; T)

What happens on loop branches?
At best, mispredicts twice for every use of loop.
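
The "twice per use of loop" claim can be checked with a short Python simulation of a single table entry. The function name and the initial not-taken state are assumptions made for this sketch; the loop branch is modelled as taken three times and then not taken on exit.

```python
def one_bit_mispredicts(outcomes, init=False):
    """Count mispredictions of a one-bit predictor that predicts the last outcome."""
    last = init
    misses = 0
    for taken in outcomes:
        if last != taken:
            misses += 1
        last = taken          # the single bit simply remembers the outcome
    return misses


loop = [True, True, True, False]      # one use of the loop
print(one_bit_mispredicts(loop * 3))  # 6
```

Each use of the loop costs one mispredict on exit and one more on re-entry, because the exit overwrote the only bit of state.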

Two-bit Predictor

[Figure: 2-bit table P indexed (I) by PC; a +/- adder driven by Taken computes the update (U); output Prediction]

    Counter[W, D](I; T) = P[W, D](I; if T then P+1 else P-1)
    A21164(PC; T) = MSB(Counter[2, 2K](PC; T))
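
A Python sketch of one two-bit counter entry follows. It assumes the counter saturates at 0 and 3, which the formula above does not state explicitly, and the function name is hypothetical. The branch pattern is a loop taken three times and then not taken.

```python
def two_bit_mispredicts(outcomes, init=0):
    """Count mispredictions of a single 2-bit counter; prediction is its MSB."""
    counter = init
    misses = 0
    for taken in outcomes:
        prediction = counter >= 2            # MSB of a 2-bit value
        if prediction != taken:
            misses += 1
        # Assumed saturating update: P+1 on taken, P-1 on not taken.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


loop = [True, True, True, False]
print(two_bit_mispredicts(loop * 3, init=0))  # 5
print(two_bit_mispredicts(loop * 3, init=2))  # 3
```

Once warmed up, the counter only drops from 3 to 2 on the loop exit, so re-entering the loop is still predicted taken and each use costs a single mispredict.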

History Register

[Figure: table P indexed (I) by PC; the update (U) concatenates the stored history with Taken; output History]

    History(PC; T) = P(PC; P || T)

Global History

[Figure: a single global history register (index 0), extended by concatenating Taken, indexes a +/- counter table; output Prediction]

    GHist(; T) = MSB(Counter(History(0; T); T))
    Ind-GHist(PC; T) = MSB(Counter(PC || Hist(GHist(; T))))

Can we take advantage of a pattern at a particular PC?
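
To make GHist concrete, here is a Python sketch in which a global history shift register indexes a table of two-bit counters. The class name, the 4-bit history length, the saturating counters and their initial value of 1 are all assumptions for illustration.

```python
class GlobalHistoryPredictor:
    """Global history register indexing a table of 2-bit counters."""

    def __init__(self, hist_bits=4):
        self.mask = (1 << hist_bits) - 1
        self.history = 0
        self.counters = [1] * (1 << hist_bits)   # assumed start: weakly not taken

    def predict(self):
        return self.counters[self.history] >= 2  # MSB of the counter

    def update(self, taken):
        c = self.counters[self.history]
        self.counters[self.history] = min(c + 1, 3) if taken else max(c - 1, 0)
        # Shift the outcome into the history register (History(0; T)).
        self.history = ((self.history << 1) | int(taken)) & self.mask


pred = GlobalHistoryPredictor()
late_misses = 0
for i in range(200):
    taken = (i % 2 == 0)                 # strictly alternating branch
    if i >= 100 and pred.predict() != taken:
        late_misses += 1
    pred.update(taken)
print(late_misses)                       # 0
```

An alternating branch defeats a plain counter, but with history as the index the two recurring history values each train their own counter, so the pattern is predicted once warmed up.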

Local History

[Figure: per-PC history table, extended by concatenating Taken, indexes a +/- counter table; output Prediction]

    LHist(PC; T) = MSB(Counter(History(PC; T); T))

Can we take advantage of the global pattern at a particular PC?
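
The LHist structure can be sketched in Python as a per-PC history table whose entries index a shared table of two-bit counters. Table sizes, the word-aligned PC indexing (`pc >> 2`), the saturating counters and all names are assumptions made for this sketch.

```python
class LocalHistoryPredictor:
    """Per-PC history registers indexing a shared table of 2-bit counters."""

    def __init__(self, hist_bits=4, pc_entries=64):
        self.mask = (1 << hist_bits) - 1
        self.pc_entries = pc_entries
        self.histories = [0] * pc_entries
        self.counters = [1] * (1 << hist_bits)   # assumed start: weakly not taken

    def _slot(self, pc):
        return (pc >> 2) % self.pc_entries        # assumes 4-byte instructions

    def predict(self, pc):
        return self.counters[self.histories[self._slot(pc)]] >= 2

    def update(self, pc, taken):
        slot = self._slot(pc)
        h = self.histories[slot]
        c = self.counters[h]
        self.counters[h] = min(c + 1, 3) if taken else max(c - 1, 0)
        self.histories[slot] = ((h << 1) | int(taken)) & self.mask


pred = LocalHistoryPredictor()
pattern_a = [True, True, False]   # branch at 0x40
pattern_b = [True, False]         # branch at 0x80
late_misses = 0
for i in range(60):
    for pc, taken in ((0x40, pattern_a[i % 3]), (0x80, pattern_b[i % 2])):
        if i >= 30 and pred.predict(pc) != taken:
            late_misses += 1
        pred.update(pc, taken)
print(late_misses)                # 0
```

Because each branch keeps its own history, the two interleaved patterns do not disturb each other even though they share one counter table.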

Two-level Predictor

[Figure: global history register (index 0), extended with Taken, is concatenated with the PC to index a +/- counter table; output Prediction]

    2Level(PC; T) = MSB(Counter(History(0; T) || PC; T))

Two-Level Branch Predictor

Pentium Pro uses the result from the last two branches to select one of the four sets of BHT bits (~95% correct)

[Figure: k bits of the fetch PC and a 2-bit global branch history shift register, which shifts in the Taken/¬Taken result of each branch, select the BHT entry that answers Taken/¬Taken?]

Gshare Predictor

[Figure: global history register (index 0), extended with Taken, is XORed with the PC to index a +/- counter table; output Prediction]

    Gshare(PC; T) = MSB(Counter(History(0; T) ⊕ PC; T))
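
The difference between the two-level predictor and gshare is only in how the counter-table index is formed. The Python sketch below contrasts the two; the function names, the bit widths and the use of `pc >> 2` (assuming word-aligned instructions) are illustrative assumptions.

```python
def concat_index(pc, history, pc_bits=4, hist_bits=4):
    """Two-level style index: history bits concatenated with low PC bits."""
    pc_part = (pc >> 2) & ((1 << pc_bits) - 1)
    hist_part = history & ((1 << hist_bits) - 1)
    return (hist_part << pc_bits) | pc_part


def gshare_index(pc, history, bits=4):
    """Gshare style index: history XORed with low PC bits."""
    return ((pc >> 2) ^ history) & ((1 << bits) - 1)


pc, history = 0x1234, 0b1010
print(concat_index(pc, history))   # 173: needs a 256-entry table
print(gshare_index(pc, history))   # 7: fits a 16-entry table
```

With the same four PC bits and four history bits, concatenation needs a table sixteen times larger, while XOR folds both into one index at the cost of possible aliasing between branches.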

Choosing Predictors

[Figure: a chooser selects between the LHist and GHist predictions; output Prediction]

    Chooser = MSB(P(PC; P + (A==T) - (B==T)))

or

    Chooser = MSB(P(GHist(PC; T); P + (A==T) - (B==T)))
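
The chooser update P + (A==T) - (B==T) moves a counter toward whichever component predictor was right when the two disagree. A Python sketch of the PC-indexed variant follows; the class name, table size, saturation at 0 and 3, the initial value of 2 and the convention that a set MSB selects predictor A are assumptions.

```python
class Chooser:
    """PC-indexed 2-bit counters selecting between predictors A and B."""

    def __init__(self, entries=16):
        self.entries = entries
        self.counters = [2] * entries          # assumed start: weakly prefer A

    def _slot(self, pc):
        return (pc >> 2) % self.entries

    def choose(self, pc, pred_a, pred_b):
        use_a = self.counters[self._slot(pc)] >= 2   # MSB set selects A
        return pred_a if use_a else pred_b

    def update(self, pc, pred_a, pred_b, taken):
        i = self._slot(pc)
        delta = int(pred_a == taken) - int(pred_b == taken)
        self.counters[i] = max(0, min(3, self.counters[i] + delta))


ch = Chooser()
for _ in range(3):
    ch.update(0x40, pred_a=False, pred_b=True, taken=True)   # only B is right
print(ch.choose(0x40, pred_a=False, pred_b=True))            # True (B's prediction)
```

When both predictors are right, or both wrong, the delta is zero and the chooser keeps its current preference.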

Tournament Branch Predictor (Alpha 21264)

[Figure: PC and global history feed the following tables, which together produce the Prediction]
  • Local history table (1,024 x 10b)
  • Local prediction (1,024 x 3b)
  • Global prediction (4,096 x 2b)
  • Choice prediction (4,096 x 2b)
  • Global history (12b)

Choice predictor learns whether best to use local or global branch history in predicting next branch
Global history is speculatively updated but restored on mispredict
Claim 90-100% success on range of applications