Constructive Computer Architecture: Control Hazards
Arvind
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
October 8, 2014 http://csg.csail.mit.edu/6.175 L12-1

Killing fetched instructions
  • In the simple design with combinational memory we have discussed so far, the mispredicted instruction was present in the f2d. So the Execute stage can atomically:
      - clear the f2d
      - set the pc to the correct target
  • In highly pipelined machines there can be multiple mispredicted and partially executed instructions in the pipeline; it will generally take more than one cycle to kill all such instructions
  • We need a more general solution than clearing the f2d FIFO

Epoch: a method for managing control hazards
  • Add an epoch register in the processor state
  • The Execute stage changes the epoch whenever the pc prediction is wrong and sets the pc to the correct value
  • The Fetch stage associates the current epoch with every instruction when it is fetched
  • The epoch of the instruction is examined when it is ready to execute. If the processor epoch has changed, the instruction is thrown away
[Figure: Fetch and Execute stages connected by f2d; labels: PC, pred, iMem, inst, Epoch, targetPC]

An epoch based solution
Can these rules execute concurrently? Yes. Two values for epoch are sufficient.

    rule doFetch;
      let instF = iMem.req(pc[0]);
      let ppcF = nextAddr(pc[0]);
      pc[0] <= ppcF;
      f2d.enq(Fetch2Decode{pc: pc[0], ppc: ppcF, epoch: epoch, inst: instF});
    endrule

    rule doExecute;
      let x = f2d.first;
      let pcD = x.pc; let inEp = x.epoch;
      let ppcD = x.ppc; let instD = x.inst;
      if(inEp == epoch) begin
        let dInst = decode(instD);
        ... register fetch ...;
        let eInst = exec(dInst, rVal1, rVal2, pcD, ppcD);
        ... memory operation ...
        ... rf update ...
        if (eInst.mispredict) begin
          pc[1] <= eInst.addr;
          epoch <= epoch + 1;
        end
      end
      f2d.deq;
    endrule
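To see why tagging each instruction with an epoch is enough to discard wrong-path work, the two rules above can be mimicked by a small behavioural model. The sketch below is written in Python rather than Bluespec and is only an illustration: the `simulate` function, the dictionary encoding of a program (pc mapped to a jump target, or `None` for fall-through), and the `pc + 1` predictor are all assumptions made for this example, not part of the lecture code.

```python
from collections import deque

def simulate(program, max_cycles=100):
    """Cycle-level toy model of doFetch/doExecute sharing one epoch register.

    program maps pc -> jump target, or None for a fall-through instruction
    (hypothetical encoding). The predictor always guesses pc + 1.
    """
    pc, epoch = 0, 0
    f2d = deque()
    committed, killed = [], []
    for _ in range(max_cycles):
        head = f2d[0] if f2d else None      # both rules read start-of-cycle state
        next_pc, next_epoch = pc, epoch

        # doFetch: tag the fetched instruction with the current epoch
        fetched = None
        if pc in program:
            fetched = (pc, pc + 1, epoch)
            next_pc = pc + 1

        # doExecute: run the head of f2d only if its epoch is current
        if head is not None:
            f2d.popleft()
            ipc, ppc, in_ep = head
            if in_ep == epoch:
                committed.append(ipc)
                target = program[ipc]
                actual = ipc + 1 if target is None else target
                if actual != ppc:
                    next_pc = actual        # like pc[1]: overrides fetch's write
                    next_epoch = epoch ^ 1  # two epoch values are enough here
            else:
                killed.append(ipc)          # wrong-path instruction is dropped

        if fetched is not None:
            f2d.append(fetched)
        pc, epoch = next_pc, next_epoch
        if not f2d and pc not in program:
            break
    return committed, killed

# Instruction 1 jumps to 4, so instruction 2 is fetched on the wrong path.
print(simulate({0: None, 1: 4, 2: None, 3: None, 4: None}))
# prints ([0, 1, 4], [2])
```

In this run, instruction 2 enters f2d carrying the old epoch in the same cycle in which instruction 1 is found to be mispredicted, and it is discarded one cycle later when Execute compares epochs.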

Discussion
  • The epoch-based solution kills one wrong-path instruction at a time in the Execute stage
  • It may be slow, but it is more robust in more complex pipelines, if you have multiple stages between fetch and execute or if you have outstanding instruction requests to the iMem
  • It requires the Execute stage to set the pc and epoch registers simultaneously, which may result in a long combinational path from Execute to Fetch

Decoupled Fetch and Execute
[Figure: Fetch sends <inst, pc, ppc, epoch> to Execute; Execute sends <corrected pc, new epoch> back to Fetch]
  • In decoupled systems a subsystem reads and modifies only local state atomically
      - In our solution, pc and epoch are read by both rules
  • Properly decoupled systems permit greater freedom in independent refinement of subsystems

A decoupled solution using epochs
[Figure: fetch (with fEpoch) and execute stages]
  • Add fEpoch and eEpoch registers to the processor state; initialize them to the same value
  • The epoch changes whenever Execute detects the pc prediction to be wrong. This change is reflected immediately in eEpoch and eventually in fEpoch via a message from Execute to Fetch
  • Associate the fEpoch with every instruction when it is fetched
  • In the execute stage, reject (i.e., kill) the instruction if its epoch does not match eEpoch

Control Hazard resolution: A robust two-rule solution
[Figure: PC with +4 and fEpoch feeding Inst Memory; f2d FIFO into Decode, Register File, Execute (eEpoch) and Data Memory; redirect FIFO from Execute back to PC]
  • Execute sends information about the target pc to Fetch, which updates fEpoch and pc whenever it looks at the redirect PC fifo

Two-stage pipeline: Decoupled code structure

    module mkProc(Proc);
      Fifo#(Fetch2Execute) f2d <- mkFifo;
      Fifo#(Addr) execRedirect <- mkFifo;
      Reg#(Bool) fEpoch <- mkReg(False);
      Reg#(Bool) eEpoch <- mkReg(False);

      rule doFetch;
        let instF = iMem.req(pc);
        ...
        f2d.enq(... instF ..., fEpoch);
      endrule

      rule doExecute;
        if(inEp == eEpoch) begin
          // Decode and execute the instruction; update state;
          // In case of misprediction, execRedirect.enq(correct pc);
        end
        f2d.deq;
      endrule
    endmodule

The Fetch rule

    rule doFetch;
      let instF = iMem.req(pc);
      if(!execRedirect.notEmpty)
      begin
        let ppcF = nextAddrPredictor(pc);
        pc <= ppcF;
        // pass the pc and predicted pc to the execute stage
        f2d.enq(Fetch2Execute{pc: pc, ppc: ppcF, inst: instF, epoch: fEpoch});
      end
      else
      begin
        fEpoch <= !fEpoch;
        pc <= execRedirect.first;
        execRedirect.deq;
      end
    endrule

Notice: In case of PC redirection, nothing is enqueued into f2d

The Execute rule
exec returns a flag if there was a fetch misprediction

    rule doExecute;
      let instD = f2d.first.inst;
      let pcD = f2d.first.pc;
      let ppcD = f2d.first.ppc;
      let inEp = f2d.first.epoch;
      if(inEp == eEpoch) begin
        let dInst = decode(instD);
        let rVal1 = rf.rd1(validRegValue(dInst.src1));
        let rVal2 = rf.rd2(validRegValue(dInst.src2));
        let eInst = exec(dInst, rVal1, rVal2, pcD, ppcD);
        if(eInst.iType == Ld)
          eInst.data <- dMem.req(MemReq{op: Ld, addr: eInst.addr, data: ?});
        else if (eInst.iType == St)
          let d <- dMem.req(MemReq{op: St, addr: eInst.addr, data: eInst.data});
        if (isValid(eInst.dst))
          rf.wr(validRegValue(eInst.dst), eInst.data);
        if(eInst.mispredict) begin
          execRedirect.enq(eInst.addr);
          eEpoch <= !inEp;
        end
      end
      f2d.deq;
    endrule

Can these rules execute concurrently? Yes, assuming CF FIFOs
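The interaction between fEpoch, eEpoch and the redirect FIFO can be traced with a small behavioural model of the two rules above. This is a sketch in Python, not Bluespec; `simulate_decoupled`, the dictionary encoding of a program (pc mapped to a jump target, or `None` for fall-through) and the `pc + 1` predictor are assumptions made for illustration. The FIFOs are modelled as conflict-free: both rules observe the FIFO contents as they were at the start of the cycle.

```python
from collections import deque

def simulate_decoupled(program, max_cycles=100):
    """Toy model of the decoupled doFetch/doExecute rules.

    Fetch touches only pc and f_epoch; Execute touches only e_epoch.
    They communicate through the f2d and redirect FIFOs.
    """
    pc, f_epoch, e_epoch = 0, False, False
    f2d, redirect = deque(), deque()
    committed, killed = [], []
    for _ in range(max_cycles):
        head = f2d[0] if f2d else None            # start-of-cycle FIFO state
        redir = redirect[0] if redirect else None

        # doFetch: either consume a redirect or fetch, never both
        new_entry = None
        if redir is not None:
            redirect.popleft()
            f_epoch = not f_epoch
            pc = redir                             # nothing is enqueued into f2d
        elif pc in program:
            new_entry = (pc, pc + 1, f_epoch)
            pc = pc + 1

        # doExecute: kill the instruction if its epoch does not match e_epoch
        if head is not None:
            f2d.popleft()
            ipc, ppc, in_ep = head
            if in_ep == e_epoch:
                committed.append(ipc)
                target = program[ipc]
                actual = ipc + 1 if target is None else target
                if actual != ppc:
                    redirect.append(actual)        # message to Fetch
                    e_epoch = not in_ep
            else:
                killed.append(ipc)

        if new_entry is not None:
            f2d.append(new_entry)
        if not f2d and not redirect and pc not in program:
            break
    return committed, killed

# Instruction 1 jumps to 4; instruction 2 is fetched before the redirect arrives.
print(simulate_decoupled({0: None, 1: 4, 2: None, 3: None, 4: None}))
# prints ([0, 1, 4], [2])
```

Here eEpoch flips in the cycle of the misprediction, while fEpoch flips one cycle later when Fetch dequeues the redirect, which matches the "immediately in eEpoch and eventually in fEpoch" behaviour described earlier.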

The epoch mechanism is independent of the branch prediction scheme used. We will study sophisticated branch prediction schemes later.

Consider a different two-stage pipeline
[Figure: stage 1 moves from "Fetch" to "Fetch, Decode, RegisterFetch"; stage 2 is "Execute, Memory, WriteBack"; labels: PC, pred, Inst Memory, Decode, Register File, f2d, Execute, Data Memory, Inst i+1]
  • Suppose we move the pipeline stage boundary from after Fetch to after Decode and Register Fetch, for a better balance of work between the two stages
  • The pipeline will still have control hazards

A different 2-Stage pipeline: 2-Stage-DH pipeline
[Figure: stage 1 "Fetch, Decode, RegisterFetch" (PC, pred, fEpoch, Inst Memory, Decode, Register File); stage 2 "Execute, Memory, WriteBack" (eEpoch, Execute, Data Memory); the d2e and redirect Fifos connect the stages]
  • Use the same epoch solution for control hazards as before

Type Decode2Execute
The Fetch stage, in addition to fetching the instruction, also decodes the instruction and fetches the operands from the register file. It passes these operands to the Execute stage.

    typedef struct {
      Addr pc;
      Addr ppc;
      Bool epoch;
      DecodedInst dInst;
      Data rVal1;    // values instead of register names
      Data rVal2;
    } Decode2Execute deriving (Bits, Eq);

2-Stage-DH pipeline

    module mkProc(Proc);
      Reg#(Addr) pc <- mkRegU;
      RFile rf <- mkRFile;
      IMemory iMem <- mkIMemory;
      DMemory dMem <- mkDMemory;
      Fifo#(Decode2Execute) d2e <- mkFifo;
      Reg#(Bool) fEpoch <- mkReg(False);
      Reg#(Bool) eEpoch <- mkReg(False);
      Fifo#(Addr) execRedirect <- mkFifo;

      rule doFetch …
      rule doExecute …

2-Stage-DH pipeline: doFetch rule, first attempt

    rule doFetch;
      let instF = iMem.req(pc);
      if(execRedirect.notEmpty) begin
        fEpoch <= !fEpoch;
        pc <= execRedirect.first;
        execRedirect.deq;
      end
      else begin
        let ppcF = nextAddrPredictor(pc);
        pc <= ppcF;
        // the next three lines were moved from Execute
        let dInst = decode(instF);
        let rVal1 = rf.rd1(validRegValue(dInst.src1));
        let rVal2 = rf.rd2(validRegValue(dInst.src2));
        d2e.enq(Decode2Execute{pc: pc, ppc: ppcF, dInst: dInst, epoch: fEpoch,
                               rVal1: rVal1, rVal2: rVal2});
      end
    endrule

2-Stage-DH pipeline: doExecute rule, first attempt
Not quite correct. Why? Fetch is potentially reading stale values from rf.

    rule doExecute;
      let x = d2e.first;
      let dInstE = x.dInst; let pcE = x.pc;
      let ppcE = x.ppc; let epoch = x.epoch;
      let rVal1E = x.rVal1; let rVal2E = x.rVal2;
      if(epoch == eEpoch) begin
        // no change in the rest of the rule
        let eInst = exec(dInstE, rVal1E, rVal2E, pcE, ppcE);
        if(eInst.iType == Ld)
          eInst.data <- dMem.req(MemReq{op: Ld, addr: eInst.addr, data: ?});
        else if (eInst.iType == St)
          let d <- dMem.req(MemReq{op: St, addr: eInst.addr, data: eInst.data});
        if (isValid(eInst.dst) && validValue(eInst.dst).regType == Normal)
          rf.wr(validRegValue(eInst.dst), eInst.data);
        if(eInst.mispredict) begin
          execRedirect.enq(eInst.addr);
          eEpoch <= !eEpoch;
        end
      end
      d2e.deq;
    endrule

Data Hazards
[Figure: "fetch & decode" and "execute" stages connected by d2e; labels: pc, rf, dMem]

    time     t0   t1   t2   t3   t4   t5   t6   t7  ...
    FDstage  FD1  FD2  FD3  FD4  FD5
    EXstage       EX1  EX2  EX3  EX4  EX5

    I1  Add(R1, R2, R3)
    I2  Add(R4, R1, R2)

  • I2 must be stalled until I1 updates the register file
[Figure: the slide repeats the timing diagram (time t0..t7; FDstage FD1..FD5; EXstage EX1..EX5) below this note]

Dealing with data hazards
  • Keep track of instructions in the pipeline and determine if the register values to be fetched are stale, i.e., will be modified by some older instruction still in the pipeline. This condition is referred to as a read-after-write (RAW) hazard
  • Stall the Fetch from dispatching the instruction as long as the RAW hazard prevails
  • The RAW hazard will disappear as the pipeline drains
  • Scoreboard: a data structure to keep track of the instructions in the pipeline beyond the Fetch stage
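The effect of stalling on the pipeline timeline can be worked out with a short calculation. The Python sketch below is illustrative only: the `schedule` function and the `(dst, srcs)` tuple encoding are hypothetical, and it assumes a two-stage pipeline with no bypassing, where a register written during an instruction's execute cycle becomes readable in the following cycle.

```python
def schedule(instrs):
    """Return (fd_cycle, ex_cycle) for each instruction, stalling on RAW hazards.

    instrs is a list of (dst, srcs): dst is a register number or None,
    srcs is a tuple of source register numbers (hypothetical encoding).
    """
    times = []
    fd_free = 0      # earliest cycle in which the FD stage is available
    prev_ex = -1     # execute cycle of the previous instruction
    for dst, srcs in instrs:
        fd = fd_free
        # RAW hazard: wait until every older producer of a source has executed
        for (older_dst, _), (_, older_ex) in zip(instrs, times):
            if older_dst is not None and older_dst in srcs:
                fd = max(fd, older_ex + 1)
        ex = max(fd + 1, prev_ex + 1)
        times.append((fd, ex))
        fd_free = fd + 1
        prev_ex = ex
    return times

# I1 = Add(R1, R2, R3), I2 = Add(R4, R1, R2), then an independent instruction.
print(schedule([(1, (2, 3)), (4, (1, 2)), (5, (6, 7))]))
# prints [(0, 1), (2, 3), (3, 4)]
```

Under these assumptions I2 cannot read the register file until t2, one cycle later than it would without the hazard, and every later instruction is delayed by the same single bubble.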

Data Hazard
  • A data hazard depends upon the match between the source registers of the fetched instruction and the destination register of an instruction already in the pipeline
  • Both the source and destination registers must be Valid for a hazard to exist

    function Bool isFound(Maybe#(FullIndx) x, Maybe#(FullIndx) y);
      if(x matches tagged Valid .xv &&& y matches tagged Valid .yv &&& yv == xv)
        return True;
      else
        return False;
    endfunction
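The same check can be written compactly in a language with optional values. The sketch below is a Python analogue, not the lecture's code: `Maybe#(FullIndx)` is modelled as `Optional[int]`, with `None` standing for Invalid, and the name `is_found` is chosen for this example.

```python
from typing import Optional

def is_found(x: Optional[int], y: Optional[int]) -> bool:
    """True only when both register indices are valid and equal."""
    return x is not None and y is not None and x == y

print(is_found(1, 1))        # prints True
print(is_found(1, 2))        # prints False
print(is_found(None, None))  # prints False: two Invalid values never match
```

The last case is the important one: an instruction with no destination and an instruction with no source must not be reported as a hazard, even though the two Invalid values compare equal.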

Scoreboard: Keeping track of instructions in execution
  • Scoreboard: a data structure to keep track of the destination registers of the instructions beyond the fetch stage
      - method insert: inserts the destination (if any) of an instruction in the scoreboard when the instruction is decoded
      - method search1(src): searches the scoreboard for a data hazard
      - method search2(src): same as search1
      - method remove: deletes the oldest entry when an instruction commits
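The four methods above map naturally onto a FIFO of optional destination registers. The following Python class is a minimal sketch of that interface under stated assumptions: register indices are plain integers, `None` stands for "no destination", and the internal `deque` is an implementation choice made for this example rather than the lecture's design.

```python
from collections import deque
from typing import Optional

class Scoreboard:
    """FIFO of destination registers for instructions past the fetch stage."""

    def __init__(self):
        self._entries = deque()

    def insert(self, dst: Optional[int]) -> None:
        # Called when an instruction is decoded; dst is None if it writes nothing.
        self._entries.append(dst)

    def search1(self, src: Optional[int]) -> bool:
        # A hazard exists only if src is valid and matches a valid destination.
        return src is not None and any(
            d is not None and d == src for d in self._entries
        )

    search2 = search1  # second read port, same behaviour as search1

    def remove(self) -> None:
        # Called when the oldest instruction commits.
        self._entries.popleft()

sb = Scoreboard()
sb.insert(1)                           # I1 = Add(R1, R2, R3) writes R1
print(sb.search1(1), sb.search2(2))    # I2 = Add(R4, R1, R2): prints True False
sb.remove()                            # I1 commits
print(sb.search1(1), sb.search2(2))    # prints False False
```

While I1 is in flight, the search on R1 reports a hazard and I2 would be stalled; once I1 commits and its entry is removed, both searches come back clear and I2 can be dispatched.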

2-Stage-DH pipeline: Scoreboard and Stall logic
[Figure: the 2-Stage-DH pipeline (PC, pred, fEpoch, Inst Memory, Decode, Register File, d2e, Execute, eEpoch, Data Memory, redirect) with a scoreboard added]