Lecture 6: Advanced Pipelines
• Multi-cycle in-order pipelines and out-of-order pipelines (Appendix A, Sections 3.5-3.6)

Control Hazards
• Simple techniques to handle control hazard stalls:
  Ø for every branch, introduce a stall cycle (note: every 6th instruction is a branch!)
  Ø assume the branch is not taken and start fetching the next instruction
    – if the branch is taken, need hardware to cancel the effect of the wrong-path instruction
  Ø fetch the next instruction (branch delay slot) and execute it anyway
    – if the instruction turns out to be on the correct path, useful work was done
    – if the instruction turns out to be on the wrong path, hopefully program state is not lost

Branch Delay Slots

Slowdowns from Stalls
• Perfect pipelining with no hazards: an instruction completes every cycle (total cycles ~ num instructions), so speedup = increase in clock speed = num pipeline stages
• With hazards and stalls, some cycles (= stall time) go by during which no instruction completes, and then the stalled instruction completes
• Total cycles = number of instructions + stall cycles
• Slowdown because of stalls = 1 / (1 + stall cycles per instr), i.e., performance relative to the perfect pipeline
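
The stall formulas above can be checked with a few lines of Python. This is a minimal sketch: the function names, the 5-stage depth, and the one-stall-cycle-per-branch workload are illustrative assumptions, not values from the slides (only the "every 6th instruction is a branch" figure is).

```python
def total_cycles(num_instructions: int, stall_cycles: int) -> int:
    """Total cycles = number of instructions + stall cycles (ideal CPI of 1)."""
    return num_instructions + stall_cycles


def slowdown_factor(stall_cycles_per_instr: float) -> float:
    """Performance relative to a stall-free pipeline: 1 / (1 + stalls per instr)."""
    return 1.0 / (1.0 + stall_cycles_per_instr)


def pipeline_speedup(num_stages: int, stall_cycles_per_instr: float) -> float:
    """Speedup over an unpipelined design: ideal speedup scaled by the slowdown."""
    return num_stages * slowdown_factor(stall_cycles_per_instr)


# Assumed workload: every 6th instruction is a branch that costs one stall cycle.
stalls_per_instr = 1 / 6
print(total_cycles(600, 100))
print(round(slowdown_factor(stalls_per_instr), 3))
print(round(pipeline_speedup(5, stalls_per_instr), 3))
```

With these assumed numbers, a 5-stage pipeline delivers noticeably less than its ideal 5x speedup, which is the point of the last bullet.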

Pipeline Implementation
• Signals for the muxes have to be generated – some of this can happen during ID
• Need look-up tables to identify situations that merit bypassing/stalling – the number of inputs to the muxes goes up

Detecting Control Signals
• Situation: No dependence
  Example code:
    LD   R1, 45(R2)
    DADD R5, R6, R7
    DSUB R8, R6, R7
    OR   R9, R6, R7
  Action: No hazards
• Situation: Dependence requiring stall
  Example code:
    LD   R1, 45(R2)
    DADD R5, R1, R7
    DSUB R8, R6, R7
    OR   R9, R6, R7
  Action: Detect use of R1 during ID of DADD and stall
• Situation: Dependence overcome by forwarding
  Example code:
    LD   R1, 45(R2)
    DADD R5, R6, R7
    DSUB R8, R1, R7
    OR   R9, R6, R7
  Action: Detect use of R1 during ID of DSUB and set mux control signal that accepts result from bypass path
• Situation: Dependence with accesses in order
  Example code:
    LD   R1, 45(R2)
    DADD R5, R6, R7
    DSUB R8, R6, R7
    OR   R9, R1, R7
  Action: No action required
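
The four situations above reduce to a check on how far the first consumer of the loaded register sits behind the load. The sketch below assumes a classic 5-stage pipeline in which the register file is written in the first half of a cycle and read in the second half; the `Instr` type and `load_use_action` function are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def load_use_action(program: List[Instr], load_index: int = 0) -> str:
    """Classify the hazard between a load and the first later reader of its result."""
    load = program[load_index]
    for distance, instr in enumerate(program[load_index + 1:], start=1):
        if load.dest in instr.srcs:
            if distance == 1:
                return "stall"              # value not back from memory yet
            if distance == 2:
                return "forward"            # take the value from the bypass path
            return "no action required"     # register file already holds it
        if instr.dest == load.dest:
            break                           # register overwritten before any read
    return "no hazards"


LD = Instr("LD", "R1", ("R2",))
stall_case = [LD, Instr("DADD", "R5", ("R1", "R7")),
              Instr("DSUB", "R8", ("R6", "R7")), Instr("OR", "R9", ("R6", "R7"))]
forward_case = [LD, Instr("DADD", "R5", ("R6", "R7")),
                Instr("DSUB", "R8", ("R1", "R7")), Instr("OR", "R9", ("R6", "R7"))]
print(load_use_action(stall_case), "/", load_use_action(forward_case))
```

In hardware this comparison is done on register specifiers during ID; the loop here only mirrors that decision, it does not model the pipeline itself.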

Multicycle Instructions
Functional unit    Latency    Initiation interval
Integer ALU        1          1
Data memory        2          1
FP add             4          1
FP multiply        7          1
FP divide          25         25

Effects of Multicycle Instructions
• Structural hazards if the unit is not fully pipelined (divider)
• Frequent RAW hazard stalls
• Potentially multiple writes to the register file in a cycle
• WAW hazards because of out-of-order instr completion
• Imprecise exceptions because of out-of-order instr completion
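
Two of these effects, multiple register writes in one cycle and WAW hazards, follow directly from unequal latencies. The sketch below uses the latencies from the Multicycle Instructions slide and a deliberately simplified timing model (instruction i starts executing in cycle i and writes its result in cycle i + latency, with no stalls); that model and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Latencies taken from the Multicycle Instructions slide.
LATENCY = {"int": 1, "mem": 2, "fpadd": 4, "fpmul": 7, "fpdiv": 25}

Program = List[Tuple[str, str]]  # (functional unit, destination register)


def completion_cycles(program: Program) -> List[int]:
    """Assumed model: instruction i starts in cycle i and writes at i + latency."""
    return [start + LATENCY[unit] for start, (unit, _dest) in enumerate(program)]


def write_port_conflicts(program: Program) -> Dict[int, List[int]]:
    """Cycles in which more than one instruction wants to write the register file."""
    by_cycle = defaultdict(list)
    for idx, cycle in enumerate(completion_cycles(program)):
        by_cycle[cycle].append(idx)
    return {c: idxs for c, idxs in by_cycle.items() if len(idxs) > 1}


def waw_hazards(program: Program) -> List[Tuple[int, int]]:
    """Pairs (earlier, later) where the later writer of a register would not finish last."""
    done = completion_cycles(program)
    return [(i, j)
            for i in range(len(program))
            for j in range(i + 1, len(program))
            if program[i][1] == program[j][1] and done[j] <= done[i]]


prog = [("fpmul", "F0"), ("fpadd", "F0"), ("int", "R1"), ("fpadd", "F4")]
print(completion_cycles(prog))
print(write_port_conflicts(prog))
print(waw_hazards(prog))
```

Here the FP add to F0 would finish before the older FP multiply to F0, so without intervention F0 would end up holding the multiply result, which is the WAW problem named above.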

Precise Exceptions
• On an exception:
  Ø must save PC of instruction where program must resume
  Ø all instructions after that PC that might be in the pipeline must be converted to NOPs (other instructions continue to execute and may raise exceptions of their own)
  Ø temporary program state not in memory (in other words, registers) has to be stored in memory
  Ø potential problems if a later instruction has already modified memory or registers
• A processor that fulfils all the above conditions is said to provide precise exceptions (useful for debugging and, of course, correctness)

Dealing with these Effects
• Multiple writes to the register file: increase the number of ports, stall one of the writers during ID, or stall one of the writers during WB (the stall will propagate)
• WAW hazards: detect the hazard during ID and stall the later instruction
• Imprecise exceptions: buffer the results if they complete early, or save more pipeline state so that you can return to exactly the state you left

ILP
• Instruction-level parallelism: overlap among instructions, through pipelining or multiple instruction execution
• What determines the degree of ILP?
  Ø dependences: property of the program
  Ø hazards: property of the pipeline

Types of Dependences
• Data dependences: an instr produces a result for another (true dependence, results in RAW hazards in a pipeline)
• Name dependences: two instrs that use the same names (anti and output dependences, result in WAR and WAW hazards in a pipeline)
• Control dependences: an instruction's execution depends on the result of a branch – re-ordering should preserve exception behavior and dataflow

An Out-of-Order Processor Implementation
[Figure: Branch prediction and instr fetch → Instr Fetch Queue → Decode & Rename → Issue Queue (IQ) → ALUs. Reorder Buffer (ROB) entries Instr 1 to Instr 6 hold temporaries T1-T6. Register File R1-R32. Results written to ROB and tags broadcast to IQ.]
Original code        After renaming
R1 ← R1+R2           T1 ← R1+R2
R2 ← R1+R3           T2 ← T1+R3
BEQZ R2              BEQZ T2
R3 ← R1+R2           T4 ← T1+T2
R1 ← R3+R2           T5 ← T4+T2

Design Details - I
• Instructions enter the pipeline in order
• No need for branch delay slots if prediction happens in time
• Instructions leave the pipeline in order
  – all instructions that enter also get placed in the ROB
  – the process of an instruction leaving the ROB (in order) is called commit
  – an instruction commits only if it and all instructions before it have completed successfully (without an exception)
• To preserve precise exceptions, a result is written into the register file only when the instruction commits
  – until then, the result is saved in a temporary register in the ROB
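
The in-order commit rule above can be sketched as a small queue model. This is an illustrative toy, not a description of any real processor: the `ReorderBuffer` class, its method names, and the dictionary-based entries are all assumptions.

```python
from collections import deque


class ReorderBuffer:
    """Toy ROB: results wait in the ROB and reach the register file only on commit."""

    def __init__(self):
        self.entries = deque()   # oldest instruction at the left
        self.regfile = {}        # architectural (committed) register values

    def dispatch(self, dest):
        """Place a newly fetched instruction at the tail of the ROB, in program order."""
        entry = {"dest": dest, "value": None, "done": False, "exception": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value, exception=False):
        """Record a result (possibly out of order) in the entry's temporary storage."""
        entry["value"] = value
        entry["done"] = True
        entry["exception"] = exception

    def commit(self):
        """Retire completed instructions from the head, stopping at the first obstacle."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            head = self.entries[0]
            if head["exception"]:
                break            # registers now reflect everything before this instr
            self.entries.popleft()
            if head["dest"] is not None:
                self.regfile[head["dest"]] = head["value"]
            committed.append(head)
        return committed


rob = ReorderBuffer()
first = rob.dispatch("R1")
second = rob.dispatch("R2")
rob.complete(second, 7)          # the younger instruction finishes first
print(len(rob.commit()), rob.regfile)
rob.complete(first, 3)
print(len(rob.commit()), rob.regfile)
```

The younger instruction's result stays invisible in the register file until the older one has also completed, which is what keeps the architectural state precise.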

Design Details - II
• Instructions get renamed and placed in the issue queue
  – some operands are available (T1-T6; R1-R32), while others are being produced by instructions in flight (T1-T6)
• As instructions finish, they write results into the ROB (T1-T6) and broadcast the operand tag (T1-T6) to the issue queue
  – instructions now know if their operands are ready
• When a ready instruction issues, it reads its operands from T1-T6 and R1-R32 and executes (out-of-order execution)
• Can you have WAW or WAR hazards? By using more names (T1-T6), name dependences can be avoided
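
Renaming to ROB tags can be sketched in a few lines. The example assumes that the n-th instruction is given tag Tn (so a branch consumes a tag it never writes) and that the five-instruction sequence reads R1 ← R1+R2, R2 ← R1+R3, BEQZ R2, R3 ← R1+R2, R1 ← R3+R2; the first two destinations are inferred from the renamed form in the figure. The `rename` function is a hypothetical name.

```python
from typing import Dict, List, Optional, Tuple

Instr = Tuple[Optional[str], Tuple[str, ...]]  # (destination or None, sources)


def rename(program: List[Instr]) -> List[Instr]:
    """Replace each destination with a fresh tag and each source with its newest producer."""
    latest: Dict[str, str] = {}  # logical register -> tag of newest in-flight producer
    renamed = []
    for n, (dest, srcs) in enumerate(program, start=1):
        # Sources with no in-flight producer are read from the register file.
        new_srcs = tuple(latest.get(s, s) for s in srcs)
        tag = f"T{n}"  # assumption: ROB entry n owns temporary Tn
        if dest is not None:
            latest[dest] = tag
            renamed.append((tag, new_srcs))
        else:
            renamed.append((None, new_srcs))  # e.g. a branch, which writes no register
    return renamed


program = [
    ("R1", ("R1", "R2")),
    ("R2", ("R1", "R3")),
    (None, ("R2",)),         # BEQZ R2
    ("R3", ("R1", "R2")),
    ("R1", ("R3", "R2")),
]
for dest, srcs in rename(program):
    print(dest, srcs)
```

Because the two writes to R1 land in different tags (T1 and T5), the last instruction no longer has a WAW conflict with the first or a WAR conflict with the earlier readers of R1; only the true (RAW) dependences remain.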

Design Details - III
• If instr-3 raises an exception, wait until it reaches the top of the ROB
  – at this point, R1-R32 contain results for all instructions up to instr-3
  – save registers, save PC of instr-3, and service the exception
• If branch is a mispredict, flush all instructions after the branch and start on the correct path
  – mispredicted instrs will not have updated registers (the branch cannot commit until it has completed and the flush happens as soon as the branch completes)
• Potential problems: ?

Managing Register Names
Temporary values are stored in the register file and not the ROB
Logical Registers R1-R32, Physical Registers P1-P64
At the start, R1-R32 can be found in P1-P32
Instructions stop entering the pipeline when P64 is assigned
Original code        After renaming
R1 ← R1+R2           P33 ← P1+P2
R2 ← R1+R3           P34 ← P33+P3
BEQZ R2              BEQZ P34
R3 ← R1+R2           P35 ← P33+P34
What happens on commit?

The Commit Process
• On commit, no copy is required
• The register map table is updated
  – the "committed" value of R1 is now in P33 and not P1
  – on an exception, P33 is copied to memory and not P1
• An instruction in the issue queue need not modify its input operand when the producer commits
• When instruction-1 commits, we no longer have any use for P1
  – it is put in a free pool and a new instruction can now enter the pipeline
  – for every instr that commits, a new instr can enter the pipeline
  – number of in-flight instrs is a constant = number of extra (rename) registers
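
A map table with a free pool can be sketched as follows. The `Renamer` class, its two map tables (a speculative one updated at rename and a committed one updated at commit), and the stall-by-returning-`None` convention are assumptions for illustration; the register counts (R1-R32, P1-P64) come from the slides.

```python
from collections import deque


class Renamer:
    """Toy physical-register renamer with a free pool."""

    def __init__(self, num_logical=32, num_physical=64):
        # At the start, R1-R32 live in P1-P32; the remaining registers are free.
        self.map = {f"R{i}": f"P{i}" for i in range(1, num_logical + 1)}
        self.committed = dict(self.map)
        self.free = deque(f"P{i}" for i in range(num_logical + 1, num_physical + 1))
        self.in_flight = deque()   # (logical dest, new phys, old phys), program order

    def rename(self, dest, srcs):
        """Return (new dest, new sources), or None if no physical register is free."""
        if dest is not None and not self.free:
            return None            # instructions stop entering the pipeline
        new_srcs = tuple(self.map[s] for s in srcs)
        if dest is None:
            self.in_flight.append((None, None, None))
            return (None, new_srcs)
        new, old = self.free.popleft(), self.map[dest]
        self.map[dest] = new
        self.in_flight.append((dest, new, old))
        return (new, new_srcs)

    def commit(self):
        """Commit the oldest instruction: update the map, recycle the old register."""
        dest, new, old = self.in_flight.popleft()
        if dest is not None:
            self.committed[dest] = new   # no data is copied, only the mapping changes
            self.free.append(old)
        return old


r = Renamer()
print(r.rename("R1", ("R1", "R2")))
print(r.rename("R2", ("R1", "R3")))
print(r.rename(None, ("R2",)))           # BEQZ R2
print(r.rename("R3", ("R1", "R2")))
print(r.commit(), r.committed["R1"])
```

Consumers that were renamed to read P33 are unaffected when the producer commits, which is why an instruction in the issue queue need not modify its operands.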

The Alpha 21264 Out-of-Order Implementation
[Figure: Branch prediction and instr fetch → Instr Fetch Queue → Decode & Rename (with Register Map Table, e.g. R1 → P1) → Issue Queue (IQ) → ALUs. Reorder Buffer (ROB) entries Instr 1 to Instr 6. Register File P1-P64. Results written to regfile and tags broadcast to IQ.]
Original code        After renaming
R1 ← R1+R2           P33 ← P1+P2
R2 ← R1+R3           P34 ← P33+P3
BEQZ R2              BEQZ P34
R3 ← R1+R2           P35 ← P33+P34
R1 ← R3+R2           P36 ← P35+P34