Lecture 8: Dynamic ILP
• Topics: out-of-order processors (See class notes)
• HW 3 is posted, due on Tuesday
An Out-of-Order Processor Implementation
[Figure: pipeline block diagram. Branch prediction and instr fetch feeds the Instr Fetch Queue, then Decode & Rename, then the Issue Queue (IQ), which issues to two ALUs. Results are written to the ROB and tags are broadcast to the IQ. The Reorder Buffer (ROB) holds Instr 1 to Instr 6 with temporaries T1-T6; the Register File holds R1-R32.]
Instr Fetch Queue      After Decode & Rename
R1 ← R1+R2             T1 ← R1+R2
R2 ← R1+R3             T2 ← T1+R3
BEQZ R2                BEQZ T2
R3 ← R1+R2             T4 ← T1+T2
R1 ← R3+R2             T5 ← T4+T2
Design Details - I
• Instructions enter the pipeline in order
• No need for branch delay slots if prediction happens in time
• Instructions leave the pipeline in order
  – all instructions that enter also get placed in the ROB
  – the process of an instruction leaving the ROB (in order) is called commit
  – an instruction commits only if it and all instructions before it have completed successfully (without an exception)
• To preserve precise exceptions, a result is written into the register file only when the instruction commits
  – until then, the result is saved in a temporary register in the ROB
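The in-order commit rule above can be sketched in a few lines. This is a minimal illustration, not the design from the lecture: the `ROBEntry` class, its field names, and the dictionary register file are assumptions made for the example.

```python
from collections import deque


class ROBEntry:
    """One reorder-buffer slot (hypothetical layout for illustration)."""

    def __init__(self, tag, dest):
        self.tag = tag            # temporary name, e.g. "T1"
        self.dest = dest          # architectural destination, e.g. "R1", or None
        self.done = False         # set when execution finishes
        self.value = None         # result held in the ROB until commit
        self.exception = False    # set if execution raised an exception


def commit(rob, regfile):
    """Retire finished instructions from the head of the ROB, oldest first."""
    committed = []
    while rob and rob[0].done:
        head = rob[0]
        if head.exception:
            break                 # stop here: the register file is still precise
        rob.popleft()
        if head.dest is not None:
            regfile[head.dest] = head.value   # register file is written only at commit
        committed.append(head.tag)
    return committed
```

An instruction that finishes early simply waits in the ROB: if T2 is done but T1 is not, `commit` retires nothing until T1 completes.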
Design Details - II
• Instructions get renamed and placed in the issue queue
  – some operands are available (in T1-T6 or R1-R32), while others are being produced by instructions in flight (T1-T6)
• As instructions finish, they write results into the ROB (T1-T6) and broadcast the operand tag (T1-T6) to the issue queue
  – instructions now know if their operands are ready
• When a ready instruction issues, it reads its operands from T1-T6 and R1-R32 and executes (out-of-order execution)
• Can you have WAW or WAR hazards? By using more names (T1-T6), name dependences can be avoided
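The tag broadcast and wakeup described above might look like the following sketch. The `IQEntry` class and the helper names are hypothetical; the instruction sequence is the renamed code from the earlier figure, with destinations T1, T2, T4, T5.

```python
class IQEntry:
    """One issue-queue slot: a destination tag plus the source tags it still waits on."""

    def __init__(self, dest, sources, available):
        self.dest = dest
        self.waiting = {s for s in sources if s not in available}


def broadcast(issue_queue, tag):
    """A finishing instruction broadcasts its tag; waiting entries mark it ready."""
    for entry in issue_queue:
        entry.waiting.discard(tag)


def issue_ready(issue_queue):
    """Remove and return the destination tags of all entries whose operands are ready."""
    ready = [e for e in issue_queue if not e.waiting]
    for e in ready:
        issue_queue.remove(e)
    return [e.dest for e in ready]


# R1-R32 hold committed values, so they count as available at rename time.
available = {f"R{i}" for i in range(1, 33)}
iq = [
    IQEntry("T1", ["R1", "R2"], available),
    IQEntry("T2", ["T1", "R3"], available),
    IQEntry("T4", ["T1", "T2"], available),
    IQEntry("T5", ["T4", "T2"], available),
]
```

Calling `issue_ready(iq)` and then `broadcast(iq, tag)` for each issued tag walks the dependence chain. Any entry whose sources are all ready issues regardless of its position in the queue, which is what allows out-of-order execution.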
Design Details - III
• If instr-3 raises an exception, wait until it reaches the top of the ROB
  – at this point, R1-R32 contain results for all instructions up to instr-3
  – save registers, save PC of instr-3, and service the exception
• If a branch is mispredicted, flush all instructions after the branch and start on the correct path
  – mispredicted instrs will not have updated registers (the branch cannot commit until it has completed, and the flush happens as soon as the branch completes)
• Potential problems: ?
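A minimal sketch of the misprediction flush, assuming the ROB is modeled as a plain list of tags ordered oldest first (an assumption for the example, as is the function name):

```python
def flush_younger(rob, branch_index):
    """Squash every ROB entry younger than the mispredicted branch.

    The squashed entries never reached commit, so they never wrote
    the register file and there is nothing to undo there.
    """
    squashed = rob[branch_index + 1:]
    del rob[branch_index + 1:]
    return squashed
```

With `rob = ["T1", "T2", "T3", "T4", "T5"]` and the branch in slot 2, the call removes T4 and T5 and leaves the branch and everything older in place.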
Managing Register Names
Temporary values are stored in the register file and not the ROB
Logical Registers: R1-R32
Physical Registers: P1-P64
At the start, R1-R32 can be found in P1-P32
Instructions stop entering the pipeline when P64 is assigned
Original          Renamed
R1 ← R1+R2        P33 ← P1+P2
R2 ← R1+R3        P34 ← P33+P3
BEQZ R2           BEQZ P34
R3 ← R1+R2        P35 ← P33+P34
What happens on commit?
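The renaming above can be reproduced with a map table and a free list. This is a sketch under stated assumptions: the tuple encoding `(dest, op, sources)` and the function name `rename` are invented for the example, and the free list hands out registers in ascending order.

```python
def rename(program, num_logical=32, num_physical=64):
    """Rename logical registers to physical ones using a map table and a free list."""
    # At the start, R1..R32 live in P1..P32 and P33..P64 are free.
    reg_map = {f"R{i}": f"P{i}" for i in range(1, num_logical + 1)}
    free = [f"P{i}" for i in range(num_logical + 1, num_physical + 1)]
    renamed = []
    for dest, op, sources in program:
        # Read the sources through the map BEFORE updating the destination.
        phys_sources = [reg_map[s] for s in sources]
        if dest is None:                  # e.g. a branch writes no register
            renamed.append((None, op, phys_sources))
            continue
        if not free:
            break                         # no free register: the front end stalls
        new_phys = free.pop(0)
        reg_map[dest] = new_phys
        renamed.append((new_phys, op, phys_sources))
    return renamed, reg_map


program = [
    ("R1", "add", ["R1", "R2"]),
    ("R2", "add", ["R1", "R3"]),
    (None, "beqz", ["R2"]),
    ("R3", "add", ["R1", "R2"]),
]
renamed, reg_map = rename(program)
```

Here `renamed` matches the right-hand column of the slide: P33 ← P1+P2, P34 ← P33+P3, BEQZ P34, P35 ← P33+P34.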
The Commit Process
• On commit, no copy is required
• The register map table is updated
  – the “committed” value of R1 is now in P33 and not P1
  – on an exception, P33 is copied to memory and not P1
• An instruction in the issue queue need not modify its input operand when the producer commits
• When instruction-1 commits, we no longer have any use for P1
  – it is put in a free pool and a new instruction can now enter the pipeline
  – for every instr that commits, a new instr can enter the pipeline
  – hence, the number of in-flight instrs is a constant = number of extra (rename) registers
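The commit step can be sketched as a map update plus a release to the free pool. The function name and the dictionary/list representation are assumptions for illustration only.

```python
def commit_rename(logical, new_phys, committed_map, free_pool):
    """Commit one instruction: repoint the committed map and free the old register."""
    old_phys = committed_map[logical]
    committed_map[logical] = new_phys   # no data is copied, only the mapping changes
    free_pool.append(old_phys)          # the old physical register can be reused
    return old_phys


committed_map = {"R1": "P1", "R2": "P2", "R3": "P3"}
free_pool = []
freed = commit_rename("R1", "P33", committed_map, free_pool)
```

After instruction-1 (R1 ← R1+R2, renamed to P33) commits, `freed` is `"P1"` and the committed map points R1 at P33. Each commit returns exactly one register to the pool, which is why one new instruction can enter per commit.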
The Alpha 21264 Out-of-Order Implementation
[Figure: pipeline block diagram. Branch prediction and instr fetch feeds the Instr Fetch Queue, then Decode & Rename, then the Issue Queue (IQ), which issues to two ALUs. Results are written to the regfile and tags are broadcast to the IQ. The Reorder Buffer (ROB) holds Instr 1 to Instr 6; the Register File holds P1-P64.]
Committed Reg Map: R1 → P1, R2 → P2
Speculative Reg Map: R1 → P36, R2 → P34
Instr Fetch Queue      After Decode & Rename
R1 ← R1+R2             P33 ← P1+P2
R2 ← R1+R3             P34 ← P33+P3
BEQZ R2                BEQZ P34
R3 ← R1+R2             P35 ← P33+P34
R1 ← R3+R2             P36 ← P35+P34
Out-of-Order Loads/Stores
Ld R1 ← [R2]
Ld R3 ← [R4]
St R5 → [R6]
Ld R7 ← [R8]
Ld R9 ← [R10]
What if the issue queue also had load/store instructions? Can we continue executing instructions out-of-order?
Memory Dependence Checking
[Figure: sequence of memory operations in the order shown on the slide; some entries carry no address]
Ld  0xabcdef
Ld
St
Ld
Ld  0xabcdef
St  0xabcd00
Ld  0xabc000
Ld  0xabcd00
• The issue queue checks for register dependences and executes instructions as soon as registers are ready
• Loads/stores access memory as well
  – must check for RAW, WAW, and WAR hazards for memory as well
• Hence, first check for register dependences to compute effective addresses; then check for memory dependences
Memory Dependence Checking
[Figure: same sequence of memory operations: Ld 0xabcdef, Ld, St, Ld, Ld 0xabcdef, St 0xabcd00, Ld 0xabc000, Ld 0xabcd00]
• Load and store addresses are maintained in program order in the Load/Store Queue (LSQ)
• Loads can issue if they are guaranteed to not have true dependences with earlier stores
• Stores can issue only if we are ready to modify memory (cannot recover if an earlier instr raises an exception)
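One conservative way to express the load-issue rule is sketched below. It rests on several assumptions not stated in the slides: entries without an address in the figure are read as addresses that have not been computed yet (`None`), addresses conflict only on an exact match (access sizes are ignored), and a load that matches an earlier store simply waits instead of receiving forwarded data.

```python
def load_can_issue(lsq, index):
    """Return True if the load at lsq[index] cannot depend on any earlier store.

    lsq is a list of (kind, address) pairs in program order, where kind is
    "Ld" or "St" and address is an int, or None if not yet computed.
    """
    kind, addr = lsq[index]
    if kind != "Ld" or addr is None:
        return False
    for earlier_kind, earlier_addr in lsq[:index]:
        if earlier_kind != "St":
            continue
        if earlier_addr is None:
            return False      # unknown store address: a dependence is possible
        if earlier_addr == addr:
            return False      # true (RAW) dependence on an earlier store
    return True


# The sequence from the figure, with missing addresses modeled as None.
lsq = [
    ("Ld", 0xABCDEF),
    ("Ld", None),
    ("St", None),
    ("Ld", None),
    ("Ld", 0xABCDEF),
    ("St", 0xABCD00),
    ("Ld", 0xABC000),
    ("Ld", 0xABCD00),
]
```

Under these assumptions the first load can issue immediately, while every load after the address-less store is held back until that store's address is known. Once it resolves to a non-conflicting address, the load of 0xabc000 can go, but the final load of 0xabcd00 still waits on the store to the same address.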
The Alpha 21264 Out-of-Order Implementation
[Figure: pipeline block diagram with a Load/Store Queue. Branch prediction and instr fetch feeds the Instr Fetch Queue, then Decode & Rename, then the Issue Queue (IQ), which issues to the ALUs. Loads and stores use an ALU for address calculation and then go through the LSQ to the D-Cache. Results are written to the regfile and tags are broadcast to the IQ. The Reorder Buffer (ROB) holds Instr 1 to Instr 7; the Register File holds P1-P64.]
Committed Reg Map: R1 → P1, R2 → P2
Speculative Reg Map: R1 → P36, R2 → P34
Instr Fetch Queue      After Decode & Rename      LSQ
R1 ← R1+R2             P33 ← P1+P2
R2 ← R1+R3             P34 ← P33+P3
BEQZ R2                BEQZ P34
R3 ← R1+R2             P35 ← P33+P34
R1 ← R3+R2             P36 ← P35+P34
LD R4 ← 8[R3]          P37 ← 8[P35]               P37 ← [P35 + 8]
ST R4 → 8[R1]          P37 → 8[P36]               P37 → [P36 + 8]