CS 7810 Lecture 11 Delaying Physical Register Allocation

CS 7810 Lecture 11
Delaying Physical Register Allocation Through Virtual-Physical Registers
T. Monreal, A. Gonzalez, M. Valero, J. Gonzalez, V. Vinals
Proceedings of MICRO-32, November 1999

Register File Design Considerations
• Number of ports = 3 x issue width
• Number of entries = window size + logical-regs
• Multiple threads → more registers (more power)
• Wire delays, clock speeds → multiple-cycle access
• Pipelining a RAM structure is hard
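The two sizing rules above can be turned into a small calculation. This is only a sketch: the function name and the example machine (8-wide issue, 128-entry window, 32 logical registers) are assumptions chosen for illustration, not numbers from the lecture.

```python
def regfile_requirements(issue_width, window_size, logical_regs):
    """Return (ports, entries) using the two rules of thumb from the slide."""
    ports = 3 * issue_width                # ports = 3 x issue width
    entries = window_size + logical_regs   # entries = window size + logical-regs
    return ports, entries


# Hypothetical machine: 8-wide issue, 128-entry window, 32 logical registers.
ports, entries = regfile_requirements(issue_width=8, window_size=128, logical_regs=32)
print(f"ports={ports}, entries={entries}")  # prints "ports=24, entries=160"
```

Both quantities grow linearly with the machine width and window size, which is why the later slides look for ways to get by with fewer physical registers.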

Register Allocation
• Fetch
• Rename (cycle 4): assign pr 7
• Issue (cycle 15)
• Complete, wake-up (cycle 30): write pr 7
• Read pr 7 (cycle 50)
• Commit (cycle 80): release pr 7
• Cycles 4 to 30: no result (26 cyc)
• Cycles 30 to 50: useful time (20 cyc)
• Cycles 50 to 80: no activity (30 cyc)
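The arithmetic behind the timeline can be checked with a few lines. The sketch below uses the cycle numbers from the slide; the function and key names are made up for illustration.

```python
def register_lifetime(alloc, write, last_read, release):
    """Split a physical register's lifetime into the three phases on the slide."""
    return {
        "no_result": write - alloc,          # allocated, but no value written yet
        "useful": last_read - write,         # holds a value that is still read
        "no_activity": release - last_read,  # value is dead, waiting for commit
    }


# Cycle numbers for pr 7 from the slide: rename 4, write 30, read 50, release 80.
phases = register_lifetime(alloc=4, write=30, last_read=50, release=80)
total = sum(phases.values())
print(phases)
print(f"useful fraction = {phases['useful'] / total:.0%}")  # prints "useful fraction = 26%"
```

In this example pr 7 is held for 76 cycles but is useful for only 20 of them, which is the motivation for delaying allocation until the result is produced.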

Two-Level Register File
[Figure: base regfile compared with a two-level regfile]

Virtual-Physical Registers
• Before issue: the register map table holds lr 3 → vr 7 (no pr yet)
• Instruction issues: the register map table still holds lr 3 → vr 7
• Instruction completes and is assigned pr 9: the register map table holds lr 3 → vr 7, pr 9, and the virtual map table holds vr 7 → pr 9
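The sequence above can be mimicked with a toy model. This is a minimal sketch, not the paper's hardware design: the class and method names are hypothetical, the tables are plain dictionaries, and register release at commit is left out.

```python
class VPRenamer:
    """Toy model: rename hands out virtual-physical (vr) tags; a physical
    register (pr) is allocated only when the instruction completes."""

    def __init__(self, free_prs, first_vr=0):
        self.free_prs = list(free_prs)  # physical registers not in use
        self.next_vr = first_vr         # next virtual-physical tag to hand out
        self.reg_map = {}               # lr -> [vr, pr or None]
        self.virtual_map = {}           # vr -> pr

    def rename(self, lr):
        """Map a logical destination register to a fresh vr tag (no pr yet)."""
        vr = self.next_vr
        self.next_vr += 1
        self.reg_map[lr] = [vr, None]
        return vr

    def complete(self, vr):
        """Allocate a pr at completion; return None if no pr is free."""
        if not self.free_prs:
            return None                 # the instruction will have to re-execute
        pr = self.free_prs.pop(0)
        self.virtual_map[vr] = pr
        for entry in self.reg_map.values():
            if entry[0] == vr:          # lr still maps to this vr: record the pr
                entry[1] = pr
        return pr


# Values chosen to match the slide: lr 3 -> vr 7, later pr 9.
r = VPRenamer(free_prs=[9, 10], first_vr=7)
vr = r.rename(3)
after_rename = list(r.reg_map[3])
pr = r.complete(vr)
print(after_rename, r.reg_map[3], r.virtual_map)
```

Between `rename` and `complete` the instruction holds only a tag, so no physical register is tied up while it waits to issue and execute.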

Lack of Registers
• An instruction in the in-flight window finishes, has no register, and keeps re-executing
• Cycle t+1: an instruction commits, and the re-executing instruction gets a reg
[Figure: in-flight window; legend: has physical register / has no physical register]

Deadlock
• An instruction finishes, has no register, and keeps re-executing
• Who will generate a register for this instr?
• Solution: reserve a register for the oldest instruction
[Figure: in-flight window; legend: has physical register / has no physical register]
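One way to picture the reservation is as an admission check at completion time. The sketch below is an assumption-laden illustration, not the paper's mechanism: the function name, the `reserved` parameter, and the idea of counting free registers are all hypothetical.

```python
def may_allocate(free_regs, is_oldest, reserved=1):
    """Decide whether a finishing instruction may take a physical register.

    `reserved` registers are held back for the oldest in-flight instruction,
    so younger instructions can never drain the free pool completely.
    """
    if is_oldest:
        return free_regs >= 1     # the oldest may use the reserved register
    return free_regs > reserved   # younger instrs must leave the reserve intact


# One free register left: only the oldest instruction may claim it.
print(may_allocate(free_regs=1, is_oldest=False))  # prints "False"
print(may_allocate(free_regs=1, is_oldest=True))   # prints "True"
```

Because the oldest instruction can always obtain a register, it can always complete and commit, so the window keeps draining.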

Sequential Execution
• The oldest instr has the reserved register
• The instr commits and releases another reg, which is then reserved for the new oldest instr
• Behaves like an in-order processor
[Figure: in-flight window; legend: has physical register / has no physical register]

Reserving All Registers
• Allows quick progress, but almost behaves like a conventional processor
[Figure legend: has physical register / has no physical register]

Register Stealing
• Instr finishes; steals register from the youngest finished instr
• No reservation of regs
• The younger instrs may have to execute twice
• Note the pre-execution effect
[Figure: in-flight window; legend: has physical register / has no physical register]
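The stealing policy can be sketched as follows. The names and data structures are hypothetical, and one rule is an assumption rather than something the slide states: an instruction only steals from a finished instruction younger than itself.

```python
def allocate_or_steal(age, free_prs, holders):
    """Give a finishing instruction a physical register, stealing if needed.

    age      -- program-order position of the finishing instr (smaller = older)
    free_prs -- list of free physical registers
    holders  -- dict: age -> pr, for finished but uncommitted instructions
    Returns (pr, victim_age); pr is None if the instruction must re-execute.
    """
    if free_prs:
        pr = free_prs.pop()
        holders[age] = pr
        return pr, None
    if not holders:
        return None, None
    victim = max(holders)       # youngest finished instr holding a register
    if victim < age:            # assumption: never steal from an older instr
        return None, None
    pr = holders.pop(victim)    # the victim loses its result, executes again
    holders[age] = pr
    return pr, victim


holders = {5: 2, 8: 3}          # instr 5 holds pr 2, instr 8 holds pr 3
print(allocate_or_steal(6, [], holders), holders)
# prints "(3, 8) {5: 2, 6: 3}"
```

The victim (instr 8 here) has already executed once, so its first run acts as a pre-execution even though the result is discarded.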

Implementation
• Finished instructions have to remain in the issueq in case they have to re-execute
• Issued dependents of the victim instruction need not re-execute
• The VP tag of the victim has to be broadcast so that unissued dependents can reset the ready bit
• Can benefit from an instruction reuse buffer?
• Pre-execution without explicitly attempting it
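The broadcast described above can be sketched as a scan over the issue queue. This is an illustrative model only: the `IQEntry` fields and the function name are hypothetical, and real hardware would do the tag match in parallel rather than in a loop.

```python
from dataclasses import dataclass


@dataclass
class IQEntry:
    name: str
    src_ready: dict        # source VP tag -> ready bit
    issued: bool = False


def broadcast_steal(issue_queue, victim_vr):
    """Reset the ready bit of unissued dependents of the victim's VP tag."""
    reset = []
    for entry in issue_queue:
        # Dependents that already issued have read the value and are left alone.
        if not entry.issued and victim_vr in entry.src_ready:
            entry.src_ready[victim_vr] = False
            reset.append(entry.name)
    return reset


queue = [
    IQEntry("a", {7: True}, issued=True),  # issued dependent: not affected
    IQEntry("b", {7: True, 4: True}),      # unissued dependent: must wait again
    IQEntry("c", {5: True}),               # unrelated instruction
]
print(broadcast_steal(queue, victim_vr=7))  # prints "['b']"
```

Only `b` goes back to waiting; it wakes up again when the victim re-executes and broadcasts its tag a second time.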

Results
• Improves the base case by 5% (Int programs) and 24% (FP programs)
• FP programs have more ILP, better branch prediction, and are more limited by cache misses
• Re-executions: 10% (int), 58% (fp)
• Steals: 5% (int), 12% (fp)
• For the same IPC, VP registers employ 25% fewer registers
