
Techniques for Improving CPI

Pipeline CPI = Ideal pipeline CPI + Structural Stalls + Data Hazard Stalls + Control Stalls

Techniques that reduce the individual terms:
• superscalar execution
• register renaming
• dynamic execution
• loop unrolling
• static scheduling, software pipelining
• forwarding
• speculative execution
• delayed branches, branch scheduling
• branch prediction

cslab@ntua 2019-2020
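As a quick illustration of the CPI decomposition, the sketch below simply adds the stall contributions to the ideal CPI. All numbers are hypothetical and only show how reducing one term (here the control stalls, for example through better branch prediction) lowers the total.

```python
def pipeline_cpi(ideal_cpi, structural_stalls, data_hazard_stalls, control_stalls):
    """Pipeline CPI = ideal CPI + average stall cycles per instruction."""
    return ideal_cpi + structural_stalls + data_hazard_stalls + control_stalls


# Hypothetical stall figures (cycles per instruction), not taken from the slides.
baseline = pipeline_cpi(1.0, 0.05, 0.30, 0.20)
better_prediction = pipeline_cpi(1.0, 0.05, 0.30, 0.05)

print(f"baseline CPI:          {baseline:.2f}")
print(f"with fewer ctrl stalls: {better_prediction:.2f}")
```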

Tomasulo: Out-of-Order Completion

[Timing diagram: program flow (LD, LD, MULT, SUB, DIV, ADD) against time (cycles 1 to 57), marking the issue (IS) and write-back (WB) cycle of each instruction.]

cslab@ntua 2018-2019

Device Interrupt

A network interrupt arrives at the point marked (!) in the user program:

    add  r1, r2, r3
    subi r4, r1, #4
    slli r4, r4, #2
    (!)
    lw   r2, 0(r4)
    lw   r3, 4(r4)
    add  r2, r2, r3
    sw   8(r4), r2

On entry: save PC, disable interrupts, switch to supervisor mode.

Interrupt handler:
• Raise priority
• Enable interrupts
• Save registers
• lw r1, 20(r0)
• lw r2, 0(r1)
• addi r3, r0, #5
• sw 0(r1), r3
• Restore registers
• Clear interrupt
• Disable interrupts
• Restore priority
• RTE

On return: restore PC, switch back to user mode.

Tomasulo: Dynamic Execution

[Timing diagram: program flow (LD, LD, MULT, SUB, DIV, ADD) against time (cycles 1 to 57), marking the issue (IS), write-back (WB) and commit (CM) cycle of each instruction.]

• In-order issue
• Out-of-order execution & result
• In-order commit
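The relation between out-of-order write-back and in-order commit can be sketched as follows. The function name and the write-back cycles are hypothetical, and the sketch assumes that an instruction commits no earlier than the cycle after its write-back and that at most one instruction commits per cycle.

```python
def commit_cycles(writeback_cycles):
    """Given write-back cycles in program order, return in-order commit cycles.

    Assumptions (not from the slides): commit happens at least one cycle
    after write-back, and at most one instruction commits per cycle.
    """
    commits = []
    previous_commit = 0
    for wb in writeback_cycles:
        cm = max(wb + 1, previous_commit + 1)  # wait for own result and for older commits
        commits.append(cm)
        previous_commit = cm
    return commits


# Hypothetical write-back cycles for six instructions in program order.
wb = [4, 5, 16, 8, 57, 11]
print(commit_cycles(wb))  # [5, 6, 17, 18, 58, 59]
```

The fourth and sixth instructions finish early but still commit only after their slower predecessors, which is the cost of keeping the architectural state precise.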


Tomasulo With Reorder Buffer (1)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP adders, FP multipliers): empty
Load buffer (from memory):
  Dest 1: R1

Tomasulo With Reorder Buffer (2)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
Load buffer (from memory):
  Dest 1: R1

Tomasulo With Reorder Buffer (3)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
Load buffer (from memory):
  Dest 1: R1

Tomasulo With Reorder Buffer (4)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB4   R1              DADDIU R1, R1, #-8     N
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
Load buffer (from memory):
  Dest 1: R1

Tomasulo With Reorder Buffer (5)

Branch predicted taken.

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB5   --              BNE R1, R2, LOOP       N
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
Load buffer (from memory):
  Dest 1: R1

Tomasulo With Reorder Buffer (6)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB6   F0              L.D F0, 0(R1)          N
  ROB5   --              BNE R1, R2, LOOP       N
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
Load buffer (from memory):
  Dest 1: R1
  Dest 6: ROB4

Tomasulo With Reorder Buffer (7)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB7   F4              MUL.D F4, F0, F2       N
  ROB6   F0              L.D F0, 0(R1)          N
  ROB5   --              BNE R1, R2, LOOP       Y
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0              L.D F0, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
  Dest 7: MULD ROB6, R(F2)
Load buffer (from memory):
  Dest 1: R1
  Dest 6: ROB4

Tomasulo With Reorder Buffer (8)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB7   F4              MUL.D F4, F0, F2       N
  ROB6   F0              L.D F0, 0(R1)          N
  ROB5   --              BNE R1, R2, LOOP       Y
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N
  ROB1   F0    M[R1]     L.D F0, 0(R1)          Y      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
  Dest 7: MULD ROB6, R(F2)
Load buffer (from memory):
  Dest 6: ROB4

Tomasulo With Reorder Buffer (9)

Reorder buffer (newest entry first; ROB1 has committed):
  Entry  Dest  Value     Instruction            Ready
  ROB7   F4              MUL.D F4, F0, F2       N
  ROB6   F0    M[ROB4]   L.D F0, 0(R1)          Y
  ROB5   --              BNE R1, R2, LOOP       Y
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4              MUL.D F4, F0, F2       N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
  Dest 7: MULD ROB6, R(F2)
Load buffer (from memory): empty

Tomasulo With Reorder Buffer (12)

Reorder buffer (newest entry first):
  Entry  Dest  Value     Instruction            Ready
  ROB7   F4              MUL.D F4, F0, F2       N
  ROB6   F0    M[ROB4]   L.D F0, 0(R1)          Y
  ROB5   --              BNE R1, R2, LOOP       Y
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N
  ROB2   F4    VAL1      MUL.D F4, F0, F2       Y      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
  Dest 7: MULD ROB6, R(F2)
Load buffer (from memory): empty

Tomasulo With Reorder Buffer (13)

Reorder buffer (newest entry first; ROB1 and ROB2 have committed):
  Entry  Dest  Value     Instruction            Ready
  ROB7   F4              MUL.D F4, F0, F2       N
  ROB6   F0    M[ROB4]   L.D F0, 0(R1)          Y
  ROB5   --              BNE R1, R2, LOOP       Y
  ROB4   R1              DADDIU R1, R1, #-8     Y
  ROB3   --    ROB2      S.D F4, 0(R1)          N      (oldest)

Reservation stations (FP multipliers):
  Dest 2: MULD ROB1, R(F2)
  Dest 7: MULD ROB6, R(F2)
Load buffer (from memory): empty
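The reorder-buffer walkthrough can be mimicked with a very small model. This is a sketch under simplifying assumptions, not the hardware structure from the slides: the class and method names are hypothetical, tags grow monotonically instead of reusing slots ROB1 to ROB7, and stores, reservation stations and operand forwarding are left out. It only shows that results may arrive in any order while the register file is updated strictly from the oldest entry.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    tag: int
    dest: Optional[str]
    instr: str
    value: Optional[object] = None
    ready: bool = False


class ReorderBuffer:
    """Minimal ROB model: in-order issue, any-order results, in-order commit."""

    def __init__(self, size=7):
        self.size = size
        self.entries = deque()  # oldest entry on the left
        self.next_tag = 1
        self.regs = {}          # architectural register file

    def issue(self, instr, dest=None):
        if len(self.entries) == self.size:
            raise RuntimeError("ROB full: issue must stall")
        entry = RobEntry(self.next_tag, dest, instr)
        self.entries.append(entry)
        self.next_tag += 1
        return entry.tag

    def write_result(self, tag, value):
        for entry in self.entries:
            if entry.tag == tag:
                entry.value, entry.ready = value, True
                return
        raise KeyError(tag)

    def commit(self):
        """Retire ready entries from the head only, so commit stays in order."""
        retired = []
        while self.entries and self.entries[0].ready:
            entry = self.entries.popleft()
            if entry.dest is not None:
                self.regs[entry.dest] = entry.value
            retired.append(entry.instr)
        return retired


rob = ReorderBuffer()
ld = rob.issue("L.D F0, 0(R1)", dest="F0")
mul = rob.issue("MUL.D F4, F0, F2", dest="F4")
rob.write_result(mul, 6.0)   # younger instruction finishes first
first = rob.commit()         # nothing retires: the load at the head is not ready
rob.write_result(ld, 3.0)
second = rob.commit()        # both retire, in program order
```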

Exceptions and Interrupts

• IBM 360/91 invented “imprecise interrupts”
  – Computer stopped at this PC; it's likely close to this address
  – Not so popular with programmers
  – Also, what about virtual memory? (Not in IBM 360)
• Technique for both precise interrupts/exceptions and speculation: in-order completion and in-order commit
  – If we speculate and are wrong, we need to back up and restart execution at the point where we predicted incorrectly
  – This is exactly what we need to do for precise exceptions
• Exceptions are handled by not recognizing the exception until the instruction that caused it is ready to commit in the ROB
  – If a speculated instruction raises an exception, the exception is recorded in the ROB
  – This is why reorder buffers are in all new processors

Exceptions With Reorder Buffer

A: DIVD has an exception; mark it in its ROB entry.

Reorder buffer (newest entry first):
  Dest  Value    Instruction           Done?
  --    M[10]    ST 0(R3), F4          Y
  F0    <val2>   ADDD F0, F4, F6       Y
  F4    M[10]    LD F4, 0(R3)          Y
  --             BNE F2, <…>           N
  F2    NaN      DIVD F2, F10, F6      Ex     (oldest)

Register file (RF): F10: val3, F0: M[30]

Exceptions With Reorder Buffer

A: DIVD commits ➜ the exception is taken! LD, ADDD, ST are killed. Need: Status, PC.

Reorder buffer (newest entry first):
  Dest  Value    Instruction           Done?
  --    M[10]    ST 0(R3), F4          Y
  F0    <val2>   ADDD F0, F4, F6       Y
  F4    M[10]    LD F4, 0(R3)          Y
  --             BNE F2, <…>           N
  F2    NaN      DIVD F2, F10, F6      Ex     (oldest)

Architectural register file (ARF): F10: val3, F0: M[30]
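A minimal sketch of how an exception recorded in the ROB is only taken at commit. The function name and the tuple layout are hypothetical; the status codes follow the slide's Done? column (N, Y, Ex). In this simplified model every younger entry is squashed, including the unfinished branch.

```python
def commit_in_order(rob):
    """rob: list of (instr, status) in program order, oldest first.

    status is 'N' (not done), 'Y' (done) or 'Ex' (done, exception recorded).
    Returns (committed, faulting_instr, killed).
    """
    committed = []
    for index, (instr, status) in enumerate(rob):
        if status == "N":
            break  # head not finished: nothing younger may commit
        if status == "Ex":
            # The exception is recognized only now, when the instruction
            # reaches the head of the ROB; everything younger is squashed.
            killed = [name for name, _ in rob[index + 1:]]
            return committed, instr, killed
        committed.append(instr)
    return committed, None, []


# Oldest first, mirroring the slide's example.
rob = [("DIVD", "Ex"), ("BNE", "N"), ("LD", "Y"), ("ADDD", "Y"), ("ST", "Y")]
committed, faulting, killed = commit_in_order(rob)
print(committed, faulting, killed)
```

Because LD, ADDD and ST never updated the register file or memory, discarding their ROB entries is enough to leave a precise state at DIVD.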

Tomasulo With Reorder Buffer

B: DIVD completes successfully.

Reorder buffer (newest entry first):
  Dest  Value    Instruction           Done?
  --    M[10]    ST 0(R3), F4          Y
  F0    <val2>   ADDD F0, F4, F6       Y
  F4    M[10]    LD F4, 0(R3)          Y
  --             BNE F2, <…>           N
  F2    <val4>   DIVD F2, F10, F6      Y      (oldest)

Register file (RF): F10: val3, F2: val4, F0: M[30]

Tomasulo With Reorder Buffer

B: DIVD commits; RF[F2] = val4.

Reorder buffer (newest entry first):
  Dest  Value    Instruction           Done?
  --    M[10]    ST 0(R3), F4          Y
  F0    <val2>   ADDD F0, F4, F6       Y
  F4    M[10]    LD F4, 0(R3)          Y
  --             BNE F2, <…>           N      (oldest)

Register file (RF): F10: val3, F2: val4, F0: M[30]

Tomasulo With Reorder Buffer

BNE commits. Prediction correct?
• Yes: continue
• No: kill the ROB, PC = Target

Reorder buffer (newest entry first):
  Dest  Value    Instruction           Done?
  --    M[10]    ST 0(R3), F4          Y
  F0    <val2>   ADDD F0, F4, F6       Y
  F4    M[10]    LD F4, 0(R3)          Y
  --             BNE F2, <…>           N      (oldest)

Architectural register file (ARF): F10: val3, F2: val4, F0: M[30]
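The decision taken when the branch reaches the head of the ROB can be sketched as below. The function and parameter names are hypothetical; the sketch assumes the correct next PC is the branch target when the branch is actually taken and the fall-through address otherwise.

```python
def commit_branch(younger, predicted_taken, actual_taken,
                  target_pc, fallthrough_pc, fetch_pc):
    """Return (surviving younger ROB entries, next fetch PC) after a branch commits."""
    if predicted_taken == actual_taken:
        # Prediction correct: speculative work is kept and fetch continues.
        return younger, fetch_pc
    # Misprediction: kill every younger entry and redirect fetch.
    correct_pc = target_pc if actual_taken else fallthrough_pc
    return [], correct_pc


speculative = ["LD F4, 0(R3)", "ADDD F0, F4, F6", "ST 0(R3), F4"]

kept, pc_ok = commit_branch(speculative, True, True, 0x100, 0x120, 0x10C)
flushed, pc_fix = commit_branch(speculative, True, False, 0x100, 0x120, 0x10C)
```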

When to Commit?

[Timing diagram: program flow (LD, LD, MULT, SUB, DIV, ADD) against time (cycles 1 to 57), marking the issue (IS), write-back (WB) and commit (CM) cycle of each instruction.]

• In-order issue
• Out-of-order execution & result
• In-order commit

When to Commit?

[Timing diagram: the same program flow against time, with an alternative placement of the commit (CM) cycles.]

Can this be legal?
• No: easy answer, no thinking
• Yes, but only if:
  – We can guarantee no exceptions from earlier instructions
  – We cannot stop after DIV, only after ADD ➜ coarse valid commit boundaries

Is this interesting/useful?

Precise Points

• ISA: each instruction defines a “clean” point to stop
• BUT, why do we need to stop/restart?
  – External interrupts: we can choose if and when to stop
    » Maskable interrupts, priorities, etc.
  – Exceptions:
    » Internal to the program, cannot choose if and when!
    » Non-restartable: fatal, kill program, all is lost, easy!
    » Restartable: which ones? Not all instructions!
• Fine, we have to restart execution
  – What if we restart from an earlier point and re-execute more instructions? Is it valid? Is the cost acceptable?
• Commit points:
  – Branch
  – Ld/St
• Perhaps branch only? (Remember basic blocks?)

Precise Points

Which restart points are needed in the following program?

  Label: LD    F6, 34+R2       ✔ address/perm
         MULTD F0, F2, F4      ✔ overflow
         LD    F2, 45+R3       ✔ address/perm
         SUBD  F8, F6, F2      ✔ overflow
         subi  R2, R2, 1
         bne   R2, R8, Label   ✔ misprediction

Assume an exception in the second LD. Can I re-execute *all* instructions (from Label)?
• Yes, if I have the ARF + PC + Mem for Label

Coarse-grain Precise Points

• Keep enough info only for *some* points in the program
  – Branch (and perhaps mem) instructions
• On a “problem” (exception/misprediction/etc.)
  – Return to the latest safe point before the problem, i.e. undo/kill all instructions after that point and restart from its PC
  – All memory effects should be undoable: writes should not reach memory until committed

Will return to this when we study current processor structure & mechanisms.
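The coarse-grain idea can be sketched as a checkpoint of registers and PC plus a buffer of pending stores. All names here are hypothetical, and the model assumes a single live checkpoint; a real design would keep several and take them at branches.

```python
class CheckpointedCore:
    """Sketch: state is saved only at safe points; stores are buffered until then."""

    def __init__(self):
        self.regs = {}
        self.memory = {}
        self.pc = 0
        self.pending_stores = []   # writes that have not reached memory yet
        self.checkpoint = ({}, 0)  # (register snapshot, PC) of the last safe point

    def take_checkpoint(self):
        # Everything before this point is now committed: drain the store buffer.
        for addr, value in self.pending_stores:
            self.memory[addr] = value
        self.pending_stores = []
        self.checkpoint = (dict(self.regs), self.pc)

    def write_reg(self, name, value):
        self.regs[name] = value

    def store(self, addr, value):
        self.pending_stores.append((addr, value))

    def rollback(self):
        # Undo everything after the last safe point and restart from its PC.
        regs, pc = self.checkpoint
        self.regs = dict(regs)
        self.pc = pc
        self.pending_stores = []


core = CheckpointedCore()
core.write_reg("R2", 10)
core.pc = 0x40
core.take_checkpoint()       # safe point, e.g. at a branch

core.write_reg("R2", 9)      # speculative work after the safe point
core.store(0x1000, 123)
core.pc = 0x48
core.rollback()              # exception or misprediction
```

After the rollback the register and PC hold their checkpointed values and memory was never touched, which is why the buffered store costs nothing to undo.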

Superscalar Pipeline Design

• Instruction Fetching Issues
• Instruction Decoding Issues
• Instruction Dispatching Issues
• Instruction Execution Issues
• Instruction Completion & Retiring Issues

Centralized reservation station

• Merges the dispatch and issue stages

Distributed reservation station
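One commonly cited difference between the two organizations is slot utilization, which the sketch below illustrates. It is a simplified model with hypothetical names and capacities: dispatch is in order and stops at the first instruction that finds no free slot, a centralized design shares one pool, and a distributed design gives each functional-unit type its own station.

```python
def dispatch_centralized(instrs, total_slots):
    """One shared pool: dispatch in order until the pool is full."""
    return instrs[:total_slots]


def dispatch_distributed(instrs, slots_per_unit):
    """One station per unit type: stop at the first instruction whose station is full."""
    used = {unit: 0 for unit in slots_per_unit}
    accepted = []
    for op, unit in instrs:
        if used[unit] == slots_per_unit[unit]:
            break  # in-order dispatch stalls here
        used[unit] += 1
        accepted.append((op, unit))
    return accepted


# Hypothetical instruction mix: three multiplies followed by one add.
instrs = [("MUL.D", "fpmul"), ("MUL.D", "fpmul"), ("MUL.D", "fpmul"), ("ADD.D", "fpadd")]

central = dispatch_centralized(instrs, total_slots=4)
distributed = dispatch_distributed(instrs, {"fpmul": 2, "fpadd": 2})
```

With the same total of four slots, the shared pool accepts all four instructions while the distributed stations stall after two, because the multiplier station fills up even though the adder station is empty.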

Instruction Completion and Retiring

• Instruction stages:
  – Fetch
  – Decode
  – Dispatch
  – Issue
  – Execute
  – Finish
  – Complete
  – Retire