CSL 718: Superscalar Processors
Issue and Despatch
23rd Jan, 2006
Anshul Kumar, CSE IITD

Early proposals/prototypes
[Timeline figure spanning 1982 to 1989; the layout is not recoverable from the extracted text]
  • Organizations: IBM, DEC, Stanford U, Kyushu U
  • Projects: America project (4), Cheetah, Multititan project (2), Match (2), Torch (4), SIMP (4), DSNS (4)
  • The timeline also carries a "Term Superscalar" marker

Commercial superscalars: RISCs
[Timeline figure; year labels 1989, 1990, 1992, 1993, 1994]
  • Intel: 960 KA/KB, 960 CA (3)
  • IBM: Power 1 RS/6000 (4)
  • HP: PA 7000, PA 7100 (2)
  • SUN: SPARC, SuperSparc (3)
  • DEC: Alpha 21064 (2)
  • Motorola: MC 88100, MC 88110 (2)
  • PowerPC 601/603 (3)
  • MIPS: R4000, R8000 (4)

Commercial superscalars: CISCs
[Timeline figure; year labels 1993, 1995]
  • Intel: 80486, Pentium (2)
  • Motorola: MC 68040, MC 68060 (2)
  • Gmicro/100p, Gmicro 500 (2)
  • AMD: K5 (2)
      – 4 RISC instr
  • CYRIX: M1 (2)

Tasks of superscalar processing
  • Parallel decoding and issue
  • Parallel instruction execution
  • Preserving the sequential consistency of instruction execution and exception processing

Superscalar decode and issue
[Figure: I-cache feeding an instruction buffer, followed by Decode & Issue]
  • Scalar issue: IF, D/I
  • Superscalar issue: IF, D, I

Parallel Decoding
  • Fetch multiple instructions into the instruction buffer
  • Decode multiple instructions in parallel
      – instruction window
  • Possibly check dependencies among these as well as with the instructions already under execution

Pre-decoding
  • Do partial decoding while instructions are being loaded into the I-cache
  • Decoded information is appended to the instruction
  • This includes instruction class, resources required, etc.
[Figure: second level cache or main memory -> (N bits/cycle) -> pre-decode unit -> (N + n bits/cycle) -> I-cache]
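The idea of appending pre-decode bits to each instruction can be sketched as follows. This is only an illustration: the 2-bit class encoding, the mnemonics, and the names `CLASS_BITS`, `classify`, `predecode`, and `instruction_class` are assumptions, not taken from any of the processors discussed here.

```python
# Hypothetical 2-bit instruction-class encoding (n = 2 pre-decode bits).
CLASS_BITS = {"alu": 0b00, "load_store": 0b01, "branch": 0b10, "fp": 0b11}


def classify(mnemonic: str) -> str:
    """Toy classifier: map a mnemonic to an instruction class."""
    if mnemonic in ("lw", "sw"):
        return "load_store"
    if mnemonic in ("beq", "bne", "j"):
        return "branch"
    if mnemonic.startswith("f"):
        return "fp"
    return "alu"


def predecode(word: int, mnemonic: str, width: int = 32) -> int:
    """Place the class bits above the N-bit word, giving N + n bits."""
    return (CLASS_BITS[classify(mnemonic)] << width) | word


def instruction_class(tagged: int, width: int = 32) -> int:
    """Read the class bits back without decoding the instruction itself."""
    return tagged >> width


tagged = predecode(0x8C010000, "lw")
```

With such tags stored in the I-cache, the issue logic could steer an instruction to the right kind of EU by looking only at the extra bits.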

Number of Pre-decode bits

Processor             No. of predecode bits
PA 7200 (1995)        5
PA 8000 (1996)        5
PowerPC 620 (1996)    7
UltraSparc (1995)     4
HAL PM1 (1995)        4
AMD K5 (1995)         5 (per byte)
R10000 (1996)         4

Issue vs Dispatch
Blocking Issue
  • Decode and issue to EU
  • Instructions may be blocked due to data dependency
Non-blocking Issue
  • Decode and issue to buffer
  • From buffer dispatch to EU
  • Instructions are not blocked due to data dependency

Blocking Issue
[Figure: instruction buffer with issue window -> Decode, Check & Issue -> EU, EU, EU]

Non-blocking (shelved) Issue
[Figure: instruction buffer -> Decode & Issue -> reservation station -> dependency checking/dispatch -> EU, EU, EU]
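The difference between blocking and shelved issue can be illustrated with a small timing sketch. It assumes one instruction is decoded per cycle, that an EU is always free, and that a shelved instruction spends at least one cycle in its reservation station. The functions `blocking_issue` and `shelved_issue` and the `ready_at` list are hypothetical names introduced for this example.

```python
def blocking_issue(ready_at):
    """Cycle in which each instruction reaches an EU under blocking issue.

    ready_at[i] is the first cycle in which instruction i has its operands.
    Instructions issue in order, one per cycle; an instruction whose
    operands are missing holds up everything behind it.
    """
    cycle, start = 0, []
    for ready in ready_at:
        cycle = max(cycle, ready)   # wait here until operands are available
        start.append(cycle)
        cycle += 1
    return start


def shelved_issue(ready_at):
    """Cycle in which each instruction is dispatched under shelved issue.

    Instruction i is issued to a reservation station in cycle i and is
    dispatched in the first later cycle in which its operands are available.
    """
    return [max(i + 1, ready) for i, ready in enumerate(ready_at)]


ready_at = [0, 5, 0, 0]            # only the second instruction must wait
blocking = blocking_issue(ready_at)
shelved = shelved_issue(ready_at)
```

In this example the blocked second instruction delays the third and fourth under blocking issue, while with shelving they leave their reservation stations before it.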

Handling of Issue Blockages
  • Preserving issue order: in-order / out of order
  • Alignment of instruction issue: aligned / unaligned

Issue Order
  • Issue in strict program order
  • Out of order issue (example: MC 88110, PowerPC 601)
[Figure: for each policy, an issue window holding instructions a to e (instructions to be issued) and the instructions issued; the legend distinguishes independent, dependent, and issued instructions]
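The two issue orders can be contrasted with a short sketch. The window contents and the set of independent instructions are made up for illustration, and `issue_in_order` and `issue_out_of_order` are hypothetical helper names.

```python
def issue_in_order(window, independent):
    """Issue from the oldest instruction on, stopping at the first dependent one."""
    issued = []
    for instr in window:            # window is ordered oldest first
        if instr not in independent:
            break                   # a dependent instruction blocks the rest
        issued.append(instr)
    return issued


def issue_out_of_order(window, independent):
    """Issue every independent instruction in the window, skipping dependent ones."""
    return [instr for instr in window if instr in independent]


window = ["a", "b", "c", "d"]       # oldest first
independent = {"a", "c"}            # b and d wait on earlier results
in_order = issue_in_order(window, independent)
out_of_order = issue_out_of_order(window, independent)
```

Here in-order issue sends only `a`, because the dependent `b` blocks `c`, whereas out-of-order issue sends both `a` and `c`.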

Alignment
  • Aligned issue: fixed window, followed by the next window
  • Unaligned issue: gliding window
[Figure: instructions a to h shown over three cycles for each scheme, marking which are checked and which are issued in cycles 1, 2, and 3]
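A minimal sketch of the two alignment schemes, assuming in-order issue, an issue rate equal to the window width, and a hypothetical `ready_at` list giving the first cycle in which each instruction may issue. The function name `issue_cycles` is an assumption made for this example.

```python
def issue_cycles(ready_at, width, aligned):
    """Return the cycle in which each instruction is issued (cycles start at 1).

    aligned=True  : fixed window; the next window is examined only after
                    every instruction of the current window has been issued.
    aligned=False : gliding window; each cycle the window covers the next
                    `width` instructions that have not yet been issued.
    """
    n = len(ready_at)
    issued_in = [None] * n
    nxt = 0                                   # oldest unissued instruction
    cycle = 0
    while nxt < n:
        cycle += 1
        if aligned:
            end = (nxt // width + 1) * width  # end of the fixed window holding nxt
        else:
            end = nxt + width                 # window glides to start at nxt
        end = min(end, n)
        while nxt < end and ready_at[nxt] <= cycle:
            issued_in[nxt] = cycle
            nxt += 1
    return issued_in


ready_at = [1, 1, 2, 1, 1, 1, 1, 1]           # third instruction waits until cycle 2
aligned_result = issue_cycles(ready_at, 4, aligned=True)
unaligned_result = issue_cycles(ready_at, 4, aligned=False)
```

In this example the gliding window issues the fifth and sixth instructions one cycle earlier than the fixed window does, because it refills as soon as slots become free.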

Design choices in instruction issue
  • Coping with false data dependencies: no / register renaming
  • Coping with unresolved control dependencies: wait / speculative
  • Use of shelving: blocking / shelved
  • Handling of issue blockages
  • Issue rate (2-6)

Frequently used issue policies in scalar processors
  • Traditional scalar issue: i386, MC 68030, R3000, Sparc
  • Traditional scalar issue with shelving: CDC 6600
  • Traditional scalar issue with shelving and renaming: IBM 360/91
  • Traditional scalar issue with spec. execution: i486, MC 68040, R4000, MicroSparc

Frequently used issue policies in superscalar processors
(speculative execution in all)
  • Straightforward superscalar issue: aligned / unaligned
  • Straightforward superscalar issue with shelving
  • Straightforward superscalar issue with renaming
  • Advanced superscalar issue (renaming + shelving)
[Figure: processors placed under these policies; the column assignment is not recoverable from the extracted text. Processors named: MC 68060, MC 88110, Pentium, PowerPC 601, PA 7200, R8000, UltraSparc, PA 7100, SuperSparc, Alpha 21164, PowerPC 602 (listed twice), R10000, PentiumPro, PA 8000, Sparc64, Am29000, K5]

Frequently used issue policies
  • Traditional scalar issue with spec. execution
  • Straightforward superscalar issue: aligned / unaligned
  • Advanced superscalar issue

Design Space of Shelving
  • Scope of shelving: partial / full
  • Layout of shelving buffers
  • Operand fetch policy
  • Instruction dispatch scheme

Layout of Shelving Buffers
  • Type of the shelving buffers: stand alone (RS) / combined with renaming and reordering
  • Number of shelving buffer entries: individual 2-4, group 6-16, central 20, total 15-40
  • Number of read and write ports: depends on no. of EUs connected

Reservation Stations (RS)
  • Individual RSs
  • Group RSs
  • Central RS
[Figure: RS and EU blocks for each layout: one RS per EU for individual RSs, an RS shared by several EUs for group RSs, and a single RS feeding all EUs for the central RS]

Combined Buffer (for Shelving, Renaming, Reordering)
  • DRIS: Deferred scheduling, Register renaming and Instruction Shelving
[Figure: from decode/issue -> DRIS -> EU, EU]

Operand Fetch Policies
  • Issue bound fetch
  • Dispatch bound fetch

Issue bound operand fetch (with single register file)
[Figure: Decode/issue, RF, RSs, and EUs, with paths labelled "instruction" and "data"]

Dispatch bound operand fetch (with single register file)
[Figure: Decode/issue, RSs, RF, and EUs, with paths labelled "instruction" and "data"]

Issue bound operand fetch (with multiple register files)
[Figure: Decode/issue, two RFs, RSs, and EUs, with paths labelled "instruction" and "data"]

Dispatch bound operand fetch (with multiple register files)
[Figure: Decode/issue, RSs, two RFs, and EUs, with paths labelled "instruction" and "data"]

Updating RFs and RSs
[Figure: Decode/issue, two RFs, RSs, and EUs, with paths labelled "instruction" and "data"]

Instruction dispatch scheme
  • Dispatch policy
  • Dispatch rate: single instr/cycle (individual RS) / multiple instr/cycle (group or central RS)
  • Checking operand availability
  • Treatment of empty RS

Dispatch policy
  • Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
  • Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
  • Dispatch order
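The selection and arbitration rules can be sketched as two small functions over reservation station entries. The `Entry` fields and the names `select_ready` and `arbitrate` are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    seq: int                 # position in program order; lower means earlier
    op: str
    src_ready: tuple         # one availability flag per source operand


def select_ready(station):
    """Selection rule: an entry is ready when all its source operands are available."""
    return [e for e in station if all(e.src_ready)]


def arbitrate(ready):
    """Arbitration rule: among ready entries, the earliest instruction wins."""
    return min(ready, key=lambda e: e.seq) if ready else None


station = [
    Entry(seq=3, op="mul", src_ready=(True, True)),
    Entry(seq=1, op="add", src_ready=(True, False)),   # still waiting on one operand
    Entry(seq=2, op="sub", src_ready=(True, True)),
]
ready = select_ready(station)
chosen = arbitrate(ready)
```

In this example `sub` is dispatched: `add` is earlier in program order but not ready, and `sub` precedes `mul` among the ready entries.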

Dispatch order
  • In-order
  • Partially out of order
[Figure: RS check shown for each case]

Checking availability of operands
  • Direct check of score-board bits (usual for dispatch bound operand fetch): control flow approach
  • Check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
  • "Control flow approach" and "data flow approach" are Flynn’s terminology

Score-board
  • Introduced with CDC 6600
[Figure: register file with entries 0, 1, 2, each holding data and a status bit (1, 0, 1 in the example)]

Checking in dispatch bound fetch
[Figure labels, in flow order]
  • Decoded instruction enters the reservation station: OC (opcode), Rs1, Rs2, Rd
  • Rs1, Rs2, Rd go to the register file: check V bits of sources, reset V bit of Rd
  • OC, Os1, Os2 (operand values) go to the EU
  • EU returns result, Rd: update Rd, set V bit
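A toy model of the score-board check used with dispatch bound operand fetch is given here. It assumes one V bit per register and ignores false dependencies, since there is no renaming in this model. The class `ScoreboardRF` and its method names are hypothetical.

```python
class ScoreboardRF:
    """Register file with one valid (V) bit per register."""

    def __init__(self, nregs):
        self.value = [0] * nregs
        self.valid = [True] * nregs

    def issue(self, rd):
        # Issuing an instruction resets the V bit of its destination.
        self.valid[rd] = False

    def can_dispatch(self, rs1, rs2):
        # Dispatch-time check: both source V bits must be set.
        return self.valid[rs1] and self.valid[rs2]

    def read(self, rs1, rs2):
        # Operands are fetched from the register file at dispatch time.
        return self.value[rs1], self.value[rs2]

    def write_back(self, rd, result):
        # The EU returns (result, Rd): update Rd and set its V bit.
        self.value[rd] = result
        self.valid[rd] = True


rf = ScoreboardRF(4)
rf.value[1], rf.value[2] = 10, 5
rf.issue(3)                               # i1: r3 = r1 + r2
rf.issue(0)                               # i2: r0 = r3 * r1
i1_ready = rf.can_dispatch(1, 2)
i2_ready_before = rf.can_dispatch(3, 1)   # r3 is still pending
a, b = rf.read(1, 2)
rf.write_back(3, a + b)                   # i1 completes
i2_ready_after = rf.can_dispatch(3, 1)
```

The second instruction stays in its reservation station until the first writes r3 back and the V bit is set again.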

Checking in issue bound fetch
[Figure labels, in flow order]
  • Decoded instruction: Rs1, Rs2, Rd go to the register file; reset V bit of Rd
  • Register file supplies Os1, Os2 (operand values) to the reservation station
  • Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd; check Vs1, Vs2
  • OC, Os1, Os2, Rd go to the EU
  • EU returns result, Rd: update Rd and set V bit in the register file; associative update of Is1, Is2 with Rd, set Vs bits
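The issue bound scheme, with explicit status bits in the reservation station and an associative update on completion, might be sketched as follows. The sketch uses the destination register number as the identifier (Is) and does no renaming. `RSEntry`, `Machine`, and their methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RSEntry:
    oc: str      # opcode
    s1: int      # operand value (Os1) if v1, else source register id (Is1)
    v1: bool
    s2: int      # operand value (Os2) if v2, else source register id (Is2)
    v2: bool
    rd: int


class Machine:
    def __init__(self, nregs):
        self.reg = [0] * nregs
        self.valid = [True] * nregs
        self.rs = []

    def _fetch(self, r):
        # Issue-time fetch: the value if available, otherwise the register id.
        return (self.reg[r], True) if self.valid[r] else (r, False)

    def issue(self, oc, rs1, rs2, rd):
        s1, v1 = self._fetch(rs1)
        s2, v2 = self._fetch(rs2)
        self.valid[rd] = False                # reset V bit of Rd
        entry = RSEntry(oc, s1, v1, s2, v2, rd)
        self.rs.append(entry)
        return entry

    def ready(self):
        # Check of the explicit status bits Vs1 and Vs2 in the RS.
        return [e for e in self.rs if e.v1 and e.v2]

    def complete(self, rd, result):
        # Update Rd and set its V bit, then update waiting entries associatively.
        self.reg[rd], self.valid[rd] = result, True
        for e in self.rs:
            if not e.v1 and e.s1 == rd:
                e.s1, e.v1 = result, True
            if not e.v2 and e.s2 == rd:
                e.s2, e.v2 = result, True


m = Machine(4)
m.reg[1], m.reg[2] = 10, 5
add = m.issue("add", 1, 2, 3)     # r3 = r1 + r2, operands captured at issue
mul = m.issue("mul", 3, 1, 0)     # r0 = r3 * r1, waits for r3
ready_before = [e.oc for e in m.ready()]
m.rs.remove(add)                  # dispatch add to an EU
m.complete(add.rd, add.s1 + add.s2)
ready_after = [e.oc for e in m.ready()]
```

After `add` completes, the waiting `mul` entry receives the value 15 in place of the register identifier and becomes ready without reading the register file again.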

Treatment of an empty RS
  • Straightforward approach: at least one cycle stay in RS (Nx586)
  • Bypassing RS if empty (Sparc64, PowerPC 604)

Approaches in dispatching
  • Straightforward: Power 1, PPC 603, Nx586, Am29000
  • Enhanced: Power 2
  • Advanced: PPC 604, 620, PM1, PentiumPro, PA 8000, R10000
Attributes, ranging from straightforward to advanced:
  • Dispatch order: in order to partially out of order
  • Dispatch rate: single to multiple instr/cycle
  • RS type: individual RSs to group/central RSs