CSL 718: Superscalar Processors
Issue and Despatch
23rd Jan, 2006
Anshul Kumar, CSE IITD
- Slides: 38
Early proposals/prototypes
[Timeline axis: 1982 to 1989]
• IBM: Cheetah, America project (4)
• DEC: Multititan project (2)
• Stanford U: Match (2), Torch (4)
• Kyushu U: SIMP (4), DSNS (4)
• Term "superscalar" introduced
Commercial superscalars: RISCs
[Timeline axis: 1989, 1990, 1992, 1993, 1994]
• Intel: 960 KA/KB, 960 CA (3)
• IBM: Power1, RS/6000 (4)
• HP: PA 7000, PA 7100 (2)
• SUN: SPARC, SuperSparc (3)
• DEC: Alpha 21064 (2)
• Motorola: MC 88100, MC 88110 (2)
• PowerPC 601/603 (3)
• MIPS: R4000, R8000 (4)
Commercial superscalars: CISCs
[Timeline axis: 1993, 1995]
• Intel: 80486, Pentium (2)
• Motorola: MC 68040, MC 68060 (2)
• Gmicro/100p, Gmicro 500 (2)
• AMD: K5 (2)
  – 4 RISC instr
• CYRIX: M1 (2)
Tasks of superscalar processing
• Parallel decoding and issue
• Parallel instruction execution
• Preserving the sequential consistency of instruction execution and exception processing
Superscalar decode and issue
[Figure: the I-cache fills an instruction buffer that feeds the decode & issue logic. Scalar issue: pipeline stages IF, D/I. Superscalar issue: pipeline stages IF, D, I.]
Parallel Decoding
• Fetch multiple instructions into the instruction buffer
• Decode multiple instructions in parallel
  – instruction window
• Possibly check dependencies among these as well as with the instructions already under execution
Pre-decoding
• Do partial decoding while instructions are being loaded into the I-cache
• Decoded information is appended to the instruction
• This includes instruction class, resources required, etc.
[Figure: second level cache or main memory (N bits/cycle) feeds the pre-decode unit, which writes N + n bits/cycle into the I-cache]
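The pre-decoding idea above can be illustrated with a minimal Python sketch. The 32-bit word width, the 2 pre-decode bits, the four instruction classes and the rule that the top two bits of a word select the class are all assumptions made for illustration; none of them comes from the slides or from a real ISA.

```python
from enum import IntEnum

class InstrClass(IntEnum):
    # Hypothetical instruction classes for a toy ISA.
    ALU = 0
    LOAD_STORE = 1
    BRANCH = 2
    FP = 3

N_BITS = 32          # N: width of a raw instruction word (assumed)
PREDECODE_BITS = 2   # n: extra bits appended per instruction (assumed)

def predecode(word: int) -> int:
    """Return the (N + n)-bit word that would be stored in the I-cache."""
    # Toy encoding: the top two bits of the raw word select the class.
    cls = InstrClass((word >> (N_BITS - 2)) & 0b11)
    # Append the class as the low-order n bits of the widened word.
    return (word << PREDECODE_BITS) | int(cls)

def instr_class(tagged: int) -> InstrClass:
    """The decode stage reads the class directly from the appended bits."""
    return InstrClass(tagged & ((1 << PREDECODE_BITS) - 1))
```

With these assumptions, `instr_class(predecode(0x80000001))` yields `InstrClass.BRANCH`, and shifting the tagged word right by `PREDECODE_BITS` recovers the original instruction.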
Number of Pre-decode bits
Processor            No. of predecode bits
PA 7200 (1995)       5
PA 8000 (1996)       5
PowerPC 620 (1996)   7
UltraSparc (1995)    4
HAL PM1 (1995)       4
AMD K5 (1995)        5 (per byte)
R10000 (1996)        4
Issue vs Dispatch
• Blocking issue
  – Decode and issue to EU
  – Instructions may be blocked due to data dependency
• Non-blocking issue
  – Decode and issue to buffer
  – From buffer dispatch to EU
  – Instructions are not blocked due to data dependency
Blocking Issue
[Figure: instruction buffer with issue window, then decode, check & issue, then the EUs]
Non-blocking (shelved) Issue
[Figure: instruction buffer, then decode & issue, then reservation stations with dependency checking/dispatch, then the EUs]
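The difference between blocking and shelved issue can be sketched with a small timing model. This is a simplified illustration, not a model of any real processor: it assumes one instruction issued per cycle, unlimited EUs and reservation station space, only true (read-after-write) dependencies, and one extra cycle spent in the reservation station. The `Instr` type and both function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Instr:
    dest: str            # destination register
    srcs: tuple = ()     # source registers
    latency: int = 1     # execution latency in cycles

def blocking_issue(program):
    """Cycle in which each instruction starts executing when issue blocks."""
    ready = {}           # register -> cycle in which its value is available
    start = []
    cycle = 0
    for ins in program:
        # Decode/issue stalls until every source operand is available,
        # and everything behind the stalled instruction waits too.
        cycle = max([cycle] + [ready.get(r, 0) for r in ins.srcs])
        start.append(cycle)
        ready[ins.dest] = cycle + ins.latency
        cycle += 1
    return start

def shelved_issue(program):
    """Same measure when instructions are issued into reservation stations."""
    ready = {}
    start = []
    for issue_cycle, ins in enumerate(program):
        # Issue never stalls on data; the instruction waits in its RS and is
        # dispatched to an EU once its operands are available.
        dispatch = max([issue_cycle + 1] + [ready.get(r, 0) for r in ins.srcs])
        start.append(dispatch)
        ready[ins.dest] = dispatch + ins.latency
    return start

program = [
    Instr("r1", (), latency=3),   # long-latency producer
    Instr("r2", ("r1",)),         # depends on r1
    Instr("r3"),                  # independent
    Instr("r4"),                  # independent
]
```

For this program the blocking model gives start cycles `[0, 3, 4, 5]` and the shelved model gives `[1, 4, 3, 4]`: the two independent instructions no longer wait behind the dependent one, at the cost of one cycle in the reservation station.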
Handling of Issue Blockages
• Preserving issue order: in-order, out of order
• Alignment of instruction issue: aligned, unaligned
Issue Order
• Issue in strict program order: with a, b, c, d in the issue window (e still to be issued), a is issued
• Out of order issue: from the same window, a and c are issued
  – Example: MC 88110, PowerPC 601
[Figure legend: independent instruction, dependent instruction, issued instruction]
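The two issue orders can be sketched as two ways of scanning the issue window. The window contents below (a and c independent, b and d dependent) are an assumption chosen to mirror the figure, and the function names are hypothetical.

```python
def issue_in_order(window):
    """Issue from the oldest instruction until the first dependent one."""
    issued = []
    for name, independent in window:
        if not independent:
            break                 # a dependent instruction blocks all later ones
        issued.append(name)
    return issued

def issue_out_of_order(window):
    """Issue every independent instruction, skipping over dependent ones."""
    return [name for name, independent in window if independent]

# Oldest instruction first; the flag says whether the instruction is
# currently free of dependencies (assumed values).
window = [("a", True), ("b", False), ("c", True), ("d", False)]
```

With this window, in-order issue releases only `a`, while out-of-order issue releases `a` and `c`.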
Alignment
• Aligned issue: fixed window, followed by the next window
• Unaligned issue: gliding window
[Figure: instructions a to h, showing which are checked and which are issued in cycles 1, 2 and 3 under each scheme]
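A rough way to compare the two alignments is to count issue cycles for a short instruction stream. The sketch below assumes in-order issue, a window of `width` instructions, and a hypothetical list `earliest` giving, for each instruction, the first cycle in which it is free of dependencies; these are modelling assumptions, not details from the slides.

```python
def issue_cycles(earliest, width, aligned):
    """Number of cycles needed to issue all instructions in order."""
    n = len(earliest)
    next_i = 0    # index of the oldest instruction not yet issued
    base = 0      # start of the current fixed window (aligned case only)
    cycle = 0
    while next_i < n:
        cycle += 1
        if aligned:
            limit = min(base + width, n)     # fixed window
        else:
            limit = min(next_i + width, n)   # gliding window
        # Issue in order until a blocked instruction or the window edge.
        while next_i < limit and earliest[next_i] <= cycle:
            next_i += 1
        if aligned and next_i == limit:
            base = limit                     # window drained: take the next one
    return cycle

# Seven instructions; the fourth is blocked during cycle 1.
earliest = [1, 1, 1, 2, 1, 1, 1]
```

With a window of 4, the aligned scheme needs 3 cycles (three instructions, then the blocked one alone, then the next window), while the gliding window needs 2 because it refills behind the issued instructions.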
Design choices in instruction issue
• Coping with false data dependencies: no, register renaming
• Coping with unresolved control dependencies: wait, speculative
• Use of shelving: blocking, shelved
• Handling of issue blockages
• Issue rate (2-6)
Frequently used issue policies in scalar processors
• Traditional scalar issue: i386, MC 68030, R3000, Sparc
• Traditional scalar issue with shelving: CDC 6600
• Traditional scalar issue with shelving and renaming: IBM 360/91
• Traditional scalar issue with spec. execution: i486, MC 68040, R4000, MicroSparc
Frequently used issue policies in superscalar processors
(speculative execution in all)
• Straightforward superscalar issue: aligned, unaligned
• Straightforward superscalar issue with shelving
• Straightforward superscalar issue with renaming
• Advanced superscalar issue (renaming + shelving)
[Figure: processors placed under these policies, in slide order: MC 68060, MC 88110, Pentium, PowerPC 601, PA 7200, R8000, UltraSparc, PA 7100, SuperSparc, Alpha 21164, PowerPC 602, R10000, PentiumPro, PowerPC 602, PA 8000, Sparc64, Am 29000, K5]
Frequently used issue policies
• Traditional scalar issue with spec. execution
• Straightforward superscalar issue: aligned
• Advanced superscalar issue: unaligned
Design Space of Shelving
• Scope of shelving: partial, full
• Layout of shelving buffers
• Operand fetch policy
• Instruction dispatch scheme
Layout of Shelving Buffers
• Type of the shelving buffers
  – stand-alone (RS)
  – combined with renaming and reordering
• Number of shelving buffer entries
  – individual: 2-4
  – group: 6-16
  – central: 20
  – total: 15-40
• Number of read and write ports
  – depends on no. of EUs connected
Reservation Stations (RS)
• Individual RSs
• Group RSs
• Central RS
[Figure: RS and EU arrangement for each of the three layouts]
Combined Buffer (for Shelving, Renaming, Reordering)
• DRIS: Deferred scheduling, Register renaming and Instruction Shelving
[Figure: instructions from decode/issue enter the DRIS, which feeds the EUs]
Operand Fetch Policies
• Issue bound fetch
• Dispatch bound fetch
Issue bound operand fetch (with single register file)
[Figure: instruction and data paths between decode/issue, the RF, the RSs and the EUs]
Dispatch bound operand fetch (with single register file)
[Figure: instruction and data paths between decode/issue, the RSs, the RF and the EUs]
Issue bound operand fetch (with multiple register files)
[Figure: instruction and data paths between decode/issue, the RFs, the RSs and the EUs]
Dispatch bound operand fetch (with multiple register files)
[Figure: instruction and data paths between decode/issue, the RSs, the RFs and the EUs]
Updating RFs and RSs
[Figure: instruction and data paths between decode/issue, the RFs, the RSs and the EUs]
Instruction dispatch scheme
• Dispatch policy
• Dispatch rate
  – individual RS: single instr/cycle
  – group or central RS: multiple instr/cycle
• Checking operand availability
• Treatment of empty RS
Dispatch policy
• Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
• Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
• Dispatch order
Dispatch order
• in-order
• partially out of order
[Figure: RS check in each case]
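The selection rule, arbitration rule and dispatch order can be sketched together as follows. The `Entry` type and the function names are hypothetical. The in-order case checks only the oldest shelved instruction; the other case checks every entry, which is a simplifying assumption standing in for the partially out-of-order scheme named on the slide.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    seq: int      # position in program order; lower means earlier
    ready: bool   # True when all source operands are available

def select(rs, in_order):
    """Selection rule: which shelved instructions may be dispatched now."""
    by_age = sorted(rs, key=lambda e: e.seq)
    if in_order:
        # In-order dispatch: a not-ready oldest entry blocks everything.
        return by_age[:1] if by_age and by_age[0].ready else []
    # Out-of-order check: any ready entry is a candidate.
    return [e for e in by_age if e.ready]

def arbitrate(candidates, rate=1):
    """Arbitration rule: earlier instructions have priority."""
    return sorted(candidates, key=lambda e: e.seq)[:rate]

rs = [Entry(2, True), Entry(0, False), Entry(1, True)]
```

Here `select(rs, in_order=True)` is empty because the oldest entry is not ready, while `arbitrate(select(rs, in_order=False))` picks the entry with `seq == 1`. Raising `rate` models a group or central RS that dispatches several instructions per cycle.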
Checking availability of operands
• Direct check of score-board bits (usual for dispatch bound operand fetch): control flow approach
• Check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
(control flow / data flow: Flynn's terminology)
Score-board
• Introduced with CDC 6600
[Figure: register file with a data field and a status bit per register; registers 0, 1, 2 shown with status 1, 0, 1]
Checking in dispatch bound fetch
• Issue: the decoded instruction supplies Rs1, Rs2, Rd; reset V bit of Rd in the register file
• Reservation station entry: OC (opcode), Rs1, Rs2, Rd
• Dispatch: check V bits of sources; the register file delivers Os1, Os2 (operand values); OC, Os1, Os2 go to the EU
• Completion: the EU returns result, Rd; update Rd, set V bit
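The score-board check used with dispatch bound fetch can be sketched as a register file with one valid (V) bit per register. The class and method names are hypothetical, and the sketch assumes at most one in-flight writer per register.

```python
class Scoreboard:
    """Register file with a V (valid) bit per register."""

    def __init__(self, n_regs):
        self.data = [0] * n_regs
        self.valid = [True] * n_regs

    def issue(self, rd):
        # The result for Rd is now pending: reset its V bit.
        self.valid[rd] = False

    def can_dispatch(self, rs1, rs2):
        # Dispatch-time check: both source V bits must be set.
        return self.valid[rs1] and self.valid[rs2]

    def read_operands(self, rs1, rs2):
        # Operand values are fetched from the register file at dispatch.
        return self.data[rs1], self.data[rs2]

    def write_back(self, rd, result):
        # The EU returns (result, Rd): update Rd and set its V bit.
        self.data[rd] = result
        self.valid[rd] = True

sb = Scoreboard(3)
sb.data = [5, 0, 7]
sb.issue(1)   # an instruction writing register 1 has been issued
```

After `sb.issue(1)` the V bits are `[True, False, True]`, the same 1, 0, 1 status pattern as in the score-board figure. An instruction reading registers 0 and 1 cannot be dispatched until `sb.write_back(1, ...)` sets the bit again.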
Checking in issue bound fetch
• Issue: the decoded instruction supplies Rs1, Rs2, Rd; the register file delivers Os1, Os2 (operand values); reset V bit of Rd
• Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd
• Dispatch: check Vs1, Vs2; OC, Os1, Os2, Rd go to the EU
• Completion: the EU returns result, Rd; update Rd and set V bit in the register file; associative update of Is1, Is2 with Rd, set Vs bits
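The issue bound scheme can be sketched with reservation station entries that hold either an operand value or the identifier of the register still being waited for. This is a minimal illustration with hypothetical names; it assumes no register renaming and at most one in-flight writer per register, so a register number is enough to serve as the identifier.

```python
from dataclasses import dataclass

@dataclass
class RSEntry:
    oc: str     # opcode
    s1: int     # operand value (Os1) if v1 is set, else register id (Is1)
    v1: bool    # Vs1
    s2: int     # operand value (Os2) if v2 is set, else register id (Is2)
    v2: bool    # Vs2
    rd: int     # destination register

class Machine:
    def __init__(self, n_regs):
        self.rf = [0] * n_regs
        self.valid = [True] * n_regs   # V bit per register
        self.rs = []                   # reservation station, oldest first

    def _fetch(self, r):
        # Valid register: copy its value; otherwise keep its id as a tag.
        return (self.rf[r], True) if self.valid[r] else (r, False)

    def issue(self, oc, rs1, rs2, rd):
        s1, v1 = self._fetch(rs1)
        s2, v2 = self._fetch(rs2)
        self.rs.append(RSEntry(oc, s1, v1, s2, v2, rd))
        self.valid[rd] = False         # reset V bit of Rd

    def dispatch(self):
        """Remove and return the oldest entry with Vs1 and Vs2 set, if any."""
        for i, e in enumerate(self.rs):
            if e.v1 and e.v2:
                return self.rs.pop(i)
        return None

    def complete(self, rd, result):
        # Update Rd and set its V bit in the register file.
        self.rf[rd] = result
        self.valid[rd] = True
        # Associative update: every waiting operand tagged with Rd gets the value.
        for e in self.rs:
            if not e.v1 and e.s1 == rd:
                e.s1, e.v1 = result, True
            if not e.v2 and e.s2 == rd:
                e.s2, e.v2 = result, True
```

For example, after issuing an `add` that writes register 3 and a `mul` that reads register 3, only the `add` can be dispatched; once `complete(3, value)` runs, the `mul` entry holds the value itself and becomes dispatchable without touching the register file again.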
Treatment of an empty RS
• Straightforward approach: at least one cycle stay in RS
  – Example: Nx586
• Bypassing RS if empty
  – Examples: Sparc64, PowerPC 604
Approaches in dispatching
• Dispatch order: in order, partially out of order
• Dispatch rate: single, multiple instr/cycle
• RS layout: individual RSs, group/central RSs
• Straightforward: Power1, PPC 603, Nx586, Am 29000
• Enhanced: Power2
• Advanced: PPC 604, 620, PM1, PentiumPro, PA 8000, R10000