Superscalar SMIPS Processor
Andy Wright, Leslie Maldonado
Project Goals
• N-way superscalar execution
  – Up to N instructions can be issued every cycle
  – N execution pipelines will share a single data memory
• IPC > 1
  – Shows that superscalar execution is working
Background
• Data Hazards
• Control Hazards
• Structural Hazards
  – An instruction can’t be issued if it needs to use the same hardware as another instruction at the same time
  – Relevant for:
    • Data Memory
    • Redirect FIFO
    • Coprocessor
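The structural-hazard rule above can be sketched in a few lines of Python. This is only an illustration, not the project's Bluespec dispatch logic: the function name `issue_group`, the resource strings, and the policy of stopping at the first conflicting instruction (in-order issue) are all assumptions made for the example.

```python
# Shared hardware named on the slide; the string labels are hypothetical.
SHARED_RESOURCES = {"data_memory", "redirect_fifo", "coprocessor"}

def issue_group(needs):
    """Return how many instructions can issue this cycle.

    needs: list of sets of resource names, one set per instruction,
           in program order.
    Assumption: issue is in order, so the first instruction that hits a
    structural hazard stalls itself and every younger instruction.
    """
    claimed = set()
    issued = 0
    for resources in needs:
        wanted = resources & SHARED_RESOURCES
        if wanted & claimed:
            break  # same hardware already claimed by an older instruction
        claimed |= wanted
        issued += 1
    return issued
```

For instance, two loads in the same group both want the single data memory, so only the first would issue under this sketch.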
System Overview
Instruction Memory
• Needs to be able to output N words
• Normal instruction memory holds one word per row: 0, 1, 2, 3, 4, ...
• Instruction memory with four words per row:
  12 13 14 15
   8  9 10 11
   4  5  6  7
   0  1  2  3
• Read from address 0: 0 1 2 3
• Read from address 5: 8 5 6 7
  – Unaligned accesses need permutations
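A minimal sketch of the unaligned read, assuming the memory is split into N banks where bank b holds the words whose address is congruent to b modulo N. The helper names (`build_banks`, `fetch_raw`, `fetch`) are hypothetical, and the rotation is one plausible reading of the "permutations" on the slide.

```python
def build_banks(words, n):
    """Split a flat word list into n banks; bank b holds words with addr % n == b."""
    return [words[b::n] for b in range(n)]

def fetch_raw(banks, addr):
    """Read one word from every bank, returned in bank order (not program order)."""
    n = len(banks)
    raw = []
    for b in range(n):
        # First word address >= addr that lives in bank b.
        word_addr = addr + (b - addr) % n
        raw.append(banks[b][word_addr // n])
    return raw

def fetch(banks, addr):
    """Return the n consecutive words starting at addr, in program order."""
    n = len(banks)
    raw = fetch_raw(banks, addr)
    shift = addr % n  # rotate so the word at addr comes first
    return raw[shift:] + raw[:shift]

# 16-word memory whose contents equal their addresses, as on the slides.
banks = build_banks(list(range(16)), 4)
```

With this layout, `fetch_raw(banks, 5)` yields 8 5 6 7, matching the slide, and `fetch(banks, 5)` rotates it into 5 6 7 8. The sketch does not handle reads that run past the end of memory.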
Superscalar Fifo
• Needs to be able to enqueue N instructions per cycle
• Needs to be able to dequeue 1 to N instructions per cycle
• Architecture similar to instruction memory
Scoreboard
• Keeps track of pending register writes to prevent RAW hazards
  – Scoreboards are used to prevent conflicts between instructions across clock cycles and within the same clock cycle
• Dispatch logic searches and writes to the scoreboard
• Writeback removes from the scoreboard
  – The order of these two operations depends on the type of register file
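A small sketch of how a scoreboard might catch RAW hazards both across cycles and within one dispatch group. The `Scoreboard` class, the `dispatch` function, and the `(dst, srcs)` instruction encoding are assumptions for the example; the ordering of search and remove within a cycle, which the slide says depends on the register file, is not modelled.

```python
class Scoreboard:
    """Tracks registers that have a pending (not yet written back) write."""

    def __init__(self):
        self.pending = {}  # register number -> count of pending writes

    def search(self, reg):
        return self.pending.get(reg, 0) > 0

    def insert(self, reg):
        self.pending[reg] = self.pending.get(reg, 0) + 1

    def remove(self, reg):
        # Called by writeback once the register has been written.
        self.pending[reg] -= 1

def dispatch(scoreboard, group):
    """Issue instructions from group in order until a RAW hazard is found.

    group: list of (dst, srcs) tuples; dst may be None for no destination.
    Returns the number of instructions issued.
    """
    issued = 0
    for dst, srcs in group:
        if any(scoreboard.search(r) for r in srcs):
            break  # a source is still waiting on an older write
        if dst is not None:
            # Inserting immediately lets younger instructions in the same
            # group see this write, covering same-cycle hazards.
            scoreboard.insert(dst)
        issued += 1
    return issued
```

In this sketch, a group where the second instruction reads the first one's destination issues only one instruction; the second issues after writeback calls `remove`.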
Execution Pipelines
• Cores are prioritized relative to one another
• Core 0 has earlier instructions than core 1
• A mispredict in core i should kill instructions in all cores with index greater than i
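The kill rule above can be illustrated with a short function. It is a sketch only: `resolve_mispredicts` and the `(mispredicted, redirect_pc)` tuple format are hypothetical names chosen for the example.

```python
def resolve_mispredicts(results):
    """Decide which cores commit and where to redirect fetch.

    results: list indexed by core number; each entry is a tuple
             (mispredicted, redirect_pc). Core 0 holds the oldest instruction.
    Returns (commit, redirect): commit[i] is False if core i was killed,
    redirect is the target from the oldest mispredicting core, or None.
    """
    commit = []
    redirect = None
    for mispredicted, redirect_pc in results:
        if redirect is not None:
            commit.append(False)  # killed by an older core's mispredict
            continue
        commit.append(True)
        if mispredicted:
            redirect = redirect_pc
    return commit, redirect
```

Only the oldest mispredict matters here: a mispredict reported by a killed core is ignored, since that instruction should never have executed.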
Results (N=2)
Results (N=3)
• We did not get IPCs greater than 1, meaning this design was slower than the N=2 case.
• Why?
  – The branch predictor
    • Only 1 out of every N instructions is predicted using the better branch predictor. The misprediction penalty is high, and the processor pays the penalty more often for larger N.
Structural Hazards in Bluespec
• Dispatch logic prevents two modules from needing to use the same hardware
• The Bluespec compiler also checks for structural hazards, but is more aggressive than the dispatch logic.
• We had to create wrappers that allow multiple modules to attempt to write to the same module, with only one actually getting to use it based on a fixed priority.
• If the dispatch logic works, then the priority doesn’t matter, since only one module will ever write to it at a time.
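The wrapper idea can be mimicked in Python as a fixed-priority arbiter. This is a behavioural sketch, not Bluespec and not the project's wrapper: `PriorityPort`, `request`, and `resolve` are invented names, and "lowest index wins" is an assumed priority order.

```python
class PriorityPort:
    """Lets several requesters attempt a write; one wins per cycle."""

    def __init__(self, num_requesters):
        self.requests = [None] * num_requesters

    def request(self, index, value):
        # Any requester may call this in a cycle without conflicting.
        self.requests[index] = value

    def resolve(self):
        """Return (index, value) of the winning request, or None if idle.

        Assumption: the lowest index has the highest priority.
        All requests are cleared for the next cycle.
        """
        winner = None
        for index, value in enumerate(self.requests):
            if value is not None:
                winner = (index, value)
                break
        self.requests = [None] * len(self.requests)
        return winner
```

If dispatch guarantees at most one request per cycle, `resolve` always returns that single request and the priority order never affects the result, which matches the last bullet above.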
Conclusion
• We added N-way superscalar execution to the original SMIPS processor
• We saw IPC > 1 for every benchmark on at least one processor for N=2
• We tried N=3, but it suffered too much from misprediction