Superscalar SMIPS Processor
Andy Wright, Leslie Maldonado

Project Goals
• N-way superscalar execution
  – Up to N instructions can be issued every cycle
  – N execution pipelines will share a single data memory
• IPC > 1
  – Shows that superscalar execution is working

Background
• Data Hazards
• Control Hazards
• Structural Hazards
  – An instruction can’t be issued if it needs to use the same hardware as another instruction at the same time
  – Relevant for:
    • Data Memory
    • Redirect FIFO
    • Coprocessor
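
The structural-hazard rule above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the project's dispatch logic: the function name `issue_without_conflicts`, the resource labels (`"dmem"`, `"redirect"`, `"cop"`), and the policy of stopping at the first conflicting instruction (in-order issue) are all hypothetical.

```python
def issue_without_conflicts(candidates):
    """Pick the instructions that can issue together this cycle.

    candidates: list of (name, resources) in program order, where
    resources is the set of shared units the instruction needs
    (hypothetical labels such as "dmem", "redirect", "cop").
    Returns the names of the longest in-order prefix whose resource
    sets do not overlap.
    """
    in_use = set()
    issued = []
    for name, resources in candidates:
        if in_use & resources:
            # Structural hazard: an earlier instruction already claimed
            # one of these units, so stop issuing (in-order assumption).
            break
        in_use |= resources
        issued.append(name)
    return issued


# Two memory operations cannot share the single data memory in one cycle.
print(issue_without_conflicts([
    ("lw", {"dmem"}),
    ("add", set()),
    ("sw", {"dmem"}),
]))
```

With these inputs only `lw` and `add` issue; `sw` waits for a later cycle because the data memory is already claimed.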

System Overview

Instruction Memory
• Needs to be able to output N words
• [Figure: normal instruction memory holding words 0 to 4, one word per row]
• [Figure: instruction memory with four words per row: 0 1 2 3, 4 5 6 7, 8 9 10 11, 12 13 14 15]
  – Read from address 0: outputs 0 1 2 3
  – Read from address 5: outputs 8 5 6 7
  – Unaligned accesses need permutations
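
The unaligned read above can be reproduced with a short behavioral sketch. It assumes word addresses (not byte addresses), N = 4 banks with word i stored in bank i mod N, and a rotation as the permutation; the names `make_banks` and `read_n` are hypothetical, and no bounds handling is included for reads that run past the last row.

```python
N = 4  # assumed number of banks, matching the four-wide figure


def make_banks(words, n=N):
    """Spread a flat list of words over n banks: word i goes to bank i % n, row i // n."""
    rows = (len(words) + n - 1) // n
    banks = [[None] * rows for _ in range(n)]
    for i, w in enumerate(words):
        banks[i % n][i // n] = w
    return banks


def read_n(banks, addr):
    """Read n consecutive words starting at word address addr.

    Returns (raw, ordered): raw is what the banks output in bank order,
    ordered is the same data rotated back into program order.
    """
    n = len(banks)
    offset = addr % n    # bank that holds the first requested word
    base_row = addr // n
    # Banks below the offset supply the wrapped-around words from the next row.
    raw = [banks[b][base_row + (1 if b < offset else 0)] for b in range(n)]
    # Rotate left by offset so the first requested word comes first.
    ordered = raw[offset:] + raw[:offset]
    return raw, ordered


banks = make_banks(list(range(16)))
print(read_n(banks, 0))  # aligned read
print(read_n(banks, 5))  # unaligned read: bank order differs from program order
```

For address 5 the banks return 8 5 6 7, and the rotation restores 5 6 7 8.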

Superscalar Fifo
• Needs to be able to enqueue N instructions per cycle
• Needs to be able to dequeue 1 to N instructions per cycle
• Architecture similar to instruction memory
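
A minimal behavioral sketch of the interface described above follows. It models only the enqueue and dequeue limits, not the banked hardware structure; the class name `SuperscalarFifo`, its method names, and the `capacity` parameter are assumptions made for illustration.

```python
from collections import deque


class SuperscalarFifo:
    """FIFO that accepts up to n entries and releases 1 to n entries per cycle."""

    def __init__(self, n, capacity):
        self.n = n                # maximum entries moved per cycle
        self.capacity = capacity  # total storage (hypothetical parameter)
        self.items = deque()

    def can_enq(self, count):
        """True if count entries fit this cycle."""
        return count <= self.n and len(self.items) + count <= self.capacity

    def enq(self, instrs):
        """Enqueue a group of instructions in program order."""
        if not self.can_enq(len(instrs)):
            raise ValueError("too many entries for one cycle or FIFO full")
        self.items.extend(instrs)

    def first(self, count):
        """Peek at up to count of the oldest entries without removing them."""
        count = min(count, self.n, len(self.items))
        return [self.items[i] for i in range(count)]

    def deq(self, count):
        """Remove and return the count oldest entries (1 <= count <= n)."""
        if not 1 <= count <= min(self.n, len(self.items)):
            raise ValueError("invalid dequeue count")
        return [self.items.popleft() for _ in range(count)]


fifo = SuperscalarFifo(n=2, capacity=4)
fifo.enq(["i0", "i1"])
fifo.enq(["i2"])
print(fifo.deq(1), fifo.first(2))
```

Dispatch would typically peek with `first`, decide how many of those instructions can issue, and then call `deq` with that count.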

Scoreboard
• Keeps track of pending register writes to prevent RAW hazards
  – Scoreboards are used to prevent conflicts between instructions across clock cycles and within the same clock cycle
• Dispatch logic searches and writes to the scoreboard
• Writeback removes from the scoreboard
  – The order of these two operations depends on the type of register file
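
The search, insert, and remove operations above can be sketched as follows. This is a behavioral model under assumptions, not the project's Bluespec module: the names `Scoreboard` and `dispatch`, the per-register write counter, and the in-order policy of stopping at the first stalled instruction are hypothetical. Inserting each destination as soon as its instruction issues is what makes the check cover instructions within the same cycle as well as across cycles.

```python
class Scoreboard:
    """Counts in-flight writes per destination register."""

    def __init__(self):
        self.pending = {}  # register number -> number of pending writes

    def insert(self, reg):
        self.pending[reg] = self.pending.get(reg, 0) + 1

    def remove(self, reg):
        # Called by writeback once the register has been written.
        self.pending[reg] -= 1
        if self.pending[reg] == 0:
            del self.pending[reg]

    def busy(self, reg):
        return reg in self.pending


def dispatch(scoreboard, candidates):
    """Issue the longest in-order prefix with no RAW hazard.

    candidates: list of (dst, srcs) in program order; dst may be None
    for instructions that write no register.
    """
    issued = []
    for dst, srcs in candidates:
        if any(scoreboard.busy(r) for r in srcs):
            break  # a source is still being produced: stall here
        issued.append((dst, srcs))
        if dst is not None:
            # Visible to younger instructions in this same cycle.
            scoreboard.insert(dst)
    return issued


sb = Scoreboard()
# The second instruction reads r1, which the first one writes.
print(dispatch(sb, [(1, [2, 3]), (4, [1]), (5, [6])]))
```

Only the first instruction issues here; the dependent one waits until writeback calls `remove(1)`.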

Execution Pipelines
• Cores are prioritized relative to one another
• Core 0 has earlier instructions than core 1
• A mispredict in core i should kill instructions in cores > i
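
The kill rule above can be illustrated with a short sketch. It assumes one instruction per core per cycle, with index 0 holding the oldest instruction; the function name `resolve_cycle` and the tuple layout are hypothetical.

```python
def resolve_cycle(results):
    """Apply the kill rule for one cycle.

    results[i] = (instr, mispredicted) for core i, oldest first.
    Returns (kept, killed): instructions that complete and instructions
    squashed because an older core mispredicted.
    """
    kept, killed = [], []
    redirect_seen = False
    for instr, mispredicted in results:
        if redirect_seen:
            # An older core redirected the PC, so this instruction is on the wrong path.
            killed.append(instr)
            continue
        kept.append(instr)
        if mispredicted:
            redirect_seen = True
    return kept, killed


# Core 1 mispredicts, so the instruction in core 2 is killed.
print(resolve_cycle([("add", False), ("beq", True), ("sub", False)]))
```

The mispredicted branch itself still completes; only the younger instructions in higher-numbered cores are discarded.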

Results (N=2)

Results (N=3)
• Didn’t get IPCs greater than 1, meaning this design was slower than the N=2 case.
• Why?
  – The branch predictor
    • Only 1 out of every N instructions is predicted using the better branch predictor. The misprediction penalty is high, and the processor pays that penalty more often for larger N.

Structural Hazards in Bluespec
• Dispatch logic prevents two modules from needing to use the same hardware
• Bluespec compiler also checks for structural hazards, but is more aggressive.
• We had to create wrappers that allow multiple modules to attempt to write to the same module, but only one actually gets to use it, based on a fixed priority.
• If dispatch logic works, then the priority doesn’t matter, since there will always be only one module writing to it.
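
The fixed-priority wrapper idea above can be modeled in a few lines. This is a Python behavioral sketch, not Bluespec, and it makes assumptions the slides do not state: the class name `PriorityPort`, the `request` and `end_cycle` methods, and the choice that a lower requester index means higher priority are all hypothetical.

```python
class PriorityPort:
    """Wrapper around a single-writer resource.

    Several requesters may call request() in the same cycle, but only the
    one with the lowest index (assumed highest priority) reaches the sink.
    """

    def __init__(self, sink):
        self.sink = sink     # any object with append(), standing in for the shared module
        self.requests = {}   # requester index -> value requested this cycle

    def request(self, requester, value):
        self.requests[requester] = value

    def end_cycle(self):
        """Grant the highest-priority request, drop the rest, return the winner."""
        if not self.requests:
            return None
        winner = min(self.requests)
        self.sink.append(self.requests[winner])
        self.requests.clear()
        return winner


data_memory = []
port = PriorityPort(data_memory)
port.request(1, "store B")
port.request(0, "store A")
print(port.end_cycle(), data_memory)
```

If dispatch never lets two pipelines target the same unit in one cycle, `requests` holds at most one entry and the priority order has no effect on the result.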

Conclusion
• We added N-way superscalar execution to the original SMIPS processor
• We saw IPC > 1 for every benchmark on at least one processor for N=2
• We tried N=3, but it suffered too much from misprediction