Superscalar Pipelines Part 2 12/1/08

An example six-stage superscalar pipeline
• The six stages: fetch, decode, dispatch, execute, complete, and retire.
  – The execute stage can include multiple (pipelined) functional units of different types with different execution latencies.
  – The dispatch stage distributes instructions of different types to their corresponding functional units.
  – The complete stage reorders instructions to ensure in-order updating of the machine state.

[Figure: k = 6 pipeline stages, s = 7 width]

Fetch
• Multiple instructions are fetched from the I-cache on each machine cycle.
• The I-cache line needs to be a multiple of the pipeline width s.

Fetch continued
• s instructions must be fetched on each clock cycle to sustain pipeline bandwidth.
• Problems
  – Instruction misalignment.
  – Instructions that change program flow, i.e., branches.
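
The misalignment problem can be illustrated with a small sketch. It assumes a hypothetical fetch unit that reads at most s instructions per cycle and cannot cross an I-cache line boundary within one cycle; the function name and the line size used here are illustrative and do not come from the slides.

```python
def instructions_fetched(pc_index, width, line_size):
    """Number of instructions delivered in one cycle.

    pc_index  -- index of the first instruction to fetch (in instruction units)
    width     -- pipeline width s
    line_size -- instructions per I-cache line (assumed to be a multiple of s)

    Assumption: a fetch cannot cross an I-cache line boundary in one cycle.
    """
    remaining_in_line = line_size - (pc_index % line_size)
    return min(width, remaining_in_line)


# Aligned fetch group: all s = 4 instructions are delivered.
print(instructions_fetched(pc_index=8, width=4, line_size=8))
# Misaligned fetch group: it starts 2 instructions before the end of the line,
# so only 2 of the 4 slots are filled this cycle.
print(instructions_fetched(pc_index=6, width=4, line_size=8))
```

Under these assumptions, a fetch group that does not start on a multiple of s can leave pipeline slots empty, which is why alignment matters for sustaining bandwidth.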

Decode
• Identify individual instructions.
• Determine instruction types.
• Detect inter-instruction dependences among instructions that have been fetched but not yet dispatched.
  – Determines which instructions can be dispatched in parallel.
• Complicated for s > 1.
  – Much simpler for RISC than CISC.
    • CISC processors typically require multiple pipeline stages for decoding.
    • CISC instructions are translated into internal RISC instructions. Intel refers to these as μops (pronounced “you-ops”).
  – Must quickly identify control-flow-changing instructions and provide feedback to the fetch stage.
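
The dependence check can be sketched as follows. This is a simplified illustration, not the decode logic of any real processor: it assumes in-order dispatch, considers only read-after-write (RAW) dependences within one fetched group, and uses a hypothetical `Instr` record with one destination register and a tuple of source registers.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    dest: Optional[str]        # register written, or None
    srcs: Tuple[str, ...]      # registers read


def parallel_dispatch_count(group):
    """Length of the longest prefix of `group` in which no instruction
    reads a register written by an earlier instruction in the group."""
    written = set()
    for i, ins in enumerate(group):
        if any(src in written for src in ins.srcs):
            return i           # RAW dependence: this instruction must wait
        if ins.dest is not None:
            written.add(ins.dest)
    return len(group)


group = [
    Instr("r1", ("r2", "r3")),  # r1 = r2 op r3
    Instr("r4", ("r5", "r6")),  # independent of the first instruction
    Instr("r7", ("r1", "r4")),  # reads r1 and r4, both written above
    Instr("r8", ("r9",)),
]
print(parallel_dispatch_count(group))  # 2
```

Each instruction's sources are compared against the destinations of every earlier instruction in the group, so the number of comparisons grows roughly with the square of s. This is one reason decode becomes complicated for s > 1.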

[Figure: Pentium Pro example]

Dispatch
• Different types of instructions are executed by different functional units.
• The decode stage identifies the instruction type.
• The dispatch stage routes the instruction to the appropriate functional unit in the execution stage.
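
A minimal sketch of the routing step, assuming a hypothetical table that maps each opcode to a functional-unit type. The opcodes and unit names are illustrative only and are not taken from the slides.

```python
from collections import defaultdict

# Hypothetical mapping from instruction type to functional unit.
UNIT_FOR_TYPE = {
    "add": "integer",
    "sub": "integer",
    "fmul": "floating_point",
    "lw": "load_store",
    "sw": "load_store",
    "beq": "branch",
}


def dispatch(decoded):
    """Route (opcode, operands) pairs to per-unit queues, preserving
    program order within each queue."""
    queues = defaultdict(list)
    for opcode, operands in decoded:
        queues[UNIT_FOR_TYPE[opcode]].append((opcode, operands))
    return dict(queues)


decoded = [
    ("add", ("r1", "r2", "r3")),
    ("lw", ("r4", "0(r5)")),
    ("fmul", ("f1", "f2", "f3")),
    ("sub", ("r6", "r1", "r4")),
]
for unit, instrs in dispatch(decoded).items():
    print(unit, [op for op, _ in instrs])
```

The decode stage supplies the opcode (the instruction type); dispatch only has to look up the matching unit and enqueue the instruction there.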

Execution
• The execution unit is the heart of a superscalar computer.
• The trend is towards more parallel and more diversified pipelines.
  – More functional units and more specialized functional units.
  – Early scalar pipelined machines had only one functional unit (e.g., our MIPS), possibly with a separate functional unit for floating point.
  – Current superscalar processors may have multiple integer units and multiple floating-point units.

Completion and retiring
• An instruction is complete when it finishes execution and updates the machine state.
• An instruction finishes execution when it exits the functional unit and enters the completion buffer.
• When the instruction exits the completion unit, all registers have been updated. The instruction is architecturally complete.
• However, memory may still need to be written. The instruction exits the retire unit when memory has been written.
• Instructions that do not update memory are retired as soon as they exit the completion unit.
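
The difference between finishing execution and completing can be sketched with a toy completion buffer. The sketch assumes instructions may finish out of order but complete strictly in program order; the function name and instruction labels are hypothetical.

```python
def complete_in_order(program_order, finish_order):
    """Return, for each finish event, the instructions that complete.

    An instruction completes only when it has finished execution and
    every earlier instruction in program order has already completed.
    """
    finished = set()
    head = 0                   # index of the oldest not-yet-completed instruction
    trace = []
    for ins in finish_order:
        finished.add(ins)
        completed_now = []
        while head < len(program_order) and program_order[head] in finished:
            completed_now.append(program_order[head])
            head += 1
        trace.append((ins, completed_now))
    return trace


# i2 and i3 finish early (short latency); i1 finishes last (long latency).
trace = complete_in_order(["i1", "i2", "i3"], ["i2", "i3", "i1"])
for finished_ins, completed in trace:
    print(finished_ins, "finished ->", completed, "completed")
```

In this run i2 and i3 wait in the buffer until i1 finishes, and then all three complete together in program order, so the machine state is updated in order.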

Interrupts and exceptions
• Interrupts – asynchronous external events.
  – Stop fetching new instructions.
  – Allow instructions in the pipeline to finish.
  – Save machine state.
  – Transfer control to the interrupt service routine.
• Exceptions – induced by the execution of an instruction.
  – Precise interrupts require that the machine state be saved just prior to the exception.
  – Complicated.
