CS 152 Computer Architecture and Engineering Lecture 16
- Slides: 40
CS 152 Computer Architecture and Engineering, Lecture 16: Compiler Optimizations (Cont); Dynamic Scheduling with Scoreboards
CS 152 Lec 16.1
The Big Picture: Where are We Now?
° The Five Classic Components of a Computer: Processor (Control, Datapath), Memory, Input, Output
° Today's Topics:
  • Recap last lecture
  • Hardware loop unrolling with Tomasulo algorithm
  • Administrivia
  • Speculation, branch prediction
  • Reorder buffers
Scoreboard: a bookkeeping technique
° Out-of-order execution divides ID stage:
  1. Issue: decode instructions, check for structural hazards
  2. Read operands: wait until no data hazards, then read operands
° Scoreboards date to CDC 6600 in 1963
° Instructions execute whenever not dependent on previous instructions and no hazards.
° CDC 6600: In-order issue, out-of-order execution, out-of-order commit (or completion)
  • No forwarding!
  • Imprecise interrupt/exception model for now
Scoreboard Architecture (CDC 6600)
[Figure: block diagram showing Registers, Functional Units (FP Mult, FP Divide, FP Add, Integer), SCOREBOARD, and Memory]
Scoreboard Implications
° Out-of-order completion => WAR, WAW hazards?
° Solutions for WAR:
  • Stall writeback until registers have been read
  • Read registers only during Read Operands stage
° Solution for WAW:
  • Detect hazard and stall issue of new instruction until other instruction completes
° No register renaming!
° Need to have multiple instructions in execution phase => multiple execution units or pipelined execution units
° Scoreboard keeps track of dependencies between instructions that have already issued.
° Scoreboard replaces ID, EX, WB with 4 stages
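The three hazard classes named above can be checked mechanically for any ordered pair of instructions. The following is a minimal Python sketch, not part of the original slides: the `Instr` tuple, the `hazards` helper, and the register operands are all illustrative assumptions.

```python
from typing import NamedTuple, Set, Tuple

class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, any number of sources."""
    op: str
    dest: str
    srcs: Tuple[str, ...]

def hazards(earlier: Instr, later: Instr) -> Set[str]:
    """Classify the register hazards between two instructions in program order."""
    found = set()
    if earlier.dest in later.srcs:   # later reads what earlier writes
        found.add("RAW")
    if later.dest in earlier.srcs:   # later writes what earlier still has to read
        found.add("WAR")
    if later.dest == earlier.dest:   # both write the same register
        found.add("WAW")
    return found

# Where the scoreboard handles each class, as stated on the slides.
STALL_STAGE = {"RAW": "read operands", "WAR": "write result", "WAW": "issue"}

# Illustrative operands (assumed, not taken from the lecture's figures).
divd = Instr("DIVD", "F0", ("F2", "F4"))
addd = Instr("ADDD", "F10", ("F0", "F8"))
subd = Instr("SUBD", "F8", ("F8", "F14"))
multd = Instr("MULTD", "F0", ("F6", "F8"))

print(sorted(hazards(divd, addd)))   # ['RAW']
print(sorted(hazards(addd, subd)))   # ['WAR']
print(sorted(hazards(divd, multd)))  # ['WAW']
```

Because the scoreboard has no register renaming, WAR and WAW cannot be removed; they can only be waited out at the stages listed in `STALL_STAGE`.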
Four Stages of Scoreboard Control
° Issue: decode instructions & check for structural hazards (ID1)
  • Instructions issued in program order (for hazard checking)
  • Don't issue if structural hazard
  • Don't issue if instruction is output dependent on any previously issued but uncompleted instruction (no WAW hazards)
° Read operands: wait until no data hazards, then read operands (ID2)
  • All real dependencies (RAW hazards) resolved in this stage, since we wait for instructions to write back data.
  • No forwarding of data in this model!
Four Stages of Scoreboard Control
° Execution: operate on operands (EX)
  • The functional unit begins execution upon receiving operands. When the result is ready, it notifies the scoreboard that it has completed execution.
° Write result: finish execution (WB)
  • Stall until no WAR hazards with previous instructions. Example:
      DIVD F0, F2, F4
      ADDD F10, F0, F8
      SUBD F8, F8, F14
    CDC 6600 scoreboard would stall SUBD until ADDD reads operands
Three Parts of the Scoreboard
° Instruction status: which of 4 steps the instruction is in
° Functional unit status: indicates the state of the functional unit (FU). 9 fields for each functional unit:
  • Busy: indicates whether the unit is busy or not
  • Op: operation to perform in the unit (e.g., + or –)
  • Fi: destination register
  • Fj, Fk: source-register numbers
  • Qj, Qk: functional units producing source registers Fj, Fk
  • Rj, Rk: flags indicating when Fj, Fk are ready
° Register result status: indicates which functional unit will write each register, if one exists. Blank when no pending instructions will write that register
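One way to picture the three parts is as three small tables. The sketch below is an assumption-laden illustration in Python, not the CDC 6600 hardware layout: the class and variable names are invented here, the unit names follow the architecture figure, and the load instruction is a made-up snapshot right after issue.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Dict, Optional

class Stage(Enum):
    """The four steps tracked by instruction status."""
    ISSUE = 1
    READ_OPERANDS = 2
    EXECUTION_COMPLETE = 3
    WRITE_RESULT = 4

@dataclass
class FunctionalUnitStatus:
    """The nine fields kept for each functional unit."""
    busy: bool = False
    op: Optional[str] = None   # operation to perform
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # first source register
    fk: Optional[str] = None   # second source register
    qj: Optional[str] = None   # unit producing Fj (None if no producer pending)
    qk: Optional[str] = None   # unit producing Fk (None if no producer pending)
    rj: bool = False           # Fj ready and not yet read
    rk: bool = False           # Fk ready and not yet read

# Part 1: instruction status (latest step each instruction has completed).
instruction_status: Dict[str, Stage] = {"LD F6, 34(R2)": Stage.ISSUE}

# Part 2: functional unit status, one entry per unit.
fu_status: Dict[str, FunctionalUnitStatus] = {
    name: FunctionalUnitStatus()
    for name in ("Integer", "FP Mult", "FP Add", "FP Divide")
}
fu_status["Integer"] = FunctionalUnitStatus(
    busy=True, op="Load", fi="F6", fk="R2", rk=True
)

# Part 3: register result status (register -> unit that will write it).
# A register that is absent corresponds to a blank entry on the slides.
register_result: Dict[str, str] = {"F6": "Integer"}

print(len(fields(FunctionalUnitStatus)))  # 9
print(register_result.get("F2"))          # None: no pending writer
```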
Scoreboard Example
Detailed Scoreboard Pipeline Control
° Issue
  • Wait until: Not Busy(FU) and not Result(D)
  • Bookkeeping: Busy(FU) ← yes; Op(FU) ← op; Fi(FU) ← 'D'; Fj(FU) ← 'S1'; Fk(FU) ← 'S2'; Qj ← Result('S1'); Qk ← Result('S2'); Rj ← not Qj; Rk ← not Qk; Result('D') ← FU
° Read operands
  • Wait until: Rj and Rk
  • Bookkeeping: Rj ← No; Rk ← No
° Execution complete
  • Wait until: Functional unit done
° Write result
  • Wait until: ∀f((Fj(f) ≠ Fi(FU) or Rj(f) = No) & (Fk(f) ≠ Fi(FU) or Rk(f) = No))
  • Bookkeeping: ∀f(if Qj(f) = FU then Rj(f) ← Yes); ∀f(if Qk(f) = FU then Rk(f) ← Yes); Result(Fi(FU)) ← 0; Busy(FU) ← No
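The wait conditions and bookkeeping above translate almost line for line into code. The following Python sketch is a simplified model under stated assumptions: the unit names (`Divide`, `Add1`, `Add2`), the function names, and the three-instruction sequence are illustrative, execution latency is not modeled, and clearing Qj/Qk on write-back is a choice of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class FU:
    busy: bool = False
    op: Optional[str] = None
    fi: Optional[str] = None
    fj: Optional[str] = None
    fk: Optional[str] = None
    qj: Optional[str] = None
    qk: Optional[str] = None
    rj: bool = False
    rk: bool = False

units: Dict[str, FU] = {name: FU() for name in ("Divide", "Add1", "Add2")}
result: Dict[str, str] = {}  # register result status: register -> writing unit

def can_issue(fu: str, d: str) -> bool:
    # Not Busy(FU) (structural hazard) and not Result(D) (WAW hazard).
    return not units[fu].busy and d not in result

def issue(fu: str, op: str, d: str, s1: str, s2: str) -> None:
    assert can_issue(fu, d)
    qj, qk = result.get(s1), result.get(s2)
    # Rj/Rk are "not Qj"/"not Qk": ready only if no unit is still producing them.
    units[fu] = FU(True, op, d, s1, s2, qj, qk, qj is None, qk is None)
    result[d] = fu

def can_read(fu: str) -> bool:
    return units[fu].rj and units[fu].rk  # all RAW hazards resolved

def read_operands(fu: str) -> None:
    assert can_read(fu)
    units[fu].rj = units[fu].rk = False

def can_write(fu: str) -> bool:
    # WAR check: no unit may still be waiting to read the old value of Fi(FU).
    d = units[fu].fi
    return all((f.fj != d or not f.rj) and (f.fk != d or not f.rk)
               for f in units.values())

def write_result(fu: str) -> None:
    assert can_write(fu)
    for f in units.values():  # wake up consumers of this unit's result
        if f.qj == fu:
            f.qj, f.rj = None, True
        if f.qk == fu:
            f.qk, f.rk = None, True
    del result[units[fu].fi]  # Result(Fi(FU)) <- 0
    units[fu] = FU()          # Busy(FU) <- No

# Illustrative sequence: SUBD must not overwrite F8 before ADDD has read it.
issue("Divide", "DIVD", "F0", "F2", "F4")
issue("Add1", "ADDD", "F10", "F0", "F8")
issue("Add2", "SUBD", "F8", "F8", "F14")
read_operands("Divide")
read_operands("Add2")
addd_ready_early = can_read("Add1")   # False: F0 is still being produced (RAW)
stalled = not can_write("Add2")       # True: ADDD has not read F8 yet (WAR)
write_result("Divide")                # F0 becomes available to ADDD
read_operands("Add1")
released = can_write("Add2")          # True: the WAR hazard is gone
print(addd_ready_early, stalled, released)  # False True True
```

The demo treats every unit as having finished execution as soon as it has read its operands, which keeps the focus on the issue, read, and write conditions.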
Scoreboard Example: Cycle 1
Scoreboard Example: Cycle 2 • Issue 2nd LD?
Scoreboard Example: Cycle 3 • Issue MULT?
Scoreboard Example: Cycle 4
Scoreboard Example: Cycle 5
Scoreboard Example: Cycle 6
Scoreboard Example: Cycle 7 • Read multiply operands?
Scoreboard Example: Cycle 8a (First half of clock cycle)
Scoreboard Example: Cycle 8b (Second half of clock cycle)
Scoreboard Example: Cycle 9 • Note Remaining • Read operands for MULT & SUB? Issue ADDD?
Scoreboard Example: Cycle 10
Scoreboard Example: Cycle 11
Scoreboard Example: Cycle 12 • Read operands for DIVD?
Scoreboard Example: Cycle 13
Scoreboard Example: Cycle 14
Scoreboard Example: Cycle 15
Scoreboard Example: Cycle 16
Scoreboard Example: Cycle 17 • WAR Hazard! • Why not write result of ADD???
Scoreboard Example: Cycle 18
Scoreboard Example: Cycle 19
Scoreboard Example: Cycle 20
Scoreboard Example: Cycle 21 • WAR Hazard is now gone...
Scoreboard Example: Cycle 22
Faster than light computation (skip a couple of cycles)
Scoreboard Example: Cycle 61
Scoreboard Example: Cycle 62
Review: Scoreboard Example: Cycle 62 • In-order issue; out-of-order execute & commit
CDC 6600 Scoreboard
° Speedup 1.7 from compiler; 2.5 by hand, BUT slow memory (no cache) limits benefit
° Limitations of 6600 scoreboard:
  • No forwarding hardware
  • Limited to instructions in basic block (small window)
  • Small number of functional units (structural hazards), especially integer/load store units
  • Do not issue on structural hazards
  • Wait for WAR hazards
  • Prevent WAW hazards
Summary #1/2: Compiler techniques for parallelism
° Loop unrolling: multiple iterations of loop in software
  • Amortizes loop overhead over several iterations
  • Gives more opportunity for scheduling around stalls
° Software pipelining: take one instruction from each of several iterations of the loop
  • Software overlapping of loop iterations
  • Today will show hardware overlapping of loop iterations
° Very Long Instruction Word machines (VLIW): multiple operations coded in single, long instruction
  • Requires sophisticated compiler to decide which operations can be done in parallel
  • Trace scheduling: find common path and schedule code as if branches didn't exist (+ add "fixup code")
° All of these require additional registers
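As a reminder of what loop unrolling does to the code, here is a minimal Python sketch of the source-level transformation. The function names and the unroll factor of 4 are assumptions made for illustration; Python is used only to show the shape of the code, since the performance benefit appears in compiled machine code scheduled for a pipeline, not in an interpreter.

```python
from typing import List

def add_scalar_rolled(x: List[float], s: float) -> List[float]:
    """One element per iteration: index update and loop test run once per element."""
    out = [0.0] * len(x)
    i = 0
    while i < len(x):
        out[i] = x[i] + s
        i += 1
    return out

def add_scalar_unrolled4(x: List[float], s: float) -> List[float]:
    """Four elements per iteration: loop overhead is amortized over four bodies."""
    n = len(x)
    out = [0.0] * n
    i = 0
    while i + 4 <= n:
        # Four independent bodies; each uses its own temporaries, which is
        # why unrolling needs additional registers.
        out[i] = x[i] + s
        out[i + 1] = x[i + 1] + s
        out[i + 2] = x[i + 2] + s
        out[i + 3] = x[i + 3] + s
        i += 4
    while i < n:  # cleanup loop for the leftover 0 to 3 elements
        out[i] = x[i] + s
        i += 1
    return out

data = [float(v) for v in range(10)]
print(add_scalar_unrolled4(data, 1.0) == add_scalar_rolled(data, 1.0))  # True
```

The four statements in the unrolled body have no dependences on each other, which is what gives a scheduler room to reorder them around stalls.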
Summary #2/2
° HW exploiting ILP
  • Works when can't know dependence at compile time.
  • Code for one machine runs well on another
° Key idea of Scoreboard: Allow instructions behind stall to proceed (Decode => Issue instr & read operands)
  • Enables out-of-order execution => out-of-order completion
  • ID stage checked both for structural & data dependencies
  • Original version didn't handle forwarding.
  • No automatic register renaming