CPU Design Steps 1 Analyze instruction set operations

CPU Design Steps
1. Analyze instruction set operations using independent RTN => datapath requirements.
2. Select required datapath components & establish clock methodology.
3. Assemble datapath meeting the requirements.
4. Analyze implementation of each instruction to determine the setting of control points that effects the register transfer.
5. Assemble the control logic.

EECC 550 - Shaaban, Lec # 5, Spring 2003, 3-26-2003

Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle
[Figure: single-cycle MIPS datapath. Components shown: PC, next-PC adders and mux, instruction memory, register file (32 32-bit registers; ports Rw, Ra, Rb; busA, busB, busW), extender, ALU, data memory. Instruction fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, imm16 <0:15>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg; Equal is the comparison output of the datapath.]

Drawbacks of Single-Cycle Processor
• Long cycle time.
• All instructions must take as much time as the slowest:
  – Cycle time for load is longer than needed for all other instructions.
• Real memory is not as well-behaved as idealized memory:
  – Cannot always complete data access in one (short) cycle.
• Cannot pipeline (overlap) the processing of one instruction with the previous instructions.

Abstract View of Single Cycle CPU
[Figure: block diagram. Datapath blocks: Next PC, PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Result Store, Reg. Wrt. Control blocks: Main Control (input op) and ALU control (input fun). Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr; Equal is returned from the datapath.]

Single Cycle Instruction Timing
Arithmetic & Logical:  PC → Inst Memory → Reg File → mux → ALU → mux → setup
Load (Critical Path):  PC → Inst Memory → Reg File → mux → ALU → Data Mem → mux → setup
Store:                 PC → Inst Memory → Reg File → mux → ALU → Data Mem
Branch:                PC → Inst Memory → Reg File → cmp → mux

Clock Cycle Time & Critical Path
• Critical path: the slowest path between any two storage devices.
• Cycle time is a function of the critical path and must be greater than:
  – Clock-to-Q + Longest Path through the Combinational Logic + Setup
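A minimal Python sketch of the cycle-time bound above. The function name and all delay values are hypothetical, chosen only to illustrate the formula; they are not taken from the lecture.

```python
def min_cycle_time(clk_to_q, longest_comb_path, setup):
    """Lower bound on the clock period:
    clock-to-Q + longest combinational path + setup time."""
    return clk_to_q + longest_comb_path + setup

# Assumed combinational delays in ns for each instruction class (illustrative only).
combinational_paths = {"arith": 5.0, "load": 7.0, "store": 6.5, "branch": 4.0}

# The critical path is the slowest path between two storage elements.
critical = max(combinational_paths.values())

# Assumed clock-to-Q and setup times of 0.5 ns each.
cycle = min_cycle_time(clk_to_q=0.5, longest_comb_path=critical, setup=0.5)
print(cycle)
```

With these assumed numbers the load path sets the clock, which mirrors the point that every instruction pays for the slowest one.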

Reducing Cycle Time: Multi-Cycle Design
• Cut the combinational dependency graph by inserting registers / latches.
• The same work is done in two or more fast cycles, rather than one slow cycle.
[Figure: storage element → Acyclic Combinational Logic → storage element is split into storage element → Acyclic Combinational Logic (A) → storage element → Acyclic Combinational Logic (B) → storage element.]

Instruction Processing Steps
• Instruction Fetch: Obtain instruction from program storage.
• Next Instruction: Update program counter to address of next instruction.
• Instruction Decode: Determine instruction type; obtain operands from registers.
• Execute: Compute result value or status.
• Result Store: Store result in register/memory if needed (usually called Write Back).
The diagram brackets a group of these steps as "Common steps for all instructions".

Partitioning The Single Cycle Datapath
Add registers between steps to break into cycles:
• Instruction Fetch Cycle (IF)
• Instruction Decode Cycle (ID)
• Execution Cycle (EX)
• Data Memory Access Cycle (MEM)
• Write back Cycle (WB)
[Figure: the single-cycle datapath blocks (Next PC, PC, Instruction Fetch, Operand Fetch, Ext, Exec, Mem Access, Data Mem, Reg. File, Result Store) and their control signals (nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr), divided into the five cycles.]

Example Multi-cycle Datapath
Stage delays:
• Instruction Fetch (IF): 2 ns
• Instruction Decode (ID): 1 ns
• Execution (EX): 2 ns
• Memory (MEM): 2 ns
• Write Back (WB): 1 ns
Registers added (register write enable control lines not shown):
• IR: Instruction register.
• A, B: Two registers to hold operands read from the register file.
• R: or ALUOut, holds the output of the main ALU.
• M: or Memory data register (MDR), holds data read from data memory.
Cycle Time: Worst cycle delay = C = 2 ns
[Figure: multi-cycle datapath with Next PC, PC, IR, Reg File, A, B, Ext, ALU, R, Mem Access, Data Mem, M. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr; Equal is the comparison output.]
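The cycle-time choice above can be sketched in a few lines of Python. The stage delays come from the example datapath; summing them to estimate a single-cycle clock is a simplifying assumption that ignores the overhead of the added registers.

```python
# Stage delays in ns, as given for the example multi-cycle datapath.
stage_delay_ns = {"IF": 2, "ID": 1, "EX": 2, "MEM": 2, "WB": 1}

# Multi-cycle design: the slowest stage sets the clock period.
multi_cycle_clock = max(stage_delay_ns.values())

# Assumption: a single-cycle clock must cover every stage of the longest
# instruction (a load), so its period is roughly the sum of the stage delays.
single_cycle_clock = sum(stage_delay_ns.values())

print(multi_cycle_clock, single_cycle_clock)
```

Note that the 1 ns stages (ID and WB) still occupy a full 2 ns cycle, so a five-cycle load takes 10 ns in the multi-cycle design.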

Operations In Each Cycle

IF (Instruction Fetch), all instructions:
  IR ← Mem[PC]
ID (Instruction Decode):
  A ← R[rs]; B ← R[rt]
EX (Execution):
  R-Type:          R ← A + B
  Logic Immediate: R ← A OR ZeroExt(imm16)
  Load, Store:     R ← A + SignExt(imm16)
  Branch:          If Equal = 1: PC ← PC + 4 + (SignExt(imm16) x 4), else PC ← PC + 4
MEM (Memory):
  Load:            M ← Mem[R]
  Store:           Mem[R] ← B; PC ← PC + 4
WB (Write Back):
  R-Type:          R[rd] ← R; PC ← PC + 4
  Logic Immediate: R[rt] ← R; PC ← PC + 4
  Load:            R[rt] ← M; PC ← PC + 4

MIPS Multi-Cycle Datapath: Five Cycles of Load

Cycle 1  Cycle 2  Cycle 3  Cycle 4  Cycle 5
IF       ID       EX       MEM      WB

1 - Instruction Fetch (IF): Fetch the instruction from the Instruction Memory.
2 - Instruction Decode (ID): Registers Fetch and Instruction Decode.
3 - Execute (EX): Calculate the effective memory address.
4 - Memory (MEM): Read the data from the Data Memory.
5 - Write Back (WB): Write the data back to the register file. Update PC.

Multi-cycle Datapath Instruction CPI
• R-Type/Immediate: Require four cycles, CPI = 4
  – IF, ID, EX, WB
• Loads: Require five cycles, CPI = 5
  – IF, ID, EX, MEM, WB
• Stores: Require four cycles, CPI = 4
  – IF, ID, EX, MEM
• Branches: Require three cycles, CPI = 3
  – IF, ID, EX
• Average program: 3 ≤ CPI ≤ 5, depending on program profile (instruction mix).
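As a small illustration, the per-class CPI values above follow directly from counting the stages each class uses. The dictionary layout and names below are assumptions made for this sketch.

```python
# Stages used by each instruction class, as listed above.
STAGES = {
    "R-Type/Immediate": ["IF", "ID", "EX", "WB"],
    "Load": ["IF", "ID", "EX", "MEM", "WB"],
    "Store": ["IF", "ID", "EX", "MEM"],
    "Branch": ["IF", "ID", "EX"],
}

# In the multi-cycle design every stage takes one clock cycle,
# so the CPI of a class is the number of stages it passes through.
cpi = {name: len(stages) for name, stages in STAGES.items()}

# Bounds on the average CPI of any program built from these classes.
cpi_min, cpi_max = min(cpi.values()), max(cpi.values())
print(cpi, cpi_min, cpi_max)
```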

Single Cycle Vs. Multi-Cycle CPU
[Figure: clock timing comparison. Single Cycle Implementation: 8 ns cycles; a load fills its cycle, a store wastes 2 ns of its cycle. Multiple Cycle Implementation, cycles 1 to 10: Load (IF ID EX MEM WB), then Store (IF ID EX MEM), then R-type (IF ...).]

Single-Cycle CPU: CPI = 1, C = 8 ns
  One million instructions take I x CPI x C = 10^6 x 1 x 8 x 10^-9 = 8 msec
Multi-Cycle CPU: CPI = 3 to 5, C = 2 ns
  One million instructions take from 10^6 x 3 x 2 x 10^-9 = 6 msec to 10^6 x 5 x 2 x 10^-9 = 10 msec, depending on the instruction mix used.
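The arithmetic above can be checked with a short Python sketch of the relation CPU time = I x CPI x C. The function name is an assumption made for this example.

```python
def exec_time_ms(instructions, cpi, cycle_ns):
    """CPU time = I x CPI x C, with C in nanoseconds and the result in milliseconds."""
    total_ns = instructions * cpi * cycle_ns
    return total_ns / 1_000_000  # 1 ms = 10^6 ns

one_million = 10**6

single_cycle = exec_time_ms(one_million, cpi=1, cycle_ns=8)
multi_cycle_best = exec_time_ms(one_million, cpi=3, cycle_ns=2)
multi_cycle_worst = exec_time_ms(one_million, cpi=5, cycle_ns=2)

print(single_cycle, multi_cycle_best, multi_cycle_worst)
```

The multi-cycle design only wins when the average CPI of the instruction mix stays below 4, since 4 x 2 ns equals the 8 ns single-cycle period.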

Finite State Machine (FSM) Control Model
• State specifies control points for Register Transfer.
• Transfer occurs upon exiting state (falling edge).
[Figure: FSM block diagram. Inputs (conditions) and the current state feed the Next State Logic, which loads the Control State register; the Output Logic drives the outputs (control points). A state X is annotated with its Register Transfer Control Points and with a next state that Depends on Input.]

Control Specification For Multi-cycle CPU: Finite State Machine (FSM)

“instruction fetch”:       IR ← MEM[PC]
“decode / operand fetch”:  A ← R[rs]; B ← R[rt]
Execute:
  R-type:        R ← A fun B
  ORi:           R ← A or ZX
  LW:            R ← A + SX
  SW:            R ← A + SX
  BEQ & Equal:   PC ← PC + SX || 00   (to instruction fetch)
  BEQ & ~Equal:  PC ← PC + 4          (to instruction fetch)
Memory:
  LW:            M ← MEM[R]
  SW:            MEM[R] ← B; PC ← PC + 4   (to instruction fetch)
Write-back:
  R-type:        R[rd] ← R; PC ← PC + 4    (to instruction fetch)
  ORi:           R[rt] ← R; PC ← PC + 4    (to instruction fetch)
  LW:            R[rt] ← M; PC ← PC + 4    (to instruction fetch)

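Traditional FSM Controller
[Figure: a Truth or Transition Table with inputs state, op and cond, and outputs next state and control points. The op field and Equal come from the datapath, the current state comes from the State register, and the control points go to the datapath. Bus widths labelled in the figure: 11, 6 and 4.]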

Traditional FSM Controller
datapath + state diagram => control
• Translate RTN statements into control points.
• Assign states.
• Implement the controller.

Mapping RTNs To Control Points: Examples & State Assignments

State  Step / class               Register transfers            Control points (examples)
0000   “instruction fetch”        IR ← MEM[PC]                  imem_rd, IRen
0001   “decode / operand fetch”   A ← R[rs]; B ← R[rt]          Aen, Ben
0100   Execute, R-type            R ← A fun B                   ALUfun, Sen
0110   Execute, ORi               R ← A or ZX
1000   Execute, LW                R ← A + SX
1011   Execute, SW                R ← A + SX
0010   Execute, BEQ & Equal       PC ← PC + SX || 00
0011   Execute, BEQ & ~Equal      PC ← PC + 4
1001   Memory, LW                 M ← MEM[S]
1100   Memory, SW                 MEM[S] ← B; PC ← PC + 4
0101   Write-back, R-type         R[rd] ← R; PC ← PC + 4        RegDst, RegWr, PCen
0111   Write-back, ORi            R[rt] ← R; PC ← PC + 4
1010   Write-back, LW             R[rt] ← M; PC ← PC + 4

The last state of every instruction returns to instruction fetch (state 0000).
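A minimal Python sketch of the next-state logic implied by the state assignments above. The 4-bit encodings are the ones read from the state diagram; the function names, the opcode strings and the table layout are assumptions made for this illustration, not the controller from the lecture.

```python
FETCH, DECODE = "0000", "0001"

# Decode dispatches on the instruction class (illustrative opcode names).
DISPATCH = {"R-type": "0100", "ORi": "0110", "LW": "1000", "SW": "1011"}

# States with a single, unconditional successor.
SEQUENCE = {
    "0000": "0001",                                   # fetch -> decode
    "0100": "0101", "0101": "0000",                   # R-type: execute -> write-back
    "0110": "0111", "0111": "0000",                   # ORi: execute -> write-back
    "1000": "1001", "1001": "1010", "1010": "0000",   # LW: execute -> memory -> write-back
    "1011": "1100", "1100": "0000",                   # SW: execute -> memory
    "0010": "0000", "0011": "0000",                   # BEQ taken / not taken
}

def next_state(state, op=None, equal=False):
    """Return the next state; only the decode state looks at op and Equal."""
    if state == DECODE:
        if op == "BEQ":
            return "0010" if equal else "0011"
        return DISPATCH[op]
    return SEQUENCE[state]

def trace(op, equal=False):
    """List the states one instruction visits before control returns to fetch."""
    states = [FETCH]
    while True:
        nxt = next_state(states[-1], op, equal)
        if nxt == FETCH:
            return states
        states.append(nxt)

for op in ("R-type", "ORi", "LW", "SW", "BEQ"):
    print(op, trace(op), "cycles:", len(trace(op)))
```

The length of each trace equals the number of cycles the instruction takes, which gives a quick consistency check between a state diagram and the per-class CPI.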

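Detailed Control Specification

Column groups in the table: Current State, Op field, Eq, Next State; IR en; PC sel/en; A, B; Exec (Ex, Sr, ALU, S); Mem (R, W, M); Write-Back (M-R, Wr, Dst).

State transitions, following the state assignments:

Phase  Current  Op field  Eq  Next
IF     0000     ??????    ?   0001
ID     0001     BEQ       0   0011
ID     0001     BEQ       1   0010
ID     0001     R-type    x   0100
ID     0001     orI       x   0110
ID     0001     LW        x   1000
ID     0001     SW        x   1011
BEQ    0010     xxxxxx    x   0000
BEQ    0011     xxxxxx    x   0000
R      0100     xxxxxx    x   0101
R      0101     xxxxxx    x   0000
ORI    0110     xxxxxx    x   0111
ORI    0111     xxxxxx    x   0000
LW     1000     xxxxxx    x   1001
LW     1001     xxxxxx    x   1010
LW     1010     xxxxxx    x   0000
SW     1011     xxxxxx    x   1100
SW     1100     xxxxxx    x   0000

[Control-point columns: per-state values for the signals listed above, including the ALU operations fun, or and add; the individual entries are not recoverable from this transcript.]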

Alternative Multiple Cycle Datapath (In Textbook)
• Minimizes hardware: 1 memory, 1 ALU
[Figure: multi-cycle datapath with PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, ALU, ALU Control, ALU Out and Target registers, and multiplexors. Control signals: PCWr, PCWrCond, IorD, MemWr, IRWr, RegDst, RegWr, ExtOp, ALUSelA, ALUSelB, ALUOp, MemtoReg, PCSrc, BrWr; Zero is the ALU condition output.]

Alternative Multiple Cycle Datapath (In Textbook)
• Shared instruction/data memory unit.
• A single ALU shared among instructions.
• Shared units require additional or widened multiplexors.
• Temporary registers to hold data between clock cycles of the instruction:
  – Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut

Alternative Multiple Cycle Datapath With Control Lines (Fig 5.33 In Textbook)

Operations In Each Cycle

Instruction Fetch, all instructions:
  IR ← Mem[PC]; PC ← PC + 4
Instruction Decode, all instructions:
  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
Execution:
  R-Type:          ALUout ← A + B
  Logic Immediate: ALUout ← A OR ZeroExt(imm16)
  Load, Store:     ALUout ← A + SignExt(imm16)
  Branch:          If Equal = 1: PC ← ALUout
Memory:
  Load:            M ← Mem[ALUout]
  Store:           Mem[ALUout] ← B
Write Back:
  R-Type:          R[rd] ← ALUout
  Logic Immediate: R[rt] ← ALUout
  Load:            R[rt] ← M
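The register transfers above can be traced with a small Python model. This is a sketch under stated assumptions: the instruction is passed in already decoded (standing in for IR), only add, ori, lw, sw and beq are modelled, results are not truncated to 32 bits, and the names `run_instruction` and `sign_ext16` are hypothetical.

```python
def sign_ext16(x):
    """Interpret a 16-bit field as a signed integer."""
    return x - 0x10000 if x & 0x8000 else x

def run_instruction(cpu, instr):
    """Apply the register transfers cycle by cycle and return the cycle count.

    cpu:   dict with 'PC', 'R' (list of registers) and 'Mem' (dict: address -> word).
    instr: dict with 'op', 'rs', 'rt', 'rd', 'imm16' (a decoded stand-in for IR).
    """
    R, mem = cpu["R"], cpu["Mem"]
    # Instruction Fetch: IR <- Mem[PC]; PC <- PC + 4
    cpu["PC"] += 4
    # Instruction Decode: A <- R[rs]; B <- R[rt]; ALUout <- PC + (SignExt(imm16) x 4)
    A, B = R[instr["rs"]], R[instr["rt"]]
    alu_out = cpu["PC"] + sign_ext16(instr["imm16"]) * 4
    op = instr["op"]
    if op == "beq":                      # Execution only: 3 cycles
        if A == B:
            cpu["PC"] = alu_out          # branch target was computed during decode
        return 3
    if op == "add":                      # Execution, Write Back: 4 cycles
        alu_out = A + B
        R[instr["rd"]] = alu_out
        return 4
    if op == "ori":                      # Execution, Write Back: 4 cycles
        alu_out = A | (instr["imm16"] & 0xFFFF)
        R[instr["rt"]] = alu_out
        return 4
    alu_out = A + sign_ext16(instr["imm16"])   # effective address for lw / sw
    if op == "sw":                       # Execution, Memory: 4 cycles
        mem[alu_out] = B
        return 4
    if op == "lw":                       # Execution, Memory, Write Back: 5 cycles
        M = mem[alu_out]
        R[instr["rt"]] = M
        return 5
    raise ValueError(f"unsupported op: {op}")

# Usage: a load, an add, then a taken branch back by one instruction.
cpu = {"PC": 0, "R": [0] * 32, "Mem": {100: 7}}
cpu["R"][1] = 96
cycles = [
    run_instruction(cpu, {"op": "lw", "rs": 1, "rt": 2, "rd": 0, "imm16": 4}),
    run_instruction(cpu, {"op": "add", "rs": 2, "rt": 2, "rd": 3, "imm16": 0}),
    run_instruction(cpu, {"op": "beq", "rs": 2, "rt": 2, "rd": 0, "imm16": 0xFFFF}),
]
print(cycles, cpu["R"][2], cpu["R"][3], cpu["PC"])
```

Because PC is already incremented during fetch and the branch target is computed during decode, the branch needs no separate PC ← PC + 4 step in its last cycle.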

High-Level View of Finite State Machine Control
• First steps are independent of the instruction class.
• Then a series of sequences that depend on the instruction opcode.
• Then the control returns to fetch a new instruction.
• Each box in the diagram represents one or several states.

Instruction Fetch and Decode FSM States

Load/Store Instructions FSM States

R-Type Instructions FSM States

Branch Instruction Single State
Jump Instruction Single State

Finite State Machine (FSM) Specification

State  Step / class          Register transfers
0000   “instruction fetch”   IR ← MEM[PC]; PC ← PC + 4
0001   “decode”              A ← R[rs]; B ← R[rt]; ALUout ← PC + SX
0100   Execute, R-type       ALUout ← A fun B
0110   Execute, ORi          ALUout ← A op ZX
1000   Execute, LW           ALUout ← A + SX
1011   Execute, SW           ALUout ← A + SX
0010   Execute, BEQ          If A = B then PC ← ALUout
1001   Memory, LW            M ← MEM[ALUout]
1100   Memory, SW            MEM[ALUout] ← B
0101   Write-back, R-type    R[rd] ← ALUout
0111   Write-back, ORi       R[rt] ← ALUout
1010   Write-back, LW        R[rt] ← M

The last state of every instruction returns to instruction fetch.

MIPS Multi-cycle Datapath Performance Evaluation
• What is the average CPI?
  – State diagram gives CPI for each instruction type.
  – Workload (program) below gives frequency of each type.

Type          CPIi for type   Frequency   CPIi x freq Ii
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6
Average CPI:                              4.1

Better than CPI = 5 if all instructions took the same number of clock cycles (5).
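The weighted average in the table can be reproduced with a few lines of Python. The dictionary layout is an assumption made for this sketch; the CPI values and frequencies are the ones from the table.

```python
# Instruction type -> (CPI for the type, frequency in percent), from the table above.
workload = {
    "Arith/Logic": (4, 40),
    "Load": (5, 30),
    "Store": (4, 10),
    "Branch": (3, 20),
}

# Frequencies must cover the whole program.
total_percent = sum(pct for _, pct in workload.values())

# Average CPI = sum over types of CPI_i x frequency_i.
average_cpi = sum(cpi * pct for cpi, pct in workload.values()) / 100

print(total_percent, average_cpi)
```

Changing the percentages shows how sensitive the average is to the load fraction, since loads are the only five-cycle instructions.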