CSE 431 Computer Architecture Fall 2005 Lecture 05

Lecture 05: Basic MIPS Architecture Review
Mary Jane Irwin (www.cse.psu.edu/~mji)
www.cse.psu.edu/~cg431
[Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, UCB]
CSE 431 L05 Basic MIPS Architecture, Irwin, PSU, 2005

Review: THE Performance Equation
• Our basic performance equation is then
    CPU time = Instruction_count x CPI x clock_cycle
  or
    CPU time = (Instruction_count x CPI) / clock_rate
• These equations separate three key factors that affect performance
  - Can measure the CPU execution time by running the program
  - The clock rate is usually given in the documentation
  - Can measure instruction count by using profilers/simulators without knowing all of the implementation details
  - CPI varies by instruction type and ISA implementation for which we must know the implementation details
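As a minimal sketch of the two forms of the performance equation above, the following Python computes CPU time both ways. The function names and the sample numbers (instruction count, CPI, clock rate) are made up for illustration and do not come from the lecture.

```python
def cpu_time_from_cycle(instruction_count, cpi, clock_cycle_s):
    """CPU time = Instruction_count x CPI x clock_cycle (in seconds)."""
    return instruction_count * cpi * clock_cycle_s


def cpu_time_from_rate(instruction_count, cpi, clock_rate_hz):
    """CPU time = (Instruction_count x CPI) / clock_rate (in seconds)."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical program: 1 billion instructions, average CPI of 2, 1 GHz clock.
t = cpu_time_from_rate(1_000_000_000, 2.0, 1_000_000_000)
print(t)  # 2.0 (seconds)
```

Both functions give the same answer when clock_cycle_s is the reciprocal of clock_rate_hz, which is all the second form of the equation says.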

"So the first area of craftsmanship is in trading function for size. … The second area of craftsmanship is space-time trade-offs. For a given function, the more space, the faster."
The Mythical Man-Month, Brooks, pg. 101

The Processor: Datapath & Control
• Our implementation of the MIPS is simplified
  - memory-reference instructions: lw, sw
  - arithmetic-logical instructions: add, sub, and, or, slt
  - control flow instructions: beq, j
• Generic implementation
  - use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  - decode the instruction (and read registers)
  - execute the instruction
  [Figure: Fetch (PC = PC+4), Decode, Exec cycle]
• All instructions (except j) use the ALU after reading the registers
  How? memory-reference? arithmetic? control flow?

Clocking Methodologies
• The clocking methodology defines when signals can be read and when they are written
  - An edge-triggered methodology
• Typical execution
  - read contents of state elements
  - send values through combinational logic
  - write results to one or more state elements
  [Figure: State element 1, Combinational logic, State element 2, driven by the clock over one clock cycle]
• Assumes state elements are written on every clock cycle; if not, need explicit write control signal
  - write occurs only when both the write control is asserted and the clock edge occurs
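The write rule above can be modeled in a few lines of Python. This is only a behavioral sketch: the class name StateElement and its methods are hypothetical, and the model ignores setup and hold times.

```python
class StateElement:
    """Edge-triggered register with an explicit write control input."""

    def __init__(self, value=0):
        self.value = value   # output that the combinational logic reads
        self._next = value   # value captured at the next clock edge

    def set_input(self, data, write_enable=True):
        # Without the write control asserted, the old value is kept.
        self._next = data if write_enable else self.value

    def clock_edge(self):
        # The stored value changes only here, at the clock edge.
        self.value = self._next


reg = StateElement(5)
reg.set_input(9, write_enable=False)
reg.clock_edge()
print(reg.value)  # 5: clock edge occurred but write control was not asserted
reg.set_input(9)
reg.clock_edge()
print(reg.value)  # 9: write control asserted and clock edge occurred
```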

Fetching Instructions
• Fetching instructions involves
  - reading the instruction from the Instruction Memory
  - updating the PC to hold the address of the next instruction
  [Figure: the PC drives the Read Address input of the Instruction Memory, which outputs the Instruction; an adder computes PC + 4]
  - PC is updated every cycle, so it does not need an explicit write control signal
  - Instruction Memory is read every cycle, so it doesn't need an explicit read control signal

Decoding Instructions
• Decoding instructions involves
  - sending the fetched instruction's opcode and function field bits to the control unit
  - reading two values from the Register File
    - Register File addresses are contained in the instruction
  [Figure: the Instruction feeds the Control Unit and the Register File (inputs Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2)]

Executing R Format Operations
• R format operations (add, sub, slt, and, or)
    R-type: op (31-26) | rs (25-21) | rt (20-16) | rd (15-11) | shamt (10-6) | funct (5-0)
  - perform the (op and funct) operation on values in rs and rt
  - store the result back into the Register File (into location rd)
  [Figure: Register File read outputs feed the ALU (ALU control input; overflow and zero outputs); the ALU result returns to the Register File Write Data input under RegWrite]
  - The Register File is not written every cycle (e.g. sw), so we need an explicit write control signal for the Register File
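To make the R-type field layout concrete, here is a small Python sketch that splits a 32-bit instruction word into its fields using the bit positions shown above. The function name decode_r_type is hypothetical; the example word 0x02324020 is the standard MIPS encoding of add $t0, $s1, $s2.

```python
def decode_r_type(instr):
    """Split a 32-bit R-type instruction word into its six fields."""
    return {
        "op":    (instr >> 26) & 0x3F,  # bits 31-26
        "rs":    (instr >> 21) & 0x1F,  # bits 25-21
        "rt":    (instr >> 16) & 0x1F,  # bits 20-16
        "rd":    (instr >> 11) & 0x1F,  # bits 15-11
        "shamt": (instr >> 6) & 0x1F,   # bits 10-6
        "funct": instr & 0x3F,          # bits 5-0
    }


fields = decode_r_type(0x02324020)  # add $t0, $s1, $s2
print(fields)
# {'op': 0, 'rs': 17, 'rt': 18, 'rd': 8, 'shamt': 0, 'funct': 32}
```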

Executing Load and Store Operations
• Load and store operations involve
  - computing the memory address by adding the base register (read from the Register File during decode) to the 16-bit sign-extended offset field in the instruction
  - store: the value (read from the Register File during decode) is written to the Data Memory
  - load: the value, read from the Data Memory, is written to the Register File
  [Figure: Register File, Sign Extend (16 to 32 bits), ALU computing the Address, and Data Memory (Write Data, Read Data), with control signals RegWrite, ALU control, MemWrite and MemRead]
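The address calculation described above can be sketched as follows. The helper names sign_extend16 and effective_address are hypothetical, and the base address in the example is arbitrary.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    """Base register plus sign-extended 16-bit offset, wrapped to 32 bits."""
    return (base + sign_extend16(offset16)) & MASK32


# lw with base register = 0x10010000 and offset field = 0xFFFC (that is, -4)
print(hex(effective_address(0x10010000, 0xFFFC)))  # 0x1000fffc
```

The same address computation serves both lw and sw; only the direction of the data transfer differs.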

Executing Branch Operations
• Branch operations involve
  - comparing the operands read from the Register File during decode for equality (zero ALU output)
  - computing the branch target address by adding the updated PC to the 16-bit sign-extended offset field in the instruction, shifted left by 2 bits
  [Figure: an adder computes PC + 4; a second adder adds the sign-extended offset (16 to 32 bits), shifted left 2, to produce the Branch target address; the ALU zero output goes to the branch control logic]
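A minimal Python sketch of the beq behavior described above, assuming 32-bit wraparound arithmetic. The function names are hypothetical; the comparison is modeled as a subtraction whose result is tested for zero, as the ALU does.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc_beq(pc, rs_value, rt_value, offset16):
    """Return the next PC for beq: the branch target if rs == rt, else PC + 4."""
    pc_plus_4 = (pc + 4) & MASK32
    # Offset is in words, so shift left by 2 to get a byte offset.
    target = (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    zero = ((rs_value - rt_value) & MASK32) == 0  # ALU zero output
    return target if zero else pc_plus_4


print(hex(next_pc_beq(0x00400000, 5, 5, 3)))  # 0x400010 (taken)
print(hex(next_pc_beq(0x00400000, 5, 6, 3)))  # 0x400004 (not taken)
```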

Executing Jump Operations
• Jump operation involves
  - replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
  [Figure: the low 26 bits of the Instruction pass through Shift left 2 to form the 28-bit Jump address; an adder computes PC + 4]
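The jump address formation can be sketched in Python as below. The function name jump_target is hypothetical. The upper 4 bits are taken from PC + 4, as in the later "Adding the Jump Operation" slide; the example word 0x08100008 encodes j 0x00400020.

```python
def jump_target(pc, instr):
    """Form the 32-bit jump address for a J-type instruction word."""
    upper4 = (pc + 4) & 0xF0000000       # PC+4[31-28]
    low28 = (instr & 0x03FFFFFF) << 2    # Instr[25-0] shifted left by 2
    return upper4 | low28


print(hex(jump_target(0x00400000, 0x08100008)))  # 0x400020
```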

Creating a Single Datapath from the Parts
• Assemble the datapath segments and add control lines and multiplexors as needed
• Single cycle design: fetch, decode and execute each instruction in one clock cycle
  - no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders)
  - multiplexors needed at the input of shared elements with control lines to do the selection
  - write signals to control writing to the Register File and Data Memory
• Cycle time is determined by length of the longest path
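To illustrate "cycle time is determined by the longest path", the sketch below sums unit delays along the path each instruction class uses and takes the maximum. The delay values (in picoseconds) and the unit names are hypothetical, chosen only for illustration; the lecture gives no latencies.

```python
# Hypothetical functional-unit delays in picoseconds (not from the lecture).
DELAY_PS = {"imem": 200, "reg_read": 100, "alu": 200, "dmem": 200, "reg_write": 100}

# Units on the path of each instruction class in the single cycle datapath.
PATHS = {
    "R-type": ["imem", "reg_read", "alu", "reg_write"],
    "lw":     ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "sw":     ["imem", "reg_read", "alu", "dmem"],
    "beq":    ["imem", "reg_read", "alu"],
    "j":      ["imem"],
}


def path_delay(instr_class):
    """Total delay along the path used by one instruction class."""
    return sum(DELAY_PS[unit] for unit in PATHS[instr_class])


# The single cycle clock must accommodate the slowest instruction.
cycle_time = max(path_delay(c) for c in PATHS)
print(cycle_time)  # 800 (the lw path, under these assumed delays)
```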

Fetch, R, and Memory Access Portions
[Figure: combined datapath with PC, Instruction Memory and the PC + 4 adder, Register File, Sign Extend (16 to 32 bits), ALU (ovf and zero outputs) and Data Memory, with control signals RegWrite, ALUSrc, ALU control, MemWrite, MemRead and MemtoReg]

Adding the Control
• Selecting the operations to perform (ALU, Register File and Memory read/write)
• Controlling the flow of data (multiplexor inputs)
    R-type: op (31-26) | rs (25-21) | rt (20-16) | rd (15-11) | shamt (10-6) | funct (5-0)
    I-type: op (31-26) | rs (25-21) | rt (20-16) | address offset (15-0)
    J-type: op (31-26) | target address (25-0)
• Observations
  - op field always in bits 31-26
  - addresses of the registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw rs is the base register
  - address of the register to be written is in one of two places: in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions
  - offset for beq, lw, and sw always in bits 15-0

Single Cycle Datapath with Control Unit
[Figure: complete single cycle datapath. Instr[31-26] feeds the Control Unit, which generates RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite. Instr[25-21] and Instr[20-16] address the Register File reads; a multiplexor selects Instr[20-16] or Instr[15-11] as the Write Addr; Instr[15-0] is sign extended from 16 to 32 bits; Instr[5-0] feeds the ALU control. PCSrc selects between PC + 4 and the branch target.]
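The slides name the control signals but do not list their values. The sketch below fills them in using the settings given in the Patterson & Hennessy textbook for this datapath, which is an assumption about what the lecture intends; the Python names (SIGNALS, TABLE, main_control, pc_src) are hypothetical.

```python
X = None  # "don't care": the signal value does not affect the result

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemRead", "MemWrite", "Branch", "ALUOp")

# opcode (Instr[31-26]) -> signal values, in the order of SIGNALS
# (assumed textbook settings, not taken from the slides)
TABLE = {
    0b000000: (1, 0, 0, 1, 0, 0, 0, 0b10),  # R-type
    0b100011: (0, 1, 1, 1, 1, 0, 0, 0b00),  # lw
    0b101011: (X, 1, X, 0, 0, 1, 0, 0b00),  # sw
    0b000100: (X, 0, X, 0, 0, 0, 1, 0b01),  # beq
}


def main_control(opcode):
    """Return the control signal settings for one opcode as a dict."""
    return dict(zip(SIGNALS, TABLE[opcode]))


def pc_src(branch, zero):
    """PCSrc selects the branch target only for a branch whose ALU result is zero."""
    return branch & zero


print(main_control(0b100011))  # settings for lw
```

Only RegWrite and MemWrite must never be "don't care", since asserting them by accident would change architectural state.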

R-type Instruction Data/Control Flow
[Figure: single cycle datapath and Control Unit, illustrating data and control flow for an R-type instruction]

Load Word Instruction Data/Control Flow
[Figure: single cycle datapath and Control Unit, illustrating data and control flow for a load word instruction]

Branch Instruction Data/Control Flow
[Figure: single cycle datapath and Control Unit, illustrating data and control flow for a branch instruction]

Adding the Jump Operation
[Figure: single cycle datapath extended for jump. Instr[25-0] is shifted left 2 (26 to 28 bits) and combined with PC+4[31-28] to form the 32-bit jump address; a multiplexor controlled by the new Jump signal selects it as the next PC.]

Single Cycle Disadvantages & Advantages
• Uses the clock cycle inefficiently: the clock cycle must be timed to accommodate the slowest instruction
  - especially problematic for more complex instructions like floating point multiply
  [Figure: timing diagram with lw in Cycle 1 and sw in Cycle 2; the unused part of the sw cycle is marked "Waste"]
• May be wasteful of area since some functional units (e.g., adders) must be duplicated since they can not be shared during a clock cycle
but
• Is simple and easy to understand

Multicycle Datapath Approach
• Let an instruction take more than 1 clock cycle to complete
  - Break up instructions into steps where each step takes a cycle while trying to
    - balance the amount of work to be done in each step
    - restrict each cycle to use only one major functional unit
  - Not every instruction takes the same number of clock cycles
• In addition to faster clock rates, multicycle allows functional units to be used more than once per instruction as long as they are used on different clock cycles; as a result
  - only need one memory, but only one memory access per cycle
  - need only one ALU/adder, but only one ALU operation per cycle

Multicycle Datapath Approach, cont'd
• At the end of a cycle
  - Store values needed in a later cycle by the current instruction in an internal register (not visible to the programmer). All (except IR) hold data only between a pair of adjacent clock cycles (no write control signal needed)
      IR: Instruction Register
      MDR: Memory Data Register
      A, B: regfile read data registers
      ALUout: ALU output register
  - Data used by subsequent instructions are stored in programmer visible registers (i.e., register file, PC, or memory)
  [Figure: multicycle datapath with PC, a single Memory (Instr. or Data), IR, MDR, Register File, A, B, ALU and ALUout]

The Multicycle Datapath with Control Signals
[Figure: multicycle datapath (PC, Memory, IR, MDR, Register File, A, B, Sign Extend, two Shift left 2 units, ALU, ALUout) with the Control unit driven by Instr[31-26]. Control signals: PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst.]

Multicycle Control Unit
• Multicycle datapath control signals are not determined solely by the bits in the instruction
  - e.g., op code bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next
• Must use a finite state machine (FSM) for control
  - a set of states (current state stored in State Register)
  - next state function (determined by current state and the input)
  - output function (determined by current state and the input)
  [Figure: Combinational control logic takes the Inst Opcode and the State Reg as inputs and produces the Datapath control points and the Next State]
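A minimal sketch of the next state function, written in Python. The state names reuse the step names from the lecture (IFetch, Dec, Exec, Mem, WB); the function names and the one-state-per-step simplification are assumptions, and the output function that drives the datapath control points is omitted. A real controller would need a separate state for each distinct set of control outputs.

```python
def next_state(state, op):
    """Next state as a function of the current state and the instruction class."""
    if state == "IFetch":
        return "Dec"
    if state == "Dec":
        return "Exec"
    if state == "Exec":
        # Branches and jumps complete in Exec.
        return "IFetch" if op in ("beq", "j") else "Mem"
    if state == "Mem":
        # Only a load needs a further cycle to write the Register File.
        return "WB" if op == "lw" else "IFetch"
    if state == "WB":
        return "IFetch"
    raise ValueError(f"unknown state: {state}")


def trace(op):
    """List the states one instruction passes through, starting at IFetch."""
    states = ["IFetch"]
    while True:
        nxt = next_state(states[-1], op)
        if nxt == "IFetch":
            return states
        states.append(nxt)


print(trace("lw"))   # ['IFetch', 'Dec', 'Exec', 'Mem', 'WB']
print(trace("beq"))  # ['IFetch', 'Dec', 'Exec']
```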

The Five Steps of the Load Instruction
[Figure: lw occupies Cycle 1 to Cycle 5: IFetch, Dec, Exec, Mem, WB]
• IFetch: Instruction Fetch and Update PC
• Dec: Instruction Decode, Register Read, Sign Extend Offset
• Exec: Execute R-type; Calculate Memory Address; Branch Comparison; Branch and Jump Completion
• Mem: Memory Read; Memory Write Completion; R-type Completion (RegFile write)
• WB: Memory Read Completion (RegFile write)
INSTRUCTIONS TAKE FROM 3 - 5 CYCLES!
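The cycle counts implied by the step list above (3 for beq and j, 4 for sw and R-type, 5 for lw) can be combined with an instruction mix to get an average CPI. The mix below is hypothetical and only illustrates the arithmetic; the names CYCLES and average_cpi are likewise made up.

```python
# Cycles per instruction class, read off the step in which each class completes.
CYCLES = {"lw": 5, "sw": 4, "R-type": 4, "beq": 3, "j": 3}


def average_cpi(mix):
    """Weighted average CPI for a mix given as {class: fraction of instructions}."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("instruction mix fractions must sum to 1")
    return sum(fraction * CYCLES[op] for op, fraction in mix.items())


# Hypothetical instruction mix (not from the lecture).
mix = {"lw": 0.25, "sw": 0.10, "R-type": 0.52, "beq": 0.11, "j": 0.02}
print(round(average_cpi(mix), 2))  # 4.12
```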

Multicycle Advantages & Disadvantages
• Uses the clock cycle efficiently: the clock cycle is timed to accommodate the slowest instruction step
  [Figure: timing diagram over Cycle 1 to Cycle 10: lw takes IFetch, Dec, Exec, Mem, WB; sw takes IFetch, Dec, Exec, Mem; an R-type instruction then starts with IFetch]
• Multicycle implementations allow functional units to be used more than once per instruction as long as they are used on different clock cycles
but
• Requires additional internal state registers, more muxes, and more complicated (FSM) control

Single Cycle vs. Multiple Cycle Timing
Single Cycle Implementation:
[Figure: lw in Cycle 1 and sw in Cycle 2; the unused part of the sw cycle is marked "Waste"]
Multiple Cycle Implementation:
[Figure: Cycle 1 to Cycle 10: lw takes IFetch, Dec, Exec, Mem, WB; sw takes IFetch, Dec, Exec, Mem; an R-type instruction then starts with IFetch]
Note: the multicycle clock period is longer than 1/5th of the single cycle clock period due to state register overhead

Next Lecture and Reminders
• Next lecture
  - MIPS pipelined datapath review
    - Reading assignment: PH, Chapter 6.1-6.3
• Reminders
  - HW2 due September 27th
  - Evening midterm exam scheduled
    - Tuesday, October 18th, 20:15 to 22:15, Location 113 IST
    - You should have let me know by now if you have a conflict !!