14:332:331 Computer Architecture and Assembly Language
Fall 2003, Week 8
[Adapted from Dave Patterson’s UCB CS 152 slides and Mary Jane Irwin’s PSU CSE 331 slides]

Heads Up
- This week's material
  - Building a MIPS datapath
    - Reading assignment: PH 5.1-5.2
- Next week's material
  - Single cycle datapath implementation
    - Reading assignment: PH 5.3 and C.1 through C.2

Review: Design Principles
- Simplicity favors regularity
  - fixed size instructions (32 bits)
  - only three instruction formats
- Good design demands good compromises
  - three instruction formats
- Smaller is faster
  - limited instruction set
  - limited number of registers in register file
  - limited number of addressing modes
- Make the common case fast
  - arithmetic operands from the register file (load-store machine)
  - allow instructions to contain immediate operands

The Processor: Datapath & Control
- We're ready to look at an implementation of the MIPS
- Simplified to contain only:
  - memory-reference instructions: lw, sw
  - arithmetic-logical instructions: add, sub, and, or, slt
  - control flow instructions: beq, j
- Generic implementation:
  - use the program counter (PC) to supply the instruction address and fetch the instruction from memory (and update the PC)
  - decode the instruction (and read registers)
  - execute the instruction
[Figure: Fetch (PC = PC+4), Decode, Exec cycle]
- All instructions (except j) use the ALU after reading the registers

Abstract Implementation View
- Two types of functional units:
  - elements that operate on data values (combinational)
  - elements that contain state (sequential)
[Figure: PC supplies the Address to the Instruction Memory; the Instruction supplies Reg Addr inputs to the Register File; Read Data outputs feed the ALU; the ALU supplies the Address to the Data Memory; Read Data and Write Data paths connect the Data Memory and the Register File]
- Single cycle operation
- Split memory (Harvard) model: one memory for instructions and one for data

Clocking Methodologies
- Clocking methodology defines when signals can be read and when they can be written
[Figure: clock waveform labeled with rising (positive) edge, falling (negative) edge, and cycle time]
  clock rate = 1/(cycle time)
  e.g., 10 nsec cycle time = 100 MHz clock rate
         1 nsec cycle time = 1 GHz clock rate
- State element design choices
  - level sensitive latch
  - master-slave and edge-triggered flipflops

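The relation between cycle time and clock rate on this slide can be checked with a few lines of Python. This is only a sketch; the function name `clock_rate_hz` is made up for illustration.

```python
def clock_rate_hz(cycle_time_s):
    """Return the clock rate in Hz for a cycle time given in seconds."""
    return 1.0 / cycle_time_s

# The two examples from the slide, converted to MHz and GHz.
rate_10ns_mhz = clock_rate_hz(10e-9) / 1e6   # 10 nsec cycle time
rate_1ns_ghz = clock_rate_hz(1e-9) / 1e9     # 1 nsec cycle time
print(round(rate_10ns_mhz), "MHz;", round(rate_1ns_ghz), "GHz")
```
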
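Review: State Elements
- Set-reset latch
[Figure: set-reset latch circuit with inputs R, S and outputs Q, !Q]

  R  S | Q(t+1)  !Q(t+1)
  1  0 |   0        1
  0  1 |   1        0
  0  0 |  Q(t)    !Q(t)
  1  1 |   0        0

- Level sensitive D latch
[Figure: D latch with inputs D, clock and outputs Q, !Q; timing diagram of clock, D, and Q]
  - latch is transparent when clock is high (copies input to output)
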
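A behavioral sketch of the two state elements on this slide may help. It models only the logical behavior (no gate delays), and the names `sr_latch_next` and `DLatch` are hypothetical.

```python
def sr_latch_next(r, s, q):
    """Next (Q, !Q) of the set-reset latch, following the slide's truth table."""
    if r == 1 and s == 0:
        return (0, 1)          # reset
    if r == 0 and s == 1:
        return (1, 0)          # set
    if r == 0 and s == 0:
        return (q, 1 - q)      # hold the previous state
    return (0, 0)              # R = S = 1: both outputs are 0


class DLatch:
    """Level sensitive D latch: transparent while the clock is high."""

    def __init__(self, q=0):
        self.q = q

    def update(self, clock, d):
        if clock == 1:
            self.q = d         # transparent: copy input to output
        return self.q          # clock low: keep the stored value


latch = DLatch()
trace = [latch.update(1, 1), latch.update(0, 0), latch.update(1, 0)]
print(trace)
```

While the clock is low the latch ignores D, which is why the second value in `trace` still reflects the earlier input.
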
Review: State Elements, cont'd
- Race problem with latch based design …
[Figure: D-latch 0 and D-latch 1 cross-connected (Q of each feeds D of the other), sharing one clock]
- Consider the case when D-latch 0 holds a 0 and D-latch 1 holds a 1 and you want to transfer the contents of D-latch 0 to D-latch 1 and vice versa
  - must have the clock high long enough for the transfer to take place
  - must not leave the clock high so long that the transferred data is copied back into the original latch
- Two-sided clock constraint

Review: State Elements, cont'd
- Solution is to use flipflops that change state (Q) only on clock edge (master-slave)
[Figure: master-slave flipflop built from two D-latches, with a timing diagram of clock, D, and Q]
  - master (first D-latch) copies the input when the clock is high (the slave (second D-latch) is locked in its memory state and the output does not change)
  - slave copies the master when the clock goes low (the master is now locked in its memory state so changes at the input are not loaded into the master D-latch)
- One-sided clock constraint
  - must have the clock cycle time long enough to accommodate the worst case delay path

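The master-slave behavior described above can be sketched behaviorally. The example below assumes ideal latches with no delay and uses a hypothetical `MasterSlaveFlipFlop` class; it swaps the contents of two cross-connected flipflops, the case that races with plain latches.

```python
class MasterSlaveFlipFlop:
    """Two D-latches in series: master loads while the clock is high,
    slave (the visible output Q) loads when the clock goes low."""

    def __init__(self, q=0):
        self.master = q
        self.q = q              # slave output

    def update(self, clock, d):
        if clock == 1:
            self.master = d     # master transparent, slave locked
        else:
            self.q = self.master  # slave copies master, master locked
        return self.q


ff0 = MasterSlaveFlipFlop(0)
ff1 = MasterSlaveFlipFlop(1)

# Clock high: each master samples the other's output; outputs stay put.
ff0.update(1, ff1.q)
ff1.update(1, ff0.q)

# Clock low: each slave takes its master's value; inputs are ignored.
ff0.update(0, ff1.q)
ff1.update(0, ff0.q)

print(ff0.q, ff1.q)
```

Because the outputs cannot change while the masters are sampling, the length of the high phase no longer has an upper bound, which is the one-sided constraint the slide names.
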
Our Implementation
- An edge-triggered methodology
- Typical execution
  - read contents of some state elements
  - send values through some combinational logic
  - write results to one or more state elements
[Figure: State element 1 feeds Combinational logic, which feeds State element 2; both state elements share the clock; the transfer takes one clock cycle]
- Assumes state elements are written on every clock cycle; if not, need explicit write control signal
  - write occurs only when both the write control is asserted and the clock edge occurs

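One way to picture the read, compute, write pattern is a state element that only changes when its clock-edge method is called. This is a minimal sketch; the `Register` class and the `write_enable` parameter are illustrative names, not part of the slides.

```python
class Register:
    """Edge-triggered state element with an optional write control."""

    def __init__(self, value=0):
        self.value = value

    def clock_edge(self, d, write_enable=True):
        # The stored value changes only on a clock edge, and only
        # if the write control is asserted.
        if write_enable:
            self.value = d
        return self.value


state1 = Register(5)
state2 = Register(0)

# One clock cycle: read state1, pass it through combinational logic,
# then write the result into state2 at the clock edge.
combinational_result = state1.value + 1
state2.clock_edge(combinational_result)

# With the write control deasserted, the edge leaves state2 unchanged.
state2.clock_edge(99, write_enable=False)
print(state2.value)
```
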
Fetching Instructions
- Fetching instructions involves
  - reading the instruction from the Instruction Memory
  - updating the PC to hold the address of the next instruction
[Figure: PC drives the Read Address of the Instruction Memory, which outputs the Instruction; an adder computes PC + 4 and writes it back to the PC]
  - PC is updated every cycle, so it does not need an explicit write control signal
  - Instruction Memory is read every cycle, so it doesn't need an explicit read control signal

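The fetch step can be sketched as a function that returns both the instruction and the updated PC. The instruction memory is modeled here as a Python dict keyed by byte address, which is an assumption made for brevity; `fetch` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF

def fetch(instruction_memory, pc):
    """Read the instruction at pc and compute the next sequential PC."""
    instruction = instruction_memory[pc]
    next_pc = (pc + 4) & MASK32      # the "Add 4" unit in the figure
    return instruction, next_pc


# Two encoded instructions at consecutive word addresses:
# add $t0, $t1, $t2 and sub $t0, $t1, $t2.
instruction_memory = {0x00400000: 0x012A4020, 0x00400004: 0x012A4022}

pc = 0x00400000
first, pc = fetch(instruction_memory, pc)
second, pc = fetch(instruction_memory, pc)
print(hex(first), hex(second), hex(pc))
```
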
Decoding Instructions
- Decoding instructions involves
  - sending the fetched instruction's opcode and function field bits to the control unit
[Figure: the Instruction feeds the Control Unit and the Register File (inputs Read Addr 1, Read Addr 2, Write Addr, Write Data; outputs Read Data 1, Read Data 2)]
  - reading two values from the Register File
    - Register File addresses are contained in the instruction

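Decoding amounts to slicing fixed bit fields out of the 32-bit instruction word. The sketch below uses the standard MIPS field positions; the function name `decode_fields` and the dict layout are illustrative choices.

```python
def decode_fields(instruction):
    """Split a 32-bit MIPS instruction word into its R-format fields."""
    return {
        "op":    (instruction >> 26) & 0x3F,   # bits 31-26, to the control unit
        "rs":    (instruction >> 21) & 0x1F,   # bits 25-21, Read Addr 1
        "rt":    (instruction >> 16) & 0x1F,   # bits 20-16, Read Addr 2
        "rd":    (instruction >> 11) & 0x1F,   # bits 15-11
        "shamt": (instruction >> 6) & 0x1F,    # bits 10-6
        "funct": instruction & 0x3F,           # bits 5-0, to the control unit
    }


# add $t0, $t1, $t2 ($t0 = register 8, $t1 = 9, $t2 = 10)
fields = decode_fields(0x012A4020)
print(fields)
```
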
Executing R Format Operations
- R format operations (add, sub, slt, and, or)

  R-type: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]

  - perform the indicated (by op and funct) operation on values in rs and rt
  - store the result back into the Register File (into location rd)
[Figure: the Instruction supplies Read Addr 1, Read Addr 2, and Write Addr to the Register File (RegWrite control); Read Data 1 and Read Data 2 feed the ALU (ALU control; overflow and zero outputs); the ALU result returns to Write Data]
  - Note that Register File is not written every cycle (e.g., sw), so we need an explicit write control signal for the Register File

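The R format execution step can be approximated in a few lines. The funct codes used are the standard MIPS ones for add, sub, and, or, and slt; the function names, the list-based register file, and the omission of the overflow output are simplifications assumed for this sketch.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def alu(funct, a, b):
    """Return (result, zero) for the R format operations in the subset."""
    if funct == 0x20:      # add
        result = a + b
    elif funct == 0x22:    # sub
        result = a - b
    elif funct == 0x24:    # and
        result = a & b
    elif funct == 0x25:    # or
        result = a | b
    elif funct == 0x2A:    # slt: signed comparison
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("funct not in the simplified subset")
    result &= MASK32
    return result, result == 0

def execute_r_type(instruction, regs, reg_write=True):
    rs = (instruction >> 21) & 0x1F
    rt = (instruction >> 16) & 0x1F
    rd = (instruction >> 11) & 0x1F
    result, zero = alu(instruction & 0x3F, regs[rs], regs[rt])
    if reg_write:          # explicit write control for the Register File
        regs[rd] = result
    return result, zero


regs = [0] * 32
regs[9], regs[10] = 5, 7
add_result, _ = execute_r_type(0x012A4020, regs)   # add $t0, $t1, $t2
sub_result, _ = execute_r_type(0x012A4022, regs)   # sub $t0, $t1, $t2
slt_result, _ = execute_r_type(0x012A402A, regs)   # slt $t0, $t1, $t2
```
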
Executing Load and Store Operations
- Load and store operations

  I-type: op [31:26] | rs [25:21] | rt [20:16] | address offset [15:0]

  - compute a memory address by adding the base register (in rs) to the 16-bit signed offset field in the instruction
    - base register was read from the Register File during decode
    - offset value in the low order 16 bits of the instruction must be sign extended to create a 32-bit signed value
  - store value, read from the Register File during decode, must be written to the Data Memory
  - load value, read from the Data Memory, must be stored in the Register File

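Executing Load and Store Operations, cont'd
[Figure: load/store datapath: the Instruction supplies Read Addr 1, Read Addr 2, and Write Addr to the Register File (RegWrite); the 16-bit offset passes through Sign Extend; the ALU (ALU control; overflow and zero outputs) computes the Address for the Data Memory (MemRead, MemWrite); Read Data 2 feeds the Data Memory's Write Data, and the Data Memory's Read Data returns to the Register File's Write Data]
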
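A rough behavioral model of the load and store path follows. It assumes the standard MIPS opcodes for lw (0x23) and sw (0x2B), models the Data Memory as a dict keyed by byte address, and uses hypothetical function names.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Sign extend the low 16 bits of value to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def execute_mem(instruction, regs, data_memory):
    """Execute lw or sw and return the effective address."""
    op = (instruction >> 26) & 0x3F
    rs = (instruction >> 21) & 0x1F
    rt = (instruction >> 16) & 0x1F
    # ALU adds the base register and the sign extended offset.
    address = (regs[rs] + sign_extend16(instruction)) & MASK32
    if op == 0x23:                      # lw: MemRead, then RegWrite
        regs[rt] = data_memory[address]
    elif op == 0x2B:                    # sw: MemWrite
        data_memory[address] = regs[rt]
    else:
        raise ValueError("not a load or store in the simplified subset")
    return address


regs = [0] * 32
regs[9] = 0x1004                        # base register $t1
data_memory = {0x1000: 42}

load_address = execute_mem(0x8D28FFFC, regs, data_memory)   # lw $t0, -4($t1)
loaded_value = regs[8]

regs[8] = 99
execute_mem(0xAD28FFFC, regs, data_memory)                  # sw $t0, -4($t1)
```
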
Executing Branch Operations
- Branch operations have to

  I-type: op [31:26] | rs [25:21] | rt [20:16] | address offset [15:0]

  - compare the operands read from the Register File during decode (rs and rt values) for equality (zero ALU output)
  - compute the branch target address by adding the updated PC to the sign extended 16-bit signed offset field in the instruction
    - "base register" is the updated PC
    - offset value in the low order 16 bits of the instruction must be sign extended to create a 32-bit signed value and then shifted left 2 bits to turn the word offset into a byte offset

Executing Branch Operations, cont'd
[Figure: branch datapath: one adder computes PC + 4; a second adder sums the updated PC with the offset after Sign Extend (16 to 32 bits) and Shift left 2, producing the Branch target address; the ALU (ALU control) compares Read Data 1 and Read Data 2 from the Register File and sends zero to the branch control logic]

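The branch target calculation and the equality test can be sketched as below. The function names are hypothetical, and the equality test is modeled as a subtraction whose zero result stands in for the ALU's zero output.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Sign extend the low 16 bits of value to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, offset16):
    """Updated PC (PC + 4) plus the sign extended offset shifted left 2."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & MASK32

def next_pc_for_beq(pc, rs_value, rt_value, offset16):
    zero = ((rs_value - rt_value) & MASK32) == 0   # ALU zero output
    return branch_target(pc, offset16) if zero else (pc + 4) & MASK32


forward = branch_target(0x00400010, 3)        # skip 3 words past PC + 4
backward = branch_target(0x00400010, 0xFFFF)  # offset of -1 word
taken = next_pc_for_beq(0x00400010, 7, 7, 3)
not_taken = next_pc_for_beq(0x00400010, 7, 8, 3)
```
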
Executing Jump Operations
- Jump operations have to

  J-type: op [31:26] | jump target address [25:0]

  - replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits
[Figure: PC drives the Read Address of the Instruction Memory; the low 26 bits of the Instruction pass through Shift left 2 to give 28 bits, which are combined with the upper 4 bits from the PC + 4 adder to form the Jump address]

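The jump address formation is a pair of mask and shift operations. This sketch assumes the upper 4 bits come from the updated PC (PC + 4), as the figure suggests; `jump_address` is a hypothetical name.

```python
def jump_address(pc, instruction):
    """Combine the upper 4 bits of PC + 4 with the 26-bit target shifted left 2."""
    upper4 = (pc + 4) & 0xF0000000
    lower28 = (instruction & 0x03FFFFFF) << 2
    return upper4 | lower28


# j 0x00400020: opcode 2 in bits 31-26, target field 0x00400020 >> 2.
j_instruction = (2 << 26) | (0x00400020 >> 2)
target = jump_address(0x00400000, j_instruction)
print(hex(j_instruction), hex(target))
```
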
Our Simple Control Structure
- We wait for everything to settle down
  - ALU might not produce "right answer" right away
  - we use write signals along with the clock edge to determine when to write (to the Register File and the Data Memory)
- Cycle time determined by length of the longest path
We are ignoring some details like register setup and hold times

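To see how the longest path sets the cycle time, the sketch below sums per-unit delays along each instruction's path and takes the maximum. The delay values are hypothetical numbers chosen only for illustration, and the unit and path names are assumptions, not figures from the slides.

```python
# Hypothetical unit delays in nanoseconds (illustrative values only).
DELAY_NS = {"instr_mem": 2, "reg_read": 1, "alu": 2, "data_mem": 2, "reg_write": 1}

# Functional units each instruction class passes through in one cycle.
PATHS = {
    "R-type": ["instr_mem", "reg_read", "alu", "reg_write"],
    "lw":     ["instr_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "sw":     ["instr_mem", "reg_read", "alu", "data_mem"],
    "beq":    ["instr_mem", "reg_read", "alu"],
    "j":      ["instr_mem"],
}

def path_delay_ns(units):
    """Total delay of one path through the datapath."""
    return sum(DELAY_NS[unit] for unit in units)

path_delays = {name: path_delay_ns(units) for name, units in PATHS.items()}

# In a single cycle design every instruction takes the same cycle,
# so the cycle time must cover the slowest path.
slowest = max(path_delays, key=path_delays.get)
cycle_time_ns = path_delays[slowest]
print(slowest, cycle_time_ns)
```

Under these assumed delays the load path is the longest, so every instruction, including a jump that only needs the instruction memory, pays for a full load-length cycle.
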