Single Cycle Processor Design ICS 233 Computer Architecture

- Slides: 46

Single Cycle Processor Design
ICS 233 Computer Architecture and Assembly Language
Dr. Aiman El-Maleh
College of Computer Sciences and Engineering
King Fahd University of Petroleum and Minerals
[Adapted from slides of Dr. M. Mudawar, ICS 233, KFUPM]

Outline
- Designing a Processor: Step-by-Step
- Datapath Components and Clocking
- Assembling an Adequate Datapath
- Controlling the Execution of Instructions
- The Main Controller and ALU Controller
- Worst case timing
Single Cycle Processor Design ICS 233 – KFUPM © Muhamed Mudawar slide 2

Designing a Processor: Step-by-Step
- Analyze instruction set => datapath requirements
  - The meaning of each instruction is given by the register transfers
  - Datapath must include storage elements for ISA registers
  - Datapath must support each register transfer
- Select datapath components and clocking methodology
- Assemble datapath meeting the requirements
- Analyze implementation of each instruction
  - Determine the setting of control signals for register transfer
- Assemble the control logic

Review of MIPS Instruction Formats
- All instructions are 32 bits wide
- Three instruction formats: R-type, I-type, and J-type
  - R-type: Op6, Rs5, Rt5, Rd5, sa5, funct6
  - I-type: Op6, Rs5, Rt5, immediate16
  - J-type: Op6, immediate26
- Fields
  - Op6: 6-bit opcode of the instruction
  - Rs5, Rt5, Rd5: 5-bit source and destination register numbers
  - sa5: 5-bit shift amount used by shift instructions
  - funct6: 6-bit function field for R-type instructions
  - immediate16: 16-bit immediate value or address offset
  - immediate26: 26-bit target address of the jump instruction
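The field layout above can be sketched as a small decoder. This is only an illustration: the function name `decode` and the dictionary keys are assumptions made here, not part of the slides. Every field is extracted regardless of format, and the caller picks the fields that apply to the opcode.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its format fields."""
    return {
        "op": (word >> 26) & 0x3F,      # Op6, bits 31..26 (all formats)
        "rs": (word >> 21) & 0x1F,      # Rs5, bits 25..21 (R-type, I-type)
        "rt": (word >> 16) & 0x1F,      # Rt5, bits 20..16 (R-type, I-type)
        "rd": (word >> 11) & 0x1F,      # Rd5, bits 15..11 (R-type)
        "sa": (word >> 6) & 0x1F,       # sa5, bits 10..6  (R-type)
        "funct": word & 0x3F,           # funct6, bits 5..0 (R-type)
        "imm16": word & 0xFFFF,         # immediate16 (I-type)
        "imm26": word & 0x3FFFFFF,      # immediate26 (J-type)
    }

# add $3, $1, $2 encodes as op=0, rs=1, rt=2, rd=3, sa=0, funct=0x20
fields = decode(0x00221820)
```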

MIPS Subset of Instructions
- Only a subset of the MIPS instructions is considered
  - ALU instructions (R-type): add, sub, and, or, xor, slt
  - Immediate instructions (I-type): addi, slti, andi, ori, xori
  - Load and Store (I-type): lw, sw
  - Branch (I-type): beq, bne
  - Jump (J-type): j
- This subset does not include all the integer instructions
- But it is sufficient to illustrate the design of the datapath and control
- Concepts used to implement the MIPS subset are used to construct a broad spectrum of computers

Details of the MIPS Subset

| Instruction | Meaning | Format (fields in order) |
|---|---|---|
| add rd, rs, rt | addition | op6 = 0, rs5, rt5, rd5, 0, funct 0x20 |
| sub rd, rs, rt | subtraction | op6 = 0, rs5, rt5, rd5, 0, funct 0x22 |
| and rd, rs, rt | bitwise and | op6 = 0, rs5, rt5, rd5, 0, funct 0x24 |
| or rd, rs, rt | bitwise or | op6 = 0, rs5, rt5, rd5, 0, funct 0x25 |
| xor rd, rs, rt | exclusive or | op6 = 0, rs5, rt5, rd5, 0, funct 0x26 |
| slt rd, rs, rt | set on less than | op6 = 0, rs5, rt5, rd5, 0, funct 0x2a |
| addi rt, rs, im16 | add immediate | op6 = 0x08, rs5, rt5, im16 |
| slti rt, rs, im16 | slt immediate | op6 = 0x0a, rs5, rt5, im16 |
| andi rt, rs, im16 | and immediate | op6 = 0x0c, rs5, rt5, im16 |
| ori rt, rs, im16 | or immediate | op6 = 0x0d, rs5, rt5, im16 |
| xori rt, rs, im16 | xor immediate | op6 = 0x0e, rs5, rt5, im16 |
| lw rt, im16(rs) | load word | op6 = 0x23, rs5, rt5, im16 |
| sw rt, im16(rs) | store word | op6 = 0x2b, rs5, rt5, im16 |
| beq rs, rt, im16 | branch if equal | op6 = 0x04, rs5, rt5, im16 |
| bne rs, rt, im16 | branch not equal | op6 = 0x05, rs5, rt5, im16 |
| j im26 | jump | op6 = 0x02, im26 |

Register Transfer Level (RTL)
- RTL is a description of data flow between registers
- RTL gives a meaning to the instructions
- All instructions are fetched from memory at address PC

| Instruction | RTL Description |
|---|---|
| ADD | Reg(Rd) ← Reg(Rs) + Reg(Rt); PC ← PC + 4 |
| SUB | Reg(Rd) ← Reg(Rs) – Reg(Rt); PC ← PC + 4 |
| ORI | Reg(Rt) ← Reg(Rs) \| zero_ext(Im16); PC ← PC + 4 |
| LW | Reg(Rt) ← MEM[Reg(Rs) + sign_ext(Im16)]; PC ← PC + 4 |
| SW | MEM[Reg(Rs) + sign_ext(Im16)] ← Reg(Rt); PC ← PC + 4 |
| BEQ | if (Reg(Rs) == Reg(Rt)) PC ← PC + 4 + 4 × sign_ext(Im16) else PC ← PC + 4 |

Instructions are Executed in Steps
- R-type
  - Fetch instruction: Instruction ← MEM[PC]
  - Fetch operands: data1 ← Reg(Rs), data2 ← Reg(Rt)
  - Execute operation: ALU_result ← func(data1, data2)
  - Write ALU result: Reg(Rd) ← ALU_result
  - Next PC address: PC ← PC + 4
- I-type
  - Fetch instruction: Instruction ← MEM[PC]
  - Fetch operands: data1 ← Reg(Rs), data2 ← Extend(imm16)
  - Execute operation: ALU_result ← op(data1, data2)
  - Write ALU result: Reg(Rt) ← ALU_result
  - Next PC address: PC ← PC + 4
- BEQ
  - Fetch instruction: Instruction ← MEM[PC]
  - Fetch operands: data1 ← Reg(Rs), data2 ← Reg(Rt)
  - Equality: zero ← subtract(data1, data2)
  - Branch: if (zero) PC ← PC + 4 + 4×sign_ext(imm16) else PC ← PC + 4

Instruction Execution – cont’d
- LW
  - Fetch instruction: Instruction ← MEM[PC]
  - Fetch base register: base ← Reg(Rs)
  - Calculate address: address ← base + sign_extend(imm16)
  - Read memory: data ← MEM[address]
  - Write register Rt: Reg(Rt) ← data
  - Next PC address: PC ← PC + 4
- SW
  - Fetch instruction: Instruction ← MEM[PC]
  - Fetch registers: base ← Reg(Rs), data ← Reg(Rt)
  - Calculate address: address ← base + sign_extend(imm16)
  - Write memory: MEM[address] ← data
  - Next PC address: PC ← PC + 4
- Jump
  - Fetch instruction: Instruction ← MEM[PC]
  - Target PC address: target ← PC[31:28] , Imm26 , ‘00’ (concatenation)
  - Jump: PC ← target
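The execution steps listed for each instruction class can be tried out with a minimal sketch of the register transfers. The function `step`, its keyword arguments, and the use of a dictionary for memory (with unwritten words reading as 0) are assumptions made for illustration. The branch target is computed from the incremented PC, and the jump target uses PC[31:28] as written in the Jump steps.

```python
MASK32 = 0xFFFFFFFF

def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def step(pc, reg, mem, name, rs=0, rt=0, rd=0, imm=0):
    """Apply the register transfers of one instruction and return the next PC."""
    next_pc = (pc + 4) & MASK32
    if name == "add":
        reg[rd] = (reg[rs] + reg[rt]) & MASK32
    elif name == "lw":
        address = (reg[rs] + sign_ext16(imm)) & MASK32
        reg[rt] = mem.get(address, 0)       # assumption: unwritten memory reads 0
    elif name == "sw":
        address = (reg[rs] + sign_ext16(imm)) & MASK32
        mem[address] = reg[rt]
    elif name == "beq":
        if reg[rs] == reg[rt]:
            next_pc = (pc + 4 + 4 * sign_ext16(imm)) & MASK32
    elif name == "j":
        next_pc = (pc & 0xF0000000) | ((imm & 0x3FFFFFF) << 2)  # PC[31:28], Imm26, '00'
    reg[0] = 0                              # R0 is always zero
    return next_pc

reg, mem = [0] * 32, {}
reg[1], reg[2] = 5, 7
pc = step(0x100, reg, mem, "add", rs=1, rt=2, rd=3)   # reg[3] = reg[1] + reg[2]
pc = step(pc, reg, mem, "sw", rs=0, rt=3, imm=0x20)   # MEM[0x20] = reg[3]
```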

Requirements of the Instruction Set
- Memory
  - Instruction memory where instructions are stored
  - Data memory where data is stored
- Registers
  - 32 × 32-bit general purpose registers, R0 is always zero
  - Read source register Rs
  - Read source register Rt
  - Write destination register Rt or Rd
- Program counter PC register and Adder to increment PC
- Sign and Zero extender for immediate constant
- ALU for executing instructions

Next...
- Designing a Processor: Step-by-Step
- Datapath Components and Clocking
- Assembling an Adequate Datapath
- Controlling the Execution of Instructions
- The Main Controller and ALU Controller
- Worst case timing

Components of the Datapath
- Combinational Elements
  - ALU, Adder
  - Immediate extender
  - Multiplexers
- Storage Elements
  - Instruction memory
  - PC register
  - Register file
  - Data memory
- Clocking methodology
  - Timing of reads and writes
[Figure: block symbols for the extender (16-bit input, 32-bit output, ExtOp), multiplexer (select), ALU (ALU control, result, zero, overflow), PC, Instruction Memory (Address, Instruction), Registers (RA, RB, RW, BusA, BusB, BusW, RegWrite, Clock), and Data Memory (Address, Data_in, Data_out, MemRead, MemWrite)]

Register Element
- Register
  - Similar to the D-type Flip-Flop
- n-bit input and output
- Write Enable:
  - Enable / disable writing of register
  - Negated (0): Data_Out will not change
  - Asserted (1): Data_Out will become Data_In after clock edge
- Edge triggered Clocking
  - Register output is modified at clock edge
[Figure: n-bit register with Data_In, Data_Out, Write Enable, and Clock]

MIPS Register File
- Register File consists of 32 × 32-bit registers
  - BusA and BusB: 32-bit output busses for reading 2 registers
  - BusW: 32-bit input bus for writing a register when RegWrite is 1
  - Two registers read and one written in a cycle
- Registers are selected by:
  - RA selects register to be read on BusA
  - RB selects register to be read on BusB
  - RW selects the register to be written
- Clock input
  - The clock input is used ONLY during write operation
  - During read, register file behaves as a combinational logic block
    - RA or RB valid => BusA or BusB valid after access time
[Figure: Register File with 5-bit RA, RB, RW inputs, 32-bit BusA and BusB outputs, 32-bit BusW input, Clock, and RegWrite]
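A behavioral sketch of this register file may help fix the read and write rules. The class name `RegisterFile` and the method names `read` and `clock_edge` are illustrative assumptions, and writes to R0 are simply ignored so that it always reads as zero.

```python
class RegisterFile:
    """32 x 32-bit register file: two read ports, one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: return (BusA, BusB) for register numbers RA, RB."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, reg_write):
        """On the clock edge, write BusW into register RW if RegWrite is 1."""
        if reg_write and rw != 0:           # R0 is never written
            self.regs[rw] = bus_w & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=42, reg_write=1)  # R5 is written
rf.clock_edge(rw=6, bus_w=7, reg_write=0)   # RegWrite = 0, so R6 is unchanged
bus_a, bus_b = rf.read(5, 6)
```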

Tri-State Buffers
- Allow multiple sources to drive a single bus
- Two Inputs:
  - Data signal (data_in)
  - Output enable
- One Output (data_out):
  - If (Enable) Data_out = Data_in else Data_out = High Impedance state (output is disconnected)
- Tri-state buffers can be used to build multiplexors
[Figure: tri-state buffer with Data_in, Enable, and Data_out; a multiplexor built from two tri-state buffers with inputs Data_0 and Data_1, Select, and Output]

Details of the Register File
[Figure: register file built from registers R1 to R31 (R0 is not used; "0" is supplied in its place). The 5-bit RA and RB inputs each drive a decoder whose 32 outputs enable tri-state buffers that place the selected register on BusA and BusB. The 5-bit RW input drives a decoder that, together with RegWrite and Clock, enables (E) the register written from the 32-bit BusW]

Building a Multifunction ALU
- Shift Operation (2 bits): None = 00, SLL = 01, SRL = 10, SRA = 11
- Arithmetic Operation (1 bit): ADD = 0, SUB = 1
- Logical Operation (2 bits): AND = 00, OR = 01, NOR = 10, XOR = 11
- ALU Selection (2 bits): Shift = 00, SLT = 01, Arith = 10, Logic = 11
- SLT: ALU does a SUB and checks the sign and overflow
[Figure: 32-bit inputs A and B feed a Shifter (5-bit shift amount taken from the least significant bits), an Adder (with carry-in c0), and a Logic Unit; a 4-input mux controlled by ALU Selection produces the 32-bit ALU Result; the ALU also outputs sign, overflow, and zero]
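A behavioral sketch of such a multifunction ALU, using the selection and operation codes listed for this slide, could look as follows. Several details are assumptions made for illustration and are not stated on the slide: the function name `alu`, shifting the B input by a separately supplied `shamt`, passing B through when the shift operation is None, and computing SLT by a direct signed comparison (which gives the same result as subtracting and checking sign and overflow).

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def alu(a, b, sel, op, shamt=0):
    """Return (result, zero) for 32-bit inputs a and b."""
    if sel == 0b00:                          # Shift unit
        if op == 0b00:
            r = b                            # None (assumed pass-through)
        elif op == 0b01:
            r = b << shamt                   # SLL
        elif op == 0b10:
            r = b >> shamt                   # SRL
        else:
            r = to_signed(b) >> shamt        # SRA replicates the sign bit
    elif sel == 0b01:                        # SLT
        r = 1 if to_signed(a) < to_signed(b) else 0
    elif sel == 0b10:                        # Arithmetic: ADD = 0, SUB = 1
        r = a - b if op & 1 else a + b
    else:                                    # Logic: AND, OR, NOR, XOR
        r = [a & b, a | b, ~(a | b), a ^ b][op]
    r &= MASK32
    return r, int(r == 0)

difference, zero = alu(5, 5, sel=0b10, op=1)  # SUB of equal values sets zero
```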

Instruction and Data Memories
- Instruction memory needs only provide read access
  - Because datapath does not write instructions
  - Behaves as combinational logic for read
  - Address selects Instruction after access time
- Data Memory is used for load and store
  - MemRead: enables output on Data_out
    - Address selects the word to put on Data_out
  - MemWrite: enables writing of Data_in
    - Address selects the memory word to be written
    - The Clock synchronizes the write operation
- Separate instruction and data memories
  - Later, we will replace them with caches
[Figure: Instruction Memory with 32-bit Address input and 32-bit Instruction output; Data Memory with 32-bit Address, Data_in, and Data_out, plus Clock, MemRead, and MemWrite]

Clocking Methodology
- Clocks are needed in a sequential logic to decide when a state element (register) should be updated
- To ensure correctness, a clocking methodology defines when data can be written and read
- We assume edge-triggered clocking
- All state changes occur on the same clock edge
- Data must be valid and stable before arrival of clock edge
- Edge-triggered clocking allows a register to be read and written during same clock cycle
[Figure: Register 1 feeding Combinational logic feeding Register 2; clock waveform with rising edge and falling edge marked]

Determining the Clock Cycle
- With edge-triggered clocking, the clock cycle must be long enough to accommodate the path from one register through the combinational logic to another register
- Tclk-q : clock to output delay through register
- Tmax_comb : longest delay through combinational logic
- Ts : setup time that input to a register must be stable before arrival of clock edge
- Th : hold time that input to a register must hold after arrival of clock edge
- Tcycle ≥ Tclk-q + Tmax_comb + Ts
- Hold time (Th) is normally satisfied since Tclk-q > Th
[Figure: Register 1 feeding Combinational logic feeding Register 2; clock waveform marking Tclk-q, Tmax_comb, Ts, and Th around the writing edge]

Clock Skew
- Clock skew arises because the clock signal uses different paths with slightly different delays to reach state elements
- Clock skew is the difference in absolute time between when two storage elements see a clock edge
- With a clock skew, the clock cycle time is increased
  - Tcycle ≥ Tclk-q + Tmax_combinational + Tsetup + Tskew
- Clock skew is reduced by balancing the clock delays
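As a worked example of the cycle-time bound, the sketch below evaluates Tcycle ≥ Tclk-q + Tmax_comb + Tsetup + Tskew. The function name `min_cycle` and all delay values (in picoseconds) are hypothetical numbers chosen for illustration, not figures from the course.

```python
def min_cycle(t_clk_q, t_max_comb, t_setup, t_skew=0):
    """Smallest clock cycle that satisfies the setup constraint."""
    return t_clk_q + t_max_comb + t_setup + t_skew

# Hypothetical delays in picoseconds
without_skew = min_cycle(t_clk_q=50, t_max_comb=800, t_setup=30)
with_skew = min_cycle(t_clk_q=50, t_max_comb=800, t_setup=30, t_skew=20)
print(without_skew, with_skew)  # prints "880 900"
```

With these numbers, 20 ps of skew raises the minimum cycle from 880 ps to 900 ps, which is why balancing the clock delays matters.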

Next...
- Designing a Processor: Step-by-Step
- Datapath Components and Clocking
- Assembling an Adequate Datapath
- Controlling the Execution of Instructions
- The Main Controller and ALU Controller
- Worst case timing

Instruction Fetching Datapath
- We can now assemble the datapath from its components
- For instruction fetching, we need …
  - Program Counter (PC) register
  - Instruction Memory
  - Adder for incrementing PC
- The least significant 2 bits of the PC are ‘00’ since PC is a multiple of 4
- This datapath does not handle branch or jump instructions
- Improved Datapath: increments the upper 30 bits of PC by 1
[Figure: PC drives the Address input of the Instruction Memory, and an adder adds 4 to produce next PC. In the improved datapath, a +1 incrementer updates the upper 30 bits of PC and the low 2 bits are fixed at ‘00’]

Datapath for R-type Instructions
- Format: Op6, Rs5, Rt5, Rd5, sa5, funct6
- RA & RB come from the instruction’s Rs & Rt fields
- RW comes from the Rd field
- ALU inputs come from BusA & BusB
- ALU result is connected to BusW
- Control signals
  - ALUCtrl is derived from the funct field because Op = 0 for R-type
  - RegWrite is used to enable the writing of the ALU result
[Figure: PC and Instruction Memory supply Rs, Rt, Rd to the RA, RB, RW inputs of Registers; BusA and BusB feed the ALU; the ALU result returns on BusW; control inputs RegWrite and ALUCtrl]

Datapath for I-type ALU Instructions
- Format: Op6, Rs5, Rt5, immediate16
- RW now comes from Rt, instead of Rd
- RB and BusB are not used
- Second ALU input comes from the extended immediate
- Control signals
  - ALUCtrl is derived from the Op field
  - RegWrite is used to enable the writing of the ALU result
  - ExtOp is used to control the extension of the 16-bit immediate
[Figure: PC and Instruction Memory supply Rs to RA and Rt to RW of Registers; Imm16 passes through the Extender (ExtOp) to the second ALU input; the ALU result returns on BusW; control inputs RegWrite and ALUCtrl]

Combining R-type & I-type Datapaths
- A mux selects RW as either Rt or Rd
- Another mux selects 2nd ALU input as either source register Rt data on BusB or the extended immediate
- Control signals
  - ALUCtrl is derived from either the Op or the funct field
  - RegWrite enables the writing of the ALU result
  - ExtOp controls the extension of the 16-bit immediate
  - RegDst selects the register destination as either Rt or Rd
  - ALUSrc selects the 2nd ALU source as BusB or extended immediate
[Figure: combined datapath with a RegDst mux on RW (0 = Rt, 1 = Rd) and an ALUSrc mux on the second ALU input (0 = BusB, 1 = extended Imm16)]

Controlling ALU Instructions
- For R-type ALU instructions, RegDst is ‘1’ to select Rd on RW and ALUSrc is ‘0’ to select BusB as second ALU input; RegWrite = 1
- For I-type ALU instructions, RegDst is ‘0’ to select Rt on RW and ALUSrc is ‘1’ to select Extended immediate as second ALU input; RegWrite = 1
- The active part of the datapath is shown in green
[Figure: two copies of the combined R-type and I-type datapath, one with the R-type settings and one with the I-type settings, active paths highlighted]

Details of the Extender
- Two types of extensions
  - Zero-extension for unsigned constants
  - Sign-extension for signed constants
- Control signal ExtOp indicates type of extension
  - ExtOp = 0: Upper16 = 0
  - ExtOp = 1: Upper16 = sign bit
- Extender Implementation: wiring and one AND gate
[Figure: Imm16 is wired to the lower 16 bits of the output; ExtOp ANDed with the sign bit of Imm16 drives all of the upper 16 bits]
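The "wiring and one AND gate" idea can be sketched in a few lines. The function name `extend` is an assumption made here for illustration; the single AND of ExtOp with the sign bit decides whether the upper 16 bits are all ones or all zeros.

```python
def extend(imm16, ext_op):
    """Extend a 16-bit immediate to 32 bits: zero-extend if ExtOp = 0, sign-extend if ExtOp = 1."""
    sign_bit = (imm16 >> 15) & 1
    fill = ext_op & sign_bit                 # the one AND gate
    upper16 = 0xFFFF if fill else 0x0000     # every upper bit is a copy of `fill`
    return (upper16 << 16) | (imm16 & 0xFFFF)

sign_extended = extend(0x8000, ext_op=1)     # negative immediate, sign-extended
zero_extended = extend(0x8000, ext_op=0)     # same bits, zero-extended
```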

Adding Data Memory to Datapath
- A data memory is added for load and store instructions
- ALU calculates data memory address
- BusB is connected to Data_in of Data Memory for store instructions
- A 3rd mux selects data on BusW as either ALU result or memory data_out
- Additional Control signals
  - MemRead for load instructions
  - MemWrite for store instructions
  - MemtoReg selects data on BusW as ALU result or Memory Data_out
[Figure: combined datapath extended with a Data Memory (Address from ALU result, Data_in from BusB, Data_out) and a MemtoReg mux on BusW (0 = ALU result, 1 = Data_out); control inputs RegDst, RegWrite, ExtOp, ALUSrc, ALUCtrl, MemRead, MemWrite, MemtoReg]

Controlling the Execution of Load
- RegDst = ‘0’ selects Rt as destination register
- RegWrite = ‘1’ to write the memory data on BusW to register Rt
- ExtOp = ‘sign’ to sign-extend Immediate16 to 32 bits
- ALUSrc = ‘1’ selects extended immediate as second ALU input
- ALUCtrl = ‘ADD’ to calculate data memory address as Reg(Rs) + sign-extend(Imm16)
- MemRead = ‘1’ to read data memory
- MemWrite = ‘0’
- MemtoReg = ‘1’ places the data read from memory on BusW
[Figure: datapath with data memory, annotated with the control signal values for a load]

Controlling the Execution of Store
- RegDst = ‘x’ because no destination register
- RegWrite = ‘0’ because no register is written by the store instruction
- ExtOp = ‘sign’ to sign-extend Immediate16 to 32 bits
- ALUSrc = ‘1’ to select the extended immediate as second ALU input
- ALUCtrl = ‘ADD’ to calculate data memory address as Reg(Rs) + sign-extend(Imm16)
- MemRead = ‘0’
- MemWrite = ‘1’ to write data memory
- MemtoReg = ‘x’ because we don’t care what data is placed on BusW
[Figure: datapath with data memory, annotated with the control signal values for a store]

Adding Jump and Branch to Datapath
- Next PC computes jump or branch target instruction address
- For Branch, ALU does a subtraction
- Additional Control Signals
  - J, Beq, Bne for jump and branch instructions
  - Zero condition of the ALU is examined
  - PCSrc = 1 for Jump & taken Branch
[Figure: datapath extended with a Next PC block (inputs Imm26, Imm16, incremented PC, zero, and J, Beq, Bne; outputs the 30-bit jump or branch target address and PCSrc) and a PCSrc mux that selects the next value of the upper 30 bits of PC]

Details of Next PC
- Imm16 is sign-extended to 30 bits (the most-significant bit is replicated) and added to the 30-bit incremented PC to form the branch target address
- Jump target address: upper 4 bits of PC are concatenated with Imm26
- PCSrc = J + (Beq · Zero) + (Bne · NOT Zero)
[Figure: Next PC block with an adder (Inc PC + sign-extended Imm16), a mux controlled by J that selects between the branch target and the jump target (upper 4 bits of PC, Imm26), and the gates that form PCSrc from J, Beq, Bne, and Zero]
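The Next PC logic can be sketched on 30-bit word addresses as below. The function name `next_pc` and its argument order are assumptions made for illustration; Bne is combined with the complement of Zero, since a bne branch is taken when the ALU result is not zero, and the jump target takes its upper 4 bits from the incremented PC.

```python
MASK30 = 0x3FFFFFFF

def next_pc(pc30, imm16, imm26, j, beq, bne, zero):
    """Return the next upper-30-bit PC value given the control and zero inputs."""
    inc_pc = (pc30 + 1) & MASK30
    pc_src = j | (beq & zero) | (bne & (zero ^ 1))
    if not pc_src:
        return inc_pc                                   # sequential execution
    if j:
        # upper 4 bits of the 30-bit incremented PC, concatenated with Imm26
        return (inc_pc & 0x3C000000) | (imm26 & 0x3FFFFFF)
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16   # sign-extend Imm16
    return (inc_pc + offset) & MASK30                   # branch target

taken = next_pc(0x40, imm16=0xFFFF, imm26=0, j=0, beq=1, bne=0, zero=1)
not_taken = next_pc(0x40, imm16=0xFFFF, imm26=0, j=0, beq=1, bne=0, zero=0)
```

Here `taken` is 0x40 again (incremented PC 0x41 plus an offset of -1), while `not_taken` is the incremented PC 0x41.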

Controlling the Execution of Jump
- J = 1 selects Imm26 as jump target address
- Upper 4 bits are from the incremented PC
- PCSrc = 1 to select jump target address
- MemRead, MemWrite & RegWrite are 0
- We don’t care about RegDst, ExtOp, ALUSrc, ALUCtrl, and MemtoReg
[Figure: full datapath annotated with the control signal values for a jump]

Controlling the Execution of Branch
- Either Beq or Bne = 1
- Next PC outputs branch target address
- ALUSrc = ‘0’ (2nd ALU input is BusB)
- ALUCtrl = ‘SUB’ produces zero flag
- Next PC logic determines PCSrc according to zero flag
- MemRead = MemWrite = RegWrite = 0
- RegDst = ExtOp = MemtoReg = x
[Figure: full datapath annotated with the control signal values for a branch]

Next...
- Designing a Processor: Step-by-Step
- Datapath Components and Clocking
- Assembling an Adequate Datapath
- Controlling the Execution of Instructions
- The Main Controller and ALU Controller
- Worst case timing

Main Control and ALU Control
- Main Control
  - Input: 6-bit opcode field from instruction
  - Output: 10 control signals for datapath
  - Output: ALUOp for ALU Control
- ALU Control
  - Input: 6-bit function field from instruction
  - Input: ALUOp from main control
  - Output: ALUCtrl signal for ALU
[Figure: Instruction Memory supplies Op6 to Main Control and funct6 to ALU Control; Main Control drives RegDst, RegWrite, ExtOp, ALUSrc, MemRead, MemWrite, MemtoReg, Beq, Bne, J to the Datapath and ALUOp to ALU Control; ALU Control drives ALUCtrl to the ALU]

Single-Cycle Datapath + Control
[Figure: complete single-cycle datapath (PC, Next PC block, Instruction Memory, Registers, Extender, ALU, Data Memory, and the RegDst, ALUSrc, MemtoReg, and PCSrc muxes) with Main Control decoding Op and ALU Ctrl decoding func and ALUOp; control signals RegDst, RegWrite, ExtOp, ALUSrc, ALUCtrl, MemRead, MemWrite, MemtoReg, and J, Beq, Bne]

Main Control Signals

| Signal | Effect when ‘0’ | Effect when ‘1’ |
|---|---|---|
| RegDst | Destination register = Rt | Destination register = Rd |
| RegWrite | None | Destination register is written with the data value on BusW |
| ExtOp | 16-bit immediate is zero-extended | 16-bit immediate is sign-extended |
| ALUSrc | Second ALU operand comes from the second register file output (BusB) | Second ALU operand comes from the extended 16-bit immediate |
| MemRead | None | Data memory is read: Data_out ← Memory[address] |
| MemWrite | None | Data memory is written: Memory[address] ← Data_in |
| MemtoReg | BusW = ALU result | BusW = Data_out from Memory |
| Beq, Bne | PC ← PC + 4 | PC ← Branch target address, if branch is taken |
| J | PC ← PC + 4 | PC ← Jump target address |
| ALUOp | This multi-bit signal specifies the ALU operation as a function of the opcode | |

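Main Control Signal Values

| Op | RegDst | RegWrite | ExtOp | ALUSrc | ALUOp | Beq | Bne | J | MemRead | MemWrite | MemtoReg |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R-type | 1 = Rd | 1 | x | 0=BusB | R-type | 0 | 0 | 0 | 0 | 0 | 0 |
| addi | 0 = Rt | 1 | 1=sign | 1=Imm | ADD | 0 | 0 | 0 | 0 | 0 | 0 |
| slti | 0 = Rt | 1 | 1=sign | 1=Imm | SLT | 0 | 0 | 0 | 0 | 0 | 0 |
| andi | 0 = Rt | 1 | 0=zero | 1=Imm | AND | 0 | 0 | 0 | 0 | 0 | 0 |
| ori | 0 = Rt | 1 | 0=zero | 1=Imm | OR | 0 | 0 | 0 | 0 | 0 | 0 |
| xori | 0 = Rt | 1 | 0=zero | 1=Imm | XOR | 0 | 0 | 0 | 0 | 0 | 0 |
| lw | 0 = Rt | 1 | 1=sign | 1=Imm | ADD | 0 | 0 | 0 | 1 | 0 | 1 |
| sw | x | 0 | 1=sign | 1=Imm | ADD | 0 | 0 | 0 | 0 | 1 | x |
| beq | x | 0 | x | 0=BusB | SUB | 1 | 0 | 0 | 0 | 0 | x |
| bne | x | 0 | x | 0=BusB | SUB | 0 | 1 | 0 | 0 | 0 | x |
| j | x | 0 | x | x | x | 0 | 0 | 1 | 0 | 0 | x |

- X is a don’t care (can be 0 or 1), used to minimize logic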

Logic Equations for Control Signals
- RegDst <= R-type
- RegWrite <= NOT (sw + beq + bne + j)
- ExtOp <= NOT (andi + ori + xori)
- ALUSrc <= NOT (R-type + beq + bne)
- MemRead <= lw
- MemWrite <= sw
- MemtoReg <= lw
[Figure: a Decoder turns Op6 into one line per instruction (R-type, addi, slti, andi, ori, xori, lw, sw, beq, bne, j); the Logic Equations block combines these lines into RegDst, RegWrite, ExtOp, ALUSrc, MemRead, MemWrite, MemtoReg, ALUop, Beq, Bne, and J]
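The decoder-plus-equations structure can be sketched as a lookup followed by Boolean expressions. The names `OPCODES` and `main_control` are assumptions made for illustration. RegWrite, ExtOp, and ALUSrc are computed as the complement of the listed instruction sums, which matches the table of main control signal values; don't-care entries therefore come out as whatever the equations yield.

```python
# Opcode values from the MIPS subset table
OPCODES = {0x00: "R-type", 0x08: "addi", 0x0A: "slti", 0x0C: "andi",
           0x0D: "ori", 0x0E: "xori", 0x23: "lw", 0x2B: "sw",
           0x04: "beq", 0x05: "bne", 0x02: "j"}

def main_control(op6):
    """Return the 10 datapath control signals for a 6-bit opcode."""
    name = OPCODES[op6]                      # the decoder: one active line

    def any_of(*names):
        """OR of the decoder lines named in `names`."""
        return int(name in names)

    return {
        "RegDst":   any_of("R-type"),
        "RegWrite": 1 - any_of("sw", "beq", "bne", "j"),
        "ExtOp":    1 - any_of("andi", "ori", "xori"),
        "ALUSrc":   1 - any_of("R-type", "beq", "bne"),
        "MemRead":  any_of("lw"),
        "MemWrite": any_of("sw"),
        "MemtoReg": any_of("lw"),
        "Beq":      any_of("beq"),
        "Bne":      any_of("bne"),
        "J":        any_of("j"),
    }

lw_signals = main_control(0x23)
```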

ALU Control Truth Table

| Op6 | ALUOp | funct6 | ALUCtrl | 4-bit Encoding |
|---|---|---|---|---|
| R-type | R-type | add | ADD | 0000 |
| R-type | R-type | sub | SUB | 0010 |
| R-type | R-type | and | AND | 0100 |
| R-type | R-type | or | OR | 0101 |
| R-type | R-type | xor | XOR | 0110 |
| R-type | R-type | slt | SLT | 1010 |
| addi | ADD | x | ADD | 0000 |
| slti | SLT | x | SLT | 1010 |
| andi | AND | x | AND | 0100 |
| ori | OR | x | OR | 0101 |
| xori | XOR | x | XOR | 0110 |
| lw | ADD | x | ADD | 0000 |
| sw | ADD | x | ADD | 0000 |
| beq | SUB | x | SUB | 0010 |
| bne | SUB | x | SUB | 0010 |
| j | x | x | x | x |

- The 4-bit encoding for ALUCtrl is chosen here to be equal to the last 4 bits of the function field
- Other binary encodings are also possible. The idea is to choose a binary encoding that will minimize the logic for ALU Control
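Because the ALUCtrl encoding equals the last 4 bits of the function field, the ALU control reduces to a pass-through for R-type and a small lookup otherwise. The sketch below assumes ALUOp is represented as a string and uses the hypothetical names `ALU_CTRL` and `alu_control`; a real implementation would encode ALUOp in bits.

```python
# 4-bit ALUCtrl codes: the low 4 bits of the funct values 0x20, 0x22, 0x24, 0x25, 0x26, 0x2a
ALU_CTRL = {"ADD": 0b0000, "SUB": 0b0010, "AND": 0b0100,
            "OR": 0b0101, "XOR": 0b0110, "SLT": 0b1010}

def alu_control(alu_op, funct6=0):
    """Return the 4-bit ALUCtrl code from ALUOp and the funct field."""
    if alu_op == "R-type":
        return funct6 & 0xF                  # last 4 bits of the function field
    return ALU_CTRL[alu_op]                  # funct is a don't care here

sub_code = alu_control("R-type", funct6=0x22)   # R-type sub
beq_code = alu_control("SUB")                   # beq and bne also use SUB
```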

Next...
- Designing a Processor: Step-by-Step
- Datapath Components and Clocking
- Assembling an Adequate Datapath
- Controlling the Execution of Instructions
- The Main Controller and ALU Controller
- Worst case timing

Worst Case Timing (Load Instruction)
[Timing diagram over one clock cycle, signals listed from top to bottom]
- Clk-to-q: Old PC changes to New PC
- Instruction Memory Access Time: Old Instruction changes to New Instruction = (Op, Rs, Rt, Rd, Funct, Imm16, Imm26)
- Delay Through Control Logic: Old Control Signal Values change to New Control Signal Values (ExtOp, ALUSrc, ALUOp, …)
- Register File Access Time: Old BusA Value changes to New BusA Value = Register(Rs)
- Delay Through Extender and ALU Mux: Old Second ALU Input changes to New Second ALU Input = sign-extend(Imm16)
- ALU Delay: Old ALU Result changes to New ALU Result = Address
- Data Memory Access Time: Old Data Memory Output Value changes to New Value
- Mux delay + Setup time + Clock skew, after which the write occurs at the end of the clock cycle

Worst Case Timing – Cont'd
- Long cycle time: must be long enough for Load operation
  - PC’s Clk-to-Q
  - + Instruction Memory’s Access Time
  - + Maximum of (Register File’s Access Time, Delay through control logic + extender + ALU mux)
  - + ALU to Perform a 32-bit Add
  - + Data Memory Access Time
  - + Delay through MemtoReg Mux
  - + Setup Time for Register File Write
  - + Clock Skew
- Cycle time is longer than needed for other instructions
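The load critical path can be evaluated numerically as a sketch. The function name `load_cycle_time`, the dictionary keys, and every delay value (in picoseconds) are hypothetical and chosen only to illustrate the sum; they are not measurements from the course.

```python
def load_cycle_time(d):
    """Sum the delays along the critical path of a load instruction."""
    # The register file read and the control/extender/mux path proceed in parallel
    operand_path = max(d["regfile_access"],
                       d["control"] + d["extender"] + d["alu_mux"])
    return (d["pc_clk_to_q"] + d["imem_access"] + operand_path
            + d["alu_add"] + d["dmem_access"] + d["memtoreg_mux"]
            + d["regfile_setup"] + d["clock_skew"])

# Hypothetical delays in picoseconds
delays = {"pc_clk_to_q": 30, "imem_access": 200, "regfile_access": 150,
          "control": 50, "extender": 20, "alu_mux": 20, "alu_add": 100,
          "dmem_access": 200, "memtoreg_mux": 20, "regfile_setup": 20,
          "clock_skew": 10}
cycle = load_cycle_time(delays)
print(cycle)  # prints 730
```

With these numbers the register file access (150 ps) dominates the parallel control path (90 ps), and the two memory accesses account for more than half of the 730 ps cycle.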

Summary
- 5 steps to design a processor
  - Analyze instruction set => datapath requirements
  - Select datapath components & establish clocking methodology
  - Assemble datapath meeting the requirements
  - Analyze implementation of each instruction to determine control signals
  - Assemble the control logic
- MIPS makes Control easier
  - Instructions are of same size
  - Source registers always in same place
  - Immediates are of same size and same location
  - Operations are always on registers/immediates