CS 110 Computer Architecture
Lecture 11: Single-Cycle CPU Datapath & Control
Instructor: Sören Schwertfeger
http://shtech.org/courses/ca/
School of Information Science and Technology (SIST), ShanghaiTech University
Slides based on UC Berkeley's CS 61C

A Single Cycle Datapath
[Figure: the complete single-cycle datapath. The instruction fetch unit (PC, +4 adder, branch adder with PC Ext, mux selected by nPC_sel) addresses the instruction memory. Instruction<31:0> supplies Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. A RegDst mux picks Rt or Rd as Rw for the register file (Ra, Rb, busA, busB, busW, RegWr). The extender (ExtOp) and the ALUSrc mux feed the ALU (ALUctr, Equal output). The data memory (MemWr, Adr, Data In) and the MemtoReg mux drive busW.]

Step 3b: Add & Subtract
• R[rd] = R[rs] op R[rt] (e.g., addu rd, rs, rt)
  – Ra, Rb, and Rw come from the instruction's Rs, Rt, and Rd fields
    R-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | rd (15-11, 5 bits) | shamt (10-6, 5 bits) | funct (5-0, 6 bits)
  – ALUctr and RegWr: control logic after decoding the instruction
• The register file and the ALU are already defined
[Figure: register file of 32 x 32-bit registers with address inputs Rw, Ra, Rb (5 bits each, from Rd, Rs, Rt), RegWr and clk; busA and busB (32 bits) feed the ALU, controlled by ALUctr; the 32-bit Result returns on busW.]
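
The field boundaries above can be checked with a few shifts and masks. This is a minimal sketch, not part of the lecture: the function name `decode_rtype` and the dictionary layout are assumptions made for illustration.

```python
def decode_rtype(word):
    """Split a 32-bit MIPS R-type instruction word into its six fields."""
    return {
        "op":    (word >> 26) & 0x3F,  # bits 31-26
        "rs":    (word >> 21) & 0x1F,  # bits 25-21
        "rt":    (word >> 16) & 0x1F,  # bits 20-16
        "rd":    (word >> 11) & 0x1F,  # bits 15-11
        "shamt": (word >> 6) & 0x1F,   # bits 10-6
        "funct": word & 0x3F,          # bits 5-0
    }

# addu $3, $1, $2 assembles to 0x00221821 (op=0, funct=0x21).
fields = decode_rtype(0x00221821)
print(fields)
```

In the datapath, `rs`, `rt`, and `rd` are exactly the values wired to Ra, Rb, and Rw, while `op` and `funct` go to the control logic.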

3c: Logical Op (or) with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
  I-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | immediate (15-0, 16 bits)
  ZeroExt[imm16]: 16 zero bits (31-16) followed by the 16-bit immediate (15-0)
• Writing to the Rt register (not Rd)!!
• What about the Rt read?
[Figure: a RegDst mux selects Rt (0) or Rd (1) as Rw; the ZeroExt block widens imm16 from 16 to 32 bits; an ALUSrc mux selects busB (0) or the extended immediate (1) as the second ALU input.]
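
As a rough illustration of the register transfer for an immediate OR, the sketch below models the register file as a Python list. The names `zero_ext16` and `ori` are hypothetical helpers, and the hardwired-zero behavior of register 0 is deliberately ignored.

```python
def zero_ext16(imm16):
    """Zero-extend a 16-bit immediate: the upper 16 bits become 0."""
    return imm16 & 0xFFFF

def ori(regs, rs, rt, imm16):
    """R[rt] = R[rs] | ZeroExt[imm16]; note the destination is rt, not rd."""
    regs[rt] = (regs[rs] | zero_ext16(imm16)) & 0xFFFFFFFF

regs = [0] * 32
regs[1] = 0xFFFF0000
ori(regs, 1, 2, 0x8001)
print(hex(regs[2]))
```

Because the immediate is zero-extended, an immediate with its top bit set (0x8001 here) does not disturb the upper half of the result.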

3d: Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]
  Example: lw rt, rs, imm16
  I-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | immediate (15-0, 16 bits)
[Figure: datapath for loads. The extender (ExtOp) sign-extends imm16; the ALUSrc mux selects it as the second ALU input; the ALU result is the data memory address (Adr); a MemtoReg mux selects the ALU result (0) or the memory output (1) for busW; the RegDst mux selects Rt (0) or Rd (1) as Rw.]
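
A small sketch of the load transfer, assuming memory is modeled as a Python dict from address to word (a simplification chosen here, not something the slides specify). `sign_ext16` and `lw` are hypothetical helper names.

```python
def sign_ext16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def lw(regs, mem, rs, rt, imm16):
    """R[rt] = Mem[R[rs] + SignExt[imm16]]."""
    addr = (regs[rs] + sign_ext16(imm16)) & 0xFFFFFFFF
    regs[rt] = mem[addr]

regs = [0] * 32
regs[1] = 0x100
mem = {0xFC: 42}
lw(regs, mem, 1, 2, 0xFFFC)  # 0xFFFC sign-extends to -4, so the address is 0xFC
print(regs[2])
```

Sign extension is what allows a negative offset from the base register; with zero extension the same immediate would address 0x100 + 0xFFFC instead.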

3e: Store Operations
• Mem[R[rs] + SignExt[imm16]] = R[rt]
  Example: sw rt, rs, imm16
  I-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | immediate (15-0, 16 bits)
[Figure: datapath for stores. Same as the load datapath, with busB wired to the data memory's Data In port and MemWr driving its write enable (WrEn); the ALU result supplies the address (Adr).]
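
The store transfer mirrors the load, with the data flowing from busB into memory. The sketch below again assumes a dict-based memory; `sign_ext16` and `sw` are illustrative names only.

```python
def sign_ext16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def sw(regs, mem, rs, rt, imm16):
    """Mem[R[rs] + SignExt[imm16]] = R[rt]; no register is written."""
    addr = (regs[rs] + sign_ext16(imm16)) & 0xFFFFFFFF
    mem[addr] = regs[rt]

regs = [0] * 32
regs[1] = 0x200   # base address in rs
regs[2] = 99      # value to store, read from rt
mem = {}
sw(regs, mem, 1, 2, 8)
print(mem)
```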


3f: The Branch Instruction
I-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | immediate (15-0, 16 bits)
beq rs, rt, imm16
  – mem[PC]: fetch the instruction from memory
  – Equal = (R[rs] == R[rt]): calculate the branch condition
  – if (Equal): calculate the next instruction's address
      • PC = PC + 4 + (SignExt(imm16) x 4)
    else
      • PC = PC + 4
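
The three steps of beq can be written out directly. This is a sketch under the assumption that the PC is a 32-bit byte address that wraps around; `next_pc_beq` is a hypothetical name.

```python
def sign_ext16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Return the next PC for beq, given the two register values."""
    equal = rs_val == rt_val              # branch condition
    if equal:
        return (pc + 4 + sign_ext16(imm16) * 4) & 0xFFFFFFFF
    return (pc + 4) & 0xFFFFFFFF

print(hex(next_pc_beq(0x1000, 5, 5, 3)))  # taken: 0x1000 + 4 + 12
print(hex(next_pc_beq(0x1000, 5, 6, 3)))  # not taken: 0x1000 + 4
```

The immediate counts instructions, not bytes, which is why it is multiplied by 4 before being added to PC + 4.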

Datapath for Branch Operations
beq rs, rt, imm16
I-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | immediate (15-0, 16 bits)
• The datapath generates the condition (Equal)
• Already have the mux and the adder; need a special sign extender for the PC (PC Ext) and an equality comparison (sub?)
[Figure: the PC feeds the instruction address and a +4 adder; a second adder adds the PC Ext output (imm16) to PC + 4; a mux controlled by nPC_sel picks the next PC. The register file (Rs, Rt, busA, busB) feeds an equality comparator that produces Equal.]

Instruction Fetch Unit including Branch
I-type format: op (bits 31-26) | rs (25-21) | rt (20-16) | immediate (15-0)
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4; else PC = PC + 4
  (Zero is the ALU's equality output, labeled Equal in the datapath figure)
• How to encode nPC_sel?
  – Direct MUX select?
  – Branch inst. / not branch inst.
• Let's pick the 2nd option
• Q: What logic gate?
[Figure: instruction fetch unit. The PC (clocked by clk) addresses the instruction memory, which outputs Instruction<31:0>; one adder computes PC + 4, a second adds the PC Ext output (imm16); the MUX ctrl signal, derived from nPC_sel and Equal, selects between them.]
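
With nPC_sel meaning "this is a branch instruction", the mux select is the AND of nPC_sel and Equal, which matches the "nPC_sel & Equal" label in the complete datapath figure. The sketch below models that gate; the function name `fetch_unit_next_pc` is an assumption for illustration.

```python
def sign_ext16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def fetch_unit_next_pc(pc, imm16, npc_sel, equal):
    """Model the fetch unit: two adders and a mux driven by an AND gate."""
    mux_ctrl = npc_sel & equal                      # AND gate: branch and condition true
    pc_plus_4 = pc + 4                              # first adder
    branch_target = pc_plus_4 + (sign_ext16(imm16) << 2)  # second adder, imm16 followed by 00
    return (branch_target if mux_ctrl else pc_plus_4) & 0xFFFFFFFF

for npc_sel in (0, 1):
    for equal in (0, 1):
        print(npc_sel, equal, hex(fetch_unit_next_pc(0x400000, 2, npc_sel, equal)))
```

Only the (1, 1) case leaves the sequential path, so a non-branch instruction whose operands happen to be equal still advances to PC + 4.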

Putting it All Together: A Single Cycle Datapath
[Figure: the complete single-cycle datapath. Instruction<31:0> from the instruction memory supplies Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. The fetch unit's mux is driven by nPC_sel & Equal. The RegDst mux selects Rt or Rd as Rw; the register file drives busA and busB; the extender (ExtOp) and ALUSrc mux feed the ALU (ALUctr), which also produces Equal; the data memory (MemWr, Adr, Data In) and the MemtoReg mux drive busW (RegWr).]

Question
What new instruction would need no new datapath hardware?
• A: branch if reg==immediate
• B: add two registers and branch if result zero
• C: store with auto-increment of base address:
  – sw rt, rs, offset // rs incremented by offset after store
• D: shift left logical by two bits
[Figure: the complete single-cycle datapath, repeated for reference.]

Administrivia
• HW 4: just one week!
  – Teaches pointers to functions, threads, and signals
  – Go to discussion today if those topics are new to you!
• Friday: Review for the Midterm

Processor Design: 5 steps
Step 1: Analyze instruction set to determine datapath requirements
  – Meaning of each instruction is given by register transfers
  – Datapath must include storage element for ISA registers
  – Datapath must support each register transfer
Step 2: Select set of datapath components & establish clock methodology
Step 3: Assemble datapath components that meet the requirements
Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
Step 5: Assemble the control logic

Datapath Control Signals
• ExtOp: “zero”, “sign”
• ALUsrc: 0 => regB; 1 => immed
• ALUctr: “ADD”, “SUB”, “OR”
• nPC_sel: 1 => branch
• MemWr: 1 => write memory
• MemtoReg: 0 => ALU; 1 => Mem
• RegDst: 0 => “rt”; 1 => “rd”
• RegWr: 1 => write register
[Figure: the single-cycle datapath with each control signal labeled at the mux, extender, ALU, register file, or memory it controls; the fetch-unit mux is driven by nPC_sel & Equal.]
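
To see how these signals steer the datapath, the following sketch executes one register transfer under a given set of control values. It is a simplified software model, not the lecture's hardware: memory is a dict, the signal dictionaries `ADD` and `LW` are illustrative, and ExtOp for add is a don't-care that is arbitrarily set to "zero" here.

```python
def sign_ext16(imm16):
    """Interpret the low 16 bits as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def datapath_step(ctrl, regs, mem, rs, rt, rd, imm16):
    """Apply one instruction's register transfer under the given control signals."""
    bus_a, bus_b = regs[rs], regs[rt]
    # ExtOp: "zero" or "sign" extension of the 16-bit immediate
    ext = sign_ext16(imm16) if ctrl["ExtOp"] == "sign" else imm16 & 0xFFFF
    # ALUSrc: 0 => regB, 1 => immediate
    alu_b = ext if ctrl["ALUSrc"] else bus_b
    ops = {"ADD": bus_a + alu_b, "SUB": bus_a - alu_b, "OR": bus_a | alu_b}
    alu_out = ops[ctrl["ALUctr"]] & 0xFFFFFFFF
    # MemWr: 1 => write busB to memory at the address given by the ALU
    if ctrl["MemWr"]:
        mem[alu_out] = bus_b
    # RegWr: 1 => write a register; MemtoReg picks the data, RegDst the target
    if ctrl["RegWr"]:
        bus_w = mem[alu_out] if ctrl["MemtoReg"] else alu_out
        regs[rd if ctrl["RegDst"] else rt] = bus_w
    return bus_a == bus_b  # Equal, used by the fetch unit for beq

ADD = {"ExtOp": "zero", "ALUSrc": 0, "ALUctr": "ADD",
       "MemWr": 0, "MemtoReg": 0, "RegDst": 1, "RegWr": 1}
LW = {"ExtOp": "sign", "ALUSrc": 1, "ALUctr": "ADD",
      "MemWr": 0, "MemtoReg": 1, "RegDst": 0, "RegWr": 1}

regs = [0] * 32
regs[1], regs[2] = 5, 7
datapath_step(ADD, regs, {}, 1, 2, 3, 0)
print(regs[3])
```

The same function handles every instruction; only the control dictionary changes, which is the point of separating datapath from control.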

Given Datapath: RTL Control
• The instruction memory, addressed by the PC, outputs Instruction<31:0>, which is split into fields: Op <26:31>, Rs <21:25>, Rt <16:20>, Rd <11:15>, Fun <0:5>, Imm16 <0:15>
• Control takes Op and Fun as inputs and produces: nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg
• The control signals, together with Rs, Rt, Rd, and Imm16, drive the DATA PATH

RTL: The Add Instruction
R-type format: op (bits 31-26, 6 bits) | rs (25-21, 5 bits) | rt (20-16, 5 bits) | rd (15-11, 5 bits) | shamt (10-6, 5 bits) | funct (5-0, 6 bits)
add rd, rs, rt
  – MEM[PC]: fetch the instruction from memory
  – R[rd] = R[rs] + R[rt]: the actual operation
  – PC = PC + 4: calculate the next instruction's address

Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from instruction memory: Instruction = MEM[PC]
  – Same for all instructions
[Figure: instruction fetch unit. The PC (clk) addresses the instruction memory, which outputs Instruction<31:0>; adders compute PC + 4 and the branch target from PC Ext(imm16); a mux controlled by nPC_sel selects the next PC.]

Single Cycle Datapath during Add
R-type format: op (bits 31-26) | rs (25-21) | rt (20-16) | rd (15-11) | shamt (10-6) | funct (5-0)
• R[rd] = R[rs] + R[rt]
• Control settings: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemWr=0, MemtoReg=0
[Figure: the single-cycle datapath annotated with the control settings above; the instruction fetch unit supplies Instruction<31:0>, from which Rs, Rt, Rd, and Imm16 are taken.]

Instruction Fetch Unit at End of Add
• PC = PC + 4
  – Same for all instructions except Branch and Jump
[Figure: instruction fetch unit with nPC_sel=+4, so the mux passes the output of the +4 adder to the PC.]

Single Cycle Datapath during Jump
J-type format: op (bits 31-26) | target address (25-0)
jump
• New PC = { PC[31..28], target address, 00 }
• Control settings to determine: Jump, nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr (0 on the slide), MemtoReg
[Figure: the single-cycle datapath with the instruction fetch unit receiving a new Jump signal and the 26-bit target address TA (Instruction<0:25>); the control values are left blank.]

Single Cycle Datapath during Jump
J-type format: op (bits 31-26) | target address (25-0)
jump
• New PC = { PC[31..28], target address, 00 }
• Control settings: Jump=1, nPC_sel=?, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=x, ALUctr=x, MemWr=0, MemtoReg=x
[Figure: the single-cycle datapath annotated with the control settings above; the instruction fetch unit receives Jump and the 26-bit target address TA (Instruction<0:25>).]

Instruction Fetch Unit at the End of Jump
J-type format: op (bits 31-26) | target address (25-0)
jump
• New PC = { PC[31..28], target address, 00 }
• How do we modify this to account for jumps?
[Figure: the branch-capable instruction fetch unit. nPC_sel and Zero form nPC_MUX_sel, which selects between PC + 4 and PC + 4 + imm16 followed by 00; the Jump input is not yet connected.]

Instruction Fetch Unit at the End of Jump
J-type format: op (bits 31-26) | target address (25-0)
jump
• New PC = { PC[31..28], target address, 00 }
Query
• Can Zero still get asserted?
• Does nPC_sel need to be 0?
• If not, what?
[Figure: the instruction fetch unit extended for jumps. A second mux, controlled by Jump, selects between the output of the branch mux (nPC_MUX_sel) and the jump target formed from the 4 MSBs of the PC, the 26-bit TA, and 00.]
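
The concatenation { PC[31..28], target address, 00 } and the extra Jump mux can be sketched as follows. The function names are hypothetical, and `npc_mux_out` stands for whatever the branch mux already selected (PC + 4 or the branch target).

```python
def jump_target(pc, target26):
    """Build { PC[31..28], target address, 00 } from the PC and the 26-bit field."""
    upper = pc & 0xF0000000                  # keep the 4 most significant PC bits
    return upper | ((target26 & 0x03FFFFFF) << 2)  # append the target and two 0 bits

def select_next_pc(pc, npc_mux_out, target26, jump):
    """Second mux: Jump=1 overrides the branch/sequential path."""
    return jump_target(pc, target26) if jump else npc_mux_out

print(hex(jump_target(0x10000004, 0x10)))
print(hex(select_next_pc(0x10000004, 0x10000008, 0x10, 0)))
```

Since the Jump mux sits after the branch mux in this sketch, the value chosen by nPC_MUX_sel has no effect on the next PC when Jump is 1.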

Question
Which of the following is TRUE?
A. The clock can have a shorter period for instructions that don't use memory
B. The ALU is used to set PC to PC+4 when necessary
C. Worst-delay path in the Instruction Fetch unit is Add+mux delay
D. The CPU's control needs only the opcode to determine the next PC value to select
E. nPC_sel affects the next PC address on a jump

Summary of the Control Signals (1/2)
inst   Register Transfer
add    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
sub    R[rd] ← R[rs] – R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
ori    R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
       ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
lw     R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
sw     MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
beq    if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00) else PC ← PC + 4
       nPC_sel=“br”, ALUctr=“SUB”

Summary of the Control Signals (2/2)
See Appendix A for the func and op encodings.

             add      sub      ori      lw       sw       beq      jump
func         10 0000  10 0010  (We Don't Care :-) for the remaining instructions)
op           00 0000  00 0000  00 1101  10 0011  10 1011  00 0100  00 0010
RegDst       1        1        0        0        x        x        x
ALUSrc       0        0        1        1        1        0        x
MemtoReg     0        0        0        1        x        x        x
RegWrite     1        1        1        1        0        0        0
MemWrite     0        0        0        0        1        0        0
nPCsel       0        0        0        0        0        1        ?
Jump         0        0        0        0        0        0        1
ExtOp        x        x        0        1        1        x        x
ALUctr<2:0>  Add      Subtract Or       Add      Add      Subtract x

Instruction formats:
R-type: op (31-26) | rs (25-21) | rt (20-16) | rd (15-11) | shamt (10-6) | funct (5-0)    add, sub
I-type: op (31-26) | rs (25-21) | rt (20-16) | immediate (15-0)                           ori, lw, sw, beq
J-type: op (31-26) | target address (25-0)                                                jump

Boolean Expressions for Controller

Instruction encodings (s = rs, t = rt, d = rd, i = immediate or target):
ADD   0000 00ss ssst tttt dddd d000 0010 0000
SUB   0000 00ss ssst tttt dddd d000 0010 0010
ORI   0011 01ss ssst tttt iiii iiii iiii iiii
LW    1000 11ss ssst tttt iiii iiii iiii iiii
SW    1010 11ss ssst tttt iiii iiii iiii iiii
BEQ   0001 00ss ssst tttt iiii iiii iiii iiii
JUMP  0000 10ii iiii iiii iiii iiii iiii iiii

Control equations:
RegDst = add + sub
ALUSrc = ori + lw + sw
MemtoReg = lw
RegWrite = add + sub + ori + lw
MemWrite = sw
nPCsel = beq
Jump = jump
ExtOp = lw + sw
ALUctr[0] = sub + beq (assume ALUctr is 00 ADD, 01 SUB, 10 OR)
ALUctr[1] = ori

Where:
rtype = ~op5 ∙ ~op4 ∙ ~op3 ∙ ~op2 ∙ ~op1 ∙ ~op0
ori = ~op5 ∙ ~op4 ∙ op3 ∙ op2 ∙ ~op1 ∙ op0
lw = op5 ∙ ~op4 ∙ ~op3 ∙ ~op2 ∙ op1 ∙ op0
sw = op5 ∙ ~op4 ∙ op3 ∙ ~op2 ∙ op1 ∙ op0
beq = ~op5 ∙ ~op4 ∙ ~op3 ∙ op2 ∙ ~op1 ∙ ~op0
jump = ~op5 ∙ ~op4 ∙ ~op3 ∙ ~op2 ∙ op1 ∙ ~op0
add = rtype ∙ func5 ∙ ~func4 ∙ ~func3 ∙ ~func2 ∙ ~func1 ∙ ~func0
sub = rtype ∙ func5 ∙ ~func4 ∙ ~func3 ∙ ~func2 ∙ func1 ∙ ~func0

How do we implement this in gates?
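
The equations translate almost line by line into code. In this sketch each per-instruction term is computed by comparing the whole 6-bit field, which is equivalent to the AND of the individual opcode bits; the function name `control` and the dictionary keys are assumptions, and don't-care outputs simply come out as 0.

```python
def control(op, func):
    """Return the control signals for a 6-bit opcode and 6-bit funct field."""
    # "AND" plane: one term per instruction
    rtype = op == 0b000000
    add = rtype and func == 0b100000
    sub = rtype and func == 0b100010
    ori = op == 0b001101
    lw = op == 0b100011
    sw = op == 0b101011
    beq = op == 0b000100
    jump = op == 0b000010
    # "OR" plane: each control signal is a sum of instruction terms
    return {
        "RegDst": int(add or sub),
        "ALUSrc": int(ori or lw or sw),
        "MemtoReg": int(lw),
        "RegWrite": int(add or sub or ori or lw),
        "MemWrite": int(sw),
        "nPCsel": int(beq),
        "Jump": int(jump),
        "ExtOp": int(lw or sw),
        # ALUctr encoding assumed on the slide: 00 ADD, 01 SUB, 10 OR
        "ALUctr": (int(ori) << 1) | int(sub or beq),
    }

print(control(0b100011, 0))         # lw
print(control(0b000000, 0b100010))  # sub
```

Structuring the function as term generation followed by signal sums mirrors a two-level AND/OR gate implementation such as a PLA.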

Controller Implementation
• Inputs: opcode, func
• “AND” logic produces one signal per instruction: add, sub, ori, lw, sw, beq, jump
• “OR” logic combines those signals into the control outputs: RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, Jump, ExtOp, ALUctr[0], ALUctr[1]

[Figure: P&H Figure 4.17]

Summary: Single-cycle Processor
• Five steps to design a processor:
  1. Analyze instruction set to determine datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: the components of a computer: Processor (Control and Datapath), Memory, Input, Output.]

Levels of Representation/Interpretation
• High Level Language Program (e.g., C)
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  ↓ Compiler
• Assembly Language Program (e.g., MIPS)
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
  ↓ Assembler
• Machine Language Program (MIPS)
    0000 1010 1100 0101 1001 1111 0110 1000
    1100 0101 1010 0000 0110 1000 1111 1001
    1010 0000 0101 1100 1111 1000 0110
    0101 1100 0000 1010 1000 0110 1001 1111
  ↓ Machine Interpretation
• Hardware Architecture Description (e.g., block diagrams)
  ↓ Architecture Implementation
• Logic Circuit Description (Circuit Schematic Diagrams)
Anything can be represented as a number, i.e., data or instructions

No More Magic!
• Software: Application (ex: browser), Compiler, Assembler, Operating System (Mac OSX)
• Instruction Set Architecture
• Hardware: Processor, Memory, I/O system
  – Datapath & Control
  – Digital Design
  – Circuit Design
  – Transistors