CS 61C Great Ideas in Computer Architecture


CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Single Cycle MIPS CPU
Instructors: Randy H. Katz, David A. Patterson
http://inst.eecs.berkeley.edu/~cs61c/sp11
10/3/2020 Spring 2011 -- Lecture #18

You Are Here!
Software
• Parallel Requests: assigned to computer, e.g., Search “Katz”
• Parallel Threads: assigned to core, e.g., Lookup, Ads
• Parallel Instructions: >1 instruction @ one time, e.g., 5 pipelined instructions
• Parallel Data: >1 data item @ one time, e.g., Add of 4 pairs of words
• Hardware descriptions: all gates functioning in parallel at same time
Hardware: Harness Parallelism & Achieve High Performance
[Figure: hardware hierarchy from Warehouse Scale Computer and Smart Phone, through Computer (Cores, Memory (Cache), Input/Output), Core (Instruction Unit(s), Functional Unit(s) computing A0+B0, A1+B1, A2+B2, A3+B3), and Main Memory, down to Logic Gates; "Today" marks the level covered in this lecture]

Levels of Representation/Interpretation
High Level Language Program (e.g., C):
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
Compiler
Assembly Language Program (e.g., MIPS):
    lw $t0, 0($2)
    lw $t1, 4($2)
    sw $t1, 0($2)
    sw $t0, 4($2)
Assembler
Machine Language Program (MIPS):
    [Figure: the four instructions shown as 32-bit binary words]
Anything can be represented as a number, i.e., data or instructions
Machine Interpretation
Hardware Architecture Description (e.g., block diagrams)
Architecture Implementation
Logic Circuit Description (Circuit Schematic Diagrams)

Review
• Clocks tell us when D flip-flops change
  – Setup and Hold times important
• We pipeline long-delay CL (combinational logic) for a faster clock
• Finite State Machines extremely useful
• Use muxes to select among inputs
  – S select bits choose among 2^S inputs
  – Each input can be n bits wide, independent of S
• Can implement muxes hierarchically
• Can implement FSM with register + logic

Agenda
• MIPS-lite Datapath
• Administrivia
• CPU Timing
• MIPS-lite Control
• Datapath Control
• Technology Break
• Control Implementation

The MIPS-lite Subset
• ADDU and SUBU
  – addu rd, rs, rt
  – subu rd, rs, rt
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), rd [15:11] (5 bits), shamt [10:6] (5 bits), funct [5:0] (6 bits)
• OR Immediate
  – ori rt, rs, imm16
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
• LOAD and STORE Word
  – lw rt, rs, imm16
  – sw rt, rs, imm16
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
• BRANCH
  – beq rs, rt, imm16
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
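The field layouts above can be sketched as bit shifts and masks. This is a minimal illustration, not part of the lecture: the function names are hypothetical, and the funct value 10 0000 for an R-type add is taken from the control-signal summary later in the deck.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack R-type fields into one 32-bit word (op is bits 31..26, funct is bits 5..0)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

def encode_i(op, rs, rt, imm16):
    """Pack I-type fields; the immediate occupies bits 15..0."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)

def decode(word):
    """Split a 32-bit word into every field; the control logic decides which ones matter."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "imm16": word & 0xFFFF,
    }

# R-type add with rd=3, rs=1, rt=2 (op 00 0000, funct 10 0000)
word = encode_r(0b000000, 1, 2, 3, 0, 0b100000)
fields = decode(word)
```

Note that `decode` extracts rd, shamt, funct, and imm16 from the same low 16 bits; in hardware these are just different wire groupings of the same instruction word.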

Processor Design Process
• Five steps to design a processor:
  Step 1: Analyze instruction set to determine datapath requirements (see next slide)
  Step 2: Select set of datapath components & establish clocking methodology
  Step 3: Assemble datapath components that meet the requirements
  Step 4: Analyze implementation of each instruction to determine setting of control points that realizes the register transfer
  Step 5: Assemble the control logic

Register Transfer Language (RTL)
• RTL gives the meaning of the instructions
• All start by fetching the instruction:
  {op, rs, rt, rd, shamt, funct} ← MEM[PC]
  {op, rs, rt, Imm16} ← MEM[PC]

Inst    Register Transfers
ADDU    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
SUBU    R[rd] ← R[rs] – R[rt]; PC ← PC + 4
ORI     R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
LOAD    R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
STORE   MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
BEQ     if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00) else PC ← PC + 4
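The register transfers in the table can be read as a tiny interpreter. The sketch below is an illustration under assumptions: `step`, `R`, and `MEM` are hypothetical names, memory is modeled as a Python dict keyed by address, and special handling of register 0 is ignored.

```python
MASK32 = 0xFFFFFFFF

def sign_ext(imm16):
    """Interpret the low 16 bits as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def zero_ext(imm16):
    """Pad the low 16 bits with zeros on the left."""
    return imm16 & 0xFFFF

def step(inst, rs, rt, rd, imm16, R, MEM, pc):
    """Apply one instruction's register transfers and return the new PC."""
    next_pc = (pc + 4) & MASK32
    if inst == "ADDU":
        R[rd] = (R[rs] + R[rt]) & MASK32
    elif inst == "SUBU":
        R[rd] = (R[rs] - R[rt]) & MASK32
    elif inst == "ORI":
        R[rt] = R[rs] | zero_ext(imm16)
    elif inst == "LOAD":
        R[rt] = MEM.get((R[rs] + sign_ext(imm16)) & MASK32, 0)
    elif inst == "STORE":
        MEM[(R[rs] + sign_ext(imm16)) & MASK32] = R[rt]
    elif inst == "BEQ":
        if R[rs] == R[rt]:
            # (sign_ext(Imm16) || 00) is the sign-extended offset shifted left by 2
            next_pc = (pc + 4 + (sign_ext(imm16) << 2)) & MASK32
    else:
        raise ValueError(f"not in the MIPS-lite subset: {inst}")
    return next_pc
```

For example, with `R[1] = 5` and `R[2] = 7`, calling `step("ADDU", 1, 2, 3, 0, R, {}, 0)` leaves 12 in `R[3]` and returns 4 as the next PC.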

Step 1: Requirements of the Instruction Set
• Memory (MEM)
  – Instructions & data (will use one for each: really caches)
• Registers (R: 32 x 32)
  – Read rs
  – Read rt
  – Write rt or rd
• PC
• Extender (sign/zero extend)
• Add/Sub/OR unit for operation on register(s) or extended immediate
• Add 4 (+ maybe extended immediate) to PC
• Compare if registers equal?

Generic Steps of Datapath
1. Instruction Fetch
2. Decode/Register Read
3. Execute
4. Memory
5. Register Write
[Figure: PC with +4 adder and mux, instruction memory, registers (rd, rs, rt, imm), ALU, and data memory, arranged left to right under the five steps]

Step 2: Components of the Datapath
• Combinational Elements
• Storage Elements + Clocking Methodology
• Building Blocks
  – Adder: 32-bit inputs A and B plus CarryIn; outputs 32-bit Sum and CarryOut
  – Multiplexer (MUX): 32-bit inputs A and B, Select; 32-bit output Y
  – ALU: 32-bit inputs A and B, OP; 32-bit output Result

ALU Needs for MIPS-lite + Rest of MIPS
• Addition, subtraction, logical OR, ==:
  ADDU  R[rd] = R[rs] + R[rt]; ...
  SUBU  R[rd] = R[rs] – R[rt]; ...
  ORI   R[rt] = R[rs] | zero_ext(Imm16) ...
  BEQ   if (R[rs] == R[rt]) ...
• Test to see if output == 0 for any ALU operation gives == test. How?
• P&H also adds AND, Set Less Than (1 if A < B, 0 otherwise)
• ALU from Appendix C, section C.5
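One way to answer the "How?" above: subtract the two operands and check whether the result is zero, which is what the later branch slides do with ALUctr=SUB and the Zero output. A minimal sketch, with `alu` as a hypothetical function name and the operation names taken from the deck's ALUctr values:

```python
MASK32 = 0xFFFFFFFF

def alu(op, a, b):
    """Return (result, zero) for the three MIPS-lite ALU operations."""
    if op == "ADD":
        result = (a + b) & MASK32
    elif op == "SUB":
        result = (a - b) & MASK32
    elif op == "OR":
        result = (a | b) & MASK32
    else:
        raise ValueError(f"unsupported ALU operation: {op}")
    zero = int(result == 0)  # zero-detect on the output, available for any operation
    return result, zero

def registers_equal(a, b):
    """The == test: a - b is zero exactly when a equals b."""
    _, zero = alu("SUB", a, b)
    return bool(zero)
```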

Storage Element: Idealized Memory
• Memory (idealized)
  – One input bus: Data In (32 bits)
  – One output bus: Data Out (32 bits)
• Memory word is found by:
  – Address selects the word to put on Data Out
  – Write Enable = 1: address selects the memory word to be written via the Data In bus
• Clock input (CLK)
  – CLK input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block: Address valid ⇒ Data Out valid after “access time”

Storage Element: Register (Building Block)
• Similar to D Flip Flop except
  – N-bit input and output (Data In, Data Out)
  – Write Enable input
• Write Enable:
  – Negated (or deasserted) (0): Data Out will not change
  – Asserted (1): Data Out will become Data In on rising edge of clock

Storage Element: Register File
• Register File consists of 32 registers:
  – Two 32-bit output busses: busA and busB
  – One 32-bit input bus: busW
• Register is selected by (RA, RB, RW are 5 bits each):
  – RA (number) selects the register to put on busA (data)
  – RB (number) selects the register to put on busB (data)
  – RW (number) selects the register to be written via busW (data) when Write Enable is 1
• Clock input (clk)
  – Clk input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block: RA or RB valid ⇒ busA or busB valid after “access time”
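The split between combinational reads and clocked writes can be sketched in software. This is a behavioral model under assumptions, not the lecture's circuit: the class and method names are hypothetical, and the special behavior of register 0 in real MIPS is left out because the slide does not mention it.

```python
class RegisterFile:
    """32 x 32-bit registers with two read ports and one clocked write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: busA and busB follow RA and RB, no clock involved."""
        return self.regs[ra], self.regs[rb]

    def rising_edge(self, rw, bus_w, write_enable):
        """Clocked write: busW is stored in register RW only when Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF

rf = RegisterFile()
rf.rising_edge(rw=3, bus_w=12, write_enable=1)   # written on this edge
rf.rising_edge(rw=4, bus_w=99, write_enable=0)   # ignored: Write Enable is 0
bus_a, bus_b = rf.read(3, 4)
```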

Step 3: Assemble DataPath Meeting Requirements
• Register Transfer Requirements ⇒ Datapath Assembly
• Instruction Fetch
• Read Operands and Execute Operation
• Common RTL operations
  – Fetch the Instruction: mem[PC]
  – Update the program counter:
    • Sequential Code: PC ← PC + 4
    • Branch and Jump: PC ← “something else”
[Figure: PC register (clk) driven by Next Address Logic; PC supplies Address to Instruction Memory, which outputs the 32-bit Instruction Word]

Step 3: Add & Subtract
• R[rd] = R[rs] op R[rt] (addu rd, rs, rt)
  – Ra, Rb, and Rw come from instruction’s Rs, Rt, and Rd fields
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), rd [15:11] (5 bits), shamt [10:6] (5 bits), funct [5:0] (6 bits)
  – ALUctr and RegWr: control logic after decoding the instruction
• ... Already defined the register file & ALU
[Figure: rd, rs, rt (5 bits each) drive Rw, Ra, Rb of the 32 x 32-bit register file (clk, RegWr); busA and busB (32 bits) feed the ALU (ALUctr); the 32-bit Result returns on busW]

Administrivia
• Project 3, Part 2 due Sunday 4/3
  – Thread Level Parallelism and OpenMP
• Project 4, Part 1 due Sunday 4/10
  – Design a 16-bit pipelined computer in Logisim
• Last homework due Sunday 4/10
• Project 4, Part 2 due Sunday 4/17
• Extra Credit due 4/24
  – Fastest Matrix Multiply
• Final Exam Monday 5/9 11:30-2:30 PM

Project 3 speeds
[Figure: histogram of team results; x-axis: Gflop/s (6 to 18); y-axis: How many teams got it; markers for "Speed of Math Library" and "Fall 61C Average"]

Lines of Code vs. Performance
[Figure: scatter plot; x-axis: Gflop/s (5 to 19); y-axis: Lines of code (16 to 512); labeled points: 18.5 GFLOPS with 535 LOC, and 16.7 GFLOPS with 39 LOC]

Administrivia
• What classes should I take (now)?
• Take classes from great teachers! (teacher > class)
  – Distinguished Teaching Award (very hard to get: ~3/year)
    • http://teaching.berkeley.edu/dta-dept.html
  – HKN Course evaluations (≥ 6 is very good)
    • https://hkn.eecs.berkeley.edu/coursesurveys
  – EECS web site has plan for year (up in late spring: now)
    • http://www.eecs.berkeley.edu/Scheduling/CS/schedule-draft.html
• If you have a choice of multiple great teachers
  – EE 122 Networking
  – CS 152 Computer Architecture and Engineering
  – CS 162 Operating Systems and Systems Programming
  – CS 169 Software Engineering (for SaaS with Fox)
  – CS 194 Engineering Parallel Software (offered in Fall?)

61C in the News
• World’s most admired companies
  – Published by Fortune magazine, March 2011
    • Picked by CEOs of 5000 companies
    • IT companies: 3 of top 5, 1/3 of top 12 in all industries
  – Client and Cloud focus
1 Apple (mobile client)
2 Google (cloud)
3 Berkshire Hathaway (invest)
4 Johnson & Johnson (health)
5 Amazon.com (cloud)
6 Procter & Gamble (consumer)
7 Toyota Motor (car)
8 Goldman Sachs (finance)
9 Wal-Mart Stores (retail)
10 Coca-Cola (beverage)
11 Microsoft (PC->client-cloud)
12 Southwest Airlines (airline)

Agenda
• MIPS-lite Datapath
• Administrivia
• CPU Timing
• MIPS-lite Control
• Datapath Control
• Technology Break
• Control Implementation

Clocking Methodology
• Storage elements clocked by same edge
• “Critical path” (longest path through logic) determines length of clock period
• Have to allow for Clock-to-Q and Setup Times too
• This lecture (and P&H sections 4.3-4.4) does the whole instruction in 1 clock cycle for pedagogic reasons
  – Project 4 will do it in 2 clock cycles via simple pipelining
  – Next week explain pipelining and use 5 clock cycles per instruction
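The bullets above amount to a simple bound: the clock period must cover Clock-to-Q, the critical path, and the setup time. The sketch below illustrates that arithmetic only; every delay value is a made-up number in picoseconds, and `min_clock_period` is a hypothetical helper, not a figure from the lecture.

```python
def min_clock_period(clk_to_q, paths, setup):
    """Shortest legal period: Clock-to-Q + longest combinational path + setup time."""
    critical_path = max(sum(stage_delays) for stage_delays in paths.values())
    return clk_to_q + critical_path + setup

# Hypothetical per-instruction paths through a single-cycle datapath (ps):
# instruction memory, register read, ALU, and (for load) data memory.
paths = {
    "add":  [200, 100, 200],
    "load": [200, 100, 200, 200],
}
period = min_clock_period(clk_to_q=30, paths=paths, setup=20)
```

With these assumed numbers the load path (700 ps) is the critical path, so every instruction, including add, gets a 750 ps cycle. That is the cost of doing the whole instruction in one clock cycle.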

Register-Register Timing: One Complete Cycle
Each signal holds its old value until the new value arrives, in this order after the clock edge:
• PC: Old Value ⇒ New Value
• Rs, Rt, Rd, Op, Func: New Value after Instruction Memory Access Time
• ALUctr, RegWr: New Value after Delay through Control Logic
• busA, busB: New Value after Register File Access Time
• busW: New Value after ALU Delay
• Register Write Occurs Here: at the clock edge that ends the cycle
[Figure: timing diagram for the signals above, shown over the register file (Rw, Ra, Rb, busA, busB, busW, clk, RegWr) and ALU (ALUctr)]

Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
  ZeroExt[imm16]: bits [31:16] = 0000000000000000 (16 bits), bits [15:0] = immediate (16 bits)
• But we’re writing to Rt register??
[Figure: register file (Rw, Ra, Rb driven by Rd, Rs, Rt; RegWr, clk) with busA and busB feeding the ALU (ALUctr) and the Result returning on busW]

Logical Operations with Immediate
• R[rt] = R[rs] op ZeroExt[imm16]
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
  ZeroExt[imm16]: bits [31:16] = 0000000000000000 (16 bits), bits [15:0] = immediate (16 bits)
• What about rt register write??
  – RegDst mux selects Rw: 1 = rd, 0 = rt
  – ALUSrc mux selects the second ALU input: 0 = busB, 1 = ZeroExt[imm16]
• Already defined 32-bit MUX; Zero Ext?
[Figure: the previous datapath with the RegDst mux in front of Rw, the ZeroExt block on imm16, and the ALUSrc mux in front of the ALU]

Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]   Example: lw rt, rs, imm16
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
• What about sign extending??
[Figure: the datapath so far (RegDst mux, register file, ZeroExt on imm16, ALUSrc mux, ALU), which can only zero-extend the immediate]

Load Operations
• R[rt] = Mem[R[rs] + SignExt[imm16]]   Example: lw rt, rs, imm16
  Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), immediate [15:0] (16 bits)
[Figure: datapath with an Extender controlled by ExtOp replacing ZeroExt; the ALU result drives Adr of the Data Memory (WrEn = MemWr, clk, Data In marked "?"); the MemtoReg mux selects ALU result (0) or memory output (1) for busW]
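The Extender controlled by ExtOp can be sketched as a function that either pads with zeros or replicates bit 15. This is a minimal illustration; `extender` is a hypothetical name, and the "zero"/"sign" strings follow the ExtOp values used in the deck.

```python
def extender(imm16, ext_op):
    """Widen a 16-bit immediate to 32 bits; ext_op is "zero" or "sign"."""
    imm16 &= 0xFFFF
    if ext_op == "sign" and imm16 & 0x8000:
        return imm16 | 0xFFFF0000   # copy the sign bit into bits 31..16
    return imm16                    # upper 16 bits are already zero

# ori wants zero extension; lw and sw want sign extension so that
# negative offsets such as -4 (0xFFFC) move the address backward.
ori_operand = extender(0xFFFC, "zero")
lw_offset = extender(0xFFFC, "sign")
```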

RTL: The Add Instruction
add rd, rs, rt
Fields: op [31:26] (6 bits), rs [25:21] (5 bits), rt [20:16] (5 bits), rd [15:11] (5 bits), shamt [10:6] (5 bits), funct [5:0] (6 bits)
  – MEM[PC]: Fetch the instruction from memory
  – R[rd] = R[rs] + R[rt]: The actual operation
  – PC = PC + 4: Calculate the next instruction’s address

Instruction Fetch Unit at the Beginning of Add
• Fetch the instruction from Instruction memory: Instruction = MEM[PC]
  – same for all instructions
[Figure: instruction fetch unit: PC (clk) addresses Inst Memory, which outputs Instruction<31:0>; one adder computes PC + 4; a second adder adds the extended imm16 with "00" appended (PC Ext); a mux controlled by nPC_sel selects the next PC]

Single Cycle Datapath during Add
Fields: op [31:26], rs [25:21], rt [20:16], rd [15:11], shamt [10:6], funct [5:0]
• R[rd] = R[rs] + R[rt]
Control signals: nPC_sel=+4, RegDst=1, RegWr=1, ExtOp=x, ALUSrc=0, ALUctr=ADD, MemWr=0, MemtoReg=0
[Figure: single-cycle datapath (instr fetch unit, RegDst mux, register file, Extender, ALUSrc mux, ALU with zero output, Data Memory, MemtoReg mux); Instruction<31:0> supplies Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>]

Instruction Fetch Unit at End of Add
• PC = PC + 4
  – Same for all instructions except: Branch and Jump
Control signal: nPC_sel=+4
[Figure: instruction fetch unit with the mux selecting the PC + 4 adder output]

Single Cycle Datapath during Or Immediate
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• R[rt] = R[rs] OR ZeroExt[Imm16]
Control signals (to fill in): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=
[Figure: single-cycle datapath with the control signal values left blank]

Single Cycle Datapath during Or Immediate
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• R[rt] = R[rs] OR ZeroExt[Imm16]
Control signals: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=zero, ALUSrc=1, ALUctr=OR, MemWr=0, MemtoReg=0
[Figure: single-cycle datapath with the signal values above]

Single Cycle Datapath during Load
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
Control signals (to fill in): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=
[Figure: single-cycle datapath with the control signal values left blank]

Single Cycle Datapath during Load
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• R[rt] = Data Memory {R[rs] + SignExt[imm16]}
Control signals: nPC_sel=+4, RegDst=0, RegWr=1, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=0, MemtoReg=1
[Figure: single-cycle datapath with the signal values above]

Single Cycle Datapath during Branch
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• if (R[rs] - R[rt] == 0) then Zero = 1; else Zero = 0
Control signals (to fill in): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=
[Figure: single-cycle datapath with the control signal values left blank]

Single Cycle Datapath during Branch
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• if (R[rs] - R[rt] == 0) then Zero = 1; else Zero = 0
Control signals: nPC_sel=br, RegDst=x, RegWr=0, ExtOp=x, ALUSrc=0, ALUctr=SUB, MemWr=0, MemtoReg=x
[Figure: single-cycle datapath with the signal values above]

Instruction Fetch Unit at the End of Branch
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4; else PC = PC + 4
• What is encoding of nPC_sel?
  – Direct MUX select?
  – Branch inst. / not branch
• Let’s pick 2nd option
• Q: What logic gate?
[Figure: instruction fetch unit; nPC_sel and Zero feed the "MUX ctrl" logic that selects between PC + 4 (0) and PC + 4 + extended imm16 with "00" appended (1)]
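With the second encoding (nPC_sel means "this is a branch instruction"), the mux control is asserted only when the instruction is a branch and the ALU reports Zero, which suggests an AND gate. A sketch under that assumption, with `next_pc` as a hypothetical function name:

```python
MASK32 = 0xFFFFFFFF

def next_pc(pc, imm16, npc_sel, zero):
    """Next-address logic for sequential code and beq."""
    mux_ctrl = npc_sel & zero              # AND gate: branch instruction AND operands equal
    pc_plus_4 = (pc + 4) & MASK32
    if mux_ctrl:
        imm16 &= 0xFFFF
        offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16   # SignExt[imm16]
        return (pc_plus_4 + (offset << 2)) & MASK32             # *4 appends "00"
    return pc_plus_4
```

A taken branch with `imm16 = 3` from `pc = 0x100` lands on `0x110`; the same instruction with `zero = 0` falls through to `0x104`.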

Summary: Datapath’s Control Signals
• ExtOp: “zero”, “sign”
• ALUsrc: 0 ⇒ regB; 1 ⇒ immed
• ALUctr: “ADD”, “SUB”, “OR”
• MemWr: 1 ⇒ write memory
• MemtoReg: 0 ⇒ ALU; 1 ⇒ Mem
• RegDst: 0 ⇒ “rt”; 1 ⇒ “rd”
• RegWr: 1 ⇒ write register
[Figure: complete single-cycle datapath: instruction fetch unit (PC, Inst Memory, PC + 4 adder, branch adder, nPC_sel mux), RegDst mux, register file, Extender (ExtOp), ALUSrc mux, ALU (ALUctr), Data Memory (MemWr), MemtoReg mux]
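To see how these signals steer data, the register/ALU/memory portion of the datapath can be modeled as one function evaluated per cycle. This is a behavioral sketch under assumptions: `datapath_cycle` and the dict-based memory are hypothetical, don't-care values are treated as 0 or "zero", and the instruction fetch unit is left out.

```python
MASK32 = 0xFFFFFFFF

def datapath_cycle(ctrl, rs, rt, rd, imm16, R, MEM):
    """Evaluate one cycle for the given control signals; returns the ALU Zero output."""
    bus_a, bus_b = R[rs], R[rt]                       # register file read
    imm = imm16 & 0xFFFF
    if ctrl["ExtOp"] == "sign" and imm & 0x8000:      # Extender
        imm |= 0xFFFF0000
    alu_in_b = imm if ctrl["ALUSrc"] else bus_b       # ALUSrc mux: 0 = regB, 1 = immed
    if ctrl["ALUctr"] == "ADD":
        result = (bus_a + alu_in_b) & MASK32
    elif ctrl["ALUctr"] == "SUB":
        result = (bus_a - alu_in_b) & MASK32
    else:                                             # "OR"
        result = bus_a | alu_in_b
    zero = int(result == 0)
    mem_out = MEM.get(result, 0)                      # combinational memory read
    if ctrl["MemWr"]:
        MEM[result] = bus_b                           # memory write at the clock edge
    bus_w = mem_out if ctrl["MemtoReg"] else result   # MemtoReg mux: 0 = ALU, 1 = Mem
    if ctrl["RegWr"]:
        R[rd if ctrl["RegDst"] else rt] = bus_w       # RegDst mux: 0 = rt, 1 = rd
    return zero

# Signal settings for lw, as listed on the "during Load" slide
LW = {"ExtOp": "sign", "ALUSrc": 1, "ALUctr": "ADD",
      "MemWr": 0, "MemtoReg": 1, "RegDst": 0, "RegWr": 1}

R = [0] * 32
R[1] = 0x100
MEM = {0xFC: 42}
datapath_cycle(LW, rs=1, rt=2, rd=0, imm16=0xFFFC, R=R, MEM=MEM)  # lw with offset -4
```

After the call, `R[2]` holds 42: the extender turns 0xFFFC into -4, the ALU forms the address 0xFC, and the MemtoReg mux routes the memory word to busW.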

Agenda
• MIPS-lite Datapath
• Administrivia
• CPU Timing
• MIPS-lite Control
• Datapath Control
• Technology Break
• Control Implementation

Given Datapath: RTL ⇒ Control
• Inst Memory (Adr) outputs Instruction<31:0>
• Instruction fields: Op <26:31>, Rs <21:25>, Rt <16:20>, Rd <11:15>, Fun <0:5>, Imm16 <0:15>
• Control takes Op and Fun and produces: nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg
• These signals, together with Rs, Rt, Rd, and Imm16, drive the DATA PATH

Summary of the Control Signals (1/2)
inst   Register Transfer
add    R[rd] ← R[rs] + R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“ADD”, RegDst=rd, RegWr, nPC_sel=“+4”
sub    R[rd] ← R[rs] – R[rt]; PC ← PC + 4
       ALUsrc=RegB, ALUctr=“SUB”, RegDst=rd, RegWr, nPC_sel=“+4”
ori    R[rt] ← R[rs] | zero_ext(Imm16); PC ← PC + 4
       ALUsrc=Im, Extop=“Z”, ALUctr=“OR”, RegDst=rt, RegWr, nPC_sel=“+4”
lw     R[rt] ← MEM[R[rs] + sign_ext(Imm16)]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemtoReg, RegDst=rt, RegWr, nPC_sel=“+4”
sw     MEM[R[rs] + sign_ext(Imm16)] ← R[rt]; PC ← PC + 4
       ALUsrc=Im, Extop=“sn”, ALUctr=“ADD”, MemWr, nPC_sel=“+4”
beq    if (R[rs] == R[rt]) then PC ← PC + 4 + (sign_ext(Imm16) || 00) else PC ← PC + 4
       nPC_sel=“br”, ALUctr=“SUB”

Summary of the Control Signals (2/2)
func and op encodings: See Appendix A

              add      sub      ori      lw       sw       beq      jump
func          10 0000  10 0010  We Don’t Care :-)
op            00 0000  00 0000  00 1101  10 0011  10 1011  00 0100  00 0010
RegDst        1        1        0        0        x        x        x
ALUSrc        0        0        1        1        1        0        x
MemtoReg      0        0        0        1        x        x        x
RegWrite      1        1        1        1        0        0        0
MemWrite      0        0        0        0        1        0        0
nPCsel        0        0        0        0        0        1        ?
Jump          0        0        0        0        0        0        1
ExtOp         x        x        0        1        1        x        x
ALUctr<2:0>   Add      Subtract Or       Add      Add      Subtract x

Instruction formats (bits 31 down to 0):
R-type: op [31:26] | rs [25:21] | rt [20:16] | rd [15:11] | shamt [10:6] | funct [5:0]   (add, sub)
I-type: op [31:26] | rs [25:21] | rt [20:16] | immediate [15:0]   (ori, lw, sw, beq)
J-type: op [31:26] | target address [25:0]   (jump)

Boolean Expressions for Controller
RegDst    = add + sub
ALUSrc    = ori + lw + sw
MemtoReg  = lw
RegWrite  = add + sub + ori + lw
MemWrite  = sw
nPCsel    = beq
Jump      = jump
ExtOp     = lw + sw
ALUctr[0] = sub + beq   (assume ALUctr is 00: ADD, 01: SUB, 10: OR)
ALUctr[1] = ori

Where:
rtype = ~op5 · ~op4 · ~op3 · ~op2 · ~op1 · ~op0
ori   = ~op5 · ~op4 · op3 · op2 · ~op1 · op0
lw    = op5 · ~op4 · ~op3 · ~op2 · op1 · op0
sw    = op5 · ~op4 · op3 · ~op2 · op1 · op0
beq   = ~op5 · ~op4 · ~op3 · op2 · ~op1 · ~op0
jump  = ~op5 · ~op4 · ~op3 · ~op2 · op1 · ~op0
add   = rtype · func5 · ~func4 · ~func3 · ~func2 · ~func1 · ~func0
sub   = rtype · func5 · ~func4 · ~func3 · ~func2 · func1 · ~func0

How do we implement this in gates?
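These equations map directly onto an AND stage that recognizes each instruction and an OR stage that forms each control signal. The sketch below models that two-level structure in software; `control` is a hypothetical function name, and the 2-bit ALUctr encoding (00 ADD, 01 SUB, 10 OR) is the one assumed in the equations.

```python
def control(op, func):
    """Evaluate the controller equations for a 6-bit op and 6-bit func."""
    o = [(op >> i) & 1 for i in range(6)]     # o[i] is bit op<i>
    f = [(func >> i) & 1 for i in range(6)]   # f[i] is bit func<i>

    def n(bit):
        return 1 - bit                        # inverter

    # AND stage: one product term per instruction
    rtype = n(o[5]) & n(o[4]) & n(o[3]) & n(o[2]) & n(o[1]) & n(o[0])
    ori = n(o[5]) & n(o[4]) & o[3] & o[2] & n(o[1]) & o[0]
    lw = o[5] & n(o[4]) & n(o[3]) & n(o[2]) & o[1] & o[0]
    sw = o[5] & n(o[4]) & o[3] & n(o[2]) & o[1] & o[0]
    beq = n(o[5]) & n(o[4]) & n(o[3]) & o[2] & n(o[1]) & n(o[0])
    jump = n(o[5]) & n(o[4]) & n(o[3]) & n(o[2]) & o[1] & n(o[0])
    add = rtype & f[5] & n(f[4]) & n(f[3]) & n(f[2]) & n(f[1]) & n(f[0])
    sub = rtype & f[5] & n(f[4]) & n(f[3]) & n(f[2]) & f[1] & n(f[0])

    # OR stage: each control signal is a sum of instruction terms
    return {
        "RegDst": add | sub,
        "ALUSrc": ori | lw | sw,
        "MemtoReg": lw,
        "RegWrite": add | sub | ori | lw,
        "MemWrite": sw,
        "nPCsel": beq,
        "Jump": jump,
        "ExtOp": lw | sw,
        "ALUctr": (ori << 1) | (sub | beq),   # ALUctr[1], ALUctr[0]
    }

lw_signals = control(0b100011, 0)
```

Because don't-care entries in the truth table simply fall out as 0 here, the outputs match the table wherever the table specifies a value.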

Controller Implementation
• Inputs: opcode, func
• “AND” logic produces one signal per instruction: add, sub, ori, lw, sw, beq, jump
• “OR” logic combines them into the control signals: RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, nPCsel, Jump, ExtOp, ALUctr[0], ALUctr[1]

AND Control in Logisim
[Figure: the “AND” control logic drawn in Logisim]

Summary: Single-cycle Processor
• Five steps to design a processor:
  1. Analyze instruction set ⇒ datapath requirements
  2. Select set of datapath components & establish clock methodology
  3. Assemble datapath meeting the requirements
  4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  5. Assemble the control logic
     • Formulate Logic Equations
     • Design Circuits
[Figure: computer organization: Processor (Control, Datapath), Memory, Input, Output]

Single Cycle Datapath during Store
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
Control signals (to fill in): nPC_sel=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=
[Figure: single-cycle datapath with the control signal values left blank]

Single Cycle Datapath during Store
Fields: op [31:26], rs [25:21], rt [20:16], immediate [15:0]
• Data Memory {R[rs] + SignExt[imm16]} = R[rt]
Control signals: nPC_sel=+4, RegDst=x, RegWr=0, ExtOp=sign, ALUSrc=1, ALUctr=ADD, MemWr=1, MemtoReg=x
[Figure: single-cycle datapath with the signal values above]

Single Cycle Datapath during Jump
J-type fields: op [31:26], target address [25:0]
• New PC = { PC[31..28], target address, 00 }
Control signals (to fill in): nPC_sel=, Jump=, RegDst=, RegWr=, ExtOp=, ALUSrc=, ALUctr=, MemWr=, MemtoReg=
[Figure: single-cycle datapath in which the Instruction Fetch Unit also receives the 26-bit target address TA <0:25> and the Jump signal; control signal values left blank]

Instruction Fetch Unit at the End of Jump
J-type fields: op [31:26], target address [25:0]
• New PC = { PC[31..28], target address, 00 }
• How do we modify this to account for jumps?
[Figure: instruction fetch unit from the branch slide (PC, Inst Memory, PC + 4 adder, branch adder, mux selected by nPC_MUX_sel derived from nPC_sel and Zero), with the Jump signal not yet connected]
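One possible answer to the question above is to add a second mux after the branch mux, selected by Jump, whose other input is the concatenated jump address. The sketch below follows the slide's formula literally (upper bits taken from PC); the function names are hypothetical.

```python
def jump_target(pc, target26):
    """{ PC[31..28], target address, 00 }: keep the top 4 PC bits, append target and "00"."""
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)

def select_next_pc(branch_or_sequential_pc, jump_pc, jump):
    """Extra mux controlled by Jump, placed after the existing nPC_sel mux."""
    return jump_pc if jump else branch_or_sequential_pc

pc = 0x10000008
new_pc = select_next_pc(pc + 4, jump_target(pc, 0x10), jump=1)
```

With Jump=0 the fetch unit behaves exactly as on the branch slide, so the existing instructions are unaffected by the added mux.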