CS 152 Computer Architecture and Engineering Lecture 9

CS 152 Computer Architecture and Engineering
Lecture 9: Designing a Multicycle Processor
Feb 24, 1999
John Kubiatowicz (http.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
2/24/99 ©UCB Spring 1999 CS 152 / Kubiatowicz Lec 9.1

Recap: Processor Design is a Process
° Bottom-up
  • assemble components in target technology to establish critical timing
° Top-down
  • specify component behavior from high-level requirements
° Iterative refinement
  • establish partial solution, expand and improve
[Figure: design hierarchy from Instruction Set Architecture, to processor (datapath, control), to Reg. File, Mux, ALU, Reg, Mem, Decoder, Sequencer, down to Cells and Gates]

Recap: A Single Cycle Datapath
[Figure: single-cycle datapath. The Instruction Fetch Unit (nPC_sel, Clk) supplies Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15>, and Imm16 <0:15>. A RegDst mux selects Rd or Rt as Rw for the 32 32-bit Registers (Ra, Rb, Rw, busW, RegWr, Clk), which drive busA and busB. An ALUSrc mux selects busB or the extended imm16 (Extender, ExtOp) as the second ALU input (ALUctr, Zero). Data Memory (WrEn = MemWr, Adr, Data In, Clk) feeds a MemtoReg mux that drives busW.]

Recap: The “Truth Table” for the Main Control
[Figure: op (6 bits) -> Main Control -> RegDst, ALUSrc, ..., ALUop (3 bits); func (6 bits) and ALUop -> ALU Control (Local) -> ALUctr (3 bits)]

  op                 00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
                     R-type    ori       lw        sw        beq       jump
  RegDst             1         0         0         x         x         x
  ALUSrc             0         1         1         1         0         x
  MemtoReg           0         0         1         x         x         x
  RegWrite           1         1         1         0         0         0
  MemWrite           0         0         0         1         0         0
  Branch             0         0         0         0         1         0
  Jump               0         0         0         0         0         1
  ExtOp              x         0         1         1         x         x
  ALUop (Symbolic)   “R-type”  Or        Add       Add       Subtract  xxx
  ALUop <2>          1         0         0         0         0         x
  ALUop <1>          0         1         0         0         0         x
  ALUop <0>          0         0         0         0         1         x
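The truth table above can be read as a lookup from opcode to control word. The following is a minimal sketch of that idea, not the course's implementation: the names SIGNALS, MAIN_CONTROL, and main_control are hypothetical, the values follow the standard single-cycle MIPS control settings, and "x" stands for a don't-care.

```python
# Illustrative lookup-table model of the main control (names are hypothetical).
SIGNALS = ["RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemWrite", "Branch", "Jump", "ExtOp", "ALUop"]

# opcode -> (mnemonic, control word); "x" marks a don't-care entry.
MAIN_CONTROL = {
    0b000000: ("R-type", [1, 0, 0, 1, 0, 0, 0, "x", "R-type"]),
    0b001101: ("ori",    [0, 1, 0, 1, 0, 0, 0, 0, "Or"]),
    0b100011: ("lw",     [0, 1, 1, 1, 0, 0, 0, 1, "Add"]),
    0b101011: ("sw",     ["x", 1, "x", 0, 1, 0, 0, 1, "Add"]),
    0b000100: ("beq",    ["x", 0, "x", 0, 0, 1, 0, "x", "Subtract"]),
    0b000010: ("jump",   ["x", "x", "x", 0, 0, 0, 1, "x", "xxx"]),
}

def main_control(opcode):
    """Return the control signals for a 6-bit opcode as a dict."""
    _, values = MAIN_CONTROL[opcode]
    return dict(zip(SIGNALS, values))

if __name__ == "__main__":
    print(main_control(0b100011))  # control word for lw
```

A PLA or ROM realizes the same mapping in hardware: one row (product term or ROM word) per instruction.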

Recap: PLA Implementation of the Main Control
[Figure: PLA. AND plane: one product term per instruction (R-type, ori, lw, sw, beq, jump), each decoding op<5> .. op<0>. OR plane outputs: RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, Jump, ExtOp, ALUop<2>, ALUop<1>, ALUop<0>]

Recap: Systematic Generation of Control
[Figure: OPcode -> Decode -> Control Logic / Store (PLA, ROM) -> microinstruction -> Control Points -> Datapath; Conditions and Instruction come back from the datapath]
° In our single-cycle processor, each instruction is realized by exactly one control command or “microinstruction”
  • in general, the controller is a finite state machine
  • microinstruction can also control sequencing (see later)

The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
[Figure: Processor (Control, Datapath), Memory, Input, Output]
° Today’s Topic: Designing the Datapath for the Multiple Clock Cycle Datapath

Outline of Today’s Lecture
° Recap: single cycle processor
° VHDL versions
° Faster designs
° Multicycle Datapath
° Performance Analysis
° Multicycle Control

Behavioral models of Datapath Components
[Figure: 16-bit adder block with inputs A, B (16 bits each) and Cin, outputs DOUT (16 bits) and Cout]

  entity adder16 is
    generic (ccOut_delay    : TIME := 12 ns;
             adderOut_delay : TIME := 12 ns);
    port (A, B : in  vlbit_1d(15 downto 0);
          DOUT : out vlbit_1d(15 downto 0);
          CIN  : in  vlbit;
          COUT : out vlbit);
  end adder16;

  architecture behavior of adder16 is
  begin
    adder16_process: process(A, B, CIN)
      variable tmp       : vlbit_1d(18 downto 0);
      variable adder_out : vlbit_1d(15 downto 0);
      variable carry     : vlbit;
    begin
      -- unsigned add of A, B, and the carry-in
      tmp       := addum(addum(A, B), CIN);
      adder_out := tmp(15 downto 0);   -- low 16 bits are the sum
      carry     := tmp(16);            -- bit 16 is the carry-out
      COUT <= carry after ccOut_delay;
      DOUT <= adder_out after adderOut_delay;
    end process;
  end behavior;
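For readers without a VHDL simulator, the same behaviour (minus the output delays) can be sketched in Python. This is only an illustration of what the behavioural model computes; the function name adder16 and the tuple return value are choices made here, not part of the slides.

```python
def adder16(a, b, cin):
    """Behavioural 16-bit adder: returns (dout, cout).

    a, b -- 16-bit unsigned operands (higher bits are ignored)
    cin  -- carry-in, 0 or 1
    Timing (the `after ... ns` delays of the VHDL model) is not modelled.
    """
    total = (a & 0xFFFF) + (b & 0xFFFF) + (cin & 1)
    dout = total & 0xFFFF        # low 16 bits are the sum
    cout = (total >> 16) & 1     # bit 16 is the carry-out
    return dout, cout

if __name__ == "__main__":
    print(adder16(0xFFFF, 0x0001, 0))  # wraps around and sets the carry
```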

Behavioral Specification of Control Logic

  entity maincontrol is
    port (opcode     : in  vlbit_1d(5 downto 0);
          equal_cond : in  vlbit;
          extop      : out vlbit;
          ALUsrc     : out vlbit;
          ALUop      : out vlbit_1d(1 downto 0);
          MEMwr      : out vlbit;
          MemtoReg   : out vlbit;
          RegWr      : out vlbit;
          RegDst     : out vlbit;
          nPC        : out vlbit);
  end maincontrol;

  architecture behavior of maincontrol is
  begin
    control: process(opcode, equal_cond)
      constant ORIop : vlbit_1d(5 downto 0) := "001101";
    begin
      -- extop only 0 (no extend) for ORI inst
      case opcode is
        when ORIop  => extop <= '0';
        when others => extop <= '1';
      end case;
    end process;
  end behavior;

° Decode / Control-store address modeled by Case statement
° Each arm drives control signals for that operation
  • just like the microinstruction
  • either can be symbolic

Abstract View of our single cycle processor
[Figure: Next PC -> PC -> Instruction Fetch -> Register Fetch -> ALU -> Data Mem Access -> Result Store. Main Control (input: op) drives nPC_sel, ExtOp, ALUSrc, MemRd, MemWr, RegDst, RegWr; ALU control (input: fun) drives ALUctr; Equal feeds back to control.]
° looks like a FSM with PC as state

What’s wrong with our CPI=1 processor?
[Figure: path through the datapath for each instruction class]
  Arithmetic & Logical:   PC -> Inst Memory -> Reg File -> mux -> ALU -> mux -> setup
  Load (Critical Path):   PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem -> mux -> setup
  Store:                  PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem
  Branch:                 PC -> Inst Memory -> Reg File -> cmp -> mux
° Long Cycle Time
° All instructions take as much time as the slowest
° Real memory is not as nice as our idealized memory
  • cannot always get the job done in one (short) cycle

Memory Access Time
° Physics => fast memories are small (large memories are slow)
  • question: register file vs. memory
[Figure: Storage Array with address, address decoder, selected word line, storage cell, bit line, and sense amps]
° => Use a hierarchy of memories
[Figure: Processor -> Cache (1 cycle) -> proc. bus -> L2 Cache (2-3 cycles) -> mem. bus -> memory (20 - 50 cycles)]

Reducing Cycle Time
° Cut combinational dependency graph and insert register / latch
° Do same work in two fast cycles, rather than one slow one
° May be able to short-circuit path and remove some components for some instructions!
[Figure: before, storage element -> Acyclic Combinational Logic -> storage element; after, storage element -> Acyclic Combinational Logic (A) -> storage element -> Acyclic Combinational Logic (B) -> storage element]
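A small worked example makes the trade-off concrete. The delays below (6 ns for logic A, 4 ns for logic B, 1 ns of register overhead) are invented purely for illustration and do not come from the lecture; the point is only that the clock period is set by the slowest stage between storage elements.

```python
def cycle_time(stage_delays_ns, reg_overhead_ns):
    """Clock period = slowest combinational stage + register overhead."""
    return max(stage_delays_ns) + reg_overhead_ns

# Hypothetical delays, chosen only for illustration.
LOGIC_A_NS, LOGIC_B_NS, REG_NS = 6, 4, 1

# One slow cycle: A and B sit between the same pair of storage elements.
single = cycle_time([LOGIC_A_NS + LOGIC_B_NS], REG_NS)

# Two fast cycles: a register is inserted between A and B.
split = cycle_time([LOGIC_A_NS, LOGIC_B_NS], REG_NS)

if __name__ == "__main__":
    print("single-cycle period:", single, "ns")
    print("split period:", split, "ns")
    # An instruction needing both A and B now takes 2 * split ns,
    # but one that can skip B finishes after a single split cycle.
    print("needs A and B:", 2 * split, "ns; needs only A:", split, "ns")
```

With these numbers the split design is slower for an instruction that needs both blocks (14 ns versus 11 ns) and faster for one that can skip B (7 ns), which is the "short-circuit" opportunity the slide mentions.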

Basic Limits on Cycle Time
° Next address logic
  • PC <= branch ? PC + offset : PC + 4
° Instruction Fetch
  • InstructionReg <= Mem[PC]
° Register Access
  • A <= R[rs]
° ALU operation
  • R <= A + B
[Figure: Next PC -> PC -> Instruction Fetch -> Operand Fetch -> Exec -> Mem Access -> Result Store; Control drives nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr]

Partitioning the CPI=1 Datapath
° Add registers between smallest steps
[Figure: Next PC -> PC -> Instruction Fetch -> Operand Fetch -> Exec -> Mem Access -> Result Store, with a register between each pair of steps; control signals nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr; Equal feeds back]

Example Multicycle Datapath
[Figure: Next PC -> PC -> Instruction Fetch -> IR -> Operand Fetch (Reg File) -> A, B -> Exec (Ext, ALU) -> S -> Mem Access (Data Mem) -> M -> Result Store (Reg. File); E holds Equal; control signals nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr]
° Critical Path ?

Administrative Issues
° Read Chapter 5
° Tapes now available in 205 McLaughlin Hall
° This lecture and next one slightly different from the book
° Midterm next Wednesday 3/3/99:
  • 5:30 pm to 8:30 pm, 277 Cory
  • No class on that day
° Midterm reminders:
  • Pencil, calculator, one 8.5” x 11” sheet (both sides) of handwritten notes
  • Sit in every other chair, every other row (odd row & odd seat)
° Meet at LaVal’s pizza after the midterm

Recall: Step-by-step Processor Design
Step 1: ISA => Logical Register Transfers
Step 2: Components of the Datapath
Step 3: RTL + Components => Datapath
Step 4: Datapath + Logical RTs => Physical RTs
Step 5: Physical RTs => Control

Step 4: R-type (add, sub, ...)
° Logical Register Transfer
  inst    Logical Register Transfers
  ADDU    R[rd] <– R[rs] + R[rt]; PC <– PC + 4
° Physical Register Transfers (time runs downward)
  inst    Physical Register Transfers
          IR <– MEM[pc]
  ADDU    A <– R[rs]; B <– R[rt]
          S <– A + B
          R[rd] <– S; PC <– PC + 4
[Figure: multicycle datapath with registers PC, IR, A, B, S, M, E]

Step 4: Logical immed
° Logical Register Transfer
  inst    Logical Register Transfers
  ORI     R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4
° Physical Register Transfers (time runs downward)
  inst    Physical Register Transfers
          IR <– MEM[pc]
  ORI     A <– R[rs]; B <– R[rt]
          S <– A or ZExt(Im16)
          R[rt] <– S; PC <– PC + 4
[Figure: multicycle datapath with registers PC, IR, A, B, S, M, E]

Step 4 : Load
° Logical Register Transfer
  inst    Logical Register Transfers
  LW      R[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4
° Physical Register Transfers (time runs downward)
  inst    Physical Register Transfers
          IR <– MEM[pc]
  LW      A <– R[rs]; B <– R[rt]
          S <– A + SExt(Im16)
          M <– MEM[S]
          R[rt] <– M; PC <– PC + 4
[Figure: multicycle datapath with registers PC, IR, A, B, S, M, E]

Step 4 : Store
° Logical Register Transfer
  inst    Logical Register Transfers
  SW      MEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4
° Physical Register Transfers (time runs downward)
  inst    Physical Register Transfers
          IR <– MEM[pc]
  SW      A <– R[rs]; B <– R[rt]
          S <– A + SExt(Im16)
          MEM[S] <– B; PC <– PC + 4
[Figure: multicycle datapath with registers PC, IR, A, B, S, M, E]

Step 4 : Branch
° Logical Register Transfer
  inst    Logical Register Transfers
  BEQ     if R[rs] == R[rt]
          then PC <– PC + SExt(Im16) || 00
          else PC <– PC + 4
° Physical Register Transfers (time runs downward)
  inst    Physical Register Transfers
          IR <– MEM[pc]
  BEQ     E <– (R[rs] == R[rt])
          if E then PC <– PC + SExt(Im16) || 00
               else PC <– PC + 4
[Figure: multicycle datapath with registers PC, IR, A, B, S, M, E]
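The physical register transfers of Step 4 can be traced in software. The sketch below is a simplified model under stated assumptions: instructions arrive already decoded as dictionaries (so IR <– MEM[pc] is counted as a cycle but not modelled), memory is a Python dict, and the names run_multicycle, sext16, and the cpu layout are invented for this example. The branch target follows the slide's formula, PC + SExt(Im16) || 00.

```python
MASK32 = 0xFFFFFFFF

def sext16(imm):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def run_multicycle(cpu, inst):
    """Apply one instruction's physical register transfers.

    cpu  -- dict with "PC", "R" (list of 32 ints), "MEM" (dict addr -> word)
    inst -- decoded dict with "op" plus the fields that op uses
            (rs, rt, rd, imm); decoding itself is not modelled here
    Returns the number of cycles (one per register-transfer step).
    """
    R, MEM = cpu["R"], cpu["MEM"]
    op = inst["op"]
    cycles = 1                                   # IR <- MEM[pc]
    if op == "BEQ":
        E = R[inst["rs"]] == R[inst["rt"]]       # E <- (R[rs] == R[rt])
        cycles += 1
        offset = sext16(inst["imm"]) << 2        # SExt(Im16) || 00
        cpu["PC"] = (cpu["PC"] + (offset if E else 4)) & MASK32
        return cycles + 1
    A, B = R[inst["rs"]], R[inst["rt"]]          # A <- R[rs]; B <- R[rt]
    cycles += 1
    if op == "ADDU":
        S = (A + B) & MASK32                     # S <- A + B
    elif op == "ORI":
        S = A | (inst["imm"] & 0xFFFF)           # S <- A or ZExt(Im16)
    else:                                        # LW and SW
        S = (A + sext16(inst["imm"])) & MASK32   # S <- A + SExt(Im16)
    cycles += 1
    if op == "ADDU":
        R[inst["rd"]] = S                        # R[rd] <- S
    elif op == "ORI":
        R[inst["rt"]] = S                        # R[rt] <- S
    elif op == "LW":
        M = MEM.get(S, 0)                        # M <- MEM[S]
        cycles += 1
        R[inst["rt"]] = M                        # R[rt] <- M
    elif op == "SW":
        MEM[S] = B                               # MEM[S] <- B
    cpu["PC"] = (cpu["PC"] + 4) & MASK32         # PC <- PC + 4 (same step)
    return cycles + 1
```

Counting one cycle per step gives 4 cycles for ADDU, ORI, and SW, 5 for LW, and 3 for BEQ, which is where the per-type CPI figures later in the lecture come from.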

Alternative datapath (book): Multiple Cycle Datapath
° Minimizes Hardware: 1 memory, 1 adder
[Figure: the book's multicycle datapath: PC, one Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, one ALU with ALU Control, ALU Out, and Target; control signals PCWrCond, PCSrc, IorD, MemWr, IRWr, RegDst, RegWr, MemtoReg, ExtOp, ALUSelA, ALUSelB, ALUOp, BrWr; Zero feeds back]

Our Control Model
° State specifies control points for Register Transfer
° Transfer occurs upon exiting state (same falling edge)
[Figure: inputs (conditions) -> Next State Logic -> Control State -> Output Logic -> outputs (control points). State X is labeled with its Register Transfer and Control Points; the next state Depends on Input.]

Step 4 => Control Specification for multicycle proc
[State diagram]
  “instruction fetch”:        IR <= MEM[PC]
  “decode / operand fetch”:   A <= R[rs]; B <= R[rt]

            Execute                  Memory          Write-back
  R-type    S <= A fun B                             R[rd] <= S; PC <= PC + 4
  ORi       S <= A or ZX                             R[rt] <= S; PC <= PC + 4
  LW        S <= A + SX              M <= MEM[S]     R[rt] <= M; PC <= PC + 4
  SW        S <= A + SX              MEM[S] <= B; PC <= PC + 4
  BEQ       PC <= Next(PC, Equal)

Traditional FSM Controller
[Figure: a Truth Table maps (state, op, cond) to (next state, control points). Its 11 inputs are the 4-bit State, the 6-bit op, and Equal; its outputs are the next State, which is registered back into State, and the control points that drive the datapath.]

Step 5: datapath + state diagram => control
° Translate RTs into control points
° Assign states
° Then go build the controller

Mapping RTs to Control Points
[State diagram, with control points shown for the R-type path]
  Step                 Register transfers            Control points
  “instruction fetch”  IR <= MEM[PC]                 imem_rd, IRen
  “decode”             A <= R[rs]; B <= R[rt]        Aen, Ben, Een
  R-type Execute       S <= A fun B                  ALUfun, Sen
  R-type Write-back    R[rd] <= S; PC <= PC + 4      RegDst, RegWr, PCen

  Remaining states:
  ORi    S <= A or ZX;  R[rt] <= S; PC <= PC + 4
  LW     S <= A + SX;   M <= MEM[S];  R[rt] <= M; PC <= PC + 4
  SW     S <= A + SX;   MEM[S] <= B; PC <= PC + 4
  BEQ    PC <= Next(PC, Equal)

Assigning States
  State   Step                   Register transfers
  0000    “instruction fetch”    IR <= MEM[PC]
  0001    “decode”               A <= R[rs]; B <= R[rt]
  0100    R-type Execute         S <= A fun B
  0101    R-type Write-back      R[rd] <= S; PC <= PC + 4
  0110    ORi Execute            S <= A or ZX
  0111    ORi Write-back         R[rt] <= S; PC <= PC + 4
  1000    LW Execute             S <= A + SX
  1001    LW Memory              M <= MEM[S]
  1010    LW Write-back          R[rt] <= M; PC <= PC + 4
  1011    SW Execute             S <= A + SX
  1100    SW Memory              MEM[S] <= B; PC <= PC + 4
  0011    BEQ Execute            PC <= Next(PC)
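With states assigned, the next-state logic is a small function of the current state and the instruction type. The sketch below encodes the assignment above; the names DISPATCH, SEQUENTIAL, next_state, and states_visited are hypothetical, and the Equal input is left out because it selects the branch outcome inside state 0011 rather than the next state.

```python
FETCH, DECODE = 0b0000, 0b0001

# First execute state for each instruction type (from the state assignment).
DISPATCH = {"R-type": 0b0100, "ORi": 0b0110, "LW": 0b1000,
            "SW": 0b1011, "BEQ": 0b0011}

# States that simply advance to a fixed successor; all others return to fetch.
SEQUENTIAL = {0b0100: 0b0101, 0b0110: 0b0111,
              0b1000: 0b1001, 0b1001: 0b1010, 0b1011: 0b1100}

def next_state(state, op):
    """Next-state function of the multicycle controller (illustrative)."""
    if state == FETCH:
        return DECODE
    if state == DECODE:
        return DISPATCH[op]
    return SEQUENTIAL.get(state, FETCH)

def states_visited(op):
    """List the states one instruction of type `op` passes through."""
    state, visited = FETCH, []
    while True:
        visited.append(state)
        state = next_state(state, op)
        if state == FETCH:
            return visited

if __name__ == "__main__":
    for op in DISPATCH:
        path = states_visited(op)
        print(op, [format(s, "04b") for s in path], len(path), "cycles")
```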

Detailed Control Specification
  State         Op field   Eq   Next
  0000          ??????     ?    0001
  0001          BEQ        x    0011
  0001          R-type     x    0100
  0001          orI        x    0110
  0001          LW         x    1000
  0001          SW         x    1011
  BEQ:  0011    xxxxxx     x    0000
  R:    0100    xxxxxx     x    0101
        0101    xxxxxx     x    0000
  ORi:  0110    xxxxxx     x    0111
        0111    xxxxxx     x    0000
  LW:   1000    xxxxxx     x    1001
        1001    xxxxxx     x    1010
        1010    xxxxxx     x    0000
  SW:   1011    xxxxxx     x    1100
        1100    xxxxxx     x    0000
[Table columns for the control points: IR en; PC sel; Ops: A, B; Exec: Ex, Sr, ALU, S; Mem: R, W, M; Write-Back: M-R, Wr, Dst. The rows for state 0001 carry identical control points: all same in Moore machine.]

Performance Evaluation
° What is the average CPI?
  • state diagram gives CPI for each instruction type
  • workload gives frequency of each type

  Type          CPIi for type   Frequency   CPIi x freqIi
  Arith/Logic   4               40%         1.6
  Load          5               30%         1.5
  Store         4               10%         0.4
  branch        3               20%         0.6
  Average CPI:                              4.1
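The average is a frequency-weighted sum, CPI = sum over types of CPIi x freqi. The short sketch below recomputes it from the table's figures; the variable names are arbitrary, and frequencies are kept as integer percentages to avoid floating-point noise in the sum.

```python
# (type, cycles per instruction, frequency in percent), as given in the table.
WORKLOAD = [
    ("Arith/Logic", 4, 40),
    ("Load",        5, 30),
    ("Store",       4, 10),
    ("branch",      3, 20),
]

def average_cpi(workload):
    """Frequency-weighted average CPI; frequencies are integer percentages."""
    total_percent = sum(freq for _, _, freq in workload)
    weighted = sum(cpi * freq for _, cpi, freq in workload)
    return weighted / total_percent

if __name__ == "__main__":
    for name, cpi, freq in WORKLOAD:
        print(f"{name:12s} {cpi} x {freq}% = {cpi * freq / 100:.1f}")
    print("Average CPI:", average_cpi(WORKLOAD))
```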

Controller Design
° The state diagrams that define the controller for an instruction set processor are highly structured
° Use this structure to construct a simple “microsequencer”
° Control reduces to programming this very simple device
  • microprogramming
[Figure: sequencer control -> micro-PC -> microinstruction, whose fields are the sequencer control and the datapath control]

Example: Jump-Counter
[Figure: a Counter holding state i with three controls: zero (next state 0000), inc (next state i+1), and load (next state from the Map ROM, addressed by the op-code)]
° None of above: Do nothing (for wait states)

Using a Jump Counter
[State diagram, each state annotated with its counter action]
  State   Action   Step                   Register transfers
  0000    inc      “instruction fetch”    IR <= MEM[PC]
  0001    load     “decode”               A <= R[rs]; B <= R[rt]
  0100    inc      R-type Execute         S <= A fun B
  0101    zero     R-type Write-back      R[rd] <= S; PC <= PC + 4
  0110    inc      ORi Execute            S <= A or ZX
  0111    zero     ORi Write-back         R[rt] <= S; PC <= PC + 4
  1000    inc      LW Execute             S <= A + SX
  1001    inc      LW Memory              M <= MEM[S]
  1010    zero     LW Write-back          R[rt] <= M; PC <= PC + 4
  1011    inc      SW Execute             S <= A + SX
  1100    zero     SW Memory              MEM[S] <= B; PC <= PC + 4
  0011    zero     BEQ Execute            PC <= Next(PC)

Our Microsequencer
[Figure: op-code -> Map ROM -> Micro-PC -> microinstruction, with Z, I, L sequencing bits and datapath control fields; taken is an input to the sequencer]

Microprogram Control Specification
          µPC     Next
          0000    inc
          0001    load
  BEQ:    0011    zero
  R:      0100    inc
          0101    zero
  ORi:    0110    inc
          0111    zero
  LW:     1000    inc
          1001    inc
          1010    zero
  SW:     1011    inc
          1100    zero
[Table columns besides µPC and Next: Taken; IR en; PC sel; Ops: A, B; Exec: Ex, Sr, ALU, S; Mem: R, W, M; Write-Back: M-R, Wr, Dst]

Mapping ROM
  Instruction   op-code   µPC
  R-type        000000    0100
  BEQ           000100    0011
  ori           001101    0110
  LW            100011    1000
  SW            101011    1011
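Putting the jump counter and the mapping ROM together gives a complete, if tiny, microsequencer. The following sketch assumes the per-state actions from the jump-counter diagram (with inc for state 0110 and the SW entry 1011 inferred from the state assignment); MAP_ROM, SEQ, next_upc, and trace are names made up for this example.

```python
# Mapping ROM: op-code -> first micro-PC value for that instruction.
MAP_ROM = {
    0b000000: 0b0100,  # R-type
    0b000100: 0b0011,  # BEQ
    0b001101: 0b0110,  # ori
    0b100011: 0b1000,  # LW
    0b101011: 0b1011,  # SW (inferred from the state assignment)
}

# Sequencing field of each microinstruction: zero, inc, or load.
SEQ = {
    0b0000: "inc",  0b0001: "load", 0b0011: "zero",
    0b0100: "inc",  0b0101: "zero",
    0b0110: "inc",  0b0111: "zero",
    0b1000: "inc",  0b1001: "inc",  0b1010: "zero",
    0b1011: "inc",  0b1100: "zero",
}

def next_upc(upc, opcode):
    """Jump-counter step: increment, load from the Map ROM, or clear."""
    action = SEQ[upc]
    if action == "inc":
        return upc + 1
    if action == "load":
        return MAP_ROM[opcode]
    return 0b0000  # "zero": back to instruction fetch

def trace(opcode):
    """Micro-PC values visited while executing one instruction."""
    upc, visited = 0b0000, []
    while True:
        visited.append(upc)
        upc = next_upc(upc, opcode)
        if upc == 0b0000:
            return visited

if __name__ == "__main__":
    print([format(u, "04b") for u in trace(0b100011)])  # LW
```

Compared with a full next-state truth table, only the load step looks at the op-code, which is why the sequencing hardware reduces to a counter plus a small ROM.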

Example: Controlling Memory
[Figure: PC -> addr input of Instruction Memory (control in: InstMem_rd; status out: IM_wait) -> data -> Inst. Reg (IR_en)]

Controller handles non-ideal memory
[State diagram: a state that accesses memory repeats while wait is asserted and advances on ~wait]
  “instruction fetch”:        IR <= MEM[PC]   (wait: repeat; ~wait: advance)
  “decode / operand fetch”:   A <= R[rs]; B <= R[rt]
  R-type:   S <= A fun B  ->  R[rd] <= S; PC <= PC + 4
  ORi:      S <= A or ZX  ->  R[rt] <= S; PC <= PC + 4
  LW:       S <= A + SX   ->  M <= MEM[S] (wait: repeat)  ->  on ~wait: R[rt] <= M; PC <= PC + 4
  SW:       S <= A + SX   ->  MEM[S] <= B (wait: repeat)  ->  on ~wait: PC <= PC + 4
  BEQ:      PC <= Next(PC)
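The wait loop can be pictured as a state that re-executes until the memory drops its wait signal. The sketch below models only that one behaviour; the generator memory_wait_signal and the function cycles_in_memory_state are hypothetical helpers, and the number of wait cycles is an arbitrary input rather than a figure from the lecture.

```python
def memory_wait_signal(extra_cycles):
    """Yield True (wait asserted) for `extra_cycles` cycles, then False."""
    for _ in range(extra_cycles):
        yield True
    while True:
        yield False

def cycles_in_memory_state(wait_signal):
    """Count cycles spent in a memory-access state.

    The controller samples `wait` once per cycle; it stays in the state
    while wait is asserted and leaves on the first cycle with ~wait.
    """
    cycles = 0
    for wait in wait_signal:
        cycles += 1
        if not wait:
            return cycles

if __name__ == "__main__":
    for extra in (0, 1, 3):
        spent = cycles_in_memory_state(memory_wait_signal(extra))
        print(f"{extra} wait cycle(s) -> {spent} cycle(s) in the fetch state")
```

An ideal memory (no wait cycles) costs exactly one cycle in the state, so the earlier cycle counts are the special case of this controller.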

Really Simple Time-State Control
[State diagram organized by time state: instruction fetch, decode, Execute, Memory, write-back]
  instruction fetch:   IR <= MEM[PC]   (wait: repeat; ~wait: advance)
  decode:              A <= R[rs]; B <= R[rt]
  R-type:   S <= A fun B  ->  R[rd] <= S; PC <= PC + 4
  ORi:      S <= A or ZX  ->  R[rt] <= S; PC <= PC + 4
  LW:       S <= A + SX   ->  M <= MEM[S] (wait: repeat)  ->  R[rt] <= M; PC <= PC + 4
  SW:       S <= A + SX   ->  MEM[S] <= B (wait: repeat)  ->  PC <= PC + 4
  BEQ:      PC <= Next(PC)

Time-state Control Path
° Local decode and control at each stage
[Figure: multicycle datapath (Next PC, PC, Inst. Mem, IR, Reg File, A, B, Exec, S, Mem Access with Data Mem, M, Reg. File, Equal) with the instruction passed down the stages as IR, IRex, IRmem, IRwb, a Valid bit, and local controllers Dcd Ctrl, Mem Ctrl, WB Ctrl]

Overview of Control
° Control may be designed using one of several initial representations. The choice of sequence control, and how logic is represented, can then be determined independently; the control can then be implemented with one of several methods using a structured logic technique.

  Initial Representation     Finite State Diagram           Microprogram
  Sequencing Control         Explicit Next State Function   Microprogram counter + Dispatch ROMs
  Logic Representation       Logic Equations                Truth Tables
  Implementation Technique   PLA                            ROM
                             “hardwired control”            “microprogrammed control”

Summary
° Disadvantages of the Single Cycle Processor
  • Long cycle time
  • Cycle time is too long for all instructions except the Load
° Multiple Cycle Processor:
  • Divide the instructions into smaller steps
  • Execute each step (instead of the entire instruction) in one cycle
° Partition datapath into equal size chunks to minimize cycle time
  • ~10 levels of logic between latches
° Follow same 5-step method for designing “real” processor

Summary (cont’d)
° Control is specified by finite state diagram
° Specialized state diagrams are easily captured by a microsequencer
  • simple increment & “branch” fields
  • datapath control fields
° Control design reduces to Microprogramming
° Control is more complicated with:
  • complex instruction sets
  • restricted datapaths (see the book)
° Simple Instruction set and powerful datapath => simple control
  • could try to reduce hardware (see the book)
  • rather go for speed => many instructions at once!

Where to get more information?
° Next two lectures:
  • Multiple Cycle Controller: Appendix C of your text book.
  • Microprogramming: Section 5.5 of your text book.
° D. Patterson, “Microprogramming,” Scientific American, March 1983.
° D. Patterson and D. Ditzel, “The Case for the Reduced Instruction Set Computer,” Computer Architecture News 8, 6 (October 15, 1980)