ECE 361 Computer Architecture Lecture 10: Designing a Multiple Cycle Processor 361 multipath. . 1

Recap: A Single Cycle Datapath
° We have everything except control signals (underlined)
• Today’s lecture will show you how to generate the control signals
[Figure: single cycle datapath with Instruction Fetch Unit, 32 32-bit registers, 16-to-32-bit extender, ALU, and data memory. Instruction fields: Rs, Rt, Rd, Imm16. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg]

Recap: PLA Implementation of the Main Control
[Figure: PLA that decodes opcode bits op<5> through op<0> into the product terms R-type, ori, lw, sw, beq, and jump, which drive the outputs RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, Jump, ExtOp, ALUop<2>, ALUop<1>, and ALUop<0>]

The Big Picture: Where are We Now?
° The Five Classic Components of a Computer:
• Control and Datapath (together, the Processor)
• Memory
• Input
• Output
° Today’s Topic: Designing the Datapath for the Multiple Clock Cycle Processor

Outline of Today’s Lecture
° Recap and Introduction
° Introduction to the Concept of Multiple Cycle Processor
° Multiple Cycle Implementation of R-type Instructions
° What is a Multiple Cycle Delay Path and Why is it Bad?
° Multiple Cycle Implementation of Or Immediate
° Multiple Cycle Implementation of Load and Store
° Putting it all Together

Abstract View of our single cycle processor
° Looks like an FSM with PC as state
[Figure: Next PC, PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, and Result Store blocks, driven by Main Control (inputs op, fun) and ALU control. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr. Status signal: Equal]

What’s wrong with our CPI=1 processor?
[Figure: delay path for each instruction class]
• Arithmetic & Logical: PC, Inst Memory, Reg File, mux, ALU, mux, setup
• Load (Critical Path): PC, Inst Memory, Reg File, mux, ALU, Data Mem, mux, setup
• Store: PC, Inst Memory, Reg File, mux, ALU, Data Mem
• Branch: PC, Inst Memory, Reg File, cmp, mux
° Long Cycle Time
° All instructions take as much time as the slowest
° Real memory is not as nice as our idealized memory
• cannot always get the job done in one (short) cycle

Drawbacks of this Single Cycle Processor
° Long cycle time:
• Cycle time must be long enough for the load instruction:
- PC’s Clock-to-Q +
- Instruction Memory Access Time +
- Register File Access Time +
- ALU Delay (address calculation) +
- Data Memory Access Time +
- Register File Setup Time +
- Clock Skew
° Cycle time is much longer than needed for all other instructions. Examples:
• R-type instructions do not require data memory access
• Jump requires neither an ALU operation nor a data memory access
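The sum above can be made concrete with a small Python sketch. The lecture gives no delay values, so every number below is a hypothetical placeholder chosen only to show how the load path sets the cycle time while an R-type instruction would need less.

```python
# Hypothetical delays in nanoseconds. None of these values come from the
# lecture; they only illustrate the sum on the slide.
LOAD_PATH_NS = {
    "pc_clk_to_q": 1.0,
    "instruction_memory_access": 10.0,
    "register_file_access": 5.0,
    "alu_delay": 8.0,
    "data_memory_access": 10.0,
    "register_file_setup": 1.0,
    "clock_skew": 1.0,
}


def single_cycle_time(path_delays):
    """Cycle time of the single cycle processor: the full load path."""
    return sum(path_delays.values())


def rtype_time(path_delays):
    """Time an R-type instruction actually needs: no data memory access."""
    return sum(delay for name, delay in path_delays.items()
               if name != "data_memory_access")


cycle = single_cycle_time(LOAD_PATH_NS)
wasted = cycle - rtype_time(LOAD_PATH_NS)
print(f"cycle time {cycle} ns, idle time per R-type instruction {wasted} ns")
```

With these made-up numbers, every R-type instruction leaves the data memory access slot of the cycle unused.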

Overview of a Multiple Cycle Implementation
° The root of the single cycle processor’s problems:
• The cycle time has to be long enough for the slowest instruction
° Solution:
• Break the instruction into smaller steps
• Execute each step (instead of the entire instruction) in one cycle
- Cycle time: time it takes to execute the longest step
- Keep all the steps at a similar length
• This is the essence of the multiple cycle processor
° The advantages of the multiple cycle processor:
• Cycle time is much shorter
• Different instructions take different numbers of cycles to complete
- Load takes five cycles
- Jump only takes three cycles
• Allows a functional unit to be used more than once per instruction
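A rough Python sketch of how per-instruction cycle counts turn into an average CPI. The counts for load (five) and jump (three) are stated on the slide; the counts for the other instructions are read off this lecture's control state diagram. The instruction mix is a hypothetical assumption, not data from the lecture.

```python
# Cycles per instruction class. lw and jump are stated in the lecture; the
# rest are counted from the states each instruction visits in the control
# state diagram (for example sw: Ifetch, Rfetch/Decode, AdrCal, SWmem).
CYCLES = {"lw": 5, "sw": 4, "rtype": 4, "ori": 4, "beq": 3, "jump": 3}

# Hypothetical dynamic instruction mix (fractions sum to 1.0).
MIX = {"lw": 0.25, "sw": 0.10, "rtype": 0.45, "ori": 0.05, "beq": 0.15}


def average_cpi(mix, cycles=CYCLES):
    """Weighted average of cycles per instruction over the mix."""
    return sum(cycles[op] * fraction for op, fraction in mix.items())


def time_per_instruction(cpi, cycle_time_ns):
    """Average execution time per instruction."""
    return cpi * cycle_time_ns


cpi = average_cpi(MIX)
print(f"average CPI = {cpi:.2f}")
```

Whether the multiple cycle design is faster overall depends on how much shorter the cycle becomes, which in turn depends on how evenly the steps are balanced.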

The Five Steps of a Load Instruction
° Steps: Instruction Fetch, Instr Decode / Reg. Fetch, Address, Data Memory, Reg Wr
[Figure: timing diagram in which each signal changes from Old Value to New Value after the delay shown]
• PC: Clk-to-Q
• Rs, Rt, Rd, Op, Func: Instruction Memory Access Time
• ALUctr, ExtOp, ALUSrc, RegWr: Delay through Control Logic
• busA: Register File Access Time
• busB: Delay through Extender & Mux
• Address: ALU Delay
• busW: Data Memory Access Time
• Register File Write Time

Register File & Memory Write Timing: Ideal vs. Reality
° In previous lectures, register file and memory are simplified:
• Write happens at the clock tick
• Address, data, and write enable must be stable one “set-up” time before the clock tick
° In real life:
• Neither register file nor ideal memory has the clock input
• The write path is a combinational logic delay path:
- Write enable goes to 1 and Din settles down
- Memory write access delay
- Din is written into mem[address]
• Important: Address and Data must be stable BEFORE Write Enable goes to 1
[Figure: Ideal Memory with WrEn, Adr, 32-bit Din, and 32-bit Dout, drawn once with a Clk input (simplified) and once without (real life)]

Race Condition Between Address and Write Enable
° This “real” (no clock input) register file may not work reliably in the single cycle processor because:
• We cannot guarantee Rw will be stable BEFORE RegWr = 1
• There is a “race” between Rw (address) and RegWr (write enable)
° The “real” (no clock input) memory may not work reliably in the single cycle processor because:
• We cannot guarantee Address will be stable BEFORE WrEn = 1
• There is a race between Adr and WrEn
[Figure: Reg File with 5-bit Ra, Rb, Rw, 32-bit busA, busB, busW, and RegWr; Ideal Memory with WrEn, Adr, 32-bit Din and Dout]

How to Avoid this Race Condition?
° Solution for the multiple cycle implementation:
• Make sure Address is stable by the end of Cycle N
• Assert Write Enable signal ONE cycle later at Cycle (N + 1)
• Address cannot change until Write Enable is deasserted
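The three rules can be checked mechanically. The sketch below is an illustrative Python model, not part of the lecture: it assumes a trace that records one (address, write_enable) sample per clock cycle, and the function name is hypothetical.

```python
def write_is_race_free(trace):
    """Check the write protocol on a per-cycle trace.

    trace is a list of (address, write_enable) tuples, one per cycle.
    Write enable may be 1 in cycle i only if the address was already the
    same in cycle i - 1, so the address is stable before write enable is
    asserted and stays unchanged for as long as it remains asserted.
    """
    for i, (address, write_enable) in enumerate(trace):
        if not write_enable:
            continue
        if i == 0:
            return False  # no earlier cycle in which the address settled
        previous_address, _ = trace[i - 1]
        if address != previous_address:
            return False  # address moved while write enable was 1
    return True


# Address settles in cycle N, write enable follows in cycle N + 1.
good = [(0x10, 0), (0x10, 1), (0x20, 0)]
# Address and write enable change in the same cycle: a race.
bad = [(0x10, 0), (0x20, 1)]
print(write_is_race_free(good), write_is_race_free(bad))
```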

Dual-Port Ideal Memory
° Dual Port Ideal Memory
• Independent Read (RAdr, Dout) and Write (WAdr, Din) ports
• Read and write (to different locations) can occur in the same cycle
° Read Port is a combinational path:
• Read Address Valid -->
• Memory Read Access Delay -->
• Data Out Valid
° Write Port is also a combinational path:
• MemWrite = 1 -->
• Memory Write Access Delay -->
• Data In is written into location[WrAdr]
[Figure: Ideal Memory with MemWr, 32-bit RAdr (bits <31:2> plus <1:0> tied to 00), 32-bit WrAdr, 32-bit Din, and 32-bit Dout]

Instruction Fetch Cycle: In the Beginning
° Every cycle begins right AFTER the clock tick:
• mem[PC]
• PC<31:0> + 4
[Figure: one “logic” clock cycle with a “You are here!” marker just after the clock tick; PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, ALU with constant input 4, and ALU Control. Control signals still to be decided: PCWr=?, MemWr=?, IRWr=?, ALUop=?]

Instruction Fetch Cycle: The End
° Every cycle ends AT the next clock tick (storage element updates):
• IR <-- mem[PC]
• PC<31:0> <-- PC<31:0> + 4
[Figure: one “logic” clock cycle with a “You are here!” marker at the next clock tick; same datapath with PCWr=1, MemWr=0, IRWr=1, ALUOp=Add]
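As a minimal sketch, the two register transfers of the fetch cycle can be modeled in Python. The dictionary-based state and memory, and the function name, are assumptions made for illustration; the point is that both new values are computed from the old PC and latched together at the clock tick.

```python
MASK32 = 0xFFFFFFFF


def instruction_fetch(state, memory):
    """Return the storage-element values latched at the end of Ifetch.

    state:  dict with 32-bit "PC" and "IR" values (hypothetical model)
    memory: dict mapping byte address -> 32-bit instruction word
    """
    pc = state["PC"]
    ir = memory[pc]               # IR <-- mem[PC]       (IRWr = 1)
    next_pc = (pc + 4) & MASK32   # PC <-- PC + 4        (PCWr = 1, ALUOp = Add)
    # Both updates use the OLD pc, as they would at a single clock edge.
    return {**state, "IR": ir, "PC": next_pc}


memory = {0x0: 0x8C010004}        # arbitrary example word
after = instruction_fetch({"PC": 0x0, "IR": 0}, memory)
print(hex(after["IR"]), hex(after["PC"]))
```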

Instruction Fetch Cycle: Overall Picture
° State Ifetch:
• ALUOp=Add
• 1: PCWr, IRWr
• x: PCWrCond, RegDst, Mem2R
• Others: 0s
[Figure: multiple cycle datapath during Ifetch with PCWr=1, PCWrCond=x, IorD=0, MemWr=0, IRWr=1, PCSrc=0, BrWr=0, ALUSelA=0, ALUSelB=00, ALUOp=Add]

Register Fetch / Instruction Decode
° busA <- RegFile[rs] ; busB <- RegFile[rt] ;
° ALU is not being used: ALUctr = xx
° Op and Func go to the Control
[Figure: multiple cycle datapath during register fetch with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, PCSrc=x, ALUSelA=x, ALUSelB=xx, ALUOp=xx]

Register Fetch / Instruction Decode (Continue)
° busA <- Reg[rs] ; busB <- Reg[rt] ;
° Target <- PC + SignExt(Imm16)*4
° State Rfetch/Decode:
• ALUOp=Add
• 1: BrWr, ExtOp
• ALUSelB=10
• x: RegDst, PCSrc, IorD, MemtoReg
• Others: 0s
° Next state, chosen by Control from Op and Func: Beq, Rtype, Ori, or Memory
[Figure: multiple cycle datapath during Rfetch/Decode with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, PCSrc=x, BrWr=1, ExtOp=1, ALUSelA=0, ALUSelB=10, ALUOp=Add]
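The branch target computation can be sketched in Python as follows. The helper names are hypothetical. Note that by this cycle the PC has already been incremented by the fetch cycle, so the value passed as pc is the address of the following instruction.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(imm16):
    """Interpret the low 16 bits as a two's-complement signed integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc, imm16):
    """Target <- PC + SignExt(Imm16) * 4, wrapped to 32 bits."""
    return (pc + sign_extend16(imm16) * 4) & MASK32


print(hex(branch_target(0x1000, 0x0003)))  # forward branch by 3 words
print(hex(branch_target(0x1000, 0xFFFF)))  # backward branch by 1 word
```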

Branch Completion
° if (busA == busB)
• PC <- Target
° State BrComplete:
• ALUOp=Sub
• ALUSelB=01
• x: IorD, Mem2Reg, RegDst, ExtOp
• 1: PCWrCond, ALUSelA, PCSrc
[Figure: multiple cycle datapath during BrComplete with PCWr=0, PCWrCond=1, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, PCSrc=1, BrWr=0, ExtOp=x, ALUSelA=1, ALUSelB=01, ALUOp=Sub]

Instruction Decode: We have an R-type!
° Next Cycle: R-type Execution
[Figure: multiple cycle datapath during Rfetch/Decode, with Control selecting Rtype among Beq, Rtype, Ori, and Memory. PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, PCSrc=x, BrWr=1, ExtOp=1, ALUSelA=0, ALUSelB=10, ALUOp=Add]

R-type Execution
° ALU Output <- busA op busB
° State RExec:
• 1: RegDst, ALUSelA
• ALUSelB=01
• ALUOp=Rtype
• x: PCSrc, IorD, MemtoReg, ExtOp
[Figure: multiple cycle datapath during RExec with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=1, RegWr=0, MemtoReg=x, PCSrc=x, BrWr=0, ExtOp=x, ALUSelA=1, ALUSelB=01, ALUOp=Rtype]

R-type Completion
° R[rd] <- ALU Output
° State Rfinish:
• ALUOp=Rtype
• 1: RegDst, RegWr, ALUSelA
• ALUSelB=01
• x: IorD, PCSrc, ExtOp
[Figure: multiple cycle datapath during Rfinish with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=1, RegWr=1, MemtoReg=0, PCSrc=x, BrWr=0, ExtOp=x, ALUSelA=1, ALUSelB=01, ALUOp=Rtype]

A Multiple Cycle Delay Path
° There is no register to save the results between:
• Register Fetch: busA <- Reg[rs] ; busB <- Reg[rt]
• R-type Execution: ALU output <- busA op busB
• R-type Completion: Reg[rd] <- ALU output
[Figure: Instruction Reg, Reg File, and ALU portion of the datapath, with two callouts: “Register here to save outputs of Rfetch?” on busA and busB, and “Register here to save outputs of RExec?” on the ALU output]

A Multiple Cycle Delay Path (Continue)
° Register is NOT needed to save the outputs of Register Fetch:
• IRWr = 0: busA and busB will not change after Register Fetch
° Register is NOT needed to save the outputs of R-type Execution:
• busA and busB will not change after Register Fetch
• Control signals ALUSelA, ALUSelB, and ALUOp will not change after R-type Execution
• Consequently ALU output will not change after R-type Execution
° In theory (P. 316, P&H), you need a register to hold a signal value if:
• (1) The signal is computed in one clock cycle and used in another.
• (2) AND the inputs to the functional block that computes this signal can change before the signal is written into a state element.
° You can save a register if Cond 1 is true BUT Cond 2 is false:
• But in practice, this will introduce a multiple cycle delay path:
- A logic delay path that takes multiple cycles to propagate from one storage element to the next storage element
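The two-condition rule is simply a logical AND, which the following Python sketch spells out. The function and argument names are hypothetical; the example applies the rule to busA and busB, whose inputs are held steady because IRWr = 0.

```python
def needs_register(used_in_later_cycle, inputs_can_change):
    """A holding register is needed only if BOTH conditions are true.

    used_in_later_cycle: Cond 1, the signal is computed in one clock
                         cycle and used in another.
    inputs_can_change:   Cond 2, the inputs of the block computing the
                         signal can change before the signal is written
                         into a state element.
    """
    return used_in_later_cycle and inputs_can_change


# busA / busB: used in later cycles (Cond 1 true), but IRWr = 0 keeps the
# register file read addresses fixed (Cond 2 false), so no register is
# required in theory.
bus_register_needed = needs_register(True, False)
print(bus_register_needed)
```

Leaving the register out is what creates the multiple cycle delay path described on the slide.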

Pros and Cons of a Multiple Cycle Delay Path
° A 3-cycle path example:
• IR (storage) -> Reg File Read -> ALU -> Reg File Write (storage)
° Advantages:
• Register savings
• We can share time among cycles:
- If ALU takes longer than one cycle, still “a OK” as long as the entire path takes less than 3 cycles to finish
[Figure: Instruction Reg, Reg File, and ALU portion of the datapath, with no register between the stages]
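The time-sharing argument can be illustrated with a short Python sketch. The stage delays and the cycle time are hypothetical numbers, not values from the lecture: the ALU alone exceeds one cycle, yet the whole path still fits in the three cycles it is given.

```python
def multicycle_path_ok(stage_delays_ns, cycle_ns, cycles_allowed):
    """True if the whole path settles in less than the allotted cycles."""
    return sum(stage_delays_ns) < cycle_ns * cycles_allowed


def any_stage_exceeds_cycle(stage_delays_ns, cycle_ns):
    """True if some individual stage is slower than a single cycle."""
    return any(delay > cycle_ns for delay in stage_delays_ns)


# Hypothetical delays: Reg File Read, ALU, Reg File Write.
delays = [4.0, 12.0, 3.0]
cycle = 10.0

print(any_stage_exceeds_cycle(delays, cycle))   # the ALU alone is too slow
print(multicycle_path_ok(delays, cycle, 3))     # but the 3-cycle path fits
```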

Pros and Cons of a Multiple Cycle Delay Path (Continue)
° Disadvantage:
• Static timing analyzer, which ONLY looks at delay between two storage elements, will report this as a timing violation
• You have to ignore the static timing analyzer’s warnings
[Figure: Instruction Reg, Reg File, and ALU portion of the datapath, with no register between the stages]

Instruction Decode: We have an Ori!
° Next Cycle: Ori Execution
[Figure: multiple cycle datapath during Rfetch/Decode, with Control selecting Ori among Beq, Rtype, Ori, and Memory. PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, PCSrc=x, BrWr=1, ExtOp=1, ALUSelA=0, ALUSelB=10, ALUOp=Add]

Ori Execution
° ALU output <- busA or ZeroExt[Imm16]
° State OriExec:
• ALUOp=Or
• 1: ALUSelA
• ALUSelB=11
• x: MemtoReg, IorD, PCSrc
[Figure: multiple cycle datapath during OriExec with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=0, RegWr=0, MemtoReg=x, PCSrc=x, BrWr=0, ExtOp=0, ALUSelA=1, ALUSelB=11, ALUOp=Or]
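A small Python sketch of the extender and the Ori execution step, assuming (as the control settings in this lecture suggest) that ExtOp=0 selects zero extension and ExtOp=1 selects sign extension. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def extend(imm16, ext_op):
    """16-to-32-bit extender: zero-extend if ext_op is 0, sign-extend if 1."""
    imm16 &= 0xFFFF
    if ext_op and imm16 & 0x8000:
        return imm16 | 0xFFFF0000   # replicate the sign bit into bits 31..16
    return imm16                    # upper 16 bits stay zero


def ori_exec(bus_a, imm16):
    """ALU output <- busA or ZeroExt[Imm16] (ExtOp = 0, ALUOp = Or)."""
    return (bus_a | extend(imm16, 0)) & MASK32


print(hex(extend(0x8001, 0)), hex(extend(0x8001, 1)))
print(hex(ori_exec(0xF0000000, 0x8001)))
```

Zero extension matters here: sign-extending 0x8001 would set the upper 16 bits of the result regardless of busA.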

Ori Completion
° Reg[rt] <- ALU output
° State OriFinish:
• ALUOp=Or
• x: IorD, PCSrc
• ALUSelB=11
• 1: ALUSelA, RegWr
[Figure: multiple cycle datapath during OriFinish with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=0, RegWr=1, MemtoReg=0, PCSrc=x, BrWr=0, ExtOp=0, ALUSelA=1, ALUSelB=11, ALUOp=Or]

Memory Address Calculation
° ALU output <- busA + SignExt[Imm16]
° State AdrCal:
• 1: ExtOp, ALUSelA
• ALUSelB=11
• ALUOp=Add
• x: MemtoReg, PCSrc
[Figure: multiple cycle datapath during AdrCal with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=x, RegWr=0, MemtoReg=x, PCSrc=x, BrWr=0, ExtOp=1, ALUSelA=1, ALUSelB=11, ALUOp=Add. The extracted figure text shows RegWr=1, which contradicts the AdrCal state settings, where RegWr is not among the asserted signals]

Memory Access for Store
° mem[ALU output] <- busB
° State SWmem:
• 1: ExtOp, MemWr, ALUSelA
• ALUSelB=11
• ALUOp=Add
• x: PCSrc, RegDst, MemtoReg
[Figure: multiple cycle datapath during SWmem with PCWr=0, PCWrCond=0, IorD=x, MemWr=1, IRWr=0, RegDst=x, RegWr=0, MemtoReg=x, PCSrc=x, BrWr=0, ExtOp=1, ALUSelA=1, ALUSelB=11, ALUOp=Add]

Memory Access for Load
° Mem Dout <- mem[ALU output]
° State LWmem:
• 1: ExtOp, ALUSelA, IorD
• ALUSelB=11
• ALUOp=Add
• x: MemtoReg, PCSrc
[Figure: multiple cycle datapath during LWmem with PCWr=0, PCWrCond=0, IorD=1, MemWr=0, IRWr=0, RegDst=0, RegWr=0, MemtoReg=x, PCSrc=x, BrWr=0, ExtOp=1, ALUSelA=1, ALUSelB=11, ALUOp=Add]

Write Back for Load
° Reg[rt] <- Mem Dout
° State LWwr:
• 1: ALUSelA, RegWr, ExtOp, MemtoReg
• ALUSelB=11
• ALUOp=Add
• x: PCSrc, IorD
[Figure: multiple cycle datapath during LWwr with PCWr=0, PCWrCond=0, IorD=x, MemWr=0, IRWr=0, RegDst=0, RegWr=1, MemtoReg=1, PCSrc=x, BrWr=0, ExtOp=1, ALUSelA=1, ALUSelB=11, ALUOp=Add. The extracted figure text shows RegWr=0, which contradicts the LWwr state settings, where RegWr is asserted]
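The three load-specific steps (address calculation, memory access, write back) can be strung together in a short Python sketch. The dictionary-based register file and memory and the function name are illustrative assumptions, not the lecture's notation.

```python
MASK32 = 0xFFFFFFFF


def load_word(regs, memory, rs, rt, imm16):
    """Model the AdrCal, LWmem, and LWwr steps of lw rt, imm16(rs).

    regs:   dict mapping register number -> 32-bit value
    memory: dict mapping byte address -> 32-bit word
    Returns a new register dict; the inputs are left unmodified.
    """
    # AdrCal: ALU output <- busA + SignExt[Imm16]
    imm16 &= 0xFFFF
    offset = imm16 - 0x10000 if imm16 & 0x8000 else imm16
    address = (regs[rs] + offset) & MASK32

    # LWmem: Mem Dout <- mem[ALU output]   (IorD = 1)
    mem_dout = memory[address]

    # LWwr: Reg[rt] <- Mem Dout            (RegWr = 1, MemtoReg = 1, RegDst = 0)
    new_regs = dict(regs)
    new_regs[rt] = mem_dout
    return new_regs


regs = {1: 0x100, 2: 0}
memory = {0xFC: 0xDEADBEEF}
result = load_word(regs, memory, rs=1, rt=2, imm16=0xFFFC)  # offset of -4
print(hex(result[2]))
```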

Putting it all together: Multiple Cycle Datapath
[Figure: complete multiple cycle datapath with PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, Target register, ALU with Zero output, and ALU Control. Control signals: PCWr, PCWrCond, IorD, MemWr, IRWr, RegDst, RegWr, MemtoReg, ExtOp, BrWr, PCSrc, ALUSelA, ALUSelB, ALUOp]

Putting it all together: Control State Diagram
° Ifetch: ALUOp=Add; 1: PCWr, IRWr; x: PCWrCond, RegDst, Mem2R; Others: 0s
° Rfetch/Decode: ALUOp=Add; 1: BrWr, ExtOp; ALUSelB=10; x: RegDst, PCSrc, IorD, MemtoReg; Others: 0s
• beq: go to BrComplete
• Rtype: go to RExec
• Ori: go to OriExec
• lw or sw: go to AdrCal
° BrComplete: ALUOp=Sub; ALUSelB=01; x: IorD, Mem2Reg, RegDst, ExtOp; 1: PCWrCond, ALUSelA, PCSrc
° RExec: 1: RegDst, ALUSelA; ALUSelB=01; ALUOp=Rtype; x: PCSrc, IorD, MemtoReg, ExtOp
° Rfinish: ALUOp=Rtype; 1: RegDst, RegWr, ALUSelA; ALUSelB=01; x: IorD, PCSrc, ExtOp
° OriExec: ALUOp=Or; 1: ALUSelA; ALUSelB=11; x: MemtoReg, IorD, PCSrc
° OriFinish: ALUOp=Or; x: IorD, PCSrc; ALUSelB=11; 1: ALUSelA, RegWr
° AdrCal: 1: ExtOp, ALUSelA; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc
• lw: go to LWmem
• sw: go to SWmem
° LWmem: 1: ExtOp, ALUSelA, IorD; ALUSelB=11; ALUOp=Add; x: MemtoReg, PCSrc
° LWwr: 1: ALUSelA, RegWr, ExtOp, MemtoReg; ALUSelB=11; ALUOp=Add; x: PCSrc, IorD
° SWmem: 1: ExtOp, MemWr, ALUSelA; ALUSelB=11; ALUOp=Add; x: PCSrc, RegDst, MemtoReg
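The state diagram maps naturally onto a transition table. The Python sketch below is illustrative only: the state names follow the slides, the opcode labels are lowercase strings chosen here, and it assumes every instruction starts in Ifetch and that the last state of each instruction is followed by the next instruction's Ifetch.

```python
# Unconditional transitions are plain strings; transitions that depend on
# the decoded instruction are dicts keyed by instruction class.
TRANSITIONS = {
    "Ifetch": "Rfetch/Decode",
    "Rfetch/Decode": {
        "beq": "BrComplete",
        "rtype": "RExec",
        "ori": "OriExec",
        "lw": "AdrCal",
        "sw": "AdrCal",
    },
    "AdrCal": {"lw": "LWmem", "sw": "SWmem"},
    "LWmem": "LWwr",
    "RExec": "Rfinish",
    "OriExec": "OriFinish",
}

# States after which control returns to Ifetch (assumed, see lead-in).
FINAL_STATES = {"BrComplete", "Rfinish", "OriFinish", "SWmem", "LWwr"}


def state_sequence(op):
    """List the control states one instruction of class op passes through."""
    state = "Ifetch"
    sequence = [state]
    while state not in FINAL_STATES:
        nxt = TRANSITIONS[state]
        state = nxt if isinstance(nxt, str) else nxt[op]
        sequence.append(state)
    return sequence


for op in ("lw", "sw", "rtype", "ori", "beq"):
    seq = state_sequence(op)
    print(f"{op}: {len(seq)} cycles: {' -> '.join(seq)}")
```

The length of each sequence is the cycle count of that instruction, which reproduces the five cycles the lecture quotes for load.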

Summary
° Disadvantages of the Single Cycle Processor
• Long cycle time
• Cycle time is too long for all instructions except the Load
° Multiple Cycle Processor:
• Divide the instructions into smaller steps
• Execute each step (instead of the entire instruction) in one cycle
° Do NOT confuse Multiple Cycle Processor with Multiple Cycle Delay Path
• Multiple Cycle Processor executes each instruction in multiple clock cycles
• Multiple Cycle Delay Path: a combinational logic path between two storage elements that takes more than one clock cycle to complete
° It is possible (desirable) to build a MC Processor without MCDP:
• Use a register to save a signal’s value whenever a signal is generated in one clock cycle and used in another cycle later