The Single Cycle Datapath

Note: Some of the material in this lecture is COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED. Figures may be reproduced only for classroom or personal education use in conjunction with our text and only when the above line is included.

2/6/02, CSE 141 - Single Cycle Datapath

The Performance Big Picture
• Execution Time = Insts * CPI * Cycle Time
• Processor design (datapath and control) will determine:
  – Clock cycle time
  – Clock cycles per instruction
• Starting today:
  – Single cycle processor: execute an entire instruction in a single clock cycle
    • Advantage: CPI = 1
    • Disadvantage: long cycle time
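
The execution time formula above can be tried out with a few lines of Python. This is only a sketch: the instruction count, CPI, and cycle times are hypothetical numbers chosen for illustration, and the "alternative" design is an assumed design with a shorter cycle and a higher CPI, not a machine described in this lecture.

```python
def execution_time_ns(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution Time = Insts * CPI * Cycle Time (result in nanoseconds)."""
    return instructions * cpi * cycle_time_ns


# Hypothetical numbers, chosen only to show the trade-off.
# Single cycle design: CPI = 1, but a long clock cycle.
single_cycle = execution_time_ns(1_000_000, cpi=1.0, cycle_time_ns=8.0)

# Assumed alternative design: shorter clock cycle, more cycles per instruction.
alternative = execution_time_ns(1_000_000, cpi=4.0, cycle_time_ns=2.5)

print(single_cycle, alternative)
```

The point of the comparison is that CPI = 1 alone does not decide which design is faster; the product of all three terms does.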

Processor Design
• We're ready to implement the MIPS "core"
  – load-store instructions: lw, sw
  – reg-reg instructions: add, sub, and, or, slt
  – control flow instructions: beq
• First, we need to fetch an instruction into processor
  – program counter (PC) supplies instruction address
  – get the instruction from memory
[Figure: memory block with Write Enable, Address (driven by the PC), Data In (32 bits), Data Out (32 bits), and Clk]

Processor Design (continued)
[Figure: the same memory block with Write Enable set to 0; the PC drives Address and the instruction appears on Data Out (32 bits)]

That was too easy
• A problem: how will we do a load or store?
  – remember that memory has only 1 port
  – and we want to do everything in 1 cycle
[Figure: single-port memory with Write Enable set to 0; the PC drives Address and the instruction appears on Data Out (32 bits)]

Instruction & Data in same cycle?
Solution: separate data and instruction memory
• There will be only one DRAM memory
  – We want a stored program architecture
  – How else can you compile and then run a program?
• But we can have separate SRAM caches (we'll study caches later)
[Figure: the PC supplies the address to the instruction cache, where the instruction appears; a separate data cache has Write Enable, Address, Data In (32 bits), Data Out (32 bits), and Clk]

Instruction Fetch Unit
Updating the PC for next instruction
  – Sequential Code: PC <- PC + 4
  – Branch and Jump: PC <- "something else"
    • we'll worry about these later

The MIPS core subset
• R-type
  – add rd, rs, rt
  – sub, and, or, slt
    bits:   31-26   25-21   20-16   15-11   10-6    5-0
    field:  op      rs      rt      rd      shamt   funct
    width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
  1. Read registers rs and rt
  2. Feed them to ALU
  3. Update register file
• LOAD and STORE
  – lw rt, rs, imm
  – sw rt, rs, imm
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits
  1. Read register rs (and rt for store)
  2. Feed rs and immed to ALU
  3. Move data between mem and reg
• BRANCH:
  – beq rs, rt, imm
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      displacement
    width:  6 bits  5 bits  5 bits  16 bits
  1. Read registers rs and rt
  2. Feed to ALU to compare
  3. Add PC to disp; update PC
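
The field layouts above map directly onto shifts and masks. The following Python sketch extracts every field from a 32-bit instruction word; the `Fields` type and `decode` function are names made up for this illustration, and the two example words are standard MIPS encodings of an add and a lw.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    """All fields of a 32-bit MIPS instruction word (hypothetical helper type)."""
    op: int      # bits 31-26
    rs: int      # bits 25-21
    rt: int      # bits 20-16
    rd: int      # bits 15-11 (R-type only)
    shamt: int   # bits 10-6  (R-type only)
    funct: int   # bits 5-0   (R-type only)
    imm16: int   # bits 15-0  (load, store, branch)


def decode(instr: int) -> Fields:
    """Slice a 32-bit word into fields; unused fields are simply ignored later."""
    return Fields(
        op=(instr >> 26) & 0x3F,
        rs=(instr >> 21) & 0x1F,
        rt=(instr >> 16) & 0x1F,
        rd=(instr >> 11) & 0x1F,
        shamt=(instr >> 6) & 0x1F,
        funct=instr & 0x3F,
        imm16=instr & 0xFFFF,
    )


add_fields = decode(0x012A4020)  # add with rd=8, rs=9, rt=10
lw_fields = decode(0x8D280004)   # lw with rt=8, rs=9, imm=4
print(add_fields)
print(lw_fields)
```

Decoding every field unconditionally mirrors what the hardware does: the wires for rs, rt, rd, and the immediate are always there, and control decides which ones matter.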

Processor Design
• Generic Implementation:
  – all instructions read some registers
  – all instructions use the ALU after reading registers
  – memory accessed & registers updated after ALU
• Suggests basic design:
[Figure: basic design]

Datapath for Reg-Reg Operations
• R[rd] <- R[rs] op R[rt]    Example: add rd, rs, rt
  – Ra, Rb, and Rw come from rs, rt, and rd fields
  – ALUoperation signal depends on op and funct
    bits:   31-26   25-21   20-16   15-11   10-6    5-0
    field:  op      rs      rt      rd      shamt   funct
    width:  6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
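
The register transfer R[rd] <- R[rs] op R[rt] can be modeled in software as a read of two registers, an ALU operation, and a write back. This is a minimal sketch, not the hardware itself: the function names, the list used as a register file, and the string-valued operation selector are all assumptions made for illustration. The guard on register 0 reflects the MIPS convention that register 0 always reads as zero.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def alu(operation: str, a: int, b: int) -> int:
    """The five ALU operations needed by add, sub, and, or, slt."""
    if operation == "add":
        return (a + b) & MASK32
    if operation == "sub":
        return (a - b) & MASK32
    if operation == "and":
        return a & b
    if operation == "or":
        return a | b
    if operation == "slt":
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError(f"unknown ALU operation: {operation}")


def exec_rtype(regs: list, operation: str, rd: int, rs: int, rt: int) -> None:
    """R[rd] <- R[rs] op R[rt]."""
    bus_a = regs[rs]           # Ra comes from the rs field
    bus_b = regs[rt]           # Rb comes from the rt field
    result = alu(operation, bus_a, bus_b)
    if rd != 0:                # Rw comes from rd; register 0 stays zero in MIPS
        regs[rd] = result


regs = [0] * 32
regs[9], regs[10] = 5, 7
exec_rtype(regs, "add", rd=8, rs=9, rt=10)
print(regs[8])
```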

Datapath for Load Operations
R[rt] <- Mem[R[rs] + SignExt[imm16]]    Example: lw rt, rs, imm16
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits

Datapath for Store Operations
Mem[R[rs] + SignExt[imm16]] <- R[rt]    Example: sw rt, rs, imm16
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits
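
Loads and stores share the same address calculation, R[rs] + SignExt[imm16], and differ only in the direction of the data transfer. The sketch below illustrates that in Python. The names are hypothetical, and the data memory is modeled as a dictionary keyed by byte address holding whole words, which is a simplifying assumption rather than a description of the real memory.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16: int) -> int:
    """SignExt[imm16]: treat the low 16 bits as a two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(regs: list, rs: int, imm16: int) -> int:
    """The ALU adds R[rs] and the sign-extended immediate."""
    return (regs[rs] + sign_ext16(imm16)) & MASK32


def load_word(regs: list, mem: dict, rt: int, rs: int, imm16: int) -> None:
    """R[rt] <- Mem[R[rs] + SignExt[imm16]] (unwritten locations read as 0 here)."""
    regs[rt] = mem.get(effective_address(regs, rs, imm16), 0)


def store_word(regs: list, mem: dict, rt: int, rs: int, imm16: int) -> None:
    """Mem[R[rs] + SignExt[imm16]] <- R[rt]."""
    mem[effective_address(regs, rs, imm16)] = regs[rt]


regs = [0] * 32
mem = {}
regs[9] = 0x100              # base address
regs[8] = 0xDEADBEEF         # value to store
store_word(regs, mem, rt=8, rs=9, imm16=0xFFFC)   # offset -4, so address 0xFC
load_word(regs, mem, rt=10, rs=9, imm16=0xFFFC)
print(hex(regs[10]))
```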

Combining datapaths
• How do we allow different datapaths for different instructions?
[Figure: the R-type datapath and the Store datapath]

Combining datapaths
• How do we allow different datapaths for different instructions?
• Use a multiplexor!
[Figure: combined datapath with a multiplexor controlled by ALUsrc]
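
A two-input multiplexor is just a selection between two values under a control signal. The Python sketch below shows the idea for the second ALU operand. Which select value picks which input is an assumption here: it is chosen so that ALUsrc=1 selects the register operand, matching the setting this lecture gives for the R-format datapath. The function names are hypothetical.

```python
def mux2(select: int, in0: int, in1: int) -> int:
    """Two-input multiplexor: output in1 when select is 1, otherwise in0."""
    return in1 if select else in0


def alu_operand_b(alu_src: int, bus_b: int, imm_ext: int) -> int:
    """Pick the second ALU input.

    Assumed polarity: alu_src=1 selects the register value (R-type),
    alu_src=0 selects the sign-extended immediate (load and store).
    """
    return mux2(alu_src, imm_ext, bus_b)


print(alu_operand_b(1, bus_b=7, imm_ext=100))   # register operand
print(alu_operand_b(0, bus_b=7, imm_ext=100))   # immediate operand
```

With the multiplexor in place, one ALU serves both instruction classes, and the only thing that changes between them is a single control bit.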

Datapath for Branch Operations
beq rs, rt, imm16
We need to compare Rs and Rt
    bits:   31-26   25-21   20-16   15-0
    field:  op      rs      rt      immediate
    width:  6 bits  5 bits  5 bits  16 bits

Computing the Next Address
• PC is a 32-bit byte address into the instruction memory:
  – Sequential operation: PC<31:0> = PC<31:0> + 4
  – Branch: PC<31:0> = PC<31:0> + 4 + SignExt[Imm16] * 4
• We don't need the 2 least-significant bits because:
  – The 32-bit PC is a byte address
  – And all our instructions are 4 bytes (32 bits) long
  – The 2 LSB's of the 32-bit PC are always zeros
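
The two next-address cases above can be written as one small function. This Python sketch assumes the branch decision (whether R[rs] equals R[rt] for beq) has already been made and is passed in as a flag; the function and parameter names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16: int) -> int:
    """SignExt[Imm16]: treat the low 16 bits as a two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc(pc: int, imm16: int, branch_taken: bool) -> int:
    """Sequential: PC + 4.  Taken branch: PC + 4 + SignExt[Imm16] * 4."""
    pc_plus_4 = (pc + 4) & MASK32
    if branch_taken:
        return (pc_plus_4 + sign_ext16(imm16) * 4) & MASK32
    return pc_plus_4


print(hex(next_pc(0x1000, 0, branch_taken=False)))      # sequential
print(hex(next_pc(0x1000, 3, branch_taken=True)))       # forward branch
print(hex(next_pc(0x1000, 0xFFFF, branch_taken=True)))  # displacement of -1
```

Because the displacement is multiplied by 4 and PC + 4 is a multiple of 4, every result keeps its two least-significant bits at zero, which is the property the slide points out.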

All together: the single cycle datapath
[Figure: the complete single cycle datapath]

The R-Format (e.g. add) Datapath
Need ALUsrc=1, ALUop="add", MemWrite=0, MemToReg=0, RegDst=0, RegWrite=1 and PCsrc=1.

The Load Datapath
What control signals do we need for load?
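
One way to reason about the question is to start from the R-format (add) settings given in this lecture and flip only the signals whose multiplexor must choose the other input for a load. The Python sketch below does exactly that. The resulting load settings are an inference, not values stated in the lecture: they assume each select signal has the polarity implied by the add settings (ALUsrc=1 picks the register operand, MemToReg=0 picks the ALU result, RegDst=0 picks rd, PCsrc=1 picks PC + 4). The `Control` type and constant names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Control:
    """One setting of the control signals (hypothetical container type)."""
    alu_src: int
    alu_op: str
    mem_write: int
    mem_to_reg: int
    reg_dst: int
    reg_write: int
    pc_src: int


# Values stated in the lecture for the R-format (add) datapath.
R_FORMAT_ADD = Control(alu_src=1, alu_op="add", mem_write=0,
                       mem_to_reg=0, reg_dst=0, reg_write=1, pc_src=1)

# Assumed settings for lw, derived by flipping three multiplexor selects:
#   alu_src=0     second ALU input is the sign-extended immediate
#   mem_to_reg=1  register write data comes from data memory
#   reg_dst=1     destination register is rt instead of rd
LOAD_ASSUMED = replace(R_FORMAT_ADD, alu_src=0, mem_to_reg=1, reg_dst=1)

print(LOAD_ASSUMED)
```

Everything else stays the same: the ALU still adds, memory is not written, the register file is written, and the PC advances sequentially.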

The Store Datapath

The beq Datapath

Key Points
• CPU is just a collection of state and combinational logic
• We just designed a very rich processor, at least in terms of functionality
• Execution time = Insts * CPI * Cycle Time
  – where does the single-cycle machine fit in?

Computer of the Day
• The IBM 1620 (1959)
  – A 2nd generation computer: transistors & core storage (first generation ones used tubes and delay-based memory)
  – Example of creative architecture
  – ~2000 built. Relatively inexpensive (< $1620/month rental)
• A decimal computer
  – 6 bits per digit or character
    • 4 bits, flag (for +/- and end-of-word), ECC
  – Variable-length data: fields terminated by flag
• Arithmetic by table lookup!
• Codenamed CADET
  – "Can't Add, Doesn't Even Try"