Start with a calculator…
  • Integer calculator
      • 6 bits
  • Values 0 to 63
  • Operations: +, -, complement, bitwise logical operations
      • A+B, A-B, A & B, A | B, A = B, ~B (bitwise complement), -B (same as 0 - B), A XOR B
CS 150 – Spring 2008 – Lec #12: Computer Org I - 1

Calculator Layout
[Figure: ALU block with 6-bit inputs A and B, an Operation select input, and a 6-bit output C]

Calculator Bit Design
  • Dominated by addition
[Figure: half-adder]

Question: How do we subtract?
  • Tied up in: how do we represent negative numbers?
  • Option 1: “Sign-and-Magnitude”
      • High-order bit represents sign
      • Low-order 5 bits represent magnitude
      • -7 = 100111
      • Key drawback: need a separate circuit for subtraction!
          • (How would you do it?)
  • Option 2: “Two’s Complement”
      • Take advantage of the fact that 64 = 0 (in a 6-bit machine)
      • Therefore: 64 - x = -x
      • -7 = 64 - 7 = 57 = 111001 (notice 7 = 000111)
      • Key: we can subtract by adding!

Question: How do we compute two’s complement?
  • Positive numbers: 000001 … 011111 (1 to 31)
  • Negative numbers: 111111 … 100001 (-1 to -31)
  • Notice: 111111 = -1
  • “One’s Complement of A”: flip each bit of A = ~A
      • One’s complement of 7 (000111) = 111000
  • Notice: A + ~A = 111111 (-1)
      • XOR of each bit is 1
      • AND of each bit is 0
  • Therefore: -A = ~A + 1
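
The rule -A = ~A + 1 can be tried out in a few lines. This is a minimal sketch for the 6-bit calculator, not part of the lecture; the helper names (`ones_complement`, `negate`, `subtract`, `to_signed`) are made up for illustration.

```python
BITS = 6                      # word size used in the slides
MASK = (1 << BITS) - 1        # 0b111111

def ones_complement(a):
    """Flip every bit of a 6-bit pattern."""
    return ~a & MASK

def negate(a):
    """Two's complement negation: -A = ~A + 1 (mod 64)."""
    return (ones_complement(a) + 1) & MASK

def subtract(a, b):
    """Subtract by adding the two's complement of b."""
    return (a + negate(b)) & MASK

def to_signed(pattern):
    """Interpret a 6-bit pattern as a signed value."""
    if pattern & (1 << (BITS - 1)):
        return pattern - (1 << BITS)
    return pattern

print(format(negate(7), "06b"))      # 111001
print(to_signed(subtract(3, 7)))     # -4
```

Masking with `MASK` after every operation plays the role of the fixed 6-bit register width: anything carried out of bit 5 is simply dropped.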

Operations, Revisited
  • ADD, OR, AND, COMPLEMENT, EQUALITY, XOR
  • Negation and subtraction implemented by manipulating inputs
[Figure: per-bit design, with carry-in to bit 0]
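
One way to read "manipulating inputs" is that subtraction reuses the adder with B inverted and the carry-in to bit 0 set to 1, and negation does the same with the A input forced to 0. The sketch below models that at the word level. The function name `alu`, the string operation names, and the choice to report equality as 1 or 0 are assumptions for illustration, not the lecture's design.

```python
BITS = 6
MASK = (1 << BITS) - 1

def alu(op, a, b):
    """Word-level model of the 6-bit calculator ALU (hypothetical op names)."""
    if op == "add":
        return (a + b) & MASK
    if op == "sub":
        # Reuse the adder: invert B and set carry-in to bit 0.
        return (a + (~b & MASK) + 1) & MASK
    if op == "neg":
        # -B is 0 - B: force the A input to 0, invert B, carry-in = 1.
        return (0 + (~b & MASK) + 1) & MASK
    if op == "not":
        return ~b & MASK
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    if op == "xor":
        return a ^ b
    if op == "eq":
        return int(a == b)   # assumption: equality reported as 1 or 0
    raise ValueError("unknown operation: " + op)

print(alu("sub", 10, 7))     # 3
print(alu("neg", 0, 7))      # 57, the 6-bit pattern for -7
```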

Problem: Overflow
  • Consider 31 + 31 = 62 = 64 - 2 = -2!
      • Not what we want!
      • X + Y >= 32 (or X + Y <= -32) yields incorrect results
      • Numbers >= 32 or <= -32 can’t be represented
  • Key: detect when it happens and flag it
      • Exception: “Overflow”
  • How to detect it?
  • Observation I:
      • Can only happen when two positive or two negative numbers are added together
      • High-order bits of operands are equal
  • Observation II:
      • Happens only on an Add
  • Observation III:
      • High-order bit of output != high-order bit of inputs
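
Observations I and III combine into a simple test on the high-order bits. The following is a minimal sketch under the 6-bit assumption; `add_with_overflow` is a hypothetical helper name.

```python
BITS = 6
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)        # high-order bit, 0b100000

def add_with_overflow(a, b):
    """Add two 6-bit patterns; return (result, overflow flag)."""
    result = (a + b) & MASK
    # Observation I: operands must have equal high-order bits.
    same_sign_inputs = ((a ^ b) & SIGN) == 0
    # Observation III: the output's high-order bit differs from the inputs'.
    sign_changed = ((a ^ result) & SIGN) != 0
    return result, same_sign_inputs and sign_changed

print(add_with_overflow(31, 31))   # (62, True): 31 + 31 does not fit
print(add_with_overflow(57, 7))    # (0, False): -7 + 7 is fine
```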

Overflow Circuit
[Figure: overflow detection circuit]
Also add:
  • Test for negative (high-order bit is 1, “N”)
  • Test for zero (all bits are 0, “Z”)

Adding multiplication
  • Combinational circuit is big! (see HW#1)
      • Roughly N^2 elements for NxN multiplication
  • So we do it sequentially… just use the algorithm we all learned in school
      • Partial result = 0, multiplier = A, multiplicand = B
      • For I = 1 to Nbits-1
          • if lsb of multiplier is 1, result = result + multiplicand
          • Multiplicand = multiplicand << 1
          • Multiplier = multiplier >> 1
  • What about sign?
      • Simple algorithm (not optimal!)
          • S = sign(A x B)
          • C = |A| x |B|
          • if (S) result = -C else result = C
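
The shift-and-add loop and the simple sign handling above can be sketched as follows. This is an illustration only: it works on ordinary Python integers, assumes the operands lie in -31..31 so that five magnitude bits suffice, and does not truncate the product to 6 bits. The name `shift_add_multiply` is hypothetical.

```python
def shift_add_multiply(a, b, nbits=6):
    """Sequential multiply: one conditional add and two shifts per step."""
    negative = (a < 0) != (b < 0)          # S = sign(A x B)
    multiplier, multiplicand = abs(a), abs(b)
    result = 0                             # partial result
    for _ in range(nbits - 1):             # one step per magnitude bit
        if multiplier & 1:                 # test lsb of multiplier
            result += multiplicand
        multiplicand <<= 1
        multiplier >>= 1
    return -result if negative else result

print(shift_add_multiply(5, 6))     # 30
print(shift_add_multiply(-7, 3))    # -21
```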

What do we need?
  • Minor:
      • Shifter (too easy to spend class time on!)
      • Test lsb for 1/0 (just run an extra wire from low-order input to ALU)
  • Major:
      • Place to store partial results
      • Place to store sign of result
      • Actions dependent on values
      • Program for multiplication algorithm
      • These are easy, but change the character of the device!

Multiplying Calculator Layout
[Figure: calculator ALU with 6-bit inputs A and B, Operation select, and 6-bit output C, extended with Registers and a Control FSM]

Control FSM: Timing is Key
  • Basically the same program we saw before, BUT
      • Need to keep track of timing!
  • Compute sign of result
  • Load multiplier with |A|
  • Load multiplicand with |B|
  • Load partial result with 0
  • If lsb of multiplier is 1
      • Add multiplicand to partial result
  • Shift multiplier right
  • Shift multiplicand left
  • Negate result if needed
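
To make the timing aspect concrete, the same multiplication can be written as an explicit state machine that does one action per clock cycle. The state names (INIT, TEST, ADD, SHIFT, NEGATE, DONE), the step counter, and the function name `multiply_fsm` are assumptions chosen for this sketch; the lecture's actual state assignment may differ.

```python
def multiply_fsm(a, b, nbits=6):
    """Run the multiply control FSM; return (product, clock cycles used)."""
    state, cycles = "INIT", 0
    sign = False
    multiplier = multiplicand = partial = count = 0
    while state != "DONE":
        cycles += 1                        # one state per clock cycle
        if state == "INIT":
            sign = (a < 0) != (b < 0)      # compute sign of result
            multiplier, multiplicand = abs(a), abs(b)
            partial, count = 0, nbits - 1
            state = "TEST"
        elif state == "TEST":              # action depends on a data condition
            state = "ADD" if multiplier & 1 else "SHIFT"
        elif state == "ADD":
            partial += multiplicand
            state = "SHIFT"
        elif state == "SHIFT":
            multiplier >>= 1
            multiplicand <<= 1
            count -= 1
            state = "TEST" if count else "NEGATE"
        elif state == "NEGATE":
            if sign:
                partial = -partial
            state = "DONE"
    return partial, cycles

print(multiply_fsm(5, 6))    # (30, 14)
```

In this sketch the cycle count depends on how many multiplier bits are 1, which is one reason the controller has to track timing rather than assume a fixed latency.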

By this time, we’re getting close to something else…

What do we have?
  • Control program in FSM that sequences instructions
  • Storage elements that are loaded in response to data conditions
  • Actions taken, or not, in response to conditions
      • This is better known as a computer
  • All we have to add is memory

Computer Organization
  • Computer design as an application of digital logic design procedures
  • Computer = processing unit + memory system
  • Processing unit = control + datapath
  • Control = finite state machine
      • Inputs = machine instruction, datapath conditions
      • Outputs = register transfer control signals, ALU operation codes
      • Instruction interpretation = instruction fetch, decode, execute
  • Datapath = functional units + registers
      • Functional units = ALU, multipliers, dividers, etc.
      • Registers = program counter, …

Tri-State Buffers
  • 0, 1, Z (high impedance state)
  • if OE then Out = In else “disconnected”
[Figure: basic inverter and inverting tri-state buffer, with inputs in and OE and output out]

Tri-States vs. Mux
  • Tri-state buffers: buffer circuits simple! Scales nicely for high fan-in and wide bit widths!
  • 2:1 Mux: scales poorly for high fan-in or wide bit widths
[Figure: inputs A and B selected by Sel, once through tri-state buffers and once through a 2:1 mux]

Register Transfer
  • C ← A: Sel ← 0; Ld ← 1
  • C ← B: Sel ← 1; Ld ← 1
[Figure: bus connecting A and B (selected by Sel) to C (loaded by Ld), with a timing diagram of Clk, Sel, and Ld showing A on Bus, B on Bus, and Ld C from Bus]
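
A small behavioral model may help connect tri-state buffers to the register transfer. It is a sketch under stated assumptions: `None` stands for the high-impedance state Z, and the helper names `bus_value` and `transfer_to_c` are invented here.

```python
def bus_value(drivers):
    """Resolve a shared bus from (output_enable, value) pairs.

    Returns None when no driver is enabled (the bus floats at Z).
    """
    active = [value for enabled, value in drivers if enabled]
    if not active:
        return None
    if len(set(active)) > 1:
        raise ValueError("bus contention: two drivers disagree")
    return active[0]

def transfer_to_c(a, b, sel, ld, c):
    """One clock edge: Sel picks which register drives the bus, Ld loads C."""
    bus = bus_value([(sel == 0, a), (sel == 1, b)])
    return bus if ld else c

print(transfer_to_c(a=5, b=9, sel=0, ld=1, c=0))   # 5, i.e. C <- A
print(transfer_to_c(a=5, b=9, sel=1, ld=1, c=0))   # 9, i.e. C <- B
```

Because Sel enables exactly one driver at a time, the contention case cannot arise in this two-register example; it is included to show what the output enables are protecting against.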

Open Collector Concept
  • Driving “1” and “0” onto the same wire: bad! Short circuit! Low-resistance path from Vdd to Gnd
  • Resistive pull-up
      • Default is high
      • Must actively drive it low
  • Wired AND configuration:
      • If any attached device wants wire to be “0”, it wins
      • If all attached devices want wire to be “1”, it is

Structure of a Computer
  • Block diagram view
  • Instruction unit – instruction fetch and interpretation FSM
  • Execution unit – functional units and registers
[Figure: Processor (central processing unit, CPU), made up of Control and Data Path, connected to the Memory System by address, read/write, and data lines; control signals go from Control to Data Path, and data conditions come back]

Registers
  • Selectively loaded – EN or LD input
  • Output enable – OE input
  • Multiple registers – group 4 or 8 in parallel
  • OE asserted causes FF state to be connected to output pins; otherwise they are left unconnected (high impedance)
  • LD asserted during a lo-to-hi clock transition loads new data into FFs
[Figure: 8-bit register with inputs D7–D0, outputs Q7–Q0, and control inputs LD, OE, and CLK]

Register Transfer
  • Point-to-point connection
      • Dedicated wires
      • Muxes on inputs of each register
  • Common input from multiplexer
      • Load enables for each register
      • Control signals for multiplexer
  • Common bus with output enables
      • Output enables and load enables for each register
[Figure: registers rs, rt, rd, and R4 connected three ways: per-register MUXes, a single shared MUX, and a shared BUS]

Register Files
  • Collections of registers in one package
      • Two-dimensional array of FFs
      • Address used as index to a particular word
      • Separate read and write addresses, so can do both at the same time
  • 4 by 4 register file
      • 16 D-FFs
      • Organized as four words of four bits each
      • Write-enable (load)
      • Read-enable (output enable)
[Figure: register file with control inputs RE, RB, RA, WE, WB, WA, data inputs D3–D0, and data outputs Q]
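
The 4 by 4 register file can be modeled in a few lines. This is a behavioral sketch only: the class and method names are hypothetical, the two-bit addresses RB/RA and WB/WA are collapsed into integer indices, and `None` stands in for a high-impedance output.

```python
class RegisterFile:
    """Behavioral model of a small register file (default 4 words x 4 bits)."""

    def __init__(self, words=4, width=4):
        self.mask = (1 << width) - 1
        self.words = [0] * words

    def read(self, read_addr, read_enable=True):
        """Output the addressed word, or None (high impedance) if RE is low."""
        return self.words[read_addr] if read_enable else None

    def clock_edge(self, write_addr, data, write_enable):
        """On a clock edge, load the addressed word only if WE is asserted."""
        if write_enable:
            self.words[write_addr] = data & self.mask

rf = RegisterFile()
rf.clock_edge(write_addr=2, data=0b1011, write_enable=True)
print(rf.read(2))                       # 11
print(rf.read(2, read_enable=False))    # None
```

Keeping the read and write addresses as separate arguments mirrors the point in the slide: one word can be read while a different word is being written.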

Memories
  • Larger collections of storage elements
      • Implemented not as FFs but as much more efficient latches
      • High-density memories use 1 to 5 switches (transistors) per bit
  • Static RAM – 1024 words each 4 bits wide
      • Once written, memory holds forever (not true for denser dynamic RAM)
      • Address lines to select word (10 lines for 1024 words)
      • Read enable
          • Same as output enable
          • Often called chip select
          • Permits connection of many chips into larger array
      • Write enable (same as load enable)
      • Bi-directional data lines
          • Output when reading, input when writing
[Figure: SRAM with control inputs RD and WR, address inputs A9–A0, and data pins IO3–IO0]

Instruction Sequencing
  • Example – an instruction to add the contents of two registers (Rx and Ry) and place result in a third register (Rz)
  • Step 1: Get the ADD instruction from memory into an instruction register
  • Step 2: Decode instruction
      • Instruction in IR has the code of an ADD instruction
      • Register indices used to generate output enables for registers Rx and Ry
      • Register index used to generate load signal for register Rz
  • Step 3: Execute instruction
      • Enable Rx and Ry output and direct to ALU
      • Set up ALU to perform ADD operation
      • Direct result to Rz so that it can be loaded into register

Instruction Types
  • Data manipulation
      • Add, subtract
      • Increment, decrement
      • Multiply
      • Shift, rotate
      • Immediate operands
  • Data staging
      • Load/store data to/from memory
      • Register-to-register move
  • Control
      • Conditional/unconditional branches in program flow
      • Subroutine call and return

Elements of the Control Unit (aka Instruction Unit)
  • Standard FSM elements
      • State register
      • Next-state logic
      • Output logic (datapath/control signaling)
      • Moore or synchronous Mealy machine to avoid loops unbroken by FF
  • Plus additional “control” registers
      • Instruction register (IR)
      • Program counter (PC)
  • Inputs/outputs
      • Outputs control elements of data path
      • Inputs from data path used to alter flow of program (test if zero)

Instruction Execution
  • Control State Diagram (for each diagram)
      • Reset
      • Fetch instruction
      • Decode
      • Execute
  • Instructions partitioned into three classes
      • Branch
      • Load/store
      • Register-to-register
  • Different sequence through diagram for each instruction type
[Figure: state diagram with Reset, Init (Initialize Machine), Fetch Instr., XEQ Instr. (Branch, Load/Store, Register-to-Register), Branch Taken, Branch Not Taken, and Incr. PC]

Data Path (Hierarchy)
  • Arithmetic circuits constructed in hierarchical and iterative fashion
      • Each bit in datapath is functionally identical
      • 4-bit, 8-bit, 16-bit, 32-bit datapaths
[Figure: full adder (FA) with inputs Ain, Bin, Cin and outputs Sum, Cout, built from two half adders (HA)]
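
The hierarchy (half adders into a full adder) and the iteration (identical bit slices chained by carries) can both be shown in a short sketch. The function names are hypothetical, and the OR gate that merges the two half-adder carries is the standard construction rather than something spelled out in the slide text.

```python
def half_adder(a, b):
    """Return (sum, carry) for two single bits."""
    return a ^ b, a & b

def full_adder(a, b, cin):
    """Full adder built hierarchically from two half adders and an OR."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, cin)
    return s, c1 | c2

def ripple_add(x, y, width, cin=0):
    """Iterate identical full-adder slices to build a width-bit adder."""
    total, carry = 0, cin
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry          # (sum bits, carry out)

print(ripple_add(5, 9, 4))          # (14, 0)
print(ripple_add(0xFFFF, 1, 16))    # (0, 1)
```

Changing `width` is all it takes to go from a 4-bit to a 32-bit datapath, which is the sense in which each bit is functionally identical.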

Data Path (ALU)
  • ALU block diagram
      • Input: data and operation to perform
      • Output: result of operation and status information
[Figure: ALU with 16-bit inputs A and B, an Operation input, 16-bit output S, and status outputs N and Z]

Data Path (ALU + Registers)
  • Accumulator
      • Special register
      • One of the inputs to ALU
      • Output of ALU stored back in accumulator
  • One-address instructions
      • Operation and address of one operand
      • Other operand and destination is accumulator register
      • AC <– AC op Mem[addr]
      • “Single address instructions” (AC implicit operand)
  • Multiple registers
      • Part of instruction used to choose register operands
[Figure: 16-bit REG and AC feeding an ALU with OP input and status outputs N and Z]
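
The one-address form AC <– AC op Mem[addr] can be sketched as a single step function. This is an illustration only: the operation names in `OPS`, the function name `accumulate`, and the list used as memory are assumptions; N and Z are computed as described earlier in the lecture (high-order bit set, all bits zero).

```python
MASK16 = 0xFFFF

# Hypothetical operation table for the sketch.
OPS = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "and": lambda x, y: x & y,
    "or":  lambda x, y: x | y,
}

def accumulate(ac, mem, op, addr):
    """Perform AC <- AC op Mem[addr]; return (new AC, N flag, Z flag)."""
    ac = OPS[op](ac, mem[addr]) & MASK16   # keep the result to 16 bits
    n = (ac >> 15) & 1                     # negative: high-order bit is 1
    z = int(ac == 0)                       # zero: all bits are 0
    return ac, n, z

memory = [3, 1]
print(accumulate(5, memory, "sub", 0))     # (2, 0, 0)
print(accumulate(3, memory, "sub", 0))     # (0, 0, 1)
print(accumulate(0, memory, "sub", 1))     # (65535, 1, 0)
```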

Data Path (Bit-slice)
  • Bit-slice concept: iterate to build n-bit wide datapaths
[Figure: a 1-bit-wide slice and a 2-bit-wide datapath, each slice containing ALU, AC, R0, rs, rt, and rd with an input from memory, chained through carry-in (CI) and carry-out (CO)]

Instruction Path
  • Program counter
      • Keeps track of program execution
      • Address of next instruction to read from memory
      • May have auto-increment feature or use ALU
  • Instruction register
      • Current instruction
      • Includes ALU operation and address of operand
      • Also holds target of jump instruction
      • Immediate operands
  • Relationship to data path
      • PC may be incremented through ALU
      • Contents of IR may also be required as input to ALU

Data Path (Memory Interface)
  • Memory
      • Separate data and instruction memory (Harvard architecture)
          • Two address busses, two data busses
      • Single combined memory (Princeton architecture)
          • Single address bus, single data bus
  • Separate memory
      • ALU output goes to data memory input
      • Register input from data memory output
      • Data memory address from instruction register
      • Instruction register from instruction memory output
      • Instruction memory address from program counter
  • Single memory
      • Address from PC or IR
      • Memory output to instruction and data registers
      • Memory input from ALU output

Block Diagram of Processor
  • Register transfer view of Princeton architecture
      • Which register outputs are connected to which register inputs
      • Arrows represent data-flow, others are control signals from the control FSM
      • MAR may be a simple multiplexer rather than a separate register
      • MBR is split in two (REG and IR)
      • Load control for each register
[Figure: Control FSM, IR, PC, MAR, REG, AC, and ALU (OP, N, Z) connected by 16-bit load, store, and data paths to a Data Memory (16-bit words) with addr, rd, and wr inputs]

Block Diagram of Processor
  • Register transfer view of Harvard architecture
      • Which register outputs are connected to which register inputs
      • Arrows represent data-flow, others are control signals from the control FSM
      • Two MARs (PC and IR)
      • Two MBRs (REG and IR)
      • Load control for each register
[Figure: Control FSM, IR, PC, REG, AC, and ALU (OP, N, Z) connected by 16-bit load, store, and data paths to a Data Memory (16-bit words) and a separate Inst Memory (8-bit words), each with its own addr and data lines]

A Simplified Processor Data-path and Memory
  • Princeton architecture
  • Register file
  • Memory has only 255 words with a display on the last one
  • Instruction register
  • PC incremented through ALU
  • Modeled after MIPS rt000 (used in 61C textbook by Patterson & Hennessy)
      • Really a 32-bit machine
      • We’ll do a 16-bit version

Processor Control
  • Synchronous Mealy machine
  • Multiple cycles per instruction

Processor Instructions
  • Three principal types (16 bits in each instruction); field widths in bits:
      type           op    rs    rt    rd    funct
      R(egister)     3     3     3     3     4
      I(mmediate)    3     3     3     offset (7)
      J(ump)         3     target address (13)
  • Some of the instructions
      type   instr   op   operands                meaning
      R      add     0    rs  rt  rd  funct=0     rd = rs + rt
      R      sub     0    rs  rt  rd  funct=1     rd = rs - rt
      R      and     0    rs  rt  rd  funct=2     rd = rs & rt
      R      or      0    rs  rt  rd  funct=3     rd = rs | rt
      R      slt     0    rs  rt  rd  funct=4     rd = (rs < rt)
      I      lw      1    rs  rt  offset          rt = mem[rs + offset]
      I      sw      2    rs  rt  offset          mem[rs + offset] = rt
      I      beq     3    rs  rt  offset          pc = pc + offset, if (rs == rt)
      I      addi    4    rs  rt  offset          rt = rs + offset
      J      j       5    target address          pc = target address
      J      halt    7                            stop execution until …
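
Packing and unpacking these 16-bit instruction words is a matter of shifts and masks. The sketch below assumes the fields are laid out left to right in the order listed (op in bits 15:13, then rs, rt, rd, and funct in bits 3:0), which matches the later reference to Inst[15:13] and Inst[3:0] but is otherwise an assumption; `encode_r`, `encode_i`, and `decode` are hypothetical helper names.

```python
def encode_r(op, rs, rt, rd, funct):
    """R-type: op(3) rs(3) rt(3) rd(3) funct(4), assumed packed MSB first."""
    return (op << 13) | (rs << 10) | (rt << 7) | (rd << 4) | funct

def encode_i(op, rs, rt, offset):
    """I-type: op(3) rs(3) rt(3) offset(7)."""
    return (op << 13) | (rs << 10) | (rt << 7) | (offset & 0x7F)

def decode(word):
    """Split a 16-bit word into every field any instruction type might use."""
    return {
        "op":     (word >> 13) & 0x7,
        "rs":     (word >> 10) & 0x7,
        "rt":     (word >> 7) & 0x7,
        "rd":     (word >> 4) & 0x7,
        "funct":  word & 0xF,
        "offset": word & 0x7F,
        "target": word & 0x1FFF,
    }

add_word = encode_r(op=0, rs=1, rt=2, rd=3, funct=0)   # r3 = r1 + r2
print(hex(add_word))            # 0x530
print(decode(add_word)["rd"])   # 3
```

Decoding every field unconditionally mirrors hardware, where all the bit ranges are wired out and the opcode decides which of them are meaningful.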

Tracing an Instruction's Execution
  • Instruction: r3 = r1 + r2
      R:  op=0  rs=r1  rt=r2  rd=r3  funct=0
  • 1. Instruction fetch
      • Move instruction address from PC to memory address bus
      • Assert memory read
      • Move data from memory data bus into IR
      • Configure ALU to add 1 to PC
      • Configure PC to store new value from ALUout
  • 2. Instruction decode
      • Op-code bits of IR are input to control FSM
      • Rest of IR bits encode the operand addresses (rs and rt)
          • These go to register file

Tracing an Instruction's Execution (cont’d)
  • Instruction: r3 = r1 + r2
      R:  op=0  rs=r1  rt=r2  rd=r3  funct=0
  • 3. Instruction execute
      • Set up ALU inputs
      • Configure ALU to perform ADD operation
      • Configure register file to store ALU result (rd)

Tracing an Instruction's Execution (cont’d)
  • Step 1
[Figure: data-path diagram for step 1]

Tracing an Instruction's Execution (cont’d)
  • Step 2
[Figure: data-path diagram for step 2, with IR bits going to controller]

Tracing an Instruction's Execution (cont’d)
  • Step 3
[Figure: data-path diagram for step 3]

Register-Transfer-Level Description
  • Control
      • Transfer data between registers by asserting appropriate control signals
  • Register transfer notation: work from register to register
      • Instruction fetch:
          mabus ← PC;      – move PC to memory address bus (PCmaEN, ALUmaEN)
          memory read;     – assert memory read signal (mr, RegBmdEN)
          IR ← memory;     – load IR from memory data bus (IRld)
          op ← add         – send PC into A input, 1 into B input, add (srcA, srcB0, srcB1, op)
          PC ← ALUout      – load result of incrementing in ALU into PC (PCld, PCsel)
      • Instruction decode:
          IR to controller
          values of A and B read from register file (rs, rt)
      • Instruction execution:
          op ← add         – send regA into A input, regB into B input, add (srcA, srcB0, srcB1, op)
          rd ← ALUout      – store result of add into destination register (regWrite, wrDataSel, wrRegSel)

Register-Transfer-Level Description (cont’d)
  • How many states are needed to accomplish these transfers?
      • Data dependencies (where do values that are needed come from?)
      • Resource conflicts (ALU, busses, etc.)
  • In our case, it takes three cycles
      • One for each step
      • All operations within a cycle occur between rising edges of the clock
  • How do we set all of the control signals to be output by the state machine?
      • Depends on the type of machine (Mealy, Moore, synchronous Mealy)

Review of FSM Timing
  • Fetch (step 1): IR ← mem[PC]; PC ← PC + 1;
  • Decode (step 2): A ← rs; B ← rt
  • Execute (step 3): rd ← A + B
  • To configure the data-path to do this here, when do we set the control signals?

FSM Controller for CPU (skeletal Moore FSM)
  • First pass at deriving the state diagram (Moore machine)
      • These will be further refined into sub-states
[Figure: state diagram: reset, instruction fetch, instruction decode, then instruction execution states LW, SW, ADD, and J]

FSM Controller for CPU (reset and instruction fetch)
  • Assume Moore machine
      • Outputs associated with states rather than arcs
  • Reset state and instruction fetch sequence
  • On reset (go to Fetch state)
      • Start fetching instructions
      • PC will set itself to zero
  • Fetch state (instruction fetch):
      mabus ← PC; memory read; IR ← memory data bus; PC ← PC + 1;
[Figure: reset arc into the Fetch state]

FSM Controller for CPU (decode)
  • Operation decode state
      • Next state branch based on operation code in instruction
      • Read two operands out of register file
          • What if the instruction doesn’t have two operands?
[Figure: Decode state (instruction decode) branching on the value of Inst[15:13] and Inst[3:0], with an arc to add]

FSM Controller for CPU (Instruction Execution)
  • For add instruction
      • Configure ALU and store result in register: rd ← A + B
      • Other instructions may require multiple cycles
[Figure: add state (instruction execution)]

FSM Controller for CPU (Add Instruction)
  • Putting it all together and closing the loop
      • The famous instruction fetch, decode, execute cycle
[Figure: state diagram: reset into Fetch (instruction fetch), then Decode (instruction decode), then add (instruction execution), and back to Fetch]
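
The fetch, decode, execute loop for the add instruction can be simulated as a three-state machine, one state per clock cycle. This is a rough sketch, not the lecture's controller: it assumes the field layout op = bits 15:13, rs = bits 12:10, rt = bits 9:7, rd = bits 6:4, funct = bits 3:0, handles only add, and uses invented names (`run_one_add`, the state strings, the `trace` list).

```python
def run_one_add(memory, regs, pc=0):
    """Step a Moore-style controller through one add instruction.

    Returns (new PC, list of states visited). regs is updated in place.
    """
    state, trace = "FETCH", []
    ir = a = b = 0
    while True:
        trace.append(state)
        if state == "FETCH":
            ir = memory[pc]                  # IR <- mem[PC]
            pc = (pc + 1) & 0xFFFF           # PC <- PC + 1 (through the ALU)
            state = "DECODE"
        elif state == "DECODE":
            op, funct = (ir >> 13) & 0x7, ir & 0xF
            a = regs[(ir >> 10) & 0x7]       # A <- rs
            b = regs[(ir >> 7) & 0x7]        # B <- rt
            if (op, funct) != (0, 0):
                raise NotImplementedError("only add is modeled in this sketch")
            state = "ADD"
        elif state == "ADD":
            regs[(ir >> 4) & 0x7] = (a + b) & 0xFFFF   # rd <- A + B
            return pc, trace

memory = [0x0530]                  # r3 = r1 + r2 under the assumed encoding
regs = [0, 10, 20, 0, 0, 0, 0, 0]
print(run_one_add(memory, regs))   # (1, ['FETCH', 'DECODE', 'ADD'])
print(regs[3])                     # 30
```

A full controller would return to FETCH after the execute state instead of stopping, and would branch to a different execute state for each opcode.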

FSM Controller for CPU
  • Now we need to repeat this for all the instructions of our processor
      • Fetch and decode states stay the same
      • Different execution states for each instruction
          • Some may require multiple states if available register transfer paths require sequencing of steps

Next Up: Adders