Special Microarchitecture based on a lecture by Sanjay Rajopadhye modified by Yashwant Malaiya, Phil Sharp
Computing Layers Problems Algorithms Language Instruction Set Architecture Microarchitecture Circuits Devices
LC-3 Data Path Revisited
Microarchitecture
Functional hardware blocks in a digital system
• Responsible for implementing the ISA
• Many microarchitectures can implement the same ISA
Car analogy:
• Gas pedal, brake, and steering wheel are the ISA
• All cars share the same ISA; boats and planes have different ISAs
• Engine size, brake type, and tire size are the microarchitecture
• These vary based on design goals: cost, efficiency, performance, etc.
Microarchitecture Components
Storage: registers, register file, memory
    • Triggered by the system clock
Combinational: MUXes, ALU, adder, SEXT, wiring, etc.
    • Respond after some propagation delay
Design process:
• Design the data path and identify control signals
    • Data path: components that process an instruction
• Design the control finite state machine
    • Control: components that generate signals to manage processing of an instruction
Timing relative to system clock
Combinational blocks (logic and wiring)
• Output is always a function of the values on the input wires
• If an input changes, the change propagates with some propagation delay
Storage elements are timed
• Clock: a special signal that determines this timing
• Storage can be updated only at the tick of the clock
What happens between ticks?
• The "current" values are processed by logic and wiring to produce values …
• … that will be used to update storage at the "next tick"
How fast can the clock tick?
• Must allow for the longest combinational signal path
Timing relative to system clock
How fast can the clock tick?
• Must allow for the longest combinational signal path
Clock frequency: tick rate
• Ex: 2 GHz means 2 × 10^9 cycles per second
Clock period: time between two consecutive ticks
• Inverse of clock frequency
• A 2 GHz clock frequency means a 0.5 nanosecond clock period
• Signals must stabilize between two clock ticks, so the delay of the longest combinational signal path must be less than one clock period
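The relation between clock frequency, clock period, and the longest combinational path can be sketched in a few lines of Python. The function names are illustrative, and the bound ignores real-world margins such as register setup time.

```python
def clock_period_ns(frequency_hz: float) -> float:
    """Clock period in nanoseconds: the inverse of the clock frequency."""
    return 1e9 / frequency_hz


def max_frequency_hz(longest_path_delay_ns: float) -> float:
    """Upper bound on the clock frequency for a given longest-path delay.

    Simplifying assumption: the only constraint is that the longest
    combinational path settles within one clock period.
    """
    return 1e9 / longest_path_delay_ns


print(clock_period_ns(2e9))    # 0.5 (nanoseconds for a 2 GHz clock)
print(max_frequency_hz(0.5))   # 2000000000.0 (2 GHz for a 0.5 ns path)
```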
Data Path Components
Data Path Components
The data path is the set of components that process an instruction
Global bus
• Special set of wires that carry a 16-bit signal to many components
• Inputs to the bus are "tri-state devices" that place a signal on the bus only when they are enabled
• Only one (16-bit) signal should be enabled at any time
    • Control unit decides which signal "drives" the bus
• Any number of components can read the bus
    • A register captures bus data only if it is write-enabled by the control unit
Memory
• Control and data registers for memory and I/O devices
• Memory: MAR, MDR (also control signal for read/write)
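The "only one driver at a time" rule for the global bus can be modeled with a small Python sketch. The function name and the dictionary layout are assumptions made for illustration; the gate names follow the signals used elsewhere in these slides.

```python
from typing import Dict, Optional, Tuple


def resolve_bus(drivers: Dict[str, Tuple[bool, int]]) -> Optional[int]:
    """Return the 16-bit value on the bus, or None if nothing drives it.

    `drivers` maps a gate signal name to (enabled, value). This is a
    hypothetical model of tri-state devices, not the LC-3 hardware itself.
    """
    enabled = [(name, value) for name, (on, value) in drivers.items() if on]
    if len(enabled) > 1:
        names = ", ".join(name for name, _ in enabled)
        raise ValueError(f"bus conflict: multiple drivers enabled ({names})")
    if not enabled:
        return None  # bus is floating: no tri-state device is enabled
    return enabled[0][1] & 0xFFFF  # keep only 16 bits


value = resolve_bus({"Gate.PC": (True, 0x3000), "Gate.ALU": (False, 0x1234)})
print(hex(value))  # 0x3000
```

Raising an error on a conflict mirrors the control unit's responsibility: it must never assert two gate signals in the same cycle.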
Data Path Components
ALU
• Accepts inputs from the register file and from sign-extended bits of the IR (immediate field)
• Output goes to the bus
    • Used by condition code logic, register file, memory
Register File
• Two read addresses (SR1, SR2), one write address (DR)
• Input from the bus
    • Result of an ALU operation or a memory read
• Two 16-bit outputs
    • Used by ALU, PC, memory address
    • Data for store instructions passes through the ALU
Data Path Components
Multiplexer (MUX): selects data from multiple sources (more details later)
PC and PCMUX
• Three inputs to PC, controlled by PCMUX
    1. PC+1: FETCH stage
    2. Address adder: BR, JMP
    3. Bus: TRAP (discussed later)
MAR and MARMUX
• Two inputs to MAR, controlled by MARMUX
    1. Address adder: LD/ST, LDR/STR
    2. Zero-extended IR[7:0]: TRAP (discussed later)
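A multiplexer is simply a selection among candidate inputs. The sketch below models the PCMUX in Python; the select labels ("PC+1", "ADDER", "BUS") are made up for readability and are not the actual encoding of the PCMUX control signal.

```python
def pcmux(select: str, pc: int, adder_out: int, bus: int) -> int:
    """Choose the next PC value from the three PCMUX inputs.

    The string labels are hypothetical; real hardware uses a 2-bit select.
    """
    inputs = {
        "PC+1": (pc + 1) & 0xFFFF,    # normal fetch: next sequential address
        "ADDER": adder_out & 0xFFFF,  # output of the address adder
        "BUS": bus & 0xFFFF,          # value currently on the global bus
    }
    return inputs[select]


print(hex(pcmux("PC+1", 0x3000, 0x3010, 0x0025)))   # 0x3001
print(hex(pcmux("ADDER", 0x3000, 0x3010, 0x0025)))  # 0x3010
```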
Data Path Components
Condition Code Logic
• Looks at the value on the bus and generates N, Z, P signals
• Registers are set only when the control unit enables them (LD.CC)
    • Only certain instructions set the codes (ADD, AND, NOT, LD, LDI, LDR, LEA)
Control Unit: Finite State Machine
• On each machine cycle, changes control signals for the next phase of instruction processing
    • Who drives the bus? (Gate.PC, Gate.ALU, …)
    • Which registers are write-enabled? (LD.IR, LD.REG, …)
    • Which operation should the ALU perform? (ALUK)
    • …
• Logic includes decoder for opcode, etc.
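The condition code logic reduces to three tests on the 16-bit bus value. A minimal Python sketch, assuming two's-complement interpretation of the value (the function name is illustrative):

```python
from typing import Tuple


def condition_codes(bus_value: int) -> Tuple[bool, bool, bool]:
    """Return (N, Z, P) for a 16-bit two's-complement value on the bus."""
    value = bus_value & 0xFFFF
    n = (value & 0x8000) != 0  # sign bit set: negative
    z = value == 0             # all bits clear: zero
    p = not n and not z        # otherwise: positive
    return n, z, p


print(condition_codes(0x0005))  # (False, False, True)
print(condition_codes(0x0000))  # (False, True, False)
print(condition_codes(0xFFFF))  # (True, False, False)
```

Exactly one of the three flags is true for any value, which matches the way branch instructions test them.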
Register Transfer Notation
• In one clock period, signals travel from one or more source registers to a destination register through the combinational logic.
• Register transfer notation (RTN) describes such a transfer.
• For example, the transfer of data from two source registers in the register file, through the combinational logic in the ALU, and back to the register file:
    Reg[DR] ← Reg[1] + Reg[2]
• In addition to the description of the transfer, the control signals necessary to produce the transfer must be noted:
    # SR2, SR1, SR2MUX, ALUK, LD.REG, DR
Register Transfer Notation
One or more transfers per clock tick
• If multiple transfers happen in the same tick, make sure they do not interfere
• One line = one clock tick
Two columns:
• Write the desired transfers
• List the control signals necessary to cause the transfer
    MAR ← PC; PC ← PC+1    # LD.MAR, Gate.PC, LD.PC, PCMUX
LC3-Viz (LC3 Visualizer) is a handy tool
Register Transfer Notation
• A transfer takes one clock cycle
• Memory operations are assumed here to take one cycle also
    • In reality, memories are slow and take multiple cycles
    • See Appendix C for how the LC3 can handle multi-cycle memory accesses
Register transfer languages:
• Basic: the notation used here
• Advanced: VHDL, Verilog (used for description/design)
RTN/LC3-Viz Conventions
The indicated signals must be asserted before the clock tick for the indicated transfer to occur. The sequence is:
• Signals are asserted
• Clock tick arrives and causes the transfer
In an RTN transfer, on either the right-hand side (RHS) or the left-hand side (LHS):
• Mem[x] is the memory at address x
• Mem[MAR] is the memory at the address held in the MAR
• Reg[x] is register number x
RTN Conventions
An RTN transfer is of the form:
    LHS-location ← RHS-expression
The LHS-location can be a memory location, a specific register, or the xth register
The RHS-expression can be:
• Named registers, e.g., Reg[3]
• Memory locations, e.g., Mem[MAR]
• Simple expressions: PC+1, Reg[src] + Reg[dst]
Simple expressions can also include:
• Selecting specific bits in a register: Reg[MSB:LSB]
    • Ex: IR[4:0]
• Sign / zero extension: Sext(value) / Zext(value)
    • Ex: Sext(IR[4:0])
    • Ex: Zext(IR[7:0])
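The bit-selection and extension operators have direct software equivalents. The Python helpers below are a sketch with illustrative names (`bits`, `sext`, `zext`); the example words are an ADD with immediate -1 and a TRAP x25, encoded by hand for this illustration.

```python
def bits(value: int, msb: int, lsb: int) -> int:
    """Select value[msb:lsb], inclusive, as an unsigned field."""
    width = msb - lsb + 1
    return (value >> lsb) & ((1 << width) - 1)


def zext(value: int, width: int) -> int:
    """Zero-extend a `width`-bit field to 16 bits (upper bits are 0)."""
    return value & ((1 << width) - 1)


def sext(value: int, width: int) -> int:
    """Sign-extend a `width`-bit field to 16 bits."""
    value &= (1 << width) - 1
    if value & (1 << (width - 1)):   # sign bit of the field is set
        value |= ~((1 << width) - 1)  # fill the upper bits with 1s
    return value & 0xFFFF


add_word = 0x127F   # ADD R1, R1, #-1 (imm5 field is 11111)
trap_word = 0xF025  # TRAP x25

print(hex(sext(bits(add_word, 4, 0), 5)))   # 0xffff
print(hex(zext(bits(trap_word, 7, 0), 8)))  # 0x25
```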
RTN for fetch phase of an instruction
# Cycle 1: Transfer the PC into MAR; increment PC
MAR ← PC; PC ← PC+1    # LD.MAR, Gate.PC, LD.PC, PCMUX
# Cycle 2: Read memory
MDR ← Mem[MAR]         # LD.MDR, MDR.SEL, MEM.EN
# Cycle 3: Transfer MDR into IR
IR ← MDR               # LD.IR, Gate.MDR
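The three fetch cycles can be traced in software. This Python sketch models registers as a dictionary and memory as a sparse dictionary, both of which are simplifications chosen for illustration; memory is assumed to respond in one cycle, as in the slides.

```python
from typing import Dict


def fetch(regs: Dict[str, int], memory: Dict[int, int]) -> Dict[str, int]:
    """Run the three fetch cycles, updating regs in place."""
    # Cycle 1: MAR <- PC; PC <- PC+1. Both transfers read the old PC,
    # which the simultaneous tuple assignment models.
    regs["MAR"], regs["PC"] = regs["PC"], (regs["PC"] + 1) & 0xFFFF
    # Cycle 2: MDR <- Mem[MAR] (unwritten locations read as 0 here)
    regs["MDR"] = memory.get(regs["MAR"], 0)
    # Cycle 3: IR <- MDR
    regs["IR"] = regs["MDR"]
    return regs


regs = {"PC": 0x3000, "MAR": 0, "MDR": 0, "IR": 0}
memory = {0x3000: 0x927F}  # NOT R1, R1, encoded by hand
fetch(regs, memory)
print(hex(regs["IR"]), hex(regs["PC"]))  # 0x927f 0x3001
```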
How does the LC-3 execute a NOT instruction?
• Cycles 1-3 are the same as the fetch in the previous slide
• Cycle 4: the source register contents are bitwise complemented (NOT) by the ALU and the result is stored in the destination register
• Keep in mind that the condition code is also set in this cycle
# Cycle 4: Complement src register, store result in dst register, update CC
DR ← ~SR1; CC ← Sign(~SR1)    # LD.REG, DR, Gate.ALU, ALUK, SR1, LD.CC
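Cycle 4 of NOT can be sketched the same way. The Python function below assumes the standard LC-3 NOT encoding (DR in IR[11:9], SR in IR[8:6]); the function name and the list-based register file are illustrative.

```python
from typing import List, Tuple


def execute_not(ir: int, regs: List[int]) -> Tuple[bool, bool, bool]:
    """Perform DR <- ~SR1 and return the new condition codes (N, Z, P)."""
    dr = (ir >> 9) & 0x7   # destination register number, IR[11:9]
    sr1 = (ir >> 6) & 0x7  # source register number, IR[8:6]
    result = ~regs[sr1] & 0xFFFF  # bitwise complement, kept to 16 bits
    regs[dr] = result
    n = (result & 0x8000) != 0
    z = result == 0
    return n, z, (not n and not z)


regs = [0] * 8
regs[1] = 0x00FF
cc = execute_not(0x927F, regs)  # NOT R1, R1
print(hex(regs[1]), cc)         # 0xff00 (True, False, False)
```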
Other instructions
• Every instruction is a sequence of transfers
• Every instruction has the same first three cycles: instruction fetch
• Every instruction takes at least one additional cycle
    • Some take even more. Which ones? Why?
• Each one is effected by a specific set of control signals
• The controller is responsible for generating the correct signals in the appropriate cycle
Reminder
• Logic responds after some propagation delay
• Storage loads happen on clock ticks
Designing the LC3 Controller / State Machine
Control Unit State Diagram The control unit is a state machine. Here is part of a simplified state diagram for the LC-3: A complete state diagram is in Appendix C.
LC3 State Machine
• Using RTN, we have seen what control signals are needed to transfer data in the LC3
• Can we use this information to build the control unit/FSM?
• 52 states
• Inputs (19 bits total)
    • Current state
    • Opcode
    • Interrupt
    • …
• Outputs (49 bits total)
    • Control signals
        • LD.PC
        • Gate.MDR
        • …
    • Next state
LC3 State Machine
Control Stores
Control stores are one way of implementing truth tables
• Alternative to a logic gate implementation
• Special memory that contains signal output values
• Use the input data to index into the memory
• Data at that memory location is used as the control signal values
• A control store can also be used to implement the full LC3 control state machine
LC3 Microsequencer
• With 19 input bits, how many possible input combinations are there?
    • 2^19, or ~500 thousand
• The LC3 needs only 52 states. How many bits are necessary to specify 52 states?
    • 6
• A microsequencer takes the 19 inputs and turns them into 6 bits, which represent the address of the next state in the control store
• Sometimes the address of the state is just a zero-extended version of the current instruction's opcode
• Other times the address is created based on INT, branch, addressing mode, PSR[15], etc.
• Responsible for handling special cases so the control store can be simple
    • Interrupts, branches, memory ready, etc.
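The two counts on this slide are quick to verify. A short Python check, using only the numbers given above (the variable names are illustrative):

```python
NUM_INPUT_BITS = 19  # controller inputs, from the slide
NUM_STATES = 52      # states the LC3 controller needs, from the slide

# A truth table indexed directly by all inputs needs one row per combination.
input_combinations = 2 ** NUM_INPUT_BITS

# Smallest number of bits that can give each state a distinct address.
state_address_bits = (NUM_STATES - 1).bit_length()

print(input_combinations)   # 524288
print(state_address_bits)   # 6
```

Six bits address up to 64 entries, so the control store needs far fewer rows than a table indexed by all 19 inputs.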
Microsequencer
Building a Truth Table for the LC3
With the microsequencer generating the address into the control store, we now need to fill in the control store with the correct output values
• Thinking of the control store as a truth table, the generated address corresponds to the inputs of the truth table, and the data in each line of the control store's memory corresponds to the output values of the truth table
• Use RTN to see which outputs need to be set to 1 in the control store, based on the current state, to achieve the desired output
• Add next-state information to the control store based on the LC3 state diagram
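A control store can be pictured as a table whose rows hold the asserted signals and the next state. The Python sketch below fills in four rows from the fetch and NOT transfers shown earlier; the state addresses 0 to 3 are hypothetical and do not match the numbering in Appendix C, and the unconditional "next" field stands in for the microsequencer.

```python
from typing import Dict, List, Tuple

# Hypothetical control store: address -> asserted signals and next address.
CONTROL_STORE: Dict[int, dict] = {
    0: {"signals": {"LD.MAR", "Gate.PC", "LD.PC", "PCMUX"}, "next": 1},
    1: {"signals": {"LD.MDR", "MDR.SEL", "MEM.EN"}, "next": 2},
    2: {"signals": {"LD.IR", "Gate.MDR"}, "next": 3},
    # Assumes the fetched instruction is NOT; a real controller would
    # pick this address from the opcode.
    3: {"signals": {"LD.REG", "DR", "Gate.ALU", "ALUK", "SR1", "LD.CC"},
        "next": 0},
}


def run(start: int, cycles: int) -> List[Tuple[int, List[str]]]:
    """Step through the control store, recording signals per cycle."""
    trace = []
    state = start
    for _ in range(cycles):
        entry = CONTROL_STORE[state]
        trace.append((state, sorted(entry["signals"])))
        state = entry["next"]
    return trace


for state, signals in run(0, 4):
    print(state, signals)
```

Changing the controller's behavior then amounts to editing table entries rather than redesigning logic gates.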
Final thoughts
Another benefit of control-store-based processor controllers is the ability to update how the controller works
• Better performance
• Fix bugs
• Fix security vulnerabilities
These changes are delivered as microcode updates
In recitation 14 you will implement the logic for a smaller version of the LC3, called the e.LC3, which will use some of what you have learned today