UNIT II COMBINATIONAL LOGIC CIRCUITS Combinational vs Sequential


Combinational vs. Sequential Logic
• Combinational: Output = f(In)
• Sequential: Output = f(In, Previous In)

Static Complementary CMOS
• Pull-up network (PUN): PMOS transistors only. Makes a connection from VDD to F when F(In1, In2, …, InN) = 1
• Pull-down network (PDN): NMOS transistors only. Makes a connection from F to GND when F(In1, In2, …, InN) = 0
• PUN and PDN are dual logic networks

NMOS Transistors in Series/Parallel Connection
• A transistor can be thought of as a switch controlled by its gate signal
• An NMOS switch closes when the switch control input is high

PMOS Transistors in Series/Parallel Connection

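Threshold Drops
[Figure: threshold drops in the PUN and PDN]
• PUN charging CL from 0: a PMOS pull-up reaches VDD; an NMOS pull-up reaches only VDD - VTn
• PDN discharging CL from VDD: an NMOS pull-down reaches 0; a PMOS pull-down reaches only |VTp|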

Complementary CMOS Logic Style

Example Gate: NAND

Example Gate: NOR

Complex CMOS Gate
• OUT = NOT(D + A • (B + C))
• PDN: D in parallel with the series connection of A and (B in parallel with C); the PUN is its dual

Constructing a Complex Gate

Elmore Delay
• ON transistors look like resistors
• The pullup or pulldown network is modeled as an RC ladder
• Elmore delay of an RC ladder: tpd ≈ R1·C1 + (R1 + R2)·C2 + … + (R1 + R2 + … + RN)·CN
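The RC-ladder estimate can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the function name `elmore_delay` and the component values are assumptions chosen for the example.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC ladder.

    resistances[i] is the resistor between node i-1 and node i (the driving
    source sits before node 0); capacitances[i] is the capacitance from
    node i to ground.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one resistance per capacitance")
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r          # total resistance from the source to this node
        delay += upstream_r * c  # each capacitor is charged through all upstream R
    return delay


# Hypothetical two-segment ladder: R = 1 kOhm per segment, C = 1 fF per node.
# Expected form: R*C + (R + R)*C = 3*R*C
example_delay = elmore_delay([1e3, 1e3], [1e-15, 1e-15])
```

Each capacitor contributes its capacitance times the total resistance between it and the source, which is why nodes far from the driver dominate the sum.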

Example: 2-input NAND
• Estimate the rising and falling propagation delays of a 2-input NAND driving h identical gates.

Estimate the worst-case falling propagation delay of a 2-input NAND driving h identical gates
• The worst case occurs when node X is already charged to nearly VDD through the top nMOS.
• Suppose A = 1 and B = 0; then Y = 1 and node X is nearly VDD.
• Now change the inputs to A = B = 1: both node Y and node X need to discharge.
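A worked estimate of both delays, as a sketch under assumptions that the slides do not state explicitly: a single ON pMOS has resistance R, each of the two series nMOS transistors has resistance R/2, the output node Y carries (6 + 4h)C (the value that appears in the contamination-delay expression), and the internal node X carries 2C.

```latex
\begin{align*}
t_{pdr} &= R\,(6 + 4h)\,C = (6 + 4h)\,RC \\
t_{pdf} &= \frac{R}{2}\,(2C) + \left(\frac{R}{2} + \frac{R}{2}\right)(6 + 4h)\,C = (7 + 4h)\,RC
\end{align*}
```

The h-independent terms, 6RC and 7RC, are the parasitic delays; the 4hRC term is the effort delay.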


Delay Components
• Delay has two parts
– Parasitic delay: the gate driving its own internal diffusion capacitance
  • 6 or 7 RC
  • Independent of load
– Effort delay: depends on the ratio of external load capacitance to input capacitance
  • Proportional to load capacitance
  • Changes with transistor width
  • Set by logical effort and electrical effort

Contamination Delay
• Best-case (contamination) delay can be substantially less than propagation delay.
• Ex: If both inputs fall simultaneously, both pMOS transistors turn on in parallel, so the output is pulled up in half the time: tcdr = (R/2)(6 + 4h)C = (3 + 2h)RC

CIRCUIT FAMILIES
• Static CMOS
• Ratioed Circuits
• Cascode Voltage Switch Logic (CVSL)
• Dynamic Circuits
• Pass-transistor Circuits

Static CMOS Circuits

Static CMOS Circuits
• A static CMOS gate with n inputs needs 2n transistors.
• The nMOS block is the dual of the pMOS block.
• Whatever is in series in the nMOS block appears in parallel in the pMOS block, and vice versa.
• Ideally, CMOS gates consume power only during input transitions.

Static complementary gate structure
[Figure: pull-up network between VDD and out, pull-down network between out and VSS, both driven by the inputs]

Pull-up/pull-down network design
• Pull-up and pull-down networks are duals.
• To design a gate, first design one network, then compute its dual to get the other network.
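Computing the dual can be sketched as a small recursive transformation. The nested-tuple representation and the function names below are assumptions made for illustration. The example network is the pull-down network of the complex gate, read as the inverting function OUT = NOT(D + A • (B + C)).

```python
from itertools import product

# A network is a nested tuple: ("in", name), ("series", a, b, ...), or
# ("parallel", a, b, ...).


def dual(net):
    """Swap series and parallel connections, leaving the inputs untouched."""
    kind = net[0]
    if kind == "in":
        return net
    swapped = "parallel" if kind == "series" else "series"
    return (swapped,) + tuple(dual(child) for child in net[1:])


def conducts(net, inputs, active_high):
    """True if the network forms a closed path for the given input values.

    active_high=True models nMOS switches (closed when the gate is 1);
    active_high=False models pMOS switches (closed when the gate is 0).
    """
    kind = net[0]
    if kind == "in":
        return inputs[net[1]] == (1 if active_high else 0)
    results = [conducts(child, inputs, active_high) for child in net[1:]]
    return all(results) if kind == "series" else any(results)


# Pull-down network: D in parallel with (A in series with (B parallel C)).
pdn = ("parallel", ("in", "D"),
       ("series", ("in", "A"), ("parallel", ("in", "B"), ("in", "C"))))
pun = dual(pdn)

# Exactly one of the two networks should conduct for every input combination.
complementary = all(
    conducts(pdn, dict(zip("ABCD", bits)), True)
    != conducts(pun, dict(zip("ABCD", bits)), False)
    for bits in product((0, 1), repeat=4)
)
```

The exhaustive check confirms that the output is never floating and never shorted: for each input vector either the PDN or the PUN conducts, but not both.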

Inverter

CMOS Logic Style Construction

Example Gate: NAND

BUBBLE PUSHING

Compound Gates - AOI/OAI gates
• AOI = and/or/invert; OAI = or/and/invert.
• Implement larger functions.
• Pull-up and pull-down networks are compact: smaller area and higher speed than NAND/NOR network equivalents.

AOI example
[Figure: AOI gate with its and, or, and invert stages labeled]

AOI CMOS Gate • AOI complex CMOS gate can be used to directly implement a sum-of-products Boolean function • The pull-down N-tree can be implemented as follows: – Product terms yield series-connected NMOS transistors – Sums are denoted by parallel-connected legs – The complete function must be an inverted representation • The pull-up P-tree is derived as the dual of the N-tree

OAI CMOS Gate
• An Or-And-Invert (OAI) CMOS gate is similar to the AOI gate except that it implements a product-of-sums realization of a function
• The N-tree is implemented as follows:
– Each sum term is a set of parallel transistors, one for each input in the term
– All sum terms (parallel groups) are put in series
– The complete function is again assumed to be an inverted representation
• The P-tree can be implemented as the dual of the N-tree
• Note: AO and OA gates (non-inverted function representation) can be implemented directly on the P-tree if inverted inputs are available

Properties of CMOS Gates

Ratioed Circuits
• Pseudo-nMOS Circuits
• Ganged CMOS
• Source-Follower Pull-up Logic (SFPL)

Pseudo-nMOS Circuits
• Adding a single pFET to an otherwise nFET-only circuit produces a logic family called pseudo-nMOS
– Fewer transistors than CMOS
– For N inputs, only (N+1) FETs are required
– Pull-up device: the pFET is biased active since the grounded gate gives VSGp = VDD
– Pull-down device: the nFET logic array acts as a large switch between the output f and ground
– However, since the pFET is always biased on, VOL can never reach the ideal value of 0 V
• A simple inverter using pseudo-nMOS is shown in Figure 2
Figure 1: General structure of a pseudo-nMOS logic gate
Figure 2: Pseudo-nMOS inverter

Ganged CMOS (Symmetric Circuits)
[Figure: inverters with inputs A, B, C and a shared output Z]
• Inverters are ganged together to perform a function.
• NOR gate: Z = NOT(A + B + C)

Source-Follower Pull-up Logic (SFPL)
• SFPL is a variation on pseudo-NMOS whereby the load device is an N pulldown transistor and N source-follower pull-ups are used on the inputs.
– N pull-up transistors can be small, limiting input capacitance
– N transistors are also duplicated as pulldown devices in order to improve the fall time
– Rise time is determined by the P1 inverter pull-up transistor when all inputs are low
• SFPL is useful for high fan-in NOR logic gates
R. W. Knepper SC 571, page 5-25

Cascode Voltage Switch Logic (CVSL)
• Differential type of logic circuit where both true and complement inputs are required.
• The two N pull-down trees are duals of each other.
• P pull-up devices are cross-coupled to latch the output.
• Both true and complement outputs are obtained.

Basic Structure of CVSL
[Figure: CVSL gates with inputs a, b, c and complementary outputs Q and NOT Q]

Dynamic CMOS Logic
• Logic function is implemented by the PDN only
• Number of transistors is N+2
• Smaller in area than static CMOS
• Full swing outputs (VOL = GND and VOH = VDD)
• Non-ratioed
• Faster switching speed
• Power dissipation should be better
• Needs a precharge clock

Dual-Rail Domino
• Domino only performs noninverting functions:
– AND, OR but not NAND, NOR, or XOR
• Dual-rail domino solves this problem
– Takes true and complementary inputs
– Produces true and complementary outputs
sig_h  sig_l  Meaning
0      0      Precharged
0      1      '0'
1      0      '1'
1      1      invalid


Example: AND/NAND • Given A_h, A_l, B_h, B_l • Compute Y_h = A * B, Y_l = ~(A * B) • Pulldown networks are conduction complements
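The dual-rail encoding and the AND/NAND example can be sketched behaviorally. This is an illustrative model only: the function names and the tuple encoding `(sig_h, sig_l)` are assumptions, and the two expressions stand in for the two pulldown networks (series for the true rail, parallel for the complement rail).

```python
# (sig_h, sig_l) pairs, following the dual-rail encoding table.
PRECHARGED, ZERO, ONE = (0, 0), (0, 1), (1, 0)


def encode(bit):
    """Encode a logic value as a dual-rail pair."""
    return ONE if bit else ZERO


def decode(pair):
    """Map a (sig_h, sig_l) pair back to its meaning."""
    meanings = {(0, 0): "precharged", (0, 1): 0, (1, 0): 1}
    if pair not in meanings:
        raise ValueError("sig_h = sig_l = 1 is invalid")
    return meanings[pair]


def dual_rail_and(a, b):
    """Dual-rail AND/NAND: Y_h = A * B, Y_l = ~(A * B)."""
    a_h, a_l = a
    b_h, b_l = b
    y_h = a_h & b_h  # true rail asserts only when both true rails are high
    y_l = a_l | b_l  # complement rail asserts when either complement rail is high
    return (y_h, y_l)
```

With precharged inputs, both outputs stay at 0. Once the inputs resolve, exactly one output rail rises, so each rail is monotonic while the pair still carries an inverting function.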

Example: XOR/XNOR • Sometimes possible to share transistors

TRANSMISSION GATES • NMOS pass transistor passes a strong 0 and a weak 1. • PMOS pass transistor passes a strong 1 and a weak 0. • Combine the two to make a CMOS pass gate which will pass a strong 0 and a strong 1.

TRANSMISSION GATE

PROBLEMS WITH TRANSMISSION GATES
• No isolation between the input and output.
• Output progressively deteriorates as it passes through successive stages.
• However, designs get simplified.

Multiplexer

XOR gate

Transmission Gates • N-Channel MOS Transistors pass a 0 better than a 1 • P-Channel MOS Transistors pass a 1 better than a 0 • This is the reason that N-Channel transistors are used in the pull-down network and P-Channel in the pull-up network of a CMOS gate. Otherwise the noise margin would be significantly reduced.

Transmission Gates
• Pass transistors produce degraded outputs
• Transmission gates pass both 0 and 1 well
[Figure: transmission gate symbols]

Transmission Gates • Implementing XOR gates – With NAND gates and inverters: – With transmission gates: • Why would one of these circuits be preferable to the other?

Transmission Gates
• Implementing a multiplexer with transmission gates:
– When S = 0, input X1 is connected to the output Y
– When S = 1, input X2 is connected to the output Y
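A behavioral sketch of the two-gate multiplexer, with a disabled gate modeled as high impedance (`None`). The function names are hypothetical and the model ignores electrical effects such as on-resistance.

```python
def transmission_gate(data, enable):
    """Pass data when the gate is enabled; return None (high impedance) otherwise."""
    return data if enable else None


def mux2(x1, x2, s):
    """2:1 multiplexer built from two transmission gates driven by S and its complement."""
    from_x1 = transmission_gate(x1, enable=(s == 0))  # conducts when S = 0
    from_x2 = transmission_gate(x2, enable=(s == 1))  # conducts when S = 1
    # Exactly one gate conducts, so the shared output node Y has a single driver.
    return from_x1 if from_x1 is not None else from_x2
```

Because the two gates are enabled by complementary select signals, the output is always driven by exactly one input, which is the property the circuit relies on.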

Pass Transistors • Transistors can be used as switches

Pass Transistor • Pass-transistor circuits are formed by dropping the PMOS transistors and using only NMOS pass transistors • In this case, CMOS inverters (or other means) must be used periodically to recover the full VDD level since the NMOS pass transistors will provide a VOH of VDD – VTn in some cases • The pass transistor circuit requires complementary inputs and generates complementary outputs to pass on to the next stage

Pass Transistor
• This figure shows a simple XNOR implementation using pass transistors:
• If A is high, B is passed through the gate to the output
• If A is low, the complement of B is passed through the gate to the output

Pass Transistor
• At right,
– (a) is a 2-input NAND pass transistor circuit
– (b) is a 2-input NOR pass transistor circuit
• Each circuit requires 8 transistors, double that required using conventional CMOS realizations

Pass Transistor
• A pass-transistor logic gate can implement the Boolean functions NOR, XOR, NAND, AND, and OR depending upon the P1-P4 inputs:
– P1, P2, P3, P4 = 0, 0, 0, 1 gives F(A, B) = NOR
– P1, P2, P3, P4 = 0, 1, 1, 0 gives F(A, B) = XOR
– P1, P2, P3, P4 = 0, 1, 1, 1 gives F(A, B) = NAND
– P1, P2, P3, P4 = 1, 0, 0, 0 gives F(A, B) = AND
– P1, P2, P3, P4 = 1, 1, 1, 0 gives F(A, B) = OR
• The circuit can be operated with a clocked P pull-up device or an inverter-based latch
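The listed settings are consistent with reading the gate as a 4:1 selector in which (A, B) chooses which of P1 to P4 reaches the output. That mapping is an inference from the listed values, not taken from the schematic, so the sketch below is an assumption to be checked against the actual circuit.

```python
def pass_gate(a, b, p1, p2, p3, p4):
    """Select one control input based on (A, B); assumed mapping, see lead-in."""
    select = {
        (1, 1): p1,  # A = 1, B = 1
        (1, 0): p2,  # A = 1, B = 0
        (0, 1): p3,  # A = 0, B = 1
        (0, 0): p4,  # A = 0, B = 0
    }
    return select[(a, b)]


def truth_table(p1, p2, p3, p4):
    """Outputs for (A, B) = (0,0), (0,1), (1,0), (1,1)."""
    return [pass_gate(a, b, p1, p2, p3, p4) for a in (0, 1) for b in (0, 1)]
```

For instance, `truth_table(0, 1, 1, 0)` yields the XOR column and `truth_table(1, 0, 0, 0)` the AND column under this assumed mapping.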

Pass Transistor Logic Families
• Complementary Pass Transistor Logic
• Double Pass Transistor Logic

Complementary Pass-Transistor Logic (CPL)

Basic logic functions in CPL

CPL Logic
[Figure: CPL XOR gate and sum circuit]
• CPL provides an efficient implementation of the XOR function

Full Adder Design III • Complementary Pass Transistor Logic (CPL) – Slightly faster, but more area

Double Pass-Transistor Logic (DPL): AND/NAND XOR/XNOR

Double Pass-Transistor Logic (DPL): XOR One bit full-adder: Sum circuit

Double Pass-Transistor Logic (DPL): DPL Full Adder The critical path traverses two transistors only (not counting the buffer)

Dynamic CMOS
• In static circuits, at every point in time (except when switching) the output is connected to either GND or VDD via a low-resistance path.
– A fan-in of n requires 2n devices (n N-type + n P-type)
• Dynamic circuits rely on the temporary storage of signal values on the capacitance of high-impedance nodes.
– A fan-in of n requires only n + 2 transistors (n + 1 N-type + 1 P-type)

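Dynamic Gate
[Figure: dynamic gate with precharge transistor Mp, evaluation transistor Me, a PDN driven by In1, In2, In3, and load CL; example PDN with inputs A, B, C]
• Two-phase operation
– Precharge (Clk = 0): Mp on, Me off, Out is precharged to 1
– Evaluate (Clk = 1): Mp off, Me on, Out = NOT((A • B) + C) for the example PDN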

Conditions on Output • Once the output of a dynamic gate is discharged, it cannot be charged again until the next precharge operation. • Inputs to the gate can make at most one transition during evaluation. • Output can be in the high impedance state during and after evaluation (PDN off), state is stored on CL
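The two-phase behavior and the one-way discharge can be sketched with a small behavioral model. The class name and interface are hypothetical; leakage and charge sharing are ignored, so a stored 1 is held indefinitely here.

```python
class DynamicGate:
    """Behavioral model of a footed dynamic gate with a given pull-down function."""

    def __init__(self, pdn):
        self.pdn = pdn   # callable returning True when the PDN conducts
        self.out = None  # unknown until the first precharge

    def precharge(self):
        """Clk = 0: Mp on, Me off, output charged to 1."""
        self.out = 1

    def evaluate(self, **inputs):
        """Clk = 1: Mp off, Me on. The output can only fall, never rise."""
        if self.pdn(**inputs):
            self.out = 0
        return self.out


# Example PDN from the dynamic gate slide: conducts when (A AND B) OR C.
gate = DynamicGate(lambda A, B, C: bool((A and B) or C))
gate.precharge()
held_high = gate.evaluate(A=0, B=0, C=0)    # PDN off: value stays on CL
discharged = gate.evaluate(A=1, B=1, C=0)   # PDN on: output discharges
still_low = gate.evaluate(A=0, B=0, C=0)    # cannot recharge until precharge
```

The third call shows why inputs may make at most one transition during evaluation: once the node has discharged, removing the input does not restore the output.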

Properties of Dynamic Gates • Logic function is implemented by the PDN only – number of transistors is N + 2 (versus 2 N for static complementary CMOS) • Full swing outputs (VOL = GND and VOH = VDD) • Non-ratioed - sizing of the devices does not affect the logic levels • Faster switching speeds – reduced load capacitance due to lower input capacitance (Cin) – reduced load capacitance due to smaller output loading (Cout) – no Isc, so all the current provided by PDN goes into discharging CL

Properties of Dynamic Gates
• Overall power dissipation is usually higher than static CMOS
– no static current path ever exists between VDD and GND (including Psc)
– no glitching
– higher transition probabilities
– extra load on Clk
• The PDN starts to work as soon as the input signals exceed VTn, so VM, VIH and VIL are all approximately equal to VTn
– low noise margin (NML)
• Needs a precharge/evaluate clock

Dynamic Logic
• Dynamic gates use a clocked pMOS pullup
• Two modes: precharge and evaluate

The Foot • What if pulldown network is ON during precharge? • Use series evaluation transistor to prevent fight.

Logical Effort

Issues in Dynamic Design 1: Charge Leakage
[Figure: dynamic gate with leakage sources at the output node, and VOut versus time through precharge and evaluate]
• Dominant component is subthreshold current

Solution to Charge Leakage
[Figure: dynamic gate with keeper transistor Mkp]
• Keeper: same approach as the level restorer for pass-transistor logic

Issues in Dynamic Design 2: Charge Sharing
[Figure: dynamic gate with inputs A and B = 0, output capacitance CL, and internal node capacitances CA and CB]
• Charge stored originally on CL is redistributed (shared) over CL and CA, leading to reduced robustness
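A first-order estimate of the droop follows from charge conservation. The sketch assumes the internal node starts at 0 V and that the two capacitors fully equalize. It ignores the threshold drop across the nMOS, which can stop the equalization early. The function name and the 2.5 V supply are assumptions; the capacitance values are borrowed from the charge sharing example slide.

```python
def shared_voltage(vdd, c_load, c_internal):
    """Output voltage after CL (precharged to vdd) shares charge with an
    internal capacitance that starts at 0 V, assuming full equalization."""
    return vdd * c_load / (c_load + c_internal)


# CL = 50 fF sharing with one 15 fF internal node, assumed VDD = 2.5 V.
v_after = shared_voltage(vdd=2.5, c_load=50e-15, c_internal=15e-15)
droop = 2.5 - v_after
```

A larger CL relative to the internal capacitance keeps the output closer to VDD, which is why a big load capacitance helps against charge sharing.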

Charge Sharing Example
[Figure: dynamic gate with inputs A, B, C and their complements; CL = 50 fF, Ca = 15 fF, Cb = 15 fF, Cc = 15 fF, Cd = 10 fF]

Charge Sharing
[Figure: dynamic gate with series transistors Ma and Mb (B = 0), internal node X, and capacitances CL, Ca, Cb]

Solution to Charge Redistribution
[Figure: dynamic gate with an additional clock-driven transistor Mkp on the internal node]
• Precharge internal nodes using a clock-driven transistor (at the cost of increased area and power)

Issues in Dynamic Design 3: Backgate Coupling
[Figure: dynamic NAND (A = 0, B = 0, Out1 = 1, load CL1) driving a static NAND with input In (Out2 = 0, load CL2)]

Backgate Coupling Effect
[Plot: voltage versus time (ns) for Clk, In, Out1, and Out2]

Issues in Dynamic Design 4: Clock Feedthrough
• Coupling between Out and the Clk input of the precharge device due to the gate-to-drain capacitance, so the voltage of Out can rise above VDD.
• The fast rising (and falling) edges of the clock couple to Out.

Clock Feedthrough
[Plot: voltage versus time (ns) for In & Clk and Out, with clock feedthrough marked on Out]

Other Effects
• Capacitive coupling
• Substrate coupling
• Minority charge injection
• Supply noise (ground bounce)

Cascading Dynamic Gates
[Figure: two cascaded dynamic gates (In, Out1, Out2) with waveforms of Clk, In, Out1, and Out2; VTn marked on Out1]
• Only 0 → 1 transitions allowed at inputs!

Monotonicity • Dynamic gates require monotonically rising inputs during evaluation – 0 -> 0 – 0 -> 1 – 1 -> 1 – But not 1 -> 0

Monotonicity Woes • But dynamic gates produce monotonically falling outputs during evaluation • Illegal for one dynamic gate to drive another!

Domino Logic
[Figure: two cascaded domino stages, each a dynamic gate (Mp, PDN, Me) followed by a static inverter; the second stage includes keeper Mkp]

Domino Gates • Follow dynamic stage with inverting static gate – Dynamic / static pair is called domino gate – Produces monotonic outputs

Domino Optimizations • Each domino gate triggers next one, like a string of dominos toppling over • Gates evaluate sequentially but precharge in parallel • Thus evaluation is more critical than precharge • HI-skewed static stages can perform logic




np-CMOS

NORA Logic

NP Domino

Zipper CMOS • The NP-Domino or NORA logic is very susceptible to noise and leakage. • Zipper Domino has the same structure, but the precharge transistors are left slightly ON during evaluation.

Leakage
• Dynamic node floats high during evaluation
– Transistors are leaky (IOFF ≠ 0)
– Dynamic value will leak away over time
– Formerly milliseconds, now nanoseconds
• Use keeper to hold dynamic node
– Must be weak enough not to fight evaluation

Charge Sharing • Dynamic gates suffer from charge sharing

Secondary Precharge • Solution: add secondary precharge transistors – Typically need to precharge every other node • Big load capacitance CY helps as well

Noise Sensitivity
• Dynamic gates are very sensitive to noise
– Inputs: VIH ≈ Vtn
– Outputs: floating output is susceptible to noise
• Noise sources
– Capacitive crosstalk
– Charge sharing
– Power supply noise
– Feedthrough noise
– And more!

Power
• Domino gates have high activity factors
– Output evaluates and precharges
• If output probability = 0.5, a = 0.5
– Output rises and falls on half the cycles
– Clocked transistors have a = 1
– For a 2-input NAND with random inputs, a(CMOS) = 3/16 and a(dynamic) = 1/4
• Leads to very high power consumption
• However, glitching does not occur in dynamic logic.
• The load capacitances are lower.
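The activity factors can be checked by enumerating a gate's truth table. The sketch assumes independent, equally likely inputs and a dynamic gate whose output is precharged high; the function names are hypothetical. Under these assumptions a 2-input NAND gives 3/16 for static CMOS and 1/4 for the dynamic gate.

```python
from fractions import Fraction
from itertools import product


def output_one_probability(func, n_inputs):
    """Probability that the output is 1 for uniformly random inputs."""
    ones = sum(func(*bits) for bits in product((0, 1), repeat=n_inputs))
    return Fraction(ones, 2 ** n_inputs)


def static_activity(func, n_inputs):
    """P(0 -> 1) for a static gate: output 0 in one cycle, 1 in the next."""
    p1 = output_one_probability(func, n_inputs)
    return (1 - p1) * p1


def dynamic_activity(func, n_inputs):
    """P(0 -> 1) for a precharged-high dynamic gate: every cycle in which
    the output discharged must be followed by a recharge."""
    return 1 - output_one_probability(func, n_inputs)


def nand2(a, b):
    return 1 - (a & b)


a_cmos = static_activity(nand2, 2)
a_dynamic = dynamic_activity(nand2, 2)
```

The dynamic figure depends only on how often the output evaluates low, not on the previous output value, which is why it is higher than the static figure here.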

MODL • It is often necessary to compute multiple functions where one is a subfunction of the other or shares a subfunction. • One very typical example is the carry in addition:

MODL Carry Chains

MODL • Beware of sneak paths. • Certain inputs must be mutually exclusive.

Domino Summary
• Domino logic is attractive for high-speed circuits
– 1.3 – 2x faster than static CMOS
– But many challenges: monotonicity, leakage, charge sharing, noise
• Widely used in high-performance microprocessors in the 1990s when speed was king
• Largely displaced by static CMOS now that power is the limiter
• Still used in memories for area efficiency

POWER DISSIPATION
• Power is drawn from a voltage source attached to the VDD pin(s) of a chip.
• Instantaneous power: P(t) = iDD(t) · VDD
• Energy: E = ∫ P(t) dt, integrated from 0 to T
• Average power: Pavg = E / T

Overview of Power Dissipation
• Ptotal = Pdynamic + Pstatic
• Dynamic power consumption: Pdynamic = Pswitching + Pshortcircuit
– Switching power (Pswitching): charging and discharging load capacitances
– Short-circuit power (Pshortcircuit): short-circuit path between supply rails during switching

Power Dissipation Sources
• Static power: Pstatic = (Isub + Igate + Ijunct + Icontention) · VDD
– Isub: subthreshold leakage
– Igate: gate leakage
– Ijunct: junction leakage
– Icontention: contention current

Dynamic Power
• Dynamic power is required to charge and discharge load capacitances when transistors switch.
• One cycle involves a rising and a falling output.
• On a rising output, charge Q = C·VDD is required.
• On a falling output, the charge is dumped to GND.
• This repeats T·fsw times over an interval of T.

Dynamic Power

Dynamic Power
• Suppose the system clock frequency = f
• Let fsw = a·f, where a = activity factor
– If the signal is a clock, a = 1
– If the signal switches once per cycle, a = ½
– Dynamic gates: switch either 0 or 2 times per cycle, a = ½
– Static gates: depends on design, but typically a = 0.1
• Dynamic power: Pdynamic = a · C · VDD² · f

Dynamic Power
• Pdynamic = (energy per transition) × (transition rate) = CL · VDD² · f0→1 = CL · VDD² · P0→1 · f = Ceff · VDD² · f
• Ceff = effective capacitance = CL · P0→1
• Power dissipation is data dependent: it is a function of switching activity
• Activity factor (P0→1)
– Clock signal: P0→1(clk) = 1
– Data signal: P0→1(data) < 0.5
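A quick numeric sketch of the switching-power expression. All values below (activity 0.1, 1 nF of total switched capacitance, 1.2 V, 1 GHz) are made-up illustration values, and the function name is hypothetical.

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """Switching power a * C * VDD^2 * f, in watts for SI inputs."""
    return activity * capacitance * vdd ** 2 * frequency


# Hypothetical block: a = 0.1, C = 1 nF, VDD = 1.2 V, f = 1 GHz.
p_nominal = dynamic_power(activity=0.1, capacitance=1e-9, vdd=1.2, frequency=1e9)

# Same block with the supply halved: power scales with VDD squared.
p_half_vdd = dynamic_power(activity=0.1, capacitance=1e-9, vdd=0.6, frequency=1e9)
```

The quadratic dependence on VDD is why supply voltage is the most effective of the four knobs (activity, capacitance, voltage, frequency), while the other three scale power only linearly.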

Short Circuit Current
• When transistors switch, both nMOS and pMOS networks may be momentarily ON at once
• This leads to a blip of “short circuit” current
• ~15% of dynamic power
– ~85% goes to charging the capacitance CL
• NMOS and PMOS on
– Both transistors in saturation
• Long rise / fall times
– Slow input transition
– Increase short circuit current
• Make input signal transitions fast to save power!

Short Circuit Current
• Large capacitive load: ISC ≈ 0
• Small capacitive load: ISC ≈ IMAX
• Because of the finite slope of the input signal, there is a period when both the PMOS and NMOS devices are “on” and create a path from supply to ground
[Plot: normalized energy versus r for VDD = 5 V and VDD = 3.3 V; W/L|P = 7.2 µm/1.2 µm, W/L|N = 2.4 µm/1.2 µm]
• The power dissipation due to short circuit currents is minimized by matching the rise/fall times of the input and output signals.

Dynamic Power Reduction
• Try to minimize:
– Activity factor
– Capacitance
– Supply voltage
– Frequency

Voltage Scaling
• Dual voltage supply
• Internal voltage
– Reduced internal voltage, 1.2 V
– For low-power operation
• External voltage
– Compatible I/O voltage, 3.3 V
– To interface with other ICs

Capacitance Minimization
• Gate capacitance
– Fewer stages of logic
– Small gate sizes
• Wire capacitance
– Good floorplanning to keep communicating blocks close to each other
– Drive long wires with inverters or buffers rather than complex gates

Clock Gating
• The best way to reduce the activity is to turn off the clock to registers in unused blocks
– Saves clock activity (a = 1)
– Eliminates all switching activity in the block
– Requires determining whether the block will be used

Voltage / Frequency
• Run each block at the lowest possible voltage and frequency that meets performance requirements
• Voltage Domains
– Provide separate supplies to different blocks
– Level converters required when crossing from low to high VDD domains
• Dynamic Voltage Scaling
– Adjust VDD and f according to workload

Static Power Dissipation
• Power dissipation occurring when the device is in standby mode
• As technology scales, this becomes significant
• Leakage power dissipation components:
– Reverse-biased p-n junction
– Subthreshold leakage
– DIBL leakage
– Channel punch-through
– GIDL leakage
– Narrow-width effect
– Oxide leakage
– Hot-carrier tunneling effect

Source of Leakage Current

Leakage
• Sub-threshold current
– Transistor conducts below Vt
– Relevant for sub-micron processes
  • VDD / Vt ratio is smaller
  • Can dominate power consumption, especially in idle mode
– Charge nodes fully to VDD! Discharge nodes completely to GND!
• Drain leakage current
– Reverse-biased junction diodes
[Figure: inverter showing drain junction leakage and subthreshold current]

Static Power
• Static power is consumed even when the chip is quiescent.
– Ratioed circuits burn power in the fight between ON transistors
– Leakage draws power from nominally OFF devices

Subthreshold current
• Sub-threshold current increases exponentially as Vt is lowered, so it can be reduced by increasing Vt
• Selective application of multiple thresholds (low-Vt transistors on critical paths, high-Vt transistors on other paths)
• Control Vt through the body voltage
• Sub-threshold current decreases in long-channel transistors and increases in short-channel transistors

Sub-threshold Leakage Component

Gate Leakage
• Extremely strong function of tox and Vgs
– Negligible for older processes
– Approaches subthreshold leakage at 65 nm and below in some processes
• An order of magnitude less for pMOS than nMOS
• Control leakage in the process using tox > 10.5 Å
– High-k gate dielectrics help
– Some processes provide multiple tox
  • e.g. thicker oxide for 3.3 V I/O transistors
• Control leakage in circuits by limiting VDD

Junction Leakage
• From reverse-biased p-n junctions
– Between diffusion and substrate or well
• Ordinary diode leakage is negligible
• Band-to-band tunneling (BTBT) can be significant
– Especially in high-Vt transistors where other leakage is small
– Worst at Vdb = VDD
• Gate-induced drain leakage (GIDL) exacerbates junction leakage
– Worst for Vgd = -VDD (or more negative)

RATIOED CIRCUIT
• Pseudo-NMOS logic style
– PMOS as resistor
– PDN as in static CMOS logic
• Static current
– When the output is low
• Power consumption
– Even without switching activity

Static Power Reduction
• Selectively use ratioed circuits
• Selectively use low-Vt devices
• Leakage reduction: stacked devices, body bias, low temperature