Low Power Design of Standard Cell Digital VLSI

Low Power Design of Standard Cell Digital VLSI Circuits
By Siri Uppalapati
Thesis Directors: Prof. M. L. Bushnell and Prof. V. D. Agrawal
ECE Department, Rutgers University
May 18, 2004, MS Defense: Uppalapati

Talk Outline
• Motivation
• Background
• Prior Work
• Proposed Design Flow
• Results
• Conclusion and Future Work

Motivation
• Increasing gate count + increasing clock frequency = increasing POWER
• Portable equipment runs on battery
• Glitches can account for 30–70% of power consumption

Motivation: Chip Power Density
[Figure: power density (W/cm²), log scale from 1 to 10000, versus year (1970 to 2010) for Intel processors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium®, and P6, with reference levels for a hot plate, a nuclear reactor, a rocket nozzle, and the Sun's surface. Source: Intel]

Motivation (cont’d…)
• Present-day Application Specific Integrated Circuit (ASIC) chips employ a standard cell based design style
    • A quick way to design circuits with millions of gates
• Existing glitch reduction techniques demand gate re-design: not suitable for a cell-based design

Problem Statement
• To devise a glitch suppressing methodology after the technology mapping phase
    • Without requiring cell re-design
    • Without violating circuit delay constraints
[Figure: design flow: Design Entry, Technology Mapping, Layout]

Talk Progress: Motivation, Background, Prior Work, Proposed Design Flow, Results, Conclusion and Future Work

Power Dissipation in CMOS Circuits (0.25µ)

$$P_{total} = C_L V_{DD}^2 f_{0 \to 1} + t_{sc} V_{DD} I_{peak} f_{0 \to 1} + V_{DD} I_{leakage}$$

• First term (switching of the load capacitance $C_L$): 75%
• Second term (short-circuit current): 20%
• Third term (leakage): 5%
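The three terms of the power equation can be evaluated separately, as in the minimal sketch below. The function name and all numeric values are hypothetical and only show how each term is formed; they are not figures from the thesis and are not tuned to reproduce the 75/20/5 split.

```python
def cmos_power(c_load, vdd, f_switch, t_sc, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage, total) power in watts.

    c_load   : load capacitance C_L in farads
    vdd      : supply voltage V_DD in volts
    f_switch : rate of 0->1 transitions, f_{0->1}, in hertz
    t_sc     : short-circuit conduction time per transition in seconds
    i_peak   : peak short-circuit current in amperes
    i_leak   : leakage current in amperes
    """
    dynamic = c_load * vdd ** 2 * f_switch           # C_L * VDD^2 * f_{0->1}
    short_circuit = t_sc * vdd * i_peak * f_switch   # t_sc * VDD * I_peak * f_{0->1}
    leakage = vdd * i_leak                           # VDD * I_leakage
    return dynamic, short_circuit, leakage, dynamic + short_circuit + leakage


# Hypothetical operating point, chosen only for illustration.
dyn, sc, leak, total = cmos_power(
    c_load=1e-12, vdd=2.5, f_switch=100e6,
    t_sc=50e-12, i_peak=1e-4, i_leak=1e-8,
)
for name, p in (("dynamic", dyn), ("short-circuit", sc), ("leakage", leak)):
    print(f"{name}: {p:.3e} W ({100 * p / total:.2f}% of total)")
```

Because glitches are extra 0-to-1 transitions, they raise $f_{0 \to 1}$ and therefore the first two terms.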

Glitches?
• Unnecessary transitions
• Occur due to differential path delays
• Contribute about 30–70% of total power consumption
[Figure: gate-level example with annotated delays]

Standard Cell Based Style
• Standard cells organized in rows (AND, OR, flip-flops, etc.)
• Cells made as full custom
• All cells of same height
• Reasonable design time
    • Due to automatic translation from logic level to layout
[Figure: standard cell layout with labels: Routing, Cell, IO cell]

Talk Progress: Motivation, Background, Prior Work, Proposed Design Flow, Results, Conclusion and Future Work

Prior Work
• Existing glitch reduction techniques
    • Low power design by hazard filtering [Agrawal, VLSI Design ’97]
    • Reduced constraint set linear program [Raja et al., VLSI Design ’03]
    • CMOS circuit design for minimum dynamic power and highest speed [Raja et al., VLSI Design ’04]
• Optimization of cell based design
    • Cell library optimization [Masgonty et al., PATMOS ’01]
    • Cell selection [Zhang et al., DAC ’01]

Prior Work: Hazard Filtering
Reference: V. D. Agrawal, “Low Power Design by Hazard Filtering”, VLSI Design 1997
• A glitch is suppressed when the inertial delay of a gate exceeds the differential input delays.
• Re-design all gates in the circuit for inertial delay > differential delay
[Figure: filtering effect of a gate]
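The filtering condition can be checked per gate with a few lines of code. This is a minimal sketch: the function name and the example arrival times are hypothetical, and the differential delay is assumed to be the spread between the earliest and latest input arrival times.

```python
def is_glitch_filtered(inertial_delay, input_arrival_times):
    """Return True if the gate suppresses glitches at its output.

    The differential input delay is taken to be the spread between the
    earliest and the latest arrival time at the gate inputs (assumption).
    """
    differential_delay = max(input_arrival_times) - min(input_arrival_times)
    return inertial_delay > differential_delay


# Inputs arrive 2 time units apart: a gate with inertial delay 3 filters
# the resulting pulse, a gate with inertial delay 1 does not.
print(is_glitch_filtered(3, [0, 2]))  # True
print(is_glitch_filtered(1, [0, 2]))  # False
```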

Prior Work: A Reduced Constraint Set LP Model for Glitch Removal
Reference: T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program”, VLSI Design 2003
• Gate variables $d_4, \ldots, d_{12}$
• Buffer variables $d_{15}, \ldots, d_{29}$
• Corresponding window variables $t_4, \ldots, t_{29}$ and $T_4, \ldots, T_{29}$
[Figure: example circuit annotated with the gate, buffer, and window variables]

Prior Work: A Reduced Constraint Set LP Model for Glitch Removal (cont’d…)
• Objective function: minimize the sum of buffer delays inserted
    $$\min \sum_{j \,\in\, \text{buffers}} d_j$$
• Glitch removal constraint, for all gates $g$:
    $$d_g > T_g - t_g$$
• Maxdelay constraint, for all primary outputs:
    $$T_{PO} \le \text{maxdelay}$$
• Transistor sizing or other procedures are used to implement these delays
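The role of the window variables can be illustrated without an LP solver. The sketch below assumes that $t_g$ and $T_g$ are the earliest and latest times at which the output of gate $g$ can change, propagated as the minimum and maximum over the fan-ins plus the gate delay; the slides do not spell out this definition, so treat it as an assumption. The circuit, the names, and the hand-placed buffer are hypothetical; a real flow would let the LP choose the buffer delays.

```python
def timing_windows(gates, primary_inputs):
    """Propagate (t, T) windows through gates listed in topological order.

    gates: list of (name, delay, fanins). Primary inputs change at time 0.
    """
    t = {pi: 0.0 for pi in primary_inputs}   # earliest arrival
    T = {pi: 0.0 for pi in primary_inputs}   # latest arrival
    for name, delay, fanins in gates:
        t[name] = min(t[f] for f in fanins) + delay
        T[name] = max(T[f] for f in fanins) + delay
    return t, T


def glitch_violations(gates, t, T):
    """Gates that fail the glitch removal constraint d_g > T_g - t_g."""
    return [name for name, delay, _ in gates if not delay > T[name] - t[name]]


inputs = ["a", "b", "c"]

# g2 sees g1 (arriving at time 1) and c (arriving at time 0): window too wide.
before = [("g1", 1.0, ["a", "b"]), ("g2", 1.0, ["g1", "c"])]
t, T = timing_windows(before, inputs)
print(glitch_violations(before, t, T))   # ['g2']

# A delay buffer on c balances the two paths into g2.
after = [("g1", 1.0, ["a", "b"]), ("buf_c", 1.0, ["c"]),
         ("g2", 1.0, ["g1", "buf_c"])]
t, T = timing_windows(after, inputs)
print(glitch_violations(after, t, T))    # []
print(T["g2"])                           # latest arrival at the output: 2.0
```

In this example the buffer sits off the longest path, so the latest arrival time at the output is unchanged and a maxdelay bound of 2.0 would still be met.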

Prior Work: Cell Library Optimization
Reference: J. M. Masgonty, S. Cserveny, C. Arm and P. D. Pfister, “Low-Power Low-Voltage Standard Cell Libraries with a Limited Number of Cells”, PATMOS ’01
• Limited logic functions with greater cell sizing can result in 20–25% savings in power
• Transistor sizing for
    • Multiple driving strength
    • Balanced rise and fall times
• Power optimized by minimizing parasitic capacitances
• Limitations:
    • Discrete set of varieties
    • Optimization of cells cannot be circuit-specific

Prior Work: Cell Selection
Reference: Y. Zhang, X. Hu and D. Z. Chen, “Cell Selection from Technology Libraries for Minimizing Power”, DAC ’01
• Mixed Integer Linear Program (MILP) to select from different realizations of cells such that power consumption is minimized without violating delay constraints
    • Sum of dynamic and leakage power is minimized
• A set of variables for each cell to support different
    • Supply voltages
    • Threshold voltages
    • Sizes
• Achieved 79% power saving on average
• Limitation: depends on diversity of the cell library

Talk Progress: Motivation, Background, Prior Work, Proposed Design Flow, Results, Conclusion and Future Work

New Glitch Removing Solution
• Balanced the differential delays at cell inputs:
    • Using delay elements called resistive feed-through cells
• Automated the delay element
    • Generation
    • Insertion into the circuit

Proposed Design Flow
• Modified linear program
• Resistive feed-through cell generation:
    • Fully automated
    • Scalable to large ICs
• Layout generation of modified netlist
    • Can use any place-and-route tool
[Figure: design flow: Design Entry, Tech. Mapping, Remove Glitches, Layout]

First Attempt – Did not work: Modified Linear Program
• Changes from Raja’s linear program:
    • Gate delays: constants
    • Wire delays: the only variables
• Constrained solution space
• Large number of buffers inserted
• Buffers consume power
    • May exceed the power saved

Circuit      # gates   # bufs
4-bit ALU         90       36
c432             240      120
c499             618      396
c880             383      217
c1355            546      414
c2670           1193      162

Comparison of Delay Elements
• Resistor shows
    • Maximum delay
    • Minimum power and area per unit delay
    • Hence, best delay element
• Resistive feed-through cell
    • A fictitious buffer at logic level

Delay element                Average delay (ns)   Delay/Power   Delay/Area
I.   Inverter pair                         0.28          0.22         0.03
II.  n diffusion capacitor                 0.59          4.43         0.05
III. Polysilicon resistor                  0.72          5.54         0.11
IV.  Transmission gate                     0.63          1.05         0.16

Resistive Feed-through Cell
• A parameterized cell
    $$R = R_{\square} \cdot \frac{\text{length of poly}}{\text{width of poly}}$$
• Physical design is simple and easily automated
• No routing layers (M2 to M5) used, so the cell is not an obstruction to the router
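Since the cell is parameterized by its resistance, generating it amounts to solving the sheet-resistance relation for the poly length. The sketch below uses hypothetical function names and illustrative values for the sheet resistance and poly width; neither value comes from the thesis.

```python
def poly_resistance(r_sheet, length, width):
    """R = R_sheet * (length of poly) / (width of poly)."""
    return r_sheet * length / width


def poly_length_for_resistance(r_target, r_sheet, width):
    """Invert the relation above: poly length needed for a target resistance."""
    return r_target * width / r_sheet


# Illustrative values only: 100 ohm/square poly, 0.5 um wide strip.
R_SHEET = 100.0   # ohms per square (assumed)
WIDTH = 0.5       # micrometres (assumed)

length = poly_length_for_resistance(2000.0, R_SHEET, WIDTH)
print(f"poly length: {length} um")
print(f"check: {poly_resistance(R_SHEET, length, WIDTH)} ohm")
```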

RC Delay Model
• Used to find the resistance value for a given delay
• Delay depends on load capacitance
    • Number of fan-outs
• SPECTRE simulations done for varying R and CL values
• CL is varied in steps of transistor pairs
[Figure: RC circuit with input Vin, resistor R, and load capacitance CL]

RC Delay Model (cont’d…)
• CL varies during transition
    • Model not perfectly linear
• Measured data stored as a 3D lookup table
• Average of signal rise and fall delays: $T_P = \frac{T_{PLH} + T_{PHL}}{2}$
• Linear interpolation between two points
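The lookup step can be sketched as follows, assuming the table is indexed by load (number of fan-out transistor pairs) and holds measured (resistance, delay) points for each load. The table layout, the function names, and every number are hypothetical stand-ins for the SPECTRE measurements.

```python
from bisect import bisect_left


def propagation_delay(t_plh, t_phl):
    """Average of the rise and fall delays: T_P = (T_PLH + T_PHL) / 2."""
    return (t_plh + t_phl) / 2.0


def resistance_for_delay(table, load_pairs, target_delay):
    """Linearly interpolate the resistance that gives target_delay.

    table maps a load (in transistor pairs) to a list of (R, delay) points
    whose delays are strictly increasing with R.
    """
    points = sorted(table[load_pairs], key=lambda p: p[1])
    delays = [d for _, d in points]
    if not delays[0] <= target_delay <= delays[-1]:
        raise ValueError("target delay outside the measured range")
    hi = bisect_left(delays, target_delay)
    if delays[hi] == target_delay:
        return points[hi][0]
    (r0, d0), (r1, d1) = points[hi - 1], points[hi]
    return r0 + (r1 - r0) * (target_delay - d0) / (d1 - d0)


# Hypothetical measurements: {load: [(R in ohms, delay in ns), ...]}
TABLE = {
    1: [(1000.0, 0.10), (2000.0, 0.20), (4000.0, 0.40)],
    2: [(1000.0, 0.18), (2000.0, 0.35), (4000.0, 0.70)],
}

print(resistance_for_delay(TABLE, 1, 0.30))  # between the 2000 and 4000 ohm points
print(propagation_delay(0.2, 0.4))
```

A target delay outside the measured range raises an error here rather than extrapolating, which is a design choice of this sketch.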

Detailed Design Flow
• Overall flow: Design Entry, Tech. Mapping, Remove Glitches, Layout
• Remove Glitches step:
    • Find delays from LP
    • Find resistor values from lookup table
    • Generate feed-through cells and modify netlist

Talk Progress: Motivation, Background, Prior Work, Proposed Design Flow, Results, Conclusion and Future Work

Experimental Procedure
• Extract cell delays from initial layout
• LP solver: CPLEX in AMPL
    • C program to generate the input files
• Physical design of feed-through cells and insertion of fictitious buffers
    • SPECTRE simulation
    • PERL script
• Place-and-Route
    • Silicon Ensemble from Cadence

Power Estimation
• Logic level
    • Event-driven delay simulator to count the transitions
    • Power ∝ # transitions × # fanouts
• Post layout
    • SPECTRE simulator to measure current through the power rail
    • Average power calculated by integration
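Both estimates reduce to short computations once the simulator output is available. The sketch below assumes per-net transition counts and fan-out counts for the logic-level figure, and sampled supply-current data for the post-layout figure, integrated with the trapezoidal rule. The data layout, names, and numbers are hypothetical.

```python
def logic_level_power(transitions, fanouts):
    """Relative power metric: sum over nets of (# transitions x # fanouts)."""
    return sum(transitions[net] * fanouts[net] for net in transitions)


def average_power(times, currents, vdd):
    """Average power from sampled supply current, by trapezoidal integration."""
    energy = 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        energy += vdd * 0.5 * (currents[k] + currents[k - 1]) * dt
    return energy / (times[-1] - times[0])


# Logic level: the same two nets, with and without glitch transitions.
fanouts = {"n1": 1, "n2": 3}
with_glitches = {"n1": 4, "n2": 2}
without_glitches = {"n1": 2, "n2": 2}
print(logic_level_power(with_glitches, fanouts))      # 10
print(logic_level_power(without_glitches, fanouts))   # 8

# Post layout: a triangular current pulse sampled at three points.
print(average_power([0.0, 1.0, 2.0], [0.0, 2.0, 0.0], vdd=2.5))  # 2.5
```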

Results

             New Standard Cell Based Design         Raja et al.
Circuit      Area Overhead (%)   Power Saved (%)    Power Saved (%)
4-bit ALU                 29.5              23.7    N/A
c432                     114.0              50.0    35.0
c499                      86.0              32.0    29.0
c880                      98.0              43.0    44.0
c1355                     22.0              68.3    56.0
c2670                     14.0              30.0    31.0
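The per-circuit figures for the new design can be averaged directly. The short check below uses only the numbers from the results table; the variable names are illustrative. The averages it prints are consistent with the roughly 60% area overhead and 41% dynamic power saving quoted in the conclusions.

```python
# (area overhead %, power saved %) for the new standard cell based design
results = {
    "4-bit ALU": (29.5, 23.7),
    "c432": (114.0, 50.0),
    "c499": (86.0, 32.0),
    "c880": (98.0, 43.0),
    "c1355": (22.0, 68.3),
    "c2670": (14.0, 30.0),
}

avg_area = sum(area for area, _ in results.values()) / len(results)
avg_power = sum(power for _, power in results.values()) / len(results)

print(f"average area overhead: {avg_area:.1f}%")   # 60.6%
print(f"average power saved:   {avg_power:.1f}%")  # 41.2%
```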

Glitch Elimination on net 86 in the 4-bit ALU
[Figure: waveform of net 86. Source: post-layout simulation in SPECTRE]

Energy Saving in 4-bit ALU

Layouts of c880
[Figures: original layout of c880; optimized layout of c880]

Talk Progress: Motivation, Background, Prior Work, Proposed Design Flow, Results, Conclusion and Future Work

Conclusions
• Successfully devised a glitch removal method for the standard cell based design style
    • Does not require re-design of the mapped cells
    • Does not increase the critical path delay
    • Scalable with technology
• The modified design flow is well automated
    • Maintains the low design time of this style
• On average
    • Dynamic power saving: 41%
    • Area overhead: 60%

Future Work
• Diverse target cell library
    • Cells of different propagation delays
    • LP model needs to be changed
    • Might become an ILP
• Interconnect delays can be used
    • Placement and routing algorithms need to be controlled
    • An NP-complete problem
    • 70% of necessary delays below 2 ns

Future Work (cont’d…)
Reference: 1997 International Technology Roadmap for Semiconductors

References
• V. D. Agrawal, “Low Power Design by Hazard Filtering”, VLSI Design 1997
• T. Raja, V. D. Agrawal and M. L. Bushnell, “Minimum Dynamic Power CMOS Circuit Design by a Reduced Constraint Set Linear Program”, VLSI Design 2003
• Y. Zhang, X. Hu and D. Z. Chen, “Cell Selection from Technology Libraries for Minimizing Power”, DAC 2001
• J. M. Masgonty, S. Cserveny, C. Arm and P. D. Pfister, “Low-Power Low-Voltage Standard Cell Libraries with a Limited Number of Cells”, PATMOS 2001

THANK YOU

Prior Work: Existing Low Power Design Techniques
• System: HW/SW co-design, Custom ISA, Algorithm design
• Architectural: Scheduling, Pipelining, Binding
• RT-Level: Clock gating, State assignment, Retiming
• Logic: Restructuring, Technology mapping
• Physical: Fan-out optimization, Buffering, Transistor sizing, Glitch elimination