Optimizing Power @ Standby - Memory
Benton H. Calhoun, Jan M. Rabaey
Low Power Design Essentials © 2008, Chapter 9

Chapter Outline
§ Memory in Standby
§ Voltage Scaling
§ Body Biasing
§ Periphery

Memory Dominates Processor Area
§ SRAM is a major source of static power in ICs, especially for low-power applications
§ Special memory requirement: need to retain state in standby
§ Metrics for standby:
– 1. Leakage power
– 2. Energy overhead for entering/leaving standby
– 3. Timing/area overhead
[Figure: 6T SRAM bitcell with transistors M1-M6, storage nodes Q and QB, wordline WL, and bitlines BL and BLB]

Reminder of “Design Time” Leakage Reduction
§ Design-time techniques (Ch 7) also impact leakage
– High VTH transistors
– Different precharge voltages
– Floating BLs
§ This chapter: adaptive methods that uniquely address memory standby power

The Voltage Knobs
§ Changing internal voltages has a different impact on the leakage of the various transistors in the cell
§ Voltage changes are accomplished by playing tricks with the peripheral circuits
[Figure: leakage reduction (ratio, 1 down to 10^-5) versus offset voltage δ (0 to 1.0 V) for an off NMOS, with curves A1, A2, B1, B2 and C for offsets applied to different terminals; the weakest curve is labeled DIBL]
Parameters: L = 90 nm, tOX = 2 nm, VDD = 1 V, S = 100 mV/decade, K = 0.2 V^1/2, 2Ψ = 0.6 V, λ = 0.05
[Ref: Y. Nakagome, IBM’03]

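The relative strength of the knobs can be sketched with a first-order subthreshold model. This is a rough illustration, not the method behind the plotted curves: the model form (current proportional to 10^((VGS - VT)/S), with a body-effect and a DIBL term in VT) is an assumption, as is reading the slide's 0.6 V as the 2Ψ term and 0.05 as the DIBL factor. Function names are hypothetical, and gate leakage and GIDL are ignored.

```python
import math

S = 0.100       # subthreshold swing in V/decade (slide value)
K = 0.2         # body-effect coefficient in V^0.5 (slide value)
TWO_PSI = 0.6   # surface-potential term in V (assumed meaning of the slide's 0.6 V)
LAMBDA = 0.05   # DIBL factor (assumed meaning of the slide's 0.05)


def vt_shift_body(delta):
    """Threshold increase caused by a reverse body bias of delta volts."""
    return K * (math.sqrt(TWO_PSI + delta) - math.sqrt(TWO_PSI))


def reduction(vgs_drop=0.0, body_bias=0.0, vds_drop=0.0):
    """Ratio I(after) / I(before) for an off NMOS under the assumed model.

    vgs_drop:  how far VGS is pushed below zero
    body_bias: reverse bias applied between source and body
    vds_drop:  how far VDS is lowered (acts only through DIBL)
    """
    exponent = -(vgs_drop + vt_shift_body(body_bias) + LAMBDA * vds_drop) / S
    return 10.0 ** exponent


def compare(delta):
    """Leakage ratio for each knob when the same offset delta is applied."""
    return {
        # Raising the source changes VGS, VBS and VDS at the same time.
        "raise_source": reduction(vgs_drop=delta, body_bias=delta, vds_drop=delta),
        "negative_gate": reduction(vgs_drop=delta),
        "reverse_body": reduction(body_bias=delta),
        "lower_drain": reduction(vds_drop=delta),
    }


if __name__ == "__main__":
    for name, ratio in compare(0.2).items():
        print(f"{name:14s} {ratio:.2e}")
```

Under these assumptions, offsets that act on VGS directly are far more effective than offsets that act only through the body effect or DIBL, which is the qualitative ordering the slide illustrates.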
Lower VDD in Standby
§ Basic idea: lower VDD lowers leakage
– sub-threshold leakage
– GIDL
– gate tunneling
§ Question: What sets the lower limit?
[Figure: example SRAM whose supply VDD_SRAM is switched between VDDH in active mode and VDDL (“drowsy”) in standby mode]
[Ref: K. Flautner, ISCA’02]

Limits to VDD Scaling: DRV
Data Retention Voltage (DRV): the voltage below which a bitcell loses its data. That is, the supply voltage at which the Static Noise Margin (SNM) of the SRAM cell in standby mode reduces to zero.
[Figure: voltage transfer characteristics (VTCs) of the cross-coupled inverters in 130 nm CMOS, V2 (V) versus V1 (V), for VDD = 0.4 V and VDD = 0.18 V]
[Ref: H. Qin, ISQED’04]

Power Savings of DRV
§ More than 90% reduction in leakage power with 350 mV standby VDD (100 mV guard band)
Test chip in 130 nm CMOS technology with built-in voltage regulator
[Figure: measured leakage current (μA) versus supply voltage (V) for a 4 kB SRAM IP module, with the measured DRV range marked; the die photo carries a 1.4 mm dimension label]
[Ref: H. Qin, ISQED’04]

DRV and Transistor Sizes
[Figure: DRV (mV) versus width scaling factor for Ma, Mp and Mn, together with the model]
Ma, Mp and Mn are the access transistor, PMOS pull-up and NMOS pull-down, respectively.
[Ref: H. Qin, JOLPE’06]

Impact of Process “Balance”
A stronger PMOS or NMOS (SP, SN) in subthreshold lowers the SNM even for a typical cell.
[Ref: J. Ryan, GLSVLSI’07]

Impact of Process Variations on DRV
§ DRV varies widely from cell to cell
§ Most variations are random, with some systematic effects (e.g. module boundaries)
§ DRV histogram has a long tail
[Figure: DRV histogram for a 32 kbit SRAM in 130 nm CMOS (cell count versus DRV in mV) and DRV spatial distribution]
[Ref: H. Qin, ISQED’04]

Impact of Process Variations on DRV
[Figure: DRV distribution for 90 nm and 45 nm CMOS (frequency versus DRV in mV), with the 90 nm tail and the 45 nm tail marked (© IEEE 2007)]
Other sources of variation: global variations, data values, temperature (weak), bit-line voltage (weak)
[Ref: J. Wang, CICC’07]

DRV Statistics for an Entire Memory
§ DRV distribution is neither normal nor lognormal
§ CDF model of DRV distribution: F_DRV(x) = 1 - P(SNM < 0, VDD = x)
[Figure: worst DRV (mV) versus memory size, comparing the model, normal and lognormal fits, and Monte Carlo results (© IEEE 2007)]
[Ref: J. Wang, ESSCIRC 2007]

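Why the worst DRV grows with memory size can be sketched as follows: the array's DRV is set by its worst cell, so if cells are assumed independent with per-cell CDF F(x), the probability that all N cells retain data at VDD = x is F(x)^N. The per-cell distribution below (a shifted exponential tail) and every number in it are hypothetical placeholders, not the CDF model from the cited work.

```python
import math


def cell_cdf(x_mv, floor_mv=100.0, scale_mv=15.0):
    """Hypothetical per-cell CDF P(DRV <= x); illustrative long-tailed shape only."""
    if x_mv <= floor_mv:
        return 0.0
    return 1.0 - math.exp(-(x_mv - floor_mv) / scale_mv)


def chip_cdf(x_mv, n_cells):
    """P(every one of n_cells retains its data at VDD = x), assuming independence."""
    return cell_cdf(x_mv) ** n_cells


def chip_drv(n_cells, target=0.99, lo=0.0, hi=1000.0):
    """Smallest VDD in mV at which the whole array retains with probability >= target."""
    for _ in range(60):  # bisection works because chip_cdf is monotonic in x
        mid = 0.5 * (lo + hi)
        if chip_cdf(mid, n_cells) >= target:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    for bits in (1, 2**10, 2**15, 2**20):
        print(f"{bits:>8d} cells -> chip DRV about {chip_drv(bits):.1f} mV")
```

With a long-tailed cell distribution, each increase in array size pushes the required standby voltage further into the tail, which is why the shape of the tail matters more than the mean.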
Reducing the DRV
[Figure: DRV histogram (cell count versus DRV in mV) with the chip DRV marked]
1. Cell optimization
2. ECC (Error Correcting Codes)
3. Cell optimization + ECC

Lowering the DRV Using ECC
[Figure: SRAM with ECC. On a write, Data In passes through the ECC encoder and is stored as data (D) plus parity (P); on a read, the ECC decoder performs error correction and produces Data Out]
Error correction challenges:
§ Maximize correction rate
§ Minimize timing overhead
§ Minimize area overhead
Results:
§ Hamming [31, 26, 3] achieves 33% power saving
§ Reed-Muller [256, 219, 8] achieves 35% power saving
[Ref: A. Kumar, ISCAS’07]

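The reason ECC lets the supply drop further can be sketched with a simple word-failure calculation: a code with minimum distance d corrects up to (d - 1) // 2 bit errors, so a word is lost only when more cells than that fail at the chosen voltage. The code parameters [n, k, d] come from the slide; the independent bit-failure assumption, the example failure probability and the function names are hypothetical, and the sketch does not reproduce the reported 33% and 35% savings.

```python
from math import comb


def correctable_errors(d_min):
    """Number of bit errors a code with minimum distance d_min can correct."""
    return (d_min - 1) // 2


def word_failure_prob(n, d_min, p_bit):
    """P(more errors than the code can correct) in an n-bit word.

    Assumes each bit fails independently with probability p_bit.
    """
    t = correctable_errors(d_min)
    return sum(
        comb(n, i) * p_bit**i * (1.0 - p_bit) ** (n - i)
        for i in range(t + 1, n + 1)
    )


def storage_overhead(n, k):
    """Extra bits stored per data bit for an [n, k] code."""
    return n / k - 1.0


if __name__ == "__main__":
    p = 1e-4  # hypothetical per-cell retention failure probability
    print("no ECC, 26 bits  :", word_failure_prob(26, 1, p))
    print("Hamming [31,26,3]:", word_failure_prob(31, 3, p),
          "overhead", round(storage_overhead(31, 26), 3))
    print("RM [256,219,8]   :", word_failure_prob(256, 8, p),
          "overhead", round(storage_overhead(256, 219), 3))
```

The overhead figures show the trade-off listed under the challenges: a stronger code tolerates more failing cells per word but stores more parity bits, which themselves leak.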
Combining Cell Optimization and ECC
[Figure: normalized SRAM leakage current versus VDD (V) for the original SRAM and the optimized SRAM with ECC, with operating points A to D; annotations: 50X, 650 mV, 320 mV, 255 mV]
Operating points (SRAM, standby VDD):
A  Standard            1 V
B  Standard            DRVMAX + 100 mV
C  Optimized           DRVMAX + 100 mV
D  Optimized with ECC  DRVECC_MAX + 100 mV
[Ref: A. Kumar, ISCAS’07]

How to Approach the DRV Safely?
Using “canary cells” to set the standby voltage in closed loop
[Figure: an adjustable power supply feeds VDD to the core cells and to canary cells, whose failure point is set by the VCTRL voltages; failure detectors report canary failures to a sub-VT controller, which adjusts the supply and resets the canaries]
[Ref: J. Wang, CICC’07]

How to Approach the DRV Safely?
§ Multiple sets of canary cells
§ 0.6% area overhead in 90 nm test chip
[Figure: DRV histogram of the SRAM cells with the canary failure threshold; moving the threshold trades less power against more reliability]
[Figure: mean DRV of canary cells (V) versus VCTRL (V); 128 Kb SRAM array with canary replica and test circuit (© IEEE 2007)]
[Ref: J. Wang, CICC’07]

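The closed-loop idea can be sketched in software: canary cells are built to lose data at a higher voltage than the core cells, so the controller can keep lowering VDD until a canary fails, then back off. This is a behavioral illustration only; the real controller is a circuit, and the function names, step size and DRV values below are hypothetical.

```python
def canary_failed(vdd_mv, canary_drvs_mv):
    """A canary set reports failure once VDD is below the DRV of any canary."""
    return any(vdd_mv < drv for drv in canary_drvs_mv)


def settle_standby_vdd(start_mv, canary_drvs_mv, step_mv=10):
    """Step VDD down until a canary fails, then return to the last safe value.

    Assumes start_mv itself is safe for all canaries.
    """
    vdd = start_mv
    while vdd - step_mv > 0:
        vdd -= step_mv
        if canary_failed(vdd, canary_drvs_mv):
            vdd += step_mv  # back off one step; canaries would be reset here
            break
    return vdd


if __name__ == "__main__":
    canaries = [300, 320, 310]   # hypothetical canary DRVs in mV
    worst_core_drv = 250         # hypothetical worst core-cell DRV in mV
    vdd = settle_standby_vdd(1000, canaries)
    print(f"standby VDD = {vdd} mV, margin to core DRV = {vdd - worst_core_drv} mV")
```

Shifting the canary DRVs closer to the core distribution lowers the settled voltage and saves power, at the cost of a smaller margin before core cells start to fail.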
Raising VSS
§ Raise bitcell VSS in standby (e.g. 0 to 0.5 V)
§ Lower BL voltage in standby (e.g. 1.5 V to 1 V)
Effects:
– Lower voltage: less gate leakage and GIDL
– Lower VDS: less sub-VTH leakage (DIBL)
– Negative VBS: reduces sub-VTH leakage
[Figure: bitcell in standby with WL = 0 V, bitlines at 1.0 V, cell supply at 1.5 V, and the stored ‘0’ sitting at 0.5 V]
[Ref: K. Osada, JSSC’03]

Body Biasing
§ Reverse Body Bias (RBB) for leakage reduction
– Move FET source (as in raised VSS)
– Move FET body
§ Example: whenever WL is low, apply RBB
[Figure: bitcell with body terminals VPB and VNB, and waveforms for WL, VDD/VSS (VDD and 0 V) and VPB/VNB in active and standby; the body-bias waveforms carry the labels 2VDD and -VDD]
[Ref: H. Kawaguchi, VLSI Symp. ’98]

Combining Body Biasing and Voltage Scaling
[Figure: bitcell with body terminals VPB and VNB, and waveforms for WL, VDD/VSS and VPB/VNB in active and standby]
[Ref: A. Bhavnagarwala, SOC’00]

Combining Raised VSS and RBB
Supply   Active (V)   Standby (V)
VPB      1.0          1.75
VDD      1.0          1.0
VSS      0.0          0.65
VNB      0.0          0.0
28X savings in standby power reported
[Figure: bitcell with supply rails VDD and VSS and body terminals VPB and VNB]
[Ref: L. Clark, TVLSI’04]

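The effect of these rail settings on the cell can be worked out with a few subtractions. The rail values are taken from the slide; treating VDD and VNB as unchanged between active and standby is an assumption based on the single value listed for each, and the function name is hypothetical.

```python
ACTIVE = {"VPB": 1.0, "VDD": 1.0, "VSS": 0.0, "VNB": 0.0}
STANDBY = {"VPB": 1.75, "VDD": 1.0, "VSS": 0.65, "VNB": 0.0}


def bias_summary(rails):
    """Derive the voltages the bitcell transistors actually see, in volts."""
    return {
        # Voltage across the cross-coupled inverters.
        "cell_voltage": rails["VDD"] - rails["VSS"],
        # NMOS source sits at VSS, body at VNB: positive means reverse bias.
        "nmos_reverse_bias": rails["VSS"] - rails["VNB"],
        # PMOS source sits at VDD, body at VPB: positive means reverse bias.
        "pmos_reverse_bias": rails["VPB"] - rails["VDD"],
    }


if __name__ == "__main__":
    for mode, rails in (("active", ACTIVE), ("standby", STANDBY)):
        summary = {k: round(v, 2) for k, v in bias_summary(rails).items()}
        print(mode, summary)
```

In standby the cell is left with 0.35 V across it while the NMOS and PMOS devices see 0.65 V and 0.75 V of reverse body bias, so raising VSS provides both the voltage scaling and the NMOS body bias at once.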
Voltage Scaling in and Around the Bitcell
Large number of reported techniques

Voltage: approach, source(s)
Bitcell VDD, bitcell VSS:
– lower in active (e.g. DVS); lower in standby; raise always; raise for read access; float or lower for write; float for read access; raise in standby: [1] [2][3][4][5][6][7] [8][9] [5][10]
– raise in standby; raise or float for write access; lower for read access: [6][7][11][12][13][14][15] [16] [9]
Wordline (WL): negative for standby [4][10]
WL driver VDD: lower in standby [7]
Well-biasing: change with mode [4][9]
Bitline VDD: lower for standby [12]

[1] K. Osada et al., JSSC 2001
[2] N. Kim et al., TVLSI 2004
[3] H. Qin et al., ISQED 2004
[4] K. Kanda et al., ASIC/SOC 2002
[5] A. Bhavnagarwala et al., Sym. VLSIC 2004
[6] T. Enomoto et al., JSSC 2003
[7] M. Yamaoka et al., Sym. VLSIC 2002
[8] M. Yamaoka et al., ISSCC 2004
[9] A. Bhavnagarwala et al., ASIC/SOC 2000
[10] K. Itoh et al., Sym. VLSIC 1996
[11] H. Yamauchi et al., Sym. VLSIC 1996
[12] K. Osada et al., JSSC 2003
[13] K. Zhang et al., Sym. VLSIC 2004
[14] K. Nii et al., ISSCC 2004
[15] A. Agarwal et al., JSSC 2003
[16] K. Kanda et al., JSSC 2004

Periphery Breakdown
§ Periphery leakage is often not negligible
– Wide transistors to drive large load capacitors
– Low VTH transistors to meet performance specs
§ Chapter 8 techniques for logic leakage reduction are equally applicable, but the task is easier than for generic logic because of the well-defined structure and signal patterns of the periphery
– e.g. decoders output 0 in standby
§ A lower peripheral VDD can be used, but it needs fast level conversion to interface with the array

Summary and Perspectives
§ SRAM standby power is leakage dominated
§ Voltage knobs are effective to lower power
§ Adaptive schemes must account for variation to allow outlying cells to function
§ Combined schemes are most promising
– e.g. voltage scaling and ECC
§ Important to assess overhead!
– Need for an exploration and optimization framework, in the style we have defined for logic

References

Books and Book Chapters:
§ K. Itoh, M. Horiguchi, and H. Tanaka, Ultra-Low Voltage Nano-Scale Memories, Springer, 2007.
§ T. Kawahara and K. Itoh, “Memory Leakage Reduction,” in Leakage in Nanometer CMOS Technologies, S. Narendra, Ed., Chapter 7, Springer, 2006.

Articles:
§ A. Agarwal, L. Hai, and K. Roy, “A single-Vt low-leakage gated-ground cache for deep submicron,” IEEE Journal of Solid-State Circuits, pp. 319-328, Feb. 2003.
§ A. Bhavnagarwala, A. Kapoor, and J. Meindl, “Dynamic-threshold CMOS SRAM cells for fast, portable applications,” Proceedings IEEE ASIC/SOC Conference, pp. 359-363, Sept. 2000.
§ A. Bhavnagarwala et al., “A transregional CMOS SRAM with single, logic VDD and dynamic power rails,” Proceedings IEEE VLSI Circuits Symposium, pp. 292-293, June 2004.
§ L. Clark, M. Morrow, and W. Brown, “Reverse-body bias and supply collapse for low effective standby power,” IEEE Transactions on VLSI, pp. 947-956, Sept. 2004.
§ T. Enomoto, Y. Ota, and H. Shikano, “A self-controllable voltage level (SVL) circuit and its low-power high-speed CMOS circuit applications,” IEEE Journal of Solid-State Circuits, Vol. 38, Issue 7, pp. 1220-1226, July 2003.
§ K. Flautner et al., “Drowsy Caches: Simple Techniques for Reducing Leakage Power,” Proceedings ISCA 2002, pp. 148-157, Anchorage, May 2002.
§ K. Itoh et al., “A deep sub-V, single power-supply SRAM cell with multi-VT, boosted storage node and dynamic load,” Proceedings VLSI Circuits Symposium, pp. 132-133, June 1996.
§ K. Kanda, T. Miyazaki, S. Min, H. Kawaguchi, and T. Sakurai, “Two orders of magnitude leakage power reduction of low voltage SRAMs by row-by-row dynamic Vdd control (RRDV) scheme,” Proceedings IEEE ASIC/SOC Conference, pp. 381-385, Sept. 2002.
§ K. Kanda et al., “90% write power-saving SRAM using sense-amplifying memory cell,” IEEE Journal of Solid-State Circuits, pp. 927-933, June 2004.
§ H. Kawaguchi, Y. Itaka, and T. Sakurai, “Dynamic Leakage Cut-off Scheme for Low-Voltage SRAMs,” Proceedings VLSI Symposium, pp. 140-141, June 1998.
§ A. Kumar et al., “Fundamental Bounds on Power Reduction during Data-Retention in Standby SRAM,” Proceedings ISCAS 2007, pp. 1867-1870, May 2007.
§ N. Kim, K. Flautner, D. Blaauw, and T. Mudge, “Circuit and microarchitectural techniques for reducing cache leakage power,” IEEE Transactions on VLSI, pp. 167-184, Feb. 2004.
§ Y. Nakagome et al., “Review and prospects of low-voltage RAM circuits,” IBM J. R & D, vol. 47, no. 5/6, pp. 525-552, Sept./Nov. 2003.
§ K. Osada, “Universal-Vdd 0.65-2.0-V 32-kB cache using a voltage-adapted timing-generation scheme and a lithographically symmetrical cell,” IEEE Journal of Solid-State Circuits, pp. 1738-1744, Nov. 2001.
§ K. Osada et al., “16.7-fA/cell tunnel-leakage-suppressed 16-Mb SRAM for handling cosmic-ray-induced multierrors,” IEEE Journal of Solid-State Circuits, pp. 1952-1957, Nov. 2003.
§ H. Qin et al., “SRAM leakage suppression by minimizing standby supply voltage,” Proceedings ISQED, pp. 55-60, 2004.
§ H. Qin, R. Vattikonda, T. Trinh, Y. Cao, and J. Rabaey, “SRAM Cell Optimization for Ultra-Low Power Standby,” Journal on Low Power Electronics, Vol. 2, No. 3, pp. 401-411, Dec. 2006.
§ J. Ryan, J. Wang, and B. Calhoun, “Analyzing and Modeling Process Balance for Sub-threshold Circuit Design,” Proceedings GLSVLSI, pp. 275-280, March 2007.
§ J. Wang and B. Calhoun, “Canary Replica Feedback for Near-DRV Standby VDD Scaling in a 90 nm SRAM,” Proceedings Custom Integrated Circuits Conference (CICC), pp. 29-32, Sept. 2007.
§ J. Wang, A. Singhee, R. Rutenbar, and B. Calhoun, “Statistical Modeling for the Minimum Standby Supply Voltage of a Full SRAM Array,” Proceedings European Solid State Circuits Conference (ESSCIRC), pp. 400-403, Sept. 2007.
§ M. Yamaoka et al., “0.4-V logic library friendly SRAM array using rectangular-diffusion cell and delta-boosted-array-voltage scheme,” Proceedings VLSI Circuits Symposium, pp. 13-15, June 2002.
§ M. Yamaoka et al., “A 300 MHz 25 μA/Mb leakage on-chip SRAM module featuring process-variation immunity and low-leakage-active mode for mobile-phone application processor,” Proceedings IEEE Solid-State Circuits Conference, pp. 15-19, Feb. 2004.
§ K. Zhang et al., “SRAM design on 65 nm CMOS technology with integrated leakage reduction scheme,” Proceedings VLSI Circuits Symposium, pp. 294-295, June 2004.