FinFETs: From Circuit to Architecture
Niraj K. Jha
Dept. of Electrical Engineering, Princeton University
Joint work with: Anish Muttreja, Prateek Mishra, Chun-Yi Lee, Ajay Bhoj and Wei Zhang

Talk Outline
• Background
• Low Power FinFET Circuits
  – Unusual Logic Styles
  – Unusual Dual-Vdd/Dual-Vth Circuits
• Architectural Impact
• Other Ongoing Work
• Conclusions

Why Double-gate Transistors?
[Figure: feature-size axis with bulk CMOS (32 nm), a gap, and non-Si nano devices (10 nm); DG-FETs shown in the gap]
• DG-FETs can be used to fill this gap
• DG-FETs are extensions of CMOS
  – Manufacturing processes similar to CMOS
• Key limitations of CMOS scaling addressed through
  – Better control of channel from transistor gates
  – Reduced short-channel effects
  – Better Ion/Ioff
  – Improved sub-threshold slope
  – No discrete dopant fluctuations

What are FinFETs?
• Fin-type DG-FET
  – A FinFET is like a FET, but the channel has been “turned on its edge” and made to stand up
[Figure: Si fin]

Independent-gate FinFETs
[Figure labels: oxide insulation, back gate]
• Both the gates of a FET can be independently controlled
• Independent control
  – Requires an extra process step
  – Leads to a number of interesting analog and digital circuit structures

FinFET Width Quantization
• Electrical width of a FinFET with n fins: W = 2*n*h, where h is the fin height
• Channel width in a FinFET is quantized
• Width quantization is a design challenge if fine control of transistor drive strength is needed
  – E.g., in ensuring stability of memory cells
[Figure: FinFET structure (Ananthan, ISQED’05)]
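
The quantization constraint on this slide can be illustrated with a short sketch. It assumes the slide's relation W = 2*n*h with h read as the fin height. The function name, the 40 nm fin height and the 200 nm target width are hypothetical values chosen only for illustration; they do not come from the talk.

```python
import math

def fins_for_width(target_width_nm, fin_height_nm):
    """Return (n, achieved_width_nm) for the smallest fin count n
    whose electrical width 2*n*h meets or exceeds the target."""
    step = 2 * fin_height_nm                      # width added by one fin
    n = max(1, math.ceil(target_width_nm / step))
    return n, n * step

# Hypothetical numbers: a 200 nm target with 40 nm fins.
n, width = fins_for_width(200, 40)
print(f"fins: {n}, achieved width: {width} nm, overshoot: {width - 200} nm")
```

Because only whole fins can be added, any target between two multiples of 2*h is rounded up to the next multiple, which is the loss of fine drive-strength control the slide refers to.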

Motivation: Power Consumption
• Traditional view of CMOS power consumption
  – Active mode: Dynamic power (switching + short circuit + glitching)
  – Standby mode: Leakage power
• Problem: rising active leakage
  – 40% of total active mode power consumption (70 nm bulk CMOS)†
†J. Kao, S. Narendra and A. Chandrakasan, “Subthreshold leakage modeling and reduction techniques,” in Proc. ICCAD, 2002.

Logic Styles: NAND Gates
[Figure: NAND gate schematics. Panels: SG-mode NAND; LP-mode NAND (pull-up bias voltage, pull-down bias voltage); IG-mode NAND; IG/LP-mode NAND (IG-mode pull up, LP-mode pull down)]

Comparing Logic Styles
Design Mode | Advantages | Disadvantages
SG | Fastest under all load conditions | High leakage† (1 μA)
LP | Very low leakage (85 nA), low switched capacitance | Slowest, especially under load. Area overhead (routing)
IG | Low area and switched capacitance | Unmatched pull-up and pull-down delays. High leakage (772 nA)
IG/LP | Low leakage (337 nA), area and switched capacitance | Almost as slow as LP mode
†Average leakage current for two-input NAND gate (Vdd = 1.0 V)
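
To put the leakage figures in the table on a common footing, the sketch below converts each average leakage current into static power at Vdd = 1.0 V and into a ratio against SG mode. The currents are the ones quoted on the slide; the function names are illustrative only.

```python
VDD = 1.0  # volts, as stated on the slide

# Average leakage current per two-input NAND gate, in nA (1 uA = 1000 nA).
LEAKAGE_NA = {"SG": 1000.0, "LP": 85.0, "IG": 772.0, "IG/LP": 337.0}

def leakage_power_nw(mode, vdd=VDD):
    """Static power in nW: current in nA times supply voltage in V."""
    return LEAKAGE_NA[mode] * vdd

def leakage_ratio_vs_sg(mode):
    """How many times lower the leakage is than in SG mode."""
    return LEAKAGE_NA["SG"] / LEAKAGE_NA[mode]

for mode in LEAKAGE_NA:
    print(f"{mode:6s} {leakage_power_nw(mode):7.1f} nW  "
          f"{leakage_ratio_vs_sg(mode):5.2f}x lower than SG")
```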

FinFET Circuit Power Optimization
• Construct FinFET-based Synopsys technology libraries
• Extend linear programming based cell selection† for FinFETs
• Use optimized netlists to compare logic styles at a range of delay constraints
[Figure: optimization flow. Boxes: 32 nm PTM FinFET models (UFDG, PTM); logic-gate designs; delay/power characterization in SPICE; Synopsys libraries (SG, LP, IG, IG/LP); benchmark; minimum-delay synthesis in Design Compiler; SG-mode netlist; linear programming based cell selection; power-optimized mixed-mode netlists (SG+IG/LP, SG+IG)]
†D. Chinnery and K. Keutzer, “Linear programming for sizing, Vdd and Vt assignment,” in Proc. ISLPED, 2005.
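
The idea behind mixed-mode cell selection can be sketched on a toy problem. This is not the linear programming formulation cited on the slide: it swaps in an exhaustive search over a short chain of identical NAND gates, which is only practical for a handful of gates. The leakage values are taken from the logic-style comparison table; the relative delays are hypothetical placeholders that respect only the qualitative ordering given there (SG fastest, LP slowest).

```python
from itertools import product

# Average leakage in nA per two-input NAND (Vdd = 1.0 V), from the comparison table.
LEAKAGE_NA = {"SG": 1000, "LP": 85, "IG": 772, "IG/LP": 337}
# HYPOTHETICAL delays as a percentage of the SG-mode gate delay.
REL_DELAY_PCT = {"SG": 100, "LP": 160, "IG": 120, "IG/LP": 150}

def select_modes(n_gates, constraint_pct):
    """Pick one mode per gate on a single path so that total leakage is
    minimal while path delay stays within constraint_pct of the all-SG delay.
    Returns (total_leakage_nA, modes)."""
    budget = n_gates * REL_DELAY_PCT["SG"] * constraint_pct / 100
    best = None
    for modes in product(LEAKAGE_NA, repeat=n_gates):
        delay = sum(REL_DELAY_PCT[m] for m in modes)
        if delay > budget:
            continue  # violates the delay constraint
        leakage = sum(LEAKAGE_NA[m] for m in modes)
        if best is None or leakage < best[0]:
            best = (leakage, modes)
    return best

for pct in (100, 110, 120):
    leakage, modes = select_modes(4, pct)
    print(f"{pct}% constraint: {leakage} nA using {sorted(modes)}")
```

Relaxing the constraint lets more gates leave SG mode, which mirrors the trend of larger savings at looser arrival-time constraints reported for the real benchmarks.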

Power Consumption of Optimized Circuits
[Chart: estimated total power consumption for ISCAS’85 benchmarks, by available modes; Vdd = 1.0 V, α = 0.1, 32 nm FinFETs]
Total power savings
• 110% arrival time (a.t.): 34%
• 120% a.t.: 47.5%
Leakage power savings
• 110% a.t.: 68.5%
• 120% a.t.: 80.3%

Dual-Vdd FinFET Circuits
• Conventional low-power principle:
  – 1.0 V Vdd for critical logic, 0.7 V for off-critical paths
• Our proposal: overdriven gates
  – Overdriven FinFET gates leak a lot less!
[Figure: overdriven inverter. Labels: Vin, 1.08 V, 1 V, reverse bias Vgs = +0.08 V, higher Vth, leakage current]

Vth Control with Multiple Vdd’s (TCMS)
• Using only two Vdd’s saves leakage only in P-type FinFETs, but not in N-type FinFETs
• Solution
  – Use a negative ground voltage (VssH) to symmetrically save leakage in N-type FinFETs
Supply voltages: VddH = 1.08 V, VddL = 1.0 V, VssH = -0.08 V, VssL = 0.0 V
Symmetric threshold control for P and N
[Figure: TCMS buffer with VddH, VddL, VssH and VssL rails]
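
The symmetry claimed for TCMS can be checked with simple arithmetic on the four rail voltages listed on the slide. The sketch assumes one reading of the overdriven configuration: a gate supplied from VddL/VssL whose input is driven by a stage supplied from VddH/VssH. The dictionary and function names are illustrative, not taken from the talk.

```python
# Rail voltages in volts, as listed on the slide.
RAILS = {"VddH": 1.08, "VddL": 1.0, "VssH": -0.08, "VssL": 0.0}

def off_state_vgs(rails):
    """Gate-to-source voltage of the OFF transistor in a VddL/VssL gate
    whose input swings between VssH and VddH (assumed configuration)."""
    # P-type is off when the input is high: gate at VddH, source at VddL.
    p_vgs = rails["VddH"] - rails["VddL"]
    # N-type is off when the input is low: gate at VssH, source at VssL.
    n_vgs = rails["VssH"] - rails["VssL"]
    return p_vgs, n_vgs

p_vgs, n_vgs = off_state_vgs(RAILS)
print(f"P-type off-state Vgs: {p_vgs:+.2f} V")
print(f"N-type off-state Vgs: {n_vgs:+.2f} V")
```

With only VddH and VddL available, the N-type input low level would sit at the same 0 V as its source, so only the P-type device would see a reverse bias; adding VssH gives the N-type device a bias of equal magnitude.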

Exploratory Buffer Design
[Figure: buffered interconnect. Labels: VHdd, VLdd, VHss, VLss, S1, S2, i, i’, lopt]
• Size of high-Vdd inverters kept small to minimize leakage in them
• Wire capacitances not driven by high-Vdd inverters
• Output inverter in each buffer overdriven and its size (and switched capacitance) can be reduced

Power Savings
Power component | Savings
Dynamic power | -29.8%
Leakage power | 57.9%
Total power | 50.4%
[Chart: power (μW) for benchmarks p1, r1, p2, r3, r4, r5; series: total, leakage and dynamic power for dual Vdd and for TCMS]
• Benchmarks are nets extracted from real layouts and scaled to 32 nm
http://dropzone.tamu.edu/~zhouli/GSRC/fast_buffer_insertion.html

Fin-count Savings
[Chart: number of fins (0 to 700000) for benchmarks p1, r1, p2, r3, r4, r5 and their average; series: dual Vdd, TCMS]
• Transistor area is measured as the total number of fins required by all buffers
• TCMS can save 9% in transistor area

TCMS Extension
• Delay-minimized netlist: Power: 283.6 μW, Area: 538 fins
• Power-optimized netlist: Power: 149.9 μW, Area: 216 fins
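
The two netlists on this slide can be compared directly. The sketch below computes the relative reduction in power and fin count from the four numbers given; the helper name is illustrative.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Values from the slide: delay-minimized vs power-optimized netlist.
power = percent_reduction(283.6, 149.9)   # power in uW
fins = percent_reduction(538, 216)        # area in fins
print(f"power reduction: {power:.1f}%  fin-count reduction: {fins:.1f}%")
```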

Power Reduction (ISCAS’85 Benchmarks)

Power-minimized vs Delay-minimized Netlists at 130% ATC
[Column headers as extracted, partly lost: TCMS (Single-Vth Dual-Vdd]
% reduction in dynamic power | 53.3 | 49.8 | 51.4
% reduction in leakage power | 95.8 | 95.7 | 95.8
% reduction in total power | 67.6 | 65.3 | 66.3
% reduction in Fin-count | 65.2 | 59.5 | 61.6

Orion-FinFET
• Extends ORION for FinFET-based power simulation for interconnection networks
• FinFET power libraries for various temperatures and technology nodes
• Power breakdown of interconnection networks for different FinFET modes
• Power comparison for different FinFET modes under different traffic patterns

Router Microarchitecture & Pipeline Stages

Power Simulation Flow

Power Breakdown for SG/LP Modes
• 4x4 mesh network: 5 ports/router, 48-flit buffer/port
• Flit width = 128 bits
• Clock frequency = 1 GHz
[Charts: router power breakdown; network power breakdown]
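
The network configuration on this slide fixes the amount of buffer storage being modeled, which can be worked out directly. The sketch assumes all 16 routers of the 4x4 mesh have the full 5 ports stated on the slide; the constant and function names are illustrative.

```python
# Configuration values from the slide.
MESH_DIM = 4             # 4x4 mesh
PORTS_PER_ROUTER = 5
FLITS_PER_PORT = 48      # buffer depth per port
FLIT_WIDTH_BITS = 128

def buffer_bits_per_router():
    """Total input-buffer storage in one router, in bits."""
    return PORTS_PER_ROUTER * FLITS_PER_PORT * FLIT_WIDTH_BITS

def buffer_bits_network():
    """Total buffer storage across all routers of the mesh, in bits."""
    return MESH_DIM * MESH_DIM * buffer_bits_per_router()

print(f"per router: {buffer_bits_per_router()} bits")
print(f"network:    {buffer_bits_network()} bits "
      f"({buffer_bits_network() // 8 // 1024} KiB)")
```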

Bulk CMOS vs. LP-mode FinFETs
• Bulk CMOS simulation: 32 nm predictive technology model
• Leakage power of the bulk CMOS network is 2.68X that of an LP-mode FinFET network

Router Leakage Power vs. Temp.
• Leakage power of SG-mode router grows much faster with temp. than for LP-mode
• Leakage power ratio at 105 °C: 7:1

FinFET SRAM and Embedded DRAM Design
• FinE: Two-tier FinFET simulation framework for FinFET circuit design space exploration:
  – Sentaurus TCAD+UFDG SPICE model
  – Quasi Monte-Carlo simulation for process variation analysis
  – Thermal analysis using ThermalScope
  – Yield estimation
• Variation-tolerant ultra low-leakage FinFET SRAMs at lower technology nodes
• Gated-diode FinFET embedded DRAMs
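
The slide names quasi Monte-Carlo sampling and yield estimation without giving details, so the following is only a generic illustration of how the two fit together, not the FinE framework itself. It draws low-discrepancy points from a 2-D Halton sequence, maps them to two normally distributed process parameters, and counts the fraction of samples that satisfy a pass criterion. The parameter names, the linear sensitivity model and the pass limit are all hypothetical.

```python
from statistics import NormalDist

def radical_inverse(i, base):
    """i-th point (i >= 1) of the van der Corput sequence in the given base."""
    x, scale = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        x += digit * scale
        scale /= base
    return x

def halton_2d(n):
    """First n points of the 2-D Halton sequence (bases 2 and 3)."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3))
            for i in range(1, n + 1)]

def estimate_yield(n=1024, sigma=1.0, limit=2.0 * 2 ** 0.5):
    """Fraction of quasi-random samples whose toy metric stays within +/-limit."""
    normal = NormalDist(0.0, sigma)
    passed = 0
    for u, v in halton_2d(n):
        d_fin = normal.inv_cdf(u)    # HYPOTHETICAL fin-thickness deviation
        d_gate = normal.inv_cdf(v)   # HYPOTHETICAL gate-length deviation
        metric = d_fin + d_gate      # HYPOTHETICAL linear sensitivity model
        if abs(metric) <= limit:
            passed += 1
    return passed / n

print(f"estimated yield: {estimate_yield():.4f}")
```

For this toy model the metric is itself normally distributed, so the estimate can be sanity-checked against the two-sigma probability of about 0.9545; a real flow would replace the metric with a circuit simulation.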

Extension of CACTI for FinFETs
• Selection of any of the FinFET SRAM and embedded DRAM cells
• Use of any of the FinFET operating modes
• Scaling of FinFET designs from 32 nm to 22 nm, 16 nm and 10 nm technology nodes
• Accurately modeling the behavior of a wide range of cache configurations

FPGA vs. ASICs
[Figure: NATURE architecture. Labels: CMOS fabrication compatible, run-time reconfiguration, nano RAM on-chip storage, temporal logic folding, design flexibility, logic density]
• Distributed non-volatile nano RAMs: main storage for reconfiguration bits
• Fine-grain reconfiguration (even cycle-by-cycle) and logic folding
  Ø More than an order of magnitude improvement in logic density and area-delay product
  Ø Competitive performance and moderate power consumption
  Ø Non-volatility: useful in low power & secure processing
• NanoMap to map application to NATURE
  Ø Significant area-delay trade-off flexibility

Conclusions
• FinFETs are a necessary semiconductor evolution step because of bulk CMOS scaling problems beyond 32 nm
• Use of the FinFET back gate leads to very interesting design opportunities
• Rich diversity of design styles, made possible by independent control of FinFET gates, can be used effectively to reduce total active power consumption
• TCMS is able to reduce both delay and subthreshold leakage current in a logic circuit simultaneously
• Time has arrived to start exploring the architectural trade-offs made possible by the switch to FinFETs