Performance- and Energy-Aware Optimization of BEOL Interconnect Stack Geometry in Advanced Nodes
Kwangsoo Han, Andrew B. Kahng, Hyein Lee and Lutong Wang
UCSD CSE and ECE Departments
abk@ucsd.edu
http://vlsicad.ucsd.edu

Outline
• Motivation and Background
• Path-/Stage-Based Analysis
• Block-Level Validation
• Potential Benefits of Design-Aware Manufacturing (DAM) and Manufacturing-Aware Design (MAD) Methodologies
• Conclusion and Future Work

Motivation and Background
• Distinct interconnects
  • High-performance CPU: wide, deep wires
  • High-density SoC: narrow, thin wires
• Scaling slows down
  • High resistance at advanced nodes
  • Slow improvement in low-k material
BEOL geometry is (?) a key lever to achieve better PPA.
[1] C.-H. Jan, B. Uddalak, R. Brain, S.-J. Choi, G. Curello, G. Gupta and W. Hafez, "A 22 nm SoC Platform Technology Featuring 3-D Tri-Gate and High-k/Metal Gate, Optimized for Ultra Low Power, High Performance and High Density SoC Applications," Proc. IEDM, 2012, pp. 311-314.

Motivation and Background
[Diagram: FEOL transistor modeling feeds device-level, single-stage characterization into a cell timing & power library with flexible, multi-flavor cells (e.g., sizing, buffering, VT swapping). BEOL RC modeling provides one fixed stack option. Physical design (PD) then works at block level, on multistage paths, with the given (fixed) BEOL stack.]
• Q1: Does single-stage (AR, DC) analysis relate to a block-level design?
• Q2: How to find the optimal block-level (AR, DC)?
• Q3: Towards N7/N5, are there potential benefits of new MAD, DAM methodologies?

MAD and DAM Methodologies
• Manufacturing-aware design (MAD)
  • Tune (AR, DC) for BEOL P during P&R
• Design-aware manufacturing (DAM)
  • Tune (AR, DC) for BEOL R in manufacturing
• Design = placement and routing (P&R)
• Manufacturing = QoR evaluation using BEOL R
[Diagram: BEOL P → Design (P&R) → post-route layout; BEOL R → Manufacturing (QoR evaluation)]

Previous Works
• Shah et al. [2]
  • SPICE simulation of a single-stage, single-size inverter
  • Optimal (AR, DC) predicted using ITRS projection
• Ciofi et al. [3]
  • Physical modeling of RC for various wire geometries
  • Optimal power/delay depends on driver and wirelength
Our work:
• Single-stage SPICE simulation and P&R validation
• Block-level analysis
• Study of potential benefits from design-aware manufacturing (DAM) and manufacturing-aware design (MAD) methodologies

Single-Stage Sensitivity Study
• Sweep parameters
  Driving strength   X1, X4, X8, X16
  Wirelength         5 µm, 10 µm, 15 µm, 20 µm
  Output load        2 fF, 3 fF, 5 fF, 10 fF
  Input slew         50 ps, 100 ps
Fig. 4. Circuit structure for SPICE simulation.
*: AR is defined as height/half_pitch for consistency across multiple DCs.

Sensitivity to Driver Strength
[Figure panels: (a) BUF_X1 (b) BUF_X4 (c) BUF_X8 (d) BUF_X16, each marked with the delay-optimal power direction]
• Power = CV², lower (AR, DC) always preferred
• Delay-optimal power: lower-left → upper-right

Other Sensitivity Studies
• Sensitivity to wirelength/output load
  • Trade away C for improved R: delay-optimal power can afford high DC given larger wirelength/load
• Insensitive to input slew
• Well-correlated with single-stage analysis in P&R tool
  • {DC} x {AR} x {wlen} x {load} x {driver} = 1008 points
  • Correlation: 99.26%

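The driver-strength and wirelength/load trends above can be illustrated with a first-order Elmore delay model. This is only a sketch and not the SPICE setup used in the study: the slides do not expand "DC", so it is taken here to mean duty cycle (width/pitch), and the resistivity, permittivity, pitch, parallel-plate capacitance model and driver resistances are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the slides.
RHO = 2.0e-8            # ohm*m, assumed effective wire resistivity
EPS = 2.5 * 8.854e-12   # F/m, assumed low-k dielectric permittivity
PITCH = 40e-9           # m, assumed wire pitch


@dataclass
class Wire:
    ar: float      # aspect ratio = height / half_pitch (definition from the slides)
    dc: float      # assumed to be duty cycle = width / pitch
    length: float  # m


def wire_rc(w: Wire):
    """Return (R, C) of the wire using a crude parallel-plate model."""
    half_pitch = PITCH / 2
    width = w.dc * PITCH
    space = PITCH - width
    height = w.ar * half_pitch
    r = RHO * w.length / (width * height)
    c_lateral = 2 * EPS * height * w.length / space        # two side neighbours
    c_vertical = 2 * EPS * width * w.length / half_pitch   # planes above and below
    return r, c_lateral + c_vertical


def elmore_delay(r_drv: float, w: Wire, c_load: float) -> float:
    """50% delay estimate of a driver, one wire segment and a load."""
    r, c = wire_rc(w)
    return 0.69 * (r_drv * (c + c_load) + r * (c / 2 + c_load))


small = Wire(ar=1.5, dc=0.5, length=20e-6)
large = Wire(ar=2.25, dc=0.7, length=20e-6)

# Hypothetical weak driver (20 kOhm) versus strong driver (100 Ohm), 2 fF load.
weak_prefers_small = elmore_delay(20e3, small, 2e-15) < elmore_delay(20e3, large, 2e-15)
strong_prefers_large = elmore_delay(100.0, large, 2e-15) < elmore_delay(100.0, small, 2e-15)
print(weak_prefers_small, strong_prefers_large)
```

Under these assumptions, a weak driver sees mostly the wire capacitance and favors lower (AR, DC), while a strong driver on a long wire is limited by wire resistance and favors higher (AR, DC), which mirrors the "trade away C for improved R" observation.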
Outline
• Motivation and Background
• Path-/Stage-Based Analysis
• Block-Level Validation
  • Q1: Does single-stage (AR, DC) analysis relate to a block-level design?
  • Q2: How to find the optimal block-level (AR, DC)?
• Potential Benefits of Design-Aware Manufacturing (DAM) and Manufacturing-Aware Design (MAD) Methodologies
• Conclusion and Future Work

Block Level Validation (1)
• Q1: Does single-stage analysis relate to a block-level design?
  • Tool's noise
  • Different kinds of stages in a block-level design
• Our strategy
  • Denoising: same P&R layout, apply different (AR, DC)
  • Group cells based on driver strength
  • Use only one group for entire P&R
    • X1: (ideal) low power (LP)
    • X4: (ideal) high performance (HP)
• Experimental setup:
  • Design# = {AES, LDPC}
  • Clock period* (LP/HP)
    • AES: 0.7/0.5 ns
    • LDPC: 1.3/0.8 ns
  • BEOL = {two 1X, two 1.5X, four 2.5X layers}
[Flow: Netlist/Liberty/LEF + default BEOL → P&R → layout; layout + new BEOL (AR, DC) → PEX/STA → results]
* = fastest achievable
# OpenCores, https://opencores.org/

Block Level Validation (1): LDPC, 1X layer
• Q1: Does single-stage (AR, DC) analysis relate to a block-level design?
• A: Yes.
[Figure panels, each marked with the delay-optimal power direction: (a) X1 cell sensitivity study. (b) X4 cell sensitivity study. (c) Real design delay (TNS, LDPC, X1 cell). (d) Real design delay (TNS, LDPC, X4 cell).]

Block Level Validation (2)
• Q2: How to find the optimal block-level (AR, DC)?
• A: A possible way is to find the wirelength distribution per driver type and per layer type.
• Wirelength driven by X4 cells is 2–18X longer than that driven by X1 cells
Table I. Wirelength distribution per layer type (normalized), grouped by driver cells.
  Layer      1X            1.5X          2.5X
  Driver     X1     X4     X1     X4     X1     X4
  AES        15%    41%    6%     31%    2%     5%
  LDPC       11%    21%    3%     28%    2%     36%
• TNS contour maps follow X4 delay contours
Fig. 13. Contour maps of TNS when varying (AR, DC) for (a) 1X, (b) 1.5X and (c) 2.5X layers, respectively.

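The "2–18X" statement can be checked directly against Table I. The short sketch below assumes the flattened table reads, per design, as (X1, X4) pairs for the 1X, 1.5X and 2.5X layer types in that order; the dictionary layout and function name are illustrative only.

```python
# Table I shares in percent, transcribed under the assumed column order
# 1X (X1, X4), 1.5X (X1, X4), 2.5X (X1, X4).
WIRELENGTH_SHARE = {
    "AES":  {"1X": (15, 41), "1.5X": (6, 31), "2.5X": (2, 5)},
    "LDPC": {"1X": (11, 21), "1.5X": (3, 28), "2.5X": (2, 36)},
}


def x4_over_x1(table):
    """Ratio of X4-driven to X1-driven wirelength per (design, layer type)."""
    return {
        (design, layer): x4 / x1
        for design, layers in table.items()
        for layer, (x1, x4) in layers.items()
    }


ratios = x4_over_x1(WIRELENGTH_SHARE)
for key, value in sorted(ratios.items()):
    print(key, f"{value:.1f}X")
print(f"range: {min(ratios.values()):.1f}X to {max(ratios.values()):.1f}X")
```

With this reading of the table, the smallest ratio comes from LDPC on the 1X layers and the largest from LDPC on the 2.5X layers, which brackets the range quoted on the slide.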
Outline
• Motivation and Background
• Path-/Stage-Based Analysis
• Block-Level Validation
• Potential Benefits of Design-Aware Manufacturing (DAM) and Manufacturing-Aware Design (MAD) Methodologies
  • Q3: Towards N7/N5, are there potential benefits of new DAM/MAD methodologies?
• Conclusion and Future Work

Potential Benefits of DAM & MAD Methodologies
• Design-aware manufacturing (DAM)
  • Tune (AR, DC) in manufacturing according to the characteristics of each design
• Manufacturing-aware design (MAD)
  • Optimize (AR, DC) during physical implementation
[Diagram, DAM flow: default (AR, DC) → P&R → layout → design-specific property → optimized (AR, DC) → manufacturing]
[Diagram, MAD flow: optimized (AR, DC) → P&R → layout → manufacturing]

DAM & MAD Study: Experimental Setup
• #Stacks = 3 x 3 x 3 x 4 = 108, each w/ unique number
• BEOL P{stack}: stack used for P&R
• BEOL R{stack}: stack used for PEX/STA
• Layer type: two 1X, two 1.5X and four 2.5X layers
• DC = {0.5, 0.6, 0.7} for each layer type
• AR = {1.5, 1.75, 2, 2.25} uniform for all layers
• Design: LDPC (HP = 0.8 ns), both X1 and X4 cells
[Diagram: BEOL P → Design (P&R) → post-route layout; BEOL R → Manufacturing (QoR evaluation)]

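The stack count follows from three independent DC choices (one per layer type) times one uniform AR choice. A minimal enumeration is sketched below; the numbering order (AR varying fastest, then the 2.5X, 1.5X and 1X DCs) is an assumption made for illustration, since the setup only states that each stack has a unique number.

```python
from itertools import product

DC_VALUES = (0.5, 0.6, 0.7)          # per layer type, from the setup
AR_VALUES = (1.5, 1.75, 2.0, 2.25)   # uniform across all layers, from the setup


def enumerate_stacks():
    """Map a 1-based stack index to its (AR, per-layer-type DC) setting.

    The index order is an assumed convention, not taken from the slides.
    """
    stacks = {}
    combos = product(DC_VALUES, DC_VALUES, DC_VALUES, AR_VALUES)
    for index, (dc_1x, dc_15x, dc_25x, ar) in enumerate(combos, start=1):
        stacks[index] = {
            "AR": ar,
            "DC": {"1X": dc_1x, "1.5X": dc_15x, "2.5X": dc_25x},
        }
    return stacks


stacks = enumerate_stacks()
print(len(stacks))
print(stacks[1])
print(stacks[108])
```

Each P{stack} and R{stack} pair then corresponds to one P&R run evaluated with one extraction stack, so a full sweep covers 108 x 108 combinations.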
DAM & MAD Study: Experimental Results
• X-axis: P{stack}
• Y-axis:
  • For a given P{stack}, TNS range using all R{stack}s (blue)
  • For a given P{stack}, TNS using R{default} (red)
• DAM+MAD = 60% difference in TNS
[Figure: TNS (ns) versus P&R stack, showing TNS with the default stack and the MAX over all R{stack}s; annotated differences: 2.59 ns (40%) and 3.87 ns (49%).]

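One way to read this experiment is as a search over a table of TNS values indexed by (P{stack}, R{stack}). The sketch below shows the three selections involved; the function names and the tiny TNS table are hypothetical and exist only to make the selection logic concrete. They are not results from the study.

```python
from typing import Dict, Tuple

# (P stack, R stack) -> TNS in ns; TNS is <= 0 and values closer to 0 are better.
TnsTable = Dict[Tuple[str, str], float]


def mad_choice(tns: TnsTable, r_fixed: str) -> str:
    """MAD: manufacturing stack is fixed; pick the P&R stack with the best TNS."""
    candidates = {p: v for (p, r), v in tns.items() if r == r_fixed}
    return max(candidates, key=candidates.get)


def dam_choice(tns: TnsTable, p_fixed: str) -> str:
    """DAM: layout (P&R stack) is fixed; pick the manufacturing stack with the best TNS."""
    candidates = {r: v for (p, r), v in tns.items() if p == p_fixed}
    return max(candidates, key=candidates.get)


def joint_choice(tns: TnsTable) -> Tuple[str, str]:
    """DAM+MAD: pick the best (P stack, R stack) pair overall."""
    return max(tns, key=tns.get)


# Hypothetical toy values for illustration only.
toy_tns = {
    ("P1", "R1"): -6.0, ("P1", "R2"): -4.0,
    ("P2", "R1"): -5.0, ("P2", "R2"): -4.5,
}
print(mad_choice(toy_tns, "R1"))   # best P&R stack when R1 is manufactured
print(dam_choice(toy_tns, "P1"))   # best manufacturing stack for layout P1
print(joint_choice(toy_tns))       # best pair when both can be tuned
```

In the plot, the per-P{stack} range corresponds to what `dam_choice` can recover for a fixed layout, while comparing P{stack}s at R{default} corresponds to `mad_choice`.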
DAM & MAD Study: Experimental Results
• Y-axis:
  • For a given P{stack}, power using all R{stack}s (orange)
  • For a given P{stack}, power using R{default} (black)
• DAM+MAD = 7% difference in power
• Weak correlation of timing and power
[Figure: power (mW) and TNS (ns) versus P&R stack, showing TNS with default stack, power with default stack, and the MAX/MIN over all R{stack}s.]

DAM & MAD Study: Experimental Results
• Q3: Towards N7/N5, are there potential benefits of new DAM/MAD methodologies?
• A: Possibly, yes.
  • Up to 60%/7% difference in TNS/power
  • One optimal design-specific stack for manufacturing may be preferred regardless of the BEOL stack assumed during P&R
[Figure: power (mW) versus P&R stack, showing power with default stack, the MIN over all R{stack}s, and the power min-max range; annotation: design-specific BEOL preference.]

Conclusion and Future Work
• Single-stage simulation & validation
• Block-level P&R validation
• Potential benefits of DAM and MAD methodologies
• Future work
  • Co-optimization of the front-end with the back-end
  • Airgap-aware BEOL stack optimization

THANK YOU!
Research at UCSD is supported by the IMPACT+ / C-DEN center, Samsung, NXP, Qualcomm, ASML, Mentor Graphics and NSF. We thank Brian Cline of ARM for inviting us to write this paper, and Praveen Raghavan and Peter Debacker of IMEC for providing key enablements used in our study.

BACKUP

Single-Stage SPICE Simulation
• Sensitivity to driver strength
  • X1: smaller AR/DC → better timing/power
    • Driver resistance dominates
  • X4, X8, X16: delay-optimal wire dimension changes
    • Optimal delay prefers thicker rather than wider wires
Fig. 5. Sensitivity of power and delay to driving strength: (a) BUF_X1 (b) BUF_X4 (c) BUF_X8 and (d) BUF_X16

Single-Stage SPICE Simulation
• Sensitivity to wirelength/output load
  • Higher wirelength/load → higher DC
  • Trade C for R

Single-Stage P&R Validation
• Manual routing of 8 nets on 1X layer
• Modified pin access to avoid via impact
• 1008 data points
  • {DC} x {AR} x {wirelength} x {load} x {driver} = 7 x 4 x 3 x 3 x 4
• Power/delay for middle nets
• Correlation: 99.26%

DAM & MAD Study: Experimental Setup
• #Stacks = 3 x 3 x 3 x 4 = 108, each w/ unique number
• P{stack index}: stack used for P&R (design)
• R{stack index}: stack used for PEX/STA (manufacturing)
• Layer type: two 1X, two 1.5X and four 2.5X layers
• DC = {0.5, 0.6, 0.7} for each layer type
• AR = {1.5, 1.75, 2, 2.25} uniform for all layers
• Design:
  • LDPC (HP = 0.8 ns)
  • both X1 and X4 cells
[Table: stack index, AR, and DC for the 1X, 1.5X and 2.5X layer types; surviving entries: 1 1.50 0.5 4 2.25 0.5 8 1.75 0.5 1.75 0.6 25 1.50 0.7 1.50 0.5 55 2.00 0.6 108 2.25 0.7]