Placement and Timing for FPGAs Considering Variations






















- Slides: 23
Placement and Timing for FPGAs Considering Variations
Yan Lin (1), Mike Hutton (2) and Lei He (1)
(1) EE Department, UCLA; (2) Altera Corporation, San Jose
© 2006 Altera Corporation

Outline
- Preliminaries and Motivation
- Timing with Guard-banding/Speed-binning
- Stochastic Placement
- Experimental Results
- Conclusions and Discussions

Background
- Process variations
  - more and more significant in nanometer technology
  - affect timing and power in both ASICs and FPGAs
- Delay with variations
  - Variation sources: threshold voltage (Vth) and effective channel length (Leff)
  - Independent Gaussians for global/local variations
  - First-order canonical form
- Related work
  - FPGA device and architecture evaluation with process variations [Wong et al, ICCAD'05]
  - SSTA [Chang et al, ICCAD'03] [Visweswariah et al, DAC'04]
  - Statistical criticality analysis [Visweswariah et al, DAC'04] [Li et al, ICCAD'05] [Xiong et al, TAU'06]
  - Statistical gate sizing for ASICs [Guthaus et al, ICCAD'05] [Sinha et al, ICCAD'05]

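One common way to write the first-order canonical form mentioned on this slide, in the style of the cited SSTA work, is sketched below. The coefficient names $a_i$ and $b$ are assumptions of this sketch; $\Delta X_i$ and $\Delta R_a$ follow the symbols that appear on the later speed-binning slides.

```latex
d = \mu + \sum_{i} a_i\,\Delta X_i + b\,\Delta R_a,
\qquad \Delta X_i,\ \Delta R_a \sim \mathcal{N}(0,1)\ \text{independent},
\qquad \sigma_d^2 = \sum_{i} a_i^2 + b^2
```

Under this reading, each $\Delta X_i$ is a global source (for example Vth or Leff) shared by every element on a chip, while $\Delta R_a$ is the local term drawn independently per element. When two such delays are added along a path, the $a_i$ add linearly and the local terms combine as a root sum of squares.
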
Motivation
- STA is inaccurate with variation
  - Slack ignores near-criticality
  - Near-critical paths may be statistically timing critical
- Deterministic timing-driven placer (e.g., T-VPlace in VPR)
  - Based on STA
  - Optimizes for the static critical path
  - May not optimize timing with variation
- A stochastic placer is needed with variations
  - Same placement for one application across chips

Pre-routing Interconnect Uncertainty vs. Process Variation in Placement
- Existing timing-driven placer
  - Leverages timing slack in STA
  - Uses estimated interconnect delay
  - May incur uncertainty along with process variation
- Process variation leads to a more significant delay variance at the placement stage
  - Therefore, only process variation is considered for placement

Uniqueness for Timing in FPGAs vs. ASICs
- Similarity
  - Susceptible to process variations
  - Critical paths unknown at test time
- Disadvantages
  - Same timing model to be applied to unknown applications at unknown clock frequency and varied conditions
  - Guard-banded timing model can be arbitrarily conservative or aggressive
- Advantages
  - Long switching paths dampen (average out) local variation
  - Binned for speed-grades to isolate global variation
  - Can be programmed repeatedly and differently during timing chip-test

Timing with Guard-banding
- A guard-band is applied to each individual node to model uncertainty in STA
- A constant guard-banded delay is µ + cσ
  - µ and σ are the nominal delay and standard deviation, respectively
  - c is constant for all circuit elements
- Guard-band cost: (Tgrd/Tnorm) - 1
  - Tgrd: critical path delay in STA with guard-banding
  - Tnorm: critical path delay in STA with the nominal timing model
  - Pessimistic/optimistic for designs with longer/shorter critical paths
  - Actual timing yield is analyzed by SSTA

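The guard-band cost defined on this slide can be illustrated on a toy timing graph. The following is a minimal sketch, not the authors' tool: the four-node graph, its delays, and the function names are all hypothetical, and the longest-path routine is a plain STA stand-in that needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def critical_path_delay(preds, delay):
    """Longest arrival time in a DAG; preds maps node -> list of fan-in nodes."""
    arrival = {}
    for node in TopologicalSorter(preds).static_order():
        fanin = preds.get(node, [])
        arrival[node] = delay[node] + max((arrival[p] for p in fanin), default=0.0)
    return max(arrival.values())

def guard_banded(mu, sigma, c):
    """Per-node guard-banded delay mu + c*sigma, with one constant c for all nodes."""
    return {n: mu[n] + c * sigma[n] for n in mu}

def guard_band_cost(preds, mu, sigma, c):
    """(Tgrd / Tnorm) - 1, as defined on the slide."""
    t_norm = critical_path_delay(preds, mu)
    t_grd = critical_path_delay(preds, guard_banded(mu, sigma, c))
    return t_grd / t_norm - 1.0

# Hypothetical example: a and b both feed c, which feeds d.
PREDS = {"c": ["a", "b"], "d": ["c"]}
MU = {"a": 1.0, "b": 0.9, "c": 2.0, "d": 1.0}
SIGMA = {"a": 0.05, "b": 0.2, "c": 0.1, "d": 0.05}

if __name__ == "__main__":
    print(f"guard-band cost at c=3: {guard_band_cost(PREDS, MU, SIGMA, 3.0):.4f}")
```

In this made-up example the nominal critical path runs through `a`, but with c = 3 the higher-variance node `b` takes over, and the guard-band cost comes to roughly 24%.
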
Timing with Speed-binning
- Test and eliminate local variation by testing multiple similar paths across the test chip
- Model the global-variation Gaussians ΔXi as a single ΔGa
- Speed-binning = categorizing ΔGa
- All chips that fall into the same bin share the same guard-banded timing model
  - e.g., µ-σg / µ+σg / µ+3σg for the fast/medium/slow bin
  - STA gives the circuit delay Tbin for each bin

Yield Analysis with Speed-binning
- Yield loss due to ignored local variation
- Yield loss due to unknown critical paths
- Timing yield analysis for a bin
  - circuit delay: Tµ + σTg·ΔGa + σTl·ΔRa
  - bin k: [Glow(k), Gup(k)]
  - cut-off delay: γ·Tbin(k)
  - timing yield for bin k is [equation missing from this transcript]
- The overall timing yield is [equation missing from this transcript]

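The yield expressions on this slide did not survive extraction, so the sketch below is only one plausible reading of the surviving definitions, not the authors' formula. It assumes ΔGa and ΔRa are independent standard normals, that bin k holds chips with ΔGa in [Glow(k), Gup(k)], that a chip passes when its delay is at most γ·Tbin(k), and that the overall yield weights each bin by its share of chips. All function and parameter names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for both dGa and dRa (assumption)

def bin_yield(t_mu, s_g, s_l, g_low, g_up, t_cut, steps=2000):
    """P(T <= t_cut | g_low <= dGa <= g_up) for T = t_mu + s_g*dGa + s_l*dRa.

    Integrates over dGa with the midpoint rule; s_l must be positive.
    """
    width = (g_up - g_low) / steps
    passed = 0.0
    for i in range(steps):
        g = g_low + (i + 0.5) * width
        # Probability that the local term keeps this chip under the cut-off.
        passed += STD.pdf(g) * STD.cdf((t_cut - t_mu - s_g * g) / s_l) * width
    return passed / (STD.cdf(g_up) - STD.cdf(g_low))

def overall_yield(t_mu, s_g, s_l, bins, gamma=1.0):
    """bins: list of (g_low, g_up, t_bin); each bin is weighted by its chip share."""
    shares = [STD.cdf(up) - STD.cdf(lo) for lo, up, _ in bins]
    acc = 0.0
    for share, (lo, up, t_bin) in zip(shares, bins):
        acc += share * bin_yield(t_mu, s_g, s_l, lo, up, gamma * t_bin)
    return acc / sum(shares)

# Hypothetical numbers: each bin's Tbin is the delay at its slow edge, local term ignored.
T_MU, S_G, S_L = 10.0, 0.3, 0.1
BINS = [(lo, up, T_MU + S_G * up) for lo, up in [(-6.0, 0.0), (0.0, 1.0), (1.0, 4.0)]]

if __name__ == "__main__":
    for gamma in (1.0, 1.05):
        print(gamma, overall_yield(T_MU, S_G, S_L, BINS, gamma))
```

Because each Tbin here ignores the local term, some chips near the slow edge of a bin fail at γ = 1, which matches the slide's point that ignored local variation causes yield loss; relaxing the cut-off with γ > 1 recovers most of it.
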
Timing-Driven Placement
- T-VPlace [Marquardt et al, FPGA 2000]: simulated annealing based placement
- Both wiring and timing are considered in the cost function
  - Wiring cost
  - Timing cost
    - for a connection
    - for a placement solution
  - Overall cost
- STA is performed at each annealing temperature to update critical path delay and slack

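The cost formulas on this slide were images and are not in the transcript. The sketch below follows the formulation published in the cited T-VPlace paper: criticality is 1 - slack/Dmax, a connection's timing cost is its delay times criticality raised to an exponent, and the annealer evaluates a normalized, λ-weighted change in timing and wiring cost. Function and parameter names are this sketch's own, and the bounding-box wiring cost is taken as a given number rather than computed.

```python
def criticality(slack, d_max):
    """Static criticality of a connection: 1 when slack is 0, lower with more slack."""
    return 1.0 - slack / d_max

def timing_cost(connections, d_max, crit_exp):
    """Sum of delay * criticality**crit_exp over (delay, slack) pairs."""
    return sum(delay * criticality(slack, d_max) ** crit_exp
               for delay, slack in connections)

def delta_cost(d_timing, prev_timing, d_wiring, prev_wiring, lam=0.5):
    """Normalized change in overall cost for one annealing move.

    lam trades off timing against wiring; each term is normalized by the cost
    at the last update so the two are comparable.
    """
    return lam * d_timing / prev_timing + (1.0 - lam) * d_wiring / prev_wiring

if __name__ == "__main__":
    conns = [(2.0, 0.0), (1.0, 5.0)]  # hypothetical (delay, slack) pairs
    print(timing_cost(conns, d_max=10.0, crit_exp=2.0))
    print(delta_cost(-0.2, 2.0, 1.0, 10.0, lam=0.5))
```

A move is accepted or rejected by the usual simulated annealing rule applied to `delta_cost`; slacks and `d_max` are refreshed by STA once per temperature, as the last bullet says.
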
Stochastic Placement ST-VPlace
- Main differences between ST-VPlace and T-VPlace
  - Estimate a delay matrix in canonical form instead of just a nominal delay matrix
    - Used in SSTA for statistical timing cost during placement
  - Perform SSTA instead of STA at each temperature in the simulated annealing framework
  - Use statistical criticality instead of static criticality in the cost function
    - Statistical criticality for an edge/node is the probability that this edge/node is statistically timing critical in SSTA
    - Static criticality is based on slack and the longest path delay in STA
  - Statistical criticality exponent θ

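To make the difference between static and statistical criticality concrete, the sketch below estimates statistical criticality for two competing paths. It substitutes plain Monte Carlo sampling for the SSTA-based computation the slide describes, assumes the two path delays are independent Gaussians, and uses made-up numbers; the way θ enters the cost is also an assumption modeled on the T-VPlace timing cost.

```python
import random

def statistical_criticality(paths, samples=20000, seed=0):
    """Fraction of sampled chips on which each path is the slowest.

    paths: list of (mean, sigma) for independent Gaussian path delays (assumption).
    """
    rng = random.Random(seed)
    wins = [0] * len(paths)
    for _ in range(samples):
        delays = [rng.gauss(mu, sigma) for mu, sigma in paths]
        wins[delays.index(max(delays))] += 1
    return [w / samples for w in wins]

def statistical_timing_cost(delay, stat_crit, theta=0.3):
    """Assumed cost form: delay weighted by statistical criticality ** theta."""
    return delay * stat_crit ** theta

if __name__ == "__main__":
    # Path A is the static critical path; path B is near-critical but noisier.
    paths = [(10.0, 0.2), (9.8, 0.5)]
    crit = statistical_criticality(paths)
    print("statistical criticality:", crit)
    print("costs:", [statistical_timing_cost(mu, c) for (mu, _), c in zip(paths, crit)])
```

With these hypothetical numbers, STA sees path B as having positive slack on every chip, yet B turns out to be the slowest path on roughly a third of the sampled chips, which is the near-critical case a purely slack-based placer would underweight.
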
Experimental Settings
- Variation and device setting
  - 10% as 3 sigma for global and local variation in Vth and Leff at the ITRS 65 nm technology node
  - Min-ED device setting: Vdd = 0.9 V, Vth = 0.3 V [Wong et al, ICCAD'05]
- Architecture similar to Altera's Stratix™
  - Island-style FPGA architecture
  - Cluster size 10 and LUT size 4
  - 60% length-4 and 40% length-8 wires in the interconnect
  - 1.2X routing channel width obtained by T-VPlace
- Yield loss in failed parts per 10K parts (pp10K)
- Evaluated using MCNC and QUIP designs

Cost Function Tuning
- Perform ST-VPlace and SSTA to obtain the mean delay and standard deviation over all designs for each statistical criticality exponent θ
- θ = 0.3 leads to the smallest mean and deviation, and hence the highest timing yield

T-VPlace vs. ST-VPlace
- Some correlation between mean delay and deviation
- ST-VPlace achieves
  - smaller mean delay for all designs
  - smaller variance for most designs
  - a higher timing yield

Statistical Criticality vs. Static Criticality
- Statistical criticality vs. static criticality
  - Statistical criticality does not increase monotonically with static criticality
  - Statistical criticality may vary significantly for similar static criticality
- ST-VPlace considers statistical criticality explicitly
  - Optimizes near-critical paths under variations
  - Leads to a higher timing yield

Impact on Path-length Distribution
- Path-length distribution in ST-VPlace is almost on top of that in T-VPlace
- ST-VPlace reduces the top-10% near-critical paths from 1.3% to 0.8%
  - Although it has a larger nominal delay
  - it has a smaller mean and variance, and hence a higher timing yield

Effect of Guard-banding
[Figure: three panels, for variation (3 sigma) of global 5% / local 5%, global 10% / local 10%, and global 20% / local 20%. Each panel plots guard-band cost (0% to 120%) and yield loss in pp10K (0.1 to 10000) for T-VPlace and ST-VPlace against guard-band factor (0 to 4).]
- ST-VPlace obtains a higher timing yield under varied variations and guard-band factors
  - Larger gain with smaller variation
  - Similar gain with varied local variation when no global variation is considered
- Yield loss is reduced by 3.4X with 3-sigma guard-banding under 10%/10% variations

Effect of Speed-binning
- Fast/Medium/Slow = 40%/30%/29.999%
- Discard the slowest 0.001% (0.1 pp10K) chips
- Tbin may be relaxed by γ for a higher timing yield
- Yield loss is due to local variation and unknown critical paths
- ST-VPlace consistently achieves a higher timing yield
- Yield loss is reduced by 25X with γ = 5%

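The bin populations on this slide translate into cut-off values of the global variation term. The sketch below assumes ΔGa is a standard normal and that larger values mean slower chips; neither is stated on the slide, and the function name is hypothetical.

```python
from statistics import NormalDist

def bin_cutoffs(fractions):
    """Upper dGa cut-off of each bin, given bin populations ordered fast to slow.

    The cumulative population must stay strictly below 1, so the discarded
    slowest chips are simply left out of `fractions`.
    """
    std = NormalDist()
    cutoffs, cumulative = [], 0.0
    for fraction in fractions:
        cumulative += fraction
        cutoffs.append(std.inv_cdf(cumulative))
    return cutoffs

if __name__ == "__main__":
    # Fast / medium / slow populations from the slide; the last 0.001% is discarded.
    for name, cut in zip(("fast", "medium", "slow"), bin_cutoffs([0.40, 0.30, 0.29999])):
        print(f"{name}: dGa <= {cut:.3f} sigma")
```

Under these assumptions the fast bin ends slightly below the mean, the medium bin ends about half a sigma above it, and discarding the slowest 0.001% puts the slow bin's upper edge a little past 4 sigma.
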
Conclusions and Discussions
- Conclusions
  - Quantified the effects of guard-banding and speed-binning with variations
  - Developed a novel stochastic placer
  - Evaluated with MCNC and QUIP designs, reduced yield loss by
    - 3.4X with guard-banding
    - 25X with speed-binning
- Ongoing and future work
  - Extend timing models with spatially correlated variations
  - Develop stochastic physical synthesis algorithms, e.g., clustering, routing, re-timing