ECE 667 Synthesis and Verification of Digital Systems
- Slides: 13
Technology Mapping for FPGAs
D. Chen, J. Cong, "DAOmap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs," ICCAD 2004
FPGA Mapping (LUT-based)
• How it differs from ASIC (standard-cell) mapping
  – Structural in nature, simpler
  – Any function with k inputs can be mapped into a k-LUT
  – Typically implemented by cut mapping
• FPGA architecture: k-LUT
  – Example: F = x1'x2' + x1x2 on a 2-input LUT
  – Each truth-table row is held in one programming bit P (0/1); x1 and x2 select which bit drives F

  x1 x2 | F
  0  0  | 1
  0  1  | 0
  1  0  | 0
  1  1  | 1

ECE 667 Synthesis & Verification - FPGA Mapping
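
The claim that any k-input function fits in a k-LUT can be illustrated with a minimal sketch that treats the LUT as a table of 2^k programming bits indexed by the inputs. The helper names `program_lut` and `lut_eval` are hypothetical and not taken from the slides.

```python
from itertools import product

def program_lut(func, k):
    """Return the 2**k programming bits of a k-input Boolean function.

    Bit order follows the truth-table rows 00..0 up to 11..1.
    """
    return [int(bool(func(*row))) for row in product((0, 1), repeat=k)]

def lut_eval(bits, inputs):
    """Evaluate a programmed LUT: the inputs select one programming bit."""
    index = 0
    for value in inputs:            # first input is the most significant bit
        index = (index << 1) | value
    return bits[index]

# F = x1'x2' + x1x2 from the slide
xnor_bits = program_lut(lambda x1, x2: (not x1 and not x2) or (x1 and x2), 2)
```

Here `xnor_bits` holds the F column of the truth table, and `lut_eval(xnor_bits, (x1, x2))` reads it back for any input pair.
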
FPGA Mapping - example
• A possible mapping onto 3-LUTs: each block has at most 3 inputs
[Figure: network with nodes a, b, c, d, e, f, g, h partitioned into 3-LUT blocks]
Definitions
• DAG: Boolean network
• Cone Cv: sub-network rooted on node v
• K-feasible cone: |input(Cv)| ≤ K
• Fanin cone Fv: the largest Cv
• k-feasible cut: a k-feasible Cv (identified by its set of inputs)
• Unit delay model: each LUT contributes one unit delay
• Cut rooted on node C: cut with output C
[Figure: fanin cone Fv from the PIs to node v over nodes a, b, c, d, e, with a 3-feasible cone Cv highlighted and a delay of 2]
Problem Formulation
• Delay-optimal Area Optimization problem
  – Given: a Boolean network; an integer k (LUT size)
  – Goal: cover the network with k-feasible cones (k-LUTs), such that
    • Mapping depth (delay) is minimum
    • Area (number of LUTs) is minimized
• Area minimization is an NP-hard problem
• A two-step process
  – Cut enumeration + evaluation (delay, area)
  – Cut selection to minimize delay
  – Possible iteration to remap nodes on non-critical paths (area recovery)
  – Takes node duplication into consideration
Cut Enumeration
• Process nodes in topological order from PIs to POs
• Combine sub-cuts of the fanin nodes to create a new cut
• If the size of the cut exceeds k (LUT size), discard the cut
[Figure: network over nodes a, b, c, d, w, x, y, z showing two sub-cuts of the fanin nodes merged into a new cut]
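
The three steps above can be sketched as follows. This is a minimal illustration, assuming the network is given as a dict from each node to its fanin list in topological order; the function name `enumerate_cuts` and the example network are hypothetical, and no pruning beyond the size check is attempted.

```python
from itertools import product

def enumerate_cuts(fanins, k):
    """Enumerate the k-feasible cuts of every node.

    fanins: dict node -> list of fanin nodes, listed in topological
            order from PIs to POs (PIs have an empty fanin list).
    Returns dict node -> set of cuts, each cut a frozenset of nodes.
    """
    cuts = {}
    for node, ins in fanins.items():
        node_cuts = {frozenset([node])}              # trivial cut: the node itself
        if ins:
            # pick one sub-cut per fanin and merge them into a candidate cut
            for combo in product(*(cuts[i] for i in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:                 # discard cuts larger than the LUT
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts

# Assumed example: z = g(x, y), x = f1(a, b), y = f2(c, d)
network = {"a": [], "b": [], "c": [], "d": [],
           "x": ["a", "b"], "y": ["c", "d"], "z": ["x", "y"]}
cuts = enumerate_cuts(network, k=3)
```

With k = 3, node z keeps the cuts {x, y}, {x, c, d}, and {a, b, y} plus its trivial cut; the 4-input cut {a, b, c, d} is discarded.
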
Delay Propagation (k = 3)
• Delay is computed using dynamic programming
• The longest best delay on the POs is the optimal mapping delay
[Figure: example network over nodes a to g and w to z, annotated with the delay of each candidate cut and the optimal delay at each node]
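
A minimal sketch of this dynamic program under the unit delay model, assuming the non-trivial cuts of each node are already available and listed in topological order. The name `label_delays` and the small example are hypothetical.

```python
def label_delays(cuts, primary_inputs):
    """Compute the best achievable delay of every node.

    cuts: dict node -> list of non-trivial cuts (sets of nodes),
          with nodes listed in topological order.
    Returns (delay, best_cut): best delay per node and one cut achieving it.
    """
    delay = {pi: 0 for pi in primary_inputs}         # PIs arrive at time 0
    best_cut = {}
    for node, node_cuts in cuts.items():
        # the delay through a cut is one LUT plus its latest-arriving input
        best = min(node_cuts, key=lambda c: max(delay[i] for i in c))
        best_cut[node] = best
        delay[node] = 1 + max(delay[i] for i in best)
    return delay, best_cut

# Assumed example with k = 3: x = f(a, b), w = g(x, c)
example_cuts = {
    "x": [{"a", "b"}],
    "w": [{"x", "c"}, {"a", "b", "c"}],
}
delay, best_cut = label_delays(example_cuts, ["a", "b", "c"])
mapping_delay = max(delay[po] for po in ["w"])       # longest best delay over the POs
```

In this example the cut {x, c} would give w a delay of 2, while the cut {a, b, c} absorbs x into the same LUT and gives the optimal delay of 1.
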
Area Estimation
• Estimates the area of a cut C while accounting for the fanout effect:
  A_C = \sum_{i \in input(C)} [A_i / f(i)] + U_C
  – A_i: estimated area of the fanin cone of signal i
  – f(i): fanout number of input i
  – U_C: area of the cut itself
• Can underestimate area due to node duplication
[Figure: cuts Ct, Cu, and C over nodes m to u; f(p) = 2, with shared-input terms such as A_s / 2]
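
The formula can be evaluated directly. The sketch below assumes the per-signal areas and fanout counts are already known; `cut_area` and the numbers in the example are hypothetical.

```python
def cut_area(cut, area, fanout, unit_area=1.0):
    """Estimated area A_C of a cut: shared fanin areas plus the cut's own LUT.

    cut:       iterable of input signals of the cut
    area:      dict signal -> estimated area A_i of its fanin cone
    fanout:    dict signal -> fanout number f(i)
    unit_area: area U_C of the cut itself (one LUT by assumption)
    """
    return sum(area[i] / fanout[i] for i in cut) + unit_area

# Assumed example: p feeds two nodes, so only half of its area is charged here
area = {"p": 2.0, "q": 1.0}
fanout = {"p": 2, "q": 1}
a_c = cut_area({"p", "q"}, area, fanout)             # 2/2 + 1/1 + 1
```

Dividing by the fanout spreads the cost of a shared signal across its consumers, which is why the estimate can come out too low when the final mapping duplicates that logic instead of sharing it.
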
Duplication Cost Adjustment
• Considers potential node duplications
• Checks the sub-cuts for multiple fanouts
• Area is adjusted by adding a duplication cost
• Parameters of the duplication cost:
  – N_Cf: number of nodes contained by subcut Cf
  – I_C: cutsize of C
  – f_i: fanout number of the subcut
[Figure: nodes m, n, o, p, q, r, s with subcuts Cf1 and Cf2 merged into a new cut C; I_C = 4, N_Cf2 = 1, and a node with multiple fanouts]
Cost (Area) Function of a Cut
Some key parameters
• I_C: cutsize of C
• N_C: number of nodes covered by C
• f(v): fanout number of the root node v
• P_f: duplication cost
[Figure: cuts C1, C2, C3 over nodes a, b, c, d, e rooted at v, with fanin 1 and fanin 2 marked]
Cut Selection
• Once cuts are generated, traverse the network from POs to PIs and select cuts that map into LUTs
• Select cuts such that timing is met and the area is minimized
• Iterative cut selection procedure
  – Local cost adjustment
    • Input sharing
    • Slack distribution
    • Cut probing
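
The PO-to-PI traversal can be sketched as below. This is only the covering step, assuming one cut has already been chosen per node; it does not model the local cost adjustments (input sharing, slack distribution, cut probing), and the name `select_cuts` and the example network are hypothetical.

```python
def select_cuts(primary_outputs, primary_inputs, chosen_cut):
    """Walk from POs toward PIs and collect the LUTs of the mapping.

    chosen_cut: dict node -> the cut (set of input nodes) picked for it.
    Returns dict LUT root -> its input cut. Only nodes that are actually
    needed as LUT inputs (or are POs) become LUT roots.
    """
    mapped = {}
    stack = list(primary_outputs)
    while stack:
        node = stack.pop()
        if node in primary_inputs or node in mapped:
            continue                                 # PIs need no LUT; skip repeats
        mapped[node] = chosen_cut[node]
        stack.extend(chosen_cut[node])               # the cut's inputs must be mapped too
    return mapped

# Assumed example: z is the only PO and uses the cut {x, c, d}
chosen = {"x": {"a", "b"}, "y": {"c", "d"}, "z": {"x", "c", "d"}}
luts = select_cuts(["z"], {"a", "b", "c", "d"}, chosen)
```

Node y is absorbed into the LUT rooted at z, so it never becomes a LUT root and the mapping uses two LUTs (z and x).
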
Local Cost Adjustment – Slack Distribution
• Slack_C = Req_v - 1 - \max_{i \in input(C)} (Arr_i)
  – Req_v: required time of the root v
  – \max_{i \in input(C)} (Arr_i): largest arrival time among the inputs of C
• If Slack_C < 0, C is not a timing-feasible cut
• The larger Slack_C is, the better C is in terms of the slack distribution effect
[Figure: cut C rooted at node d over nodes a, b, c, w, x, y, z; Req_d is the required time of the root]
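
The slack formula translates into a one-line check. The sketch assumes arrival times and the root's required time are known; `cut_slack` and the numbers are hypothetical.

```python
def cut_slack(required_root, arrival, cut):
    """Slack_C = Req_v - 1 - max arrival time over the inputs of the cut."""
    return required_root - 1 - max(arrival[i] for i in cut)

# Assumed example: the root must be ready at time 3
arrival = {"a": 0, "b": 1, "w": 3}
slack_ok = cut_slack(3, arrival, {"a", "b"})         # 3 - 1 - 1
slack_bad = cut_slack(3, arrival, {"a", "w"})        # 3 - 1 - 3
timing_feasible = slack_ok >= 0
```

The cut {a, b} leaves one unit of slack, while the cut {a, w} has negative slack and is therefore not timing-feasible.
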
Algorithm Recap
• Generation of k-feasible cuts
• Area propagation under timing constraints
  – The optimal area at a node is the minimum area among the cuts that give minimum delay
• A more accurate cost function for a cut
• Global duplication cost adjustment
• Cut selection involving local cost adjustment