ABC: FPGA Mapping
ECE 667 Course Presentation
Rohit Thakar
Electrical and Computer Engineering

Introduction: Technology Mapping
§ Technology Mapping:
• Input: A Boolean network
• Output: A netlist of k-LUTs implementing the Boolean network, optimizing some cost function
[Figure: the subject graph (nodes a, b, c, d, e) and the mapped netlist produced by technology mapping. Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”]
Credits: Alan Mishchenko, et al., Prof. Maciej Ciesielski, P Pany

Previous Work on FPGA Mapping
§ FlowMap [J. Cong and Y. Ding '94]
• Delay-optimum mapping onto FPGAs
§ DAOmap [D. Chen and J. Cong '04]
• Technology mapping based on iterative cut enumeration and cut selection, with added area recovery.
§ Drawbacks / scope for improvement:
• Exhaustive cut enumeration is memory and processor intensive.
• Area recovery can be improved.
• The mapper relies on the result of technology-independent synthesis, which may not be optimal.

Presentation Overview
§ Overall LUT Mapping Algorithm (latency optimized)
Input: Structural representation (AIG)
1. Exhaustive enumeration of all k-feasible cuts for each node in the AIG
   1. Enumeration: factor cuts
   2. Filtering: signatures
2. Select the cuts that give the best latency
3. Perform area recovery
   • 2-step heuristic approach
4. Choose the best cover
Output: Mapped netlist
§ Lossless Synthesis:
• Technique to improve the technology mapping results.
Sequential logic synthesis - ABC system © Ciesielski 2010

Cuts & k-Feasible Cuts
§ A cut of a node n is a set of nodes c in its transitive fan-in such that any path from a primary input to n passes through c.
§ A k-feasible cut is a cut whose size is k or less.
§ Example: the set {a, b, c} is a 3-feasible cut of node r.
[Figure: example graph with root r, internal nodes p, q, x, y, z, u and leaves a, b, c. Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”]

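The definition can be checked mechanically. The following is a minimal sketch, assuming a hypothetical three-node graph (p = f(a, b), q = f(b, c), r = f(p, q)) that is not the exact graph drawn on the slide; `FANINS`, `is_cut` and `is_k_feasible_cut` are illustrative names.

```python
from typing import Dict, Set, Tuple

# Hypothetical subject graph: node -> its two fanins.
# Nodes without an entry (a, b, c) are primary inputs.
FANINS: Dict[str, Tuple[str, str]] = {
    "p": ("a", "b"),
    "q": ("b", "c"),
    "r": ("p", "q"),
}


def is_cut(node: str, leaves: Set[str]) -> bool:
    """True if every path from a primary input to `node` passes through `leaves`."""
    if node in leaves:
        return True   # every path through this node is covered by the cut
    if node not in FANINS:
        return False  # reached a primary input without crossing the cut
    return all(is_cut(fanin, leaves) for fanin in FANINS[node])


def is_k_feasible_cut(node: str, leaves: Set[str], k: int) -> bool:
    """A k-feasible cut is a cut with at most k leaves."""
    return len(leaves) <= k and is_cut(node, leaves)
```

In this graph {a, b, c} and {p, q} are both 3-feasible cuts of r, while {a, b} is not a cut because the path c, q, r avoids it. The sketch does not check that the cut is irredundant.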
Why Is a k-Feasible Cut Important?
§ The logic between a node and the nodes in its cut can be replaced by a k-LUT.
[Figure: the logic cone of node r above the cut {a, b, c} is replaced by a single 3-input LUT]

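To make the replacement concrete, the contents of the LUT can be obtained by simulating the cone between the cut and the node for every assignment of the cut leaves. This is a sketch under simplifying assumptions: a hypothetical graph in which every node is a plain AND with no inverted edges (a real AIG also carries complement attributes), and illustrative names `evaluate` and `lut_truth_table`.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical AND-only graph: node -> its two fanins.
FANINS: Dict[str, Tuple[str, str]] = {
    "p": ("a", "b"),
    "q": ("b", "c"),
    "r": ("p", "q"),
}


def evaluate(node: str, values: Dict[str, int]) -> int:
    """Evaluate `node`, given 0/1 values for the leaves of a cut."""
    if node in values:
        return values[node]
    fanin0, fanin1 = FANINS[node]
    return evaluate(fanin0, values) & evaluate(fanin1, values)


def lut_truth_table(root: str, leaves: List[str]) -> List[int]:
    """Truth table of the cone rooted at `root` in terms of the cut `leaves`.

    Entry i corresponds to the i-th assignment in itertools.product order,
    with the first leaf as the most significant bit.
    """
    return [
        evaluate(root, dict(zip(leaves, bits)))
        for bits in product((0, 1), repeat=len(leaves))
    ]
```

For the cut [a, b, c] of r the table has 2^3 = 8 entries, which is exactly what a 3-input LUT stores; for the cut [p, q] it has 4.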
Cut Enumeration
§ Technology mapping process:
1. Get the set of all k-feasible cuts (each can be mapped onto a k-input LUT)
2. Select the ones that offer the best latency and area
§ How to get an exhaustive set of all k-feasible cuts?
• Bottom-up approach
• Top-down approach

Bottom-Up Cut Enumeration
§ The set of cuts of a node is a cross product of the sets of cuts of its children.
§ Cut sets shown in the example graph:
• t: { {t}, {u, v}, {u, r, s}, ..., {a, b, c} }
• u: { {u}, {p, q}, {p, a, b}, {a, c, q}, {a, b, c} }
• v: { {v}, {r, s}, {r, a, c}, {b, c, s}, {a, b, c} }
• p: { {p}, {a, c} }
• q: { {q}, {a, b} }
• a: { {a} }, b: { {b} }, c: { {c} }
§ Computation is done bottom-up (Pan '98, Cong '99).
§ Any cut of size greater than k is discarded.
§ No need to enumerate cuts larger than k to obtain the k-feasible cuts.
(Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”)

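The bottom-up procedure can be sketched in a few lines. This is an illustrative version, assuming a hypothetical two-level graph rather than the one on the slide; `FANINS`, `TOPO_ORDER` and `enumerate_cuts` are made-up names, and dominated cuts are not filtered here.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical subject graph: node -> its two fanins (a, b, c are primary inputs).
FANINS: Dict[str, Tuple[str, str]] = {
    "p": ("a", "b"),
    "q": ("b", "c"),
    "r": ("p", "q"),
}
# Fanins appear before the nodes that use them.
TOPO_ORDER: List[str] = ["a", "b", "c", "p", "q", "r"]


def enumerate_cuts(k: int) -> Dict[str, Set[FrozenSet[str]]]:
    """Compute all k-feasible cuts of every node, from the inputs upward."""
    cuts: Dict[str, Set[FrozenSet[str]]] = {}
    for node in TOPO_ORDER:
        node_cuts = {frozenset([node])}  # the trivial cut {node}
        if node in FANINS:
            fanin0, fanin1 = FANINS[node]
            # Cross product: merge one cut of each child.
            for cut0, cut1 in product(cuts[fanin0], cuts[fanin1]):
                merged = cut0 | cut1
                if len(merged) <= k:  # discard cuts larger than k
                    node_cuts.add(merged)
        cuts[node] = node_cuts
    return cuts
```

With k = 3 node r gets the cuts {r}, {p, q}, {p, b, c}, {a, b, q} and {a, b, c}; with k = 2 only {r} and {p, q} survive. Every merged cut is built from child cuts that are already k-feasible, which is why no cut larger than k ever needs to be stored.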
Top-Down Cut Enumeration
§ A different way of enumerating cuts: start from a cut of the root and expand it.
§ Example: expand cut {u, v} of t. Replace v with its cut {r, s} to get a new cut {u, r, s} of t.
[Figure: node t with fanins u and v, above nodes p, q, r, s. Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”]

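The expansion step amounts to a single set operation. A minimal sketch follows; `expand` is an illustrative name, not a function of any tool.

```python
from typing import FrozenSet


def expand(cut: FrozenSet[str], leaf: str, leaf_cut: FrozenSet[str]) -> FrozenSet[str]:
    """Replace `leaf` in `cut` by one of its own cuts, giving a new cut of the same root."""
    if leaf not in cut:
        raise ValueError(f"{leaf!r} is not a leaf of the cut")
    return (cut - {leaf}) | leaf_cut


# The step described on the slide: replace v in {u, v} by its cut {r, s}.
new_cut = expand(frozenset({"u", "v"}), "v", frozenset({"r", "s"}))
```

Here `new_cut` is the set {u, r, s}. The result can shrink rather than grow when the replacement shares nodes with the remaining leaves.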
Challenges in Exhaustive Cut Enumeration
§ Cuts of size > 7
• Too many possible combinations, sometimes more than 1000
§ Macrocells
• A special kind of LUT
• Large number of inputs
• Can only compute a subset of all possible functions
§ The fat belly issue

Top-Down Enumeration: Fat Belly
§ Reconvergence creates a "fat belly" in the middle of the cone.
§ To get cut {a, b, c} of t we have to expand past the fat belly.
§ Need to generate intermediate cuts larger than k to obtain all k-feasible cuts.
[Figure: node t with cut {u, v} (K=2) at the top, the fat belly {p, q, r, s} (K=4) in the middle, and cut {a, b, c} (K=3) at the bottom]

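The effect can be reproduced with a small search. The sketch below assumes hypothetical fanins for p, q, r and s (the slide does not spell them out) and expands one leaf at a time into its two fanins, keeping intermediate cuts only up to a size `limit`; `top_down_cuts` is an illustrative name.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical reconvergent graph; the fanins of p, q, r, s are assumed.
FANINS: Dict[str, Tuple[str, str]] = {
    "t": ("u", "v"),
    "u": ("p", "q"),
    "v": ("r", "s"),
    "p": ("a", "b"),
    "q": ("b", "c"),
    "r": ("a", "c"),
    "s": ("b", "c"),
}


def top_down_cuts(root: str, k: int, limit: int) -> Set[FrozenSet[str]]:
    """Enumerate cuts of `root` by leaf expansion.

    Intermediate cuts with more than `limit` leaves are dropped;
    only cuts with at most `k` leaves are returned.
    """
    start = frozenset([root])
    seen: Set[FrozenSet[str]] = {start}
    stack: List[FrozenSet[str]] = [start]
    while stack:
        cut = stack.pop()
        for leaf in cut:
            if leaf not in FANINS:
                continue  # primary inputs cannot be expanded
            new_cut = (cut - {leaf}) | frozenset(FANINS[leaf])
            if len(new_cut) <= limit and new_cut not in seen:
                seen.add(new_cut)
                stack.append(new_cut)
    return {cut for cut in seen if len(cut) <= k}
```

With k = 3 and `limit = 3` the search stops at {p, q, v} and {u, r, s} and never finds {a, b, c}; raising `limit` to 5 lets it pass through {p, q, r, s} and reach {a, b, c}.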
Solution: Factor Cuts
§ Don't enumerate all k-feasible cuts.
§ Enumerate a subset F (the factor cuts).
§ F has the property that the other k-feasible cuts can be generated easily from it.
§ Two techniques used:
• Partial factoring
• Complete factoring

Compacting the Cut Enumeration: Filtering
§ Reduce the size of the enumerated cut set:
• Detect cut duplication
• Detect cut domination
• Check k-feasibility
§ All these tasks require a lot of comparison and matching, which is processing intensive.
§ How to do this efficiently? Use the signature of a cut.

Using Signatures
§ Signature of a cut: an M-bit binary number, given by

$$\mathrm{sign}(C) = \sum_{n \in C} 2^{\,\mathrm{ID}(n) \bmod M}$$

where sign(C) is the signature of cut C, ID(n) is the index of node n, and Σ denotes bitwise OR.
§ Properties of signatures:
• Duplication: If cuts C1 and C2 are equal, so are their signatures.
• Domination: If cut C1 dominates cut C2, the 1s of sign(C1) are contained in the 1s of sign(C2).
• K-feasibility: If C1 ∪ C2 is a K-feasible cut, |sign(C1) + sign(C2)| ≤ K. Here |n| denotes the number of 1s in the binary representation of n, and addition is bitwise OR.
(Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”)

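The three properties translate directly into bitwise tests. This is a minimal sketch, assuming cuts are given as sets of integer node IDs and M = 32; the function names are illustrative. Each property is a one-way implication, so a failed test proves the negative, while a passed test still needs an exact comparison because different IDs can map to the same bit.

```python
from typing import Iterable

M = 32  # assumed signature width in bits


def signature(cut: Iterable[int], m: int = M) -> int:
    """Bitwise OR of 2^(ID mod m) over the node IDs in the cut."""
    sig = 0
    for node_id in cut:
        sig |= 1 << (node_id % m)
    return sig


def may_be_equal(sig1: int, sig2: int) -> bool:
    """Equal cuts have equal signatures; different signatures rule out equality."""
    return sig1 == sig2


def may_dominate(sig1: int, sig2: int) -> bool:
    """If cut 1 dominates cut 2, the 1s of sig1 are contained in the 1s of sig2."""
    return sig1 & sig2 == sig1


def may_be_k_feasible(sig1: int, sig2: int, k: int) -> bool:
    """If the merged cut is k-feasible, the OR of the signatures has at most k 1s."""
    return bin(sig1 | sig2).count("1") <= k
```

A mapper would typically run these cheap tests first and fall back to the set comparison only when they pass.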
Signature Example: Dominating Cuts
§ Give an ID to each node.
§ Consider cuts C1 = {5, 8, 6, 7} and C2 = {5, 6, 7}.
§ C2 is a proper subset of C1, so C2 dominates C1. Let's see how signatures help us detect this.
§ Calculate a 4-bit signature (M = 4) from the node IDs:
• $\mathrm{sign}(C_1) = 2^{5 \bmod 4} + 2^{8 \bmod 4} + 2^{6 \bmod 4} + 2^{7 \bmod 4} = 2^1 + 2^0 + 2^2 + 2^3 = 0010_2 + 0001_2 + 0100_2 + 1000_2 = 1111_2$
• $\mathrm{sign}(C_2) = 2^{5 \bmod 4} + 2^{6 \bmod 4} + 2^{7 \bmod 4} = 1110_2$
§ Since C2 dominates C1, the 1s of sign(C2) are contained in the 1s of sign(C1).
§ Dominated cuts (here C1) are removed.
§ So we have converted the problem of comparing sets of leaves into a problem of comparing binary numbers.
[Figure: example graph with node IDs 1 to 10, with cuts C1 and C2 marked]

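The numbers in this example can be run through a two-stage filter: the signature test first, then an exact subset check for the pairs that pass. This is a sketch with illustrative names (`signature`, `remove_dominated`). The cut {1, 6, 7} is added here as an assumption to show aliasing, since IDs 1 and 5 map to the same bit when M = 4.

```python
from typing import FrozenSet, List


def signature(cut: FrozenSet[int], m: int) -> int:
    """Bitwise OR of 2^(ID mod m) over the node IDs in the cut."""
    sig = 0
    for node_id in cut:
        sig |= 1 << (node_id % m)
    return sig


def remove_dominated(cuts: List[FrozenSet[int]], m: int = 4) -> List[FrozenSet[int]]:
    """Drop every cut that has a proper subset elsewhere in the list."""
    sigs = [signature(cut, m) for cut in cuts]
    kept = []
    for i, cut_i in enumerate(cuts):
        dominated = False
        for j, cut_j in enumerate(cuts):
            if i == j:
                continue
            # Cheap filter: the 1s of sig(cut_j) must lie within sig(cut_i).
            if sigs[j] & sigs[i] != sigs[j]:
                continue
            # Exact check, needed because different IDs can share a bit.
            if cut_j < cut_i:
                dominated = True
                break
        if not dominated:
            kept.append(cut_i)
    return kept


c1 = frozenset({5, 8, 6, 7})
c2 = frozenset({5, 6, 7})
c3 = frozenset({1, 6, 7})  # aliases with c2: 1 % 4 == 5 % 4
```

`remove_dominated([c1, c2])` keeps only c2. For the pair c1 and c3 the signature test passes, but the exact check shows that c3 is not a subset of c1, so both are kept.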
Improvement in Area Recovery over DAOmap
§ Area optimization is done after latency optimization.
§ If a node has positive slack, choose the mapping that reduces the area.
• This process is called area recovery.
§ ABC combines 2 heuristic techniques:
• Global area recovery
• Local area recovery

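The slide names the two heuristics without defining them. As a rough illustration of the slack-driven idea only, the sketch below assumes the commonly used area-flow estimate, AF(n) = (LUT area + sum of the leaves' area flows) / number of fanouts, as a stand-in for the global step, and picks the cheapest cut that still meets the node's required time. All names and numbers are hypothetical; the local step is not modelled.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

LUT_AREA = 1.0  # assumption: every LUT costs one unit of area


def area_flow(leaves: FrozenSet[str], flow: Dict[str, float], fanouts: int) -> float:
    """Area-flow estimate of a node implemented with the cut `leaves`."""
    return (LUT_AREA + sum(flow[leaf] for leaf in leaves)) / max(fanouts, 1)


def pick_cut(
    cuts: Iterable[FrozenSet[str]],
    arrival: Dict[str, int],
    flow: Dict[str, float],
    fanouts: int,
    required: int,
) -> Optional[Tuple[float, int, FrozenSet[str]]]:
    """Return (area flow, depth, cut) of the cheapest cut meeting `required`."""
    best = None
    for leaves in cuts:
        depth = 1 + max(arrival[leaf] for leaf in leaves)
        if depth > required:
            continue  # this cut would make the node late
        af = area_flow(leaves, flow, fanouts)
        if best is None or (af, depth) < (best[0], best[1]):
            best = (af, depth, leaves)
    return best
```

When the required time leaves slack, a deeper but cheaper cut can win; when there is no slack, only the fastest cuts remain candidates.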
Traditional Synthesis
§ Only the network at the end of technology-independent synthesis is used for mapping.
1) Technology-independent synthesis algorithms are heuristic and not always optimum.
2) The mapper may get better results from an intermediate network in the flow.
3) The whole network is optimized for only one goal (say, latency or area).
[Figure: Boolean Network → technology-independent synthesis (sweep, eliminate, resub, simplify, fx, resub, sweep, eliminate, sweep, full simplify) → Technology Mapping → Mapped Netlist. Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”]

Lossless Synthesis
• Merge intermediate networks into a single network with choices.
• Can combine the results of different technology-independent optimization scripts.
[Figure: a Boolean Network is optimized by a script that optimizes area (sweep, eliminate, resub, simplify, fx, ...) and a script that optimizes delay (speed up, reduce depth); a choice operator merges the results, followed by Technology Mapping and the Mapped Netlist. Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”]

Lossless Synthesis: Constructing the Choice Network
§ Choice network: a network of nodes that are functionally equivalent.
§ All the nodes with the same global function in terms of the PIs are connected by choice edges.
§ The result is a choice AIG that has multiple functionally equivalent points grouped together.
(Source: Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs”)

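The grouping step can be illustrated on a toy network. The sketch below assumes two hypothetical AND-only structures of the same function merged into one network, and finds equivalent nodes by exhaustive simulation over the PIs. That is only workable for a handful of inputs and is used here purely for illustration; names such as `choice_classes` are made up.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, List, Tuple

PIS: List[str] = ["a", "b", "c"]
# Hypothetical merged network: two structures computing a & b & c.
FANINS: Dict[str, Tuple[str, str]] = {
    "x1": ("a", "b"),
    "f1": ("x1", "c"),  # (a & b) & c
    "x2": ("b", "c"),
    "f2": ("a", "x2"),  # a & (b & c)
}
ORDER: List[str] = ["x1", "f1", "x2", "f2"]  # fanins before their users


def global_truth_tables() -> Dict[str, Tuple[int, ...]]:
    """Global function of every node as a truth table over the PIs."""
    rows = list(product((0, 1), repeat=len(PIS)))
    tables = {pi: tuple(row[i] for row in rows) for i, pi in enumerate(PIS)}
    for node in ORDER:
        fanin0, fanin1 = FANINS[node]
        tables[node] = tuple(x & y for x, y in zip(tables[fanin0], tables[fanin1]))
    return tables


def choice_classes() -> List[List[str]]:
    """Groups of two or more nodes that share the same global function."""
    groups: Dict[Tuple[int, ...], List[str]] = defaultdict(list)
    for node, table in global_truth_tables().items():
        groups[table].append(node)
    return [group for group in groups.values() if len(group) > 1]
```

Here f1 and f2 fall into one class, so a mapper working on the choice network could pick cuts through either structure at that point.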
ABC Command Summary
§ fpga: performs FPGA technology mapping using the techniques described above.
§ if: an all-new integrated FPGA mapper based on the notion of priority cuts.

Summary: FPGA Mapping Using ABC
§ A set of AIG-based techniques that improve upon DAOmap-based technology mapping for FPGAs.
§ Reduce the processing and memory load.
§ Improve area recovery.
§ Use lossless synthesis to get the most out of non-optimum technology-independent synthesis.

Questions

References:
[1] Alan Mishchenko et al., “Improvements to Technology Mapping for LUT-Based FPGAs,” 2007.
[2] P Pany, et al., “A new retiming-based technology mapping algorithm for LUT-based FPGAs,” 1998.
[3] J. Cong and Y. Ding, “An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs,” 1994.
[4] D. Chen and J. Cong, “DAOmap: A depth-optimal area optimization mapping algorithm for FPGA designs,” 2004.