Magic: An Industrial-Strength Logic Optimization, Technology Mapping, and Formal Verification System


















- Slides: 18
Magic: An Industrial-Strength Logic Optimization, Technology Mapping, and Formal Verification System
Robert Brayton, Alan Mishchenko, Niklas Een
UC Berkeley

Overview
- Motivation
- Big picture
- Problem representation
- Handling "industrial stuff"
  - Clock domains, flop controls, boxes, etc.
- Algorithms
  - Sequential synthesis
  - Combinational synthesis with choices
  - Technology mapping
  - Minimum-perturbation retiming
- Experimental results
- Future work

Motivation
- The baseline version of ABC is not applicable to industrial designs because it does not support
  - Complex flops
  - Multiple clock domains
  - Special objects (adders, RAMs, DSPs, etc.)
  - Standard-cell libraries
- A fresh start is needed, including
  - A new design database that supports all of the above
  - A new integration of application packages to reduce memory and runtime

Big Picture
[Figure: system diagram; recoverable labels: Verilog, EDIF, BLIF (file formats); Programmable APIs]
Reference: A. Mishchenko, N. Een, R. K. Brayton, S. Jang, M. Ciesielski, and T. Daniel, "Magic: An industrial-strength logic optimization, technology mapping, and formal verification tool", Proc. IWLS'10.

Representations
- Netlist
  - Original / current / resulting design
- AIG: the main data structure
  - Represents local / global functions
  - Gets synthesized / mapped / verified
- Logic network
  - Represents the result of tech-mapping
  - AIG annotated with technology primitives (LUTs, gates, etc.)

"Industrial Stuff"
- Clock domains
  - Represent the clock signal in the database
  - Annotate flops with their clock-domain number in the AIG
  - Separate clock domains in sequential transforms
- Complex controls of the flops
  - Use a parametrized flop model
  - Perform elaboration of control signals if needed
  - Handle asynchronous reset carefully!
- Industrial primitives (adders, RAMs, DSPs, etc.)
  - Use boxes (black/white, comb/seq, merge/no_merge, etc.)
  - Currently propagates timing information, which improves quality of synthesis
    - Elaborate boxes for seq synthesis, but do not map them
  - Need better support for user-specified attributes (don't-touch, etc.)

AIG: A Unifying Representation
- An underlying data structure for various computations
  - Represents both local and global functions
  - Used in rewriting, resubstitution, simulation, SAT sweeping, induction, etc.
- A unifying representation for the whole flow
  - Synthesis, mapping, and verification pass around AIGs
  - Stores multiple structures for mapping ("AIG with choices")
- The main functional representation in ABC
  - Foundation of "contemporary" logic synthesis
  - Source of "signature features" (speed, scalability, etc.)

AIG Definition and Examples
An AIG (And-Inverter Graph) is a Boolean network composed of two-input ANDs and inverters.
- F(a, b, c, d) = ab + d(ac' + bc): 6 nodes, 4 levels
- F(a, b, c, d) = ac'(b'd')' + bc(a'd')' = ac'(b + d) + bc(a + d): 7 nodes, 3 levels
[Figure: Karnaugh map and AIG drawing for each of the two structures of F]

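The node and level counts on this slide can be checked with a short script. The sketch below is illustrative only and is not taken from ABC or Magic: the tuple encoding and the helper names (`AND`, `NOT`, `OR`, `and_nodes`, `levels`) are assumptions made for this example. It builds both structures of F from two-input ANDs and inverters, with the second structure written as ac'(b'd')' + bc(a'd')' (the form implied by the expansion ac'(b + d) + bc(a + d)), then confirms that they compute the same function and differ in size and depth.

```python
from itertools import product

# Hypothetical encoding: inputs are strings, ("and", x, y) is an AND node,
# ("not", x) is an inverter. Inverters are not counted as nodes or levels.
def AND(x, y):
    return ("and", x, y)

def NOT(x):
    return ("not", x)

def OR(x, y):
    # De Morgan: x + y = (x'y')'
    return NOT(AND(NOT(x), NOT(y)))

def evaluate(n, env):
    if isinstance(n, str):
        return env[n]
    if n[0] == "not":
        return not evaluate(n[1], env)
    return evaluate(n[1], env) and evaluate(n[2], env)

def and_nodes(n, seen=None):
    """Collect the distinct AND nodes reachable from n."""
    seen = set() if seen is None else seen
    if isinstance(n, str):
        return seen
    if n[0] == "not":
        return and_nodes(n[1], seen)
    seen.add(n)
    and_nodes(n[1], seen)
    and_nodes(n[2], seen)
    return seen

def levels(n):
    """Longest path from n to an input, counted in AND nodes."""
    if isinstance(n, str):
        return 0
    if n[0] == "not":
        return levels(n[1])
    return 1 + max(levels(n[1]), levels(n[2]))

a, b, c, d = "a", "b", "c", "d"

# F = ab + d(ac' + bc)
f1 = OR(AND(a, b), AND(d, OR(AND(a, NOT(c)), AND(b, c))))

# F = ac'(b'd')' + bc(a'd')'
f2 = OR(AND(AND(a, NOT(c)), NOT(AND(NOT(b), NOT(d)))),
        AND(AND(b, c), NOT(AND(NOT(a), NOT(d)))))

equivalent = all(
    evaluate(f1, dict(zip("abcd", bits))) == evaluate(f2, dict(zip("abcd", bits)))
    for bits in product([False, True], repeat=4)
)
print(equivalent, len(and_nodes(f1)), levels(f1), len(and_nodes(f2)), levels(f2))
```

The second structure spends one extra AND node to save one level, which is the kind of area/delay trade-off that the later slides on rewriting and mapping exploit.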
Three Simple Tricks
- Structural hashing
  - Makes sure the AIG is stored in a compact form
  - Is applied during AIG construction
    - Propagates constants
    - Makes each node structurally unique
- Complemented edges
  - Represents inverters as attributes on the edges
    - Leads to fast, uniform manipulation
    - Does not use memory for inverters
    - Increases logic sharing using De Morgan's rule
- Memory allocation
  - Uses a fixed amount of memory for each node
    - Can be done by a simple custom memory manager
    - Even dynamic fanout manipulation is supported!
  - Allocates memory for nodes in a topological order
    - Optimized for traversal in the same topological order
    - Small static memory footprint for many applications
  - Computes fanout information on demand
[Figure: the same circuit over inputs a, b, c, d drawn without hashing and with hashing]

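The first two tricks, and the topological allocation order, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ABC's or Magic's data structure: the class name `Aig`, its method names, and the literal encoding (literal = 2 * node id + complement bit, with node 0 as constant false) are choices made for this example.

```python
class Aig:
    """Toy AIG with structural hashing and complemented edges (illustrative only)."""

    def __init__(self):
        # fanins[node] is (lit0, lit1) for AND nodes, None for the constant and inputs.
        # Nodes are appended as they are created, so the list is in topological order.
        self.fanins = [None]          # node 0 = constant false
        self.table = {}               # structural hash table: (lit0, lit1) -> node id

    def new_input(self):
        self.fanins.append(None)
        return 2 * (len(self.fanins) - 1)

    @staticmethod
    def neg(lit):
        # An inverter is just the low bit of the literal: no node is allocated.
        return lit ^ 1

    def and_(self, x, y):
        if x > y:
            x, y = y, x               # canonical fanin order
        # Constant propagation and trivial cases.
        if x == 0:
            return 0                  # 0 & y = 0
        if x == 1:
            return y                  # 1 & y = y
        if x == y:
            return x                  # y & y = y
        if x == (y ^ 1):
            return 0                  # y & y' = 0
        node = self.table.get((x, y))
        if node is None:              # create the node only if it is structurally new
            self.fanins.append((x, y))
            node = len(self.fanins) - 1
            self.table[(x, y)] = node
        return 2 * node

    def or_(self, x, y):
        return self.neg(self.and_(self.neg(x), self.neg(y)))

    def num_ands(self):
        return len(self.table)

    def simulate(self, lit, values):
        """Evaluate lit given {input_literal: bool} in one pass over the node array."""
        val = [False] * len(self.fanins)
        for node, fi in enumerate(self.fanins):
            if fi is None:
                val[node] = values.get(2 * node, False)
            else:
                l0, l1 = fi
                val[node] = (val[l0 >> 1] ^ bool(l0 & 1)) and (val[l1 >> 1] ^ bool(l1 & 1))
        return val[lit >> 1] ^ bool(lit & 1)


g = Aig()
a, b = g.new_input(), g.new_input()
ab = g.and_(a, b)
ba = g.and_(b, a)                          # same structure, found in the hash table
nand_via_or = g.or_(g.neg(a), g.neg(b))    # a' + b' shares the node of ab
print(ab == ba, nand_via_or == g.neg(ab), g.num_ands())
```

Because every node is created after its fanins, a single forward loop over the array visits nodes in topological order, which is what `simulate` relies on.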
Application Packages
- Framework
  - Design database
  - File input / output
  - Programmable APIs
- Combinational optimization
  - AIG rewriting
  - Choice computation
  - Technology mapping
- Sequential optimization
  - Retiming
  - Merging equivalent nodes
- Technology mapping
  - Mapping with choices
  - Speedup
- Verification
  - Simulation
  - Comb equivalence checking
  - Seq equivalence checking

Sequential Synthesis
- Detect, prove, and merge sequentially equivalent nodes
  - Comb equiv nodes are equivalent in any state
  - Seq equiv nodes are equivalent only in reachable states
- Observations
  - Can take into account user-specified constraints (don't-cares)
  - Leads to substantial reduction for large designs (> 10% in area)
  - Can be done using simulation and SAT (without BDDs)
  - Can be implemented efficiently for 1M-gate designs
Reference: A. Mishchenko, M. L. Case, R. K. Brayton, and S. Jang, "Scalable and scalably-verifiable sequential synthesis", Proc. ICCAD'08.

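The distinction between combinational and sequential equivalence can be seen on a toy example. The sketch below is purely illustrative: the two-flop ring circuit, the node functions `f` and `g`, and the explicit state enumeration are assumptions made for this example. The slide's approach relies on simulation and SAT rather than on enumerating states, which would not scale to real designs.

```python
from itertools import product

# Hypothetical circuit: two flops (s0, s1) that swap values every cycle,
# starting from the one-hot state (1, 0).
INIT = (True, False)

def next_state(s):
    s0, s1 = s
    return (s1, s0)

# Two internal nodes that are candidates for merging.
def f(s):
    return s[0] or s[1]       # s0 OR s1

def g(s):
    return s[0] != s[1]       # s0 XOR s1

def reachable(init, step):
    """Explicit forward reachability from the initial state."""
    seen, frontier = {init}, [init]
    while frontier:
        t = step(frontier.pop())
        if t not in seen:
            seen.add(t)
            frontier.append(t)
    return seen

all_states = set(product([False, True], repeat=2))
reach = reachable(INIT, next_state)

comb_equiv = all(f(s) == g(s) for s in all_states)   # must hold in every state
seq_equiv = all(f(s) == g(s) for s in reach)         # must hold in reachable states only

print(sorted(reach), comb_equiv, seq_equiv)
```

Here `f` and `g` differ only in state (1, 1), which is never reached, so they can be merged by sequential synthesis even though a combinational equivalence check would reject the merge.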
Comb Synthesis With Choices
- Restructure the AIG and keep track of changes
  - Iterate fast local AIG rewriting with a global view (via hash table)
  - Collect AIG snapshots and prove equivalences across them
  - Use equivalences (choices) during technology mapping
- Observations
  - Leads to improved QoR after technology mapping
  - Successfully applied to 1M-gate designs
[Figure: pre-computed AIG subgraphs for F = abc (Subgraphs 1, 2, and 3); rewriting node A by replacing Subgraph 1 with Subgraph 2]
Reference: A. Mishchenko, S. Chatterjee, and R. Brayton, "DAG-aware AIG rewriting: A fresh look at combinational logic synthesis", Proc. DAC'06.

Technology Mapping
- Customizable structural mapping with priority cuts
  - Computes a small subset of cuts without impacting the QoR
  - Uses structural choices
- Observations
  - Allows for controlling QoR tradeoffs and cost functions
    - Minimize delay/area, wire count, switching activity, etc.
  - Successfully applied to 1M-gate designs
[Figure: an AIG with a choice node over primary inputs a, b, c, d, e and primary outputs f, next to the corresponding mapped network of LUTs]
Reference: A. Mishchenko, S. Chatterjee, and R. Brayton, "Combinational and sequential mapping with priority cuts", Proc. ICCAD'07.

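The idea of keeping only a small, ranked subset of cuts per node can be sketched as follows. This is a simplified illustration, not the algorithm as implemented in ABC or Magic: the function name `enumerate_priority_cuts`, its parameters `k` and `max_cuts`, and the ranking by (depth, size) are assumptions for this example, and it ignores structural choices, area recovery, and dominated-cut filtering.

```python
def enumerate_priority_cuts(fanins, inputs, k=4, max_cuts=8):
    """Bottom-up k-feasible cut enumeration that keeps at most max_cuts cuts per node.

    fanins: dict mapping each AND node to its two fanins, listed in topological order.
    Returns (cuts, depth): cuts[n] is a list of leaf sets, best first, followed by
    the trivial cut {n}; depth[n] is the LUT depth of the best cut of n.
    """
    cuts = {i: [frozenset([i])] for i in inputs}
    depth = {i: 0 for i in inputs}

    def cut_depth(cut):
        # One LUT on top of the deepest leaf.
        return 1 + max(depth[leaf] for leaf in cut)

    for n, (x, y) in fanins.items():
        # Merge every cut of one fanin with every cut of the other.
        candidates = set()
        for cx in cuts[x]:
            for cy in cuts[y]:
                merged = cx | cy
                if len(merged) <= k:
                    candidates.add(merged)
        # Rank by depth, then size; keep only the best few ("priority" cuts).
        ranked = sorted(candidates, key=lambda c: (cut_depth(c), len(c), sorted(c)))
        ranked = ranked[:max_cuts]
        depth[n] = cut_depth(ranked[0])
        # The trivial cut lets fanouts of n use n itself as a leaf.
        cuts[n] = ranked + [frozenset([n])]
    return cuts, depth


# Hypothetical five-input example: n4 = ((a & b) & (c & d)) & e
fanins = {
    "n1": ("a", "b"),
    "n2": ("c", "d"),
    "n3": ("n1", "n2"),
    "n4": ("n3", "e"),
}
cuts, depth = enumerate_priority_cuts(fanins, inputs="abcde", k=4, max_cuts=8)
print(depth["n3"], depth["n4"], sorted(cuts["n4"][0]))
```

With 4-input LUTs, n3 fits into a single LUT over {a, b, c, d}, while n4 depends on five inputs and needs a second level. Lowering `max_cuts` bounds the memory and runtime per node at the cost of possibly missing a better cut.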
Minimum-Perturbation Retiming
- Reduces delay after retiming, while minimizing the number of flops moved
  - Produces a trade-off: delay gain vs. the number of flops moved
  - Handles "industrial stuff" and retimes over white boxes!
  - Computes a new initial state after backward retiming
- Allows the user to control the resources
  - Desired delay gain
  - Maximum allowed number of flops moved
  - Maximum area increase after retiming
- Observations
  - Can be useful before and after placement
  - Can be implemented efficiently
    - Runs in less than a minute for 1M gates
[Figure: plot of delay vs. flops moved]
Reference: S. Ray, A. Mishchenko, R. K. Brayton, S. Jang, and T. Daniel, "Minimum-perturbation retiming for delay optimization", Proc. IWLS'10.

Experimental Setup
- Integrated Magic into an industrial FPGA synthesis flow
- Experimented with the full flow, including P&R
  - Did not use retiming
  - Did not use post-placement re-synthesis
- Verified by running Magic and in-house simulation tools
- Experimented with 20 designs, from 175K to 648K LUT4
- Two experimental runs:
  - "Reference" stands for the typical industrial flow without Magic
  - "Magic" stands for the new flow with Magic
Flow stages:
- Frontend: design entry, high-level synthesis, quick mapping
- Magic: seq and comb synthesis, mapping, legalization
- Backend: placement, routing, design rule checking, etc.

Experimental Results

Cumulative Improvements
- fMAX = 11.8%
- LUT count = 12.7%
- FF-to-FF level = 22.3%
- Register count = 9.4%
- Total flow runtime = 3.1%
- P&R runtime = 50%

Future Work
- Continue to improve application packages
  - AIG rewriting, tech-mapping, sequential synthesis, etc.
- Improve integration of logic and physical synthesis
  - Synthesis/mapping/retiming before placement
  - Retiming/restructuring after placement
- Extend the flow to work for other technologies
  - Macro cells
  - Standard cells