Mapping into LUT Structures
Sayak Ray, Alan Mishchenko, Niklas Een, Robert Brayton (Department of EECS, UC Berkeley)
Stephen Jang, Chao Chen (Agate Logic Inc.)
Contributions (in a nutshell)
• New mapping algorithm for FPGAs, which maps into LUT structures instead of LUTs
• It has two applications:
  (1) Improving the quality of mapping into LUTs
    – Area improves by 7.4% on average
    – Delay improves by 11.3% on average
  (2) Improving delay for specialized hardware, which supports non-routable connections
    – Delay improves by 40% on average
    – With some area penalty
LUT Structure
• LUT structure: a group of LUTs connected by direct, non-routable wires
• [Figure: 7-input LUT structure "44" and 10-input LUT structure "444", with the non-routable wires labeled]
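The input counts quoted for "44" and "444" can be reproduced with a small calculation. The sketch below assumes that each digit in the structure name is the size of one LUT and that every non-routable wire occupies exactly one input pin of the LUT it feeds; the function name `lut_structure_inputs` is hypothetical and used only for illustration.

```python
def lut_structure_inputs(name: str) -> int:
    """External inputs of a LUT structure named by its LUT sizes, e.g. "44".

    Assumption: n LUTs are joined by n - 1 non-routable wires, and each
    wire takes up one input pin of the LUT it drives.
    """
    sizes = [int(ch) for ch in name]
    return sum(sizes) - (len(sizes) - 1)


print(lut_structure_inputs("44"))   # 7
print(lut_structure_inputs("444"))  # 10
```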
Some Terminology
• Let $f(X)$ be a Boolean function
• Let $X_1 \subseteq X$ be a subset of its support
• Suppose $\{q_1(X), q_2(X), \ldots, q_\mu(X)\}$ is the set of distinct cofactors of $f$ w.r.t. $X_1$
• $\mu$ is called the column multiplicity of $f$ w.r.t. $X_1$
• Given a partition of $X$ into two disjoint subsets $X_1$ and $X_2$, we say that an Ashenhurst-Curtis decomposition of $f(X)$ exists if $f(X)$ can be expressed as
  $f(X) = h(g_1(X_1), g_2(X_1), \ldots, g_k(X_1), X_2)$
  – $X_1$: bound set
  – $X_2$: free set
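Column multiplicity can be computed directly from the definition by enumerating the cofactors with respect to the bound set and counting the distinct ones. The following is a minimal brute-force sketch, not the truth-table implementation used in the talk; the function `column_multiplicity` and the two example functions are assumptions made for illustration. By the classical Ashenhurst-Curtis condition, a disjoint decomposition with k bound-set functions exists when the column multiplicity is at most 2^k.

```python
from itertools import product


def column_multiplicity(f, n, bound):
    """Count distinct cofactors of f w.r.t. the variables listed in `bound`.

    f     : callable taking a tuple of n bits and returning 0 or 1
    bound : list of variable indices forming the bound set X1
    """
    free = [i for i in range(n) if i not in bound]
    columns = set()
    for b in product((0, 1), repeat=len(bound)):
        column = []
        for r in product((0, 1), repeat=len(free)):
            x = [0] * n
            for i, v in zip(bound, b):
                x[i] = v
            for i, v in zip(free, r):
                x[i] = v
            column.append(f(tuple(x)))
        # Each bound-set assignment yields one cofactor over the free set.
        columns.add(tuple(column))
    return len(columns)


def parity_and(x):
    # (x0 xor x1 xor x2 xor x3) and x4
    return (x[0] ^ x[1] ^ x[2] ^ x[3]) & x[4]


def majority3(x):
    return int(x[0] + x[1] + x[2] >= 2)


print(column_multiplicity(parity_and, 5, [0, 1, 2, 3]))  # 2
print(column_multiplicity(majority3, 3, [0, 1]))         # 3
```

In the first example the multiplicity is 2, so a single bound-set function g(x0, x1, x2, x3) is enough; in the second it is 3, so one function is not enough for a disjoint decomposition on that bound set.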
Flow of performLutMatchingXY
1 SupportMinimize: removes vacuous variables
2 findOutputDecomposition: checks for f = x G
3 findGoodBoundSet
4 checkSpecialNonDisjoint
5 reverseVariableOrder
6 findGoodBoundSet
7 checkSpecialNonDisjoint
• Variable reordering in truth table
• Allows cases $\mu$ = 2, 3, 4
• For $\mu$ = 3, 4, consider special decomposition with one shared variable only
• A heuristic to find suitable decomposition
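The first step of the flow removes vacuous variables, that is, variables the function does not actually depend on. A minimal sketch of that test on an integer-encoded truth table is shown below. The encoding (bit m of `tt` holds the function value at minterm m, and variable i is bit i of m) and the names `depends_on` and `support` are assumptions for illustration, not the routines named on the slide.

```python
def depends_on(tt: int, n: int, var: int) -> bool:
    """True if the n-variable truth table tt depends on variable `var`."""
    for m in range(1 << n):
        if not (m >> var) & 1:
            # Compare the value at var = 0 with the value at var = 1.
            if ((tt >> m) & 1) != ((tt >> (m | (1 << var))) & 1):
                return True
    return False


def support(tt: int, n: int) -> list:
    """Indices of the non-vacuous variables of tt."""
    return [v for v in range(n) if depends_on(tt, n, v)]


# f(x0, x1, x2) = x0 AND x2 is true only at minterms 5 and 7; x1 is vacuous.
tt = (1 << 5) | (1 << 7)
print(support(tt, 3))  # [0, 2]
```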
Checking for XYZ decomposition
• X, Y, and Z are sizes of the main/fanin LUTs
• Two-step process
  – Checking for XW where W = Y + Z - 2
  – If it exists, then check the remainder function G for YZ
• Priority cut-based technology mapper is modified to accommodate the algorithm for XY and XYZ
• The results of decomposition checking are cached
  – This substantially reduces runtime on large designs
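Caching the results of decomposition checking can be sketched as a table keyed by the function and the target structure, so that a cut whose function was seen before is not checked again. The class `DecompositionCache`, its `lookup` method, and the stand-in `placeholder_check` below are hypothetical; the real check and the key used in the talk's implementation are not given in the slides.

```python
class DecompositionCache:
    """Memoizes a decomposition check by (truth table, variable count, structure)."""

    def __init__(self, check):
        self._check = check      # the expensive decomposition test
        self._results = {}       # key -> cached True/False answer
        self.hits = 0

    def lookup(self, truth_table: int, num_vars: int, structure: str) -> bool:
        key = (truth_table, num_vars, structure)
        if key in self._results:
            self.hits += 1
        else:
            self._results[key] = self._check(truth_table, num_vars, structure)
        return self._results[key]


def placeholder_check(truth_table, num_vars, structure):
    # Stand-in only: a real mapper would run the XY or XYZ check here.
    return bin(truth_table).count("1") % 2 == 0


cache = DecompositionCache(placeholder_check)
cache.lookup(0xA0, 3, "44")
cache.lookup(0xA0, 3, "44")   # second query is answered from the table
print(cache.hits)             # 1
```

Keying on the raw truth table only matches identical functions; a canonical or semi-canonical form of the function as the key would let more queries share an entry.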
Experiment 1
Ray, Mishchenko, Een, Brayton, Jang, Chen – DATE 2012
Experiment 2
Experiment 3
Experiment 4 – Delay Optimization
Experiment 5 – Delay Optimization
Experiment 6 – Delay Optimization
Experiment 7: industrial design
Experiment 8: industrial design
Future Work
• Improving implementation
  – Handling delay-driven decomposition: currently we ignore arrival time, and just care about detecting any decomposition
  – Using semi-canonical form to increase the number of hits in the hash table of computed results
  – Making truth-table based decomposition even faster
• Combining Boolean decomposition into LUT structures with structural mapping of LUTs into clusters
• Evaluating results after place and route
  – This will be especially interesting when specialized hardware is available
Questions