A Min-Cost Flow Based Detailed Router for FPGAs
Seokjin Lee*, Yongseok Cheon*, D. F. Wong+
* The University of Texas at Austin
+ University of Illinois at Urbana-Champaign
Outline
- Overview
- Introduction
  - FPGA Architecture, Routing resources
  - Problem Definitions
- Algorithm Description
  - Min-cost flow based router
  - Lagrangian relaxation
- Experimental Results
- Conclusion
Overview
- FlowRoute: a congestion-driven detailed router
- Finds a feasible routing with minimum total delay for a given placed netlist
- Routes all the nets connected to a LUT simultaneously by a min-cost flow algorithm
- Iterative refinement with Lagrangian relaxation
FPGA Architecture
- Logic modules
  - Implement logic functions
  - LUTs, flip-flops
- Routing resources
  - Wire segments
  - Programmable switches
- I/O modules
<A typical FPGA architecture>
FPGA Routing Resources
- Prefabricated routing resources
  - Congestion constraints
  - Limited routability
- High RC delays and large area of switches
FPGA Routing Example
Graph Representation
- Routing resource graph $G(V, E)$
  - $V$: I/O pins of logic modules, wire segments
  - $E$: feasible connections between the nodes
- Routing problem: finding vertex-disjoint trees $T = \{T_1, \ldots, T_n\}$
Problem Definitions
- The Routing for One LUT (ROL) problem
  - Find routes for all the net segments connected to a LUT
  - Uses the equivalence of the input pins of a LUT
- FPGA detailed routing problem
  - Find routes for all the nets
  - Solve the ROL problem for all LUTs in an FPGA
Flow Network for ROL
- Construct $G_f(V_f, E_f)$ from $G(V, E)$
- $V_f = V \cup \{s, s_1, s_2, \ldots, s_n, t\}$, where $s_i$ is a subsource
- $E_f = E \cup E_s \cup E_s' \cup E_t$
  - $E_s = \{(s, s_i) \mid i = 1, \ldots, n\}$, $E_s' = \{(s_i, v) \mid i = 1, \ldots, n,\ v \in T_i\}$
  - $E_t = \{(p_i, t) \mid p_i \in S_p\}$
- Edge capacity
  - $r_f(e) = 1$ for all $e \in E_f$
- Node capacity
  - $r_f(v) = 1$ for all $v \in V_f - \{s, t\}$
- Cost
  - $c_f(e) = c(e)$ for $e \in E$, $c_f(v) = c(v)$ for $v \in V$
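The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `build_rol_network`, the `Arc` tuple, and the toy node names are hypothetical, and unit node capacities are modeled with the usual node-splitting trick (each node `v` becomes an arc from `(v, "in")` to `(v, "out")` with capacity 1 and cost `c(v)`).

```python
from collections import namedtuple

# Hypothetical arc record: tail node, head node, capacity r_f, cost c_f.
Arc = namedtuple("Arc", "tail head capacity cost")


def build_rol_network(nodes, edges, node_cost, edge_cost, trees, sink_pins):
    """Return the arc list of G_f for one LUT (a sketch, see assumptions above).

    nodes     : routing-resource nodes (V)
    edges     : directed (u, v) pairs (E)
    node_cost : dict node -> c(v)
    edge_cost : dict (u, v) -> c(e)
    trees     : list of node sets; trees[i-1] plays the role of T_i
    sink_pins : LUT input pins (S_p)
    """
    arcs = []
    # Node capacity 1: split every node into an "in" and an "out" copy.
    for v in nodes:
        arcs.append(Arc((v, "in"), (v, "out"), 1, node_cost[v]))
    # Original edges E keep their cost and get unit capacity.
    for (u, v) in edges:
        arcs.append(Arc((u, "out"), (v, "in"), 1, edge_cost[(u, v)]))
    for i, tree in enumerate(trees, start=1):
        subsource = ("s", i)
        arcs.append(Arc("s", subsource, 1, 0))              # E_s
        for v in sorted(tree):
            arcs.append(Arc(subsource, (v, "in"), 1, 0))    # E_s'
    for p in sink_pins:
        arcs.append(Arc((p, "out"), "t", 1, 0))             # E_t
    return arcs


# Toy instance: two nets (trees {a} and {b}) competing for two LUT pins.
nodes = ["a", "b", "p1", "p2"]
edges = [("a", "p1"), ("a", "p2"), ("b", "p1"), ("b", "p2")]
node_cost = {v: 1 for v in nodes}
edge_cost = {e: 1 for e in edges}
arcs = build_rol_network(nodes, edges, node_cost, edge_cost,
                         trees=[{"a"}, {"b"}], sink_pins=["p1", "p2"])
print(len(arcs), "arcs")
```

Because every subsource has a single unit-capacity arc from `s`, each net contributes at most one unit of flow, which is what lets a flow of value n be read as one route per net segment.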
Flow Network (example)
ROL_NF (example)
ROL_NF (example, continued)
ROL_NF for ROL
- A min-cost max-flow $f^*$ in $G_f$ corresponds to a solution to ROL with minimum total delay cost
- If $|f^*| = n$, all the net segments connected to the LUT are routed
- ROL_NF solves the ROL problem exactly in polynomial time
ROL_NF
Algorithm ROL_NF
1. Construct $G_f(V_f, E_f)$
2. Assign costs and capacities
3. Run a min-cost max-flow algorithm on $G_f(V_f, E_f)$
4. Derive routes for the nets from the computed flow
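Steps 3 and 4 can be illustrated with a small, self-contained Python sketch. The slides do not say which min-cost max-flow algorithm is used, so the successive-shortest-path method with Bellman-Ford below is a stand-in chosen for brevity. The function names and the toy network are hypothetical, and node capacities are assumed to have been converted to arc capacities already.

```python
def min_cost_max_flow(arcs, s, t):
    """Successive shortest augmenting paths (Bellman-Ford on the residual graph).

    arcs: list of (tail, head, capacity, cost) with non-negative costs.
    Returns (flow_value, total_cost, flow), where flow[i] is the flow on arcs[i].
    """
    to, cap, cost, adj = [], [], [], {}
    for (u, v, c, w) in arcs:
        # Residual edge 2*i is arc i; edge 2*i + 1 is its reverse.
        adj.setdefault(u, []).append(len(to)); to.append(v); cap.append(c); cost.append(w)
        adj.setdefault(v, []).append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    flow_value = total_cost = 0
    while True:
        dist, pred = {s: 0}, {}
        for _ in range(len(adj)):                 # at most |V| relaxation rounds
            changed = False
            for u in list(dist):
                for e in adj.get(u, []):
                    if cap[e] > 0 and dist[u] + cost[e] < dist.get(to[e], float("inf")):
                        dist[to[e]] = dist[u] + cost[e]
                        pred[to[e]] = e
                        changed = True
            if not changed:
                break
        if t not in dist:                         # no augmenting path is left
            break
        push, v = float("inf"), t
        while v != s:                             # bottleneck capacity on the path
            push = min(push, cap[pred[v]])
            v = to[pred[v] ^ 1]
        v = t
        while v != s:                             # augment along the path
            e = pred[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        flow_value += push
        total_cost += push * dist[t]
    return flow_value, total_cost, [cap[2 * i + 1] for i in range(len(arcs))]


def derive_routes(arcs, flow, s, t):
    """Follow saturated arcs from each subsource to the sink (step 4)."""
    nxt, starts = {}, []
    for (u, v, _, _), f in zip(arcs, flow):
        if f > 0:
            if u == s:
                starts.append(v)
            else:
                nxt[u] = v        # unit node capacity: one outgoing flow arc per node
    routes = []
    for v in starts:
        path = [v]
        while nxt[path[-1]] != t:
            path.append(nxt[path[-1]])
        routes.append(path)
    return routes


# Toy network: two subsources, two wire segments, two LUT pins.
arcs = [("s", "s1", 1, 0), ("s", "s2", 1, 0), ("s1", "a", 1, 0), ("s2", "b", 1, 0),
        ("a", "w1", 1, 1), ("a", "w2", 1, 4), ("b", "w1", 1, 1), ("b", "w2", 1, 2),
        ("w1", "p1", 1, 1), ("w2", "p2", 1, 1), ("p1", "t", 1, 0), ("p2", "t", 1, 0)]
flow_value, total_cost, flow = min_cost_max_flow(arcs, "s", "t")
routes = derive_routes(arcs, flow, "s", "t")
print(flow_value, total_cost, routes)
```

In this toy instance both nets prefer `w1`, but only one can have it; the min-cost flow assigns `w1` to net `a` and `w2` to net `b`, which is cheaper overall than the reverse assignment.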
Lagrangian Relaxation
- General technique for solving optimization problems with difficult constraints
- The original optimization problem is divided into subproblems
- Each subproblem is solved by repeated application of ROL_NF
- Lagrangian multipliers guide the router
Lagrangian Relaxation
- Original problem
- Lagrangian subproblem
LR for FPGA detailed routing
- Original problem
- Lagrangian subproblem
- $\max\{\min L_\lambda(x)\}$
Solving Lagrangian Subproblem
- By rearranging terms,
  - $L_\lambda(x) = \sum_k \sum_i (c_i + \lambda_i) x_{ik} - \sum_i \lambda_i$
  - $LS'_\lambda = \min\{\sum_k \sum_i (c_i + \lambda_i) x_{ik}\}$
- ROL_NF solves $LS'_\lambda$
  - Set $(c_i + \lambda_i)$ as the cost of $i$
  - $c_i = d_i$ (delay term) $\times$ $q_i$ (congestion term)
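The rearrangement can be checked with a short derivation. The original problem's formula is not recoverable from the slide text, so the formulation below is an assumption: $x_{ik} = 1$ when net $k$ uses routing resource $i$, and the relaxed constraint says each resource is used by at most one net.

```latex
\begin{align*}
\text{minimize}\quad & \sum_k \sum_i c_i x_{ik}
  \quad \text{subject to}\quad \sum_k x_{ik} \le 1 \ \ \text{for all } i \\
L_\lambda(x) &= \sum_k \sum_i c_i x_{ik}
  + \sum_i \lambda_i \Big( \sum_k x_{ik} - 1 \Big), \qquad \lambda_i \ge 0 \\
&= \sum_k \sum_i (c_i + \lambda_i)\, x_{ik} - \sum_i \lambda_i
\end{align*}
```

For fixed $\lambda$ the term $\sum_i \lambda_i$ is a constant, so minimizing $L_\lambda(x)$ over $x$ is the same as minimizing $\sum_k \sum_i (c_i + \lambda_i) x_{ik}$, which is $LS'_\lambda$.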
Updating Lagrangian Multipliers
- Subgradient method, with a step size parameter
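The update formula itself did not survive in the slide text. As a hedged illustration, the sketch below uses the textbook projected subgradient step for a "each resource used at most once" constraint; the function name `update_multipliers`, the `usage` dictionary, and the fixed step size are assumptions, not the authors' exact rule.

```python
def update_multipliers(lam, usage, step):
    """One projected subgradient step (assumed form, not taken from the slides).

    lam   : dict resource -> current multiplier (non-negative)
    usage : dict resource -> number of nets currently using the resource
    step  : positive step size
    The subgradient for resource i is (usage_i - 1); multipliers are
    clamped at zero so they stay dual-feasible.
    """
    return {i: max(0.0, lam[i] + step * (usage.get(i, 0) - 1)) for i in lam}


lam = {"w1": 0.0, "w2": 0.5, "w3": 0.2}
usage = {"w1": 3, "w2": 1, "w3": 0}   # w1 is shared by three nets, w3 is unused
new_lam = update_multipliers(lam, usage, step=0.5)
print(new_lam)
```

Overused resources become more expensive, exactly-used resources keep their multiplier, and unused resources drift back toward zero.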
FlowRoute
1. Initialize
2. For each $l_k$ in $L$ do
3.   Rip up nets connected to $l_k$
4.   Call ROL_NF
5.   Update costs and reset capacities
6. Update Lagrangian multipliers
7. Repeat Steps 2-6 until no shared resource exists
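The loop structure can be sketched as follows. This is a toy skeleton under heavy assumptions rather than the FlowRoute implementation: `route_lut` is a hypothetical stand-in for ROL_NF, each "route" is just a list of resource names, capacity resetting is omitted, and the multiplier update is the generic projected subgradient step.

```python
from collections import Counter


def flow_route(luts, route_lut, base_cost, step=0.5, max_iters=50):
    """Iterate rip-up and reroute until no resource is shared (toy skeleton)."""
    lam = {r: 0.0 for r in base_cost}                    # step 1: initialize
    routes = {}
    for iteration in range(1, max_iters + 1):
        for lut in luts:                                 # step 2
            routes.pop(lut, None)                        # step 3: rip up
            cost = {r: base_cost[r] + lam[r] for r in base_cost}
            routes[lut] = route_lut(lut, cost)           # step 4: ROL_NF stand-in
        usage = Counter(r for used in routes.values() for r in used)
        if all(count <= 1 for count in usage.values()):  # step 7: nothing shared
            return routes, iteration
        # step 6: raise multipliers on overused resources, clamp at zero
        lam = {r: max(0.0, lam[r] + step * (usage[r] - 1)) for r in lam}
    return routes, max_iters


# Hypothetical toy instance: each LUT can be served by one of two wires.
options = {"lutA": ["w1", "w2"], "lutB": ["w1", "w3"]}
base_cost = {"w1": 1.0, "w2": 3.0, "w3": 1.4}


def cheapest_option(lut, cost):
    """Stand-in router: pick the single cheapest candidate wire."""
    return [min(options[lut], key=cost.get)]


routes, iterations = flow_route(["lutA", "lutB"], cheapest_option, base_cost)
print(routes, iterations)
```

In the first pass both LUTs pick `w1`; the multiplier on `w1` then rises by 0.5, which is enough for `lutB` to switch to `w3` in the second pass while `lutA` stays on `w1`.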
Experimental Results
- FPGA model used
  - Symmetrical-array-based FPGA
  - Each logic block contains four 4-input LUTs and flip-flops
  - Switch connections: Fs = 3, Fc = W
    - Fs: number of connections per wire entering the switch box
    - Fc: number of tracks to which each logic block pin can connect
    - W: number of tracks in a channel
Experimental Results
- Tested on MCNC benchmark circuits
- Results compared with the VPR router
- Required no more routing tracks than VPR, and fewer on three circuits
- Critical path delay improved by up to 28.9% (average 14.1%)
- Total wire length reduced (average 8.3%)
Experimental Results
- Channel width and delay comparison (FR: FlowRoute)

Circuit    LUTs/FFs   Tracks (VPR)   Tracks (FR)   Critical path delay (VPR)   Critical path delay (FR)
9symml     104        10             9             26.7                        25.1 (6.0%)
term1      128        13             12            25.3                        23.3 (7.9%)
apex7      252        13             13            26.1                        21.3 (18.4%)
example2   404        17             16            29.6                        23.2 (21.6%)
alu2       224        17             17            54.7                        49.2 (10.1%)
too-lrg    208        19             19            31.2                        30.2 (3.2%)
vda        456        23             23            46.5                        38.9 (16.3%)
alu4       1560       33             33            143.4                       122.5 (14.6%)
s298       1960       27             27            274.0                       194.7 (28.9%)
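The percentages in parentheses read as relative delay reductions, (VPR - FR) / VPR. The short check below recomputes them from the delay columns; the dictionary and variable names are only for illustration.

```python
# Critical path delays from the comparison table: circuit -> (VPR, FlowRoute).
delays = {
    "9symml": (26.7, 25.1), "term1": (25.3, 23.3), "apex7": (26.1, 21.3),
    "example2": (29.6, 23.2), "alu2": (54.7, 49.2), "too-lrg": (31.2, 30.2),
    "vda": (46.5, 38.9), "alu4": (143.4, 122.5), "s298": (274.0, 194.7),
}

# Relative improvement in percent for each circuit.
improvement = {name: 100.0 * (vpr - fr) / vpr for name, (vpr, fr) in delays.items()}
average = sum(improvement.values()) / len(improvement)
best = max(improvement, key=improvement.get)

for name, pct in improvement.items():
    print(f"{name:10s} {pct:5.1f}%")
print(f"average {average:.1f}%, best {best} {improvement[best]:.1f}%")
```

The recomputed average and maximum agree with the reported 14.1% average and 28.9% best-case improvement.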
Conclusion
- A new congestion-driven routing algorithm for FPGAs
- Finds a feasible routing with minimum total delay, which is expected to reduce critical path delay
- Can be used in a multi-stage routing scheme
Thank You!