CPRE 583 Reconfigurable Computing Lecture 3: Wed 8/31/2011 (Reconfigurable Computing Hardware)
Instructor: Dr. Phillip Jones (phjones@iastate.edu)
Reconfigurable Computing Laboratory, Iowa State University, Ames, Iowa, USA
http://class.ece.iastate.edu/cpre583/
CPRE 583 (Reconfigurable Computing): Reconfigurable Computing Hardware, Iowa State University
Questions From Last Lecture?
Announcements/Reminders
• HW1 is due Friday of next week
  – Try to have it completed by this Friday, since MP1 will be released on Friday
• Start thinking about topics for your mini literature survey (HW2)
• Guest lecturer this Friday (I will be out of town, but should have email access)
Overview
• Logic
• Interconnect/Routing
• Optimized resources
  – Adders, Multipliers
  – Memory
  – System-on-chip building blocks
• Example commercial FPGA structure
What you should learn
• Basic understanding of the major components that make up an FPGA device.
Basic FPGA Architectural Components
• FPGA: Field Programmable Gate Array
• Sea of general purpose logic gates
[Figure: array of Configurable Logic Blocks (CLBs)]
Computational Fabric - LUT
• LUT = Look up Table
• A 4-LUT has inputs A, B, C, D and one output Z; the truth table stored in the LUT decides which function it implements
[Figure: the same 4-LUT programmed with three different truth tables]
  – 4-input AND: Z = 1 only for ABCD = 1111
  – 4-input OR: Z = 0 only for ABCD = 0000
  – 2:1 mux (A unused, B is the select): Z = D when B = 0, Z = C when B = 1
Computational Fabric - LUT
• How many 4-LUTs are needed to OR 32 bits? Draw it.
[Figure: tree of 4-LUTs reducing 32 inputs to 1 output]
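The slide leaves the count as an exercise. As a rough sketch, the Python below counts the LUTs in a reduction tree that groups signals four at a time, level by level. The function names `luts_per_level` and `min_luts_lower_bound` are made up for illustration. The level-by-level grouping is one straightforward drawing, not necessarily the only one, and for some input counts it uses more LUTs than the lower bound.

```python
def luts_per_level(num_inputs: int, lut_size: int = 4) -> list[int]:
    """LUTs used at each level when signals are grouped lut_size at a time."""
    levels = []
    signals = num_inputs
    while signals > 1:
        luts = -(-signals // lut_size)  # ceiling division
        levels.append(luts)
        signals = luts  # each LUT produces one output for the next level
    return levels


def min_luts_lower_bound(num_inputs: int, lut_size: int = 4) -> int:
    """Each k-input LUT removes at most k-1 signals, so ceil((n-1)/(k-1)) are needed."""
    return -(-(num_inputs - 1) // (lut_size - 1))


levels = luts_per_level(32)
print(levels, sum(levels), min_luts_lower_bound(32))
```

For 32 inputs the tree has levels of 8, 2, and 1 LUTs, which matches the lower bound, so no drawing can do better with single-output 4-LUTs.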
Computational Fabric - LUT
• How many 4-LUTs are needed to AND 2 bits with the 32-bit OR? Draw it.
[Figure: tree of 4-LUTs reducing the 32 OR inputs and the 2 extra bits to 1 output]
Computational Fabric - LUT
• How many 4-LUTs are needed to AND 2 bits with the 32-bit OR? Draw it.
• Write out the truth table of the final 4-LUT:
ABCD  Z
0000  0
0001  0
0010  0
0011  0
0100  0
0101  0
0110  0
0111  1
1000  0
1001  0
1010  0
1011  0
1100  0
1101  0
1110  1
1111  1
[Figure: tree of 4-LUTs feeding the final 4-LUT]
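The three rows of the truth table with Z = 1 (0111, 1110, 1111) are consistent with Z = B AND C AND (A OR D). That would be the case if A and D carry two partial results of the 32-bit OR and B and C are the two extra bits. The slide does not label the inputs, so this assignment is an inference. Under that assumption, a small sketch regenerates the table (the name `final_lut` is hypothetical):

```python
from itertools import product


def final_lut(a: int, b: int, c: int, d: int) -> int:
    """Assumed function of the last 4-LUT: AND two bits (B, C) with (A OR D)."""
    return b & c & (a | d)


# Build the 16-row truth table in ABCD order, 0000 first.
table = {
    f"{a}{b}{c}{d}": final_lut(a, b, c, d)
    for a, b, c, d in product((0, 1), repeat=4)
}

for inputs, z in table.items():
    print(inputs, z)

rows_with_one = [inputs for inputs, z in table.items() if z == 1]
```

If the final OR stage and the AND are merged into one LUT in this way, the whole circuit would still fit in 8 + 2 + 1 = 11 LUTs.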
Computational Fabric - LUT
• How could one build a 4-LUT?
  – A 1 x 16 memory holds the truth table
  – A 16:1 mux, with ABCD as its 4 select lines, picks the stored bit that drives Z
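The memory-plus-mux construction can be sketched in software as a 16-entry list indexed by the four inputs. This is only a behavioral model, and the class name `Lut4` and its method are made up for illustration.

```python
class Lut4:
    """Behavioral model of a 4-LUT: a 1 x 16 memory read through a 16:1 mux."""

    def __init__(self, config_bits):
        if len(config_bits) != 16:
            raise ValueError("a 4-LUT needs exactly 16 configuration bits")
        self.memory = list(config_bits)

    def evaluate(self, a: int, b: int, c: int, d: int) -> int:
        # The inputs act as the mux select lines: ABCD forms a 4-bit address.
        address = (a << 3) | (b << 2) | (c << 1) | d
        return self.memory[address]


# Programming the same hardware with different memory contents changes its function.
and4 = Lut4([0] * 15 + [1])   # only address 1111 holds a 1
or4 = Lut4([0] + [1] * 15)    # only address 0000 holds a 0

print(and4.evaluate(1, 1, 1, 1), or4.evaluate(0, 0, 0, 0))
```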
Computational Fabric - LUT
• How many different 4-input functions can a 4-LUT implement?
  – The 1 x 16 memory has 16 configuration bits, so $2^{16} = 65536$
Computational Fabric - LUT
• How many different N-input functions can an N-LUT implement?
  – An N-LUT is a $1 \times 2^N$ memory read through a $2^N$:1 mux
  – Number of functions $= 2^{2^N}$
  – For $N = 4$: $2^{2^4} = 2^{16} = 65536$
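The count can be checked by brute force for small N: every distinct setting of the 2^N memory bits is a distinct truth table. The helper names below are hypothetical.

```python
from itertools import product


def count_functions(n: int) -> int:
    """Closed form: one function per setting of the 2**n configuration bits."""
    return 2 ** (2 ** n)


def enumerate_truth_tables(n: int):
    """List every possible memory content of an n-LUT (feasible only for small n)."""
    return list(product((0, 1), repeat=2 ** n))


for n in (1, 2, 3):
    print(n, len(enumerate_truth_tables(n)), count_functions(n))

print(count_functions(4))
```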
Granularity of Computation
• Trade-offs associated with LUT size
• Example: 2-LUT (4 = 2 x 2 bits) vs. 10-LUT (1024 = 32 x 32 bits)
• Case 1: microprocessor
[Figure: a microprocessor-style datapath with a 4-bit op and 3-bit operands A and B producing a 3-bit result, mapped onto 1024 bits of 2-LUTs vs. 1024-bit 10-LUTs]
Granularity of Computation
• Trade-offs associated with LUT size
• Example: 2-LUT (4 = 2 x 2 bits) vs. 10-LUT (1024 = 32 x 32 bits)
• Case 2: bit logic and constants
  – (A and "1100") or (B or "1000"), with 4-bit A and B
[Figure: the expression mapped onto 2-LUTs, and the same 2-LUT area outlined inside a 1024-bit 10-LUT]
  – With 10-LUTs it's much worse: each 10-LUT has only one output
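To see why fine-grained LUTs suit this kind of bit logic, the expression can be simplified one bit at a time, since the constants fix most of the output bits. The sketch below assumes A and B are 4 bits wide (suggested by the "4" labels in the figure) and checks the per-bit simplification exhaustively. The function names are made up for illustration.

```python
def expression(a: int, b: int) -> int:
    """(A and "1100") or (B or "1000") on 4-bit values."""
    return (a & 0b1100) | (b | 0b1000)


def folded(a: int, b: int) -> int:
    """Same result after folding the constants into each output bit."""
    bit3 = 1                                   # B or 1 is always 1
    bit2 = ((a >> 2) & 1) | ((b >> 2) & 1)     # the only bit that needs a gate
    bit1 = (b >> 1) & 1                        # A is masked off, B passes through
    bit0 = b & 1                               # A is masked off, B passes through
    return (bit3 << 3) | (bit2 << 2) | (bit1 << 1) | bit0


matches = all(
    expression(a, b) == folded(a, b) for a in range(16) for b in range(16)
)
print(matches)
```

After folding, only bit 2 needs a real two-input function, which fits a single 2-LUT. A 10-LUT spends 1024 configuration bits to produce that one output bit.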
Computational Fabric - DFF
• LUTs are fine for implementing any arbitrary combinational logic function (output is ONLY a function of its inputs).
• But what about sequential logic (output is a function of the input AND previous state information)?
• Need memory!!
Computational Fabric - DFF
• DFF = D Flip Flop: the 4-LUT output Z(t) is registered to give Z(t+1)
• Example: detect the pattern "1101"
[Figure: state diagram beginning at Start and tracking the matched prefix (1, 11, ...); each edge is labeled input/output]
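A software sketch of a "1101" detector is given below. It assumes a Mealy machine (the output is produced on the transition, as the input/output edge labels suggest) and it assumes overlapping matches are allowed, which the slide does not state. The state names are hypothetical. In hardware, the state would be held in DFFs and the transition table would be programmed into LUTs.

```python
# (current state, input bit) -> (next state, output bit)
TRANSITIONS = {
    ("START", 0): ("START", 0),
    ("START", 1): ("S1", 0),
    ("S1", 0): ("START", 0),
    ("S1", 1): ("S11", 0),
    ("S11", 0): ("S110", 0),
    ("S11", 1): ("S11", 0),    # "111" still ends in "11"
    ("S110", 0): ("START", 0),
    ("S110", 1): ("S1", 1),    # "1101" seen; its final 1 may start a new match
}


def detect(bits):
    """Return one output bit per input bit; 1 marks the end of a '1101'."""
    state = "START"
    outputs = []
    for bit in bits:
        state, out = TRANSITIONS[(state, bit)]
        outputs.append(out)
    return outputs


print(detect([1, 1, 0, 1, 1, 0, 1]))
```

Four states need two flip-flops, so the next-state and output logic each depend on three signals (two state bits plus the input) and fit comfortably in 4-LUTs.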
Computational Fabric - DFF
• DFF = D Flip Flop
• Increase circuit performance (pipelining)
[Figure: a chain of four 4-LUTs, drawn without and with a DFF after each LUT]
  – Unpipelined chain: 4 LUT delays per output
  – Pipelined chain: 1 DFF delay per output
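The effect on clock period can be illustrated with a little arithmetic. The delay values below (500 ps per LUT, 100 ps per DFF) are invented for the example and do not come from the slides or from any device datasheet. The model also ignores routing delay.

```python
def min_clock_period_ps(luts_per_stage, t_lut_ps, t_dff_ps):
    """Clock period is set by the slowest stage: its LUT chain plus one DFF."""
    return max(luts_per_stage) * t_lut_ps + t_dff_ps


T_LUT_PS = 500   # hypothetical LUT delay
T_DFF_PS = 100   # hypothetical DFF overhead

unpipelined = [4]          # one stage containing all four LUTs
pipelined = [1, 1, 1, 1]   # a DFF after every LUT

for name, stages in (("unpipelined", unpipelined), ("pipelined", pipelined)):
    period = min_clock_period_ps(stages, T_LUT_PS, T_DFF_PS)
    latency = period * len(stages)
    print(name, "period:", period, "ps, latency:", latency, "ps")
```

With these made-up numbers the pipelined version accepts a new input every 600 ps instead of every 2100 ps, at the cost of slightly higher end-to-end latency (2400 ps vs. 2100 ps).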
Communication: Interconnect & Routing
• Need a mechanism to move results of computation around.
[Figure: 3 x 3 array of CLBs shown with each routing style]
  – Nearest neighbor
  – Segmented
  – Hierarchical
Optimized Resources: Dedicated Logic
• LUTs + DFFs can implement any arbitrary digital logic, but not optimally (ASICs make much better use of silicon area for power, speed, and routing resources)
• Arithmetic
  – Add, Multiply
• On-chip memory
• System-on-chip building blocks
  – Processor, PCI-express, Gigabit Ethernet, ADC, etc.
Optimized Resources: Dedicated Logic
Fast Addition
• A two-output LUT generates the propagate (P) and generate (G) logic
• Carry look-ahead
• Dedicated routing resources carry the carry chain between CLBs
[Figure: adder built from 6-LUTs across CLBs, with inputs A1..A3 and B1..B3, outputs Sum1..Sum3, carry in, and carry out]
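As a sketch of the generate/propagate idea, the code below computes G and P per bit (the part a LUT would produce) and then runs the carry recurrence c[i+1] = G[i] or (P[i] and c[i]). It evaluates the carries one after another in a loop, which is the ripple form of the recurrence. A look-ahead unit or a dedicated carry chain evaluates the same recurrence faster in hardware. The function name is hypothetical.

```python
def add_with_generate_propagate(a: int, b: int, width: int = 4, carry_in: int = 0):
    """Add two width-bit numbers using per-bit generate/propagate signals."""
    carry = carry_in
    total = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        generate = a_i & b_i       # this bit creates a carry by itself
        propagate = a_i ^ b_i      # this bit passes an incoming carry along
        total |= (propagate ^ carry) << i
        carry = generate | (propagate & carry)
    return total, carry


print(add_with_generate_propagate(0b1011, 0b0110))
```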
Optimized Resources: Dedicated Logic
Embedded Memory
• 96 bits, 300 MHz
• Dedicated memory block: 18 Kbits, 550 MHz
[Figure: memory with ports labeled 8 and 12]
Optimized Resources: Dedicated Logic
Multiplication: 18 x 18 multiply on Virtex-5 (6-LUTs)
Type                           # LUTs   Latency   Speed
LUT                            ~400     5 clks    380 MHz
Dedicated 18 x 18 multiplier   0        3 clks    450 MHz
Very rough estimate of silicon area comparison (assuming SX95 and LX110 have about the same die size):
[Figure: area of a 6-LUT next to a dedicated 18 x 18 multiplier]
In other words, you can replace one LUT-based 18 x 18 multiplier with 100 dedicated 18 x 18 multipliers!!!
Optimized Resources: Dedicated Logic
Processor
PowerPC hard-core
• 500 MHz
• Superscalar
• High-speed 2 x 5 switch fabric
MicroBlaze soft-core
• 250 MHz
• Simple scalar
Optimized Resources: Dedicated Logic
System on Chip
[Figure: a chip split into dedicated logic and reconfigurable logic; labeled blocks are ADC, RAM, Matrix Multiplier Coprocessor, Ethernet MAC, Data Buffer, and PID Controller, connected to a sensor and a motor]
Also see Actel Fusion: http://www.actel.com/products/fusion/default.aspx
Xilinx CLB Architecture
• Virtex 5 FPGA User Guide
Questions/Comments/Concerns
Computational Fabric - LUT
• N-LUT: 3, 4, … 6, … 8-LUT
  – AND, XOR, NOT
  – Exercises
    • How many 4-LUTs to OR 32 bits (draw)
    • How many 4-LUTs to AND 2 bits with the OR of these 32 bits (draw)
    • Draw the truth table for the 4-LUT that gives the final output
  – How could one implement a LUT (memory + mux)
  – How many ways can a 4-LUT be programmed
  – How many ways can an N-LUT be programmed
• Granularity trade-off: functionality vs. propagation delay (2-LUT -> CPU), bit-level vs. datapath
Computational Fabric - DFF
• Enables building circuits that can store information (sequential circuits, state machines)
• Enables pipelining to increase operating frequency/throughput
Communication: Interconnect & Routing
• Need a mechanism to move the results of a LUT to other LUTs.
• Island style (array of CLBs)
  – Nearest neighbor (paper on a reconfigurable architecture that uses this)
    • Not scalable (large delays, and uses logic elements for routing?)
  – Segmented (different segment lengths for latency trade-off)
    • Multi-hop scales < O(N)?
    • Avoids using logic
  – Hierarchical (good for apps with lots of local communication and little remote communication)
• Typically, an FPGA's silicon area will be 10% logic and 90% interconnect!!
Optimized Resources: Hard Cores
• LUTs + DFFs can implement any arbitrary digital logic, but not optimally (ASICs make much better use of silicon area for power, speed, and routing resources)
• Arithmetic
  – Add, Mult
• On-chip memory
• System-on-chip building blocks
  – Processor, PCI-express, Gigabit Ethernet, A/D