FFT VLSI Implementation
VLSI Signal Processing
ACCESS IC LAB, Graduate Institute of Electronics Engineering, NTU
Department of Electrical Engineering, National Taiwan University, An-Yeu Wu (吳安宇)

  1. Shousheng He and Mats Torkelson, "A new approach to pipeline FFT processor," Proc. IEEE IPPS, pp. 766-770, 1996.
  2. E. Bidet, D. Castelain, C. Joanblanq, and P. Senn, "A fast single-chip implementation of 8192 complex point FFT," IEEE J. Solid-State Circuits, pp. 300-305, March 1995.

FFT Review

Implementation: Two Extreme Methods
  • Speed: slow to fast
  • Area: small to large
  • Control: complicated to simple

Design Consideration
  • System requirements, e.g., speed, area, power
  • To trade off between these two extremes, we need:
      • more processing elements (PEs)
      • a better PE utilization rate
      • a better control scheme

FFT Processor: Block Diagram

Some Current Themes
  • Radix-2 Multi-path Delay Commutator (N = 16)
  • Radix-2 Single-path Delay Feedback (N = 16)

Some Current Themes (cont.)
  • Radix-4 Single-path Delay Feedback (N = 256)
  • Radix-4 Multi-path Delay Commutator (N = 256)
  • Radix-4 Single-path Delay Commutator (N = 256)

Distinctive merits of the above
  • Delay-feedback architectures are more efficient than delay-commutator ones in terms of memory utilization.
  • Radix-4 has higher multiplier utilization; radix-2, however, has simpler butterflies (BF), which are better utilized.

Comparison
  • Radix / speed: low to high
  • Control scheme: simple to complex
  • Processing ability per unit: low to high
  • To combine the advantages: further decompose the high-radix PE

Decompose Method (1)
  • Simply "reuse" the repeated micro unit
  • Figure: a radix-4 PE

Decompose Method (2)
  • From the algorithm level
  • Apply a three-dimensional index map:
      $n = \langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \rangle_N$, where $n_1, n_2 \in \{0, 1\}$ and $n_3 \in \{0, \dots, N/4 - 1\}$
      $k = \langle k_1 + 2 k_2 + 4 k_3 \rangle_N$
  • Summation over $n_1$
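The slide's own equation for the $n_1$ summation did not survive extraction. The step can be sketched as follows, using the standard radix-2^2 derivation from the He and Torkelson paper cited on the title slide. The symbols $W_N = e^{-j 2\pi / N}$ and $B_{N/2}^{k_1}$ (the first butterfly output) are notation assumed here, not taken from the slide.

```latex
\begin{aligned}
X(k_1 + 2k_2 + 4k_3)
  &= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1}
     x\!\left(\tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3\right)
     W_N^{\left(\frac{N}{2} n_1 + \frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)} \\
  &= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1}
     B_{N/2}^{k_1}\!\left(\tfrac{N}{4} n_2 + n_3\right)
     W_N^{\left(\frac{N}{4} n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)},
\end{aligned}
\qquad
B_{N/2}^{k_1}(m) = x(m) + (-1)^{k_1}\, x\!\left(m + \tfrac{N}{2}\right)
```

The second line follows because $W_N^{\frac{N}{2} n_1 (k_1 + 2k_2 + 4k_3)} = (-1)^{n_1 k_1}$, so the $n_1$ sum collapses to a radix-2 butterfly with no multiplier.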

Decompose Method (2) cont.
  • Summation over $n_2$
  • Only real-imaginary swapping and sign inversion are needed
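The reason the $n_2$ summation needs only swapping and sign inversion is that the remaining twiddle factor splits as $W_N^{(\frac{N}{4} n_2 + n_3)(k_1 + 2k_2 + 4k_3)} = (-j)^{n_2 (k_1 + 2k_2)} \, W_N^{n_3 (k_1 + 2k_2)} \, W_{N/4}^{n_3 k_3}$. The following is a minimal numerical check of that identity, assuming the convention $W_N = e^{-j 2\pi / N}$; the helper names `W`, `twiddle_split` and `max_split_error` are hypothetical and not from the slides.

```python
import cmath


def W(N, e):
    """Twiddle factor W_N^e, assuming W_N = exp(-j*2*pi/N)."""
    return cmath.exp(-2j * cmath.pi * e / N)


def twiddle_split(N, n2, n3, k1, k2, k3):
    """Return the twiddle left after the n1 summation in direct and factored form."""
    direct = W(N, (N // 4 * n2 + n3) * (k1 + 2 * k2 + 4 * k3))
    factored = (
        (-1j) ** (n2 * (k1 + 2 * k2))   # trivial part: powers of -j only
        * W(N, n3 * (k1 + 2 * k2))      # the one non-trivial multiplication
        * W(N // 4, n3 * k3)            # kernel of the remaining N/4-point DFT
    )
    return direct, factored


def max_split_error(N):
    """Largest |direct - factored| over every index combination for size N."""
    worst = 0.0
    for n2 in (0, 1):
        for n3 in range(N // 4):
            for k1 in (0, 1):
                for k2 in (0, 1):
                    for k3 in range(N // 4):
                        d, f = twiddle_split(N, n2, n3, k1, k2, k3)
                        worst = max(worst, abs(d - f))
    return worst
```

Calling `max_split_error(16)` should return a value at floating-point rounding level, which is what makes the $(-j)$ factor implementable without a multiplier.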

Graphical Explanation (N = 16)
  • Figure annotation: trivial multiplication

Graphical Explanation (cont.)
  • The equations are equivalent to the operations below

Circuit of BF2I
  • Inputs: Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2)
  • Outputs: Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2)
  • Operating phases: first N/2 cycles, second N/2 cycles
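The two operating phases can be illustrated with a behavioural model of one single-path delay-feedback butterfly. This is only a sketch under stated assumptions: samples are Python complex numbers rather than separate real and imaginary wires, the feedback FIFO starts filled with zeros, and the function name `bf2i_stage` is hypothetical. In the first N/2 cycles the input is stored in the feedback FIFO; in the second N/2 cycles the sum Z(n) = X(n) + X(n+N/2) is emitted and the difference Z(n+N/2) = X(n) - X(n+N/2) is written back to the FIFO, to be emitted during the next first-half phase.

```python
from collections import deque


def bf2i_stage(stream, half):
    """Behavioural model of a radix-2 SDF butterfly with a `half`-deep feedback FIFO.

    Assumption: the FIFO is initialised with zeros, so the first `half`
    outputs are pipeline latency, not data.
    """
    fifo = deque([0j] * half)
    out = []
    for t, x in enumerate(stream):
        fb = fifo.popleft()              # oldest word in the feedback path
        if (t // half) % 2 == 0:
            # First N/2 cycles: bypass. Store the input, emit the stored word
            # (a difference left over from the previous block).
            fifo.append(x)
            out.append(fb)
        else:
            # Second N/2 cycles: butterfly. Emit the sum, store the difference.
            fifo.append(fb - x)
            out.append(fb + x)
    return out


# One 8-sample block followed by 4 zeros to flush the differences out.
samples = [complex(v) for v in range(8)] + [0j] * 4
result = bf2i_stage(samples, half=4)
```

With this input, `result[4:8]` holds the sums and `result[8:12]` holds the differences, which is the sequential order a delay-feedback pipeline produces.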

Circuit of BF2II
  • Inputs: Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2)
  • Outputs: Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2)
  • Extra logic: swap of real and imaginary parts, and sign inversion
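The swap and sign inversion implement multiplication by $-j$, since $(a + jb)(-j) = b - ja$. A minimal sketch, assuming samples are carried as separate (real, imaginary) pairs; the names `times_minus_j` and `bf2ii_butterfly` and the `rotate` control flag are hypothetical, chosen here for illustration.

```python
def times_minus_j(re, im):
    """Multiply (re + j*im) by -j using only a swap and one sign inversion."""
    return im, -re


def bf2ii_butterfly(a, b, rotate):
    """Radix-2 butterfly on (re, im) pairs; `b` is optionally rotated by -j first.

    Returns (a + b', a - b') where b' is b or -j*b. No multiplier is used.
    """
    if rotate:
        b = times_minus_j(*b)
    total = (a[0] + b[0], a[1] + b[1])
    diff = (a[0] - b[0], a[1] - b[1])
    return total, diff


# (1 + 2j) combined with -j * (3 + 4j) = 4 - 3j
total, diff = bf2ii_butterfly((1.0, 2.0), (3.0, 4.0), rotate=True)
```

This is a behavioural sketch only; the actual circuit on the slide folds the sign inversion into the adders and subtractors.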

Radix-2^2 Single-path Delay Feedback
  • FFT architecture using the above technique, for N = 256
  • Compare with the original architecture, for N = 256
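As a software counterpart to the architecture, the following is a minimal recursive sketch of the radix-2^2 decimation-in-frequency decomposition, checked against a direct DFT. It is not the pipelined hardware: it assumes the input length is a power of two, uses $W_N = e^{-j 2\pi / N}$, and the function names `naive_dft` and `fft_r22` are hypothetical. Each recursion level mirrors one BF2I, one trivial $-j$ rotation, one BF2II, and one non-trivial twiddle multiplication.

```python
import cmath


def naive_dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
        for k in range(N)
    ]


def fft_r22(x):
    """Radix-2^2 DIF FFT (recursive sketch); len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    if N == 2:
        return [x[0] + x[1], x[0] - x[1]]
    q = N // 4
    X = [0j] * N
    for k1 in (0, 1):
        for k2 in (0, 1):
            sub = []
            for n3 in range(q):
                # First butterfly (BF2I): sums or differences N/2 apart.
                a = x[n3] + (-1) ** k1 * x[n3 + 2 * q]
                b = x[n3 + q] + (-1) ** k1 * x[n3 + 3 * q]
                # Trivial rotation by -j: swap real/imaginary, invert one sign.
                if k1:
                    b = complex(b.imag, -b.real)
                # Second butterfly (BF2II).
                h = a + (-1) ** k2 * b
                # The only non-trivial multiplication at this level.
                sub.append(h * cmath.exp(-2j * cmath.pi * n3 * (k1 + 2 * k2) / N))
            # The remaining sum over n3 is an N/4-point DFT.
            Y = fft_r22(sub)
            for k3 in range(q):
                X[k1 + 2 * k2 + 4 * k3] = Y[k3]
    return X
```

For example, `fft_r22([1, 0, 0, 0])` gives four values equal to 1, and for a 16-point input the result should agree with `naive_dft` to within floating-point rounding.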

Structural advantage
  • Radix-2^2 has the same multiplicative complexity as radix-4, but still retains the radix-2 BF structure
  • Only every other stage has non-trivial multiplications
  • Control is simple: a synchronization controller and an address counter for W
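The complexity claim can be made concrete with a small resource calculator for the three delay-feedback schemes. The formulas are an assumption here: they are the counts usually quoted from the He and Torkelson paper (complex multipliers, complex adders, and words of feedback memory, for N a power of 4) and should be checked against that paper. The function names `log4` and `sdf_resources` are hypothetical.

```python
def log4(n_points):
    """Integer log base 4; n_points is assumed to be a power of 4."""
    stages = (n_points.bit_length() - 1) // 2
    if 4 ** stages != n_points:
        raise ValueError("n_points must be a power of 4")
    return stages


def sdf_resources(n_points):
    """Resource counts for delay-feedback pipelines (assumed formulas, see lead-in)."""
    s = log4(n_points)
    # Feedback FIFOs of N/2, N/4, ..., 1 words add up to N - 1.
    memory = sum(n_points >> i for i in range(1, 2 * s + 1))
    return {
        "R2SDF": {"multipliers": 2 * (s - 1), "adders": 4 * s, "memory": memory},
        "R4SDF": {"multipliers": s - 1, "adders": 8 * s, "memory": memory},
        "R2^2SDF": {"multipliers": s - 1, "adders": 4 * s, "memory": memory},
    }


table = sdf_resources(256)
```

Under these assumed formulas, for N = 256 the radix-2^2 scheme needs as few multipliers as radix-4 and as few adders as radix-2, with the same N - 1 words of memory.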

Conclusions
  1. FFT applications: radar signal processing, fast convolution, spectrum estimation, OFDM-based modulation/demodulation.
  2. Efficient VLSI architectures (parallel processing) are required for real-time processing.
  3. However, most systems still employ DSP processors (e.g., TI C3x/C5x) for computations (fast algorithms like DIT and DIF FFT).
  4. VLIW (Very Long Instruction Word)-based processors (TI C6x) need new programming skills to utilize the two parallel MAC units.