Design and Implementation of Turbo Decoders for Software Defined Radio
Yuan Lin¹, Scott Mahlke¹, Trevor Mudge¹, Chaitali Chakrabarti², Alastair Reid³, Krisztian Flautner³
¹ Advanced Computer Architecture Lab, University of Michigan
² Department of Electrical Engineering, Arizona State University
³ ARM, Ltd.
www.eecs.umich.edu/~sdrg

Advantages of Software Defined Radio
• Multi-mode operations
• Lower costs
  – Faster time to market
  – Prototyping and bug fixes
  – Chip volumes
  – Longevity of platforms
• Protocol complexity favors software-dominated solutions
• Enables future wireless communication innovations
  – Cognitive radio

SDR Design Objectives for W-CDMA
• Programmable processor
  – Same hardware should support the Turbo decoder as well as other DSP algorithms
• Throughput requirements
  – 2 Mbps
• Power constraints
  – 100 mW to 500 mW

SODA: DSP Processor for SDR

SODA PE SIMD Pipeline

SODA PE SIMD Shuffle Network

SODA PE Scalar Pipeline

Turbo Decoder on SODA
• Most computationally intensive algorithm in W-CDMA
• Hardest algorithm to parallelize
• Implementation outline
  – Max-Log-MAP trellis computation with SIMD operations
  – Parallelizing trellis computations through sliding window
  – Interleaver implementation

Trellis Computation on SODA
• Two types of trellis diagram configurations
  – Blue edges: 0-branch; red edges: 1-branch
• Mapping a trellis of size S onto SODA with SIMD size T

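The trellis mapping above can be illustrated with one forward (alpha) step of Max-Log-MAP on an 8-state trellis. This is a minimal sketch, not the SODA implementation: the polynomials (feedback 1 + D^2 + D^3, parity 1 + D + D^3), the LLR sign convention (positive favors bit 0), and the names `build_trellis` and `forward_step` are assumptions made for illustration. The loop over states stands in for what a SIMD machine would do across lanes, and the write to `next_state[s][u]` stands in for the lane permutation.

```python
NEG_INF = float("-inf")
NUM_STATES = 8  # K = 4 gives 2**(K-1) = 8 states


def build_trellis():
    """Build next-state and parity tables for an 8-state recursive code.

    Assumed polynomials: feedback 1 + D^2 + D^3, parity 1 + D + D^3.
    """
    next_state = [[0, 0] for _ in range(NUM_STATES)]
    parity = [[0, 0] for _ in range(NUM_STATES)]
    for s in range(NUM_STATES):
        # r1 is the newest register bit, r3 the oldest.
        r1, r2, r3 = (s >> 2) & 1, (s >> 1) & 1, s & 1
        for u in (0, 1):
            fb = u ^ r2 ^ r3                 # feedback bit entering the register
            parity[s][u] = fb ^ r1 ^ r3      # parity output for this branch
            next_state[s][u] = (fb << 2) | (r1 << 1) | r2
    return next_state, parity


def forward_step(alpha, sys_llr, par_llr, apriori, next_state, parity):
    """One Max-Log-MAP forward step: add branch metrics, keep the max per state."""
    new_alpha = [NEG_INF] * NUM_STATES
    for s in range(NUM_STATES):
        for u in (0, 1):
            x = 1 - 2 * u                    # bit 0 -> +1, bit 1 -> -1
            p = 1 - 2 * parity[s][u]
            gamma = 0.5 * (x * (sys_llr + apriori) + p * par_llr)
            t = next_state[s][u]
            new_alpha[t] = max(new_alpha[t], alpha[s] + gamma)
    # Normalize so the best state metric is 0 and values stay bounded.
    best = max(new_alpha)
    return [a - best for a in new_alpha]
```

Each state has exactly two incoming branches, so the max in `forward_step` is the two-input compare-select that the 0-branch and 1-branch edges describe.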
Forward Trellis on SODA (S = T)
• Misaligned SIMD operation

Handling SIMD Misalignment

Sliding Window on SODA
• Problem:
  – W-CDMA uses K = 4, which gives an 8-wide trellis
  – SODA has a 32-wide SIMD
• Solution:
  – Parallelize trellis computation by implementing a sliding window
    • fully utilizes the SIMD width
    • achieves higher throughput in the process

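The width mismatch above can be sketched as a packing problem: an 8-state trellis fills only a quarter of a 32-lane vector, so four windows can share one vector. The sketch below is an illustration under assumptions, not the SODA code: the function names, the window length, and the zero padding of unused lanes are hypothetical, and the warm-up computation that sliding-window decoders need at window boundaries is left out.

```python
SIMD_WIDTH = 32   # SODA SIMD lanes (from the slide)
NUM_STATES = 8    # trellis width for K = 4 (from the slide)


def windows_per_vector(simd_width=SIMD_WIDTH, num_states=NUM_STATES):
    """How many independent trellis windows fit side by side in one vector."""
    return simd_width // num_states


def split_into_windows(block_len, window_len):
    """Return (start, end) index pairs that cover trellis steps 0..block_len."""
    return [(s, min(s + window_len, block_len))
            for s in range(0, block_len, window_len)]


def schedule(block_len, window_len):
    """Group windows into batches; each batch is processed in one SIMD pass."""
    wins = split_into_windows(block_len, window_len)
    p = windows_per_vector()
    return [wins[i:i + p] for i in range(0, len(wins), p)]


def pack_lanes(metrics_per_window):
    """Concatenate per-window 8-state metric vectors into one 32-lane vector."""
    vec = []
    for m in metrics_per_window:
        if len(m) != NUM_STATES:
            raise ValueError("each window must supply one metric per state")
        vec.extend(m)
    # Unused lanes (a partially filled last batch) are padded with zeros.
    vec.extend([0.0] * (SIMD_WIDTH - len(vec)))
    return vec
```

For example, `schedule(320, 40)` yields two batches of four windows each, so every SIMD pass in that case uses all 32 lanes.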
Sliding Window Parallelization

Sliding Window on SODA (S < T)

Turbo Decoder System Operations

SODA DMA Modifications
• Traditional DMA controller
  – Designed for block data transfer
  – 1 source and 1 destination address per block
• Modified DMA controller
  – Adds data interleaving functionality to the DMA
  – Needs to handle scalar data transfers
  – 1 source and 1 destination address per scalar

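The difference between the two controllers can be sketched as a contiguous copy versus a per-element scatter. This is a software analogy only, with hypothetical function names; the permutation in the usage below is arbitrary and is not the W-CDMA interleaver pattern.

```python
def block_dma(src, src_addr, dst, dst_addr, length):
    """Traditional transfer: one source and one destination address per block."""
    dst[dst_addr:dst_addr + length] = src[src_addr:src_addr + length]


def interleaving_dma(src, dst, src_addrs, dst_addrs):
    """Modified transfer: one source and one destination address per scalar."""
    for s, d in zip(src_addrs, dst_addrs):
        dst[d] = src[s]


def inverse_permutation(perm):
    """Return inv such that inv[perm[i]] == i, used for de-interleaving."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv
```

With a permutation `perm`, calling `interleaving_dma(data, out, range(len(perm)), perm)` writes `data[i]` to `out[perm[i]]`; repeating the call with `inverse_permutation(perm)` restores the original order, which is the round trip a Turbo decoder makes between its two component decoders.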
Achieved Performance on SODA
[Equation: overall Turbo decoder throughput, annotated with the following terms]
  – SODA operation frequency
  – Number of sliding windows processed in parallel
  – Number of Turbo iterations
  – Cycles for 1 bit of trellis computation = Tblock/L
  – Cycles for one trellis block calculation
  – Size of one dummy trellis block
  – Average number of data memory access cycles for 1 bit of Alpha, Beta and LLC computation
• SODA operates at 400 MHz
• Can achieve 2.08 Mbps with I = 5
• Extrinsic scaling

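Using only the two figures quoted on this slide (a 400 MHz clock and 2.08 Mbps at I = 5), the implied cycle budget can be back-computed. This is plain arithmetic on the reported numbers, not the slide's throughput expression, and the function names are hypothetical.

```python
def cycles_per_decoded_bit(clock_hz, throughput_bps):
    """Clock cycles available for each decoded bit at a given throughput."""
    return clock_hz / throughput_bps


def cycles_per_bit_per_iteration(clock_hz, throughput_bps, iterations):
    """The same budget divided evenly across Turbo iterations."""
    return cycles_per_decoded_bit(clock_hz, throughput_bps) / iterations


budget = cycles_per_decoded_bit(400e6, 2.08e6)
per_iteration = cycles_per_bit_per_iteration(400e6, 2.08e6, 5)
```

The result is roughly 192 cycles per decoded bit, or about 38 cycles per bit for each of the five iterations, and that budget has to cover trellis computation, memory access, and interleaving.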
Conclusion & Future Work
• Implementation summary
  – SODA consumes < 100 mW in 90 nm
  – Meets W-CDMA throughput requirements
  – Hardware features
    • wide SIMD execution
    • SIMD permutation network
    • smart DMA
• Beyond 3G
  – Support for higher-throughput 3G+ protocols
    • Multi-processor SODA for Turbo decoder
  – LDPC decoding

Questions?
• www.eecs.umich.edu/~sdrg