Multi-Split-Row Threshold Decoding Implementations for LDPC Codes
Tinoosh Mohsenin, Dean Truong and Bevan M. Baas
VLSI Computation Lab, ECE Department
University of California, Davis

Outline
  • Introduction to LDPC Decoding
  • Goals and Key Features
  • Split-Row Threshold Decoding Method
  • Multi-Split-Row Threshold Decoder Implementations and Results
  • Conclusion

LDPC Decoding
  • Message passing decoding
  • LDPC decoding challenges
      • High interconnect complexity for a large number of processing nodes
      • Large delay, area, and power dissipation caused by long, global wires

LDPC Decoder Design Goals and Features
  • Key goals
      • Very high throughput and high energy efficiency
      • Area efficient (small circuit area)
      • Well suited for long-length and large row weight LDPC codes
      • Easy implementation with automatic CAD tools
      • Good error performance
  • Split-Row decoding key features
      • Reduced interconnect complexity
      • Reduced processor complexity

T. Mohsenin and B. Baas, "Split-Row: A reduced complexity, high throughput LDPC decoder architecture," in ICCD, 2006.
T. Mohsenin and B. Baas, "High-throughput LDPC decoders using a multiple Split-Row method," in ICASSP, 2007.

Standard MinSum vs. Split-Row Decoding
[Figure: parity-check matrix H for standard MinSum decoding beside the split matrix (Hsplit-sp1) for Split-Row decoding, with check processor C1 divided into C1sp0 and C1sp1 and connected to variable nodes V3, V5, V8 and V10. Annotations: reduction of input wires to check processor; reduction of check processor area.]

Problem with Original Split-Row Algorithm
  • 0.5–0.7 dB error performance loss compared with MinSum Normalized and SPA.
  • In the original MinSum Split-Row, each partition has no information about the minimum value of the other partition.
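The information gap described above can be illustrated with a minimal sketch. It is not the authors' implementation: it works on message magnitudes only, ignores sign handling and the usual exclusion of the target variable's own input, and the function names and example values are hypothetical.

```python
def minsum_row_minimum(magnitudes):
    """Standard MinSum (simplified): one check processor sees every input of the row."""
    return min(magnitudes)


def split_row_minimums(magnitudes, num_partitions):
    """Original Split-Row (simplified): each partition sees only its own slice of the row."""
    size = len(magnitudes) // num_partitions
    return [min(magnitudes[i * size:(i + 1) * size]) for i in range(num_partitions)]


# Hypothetical input magnitudes for one check row of weight 8.
row = [0.9, 0.5, 1.4, 1.1, 0.1, 0.8, 0.6, 1.2]

print(minsum_row_minimum(row))     # 0.1
print(split_row_minimums(row, 2))  # [0.5, 0.1]
```

The first partition reports 0.5 although the true row minimum is 0.1, so it sends out over-confident messages; this kind of overestimate is a plausible source of the error performance loss.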

MinSum Split-Row Threshold Algorithm
  • A signal (Threshold_en) is passed from each partition, indicating whether that partition has a minimum less than a given threshold (T).
  • Check nodes now take as their minimum either their own local Min or T.
  • The optimum threshold value (T) is obtained by empirical simulations.

[Figure: example with Threshold_en from Sp1 = 1 and Threshold_en from Sp0 = 0.]

Mohsenin et al., "An Improved Split-Row Thresholding Decoding Algorithm for LDPC Codes," to appear in IEEE International Conference on Communications (ICC'09).
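The selection rule on this slide can be sketched as follows. This is a simplified reading, not the published algorithm in full: it assumes a partition substitutes T for its local minimum only when its own minimum is not below T and at least one other partition raises Threshold_en, and it ignores signs, the second minimum, and any normalization factor. The function name and input values are hypothetical.

```python
def threshold_minimums(partition_mags, threshold):
    """Return the minimum magnitude each partition of one check row would use."""
    local_mins = [min(mags) for mags in partition_mags]
    # Threshold_en of a partition is 1 when its local minimum is below T.
    threshold_en = [m < threshold for m in local_mins]

    used = []
    for i, local_min in enumerate(local_mins):
        other_en = any(en for j, en in enumerate(threshold_en) if j != i)
        if not threshold_en[i] and other_en:
            # Another partition holds a small minimum: fall back to T (assumed rule).
            used.append(threshold)
        else:
            used.append(local_min)
    return used


# Hypothetical magnitudes for partitions Sp0 and Sp1, with T = 0.2.
print(threshold_minimums([[0.9, 0.5, 1.4, 1.1], [0.1, 0.8, 0.6, 1.2]], 0.2))  # [0.2, 0.1]
```

Here Sp1 raises Threshold_en and Sp0 does not, so Sp0 uses T = 0.2 instead of its local minimum 0.5, which is closer to the true row minimum of 0.1.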

(2048, 1723) (6, 32) 10GBASE-T code
  • Code length = 2048
  • Information length = 1723
  • Row size (number of parity checks) = 384
  • Row weight (Wr) = 32
  • Column weight (Wc) = 6
  • 32/Spn variable nodes
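A short arithmetic check of these parameters, offered as an illustration only. The variable names are hypothetical, and reading "32/Spn" as the number of variable-node inputs per check-processor partition is an assumption.

```python
n, k = 2048, 1723          # code length, information length
rows, wr, wc = 384, 32, 6  # parity checks, row weight, column weight

code_rate = k / n
edges_by_rows = rows * wr  # ones in H counted row by row
edges_by_cols = n * wc     # ones in H counted column by column

# Assumed meaning of 32/Spn: inputs per check-processor partition.
inputs_per_partition = {spn: wr // spn for spn in (2, 4, 8, 16)}

print(round(code_rate, 3))           # 0.841
print(edges_by_rows, edges_by_cols)  # 12288 12288
print(inputs_per_partition)          # {2: 16, 4: 8, 8: 4, 16: 2}
```

Both edge counts agree, so the listed row and column weights are mutually consistent, and at Spn = 16 each partition handles only two inputs per check row.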

Error Performance for (2048, 1723) 10GBASE-T Code
  • MS Split-Row-16 Threshold is 0.22 dB away from MS and 0.12 dB better than Split-Row-2 Original.
  • Threshold (T) = 0.2
  • In the plot:
      • BPSK modulation
      • AWGN channel
      • Maximum 15 iterations
      • Based on 80 error blocks

Delay Analysis for Decoders
  • Path 1: propagation of Threshold_en passing through Spn-2 partitions
  • Path 2: delay path through check and variable processors
  • For small Spn, the interconnect delay is dominant because of wire interconnect complexity.
  • As the number of partitions increases, the Path 1 delay increases.

Area Analysis for Decoders
  • In MinSum, the synthesis area deviates significantly from the layout area due to low utilization.
  • Area breakdown per sub-block for MinSum and Split-16
  • 75% of the MinSum decoder is empty space for wiring.

[Figure: pie charts of area breakdown by sub-block (Check Proc, Var Proc, Clk tree + Regs, Wire (empty space)). MinSum slices: 75% (wire), 11%, 10%, 4%. Split-16 Threshold slices: 43%, 38%, 17%, 2%.]

Comparison of Decoders
(6, 32) (2048, 1723) 10GBASE-T code with 15 decoding iterations. 65 nm, 7M, 1.3 V.

                              MinSum     Split-2     Split-4     Split-8    Split-16    Split-16 Threshold
                              standard   Threshold   Threshold              Threshold   vs. MinSum
Area (mm^2)                   18.2       8.9         5.0         4.5        3.8         4.8x
Speed (MHz)                   17         40          53          112        101         5.9x
Throughput @ 15 iter (Gbps)   2.3        5.5         7.2         15.2       13.8        6x
CAD tool CPU time (hours)     >78        36          18          10         5           >15.6x

Area utilization: 25% for MinSum standard and 98% for Split-16 Threshold (3.9x); the intermediate values listed are 40% and 85%.
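The throughput figures appear consistent with throughput = code length × clock frequency / iterations, which corresponds to one decoding iteration per clock cycle. That relation is an assumption inferred from the numbers, not something the slide states; the sketch below only checks it against the reported values, and its names are hypothetical.

```python
CODE_LENGTH = 2048  # bits per block
ITERATIONS = 15     # decoding iterations per block


def throughput_gbps(clock_mhz, cycles_per_iteration=1):
    """Throughput in Gbps, assuming a block is decoded in ITERATIONS * cycles_per_iteration cycles."""
    blocks_per_second = clock_mhz * 1e6 / (ITERATIONS * cycles_per_iteration)
    return CODE_LENGTH * blocks_per_second / 1e9


# (clock in MHz, throughput in Gbps) as listed in the comparison.
reported = {
    "MinSum standard": (17, 2.3),
    "Split-2 Threshold": (40, 5.5),
    "Split-4 Threshold": (53, 7.2),
    "Split-8": (112, 15.2),
    "Split-16 Threshold": (101, 13.8),
}

for name, (mhz, gbps) in reported.items():
    print(f"{name}: computed {throughput_gbps(mhz):.2f} Gbps, listed {gbps} Gbps")
```

Every computed value lands within 0.1 Gbps of the listed one, which supports the assumed relation to within rounding of the clock frequencies.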

Conclusion
  • The Split-Row Threshold algorithm improves error performance compared with the original Split-Row.
  • Split-Row Threshold allows a high level of partitioning without losing significant error performance.
      • A higher level of partitioning reduces the number of connections between check and variable processors.
      • This results in higher logic utilization and a smaller circuit.
  • We can meet the demands of high-speed applications while obtaining very low area compared with standard decoding.

Acknowledgements
  • Support
      • ST Microelectronics
      • NSF Grant 430090 and CAREER award 546907
      • Intel
      • SRC Grant 1598 and CSR Grant 1659
      • Intellasys
      • UC Micro
      • SEM
  • Special thanks
      • Professor Shu Lin