Radar Pulse Compression Using the NVIDIA CUDA SDK
- Slides: 7
Radar Pulse Compression Using the NVIDIA CUDA SDK
Stephen Bash, David Carpman, and David Holl
HPEC 2008, September 23-25, 2008
MIT Lincoln Laboratory 0839-118-1
This work is sponsored by the Air Force Research Laboratory under Air Force contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the author and not necessarily endorsed by the United States Government.
NVIDIA Compute Unified Device Architecture SDK
• Create custom kernels that run on GPU
• Extension of C language
• Provides driver- and runtime-level APIs
• Includes numerical libraries
  – CUFFT
  – CUBLAS
• $/GFLOP: GPU = $1.27, CPU = $29.18
HPEC 07:
Radar Pulse Compression
• Waveform design and processing to achieve higher range resolution and sensitivity*
• Processing consists of convolution with FIR filter
  – Doppler tolerant (top): traditional frequency domain convolution
  – Doppler intolerant (bottom): additional FFT and Doppler correction required
[Figure: two block diagrams. Top: Fast Time FFT, Replica, Fast Time IFFT. Bottom: Slow Time FFT, Doppler Correction, Fast Time FFT, Replica, Fast Time IFFT.]
* Skolnik, Radar Handbook, Second Edition. McGraw Hill Publishing, Boston, MA, 1990.
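The Doppler-tolerant path (FFT, multiply by replica, IFFT) can be sketched in a few lines. This is a minimal CPU-side illustration in plain Python, not the authors' CUDA/CUFFT implementation. The names `fft`, `pulse_compress`, and `chirp`, the linear-FM test waveform, and the choice of a matched filter (conjugated replica spectrum) as the FIR filter are all assumptions made for the example.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. The inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def pulse_compress(received, replica):
    """Filter `received` with the matched filter of `replica` in the frequency domain."""
    n = len(received)
    padded = list(replica) + [0j] * (n - len(replica))  # zero-pad replica to n samples
    spectrum = fft(received)                             # fast-time FFT
    # Assumption: the filter is a matched filter, i.e. the conjugated replica spectrum.
    matched = [v.conjugate() for v in fft(padded)]
    product = [s * h for s, h in zip(spectrum, matched)]  # multiply by replica
    return [v / n for v in fft(product, inverse=True)]    # fast-time IFFT, scaled by 1/n

def chirp(length):
    """Hypothetical linear-FM test pulse, used only for this illustration."""
    return [cmath.exp(1j * cmath.pi * k * k / length) for k in range(length)]

# Place one echo of a 16-sample chirp at a delay of 20 samples in a 64-sample window.
replica = chirp(16)
delay = 20
received = [0j] * 64
for k, sample in enumerate(replica):
    received[delay + k] = sample

compressed = pulse_compress(received, replica)
peak_index = max(range(len(compressed)), key=lambda i: abs(compressed[i]))
print(peak_index)
```

In this toy case the compressed output peaks at the sample index where the echo begins, which is the range-resolution benefit the slide refers to. The frequency-domain product computes a circular correlation, so a real system would size the FFT to avoid wrap-around.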
GPU vs. CPU Comparison
• CPU vs. GPU comparison in real-world conditions
  – 2 GHz dual quad-core AMD Opterons vs. EVGA GeForce 8800 Ultra
  – Memory transfer to and from GPU included in timing
[Figure: "1D FFT". GPU speedup as a function of FFT size (1024 to 65536) and batch size.]
[Figure: "Processing Time Per Stage". Stage time (ms) for CPU and GPU across the processing stages: Doppler window, Doppler FFT, Doppler correction, fast-time FFT, replica multiplies, fast-time IFFTs, and range region extraction.]
Backups
Reference: $/GFLOP
As of July 2007, these products represent the top-of-the-line consumer CPU and graphics card according to floating-point computational power:
1. Kentsfield Core 2 Extreme QX6800
   – 37.7 GFLOPS – fastest CPU as of 7/16/2007
     http://www.tomshardware.com/2007/07/16/cpu_charts_2007/page36.html
   – $1100 – price as of March 10, 2008
     http://www.google.com/products?q=Kentsfield+Core+2+Extreme+QX6800
   – $/GFLOPS = $29.18
   – Notes: Price excludes motherboard + power supply + memory + GPU
2. EVGA GeForce 8800 Ultra Superclocked (NVIDIA)
   – 576 GFLOPS – theoretical peak
     http://en.wikipedia.org/wiki/GeForce_8_Series
   – $730 – price as of March 10, 2008
     http://www.google.com/products?q=768-P2-N887-AR&scoring=p
   – $/GFLOPS = $1.27
   – Notes: Price includes 768 MB GDDR3 memory, but excludes motherboard + power supply + CPU
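The two $/GFLOPS figures follow directly from the prices and throughput numbers quoted above. The short check below reproduces them; the function name `dollars_per_gflops` is a hypothetical helper introduced only for this example. Note that, per the slide, the GPU throughput is a theoretical peak while the CPU figure is described as a measured chart result, so the comparison is indicative rather than like-for-like.

```python
def dollars_per_gflops(price_usd, gflops):
    """Cost per unit of floating-point throughput (hypothetical helper)."""
    return price_usd / gflops

# Figures as quoted on the slide.
cpu = dollars_per_gflops(1100, 37.7)  # Core 2 Extreme QX6800
gpu = dollars_per_gflops(730, 576)    # GeForce 8800 Ultra, theoretical peak

print(f"CPU: ${cpu:.2f}/GFLOPS")  # CPU: $29.18/GFLOPS
print(f"GPU: ${gpu:.2f}/GFLOPS")  # GPU: $1.27/GFLOPS
```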