The Current Challenges in Data Flow Supercomputer Programming





- Slides: 5

The Current Challenges in DataFlow Supercomputer Programming
Tel Aviv, June 24, 2013

A Classification of Supercomputer Systems

Intel Nehalem E5520 Quad-core CPU
Computation Capacity: # cores 4; clock frequency 2.27 GHz; peak performance 36 GFLOPs; power 80 W
Memory Capacity: L1 cache size 128 kB, bandwidth 291 GB/s; memory size not limited, bandwidth 25 GB/s

Cell/B.E
Computation Capacity: # cores 1+8 hetero; clock frequency 3.2 GHz; peak performance 230.4 GFLOPs; power 135 W
Memory Capacity: CPU cache size 512 kB, bandwidth 44 GB/s; local store size 8*256 KB, bandwidth 204.8 GB/s; memory size 16 GB, bandwidth 25 GB/s

ClearSpeed CSX700
Computation Capacity: # cores 2+192 hetero; clock frequency 250 MHz; peak performance 96 GFLOPs; power 11.4 W
Memory Capacity: CPU cache size 24 KB; local store size 2*128 KB, bandwidth 192 GB/s; memory size 2*8 GB, bandwidth 2*4 GB/s; bandwidth to host 4 GB/s

SGI RASC Accelerator board (2 x Virtex 4 LX200), max 120 W
Computation Capacity: # LUTs / # FFs 200448 x 2; # DSP48E 96 x 2; clock frequency 200 MHz; peak performance 47 GFLOPs; power 120 W
Memory Capacity: Block RAMs # 336, size 0.7 MB; on-board memory size 40 MB, bandwidth 16 GB/s; bandwidth to host 6.4 GB/s

Maxeler Max2 FPGA Acceleration Card
Computation Capacity: # LUTs / # FFs 414720; # DSP48Es 384; clock frequency 150 MHz; peak performance 116 GFLOPs
Memory Capacity: Block RAMs # 648, size 2.8 MB, bandwidth 1519 GB/s; on-board memory size 12 GB, bandwidth 28 GB/s; bandwidth to host 4 GB/s

Convey coprocessor HC-1
Computation Capacity: # LUTs / # FFs 4*207360; # DSP48E 4*192; clock frequency n/a; peak performance 80 GFLOPs
Memory Capacity: Block RAMs # 4*288, size 4*1.25 MB, bandwidth n/a; on-board memory size 8 GB, bandwidth 80 GB/s; bandwidth to host 1066 MT/s

Power for the Maxeler Max2 and Convey HC-1 entries: 100 W, 55 W

NVidia GTX 580
Computation Capacity: # multiprocessors 16; # cores 512; clock frequency 1.54 GHz; peak performance 1.58 TFLOPs; power 244 W
Memory Capacity: shared memory size 768 KB, bandwidth N/A; on-board memory size 6 GB, bandwidth 192.4 GB/s; bandwidth to host 8 GB/s

AMD ATI HD 5870
Computation Capacity: # multiprocessors 20; # cores 1600; clock frequency 850 MHz; peak performance 2.72 TFLOPs; power 188 W
Memory Capacity: shared memory size 640 KB, bandwidth 2176 GB/s; on-board memory size 6 GB, bandwidth 153.6 GB/s; bandwidth to host 8 GB/s
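One way to compare the systems listed above is peak performance per watt. The short Python sketch below is an illustration only, not part of the original slides: it takes the peak performance and power figures of the six systems for which both values can be read off the listing (the two CPUs, ClearSpeed, SGI RASC, and the two GPUs) and ranks them by GFLOPs per watt. The dictionary `systems` and the helper `gflops_per_watt` are hypothetical names introduced here, and peak figures are vendor maxima, so the ranking says nothing about sustained performance.

```python
# Peak performance (GFLOPs) and power (W), copied from the listing above.
# Illustrative sketch: names and structure are assumptions, not from the slides.
systems = {
    "Intel Nehalem E5520": (36.0, 80.0),
    "Cell/B.E": (230.4, 135.0),
    "ClearSpeed CSX700": (96.0, 11.4),
    "SGI RASC (2 x Virtex 4 LX200)": (47.0, 120.0),
    "NVidia GTX 580": (1580.0, 244.0),   # 1.58 TFLOPs
    "AMD ATI HD 5870": (2720.0, 188.0),  # 2.72 TFLOPs
}


def gflops_per_watt(peak_gflops, power_watts):
    """Return peak performance per watt for one system."""
    return peak_gflops / power_watts


efficiency = {name: gflops_per_watt(*spec) for name, spec in systems.items()}

# Print the systems from most to least efficient.
ranking = sorted(efficiency.items(), key=lambda item: item[1], reverse=True)
for name, value in ranking:
    print(f"{name:32s} {value:7.2f} GFLOPs/W")
```

The same calculation could be repeated with memory bandwidth in place of power to compare how many bytes per second each system can deliver per GFLOP of peak compute.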

Table of Contents
0. Classification.pptx
1. Anegdotic.pptx
2. MaxelerAlabamaSlidesWoVeljkoFinal.pptx
3. Maxeler-examples1.pptx
4. 01_Introduction.pptx
5. 02_ProgrammingMaxCompiler.pptx
6. 03_MoreMaxCompiler.pptx
7. 04_Numerics.pptx
8. 05_Scheduling.pptx
9. 06_LoopsAndCyclicGraphs.pptx
10. 07_ElementaryFunctions.pptx
11. Maxeler-examples.pptx
12. StudentsWorldwide.pptx
13. AlgGrossPitaevskii-real.pptx
14. paperCACM.pdf
15. Discusion.pptx