Rapid Exploration of Accelerator-Rich Architectures: Automation from Concept to Prototyping

  • Slides: 19

Rapid Exploration of Accelerator-Rich Architectures: Automation from Concept to Prototyping
David Brooks, Jason Cong, Zhenman Fang, Yakun Sophia Shao, and Sam Xi
Harvard University & UCLA

Tutorial Outline

Time | Topic | Speaker
8:30 am – 9:00 am | Accelerator Research Infrastructure Overview | Sophia Shao
9:00 am – 9:30 am | Aladdin: Accelerator Pre-RTL Modeling | Sophia Shao
9:30 am – 10:00 am | Rapid Hardware Specialization with HLS: Glass Half Full | Prof. Zhiru Zhang
10:00 am – 10:30 am | PARADE: HLS-Based Accelerator-Rich Architecture Simulation | Zhenman Fang
10:30 am – 11:00 am | Break
11:00 am – 11:30 am | gem5-Aladdin: Accelerator System Co-Design | Sam Xi
11:30 am – 12:00 pm | ARAPrototyper: FPGA Prototyping | Zhenman Fang
12:00 pm – 1:30 pm | Lunch
1:30 pm – 2:00 pm | Virtual Machine Setup | Sophia Shao & Sam Xi
2:00 pm – 2:30 pm | Hands-on: Accelerator Design Space Exploration using Aladdin | Sophia Shao
2:30 pm – 3:00 pm | Hands-on: SoC Design Space Exploration using gem5-Aladdin | Sam Xi

Moore’s Law

CMOS Scaling is Slowing Down
Process nodes: 180 nm, 130 nm, 90 nm, 65 nm, 45 nm, 32 nm, 22 nm, 14 nm, 10 nm
Source: http://www.anandtech.com/show/9447/intel-10nm-and-kaby-lake

CMOS Technology Scaling: Technological Fallow Period

Potential for Specialized Architectures
Figure labels:
  • 16: Encryption
  • 17: Hearing Aid
  • 18: FIR for disk read
  • 19: MPEG Encoder
  • 20: 802.11 Baseband
[Zhang and Brodersen]

Cores, GPUs, and Accelerators: Apple A8 SoC
Out-of-Core Accelerators

Cores, GPUs, and Accelerators: Apple A8 SoC
Out-of-Core Accelerators
Figure labels: Maltiel Consulting estimates; Our estimates

Challenges in Accelerators
  • Flexibility – Fixed-function accelerators are only designed for the target applications.
  • Programmability – Today’s accelerators are explicitly managed by programmers.

Today’s SoC
OMAP 4 SoC

Today’s SoC
OMAP 4 SoC
Diagram labels: ARM Cores, Audio DSP, Video DSP, Face Imaging, GPU, DMA, USB, SD, System Bus, Secondary Bus, Tertiary Bus

Challenges in Accelerators
  • Flexibility – Fixed-function accelerators are only designed for the target applications.
  • Programmability – Today’s accelerators are explicitly managed by programmers.
  • Design Cost – Accelerator (and RTL) implementation is inherently tedious and time-consuming.

Today’s SoC
Diagram labels: CPU, Buses, Mem Interface, GPU/DSP, Acc, Acc, Acc

Future Accelerator-Centric Architectures
Diagram labels: Big Cores, Small Cores, GPU/DSP, Shared Resources, Memory Interface, Sea of Fine-Grained Accelerators
  • How to decompose applications into accelerators?
  • How to rapidly design lots of accelerators?
  • How to design and manage the shared resources?
Challenges: Flexibility, Design Cost, Programmability

PARADE: Platform for Accelerator-Rich Architectural Design & Exploration [ICCAD 15]
  • Extended gem5 (McPAT) for X86 CPU, with OS
  • Auto-generated accelerators based on HLS (AutoPilot)
  • Added SPM, DMA, GAM & TLB model
  • Extended Garnet (DSENT) for NoC
  • Extended Ruby (CACTI) for coherent cache hierarchy
  • gem5 memory model [ISPASS 14]

ARAPrototyper: Prototyping an ARA on FPGA
  – Using Xilinx Zynq SoC (FPGA fabrics + ARM)
  • Major components of an ARA
    – General processor cores
    – A sea of heterogeneous accelerators
    – Memory system + interconnects (NoC)

Contributions
  • WIICA: Accelerator Workload Characterization [ISPASS’13]
  • MachSuite: Accelerator Benchmark Suite [IISWC’14]
  • Aladdin: Accelerator Pre-RTL, Power-Performance Simulator [ISCA’14, Top Picks’15]
  • Accelerator Design w/ High-Level Synthesis [ISLPED’13_1]
  • gem5-Aladdin: Accelerator-System Co-Design [MICRO’16]
Diagram labels: Big Cores, Small Cores, GPU/DSP, Shared Resources, Memory Interface, Sea of Fine-Grained Accelerators
