CSC 2231 Parallel Computer Architecture and Programming GPUs

CSC 2231: Parallel Computer Architecture and Programming GPUs
Prof. Gennady Pekhimenko, University of Toronto, Fall 2017
The content of this lecture is adapted from the slides of Tor Aamodt (UBC)

Project Progress Report
• Due next week Friday (Nov. 3rd)
• Ask questions after the class

Review #7
• "GPUs and the Future of Parallel Computing", Steve Keckler et al., IEEE Micro 2011
• Due Nov. 10

Review #5 Results
[Figure: histogram of grades (out of 10), with bins for 6s, 7s, 8s, 9s and 10s. Mean: 9.05]

What is a GPU?
• GPU = Graphics Processing Unit
  – Accelerator for raster-based graphics (OpenGL, DirectX)
  – Highly programmable (Turing complete)
  – Commodity hardware
  – 100s of ALUs; 10s of 1000s of concurrent threads
[Figure: NVIDIA Volta V100]

The GPU is Ubiquitous
[Figure from the APU13 keynote]

“Early” GPU History
  – 1981: IBM PC Monochrome Display Adapter (2D)
  – 1996: 3D graphics (e.g., 3dfx Voodoo)
  – 1999: register combiner (NVIDIA GeForce 256)
  – 2001: programmable shaders (NVIDIA GeForce 3)
  – 2002: floating-point (ATI Radeon 9700)
  – 2005: unified shaders (ATI R520 in Xbox 360)
  – 2006: compute (NVIDIA GeForce 8800)

GPU: The Life of a Triangle [David Kirk / Wen-mei Hwu]
• Host / Front End / Vertex Fetch: process commands
• Vertex Processing: transform vertices to screen-space
• Primitive Assembly, Setup: generate per-triangle equations
• Rasterize & Zcull: generate pixels, delete pixels that cannot be seen
• Pixel Shader, Texture: determine the colors, transparencies and depth of the pixel
• Pixel Engines (ROP): do final hidden surface test, blend and write out color and new depth
• Frame Buffer Controller
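The setup, rasterize and depth-test stages can be illustrated with a small software sketch. This is not how GPU hardware implements them; it is a minimal Python illustration that assumes the vertices are already in screen space, uses one constant depth per triangle, and tests every pixel of the frame. The names `edge` and `rasterize` are hypothetical helpers introduced here.

```python
def edge(ax, ay, bx, by, px, py):
    """Edge equation: positive when point P lies to the left of edge A->B."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, depth, color, width, height, zbuf, frame):
    """Rasterize one screen-space triangle with a depth test.

    tri   : three (x, y) vertices, already transformed to screen space
    depth : a single depth for the whole triangle (a simplification;
            real pipelines interpolate depth per pixel)
    zbuf  : height x width list of stored depths (smaller means closer)
    frame : height x width list of stored colors
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    # Setup: the sign of the area tells us the winding order.
    area = edge(x0, y0, x1, y1, x2, y2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            # Hidden surface test: keep the pixel only if it is closer
            # than whatever is already stored at this location.
            if inside and depth < zbuf[y][x]:
                zbuf[y][x] = depth
                frame[y][x] = color
```

To use it, initialise `zbuf` to `float("inf")` everywhere and `frame` to a background value, then call `rasterize` once per triangle. A triangle drawn later only overwrites pixels where it is closer than what is already there.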

Pixel color: the result of running a “shader” program
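One way to picture this is a small function that is run once per pixel, with each call independent of the others. The sketch below is plain Python, not a real shading language; `shade` and `render` are hypothetical names, and the gradient it computes is an arbitrary example that assumes a frame at least 2 pixels wide and tall.

```python
def shade(x, y, width, height):
    """Hypothetical shader: compute an (r, g, b) color for one pixel."""
    r = int(255 * x / (width - 1))   # red grows left to right
    g = int(255 * y / (height - 1))  # green grows top to bottom
    return (r, g, 0)


def render(width, height, shader):
    """Run the shader once per pixel; no call depends on any other."""
    return [[shader(x, y, width, height) for x in range(width)]
            for y in range(height)]
```

Because no pixel's color depends on another pixel's result, the calls could in principle run concurrently, which is the property that lets a GPU keep many threads busy at once.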

Why use a GPU for computing?
• A GPU uses a larger fraction of its silicon for computation than a CPU.
• At peak performance, a GPU uses an order of magnitude less energy per operation than a CPU.
  – CPU: 2 nJ/op
  – GPU: 200 pJ/op
• Rewriting the application for the GPU can make it an order of magnitude more energy efficient.
• However, the application must perform well.
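The order-of-magnitude figure follows directly from the two per-operation numbers on the slide. A minimal sketch of the arithmetic, assuming both processors run at peak and the operation count is the same on each (the function and constant names are hypothetical):

```python
CPU_JOULES_PER_OP = 2e-9     # 2 nJ/op, from the slide
GPU_JOULES_PER_OP = 200e-12  # 200 pJ/op, from the slide


def energy_joules(num_ops, joules_per_op):
    """Total energy for num_ops operations at a fixed cost per operation."""
    return num_ops * joules_per_op


def gpu_energy_advantage():
    """Ratio of CPU to GPU energy per operation at peak."""
    return CPU_JOULES_PER_OP / GPU_JOULES_PER_OP
```

For example, 10^12 operations would cost about 2000 J on the CPU and about 200 J on the GPU under these assumptions. An application that does not reach peak performance on the GPU would see a smaller advantage.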

GPU uses larger fraction of silicon for computation than CPU?
[Figure: chip-area diagrams of a CPU and a GPU, with blocks labeled Control, ALU, Cache and DRAM. Source: NVIDIA]
