Programming with CUDA and Parallel Algorithms
Waqar Saleem, Jens Müller

Organization
• People
  • Waqar Saleem, waqar.saleem@uni-jena.de
  • Jens Mueller, jkm@informatik.uni-jena.de
  • Room 3335, Ernst-Abbe-Platz 2
• The course will be conducted in English
• 6 points
  • Elective / compulsory elective (Wahl/Wahlpflicht)
  • Theoretical/Practical

Organization
• Meetings, before winter break
  • Tue 12-14, CZ 129
  • Thu 16-18, CZ 129
    • Every second week
    • Starting next week
  • Exercises: Wed 8-10, CZ 125
    • Starting tomorrow in the pool

The course
• 2 parts
• Before winter break: Lectures and assignments
  • Need at least 50% in assignments to qualify for ...
• After the break: Group projects
  • Project chosen by or assigned to each group
  • Regular meetings
  • Presentation of each project at the end of the semester

Assignments
• Build up a minimal ray tracer on GPU
  • Implement basic ray tracer on CPU
  • Port to GPU
  • Make ray tracer more interesting/efficient
  • Utilize CUDA concepts
• Basic framework will be provided
  • Scene format and scenes
  • Introduction to ray tracing concepts

Requirements
• Strong background in C programming
• Familiarity with your OS
• Modifying default settings
• Writing/understanding Makefiles
• Compiler flags and options

Course content
• Parallel programming models and platforms
• GPGPU
• GPGPU on NVIDIA cards: CUDA
• Architecture and programming model
• OpenCL

Today
• Organization
• Brief introduction to parallel programming and CUDA
• Short introduction to ray tracing

Growth of Compute Capability
• Moore’s law: the number of transistors that can be placed ... on an integrated circuit [doubles] approximately every two years (source: wikipedia)
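
Read as a formula, the statement above describes exponential growth with a two-year doubling period. A rough sketch, assuming the doubling is exact and writing N_0 for the transistor count in a reference year t_0 (both symbols are introduced here for illustration and do not appear on the slides):

```latex
N(t) = N_0 \cdot 2^{(t - t_0)/2}, \qquad t,\ t_0 \text{ in years}
```

Under that assumption, ten years correspond to five doublings, a factor of 2^5 = 32.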

Growth of Compute Capability
• Moore’s law
[Figure omitted; source: wikipedia]

Need for increasing compute capability
• Problems are getting more complex
  • e.g. text editing to image editing to video editing
• Current hardware complexity is never enough
• Impractical to stop development at the current state of the art

Barriers to growth
• Natural limit on transistor size: the size of an atom
• More transistors per unit area lead to higher power consumption and heat dissipation

Solution: Parallel architectures

Parallel architectures
• Multiple Instructions Multiple Data (MIMD)
  • multi-threaded, multi-core architectures, clusters, grids
• Single Instruction Multiple Data (SIMD)
  • Cell processor, GPUs, clusters, grids
  • GPU: Graphics Processing Unit
• Parallel programming lets us write programs for parallel architectures

GPU
• Simpler architecture than MIMD
  • Little overhead for instruction scheduling, branch prediction etc.
Subsequent figures from NVIDIA CUDA Programming Guide 2.3.1 unless mentioned otherwise

GPU architecture
• Simpler architecture leads to higher performance (compared to CPUs)

General Purpose computing on GPU, GPGPU
• Attractive because of raw GPU power
• Traditionally hard because GPU programming was closely tied to graphics
• Simplicity of GPU architecture limits the kinds of problems suitable for GPGPU
  • or at least requires some problems to be reformulated

GPGPU for the masses*
• Freeing the GPU from graphics: Nvidia CUDA, ATI Stream
• C-like programming interface to the GPU
* knowledge of underlying architecture required to achieve peak performance

Freeing Parallel Programming
• OpenCL: code once, run anywhere
  • single core, multi core, GPU, ...
  • platform details transparent to the user
• supported by major vendors: Apple, Intel, AMD, Nvidia, ...
• OpenCL drivers made available by ATI and Nvidia for their cards

This course
• chiefly CUDA: Nvidia specific, mature, well documented, easily available literature
• some OpenCL: open standard, very new, limited documentation available, very similar concepts to CUDA
• no ATI Stream

CUDA, Compute Unified Device Architecture
• Software: C-like programming interface to the GPU
• Hardware: the hardware that supports the above programming model

CUDA hardware model

CUDA programming model
• CPU=host, GPU=device, work unit=thread
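
The host/device/thread vocabulary can be tried out in plain C before touching a GPU. The sketch below is an emulation under stated assumptions, not CUDA API: `add_one_element` and `launch_on_host` are hypothetical names, and an ordinary CPU loop stands in for the device running one thread per array element.

```c
#include <assert.h>
#include <stddef.h>

/* Work done by one "thread": thread i touches element i only, so no
   two threads ever write to the same location. */
static void add_one_element(size_t thread_id,
                            const float *a, const float *b, float *c)
{
    c[thread_id] = a[thread_id] + b[thread_id];
}

/* Host-side stand-in for launching n threads on the device.
   Here the n calls run one after another on the CPU; this is an
   emulation for illustration, not how a GPU schedules work. */
void launch_on_host(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; ++i)
        add_one_element(i, a, b, c);
}
```

In real CUDA code the loop disappears: the per-thread body becomes a kernel, and the device supplies each thread with its own index.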

Ray tracing
• A method to render a given scene
• Cast rays from a camera into the scene
• Compute ray intersections with scene geometry
• Render pixel image
image source: wikipedia

Ray tracer complexity
• A ray tracer can be arbitrarily complex
• Recursively compute intersections for reflected, refracted and shadow rays
• Account for diffuse lighting
• Consider multiple light sources
• Consider light sources other than point lights
• Account for textures: object materials

Coding a ray tracer
• Relatively easy to code on the CPU
• Call the same intersection function recursively on secondary rays
• CPU code is not so complex
• Tricky to code on the GPU as recursion is not yet supported in GPGPU models
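
One common way around missing recursion is to unroll it into a loop that carries a running weight. The sketch below shows this for a chain of reflection rays only. It is an illustration under stated assumptions, not the course's solution: `Hit`, `toy_intersect`, `BACKGROUND` and both trace functions are hypothetical, and `toy_intersect` replaces a real scene intersection with a fixed table so that the two versions can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result of intersecting one ray with the scene. */
typedef struct {
    bool  hit;           /* did the ray hit anything?               */
    float local;         /* locally shaded intensity at the hit     */
    float reflectivity;  /* weight of the light from the reflection */
} Hit;

#define BACKGROUND 0.1f

/* Toy stand-in for a real intersection routine: the ray of bounce k
   hits a fixed surface for k < 3 and escapes to the background after. */
static Hit toy_intersect(int bounce)
{
    static const Hit table[3] = {
        { true, 0.5f, 0.5f }, { true, 0.4f, 0.25f }, { true, 0.8f, 0.5f }
    };
    assert(bounce >= 0);
    if (bounce < 3)
        return table[bounce];
    return (Hit){ false, 0.0f, 0.0f };
}

/* CPU style: the function calls itself on the secondary ray. */
float trace_recursive(int bounce, int max_depth)
{
    if (bounce >= max_depth)
        return 0.0f;
    Hit h = toy_intersect(bounce);
    if (!h.hit)
        return BACKGROUND;
    return h.local + h.reflectivity * trace_recursive(bounce + 1, max_depth);
}

/* Loop style: same sum, with the product of reflectivities so far
   kept in `weight` instead of on the call stack. */
float trace_iterative(int max_depth)
{
    float color  = 0.0f;
    float weight = 1.0f;
    for (int bounce = 0; bounce < max_depth; ++bounce) {
        Hit h = toy_intersect(bounce);
        if (!h.hit) {
            color += weight * BACKGROUND;
            break;
        }
        color  += weight * h.local;
        weight *= h.reflectivity;
    }
    return color;
}
```

This works because each hit spawns at most one secondary ray. A tracer that follows both a reflected and a refracted ray per hit branches, and would need an explicit stack or a similar scheme instead of a single loop.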

This course
• Build a trivial ray tracer on the CPU
  • compute view rays only
  • part of tomorrow’s exercise
• Port to GPU
• Add complexity to your GPU ray tracer
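
As a starting point for the view-ray step, here is a minimal sketch of generating one ray per pixel. The camera setup is an assumption made for illustration and is not taken from the course framework: the camera sits at the origin, looks down the negative z axis, and the image plane lies at z = -1. `Vec3`, `Ray` and `view_ray` are hypothetical names.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;
typedef struct { Vec3 origin, dir; } Ray;

/* View ray through the centre of pixel (px, py) of a width x height
   image. Assumed camera: at the origin, looking down -z, image plane
   at z = -1 spanning [-aspect, aspect] x [-1, 1]. Pixel (0, 0) is the
   top-left corner. The direction is not normalized. */
Ray view_ray(int px, int py, int width, int height)
{
    assert(width > 0 && height > 0);
    assert(px >= 0 && px < width && py >= 0 && py < height);

    float aspect = (float)width / (float)height;
    float u = ((px + 0.5f) / width) * 2.0f - 1.0f;   /* -1 left, +1 right */
    float v = 1.0f - ((py + 0.5f) / height) * 2.0f;  /* +1 top, -1 bottom */

    Ray r = { { 0.0f, 0.0f, 0.0f }, { u * aspect, v, -1.0f } };
    return r;
}
```

Looping over all `px` and `py` and calling `view_ray` for each pixel gives the full set of primary rays. Each pixel is independent of the others, which is what makes the later port to the GPU natural.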

Reminders
• Exercise session tomorrow
• Register on CAJ

See you next time!