Next-Generation Graphics APIs: Similarities and Differences
Tim Foley, NVIDIA Corporation

Next-Generation Graphics APIs
• Vulkan, D3D12, and Metal
• Coming to platforms you care about
• Why do we want new APIs?
• How are they different?

Why new Graphics APIs?
• Reduce CPU overhead/bottlenecks
• More stable/predictable driver performance
• Explicit, console-like control

CPU Bottlenecks
• Only single app thread creating GPU work
• Can become bottleneck for complex scenes
• Try to do as little on this thread as possible
• Multi-threaded driver helps a bit
• New APIs: multi-threaded work creation

Driver Overhead, Predictability
• App submits a draw call, maps a buffer, etc.
• Driver might
  • Compile shaders
  • Insert fences into GPU schedule
  • Flush caches
  • Allocate memory

Explicit, Console-Like Control
• Explicit synchronization
  • CPU/GPU sharing, RMW hazards, etc.
• Explicit memory management
  • Allocate large memory region at load time
  • Handle sub-allocation in application code
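A minimal sketch of the "one large region, sub-allocate in the app" idea from the slide above. The class name `LinearSubAllocator` and its methods are hypothetical; offsets stand in for GPU addresses, and no real graphics API is called. It is a bump allocator, the simplest possible policy, not a recommendation for production use.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical bump allocator over one large region reserved at load time.
// Offsets stand in for GPU addresses; nothing here touches a real graphics API.
class LinearSubAllocator {
public:
    explicit LinearSubAllocator(std::size_t capacity) : capacity_(capacity) {}

    // Returns the offset of a block of `size` bytes aligned to `alignment`
    // (which must be a power of two), or std::nullopt if the region is full.
    std::optional<std::size_t> allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t aligned = (head_ + alignment - 1) & ~(alignment - 1);
        if (aligned > capacity_ || size > capacity_ - aligned) {
            return std::nullopt;
        }
        head_ = aligned + size;
        return aligned;
    }

    // Frees everything at once, e.g. when a level is unloaded.
    void reset() { head_ = 0; }

    std::size_t used() const { return head_; }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;
};
```

Individual blocks cannot be freed in this sketch; an engine that needs that would swap in a free-list or buddy scheme behind the same interface.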

This Talk
• Bootstrap your mental model
• Introduce concepts shared across APIs
• Point out major differences
• Try to hand-wave the small ones

Big Topics
• Command buffers
• Pipeline state objects
• Tiling
• Resources / Binding
• Hazards / Lifetime
D3D12, Metal, Vulkan

Command Buffers
D3D12, Metal, Vulkan

[Diagram sequence: Single-Threaded Submission; Writing to a Command Buffer; Submitting a Command Buffer; Start Writing to a New Buffer; CPU-GPU Asynchrony. Elements shown: CPU Thread, driver, cmd, Queue, GPU Front End.]

Command Buffers and Queues

API      Command buffer       Queue
D3D12    ID3D12CommandList    ID3D12CommandQueue
Metal    MTLCommandBuffer     MTLCommandQueue
Vulkan   VkCmdBuffer          VkCmdQueue

Recording and Submitting
• Record commands into command buffer
• Record many buffers at once, across threads
• Submit command buffer to a queue
• GPU consumes in order submitted
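The record-then-submit flow above can be sketched with a toy model. `CommandBuffer`, `Queue`, and `recordAndSubmit` are hypothetical stand-ins written for this example, not types from D3D12, Metal, or Vulkan. The sketch shows each worker thread recording into its own buffer without locks, and the queue consuming buffers in the order they were submitted.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Toy stand-ins; these are hypothetical types, not part of any real graphics API.
struct CommandBuffer {
    std::vector<std::string> commands;
    void draw(const std::string& what) { commands.push_back("draw " + what); }
};

struct Queue {
    std::vector<std::string> consumed;  // order in which the "GPU" sees commands
    void submit(const CommandBuffer& cb) {
        consumed.insert(consumed.end(), cb.commands.begin(), cb.commands.end());
    }
};

// Records one command buffer per worker thread, then submits them in index order.
std::vector<std::string> recordAndSubmit(int workerCount, int drawsPerWorker) {
    std::vector<CommandBuffer> buffers(workerCount);
    std::vector<std::thread> workers;
    for (int w = 0; w < workerCount; ++w) {
        // Each thread owns exactly one buffer, so recording needs no locks.
        workers.emplace_back([&buffers, w, drawsPerWorker] {
            for (int d = 0; d < drawsPerWorker; ++d) {
                buffers[w].draw("w" + std::to_string(w) + "/d" + std::to_string(d));
            }
        });
    }
    for (auto& t : workers) t.join();

    Queue queue;
    for (const auto& cb : buffers) queue.submit(cb);  // submission order = GPU order
    return queue.consumed;
}
```

Recording can finish in any order across threads; only the submission loop decides what the queue sees first.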

[Diagram sequence: Multi-Threaded Submission. Elements shown: CPU Thread, cmd, "done!", Queue, GPU Front End.]

“Free Threading”
• Call API functions from any thread
  • Not required to have one “render thread”
• Application responsible for synchronization
  • Calls that read/write same API object(s)
  • Often, object is owned by one thread at a time

Similarities
• Free-threaded record + submit
• Command buffer contents are opaque
  • Can’t ship pre-built buffers like on console
• No state inheritance across buffers

Differences
• Metal command buffers are one-shot
• Vulkan, D3D12 allow more re-use
  • Re-submit same buffer across frames
  • Invoke one command buffer from another
    • Limited form of command-buffer call/return
    • “Second-level” command buffer / “bundle”

Pipeline State Objects
D3D12, Metal, Vulkan

State-Change Granularity
• GL 1.0: “the OpenGL State Machine”
    glEnable(GL_BLEND);
    glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
• D3D10: aggregate state objects
    d3dDevice->CreateBlendState(&blendDesc, &blendStateObj);
    ...
    d3dContext->OMSetBlendState(blendStateObj);

Pipeline State Object (PSO)
• Encapsulates most of GPU state vector
• Application switches between full PSOs
• Compile and validate early
• Avoid driver pauses when changing state
• May not play nice with some engine designs
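One way an engine might arrange "compile and validate early" is a cache keyed on the full state description, so each distinct combination is built once and later state changes are lookups. The sketch below assumes hypothetical types (`PipelineDesc`, `PipelineState`, `PipelineCache`) and a made-up set of fields; real PSO descriptions carry far more state.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical description of the state baked into one PSO.
struct PipelineDesc {
    std::string vertexShader;
    std::string pixelShader;
    bool blendEnabled;
    bool depthTestEnabled;

    bool operator<(const PipelineDesc& o) const {
        return std::tie(vertexShader, pixelShader, blendEnabled, depthTestEnabled) <
               std::tie(o.vertexShader, o.pixelShader, o.blendEnabled, o.depthTestEnabled);
    }
};

// Stand-in for a compiled, validated pipeline object.
struct PipelineState { int id; };

class PipelineCache {
public:
    // Returns the PSO for `desc`, "compiling" it only the first time it is seen.
    const PipelineState& getOrCreate(const PipelineDesc& desc) {
        auto it = cache_.find(desc);
        if (it == cache_.end()) {
            ++compileCount_;  // the expensive step, ideally paid at load time
            it = cache_.emplace(desc, PipelineState{compileCount_}).first;
        }
        return it->second;
    }

    int compileCount() const { return compileCount_; }

private:
    std::map<PipelineDesc, PipelineState> cache_;
    int compileCount_ = 0;
};
```

Warming such a cache at load time keeps the expensive step out of the frame loop, which is the point of the "avoid driver pauses" bullet.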

What goes in a PSO?
• Shader for each active stage
• Much fixed-function state
  • Blend, rasterizer, depth/stencil
• Format information
  • Vertex attributes
  • Color, depth targets

What doesn’t go in a PSO
• Resource bindings
  • Vertex/index/constant buffers, textures, ...
• Some pieces of fixed-function state
  • A bit different for each API

Setting Non-PSO State
• Set directly on command buffer
    d3dCommandBuffer->OMSetStencilRef(0xFFFF);
    mtlCommandBuffer.setTriangleFillMode(.Lines);
• Use smaller state objects (Metal/Vulkan)
    mtlCommandBuffer.setDepthStencilState(mtlDepthStencilState);
    vkCreateDynamicViewportState(device, &vpInfo, &vpState);

Tiled Architectures and “Passes”
• Pass
  • Sequence of draw calls
  • Sharing same target(s)
• Explicit in Metal/Vulkan
• Simplifies/enables optimizations
• Jesse’s talk will go in depth
D3D12, Metal, Vulkan

Memory and Resources
D3D12, Metal, Vulkan

Concepts
• Allocation: range of virtual addresses
  • Caching, visibility, …
• Resource: memory + layout
  • Buffer, Texture3D, Texture2DMSArray, …
• View: resource + format/usage
  • Depth-stencil view, …

Memory and Resources

Concept      D3D12                        Vulkan
Allocation   ID3D12Heap                   VkDeviceMemory
Resource     ID3D12Resource               VkImage, VkBuffer
View         ID3D12DepthStencilView,      VkImageView, VkBufferView
             ID3D12RenderTargetView, …

Resource Binding
D3D12, Metal, Vulkan

[Diagram labels: Samplers, Textures, Buffers; Binding Tables; GPU State Vector; Pipeline State Object]

Descriptor
• GPU-specific encoding of a resource view
  • Size and format opaque to applications
• Multiple types, based on usage
  • Texture, constant buffer, sampler, etc.
• Just a block of data; not an allocation

Descriptor Table
• An API object that holds multiple descriptors
  • Kind of like a buffer, but contents are opaque
• Table may hold multiple types of descriptors
  • D3D12, Vulkan have different rules on this

[Diagram labels: Samplers, Textures, Buffers; Descriptor Tables; GPU State Vector; Pipeline State Object]

Pipeline Layout
• Shaders impose constraints on table layout
  • “Descriptor 2 in table 0 had better be a texture”
• Pipeline layout is an explicit API object
  • Interface between PSO and descriptor tables
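The constraint "descriptor 2 in table 0 had better be a texture" can be pictured as a slot-by-slot type check of bound tables against a layout. This is only an illustrative model: `DescriptorType`, `PipelineLayout`, `DescriptorTable`, and `tablesMatchLayout` are hypothetical names, and real descriptors are opaque GPU-specific data rather than an enum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class DescriptorType { Texture, ConstantBuffer, Sampler };

// Hypothetical layout: expected descriptor type for each slot of each table.
using TableLayout = std::vector<DescriptorType>;
using PipelineLayout = std::vector<TableLayout>;

// Hypothetical table contents: what the application actually wrote to each slot.
using DescriptorTable = std::vector<DescriptorType>;

// Returns true when every bound table matches the layout slot for slot.
inline bool tablesMatchLayout(const PipelineLayout& layout,
                              const std::vector<DescriptorTable>& tables) {
    if (tables.size() != layout.size()) return false;
    for (std::size_t t = 0; t < layout.size(); ++t) {
        if (tables[t] != layout[t]) return false;  // wrong count or wrong type
    }
    return true;
}
```

Because the layout is its own object, the same check can be made once against a PSO and once against each table, without the two ever referring to each other directly.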

[Diagram labels: Samplers, Textures, Buffers; Descriptor Tables; Root Table; GPU State Vector; Pipeline Layout; Pipeline State Object]

Descriptor Tables and Layouts

D3D12                         Vulkan
ID3D12DescriptorHeap          VkDescriptorPool
-                             VkDescriptorSet
D3D12_ROOT_DESCRIPTOR_TABLE   VkDescriptorSetLayout
ID3D12RootLayout              VkPipelineLayout

Data Hazards and Object Lifetimes
D3D12, Metal, Vulkan

Old APIs: Driver Does It For You
• Map a buffer that is in use?
  • Driver will wait, or allocate a fresh “version”
• Render to image, then use as texture?
  • Driver notices the change, makes it work
• Allocate more textures than fit in GPU memory?
  • Driver will page stuff in/out to make room

New APIs: You Do It Yourself
• Explicitly synchronize CPU/GPU
• Explicitly manage object lifetimes
• Explicitly manage residency (D3D12)
• Explicitly signal resource transitions
  • Done drawing to target, about to use as texture

Explicitly Synchronize CPU/GPU
• No automatic “versioning” of resources
  • No “map discard” or equivalent
• Don’t write to something GPU might be reading
• Use explicit events to synchronize
    ID3D12Fence
    VkEvent
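With no "map discard", a common substitute is a small ring of buffers guarded by a fence value: the CPU reuses a slot only once the GPU has passed the submission that last read it. The sketch below models that rule with hypothetical `Fence` and `FrameRing` types; the fence is just a counter set by hand here, where a real application would query something like `ID3D12Fence`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fence: a counter the "GPU" advances as it finishes submitted work.
struct Fence {
    std::uint64_t completed = 0;
    bool reached(std::uint64_t value) const { return completed >= value; }
};

// Ring of per-frame upload buffers. A slot may be rewritten only after the
// fence value recorded at its last submission has been reached.
class FrameRing {
public:
    explicit FrameRing(std::size_t slots) : lastUse_(slots, 0) {}

    // Returns the slot index if the CPU may write to it now, or -1 if the GPU
    // might still be reading it (a real app would wait on the fence here).
    int tryAcquire(const Fence& fence) const {
        const std::size_t slot = next_ % lastUse_.size();
        if (!fence.reached(lastUse_[slot])) return -1;
        return static_cast<int>(slot);
    }

    // Records that work using `slot` was submitted and will signal `fenceValue`.
    void markSubmitted(int slot, std::uint64_t fenceValue) {
        lastUse_[static_cast<std::size_t>(slot)] = fenceValue;
        ++next_;
    }

private:
    std::vector<std::uint64_t> lastUse_;
    std::size_t next_ = 0;
};
```

The number of slots bounds how far the CPU can run ahead of the GPU before it has to wait.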

Explicitly Manage Object Lifetimes
• Don’t delete something GPU is using
  • Same basic problem as not writing to it
  • Use explicit events to synchronize
• Sounds like a lot of busy-work, right?
  • Not actually that bad in practice
  • Other speakers will share strategies
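One strategy that keeps lifetime management cheap is a deferred-deletion queue: instead of destroying an object immediately, the application tags it with the fence value of the last submission that used it and destroys it once that value has completed. The `DeferredDeleter` class below is a hypothetical sketch of that idea, not something taken from the talk or from any API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical deferred-deletion queue: destruction is postponed until the
// "GPU" has passed the fence value of the last submission that used the object.
class DeferredDeleter {
public:
    // Schedules `destroy` to run once `fenceValue` has completed.
    // Assumes callers retire objects in non-decreasing fence order.
    void retire(std::uint64_t fenceValue, std::function<void()> destroy) {
        pending_.emplace_back(fenceValue, std::move(destroy));
    }

    // Runs every destructor whose fence value is <= completedValue.
    // Returns how many objects were destroyed.
    int collect(std::uint64_t completedValue) {
        int destroyed = 0;
        while (!pending_.empty() && pending_.front().first <= completedValue) {
            pending_.front().second();
            pending_.pop_front();
            ++destroyed;
        }
        return destroyed;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::deque<std::pair<std::uint64_t, std::function<void()>>> pending_;
};
```

Calling `collect` once per frame with the latest completed fence value is usually enough; nothing in the hot path has to block.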

Explicitly Signal Resource Transitions
• Done rendering image, use as texture
• Driver may need to do “stuff”
  • Insert execution barriers
  • Flush caches
  • Decompress data

Resource Transitions
• Conceptually every resource is in one “state”
  • Texture, color target, depth target, …
• Switching state is an explicit command
  • Well-defined time for driver to insert “stuff”
• Use resource when in wrong state: error
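The "one state per resource" model can be sketched as a tiny state tracker. The names `ResourceState`, `TrackedResource`, `transition`, and `requireState` are assumptions made for this example; the real APIs express transitions as barrier commands recorded into a command buffer, and the error here is modeled as an exception purely for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical resource states, named after the examples on the slide.
enum class ResourceState { Texture, ColorTarget, DepthTarget };

struct TrackedResource {
    std::string name;
    ResourceState state;
};

// The explicit transition command: the one well-defined point where a driver
// could insert barriers, flush caches, or decompress data.
inline void transition(TrackedResource& r, ResourceState from, ResourceState to) {
    if (r.state != from) {
        throw std::logic_error(r.name + ": transition from wrong state");
    }
    r.state = to;
}

// Validation a debug layer might perform before a command uses the resource.
inline void requireState(const TrackedResource& r, ResourceState needed) {
    if (r.state != needed) {
        throw std::logic_error(r.name + ": used in wrong state");
    }
}
```

For example, a shadow map rendered as a depth target would need `transition(shadowMap, ResourceState::DepthTarget, ResourceState::Texture)` before any draw samples it.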

Summary

It is all about trade-offs
• You get more control, predictability
  • More like programming for a console
• In return, you get more responsibility
  • App must do what driver used to
  • More like programming for a console…

Are the trade-offs worth it?
• You’ll need to decide for yourself
• Other speakers will share their experience
  • Benefits they’ve seen from these APIs
  • Strategies to make working with them easier