Using GPUView to Understand your DirectX 11 Game

Jon Story, Developer Relations Engineer, AMD

Agenda
● Windows Display Driver Model (WDDM)
● What is GPUView?
● CPU & GPU Queues
● Threads & Events
● Case Studies
● Summary

Windows Display Driver Model (WDDM)

Graphics & WDDM
[Diagram labels: User Mode: Application Process (Application, D3D Runtime, User Mode Driver (UMD)), DWM Process (DWM); Kernel Mode: Session Space, Win32 kernel, Dxgkrnl, Kernel Mode Driver (KMD)]

Feeding the GPU…
[Diagram labels: User Mode: Application #1 Command Buffer, Application #2 Command Buffer, D3D Runtime, UMD; Kernel Mode: Win32k & dxgkrnl, KMD, GPU Scheduler Database, DMA Buffer; numbered steps: 1 Wait, 2]

What is GPUView?

What is GPUView?
● An additional Microsoft performance tool
  ● Complements existing tools
  ● Part of the Windows 7 SDK
  ● Built on Event Tracing for Windows
● Perfect for monitoring CPU/GPU interaction (even for multiple GPU setups)
● Allows you to see how well the GPU is being fed
● Supports DX9, DX10 & DX11 on Win7

Capturing Data
● Run an elevated command prompt
● Start your game in windowed mode
  ● For fullscreen mode perhaps use PsExec from a remote machine
● Start capturing with log.cmd
  ● Program Files\Microsoft Windows Performance Toolkit\GPUView
● Capture 10-15 seconds of your game
● Stop logging with log.cmd
● Open merged.etl file with GPUView.exe

Was this tool created for driver programmers?

Navigating the Data
● Use the mouse to select a region
● Ctrl+Z zooms in to a selection
  ● Z zooms out
● Use +/- to see more or less detail
● Ctrl+E opens the event menu
● Click on objects for additional details
  ● More on this later…

Zooming in…

DMA Packet Color Coding
Various types of DMA packets may be submitted to the GPU:
● Red: Paging packet
● Black: Preemption packet
● Brown: DWM packet
● Other Color: Standard packet
● Other Color + Cross-Hatch: Present packet

What does a Standard DMA Packet Represent?
● Graphics system state objects
● Draw commands
● References to resource allocations
  ● Textures
  ● Vertex & Index Buffers
  ● Render Targets
  ● Constant Buffers

CPU & GPU Queues

SW Context CPU Queues (1)
[Screenshot callouts: Desktop Window Manager packet; D3D app stacking up 3 frames of packets]

SW Context CPU Queue (2)
[Screenshot callouts: CPU queue depth is 6; Task submitted to HW queue; CPU queue is empty!; New Task submitted to CPU queue]

SW Context CPU Queues (3)
● Objects represent work submitted to a GPU context
● Queue is represented through time as a stack
  ● Stack grows on submission of work by the UMD
  ● Stack shrinks as work is completed by the GPU
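
The stack behaviour described on this slide can be modelled offline. The following is a minimal sketch, assuming you have already exported submit/complete timestamps from a trace into plain tuples; the `Event` record layout and both function names are hypothetical and are not part of GPUView.

```python
from typing import List, Tuple

# Hypothetical record: (timestamp_ms, kind). kind is "submit" when the UMD
# queues a packet and "complete" when the GPU finishes one.
Event = Tuple[float, str]


def queue_depth_timeline(events: List[Event]) -> List[Tuple[float, int]]:
    """Return (timestamp_ms, depth) after every event, in time order."""
    depth = 0
    timeline = []
    for timestamp, kind in sorted(events, key=lambda e: e[0]):
        if kind == "submit":
            depth += 1                 # stack grows on submission
        elif kind == "complete":
            depth = max(0, depth - 1)  # stack shrinks on completion
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
        timeline.append((timestamp, depth))
    return timeline


def times_queue_ran_dry(timeline: List[Tuple[float, int]]) -> List[float]:
    """Timestamps at which the queue became empty."""
    return [timestamp for timestamp, depth in timeline if depth == 0]


events = [
    (0.0, "submit"),
    (1.0, "submit"),
    (2.0, "complete"),
    (3.0, "complete"),
    (5.0, "submit"),
]
timeline = queue_depth_timeline(events)
print(timeline)
print(times_queue_ran_dry(timeline))
```

An empty queue is the condition flagged as "CPU queue is empty!" on the previous slide: from that moment the GPU has nothing further buffered.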

GPU HW Context Queue (1)
[Screenshot callouts: Present Packet; Preemption packet; Queued DMA Packet; DWM; GPU Processing DMA Packet]

GPU HW Context Queue (2)
[Screenshot callouts: GPU starts working on packet; GPU finishes working on packet; GPU has no work to do]

GPU HW Context Queue (3)
● Queue is represented through time as a stack
  ● Stack grows on submission of work by the KMD
  ● Stack shrinks as work is completed by the GPU
● Gaps indicate a CPU side bottleneck
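
Gaps in the hardware queue can also be measured rather than eyeballed. This is a rough sketch, assuming the busy periods have been exported as `(start_ms, end_ms)` pairs; the function names and the `min_gap_ms` threshold are illustrative assumptions, not GPUView features.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_ms, end_ms) while the GPU processes a packet


def find_idle_gaps(busy: List[Interval], min_gap_ms: float = 0.0) -> List[Interval]:
    """Return the intervals between busy periods that are longer than min_gap_ms."""
    gaps = []
    covered_until = None
    for start, end in sorted(busy):
        if covered_until is not None and start - covered_until > min_gap_ms:
            gaps.append((covered_until, start))
        # Overlapping packets extend the covered range instead of resetting it.
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps


def idle_fraction(busy: List[Interval]) -> float:
    """Share of the traced span during which the GPU had no work."""
    if not busy:
        return 0.0
    first = min(start for start, _ in busy)
    last = max(end for _, end in busy)
    if last <= first:
        return 0.0
    idle = sum(end - start for start, end in find_idle_gaps(busy))
    return idle / (last - first)


busy = [(0.0, 4.0), (4.0, 9.0), (12.0, 16.0), (16.5, 20.0)]
print(find_idle_gaps(busy))                  # every gap
print(find_idle_gaps(busy, min_gap_ms=1.0))  # only gaps longer than 1 ms
print(idle_fraction(busy))
```

A threshold helps separate short scheduling bubbles from stalls worth investigating; the value to use depends on your frame budget.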

Object Selection
[Screenshot callout: Represents latency]

Object Details (1)
[Screenshot callouts: Packet type & timing information; Allocation references in DMA packet]

Object Details (2)
(w) = Writable by GPU
Preferred memory segment: P0 = Preferred, P1 = Less, P2 = Least

Object Viewer
Segment Numbers: 1 = Vid Mem (CPU visible), 2 = Vid Mem (Non visible), 3 = PCI Express Mem
[Screenshot callout: Clearly the depth buffer]

Paging Buffer Packet
● Submitted as the result of a paging operation (perhaps a large texture)
● Usually caused by preparing a DMA buffer
● Look at the DMA packet that follows the paging operation

Threads & Events

HW Threads
[Screenshot callouts: Colored bars represent idle time; Gaps represent work]

Thread Execution
● Thread segments are color coded:
  ● Light blue: Kernel mode
  ● Dark blue: dxgkrnl
  ● Red: KMD (Kernel Mode Driver)

Charts: FPS / Latency / Memory

Viewing Events
● Ctrl+E opens the Event View window
● Can track whatever events take your interest
  ● DX - Create / Destroy Allocation
  ● DX Block
    ● Suggests possible resource contention
    ● Perhaps trying to lock an in-use buffer

V-Sync Event

Case Studies

DrawPredicated SDK Sample
[Screenshot callouts: GPU is busy, no gaps; CPU queue is buffering up nicely; App thread not saturated]

DrawPredicated SDK Sample: + blocking occlusion queries
[Screenshot callouts: GPU is going idle; Not enough being queued up; App thread fully saturated]

Getting Occlusion Queries Right
● Delay picking up results by N frames
  ● Where N = Number of GPUs
● May need to artificially inflate occlusion volumes to avoid popping
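
The N-frame delay can be sketched as a small ring of in-flight queries. The model below is written in Python for brevity and is only an illustration under stated assumptions: `DelayedQueryRing`, `issue_query` and `read_result` are hypothetical names. In a D3D11 game the handles would typically be `ID3D11Query` objects created with `D3D11_QUERY_OCCLUSION`, and the readback would go through `ID3D11DeviceContext::GetData`.

```python
from collections import deque
from typing import Callable, Deque, List


class DelayedQueryRing:
    """Reads back a query only once it is `latency_frames` frames old."""

    def __init__(self, latency_frames: int) -> None:
        if latency_frames < 1:
            raise ValueError("latency_frames must be at least 1")
        self.latency_frames = latency_frames
        self._pending: Deque[int] = deque()
        # Conservative default: treat the object as visible until data arrives.
        self.last_visible = True

    def end_frame(self,
                  issue_query: Callable[[], int],
                  read_result: Callable[[int], bool]) -> bool:
        """Issue this frame's query, then consume the oldest one if it is due."""
        self._pending.append(issue_query())
        if len(self._pending) > self.latency_frames:
            oldest = self._pending.popleft()
            self.last_visible = read_result(oldest)
        return self.last_visible


# Fake GPU for the demo: the true visibility of the object in each frame.
truth: List[bool] = [True, True, False, False, True]
frame_counter = iter(range(len(truth)))

ring = DelayedQueryRing(latency_frames=2)  # e.g. N = 2 for a two-GPU setup
used = [
    ring.end_frame(lambda: next(frame_counter), lambda handle: truth[handle])
    for _ in truth
]
print(used)
```

The visibility used in a given frame is the answer from N frames earlier, so the CPU never waits on a query the GPU has not reached yet. That lag is the reason occlusion volumes may need to be inflated.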

What else could cause this problem?
● Locking a Render Target
  ● Use CopyResource & Staging Textures
  ● This is a queued operation

ContentStreaming SDK Sample (1)
[Screenshot callouts: Paging packets; GPU is going idle]

ContentStreaming SDK Sample (2)
[Screenshot callout: Large resources not getting preferred segments]

Avoiding Paging
● Keep your video memory usage under control
  ● Especially in MSAA modes
  ● Drop texture resolution for lower end HW
● Avoid excessively large amounts of dynamic data
  ● Textures & Vertex Buffers
● If not sure – talk to us!
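
To see why MSAA modes deserve attention, a back-of-the-envelope estimate is enough. The sketch below assumes a surface costs width × height × bytes per pixel × sample count; it ignores alignment, padding and any driver-side compression, so treat the numbers as a lower bound. The function name is hypothetical.

```python
def render_target_bytes(width: int, height: int,
                        bytes_per_pixel: int, samples: int = 1) -> int:
    """Approximate size of one render target, ignoring padding and compression."""
    return width * height * bytes_per_pixel * samples


MIB = 1024 * 1024

# 1920x1080, 32-bit colour, without and with 4x MSAA.
plain = render_target_bytes(1920, 1080, 4)
msaa4 = render_target_bytes(1920, 1080, 4, samples=4)

# A 32-bit depth/stencil buffer must be multisampled as well.
msaa4_with_depth = msaa4 + render_target_bytes(1920, 1080, 4, samples=4)

print(f"1x colour:          {plain / MIB:.2f} MiB")
print(f"4x colour:          {msaa4 / MIB:.2f} MiB")
print(f"4x colour + depth:  {msaa4_with_depth / MIB:.2f} MiB")
```

Under these assumptions a single 1080p colour target grows from about 8 MiB to about 32 MiB at 4x MSAA, before counting depth or any intermediate targets.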

MultithreadedRendering11 SDK Sample
[Screenshot callouts: Additional threads preparing packets; But there is a lot of D3D runtime / driver overhead]

Multi-Threaded Rendering and Deferred Contexts
● It is a complex issue
● Don't expect it to be a magic bullet
● Strongly recommend you talk to developer relations from AMD & NVIDIA

Summary

Summary
● Make sure you're keeping the ever hungry GPU fed
● Keep track of CPU/GPU interaction
● Keep track of your threads
● Monitor multi-GPU interaction
● Add GPUView to your toolbox

Acknowledgments
● Microsoft for creating GPUView
● Microsoft for providing background content

Questions?
