An Overview of the NVIDIA UNIX Graphics Driver

  • Slides: 38

An Overview of the NVIDIA UNIX Graphics Driver
XDevConf, February 8, 2006
Andy Ritger, NVIDIA Corporation

Contents
  • Unified Driver Architecture
  • Driver Components
  • Features
  • Direct-Rendering Client Interaction with X
  • Rendering and Scanout Interaction
  • Video Memory
  • ABI Compatibility and API Compatibility
  • Direct-Rendering OpenGL+Damage/Composite

Copyright © NVIDIA Corporation 2004

Unified Driver Architecture
  • The majority of the code base for NVIDIA graphics drivers is leveraged on all the operating systems NVIDIA supports:
      - Windows
      - Mac OS X
      - Linux
      - Solaris
      - FreeBSD
  • Everything OS-specific or window-system-specific is abstracted behind OS interface layers
  • One driver supports all GPUs

Driver Components
  • kernel module (nvidia.ko)
  • X driver (nvidia_drv.so)
  • OpenGL library (libGL.so)
  • GLX driver (libglx.so)
  • OpenGL core library (libGLcore.so)

Driver Components (cont.)
  [Diagram. Labels: OpenGL app, libGL.so, X server, libglx.so, nvidia_drv, libGLcore.so, X protocol, shared memory, command buffer(s), user space, kernel space, nvidia.ko, GPU]

Additional Utilities
  • nvidia-installer (only needed on Linux)
  • nvidia-settings
  • nvidia-xconfig

Features
  • Hardware-accelerated direct and indirect OpenGL

Features
  • TwinView
      - 2 display devices scanning out from the same X screen
      - One root window: spanning comes "for free"
      - What is DPI? Nonrectangular layouts?

Features (cont.)
  • Multiple X screens on one GPU
      - Not as efficient as TwinView for spanning
      - Solves the DPI and nonrectangular layout problems of TwinView
      - Can advertise different capabilities on each X screen

Features (cont.)
  • Support for OpenGL with Xinerama
      - OpenGL direct/indirect rendering can span X screens (even across GPUs)
      - Important for CAVEs and powerwalls
      - Oil & Gas

Features (cont.)
  • Configurability
      - NV-CONTROL X extension: dynamically query/modify driver attributes
      - nvidia-settings is a sample NV-CONTROL client

Features (cont.)
  • Quad-Buffered Stereo
      - OpenGL application renders Left/Right eyes
      - Driver toggles between eyes on every VBlank
      - Important for many workstation users, CAVEs
  [Images. Above: CAVE immersive 3D environment; stereo images are projected on walls and floor and must be in sync across all projectors. Right: MRI brain visualization. CAVE images courtesy Brown University, http://graphics.brown.edu/research/cave/home.html]

Features (cont.)
  • RGB/CI Workstation Overlays
      - 16-bit RGB overlay
      - 8-bit CI overlay
      - Rendering in the overlay does not damage content in the main plane
      - Useful for user interface in the overlay, complex rendering in the main plane
      - Useful for legacy applications that require different depths
      - Used by workstation applications such as Maya

Features (cont.)
  • FrameLock
      - Lock together scanout of displays across a cluster
      - OpenGL SwapBuffers locked together
      - Important for CAVEs and powerwalls
  [Image: ORNL visualization expert Jamison Daniel uses the EVEREST powerwall to display data from a large-scale climate simulation. Image courtesy of ORNL]

Features (cont.)
  • SDI
      - Serial Digital Interface: video format used in the digital broadcast industry
      - GPU sends data to SDI in 8, 10, or 12 bits per component

Features (cont.)
  • SLI
      - Multiple GPUs drive one X screen
      - Alternate Frame Rendering (AFR)
      - Split Frame Rendering (SFR)
      - SLI AntiAliasing (SLIAA)
  [Diagram. Labels: GPU, Chipset, GPU]

Direct-Rendering Client Interaction with X
  • Motivation for direct rendering:
      - Avoid IPC overhead
      - Avoid moving large quantities of data between client and server (e.g., OpenGL textures)
      - Avoid making GLX protocol requests for every OpenGL API call (e.g., glVertex3f() millions of times per frame)
  • When the OpenGL application is on the same system as the X server, there is a performance benefit to bypassing the GLX protocol

Direct-Rendering Client Interaction with X (cont.)
  • Hardware acceleration vs direct rendering:
      - Hardware acceleration: using the GPU to perform some or all of the OpenGL rendering pipeline
      - Direct rendering: bypassing the GLX protocol; the OpenGL library renders directly to the hardware
  • The server side must coordinate with the OpenGL client library for:
      - Data propagation
      - Synchronization

Direct-Rendering Client Interaction with X (cont.)
  • What data needs propagating?
      - Drawable's geometry
      - Drawable's cliplist
      - Other drawable attributes: SwapInterval, AntiAliasing, SyncToVBlank, etc.

Direct-Rendering Client Interaction with X (cont.)
  • Control flow:
      - NVIDIA X driver pushes the current drawable state into a shared memory segment
      - OpenGL direct rendering runs asynchronously to the X server
      - When OpenGL performs an operation that must be up to date with respect to the window system, it checks that it has current drawable data
      - If stale, OpenGL retrieves the current data from shared memory and updates its internal state
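
The control flow above can be sketched as a toy model. This is only an illustration under stated assumptions: the slides do not say how staleness is detected, so the serial counter, the class names (SharedDrawableState, GLClientState) and the method names are all hypothetical, and the locking that a real shared memory segment would need is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height


@dataclass
class SharedDrawableState:
    """Stands in for the shared memory segment written by the X driver."""
    serial: int = 0  # hypothetical change counter, bumped on every update
    geometry: Rect = (0, 0, 0, 0)
    cliplist: List[Rect] = field(default_factory=list)

    def publish(self, geometry: Rect, cliplist: List[Rect]) -> None:
        # X driver side: write the new drawable state, then bump the serial.
        self.geometry = geometry
        self.cliplist = list(cliplist)
        self.serial += 1


@dataclass
class GLClientState:
    """The OpenGL client's private copy of the drawable state."""
    seen_serial: int = -1
    geometry: Rect = (0, 0, 0, 0)
    cliplist: List[Rect] = field(default_factory=list)

    def validate(self, shared: SharedDrawableState) -> bool:
        """Refresh the private copy if it is stale; return True if it was."""
        if self.seen_serial == shared.serial:
            return False  # already current, nothing to do
        self.geometry = shared.geometry
        self.cliplist = list(shared.cliplist)
        self.seen_serial = shared.serial
        return True
```

In this model the client would call validate() before any operation that depends on window-system state (for example, a swap), and would otherwise render without consulting the X server.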

Direct-Rendering Client Interaction with X (cont.)
  • Synchronization is needed to ensure the integrity of drawable data in shared memory
  • Synchronization is also needed to ensure correct ordering of GPU commands issued by each driver (X, each instance of OpenGL)

Direct-Rendering Client Interaction with X (cont.)
  • Traditional GPUs: 1 command buffer
      - Shared by all driver components
      - Synchronization needed to protect the shared buffer
  • NVIDIA GPUs: multiple command buffers
      - One command buffer for each OpenGL client, one for the X driver
      - Hardware context switches between command buffers
      - No need to negotiate a shared command buffer
      - Instead, need to manage sequencing of GPU commands

Direct-Rendering Client Interaction with X (cont.)
  • Why is sequencing important?
  • Consider moving an animating OpenGL window that is clipped
  • Operations performed:
      - OpenGL SwapBuffers: blit back->front per cliprect
      - X driver: blit from old position to new position per cliprect
  • Must make sure all outstanding OpenGL rendering is complete and has reached the framebuffer before X's blit commands are processed by the GPU
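
The window-move case above can be modeled with two command queues and a wait condition. This is a sketch only: the slides do not describe the actual mechanism, so the Channel class, the fence values returned by submit(), and the round-robin run() scheduler are assumptions made for illustration.

```python
from collections import deque


class Channel:
    """Toy stand-in for one per-client GPU command buffer."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.submitted = 0  # commands queued so far
        self.completed = 0  # commands the modeled GPU has finished

    def submit(self, op, wait_for=None):
        """Queue a command; wait_for is an optional (channel, fence) pair."""
        self.queue.append((op, wait_for))
        self.submitted += 1
        return self.submitted  # fence value reached once this command is done


def run(channels):
    """Round-robin over channels, one command per turn, honoring waits."""
    executed = []
    while any(ch.queue for ch in channels):
        progressed = False
        for ch in channels:
            if not ch.queue:
                continue
            op, wait_for = ch.queue[0]
            if wait_for is not None:
                other, fence = wait_for
                if other.completed < fence:
                    continue  # blocked until the other channel catches up
            ch.queue.popleft()
            ch.completed += 1
            executed.append((ch.name, op))
            progressed = True
        if not progressed:
            raise RuntimeError("deadlock: every channel is waiting")
    return executed


# Usage: X's window-move blit must not run before OpenGL's swap has landed.
gl = Channel("GL")
x = Channel("X")
gl.submit("render frame")
swap_fence = gl.submit("SwapBuffers: blit back->front")
x.submit("blit old position -> new position", wait_for=(gl, swap_fence))
order = run([x, gl])
```

Even though the X channel is serviced first on every pass, its blit is held back until the OpenGL channel has completed the swap, which is the ordering the slide asks for.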

Direct-Rendering Client Interaction with X (cont.)
  • Inter-command-buffer synchronization: a driver-specific problem to solve
  • Important concept whenever one client's rendering is read by another client:
      - Direct-rendering OpenGL clients rendering to redirected windows
      - X rendering to pixmaps that are used as OpenGL textures (GLX_EXT_texture_from_pixmap)

Interactions Between Rendering and Scanout
  • Flipping vs blitting
      - Blit: memcpy
      - Flip: change what portion of video memory is scanned out
  • Flipping is faster and easier to sync to VBlank
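
The difference between the two swap strategies can be shown with a minimal model in which video memory is a pair of byte arrays. The Scanout class and its method names are hypothetical, and one byte per pixel is assumed purely to keep the example small.

```python
class Scanout:
    """Two equally sized buffers; `front` is the one being scanned out."""

    def __init__(self, width, height):
        size = width * height  # assume one byte per pixel
        self.buffers = [bytearray(size), bytearray(size)]
        self.front = 0

    @property
    def back(self):
        return 1 - self.front

    def swap_by_blit(self):
        """Copy the back buffer into the front buffer; cost grows with size."""
        self.buffers[self.front][:] = self.buffers[self.back]
        return len(self.buffers[self.front])  # bytes copied

    def swap_by_flip(self):
        """Repoint scanout at the back buffer; no pixel data is copied."""
        self.front = self.back
        return 0  # bytes copied
```

A blit touches every pixel, while a flip only changes which buffer is displayed, so the flip costs the same regardless of resolution.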

Interactions Between Rendering and Scanout (cont.)
  • To flip while an OpenGL application is in a window:
      - Create a second copy of the desktop
      - Only the content within the OpenGL window is different
      - Flip between copies of the desktop
      - Requires keeping the desktop in sync between the two copies

Interactions Between Rendering and Scanout (cont.)
  • Quad-Buffered Stereo:
      - Flip between Left/Right eyes every VBlank
      - Swaps can be done with either blit or flip

Interactions Between Rendering and Scanout (cont.)
  • Ideally, rendering and scanout would be orthogonal
  • In practice, they are not:
      - OpenGL needs to control when and where to flip
      - SyncToVBlank
      - Video memory allocation/configuration may depend on whether the surface will be scanned out
      - Filtering for AA through scanout
      - SLI (SFR, AFR, SLIAA)
      - Frame delivery for video: time-sensitive; driver needs precise control of frame display; best accomplished with flipping

Video Memory
  • Most modern NVIDIA GPUs are packaged with large quantities of video memory
  • However:
      - Not all video memory is CPU mappable; SBIOSes limit how much can be mapped to the CPU
      - Some GPUs support rendering to system memory over the PCIe bus: it is CPU mappable, but GPU rendering is slower than to native vidmem
      - Layout of video memory may not be linear: the organization of bits within video memory is optimized for rendering and texturing, so acquiring a linear CPU mapping may require sacrifices

Video Memory (cont.)
  • Many attributes to the video memory
  • Selecting the optimal placement of data in the correct memory space is non-trivial
  • Placement heuristics perform best when the driver has knowledge of how that data is going to be used

ABI and API Compatibility
  • NVIDIA provides one X driver binary used in all X servers since XFree86 4.0
  • This is accomplished through:
      - ABI compatibility
      - Dynamic loading of symbols
  • We understand that ABI compatibility needs to be broken, and we can work with that
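
One way a single binary can span many server versions is to look up optional entry points by name at run time and fall back when a newer one is missing. The sketch below models the server's exported symbols as a plain dictionary; the symbol names (NewAllocHelper, OldAllocHelper) and the resolve() helper are hypothetical and are not taken from the NVIDIA driver or from any X server. In a C driver the lookup itself would be done by something like POSIX dlsym() or the server's own module loader.

```python
def resolve(exported, candidates):
    """Return (name, function) for the first candidate the server exports.

    `exported` maps symbol names to callables and stands in for the set of
    symbols a particular X server makes available to loaded modules.
    """
    for name in candidates:
        fn = exported.get(name)
        if fn is not None:
            return name, fn
    raise LookupError("none of %r is exported by this server" % (candidates,))


# Hypothetical servers: the older one lacks the newer entry point.
old_server = {"OldAllocHelper": lambda size: ("old", size)}
new_server = {
    "OldAllocHelper": lambda size: ("old", size),
    "NewAllocHelper": lambda size: ("new", size),
}

# The driver prefers the newer entry point but still works without it.
preference = ["NewAllocHelper", "OldAllocHelper"]
name_on_old, alloc_on_old = resolve(old_server, preference)
name_on_new, alloc_on_new = resolve(new_server, preference)
```

Because nothing is bound at link time, the same driver code runs against both hypothetical servers and simply takes the best entry point each one offers.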

ABI and API Compatibility (cont.)
  • However, here are a few suggestions:
      - Breaking ABI compatibility is painful for anyone distributing a driver separately from the X server tree (will that be more common with the Modular X tree?)
      - To minimize pain, break ABI infrequently and only when absolutely necessary
      - Add new entry points and deprecate old entry points, rather than change old entry points, to give the opportunity to phase in driver support

ABI and API Compatibility (cont.)
  • More suggestions:
      - Update the ABI version number appropriately
      - ABI version queryable at install time and run time
      - Minimize the number of incompatible ABI versions, which minimizes the number of driver versions to distribute
      - If there are several ABI breakages pending, get them all out of the way at once
      - If the ABI is going to be broken anyway, update APIs when appropriate (Xv, glyph management)
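
A queryable ABI version lets an installer pick a matching driver build and lets a driver refuse to load on an incompatible server. The following sketch assumes a major/minor scheme in which a major bump breaks compatibility and a minor bump only adds to the interface; that convention, along with the names AbiVersion, is_compatible and pick_driver_build, is an assumption for illustration and not something stated in the slides.

```python
from typing import List, NamedTuple, Optional


class AbiVersion(NamedTuple):
    major: int
    minor: int


def is_compatible(server: AbiVersion, built_for: AbiVersion) -> bool:
    """A build works if the majors match and the server is at least as new."""
    return server.major == built_for.major and server.minor >= built_for.minor


def pick_driver_build(server: AbiVersion,
                      builds: List[AbiVersion]) -> Optional[AbiVersion]:
    """Choose the newest compatible build, or None if there is none."""
    usable = [b for b in builds if is_compatible(server, b)]
    return max(usable) if usable else None


# Every incompatible ABI revision adds one more build to ship.
builds = [AbiVersion(0, 8), AbiVersion(1, 0), AbiVersion(1, 2)]
choice = pick_driver_build(AbiVersion(1, 1), builds)
```

The length of the `builds` list is the cost the slide refers to: the fewer incompatible ABI versions exist, the fewer separate driver builds have to be distributed.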

OpenGL + Damage/Composite
  • Direct-rendering GL+Damage/Composite:
      - Clients are aware that the drawable has been redirected
      - Clients notify X when the drawable is damaged
  • Clients and X drivers need to handle synchronization:
      - Do not use direct-rendering content as the source for a compositing operation until the direct-rendering content has reached the framebuffer (tricky if the direct-rendering client's and the composite manager's rendering are in separate GPU command buffers)
      - Or, do not notify the X server of direct-rendered damage until rendering has reached the framebuffer; but this increases latency
  • The synchronization problem is to be discussed later
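
The second option above, holding back damage notifications until the rendering has landed, can be modeled with a list of pending regions tagged with completion fences. This is a sketch under assumptions: the DeferredDamage class, the integer fence values and the gpu_progress() callback are hypothetical and do not describe the actual NVIDIA implementation.

```python
class DeferredDamage:
    """Report damage to X only after the GPU has finished the rendering."""

    def __init__(self):
        self.pending = []   # (fence_value, region) not yet safe to report
        self.reported = []  # regions already handed to the X server

    def rendered(self, fence_value, region):
        """Client side: rendering to `region` completes at `fence_value`."""
        self.pending.append((fence_value, region))

    def gpu_progress(self, completed):
        """Called as the GPU advances; release damage that is now safe."""
        ready = [region for fence, region in self.pending if fence <= completed]
        self.pending = [(f, r) for f, r in self.pending if f > completed]
        self.reported.extend(ready)
        return ready


# Usage: two frames are in flight; damage is released as each one lands.
damage = DeferredDamage()
damage.rendered(1, "frame 1 region")
damage.rendered(2, "frame 2 region")
first = damage.gpu_progress(0)   # nothing has reached the framebuffer yet
second = damage.gpu_progress(1)  # frame 1 is now safe to composite
third = damage.gpu_progress(5)   # frame 2 as well
```

The time a region spends in `pending` is the added latency the slide mentions: the composite manager cannot learn about new content until the GPU has caught up.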

OpenGL + Damage/Composite (cont.)
  • Compositing overhead will be substantial for direct-rendering clients, especially for applications with a high framerate
  • Important that users can disable compositing when they want:
      - Full OpenGL performance
      - Features that may not be possible with Composite: Workstation Overlays, Quad-Buffered Stereo

OpenGL + Damage/Composite (cont.)
  • All the building blocks are here for OpenGL implementors to support direct-rendering OpenGL with Damage and Composite
  • Demo of NVIDIA direct-rendering OpenGL with Damage and Composite; will be available in nvr 85 series drivers

Conclusion
  • The NVIDIA driver has many features important to our users
  • Overview of direct-rendering client/X driver interaction
      - Data propagation
      - Synchronization
  • Rendering and Scanout Interaction
  • Video Memory
  • ABI and API Compatibility
  • Direct-rendering OpenGL + Damage/Composite

Questions?
http://developer.nvidia.com/object/xdevconf_2006_presentations.html