Direct3D 11 Preview
Richard Thomson, DAZ 3D, www.daz3d.com
Utah Code Camp, Fall 2008

Direct3D 11
  • CTP in the November 2008 DirectX SDK
  • Vista (and beyond) only, not on XP
  • Evolution of Direct3D 10
  • Compatible with D3D10 cards

Evolution of Direct3D
  • Direct3D 9
      • Stable, been around for a while
      • Last version to be deployed on Win XP
  • Direct3D 10
      • First Vista-only version
      • Big change from D3D9
  • Direct3D 10.1
      • Incremental tweak to D3D10

Direct3D 10/10.1/11 vs. 9
  • Enumeration factored out to DXGI
  • Same DXGI used for 10, 10.1 and 11
  • Render/texture states divided into chunks
  • Chunks of state are immutable objects
  • “Device state” consists of the set of assigned state chunks
  • Introduces new shader stages beyond vertex and pixel shaders
  • Tighter API specification => no CAPS
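
The "immutable chunks of state" idea can be sketched in plain C++ without any Direct3D headers. This is only a rough model: `BlendDesc`, `RasterDesc`, `StateObject`, `DeviceState` and `setRasterState` are hypothetical names invented for illustration, not the real API types.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical descriptions; the real descriptor structs carry many more fields.
struct BlendDesc  { bool blendEnable; };
struct RasterDesc { bool wireframe; bool cullBackFaces; };

// A state chunk: filled in once at creation and read-only afterwards.
template <typename Desc>
class StateObject {
public:
    explicit StateObject(const Desc& desc) : desc_(desc) {}
    const Desc& desc() const { return desc_; }
private:
    const Desc desc_;
};

using BlendState  = StateObject<BlendDesc>;
using RasterState = StateObject<RasterDesc>;

// "Device state" is just the set of chunks currently assigned.
struct DeviceState {
    std::shared_ptr<const BlendState>  blend;
    std::shared_ptr<const RasterState> raster;
};

// Changing state means assigning a different chunk, never editing one in place.
inline void setRasterState(DeviceState& device,
                           std::shared_ptr<const RasterState> state) {
    assert(state != nullptr);
    device.raster = std::move(state);
}
```

In this model an application would build its handful of state chunks up front and only swap pointers at draw time, which is the property that lets the chunks be validated once at creation.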

Direct3D 11 Focus
  • Scalability and performance
  • Improving the development experience
  • Extending the reach of the GPU

Direct3D 11 New Features
  • Tessellation
  • Compute Shader
  • Multithreading
  • Shader Subroutines
  • Improved Texture Compression
  • Other Features

Tessellation
Direct3D 10 pipeline plus three new stages for tessellation (Hull Shader, Tessellator, Domain Shader):
Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader → Rasterizer → Pixel Shader → Output Merger
Geometry Shader → Stream Output

Hull Shader
  • HS input: patch control points
  • One Hull Shader invocation per patch
  • HS output (to Domain Shader): patch control points after basis conversion
  • HS output (to Tessellator):
      • TessFactors (how much to tessellate)
      • Fixed tessellator mode declarations

Hull Shader Syntax

    [patchsize(12)]
    [patchconstantfunc(MyPatchConstantFunc)]
    MyOutPoint main(uint Id : SV_ControlPointID,
                    InputPatch<MyInPoint, 12> InPts)
    {
        MyOutPoint result;
        …
        result = TransformControlPoint( InPts[Id] );
        return result;
    }

Tessellator
  • Note: Tessellator does not see control points
  • Tessellator operates per patch
  • TS input (from Hull Shader):
      • TessFactors (how much to tessellate)
      • Fixed tessellator mode declarations
  • TS output (to Domain Shader):
      • U V {W} domain points
  • TS output:
      • Topology (to primitive assembly)
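
To make the tessellator's role concrete, here is a small CPU-side sketch that produces U V domain points for a quad patch from a single integer tessellation factor. It is a deliberate simplification under stated assumptions: the hardware stage takes per-edge and interior factors and supports several partitioning modes, none of which is modeled here, and `DomainPoint` and `uniformQuadDomainPoints` are hypothetical names. Note that the function takes only the factor, mirroring the point that the tessellator never sees control points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DomainPoint { float u; float v; };

// Uniform (tessFactor + 1) x (tessFactor + 1) grid over the unit square.
// Assumption: one integer factor for the whole patch.
inline std::vector<DomainPoint> uniformQuadDomainPoints(int tessFactor) {
    assert(tessFactor >= 1);
    std::vector<DomainPoint> points;
    points.reserve(static_cast<std::size_t>((tessFactor + 1) * (tessFactor + 1)));
    const float step = static_cast<float>(tessFactor);
    for (int j = 0; j <= tessFactor; ++j) {
        for (int i = 0; i <= tessFactor; ++i) {
            // Row-major order: u varies fastest.
            points.push_back(DomainPoint{ static_cast<float>(i) / step,
                                          static_cast<float>(j) / step });
        }
    }
    return points;
}
```

Each returned point would correspond to one Domain Shader invocation in the pipeline described above.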

Domain Shader
  • DS input (from Hull Shader):
      • Control points
      • TessFactors
  • DS input (from Tessellator):
      • U V {W} domain points
  • One Domain Shader invocation per point from Tessellator
  • DS output:
      • One vertex

Domain Shader Syntax

    void main( out MyDSOutput result,
               float2 myInputUV : SV_DomainPoint,
               MyDSInputs,
               OutputPatch<MyOutPoint, 12> ControlPts,
               MyTessFactors tessFactors )
    {
        …
        result.Position = EvaluateSurfaceUV( ControlPts, myInputUV );
    }
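
The slide leaves `EvaluateSurfaceUV` undefined. One plausible body, sketched on the CPU in C++, is a bicubic Bézier patch evaluation. The assumptions are loud ones: this sketch uses 16 control points in row-major order, whereas the slide's patch has 12 points whose layout is not given, and `Vec3`, `bernstein3` and `evaluateBezierPatch` are hypothetical names. Displacement mapping is not included.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x; float y; float z; };

// Cubic Bernstein basis functions evaluated at t in [0, 1].
inline std::array<float, 4> bernstein3(float t) {
    const float s = 1.0f - t;
    return {{ s * s * s, 3.0f * s * s * t, 3.0f * s * t * t, t * t * t }};
}

// Evaluate a bicubic Bezier patch at domain point (u, v).
// Assumption: control points stored row-major, index = row * 4 + column,
// with u running along columns and v along rows.
inline Vec3 evaluateBezierPatch(const std::array<Vec3, 16>& controlPts,
                                float u, float v) {
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    const std::array<float, 4> bu = bernstein3(u);
    const std::array<float, 4> bv = bernstein3(v);
    Vec3 p{ 0.0f, 0.0f, 0.0f };
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            const float w = bu[col] * bv[row];
            const Vec3& c = controlPts[row * 4 + col];
            p.x += w * c.x;
            p.y += w * c.y;
            p.z += w * c.z;
        }
    }
    return p;
}
```

A quick sanity check is that the patch interpolates its four corner control points, so `(0, 0)` returns the first control point and `(1, 1)` the last.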

Single Pass Example
  • Vertex shader: animate/skin control points
      • Patch control points in, transformed control points out
  • Hull shader: transform basis, determine how much to tessellate
      • Sub-D patch in, Bezier patch out
      • Outputs Tess Factors and control points in Bezier patch
  • Tessellator: tessellate!
      • Outputs U V {W} domain points
  • Domain shader: evaluate surface including displacement
      • Reads the displacement map

Current Authoring Pipeline
(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)
  • Sub-D Modeling
  • Polygon Mesh
  • Animation
  • Displacement Map
  • Generate LODs

New Authoring Pipeline
(Rocket Frog taken from Loop & Schaefer, “Approximating Catmull-Clark Subdivision Surfaces with Bicubic Patches”)
  • Animation
  • Sub-D Modeling
  • Displacement Map
  • Optimally Tessellated Mesh
  • GPU

Tessellation Summary
  • Helps us get closer to eliminating “pointy heads”
  • Scales visual quality across PC hardware configurations
  • Supports performance increases
      • Coarse model = compression, faster I/O to GPU
      • Rendering tailored to each end user’s hardware
  • Better cross-platform (Windows + Xbox 360) development experience
      • Xbox 360 has a subset of D3D11’s tessellation
      • Parity = ease of cross-platform development
      • Extra features = innovation for Windows gaming
  • Render content as the artist created it!

More on Tessellation
  • GameFest 2008 Slides and Audio
      • “Direct3D 11 Tessellation”
          ○ Kev Gee, Microsoft
      • “Advanced Topics in GPU Tessellation”
          ○ Natasha Tatarchuk, AMD/ATI
      • “Water-Tight, Textured, Displaced Subdivision Surface Tessellation Using Direct3D 11”
          ○ Ignacio Castano, NVIDIA

General Purpose GPU
  • Data Parallel Computing
  • GPU performance continues to grow
  • Many applications scale well to massive parallelism without tricky code changes
  • Direct3D is the API for talking to the GPU
  • How do we expand Direct3D to GPGPU?

Compute Shader
Direct3D 10 pipeline plus three new stages for tessellation plus Compute Shader:
Input Assembler → Vertex Shader → Hull Shader → Tessellator → Domain Shader → Geometry Shader → Rasterizer → Pixel Shader → Output Merger
Geometry Shader → Stream Output
Compute Shader (beside the graphics pipeline, working on a Data Structure)

Integrated with Direct3D
  • Fully supports all Direct3D resources
  • Targets graphics/media data types
  • Evolution of DirectX HLSL
  • Graphics pipeline updated to emit general data structures…
  • …which can then be manipulated by compute shader…
  • …and then rendered by Direct3D again

Target Applications
  • Image/Post processing:
      • Image Reduction
      • Image Histogram
      • Image Convolution
      • Image FFT
  • A-Buffer/OIT
  • Ray-tracing, radiosity, etc.
  • Physics
  • AI

Computing a Histogram

    Histogram()
    {
        shared int Histograms[16][256];          // array of 16
        float3 vPixel = load( sampler, sv_ThreadID );
        float fLuminance = dot( vPixel, LUM_VECTOR );
        int iBin = fLuminance * 255.0f;          // compute bin to increment
        int iHist = sv_ThreadIDInGroup & 15;     // use thread index to pick one of the 16 histograms
        Histograms[iHist][iBin] += 1;            // update bin

        // enable all threads in group to complete
        SynchronizeThreadGroup();
        // (continues on the next slide)

Computing a Histogram 2

        // Write register histograms out to memory:
        iBin = sv_ThreadIDInGroup.x;
        if (sv_ThreadIDInGroup.x < 256)          // one thread per bin
        {
            for (iHist = 0; iHist < 16; iHist++)
            {
                int2 destAddr = int2(iHist, iBin);
                OutputResource.add(destAddr, Histograms[iHist][iBin]);   // atomic
            }
        }
    }
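
The two-phase structure of the histogram shader (scatter into 16 sub-histograms, then merge) can be checked on the CPU with a single-threaded C++ sketch. It is not the shader itself: the loop index stands in for the thread ID, there is no real concurrency or synchronization, the Rec. 601 luma weights are an assumed stand-in for `LUM_VECTOR`, and `Rgb`, `luminanceBin` and `computeHistogram` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kSubHistograms = 16;   // must be a power of two for the mask below
constexpr int kBins = 256;
using Histogram = std::array<std::uint32_t, kBins>;

struct Rgb { float r; float g; float b; };

// Map a pixel to a bin index in [0, 255] by truncation, as the slide does.
inline int luminanceBin(const Rgb& p) {
    float lum = 0.299f * p.r + 0.587f * p.g + 0.114f * p.b;   // assumed LUM_VECTOR
    if (lum < 0.0f) lum = 0.0f;
    if (lum > 1.0f) lum = 1.0f;
    return static_cast<int>(lum * 255.0f);
}

inline Histogram computeHistogram(const std::vector<Rgb>& pixels) {
    // Phase 1: each "thread" increments the sub-histogram chosen by its index,
    // which spreads updates across 16 arrays to reduce contention on a GPU.
    std::array<Histogram, kSubHistograms> sub{};
    for (std::size_t tid = 0; tid < pixels.size(); ++tid) {
        const std::size_t iHist = tid & (kSubHistograms - 1);
        sub[iHist][static_cast<std::size_t>(luminanceBin(pixels[tid]))] += 1;
    }
    // Phase 2: after the (implicit) barrier, fold the sub-histograms together.
    Histogram total{};
    for (std::size_t bin = 0; bin < kBins; ++bin) {
        for (std::size_t h = 0; h < kSubHistograms; ++h) {
            total[bin] += sub[h][bin];
        }
    }
    return total;
}
```

The mask `kSubHistograms - 1` is what keeps the sub-histogram index inside `0..15`; masking with 16 itself would only ever yield 0 or 16.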

Compute Shader Summary
  • Enables much more general algorithms
  • Transparent parallel processing model
  • Full cross-vendor support
  • Broadest possible installed base
  • GameFest 2008:
      • “Direct3D 11 Compute Shader – More Generality for Advanced Techniques”
          ○ Chas Boyd, Microsoft

Multithreading
  • Enables distribution across threads of
      • Application code
      • Runtime
      • Driver
  • Device: free-threaded resource creation
  • Immediate Context: your single primary device for state & draws
  • Deferred Contexts: your per-thread devices for state & draws
  • Display Lists: recorded sequence of graphics commands
  • Requires a driver update
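
The split between deferred and immediate contexts can be mocked in a few lines of C++. This sketch is intentionally single-threaded and uses no Direct3D types: `Command`, `CommandList`, `DeferredContext` and `ImmediateContext` are hypothetical stand-ins, and it only models the record-then-play-back shape of display lists. In a real application each deferred context would be driven from its own worker thread, which is safe here because every context owns its own pending list.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical recorded command: a name plus one integer argument.
struct Command { std::string name; int arg; };
using CommandList = std::vector<Command>;

// Records state changes and draws without submitting anything.
class DeferredContext {
public:
    void setState(int stateId)  { pending_.push_back(Command{"SetState", stateId}); }
    void draw(int vertexCount)  { pending_.push_back(Command{"Draw", vertexCount}); }

    // Close the recording and hand back the finished display list.
    CommandList finish() {
        CommandList out = std::move(pending_);
        pending_.clear();
        return out;
    }
private:
    CommandList pending_;
};

// The single place where commands are actually submitted, in execute() order.
class ImmediateContext {
public:
    void execute(const CommandList& list) {
        for (const Command& c : list) submitted_.push_back(c);
    }
    const std::vector<Command>& submitted() const { return submitted_; }
private:
    std::vector<Command> submitted_;
};
```

The design point is that recording order across contexts does not matter; only the order in which the immediate context executes the finished lists determines what the GPU sees.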

Shader Subroutines
  • Details
      • Calls must be fast
      • Binding applies to all primitives in a Draw call
      • Binding operation must be fast
      • Need parameter passing mechanism
      • Need access to textures, samplers, etc.
  • Advantages
      • Reduce register usage in Über-shaders
          ○ Not worst case of all if statements
      • Allows specialization of subroutines
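
As a loose CPU analogy for shader subroutines, the C++ sketch below binds one lighting routine per draw call instead of branching on a material flag for every element. The slides do not show the HLSL mechanism, so none of this is the actual shader syntax; `Color`, `LightingFn`, `unlit`, `lambert`, `DrawCall` and `shade` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

struct Color { float r; float g; float b; };

// The "subroutine slot": parameters are passed explicitly to the bound routine.
using LightingFn = Color (*)(const Color& albedo, float nDotL);

inline Color unlit(const Color& albedo, float /*nDotL*/) { return albedo; }

inline Color lambert(const Color& albedo, float nDotL) {
    const float k = nDotL > 0.0f ? nDotL : 0.0f;
    return { albedo.r * k, albedo.g * k, albedo.b * k };
}

// The binding is chosen once and applies to everything in the draw call.
struct DrawCall {
    LightingFn lighting;
    std::vector<Color> albedos;
    float nDotL;
};

inline std::vector<Color> shade(const DrawCall& draw) {
    assert(draw.lighting != nullptr);
    std::vector<Color> out;
    out.reserve(draw.albedos.size());
    for (const Color& albedo : draw.albedos) {
        // Direct call through the bound routine; no per-element if chain.
        out.push_back(draw.lighting(albedo, draw.nDotL));
    }
    return out;
}
```

Swapping `draw.lighting` between draws is the analogue of the fast binding operation the slide asks for, and each routine only pays for the work it actually does.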

Improved Texture Compression
  • Why?
      • Existing block palette interpolations too simple
      • Results often rife with blocking artifacts
      • No high dynamic range (HDR) support

New Texture Formats
  • BC6 (aka BC6H)
      • High dynamic range
      • 6:1 compression (16 bpc RGB)
      • Targeting high (not lossless) visual quality
  • BC7
      • LDR with alpha
      • 3:1 compression for RGB or 4:1 for RGBA
      • High visual quality
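
The quoted ratios follow from simple block arithmetic, sketched here in C++. The slides give only the ratios; the 4×4-texel block compressed to 128 bits is taken from the published BC6H/BC7 format definitions, and the 8 bits per channel for the LDR case is an assumption about what "RGB" and "RGBA" mean on the slide. `compressionRatio` is a hypothetical helper name.

```cpp
#include <cassert>

constexpr int kBlockTexels = 4 * 4;          // texels per block
constexpr int kCompressedBlockBits = 128;    // 16 bytes per compressed block

// Uncompressed bits in one block divided by the compressed block size.
constexpr int compressionRatio(int channels, int bitsPerChannel) {
    return (kBlockTexels * channels * bitsPerChannel) / kCompressedBlockBits;
}

// BC6H: 16 texels * 3 channels * 16 bits = 768 bits -> 128 bits.
static_assert(compressionRatio(3, 16) == 6, "BC6H, 16 bpc RGB");
// BC7: 16 texels * 3 channels * 8 bits = 384 bits -> 128 bits.
static_assert(compressionRatio(3, 8) == 3, "BC7, 8 bpc RGB");
// BC7: 16 texels * 4 channels * 8 bits = 512 bits -> 128 bits.
static_assert(compressionRatio(4, 8) == 4, "BC7, 8 bpc RGBA");
```

Because every block compresses to the same size, the ratio is fixed per format regardless of image content.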

Compression of New Formats
  • Block compression (unchanged)
      • Each block independent
      • Fixed compression ratio
  • Multiple block types (new)
      • Tailored to different types of content
      • Smooth gradients vs. noisy normal maps
      • Varied alpha vs. constant alpha
  • Decompression results must be bit-accurate with spec

Comparison Results 1
[Images: Orig, BC3, Orig, BC7, Abs Error]

Comparison Results 2
[Images: Orig, BC3, Orig, BC7, Abs Error]

Comparison Results 3
[Images: HDR Original at given exposure, Abs Error, BC6 at given exposure]

Other Features
  • Addressable Stream Out
  • Draw Indirect
  • Pull-model attribute eval
  • Improved Gather4
  • Min-LOD texture clamps
  • 16K texture limits
  • Required 8-bit subtexel, submip filtering precision
  • Conservative oDepth
  • 2 GB Resources
  • Geometry shader instance programming model
  • Optional double support
  • Read-only depth or stencil views

Thanks
  • Allison Klein, Senior Lead Program Manager, Direct3D, Microsoft
  • Chas Boyd, Architect, Windows Desktop & Gaming Technology, Microsoft

Thank you to our Sponsors!