Brink Preferred Rendering with OpenGL
  Mikkel Gjøl, Graphics Programmer, @pixelmager
outline
  PC rendering overview
  state, shaders, occlusion queries
  virtual texturing
  bindless vertex attributes
  debugging OpenGL
  lessons learned
splash damage
  Founded 2001
  Mod team gone pro
  Developed Enemy Territory games with id Software
  Just finished Brink with Bethesda Softworks
brink pc-rendering overview
  thin, lightweight renderer, data-driven
  mostly GL 2.x pipeline with GL 3.x shaders
  Vertex Buffer Objects, Frame Buffer Objects
  Vertex- and Fragment-programs
  Pixel Buffer Objects for asynchronous VT-readback
brink pc-rendering overview
  fully dynamic lighting, deferred, fat attribute buffers
  fast content-iterations on lighting
  fast shader-development
  fast dynamic lighting
  …more work for static lighting
a frame of brink
sorting for state-change
  each pass is a list of drawCmds, sorted to reduce state changes
  depth / attribute / lights / translucent
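A minimal sketch of sorting one pass by state, assuming a hypothetical `DrawCmd` with three state ids. The slides do not say which states Brink sorts on or in what priority; the shader > texture > vbo order below is an assumption for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw command; field names are illustrative, not Brink's.
struct DrawCmd {
    std::uint16_t shader;   // assumed most expensive state to change
    std::uint16_t texture;
    std::uint16_t vbo;      // assumed cheapest state to change
    int mesh;               // payload, not part of the sort key
};

// Pack the state ids into one integer so that an ordinary sort groups
// commands sharing the most expensive state next to each other.
inline std::uint64_t sortKey(const DrawCmd& c) {
    return (std::uint64_t(c.shader) << 32) |
           (std::uint64_t(c.texture) << 16) |
            std::uint64_t(c.vbo);
}

inline void sortForStateChange(std::vector<DrawCmd>& pass) {
    std::stable_sort(pass.begin(), pass.end(),
        [](const DrawCmd& a, const DrawCmd& b) { return sortKey(a) < sortKey(b); });
}

// Count shader switches when the list is submitted in order.
inline int countShaderChanges(const std::vector<DrawCmd>& pass) {
    int changes = 0;
    for (std::size_t i = 1; i < pass.size(); ++i)
        if (pass[i].shader != pass[i - 1].shader) ++changes;
    return changes;
}
```

With four commands alternating between two shaders, submission order causes three shader switches; after `sortForStateChange` only one remains.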
shaders
  static set of handwritten shaders available to artists
  written by gfx-programmers and tech-artists together with artists
  python-scripts to generate variations with different $defines
  ~150 files, ~350 shader-combinations, ~3 sec compile time
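A sketch of generating shader variations from $defines. The slides say this was done with python-scripts; the version below is in C++ only to match the other examples, and it enumerates every on/off combination of a toggle list, which is an assumption (~350 combinations from ~150 files suggests the real scripts pick a smaller, hand-chosen set). `ShaderVariant` and `generateVariants` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ShaderVariant {
    std::string name;    // base name plus "+TOGGLE" per enabled define
    std::string source;  // "$define" lines followed by the shared body
};

// Emit one variant per subset of toggles (2^n variants).
inline std::vector<ShaderVariant> generateVariants(
        const std::string& baseName,
        const std::string& body,
        const std::vector<std::string>& toggles) {
    assert(toggles.size() < 8 * sizeof(std::size_t));  // mask must fit
    std::vector<ShaderVariant> out;
    const std::size_t count = std::size_t(1) << toggles.size();
    for (std::size_t mask = 0; mask < count; ++mask) {
        ShaderVariant v{baseName, ""};
        for (std::size_t i = 0; i < toggles.size(); ++i) {
            if (mask & (std::size_t(1) << i)) {
                v.name += "+" + toggles[i];
                v.source += "$define " + toggles[i] + "\n";
            }
        }
        v.source += body;
        out.push_back(v);
    }
    return out;
}
```

Each generated source would then go through the shader pre-processor before being handed to the GLSL compiler.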
shader pre-processor
  Regular preprocessor functionality (some now natively in glsl)
    $include "inc/preferred_glsl.inc"
    $define PREFERRED_RENDERING
    $ifdef PREFERRED_RENDERING
      return bestValue();
    $else
      return fuglyHack();
    $endif
    // vertex-attributes looked up by name
    vec4 vertexColor = $colorAttrib;
    // uniforms / textures mapped on shader-load
    vertexColor = vertexColor * $colorModulate + $colorAdd;
    vec4 diffuse = texture( $diffMap, texCoord ) * vertexColor;
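A minimal sketch of the conditional part of such a pre-processor, assuming line-based `$define` / `$ifdef` / `$else` / `$endif` directives with nesting. It is not Brink's implementation: `$include` and the `$name` attribute/uniform lookups are left out, and the function name `preprocess` is hypothetical.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Resolve $define / $ifdef / $else / $endif and return the surviving lines.
inline std::string preprocess(const std::string& src,
                              std::set<std::string> defines = {}) {
    std::istringstream in(src);
    std::ostringstream out;
    std::vector<bool> active;  // one entry per open $ifdef: is its branch emitted?
    auto emitting = [&]() -> bool {
        for (bool a : active)
            if (!a) return false;
        return true;
    };
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string directive, name;
        words >> directive >> name;
        if (directive == "$define") {
            if (emitting()) defines.insert(name);
        } else if (directive == "$ifdef") {
            active.push_back(defines.count(name) != 0);
        } else if (directive == "$else") {
            if (!active.empty()) active.back() = !active.back();
        } else if (directive == "$endif") {
            if (!active.empty()) active.pop_back();
        } else if (emitting()) {
            out << line << '\n';  // ordinary shader code passes through
        }
    }
    return out.str();
}
```

Run on the slide's snippet, only the `return bestValue();` line survives; without the `$define` line, the `$else` branch is emitted instead.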
occlusion queries
  area and frustum-culled before OCQ
  render world-space bboxes / light-frustum
  if predicted visible, issue OCQ during rendering
  ~60 avg, ~200 max OCQ per frame
  ~0.5 ms
occlusion queries
  initial version stalled on frame n+1
  proved to be a bottleneck
occlusion queries
  if not ready, assume nothing changed
  do not issue new OCQ
  expand bbox in view-movement-direction to predict visibility
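A sketch of the non-stalling logic described above, with the GL calls abstracted away so it runs standalone. `resultReady` and `samplesPassed` stand in for what `glGetQueryObjectuiv` would return for `GL_QUERY_RESULT_AVAILABLE` and `GL_QUERY_RESULT`. All type and function names are hypothetical, and the slides do not give the amount or sign convention of the bbox expansion, so `expandAlong` simply takes an offset vector chosen by the caller.

```cpp
struct Vec3 { float x, y, z; };
struct AABB { Vec3 min, max; };

// Stretch the box along an offset (e.g. derived from camera movement)
// so that objects about to become visible are tested early.
inline AABB expandAlong(AABB b, Vec3 d) {
    if (d.x < 0) b.min.x += d.x; else b.max.x += d.x;
    if (d.y < 0) b.min.y += d.y; else b.max.y += d.y;
    if (d.z < 0) b.min.z += d.z; else b.max.z += d.z;
    return b;
}

struct OcclusionQuery {
    bool pending = false;     // a query is in flight
    bool lastVisible = true;  // conservative default until a result arrives
};

// Returns the visibility to use this frame. Sets issueNew when the caller
// should issue a fresh query; never waits for an unfinished one.
inline bool updateVisibility(OcclusionQuery& q, bool resultReady,
                             unsigned samplesPassed, bool& issueNew) {
    if (q.pending && !resultReady) {
        issueNew = false;      // not ready: assume nothing changed
        return q.lastVisible;
    }
    if (q.pending) {
        q.lastVisible = samplesPassed > 0;
        q.pending = false;
    }
    issueNew = true;
    q.pending = true;
    return q.lastVisible;
}
```

The key property is that a frame never blocks on the GPU: an unfinished query just means last frame's answer is reused.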
virtual texturing
  only load the texels you need
  same texel-density everywhere on screen
  reduces complexity for artists
  "texture however you want, as long as it fits"
  reduces dependency between drawcalls
  everything uses the same texture... actually we have multiple pagefiles
virtual texturing
  processing mapped memory from separate thread
  PBO mapped pointer valid in all threads
  minimum 1 frame delay for async readback
  2+ frames delayed for multi-GPU
  anisotropic mip-selection, bi-linear filtering
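A sketch of what the readback-processing thread might do with the mapped PBO memory: reduce per-pixel page requests to a unique, prioritised list. The 32-bit texel layout (12 bits x, 12 bits y, 8 bits mip), the `kNoRequest` marker and the coarse-mips-first ordering are all assumptions for illustration; the slides do not describe Brink's feedback format.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

struct PageRequest {
    std::uint32_t x, y, mip;
    // Order coarser mips first (assumed priority), then by y, then by x.
    bool operator<(const PageRequest& o) const {
        if (mip != o.mip) return mip > o.mip;
        if (y != o.y) return y < o.y;
        return x < o.x;
    }
};

constexpr std::uint32_t kNoRequest = 0xFFFFFFFFu;  // hypothetical "empty" texel

// Hypothetical packing: bits 0-11 page x, bits 12-23 page y, bits 24-31 mip.
inline PageRequest decode(std::uint32_t texel) {
    return { texel & 0xFFFu, (texel >> 12) & 0xFFFu, texel >> 24 };
}

// 'mapped' stands in for the pointer returned by mapping the readback PBO.
// The function only reads from it, so it can run on a worker thread while
// the render thread continues.
inline std::vector<PageRequest> collectRequests(const std::uint32_t* mapped,
                                                std::size_t count) {
    std::set<PageRequest> unique;
    for (std::size_t i = 0; i < count; ++i)
        if (mapped[i] != kNoRequest) unique.insert(decode(mapped[i]));
    return std::vector<PageRequest>(unique.begin(), unique.end());
}
```

Because the readback is asynchronous, the request list always describes a frame that is at least one frame old (two or more on multi-GPU), so the loader works from slightly stale data by design.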
bindless vertex attributes
  1. decouples vertex-format from vertex-pointers
  2. provides direct GPU-pointer, reducing indirection overhead in driver
  GL_NV_vertex_buffer_unified_memory
  GL_NV_shader_buffer_load
bindless vertex attributes
    // regular vertex-attribute setup
    glVertexAttribPointerARB( ATTR_COL, 4, GL_UNSIGNED_BYTE, GL_TRUE, vertsiz, ofs_col );
    // bindless: GPU address range and vertex format are set separately
    glBufferAddressRangeNV( GL_VERTEX_ATTRIB_ARRAY_ADDRESS_NV, ATTR_COL, vbo_gpu_addr+ofs_col, vbo_gpu_siz-ofs_col );
    glVertexAttribFormatNV( ATTR_COL, 4, GL_UNSIGNED_BYTE, GL_TRUE, vertsiz );
bindless vertex attributes
  implementation revealed the slack in our attribute-code, drivers are forgiving
  this was the main work - writing the code was easy…
  use debug-context to spare crashes (thanks Jeff Bolz!)
  practically got rid of vertex-attribute cpu-overhead
  5-10% of CPU frame-time
  …not practical for frame-temporary allocations
  getting GPU-address takes time
debugging OpenGL - tools
  gDEBugger / GLIntercept for cmdBuffer debugging
  useful for checking correct textures/shaders/"meshes"
  Nsight / GPU PerfStudio almost useful for timing
  Nsight has actual start/end drawcall GPU-times! Only way to get this. Try it today!
  neither is very mature for OpenGL
  no tools currently show FBO or geometry
debugging OpenGL – custom code
  hot-reloadable assets
  shaders / textures / meshes / assets
  runtime-modifying shaders for debugging
  "configurable" render-stages
  various visualization modes
  wire, batches, info on obj, lights, ocq etc.
  show textures / rendertargets, write to files
debugging OpenGL – custom code
  full OpenGL-statedump
  write to file and diff
  queryTimers for profiling
  make sure to pair with CPU-timings to find all bottlenecks
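A sketch of the "dump and diff" idea, assuming the state dump is kept as name/value strings. In a real renderer the values would come from calls such as `glIsEnabled` and `glGetIntegerv`; here they are filled in by hand so the example runs without a GL context. `StateDump` and `diffDumps` are hypothetical names.

```cpp
#include <map>
#include <sstream>
#include <string>

using StateDump = std::map<std::string, std::string>;

// Report entries removed (-), added (+) or changed (~) between two dumps.
inline std::string diffDumps(const StateDump& a, const StateDump& b) {
    std::ostringstream out;
    for (const auto& kv : a) {
        auto it = b.find(kv.first);
        if (it == b.end())
            out << "- " << kv.first << " = " << kv.second << '\n';
        else if (it->second != kv.second)
            out << "~ " << kv.first << ": " << kv.second
                << " -> " << it->second << '\n';
    }
    for (const auto& kv : b)
        if (a.find(kv.first) == a.end())
            out << "+ " << kv.first << " = " << kv.second << '\n';
    return out.str();
}
```

Dumping at the same point in a known-good and a broken frame and diffing the two narrows a rendering bug down to the states that differ.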
lessons learned
  do not trust state: reset per frame
  use the debug context. Add it today! ~1 hour
  no vendor-independent solution exists for dual-GPU laptops
  keep a small code-setup for tests
  driver teams appreciate small reproduction-programs, with source code too
conclusions
  OpenGL works for AAA games
  OpenGL-support was never an issue
  d3d10-level features on Windows XP
  responsive IHVs
  AAA can "force" updated drivers
OpenGL wishlist
  IHVs to use the debug-context more
  enable a lower-level api
  GL_PARANOID_LEVEL = INT_MAX;
  performance warnings
OpenGL wishlist
  light-weight Display Lists
  this is our current console setup!
  ideally, we would still like to be able to set state... sorry.
  GPU self-feed would be great!
references
  http://brinkthegame.com/
  http://origin-developer.nvidia.com/object/bindless_graphics.html
  http://altdevblogaday.com/2011/06/23/improving-opengl-error-messages/
  http://glintercept.nutty.org/
  http://developer.amd.com/tools/gDEBugger/Pages/default.aspx
  http://parallelnsight.nvidia.com/
acknowledgements
  Arnout van Meer / Splash Damage
  Romain Toutain / Splash Damage
  Simon Green, Phil Scott, Jeff Bolz of NVIDIA fame
  Nicolas Thibieroz, Kevin Strange of AMD fame
  All images are copyright of their respective owners
Teh Jobs
  We’re hiring: www.splashdamage.com/jobs
  @splashdamage
  Splash Damage
Teh Questions? Ask us anything!