Coherent Hierarchical Culling: Hardware Occlusion Queries Made Useful
- Slides: 26
Coherent Hierarchical Culling: Hardware Occlusion Queries Made Useful
Jiri Bittner (1), Michael Wimmer (1), Harald Piringer (2), Werner Purgathofer (1)
(1) Vienna University of Technology, (2) VRVis Vienna
Motivation
- Typical hardware occlusion culling scenario
[Timeline figure. CPU row: R Q. GPU row: R Q R Q C Q R. Legend: R = Render, Q = Occlusion Query, C = Cull; gaps mark waiting time.]
Michael Wimmer, Vienna University of Technology
Occlusion Culling: Offline vs. Online
- Offline
  + Global information about visibility (from region)
  - Difficult to implement
  - Accuracy and maintenance problems
  + No runtime overhead
- Online
  - Local information about visibility (from point)
  + Easier to implement
  + Greater accuracy, easy maintenance
  - Runtime overhead
Online Occlusion Culling
- Object space methods
  - Need complex geometric calculations (hard to handle detailed scenes)
  + Do not require rasterization
- Image space methods
  + No geometric calculations (easier to handle detailed scenes)
  - Require rasterization
Hardware Occlusion Culling
- Hardware is good at rasterization!
- Hardware counts rasterized fragments
  - But need not update frame buffer
- NV/ARB_occlusion_query
  - Asynchronous
  - Allows multiple simultaneous occlusion queries
- General algorithm idea:
  - Render simple approximation first (bbox)
    - invisible: cull object
    - visible: render object
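The general idea can be sketched without any graphics API. The following is a minimal simulation, not an OpenGL program: `SceneObject`, `bbox_fragments`, and the `issue_bbox_query` callback are hypothetical stand-ins for rendering a bounding box under a hardware occlusion query and reading back the fragment count.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    # Hypothetical: number of bounding-box fragments that would pass the depth test.
    bbox_fragments: int


def cull_with_bbox_query(objects, issue_bbox_query):
    """Return the names of the objects that end up being rendered."""
    rendered = []
    for obj in objects:
        # Stand-in for rasterizing the bounding box under an occlusion query.
        visible_fragments = issue_bbox_query(obj)
        if visible_fragments > 0:
            rendered.append(obj.name)  # visible: render object
        # otherwise invisible: cull object (nothing is rendered)
    return rendered


objects = [SceneObject("wall", 1200), SceneObject("hidden_crate", 0)]
rendered = cull_with_bbox_query(objects, lambda o: o.bbox_fragments)
print(rendered)
```

In a real renderer the query result arrives with a delay, which is the problem the following slides address.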
Hardware Occlusion Culling
- Advantages
  - Pixel-exact
  - No explicit occluder rendering
  - Exploit rasterization power of GPU
  - Easy to use (API calls)
- Problems
  - Delay in availability of the results
  - Time to execute queries
  - If fill-bound: only useful if several objects culled
Hierarchical Stop&Wait (S&W)
Front-to-back hierarchy traversal
1. Issue visibility query for node
2. Stop and wait for result
  - Invisible: cull the subtree
  - Visible: render or continue with 1. recursively
- Advantage:
  - Hierarchy can cull huge subtrees
- Problems:
  - Waiting causes CPU stalls and GPU starvation
  - Huge rasterization costs (especially for large interior nodes)
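The two steps above can be sketched as a recursive traversal. This is a simulation under stated assumptions: the `Node` class and its fixed `visible` flag are hypothetical and replace real query results, which would depend on the geometry rendered so far, and children are assumed to be stored in front-to-back order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    visible: bool  # simulated query result for this node (assumption)
    children: list = field(default_factory=list)  # assumed sorted front to back


def stop_and_wait(node, log):
    log.append(("query", node.name))  # 1. issue visibility query for node
    # 2. stop and wait: a real renderer would block here until the GPU answers
    if not node.visible:
        log.append(("cull", node.name))  # invisible: cull the whole subtree
        return
    if not node.children:
        log.append(("render", node.name))  # visible leaf: render
        return
    for child in node.children:  # visible interior node: continue with 1.
        stop_and_wait(child, log)


root = Node("root", True, [
    Node("A", True),
    Node("B", False, [Node("B1", True), Node("B2", True)]),
])
log = []
stop_and_wait(root, log)
print(log)
```

Culling B removes B1 and B2 without querying them, which is the advantage named on the slide; the cost is one blocking wait per queried node.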
CPU Stalls and GPU Starvation
[Timeline figure. CPU row: R1 Q2. GPU row: R1 Q2 R2 Q3 C3 Q4 R4. Legend: Rx = render object x, Qx = query object x, Cx = cull object x; gaps mark waiting time.]
Solution: Coherent Hierarchical Culling
- Scheduling based on temporal coherence
  - Skipping certain visibility tests
  - Immediate rendering of certain geometry
- Clever interleaving of queries and rendering
  - Maintaining a queue of running occlusion queries
- Design goal: easy implementation
Coherent Hierarchical Culling (CHC)
[Timeline figure. CPU row: R1 Q2. GPU row: R1 Q2 R2 Q3 C3 Q4 R4. Annotations: "visible in previous frame", "Assume independent occlusion". Legend: Rx = render object x, Qx = query object x, Cx = cull object x.]
CHC Algorithm Outline
- Front-to-back hierarchy traversal
1. Node handling
  - Interior node
    - Previously invisible: issue visibility query
    - Previously visible: continue with 1. recursively
  - Leaf
    - Issue visibility query
    - Previously visible: render immediately
2. Check availability of query results
  - Invisible: propagate visibility change
  - Visible: render or continue with 1. recursively
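The outline can be exercised with a small simulation. This is a sketch written for illustration, not the authors' implementation: the `Node` class, the fixed `occluded` flag that replaces real query results, the `latency` parameter, and the log format are all assumptions. It does follow the outline in that previously visible interior nodes are traversed without a query, previously visible leaves are rendered immediately and queried anyway, and finished queries are picked up between traversal steps.

```python
from collections import deque


class Node:
    def __init__(self, name, occluded=False, children=()):
        self.name = name
        self.occluded = occluded        # simulated query outcome (assumption)
        self.children = list(children)  # assumed sorted front to back
        self.parent = None
        self.visible = False            # classification kept between frames
        self.last_visited = -1
        for child in self.children:
            child.parent = self


def chc_frame(root, frame_id, latency=2):
    """Run one simulated frame; return a log of Q(uery), R(ender), C(ull) events."""
    stack, queries, log, rendered = [root], deque(), [], set()
    tick = 0

    def traverse(node):
        if not node.children:
            if node.name not in rendered:  # never render a leaf twice per frame
                rendered.add(node.name)
                log.append("R:" + node.name)
        else:
            stack.extend(reversed(node.children))  # nearest child is popped first

    while stack or queries:
        tick += 1
        # 2. Check availability of query results; only wait when nothing is left to traverse.
        while queries and (queries[0][0] <= tick or not stack):
            _, node = queries.popleft()
            if node.occluded:
                log.append("C:" + node.name)
            else:
                n = node
                while n is not None and not n.visible:  # mark node and ancestors visible
                    n.visible = True
                    n = n.parent
                traverse(node)
        # 1. Node handling
        if stack:
            node = stack.pop()
            was_visible = node.visible and node.last_visited == frame_id - 1
            node.visible = False  # reset; restored when a query in this subtree succeeds
            node.last_visited = frame_id
            if not was_visible or not node.children:
                log.append("Q:" + node.name)  # leaf or previously invisible: query
                queries.append((tick + latency, node))
            if was_visible:
                traverse(node)  # render leaf immediately / descend without waiting
    return log


a, b = Node("A"), Node("B", occluded=True)
root = Node("root", children=[a, b])
frame0 = chc_frame(root, frame_id=0)
frame1 = chc_frame(root, frame_id=1)
print(frame0)
print(frame1)
```

In the first frame nothing is known, so the root is queried and the traversal waits for it. In the second frame the root is skipped and leaf A is rendered before its own query result is read.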
Why Interleaving Works…
- Processing a node only depends on…
1. Front to back order
2. Results of queries for processed nodes, where:
Table columns: Previous frame: processed node | current node | S&W | CHC
Table cells (row layout lost in extraction): visible yes no visible invisible yes no invisible yes no yes invisible (different subtrees) invisible (parent child, refinement of visibility)
CHC: Hierarchy Traversal
[Figure: example hierarchy with numbered nodes traversed in front-to-back order. Legend: previously visible, previously invisible.]
- no queries for previously visible interior nodes
- assume no query dependencies
- hidden regions: queries depend on parents
CHC Features
- Reduction of CPU stalls and GPU starvation
  - Interleaving queries with rendering previously visible geometry
- Reduction of the number of queries
  - Avoids expensive redundant queries for interior nodes
  - Size of tested regions adapts to visibility
    - pull-up: occluded region growing
    - pull-down: visible region growing
Implementation Issues
- Front-to-back traversal
  - Priority queue: allows various hierarchical data structures
- Checking query results
  - glGetOcclusionQueryivNV with GL_PIXEL_COUNT_AVAILABLE_NV
  - Very cheap operation
- Queries for previously visible nodes
  - Use actual geometry as occludee (instead of bounding box)
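A priority queue gives front-to-back order for any hierarchy, because only a per-node distance and a way to list children are needed. The sketch below uses Python's `heapq`; the `distance` and `children` callbacks and the toy tree are hypothetical, and a real traversal would only expand nodes that pass the visibility logic.

```python
import heapq
import itertools


def front_to_back(root, distance, children):
    """Yield nodes nearest-first. `distance` and `children` are caller-supplied."""
    counter = itertools.count()  # tie-breaker so nodes themselves are never compared
    heap = [(distance(root), next(counter), root)]
    while heap:
        _, _, node = heapq.heappop(heap)
        yield node
        for child in children(node):
            heapq.heappush(heap, (distance(child), next(counter), child))


# Toy hierarchy: node name -> child names, plus a distance from the viewpoint.
tree = {"root": ["near", "far"], "near": ["n1", "n2"], "far": ["f1"]}
dist = {"root": 0, "near": 1, "far": 5, "n1": 2, "n2": 3, "f1": 6}

order = list(front_to_back("root", dist.__getitem__, lambda n: tree.get(n, [])))
print(order)
```

Because the queue is keyed only on distance, the same function works for a kD-tree, an octree, or a bounding volume hierarchy.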
Further Optimizations
- Conservative visibility testing
  - Assume visible node remains visible n frames
  + Saves additional occlusion queries
- Approximate visibility
  - #visible pixels < threshold → node invisible
  + Saves rendered geometry
  - Produces image errors
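Both optimizations reduce to small predicates. The sketch below is one possible reading of the slide, with hypothetical function and parameter names: a node found visible skips its query for the next `assumed_visible_frames` frames, and a query result below `threshold` pixels is classified as invisible. Under this convention a threshold of 1 corresponds to exact visibility.

```python
def needs_query(was_visible, frames_since_last_query, assumed_visible_frames):
    """Conservative testing: skip queries while a visible node is assumed visible."""
    if not was_visible:
        return True  # invisible nodes are always re-tested
    return frames_since_last_query > assumed_visible_frames


def classify(visible_pixels, threshold=1):
    """Approximate visibility: fewer visible pixels than the threshold means invisible."""
    return "invisible" if visible_pixels < threshold else "visible"


# A visible node with n = 2 is queried again only in the third frame after its last query.
schedule = [needs_query(True, k, 2) for k in (1, 2, 3)]
print(schedule)
print(classify(24, threshold=25), classify(25, threshold=25))
```

The first predicate trades a possible delay in detecting newly occluded nodes for fewer queries; the second trades image errors for less rendered geometry, as the slide notes.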
Results – Test Scenes
- Teapots: 11.5M triangles, 21k kD-tree nodes
- City: 1M triangles, 33k kD-tree nodes
- Power plant: 12.7M triangles, 18.7k kD-tree nodes
Results – Speedup
- Ideal: zero overhead, render only visible geometry
Results – Summary
- Comparison to hierarchical S&W
  - Number of queries reduced by a factor of almost 2
  - Time spent in stalls reduced by 20–60x (to 0.18–1.31 ms)
- Close to ideal algorithm!
  - Only 2–9 ms slower
  - Overhead due to query time
Results – Teapot
Results – City
Results – Powerplant
Optimization Results
- Conservative culling, 2 frames assumed visible
  - Good for deep hierarchies with simple leaf geometry
  - Further speedup up to 21%
- Approximate culling, 25 pixels threshold
  - Good for scenes with complex visible geometry
  - Further speedup up to 33%
Conclusion
- Efficient scheduling of hardware occlusion queries
  - Greatly reduces CPU stalls and GPU starvation
  - Reduces number of required queries
- Simple to implement
- Arbitrary hierarchical data structure
- Speedup ~4 over VFC (view frustum culling)
- Close to ideal solution for tested scenes
- Watch out for GPU Gems II
Thanks for Your Attention
CHC: Example
[Animated figure: step-by-step CHC traversal of an example hierarchy (nodes 1 to 11). Step labels: issue query, render, continue recursively, query result available, visible, invisible, cull, pull-up, final classification. Legend: previously visible, previously invisible. A query queue and the GPU command stream are shown; GPU commands as extracted: R6 Q7 Q8 R7 Q10 R4 Q5 Q6/R6 Q10/R10 Q11.]