R-LODs: Fast LOD-based Ray Tracing of Massive Models Sung-Eui Yoon Lawrence Livermore National Lab. Christian Lauterbach Dinesh Manocha Univ. of North Carolina-Chapel Hill
Goal ● Perform interactive ray tracing of massive models ● Handle various kinds of polygonal meshes (e.g., scanned data and CAD) [Models shown: Double Eagle tanker (82 M triangles), St. Matthew (372 M triangles), Forest model (32 M)] 2
Recent Advances for Interactive Ray Tracing ● Hardware improvements ● Exponential growth of computation power ● Multi-core architectures ● Algorithmic improvements ● Efficient ray coherence techniques [Wald et al. 01, Reshetov et al. 05] 3
Hierarchical Acceleration Data Structures ● kd-trees for interactive ray tracing [Wald 04] ● Simplicity and efficiency ● Used for efficient object culling kd-node Axis-aligned bounding box 4
Ray Tracing of Massive Models ● Logarithmic asymptotic behavior ● Very useful for dealing with massive models ● Due to the hierarchical data structures ● Observed only in in-core cases 5
Performance of Ray Tracing with Different Model Complexity ● Measured with 2 GB main memory [Plot: render time (log scale) and working set size vs. model complexity (M tri, log scale); annotations: "Memory thrashing!", "2 GB"] 6
Low Growth Rate of Data Access Time ● Growth rate during 1993–2004 [Chart labels: 47X, 2X, 20X] Courtesy: http://www.hcibook.com/e3/online/moores-law/ 7
Inefficient Memory Accesses and Temporal Aliasing ● St. Matthew (256 M triangles) ● Around 100 M visible triangles ● 1 K by 1 K image resolution ● 1 M primary rays ● Hundreds of triangles per pixel ● Each triangle is likely in a different area of memory 8
Main Contributions ● Propose an LOD (level-of-detail)-based ray tracing of massive models ● R-LOD, a novel LOD representation for Ray tracing ● Efficient LOD error metric for primary and secondary rays ● Integrate ray and cache coherent techniques 9
Performance of LOD-based Ray Tracing ● Achieved up to three orders of magnitude speedup! ● Measured with 2 GB main memory [Plot: render time (log scale) and working set size vs. model complexity (M tri, log scale)] 11
Real-time Captured Video – St. Matthew Model 512 by 512 and 2 x 2 super-sampling, 4 pixels of LOD error in image space 12
Related Work ● Interactive ray tracing ● LOD and out-of-core techniques ● LOD-based ray tracing 13
Interactive Ray Tracing ● Ray coherence ● [Heckbert and Hanrahan 84, Wald et al. 01, Reshetov et al. 05] ● Parallel computing ● [Parker et al. 99, DeMarle et al. 04, Dietrich et al. 05] ● Hardware acceleration ● [Purcell et al. 02, Schmittler et al. 04, Woop et al. 05] ● Large datasets ● [Pharr et al. 97, Wald et al. 04] 14
LOD and Out-of-Core Techniques ● Widely researched ● LOD book [Luebke et al. 02] ● Out-of-core algorithm course [Chiang et al. 03] ● LOD algorithms combined with out-of-core techniques ● Point clouds [Rusinkiewicz and Levoy 00] ● Regular meshes [Hwa et al. 04, Losasso and Hoppe 04] ● General meshes [Lindstrom 03, Cignoni et al. 04, Yoon et al. 04, Gobbetti and Marton 05] ● Not clear whether LOD techniques for rasterization are applicable to ray tracing 15
LOD-based Ray Tracing ● Ray differentials [Igehy 99] ● Subdivision meshes [Christensen et al. 03, Stoll et al. 06] ● Point clouds [Wand and Straßer 03] [Figure labels: image plane, viewpoint, footprint size of ray, ray beam for one pixel] 16
Outline ● R-LODs for ray tracing ● Results 17
R-LOD Representation ● Tightly integrated with kd-nodes ● A plane, material attributes, and surface deviation [Figure labels: rays, kd-node, plane, normal, valid extent of the plane, intersection, no intersection] 19
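A minimal sketch of intersecting a ray with an R-LOD, assuming the "valid extent of the plane" can be modeled as a disk of radius `extent` around a point `center` on the plane. The names `RLod`, `center`, `extent`, and `intersect_rlod` are hypothetical; the slides do not specify how the extent is stored.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


@dataclass
class RLod:
    center: Vec3   # assumed: a point on the plane inside the kd-node
    normal: Vec3   # unit normal of the plane
    extent: float  # assumed: radius of the valid region around center


def intersect_rlod(origin: Vec3, direction: Vec3, lod: RLod,
                   eps: float = 1e-9) -> Optional[float]:
    """Return the ray parameter t of the hit, or None if the ray misses."""
    denom = dot(lod.normal, direction)
    if abs(denom) < eps:
        return None  # ray is parallel to the plane
    t = dot(lod.normal, sub(lod.center, origin)) / denom
    if t < 0.0:
        return None  # plane is behind the ray origin
    hit = (origin[0] + t * direction[0],
           origin[1] + t * direction[1],
           origin[2] + t * direction[2])
    if math.dist(hit, lod.center) > lod.extent:
        return None  # hit lies outside the valid extent of the plane
    return t
```

A ray that crosses the plane outside the valid extent reports no intersection, which matches the "No intersection" case in the figure labels.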
Properties of R-LODs ● Compact and efficient LOD representation ● Adds only 4 bytes to the (8-byte) kd-node ● Drastic simplification ● Useful for performance improvement ● Error-controllable LOD rendering ● Error is measured in screen space in terms of pixels-of-error (PoE) ● Provides interactive rendering framework 20
Two Main Design Criteria for LOD Metric ● Controllability of visual errors ● Efficiency ● Performed per ray (not per object) ● Evaluated more than tens of millions of times 21
Visual Artifacts ● Visibility difference ● Illumination difference ● Path difference for secondary rays [Figure labels: surface deviation, projected area, curvature difference, view direction, original mesh, ray with original mesh, ray with LODs, image plane] 22
R-LOD Error Metric ● Consider two factors ● Projected screen-space area of a kd-node ● Surface deviation 23
Conservative Projection Method ● Measures the screen-space area affected by using an R-LOD ● LOD metric: $C(B)\, d_{min} > R$ [Figure labels: image plane, viewpoint, $d_{min}$, B, R, kd-node, PoE error bound, one ray beam] 24
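A hedged sketch of evaluating the metric C(B) · d_min > R for one kd-node. The slide does not define the symbols in text, so this assumes that B is the PoE bound in pixels, that C(B) is the width of a B-pixel ray beam at unit distance from the viewpoint, that d_min is the minimum distance from the viewpoint to the node's bounding box, and that R is the length of the box diagonal. All function names and the field-of-view parameterization are hypothetical.

```python
import math
from typing import Sequence


def beam_spread(poe: float, fov_y_deg: float, image_height_px: int) -> float:
    """Assumed C(B): width of a beam covering `poe` pixels at unit distance."""
    pixel_width = 2.0 * math.tan(math.radians(fov_y_deg) / 2.0) / image_height_px
    return poe * pixel_width


def min_distance_to_aabb(eye: Sequence[float], box_min: Sequence[float],
                         box_max: Sequence[float]) -> float:
    """Minimum distance from the viewpoint to an axis-aligned box (0 if inside)."""
    d2 = 0.0
    for e, lo, hi in zip(eye, box_min, box_max):
        delta = max(lo - e, 0.0, e - hi)
        d2 += delta * delta
    return math.sqrt(d2)


def rlod_is_acceptable(eye, box_min, box_max, poe: float,
                       fov_y_deg: float, image_height_px: int) -> bool:
    """True if the beam footprint at d_min exceeds the node extent R."""
    r = math.dist(box_min, box_max)  # assumed R: bounding-box diagonal
    d_min = min_distance_to_aabb(eye, box_min, box_max)
    return beam_spread(poe, fov_y_deg, image_height_px) * d_min > r
```

Under these assumptions the test costs a handful of multiplications per node visit, which fits the efficiency criterion stated for the metric. A larger PoE widens the beam, so nodes closer to the viewpoint pass the test.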
R-LODs with Different PoE Values ● PoE: Original, 1.85, 5, 10 (512 x 512, no anti-aliasing) 25
LOD Metric for Secondary Rays ● Applicable to any linear transformation ● Shadow ● Planar reflection ● Not applicable to non-linear transformation ● Refraction ● Uses more general, but expensive ray differentials [Igehy 99] 26
C0 Discontinuity between R-LODs ● Possible solutions ● Posing dependencies [Lindstrom 03, Hwa et al. 04, Yoon et al. 04, Cignoni et al. 05] ● Implicit surfaces [Wald and Seidel 05] 27
Expansion of R-LODs ● Expand the extent of the plane ● Inspired by hole-free point cloud rendering [Kalaiah and Varshney 03] ● The expansion is a function of the surface deviation (20% of the surface deviation) 28
Impact of Expansions of R-LODs [Image panels: before expansion (hole visible), after expansion, original model; PoE = 5 at 512 by 512] 29
R-LOD Construction ● Principal component analysis (PCA) ● Compute the covariance matrix for the plane of R-LODs Normal (= Eigenvector) ● Hierarchical PCA computation ● Has linear time complexity ● Accesses the original data only one time with virtually no memory overhead 30
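One way to sketch the hierarchical PCA idea: each node keeps running sums (point count, coordinate sums, sums of coordinate products), a parent merges its children's sums without revisiting the original points, and the plane normal is the eigenvector of the covariance matrix with the smallest eigenvalue. The `Moments` class and the power-iteration eigen-solver are illustrative choices, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Moments:
    n: int = 0
    s: List[float] = field(default_factory=lambda: [0.0] * 3)
    ss: List[List[float]] = field(
        default_factory=lambda: [[0.0] * 3 for _ in range(3)])

    @staticmethod
    def from_points(points: Sequence[Sequence[float]]) -> "Moments":
        """Leaf case: the only place the original points are read."""
        m = Moments()
        for p in points:
            m.n += 1
            for i in range(3):
                m.s[i] += p[i]
                for j in range(3):
                    m.ss[i][j] += p[i] * p[j]
        return m

    def merge(self, other: "Moments") -> "Moments":
        """Inner-node case: constant-time combination of two children."""
        return Moments(
            self.n + other.n,
            [a + b for a, b in zip(self.s, other.s)],
            [[a + b for a, b in zip(ra, rb)]
             for ra, rb in zip(self.ss, other.ss)])

    def covariance(self) -> List[List[float]]:
        mu = [v / self.n for v in self.s]
        return [[self.ss[i][j] / self.n - mu[i] * mu[j] for j in range(3)]
                for i in range(3)]


def plane_normal(cov: List[List[float]], iterations: int = 200) -> List[float]:
    """Eigenvector of the smallest eigenvalue, via power iteration on
    (trace * I - cov). The fixed start vector can fail if it happens to be
    orthogonal to the true normal; a real solver would be more robust."""
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    m = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)]
         for i in range(3)]
    v = [0.3, 0.5, 0.8]
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v
```

Because each merge touches a fixed number of sums, the total work over the kd-tree is proportional to the number of nodes plus the number of input points, which is consistent with the linear time complexity claimed on the slide.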
Utilizing Coherence ● Ray coherence ● Using LODs improves the utilization of SIMD traversal/intersection ● Cache coherence ● Use cache-oblivious layouts of bounding volume hierarchies [Yoon and Manocha 06] ● 10% ~ 60% performance improvement 31
Outline ● R-LODs for ray tracing ● Results 32
Implementation ● Uses common optimized kd-tree construction methods ● Based on surface-area heuristic [MacDonald and Booth 90, Havran 00] ● Out-of-core computation ● Decompose an input model into a set of clusters [Yoon et al. 04] 33
Preprocessing ● Simplification computation speed ● Very fast due to its linear complexity (3 M triangles per min) ● Memory overhead ● Requires 33% more storage over the optimized kd-tree representation [Wald 04] ● Runtime overhead ● 5% compared to non-LOD version of the same efficient ray tracer 34
Impact of R-LODs ● 10X speedup [Chart: # of intersected nodes per ray, render time, and working set size for PoE = 0 (no LOD) vs. PoE = 2.5] 35
Real-time Captured Video – St. Matthew Model 512 x 512, 2 x 2 anti-aliasing, PoE = 4 36
Pros and Cons ● Limitations ● Does not handle advanced materials such as BRDFs ● No guarantee that there are no holes ● Advantages ● Simplicity ● Interactivity ● Efficiency 37
Conclusion ● LOD-based ray tracing method ● R-LOD representation ● Efficient LOD error metric ● Hierarchical LOD construction method with a linear time complexity ● Reduce the working set size 38
Ongoing and Future Work ● Investigate an efficient use of implicit surfaces ● Allow approximate visibility ● Extend to global illumination 39
Acknowledgements ● Model contributors ● Funding agencies ● Army Research Office ● DARPA ● Lawrence Livermore National Laboratory ● National Science Foundation ● Office of Naval Research ● RDECOM ● Intel ● Microsoft 40
Acknowledgements ● Eric Haines ● Martin Isenburg ● Dawoon Jung ● David Kasik ● Peter Lindstrom ● Matt Pharr ● Ingo Wald ● Anonymous reviewers 41
Questions? Thanks! 42
UCRL-PRES-223086 This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-ENG-48. 43
Additional slides 44
Goal ● Perform interactive ray tracing of massive models ● Handle various kinds of polygonal meshes (e.g., scanned data and CAD) [Models shown: Double Eagle tanker (82 M triangles), St. Matthew (372 M triangles), Isosurface (472 M)] 45
Memory Hierarchies (size, speed) ● Register: 1 KB, 10^0 ns ● Caches: 1 MB, 10^1 ns ● Main memory: 1 GB, 10^2 ns ● Disk storage: > 1 GB, 10^4 ns 46
Hierarchical R-LOD Computation with Linear Time Complexity [Equation image not recovered] where $x_k$, $y_k$ are the x, y coordinates of the kth point [Figure: sums of two child nodes combined with "+"] 47
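The slide's own equation did not survive extraction. The standard identity below is one plausible reading of it, given that the covariance matrix is built from the point coordinates $x_k$, $y_k$; the $1/n$ normalization and the child labels $A$, $B$ are assumptions.

```latex
% Covariance entry expressed through running sums (standard identity)
\sigma_{xy}
  = \frac{1}{n}\sum_{k=1}^{n}(x_k-\mu_x)(y_k-\mu_y)
  = \frac{1}{n}\sum_{k=1}^{n} x_k y_k \;-\; \mu_x\,\mu_y,
\qquad
\mu_x = \frac{1}{n}\sum_{k=1}^{n} x_k,\quad
\mu_y = \frac{1}{n}\sum_{k=1}^{n} y_k

% A parent node with children A and B only adds the children's sums
n = n_A + n_B,\qquad
\sum_{k} x_k = \sum_{k\in A} x_k + \sum_{k\in B} x_k,\qquad
\sum_{k} x_k y_k = \sum_{k\in A} x_k y_k + \sum_{k\in B} x_k y_k
```

Since every sum for a parent is obtained by adding two stored values, the original points are read only once at the leaves.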
Performance Comparison – St. Matthew Model 2 ~ 20 X improvements Non-LOD Render time (sec) LOD Approaching the model for every frame 48
Image Quality Comparison – St. Matthew Model 512 x 512, no anti-aliasing [Image panels: LOD (PoE = 4), Non-LOD] 49
Further Information ● R-LODs: Fast LOD-based Ray Tracing of Massive Models ● S. Yoon, C. Lauterbach, and D. Manocha ● (To appear at) Pacific Graphics (The Visual Computer) 2006 50
Recent Advances for Interactive Ray Tracing ● Hardware improvements ● Exponential growth of computation power ● Multi-core architectures ● Algorithmic improvements ● Efficient ray coherence techniques [Wald et al. 01, Reshetov et al. 05] These improvements may not provide an efficient solution to our problem! 51
Ray Coherence Techniques ● Assume coherences between spatially coherent rays ● Works well with CAD or architectural models ● Highly-tessellated models ● There may not be much coherence between rays Image plane Small triangles Viewpoint Rays per each pixel 52
Ray Coherence Techniques ● Models with large primitives ● Group rays and test intersections between the group and a bounding box Image plane Large triangles Viewpoint Ray beams 53
Ray Coherence Techniques ● Highly tessellated models ● Fall back to the normal ray tracing ● Causes incoherent memory accesses and temporal aliasing [Figure labels: image plane, small triangles, viewpoint, ray beams] 54
Runtime Traversal with R-LODs ● Built on top of the efficient kd-tree traversal algorithm [Wald 04] ● At a kd-node with an R-LOD, check whether the error metric is met ● If it is, check whether there is an intersection ● If intersected, return shading info ● Otherwise, stop traversal [Legend: kd-node w/ R-LOD, kd-node w/o R-LOD] 55
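The traversal steps on this slide can be sketched as a recursive function. This is a simplified illustration, not the traversal of [Wald 04]: a real kd-tree traversal orders children per ray and clips the ray interval against the split plane, which is abstracted here by assuming `near` is the child the ray reaches first. `KdNode`, `traverse`, and the three callbacks are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class KdNode:
    name: str
    near: Optional["KdNode"] = None  # assumed: child the ray reaches first
    far: Optional["KdNode"] = None
    has_rlod: bool = False


def traverse(node: Optional[KdNode],
             error_met: Callable[[KdNode], bool],
             hit_rlod: Callable[[KdNode], Any],
             hit_leaf: Callable[[KdNode], Any]) -> Any:
    """Return shading info for the first hit, or None if nothing is hit."""
    if node is None:
        return None
    if node.has_rlod and error_met(node):
        # Use the R-LOD in place of the subtree: return its hit if there is
        # one; otherwise stop traversal below this node.
        return hit_rlod(node)
    if node.near is None and node.far is None:
        return hit_leaf(node)  # leaf: test the original triangles
    hit = traverse(node.near, error_met, hit_rlod, hit_leaf)
    if hit is not None:
        return hit
    return traverse(node.far, error_met, hit_rlod, hit_leaf)
```

Returning immediately at a node whose R-LOD passes the error test is what reduces both the number of visited nodes and the working set, since the subtree below is never touched.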
Two Main Design Criteria for LOD Metric ● Controllability of visual errors ● Efficiency ● Model complexity: 100 M triangles (kd-tree at least 27 levels deep) ● Image resolution: 1 K by 1 K (= 1 M rays) ● 27 M (= 1 M x 27) LOD metric evaluations! 56
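The figures on this slide can be checked with a short calculation, assuming a binary kd-tree with roughly one triangle per leaf and one LOD metric evaluation per level for each primary ray (both are simplifying assumptions).

```latex
% Depth of a binary tree over 10^8 triangles
2^{26} = 67{,}108{,}864 \;<\; 10^{8} \;<\; 134{,}217{,}728 = 2^{27}
\quad\Longrightarrow\quad
\lceil \log_2 10^{8} \rceil = 27

% Metric evaluations per frame for 10^6 primary rays
10^{6}\ \text{rays} \times 27\ \text{levels} = 2.7 \times 10^{7}\ \text{evaluations}
```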
Surface Deviation ● Combined with the previous projected-space error bound R to give a new R [Figure labels: R, new R, plane of R-LOD, underlying geometry] 57
Properties of R-LODs ● Compact and efficient LOD representation ● Add only 4 bytes to (8 bytes) kd-node ● Drastic simplification ● Useful for performance improvement ● Recursively simplify 23 triangles into one R-LOD kd-node w/o R-LOD Simplify 58
Image Quality Comparison – Forest Model (32 M Triangles) ● 4X speedup [Image panels: PoE = 0 (no LOD); PoE = 4 and cache-oblivious layout of kd-tree; shading difference] 59
Image Quality Comparison – Forest Model [Image panels: PoE = 0 (no LOD); PoE = 16; shading difference] 60
R-LODs with Different PoE Values ● PoE: Original, 40, 80 (512 x 512 image resolution) 61