High-Quality Unstructured Volume Rendering on the PC Platform

High-Quality Unstructured Volume Rendering on the PC Platform
Stefan Guthe, WSI/GRIS, University of Tübingen
Stefan Röttger, Andreas Schieber, IfI/VIS, University of Stuttgart
Hardware Workshop 2002

Overview
Introduction
  • Motivation
  • Cell Projection
High Resolution Ray Integral
  • Opacity Reconstruction
  • Chromaticity Reconstruction
Hardware Accelerated Pre-Integration
Results & Conclusion

Introduction

Motivation
Tetrahedral Meshes
  • Common for numerical simulations
  • Adaptive resolution
  • Straightforward multiresolution algorithms
General purpose hardware
  • Widely available
  • Fast polygonal rendering
  • Flexible fragment shading for recent generations
  • Fast development of future generations
  • Cheap compared to special purpose hardware

Cell Projection
Projected Tetrahedra (PT) Algorithm
  • Shirley and Tuchman ’90
  • Classify tetrahedra based on profile of projection
  • Split tetrahedra into 3 or 4 triangles
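The classification step can be illustrated with a minimal Python sketch. It covers only the two non-degenerate profiles: if one projected vertex falls inside the triangle of the other three, the tetrahedron splits into 3 triangles, otherwise the silhouette is a quadrilateral and it splits into 4. The function names and the 2D point representation are assumptions made for illustration; degenerate projections (a vertex landing exactly on an edge or on another vertex) are not handled.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def inside(p, a, b, c):
    """True if p lies strictly inside triangle abc."""
    d1, d2, d3 = orient(a, b, p), orient(b, c, p), orient(c, a, p)
    return (d1 > 0 and d2 > 0 and d3 > 0) or (d1 < 0 and d2 < 0 and d3 < 0)


def triangle_count(projected):
    """Number of triangles for a non-degenerate projected tetrahedron.

    `projected` is a list of four (x, y) screen-space vertices.
    """
    for i in range(4):
        others = [projected[j] for j in range(4) if j != i]
        if inside(projected[i], *others):
            return 3  # one vertex projects inside the triangle of the others
    return 4          # the silhouette is a quadrilateral


print(triangle_count([(0, 0), (4, 0), (0, 4), (1, 1)]))  # vertex inside a triangle
print(triangle_count([(0, 0), (2, 0), (2, 2), (0, 2)]))  # quadrilateral silhouette
```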

Cell Projection
Projected Tetrahedra (PT) Algorithm
  • Render projected profiles
  • Chromaticity vector
  • Scalar optical density
  • Resulting ray integral
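The formulas on this slide did not survive extraction. As a hedged reminder of the usual emission-absorption form, the ray integral over one tetrahedron can be written as below. The symbols are assumptions, not taken from the slide: f(t) is the scalar value along the ray, κ the chromaticity vector, ρ the scalar optical density, l the length of the ray segment inside the cell, and s_f, s_b the scalar values at the front and back faces.

```latex
% Assumed notation: f(t) scalar value along the ray, \kappa chromaticity vector,
% \rho scalar optical density, l ray segment length inside the tetrahedron.
\[
  \mathbf{I} = \int_0^{l} \boldsymbol{\kappa}\bigl(f(t)\bigr)\,\rho\bigl(f(t)\bigr)\,
      \exp\!\Bigl(-\int_0^{t} \rho\bigl(f(u)\bigr)\,du\Bigr)\,dt,
  \qquad
  \alpha = 1 - \exp\!\Bigl(-\int_0^{l} \rho\bigl(f(t)\bigr)\,dt\Bigr)
\]
% Inside a tetrahedron the scalar field varies linearly along the ray:
\[
  f(t) = s_f + \frac{t}{l}\,(s_b - s_f)
\]
```

Under these assumptions the integral depends only on s_f, s_b and l, which is why a three-dimensional pre-integration table is enough.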

Ray Integral

Opacity Reconstruction
Approximation of opacity
  • Corresponding portion of the ray integral
  • Original approximation
      – Calculate correct values for vertices
      – Interpolate linearly between vertices
  • Improvement by Stein et al. ’94
      – Calculate average extinction coefficient
      – Use texture map for exponential lookup
      – Linear opacity or piecewise linear (HIAC ’98)
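The exponential lookup can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of Stein et al.: the opacity is taken as 1 - exp(-average extinction × thickness), and the table size, the clamped range `max_depth`, and all function names are hypothetical.

```python
import math


def opacity_exact(avg_extinction, length):
    """Opacity of a ray segment with constant (average) extinction."""
    return 1.0 - math.exp(-avg_extinction * length)


def build_exp_table(size=256, max_depth=8.0):
    """1D table of 1 - exp(-x) for x in [0, max_depth], standing in for the texture map."""
    return [1.0 - math.exp(-max_depth * i / (size - 1)) for i in range(size)]


def opacity_lookup(table, avg_extinction, length, max_depth=8.0):
    """Linearly filtered table lookup, as a texture unit would perform it."""
    x = min(avg_extinction * length, max_depth)
    pos = x / max_depth * (len(table) - 1)
    i = min(int(pos), len(table) - 2)
    frac = pos - i
    return table[i] * (1.0 - frac) + table[i + 1] * frac


table = build_exp_table()
print(opacity_exact(0.5, 2.0), opacity_lookup(table, 0.5, 2.0))
```

With a 256-entry table the filtered lookup stays close to the closed-form value; a coarser table or a larger clamped range trades accuracy for memory.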

Opacity Reconstruction
Approximation of opacity
  • Corresponding portion of the ray integral
  • Further improvements
      – 2D texture map for lookup of average extinction
      – 1D dependent texture lookup
      – No restriction to linear opacity

Opacity Reconstruction
Approximation of opacity (GeForce 4)
  • Texture setup (columns: unit, coordinates, RGB, A)
      – Unit 0: RGB holds chrom. (RGA)
      – Unit 1: coordinates 0, 0, l
  • Pixel shader
      ps.1.3
      def       c0, 1, 1, 0, 0
      tex       t0                     // load chromaticity and density
      texdp3tex t1, t0                 // dependent lookup
      lrp       r0.rgb, c0, t0, t0.a   // extract chromaticity...
      +mov      r0.a, t1.a             // and alpha for final color
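To make the register arithmetic of this shader easier to follow, here is a small Python emulation. It rests on assumptions that the garbled slide does not fully confirm: the texel of unit 0 stores chromaticity in R, G and A with the density in B, the dot-product lookup multiplies the coordinates (0, 0, l) with the texel's RGB, and `lrp` takes the texel as its second source. The function names and the nearest-neighbour table lookup are hypothetical.

```python
def lrp(s0, s1, s2):
    """ps.1.x lrp: per-component s0 * s1 + (1 - s0) * s2."""
    return tuple(a * b + (1 - a) * c for a, b, c in zip(s0, s1, s2))


def shade(t0, l, opacity_table):
    """Emulate the opacity shader for one fragment.

    t0 is the (R, G, B, A) texel of unit 0: assumed to hold chromaticity in
    R, G, A and density in B. l is the normalized ray segment length.
    """
    r, g, b, a = t0
    # Dot product of the coordinates (0, 0, l) with t0.rgb yields l * density.
    s = min(0.0 * r + 0.0 * g + l * b, 1.0)
    # Dependent 1D lookup (nearest neighbour for brevity).
    index = min(int(s * (len(opacity_table) - 1) + 0.5), len(opacity_table) - 1)
    alpha = opacity_table[index]
    # lrp with c0 = (1, 1, 0): keep R and G, replace B by A.
    rgb = lrp((1.0, 1.0, 0.0), (r, g, b), (a, a, a))
    return rgb + (alpha,)


print(shade((0.2, 0.4, 0.5, 0.6), 1.0, [0.0, 0.5, 1.0]))
```

The constant (1, 1, 0) acts as a per-channel selector, which is how a single `lrp` can undo the B/A swap without extra instructions.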

Opacity Reconstruction
Approximation of opacity (Radeon 8500)
  • Texture setup (columns: unit, coordinates, RGB, A)
      – Unit 0: RGB holds chromaticity
      – Unit 1: coordinates 0, 0, l
  • Pixel shader
      ps.1.4
      texld   r0, t0            // load chromaticity and density
      texcrd  r1, t1            // pass l into register
      mul     r1, r0.a, r1.b    // multiply density and l
      phase
      texpass r0, r0            // transfer chromaticity
      texld   r1, r1            // dependent lookup
      mov     r0.a, r1.a        // correct alpha for final color

Chromaticity Reconstruction
Approximation of chromaticity
  • Corresponding portion of the ray integral
  • Original approximation
      – Calculate correct values for vertices
      – Interpolate linearly between vertices
  • Improvement in HIAC ’98
      – Calculate values for slices through tetrahedra
      – Texture lookup instead of linear interpolation
      – Support of piecewise linear transfer functions

Chromaticity Reconstruction
Approximation of chromaticity
  • Corresponding portion of the ray integral
  • Improvement by Roettger et al. ’00
      – 3D texture for chromaticity and opacity
      – Slow update of transfer function
      – High memory requirements of 3D textures
      – Accurate only for small tetrahedra due to limited resolution of pre-integration table

Chromaticity Reconstruction
Approximation of chromaticity
  • Corresponding portion of the ray integral
  • Different approach
      – Higher order polynomials in l
      – Number of triangles equal to PT
      – Only 4 slices for cubic polynomials
          – Higher resolution table → high image quality
          – Faster update of transfer function
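The idea of replacing the l axis of the table by a polynomial can be sketched as follows. This is a minimal Python illustration, assuming the cubic is made to pass exactly through four slices of constant l (the slides do not say how the coefficients are obtained, and a later slide notes that the optimal fit would be least squares). The function names are hypothetical; evaluation uses Horner's scheme, one multiply-add per coefficient.

```python
def fit_polynomial(ls, values):
    """Coefficients (highest power first) of the polynomial through (ls[i], values[i])."""
    n = len(ls)
    # Augmented Vandermonde system, solved by Gauss-Jordan elimination with partial pivoting.
    rows = [[l ** (n - 1 - k) for k in range(n)] + [v] for l, v in zip(ls, values)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def horner(coeffs, l):
    """Evaluate the polynomial with one multiply-add per coefficient."""
    result = 0.0
    for c in coeffs:
        result = result * l + c
    return result


# Four slices of one table entry at l = 0, 1/3, 2/3, 1 (sample values are made up).
slice_ls = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
slice_values = [0.0, 0.28, 0.49, 0.63]
coeffs = fit_polynomial(slice_ls, slice_values)
print([round(horner(coeffs, l), 6) for l in slice_ls])
```

Only the four coefficient slices have to be recomputed when the transfer function changes, instead of a full three-dimensional table.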

Opacity Reconstruction
Approximation of chromaticity (GeForce 4)
  • Texture setup (B and A swapped for unit 0; columns: unit, coordinates, RGB, A)
      – Units 0, 1, 2, 3
      – Unit 3: coordinates 0, 0, l
  • Additionally store l in primary color alpha
  • Distribution of duplicate values via vertex shader

Chromaticity Reconstruction
Approximation of chromaticity (GeForce 4)
  • Pixel shader
      ps.1.3
      def       c0, 1, 1, 0, 0
      tex       t0                     // load chromaticity and density
      tex       t1
      tex       t2
      texdp3tex t3, t0                 // dependent lookup
      lrp       r0.rgb, c0, t0, t0.a   // extract chromaticity
      mad       r0.rgb, v0.a, r0, t1   // calculate polynomial...
      mad       r0.rgb, v0.a, r0, t2
      +mov      r0.a, t1.a             // and get alpha for final color

Opacity Reconstruction
Approximation of chromaticity (Radeon 8500)
  • Texture setup (columns: unit, coordinates, RGB, A)
      – Units 0, 1, 2, 3, 4, 5
      – Unit 5: coordinates 0, 0, l
  • Additionally store l in primary color alpha

Chromaticity Reconstruction
Approximation of chromaticity (Radeon 8500)
  • Pixel shader
      ps.1.4
      texld   r0, t0                 // load chromaticity and density
      texcrd  r5, t5                 // pass l into register
      mul     r5, r0.a, r5.b         // multiply density and l
      phase
      texld   r1, t1                 // load other coefficients
      texld   r2, t2
      texld   r3, t3
      texld   r4, t4
      texld   r5, r5                 // dependent lookup
      mad     r0.rgb, v0.a, r0, r1   // calculate polynomial...
      mad     r0.rgb, v0.a, r0, r2
      mad     r0.rgb, v0.a, r0, r3
      mad     r0.rgb, v0.a, r0, r4
      +mov    r0.a, r1.a             // and get alpha for final color

Chromaticity Reconstruction
Problems of this approach
  • Limited precision of pixel shader could be a problem → normalize coefficients
  • Additional vertex shader needed
  • Optimal approximation requires a least-squares fit for chromaticity (infeasible)
  • Part of the three-dimensional pre-integration table needs to be computed
  • Interactive change of classification no longer possible with software-only calculation of approximation textures

HW Acceleration

HW Accelerated Pre-Int.
Hardware accelerated pre-integration
  • Use blending capabilities of graphics card
  • Construct pre-integration table slice by slice (l constant)
Problems
  • High error with few blending operations
  • Slow, due to large amount of frame buffer writes
  • High error with lots of blending operations
      – Accuracy of 8 bits too low
      – No problem with new floating point hardware
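The precision problem with many blending operations can be reproduced with a small Python experiment. It is a simplified sketch, not the pre-integration pass itself: it accumulates only opacity for a constant extinction, and rounds the accumulator to a fixed number of bits after every blend to mimic a frame buffer write. The function name and parameters are assumptions.

```python
import math


def blend_opacity(total_depth, steps, bits=None):
    """Accumulate opacity over `steps` blends; round to `bits` bits after each if given."""
    step_alpha = 1.0 - math.exp(-total_depth / steps)
    levels = (1 << bits) - 1 if bits else None
    acc = 0.0
    for _ in range(steps):
        acc = acc + (1.0 - acc) * step_alpha  # standard "over" accumulation
        if levels:
            acc = round(acc * levels) / levels  # frame buffer quantization
    return acc


exact = 1.0 - math.exp(-1.0)
print("exact   ", exact)
print("float   ", blend_opacity(1.0, 1000))
print("8 bits  ", blend_opacity(1.0, 1000, bits=8))
print("12 bits ", blend_opacity(1.0, 1000, bits=12))
```

With 1000 steps each contribution is smaller than half an 8-bit quantization level, so the 8-bit accumulator never leaves zero, while the unquantized accumulation reproduces the closed-form value.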

HW Accelerated Pre-Int.
High accuracy pre-integration
  • Use high internal precision of pixel shader
  • Create pre-integration table using 12-bit values
  • Perform multiple blending operations at once
      – 4 blending operations in one step → speedup of approximately 2
  • Store high precision values in two 8-bit values
      – Lose some instructions to combining and splitting high precision values
      – No alpha blending → ping-pong rendering
      – Separate passes for R, G and B

HW Accelerated Pre-Int.
Comparison of software and hardware pre-integration
  • Speedup of about 700%
  • Relatively low error
  [Images: hardware, software, difference 8]

HW Accelerated Pre-Int.
HW accelerated pre-integration (Radeon 8500)
  • Pixel shader combine
      def    c0, 0.0019608, 0, 0, 0          // 1/256
      mad    r0, r0.ggaa, c0.r, r0.rrbb      // combine values
  • Use R and B for calculations
  • Multiply result by 8 during last blending operation → faster split
  • Pixel shader split
      add_x8 r0.ga, r0_x2.rrbb, r0_x2.rrbb   // get low bits
      mov_d8 r0.rb, r0.rrbb                  // get high bits
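The principle of carrying one 12-bit value in two 8-bit channels can be shown with integer arithmetic in Python. The bit layout below (top 8 bits in one byte, the remaining 4 bits left-aligned in a second byte) is an assumption chosen for clarity; the shader on the slide works on normalized channel values with scale modifiers and may distribute the bits differently.

```python
def split12(value):
    """Split a 12-bit integer (0..4095) into (high, low) bytes."""
    high = value >> 4            # top 8 bits
    low = (value & 0xF) << 4     # bottom 4 bits, left-aligned in the second byte
    return high, low


def combine12(high, low):
    """Rebuild the 12-bit integer from the two bytes produced by split12."""
    return (high << 4) | (low >> 4)


print(split12(4095))
print(combine12(*split12(2748)))
```

Because splitting and combining are exact inverses, the only cost of this storage scheme is the extra instructions spent at the start and end of every pass.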

HW Accelerated Pre-Int.
HW accelerated pre-integration (Radeon 8500)
      ps.1.4
      def    c0, 0.0019608, 0, 0, 0          // 1/256
      texld  r1, t1                          // previous data
      ...
      texld  r4, t4                          // 4 samples
      mad    r0, r0.ggaa, c0.r, r0.rrbb      // combine values
      mad    r1, r1.ggaa, c0.r, r1.rrbb
      mad    r2, r2.ggaa, c0.r, r2.rrbb
      mad    r3, r3.ggaa, c0.r, r3.rrbb
      mad    r4, r4.ggaa, c0.r, r4.rrbb
      phase
      mad    r1.rb, r0, 1-r1.b, r1           // perform blending
      mad    r2.rb, r1, 1-r2.b, r2
      mad    r3.rb, r2, 1-r3.b, r3
      mad_x8 r4.rb, r3, 1-r4.b, r4
      add_x8 r0.ga, r4_x2.rrbb, r4_x2.rrbb   // get low bits
      mov_d8 r0.rb, r4.rrbb                  // get high bits

Results

Results
  • Prototype (12,936 tetras): 1280×960 at 89.45 fps
  • Buckyball (176,856 tetras): 1280×960 at 2.46 fps

Results
  • Blunt fin (187,395 tetras): 1280×960 at 3.18 fps

Results
  • Bonsai (538,937 tetras): 1280×960 at 1.20 fps
  • Trumpet (1,567,755 tetras): 1280×960 at 0.48 fps

Conclusion
Algorithm overview
  • Dependent texture for opacity
  • Polynomial approximation of chromaticity
  • High resolution pre-integration table
  • High quality rendering
  • 4 slices for cubic approximation
  • 6 slices for fifth order approximation
  • Fast update whenever transfer function changes
  • Number of triangles equal to original PT algorithm
  • Fast rendering

Conclusion
Migration to new graphics hardware
  • HW pre-integration
      – Floating point accuracy improves
      – More blending steps at once → even more performance gain
  • Image quality will be improved by floating point frame buffer precision
  • No dependent texture lookup due to exp-function in pixel shader

Questions?
