SIGGRAPH 2009 RenderAnts: Interactive REYES Rendering on GPUs
- Slides: 52
SIGGRAPH 2009
RenderAnts: Interactive REYES Rendering on GPUs
Kun Zhou, Minmin Gong, Qiming Hou, Zhong Ren, Xin Sun, Baining Guo
Jaehyun Cho
Outline
● REYES Rendering
● System Overview
● GPU REYES Rendering
● Dynamic Scheduling
● Multi-GPU Rendering
● Results
● Conclusion
REYES rendering
● “Renders Everything You Ever Saw”
  ● Developed in the 1980s by Carpenter and Cook
  ● Produces photo-realistic images
● Main idea
  ● Subdivide every primitive into micropolygons
● In use by Pixar
  ● PhotoRealistic RenderMan (PRMan)
Basic REYES pipeline
[Diagram: Modeling / Application -> primitives -> Bucketing -> Bound -> Too Large? (Yes: Split, then back to Bound; No: Dice) -> unshaded micropolygons -> Shade -> shaded micropolygons -> Sample -> visible points -> Composite & Filter -> pixels]
System overview
● Map all basic REYES stages to the GPU
● Add 3 dynamic scheduling stages
● Support multi-GPU rendering
RenderAnts system pipeline
Bound/Split and Dice
● Bound/Split
  ● All input primitives are stored in a queue
  ● Primitives in the queue are bounded and split in parallel
● Dice
  ● Primitives in the dicing region are subdivided into micropolygons in parallel
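
The Bound/Split queue and the Dice step can be sketched sequentially on the CPU. This is only an illustration under stated assumptions: the `Prim` type, the 1D interval used as a bound, the `max_size` threshold, and the midpoint split rule are all hypothetical and not taken from the slides, and the GPU version processes the queue in parallel rather than one item at a time.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Prim:
    """Hypothetical primitive: a 1D screen-space interval [lo, hi]."""
    lo: float
    hi: float

    def size(self) -> float:
        return self.hi - self.lo


def bound_split(prims, max_size):
    """Split primitives until every bound is at most max_size (max_size > 0)."""
    queue = deque(prims)
    diceable = []
    while queue:
        p = queue.popleft()
        if p.size() > max_size:
            # Too large: split at the midpoint and re-queue both halves.
            mid = 0.5 * (p.lo + p.hi)
            queue.append(Prim(p.lo, mid))
            queue.append(Prim(mid, p.hi))
        else:
            diceable.append(p)
    return diceable


def dice(prim, n):
    """Subdivide a diceable primitive into n equal micro-segments."""
    step = prim.size() / n
    return [Prim(prim.lo + i * step, prim.lo + (i + 1) * step) for i in range(n)]
```

For example, `bound_split([Prim(0.0, 4.0)], 1.0)` yields four unit-length primitives, each of which can then be passed to `dice`.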
Shade
● Main idea
  ● Translate RenderMan shader instructions to GPU shader instructions
  ● Use a shader compiler
  ● Shade each vertex of the micropolygons
Shade
● Out-of-core texture fetch
  ● Textures are too large to load into GPU memory all at once
  ● Use a CPU-side cache manager
  ● If a texture is not in the cache, interrupt the GPU; the cache manager then reads it from disk and copies it to the GPU
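
The cache-miss path described above can be sketched as a small cache manager. The slides do not say how the cache is organized or which eviction policy is used, so the tile granularity, the least-recently-used eviction, and the names `TextureCache`, `fetch`, and `load_from_disk` are all assumptions made for illustration.

```python
from collections import OrderedDict


class TextureCache:
    """Hypothetical CPU-side cache manager with LRU eviction (an assumption)."""

    def __init__(self, capacity, load_from_disk):
        self.capacity = capacity              # max number of resident tiles
        self.load_from_disk = load_from_disk  # callable: tile_id -> tile data
        self.tiles = OrderedDict()            # insertion order tracks recency
        self.misses = 0

    def fetch(self, tile_id):
        if tile_id in self.tiles:
            # Hit: mark the tile as most recently used.
            self.tiles.move_to_end(tile_id)
            return self.tiles[tile_id]
        # Miss: this is where the GPU would be interrupted while the
        # tile is read from disk and copied over.
        self.misses += 1
        data = self.load_from_disk(tile_id)
        self.tiles[tile_id] = data
        if len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)    # evict least recently used
        return data
```

Each miss corresponds to one interruption of the GPU, so the `misses` counter gives a rough sense of how often shading would stall for a given cache capacity.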
Sample
● Main idea
  ● Each pixel in the sampling region is divided into subpixels
  ● If a micropolygon covers the sample location of a subpixel, compute and store a sample point
[Figure labels: sample point of left micropolygon; sample point of right micropolygon]
Sample
● Compute sample point
  ● Interpolate the color, opacity, and depth values of the micropolygon at the sample location
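
One way to picture the interpolation step is bilinear interpolation over a quadrilateral micropolygon. This is a sketch under assumptions: the slides do not state the interpolation scheme, and the function names and the local coordinates `(u, v)` of the sample location inside the micropolygon are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def interpolate_sample(c00, c10, c01, c11, u, v):
    """Bilinearly interpolate per-vertex attributes at local coordinates (u, v).

    Each corner is a tuple of floats, e.g. (r, g, b, opacity, depth).
    """
    bottom = [lerp(a, b, u) for a, b in zip(c00, c10)]
    top = [lerp(a, b, u) for a, b in zip(c01, c11)]
    return tuple(lerp(a, b, v) for a, b in zip(bottom, top))
```

The same call handles color, opacity, and depth together because all attributes of a vertex are packed into one tuple.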
Composite & Filter
● Composite
  ● Sort the sample points of each subpixel in depth order
  ● Composite the sample points of each subpixel in depth order, in parallel across subpixels, until the depth of the subpixel is reached
● Filter
  ● For each pixel, blend the color and opacity of its subpixels, in parallel across pixels
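
The two steps above can be sketched for a single subpixel and a single pixel. The assumptions here are labeled loudly: color is a single float, opacity is non-premultiplied, compositing is front-to-back "over" that stops once the subpixel is fully opaque (one reading of the stopping condition on the slide), and the filter is a plain box average. None of these choices is confirmed by the slides.

```python
def composite_subpixel(samples):
    """Composite (depth, color, opacity) samples front to back.

    Returns (color, opacity) for the subpixel.
    """
    color = 0.0
    transmittance = 1.0  # fraction of light still passing through
    for _depth, c, a in sorted(samples, key=lambda s: s[0]):
        color += transmittance * a * c
        transmittance *= 1.0 - a
        if transmittance <= 0.0:
            break  # fully opaque: farther samples are hidden
    return color, 1.0 - transmittance


def filter_pixel(subpixels):
    """Box-filter a list of (color, opacity) subpixels into one pixel."""
    n = len(subpixels)
    color = sum(c for c, _ in subpixels) / n
    opacity = sum(a for _, a in subpixels) / n
    return color, opacity
```

On the GPU each subpixel (and then each pixel) would be handled by its own thread; the loops here only show the per-element work.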
Advanced features
● Shadows
  ● Use shadow maps generated in a shadow pass
● Motion blur & depth of field
  ● Use an accumulation buffer
  ● Assign a unique sample time to each subpixel
  ● Sample the subpixels whose sample time equals the current rendering time
Dynamic scheduling
● Main idea
  ● Maximize parallelism at each stage
  ● Estimate memory requirements at each stage
Dicing scheduler
● Main factor in memory requirements
  ● Total micropolygon data
● Estimating memory requirements
  ● The total # of micropolygons is computed from the total # of primitives
Dicing scheduler
● Main idea
  ● Split the current bucket into dicing regions
  ● Continue until the # of primitives in the region being processed fits in the available GPU memory
  ● Use binary space partitioning (BSP)
How to split dicing region?
● Let the # of primitives that fit in GPU memory = 2
[Figure, three steps: a bucket and its primitives are split into subregions; labels: bucket, subregion, primitive]
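
The splitting rule illustrated above can be sketched as a recursive BSP over the bucket. This is a minimal sketch under assumptions: primitives are represented by axis-aligned bounding boxes, the split plane is the midpoint of the longer axis, and `max_depth` is a hypothetical guard for primitives that no split can separate. The function names are illustrative, not from the slides.

```python
def overlaps(box, region):
    """True if two axis-aligned boxes (x0, y0, x1, y1) overlap."""
    bx0, by0, bx1, by1 = box
    rx0, ry0, rx1, ry1 = region
    return bx0 < rx1 and bx1 > rx0 and by0 < ry1 and by1 > ry0


def split_region(region, prims, max_prims, depth=0, max_depth=16):
    """Split region until each leaf holds at most max_prims primitives.

    Returns a list of (region, primitives_in_region) leaves.
    """
    inside = [p for p in prims if overlaps(p, region)]
    if len(inside) <= max_prims or depth >= max_depth:
        return [(region, inside)]
    x0, y0, x1, y1 = region
    if x1 - x0 >= y1 - y0:
        mid = 0.5 * (x0 + x1)  # split the longer (x) axis
        halves = [(x0, y0, mid, y1), (mid, y0, x1, y1)]
    else:
        mid = 0.5 * (y0 + y1)  # split the longer (y) axis
        halves = [(x0, y0, x1, mid), (x0, mid, x1, y1)]
    leaves = []
    for half in halves:
        leaves.extend(split_region(half, inside, max_prims, depth + 1, max_depth))
    return leaves
```

With `max_prims=2`, as in the slide example, a bucket holding three well-separated primitives is split once and both halves then fit.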
Shading scheduler
● Main factor in memory requirements
  ● Temporary data allocated during shader execution
● Estimating memory requirements
  ● Different shaders require different sizes of temporary data
Shading scheduler
● Main idea
  ● Split the micropolygon list into sublists
  ● Continue until the # of micropolygons for the current shader execution fits in the available GPU memory
  ● Schedule separately for each shader execution
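
A simple way to picture this scheduler is to chunk the micropolygon list by a per-shader memory budget. Note that this sketch swaps in direct fixed-size chunking instead of repeated splitting, and the parameter names (`temp_bytes_per_mp`, `available_bytes`) and the idea of a constant per-micropolygon cost are assumptions for illustration only.

```python
def schedule_shading(num_micropolygons, temp_bytes_per_mp, available_bytes):
    """Return (start, end) index ranges whose temporary data fits in memory.

    temp_bytes_per_mp is the hypothetical temporary storage one shader
    needs per micropolygon; it differs from shader to shader.
    """
    if temp_bytes_per_mp <= 0:
        raise ValueError("temp_bytes_per_mp must be positive")
    batch = available_bytes // temp_bytes_per_mp
    if batch == 0:
        raise MemoryError("not enough memory to shade a single micropolygon")
    return [
        (start, min(start + batch, num_micropolygons))
        for start in range(0, num_micropolygons, batch)
    ]
```

Because the budget depends on the shader, the function would be called once per shader execution, with that shader's own `temp_bytes_per_mp`.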
Sampling scheduler
● Main factor in memory requirements
  ● Total data of the subpixel framebuffer and the sample points
● Estimating memory requirements
  ● The framebuffer size equals the region size
  ● Use a line-scanning process to estimate the # of sample points
Sampling scheduler
● Main idea
  ● Split the current dicing region into sampling regions
  ● Continue until the # of sample points in the region being processed plus the region size fits in the available GPU memory
  ● Use binary space partitioning (BSP)
Multi-GPU rendering
● Main idea
  ● Minimize inter-GPU communication
  ● Balance workloads among GPUs
How to minimize inter-GPU communication?
● Each GPU maintains a complete list of all primitives in a bucket
● Only region descriptions are transferred
How to minimize inter-GPU communication?
● Let A, B, C denote the GPUs
[Figure, three steps: regions of a bucket assigned to GPUs A, B, and C; labels: bucket, subregion]
How to balance workloads among GPUs?
● Split a region when both conditions hold
  ● The # of primitives > threshold
  ● An idle GPU exists
How to balance workloads among GPUs?
● Let threshold = 2
[Figure, two steps: subregions of a bucket handed to GPUs A, B, and C; labels: bucket, subregion, primitive]
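
The two split conditions can be sketched with a toy model in which a region is represented only by its primitive count. This is a hypothetical simplification: real regions are split spatially, so halving the count evenly, picking the largest region first, and the `distribute` function itself are assumptions made for illustration. Only the small region description (here, a number) moves between GPUs, since each GPU already holds the bucket's full primitive list.

```python
def distribute(bucket_prims, threshold, gpus):
    """Hand out region descriptions while a region is too big and a GPU is idle.

    Returns a dict mapping each GPU name to the primitive counts of its regions.
    """
    work = {g: [] for g in gpus}
    work[gpus[0]].append(bucket_prims)  # the first GPU starts with the bucket
    idle = list(gpus[1:])
    while idle:
        # Pick the largest region currently assigned to any GPU.
        candidates = [(n, g, i) for g in gpus for i, n in enumerate(work[g])]
        n, owner, idx = max(candidates, key=lambda c: c[0])
        if n <= threshold:
            break  # condition 1 fails: no region exceeds the threshold
        half = n // 2
        work[owner][idx] = n - half
        work[idle.pop(0)].append(half)  # condition 2 holds: an idle GPU exists
    return work
```

With `threshold=2`, as in the slide example, a bucket of 8 primitives on three GPUs is split twice, after which no GPU is idle and splitting stops.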
Results
Rendering Performance
Rendering Time on GPU
● Breakdown of the rendering time on the GPU
● Initialization time (loading data from CPU to GPU) is relatively short
Scaled Performance on GPU
Conclusions
● Advantages
  ● Faster than CPU-based rendering
  ● Performance scalability
● Disadvantages
  ● Geometry scalability
  ● Motion/focal blur
    ● Improved in [Hou et al 2010]
Questions & Answers
Thank you