Scalable Algorithms for Structured Adaptive Mesh Refinement
Akhil Langer, Jonathan Lifflander, Phil Miller, Laxmikant Kale
Parallel Programming Laboratory, University of Illinois at Urbana-Champaign
Kuo-Chuan Pan, Paul Ricker
Department of Astronomy, University of Illinois at Urbana-Champaign
Outline
• Introduction to Adaptive Mesh Refinement (AMR)
• Traditional Approach
  – Existing Frameworks
  – Algorithm
  – Disadvantages
• New Scalable Approach
  – Doing it the Charm++ way
  – The Mesh Restructuring Algorithm
  – Dynamic Distributed Load Balancing
• Comparison with Traditional Approach
• Experimental Results/Performance
16th April, 2013, Scalable Algorithms for AMR
Adaptive Mesh Refinement: Introduction
• Solving Partial Differential Equations (PDEs)
  – PDEs are solved on a discretized domain (a mesh)
  – Algebraic equations estimate the values of the unknowns at the mesh points
  – The resolution (spacing) of the mesh points determines the error
Adaptive Mesh Refinement: Introduction
[Figure: Uniform Mesh vs. Adaptively Refined Mesh]
Uniform meshes
• High resolution is required to handle difficult regions (discontinuities, steep gradients, shocks, etc.)
• Computationally extremely costly
Adaptive Mesh Refinement
• Start with a coarse grid
• Identify regions that need finer resolution
• Superimpose finer subgrids only on those regions
AMR makes it feasible to solve problems that are intractable on a uniform grid
Adaptive Mesh Refinement: Applications
• CFD
• Astrophysics
• Climate Modeling
• Turbulence
• Mantle Convection Modeling
• Combustion
• Biophysics
• and many more
[Figure: Impact of a Type Ia supernova explosion on a main-sequence binary companion, rendered using FLASH]
Traditional Approach: Existing Frameworks
• PARAMESH
• SAMRAI
• FLASH
• p4est
• Enzo
• Chombo
• deal.II
• and many more
Our current work focuses on oct-tree based simulations
A little more background on AMR
• The refinement structure can be represented using a quad-tree (2D) or an oct-tree (3D)
• An important condition in AMR: the refinement levels of neighboring blocks differ by at most 1
Traditional Approach: Typical Implementations
• A set of blocks is assigned to each process
• Space-filling curves are used for load balancing
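The space-filling-curve idea above can be sketched as follows. This is a minimal illustration, not any particular framework's implementation: it assumes a Morton (Z-order) curve, 2D blocks all at one refinement level, and equal work per block. The names `morton_key` and `partition_blocks` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the bits of x and y: the block's position on a Z-order curve."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y bits go to odd positions
    return key


def partition_blocks(blocks, num_procs):
    """Sort blocks along the curve, then cut the sequence into contiguous chunks."""
    ordered = sorted(blocks, key=lambda b: morton_key(b[0], b[1]))
    base, extra = divmod(len(ordered), num_procs)
    parts, start = [], 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)   # spread the remainder over the first chunks
        parts.append(ordered[start:start + size])
        start += size
    return parts


# Assumed example: a 4x4 grid of blocks split across 4 processes.
blocks = [(x, y) for x in range(4) for y in range(4)]
parts = partition_blocks(blocks, 4)
```

Because consecutive positions on the curve tend to be spatially close, each contiguous chunk forms a compact region; here each process receives one 2x2 quadrant.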
Traditional Approach: Disadvantages
• O(#blocks) memory per process
• O(d) reductions
• Synchronization overheads!
• Memory bottlenecks!
Scalable Approach: Doing it the Charm++ way
• Instead of processes, promote individual blocks to first-class entities
  – Chare array with custom indices
  – A chare corresponds to a block
  – A block is now the end point of communication
Scalable Approach: Doing it the Charm++ way
• Block naming
  – Bitvector describing the path from the root to the block's node
  – One bit per dimension at each level
  – Easy to compute parent, children, and siblings using bit manipulation
  [Figure: block names 0, 1, 00, 01, 10, 11]
• Block acts as a virtual processor
  – The runtime handles communication between arbitrary blocks
  – Computation overlaps with the communication of other blocks on the same physical process
  [Figure: process p0, with blocks computing and waiting for boundary layers]
• Dynamic placement of blocks (chares) on processes
  – Facilitates dynamic load balancing
• Block is a unit of algorithm expression
  – Reduces implementation complexity
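The block-naming scheme above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' Charm++ index type: the path is stored as an integer with `dims` bits per level (deepest level in the low bits), and `BlockIndex`, `parent`, `children`, `siblings`, and `name` are hypothetical names.

```python
from typing import NamedTuple


class BlockIndex(NamedTuple):
    bits: int      # path from the root, `dims` bits per level
    depth: int     # number of levels below the root
    dims: int = 2  # 2 for a quad-tree, 3 for an oct-tree


def parent(b):
    """Drop the last level's bits."""
    if b.depth == 0:
        raise ValueError("the root has no parent")
    return BlockIndex(b.bits >> b.dims, b.depth - 1, b.dims)


def children(b):
    """Append every possible `dims`-bit suffix."""
    return [BlockIndex((b.bits << b.dims) | c, b.depth + 1, b.dims)
            for c in range(1 << b.dims)]


def siblings(b):
    """All other children of the same parent."""
    return [s for s in children(parent(b)) if s != b]


def name(b):
    """Bitvector as a string, zero-padded to dims * depth bits."""
    return format(b.bits, "0{}b".format(b.depth * b.dims)) if b.depth else ""


root = BlockIndex(0, 0)
kids = children(root)                 # names: 00, 01, 10, 11
grandchild = children(kids[2])[1]     # second child of block 10
```

Every relation is a shift or a mask on the index itself, so a block can address its relatives without consulting any shared tree structure.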
Scalable Approach: The Mesh Restructuring (Remeshing) Algorithm
• Based on a local error estimate, each block makes one of the following decisions: refine, coarsen, or stay
• refine and stay decisions are communicated to neighbors
  – A coarsen decision is implied if neither a refine nor a stay message is received
Scalable Approach: The Mesh Restructuring Algorithm
• refine and stay decisions propagate from block to block
• Decisions are updated based on the DFA, and any change in a decision is communicated again
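The slides do not spell out the DFA, so the following is only a sketch of how such propagation could work, under loud assumptions: blocks form a 1D chain (block i neighbors i-1 and i+1), the only constraint is that neighboring levels differ by at most 1 after remeshing, decisions can only move upward (coarsen to stay to refine), and a sequential fixed-point loop stands in for asynchronous messages. Sibling agreement for coarsening is ignored. All names are hypothetical.

```python
COARSEN, STAY, REFINE = -1, 0, 1


def initial_decision(error, refine_tol, coarsen_tol):
    """Local decision from an error estimate (thresholds are assumed parameters)."""
    if error > refine_tol:
        return REFINE
    if error < coarsen_tol:
        return COARSEN
    return STAY


def required_decision(my_level, nbr_level, nbr_decision):
    """Smallest decision that keeps my new level within 1 of the neighbor's new level."""
    return (nbr_level + nbr_decision) - 1 - my_level


def propagate(levels, decisions):
    """Raise decisions until no neighbor pair would violate the level condition."""
    decisions = list(decisions)
    changed = True
    while changed:
        changed = False
        for i in range(len(levels)):
            for j in (i - 1, i + 1):
                if 0 <= j < len(levels):
                    need = required_decision(levels[i], levels[j], decisions[j])
                    if need > decisions[i]:
                        decisions[i] = need   # upgrade and keep iterating
                        changed = True
    return decisions


# Assumed example: a chain that initially satisfies the level condition.
levels = [1, 2, 3, 3, 2, 1]
initial = [COARSEN, COARSEN, REFINE, STAY, COARSEN, COARSEN]
final = propagate(levels, initial)
new_levels = [lvl + dec for lvl, dec in zip(levels, final)]
```

In this example the single refine decision forces the two blocks to its left to refine as well, while the stay decision on its right turns the remaining coarsen decisions into stay.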
Scalable Approach: The Mesh Restructuring Algorithm
• When to stop?
Dynamic Load Balancing
Comparison with the Traditional Approach
Table columns: Traditional Approach, Scalable Approach
Table rows: Memory, Mesh Restructuring, Load Balancing, Neighbor Lookup, Implementation
Cell text, in extracted order: Synchronized; Highly asynchronous; Centralized and Distributed; Hash table; Complex implementation; Simple implementation (SLOC: 1300 for 2D, 1600 for 3D Advection)
Conclusion
• The new approach promises high performance for much more deeply refined computations than are currently practiced
Performance: Benchmark
Advection benchmark
• First-order upwind method in 2D space
• Advection of a tracer along with the fluid
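For reference, a first-order upwind update for 2D advection can be sketched as below. This is not the benchmark code: it assumes a single uniform grid (no refinement), constant positive velocities, and periodic boundaries. `upwind_step` is a hypothetical name, and `cx`, `cy` are the Courant numbers vx*dt/dx and vy*dt/dy.

```python
def upwind_step(u, cx, cy):
    """One first-order upwind step for positive velocities on a periodic 2D grid.

    The scheme is stable when cx + cy <= 1.
    """
    nx, ny = len(u), len(u[0])
    new = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            # Index -1 wraps to the last cell, which gives the periodic boundary.
            new[i][j] = (u[i][j]
                         - cx * (u[i][j] - u[i - 1][j])
                         - cy * (u[i][j] - u[i][j - 1]))
    return new


# Assumed example: a unit tracer in one cell, advected diagonally for 10 steps.
n = 8
u0 = [[0.0] * n for _ in range(n)]
u0[2][2] = 1.0
u = u0
for _ in range(10):
    u = upwind_step(u, 0.25, 0.25)
total = sum(sum(row) for row in u)
```

With periodic boundaries the total amount of tracer is conserved, while the first-order scheme visibly smears the initial spike; that numerical diffusion is one reason finer resolution is wanted near steep gradients.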
Performance: Mesh Restructuring Latency
Two components
• Remeshing decisions and communication
  – Scales well
• Termination detection time
  – Logarithmic in #processes
[Figure: 2D Advection on BG/Q]
Performance: Strong Scaling
[Figure: 2D Advection on BG/Q with max-depth of 15]
Thank you! Questions?