The Athena AMR Framework Jim Stone Princeton University

  • Slides: 32
Download presentation
The Athena++ AMR Framework Jim Stone Princeton University Radiation Energy Density Mass Density work

The Athena++ AMR Framework Jim Stone Princeton University Radiation Energy Density Mass Density work with Shane Davis (UVa), Kyle Felker (PU), Yan-Fei Jiang (UCSB), Kengo Tomida (Osaka), Chris White (UCSB) and the

Outline 1. 2. 3. 4. 5. Some of Åke’s work that has influenced me.

Outline 1. 2. 3. 4. 5. Some of Åke’s work that has influenced me. A new code: Athena++ An example application Software challenges Summary

Solar convection Papers by Nordlund, Stein, et al. on MHD convection were very inspiring

Solar convection Papers by Nordlund, Stein, et al. on MHD convection were very inspiring for me while I was a graduate student in early 1990’s. Demonstrated “state-of-the-art” in numerical MHD at the time. Stein & Nordlund (1998) Contours: Mach number = 1 Grayscale: emergent intensity Axes labelled by grid indices, Cell-size = 24 km.

Accretion disk dynamo Brandenburg et al. (1995) Oscillating net <By > driven by MRI

Accretion disk dynamo Brandenburg et al. (1995) Oscillating net <By > driven by MRI in stratified disks Generated by currents at vertical boundaries. Comparison of domains with +/- 2 H and +/- 3 H Of great interest to me, since I was working on same problem (Stone et al. 1996)

Turbulence in ISM Density field from sub-Alfvenic isothermal turbulence from Padoan & Nordlund (1999)

Turbulence in ISM Density field from sub-Alfvenic isothermal turbulence from Padoan & Nordlund (1999) Led to highly-cited paper on IMF:

Code comparison Åke was a “driver” of the KITP code-comparison project in 2007 PPM

Code comparison Åke was a “driver” of the KITP code-comparison project in 2007 PPM ZEUS SPH Published in Ap. J 737: 13 (2011) Figure courtesy of M. Norman

The Athena MHD Code Motivation: better shock capturing algorithm (no AV). able to use

The Athena MHD Code Motivation: better shock capturing algorithm (no AV). able to use with mesh refinement. Athena: MHD Godunov scheme • Directionally-unsplit 3 D integration algorithms • Constrained Transport (CT) to preserve div(B)=0 • Static mesh refinement Code is documented in a series of papers: • Gardiner & Stone, JCP 2005; 2008 • Stone & Gardiner, Ap. JS 2008, New. Ast 2008 Additional Physics • Viscosity, resistivity, ambipolar diffusion, Hall MHD • Optically thin cooling, thermal conduction • Special relativity • Full-transport radiation MHD

Motivation for a New Code • Original Athena code started in 2001 to study

Motivation for a New Code • Original Athena code started in 2001 to study accretion physics, MHD turbulence in ISM • Eventually extended with lots of physics • Latest version (v 4. 2) publicly available However, by 2014 it became clear a re-write was needed: • Patch-based (Berger & Oliger) AMR did not scale well • Adding non-uniform and/or new coordinates difficult • C-code had become unmanageable: 100, 000+ lines with complex physics all controlled by nested #ifdef

Athena++ Feature: AMR Based on “block-based” octree refinement. Regions of mesh refinement restricted by

Athena++ Feature: AMR Based on “block-based” octree refinement. Regions of mesh refinement restricted by block locations. Implemented with staggered-mesh MHD. Matsumoto 2007 Similar to PARAMESH in FLASH. Differs from “patch-based” (Berger&Oliger) AMR in PLUTO, Enzo • Blocks communicate only through boundaries. • No overlap between fine and coarse grids. • Solution at each point represented on only one mesh – avoids inconsistencies between fine/coarse solutions. • Very efficient on large core counts. • Much easier to code

Athena++ Feature: non-uniform curvilinear mesh Non-uniform grid in spherical polar coordinates allows large dynamic

Athena++ Feature: non-uniform curvilinear mesh Non-uniform grid in spherical polar coordinates allows large dynamic range in radius. Nested grid allows de-refinement towards pole, avoiding very small cells. Polar boundary condition allows free-flow over poles.

AMR in spherical polar Spiral density waves in circumplanetary disk due to planet-disk interaction

AMR in spherical polar Spiral density waves in circumplanetary disk due to planet-disk interaction (Zhu et al. 2016) important source of AM transport

Athena++ Feature: Dynamic execution • AMR design results in many mesh blocks per core.

Athena++ Feature: Dynamic execution • AMR design results in many mesh blocks per core. • Integration steps on blocks are organized into a Task. List • Order of tasks is dynamic, based on which MPI calls have finished. • Allows overlap of computation and communication. • Makes adding more physics very easy by adding more tasks. • Different blocks can have different tasks (physics), enabling multi-scale physics applications.

Athena++: Physics Modules Current modules (in development version): Relativistic/non-relativistic MHD Self-gravity Time-dependent radiation transfer

Athena++: Physics Modules Current modules (in development version): Relativistic/non-relativistic MHD Self-gravity Time-dependent radiation transfer Chemistry General EOS Modules under development: GR-radiation transfer Nuclear reaction networks Dust/tracer particles Hybrid particle-in-cell method

Validating physics modules Simply passing regression or unit tests is not enough. Must demonstrate

Validating physics modules Simply passing regression or unit tests is not enough. Must demonstrate correct solutions in regime of application. Comparison of Athena++ and Dedalus on non-linear Kelvin. Helmoltz instability with explicit dissipation. Athena++ reproduces spectral code results at 2 x resolution.

Performance and Scaling of MHD Single core performance of Athena++ is greatly improved (~4

Performance and Scaling of MHD Single core performance of Athena++ is greatly improved (~4 x) by efficient vectorization. • • 0. 5 ms per cell per core for 3 D MHD on Intel Skylake (2 M zc/sec) Performance is memory bandwidth limited when using all cores Performance per Skylake CPU is about 1/10 performance per Volta GPU Excellent weak scaling to 0. 5 million cores (on KNL)

Performance and Scaling of Multigrid • Multigrid is faster than FFTW for self-gravity due

Performance and Scaling of Multigrid • Multigrid is faster than FFTW for self-gravity due to O(N) instead of O(N log[N]) scaling • Self-gravity costs < 20% of MHD with 643 cells per core and 4096 cores

K-Athena: GPU version of Athena++ (Grete et al. 2019) Uses the Kokkos library to

K-Athena: GPU version of Athena++ (Grete et al. 2019) Uses the Kokkos library to provide portability SKX = Intel Skylake, BDW = Intel Broadwell

Radiation MHD in the Athena code. Davis, Stone, & Jiang (2012); Jiang, Stone, &

Radiation MHD in the Athena code. Davis, Stone, & Jiang (2012); Jiang, Stone, & Davis (2012; 2014) 1. Compute closure relation P = f E directly – Variable Eddington tensor (VET) f calculated from “snapshot” solution of time-independent transfer equation using short characteristics. – Source terms can be very stiff, requires modified Godunov method for stability – For non-relativistic flows, wide range of timescales associated with v, Cs, c; use fully implicit (backward Euler) differencing of radiation moment equations 2. For relativistic flows, direct explicit differencing of time-dependent transfer equation used.

Computing the Variable Eddington Tensor (VET) For non-relativistic flows, the light-crossing time is much

Computing the Variable Eddington Tensor (VET) For non-relativistic flows, the light-crossing time is much shorter than an MHD time step. In this case, solve the time-independent transfer equation using the method of short characteristics along Nr rays per cell. Include scattering and non-LTE effects using accelerated lambda iteration (ALI). Olson & Kunasz 1987; Stone, Mihalas, & Norman 1992; Trujillo Bueno & Fabiani Bendicho 1995; Davis, Stone, & Jiang 2012 Then compute f directly from moments of I.

Importance of Algorithms • Both algorithms for RT in Athena require direct solution of

Importance of Algorithms • Both algorithms for RT in Athena require direct solution of transport equation. • More complicated and expensive than: • FLD – assumes radiation flux proportional to gradient of Erad • M 1 – assumes particular form for Eddington tensor, rather than computing directly But we have found in certain tests qualitatively different behavior in the solutions with approximate methods. Care must be taken in using methods like FLD. Example, dynamics of a radiation supported atmosphere (Krumholz & Thompson 2012; Davis et al. 2014)

FLD solution Atmosphere remains bound. No outflow. (e. g. Krumholz & Thompson) VET solution

FLD solution Atmosphere remains bound. No outflow. (e. g. Krumholz & Thompson) VET solution Atmosphere blown away. Radiation driven outflow. Very clumpy. (e. g. Davis et al. )

With FLD, vertical flux strongly correlated with density. Radiation escapes through low-density channels. Density

With FLD, vertical flux strongly correlated with density. Radiation escapes through low-density channels. Density Vertical component of Frad

In VET solution, direction of radiation flux is not given by gradient of Erad

In VET solution, direction of radiation flux is not given by gradient of Erad White arrows: Frad in VET solution Green arrows: Frad computed from gradient of Erad

Example application: Global thin disk with net vertical field in 3 D work led

Example application: Global thin disk with net vertical field in 3 D work led by Zhaohuan Zhu (Zhu & Stone 2018) • • 3 D spherical polar nested grid 0. 1 < r < 100, full 2 p in azimuth 1088 x 512 x 1024 on finest level Initially H/R ~ 0. 1, b. Z = 1000, 100 At least 16 grid points per H everywhere Evolved for 40 orbits at R=1 Evolved for 0. 25 “viscous” times at R=1

Azimuthally averaged structure B field lines T = 40 orbits at r=1 fast Alfven

Azimuthally averaged structure B field lines T = 40 orbits at r=1 fast Alfven

Vertical slice at r = 1 T = 40 orbits at r=1

Vertical slice at r = 1 T = 40 orbits at r=1

Short summary of global thin disk • Vertical fields drive rapid accretion in surface

Short summary of global thin disk • Vertical fields drive rapid accretion in surface layers • Vertical flux advected rapidly inward in surface layers (“coronal accretion”, Beckwith et al. 2009) • Vigorous MRI turbulence at midplane, a ~ 0. 1 • Slow outflow at midplane • Fast magnetosonic wind produced over small range of radii • Next: repeat with non-ideal MHD, radiation and dust particles.

4. Software Design Principles • Strict adherence to ANSI C++11 • Only requires C++

4. Software Design Principles • Strict adherence to ANSI C++11 • Only requires C++ compiler and python distribution to run • MPI library required for distributed-memory parallelism • Some problems require FFTW and HDF 5 libraries • Regression tests implemented through python scripts • Style checks provided by Google’s cpplint. py • Continuous integration tools (Jenkins, Travis CI) used to test commits • Development version hosted in private Git. Hub repo • Public version available (copy of master branch) distributed through Git. Hub repo

First Athena++ Developer/User Meeting. April, 2019 UNLV. Sixty participants.

First Athena++ Developer/User Meeting. April, 2019 UNLV. Sixty participants.

Download Athena++: http: //princetonuniversity. github. io/athena/index. html

Download Athena++: http: //princetonuniversity. github. io/athena/index. html

My concern: Athena++ is already too big to be sustained by application scientists alone.

My concern: Athena++ is already too big to be sustained by application scientists alone. Athena++ lines of code 2014 -2019

5. Summary • Athena++ is a framework for adaptive mesh refinement (AMR) calculations of

5. Summary • Athena++ is a framework for adaptive mesh refinement (AMR) calculations of plasma dynamics on Eulerian meshes. • Various physics modules are built on this framework: • • Non-relativistic/relativistic MHD Self-gravity Radiation transfer Many more are under development • Good performance and scaling on modern CPUs and GPUs is achieved. • The code is developed by a loose collaboration of application scientists. • Hopefully the code will enable new understanding of radiation -dominated astrophysical fluid flows. • This work follows the example set by Åke’s career: developing new numerical tools is important for solving new astrophysics problems.