Parallel Preconditioners for the Incompressible Navier-Stokes Equations
Robert Shuttleworth
Applied Math & Scientific Computation (AMSC), University of Maryland
10/7/2020, AMSC Candidacy Presentation

Outline
- Background
  - Incompressible Navier-Stokes Equations
  - Discretization/Linearization
- Preconditioning the N-S Equations
  - General Preconditioners
  - N-S Problem Specific
    - Pressure Correction Methods
    - Pressure Convection-Diffusion
- High Performance Computing Issues
- Preliminary Results
  - Lid driven cavity problem
  - Flow over a diamond obstruction
- Conclusions

Motivation/Focus
- Motivation: Efficient and robust solution of steady-state and transient flow problems
  - Develop fully implicit solution methods for the incompressible Navier-Stokes equations
  - Solving the linear systems that arise can take upwards of 70% of the CPU time of a given simulation
- Linear Solvers: Operator-Based Block Preconditioning
- Focus: Adapt block preconditioners to the linear subproblems that arise in realistic fluid flow problems

Introduction
- Given the incompressible Navier-Stokes equations, with their nonlinear (convection) term
- Discretization and linearization:
  - Oseen
  - Newton: Jacobian of the momentum equation
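
For reference, a standard statement of these equations and of the linearized system is sketched below. The notation is an assumption borrowed from common finite element treatments (for example Elman, Silvester and Wathen), not taken from the slide: \alpha = 0 for steady flow and \alpha = 1 for transient flow, M is a velocity mass matrix, K a discrete vector Laplacian, N a discrete convection operator, and W the extra Newton derivative term. Stabilization terms, which would fill the zero block, are omitted.

```latex
% Incompressible Navier-Stokes equations; the nonlinear term is (u . grad) u
\begin{aligned}
\alpha\,\mathbf{u}_t - \nu \nabla^2 \mathbf{u}
  + (\mathbf{u}\cdot\nabla)\mathbf{u} + \nabla p &= \mathbf{f}, \\
-\nabla\cdot\mathbf{u} &= 0 .
\end{aligned}

% Each linearized step leads to a saddle point system
\begin{pmatrix} F & B^{T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} \mathbf{u} \\ p \end{pmatrix}
=
\begin{pmatrix} \mathbf{f} \\ \mathbf{g} \end{pmatrix}

% Oseen (lagged convection velocity) versus Newton (full Jacobian)
F_{\text{Oseen}}  = \tfrac{\alpha}{\Delta t} M + \nu K + N(\mathbf{u}_{\text{lag}}),
\qquad
F_{\text{Newton}} = \tfrac{\alpha}{\Delta t} M + \nu K + N(\mathbf{u}_{\text{lag}}) + W(\mathbf{u}_{\text{lag}})
```

In the Newton case the unknowns are updates to velocity and pressure and the right-hand side is the nonlinear residual.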

Discretization and Linearization

Complete Algorithm
    u(0) = initial condition or initial guess
    p(0) = initial condition or initial guess
    for m = 1 : Ntimesteps                                        (time loop)
        u(m) = u(m-1), p(m) = p(m-1)
        while || F(u(m), u(m-1), p(m), u(m)) || > nonlinear tolerance   (nonlinear loop)
            ulag = u(m)
            /* Set up the linear subproblem corresponding to F(u, p, ulag) = 0. */
            Iterate on A u(m) = b until ||rk|| / ||r0|| < linear tolerance
                                       (saddle point linear solver, block preconditioned)
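
The loop structure above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the MPSalsa implementation: there is no pressure unknown, the "flow" is a small 1D system with a weak quadratic term, the nonlinear term is lagged in the Oseen style, and `np.linalg.solve` stands in for the block-preconditioned Krylov solve. All names (`time_march`, `lagged_operator`, `eps`) are hypothetical.

```python
import numpy as np

def lagged_operator(A, D, u_lag, eps, dt):
    """Linearized operator I/dt + A + eps * diag(u_lag) D (nonlinear term lagged)."""
    n = A.shape[0]
    return np.eye(n) / dt + A + eps * np.diag(u_lag) @ D

def residual(A, D, u, u_prev, eps, dt, b):
    """Nonlinear residual of one implicit Euler step."""
    return (u - u_prev) / dt + A @ u + eps * u * (D @ u) - b

def time_march(A, D, b, u0, eps=0.1, dt=0.5, n_steps=3, tol=1e-10, max_it=50):
    """Time loop around a lagged (Picard) nonlinear loop around a linear solve."""
    u = u0.copy()
    history = []                                   # nonlinear iterations per time step
    for m in range(n_steps):                       # time loop
        u_prev = u.copy()
        its = 0
        while (np.linalg.norm(residual(A, D, u, u_prev, eps, dt, b)) > tol
               and its < max_it):                  # nonlinear loop
            u_lag = u.copy()
            K = lagged_operator(A, D, u_lag, eps, dt)
            rhs = b + u_prev / dt
            u = np.linalg.solve(K, rhs)            # stand-in for the preconditioned solver
            its += 1
        history.append(its)
    return u, history

# Assumed toy data: diffusion-like A, central-difference D, constant forcing.
n = 8
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
D = 0.5 * (np.eye(n, k=1) - np.eye(n, k=-1))
u, history = time_march(A, D, np.ones(n), np.zeros(n))
print("nonlinear iterations per time step:", history)
```

Because the nonlinear term is weak in this toy setting, the lagged iteration converges in a handful of steps; stronger convection is where Newton linearization and better preconditioning start to matter.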

General Preconditioning Premise
- Preconditioning is needed in the solution of any large-scale PDE
- The bottleneck in solving N-S is the iterative solution of the linear systems
- Given a linear system, a preconditioner should be "good" (a close approximation to the system matrix) and "cheaper" (inexpensive to apply)
- Preconditioning speeds up convergence by improving the spectral properties of a matrix
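
A small numerical illustration of "improving the spectral properties": symmetric Jacobi (diagonal) scaling applied to a badly scaled matrix. The test matrix and the helper name `jacobi_scaled` are assumptions chosen for this sketch; Jacobi scaling is used only because it is the simplest preconditioner, not because the talk uses it.

```python
import numpy as np

def jacobi_scaled(A):
    """Symmetric Jacobi preconditioning: D^{-1/2} A D^{-1/2} with D = diag(A)."""
    d = 1.0 / np.sqrt(np.diag(A))
    return d[:, None] * A * d[None, :]

# Assumed test matrix: a well-conditioned tridiagonal T with badly scaled
# rows and columns, A = S T S.
n = 6
T = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
S = np.diag(np.logspace(0, 3, n))
A = S @ T @ S

cond_before = np.linalg.cond(A)
cond_after = np.linalg.cond(jacobi_scaled(A))
print(f"cond(A) = {cond_before:.2e}, cond(scaled A) = {cond_after:.2f}")
```

Krylov convergence depends on more than the condition number, but the drop of several orders of magnitude here conveys why a cheap approximation of the matrix can pay for itself.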

Types of Preconditioners
- General (Algebraic) Preconditioners
  - Incomplete LU Factorization (ILU)
  - Sparse Approximate Inverses (SPAI)
  - Multigrid
  - Domain Decomposition
- N-S Problem Specific Preconditioners
  - Pressure Correction
  - Pressure Convection-Diffusion

Incomplete LU Factorization (ILU)
- Factoring a sparse matrix by Gaussian elimination generates fill-in, so the L and U factors are less sparse than the original matrix.
- Discarding fill-in that falls below a given tolerance yields approximate L and U factors.
- Advantages – simple to implement, inexpensive, good for certain problems
- Disadvantages – potential instabilities, lack of scalability, poorly suited to CFD applications, and difficult to parallelize
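
A minimal sketch of the idea, using the pattern-based variant ILU(0) rather than the tolerance-based dropping described above: any fill-in outside the sparsity pattern of A is discarded. It works on dense NumPy arrays for readability and assumes nonzero pivots; the function name `ilu0` is hypothetical, and a production code would use sparse storage.

```python
import numpy as np

def ilu0(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A."""
    n = A.shape[0]
    LU = A.astype(float).copy()
    pattern = A != 0
    for i in range(1, n):
        for k in range(i):
            if not pattern[i, k]:
                continue                      # entry outside the pattern: skip
            LU[i, k] /= LU[k, k]              # multiplier, stored in L
            for j in range(k + 1, n):
                if pattern[i, j]:             # update only entries in the pattern
                    LU[i, j] -= LU[i, k] * LU[k, j]
    L = np.tril(LU, -1) + np.eye(n)           # unit lower triangular factor
    U = np.triu(LU)
    return L, U

# Arrow-shaped example: exact LU would fill in positions (1, 2) and (2, 1).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 4.0, 0.0],
              [1.0, 0.0, 4.0]])
L, U = ilu0(A)
print(L @ U - A)   # nonzero only where fill-in was dropped
```

For a tridiagonal matrix no fill-in occurs, so ILU(0) reproduces the exact factorization; on the arrow matrix the product L U matches A on the pattern and differs only in the dropped positions.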

Block Preconditioners
- Discretization
- Consider:
  - The optimal preconditioner is obtained when X is the Schur complement
  - Question: How to approximate the Schur complement?
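
The optimality claim can be checked numerically. The sketch below assumes the saddle point matrix has blocks F, B^T, B, 0 and that the preconditioner is block upper triangular with blocks F, B^T, 0, -X (the usual form in the block preconditioning literature; the slide's own formula is not reproduced here). With X equal to the Schur complement B F^{-1} B^T, the preconditioned matrix T satisfies (T - I)^2 = 0, so GMRES would converge in at most two iterations in exact arithmetic. The matrices are small assumed examples.

```python
import numpy as np

def saddle_matrix(F, B):
    """Assemble [[F, B^T], [B, 0]]."""
    m = B.shape[0]
    return np.block([[F, B.T], [B, np.zeros((m, m))]])

def block_triangular(F, B, X):
    """Assemble the block upper triangular preconditioner [[F, B^T], [0, -X]]."""
    n, m = F.shape[0], B.shape[0]
    return np.block([[F, B.T], [np.zeros((m, n)), -X]])

n, m = 6, 3
# Assumed nonsymmetric, diagonally dominant "convection-diffusion" block.
F = 4.0 * np.eye(n) - 1.5 * np.eye(n, k=-1) - 0.5 * np.eye(n, k=1)
# Assumed full-rank "divergence" block.
B = np.eye(m, n) - np.eye(m, n, k=1)

X = B @ np.linalg.solve(F, B.T)          # exact Schur complement
A = saddle_matrix(F, B)
P = block_triangular(F, B, X)

T = A @ np.linalg.inv(P)                 # right-preconditioned operator
E = T - np.eye(n + m)
print("||(T - I)^2|| =", np.linalg.norm(E @ E))
```

Forming X exactly requires F^{-1}, which is as expensive as the original problem; the methods on the following slides differ in how they approximate X cheaply.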

Pressure Correction – (LD)U
So, a preconditioner based on the (LD)U grouping of the block factorization can be applied to the saddle point matrix.
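
For orientation, the exact block LDU factorization behind this grouping is written out below, together with the SIMPLE-type approximation that replaces F^{-1} by the inverse of its diagonal. The symbol S for the Schur complement and the omission of any relaxation parameter are simplifying assumptions of this sketch.

```latex
% Exact block LDU factorization, with S = B F^{-1} B^T
\begin{pmatrix} F & B^{T} \\ B & 0 \end{pmatrix}
=
\underbrace{\begin{pmatrix} I & 0 \\ B F^{-1} & I \end{pmatrix}}_{L}
\underbrace{\begin{pmatrix} F & 0 \\ 0 & -S \end{pmatrix}}_{D}
\underbrace{\begin{pmatrix} I & F^{-1} B^{T} \\ 0 & I \end{pmatrix}}_{U}

% Grouping (LD)U and replacing F^{-1} by \hat{D}^{-1} = \operatorname{diag}(F)^{-1}
% in S and in U gives a SIMPLE-type preconditioner
M_{\text{SIMPLE}}
=
\begin{pmatrix} F & 0 \\ B & -B \hat{D}^{-1} B^{T} \end{pmatrix}
\begin{pmatrix} I & \hat{D}^{-1} B^{T} \\ 0 & I \end{pmatrix}
```

Multiplying the L and D factors confirms the (LD) block: its first column is (F, B)^T and its second is (0, -S)^T.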

Pressure Correction
- SIMPLER:
  - Projection matrix: enforces incompressibility

Pressure Correction
- Advantages
  - Used in both transient and steady-state problems
  - Easy to implement and parallelize
- Disadvantages
  - Slower convergence – coupling of physics is violated
  - Choosing a relaxation parameter
  - Inefficient for convection-dominated flows
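
The "coupling is violated" point can be made concrete with a sketch of a SIMPLE-type factorization in which F^{-1} is replaced by the inverse of diag(F) and no relaxation parameter is used (both are assumptions of this example, and the function names are hypothetical). The product of the two factors reproduces the saddle point matrix except in the velocity-pressure coupling block, and the mismatch vanishes only when F is diagonal.

```python
import numpy as np

def simple_factors(F, B):
    """Block factors of a SIMPLE-type approximation (no relaxation)."""
    n, m = F.shape[0], B.shape[0]
    Dinv = np.diag(1.0 / np.diag(F))              # diag(F)^{-1}
    X_hat = B @ Dinv @ B.T                        # approximate Schur complement
    lower = np.block([[F, np.zeros((n, m))], [B, -X_hat]])
    upper = np.block([[np.eye(n), Dinv @ B.T], [np.zeros((m, n)), np.eye(m)]])
    return lower, upper

def coupling_error(F, B):
    """Frobenius norm of (lower @ upper) minus the saddle point matrix."""
    m = B.shape[0]
    lower, upper = simple_factors(F, B)
    A = np.block([[F, B.T], [B, np.zeros((m, m))]])
    return np.linalg.norm(lower @ upper - A)

n, m = 6, 3
B = np.eye(m, n) - np.eye(m, n, k=1)
F_diag = 4.0 * np.eye(n)                                             # no off-diagonal coupling
F_conv = 4.0 * np.eye(n) - 1.5 * np.eye(n, k=-1) - 0.5 * np.eye(n, k=1)

print("error, diagonal F:       ", coupling_error(F_diag, B))
print("error, convection-like F:", coupling_error(F_conv, B))
```

The larger the off-diagonal (convective) part of F relative to its diagonal, the larger this mismatch, which is consistent with the method being listed as inefficient for convection-dominated flows.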

Pressure Convection-Diffusion – L(DU)
Therefore, a right-oriented preconditioner based on the L(DU) grouping can be applied to this problem.

Pressure Convection-Diffusion - Fp
Suppose the velocity and pressure convection-diffusion operators commute with one another. Then the Schur complement can be approximated in terms of the pressure convection-diffusion operator Fp.
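
One common way to write out the commutator argument is sketched here. The symbols Q_u and Q_p (velocity and pressure mass matrices), A_p (pressure Laplacian), and w (the lagged convection velocity) are assumptions following the standard literature. This version commutes with the gradient; commuting with the divergence instead reverses the order of the factors.

```latex
% Assume the convection-diffusion operator commutes with the gradient
(-\nu \nabla^2 + \mathbf{w}\cdot\nabla)_u \, \nabla
  \;\approx\; \nabla \, (-\nu \nabla^2 + \mathbf{w}\cdot\nabla)_p

% Discrete analogue
(Q_u^{-1} F)(Q_u^{-1} B^{T}) \;\approx\; (Q_u^{-1} B^{T})(Q_p^{-1} F_p)

% Premultiply by B F^{-1} Q_u
B Q_u^{-1} B^{T} \;\approx\; (B F^{-1} B^{T}) \, Q_p^{-1} F_p

% Solve for the Schur complement and replace B Q_u^{-1} B^T by A_p
B F^{-1} B^{T} \;\approx\; A_p F_p^{-1} Q_p
\qquad\Longrightarrow\qquad
X^{-1} = Q_p^{-1} F_p A_p^{-1}
```

Applying X^{-1} therefore costs one pressure Poisson solve, one multiplication by F_p, and one mass matrix solve.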

Pressure Convection-Diffusion - BFBt

Pressure Convection-Diffusion
- Advantages
  - Insensitive to mesh size, time step, and CFL number
  - Minor Reynolds number dependence
  - Solves the coupled system
- Disadvantages
  - Applications do not supply Fp
  - Designed for Oseen iterations
  - Equations for stabilized FEM are not developed
  - Boundary conditions

Packages

Component              Methods                  Package
Fluid Flow             Finite Element           MPSalsa (Epetra)
Nonlinear Solver       Newton-Krylov Methods    NOX
Linear Solver          GMRESR                   AztecOO (Epetra, TSF)
Block Preconditioner   F^-1: GMRES/AMG          Meros (TSF); AztecOO, ML (Epetra)
                       X^-1: CG/AMG

Loop structure: Time Loop > Nonlinear Loop > Linear Solver (block preconditioned) > End Nonlinear Loop > End Time Loop

Epetra – Sparse Matrix Package
- Facilitates sparse matrix construction on both parallel and serial machines
- Compressed Row Storage (CRS)
  - Double precision nonzero values are stored in contiguous memory locations
  - Builds a map (graph) – an array of integers corresponding to nonzero row/column entries
  - Rows are stored in consecutive order
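
The CRS layout described above can be sketched in a few lines. This is a generic illustration of the storage scheme, not Epetra's API; the function names and the three-array layout (`values`, `col_idx`, `row_ptr`) are the conventional ones and are assumptions here.

```python
import numpy as np

def to_crs(dense):
    """Convert a dense 2D array to CRS arrays (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:                         # rows are stored consecutively
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(float(v))       # nonzeros, contiguous in memory
                col_idx.append(j)             # column index of each nonzero
        row_ptr.append(len(values))           # where the next row starts
    return (np.array(values),
            np.array(col_idx, dtype=int),
            np.array(row_ptr, dtype=int))

def crs_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x using only the CRS arrays."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = values[start:end] @ x[col_idx[start:end]]
    return y

A = np.array([[4.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [2.0, 0.0, 5.0]])
values, col_idx, row_ptr = to_crs(A)
x = np.array([1.0, 2.0, 3.0])
print(values, col_idx, row_ptr)
print(crs_matvec(values, col_idx, row_ptr, x))
```

Row i occupies the slice `row_ptr[i]:row_ptr[i+1]` of both `values` and `col_idx`, which is what makes the row-wise matrix-vector product a single pass over contiguous memory.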

Epetra - Sparse Matrix Library
- Serial – interfaces to the BLAS
- Parallel - handles distributed matrix details
  - Local versus global indices
  - Global map (graph) details which processor owns which entries
  - Local map (graph) details how local data (and ghosts) are represented
  - Matrix-vector products
  - Wrappers to the Message Passing Interface (MPI)
(Note: The global map is normally determined by the user or another library.)
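
The distinction between global and local indices can be illustrated with a contiguous block-row distribution and a tridiagonal stencil. This is an assumed toy layout, not Epetra's API: the function names are hypothetical, each rank is assumed to own at least one row, and ghost entries are numbered after the owned ones.

```python
def owned_rows(n_global, n_procs, rank):
    """Global row indices owned by `rank` under a contiguous block distribution."""
    base, rem = divmod(n_global, n_procs)
    start = rank * base + min(rank, rem)
    count = base + (1 if rank < rem else 0)
    return list(range(start, start + count))

def local_numbering(n_global, n_procs, rank):
    """Map global index -> local index: owned rows first, then ghost entries.

    Ghosts are the off-processor neighbours a tridiagonal stencil needs.
    """
    owned = owned_rows(n_global, n_procs, rank)
    ghosts = [g for g in (owned[0] - 1, owned[-1] + 1) if 0 <= g < n_global]
    return {g: l for l, g in enumerate(owned + ghosts)}

for rank in range(3):
    print(rank, owned_rows(10, 3, rank), local_numbering(10, 3, rank))
```

A distributed matrix-vector product would first fetch the ghost values from neighbouring ranks (via MPI in a real code) and then work entirely in local indices.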

TSF Properties
- What is TSF?
  - TSF – Trilinos Solver Framework
  - High-level matrix/block matrix manipulation language
  - Provides a framework for integrating different solvers/preconditioners
  - Interface for representation-independent solvers
- Why TSF?
  - Abstract interfaces for vectors and operators
  - Composable block operators
  - Deferred inverse and transpose
  - Overloaded operators (Matlab-like syntax)
  - Matlab-like simplicity, running on a supercomputer
  - Transparent memory management
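
The ideas of composable operators, overloaded syntax, and a deferred inverse can be conveyed with a small sketch. The class `Op` and its methods are hypothetical and are not TSF's interface; the point is only that composing operators builds a recipe, and no inverse is formed until the operator is applied to a vector.

```python
import numpy as np

class Op:
    """Hypothetical linear operator that only knows how to apply itself."""

    def __init__(self, apply, matrix=None):
        self.apply = apply        # function: vector -> vector
        self.matrix = matrix      # kept only for matrix-backed operators

    @classmethod
    def from_matrix(cls, M):
        return cls(lambda x: M @ x, matrix=M)

    def inverse(self):
        """Deferred inverse: a linear solve happens only when applied."""
        if self.matrix is None:
            raise ValueError("inverse() is only sketched for matrix-backed operators")
        M = self.matrix
        return Op(lambda x: np.linalg.solve(M, x))

    def __matmul__(self, other):
        """Composition: (self @ other)(x) = self(other(x))."""
        return Op(lambda x: self.apply(other.apply(x)))

    def __add__(self, other):
        return Op(lambda x: self.apply(x) + other.apply(x))

A = np.array([[2.0, 1.0], [0.0, 3.0]])
B = np.array([[4.0, 0.0], [1.0, 5.0]])
x = np.array([1.0, 2.0])

# Reads like B^{-1} followed by A, but no inverse matrix is ever formed.
op = Op.from_matrix(A) @ Op.from_matrix(B).inverse()
print(op.apply(x))
```

In a block preconditioner the same pattern lets the inverse of F or X be realized by an inner iterative solve (for example GMRES or CG with AMG, as listed on the Packages slide) without changing the code that composes the blocks.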

Benchmark Problems
- Lid Driven Cavity
  - Contains many features of harder flows
  - Steady and unsteady solutions
- Flow over a diamond obstruction
  - Inflow/outflow boundary conditions
  - Harder flow
- MPSalsa
  - Realistic massively parallel, chemically reactive fluid flow code

MPSalsa Steady Problem Results
2D lid driven cavity on a 64 x 64 grid

Re     ILU     SIMPLEC   SIMPLE   Fp
10     88.0    52.5      46.8     25.4
50     92.8    56.6      50.2     30.8
100    95.7    59.2      53.0     40.8
200    95.9    70.2      61.3     56.6

The values in each column represent the average number of outer saddle point solves per Newton step.
Figure: Residual reduction for each of the preconditioners.

MPSalsa Steady Problem Results
2D lid driven cavity

Re     Mesh         ILU      Fp      Proc
10     64 x 64      88.0     25.4    4
10     128 x 128    194.2    23.2    16
10     256 x 256    >1200    23.4    64
100    64 x 64      95.7     40.8    4
100    128 x 128    335.3    40.7    16
100    256 x 256    >1200    41.3    64
500    64 x 64      94.9     98.3    4
500    128 x 128    350.2    91.4    16
500    256 x 256    >1200    92.2    64

Mesh independence: the Fp counts stay nearly constant as the mesh is refined, while the ILU counts grow.
Figure: Preliminary time comparison.

Implementation Challenge
- Timings are not very good
- After profiling, upwards of 50% of the CPU time is spent in an inefficient memory allocation routine
- A multigrid smoother is inefficiently implemented

MPSalsa Steady Problem Results
2D flow over a diamond obstruction

Re     # Unknowns   ILU      Fp
10     16,000       45.0     36.0
10     64,000       110.8    37.8
10     250,000      332.2    36.2
25     16,000       45.5     48.8
25     64,000       101.7    51.5
25     250,000      297.2    47.0

The values in each column represent the average number of outer saddle point solves per Newton step.

Future Work
- Sparse Approximate Commutator (SPAC) for Fp
- Compare/optimize CPU time amongst methods
- Tests at higher Reynolds numbers for both steady and time-dependent problems
- More realistic problems
  - 3D problems
  - Chemically reacting flow
  - Turbulent flows

Conclusions
- Incompressible Navier-Stokes Equations
- Preconditioning the N-S Equations
  - General Preconditioners
  - Problem Specific
    - Pressure Correction Methods
    - Pressure Convection-Diffusion
- Preliminary Remarks
  - ILU preconditioner does not scale well
  - Fp preconditioner is mesh independent and competitive in CPU time

References
- A. J. Chorin, A numerical method for solving incompressible viscous flow problems, Journal of Computational Physics, 2:12, 1967.
- H. C. Elman, D. J. Silvester, and A. J. Wathen, Finite Elements and Fast Iterative Solvers, Oxford University Press, 2005.
- H. C. Elman, V. E. Howle, J. Shadid, and R. Tuminaro, A parallel block multi-level preconditioner for the 3D incompressible Navier-Stokes equations, Journal of Computational Physics, 187:504-523, 2003.
- D. Kay, D. Loghin, and A. J. Wathen, A preconditioner for the steady-state Navier-Stokes equations, SIAM J. Sci. Comput., 24, pp. 237-256, 2002.
- M. Pernice and M. D. Tocci, A multigrid-preconditioned Newton-Krylov method for the incompressible Navier-Stokes equations, SIAM J. Sci. Comput., 23, pp. 398-418.
- S. V. Patankar, Numerical Heat Transfer and Fluid Flow, Hemisphere Pub. Corp., New York, 1980.