Parallel solution of the Helmholtz equation with high

  • Slides: 33
Download presentation
Parallel solution of the Helmholtz equation with high frequency Dan Gordon Rachel Gordon Computer

Parallel solution of the Helmholtz equation with high frequency Dan Gordon Rachel Gordon Computer Science University of Haifa Aerospace Eng. Technion Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 1

OUTLINE • The wave equation • The Kaczmarz algorithm (KACZ) • KACZ CARP (Component-Averaged

OUTLINE • The wave equation • The Kaczmarz algorithm (KACZ) • KACZ CARP (Component-Averaged Row Projections) • CARP-CG: CG acceleration of CARP • Sample results with the Helmholtz equation Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 2

The Helmholtz Equation • speed: c frequency: ν wave length: l = c/ν wave

The Helmholtz Equation • speed: c frequency: ν wave length: l = c/ν wave number: k = 2 p/l = 2 pν/c • wave eqn: -Δu - k 2 u = f • Discretization with uniform grid size h • No. of grid pts per l: Ng = l/h = 2 p/kh • Considered desirable: Ng ≥ 8, but Ng = 6 also gave good results • Linear system is complex and strongly indefinite Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 3

The Helmholtz Equation • Challenging problem when ν is high • Shifted Laplacian approach:

The Helmholtz Equation • Challenging problem when ν is high • Shifted Laplacian approach: – Bayliss, Goldstein & Turkel, 1983 • introduced a shift into the Laplacian – Erlangga, Vuik & Oosterlee, 2004/06 • complex shift: -Δu - (1 - i b) k 2 u = f • Uses multigrid to solve the preconditioner • Not simple for irregular grids Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 4

The Helmholtz Equation • Bollhöfer, Grote & Schenk, 2009: – Introduced algebraic multilevel preconditioner

The Helmholtz Equation • Bollhöfer, Grote & Schenk, 2009: – Introduced algebraic multilevel preconditioner – Use symmetric max weight matchings and inverse-based pivoting – Applied to heterogeneous 2 D and 3 D domains – Can be parallelized in principle Apologies to many others! Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 5

The Kaczmarz algorithm (KACZ) initial point eq. 1 eq. 2 eq. 3 Oct. 14,

The Kaczmarz algorithm (KACZ) initial point eq. 1 eq. 2 eq. 3 Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 6

KACZ with Relaxation Parameter • KACZ can be used with a relaxation parameter w

KACZ with Relaxation Parameter • KACZ can be used with a relaxation parameter w • w=1: project exactly on the hyperplane • w<1: project in front of hyperplane • w>1: project beyond the hyperplane • Cyclic relaxation: eq. i is assigned a relaxation parameter wi Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 7

Algebraic formulation of KACZ • Given the system Ax = b • Consider the

Algebraic formulation of KACZ • Given the system Ax = b • Consider the "normal equations" system AATy = b, x = ATy • Well-known fact: KACZ is SOR applied to the normal equations • The relaxation parameter of KACZ is the usual relax. par. of SOR Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 8

CARP: Component-Averaged Row Projections • A block-parallel version of KACZ • The equations are

CARP: Component-Averaged Row Projections • A block-parallel version of KACZ • The equations are divided into blocks (not necessarily disjoint) • A variable shared by 2 or more blocks is "cloned" into its neighboring blocks. • For each block (in parallel) do KACZ iterations • Every shared variable becomes the average of its values in the different blocks • Repeat until convergence Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 9

CARP-CG: CG acceleration of CARP • CARP is KACZ in some superspace (with cyclic

CARP-CG: CG acceleration of CARP • CARP is KACZ in some superspace (with cyclic relaxation parameters) • Björck & Elfving (BIT '79): developed CGMN, which is a (sequential) CGacceleration of KACZ (double sweep, fixed relax. parameter) • We extended this result to allow cyclic relaxation parameters • Result: CARP-CG Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 10

CARP-CG: Properties • On one processor, CARP-CG is identical to CGMN • Particularly useful

CARP-CG: Properties • On one processor, CARP-CG is identical to CGMN • Particularly useful on systems with large off-diagonal elements – example: convection-dominated PDEs • Discontinuous coefficients are handled without requiring domain decomposition (DD) Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 11

Robustness of CARP-CG • KACZ inherently normalizes the eqns • After normalization, the diagonal

Robustness of CARP-CG • KACZ inherently normalizes the eqns • After normalization, the diagonal elements of T AA are larger than the off-diagonal ones (in each row) • This is not diagonal dominance, but it makes the normal eqns manageable • Normalization was also found to be useful for discontinuous coefficients Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 12

Application of CARP-CG to Helmholtz equation • A fixed relaxation parameter of 1. 5

Application of CARP-CG to Helmholtz equation • A fixed relaxation parameter of 1. 5 was used in all cases • Domain: mostly unit square or unit cube • 2 nd order central difference scheme Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 13

Homogeneous 2 D problem • • Based on Erlangga et al. '04, § 6.

Homogeneous 2 D problem • • Based on Erlangga et al. '04, § 6. 2 Eqn: Δu + k 2 u = 0 Domain: unit square [0, 1] Dirichlet bndry cond. on one side, with a discontinuity at midpoint 1 st-order absorbing bndry cond. on other sides (Sommerfeld radiation condition) Grid points per l: Ng = 6, 8, 10 No. of processors: 1 – 32 k = (75), 150, 300, 450, 600 Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 14

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 15

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 15

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 16

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 16

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 17

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 17

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 18

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 18

Heterogeneous 2 D problem • • 3 -layer heterogeneous problem Based on Erlangga et

Heterogeneous 2 D problem • • 3 -layer heterogeneous problem Based on Erlangga et al. '04, § 6. 3 Everything is identical to previous problem EXCEPT: k=600 k=450 k=300 Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 19

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 20

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 20

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 21

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 21

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 22

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 22

The Marmousi Model • Well-known benchmark for solvers of the Helmholtz equation • 6000

The Marmousi Model • Well-known benchmark for solvers of the Helmholtz equation • 6000 m x 1600 m vertical slice of earth surface, disturbance at top center • Highly heterogeneous and irregular • Speed of sound: 1500 m/s to 4000 m/s • Tested on 12 node infiniband machine • Each node: 2 quad CPUs Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 23

Time (s) for rel-res<10 -7, Ng ≥ 7. 5 grid & freq. 751 x

Time (s) for rel-res<10 -7, Ng ≥ 7. 5 grid & freq. 751 x 401 1 proc 4 proc 8 proc 12 proc 16 proc 24 proc 32 proc 97 35 22 18 20 19 18 786 262 144 106 107 85 77 1868 627 334 237 253 190 158 ν=25 1501 x 401 ν=50 2001 x 534 ν=65 Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 24

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 25

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 25

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 26

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 26

3 D heterogeneous problem • • • Domain: [0, 1] Divided into 3 layers

3 D heterogeneous problem • • • Domain: [0, 1] Divided into 3 layers with k=60, 72, 90 Point source in middle of one side Sommerfeld radiation condition on bndry Also tested with k=60, 90, 145 on the infiniband machine Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 27

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 28

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 28

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 29

Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 29

Time (s) for 3 D hom. & het. problems, 4 conv. goals, on 1

Time (s) for 3 D hom. & het. problems, 4 conv. goals, on 1 xeon E 5520 proc. k: N g: Grid: 60 6. 5 633 10– 4 16 10– 7 145 6. 5 1513 60/90/145 6. 5 – 15. 7 1513 77 465 533 33 168 1024 1347 10– 10 51 258 1607 2159 10– 13 69 351 2209 2967 Oct. 14, 2010 90 953 6. 6 ISCM-29, Technion, Haifa, Israel 30

Summary – serial and parallel machines • When the frequency increases: – Faster convergence

Summary – serial and parallel machines • When the frequency increases: – Faster convergence (on a fixed grid) – Improved scalability (on a fixed grid) – Improved speedup (with a fixed Ng) • For fixed Ng: no. of iterations is linear in k • Homogeneous & heterogeneous problems • Simple to implement • Generally useful for various problems with large off-diagonal elements and discontinuous coefficients Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 31

Other Potential Applications • Higher order schemes for the Helmholtz equation (good initial results)

Other Potential Applications • Higher order schemes for the Helmholtz equation (good initial results) • Maxwell equations? • Saddle-point problems? • Circuit problems? • Linear solvers in some eigenvalue methods? • Suggestions are welcome! Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 32

Relevant Publications http: //cs. haifa. ac. il/~gordon/pub. html CARP: SIAM J Sci Comp 2005

Relevant Publications http: //cs. haifa. ac. il/~gordon/pub. html CARP: SIAM J Sci Comp 2005 CGMN: ACM Trans Math Software 2008 Microscopy: J Parallel & Distr Comp 2008 Large convection + discontin coef: CMES 2009 CARP-CG: Parallel Comp 2010 Scaling for discont coef: J Comp & Appl Math 2010 Helmholtz equation: tech rept http: //cs. haifa. ac. il/~gordon/helm. pdf CARP-CG SOFTWARE AVAILABLE ON REQUEST THANK YOU! Oct. 14, 2010 ISCM-29, Technion, Haifa, Israel 33