Efficient Eigensolvers for Large-scale Electronic Nanostructure Calculations

Stanimire Tomov (1), Andrew Canning (2), Jack Dongarra (1), Osni Marques (2), Christof Vömel (2), and Lin-Wang Wang (2)

(1) Innovative Computing Laboratory, University of Tennessee
(2) Lawrence Berkeley National Laboratory, Computational Research Division

Supported by: U.S. DOE, Office of Science
SC05, Seattle, 11/16/2005

Alex Zunger, Gabriel Bester, Joonhee An, Alberto Franceschetti, Wesley Jones, Kim Kwiseon, Peter Graf
Jack Dongarra, Julien Langou, Stanimire Tomov
Lin-Wang Wang, Andrew Canning, Osni Marques, Christof Vömel, M. Claudia Troparevsky

Outline
• Background
• Problem formulation
• Solution approach
  – Iterative Conjugate Gradients (CG) type eigensolvers
• Preconditioning
  – The Bulk-band (BB) preconditioner
• Numerical results
• Conclusions

Background
• Quantum dots
  – Tiny crystals ranging from a few hundred to a few thousand atoms in size; made by humans
  – Electronic properties critically depend on shape and size
  – Colors of light absorbed and emitted can be tuned by the quantum dot size
    • Absorbed energy can lift an electron from its valence band to its conduction band (generating electrical current)
    • An electron falling back from the conduction band to the valence band loses energy, which is emitted as light
    • The mathematical simulation leads to eigenvalue problems
  – Different electronic properties than their bulk material
    • But still, bulk material properties may be useful: we found ways to use them in designing preconditioners that significantly accelerate quantum dot electronic structure calculations
[Figure: Total electron charge density of a quantum dot of gallium arsenide, containing just 465 atoms.]
[Figure: Quantum dots of the same material but different sizes have different band gaps and emit different colors.]

Problem formulation
• Solve a single particle Schrödinger-type equation
    (E)  $(-\tfrac{1}{2}\nabla^2 + V)\,\psi_i = \epsilon_i \psi_i$
  with periodic boundary conditions
• Many electronic nanostructure calculations lead to it
• Leads to a discrete eigenvalue problem $H \psi_i = E_i \psi_i$, where $H$ is Hermitian
• Many additional requirements
  – Find a few (4-10) interior eigenvalues closest to a given point $E_{ref}$
  – Repeated eigenvalues are allowed (degeneracy up to 4), etc.
• The problem size requires a parallel iterative solution approach
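To make the discrete problem concrete, here is a minimal sketch, assuming a 1D finite-difference discretization on a ring (the slides do not say which discretization the authors use). The potential, the grid size, and the helper names `periodic_hamiltonian` and `closest_eigenpairs` are all hypothetical. A dense solver serves only as a small-scale reference for "the few eigenvalues closest to $E_{ref}$"; it is not the parallel iterative approach the slides call for.

```python
import numpy as np

def periodic_hamiltonian(v, h=1.0):
    """Finite-difference -0.5*d^2/dx^2 + V on a ring of len(v) grid points."""
    n = v.size
    H = np.diag(1.0 / h**2 + v)      # diagonal: kinetic term + potential
    hop = -0.5 / h**2                # coupling to nearest neighbours
    for i in range(n):
        j = (i + 1) % n              # wrap-around gives periodic boundaries
        H[i, j] += hop
        H[j, i] += hop
    return H

def closest_eigenpairs(H, e_ref, count):
    """Dense reference: the `count` eigenpairs with eigenvalues nearest e_ref."""
    evals, evecs = np.linalg.eigh(H)
    idx = np.sort(np.argsort(np.abs(evals - e_ref))[:count])
    return evals[idx], evecs[:, idx]

x = np.arange(64)
v = 0.5 * np.cos(2 * np.pi * x / 64)   # hypothetical smooth periodic potential
H = periodic_hamiltonian(v)
e_ref = 1.0                            # hypothetical reference energy
E, psi = closest_eigenpairs(H, e_ref, 4)
print(E)
```

The cost of the dense solve grows cubically with the matrix size, which is why a system with millions of unknowns needs the iterative methods discussed next.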

Solution approach
• Phase 1: Iterative eigensolvers
  – Conjugate Gradients (CG) type with spectral transformation
    • Based on their previous successful use in the field
    • Folded spectrum: solve for $(H - E_{ref})^2$ to get interior eigenstates (L. W. Wang & A. Zunger, 1993)
  – Developed a library of 3 nonlinear CG eigensolvers
  – The library includes A. Knyazev’s LOBPCG method
    • Supports blocking
    • Supports preconditioning
    • Developed and integrated in NanoPSE (S. Tomov and J. Langou)
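The folded spectrum idea can be checked on a small example. The sketch below uses a random Hermitian matrix and a dense eigensolver as stand-ins (both are assumptions for illustration; the talk applies CG-type iterations to the real Hamiltonian). It shows that the lowest eigenpairs of $(H - E_{ref})^2$ are the eigenpairs of $H$ closest to $E_{ref}$.

```python
import numpy as np

def folded_operator(H, e_ref):
    """Return (H - e_ref*I)^2, whose lowest eigenvalues sit nearest e_ref."""
    S = H - e_ref * np.eye(H.shape[0])
    return S @ S

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
H = (A + A.conj().T) / 2          # toy Hermitian "Hamiltonian"
e_ref = 0.3                       # hypothetical reference energy
k = 4

# Reference answer: eigenvalues of H sorted by distance to e_ref.
evals = np.linalg.eigvalsh(H)
closest = evals[np.argsort(np.abs(evals - e_ref))[:k]]

# Folded spectrum: interior states become the *lowest* states,
# which is what a minimizing CG-type eigensolver looks for.
f_evals, f_evecs = np.linalg.eigh(folded_operator(H, e_ref))
X = f_evecs[:, :k]

# Rayleigh quotients with respect to H recover the interior eigenvalues.
recovered = np.real(np.einsum("ij,ij->j", X.conj(), H @ X))
print(np.sort(recovered), np.sort(closest))
```

The price is that squaring the operator also squares its condition number, which is one reason preconditioning matters so much in this setting.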

Solution approach …
  – We use the Nanoscience Problem Solving Environment (NanoPSE) package
    • Integrates various nano-codes (developed over ~10 years)
    • Its design goal: provide a software context for collaboration
      – Features easy install; runs on many platforms, etc.
    • Collected and maintained by Wesley Jones (NREL)
  – Results:
    • 43% improvement in speed and 49% in number of matrix-vector products
      – On an InAs nanowire system of ~70,000 atoms, eigensystem of size 2,265,827 (A. Canning and G. Bester)
    • Results are good: reference algorithm & implementation were very efficient
    • But limited by the effectiveness of the available preconditioner
• Phase 2: Preconditioning

Preconditioning
• Preconditioning: a term that comes from accelerating the convergence of iterative solvers for linear systems $Ax = b$; in particular, find an operator (preconditioner) $T \approx A^{-1}$ such that $(TA)x = Tb$ is “easier” to solve
• Preconditioning for eigenproblems
  – Harder problem / not “as straightforward”
  – It can be shown that efficient preconditioners for linear systems are efficient preconditioners for CG-type eigensolvers
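The linear-system case can be illustrated with a short preconditioned CG sketch. The test matrix (a strongly varying diagonal with weak off-diagonal coupling) and the function name `pcg` are assumptions chosen so that the diagonal preconditioner $T = D^{-1}$ is a good approximation of $A^{-1}$; they are not taken from the talk.

```python
import numpy as np

def pcg(A, b, apply_T=None, tol=1e-10, max_iter=1000):
    """Preconditioned CG for symmetric positive definite A.

    apply_T(r) applies the preconditioner T ~ A^{-1}; None means T = I.
    Returns the approximate solution and the number of iterations used.
    """
    if apply_T is None:
        apply_T = lambda r: r
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_T(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        z = apply_T(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p     # new search direction
        rz = rz_new
    return x, max_iter

n = 200
d = np.arange(1, n + 1, dtype=float)                  # diagonal spans 1..200
A = np.diag(d) + 0.1 * (np.eye(n, k=1) + np.eye(n, k=-1))
b = np.ones(n)

x_plain, it_plain = pcg(A, b)                          # T = I
x_diag, it_diag = pcg(A, b, apply_T=lambda r: r / d)   # T = D^{-1}
print(it_plain, it_diag)
```

With this matrix the diagonally preconditioned run needs far fewer iterations than the plain one. The BB preconditioner aims for the same effect in cases where a diagonal $T$ alone is not effective enough.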

Bulk Band (BB) Preconditioner
Basic idea:
• Use the electronic properties of the constituent bulk materials of the nanostructure in designing a preconditioner
• What does it mean and how?

BB preconditioner
• Find electronic properties of the bulk materials:
  – Solve (E) on an infinite crystal (bulk material)
  – Because of the periodicity, solve just on the primary cell (a much smaller problem); find the solution in the form (Bloch theorem):
      $\psi_{nk}(r) = u_{nk}(r)\,e^{ikr}$,  $u_{nk}(r + A) = u_{nk}(r)$
  – Denote span{$\psi_{nk}$} as the BB space
• Denote by $H_{BB}$ the Hamiltonian stemming from a bulk problem; for $\psi \in$ BB space, $H_{BB}^{-1}\psi$ is easy to compute
• Note that if $H$ stems from a bulk problem, $H_{BB}^{-1}$ is the exact preconditioner for $H$ ($= H^{-1}$)
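The reduction to the primary cell can be verified numerically on a toy model. The sketch below assumes a 1D finite-difference crystal with a hypothetical cell potential `v_cell`, and the function names are illustrative. It compares the spectrum of the full periodic supercell with the union of the spectra of small cell problems, one per Bloch phase $\theta = kA$.

```python
import numpy as np

def supercell_hamiltonian(v_cell, n_cells, h=1.0):
    """Full periodic Hamiltonian for n_cells copies of the cell potential."""
    v = np.tile(v_cell, n_cells)
    n = v.size
    H = np.diag(1.0 / h**2 + v)
    hop = -0.5 / h**2
    for i in range(n):
        j = (i + 1) % n
        H[i, j] += hop
        H[j, i] += hop
    return H

def cell_hamiltonian(v_cell, theta, h=1.0):
    """Primary-cell Hamiltonian with Bloch condition psi(r + A) = e^{i*theta} psi(r)."""
    m = v_cell.size
    hop = -0.5 / h**2
    Hk = np.diag(1.0 / h**2 + v_cell).astype(complex)
    for i in range(m - 1):
        Hk[i, i + 1] += hop
        Hk[i + 1, i] += hop
    # Coupling across the cell boundary picks up the Bloch phase.
    Hk[m - 1, 0] += hop * np.exp(1j * theta)
    Hk[0, m - 1] += hop * np.exp(-1j * theta)
    return Hk

v_cell = np.array([0.0, -0.3, -0.8, -0.3, 0.0, 0.2, 0.4, 0.2])  # hypothetical
n_cells = 6

# Allowed Bloch phases for a ring of n_cells cells: theta = 2*pi*j / n_cells.
thetas = 2 * np.pi * np.arange(n_cells) / n_cells
bands = np.sort(np.concatenate(
    [np.linalg.eigvalsh(cell_hamiltonian(v_cell, t)) for t in thetas]))

full = np.linalg.eigvalsh(supercell_hamiltonian(v_cell, n_cells))
print(np.max(np.abs(bands - full)))
```

Six 8x8 problems reproduce the spectrum of one 48x48 problem here, which is the sense in which bulk quantities are cheap to obtain.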

BB preconditioner, continued …
• Decompose the current residual $R$ as
    $R = Q_{BB} R + (R - Q_{BB} R)$
  where $Q_{BB}$ is the $L^2$ projection onto the BB space
• Use $H_{BB}^{-1}$ to precondition the $Q_{BB} R$ component of $R$ and a diagonal preconditioner $D^{-1}$ for the $(R - Q_{BB} R)$ component, i.e.
    (1)  $T R = H_{BB}^{-1} Q_{BB} R + D^{-1}(R - Q_{BB} R)$
• $T R$ in (1) is just one example …
• Preconditioners of form (1) are referred to in the literature as additive; another variation is
    (2)  $T R = H_{BB}^{-1} Q_{BB} R + w D^{-1} R$,
  where $w > 0$ is a damping parameter
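Form (1) can be written in a few lines once the BB space has an orthonormal basis. The sketch below is a dense toy version under stated assumptions: `V` holds orthonormal BB basis vectors that are eigenvectors of $H_{BB}$ with eigenvalues `e_bb`, `d` is the diagonal used for $D^{-1}$, and the matrix itself is random. None of these names or choices come from the authors' code.

```python
import numpy as np

def apply_additive_bb(R, V, e_bb, d):
    """Additive preconditioner (1): H_BB^{-1} Q_BB R + D^{-1} (R - Q_BB R).

    R    : residual vector, shape (n,)
    V    : orthonormal BB basis, shape (n, m), eigenvectors of H_BB
    e_bb : matching eigenvalues of H_BB, shape (m,)
    d    : diagonal entries used for D, shape (n,)
    """
    c = V.conj().T @ R            # coefficients of Q_BB R in the BB basis
    QR = V @ c                    # projection Q_BB R
    return V @ (c / e_bb) + (R - QR) / d

rng = np.random.default_rng(1)
n, m = 40, 6
B = rng.standard_normal((n, n))
H_bb = B @ B.T / n + np.diag(np.linspace(1.0, 5.0, n))   # toy SPD "bulk" operator
e_all, U = np.linalg.eigh(H_bb)
V, e_bb = U[:, :m], e_all[:m]     # BB space: span of the m lowest eigenvectors
d = np.diag(H_bb).copy()

R_in = V @ rng.standard_normal(m)        # residual inside the BB space
R_out = rng.standard_normal(n)
R_out -= V @ (V.T @ R_out)               # residual orthogonal to the BB space

T_in = apply_additive_bb(R_in, V, e_bb, d)
T_out = apply_additive_bb(R_out, V, e_bb, d)
print(np.linalg.norm(H_bb @ T_in - R_in))
```

For a residual inside the BB space the preconditioner acts as the exact inverse of $H_{BB}$, and for a residual orthogonal to it only the diagonal part acts.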

BB preconditioner, continued …
• (2) can be viewed as a multilevel (two-level) preconditioner: “correct” the low frequency components of $R$ with $H_{BB}^{-1}$ and “smooth” the high frequencies with $D^{-1}$
• How to choose $w$ in (2)? (A weight is also present in (1).)
• Avoid the problem of determining it by considering a multiplicative multilevel version of the BB preconditioner:
    $r_1 = D^{-1} R$
    $r_2 = r_1 + H_{BB}^{-1} Q_{BB} (R - H r_1)$
    $T R = r_2 + D^{-1} (R - H r_2)$
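The three multiplicative steps translate directly into code. As a hedged illustration, the sketch below uses a small random symmetric positive definite matrix in place of the Hamiltonian and takes the BB basis `V` to be eigenvectors of that same matrix (the "bulk" situation); all names and the test matrix are assumptions.

```python
import numpy as np

def apply_multiplicative_bb(R, H, V, e_bb, d):
    """Multiplicative two-level BB preconditioner applied to a residual R.

    r1 = D^{-1} R                          (pre-smoothing)
    r2 = r1 + H_BB^{-1} Q_BB (R - H r1)    (BB-space correction)
    TR = r2 + D^{-1} (R - H r2)            (post-smoothing)
    """
    def bb_solve(x):
        # H_BB^{-1} Q_BB x, using an orthonormal eigenbasis V of H_BB
        return V @ ((V.conj().T @ x) / e_bb)

    r1 = R / d
    r2 = r1 + bb_solve(R - H @ r1)
    return r2 + (R - H @ r2) / d

rng = np.random.default_rng(2)
n = 30
B = rng.standard_normal((n, n))
H = B @ B.T / n + np.diag(np.linspace(1.0, 4.0, n))   # toy SPD operator
e_all, U = np.linalg.eigh(H)
d = np.diag(H).copy()
R = rng.standard_normal(n)

# Bulk case: the BB space is the whole space, so T acts as H^{-1}.
TR_bulk = apply_multiplicative_bb(R, H, U, e_all, d)

# Partial BB space: only 5 basis vectors are available.
TR_partial = apply_multiplicative_bb(R, H, U[:, :5], e_all[:5], d)
print(np.linalg.norm(H @ TR_bulk - R))
```

No damping parameter appears: each step works on the residual left by the previous one, so the error is multiplied by $(I - D^{-1}H)(I - H_{BB}^{-1}Q_{BB}H)(I - D^{-1}H)$.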

Numerical results
• Tests on a bulk problem
[Plots: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
• The BB preconditioner should be most efficient for this case (speedup by a factor of 3, increasing with problem size)
• We start with an arbitrary initial guess
• Here the BB space dimension is 1.5% of the solution space dimension

Numerical results
• Tests with “perturbed” potential (simulate a quantum dot)
[Plots: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
• Factor of 2 speedup
• Increasing with increasing problem size

Numerical results
• Tests with “perturbed” potential (simulate a quantum dot)
• Localized wave functions with charge density confinement, simulating a quantum dot

Numerical results
• Various perturbations with the BB multiplicative preconditioner
[Plots: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
• Not that sensitive to perturbation increase

Numerical results
• BB vs diagonal preconditioning on a bigger system (4096 atoms of Cd48-Se34) for various perturbations
[Plot: BB multiplicative preconditioning vs. diagonal preconditioning]
• Speedup exceeding a factor of 3
• Goes to about a factor of 7 for perturbation 4

Numerical results
• Comparison of diagonal (in red) vs BB preconditioning (in green) using the folded spectrum, $(H - E_{ref})^2$
[Plots: 64 atoms of Cd48-Se34; 512 atoms of Cd48-Se34]
• The speedup from the $H$ case is multiplied by a factor of 2
• A speedup by a factor of 4 for small problems, increasing with problem size

Conclusions
• A new preconditioning technique was presented
• Numerical results show the efficiency of the BB preconditioning
  – A factor of 4 speedup for small problems with folded spectrum (compared to diagonal preconditioning)
  – Increased efficiency with increasing problem size
• More testing has to be done
  – On bigger problems
  – With real quantum dots