Parallel Optimization Methods for Simulation-Based Problems in Nanoscience
Juan Meza, Michel van Hove, Zhengji Zhao
Lawrence Berkeley National Laboratory, Berkeley, CA
http://hpcrd.lbl.gov/~meza
Supported by DOE/MICS
SIAM CSE Conference, Orlando, FL, Feb. 12-15, 2005
Computational Research Division
Many scientific applications require the solution of an optimization problem
Low-energy electron diffraction (LEED)
- Goal is to determine surface structure through low-energy electron diffraction (LEED)
- Inverse problem consists of minimizing the so-called R-factor, a measure of the fit between experiment and theory
- Combination of global and local optimization
- Inherently noisy optimization problem
Figure: Low-energy electron diffraction pattern due to a monolayer of ethylidyne attached to a rhodium (111) surface
Surface structure determination from experiment
- Forty-five structural models were proposed in a complex surface structure determination by low-energy electron diffraction experiments
- Lattice sites can be occupied by Ni or Li atoms, or be vacant. In addition, continuous fit parameters corresponding to local relaxation of positions are allowed.
- Many arrangements of Ni atoms (light and dark green) and Li atoms (yellow and orange) are possible within the outlined two-dimensional square unit cell.
Surface structure determination from experiment
- Electron diffraction determination of atomic positions in a surface:
  - Li atoms on a Ni surface
- Global optimization of structure type: which of these 45 structure types best fits experiment?
- Local optimization of structure parameters: which are the best interatomic distances and angles?
Low Energy Electron Diffraction R-Factors
Characteristics of optimization problem
- Inverse problem
  - Minimize the R-factor, defined as the misfit between theory and experiment
  - Several ways of computing the R-factor
- Combination of continuous and categorical variables
  - Continuous: atomic coordinates, i.e., x, y, z
  - Categorical: atom type, i.e., Ni or Li
- No derivatives available, a standard issue with "black-box" simulations
- Invalid structures lead to the function being undefined in certain regions and/or discontinuous
Pendry R-factor (equation shown on slide), where the intensity curve, I, is computed by the LEED code
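The slide's equation did not survive in this transcript. As a hedged illustration, the sketch below uses the commonly cited form of the Pendry R-factor: with logarithmic derivative L = I'/I and Y = L^-1 / (L^-2 + V0i^2), R_P is the integral of (Y_theory - Y_experiment)^2 divided by the integral of (Y_theory^2 + Y_experiment^2). It handles a single beam only, and the function names and the default `v0i=4.0` (an assumed imaginary inner potential in eV) are illustrative assumptions, not values taken from the talk or from the LEED code.

```python
import numpy as np

def pendry_y(energy, intensity, v0i):
    """Pendry Y-function, Y = L^-1 / (L^-2 + v0i^2) with L = I'/I.

    Rewritten as Y = L / (1 + (v0i * L)^2) so that points where I' = 0
    do not cause a division by zero. Intensities must be nonzero.
    """
    log_deriv = np.gradient(intensity, energy) / intensity
    return log_deriv / (1.0 + (v0i * log_deriv) ** 2)

def _trapezoid(values, energy):
    # Plain trapezoidal rule over the energy grid.
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(energy)))

def pendry_r_factor(energy, i_theory, i_experiment, v0i=4.0):
    """Single-beam Pendry R-factor (v0i is an assumed value, in eV)."""
    y_th = pendry_y(energy, i_theory, v0i)
    y_ex = pendry_y(energy, i_experiment, v0i)
    numerator = _trapezoid((y_th - y_ex) ** 2, energy)
    denominator = _trapezoid(y_th ** 2 + y_ex ** 2, energy)
    return numerator / denominator

# Synthetic I(E) curves, for illustration only.
energy = np.linspace(50.0, 300.0, 501)
i_theory = 1.0 + 0.5 * np.sin(energy / 10.0)
i_shifted = 1.0 + 0.5 * np.sin((energy - 8.0) / 10.0)
r_same = pendry_r_factor(energy, i_theory, i_theory)
r_shifted = pendry_r_factor(energy, i_theory, i_shifted)
```

Because Y depends only on the logarithmic derivative, rescaling an intensity curve by a constant leaves the R-factor unchanged, while shifting peak positions raises it.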
Previous Work
- Previous attempt used genetic algorithms to solve the global optimization problem.
- Large number of invalid structures generated (more on this later).
- Overall, a solution was found, but only after adding sufficient constraints.
References:
- Global Optimization in LEED Structure Determination Using Genetic Algorithms, R. Döll and M. A. Van Hove, Surf. Sci. 355, L393-8 (1996).
- A Scalable Genetic Algorithm Package for Global Optimization Problems with Expensive Objective Functions, G. S. Stone, M.S. dissertation, Computer Science Dept., San Francisco State University, 1998.
Pattern search methods
- Pattern search methods: Torczon; Lewis and Torczon; Lewis, Kolda, and Torczon (2004); etc.
- Extension to mixed variable problems by Audet and Dennis (2000).
- Case of nonlinear constraints studied in Abramson's Ph.D. dissertation (2002).
- A frame-based Mesh Adaptive Direct Search (MADS) method, proposed by Audet and Dennis (2004), removes the restriction to a finite number of poll directions.
- Good software available: APPSPACK (Kolda), NOMADm (Abramson), OPT++ (Hough, Meza, Williams)
NOMADm
- Variables can be continuous, discrete, or categorical
- General constraints (bound, linear, nonlinear)
  - Nonlinear constraints can be handled by either a filter method or a MADS-based approach for constructing poll directions
- Objective and constraint functions can be discontinuous, extended-value, or nonsmooth
- Available at: http://en.afit.edu/ENC/Faculty/MAbramson/NOMADm.html
MVP Algorithm
1. Initialization: Given $\Delta_0$, $x_0$, $M_0$, $P_0$
2. For k = 0, 1, ...
   a) SEARCH: Evaluate f on a finite subset of trial points on the mesh $M_k$. This global phase can include user heuristics or surrogate functions.
   b) POLL: Evaluate f on the frame $P_k$. This local phase is more rigid, but necessary to ensure convergence.
3. If successful, expand the mesh: $x_{k+1} = x_k + \Delta_k d_k$
4. Otherwise, contract the mesh
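The SEARCH/POLL loop above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the NOMADm implementation: the function name `mixed_pattern_search`, the random mesh SEARCH, the coordinate-direction POLL, the "change one site" categorical neighborhood, and the factor-of-two mesh update are all choices made here for clarity.

```python
import random

def mixed_pattern_search(f, x0, c0, categories, delta0=1.0, tol=1e-6,
                         n_search=2, max_evals=5000, seed=0):
    """Toy mixed-variable pattern search over continuous x and categorical c."""
    rng = random.Random(seed)
    x, c = list(x0), list(c0)
    best = f(x, c)
    evals, delta = 1, delta0
    while delta > tol and evals < max_evals:
        trials = []
        # SEARCH: a few random points on the current mesh (a stand-in for
        # user heuristics or surrogate-based proposals).
        for _ in range(n_search):
            tx = [xi + delta * rng.randint(-3, 3) for xi in x]
            tc = [rng.choice(categories) for _ in c]
            trials.append((tx, tc))
        # POLL, continuous part: plus/minus delta along each coordinate.
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                tx = list(x)
                tx[i] += sign * delta
                trials.append((tx, list(c)))
        # POLL, categorical part: change the category of one site at a time.
        for j in range(len(c)):
            for value in categories:
                if value != c[j]:
                    tc = list(c)
                    tc[j] = value
                    trials.append((list(x), tc))
        improved = False
        for tx, tc in trials:
            value = f(tx, tc)
            evals += 1
            if value < best:  # accept the first improving trial point
                x, c, best, improved = tx, tc, value, True
                break
        # Expand the mesh after a success, contract it after a failure.
        delta = 2.0 * delta if improved else 0.5 * delta
    return x, c, best, evals

# Hypothetical test objective: quadratic in x plus a count of non-Ni sites.
def toy_objective(x, c):
    return sum((xi - 1.0) ** 2 for xi in x) + sum(1 for ci in c if ci != "Ni")

x_best, c_best, f_best, n_evals = mixed_pattern_search(
    toy_objective, [0.0, 0.0], ["Li", "Li"], ("Li", "Ni"))
```

Only the POLL step is needed for the loop to terminate at a mesh-local minimizer; the SEARCH step can be swapped for any heuristic without affecting that property.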
Convergence properties
Assuming f(x) is suitably smooth:
- For unsuccessful iterations, $\|\nabla f(x_k)\|$ is bounded as a function of the step length $\Delta_k$
- Via globalization, $\liminf_{k \to \infty} \Delta_k = 0$
- Conclude: $\liminf_{k \to \infty} \|\nabla f(x_k)\| = 0$
Test problem
- Model contains three layers of atoms
- Using symmetry considerations, we can reduce the problem to 14 atoms
  - 14 categorical variables
  - 42 continuous variables
- Positions of atoms constrained to lie within a box
- Best known previous solution had R-factor = 0.24
Figure: Model 31 from set of TLEED model problems
GA results: atomic coordinates constrained to 0.4 Angstrom (invalid structures)
NOMAD results for minimization with respect to the continuous variables
Best known solution: R-factor = 0.24
NOMAD results for minimization with respect to the continuous variables
R-factor = 0.12, # of function calls = 591
Best known solution: R-factor = 0.24
GA results: categorical variable search with fixed atomic positions
Best known solution: 1111122222
Legend: Li, Ni
Structures shown: 11111122211122, 11111122221122, 11111222221222, 1111122222, 211122222
Remark: population size = 10 per generation
NOMAD results for categorical variables with fixed atomic positions
Best known solution (R = 0.24): 1111122222
Legend: Li, Ni
Structures shown: 11111122211122, 1111122222
R = 0.2387, # of function calls = 49
Robustness of NOMAD: 15 of 20 initial guesses (poll step only)
Best known solution (R = 0.24): 1111122222
Legend: Li, Ni
Average initial R = 0.5671
Average # of function calls = 63
Five of 20 trials are trapped in local minima using only the poll step
Best known solution (R = 0.24): 1111122222
Legend: Li, Ni
LHS (Latin hypercube sampling) search + GSS (generating set search) poll escapes from local minima
Best known solution (R = 0.24): 1111122222
Legend: Li, Ni
New minimum found (R = 0.1184): 22222112111111
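For readers unfamiliar with Latin hypercube sampling, the sketch below shows the basic idea for continuous variables: each dimension is cut into as many equal strata as there are samples, and every stratum is hit exactly once. The function name `latin_hypercube` and the example bounds are hypothetical; how the talk's SEARCH step maps samples onto the categorical Li/Ni variables is not stated in the slides and is not reproduced here.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so each dimension has one point per stratum.

    bounds is a sequence of (low, high) pairs, one per dimension.
    """
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    dim = bounds.shape[0]
    # One uniform draw inside each of the n_samples strata of [0, 1).
    unit = (np.arange(n_samples)[:, None]
            + rng.random((n_samples, dim))) / n_samples
    # Shuffle the strata independently in each dimension.
    for j in range(dim):
        unit[:, j] = rng.permutation(unit[:, j])
    low, high = bounds[:, 0], bounds[:, 1]
    return low + unit * (high - low)

# Example: 8 trial points in a box of +/- 0.4 around a reference position
# (the box size is an illustrative choice).
trial_points = latin_hypercube(8, [(-0.4, 0.4)] * 3, seed=1)
```

Compared with plain uniform sampling, this spreads a small number of expensive function evaluations more evenly across each variable's range.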
NOMAD results for 20 trials using LHS + GSS
20 trials of identity search; average initial R = 0.5243
- R = 0.2387: average # of function calls = 73
- R = 0.1184: average # of function calls = 152
Best known solution (R = 0.24): 1111122222
New minimum found (R = 0.1184): 22222112111111
Minimization with respect to both continuous and categorical variables
Simultaneous relaxation of both continuous and categorical variables removes the restriction on coordinates
- R-factor = 0.24, # of function calls = 212
- R-factor = 0.2151, # of function calls = 1195
Best known solution: R-factor = 0.24
Minimization with respect to both types of variables removes coordinate constraints
Penalty R-factor = 1.6 (invalid structures)
Best known solution: R-factor = 0.24
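One way to present invalid structures to a direct search method is to return a fixed penalty value instead of calling the simulation. The sketch below uses the penalty R-factor of 1.6 quoted on the slide; everything else is an assumption made for illustration, including the names `make_penalized` and `min_pair_distance` and the validity test itself (a minimum interatomic distance), since the slides do not say how validity is checked.

```python
import math

PENALTY_R = 1.6  # penalty value quoted on the slide for invalid structures

def min_pair_distance(positions):
    """Smallest distance between any two atoms (positions are (x, y, z))."""
    smallest = math.inf
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            smallest = min(smallest, math.dist(positions[i], positions[j]))
    return smallest

def make_penalized(r_factor, min_allowed, penalty=PENALTY_R):
    """Wrap an R-factor function so invalid structures return the penalty.

    The validity rule (no two atoms closer than min_allowed) is a
    hypothetical stand-in for the real check.
    """
    def objective(positions, species):
        if min_pair_distance(positions) < min_allowed:
            return penalty  # skip the expensive simulation entirely
        value = r_factor(positions, species)
        return penalty if math.isnan(value) else value
    return objective

# Usage with a placeholder R-factor function.
objective = make_penalized(lambda positions, species: 0.3, min_allowed=1.0)
r_valid = objective([(0, 0, 0), (2, 0, 0)], ["Ni", "Li"])
r_invalid = objective([(0, 0, 0), (0.5, 0, 0)], ["Ni", "Li"])
```

A constant penalty keeps the objective defined everywhere, which suits pattern search methods that need only function comparisons and tolerate discontinuities.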
LEED Chemical Identity Search: Ni(100)-(5x5)-Li
Best known solution (R = 0.24)
New structure found (R = 0.1184)
Conclusions
- GSS methods for mixed variable problems were successful in solving the surface structure determination problem
  - On average, NOMAD took 60 function evaluations versus 280 for the previous solution (GA)
  - Solutions improved on the previous best known solutions in all cases
  - Far fewer invalid structures were generated
- Algorithm appears to be fairly robust, with a better structure found from all 20 trial points
- Ability to minimize with respect to both categorical and continuous variables is a critical advantage for these types of problems
Future work
- Implement parallel version of algorithms
- Improve the probability of not being trapped in local minima
  - Develop new SEARCH strategies, especially for categorical variables
- Develop automatic strategies for switching between different structure models
- Improve objective function call
  - Develop new validity check
  - Experiment with other R-factor formulations to increase the sensitivity
- Implement simultaneous minimization of other physical quantities (e.g., energy)
Acknowledgements
- Chao Yang
- Lin-Wang
- Xavier Cartoxa
- Andrew Canning
- Byounghak Lee
Questions