Machine Learning Augmented Density Functional Tight Binding Theory
Adam McSloy
adam.mcsloy@warwick.ac.uk
04-05-20
Computational Chemistry: Introduction
Definition: “…a branch of chemistry that uses computer simulation to assist in solving chemical problems.”[1]
Example Use Cases:
● Rate constant prediction
● Diffusion coefficients and mechanisms
● Computer-aided drug design (CADD)
● Catalyst development
[1] Wikipedia et al. 2020
Computational Surface Chemistry Group
Systems of interest:
● Molecular photo-switches
● Molecular adsorption & packing
● Cluster formation
● … etc.
Primary Methods:
● DFT (physics driver)
● Molecular dynamics (time evolution)
Computational Chemistry: Methods
Strengths:
● Large systems
● Long time scales
● Electronic structure
Use-Cases:
● Molecular overlayer formation
● Biological systems
SCC-DFTB: Method
A second-order expansion of the DFT Kohn-Sham total energy (E_total) w.r.t. Δq:
▍Tight binding band structure term (H)
▍Long range electrostatic term (S, U)
▍Repulsive term: exchange, correlation + “everything else” (Vrep)
P. Koskinen and V. Mäkinen, Comput. Mater. Sci., 2009, 47, 237–253.
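The energy expression shown on this slide did not survive extraction. As a hedged reminder, the standard SCC-DFTB total energy has the three-term form below. The symbols are assumptions taken from the usual literature notation rather than from the slide itself: f_a are orbital occupations, H^0 is the reference tight binding Hamiltonian, γ_IJ is the charge interaction function (which depends on the Hubbard U), Δq_I are Mulliken charge fluctuations (which depend on the overlap S), and V_rep is the pairwise repulsive potential.

```latex
E_{\mathrm{total}}
  = \underbrace{\sum_{a} f_a \,\langle \psi_a | H^{0} | \psi_a \rangle}_{\text{band structure term}}
  + \underbrace{\frac{1}{2} \sum_{I,J} \gamma_{IJ}(R_{IJ})\, \Delta q_I \, \Delta q_J}_{\text{electrostatic term}}
  + \underbrace{\sum_{I<J} V_{\mathrm{rep}}^{IJ}(R_{IJ})}_{\text{repulsive term}}
```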
DFTB: Parameter Set Construction
P. Koskinen and V. Mäkinen, Comput. Mater. Sci., 2009, 47, 237–253.
DFTB: Parameter Set Use
P. Koskinen and V. Mäkinen, Comput. Mater. Sci., 2009, 47, 237–253.
DFTB: Current Shortcomings
Parameterisation:
● Time consuming
● Labour intensive
Traditional Parameter Sets:
● Poorly transferable
● Fail upon bond breaking
● Struggle with systems more complex than simple covalent and ionic interactions
P. Koskinen and V. Mäkinen, Comput. Mater. Sci., 2009, 47, 237–253.
DFTB: Machine Learning Augmentation
DFTB Network Layer
[Figure: DFTB network layer diagram; legend: Variable, Fixed]
H. Li, D. J. Yaron et al., J. Chem. Theory Comput., 14, 5764–5776 (2018).
B. Aradi, B. Hourahine, and Th. Frauenheim, J. Phys. Chem. A, 111, 5678 (2007).
DFTB Network: Augmentation
Metal handling:
● d-orbital treatment
● Finite temperature modelling
Electronic structure resolution:
● Angular-resolved Δq cost
● PDoS cost function
Data Set
Requirements:
● H, C, N, O & Au systems
● Span chemical space of interest
Specification:
a) 97 molecules with 2 Au atoms (CSIM)
b) 388 Mol-Au4-10 cluster systems (MD)
Total:
● 485 systems with 400 geometries each
● Single point DFT calculations
Initial Training
Random test/train partition:
● Poor test/train distinction
● Prevents detection of overfitting
Solution:
[Figure: Training Set, Testing Set]
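The partition figure on this slide was not preserved, so the exact scheme used is not known. One common way to get a clean test/train distinction is to partition by system rather than by geometry, so that no system contributes geometries to both sets. The sketch below assumes that approach. The function name `split_by_system`, the 20% test fraction, and the integer IDs are all hypothetical and chosen only for illustration; the 485 × 400 counts come from the data set slide.

```python
import random
from collections import defaultdict


def split_by_system(samples, test_fraction=0.2, seed=0):
    """Partition (system_id, geometry_id) pairs so no system spans both sets.

    Hypothetical helper: the slides do not specify the partition scheme.
    """
    # Group every geometry under the system it belongs to.
    by_system = defaultdict(list)
    for system_id, geometry_id in samples:
        by_system[system_id].append((system_id, geometry_id))

    # Shuffle whole systems (not individual geometries) reproducibly.
    system_ids = sorted(by_system)
    random.Random(seed).shuffle(system_ids)
    n_test = max(1, round(test_fraction * len(system_ids)))

    test = [s for sid in system_ids[:n_test] for s in by_system[sid]]
    train = [s for sid in system_ids[n_test:] for s in by_system[sid]]
    return train, test


# 485 systems with 400 geometries each, labelled here by plain integers.
samples = [(system, geom) for system in range(485) for geom in range(400)]
train, test = split_by_system(samples)

train_systems = {system for system, _ in train}
test_systems = {system for system, _ in test}
print(len(train), len(test), train_systems.isdisjoint(test_systems))
```

With a purely random split over geometries, near-identical structures from the same system land on both sides, which is one plausible reading of the "poor test/train distinction" noted above.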
The SCC Cycle
Network pass: a few minutes
SCC run: hours
Cause:
● Serial
● Sequential
● Single threaded
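To illustrate why the SCC cycle is inherently sequential, here is a minimal toy sketch of a self-consistent charge loop using NumPy. It is not the implementation used in this work: it assumes one orbital per atom, double occupation, and simple linear mixing, and the function name `scc_cycle` and all numerical values are hypothetical. The point is only that each pass needs the charges produced by the previous pass, so the iterations cannot run in parallel.

```python
import numpy as np


def scc_cycle(h0, s, gamma, q0, n_occ, mix=0.3, tol=1e-8, max_iter=200):
    """Toy SCC loop, one orbital per atom (illustrative assumption only)."""
    # S^(-1/2) for Loewdin orthogonalisation of the generalised eigenproblem.
    s_val, s_vec = np.linalg.eigh(s)
    s_inv_sqrt = s_vec @ np.diag(s_val ** -0.5) @ s_vec.T

    dq = np.zeros_like(q0)
    for iteration in range(1, max_iter + 1):
        # Charge-dependent shift: depends on dq from the previous pass.
        shift = gamma @ dq
        h = h0 + 0.5 * s * (shift[:, None] + shift[None, :])

        # Solve H c = e S c via the orthogonalised Hamiltonian.
        _, c_orth = np.linalg.eigh(s_inv_sqrt @ h @ s_inv_sqrt)
        c = s_inv_sqrt @ c_orth[:, :n_occ]

        # Mulliken populations from the doubly occupied orbitals.
        p = 2.0 * c @ c.T
        q = np.sum(p * s, axis=1)
        dq_new = q - q0

        if np.max(np.abs(dq_new - dq)) < tol:
            return dq_new, iteration
        dq = (1.0 - mix) * dq + mix * dq_new  # simple linear mixing
    raise RuntimeError("SCC cycle did not converge")


# Hypothetical two-atom model; all values are made up for illustration.
h0 = np.array([[-0.5, -0.2], [-0.2, -0.3]])
s = np.array([[1.0, 0.3], [0.3, 1.0]])
gamma = np.array([[0.4, 0.2], [0.2, 0.4]])
q0 = np.array([1.0, 1.0])

dq, n_iter = scc_cycle(h0, s, gamma, q0, n_occ=1)
print(dq, n_iter)
```

In this toy model the atom with the lower on-site energy gains electron population, and the total charge fluctuation sums to zero because the Mulliken populations conserve the electron count.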
Acknowledgements
Collaborators:
● Prof. D. Yaron (Carnegie Mellon)
● Dr. B. Hourahine (Strathclyde University)
● Dr. B. Aradi (Bremen University)
Funding:
● AI3SD
● UKRI
Compute Time:
● HEC-MCC
● Midlands HPC
● Warwick CSC
- Slides: 15