PARALLEL COMPUTING IN COMPUTATIONAL CHEMISTRY

Why? What happens at the molecular level?

Computational chemistry is a branch of chemistry that uses computer simulation to assist in solving chemical problems. It uses methods of theoretical chemistry, incorporated into efficient computer programs, to calculate the structures and properties of molecules.

Phase Transition for a Hard Sphere System
B. J. Alder and T. E. Wainwright, J. Chem. Phys. 27, 1208 (1957); doi: 10.1063/1.1743957
PNO: 96
PBC (periodic boundary conditions)
MD simulation
IBM-704

IBM-704
The first mass-produced computer with floating-point arithmetic hardware, introduced by IBM in 1954.
36-bit words; could execute up to 12,000 floating-point additions per second (on the order of kFLOPS).
PetaFLOPS: 10^15 (a quadrillion, i.e. a thousand trillion, calculations per second).
Future: exaFLOPS: 10^18 (a quintillion, i.e. a billion billion, calculations per second).

1960: Vineyard group; simulated radiation damage of a Cu crystal with MD
1964: Rahman; MD simulation of liquid Ar
1969: Barker and Watts; Monte Carlo simulation of water
1971: Rahman and Stillinger; MD simulation of water
[Figures: Cray-1 (1976), Cray T3E (1995)]

Year    Clock speed
1985    33 MHz
1989    100 MHz
1993    233 MHz
1996    385 MHz
1997    450 MHz
1999    570 MHz
1999    1.4 GHz
2000    2 GHz
2001    2.25 GHz
2004    2.3 GHz
2004    3.2 GHz
2006    3.2 GHz
Ref: www.maximumpc.com

Parallel computing is a type of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved at the same time.
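The idea of dividing a large problem into smaller ones can be sketched as follows. This is a minimal illustration, not taken from the slides: the function names `partial_sum` and `parallel_sum_of_squares` and the choice of a sum of squares as the "large problem" are assumptions made for the example. Threads are used only to keep the sketch self-contained; in CPython a process pool or MPI would be needed for a real speed-up on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Solve one small sub-problem: the sum of squares of one chunk."""
    return sum(v * v for v in chunk)


def parallel_sum_of_squares(values, n_workers=4):
    """Split the list into chunks, sum each chunk concurrently, then combine."""
    size = max(1, (len(values) + n_workers - 1) // n_workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each worker handles one chunk; the final sum combines the partial results.
        return sum(pool.map(partial_sum, chunks))


total = parallel_sum_of_squares(list(range(1, 1001)))
```

The only step that needs all the data is the final combination of the partial results, which is why this kind of problem divides so easily.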

Instructions:
1 - clean the windows
2 - clean the door
3 - clean the roof
4 - clean the table

[Figure: a problem is broken into instructions, which are distributed between processors]

1 - Having suitable hardware
2 - The problem can be parallelized
3 - Having a suitable algorithm

HARDWARE: Parallel hardware architectures
[Figure: a CPU (control unit, arithmetic logic unit) connected to memory, input, and output. Shared memory: several CPUs attached to one memory. Distributed memory: each CPU has its own memory, and the CPUs are linked by a network.]

Central Processing Unit (CPU): basic arithmetic, logical, control, and input/output operations
Single core CPU, dual core, quad core

HARDWARE: Computational Units (GPU)
Graphics Processing Unit (GPU)
[Figure: CPU and GPU]

Molecular dynamics simulation of protein insertion process
NCSA Lincoln Cluster performance (8 Intel cores and 2 NVIDIA Tesla GPUs per node, 1 million atoms)
Ref: www.ks.uiuc.edu/Research/namd

GPUs have a fundamentally different architecture from CPUs.
GPU constraints:
An application has to be programmed specifically for the GPU, using different techniques.
It needs new programming languages.
It needs a new programming paradigm.

Programs with GPU support:
NAMD (www.ks.uiuc.edu/Research/namd)
LAMMPS (lammps.sandia.gov)
Gromacs (www.gromacs.org)
DL_POLY 4 (www.stfc.ac.uk//research/app/ccg/software/DL_POLY/44516.aspx)
GAMESS 2012: closed-shell MP2 and closed-shell CCSD(T) energy (www.msg.ameslab.gov/gamess)

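      x(1)=100.
      DO 10 i=2, 1000
        x(i)=sin(x(i-1))
 10   CONTINUE

i=2 : x(2)=sin(x(1))
i=3 : x(3)=sin(x(2))
i=4 : x(4)=sin(x(3))

Each iteration needs the result of the previous one, so the iterations of this loop cannot be carried out simultaneously.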
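The contrast between a dependent and an independent loop can be sketched in Python. This is an illustrative sketch only: `dependent_chain` mirrors the loop above, while `independent_map` and its input values are assumptions added to show a loop whose iterations do not read each other's results.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def dependent_chain(n, x1=100.0):
    """x(i) = sin(x(i-1)): iteration i needs iteration i-1, so it must run in order."""
    x = [x1]
    for _ in range(1, n):
        x.append(math.sin(x[-1]))
    return x


def independent_map(inputs, n_workers=4):
    """y(i) = sin(a(i)): no iteration reads another's result, so work can be shared out."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(math.sin, inputs))


chain = dependent_chain(1000)
mapped = independent_map([0.1 * i for i in range(1000)])
```

Only the second function can hand different iterations to different workers without changing the result.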

A =  2  1  5  3        B =  6  1  2  3
     0  7  1  6             4  5  6  5
     9  2  4  4             1  9  8 -8
     3  6  7  2             4  0 -8  5

    /* C = A * B for n x n matrices */
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            C[i][j] = 0;
            /* dot product of row i of A and column j of B */
            for (k = 0; k < n; k++)
                C[i][j] += A[i][k] * B[k][j];
        }
    }
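Because every element of C depends only on A and B, and never on another element of C, the rows of the product can be computed by different workers. The sketch below uses the two 4 x 4 matrices from the slide; the function names and the row-wise distribution are assumptions for illustration, and other distributions (by column or by block) are equally possible.

```python
from concurrent.futures import ThreadPoolExecutor

A = [[2, 1, 5, 3], [0, 7, 1, 6], [9, 2, 4, 4], [3, 6, 7, 2]]
B = [[6, 1, 2, 3], [4, 5, 6, 5], [1, 9, 8, -8], [4, 0, -8, 5]]


def row_times_matrix(row, b):
    """Compute one row of the product: row (length n) times matrix b (n x m)."""
    n_cols = len(b[0])
    return [sum(row[k] * b[k][j] for k in range(len(b))) for j in range(n_cols)]


def parallel_matmul(a, b, n_workers=4):
    """Give each worker whole rows of a; rows of the result are independent."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda row: row_times_matrix(row, b), a))


C = parallel_matmul(A, B)
```

Here each worker reads all of B but only its own rows of A, so no communication is needed until the rows are gathered into C.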

Flowchart:
1 - Obtain initial guess for density matrix
2 - Two-electron integrals
3 - Fock matrix formation
4 - Diagonalize Fock matrix
5 - Form new density matrix
6 - Iterate
[Chart labels: Fock matrix formation, Density formation, Annihilation, Others, Integral evaluation]
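Fock matrix formation is typically the most expensive step of such a self-consistent-field cycle, and its elements can be built independently of each other. The toy sketch below illustrates that independence only. Everything in it is an assumption for illustration: `toy_eri` returns made-up numbers in place of real two-electron integrals, `H` and `P` are invented 3 x 3 matrices, and the closed-shell expression F_ij = H_ij + sum_kl P_kl [(ij|kl) - 0.5 (ik|jl)] is the standard textbook form rather than anything stated on the slide. Production codes distribute batches of integrals rather than rows of F.

```python
from concurrent.futures import ThreadPoolExecutor

N = 3  # number of basis functions in this toy example


def toy_eri(i, j, k, l):
    """Hypothetical stand-in for the two-electron integral (ij|kl); values are made up."""
    return 1.0 / (1 + abs(i - j) + abs(k - l) + abs((i + j) - (k + l)))


# Toy core Hamiltonian and density matrix (symmetric, invented values).
H = [[-1.0 if i == j else -0.2 for j in range(N)] for i in range(N)]
P = [[0.5 if i == j else 0.1 for j in range(N)] for i in range(N)]


def fock_row(i):
    """Row i of F: F_ij = H_ij + sum_kl P_kl * [(ij|kl) - 0.5 * (ik|jl)]."""
    row = []
    for j in range(N):
        g = sum(P[k][l] * (toy_eri(i, j, k, l) - 0.5 * toy_eri(i, k, j, l))
                for k in range(N) for l in range(N))
        row.append(H[i][j] + g)
    return row


def build_fock(n_workers=3):
    """Each worker builds different rows; no row depends on another row."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(fock_row, range(N)))


F = build_fock()
```

The diagonalization and density-update steps that follow need the complete matrix, so they act as synchronization points in each iteration.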

Ref: DOI: 10.1039/c002859b

Molecular dynamics (MD) is a computer simulation technique where the time evolution of a set of interacting atoms is followed by integrating their equations of motion.
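Integrating the equations of motion can be sketched with the velocity Verlet scheme, a common choice in MD codes. The slides do not name an integrator, so this choice, the one-dimensional harmonic "atom", and all names below (`velocity_verlet`, `spring_force`, `K`) are assumptions for illustration.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of size dt under force(x)."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update with old force
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step with new force
        trajectory.append((x, v))
    return trajectory


K = 1.0  # spring constant of the toy harmonic potential


def spring_force(x):
    return -K * x


def total_energy(x, v, mass=1.0):
    return 0.5 * mass * v * v + 0.5 * K * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=spring_force, mass=1.0, dt=0.01, n_steps=1000)
```

In a real simulation the single call to `force` is the sum over all interacting atoms, which is the part that the parallel strategies on the following slides divide between processors.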

[Flowchart: Initialize, Calculation of forces, Motion, Analysis, Summarize. Additional labels: Force, Others]

Ref: Rom. J. Biochem., 46, 2, 129-148 (2009)

Replicated data (RD)
Advantages:
Simplicity (this is a relatively easy parallel strategy to implement, requiring only minor changes to scalar code).
Disadvantages:
Memory usage is high (due to duplication of data).
Communication costs are quite high.
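The trade-off described above matches a replicated-data scheme, in which every worker holds a full copy of the atomic positions but computes forces only for the atoms it owns. The following is a minimal sketch under that assumption; the 1-D toy pair force, the round-robin ownership, and all function names are invented for illustration, and the dictionary merge stands in for the global communication step that an MPI program would perform.

```python
from concurrent.futures import ThreadPoolExecutor


def pair_force(xi, xj):
    """Toy 1-D repulsive pair force on atom i due to atom j (made-up potential)."""
    d = xi - xj
    return d / (abs(d) ** 3 + 1e-12)


def forces_for_atoms(my_atoms, all_positions):
    """One worker: needs ALL positions, but computes forces only for its own atoms."""
    return {i: sum(pair_force(all_positions[i], all_positions[j])
                   for j in range(len(all_positions)) if j != i)
            for i in my_atoms}


def replicated_data_forces(positions, n_workers=3):
    n = len(positions)
    owned = [range(w, n, n_workers) for w in range(n_workers)]  # round-robin ownership
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # list(positions) gives every worker its own full copy: the memory cost of RD.
        parts = pool.map(lambda atoms: forces_for_atoms(atoms, list(positions)), owned)
        merged = {}
        for part in parts:  # stands in for the global communication step
            merged.update(part)
    return [merged[i] for i in range(n)]


positions = [0.0, 1.0, 2.5, 4.0, 6.0, 6.5]
forces = replicated_data_forces(positions)
```

The force routine is unchanged from what a serial code would use, which is the simplicity the slide refers to; the full copy of the positions per worker is the memory cost.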

Properties:
Communication operations scale as [...] rather than N.
Memory costs for position and force vectors are reduced by the factor [...].
Retains the simplicity of the RD technique.
Ref: DOI: 10.1007/1-4020-2670-5_15

[Figure label: rcut]
Properties:
The communication costs can be minimized.
Needs more sophisticated programming.
Ref: DOI: 10.1002/(SICI)1096-987X(199703)18:4<478::AID-JCC3>3.0.CO;2-Q
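The cutoff radius rcut and the reduced communication suggest a spatial (domain) decomposition, in which each processor owns the atoms in one region of the box and only needs copies of neighbouring atoms lying within rcut of its boundary. The sketch below works under that assumption. It is one-dimensional and non-periodic, it assumes rcut is no larger than the domain width, and the function name `decompose` and the sample coordinates are invented for illustration.

```python
def decompose(positions, box_length, n_domains, r_cut):
    """Assign atoms to equal-width 1-D domains and list the halo atoms each domain needs.

    Assumes a non-periodic box and r_cut <= box_length / n_domains.
    """
    width = box_length / n_domains
    domains = [{"owned": [], "halo": []} for _ in range(n_domains)]
    for i, x in enumerate(positions):
        d = min(int(x // width), n_domains - 1)  # index of the owning domain
        domains[d]["owned"].append(i)
        lo, hi = d * width, (d + 1) * width
        # An atom close to a boundary must also be sent to the neighbour across it.
        if d > 0 and x - lo < r_cut:
            domains[d - 1]["halo"].append(i)
        if d < n_domains - 1 and hi - x < r_cut:
            domains[d + 1]["halo"].append(i)
    return domains


positions = [0.5, 2.4, 2.6, 4.9, 5.2, 9.0]
domains = decompose(positions, box_length=10.0, n_domains=4, r_cut=0.5)
```

Only the halo atoms are exchanged between neighbouring domains, which is how the communication cost is kept low; the bookkeeping for ownership and halos is the more sophisticated programming the slide mentions.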

Many independent serial programs run at the same time on many CPUs (embarrassingly parallel).
A problem is divided into parts, and each part runs on its own CPU.
Load balancing
Communication cost
Computation cost
The number of CPUs
Amount of memory
The chosen algorithm
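Load balancing for a set of independent jobs can be sketched with a simple greedy rule: take the jobs from most to least expensive and always give the next one to the CPU with the smallest load so far. This heuristic, the function name `balance_jobs`, and the job costs are assumptions chosen for illustration; the slides do not prescribe a particular balancing method.

```python
import heapq


def balance_jobs(job_costs, n_cpus):
    """Greedy assignment: the next most expensive job goes to the least-loaded CPU."""
    heap = [(0.0, cpu) for cpu in range(n_cpus)]  # (current load, cpu index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_cpus)]
    for job, cost in sorted(enumerate(job_costs), key=lambda jc: -jc[1]):
        load, cpu = heapq.heappop(heap)
        assignment[cpu].append(job)
        heapq.heappush(heap, (load + cost, cpu))
    loads = [sum(job_costs[j] for j in jobs) for jobs in assignment]
    return assignment, loads


costs = [8, 7, 6, 5, 4, 3, 2, 1]  # hypothetical run times of eight independent jobs
assignment, loads = balance_jobs(costs, n_cpus=3)
```

For these costs the three CPUs end up with loads of 13, 12, and 11, whereas cutting the same list into three contiguous blocks would give 21, 12, and 3, leaving two CPUs idle while the first one finishes.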

Hexanitroethane, C2N6O12
B3LYP/6-31G(df,pd), single point

THANKS!