An Introduction to Monte Carlo Methods for Lattice

  • Slides: 155
Download presentation
An Introduction to Monte Carlo Methods for Lattice Quantum Field Theory A D Kennedy

An Introduction to Monte Carlo Methods for Lattice Quantum Field Theory A D Kennedy University of Edinburgh 2/18/2021 XI Seminario Nazionale di Fisica Teorica

Model or Theory? 4 QCD is highly constrained by symmetry 4 Only very few

Model or Theory? 4 QCD is highly constrained by symmetry 4 Only very few free parameters 4 Coupling constant g 4 Quark masses m 4 Scaling behaviour well understood 4 Approach to continuum limit 4 Approach to thermodynamic (infinite volume) limit 2/18/2021 XI Seminario Nazionale di Fisica Teorica 2

Progress in Algorithms 4 Important advances made 4 None so far have had a

Progress in Algorithms 4 Important advances made 4 None so far have had a major impact on machine architecture 4 Inclusion of dynamical fermion effects 4 Now routine, but expensive 4 Improved actions 4 New formulations for lattice fermions 4 Chiral fermions can be defined satisfactorily 4 Need algorithms to use new formalism 4 Very compute intensive 2/18/2021 XI Seminario Nazionale di Fisica Teorica 3

Algorithms 4 Hybrid Monte Carlo 4 Full QCD including dynamical quarks 4 Most of

Algorithms 4 Hybrid Monte Carlo 4 Full QCD including dynamical quarks 4 Most of the computer time spent integrating Hamilton’s equations in fictitious (fifth dimensional) time 4 Symmetric symplectic integrators used 8 Known as leapfrog to its friends 8 Higher-order integration schemes go unstable for smaller integration step sizes 2/18/2021 XI Seminario Nazionale di Fisica Teorica 4

Sparse Linear Solvers 4 Non-local dynamical fermion effects require solution of large system of

Sparse Linear Solvers 4 Non-local dynamical fermion effects require solution of large system of linear equations for each time step 4 Krylov space methods used 8 Conjugate gradients, Bi. CGStab, … 4 Typical matrix size > 107 8324 lattice 83 colours 84 spinor components 2/18/2021 XI Seminario Nazionale di Fisica Teorica 5

Functional Integrals and QFT 4 The Expectation value of an operator is defined non-perturbatively

Functional Integrals and QFT 4 The Expectation value of an operator is defined non-perturbatively by the Functional Integral 4 Normalisation constant Z chosen such that 4 The action is S( ) 4 Defined in Euclidean space-time 4 Lattice regularisation 4 d is the appropriate functional measure 4 Continuum limit: lattice spacing a 0 4 Thermodynamic limit: physical volume V 2/18/2021 XI Seminario Nazionale di Fisica Teorica 6

Monte Carlo Integration 4 Monte Carlo integration is based on the identification of probabilities

Monte Carlo Integration 4 Monte Carlo integration is based on the identification of probabilities with measures 4 There are much better methods of carrying out low dimensional quadrature 4 All other methods become hopelessly expensive for large dimensions 4 In lattice QFT there is one integration per degree of freedom 8 We are approximating an infinite dimensional functional integral 2/18/2021 XI Seminario Nazionale di Fisica Teorica 7

Example 4 Let’s integrate sin x over the interval [0, 2π] 4 The integral

Example 4 Let’s integrate sin x over the interval [0, 2π] 4 The integral is 4 Its variance is 4 The error (or standard deviation) of our Monte Carlo estimate is thus σ = ± 1. 7725 (σ2 = V) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 8

Monte Carlo Method 4 Generate a sequence of random field configurations chosen from the

Monte Carlo Method 4 Generate a sequence of random field configurations chosen from the probability distribution 4 Measure the value of on each configuration and compute the average 2/18/2021 XI Seminario Nazionale di Fisica Teorica 9

Central Limit Theorem 4 Law of Large Numbers 4 Central Limit Theorem 4 The

Central Limit Theorem 4 Law of Large Numbers 4 Central Limit Theorem 4 The variance of the distribution of is 4 The Laplace–De. Moivre Central Limit theorem is an asymptotic expansion for the probability distribution of 4 Distribution of values for a single sample = ( ) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 10

Cumulant Expansion 4 Generating function for connected moments 4 The first few cumulants are

Cumulant Expansion 4 Generating function for connected moments 4 The first few cumulants are 4 Note that this is an asymptotic expansion 2/18/2021 XI Seminario Nazionale di Fisica Teorica 11

Distribution of the Average 4 Distribution of the average of N samples 4 Connected

Distribution of the Average 4 Distribution of the average of N samples 4 Connected generating function 2/18/2021 XI Seminario Nazionale di Fisica Teorica 12

Central Limit Theorem 4 Take inverse Fourier transform to obtain distribution 2/18/2021 XI Seminario

Central Limit Theorem 4 Take inverse Fourier transform to obtain distribution 2/18/2021 XI Seminario Nazionale di Fisica Teorica 13

Central Limit Theorem 2/18/2021 XI Seminario Nazionale di Fisica Teorica 14

Central Limit Theorem 2/18/2021 XI Seminario Nazionale di Fisica Teorica 14

Central Limit Theorem 4 Re-scale to show convergence to Gaussian distribution 4 where and

Central Limit Theorem 4 Re-scale to show convergence to Gaussian distribution 4 where and 2/18/2021 XI Seminario Nazionale di Fisica Teorica 15

Importance Sampling 8 Integral 8 Sample from distribution 8 Probability 8 Normalisation 8 Estimator

Importance Sampling 8 Integral 8 Sample from distribution 8 Probability 8 Normalisation 8 Estimator of integral 8 Estimator of variance 2/18/2021 XI Seminario Nazionale di Fisica Teorica 16

Optimal Importance Sampling 8 Minimise variance 8 Constraint N=1 8 Lagrange multiplier 8 Optimal

Optimal Importance Sampling 8 Minimise variance 8 Constraint N=1 8 Lagrange multiplier 8 Optimal measure 8 Optimal variance 2/18/2021 XI Seminario Nazionale di Fisica Teorica 17

Example 8 Optimal weight 8 Optimal variance 2/18/2021 XI Seminario Nazionale di Fisica Teorica

Example 8 Optimal weight 8 Optimal variance 2/18/2021 XI Seminario Nazionale di Fisica Teorica 18

Binning 1. 0 1 Construct cheap approximation to |sin(x)| 2 Calculate relative area within

Binning 1. 0 1 Construct cheap approximation to |sin(x)| 2 Calculate relative area within each rectangle 3 Choose a random number uniformly 4 Select corresponding rectangle 0. 9 0. 8 0. 7 0. 6 0. 5 0. 4 5 Choose another random number 0. 3 uniformly 6 Select corresponding x value within rectangle 7 Compute |sin(x)| 2/18/2021 0. 2 0. 1 0. 0 XI Seminario Nazionale di Fisica Teorica 19

Example 8 With 100 rectangles we have V = 16. 02328561 8 … but

Example 8 With 100 rectangles we have V = 16. 02328561 8 … but we can do better! 8 For which 8 With 100 rectangles we have V = 0. 011642808 2/18/2021 XI Seminario Nazionale di Fisica Teorica 20

Markov chains 8 State space 8(Ergodic) stochastic transitions P’: 8 Deterministic evolution of probability

Markov chains 8 State space 8(Ergodic) stochastic transitions P’: 8 Deterministic evolution of probability distribution P: Q Q 8 Distribution converges to unique fixed point 2/18/2021 XI Seminario Nazionale di Fisica Teorica 21

Convergence of Markov Chains 4 Define a metric on the space of (equivalence classes

Convergence of Markov Chains 4 Define a metric on the space of (equivalence classes of) probability distributions 4 Prove that with > 0, so the Markov process P is a contraction mapping 4 The sequence Q, P 2 Q, P 3 Q, … is Cauchy 4 The space of probability distributions is complete, so the sequence converges to a unique fixed point distribution 2/18/2021 XI Seminario Nazionale di Fisica Teorica 22

Convergence of Markov Chains 2/18/2021 XI Seminario Nazionale di Fisica Teorica 23

Convergence of Markov Chains 2/18/2021 XI Seminario Nazionale di Fisica Teorica 23

Convergence of Markov Chains 2/18/2021 XI Seminario Nazionale di Fisica Teorica 24

Convergence of Markov Chains 2/18/2021 XI Seminario Nazionale di Fisica Teorica 24

Use of Markov Chains 4 Use Markov chains to sample from Q 4 Suppose

Use of Markov Chains 4 Use Markov chains to sample from Q 4 Suppose we can construct an ergodic Markov process P which has distribution Q as its fixed point 4 Start with an arbitrary state (“field configuration”) 4 Iterate the Markov process until it has converged (“thermalized”) 4 Thereafter, successive configurations will be distributed according to Q 8 But in general they will be correlated 4 To construct P we only need relative probabilities of states 8 Don’t know the normalisation of Q 8 Cannot use Markov chains to compute integrals directly 8 We can compute ratios of integrals 2/18/2021 XI Seminario Nazionale di Fisica Teorica 25

Metropolis Algorithm 4 How do we construct a Markov process with a specified fixed

Metropolis Algorithm 4 How do we construct a Markov process with a specified fixed point? 4 Detailed balance 4 integrate w. r. t. y to obtain fixed point condition 4 sufficient but not necessary for fixed point 4 Metropolis algorithm 4 consider cases Q(x)>Q(y) and Q(x)<Q(y) separately to obtain detailed balance condition 4 sufficient but not necessary for detailed balance 4 other choices are possible, e. g. , 2/18/2021 XI Seminario Nazionale di Fisica Teorica 26

Composite Markov Steps 4 Composition of Markov steps 4 Let P 1 and P

Composite Markov Steps 4 Composition of Markov steps 4 Let P 1 and P 2 be two Markov steps which have the desired fixed point distribution. . . 4 … they need not be ergodic 4 Then the composition of the two steps P 1 P 2 will also have the desired fixed point. . . 4 … and it may be ergodic 4 This trivially generalises to any (fixed) number of steps 4 For the case where P is not ergodic but Pn is the terminology “weakly” and “strongly” ergodic are sometimes used 2/18/2021 XI Seminario Nazionale di Fisica Teorica 27

Site-by-Site Updates 4 This result justifies “sweeping” through a lattice performing single site updates

Site-by-Site Updates 4 This result justifies “sweeping” through a lattice performing single site updates 4 Each individual single site update has the desired fixed point because it satisfies detailed balance 4 The entire sweep therefore has the desired fixed point, and is ergodic. . . 4 … but the entire sweep does not satisfy detailed balance 4 Of course it would satisfy detailed balance if the sites were updated in a random order… 4 … but this is not necessary 4 … and it is undesirable because it puts too much randomness into the system 2/18/2021 XI Seminario Nazionale di Fisica Teorica 28

Exponential Autocorrelations 4 The unique fixed point of an ergodic Markov process corresponds to

Exponential Autocorrelations 4 The unique fixed point of an ergodic Markov process corresponds to the unique eigenvector with eigenvalue 1 4 All its other eigenvalues must lie within the unit circle 4 In particular, the largest subleading eigenvalue is |λmax|<1 4 This corresponds to the exponential autocorrelation time 2/18/2021 XI Seminario Nazionale di Fisica Teorica 29

Integrated Autocorrelations 8 Consider autocorrelation of some operator Ω 8 Without loss of generality

Integrated Autocorrelations 8 Consider autocorrelation of some operator Ω 8 Without loss of generality we may assume 4 Define the autocorrelation function 2/18/2021 XI Seminario Nazionale di Fisica Teorica 30

Integrated Autocorrelations 4 The autocorrelation function falls faster that the exponential autocorrelation 4 For

Integrated Autocorrelations 4 The autocorrelation function falls faster that the exponential autocorrelation 4 For a sufficiently large number of samples 4 Define integrated autocorrelation function 2/18/2021 XI Seminario Nazionale di Fisica Teorica 31

Coupling From The Past 4 Propp and Wilson (1996) 4 Use fixed set of

Coupling From The Past 4 Propp and Wilson (1996) 4 Use fixed set of random numbers 4 Flypaper principle: if states coalesce they stay together forever 8 Eventually, all states coalesce to some state with probability one 8 Any state from t = - will coalesce to 8 is a sample from the fixed point distribution 2/18/2021 XI Seminario Nazionale di Fisica Teorica 32

Continuum Limits 4 We are not interested in lattice QFTs per se, but in

Continuum Limits 4 We are not interested in lattice QFTs per se, but in their continuum limit as a 0 4 This corresponds to a continuous phase transition of the discrete lattice model 8 For the continuum theory to have a finite correlation length a (inverse mass gap) in physical units this correlation length must diverge in lattice units 4 We expect such systems to exhibit universal behaviour as they approach a continuous phase transition 8 The nature of the continuum theory is characterised by its symmetries 8 The details of the microscopic interactions are unimportant at macroscopic length scales 8 Universality is a consequence of the way that the theory scales as a 0 while a is kept fixed 2/18/2021 XI Seminario Nazionale di Fisica Teorica 33

Continuum Limits 4 The nature of the continuum field theory depends on the way

Continuum Limits 4 The nature of the continuum field theory depends on the way that physical quantities behave as the system approaches the continuum limit 4 The scaling of the parameters in the action required is described by the renormalisation group equation (RGE) 8 The RGE states the “reparameterisation invariance” of theory as the choice of scale at which we choose to fix the renormalisation conditions 8 As a 0 we expect the details of the regularisation scheme (cut off effects, lattice artefacts) to vanish, so the effect of changing a is just an RGE transformation 8 On the lattice the a renormalisation group transformation may be implemented as a “block spin” transformation of the fields 8 Strictly speaking, the renormalisation “group” is a semigroup on the lattice, as blockings are not invertible 2/18/2021 XI Seminario Nazionale di Fisica Teorica 34

Continuum Limits 4 The continuum limit of a lattice QFT corresponds to a fixed

Continuum Limits 4 The continuum limit of a lattice QFT corresponds to a fixed point of the renormalisation group 8 At such a fixed point the form of the action does not change under an RG transformation 4 The parameters in the action scale according to a set of critical exponents 4 All known four dimensional QFTs correspond to trivial (Gaussian) fixed points 8 For such a fixed point the UV nature of theory may by analysed using perturbation theory 8 Monomials in the action may be classified according to their power counting dimension 8 d < 4 relevant (superrenormalisable) 8 d = 4 marginal (renormalisable) 8 d > 4 irrelevant (nonrenormalisable) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 35

Continuum Limits 4 The behaviour of our Markov chains as the system approaches a

Continuum Limits 4 The behaviour of our Markov chains as the system approaches a continuous phase transition is described by its dynamical critical exponents 4 these describe how badly the system (model + algorithm) undergo critical slowing down 4 the dynamical critical exponent z tells us how the cost of generating an independent configuration grows as the correlation length of the system is taken to , 4 this is closely related (but not always identical) to the dynamical critical exponent for the exponential or integrated autocorrelations 2/18/2021 XI Seminario Nazionale di Fisica Teorica 36

Global Heatbath 4 The ideal generator selects field configurations randomly and independently from the

Global Heatbath 4 The ideal generator selects field configurations randomly and independently from the desired distribution 4 It is hard to construct such global heatbath generators for any but the simplest systems 4 They can be built by selecting sufficiently independent configurations from a Markov process… 4 … or better yet using CFTP which guarantees that the samples are truly uncorrelated 4 But this such generators are expensive! 2/18/2021 XI Seminario Nazionale di Fisica Teorica 37

Global Heatbath 4 For the purposes of Monte Carlo integration these is no need

Global Heatbath 4 For the purposes of Monte Carlo integration these is no need for the configurations to be completely uncorrelated 4 We just need to take the autocorrelations into account in our estimate of the variance of the resulting integral 4 Using all (or most) of the configurations generated by a Markov process is more cost-effective than just using independent ones 4 The optimal choice balances the cost of applying each Markov step with the cost of making measurements 2/18/2021 XI Seminario Nazionale di Fisica Teorica 38

Local Heatbath 4 For systems with a local bosonic action we can build a

Local Heatbath 4 For systems with a local bosonic action we can build a Markov process with the fixed point distribution exp(-S) out of steps which update a single site with this fixed point 4 If this update generates a new site variable value which is completely independent of its old value then it is called a local heatbath 4 For free field theory we just need to generate a Gaussian-distributed random variable 2/18/2021 XI Seminario Nazionale di Fisica Teorica 39

Gaussian Generators 4 If are uniformly distributed random numbers then is approximately Gaussian by

Gaussian Generators 4 If are uniformly distributed random numbers then is approximately Gaussian by the Central Limit theorem 4 This is neither cheap nor accurate 2/18/2021 XI Seminario Nazionale di Fisica Teorica 40

Gaussian Generators 4 If x is uniformly distributed and f is a monotonically increasing

Gaussian Generators 4 If x is uniformly distributed and f is a monotonically increasing function then f(x) is distributed as 4 Choosing we obtain 4 Therefore generate two uniform random numbers x 1 and x 2, set , then are two independent Gaussian distributed random numbers 2/18/2021 XI Seminario Nazionale di Fisica Teorica 41

Gaussian Generators 4 Even better methods exist 4 The Rectangle-Wedge-Tail (RWT) method 4 The

Gaussian Generators 4 Even better methods exist 4 The Rectangle-Wedge-Tail (RWT) method 4 The Ziggurat method 4 These do not require special function evaluations 4 They can be more interesting to implement for parallel computers 2/18/2021 XI Seminario Nazionale di Fisica Teorica 42

Local Heatbath 4 For pure gauge theories the field variables live on the links

Local Heatbath 4 For pure gauge theories the field variables live on the links of the lattice and take their values in a representation of the gauge group 4 For SU(2) Creutz gave an exact local heatbath algorithm 8 It requires a rejection test: this is different from the Metropolis accept/reject step in that one must continue generating candidate group elements until one is accepted 4 For SU(3) the “quasi-heatbath” method of Cabibbo and Marinari is widely used 8 Update a sequence of SU(2) subgroups 8 This is not quite an SU(3) heatbath method… 8 … but sweeping through the lattice updating SU(2) subgroups is also a valid algorithm, as long as the entire sweep is ergodic 8 For a higher acceptance rate there is an alternative SU(2) subgroup heatbath algorithm 2/18/2021 XI Seminario Nazionale di Fisica Teorica 43

Hybrid Monte Carlo 4 In order to carry out Monte Carlo computations including the

Hybrid Monte Carlo 4 In order to carry out Monte Carlo computations including the effects of dynamical fermions we would like to find an algorithm which 4 Updates the fields globally 8 Because single link updates are not cheap if the action is not local 4 Takes large steps through configuration space 8 Because small-step methods carry out a random walk which leads to critical slowing down with a dynamical critical exponent z=2 4 Does not introduce any systematic errors 2/18/2021 XI Seminario Nazionale di Fisica Teorica 44

Hybrid Monte Carlo 4 A useful class of algorithms with these properties is the

Hybrid Monte Carlo 4 A useful class of algorithms with these properties is the (Generalised) Hybrid Monte Carlo (HMC) method 4 Introduce a “fictitious momentum” p corresponding to each dynamical degree of freedom q 4 Find a Markov chain with fixed point exp[-H(q, p)] where H is the “fictitious Hamiltonian” ½ p 2 + S(q) 8 The action S of the underlying QFT plays the rôle of the potential in the “fictitious” classical mechanical system 8 This gives the evolution of the system in a fifth dimension, “fictitious” or computer time 4 This generates the desired distribution exp[-S(q)] if we ignore the momenta q (i. e. , the marginal distribution) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 45

Hybrid Monte Carlo 4 The GHMC Markov chain alternates two Markov steps 4 Molecular

Hybrid Monte Carlo 4 The GHMC Markov chain alternates two Markov steps 4 Molecular Dynamics Monte Carlo (MDMC) 4 Partial Momentum Refreshment 4 Both have the desired fixed point 4 Together they are ergodic 2/18/2021 XI Seminario Nazionale di Fisica Teorica 46

Molecular Dynamics 4 If we could integrate Hamilton’s equations exactly we could follow a

Molecular Dynamics 4 If we could integrate Hamilton’s equations exactly we could follow a trajectory of constant fictitious energy 4 This corresponds to a set of equiprobable fictitious phase space configurations 4 Liouville’s theorem tells us that this also preserves the functional integral measure dp dq as required 4 Any approximate integration scheme which is reversible and area preserving may be used to suggest configurations to a Metropolis accept/reject test 4 With acceptance probability min[1, exp(- H)] 2/18/2021 XI Seminario Nazionale di Fisica Teorica 47

MDMC 4 We build the MDMC step out of three parts 4 Molecular Dynamics

MDMC 4 We build the MDMC step out of three parts 4 Molecular Dynamics (MD), an approximate integrator which is exactly 8 Area preserving 8 Reversible 4 A momentum flip 4 A Metropolis accept/reject step 4 The composition of these gives 4 With y being uniformly distributed in [0, 1] 2/18/2021 XI Seminario Nazionale di Fisica Teorica 48

Momentum Refreshment 4 This mixes the Gaussian distributed momenta p with Gaussian noise 4

Momentum Refreshment 4 This mixes the Gaussian distributed momenta p with Gaussian noise 4 The Gaussian distribution of p is invariant under F 4 the extra momentum flip F ensures that for small the momenta are reversed after a rejection rather than after an acceptance 4 for = /2 all momentum flips are irrelevant 2/18/2021 XI Seminario Nazionale di Fisica Teorica 49

Special Cases of GHMC 4 The Hybrid Monte Carlo (HMC) algorithm is the special

Special Cases of GHMC 4 The Hybrid Monte Carlo (HMC) algorithm is the special case where = /2 4 =0 corresponds to an exact version of the Molecular Dynamics (MD) or Microcanonical algorithm (which is in general non-ergodic) 4 The L 2 MC algorithm of Horowitz corresponds to choosing arbitrary but MDMC trajectories of a single leapfrog integration step ( = ). This method is also called Kramers algorithm. 4 The Langevin Monte Carlo algorithm corresponds to choosing = /2 and MDMC trajectories of a single leapfrog integration step ( = ). 2/18/2021 XI Seminario Nazionale di Fisica Teorica 50

Further Special Cases 4 The Hybrid and Langevin algorithms are approximations where the Metropolis

Further Special Cases 4 The Hybrid and Langevin algorithms are approximations where the Metropolis step is omitted 4 The Local Hybrid Monte Carlo (LHMC) or Overrelaxation algorithm corresponds to updating a subset of the degrees of freedom (typically those living on one site or link) at a time 2/18/2021 XI Seminario Nazionale di Fisica Teorica 51

Langevin algorithm 4 Consider the Hybrid Monte Carlo algorithm when take only one leapfrog

Langevin algorithm 4 Consider the Hybrid Monte Carlo algorithm when take only one leapfrog step. 4 Combining the leapfrog equations of motion we obtain 4 Ignore the Monte Carlo step 4 Recall that for the fictitious momenta are just Gaussian distributed random noise 4 Let , then which is the usual form of the Langevin equation 2/18/2021 XI Seminario Nazionale di Fisica Teorica 52

Inexact algorithms 4 What happens if we omit the Metropolis test? 4 We still

Inexact algorithms 4 What happens if we omit the Metropolis test? 4 We still have an ergodic algorithm, so there is some unique fixed point distribution which will be generated by the algorithm 4 The condition for this to be a fixed point is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 53

Langevin Algorithm 4 For the Langevin algorithm (which corresponds to = , a single

Langevin Algorithm 4 For the Langevin algorithm (which corresponds to = , a single leapfrog step) we may expand in powers of and find a solution for 4 The equation determining the leading term in this expansion is the Fokker-Planck equation; the general equations are sometimes known as the Kramers-Moyal equations 2/18/2021 XI Seminario Nazionale di Fisica Teorica 54

Inexact algorithms 4 In the general case we change variables to 4 Whence we

Inexact algorithms 4 In the general case we change variables to 4 Whence we obtain 2/18/2021 XI Seminario Nazionale di Fisica Teorica 55

Inexact algorithms 4 Since H is extensive so is H, and thus so is

Inexact algorithms 4 Since H is extensive so is H, and thus so is the connected generating function (cumulants) 4 We can thus show order by order in that S is extensive too 2/18/2021 XI Seminario Nazionale di Fisica Teorica 56

Langevin Algorithm 4 For the Langevin algorithm we have and , so we immediately

Langevin Algorithm 4 For the Langevin algorithm we have and , so we immediately find that 4 If S is local then so are all the Sn 4 We only have an asymptotic expansion for S, and in general the subleading terms are neither local nor extensive 4 Therefore, taking the continuum limit a 0 limit for fixed will probably not give exact results for renormalised quantities 2/18/2021 XI Seminario Nazionale di Fisica Teorica 57

Hybrid Algorithm 4 For the Hybrid algorithm and so again we find 4 We

Hybrid Algorithm 4 For the Hybrid algorithm and so again we find 4 We have made use of the fact that H’ is conserved and differs from H by O(δτ2) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 58

Inexact algorithms 4 What effect do such systematic errors in the distribution have on

Inexact algorithms 4 What effect do such systematic errors in the distribution have on measurements? 4 Large errors will occur where observables are discontinuous functions of the parameters in the action, e. g. , near phase transitions 4 If we want to improve the action so as to reduce lattice discretisation effects and get a better approximation to the underlying continuum theory, then we have to ensure that for the appropriate n 4 Step size errors need not be irrelevant 2/18/2021 XI Seminario Nazionale di Fisica Teorica 59

Local HMC 4 Consider the Gaussian model defined by the free field action 4

Local HMC 4 Consider the Gaussian model defined by the free field action 4 Introduce the Hamiltonian on “fictitious” phase space. 8 The corresponding equations of motion for the single site x where and 4 The solution in terms of the Gaussian distributed random initial momentum and the initial field value is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 60

Local HMC 4 Identify and to get the usual Adler overrelaxation update 4 For

Local HMC 4 Identify and to get the usual Adler overrelaxation update 4 For gauge theories various overrelaxation methods have been suggested 4 Hybrid Overrelaxation: this alternates a heatbath step with many overrelaxation steps with =2 4 LHMC: this uses an analytic solution for the equations of motion for the update of a single U(1) or SO(2) subgroup at a time. In this case the equations of motion may be solved in terms of elliptic functions 2/18/2021 XI Seminario Nazionale di Fisica Teorica 61

Classical Mechanics on Group Manifolds 4 Gauge fields take their values in some Lie

Classical Mechanics on Group Manifolds 4 Gauge fields take their values in some Lie group, so we need to define classical mechanics on a group manifold which preserves the group-invariant Haar measure 4 A Lie group G is a smooth manifold on which there is a natural mapping L: G G G defined by the group action 4 This induces a map called the pull-back of L on the cotangent bundle defined by 8 F is the space of 0 forms, which are smooth mappings from G to the real numbers 2/18/2021 XI Seminario Nazionale di Fisica Teorica 62

Classical Mechanics on Group Manifolds 4 A form is left invariant if 4 The

Classical Mechanics on Group Manifolds 4 A form is left invariant if 4 The tangent space to a Lie group at the origin is called the Lie algebra, and we may choose a set of basis vectors which satisfy the commutation relations where are the structure constants of the algebra 4 We may define a set of left invariant vector fields on TG by 4 The corresponding left invariant dual forms satisfy the Maurer-Cartan equations 4 We may therefore define a closed symplectic 2 form which globally defines an invariant Poisson bracket by 2/18/2021 XI Seminario Nazionale di Fisica Teorica 63

Classical Mechanics on Group Manifolds 4 We may now follow the usual procedure to

Classical Mechanics on Group Manifolds 4 We may now follow the usual procedure to find the equations of motion 4 Introduce a Hamiltonian function (0 form) H on the cotangent bundle (phase space) over the group manifold 4 Define a vector field h such that 4 The classical trajectories are then the integral curves of h: 4 In the natural basis we have 2/18/2021 XI Seminario Nazionale di Fisica Teorica 64

Classical Mechanics on Group Manifolds 4 Equating coefficients of the components of y we

Classical Mechanics on Group Manifolds 4 Equating coefficients of the components of y we find 4 The equations of motion in the local coordinate basis are therefore 4 Which for a Hamiltonian of the form reduce to 2/18/2021 XI Seminario Nazionale di Fisica Teorica 65

Classical Mechanics on Group Manifolds 4 In terms of constrained variables 4 The representation

Classical Mechanics on Group Manifolds 4 In terms of constrained variables 4 The representation of the generators is 4 From which it follows that 4 And for the Hamiltonian leads to the equations of motion 4 For the case G = SU(n) the operator T is the projection onto traceless antihermitian matrices 2/18/2021 XI Seminario Nazionale di Fisica Teorica 66

Classical Mechanics on Group Manifolds 4 Discrete equations of motion 4 We can now

Classical Mechanics on Group Manifolds 4 Discrete equations of motion 4 We can now easily construct a discrete PQP symmetric integrator from these equations 4 The exponential map from the Lie algebra to the Lie group may be evaluated exactly using the Cayley-Hamilton theorem 8 All functions of an n n matrix M may be written as a polynomial of degree n - 1 in M 8 The coefficients of this polynomial can be expressed in terms of the invariants (traces of powers) of M 2/18/2021 XI Seminario Nazionale di Fisica Teorica 67

Symplectic Integrators 4 Reversible and area-preserving integrator for Hamiltonian H(q, p) = T(p) +

Symplectic Integrators 4 Reversible and area-preserving integrator for Hamiltonian H(q, p) = T(p) + S(q) = ½p 2 + S(q) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 68

Symplectic Integrators 4 The operators and are easily exponentiated (Taylor’s theorem) 4 The symmetric

Symplectic Integrators 4 The operators and are easily exponentiated (Taylor’s theorem) 4 The symmetric symplectic QPQ integrator applies these steps iteratively 2/18/2021 XI Seminario Nazionale di Fisica Teorica 69

BCH Formula 4 If A and B belong to any (non-commutative) algebra then ,

BCH Formula 4 If A and B belong to any (non-commutative) algebra then , where is constructed from commutators of A and B 4 It is in the Free Lie Algebra generated by {A, B} 4 More precisely, where c 1=A+B and 4 The Bm are Bernouilli numbers 2/18/2021 XI Seminario Nazionale di Fisica Teorica 70

BCH Formula 4 Explicitly, the first few terms are 2/18/2021 XI Seminario Nazionale di

BCH Formula 4 Explicitly, the first few terms are 2/18/2021 XI Seminario Nazionale di Fisica Teorica 71

Symmetric Symplectic Integrator 4 In order to construct reversible integrators we use symmetric symplectic

Symmetric Symplectic Integrator 4 In order to construct reversible integrators we use symmetric symplectic integrators 4 The following identity follows directly from the BCH formula 2/18/2021 XI Seminario Nazionale di Fisica Teorica 72

Symmetric Symplectic Integrator 4 The BCH formula tells us that the QPQ integrator has

Symmetric Symplectic Integrator 4 The BCH formula tells us that the QPQ integrator has evolution 2/18/2021 XI Seminario Nazionale di Fisica Teorica 73

QPQ Integrator 4 The QPQ integrator therefore exactly conserves the Hamiltonian , where 4

QPQ Integrator 4 The QPQ integrator therefore exactly conserves the Hamiltonian , where 4 So 4 Note that cannot be written as the sum of a p -dependent kinetic term and a q-dependent potential term 2/18/2021 XI Seminario Nazionale di Fisica Teorica 74

QPQ Integrator 4 Observe that for any trajectory length, even 2/18/2021 XI Seminario Nazionale

QPQ Integrator 4 Observe that for any trajectory length, even 2/18/2021 XI Seminario Nazionale di Fisica Teorica 75

Campostrini Wiggles 4 From the form of the evolution operator Campostrini noted that we

Campostrini Wiggles 4 From the form of the evolution operator Campostrini noted that we can easily write a higher-order integrator 4 The leading error vanishes if we choose 4 The total step size is unchanged if 4 This trick may be applied to recursively to obtain arbitrarily high-order symplectic integrators 2/18/2021 XI Seminario Nazionale di Fisica Teorica 76

Instabilities 4 Consider the simplest leapfrog scheme for a single simple harmonic oscillator with

Instabilities 4 Consider the simplest leapfrog scheme for a single simple harmonic oscillator with frequency 4 For a single step we have 4 The eigenvalues of this matrix are 4 For > 2 the integrator becomes unstable 2/18/2021 XI Seminario Nazionale di Fisica Teorica 77

Instabilities 4 The orbits change from ellipses to hyperbolae 4 The energy diverges exponentially

Instabilities 4 The orbits change from ellipses to hyperbolae 4 The energy diverges exponentially in instead of oscillating 4 For bosonic systems H» 1 for such a large integration step size 4 For light dynamical fermions there seem to be a few “modes” which have a large force due to the small eigenvalues of the Dirac operator, and this force plays the rôle of the frequency 4 For these systems is limited by stability rather than by the Metropolis acceptance rate 2/18/2021 XI Seminario Nazionale di Fisica Teorica 78

Instabilities 4 Some full QCD measurements on large lattices with quite light quarks 4

Instabilities 4 Some full QCD measurements on large lattices with quite light quarks 4 large CG residual: always unstable 4 small CG residual: unstable for > 0. 0075 4 intermediate CG residual: unstable for > 0. 0075 2/18/2021 XI Seminario Nazionale di Fisica Teorica 79

Dynamical Fermions 4 Fermion fields are Grassmann valued 4 Required to satisfy the spin-statistics

Dynamical Fermions 4 Fermion fields are Grassmann valued 4 Required to satisfy the spin-statistics theorem 4 Even “classical” Fermion fields obey anticommutation relations 4 Grassmann algebras behave like negative dimensional manifolds 2/18/2021 XI Seminario Nazionale di Fisica Teorica 80

Grassmann Algebras 8 Linear space spanned by generators { 1, 2, 3, …} with

Grassmann Algebras 8 Linear space spanned by generators { 1, 2, 3, …} with coefficients a, b, … in some field (usually the real or complex numbers) 4 Algebra structure defined by nilpotency condition for elements of the linear space 2 = 0 4 There are many elements of the algebra which are not in the linear space (e. g. , 1 2) 4 Nilpotency implies anticommutativity + = 0 80 = 2 = ( + )2 = 2 + + + 2 = + = 0 4 Anticommutativity implies nilpotency 2 ² = 0 (unless the coefficient field has characteristic 2, i. e. , 2 = 0) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 81

Grassmann Algebras 4 Grassmann algebras have a natural grading corresponding to the number of

Grassmann Algebras 4 Grassmann algebras have a natural grading corresponding to the number of generators in a given product 4 deg(1) = 0, deg( i) = 1, deg( i j) = 2, . . . 4 All elements of the algebra can be decomposed into a sum of terms of definite grading 4 The parity transform of is 4 A natural antiderivation is defined on a Grassmann algebra 4 Linearity: d(a + b ) = a d + b d 4 Anti-Leibniz rule: d( ) = (d ) + P( )(d ) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 82

Grassmann Integration 4 Definite integration on a Grassmann algebra is defined to be the

Grassmann Integration 4 Definite integration on a Grassmann algebra is defined to be the same as derivation 4 Hence change of variables leads to the inverse Jacobian 4 Gaussians over Grassmann manifolds 4 Like all functions, Gaussians are polynomials over Grassmann algebras 4 There is no reason for this function to be positive even for real coefficients 2/18/2021 XI Seminario Nazionale di Fisica Teorica 83

Grassmann Integration 4 Gaussian integrals over Grassmann manifolds 4 Where Pf(a) is the Pfaffian,

Grassmann Integration 4 Gaussian integrals over Grassmann manifolds 4 Where Pf(a) is the Pfaffian, Pf(a)² = det(a) 4 If we separate the Grassmann variables into two “conjugate” sets we find the more familiar result 4 Despite the notation, this is a purely algebraic identity 4 It does not require the matrix a > 0, unlike its bosonic analogue 2/18/2021 XI Seminario Nazionale di Fisica Teorica 84

Fermion Determinant 4 Direct simulation of Grassmann fields is not feasible 4 The problem

Fermion Determinant 4 Direct simulation of Grassmann fields is not feasible 4 The problem is not that of manipulating anticommuting values in a computer 4 It is that is not positive, and thus we get poor importance sampling 4 We integrate out the fermion fields to obtain the fermion determinant 4 and always occur quadratically 4 The overall sign of the exponent is unimportant 2/18/2021 XI Seminario Nazionale di Fisica Teorica 85

Fermionic Observables 4 Any operator can be expressed solely in terms of the bosonic

Fermionic Observables 4 Any operator can be expressed solely in terms of the bosonic fields 4 E. g. , the fermion propagator is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 86

Dynamical Fermions 4 Including the determinant as part of the observable to be measured

Dynamical Fermions 4 Including the determinant as part of the observable to be measured is not feasible 4 The determinant is extensive in the lattice volume, thus we get poor importance sampling 2/18/2021 XI Seminario Nazionale di Fisica Teorica 87

Pseudofermions 4 Represent the fermion determinant as a bosonic Gaussian integral with a non-local

Pseudofermions 4 Represent the fermion determinant as a bosonic Gaussian integral with a non-local kernel 4 The fermion kernel must be positive definite (all its eigenvalues must have positive real parts) otherwise the bosonic integral will not converge 4 The new bosonic fields are called “pseudofermions” 2/18/2021 XI Seminario Nazionale di Fisica Teorica 88

Pseudofermions 4 It is usually convenient to introduce two flavours of fermion and to

Pseudofermions 4 It is usually convenient to introduce two flavours of fermion and to write 4 This not only guarantees positivity, but also allows us to generate the pseudofermions from a global heatbath by applying to a random Gaussian distributed field 2/18/2021 XI Seminario Nazionale di Fisica Teorica 89

Equations of Motion 4 The equations for motion for the boson (gauge) fields are

Equations of Motion 4 The equations for motion for the boson (gauge) fields are and 4 The evaluation of the pseudofermion action and the corresponding force then requires us to find the solution of a (large) set of linear equations 2/18/2021 XI Seminario Nazionale di Fisica Teorica 90

Linear Solver Accuracy 4 It is not necessary to carry out the inversions required

Linear Solver Accuracy 4 It is not necessary to carry out the inversions required for the equations of motion exactly 4 There is a trade-off between the cost of computing the force and the acceptance rate of the Metropolis MDMC step 4 The inversions required to compute the pseudofermion action for the accept/reject step does need to be computed exactly 4 We usually take “exactly” to be synonymous with “to machine precision” 2/18/2021 XI Seminario Nazionale di Fisica Teorica 91

Krylov Spaces 4 One of the main reasons why dynamical fermion lattice calculations are

Krylov Spaces 4 One of the main reasons why dynamical fermion lattice calculations are feasible is the existence of very effective numerical methods for solving large sparse systems of linear equations 4 Family of iterative methods based on Krylov spaces 4 Conjugate Gradients (CG, CGNE) 4 Bi. Conjugate Gradients (Bi. CG, Bi. CGstab, Bi. CG 5) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 92

Krylov Spaces 4 These are often introduced as exact methods 4 They require O(V)

Krylov Spaces 4 These are often introduced as exact methods 4 They require O(V) iterations to find the solution 4 They do not give the exact answer in practice because of rounding errors 4 They are more naturally thought of as methods for solving systems of linear equations in an (almost) -dimensional linear space 4 This is what we are approximating on the lattice anyway 2/18/2021 XI Seminario Nazionale di Fisica Teorica 93

Bi. CG on a Banach Space 4 We want to solve the equation Ax

Bi. CG on a Banach Space 4 We want to solve the equation Ax = b on a Banach space 4 This is a normed linear space 4 The norm endows the space with a topology 4 The linear space operations are continuous in this topology 4 We solve the system in the Krylov subspace 2/18/2021 XI Seminario Nazionale di Fisica Teorica 94

Bi. CG on a Banach Space 4 There is no concept of “orthogonality” in

Bi. CG on a Banach Space 4 There is no concept of “orthogonality” in a Banach space, so we also need to introduce a dual Krylov space of linear functionals on the Banach space 4 The vector is arbitrary 4 The adjoint of a linear operator is defined by 2/18/2021 XI Seminario Nazionale di Fisica Teorica 95

Bi. CG on a Banach Space 4 We construct bi-orthonormal bases Galerkin condition (projectors)

Bi. CG on a Banach Space 4 We construct bi-orthonormal bases Galerkin condition (projectors) Bi-orthogonality and normalisation Short (3 term) recurrence Lanczos tridiagonal form Bi. CGstab: minimise norm ||rn|| 2/18/2021 XI Seminario Nazionale di Fisica Teorica 96

Tridiagonal Systems 4 The problem is now reduced to solving a fairly small tridiagonal

Tridiagonal Systems 4 The problem is now reduced to solving a fairly small tridiagonal system 4 Hessenberg and Symmetric → Tridiagonal = 2/18/2021 = XI Seminario Nazionale di Fisica Teorica 97

LU Decomposition 4 Reduce matrix to triangular form 4 Gaussian elimination (LU decomposition) 1

LU Decomposition 4 Reduce matrix to triangular form 4 Gaussian elimination (LU decomposition) 1 1 1 1 1 = 1 -1 L 1 1 2/18/2021 U T XI Seminario Nazionale di Fisica Teorica 98

QR Decomposition 4 Reduce matrix to triangular form 4 Givens rotations (QR factorisation) 1

QR Decomposition 4 Reduce matrix to triangular form 4 Givens rotations (QR factorisation) 1 1 1 1 1 = 1 -1 Q 1 1 1 2/18/2021 R T XI Seminario Nazionale di Fisica Teorica 99

Krylov Solvers 4 Convergence is measured by the residual rn=||b – Axn|| 4 Does

Krylov Solvers 4 Convergence is measured by the residual rn=||b – Axn|| 4 Does not decrease monotonically for Bi. CG 4 Better for Bi. CGstab 4 Bad breakdown for unlucky choice of starting form 4 LU might fail if zero pivot occurs 4 QR is more stable 4 In a Hilbert space there is an inner product 4 Which is relevant if A is symmetric (Hermitian) 4 In this case we get the CG algorithm 4 No bad breakdown (solution is in Krylov space) 4 A>0 required only to avoid zero pivot for LU 2/18/2021 XI Seminario Nazionale di Fisica Teorica 100

Random Number Generators 4 Pseudorandom number generators 4 Random numbers used infrequently in Monte

Random Number Generators 4 Pseudorandom number generators 4 Random numbers used infrequently in Monte Carlo for QFT 8 Compared to spin models 4 Linear congruential generator 8 For suitable choice of a, b, and m (see, e. g. , Donald Knuth, Art of Computer Programming) 8 Usually chose b = 0 and m = power of 2 8 Seem to be good enough in practice 8 Problems for spin models if m is too small a multiple of V 2/18/2021 XI Seminario Nazionale di Fisica Teorica 101

Parallel Random Numbers 2/18/2021 x 2 x 3 x 4 x. V+1 x. V+2

Parallel Random Numbers 2/18/2021 x 2 x 3 x 4 x. V+1 x. V+2 x. V+3 x. V+4 x 2 V+1 x 2 V+2 x 2 V+3 x 2 V+4 x 3 V XI Seminario Nazionale di Fisica Teorica 102

Acceptance Rates 4 We can compute the average Metropolis acceptance rate 4 Area preservation

Acceptance Rates 4 We can compute the average Metropolis acceptance rate 4 Area preservation implies that 4 The probability distribution of H has an asymptotic expansion as the lattice volume V 4 The average Metropolis acceptance rate is thus 2/18/2021 XI Seminario Nazionale di Fisica Teorica 103

Acceptance Rates 4 The acceptance rate is a function of the variable x =

Acceptance Rates 4 The acceptance rate is a function of the variable x = Vδτ4 n+4 for the nth order Campostrini “wiggle” used to generate trajectories with = 1 4 For single-step trajectories = the acceptance rate is a function of x = Vδτ4 n+6 2/18/2021 XI Seminario Nazionale di Fisica Teorica 104

Cost and Dynamical Critical Exponents 4 To a good approximation, the cost C of

Cost and Dynamical Critical Exponents 4 To a good approximation, the cost C of a Molecular Dynamics computation is proportional to the total time for which we have to integrate Hamilton’s equations. 4 The cost per independent configuration is then 4 Note that the cost depends on the particular operator under consideration 4 The optimal trajectory lengths obtained by minimising the cost as a function of the parameters , , and of the algorithm 2/18/2021 XI Seminario Nazionale di Fisica Teorica 105

Cost and Dynamical Critical Exponents 4 While the cost depends upon the details of

Cost and Dynamical Critical Exponents 4 While the cost depends upon the details of the implementation of the algorithm, the way that it scales with the correlation length of the system is an intrinsic property of the algorithm; where z is the dynamical critical exponent. 4 For local algorithms the cost is independent of the trajectory length, , and thus minimising the cost is equivalent to minimising 4 Free field theory analysis is useful for understanding and optimising algorithms, especially if our results do not depend on the details of the spectrum 2/18/2021 XI Seminario Nazionale di Fisica Teorica 106

Cost and Dynamical Critical Exponents 4 HMC 4 For free field theory we can

Cost and Dynamical Critical Exponents 4 HMC 4 For free field theory we can show that choosing and = /2 gives z=1 4 This is to be compared with z=2 for constant 4 The optimum cost for HMC is thus 4 For nth order Campostrini integration (if it were stable) the cost is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 107

Cost and Dynamical Critical Exponents 4 LMC 4 For free field theory we can

Cost and Dynamical Critical Exponents 4 LMC 4 For free field theory we can show that choosing = and = /2 gives z=2 4 The optimum cost for LMC is thus 4 For nth order Campostrini integration (if it were stable) the cost is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 108

Cost and Dynamical Critical Exponents 4 L 2 MC (Kramers) 4 In the approximation

Cost and Dynamical Critical Exponents 4 L 2 MC (Kramers) 4 In the approximation of unit acceptance rate we find that setting = and suitably tuning we can arrange to get z = 1 4 However, if then the system will carry out a random walk backwards and forwards along a trajectory because the momentum, and thus the direction of travel, must be reversed upon a Metropolis rejection. 2/18/2021 XI Seminario Nazionale di Fisica Teorica 109

Cost and Dynamical Critical Exponents 4 A simple-minded analysis is that the average time

Cost and Dynamical Critical Exponents 4 A simple-minded analysis is that the average time between rejections must be O( ) to achieve z = 1 4 This time is approximately 4 For small we have , hence we are required to scale so as to keep fixed 4 This leads to a cost for L 2 MC of 4 Or using nth order Campostrini integration 4 A more careful free field theory analysis leads to the same conclusions 2/18/2021 XI Seminario Nazionale di Fisica Teorica 110

Cost and Dynamical Critical Exponents 4 Optimal parameters for GHMC 4 (Almost) analytic calculation

Cost and Dynamical Critical Exponents 4 Optimal parameters for GHMC 4 (Almost) analytic calculation in free field theory 4 Minimum cost for GHMC appears for acceptance probability in the range 40% -90% 4 Very similar to HMC 4 Minimum cost for L 2 MC (Kramers) occurs for acceptance rate very close to 1 4 And this cost is much larger 2/18/2021 XI Seminario Nazionale di Fisica Teorica 111

Reversibility 4 Are HMC trajectories reversible and area preserving in practice? 4 The only

Reversibility 4 Are HMC trajectories reversible and area preserving in practice? 4 The only fundamental source of irreversibility is the rounding error caused by using finite precision floating point arithmetic 8 For fermionic systems we can also introduce irreversibility by choosing the starting vector for the iterative linear equation solver time-asymmetrically 8 We might do this if we want to use (some extrapolation of) the previous solution as the starting vector 8 Floating point arithmetic is not associative 8 It is more natural to store compact variables as scaled integers (fixed point) 8 Saves memory 8 Does not solve the precision problem 2/18/2021 XI Seminario Nazionale di Fisica Teorica 112

Reversibility 4 Data for SU(3) gauge theory and QCD with heavy quarks show that

Reversibility 4 Data for SU(3) gauge theory and QCD with heavy quarks show that rounding errors are amplified exponentially 4 The underlying continuous time equations of motion are chaotic 4 Liapunov exponents characterise the divergence of nearby trajectories 4 The instability in the integrator occurs when H » 1 8 Zero acceptance rate anyhow 2/18/2021 XI Seminario Nazionale di Fisica Teorica 113

Reversibility 4 In QCD the Liapunov exponents appear to scale with as the system

Reversibility 4 In QCD the Liapunov exponents appear to scale with as the system approaches the continuum limit 4 = constant 4 This can be interpreted as saying that the Liapunov exponent characterises the chaotic nature of the continuum classical equations of motion, and is not a lattice artefact 4 Therefore we should not have to worry about reversibility breaking down as we approach the continuum limit 4 Caveat: data is only for small lattices, and is not conclusive 2/18/2021 XI Seminario Nazionale di Fisica Teorica 114

Reversibility 4 Data for QCD with lighter dynamical quarks 4 Instability occurs close to

Reversibility 4 Data for QCD with lighter dynamical quarks 4 Instability occurs close to region in where acceptance rate is near one 8 May be explained as a few “modes” becoming unstable because of large fermionic force 4 Integrator goes unstable if too poor an approximation to the fermionic force is used 2/18/2021 XI Seminario Nazionale di Fisica Teorica 115

Chebyshev Approximation 4 What is the best polynomial approximation p(x) to a continuous function

Chebyshev Approximation 4 What is the best polynomial approximation p(x) to a continuous function f(x) for x in [0, 1] ? 4 Weierstrass’ theorem: any continuous function can be arbitrarily well approximated by a polynomial 8 Bernstein polynomials: 4 Minimise the appropriate norm where n 1 4 Chebyshev’s theorem 8 The error |p(x) - f(x)| reaches its maximum at exactly d+2 points on the unit interval 8 There is always a unique polynomial of any degree d which minimises 2/18/2021 XI Seminario Nazionale di Fisica Teorica 116

Chebyshev Approximation 4 Necessity: 4 Suppose p-f has less than d+2 extrema of equal

Chebyshev Approximation 4 Necessity: 4 Suppose p-f has less than d+2 extrema of equal magnitude 4 Then at most d+1 maxima exceed some magnitude 4 This defines a “gap” 4 We can construct a polynomial q of degree d which has the opposite sign to p-f at each of these maxima (Lagrange interpolation) 4 And whose magnitude is smaller than the “gap” 4 The polynomial p+q is then a better approximation than p to f 2/18/2021 XI Seminario Nazionale di Fisica Teorica 117

Chebyshev Approximation 4 Sufficiency: 4 Suppose there is a polynomial 4 Then at each

Chebyshev Approximation 4 Sufficiency: 4 Suppose there is a polynomial 4 Then at each of the d+2 extrema of 4 Therefore p’ - p must have d+1 zeros on the unit interval 4 Thus p’ – p = 0 as it is a polynomial of degree d 2/18/2021 XI Seminario Nazionale di Fisica Teorica 118

Chebyshev Approximation 4 Convergence is often exponential in d 4 The best approximation of

Chebyshev Approximation 4 Convergence is often exponential in d 4 The best approximation of degree d-1 over [1, 1] to xd is 8 Where the Chebyshev polynomials are 8 The notation comes from a different transliteration of Chebyshev! 4 And the error is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 119

Chebyshev Approximation 4 Chebyshev’s theorem is easily extended to rational approximations 4 Rational functions

Chebyshev Approximation 4 Chebyshev’s theorem is easily extended to rational approximations 4 Rational functions with equal degree numerator and denominator are usually best 4 Convergence is still often exponential 4 And rational functions usually give much better approximations 2/18/2021 XI Seminario Nazionale di Fisica Teorica 120

Chebyshev Approximation 4 A realistic example of a rational approximation is 8 This is

Chebyshev Approximation 4 A realistic example of a rational approximation is 8 This is accurate to within almost 0. 1% over the range [0. 003, 1] 4 Using a partial fraction expansion of such rational functions allows us to use a multiple mass linear equation solver, thus reducing the cost significantly. 4 The partial fraction expansion of the rational function above is 8 This appears to be numerically stable. 2/18/2021 XI Seminario Nazionale di Fisica Teorica 121

Chebyshev Approximation 2/18/2021 XI Seminario Nazionale di Fisica Teorica 122

Chebyshev Approximation 2/18/2021 XI Seminario Nazionale di Fisica Teorica 122

Золотарев Formula 4 Elliptic function are doubly periodic analytic functions 4 Jacobi elliptic function

Золотарев Formula 4 Elliptic function are doubly periodic analytic functions 4 Jacobi elliptic function 4 Real period 4 K, K=K(k) 4 Imaginary period 2 i. K’, K’=K(k’) , k 2+k’ 2=1 Im z ∞ i. K’ -2 K -K 0 K 2 K Re z 1/k 1 0 -1 -1/k -i. K’ 4 Complete elliptic integral 2/18/2021 XI Seminario Nazionale di Fisica Teorica 123

Золотарев Formula 4 All analytic functions with the same periods may be expressed as

Золотарев Formula 4 All analytic functions with the same periods may be expressed as a rational function of sn(z, k) 4 Modular transformations 4 Divide imaginary period by n sn(z/M, λ) Im z i. K’ -2 K -K 0 K 2 K Re z sn(z, k) -i. K’ 2/18/2021 XI Seminario Nazionale di Fisica Teorica 124

Multibosons 4 Chebyshev polynomial approximations were introduced by Lüscher in his Multiboson method 4

Multibosons 4 Chebyshev polynomial approximations were introduced by Lüscher in his Multiboson method 4 No solution of linear equations is required 4 One must store n scalar fields 4 The dynamics of multiboson fields is stiff, so very small step sizes are required for gauge field updates 2/18/2021 XI Seminario Nazionale di Fisica Teorica 125

Reweighting 4 Making Lüscher’s method exact 4 One can introduce an accept/reject step 4

Reweighting 4 Making Lüscher’s method exact 4 One can introduce an accept/reject step 4 One can reweight the configurations by the ratio det[M] det[P(M)] 8 This factor is close to one if P(M) is a good approximation 8 If one only approximates 1/x over the interval [ , 1] which does not cover the spectrum of M, then the reweighting factor may be large 8 It has been suggested that this will help to sample configurations which would otherwise be suppressed by small eigenvalues of the Dirac operator 2/18/2021 XI Seminario Nazionale di Fisica Teorica 126

Noisy methods 4 Since the GHMC algorithm is correct for any reversible and area-preserving

Noisy methods 4 Since the GHMC algorithm is correct for any reversible and area-preserving mapping, it is often useful to use some modified MD process which is cheaper 4 Accept/reject step must use the true Hamiltonian, of course 4 Tune the parameters in the MD Hamiltonian to maximise the Metropolis acceptance rate 4 Add irrelevant operators? 8 A different kind of “improvement” 2/18/2021 XI Seminario Nazionale di Fisica Teorica 127

Noisy Inexact Algorithms 4 Replace the force -S’ in the leapfrog equations of motion

Noisy Inexact Algorithms 4 Replace the force -S’ in the leapfrog equations of motion by a noisy estimator such that 4 Choose the noise independently for each step 4 The equation for the shift in the fixed point distribution is now 4 For the noisy Langevin algorithm we have and , but since we obtain 4 For the noisy Hybrid algorithm and , so 2/18/2021 XI Seminario Nazionale di Fisica Teorica 128

Noisy Inexact Algorithms 4 This method is useful for non-local actions, such as staggered

Noisy Inexact Algorithms 4 This method is useful for non-local actions, such as staggered fermions with flavours 4 Here the action may be written as , and the force is 8 A cheap way of estimating such a trace noisily is to use the fact that 4 Using a non area preserving irreversible update step the noisy Hybrid algorithm can be adjusted to have , and thus too 8 This is the Hybrid R algorithm 8 One must use Gaussian noise (Z 2 noise does not work) 4 Campostrini’s integration scheme may be used to produce higher order Langevin and Hybrid algorithms with for arbitrary n, but this is not applicable to the noisy versions 2/18/2021 XI Seminario Nazionale di Fisica Teorica 129

Kennedy—Kuti Algorithm 4 Suppose it is prohibitively expensive to evaluate , but that we

Kennedy—Kuti Algorithm 4 Suppose it is prohibitively expensive to evaluate , but that we can compute an unbiased estimator for it cheaply, 4 If we look carefully at the proof that the Metropolis algorithm satisfies detailed balance we see that the ratio R is used for two quite different purposes 4 It is used to give an ordering to configurations: ’ < if R < 1, that is, if Q( ’) < Q( ) 4 It is used as the acceptance probability if ’ < 4 We can produce a valid algorithm using the noisy estimator for the latter rôle just by choosing another ordering for the configurations 4 A suitable ordering is provided by any “cheap” function f such that the set of configurations for which f( ’) = f( ) has measure zero 2/18/2021 XI Seminario Nazionale di Fisica Teorica 130

Kennedy—Kuti Algorithm 4 We now define the acceptance probability as 4 Where are to

Kennedy—Kuti Algorithm 4 We now define the acceptance probability as 4 Where are to be chosen so that 8 Taking is often a convenient choice 4 Detailed balance is easily established by considering the cases f( ) > f( ’) and f( ) < f( ’) separately 4 In practice the condition can rarely be satisfied exactly, but usually the number of violation can be made (exponentially) small 2/18/2021 XI Seminario Nazionale di Fisica Teorica 131

Noisy Fermions 4 An interesting application is when we have a non-local fermionic determinant

Noisy Fermions 4 An interesting application is when we have a non-local fermionic determinant in the fixed point distribution, 4 In this case we need to produce an unbiased estimator of 2/18/2021 XI Seminario Nazionale di Fisica Teorica 132

Noisy Fermions 4 It is easy to construct unbiased estimators for 4 Let be

Noisy Fermions 4 It is easy to construct unbiased estimators for 4 Let be a vector of random numbers whose components are chosen independently from a distribution with mean zero and variance one (Gaussian or Z 2 noise, for example) 4 Set , then 4 Furthermore, if and are independent, then 4 As long as all the eigenvalues of Q lie within the unit circle then 4 Similarly the exponential can be expanded as a power series 2/18/2021 XI Seminario Nazionale di Fisica Teorica 133

Bhanot–Kennedy algorithm 4 In order to obtain an unbiased estimator for R we sum

Bhanot–Kennedy algorithm 4 In order to obtain an unbiased estimator for R we sum these series stochastically 4 Suppose where 8 Our series can be transformed into this form 4 This can be “factored” as 4 And may be summed stochastically using the following scheme 2/18/2021 XI Seminario Nazionale di Fisica Teorica 134

“Kentucky” Algorithm 4 This gets rid of the systematic errors of the Kennedy— Kuti

“Kentucky” Algorithm 4 This gets rid of the systematic errors of the Kennedy— Kuti method by eliminating the violations 4 Promote the noise to the status of dynamical fields 4 Suppose that we have an unbiased estimator 2/18/2021 XI Seminario Nazionale di Fisica Teorica 135

“Kentucky” Algorithm 4 The and fields can be updated by alternating Metropolis Markov steps

“Kentucky” Algorithm 4 The and fields can be updated by alternating Metropolis Markov steps 4 There is a sign problem only if the Kennedy— Kuti algorithm would have many violations 4 If one constructs the estimator using the Kennedy—Bhanot algorithm then one will need an infinite number of noise fields in principle 4 Will these ever reach equilibrium? 2/18/2021 XI Seminario Nazionale di Fisica Teorica 136

Cost of Noisy Algorithms 4 Inexact algorithms 4 These have only a trivial linear

Cost of Noisy Algorithms 4 Inexact algorithms 4 These have only a trivial linear volume dependence, with z = 1 for Hybrid and z = 2 for Langevin 4 Noisy inexact algorithms 4 The noisy trajectories deviate from the true classical trajectory by a factor of for each step, or for a trajectory of steps 8 This will not affect the integrated autocorrelation function as long as 8 Thus the noisy Langevin algorithm should have 8 Thus the noisy Hybrid algorithm should have 2/18/2021 XI Seminario Nazionale di Fisica Teorica 137

Cost of Noisy Algorithms 4 Noisy exact algorithms 4 These algorithms use noisy estimators

Cost of Noisy Algorithms 4 Noisy exact algorithms 4 These algorithms use noisy estimators to produce an (almost) exact algorithm which is applicable to non-local actions 8 A straightforward approach leads to a cost of for exact noisy Langevin 8 It is amusing to note that this algorithm should not care what force term is used in the equations of motion 8 An exact noisy Hybrid algorithm is also possible, and for it 2/18/2021 XI Seminario Nazionale di Fisica Teorica 138

Cost of Noisy Algorithms 4 These results apply only to operators (like the magnetisation)

Cost of Noisy Algorithms 4 These results apply only to operators (like the magnetisation) which couple sufficiently strongly to the slowest modes of the system. For other operators, like the energy in 2 dimensions, we can even obtain z = 0 2/18/2021 XI Seminario Nazionale di Fisica Teorica 139

Cost of Noisy Algorithms 4 Summary 4 Too little noise increases critical slowing down

Cost of Noisy Algorithms 4 Summary 4 Too little noise increases critical slowing down because the system is too weakly ergodic 4 Too much noise increases critical slowing down because the system takes a drunkard’s walk through phase space 4 To attain z = 1 for any operator (and especially for the exponential autocorrelation time) one must be able to tune the amount of noise suitably 2/18/2021 XI Seminario Nazionale di Fisica Teorica 140

PHMC 4 Polynomial Hybrid Monte Carlo algorithm 4 Instead of using Chebyshev polynomials in

PHMC 4 Polynomial Hybrid Monte Carlo algorithm 4 Instead of using Chebyshev polynomials in the multiboson algorithm, Frezzotti & Jansen and de. Forcrand suggested using them directly in HMC 4 Polynomial approximation to 1/x are typically of order 40 to 100 at present 4 Numerical stability problems with high order polynomial evaluation 4 Polynomials must be factored 4 Correct ordering of the roots is important 4 Frezzotti & Jansen claim there advantages from using reweighting 2/18/2021 XI Seminario Nazionale di Fisica Teorica 141

RHMC 4 Rational Hybrid Monte Carlo algorithm 4 The idea is similar to PHMC,

RHMC 4 Rational Hybrid Monte Carlo algorithm 4 The idea is similar to PHMC, but uses rational approximations instead of polynomial ones 4 Much lower orders required for a given accuracy 4 Evaluation simplified by using partial fraction expansion and multiple mass linear equation solvers 41/x is already a rational function, so RHMC reduces to HMC in this case 4 Can be made exact using noisy accept/reject step 2/18/2021 XI Seminario Nazionale di Fisica Teorica 142

Ginsparg–Wilson fermions 4 Is it possible to have chiral symmetry on the lattice without

Ginsparg–Wilson fermions 4 Is it possible to have chiral symmetry on the lattice without doublers if we only insist that the symmetry holds on shell? 4 Such a transformation should be of the form (Lüscher) 4 For it to be a symmetry the Dirac operator must be invariant 4 For a small transformation this implies that 4 Which is the Ginsparg-Wilson relation 2/18/2021 XI Seminario Nazionale di Fisica Teorica 143

Neuberger’s Operator 4 We can find a solution of the Ginsparg-Wilson relation as follows

Neuberger’s Operator 4 We can find a solution of the Ginsparg-Wilson relation as follows 4 Let the lattice Dirac operator to be of the form 4 This satisfies the GW relation if 4 And it must also have the correct continuum limit 4 Both of these conditions are satisfied if we define (Neuberger) 2/18/2021 XI Seminario Nazionale di Fisica Teorica 144

Neuberger’s Operator 4 There are many other possible solutions 4 The discontinuity is necessary

Neuberger’s Operator 4 There are many other possible solutions 4 The discontinuity is necessary 4 This operator is local (Lüscher) 4 At least if we constrain the fields such that the plaquette < 1/30 4 By local we mean a function of fast decrease, as opposed to ultralocal which means a function of compact support 2/18/2021 XI Seminario Nazionale di Fisica Teorica 145

Computing Neuberger’s Operator 4 Use polynomial approximation to Neuberger’s operator 4 High degree polynomials

Computing Neuberger’s Operator 4 Use polynomial approximation to Neuberger’s operator 4 High degree polynomials have numerical instabilities 4 For dynamical GW fermions this leads to a PHMC algorithm 4 Use rational approximation 4 Optimal rational approximations for sgn(x) are know in closed form (Zolotarev) 4 Requires solution of linear equations just to apply the operator 4 For dynamical GW fermions this leads to an RHMC algorithm 4 Requires nested linear equation solvers in the dynamical case 4 Nested solvers can be avoided at the price of a much more illconditioned system 4 Attempts to combine inner and outer solves in one Krylov space method 2/18/2021 XI Seminario Nazionale di Fisica Teorica 146

Computing Neuberger’s Operator 4 Extract low-lying eigenvectors explicitly, and evaluate their contribution to the

Computing Neuberger’s Operator 4 Extract low-lying eigenvectors explicitly, and evaluate their contribution to the Dirac operator directly 4 Efficient methods based on Ritz functional 4 Very costly for dynamical fermions if there is a finite density of zero eigenvalues in the continuum limit (Banks—Casher) 4 Might allow for noisy accept/reject step if we can replace the step function with something continuous (so it has a reasonable series expansion) 4 Use better approximate solution of GW relation instead of Wilson operator 4 E. g. , a relatively cheap “perfect action” 2/18/2021 XI Seminario Nazionale di Fisica Teorica 147

Update Ordering for LHMC 4 Consider the update of a single site x using

Update Ordering for LHMC 4 Consider the update of a single site x using the LHMC algorithm 4 The values of at all other sites are left unchanged 4 This may be written in matrix form as where 2/18/2021 XI Seminario Nazionale di Fisica Teorica 148

Update Ordering for LHMC 4 For a complete sweep through the lattice consisting of

Update Ordering for LHMC 4 For a complete sweep through the lattice consisting of exactly updates in some order , we have 4 It is convenient to introduce the Fourier transformed fields and momenta 4 The corresponding Fourier transformed single site update matrix is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 149

Update Ordering for LHMC 4 We want to compute integrated autocorrelation functions for operators

Update Ordering for LHMC 4 We want to compute integrated autocorrelation functions for operators like the magnetic susceptibility 8 The behaviour of the autocorrelations of linear operators such as the magnetisation can be misleading 4 In order to do this it is useful to consider quadratic operators as linear operators on quadratic monomials in the fields, , so 8 These quadratic monomials are updated according to 8 So, after averaging over the momenta we have 8 Where and 2/18/2021 XI Seminario Nazionale di Fisica Teorica 150

Update Ordering for LHMC 4 The integrated autocorrelation function for is 2/18/2021 XI Seminario

Update Ordering for LHMC 4 The integrated autocorrelation function for is 2/18/2021 XI Seminario Nazionale di Fisica Teorica 151

Update Ordering for LHMC 4 Random updates 4 The update matrix for a sweep

Update Ordering for LHMC 4 Random updates 4 The update matrix for a sweep after averaging over the V independent choices of update sites is just 4 We thus find that 4 Which tells us that z = 2 for any choice of the overrelaxation parameter 2/18/2021 XI Seminario Nazionale di Fisica Teorica 152

Update Ordering for LHMC 4 Even/Odd updates 4 In a single site update the

Update Ordering for LHMC 4 Even/Odd updates 4 In a single site update the new value of the field at x only depends at the old value at x and its nearest neighbours, so even sites depend only on odd sites and vice versa 8 The update matrix for a sweep in Fourier space thus reduces to a 2 × 2 matrix coupling and 8 thus becomes a 3 × 3 matrix in this case 4 The integrated autocorrelation function is 8 This is minimised by choosing , for which 8 Hence z = 1 2/18/2021 XI Seminario Nazionale di Fisica Teorica 153

Update Ordering for LHMC 4 Lexicographical updates 4 This is a scheme in which

Update Ordering for LHMC 4 Lexicographical updates 4 This is a scheme in which is always updated before 8 Except at the edges of the lattice 4 A single site update depends on the new values of the neighbouring sites in the + directions and the old values in the - directions new 4 So we may write the sweep update implicitly as 8 Where old updateable 2/18/2021 XI Seminario Nazionale di Fisica Teorica 154

Update Ordering for LHMC 4 The sweep update matrices are easily found explicitly 4

Update Ordering for LHMC 4 The sweep update matrices are easily found explicitly 4 The Fourier transforms and are diagonal up to surface terms , which we expect to be suppressed by a factor of 1/L 4 We thus obtain 8 Which is minimised by the choice 8 For which and hence z = 0 4 This can be achieved for most operators 8 Though with different values of 8 The dynamical critical exponent for the exponential autocorrelation time is still one at best 2/18/2021 XI Seminario Nazionale di Fisica Teorica 155