Overview of simulation Posterior distribution and posterior prediction

Overview of simulation
• Posterior distribution and posterior prediction
  – Plot the shape of the distribution.
  – Calculate its statistical properties and confidence bounds.
  – Perform additional analysis if necessary.
• Phase I: analytical approach
  – Use analytical functions of standard pdfs.

  Posterior distribution                     | θ of Binom.             | μ of N(μ, σ²)           | σ² of N(μ, σ²)
  Available pdf                              | [formula not recovered] | [formula not recovered] | [formula not recovered]
  Available pdf for predictive distribution  | NA                      | [formula not recovered] | NA

Overview of simulation

  Posterior distribution                     | μ, σ² of N(μ, σ²)
  Available pdf                              | [formula not recovered]
  Available pdf for predictive distribution  | NA

• Phase II: sampling approach based on factorization
  – Use sampling techniques for standard pdfs.
    1. Draw σ² from the marginal pdf p(σ²|y).
    2. Draw μ from the conditional pdf p(μ|σ², y).
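
The two draws in Phase II can be sketched in Python as follows. This is a minimal sketch, not the lecture's own code: it assumes the noninformative prior p(μ, σ²) ∝ 1/σ², under which σ²|y is a scaled inverse chi-square and μ|σ², y is normal. The function name `sample_normal_posterior` and the data values are made up for illustration.

```python
import random
import statistics


def sample_normal_posterior(y, n_draws, seed=0):
    """Draw (mu, sigma2) pairs from p(mu, sigma2 | y) by factorization.

    Assumption (not stated on the slide): prior p(mu, sigma2) ∝ 1/sigma2, so
      sigma2 | y      ~ scaled Inv-chi2(n - 1, s2)
      mu | sigma2, y  ~ N(ybar, sigma2 / n)
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = statistics.fmean(y)
    s2 = statistics.variance(y)  # sample variance with divisor n - 1
    draws = []
    for _ in range(n_draws):
        # Step 1: sigma2 from its marginal pdf.
        # A chi-square with k degrees of freedom is Gamma(shape=k/2, scale=2).
        chi2 = rng.gammavariate((n - 1) / 2.0, 2.0)
        sigma2 = (n - 1) * s2 / chi2
        # Step 2: mu from its conditional pdf given the sigma2 just drawn.
        mu = rng.gauss(ybar, (sigma2 / n) ** 0.5)
        draws.append((mu, sigma2))
    return draws


y = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.2]  # made-up data
draws = sample_normal_posterior(y, n_draws=20000)
mu_mean = statistics.fmean([m for m, _ in draws])
sigma2_mean = statistics.fmean([s for _, s in draws])
```

Each pair is a joint draw from the posterior, so the first components alone are draws from the marginal posterior of μ.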

Overview of simulation

  Posterior distribution                     | α, β of regression for death prob. θ (too complex to express in closed form)
  Available pdf                              | NA
  Available pdf for predictive distribution  | NA

• Phase III: sampling approach based on factorization
  – Use the inverse-CDF sampling technique for the general case.
    1. Draw α from the marginal pdf p(α|y).
    2. Draw β from the conditional pdf p(β|α, y).
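
A possible sketch of the two Phase III steps on a grid, in Python. The regression posterior from the slide is not reproduced here; `log_post` is a hypothetical stand-in (an unnormalized correlated bivariate normal), and the grid range, grid size, and the uniform jitter inside each cell are all assumptions.

```python
import bisect
import math
import random


def log_post(a, b):
    """Hypothetical unnormalized log-posterior (correlated bivariate normal)."""
    rho = 0.6
    return -(a * a - 2 * rho * a * b + b * b) / (2 * (1 - rho * rho))


def inverse_cdf_index(weights, u):
    """Return index i with probability weights[i] / sum(weights), for u in [0, 1)."""
    cdf, total = [], 0.0
    for w in weights:
        total += w
        cdf.append(total)
    return min(bisect.bisect_right(cdf, u * total), len(weights) - 1)


def sample_two_params(n_draws, lo=-5.0, hi=5.0, n_grid=201, seed=0):
    rng = random.Random(seed)
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    # Joint density on the grid: dens[i][j] is proportional to p(a_i, b_j | y).
    dens = [[math.exp(log_post(a, b)) for b in grid] for a in grid]
    # Marginal of a: sum the joint density over b.
    marg_a = [sum(row) for row in dens]
    draws = []
    for _ in range(n_draws):
        i = inverse_cdf_index(marg_a, rng.random())   # step 1: a from p(a | y)
        j = inverse_cdf_index(dens[i], rng.random())  # step 2: b from p(b | a, y)
        # Uniform jitter within the grid cell so the draws are continuous.
        a = grid[i] + (rng.random() - 0.5) * step
        b = grid[j] + (rng.random() - 0.5) * step
        draws.append((a, b))
    return draws


draws = sample_two_params(n_draws=5000)
```

The cost of building `dens` grows with the square of the grid size, which is why this approach is limited to very few parameters.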

Overview of simulation
• Remark
  – For more complicated & practical problems, analytic treatment of the posterior distribution becomes more and more difficult or impossible.
  – A battery of powerful methods has been developed over the past few decades for simulating from probability distributions.
• References
  – Chap 10 & 11 of Gelman
  – Andrieu, C., et al. (2003). An Introduction to MCMC for Machine Learning. Machine Learning, 50, 5–43.
• Methods of simulation
  – Grid method (inverse CDF method)
  – Rejection sampling
  – Importance sampling
  – Markov Chain Monte Carlo (MCMC) method

Grid method (inverse CDF method)
• Procedure
  – To generate samples following the pdf f(v):
    1. Construct an approximate cdf F(v), which is the integral of f(v).
    2. Draw a random value U from the uniform distribution on [0, 1].
    3. Let v = F⁻¹(U). The value v is then a random draw from f(v).
• Practice with matlab
• Remarks
  – Effective only when the range of the variable is known and no appreciable probability mass lies outside the grid.
  – Not good for higher-dimensional multivariate problems, where computing at every point in the multidimensional grid becomes prohibitively expensive.
  – Conclusion: this method is not widely used in practice.
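
The three steps of the procedure can be sketched in Python (the slides use matlab for practice; Python is used here only for illustration). The function name `grid_inverse_cdf_sampler`, the trapezoidal integration, and the linear interpolation of the inverse cdf are assumptions of this sketch, and the target pdf is a made-up example.

```python
import bisect
import math
import random


def grid_inverse_cdf_sampler(pdf, lo, hi, n_grid=1000, seed=0):
    """Return a function that draws from an (unnormalized) pdf on [lo, hi]."""
    rng = random.Random(seed)
    step = (hi - lo) / n_grid
    v = [lo + i * step for i in range(n_grid + 1)]
    f = [pdf(x) for x in v]
    # Step 1: approximate cdf F(v) by cumulative trapezoidal integration of f(v).
    F = [0.0]
    for i in range(n_grid):
        F.append(F[-1] + 0.5 * (f[i] + f[i + 1]) * step)
    total = F[-1]
    F = [c / total for c in F]  # normalize so that F(hi) = 1

    def draw():
        u = rng.random()                   # Step 2: U ~ Uniform[0, 1)
        k = bisect.bisect_right(F, u) - 1  # grid cell with F[k] <= u < F[k + 1]
        k = min(max(k, 0), n_grid - 1)
        width = F[k + 1] - F[k]
        frac = (u - F[k]) / width if width > 0 else 0.0
        return v[k] + frac * step          # Step 3: v = F^{-1}(U), interpolated

    return draw


# Made-up target: unnormalized standard normal, truncated to [-6, 6].
draw = grid_inverse_cdf_sampler(lambda x: math.exp(-0.5 * x * x), -6.0, 6.0)
samples = [draw() for _ in range(20000)]
```

Choosing [-6, 6] reflects the first remark above: the grid must cover essentially all of the probability mass, otherwise the tails are silently cut off.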

Rejection sampling
• Procedure
  – To generate samples from the pdf p(x), introduce an arbitrary proposal pdf q(x) that is easy to sample from, such that M q(x) ≥ p(x) for all x.
    1. Sample x at random from the proposal pdf q(x).
    2. With probability p(x)/(M q(x)), accept x as a draw from p; otherwise reject it and return to step 1.
  – M is a constant chosen so that M q(x) is at least p(x) everywhere.
• Pseudo-code & illustration
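
A minimal Python sketch of the accept/reject loop described above. The target used here, a Beta(2, 4) density, the uniform proposal, and the value M = 2.2 are illustrative choices and are not taken from the slides.

```python
import random


def rejection_sample(p, q_pdf, q_draw, M, n_samples, seed=0):
    """Draw n_samples from p using proposal q, assuming M * q(x) >= p(x) everywhere."""
    rng = random.Random(seed)
    samples, n_tried = [], 0
    while len(samples) < n_samples:
        x = q_draw(rng)                            # 1. sample x from q(x)
        n_tried += 1
        if rng.random() < p(x) / (M * q_pdf(x)):   # 2. accept with prob p/(M q)
            samples.append(x)
    return samples, n_samples / n_tried            # samples and acceptance rate


def beta24_pdf(x):
    """Illustrative target: Beta(2, 4) density, 20 x (1 - x)^3 on [0, 1]."""
    return 20.0 * x * (1.0 - x) ** 3


# Uniform(0, 1) proposal; the target peaks at about 2.11, so M = 2.2 bounds it.
samples, accept_rate = rejection_sample(
    p=beta24_pdf,
    q_pdf=lambda x: 1.0,
    q_draw=lambda rng: rng.random(),
    M=2.2,
    n_samples=10000,
)
```

Because this target is normalized, the observed acceptance rate should be close to 1/M.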

Rejection sampling
• Practice with matlab
  – Generate samples of this distribution.
• Remarks
  – It is not always possible to bound p/q by a reasonably small constant M over the whole space. If M is too large, the acceptance probability, Pr(x accepted) = 1/M when p(x) is normalized, is too small.
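
A short worked derivation of the 1/M acceptance probability, assuming that p(x) and q(x) are both normalized pdfs and that M q(x) ≥ p(x) everywhere:

```latex
\Pr(x \text{ accepted})
  = \int q(x)\,\frac{p(x)}{M\,q(x)}\,dx
  = \frac{1}{M}\int p(x)\,dx
  = \frac{1}{M}
```

On average M proposals are therefore needed per accepted draw, which is why a loose bound M makes the method inefficient.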

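Importance sampling
• Calculation of moment
  – Introduce an arbitrary pdf q(x) that is easy to sample from. In this case, q(x) need not envelope p(x) (no bounding constant M is required), but q(x) must be nonzero wherever p(x) is nonzero.
  – Then the moment (or expectation) of an arbitrary function f(x) becomes
      $$ E[f(x)] = \int f(x)\,p(x)\,dx = \int f(x)\,w(x)\,q(x)\,dx \approx \frac{1}{N}\sum_{i=1}^{N} f(x^{(i)})\,w(x^{(i)}), \qquad w(x) = \frac{p(x)}{q(x)}, $$
    where x(i) is the i-th sample drawn from q(x).
  – In case p(x) is not normalized, normalize the weights so that they sum to 1, and use the weighted sum of f(x(i)) without the factor 1/N.
• Practice with matlab
  – Calculate the mean & variance of p(x) using importance sampling.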
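
A minimal Python sketch of the mean and variance calculation with normalized weights. The target (an unnormalized normal with mean 1 and standard deviation 0.5) and the wider normal proposal are made-up examples, and `importance_moments` is a hypothetical helper name.

```python
import math
import random


def importance_moments(p_unnorm, q_pdf, q_draw, n, seed=0):
    """Estimate mean and variance of p from n draws of q, using normalized weights."""
    rng = random.Random(seed)
    xs = [q_draw(rng) for _ in range(n)]
    w = [p_unnorm(x) / q_pdf(x) for x in xs]   # w = p / q, up to a constant
    total = sum(w)
    w = [wi / total for wi in w]               # normalize: weights sum to 1
    mean = sum(wi * x for wi, x in zip(w, xs))
    var = sum(wi * (x - mean) ** 2 for wi, x in zip(w, xs))
    return mean, var


def p_unnorm(x):
    """Made-up target: N(1, 0.5^2) without its normalizing constant."""
    return math.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)


def q_pdf(x):
    """Proposal density: N(0, 2^2), wider than the target."""
    return math.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))


mean, var = importance_moments(p_unnorm, q_pdf, lambda rng: rng.gauss(0.0, 2.0), 50000)
```

A proposal with heavier tails than the target keeps the weights bounded; a proposal that is too narrow would give a few very large weights and a noisy estimate.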

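Importance sampling
• Calculation of probability
  – In case we can draw samples from p(x),
      $$ P[g(x) < 0] = \int I_g(x)\,p(x)\,dx \approx \frac{1}{N}\sum_{i=1}^{N} I_g(x^{(i)}), \qquad x^{(i)} \sim p(x), $$
    where I_g(x) is 1 when g(x) < 0 and 0 otherwise. This counts the fraction of samples x for which g < 0.
  – In case we cannot draw samples from p(x), draw them from q(x) instead:
      $$ P[g(x) < 0] \approx \frac{1}{N}\sum_{i=1}^{N} I_g(x^{(i)})\,w(x^{(i)}), \qquad x^{(i)} \sim q(x), \quad w(x) = \frac{p(x)}{q(x)}. $$
    This sums I_g over the samples with g < 0, but with uneven weights.
• Practice with matlab
  – Calculate P[p(x)<5] using importance sampling.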
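
The weighted indicator sum can be sketched in Python as below. The example is made up: it estimates the small tail probability P[x > 4] for a standard normal p(x), written as g(x) = 4 - x < 0, using a proposal centered at 4. The sketch assumes p(x) is normalized; `prob_g_negative` and `normal_pdf` are hypothetical helper names.

```python
import math
import random


def normal_pdf(x, mu=0.0, sd=1.0):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_g_negative(g, p_pdf, q_pdf, q_draw, n, seed=0):
    """Estimate P[g(x) < 0] under p from n draws of q (p assumed normalized)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = q_draw(rng)
        if g(x) < 0:                        # indicator I_g(x) = 1
            total += p_pdf(x) / q_pdf(x)    # add the weight w(x) = p(x) / q(x)
    return total / n


estimate = prob_g_negative(
    g=lambda x: 4.0 - x,
    p_pdf=normal_pdf,
    q_pdf=lambda x: normal_pdf(x, 4.0, 1.0),
    q_draw=lambda rng: rng.gauss(4.0, 1.0),
    n=50000,
)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # closed-form tail for comparison
```

Drawing directly from p(x) would need millions of samples to see even a few points with x > 4; shifting the proposal into the region of interest is what makes the weighted estimate practical.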

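Importance sampling
• Generation of samples
  – Recall that
      $$ E[f(x)] \approx \frac{1}{N}\sum_{i=1}^{N} f(x^{(i)})\,w(x^{(i)}), \qquad x^{(i)} \sim q(x). $$
  – Another reading of this is that the distribution p(x) carries weight w(x(i)) at the sample points x(i) drawn from q(x). This can be written as
      $$ p(x) \approx \frac{1}{N}\sum_{i=1}^{N} w(x^{(i)})\,\delta(x - x^{(i)}). $$
• Practice with matlab
  – Generate samples of this distribution.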
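
One common way to turn the weighted points into ordinary samples is to resample them with probability proportional to their weights (sampling importance resampling). The slide does not name this step, so the Python sketch below is an assumption about how the practice could be carried out; the Gamma-shaped target and the exponential proposal are made up.

```python
import math
import random


def sir_samples(p_unnorm, q_pdf, q_draw, n_proposal, n_out, seed=0):
    """Draw from q, weight by p / q, then resample in proportion to the weights."""
    rng = random.Random(seed)
    xs = [q_draw(rng) for _ in range(n_proposal)]
    w = [p_unnorm(x) / q_pdf(x) for x in xs]
    # Resampling with replacement; weights need not be normalized here.
    return rng.choices(xs, weights=w, k=n_out)


def p_unnorm(x):
    """Made-up target: Gamma(shape 3, scale 1) without its normalizing constant."""
    return x * x * math.exp(-x)


def q_pdf(x):
    """Proposal density: exponential with rate 0.25 (heavier tail than the target)."""
    return 0.25 * math.exp(-0.25 * x)


samples = sir_samples(
    p_unnorm, q_pdf, lambda rng: rng.expovariate(0.25), n_proposal=50000, n_out=10000
)
```

The resampled points are only approximately distributed as p(x), and the approximation improves as the number of proposal draws grows relative to the number of output samples.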

General guidelines – From Chap 10 of Gelman
• Use of simulation in the Bayesian analysis
  – Inferences are conveniently conducted using random samples from the posterior distribution, from which summaries such as the percentiles at 2.5%, 25%, … are computed.
  – Once the simulations are obtained, it is also easy to draw samples from the predictive distribution: for each draw of θ from p(θ|y), just draw one new observation ỹ from p(ỹ|θ).
• Normalized vs unnormalized distribution
  – We assume that the target density p(θ|y), as a function of θ, can be easily computed for any value of θ, whether or not it has a closed form.
  – The density need not be normalized; it is sufficient that it be proportional to the true distribution.
• Crude or first-hand estimation
  – A rough estimate of the location of the distribution (that is, a point estimate of the parameters), obtained with some simple technique, is necessary.
  – Finding modes by optimization or Newton's method may also be needed.
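
The first guideline can be illustrated with a small Python sketch: posterior percentiles are read off the sorted draws, and one predictive draw is made per posterior draw. The binomial model with a uniform Beta(1, 1) prior, the counts (7 successes in 20 trials), and the helper names are assumptions chosen only for illustration.

```python
import random


def posterior_and_predictive(y_success, n_trials, n_draws, seed=0):
    """Draw theta from p(theta | y), then one new outcome per theta draw."""
    rng = random.Random(seed)
    # Assumed model: binomial likelihood with uniform prior, so the
    # posterior is Beta(y + 1, n - y + 1).
    theta = [
        rng.betavariate(y_success + 1, n_trials - y_success + 1)
        for _ in range(n_draws)
    ]
    # For each theta draw, draw one new Bernoulli outcome from p(y_new | theta).
    y_new = [1 if rng.random() < t else 0 for t in theta]
    return theta, y_new


def percentile(sorted_draws, pct):
    """Simple empirical percentile of already sorted draws."""
    k = min(int(pct / 100.0 * len(sorted_draws)), len(sorted_draws) - 1)
    return sorted_draws[k]


theta, y_new = posterior_and_predictive(y_success=7, n_trials=20, n_draws=20000)
theta_sorted = sorted(theta)
lower, upper = percentile(theta_sorted, 2.5), percentile(theta_sorted, 97.5)
predictive_prob = sum(y_new) / len(y_new)
```

Here `lower` and `upper` bound a central 95% posterior interval for θ, and `predictive_prob` approximates the predictive probability of a success in a new trial.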

General guidelines
• How many simulation draws are needed?