Random Numbers and Simulation
• Generating truly random numbers is not possible
  - Programs have been developed to generate pseudo-random numbers
  - Values are generated from deterministic algorithms
© Fall 2011 John Grego and the University of South Carolina
Random Numbers
• Well-designed pseudo-random deviates can pass standard statistical tests for randomness
• They appear to be independent and identically distributed
• Random number generators for common distributions are available in R
• Special techniques (STAT 740) may be needed as well
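As a minimal sketch of the points above, the following R lines draw deviates from a few common distributions and show that the stream is deterministic once the seed is fixed. The seed value and sample sizes are arbitrary choices made for illustration; they are not from the slides.

```r
# Arbitrary seed (an assumption for illustration): same seed, same stream.
set.seed(740)
u1 <- runif(5)              # five U(0, 1) deviates
set.seed(740)
u2 <- runif(5)              # regenerated from the same starting state
same_stream <- identical(u1, u2)

# Built-in generators for some common distributions
z <- rnorm(1000, mean = 10, sd = 1)         # normal
e <- rexp(1000, rate = 1)                   # exponential
b <- rbeta(1000, shape1 = 5, shape2 = 1)    # beta

same_stream
summary(z)
```

Resetting the seed reproduces the same values, which is what "generated from deterministic algorithms" means in practice.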
Monte Carlo Simulation
• Some common uses of simulation
  - Modeling stochastic behavior
  - Calculating definite integrals
  - Approximating the sampling distribution of a statistic (e.g., maximum of a random sample)
Modeling Stochastic Behavior
• Buffon's needle
• Random walk
  - Observe $X_1, X_2, \ldots$, where $p = P(X_i = 1) = P(X_i = -1) = 0.5$, and study $S_1, S_2, \ldots$, where $S_n = \sum_{i=1}^{n} X_i$
  - This is also called Gambler's ruin; each $X_i$ represents a \$1 bet with a return of \$2 for a win and \$0 for a loss.
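A minimal R sketch of the random walk described above: the steps are drawn with `sample()` and the partial sums with `cumsum()`. The number of steps and the seed are assumptions chosen for illustration.

```r
set.seed(2011)   # arbitrary seed so the run can be repeated

n_steps <- 1000  # assumed path length

# X_i = +1 (win) or -1 (loss), each with probability 0.5
x <- sample(c(1, -1), size = n_steps, replace = TRUE, prob = c(0.5, 0.5))

# S_n = X_1 + ... + X_n, the gambler's net winnings after n bets
s <- cumsum(x)

head(s, 10)   # first ten positions of the walk
range(s)      # lowest and highest net winnings along the path
```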
A Fair Game
• The properties of a fair game ($p = 0.5$) are a lot more interesting than the properties of an unfair game ($p \ne 0.5$)
• Some properties of this process are easy to anticipate (e.g., $E(S_n)$, which is 0 for a fair game)
Gambler's Ruin
• Some properties are difficult to anticipate, and can be aided by simulation.
  - Expected number of returns to 0
  - Expected length of a winning streak
  - Probability of going broke given an initial bank
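The three quantities above can be estimated by simulation. The sketch below is one possible setup, not the course's own code. It assumes a walk of 100 bets for the first two quantities, and for the third it assumes the gambler starts with \$5 and stops at \$0 or at a target of \$20. The helper names `walk_summary` and `goes_broke` are hypothetical.

```r
set.seed(2011)  # arbitrary seed

# Simulate one walk of n_steps bets and summarise the path.
walk_summary <- function(n_steps, p = 0.5) {
  x <- sample(c(1, -1), size = n_steps, replace = TRUE, prob = c(p, 1 - p))
  s <- cumsum(x)
  streaks <- rle(x)   # runs of consecutive wins or losses
  win_lengths <- streaks$lengths[streaks$values == 1]
  c(returns_to_zero = sum(s == 0),
    mean_win_streak = mean(win_lengths))
}

# Average the per-walk summaries over many simulated walks.
summaries <- replicate(2000, walk_summary(n_steps = 100))
avg_summary <- rowMeans(summaries, na.rm = TRUE)

# Bet $1 at a time from `bank` until broke (0) or until `target` is reached.
goes_broke <- function(bank, target, p = 0.5) {
  while (bank > 0 && bank < target) {
    bank <- bank + sample(c(1, -1), size = 1, prob = c(p, 1 - p))
  }
  bank == 0   # TRUE if the gambler was ruined
}

ruin_prob <- mean(replicate(2000, goes_broke(bank = 5, target = 20)))

avg_summary
ruin_prob   # for a fair game the exact value is 1 - bank/target = 0.75
```

The stopping target is needed only to make each simulated game finite; other stopping rules (for example, a maximum number of bets) would work as well.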
Calculating Definite Integrals
• In statistics, we often have to calculate difficult definite integrals (posterior distributions, expected values): $I = \int_a^b h(x)\,dx$ (here, $x$ could be multidimensional)
Integral Examples
• Example 1
• Example 2
Hit-or-Miss Monte Carlo Example
• Example 1
• Determine $c$ such that $c \ge h(x)$ across the entire region of interest (here, $c = 4$)
Hit-or-Miss Monte Carlo Simulation
• Generate $n$ random uniform $(X_i, Y_i)$ pairs, with the $X_i$ from $U[a, b]$ (here, $U[0, 1]$) and the $Y_i$ from $U[0, c]$ (here, $U[0, 4]$)
• Count the number of times (call this $m$) that $Y_i$ is less than $h(X_i)$
• Then $I_1 \approx c(b-a)m/n$
  - I.e., (height)(width)(proportion under curve)
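The steps above can be sketched in R as follows. The integrand of Example 1 is not available in this transcript, so the sketch uses a hypothetical stand-in, $h(x) = 4/(1+x^2)$ on $[0, 1]$. It satisfies $h(x) \le 4$ there, and its integral is $\pi$, which makes the estimate easy to check. The sample size and seed are also assumptions.

```r
set.seed(2011)   # arbitrary seed

# Hypothetical integrand (NOT the slide's Example 1): integral over [0, 1] is pi
h <- function(x) 4 / (1 + x^2)

a <- 0; b <- 1      # region of integration
c_bound <- 4        # c >= h(x) on [a, b]
n <- 100000         # number of (X, Y) pairs

x <- runif(n, min = a, max = b)        # X_i ~ U[a, b]
y <- runif(n, min = 0, max = c_bound)  # Y_i ~ U[0, c]

m <- sum(y < h(x))                     # points that fall under the curve

# (height)(width)(proportion under curve)
I_hat <- c_bound * (b - a) * m / n
I_hat
```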
Classical Monte Carlo Integration
• Take $n$ random uniform values $U_1, \ldots, U_n$ over $[a, b]$ and estimate $I$ using $\hat{I} = \frac{b-a}{n} \sum_{i=1}^{n} h(U_i)$
• This method looks simple, but it is actually more efficient (lower variance for the same $n$) than Hit-or-Miss Monte Carlo
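A minimal R sketch of the classical estimator, $(b-a)$ times the average of $h$ at uniform draws. As an assumption for illustration it uses a hypothetical integrand $h(x) = 4/(1+x^2)$ on $[0, 1]$ (true value $\pi$), not an integrand taken from the slides.

```r
set.seed(2011)   # arbitrary seed

# Hypothetical integrand (an assumption): integral over [0, 1] is pi
h <- function(x) 4 / (1 + x^2)

a <- 0; b <- 1
n <- 100000

u <- runif(n, min = a, max = b)    # U_1, ..., U_n ~ U[a, b]

# Estimate: (b - a) times the sample mean of h(U_i)
I_hat <- (b - a) * mean(h(u))

# Estimated standard error of I_hat
se_hat <- (b - a) * sd(h(u)) / sqrt(n)

c(estimate = I_hat, std_error = se_hat)
```

For this integrand the standard error is noticeably smaller than that of a hit-or-miss estimate with the same $n$ and $c = 4$, because each draw contributes the value $h(U_i)$ rather than only a hit/miss indicator.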
Expected Values
• Suppose $X$ is a random variable with density $f$. Find $E[h(X)] = \int h(x) f(x)\,dx$ for some function $h$, e.g.,
Estimating Expected Values
• For $n$ random values $X_1, X_2, \ldots, X_n$ from the distribution of $X$ (i.e., with density $f$), $E[h(X)] \approx \frac{1}{n} \sum_{i=1}^{n} h(X_i)$
Examples
• Example 3: If $X$ is a random variable with a $N(10, 1)$ distribution, find $E(X^2)$
• Example 4: If $Y$ is a random variable with a Beta(5, 1) distribution, find $E(-\ln Y)$
• There are more advanced methods of integration using simulation (importance sampling)
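Examples 3 and 4 can be approximated by averaging $h$ over simulated draws, as in the sketch below. The sample size and seed are assumptions. The exact values quoted in the comments follow from $E(X^2) = \mathrm{Var}(X) + [E(X)]^2$ and from the fact that $-\ln Y$ is exponential with rate 5 when $Y \sim$ Beta(5, 1).

```r
set.seed(2011)   # arbitrary seed
n <- 100000      # assumed number of simulated values

# Example 3: X ~ N(10, 1), estimate E(X^2)
x <- rnorm(n, mean = 10, sd = 1)
est_x2 <- mean(x^2)           # exact value: 1 + 10^2 = 101

# Example 4: Y ~ Beta(5, 1), estimate E(-ln Y)
y <- rbeta(n, shape1 = 5, shape2 = 1)
est_neglog <- mean(-log(y))   # exact value: 1/5

c(E_X2 = est_x2, E_neg_log_Y = est_neglog)
```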
Integration
• integrate() performs numerical integration for functions of a single variable (not using simulation techniques)
• adapt() in the adapt package performs multivariate numerical integration
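A short sketch of `integrate()` from base R. The first integrand is a hypothetical example chosen because its integral over $[0, 1]$ is $\pi$. The second call shows that infinite limits are accepted.

```r
# Hypothetical integrand (an assumption): integral over [0, 1] is pi
h <- function(x) 4 / (1 + x^2)

res <- integrate(h, lower = 0, upper = 1)
res$value       # numerical estimate of the integral
res$abs.error   # estimate of the absolute error

# Infinite limits are allowed: the standard normal density integrates to 1
total_prob <- integrate(dnorm, lower = -Inf, upper = Inf)$value
total_prob
```

The function passed to `integrate()` must be vectorized, that is, it must accept a vector of $x$ values and return a vector of the same length.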
The Sampling Distribution of a Statistic
• To perform inference (CIs, hypothesis tests) based on sample statistics, we need to know the sampling distribution of those statistics, at least up to an approximation
• Example: $X_1, X_2, \ldots, X_n \sim \text{iid } N(\mu, \sigma^2)$.
Approximating the Sampling Distribution of a Statistic
• What if the data's distribution is not known?
  - Large sample: Central Limit Theorem
  - Small sample: Normal theory or nonparametric procedures based on permutation distributions
Simulating the Sampling Distribution of a Statistic
• If the population distribution is known, we can approximate the sampling distribution with simulation.
  - Repeatedly ($m$ times) generate random samples of size $n$ from the population distribution
  - Calculate a statistic (say, $S$) each time
  - The empirical (observed) distribution of the $S$ values approximates the true distribution of $S$
Example
• $X_1, X_2, X_3, X_4 \sim \text{Expon}(1)$
• What is the sampling distribution of:
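The simulation recipe can be sketched in R for samples of size 4 from Expon(1). The statistics asked about in this example are not available in the transcript, so the sketch uses two stand-ins as assumptions: the sample mean and the sample maximum. The number of replications $m$ and the seed are also arbitrary.

```r
set.seed(2011)   # arbitrary seed

m <- 10000       # number of simulated samples (assumed)
n <- 4           # sample size, as in the example

# Each row is one random sample of size n from Expon(1)
samples <- matrix(rexp(m * n, rate = 1), nrow = m, ncol = n)

# Stand-in statistics (assumptions, not the slide's own choices)
xbar <- rowMeans(samples)         # sample mean of each sample
xmax <- apply(samples, 1, max)    # sample maximum of each sample

# Summaries of the simulated sampling distributions
c(mean_of_xbar = mean(xbar), var_of_xbar = var(xbar))
quantile(xmax, probs = c(0.025, 0.5, 0.975))
```

A call such as `hist(xbar)` would display the empirical distribution. For the sample mean, the simulated mean and variance should be close to 1 and 1/4.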