Binomial and Poisson Probability Distributions Binomial Probability Distribution

Binomial and Poisson Probability Distributions Binomial Probability Distribution l Consider a situation where there are only two possible outcomes (a “Bernoulli trial”) Example: u flipping a coin The outcome is either a head or tail u rolling a dice For example, 6 or not 6 (i. e. 1, 2, 3, 4, 5) Label the possible outcomes by the variable k We want to find the probability P(k) for event k to occur James Bernoulli (Jacob I) Since k can take on only 2 values we define those values as: born in Basel, Switzerland k = 0 or k = 1 Dec. 27, 1654 -Aug. 16, 1705 u Let the probability for outcome “k” to occur be: P(k = 0) = q He is one 8 mathematicians (remember 0 ≤ q ≤ 1) in the Bernoulli family. (from Wikipedia) u something must happen so P(k = 0) + P(k = 1) = 1 (mutually exclusive events) P(k = 1) = p = 1 - q u We can write the probability distribution P(k) as: P(k) = pkq 1 -k (Bernoulli distribution) u coin toss: define probability for a head as P(1) P(k=1) = 0. 5 and P(0=tail) = 0. 5 too! u dice rolling: define probability for a six to be rolled from a six sided dice as P(k=1) = 1/6 and P(k=0=not a six) = 5/6. R. Kass/Sp 12 P 416 Lecture 2 1

l What is the mean ( ) of P(k)? Mean and Variance of a discrete distribution (remember p+q=1) l What is the Variance ( 2) of P(k)? l Let’s do something more complicated: Suppose we have N trials (e. g. we flip a coin N times) what is the probability to get m successes (= heads)? l Consider tossing a coin twice. The possible outcomes are: no heads: P(m = 0) = q 2 one head: P(m = 1) = qp + pq (toss 1 is a tail, toss 2 is a head or toss 1 is head, toss 2 is a tail) = 2 pq we don't care which of the tosses is a head so 2 two heads: P(m = 2) = p there are two outcomes that give one head l Note: P(m=0)+P(m=1)+P(m=2)=q 2+ qp + pq +p 2= (p+q)2 = 1 (as it should!) We want the probability distribution P(m, N, p) where: m = number of success (e. g. number of heads in a coin toss) N = number of trials (e. g. number of coin tosses) p = probability for a success (e. g. 0. 5 for a head) R. Kass/Sp 12 P 416 Lecture 2 2

l l If we look at the three choices for the coin flip example, each term is of the form: Cmpmq. N-m m = 0, 1, 2, N = 2 for our example, q = 1 - p always! coefficient Cm takes into account the number of ways an outcome can occur without regard to order. for m = 0 or 2 there is only one way for the outcome (both tosses give heads or tails): C 0 = C 2 = 1 for m = 1 (one head, two tosses) there are two ways that this can occur: C 1 = 2. Binomial coefficients: number of ways of taking N things m at time 0! = 1, 2! = 1· 2 = 2, 3! = 1· 2· 3 = 6, m! = 1· 2· 3···m Order of occurrence is not important u e. g. 2 tosses, one head case (m = 1) n we don't care if toss 1 produced the head or if toss 2 produced the head Unordered groups such as our example are called combinations Ordered arrangements are called permutations For N distinguishable objects, if we want to group them m at a time, the number of permutations: u u example: If we tossed a coin twice (N = 2), there are two ways for getting one head (m = 1) example: Suppose we have 3 balls, one white, one red, and one blue. n Number of possible pairs we could have, keeping track of order is 6 (rw, wr, rb, br, wb, bw): n If order is not important (rw = wr), then the binomial formula gives number of “two color” combinations R. Kass/Sp 12 P 416 Lecture 2 3

l Binomial distribution: the probability of m success out of N trials: p is probability of a success and q = 1 - p is probability of a failure l To show that the binomial distribution is properly normalized, use Binomial Theorem: Þ binomial distribution is properly normalized R. Kass/Sp 12 P 416 Lecture 2 4

Mean of binomial distribution: A cute way of evaluating the above sum is to take the derivative: Variance of binomial distribution (obtained using similar trick): R. Kass/Sp 12 P 416 Lecture 2 5

Example: Suppose you observed m special events (success) in a sample of N events u The measured probability (“efficiency”) for a special event to occur is: What is the error (standard deviation) on the probability ("error on the efficiency"): we will derive this later in the course The sample size (N) should be as large as possible to reduce the uncertainty in the probability measurement. Let’s relate the above result to Lab 2 where we throw darts to measure the value of p. If we inscribe a circle inside a square with side=s then the ratio of the area of the circle to the rectangle is: r s So, if we throw darts at random at our rectangle then the probability (ºe) of a dart landing inside the circle is just the ratio of the two areas, p/4. The we can determine p using: The error in p is related to the error in e by: p= 4 e. We can estimate how well we can measure p by this method by assuming that e=p/4= (3. 14159…)/4: This formula “says” that to improve our estimate of p by a factor of 10 we have to throw 100 (N) times as many darts! Clearly, this is an inefficient way to determine p. R. Kass/Sp 12 P 416 Lecture 2 6

Example: Suppose a baseball player's batting average is 0. 300 (3 for 10 on average). u Consider the case where the player either gets a hit or makes an out (forget about walks here!). prob. for a hit: p = 0. 30 prob. for "no hit”: q = 1 - p = 0. 7 u On average how many hits does the player get in 100 at bats? = Np = 100· 0. 30 = 30 hits u What's the standard deviation for the number of hits in 100 at bats? = (Npq)1/2 = (100· 0. 30· 0. 7)1/2 ≈ 4. 6 hits we expect ≈ 30 ± 5 hits per 100 at bats u Consider a game where the player bats 4 times: probability of 0/4 = (0. 7)4 = 24% Pete Rose’s lifetime batting average: 0. 303 probability of 1/4 = [4!/(3!1!)](0. 3)1(0. 7)3 = 41% probability of 2/4 = [4!/(2!2!)](0. 3)2(0. 7)2 = 26% probability of 3/4 = [4!/(1!3!)](0. 3)3(0. 7)1 = 8% probability of 4/4 = [4!/(0!4!)](0. 3)4(0. 7)0 = 1% probability of getting at least one hit = 1 - P(0) = 1 -0. 24=76% R. Kass/Sp 12 P 416 Lecture 2 7

Poisson Probability Distribution l l The Poisson distribution is a widely used discrete probability distribution. Consider the following conditions: p is very small and approaches 0 u example: a 100 sided dice instead of a 6 sided dice, p = 1/100 instead of 1/6 u example: a 1000 sided dice, p = 1/1000 Siméon Denis Poisson N is very large and approaches ∞ June 21, 1781 -April 25, 1840 example: throwing 100 or 1000 dice instead of 2 dice uradioactive decay The product Np is finite unumber of Prussian soldiers kicked Example: radioactive decay to death by horses per year ! uquality control, failure rate predictions Suppose we have 25 mg of an element very large number of atoms: N ≈ 1020 (avogadro’s number is large!) Suppose the lifetime of this element t = 1012 years ≈ 5 x 1019 seconds probability of a given nucleus to decay in one second is very small: p = 1/t = 2 x 10 -20/sec BUT Np = 2/sec finite! The number of decays in a time interval is a Poisson process. Poisson distribution can be derived by taking the appropriate limits of the binomial distribution N>>m R. Kass/Sp 12 P 416 Lecture 2 8

m is always an integer ≥ 0 u does not have to be an integer It is easy to show that: = Np = mean of a Poisson distribution 2 = Np = = variance of a Poisson distribution Radioactivity example with an average of 2 decays/sec: i) What’s the probability of zero decays in one second? u l The mean and variance of a Poisson distribution are the same number! ii) What’s the probability of more than one decay in one second? iii) Estimate the most probable number of decays/sec? u R. Kass/Sp 12 To solve this problem its convenient to maximize ln. P(m, ) instead of P(m, ). P 416 Lecture 2 9

u In order to handle the factorial when take the derivative we use Stirling's Approximation: ln 10!=15. 10 ln 50!=148. 48 10 ln 10 -10=13. 03 ® 14% 50 ln 50 -50=145. 60® 1. 9% The most probable value for m is just the average of the distribution u This is only approximate since Stirlings Approximation is only valid for large m. u Strictly speaking m can only take on integer values while is not restricted to be an integer. If you observed m events in a “counting” experiment, the error on m is R. Kass/Sp 12 P 416 Lecture 2 10

Comparison of Binomial and Poisson distributions with mean m = 1 Not much difference between them! N N For N large and m fixed: Binomial Þ Poisson R. Kass/Sp 12 P 416 Lecture 2 11

Uniform distribution and Random Numbers What is a uniform probability distribution: p(x)? p(x)=constant (c) for a £ x £b p(x)=zero everywhere else Therefore p(x 1)dx 1= p(x 2)dx 2 if dx 1=dx 2 Þ equal intervals give equal probabilities For a uniform distribution with a=0, b=1 we have p(x)=1 1 p(x) What is a random number generator ? 0 A number picked at random from a uniform distribution with limits [0, 1] x 1 All major computer languages (C, C++) come with a random number generator. In C++ rand() gives a random number. The following C++ program generates 5 random numbers: #include<iostream> #include<math. h> #include<stdlib. h> // compile and link using: c++ -o abc. C -lm with source code named abc. C int main() { // program to calculate and print out some random numbers int K, points; srand(1); //initial random number generator points=4; for(K=0; K<points; K++) If we generate “a lot” of random numbers all equal intervals should contain the same amount of numbers. For example: generate: 106 random numbers expect: 105 numbers [0. 0, 0. 1] 105 numbers [0. 45, 0. 55] { std: : cout<<" K "<<K<<" random number "<<rand()/(1. 0*RAND_MAX)<<"n"; } K= 0 random number 0. 840188 return 0; K= 1 random number 0. 394383 } K= 2 random number 0. 783099 K= 3 random number 0. 79844 K= 4 random number 0. 911647 R. Kass/Sp 12 P 416 Lecture 2 12

A C++ program to throw dice #include<iostream> #include<cstdio> #include<math. h> #include<stdlib. h> // compile and link on unix using: cxx -o abc. C -lm // where abc. C is the file that contains the source code int main() { // program to roll dice int numroll; // number of rolls int K, dice; float roll[7]; //keeps track of how often a 1, 2. . 6 occurs float prob; // probability of rolling a 1, 2, 3. . 6 // srand(1); //initialize random number generator std: : cout<<"give number of rolls of the dice "<<"n"; std: : cin>>numroll; std: : cout<<"number of rolls of the dice "<< numroll<<"n"; for(K=0; K<=6; K++) { roll[K]=0; } for(K=0; K<numroll; K++) { // % is the modulus operator dice =1+rand()%6; roll[dice]++; // cout<<"dice "<<dice 1<<"n"; } for(K=1; K<=6; K++) { prob=roll[K]/numroll; std: : cout<<"roll of dice "<<K<<" number "<<roll[K]<<" prob "<<prob<<"n"; } return 0; } R. Kass/Sp 12 P 416 Lecture 2 13