PROBABILITY DISTRIBUTION: BINOMIAL, POISSON AND NORMAL
• PROBABILITY DISTRIBUTION • BINOMIAL & POISSON • NORMAL
BINOMIAL AND POISSON
WHAT IS BINOMIAL DISTRIBUTION? A binomial random variable is the number of successes x in n repeated trials of a binomial experiment. The probability distribution of a binomial random variable is called a binomial distribution. Example: Suppose we flip a coin two times and count the number of heads (successes). The number of heads, which can be 0, 1 or 2, is a binomial random variable.
CONDITIONS
• 1: The number of observations n is fixed.
• 2: Each observation is independent.
• 3: Each observation represents one of two outcomes ("success" or "failure").
• 4: The probability of "success" p is the same for each observation.
Properties
A binomial experiment is a statistical experiment that has the following properties:
• The experiment consists of n repeated trials.
• Each trial can result in just two possible outcomes. We call one of these outcomes a success and the other a failure.
• The probability of success, denoted by p, is the same on every trial.
• The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.
Consider the following statistical experiment. You flip a coin 2 times and count the number of times the coin lands on heads. This is a binomial experiment because:
• The experiment consists of repeated trials. We flip a coin 2 times.
• Each trial can result in just two possible outcomes: heads or tails.
• The probability of success is constant (0.5) on every trial.
• The trials are independent; that is, getting heads on one trial does not affect whether we get heads on other trials.
Application
• The formula commonly used for binomial probabilities is also known as the formula for Bernoulli trials (a single trial, n = 1, is a Bernoulli trial). Let n be the number of trials attempted and k the number of successes to be attained in those n trials. The number of failures is then n − k. If s is the probability of succeeding in a trial, the probability of failure is 1 − s.
• The formula for the probability of k successes in n trials is:
• P(k successes in n trials) = C(n, k) × s^k × (1 − s)^(n − k)
• C(n, k) is called the binomial coefficient. It is this coefficient that gives the distribution its name.
• The factorial of a natural number m is the product of all natural numbers from m down to 1: m! = m × (m − 1) × … × 2 × 1. Hence C(n, k) is evaluated as: C(n, k) = n! / (k! (n − k)!)
• Writing f(k, n, s) for the probability above, f(k, n, s) = f(n − k, n, 1 − s). This identity holds for every k and is convenient when k > n/2.
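The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `binomial_pmf` is an assumed, hypothetical name, and `math.comb` supplies C(n, k).

```python
from math import comb


def binomial_pmf(k: int, n: int, s: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each with success probability s (hypothetical helper name)."""
    return comb(n, k) * s**k * (1 - s) ** (n - k)


# Two coin flips, counting heads: probabilities of 0, 1 and 2 heads.
two_flips = [binomial_pmf(k, 2, 0.5) for k in range(3)]
print(two_flips)

# Symmetry identity: k successes at probability s is the same event
# as n - k "successes" at probability 1 - s.
left = binomial_pmf(15, 20, 0.25)
right = binomial_pmf(5, 20, 0.75)
print(left, right)
```

The two printed values in the last line agree, which is the identity f(k, n, s) = f(n − k, n, 1 − s) stated above.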
EXAMPLES
• A couple of binomial distribution problems are given below.
• Example 1: A test consists of 20 MCQs (multiple-choice questions), each with four options of which only one is correct. Determine the probability that a person taking the test answers exactly 5 questions wrong.
• Solution:
• Here n = 20, n − k = 5, k = 20 − 5 = 15.
• The probability of success = probability of giving a right answer = s = 1/4. Hence the probability of failure = probability of giving a wrong answer = 1 − s = 1 − 1/4 = 3/4.
• Substituting these values in the binomial formula (note that C(20, 15) = C(20, 5)):
• P(exactly 5 out of 20 answers incorrect) = C(20, 5) × (1/4)^15 × (3/4)^5
• = [(20 × 19 × 18 × 17 × 16) / (5 × 4 × 3 × 2 × 1)] × (1/4)^15 × (3/4)^5
• = 0.0000034 (approximately)
• Thus the required probability is approximately 0.0000034.
• Example 2: A five-sided die marked A to E is rolled 50 times. Find the probability of getting a "D" exactly 5 times.
• Solution:
• Here n = 50, k = 5, n − k = 45.
• The probability of success = probability of getting a "D" = s = 1/5.
• Hence the probability of failure = probability of not getting a "D" = 1 − s = 4/5.
• Substituting in the binomial formula: P(exactly 5 D's in 50 rolls) = C(50, 5) × (1/5)^5 × (4/5)^45 ≈ 0.0295.
• Example:
• A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?
• Success = "a head is flipped on a single coin"
• p = 0.5, q = 0.5, n = 10, x = 6
• P(x = 6) = 10C6 × 0.5^6 × 0.5^4 = 210 × 0.015625 × 0.0625 = 0.205078125
• Example:
• Find the mean, variance, and standard deviation for the number of sixes that appear when rolling 30 dice.
• Success = "a six is rolled on a single die". p = 1/6, q = 5/6.
• The mean is np = 30 × (1/6) = 5. The variance is npq = 30 × (1/6) × (5/6) = 25/6. The standard deviation is the square root of the variance = 2.041241452 (approx.)
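The mean, variance and standard deviation used above follow directly from n and p. The sketch below is illustrative only; `binomial_summary` is a hypothetical helper name, not something defined in the slides.

```python
from math import sqrt


def binomial_summary(n: int, p: float) -> tuple[float, float, float]:
    """Return (mean, variance, standard deviation) of a binomial
    distribution with n trials and success probability p."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)


# Number of sixes when rolling 30 dice.
mean, variance, sd = binomial_summary(30, 1 / 6)
print(round(mean, 4), round(variance, 4), round(sd, 4))
```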
• Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?
• Solution: This is a binomial experiment in which the number of trials is equal to 5, the number of successes is equal to 2, and the probability of success on a single trial is 1/6 or about 0.167. Therefore, the binomial probability is:
• b(2; 5, 0.167) = 5C2 × (0.167)^2 × (0.833)^3
• b(2; 5, 0.167) = 0.161
• What is the probability of obtaining 45 or fewer heads in 100 tosses of a coin?
• Solution: To solve this problem, we compute 46 individual probabilities (for x = 0 to x = 45) using the binomial formula. The sum of all these probabilities is the answer we seek. Thus,
• b(x ≤ 45; 100, 0.5) = b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + ... + b(x = 45; 100, 0.5)
• b(x ≤ 45; 100, 0.5) = 0.184
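Summing 46 terms by hand is tedious, so a short loop is a natural way to check the cumulative result. The sketch below assumes nothing beyond the binomial formula; `binomial_cdf` is a hypothetical helper name.

```python
from math import comb


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for a binomial variable: sum of the individual
    probabilities for 0, 1, ..., x successes."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


# 45 or fewer heads in 100 tosses of a fair coin.
prob = binomial_cdf(45, 100, 0.5)
print(round(prob, 3))
```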
• POISSON DISTRIBUTION
MEANING
• In probability theory and statistics, the Poisson distribution (French pronunciation [pwasɔ̃]; in English usually /ˈpwɑːsɒn/), named after French mathematician Siméon Denis Poisson, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event. [1] The Poisson distribution can also be used for the number of events in other specified intervals such as distance, area or volume.
• For instance, an individual keeping track of the amount of mail they receive each day may notice that they receive an average of 4 letters per day. If receiving any particular piece of mail doesn't affect the arrival times of future pieces of mail, i.e., if pieces of mail from a wide range of sources arrive independently of one another, then a reasonable assumption is that the number of pieces of mail received per day obeys a Poisson distribution. [2] Other examples that may follow a Poisson distribution: the number of phone calls received by a call center per hour, or the number of decay events per second from a radioactive source.
• Characteristics of a Poisson Distribution • The experiment consists of counting the number of events that will occur during a specific interval of time or in a specific distance, area, or volume. • The probability that an event occurs in a given time, distance, area, or volume is the same. • Each event is independent of all other events. For example, the number of people who arrive in the first hour is independent of the number who arrive in any other hour.
Applications
• the number of deaths by horse kick in the Prussian army (first application)
• birth defects and genetic mutations
• rare diseases (like leukemia, but not AIDS, because it is infectious and so not independent), especially in legal cases
• car accidents
• traffic flow and ideal gap distance
• number of typing errors on a page
• hairs found in McDonald's hamburgers
• spread of an endangered animal in Africa
• failure of a machine in one month
• 1. Calculate the Poisson probability when λ (average rate of success) is 3 and X (Poisson random variable) is 6. Substituting these values in the Poisson formula f(x) = λ^x × e^(−λ) / x! gives f = 3^6 × e^(−3) / 6! = 729 × 0.0498 / 720 = 0.0504
• 2. A manufacturer of television sets knows that on average 5% of its product is defective. It sells television sets in consignments of 100 and guarantees that not more than 2 sets will be defective. What is the probability that a consignment will fail to meet the guaranteed quality? (Given: e^(−5) = 0.0067.)
Solution: Success = the TV is defective; X = number of successes; p = probability of success = 5% = 0.05; n = 100; λ = np = 100 × 0.05 = 5.
The Poisson distribution is P(X = x) = e^(−λ) × λ^x / x!, for x = 0, 1, 2, 3, ...
The guarantee holds when X is not more than 2, that is X = 0, 1, 2. The consignment fails when X > 2:
P(X > 2) = 1 − [P(0) + P(1) + P(2)] = 1 − e^(−5) × [1 + 5 + 25/2] = 1 − e^(−5) × (37/2) = 1 − (0.0067 × 37/2) = 1 − 0.12395 = 0.87605
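Both Poisson problems above can be checked with a small function. This is a hedged sketch rather than the slides' own method; `poisson_pmf` is a hypothetical name, and the code uses the full-precision value of e^(−5) instead of the rounded 0.0067.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with average rate lam."""
    return exp(-lam) * lam**x / factorial(x)


# Problem 1: lambda = 3, X = 6.
p_six = poisson_pmf(6, 3)
print(round(p_six, 4))

# Problem 2: the consignment fails when more than 2 sets are defective,
# so take the complement of P(0) + P(1) + P(2) with lambda = 5.
p_fail = 1 - sum(poisson_pmf(x, 5) for x in range(3))
print(round(p_fail, 4))
```

With unrounded e^(−5) the second result comes out near 0.8753; the slide's 0.87605 differs only because it rounds e^(−5) to 0.0067 before multiplying.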
NORMAL DISTRIBUTION
MEANING
• Meaning of normal distribution or law of error
• The binomial and Poisson distributions are discrete probability distributions, whereas the normal probability distribution, commonly called the normal distribution, is a theoretical distribution for a continuous variable. The normal distribution was first discovered by the French-born mathematician Abraham de Moivre, working in England, in 1733. It was later rediscovered by Carl Friedrich Gauss in 1809 and by Laplace in 1812. The normal distribution is also called the Gaussian distribution or Gaussian law of error, as this theory describes the accidental errors of measurement.
Example of normal distribution curve
• [Figure: example of a normal distribution (bell) curve]
• A graphical representation of a normal distribution is sometimes called a bell curve because of its flared shape. The precise shape can vary according to the distribution of the population, but the peak is always in the middle and the curve is always symmetrical. In a normal distribution, the mean, mode and median are all the same.
• Properties of normal distribution
1) The normal curve is bell shaped in appearance.
2) There is one maximum point of the normal curve, which occurs at the mean.
3) As it has only one maximum point, it is unimodal.
4) In the binomial and Poisson distributions the variable is discrete, while in the normal distribution it is continuous.
5) Here mean = median = mode.
6) The total area under the normal curve is 1. The area to the left of the mean and the area to the right of the mean are each 0.5.
7) No portion of the curve lies below the x axis, so probability can never be negative.
8) The curve approaches the x axis asymptotically and is supposed to meet it only at infinity.
9) Here mean deviation = 4/5 standard deviation (approximately).
10) Quartile deviation = 5/6 mean deviation (approximately).
11) Quartile deviation : mean deviation : standard deviation = 10 : 12 : 15
12) 4 standard deviation = 5 mean deviation = 6 quartile deviation
These are the properties of the normal distribution.
• Importance of normal distribution
• 1) It underlies one of the most important results in statistics, the central limit theorem. The central limit theorem relates the shape of the population distribution to the shape of the sampling distribution of the mean: the sampling distribution of the mean approaches normal as the sample size increases.
• 2) When the sample size is large, the normal distribution serves as a good approximation.
• 3) Due to its mathematical properties it is popular and easy to calculate with.
• 4) It is used in statistical quality control for setting up control limits.
• 5) The whole theory of the t, F and chi-square sample tests is based on the normal distribution.
Standard normal distribution: How to Find Probability (Steps)
Step 1: Draw a bell curve and shade in the area that is asked for in the question. The example here is z > 0.8. That means you are looking for the probability that z is greater than 0.8, so you need to draw a vertical line at 0.8 standard deviations from the mean and shade everything that is greater than that number. [Figure: bell curve with the area z > 0.8 shaded]
Step 2: Look up the z value in a standard normal (z) table, choosing the table or diagram that matches your shaded graph. The area read from the table is the probability.
Tip: Step 1 is technically optional, but it is always a good idea to sketch a graph when you are trying to answer probability word problems. That is because most mistakes happen not because you can't do the math or read a z table, but because you subtract where you should add (i.e., you imagine the probability under the curve in the wrong direction). A sketch helps you cement in your head exactly what you are looking for.
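As a cross-check on the table lookup in Step 2, the shaded area for z > 0.8 can also be computed. The sketch below uses Python's standard-library `statistics.NormalDist`; using software instead of a printed z table is an assumption of this example, not something the slides prescribe.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

# The table gives the area to the LEFT of a z value, so the shaded
# right-hand tail is the total area (1) minus that left-hand area.
area_left = z.cdf(0.8)
area_right = 1 - area_left
print(round(area_left, 4), round(area_right, 4))
```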
Characteristics
• Some of the major characteristics of the normal probability curve (NPC) are as follows:
• 1. The curve is bilaterally symmetrical:
• The curve is symmetrical about the ordinate at its central point. This means the size, shape and slope of the curve on one side are identical to those on the other side. If the curve is bisected, its right-hand side completely matches its left-hand side.
• 2. The curve is asymptotic:
• The Normal Probability Curve approaches the horizontal axis and extends from −∞ to +∞. This means the extreme ends of the curve tend toward the base line but never touch it.
• 3. The mean, median and mode:
• The mean, median and mode fall at the middle point and are numerically equal.
• 4. The points of inflection occur at ±1 standard deviation unit:
• The points of inflection in an NPC occur at ±1σ, that is, one standard deviation unit above and below the mean. At these points the curve changes from convex to concave in relation to the horizontal axis.
• 5. The total area of the NPC is divided into ± standard deviations:
• The total area of the NPC is divided into six standard deviation units. From the center it is divided into three positive standard deviation units and three negative standard deviation units.
• Each band within ±3σ contains a different share of the cases. Between ±1σ lie the middle 2/3 of cases, or 68.26%; between ±2σ lie 95.44% of cases; between ±3σ lie 99.73% of cases; and only 0.27% of cases fall beyond ±3σ.
• 6. The Y ordinate represents the height of the Normal Probability Curve:
• The Y ordinate of the NPC represents the height of the curve. The maximum ordinate occurs at the center. The height of the curve at the mean or midpoint is denoted Y0.
• In order to determine the height of the curve at any point we use the following formula:
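The formula itself did not survive from the original slide. Assuming the slide intended the standard normal density (an assumption, not a quotation), the height y at a score x, for a curve of unit total area with mean µ and standard deviation σ, can be written as:

```latex
y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}},
\qquad
Y_0 = y\big|_{x=\mu} = \frac{1}{\sigma\sqrt{2\pi}}
```

Textbooks that plot frequencies rather than probabilities scale this height by the number of cases.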
• 7. It is unimodal:
• The curve has only one peak, because the maximum frequency occurs at only one point.
• 8. The height of the curve declines symmetrically:
• The height of the curve declines symmetrically in both directions from the central point. This means the ordinates at M + σ and M − σ are equal, as are the ordinates at any two points equally distant from the mean.
• 9. The mean of the NPC is µ and the standard deviation is σ:
• As the mean of the NPC represents the population mean, it is denoted by µ (mu). The standard deviation of the curve is denoted by the Greek letter σ.
• 10. In the Normal Probability Curve the standard deviation is about 50% larger than Q:
• In the NPC, Q is generally called the probable error or PE.
• The relationship between PE and σ can be stated as follows:
• 1 PE = 0.6745σ
• 1σ = 1.4826 PE
• 11. Q can be used as a unit of measurement in determining the area within a given part.
• 12. The average deviation about the mean of the NPC is 0.798σ:
• There is a constant relationship between standard deviation and average deviation in an NPC.
• 13. The modal ordinate varies inversely with the standard deviation:
• In a Normal Probability Curve the modal ordinate varies inversely with the standard deviation. As the standard deviation of the Normal Probability Curve increases, the modal ordinate decreases, and vice versa.
• Example:
• Given a distribution of scores with a mean of 24 and σ of 8. Assuming normality, what percentage of the cases will fall between 16 and 32?
• Solution:
• First we convert both scores, 16 and 32, into standard scores: z = (16 − 24) / 8 = −1 and z = (32 − 24) / 8 = +1.
• Entering Table A, the table of areas under the NPC, we find that 34.13% of cases fall between the mean and −1σ and 34.13% of cases fall between the mean and +1σ. So ±1σ covers 68.26% of cases, and 68.26% of cases will fall between 16 and 32.
• Example:
• Given a distribution of scores with a mean of 40 and σ of 8. Assuming normality, what percentage of cases will lie above and below the score 36?
• Solution:
• First we convert the raw score 36 into a standard score: z = (36 − 40) / 8 = −0.5.
• Entering Table A, the table of areas under the NPC, we find that 19.15% of cases fall between the mean and −0.5σ. Therefore the total percentage of cases above the score 36 is 50 + 19.15 = 69.15%, and below the score 36 it is 50 − 19.15 = 30.85%. So in this distribution 69.15% of cases are above the score 36 and 30.85% of cases are below it.
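The two table lookups above can be reproduced numerically. The following is a sketch under the assumption that `statistics.NormalDist` from the Python standard library stands in for Table A; the variable names are illustrative.

```python
from statistics import NormalDist

# Scores with mean 24 and sigma 8: share of cases between 16 and 32.
first = NormalDist(mu=24, sigma=8)
between = first.cdf(32) - first.cdf(16)

# Scores with mean 40 and sigma 8: share of cases below and above 36.
second = NormalDist(mu=40, sigma=8)
below = second.cdf(36)
above = 1 - below

print(round(100 * between, 2), round(100 * below, 2), round(100 * above, 2))
```

The first figure comes out at about 68.27% rather than 68.26%, because the table value 34.13% is itself rounded before being doubled.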
• Example:
• Given a distribution of scores with a mean of 20 and σ of 5. If we assume normality, what limits will include the middle 75% of cases?
• Solution:
• In a normal distribution the middle 75% of cases consists of 37.5% of cases above the mean and 37.5% of cases below the mean. From Table A, 37.5% of cases corresponds to 1.15σ units. Therefore the middle 75% of cases lie within ±1.15σ of the mean, that is, within 20 ± (1.15 × 5) = 20 ± 5.75.
• So in this distribution the middle 75% of cases lie between the limits 14.25 and 25.75.
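Finding limits from a percentage is the reverse of a table lookup. One way to sketch it, assuming the standard-library `statistics.NormalDist` is acceptable in place of Table A, is with the inverse cumulative distribution function:

```python
from statistics import NormalDist

scores = NormalDist(mu=20, sigma=5)

# The middle 75% leaves 12.5% in each tail, so the limits are the
# 12.5th and 87.5th percentiles of the distribution.
lower = scores.inv_cdf(0.125)
upper = scores.inv_cdf(0.875)
print(round(lower, 2), round(upper, 2))
```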
• Example 2: Customers come into a bank with a variance of 36 per hour. Find the standard deviation of customer visits per hour using the Poisson distribution.
Solution: For a Poisson distribution, the expected value equals the variance: μ = variance = λ = 36 customers per hour. The standard deviation is σ = √λ (since the standard deviation is the square root of the variance), so σ = √36 = 6 customers per hour.
• Example 1: A man is able to complete 3 files a day on average. Find the probability that he completes 5 files the next day.
• Solution: This is a Poisson experiment with the following values:
• μ = 3, the average number of files completed per day
• x = 5, the number of files to be completed the next day
• e = 2.71828, a constant
• Substituting these values in the Poisson distribution formula gives the Poisson probability in this case:
• P(x; μ) = e^(−μ) × μ^x / x!, so P(5; 3) = (2.71828)^(−3) × 3^5 / 5! = 0.1008 approximately.
• Hence the probability that the person completes 5 files the next day is approximately 0.1008.
• The number of deals cracked by an agent per day is a Poisson random variable with mean 2. Given that each day is independent of the others, find the probability of 2 deals being cracked on the first day and 1 deal being cracked the next day.
Solution: The probability of getting 2 deals in a day is P(2; 2) and the probability of getting 1 deal is P(1; 2). Because the days are independent, the probabilities are multiplied:
P = P(2; 2) × P(1; 2) = (e^(−2) × 2^2 / 2!) × (e^(−2) × 2^1 / 1!) = 0.27067056647 × 0.27067056647 = 0.0732625556
• The probability that two deals are cracked on the first day and one deal on the second day is approximately 0.0733.
• The number of calls coming to the customer care center of a mobile company per minute is a Poisson random variable with mean 5. Find the probability that no call comes in a certain minute.
Solution: The mean value is m = 5. We need the probability of getting zero calls when 5 calls are expected every minute.
P(0; 5) = e^(−5) × 5^0 / 0! = 0.006737947
Hence, the probability of getting zero calls in a minute is 0.006737947.
• The mean number of occurrences of an event X is 2 per day. Find the probability that event X occurs three times in a day.
• Solution:
• Mean, m = 2
• Probability that the event occurs three times, P(3; 2) = e^(−2) × 2^3 / 3! = 0.1804470
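The four Poisson examples above (files, deals, calls, event X) can be checked together. This is an illustrative sketch; `poisson_pmf` is a hypothetical helper name, and the only modelling step is that independent days multiply.

```python
from math import exp, factorial


def poisson_pmf(x: int, mean: float) -> float:
    """P(x; mean): probability of exactly x events when the average is mean."""
    return exp(-mean) * mean**x / factorial(x)


p_files = poisson_pmf(5, 3)                      # 5 files when the average is 3
p_deals = poisson_pmf(2, 2) * poisson_pmf(1, 2)  # independent days: multiply
p_no_call = poisson_pmf(0, 5)                    # no calls in a minute
p_thrice = poisson_pmf(3, 2)                     # event X three times in a day

print(round(p_files, 4), round(p_deals, 4), round(p_no_call, 6), round(p_thrice, 4))
```

Adding the two daily probabilities instead of multiplying them would answer a different question and can exceed what either day allows on its own, which is why the product is used for "2 deals on day one and 1 deal on day two".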
• X is a normally distributed variable with mean μ = 30 and standard deviation σ = 4. Find a) P(x < 40) b) P(x > 21) c) P(30 < x < 35)
Note: What is meant here by area is the area under the standard normal curve.
a) For x = 40, z = (40 − 30) / 4 = 2.5. Hence P(x < 40) = P(z < 2.5) = [area to the left of 2.5] = 0.9938
b) For x = 21, z = (21 − 30) / 4 = −2.25. Hence P(x > 21) = P(z > −2.25) = [total area] − [area to the left of −2.25] = 1 − 0.0122 = 0.9878
c) For x = 30, z = (30 − 30) / 4 = 0, and for x = 35, z = (35 − 30) / 4 = 1.25. Hence P(30 < x < 35) = P(0 < z < 1.25) = [area to the left of z = 1.25] − [area to the left of z = 0] = 0.8944 − 0.5 = 0.3944
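The three parts above follow one pattern: convert to z, then combine areas to the left. A minimal sketch, assuming `statistics.NormalDist` from the Python standard library replaces the printed table:

```python
from statistics import NormalDist

x = NormalDist(mu=30, sigma=4)

# z score for x = 40, computed the same way as in part a).
z_40 = (40 - x.mean) / x.stdev

p_a = x.cdf(40)              # a) area to the left of x = 40
p_b = 1 - x.cdf(21)          # b) total area minus area to the left of 21
p_c = x.cdf(35) - x.cdf(30)  # c) difference of two left-hand areas

print(z_40, round(p_a, 4), round(p_b, 4), round(p_c, 4))
```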
• A radar unit is used to measure speeds of cars on a motorway. The speeds are normally distributed with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is the probability that a car picked at random is travelling at more than 100 km/hr?
• Let x be the random variable that represents the speed of cars. x has μ = 90 and σ = 10. We have to find the probability that x is higher than 100, or P(x > 100). For x = 100, z = (100 − 90) / 10 = 1. P(x > 100) = P(z > 1) = [total area] − [area to the left of z = 1] = 1 − 0.8413 = 0.1587. The probability that a car selected at random has a speed greater than 100 km/hr is equal to 0.1587.
• For a certain type of computer, the length of time between charges of the battery is normally distributed with a mean of 50 hours and a standard deviation of 15 hours. John owns one of these computers and wants to know the probability that the length of time will be between 50 and 70 hours.
• Let x be the random variable that represents the length of time. It has a mean of 50 and a standard deviation of 15. We have to find the probability that x is between 50 and 70, or P(50 < x < 70). For x = 50, z = (50 − 50) / 15 = 0. For x = 70, z = (70 − 50) / 15 = 1.33 (rounded to 2 decimal places). P(50 < x < 70) = P(0 < z < 1.33) = [area to the left of z = 1.33] − [area to the left of z = 0] = 0.9082 − 0.5 = 0.4082. The probability that John's computer has a length of time between 50 and 70 hours is equal to 0.4082.
• Entry to a certain University is determined by a national test. The scores on this test are normally distributed with a mean of 500 and a standard deviation of 100. Tom wants to be admitted to this university and he knows that he must score better than at least 70% of the students who took the test. Tom takes the test and scores 585. Will he be admitted to this university?
• Let x be the random variable that represents the scores. x is normally distributed with a mean of 500 and a standard deviation of 100. The total area under the normal curve represents the total number of students who took the test. If we multiply the values of the areas under the curve by 100, we obtain percentages. For x = 585, z = (585 − 500) / 100 = 0.85. The proportion P of students who scored below 585 is given by P = [area to the left of z = 0.85] = 0.8023 = 80.23%. Tom scored better than 80.23% of the students who took the test, and he will be admitted to this university.
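Tom's case can be checked in two equivalent ways: compare his percentile with 70%, or compare his score with the score at the 70th percentile. The sketch below assumes the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=500, sigma=100)

# Share of students scoring below Tom's 585.
share_below = scores.cdf(585)
admitted = share_below > 0.70

# Equivalent view: the score that exactly 70% of students fall below.
cutoff = scores.inv_cdf(0.70)

print(round(100 * share_below, 2), admitted, round(cutoff, 1))
```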
• The length of similar components produced by a company are approximated by a normal distribution model with a mean of 5 cm and a standard deviation of 0. 02 cm. If a component is chosen at random a) what is the probability that the length of this component is between 4. 98 and 5. 02 cm? b) what is the probability that the length of this component is between 4. 96 and 5. 04 cm?
• a) P(4.98 < x < 5.02) = P(−1 < z < 1) = 0.6826 b) P(4.96 < x < 5.04) = P(−2 < z < 2) = 0.9544
• The length of life of an instrument produced by a machine has a normal distribution with a mean of 12 months and a standard deviation of 2 months. Find the probability that an instrument produced by this machine will last a) less than 7 months. b) between 7 and 12 months.
• a) P(x < 7) = P(z < −2.5) = 0.0062 b) P(7 < x < 12) = P(−2.5 < z < 0) = 0.4938
• The time taken to assemble a car in a certain plant is a random variable having a normal distribution with a mean of 20 hours and a standard deviation of 2 hours. What is the probability that a car can be assembled at this plant in a period of time a) less than 19.5 hours? b) between 20 and 22 hours?
• A large group of students took a test in Physics and the final grades have a mean of 70 and a standard deviation of 10. If we can approximate the distribution of these grades by a normal distribution, what percent of the students a) scored higher than 80? b) should pass the test (grades ≥ 60)?
• a) P(x < 19.5) = P(z < −0.25) = 0.4013 b) P(20 < x < 22) = P(0 < z < 1) = 0.3413
• a) For x = 80, z = 1. The area to the right of (higher than) z = 1 is equal to 0.1587, so 15.87% scored more than 80. b) For x = 60, z = −1. The area to the right of z = −1 is equal to 0.8413, so 84.13% should pass the test. c) 100% − 84.13% = 15.87% should fail the test.
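The last two exercises differ only in which side of the curve is open. One possible way to capture that is a helper with optional bounds; `prob_between` is a hypothetical name introduced for this sketch, built on the standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist


def prob_between(mu, sigma, low=None, high=None):
    """Area under the normal curve between low and high.
    A bound left as None means that side is open (extends to infinity)."""
    dist = NormalDist(mu, sigma)
    lower_area = 0.0 if low is None else dist.cdf(low)
    upper_area = 1.0 if high is None else dist.cdf(high)
    return upper_area - lower_area


assembly_a = prob_between(20, 2, high=19.5)       # less than 19.5 hours
assembly_b = prob_between(20, 2, low=20, high=22)  # between 20 and 22 hours
grades_a = prob_between(70, 10, low=80)            # scored higher than 80
grades_b = prob_between(70, 10, low=60)            # grades of 60 or more

print(round(assembly_a, 4), round(assembly_b, 4), round(grades_a, 4), round(grades_b, 4))
```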
Thank you!