The Statistical Imagination • Chapter 7. Using Probability Theory to Produce Sampling Distributions © 2008 McGraw-Hill Higher Education
Sampling Error for a Particular Sample • Sampling error is the difference between the calculated value of a sample statistic and the true value of the population parameter • E.g., suppose the mean GPA on campus is 2.60 and a sample reveals a mean of 2.80. The 0.20 difference is sampling error
Estimating the Parameters of a Population • Point estimate – a statistic provided without indicating a range of error • Point estimates are limited because a calculation made for sample data is only an estimate of a population parameter. This is apparent when different results are found with repeated sampling
Repeated Sampling • Repeated sampling refers to the procedure of drawing a sample and computing its statistic, and then drawing a second sample, a third, a fourth, and so on • Repeated sampling reveals the nature of sampling error • An illustration of repeated sampling is presented in Figure 7-1 in the text
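
The repeated-sampling procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 10,000 GPAs is simulated (mean near 2.60, standard deviation near 0.50, values that are not from the text), and the helper name `repeated_sample_means` is hypothetical.

```python
import random
import statistics


def repeated_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample's mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]


# Hypothetical population of GPAs, clipped to the 0.0-4.0 range (assumed values).
pop_rng = random.Random(42)
population = [min(4.0, max(0.0, pop_rng.gauss(2.6, 0.5))) for _ in range(10_000)]
mu = statistics.fmean(population)  # the population parameter

# Draw a first sample, a second, a third, and so on.
means = repeated_sample_means(population, n=50, num_samples=5)
for i, m in enumerate(means, start=1):
    # Sampling error = sample statistic minus population parameter.
    print(f"sample {i}: mean = {m:.3f}, sampling error = {m - mu:+.3f}")
```

Each sample yields a slightly different mean, so each carries a different sampling error.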
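Symbols • Sample statistics are usually denoted with English letters (e.g., X-bar for a sample mean, s for a sample standard deviation) • Population parameters are usually denoted with Greek letters (e.g., μ for a population mean, σ for a population standard deviation)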
What Repeated Sampling Reveals 1. A given sample’s statistic will be slightly off from the true value of its population’s parameter due to sampling error 2. Sampling error is patterned, systematic, and predictable 3. Sampling variability is mathematically predictable from probability curves called sampling distributions 4. The larger the sample size, the smaller the range of error
A Sampling Distribution • A mathematical description of all possible sampling event outcomes and the probability of each one • Sampling distributions are obtained from repeated sampling • Many sampling distributions can be displayed as probability curves; partitioning (Chapter 6) tells us the probability of occurrence of any sample outcome
A Sampling Distribution of Means • A sampling distribution of means describes all possible sampling event outcomes and the probability of each outcome when means are calculated for an infinite number of repeatedly drawn samples • It answers the question: What would happen if we repeatedly sampled a population using a sample size of n, calculated each sample mean, and plotted it on a histogram?
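
The question posed above can be explored by simulation. The sketch below is an assumption-laden illustration, not the text's own procedure: it uses a simulated GPA population, 2,000 samples as a finite stand-in for "an infinite number of samples", and a crude text histogram in place of a plotted one.

```python
import random
import statistics
from collections import Counter

rng = random.Random(7)

# Hypothetical population of 10,000 GPAs (assumed values, not from the text).
population = [min(4.0, max(0.0, rng.gauss(2.6, 0.5))) for _ in range(10_000)]
mu = statistics.fmean(population)

n = 50               # size of each sample
num_samples = 2_000  # finite stand-in for an infinite number of samples

sample_means = [
    statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)
]

# Group the sample means into bins 0.05 wide, labelled by bin center,
# and print one '#' per 10 sample means in each bin.
bin_width = 0.05
counts = Counter(round(m / bin_width) for m in sample_means)
for b in sorted(counts):
    print(f"{b * bin_width:4.2f} | {'#' * (counts[b] // 10)}")

mean_of_means = statistics.fmean(sample_means)
print(f"population mean = {mu:.3f}, mean of sample means = {mean_of_means:.3f}")
```

The bars should pile up around the population mean and thin out on either side.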
Features of a Sampling Distribution of Means • A sampling distribution of means is illustrated in the text in Figure 7-3. It reveals that for an interval/ratio variable, means calculated from a repeatedly sampled population take similar values that cluster around the value of the population mean • Simply put: Sample means center on the value of the population parameter
The Normal Curve as a Sampling Distribution • When repeatedly sampling means for sample sizes greater than 121 cases, a histogram plot of the resulting means will fit the normal curve • The X-axis of a sampling distribution of means consists of values of X-bar (sample means) • As with any normal curve, probabilities may be calculated for specific values on the X-axis
The Standard Error • The standard error is the standard deviation of a sampling distribution • It measures the spread of sampling error that occurs when a population is sampled repeatedly • Rather than sampling repeatedly, we estimate the standard error from the standard deviation of a single sample
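
To make the last point concrete, the sketch below compares two quantities: the standard deviation of many simulated sample means, and the estimate obtained from a single sample. It assumes the usual estimate for the standard error of the mean, the sample standard deviation divided by the square root of n. The population and all variable names are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical population with standard deviation near 0.50 (assumed values).
population = [rng.gauss(2.6, 0.5) for _ in range(20_000)]
n = 100

# Long way: sample repeatedly and take the standard deviation of the means.
means = [statistics.fmean(rng.sample(population, n)) for _ in range(3_000)]
empirical_se = statistics.stdev(means)

# Short way: estimate the standard error from one sample, as s / sqrt(n).
one_sample = rng.sample(population, n)
estimated_se = statistics.stdev(one_sample) / math.sqrt(n)

print(f"standard deviation of 3,000 sample means: {empirical_se:.4f}")
print(f"estimate from a single sample:            {estimated_se:.4f}")
```

Both figures should land near 0.05 (that is, 0.50 divided by the square root of 100), which is why a single sample is enough in practice.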
The Law of Large Numbers • The law of large numbers states that the larger the sample size, the smaller the standard error of the sampling distribution • The relationship between sample size and sampling error is apparent in the formula for the standard error of the mean; a large n in the denominator produces a small quotient
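
The formula referred to above is not reproduced on the slide. Assuming the usual estimate of the standard error of the mean, and a hypothetical sample standard deviation of 0.50, a worked comparison of two sample sizes looks like this:

```latex
s_{\bar{X}} = \frac{s_X}{\sqrt{n}}
\qquad
n = 25:\ \frac{0.50}{\sqrt{25}} = 0.10
\qquad
n = 100:\ \frac{0.50}{\sqrt{100}} = 0.05
```

Because n enters through its square root, quadrupling the sample size halves the standard error.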
The Central Limit Theorem • The central limit theorem states that regardless of the shape of the raw score distribution of an interval/ratio variable, its sampling distribution: 1. will be normal when the sample size, n, is greater than 121 cases and 2. will center on the true population mean • This is illustrated in the text in Figure 7-8
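
Both claims can be checked informally by simulation. The sketch below assumes a strongly right-skewed (exponential) population, chosen only for illustration, and uses the standard normal-curve fact that about 95% of outcomes fall within 1.96 standard deviations of the mean. All names and values are hypothetical.

```python
import math
import random
import statistics

rng = random.Random(11)

# Hypothetical, strongly right-skewed population (exponential, mean near 10).
population = [rng.expovariate(0.1) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 150  # above the n > 121 threshold stated above
standard_error = sigma / math.sqrt(n)

sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(2_000)]
mean_of_means = statistics.fmean(sample_means)

# If the sampling distribution is close to normal, about 95% of the
# sample means should fall within 1.96 standard errors of mu.
within = sum(abs(m - mu) <= 1.96 * standard_error for m in sample_means)
share_within = within / len(sample_means)

print(f"population mean = {mu:.2f}, mean of sample means = {mean_of_means:.2f}")
print(f"share of sample means within 1.96 standard errors: {share_within:.3f}")
```

Even though the raw scores are heavily skewed, the sample means center on the population mean and their spread matches the normal curve closely.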
Sampling Distributions for Nominal Variables • A sampling distribution of proportions is normal in shape when n times the smaller of P and Q (where Q = 1 − P) is greater than 5 • The larger the sample size, the smaller the range of error
Features of a Sampling Distribution for Nominal Variables • The mean of a sampling distribution of proportions is equal to the probability of success (P) in the population • The standard error is estimated using the probabilities of success and failure in a sample
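
A minimal sketch of both rules for proportions follows. It assumes the usual estimate of the standard error of a proportion, the square root of (p × q / n), where p and q are the sample proportions of success and failure. The function names are hypothetical.

```python
import math


def proportion_standard_error(p_sample, n):
    """Estimate the standard error of a proportion as sqrt(p * q / n)."""
    q_sample = 1.0 - p_sample  # sample proportion of failures
    return math.sqrt(p_sample * q_sample / n)


def normal_approximation_ok(p, n):
    """Return True when n times the smaller of P and Q is greater than 5."""
    return min(p, 1.0 - p) * n > 5


# Hypothetical example: 50% success in a sample of 100 cases.
print(round(proportion_standard_error(0.5, 100), 4))
print(normal_approximation_ok(0.5, 20))    # 0.5 * 20 = 10
print(normal_approximation_ok(0.02, 100))  # 0.02 * 100 = 2
```

The last call shows why a rare outcome needs a larger sample before the normal curve can be used.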
Demystifying “Sampling Distribution” • Although we represent a sampling distribution using formulas and a probability curve, its occurrence is real • To truly grasp how down to earth sampling distributions are, generate them yourself by repeatedly sampling means and proportions
Keep Straight the Assorted Symbols • Take care to distinguish population from sample from sampling distribution • Keep straight the symbols for each of these entities • See Figure 7-8 in the text
Statistical Follies • An appreciation of sampling distributions is a key part of understanding statistics • Poor understanding of sampling distributions leads the statistically unimaginative person to treat point estimates as though they are true values of a population’s parameters • Remember: A second sample will produce a different point estimate