CHAPTER 8 Estimating with Confidence
8.3 Estimating a Population Mean

  • Slides: 36

Estimating a Population Mean
✓ STATE and CHECK the Random, 10%, and Normal/Large Sample conditions for constructing a confidence interval for a population mean.
✓ EXPLAIN how the t distributions are different from the standard Normal distribution and why it is necessary to use a t distribution when calculating a confidence interval for a population mean.
✓ DETERMINE critical values for calculating a C% confidence interval for a population mean using a table or technology.
✓ CONSTRUCT and INTERPRET a confidence interval for a population mean.
✓ DETERMINE the sample size required to obtain a C% confidence interval for a population mean with a specified margin of error.
The Practice of Statistics, 5th Edition

When σ Is Unknown: The t Distributions
When estimating a population mean, as with proportions, we usually don't know σ, so we estimate it using the sample standard deviation sₓ. What happens when we standardize with sₓ in place of σ? We will explore this distribution with a little calculator bingo!

When σ is unknown
The sample standard deviation s provides an estimate of the population standard deviation σ.
• When the sample size is large, the sample is likely to contain elements representative of the whole population. Then s is a good estimate of σ.
• But when the sample size is small, the sample contains only a few individuals. Then s is a less reliable estimate of σ.
[Figure: population distribution, large sample, small sample]

When σ Is Unknown: The t Distributions
When we standardize based on the sample standard deviation sₓ, our statistic has a new distribution called a t distribution. It was developed by William Gosset, an employee of the Guinness Brewing Company in Dublin, Ireland. Also called Student's t distribution, it forms a whole family of related distributions that depend on a parameter known as degrees of freedom:
DEGREES OF FREEDOM = Sample Size - 1
There is a different t distribution for each sample size, specified by its DEGREES OF FREEDOM (df).

William Sealy Gosset
It was while he was working at the Guinness Brewery that he devised Student's t-test. In his work with barley research, Gosset had to draw conclusions from relatively small samples. He introduced a new mathematical approach to the subject, which laid the foundations for later small-sample statistical methods.

When σ Is Unknown: The t Distributions
When we standardize based on the sample standard deviation sₓ, our statistic has a new distribution called a t distribution.
✓ It is symmetric with a single peak at 0.
✓ However, it has more area in the tails, so its shape differs slightly from that of the standard Normal curve.
There is a different t distribution for each sample size, specified by its degrees of freedom (df).

The t Distributions; Degrees of Freedom
When comparing the density curves of the standard Normal distribution and t distributions, several facts are apparent:
✓ The density curves of the t distributions are similar in shape to the standard Normal curve.
✓ The spread of the t distributions is a bit greater than that of the standard Normal distribution.
✓ The t distributions have more probability in the tails and less in the center than does the standard Normal.
✓ As the degrees of freedom increase, the t density curve approaches the standard Normal curve ever more closely.
We can use Table B in the back of the book to determine critical values t* for t distributions with different degrees of freedom.
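The tail behavior described above can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes the standard formula for the t density and integrates it with Simpson's rule (the helper names `t_pdf` and `t_upper_tail` are illustrative), then compares the area above 2 for several degrees of freedom with the standard Normal area.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(x, df, steps=2000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x] and symmetry about 0."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


normal_tail = 1 - NormalDist().cdf(2.0)
tails = {df: t_upper_tail(2.0, df) for df in (2, 5, 30, 100)}

for df, area in tails.items():
    print(f"df = {df:>3}: P(T > 2) = {area:.4f}")
print(f"Normal  : P(Z > 2) = {normal_tail:.4f}")
```

The printed areas should shrink toward the Normal value as df grows, which is the "approaches the standard Normal curve" fact in numerical form.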

What are the Assumptions and Conditions?
Gosset found the t-model by simulation. Years later, when Sir Ronald A. Fisher showed mathematically that Gosset was right, he needed to make some assumptions to make the proof work. We will use these assumptions when working with Student's t.

Assumptions and Conditions
Independence Assumption:
• Independence Assumption: The data values should be independent.
• 10% Condition: When a sample is drawn without replacement, the sample should be no more than 10% of the population.
Randomization Condition:
• The data arise from a random sample or suitably randomized experiment. Randomly sampled data (particularly from an SRS) are ideal.
Normal Population Assumption:
• The population has a Normal distribution, or the sample size is large (n ≥ 30) so we can use the CLT.
If we are not certain that the data are from a population that follows a Normal model, we should check the
• Nearly Normal Condition: The data come from a distribution that is unimodal and symmetric.
• Check this condition by making a histogram or Normal probability plot.

Recall: Working with a Normal Probability Plot
We use a Normal probability plot to check the data for Normality and for potential outliers. In this plot, the points fall roughly along a straight line and no potential outliers appear.

The t Distributions; Degrees of Freedom
When we perform inference about a population mean µ using a t distribution, the appropriate degrees of freedom are found by subtracting 1 from the sample size n, making df = n - 1. We will write the t distribution with n - 1 degrees of freedom as tₙ₋₁.
Draw an SRS of size n from a large population that has a Normal distribution with mean µ and standard deviation σ. The statistic t = (x̄ - µ) / (sₓ/√n) has the t distribution with degrees of freedom df = n - 1. The statistic will have approximately a tₙ₋₁ distribution as long as the sampling distribution of x̄ is close to Normal.
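A small simulation illustrates why the statistic standardized with the sample standard deviation does not follow the standard Normal distribution. This is a sketch under stated assumptions rather than anything from the slides: the population is taken to be Normal with µ = 0 and σ = 1, the sample size is n = 5, and the function name and seed are arbitrary choices.

```python
import math
import random


def simulate_t_statistics(n, reps, mu=0.0, sigma=1.0, seed=1):
    """Draw `reps` samples of size n from a Normal population and
    return (xbar - mu) / (s / sqrt(n)) for each sample."""
    rng = random.Random(seed)
    results = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        results.append((xbar - mu) / (s / math.sqrt(n)))
    return results


t_values = simulate_t_statistics(n=5, reps=20000)

# For a standard Normal statistic, about 5% of values fall beyond +/-1.96.
frac_beyond = sum(abs(t) > 1.96 for t in t_values) / len(t_values)
print(f"Fraction of simulated statistics beyond +/-1.96: {frac_beyond:.3f}")
```

With df = 4 the fraction comes out well above 5%, reflecting the extra area in the tails; repeating the run with a larger n should move it closer to 5%.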

Example: Using Table B to Find Critical t* Values
Problem: What critical value t* from Table B should be used in constructing a confidence interval for the population mean in each of the following settings?
(a) A 95% confidence interval based on an SRS of size n = 12.
Solution: In Table B, we consult the row corresponding to df = 12 - 1 = 11. We move across that row to the entry that is directly above the 95% confidence level at the bottom of the table. The desired critical value is t* = 2.201.

Example: Using Table B to Find Critical t* Values
Problem: What critical value t* from Table B should be used in constructing a confidence interval for the population mean in each of the following settings?
(b) A 90% confidence interval from a random sample of 48 observations.
Solution: With 48 observations, we want to find the t* critical value for df = 48 - 1 = 47 and 90% confidence. There is no df = 47 row in Table B, so we use the more conservative df = 40. The corresponding critical value is t* = 1.684.
Upper-tail probability p
df    .10     .05     .025    .02
30    1.310   1.697   2.042   2.147
40    1.303   1.684   2.021   2.123
50    1.299   1.676   2.009   2.109
z*    1.282   1.645   1.960   2.054
      80%     90%     95%     96%
Confidence level C
Can also use invT on the calculator! We need the upper t* value, and invT(p, df) takes p = the area under the curve in the left-hand tail. See page 514 for keystroke instructions.

How to find t*
• Use Table B for t distributions
• Look up the confidence level at the bottom and df on the side
• df = n - 1
• Or use invT on the calculator
Find these t*:
90% confidence when n = 5: t* = 2.132
95% confidence when n = 15: t* = 2.145
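For readers without Table B or a calculator at hand, the lookup can be approximated in software. The sketch below is an illustration only, not the calculator's algorithm: it assumes the standard t density, integrates it with Simpson's rule, and finds t* by bisection. The names `t_cdf` and `t_star` are hypothetical, and the search range of 0 to 50 assumes df ≥ 2 and ordinary confidence levels.

```python
import math


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, by Simpson's rule on the t density over [0, x]."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return coef * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def t_star(confidence, df):
    """Critical value with `confidence` of the area between -t* and t*."""
    target = (1 + confidence) / 2  # area to the left of t*
    lo, hi = 0.0, 50.0             # assumed search range
    for _ in range(60):            # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for conf, n in ((0.95, 12), (0.90, 5), (0.95, 15), (0.90, 48)):
    print(f"{conf:.0%} confidence, n = {n}: t* = {t_star(conf, n - 1):.3f}")
```

For n = 48 the computed value for df = 47 should fall between the Table B entries for df = 50 and df = 40, which is why using the df = 40 row is the conservative choice.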

Conditions for Estimating µ
As with proportions, you should check some important conditions before constructing a confidence interval for a population mean.
Conditions For Constructing A Confidence Interval About A Mean
• Random: The data come from a well-designed random sample or randomized experiment.
   o 10%: When sampling without replacement, check that the sample is no more than 10% of the population (n ≤ 0.10N).
• Normal/Large Sample: The population has a Normal distribution or the sample size is large (n ≥ 30). If the population distribution has unknown shape and n < 30, use a graph of the sample data to assess the Normality of the population. Do not use t procedures if the graph shows strong skewness or outliers.

Constructing a Confidence Interval for µ
To construct a confidence interval for µ,
✓ Estimate the unknown σ with the sample standard deviation sₓ, so that the standard deviation of x̄ is replaced by its standard error sₓ/√n.
✓ Use critical values from the t distribution with n - 1 degrees of freedom in place of the z critical values.
That is, the interval is x̄ ± t*·(sₓ/√n).

One-Sample t Interval for a Population Mean
The one-sample t interval for a population mean is similar in both reasoning and computational detail to the one-sample z interval for a population proportion.
One-Sample t Interval for a Population Mean
When the conditions are met, a C% confidence interval for the unknown mean µ is
x̄ ± t*·(sₓ/√n)
where t* is the critical value for the tₙ₋₁ distribution with C% of its area between -t* and t*.

Example: A one-sample t interval for µ
Environmentalists, government officials, and vehicle manufacturers are all interested in studying the auto exhaust emissions produced by motor vehicles. The major pollutants in auto exhaust from gasoline engines are hydrocarbons, carbon monoxide, and nitrogen oxides (NOX). Researchers collected data on the NOX levels (in grams/mile) for a random sample of 40 light-duty engines of the same type. The mean NOX reading was 1.2675 and the standard deviation was 0.3332.
Problem: (a) Construct and interpret a 95% confidence interval for the mean amount of NOX emitted by light-duty engines of this type.

Example: Constructing a confidence interval for µ
State: We want to estimate the true mean amount µ of NOX emitted by all light-duty engines of this type at a 95% confidence level.
Plan: If the conditions are met, we should use a one-sample t interval to estimate µ.
• Random: The data come from a "random sample" of 40 engines from the population of all light-duty engines of this type.
   o 10%: We are sampling without replacement, so we need to assume that there are at least 10(40) = 400 light-duty engines of this type.
• Large Sample: We don't know if the population distribution of NOX emissions is Normal. Because the sample size is large, n = 40 > 30, we should be safe using a t distribution.

Example: Constructing a confidence interval for µ
Do: From the information given, x̄ = 1.2675 g/mi and sₓ = 0.3332 g/mi. To find the critical value t*, we use the t distribution with df = 40 - 1 = 39. Unfortunately, there is no row corresponding to 39 degrees of freedom in Table B. We can't pretend we have a larger sample size than we actually do, so we use the more conservative df = 30. From Table B, t* = 2.042.

Example: Constructing a confidence interval for µ
x̄ ± t*·(sₓ/√n) = 1.2675 ± 2.042·(0.3332/√40) = 1.2675 ± 0.1076 = (1.1599, 1.3751)
Conclude: We are 95% confident that the interval from 1.1599 to 1.3751 grams/mile captures the true mean level of nitrogen oxides emitted by this type of light-duty engine.
Once again, although you must write out the formula, the calculator has a function (TInterval, in the STAT > TESTS menu) that will generate a t interval.
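The arithmetic in the NOX example can be reproduced from the summary statistics. This is a minimal sketch, assuming the conservative Table B critical value t* = 2.042 for df = 30 at 95% confidence; the function name `t_interval` is illustrative.

```python
import math


def t_interval(xbar, s, n, t_star):
    """One-sample t interval: xbar +/- t* * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Summary statistics from the NOX example; t* = 2.042 (df = 30, 95%).
low, high = t_interval(xbar=1.2675, s=0.3332, n=40, t_star=2.042)
print(f"95% CI for mean NOX: ({low:.4f}, {high:.4f}) grams/mile")
```

Passing t* in as an argument keeps the table lookup (or invT call) separate from the interval arithmetic, so the same function works with a conservative table value or a technology value.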

Example: Video Screen Tension
A manufacturer of high-resolution video terminals must control the tension on the mesh of fine wires that lies behind the surface of the viewing screen. Here are the tension readings from a random sample of 20 screens from a single day's production:
269.5 297.0 269.6 283.3 304.8 280.4 233.5 257.4 317.5 327.4
264.7 307.7 310.0 343.3 328.1 342.6 338.8 340.1 374.6 336.1
Construct and interpret a 90% confidence interval for the mean tension µ of all the screens produced on this day.
STATE: We want to estimate the true mean tension µ of all the video terminals produced this day at a 90% confidence level.
PLAN: If the conditions are met, we can use a one-sample t interval to estimate µ.

Example: Video Screen Tension
PLAN: If the conditions are met, we can use a one-sample t interval to estimate µ.
Random: We are told that the data come from a random sample of 20 screens from the population of all screens produced that day.
Normal: Since the sample size is small (n < 30), we must check whether it's reasonable to believe that the population distribution is Normal. Examine the distribution of the sample data. Graphs of the sample data give no reason to doubt the Normality of the population.
Independent: Because we are sampling without replacement, we must check the 10% condition: we must assume that at least 10(20) = 200 video terminals were produced this day.

Example: Video Screen Tension
DO: Using our calculator, we find that the mean and standard deviation of the 20 screens in the sample are x̄ = 306.32 mV and sₓ = 36.21 mV. Since n = 20, we use the t distribution with df = 19 to find the critical value. From Table B, we find t* = 1.729.
Upper-tail probability p
df    .10     .05     .025
18    1.330   1.734   2.101
19    1.328   1.729   2.093
20    1.325   1.725   2.086
      80%     90%     95%
Confidence level C
Therefore, the 90% confidence interval for µ is:
x̄ ± t*·(sₓ/√n) = 306.32 ± 1.729·(36.21/√20) = 306.32 ± 14.00 = (292.32, 320.32)
CONCLUDE: We are 90% confident that the interval from 292.32 to 320.32 mV captures the true mean tension in the entire batch of video terminals produced that day.
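When the raw data are available, the whole DO step can be carried out in a few lines. The sketch below uses the 20 tension readings from the example and the Table B value t* = 1.729 for df = 19; the variable names are illustrative.

```python
import math
import statistics

# Tension readings (mV) for the random sample of 20 screens.
tensions = [
    269.5, 297.0, 269.6, 283.3, 304.8, 280.4, 233.5, 257.4, 317.5, 327.4,
    264.7, 307.7, 310.0, 343.3, 328.1, 342.6, 338.8, 340.1, 374.6, 336.1,
]

n = len(tensions)
xbar = statistics.mean(tensions)
s = statistics.stdev(tensions)      # sample standard deviation (divides by n - 1)
t_star = 1.729                      # Table B, df = 19, 90% confidence

margin = t_star * s / math.sqrt(n)
low, high = xbar - margin, xbar + margin
print(f"n = {n}, mean = {xbar:.2f} mV, s = {s:.2f} mV")
print(f"90% CI: ({low:.2f}, {high:.2f}) mV")
```

Note that `statistics.stdev` computes the sample standard deviation, which is the quantity the t interval requires; `statistics.pstdev` would divide by n instead and give an interval that is slightly too narrow.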

TECHNOLOGY CORNER: ONE-SAMPLE t INTERVALS FOR µ ON THE CALCULATOR
Confidence intervals for a population mean using t distributions can be constructed on the TI-83/84 and TI-89, thus avoiding the use of Table B. Here is a brief summary of the techniques when you have the actual data values and when you have only numerical summaries. You can find these instructions on page 521 in your textbook.

Using t Procedures Wisely
The stated confidence level of a one-sample t interval for µ is exactly correct when the population distribution is exactly Normal. No population of real data is exactly Normal. The usefulness of the t procedures in practice therefore depends on how strongly they are affected by lack of Normality.
Definition: An inference procedure is called robust if the probability calculations involved in the procedure remain fairly accurate when a condition for using the procedure is violated.
Fortunately, the t procedures are quite robust against non-Normality of the population except when outliers or strong skewness are present. Larger samples improve the accuracy of critical values from the t distributions when the population is not Normal.

Using t Procedures Wisely
Except in the case of small samples, the condition that the data come from a random sample or randomized experiment is more important than the condition that the population distribution is Normal. Here are practical guidelines for the Normal condition when performing inference about a population mean.
Using One-Sample t Procedures: The Normal Condition
• Sample size less than 15: Use t procedures if the data appear close to Normal (roughly symmetric, single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t.
• Sample size at least 15: The t procedures can be used except in the presence of outliers or strong skewness.
• Large samples: The t procedures can be used even for clearly skewed distributions when the sample is large, roughly n ≥ 30.

Consumer Reports tested 14 randomly selected brands of vanilla yogurt and found the following numbers of calories per serving:
160 200 220 230 120 180 140 130 170 190 80 120 100 170
Compute a 98% confidence interval for the average calorie content per serving of vanilla yogurt.
(126.16, 189.56)

A diet guide claims that you will get 120 calories from a serving of vanilla yogurt. What does this evidence indicate?
Since 120 calories is not contained within the 98% confidence interval, the evidence suggests that the average calories per serving does not equal 120 calories.
Note: confidence intervals tell us if something is NOT EQUAL, never less or greater than!
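The yogurt interval and the check of the diet guide's claim can be sketched as follows. One assumption is not in the slides: the critical value t* = 2.650 for df = 13 at 98% confidence is taken from a standard t table. Because the slide's answer appears to come from a calculator with an unrounded t*, the endpoints here may differ from it in the last decimal place.

```python
import math
import statistics

# Calories per serving for the 14 sampled brands of vanilla yogurt.
calories = [160, 200, 220, 230, 120, 180, 140, 130, 170, 190, 80, 120, 100, 170]

n = len(calories)
xbar = statistics.mean(calories)
s = statistics.stdev(calories)
t_star = 2.650                      # assumed table value: df = 13, 98% confidence

margin = t_star * s / math.sqrt(n)
low, high = xbar - margin, xbar + margin
print(f"98% CI: ({low:.2f}, {high:.2f}) calories")

# Is the diet guide's claimed value a plausible value for the mean?
claimed = 120
claim_is_plausible = low <= claimed <= high
print(f"Is {claimed} inside the interval? {claim_is_plausible}")
```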

Choosing the Sample Size
We determine a sample size for a desired margin of error when estimating a mean in much the same way we did when estimating a proportion.
Choosing Sample Size for a Desired Margin of Error When Estimating µ
To determine the sample size n that will yield a level C confidence interval for a population mean with a specified margin of error ME:
• Get a reasonable value for the population standard deviation σ from an earlier or pilot study.
• Find the critical value z* from a standard Normal curve for confidence level C.
• Set the expression for the margin of error to be less than or equal to ME and solve for n: z*·(σ/√n) ≤ ME

Example: Determining sample size from margin of error Researchers would like to estimate the mean cholesterol level µ of a particular variety of monkey that is often used in laboratory experiments. They would like their estimate to be within 1 milligram per deciliter (mg/dl) of the true value of µ at a 95% confidence level. A previous study involving this variety of monkey suggests that the standard deviation of cholesterol level is about 5 mg/dl. Problem: Obtaining monkeys is time-consuming and expensive, so the researchers want to know the minimum number of monkeys they will need to generate a satisfactory estimate.

Example: Determining sample size from margin of error
Solution: For 95% confidence, z* = 1.96. We will use σ = 5 as our best guess for the standard deviation of the monkeys' cholesterol level. Set the expression for the margin of error to be at most 1 and solve for n:
1.96·(5/√n) ≤ 1, so √n ≥ (1.96)(5) = 9.8 and n ≥ 9.8² = 96.04
Because 96 monkeys would give a slightly larger margin of error than desired, the researchers would need 97 monkeys to estimate the cholesterol levels to their satisfaction.
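The sample size calculation can be written as a short function. This is a minimal sketch: the function name is illustrative, and z* is computed with Python's `statistics.NormalDist` rather than read from a table, so it uses a slightly more precise value than 1.96.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)


# Monkey cholesterol example: sigma = 5 mg/dl, ME = 1 mg/dl, 95% confidence.
n_needed = sample_size_for_mean(sigma=5, margin=1, confidence=0.95)
print(f"Monkeys needed: {n_needed}")
```

Rounding up with `math.ceil` matters here: rounding 96.04 to the nearest whole number would give 96 and a margin of error slightly above the target.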
