STATISTICS 200 Lecture 16 Thursday October 13 2016

Textbook: Sections 9.3, 9.4, 10.1, 10.2
Objectives:
• Define standard error, and relate it to both standard deviation and sampling distribution ideas.
• Describe the sampling distribution of a sample proportion.
• Reformulate the confidence interval formula using the general idea of estimate plus/minus (multiplier × standard error).
• Interpret confidence level as a relative frequency.
• Calculate values of the multiplier for confidence levels other than 95%.

We now begin a strong focus on inference
• Proportions: one population proportion (this week); two population proportions
• Means: one population mean; difference between means; mean difference

Motivation
Eventual goal: Use statistical inference to answer the question “What is the percentage of Creamery customers who prefer chocolate ice cream over vanilla?”
Strategy: Get a random sample of 90 individuals and ask them this question. Use the answers to perform a hypothesis test to answer the question.

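Comparison of binomial-based statistics
• Count of successes (Chapter 8): notation X; mean = np; st. dev. = square root of [np(1 - p)]
• Proportion of successes (Chapter 9 and beyond): notation p-hat = X/n; mean = p; st. dev. = square root of [p(1 - p)/n]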

Binomial distribution vs. approximate p-hat sampling distribution: n = 100 and p = 0.70
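The comparison can be checked numerically. The sketch below uses the slide's n = 100 and p = 0.70; the helper names and the interval from 0.60 to 0.80 are illustrative choices, not part of the lecture.

```python
import math

n, p = 100, 0.70  # values from the slide

# Count of successes X ~ Binomial(n, p)
count_mean = n * p
count_sd = math.sqrt(n * p * (1 - p))

# Sample proportion p-hat = X / n
phat_mean = p
phat_sd = math.sqrt(p * (1 - p) / n)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x, mean, sd):
    """P(Z <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


# P(0.60 < p-hat <= 0.80): exact binomial vs. normal approximation
exact = binom_cdf(80, n, p) - binom_cdf(60, n, p)
approx = normal_cdf(0.80, phat_mean, phat_sd) - normal_cdf(0.60, phat_mean, phat_sd)

print(f"count: mean = {count_mean:.1f}, sd = {count_sd:.3f}")
print(f"p-hat: mean = {phat_mean:.2f}, sd = {phat_sd:.4f}")
print(f"exact = {exact:.4f}, normal approx = {approx:.4f}")
```

Dividing the count by n rescales both the mean and the standard deviation by 1/n, which is why the two distributions have the same shape.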

A better confidence interval
OLD: Conservative margin of error: ME = 1/(square root of n)
NEW: ME = (multiplier) × (standard error)
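A small sketch comparing the two margins of error. It assumes the conservative formula is 1/sqrt(n), as in the earlier chapters, and uses n = 90 from the Creamery example; the sample proportion of 0.60 is hypothetical.

```python
import math

n = 90          # sample size from the Creamery example
p_hat = 0.60    # hypothetical sample proportion, for illustration only
multiplier = 2  # approximate multiplier for 95% confidence

# Assumed form of the older, conservative margin of error
conservative_me = 1 / math.sqrt(n)

# New margin of error: multiplier times standard error
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
new_me = multiplier * standard_error

print(f"conservative ME = {conservative_me:.4f}")
print(f"new ME          = {new_me:.4f}")
```

With a multiplier of 2, the new margin of error is never larger than the conservative one, and the two agree when p-hat = 0.5.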

New formula for margin of error: ME = (multiplier) × (standard error)
Multiplier (z*):
• Related to the Empirical Rule.
• Expresses the level of confidence that the interval includes the parameter.
Standard error:
• Estimate of the standard deviation of the sampling distribution of p-hat.

z*-multiplier
• Use when the normal approximation is appropriate, i.e. when n*p > 10 and n*(1 - p) > 10.
Confidence level: Multiplier (z*)
90%: 1.65
95%: 1.96 (≈ 2)
98%: 2.33
99%: 2.58
The z*-multiplier for a 68% confidence level would be 1, because we must go 1 standard deviation from the mean to capture 68% of the area.
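Multipliers for other confidence levels can be computed from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the function name `z_multiplier` is an illustrative choice.

```python
from statistics import NormalDist


def z_multiplier(confidence):
    """Return z* so that the central `confidence` area of N(0, 1) lies within ±z*."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.68, 0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z* = {z_multiplier(level):.3f}")
```

The computed values may differ from the table in the last digit because the table is rounded (for example, the 90% multiplier is about 1.645).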

Three factors affect the width of a confidence interval (textbook, page 382)
1. Level of confidence: higher confidence → larger z* → larger ME
2. Sample size: larger sample size → smaller ME
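The first two factors can be illustrated with a short calculation. The values p-hat = 0.5, the sample sizes, and the function name `margin_of_error` are hypothetical choices made for this sketch.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """ME = z* times the standard error of p-hat (normal approximation)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


# Factor 1: raise the confidence level, holding n = 100 fixed
me_by_level = {level: margin_of_error(0.5, 100, level) for level in (0.90, 0.95, 0.99)}

# Factor 2: raise the sample size, holding 95% confidence fixed
me_by_n = {n: margin_of_error(0.5, n) for n in (100, 400, 1600)}

for level, me in me_by_level.items():
    print(f"confidence {level:.0%}: ME = {me:.4f}")
for n, me in me_by_n.items():
    print(f"n = {n}: ME = {me:.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the margin of error.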

The scatterplot shows the variation is… A. largest when p-hat = 1.0 B. largest when p-hat = 0.5 C. largest when p-hat = 0.25 D. smallest when p-hat = 0.9 E. smallest when p-hat = 0.2

Factor 3: Value of p-hat impacts width of C.I.
At a given level of confidence and sample size, the confidence interval is widest when p-hat equals 0.5, and it becomes narrower as p-hat moves away from 0.5 in either direction.
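A quick numerical check of this factor, assuming a hypothetical sample size of 100 and the 95% multiplier of 1.96:

```python
import math

n = 100         # hypothetical sample size
z_star = 1.96   # multiplier for 95% confidence
p_hats = [0.1, 0.25, 0.5, 0.75, 0.9]

# Full width of the interval is 2 * ME = 2 * z* * sqrt(p-hat(1 - p-hat)/n)
widths = {p: 2 * z_star * math.sqrt(p * (1 - p) / n) for p in p_hats}
widest = max(widths, key=widths.get)

for p, w in widths.items():
    print(f"p-hat = {p:.2f}: width = {w:.4f}")
print(f"widest at p-hat = {widest}")
```

The width depends on p-hat only through the product p-hat(1 - p-hat), which is symmetric around 0.5 and peaks there.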

Confidence intervals: population proportion
• Conservative method: Chapters 1 & 5
• Normal approximation: Chapter 10
• Exact (binomial): When normal conditions aren’t met, use this option. Need a computer to calculate the interval. Does not include an M.E.
Minitab: provides both options (pages 389 & 390 in the textbook)
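To show why the exact option needs a computer, here is a minimal sketch of one common exact construction, the Clopper-Pearson interval, solved by bisection. The slides do not say which exact method Minitab uses, so treat the method, the function names, and the 3-out-of-20 sample as assumptions.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, iters=60):
    """Bisection on [0, 1] for f(p) = target, where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid   # f is still too large, so p must increase
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, confidence=0.95):
    """Clopper-Pearson interval for a proportion with x successes in n trials."""
    alpha = 1 - confidence
    # Lower limit: the p at which P(X >= x) equals alpha / 2
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha / 2
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# Hypothetical small sample where n * p-hat = 3, so the normal condition fails
low, high = exact_ci(3, 20)
print(f"exact 95% CI for 3/20: ({low:.3f}, {high:.3f})")
```

The resulting interval is not symmetric around p-hat, which is why it cannot be written as estimate plus/minus a single margin of error.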

Binomial distributions
[Figure panels: n fixed at 10, p increasing; p fixed at 0.02, n increasing]
Values of n and p determine whether the binomial is normal in shape.
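The rule of thumb from the z*-multiplier slide (n*p > 10 and n*(1 - p) > 10) can be applied to the two scenarios in the figure. The function name and the specific values of p and n tried below are illustrative.

```python
def normal_approx_ok(n, p, threshold=10):
    """True when both n*p and n*(1 - p) exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold


# n fixed at 10, p increasing: n*p can never exceed 10
fixed_n = {p: normal_approx_ok(10, p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}

# p fixed at 0.02, n increasing: the condition holds only for large n
fixed_p = {n: normal_approx_ok(n, 0.02) for n in (10, 100, 1000, 10000)}

print(fixed_n)
print(fixed_p)
```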

What does it mean to be 95% confident?
• Before the sample is drawn: We can say that P(conf. int. contains the true parameter) = 0.95.
• After the sample is drawn: There is no more randomness! (Both the CI and the parameter are now fixed.) So we cannot talk of “probability” any longer.

Interpreting 95% confidence: An example
Suppose we have a sample of 200 students in STAT 100 and find that 28 of them are left-handed. Our sample proportion is 28/200 = 0.14. We now find the ME and construct a 95% CI.

Find the standard error: That is, estimate the standard deviation of the sample proportion based on a sample of size 200: S.E. = square root of [0.14(1 - 0.14)/200] ≈ 0.025. Hence, z* times the standard error = 2 × 0.025 = 0.05. On the following two slides, we'll pretend that the true population proportion is 0.12.
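The arithmetic for this example can be reproduced in a few lines. The variable names are illustrative; the multiplier of 2 is the rounded 95% value used on the slide.

```python
import math

successes, n = 28, 200   # left-handed students out of 200
multiplier = 2           # rounded 95% multiplier used on the slide

p_hat = successes / n
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = multiplier * standard_error
interval = (p_hat - margin, p_hat + margin)

print(f"p-hat = {p_hat:.2f}")
print(f"S.E.  = {standard_error:.4f}")
print(f"ME    = {margin:.4f}")
print(f"95% CI: ({interval[0]:.3f}, {interval[1]:.3f})")
```

The slide rounds the standard error to 0.025 before multiplying, so its margin of error of 0.05 is slightly larger than the unrounded value computed here.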

The green curve is the true distribution of p-hat. Of course, ordinarily we don't know where it lies, but at least we know its approximate standard deviation. Thus, we can build a confidence interval around our 14% estimate (in red). If we take another sample, the red line will move but the green curve will not!

If we repeat the sampling over and over, about 95% of our confidence intervals will contain the true proportion of 0.12. This is why we use the term "95% confidence interval".
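The repeated-sampling idea can be simulated. This sketch pretends, as the slides do, that the true proportion is 0.12 and draws many samples of size 200; the number of repetitions, the seed, and the function name are arbitrary choices.

```python
import math
import random


def coverage(true_p=0.12, n=200, reps=2000, multiplier=2, seed=0):
    """Fraction of simulated confidence intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and count the successes
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = multiplier * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / reps


rate = coverage()
print(f"share of intervals containing 0.12: {rate:.3f}")
```

The share should land near 0.95 but not exactly on it, both because of simulation noise and because the normal approximation is only approximate.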

Definition of "95% confidence interval for the true population proportion": An interval of values computed from a sample that will cover the true but unknown population proportion for 95% of the possible samples.
To find a 95% CI:
• The center is at p-hat.
• The margin of error is 2 times the S.E., where…
• …the S.E. is the square root of [p-hat(1 - p-hat)/n].

What does it mean to be 95% confident? A. There is a 95% probability that the one interval that I calculated contains the true value for the parameter. B. If I get 100 such intervals, about 95 of them will contain the true value for the parameter. C. The sample estimate has a 95% chance of being inside the calculated interval. D. The p-value has a 95% chance of being inside the interval.

If you understand today’s lecture… 9.25, 9.33, 9.35, 9.37, 10.1, 10.3, 10.7, 10.9, 10.11, 10.13, 10.15, 10.19, 10.21, 10.23, 10.25, 10.27, 10.33, 10.45