Chapter 10 Estimating Proportions with Confidence Copyright © 2011 Brooks/Cole, Cengage Learning


Principal Idea
Confidence interval: an interval of estimates that is likely to capture the population value.
Based on our sample, we are 95% confident that somewhere between 33% and 39% of all Americans suffer from allergies. What does this mean?
Objective: learn how to calculate and interpret a confidence interval estimate of a population proportion.

10.1 CI Module 0: An Overview of Confidence Intervals
Lesson 1: The Basic Idea of a CI
Confidence interval: an interval of values computed from sample data that is likely to include the unknown value of a population parameter.
Curiosity and Confidence Intervals
• If you suffer from allergies, you might wonder how much company you have. The parameter we may wish to estimate is p = proportion of the population that suffers from allergies.

Population Parameters and Sample Statistics
• Population: the entire collection of units about which we would like information, or the entire collection of measurements we would have if we could measure the whole population.
• Population parameter: a fixed summary number associated with a population. In the context of confidence intervals, it is unknown and we want to estimate it. E.g., p is the proportion of the population with a particular characteristic.

Population Parameters and Sample Statistics
• Sample: the collection of units we will actually measure, or the collection of measurements we will actually obtain.
• Sample size: the number of units or measurements in the sample, denoted by n.
• Sample statistic: a summary number computed from a sample that is used to estimate the corresponding population parameter. E.g., $\hat{p}$ is the proportion of a sample with a particular characteristic.
• Sample estimate and point estimate are synonyms for a sample statistic.

More Language and Notation of Estimation
The Fundamental Rule for Using Data for Inference: available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.

Concept of CI as an Interval Estimate
• Interval estimate is used as a synonym for confidence interval. Even though it is an interval of values, it estimates one single, fixed population value.
• A confidence interval is always accompanied by a confidence level, which tells us how likely it is that the interval estimate actually contains the true value of the parameter we are estimating.

Example 10.1 Teens and Interracial Dating
1997 USA Today/Gallup Poll of teenagers across the country: 57% of the 496 teens who go out on dates say they’ve been out with someone of another race or ethnic group.
We are 95% confident that the percentage of all U.S. teens who date and who have dated someone of another race or ethnic group is between 52% and 62%.

Interpreting the Confidence Level
• The confidence level is the probability that the procedure used to determine the interval will provide an interval that includes the population parameter.
• If we consider all possible randomly selected samples of the same size from a population, the confidence level is the fraction (or percent) of those samples for which the confidence interval includes the population parameter.
Note: We often express the confidence level as a percent. Common levels are 90%, 95%, 98%, and 99%.
Be careful: The confidence level only expresses how often the procedure works in the long run. Any one specific interval either does or does not include the true unknown population value.
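The long-run reading of the confidence level can be checked with a small simulation. The sketch below is illustrative only: the function name `coverage_rate`, the true proportion p = 0.36, the sample size, the number of repetitions, and the seed are all assumptions chosen for the demonstration, and the interval uses the rounded multiplier 2.

```python
import math
import random

def coverage_rate(p, n, reps, multiplier=2.0, seed=0):
    """Fraction of simulated samples whose interval captures the true p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Draw one random sample of size n and count the "yes" answers.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - multiplier * se < p < p_hat + multiplier * se:
            hits += 1
    return hits / reps

rate = coverage_rate(p=0.36, n=883, reps=1000)
print(f"Intervals that captured p: {rate:.1%}")
```

The printed fraction should land near 95%. Each individual simulated interval either captured p or missed it; only the procedure has a 95% success rate.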

Lesson 2: Computing Confidence Intervals for the Big Five Parameters
• One population proportion, $p$
• Difference in two population proportions for independent samples, $p_1 - p_2$
• One population mean, $\mu$
• Population mean of paired differences, $\mu_d$
• Difference in two population means for independent samples, $\mu_1 - \mu_2$

Confidence Interval or Interval Estimate
Sample estimate ± Multiplier × Standard error
The multiplier is a number based on the desired confidence level, determined from the standard normal distribution (for proportions) or Student’s t-distribution (for means).

95% Confidence Interval for One Proportion
• Sample estimate: sample proportion $\hat{p}$
• Multiplier is 1.96, which we will round off to 2
• Standard error of $\hat{p}$ is $\text{s.e.}(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$
Interval: $\hat{p} \pm 2\,\text{s.e.}(\hat{p})$, that is, Sample estimate ± Margin of error

Example 10.2 Pollen Count High Today
Are you allergic to anything? 1998 Poll: 36% of 883 randomly selected American adults said “YES”
• Sample estimate: sample proportion $\hat{p}$ = .36
• Multiplier = 2 (for 95% confidence)
• Standard error is $\sqrt{.36(1-.36)/883}$ = .016
• Confidence Interval: .36 ± 2 × .016, or .328 to .392
• Interpretation: the interval, .328 to .392, estimates the proportion of all American adults who have an allergy.
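The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch, assuming the rounded multiplier 2 from the slides; the helper name `proportion_ci` is hypothetical and not part of the original material.

```python
import math

def proportion_ci(p_hat, n, multiplier=2.0):
    """Return (low, high, standard_error) for one sample proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = multiplier * se
    return p_hat - margin, p_hat + margin, se

# Allergy poll: 36% of 883 adults said yes.
low, high, se = proportion_ci(0.36, 883)
print(f"s.e. = {se:.3f}, interval = {low:.3f} to {high:.3f}")
```

Rounded to three decimals, this gives a standard error of .016 and an interval of .328 to .392.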

What Determines the Width?
1. The sample size, n. When sample size increases, margin of error decreases.
2. The confidence level. The multiplier is determined by the desired confidence level. Later you’ll learn: the larger the multiplier, the more confidence we can have. The price of more confidence is a wider interval.
3. The natural variability in individual units. More natural variability results in wider intervals. For proportions, as $\hat{p}$ gets closer to .5 the standard error gets larger. For means, natural variability is measured by sample standard deviations.
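The three factors can be seen by varying one input at a time in the margin of error for a proportion. The sketch below is illustrative; the function name `margin_of_error` and the particular values of n, the multiplier, and the sample proportion are assumptions picked for the comparison.

```python
import math

def margin_of_error(p_hat, n, multiplier=2.0):
    """Multiplier times the standard error of a sample proportion."""
    return multiplier * math.sqrt(p_hat * (1 - p_hat) / n)

# 1. Sample size: quadrupling n halves the margin of error.
for n in (100, 400, 1600):
    print("n =", n, "margin =", round(margin_of_error(0.5, n), 3))

# 2. Confidence level: a larger multiplier gives a wider interval.
for mult in (1.645, 1.96, 2.33):
    print("multiplier =", mult, "margin =", round(margin_of_error(0.5, 400, mult), 3))

# 3. Variability: the margin is largest when the sample proportion is .5.
for p_hat in (0.1, 0.3, 0.5):
    print("p_hat =", p_hat, "margin =", round(margin_of_error(p_hat, 400), 3))
```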

10.2 CI Module 1: CI for a Population Proportion
Lesson 1: Details of How to Compute a Confidence Interval for a Population Proportion
Two common settings:
1. A population exists, and we are interested in knowing what proportion of it has a certain trait, opinion, characteristic, response to a treatment, and so on.
• What proportion of drivers are talking on a cell phone at any given moment?
• What proportion of a population of smokers would quit smoking if they wore a nicotine patch for 8 weeks?

Lesson 1: Details of How to Compute a Confidence Interval for a Population Proportion
Two common settings:
2. A repeatable situation exists, and we are interested in the long-run probability of a specific outcome.
• What is the probability that a new fertility procedure will be successful for a randomly selected couple who tries it?
• What is the probability that a randomly selected television of a certain model will fail before the warranty period is over?

How to Compute a Confidence Interval for a Population Proportion
Sample estimate ± Margin of error
• Sample estimate: sample proportion $\hat{p}$
• Standard error of $\hat{p}$ is $\text{s.e.}(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$

More about the Multiplier
Note: Increase confidence level => larger multiplier.
The multiplier, denoted as z*, is the standardized score such that the area between -z* and z* under the standard normal curve corresponds to the desired confidence level.
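The definition of z* translates directly into an inverse normal lookup: if the area between -z* and z* equals the confidence level, the area to the left of z* is (1 + level) / 2. A minimal sketch using Python's standard library follows; the function name `z_star` is an assumption for illustration.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Multiplier with central area `confidence_level` under the standard normal curve."""
    left_area = (1 + confidence_level) / 2
    return NormalDist().inv_cdf(left_area)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

The values increase with the confidence level, which matches the note that more confidence requires a larger multiplier.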

Example 10.6 Intelligent Life Elsewhere?
Poll: Random sample of 1003 Americans. Is there intelligent life on other planets?
Results: 56% said “very or somewhat likely”, $\hat{p}$ = .56
90% Confidence Interval: .56 ± 1.645(.016), or .56 ± .026
98% Confidence Interval: .56 ± 2.33(.016), or .56 ± .037
Note: the entire interval is above 50%, so we have high confidence that a majority believe there is intelligent life.
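A sketch of the same calculation at an arbitrary confidence level is shown below. The helper name `proportion_ci_at_level` is hypothetical. Because the code keeps the unrounded standard error (about .0157) instead of the slide's rounded .016, the 98% margin comes out a little smaller than the .037 shown above.

```python
import math
from statistics import NormalDist

def proportion_ci_at_level(p_hat, n, confidence_level):
    """Return (low, high, margin) using the normal multiplier z*."""
    z = NormalDist().inv_cdf((1 + confidence_level) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, margin

low90, high90, margin90 = proportion_ci_at_level(0.56, 1003, 0.90)
low98, high98, margin98 = proportion_ci_at_level(0.56, 1003, 0.98)
print(f"90%: {low90:.3f} to {high90:.3f}")
print(f"98%: {low98:.3f} to {high98:.3f}")
print("98% interval entirely above .50:", low98 > 0.5)
```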

Formula for a Confidence Interval for a Population Proportion p
$\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$
where
• $\hat{p}$ is the sample proportion.
• z* denotes the multiplier.
• $\sqrt{\hat{p}(1-\hat{p})/n}$ is the standard error of $\hat{p}$.

Conditions for Using the Formula
1. Sample is randomly selected from the population. Note: Available data can be used to make inferences about a much larger group if the data can be considered to be representative with regard to the question(s) of interest.
2. Normal curve approximation to the distribution of possible sample proportions assumes a “large” sample size. Both $n\hat{p}$ and $n(1-\hat{p})$ should be at least 10 (although some say these need only to be at least 5).

Example 10.4 Return Lost Wallet?
Study: 120 wallets were lost and 77 were returned.
Note: Sample size is large enough, with at least 10 returned and at least 10 not returned.
Results: $\hat{p}$ = 77/120 = .64
95% Confidence Interval: .64 ± 2 × .044, or about .55 to .73
We can be 95% confident that the probability of a lost wallet being returned under the conditions used in the experiment is between .55 and .73.
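The sample-size condition and the interval for this example can be checked together. This is a sketch under the slide's assumptions (multiplier 2, threshold of 10); the function name `large_sample_ok` is hypothetical.

```python
import math

def large_sample_ok(successes, n, minimum=10):
    """True when both the success and failure counts reach the minimum."""
    return successes >= minimum and (n - successes) >= minimum

returned, n = 77, 120
print("Condition met:", large_sample_ok(returned, n))

p_hat = returned / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 2 * se, p_hat + 2 * se
print(f"p_hat = {p_hat:.2f}, s.e. = {se:.3f}, interval = {low:.2f} to {high:.2f}")
```

With 77 returned and 43 not returned, both counts exceed 10, and the interval rounds to .55 to .73.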

Lesson 2: Understanding the Formula: Developing the 95% Confidence Interval
Key: Starting with the Sampling Distribution …
From the sampling distribution of $\hat{p}$ we have: for 95% of all samples,
-2 standard deviations < $\hat{p} - p$ < 2 standard deviations
We don’t know the true standard deviation, so we use the standard error. For approximately 95% of all samples,
-2 standard errors < $\hat{p} - p$ < 2 standard errors
which implies that for approximately 95% of all samples,
$\hat{p}$ – 2 standard errors < $p$ < $\hat{p}$ + 2 standard errors

Other Levels of Confidence
The multiplier, denoted as z*, is the standardized score such that the area between -z* and z* under the standard normal curve corresponds to the desired confidence level.
For 90% confidence: z* = 1.645

Example 10.5 Intelligent Life Elsewhere?
Poll: Random sample of 1003 Americans. “Do you think there is intelligent life on other planets?”
Results: 56% of the sample said “yes”, $\hat{p}$ = .56
We want a 50% confidence interval. If the area between -z* and z* is .50, then the area to the left of z* is .75. From Table A.1 we have z* ≈ .67.
50% Confidence Interval: .56 ± .67(.016), or .56 ± .011
Note: Lower confidence level results in a narrower interval.

Lesson 3: Reconciling and Understanding the Different Margin of Error Formulas
General Formula: Sample estimate ± Multiplier × Standard error
For one proportion we had: $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$
Chapter 5: Sample estimate ± Margin of error
For one proportion we had: $\hat{p} \pm 1/\sqrt{n}$

The Conservative Estimate of Margin of Error
Conservative estimate of the margin of error = $1/\sqrt{n}$
• It usually overestimates the actual size of the margin of error.
• It works (conservatively) for all survey questions based on the sample size, even if the sample proportions differ from one question to the next.
• Obtained when $\hat{p}$ = .5 in the margin of error formula.
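The conservative margin depends only on the sample size, which is why one figure can be quoted for an entire survey. The sketch below assumes the 95% multiplier is rounded to 2, so that 2 × sqrt(.5 × .5 / n) simplifies to 1/sqrt(n); the function name and the sample sizes listed are illustrative choices.

```python
import math

def conservative_margin(n):
    """Conservative 95% margin of error for a proportion: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

for n in (100, 400, 1000, 2500):
    # Same value as 2 * sqrt(.5 * .5 / n), the margin when p_hat = .5.
    print(f"n = {n}: conservative margin = {conservative_margin(n):.1%}")
```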

Example 10.6 Winning the Lottery and Quitting Work
Poll: 40% of employed workers sampled would stop working if they won the lottery. Margin of error was 4%.
95% Confidence Interval Estimate: Sample estimate ± Margin of error
40% ± 4%, or 36% to 44%
With 95% confidence, somewhere between 36% and 44% of working Americans would say they would quit working if they won $10 million in the lottery.
The interval does not cover 50%, so it appears that fewer than half of all working Americans think they would quit if they won the lottery.

Example 10.8 Really Bad Allergies
Poll: Random sample of 883 American adults. 3% of the sample experience “severe” symptoms.
95% (conservative) Confidence Interval: 3% ± 3.4%, or -0.4% to 6.4%
When $\hat{p}$ is far from .5, the conservative margin of error is too conservative. The 95% margin of error using $\hat{p}$ = .03 is just 1.1%, for an interval from 1.9% to 4.1%.
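The gap between the two margins in this example is easy to reproduce. This is a minimal sketch using the slide's numbers and the rounded multiplier 2; the variable names are illustrative.

```python
import math

n, p_hat = 883, 0.03

conservative = 1 / math.sqrt(n)                    # assumes p_hat = .5
actual = 2 * math.sqrt(p_hat * (1 - p_hat) / n)    # uses the observed p_hat

print(f"Conservative: {p_hat - conservative:.1%} to {p_hat + conservative:.1%}")
print(f"Using p_hat:  {p_hat - actual:.1%} to {p_hat + actual:.1%}")
```

The conservative interval dips below zero, which is impossible for a proportion, while the interval based on the observed proportion stays within a plausible range.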

Intuitive Explanation of Margin of Error
Characteristics:
• The difference between the sample proportion and the population proportion is less than the margin of error about 95% of the time, or for about 19 of every 20 sample estimates.
• The difference between the sample proportion and the population proportion is more than the margin of error about 5% of the time, or for about 1 of every 20 sample estimates.

10.3 CI Module 2: CI for the Difference Between Two Proportions
CI for Difference Between Two Population Proportions:
$\hat{p}_1 - \hat{p}_2 \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
where z* is the value of the standard normal variable with area between -z* and z* equal to the desired confidence level.

Necessary Conditions
• Condition 1: Sample proportions are available based on independent, randomly selected samples from the two populations.
• Condition 2: All of the quantities $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1-\hat{p}_2)$ are at least 10.

Example 10.9 Age and Using Internet for News
2008 General Social Survey
Young: 92 of 262 use Internet as main news source, $\hat{p}_1$ = .351
Old: 59 of 632 use Internet as main news source, $\hat{p}_2$ = .093
So, $\hat{p}_1 - \hat{p}_2$ = .258 and the standard error is $\sqrt{\frac{.351(.649)}{262} + \frac{.093(.907)}{632}}$ = .0317
Approximate 95% Confidence Interval: .258 ± 1.96(.0317), or .196 to .320
We are 95% confident that somewhere between 19.6% and 32.0% more young adults than older adults use the Internet as their main news source.
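The two-sample calculation can be sketched as follows, using the counts given in the example and the multiplier 1.96. The function name `two_proportion_ci` is hypothetical.

```python
import math

def two_proportion_ci(x1, n1, x2, n2, multiplier=1.96):
    """CI for p1 - p2 from independent samples; returns (low, high, se)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - multiplier * se, diff + multiplier * se, se

# Young: 92 of 262; old: 59 of 632.
low, high, se = two_proportion_ci(92, 262, 59, 632)
print(f"s.e. = {se:.4f}, interval = {low:.3f} to {high:.3f}")
print("Interval excludes 0:", low > 0)
```

Because the whole interval lies above 0, the data support a higher proportion among young adults.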

10.4 Using Confidence Intervals to Guide Decisions
Principle 1. A value not in a confidence interval can be rejected as a possible value of the population proportion. A value in a confidence interval is an “acceptable” possibility for the value of a population proportion.
Principle 2. When a confidence interval for the difference in two population proportions does not cover 0, it is reasonable to conclude the two population proportions are different.
Principle 3. When the confidence intervals for proportions in two different populations do not overlap, it is reasonable to conclude the two population proportions are different.

Example 10.11 Which Drink Tastes Better?
Taste Test: A sample of 60 people taste both drinks and 55% like the taste of Drink A better than Drink B.
Makers of Drink A want to advertise these results. Makers of Drink B make a 95% confidence interval for the population proportion who prefer Drink A.
95% Confidence Interval: $.55 \pm 2\sqrt{.55(1-.55)/60}$, or about .42 to .68
Note: Since .50 is in the interval, there is not enough evidence to claim that Drink A is preferred by a majority of the population represented by the sample.
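Principle 1 can be applied to the taste test in a few lines. This sketch assumes the rounded multiplier 2 used earlier in the chapter; with 1.96 the interval is slightly narrower but still contains .50. The variable names are illustrative.

```python
import math

n, p_hat = 60, 0.55
se = math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - 2 * se, p_hat + 2 * se

print(f"95% CI: {low:.2f} to {high:.2f}")
# Principle 1: a value inside the interval cannot be rejected.
print("Is .50 a plausible value?", low <= 0.50 <= high)
```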

Case Study 10.1 ESP Works with Movies
ESP Study by Bem and Honorton (1994)
• Subjects (receivers) described what another person (sender) was seeing on a screen.
• Receivers were shown 4 pictures and asked to pick which they thought the sender had actually seen.
• Actual image shown was randomly picked from the 4 choices.
• Image was either a single, “static” image or a “dynamic” short video clip, played repeatedly (the additional three choices shown were always of the same type as the actual image).

Case Study 10.1 ESP Works (cont.)
Bem and Honorton (1994) ESP Study Results
Is there enough evidence to say that the % of correct guesses for dynamic pictures is significantly above 25%?
95% CI:
We can claim the true % of correct guesses is significantly better than what would occur from random guessing.

Case Study 10.2 Nicotine Patches vs Zyban
Study (New England Journal of Medicine, 3/4/99):
• 893 participants randomly allocated to four treatment groups: placebo, nicotine patch only, Zyban only, and Zyban plus nicotine patch.
• Participants blinded: all used a patch (nicotine or placebo) and all took a pill (Zyban or placebo).
• Treatments used for nine weeks.

Case Study 10.2 Nicotine (cont.)
Conclusions:
• Zyban is effective (no overlap of Zyban and not Zyban CIs).
• Nicotine patch is not particularly effective (overlap of patch and no patch CIs).

Case Study 10.3 What a Great Personality
Would you date someone with a great personality even though you did not find them attractive?
Women: 61.1% of 131 answered “yes.” 95% confidence interval is 52.7% to 69.4%.
Men: 42.6% of 61 answered “yes.” 95% confidence interval is 30.2% to 55%.
Conclusions:
• Higher proportion of women would say yes. CIs slightly overlap.
• Women CI narrower than men CI due to larger sample size.
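The overlap and width comparisons can be checked from the reported percentages. The sketch below makes several assumptions: it uses the rounded percentages rather than the raw counts (so the endpoints can differ from the slide in the last digit), it uses the multiplier 1.96, and it treats the two samples as independent when forming the interval for the difference. The helper name `ci` is hypothetical.

```python
import math

def ci(p_hat, n, multiplier=1.96):
    """Return (low, high) for one sample proportion."""
    margin = multiplier * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

women = ci(0.611, 131)
men = ci(0.426, 61)
overlap = women[0] <= men[1] and men[0] <= women[1]
print("Women:", [round(x, 3) for x in women])
print("Men:  ", [round(x, 3) for x in men])
print("Intervals overlap:", overlap)

# Interval for the difference (Principle 2), assuming independent samples.
se_diff = math.sqrt(0.611 * 0.389 / 131 + 0.426 * 0.574 / 61)
diff_low = (0.611 - 0.426) - 1.96 * se_diff
diff_high = (0.611 - 0.426) + 1.96 * se_diff
print(f"Difference: {diff_low:.3f} to {diff_high:.3f}")
```

Under these assumptions the separate intervals overlap slightly, while the interval for the difference lies above 0, which is consistent with the conclusion that a higher proportion of women would say yes.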

In Summary: CI for Proportions
General CI for p: $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$
Approximate 95% CI for p: $\hat{p} \pm 2\sqrt{\hat{p}(1-\hat{p})/n}$
Conservative 95% CI for p: $\hat{p} \pm 1/\sqrt{n}$
General CI for $p_1 - p_2$: $\hat{p}_1 - \hat{p}_2 \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$