Statistics Chapter 9 Inferences Based on Two Samples

Statistics Chapter 9: Inferences Based on Two Samples: Confidence Intervals and Tests of Hypotheses

Where We’ve Been
- Made inferences based on confidence intervals and tests of hypotheses
- Studied confidence intervals and hypothesis tests for $\mu$, $p$ and $\sigma^2$
- Selected the necessary sample size for a given margin of error
McClave, Statistics, 11th ed. Chapter 9: Inferences Based on Two Samples

Where We’re Going
- Learn to use confidence intervals and hypothesis tests to compare two populations
- Learn how to use these tools to compare two population means, proportions and variances
- Select the necessary sample size for a given margin of error when comparing parameters from two populations

9.1: Identifying the Target Parameter
- $\mu_1 - \mu_2$ (quantitative data): mean difference; difference in averages
- $p_1 - p_2$ (qualitative data): difference between proportions, percentages, fractions or rates; compare proportions
- $\sigma_1^2/\sigma_2^2$ (quantitative data): ratio of variances; difference in variability or spread; compare variation

9.2: Comparing Two Population Means: Independent Sampling
Point estimators: $\bar{x} \to \mu$ and $\bar{x}_1 - \bar{x}_2 \to \mu_1 - \mu_2$.
To construct a confidence interval or conduct a hypothesis test, we need the standard deviation of the estimator:
- Single sample: $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
- Two samples: $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$

The Sampling Distribution for $(\bar{x}_1 - \bar{x}_2)$
1. The mean of the sampling distribution is $(\mu_1 - \mu_2)$.
2. If the two samples are independent, the standard deviation of the sampling distribution (the standard error) is $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.
3. The sampling distribution for $(\bar{x}_1 - \bar{x}_2)$ is approximately normal for large samples.

[Figure: the sampling distribution for $(\bar{x}_1 - \bar{x}_2)$]

Large-Sample Confidence Interval for $(\mu_1 - \mu_2)$
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} \approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{s_1^2/n_1 + s_2^2/n_2}$

Two samples concerning retention rates for first-year students at private and public institutions were obtained from the Department of Education’s database to see if there was a significant difference between the two types of colleges.

                      n    Mean    Standard deviation   Variance
Private colleges      71   78.17   9.55                 91.17
Public universities   32   84      9.88                 97.64

What does a 95% confidence interval tell us about retention rates?
Source: National Center for Education Statistics

With private colleges as sample 1 and public universities as sample 2, the 95% confidence interval for $(\mu_1 - \mu_2)$ is
$(78.17 - 84) \pm 1.96\sqrt{91.17/71 + 97.64/32} = -5.83 \pm 4.08$, or $(-9.91, -1.75)$.

Since 0 is not in the confidence interval, the difference in the sample means appears to indicate a real difference in retention.
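The retention interval can be checked numerically. The following is a minimal sketch, assuming the large-sample formula with sample variances standing in for the population variances; the function name `two_sample_z_interval` is hypothetical and not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_interval(mean1, var1, n1, mean2, var2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 (independent samples)."""
    # z_{alpha/2}: upper-tail critical value of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Estimated standard error, using sample variances in place of sigma^2
    std_error = sqrt(var1 / n1 + var2 / n2)
    diff = mean1 - mean2
    return diff - z * std_error, diff + z * std_error


# Retention data: private colleges (sample 1) vs. public universities (sample 2)
low, high = two_sample_z_interval(78.17, 91.17, 71, 84, 97.64, 32)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

Both endpoints come out negative, which is the numerical form of the statement that 0 lies outside the interval.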

Small-Sample Confidence Interval for $(\mu_1 - \mu_2)$
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(1/n_1 + 1/n_2\right)}$, where $s_p^2$ is the pooled sample variance.
The value of $t_{\alpha/2}$ is based on $(n_1 + n_2 - 2)$ degrees of freedom.

For small samples, the t-distribution can be used with a pooled sample estimator of $\sigma^2$, $s_p^2$:
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

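One-Tailed Test
- $H_0$: $(\mu_1 - \mu_2) = D_0$
- $H_a$: $(\mu_1 - \mu_2) < D_0$ [or $(\mu_1 - \mu_2) > D_0$]
- Rejection region: $z < -z_\alpha$ [or $z > z_\alpha$ when $H_a$: $(\mu_1 - \mu_2) > D_0$]
Two-Tailed Test
- $H_0$: $(\mu_1 - \mu_2) = D_0$
- $H_a$: $(\mu_1 - \mu_2) \neq D_0$
- Rejection region: $|z| > z_{\alpha/2}$
Test statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$, where $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} \approx \sqrt{s_1^2/n_1 + s_2^2/n_2}$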

Conditions Required for Valid Large-Sample Inferences about $(\mu_1 - \mu_2)$
1. The two samples are randomly and independently selected from the target populations.
2. The sample sizes are both ≥ 30.

Let’s go back to the retention data and test the hypothesis that there is no significant difference in retention at private and public institutions.

                      n    Mean    Standard deviation   Variance
Private colleges      71   78.17   9.55                 91.17
Public universities   32   84      9.88                 97.64

Test statistic: $z = \dfrac{(78.17 - 84) - 0}{\sqrt{91.17/71 + 97.64/32}} = \dfrac{-5.83}{2.08} \approx -2.80$
Reject the null hypothesis: $|z| = 2.80 > z_{.025} = 1.96$ at $\alpha = .05$.
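As a rough check of the retention test, the sketch below computes the large-sample z statistic and a two-tailed p-value. It assumes $D_0 = 0$ and uses the sample variances as estimates of the population variances; `two_sample_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, var1, n1, mean2, var2, n2, d0=0.0):
    """Return (z, two-tailed p-value) for H0: mu1 - mu2 = d0."""
    std_error = sqrt(var1 / n1 + var2 / n2)
    z = ((mean1 - mean2) - d0) / std_error
    # Two-tailed p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Private colleges (sample 1) vs. public universities (sample 2)
z, p_value = two_sample_z_test(78.17, 91.17, 71, 84, 97.64, 32)
print(f"z = {z:.2f}, two-tailed p-value = {p_value:.4f}")
```

A p-value below the chosen $\alpha$ leads to the same decision as comparing $|z|$ with the critical value.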


One-Tailed Test
- $H_0$: $(\mu_1 - \mu_2) = D_0$
- $H_a$: $(\mu_1 - \mu_2) < D_0$ [or $(\mu_1 - \mu_2) > D_0$]
- Rejection region: $t < -t_\alpha$ [or $t > t_\alpha$ when $H_a$: $(\mu_1 - \mu_2) > D_0$]
Two-Tailed Test
- $H_0$: $(\mu_1 - \mu_2) = D_0$
- $H_a$: $(\mu_1 - \mu_2) \neq D_0$
- Rejection region: $|t| > t_{\alpha/2}$
Test statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(1/n_1 + 1/n_2\right)}}$, with $(n_1 + n_2 - 2)$ degrees of freedom

Conditions Required for Valid Small-Sample Inferences about $(\mu_1 - \mu_2)$
1. The two samples are randomly and independently selected from the target populations.
2. Both sampled populations have distributions that are approximately normal.
3. The population variances are equal.

Does class time affect performance?
- The test performance of students in two sections of international trade, meeting at different times, was compared.

                  Mean   Standard deviation   Variance   n
8:00 a.m. class   78     14                   196        21
9:30 a.m. class   82     17                   289        21

With $\alpha = .05$, test $H_0$: $\mu_1 = \mu_2$.

With $n_1 = n_2 = 21$, the pooled variance is $s_p^2 = \dfrac{(20)(196) + (20)(289)}{40} = 242.5$, and the test statistic is
$t = \dfrac{78 - 82}{\sqrt{242.5\,(1/21 + 1/21)}} \approx -.832$.
With df = 21 + 21 - 2 = 40, $t_{\alpha/2} = t_{.025} = 2.021$. Since our test statistic is $t = -.832$, $|t| < t_{.025}$. Do not reject the null hypothesis.

                  Mean   Variance   n
8:00 a.m. class   72     154        13
9:30 a.m. class   86     163        21

$s_p^2 = \dfrac{(12)(154) + (20)(163)}{32} \approx 159.6$ and $t = \dfrac{72 - 86}{\sqrt{159.6\,(1/13 + 1/21)}} \approx -3.14$.
Since $|t| > t_{.025}$ with df = 13 + 21 - 2 = 32, reject the null hypothesis.
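The pooled-variance t statistic for the two class-time data sets can be sketched as follows. This is a minimal illustration using the sample sizes listed in the data tables (21 and 21, then 13 and 21); the helper name `pooled_t_statistic` is hypothetical, and the critical values are hard-coded from a standard t table as an assumption, since the Python standard library has no t-distribution.

```python
from math import sqrt


def pooled_t_statistic(mean1, var1, n1, mean2, var2, n2, d0=0.0):
    """Return (t, df, pooled_variance) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled estimate of the common variance, weighted by degrees of freedom
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - d0) / std_error
    return t, df, pooled_var


# Assumed two-tailed critical values t_.025, taken from a standard t table
T_CRIT_025 = {40: 2.021, 32: 2.037}

# First data set: both sections have n = 21
t_a, df_a, sp2_a = pooled_t_statistic(78, 196, 21, 82, 289, 21)
reject_a = abs(t_a) > T_CRIT_025[df_a]

# Second data set: n = 13 and n = 21
t_b, df_b, sp2_b = pooled_t_statistic(72, 154, 13, 86, 163, 21)
reject_b = abs(t_b) > T_CRIT_025[df_b]

print(f"t = {t_a:.3f}, df = {df_a}, reject H0: {reject_a}")
print(f"t = {t_b:.3f}, df = {df_b}, reject H0: {reject_b}")
```

Different sample sizes would change both the pooled variance and the degrees of freedom, so the inputs should match the data actually collected.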

9.3: Comparing Two Population Means: Paired Difference Experiments

Suppose ten pairs of puppies were housetrained using two different methods: one puppy from each pair was paper-trained, with the paper gradually moved outside, and the other was taken out every three hours and twenty minutes after each meal. The number of days until the puppies were considered housetrained (three days straight without an accident) was compared. Nine of the ten paper-trained dogs took longer than the other dog in the pair to complete training; the average difference was 4 days, with a standard deviation of 3 days. What is a 90% confidence interval for the mean difference in days to successful training?

Using $t_{.05} = 1.833$ with $n_d - 1 = 9$ degrees of freedom, the 90% confidence interval is $4 \pm 1.833\,(3/\sqrt{10}) = 4 \pm 1.74$, or $(2.26, 5.74)$ days.
Since 0 is not in the interval, one program does seem to work more effectively.
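The puppy interval can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the paired differences are treated as a single small sample, and the critical value $t_{.05} = 1.833$ (9 degrees of freedom) is hard-coded from a standard t table. The function name `paired_t_interval` is hypothetical.

```python
from math import sqrt


def paired_t_interval(mean_diff, sd_diff, n_pairs, t_crit):
    """Confidence interval for the mean paired difference, given a t critical value."""
    margin = t_crit * sd_diff / sqrt(n_pairs)
    return mean_diff - margin, mean_diff + margin


# Assumed table value: t_.05 with 10 - 1 = 9 degrees of freedom
T_05_DF9 = 1.833

low, high = paired_t_interval(mean_diff=4, sd_diff=3, n_pairs=10, t_crit=T_05_DF9)
print(f"90% CI for the mean difference: ({low:.2f}, {high:.2f}) days")
```

Because the analysis works on the within-pair differences, only the number of pairs (not the total number of puppies) enters the standard error.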


Suppose 150 items were priced at two online stores, “cport” and “warriorwoman.”
- Mean difference: $1.75
- Standard deviation: $10.35
- Test at the 95% confidence level (α = .05) the hypothesis that the mean difference between the two stores is zero.

$H_0$: $\mu_d = 0$; $H_a$: $\mu_d \neq 0$; $\alpha = .05$
Test statistic: $z = \dfrac{1.75 - 0}{10.35/\sqrt{150}} \approx 2.07$
The critical value $z_{.025}$ is 1.96, so we would reject this null hypothesis.
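A quick numerical check of the online-store test is sketched below. It assumes a two-tailed test of $H_0$: $\mu_d = 0$ and a sample large enough (150 pairs) for the normal approximation; `paired_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def paired_z_test(mean_diff, sd_diff, n_pairs, d0=0.0):
    """Return (z, two-tailed p-value) for H0: mean paired difference = d0."""
    z = (mean_diff - d0) / (sd_diff / sqrt(n_pairs))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = paired_z_test(mean_diff=1.75, sd_diff=10.35, n_pairs=150)
# Two-tailed critical value for alpha = .05 is z_{.025}
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.3f}")
```

Note that the two-tailed critical value for $\alpha = .05$ is $z_{.025}$, since $\alpha$ is split across both tails.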

9.4: Comparing Two Population Proportions: Independent Sampling



A group of men and women were asked their opinions on the following important issue: Are the Three Stooges funny? The results are as follows:

        Men   Women
Yes     290   200
No      50    50
n       340   250

Calculate a 95% confidence interval on the difference in the opinions of men and women.

      Men   Women
p     .85   .80
q     .15   .20
n     340   250

$(.85 - .80) \pm 1.96\sqrt{(.85)(.15)/340 + (.80)(.20)/250} = .05 \pm .062$, or $(-.012, .112)$.
Since 0 is in the confidence interval, we cannot rule out the possibility that both genders find the Stooges equally funny. Nyuk nyuk.
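The Stooges interval can be verified with a short sketch. It assumes the large-sample formula for a difference in proportions and uses the rounded sample proportions (.85 and .80) shown on the slide rather than the raw counts; `two_proportion_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(p1, n1, p2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 (independent samples)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each sample contributes p*q/n
    std_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * std_error, diff + z * std_error


# Men (sample 1) vs. women (sample 2)
low, high = two_proportion_interval(0.85, 340, 0.80, 250)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

Using the unrounded proportion 290/340 for men shifts the endpoints slightly but the interval still contains 0.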


Randy Stinchfield of the University of Minnesota studied the gambling activities of public school students in 1992 and 1998 (Journal of Gambling Studies, Winter 2001). His results are reported below:

                          1992     1998
Survey n                  21,484   23,199
Number who gambled        4,684    5,313
Proportion who gambled    .218     .229

Do these results represent a statistically significant difference at the $\alpha = .01$ level?

Under $H_0$: $(p_1 - p_2) = 0$, the pooled estimate is $\hat{p} = \dfrac{4{,}684 + 5{,}313}{21{,}484 + 23{,}199} \approx .224$, and the test statistic is $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\,(1/n_1 + 1/n_2)}}$.
Since the computed value of z, 2.786, is of greater magnitude than the critical value, 2.576, we can reject the null hypothesis at the $\alpha = .01$ level.
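The gambling comparison can be sketched as a pooled two-proportion z-test. The code below is illustrative only: it assumes the 1992 count of gamblers is 4,684 (which matches the reported proportion .218), tests $H_0$: $p_1 - p_2 = 0$, and uses a hypothetical helper name `two_proportion_z_test`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, pooled proportion) for H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 both samples share one proportion, so pool the counts
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error, pooled


# 1992 survey (sample 1) vs. 1998 survey (sample 2)
z, pooled = two_proportion_z_test(4684, 21484, 5313, 23199)
z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)  # two-tailed, alpha = .01
reject = abs(z) > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The sign of z depends only on which year is treated as sample 1; the two-tailed decision uses its magnitude.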

For valid inferences:
- The two samples must be independent.
- The two sample sizes must be large; a common check is at least 15 successes and 15 failures in each sample.

9.5: Determining the Sample Size
- With a given level of confidence and a specified sampling error, it is possible to calculate the required sample size.
  - Typically, $n_1 = n_2$.

Sample size needed to estimate $(\mu_1 - \mu_2)$:
- Given $(1 - \alpha)$ and the sampling error (SE) required
- Estimates of $\sigma_1$ and $\sigma_2$ will be needed
$n_1 = n_2 = \dfrac{(z_{\alpha/2})^2(\sigma_1^2 + \sigma_2^2)}{(SE)^2}$

Suppose you need to estimate the difference between two population means to within 2.2 at the $\alpha = .05$ level. You have good reason to believe the two variances are equal to each other, and equal to 15. How large must $n_1$ and $n_2$ be?

SE = 2.2, $\alpha = .05$ (so $z_{.025} = 1.96$), $\sigma_1^2 = \sigma_2^2 = 15$.
$n_1 = n_2 = \dfrac{(1.96)^2(15 + 15)}{(2.2)^2} \approx 23.8$, so round up to $n_1 = n_2 = 24$.
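The sample-size calculation for two means can be sketched as below, assuming equal sample sizes and the formula $n_1 = n_2 = (z_{\alpha/2})^2(\sigma_1^2 + \sigma_2^2)/(SE)^2$, rounded up to the next whole observation. The function name `sample_size_for_means` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_means(var1, var2, sampling_error, confidence=0.95):
    """Equal per-group sample size needed to estimate mu1 - mu2 within sampling_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: a fractional observation cannot be sampled
    return ceil(z ** 2 * (var1 + var2) / sampling_error ** 2)


n_per_group = sample_size_for_means(var1=15, var2=15, sampling_error=2.2)
print(f"n1 = n2 = {n_per_group}")
```

Halving the sampling error roughly quadruples the required sample size, because SE enters the formula squared.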

Sample size needed to estimate $(p_1 - p_2)$:
- Given $(1 - \alpha)$ and the sampling error (SE) required
- Estimates of $p_1$ and $p_2$ will be needed
- The most conservative values are $p_1 = p_2 = .5$
$n_1 = n_2 = \dfrac{(z_{\alpha/2})^2(p_1 q_1 + p_2 q_2)}{(SE)^2}$

Suppose you need to calculate a 90% confidence interval of width .05, with no information about possible values of $p_1$ and $p_2$. What size do $n_1$ and $n_2$ need to be?
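The sketch below applies the conservative choice $p_1 = p_2 = .5$ to this question. The phrase "width .05" is ambiguous, so the code shows both readings as explicit assumptions: a sampling error of .05 (the interval is estimate ± .05) and a sampling error of .025 (total width .05). The function name `sample_size_for_proportions` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportions(p1, p2, sampling_error, confidence=0.90):
    """Equal per-group sample size needed to estimate p1 - p2 within sampling_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    numerator = z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(numerator / sampling_error ** 2)


# Reading 1 (assumption): ".05" is the sampling error itself
n_if_se_05 = sample_size_for_proportions(0.5, 0.5, sampling_error=0.05)

# Reading 2 (assumption): ".05" is the full width, so the sampling error is .025
n_if_se_025 = sample_size_for_proportions(0.5, 0.5, sampling_error=0.025)

print(f"SE = .05:  n1 = n2 = {n_if_se_05}")
print(f"SE = .025: n1 = n2 = {n_if_se_025}")
```

Whichever reading is intended, $p = .5$ maximizes $pq$, so the result is an upper bound on the sample size needed for any true proportions.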


9.6: Comparing Two Population Variances: Independent Sampling
Two samples taken from the same population … should have means that are close to one another and … variances that are close to one another as well.

Since variances do not follow a normal distribution (due to the squaring involved), we use a different technique for inferences about variances.


If the ratio of variances gets too far from 1 in value, in either direction, the likelihood that both samples share a population variance drops. For convenience, the larger sample variance is usually put in the numerator.


Has the designated hitter produced a significant difference in home runs in the two major leagues?

Home Runs, 2003   American League   National League
Mean (µ)          178.5             169.25
Variance (σ²)     1415.8            951.4
n                 14                16

To use the two-sample t-test, we should check the variance ratio to see if the assumption of equal variances is acceptable.

The ratio of the two variances is $1415.8/951.4 \approx 1.49$.
Since 1 is in the confidence interval for the ratio of variances, the assumption of equal variances needed for the t-test is not rejected.


In 2006, ten fast-growing economies had an average GDP growth rate of 8.69%, with a standard deviation of 1.7. Ten slow-growing economies had an average GDP growth rate of 2.29%, with a standard deviation of .58.* Do slow- and fast-growing economies have similar variability ($\alpha = .05$)?
* Source: euroekonom.com

µ (fast-growing) = 8.69%, µ (slow-growing) = 2.29%
σ² (fast-growing) = 2.89, σ² (slow-growing) = .34
Test statistic: $F = 2.89/.34 = 8.5$, with 9 numerator and 9 denominator degrees of freedom. This exceeds the two-tailed critical value $F_{.025} = 4.03$, so reject the null hypothesis at the $\alpha = .05$ level.
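The variance-ratio test for the GDP data can be sketched as follows. This is a minimal illustration, assuming the larger variance goes in the numerator and a two-tailed test at $\alpha = .05$; the critical value $F_{.025} = 4.03$ for (9, 9) degrees of freedom is hard-coded from a standard F table as an assumption, and `variance_ratio` is a hypothetical helper name.

```python
def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Assumed table value: F_.025 with (9, 9) degrees of freedom
F_CRIT_025_9_9 = 4.03

# Fast-growing vs. slow-growing economies, ten countries each
f_stat, df_num, df_den = variance_ratio(2.89, 10, 0.34, 10)
reject = f_stat > F_CRIT_025_9_9
print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) df, reject H0: {reject}")
```

Placing the larger variance in the numerator means only the upper tail of the F-distribution has to be consulted, which is why a single critical value suffices here.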