Chapter 9: Inferences from Two Samples

In this chapter we deal with two samples drawn from two populations. The general goal is to compare the parameters of the two populations. We use index 1 for the first population and index 2 for the second.

Section 9-2: Two Proportions

Notation for Two Proportions

For the first population, we let:
• $p_1$ = first population proportion
• $n_1$ = size of the first sample
• $x_1$ = number of successes in the first sample
• $\hat{p}_1 = x_1 / n_1$ (the first sample proportion)
• $\hat{q}_1 = 1 - \hat{p}_1$

The corresponding notations $p_2$, $n_2$, $x_2$, $\hat{p}_2$, and $\hat{q}_2$ are used for the second population.

Pooled Sample Proportion

• The pooled sample proportion is denoted by $\bar{p}$ and is given by:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
• We denote $\bar{q} = 1 - \bar{p}$.

Requirements

1. We have two independent random samples.
2. For each of the two samples, the number of successes is at least 5 and the number of failures is at least 5.

Tests for Two Proportions

The goal is to compare the two proportions.

$H_0$: $p_1 = p_2$
$H_1$: $p_1 \neq p_2$ (two tails)
$H_1$: $p_1 < p_2$ (left tail)
$H_1$: $p_1 > p_2$ (right tail)

Note: no numerical values for $p_1$ or $p_2$ are claimed in the hypotheses.

Test Statistic for Two Proportions

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$$

Note: $p_1 - p_2 = 0$ according to $H_0$, and

$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}, \qquad \bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$

Example: The table below lists results from a simple random sample of front-seat occupants involved in car crashes. Use a 0.05 significance level to test the claim that the fatality rate of occupants is lower for those in cars equipped with airbags.

Example: Requirements are satisfied: there are two simple random samples, the two samples are independent, and each has at least 5 successes and 5 failures.

Step 1: Express the claim as $p_1 < p_2$.
Step 2: $p_1 < p_2$ does not contain equality, so it is the alternative hypothesis. The null hypothesis is the statement of equality.

Example:
$H_0$: $p_1 = p_2$
$H_1$: $p_1 < p_2$ (original claim)

Step 3: The significance level is 0.05.
Step 4: Compute the pooled proportion $\bar{p} = (x_1 + x_2)/(n_1 + n_2)$; it then follows that $\bar{q} = 1 - \bar{p}$.

Example: Step 5: Find the value of the test statistic.

Example: Left-tailed test. The area to the left of $z = -1.91$ is 0.0281 (Table A-2), so the P-value is 0.0281.

Example: Step 6: Because the P-value of 0.0281 is less than the significance level of $\alpha = 0.05$, we reject the null hypothesis $p_1 = p_2$. Because we reject the null hypothesis, we conclude that there is sufficient evidence to support the original claim.

Final conclusion: the proportion of accident fatalities for occupants in cars with airbags is less than the proportion of fatalities for occupants in cars without airbags.
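The P-value method of Steps 4 through 6 can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, and because the example's data table is not reproduced in these slides, the counts used below (41 fatalities among 11,541 occupants with airbags, 52 among 9,853 without) are assumed values chosen to be consistent with the reported $z = -1.91$.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Test statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p1_hat - p2_hat) / standard_error  # p1 - p2 = 0 under H0


def left_tail_p_value(z):
    """Area under the standard normal curve to the left of z."""
    return NormalDist().cdf(z)


# Assumed counts: the original table is not available in this transcript.
z = two_proportion_z(x1=41, n1=11541, x2=52, n2=9853)
p_value = left_tail_p_value(z)
alpha = 0.05
print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

Because the code does not round z before finding the area, its P-value can differ slightly in the last digit from the table-based value of 0.0281.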

Example: Using the Traditional Method

With a significance level of $\alpha = 0.05$ in a left-tailed test, the critical value is $z = -1.645$. The test statistic of $z = -1.91$ falls in the critical region bounded by the critical value of $z = -1.645$. We again reject the null hypothesis.
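The critical value in the traditional method can also be obtained without a table. The following sketch uses Python's standard-library `NormalDist` to find the left-tail critical value for $\alpha = 0.05$ and compares it with the test statistic reported in the example; the variable names are illustrative only.

```python
from statistics import NormalDist

alpha = 0.05
z = -1.91  # test statistic reported in the example

# Left-tailed test: the critical region is everything below this value.
critical_value = NormalDist().inv_cdf(alpha)
in_critical_region = z < critical_value
print(f"critical value = {critical_value:.3f}, reject H0: {in_critical_region}")
```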

Confidence Interval Estimate of $p_1 - p_2$

$$(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$$

where

$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

Example: Use the sample data to construct a 90% confidence interval estimate of the difference between the two population proportions.

Note: $1 - \alpha = 0.90$, so $\alpha = 0.10$ and $\alpha/2 = 0.05$.

Example: Requirements are satisfied, as we saw earlier.

90% confidence interval: $z_{\alpha/2} = 1.645$

Calculate the margin of error, $E$.

Example: Construct the confidence interval.

Final note: The confidence interval limits do not contain 0, implying that there is a significant difference between the two proportions. Thus the confidence interval, too, suggests that the fatality rate is lower for occupants in cars with airbags than for occupants in cars without airbags.
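The confidence interval for $p_1 - p_2$ can be sketched as follows. The function name is hypothetical, and the counts are the same assumed values as before (41 of 11,541 with airbags, 52 of 9,853 without), since the slide's table and computed limits are not reproduced here. Note that the margin of error uses the separate sample proportions, not the pooled proportion.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_interval(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 (unpooled standard error)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    q1_hat, q2_hat = 1 - p1_hat, 1 - p2_hat
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z_crit * sqrt(p1_hat * q1_hat / n1 + p2_hat * q2_hat / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Assumed counts: the original table is not available in this transcript.
lower, upper = two_proportion_interval(41, 11541, 52, 9853, confidence=0.90)
contains_zero = lower <= 0 <= upper
print(f"{lower:.6f} < p1 - p2 < {upper:.6f}; contains 0: {contains_zero}")
```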

Two proportions by TI-83/84

• Press STAT and select TESTS
• Scroll down to 2-PropZTest and press ENTER
• Type in:
  x1: (number of successes in 1st sample)
  n1: (number of trials in 1st sample)
  x2: (number of successes in 2nd sample)
  n2: (number of trials in 2nd sample)
• Choose H1: p1 ≠p2 (two tails), <p2 (left tail), or >p2 (right tail)
• Select Calculate
• Read the test statistic z=… and the P-value p=…

Two proportions by TI-83/84

• Press STAT and select TESTS
• Scroll down to 2-PropZInt and press ENTER
• Type in:
  x1: (number of successes in 1st sample)
  n1: (number of trials in 1st sample)
  x2: (number of successes in 2nd sample)
  n2: (number of trials in 2nd sample)
  C-Level: (confidence level)
• Select Calculate
• Read the interval (…, …)

Section 9-3: Two Means: Independent Samples

Definitions

Two samples are independent if the sample values selected from one population are not related to, or somehow paired or matched with, the sample values from the other population.

Notation for the first population:

• $\mu_1$ = population mean
• $\sigma_1$ = population standard deviation
• $n_1$ = size of the first sample
• $\bar{x}_1$ = sample mean
• $s_1$ = sample standard deviation

Corresponding notations $\mu_2$, $\sigma_2$, $\bar{x}_2$, $s_2$, and $n_2$ apply to the second population.

Requirements

1. $\sigma_1$ and $\sigma_2$ are unknown, and no assumption is made about the equality of $\sigma_1$ and $\sigma_2$.
2. The two samples are independent.
3. Both samples are random samples.
4. Either or both of these conditions are satisfied: the two sample sizes are both large (with $n_1 > 30$ and $n_2 > 30$), or both populations have normal distributions.

Tests for Two Means

The goal is to compare the two means.

$H_0$: $\mu_1 = \mu_2$
$H_1$: $\mu_1 \neq \mu_2$ (two tails)
$H_1$: $\mu_1 < \mu_2$ (left tail)
$H_1$: $\mu_1 > \mu_2$ (right tail)

Note: no numerical values for $\mu_1$ or $\mu_2$ are claimed in the hypotheses.

Hypothesis Test for Two Means with Independent Samples

The test statistic is

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Note: $\mu_1 - \mu_2 = 0$ according to $H_0$.

Degrees of freedom: df = smaller of $n_1 - 1$ and $n_2 - 1$.

Example: A headline in USA Today proclaimed that “Men, women are equal talkers.” That headline referred to a study of the numbers of words that men and women spoke in a day (see below). Use a 0.05 significance level to test the claim that men and women speak the same mean number of words in a day.

Example: Requirements are satisfied: the two population standard deviations are not known and not assumed to be equal, the samples are independent, and both samples are large.

Step 1: Express the claim as $\mu_1 = \mu_2$.
Step 2: If the original claim is false, then $\mu_1 \neq \mu_2$.
Step 3: The alternative hypothesis does not contain equality; the null hypothesis does.

$H_0$: $\mu_1 = \mu_2$ (original claim)
$H_1$: $\mu_1 \neq \mu_2$

Example:
Step 4: The significance level is 0.05.
Step 5: Use a t distribution.
Step 6: Calculate the test statistic.

Example: Use Table A-3: the area in two tails is 0.05 and df = 185. Because df = 185 is not in the table, use the closest tabulated value, which gives critical values of $t = \pm 1.972$.

Example: Step 7: Because the test statistic does not fall within the critical region, we fail to reject the null hypothesis $\mu_1 = \mu_2$ (or $\mu_1 - \mu_2 = 0$).

Final conclusion: There is not sufficient evidence to warrant rejection of the claim that men and women speak the same mean number of words in a day. Failing to reject the null hypothesis does not prove that the means are equal.
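Steps 6 and 7 can be sketched in Python. The function name is hypothetical, and the summary statistics are assumptions: the study's table is not reproduced in these slides, so the values below (men: n = 186, mean 15668.5, s = 8632.5; women: n = 210, mean 16215.0, s = 7301.2) were chosen to be consistent with the reported df = 185 and the margin of error E = 1595.4 that appears later. The critical value 1.972 is the one read from Table A-3 in the example.

```python
from math import sqrt


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t statistic with the conservative df rule."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    t = (mean1 - mean2) / standard_error  # mu1 - mu2 = 0 under H0
    df = min(n1 - 1, n2 - 1)              # smaller of n1 - 1 and n2 - 1
    return t, df


# Assumed summary statistics (not given in this transcript).
t, df = two_sample_t(15668.5, 8632.5, 186, 16215.0, 7301.2, 210)
critical_t = 1.972  # from the example (Table A-3, two tails, 0.05)
reject_h0 = abs(t) > critical_t
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```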

Confidence Interval Estimate of $\mu_1 - \mu_2$: Independent Samples

$$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$$

where

$$E = t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

and df = smaller of $n_1 - 1$ and $n_2 - 1$.

Example: Using the given sample data, construct a 95% confidence interval estimate of the difference between the mean number of words spoken by men and the mean number of words spoken by women.

Example: Find the margin of error, $E$; use $t_{\alpha/2} = 1.972$.

Construct the confidence interval; use $E = 1595.4$ and the difference of the sample means, $\bar{x}_1 - \bar{x}_2$.
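The margin of error and the interval can be sketched as follows. As before, the function name is hypothetical and the summary statistics are assumed values (men: n = 186, mean 15668.5, s = 8632.5; women: n = 210, mean 16215.0, s = 7301.2), selected because they reproduce the margin of error E = 1595.4 stated in the example.

```python
from math import sqrt


def mean_difference_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 with independent samples."""
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin, margin


# Assumed summary statistics (not given in this transcript).
lower, upper, margin = mean_difference_interval(
    15668.5, 8632.5, 186, 16215.0, 7301.2, 210, t_crit=1.972
)
print(f"E = {margin:.1f}; {lower:.1f} < mu1 - mu2 < {upper:.1f}")
```

With these inputs the interval contains 0, which agrees with the decision not to reject the null hypothesis of equal means.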

Tests about two means by TI-83/84

• Press STAT and select TESTS
• Scroll down to 2-SampTTest and press ENTER
• Select Input: Data or Stats. For Stats, type in:
  x̄1: (1st sample mean)
  Sx1: (1st sample st. deviation)
  n1: (1st sample size)
  x̄2: (2nd sample mean)
  Sx2: (2nd sample st. deviation)
  n2: (2nd sample size)
• Choose H1: µ1 ≠µ2 (two tails), <µ2 (left tail), or >µ2 (right tail)

Tests about two means (continued)

• Choose Pooled: No or Yes (always No)
• Select Calculate
• Read the test statistic t=… and the P-value p=…

Note: the calculator gives a more accurate P-value than the book does, because it uses a more accurate formula for degrees of freedom (see the line df=… in the calculator output). The book adopts a simpler, more conservative rule: df = smaller of $n_1 - 1$ and $n_2 - 1$.
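The slides do not spell out the "more accurate formula". A common choice for the unpooled two-sample t procedure is the Welch-Satterthwaite approximation, and the sketch below assumes that this is the formula meant. The function names and the sample inputs are illustrative assumptions, not values taken from the slides.

```python
def welch_satterthwaite_df(s1, n1, s2, n2):
    """Approximate df for the unpooled two-sample t statistic (assumed formula)."""
    a = s1**2 / n1
    b = s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def conservative_df(n1, n2):
    """The book's simple rule: smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Assumed summary statistics for illustration only.
df_accurate = welch_satterthwaite_df(8632.5, 186, 7301.2, 210)
df_simple = conservative_df(186, 210)
print(f"Welch-Satterthwaite df = {df_accurate:.2f}, simple rule df = {df_simple}")
```

The approximation always lies between the simple rule's value and $n_1 + n_2 - 2$, so the simple rule never overstates the degrees of freedom.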

Intervals for two means by TI-83/84

• Press STAT and select TESTS
• Scroll down to 2-SampTInt and press ENTER
• Select Input: Data or Stats. For Stats, type in:
  x̄1: (1st sample mean)
  Sx1: (1st sample st. deviation)
  n1: (1st sample size)
  x̄2: (2nd sample mean)
  Sx2: (2nd sample st. deviation)
  n2: (2nd sample size)
  C-Level: (confidence level)

Intervals for two means (continued)

• Choose Pooled: No or Yes (always No)
• Select Calculate
• Read the confidence interval (…, …)

Note: the calculator gives a more accurate confidence interval than the book does, because it uses a more accurate formula for degrees of freedom (see the line df=… in the calculator output). The book adopts a simpler, more conservative rule: df = smaller of $n_1 - 1$ and $n_2 - 1$.

Section 9-4: Two Means: Matched Pairs

In this section we deal with dependent samples. In other words, there is some relationship between the two samples, so that each value in one sample is paired (naturally matched or coupled) with a corresponding value in the other sample. The two samples can therefore be treated as matched pairs of values.

Examples:

• Blood pressure of patients before they are given medicine and after they take it.
• Predicted temperature (by a weather forecast) and the actual temperature.
• Heights of selected people in the morning and their heights at night.
• Test scores of selected students in Calculus I and their scores in Calculus II.

Example:

• First sample: weights of 5 students in April
• Second sample: their weights in September
• These weights make 5 matched pairs
• Third line: differences between April weights and September weights (net change in weight for each student, separately)

In our calculations we only use the differences d, not the values in the two samples.

Notation for Dependent Samples

• $d$ = individual difference between the two values of a matched pair
• $\mu_d$ = mean value of the differences $d$ for the population of paired data
• $\bar{d}$ = mean value of the differences $d$ for the paired sample data (equal to the mean of the $x - y$ values)
• $s_d$ = standard deviation of the differences $d$ for the paired sample data
• $n$ = number of pairs of data

Requirements

1. The sample data are dependent (they make matched pairs).
2. Either or both of these conditions are satisfied: the number of pairs of sample data is large ($n > 30$), or the pairs of values have differences that come from a population that is approximately normal.

Tests for Matched Pairs

The goal is to see whether there is a difference.

$H_0$: $\mu_d = 0$
$H_1$: $\mu_d \neq 0$ (two tails)
$H_1$: $\mu_d < 0$ (left tail)
$H_1$: $\mu_d > 0$ (right tail)

Hypothesis Test Statistic for Matched Pairs

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$

Note: $\mu_d = 0$ according to $H_0$.

Degrees of freedom: df = $n - 1$.

P-values and Critical Values

Use Table A-3 (t distribution).

Example: Use a 0.05 significance level to test the claim that, for the population of students, the mean change in weight from September to April is 0 kg (so there is no change, on average).

Example: Weight gained = April weight - Sept. weight

$\mu_d$ denotes the mean of the “April - Sept.” differences in weight; the claim is $\mu_d = 0$ kg.

Step 1: The claim is $\mu_d = 0$.
Step 2: If the original claim is not true, we have $\mu_d \neq 0$.
Step 3: $H_0$: $\mu_d = 0$ (original claim); $H_1$: $\mu_d \neq 0$.
Step 4: The significance level is $\alpha = 0.05$.
Step 5: Use the Student t distribution.

Example: Step 6: Find the values of $\bar{d}$ and $s_d$.

The differences are: -1, -1, 4, -2, 1 (five values, one per pair, summing to $n\bar{d} = 1$), so $\bar{d} = 0.2$ and $s_d = 2.4$.

Now compute the test statistic:

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{0.2 - 0}{2.4 / \sqrt{5}} = 0.186$$

Table A-3: with df = $n - 1 = 4$ and an area of 0.05 in two tails, the critical values are $t = \pm 2.776$.

Example: Step 7: Because the test statistic does not fall in the critical region, we fail to reject the null hypothesis.

Example: We conclude that there is not sufficient evidence to warrant rejection of the claim that, for the population of students, the mean change in weight from September to April is equal to 0 kg.

Example: The P-value method

Using technology, we can find the P-value of 0.8605. (Using Table A-3 with the test statistic of $t = 0.186$ and 4 degrees of freedom, we can determine that the P-value is greater than 0.20.) We again fail to reject the null hypothesis, because the P-value is greater than the significance level of $\alpha = 0.05$.
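The matched-pairs test statistic can be sketched in Python from the list of differences. The function name is hypothetical. The list below assumes the five differences are -1, -1, 4, -2, 1, which is the only set with five values that contains the listed differences and has the stated mean of 0.2. Because the code does not round $s_d$ to 2.4 first, its t value differs slightly from 0.186.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(differences):
    """t statistic and df for H0: mu_d = 0, computed from paired differences."""
    n = len(differences)
    d_bar = mean(differences)
    s_d = stdev(differences)      # sample standard deviation (n - 1 divisor)
    t = d_bar / (s_d / sqrt(n))   # mu_d = 0 under H0
    return t, n - 1


# Assumed list of "April - Sept." differences (see the note above).
t, df = paired_t([-1, -1, 4, -2, 1])
critical_t = 2.776  # Table A-3 value used in the example (df = 4, two tails)
reject_h0 = abs(t) > critical_t
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```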

Confidence Intervals for Matched Pairs

$$\bar{d} - E < \mu_d < \bar{d} + E \qquad \text{where} \qquad E = t_{\alpha/2} \frac{s_d}{\sqrt{n}}$$

Critical values of $t_{\alpha/2}$: use Table A-3 with df = $n - 1$ degrees of freedom.

Example: Construct a 95% confidence interval estimate of $\mu_d$, which is the mean of the “April - September” weight differences of college students in their freshman year.

$\bar{d} = 0.2$, $s_d = 2.4$, $n = 5$, $t_{\alpha/2} = 2.776$

Find the margin of error: $E = t_{\alpha/2} \dfrac{s_d}{\sqrt{n}} = 2.776 \cdot \dfrac{2.4}{\sqrt{5}} \approx 3.0$

Example: Construct the confidence interval: $\bar{d} - E < \mu_d < \bar{d} + E$ gives $-2.8 < \mu_d < 3.2$.

We have 95% confidence that the limits of -2.8 kg and 3.2 kg contain the true value of the mean weight change from September to April.
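The interval can be checked with a short Python sketch. The function name is hypothetical, and the list of differences (-1, -1, 4, -2, 1) is an assumption consistent with $n = 5$, $\bar{d} = 0.2$, and $s_d = 2.4$. The critical value 2.776 is the Table A-3 value used in the example.

```python
from math import sqrt
from statistics import mean, stdev


def paired_interval(differences, t_crit):
    """Confidence interval for mu_d from paired differences."""
    n = len(differences)
    d_bar = mean(differences)
    margin = t_crit * stdev(differences) / sqrt(n)
    return d_bar - margin, d_bar + margin


# Assumed list of "April - Sept." differences (see the note above).
lower, upper = paired_interval([-1, -1, 4, -2, 1], t_crit=2.776)
print(f"{lower:.1f} kg < mu_d < {upper:.1f} kg")
```

The interval contains 0, in agreement with the test's failure to reject $\mu_d = 0$.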

Dependent samples by TI-83/84

• Enter the 1st sample in list L1 and the 2nd sample in L2
• Clear the screen and type L1-L2→L3 (use the STO key)
• Press STAT and select TESTS
• Scroll down to T-Test for hypothesis testing or to TInterval for confidence intervals
• Select Input: Data (not Stats) and use list L3
• Then proceed as if you had just one sample…

Section 9-5: Comparing Variation in Two Samples

Requirements

1. The two populations are independent.
2. The two samples are random samples.
3. The two populations are each normally distributed.

The last requirement is strict.

Important:

• The first sample must have the larger sample standard deviation, i.e. we must have $s_1 \geq s_2$.
• If this is not so, i.e. if $s_1 < s_2$, then we need to switch the indices 1 and 2: label the second sample (and population) as first, and the first as second.

Notation for Hypothesis Tests with Two Variances or Standard Deviations

• $s_1$ = first (larger) sample standard deviation
• $n_1$ = size of the first sample
• $\sigma_1$ = standard deviation of the first population

$s_2$, $n_2$, and $\sigma_2$ are used for the second sample and population.

Tests for Two Variances

The goal is to compare the two population variances (or standard deviations).

$H_0$: $\sigma_1 = \sigma_2$
$H_1$: $\sigma_1 \neq \sigma_2$ (two tails)
$H_1$: $\sigma_1 > \sigma_2$ (right tail)

Note: $H_1$: $\sigma_1 < \sigma_2$ is not considered.

Note: no numerical values for $\sigma_1$ or $\sigma_2$ are claimed in the hypotheses.

Test Statistic for Hypothesis Tests with Two Variances

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2$ is the first (larger) of the two sample variances.

Critical values: using Table A-5, we obtain critical F values that are determined by the following three values:
1. The significance level
2. Numerator degrees of freedom = $n_1 - 1$
3. Denominator degrees of freedom = $n_2 - 1$

Properties of the F Distribution

• The F distribution is not symmetric.
• Values of the F distribution cannot be negative, i.e. $F \geq 0$.
• The exact shape of the F distribution depends on the two different degrees of freedom (numerator df and denominator df).

[Figure: density curve of the F distribution]

Use of the F Distribution

If the two populations do have equal variances, then $F = s_1^2 / s_2^2$ will be close to 1, because $s_1^2$ and $s_2^2$ are close in value.

Use of the F Distribution

If the two populations have radically different variances, then F will be a large number.

Remember: the larger sample variance is $s_1^2$, so F is either equal to 1 or greater than 1.

Conclusions from the F Distribution

Consequently, a value of F near 1 will be evidence in favor of the conclusion that $\sigma_1^2 = \sigma_2^2$. But a large value of F will be evidence against the conclusion of equality of the population variances.

[Figure: the critical region is shaded red; there we reject $H_0$: $\sigma_1 = \sigma_2$]

Finding Critical F Values

To find a critical F value corresponding to a 0.05 significance level, refer to Table A-5 and use the right-tail area of 0.025 or 0.05, depending on the type of test:

• Two-tailed test: use 0.025 in the right tail
• Right-tailed test: use 0.05 in the right tail

Example: Below are sample weights (in g) of quarters made before 1964 and weights of quarters made after 1964. When designing coin vending machines, we must consider the standard deviations of pre-1964 quarters and post-1964 quarters. Use a 0.05 significance level to test the claim that the weights of pre-1964 quarters and the weights of post-1964 quarters are from populations with the same standard deviation.

Example:
Step 1: The claim of equal standard deviations is equivalent to the claim of equal variances, $\sigma_1^2 = \sigma_2^2$.
Step 2: If the original claim is false, then $\sigma_1^2 \neq \sigma_2^2$.
Step 3: $H_0$: $\sigma_1^2 = \sigma_2^2$ (original claim); $H_1$: $\sigma_1^2 \neq \sigma_2^2$.

Example:
Step 4: The significance level is 0.05.
Step 5: The test involves two population variances, so use the F distribution.
Step 6: Calculate the test statistic.

For the critical values in this two-tailed test, refer to Table A-5 for the area of 0.025 in the right tail. The critical value is 1.8752.

Example: Step 7: The test statistic $F = 1.9729$ falls within the critical region, so we reject the null hypothesis of equal variances. There is sufficient evidence to warrant rejection of the claim of equal standard deviations.
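Steps 6 and 7 can be sketched in Python. The function name is hypothetical, and the sample standard deviations are assumptions: the quarter weights are not reproduced in these slides, so the values below (0.08700 g and 0.06194 g) were chosen because they give the reported F = 1.9729. The function places the larger variance in the numerator, so the order of its arguments does not matter. The critical value 1.8752 is the Table A-5 value from the example.

```python
def f_statistic(s_a, s_b):
    """F = (larger sample variance) / (smaller sample variance)."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return larger**2 / smaller**2


# Assumed sample standard deviations in grams (not given in this transcript).
F = f_statistic(0.08700, 0.06194)
critical_F = 1.8752  # Table A-5 value used in the example
reject_h0 = F > critical_F
print(f"F = {F:.4f}, reject H0: {reject_h0}")
```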

Example: The left tail is not used and need not be shown.

Conclusion: There is sufficient evidence to warrant rejection of the claim that the two standard deviations are equal. The variation among weights of quarters made after 1964 is significantly different from the variation among weights of quarters made before 1964.

Tests about two variances by TI-83/84

• Press STAT and select TESTS
• Scroll down to 2-SampFTest and press ENTER
• Select Input: Data or Stats. For Stats, type in:
  Sx1: (1st sample st. deviation)
  n1: (1st sample size)
  Sx2: (2nd sample st. deviation)
  n2: (2nd sample size)
• Choose H1: σ1 ≠σ2 (two tails), <σ2 (left tail), or >σ2 (right tail)

Tests about two variances (continued)

• Select Calculate
• Read the test statistic F=… and the P-value p=…

Note: the calculator does not require the first sample variance to be larger than the second. It can handle both left-tailed and right-tailed tests.