Chapter 10 Statistical Inference for Two Samples

Learning Objectives
• Comparative experiments involving two samples
• Test hypotheses on the difference in means of two normal distributions
• Test hypotheses on the ratio of the variances or standard deviations of two normal distributions
• Test hypotheses on the difference in two population proportions
• Compute power, type II error probability, and make sample size decisions for two-sample tests
• Explain and use the relationship between confidence intervals and hypothesis tests

Assumptions
• We are interested in statistical inference on the difference in means $\mu_1 - \mu_2$ of two normal distributions
• The populations are represented by $X_1$ and $X_2$, with known variances $\sigma_1^2$ and $\sigma_2^2$ and independent random samples of sizes $n_1$ and $n_2$
• Expected value: $E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
• The quantity
  $$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
  has a $N(0, 1)$ distribution
• It is used to form tests of hypotheses and confidence intervals on $\mu_1 - \mu_2$

Hypothesis Tests for a Difference in Means, Variances Known
• The difference in means $\mu_1 - \mu_2$ is equal to a specified value $\Delta_0$
  – $H_0: \mu_1 - \mu_2 = \Delta_0$
  – $H_1: \mu_1 - \mu_2 \neq \Delta_0$
• Test statistic
  $$Z_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$
• Alternative hypotheses and rejection criteria
  – $H_1: \mu_1 - \mu_2 \neq \Delta_0$: reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
  – $H_1: \mu_1 - \mu_2 > \Delta_0$: reject $H_0$ if $z_0 > z_{\alpha}$
  – $H_1: \mu_1 - \mu_2 < \Delta_0$: reject $H_0$ if $z_0 < -z_{\alpha}$
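The three rejection criteria can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `two_sample_z_test`, its parameters, and the summary statistics in the usage lines are all hypothetical, and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar1, xbar2, sigma1, sigma2, n1, n2,
                      delta0=0.0, alpha=0.05, alternative="two-sided"):
    """Return (z0, reject) for H0: mu1 - mu2 = delta0 with known variances."""
    std_err = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z0 = (xbar1 - xbar2 - delta0) / std_err
    std_normal = NormalDist()  # N(0, 1)
    if alternative == "two-sided":
        reject = abs(z0) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":
        reject = z0 > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":
        reject = z0 < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, reject


# Hypothetical summary statistics, chosen only for illustration.
z0, reject = two_sample_z_test(xbar1=121.0, xbar2=112.0,
                               sigma1=8.0, sigma2=8.0, n1=10, n2=10,
                               alternative="greater")
print(f"z0 = {z0:.3f}, reject H0: {reject}")
```

Passing the alternative as a string keeps one function for all three cases; the critical value comes from `NormalDist.inv_cdf` rather than a printed table.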

Choice of Sample Size
• Use of OC curves
  – Use the OC curves in Appendix Charts VIa, VIb, VIc, and VId
  – The abscissa scale of the OC curves is
    $$d = \frac{|\mu_1 - \mu_2 - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}} = \frac{|\Delta - \Delta_0|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$
• Two-sided sample size
  – The sample size $n = n_1 = n_2$ required to detect a true difference in means $\Delta$ with power at least $1 - \beta$ is approximately
    $$n \simeq \frac{(z_{\alpha/2} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$
  – Here $\Delta$ is the true difference in means of interest
• One-sided sample size
    $$n = \frac{(z_{\alpha} + z_{\beta})^2 (\sigma_1^2 + \sigma_2^2)}{(\Delta - \Delta_0)^2}$$

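Type II Error
• Follows the single-sample case
• Two-sided alternative
  $$\beta = \Phi\left(z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) - \Phi\left(-z_{\alpha/2} - \frac{\Delta - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)$$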

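C. I. on a Difference in Means, Variances Known, and Choice of Sample Size
• Confidence interval
  – A $100(1 - \alpha)\%$ C. I. on the difference in two means $\mu_1 - \mu_2$ is
    $$\bar{x}_1 - \bar{x}_2 - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le \bar{x}_1 - \bar{x}_2 + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$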

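Choice of Sample Size
• Choice of sample size for a confidence interval
  – To keep the error in estimating $\mu_1 - \mu_2$ by $\bar{x}_1 - \bar{x}_2$ less than $E$ at $100(1 - \alpha)\%$ confidence, take $n_1 = n_2 = n$ with
    $$n = \left(\frac{z_{\alpha/2}}{E}\right)^2 (\sigma_1^2 + \sigma_2^2)$$
  – Round $n$ up if it is not an integer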

Example
• Two machines are used for filling plastic bottles with a net volume of 16.0 ounces
• The fill volume can be assumed normal, with standard deviations $\sigma_1 = 0.020$ and $\sigma_2 = 0.025$ ounces
• A member of the quality engineering staff suspects that both machines fill to the same mean net volume, whether or not this volume is 16.0 ounces
• A random sample of 10 bottles is taken from the output of each machine; the sample means are $\bar{x}_1 = 16.015$ and $\bar{x}_2 = 16.005$ ounces

Questions
1. Do you think the engineer is correct? Use $\alpha = 0.05$
2. What is the P-value for this test?
3. What is the power of the test in part (1) for a true difference in means of 0.04?
4. Find a 95% confidence interval on the difference in means. Provide a practical interpretation of this interval.
5. Assuming equal sample sizes, what sample size should be used to assure that $\beta = 0.05$ if the true difference in means is 0.04? Assume that $\alpha = 0.05$

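Solution-Part 1
1. The parameter of interest is the difference in mean fill volume, $\mu_1 - \mu_2$
2. $H_0: \mu_1 - \mu_2 = 0$ or $\mu_1 = \mu_2$
3. $H_1: \mu_1 - \mu_2 \neq 0$ or $\mu_1 \neq \mu_2$
4. $\alpha = 0.05$
5. The test statistic is $z_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
6. Reject $H_0$ if $z_0 < -z_{\alpha/2} = -1.96$ or $z_0 > z_{\alpha/2} = 1.96$
7. $\bar{x}_1 = 16.015$, $\bar{x}_2 = 16.005$, $\Delta_0 = 0$, $\sigma_1 = 0.020$, $\sigma_2 = 0.025$, $n_1 = 10$, and $n_2 = 10$, so $z_0 = \dfrac{16.015 - 16.005}{\sqrt{0.020^2/10 + 0.025^2/10}} = 0.99$
8. Since $-1.96 < 0.99 < 1.96$, do not reject the null hypothesis; at $\alpha = 0.05$ there is no significant evidence that the two machines fill to different mean volumes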

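Solution-Part 2 and 3
2. P-value $= 2[1 - \Phi(0.99)] = 2(1 - 0.8389) = 0.3222$
3. $\beta = \Phi\left(z_{\alpha/2} - \dfrac{\Delta}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}\right)$. With $\Delta = 0.08$ (the value used in Part 5), $\beta = \Phi(-5.94) - \Phi(-9.86) \approx 0 - 0 = 0$. Hence, the power $= 1 - 0 = 1$. With $\Delta = 0.04$, as stated in Question 3, the same formula gives $\beta = \Phi(-1.99) - \Phi(-5.91) \approx 0.023$ and power $\approx 0.977$.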
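The P-value and power calculations can be checked numerically. The sketch below assumes the summary statistics of the bottle-filling example ($\bar{x}_1 = 16.015$, $\bar{x}_2 = 16.005$, $\sigma_1 = 0.020$, $\sigma_2 = 0.025$, $n_1 = n_2 = 10$); the variable names and the helper `power` are illustrative, not from the slides.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()                   # N(0, 1)
z_crit = std_normal.inv_cdf(1 - 0.05 / 2)   # z_{alpha/2}, about 1.96

xbar1, xbar2 = 16.015, 16.005
sigma1, sigma2 = 0.020, 0.025
n1 = n2 = 10

std_err = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z0 = (xbar1 - xbar2) / std_err
p_value = 2 * (1 - std_normal.cdf(abs(z0)))   # two-sided P-value


def power(true_diff):
    """Power of the two-sided test when mu1 - mu2 = true_diff (Delta_0 = 0)."""
    shift = true_diff / std_err
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta


print(f"z0 = {z0:.2f}, P-value = {p_value:.3f}")
for true_diff in (0.04, 0.08):
    print(f"power at Delta = {true_diff}: {power(true_diff):.4f}")
```

Because the code carries unrounded intermediate values, its P-value differs in the third decimal from a hand calculation that first rounds $z_0$ to 0.99.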

Solution-Part 4
4. Confidence interval: $\bar{x}_1 - \bar{x}_2 \pm z_{0.025}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2} = 0.010 \pm 1.96(0.0101)$, so $-0.0098 \le \mu_1 - \mu_2 \le 0.0298$. With 95% confidence, we believe the true difference in the mean fill volumes is between -0.0098 and 0.0298. Since 0 is contained in this interval, we can conclude there is no significant difference between the means.
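The interval in Part 4 can be reproduced with a short helper. This is a sketch under the known-variance assumption of the example; the function name `z_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Two-sided CI on mu1 - mu2 when both variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin


lower, upper = z_confidence_interval(16.015, 16.005, 0.020, 0.025, 10, 10)
contains_zero = lower < 0 < upper
print(f"95% CI: ({lower:.4f}, {upper:.4f}); contains 0: {contains_zero}")
```

An interval that contains 0 leads to the same decision as the two-sided test at the matching $\alpha$, which is the link between confidence intervals and hypothesis tests named in the learning objectives.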

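Solution-Part 5
5. Assume the sample sizes are to be equal, and use $\alpha = 0.05$, $\beta = 0.05$, and $\Delta = 0.08$: $n \simeq \dfrac{(z_{\alpha/2} + z_{\beta})^2(\sigma_1^2 + \sigma_2^2)}{\Delta^2} = \dfrac{(1.96 + 1.645)^2(0.020^2 + 0.025^2)}{0.08^2} = 2.08$. Hence, $n = 3$; use $n_1 = n_2 = 3$. With $\Delta = 0.04$, as stated in Question 5, the same formula gives $n \simeq 8.33$, so $n_1 = n_2 = 9$.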
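A minimal sketch of the two-sided sample-size formula, evaluated for both values of $\Delta$ that appear in this example. The function name `equal_sample_size` and its defaults are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def equal_sample_size(delta, sigma1, sigma2, alpha=0.05, beta=0.05, delta0=0.0):
    """Smallest n = n1 = n2 meeting the approximate two-sided formula."""
    z = NormalDist().inv_cdf
    z_alpha_2 = z(1 - alpha / 2)   # upper alpha/2 point of N(0, 1)
    z_beta = z(1 - beta)           # upper beta point of N(0, 1)
    n = (z_alpha_2 + z_beta) ** 2 * (sigma1**2 + sigma2**2) / (delta - delta0) ** 2
    return ceil(n)                 # round up to the next whole bottle


n_for_008 = equal_sample_size(0.08, 0.020, 0.025)
n_for_004 = equal_sample_size(0.04, 0.020, 0.025)
print(f"Delta = 0.08 -> n = {n_for_008}; Delta = 0.04 -> n = {n_for_004}")
```

Halving the difference to be detected roughly quadruples the required sample size, because $\Delta$ enters the formula squared.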

Hypothesis Tests for a Difference in Means, Variances Unknown
• Tests of hypotheses on the difference in means $\mu_1 - \mu_2$ of two normal distributions
• If $n_1$ and $n_2$ both exceed 40, use the CLT and the normal-based procedures
• Otherwise, base the hypothesis tests and C. I. on the t distribution
• Two cases for the variances: $\sigma_1^2 = \sigma_2^2$ and $\sigma_1^2 \neq \sigma_2^2$

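Case I: $\sigma_1^2 = \sigma_2^2 = \sigma^2$: Pooled Test
• Two normal populations with unknown means and unknown but equal variances
• Expected value: $E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
• Form an estimator of $\sigma^2$
• Pooled estimator of $\sigma^2$, denoted by $S_p^2$
  $$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
• Test statistic
  $$T = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}}$$
  which has a t distribution with $n_1 + n_2 - 2$ degrees of freedom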

Hypothesis Tests
• Test hypotheses
  – $H_0: \mu_1 - \mu_2 = \Delta_0$
  – $H_1: \mu_1 - \mu_2 \neq \Delta_0$
• Test statistic
  $$T_0 = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{S_p\sqrt{1/n_1 + 1/n_2}}$$
• Where $S_p$ is the pooled estimator of $\sigma$

Critical Regions
• Alternative hypotheses and rejection criteria
  – $H_1: \mu_1 - \mu_2 \neq \Delta_0$: reject $H_0$ if $t_0 > t_{\alpha/2,\, n_1+n_2-2}$ or $t_0 < -t_{\alpha/2,\, n_1+n_2-2}$
  – $H_1: \mu_1 - \mu_2 > \Delta_0$: reject $H_0$ if $t_0 > t_{\alpha,\, n_1+n_2-2}$
  – $H_1: \mu_1 - \mu_2 < \Delta_0$: reject $H_0$ if $t_0 < -t_{\alpha,\, n_1+n_2-2}$

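Case II: $\sigma_1^2 \neq \sigma_2^2$
• We are not able to assume that the unknown variances $\sigma_1^2$ and $\sigma_2^2$ are equal
• Test statistic
  $$T_0^* = \frac{\bar{X}_1 - \bar{X}_2 - \Delta_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$
• It is distributed approximately as t with $v$ degrees of freedom, where
  $$v = \frac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$$
  and $v$ is rounded down if it is not an integer
• Critical regions
  – Identical to Case I
  – The degrees of freedom $n_1 + n_2 - 2$ are replaced by $v$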
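The unequal-variance statistic and its approximate degrees of freedom can be sketched as follows. The $n - 1$ form of the degrees-of-freedom formula is assumed here (some editions of the textbook use a slightly different expression), and the function name and the summary statistics are hypothetical.

```python
from math import floor, sqrt


def unequal_variance_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2, delta0=0.0):
    """Return (t0_star, v) for the two-sample t-test without pooling."""
    a = s1_sq / n1                 # estimated variance of xbar1
    b = s2_sq / n2                 # estimated variance of xbar2
    t0_star = (xbar1 - xbar2 - delta0) / sqrt(a + b)
    v = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t0_star, floor(v)       # round v down to an integer


# Hypothetical summary statistics, chosen only for illustration.
t0_star, v = unequal_variance_t(xbar1=20.0, xbar2=17.0,
                                s1_sq=4.0, s2_sq=16.0, n1=10, n2=10)
print(f"t0* = {t0_star:.3f} with v = {v} degrees of freedom")
```

With these numbers $v = 13$, fewer than the $n_1 + n_2 - 2 = 18$ of the pooled test, which reflects the cost of not assuming equal variances.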

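Confidence Interval on the Difference in Means
• Case $\sigma_1^2 = \sigma_2^2$
  – A $100(1 - \alpha)\%$ CI on the difference in means $\mu_1 - \mu_2$:
    $$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\, n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
• Case $\sigma_1^2 \neq \sigma_2^2$
  – A $100(1 - \alpha)\%$ CI on the difference in means $\mu_1 - \mu_2$:
    $$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\, v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$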

Example
• The diameter of steel rods manufactured on two different extrusion machines is being investigated
• Two random samples of sizes $n_1 = 15$ and $n_2 = 17$ are selected, and the sample means and sample variances are $\bar{x}_1 = 8.73$, $s_1^2 = 0.35$, $\bar{x}_2 = 8.68$, and $s_2^2 = 0.40$, respectively
• Assume that $\sigma_1^2 = \sigma_2^2$ and that the data are drawn from a normal distribution
  1. Is there evidence to support the claim that the two machines produce rods with different mean diameters? Use $\alpha = 0.05$ in arriving at this conclusion
  2. Find the P-value for the t-statistic you calculated in part (1)
  3. Construct a 95% confidence interval for the difference in mean rod diameter. Interpret this interval

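Solution
1. The parameter of interest is the difference in mean rod diameter, $\mu_1 - \mu_2$
2. $H_0: \mu_1 - \mu_2 = 0$ or $\mu_1 = \mu_2$
3. $H_1: \mu_1 - \mu_2 \neq 0$ or $\mu_1 \neq \mu_2$
4. $\alpha = 0.05$
5. The test statistic is $t_0 = \dfrac{\bar{x}_1 - \bar{x}_2 - \Delta_0}{s_p\sqrt{1/n_1 + 1/n_2}}$
6. Reject the null hypothesis if $t_0 < -t_{\alpha/2,\, n_1+n_2-2}$, where $-t_{0.025,\,30} = -2.042$, or $t_0 > t_{\alpha/2,\, n_1+n_2-2}$, where $t_{0.025,\,30} = 2.042$
7. $\bar{x}_1 = 8.73$, $\bar{x}_2 = 8.68$, $\Delta_0 = 0$, $s_1^2 = 0.35$, $s_2^2 = 0.40$, $n_1 = 15$, and $n_2 = 17$, so $s_p = \sqrt{\dfrac{14(0.35) + 16(0.40)}{30}} = 0.614$ and $t_0 = \dfrac{8.73 - 8.68}{0.614\sqrt{1/15 + 1/17}} = 0.230$
8. Since $-2.042 < 0.230 < 2.042$, do not reject the null hypothesis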

Solution-Cont.
• P-value $= 2P(t_{30} > 0.230) > 2(0.40)$, so P-value > 0.80
• 95% confidence interval, with $t_{0.025,\,30} = 2.042$: $(8.73 - 8.68) \pm 2.042(0.614)\sqrt{1/15 + 1/17}$, so $-0.394 \le \mu_1 - \mu_2 \le 0.494$
• Since zero is contained in this interval, there is no significant evidence at the 5% level that machine 1 and machine 2 produce rods with different mean diameters
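The pooled calculation for the steel-rod example can be verified with the standard library alone. The critical value 2.042 is taken from the slides rather than computed, since Python's standard library has no t distribution; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the steel-rod example.
xbar1, s1_sq, n1 = 8.73, 0.35, 15
xbar2, s2_sq, n2 = 8.68, 0.40, 17
T_CRIT = 2.042   # t_{0.025, 30}, as quoted in the slides

# Pooled standard deviation and test statistic (Delta_0 = 0).
s_p = sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
std_err = s_p * sqrt(1 / n1 + 1 / n2)
t0 = (xbar1 - xbar2) / std_err
reject = abs(t0) > T_CRIT

# 95% confidence interval on mu1 - mu2.
lower = (xbar1 - xbar2) - T_CRIT * std_err
upper = (xbar1 - xbar2) + T_CRIT * std_err

print(f"s_p = {s_p:.3f}, t0 = {t0:.3f}, reject H0: {reject}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```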

Paired t Test
• Special case of the two-sample t-tests
• Used when the observations are collected in pairs
• Each pair of observations is taken under homogeneous conditions
• Conditions may change from one pair to another
• Testing
  – $H_0: \mu_D = \Delta_0$
  – $H_1: \mu_D \neq \Delta_0$
• Test statistic
  $$T_0 = \frac{\bar{D} - \Delta_0}{S_D/\sqrt{n}}$$
  – $\bar{D}$ is the sample average of the $n$ differences and $S_D$ is their sample standard deviation
• Rejection region
  – $t_0 > t_{\alpha/2,\, n-1}$ or $t_0 < -t_{\alpha/2,\, n-1}$
• $100(1 - \alpha)\%$ C. I. on the difference in means $\mu_D = \mu_1 - \mu_2$
  $$\bar{d} - t_{\alpha/2,\, n-1}\frac{s_D}{\sqrt{n}} \le \mu_D \le \bar{d} + t_{\alpha/2,\, n-1}\frac{s_D}{\sqrt{n}}$$

Example
• Ten individuals have participated in a diet-modification program to stimulate weight loss
• Their weight both before and after participation in the program is shown in the following list
  – Is there evidence to support the claim that this particular diet-modification program is effective in producing a mean weight reduction? Use $\alpha = 0.05$.

Subject   Before   After
1         195      187
2         213      195
3         247      221
4         201      190
5         187      175
6         210      197
7         215      199
8         246      221
9         294      278
10        310      285

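Solution
1. The parameter of interest is the mean difference in weight, $\mu_D$, where $d_i$ = Weight Before - Weight After
2. $H_0: \mu_D = 0$
3. $H_1: \mu_D > 0$
4. $\alpha = 0.05$
5. The test statistic is $t_0 = \dfrac{\bar{d}}{s_D/\sqrt{n}}$
6. Reject the null hypothesis if $t_0 > t_{\alpha,\, n-1}$, where $t_{0.05,\,9} = 1.833$
7. $\bar{d} = 17$, $s_D = 6.41$, $n = 10$, so $t_0 = \dfrac{17}{6.41/\sqrt{10}} = 8.387$
8. Since $8.387 > 1.833$, reject the null hypothesis and conclude that the program is effective in producing a mean weight reduction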
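The paired calculation can be reproduced directly from the ten before/after weights. This sketch uses only the standard library and takes the critical value $t_{0.05,\,9} = 1.833$ from the slides; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

before = [195, 213, 247, 201, 187, 210, 215, 246, 294, 310]
after = [187, 195, 221, 190, 175, 197, 199, 221, 278, 285]
T_CRIT = 1.833   # t_{0.05, 9}, as quoted in the slides

# Work with the within-subject differences d_i = before - after.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)               # sample standard deviation (n - 1 divisor)

t0 = d_bar / (s_d / sqrt(n))     # H0: mu_D = 0
reject = t0 > T_CRIT             # one-sided test, H1: mu_D > 0

print(f"d_bar = {d_bar:.1f}, s_D = {s_d:.2f}, t0 = {t0:.2f}, reject H0: {reject}")
```

Pairing removes the large subject-to-subject variation in body weight, so the test depends only on the variability of the differences.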

Inferences on the Variances of Two Normal Populations
• Both populations are normal and independent
• Test the hypotheses
  – $H_0: \sigma_1^2 = \sigma_2^2$
  – $H_1: \sigma_1^2 \neq \sigma_2^2$
• Requires a new probability distribution, the F distribution

The F Distribution
• Define the random variable $F$ as the ratio of two independent chi-square random variables, each divided by its number of degrees of freedom
• $F = \dfrac{W/u}{Y/v}$
• $F$ follows the F distribution with $u$ degrees of freedom in the numerator and $v$ degrees of freedom in the denominator
• Usually abbreviated as $F_{u,v}$
• The shape of the pdf is determined by the two degrees-of-freedom parameters
• Table V provides the percentage points of the F distribution
• Note that $f_{1-\alpha,\,u,\,v} = 1/f_{\alpha,\,v,\,u}$

Hypothesis Tests on the Ratio of Two Variances
• Suppose $H_0: \sigma_1^2 = \sigma_2^2$
• $S_1^2$ and $S_2^2$ are the sample variances
• Test statistic: $F_0 = S_1^2/S_2^2$
• Suppose $H_1: \sigma_1^2 \neq \sigma_2^2$
• Rejection criterion: $f_0 > f_{\alpha/2,\, n_1-1,\, n_2-1}$ or $f_0 < f_{1-\alpha/2,\, n_1-1,\, n_2-1}$

Example
• Two chemical companies can supply a raw material.
• The concentration of a particular element in this material is important.
• The mean concentration for both suppliers is the same, but we suspect that the variability in concentration may differ between the two companies.
• The standard deviation of concentration in a random sample of $n_1 = 10$ batches produced by company 1 is $s_1 = 4.7$ grams per liter, while for company 2, a random sample of $n_2 = 16$ batches yields $s_2 = 5.8$ grams per liter.
• Is there sufficient evidence to conclude that the two population variances differ? Use $\alpha = 0.05$.

Solution
1. The parameters of interest are the variances of concentration, $\sigma_1^2$ and $\sigma_2^2$
2. $H_0: \sigma_1^2 = \sigma_2^2$
3. $H_1: \sigma_1^2 \neq \sigma_2^2$
4. $\alpha = 0.05$
5. The test statistic is $f_0 = s_1^2/s_2^2$
6. Reject the null hypothesis if $f_0 < f_{0.975,\,9,\,15}$, where $f_{0.975,\,9,\,15} = 0.265$, or $f_0 > f_{0.025,\,9,\,15}$, where $f_{0.025,\,9,\,15} = 3.12$
7. $n_1 = 10$, $n_2 = 16$, $s_1 = 4.7$, and $s_2 = 5.8$, so $f_0 = 4.7^2/5.8^2 = 0.657$
8. Since $0.265 < 0.657 < 3.12$, do not reject the null hypothesis
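A short check of the variance-ratio test. The two critical values are copied from the slides (they come from Table V; Python's standard library has no F distribution), and the variable names are illustrative.

```python
# Sample standard deviations from the two suppliers.
s1, n1 = 4.7, 10
s2, n2 = 5.8, 16

# Critical values quoted in the slides for alpha = 0.05.
F_LOWER = 0.265   # f_{0.975, 9, 15}
F_UPPER = 3.12    # f_{0.025, 9, 15}

f0 = s1**2 / s2**2                     # ratio of sample variances
reject = f0 < F_LOWER or f0 > F_UPPER  # two-sided rejection region

print(f"f0 = {f0:.3f}, reject H0: {reject}")
```

The lower critical value follows from the reciprocal property of the F distribution: $f_{0.975,\,9,\,15} = 1/f_{0.025,\,15,\,9}$.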

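Hypothesis Tests on Two Population Proportions
• Suppose there are two binomial parameters of interest, $p_1$ and $p_2$
• Large-sample test of $H_0: p_1 = p_2$
• Test statistic
  $$Z_0 = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \hat{P} = \frac{X_1 + X_2}{n_1 + n_2}$$
• Critical regions
  – $H_1: p_1 \neq p_2$: reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
  – $H_1: p_1 > p_2$: reject $H_0$ if $z_0 > z_{\alpha}$
  – $H_1: p_1 < p_2$: reject $H_0$ if $z_0 < -z_{\alpha}$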

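β-Error
• If $H_1$ is two-sided, the β-error is
  $$\beta = \Phi\left(\frac{z_{\alpha/2}\sqrt{\bar{p}\,\bar{q}\,(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right) - \Phi\left(\frac{-z_{\alpha/2}\sqrt{\bar{p}\,\bar{q}\,(1/n_1 + 1/n_2)} - (p_1 - p_2)}{\sigma_{\hat{P}_1 - \hat{P}_2}}\right)$$
• Where
  $$\bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}, \qquad \sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$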

Confidence Interval on the Difference in Population Proportions
• Two-sided $100(1 - \alpha)\%$ C. I. on the difference in the true proportions $p_1 - p_2$
  $$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$
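The large-sample interval on $p_1 - p_2$ can be sketched as below. Unlike the test statistic, the interval uses the unpooled standard error. The function name `proportion_diff_ci` is hypothetical, and the counts in the usage lines are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Approximate two-sided CI on p1 - p2 from two binomial samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error: each proportion keeps its own variance term.
    std_err = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z * std_err, diff + z * std_err


# Illustrative counts: 15 of 300 and 8 of 300 items with the attribute.
lower, upper = proportion_diff_ci(15, 300, 8, 300)
print(f"95% CI on p1 - p2: ({lower:.4f}, {upper:.4f})")
```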

Example
• Two different types of injection-molding machines are used to form plastic parts. A part is considered defective if it has excessive shrinkage or is discolored
• Two random samples, each of size 300, are selected, and 15 defective parts are found in the sample from machine 1 while 8 defective parts are found in the sample from machine 2
• Is it reasonable to conclude that both machines produce the same fraction of defective parts, using $\alpha = 0.05$?

Solution
1. The parameters of interest are the proportions of defective parts, $p_1$ and $p_2$
2. $H_0: p_1 = p_2$
3. $H_1: p_1 \neq p_2$
4. $\alpha = 0.05$
5. The test statistic is $z_0 = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}}$, where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$
6. Reject the null hypothesis if $z_0 < -z_{\alpha/2}$, where $-z_{0.025} = -1.96$, or $z_0 > z_{\alpha/2}$, where $z_{0.025} = 1.96$
7. $n_1 = 300$, $n_2 = 300$, $x_1 = 15$, $x_2 = 8$, $\hat{p}_1 = 0.05$, $\hat{p}_2 = 0.0267$, $\hat{p} = 23/600 = 0.0383$, so $z_0 = 1.49$
8. Since $-1.96 < 1.49 < 1.96$, do not reject the null hypothesis
• P-value $= 2(1 - P(Z < 1.49)) = 0.13622$
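The test statistic and P-value for the injection-molding example can be checked with the standard library. The variable names are illustrative; the counts are those given in the example.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 15, 300   # defectives and sample size, machine 1
x2, n2 = 8, 300    # defectives and sample size, machine 2
alpha = 0.05

p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)   # estimate of the common p under H0

std_err = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z0 = (p1_hat - p2_hat) / std_err

std_normal = NormalDist()
p_value = 2 * (1 - std_normal.cdf(abs(z0)))
reject = abs(z0) > std_normal.inv_cdf(1 - alpha / 2)

print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

The P-value printed here differs slightly from 0.13622 because the slide rounds $z_0$ to 1.49 before looking up the normal table.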