
Two-Sample Proportions Inference

Sampling Distributions for the difference in proportions

When tossing pennies, the probability of the coin landing on heads is 0.5. However, when spinning the coin, the probability of the coin landing on heads is 0.4. Let's investigate. Pairs of students will be given pennies and assigned to either flip or spin the penny.

Looking at the sampling distribution of the difference in sample proportions:
• What is the mean of the difference in sample proportions (flip - spin)?
• What is the standard deviation of the difference in sample proportions (flip - spin)?
• Can the sampling distribution of the difference in sample proportions (flip - spin) be approximated by a normal distribution? Yes, since $n_1 p_1 = 12.5$, $n_1(1 - p_1) = 12.5$, $n_2 p_2 = 10$, and $n_2(1 - p_2) = 15$, so all are at least 5.
• What is the probability that the difference in proportions (flipped - spun) is at least 0.25?
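The questions above can be worked numerically. The following is a minimal sketch, assuming 25 flips and 25 spins per group; those sample sizes are not stated on the slide but are inferred from the conditions check ($n_1 p_1 = 12.5$ with $p_1 = 0.5$, and $n_2 p_2 = 10$ with $p_2 = 0.4$). The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Assumed sample sizes, inferred from n1*p1 = 12.5 and n2*p2 = 10
n_flip, p_flip = 25, 0.5
n_spin, p_spin = 25, 0.4

# Mean of the sampling distribution of (p-hat_flip - p-hat_spin)
mean_diff = p_flip - p_spin

# Standard deviation: variances of independent samples add
sd_diff = sqrt(p_flip * (1 - p_flip) / n_flip + p_spin * (1 - p_spin) / n_spin)

# P(difference >= 0.25) under the normal approximation
prob = 1 - NormalDist(mean_diff, sd_diff).cdf(0.25)

print(f"mean = {mean_diff:.2f}, sd = {sd_diff:.2f}, P(diff >= 0.25) = {prob:.3f}")
```

Under these assumptions the mean is 0.10 and the standard deviation is 0.14, so a difference of 0.25 sits a little more than one standard deviation above the mean.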

Assumptions:
• Two independent SRSs from the populations (or randomly assigned treatments)
• Populations at least 10n
• Normal approximation for both

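Formula for confidence interval:

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

Margin of error: everything after the $\pm$ sign. Standard error: the square-root term.

Note: use p-hat when p is not known.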

Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is the shape & standard error of the sampling distribution of the difference in the proportions of people with no visible scars between the two groups?

Since $n_1 p_1 = 259$, $n_1(1 - p_1) = 57$, $n_2 p_2 = 94$, $n_2(1 - p_2) = 325$, and all > 5, the distribution of the difference in proportions is approximately normal.
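The shape check and the standard error for Example 1 can be sketched as follows. This is an illustrative calculation using the sample proportions in place of the unknown population proportions; the helper name `se_diff` is hypothetical and not from the slides.

```python
from math import sqrt

def se_diff(x1, n1, x2, n2):
    """Standard error of (p-hat1 - p-hat2) for two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Successes and failures in each group (no visible scars = success)
counts = [259, 316 - 259, 94, 419 - 94]

# Normal approximation condition used on the slide: every count exceeds 5
all_large = all(c > 5 for c in counts)

se = se_diff(259, 316, 94, 419)
print(counts, all_large, round(se, 4))
```

The four counts reproduce the 259, 57, 94, and 325 listed in the conditions check, and the standard error comes out close to 0.03.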

Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is a 95% confidence interval of the difference in proportion of people who had no visible scars between the plasma compress treatment & control group?

Assumptions:
• Have 2 independent randomly assigned treatment groups
• Both distributions are approximately normal since $n_1 p_1 = 259$, $n_1(1 - p_1) = 57$, $n_2 p_2 = 94$, $n_2(1 - p_2) = 325$, and all > 5
• Population of burn patients is at least 7,350. Since these are all burn patients, we can add 316 + 419 = 735. If the populations are not the same, you MUST list them separately.

We are 95% confident that the true difference in the proportion of people who had no visible scars between the plasma compress treatment & control group is between 53.7% and 65.4%.
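The interval quoted for Example 1 can be reproduced with a short calculation. This is a sketch only: the function name `two_prop_ci` is hypothetical, and it uses the exact normal quantile rather than a rounded table value of 1.96.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, conf=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # critical value
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Plasma compress group: 259 of 316; control group: 94 of 419
low, high = two_prop_ci(259, 316, 94, 419)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The result agrees with the 53.7% to 65.4% interval stated in the conclusion.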

Example 2: Suppose that researchers want to estimate the difference in proportions of people who are against the death penalty in Texas & in California. If the two sample sizes are the same, what size sample is needed to be within 2% of the true difference at 90% confidence?

Since both n's are the same size, you have common denominators, so add!

n = 3383
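The sample-size answer can be checked by solving $z^* \sqrt{p_1(1-p_1)/n + p_2(1-p_2)/n} \le 0.02$ for $n$. The sketch below makes two assumptions that the slide does not state explicitly: the conservative guess $p_1 = p_2 = 0.5$, and the table value $z^* = 1.645$ for 90% confidence. The function name `n_per_group` is hypothetical.

```python
from math import ceil

def n_per_group(margin, z_star, p1=0.5, p2=0.5):
    """Smallest equal sample size per group that keeps the margin of error within `margin`."""
    # With a common n, the two variance terms share a denominator and can be added
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z_star ** 2 * variance_sum / margin ** 2)

# Assumed: table critical value 1.645 and p = 0.5 for both states
n = n_per_group(0.02, 1.645)
print(n)
```

With these inputs the calculation matches n = 3383. Using the unrounded critical value (about 1.64485) instead gives a result one smaller, so the answer depends slightly on how $z^*$ is rounded.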

Example 3: Researchers comparing the effectiveness of two pain medications randomly selected a group of patients who had been complaining of a certain kind of joint pain. They randomly divided these people into two groups, and then administered the painkillers. Of the 112 people in the group who received medication A, 84 said this pain reliever was effective. Of the 108 people in the other group, 66 reported that pain reliever B was effective. (BVD, p. 435)

a) Construct separate 95% confidence intervals for the proportion of people who reported that the pain reliever was effective. Based on these intervals, how do the proportions of people who reported pain relief with medication A or medication B compare?

$CI_A = (.67, .83)$
$CI_B = (.52, .70)$
Since the intervals overlap, it appears that there is no difference in the proportion of people who reported pain relief between the two medicines.

b) Construct a 95% confidence interval for the difference in the proportions of people who may find these medications effective.

CI = (0.017, 0.261)
Since zero is not in the interval, there is a difference in the proportion of people who reported pain relief between the two medicines.

SO, which is correct?
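The three intervals in Example 3 can be recomputed side by side, which makes the apparent conflict between parts (a) and (b) easy to see. This is a minimal sketch with hypothetical helper names (`one_prop_ci`, `diff_ci`), using the exact 95% normal quantile.

```python
from math import sqrt
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

def one_prop_ci(x, n):
    """One-sample 95% interval for a single proportion."""
    p = x / n
    me = Z95 * sqrt(p * (1 - p) / n)
    return p - me, p + me

def diff_ci(x1, n1, x2, n2):
    """Two-sample 95% interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    me = Z95 * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - me, (p1 - p2) + me

ci_a = one_prop_ci(84, 112)          # medication A
ci_b = one_prop_ci(66, 108)          # medication B
ci_diff = diff_ci(84, 112, 66, 108)  # A minus B

# The separate intervals overlap if A's lower bound is below B's upper bound
overlap = ci_a[0] < ci_b[1]
print(ci_a, ci_b, overlap, ci_diff)
```

The separate intervals overlap while the interval for the difference excludes zero, reproducing both sets of numbers on the slide.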

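Hypothesis statements:
$H_0: p_1 = p_2$ (equivalently, $p_1 - p_2 = 0$)
$H_a: p_1 > p_2$ (equivalently, $p_1 - p_2 > 0$)
$H_a: p_1 < p_2$ (equivalently, $p_1 - p_2 < 0$)
$H_a: p_1 \neq p_2$ (equivalently, $p_1 - p_2 \neq 0$)
Be sure to define both $p_1$ & $p_2$!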

Since we assume that the population proportions are equal in the null hypothesis, the variances are equal. Therefore, we pool the variances!

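Formula for hypothesis test:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

Here $\hat{p}$ is the pooled sample proportion and $x_1$, $x_2$ are the numbers of successes in the two samples. Usually $p_1 - p_2 = 0$.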

Example 4: A forest in Oregon has an infestation of spruce moths. In an effort to control the moth, one area has been regularly sprayed from airplanes. In this area, a random sample of 495 spruce trees showed that 81 had been killed by moths. A second nearby area receives no treatment. In this area, a random sample of 518 spruce trees showed that 92 had been killed by the moth. Do these data indicate that the proportion of spruce trees killed by the moth is different for these areas?

Assumptions:
• Have 2 independent SRSs of spruce trees
• Both distributions are approximately normal since $n_1 p_1 = 81$, $n_1(1 - p_1) = 414$, $n_2 p_2 = 92$, $n_2(1 - p_2) = 426$, and all > 5
• Population of spruce trees is at least 10,130.

$H_0: p_1 = p_2$
$H_a: p_1 \neq p_2$
where $p_1$ is the true proportion of trees killed by moths in the treated area and $p_2$ is the true proportion of trees killed by moths in the untreated area.

P-value = 0.5547
$\alpha = 0.05$

Since p-value > $\alpha$, I fail to reject $H_0$. There is not sufficient evidence to suggest that the proportion of spruce trees killed by the moth is different for these areas.
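The P-value in Example 4 can be checked with a pooled two-proportion z-test. The sketch below assumes the pooled standard error described earlier in the slides and a two-sided alternative; the function name `two_prop_ztest` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ztest(x1, n1, x2, n2):
    """Two-sided pooled z-test of H0: p1 = p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return z, p_value

# Treated area: 81 of 495 killed; untreated area: 92 of 518 killed
z, p_value = two_prop_ztest(81, 495, 92, 518)
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The test statistic is negative because the treated area has the smaller sample proportion, and the P-value lands within rounding of the 0.5547 reported above, well above 0.05.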