Inference about a Mean in Practice: Inference on a single population mean (σ unknown)
Introduction to Statistical Inference Methods
• Statistical Inference: Drawing conclusions about a population from sample data.
• Methods
◦ Point Estimation: uses a sample statistic to estimate a parameter
◦ Confidence Intervals: supplement an estimate of a parameter with an indication of its variability
◦ Hypothesis Tests: assess evidence for a claim about a parameter by comparing it with observed data
Parameter | Measure | Statistic
μ | Mean of a single population | x̄
p | Proportion of a single population | p̂
μd | Mean difference of two dependent populations (MP) | x̄d
μ1 − μ2 | Difference in means of two populations | x̄1 − x̄2
p1 − p2 | Difference in proportions of two populations | p̂1 − p̂2
σ² | Variance of a single population | s²
σ | Standard deviation of a single population | s
Inference for a single mean (μ)
• We have seen Z confidence intervals and hypothesis tests when μ is our parameter of interest.
• In demonstrating those ideas we assumed we knew σ, but this is unrealistic in practice.
• When we don't know σ, we can use the t distribution.
• Estimate the population standard deviation σ with the sample standard deviation, s, calculated from the sample data.
Inference about a Mean
Building a CI for μ (σ known)
• x̄ ± z* · σ/√n
Building a CI for μ (σ unknown)
• x̄ ± t* · s/√n, where t* comes from the t distribution with n − 1 degrees of freedom
Testing a Claim About a Mean (σ unknown)
A t hypothesis test still has the same steps:
1) State your hypotheses & significance level
2) Calculate the appropriate test statistic
3) Make a decision.
4) Draw conclusions and interpret results.
We still need to follow the same logic we used when creating confidence intervals for means to decide which distribution to use: Z (σ known) vs. T (σ unknown)
Test Statistic: Hypothesis Test of the population mean (σ known)
• z = (x̄ − μ0) / (σ/√n)
Test Statistic: Hypothesis Test of the population mean (σ unknown)
• t = (x̄ − μ0) / (s/√n), with n − 1 degrees of freedom
CI for μ (σ unknown)
• Estimate the average height of adult males in Virginia.
• We do not have information about the population.
• We will take a sample of size 24.
Heights (n = 24): 67 70 68 70 71 70 68 68 69 66 73 73 68 65 67 74 73 68 65 66 67 63 71 68
Sample of Heights (in inches)
• Construct a 95% confidence interval to estimate the mean height of adult males in Virginia.
• Sample statistics: Column: Heights, n = 24, Mean = 68.666667, Std. dev. = 2.8386566
• Population parameters are unknown, so we should use t
◦ Check conditions
Checking Conditions
• Remember, t inference methods work well even if the population is not normal, as long as the data from our sample
◦ do not contain outliers
◦ are not extremely skewed
• Check these conditions visually
◦ Boxplot
◦ Histogram
◦ Normal Probability Plot
Basic Graphs
• No outliers, looks symmetric; conditions hold
Normal Probability Plot
• Conditions of normality seem to hold
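The visual outlier check can be backed up with a quick numeric one. The sketch below applies the usual boxplot convention of flagging points more than 1.5 × IQR beyond the quartiles to the 24 heights. The slides do not say which outlier rule their software used, so the 1.5 × IQR rule and the quartile method of Python's `statistics.quantiles` are assumptions made for illustration.

```python
import statistics

# The 24 sampled heights (in inches) from the example.
heights = [67, 70, 68, 70, 71, 70, 68, 68, 69, 66, 73, 73,
           68, 65, 67, 74, 73, 68, 65, 66, 67, 63, 71, 68]

# Quartiles; statistics.quantiles returns the three cut points for n=4.
q1, median, q3 = statistics.quantiles(heights, n=4)
iqr = q3 - q1

# Assumed rule: a point is an outlier if it lies beyond 1.5 * IQR from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [h for h in heights if h < lower_fence or h > upper_fence]

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence}), outliers = {outliers}")
```

An empty `outliers` list agrees with what the graphs suggest. Different software may compute quartiles slightly differently, so the fences can shift a little.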
T Critical Value
• We need to obtain t* from the table.
• Degrees of freedom:
◦ n − 1 = 24 − 1 = 23
• 95% confidence level
• t* = 2.069
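Solution
• x̄ ± t* · s/√n = 68.667 ± 2.069 × 2.8387/√24 = 68.667 ± 1.199
• The 95% confidence interval for the mean height is approximately (67.47, 69.87) inches.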
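The interval for the heights example can be reproduced in a few lines. This is a minimal sketch using only the Python standard library. The critical value t* = 2.069 is taken from the table value given above rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# The 24 sampled heights (in inches) from the example.
heights = [67, 70, 68, 70, 71, 70, 68, 68, 69, 66, 73, 73,
           68, 65, 67, 74, 73, 68, 65, 66, 67, 63, 71, 68]

n = len(heights)
x_bar = statistics.mean(heights)   # sample mean
s = statistics.stdev(heights)      # sample standard deviation (n - 1 in the denominator)

t_star = 2.069                     # table value for 95% confidence, df = n - 1 = 23

margin = t_star * s / math.sqrt(n) # margin of error E
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.4f}, s = {s:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```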
Correct Interpretation
• We have said before that if we took many, many samples and constructed many, many 95% confidence intervals, 95% of them would capture the true unknown population mean.
• To check this, we will simulate the process of constructing many t confidence intervals.
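One way to run such a simulation is sketched below. It repeatedly draws samples of size 24 from a normal population, builds a 95% t interval from each, and counts how often the interval captures the true mean. The population values (mean 68, standard deviation 3), the number of repetitions, and the seed are assumptions chosen for illustration, not values from the slides. Only t* = 2.069 (df = 23) comes from the example.

```python
import math
import random
import statistics

def t_interval_coverage(n=24, t_star=2.069, mu=68.0, sigma=3.0, reps=5000, seed=1):
    """Fraction of simulated t intervals that contain the true mean mu."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Draw one sample from the assumed normal population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        margin = t_star * statistics.stdev(sample) / math.sqrt(n)
        # Count the interval as a hit if it captures mu.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

coverage = t_interval_coverage()
print(f"Proportion of intervals capturing the mean: {coverage:.3f}")
```

The printed proportion should land close to 0.95, with some run-to-run variation if the seed is changed.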
A closer look
• Suppose I took two samples, both n = 24, and created 2 confidence intervals.
(Figure: intervals for Sample 1 and Sample 2, labeled "Larger Width" and "Smaller Width".)
• Confidence intervals that use the same CL and n may have different values of s, and therefore different margins of error and widths.
Behavior of Z vs T intervals
• Using a t* value creates a wider confidence interval to account for the fact that the sample standard deviation varies from sample to sample.
• If we used the z* value instead, fewer than 95% of the intervals would capture the population mean.
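The difference is easiest to see with a small sample. The sketch below builds both a z* interval and a t* interval from the same simulated samples, each using s in place of σ, and compares how often each captures the mean. The sample size n = 5, the population (mean 0, standard deviation 1), the repetition count, and the seed are assumptions for illustration. The critical values 1.96 and 2.776 (df = 4) are standard table values.

```python
import math
import random
import statistics

def compare_coverage(n=5, z_star=1.96, t_star=2.776, reps=5000, seed=2):
    """Coverage of z* and t* intervals built from the same samples, sigma unknown."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0           # assumed population, unknown to the "analyst"
    z_hits = t_hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        std_err = statistics.stdev(sample) / math.sqrt(n)
        # Same centre and standard error; only the critical value differs.
        if abs(x_bar - mu) <= z_star * std_err:
            z_hits += 1
        if abs(x_bar - mu) <= t_star * std_err:
            t_hits += 1
    return z_hits / reps, t_hits / reps

cover_z, cover_t = compare_coverage()
print(f"z* intervals: {cover_z:.3f}   t* intervals: {cover_t:.3f}")
```

With n this small the z* intervals fall clearly short of 95%, while the t* intervals stay near it. The gap shrinks as n grows.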
Sample Size with the t Confidence Interval
• Determining sample size (n) is a trial-and-error process since n appears in two factors (the critical value t* and the standard error s/√n).
• Procedure to find a desired E:
◦ Start with the zα/2 value and a guess for σ. Solve for n.
◦ Gather sample observations & calculate s. Determine tα/2, n−1 and evaluate E.
◦ Gather more observations to further reduce E and repeat.
• Leave it up to software!
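The first step of the procedure, solving E = zα/2 · σ/√n for n, can be sketched as follows. The function name, the target margin of 1 inch, and the guess σ ≈ 2.84 (borrowed from the heights sample) are illustrative assumptions. The later steps, recomputing E with t and the observed s, would still need to be repeated by hand or by software.

```python
import math
from statistics import NormalDist

def initial_sample_size(margin, sigma_guess, confidence=0.95):
    """First-pass n from E = z * sigma / sqrt(n), rounded up to a whole observation."""
    # z_{alpha/2}: the standard normal quantile leaving alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma_guess / margin) ** 2)

# Hypothetical target: estimate the mean height to within 1 inch.
n0 = initial_sample_size(margin=1.0, sigma_guess=2.84)
print(n0)  # 31
```

Because t* is larger than z* for any finite sample, the margin actually achieved with this n will usually be slightly larger than the target, which is why the procedure iterates.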
Example: HT for μ (σ unknown)
• Information (data is from a few years back) from a gas tracking website stated that the average price in the country for a gallon of regular gasoline is $3.50. You take a random sample of 21 gas stations in Virginia and want to see if the average in our state is actually lower than that (data are shown later). Assume α = 0.01.
• To use the one-sample z test, we need to know the population standard deviation σ.
• Estimating σ with s introduces additional random variability, so we will need to use T.
Full Solution
State the hypotheses.
• H0: μ = 3.50
• Ha: μ < 3.50
• We see that this test is left-tailed with α = 0.01.
• We do not know σ, so we should use a T test statistic with degrees of freedom: df = 21 − 1 = 20
Full Solution
• Summary statistics: Column: Gas Prices, n = 21, Mean = 3.4381, Std. dev. = 0.0903
• Conditions:
◦ SRS of size 21.
◦ Symmetry can be checked by looking at a histogram. Outliers can be checked by using a boxplot. We could accomplish both with a Normal Probability Plot.
Gas Prices: 3.27 3.31 3.33 3.35 3.36 3.38 3.39 3.42 3.44 3.45 3.46 3.47 3.48 3.49 3.51 3.52 3.56 3.57 3.61
Full Solution
• State the critical region
• We need a T critical value with:
◦ d.o.f. = 20
◦ α = 0.01
◦ left-tailed
• Table value
◦ -2.528 (reject H0 if the test statistic is less than -2.528)
Full Solution
Conduct the experiment and calculate the test statistic
• Summary statistics: Column: Gas Prices, n = 21, Mean = 3.4381, Std. dev. = 0.0903
• t = (x̄ − μ0) / (s/√n) = (3.4381 − 3.50) / (0.0903/√21) ≈ -3.1413
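The test statistic can be checked directly from the summary statistics. This sketch uses the reported n, mean, and standard deviation rather than the raw prices, and the variable names are illustrative.

```python
import math

# Summary statistics reported for the gas price sample.
n = 21
x_bar = 3.4381   # sample mean
s = 0.0903       # sample standard deviation
mu_0 = 3.50      # hypothesized mean under H0

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error
df = n - 1

print(f"t = {t_stat:.4f} with df = {df}")
```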
Finding a P-value for a T-test
• We know how to find a P-value when using the Z table, which has many values and probabilities listed.
• The T table only has commonly used critical values.
• Three options:
◦ Technology (find exact P-val)
◦ Opt for the Critical Value method
◦ Estimate a range for the p-value using the T table
Full Solution
• Since it is a left-tailed test, we are interested in:
◦ P(t < -3.1413)
• We can estimate it using the table: 0.001 < p-val < 0.005
• Or use technology: p-val ≈ 0.00257
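For the technology option, statistical software would normally supply the t distribution's cumulative probability directly. To stay within the Python standard library, the sketch below instead integrates the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_lower_tail` are illustrative, not part of any library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_lower_tail(t, df, steps=2000):
    """P(T < t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3           # P(0 < T < |t|)
    # The density is symmetric about 0, so each half has area 0.5.
    return 0.5 - area if t < 0 else 0.5 + area

p_value = t_lower_tail(-3.1413, df=20)
print(f"p-value = {p_value:.5f}")
```

The result should fall inside the range read from the table, 0.001 to 0.005.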
Full Solution
Draw your conclusion
• Using the critical value: our test statistic of T = -3.14 falls in our rejection region.
• Using the p-value: comparing p-val = 0.00257 to α = 0.01, the p-value is smaller than α.
• Both reject H0.
• Since we are rejecting H0, we conclude that there is enough (significant) evidence to infer that the alternative hypothesis Ha is true: the average price of a gallon of regular gasoline in Virginia is lower than $3.50.
Summary of Inference for the Sample Mean
• For known σ: Z
• For unknown σ but "large enough" sample (n ≥ 30): Z
• For unknown σ and small sample size: T
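The summary amounts to a small decision rule, sketched below. The function name and its arguments are hypothetical, and the n ≥ 30 cutoff is the rule of thumb stated in this summary.

```python
def choose_distribution(sigma_known: bool, n: int) -> str:
    """Return 'Z' or 'T' following the summary rule for inference about a mean."""
    # Known sigma, or unknown sigma with a "large enough" sample: use Z.
    if sigma_known or n >= 30:
        return "Z"
    # Unknown sigma and a small sample: use T.
    return "T"

print(choose_distribution(sigma_known=False, n=24))  # heights example: T
print(choose_distribution(sigma_known=False, n=21))  # gas price example: T
```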