Hypothesis Testing – Introduction
• Hypothesis: a conjecture about the distribution of some random variables, for example a claim about the value of a parameter of the statistical model.
• A hypothesis can be simple or composite. A simple hypothesis completely specifies the distribution; a composite hypothesis does not.
• There are two types of hypotheses:
Ø The null hypothesis, H0, is the current belief.
Ø The alternative hypothesis, Ha, is your belief; it is what you want to show.
• We treat the hypotheses being considered as contradictory.
STA 248 week 9

Examples
Each of the following situations requires a significance test about a population mean μ. State the appropriate null hypothesis H0 and alternative hypothesis Ha in each case.
(a) The mean area of the several thousand apartments in a new development is advertised to be 1250 square feet. A tenant group thinks that the apartments are smaller than advertised. They hire an engineer to measure a sample of apartments to test their suspicion.
(b) Larry's car averages 32 miles per gallon on the highway. He now switches to a new motor oil that is advertised as increasing gas mileage. After driving 3000 highway miles with the new oil, he wants to determine whether his gas mileage has actually increased.
(c) The diameter of a spindle in a small motor is supposed to be 5 millimeters. If the spindle is either too small or too large, the motor will not perform properly. The manufacturer measures the diameter in a sample of motors to determine whether the mean diameter has moved away from the target.

Testing Process
• Hypothesis testing works like a proof by contradiction. The testing process has four steps:
• Step 1: Assume H0 is true.
• Step 2: Use statistical theory to construct a statistic (a function of the data) that incorporates the value specified in H0. This statistic is called the test statistic.
• Step 3: Find the probability that the test statistic would take a value as extreme as, or more extreme than, the one actually observed. Think of this as the probability of getting our sample assuming H0 is true.
• Step 4: If the probability calculated in step 3 is high, the sample is likely under H0 and we have no evidence against H0. If the probability is low, the sample is unlikely under H0. This means one of two things: either H0 is false, or H0 is true and we were unlucky.

Test Statistic
• The test is based on a statistic that estimates the parameter that appears in the hypotheses. Usually this is the same estimate we would use in a confidence interval for the parameter. When H0 is true, we expect the estimate to take a value near the parameter value specified in H0.
• Values of the estimate far from the parameter value specified by H0 give evidence against H0. The alternative hypothesis determines which directions count against H0.
• A test statistic measures compatibility between the null hypothesis and the data.
• We use it for the probability calculation that we need for our test of significance.
• It is a random variable with a distribution that we know.

Example
• An air freight company wishes to test whether or not the mean weight of parcels shipped on a particular route exceeds 10 pounds. A random sample of 49 shipping orders was examined and found to have an average weight of 11 pounds. Assume that the standard deviation of the weights (σ) is 2.8 pounds.
• The null and alternative hypotheses in this problem are: H0: μ = 10; Ha: μ > 10.
• The test statistic for this problem is the standardized version of the sample mean, $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ with $\mu_0 = 10$.
• Decision: ?

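The calculation for the air freight example can be sketched in a few lines of Python. This is only an illustration: the significance level alpha = 0.05 is an assumption (the example does not state one), and the standard normal reference distribution relies on n = 49 being large enough for the CLT or on the weights being normal.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the air freight example.
mu0 = 10.0      # mean under H0
xbar = 11.0     # observed sample mean
sigma = 2.8     # known population standard deviation
n = 49          # sample size

# Standardized sample mean; approximately N(0, 1) when H0 is true.
z = (xbar - mu0) / (sigma / sqrt(n))

# Upper-tail P-value, because Ha is mu > 10.
p_value = 1.0 - NormalDist().cdf(z)

alpha = 0.05    # ASSUMED significance level, not given in the example
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With these numbers the statistic is about 2.5 standard errors above 10, so under the assumed level the sketch rejects H0.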
Example
• An old production process is known to have a 10% defective rate.
• A new process was established to improve the defective rate.
• Let p be the probability that an item is defective.
• The hypotheses to test here are:
• Suppose we took a sample of 200 items.
• The appropriate test statistic to use in this case is…

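One natural reading of this example is H0: p = 0.10 against Ha: p < 0.10, tested with the large-sample z statistic for a proportion. The sketch below follows that reading; the count of 12 defectives among the 200 items is hypothetical, since the slide gives no data.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.10        # defective rate under H0 (old process)
n = 200          # sample size from the example
defectives = 12  # HYPOTHETICAL count, chosen only for illustration

p_hat = defectives / n

# Large-sample z statistic; the standard error uses p0 because we assume H0 is true.
z = (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)

# Lower-tail P-value, because the assumed alternative is p < 0.10.
p_value = NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, P-value = {p_value:.4f}")
```

The normal approximation is reasonable here because both n·p0 = 20 and n·(1 − p0) = 180 are well above the usual rule-of-thumb thresholds.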
Graphical Representation
• Let Sn be the set of all possible samples of size n from the population we are sampling from.
• Let C be the set of all samples for which we reject H0. It is called the critical region.
• The complement of C is the set of all samples for which we fail to reject H0. It is called the acceptance region.
• Question: how do we choose a rejection region?

Decision Errors
• When we perform a statistical test we hope that our decision will be correct, but sometimes it will be wrong. There are two possible errors that can be made in a hypothesis test.
• The error made by rejecting the null hypothesis H0 when in fact H0 is true is called a type I error.
• The error made by failing to reject the null hypothesis H0 when in fact H0 is false is called a type II error.

Size of a Test
• The probability that defines the critical region is called the size of the test and is denoted by α.
• The size of the test is also the probability of type I error.
• Example…

Power
• The probability that a fixed size test will reject H0 when H0 is false is called the power of the test.
• Power is not about an error. A powerful test has a large probability of rejecting H0 when we should.
• Example…

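As a rough illustration of power, the sketch below reuses the setup of the air freight example (H0: μ = 10 against Ha: μ > 10, σ = 2.8, n = 49). The significance level 0.05 and the alternative mean μ = 11 are assumptions chosen for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail_z(mu1, mu0, sigma, n, alpha):
    """Power of the size-alpha z-test of H0: mu = mu0 vs Ha: mu > mu0
    when the true mean is mu1 (sigma known)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)      # upper-alpha critical value
    shift = (mu1 - mu0) / (sigma / sqrt(n))        # mean of Z when mu = mu1
    # We reject when Z > z_alpha, and Z ~ N(shift, 1) under mu1.
    return 1.0 - std_normal.cdf(z_alpha - shift)

# ASSUMED alpha and alternative mean; sigma and n come from the air freight example.
power_at_11 = power_upper_tail_z(mu1=11.0, mu0=10.0, sigma=2.8, n=49, alpha=0.05)
power_at_null = power_upper_tail_z(mu1=10.0, mu0=10.0, sigma=2.8, n=49, alpha=0.05)

print(f"power at mu = 11: {power_at_11:.3f}")
print(f"rejection probability at mu = 10: {power_at_null:.3f}")
```

Evaluating the same function at μ = 10 returns the size α, which shows how size and power are two values of one rejection-probability curve.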
Decision Rules
• A hypothesis test is a decision rule whose probability of type I error we fix at α.
• However, for any setup there are many decision rules with the same size.
• The Neyman-Pearson approach: fix α at the largest value that can be tolerated and choose the rejection region that provides the highest power.
• α is called the "significance level" of the test. Typical values of α are 0.05, 0.01, and 0.1.

Test for Mean of Normal Population, σ² Known
• Suppose X1, …, Xn is a random sample from a N(μ, σ²) distribution where σ² is known. We are interested in testing hypotheses about μ.
• The test statistic is the standardized version of the sample mean.
• We could test three sets of hypotheses…

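For reference, the usual three sets of hypotheses and their size-α rejection regions can be written as follows. The notation is an assumption of this sketch: μ0 denotes the value specified by H0 and z_α denotes the upper-α quantile of the standard normal distribution (the `\text` command requires amsmath).

```latex
\[
Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)
\quad \text{when } H_0\colon \mu = \mu_0 \text{ is true}
\]
\[
\begin{array}{lll}
H_0\colon \mu = \mu_0 & H_a\colon \mu > \mu_0 & \text{reject } H_0 \text{ if } Z \ge z_{\alpha} \\
H_0\colon \mu = \mu_0 & H_a\colon \mu < \mu_0 & \text{reject } H_0 \text{ if } Z \le -z_{\alpha} \\
H_0\colon \mu = \mu_0 & H_a\colon \mu \ne \mu_0 & \text{reject } H_0 \text{ if } |Z| \ge z_{\alpha/2}
\end{array}
\]
```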
Example
• The Pfft Light Bulb Company claims that the mean life of its 2 watt bulbs is 1300 hours. Suspecting that the claim is too high, Nalph Rader gathered a random sample of 64 bulbs and tested each. He found the average life to be 1295 hours. Test the company's claim using α = 0.01. Assume σ = 20 hours.

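A possible worked version of the light bulb example, using the critical-value form of the test. It assumes the hypotheses are H0: μ = 1300 against Ha: μ < 1300, which matches the suspicion that the claim is too high.

```python
from math import sqrt
from statistics import NormalDist

# Values from the light bulb example.
mu0 = 1300.0    # claimed mean life (H0)
xbar = 1295.0   # observed sample mean
sigma = 20.0    # known standard deviation
n = 64
alpha = 0.01

z = (xbar - mu0) / (sigma / sqrt(n))   # standardized sample mean

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(alpha)     # lower-tail critical value for Ha: mu < 1300
p_value = std_normal.cdf(z)            # lower-tail P-value

reject_h0 = z < z_crit

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, "
      f"P-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Under these assumptions the observed statistic does not fall below the critical value, so the data do not give enough evidence at the 0.01 level to reject the company's claim.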
Exercise
• A standard intelligence examination has been given for several years with an average score of 80 and a standard deviation of 7. If 25 students, taught with special emphasis on reading skill, obtain a mean grade of 83 on the examination, is there reason to believe that the special emphasis changes the result on the test? Use α = 0.05.

Notes about P-values
• The P-value is the probability, assuming H0 is correct, of getting a value of the test statistic as extreme as or more extreme than the one we observed.
• Small P-values are evidence against H0. The smaller the P-value, the stronger the evidence.
• Guideline for how small is "small":
Ø P-value > 0.1 provides no evidence against H0.
Ø 0.05 < P-value < 0.1 provides weak evidence against H0.
Ø 0.01 < P-value < 0.05 provides moderate evidence against H0.
Ø P-value < 0.01 provides strong evidence against H0.

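What counts as "more extreme" depends on the alternative hypothesis. The sketch below shows one way to compute the P-value of a z statistic for each kind of alternative; the function name, the labels "greater", "less" and "two-sided", and the observed value z = 2.0 are all hypothetical choices for illustration.

```python
from statistics import NormalDist

def z_p_value(z, alternative):
    """P-value of an observed z statistic for a 'greater', 'less',
    or 'two-sided' alternative hypothesis."""
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: parameter > null value
        return 1.0 - std_normal.cdf(z)
    if alternative == "less":         # Ha: parameter < null value
        return std_normal.cdf(z)
    if alternative == "two-sided":    # Ha: parameter != null value
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")

z_obs = 2.0  # HYPOTHETICAL observed statistic
for alt in ("greater", "less", "two-sided"):
    print(alt, round(z_p_value(z_obs, alt), 4))
```

The same observed statistic can therefore give strong, moderate, or no evidence against H0 depending on which direction Ha points.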
Power Calculations - Example

Test for Mean of Normal Population, σ² Unknown
• Suppose X1, …, Xn is a random sample from a N(μ, σ²) distribution where σ² is unknown, n is small, and we are interested in testing hypotheses about μ.
• The test statistic is…

Example
• In a metropolitan area, the concentration of cadmium (Cd) in leaf lettuce was measured in 6 representative gardens where sewage sludge was used as fertilizer. The following measurements (in mg/kg of dry weight) were obtained.
Cd: 21 38 12 15 14 8
• Is there evidence that the mean concentration of Cd is higher than 12?

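A minimal sketch of how the cadmium question could be answered with a one-sample t statistic, assuming the measurements are a random sample from a normal population and testing H0: μ = 12 against Ha: μ > 12. The significance level 0.05 is an assumption, and the critical value 2.015 is the upper 5% point of the t distribution with 5 degrees of freedom, taken from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

cd = [21, 38, 12, 15, 14, 8]   # Cd measurements from the example (mg/kg dry weight)
mu0 = 12.0                     # value of the mean under H0

n = len(cd)
xbar = mean(cd)                # sample mean
s = stdev(cd)                  # sample standard deviation (n - 1 denominator)

# One-sample t statistic; t distribution with n - 1 df under H0 and normality.
t = (xbar - mu0) / (s / sqrt(n))

# ASSUMED alpha = 0.05; critical value for 5 df taken from a standard t table.
t_crit = 2.015
reject_h0 = t > t_crit

print(f"xbar = {xbar}, s = {s:.3f}, t = {t:.3f}, reject H0: {reject_h0}")
```

With only six observations and one large value (38), the standard error is large, so the statistic stays below the assumed critical value.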
R Example
• Data from the US federally mandated maximum speed limits in 1995 were obtained. In 1996, 32 states increased speed limits. Data on the percent change in traffic fatalities (1995-1996) on interstate highways in these 32 states were obtained.
• The parameter of interest is µ, the mean percent change in fatalities.
• The hypotheses of interest here are…

Test for Mean of a Non-Normal Population
• Suppose X1, …, Xn are iid from some distribution with E(Xi) = μ and Var(Xi) = σ². Further suppose that n is large and we are interested in testing hypotheses about μ.
• Since n is large, the CLT applies to the sample mean and the test statistic is again the standardized version of the sample mean; that is, we use the z-test.
• If the variance of the population is unknown, we replace σ with the sample standard deviation s, and the result of the test is approximately correct.

Example – Binomial Distribution
• Suppose X1, …, Xn is a random sample from a Bernoulli(θ) distribution.
• We are interested in testing hypotheses about θ…