Statistics and Quantitative Analysis U 4320 Segment 7

Statistics and Quantitative Analysis U 4320
Segment 7: Hypothesis Testing
Prof. Sharyn O’Halloran

Hypothesis Testing
  I. Introduction
    A. Review of Confidence Intervals

Introduction (cont.)
  B. Hypothesis Testing: Basic Definitions
    1. Hypothesis
      • A hypothesis is a statement about the population.
    2. Null Hypothesis
      • The null hypothesis (H0) is the statement that we want to test. It is always stated as an equality.
      • For instance, H0: μ = 82, where μ is the average test score.
      • Or H0: D = 0, where D is the difference between men's and women's salaries.

Introduction (cont.)
    3. Alternative Hypothesis
      • Every null hypothesis has an associated alternative hypothesis, denoted Ha. It is always stated as an inequality: ≠, >, or <.
      • For instance, the alternative hypothesis to the test scores having a mean of 82 might be Ha: μ ≠ 82.
      • The alternative hypothesis to men's and women's salaries being equal might be Ha: D > 0.

Introduction (cont.)
    4. One-Tailed vs. Two-Tailed Tests
      • If the alternative hypothesis is stated with a ≠ sign, it is called a two-tailed test.
      • If the alternative hypothesis is stated with a < or > sign, it is called a one-tailed test.

Introduction (cont.)
  C. Three Methods for Testing Hypotheses
    1. Method I: Testing hypotheses using confidence intervals.
    2. Method II: Testing hypotheses using p-values.
    3. Method III: Testing hypotheses using critical values.

Hypothesis Testing Using Confidence Intervals
  II. Method I: Hypothesis Testing Using Confidence Intervals
    Note: This method works only for two-tailed tests.

Hypothesis Testing Using Confidence Intervals (cont.)
  A. Example: Differences in Means
    • In a large university, 10 male professors and 5 female professors were randomly sampled. Their salaries were:

Hypothesis Testing Using Confidence Intervals (cont.)
    1. Step 1: Define Hypothesis
      • We are interested in the difference between the means of men's and women's salaries. Call this difference D = (μ1 - μ2).
      • The males state that D = 0.
      • The females say that D = 7.
      • Do the data support both of these hypotheses, one of them, or neither? We will test these hypotheses at the 5% α-level.

Hypothesis Testing Using Confidence Intervals (cont.)
    2. Step 2: Calculate a Confidence Interval
      • Form a 95% confidence interval.
      • Notice that our data are two samples, one of men and the other of women, from the same larger population of university professors, so we can pool our sample variances.
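For reference, a sketch of the standard pooled-variance interval for a difference in means, written in LaTeX. The symbols n_1, n_2, s_1, s_2 and s_p are notation assumed here for the two sample sizes, the two sample standard deviations and the pooled standard deviation; with 10 men and 5 women the t-value would have 10 + 5 - 2 = 13 degrees of freedom.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
(\bar{X}_1 - \bar{X}_2) \;\pm\; t_{.025}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
```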

Hypothesis Testing Using Confidence Intervals (cont.)
      • So the 95% confidence interval is from 1 to 9 thousand dollars.

Hypothesis Testing Using Confidence Intervals (cont.)
    3. Step 3: Accept or Reject the Hypothesis
      • According to these data, is the claim that D = 0 plausible? We must reject the hypothesis that D = 0 because 0 falls outside the 95% confidence interval.
      • What about the hypothesis that D = 7? It falls inside the interval, so we cannot reject it.
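The interval test can be sketched in a few lines of Python. This is only an illustration: the difference of 5 and the standard error of 1.84 are the values the lecture reports for this example, the critical value 2.160 is an assumed reading from a standard t-table (13 degrees of freedom, 2.5% in each tail), and the names `ci_low`, `ci_high` and `in_interval` are hypothetical.

```python
# Values reported in the lecture for the salary example
# (thousands of dollars): observed difference and its standard error.
diff = 5.0
std_error = 1.84

# Assumption: t critical value for 13 degrees of freedom (10 + 5 - 2)
# with 2.5% in each tail, read from a standard t-table.
T_CRIT_13_DF = 2.160

margin = T_CRIT_13_DF * std_error
ci_low, ci_high = diff - margin, diff + margin


def in_interval(hypothesized_d):
    """Return True if the hypothesized difference lies inside the interval."""
    return ci_low <= hypothesized_d <= ci_high


print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print("Reject D = 0:", not in_interval(0))
print("Reject D = 7:", not in_interval(7))
```

The interval comes out at roughly 1 to 9, so D = 0 is rejected and D = 7 is not.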

Hypothesis Testing Using Confidence Intervals (cont.)
    4. Summary: Step-by-Step Procedure
      1. Step 1: Define Hypothesis
        • Pick a significance level; the usual one is 5%.
      2. Step 2: Construct confidence interval
        • The formula depends on the type of data (matched or pooled variance) and on how confident you want to be.
      3. Step 3: Accept or Reject
        • If the hypothesized value falls within this interval, then we fail to reject the null; otherwise we reject it.

Hypothesis Testing Using Confidence Intervals (cont.)
  B. Another Example: Matched Data
    • A firm producing plate glass has developed a less expensive tempering process to allow glass for fireplaces to rise to a higher temperature without breaking. To test it, five different plates of glass were drawn randomly from a production run, then cut in half, with one half tempered by the new process and one half tempered by the old. The two halves were then heated until they broke. The results of the experiment look like this: (next slide)

Hypothesis Testing Using Confidence Intervals (cont.)
  Matched Data (cont.)
    • We want to test the hypothesis that the two processes are equal at the 95% confidence level, that is, at the α = .05 significance level.

Hypothesis Testing Using Confidence Intervals (cont.)
    1. Step 1: Define Hypothesis
      • H0: D = 0; Ha: D ≠ 0; significance level α = 5%.
    2. Step 2: Calculate a 95% confidence interval (σ² unknown).

Hypothesis Testing Using Confidence Intervals (cont.)
    Step 2 (cont.)

Hypothesis Testing Using Confidence Intervals (cont.)
    3. Step 3: Accept or reject null hypothesis?
      • We do not reject the hypothesis H0: D = 0 because 0 falls within that range. The two processes are seen as indistinguishable.

p-Values
  III. Method II: p-Values
    • The p-value is the observed significance level: the probability, computed as if the null hypothesis were true, of getting a result at least as extreme as the one observed.
    • It is not the probability that the hypothesis is true, but it summarizes the credibility of the null hypothesis: the smaller the p-value, the less credible H0.

p-Values
  A. σ known
    1. Step 1: State the Hypothesis
      • A manufacturing process produces TV tubes with an average life μ = 1200 hours and σ = 300 hours. A new process is thought to give tubes a higher average life. In a sample of 100 tubes we find an average life X̄ = 1265 hours. Is the new process really any better than the old?

p-Values
    Step 1 (cont.)
      H0: μ = 1200
      Ha: μ > 1200
      α = .05, or a 5% significance level
      • This is a one-tailed test because we have put all the area in one tail of the distribution. We are interested in those values that are greater than the mean.

p-Values
    2. Step 2: Calculate p-value
      • We know σ and n is large, so we can use the normal distribution.
      • μ0 = 1200, σ = 300, and n = 100.
      • Standard error = σ/√n = 300/√100 = 30.
      • The observed value is X̄ = 1265.
      a. Standardize
        • We then standardize (get the z-value): z = (X̄ - μ0)/SE = (1265 - 1200)/30 = 2.17.

p-Values
      b. Find the probability that goes with the z-score (the probability of the event occurring)
        • From the z-table, the area to the right of z = 2.17 is about .015.

p-Values
    3. Step 3: Accept or Reject the Hypothesis
      • This suggests that if the null hypothesis were true, there would be only a 1.5% probability of observing a sample mean as large as 1265.
      • Since 1.5% is less than our initial 5% significance level, we can reject the null hypothesis.
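The three steps for the TV tube example can be checked with Python's standard library. This is a minimal sketch: the inputs are the lecture's figures, while the variable names and the use of `statistics.NormalDist` in place of a printed z-table are choices made here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypothesis H0: mu = 1200 against Ha: mu > 1200, alpha = 5%.
mu_0, sigma, n = 1200, 300, 100
x_bar = 1265
alpha = 0.05

# Step 2: standardize the observed mean and find the upper-tail area.
std_error = sigma / sqrt(n)             # 300 / 10 = 30
z = (x_bar - mu_0) / std_error          # about 2.17
p_value = 1 - NormalDist().cdf(z)       # area to the right of z

# Step 3: reject H0 when the p-value is below the significance level.
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The computed tail area is about 1.5%, in line with the z-table reading.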

p-Values
    4. Two-Tailed Test
      H0: μ = 1200
      Ha: μ ≠ 1200
      α = .05, or a 5% significance level

p-Values
    Accept or Reject
      • Since the area to the right of 1265 is only 1.5%, which is less than α/2 = 2.5%, we can again reject H0.

p-Values
  B. σ unknown
    • Usually σ is unknown and has to be estimated with the sample standard deviation s. The test statistic is then t instead of Z.

p-Values
    1. Step 1: State Hypothesis (e.g., difference in men's and women's salaries)
      • We know from the above example that (X̄1 - X̄2) = 5.
      • Standard error = 1.84.
      • Is this a one-tailed or a two-tailed test?

p-Values
    2. Step 2: Calculate p-value
      a. Standardize
        • t = (X̄1 - X̄2 - 0)/SE = 5/1.84 = 2.72.

p-Values
      b. Find probability of event from t-table
        • Degrees of freedom = (n1 + n2 - 2) = 10 + 5 - 2 = 13.
        • The observed t-value of 2.72 lies beyond t.01 = 2.65.
        • This means that the tail probability is smaller than .01. That is, p-value < .01.

p-Values
    3. Step 3: Accept or Reject Hypothesis
      • Since the p-value is a measure of the credibility of H0, such a low value (below α = 5%) leads us to conclude that H0 is implausible. Therefore, we reject the null hypothesis.
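A t-table only brackets the p-value, and that lookup can be mimicked in code. The sketch below assumes the usual one-tail critical values for 13 degrees of freedom from a standard t-table; the dictionary `T_TABLE_13_DF` and the other names are hypothetical, and only the difference of 5 and the standard error of 1.84 come from the lecture.

```python
# Values reported in the lecture for the salary example.
diff, std_error = 5.0, 1.84
t_stat = diff / std_error  # about 2.72

# Assumption: one-tail critical values for 13 degrees of freedom,
# as printed in a standard t-table (tail area -> critical t).
T_TABLE_13_DF = {0.05: 1.771, 0.025: 2.160, 0.01: 2.650, 0.005: 3.012}

# The p-value is below every tabulated tail area whose critical value
# the observed t exceeds; the smallest such area is the tightest bound.
exceeded = [tail for tail, crit in T_TABLE_13_DF.items() if t_stat > crit]
p_upper_bound = min(exceeded) if exceeded else None

print(f"t = {t_stat:.2f}, p-value < {p_upper_bound}")
```

Because 2.72 lies between the .01 and .005 columns, the table can only say that the p-value is below .01, which is already enough to reject at the 5% level.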

p-Values
  C. Getting t-values from Computers (Review of Homework)
    1. Calculate t-values
      • How does the computer calculate the t-value?

p-Values
    2. Calculate p-value
      • The 2-tail probability gives the area to the right of the t-value times two.
      • If this value is less than your significance level for a 2-tail test, then reject your null hypothesis.

p-Values
    3. Example: Sample Homework
      • For example, the difference of means test between men's and women's incomes produced a t-value of 6.60 and an associated 2-tail probability of .00.
      • Therefore, I can reject the hypothesis that μ1 - μ2 = 0 because .00 is less than the .05 significance level.

p-Values
  D. Summary
    1. Step 1: Define Hypothesis
      • Choose H0, Ha and a significance level α (default is 5%).
    2. Step 2: Calculate p-value
      • Calculate your p-value from the test statistic:
        • z = (X̄ - μ0)/(σ/√n) if σ is known
        • t = (X̄ - μ0)/(s/√n) if σ is unknown

p-Values
    3. Step 3: Accept or Reject hypothesis
      Reject H0 if p-value < α
      • For a one-tailed test
        • Reject H0 if the p-value is less than the significance level α.
        • Do not reject H0 otherwise.
      • For a two-tailed test
        • Reject H0 if the one-tail p-value is less than 1/2 the significance level (i.e., α/2 = .025).
        • Do not reject H0 otherwise.
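The decision rule in this summary is small enough to write as a function. This is a hedged sketch: the function name `decide` and its parameters are hypothetical, and it follows the lecture's convention of comparing a one-tail p-value with α for a one-tailed test and with α/2 for a two-tailed test.

```python
def decide(one_tail_p, alpha=0.05, two_tailed=False):
    """Apply the Step 3 rule to a one-tail p-value."""
    # A two-tailed test splits alpha evenly between the two tails.
    threshold = alpha / 2 if two_tailed else alpha
    return "reject H0" if one_tail_p < threshold else "do not reject H0"


# TV tube example: tail area of about 1.5% beyond the observed mean.
print(decide(0.015))                   # one-tailed, compared with .05
print(decide(0.015, two_tailed=True))  # two-tailed, compared with .025
print(decide(0.04, two_tailed=True))   # .04 is not below .025
```

Comparing the one-tail area with α/2 is equivalent to doubling it and comparing the result with α, which is how computer output usually reports a 2-tail probability.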

Critical Values
  IV. Method III: Critical Values
    • Classical hypothesis testing is very similar to the p-value approach.
  A. Example: Manufacturing of TV tubes
    1. State the Hypothesis:
      H0: μ = 1200
      Ha: μ > 1200
      α = 5%, n = 100, μ0 = 1200, σ = 300

Critical Values
    2. Test Hypothesis: Find the Critical Values
      A. In General
        • What z-value is associated with 5% of the area under the curve?
        • From the z-tables we see that an upper-tail area of 5% is associated with a z-value of 1.64.
        • The question is what value on the x-axis corresponds to a z-value of 1.64?

Critical Values
      B. Critical Value
        • The critical value is the X-value that corresponds to a Z-value.
        • We obtain the critical value by arbitrarily setting α = 5% and calculating c = μ0 + Z.05 × SE.
      C. Calculating the Critical Value for Manufacturing TV Tubes
        • We know that μ0 = 1200 and SE = 300/√100 = 30.
        • The critical value then is c = 1200 + 1.64 × 30 = 1249.

Critical Values
    3. Step 3: Reject or Accept the Hypothesis
      • To accept or reject our hypothesis, we collect data and see if our sample mean is greater than this critical value.
      • From the above example we observed a sample mean X̄ = 1265.
      • Therefore we reject H0: μ = 1200 because 1265 > 1249.
      • So we once again conclude that the new process is better than the old.
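The one-tailed critical value can be reproduced without a z-table. The sketch below is an illustration using the lecture's figures; it asks `statistics.NormalDist` for the exact 95th percentile (about 1.645) where the lecture uses the rounded table value 1.64, so the result differs from 1249 only after the decimal point. Variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 1200, 300, 100
x_bar = 1265
alpha = 0.05

std_error = sigma / sqrt(n)

# z-value that leaves alpha (5%) of the area in the upper tail.
z_crit = NormalDist().inv_cdf(1 - alpha)

# Critical value on the original (hours) scale.
critical_value = mu_0 + z_crit * std_error

reject_h0 = x_bar > critical_value
print(f"z = {z_crit:.3f}, c = {critical_value:.1f}, reject H0: {reject_h0}")
```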

Critical Values
  B. Example of 2-tailed test
    How do we construct a two-tailed test at the 5% significance level?
    1. Step 1: State Hypothesis
      H0: μ = 1200
      Ha: μ ≠ 1200
      α = 5%

Critical Values
    2. Step 2: Calculate Critical Value
      • We use Z.025 instead of Z.05. In this case, we would get c = μ0 ± Z.025 × SE.
      • c = 1200 ± 1.96 × 30 = 1141 and 1259.

Critical Values
    3. Step 3: Accept or reject null hypothesis
      • We would reject H0 if the observed X̄ fell below 1141 or above 1259.
      • Again, 1265 exceeds the upper critical value, so we still reject H0.
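The two-tailed version differs only in splitting α between the tails. A minimal sketch with the lecture's figures follows; the variable names are hypothetical and the z-value is computed by `statistics.NormalDist` in place of the table value 1.96.

```python
from math import sqrt
from statistics import NormalDist

mu_0, sigma, n = 1200, 300, 100
x_bar = 1265
alpha = 0.05

std_error = sigma / sqrt(n)

# Two-tailed test: put alpha / 2 (2.5%) in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

lower = mu_0 - z_crit * std_error
upper = mu_0 + z_crit * std_error

# Reject when the sample mean falls outside the two critical values.
reject_h0 = x_bar < lower or x_bar > upper
print(f"critical values: {lower:.1f} and {upper:.1f}, reject H0: {reject_h0}")
```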

Critical Values
  C. Summary:
    1. Step 1: Define Hypothesis
      • State H0;
      • State Ha; and
      • Choose a significance level α.

Critical Values
    2. Step 2: Calculate Critical Value
      • Draw a normal curve and find the critical values at the level of significance you arbitrarily set, usually the .05 significance level.
      • For a two-tailed test:
        • σ known: c = μ0 ± Z.025 × SE
        • σ unknown: c = μ0 ± t.025 × SE (estimated)
      • For a one-tailed test:
        • σ known: c = μ0 + Z.05 × SE
        • σ unknown: c = μ0 + t.05 × SE (estimated)

Critical Values
    3. Step 3: Accept or Reject
      • Then collect sample data. If the sample mean exceeds the critical value (for a two-tailed test, if it falls outside either critical value), then reject H0; otherwise do not reject H0.

Notes About the Exam
  V. Notes About the Exam
    1. Hand in your homework at the beginning of class.
    2. The exam will cover the material through today's lecture.
    3. Problems, no definitions.
    4. You may bring a calculator and one 3 x 5 index card with whatever you want written on it.
    5. Z-tables and t-tables will be supplied.

Review Session: Saturday March 8 11 to 1 PM Room 411 IAB