HYPOTHESIS TESTING HERE WE GO!

Hypothesis Testing - Outline
1. Compare the statistical questions that hypothesis tests are designed to address with those that confidence intervals are designed to address.
2. Define the null and alternative hypotheses.
3. Explain why we always start by assuming the null hypothesis is true.
4. Review the basic steps for conducting a hypothesis test.

Hypothesis Testing - Outline (continued)
5. Present terminology for describing the results of hypothesis tests.
6. Discuss the kinds of errors that hypothesis tests are liable to produce.
7. Understand how α functions in the context of hypothesis testing.
8. Outline the basic steps for conducting a hypothesis test.

Example
Jake is the manager of ε, the hippest café at State University. He is environmentally conscious, so he is always looking for ways to conserve resources. He thinks that placing napkin dispensers on every table rather than at the silverware station might reduce the number of napkins that people use. To test his hypothesis, Jake could run an experiment in which he moves the napkin dispensers and measures napkin consumption for a period of time (in other words, collects a sample).

What can Jake do with his sample data?
A: He could calculate a confidence interval, but Jake doesn’t care about the value of µ itself. He cares about whether moving the napkin dispensers affected napkin consumption.
Plan B: So, what can Jake do?
A: Hypothesis testing.

What is hypothesis testing?
Does the sun revolve around the earth? The church said ‘YES’, but Copernicus wasn’t convinced, so he collected and analyzed tons of data and compared those data against two competing hypotheses:
a. The sun revolves around the earth.
b. The earth revolves around the sun.

Key Terms
Hypothesis Testing: a statistical method for deciding which of two hypothetical outcomes for an experiment is more consistent with the experimental data.
Statistical Hypothesis: an educated guess about the value of a population parameter on the basis of past experience or data collection.

Key Terms
Null Hypothesis: a statistical hypothesis that suggests that an experimental manipulation will not affect the outcome of an experiment. The null hypothesis is usually denoted with either HO or H0.
Alternative Hypothesis: a statistical hypothesis that suggests that an experimental manipulation will affect the outcome of an experiment. The alternative hypothesis is usually denoted with either HA or H1.

The Null and Alternative Hypothesis
Null Hypothesis
• Common language: Moving the napkin dispensers will not affect consumption.
• Statistical language: µ will remain at 100 pounds after the napkin dispensers are moved.
• Statistical notation: H0: µ = 100
Alternative Hypothesis
• Common language: Moving the napkin dispensers will affect consumption.
• Statistical language: µ will no longer be 100 pounds after the napkin dispensers are moved.
• Statistical notation: HA: µ ≠ 100

Visually
[Figure: population distribution with µ = 100]
H0: After napkins are moved: µ = 100
HA: After napkins are moved: µ ≠ 100 (µ = ?)

Assuming the Null Hypothesis is True
What is the guiding principle of our legal system?
• Innocent until proven guilty beyond a reasonable doubt.
What is the guiding principle of hypothesis testing?
• The null hypothesis is assumed to be true unless there is overwhelming evidence to the contrary.

The logic of hypothesis testing
Jake’s napkin dispensers
• H0: µ = 100
• HA: µ ≠ 100
Collect a sample of data and compare it with the two hypotheses. What would you conclude if his sample mean was:
• 99.8 pounds?
• 97 pounds?
• 95 pounds?
• 90 pounds?
At some point, the sample mean would be so far from 100 that we could not believe the null hypothesis was true.

Examine Sampling Distribution for the null hypothesis

How rare does an event have to be to abandon the null hypothesis?
• It’s up to you…sort of.
• We decide how rare an event needs to be by setting alpha (α).
  – α is the probability value that defines a very unlikely sample mean.
  – Common values of α are .05, .01, and .001.
  – (But you could set α to anything.)
We will reject the null if the probability of getting our sample mean (or one more extreme) is less than α.

Alpha also defines the probability that the statistical inference we make will be an error.
If we set α = .05:
• We are 95% sure that our confidence interval will contain µ.
• If the null is true, there is a 95% chance that the decision we make about the null will be correct.
• If the null is true, there is a 5% chance that we will make an error (rejecting a null that is actually true).
We can never be 100% sure of any statistical inference.
[Figure: sampling distribution with α/2 = 2.5% in each tail]

Hypothesis Testing using Z
• Can only be done IF:
  – We know the population SD
  – AND the sample size is greater than 30
• As with confidence intervals, this procedure will rarely be appropriate…BUT it offers a helpful place to start.

Basic Steps in Z Hypothesis Testing
1. Set alpha and find the critical value for z (zcrit) that corresponds to α/2 in each tail.
2. Use the sample mean to calculate observed z (zobs) to see where the sample mean falls if the null is true.
3. Compare zobs with zcrit:
   – If |zobs| > |zcrit|, we conclude that our sample mean is a rare event.
   – If |zobs| ≤ |zcrit|, we conclude that our sample mean is not a rare event.

Step 1: Determining a value for zcrit
Find the value for z that leaves α/2 in each tail. If α = .05, we find the z-score that leaves .025 in each tail.
[Figure: sampling distribution if the null is true]
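Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; it assumes a two-tailed test and uses the standard library's `statistics.NormalDist`. The function name `z_crit` is made up for this example.

```python
from statistics import NormalDist

def z_crit(alpha: float) -> float:
    """Two-tailed critical z: the cutoff that leaves alpha/2 in each tail."""
    # inv_cdf gives the z-score with the requested area to its left
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: zcrit = ±{z_crit(alpha):.2f}")
# alpha = 0.05: zcrit = ±1.96
# alpha = 0.01: zcrit = ±2.58
# alpha = 0.001: zcrit = ±3.29
```

The smaller α gets, the farther out in the tails the cutoff sits.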

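Step 2: Calculating a value for zobs
Z-score formula: z = (X - µ) / σ
Zobs formula: zobs = (M - µ0) / σM, where σM = σ / √n
M: sample mean
µ0: the value of µ according to the null hypothesis
σM: the standard error of the mean (the population SD divided by the square root of the sample size n)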

Jake: Steps 1 and 2
Jake’s napkin dispensers
• H0: µ = 100
• HA: µ ≠ 100
α = .05, zcrit = ±1.96
Assume we know: σ = 20
Jake measures paper waste for 100 weeks after his napkin intervention; the mean waste per week is 95.5 pounds.

Jake: Steps 1 and 2
α = .05, zcrit = ±1.96
zobs = (95.5 - 100) / (20 / √100) = -4.5 / 2 = -2.25
(Remember: the sign matters when calculating zobs!)
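Jake's observed z can be reproduced with a short Python sketch. It assumes the standard zobs formula (sample mean minus null mean, divided by the standard error σ/√n); the function and variable names are chosen for this illustration only.

```python
import math

def z_obs(sample_mean, null_mean, sigma, n):
    """Observed z: how many standard errors the sample mean lies from the null mean."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

# Jake's data: M = 95.5, null mean = 100, sigma = 20, n = 100 weeks
jake_z = z_obs(95.5, 100, 20, 100)
print(jake_z)  # -2.25
```

The negative sign says the sample mean fell below the null value of 100.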

Step 3: Comparing zobs with zcrit
[Figure: sampling distribution if the null is true, centered at µ = 100, with the rejection region shaded; Jake’s zobs = -2.25]
The shaded region is referred to as the rejection region.
• If zobs falls in the shaded region, we reject the null hypothesis.
• Interpretation: the experimental manipulation affected the outcome of the experiment.

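What if we set alpha at .01 (99% sure)?
[Figure: sampling distribution if the null is true, centered at µ = 100, with the rejection region shaded]
• If zobs does not fall in the shaded region, we fail to reject the null hypothesis.
• Interpretation: we do not have enough evidence to conclude that the experimental manipulation affected the outcome of the experiment.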

Statistical significance
Statisticians communicate their results in terms of the statistical significance of their experimental manipulation.
If we say a result is significant…
• We mean: the result is very unlikely to be due to chance (sampling error), and we are rejecting the null hypothesis.

Example:
• If α = .05, Jake would reject the null hypothesis, and he would report that moving the napkin dispensers significantly decreased napkin consumption.
• If α = .01, Jake would fail to reject the null hypothesis, and he would report that moving the napkin dispensers did not significantly decrease napkin consumption.
  – Put another way, we do not have enough evidence to conclude that moving the napkin dispensers decreased consumption.
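The way Jake's decision flips between α = .05 and α = .01 can be checked with a small Python sketch. It assumes a two-tailed test and takes Jake's zobs = -2.25 as given; the helper name `decision` is hypothetical.

```python
from statistics import NormalDist

def decision(z_observed, alpha):
    """Compare |zobs| with the two-tailed critical value for the given alpha."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return "reject H0" if abs(z_observed) > z_critical else "fail to reject H0"

for alpha in (0.05, 0.01):
    print(alpha, decision(-2.25, alpha))
# 0.05 reject H0
# 0.01 fail to reject H0
```

The data are identical in both runs; only the cutoff moved (from about 1.96 to about 2.58).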

Basic steps in Hypothesis Testing
1. Specify the NULL hypothesis (H0): usually µ = some value.
2. Specify the ALTERNATIVE hypothesis (HA): usually µ ≠ some value.
3. Designate the rejection region by selecting α. Must do this BEFORE data collection!
4. Determine the critical value of your test statistic: find the critical value (z score) such that α/2 lies in each tail.
5. Use sample statistics to calculate the test statistic.
6. Compare zobs with zcrit: if the test statistic falls in the rejection region, we reject the null. If the test statistic does not fall in the rejection region, we fail to reject the null.
7. Interpret your decision regarding the null.
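Steps 4 through 6 can be collected into one function. The sketch below is an illustration under the slides' assumptions (known σ, two-tailed test), not a general-purpose routine; the name `z_test` and its return format are choices made here. It is run on Jake's napkin data.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, null_mean, sigma, n, alpha=0.05):
    """Two-tailed z test with known sigma; returns (zobs, zcrit, decision)."""
    # Step 4: critical value that leaves alpha/2 in each tail
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: observed test statistic, assuming the null is true
    z_obs = (sample_mean - null_mean) / (sigma / math.sqrt(n))
    # Step 6: does zobs fall in the rejection region?
    decision = "reject H0" if abs(z_obs) > z_crit else "fail to reject H0"
    return z_obs, z_crit, decision

z_obs, z_crit, decision = z_test(95.5, 100, 20, 100, alpha=0.05)
print(f"zobs = {z_obs:.2f}, zcrit = ±{z_crit:.2f}: {decision}")
# zobs = -2.25, zcrit = ±1.96: reject H0
```

Steps 1 to 3 (stating the hypotheses and choosing α before collecting data) and step 7 (interpreting the decision) remain the analyst's job.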

Self-scheduled Assignments
To eliminate complaints about final exam conflicts, Professor Hobbes allowed his students to self-schedule their final exams for any day of finals week. He was curious about whether self-scheduling would affect student performance. Hobbes compared exam scores from this semester with what he has come to expect from his many years of teaching Intro Philosophy: µ = 85, σ = 16. He chose a sample of 100 students from his class and obtained a mean exam score of 81.
Conduct a hypothesis test (α = .05) to determine whether or not self-scheduling influenced exam performance.

Self-scheduled Assignments
µ = 85, σ = 16; n = 100; M = 81
Step 1: H0: µ = 85
Step 2: HA: µ ≠ 85
Steps 3 and 4: α = .05: zcrit = ±1.96
Step 5: zobs = (81 - 85) / (16 / √100) = -4 / 1.6 = -2.50

Self-Scheduled Assignments: α = .05
zobs = -2.50
zcrit = ±1.96
[Figure: sampling distribution when the null is true]

Proper Statistical Notation
For z: z = observed z value, p < alpha value
Example: z = -2.5, p < .05
YOU MUST DO THIS WHEN REPORTING YOUR RESULTS!!!

Self-scheduled Assignments
µ = 85, σ = 16; n = 100; M = 81
Step 1: H0: µ = 85
Step 2: HA: µ ≠ 85
Steps 3 and 4: α = .05: zcrit = ±1.96
Step 5: zobs = (81 - 85) / (16 / √100) = -4 / 1.6 = -2.50
Step 6: zobs falls in the rejection region: reject the null.
Step 7: What does this tell us about the effect of self-scheduled exams on performance?
Self-scheduled exams significantly lower grades, z = -2.5, p < .05
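The report "z = -2.5, p < .05" can be backed by an actual probability. The sketch below goes one step beyond the slides by computing the two-tailed p-value directly (the probability of a sample mean at least this extreme if the null is true); the variable names are illustrative.

```python
import math
from statistics import NormalDist

null_mean, sigma, n, sample_mean, alpha = 85, 16, 100, 81, 0.05

z_obs = (sample_mean - null_mean) / (sigma / math.sqrt(n))
# Two-tailed p-value: area in both tails beyond |zobs|
p_value = 2 * NormalDist().cdf(-abs(z_obs))

comparison = "<" if p_value < alpha else ">"
alpha_text = f"{alpha:.2f}".lstrip("0")  # ".05", matching the slides' style
report = f"z = {z_obs:.1f}, p {comparison} {alpha_text}"
print(report)              # z = -2.5, p < .05
print(round(p_value, 4))   # 0.0124
```

Because the p-value is below .05 but above .01, the same data would not be significant at α = .01.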

How does Hypothesis Testing relate to CIs?
Edison Light Bulbs
• µ = 1200 hr and σ = 180 hr.
• We measure 100 light bulbs and get M = 1170 hours.
We can compute a 95% CI for the mean we got:
M ± zcrit × σM = 1170 ± 1.96 × (180 / √100) = 1170 ± 35.28
Interpretation: We are 95% certain that the population mean from which this sample was drawn is between 1135 and 1205.
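One way to see the link: for a two-tailed z test at α = .05, the null value lies inside the 95% CI exactly when we would fail to reject the null. The Python sketch below checks this for the light bulb data; the function name `confidence_interval` is chosen for this example.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """CI for the population mean when sigma is known."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = confidence_interval(1170, 180, 100)
print(f"95% CI: {low:.0f} to {high:.0f}")                  # 95% CI: 1135 to 1205
print("null value 1200 inside CI:", low <= 1200 <= high)   # True
```

Since 1200 is inside the interval, a two-tailed test of H0: µ = 1200 at α = .05 fails to reject the null.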

We never “accept” the null…
Statisticians are conservative
Professor Hobbes and his suspected cheaters:
• H0: The two students DID NOT cheat.
• HA: The two students DID cheat.
If the evidence of cheating was suspicious, but not strong enough for a formal accusation, would Hobbes:
• Accept the null?
• Fail to reject the null?
Statisticians are cautious
a. Is the earth flat?
b. Are there cows with giant holes in their sides?
On your own time, google fistulated cow.

Hypothesis Testing Errors
Result of test: Reject H0
• Reality, H0 is true: TYPE I ERROR (α)
• Reality, H0 is false: Correct rejection
Result of test: Fail to reject H0
• Reality, H0 is true: Correct FTR (fail to reject)
• Reality, H0 is false: TYPE II ERROR (β)
Type I Error: rejecting the null even though (in reality) it is true; P(Type I error) = α.
Type II Error: failing to reject the null even though (in reality) it is false; P(Type II error) = β.
Questions:
• Which error type concerns statisticians more? Why?

The Mystery of α
How can α simultaneously be both
• The critical probability for rejecting the null AND
• The Type I error rate?

The Mystery of α (continued)
Critical probability
• If α = .05, we will reject the null hypothesis if the probability of observing our sample mean (or one more extreme) is less than .05.
Why is the Type I error rate = .05 in this scenario?
We can only make a Type I error if the null is true.
Q: What has to happen for us to reject the null even if it is true?
A: Our sample mean has to fall in the rejection region.
Q: What is the probability that our sample mean will fall in the rejection region (RR) if the null is true?
A: α
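The claim that the rejection region catches a fraction α of samples when the null is true can be checked by simulation. The sketch below is illustrative only: it borrows Jake's setup (µ0 = 100, σ = 20, n = 100), draws sample means straight from the sampling distribution under the null, and counts rejections. The trial count and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(null_mean, sigma, n, alpha, trials=20_000, seed=0):
    """Fraction of simulated samples that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw one sample mean from the sampling distribution under the null
        sample_mean = rng.gauss(null_mean, se)
        z_obs = (sample_mean - null_mean) / se
        if abs(z_obs) > z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate(100, 20, 100, alpha=0.05)
print(f"simulated Type I error rate: {rate:.3f}")
```

The printed rate should land close to .05, and changing `alpha` moves it accordingly.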

Ping Pong Volcano

Ping Pong Volcano
Ping Pong Volcano: randomly select one ball from the volcano.
• 95% yellow balls
• 5% red balls
• 5% chance of ‘winning’ (selecting a red ball)
Hypothesis Testing: randomly select one sample from the sampling distribution.
• 95% of samples
• 5% of samples
• 5% chance of selecting a ‘red ball’ sample
Last question: You are omniscient, so you know moving the napkin dispensers has no effect. What is the probability that Jake will collect a sample of data that will lead him to reject the null (commit a Type I error)?

Why not make alpha as small as possible?
If the null is true: any sample mean in this range (the rejection region) will lead us to erroneously reject the null (a Type I error).

Why not make alpha as small as possible?
If the null is false: any sample mean in this range (outside the rejection region) will lead us to erroneously fail to reject the null (a Type II error). Making α smaller widens this range.
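The trade-off can be put in numbers. The sketch below computes β, the Type II error rate, for several values of α. It needs a specific true mean, which the slides do not give, so it assumes (purely for illustration) that moving the dispensers really lowered µ from 100 to 95, with σ = 20 and n = 100 as in Jake's example.

```python
from statistics import NormalDist

def type_ii_error_rate(alpha, null_mean, true_mean, sigma, n):
    """P(fail to reject H0) when the population mean is really true_mean."""
    se = sigma / n ** 0.5
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Sample means between these bounds do NOT fall in the rejection region
    lower = null_mean - z_critical * se
    upper = null_mean + z_critical * se
    # Where sample means actually come from when the null is false
    sampling = NormalDist(mu=true_mean, sigma=se)
    return sampling.cdf(upper) - sampling.cdf(lower)

# Hypothetical true mean of 95 (an assumption made for this example)
betas = [type_ii_error_rate(a, 100, 95, 20, 100) for a in (0.05, 0.01, 0.001)]
for a, b in zip((0.05, 0.01, 0.001), betas):
    print(f"alpha = {a}: beta = {b:.2f}")
```

Each cut in α buys fewer Type I errors at the price of a larger β, which is why α is not simply set as small as possible.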

Edison Light Bulbs: What affects hypothesis testing?
Taking this class has made you intellectually curious (I’m serious!). So, you decide to see whether Edison light bulbs last as long as the package says they do; according to the package: µ = 1200 hr and σ = 180 hr. Along with your physics major roommate, you construct a bank of 100 light bulbs and watch them round the clock to see when they burn out. The mean of your sample is 1170 hours. Does this constitute evidence of consumer fraud by Edison light bulbs if α = .05?
zobs = (1170 - 1200) / (180 / √100) = -30 / 18 = -1.67
If we set alpha at .05, then zcrit = ±1.96.
We will fail to reject the null.

What affects the hypothesis test?
1. Variability (SE): a smaller σ will make you more likely to reject the null.
Example: We sample 100 Big Y light bulbs to determine if they last as long as the package claims: 1200 hours.
M = 1170
σ = 150 (instead of 180)
What is the probability that we get this sample from a population with a mean of 1200?
zobs = (1170 - 1200) / (150 / √100) = -30 / 15 = -2.00
If we set alpha at .05, then zcrit = ±1.96.
We will reject the null.

What affects the hypothesis test?
2. Sample size: the larger the sample size, the more likely we are to reject.
Example: We sample 225 (rather than 100) Big Y light bulbs to determine if they last as long as the package claims: 1200 hours.
M = 1170
σ = 180 (back to original)
What is the probability that we get this sample from a population with a mean of 1200?
zobs = (1170 - 1200) / (180 / √225) = -30 / 12 = -2.50
If we set alpha at .05, then zcrit = ±1.96.
We will reject the null.

What affects the hypothesis test?
3. Sample mean: the farther it is from the null mean, the more likely we are to reject.
Example: We sample 100 (back to original) Big Y light bulbs to determine if they last as long as the package claims: 1200 hours.
M = 1140 (rather than 1170)
σ = 180 (back to original)
What is the probability that we get this sample from a population with a mean of 1200?
zobs = (1140 - 1200) / (180 / √100) = -60 / 18 = -3.33
If we set alpha at .05, then zcrit = ±1.96.
We will reject the null.
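The four light bulb scenarios can be compared side by side. This Python sketch recomputes zobs for each one against the same null mean of 1200 and the same two-tailed α = .05; the scenario labels and variable names are made up for the illustration.

```python
import math
from statistics import NormalDist

NULL_MEAN = 1200
Z_CRIT = NormalDist().inv_cdf(0.975)  # two-tailed critical value for alpha = .05

# (sample mean, sigma, n) for each scenario in the slides
scenarios = {
    "original": (1170, 180, 100),
    "smaller sigma": (1170, 150, 100),
    "larger n": (1170, 180, 225),
    "farther mean": (1140, 180, 100),
}

results = {}
for label, (m, sigma, n) in scenarios.items():
    z = (m - NULL_MEAN) / (sigma / math.sqrt(n))
    rejected = abs(z) > Z_CRIT
    results[label] = (round(z, 2), rejected)
    print(f"{label:14s} zobs = {z:6.2f}  reject H0: {rejected}")
# original       zobs =  -1.67  reject H0: False
# smaller sigma  zobs =  -2.00  reject H0: True
# larger n       zobs =  -2.50  reject H0: True
# farther mean   zobs =  -3.33  reject H0: True
```

Each change either shrinks the standard error or enlarges the distance from the null mean, and both push |zobs| past the cutoff.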