Common Statistical Mistakes

Mistake #1
• Failing to investigate data for data entry or recording errors.
• Failing to graph data and calculate basic descriptive statistics before analyzing data.
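
One way to screen data before any analysis is sketched below. The readings are hypothetical (they are not from the slides), with 39.0 standing in for a mistyped entry, and the 1.5 × IQR fence is just one common rule of thumb, assumed here for illustration.

```python
import statistics

# Hypothetical measurements, NOT from the slides; 39.0 plays the role of a typo.
readings = [24.1, 25.3, 23.8, 26.0, 24.9, 22.7, 25.5, 24.4,
            39.0, 23.2, 26.4, 24.0, 25.1, 22.9, 27.1, 24.6]

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

def flag_suspects(values, k=1.5):
    """Return values lying outside the fences q1 - k*IQR and q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

summary = describe(readings)
suspects = flag_suspects(readings)

for name, value in summary.items():
    print(f"{name:>6}: {value}")
print("Check against the original records:", suspects)
```

A flagged value is only a prompt to go back to the source records; it is not, by itself, a reason to delete the observation.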

Example: Wrong Decision Due to Error
Test of mu = 26.000 vs mu not = 26.000
Variable   N    Mean  StDev  SE Mean      T      P  95.0% CI
With      16  25.625  3.964    0.991  -0.38   0.71  (23.513, 27.737)
Without   15  24.733  1.792    0.463  -2.74  0.016  (23.741, 25.725)
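
The T values above can be checked from the summary statistics alone. The sketch below does that, and also backs out the value of the dropped observation under the assumption (not stated on the slide) that "Without" is the "With" sample minus exactly one data point.

```python
import math

MU0 = 26.0  # hypothesized mean from the output above

def one_sample_t(n, mean, stdev, mu0=MU0):
    """t statistic for H0: mu = mu0, computed from summary statistics."""
    se = stdev / math.sqrt(n)
    return (mean - mu0) / se

t_with = one_sample_t(16, 25.625, 3.964)
t_without = one_sample_t(15, 24.733, 1.792)

# Assumption: the two samples differ by a single observation, so that
# observation equals the difference between the two sample totals.
dropped = 16 * 25.625 - 15 * 24.733

print(f"T with the observation    = {t_with:.2f}")
print(f"T without the observation = {t_without:.2f}")
print(f"Implied dropped value     = {dropped:.1f}")
```

Under that assumption, a single value far from the rest both pulls the mean toward 26 and more than doubles the standard deviation, which is enough to hide the difference.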

Mistake #2
• Using the wrong statistical procedure in analyzing your data.
• Includes failing to check that necessary assumptions are met.

Example: Wrong Decision Due to Wrong Analysis
Pulse Rates Before and After Marching
Student  BEFORE  AFTER  DIFF A-B
1            60     78        18
2            56     66        10
3            90     96         6
4            78     88        10
Paired data design, so analyze with a paired t-test.

Example: Wrong Decision Due to Wrong Analysis
Paired T for AFTER - BEFORE
            N   Mean  StDev  SE Mean
AFTER       4  82.00  12.96     6.48
BEFORE      4  71.00  15.87     7.94
Difference  4  11.00   5.03     2.52
95% CI for mean difference: (2.99, 19.01)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.37  P-Value = 0.02
Conclude mean pulse rate after is greater than mean pulse rate before.

Example: Wrong Decision Due to Wrong Analysis
Two sample T for AFTER vs BEFORE
        N  Mean  StDev  SE Mean
AFTER   4  82.0   13.0      6.5
BEFORE  4  71.0   15.9      7.9
95% CI for mu AFTER - mu BEFORE: (-15.3, 37.3)
T-Test mu AFTER = mu BEFORE (vs not =): T = 1.07  DF = 5  P = 0.33
Conclude no difference in mean pulse rates before and after marching.
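
The contrast between the two analyses can be recomputed from the four pulse-rate pairs. This is a minimal sketch using only the standard library; the function names are illustrative, and the two-sample version assumes the unequal-variance (Welch) form, which is consistent with the T printed above.

```python
import math
import statistics

# Pulse rates from the example, students 1 to 4.
before = [60, 56, 90, 78]
after = [78, 66, 96, 88]

def paired_t(x, y):
    """Paired t statistic and degrees of freedom for mean(x - y) = 0."""
    diffs = [a - b for a, b in zip(x, y)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se, len(diffs) - 1

def welch_t(x, y):
    """Two-sample (unequal variance) t statistic and approximate df."""
    vx = statistics.variance(x) / len(x)
    vy = statistics.variance(y) / len(y)
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

t_paired, df_paired = paired_t(after, before)
t_welch, df_welch = welch_t(after, before)

print(f"paired:     T = {t_paired:.2f}, df = {df_paired}")
print(f"two-sample: T = {t_welch:.2f}, df = {df_welch:.2f}")
```

Both tests see the same mean difference of 11. The paired test divides it by the small variability of the within-student differences, while the two-sample test divides it by the much larger student-to-student variability, so the same data give a large T in one case and a small T in the other.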

Mistake #3
• Failing to design your study so that it has high enough power to call meaningful differences “significantly different.”
• Includes concluding that the null hypothesis is true. Should be “not enough evidence to say the null is false.”

Example: Low Power
Success = Yes, I recycle.
Gender   X   N  Sample p
Male    33  59  0.559322
Female  54  79  0.683544
Estimate for p(1) - p(2): -0.124222
95% CI for p(1) - p(2): (-0.287215, 0.0387704)
Test for p(1) - p(2) = 0 (vs not = 0): Z = -1.49  P-Value = 0.135
A number of students said that they were surprised that the hypothesis test said “no difference in percentages.”
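
The recycling figures can be recomputed from the counts. The sketch below assumes the unpooled standard error for both the test and the interval; the slide does not say which form was used, but this choice agrees with the printed Z and confidence interval.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, conf=0.95):
    """Difference in proportions with z statistic, two-sided P-value and CI.

    Assumption: unpooled standard error for both the test and the interval.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = diff / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    z_crit = NormalDist().inv_cdf(0.5 + conf / 2)
    return diff, z, p_value, (diff - z_crit * se, diff + z_crit * se)

# Male: 33 of 59 recycle; Female: 54 of 79 recycle.
diff, z, p_value, ci = two_proportion_z(33, 59, 54, 79)

print(f"estimate = {diff:.6f}")
print(f"Z = {z:.2f}, P-value = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
```

The interval runs from a 29-point deficit to a 4-point surplus, so the data are compatible with no difference and also with a large one. That is "not enough evidence," not "no difference."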

Example: Low Power and Sample Size
Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for: proportion 1 = 0.55 and proportion 2 = 0.70
Alpha = 0.05  Difference = -0.15
Sample Size*   Power
60            0.4366
70            0.4911
80            0.5421
*Sample size = # in EACH group
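
A rough power calculation of this kind can be sketched with a plain normal approximation. This is an illustrative approximation only: statistical packages use their own formulas, so it is not expected to reproduce the printed power values exactly. The 80% target in the second function is a conventional choice assumed here, not something taken from the slides.

```python
import math
from statistics import NormalDist

def approx_power(p1, p2, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided two-proportion z test."""
    norm = NormalDist()
    se = math.sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    shift = abs(p1 - p2) / se            # true difference in standard-error units
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)

def n_for_power(p1, p2, target=0.80, alpha=0.05):
    """Smallest per-group sample size whose approximate power reaches target."""
    n = 2
    while approx_power(p1, p2, n, alpha) < target:
        n += 1
    return n

powers = {n: approx_power(0.55, 0.70, n) for n in (60, 70, 80)}
needed = n_for_power(0.55, 0.70)

for n, power in powers.items():
    print(f"n per group = {n}: approximate power = {power:.2f}")
print(f"n per group for about 80% power: {needed}")
```

Whatever the exact formula, the message is the same: with 60 to 80 students per group, a true 15-point difference would be detected only about half the time.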

Mistake #4
• Failing to report a confidence interval as well as the P-value.
• P-value tells you if statistically significant.
• Confidence interval tells you what the population value might be.

Example: A Significant, but Potentially Meaningless Difference
Two sample T for Phone
Gender   N  Mean  StDev  SE Mean
Male    59    79    162       21
Female  80   153    247       28
95% CI for mu (1) - mu (2): (-142, -5)
T-Test mu (1) = mu (2) (vs not =): T = -2.11  P = 0.036  DF = 135
The P-value tells us the difference is significant, but the confidence interval tells us that the difference in the averages could be as small as 5 minutes.

Incidentally…. Outliers

Removing Outliers …
Two sample T for Phone
Gender   N  Mean  StDev  SE Mean
Male    58  59.9   66.5      8.7
Female  79   129    133       15
95% CI for mu (1) - mu (2): (-103.7, -35)
T-Test mu (1) = mu (2) (vs not =): T = -4.02  DF = 121  P = 0.0001
The difference in male and female phone usage becomes even more significant. We are 95% confident that the difference in the averages is now more than 35 minutes.

Mistake #5
• “Fishing” for significant results. That is, performing several hypothesis tests on a data set, and reporting only those results that are significant.
• If α = P(Type I error) = 0.05, and we perform 20 tests on the same data set, we can expect to make 1 Type I error even when every null hypothesis is true (0.05 × 20 = 1).
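
The arithmetic behind the fishing problem can be sketched as follows. The expected count comes straight from the slide; the chance of at least one false positive and the small simulation both add the assumption that the 20 tests are independent and that every null hypothesis is true.

```python
import random

ALPHA = 0.05
N_TESTS = 20

# Expected number of Type I errors: the 0.05 x 20 = 1 from the slide.
expected_errors = ALPHA * N_TESTS

# Chance of at least one Type I error, assuming independent tests.
p_at_least_one = 1 - (1 - ALPHA) ** N_TESTS

def simulate(n_studies=10_000, seed=1):
    """Simulate studies of N_TESTS true-null tests each.

    Under a true null a P-value is uniform on (0, 1), so each test
    rejects with probability ALPHA.
    """
    rng = random.Random(seed)
    total_hits = 0
    studies_with_hit = 0
    for _ in range(n_studies):
        hits = sum(rng.random() < ALPHA for _ in range(N_TESTS))
        total_hits += hits
        studies_with_hit += hits > 0
    return total_hits / n_studies, studies_with_hit / n_studies

mean_errors, share_with_any = simulate()

print(f"expected Type I errors per study: {expected_errors:.2f}")
print(f"P(at least one Type I error):     {p_at_least_one:.3f}")
print(f"simulated mean errors per study:  {mean_errors:.2f}")
print(f"simulated share with any error:   {share_with_any:.3f}")
```

Under these assumptions, roughly two out of three such studies would have at least one "significant" result to report even though nothing real is going on.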

Example: Results Obtained from Fishing
• Primary driver of $10,000 vehicle and going away for Spring Break are related (P = 0.01).
• Virginity and supporting self through school are related (P = 0.045).
• Virginity and graduating in four years are related (P = 0.041).
• Virginity and attending non-football PSU sports events are related (P = 0.016).

Mistake #6
• Overstating the results of an observational study.
  – That is, suggesting that one variable “caused” the differences in the other variable.
  – As opposed to correctly saying that the two variables are “associated” or “correlated.”
• Don’t forget that a significant result may be “spurious.”

Example: Misleading Headlines
• Virgins don’t support themselves through school.
• Non-virgins too busy to go to non-football PSU sporting events.
• Non-virgins also too busy to graduate in four years.

Mistake #7
• Using a non-random or unrepresentative sample.
• Includes extending the results of an unrepresentative sample to the population.

Example: Unrepresentative sample
• Shere Hite wrote a book in 1987 called “Women and Love.”
• 100,000 questionnaires about love, sex, and relationships sent to women’s groups. Only 4,500 questionnaires returned (a 4.5% response rate).
• Entire book devoted to results of survey.
• Examples: 91% of divorcees initiated the divorce; 70% of women married 5 years committed adultery.

Mistake #8
• Failing to use all of the basic principles of experiments, including randomization, blinding, and controlling.
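
As an illustration of the randomization principle, treatment assignment can be left to a random shuffle rather than to the experimenter's judgment. The subject IDs, group names, and seed below are hypothetical, chosen only to make the sketch reproducible.

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }

# Hypothetical subject IDs S01 to S20.
subjects = [f"S{i:02d}" for i in range(1, 21)]
groups = randomize(subjects, seed=2024)

for name, members in groups.items():
    print(f"{name}: {members}")
```

Recording the seed lets the assignment be audited later. Blinding and the use of a control group still have to be handled in the study protocol itself.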