
Every day is a new beginning in life. Every moment is a time for self-vigilance.

Multiple Comparisons
• Control of the error rate
• Pairwise comparisons
• Comparisons to a control
• Linear contrasts

Multiple Comparison Procedures

Once we reject $H_0: \mu_1 = \mu_2 = \dots = \mu_c$ in favor of $H_1$: NOT all $\mu$'s are equal, we don't yet know the way in which they're not all equal, but simply that they're not all the same. If there are 4 columns, are all 4 $\mu$'s different? Are 3 the same and one different? If so, which one? etc.

These “more detailed” inquiries into the process are called MULTIPLE COMPARISON PROCEDURES.

Errors (Type I): We set up “$\alpha$” as the significance level for a hypothesis test. Suppose we test 3 independent hypotheses, each at $\alpha = .05$; each test has a Type I error rate (rejecting $H_0$ when it's true) of .05. However,

$$P(\text{at least one Type I error in the 3 tests}) = 1 - P(\text{accept all 3, given all true}) = 1 - (.95)^3 \approx .14$$

In other words, the probability is .14 that at least one Type I error is made. For 5 tests, the probability is .23.

Question: Should we choose $\alpha = .05$ and suffer (for 5 tests) a .23 experimentwise error rate (“a” or $\alpha_E$)? OR should we choose/control the overall error rate, “a”, to be .05, and find the individual-test $\alpha$ from $1 - (1 - \alpha)^5 = .05$ (which gives $\alpha \approx .010$)?
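The two error-rate calculations above can be checked numerically. This is a minimal sketch that assumes the tests are independent; the function names are illustrative and do not come from the slides.

```python
def experimentwise_error(alpha: float, k: int) -> float:
    """P(at least one Type I error) in k independent tests, each at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


def individual_alpha(a: float, k: int) -> float:
    """Per-test level that yields experimentwise rate a over k independent tests."""
    return 1.0 - (1.0 - a) ** (1.0 / k)


three_tests = experimentwise_error(0.05, 3)
five_tests = experimentwise_error(0.05, 5)
per_test = individual_alpha(0.05, 5)
print(f"3 tests: {three_tests:.3f}, 5 tests: {five_tests:.3f}, per-test alpha: {per_test:.4f}")
```

Plugging `per_test` back into `experimentwise_error(per_test, 5)` recovers .05, which is a quick way to confirm the inversion.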

The formula $1 - (1 - \alpha)^5 = .05$ would be valid only if the tests are independent; often they're not.

[e.g., test 1: $\mu_1 = \mu_2$; test 2: $\mu_2 = \mu_3$; test 3: $\mu_1 = \mu_3$. IF test 1 is accepted and test 2 is rejected, isn't it more likely that test 3 is rejected?]

Error Rates

When the tests are not independent, it's usually very difficult to arrive at the correct $\alpha$ for an individual test so that a specified value results for the experimentwise error rate (also called the family error rate).

There are many multiple comparison procedures. We'll cover only a few.

Pairwise Comparisons

Method 1: (Fisher Test) Do a series of pairwise t-tests, each with a specified $\alpha$ value (for the individual test). This is called “Fisher's LEAST SIGNIFICANT DIFFERENCE” (LSD).

Example: Broker Study

A financial firm would like to determine whether the brokers it uses to execute trades differ in their ability to provide a stock purchase for the firm at a low buying price per share. To measure cost, an index, Y, is used:

Y = 1000(A - P)/A

where P = per-share price paid for the stock, and A = average of the high price and low price per share for the day. “The higher Y is, the better the trade is.”

Col (broker)    Y for each of the six trades    Mean
1               12    3    5   -1   12    5       6
2                7   17   13   11    7   17      12
3                8    1    7    4    3    7       5
4               21   10   15   12   20    6      14
5               24   13   14   18   14   19      17

Five brokers were in the study and six trades were randomly assigned to each broker (R = 6).

From the one-way ANOVA: “MSW” = 21.2 (the value used in the calculations that follow). With $\alpha = .05$, FTV = 2.76 (reject equal column MEANS).
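As a check on the ANOVA quantities used in the rest of the example, the sketch below recomputes the column means, MSW, and the F statistic directly from the broker data. The table value 2.76 is taken from the text; the variable names are illustrative.

```python
# Index Y for the six trades of each broker (values from the data table).
trades = {
    1: [12, 3, 5, -1, 12, 5],
    2: [7, 17, 13, 11, 7, 17],
    3: [8, 1, 7, 4, 3, 7],
    4: [21, 10, 15, 12, 20, 6],
    5: [24, 13, 14, 18, 14, 19],
}

means = {b: sum(y) / len(y) for b, y in trades.items()}
n_total = sum(len(y) for y in trades.values())
grand_mean = sum(sum(y) for y in trades.values()) / n_total

# Within-column (error) and between-column sums of squares.
ssw = sum((v - means[b]) ** 2 for b, y in trades.items() for v in y)
ssb = sum(len(y) * (means[b] - grand_mean) ** 2 for b, y in trades.items())

df_between = len(trades) - 1
df_within = n_total - len(trades)
msw = ssw / df_within
f_calc = (ssb / df_between) / msw

F_TABLE_VALUE = 2.76  # F table value for alpha = .05, quoted in the text
print(f"MSW = {msw:.1f}, Fcalc = {f_calc:.2f}, reject H0: {f_calc > F_TABLE_VALUE}")
```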

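For any comparison of 2 columns, the acceptance region (AR) for $\bar{Y}_i - \bar{Y}_j$ is centered at 0, with area $\alpha/2$ in each tail beyond the limits $C_L$ and $C_U$:

$$\text{AR}: \; 0 \pm t_{\alpha/2} \times \sqrt{MSW} \times \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}$$

Here $t_{\alpha/2}$ has $df_W$ degrees of freedom ($n_i = n_j = 6$, here), and MSW is the pooled variance, the estimate for the common variance.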

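In our example, with $\alpha = .05$:

$$0 \pm 2.060 \left( \sqrt{21.2} \times \sqrt{\tfrac{1}{6} + \tfrac{1}{6}} \right) = 0 \pm 5.48$$

This value, 5.48, is called the Least Significant Difference (LSD). When there is the same number of data points, R, in each column,

$$LSD = t_{\alpha/2} \times \sqrt{\frac{2 \times MSW}{R}}$$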
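The LSD arithmetic can be reproduced in a few lines. This sketch takes the critical value 2.060 and MSW = 21.2 from the text as given constants; the helper name `lsd` is illustrative.

```python
import math


def lsd(t_crit: float, msw: float, n_i: int, n_j: int) -> float:
    """Least significant difference for comparing columns i and j."""
    return t_crit * math.sqrt(msw) * math.sqrt(1.0 / n_i + 1.0 / n_j)


T_CRIT = 2.060  # t critical value at alpha/2, as quoted in the text
MSW = 21.2      # pooled variance from the ANOVA
R = 6           # data points per column

lsd_general = lsd(T_CRIT, MSW, R, R)
lsd_equal_r = T_CRIT * math.sqrt(2.0 * MSW / R)  # shortcut for equal column sizes
print(f"LSD = {lsd_general:.2f}")
```

The general form is the one to use when columns have different numbers of observations; with equal R the two expressions agree.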

Underline Diagram

Summarize the comparison results. (p. 443)

Step 1: Now, rank order and compare:

Col:     3    1    2    4    5
Mean:    5    6   12   14   17

Step 2: identify differences > 5.48, and mark accordingly (underlined groups shown in braces):

Col:     {3    1}   {2    4    5}
Mean:     5    6    12   14   17

Step 3: compare the pair of means within each subset:

Comparison    |difference|    vs. LSD
3 vs. 1       *               <
2 vs. 4       *               <
2 vs. 5       5               <
4 vs. 5       *               <

* Contiguous; no need to detail

Conclusion (underlined groups): {3, 1} {2, 4, 5}

Can get “inconsistency”: Suppose col 5 were 18:

Col:     3    1    2    4    5
Mean:    5    6   12   14   18

Now:

Comparison    |difference|    vs. LSD
3 vs. 1       *               <
2 vs. 4       *               <
2 vs. 5       6               >
4 vs. 5       *               <

Conclusion (underlined groups): {3, 1} {2, 4} {4, 5} ???
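The underline diagram can be built mechanically: rank the columns by mean, then underline each maximal run whose range stays below the threshold. The sketch below is one way to do that; `underline_groups` is a hypothetical helper, and the LSD is held at 5.48 in the "col 5 = 18" scenario, as the slides do.

```python
def underline_groups(means: dict[int, float], threshold: float) -> list[list[int]]:
    """Return maximal runs of rank-ordered columns whose range is below threshold."""
    ordered = sorted(means, key=means.get)  # column labels, smallest mean first
    groups: list[list[int]] = []
    last_end = -1
    for start in range(len(ordered)):
        end = start
        # Extend the run while the spread from its first member stays below threshold.
        while (end + 1 < len(ordered)
               and means[ordered[end + 1]] - means[ordered[start]] < threshold):
            end += 1
        if end > last_end:  # skip runs already covered by an earlier underline
            groups.append(ordered[start:end + 1])
            last_end = end
    return groups


LSD = 5.48
original = {1: 6, 2: 12, 3: 5, 4: 14, 5: 17}
hypothetical = {**original, 5: 18}  # "suppose col 5 were 18"

print(underline_groups(original, LSD))
print(underline_groups(hypothetical, LSD))
```

In the hypothetical case column 4 lands in two overlapping groups, which is the "inconsistency" the slide points out.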

Conclusion (with col 5 = 18; underlined groups): {3, 1} {2, 4} {4, 5}
• Brokers 1 and 3 are not significantly different from each other, but they are significantly different from the other 3 brokers.
• Brokers 2 and 4 are not significantly different, and brokers 4 and 5 are not significantly different, but broker 2 is significantly different from (smaller than) broker 5.

Minitab: Stat >> ANOVA >> One-Way ANOVA, then click “Comparisons”.

Fisher's pairwise comparisons (Minitab)

Family error rate = 0.268
Individual error rate = 0.0500
Critical value = 2.060 ($t_{\alpha/2}$; not given in version 16.1)

Intervals for (column level mean) - (row level mean), shown as (lower, upper):

Row    Col 1                 Col 2                 Col 3                 Col 4
2      (-11.476, -0.524)
3      (-4.476, 6.476)       (1.524, 12.476)
4      (-13.476, -2.524)     (-7.476, 3.476)       (-14.476, -3.524)
5      (-16.476, -5.524)     (-10.476, 0.476)      (-17.476, -6.524)     (-8.476, 2.476)

Reading the intervals: row 2, col 1 gives Col 1 < Col 2; row 4, col 2 gives Col 2 = Col 4.
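The Fisher intervals in the Minitab output are each a difference of column means plus or minus the LSD. The sketch below rebuilds them from the means, MSW = 21.2, and the critical value 2.060 given in the text; it illustrates the arithmetic and is not a reproduction of Minitab's own routine.

```python
import math
from itertools import combinations

MEANS = {1: 6.0, 2: 12.0, 3: 5.0, 4: 14.0, 5: 17.0}
half_width = 2.060 * math.sqrt(2 * 21.2 / 6)  # LSD: critical value times standard error

intervals = {}
for col, row in combinations(sorted(MEANS), 2):
    diff = MEANS[col] - MEANS[row]  # (column level mean) - (row level mean)
    intervals[(col, row)] = (diff - half_width, diff + half_width)

for (col, row), (lo, hi) in intervals.items():
    # An interval that excludes 0 indicates a significant difference.
    verdict = "differ" if lo > 0 or hi < 0 else "not significantly different"
    print(f"{col} - {row}: ({lo:8.3f}, {hi:8.3f})  {verdict}")
```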

Minitab Output for Broker Data

Grouping Information Using Fisher Method

broker    N    Mean      Grouping
5         6    17.000    A
4         6    14.000    A
2         6    12.000    A
1         6     6.000    B
3         6     5.000    B

Means that do not share a letter are significantly different.

Pairwise Comparisons

Method 2: (Tukey Test) A procedure that controls the experimentwise error rate is “TUKEY'S HONESTLY SIGNIFICANT DIFFERENCE TEST”.

Tukey's method works in a similar way to Fisher's LSD, except that the “LSD” counterpart (“HSD”) is not

$$t_{\alpha/2} \times \sqrt{MSW} \times \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \quad \left( \text{or, for an equal number of data points per column, } t_{\alpha/2} \times \sqrt{\frac{2 \times MSW}{R}} \right)$$

but

$$tuk_{\alpha/2} \times \sqrt{\frac{2 \times MSW}{R}}$$

where tuk has been computed to take into account all the inter-dependencies of the different comparisons.

$$HSD = tuk_{\alpha/2} \times \sqrt{\frac{2 \times MSW}{R}}$$

A more general approach is to write

$$HSD = q \times \sqrt{\frac{MSW}{R}}, \quad \text{where } q = tuk_{\alpha/2} \times \sqrt{2}$$

--- $q = (\bar{Y}_{largest} - \bar{Y}_{smallest}) / \sqrt{MSW/R}$
--- The probability distribution of q is called the “Studentized Range Distribution”.
--- $q = q(c, df)$, where c = number of columns, and df = df of MSW

With c = 5 and df = 25, from the table (or Minitab): q = 4.15, so tuk = 4.15/1.414 = 2.93. Then,

$$HSD = 4.15 \times \sqrt{21.2/6} = 7.80$$

also,

$$HSD = 2.93 \times \sqrt{2 \times 21.2/6} \approx 7.80$$
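The two equivalent forms of the HSD can be confirmed with a short calculation. The value q = 4.15 is the table value quoted above and is treated here as a given constant; nothing is looked up from a distribution.

```python
import math

Q_CRIT = 4.15  # studentized range value q(c = 5, df = 25), from the text
MSW = 21.2
R = 6

tuk = Q_CRIT / math.sqrt(2)
hsd_from_q = Q_CRIT * math.sqrt(MSW / R)
hsd_from_tuk = tuk * math.sqrt(2 * MSW / R)
print(f"tuk = {tuk:.2f}, HSD = {hsd_from_q:.2f}")
```

The two forms agree exactly when tuk is not rounded; rounding tuk to 2.93 before multiplying shifts the result slightly in the second decimal.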

In our earlier example, rank order:

Col:     3    1    2    4    5
Mean:    5    6   12   14   17

(No differences [contiguous] > 7.80)

Comparison    |difference|    >or< 7.80
3 vs. 1       *               <   (contiguous)
3 vs. 2       7               <
3 vs. 4       9               >
3 vs. 5       12              >
1 vs. 2       *               <
1 vs. 4       8               >
1 vs. 5       11              >
2 vs. 4       *               <
2 vs. 5       5               <
4 vs. 5       *               <

Conclusion (underlined groups): {3, 1, 2} {2, 4, 5}

2 is “same as 1 and 3, but also same as 4 and 5.”
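The comparison table above can be generated by looping over all pairs of rank-ordered columns. This sketch uses HSD = 7.80 from the text; the variable names are illustrative.

```python
from itertools import combinations

MEANS = {1: 6, 2: 12, 3: 5, 4: 14, 5: 17}
HSD = 7.80

ranked = sorted(MEANS, key=MEANS.get)  # columns ordered by mean, smallest first
significant = []
for a, b in combinations(ranked, 2):
    diff = abs(MEANS[a] - MEANS[b])
    if diff > HSD:
        significant.append((a, b))
    print(f"{a} vs. {b}: |difference| = {diff:2d} {'>' if diff > HSD else '<'} {HSD}")
```

Only the pairs that join columns 3 or 1 with columns 4 or 5 exceed the HSD, which is why column 2 sits in both underlined groups.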

Minitab: Stat >> ANOVA >> One-Way ANOVA, then click “Comparisons”.

Tukey's pairwise comparisons (Minitab)

Family error rate = 0.0500
Individual error rate = 0.00706
Critical value = 4.15 ($q_\alpha$; not given in version 16.1)

Intervals for (column level mean) - (row level mean), shown as (lower, upper):

Row    Col 1                 Col 2                 Col 3                 Col 4
2      (-13.801, 1.801)
3      (-6.801, 8.801)       (-0.801, 14.801)
4      (-15.801, -0.199)     (-9.801, 5.801)       (-16.801, -1.199)
5      (-18.801, -3.199)     (-12.801, 2.801)      (-19.801, -4.199)     (-10.801, 4.801)

Minitab Output for Broker Data

Grouping Information Using Tukey Method

broker    N    Mean      Grouping
5         6    17.000    A
4         6    14.000    A
2         6    12.000    A B
1         6     6.000      B
3         6     5.000      B

Means that do not share a letter are significantly different.

Special Multiple Comparisons

Method 3: Dunnett's test

Designed specifically for (and incorporating the interdependencies of) comparing several “treatments” to a “control.”

Example (R = 6):

Col:     1 (CONTROL)    2     3     4     5
Mean:    6              12    5     14    17

Analog of LSD ($= t_{\alpha/2} \times \sqrt{2 \times MSW / R}$):

$$D = Dut_{\alpha/2} \times \sqrt{\frac{2 \times MSW}{R}}$$

where $Dut_{\alpha/2}$ comes from a table or Minitab.

$$D = Dut_{\alpha/2} \times \sqrt{\frac{2 \times MSW}{R}} = 2.61 \times \sqrt{\frac{2(21.2)}{6}} = 6.94$$

In our example:

Col:     1 (CONTROL)    2     3     4     5
Mean:    6              12    5     14    17

Comparison    |difference|    >or< 6.94
1 vs. 2       6               <
1 vs. 3       1               <
1 vs. 4       8               >
1 vs. 5       11              >

- Cols 4 and 5 differ from the control [1].
- Cols 2 and 3 are not significantly different from the control.
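The Dunnett comparisons follow the same pattern as the LSD, with every treatment compared only to the control. The sketch below takes the critical value 2.61 from the text as a constant; it does not compute Dunnett's distribution, and the names are illustrative.

```python
import math

MEANS = {1: 6.0, 2: 12.0, 3: 5.0, 4: 14.0, 5: 17.0}
CONTROL = 1
DUT_CRIT = 2.61  # Dunnett critical value quoted in the text
MSW, R = 21.2, 6

d = DUT_CRIT * math.sqrt(2 * MSW / R)

differs_from_control = {}
for col, mean in MEANS.items():
    if col == CONTROL:
        continue
    diff = mean - MEANS[CONTROL]  # treatment mean minus control mean
    differs_from_control[col] = abs(diff) > d
    print(f"col {col}: diff = {diff:5.1f}, interval ({diff - d:.2f}, {diff + d:.2f})")
```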

Minitab: Stat >> ANOVA >> General Linear Model, then click “Comparisons”.

Dunnett's comparisons with a control (Minitab)

Family error rate = 0.0500 (controlled!!)
Individual error rate = 0.0152
Critical value = 2.61 ($Dut_{\alpha/2}$)
Control = level (1) of broker

Intervals for treatment mean minus control mean

Level    Lower     Center    Upper
2        -0.930     6.000    12.930
3        -7.930    -1.000     5.930
4         1.070     8.000    14.930
5         4.070    11.000    17.930

What Method Should We Use?
• The Fisher procedure can be used only after the F-test in the ANOVA is significant at 5%.
• Otherwise, use the Tukey procedure. Note that, to avoid being too conservative, the significance level of the Tukey test can be set higher (10%), especially when the number of levels is large.

Contrast

Example 1

Col:    1          2                3                4
        Placebo    Sulfa Type S1    Sulfa Type S2    Antibiotic Type A

Suppose the questions of interest are
(1) Placebo vs. Non-placebo
(2) S1 vs. S2
(3) (Average) S vs. A

In general, a question of interest can be expressed by a linear combination of column means such as

$$Z = \sum_j a_j \bar{Y}_{.j}$$

with the restriction that $\sum_j a_j = 0$. Such linear combinations are called contrasts.

Test if a contrast has mean 0

The sum of squares for contrast $Z = \sum_j a_j \bar{Y}_{.j}$ is

$$SSC = \frac{R \times Z^2}{\sum_j a_j^2}$$

where R is the number of rows (replicates). The test statistic $F_{calc} = SSC/MSW$ is distributed as F with 1 and (df of error) degrees of freedom. Reject $H_0: E[Z] = 0$ if the observed $F_{calc}$ is too large (say, $> F_{0.05}(1, \text{df of error})$ at the 5% significance level).

Example 1 (cont.): $a_j$'s for the 3 contrasts

                         P     S1    S2    A
Col:                     1     2     3     4
P vs. non-P (C1):       -3     1     1     1
S1 vs. S2 (C2):          0    -1     1     0
S vs. A (C3):            0    -1    -1     2

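Calculating (top row: Placebo vs. drugs; middle row: S1 vs. S2; bottom row: Average S vs. A)

Column means: $\bar{Y}_{.1}$ = 5 (P), $\bar{Y}_{.2}$ = 6 (S1), $\bar{Y}_{.3}$ = 7 (S2), $\bar{Y}_{.4}$ = 10 (A)

Contrast             a1    a2    a3    a4    Z^2 / (sum of a_j^2)    SSC (= 8 x previous column)
Placebo vs. drugs    -3     1     1     1     5.33                   42.64
S1 vs. S2             0    -1     1     0     0.50                    4.00
Average S vs. A       0    -1    -1     2     8.17                   65.36
Total                                        14.00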

Tests for Contrasts

Source    SSQ      df    MSQ      F
C1        42.64     1    42.64     8.53
C2         4.00     1     4.00      .80
C3        65.36     1    65.36    13.07
Error     140      28     5

$F_{1-.05}(1, 28) = 4.20$
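The contrast tests can be recomputed from the four column means. In this sketch R = 8 is an assumption inferred from the error df of 28 with 4 columns (and from 42.64 / 5.33 = 8); `contrast_test` is a hypothetical helper name, and 4.20 is the table value from the text.

```python
MEANS = [5.0, 6.0, 7.0, 10.0]  # column means for P, S1, S2, A
R = 8            # replicates per column (assumed: 4 * (R - 1) = 28 error df)
MSW = 140 / 28   # error SSQ divided by error df
F_CRIT = 4.20    # F table value for (1, 28) df at the 5% level, from the text

CONTRASTS = {
    "C1": [-3, 1, 1, 1],   # placebo vs. non-placebo
    "C2": [0, -1, 1, 0],   # S1 vs. S2
    "C3": [0, -1, -1, 2],  # average S vs. A
}


def contrast_test(coeffs, means, r, msw):
    """Return (SSC, Fcalc) for one contrast; coefficients must sum to zero."""
    if sum(coeffs) != 0:
        raise ValueError("contrast coefficients must sum to zero")
    z = sum(a * m for a, m in zip(coeffs, means))
    ssc = r * z ** 2 / sum(a * a for a in coeffs)
    return ssc, ssc / msw


results = {name: contrast_test(c, MEANS, R, MSW) for name, c in CONTRASTS.items()}
for name, (ssc, f_calc) in results.items():
    print(f"{name}: SSC = {ssc:.2f}, F = {f_calc:.2f}, significant: {f_calc > F_CRIT}")
```

The unrounded SSC values differ from the slide's in the second decimal because the slide rounds the intermediate column before multiplying by R; the F statistics agree to two decimals.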

Example 1 (cont.): Conclusions
• The mean response for Placebo is significantly different from that for Non-placebo.
• There is no significant difference between using Types S1 and S2.
• Using Type A is significantly different from using Type S on average.