Topic 29: Three-Way ANOVA

Outline
• Three-way ANOVA
  – Data
  – Model
  – Inference

Data for three-way ANOVA
• Y, the response variable
• Factor A with levels i = 1 to a
• Factor B with levels j = 1 to b
• Factor C with levels k = 1 to c
• $Y_{ijkl}$ is the lth observation in cell (i, j, k), l = 1 to $n_{ijk}$
• A balanced design has $n_{ijk} = n$

KNNL Example
• KNNL p 1005
• Y is exercise tolerance, minutes until fatigue on a bicycle test
• A is gender, a = 2 levels: male, female
• B is percent body fat, b = 2 levels: high, low
• C is smoking history, c = 2 levels: light, heavy
• n = 3 persons aged 25-35 per (i, j, k) cell

Read and check the data
    data a1;
      infile 'c:...CH24TA04.txt';
      input extol gender fat smoke;
    proc print data=a1;
    run;
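As a quick check that the design is balanced, the cell counts can be tabulated. This is only a sketch and is not part of the original program; it assumes the data set a1 and the variable names read in above.

```sas
/* Count observations per (gender, fat, smoke) cell.
   A balanced design shows the same frequency in every row. */
proc freq data=a1;
  tables gender*fat*smoke / list;
run;
```

For the KNNL example, each of the 8 cells should show a frequency of 3.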

Obs  extol  gender  fat  smoke
  1   24.1       1    1      1
  2   29.2       1    1      1
  3   24.6       1    1      1
  4   20.0       2    1      1
  5   21.9       2    1      1
  6   17.6       2    1      1
  7   14.6       1    2      1
  8   15.3       1    2      1
  9   12.3       1    2      1
 10   16.1       2    2      1
 11    9.3       2    2      1
 12   10.8       2    2      1
 13   17.6       1    1      2
...
 24    6.1       2    2      2

Define variable for a plot
    data a1; set a1;
      if (gender eq 1)*(fat eq 1)*(smoke eq 1) then gfs='1_Mfs';
      if (gender eq 1)*(fat eq 2)*(smoke eq 1) then gfs='2_MFs';
      if (gender eq 1)*(fat eq 1)*(smoke eq 2) then gfs='3_MfS';
      if (gender eq 1)*(fat eq 2)*(smoke eq 2) then gfs='4_MFS';
      if (gender eq 2)*(fat eq 1)*(smoke eq 1) then gfs='5_Ffs';
      if (gender eq 2)*(fat eq 2)*(smoke eq 1) then gfs='6_FFs';
      if (gender eq 2)*(fat eq 1)*(smoke eq 2) then gfs='7_FfS';
      if (gender eq 2)*(fat eq 2)*(smoke eq 2) then gfs='8_FFS';
    run;

Obs  extol  gender  fat  smoke  gfs
  1   24.1       1    1      1  1_Mfs
  2   29.2       1    1      1  1_Mfs
  3   24.6       1    1      1  1_Mfs
  4   17.6       1    1      2  3_MfS
  5   18.8       1    1      2  3_MfS
  6   23.2       1    1      2  3_MfS
  7   14.6       1    2      1  2_MFs
  8   15.3       1    2      1  2_MFs
  9   12.3       1    2      1  2_MFs
 10   14.9       1    2      2  4_MFS
 11   20.4       1    2      2  4_MFS
 12   12.8       1    2      2  4_MFS

Plot the data
    title1 'Plot of the data';
    symbol1 v=circle i=none c=black;
    proc gplot data=a1;
      plot extol*gfs/frame;
    run;

Find the means
    proc sort data=a1; by gender fat smoke;
    proc means data=a1;
      output out=a2 mean=avextol;
      by gender fat smoke;
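The same cell means can also be obtained without sorting first. The following is a sketch of an alternative, not the program used in the slides; it assumes the data set a1 and writes the same output data set a2 with the variable avextol.

```sas
/* Cell means without a prior sort: CLASS replaces BY, and NWAY keeps
   only the full gender*fat*smoke combinations in the output data set. */
proc means data=a1 nway noprint;
  class gender fat smoke;
  var extol;
  output out=a2 mean=avextol;
run;
```

The NOPRINT option suppresses the printed summary, so only the data set a2 is produced.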

Define fat*smoke
    data a2; set a2;
      if (fat eq 1)*(smoke eq 1) then fs='1_fs';
      if (fat eq 1)*(smoke eq 2) then fs='2_fS';
      if (fat eq 2)*(smoke eq 1) then fs='3_Fs';
      if (fat eq 2)*(smoke eq 2) then fs='4_FS';

Obs  gen  fat  smoke  FR  avextol  fs
  1    1    1      1   3    25.97  1_fs
  2    2    1      1   3    19.83  1_fs
  3    1    1      2   3    19.87  2_fS
  4    2    1      2   3    12.13  2_fS
  5    1    2      1   3    14.07  3_Fs
  6    2    2      1   3    12.07  3_Fs
  7    1    2      2   3    16.03  4_FS
  8    2    2      2   3    10.20  4_FS

Plot the means
    proc sort data=a2; by fs;
    title1 'Plot of the means';
    symbol1 v='M' i=join c=black;
    symbol2 v='F' i=join c=black;
    proc gplot data=a2;
      plot avextol*fs=gender/frame;
    run;

Cell means model
• $Y_{ijkl} = \mu_{ijk} + \varepsilon_{ijkl}$
  – where $\mu_{ijk}$ is the theoretical mean or expected value of all observations in cell (i, j, k)
  – the $\varepsilon_{ijkl}$ are iid $N(0, \sigma^2)$
  – $Y_{ijkl} \sim N(\mu_{ijk}, \sigma^2)$, independent

Estimates
• Estimate $\mu_{ijk}$ by the mean of the observations in cell (i, j, k): $\hat{\mu}_{ijk} = \bar{Y}_{ijk\cdot} = (\sum_l Y_{ijkl})/n_{ijk}$
• For each (i, j, k) combination, we can get an estimate of the variance, $s_{ijk}^2$
• We need to combine these to get an estimate of $\sigma^2$
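As a worked illustration, take the first cell of the KNNL example (gender 1, fat 1, smoke 1), whose three observations are 24.1, 29.2 and 24.6. The variance formula below is the usual sample variance, which is assumed here since the slide does not spell it out.

```latex
\begin{aligned}
\bar{Y}_{111\cdot} &= \frac{24.1 + 29.2 + 24.6}{3} = \frac{77.9}{3} \approx 25.97 \\
s_{111}^2 &= \frac{\sum_{l}\left(Y_{111l} - \bar{Y}_{111\cdot}\right)^2}{n_{111} - 1}
           \approx \frac{(-1.87)^2 + (3.23)^2 + (-1.37)^2}{2}
           \approx \frac{15.81}{2} \approx 7.90
\end{aligned}
```

The value 25.97 matches the avextol reported for this cell in the table of means.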

Pooled estimate of $\sigma^2$
• We pool the $s_{ijk}^2$, giving weights proportional to the df, $n_{ijk} - 1$
• The pooled estimate is $MSE = s^2 = \left(\sum (n_{ijk} - 1)\, s_{ijk}^2\right) / \left(\sum (n_{ijk} - 1)\right)$
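In a balanced design the weights are all equal, so the pooled estimate reduces to a simple average of the cell variances. The short derivation below is a sketch for the balanced case only, using the sizes of the KNNL example (a = b = c = 2, n = 3).

```latex
\begin{aligned}
MSE &= \frac{\sum_{i,j,k} (n - 1)\, s_{ijk}^2}{abc\,(n - 1)}
     = \frac{1}{abc} \sum_{i,j,k} s_{ijk}^2 \\
df_{E} &= abc\,(n - 1) = 2 \cdot 2 \cdot 2 \cdot (3 - 1) = 16
\end{aligned}
```

This agrees with the Error DF of 16 reported in the ANOVA table for this example.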

Factor effects model
• Model the cell mean as $\mu_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk}$
• $\mu$ is the overall mean
• $\alpha_i$, $\beta_j$, $\gamma_k$ are the main effects of A, B, and C
• $(\alpha\beta)_{ij}$, $(\alpha\gamma)_{ik}$, and $(\beta\gamma)_{jk}$ are the two-way interactions (first-order interactions)
• $(\alpha\beta\gamma)_{ijk}$ is the three-way interaction (second-order interaction)
• Extensions of the usual constraints apply
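Assuming "the usual constraints" means the sum-to-zero constraints used for the two-way model, their extension to three factors can be sketched as follows: every effect sums to zero over each of its subscripts.

```latex
\begin{aligned}
&\sum_i \alpha_i = \sum_j \beta_j = \sum_k \gamma_k = 0 \\
&\sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0, \quad
 \sum_i (\alpha\gamma)_{ik} = \sum_k (\alpha\gamma)_{ik} = 0, \quad
 \sum_j (\beta\gamma)_{jk} = \sum_k (\beta\gamma)_{jk} = 0 \\
&\sum_i (\alpha\beta\gamma)_{ijk} = \sum_j (\alpha\beta\gamma)_{ijk} = \sum_k (\alpha\beta\gamma)_{ijk} = 0
\end{aligned}
```

Each of the two-way and three-way constraints holds for all values of the remaining subscripts.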

ANOVA table
• Sources of model variation are the three main effects, the three two-way interactions, and the one three-way interaction
• With balanced data the SS and DF add to the model SS and DF
• Still have Model + Error = Total
• Each effect is tested by an F statistic with MSE in the denominator
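The degrees of freedom for a balanced design can be written out as a check on the decomposition. This is a sketch using the standard df for each effect, with the KNNL example sizes (a = b = c = 2, n = 3) substituted at the end.

```latex
\begin{aligned}
df_{\text{Model}} &= (a-1) + (b-1) + (c-1) + (a-1)(b-1) + (a-1)(c-1) + (b-1)(c-1) \\
                  &\quad + (a-1)(b-1)(c-1) = abc - 1 \\
df_{\text{Error}} &= abc\,(n-1), \qquad df_{\text{Total}} = abcn - 1 \\
\text{Example:}\quad & 7 + 16 = 23
\end{aligned}
```

In the example every effect has 1 df, so the seven effects account for the 7 model df.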

Run
    proc glm data=a1;
      class gender fat smoke;
      model extol=gender fat smoke gender*fat gender*smoke fat*smoke gender*fat*smoke;
      means gender*fat*smoke;
    run;

Run
    proc glm data=a1;
      class gender fat smoke;
      model extol=gender|fat|smoke;
      means gender*fat*smoke;
    run;
Shorthand way to express the model

SAS Parameter Estimates
• The solution option on the model statement gives parameter estimates for the glm parameterization
• These are as we have seen before; any main effect or interaction with a subscript of a, b, or c is zero
• These reproduce the cell means in the usual way
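A minimal sketch of requesting these estimates, assuming the same data set a1 and model as in the run above:

```sas
/* SOLUTION prints the parameter estimates under the GLM
   (last-level-set-to-zero) parameterization. */
proc glm data=a1;
  class gender fat smoke;
  model extol=gender|fat|smoke / solution;
run;
```

Under this parameterization the intercept estimates the mean of the cell in which every factor is at its last level, here gender 2, fat 2, smoke 2.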

ANOVA Table
Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             7      588.582917   84.0832738     9.01  0.0002
Error            16      149.366667    9.3354167
Corrected Total  23      737.949583
Type I and III SS the same here

Factor effects output
Source            DF    Type I SS  Mean Square  F Value  Pr > F
gender             1  176.5837500  176.5837500    18.92  0.0005
fat                1  242.57042    242.5704167    25.98  0.0001
gender*fat         1   13.650417    13.6504167     1.46  0.2441
smoke              1   70.3837500   70.3837500     7.54  0.0144
gender*smoke       1   11.0704167   11.0704167     1.19  0.2923
fat*smoke          1   72.4537500   72.4537500     7.76  0.0132
gender*fat*smoke   1    1.8704167    1.8704167     0.20  0.6604
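Each F value in this output is the effect mean square divided by the MSE from the full model. As an illustrative check (not part of the original program), the gender test can be recomputed in a data step from the printed values:

```sas
/* Recompute the gender F statistic and its p-value from the printed output. */
data _null_;
  mse = 9.3354167;            /* error mean square, 16 df  */
  ms_gender = 176.58375;      /* gender mean square, 1 df  */
  f = ms_gender / mse;        /* F statistic               */
  p = 1 - probf(f, 1, 16);    /* upper-tail F probability  */
  put f= p=;
run;
```

The F value written to the log should be close to the 18.92 shown in the table.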

Analytical Strategy
• First examine interactions, from highest order to lowest order
• Some options when one or more interactions are significant
  – Interpret the plot of means
  – Run analyses for each level of one factor, e.g. run A*B by C (lsmeans with slice option)
  – Run as a one-way with abc levels
  – Define a composite factor by combining two factors, e.g. AB with ab levels
  – Use contrasts
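The slice option mentioned above might be used as in the following sketch. The choice of effect and slicing factor is an assumption made for illustration, picked because fat*smoke is the significant interaction in this example.

```sas
/* Test the fat effect separately within each level of smoke. */
proc glm data=a1;
  class gender fat smoke;
  model extol=gender|fat|smoke;
  lsmeans fat*smoke / slice=smoke;
run;
```

Each slice is an F test of the fat*smoke least squares means within one smoke level, using the MSE from the full model.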

Analytical Strategy
• Some options when no interactions are significant
  – Use a multiple comparison procedure for the main effects
  – Use contrasts
  – When needed, rerun without the interactions
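For a data set where no interaction is significant, a rerun with main effects only and Tukey-adjusted comparisons could look like the sketch below. It reuses the variable names from this example purely for illustration; in the exercise tolerance data the fat*smoke interaction is significant, so this is not the analysis used in the slides.

```sas
/* Main-effects-only model with Tukey-adjusted pairwise comparisons. */
proc glm data=a1;
  class gender fat smoke;
  model extol=gender fat smoke;
  lsmeans gender fat smoke / pdiff adjust=tukey cl;
run;
```

Dropping the interaction terms moves their SS and DF into the error term, so the MSE and its df change.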

Example Interpretation
• Since there appears to be a fat by smoke interaction, let’s run a two-way ANOVA (no interaction) using the fat*smoke variable
• Note that we could also use the interaction plot to describe the interaction
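The fat*smoke variable fs was defined on the data set of means, a2, while the two-way analysis reads a1. Presumably the same recoding was also applied to a1; the step below is an assumed sketch of that recoding and is not shown in the slides.

```sas
/* Assumed step: the same fat*smoke recoding, applied to a1 instead of a2. */
data a1; set a1;
  if (fat eq 1)*(smoke eq 1) then fs='1_fs';
  if (fat eq 1)*(smoke eq 2) then fs='2_fS';
  if (fat eq 2)*(smoke eq 1) then fs='3_Fs';
  if (fat eq 2)*(smoke eq 2) then fs='4_FS';
run;
```

With fs available on a1, gender and fs can both be listed on the class statement of proc glm.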

Run glm
    proc glm data=a1;
      class gender fs;
      model extol=gender fs;
      means gender fs/tukey;
    run;

ANOVA Table
Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             4       561.99167  140.4979167    15.17  <.0001
Error            19       175.95792    9.2609430
Corrected Total  23       737.94958

Factor effects output
Source  DF    Type I SS  Mean Square  F Value  Pr > F
gender   1  176.5837500  176.5837500    19.07  0.0003
fs       3  385.40792    128.4693056    13.87  <.0001
Both are significant as expected…compare means

Means for gender
Grouping    Mean   N  gender
A         18.983  12  1
B         13.558  12  2

Tukey comparisons for fs
Grouping    Mean  N  fs
A         22.900  6  1_fs
B         16.000  6  2_fS
B         13.117  6  4_FS
B         13.067  6  3_Fs

Conclusions
• There is a gender difference, with males having a roughly 5.5 minute higher exercise tolerance
  – beneficial to add CI here
• There was a smoking history by body fat level interaction: those who had low body fat and a light smoking history had a significantly higher exercise tolerance than the other three groups
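One way to obtain the confidence interval for the gender difference is sketched below. It assumes the two-way model with gender and fs fitted to a1, with fs defined on that data set; the lsmeans statement is an addition and does not appear in the original program.

```sas
/* Confidence limits for the gender LS-means and for their difference. */
proc glm data=a1;
  class gender fs;
  model extol=gender fs;
  lsmeans gender / pdiff cl;
run;
```

With 12 observations per gender, the half-width of the 95% interval is t(0.975; 19) times sqrt(MSE(1/12 + 1/12)), where MSE is the error mean square from this two-way model.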

Last slide
• Read NKNW Chapter 24
• We used program topic29.sas to generate the output for today