N-way ANOVA

3-way ANOVA

3-way ANOVA
H0: The mean respiratory rate is the same for all species
H0: The mean respiratory rate is the same at all temperatures
H0: The mean respiratory rate is the same for both sexes
H0: There is no interaction between species and temperature across both sexes
H0: There is no interaction between species and sex across temperatures
H0: There is no interaction between sex and temperature across all species
H0: There is no interaction among species, temperature, and sex

3-way ANOVA Latin Square

Multiple and non-linear regression

What is what?
• Regression: One variable is considered dependent on the other(s)
• Correlation: No variable is considered dependent on the other(s)
• Multiple regression: More than one independent variable
• Linear regression: The dependent variable is scalar and linearly dependent on the independent factor(s)
• Logistic regression: The dependent variable is categorical (hopefully only two levels) and follows an s-shaped relation

Remember the simple linear regression?
If Y is linearly dependent on X, simple linear regression is used:

Y = a + bX

a is the intercept, the value of Y when X = 0
b is the slope, the rate at which Y increases when X increases
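As a sketch of how a and b are found (not part of the slides; the data below are made up for illustration), the least-squares estimates can be computed directly:

```python
# Least-squares estimates for Y = a + bX (illustrative, made-up data).
def simple_linear_regression(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # b = sum of cross-products / sum of squared deviations of X
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x  # the fitted line passes through (mean_x, mean_y)
    return a, b

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = simple_linear_regression(x, y)
print(round(a, 2), round(b, 2))  # 0.05 1.99
```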

Is the relation linear?

Multiple linear regression
If Y is linearly dependent on more than one independent variable:

Y = a + b1X1 + b2X2

a is the intercept, the value of Y when X1 and X2 = 0
b1 and b2 are termed partial regression coefficients
b1 expresses the change of Y for one unit of X1 when X2 is kept constant

Multiple linear regression – residual error and estimations
As the collected data are not expected to fall exactly in a plane, an error term must be added:

Y = a + b1X1 + b2X2 + e

The error terms sum to zero. The dependent factor is estimated from the fitted parameters:

Ŷ = a + b1X1 + b2X2

Multiple linear regression – general equations
In general, a finite number (m) of independent variables may be used to estimate the hyperplane:

Y = a + b1X1 + b2X2 + … + bmXm

The number of sample points must be at least two more than the number of variables (n ≥ m + 2).

Multiple linear regression – least sum of squares
The principle of the least sum of squares is usually used to perform the fit: the parameters are chosen to minimize

Σ(Yi − Ŷi)²
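The least-squares fit can be sketched with NumPy's `lstsq` (the data and the "true" coefficients 2.0, 0.5 and -1.2 below are invented for this sketch):

```python
import numpy as np

# Least-squares fit of Y = a + b1*X1 + b2*X2 on made-up data.
rng = np.random.default_rng(0)
n = 30
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
y = 2.0 + 0.5 * x1 - 1.2 * x2 + rng.normal(0.0, 0.1, n)  # small noise term

X = np.column_stack([np.ones(n), x1, x2])  # design matrix: intercept, X1, X2
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef
print(np.round(coef, 2))  # close to [2.0, 0.5, -1.2]
```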

Multiple linear regression – An example

Multiple linear regression – The fitted equation

Multiple linear regression – Are any of the coefficients significant?

F = regression MS / residual MS

Multiple linear regression – Is it a good fit?

R² = 1 − residual SS / total SS

• R² expresses how much of the variation can be described by the model
• When comparing models with different numbers of variables, the adjusted R² should be used:

Ra² = 1 − residual MS / total MS

The multiple regression coefficient: R = √(R²)
The standard error of the estimate = √(residual MS)
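The two fit measures can be computed directly from observed and fitted values (the data here are made up for illustration):

```python
import numpy as np

# R^2 and adjusted R^2 from observed and fitted values (illustrative data).
def r_squared(y, y_hat, m):
    """m is the number of independent variables in the model."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    n = len(y)
    residual_ss = np.sum((y - y_hat) ** 2)
    total_ss = np.sum((y - y.mean()) ** 2)
    r2 = 1 - residual_ss / total_ss  # 1 - residual SS / total SS
    # Adjusted R^2 penalizes extra variables via the mean squares:
    r2_adj = 1 - (residual_ss / (n - m - 1)) / (total_ss / (n - 1))
    return r2, r2_adj

r2, r2_adj = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.0, 4.0], m=1)
print(round(r2, 3), round(r2_adj, 3))  # 0.996 0.994
```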

Multiple linear regression – Which of the coefficients are significant?
• sbi is the standard error of the regression parameter bi
• a t-test checks whether bi differs from 0: t = bi / sbi
• the degrees of freedom ν are the residual DF
• p values can be found in a table
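A minimal numeric illustration of the test statistic (the values of bi and sbi below are made up):

```python
# t-statistic for one partial regression coefficient: t = b_i / s_bi.
b_i = 0.85   # fitted coefficient (made-up value)
s_bi = 0.20  # its standard error (made-up value)
t = b_i / s_bi
print(round(t, 2))  # 4.25: compare against a t-table at the residual DF
```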

Multiple linear regression – Which of the coefficients are most important?
• The standardized regression coefficient b′ is a normalized version of b
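The slide does not spell out the normalization; one common definition (assumed here) is b′ = b · sX / sY, the change in Y, in standard-deviation units, per standard deviation of X:

```python
import statistics

# b' = b * s_X / s_Y (assumed normalization; not given explicitly on the slide).
def standardized_coefficient(b, x, y):
    return b * statistics.stdev(x) / statistics.stdev(y)

# If Y = 2X exactly, the slope b = 2 standardizes to b' = 1.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
b_std = standardized_coefficient(2.0, x, y)
print(round(b_std, 6))  # 1.0
```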

Multiple linear regression – multicollinearity
• If two factors are well correlated, the estimated b's become inaccurate
• This is also called collinearity, intercorrelation, non-orthogonality, or ill-conditioning
• Tolerance or variance inflation factors can be computed to detect it
• Extreme correlation is called singularity, and one of the correlated variables must be removed

Multiple linear regression – Pairwise correlation coefficients

Multiple linear regression – Assumptions
The same as for simple linear regression:
1. The Y's are randomly sampled
2. The residuals are normally distributed
3. The residuals have equal variance
4. The X's are fixed factors (their errors are small)
5. The X's are not perfectly correlated

Logistic regression

Logistic Regression
• What if the dependent variable is categorical, especially binary?
• Should we use some interpolation method?
• Linear regression cannot help us.

The sigmoid curve

The sigmoid curve
• The intercept basically just 'scales' the input variable
• Large regression coefficient → risk factor strongly influences the probability
• Positive regression coefficient → risk factor increases the probability
• Logistic regression uses maximum likelihood estimation, not least squares estimation
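The behaviour described in these bullets can be sketched in a few lines of Python:

```python
import math

# The logistic (s-shaped) curve: p = 1 / (1 + exp(-(a + b*x))).
def logistic(x, a, b):
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

p_mid = logistic(0, 0, 1)    # at a + b*x = 0 the probability is exactly 0.5
steep = logistic(1, 0, 5)    # large b: the curve rises sharply ...
shallow = logistic(1, 0, 1)  # ... compared with a small b
print(p_mid, steep > shallow)  # 0.5 True
```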

Does age influence the diagnosis? Continuous independent variable

Variables in the Equation
                       B       S.E.    Wald     df  Sig.   Exp(B)  95% C.I. for EXP(B)
                                                                   Lower    Upper
Step 1a  Age           0.109   0.010   108.745  1   0.000  1.115   1.092    1.138
         Constant     -4.213   0.423    99.097  1   0.000  0.015
a. Variable(s) entered on step 1: Age.

Does previous intake of OCP influence the diagnosis? Categorical independent variable

Variables in the Equation
                       B       S.E.    Wald   df  Sig.   Exp(B)  95% C.I. for EXP(B)
                                                                 Lower    Upper
Step 1a  OCP(1)       -0.311   0.180   2.979  1   0.084  0.733   0.515    1.043
         Constant      0.233   0.123   3.583  1   0.058  1.263
a. Variable(s) entered on step 1: OCP.

Odds ratio
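The Exp(B) column in the SPSS tables is the odds ratio e^B, the multiplicative change in the odds per unit increase of the predictor. A quick check against the B values in the tables above:

```python
import math

# Odds ratio = Exp(B) = e^B.
or_age = math.exp(0.109)   # B for Age from the table above
or_ocp = math.exp(-0.311)  # B for OCP(1): negative B gives OR < 1
print(round(or_age, 3), round(or_ocp, 3))  # 1.115 0.733, matching Exp(B)
```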

Multiple logistic regression

Variables in the Equation
                       B       S.E.    Wald     df  Sig.   Exp(B)  95% C.I. for EXP(B)
                                                                   Lower   Upper
Step 1a  Age           0.123   0.011   115.343  1   0.000  1.131   1.106   1.157
         BMI           0.083   0.019    18.732  1   0.000  1.087   1.046   1.128
         OCP           0.528   0.219     5.808  1   0.016  1.695   1.104   2.603
         Constant     -6.974   0.762    83.777  1   0.000  0.001
a. Variable(s) entered on step 1: Age, BMI, OCP.

Predicting the diagnosis by logistic regression
What is the probability that the tumor of a 50-year-old woman who has been using OCP and has a BMI of 26 is malignant?

Using the coefficients from the multiple logistic regression table (B for OCP = 0.528):

z = -6.974 + 0.123·50 + 0.083·26 + 0.528·1 = 1.862
p = 1/(1 + e^(-1.862)) ≈ 0.866
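The same prediction, computed with the coefficients from the table (note the table's B for OCP is 0.528):

```python
import math

# Probability of malignancy for age 50, BMI 26, OCP use = 1,
# using the fitted coefficients from the multiple logistic regression table.
z = -6.974 + 0.123 * 50 + 0.083 * 26 + 0.528 * 1
p = 1.0 / (1.0 + math.exp(-z))
print(round(z, 3), round(p, 3))  # 1.862 0.866
```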

Exercises 20.1, 20.2

Exercises 14.1, 14.2