Multinomial Logistic Regression David F. Staples
Outline
• Review of Logistic Regression
• BCS Example
• Extension to Multiple Response Groups
• Nominal Categories
• Ordinal Categories
• Model Fitting & Interpretation
• Shallow Lake Trophic Status
Logistic Regression
Based on a Binomial Random Variable, Y = {0, 1}:
➢ Prob(Y = 1) = p
➢ Prob(Y = 0) = 1 - p
$p(x) = P(Y_i = 1 \mid X_i) = \dfrac{e^{X\beta}}{1 + e^{X\beta}}$, where $X\beta = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$.
A logit transformation is used to linearize p(x):
$\log\left(\dfrac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k = X\beta$
The left-hand side is the Log Odds of 'Success'.
→ The β's give the additive effect of the X's on the Log Odds
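As a minimal sketch of the two transformations on this slide, the logit and its inverse can be written in base R. The helper names `logit` and `inv_logit` and the values in `eta` are illustrative assumptions, not part of the original slides; `qlogis()` and `plogis()` are the built-in equivalents.

```r
# Logit and inverse logit, written out to mirror the slide's formulas.
logit     <- function(p)   log(p / (1 - p))         # built-in equivalent: qlogis(p)
inv_logit <- function(eta) exp(eta) / (1 + exp(eta)) # built-in equivalent: plogis(eta)

eta <- c(-2, 0, 2)      # hypothetical values of the linear predictor X*beta
p   <- inv_logit(eta)   # corresponding probabilities, all strictly between 0 and 1

# Round trip: taking the logit of p recovers the linear predictor.
print(all.equal(logit(p), eta))
print(all.equal(p, plogis(eta)))
```

A linear predictor of 0 maps to p = 0.5, which is the fact used later when solving for the patch area at which BCS presence and absence are equally likely.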
Logistic Regression Example
Dichotomous Variable is the Presence/Absence of BCS
➢ Y = 1 if BCS Present
➢ Y = 0 if BCS Absent
➢ p = Prob(BCS Present)
Model p as a function of Macrophyte Patch Area:
glm(BCS ~ Patch_area, family = binomial)
             Estimate     SE          z        Pr(>|z|)
Intercept    -2.433e+00   5.108e-01   -4.764   1.9e-06
Patch_area    1.765e-04   4.725e-05    3.736   0.0001
Interpreting Logistic Regression
glm(BCS ~ Patch_area, family = binomial): Intercept = -2.433, Patch_area = 1.765e-04
Effect of Patch Area on P(BCS)
• Non-Linear Transformation, so the effect on the probability scale depends on the:
➢ Value of Intercept
➢ Value of Other Variables
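One way to see why the effect on the probability scale is not constant is to differentiate the logistic function. This is a standard result rather than something shown on the slide; $x_1$ stands for Patch Area and $\beta_1$ for its coefficient:

$$
\frac{\partial p(x)}{\partial x_1} = \beta_1 \, p(x)\,\bigl[1 - p(x)\bigr],
\qquad
p(x) = \frac{e^{X\beta}}{1 + e^{X\beta}} .
$$

The slope is largest where $p(x) = 0.5$ (where it equals $\beta_1 / 4$) and shrinks toward zero as $p(x)$ approaches 0 or 1, so the same change in Patch Area shifts P(BCS) by different amounts depending on the intercept and any other terms in $X\beta$.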
Interpreting Logistic Regression
glm(BCS ~ Patch_area, family = binomial): Intercept = -2.433, Patch_area = 1.765e-04
For the average patch area (8374), the log odds would be:
-2.433 + 0.0001765 * 8374 = -0.955
Exponentiate to get the Odds of Success:
exp(-0.955) = p/(1 - p) = 0.38
Solve for p: p = 0.38/(1 + 0.38), so Prob(BCS Present | Area = 8374) = 0.28
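The arithmetic on this slide can be reproduced directly from the reported coefficients. This is a sketch that hard-codes the estimates, since the original data are not available here; the variable names `b0`, `b1`, and `area` are illustrative.

```r
# Coefficients as reported on the slide
b0   <- -2.433       # intercept
b1   <- 1.765e-04    # slope for Patch_area
area <- 8374         # average patch area

log_odds <- b0 + b1 * area       # linear predictor (log odds of BCS present)
odds     <- exp(log_odds)        # odds of BCS present
p        <- odds / (1 + odds)    # probability of BCS present

print(round(c(log_odds = log_odds, odds = odds, p = p), 3))
```

With a fitted `glm` object in hand, `predict()` with `type = "response"` and a `newdata` frame containing `Patch_area = 8374` would be the usual way to get the same probability without copying coefficients by hand.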
Interpreting Logistic Regression
glm(BCS ~ Patch_area, family = binomial): Intercept = -2.433, Patch_area = 1.765e-04
When p = 0.5, the odds equal 1 and the log odds equal 0:
-2.433 + 0.0001765 * Area = 0
Thus, the patch area for p = 0.50 is 2.433/0.0001765 = 13784.7
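The same inversion works for any target probability, not only p = 0.5. The sketch below assumes the slide's coefficients; the helper `area_for_p` is a hypothetical name introduced for illustration.

```r
b0 <- -2.433      # intercept from the slide
b1 <- 1.765e-04   # Patch_area slope from the slide

# Patch area at which the fitted probability equals p:
# solve b0 + b1 * area = logit(p) for area.
area_for_p <- function(p) (qlogis(p) - b0) / b1

print(area_for_p(0.5))             # logit(0.5) = 0, so this is -b0 / b1
print(area_for_p(c(0.25, 0.75)))   # areas for two other target probabilities
```

Because `qlogis(0.5)` is 0, the p = 0.5 case reduces to the slide's 2.433/0.0001765.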
Multinomial Logistic Regression
• Logistic Regression with > 2 response categories
• Model Probabilities Relative to 'Reference' Category
• Response May be Nominal or Ordinal
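For the nominal case, "relative to a reference category" is usually written as a set of baseline-category logits. The notation below is an assumption chosen to match the binary slides: category 1 is taken as the reference, $x$ is a single predictor, and each non-reference category $j$ has its own intercept $\beta_{0j}$ and slope $\beta_{1j}$.

$$
\log\frac{P(Y = j \mid x)}{P(Y = 1 \mid x)} = \beta_{0j} + \beta_{1j}\,x, \qquad j = 2, \dots, J
$$

$$
P(Y = 1 \mid x) = \frac{1}{1 + \sum_{k=2}^{J} e^{\beta_{0k} + \beta_{1k} x}},
\qquad
P(Y = j \mid x) = \frac{e^{\beta_{0j} + \beta_{1j} x}}{1 + \sum_{k=2}^{J} e^{\beta_{0k} + \beta_{1k} x}}
$$

With $J = 2$ this reduces to the ordinary logistic regression reviewed earlier.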
Shallow Lake Trophic Status
3 Categories Defining Lake State:
Y = 1 if Lake Clear
Y = 2 if Lake Shifting States
Y = 3 if Lake Turbid
Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
Coefficients:
   (Int)   TP
2  -2.47   0.012
3  -1.89   0.014
Std. Errors:
   (Int)   TP
2  0.549   0.004
3  0.447   0.004
Residual Deviance: 113.8345
AIC: 121.8345
Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
   (Int)   TP
2  -2.47   0.012
3  -1.89   0.014
For TP = 50, p(Shifting) is about 16% of p(Clear)
Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
   (Int)   TP
2  -2.47   0.012
3  -1.89   0.014
For TP = 50, p(Turbid) is about 30% of p(Clear)
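The two ratios quoted for TP = 50 can be checked, and turned into the three category probabilities, with a few lines of R. This sketch hard-codes the unrounded coefficients reported on the later nominal/ordinal comparison slide; the object names (`coefs`, `rel`, `p`) are illustrative.

```r
# Unrounded multinom coefficients (rows = categories 2 and 3, reference = category 1)
coefs <- rbind(
  "2" = c(int = -2.469517, TP = 0.01248172),   # Shifting vs. Clear
  "3" = c(int = -1.891459, TP = 0.01384079)    # Turbid vs. Clear
)
tp <- 50

# Ratio of each category's probability to p(Clear)
rel <- exp(coefs[, "int"] + coefs[, "TP"] * tp)
print(round(rel, 2))

# Convert the ratios into probabilities that sum to 1
p <- c(Clear = 1, Shifting = rel[["2"]], Turbid = rel[["3"]]) / (1 + sum(rel))
print(round(p, 2))
```

The ratios are what the slides report (about 0.16 and 0.30); dividing by one plus their sum is the step that converts them into probabilities for Clear, Shifting, and Turbid.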
Nominal (un-ordered) Multinomial Logistic Odds of Shifting State vs. Clear State
Ordinal Multinomial Logistic a.k.a. Proportional Odds Model
3 Ordered Status Categories:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid
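The proportional odds model is commonly written in terms of cumulative probabilities. The form below follows the parameterization used by `polr` in the MASS package, with cutpoints $\zeta_j$ and a single slope $\beta$ shared by all cutpoints; the symbols are assumptions, as the slides do not write the model out.

$$
\operatorname{logit} P(Y \le j \mid x) = \log\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \zeta_j - \beta\,x,
\qquad j = 1, \dots, J - 1, \quad \zeta_1 < \zeta_2 < \dots < \zeta_{J-1}
$$

For the three lake states this means two cutpoints (1|2 and 2|3) and one slope, with category probabilities obtained as differences of successive cumulative probabilities.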
Ordinal Multinomial Logistic a.k.a. Proportional Odds Model
library(MASS)
State.Ord = as.ordered(State.Nom)
polr(State.Ord ~ TP)
Coefficients:
    Value   SE      t value
TP  0.009   0.002   3.81
Intercepts:
     Value   SE      t value
1|2  1.103   0.342   3.22
2|3  1.889   0.397   4.76
Residual Deviance: 118.99
AIC: 124.9897
Assume Same Slope => Fewer Parameters
3 Ordered Status Categories:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid
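To show how the two intercepts and the single slope combine, the sketch below computes the three state probabilities at TP = 50 from the rounded estimates on this slide. It assumes the `polr` parameterization logit P(Y <= j) = zeta_j - beta * TP; TP = 50 is borrowed from the nominal example, and the object names are illustrative.

```r
# Rounded polr estimates from the slide
zeta <- c("1|2" = 1.103, "2|3" = 1.889)   # cutpoints (intercepts)
beta <- 0.009                             # common slope for TP
tp   <- 50

# Cumulative probabilities P(Y <= 1) and P(Y <= 2)
cum <- plogis(zeta - beta * tp)

# Category probabilities are differences of successive cumulative probabilities
p <- diff(c(0, cum, 1))
names(p) <- c("Clear", "Shifting", "Turbid")
print(round(p, 2))
```

Because both cumulative logits share the slope `beta`, increasing TP shifts both cutpoint curves by the same amount, which is the "same slope, fewer parameters" assumption named on the slide.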
library(MASS)
m2 = polr(State.Ord ~ TP)
newd = data.frame(TP = seq(0, 600))
# predicted probability of each lake state over the range of TP
prd = predict(m2, newdata = newd, type = "probs")
matplot(newd$TP, prd)
Nominal/Ordinal Comparison
Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
Coefficients:
   (Intercept)   TP
2  -2.469517     0.01248172
3  -1.891459     0.01384079
Std. Errors:
   (Intercept)   TP
2  0.5486044     0.004183882
3  0.4465049     0.003932610
Residual Deviance: 113.8345
AIC: 121.8345
For J = 3 Categories defining lake state:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid
Ordinal Multinomial Logistic a.k.a. Proportional Odds Model
library(MASS)
State.Ord = as.ordered(State.Nom)
polr(State.Ord ~ TP, Hess = TRUE)
Coefficients:
    Value    SE       t value
TP  0.0086   0.0023   3.8085
Intercepts:
     Value    SE       t value
1|2  1.1028   0.3417   3.2277
2|3  1.8889   0.3968   4.7605
Residual Deviance: 118.9897
AIC: 124.9897
For J = 3 Categories defining lake state:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid
(State 2 is Intermediate between 1 & 3)
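The AIC values on the two comparison slides can be reconciled with the residual deviances by counting parameters. The sketch assumes AIC = deviance + 2 * (number of parameters), which matches the reported numbers; the parameter counts (two intercepts plus two slopes for the nominal model, two cutpoints plus one slope for the ordinal model) are read off the coefficient tables, and the data frame `fits` is an illustrative name.

```r
# Reported residual deviances and parameter counts for the two models
fits <- data.frame(
  model    = c("nominal (multinom)", "ordinal (polr)"),
  deviance = c(113.8345, 118.9897),
  n_par    = c(4, 3)
)

# AIC reconstructed from deviance and parameter count
fits$AIC <- fits$deviance + 2 * fits$n_par
print(fits)

# Difference in AIC (ordinal minus nominal)
print(diff(fits$AIC))
```

On these numbers the nominal model has the lower AIC by roughly 3 units, while the ordinal model uses one fewer parameter; which model is preferable also depends on whether the ordering of lake states is scientifically meaningful.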