Multinomial Logistic Regression
David F. Staples

Outline
• Review of Logistic Regression
• BCS Example
• Extension to Multiple Response Groups
• Nominal Categories
• Ordinal Categories
• Model Fitting & Interpretation
• Shallow Lake Trophic Status

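Logistic Regression
Based on a Binomial Random Variable, Y = {0, 1}:
➢ Prob(Y = 1) = p
➢ Prob(Y = 0) = 1 - p
$p(x) = P(Y_i = 1 \mid X_i) = \dfrac{\exp(X\beta)}{1 + \exp(X\beta)}$, where $X\beta = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k$.
A logit transformation is used to linearize p(x):
$\log\left(\dfrac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k = X\beta$
The left-hand side is the Log Odds of 'Success' → the β's give the additive effect of the X's on the Log Odds.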
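The logistic function and the logit transformation on this slide are inverses of each other. The sketch below shows this in base R; the coefficients `beta0`, `beta1` and the predictor value `x` are hypothetical values chosen only for illustration and do not come from the talk.

```r
# Inverse logit: maps a linear predictor (log odds) to a probability.
inv_logit <- function(eta) exp(eta) / (1 + exp(eta))

# Logit: maps a probability back to the log odds.
logit <- function(p) log(p / (1 - p))

# Hypothetical coefficients and predictor value (not from the slides).
beta0 <- -1.0
beta1 <- 0.5
x <- 3

eta <- beta0 + beta1 * x   # linear predictor X * beta
p <- inv_logit(eta)        # probability of 'success'

print(p)
print(logit(p))                    # recovers eta
print(all.equal(p, plogis(eta)))   # base R's plogis() is the same function
```

Base R already provides these two functions as `plogis()` and `qlogis()`, so the hand-written versions are only there to make the formulas explicit.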

Logistic Regression Example
Dichotomous Variable is the Presence/Absence of BCS
➢ Y = 1 if BCS Present
➢ Y = 0 if BCS Absent
➢ p = Prob(BCS Present)
Model p as a function of Macrophyte Patch Area
glm(BCS ~ Patch_area, family = binomial)
             Estimate     SE          z        Pr(>|z|)
Intercept    -2.433e+00   5.108e-01   -4.764   1.9e-06
Patch_area    1.765e-04   4.725e-05    3.736   0.0001
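The z column in this output is the Wald statistic, the estimate divided by its standard error, and Pr(>|z|) is the matching two-sided normal tail probability. A minimal sketch that recomputes both from the rounded values printed on the slide (the vector names `est` and `se` are introduced here for illustration):

```r
# Estimates and standard errors as printed on the slide (rounded).
est <- c(Intercept = -2.433, Patch_area = 1.765e-04)
se  <- c(Intercept = 5.108e-01, Patch_area = 4.725e-05)

# Wald z statistic and two-sided p-value under a standard normal reference.
z <- est / se
p_value <- 2 * pnorm(-abs(z))

print(round(z, 3))
print(signif(p_value, 2))
```

Because the inputs are rounded, the results should land close to the reported columns but need not match them in the last digit.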

Interpreting Logistic Regression
glm(BCS ~ Patch_area, family = binomial)
Effect of Patch Area on P(BCS)
• Non-Linear Transformation: the change in P(BCS) per unit of Patch Area depends on
  ➢ Value of Intercept
  ➢ Value of Other Variables
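To make the non-linearity concrete, the sketch below applies the same increase in patch area at three different starting points, using the fitted coefficients from the slide. The step size and the starting areas are hypothetical values picked for illustration.

```r
# Fitted coefficients from the slide.
b0 <- -2.433
b1 <- 1.765e-04

# Fitted probability of BCS presence at a given patch area.
p_bcs <- function(area) plogis(b0 + b1 * area)

step  <- 5000                 # hypothetical increase in patch area
areas <- c(0, 10000, 30000)   # hypothetical starting patch areas

# Change on the probability scale: differs by starting point.
delta_p <- p_bcs(areas + step) - p_bcs(areas)

# Change on the log-odds scale: always b1 * step.
delta_log_odds <- qlogis(p_bcs(areas + step)) - qlogis(p_bcs(areas))

print(round(delta_p, 3))
print(round(delta_log_odds, 4))
```

The log-odds change is constant, while the probability change is largest where the fitted curve is steepest (near p = 0.5) and small near 0 or 1.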

Interpreting Logistic Regression
glm(BCS ~ Patch_area, family = binomial)
For the average patch area (8374), the log odds would be:
-2.433 + 0.0001765 * 8374 = -0.955
Exponentiate to get the Odds of Success:
exp(-0.955) = p/(1 - p) = 0.38
Solve for p: Prob(BCS Present | Area = 8374) = 0.38/(1 + 0.38) = 0.28
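The three steps on this slide (log odds, odds, probability) can be reproduced directly from the reported coefficients. This is a small sketch in base R; the variable names are introduced here for illustration.

```r
# Fitted coefficients and the average patch area from the slide.
b0 <- -2.433
b1 <- 0.0001765
area <- 8374

log_odds <- b0 + b1 * area    # linear predictor at the average patch area
odds <- exp(log_odds)         # odds of BCS presence, p / (1 - p)
p <- odds / (1 + odds)        # solve the odds for p

print(round(c(log_odds = log_odds, odds = odds, p = p), 3))
```

The last two steps together are what `plogis(log_odds)` computes in one call.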

Interpreting Logistic Regression
glm(BCS ~ Patch_area, family = binomial)
When p = 0.5, the log odds equals 0:
-2.433 + 0.0001765 * Area = 0
Thus, the patch area for p = 0.50 is 2.433/0.0001765 = 13784.7
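The same inversion works for any target probability, not only 0.5: set the linear predictor equal to the logit of the target and solve for the area. A minimal sketch, where the helper `area_for_p` is a hypothetical name introduced here:

```r
# Fitted coefficients from the slide.
b0 <- -2.433
b1 <- 0.0001765

# Patch area at which the fitted probability equals p:
# solve b0 + b1 * area = logit(p) for area.
area_for_p <- function(p, b0, b1) (qlogis(p) - b0) / b1

a50 <- area_for_p(0.5, b0, b1)   # logit(0.5) = 0, so this is -b0 / b1
print(round(a50, 1))

# Check: plugging the area back in returns the target probability.
print(plogis(b0 + b1 * a50))
```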

Multinomial Logistic Regression
• Logistic Regression with > 2 response categories
• Model Probabilities Relative to 'Reference' Category
• Response May be Nominal or Ordinal

Shallow Lake Trophic Status
3 Categories Defining Lake State:
Y = 1 if Lake Clear
Y = 2 if Lake Shifting States
Y = 3 if Lake Turbid

Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
Coefficients:
    (Int)    TP
2   -2.47    0.012
3   -1.89    0.014
Std. Errors:
    (Int)    TP
2   0.549    0.004
3   0.447    0.004
Residual Deviance: 113.8345
AIC: 121.8345

Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
    (Int)    TP
2   -2.47    0.012
3   -1.89    0.014
For TP = 50: p(Shifting) is about 16% of p(Clear)

Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
    (Int)    TP
2   -2.47    0.012
3   -1.89    0.014
For TP = 50: p(Turbid) is about 30% of p(Clear)
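Each row of the multinom output is a log odds against the reference category (Clear), so exponentiating the linear predictor gives the ratios quoted for TP = 50, and normalizing them gives the three category probabilities. The sketch below uses the unrounded coefficients that appear in the full multinom output later in the deck; the object names are introduced here for illustration.

```r
# Unrounded multinom coefficients (rows: state 2 = Shifting, state 3 = Turbid).
coefs <- rbind("2" = c(-2.469517, 0.01248172),
               "3" = c(-1.891459, 0.01384079))
colnames(coefs) <- c("(Int)", "TP")

tp <- 50

# Log odds of each non-reference state relative to Clear at this TP.
eta <- coefs[, "(Int)"] + coefs[, "TP"] * tp

# Ratios p(Shifting)/p(Clear) and p(Turbid)/p(Clear).
rel_odds <- exp(eta)
print(round(rel_odds, 2))

# Category probabilities: the reference category has relative odds 1.
probs <- c(Clear = 1, Shifting = rel_odds[["2"]], Turbid = rel_odds[["3"]])
probs <- probs / sum(probs)
print(round(probs, 3))
```

With the two-decimal coefficients shown on the slide the ratios shift slightly, which is ordinary rounding rather than a different model.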

Nominal (un-ordered) Multinomial Logistic
[Figure: Odds of Shifting State vs. Clear State]

Ordinal Multinomial Logistic a.k.a. Proportional Odds Model
3 Ordered Status Categories:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid

Ordinal Multinomial Logistic a.k.a. Proportional Odds Model
library(MASS)
State.Ord = as.ordered(State.Nom)
polr(State.Ord ~ TP)
Coefficients:
      Value    SE       t value
TP    0.009    0.002    3.81
Intercepts:
      Value    SE       t value
1|2   1.103    0.342    3.22
2|3   1.889    0.397    4.76
Residual Deviance: 118.99
AIC: 124.9897
Assume Same Slope => Fewer Parameters
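In the parameterization that `polr` documents, the cumulative logits are logit P(Y ≤ j) = ζ_j − β·TP, where the ζ_j are the reported intercepts (1|2 and 2|3) and β is the single shared slope. The sketch below turns the rounded values from this slide into category probabilities at TP = 50, a value chosen to match the nominal example; the object names are introduced here for illustration.

```r
# Rounded polr estimates from the slide.
zeta <- c("1|2" = 1.103, "2|3" = 1.889)   # cut-point intercepts
beta_tp <- 0.009                          # common slope for TP

tp <- 50

# Cumulative probabilities P(Y <= 1) and P(Y <= 2).
cum <- plogis(zeta - beta_tp * tp)

# Category probabilities are differences of the cumulative ones.
probs <- c(Clear    = cum[[1]],
           Shifting = cum[[2]] - cum[[1]],
           Turbid   = 1 - cum[[2]])
print(round(probs, 3))

# Proportional odds: one more unit of TP multiplies the odds of being
# above either cut-point by the same factor.
print(exp(beta_tp))
```

The single multiplier in the last line is what "same slope" buys: one TP parameter instead of the two in the nominal model.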

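library(MASS)                                       # provides polr()
m2 = polr(State.Ord ~ TP)                           # proportional odds fit
newd = data.frame(TP = seq(0, 600))                 # grid of TP values for prediction
prd = predict(m2, newdata = newd, type = "probs")   # fitted probability of each state
matplot(newd$TP, prd)                               # one series per lake state against TP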

Nominal/Ordinal Comparison

Nominal (un-ordered) Multinomial Logistic
library(nnet)
multinom(State.Nom ~ TP)
Coefficients:
    (Intercept)   TP
2   -2.469517     0.01248172
3   -1.891459     0.01384079
Std. Errors:
    (Intercept)   TP
2   0.5486044     0.004183882
3   0.4465049     0.003932610
Residual Deviance: 113.8345
AIC: 121.8345
For J = 3 Categories defining lake state:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid
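One way to read the TP coefficients is as odds multipliers relative to Clear. The sketch below converts the reported estimates and standard errors into odds ratios with Wald intervals; the 10-unit TP increase and the 95% level are assumptions made here for illustration, not choices stated in the talk.

```r
# TP coefficients and standard errors from the multinom output.
coef_tp <- c(Shifting = 0.01248172, Turbid = 0.01384079)
se_tp   <- c(Shifting = 0.004183882, Turbid = 0.003932610)

delta_tp <- 10           # hypothetical increase in TP
crit <- qnorm(0.975)     # critical value for a 95% Wald interval (assumed level)

# Multiplicative change in the odds vs. Clear for a delta_tp increase.
odds_ratio <- exp(coef_tp * delta_tp)
ci_low  <- exp((coef_tp - crit * se_tp) * delta_tp)
ci_high <- exp((coef_tp + crit * se_tp) * delta_tp)

print(round(cbind(odds_ratio, ci_low, ci_high), 3))
```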

Ordinal Multinomial Logistic a.k.a. Proportional Odds Model
library(MASS)
State.Ord = as.ordered(State.Nom)
polr(State.Ord ~ TP, Hess = TRUE)
Coefficients:
      Value     SE        t value
TP    0.0086    0.0023    3.8085
Intercepts:
      Value     SE        t value
1|2   1.1028    0.3417    3.2277
2|3   1.8889    0.3968    4.7605
Residual Deviance: 118.9897
AIC: 124.9897
For J = 3 Categories defining lake state:
Y = 1 if lake clear
Y = 2 if lake shifting states
Y = 3 if lake turbid
(State 2 is Intermediate between 1 & 3)
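The reported AIC values follow from the residual deviances and the parameter counts: the nominal model estimates four parameters (two intercepts and two TP slopes), the ordinal model three (two cut-points and one TP slope). A short check in base R, where the helper `aic_from_deviance` is a hypothetical name introduced here:

```r
# AIC = residual deviance + 2 * (number of estimated parameters),
# since both fits report the deviance as -2 * log-likelihood.
aic_from_deviance <- function(deviance, k) deviance + 2 * k

aic_nominal <- aic_from_deviance(113.8345, k = 4)   # multinom fit
aic_ordinal <- aic_from_deviance(118.9897, k = 3)   # polr fit

print(c(nominal = aic_nominal, ordinal = aic_ordinal))
print(aic_ordinal - aic_nominal)   # positive: nominal model has the lower AIC
```

On these numbers the extra slope in the nominal model lowers the deviance by more than the 2-unit AIC penalty it costs.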