Chapter 13 LOGISTIC REGRESSION

LOGISTIC REGRESSION
  • Set of independent variables
  • Categorical outcome measure, generally dichotomous

Discriminant Function Analysis
  • Distinguishes among groups based on predictor variables
  • With two groups, results are the same as multiple regression with a dummy-coded dependent variable

DISCRIMINANT FUNCTION ANALYSIS
  • Number of Discriminant Functions
      ◦ One less than the number of categories in the dependent variable, or the number of independent variables, whichever is less.
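
The rule above can be written as a one-line calculation. This is only a sketch; the function name is made up for illustration.

```python
def n_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Number of discriminant functions: the smaller of
    (number of outcome categories - 1) and the number of predictors."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least 2 groups and 1 predictor")
    return min(n_groups - 1, n_predictors)


print(n_discriminant_functions(2, 5))  # 1: two groups give one function
print(n_discriminant_functions(4, 2))  # 2: limited by the two predictors
```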

DISCRIMINANT FUNCTION ANALYSIS
  • Centroid
      ◦ Mean of the discriminant scores for a given group.
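
As a small illustration, a centroid is just a per-group mean of discriminant scores. The scores and group labels below are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def centroids(scores, groups):
    """Return {group: mean discriminant score} for each group."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    return {group: mean(values) for group, values in by_group.items()}


# Hypothetical discriminant scores for two groups
scores = [-1.2, -0.8, -1.0, 0.9, 1.1, 1.0]
groups = ["A", "A", "A", "B", "B", "B"]
print(centroids(scores, groups))
```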

DISCRIMINANT FUNCTION ANALYSIS
  • Coefficients
      ◦ Raw - like bs in regression
      ◦ Standardized - like Betas in regression
      ◦ Structure - like loadings in factor analysis
          ▪ .30 or greater considered meaningful

Analysis
  • Similar to factor analysis
  • Principal components analysis
  • Rotation may be used
  • Wilks’ lambda

Discriminant Function Analysis
  • Assumptions
      ◦ Distribution of independent variables, given value of outcome variable, is multivariate normal
      ◦ Dichotomous outcome variable makes this unlikely
      ◦ Discriminant function tends to overestimate the magnitude of the association

Dichotomous Outcome Variable
  • Mean will be between 0 and 1.
  • Binomial, rather than normal distribution, describes distribution of residuals.

Discriminant Function vs Logistic Regression
  • Logistic Regression requires fewer assumptions
  • Even if assumptions for Discriminant are met, Logistic still works well

LOGISTIC REGRESSION
  • Which variables affect the probability of a certain outcome?
      ◦ Produces odds ratios that aid interpretation

Methods
  • Discriminant analysis - least squares
  • Logistic regression - maximum-likelihood method
      ◦ Coefficients make observed results most likely
      ◦ Non-linear
      ◦ Iterative
      ◦ Data assume S-shaped curve
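
To make the "iterative" and "S-shaped curve" points concrete, here is a minimal sketch of maximum-likelihood estimation for one predictor using Newton-Raphson steps. It is an illustration under assumptions, not the routine any particular package uses; the data and function names are invented.

```python
import math


def sigmoid(z):
    """S-shaped (logistic) curve mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, n_iter=25, tol=1e-10):
    """Estimate b0, b1 in P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve (information matrix) * step = gradient
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0 += s0
        b1 += s1
        if abs(s0) < tol and abs(s1) < tol:
            break
    return b0, b1


# Hypothetical data: outcome becomes more likely as x increases
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 1, 0, 1]
b0, b1 = fit_logistic(x, y)
print(b0, b1, math.exp(b1))  # exp(b1) is the odds ratio per unit of x
```

Each pass updates the coefficients so that the observed outcomes become more likely; the loop stops when the updates are negligible. The sketch would fail on perfectly separated data, where no finite maximum exists.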

ODDS
  • Based on Probabilities
  • Probability of occurrence/probability of nonoccurrence
      ◦ Probability of developing lung cancer/probability of not developing lung cancer
      ◦ Can calculate the odds of developing lung cancer for smokers and for nonsmokers
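
A short sketch of the conversion between probability and odds follows. The probabilities used are arbitrary illustrations, not figures from any study.

```python
def odds(p):
    """Odds = probability of occurrence / probability of nonoccurrence."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return p / (1.0 - p)


def probability(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


print(odds(0.5))   # even odds
print(odds(0.2))   # 1 to 4
print(probability(odds(0.2)))
```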

ODDS RATIO
  • Ratio of one odds to the other
  • Ratio of odds of developing lung cancer for smokers vs the odds of developing lung cancer for nonsmokers
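
The lung cancer comparison can be sketched from a 2 x 2 table. The counts below are invented purely for illustration; the ratio of probabilities is printed alongside to show that it is a different quantity from the odds ratio.

```python
def odds_ratio(cases_a, noncases_a, cases_b, noncases_b):
    """Odds of the outcome in group A divided by the odds in group B."""
    odds_a = cases_a / noncases_a
    odds_b = cases_b / noncases_b
    return odds_a / odds_b


# Hypothetical counts: (developed lung cancer, did not)
smokers = (30, 70)
nonsmokers = (10, 90)

or_value = odds_ratio(*smokers, *nonsmokers)
risk_ratio = (30 / 100) / (10 / 100)  # ratio of probabilities, for contrast
print(or_value, risk_ratio)
```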

LOGISTIC REGRESSION
  • Exercise
      ◦ Recode the exercise variable into a new variable where people who exercise rarely or sometimes are scored 0, and those who exercise often or routinely are scored 1.
      ◦ Recode marital status into a new variable where never married = 0, married at some time = 1, and living with significant other is assigned to missing values.
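
Outside SPSS, the same recoding can be sketched with lookup tables. The category labels below are assumptions (in particular, the labels that make up "married at some time" are guesses); the actual data set may code them differently.

```python
# Hypothetical category labels; the real data set may use different ones.
EXERCISE_CODES = {"rarely": 0, "sometimes": 0, "often": 1, "routinely": 1}
MARITAL_CODES = {
    "never married": 0,
    # Assumed labels for "married at some time"
    "married": 1, "separated": 1, "divorced": 1, "widowed": 1,
    # Assigned to missing
    "living with significant other": None,
}


def recode(value, codes):
    """Return the new code for value, or None (missing) if it is unmapped."""
    return codes.get(value)


exercise = ["rarely", "often", "routinely", "sometimes"]
marital = ["never married", "divorced", "living with significant other"]
exercise_new = [recode(v, EXERCISE_CODES) for v in exercise]
marital_new = [recode(v, MARITAL_CODES) for v in marital]
print(exercise_new)  # [0, 1, 1, 0]
print(marital_new)   # [0, 1, None]
```

Note that any label not in the table also becomes missing, so misspelled categories should be checked before analysis.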

Exercise Continued
  • Which of the predictor variables affect the probability of regular exercise?
  • Enter the predictors in the following sets:

Predictors
  • Step 1
      ◦ Gender
      ◦ Marital status
  • Step 2
      ◦ Satisfaction with weight
      ◦ Overall health
  • Step 3
      ◦ Current quality of life

SPSS - Logistic Regression
  • ANALYZE
      ◦ Regression
          ▪ Binary Logistic
          ▪ Options
              - Classification Plots
              - Hosmer-Lemeshow goodness of fit
              - Casewise listing of residuals
              - Iteration history
              - CI for exp B
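
The "CI for exp B" option requests a confidence interval for the odds ratio. One common way to form such an interval, assumed here rather than taken from the slides, is to exponentiate B plus or minus z times its standard error (a Wald interval). The coefficient and standard error below are hypothetical.

```python
import math


def odds_ratio_ci(b, se, z=1.96):
    """Return (exp(B), lower, upper) for an approximate 95% Wald interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# Hypothetical coefficient and standard error
or_est, lower, upper = odds_ratio_ci(b=0.693, se=0.25)
print(or_est, lower, upper)
# An interval that excludes 1 suggests the predictor changes the odds.
```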

-2 Log Likelihood
  • Likelihood = probability of observed results given parameters
  • -2 times the log of the likelihood is given
  • Perfect model would have -2 LL = 0.
  • Model chi-square reflects difference between successive -2 LLs.
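
These points can be checked numerically. The sketch below computes -2 LL from outcomes and predicted probabilities, then takes the difference between an intercept-only model and a fitted model as the model chi-square. The outcomes and predicted probabilities are invented.

```python
import math


def neg2_log_likelihood(y, p):
    """-2 times the log-likelihood of 0/1 outcomes y given probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        # Each case contributes the log of the probability of what was observed
        total -= 2.0 * math.log(pi if yi == 1 else 1.0 - pi)
    return total


y = [0, 0, 1, 1]                      # hypothetical outcomes
p_null = [sum(y) / len(y)] * len(y)   # intercept-only model: overall proportion
p_model = [0.2, 0.3, 0.7, 0.8]        # hypothetical fitted probabilities
p_perfect = [0.0, 0.0, 1.0, 1.0]      # every case predicted with certainty

neg2ll_null = neg2_log_likelihood(y, p_null)
neg2ll_model = neg2_log_likelihood(y, p_model)
model_chi_square = neg2ll_null - neg2ll_model

print(neg2ll_null, neg2ll_model, model_chi_square)
print(neg2_log_likelihood(y, p_perfect))  # 0.0 for a perfect model
```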

Terms
  • Step - If variables within a block are entered in a stepwise fashion, this tests each step
  • Block - Test of variables entered in this block
  • Model - Test of overall model at this point

Estimates of Variance Accounted For
  • Cox & Snell (can’t equal one)
  • Nagelkerke (modification of Cox & Snell)
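
Both estimates can be computed from the -2 LL of the intercept-only model and of the fitted model. The sketch below uses the usual textbook definitions; the -2 LL values and sample size are hypothetical. It also shows why Cox & Snell cannot reach one and how Nagelkerke rescales it.

```python
import math


def cox_snell_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R^2 = 1 - exp((-2LL_model - (-2LL_null)) / n)."""
    return 1.0 - math.exp((neg2ll_model - neg2ll_null) / n)


def nagelkerke_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R^2 divided by its maximum possible value."""
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_cox_snell


# Hypothetical values for a sample of 100 cases
n, neg2ll_null, neg2ll_model = 100, 138.6, 110.0
print(cox_snell_r2(neg2ll_null, neg2ll_model, n))
print(nagelkerke_r2(neg2ll_null, neg2ll_model, n))

# Even a perfect model (-2 LL = 0) leaves Cox & Snell below 1
print(cox_snell_r2(neg2ll_null, 0.0, n), nagelkerke_r2(neg2ll_null, 0.0, n))
```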

Goodness of Fit
  • Hosmer and Lemeshow Test
  • Non-significant result means model fits
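
The idea behind the test can be sketched as follows: sort cases by predicted probability, split them into groups, and compare observed with expected counts of the outcome in each group. This is a simplified illustration with invented data; the exact grouping rules in SPSS may differ. The p-value helper uses a closed form that is valid only for an even number of degrees of freedom.

```python
import math


def hosmer_lemeshow(y, p, n_groups=10):
    """Hosmer-Lemeshow chi-square statistic over groups of sorted predictions."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        mean_p = expected / size  # must lie strictly between 0 and 1
        stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical data: 10 risk groups of 20 cases, observed counts near expected
ones_per_group = [2, 3, 4, 7, 9, 11, 13, 16, 17, 19]
p, y = [], []
for g, ones in enumerate(ones_per_group):
    p += [(g + 0.5) / 10] * 20
    y += [1] * ones + [0] * (20 - ones)

stat = hosmer_lemeshow(y, p)
p_value = chi2_sf_even_df(stat, df=10 - 2)  # df = number of groups - 2
print(stat, p_value)  # a large p-value gives no evidence of poor fit
```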

Example from the Literature