Stata: Logistic Regression 2 h, Hein Stigum (Nina Iszatt)

  • Slides: 27
Stata: Logistic Regression 2 h
Hein Stigum (Nina Iszatt)
Presentation, data and programs at: http://folk.uio.no/heins/
Nov-20 H. S. 1

DAG: Physical activity and CHD
• CHD analysis
– Binary outcome
– Plots by physical activity
– Compare proportions
– Logistic regression
02 November H. S. 2

Agenda
• Purpose
• Workflow
• Syntax
• Testing assumptions
• Influence

BACKGROUND

Logistic regression
Physical Activity and Coronary Heart Disease

Logistic model and assumptions
• linear predictor: xb
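The formulas on this slide did not survive in the transcript. As a reminder, the textbook form of the logistic model is sketched below; the symbols p (outcome probability), x_1 ... x_k (covariates) and b_0 ... b_k (coefficients) are assumed notation and need not match the original slide.

```latex
\[
\operatorname{logit}(p) = \ln\!\left(\frac{p}{1-p}\right) = xb = b_0 + b_1 x_1 + \dots + b_k x_k,
\qquad
p = \frac{e^{xb}}{1 + e^{xb}} = \frac{1}{1 + e^{-xb}}
\]
```

The model is linear in the covariates on the log-odds scale, which is the assumption tested later under "Non-linear effects".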

Association measure, Odds ratio
Start with:
Hence:
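The two equations on this slide were lost in extraction. A standard derivation of the odds ratio from the logistic model, written for a single covariate x_1 (assumed notation, not necessarily the slide's own), would run roughly as follows:

```latex
\begin{align*}
\ln\!\left(\frac{p}{1-p}\right) &= b_0 + b_1 x_1 && \text{(start with)}\\
\text{odds}(x_1) = \frac{p}{1-p} &= e^{b_0 + b_1 x_1}\\
OR = \frac{\text{odds}(x_1 + 1)}{\text{odds}(x_1)}
  &= \frac{e^{b_0 + b_1 (x_1 + 1)}}{e^{b_0 + b_1 x_1}} = e^{b_1} && \text{(hence)}
\end{align*}
```

The intercept cancels in the ratio, so the OR for a one-unit increase in x_1 depends only on its coefficient.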

Short: need to know
• Binary outcome
• Assume
– Linear effects on the log-odds scale
• Association measure
– OR = e^b, b = coefficient
• Scale
– Multiplicative: exposed to both x1 and x2: OR1*OR2
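The multiplicative scale can be checked directly in Stata. The sketch below is only an illustration: it uses Stata's shipped example dataset lbw (outcome low, exposures smoke and ht) as a stand-in, since the course data are not part of this transcript.

```stata
* Stand-in data (assumption): Stata's example low-birth-weight dataset
webuse lbw, clear

* Two binary exposures, no interaction term
logistic low i.smoke i.ht

* OR for exposure to both = exp(b_smoke + b_ht) = OR_smoke * OR_ht
lincom 1.smoke + 1.ht, or
```

The odds ratio reported by lincom should equal the product of the two odds ratios in the logistic output, because the model contains no interaction term.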

Purpose of regression
• Estimation (DAGs, bias, precision)
– Estimate association between exposure and outcome adjusted for other covariates
– Estimate the effect of smoking on lung cancer
• Prediction (predictive power, model fit, R^2)
– Use an estimated model to predict the outcome given covariates in a new dataset
– Predict air pollution by distance from roads

Syntax
• Estimation
– logistic y x1 x2                     logistic regression
– logistic y i.smoke c.age             cat. smoke, cont. age
– logistic y i.smoke##c.age            interaction
• Manage models
– estimates store m1                   save model
– est table m1, eform                  show OR
• Post estimation
– predict yf, pr                       predict probability
– margins, over(ageI) predict(xb)      non-linearity?
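The commands on this slide can be strung together as in the sketch below. It again assumes Stata's example dataset lbw as a stand-in for the course data; the names phat and age_grp are hypothetical and chosen only for illustration.

```stata
* Stand-in data (assumption): Stata's example low-birth-weight dataset
webuse lbw, clear

* Estimation: categorical smoke, continuous age
logistic low i.smoke c.age
estimates store m1

* Main effects plus interaction
logistic low i.smoke##c.age
estimates store m2

* Manage models: show both models as odds ratios
estimates table m1 m2, eform

* Post estimation: predicted probability from the last model
predict phat, pr

* Mean linear predictor by age group (hypothetical grouping variable)
egen age_grp = cut(age), group(4)
margins, over(age_grp) predict(xb)
```

If the mean linear predictor does not change roughly evenly across ordered age groups, that may hint at a non-linear effect of age on the log-odds scale.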

Workflow
• DAG
• Bivariate analysis
• Regression
– Model fitting
  • Exposure
  • + Confounders
– Test of assumptions
  • Independent errors
  • Linear effects (on the log odds scale)
  • Interactions
– Influence

Syntax “Descriptive Analysis”

ASSUMPTIONS

Assumptions of the standard model
1. Independent residuals                        discuss
2. Linear effects on the log-odds scale         add splines
3. No interactions                              test in model
Dependent residuals? When will the heart disease of one person depend on the heart disease of another?
• Siblings, twins
• If many siblings: clusters by mother’s id
– logistic …, vce(cluster m_id)
– melogit …
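The two ways of handling dependent residuals named on this slide can be sketched as follows. The course data with mother's id (m_id) are not available here, so the example assumes Stata's shipped bangladesh dataset, where women are clustered within district.

```stata
* Stand-in clustered data (assumption): women nested in districts
webuse bangladesh, clear

* Ordinary logistic regression with cluster-robust standard errors
logistic c_use i.urban c.age, vce(cluster district)

* Random-intercept logistic regression, reported as odds ratios
melogit c_use i.urban c.age || district:, or
```

The first model keeps the usual population-averaged odds ratios and only corrects the standard errors; the second models the clustering explicitly, so its odds ratios are conditional on the cluster.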

Non-linear effects

Smoothers in regressions
• Polynomials
– x, x^2, x^3
• Fractional polynomials (2 of 8)
– x^-2, x^-1, x^-0.5, log(x), x^0.5, x, x^2, x^3
• Splines
– cubic        only plots
– linear       estimates (g1, g2)
(Govindarajulu, Malloy et al. 2009, Binder, Sauerbrei et al. 2013, Kahan, Rushton et al. 2016)
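A rough sketch of how these smoothers might be fitted in Stata is given below. It assumes the example dataset lbw with age as the continuous covariate; the spline variable names and the knot at age 25 are arbitrary choices for illustration, not taken from the course material.

```stata
* Stand-in data (assumption): Stata's example low-birth-weight dataset
webuse lbw, clear

* Fractional polynomial of degree 2: picks 2 powers from
* the default set (-2 -1 -.5 0 .5 1 2 3)
fp <age>: logistic low <age> smoke

* Restricted cubic spline with 4 knots (mainly useful for plotting)
mkspline age_rcs = age, cubic nknots(4)
logistic low age_rcs* smoke

* Linear spline with one knot at age 25: one OR per unit of age in each segment
mkspline age_lo 25 age_hi = age
logistic low age_lo age_hi smoke
```

The linear spline gives directly interpretable estimates, one slope on each side of the knot, whereas the cubic spline coefficients are usually read from a plot of the fitted curve.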

Syntax “Non-linear effect”

Measures of influence
INFLUENCE

Measures of influence
Remove obs 1, see change; remove obs 2, see change
• Measure change in:
– Coefficients (beta)
  • Delta beta: one delta-beta per observation (observations with the same covariate pattern share a value), summarizing all covariates
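As an illustration of the delta-beta measure, the sketch below computes it after a logistic model and plots it against the predicted probability. The dataset lbw and the variable names p and dBeta are assumptions made for the example.

```stata
* Stand-in data (assumption): Stata's example low-birth-weight dataset
webuse lbw, clear
logistic low age lwt i.smoke

* Predicted probability and Pregibon's delta-beta influence statistic
predict p, pr
predict dBeta, dbeta

* Influence plotted against predicted probability
scatter dBeta p, jitter(10)

* Inspect the most influential covariate patterns
gsort -dBeta
list id low age lwt smoke p dBeta in 1/5
```

Points that stand well above the rest in the plot are candidates for a closer look, for example by refitting the model without them and comparing the odds ratios.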

Syntax “Influence”

Summing up 1
• Build model
– logistic chd phys                    crude model
– est store m1                         store
– logistic chd phys age educ           full model
– est store m2                         store
– est table m1 m2, eform               compare ORs
• Interactions
– logistic chd c.age##i.sex            main + interaction
• Non-linearity (linear spline)
– mkspline l1 9 l2=phys                spline in phys (knot at 9)
– logistic chd l? age educ             regression with spline

Summing up 2
• Interaction
– logistic chd c.phys##i.sex           test interaction
• Influence of outliers
– predict dBeta, db                    delta beta (common)
– scatter dBeta p, jitter(10)          delta-beta by p
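One possible way to test an interaction after fitting it is sketched below, using a Wald test and a likelihood-ratio test. The example assumes the dataset lbw, with age and smoke standing in for phys and sex; the stored model names are hypothetical.

```stata
* Stand-in data (assumption): Stata's example low-birth-weight dataset
webuse lbw, clear

* Main effects only
logistic low c.age i.smoke
estimates store m_main

* Main effects plus interaction
logistic low c.age##i.smoke
estimates store m_inter

* Wald test of the interaction term
testparm i.smoke#c.age

* Likelihood-ratio test of the same term
lrtest m_main m_inter
```

With a single interaction term the two tests usually lead to the same conclusion; the likelihood-ratio version needs both models to be fitted on the same observations.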

Generalized Linear Models, GLM
• Linear regression
• Logistic regression
• Poisson regression

Syntax “Binary regression, OR, RR and RD effect measures”
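The three effect measures can be obtained from the same binary outcome with binreg, as in the minimal sketch below. It assumes the example dataset lbw with a single binary exposure; models with several continuous covariates may fail to converge on the RR and RD scales.

```stata
* Stand-in data (assumption): Stata's example low-birth-weight dataset
webuse lbw, clear

* Odds ratio (logit link)
binreg low i.smoke, or

* Risk ratio (log link)
binreg low i.smoke, rr

* Risk difference (identity link)
binreg low i.smoke, rd
```

Only the link function changes between the three calls; the outcome and exposure are the same, which makes the OR, RR and RD directly comparable.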

The end

Stata regression commands

• Regression with simple error structure
– regress        linear regression (also heteroscedastic errors)
– nl             non-linear least squares
• GLM
– logistic       logistic regression
– poisson        Poisson regression
– binreg         binary outcome, OR, RR, or RD effect measures
• Conditional logistic
– clogit         for matched case-control data
• Categorical outcome (>2 categories)
– mlogit         multinomial logit (not ordered)
– ologit         ordered logit
• Regression with complex error structure
– mixed          linear mixed models
– melogit        random effect logistic