Statistics for clinicians
Biostatistics course by Kevin E. Kip, Ph.D., FAHA
Professor and Executive Director, Research Center, University of South Florida, College of Nursing
Professor, College of Public Health, Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute, Morsani College of Medicine
Tampa, FL, USA
SECTION 6.6 Introduction to survival analysis
Learning Outcome: Recognize concepts and methods used in survival analysis
Survival Analysis
• A technique to estimate the probability of “survival” (and also risk of disease) that takes into account incomplete subject follow-up.
• Calculates risks over a time period with changing incidence rates.
• Wide application in a variety of disciplines, such as engineering.

Survival Analysis
• With the Kaplan-Meier method (“product-limit method”), survival probabilities are calculated at each time interval in which an event occurs.
• The cumulative survival over the entire follow-up period is derived from the product of all interval survival probabilities.
• Cumulative incidence (risk) is the complement of cumulative survival.
K-M formula:

$$S = \prod_{k=1}^{K} \frac{N_k - A_k}{N_k}$$

Where:
K = number of time intervals
k = sequence of time interval
N_k = number of subjects at risk
A_k = number of outcome events
Survival Analysis
• With the Kaplan-Meier method, subjects with incomplete follow-up (FU) are “censored” at their last known time of FU.
• An important assumption (often not upheld) is that censoring is “non-informative” (the survival experience of censored subjects is the same as that of subjects with complete FU).
• Non-fatal outcomes can also be studied.
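The product-limit calculation and the handling of censored subjects described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated implementation: the function name `kaplan_meier` and the five follow-up records are hypothetical, and a subject censored at the same time as an event is assumed to still be at risk at that time.

```python
from typing import List, Tuple


def kaplan_meier(records: List[Tuple[float, bool]]) -> List[Tuple[float, int, int, float]]:
    """Return (time, N_k, A_k, cumulative survival) for each event time.

    Each record is (follow-up time, event flag); a False flag means the
    subject was censored at that time.
    """
    event_times = sorted({t for t, event in records if event})
    cumulative = 1.0
    table = []
    for t in event_times:
        # N_k: subjects still under observation at time t
        n_at_risk = sum(1 for time, _ in records if time >= t)
        # A_k: outcome events occurring at time t
        n_events = sum(1 for time, event in records if event and time == t)
        # Multiply in this interval's survival probability (N_k - A_k) / N_k
        cumulative *= (n_at_risk - n_events) / n_at_risk
        table.append((t, n_at_risk, n_events, cumulative))
    return table


# Hypothetical follow-up data: (months of follow-up, event observed?)
records = [(3, True), (5, False), (8, True), (10, False), (12, False)]
for time, n_at_risk, n_events, survival in kaplan_meier(records):
    print(f"t={time}: at risk={n_at_risk}, events={n_events}, S={survival:.4f}")
```

Censored subjects contribute to the at-risk count up to their last known follow-up time and then drop out of the denominator, which is how incomplete follow-up is taken into account.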
Survival Analysis
• The Life-Table method is conceptually similar to the Kaplan-Meier method.
• The primary difference is that survival probabilities are determined at predetermined intervals (e.g., yearly), rather than when events occur.
SECTION 6.7 Calculation and Interpretation of Survival Analysis Estimates
Learning Outcome: Calculate and interpret survival analysis estimates of incidence
Survival Analysis
Example:
• Assume a study of 10 subjects conducted over a 2-year period.
• A total of 4 subjects die.
• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).
What is the probability of 2-year survival, and the corresponding risk of 2-year death?
| (1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 - (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 - (7) |
|---|---|---|---|---|---|---|---|
| 5 | 10 | 1 | 1 | 0.10 | 0.90 | 0.90 | 0.10 |
| 7 | 8 | 1 | 0 | 0.125 | 0.875 | 0.788 | 0.212 |
| 20 | 7 | 1 | 1 | 0.143 | 0.857 | 0.675 | 0.325 |
| 22 | 5 | 1 | 0 | 0.20 | 0.80 | 0.54 | 0.46 |
| 24 | 4 | 0 | 0 | 0.0 | 1.0 | 0.54 | 0.46 |

The number alive in column (2) is the previous row's count minus its deaths (3) and losses to FU (4); for example, 8 - 1 - 0 = 7 at 20 months. Cumulative survival in column (7) is the previous row's cumulative survival multiplied by column (6); for example, 0.90 × 0.875 = 0.788 at 7 months.
Interpretation: What is the 2-year risk of death?
The cumulative risk in column (8) at 24 months is 0.46, so cumulative 2-year survival is 0.54.

Interpretation: What is the 1-year risk of death?
No deaths occur between 7 and 12 months, so the cumulative risk at 12 months equals the cumulative risk at 7 months, 0.212.
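Reading a risk at a given time off the worked table amounts to multiplying the interval survival probabilities for every event time up to that point. The sketch below assumes the at-risk and death counts from the 10-subject example; the function name `cumulative_risk` is illustrative only.

```python
# Rows from the worked example: (month, number alive at that time, deaths at that time)
rows = [(5, 10, 1), (7, 8, 1), (20, 7, 1), (22, 5, 1), (24, 4, 0)]


def cumulative_risk(rows, month):
    """Cumulative risk of death up to and including `month`."""
    survival = 1.0
    for time, at_risk, deaths in rows:
        if time > month:
            break  # later event times do not affect the risk at `month`
        survival *= (at_risk - deaths) / at_risk
    return 1.0 - survival


print(f"1-year risk of death: {cumulative_risk(rows, 12):.4f}")
print(f"2-year risk of death: {cumulative_risk(rows, 24):.4f}")
```

Because the Kaplan-Meier estimate only changes at event times, the risk at 12 months is the same as the risk at the last event time before it (7 months).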
Survival Analysis (Practice)
Example:
• Assume a study of 12 subjects conducted over a 3-year period.
• A total of 5 subjects die.
• Another 2 subjects have incomplete follow-up (study withdrawal or late study entry).
What is the probability of 3-year survival, and the corresponding risk of 3-year death?
Complete the worksheet below

| (1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 - (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 - (7) |
|---|---|---|---|---|---|---|---|
| 7 | 12 | 1 | 1 | 0.0833 | 0.9167 | 0.9167 | 0.0833 |
| 11 | 10 | 1 | 0 | 0.10 | 0.90 | 0.8250 | 0.1750 |
| 16 | | 1 | 0 | | | | |
| 24 | | 1 | 1 | | | | |
| 30 | | 1 | 0 | | | | |
| 36 | | 0 | 0 | | | | |

What is the probability of 3-year survival, and the corresponding risk of 3-year death?
Survival _______ Death _______
Completed worksheet

| (1) Time to death from entry (mo) | (2) No. alive at each time | (3) No. who died at each time | (4) No. lost to FU prior to next time | (5) Prop. died at that time, (3) / (2) | (6) Prop. survived at that time, 1 - (5) | (7) Cumul. survival to that time | (8) Cumul. risk to that time, 1 - (7) |
|---|---|---|---|---|---|---|---|
| 7 | 12 | 1 | 1 | 0.0833 | 0.9167 | 0.9167 | 0.0833 |
| 11 | 10 | 1 | 0 | 0.10 | 0.90 | 0.8250 | 0.1750 |
| 16 | 9 | 1 | 0 | 0.1111 | 0.8889 | 0.7333 | 0.2667 |
| 24 | 8 | 1 | 1 | 0.125 | 0.875 | 0.6416 | 0.3584 |
| 30 | 6 | 1 | 0 | 0.1667 | 0.8333 | 0.5346 | 0.4654 |
| 36 | 5 | 0 | 0 | 0.0 | 1.0 | 0.5346 | 0.4654 |

What is the probability of 3-year survival, and the corresponding risk of 3-year death?
Survival 0.5346 Death 0.4654
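A hand-completed worksheet rounds at every step, so its last digit can differ slightly from a full-precision product. The sketch below uses the counts from the practice worksheet and compares the two approaches; rounding to 4 decimals at each step is an assumption about how the worksheet values were obtained.

```python
# Practice worksheet rows: (month, number alive at that time, deaths at that time)
rows = [(7, 12, 1), (11, 10, 1), (16, 9, 1), (24, 8, 1), (30, 6, 1), (36, 5, 0)]

exact = 1.0    # product carried at full precision
rounded = 1.0  # product rounded to 4 decimals at every step, as on a worksheet
for _, at_risk, deaths in rows:
    interval_survival = (at_risk - deaths) / at_risk
    exact *= interval_survival
    rounded = round(rounded * round(interval_survival, 4), 4)

print(f"full precision: survival={exact:.4f}, risk={1 - exact:.4f}")
print(f"rounded steps:  survival={rounded:.4f}, risk={1 - rounded:.4f}")
```

The full-precision 3-year survival is about 0.5347, against 0.5346 when intermediate values are rounded; the difference is immaterial for interpretation but explains why software output may not match a hand calculation exactly.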
SECTION 6.8 Logistic Regression Model
Learning Outcome: Recognize components and interpret parameters from the logistic regression model
Logistic Regression Analysis
§ Conceptually similar to linear regression, but with a dichotomous outcome.
§ Outcome is usually coded as “0” or “1”, with “1” referring to presence of the outcome of interest (although SAS by default models the probability of the value 0).
§ p represents the probability that the outcome is present (i.e., has a value of 1), given the particular covariate values of an individual.
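Logistic Regression Analysis
§ The multiple logistic regression model can be written in different ways:

$$p = \frac{\exp(b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p)}{1 + \exp(b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p)}$$

$$\ln\left\{\frac{p}{1-p}\right\} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

where:
p = expected probability that outcome is present
x_1 through x_p = independent variables
b_0 through b_p = regression coefficients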
Logistic Regression Analysis
b_i = change in the expected log odds of the outcome per 1-unit change in x_i, holding other predictors constant
The anti-log of a regression coefficient, exp(b_i), produces the odds ratio
Logistic Regression Analysis
Example: Estimate the risk of incident CVD among persons defined as obese.

| Variable | b | χ2 | p-value |
|---|---|---|---|
| Intercept | -2.367 | 307.38 | 0.0001 |
| Obesity (yes vs. no) | 0.658 | 9.87 | 0.0017 |

$$\ln\left\{\frac{p}{1-p}\right\} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

$$\ln\left\{\frac{p}{1-p}\right\} = -2.367 + 0.658(\text{Obesity}) = \text{log odds}$$

exp(0.658) = 1.93 (odds ratio)
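The fitted CVD model can be turned into predicted probabilities by exponentiating the log odds and converting odds to a probability. The sketch below uses only the two coefficients from the table; the function name `predicted_probability` is illustrative, and reading the result as a risk assumes the model was fitted to cohort (incidence) data.

```python
import math

b0, b1 = -2.367, 0.658  # intercept and obesity coefficient from the table


def predicted_probability(obese: int) -> float:
    """Convert the model's log odds to a probability for obese = 0 or 1."""
    log_odds = b0 + b1 * obese
    odds = math.exp(log_odds)
    return odds / (1 + odds)


odds_ratio = math.exp(b1)
print(f"odds ratio, obese vs. not obese: {odds_ratio:.2f}")
print(f"predicted probability of CVD, not obese: {predicted_probability(0):.3f}")
print(f"predicted probability of CVD, obese:     {predicted_probability(1):.3f}")
```

The odds ratio compares the two groups on the odds scale; the two predicted probabilities show what that ratio means in absolute terms for this model.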
Example: Estimate the log odds of being on a statin drug in relation to the predictors listed below.

| Variable | b | Wald χ2 | p-value |
|---|---|---|---|
| Intercept | -3.065 | 8.015 | 0.027 |
| Age (per year) | 0.036 | 5.334 | 0.021 |
| Gender (female = 1) | -0.530 | 5.082 | 0.024 |
| Body mass index (per unit) | 0.029 | 2.187 | 0.139 |
| Physical activity (per unit) | -0.001 | 0.000 | 0.996 |
| History of diabetes (1 = yes) | 1.067 | 9.250 | 0.002 |

$$\ln\left\{\frac{p}{1-p}\right\} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$$

Write out the logistic regression equation below. (Practice)

$$\ln\left\{\frac{p}{1-p}\right\} = $$
Write out the logistic regression equation below. (Answer)

$$\ln\left\{\frac{p}{1-p}\right\} = -3.065 + 0.036(\text{age}) - 0.530(\text{female}) + 0.029(\text{BMI}) - 0.001(\text{physical activity}) + 1.067(\text{diabetes})$$
So, the predicted odds of an individual being on a statin drug
= exp[-3.065 + 0.036(age) - 0.530(female) + 0.029(BMI) - 0.001(physical activity) + 1.067(diabetes)]
and
Predicted probability = predicted odds / (1 + predicted odds).
Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics:
Age = 55; male; BMI = 31.4; physical activity level = 2; diabetic
Predicted odds = exp[-3.065 + 0.036(55) - 0.530(0) + 0.029(31.4) - 0.001(2) + 1.067(1)] = exp(0.8906) = 2.437
Predicted probability = predicted odds / (1 + predicted odds) = 2.437 / 3.437 = 0.71
Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics: (Practice)
Age = 52; female; BMI = 29.5; physical activity level = 3; non-diabetic
Predicted odds =
Predicted probability = predicted odds / (1 + predicted odds) =
Estimate the predicted odds and probability of an individual being on a statin drug with the following characteristics: (Answer)
Age = 52; female; BMI = 29.5; physical activity level = 3; non-diabetic
Predicted odds = exp[-3.065 + 0.036(52) - 0.530(1) + 0.029(29.5) - 0.001(3) + 1.067(0)] = exp(-0.8705) = 0.419
Predicted probability = predicted odds / (1 + predicted odds) = 0.419 / 1.419 = 0.295
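The two hand calculations for the statin model follow the same three steps: sum the linear predictor, exponentiate to get odds, then convert odds to a probability. The sketch below encodes the coefficients from the table; the dictionary keys and the function name `predict` are illustrative choices, not part of the original output.

```python
import math

# Coefficients from the statin table (log-odds scale)
COEF = {
    "intercept": -3.065,
    "age": 0.036,
    "female": -0.530,
    "bmi": 0.029,
    "activity": -0.001,
    "diabetes": 1.067,
}


def predict(age, female, bmi, activity, diabetes):
    """Return (log odds, odds, probability) of statin use for one person."""
    log_odds = (COEF["intercept"] + COEF["age"] * age + COEF["female"] * female
                + COEF["bmi"] * bmi + COEF["activity"] * activity
                + COEF["diabetes"] * diabetes)
    odds = math.exp(log_odds)
    return log_odds, odds, odds / (1 + odds)


# (age, female, BMI, physical activity level, diabetes) for the two profiles
for person in [(55, 0, 31.4, 2, 1), (52, 1, 29.5, 3, 0)]:
    log_odds, odds, prob = predict(*person)
    print(f"log odds={log_odds:.4f}  odds={odds:.3f}  probability={prob:.3f}")
```

For these two profiles the predicted probabilities come out at roughly 0.71 and 0.30, so the 55-year-old diabetic man is far more likely to be on a statin than the 52-year-old non-diabetic woman.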
Produce odds ratio estimates of statin use for the following (Practice):
Age (per year) =
Age (per 10 years) =
Female gender =
History of diabetes =
Produce odds ratio estimates of statin use for the following (Answer):
Age (per year) = exp(0.036) = 1.04
Age (per 10 years) = exp(10 × 0.036) = 1.43
Female gender = exp(-0.530) = 0.59
History of diabetes = exp(1.067) = 2.91
Interpret odds ratio estimates of statin use for the following (Practice):
Age (per 10 years) = exp(10 × 0.036) = 1.43
History of diabetes = exp(1.067) = 2.91
Interpret odds ratio estimates of statin use for the following (Answer):
Age (per 10 years) = exp(10 × 0.036) = 1.43
For every 10-year increase in age, the adjusted odds of being on a statin drug increase 1.43-fold.
History of diabetes = exp(1.067) = 2.91
Persons with diabetes have 2.91 times the adjusted odds of being on a statin drug compared to persons without diabetes.
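Rescaling an odds ratio to a different unit of change (for example, per 10 years instead of per year) is done on the log-odds scale: multiply the coefficient by the number of units, then exponentiate. The sketch below uses the statin coefficients; the helper name `odds_ratio` is illustrative.

```python
import math


def odds_ratio(b: float, units: float = 1.0) -> float:
    """Odds ratio for a `units`-sized increase in a predictor with coefficient b."""
    return math.exp(b * units)


print(f"age, per year:       {odds_ratio(0.036):.2f}")
print(f"age, per 10 years:   {odds_ratio(0.036, 10):.2f}")
print(f"female gender:       {odds_ratio(-0.530):.2f}")
print(f"history of diabetes: {odds_ratio(1.067):.2f}")

# The 10-year odds ratio is the 1-year odds ratio raised to the 10th power,
# not the 1-year odds ratio multiplied by 10.
print(f"(per-year OR) ** 10: {odds_ratio(0.036) ** 10:.2f}")
```

An odds ratio below 1 (female gender here) indicates lower adjusted odds of statin use in the group coded 1.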
SECTION 6.9 SPSS for Logistic Regression Analysis
Learning Outcome: Use SPSS to fit and interpret a logistic regression model
SPSS: Analyze > Regression > Binary Logistic, then specify the Dependent Variable and the Covariates
SPSS: Analyze > Descriptive Statistics > Crosstabs, with Row = Hx diabetes and Col = Statin use
Odds Ratio = (odds of exposure among cases) / (odds of exposure among controls) = (17 / 88) / (24 / 372) = 0.193 / 0.0645 = 2.99
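The crosstab odds ratio can be checked by hand or with a few lines of code. The sketch below assumes the four cell counts are read as follows: 17 statin users with diabetes, 88 statin users without, 24 non-users with diabetes, and 372 non-users without; the function name `crude_odds_ratio` is illustrative.

```python
def crude_odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unadjusted odds ratio from the four cells of a 2 x 2 table."""
    odds_cases = exposed_cases / unexposed_cases           # odds of exposure among cases
    odds_controls = exposed_controls / unexposed_controls  # odds of exposure among controls
    return odds_cases / odds_controls


# Exposure = history of diabetes; cases = statin users; controls = non-users
crude = crude_odds_ratio(17, 88, 24, 372)
print(f"crude odds ratio: {crude:.2f}")
```

This crude estimate is close to the adjusted odds ratio for history of diabetes from the logistic model, exp(1.067) = 2.91, which suggests the other covariates change the diabetes association only slightly.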