Survival Analysis
Bandit Thinkhamrop, Ph.D. (Statistics)
Department of Biostatistics and Demography, Faculty of Public Health, Khon Kaen University
Begin at the conclusion
Type of the study outcome: key for selecting appropriate statistical methods
• Study outcome
  • Dependent variable or response variable
  • Focus on the primary study outcome if there is more than one
• Type of the study outcome
  • Continuous
  • Categorical (dichotomous, polytomous, ordinal)
  • Numerical (Poisson) count
  • Event-free duration
The outcome determines the statistics
• Continuous: mean, median; linear regression
• Categorical: proportion (prevalence or risk); logistic regression
• Count: rate per “space”; Poisson regression
• Survival: median survival, risk of events at T(t); Cox regression
Statistics quantify errors for judgments
• Parameter estimation [95% CI]
• Hypothesis testing [P-value]
Back to the conclusion
• Appropriate statistical methods
  • Continuous: mean, median
  • Categorical: proportion (prevalence or risk)
  • Count: rate per “space”
  • Survival: median survival, risk of events at T(t)
• Magnitude of effect
• 95% CI: answer the research question based on the lower or upper limit of the CI
• P-value
Study outcome
• Survival outcome = event-free duration
  • Event (1 = Yes; 0 = Censored)
  • Duration, or length of time, between:
    • Start date
    • End date
• At the start, no one has had the event (event = 0) at time t(0)
• At any point after the start the event can occur; this is a failure (event = 1) at time t(t)
• If the event has not occurred by the end of the study period, the observation is censored (event = 0)
• Thus the duration is either ‘time-to-event’ or ‘time-to-censoring’
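A minimal sketch of how one survival record (duration, event) could be built from a start date and an end date. The dates and the helper name `follow_up` are hypothetical, chosen only for illustration, and durations are counted in days.

```python
from datetime import date

def follow_up(start, end, had_event):
    """Return (duration_in_days, event), with event 1 = failure, 0 = censored."""
    duration = (end - start).days
    return duration, 1 if had_event else 0

# Hypothetical subjects: (start date, end date, did the event occur?)
subjects = [
    (date(2010, 1, 1), date(2010, 3, 2), True),     # time-to-event
    (date(2010, 1, 1), date(2010, 12, 31), False),  # time-to-censoring
]

records = [follow_up(start, end, had_event) for start, end, had_event in subjects]
print(records)
```

Both kinds of record carry a duration; only the event flag tells whether that duration ended in the event or in censoring.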
Censoring
• Censored data = incomplete ‘time to event’ data
• In the presence of censoring, the ‘time to event’ is not known
• The duration indicates that no event has occurred from the start date up to the last date assessed or observed, a.k.a. the end date
• The end date could be
  • End of the study
  • Last observation prior to the end of the study, due to
    • Loss to follow-up
    • Withdrawn consent
    • Competing events that prevent progression to the event under observation
    • Changes in explanatory variables that make the subject irrelevant to the occurrence of the event under observation
Magnitude of effects
• Median survival
• Survival probability
• Hazard ratio
SURVIVAL ANALYSIS
Study aims:
• Median survival of liver cancer
• Survival probability
  • Five-year survival of liver cancer
  • Five-year survival rate of liver cancer
• Hazard ratio
  • Factors affecting liver cancer survival
  • Effect of chemotherapy on liver cancer survival
SURVIVAL ANALYSIS
Event
• Negative: dead, infection, relapsed, etc.
• Positive: cured, improved, conception, discharged, etc.
• Neutral: smoking cessation, etc.
Natural History of Cancer
Accrual, Follow-up, and Event
[Figure: timeline for subjects ID 1 to 6 from the beginning to the end of the study, showing the start of accrual, the recruitment period, the follow-up period, and the end of follow-up; subjects 3 and 4 are marked Dead.]
[Figure: follow-up of subjects ID 1 to 6 against time since the beginning of the study (axis 0 to 4): 48, 22, 14, 40, 26, and 13 months.]
The data: 48+, 22+, 14 (Dead), 40 (Dead), 26+, 13+ (+ = censored)
DATA
ID  SURVIVAL TIME (Months)  OUTCOME AT THE END OF THE STUDY                 EVENT
1   48                      Still alive at the end of the study             Censored
2   22                      Dead due to accident                            Censored
3   14                      Dead caused by the disease under investigation  Dead
4   40                      Dead caused by the disease under investigation  Dead
5   26                      Still alive at the end of the study             Censored
6   13                      Lost to follow-up                               Censored
DATA
ID  TIME  EVENT     EVENT (coded)
1   48    Censored  0
2   22    Censored  0
3   14    Dead      1
4   40    Dead      1
5   26    Censored  0
6   13    Censored  0
ANALYSIS
ID     1   2   3   4   5   6
TIME   48  22  14  40  26  13
EVENT  0   0   1   1   0   0
• Prevalence = 2/6
• Incidence density = 2/163 person-months
• Proportion surviving at month ‘t’
• Median survival time
RESULTS
ID     1   2   3   4   5   6
TIME   48  22  14  40  26  13
EVENT  0   0   1   1   0   0
• Incidence density = 1.2 per 100 person-months (95% CI: 0.1 to 4.4)
• Proportion surviving at 24 months = 80% (95% CI: 20 to 97)
• Median survival time = 40 months (95% CI: 14 to 48)
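The point estimates for the six-subject data can be checked with a few lines of arithmetic. This is only a sketch in Python (not the Stata workflow used later in the slides); the variable names are illustrative, and the confidence intervals are not reproduced here.

```python
# Six-subject example: follow-up time in months and event indicator (1 = dead, 0 = censored)
times = [48, 22, 14, 40, 26, 13]
events = [0, 0, 1, 1, 0, 0]

n_events = sum(events)                 # number of deaths
person_months = sum(times)             # total time at risk
proportion_dead = n_events / len(events)
rate_per_100 = 100 * n_events / person_months

print(f"Deaths / subjects: {n_events}/{len(events)} = {proportion_dead:.2f}")
print(f"Incidence density: {n_events}/{person_months} person-months "
      f"= {rate_per_100:.1f} per 100 person-months")
```

The denominator of the incidence density is the sum of all follow-up times, censored or not, which is why censored subjects still contribute information.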
Type of Censoring
1) Left censoring: the patient experiences the event in question before the beginning of the study observation period.
2) Interval censoring: the patient is followed for a while, goes away for a while, and then returns to continue being studied.
3) Right censoring:
   a) Single censoring: the patient does not experience the event during the study observation period.
   b) The patient is lost to follow-up within the study period.
   c) The patient experiences the event after the observation period.
   d) Multiple censoring: the patient may experience the event multiple times after study observation ends, when the event in question is not death.
Summary description of survival data set
• stdes
• This command describes summary information about the data set: the number of subjects, records, time at risk, failure events, etc.
Computation of S(t)
1) Suppose the study time is divided into periods, indexed by the letter t.
2) The survivorship probability is computed by multiplying together the proportions of people surviving each period of the study.
3) The proportion surviving a period is one minus the conditional probability of the failure event in that period.
4) The product of these quantities constitutes the survivorship function.
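The four steps above can be written as one product. The symbols q_j (conditional probability of failure in period j), d_j (failures in period j), and n_j (number at risk at the start of period j) are notation assumed here for illustration. The worked line uses the six-subject example, where deaths occur at month 14 (5 at risk) and month 40 (2 at risk); periods without deaths contribute a factor of 1.

```latex
S(t) = \prod_{j=1}^{t} \left(1 - q_j\right), \qquad q_j = \frac{d_j}{n_j}

S(40) = \left(1 - \tfrac{1}{5}\right)\left(1 - \tfrac{1}{2}\right) = 0.8 \times 0.5 = 0.4
```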
Kaplan-Meier Methods
Kaplan-Meier survival curve
Median survival time
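Survival Function
• The number in the risk set is used as the denominator.
• For the numerator, the number dying in period t is subtracted from the number in the risk set.
• The product of these ratios over the study time is the survival function: $S(t) = \prod_{j \le t} \frac{n_j - d_j}{n_j}$, where $n_j$ is the number in the risk set and $d_j$ the number dying in period $j$.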
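A minimal sketch of this product-limit calculation, applied to the six-subject example from the earlier slides. The function names are illustrative and not part of any library; the median is taken here as the first event time at which the estimate falls to 0.5 or below.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    data = list(zip(times, events))
    curve, surv = [], 1.0
    for t in sorted({time for time, event in data if event == 1}):
        at_risk = sum(1 for time, _ in data if time >= t)           # risk set (denominator)
        deaths = sum(1 for time, event in data if time == t and event == 1)
        surv *= (at_risk - deaths) / at_risk                        # ratio for this period
        curve.append((t, surv))
    return curve

def survival_at(curve, t):
    """Estimated S(t): the value after the last event time not exceeding t."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s

def median_survival(curve):
    """First event time at which S(t) <= 0.5, or None if never reached."""
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None

times = [48, 22, 14, 40, 26, 13]
events = [0, 0, 1, 1, 0, 0]
curve = kaplan_meier(times, events)
print(curve, survival_at(curve, 24), median_survival(curve))
```

Censored subjects never appear in the numerator as deaths, but they stay in the risk set until their censoring time.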
Survival experience
Survival curve more than one group
Comparing survival between groups
ID  TIME  DEAD
1   48    0
2   22    0
3   14    1
4   40    1
5   26    0
6   13    0
7   13    0
8   6     1
9   12    1
10  14    1
11  22    0
12  13    0
DRUG: treatment group indicator (drug 1 or drug 0)
Kaplan-Meier survival curve
[Figure: Kaplan-Meier survival estimates, by drug. x-axis: analysis time (0 to 60); y-axis: survival probability (0.00 to 1.00); one curve for drug 1 and one for drug 0.]
Log-rank test
• t = Time
• n = Number at risk for both groups at time t
• n1 = Number at risk for group 1 at time t
• n2 = Number at risk for group 2 at time t
• d = Dead for both groups at time t
• c = Censored for both groups at time t
• O1 = Number of dead for group 1 at time t
• O2 = Number of dead for group 2 at time t
• E1 = Number of expected dead for group 1 at time t
• E2 = Number of expected dead for group 2 at time t
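With the notation above, the expected deaths at each time t follow from sharing the d observed deaths between the groups in proportion to their numbers at risk. The second line is one common hand-calculation approximation to the test statistic (referred to a chi-square distribution with 1 degree of freedom); it is given here as an assumption, since the slide does not state which form it uses.

```latex
E_1 = \frac{d \, n_1}{n}, \qquad E_2 = \frac{d \, n_2}{n}, \qquad n = n_1 + n_2

\chi^2 \approx \frac{\left(\sum_t O_1 - \sum_t E_1\right)^2}{\sum_t E_1}
             + \frac{\left(\sum_t O_2 - \sum_t E_2\right)^2}{\sum_t E_2}
```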
Log-rank test example (+ = censored)
• DRUG 1 = 48+, 22+, 26+, 13+, 14, 40
• DRUG 0 = 13+, 6, 12, 14, 22, 13
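A sketch of the log-rank calculation for the two groups exactly as listed on this slide. The function `logrank` is illustrative, not a library call. It accumulates observed and expected deaths at each distinct death time and, as an assumption about the variant wanted, uses the hypergeometric variance form of the statistic with a 1-degree-of-freedom chi-square p-value.

```python
import math

# (time, dead) pairs from the slide; dead = 0 marks a censored time ("+")
drug1 = [(48, 0), (22, 0), (26, 0), (13, 0), (14, 1), (40, 1)]
drug0 = [(13, 0), (6, 1), (12, 1), (14, 1), (22, 1), (13, 1)]

def logrank(group1, group2):
    """Two-group log-rank test; returns (O1, E1, O2, E2, chi2, p)."""
    pooled = [(t, dead, 1) for t, dead in group1] + [(t, dead, 2) for t, dead in group2]
    o1 = o2 = e1 = e2 = var = 0.0
    for t in sorted({time for time, dead, _ in pooled if dead == 1}):
        n1 = sum(1 for time, _, g in pooled if time >= t and g == 1)   # at risk, group 1
        n2 = sum(1 for time, _, g in pooled if time >= t and g == 2)   # at risk, group 2
        d1 = sum(1 for time, dead, g in pooled if time == t and dead == 1 and g == 1)
        d2 = sum(1 for time, dead, g in pooled if time == t and dead == 1 and g == 2)
        n, d = n1 + n2, d1 + d2
        o1 += d1
        o2 += d2
        e1 += d * n1 / n            # expected deaths in group 1 at time t
        e2 += d * n2 / n            # expected deaths in group 2 at time t
        if n > 1:
            var += n1 * n2 * d * (n - d) / (n * n * (n - 1))
    chi2 = (o1 - e1) ** 2 / var
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return o1, e1, o2, e2, chi2, p

o1, e1, o2, e2, chi2, p = logrank(drug1, drug0)
print(f"O1={o1:.0f} E1={e1:.2f} O2={o2:.0f} E2={e2:.2f} chi2={chi2:.2f} p={p:.4f}")
```

A subject censored at time t is counted as still at risk at t, which is the usual convention when a censoring time ties with a death time.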
Hazard Function
Survival Function vs Hazard Function
• Cumulative hazard from survival: $H(t) = -\ln S(t)$
• Survival from cumulative hazard: $S(t) = \exp(-H(t))$
Hazard rate
• The conditional probability of the event under study, provided the patient has survived up to and including that time period
• Sometimes called the intensity function, the failure rate, or the instantaneous failure rate
Formulation of the hazard rate
• The hazard rate can vary from 0 to infinity.
• It can increase, decrease, or remain constant over time.
• It can become the focal point of much survival analysis.
Cox Regression
• The Cox model presumes that the ratio of the hazard rate to a baseline hazard rate is an exponential function of the parameter vector.
• $h(t) = h_0(t) \exp(b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_p X_p)$
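A small sketch of the conversion between the two functions. The survival values 0.8 and 0.4 are the Kaplan-Meier estimates from the six-subject example; the function names are illustrative.

```python
import math

def cumulative_hazard(survival):
    """H(t) = -ln(S(t))."""
    return -math.log(survival)

def survival_from_hazard(cum_hazard):
    """S(t) = exp(-H(t))."""
    return math.exp(-cum_hazard)

for s in (0.8, 0.4):
    h = cumulative_hazard(s)
    print(f"S = {s:.1f} -> H = {h:.4f} -> back to S = {survival_from_hazard(h):.1f}")
```

As survival falls from 1 toward 0, the cumulative hazard rises from 0 without an upper bound.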
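For reference, the standard textbook definition of the hazard rate is sketched below; T denotes the survival time, f(t) its density, and S(t) the survival function. This notation is assumed here and is not taken from the slide.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{P\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
     = \frac{f(t)}{S(t)}, \qquad H(t) = \int_0^t h(u)\, du
```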
Hazard ratio
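Under the Cox model the baseline hazard cancels when two covariate patterns are compared, so the hazard ratio depends only on the coefficients. The sketch below illustrates this; the coefficient values and covariates (treatment, age) are hypothetical and not estimated from any data in these slides.

```python
import math

def hazard_ratio(coefs, x_a, x_b):
    """HR of pattern x_a versus x_b: exp(sum of b_k * (x_a_k - x_b_k))."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(coefs, x_a, x_b)))

# Hypothetical coefficients: b1 for treatment (1 = treated), b2 for age in years
coefs = [-0.7, 0.03]

hr_treatment = hazard_ratio(coefs, x_a=[1, 60], x_b=[0, 60])   # treated vs untreated, same age
hr_ten_years = hazard_ratio(coefs, x_a=[0, 70], x_b=[0, 60])   # 10 years older, same treatment
print(f"HR treatment = {hr_treatment:.2f}, HR per 10 years of age = {hr_ten_years:.2f}")
```

A hazard ratio below 1 indicates a lower hazard in the first pattern, above 1 a higher hazard, and exactly 1 no difference.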
Testing the Adequacy of the model
1. We save the Schoenfeld residuals of the model and the scaled Schoenfeld residuals.
2. For censored persons, the value of the residual is set to missing.
(Borrowed from Professor Robert A. Yaffee)
A graphical test of the proportional hazards assumption
• A graph of the log hazard would reveal two lines over time, one for the baseline hazard (when x = 0) and the other for x = 1
• The difference between these two curves should be constant over time (= B)
• If we plot the Schoenfeld residuals against the line y = 0, the best-fitting line should be parallel to y = 0.
(Borrowed from Professor Robert A. Yaffee)
Graphical tests
• Criteria of adequacy: the residuals, particularly the rescaled residuals, plotted against time should show no trend (slope) and should be more or less constant over time.
(Borrowed from Professor Robert A. Yaffee)
Other issues
• Time-varying covariates
• Interactions may be plotted
• Conditional proportional hazards models: stratification of the model may be performed. The stphtest should then be performed for each stratum.
(Borrowed from Professor Robert A. Yaffee)
Suggested Readings for beginners
Suggested Readings for advanced learners
Survival analysis in practice
• For what type of research question should survival analysis be used?
Stata for one-group survival analysis
• stset time, failure(event)
• stdescribe
• tab event
• stsum
• strate
• stci
• sts list, at(12 24)
Stata for one-group survival analysis (cont.)
• sts g, atrisk
• sts g, lost
• sts g, enter
• sts g, risktable
• sts g, cumhaz ci
• sts g, hazard
Stata for multiple-group survival analysis
• stset time, failure(event)
• stdescribe
• stsum, by(group)
• sts test group, wilcoxon
• strate group
• stci, by(group)
• sts g, by(group) atrisk
• sts g, by(group) risktable
• sts g, by(group) cumhaz lost
• sts g, by(group) hazard ci
Stata for multiple-group survival analysis (cont.)
• sts list, by(group) at(12 24) compare
• ltable group, interval(#)
• ltable group, graph
• ltable group, hazard
• stmh group, by(strata)
• stmc group
• stcox group
• stir group
Stata for Model Fitting
• Continuous covariate
  • xtile newvar = varlist, nq(4)
  • tabstat varlist, stat(n min max) by(newvar)
  • xi: stcox i.newvar
  • stsum, by(newvar)
• Categorical covariate
  • tab exposure outcome, col
  • xi: stcox i.exposure
Sample size for Cox Model
• stpower cox, failprob(.2) hratio(0.1 0.3) sd(.3) r2(.1) power(0.8 0.9) hr
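To show what goes into such a calculation, here is a sketch of the events formula of Hsieh and Lavori (2000) for a Cox model with a non-binary covariate. Treating this as the formula behind the command above is an assumption, and Stata's rounding and reporting may differ. The Python function names are illustrative; the arguments mirror the options shown (hazard ratio, covariate standard deviation, R-squared with the other covariates, power, and overall failure probability), with a two-sided alpha of 0.05 assumed.

```python
import math
from statistics import NormalDist

def cox_required_events(hratio, sd, r2=0.0, power=0.8, alpha=0.05):
    """Events needed: (z_{1-alpha/2} + z_{power})^2 / (sd^2 * ln(HR)^2 * (1 - R^2))."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z ** 2 / (sd ** 2 * math.log(hratio) ** 2 * (1 - r2))

def cox_sample_size(hratio, sd, failprob, r2=0.0, power=0.8, alpha=0.05):
    """Subjects needed so that the expected number of events reaches the target."""
    events = cox_required_events(hratio, sd, r2=r2, power=power, alpha=alpha)
    return math.ceil(events / failprob)

# One combination from the slide's option lists: HR 0.3, power 0.8
events_needed = cox_required_events(hratio=0.3, sd=0.3, r2=0.1, power=0.8)
subjects_needed = cox_sample_size(hratio=0.3, sd=0.3, failprob=0.2, r2=0.1, power=0.8)
print(f"events = {events_needed:.1f}, subjects = {subjects_needed}")
```

The formula makes the trade-offs visible: a hazard ratio closer to 1, a smaller covariate spread, or a higher correlation with other covariates all increase the number of events required, and a lower failure probability increases the number of subjects needed to observe them.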