Data structure for a discrete-time event history analysis


Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.

Overview
• Structure of most survey data: One record per respondent
• Discrete-time event history analysis requires separate records for each person-time unit at risk of the event
• Review: How to create one record per spell
• How to create one record per person-time unit
  – Components of the dependent variable
  – Fixed characteristics
  – Time-varying characteristics
Event history analysis: discrete time data

Data preparation for an event history
• Survey data often contain one record per respondent
• Continuous-time event history data contain one record per spell
• Discrete-time event history analysis requires one record per person-time unit within each spell
  – E.g., one record for each person-month at risk of divorce, within each spell at risk of divorce

Source data from survey: 1 record per respondent

ID  Date of  Date of 1st  Date of 1st  Date of 2nd  Date of 2nd  Date of  Date 1st  Date last  Gender  1st child's  2nd child's
    birth    marriage     divorce      marriage     divorce      death    observed  observed           birth        birth
1   2/1/52   .            .            .            .            .        7/15/85   10/1/10    F       .            .
2   7/15/69  6/22/10      .            .            .            .        9/21/85   11/5/10    M       .            .
3   3/1/65   8/1/90       1/1/97       10/1/04      .            .        10/8/85   5/1/05     M       12/5/95      .
4   3/1/42   6/1/63       .            .            .            10/1/02  12/2/85   10/2/02    F       9/21/64      5/11/67

Example timelines for study of divorce
M = Married; D = Divorced; L = Lost to follow-up; O = Censored by end of study; X = Died
• Case 1: Never married -> no spells
• Case 2: Married once, censored by end of survey (timeline: M -> O)
• Case 3: Married twice, lost to follow-up before end of survey (timeline: M -> D -> M -> L)
• Case 4: Married once, died before end of survey (timeline: M -> X)
• Not married -> not at risk of divorce -> not part of a spell
[Figure: timelines for Cases 1 through 4, each running to the end of the observation period]

Continuous-time event history data
• One record for each period at risk (spell)
  – Duration of overall spell
  – Event indicator at end of spell

ID  Spell #       Date spell  Duration of   Status at     Divorce event  Age first       Age at start    Age last        Gender  # kids at
    (marriage #)  started     spell (mos.)  end of spell  indicator      observed (yrs)  of spell (yrs)  observed (yrs)          start of spell
2   1             6/22/10     3.5           0             0              16              40              41              male    0
3   1             8/1/90      76.5          1             1              20              25              45              male    0
3   2             10/1/04     6.5           2             0              20              39              45              male    1
4   1             6/1/63      474.5         3             0              43              21              60              female  0

Event history timeline: Discrete time specification
• Case 2, continuous-time version: One four-month spell (married 6/22/2010, last surveyed 11/5/2010)
• Case 2, discrete-time version: Four person-month units
  – Each person-month unit becomes one record -> unit of analysis.
  – All records for each spell include respondent ID and other characteristics.
[Figure: Case 2 timeline from marriage to end of survey, divided into the 1st, 2nd, 3rd, and 4th person-months; O = Censored]

Discrete-time data set: ID codes on person-time records
• Each person-month record carries the respondent ID
• Each record within a given spell also includes the spell # for that respondent

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #
    (marriage #)  w/in spell
2   1             1
2   1             2
2   1             3
2   1             4
3   1             1
3   1             2
3   1             3
3   1             …
3   1             77
3   2             1
3   2             2
3   2             …
3   2             7

Record number within spell
• Each month in a spell will generate one person-month record, e.g.,
  – respondent #2 is observed for 4 months -> 4 person-month records
  – respondent #3 contributes a total of 84 records
    • 77 in his first spell
    • 7 in his second spell

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #
    (marriage #)  w/in spell
2   1             1
2   1             2
2   1             3
2   1             4
3   1             1
3   1             2
3   1             3
3   1             …
3   1             77
3   2             1
3   2             2
3   2             …
3   2             7
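The expansion from spell records to person-month records can be sketched in a few lines of Python. This is only an illustration, not the procedure used to build the slides: the dictionary keys (`id`, `spell`, `n_months`, `record`) and the function name `expand_to_person_months` are hypothetical, and the three spells are the ones shown in the per-spell table.

```python
# Spell-level records from the slide: respondent ID, spell #, duration in months.
spells = [
    {"id": 2, "spell": 1, "n_months": 4},
    {"id": 3, "spell": 1, "n_months": 77},
    {"id": 3, "spell": 2, "n_months": 7},
]


def expand_to_person_months(spell_rows):
    """Return one record per person-month, numbered 1..n_months within each spell."""
    records = []
    for row in spell_rows:
        for record_no in range(1, row["n_months"] + 1):
            # Every person-month record carries the respondent ID and spell #.
            records.append(
                {"id": row["id"], "spell": row["spell"], "record": record_no}
            )
    return records


person_months = expand_to_person_months(spells)
records_for_3 = [r for r in person_months if r["id"] == 3]
print(len(person_months), len(records_for_3))  # 88 84
```

Respondent #3's 84 records are the 77 from his first spell plus the 7 from his second, matching the counts on the slide.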

Month counter within spell
The "month # within spell" counter indicates the start time of the person-month at risk for that record. E.g., the first record for a given spell starts at baseline (time point 0).

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #    Month #
    (marriage #)  w/in spell  within spell
2   1             1           0
2   1             2           1
2   1             3           2
2   1             4           3
3   1             1           0
3   1             2           1
3   1             3           2
3   1             …           …
3   1             77          76
3   2             1           0
3   2             2           1
3   2             …           …
3   2             7           6

Duration measure for each record within spell
The duration measure equals 1 time unit for every person-time record within a given spell, except the last record in the spell, where it equals 0.5.

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #    Month #       Person-months
    (marriage #)  w/in spell  within spell  w/in record
2   1             1           0             1
2   1             2           1             1
2   1             3           2             1
2   1             4           3             .5
3   1             1           0             1
3   1             2           1             1
3   1             3           2             1
3   1             …           …             1
3   1             77          76            .5
3   2             1           0             1
3   2             2           1             1
3   2             …           …             1
3   2             7           6             .5
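As a rough sketch of the two rules above (month counter = start time of the record, duration = 1 except 0.5 on the last record), the following Python function builds both columns for one spell. The function and key names are hypothetical. Summing the per-record durations reproduces the spell durations in the continuous-time table (3.5, 76.5, and 6.5 months).

```python
def person_month_durations(n_records):
    """Month counter and person-months at risk for each record in one spell."""
    rows = []
    for record_no in range(1, n_records + 1):
        is_last = record_no == n_records
        rows.append(
            {
                "record": record_no,
                # Start time of this person-month; the first record starts at 0.
                "month_start": record_no - 1,
                # Every record contributes 1 month except the last, which contributes 0.5.
                "person_months": 0.5 if is_last else 1.0,
            }
        )
    return rows


case_2_spell_1 = person_month_durations(4)
total_months = sum(r["person_months"] for r in case_2_spell_1)
print(total_months)  # 3.5
```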

Status indicator for each record within spell
The indicator for status at end of record equals 0 for all person-time records within a given spell except the last one, because by definition those records end in censoring (the spell is not yet complete).

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #    Month #       Person-months  Status at
    (marriage #)  w/in spell  within spell  w/in record    end of record
2   1             1           0             1              0
2   1             2           1             1              0
2   1             3           2             1              0
2   1             4           3             .5             0
3   1             1           0             1              0
3   1             2           1             1              0
3   1             3           2             1              0
3   1             …           …             1              0
3   1             77          76            .5             1
3   2             1           0             1              0
3   2             2           1             1              0
3   2             …           …             1              0
3   2             7           6             .5             2

Status indicator for last record within spell
The indicator for status at end of record for the last person-time record within each spell takes on the value of the status indicator for the overall spell.

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #    Month #       Person-months  Status at
    (marriage #)  w/in spell  within spell  w/in record    end of record
2   1             1           0             1              0
2   1             2           1             1              0
2   1             3           2             1              0
2   1             4           3             .5             0
3   1             1           0             1              0
3   1             2           1             1              0
3   1             3           2             1              0
3   1             …           …             1              0
3   1             77          76            .5             1
3   2             1           0             1              0
3   2             2           1             1              0
3   2             …           …             1              0
3   2             7           6             .5             2

Event indicator for each record within spell

One record per spell

ID  Spell #       Duration of   Status at     Divorce
    (marriage #)  spell (mos.)  end of spell  indicator
2   1             4             0             0
3   1             77            1             1
3   2             7             2             0

One record per person-month

ID  Spell #       Record #    Month #       Divorce indicator
    (marriage #)  w/in spell  within spell  for record
2   1             1           0             0
2   1             2           1             0
2   1             3           2             0
2   1             4           3             0
3   1             1           0             0
3   1             2           1             0
3   1             3           2             0
3   1             …           …             0
3   1             77          76            1
3   2             1           0             0
3   2             2           1             0
3   2             …           …             0
3   2             7           6             0
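The pattern in the status and divorce indicator columns (0 on every record except the last, which inherits the spell-level value) can be written as a short Python sketch. The function name `add_indicators` and the keys `status` and `divorce` are hypothetical labels chosen for this illustration.

```python
def add_indicators(n_records, spell_status, spell_event):
    """Status and event indicators for each person-month record in one spell."""
    rows = []
    for record_no in range(1, n_records + 1):
        is_last = record_no == n_records
        rows.append(
            {
                "record": record_no,
                # Records before the last end in censoring: the spell is not yet complete.
                "status": spell_status if is_last else 0,
                # The event can only be recorded on the last record of the spell.
                "divorce": spell_event if is_last else 0,
            }
        )
    return rows


# Respondent #3: first spell ends in divorce, second spell is lost to follow-up.
first_spell = add_indicators(77, spell_status=1, spell_event=1)
second_spell = add_indicators(7, spell_status=2, spell_event=0)
print(first_spell[-1]["divorce"], second_spell[-1]["status"])  # 1 2
```

With this layout, at most one record per spell has a divorce indicator of 1, so the sum of the indicator over a spell's records equals the spell-level event indicator.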

Fixed covariates for each person-time record
Age, number of children at start of spell, and gender do not change during the course of a spell, so they have the same value for each person-time record within a given spell.

ID  Spell #       Record #    Month #       Divorce indicator  Age at start    Gender  # children at
    (marriage #)  w/in spell  within spell  for record         of spell (yrs)          start of spell
2   1             1           0             0                  40              male    0
2   1             2           1             0                  40              male    0
2   1             3           2             0                  40              male    0
2   1             4           3             0                  40              male    0
3   1             1           0             0                  25              male    0
3   1             2           1             0                  25              male    0
3   1             3           2             0                  25              male    0
3   1             …           …             0                  25              male    0
3   1             77          76            1                  25              male    0
3   2             1           0             0                  39              male    1
3   2             2           1             0                  39              male    1
3   2             …           …             0                  39              male    1
3   2             7           6             0                  39              male    1
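One way to picture fixed covariates is that the spell-level values are simply copied onto every person-month record. The sketch below does that for respondent #3's second spell; the key names and the `attach_fixed_covariates` helper are hypothetical.

```python
# Spell-level record for respondent #3, spell 2, using values from the slide.
spell = {
    "id": 3,
    "spell": 2,
    "n_months": 7,
    "age_at_start": 39,
    "gender": "male",
    "kids_at_start": 1,
}

FIXED = ("age_at_start", "gender", "kids_at_start")


def attach_fixed_covariates(spell_row, fixed_names=FIXED):
    """Copy the spell's fixed covariates onto each of its person-month records."""
    fixed = {name: spell_row[name] for name in fixed_names}
    return [
        {"id": spell_row["id"], "spell": spell_row["spell"], "record": n, **fixed}
        for n in range(1, spell_row["n_months"] + 1)
    ]


records = attach_fixed_covariates(spell)
print(len(records), {r["age_at_start"] for r in records})  # 7 {39}
```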

Example timelines for number of children as time-varying covariate in study of divorce
Columns reordered into chronological order

ID  Date of  Date 1st  Date of 1st  1st child's  2nd child's  Date of 1st  Date of 2nd  Date of 2nd  Date of  Date last
    birth    observed  marriage     birth        birth        divorce      marriage     divorce      death    observed
3   3/1/65   10/8/85   8/1/90       12/5/95      .            1/1/97       10/1/04      .            .        5/1/05
4   3/1/42   12/2/85   6/1/63       9/21/64      5/11/67      .            .            .            10/1/02  10/2/02

M = Married; D = Divorced; C = Child born; L = Lost to follow-up; O = Censored by end of study; X = Died
• Case 3: M (no kids) -> C -> D -> M -> L
• Case 4: M (no kids) -> C (one kid) -> C (two kids) -> X
[Figure: timelines for Cases 3 and 4 showing marriages, births, and end of observation]

Discrete time with time-varying covariates
• Case 3 has his first child 64 months into his first marriage, and no additional children while observed. # kids at start of record is
  – 0 for his first 63 records of spell 1
  – 1 for records 64 through 77 of spell 1
  – 1 for all records in spell 2

ID  Spell #  Month #     Divorce indicator  # kids at       # kids at
             w/in spell  for record         start of spell  start of record
3   1        0           0                  0               0
3   1        1           0                  0               0
3   1        …           0                  0               0
3   1        64          0                  0               1
3   1        77          1                  0               1
3   2        0           0                  1               1
3   2        …           0                  1               1
3   2        6           0                  1               1

Discrete time with time-varying covariates
• Case 4 has her first child 15 months into her marriage, and a second child in month 47 after marriage. For her the # kids at start of record is
  – 0 for her first 14 records
  – 1 for records 15 through 46
  – 2 for records 47 or higher, all in spell 1

ID  Spell #  Month #     Divorce indicator  # kids at       # kids at
             w/in spell  for record         start of spell  start of record
4   1        0           0                  0               0
4   1        …           0                  0               0
4   1        15          0                  0               1
4   1        …           0                  0               1
4   1        47          0                  0               2
4   1        …           0                  0               2
4   1        474         0                  0               2
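A time-varying covariate such as number of children can be computed record by record. The Python sketch below is one possible way to do it, under a stated assumption: `birth_record_nos` is a hypothetical input holding the first record number at which each child is counted, taken directly from the thresholds in the slide text (64 for Case 3; 15 and 47 for Case 4) rather than derived from the birth dates.

```python
def kids_at_start_of_record(record_no, kids_at_spell_start, birth_record_nos):
    """Number of children at the start of a given person-month record.

    birth_record_nos (assumption): the first record number at which each
    child born during the spell is counted.
    """
    born_so_far = sum(1 for first_record in birth_record_nos if first_record <= record_no)
    return kids_at_spell_start + born_so_far


# Case 3, spell 1: no kids at marriage, one child counted from record 64 on.
case_3 = [kids_at_start_of_record(n, 0, [64]) for n in (1, 63, 64, 77)]
# Case 4, spell 1: children counted from records 15 and 47 on.
case_4 = [kids_at_start_of_record(n, 0, [15, 47]) for n in (1, 15, 46, 47, 474)]
print(case_3, case_4)  # [0, 0, 1, 1] [0, 1, 1, 2, 2]
```

For Case 3's second spell there are no births during the spell, so every record keeps the spell-start value of 1: `kids_at_start_of_record(n, 1, [])` returns 1 for any `n`.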

Presenting information on event history construction: Background work
• Most of the gory details of creating an event history are part of behind-the-scenes work
  – Important to do consistency checks to make sure event histories were created correctly given
    • Original data source of information for timeline construction
    • Type of event under study
    • Fixed covariates
    • Time-varying covariates
  – E.g., correct
    • Number of spells per respondent
    • Number of person-time records for each spell
    • Duration and event indicators for each person-time record
    • Values of fixed and time-varying covariates for each person-time record
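A few of these consistency checks can be automated. The following Python sketch is illustrative only and covers a subset of the checks listed above; the function name `check_spell`, the record keys, and the wording of the problem messages are all hypothetical.

```python
def check_spell(records, expected_n, spell_event, fixed_names):
    """Return a list of problems found in one spell's person-time records."""
    problems = []
    # Number of person-time records should match the spell's duration in time units.
    if len(records) != expected_n:
        problems.append("wrong number of person-time records")
    # Record numbers should run 1, 2, ..., n with no gaps or repeats.
    if [r["record"] for r in records] != list(range(1, len(records) + 1)):
        problems.append("record numbers are not 1..n")
    # Only the last record in a spell may carry the event.
    if any(r["divorce"] != 0 for r in records[:-1]):
        problems.append("event indicator set before the last record")
    if records and records[-1]["divorce"] != spell_event:
        problems.append("last record does not carry the spell's event indicator")
    # Fixed covariates must be constant within the spell.
    for name in fixed_names:
        if len({r[name] for r in records}) > 1:
            problems.append(f"fixed covariate {name} varies within spell")
    return problems


# Respondent #3, spell 1: 77 records, divorce on the last one.
good = [
    {"record": n, "divorce": 1 if n == 77 else 0, "gender": "male"}
    for n in range(1, 78)
]
print(check_spell(good, expected_n=77, spell_event=1, fixed_names=["gender"]))  # []
```

Dropping the last record from `good` would trigger two of the checks: the record count no longer matches, and the final record no longer carries the divorce.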

Presenting information on event history construction
• In the data and methods section, describe:
  – Original data source of information for timeline construction
    • Dates, status, duration of events
  – Type of event under study
  – Unit of person-time (e.g., person-years, person-months)
  – What constitutes censoring
  – Fixed covariates
  – Time-varying covariates
    • Source(s) of information for determining timing of changes in those variables
• See checklist in chapter 17 of Writing about Multivariate Analysis, 2nd Edition for more detail on what to report

Summary
• A discrete-time event history analysis requires a separate record for each person-time unit at risk of the event
• For each respondent, create the correct number of spells
• For each spell, calculate
  – Correct number of person-time units
  – Components of the dependent variable
    • Duration measure
    • Event indicator
  – Fixed characteristics
  – Time-varying characteristics
• In the data and methods section, describe data sources and variables for the event history

Suggested resources
• Allison, P. D. 2010. Survival Analysis Using the SAS System: A Practical Guide, 2nd Edition. Cary, NC: SAS Institute.
• Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press, chapter 17.

Suggested online resources
• Podcast on data structure for a continuous-time event history analysis

Suggested exercises
• Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
  – Question #3a in the problem set for chapter 17
  – Suggested course extensions for chapter 17
    • "Reviewing" exercises #2a through 2h
    • "Applying statistics and writing" exercises #1 and 2a

Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html