What is The Analysis of Longitudinal Survey Data

What is… The Analysis of Longitudinal Survey Data
Paul Lambert, University of Stirling
Prepared for: National Centre for Research Methods, Research Methods Festival, St Catherine’s College, Oxford, 7 July 2010
Also see: www.longitudinal.stir.ac.uk / www.dames.org.uk

So what’s distinct about the analysis of longitudinal survey data?
You already know..
  ➢ Working with (survey) datasets with longitudinal information (data about time) and the specialist techniques of statistical analysis that are appropriate
You maybe don’t realise..
  1) Groups of techniques and data types
  2) Complex data and data management components
July 2010: LDA

1) Types of longitudinal survey data
• Survey resources
  Data analysis is used to give a parsimonious summary of patterns of relations between variables in the survey dataset
• Longitudinal [‘..of or about time..’]
  i. {Analysis is concerned with time}
  ii. Data is concerned with more than one time point
    • [e.g. Taris 2000; Blossfeld and Rohwer 2002]
  iii. Repeated measures over time
    • [e.g. Menard 2002; Martin et al 2006]

Types of data and analysis traditions for longitudinal surveys
cf. www.longitudinal.stir.ac.uk
  0. Temporal effects in cross-sectional data
  1. Repeated cross-sections
  2. Panel datasets
  3. Cohort studies
  4. Event history datasets
  5. Time series analyses

[Data type: 1/6] Temporal effects in single cross-sectional surveys
• Temporal effects are (a) present and (b) of interest in most social science studies
• We can measure differences between people in terms of their age / year of birth
• These matter empirically & are interesting substantively
• But we can’t tell if differences are due to age or period or cohort (or other things that are collinear with these, e.g. life course stage or major events)

Longitudinal statements from cross-sectional data are common...
• We typically fit linear/curvilinear trend lines for time effects
• Treiman (2009: 162): nonlinear specifications of time and age effects
  – Year of birth effect on literacy in China: discontinuity at 1955; curve 1955-1967; knot at 1967

Within yob (year of birth) cohort, Gamma on educ to health is…

  Cohort   Gamma
  20’s     0.15
  30’s     0.28
  40’s     0.22
  50’s     0.23
  60’s     0.22
  70’s     0.15
  80’s     0.10

[Data type: 2/6] Repeated cross-sections: Surveys on same topics, on multiple occasions, to different people
Data example: GHS pooled ‘time-series’ dataset (UKDA, SN: 5664)
Adults aged 25-65 only

Repeated cross sections
  ✓ Easy to communicate & appealing: how things have changed between certain time points
  ✓ Can distinguish any 2 of age / period / cohort
  ✓ Easier to analyse – less data management
However..
  ☹ Don’t get other QnLR attractions (nature of changers; residual heterogeneity; causality; durations)
  ☹ Hidden complications: are sampling methods, variable operationalisations really comparable?
    ▪ More on this below...

Example: Labour Force Survey yearly stats
Percent of UK workers with a higher degree, by employment category and gender (m / f)
Sample size ~35,000 m / 30,000 f each year
(columns in source order; the two Profess. / Non-Prof pairs presumably follow the caption order, m then f)

  Year   Profess.  Non-Prof  Profess.  Non-Prof
  1991   14.4      1.3       11.0      0.6
  1996   19.9      2.5       24.4      2.3
  2001   24.9      3.5       28.3      3.2

LFS and time (example in SPSS from www.longitudinal.stir.ac.uk)

[Data type: 3/6] Panel Datasets
Information collected on the same cases at more than one point in time
  – ‘classic’ longitudinal design
  – incorporates ‘follow-up’, ‘repeated measures’, and ‘cohort’; large and small in scale
  ❖ Several major panel studies in UK, e.g. www.esds.ac.uk/longitudinal
  ❖ Many cross-sectional surveys feature additional panel elements

Illustration: Unbalanced panel
[Table: records indexed by Wave* (N_w=3) and Person (N_p=3), each carrying person-level variables; cell values not recoverable from the extracted text]
*also ‘sweep’, ‘contact’, ..

Complex data example: BHPS panel dataset [SN 5151]

Panel data advantages
• Study ‘changers’ – how many of them, what are they like, what caused change
• Control for individuals’ unknown characteristics (‘residual heterogeneity’)
• Develop a full and reliable life history – e.g. family formation, employment patterns

Example: Panel transitions
Young people’s household circumstance changes by subjective well-being between 1994 and 1995. BHPS youth panel, 11-14 yrs in 1994, row percents.

               Stays happy   Cheers up   Becomes miserable   Stays miserable   N
  HH Stable    54%           19%         10%                 18%               499
  HH Changes   42%           22%         14%                 22%               81

Panel data can be ‘wide’ or ‘long’
• Depends upon the analytical approach
• Wide format is simpler to envisage but analysis will need unbalanced data or missing value imputations
• Long format is harder to manipulate (e.g. to cross-check), but is more flexible in the types of analysis it supports
[Figure: illustration with year labels 1991 to 1996]
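The wide/long distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variable names (`pid`, `inc1991`, ...) and all values are invented, and the slides themselves point to SPSS and Stata examples rather than Python.

```python
# Hypothetical wide-format panel: one row per person, one income column per year.
# All names and values are invented for illustration.
wide = [
    {"pid": 1, "inc1991": 100, "inc1992": 110, "inc1993": 120},
    {"pid": 2, "inc1991": 90, "inc1992": None, "inc1993": 95},    # missed the 1992 wave
    {"pid": 3, "inc1991": None, "inc1992": 70, "inc1993": None},  # observed only once
]


def wide_to_long(rows, stub, id_var="pid"):
    """Turn columns such as inc1991, inc1992 into one record per person-year."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            # Keep only the stub columns, and skip waves with no observation.
            if name.startswith(stub) and value is not None:
                year = int(name[len(stub):])
                long_rows.append({id_var: row[id_var], "year": year, stub: value})
    return long_rows


long_data = wide_to_long(wide, "inc")
for record in long_data:
    print(record)
```

The wide layout has to carry explicit missing values for waves a person missed, whereas the long layout simply has fewer records for that person, which is why it copes more naturally with unbalanced panels.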

Panel models:
Regression-style models with various estimators to recognise the repeated contacts: e.g. random effects; fixed effects; population average
(linear model: influences on GHQ score in the BHPS; Stata examples available via www.dames.org.uk/workshops)
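As a rough sketch of what one of these estimators does, the fixed-effects (‘within’) slope for a single regressor can be computed by demeaning each person’s records and then fitting ordinary least squares. The data below are invented, the function names are hypothetical, and this is not the GHQ model from the slide.

```python
from collections import defaultdict

# Invented long-format records: (person id, x, y). Each person has their own
# intercept (10, 20, 30) but the same within-person slope of 2.
records = [
    (1, 1.0, 12.0), (1, 2.0, 14.0), (1, 3.0, 16.0),
    (2, 2.0, 24.0), (2, 4.0, 28.0),
    (3, 1.0, 32.0), (3, 5.0, 40.0),
]


def ols_slope(pairs):
    """Ordinary least squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


def within_slope(rows):
    """Fixed-effects slope: subtract each person's own means, then pool and fit."""
    groups = defaultdict(list)
    for pid, x, y in rows:
        groups[pid].append((x, y))
    demeaned = []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        demeaned.extend((x - mx, y - my) for x, y in obs)
    return ols_slope(demeaned)


pooled = ols_slope([(x, y) for _, x, y in records])  # ignores the repeated contacts
within = within_slope(records)                        # removes person-level differences
print(pooled, within)
```

In this constructed example the pooled slope is pulled away from 2 by the differences between people, while the within estimator recovers the common slope; random effects and population-average estimators make different assumptions and are not shown here.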

[Data type: 4/6] Cohort Datasets
Information on a group of cases which share a common circumstance, collected repeatedly as they progress through a life course
  – Intuitive type of repeated contact data
    – e.g. ‘7-up’ series
  – Often contributes to cross-cohort comparisons
    – e.g. UK Birth cohort studies in 1946, 1958, 1970 and 2000

Cohort data and analysis in the social sciences
• Many circumstances parallel other panel types:
  ➢ Large scale studies ambitious & expensive
  ➢ Small scale cohorts still quite common…
  ❖ Attrition problems often more severe
  ❖ Considerable study duration limits
  ❖ Glenn (2005) argues that ‘cohort analysis’ should be specifically directed to understanding effects of ageing/progression over time
    • Other uses of cohort data are just = panel data
• It remains hard - even with extensive cohort data - to authoritatively understand ageing effects (age = period – cohort)
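The identity age = period – cohort is what makes linear age, period and cohort effects impossible to separate. A minimal sketch, using invented respondents and arbitrary coefficients, shows that two different sets of coefficients give identical predictions:

```python
# Invented respondents: (survey year, birth year). Age follows by definition.
people = [(2001, 1950), (2001, 1975), (1996, 1950), (1991, 1960)]


def predict(b_age, b_period, b_cohort, period, cohort):
    """Linear prediction from age, period and cohort effects."""
    age = period - cohort  # the identity from the slide
    return b_age * age + b_period * period + b_cohort * cohort


k = 0.7  # an arbitrary shift; any value gives the same result
original = [predict(0.5, 0.2, -0.1, p, c) for p, c in people]
# Add k to the age and cohort effects and subtract it from the period effect.
shifted = [predict(0.5 + k, 0.2 - k, -0.1 + k, p, c) for p, c in people]

for a, b in zip(original, shifted):
    print(round(a, 6), round(b, 6))
```

The shift changes the prediction by k × (age – period + cohort), which is always zero, so the data cannot tell the two coefficient sets apart without further assumptions.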

[Data type: 5/6] Event history data analysis [esp. Blossfeld et al 2007]
Focus shifts to length of time in a ‘state’ – analyse determinants/patterns to time in state(s)
• Data sources are panel / cohort studies, or retrospective interviews (…recall errors..)
  ➢ Analysis of event durations: ‘Event history analysis’; ‘Survival data analysis’; ‘Failure time analysis’; ‘hazards’; ‘risks’; ..
  ➢ Analysis of event patterns: ‘Sequence analysis’; ‘trajectory analysis’; ‘optimal matching analysis’; ‘latent growth curves’
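One standard building block for the analysis of event durations is the Kaplan-Meier estimate of the proportion still ‘in state’ over time. The slides do not name this estimator, so the sketch below is an illustrative choice, with invented durations and a hypothetical function name.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each time where an event occurs.

    durations: time spent in the state for each case
    observed:  True if the exit was seen, False if the spell was censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Cases still in the state just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Exits actually observed at time t (censored spells do not count).
        events = sum(1 for d, seen in zip(durations, observed) if d == t and seen)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
    return curve


durations = [2, 3, 3, 5, 8]                   # invented months in a state
observed = [True, True, False, True, False]   # False = censored spell
curve = kaplan_meier(durations, observed)
for time, prob in curve:
    print(time, round(prob, 3))
```

Censored spells stay in the risk set until they drop out, which is the main thing that distinguishes this from a simple tabulation of completed durations.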

Key to event histories is ‘state space’

Example: Cox regression (SPSS example at www.longitudinal.stir.ac.uk)

[Data type: 6/6] Time series data
Statistical summary of one particular concept, collected at repeated time points from one or more subjects
Examples:
• Unemployment rates by year in UK
• University entrance rates by year by country
Comments:
  – Panel = many variables, few time points = ‘cross-sectional time series’ to economists
  – Time series = few variables, many time points
  – Descriptive analyses – e.g. charts of statistics over time
  – Advanced modelling analyses typically involve including ‘autoregressive’ terms (e.g. lag effects) amongst explanatory factors
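An ‘autoregressive’ term simply means that the previous value of the series appears among the explanatory factors. A minimal sketch, assuming a single lag and an invented series (the function name `fit_ar1` is hypothetical):

```python
def fit_ar1(series):
    """OLS fit of y_t = a + b * y_(t-1), using the lagged series as the regressor."""
    lagged = series[:-1]   # y_(t-1)
    current = series[1:]   # y_t
    n = len(current)
    mx = sum(lagged) / n
    my = sum(current) / n
    sxx = sum((x - mx) ** 2 for x in lagged)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lagged, current))
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Invented yearly series built so that each value is 1 + 0.5 * the previous one.
rates = [0.0]
for _ in range(9):
    rates.append(1.0 + 0.5 * rates[-1])

a, b = fit_ar1(rates)
print(round(a, 6), round(b, 6))
```

Because the series is constructed without noise, the fit recovers the intercept and lag coefficient used to build it; real series would also need attention to trends, seasonality and the error structure.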

…. Six types of data/analysis…!
  0. Temporal effects in cross-sectional data
  1. Repeated cross-sections
  2. Panel datasets
  3. Cohort studies
  4. Event history datasets
  5. Time series analyses

[..and then there’s another thing..]
2. Data management issues
• Working with longitudinal survey data is made more challenging by important issues of ‘data management’
  ➢ Variable operationalisations for comparisons
    e.g. strategies for standardisation, harmonisation
  ➢ Linking datasets internally to a study
  ➢ Linking with other datasets to enhance analysis
    [Value of organising your data and files – e.g. Long, 2009]
  ➢ Recognising data structure in analysis
    e.g. missing data; survey effects; modelling specifications
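Linking datasets internally to a study usually means attaching records from one file (say, households) to records in another (say, persons) through a shared identifier. The sketch below assumes invented identifiers (`pid`, `hid`) and values, and it reports unmatched cases rather than dropping them silently:

```python
# Invented household-level file, keyed by a hypothetical household id.
households = {
    101: {"region": "Scotland"},
    102: {"region": "Wales"},
}

# Invented person-level file for one wave; each person points at a household.
persons = [
    {"pid": 1, "hid": 101, "wave": 1},
    {"pid": 2, "hid": 101, "wave": 1},
    {"pid": 3, "hid": 102, "wave": 1},
    {"pid": 4, "hid": 999, "wave": 1},  # no matching household record
]


def link(person_rows, household_index):
    """Attach household variables to each person; list persons with no match."""
    linked, unmatched = [], []
    for person in person_rows:
        household = household_index.get(person["hid"])
        if household is None:
            unmatched.append(person["pid"])
        else:
            linked.append({**person, **household})
    return linked, unmatched


linked, unmatched = link(persons, households)
print(len(linked), "linked;", "unmatched pids:", unmatched)
```

Checking the unmatched list after every link is a simple habit that catches many data management errors before they reach the analysis.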

Dealing with complex data
In the UK we host many projects and centres which contribute to enabling the analysis of complex longitudinal data for social science research
  – Specifying suitably complex statistical models
    • Examples at the Centre for Multilevel Modelling (‘E-Stat’, a generic tool for specifying advanced models; Realcom, for analysing longitudinal missing data); Lancaster-Warwick-Stirling NCRM Node; ULSC (Essex) on survey design effects
  – Resources on accessing and handling complex data
    • e.g. ESDS; ADMIN Node; Obesity e-lab; DAMES Node
    • ..Session 17 in yesterday’s programme..

My own pet project concerns comparability of variables over time.. (see www.dames.org.uk)

…‘Effect proportional scaling’ using parents’ occupational advantage

3. Some closing comments on the analysis of longitudinal survey data
Why bother with all this..?
  – Focus on change / stability
  – Focus on the life course
    ➢ Distinguish age, period and cohort effects
    ➢ Career trajectories / life course sequences
  – Focus on time / durations
    ➢ Substantive role of durations (e.g. Unemployment)
  – Getting the ‘full picture’
    ➢ Causality and residual heterogeneity
    ➢ Examining multivariate relationships
    ➢ Representative conclusions
[e.g. Abbott 2006; Mayer 2005; Menard 2002; Baltagi 2001; Rose 2000; Dale and Davies 1994; Hannan and Tuma 1979; Moser 1958]

Research traditions
• ‘geographers study space and economists study time’ [adage quoted in Fotheringham et al. 2000: 245]
  ➢ Vast economics literature using techniques for temporal analysis
  ➢ Other social science disciplines to some degree catching up
  ➢ Though methodological research on longitudinal models, and data quality, cross-cuts disciplines [e.g. Dale and Davies, 1994]
• Data expansions c1990 -> more encompassing models; new substantive applications areas
  For example:
  – [Platt 2005] – ethnic minorities’ social mobility 1971-2001
  – [Pahl & Pevalin 2005] – Friendship patterns over time
  – [Verbakel & de Graaf 2008] – spouses’ effect on careers 1941-2003
• …One challenge is getting used to talking about time in a more disciplined way: e.g. traditional sociological characterisations of ‘the past’ and ‘social change’ may not be empirically satisfactory

What’s exciting in the analysis of longitudinal social survey data?
• A personal view:
  By and large, the core analytical & methodological issues have been recognised for some time
  What is exciting is the rapid expansion of secondary quantitative longitudinal data, its quality, its volume and its accessibility
  (a) - new data
  (b) - new tools for accessing, handling and modelling large and complex data

References
• Abbott, A. (2006). 'Mobility: What? When? How?' in Morgan, S. L., Grusky, D. B. and Fields, G. S. (eds.) Mobility and Inequality. Stanford: Stanford University Press.
• Baltagi, B. H. (2001). Econometric Analysis of Panel Data. New York: Wiley.
• Blossfeld, H. P. and Rohwer, G. (2002). Techniques of Event History Modelling: New Approaches to Causal Analysis, 2nd Edition. Mahwah, NJ: Lawrence Erlbaum Associates.
• Blossfeld, H. P., Golsch, K., & Rohwer, G. (2007). Event History Analysis with Stata. New York: Lawrence Erlbaum.
• Davies, R. B. (1994). 'From Cross-Sectional to Longitudinal Analysis' in Dale, A. and Davies, R. B. (eds.) Analysing Social and Political Change: A casebook of methods. London: Sage.
• Fotheringham, A. S., Brunsdon, C., & Charlton, M. (2000). Quantitative Geography: Perspectives on Spatial Data Analysis. London: Sage.
• Glenn, N. D. (2005). Cohort Analysis, 2nd Edition. London: Sage.
• Hannan, M. T., & Tuma, N. B. (1979). Methods for Temporal Analysis. Annual Review of Sociology, 5, 303-328.
• Li, Y., & Heath, A. F. (2008). Socio-Economic Position and Political Support of Black and Ethnic Minority Groups in the United Kingdom, 1972-2005 [computer file]. 2nd Ed. Colchester, Essex: UK Data Archive [distributor], SN: 5666.
• Long, J. S. (2009). The Workflow of Data Analysis using Stata. College Station, Texas: Stata Press.
• Martin, J., Bynner, J., Kalton, G., Boyle, P., Goldstein, H., Gayle, V., Parsons, S. and Piesse, A. (2006). Strategic Review of Panel and Cohort Studies. London: Longview, and www.longviewuk.com/
• Mayer, K. U. (2005). 'Life courses and life chances in a comparative perspective' in Svallfors, S. (ed.) Analyzing Inequality: Life Chances and Social Mobility in Comparative Perspective. Stanford: Stanford University Press.
• Menard, S. (2002). Longitudinal Research, 2nd Edition. London: Sage, Number 76 in Quantitative Applications in the Social Sciences Series.
• Moser, C. A. (1958). Survey Methods in Social Investigation. London: Heinemann.
• Pahl, R., & Pevalin, D. (2005). Between family and friends: a longitudinal study of friendship choice. British Journal of Sociology, 56(3), 433-450.
• Platt, L. (2005). Migration and Social Mobility: The Life Chances of Britain's Minority Ethnic Communities. Bristol: The Policy Press.
• Rose, D. (2000). Researching Social and Economic Change: The Uses of Household Panel Studies. London: Routledge.
• Taris, T. W. (2000). A Primer in Longitudinal Data Analysis. London: Sage.
• Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey-Bass.
• Verbakel, E., & de Graaf, P. M. (2008). Resources of the Partner: Support or Restriction in the Occupational Career? Developments in the Netherlands Between 1940 and 2003. European Sociological Review, 24(1), 81-95.