Combining Cycles of the Canadian Community Health Survey
APHEO Conference 2007
Steven Thomas, Statistics Canada
October 16, 2007

Outline
- Introduction to the Canadian Community Health Survey (CCHS)
- Motivation for combining
- Guidelines
- Methods
- Case Study example
- Conclusion

Overview of the CCHS
Two surveys:
- Regional Component (.1 Cycle)
  - General health survey: 130,000 respondents
  - Biennial cycle (2001, 2003, 2005)
  - Common, optional and sub-sample content
- Provincial Component (.2 Cycle)
  - Focused health survey: 30,000 respondents
  - Biennial cycle (2002: Mental Health, 2004: Nutrition)
  - Focused content plus common and sub-sample content

Motivation for Combining
- Alone, the data from one cycle do not meet all the needs of users. There is a need for:
  - Improved quality of existing estimates
  - More sample for studying rare characteristics
  - Estimates for smaller areas of geography
- Similar information is collected with the different cycles of the CCHS
- Possible solution: Combine cycles

Guidelines
- Statistics Canada will not be releasing a combined data file of previous releases
- A set of guidelines will be published along with case study examples
- In its simplest forms, combining cycles is a straightforward process, and most users familiar with the CCHS datasets should be able to combine them

General Guidelines
- Users must consider the time effect on what they are studying
  - Changes to the population and to the characteristics of the population
  - Because of time issues, it is recommended to combine only when deemed necessary
- Use caution when combining different surveys
  - Users are interested in combining .1 Cycles, .2 Cycles and NPHS
  - Given the ideas presented, it is unlikely that the results obtained with the different cross-sectional health surveys are comparable

General Guidelines
Major points highlighted in ‘Combining Cycles of the Canadian Community Health Survey’ by Steven Thomas (Symposium 2006):
1. Ensure that the same characteristic is being measured from cycle to cycle
2. Ensure that the same population is being targeted by the different sources
3. Verify that geography boundaries have not changed between cycles and update to new geography if necessary
4. May be necessary to verify that the same values for the characteristics are being measured between cycles / time periods
5. Consider possible ‘mode effect’

Methods
- Methods of combining can be broken down into:
  - Separate Approach: Estimates calculated separately and then combined
  - Pooled Approach: Data files combined and estimates calculated
- The two methods do not necessarily result in the same estimates
- Both methods require independence between samples for variance calculation

Methods
Separate Approach:
- Simple average (recommended):
  - Parameter of interest: θ (mean, total, ratio)
  - K estimates being combined, with $\hat{\theta}_i$ = estimate obtained from cycle i
  - Combined estimate: $\hat{\theta} = \frac{1}{K}\sum_{i=1}^{K}\hat{\theta}_i$
- Advantages:
  - Easy to interpret: average of the values observed from the different cycles
  - Easy to implement using existing tables
  - No major assumptions
- Disadvantage: Does not minimize the variance
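
As a sketch of why the simple average gains precision, suppose the K cycle samples are independent (the condition the slides require for variance calculation). Writing $\hat{\theta}_i$ for the estimate from cycle i, and using $\widehat{V}$ and $\widehat{CV}$ as notation introduced here rather than taken from the slides, the variance of the average is the sum of the cycle variances divided by $K^2$:

```latex
\hat{\theta} = \frac{1}{K}\sum_{i=1}^{K}\hat{\theta}_i,
\qquad
\widehat{V}(\hat{\theta}) = \frac{1}{K^{2}}\sum_{i=1}^{K}\widehat{V}(\hat{\theta}_i),
\qquad
\widehat{CV}(\hat{\theta}) = \frac{\sqrt{\widehat{V}(\hat{\theta})}}{\hat{\theta}}
```

If the cycle variances are of similar size, the variance of the average is roughly 1/K of a single cycle's variance.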

Methods: Separate Approach
- More advanced application: composite estimation or weighted average
  - $\hat{\theta} = \sum_{i=1}^{K} a_i \hat{\theta}_i$, with $\sum_{i=1}^{K} a_i = 1$, is an unbiased estimator of θ for any choice of the $a_i$
  - Assumes that $\hat{\theta}_i$ is unbiased for θ in all cycles
- Advantage: Can be used to minimize the variance
- Disadvantage: Assumptions not always met
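
One common way to choose the weights in a weighted average is inverse-variance weighting, which minimizes the variance when the cycle estimates are independent and each is unbiased for the same θ. The slides do not say which weights were used, so the sketch below is only an illustration of that standard choice; the function name and the input numbers are hypothetical.

```python
def composite_estimate(estimates, variances):
    """Weighted average with a_i proportional to 1 / V_i, scaled to sum to 1.

    Assumes independent cycle estimates that are each unbiased for the
    same parameter; this is an assumption, not something checked here.
    """
    if not estimates or len(estimates) != len(variances):
        raise ValueError("need one positive variance per estimate")
    inverse = [1.0 / v for v in variances]
    total_inverse = sum(inverse)
    weights = [w / total_inverse for w in inverse]
    # Composite estimate: sum of a_i * theta_i
    estimate = sum(a * t for a, t in zip(weights, estimates))
    # Variance under independence: sum of a_i^2 * V_i
    variance = sum(a * a * v for a, v in zip(weights, variances))
    return estimate, variance, weights


# Hypothetical inputs: two cycle estimates with variances 1 and 4.
estimate, variance, weights = composite_estimate([10.0, 12.0], [1.0, 4.0])
print(weights, estimate, variance)
```

The more precise cycle receives the larger weight, so the result no longer treats each time period equally, which is part of why such estimates are harder to interpret than the simple average.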

Methods
Pooled Approach:
- Basic application
  - Files are combined with existing weights and the resulting data file is treated as one sample from a population
- Advantage: Easy to implement
- Disadvantage: Weights may not be appropriate for totals (each cycle's weights already sum to the population, so pooled totals are inflated by roughly the number of cycles)

Methods: Pooled Approach
- Period Estimation (recommended):
  - Weights for combined data files are adjusted by dividing by the number of cycles being combined (each time period or cycle is treated equally)
- Advantages:
  - Easy to implement
  - Totals correspond to average population
  - Results are similar to other CCHS releases
  - No major assumptions
- Disadvantage: Does not minimize the variance
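
A minimal sketch of the period-estimation weight adjustment, assuming the pooled file can be represented as a list of records with a cycle label and a survey weight. The field names (`cycle`, `weight`, `smoker`) and the six records are hypothetical and are not CCHS variable names or data.

```python
def period_weights(records):
    """Return copies of the records with weights divided by the number of cycles."""
    k = len({r["cycle"] for r in records})
    return [{**r, "weight": r["weight"] / k} for r in records]


def weighted_total(records, flag):
    """Weighted count of records where the flag is true."""
    return sum(r["weight"] for r in records if r[flag])


def weighted_rate(records, flag):
    """Weighted proportion of records where the flag is true."""
    return weighted_total(records, flag) / sum(r["weight"] for r in records)


# Hypothetical pooled file built from three cycles.
pooled = [
    {"cycle": "1.1", "weight": 100.0, "smoker": True},
    {"cycle": "1.1", "weight": 300.0, "smoker": False},
    {"cycle": "2.1", "weight": 150.0, "smoker": True},
    {"cycle": "2.1", "weight": 250.0, "smoker": False},
    {"cycle": "3.1", "weight": 50.0, "smoker": True},
    {"cycle": "3.1", "weight": 400.0, "smoker": False},
]

adjusted = period_weights(pooled)
unadjusted_total = weighted_total(pooled, "smoker")   # sums all three cycles
period_total = weighted_total(adjusted, "smoker")     # average over the cycles
period_rate = weighted_rate(adjusted, "smoker")
print(unadjusted_total, period_total, period_rate)
```

With these toy records the unadjusted pooled total is the sum of the three cycle totals, while the period total equals their average. The pooled rate (0.24) is a ratio of pooled totals and differs slightly from the simple average of the three cycle rates, which illustrates that the separate and pooled approaches need not agree.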

Methods: Pooled Approach (cont.)
- More complex applications:
  - Weights for combined data files are adjusted by a set of a_i that are more appropriate for the analysis (minimize the variance, increasing weight function, etc.)
  - Requires the assumption that the same values are being reported from cycle to cycle
- Advantages:
  - Can be used to minimize the variance
  - Resulting value corresponds to all cycles or time periods
- Disadvantage: Assumptions are often not met

Methods
- Note that the more complex methods require the assumption that the same values are being reported from cycle to cycle
- Given the evolving nature of the CCHS (changing characteristics of an evolving population), it is unlikely that this assumption is met
- Recommend the basic approaches: the simple average or the period estimate

Methods
Note on Modeling:
- With most regression analyses, the researcher is interested in describing the relationship between characteristics and not the population
- Cycles are combined to increase the number of observations
- Weights should be used in the model
- Weights do not need to be adjusted
- Control for cycle in the model
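
The modeling notes can be sketched as a weighted linear regression on the pooled file, with the original (unadjusted) survey weights and indicator variables for cycle. This is only an illustration under stated assumptions: the slides do not name a model type, the data below are invented so that the fit is exact, and the helper names are hypothetical. It produces point estimates only; design-based variance estimation is out of scope here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_least_squares(rows, y, w):
    """Solve the weighted normal equations (X'WX) beta = X'Wy."""
    p = len(rows[0])
    xtwx = [[sum(wi * r[j] * r[k] for r, wi in zip(rows, w)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(wi * r[j] * yi for r, yi, wi in zip(rows, y, w)) for j in range(p)]
    return solve(xtwx, xtwy)


def design_row(x, cycle, cycles):
    """Intercept, covariate, and one indicator per cycle after the reference cycle."""
    return [1.0, x] + [1.0 if cycle == c else 0.0 for c in cycles[1:]]


cycles = ["1.1", "2.1", "3.1"]
shift = {"1.1": 0.0, "2.1": 1.0, "3.1": -1.0}  # invented cycle effects
# Invented records: (cycle, covariate, original survey weight).
data = [("1.1", 0.0, 10.0), ("1.1", 2.0, 20.0), ("2.1", 1.0, 15.0),
        ("2.1", 3.0, 5.0), ("3.1", 0.0, 8.0), ("3.1", 4.0, 12.0)]

rows = [design_row(x, c, cycles) for c, x, _ in data]
y = [2.0 + 0.5 * x + shift[c] for c, x, _ in data]
weights = [w for _, _, w in data]  # used as-is, not divided by the number of cycles
coefs = weighted_least_squares(rows, y, weights)
print(coefs)  # intercept, slope, effect of cycle 2.1, effect of cycle 3.1
```

Dividing every weight by the same constant would leave these coefficients unchanged, which is consistent with the note that weights do not need to be adjusted for modeling.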

Case Study
- Example: Durham health unit. Use combined cycle data to analyze adolescent health, showcasing relevant indicators by specific age groups
- Analysis included:
  - Breakdown of adolescent age groups: 12-19, 12-14, 15-19
  - Comparison with provincial values

Case Study
Durham Health Region, Males 12-19, Daily Smoker

Cycle    Total    Rate     CV
1.1      2277     7.31%    32%
2.1      3228     9.48%    38%
3.1      2917     8.12%    40%

Method             Total      Rate     CV
Simple Average     2807.43    8.30%    21.17%
Period Estimate    2807.43    8.33%    23.2%
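
As a rough check of the simple-average row, the sketch below averages the three published cycle rates and approximates the CV of that average, assuming independent cycles. Because it starts from the rounded rates and CVs shown on the slide rather than the microdata, the CV it produces should land near the reported value rather than match it exactly. The function names are hypothetical.

```python
import math

# Published values from the slide: rate and CV, both in percent.
cycles = {
    "1.1": {"rate": 7.31, "cv": 32.0},
    "2.1": {"rate": 9.48, "cv": 38.0},
    "3.1": {"rate": 8.12, "cv": 40.0},
}


def simple_average(values):
    """Unweighted mean of the cycle estimates."""
    return sum(values) / len(values)


def cv_of_simple_average(values, cvs_percent):
    """Approximate CV (percent) of the simple average, assuming independent cycles."""
    k = len(values)
    # Recover each cycle's variance from its CV: V_i = (CV_i * estimate_i)^2
    variances = [(cv / 100.0 * v) ** 2 for v, cv in zip(values, cvs_percent)]
    standard_error = math.sqrt(sum(variances)) / k
    return 100.0 * standard_error / simple_average(values)


rates = [c["rate"] for c in cycles.values()]
cvs = [c["cv"] for c in cycles.values()]
avg_rate = simple_average(rates)
avg_cv = cv_of_simple_average(rates, cvs)
print(f"rate = {avg_rate:.2f}%, approximate CV = {avg_cv:.1f}%")
```

The averaged rate rounds to the 8.30% shown for the simple average, and the approximate CV falls well below each single-cycle CV, which is the precision gain the case study reports.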

Case Study Findings
- Combining methods did create more precise estimates
- Combining was not always necessary
- Targeting the 12-14 age group was not possible
- Some differences became significant
- Estimates require special interpretation

Conclusion
- Combining can be used to calculate more precise estimates
- Estimates for combined cycles of the CCHS do require special interpretation
- Combining cannot solve all problems