Research design and statistics (II) Observational studies Wei-Chu Chie outcome research 1
Purposes of clinical/preventive research
– Therapeutic (treatment) efficacy evaluation
– Prognostic/predictive factor identification
  • possible factors associated with a clinical/preventive outcome
– Descriptive
– Causal inference
  • hypothesis testing
  • prediction
Therapeutic efficacy evaluation
• Exposure (cause): a certain treatment
  – drug, procedure, education, ...
• Outcome:
  – the end result of care, or a measurable change in the health status or behavior of patients
  – clinical, functional, ...
Prognostic/predictive factor identification
– Exposure: certain characteristics of the patient, disease, ...
  • Prognosis: a prediction of the future course of disease following its onset
  • Prognostic/predictive factors: conditions associated with an outcome of disease
– Outcome:
  • the end result of care, or a measurable change in the health status or behavior of patients
Major designs
– Experimental:
  • exposure (treatment) is manipulated
  • analogous to laboratory work
  • gold standard: randomized controlled trials
– Observational:
  • no manipulation of exposure (treatment)
  • natural observation
– Common purpose: causal inference
Common enemies of causal inference
• What do we pursue?
  – the real causal effect
• What may we get instead?
  – bias
  – confounding
  – chance
  – reverse causal relation
Bias
• Source: systematic error
  – selection: two definitions!
  – information
• Prevention/avoidance
  – better design (RCT)
• Evaluation and analysis
  – additional data
  – check consistency
Chance
• Source: random error
• Prevention/avoidance
  – increase sample size/power of test
  – more accurate measurement
• Analysis and evaluation
  – p value/confidence intervals
  – meta-analysis
Confounding
• Source: other factors associated with both exposure and outcome
• Prevention/avoidance
  – better design (RCT) (*matching is not suitable)
• Analysis and evaluation
  – restriction
  – stratification
  – modeling (adjusting)
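To make the stratification step concrete, here is a minimal sketch in Python. It assumes the data can be laid out as one 2x2 table per stratum of the confounder and uses the Mantel-Haenszel summary odds ratio as one common way to pool strata; the counts and function names are invented for illustration and do not come from the lecture.

```python
# Each stratum is a 2x2 table (a, b, c, d):
#   a = exposed with outcome,   b = exposed without outcome
#   c = unexposed with outcome, d = unexposed without outcome
# All counts below are hypothetical.

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)

def crude_odds_ratio(strata):
    """Ignore the confounder: collapse all strata into one table."""
    a, b, c, d = (sum(col) for col in zip(*strata))
    return odds_ratio(a, b, c, d)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

strata = [
    (10, 90, 40, 360),   # hypothetical stratum 1 (e.g. younger patients)
    (120, 280, 30, 70),  # hypothetical stratum 2 (e.g. older patients)
]

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_or(strata)
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")
```

In this made-up example the odds ratio is 1 within each stratum, yet the collapsed table suggests an association. That gap between the crude and the stratified estimate is what confounding looks like in the numbers.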
Reverse causal association
• Source: cross-sectional data collection
• Prevention/avoidance
  – proper time sequence: exposure before outcome
• Analysis and evaluation
  – biological plausibility
  – consistency
  – in fact, there are no better ways!
U.S. Preventive Services Task Force: evaluation criteria
– Review of evidence:
  – literature retrieval and exclusion criteria
  – evaluating the quality of the evidence (Appendix A)
    • grade I: RCT (randomized controlled trials)
    • grade II-1: CT without R (controlled trials without randomization)
    • grade II-2: well-designed cohort or case-control studies, preferably multi-center
U.S. Preventive Services Task Force: evaluation criteria
    • grade II-3: multiple time series with or without intervention; dramatic results of uncontrolled experiments
    • grade III: opinions of respected authorities, based on clinical experience; descriptive studies and case reports; reports of expert committees
  – cost-benefit, utility and effectiveness analysis
  – meta-analysis and synthesis of research results
  – updating evidence
U.S. Preventive Services Task Force: evaluation criteria
• Translating science into clinical practice (recommendations in Appendix A)
  – level A: good evidence to support the recommendation that the condition be specifically considered in a periodic health examination
  – level B: fair …
  – level C: insufficient … but recommendations may be made on other grounds
  – level D: fair evidence … be excluded
  – level E: good evidence … be excluded
• External review from other organizations
Grade definition
• Good:
  – evidence includes consistent results from well-designed, well-conducted studies in representative populations that directly assess effects on health outcomes
Grade definition
• Fair:
  – evidence is sufficient to determine effects on health outcomes, but the strength of the evidence is limited by the number, quality, or consistency of the individual studies
Grade definition
• Poor:
  – evidence is insufficient to assess the effects on health outcomes because of limited number or power of studies, important flaws in their design or conduct, gaps in the chain of evidence, or lack of information on important health outcomes
Ideal and reality
– Ideal: experimental
  • good for causal inference, with less bias and fewer confounders, but not always generalizable
  • more ethical concerns and more costly
  • therapeutic efficacy evaluation
– Reality: observational
  • easier to implement, or data ready to use
  • fewer ethical concerns, but more bias and more confounders
  • prognostic factor identification
Observational studies
• Cohort, prospective
  – variations of prospective cohort
• Case-control
• Cross-sectional
• Other related designs
The major differences
• Time/timing between measurement of exposure and outcome
• Strength in causal inference
• Efficiency of subject recruitment
Cohort study
• Original meaning of "cohort": a unit of the Roman legion
• Prospective cohort study:
  – the most classical and attractive design in epidemiology
  – causal inference without experiment
  – the best among observational studies
• Variants:
  – retrospective cohort, ...
Prospective cohort study in outcome research
• Assemble the cohort
  – inception cohort: onset of disease/zero time
• Measure predictor variables (prognostic/predictive)
• Follow up and measure outcomes
  – time to event (incidence): change of status
  – surrogate, QOL (quality of life), …: change of value
Strengths and weaknesses in outcome research
• Strengths:
  – proper time sequence: predictors (exposures) measured before outcomes
  – less bias: information and selection
  – time-dependent variables available if measured
  – binary outcomes: rates obtainable; non-binary outcomes: value/change obtainable
• Weaknesses:
  – inefficient for rare outcomes
  – expensive and time consuming in maintenance/follow-up
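Because a cohort follows subjects over time, incidence rates can be computed directly. The sketch below shows one way to do this for a binary outcome: events per person-time in each exposure group, their ratio, and an approximate 95% confidence interval on the log scale. The counts, person-years and function names are hypothetical.

```python
import math

def incidence_rate(events, person_years):
    """Events per unit of person-time at risk."""
    return events / person_years

def rate_ratio_ci(events_exp, py_exp, events_unexp, py_unexp, z=1.96):
    """Rate ratio (exposed vs unexposed) with an approximate 95% CI.

    Uses the usual large-sample standard error of the log rate ratio,
    sqrt(1/events_exp + 1/events_unexp).
    """
    rr = incidence_rate(events_exp, py_exp) / incidence_rate(events_unexp, py_unexp)
    se_log = math.sqrt(1 / events_exp + 1 / events_unexp)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical cohort: 30 events in 1000 person-years among the exposed,
# 15 events in 1500 person-years among the unexposed.
rr, lower, upper = rate_ratio_ci(30, 1000, 15, 1500)
print(f"rate ratio = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```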
Variant 1: retrospective cohort study
– Identify a suitable cohort
– Collect data about predictor variables
– Collect data about outcomes at a later time
  • basically also a cohort or follow-up study
  • only difference: time of measurement
  • common in clinical studies/data linkage
  • outcomes need not be collected "later" by the investigator, only at a time later than the occurrence of the exposure
Strengths and weaknesses in outcome research
• Strengths:
  – same as prospective cohort
  – less costly and time consuming
• Weaknesses:
  – same as prospective cohort except for cost and time
  – no QA/QC for data collection
  – may not include the information needed
Variant 2: case-cohort study
• Identify a cohort with adequate samples
• Select a sub-cohort as comparison: representative of the full cohort
• Identify cases at the end of follow-up
• Measure predictors on baseline samples from cases and sub-cohort
Strengths and weaknesses in outcome research
• Strengths
  – same as prospective cohort except for non-binary outcomes
  – most useful for costly analyses of specimens
    • biochemistry, biomarkers, …
  – time-dependent factors available
  – rates and non-binary outcomes obtainable
• Weaknesses
  – same as prospective cohort except for cost
  – specimen storage/use and re-use
Variant 3: nested case-control study
• Identify a cohort with adequate samples
• Identify cases at the end of follow-up
• Select matched controls from the cohort
• Measure predictors on baseline samples from cases and controls
Strengths and weaknesses in outcome research
• Strengths
  – same as prospective cohort except for non-binary outcomes
  – most useful for costly analyses of specimens
    • biochemistry, biomarkers, …
  – controls from the cohort: representative and time-matched
• Weaknesses
  – same as prospective cohort except for cost
  – specimen storage/use and re-use
  – time-dependent factors, rates, non-binary outcomes not available
Variant 4: double cohort and external controls
• Identify cohorts with different exposures
• Determine outcomes
  – less often used
Strengths and weaknesses in outcome research
• Strengths
  – same as prospective cohort
  – good for rare exposures
  – time-dependent factors available
  – rates, non-binary outcomes obtainable
• Weaknesses
  – same as prospective cohort
  – more confounding
  – may not include the information needed
Case-control (reference) study
• An important breakthrough in epidemiologic study design
• classical definition
• new perspective
  – controls as a sample of the hypothetical population from which the cases came
  – can be seen as a variant of the cohort study
Case-control study in outcome research
– Draw a sample of new (incident) cases (outcome +)
– Draw a sample of controls (outcome − at a certain time)
  • a sample of the hypothetical population from which the cases came
– Measure the predictor variables
  • usually at the time when cases and controls are drawn
Strengths and weaknesses in outcome research
• Strengths
  – efficient for rare outcomes: time and cost
• Weaknesses
  – not always a proper time sequence
  – bias: selection and information
  – confounding
  – non-binary outcomes not obtainable
  – binary outcomes: only the odds ratio obtainable
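Since a case-control study yields only the odds ratio, a small worked example may help. The sketch below computes the odds ratio from a 2x2 table with an approximate 95% confidence interval based on the log odds ratio (often called Woolf's method). The counts and the function name are hypothetical.

```python
import math

def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Odds ratio of exposure (cases vs controls) with an approximate 95% CI.

    The standard error of log(OR) is sqrt(1/a + 1/b + 1/c + 1/d),
    so every cell must be non-zero.
    """
    a, b, c, d = exp_cases, unexp_cases, exp_controls, unexp_controls
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical study: 100 cases (40 exposed) and 100 controls (20 exposed).
estimate, lower, upper = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```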
Cross-sectional study
• The easiest type
• usually by surveys
• current status/prevalence and prevalence ratios only
• poor in causal inference
Cross-sectional study in outcome research
• Select a sample from the population
• Measure the predictor variables and the outcomes at the same time
  – case/non-case (not controls)
  – exposure/non-exposure
Strengths and weaknesses in outcome research
• Strengths
  – time saving
  – prevalence/status data obtainable, both binary and non-binary
• Weaknesses
  – no proper time sequence: poor in causal inference
  – inefficient for rare outcomes
  – bias: selection and information/confounding
  – binary outcomes: only the prevalence ratio obtainable, no incidence or change of status
Differences between case-control and cross-sectional studies
• Nature of cases
  – case-control: incident (new-onset) cases
  – cross-sectional: prevalent cases
    • the problem with using prevalent cases: duration
• Ratio of cases to non-cases/controls
  – case-control: equal or a fixed ratio/efficient
  – cross-sectional: depends on the ratio in the sample
Differences between case-control and cross-sectional studies
• Nature of non-cases/controls
  – case-control: a representative sample of the cohort
  – cross-sectional: the group of non-cases in the population sample at the time of sampling
    • subject to the duration of and recovery from disease
• Time frame of the two parts
  – case-control: better matched by time of onset
  – cross-sectional: usually not related to onset
Similarity of case-control and cross-sectional studies
• Data are collected at the time when the cases/non-cases or controls are identified
• not easy to establish the proper and true time sequence of exposure and outcome
Serial surveys or panels
• Follow up a single population
  – Serial surveys: like multiple cross-sectional studies
  – Panel: like cohort studies
    • multiple measurements
Sample size estimation
– Purpose: adequate power of test
– Basic formula and necessary components
  • alpha (one- or two-sided) and beta error
    – usually alpha = 0.05, beta = 0.2
    – then power = 1 − beta = 0.8
  • effect size: mean, difference, ratio, ...
  • standard deviation
    – from prior information or other related sources
– Formulas/tables/software
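As an illustration of how these components fit together, the sketch below uses one common textbook formula, the normal-approximation sample size for comparing two means: n per group = 2 × ((z_alpha + z_beta) × SD / difference)². The lecture does not specify which formula it has in mind, so treat the choice of formula, the function name and the example numbers as assumptions.

```python
import math
from statistics import NormalDist

def n_per_group_two_means(delta, sd, alpha=0.05, power=0.80, two_sided=True):
    """Approximate sample size per group to detect a mean difference `delta`.

    delta : smallest difference in means worth detecting (effect size)
    sd    : common standard deviation, e.g. from prior studies
    """
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)  # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return math.ceil(n)  # round up to whole subjects

# Hypothetical example: detect a 5-unit difference when SD is 10,
# with alpha = 0.05 (two-sided) and power = 0.8.
n_two_sided = n_per_group_two_means(delta=5, sd=10)
n_one_sided = n_per_group_two_means(delta=5, sd=10, two_sided=False)
print(n_two_sided, n_one_sided)
```

Halving the detectable difference roughly quadruples the required sample size, which is why the effect size is usually the most consequential input.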
Statistical methods
– Purpose:
  • description or inference
  • not to confuse or cheat readers!
– Rules:
  • a good design is much more important than fancy statistical methods
  • never dream of using statistical techniques to compensate for a poor design
  • keep it as simple as possible
Statistical methods
• Exposure (independent variable) × outcome (dependent variable)
• Classification of variables
  – nominal or categorical
    • binary/dichotomous, time to event
    • other
  – ordinal or rank
  – interval or continuous
Exposure as categorical
• Outcome as binary
  – chi-square/proportion (Z)
  – logistic regression
• Outcome as binary, time to event/'censored'
  – survival:
    • Kaplan-Meier, log-rank
    • Cox's proportional hazards
    • Poisson regression
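For the time-to-event case, the following is a minimal sketch of the Kaplan-Meier product-limit estimate for a single group, written in plain Python so that the handling of censored subjects is visible. The follow-up times and the function name are hypothetical; a real analysis would normally use a dedicated survival package.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : follow-up time of each subject
    events : 1 if the outcome occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up data (e.g. months); 0 marks a censored subject.
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 1, 0, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2}: S(t) = {s:.3f}")
```

Censored subjects contribute to the number at risk up to their last follow-up time but never trigger a drop in the curve.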
Exposure as categorical
• Outcome as other categorical
  – chi-square
  – proportion
  – logistic regression for polychotomous (vs. dichotomous) outcomes
Exposure as categorical
• Outcome as rank
  – non-parametric methods
  – Wilcoxon's rank sum test
  – Kruskal-Wallis test
  – treat as interval/continuous
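A rough sketch of the Wilcoxon rank sum test for two groups is given below. It assigns mid-ranks to ties and uses the large-sample normal approximation without a tie correction, so the p value is only approximate for small or heavily tied samples. The scores and function names are hypothetical.

```python
import math

def midranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Rank sum W of group x, its z score, and an approximate two-sided p value."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return w, z, p

# Hypothetical ordinal scores (e.g. a 5-point symptom scale) in two groups.
group_a = [1, 2, 2, 3, 3]
group_b = [3, 4, 4, 5, 5]
w, z, p = rank_sum_test(group_a, group_b)
print(f"W = {w}, z = {z:.2f}, approximate p = {p:.3f}")
```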
Exposure as categorical
• Outcome as interval
  – t-test
  – ANOVA
  – regression with dummy independent variables
Exposure as rank
• Outcome as categorical:
  – chi-square
• Outcome as rank:
  – non-parametric correlation
• Outcome as interval:
  – non-parametric correlation
  – parametric correlation/regression
Exposure as interval
• Outcome as categorical
  – treat as the reverse relation (swap the roles of exposure and outcome)
• Outcome as rank
  – non-parametric correlation
  – parametric correlation/regression
• Outcome as interval
  – parametric correlation/regression
Repeated measurements
• Two times
  – Categorical variables:
    • McNemar's chi-square
  – Rank variables:
    • Wilcoxon's signed rank test
  – Interval variables:
    • paired t-test
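For a binary variable measured twice on the same subjects, McNemar's chi-square uses only the discordant pairs. A minimal sketch follows, assuming the continuity-corrected form of the statistic; the pair counts and the function name are hypothetical.

```python
import math

def mcnemar(b, c, continuity=True):
    """McNemar's chi-square (1 df) from the two discordant-pair counts.

    b : pairs positive at time 1 and negative at time 2
    c : pairs negative at time 1 and positive at time 2
    """
    diff = abs(b - c)
    if continuity:
        diff = max(diff - 1, 0)  # continuity correction
    chi2 = diff ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical data: 15 subjects changed from + to -, 5 changed from - to +.
chi2, p = mcnemar(b=15, c=5)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Concordant pairs (no change between the two measurements) do not enter the statistic at all, which is the key difference from an ordinary chi-square test on the same table.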
Repeated measurements
• Multiple times
• Advanced statistical models/methods
  – GEE (generalized estimating equations)
  – mixed-effects regression models
  – Markov chains and transition probabilities among more than two states
- Slides: 51