Causal inference in observational studies Roberto Cardarelli DO

Causal inference in observational studies Roberto Cardarelli, DO, MPH, FAAFP Professor, Chief of Community Medicine

Bottom line In observational studies, the goal is to investigate a potential cause-effect association We must rule out other explanations for findings Bias (systematic error) Chance (random error, Type I error) Confounding

Important questions to ask in the design phase Do the samples of study subjects sufficiently represent the population of interest? Does the measurement of the predictor variable sufficiently represent the predictor of interest? Does the measurement of the outcome variable sufficiently represent the outcome of interest?

P-value The probability of obtaining the observed result (test statistic) or a more extreme value if the hypothesis is true The smaller the p-value, the less compatible the hypothesis is with the observed results In most applications, the hypothesis tested is the null hypothesis In this situation, the p-value is judged against the accepted probability of a Type I error (rejecting a true null hypothesis)
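As a minimal sketch of the definition above, the following Python computes an exact one-sided p-value for a hypothetical experiment: 60 heads in 100 coin flips, tested against the null hypothesis of a fair coin. The scenario, the numbers, and the function name `binomial_p_value_upper` are illustrative assumptions and do not come from the slides.

```python
from math import comb

def binomial_p_value_upper(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_null).

    This is the probability of the observed count or a more extreme
    (larger) one, computed under the assumption that the null is true.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 60 heads in 100 flips, null hypothesis p = 0.5.
p_value = binomial_p_value_upper(60, 100)
print(f"one-sided p-value: {p_value:.4f}")
```

The result is a statement about the data given the null hypothesis, not about the probability that the null hypothesis is true.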

P-value Although NOT RECOMMENDED, the result of a statistical test is often reported as “significant” (usually when p < 0.05) This cutoff value for the accepted probability of a Type I error is arbitrary Has NO scientific value for making causal inferences

P-value Over-reliance on p values for making causal inferences is common among scientists Often misunderstood and misinterpreted It is NOT the probability that the null hypothesis is true or that the result was due to chance

Effect-cause Reverse causality: the outcome causes the predictor variable

Confounding Is a confusion of effects that occurs when the effect of an extraneous factor is mistaken for or mixed with the actual exposure effect

A confounder can… Create the appearance of an association when the true association is null (e.g., the Down syndrome example that follows) Create the appearance of a null association when there is a true association Bias the measure of a true effect toward or away from the null value Reverse the direction of a true association

Example Prevalence of Down syndrome at birth increases with birth order However, birth order is highly correlated with mother’s age The birth-order effect is confounded by the effect of mother’s age

General definition of confounding
E --?--> Y
C --- E (causal or non-causal association)
C --> Y (causal association)

Confounding Confounder (C) is causally associated with outcome (Y) Confounder is either causally or not causally associated with the exposure of interest (E) Association of interest is whether E causes Y
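The structure described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the slides: the probabilities (0.5, 0.8/0.2, 0.3/0.1), the sample size, the seed, and the helper names `simulate_counts` and `two_by_two` are all hypothetical. The confounder C raises both the chance of exposure E and the risk of outcome Y, while E itself has no effect on Y.

```python
import random

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    """
    return (a * d) / (b * c)

def simulate_counts(n=200_000, seed=0):
    """Return counts[c][e][y] for n simulated subjects (hypothetical model)."""
    rng = random.Random(seed)
    counts = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
    for _ in range(n):
        c = int(rng.random() < 0.5)                  # confounder present?
        e = int(rng.random() < (0.8 if c else 0.2))  # C makes exposure more likely
        y = int(rng.random() < (0.3 if c else 0.1))  # C raises risk; E plays no role
        counts[c][e][y] += 1
    return counts

def two_by_two(counts, strata):
    """Collapse the chosen confounder strata into cells (a, b, c, d)."""
    a = sum(counts[s][1][1] for s in strata)
    b = sum(counts[s][1][0] for s in strata)
    c = sum(counts[s][0][1] for s in strata)
    d = sum(counts[s][0][0] for s in strata)
    return a, b, c, d

counts = simulate_counts()
crude_or = odds_ratio(*two_by_two(counts, (0, 1)))
stratum_ors = [odds_ratio(*two_by_two(counts, (s,))) for s in (0, 1)]
print(f"crude OR: {crude_or:.2f}")
print("stratum-specific ORs:", [f"{x:.2f}" for x in stratum_ors])
```

Under these assumptions the crude odds ratio is well above 1 even though E has no effect on Y, while the odds ratios within each level of C sit close to 1.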

Example: Down syndrome
Birth order --?--> Down syndrome
Potential confounder: mother’s age

Example: Lung cancer
Alcohol --?--> Lung cancer
Potential confounder: smoking

Example: Low birth weight
Maternal coffee consumption --?--> Low birth weight (LBW)
Potential confounder: smoking

Example: Down syndrome
Maternal spermicide use --?--> Down syndrome
Potential confounder: mother’s age

Assessing for confounding 1. Is the potential confounder causally associated with the outcome? 2. Is the potential confounder causally or non-causally associated with the predictor variable of interest? 3. Does the association between the predictor variable and the outcome vary by the potential confounder?

Example for assessing confounding Is cigarette lighter use associated with lung cancer?
              Case   Control
Exposed         39        15
Not exposed     61        85
Odds ratio = 3.6
What potential confounder should be assessed?

Example What is the association between smoking and lung cancer?
              Case   Control
Exposed         75        25
Not exposed     25        75
Odds ratio = 9.0

Example Is the potential confounder associated with the exposure?
              Smoker   Not smoker
Lighter           50            4
No lighter        50           96
Odds ratio = 24.0
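The three odds ratios in the lighter example can be checked with the cross-product formula OR = (a × d) / (b × c). The sketch below uses the counts from the slides; the function name `odds_ratio` and the variable names are illustrative choices.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table laid out as

                 column 1   column 2
        row 1        a          b
        row 2        c          d
    """
    return (a * d) / (b * c)

# Lighter use (rows) by lung cancer case/control status (columns).
lighter_cancer_or = odds_ratio(39, 15, 61, 85)

# Smoking (rows) by lung cancer case/control status (columns).
smoking_cancer_or = odds_ratio(75, 25, 25, 75)

# Lighter use (rows) by smoking status (columns).
lighter_smoking_or = odds_ratio(50, 4, 50, 96)

print(f"lighter vs. lung cancer: {lighter_cancer_or:.1f}")
print(f"smoking vs. lung cancer: {smoking_cancer_or:.1f}")
print(f"lighter vs. smoking:     {lighter_smoking_or:.1f}")
```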

Example Is the potential confounder associated with the exposure? Is the potential confounder associated with the outcome? Are the results of the original analysis actually confounded by the potential confounder? To answer this, stratify by the potential confounder If the odds ratios within the strata differ from the unstratified (original) odds ratio of 3.6, then confounding has been demonstrated

Assessing for confounding: Stratification Involves segregating subjects into strata (subgroups) according to the potential confounder and then examining the association between the predictor variable and the outcome separately in each stratum When there is a confounding effect, the associations seen across strata of the potential confounder are of similar magnitude to each other and are different from the crude value

Example: stratification
Smokers
              Case   Control
Lighter         38        12
No lighter      37        13
OR = 1.1
Nonsmokers
              Case   Control
Lighter          1         3
No lighter      24        72
OR = 1.0
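The stratified tables above can be checked in a few lines. This sketch uses the slide counts and confirms two things: the strata add back up to the crude table (39, 15, 61, 85), and the stratum-specific odds ratios are similar to each other but far from the crude value. The names `strata`, `crude`, and `stratum_ors` are illustrative.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)

# Cells per smoking stratum, ordered (a, b, c, d) as described above.
strata = {
    "smokers": (38, 12, 37, 13),
    "nonsmokers": (1, 3, 24, 72),
}

# Summing each cell across strata recovers the unstratified table.
crude = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude)
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

print(f"crude table {crude}: OR = {crude_or:.1f}")
for name, value in stratum_ors.items():
    print(f"{name}: OR = {value:.1f}")
```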

Example #2 of assessing for confounding Is coffee consumption associated with pancreatic cancer?
                 Case   Control
Coffee drinker     30        18
No coffee          70        82
Odds ratio = 1.95
What potential confounder should be assessed?

Causal diagram
Coffee consumption --?--> Pancreatic cancer
Potential confounder: smoking

Example What is the association between smoking and pancreatic cancer?
              Case   Control
Smoker          50        20
Not smoker      50        80
20% of controls were smokers, compared with 50% of cases
OR = 4.0

Example Is the potential confounder associated with the exposure?
              Coffee drinker   No coffee
Smoker                    35          35
Nonsmoker                 13         117
Odds ratio = 9.0

Stratification by potential confounder
Smokers
                 Case   Control
Coffee drinker     25        10
No coffee          25        10
Nonsmokers
                 Case   Control
Coffee drinker      5         8
No coffee          45        72
OR = 1.0 in both strata

Example #2: Conclusions The only reason that we originally had an OR of 1.95 was that the distribution of smoking differed between the cases and controls Thus, in this example, smoking is a confounder
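One common way to summarize a stratified analysis in a single adjusted estimate is the Mantel-Haenszel pooled odds ratio. The slides do not name this estimator, so the sketch below is offered as an illustration rather than as the method used in the lecture. It applies the estimator to the coffee and pancreatic cancer counts from Example #2; the function name `mantel_haenszel_or` is an illustrative choice.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls,
    c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR: sum(a*d/n) / sum(b*c/n) over strata,
    where n is the total number of subjects in each stratum."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Coffee drinking (exposure) by case/control status, within smoking strata.
smokers = (25, 10, 25, 10)
nonsmokers = (5, 8, 45, 72)

crude_or = odds_ratio(30, 18, 70, 82)
adjusted_or = mantel_haenszel_or([smokers, nonsmokers])

print(f"crude OR:              {crude_or:.2f}")
print(f"smoking-adjusted OR:   {adjusted_or:.2f}")
```

The gap between the crude and the smoking-adjusted estimate is what the conclusion above attributes to confounding by smoking.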

Controlling for confounding Methods to control for confounding in the design: randomization, restriction, matching Methods to control for confounding in the analysis: stratified analysis, multivariate analysis

Restriction (specification) Restrict inclusion into your study to those subjects who have a specific value for a factor Limits the external validity (generalizability) of your findings May limit the sample size

Matching Selecting cases and controls with matching values of the confounding variable Requires special analytical techniques

Adjustment A statistical technique to control for confounders in order to isolate the effects of predictor variables and confounders

Integrating notions of scientific inference Proof does not exist in science! This includes observational and experimental evidence Causation can only be inferred from observations

Hill’s “Criteria” for Causation Often used by epidemiologists, especially in the absence of clear-cut experimental evidence in humans Several epidemiologists contributed to the development and elaboration of these ideas Sir Austin Bradford Hill is credited with their codification Hill called them “aspects” of an exposure-disease association They were applied by the US Surgeon General’s Advisory Committee that prepared the 1964 report on Smoking and Health

Hill’s aspects of E-D association 1. Strength of association 2. Dose-response gradient 3. Temporality 4. Consistency of findings 5. Biological plausibility of hypothesis 6. Coherence of evidence 7. Specificity of association 8. Experimental evidence 9. Analogy with known effects

Strength of association The larger the rate/risk ratio, the less likely it is due entirely to bias An observed association that is weak does not mean that we can rule out a causal interpretation

Dose-response gradient Disease frequency increases or decreases monotonically with the level of exposure We may not be able to rule out nonlinear causal relations: threshold effect, saturation effect, quadratic relation (e.g., alcohol/mortality)

Temporality Exposure must precede the disease A necessary condition for making a causal inference Ability to establish this is largely dependent on study design (a problem for case-control or cross-sectional studies) It is more difficult to establish when investigating diseases for which time of onset is questionable or the induction/latent period is long

Consistency of findings If all studies with a given relation produce similar quantitative results, causal inference is enhanced Particularly important if studies involve different populations, methods or periods Consistencies may actually reflect similar biases (all estimates are distorted) We cannot rule out a causal interpretation just because of inconsistent results, because there are other explanations for such inconsistencies (chance, biases)

Consistency comments Positive results are more likely to be reported Some authors may not point out inconsistencies in their data Preliminary results on “hot” topics may be prematurely published before they are properly reviewed, leading to false inferences Sometimes the editorial and peer-review process of professional journals does not work well, leading to incorrect or misleading reports

Biological plausibility of hypothesis If the hypothesized effect makes sense in the context of current biological theory and knowledge, we are more likely to accept a causal interpretation of the observed association Current state of knowledge may be inadequate to explain our observations For example, the effects of contaminated water on typhoid fever were originally challenged by critics who argued it was not biologically plausible

Coherence of evidence If the findings do not seriously conflict with our understanding of the natural history of the disease or with other accepted facts about disease occurrence, a causal interpretation is strengthened Combines aspects of consistency and biological plausibility

Specificity of association If a specific exposure is found to be associated with only one disease, or if a disease is found to be associated with only one exposure, a causal interpretation is suggested Not given much credibility or weight in contemporary epidemiology Many factors have multiple health effects Diseases have multiple causes

Experimental evidence The condition can be altered (prevented or ameliorated) by an appropriate experimental regimen Logically, experimental evidence is not a criterion, but a test of the causal hypothesis, which is unavailable in many (if not most) circumstances

Analogy Ability to look to an analogous exposure-disease relationship Absence of such analogies only reflects a lack of imagination or experience, not falsity of the hypothesis

Using these aspects to infer causality While these aspects may be helpful, there is no set of rules for assessing the extent to which aspects are met or for weighing each aspect relative to the others Only one aspect, temporality, is necessary, and none is sufficient for making a causal interpretation

Exercise What are potential confounders in your study?