Causal Directed Acyclic Graphs (DAG) (Causal Diagrams), 2013
Eyal Shahar, MD, MPH, Professor

What is a causal diagram?
• Components
  ◦ Variables
  ◦ Unidirectional arrows
[Diagram: variables A, B, C, D, and E connected by arrows]

Rules: displaying variables
• Called “nodes” or “vertices”
• Should be clearly understood by others
• Variables, not values of variables
  ◦ “Smoking status” is okay; “smoking” is not
• Displayed along the time axis (left to right)
  ◦ but sometimes we ignore this rule

Rules: drawing arrows
• An arrow
  ◦ From a postulated cause to its postulated effect
  ◦ No bidirectional arrows
• An arrow with a question mark
  ◦ The research question at hand
• An arrow without a question mark
  ◦ Background theory or axiomatic
[Diagrams: arrows among A, B, and C; a question-mark arrow between A and B]

Rules: drawing arrows
• Directed Acyclic Graph
  ◦ Circularity does not exist
  ◦ A future effect cannot be a cause of its cause in the past
• So-called “circularity”
  ◦ Directed acyclic graph with time-indexed variables
[Diagram: variables A, B, C and the time-indexed variables A(t=1), B(t=2), A(t=3), B(t=4)]
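
The acyclicity rule can be checked mechanically. The sketch below is illustrative only: the edge lists are assumptions (a two-variable "circular" graph and a time-indexed chain in the spirit of the slide), and the function name `is_acyclic` is hypothetical. It uses a standard topological-sort argument: a directed graph is acyclic exactly when every node can be removed in cause-before-effect order.

```python
from collections import deque

def is_acyclic(edges):
    """Return True if the directed graph given as (cause, effect) pairs has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1
    # Repeatedly remove variables that have no remaining causes.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # If some variables could never be removed, they sit on a cycle.
    return removed == len(nodes)

# Assumed example: "A causes B and B causes A" drawn without time indices.
circular = [("A", "B"), ("B", "A")]
# The same story with time-indexed variables is a chain, not a cycle.
time_indexed = [("A_t1", "B_t2"), ("B_t2", "A_t3"), ("A_t3", "B_t4")]

print(is_acyclic(circular))      # False
print(is_acyclic(time_indexed))  # True
```

Indexing each variable by time turns the apparent loop into a chain, which is why the time-indexed version passes the check.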

Example: a causal diagram for gastroesophageal reflux and esophageal disease
[Diagram: time-indexed variables R1, R2, S1, S2, T, I1, I2, D1, D2, D1dx, and D2dx, with one arrow marked “?”]
Legend: R = reflux; S = symptoms; T = treatment; I = imaging; D = esophagus status; Ddx = diagnosed esophagus status

How does a causal diagram help in research?
• Decodes causal assertions
  ◦ All of science is about causation!
• Clarifies our wordy or vague causal thoughts about the research topic
• Connects “association” with “causation”
• Helps us decide which covariates should enter the statistical model, and which should not
• Unifies our understanding of confounding bias, colliding bias, information bias (and three other, less well-known biases)
• Can depict and explain all types of bias

PubMed search (through 2012)
• “Causal diagrams”: 83 titles
• “Directed acyclic graph”: 137 titles (some irrelevant)
• Still not widely known
• Rarely used

Some references
• Pearl J. Causality: models, reasoning, and inference. 2000. Cambridge University Press (2009, second edition)
• Greenland S et al. Causal diagrams for epidemiologic research. Epidemiology 1999; 10: 37-48
• Robins JM. Data, design, and background knowledge in etiologic inference. Epidemiology 2001; 12: 313-320
• Hernán MA et al. A structural approach to selection bias. Epidemiology 2004; 15: 615-625
• Shahar E, Shahar DJ. Causal diagrams, information bias, and thought bias. Pragmatic and Observational Research 2010; 1: 33-47
• Shahar E, Shahar DJ. Causal diagrams and three pairs of biases. In: Epidemiology – Current Perspectives on Research and Practice (Lunet N, Editor). www.intechopen.com/books/epidemiology-current-perspectives-on-research-and-practice, 2012: pp. 31-62 (reading material for this module)

A natural path between two variables
• Formally: a sequence of arrows, regardless of their direction, that connects two variables (and does not pass more than once through each variable)
• Informally: “can walk from A to Z, or from Z to A, on bridges”
[Diagrams: example natural paths between A and Z, passing through zero or more of the variables B, C, D, and E]
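
The formal definition translates directly into a path search that ignores arrow direction and never revisits a variable. This is a minimal sketch, not a standard library routine: the function name `natural_paths` and the four-arrow example diagram are assumptions made for illustration.

```python
def natural_paths(edges, start, end):
    """List every path from start to end that ignores arrow direction
    and passes through each variable at most once."""
    neighbors = {}
    for cause, effect in edges:
        neighbors.setdefault(cause, set()).add(effect)
        neighbors.setdefault(effect, set()).add(cause)

    paths = []

    def walk(node, visited):
        if node == end:
            paths.append(visited)
            return
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in visited:
                walk(nxt, visited + [nxt])

    walk(start, [start])
    return paths

# Hypothetical diagram: A -> B -> Z and A <- C -> Z.
example_edges = [("A", "B"), ("B", "Z"), ("C", "A"), ("C", "Z")]
for path in natural_paths(example_edges, "A", "Z"):
    print(" - ".join(path))
# A - B - Z
# A - C - Z
```

The search treats every arrow as a two-way bridge, which matches the informal description; the direction of the arrows only matters later, when each path is named.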

Types of natural paths between two variables
• Causal paths
• Confounding paths
• Colliding paths

A causal path between two variables (also called “directed path”)
• A natural path between A and Z, in which all the arrows point in the same direction (hence, “directed path”)
• “A is a cause of Z” or “Z is a cause of A”
[Diagrams: examples of directed paths between A and Z, with and without intermediate variables]

“Direct” versus “indirect” causal path
[Diagram: an indirect causal path from A to Z through B, and a “direct” causal path from A to Z]
• “Direct” is often (maybe always) an over-simplification
  ◦ Is it really direct?
  ◦ No intermediary exists?
• Better terminology: “causal paths in which no intermediary variables are known or displayed”
• Overall (total) effect: by all directed paths (combined)

A confounding path between two variables
• A natural path between A and Z that contains a shared cause of A and Z on this path (a confounder)
[Diagrams: C as a shared cause of A and Z; in a second diagram the path also passes through X. Alternative display: A ← C → Z and A ← C → X → Z]

A colliding path between two variables
• A natural path between A and Z that contains at least two arrowheads that “collide” at some variable along this path (a collider on the path)
[Diagrams: A and Z collide at C; in a second diagram the path passes through K, L, and M with a collider along it. Alternative display: A → C ← Z and A K M L Z]

Side point: collider (and confounder) are path-specific terms
• A variable called a collider (or a confounder) on one path need not be a collider (or a confounder) on another path
• C is a collider on one path (A B C D Z) and a confounder on another path (A C Z)
[Diagram: variables A, B, C, D, and Z]

Identify and name each natural path between A and Z
[Diagram: variables A, Z, K, L, M, P, Q, R, and S]

A bridge to “association”
• What is “association”?
  ◦ Mathematical phenomenon
  ◦ Ability to guess the value of one variable based on the value of another variable
• Are there “spurious associations”?
  ◦ Mathematical relation between variables is never “spurious”
  ◦ Poor word choice
  ◦ “The association of A with Z is spurious.” What does the writer have in mind, though?
• What creates associations?
  ◦ A causal structure

A bridge between natural paths and associations
• Which natural paths between A and Z contribute to the marginal (crude) association between A and Z? (Open paths)
  ◦ Causal paths
  ◦ Confounding paths
• Which natural paths between A and Z do not contribute to an association between A and Z? (Blocked paths)
  ◦ Colliding paths
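
Naming a natural path only requires reading the arrow directions along it. The sketch below is one way to do that, under stated assumptions: the function name `classify_path` is hypothetical, and the edge list is an assumed diagram in which C is a collider on the path A B C D Z and a confounder on the path A C Z, plus an assumed research-question arrow from A to Z.

```python
def classify_path(path, edges):
    """Name a natural path: causal (open), confounding (open), or colliding (blocked)."""
    arrows = set(edges)
    # forward[i] is True when the arrow points from path[i] to path[i + 1].
    forward = []
    for a, b in zip(path, path[1:]):
        if (a, b) in arrows:
            forward.append(True)
        elif (b, a) in arrows:
            forward.append(False)
        else:
            raise ValueError(f"no arrow between {a} and {b}")
    # A collider receives an arrowhead from both of its neighbours on the path.
    for into_node, out_of_node in zip(forward, forward[1:]):
        if into_node and not out_of_node:
            return "colliding (blocked)"
    # All arrows point the same way: a directed (causal) path.
    if all(forward) or not any(forward):
        return "causal (open)"
    # No collider and mixed directions: the path contains a shared cause.
    return "confounding (open)"

# Assumed diagram (not taken verbatim from the slides).
edges = [("B", "A"), ("B", "C"), ("D", "C"), ("D", "Z"),
         ("C", "A"), ("C", "Z"), ("A", "Z")]

print(classify_path(["A", "Z"], edges))                 # causal (open)
print(classify_path(["A", "C", "Z"], edges))            # confounding (open)
print(classify_path(["A", "B", "C", "D", "Z"], edges))  # colliding (blocked)
```

The same variable C comes out as a confounder on one path and a collider on another, because the classification looks only at the arrows on the path being examined.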

Identify open paths and blocked paths (between A and Z) in this diagram
[Diagram: variables A, B, C, D, and Z, followed by the answer: the open paths and the blocked paths between A and Z, each drawn separately]

When does an association between A and Z reflect the effect of A on Z?
• When only causal paths contribute to the association between A and Z
• When confounding paths do not exist, or are somehow blocked
  ◦ Almost true: not a sufficient condition

How do we block a confounding path?
• By conditioning on some variable along the path
• What is “conditioning” on a variable?
  ◦ Restricting the variable to one of its values
  ◦ Various forms of “adjustment”
    ▪ Standardization
    ▪ Stratification and a weighted average (Mantel-Haenszel)
    ▪ Adding an independent variable to a regression model
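
A small simulation can show what conditioning by stratification does to a confounding path. Everything in this sketch is assumed for illustration: the probabilities, the sample size, and the helper names `risk_difference` and `mantel_haenszel_rd`. C causes both A and Z, and A has no effect on Z, so any crude association comes from the confounding path alone. The summary uses the Mantel-Haenszel weights n1·n0/n for a risk difference.

```python
import random

def risk_difference(rows):
    """Risk of Z among A=1 minus risk of Z among A=0, for (a, z) pairs."""
    exposed = [z for a, z in rows if a == 1]
    unexposed = [z for a, z in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def mantel_haenszel_rd(strata):
    """Weighted average of stratum-specific risk differences (weights n1 * n0 / n)."""
    numerator = denominator = 0.0
    for rows in strata:
        n1 = sum(1 for a, _ in rows if a == 1)
        n0 = len(rows) - n1
        weight = n1 * n0 / len(rows)
        numerator += weight * risk_difference(rows)
        denominator += weight
    return numerator / denominator

random.seed(0)
data = []
for _ in range(100_000):
    c = int(random.random() < 0.5)                   # confounder
    a = int(random.random() < (0.8 if c else 0.2))   # C -> A
    z = int(random.random() < (0.4 if c else 0.1))   # C -> Z, no A -> Z
    data.append((c, a, z))

crude = risk_difference([(a, z) for _, a, z in data])
strata = [[(a, z) for c, a, z in data if c == level] for level in (0, 1)]
adjusted = mantel_haenszel_rd(strata)
print(f"crude RD = {crude:.3f}, Mantel-Haenszel RD = {adjusted:.3f}")
```

Under these assumed probabilities the crude risk difference should be clearly positive, while the stratified summary should be close to zero, which is what blocking the path A ← C → Z is expected to do.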

Conditioning on a variable…
• Dissociates a variable from its causes and its effects
• Turns an open natural path into a blocked path
[Diagrams: a conditioned variable V among the variables A, B, C, X, Y, and Z; the path A V Z blocked by conditioning on V]

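Deconfounding = blocking a confounding path
[Diagrams: a confounding path between A and Z through C (and X), alongside the question-mark arrow between A and Z; conditioning on C blocks the path. “But what if?” introduces a further diagram with X, A, and Z]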

Induced paths
• Conditioning on a collider creates (or contributes to) the association between the colliding variables
[Diagrams: A and Z colliding at C; a longer path through K, L, and M]
• Why?
  ◦ Later…

Induced paths
• An induced path may contain
  ◦ Only dashed lines
  ◦ Dashed lines and arrows
  ◦ Colliders
• An induced path may be blocked or open
• An induced path is blocked
  ◦ If there is at least one collider on the path
• An induced path is open
  ◦ If there are no colliders on the path

Blocked induced paths
[Diagrams: two examples, each pairing a blocked natural path with a blocked induced path, over the variables A, B, C, D, E, and Z]

Open induced paths
[Diagrams: two examples, each pairing a blocked natural path with an open induced path, over the variables A, B, C, D, E, and Z]

Confounding bias and colliding bias
• A confounding path contributes to the (marginal) association between A and Z
  ◦ This unwanted contribution is called confounding bias
• An open induced path contributes to the (conditional) association between A and Z
  ◦ This unwanted contribution is called colliding bias

Can we block an open induced path? Yes
• We can eliminate these paths by conditioning on C
[Diagrams: open induced paths over the variables A, B, C, D, E, and Z]

Key questions
• Why does a collider block a path?
  ◦ Why don’t we observe an association between colliding variables?
• Why does conditioning on a collider create an association between the colliding variables?
[Diagrams: a blocked path (A and Z collide at C) and an open induced path (A and Z after conditioning on C)]

Intuitive explanation
• A sample of N patients
• Variables
  ◦ M: meningitis status (yes, no)
  ◦ S: stroke status (yes, no)
  ◦ V: vital status (alive, dead)
• Assume: causal reality is fully described in the diagram
[Diagram: M → V ← S]

Is there a marginal (crude) association between meningitis status and stroke status?
• No, we cannot guess stroke status from meningitis status (or vice versa)
• Intuition: a common effect (vital status) cannot induce an association between its (past) causes
• There is no transfer of guesses across a collider
  ◦ A colliding path is a blocked path

Suppose we condition on V (vital status)…
Stratum 1 (V=alive): alive patients
  Pt | Stroke status | Vital status | Meningitis status
  1  | No            | Alive        | ? (my guess: “No”)
Stratum 2 (V=dead): dead patients
  Pt | Stroke status | Vital status | Meningitis status
  2  | No            | Dead         | ? (my guess: “Yes”)
• We can make some guesses after conditioning
• M (meningitis status) and S (stroke status) are associated within the strata of V (the collider)
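
The guessing argument can be reproduced with simulated patients. The numbers below are assumptions chosen for illustration, not data from the lecture: meningitis and stroke each occur in 20% of patients, independently, and the risk of death is 5% with neither condition, 50% with one, and 80% with both.

```python
import random

random.seed(1)
patients = []
for _ in range(100_000):
    m = int(random.random() < 0.2)               # meningitis, independent of stroke
    s = int(random.random() < 0.2)               # stroke, independent of meningitis
    p_death = {0: 0.05, 1: 0.5, 2: 0.8}[m + s]   # assumed death risks
    dead = int(random.random() < p_death)        # V is a common effect of M and S
    patients.append((m, s, dead))

def p_meningitis(rows):
    """Proportion of patients with meningitis in a group."""
    return sum(m for m, _, _ in rows) / len(rows)

stroke = [r for r in patients if r[1] == 1]
no_stroke = [r for r in patients if r[1] == 0]

# Marginal (crude) comparison: does stroke status help guess meningitis status?
crude_gap = p_meningitis(stroke) - p_meningitis(no_stroke)

# Conditional comparison: patients without stroke, by vital status.
p_alive = p_meningitis([r for r in no_stroke if r[2] == 0])
p_dead = p_meningitis([r for r in no_stroke if r[2] == 1])

print(f"crude difference in P(meningitis): {crude_gap:.3f}")
print(f"P(meningitis | no stroke, alive): {p_alive:.3f}")
print(f"P(meningitis | no stroke, dead):  {p_dead:.3f}")
```

With these assumed risks, the crude difference should be near zero, the alive patient without stroke should be unlikely to have meningitis, and the dead patient without stroke should be likely to have it, in line with the two guesses on the slide.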

Before and after conditioning…
[Diagrams: a blocked path (M and S collide at V) and an open induced path (M and S after conditioning on V)]

Theorem and implications
• Theorem
  ◦ Colliding variables will be associated within at least one stratum of their collider
• Implications
  ◦ A Mantel-Haenszel summary measure of association will differ from the crude, if we summarize across a collider
  ◦ A regression coefficient will change if we “adjust” for a collider
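
The regression implication can be illustrated with continuous variables. This is a sketch under assumed conditions: A and Z are independent standard normal variables, the collider is C = A + Z + noise, and the least-squares coefficients are computed from covariances so that no external library is needed. The helper name `cov` and the variable names are hypothetical.

```python
import random

def cov(x, y):
    """Population covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)

random.seed(2)
n = 50_000
a = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]                  # no effect of A on Z
c = [ai + zi + random.gauss(0, 1) for ai, zi in zip(a, z)]  # A -> C <- Z

# Simple regression of Z on A.
crude_coef = cov(a, z) / cov(a, a)

# Regression of Z on A and C: coefficient of A from the two-predictor normal equations.
s_aa, s_cc, s_ac = cov(a, a), cov(c, c), cov(a, c)
s_az, s_cz = cov(a, z), cov(c, z)
adjusted_coef = (s_cc * s_az - s_ac * s_cz) / (s_aa * s_cc - s_ac ** 2)

print(f"coefficient of A, crude: {crude_coef:.3f}")
print(f"coefficient of A, 'adjusted' for the collider: {adjusted_coef:.3f}")
```

In this assumed setup the crude coefficient should be close to 0 and the "adjusted" coefficient close to -0.5, even though A has no effect on Z: the change in the coefficient reflects colliding bias, not removal of confounding.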

Goal: estimate a measure of effect (causation) by a measure of association
• Association is estimating causation (A → Z) when:
  ◦ The association between A and Z is due only to A → Z
    ▪ direct and indirect paths combined
• Methods
  ◦ Display variables and causal assumptions in a causal diagram
  ◦ Block all confounding paths between A and Z
  ◦ Do not create open induced paths between A and Z
    ▪ or eliminate them, if created

Confounding bias (again)
• The most widely known
• Historical definitions and identification methods
  ◦ “Lack of exchangeability”
  ◦ “Mixed effects”
  ◦ “Non-collapsibility”
  ◦ “Change-in-estimate”
• A fair amount of confusion
• The basic causal structure
[Diagram: C as a common cause of A and Z; question-mark arrow between A and Z]

So what is a confounder?
• A confounder is a common cause of the exposure (A) and the disease (Z)
[Diagrams: a confounder C of A and Z; a confounding path between A and Z through B, C, and D]
Note: we can block the path by conditioning on B or C or D.

Endless complexity
Exposure: E0 (baseline exposure)
Disease: D2 (follow-up)
Question: Which is the confounder?
[Diagram: time-indexed variables Q−3 to Q0, E−3 to E1, and D−2 to D2]

Colliding bias
• Formerly known as “selection bias”
• Confusing names and types
  ◦ “No representativeness”
  ◦ “Biased sample”
  ◦ “Convenient sampling”
  ◦ “Control-selection bias”
  ◦ “Survival bias”
  ◦ “Informative censoring”
• The basic causal structure
[Diagram: A and Z collide at C; question-mark arrow between A and Z]

But there are many more versions
[Diagrams: four variants of the colliding structure involving A, Z, and C, some with additional variables X and Y, each with a question-mark arrow between A and Z]

Confounder versus collider
[Diagram: A and Z with a confounder (common cause) and a collider (common effect)]

Confounding bias and colliding bias: an antithetical pair
[Diagrams: with a confounder C, bias before conditioning on C and no bias after; with a collider C, no bias before conditioning on C and bias after]

Even more impressive in text…
                          | Confounder                                         | Collider
Main attribute            | common cause                                       | common effect
Association               | contributes to the association between its effects | does not contribute to the association between its causes
Type of path              | open path                                          | blocked path
Effect of conditioning    | blocked path                                       | open path
Bias before conditioning? | Yes, confounding bias                              | No
Bias after conditioning?  | No                                                 | Yes, colliding bias

What is selection bias?
• A type of colliding bias
• Should be called “sampling colliding bias”

Types of colliding bias
• Sampling colliding bias
  ◦ Every study is restricted to selected people
  ◦ Inevitable conditioning on “selection status” (S)
  ◦ Sometimes, this unavoidable conditioning creates colliding bias
• Analytical colliding bias
  ◦ Restricted analysis: computing association for one stratum of a collider
  ◦ Stratified analysis: computing association for each stratum of a collider
  ◦ Adjustment by analysis
    ▪ Computing a weighted average across the collider
    ▪ Adding the collider to a regression model, as a covariate

Sampling colliding bias: a wrong sampling decision
• What happens if we estimate the effect of marital status (A) on dementia status (Z) in a sample of nursing home residents?
  ◦ Restricting recruitment to nursing home residents
• Assumptions
  ◦ No effect of A on Z
  ◦ Both variables affect “place of residence” (P) (nursing home or elsewhere)

Causal diagrams
[Diagrams: A (marital status) and Z (dementia status) both affect P (place of residence), shown together with S (selection status); the structure is drawn twice]

Sampling colliding bias: a wrong sampling decision
• What happens if we estimate the effect of coughing status (A) on abdominal pain status (Z) in a sample of hospitalized patients?
  ◦ Restricting recruitment to hospitalized patients
• Assumptions
  ◦ Displayed in the diagram (next slide)
  ◦ H is hospitalization status

Causal diagram
[Diagram: pneumonia status, A (coughing status), ulcer status, Z (abdominal pain status), H (hospitalization status), and S (selection status); question-mark arrow between A and Z]

Basic causal diagrams for every case-control study
• The key feature of a case-control study
  ◦ Disease status affects selection into the case-control sample
  ◦ Diseased people are much more likely to be selected than disease-free people
[Diagrams: A ? Z with Z → S (selection status), drawn twice]
No bias, unless we mistakenly create an open path between A and S!

Sampling colliding bias: a wrong sampling decision
• Research question: What is the effect of smoking status (A) on cancer status (Z)?
• Design: Hospital-based case-control study
• Controls: patients with cardiovascular disease (CVD)

Causal diagram: smoking and cancer
[Diagram: A (smoking status) ? Z (cancer status), CVD status, and S. Arrow labels: “Background knowledge”, “Always exists in a case-control study”, and “Sampling decision for controls”]
Note: CVD and Z collide at S

Colliding bias (AKA control selection bias)
[Diagram: A (smoking status) ? Z (cancer status), CVD status, and S]

Sampling colliding bias: Willingness to participate in a case-control study
[Diagram: A (smoking status) ? Z (cancer status), “Willing to participate?”, and S; one arrow is labeled “Background knowledge”]

Control (or case) selection bias
• Two main mechanisms
[Diagrams: two versions of A ? Z with a variable B and S (sampling/participation of controls or cases)]
• Remember:
  ◦ Z → S always exists
  ◦ We always condition on S

Types of colliding bias
• Sampling colliding bias
  ◦ Every study is restricted to selected people
  ◦ Inevitable conditioning on “selection status” (S)
  ◦ Sometimes, this unavoidable conditioning creates colliding bias
• Analytical colliding bias
  ◦ Restricted analysis: computing association for one stratum of a collider
  ◦ Stratified analysis: computing association for each stratum of a collider
  ◦ Adjustment by analysis
    ▪ Computing a weighted average across the collider
    ▪ Adding the collider to a regression model, as a covariate

Analytical colliding bias: restricted analysis
• Research question: what is the effect of dietary fibers on colon polyps?
• Design: a cross-sectional study
• Analysis: restricted to people who have not yet developed colon cancer

Causal diagram
[Diagram: A (dietary fibers) ? Z (colon polyp status); A and Z each point to colon cancer status, with both arrows labeled “Assumed knowledge”]
Note: A and Z collide at colon cancer status

Analytical colliding bias
[Diagram: A (dietary fibers) ? Z (colon polyp status), and colon cancer status]
Despite “intuition”, we should not restrict the sample to cancer-free people

Analytical colliding bias: adjustment
• “We adjusted for everything but the kitchen sink”
• Traditional steps
  ◦ Add a laundry list of covariates to the regression model
  ◦ See what happens to the exposure coefficient
  ◦ Use the “change-in-estimate” method
    ▪ Change in the coefficient = Evidence for confounding
  ◦ Report the “adjusted” coefficient as a better (less confounded) measure of effect
• Prone to colliding bias

Analytical colliding bias
• Research question: what is the overall effect of gender on blood pressure?
• Design: a cross-sectional study
• Analysis
  ◦ Crude mean difference in systolic blood pressure
  ◦ “Adjusted” mean difference (conditioned on waist circumference)

Results
Analysis                           | Mean SBP men (mmHg) | Mean SBP women (mmHg) | Mean difference (mmHg)
Crude                              | 123.8               | 122.1                 | 1.7
“Adjusted” for waist circumference |                     |                       | -3.1
• Why do the estimates differ?
• Which estimate should be reported?
• Is the adjusted estimate less biased?

Is abdominal fat (measured by waist circumference) a confounder? No!
[Diagram: A (gender), C (abdominal fat), and Z (blood pressure)]

Revised diagram
[Diagram: A (gender), C (abdominal fat), and Z (blood pressure)]
• No need to “adjust for” abdominal fat
• “Adjustment” could have:
  ◦ blocked a causal path
  ◦ created colliding bias

Could have blocked a causal path…
[Diagram: A (gender), C (abdominal fat), and Z (blood pressure), with C on a causal path from A to Z]
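
A worked equation may help show why blocking a causal path changes the estimate. The linear model below is an assumption made for illustration (the slides do not specify a model), and the coefficients α, β, and γ are hypothetical. It also assumes no other shared causes of C and Z.

```latex
\begin{aligned}
C &= \alpha A + \varepsilon_C
  && \text{(A affects C)} \\
Z &= \beta A + \gamma C + \varepsilon_Z
  && \text{(A and C both affect Z)} \\[4pt]
\text{Overall effect of } A \text{ on } Z
  &= \underbrace{\beta}_{\text{path not through } C}
   + \underbrace{\alpha\gamma}_{\text{path through } C} \\[4pt]
\text{Coefficient of } A \text{ after conditioning on } C
  &= \beta
\end{aligned}
```

Under these assumptions, "adjusting" for C removes the contribution αγ, so the adjusted coefficient no longer estimates the overall effect named in the research question.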

Could have created colliding bias…
[Diagram: A (gender), C (abdominal fat), Z (blood pressure), and an additional variable U, with C as a collider]

Advice on multivariable regression
• Do not adjust for an effect of the exposure
• Do not adjust for an effect of the outcome
• Select covariates according to theory (causal diagram), not mechanistically (change in estimate, stepwise regression)
• “Every variable is adjusted for all others” is almost always false
  ◦ Confounding is not a reciprocal property

Key points
• The essence of epidemiology (and all of science) is causal theories
• Your theories (about causation) are not “A is associated with Z”
  ◦ “Possessing a cigarette lighter is associated with lung cancer” is true, but who cares? That’s not causal knowledge
• Your theories about bias are not “intuition” about bias; they are causal theories, too.
• Almost every theory in science is about causation, which means an arrow between variables

Key points
• Magnitude of bias is more important than merely its presence
  ◦ Small bias may be ignored
  ◦ Magnitude of bias may be difficult to estimate
• The bias-variance tradeoff