ROBUSTNESS OF CAUSAL CLAIMS Judea Pearl, Computer Science Department, UCLA, www.cs.ucla.edu/~judea
ROBUSTNESS: MOTIVATION [Diagram: x (Smoking) → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y.] The effect of smoking on cancer is, in general, non-identifiable (from observational studies). In linear systems: y = ax + e, and a is non-identifiable.
ROBUSTNESS: MOTIVATION [Diagram: Z (Price of Cigarettes) → x (Smoking) with coefficient b; x → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y.] Z is an instrumental variable: cov(z, u) = 0, so a is identifiable.
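The instrumental-variable idea on this slide can be illustrated with a small simulation. This is only a sketch under assumed settings: the coefficient values, the unit-variance Gaussian noise, and the helper names `simulate_iv`, `ols_slope`, and `iv_estimate` are hypothetical and not from the slides. It uses the standard linear IV ratio cov(z, y) / cov(z, x) and contrasts it with the ordinary regression slope, which is biased by the unobserved u.

```python
import numpy as np

def simulate_iv(a=0.5, b=0.8, n=200_000, seed=0):
    """Draw samples from z -> x -> y with an unobserved confounder u of x and y.

    All parameter values here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                 # instrument, independent of u
    u = rng.normal(size=n)                 # unobserved genetic factor
    x = b * z + u + rng.normal(size=n)     # smoking
    y = a * x + u + rng.normal(size=n)     # cancer
    return z, x, y

def cov(p, q):
    """Sample covariance of two 1-D arrays."""
    return np.cov(p, q)[0, 1]

def ols_slope(x, y):
    """Regression slope of y on x; confounded by u in this model."""
    return cov(x, y) / np.var(x, ddof=1)

def iv_estimate(z, x, y):
    """Linear IV ratio estimate of a, valid when cov(z, u) = 0."""
    return cov(z, y) / cov(z, x)

z, x, y = simulate_iv()
a_ols = ols_slope(x, y)
a_iv = iv_estimate(z, x, y)
print(f"regression slope: {a_ols:.3f}, IV estimate: {a_iv:.3f}")
```

In this setup the regression slope overshoots the true a because u raises both x and y, while the IV ratio stays close to a as long as the assumption cov(z, u) = 0 holds.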
ROBUSTNESS: MOTIVATION [Diagram: Z (Price of Cigarettes) → x (Smoking) with coefficient b; x → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y.] Problem with instrumental variables: the model may be wrong!
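ROBUSTNESS: MOTIVATION [Diagram: Z1 (Price of Cigarettes) → x with coefficient b; Z2 (Peer Pressure) → x with coefficient g; x (Smoking) → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y.] Solution: invoke several instruments. Surprise: if the two resulting estimates agree, a1 = a2, the model is likely correct.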
ROBUSTNESS: MOTIVATION [Diagram: instruments Z1 (Price of Cigarettes), Z2 (Peer Pressure), Z3 (Anti-smoking Legislation), ..., Zn → x (Smoking); x → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y.] Greater surprise: a1 = a2 = a3 = … = an = q. The claim a = q is highly likely to be correct.
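The "surprise" argument can be sketched numerically: when every instrument is valid, the separate IV estimates coincide, and when one instrument violates the model, its estimate drifts away from the others. The function name `iv_estimates`, the instrument strengths, and the `direct` parameter (a hypothetical direct effect of an instrument on y, which is one way an instrument can be invalid) are assumptions made for illustration.

```python
import numpy as np

def iv_estimates(a=0.5, strengths=(0.8, 0.6, 0.7),
                 direct=(0.0, 0.0, 0.0), n=200_000, seed=1):
    """Return one IV estimate of a per instrument.

    strengths[i] is the effect of instrument i on x.
    direct[i] is a hypothetical direct effect of instrument i on y;
    any nonzero value makes that instrument invalid.
    """
    rng = np.random.default_rng(seed)
    k = len(strengths)
    z = rng.normal(size=(n, k))            # mutually independent instruments
    u = rng.normal(size=n)                 # unobserved confounder
    x = z @ np.array(strengths) + u + rng.normal(size=n)
    y = a * x + z @ np.array(direct) + u + rng.normal(size=n)
    return [np.cov(z[:, i], y)[0, 1] / np.cov(z[:, i], x)[0, 1]
            for i in range(k)]

valid = iv_estimates()                          # all instruments valid
broken = iv_estimates(direct=(0.0, 0.0, 0.35))  # third instrument invalid
print("all valid:  ", np.round(valid, 3))
print("one invalid:", np.round(broken, 3))
```

Agreement among the estimates does not prove the model, but under the assumptions of this sketch a violation in a single instrument shows up as a visible disagreement, which is the intuition behind treating agreement as evidence.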
ROBUSTNESS: MOTIVATION [Diagram: x (Smoking) → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y; y → s (Symptom).] Symptoms do not act as instruments: a remains non-identifiable. Why? Taking a noisy measurement (s) of an observed variable (y) cannot add new information.
ROBUSTNESS: MOTIVATION [Diagram: x (Smoking) → y (Cancer) with coefficient a; Genetic Factors u (unobserved) affect both x and y; y → S1, S2, ..., Sn (Symptoms).] Adding many symptoms does not help: a remains non-identifiable.
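A short derivation may make the symptom argument concrete. It assumes a linear symptom equation with a hypothetical coefficient d and a noise term e_s independent of everything else; neither symbol appears on the slides.

```latex
\begin{align*}
s &= d\,y + e_s, \qquad e_s \text{ independent of } x,\ y,\ u \\
\operatorname{cov}(s, x) &= d\,\operatorname{cov}(y, x) \\
\operatorname{cov}(s, y) &= d\,\operatorname{var}(y) \\
\frac{\operatorname{cov}(s, y)}{\operatorname{cov}(s, x)}
  &= \frac{\operatorname{var}(y)}{\operatorname{cov}(y, x)}
\end{align*}
```

Under these assumptions every covariance involving s is a rescaling of a quantity already observed through y, so treating s as if it were an instrument returns var(y) / cov(y, x) rather than a. The new equations introduce d but no new relation that constrains a.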
ROBUSTNESS: MOTIVATION Given a parameter a (on an edge x → y) in a general graph, find whether a can evoke an equality surprise a1 = a2 = … = an associated with several independent estimands of a. Formulate: surprise, over-identification, independence. Robustness: the degree to which a is robust to violations of model assumptions.
ROBUSTNESS: FORMULATION Bad attempt: Parameter a is robust (over-identified) if a = f1(Σ) = f2(Σ), where f1 and f2 are two distinct functions of the observed covariance matrix Σ.
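ROBUSTNESS: FORMULATION [Diagram: chain x → y → z with coefficients b and c and error terms ex, ey, ez.] Model: x = ex, y = bx + ey, z = cy + ez. Regression coefficients: Ryx = b, Rzx = bc, Rzy = c. Constraint: Rzx = Ryx Rzy. Hence b = Ryx = Rzx / Rzy gives two distinct functions of the data, yet the second follows only from the constraint: the assumptions about y → z are irrelevant to the derivation of b.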
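The chain model on this slide is easy to check by simulation. The sketch below assumes unit-variance Gaussian errors and illustrative values for b and c; the helper `slope` and the variable names are hypothetical. It verifies the three regression coefficients and shows that Rzx / Rzy reproduces b, which is exactly why "two distinct functions" is a poor criterion for robustness.

```python
import numpy as np

def slope(p, q):
    """Regression slope of q on p: cov(p, q) / var(p)."""
    return np.cov(p, q)[0, 1] / np.var(p, ddof=1)

rng = np.random.default_rng(2)
n, b, c = 200_000, 0.7, 0.4        # illustrative values, not from the slides

x = rng.normal(size=n)             # x = ex
y = b * x + rng.normal(size=n)     # y = b x + ey
z = c * y + rng.normal(size=n)     # z = c y + ez

R_yx = slope(x, y)                 # estimates b
R_zy = slope(y, z)                 # estimates c
R_zx = slope(x, z)                 # estimates b * c

b_direct = R_yx                    # uses only the x -> y part of the model
b_via_constraint = R_zx / R_zy     # a second expression, obtained via the constraint
print(f"R_yx={R_yx:.3f}  R_zy={R_zy:.3f}  R_zx={R_zx:.3f}")
print(f"b directly: {b_direct:.3f}, b via constraint: {b_via_constraint:.3f}")
```

The two expressions for b agree here, but the second adds no protection against misspecification of the x → y part, since it leans on the y → z assumptions that play no role in identifying b.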
RELEVANCE: FORMULATION Definition 8 Let A be an assumption embodied in model M, and p a parameter in M. A is said to be relevant to p if and only if there exists a set of assumptions S in M such that S and A sustain the identification of p but S alone does not sustain such identification. Theorem 2 An assumption A is relevant to p if and only if A is a member of a minimal set of assumptions sufficient for identifying p.
ROBUSTNESS: FORMULATION Definition 5 (Degree of over-identification) A parameter p (of model M) is identified to degree k (read: k-identified) if there are k minimal sets of assumptions each yielding a distinct estimand of p.
ROBUSTNESS: FORMULATION [Figure: chain model over x, y, z with coefficients b and c. One panel shows the minimal assumption sets for c as graphs G1, G2, G3; another panel shows the minimal assumption sets for b.]
FROM MINIMAL ASSUMPTION SETS TO MAXIMAL EDGE SUPERGRAPHS; FROM PARAMETERS TO CLAIMS Definition: A claim C is identified to degree k in model M (graph G) if there are k edge supergraphs of G that permit the identification of C, each yielding a distinct estimand. E.g., Claim (total effect): TE(x, z) = q. [Diagram: two edge supergraphs of the chain x → y → z, one yielding TE(x, z) = Rzx and the other yielding TE(x, z) = Ryx Rzy·x.]
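The two estimands of the total effect can be compared on simulated data. This sketch assumes the plain chain x → y → z with unit-variance Gaussian errors and illustrative coefficients, and it reads the second estimand as Ryx · Rzy·x, where Rzy·x is the coefficient of y in the regression of z on y and x. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, b, c = 200_000, 0.7, 0.4        # illustrative values, not from the slides

x = rng.normal(size=n)
y = b * x + rng.normal(size=n)
z = c * y + rng.normal(size=n)

# Estimand 1: regression slope of z on x.
R_zx = np.cov(x, z)[0, 1] / np.var(x, ddof=1)

# Estimand 2: R_yx times the partial regression coefficient R_zy.x.
R_yx = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
design = np.column_stack([y, x, np.ones(n)])        # regress z on y, x, intercept
coef, *_ = np.linalg.lstsq(design, z, rcond=None)
R_zy_x = coef[0]                                    # coefficient of y given x

te_1 = R_zx
te_2 = R_yx * R_zy_x
print(f"TE via R_zx: {te_1:.3f}, TE via R_yx * R_zy.x: {te_2:.3f}")
```

In the chain model both estimands target b · c, so they should agree up to sampling error. Each one remains valid under a different relaxation of the graph, which is what makes the claim identified to degree two in the sense of the definition.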
CONCLUSIONS 1. Formal definition of ROBUSTNESS of causal claims: “A claim is robust when it is insensitive to violations of some of the model assumptions.” 2. Graphical criteria and algorithms for computing the degree of robustness of a given causal claim.