What is: regression discontinuity design? Mike Brewer University of Essex and Institute for Fiscal Studies Part of “Programme Evaluation for Policy Analysis” (PEPA), a Node of the NCRM
Regression discontinuity design: overview • A regression discontinuity design is a way of undertaking causal inference, usually of some policy intervention • It can provide robust, convincing estimates of causal impacts under fairly weak conditions or minimal assumptions • It was invented by psychologists, but labour economists are now realising how applicable it is • The nature of the intervention will determine whether an RDD is appropriate. Even when it is, data demands are often great
What is regression discontinuity? • A regression discontinuity design is appropriate where – a treatment/intervention/policy is given to individuals for whom some measured characteristic lies on one side of a “cut-off” (sharp RD) – AND the characteristic cannot be perfectly manipulated by individuals
(Sharp) regression discontinuity design [Figure: treatment status D_i (0 or 1) plotted against the running variable X, with the cut-off at X = c] • Those above the cut-off are exposed to treatment. To estimate the impact of the treatment, we need a comparison group • An ideal comparison group would have the same values of X but not be treated, but such people do not exist • Those below the cut-off are not exposed to treatment, and some of them are very similar to some who are treated
RDD: the principle • Compare treated outcome for those just above cut-off with untreated outcome for those just below cut-off – This identifies the average treatment effect on subjects at the cut-off • Why does this work? – If the “running variable” cannot be perfectly manipulated, then individuals on either side of the cut-off should be very similar to each other in their observable and unobservable characteristics: it’s as if treatment were randomly assigned – Key assumption: nothing else jumps at cut-off
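The comparison described above can be sketched in a few lines. This is a minimal illustration, not a recommended estimator: the function name, the window width and the data are all made up for the example, and treatment is assumed to apply at or above the cut-off.

```python
from statistics import mean

def local_mean_difference(x, y, cutoff, window):
    """Mean outcome just above the cut-off minus mean outcome just below it.

    Units with cutoff <= x < cutoff + window count as treated (sharp RD);
    units with cutoff - window <= x < cutoff count as untreated.
    """
    above = [yi for xi, yi in zip(x, y) if cutoff <= xi < cutoff + window]
    below = [yi for xi, yi in zip(x, y) if cutoff - window <= xi < cutoff]
    if not above or not below:
        raise ValueError("no observations on one side of the cut-off")
    return mean(above) - mean(below)

# Hypothetical data: the outcome rises gently with x and jumps by 2 at x = 50.
x = [i / 10 for i in range(400, 600)]                      # 40.0, 40.1, ..., 59.9
y = [0.05 * xi + (2.0 if xi >= 50 else 0.0) for xi in x]

estimate = local_mean_difference(x, y, cutoff=50, window=1.0)
```

With this made-up data the difference comes out at about 2.05 rather than 2, because the outcome also trends with x inside the window. That residual bias is the reason formal estimates model the running variable on each side of the cut-off rather than comparing raw means.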
RDD: examples

| Intervention of interest | “Running variable” | Unit of study | Outcomes of interest |
| --- | --- | --- | --- |
| Remedial measures | Exam results | School | Future exam results |
| Scholarship | Test scores | Individual | Drop-out rates, future exam results, earnings |
| Labour market or welfare-to-work (W2W) policies | Age | Individual | Labour supply, earnings |
| Enrolment in school | Date of birth | Children (or parents) | Children’s test scores or future earnings, parents’ labour supply |
| Regulation, payroll taxes | Size | Firm | Labour demand, polluting behaviour |
| Being elected to public office | Vote share | Candidates for public office | Future income/wealth |

See much longer list in Lee and Lemieux, 2010
RDD: implementation • Graphical analysis – Outcome vs running variable either side of cut-off
Example: link between entitlement to unemployment benefit (UB) and length of unemployment [Figure not reproduced] Source: Lalive (2007)
RDD: implementation • Formal estimate – Parametric = OLS. Easy! – Non-parametric means local linear regression
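The non-parametric option can be sketched as follows: fit a separate weighted straight line on each side of the cut-off, using only observations within a bandwidth, and take the gap between the two fitted values at the cut-off. The triangular kernel, the bandwidth and the data below are assumptions made for illustration; the slides do not specify them.

```python
def weighted_line_at_zero(xs, ys, ws):
    """Weighted least-squares line through (xs, ys); returns its value at x = 0."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar

def local_linear_rd(x, y, cutoff, bandwidth):
    """Sharp-RD estimate: gap at the cut-off between two local linear fits.

    Uses a triangular kernel, so observations nearer the cut-off weigh more.
    """
    sides = {"below": ([], [], []), "above": ([], [], [])}
    for xi, yi in zip(x, y):
        d = xi - cutoff                       # centre the running variable
        w = 1.0 - abs(d) / bandwidth          # triangular kernel weight
        if w <= 0:
            continue                          # outside the bandwidth
        xs, ys, ws = sides["above" if d >= 0 else "below"]
        xs.append(d)
        ys.append(yi)
        ws.append(w)
    return (weighted_line_at_zero(*sides["above"])
            - weighted_line_at_zero(*sides["below"]))

# Hypothetical data: different slopes either side, jump of 2 at x = 50.
x = [i / 10 for i in range(400, 600)]
y = [(3.0 + 0.2 * (xi - 50)) if xi >= 50 else (1.0 + 0.5 * (xi - 50)) for xi in x]

estimate = local_linear_rd(x, y, cutoff=50, bandwidth=2.0)
```

Because the made-up data are exactly linear on each side, the estimate recovers the built-in jump of 2. With real data the answer will depend on the bandwidth, which is why it is worth reporting results for several.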
RDD: implementing in OLS [Equation not reproduced; its annotated terms are:] • Indicator for being on the right side of the cut-off: its coefficient measures how the outcome variable jumps at X=c • Terms in X: allow the running variable, X, to affect outcomes according to a quadratic function whose slope changes at X=c • Other covariates • If X is discrete, errors should be allowed to be clustered at the level of the running variable (Card and Lee, 2008)
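One specification consistent with the annotations above is Y = a + t·D + b1·(X−c) + b2·(X−c)^2 + g1·D·(X−c) + g2·D·(X−c)^2 + e, where D = 1 if X ≥ c. This is an assumed reconstruction, since the slide's own equation did not survive, and the symbols are chosen here for illustration. The sketch below estimates it by OLS using only the standard library; it omits the other covariates and does not compute standard errors, clustered or otherwise.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]          # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z

def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

def rd_design_row(xi, cutoff):
    """Regressors: constant, D, (x-c), (x-c)^2, D*(x-c), D*(x-c)^2."""
    d = 1.0 if xi >= cutoff else 0.0
    xc = xi - cutoff
    return [1.0, d, xc, xc ** 2, d * xc, d * xc ** 2]

# Hypothetical noise-free data generated from known coefficients (jump = 2).
true_coefs = [1.0, 2.0, 0.5, 0.1, 0.3, -0.05]
x = [i / 4 for i in range(180, 221)]                      # 45.0 ... 55.0
rows = [rd_design_row(xi, cutoff=50.0) for xi in x]
y = [sum(b * v for b, v in zip(true_coefs, row)) for row in rows]

coefs = ols(rows, y)
jump = coefs[1]            # coefficient on D: the estimated jump at X = c
```

Centring X at the cut-off is what makes the coefficient on D readable directly as the jump at X = c. In practice a regression library would normally be used, which also makes clustered standard errors straightforward.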
RDD: implementation • Sensitivities and robustness checks
RDD: checks • Something other than treatment might cause the jumps – Do pre-treatment variables or explanatory variables jump around cut-off? • Individuals might manipulate running variable – Is density of running variable smooth around cut-off? – Do pre-treatment variables or explanatory variables jump around cut-off? • Distinguish between discontinuity and non-linearity – Are results robust to inclusion of higher-order polynomials? – Are results robust to changing size of “window” around cut-off? • Are there jumps when none expected (“placebo RDDs”)?
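Two of the checks above, varying the window and running placebo RDDs at cut-offs where no jump is expected, can be sketched by re-running one estimator with different arguments. The estimator here is a simple linear fit on each side with equal weights; the function names, windows, placebo cut-offs and data are all illustrative assumptions.

```python
def line_at_zero(points):
    """OLS line through (d, y) points; returns the fitted value at d = 0."""
    n = len(points)
    dbar = sum(d for d, _ in points) / n
    ybar = sum(y for _, y in points) / n
    sdd = sum((d - dbar) ** 2 for d, _ in points)
    sdy = sum((d - dbar) * (y - ybar) for d, y in points)
    return ybar - (sdy / sdd) * dbar

def rd_jump(x, y, cutoff, window):
    """Gap at `cutoff` between separate linear fits within `window` on each side."""
    below = [(xi - cutoff, yi) for xi, yi in zip(x, y) if -window <= xi - cutoff < 0]
    above = [(xi - cutoff, yi) for xi, yi in zip(x, y) if 0 <= xi - cutoff <= window]
    return line_at_zero(above) - line_at_zero(below)

# Hypothetical data: linear trend in x with a jump of 2 at the true cut-off, 50.
x = [i / 10 for i in range(400, 600)]
y = [0.05 * xi + (2.0 if xi >= 50 else 0.0) for xi in x]

# Window sensitivity: the estimate should not swing much as the window changes.
by_window = {w: rd_jump(x, y, 50.0, w) for w in (1.0, 2.0, 4.0)}

# Placebo cut-offs: chosen so each window stays on one side of the true cut-off.
placebo = {c: rd_jump(x, y, c, 2.0) for c in (45.0, 55.0)}
```

The same `rd_jump` function can be pointed at a pre-treatment variable instead of the outcome to check whether it jumps at the cut-off. A higher-order polynomial check would need an estimator with extra terms, which this sketch does not include.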
A non-smooth density function [Figure not reproduced] From McCrary, 2008. The probability of a vote being just lost is a lot lower than that of it being just won
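A very crude version of this density check is to count observations in a narrow window on each side of the cut-off and ask whether the split is further from 50:50 than chance would suggest. This is not McCrary's estimator, which fits the density on each side; it is a simplified stand-in that ignores any slope in the density, and the function name and data are hypothetical.

```python
from math import sqrt

def density_jump_z(x, cutoff, window):
    """Crude bunching check: compare counts just below and just above the cut-off.

    Under the null that units near the cut-off are equally likely to fall on
    either side, the count above is Binomial(n, 0.5). Returns
    (n_below, n_above, z), with z from the normal approximation.
    """
    n_below = sum(1 for xi in x if cutoff - window <= xi < cutoff)
    n_above = sum(1 for xi in x if cutoff <= xi < cutoff + window)
    n = n_below + n_above
    if n == 0:
        raise ValueError("no observations within the window")
    z = (n_above - n_below) / sqrt(n)
    return n_below, n_above, z

# Hypothetical vote shares: far more narrow wins than narrow losses.
losses = [49 + i / 20 for i in range(20)]     # 20 values in [49, 50)
wins = [50 + i / 60 for i in range(60)]       # 60 values in [50, 51)

n_below, n_above, z = density_jump_z(losses + wins, cutoff=50, window=1.0)
```

A large absolute z suggests the density is not smooth at the cut-off, which would cast doubt on the assumption that the running variable cannot be manipulated.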
[Figure not reproduced] From Angrist & Pischke, “Mostly Harmless Econometrics”
Variant: Fuzzy RDD • Fuzzy RDD appropriate when the probability that someone is treated changes discontinuously when a characteristic crosses a “cut-off” • For those close to the cut-off, “being on the right side of the cut-off” is a valid instrument (predicts treatment well, no direct impact on outcome) • Can then estimate impact of treatment through 2SLS (change in outcome either side of cut-off divided by change in treatment either side of cut-off) – Technically, requires a monotonicity assumption and then identifies a LATE: the impact of the treatment on “compliers” at the cut-off
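The ratio described above (change in outcome at the cut-off divided by change in treatment at the cut-off) can be sketched as below. Each jump is taken as the gap between separate equal-weight linear fits on either side; the function names, window and data are assumptions for illustration, and no standard errors are computed.

```python
def line_at_zero(points):
    """OLS line through (d, v) points; returns the fitted value at d = 0."""
    n = len(points)
    dbar = sum(d for d, _ in points) / n
    vbar = sum(v for _, v in points) / n
    sdd = sum((d - dbar) ** 2 for d, _ in points)
    sdv = sum((d - dbar) * (v - vbar) for d, v in points)
    return vbar - (sdv / sdd) * dbar

def jump_at_cutoff(x, v, cutoff, window):
    """Gap at the cut-off in variable v, from linear fits on each side."""
    below = [(xi - cutoff, vi) for xi, vi in zip(x, v) if -window <= xi - cutoff < 0]
    above = [(xi - cutoff, vi) for xi, vi in zip(x, v) if 0 <= xi - cutoff <= window]
    return line_at_zero(above) - line_at_zero(below)

def fuzzy_rd(x, y, treated, cutoff, window):
    """Jump in the outcome divided by jump in the share treated."""
    first_stage = jump_at_cutoff(x, treated, cutoff, window)
    if abs(first_stage) < 1e-12:
        raise ValueError("no jump in treatment at the cut-off")
    return jump_at_cutoff(x, y, cutoff, window) / first_stage

# Hypothetical data: about 20% treated below the cut-off and 80% above it;
# treatment raises the outcome by 3 for everyone who receives it.
idx = range(400, 600)
x = [i / 10 for i in idx]
treated = [(1.0 if i % 5 != 0 else 0.0) if i >= 500 else (1.0 if i % 5 == 0 else 0.0)
           for i in idx]
y = [0.05 * xi + 3.0 * di for xi, di in zip(x, treated)]

late = fuzzy_rd(x, y, treated, cutoff=50, window=5.0)
```

In this made-up example the effect is the same for everyone, so the ratio recovers it. With heterogeneous effects the ratio would instead pick up the effect on compliers at the cut-off, as the slide notes.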
Fuzzy regression discontinuity design [Figure: treatment D_i (0 or 1) plotted against the running variable X, with the cut-off at X = c] Now treatment depends on whether X is bigger than the cut-off c, but this is not the only factor. There is a jump in the fraction who are treated as we cross the cut-off, c.
RDD: assessment • RDDs can provide convincing causal estimates and can be easily implemented via OLS • But – not universally applicable: depends entirely on nature of intervention – focusing on small area around cut-off often requires large amounts of data
References and reading
Where it all came from:
• Thistlethwaite & Campbell, 1960, "Regression-Discontinuity Analysis: An Alternative to the Ex Post Facto Experiment", Journal of Educational Psychology, 51(6): 309-17
To find out more:
• Angrist & Pischke, “Mostly Harmless Econometrics”
• Lee and Lemieux, 2010, "Regression Discontinuity Designs in Economics", Journal of Economic Literature, 48(2), 281–355
• Journal of Econometrics, 2008, 142(2), esp. articles by Imbens & Lemieux, Lalive, Card & Lee, McCrary: http://www.sciencedirect.com/science/journal/03044076/142/2
• For economics examples, see citations in Lee and Lemieux
Some UK examples outside economics:
• Del Bono et al., “Health information and health outcomes: an application of the regression discontinuity design to the 1995 UK contraceptive pill scare case”, ISER WP 2011-16
• Eggers and Hainmueller, “MPs for Sale? Returns to Office in Postwar British Politics”, American Political Science Review, 103, pp 513-533