
Experimental Data

Some Basic Terminology
• Start with example where X is binary (though simple to generalize):
  – X=0 is control group
  – X=1 is treatment group
• Causal effect sometimes called treatment effect
• Randomization implies everyone has same probability of treatment

Why is Randomization Good?
• If X is allocated at random, then X is independent of all pre-treatment variables in the whole wide world
  – an amazing claim, but true
• Implies there cannot be a problem of omitted variables, reverse causality, etc.
• On average, the only reason for a difference in outcomes between treatment and control groups is the different receipt of treatment
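The independence claim can be checked by simulation. The sketch below is illustrative only: the covariate ("age"), its distribution, and the function name `simulate_balance` are assumptions, not part of the original slides. It draws a pre-treatment variable, assigns treatment by coin flip, and compares the covariate mean across the two groups.

```python
import random
import statistics


def simulate_balance(n=10_000, seed=0):
    """Randomly assign treatment and return the treated-minus-control
    gap in the mean of a pre-treatment covariate."""
    rng = random.Random(seed)
    # Hypothetical pre-treatment variable, fixed before assignment.
    age = [rng.gauss(40, 10) for _ in range(n)]
    # Coin-flip assignment: everyone has the same probability of treatment.
    x = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    treated = [a for a, xi in zip(age, x) if xi == 1]
    control = [a for a, xi in zip(age, x) if xi == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


gap = simulate_balance()
print(f"difference in mean age, treated minus control: {gap:.3f}")
```

With a large sample the gap should be close to zero. It is not exactly zero in any single draw, which is why balance holds "on average".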

Why is this useful? An Example: Racial Discrimination
• Black men earn less than white men in US

LOGWAGE    |      Coef.   Std. Err.        t
-----------+---------------------------------
BLACK      |  -.1673813    .0066708   -25.09
NO_HS      |  -.2138331    .0077192   -27.70
SOMECOLL   |   .1104148    .0049139    22.47
COLLEGE    |   .4660205    .0048839    95.42
AGE        |   .0704488    .0008552    82.38
AGESQUARED |  -.0007227    .0000101   -71.41
_cons      |   1.088116    .0172715    63.00

• Could be discrimination or other factors unobserved by the researcher but observed by the employer?
• Hard to fully resolve with non-experimental data

An Experimental Design
• Bertrand/Mullainathan, “Are Emily and Greg More Employable Than Lakisha and Jamal”, American Economic Review, 2004
• Create fake CVs and send them in reply to job adverts
• Allocate names at random to CVs
  – some given ‘black-sounding’ names, others ‘white-sounding’

• Outcome variable is the call-back rate
• Interpretation
  – not a direct measure of racial discrimination, just the effect of having a ‘black-sounding’ name
  – the name may have other connotations
• But name is uncorrelated by construction with other material on CV

The Treatment Effect
• Want estimate of: E(y|X=1)-E(y|X=0)

Estimating Treatment Effects
• Take mean of outcome variable in treatment group
• Take mean of outcome variable in control group
• Take difference between the two
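The three steps above can be sketched in a few lines. The data here are made up purely for illustration (they are not the Bertrand/Mullainathan results), and the function name `difference_in_means` is an assumption. The standard error uses the usual unequal-variance formula for a difference of two independent sample means.

```python
import math
import statistics


def difference_in_means(y, x):
    """Return (estimate, standard error) of the treatment effect,
    where x is 1 for treatment and 0 for control."""
    treated = [yi for yi, xi in zip(y, x) if xi == 1]
    control = [yi for yi, xi in zip(y, x) if xi == 0]
    effect = statistics.fmean(treated) - statistics.fmean(control)
    # Variance of a difference of independent means is the sum of the variances.
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    return effect, se


# Made-up outcomes (e.g. 1 = call-back received) and assignments.
y = [1, 0, 0, 1, 0, 0, 0, 1]
x = [1, 1, 1, 1, 0, 0, 0, 0]
effect, se = difference_in_means(y, x)
print(f"estimated effect = {effect:.3f}, standard error = {se:.3f}")
```

The same point estimate comes out of a regression of y on a constant and X, which is convenient once other regressors are added.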

Bertrand/Mullainathan: Basic Results

Summary So Far
• Econometrics is very easy if all data come from a randomized controlled experiment
• Just need to collect data on treatment/control and outcome variables
• Just need to compare means of outcomes of treatment and control groups
• Is data on other variables of any use at all?
  – Not necessary but useful

Including Other Regressors
• Can get consistent estimate of treatment effect without worrying about other variables
• Reason is that randomization ensures no problem of omitted variables bias
• But there are reasons to include other regressors:
  – Improved efficiency
  – Check for randomization
  – Improve randomization
  – Heterogeneity in treatment effects
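The efficiency argument can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption chosen for illustration: the data-generating process, the true effect of 1.0, the sample size, and the helper names. It compares the simple difference in means with the coefficient on X from a regression that also includes a pre-treatment covariate w, computed by partialling w out of both y and X (the Frisch-Waugh result).

```python
import random
import statistics


def slope(u, v):
    """OLS slope from regressing v on u, with an intercept."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = sum((a - mu) ** 2 for a in u)
    return num / den


def residuals(u, v):
    """Residuals from regressing v on u, with an intercept."""
    b = slope(u, v)
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    return [(vi - mv) - b * (ui - mu) for ui, vi in zip(u, v)]


def one_experiment(rng, n=200, effect=1.0):
    """Simulate one randomized experiment; return both estimates."""
    w = [rng.gauss(0, 1) for _ in range(n)]    # pre-treatment covariate
    x = [rng.randint(0, 1) for _ in range(n)]  # random assignment
    y = [effect * xi + 2.0 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]
    unadjusted = slope(x, y)  # identical to the difference in means
    # Partial w out of y and x, then regress residual on residual.
    adjusted = slope(residuals(w, x), residuals(w, y))
    return unadjusted, adjusted


rng = random.Random(1)
draws = [one_experiment(rng) for _ in range(300)]
unadj = [d[0] for d in draws]
adj = [d[1] for d in draws]
print("mean estimate:", statistics.fmean(unadj), statistics.fmean(adj))
print("std deviation:", statistics.stdev(unadj), statistics.stdev(adj))
```

Both estimators centre on the true effect, but the adjusted one varies less across replications because w soaks up outcome variance that would otherwise sit in the error term.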

What treatment effect to estimate?
• Would like to estimate causal effect for everyone
  – this is not possible
• Can only hope to estimate some average
• Average treatment effect:

Bertrand/Mullainathan • Different treatment effect for high and low quality CVs:

What Do We Want to Estimate?
• Under voluntary participation, the researcher is interested in measuring the effect of being offered the program (Z=1), rather than the effect of the actual treatment
• ITT measures the average impact of offering a program, using the initial random assignment Z to avoid re-introducing selection bias
• ‘Intention-to-Treat’: ITT=E(y|Z=1)-E(y|Z=0)
• Treatment Effect on Treated (TOT)

Estimating TOT
• Under imperfect compliance, the TOT captures the average gain from the program for those who actually get treated
• Can’t use simple regression of y on Z: that recovers the ITT, not the TOT
• But should recognize TOT as Wald estimator
• Can be estimated by regressing y on X using Z as instrument
• Relationship between TOT and ITT:
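One way to see the link between the two is the standard Wald ratio, TOT = ITT / [E(X|Z=1)-E(X|Z=0)], which for a binary instrument coincides with the IV estimate from regressing y on X using Z as instrument. This is a sketch that assumes the ratio is the relationship the slide intended. The data are made up, with one-sided non-compliance (nobody who was not offered the program receives it), and the function name `itt_and_wald` is hypothetical.

```python
import statistics


def itt_and_wald(y, x, z):
    """Return (ITT, Wald estimate) where z is the random offer
    and x is actual receipt of treatment."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    x1 = [xi for xi, zi in zip(x, z) if zi == 1]
    x0 = [xi for xi, zi in zip(x, z) if zi == 0]
    # ITT: difference in mean outcomes by assignment.
    itt = statistics.fmean(y1) - statistics.fmean(y0)
    # Difference in take-up rates between offered and not offered.
    take_up_gap = statistics.fmean(x1) - statistics.fmean(x0)
    return itt, itt / take_up_gap


# Made-up data: 6 offered the program (half take it up), 6 not offered.
z = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
x = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1]
itt, wald = itt_and_wald(y, x, z)
print(f"ITT = {itt:.2f}, Wald estimate = {wald:.2f}")
```

In this example only half of those offered take up treatment, so the ITT is half the size of the effect on those actually treated. The lower the take-up, the more the ITT understates the TOT.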

• Morduch et al. (2013) paper on substitution bias

Spill-overs/Externalities/General Equilibrium Effects
• Have assumed that treatment only affects the outcome for the person who receives it
• Many situations in which this is not true
• E.g. externalities, spill-overs, effects on market prices
• Example: Miguel and Kremer, “Worms: Identifying Impacts on Education and Health in the Presence of Treatment Externalities”, Econometrica 2004

Problems with Experiments
• Expense
• Ethical Issues
• Threats to Internal Validity
  – Failure to follow experiment
  – Experimental effects (Hawthorne effects)
  – Externalities/spillovers
• Threats to External Validity
  – Non-representative sample
  – Scale effects

Conclusions on Experiments
• Are ‘gold standard’ of empirical research
• Are becoming more common
• Study of non-experimental data can deliver useful knowledge