ESTIMATING THE DOSE-RESPONSE FUNCTION THROUGH THE GLM APPROACH

Barbara Guardabascio, Marco Ventura
Italian National Institute of Statistics
7th June 2013, Potsdam

Outline of the talk
• Motivations;
• literature references;
• our contribution to the topic;
• the econometrics of the dose-response;
• how to implement the dose-response;
• our programs;
• applications.

Motivations
• Main question: how effective are public policy programs with continuous treatment exposure?
• Fundamental problem: treated individuals are self-selected; treatment is not randomly assigned.
• (Possible) solution: estimating a dose-response function.

Motivations
• What is a dose-response function? It is the relationship between the treatment and an outcome variable, e.g. birth weight, employment, bank debt, etc.

Motivations
• How can we estimate a dose-response function? It can be estimated by using the Generalized Propensity Score (GPS).

Literature references
1. Propensity Score for binary treatments: Rosenbaum and Rubin (1983, 1984)
2. Propensity Score for categorical treatment variables: Imbens (2000), Lechner (2001)
3. Generalized Propensity Score for continuous treatments: Hirano and Imbens (2004), Imai and van Dyk (2004)

Our contribution
• Ad hoc programs have been provided to Stata users (Bia and Mattei, 2008), but these programs (gpscore.ado and doseresponse.ado) only allow for a Normal distribution of the treatment variable.
• We provide new programs (gpscore2.ado and doseresponse2.ado) that accommodate other, non-Normal distributions.

The econometrics of the dose-response
• {Yi(t)}: set of potential outcomes for t in [t0, t1],
• where [t0, t1] is the set of potential treatments.

The econometrics of the dose-response
Suppose we have N individuals, i = 1, …, N:
• Xi: vector of pre-treatment covariates;
• Ti: level of treatment delivered;
• Yi(Ti): outcome corresponding to the treatment Ti.

The econometrics of the dose-response
• We want the average dose-response function.
• Hirano and Imbens define the GPS as the conditional density of the actual treatment given the covariates.
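
The formulas shown on this slide were not preserved. Assuming the talk follows the notation of Hirano and Imbens (2004), the average dose-response function and the GPS would read roughly as follows (the symbols \mu, r and R are taken from that paper, not from the slide text):

```latex
\[
  \mu(t) = E\left[ Y_i(t) \right], \qquad t \in [t_0, t_1]
\]
\[
  r(t, x) = f_{T \mid X}(t \mid x), \qquad R_i = r(T_i, X_i)
\]
```

Here R_i is the GPS evaluated at the treatment level that unit i actually received.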

The econometrics of the dose-response
• Balancing property: within strata with the same r(t, x), the probability that T = t does not depend on X.
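
In symbols, and again assuming the Hirano and Imbens (2004) notation, the balancing property can be written as a conditional independence statement:

```latex
\[
  X \;\perp\; \mathbf{1}\{T = t\} \;\mid\; r(t, X)
\]
```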

The econometrics of the dose-response
• If weak unconfoundedness holds given the covariates, treatment assignment is also unconfounded given the GPS (Hirano and Imbens, 2004).
• This means that the GPS can be used to eliminate any bias associated with differences in the covariates and …
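
The equation on this slide was lost. A plausible reading, assuming it reproduces the weak unconfoundedness assumption and Theorem 1 of Hirano and Imbens (2004), is:

```latex
\[
  Y(t) \;\perp\; T \;\mid\; X \qquad \text{for all } t \in [t_0, t_1]
\]
\[
  f_T\bigl(t \mid r(t, X),\, Y(t)\bigr) = f_T\bigl(t \mid r(t, X)\bigr)
\]
```

The first line is the assumption; the second says that, once we condition on the GPS, the potential outcome carries no further information about the treatment level.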

The econometrics of the dose-response
• The dose-response function can be computed as:
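
The formula itself is missing from the transcript. In Hirano and Imbens (2004) the computation takes two stages, which is presumably what the slide displayed (the symbol \beta(t, r) is theirs):

```latex
\[
  \beta(t, r) = E\left[ Y(t) \mid r(t, X) = r \right] = E\left[ Y \mid T = t,\, R = r \right]
\]
\[
  \mu(t) = E\left[ \beta\bigl(t,\, r(t, X)\bigr) \right]
\]
```

Note that the outer expectation is taken over r(t, X), the GPS evaluated at the dose t of interest, not over R = r(T, X).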

How to implement the GPS
• The dose-response can be implemented in 3 steps:
FIRST STEP:
1. Regress Ti on Xi and take the conditional distribution of the treatment given the covariates, Ti | Xi

How to implement the GPS
where
• f(·) is a suitable transformation of T (link);
• D is a distribution of the exponential family;
• β are the parameters to be estimated;
• σ is the conditional SE of T|X.
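
As an illustration of the first step, here is a minimal Python sketch (not the authors' Stata code) for one family/link pair: a Poisson treatment with a log link. It assumes a single covariate plus an intercept, fits the GLM by Newton-Raphson, and evaluates the GPS as the fitted Poisson probability of the observed treatment. The function names and the toy data are hypothetical.

```python
import math

def fit_poisson_glm(x, t, max_iter=50, tol=1e-12):
    """Fit log E[T | x] = b0 + b1 * x by Newton-Raphson (intercept + one covariate)."""
    b0, b1 = math.log(sum(t) / len(t)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector of the Poisson log-likelihood
        g0 = sum(ti - mi for ti, mi in zip(t, mu))
        g1 = sum((ti - mi) * xi for ti, mi, xi in zip(t, mu, x))
        # 2x2 Fisher information matrix
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * d = score
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def poisson_gps(t, x, b0, b1):
    """GPS r(t, x): Poisson probability of treatment level t given covariate x."""
    lam = math.exp(b0 + b1 * x)
    return math.exp(-lam + t * math.log(lam) - math.lgamma(t + 1))

# Hypothetical toy data: covariate x and count-valued treatment t
x = [0, 0, 1, 1, 2, 2]
t = [1, 2, 2, 4, 5, 7]
b0, b1 = fit_poisson_glm(x, t)
gps = [poisson_gps(ti, xi, b0, b1) for ti, xi in zip(t, x)]  # R_i = r(T_i, X_i)
```

Other families would change only the mean function and the density used in `poisson_gps`; the overall shape of the step stays the same.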

How to implement the GPS
1a. Test the balancing property

How to implement the GPS
SECOND STEP:
Model the conditional expectation E[Yi | Ti, Ri] as a function of Ti and Ri
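
The slide does not state a functional form for this step. The sketch below assumes the quadratic specification with an interaction used by Hirano and Imbens (2004), E[Y | T, R] = a0 + a1 T + a2 T² + a3 R + a4 R² + a5 T R, and fits it by ordinary least squares in plain Python. All names and the noise-free toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        z[r] = (m[r][n] - sum(m[r][k] * z[k] for k in range(r + 1, n))) / m[r][r]
    return z

def design_row(t, r):
    """Regressors of the assumed quadratic model in treatment t and GPS r."""
    return [1.0, t, t * t, r, r * r, t * r]

def fit_outcome_model(y, t, r):
    """OLS of y on the quadratic terms, via the normal equations X'X a = X'y."""
    rows = [design_row(ti, ri) for ti, ri in zip(t, r)]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical toy data generated from known coefficients
true_alpha = [1.0, 2.0, -0.5, 3.0, 4.0, -1.0]
points = [(ti, ri) for ti in (0.0, 1.0, 2.0) for ri in (0.2, 0.5, 0.9)]
t_obs = [p[0] for p in points]
r_obs = [p[1] for p in points]
y_obs = [sum(a * z for a, z in zip(true_alpha, design_row(ti, ri))) for ti, ri in points]
alpha_hat = fit_outcome_model(y_obs, t_obs, r_obs)
```

The estimated coefficients have no causal interpretation on their own; they only feed the averaging in the third step.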

How to implement the GPS
THIRD STEP:
Estimate the dose-response function by averaging the estimated conditional expectation over the GPS at each level of the treatment we are interested in
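
A sketch of this averaging in Python, under the same assumed quadratic outcome model as in Hirano and Imbens (2004). For illustration the GPS is a Normal density with made-up first-step parameters; the coefficient values, covariate values and function names are all hypothetical.

```python
import math

def normal_gps(t, x, b0, b1, sigma):
    """r(t, x): Normal density of treatment t given covariate x (illustrative first step)."""
    z = (t - (b0 + b1 * x)) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def cond_mean(t, r, alpha):
    """Assumed quadratic model for E[Y | T = t, R = r]."""
    a0, a1, a2, a3, a4, a5 = alpha
    return a0 + a1 * t + a2 * t * t + a3 * r + a4 * r * r + a5 * t * r

def dose_response(t, xs, alpha, gps):
    """Average the fitted conditional mean over units, with the GPS evaluated at dose t."""
    # The GPS is recomputed at the dose t for every unit, not at the observed T_i.
    return sum(cond_mean(t, gps(t, x), alpha) for x in xs) / len(xs)

def gps(t, x):
    # Hypothetical first-step estimates: E[T | x] = 0.5 + 1.0 * x, sigma = 1.0
    return normal_gps(t, x, 0.5, 1.0, 1.0)

xs = [0.0, 1.0, 2.0]                      # covariate values of the sample units
alpha = (1.0, 2.0, -0.5, 0.3, 0.0, -0.1)  # hypothetical second-step coefficients
curve = [(t, dose_response(t, xs, alpha, gps)) for t in (0.0, 1.0, 2.0, 3.0)]
```

Evaluating `dose_response` on a grid of doses, as in `curve`, gives the points of the estimated dose-response function.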

How to implement the GPS
• Where is the novelty? In the FIRST STEP.
• Instead of ML estimation we use a GLM:
  – exponential distribution (family)
  – combined with a link function

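Our programs
Supported link (rows) and distribution (columns) combinations. Columns: Normal, Inv. Normal, Binomial, Poisson, Neg. Binomial, Gamma. The column alignment of the X marks was not preserved.
Identity  X X X
Log       X X X X X
Logit     X
Probit    X
Cloglog   X
Power
Opower    X X
Nbin      X
Loglog    X
Logc      X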

Our programs
We have written two programs:
• doseresponse2.ado: estimates the dose-response function and graphs the result. It carries out steps 1, 2 and 3 of the previous slides by running 2 other programs.

Our programs
• gpscore2.ado: evaluates the GPS under 6 different distributional assumptions (step 1 of the previous slides).
• doseresponse_model.ado: carries out step 2 of the previous slides.

Our programs
doseresponse2 varlist, outcome(varname) t(varname) family(string) link(string)
    gpscore(newvarname) predict(newvarname) sigma(newvarname) cutpoints(varname)
    nq_gps(#) index(string) dose_response(newvarlist)
Options:
    t_transf(transformation) normal_test(test) normal_level(#) test_varlist(varlist)
    test(type) flag(#) cmd(regression_cmd) reg_type_t(string) reg_type_gps(string)
    interaction(#) t_points(vector) npoints(#) delta(#) bootstrap(string)
    filename(filename) boot_reps(#) analysis(string) analysis_level(#)
    graph(filename) flag_b(#) opt_nb(string) opt_b(varname) detail

Our programs
gpscore2 varlist, t(varname) family(string) link(string) gpscore(newvarname)
    predict(newvarname) sigma(newvarname) cutpoints(varname) index(string) nq_gps(#)
Options:
    t_transf(transformation) normal_test(test) normal_level(#) test_varlist(varlist)
    test(type) flag_b(#) opt_nb(string) opt_b(varname) detail

Application
Data set by Imbens, Rubin and Sacerdote (2001): the winners of a lottery in Massachusetts.
• Ti (treatment): amount of the prize
• Yi (outcome): earnings 6 years after winning
• Xi (covariates): age, gender, education, number of tickets bought, working status, earnings up to 6 years before winning

Application: flogit
Fractional data: flogit model.
• Treatment: prize/max(prize)
• Outcome: earnings after 6 years
• family(binomial) link(logit)

Application: flogit [figure not reproduced]

Application: count data
Count data: Poisson model.
• Treatment: years of college + high school
• Outcome: earnings after 6 years
• family(poisson) link(log)

Application: count data [figure not reproduced]

Application: gamma distribution
Gamma distribution:
• Treatment: age
• Outcome: earnings after 6 years
• family(gamma) link(log)

Application: gamma distribution [figure not reproduced]