Remaining Weeks
• Next week: Diff-n-Diff
• Nov. 17: Power calculations
• Nov. 24: Summary, in-class presentations
• Dec. 1: Guests, more presentations

Motivation: Causality. Research from Superfreakonomics: Indian villages with TV treat women better. (Jensen & Oster, QJE 2009)

Real-World Complications
• Attrition
• Data Quality
• Cars Stuck in the Mud, Employees Robbed

What type of day are you having?

Practical Problems
• Language
• Culture
• Being around the same four Westerners 24/7 without going crazy.
Solutions: Having had a real job? Management skills.

Actual Organizations
• CEGA (Our Sponsor) http://cega.berkeley.edu
• Poverty Action Lab (J-PAL) http://www.povertyactionlab.org
• Innovations for Poverty Action http://www.poverty-action.org
• Blum Center for Developing Economies http://blumcenter.berkeley.edu

CEGA-related Faculty
• Alain de Janvry
• Frederico Finan
• David I. Levine
• Jeremy Magruder
• Edward Miguel
• Nancy Padian
• Elisabeth Sadoulet
http://cega.berkeley.edu/affiliates.html

Larger NGO-types
• The World Bank
• Center for Global Development
• International Food Policy Research Institute
…and many, many more

Human Subjects
UC Berkeley Committee for the Protection of Human Subjects: http://cphs.berkeley.edu
In-country organizations as well, for example the Kenya Medical Research Institute: http://www.kemri.org

Attrition
Randomized trials often require that we get data from the subjects twice: once before the experiment and once after. What if we can’t find them afterwards?

Worksheet
How might you expect people we couldn’t find to differ from those we could easily find? What could cause people to go missing?

Attrition
Create lower/upper bounds for our estimates by assuming the worst about the people we couldn’t find (David Lee, “Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects,” REStud 2009). In our case, we’ll just say it’s important to find as many people as possible to get good data.
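The trimming idea in Lee (2009) can be sketched in a few lines: if treatment raises the response rate, drop the excess share of the treated respondents’ outcomes from the top (for a lower bound) or from the bottom (for an upper bound). A minimal sketch with hypothetical numbers, not the paper’s full estimator:

```python
def lee_bounds(y_treat, y_control, resp_treat, resp_control):
    """Trimming bounds on the treated-minus-control mean difference when
    treatment raises the response rate (resp_* are rates in [0, 1])."""
    # Share of treated respondents who would be missing at the control rate.
    p = (resp_treat - resp_control) / resp_treat
    ys = sorted(y_treat)
    k = int(round(p * len(ys)))  # number of treated outcomes to trim
    mean_control = sum(y_control) / len(y_control)
    lower = sum(ys[:len(ys) - k]) / (len(ys) - k) - mean_control  # trim the top
    upper = sum(ys[k:]) / (len(ys) - k) - mean_control            # trim the bottom
    return lower, upper
```

With ten treated outcomes 1–10, a control mean of 0, and response rates of 100% vs. 80%, two observations are trimmed and the effect is bounded between 4.5 and 6.5.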

Attrition in KLPS
Kenyan Life Panel Survey: a 2003–2005 follow-up to the Deworming project (1998–2000). 7,500 of the original 30,000 were randomly selected to be surveyed.

Attrition in KLPS
First, go to their old school and ask around. Second, try to find their house. Third, travel far and wide.

Attrition in KLPS
Using two-part regular and intensive tracking, just like in “Moving to Opportunity.” After finding as large a portion as you can, select a random sub-sample of everyone remaining and track them intensively. ERR = MRR + SRR × (1 − MRR), where ERR is the effective response rate, MRR the main-phase response rate, and SRR the sub-sample response rate.
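The two-part tracking formula is easy to verify in code. A one-line sketch, reading MRR as the main-phase response rate and SRR as the response rate within the intensively tracked sub-sample (those expansions are my reading of the acronyms):

```python
def effective_response_rate(mrr, srr):
    """ERR = MRR + SRR * (1 - MRR): everyone found in the main phase, plus
    the sub-sample's rate applied to the fraction still missing."""
    return mrr + srr * (1 - mrr)
```

For example, finding 70% in the main phase and then 50% of the intensively tracked remainder yields an effective rate of 0.7 + 0.5 × 0.3 = 0.85.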

Attrition in KLPS
End results: 84% successfully contacted, 83% successfully surveyed.

Attrition in KLPS
Four different types of being “found,” by treatment and gender.

Where’d we find them?
• 19% Outside Busia
• 14% Outside Neighboring Areas
• 25% Overall (Non-Snapshot)

So, We Got 84%, Are We Cool?
• Is treatment correlated with attrition? Probably not. We found 83.9% to 85.0% in all treatment groups.

Was it worth it?
• We spent a lot of money to find the emigrants.

Did we need to bother?
• Migrants are 1.7 cm shorter than non-migrants, and an additional year of treatment increased migrant height by 0.4 cm but only 0.1 cm for the full sample.

The Nuts & Bolts of Building the Dataset
• Written on hard copy of the survey.
• Sub-sample checked for mistakes.
• Data-entry firm double enters. We check for correlation of the two entries.
• We re-enter a 5% sample and check against their work; accept if the error rate is below a threshold.
• That’s the “raw” data.
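The double-entry comparison described above amounts to diffing two keyed copies of the data. A minimal sketch with hypothetical survey IDs and field names:

```python
def flag_discrepancies(entry_a, entry_b):
    """Given two independent data entries as dicts mapping
    (survey_id, field) -> value, return the keys where they disagree."""
    return {key for key in entry_a if entry_b.get(key) != entry_a[key]}
```

Any flagged key goes back to the hard copy for resolution.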

The Nuts & Bolts of Building the Dataset
• Depressed grad students spend whole summers in a windowless Unix lab on the 6th floor of the 2nd-ugliest building on campus writing cleaning files, which check for blanks and skip-pattern violations.
• Send the list of flagged entries to the location of the hard copies.
• Hard copies checked against the soft copy. Soft copy corrected, mistake flag lowered.
• Feel free to use the data.
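A skip-pattern check like the ones in those cleaning files can be sketched as: if a gate question rules out the follow-ups, any filled-in follow-up field is a violation. Question names here are hypothetical:

```python
def skip_violations(record, gate_question, followups):
    """Flag follow-up fields that are filled even though the gate answer
    says they should have been skipped."""
    if record.get(gate_question) == "no":
        return [f for f in followups if record.get(f) not in (None, "")]
    return []
```

The blank check is the mirror image: a gate answer of “yes” with empty follow-ups is also flagged for review against the hard copy.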

Data Quality
Fine, we correctly recorded what the respondent said, but should we really trust what they said? That is, if you were 16 and had a miscarriage a year ago, would you really want to tell an older man who’s a stranger about it?

Gender

Tribe

Do Kids Know What They’re Talking About?
• Disregard the respondent/enumerator relationship. Do the kids really know what they’re talking about?
• Depends on the question.

What’s Reliable?
• We sampled 5% to be resurveyed; successfully resurveyed about 4%, 3 months later on average. Baseline: if we ask “What tribe are you?” the answer stays the same 95% of the time.

Pretty Decent

I Can’t Throw This Very Far.

Fraction Matching
• Sub-Tribe: 95%
• Age in 1998: 76%
• Grade in 2002: 86%
• Ever left local area: 91%
• Mom/Dad Education: 51–53%

What Determines Remembering?
• Tables 22 and 23 show which characteristics are correlated with giving the same answer about Mom/Dad’s education in both the survey and the re-survey.

Conclusion
• Field work is great; go do some.
• Try to find everyone.
• Especially if you’re more/less likely to find them thanks to your intervention.
• Do your field officers affect the answers given?
• Does the respondent really know the right answer in the first place?