Using Stata to Analyze Complex Survey Data
2013 Tanzania National Grade 2 Cross-sectional Survey using EGRA/EGMA SSME instruments
Chris Cummiskey & Marissa Gargano
Sunday March 5, 8:30 – 14:45, Georgia 2 (South Tower)
CIES 2017, Downtown Sheraton, Atlanta, GA
RTI International is a registered trademark and a trade name of Research Triangle Institute. www.rti.org

Purpose
1: Show that SRS is incorrect for inferential analysis
§ Descriptive - SRS
– Describe the sample (not the population from which the sample came)
§ Complex Survey - Inferential
– Project the sample to the population
[Diagram: Sample and Population, labeled svyset, svy, weights, Cluster Effect]

Review Background, Research Questions, Sample Methodology.
§ Materials used in this section:
– 0_Research Questions for Grade 2 Tanzania EGRA-2013.docx [Paper]
– 1_Analyze_Tanzania-Data_CIES 2017_Analysis-Workshop.do [Electronic]
– 1_Worksheet_Tanzania Grade 2 National Cross_sectional EGRA-EGMA study.xlsx [Electronic]

Background
0_Research Questions for Grade 2 Tanzania EGRA-2013.docx [Paper]
§ Who?: Standard 2 students attending government schools.
§ What?: English/Kiswahili EGRA & EGMA cross-sectional (snapshot) study.
§ Where?: Tanzania National.
§ When?: October, 2013 [end of Grade 2].
§ Why?: To get a national picture of grade 2 classrooms, teachers, students and student reading/math ability.
§ To better understand the different aspects among schools in low, middle and high performance bands, as labeled in the National Standard 7 Leaving

List Frame – Population Data
§ 2012 National Primary-Schools-Leaving-Certificate-Examination (PSLCE).
– Contains all primary government schools
– Contains the information needed to draw the sample

0.) Stata: Let's take a look at the Population Data…
§ 1_Analyze_Tanzania-Data_CIES 2017_Analysis-Workshop.do
– Read in the Tanzania Census Data [School Level]
§ use "…<path>Analysis WorkshopMaterialsTanzania_2013Data Census_List. Frame For Tanzania 2013 -National Survey_PLSE 7. dta", clear
§ What level is the Population dataset? How many units are in the list?
§ What percent of the <units> POPULATION is urban/rural?
§ What percentage of the <units> POPULATION is high/mid/low performing band?
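The questions above can be explored with a handful of basic commands. The sketch below is only an illustration on a tiny made-up list frame: the variable names school_id, urban and band, and every value in it, are assumptions and are not taken from the workshop's census file.

```stata
* Toy stand-in for a school-level list frame; every name and value is hypothetical.
clear
input school_id urban band
1 1 3
2 0 1
3 0 2
4 1 3
5 0 1
6 0 2
7 0 1
8 1 2
end
label define urbanlbl 0 "Rural" 1 "Urban"
label values urban urbanlbl
label define bandlbl 1 "Low" 2 "Mid" 3 "High"
label values band bandlbl

describe                 // which variables are on the frame?
isid school_id           // errors out unless there is exactly one row per school
count                    // number of units (schools) in the list
tabulate urban           // percent of schools that are urban/rural
tabulate band            // percent of schools in each performance band
```

If isid passes, the file has one row per school, which is what "school level" means for the first question.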

TZ-2013 Sample Methodology Sample Strata FPC Finite population correction Probability of Selection Councils (20) Urban/Rural (2) Total urban councils + total rural councils Number of Schools (200) School Performance (3) Total high/mid/low performing schools in selected councils Number of Schools (None) Total grade 2 classrooms in selected school Equal Gender (2) Total grade 2 female/male in selected classroom Equal Stage Number Item Sampled Stage 1 Stage 2 Stage 3 Stage 4 G 2 Classrooms (200) G 2 Student (2, 266)

20 Selected Councils

1.1) Comparison of the Sample and Estimated Population
§ 1_Analyze_Tanzania-Data_CIES 2017_Analysis-Workshop.do
– What are the sample counts and percentages for school performance (“band”)? [Sample]
– What are the estimated population counts and percentages for school performance (“band”)? [Population]
– What group(s) were oversampled?
– What group(s) were undersampled?
§ How might this over/undersampling affect the results if it is not accounted for in the analysis?
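One way to put the sample and the estimated population side by side is an unweighted tabulate next to a weighted svy: tabulate. The sketch below runs on made-up data: the variables school and wt, the weights, and the single-stage svyset are assumptions for illustration, not the workshop's actual design.

```stata
* Toy student-level sample; all names and numbers are hypothetical.
clear
set obs 12                          // 12 sampled schools
gen school = _n
gen band = ceil(_n/4)               // 1 = low, 2 = mid, 3 = high (4 schools each)
gen wt = cond(band == 3, 5, cond(band == 2, 20, 40))   // made-up weights: high band oversampled
expand 10                           // 10 sampled students per school
svyset school [pweight = wt]        // schools as clusters, students carry the weight

tabulate band                       // SAMPLE: unweighted counts and percentages
svy: tabulate band, count percent format(%9.1f)   // POPULATION: weighted estimates
```

In this toy data the sample is split evenly across the three bands while the weighted shares are not, which is the pattern that signals over- or undersampling.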

1.2) Compare SRS vs. Complex Analyses
§ 1_Analyze_Tanzania-Data_CIES 2017_Analysis-Workshop.do
– Compare the high band's SRS vs. Complex mean estimates. Is there a big difference?
– Compare the high band's SRS vs. Complex SE and 95% CI estimates. Is there a big difference?
– Do the same for the low/mid bands.
§ How might the over/undersampling affect the NATIONAL results if it is not accounted for in the analysis?
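The comparison can be run by issuing the same estimation command with and without the svy: prefix. This is a sketch on simulated data. Only k_orf and band appear in the slides; school, wt, u and the svyset line are assumptions.

```stata
* Simulated stand-in data; the design and all values are hypothetical.
clear
set seed 2013
set obs 12
gen school = _n
gen band = ceil(_n/4)               // 1 = low, 2 = mid, 3 = high
gen wt = cond(band == 3, 5, cond(band == 2, 20, 40))
gen u = rnormal(0, 6)               // school-level effect: students in a school resemble each other
expand 10
gen k_orf = max(0, round(10*band + u + rnormal(0, 8)))
svyset school [pweight = wt], strata(band)

mean k_orf, over(band)              // SRS: ignores weights, strata and clusters
svy: mean k_orf, over(band)         // Complex: uses the svyset design
mean k_orf                          // SRS national estimate
svy: mean k_orf                     // Complex national estimate
```

Because the toy weights are constant within a band, the band means agree under both approaches and only the SEs and confidence intervals change. The national mean moves because the weights differ across bands.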

1.2) Compare SRS vs. Complex Analyses: Nationally
§ How might the over/undersampling affect the NATIONAL results if it is not accounted for in the analysis?
– If the analysis thinks the students were sampled with SRS?
– If the analysis knows how the students were really sampled?
§ Mean estimates
§ SE and 95% CI estimates

1.3) Explore the TZ-2013 svyset
§ 1_Analyze_Tanzania-Data_CIES 2017_Analysis-Workshop.do
§ [Refer back to the Sample Methodology Table]
– Cluster Effect
– weights
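The workshop's actual svyset statement is not reproduced in these slides. As a rough sketch of how stages, strata, FPCs and weights map onto svyset, the example below declares a simplified two-stage design (councils stratified by urban/rural, then schools) on made-up data. Every variable name and number is an assumption.

```stata
* Hypothetical two-stage design; not the TZ-2013 svyset.
clear
set seed 2013
set obs 4                               // 4 sampled councils
gen council = _n
gen urban = council <= 2                // stage-1 stratum: 2 urban, 2 rural councils
gen n_councils = cond(urban, 10, 30)    // made-up number of councils per stratum on the frame
expand 3                                // 3 sampled schools per council
bysort council: gen school = council*100 + _n
gen n_schools = 40                      // made-up number of schools per council on the frame
expand 5                                // 5 sampled students per school
gen k_orf = rpoisson(20)
gen wt = (n_councils/2) * (n_schools/3) * 4   // made-up inverse probability of selection

* Stage 1: councils within urban/rural strata; stage 2: schools within councils
svyset council [pweight = wt], strata(urban) fpc(n_councils) || school, fpc(n_schools)
svydescribe                             // strata, PSUs and observations per stratum
```

Typing svyset with no arguments afterwards redisplays the stored settings, which is a quick way to check them against the Sample Methodology Table.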

Understanding the Sample Motive
§ Sample methodology:
– Must be developed to answer the primary research designs.
– Must account for the cost of data collection
– Should account for the data collection logistics
– Should be tweaked to maximize the Statistical-Benefit : Cost ratio
§ Statistically Ideal Sample:
– Sample weights are roughly balanced
– Large sample size
– Small clusters (no more than 20 students per cluster)

How effective was our sample?
– How balanced were the weights in the sample?
– How large was our sample of grade 2 students?
– How large were the clusters in the sample?
§ In what ways could this sample have been more statistically sufficient?
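These three checks can be sketched in a few lines. The example runs on simulated data. The variables school, wt, u, cluster_n and one_per_school, and the design itself, are assumptions for illustration.

```stata
* Simulated stand-in data; the design and all values are hypothetical.
clear
set seed 2013
set obs 12
gen school = _n
gen band = ceil(_n/4)
gen wt = cond(band == 3, 5, cond(band == 2, 20, 40))
gen u = rnormal(0, 6)
expand 10
gen k_orf = max(0, round(10*band + u + rnormal(0, 8)))
svyset school [pweight = wt], strata(band)

* How balanced are the weights?
summarize wt, detail
display "Largest/smallest weight: " r(max)/r(min)
display "Kish approximate weighting effect (1 + CV^2): " 1 + (r(sd)/r(mean))^2

* How large is the sample, and how large are the clusters?
count
bysort school: gen cluster_n = _N
egen byte one_per_school = tag(school)   // flags one row per school
summarize cluster_n if one_per_school

* Combined cost of weighting and clustering for one estimate
svy: mean k_orf
estat effects                            // design effects DEFF and DEFT
```

A DEFF well above 1 means the complex design yields less precision than an SRS of the same size would.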

Conduct the same analysis but for Urban/Rural and k_orf

Begin to answer some research questions.
§ Materials used in this section:
– 0_Research Questions for Grade 2 Tanzania EGRA-2013.docx

Primary Analysis: Answering the Primary Research Questions
0_Research Questions for Grade 2 Tanzania EGRA-2013.docx [Paper]
– P-I. What is the national Kiswahili literacy ability of Grade 2 students attending non-special governmental schools?
– P-II. What is the national English literacy ability of Grade 2 students attending non-special governmental schools?
– P-III. What is the national Mathematics ability of Grade 2 students attending non-special governmental schools?
– P-IV. Based on the national Kiswahili literacy ability, how different were Grade 2 students' reading abilities by:
§ a. School-band (high/mid/low performing)
§ b. Gender
§ c. Urban/Rural

Primary Analysis: Answering the Primary Research Questions
– P-I. What is the national Kiswahili literacy ability of Grade 2 students attending non-special governmental schools?
CODE: svy: mean k_orf
– P-IV. Based on the national Kiswahili literacy ability, how different were Grade 2 students' reading abilities by:
§ a. School-band (high/mid/low performing)
CODE: svy: reg k_orf i.band
OR: svy: mean k_orf, over(band)
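To see these commands run end to end, here is a sketch on simulated data. k_orf and band come from the slides. The variables school, wt and u, the svyset line, and the testparm follow-up are assumptions added for illustration.

```stata
* Simulated stand-in data; the design and all values are hypothetical.
clear
set seed 2013
set obs 12
gen school = _n
gen band = ceil(_n/4)               // 1 = low, 2 = mid, 3 = high
gen wt = cond(band == 3, 5, cond(band == 2, 20, 40))
gen u = rnormal(0, 6)
expand 10
gen k_orf = max(0, round(10*band + u + rnormal(0, 8)))
svyset school [pweight = wt], strata(band)

svy: mean k_orf                     // P-I: national estimate
svy: mean k_orf, over(band)         // P-IV a: one mean per band
svy: regress k_orf i.band           // P-IV a: band differences relative to the base band
testparm i.band                     // joint test that the band coefficients are all zero
```

The over() form reports a mean for each band, while the regression form reports differences from the base band along with their standard errors.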

Secondary Analysis: Answering the Secondary Research Questions
0_Research Questions for Grade 2 Tanzania EGRA-2013.docx [Paper]
– S-1. What does a Grade 2 student from a low/medium/high performing school look like?
§ a. What is the student's demographic make-up?
§ b. What does the school they attend look like?
§ c. What does the classroom they are instructed in look like?
§ d. What does the household environment look like?
– S-2. Of the characteristics mentioned above, which seem correlated with higher/lower reading ability?
– S-3. How well correlated is English literacy ability with Kiswahili literacy ability?

Secondary Analysis: Answering the Secondary Research Questions
0_Research Questions for Grade 2 Tanzania EGRA-2013.docx [Paper]
– S-1. What does a Grade 2 student from a low/medium/high performing school look like?
§ a. What is the student's demographic make-up?
CODE: svy: proportion female, over(band)
CODE: svy, subpop(if band == 1): tab age
§ b. What does the classroom they are instructed in look like?
CODE: svy, subpop(if band == 2 | band == 3): proportion tr_1
– S-2. Of the characteristics mentioned above, which seem correlated with higher/lower reading ability?
CODE: svy: reg k_orf ib1.band ib0.tr_1 age ib0.female
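A runnable sketch of these secondary-analysis commands follows, again on simulated data. The variable names k_orf, band, female, age and tr_1 are taken from the slides, but their coding here (for example tr_1 as a 0/1 classroom indicator) is a guess. The variables school, wt and u and the svyset line are assumptions.

```stata
* Simulated stand-in data; codings, design and values are hypothetical.
clear
set seed 2013
set obs 12
gen school = _n
gen band = ceil(_n/4)               // 1 = low, 2 = mid, 3 = high
gen wt = cond(band == 3, 5, cond(band == 2, 20, 40))
gen tr_1 = mod(_n, 2)               // made-up 0/1 classroom indicator
gen u = rnormal(0, 6)
expand 10
gen female = runiform() < 0.5
gen age = 7 + floor(3*runiform())   // ages 7 to 9
gen k_orf = max(0, round(10*band + 3*tr_1 + u + rnormal(0, 8)))
svyset school [pweight = wt], strata(band)

svy: proportion female, over(band)                       // S-1 a
svy, subpop(if band == 1): tabulate age                  // S-1 a, low band only
svy, subpop(if band == 2 | band == 3): proportion tr_1   // S-1 b, mid and high bands
svy: regress k_orf ib1.band ib0.tr_1 age ib0.female      // S-2

* subpop() restricts the estimate but keeps every observation in the
* variance calculation; an if qualifier would drop them from the design.
```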

Contact Information
THANK YOU!
§ Chris Cummiskey:
– Email: ccummiskey@rti.org
– Skype: chris.cummiskey
§ Marissa Gargano:
– Email: mgargano@rti.org
– Skype: marissangargano
§ RTI:
– @RTI_EdWork
– @RTI_Intl_Dev
– SharEd.rti.org