Why Use Comparison Groups in Evaluation Patricia Gonzalez

Why Use Comparison Groups in Evaluation?
Patricia Gonzalez, OSEP
June 14, 2011
The purpose of annual performance reporting is to demonstrate that IDEA funds are being used to improve or benefit children with disabilities and their families. In the case of SPDG Program funds, the theory is that providing effective professional development to personnel implementing special education or early intervention services will ultimately benefit targeted children and families.
In order to show improvement or benefit in a social condition, change or progress must be demonstrated on outcome variables of interest (that is, an outcome evaluation must occur). And in order to “credit” the SPDG Program, some link must be established between SPDG activities and those changes.
Additionally, judgments based on outcome evaluations rely on determining whether outcomes have improved or are better “as compared” to something else. Linking program activities to outcomes and comparing those outcomes to other groups (or to other points in time for the same group) requires an evaluation (research) design.
In special education, evaluation questions involving comparisons often focus on one of the following:
◦ Comparisons with non-disabled students (or special/general education teachers)
◦ Comparisons of students with different types of disabilities or teachers with different specializations
◦ Cross-unit comparisons (districts, schools, classrooms)
◦ Longitudinal/repeated measures of the same group
Decisions about the type of evaluation design and the appropriate comparison group depend, for example, on:
◦ the evaluation questions
◦ the length of the program or intervention
◦ the amount of resources available
◦ the ability to randomly assign participants to groups
Trochim, William. (2008). Research Methods Knowledge Base. http://www.socialresearchmethods.net/kb/destypes.php
Random assignment of participants to groups improves the rigor of the evaluation, but the use of intact groups, such as classrooms and schools, is much more common in practice. Using propensity scores with intact groups can reduce threats to internal validity and improve confidence in evaluation results.
Examples of Comparison Group Evaluation
Amy Gaumer Erickson, Ph.D., aerickson@ku.edu
University of Kansas, Center for Research on Learning
Is random assignment feasible?
• Do you use random assignment of individuals in any of your SPDG activities?
  ◦ Yes
  ◦ No
Comparing Groups (both receiving intervention)
Question: Do teachers and administrators have different perceptions about the level of parental involvement in their schools?
[Figure: bar chart comparing Administrator and Teacher ratings (axis 0 to 4.5) on three survey items:]
◦ I regularly communicate with families regarding student academic & behavior goals/progress.
◦ I make informed decisions based on feedback from families.
◦ I think my school does a good job in including parents as team members in data-based decision-making.
Comparing Across Time (intervention group only)
Question: Through multi-year participation in the intervention, do educators feel that their schools improved academic and behavior supports for students?
[Figure: bar chart titled "Evidence-Based Practices" showing mean ratings (axis 0.00 to 5.00) for 2009-2010 and 2010-2011 on four survey items:]
◦ I think my school does a good job of addressing the academic & behavior needs of students at tier 1 (universal).
◦ I think my school does a good job of addressing the academic & behavior needs of students at tier 2 (small group).
◦ I think my school does a good job of addressing the academic & behavior needs of students at tier 3 (intensive).
◦ I adapt the environment, curriculum, & instruction based on each student’s academic & behavior data.
Data labels recovered from the chart, in extraction order (their assignment to items and years is not recoverable): 3.70, 3.94, 3.40, 3.74, 3.20, 3.47, 3.80, 4.19
Comparing Groups (baseline with demographic variable)
Question: What is the level of implementation of research-based transition indicators reported by high school special education teachers?
[Figure: bar chart comparing Urban and Rural teachers (axis 0 to 3.5) on three indicators: Inclusion & Access to the General Education Curriculum, Student Involvement, Family Involvement]
Comparing Across Time (intervention group only)
Question: How do students with disabilities in the intervention schools perform on the state communication arts assessment across multiple years of intervention implementation?
[Figure: chart of Communication Arts results (percentage axis 0.00% to 25.00%) by year, 2007 through 2010]
Comparing Groups (intervention & state average)
Question: How do students with disabilities in the intervention schools perform on state assessments compared to the state average?
[Figure: bar chart comparing Intervention Schools and State Average (axis 0 to 30) for Mathematics and Communication Arts]
Comparing Groups (stratified sample)
Question: In the past year, did the percentage of students with disabilities in the intervention schools that met proficiency on state mathematics assessments increase?
[Figure: bar chart (axis 0 to 14) for Small Districts, Medium Districts, and Large Districts]
Comparing Groups (Intervention & Similar Schools)
Question: In the past year, did the percentage of students that met proficiency on state communication arts assessments increase?
[Figure: bar chart comparing Intervention Schools and Comparison Schools (axis 0 to 12) for All Students and Students with IEPs]
Developing Stronger Outcome Data
• Research design should be clearly articulated from the beginning
• Intervention groups must be clearly defined
• Multiple measures are necessary
• Outcome variables should be collected across time
• Outcome variables should be compared to something
Propensity Score Matching
Chunmei (Rose) Zheng, Graduate Research Assistant
Center for Research on Learning (CRL), University of Kansas
06/08/2011
Overview of Presentation
• Example of comparison analysis
• General description of PSM
• Steps of the PSM
• Resources and References
An Example of Comparison Analysis
Research question: Is there a significant difference in youth income between students with job training and students without job training?
The data set looks like:
Individual  Job Training  Income
1           0             60
2           0             80
3           0             90
4           0             200
5           1             100
6           1             80
7           1             90
8           1             70
An Example of Comparison Analysis
Years of education might also be a factor that influences youth income. The data set looks like:
Individual  Job Training  Income  Education
1           0             60      2
2           0             80      3
3           0             90      5
4           0             200     12
5           1             100     5
6           1             80      3
7           1             90      4
8           1             70      2
From Heinrich, C., Maffioli, A., & Vázquez, G. (2010)
An Example of Comparison Analysis
After matching on education, the data looks like this (Y1 is the income of the trained individual, Y0 is the mean income of the matched untrained individuals):
Individual  Job Training  Income  Education  Match   Y1   Y0   Difference
1           0             60      2          --      --   --   --
2           0             80      3          --      --   --   --
3           0             90      5          --      --   --   --
4           0             200     12         --      --   --   --
5           1             100     5          [3]     100  90   10
6           1             80      3          [2]     80   80   0
7           1             90      4          [2, 3]  90   85   5
8           1             70      2          [1]     70   60   10
From Heinrich, C., Maffioli, A., & Vázquez, G. (2010)
But what about adding other covariates, such as age, gender, and ethnicity? Matching becomes more and more complicated.
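The matching example above can be reproduced with a few lines of Python. This is only a minimal sketch: it assumes matching is done on years of education alone, that each trained individual is matched to the untrained individual(s) with the closest education value, and that ties are averaged. The function name `match_on_education` is a hypothetical helper, not something from the original slides.

```python
# Toy data from the slides: (individual, job_training, income, education_years)
records = [
    (1, 0, 60, 2), (2, 0, 80, 3), (3, 0, 90, 5), (4, 0, 200, 12),
    (5, 1, 100, 5), (6, 1, 80, 3), (7, 1, 90, 4), (8, 1, 70, 2),
]

def match_on_education(records):
    """Match each trained individual to the closest untrained one(s) on education.

    Assumption: ties (equally close controls) are all kept and their incomes averaged.
    Returns rows of (individual, matched_ids, y1, y0, difference).
    """
    controls = [r for r in records if r[1] == 0]
    treated = [r for r in records if r[1] == 1]
    rows = []
    for ind, _, income, edu in treated:
        best = min(abs(c[3] - edu) for c in controls)
        matched = [c for c in controls if abs(c[3] - edu) == best]
        y0 = sum(c[2] for c in matched) / len(matched)
        rows.append((ind, [c[0] for c in matched], income, y0, income - y0))
    return rows

rows = match_on_education(records)
# Mean of the treated-minus-matched differences
mean_difference = sum(r[4] for r in rows) / len(rows)

for row in rows:
    print(row)
print("Mean difference:", mean_difference)
```

With a single covariate this is easy to follow by hand. Each additional covariate (age, gender, ethnicity) makes "closest" harder to define, which is the motivation for collapsing the covariates into one score.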
General Description of PSM
• Why Propensity Score Matching (PSM)?
  -- The propensity score can reduce the entire set of covariates to a single variable.
  -- It adjusts for (but does not totally solve the problem of) selection bias.
• What is a propensity score? In statistical terms, propensity scores are the estimated conditional probability that a subject will be assigned to a particular treatment, given a vector of observed covariates (Pasta, D. J., p. 262).
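General Description of PSM
Here $Y_1$ and $Y_0$ are the outcomes with and without treatment, and $D$ indicates treatment ($D = 1$ treated, $D = 0$ untreated).
• Average Treatment Effect (ATE): $\mathrm{ATE} = E(\delta) = E(Y_1 - Y_0)$
• Average Treatment Effect on the Treated (ATT): $\mathrm{ATT} = E(Y_1 - Y_0 \mid D = 1)$
• Average Treatment Effect on the Untreated (ATU): $\mathrm{ATU} = E(Y_1 - Y_0 \mid D = 0)$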
Steps of PSM
• Estimate the propensity score
• Choose a matching algorithm that will use the estimated propensity scores to match untreated units to treated units
• Estimate the impact of the intervention with the matched sample and calculate standard errors
From Heinrich, C., Maffioli, A., & Vázquez, G. (2010)
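The three steps above can be sketched in plain Python using the toy job-training data from the earlier example. Everything beyond the data is an assumption made for illustration: education is the only covariate, the propensity score comes from a one-covariate logistic regression fitted by simple gradient ascent (a real analysis would normally use one of the statistical packages), matching is one-to-one nearest neighbor with replacement, and standard errors are not computed. The function names are hypothetical.

```python
import math

# Toy data from the earlier example: (individual, job_training, income, education_years)
records = [
    (1, 0, 60, 2), (2, 0, 80, 3), (3, 0, 90, 5), (4, 0, 200, 12),
    (5, 1, 100, 5), (6, 1, 80, 3), (7, 1, 90, 4), (8, 1, 70, 2),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def estimate_propensity(records, lr=0.1, iters=5000):
    """Step 1: estimate P(job_training = 1 | education) with logistic regression.

    Assumption: plain gradient ascent, used only to keep the sketch dependency-free.
    """
    xs = [r[3] for r in records]
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    zs = [(x - mean) / sd for x in xs]  # standardize the covariate for stable steps
    ds = [r[1] for r in records]
    b0 = b1 = 0.0
    for _ in range(iters):
        errs = [d - sigmoid(b0 + b1 * z) for z, d in zip(zs, ds)]
        b0 += lr * sum(errs) / len(errs)
        b1 += lr * sum(e * z for e, z in zip(errs, zs)) / len(errs)
    return {r[0]: sigmoid(b0 + b1 * z) for r, z in zip(records, zs)}

def match_nearest(records, scores):
    """Step 2: match each treated unit to the control with the closest score."""
    controls = [r for r in records if r[1] == 0]
    treated = [r for r in records if r[1] == 1]
    return {
        t[0]: min(controls, key=lambda c: abs(scores[c[0]] - scores[t[0]]))[0]
        for t in treated
    }

def estimate_att(records, matches):
    """Step 3: average income difference between treated units and their matches."""
    income = {r[0]: r[2] for r in records}
    diffs = [income[t] - income[c] for t, c in matches.items()]
    return sum(diffs) / len(diffs)

scores = estimate_propensity(records)
matches = match_nearest(records, scores)
att = estimate_att(records, matches)
print("Propensity scores:", scores)
print("Matches (treated -> control):", matches)
print("Estimated ATT:", att)
```

Because each treated unit keeps only a single nearest control here, the result need not equal the tie-averaged figures in the earlier matching table. Standard errors, which the third step calls for, would need an additional procedure that this sketch leaves out.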
Software for PSM
• Stata
• R
• SPSS
• S-Plus
• SAS
• Mplus
References
• Pasta, D. J. (n.d.). Using propensity scores to adjust for group differences: Examples comparing alternative surgical methods. SUGI paper 261-25.
• Heinrich, C., Maffioli, A., & Vázquez, G. (2010). A Primer for Applying Propensity-Score Matching. SPD Working Papers.
• Guo, S., & Fraser, M. W. (2010). Propensity score analysis: Statistical methods and applications. Sage.