Evidence-Based Policy, Evidence Grading Schemes and Entities, and Ethics in Complex Research Systems
Robert Boruch, University of Pennsylvania
September 14-15, 2008
5th European Conference on Complex Systems, Jerusalem
Summary of Themes
- People want to know "what works" so as to inform their decisions.
- The scientific quality of evidence on "what works" is variable and often poor.
- People's access to information on what works through the internet is substantial.
- Organizations and databases have been created to (a) develop evidence grading schemes, (b) apply the schemes in systematic reviews of evidence from multiple studies, and (c) disseminate results through the internet.
The "What Works" Theme
- "What works" refers here to estimating the effects of social, educational, or criminological interventions
- In a statistically/scientifically unbiased way
- And so as to generate a statistical statement of one's confidence in the results.
Evidence-Based Policy/Law: A Driver of Interest in What Works
- US, Canada, UK, Israel (e.g., National Academy)
- Sweden, Norway, Denmark
- Australia, Malaysia, China
- Mexico, others in Central America
- Multinationals: OECD, World Bank
- Others
Information Glut as a Driver: Naïve Web Searches
- A Google search on "evidence based" yields 9,660,000 links.
- A Google search on "what works" yields 6,350,000 links (0.21 seconds).
- A Google search on "evidence based practice" yields 2,000 links (0.42 seconds).
- A Google search on "evidence based policy" yields 132,000 links (0.35 seconds).
- What are we to make of this?
Publication Rates in Education
- 20,000 articles on education are published each year in English-language journals.
- Only 2 to 5 per 1,000 of these per year report on controlled trials of programs, policies, or practices to estimate effectiveness.
- For every curriculum package that has been tested in a controlled trial, there are 50-80 that are claimed to be effective based on no defensible scientific evidence.
Relevant Organizations Are Nested
- National, state/provincial, municipal: policy or law
- Agencies within nations, e.g., National Science Foundation, Institute of Education Sciences (US), university research
- Programs and projects within agencies
- Databases and reports within projects
- Users of information at each level, e.g., scientists, policy people, the public
International Organizations: NGOs
- Cochrane Collaboration in health care: http://cochrane.org
- Campbell Collaboration in education, welfare, crime and justice: http://campbellcollaboration.org
Two Examples Here
- International Campbell Collaboration in education, welfare, crime and justice
- What Works Clearinghouse in education (Institute of Education Sciences, US)
Databases in This Context
- Evidence grading schemes currently focus on reports of statistical analyses of impact, not yet on microrecords of individuals.
- Example: 5-10 statistical reports (ingredients of part of a database) on evaluating the impact of conditional income transfer programs in developing regions
- Example: the Cochrane Collaboration database on randomized trials contains nearly 0.5 million such reports
- "Meta-analysis" of results of multiple studies
C2-SPECTR
- C2 Social, Psychological, Educational, and Criminological Trials Register
- 13,000+ entries on randomized and possibly randomized trials
- Feeding into C2 systematic reviews
- Feeding into the IES What Works Clearinghouse (USDE)
The Campbell Collaboration
- Mission: since 2000, prepare, maintain, and make accessible C2 systematic reviews of evidence on the effects of interventions ("what works") to inform decision makers and stakeholders.
- International and multidisciplinary: education, social welfare/services, crime and justice
- http://campbellcollaboration.org
- Precedent: Cochrane Collaboration in health (1993)
Nine Key Principles of C2: A Scientific Ethic
1. Collaborating across nations and disciplines
2. Building on enthusiasm
3. Avoiding duplication
4. Minimizing bias
5. Keeping current
6. Striving for relevance
7. Promoting access
8. Ensuring quality
9. Maintaining continuity
What Are Evidence Grading Schemes (EGSs)?
- These are inventories (guidance, checklists, scales) or processes that...
- facilitate making transparent and uniform scientific judgments about...
- the quality of evidence on effects of programs, practices, or policies.
C2's and Others' Major Evidence Grading Distinction on What Works
- Randomized controlled trials yield the least biased and least equivocal evidence on "what works," i.e., the effect of a new intervention (program, practice, etc.).
- Alternative methods of estimating the effect of interventions yield more equivocal and more biased estimates of effect, e.g., "before-after" evaluations and other nonrandomized trials.
- Both randomized trials and nonrandomized trials are important, but they must be separated in evidence grading schemes.
Example: Randomized Controlled Trial
- Individuals or entities such as villages or organizations are randomly allocated to one of two or more interventions.
- The random allocation assures a fair comparison of the effects of the interventions.
- And the random allocation assures a statistically credible statement about confidence in the result, e.g., confidence intervals and statistical tests.
More Specific Example
- A new commercial curriculum package for math education is the intervention under investigation.
- The new curriculum is RANDOMLY allocated to half of a sample of 100 schools, with the remaining half serving as a control group, so as to form two equivalent groups of schools (fair comparison).
- The outcomes, such as achievement test scores, from the intervention group and the control group are compared.
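The allocation and comparison steps above can be sketched in code. This is a minimal sketch, assuming a two-arm trial over 100 schools with simulated achievement scores; the function names and all numbers are illustrative, not from the slides:

```python
import random
import statistics

def randomize_schools(school_ids, seed=2008):
    """Randomly split schools into two equivalent arms (fair comparison)."""
    rng = random.Random(seed)
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # intervention, control

def difference_with_ci(treated, control, z=1.96):
    """Difference in mean outcomes with an approximate 95% confidence interval."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return diff, (diff - z * se, diff + z * se)

schools = list(range(100))
intervention, control = randomize_schools(schools)

# Simulated test scores for illustration only (not study data)
rng = random.Random(0)
treated_scores = [rng.gauss(75, 10) for _ in intervention]
control_scores = [rng.gauss(70, 10) for _ in control]
diff, (lo, hi) = difference_with_ci(treated_scores, control_scores)
```

It is the random allocation, not the arithmetic, that licenses the confidence statement: the chance mechanism makes the two groups equivalent in expectation, so the interval and tests have their stated coverage.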
Entities and Evidence Grading Schemes for What Works
- Cochrane Collaboration: systematic reviews in health
- Campbell Collaboration: crime, education, welfare
- Society for Prevention Research (Prevention Science, 2006)
- What Works Clearinghouse, Institute of Education Sciences (WWC, IES): http://whatworks.ed.gov
- Food and Drug Administration, other regulatory agencies
- National Register of Evidence-based Programs and Practices
- Others: California, etc.
What Are the Ingredients of EGSs?
- Pre-specification of primary outcomes
- Pre-specification of all analyses
- Pre-specification of all measures
- Control for assignment/selection bias
- Appropriate comparison condition
- Comparison condition fidelity
- Control for subject awareness of assigned intervention
- Control for provider awareness of assigned intervention
- Control for data collector awareness of assigned intervention
- Assurances to participants to elicit disclosure
- Intervention fidelity / measurement of exposure
- Reliability and validity of exposure measures
- Control for contamination and co-intervention
- Reliability of outcome measures
- Validity of outcome measures
- Adherence to standards for data collection
- Adjustment for differential attrition
- Adjustment for overall loss to follow-up
- Adjustment for missing data
- Analysis meets statistical assumptions
- Analysis consistent with study theory
- Adjustment for multiple measures
- Absence of, or explanation for, anomalous findings
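One way to see how such ingredients operate in practice: treat each item as a yes/no criterion and summarize how completely a study report covers the list. The following is a hypothetical scoring sketch; real schemes (e.g., WWC standards) weight and combine criteria in scheme-specific, often non-additive ways, and the study ratings below are invented:

```python
# A small subset of the checklist, for illustration
CHECKLIST = [
    "Pre-specification of primary outcomes",
    "Control for assignment/selection bias",
    "Appropriate comparison condition",
    "Reliability of outcome measures",
    "Adjustment for differential attrition",
]

def completeness(study_ratings):
    """Fraction of checklist items a study report satisfies (True/False per item).

    Items missing from the ratings dict are counted as unmet.
    """
    met = sum(1 for item in CHECKLIST if study_ratings.get(item, False))
    return met / len(CHECKLIST)

# Hypothetical ratings for a single (invented) study report
example_study = {
    "Pre-specification of primary outcomes": True,
    "Control for assignment/selection bias": True,
    "Appropriate comparison condition": False,
    "Reliability of outcome measures": True,
    "Adjustment for differential attrition": False,
}
score = completeness(example_study)
```

The point of the sketch is the transparency the slide describes: every judgment about a study maps to an explicit, inspectable criterion rather than to an overall impression.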
WWC Aims
- To be a trusted source of scientific evidence on what works, what does not, and where evidence is absent...
- Not to endorse products
- http://www.whatworks.ed.gov
What Works Clearinghouse Illustration
Example: C2 Parental Involvement Trials
- 500 possibly relevant studies of impact
- 45 possible randomized controlled trials (RCTs)
- 20 RCTs met study inclusion criteria
- 18 RCTs included in the meta-analysis
- Nye, Turner, Schwartz: http://campbellcollaboration.org
Example: Petrosino et al. on Scared Straight Trials
- Over 600 articles are possibly relevant to the impact of Scared Straight.
- Only 15 reach a "reasonable" level of scientific standard.
- Only 7 reached the standard of being a randomized controlled trial.
Figure 1. The effects of Scared Straight and other juvenile awareness programs on juvenile delinquency: random effects model, "first effect" reported in the study (Petrosino, Turpin-Petrosino, and Buehler, 2002). n = number of failures; N = number of participants; CI = confidence interval; Random = random effects model assumed.
C2 Product: Scared Straight (Pro Humanitate Award)
Observational studies:
- Ashcroft: -50% crime
- Buckner: 0%
- Berry: -5%
- Mitchell: -53%
- Several dozen others
Randomized trials:
- Mich: +26% crime
- Gtr Egypt: +5%
- Yarb: +1%
- Orchow: +2%
- Vreeland: +11%
- Finckenauer: +30%
- Lewis: +14%
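The pooled estimate behind a synthesis like this comes from the random-effects model named in the figure caption. A minimal DerSimonian-Laird sketch follows; the effect sizes and variances are hypothetical inputs (the effects are patterned on the randomized-trial column above but are not derived from the actual study data):

```python
import math

def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooled effect with a 95% CI."""
    w = [1.0 / v for v in variances]                        # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # heterogeneity
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)           # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical effect sizes and within-study variances for seven trials
effects = [0.26, 0.05, 0.01, 0.02, 0.11, 0.30, 0.14]
variances = [0.05, 0.08, 0.06, 0.10, 0.07, 0.04, 0.09]
pooled, (lo, hi) = random_effects_pool(effects, variances)
```

The random-effects weights shrink toward equality as between-study heterogeneity grows, which is why the model is the conservative choice when, as here, trials differ in setting and implementation.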
Scientific Ethic
- Providing access to scientific reports of evaluations of the effects of interventions, e.g., journal publications and limited-circulation reports from governments or private organizations
- Providing information beyond reports to assure understanding
- In principle, but not always in practice, providing access to micro-records from impact evaluations
Ethics of Research on Humans
- Evidence grading schemes and organizations need not worry about individual privacy because they do not, as yet, have access to individuals' records in identifiable form.
- They rely on statistical/scientific reports, published in peer-reviewed journals and elsewhere, which include no individual records.
Ethics and Law: US
- Individual rights to privacy are routinely assured on account of professional ethics statements and laws in the US.
- The relevant codes of professional ethics in the US include those of AERA, ASA, AAPOR, APA, and others.
- The relevant laws in the US include the Family Educational Rights and Privacy Act (FERPA), the Privacy Act, and HIPAA.
Ethics and Randomized Controlled Trials
- Relevant codes and law concern individual privacy and the confidentiality of individuals' identifiable micro-records.
- Relevant regulations and codes include attention to informed consent (45 CFR 46).
- Access to anonymous micro-records for secondary analysis is problematic and possibly unnecessary in this context.
Appendices
Robert Boruch: Bio
Boruch is the University Trustee Chair Professor in the Graduate School of Education and the Statistics Department of the Wharton School at the University of Pennsylvania, Philadelphia, Pennsylvania.
Boruch is a Fellow of the American Statistical Association, the Academy of Experimental Criminology, the American Academy of Arts and Sciences, and the American Educational Research Association.
Email: robertb@gse.upenn.edu
Provision to Advance Rigorous Evaluations in Legislation
- The program shall allocate X% of program funds [or $Y million] to evaluate the effectiveness of funded projects using a methodology that:
  - Includes, to the maximum extent feasible, random assignment of program participants (or entities working with such persons) to intervention and control groups; and
  - Generates evidence on which program approaches and strategies are most effective.
- The program shall require program grantees, as a condition of grant award, to participate in such evaluations if asked, including the random assignment.
Provision to Advance Replication of Research-Proven Interventions
- Agency shall establish a competitive grant program focused on scaling up research-proven models.
- Grant applicants shall:
  - Identify the research-proven model they will implement, including supporting evidence (well-designed RCTs showing sizeable, sustained effects on important outcomes);
  - Provide a plan to adhere closely to key elements of the model; and
  - Obtain sizeable matching funds from other sources, especially large formula grant programs.
A Focus on Databases That Concern "What Works"
- Here, the focus is on projects that generate evidence about "what works," and what does not work, using good scientific standards.
- This is different from a focus on projects or programs that generate information on the nature of a problem, monitor program compliance with law, etc.
What Are the Campbell Collaboration (C2) Assumptions?
- Public interest in evidence-based policy and practice will increase.
- Scientific and government interest in cumulation and synthesis of evidence on "what works" will increase.
- Access to information and evidence of dubious quality, and the need to screen for quality of evidence, will increase.
- The use of randomized controlled trials to generate trustworthy evidence on what works will increase.
What Are the Products?
1. Registries of C2 systematic reviews of the effects of interventions (C2-RIPE)
2. Registries of reports of randomized and nonrandomized trials (C2-SPECTR) and of future reports of randomized trials (C2-PROT)
3. Standards of evidence for conducting C2 systematic reviews
4. Annual Campbell Colloquia
5. Training for producing reviews
6. New technologies and methodologies
7. Web site: http://www.campbellcollaboration.org
What Are Other C2 Products?
- C2 Trials Register (C2-SPECTR): 13,000 entries
- Annals of the American Academy of Political and Social Science: special issues
- C2 Prospective Trials Register
- C2 Policy Briefs
- Annual and intermediate meetings: London, Philadelphia, Stockholm, Lisbon, Paris, Oslo, Copenhagen, Helsinki, Los Angeles
Hand Search vs. Machine-Based Search
- Journal of Educational Psychology ('03-'06)
- Hand search: RCTs = 66
- Full-text electronic search, N = 99: 59% accurate, 41% false positives, 24% false negatives
- Abstract-only electronic search, N = 11: 91% accurate, 9% false positives, 85% false negatives
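The percentages above come from judging machine retrieval against the hand search as the gold standard. A sketch of that arithmetic, using hypothetical counts rather than the Journal of Educational Psychology figures; the rate definitions are assumptions, since the slide does not spell them out:

```python
def retrieval_rates(true_pos, false_pos, false_neg):
    """Rates for a machine search judged against a hand-search gold standard.

    Assumed definitions (hypothetical):
      precision ("accurate")  = share of retrieved records that really are RCTs
      false-positive rate     = share of retrieved records that are not RCTs
      false-negative rate     = share of gold-standard RCTs the search missed
    """
    retrieved = true_pos + false_pos
    relevant = true_pos + false_neg
    return {
        "precision": true_pos / retrieved,
        "false_positive_rate": false_pos / retrieved,
        "false_negative_rate": false_neg / relevant,
    }

# Hypothetical counts: 80 records retrieved, of which 50 are real RCTs,
# against a hand-search gold standard of 66 RCTs (so 16 were missed)
rates = retrieval_rates(true_pos=50, false_pos=30, false_neg=16)
```

Note that precision and the false-negative rate have different denominators (records retrieved vs. gold-standard RCTs), which is how a search can look "91% accurate" while still missing most of the relevant trials.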
What Is the Value Added?
- Building a cumulative knowledge base
- Developing exhaustive searches
- Producing transparent and uniform standards of evidence
- International scope
- Periodic updating
- Making reviews accessible
C2 Futures/Tensions
- C2 production: AIR and others
- C2 publications vs. journals
- C2 with governments and C2 apart from governments
- C2 and sustainability: C2 as a voluntary organization versus C2 with spin-off organizations and products
What Are Other Illustrative Reviews?
- "Scared Straight" programs (done, award)
- Multi-systemic therapy (done)
- Parental involvement (done)
- After-school programs (due 12/05)
- Peer-assisted learning
- Counter-terrorism strategies (under revision)
- Reducing illegal firearms possession