ARG symposium discussion
Dylan Wiliam
Annual conference of the British Educational Research Association; London, UK: 2007
www.dylanwiliam.net

Validity
“Validity is an integrative evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment” (Messick, 1989, p. 13)
Validity is a property of inferences, not assessments
  • No such thing as a biased assessment
Validity subsumes all aspects of assessment quality
  • Reliability
  • Content coverage
But not impact (Popham: right concern, wrong concept)

Messick (1989)

                      Result interpretation    Result use
Evidential basis      Content validity         Construct validity/utility
Consequential basis   Value implications       Social consequences

Validity
“As has been stressed several times already, it is not that adverse social consequences of test use render the use invalid, but, rather, that adverse social consequences should not be attributable to any source of test invalidity such as construct-irrelevant variance. If the adverse social consequences are empirically traceable to sources of test invalidity, then the validity of the test use is jeopardized. If the social consequences cannot be so traced—or if the validation process can discount sources of test invalidity as the likely determinants, or at least render them less plausible—then the validity of the test use is not overturned. Adverse social consequences associated with valid test interpretation and use may implicate the attributes validly assessed, to be sure, as they function under the existing social conditions of the applied setting, but they are not in themselves indicative of invalidity.” (Messick, 1989, pp. 88-89)

Koretz, Linn, Dunbar, Shepard (1991)

Sensitivity to instruction
Average cohort progress: 0.3 sd per year
Good teachers (+1 sd) produce 0.4 sd per year
Poor teachers (-1 sd) produce 0.2 sd per year
Giving all disadvantaged children above-average teachers, and all advantaged children below-average teachers, would take 5 years to eradicate the achievement gap…
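
The five-year figure follows from simple arithmetic on the rates quoted on this slide. A minimal Python sketch, assuming (the slide does not state this) that the achievement gap is about 1 sd and that "above average" and "below average" teachers correspond to the +1 sd and -1 sd rates; the function name and the gap value are illustrative only:

```python
def years_to_close_gap(gap_sd, faster_rate_sd, slower_rate_sd):
    """Years for the faster-progressing group to make up a gap (all values in sd units)."""
    closing_rate = faster_rate_sd - slower_rate_sd  # sd gained on the other group per year
    if closing_rate <= 0:
        raise ValueError("the faster group must progress faster than the slower group")
    return gap_sd / closing_rate


GOOD_TEACHER_RATE = 0.4  # sd per year, from the slide (+1 sd teachers)
POOR_TEACHER_RATE = 0.2  # sd per year, from the slide (-1 sd teachers)
ASSUMED_GAP_SD = 1.0     # assumption: the size of the gap is not given on the slide

years = years_to_close_gap(ASSUMED_GAP_SD, GOOD_TEACHER_RATE, POOR_TEACHER_RATE)
print(f"{years:.1f} years")
```

Under these assumptions the gap narrows by 0.2 sd per year, so a gap of 1 sd takes about 5 years to close, consistent with the slide.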

So…
Although teacher quality is the single most important determinant of student progress…
…the effect is small compared to the accumulated achievement over the course of a learner’s education…
…inferences that school outcomes are indications of the contributions made by the school are almost certainly invalid.
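
To get a rough sense of scale for "small compared to the accumulated achievement", the rates from the sensitivity-to-instruction slide (0.3 sd per year on average, 0.4 sd per year with a +1 sd teacher) can be set against each other. This is only a sketch: the ten years of schooling is a hypothetical value chosen for illustration, not a figure from the talk, and the function name is made up.

```python
AVERAGE_RATE = 0.3       # sd per year, average cohort progress (from the slides)
GOOD_TEACHER_RATE = 0.4  # sd per year with a +1 sd teacher (from the slides)


def teacher_share_of_progress(years_of_schooling):
    """Extra progress from one year with a +1 sd teacher, as a fraction of
    the progress accumulated at the average rate over the given years."""
    if years_of_schooling <= 0:
        raise ValueError("years_of_schooling must be positive")
    accumulated = AVERAGE_RATE * years_of_schooling       # total sd gained so far
    one_year_advantage = GOOD_TEACHER_RATE - AVERAGE_RATE  # extra sd from one good year
    return one_year_advantage / accumulated


ASSUMED_YEARS = 10  # hypothetical length of schooling, for illustration only
share = teacher_share_of_progress(ASSUMED_YEARS)
print(f"{share:.1%} of accumulated progress")
```

With these illustrative inputs, one year with a +1 sd teacher adds roughly 3% to what the learner has already accumulated, which is the sense in which a school's current contribution is hard to read off its outcomes.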

Teacher involvement in high-stakes assessments
In a high-stakes environment, the challenge is to produce assessments worth teaching to
Assessments therefore need to be
  • Distributed
  • Cumulative and synoptic

Teacher professional development
Effective teacher engagement in summative assessment requires new teacher knowledge
Effective teacher engagement in formative assessment requires new teacher behaviours
These require very different kinds of professional development, and different structures to sustain them