Notes towards a theory of formative assessment
Dylan Wiliam
King’s College London
www.kcl.ac.uk
www.dylanwiliam.net

Outline
• What is formative assessment?
• Putting it into practice
• Theorising the outcomes

Assessment for Learning...
Assessment for learning is any assessment for which the first priority in its design and practice is to serve the purpose of promoting learning. It thus differs from assessment designed primarily to serve the purposes of accountability, or of ranking, or of certifying competence. Specifically, assessment for learning describes all those activities undertaken by learners and teachers to assist the learners in finding out where they are in their learning, where they are going, and how to get there.

Formative assessment
An assessment activity can help learning if it provides information to be used as feedback, by teachers, and by their students, in assessing themselves and each other, to modify the teaching and learning activities in which they are engaged. Such assessment becomes ‘formative assessment’ when the evidence is actually used to adapt the teaching work to meet learning needs.

What research says about Assessment for Learning
Reviews of research provide firm evidence that Assessment for Learning practices improve learning and raise achievement
• Natriello (1987)
• Crooks (1988)
• Black and Wiliam (1998)

Substantial effects
About 50 studies, ranging over ages, subjects and countries, compared improvements in achievements for students in ‘intervention’ groups with students in ‘control’ groups. ‘Assessment for learning’ innovations typically produced effect sizes of between 0.4 and 0.7, larger than those found for most other educational innovations.
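The effect sizes quoted for these studies are standardised mean differences. As a rough illustration only, the sketch below computes one as Cohen's d with a pooled standard deviation. That choice of formula is an assumption (the slide does not say how each study calculated its effect size), and the gain scores are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(intervention, control):
    """Standardised mean difference (Cohen's d) using a pooled standard deviation."""
    n1, n2 = len(intervention), len(control)
    s1, s2 = stdev(intervention), stdev(control)  # sample standard deviations
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(intervention) - mean(control)) / pooled_sd


# Hypothetical gain scores, invented for illustration only.
intervention_gains = [12, 15, 9, 14, 11, 13]
control_gains = [10, 13, 8, 12, 9, 11]
print(round(effect_size(intervention_gains, control_gains), 2))
```

Read this way, an effect size of 0.4 means the average intervention student gained 0.4 pooled standard deviations more than the average control student.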

Aspects of formative assessment

        | Where the learner is  | Where they are going  | How to get there
Teacher | Eliciting information | Curriculum philosophy | Feedback
Peer    | Peer-assessment       | Sharing criteria      | Peer-tutoring
Learner | Self-assessment       | Sharing criteria      | ?

Inferences from responses

Kinds of questions: Israel
• Which fraction is the smallest? Success rate 88%
• Which fraction is the largest? Success rate 46%; 39% chose (b)
[Vinner, PME conference, Lahti, Finland, 1997]

Fit and match (false positives)
• Responses to weak questions are consistent with (ie match) a wide range of learning outcomes, and thus provide limited support for inferences about learning needs
• Responses to strong questions are consistent (ie fit) only with a narrow range of learning outcomes, and thus provide strong support for inferences about learning needs

Disclosure (false negatives)
• Questions with high disclosure can be relied on to provide evidence of the learner’s capability on the construct of interest (ie if they know it, they show it)
• Questions with low disclosure cannot be relied on to provide evidence of the learner’s capability (ie verdict is not proven)

Effects of feedback (1)
• 132 low and high ability year 7 pupils in 12 classes in 4 schools
• Same teaching, same aims, same teachers, same classwork
• Three kinds of feedback: marks, comments, marks+comments

Feedback            | Gain | Interest
marks               | none | top +ve, bottom -ve
comments            | 30%  | all +ve
marks plus comments | none | top +ve, bottom -ve

[Butler (1988) Br. J. Educ. Psychol., 58, 1-14]

Effects of feedback (2)
Kluger & DeNisi (1996) undertook a comprehensive review of research reports related to feedback
• Excluding those:
  • with poor design or without adequate controls
  • with fewer than 10 participants
  • where performance was not measured or effect sizes not given
• left 131 reports, 607 effect sizes, on 12652 individuals
• Average effect size 0.4, but standard deviation of effect sizes almost 1.0
• 40% of effect sizes were negative
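A quick check shows how an average effect of 0.4 can sit alongside so many negative results. The sketch below assumes the effect sizes are roughly normally distributed, which is an assumption made only for this illustration (the slide gives just the mean and standard deviation). Under that assumption about a third of effects fall below zero, the same order as the 40% reported.

```python
from statistics import NormalDist


def share_negative(mean_effect, sd_effect):
    """Proportion of effect sizes below zero under a normal approximation."""
    return NormalDist(mu=mean_effect, sigma=sd_effect).cdf(0.0)


# Mean 0.4 and standard deviation 1.0, as quoted on the slide.
print(f"{share_negative(0.4, 1.0):.1%}")
```

The function name `share_negative` is hypothetical. The point is only that a spread this wide means feedback frequently lowers performance, even though it helps on average.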

Quality of feedback: scaffolding
Day & Cordón, 1993
• 2 Y4 classes
• experimental group 1 given solution when stuck
• experimental group 2 given ‘scaffolded’ response
• Group 2 outperformed group 1

Feedback and formative assessment
Feedback contributes to Assessment for Learning (ie the assessment is formative) only if the information fed back to the learner is actually used by the learner in making improvements.

Understanding quality
Conditions for improvement (Sadler, 1989)
“The indispensable conditions for improvement are that the student comes to hold a concept of quality roughly similar to that held by the teacher, is continuously able to monitor the quality of what is being produced during the act of production itself, and has a repertoire of alternative moves or strategies from which to draw at any given point.”
• Telos: goals versus horizons

Understanding quality
“Maxims cannot be understood, still less applied by anyone not already possessing a good practical knowledge of the art. They derive their interest from our appreciation of the art and cannot themselves either replace or establish that appreciation.” (Polanyi, 1958, p. 50)
“Quality doesn’t have to be defined. You understand it without definition. Quality is a direct experience independent of and prior to intellectual abstractions.” (Pirsig, 1991, p. 64)

Understanding quality
• 3 teachers each teaching 4 Y8 science classes in two US schools
• 14 week experiment
• 7 two-week projects, scored 2-10
• For a part of each week:
  • Two of each teacher’s classes discuss their likes and dislikes about the teaching (control)
  • The other two classes discuss how their work will be assessed
• All other teaching is the same
[Frederiksen & White, AERA conference, Chicago, 1997]

Sharing criteria with learners

Self-assessment: Portugal
• 50 teachers following a part-time Masters in Education programme for one evening a week over two years
• 25 teachers spent two terms (ie 20 weeks) developing and promoting pupil self-assessment in mathematics
• Students taught by control group teachers gained 7.5 marks over the two terms
• Students taught by teachers developing self-assessment (matched in age, qualifications and experience, using the same mathematics scheme for the same amount of time) gained 15 marks
[Fontana & Fernandes, Br. J. Educ. Psychol., 64: 407-417]

Formative & summative
• Summative function
  • validated by widely shared meanings
  • requires teachers to form a community of practice
• Formative function
  • validated by appropriate consequences (ie learning)
  • requires learners to be enculturated into the same community of practice
  • requires teachers to interpret performance in terms of learning needs (ie to possess an anatomy of quality)

Educational knowledge
• Nature of knowledge in education
  • no reliable knowledge
  • reasonableness, not rationality
• Nature of expertise
  • exquisitely attuned to local context

Countdown
Numbers: 25 3 9 1 4
Target number: 127
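For readers who want to check that the puzzle has a solution, here is a minimal brute-force sketch. It assumes the usual Countdown rules (each number used at most once, the four basic operations, only positive whole-number intermediate results), which the slide itself does not state. The function names `solve` and `_search` are made up for this example.

```python
from itertools import combinations


def solve(numbers, target):
    """Return an expression string that evaluates to target, or None if none exists."""
    # Each item pairs a value with the expression that produced it.
    return _search([(n, str(n)) for n in numbers], target)


def _search(items, target):
    for value, expr in items:
        if value == target:
            return expr
    # Combine every pair of items in every allowed way, then recurse on the shorter list.
    for i, j in combinations(range(len(items)), 2):
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        (hi, e_hi), (lo, e_lo) = sorted([items[i], items[j]], reverse=True)
        candidates = [(hi + lo, f"({e_hi} + {e_lo})"), (hi * lo, f"({e_hi} * {e_lo})")]
        if hi > lo:  # keep intermediate results positive
            candidates.append((hi - lo, f"({e_hi} - {e_lo})"))
        if hi % lo == 0:  # only exact division is allowed
            candidates.append((hi // lo, f"({e_hi} // {e_lo})"))
        for candidate in candidates:
            found = _search(rest + [candidate], target)
            if found is not None:
                return found
    return None


print(solve([25, 3, 9, 1, 4], 127))
```

One solution a person might spot directly is 25 × 4 + 9 × 3 = 100 + 27 = 127, leaving the 1 unused. The search may print a different but equally valid expression.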

Knowledge transfer
After Nonaka & Takeuchi, 1995

KMO Formative Assessment Project
• 24 teachers, each developing their practice in individual ways
• Different outcome variables
• No possibility of standardized controls
• ‘Local design’
• Synthesis by standardized effect size

Inset timetable

Practical strategies: questioning
• Improving teacher questioning
  • closed v open
  • low-order v high-order
• Generating questions with colleagues
• ‘Hot Seat’ questioning
  • extended interaction with one student to scaffold learning
  • other students learn vicariously
• ‘No hands up’ (except to ask a question)
• Brainstorming what students know already
• Increased wait time
• Training students to pose questions
• Class polls to review current attitudes towards an issue

Practical strategies: feedback
• Comment-only marking
• Focused marking
• Explicit reference to criteria
• Suggestions on how to improve
  • ‘Strategy cards’ ideas for improvement
• Not giving complete solutions
• Re-timing assessment (eg two-thirds-of-the-way-through-a-topic test)

Practical strategies: understanding quality
• Explaining learning objectives at start of lesson/unit
• Criteria in students’ language
• Posters of key words to talk about learning, eg describe, explain, evaluate
• Planning/writing frames
• Annotated examples of different standards to ‘flesh out’ assessment criteria
• Opportunities for students to design their own tests

Practical strategies: peer- and self-assessment
• Students assessing their own/peers’ work
  • with marking schemes
  • with criteria
  • with exemplars
• Identifying group weaknesses
• Self-assessment of confidence and uncertainty
  • Traffic lights
  • Smiley faces
  • Post-it notes
• End-of-lesson students’ review

Clustering of strategies

Theorising formative assessment
Why?
• to make sense of studies with low power
• to relate formative assessment to other, similar interventions (eg thinking skills programmes such as cognitive acceleration)
• to simplify or optimise the intervention

A theory of everything?
No; formative assessment focuses on moments of contingency in teaching and learning, but provides a ‘Trojan Horse’ into wider issues

Theorising formative assessment
“[Whether formative assessment works] no longer seems to me, however, to be the central issue. It would seem more important to concentrate on theoretical models of learning and its regulation and their implementation. These constitute the real systems of thought and action, in which feedback is only one element.” (Perrenoud, 1998, p. 86)
Regulation of activity of learning

A simple model
Make things as simple as possible, but no simpler (Einstein)
• Roles (division of labour)
  • Teachers
  • Students
    • as individuals
    • as groups
• Resources (cultural artefacts)
  • Theories of learning
  • Nature of subject

A simple model
[Diagram with the labels: Theories of learning; Teachers; Students as individuals; Subjects; Students as groups]

Subject knowledge
• Types of knowledge
  • abstract content knowledge
  • pedagogical content knowledge (Shulman, 1986)
• Subject differences
  • knowledge to be assimilated
  • skills to be acquired
  • capability to be developed

Subject knowledge
[Teachers of other subjects] do more of it than us as part of their normal teaching. Art and drama teachers do it all the time, so do technology teachers (something to do with open-ended activities, long project times, and perhaps a less cramped curriculum?). But an English teacher came up to me today and said “Yesterday afternoon was fantastic. I tried it today with my year 8s, and it works. No hands up, and giving them time to think. I had fantastic responses from kids who have barely spoken in class all year. They all wanted to say something and the quality of answers was brilliant. This is the first time for ages that I’ve learnt something new that’s going to make a real difference to my teaching.” (James, Two Bishops School)

Theories of learning
Teachers have asked for lectures on the psychology of education! They have begun to focus on what their students make of their teaching, and to build predictive models of how they will learn.

The teacher’s role
“I would like to suggest several ways forward, based on distinguishing two levels of the management of situations which favour the interactive regulation of learning processes: the first relates to the setting up of such situations through much larger mechanisms and classroom management; the second relates to interactive regulation which takes place through didactic situations.” (Perrenoud, 1998, p. 92)

The teacher’s role
I now think more about the content of the lesson. The influence has shifted from ‘what am I going to teach and what are the pupils going to do?’ towards ‘how am I going to teach this and what are the pupils going to learn?’ (Susan, Waterford School)
There was a definite transition at some point, from focusing on what I was putting into the process, to what the students were contributing. It became obvious that one way to make a significant sustainable change was to get the students doing more of the thinking. I then began to search for ways to make the learning process more transparent to the students. Indeed, I now spend my time looking for ways to get students to take responsibility for their learning and at the same time making the learning more collaborative. (Tom, Riverside School)

The students’ role
They feel that the pressure to succeed in tests is being replaced by the need to understand the work that has been covered and the test is just an assessment along the way of what needs more work and what seems to be fine. [...] They have commented on the fact that they think I am more interested in the general way to get to an answer than a specific solution and when Clare [a researcher] interviewed them they decided this was so that they could apply their understanding in a wider sense. (Belinda, Cornbury Estate School)

Critical factors
• Evidence
• Ideas
• Support
• Reflection
• Time

And yet...
The 4-component model is a model only of ‘what happened?’: a representation of the dynamics of the process of implementation

Were the components right?
• Criteria for individual components
  • relevance
  • feasibility
  • acceptability
  • based on cognitive/affective principles
• Criteria for the collection of components
  • synergy
  • completeness (eg integrating formative & summative)

Principles for design 1
• Process of intervention must be slow, steady, piecemeal ‘infiltration’ rather than by wholesale imposition
• Content of intervention must match such an approach
• Components must emphasise underlying principles
  • synergy
  • comprehensiveness
  • based in cognitive and affective principles

Design and intervention
• The design process: cognitive/affective insights → synergy/comprehensiveness → set of components
• The implementation process: set of components → synergy/comprehensiveness → cognitive/affective insights

Principles for design 2
Design of process must promote progress in each of
• role of teacher
• role of students
• nature of the subject
• theories of learning
and must foster interactions between these.

Outside all this
• Teacher change & professional development
• Context effects
  • institutional effects
  • national cultures and policies
  • resources