Alternative Assessment
Brief introduction
- Approach to measuring a student's status based on the way the student completes a specific task
- Became popular during the 1990s as a result of educators' dissatisfaction with traditional paper-and-pencil tests
- Proponents: Marzano, Pickering, & McTighe (1993); NCTE (1996); NCTM (1995); Stiggins (1991); Wiggins (1989)
Some distinctions
- The degree to which the examination simulates the criterion situation
- Assessment in which students are required to construct an original response
- The extent to which the examination approximates the domain of student behaviors about which we wish to make inferences
- Fosters thought, persistence, construction of meaning, and deepening of understanding in students
- Fosters application of concepts and principles to the real world, even if not always situated in real-life settings
Types…
- Performance
- Observation
- Portfolio
- Journal
- Authentic/Project
Reasons…
- A different form of assessment (from the traditional)
- Ability to tap and assess higher-order thinking skills and performances
- Tasks require reasoned application of acquired knowledge rather than rote repetition of facts or formulae
- Tasks are defined within contexts that are meaningful to students, either as relevant to real-world tasks and/or as examples of ongoing work produced by the student
Principles…
- Purpose of Assessment: Different tests are designed for different purposes and should be used only for the purpose(s) for which they are designed.
- Use of Multiple Measures: Multiple measures should always be considered in generating information upon which any type of educational decision is to be made. No test, no matter how technically advanced, should be used as the sole criterion in making high-stakes decisions.
Principles…
- Technical Rigor: All assessments must meet appropriate standards of technical rigor, which have been clearly defined prior to the development of the assessment. This is even more important when an assessment is used in a "high-stakes" decision context.
Principles…
- Cost Effectiveness: In making selection decisions about assessments, the quality and utility of the information produced must be weighed against the cost of collecting, interpreting, and reporting it. Such costs must take into account the time requirements for developers, teachers, administrators, and students.
Principles…
- Protection of Students/Equitability: No harm should accrue to any student or group of students as a result of the administration or subsequent use of the results of any form of assessment.
Principles…
- Educational Value: All performance assessments should be designed so that both the administration of the assessment itself and the use of its results augment the educational experience of students. Indeed, such augmentation is one of the most significant potential contributions performance assessment has to offer educational reform.
Principles…
- Decision Making: All assessments should provide data that enhance the decision-making ability of students, lecturers, administrators, top university management, parents, and/or community members.
Characteristics…
- Asks students to perform, create, or produce something
- Encourages student self-reflection
- Measures outcomes of significance
- Taps higher-level thinking and problem-solving skills
- Uses tasks that represent meaningful instructional activities
- Invokes real-world applications
- Uses human judgment (rather than machines) for scoring
Characteristics…
- Requires new instructional and assessment roles for teachers
- Provides self-assessment opportunities for students
- Provides opportunities for both individual and group work
- Encourages students to continue the learning activity beyond the scope of the assignment
- Defines explicit performance criteria
- Makes assessment equal in importance to curriculum and instruction
Requirements…
- Substantial teacher involvement and professional development
- Clear definition of the skills to be assessed and of the relationship of those skills to the specific context or format in which they will be assessed
- Clear specification of the criteria on which skills are to be judged
Requirements…
- Clear specification of the purposes for which an assessment will be used and how the outcomes obtained will be used for those purposes
- Specification of scoring and adequate training of raters, especially if the assessment is to be used for high-stakes decision-making
Example: collaborative skills
- Learning outcome: At the end of the course, students can solve problems collaboratively
- There may be a set of assessment options that vary in the degree to which the task approximates the desired criterion behavior: true-false, MCQ, short answer, essay, observation through novel problems, etc.
- Thus different educators may use the phrase "performance assessment" for different types of assessment
Features of performance assessment
- Multiple evaluative criteria: the student's performance must be judged on more than one evaluative criterion
- Prespecified quality standards: each criterion to be judged must be clearly explicated in advance of judging the quality of the student's performance
- Judgmental appraisal: it must depend on human judgment
Issues of performance assessment
- Selecting appropriate tasks: typically requires a small number of more significant tasks to be performed
- E.g., perform an actual chemistry experiment, then write an interpretation of the experimental results and an analytic critique of the procedures used
- Great care must be taken in the selection of these tasks (self-generated or selected from elsewhere)
Issues (continued)
- These tasks then determine how you assess your students via: the inferences you make about them, and the decisions that will be based on those inferences
- The learning outcome (LO) provides the source of the inference; assessment tasks yield the evidence needed for the teacher to arrive at defensible inferences regarding the extent to which students can perform
Generalizability dilemma
- The dilemma is that when students respond to fewer tasks, the teacher is in a difficult position when deriving inferences about students' abilities
- Thus any inferences should be made with increased caution
- Need to select tasks that optimize the likelihood of accurately generalizing students' abilities
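One way to see the dilemma quantitatively: if each scored task is treated as a sample of the student's ability, the uncertainty around the student's mean score shrinks as more tasks are observed. The sketch below is an illustration only (the task scores are hypothetical and not from the slides).

```python
# Illustrative sketch (hypothetical scores): the standard error of a student's
# mean task score shrinks as the number of scored tasks grows, which is why
# inferences drawn from only a few performance tasks call for extra caution.
from math import sqrt
from statistics import mean, stdev

def standard_error(scores: list[float]) -> float:
    """Standard error of the mean for the observed task scores."""
    return stdev(scores) / sqrt(len(scores))

if __name__ == "__main__":
    few_tasks = [78, 62, 90]                        # only 3 tasks observed
    many_tasks = [78, 62, 90, 71, 84, 66, 80, 75]   # 8 tasks observed
    print(f"3 tasks: mean={mean(few_tasks):.1f}, SE={standard_error(few_tasks):.1f}")
    print(f"8 tasks: mean={mean(many_tasks):.1f}, SE={standard_error(many_tasks):.1f}")
```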
Factors to consider when evaluating tasks
- Generalizability
- Authenticity
- Multiple foci
- Teachability
- Fairness
- Feasibility
- Scorability
Identifying scoring criteria
- The scoring of constructed responses centers on the "evaluative criteria" that determine the adequacy of students' responses
- A criterion is a standard on which a judgment or decision may be based
- E.g., scoring a composition on the basis of "organization, word choice, clarity" is not the same as scoring it on the basis of "spelling, punctuation, grammar"
- Scoring procedures these days are called "rubrics"
Rubrics
A rubric needs at least 3 important features:
- Suitable evaluative criteria
- Descriptions of qualitative differences for each evaluative criterion
- An indication of whether a holistic or analytic scoring approach is to be used (devising a numerical scoring scale for each criterion; scoring criteria are applied in the form of ratings of performances or observations of student behavior)
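To make the analytic-versus-holistic distinction concrete, the sketch below scores one composition two ways: analytically, by rating each evaluative criterion on a numerical scale and combining the ratings, and holistically, with a single overall rating. The criterion names, weights, and the 1-4 scale are hypothetical assumptions, not taken from the slides.

```python
# Minimal sketch (hypothetical criteria, weights, and 1-4 scale) contrasting
# analytic scoring (one rating per evaluative criterion) with holistic scoring
# (a single overall rating of the whole performance).

ANALYTIC_RUBRIC = {
    # evaluative criterion -> weight (weights sum to 1.0)
    "organization": 0.4,
    "word choice": 0.3,
    "clarity": 0.3,
}

def analytic_score(ratings: dict[str, int], scale_max: int = 4) -> float:
    """Combine per-criterion ratings (1..scale_max) into a weighted percentage."""
    for criterion in ANALYTIC_RUBRIC:
        if criterion not in ratings:
            raise ValueError(f"Missing rating for criterion: {criterion}")
    weighted = sum(ANALYTIC_RUBRIC[c] * ratings[c] for c in ANALYTIC_RUBRIC)
    return 100 * weighted / scale_max

def holistic_score(overall_rating: int, scale_max: int = 4) -> float:
    """A single judgment of the whole performance, expressed as a percentage."""
    return 100 * overall_rating / scale_max

if __name__ == "__main__":
    ratings = {"organization": 3, "word choice": 4, "clarity": 3}
    print(f"Analytic: {analytic_score(ratings):.1f}%")  # 82.5%
    print(f"Holistic: {holistic_score(3):.1f}%")        # 75.0%
```

The weighting is only one possible design choice; an unweighted average of the criterion ratings is an equally common analytic approach.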
Types of rubrics
- Task-specific: criteria that are linked to the particular task embodied in a performance test
- Hypergeneral: criteria described in a general and amorphous manner
- Skill-focused: criteria that can markedly enhance the teacher's instruction
Skill-focused rubric
Conceptualized around the skill that is:
- Being measured by the constructed-response assessment
- Being pursued instructionally by the teacher
5 rubric rules:
- Make sure the skill assessed is significant
- Make sure all evaluative criteria can be addressed instructionally
- Employ as few evaluative criteria as possible
- Provide a succinct label for each evaluative criterion
- Match the length of the rubric to your own tolerance for detail
Sources of error in scoring
- Scoring-instrument flaws: lack of descriptive rigor in the evaluative criteria
- Procedural flaws: too many evaluative criteria to rate
- Teacher's personal-bias errors: generosity errors, severity errors, central-tendency errors, and the halo effect
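As a concrete illustration of the personal-bias errors, the heuristic sketch below flags possible generosity, severity, and central-tendency patterns from the distribution of one rater's ratings. The thresholds and the 1-4 scale are illustrative assumptions, not established cut-offs, and the halo effect cannot be detected from a single rater's score distribution alone.

```python
# Hypothetical heuristic sketch: flag possible rater-bias patterns from the
# distribution of one rater's ratings on a 1..scale_max scale.
from statistics import mean, pstdev

def bias_flags(ratings: list[int], scale_max: int = 4) -> list[str]:
    midpoint = (1 + scale_max) / 2
    avg, spread = mean(ratings), pstdev(ratings)
    flags = []
    if avg > midpoint + 0.75:
        flags.append("possible generosity error (ratings cluster high)")
    if avg < midpoint - 0.75:
        flags.append("possible severity error (ratings cluster low)")
    if spread < 0.5 and abs(avg - midpoint) <= 0.75:
        flags.append("possible central-tendency error (ratings hug the middle)")
    return flags

if __name__ == "__main__":
    print(bias_flags([4, 4, 3, 4, 4]))   # leans generous
    print(bias_flags([3, 3, 2, 3, 3]))   # little spread around the middle
```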
Summary
Performance assessment brings about:
- Alternative methods of assessment beyond paper-and-pencil tests
- More authentic tasks, reflective of tasks that occur in the real world
- A better match between assessment tasks and the behavior domain about which inferences are to be made
- Established targets that influence the teacher's instruction and thus affect instructional activities