Performance Assessment, Rubrics, & Rating Scales
Trends · Definitions · Advantages & Disadvantages · Elements for Planning · Technical Concerns
Deborah Moore, Office of Planning & Institutional Effectiveness, Spring 2002
Types of Performance Assessments
Performance Assessment
- Who is currently using performance assessments in their courses or programs?
- What are some examples of these assessment tasks?
Primary Characteristics
- Constructed response
- Reviewed against criteria/continuum (individual or program)
- Design is driven by the assessment question/decision
Why on the Rise?
- Accountability pressures are increasing
- Educational reform has been underway
- Growing dissatisfaction with traditional multiple-choice (MC) tests
Exercise 1
- Locate the sample rubrics in your packet.
- Working with a partner, review the different rubrics.
- Describe what you like and what you find difficult about each (BE KIND).
Advantages as Reported by Faculty
- Clarifies goals & objectives
- Narrows the gap between instruction & assessment
- May enrich insights about students' skills & abilities
- Useful for assessing complex learning
Advantages for Students
- Opportunity for detailed feedback
- Enhanced motivation for learning
- Students process information differently
Disadvantages
- Requires coordination:
  - Goals
  - Administration
  - Scoring
  - Summary & reports
Disadvantages
- Archival/retrieval:
  - Work must remain accessible
  - Archives must be maintained
Disadvantages
- Costs:
  - Designing
  - Scoring (training/monitoring raters)
  - Archiving
Steps in Developing Performance Assessments
1. Clarify the purpose/reason for the assessment
2. Clarify the performance to be assessed
3. Design the tasks
4. Design the rating plan
5. Pilot and revise
Steps in Developing Rubrics
1. Identify the purpose/reason for the rating scale
2. Define clearly what is to be rated
3. Decide which you will use:
   a. Holistic or analytic
   b. Generic or task-specific
4. Draft the rating scale and have it reviewed
Recommendations

|               | Holistic                                       | Analytic                     |
|---------------|------------------------------------------------|------------------------------|
| Generic       | Cost-effective but lacking in diagnostic value | Desirable                    |
| Task-specific | Not recommended                                | Very desirable but expensive |
Steps in Developing Rubrics (continued)
5. Pilot your assessment tasks and review the results
6. Apply your rating scales
7. Determine the reliability of the ratings
8. Evaluate the results and revise as needed
Descriptive Rating Scales
- Each rating scale point has a phrase, sentence, or even a paragraph describing what is being rated.
- Generally recommended over graded-category rating scales.
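To make the idea concrete, a descriptive analytic rubric can be represented as a small data structure: one entry per trait, with a written description at each scale point. The sketch below is illustrative only; the traits, levels, and wording are invented, not drawn from the workshop materials.

```python
# A minimal sketch of a descriptive analytic rubric. Every trait/level
# description here is a hypothetical example, not workshop content.
RUBRIC = {
    "organization": {
        4: "Ideas follow a clear, logical sequence with effective transitions.",
        3: "Ideas are mostly ordered logically; some transitions are weak.",
        2: "Ordering is sometimes confusing; transitions are rare.",
        1: "No discernible organization.",
    },
    "evidence": {
        4: "Claims are consistently supported with relevant, cited evidence.",
        3: "Most claims are supported; some evidence is thin or uncited.",
        2: "Evidence is sparse or only loosely tied to claims.",
        1: "Claims are asserted without support.",
    },
}

def score_paper(ratings):
    """Validate per-trait ratings against the rubric, then average them."""
    for trait, level in ratings.items():
        if level not in RUBRIC[trait]:
            raise ValueError(f"{level} is not a defined level for {trait!r}")
    return sum(ratings.values()) / len(ratings)

print(score_paper({"organization": 3, "evidence": 4}))  # 3.5
```

Because each scale point carries a full description, raters who disagree can point to the specific language that led them to different levels, which is what makes this style preferable to bare graded categories.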
Portfolio Scoring Workshop
Subject Matter Expertise
Experts like Dr. Edward White join faculty in their work to refine scoring rubrics and monitor the process.
Exercise 2
- Locate the University of South Florida example.
- Identify the various rating strategies involved in the use of this form.
- Identify strengths and weaknesses of this form.
Common Strategy Used
- The instructor assigns an individual grade for an assignment within a course.
- Assignments are forwarded to a program-level assessment team.
- The team randomly selects a set of assignments and applies a different, program-level rating scheme (a sketch of the sampling step follows).
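The sampling step is simple to automate. The sketch below assumes a hypothetical pool of assignment identifiers and a sample size of 20; neither detail comes from the slides.

```python
import random

# Hypothetical identifiers for assignments forwarded by instructors.
submitted = [f"paper_{i:03d}" for i in range(1, 121)]

# Fix the seed so the selection can be reproduced and audited later.
random.seed(2002)

# Draw the subset that the assessment team will rescore.
sample = random.sample(submitted, k=20)
print(sample)
```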
Exercise 3
- Locate the Rose-Hulman criteria.
- Select one of the criteria. In 1-2 sentences, describe an assessment task/scenario for that criterion.
- Develop rating scales for the criterion:
  - List traits
  - Describe distinctions along the continuum of ratings
Example of Consistent & Inconsistent Ratings
Calculating Rater Agreement (3 Raters for 2 Papers)
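The slide's worked table is not reproduced here, but the arithmetic it illustrates can be sketched directly: for every pair of raters, count how often they give exactly the same score and how often they land within one point of each other. The ratings below are invented for illustration.

```python
from itertools import combinations

# Hypothetical scores: three raters each rate two papers on a 1-4 scale.
ratings = {
    "Rater A": [3, 2],
    "Rater B": [3, 1],
    "Rater C": [4, 2],
}

def pairwise_agreement(scores):
    """Report exact and adjacent (within one point) agreement per pair."""
    for (r1, s1), (r2, s2) in combinations(scores.items(), 2):
        exact = sum(a == b for a, b in zip(s1, s2))
        adjacent = sum(abs(a - b) <= 1 for a, b in zip(s1, s2))
        n = len(s1)
        print(f"{r1} vs {r2}: exact {exact}/{n}, adjacent {adjacent}/{n}")

pairwise_agreement(ratings)
# Rater A vs Rater B: exact 1/2, adjacent 2/2
# Rater A vs Rater C: exact 1/2, adjacent 2/2
# Rater B vs Rater C: exact 0/2, adjacent 2/2
```

With rating scales this coarse, adjacent agreement is usually high even when exact agreement is low, which is why both figures are worth reporting.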
Rater Selection and Training
- Identify raters carefully.
- Train raters on the purpose of the assessment and on appropriate use of the rubrics.
- Study rating patterns and do not retain raters who are inconsistent (one way to do this is sketched below).
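One way, among several, to "study rating patterns" is to correlate each rater's scores with the mean of the remaining raters; a rater whose correlation stays low or negative is a candidate for retraining or removal. The scores and the 0.5 threshold below are assumptions for illustration.

```python
import statistics  # statistics.correlation requires Python 3.10+

# Hypothetical scores: four raters each rate the same six papers (1-4).
scores = {
    "Rater A": [3, 2, 4, 1, 3, 2],
    "Rater B": [3, 2, 4, 2, 3, 2],
    "Rater C": [3, 2, 4, 1, 2, 2],
    "Rater D": [2, 4, 1, 4, 2, 3],  # drifts against the panel
}

def flag_inconsistent(scores, threshold=0.5):
    """Correlate each rater with the mean of the others; flag low values."""
    for rater, own in scores.items():
        others = [s for r, s in scores.items() if r != rater]
        rest_mean = [statistics.mean(col) for col in zip(*others)]
        r = statistics.correlation(own, rest_mean)
        print(f"{rater}: r = {r:+.2f}" + ("  <- flag" if r < threshold else ""))

flag_inconsistent(scores)  # only Rater D falls below the threshold
```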
Some Rating Problems
- Leniency/severity
- Response set
- Central tendency
- Idiosyncrasy
- Lack of interest
Exercise 4
- Locate the Generalizability Study tables (1-4).
- Reviewing Table 1, describe the plan for rating the performance. What kinds of rating problems do you see?
- In Table 2, what seems to be the biggest rating problem?
- In Table 3, what seems to have more impact, additional items or additional raters?
Generalizability Study (GENOVA)
- G study: identifies the sources of error (facets) in the overall design and estimates the error variance contributed by each facet of the measurement design.
- D study: estimates the reliability of ratings under the current design and projects the outcomes of alternative designs.
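GENOVA itself is not shown here, but the core computation can be sketched for the simplest fully crossed persons x raters design. In the G-study step, variance components are solved from the two-way ANOVA mean squares; in the D-study step, the dependability (phi) coefficient is projected for alternative numbers of raters. The ratings below are invented for illustration.

```python
import numpy as np

# Hypothetical data: rows are 6 students, columns are 3 raters (1-4 scale).
X = np.array([
    [4, 3, 4],
    [2, 2, 3],
    [3, 3, 3],
    [1, 2, 1],
    [4, 4, 3],
    [2, 1, 2],
], dtype=float)

n_p, n_r = X.shape
grand = X.mean()
mp = X.mean(axis=1)  # person means
mr = X.mean(axis=0)  # rater means

# Mean squares for a two-way crossed design with one observation per cell.
ms_p = n_r * ((mp - grand) ** 2).sum() / (n_p - 1)
ms_r = n_p * ((mr - grand) ** 2).sum() / (n_r - 1)
resid = X - mp[:, None] - mr[None, :] + grand
ms_pr = (resid ** 2).sum() / ((n_p - 1) * (n_r - 1))

# G study: solve the expected mean squares for the variance components.
var_pr = ms_pr                        # person x rater interaction + error
var_p = max((ms_p - ms_pr) / n_r, 0)  # true (universe-score) variance
var_r = max((ms_r - ms_pr) / n_p, 0)  # rater leniency/severity variance

# D study: phi treats rater main effects as error, so it suits absolute
# decisions; project it for alternative numbers of raters.
for n in (1, 2, 3, 5):
    phi = var_p / (var_p + (var_r + var_pr) / n)
    print(f"raters = {n}: phi = {phi:.2f}")
```

The D-study loop answers the practical question raised in the summary that follows: if phi with the current panel falls below the desired .75, how many raters would it take to get there?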
Summary
- Interpretation: raters are using the rubric in nonsystematic ways.
- Reliability (phi) values range from .21 to .67 for the teams, well below the desired .75 level.
What Research Says About Current Practice
- Instructors have a limited sense of the differentiated purposes of assessment.
- Articulating their goals for student outcomes is difficult.
- Understanding of what constitutes thinking & problem solving is uneven.
- Discipline content often gets short shrift.
- Articulating criteria for judging is difficult; instructors worry over fairness, and criteria are unevenly applied.
- Often the design plan is weak or flawed, with limited thought about the use of the information.
- Instructors are not prepared to consider reliability & validity issues.
Summary
- Use is on the rise
- Costly
- Psychometrically challenging
Thank you for your attention.
- Deborah Moore, Assessment Specialist
- 101B Alumni Gym, Office of Planning & Institutional Effectiveness
- dlmoor2@email.uky.edu
- 859/257-7086
- http://www.uky.edu/LexCampus/; http://www.uky.edu/OPIE/