The Halo Effect Extends to Products and Marketing Efforts Copyright © 2010 Apple Inc.

Conflict of Interest
• Nancy Piro, Ph.D. - No Conflicts of Interest to Declare
• Ann Dohn, MA - No Conflicts of Interest to Declare

PC 002 BR 04 The Basics of Effective Evaluations in the Era of NAS
Nancy Piro, Ph.D., Program Manager & Education Specialist
Ann Dohn, MA, DIO, Director of GME
Graduate Medical Education

Workshop Objectives
• Understand how:
  – Evaluation impacts NAS
  – NAS impacts evaluations
• Design and implement a valid & reliable resident evaluation process.
  – Unintended cognitive bias
  – The impact of self-fulfilling prophecy
  – Evaluation questions

What Are Milestones?
• Defined areas of competency/expectations
  – We know what we expect of the trainee
  – The trainee knows what is expected of them
  – The public knows what a ‘pathologist’, for example, should know and be able to do when certified as “competent to practice pathology without supervision”

Milestones
• Levels (from Dreyfus)
  – Novice: PGY-0
  – Advanced Beginner: PGY-1
  – Competent: PGY-2 and 3
  – Proficient Practitioner: PGY-4/5
  – Expert Practitioner: Ongoing
• Each trainee is assessed with respect to level for each competency

Milestones
• List of competencies to be attained by the trainee
• Some milestones are still in development
• Milestones are within the existing 6 ACGME core competencies
• Each milestone is defined as a continuum with respect to each level of competency, from level 1 to level 5
• Rating Scales: Levels 1 to 5

Where do these levels 1-5 come from? Dreyfus Model in a Nutshell…
• “In acquiring a skill by means of instruction and experience, the trainee normally passes through five developmental stages:
  1. Novice
  2. Competence
  3. Proficiency
  4. Expertise
  5. Mastery”

Dreyfus Model (1980): Characteristics of each stage of developing expertise
Source: Eraut, M. Developing Professional Knowledge and Competence. (1994)

Milestone Level Definitions
• Level 1: The resident is a graduating medical student/experiencing the first day of residency.
• Level 2: The resident is advancing and demonstrating additional milestones.
• Level 3: The resident continues to advance and demonstrate additional milestones; the resident consistently demonstrates the majority of milestones targeted for residency.

Milestone Level Definitions (continued)
• Level 4: The resident has advanced so that he or she now substantially demonstrates the milestones targeted for residency. This level is designed as the graduation target, not a requirement.
• Level 5: The resident has advanced beyond performance targets set for residency and is demonstrating “aspirational” goals which might describe the performance of someone who has been in practice for several years. It is expected that only a few exceptional residents will reach this level.

Milestones
• Making decisions about readiness for graduation is the purview of the residency program director.

What is NAS Driving Us to Do?
• Change our evaluations
  – Tag current evaluation questions to core competencies and milestones
  – Use direct milestone evaluations
  – Or use a combination
• Ensure our evaluations are valid and reliable measures

Overall Milestone Evaluation Approach: Examples
• EXAMPLE #1: Current - Use current evaluations; tag questions to milestones
• EXAMPLE #2: Direct milestone evaluation
• EXAMPLE #3: Combo - Use current evaluations plus direct milestone evaluation; map questions to milestones

Milestones Impact on Evaluations: Linking questions to milestones
• Step Two: Ensure specific evaluation questions are linked to milestones
  – Example: “Advises the referring health care provider(s) about the appropriateness of a procedure in routine clinical situations”
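One way to keep track of the linking step is a simple lookup from each evaluation question to the milestones it is tagged to. The sketch below is only an illustration: the question IDs (`Q1`, `Q2`, `Q3`) and the milestone keys (`PC-consult`, `ICS-1`) are hypothetical names, not identifiers from any evaluation system. It shows how untagged questions can be listed so they are not overlooked.

```python
# Hypothetical question IDs mapped to hypothetical milestone keys.
question_to_milestones = {
    "Q1": ["PC-consult"],
    "Q2": ["PC-consult", "ICS-1"],
    "Q3": [],  # not yet linked to any milestone
}

# Milestone wording taken from the slide example; the key name is an assumption.
milestone_text = {
    "PC-consult": (
        "Advises the referring health care provider(s) about the "
        "appropriateness of a procedure in routine clinical situations"
    ),
}


def untagged_questions(mapping):
    """Return the IDs of questions that are not linked to any milestone."""
    return sorted(q for q, tags in mapping.items() if not tags)


def questions_for(milestone, mapping):
    """Return the IDs of questions that are tagged to the given milestone."""
    return sorted(q for q, tags in mapping.items() if milestone in tags)


print(untagged_questions(question_to_milestones))
print(questions_for("PC-consult", question_to_milestones))
```

Any question returned by `untagged_questions` still needs a milestone link, or a decision that it does not belong in a milestone evaluation.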

Example #2 Direct

Example # 3 - Combined Evaluation Questions Tagged To Milestones Evaluated Directly

Milestone Evaluations demand valid, reliable evaluations as component measurements
• Unintended bias can impact both direct and indirect milestone evaluations.
  – Bias in questions & response scales (existing evaluations)
  – Bias in evaluators (all evaluations)
    • Halo Effect
    • Devil Effect
    • Similarity Effect
    • Central Tendency
    • Self-Fulfilling Prophecy

Step 1: Question and Response Scale Construction
Two Major Goals:
• Construct unbiased, focused and non-leading questions that produce valid data
• Use valid, unbiased response scales

Step 1: Create the Evaluation - Eliminating Unintended Cognitive Bias
What is cognitive bias?
• Cognitive response bias is a type of bias which can affect the results of an evaluation if evaluators answer questions in the way they think they are designed to be answered, or with a positive or negative bias toward the trainee.

Step 1: Create the Evaluation - Where does response bias occur?
1. Response bias can be in the evaluators themselves: Central Tendency, Similarity Effect, First Impressions, Halo Effect, Devil Effect, Self-Fulfilling Prophecy
2. Response bias occurs most often in the wording of the question. Response bias is present when a question contains a leading phrase or words.
3. Response bias can also occur in the rating scales.

Response Bias in the Raters/Evaluators
Beware the Halo Effect
• The “halo effect” refers to a type of cognitive bias where the perception of a particular behavior or trait is positively influenced by the perception of the former positive traits in a sequence of interpretations.
• Thorndike (1920)

The Halo Effect and Expectations
• The halo effect is discussed in Kelley's implicit personality theory…
  – “the first traits we recognize in other people influence our interpretation and perception of later ones because of our expectations…”

Could this impact our evaluations?
A majority of trainees at Stanford believe that: “The general feeling in my program is that your ability will be labeled based on your initial performance.”
Response distribution (chart): 25.67%, 19.56%, 54.77%
• GME House Staff Survey 2013-2014

“Reverse Halo Effect”
• A corollary to the halo effect is the “reverse halo effect” (devil effect)
  – Individuals or brands which are seen to have a single undesirable trait are later judged to have many poor traits, i.e., a single weak point (showing up late, for example) influences others' perception of the person or brand

Blind Spots
• In the 1970s, the social psychologist Richard Nisbett demonstrated that we typically have no awareness of when the halo effect influences us (Nisbett, R. E. and Wilson, T. D., 1977)
• The problem with blind spots is that we are blind to them… Now on to the self-fulfilling prophecy effect…

The Self-Fulfilling Prophecy (SFP)
What is the Self-Fulfilling Prophecy?
• First defined by Robert Merton in 1948 to describe:
  – ‘A false definition of a situation evoking a new behavior which makes the originally false definition come true.’

The Self-Fulfilling Prophecy (SFP)
What is the Self-Fulfilling Prophecy?
Has been observed in many processes (Henshel, 1978)
• Within an individual, as with the placebo response
• While the placebo effect occurs “within” the individual undergoing treatment, it is important to note that the individual’s belief derives from situational encounters with someone assumed to have expertise.

Rosenthal’s Research on SFP
For more than 50 years, Robert Rosenthal has conducted research on the role of self-fulfilling prophecies in everyday life and in laboratory situations.
  – Faculty expectations on trainees’ academic and clinical performance
  – The effects of experimenters’ expectations on the results of their research (hence double-blinds)
  – The effects of clinicians’ expectations on their patients’ mental and physical health

The Pygmalion Effect (Self-Fulfilling Prophecy)
To sum it up: the greater the expectation placed upon people, the better they perform.
• Rosenthal and Jacobson (1968) reported and discussed the Pygmalion effect in the classroom at length.
  – In their study, they demonstrated that if teachers were led to expect enhanced performance from some children, then the children did indeed show that enhancement.

Pygmalion in Education
• The purpose of the experiment was to support the hypothesis that reality can be influenced by the expectations of others.
• This influence can be beneficial as well as detrimental depending on which label an individual is assigned.
• The observer-expectancy effect, which involves an experimenter's unconsciously biased expectations, is tested in real life situations.
  – Rosenthal posited that biased expectancies can essentially affect reality and create self-fulfilling prophecies as a result.

Pygmalion Effect in Education
• All students in a single California elementary school were given a disguised IQ test at the beginning of the study. These scores were not disclosed to teachers.
• Teachers were told that some of their students (about 20% of the school, chosen at random) could be expected to be "spurters" that year, doing better than expected in comparison to their classmates. The spurters' names were made known to the teachers.
• At the end of the study all students were again tested with the same IQ test used at the beginning of the study.

Pygmalion Effect
• ALL six grades in both experimental and control groups showed a mean gain in IQ from pretest to post-test.
  – First and second graders showed statistically significant gains favoring the experimental group of "spurters".
  – Conclusion: Teacher expectations, particularly for the youngest children, can influence student achievement.
“You will be successful”
Paul Harvey

Why???
• Anthropologists have postulated that accurate predictions in our human evolution came to be valued in and of themselves…

Other extensions…
• “Dani is my shy child…”
• “Mike never eats his vegetables…”
• “I can’t see them making much progress…”
• “Michelle isn’t very coordinated…”
• “You will feel comfortable and heal very well after this surgery”
• “I don’t expect Chris to make it through residency, do you?”

Quotes…
James Rhem, executive editor for the online National Teaching and Learning Forum, commented:
• "When teachers expect students to do well and show intellectual growth, they do; when teachers do not have such expectations, performance and growth are not so encouraged and may in fact be discouraged in a variety of ways."
• "How we believe the world is and what we honestly think it can become have powerful effects on how things will turn out."

Quotes…

Quotes
• “Do or do not… there is no try” - Yoda

Applying these principles to assessments and evaluations…

Step 1: Create the Evaluation
Question Construction - Exercise One
• Review each question and share your thinking of what makes it a good or bad question.

Question Construction: Exercise 1
• Example 1:
  – "I can always talk to my Program Director about residency related problems."
• Example 2:
  – "Sufficient career planning resources are available to me and my program director supports my professional aspirations."
• Example 3:
  – "Communication in my sub-specialty program is good."
• Example 4:
  – "Incomplete, inaccurate medical interviews, physical examinations; incomplete review and summary of other data sources. Fails to analyze data to make decisions; poor clinical judgment."
• Example 5:
  – "The pace on our service is chaotic."

Question Construction - Test Your Knowledge
• Example 1: "I can always talk to my Program Director about residency related problems."
• Problem: Terms such as "always" and "never" will bias the response in the opposite direction.
• Result: Data will be skewed.

Question Construction - Test Your Knowledge
• Example 2: "Career planning resources are available to me and my program director supports my professional aspirations."
• Problem: Double-barreled (resources and aspirations). Respondents may agree with one and not the other. The researcher cannot make assumptions about which part of the question respondents were rating.
• Result: Data is useless.

Question Construction - Test Your Knowledge
• Example 3: "Communication in my subspecialty program is good."
• Problem: Question is too broad. If the score is less than 100% positive, the researcher/evaluator still does not know what aspect of communication needs improvement.
• Result: Data is of little or no usefulness.

Question Construction - Test Your Knowledge
• Example 4: "Evidences incomplete, inaccurate medical interviews, physical examinations; incomplete review and summary of other data sources. Fails to analyze data to make decisions; poor clinical judgment."
• Problem: Multi-barreled. Respondents may need to agree with some parts and not the others. The evaluator cannot make assumptions about which part of the question respondents were rating.
• Result: Data is useless.

Question Construction - Test Your Knowledge
• Example 5: "The pace on our service is chaotic."
• Problem: The question is negative, and broadcasts a bad message about the rotation/program.
• Result: Data will be skewed, and the climate may be negatively impacted.

Evaluation Question Design Principles
Avoid ‘double-barreled’ or multi-barreled questions. Eliminate them from your evaluations.
• A multi-barreled question combines two or more issues or “attitudinal objects” in a single question.
• More examples…

Avoiding Double-Barreled Questions
• Example: COMPETENCY 1 – Patient Care
  “Resident provides sensitive support to patients with serious illness and to their families, and arranges for on-going support or preventive services if needed.”

Evaluation Question Design Principles
Combining two or more questions into one makes it unclear which attribute is being measured, as each part may elicit a different perception of the resident’s performance.
RESULT: Respondents are confused and results are confounded, leading to unreliable or misleading results.
Tip: If the word “and” or the word “or” appears in a question, check whether it is a double-barreled question.
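The tip above lends itself to a quick automated first pass. The following is a minimal sketch, assuming questions are available as plain strings; the function name `may_be_double_barreled` is hypothetical. It only flags candidates for a human to review, since a question can contain “and” or “or” without being double-barreled.

```python
import re

# Match "and" / "or" as whole words only, so that words such as
# "for" or "support" are not flagged by accident.
CONJUNCTION = re.compile(r"\b(and|or)\b", re.IGNORECASE)


def may_be_double_barreled(question: str) -> bool:
    """Return True if the question contains "and" or "or" and deserves a manual check."""
    return CONJUNCTION.search(question) is not None


questions = [
    "Please rate the anesthesia resident's communication and technical skills.",
    "Please rate the anesthesia resident's technical skills.",
]

# Keep only the questions that need a closer look.
flagged = [q for q in questions if may_be_double_barreled(q)]
for q in flagged:
    print("Check:", q)
```

Only the first question is flagged here; the second asks about a single attribute.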

Evaluation Question Design Principles
• Avoid questions with double negatives…
• When respondents are asked for their agreement with a negatively phrased statement, double negatives can occur.
  – Example: Do you agree or disagree with the following statement? Attendings should not be required to supervise their residents during night call.

Evaluation Question Design Principles
• If you respond that you disagree, you are saying you do not think attendings should not be required to supervise residents. In other words, you believe that attendings should be required to supervise residents.
• Phrase the questions positively if possible.
• If you do use a negative word like “not”, consider highlighting the word by underlining or bolding it to catch the respondent’s attention.
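Negatively phrased items can be located with a simple word check before deciding whether to rephrase or highlight them. This is a rough sketch under the assumption that questions are plain strings; the function name `has_negation` and the short list of negation words are illustrative choices, not a complete test for double negatives.

```python
import re

# Whole-word "not", "never", "no", plus contractions ending in n't
# (straight or curly apostrophe).
NEGATION = re.compile(r"\b(not|never|no)\b|n['’]t\b", re.IGNORECASE)


def has_negation(question: str) -> bool:
    """Return True if the question contains a negation word worth reviewing."""
    return NEGATION.search(question) is not None


examples = [
    "Attendings should not be required to supervise their residents during night call.",
    "Do you agree or disagree that residents shouldn’t have to pay for their meals when on-call?",
    "To what extent do you agree that residents are adequately paid?",
]

for q in examples:
    label = "REVIEW" if has_negation(q) else "ok"
    print(f"{label}: {q}")
```

Flagged items are candidates for positive rephrasing, or for underlining or bolding the negative word.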

Evaluation Question Design Principles
• Because every question is measuring “something”, it is important for each to be clear and precise.
• Remember… your goal is for each respondent to interpret the meaning of each survey question in exactly the same way.
• Pre-testing questions is definitely recommended!

Evaluation Question Design Principles
• If your respondents are not clear on what is being asked in a question, their responses may result in data that cannot or should not be applied to your evaluation results…
• Example: "For me, further development of my medical competence, it is important enough to take risks."
• Does this mean to take risks with patient safety, risks to one's pride, or something else?

Evaluation Question Design Principles
Keep questions short.
• Long questions can be confusing.
• Bottom line: Focus on short, concise, clearly written statements that get right to the point, producing actionable data.
• Short, concise questions take only seconds to respond to and are easily interpreted.

Evaluation Question Design Principles
• Do not use “loaded” or “leading” questions. A loaded or leading question biases the response the respondent gives.
• A loaded question is one that contains loaded words.
  – For example: “I’m concerned about doing a procedure if my performance would reveal that I had low ability”
    Response options: Disagree / Agree

Evaluation Question Design Principles
• "I’m concerned about doing a procedure if my performance would reveal that I had low ability"
• How can this be answered with “agree” or “disagree” if you think you have good abilities in appropriate tasks for your area?

Evaluation Question Design Principles
• A leading question is phrased in such a way that suggests to the respondent that a certain answer is expected:
  – Example: Don’t you agree that nurses should show more respect to residents and attendings?
    • Yes, they should show more respect
    • No, they should not show more respect

Evaluation Question Design Principles
• Do use open-ended questions
• Do use comment boxes after negative ratings
  – To explain the reasoning and target areas for focus and improvement.
• General, open-ended questions at the end of the evaluation can prove very beneficial
  – Often it is found that entire topics have been omitted from the evaluation that should have been included.

Quick Exercise #2
1. Please rate the anesthesia resident’s communication and technical skills.
2. Rate the resident’s ability to communicate with patients and their families.
3. Rate the resident’s abilities with respect to case familiarization; effort in reading about patient’s disease process and familiarizing with operative care and post op care

Quick Exercise #2
4. Residents deserve higher pay for all the hours they put in, don’t they?
5. Explains and performs steps required in resuscitation and stabilization.
6. Do you agree or disagree that residents shouldn’t have to pay for their meals when on-call?

Quick Exercise #2
7. Demonstrates an awareness of and responsiveness to the larger context of health care.
8. Demonstrates ability to communicate with faculty and staff.

Quick Exercise #2 “Post Test” (1)
1. Please rate the anesthesia resident’s communication and technical skills
  – Please rate the anesthesia resident’s communication skills.
  – Please rate the anesthesia resident's technical skills.
2. Rate the resident’s ability to communicate with patients and their families
  – Rate the resident’s ability to communicate with patients
  – Rate the resident’s ability to communicate with families

Quick Exercise #2 “Post Test” (2)
3. Rate the resident’s abilities with respect to case familiarization; effort in reading about patient’s disease process and familiarizing with operative care and post op care
  – Rate the resident’s ability with respect to case familiarization
  – Rate the resident’s ability with respect to effort in reading about patient disease process
  – Rate the resident’s ability with respect to familiarizing with operative care
  – Rate the resident’s ability with respect to familiarizing with post op care

Quick Exercise #2 “Post Test” (3)
4. Residents deserve higher pay for all the hours they put in, don’t they?
  – To what extent do you agree that residents are adequately paid?
5. Explains and performs steps required in resuscitation and stabilization
  – Explains steps required in resuscitation
  – Explains steps required in stabilization following resuscitation

Quick Exercise #2 “Post Test” (4)
6. Do you agree or disagree that residents shouldn’t have to pay for their meals when on-call?
  – To what extent do you agree that residents need to pay for their own on-call meals?
7. Demonstrates an awareness of and responsiveness to the larger context of health care.
  – Demonstrates an awareness of the larger context of health care
  – Demonstrates responsiveness to the larger context of health care
8. Demonstrates ability to communicate with faculty and staff.
  – Demonstrates ability to communicate with faculty
  – Demonstrates ability to communicate with staff

Bias in the Rating Scales for Questions
The scale you construct can also skew your data, much as question construction can, but luckily for us…

Milestone Scales use Balanced Scaled Responses…

Review your Program’s evaluations with the goal of eliminating any unintentional bias from them.
• Take a look at the evaluations you use.
  – Identify questions which may have unintended bias.
  – Modify questions to eliminate bias.

Gentle Words of Wisdom - Avoid large numbers of questions…
• Respondent fatigue: the respondent tends to give similar ratings to all items without giving much thought to individual items, just wanting to finish
• In situations where many items are considered important, a large number can receive very similar ratings at the top end of the scale
  – Items are not traded off against each other
  – Many items that are not at the extreme ends of the scale, or that are considered similarly important, are given a similar rating
• Respondents quit answering questions at all…
  – Total Started Survey: 578
  – Total Completed Survey: 473 (81.8%)
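The completion figure quoted above can be checked with a few lines of arithmetic. This sketch uses only the two counts from the slide; the variable names are illustrative.

```python
# Counts taken from the slide.
started = 578
completed = 473

# Share of respondents who finished, and the number who dropped out.
completion_rate = completed / started
dropped_out = started - completed

print(f"Completion rate: {completion_rate:.1%}")
print(f"Dropped out: {dropped_out} ({dropped_out / started:.1%})")
```

Tracking the same two counts for each evaluation form makes it easy to see whether shortening a form reduces drop-off.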

Gentle Words of Wisdom
Begin with the End Goal in Mind
• What do you want as your outcomes?
• Be prepared to put in the time with pretesting.
• The faculty member, nurse, patient, or resident has to be able to understand the intent of the question, and each must find it credible and interpret it in the same way.

Gentle Words of Wisdom: Relevancy/Accuracy – Collect data in a valid and reliable way
• If the questions aren’t framed properly, if they are too vague or too specific, it’s impossible to get any meaningful data.
• Question miswording can lead to skewed data with little or no usefulness.
• If you don't plan or know how you are going to use the data, don't ask the question!

Summary: Evaluation Do’s and Don’ts
DO’s
• Keep questions clear, precise and relatively short.
• Use a balanced response scale
  – Milestone scales are balanced
• Use open-ended questions
DON’Ts
• Do not use double- or multi-barreled questions
• Do not use double negative questions
• Do not use loaded or leading questions

Questions
Please feel free to contact us with any questions at 650-723-5948, npiro@stanford.edu or adohn1@stanford.edu
Stanford University Medical Center
Graduate Medical Education