5. Reliability and Validity

- What are random error and systematic error, and how do they influence measurement?
- What is reliability? Why must a measure be reliable?
- How are test-retest and equivalent-forms reliability measured?
- How are split-half reliability and coefficient alpha used to assess the internal consistency of a measured variable?
- What is interrater reliability?
- What are face validity and content validity?
- How are convergent and discriminant validity used to assess the construct validity of a measured variable?
- What is criterion validity?
- What methods can be used to increase the reliability and validity of a self-report measure?
- How are reliability and construct validity similar? How are they different?
Random and Systematic Error

Random Error
1) Fluctuations in the person's current mood
2) Misreading or misunderstanding the questions
3) Measuring individuals on different days or in different places
Random errors tend to cancel out as many observations are collected.

Systematic Error
Sources include the style of measurement, a tendency toward self-promotion, and cooperative responding; with systematic error, conceptual variables other than the intended one are being measured.
Because these errors distort scientific findings, we must reduce them. How well do our measured variables "capture" the conceptual variables?

Reliability
The extent to which a measured variable is free from random error, usually determined by measuring the variable more than once.

Construct Validity
The extent to which a measured variable actually measures the conceptual variable it is designed to assess, shown by how well it relates to other measured variables known to reflect that conceptual variable.

(Figure: diagram linking measured variables to conceptual variables.)
Test-Retest Reliability
The extent to which scores on the same measured variable correlate with each other across two administrations given at two different times.

Example: the same self-esteem questionnaire administered on 9/20 and again on 9/27 (items scored 1-4):

  Item                                        9/20   9/27
  On the whole, I am satisfied with myself      3      4
  I certainly feel useless at times             2      1
  I am able to do things as well as others      3      4

(Other items on the questionnaire: "I feel I do not have much to be proud of," "At times I think I am no good at all," "I have a number of good qualities." Caution: scores at the second administration may shift because of a retesting effect.)
Equivalent-Forms Reliability
The extent to which two equivalent forms of a measure, given at different times, correlate with each other. Examples: GRE, SAT, GMAT, TOEFL.

  Form A:  22 × 45 =      85 × (23 − 11) =      72 − 14 × 12 × (7 − 1) =
  Form B:  32 × 45 =      85 × (41 − 11) =      72 − 14 × 25 × (6 − 1) =
Reliability as Internal Consistency
The extent to which the scores on the items of a measure correlate with each other, and thus all measure the true score rather than reflecting random error.

Example items (self-esteem questionnaire, 9/20):
- I feel I do not have much to be proud of.
- On the whole, I am satisfied with myself.
- I certainly feel useless at times.
- At times I think I am no good at all.
- I have a number of good qualities.
- I am able to do things as well as others.

How do you measure internal consistency? Split-half reliability and coefficient alpha.
Interrater Reliability
The extent to which the scores assigned by different coders correlate with each other.

  Aggression    Coder 1   Coder 2
  Hit boy A        1         3
  Hit boy B        3         3
  Hit girl A       3         2
  Hit girl B       1         1

How do you measure interrater reliability? Cohen's kappa.
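Cohen's kappa corrects raw percent agreement for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). A minimal sketch, applied to the four coder ratings in the table above:

```python
from collections import Counter

def cohens_kappa(ratings1, ratings2):
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(ratings1)
    # Observed agreement: proportion of items the two coders scored identically.
    p_o = sum(a == b for a, b in zip(ratings1, ratings2)) / n
    # Chance agreement: expected overlap given each coder's marginal counts.
    c1, c2 = Counter(ratings1), Counter(ratings2)
    p_e = sum(c1[cat] * c2[cat] for cat in c1.keys() | c2.keys()) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# The four aggression ratings from the table above.
coder1 = [1, 3, 3, 1]
coder2 = [3, 3, 2, 1]
print(round(cohens_kappa(coder1, coder2), 2))  # -> 0.2
```

Here the coders agree on 50% of the items, but chance alone predicts 37.5% agreement, so kappa is only 0.2, indicating poor interrater reliability.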
(Figure: summary of the four approaches. Test-retest reliability correlates the same questionnaire across time; internal consistency correlates the items within one questionnaire; equivalent-forms reliability correlates two different questionnaires; interrater reliability correlates the scores of different coders.)
Validity

Construct Validity
The extent to which a measured variable actually measures the conceptual variable (that is, the construct) that it is designed to assess.

Criterion Validity
The extent to which a self-report measure correlates with a behavioral measured variable.
Construct Validity: Face Validity
The extent to which the measured variable appears, on its face, to be an adequate measure of the conceptual variable.

Example item: "I don't like Japanese people."
Strongly Disagree 1 2 3 4 5 6 7 8 Strongly Agree
Measured variable -> conceptual variable: discrimination toward Japanese people.
Construct Validity: Content Validity
The degree to which the measured variable appears to have adequately sampled from the potential domain of questions that might relate to the conceptual variable of interest.

Example: a measure of the conceptual variable "intelligence" should sample items from domains such as verbal aptitude and math aptitude; items tapping sympathy fall outside that domain.
Construct Validity: Convergent and Discriminant Validity

Convergent Validity
The extent to which a measured variable is found to be related to other measured variables designed to measure the same conceptual variable.
Example: Interdependence Scale and Collectivism Scale.

Discriminant Validity
The extent to which a measured variable is found to be unrelated to other measured variables designed to measure different conceptual variables.
Example: Independence Scale and Interdependence Scale.
Criterion Validity

Predictive Validity
The extent to which scores can predict participants' future performance. Examples: GRE, SAT.

Concurrent Validity
The extent to which a self-report measure correlates with a behavioral measure assessed at the same time.
How Do You Improve the Reliability and Validity of Your Measured Variables?
1. Conduct a pilot test, trying out a questionnaire or other research instrument on a small group.
2. Use multiple measures.
3. Ensure that there is variability in your measures.
4. Write good items.
5. Get your respondents to take your questions seriously.
6. Make your items nonreactive.
7. Consider face and content validity by choosing reasonable items that cover a broad range of issues reflecting the conceptual variable.
8. Use existing measures.
(Figure: summary diagram over time. A self-report measured variable's items and scales relate to the conceptual variable via face validity; to the broader domain of the conceptual variable via content validity; to similar items and scales via convergent validity; to other items and scales via discriminant validity; to a behavioral measured variable assessed at the same time via concurrent validity; and to future behaviors via predictive validity.)