RELIABILITY AND VALIDITY OF DATA COLLECTION

RELIABILITY OF MEASUREMENT • Measurement is reliable when it yields the same values across repeated measures of the same event • Relates to repeatability • Not the same as accuracy • Low reliability signals suspect data

THREATS TO RELIABILITY 1. Human error • Missing or misrecording a data point • Usually results from a poorly designed measurement system • Cumbersome or difficult to use • Too complex • Can be reduced by using technology (e.g., cameras)

2. INADEQUATE OBSERVER TRAINING • Training must be explicit and systematic • Careful selection of observers • Must clearly define the target behavior • Train to a competency standard • Provide ongoing training to minimize observer drift • Have backup observers observe the primary observers

3. UNINTENDED INFLUENCES ON OBSERVERS • Can distort the data in several ways • Expectations of what the data should look like • Observer reactivity when the observer is aware that others are evaluating the data • Measurement bias • Feedback to observers about how their data relate to the goals of intervention

SOLUTIONS TO RELIABILITY ISSUES 1. Design a good measurement system • Take your time on the front end 2. Train observers carefully 3. Evaluate the extent to which data are accurate and reliable 4. Measure the measurement system

ACCURACY OF MEASUREMENT • Observed values match the true values of an event • Issue: Do not want to base research conclusions or treatment decisions on faulty data

PURPOSES OF ACCURACY ASSESSMENT: • Determine if data are good enough to make decisions • Discover and correct measurement errors • Reveal consistent patterns of measurement error • Assure consumers that data are accurate

OBSERVED VALUES MUST MATCH TRUE VALUES • Determined by calculating correspondence of each data point with its true value • Accuracy assessment should be reported in research
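
The correspondence calculation on this slide can be sketched as a percentage of data points that match their true values. This is only an illustration: it assumes a true value is available for every data point (for example, from a record scored to criterion), and the function name `percent_accurate` and the sample counts are hypothetical.

```python
def percent_accurate(observed, true_values):
    """Percentage of data points whose observed value equals its true value."""
    if len(observed) != len(true_values):
        raise ValueError("observed and true_values must be the same length")
    if not observed:
        raise ValueError("no data points to compare")
    # Count the data points where the recorded value matches the true value.
    matches = sum(1 for o, t in zip(observed, true_values) if o == t)
    return 100 * matches / len(observed)


# Hypothetical counts for five sessions (not data from the slides).
observed = [3, 5, 2, 4, 4]
true_values = [3, 5, 3, 4, 4]
accuracy = percent_accurate(observed, true_values)
print(f"{accuracy:.1f}% of data points match their true values")
```

With these made-up values, four of the five data points match, so the sketch reports 80.0%.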

INTEROBSERVER AGREEMENT (IOA) OR RELIABILITY (IOR) • Is the degree to which two or more independent observers report the same values for the same events • Used to: • Determine competency of observers • Detect observer drift • Judge clarity of definitions and system • Increase validity of the data

REQUIREMENTS FOR IOA / IOR • Observers must: • Use the same observation code and measurement system • Observe and measure the same participants and events • Observe and record independently of one another

METHODS TO CALCULATE IOA / IOR • (Smaller frequency / Larger frequency) × 100 = percentage agreement • Can be done with intervals as well • [Agreements / (Agreements + Disagreements)] × 100 = percentage agreement • Methods can compare: • Total count recorded by each observer • Mean count-per-interval • Exact count-per-interval • Trial-by-trial
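
A minimal sketch of three of the count-based methods listed on this slide, assuming each observer's record is a list of counts per interval. The function names and the sample counts are hypothetical, and treating two zero counts as 100% agreement is a choice made here, not something the slide specifies.

```python
def total_count_ioa(count_a, count_b):
    """(Smaller count / larger count) * 100."""
    if count_a == count_b:
        return 100.0  # also covers the case where both observers recorded zero
    return 100 * min(count_a, count_b) / max(count_a, count_b)


def mean_count_per_interval_ioa(counts_a, counts_b):
    """Average of the smaller/larger percentage computed for each interval."""
    if len(counts_a) != len(counts_b) or not counts_a:
        raise ValueError("both observers need the same, non-zero number of intervals")
    per_interval = [total_count_ioa(a, b) for a, b in zip(counts_a, counts_b)]
    return sum(per_interval) / len(per_interval)


def exact_count_per_interval_ioa(counts_a, counts_b):
    """Agreements / (agreements + disagreements) * 100.

    An agreement is an interval in which both observers recorded the same count.
    """
    if len(counts_a) != len(counts_b) or not counts_a:
        raise ValueError("both observers need the same, non-zero number of intervals")
    agreements = sum(1 for a, b in zip(counts_a, counts_b) if a == b)
    return 100 * agreements / len(counts_a)


# Hypothetical counts per interval from two observers.
observer_a = [2, 3, 0, 4]
observer_b = [2, 2, 0, 5]

total = total_count_ioa(sum(observer_a), sum(observer_b))
mean_per_interval = mean_count_per_interval_ioa(observer_a, observer_b)
exact = exact_count_per_interval_ioa(observer_a, observer_b)
print(f"total count: {total:.1f}%")
print(f"mean count-per-interval: {mean_per_interval:.1f}%")
print(f"exact count-per-interval: {exact:.1f}%")
```

With these made-up counts, both observers recorded 9 responses in total, so total count IOA is 100.0%, while mean count-per-interval IOA is about 86.7% and exact count-per-interval IOA is 50.0%. The same records can look very different depending on which method is used.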

TIMING RECORDING METHODS: • Total duration IOA • Mean duration-per-occurrence IOA • Latency-per-response IOA • Mean interresponse time (IRT)-per-response IOA
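
The first two timing methods can be sketched the same way as the count-based formula, with the shorter time divided by the longer time. The sketch assumes both observers timed the same occurrences in the same order and recorded durations in seconds; the function names and the sample durations are hypothetical.

```python
def shorter_over_longer(time_a, time_b):
    """(Shorter time / longer time) * 100 for one pair of timings."""
    if time_a == time_b:
        return 100.0  # identical timings, including two zeros
    return 100 * min(time_a, time_b) / max(time_a, time_b)


def total_duration_ioa(durations_a, durations_b):
    """Compare the total duration recorded by each observer."""
    return shorter_over_longer(sum(durations_a), sum(durations_b))


def mean_duration_per_occurrence_ioa(durations_a, durations_b):
    """Average the shorter/longer percentage across individual occurrences."""
    if len(durations_a) != len(durations_b) or not durations_a:
        raise ValueError("both observers need the same, non-zero number of occurrences")
    per_occurrence = [shorter_over_longer(a, b) for a, b in zip(durations_a, durations_b)]
    return sum(per_occurrence) / len(per_occurrence)


# Hypothetical durations in seconds for three occurrences of the behavior.
observer_a = [30, 45, 60]
observer_b = [35, 45, 50]

total_ioa = total_duration_ioa(observer_a, observer_b)
per_occurrence_ioa = mean_duration_per_occurrence_ioa(observer_a, observer_b)
print(f"total duration IOA: {total_ioa:.1f}%")
print(f"mean duration-per-occurrence IOA: {per_occurrence_ioa:.1f}%")
```

With these made-up durations, total duration IOA is about 96.3% and mean duration-per-occurrence IOA is about 89.7%. The per-occurrence function could, under the same assumptions, be applied to paired latency or IRT values instead of durations.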

INTERVAL RECORDING AND TIME SAMPLING: • Interval-by-interval IOA (Point by point) • Scored-interval IOA • Unscored-interval IOA
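
The three interval-based methods differ only in which intervals they count. The sketch below assumes each observer's record is a list with one entry per interval, 1 for scored (behavior recorded) and 0 for unscored. The function names and sample records are hypothetical, and returning `None` when no interval qualifies is a choice made for this sketch.

```python
def interval_by_interval_ioa(a, b):
    """Agreements across all intervals / total intervals * 100."""
    if len(a) != len(b) or not a:
        raise ValueError("both observers need the same, non-zero number of intervals")
    agreements = sum(1 for x, y in zip(a, b) if x == y)
    return 100 * agreements / len(a)


def scored_interval_ioa(a, b):
    """Use only intervals that at least one observer scored."""
    relevant = [(x, y) for x, y in zip(a, b) if x or y]
    if not relevant:
        return None  # neither observer scored any interval
    agreements = sum(1 for x, y in relevant if x and y)
    return 100 * agreements / len(relevant)


def unscored_interval_ioa(a, b):
    """Use only intervals that at least one observer left unscored."""
    relevant = [(x, y) for x, y in zip(a, b) if not x or not y]
    if not relevant:
        return None  # both observers scored every interval
    agreements = sum(1 for x, y in relevant if not x and not y)
    return 100 * agreements / len(relevant)


# Hypothetical records for ten intervals: 1 = scored, 0 = unscored.
observer_a = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
observer_b = [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]

point_by_point = interval_by_interval_ioa(observer_a, observer_b)
scored = scored_interval_ioa(observer_a, observer_b)
unscored = unscored_interval_ioa(observer_a, observer_b)
print(point_by_point, scored, unscored)
```

With these made-up records, interval-by-interval IOA is 80.0%, scored-interval IOA is 50.0%, and unscored-interval IOA is 75.0%. When the behavior occurs in few intervals, agreement on the many unscored intervals inflates the interval-by-interval figure, and scored-interval IOA gives the more conservative estimate.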

CONSIDERATIONS IN IOA • Assess IOA during each condition and phase of a study • Distribute assessments across days of the week, times of day, settings, and observers • Minimum of 20% of sessions, preferably 25-30% • Assess more frequently with complex measurement systems
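
One way to plan for the minimum percentage within every phase is sketched below. It assumes sessions are simply numbered within named phases, and it picks sessions at random within each phase, which is an approach chosen for this illustration rather than one the slides prescribe. The function name `pick_ioa_sessions` and the phase names are hypothetical, and the sketch does not handle distribution across days, times, settings, or observers.

```python
import random


def pick_ioa_sessions(sessions_by_phase, percent=20, seed=0):
    """Select at least `percent` of the sessions in each phase for IOA checks."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    plan = {}
    for phase, sessions in sessions_by_phase.items():
        # Integer ceiling division: round up so the minimum is always met.
        n = (percent * len(sessions) + 99) // 100
        plan[phase] = sorted(rng.sample(sessions, n))
    return plan


# Hypothetical study: 7 baseline sessions and 13 intervention sessions.
sessions_by_phase = {
    "baseline": list(range(1, 8)),
    "intervention": list(range(8, 21)),
}
plan = pick_ioa_sessions(sessions_by_phase, percent=25)
for phase, sessions in plan.items():
    print(phase, sessions)
```

With 25% requested, this selects 2 of the 7 baseline sessions and 4 of the 13 intervention sessions, so each phase meets the target on its own rather than only in aggregate.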

CONSIDERATIONS IN IOA • Obtain and report IOA at the same levels at which the results will be reported and discussed • For each behavior • For each participant • In each phase of intervention or baseline
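
Reporting at these levels amounts to grouping session-level IOA scores by participant, behavior, and phase before summarizing them. The sketch below assumes each IOA check is stored as a dictionary with those three labels and a percentage; the field names, the function name `summarize_ioa`, and all values are hypothetical.

```python
from collections import defaultdict


def summarize_ioa(records):
    """Mean, range, and number of IOA checks per participant, behavior, and phase."""
    groups = defaultdict(list)
    for r in records:
        key = (r["participant"], r["behavior"], r["phase"])
        groups[key].append(r["ioa"])
    return {
        key: {
            "mean": sum(scores) / len(scores),
            "min": min(scores),
            "max": max(scores),
            "n": len(scores),
        }
        for key, scores in groups.items()
    }


# Hypothetical session-level IOA percentages (not data from the slides).
records = [
    {"participant": "P1", "behavior": "hand raising", "phase": "baseline", "ioa": 90},
    {"participant": "P1", "behavior": "hand raising", "phase": "baseline", "ioa": 100},
    {"participant": "P1", "behavior": "hand raising", "phase": "intervention", "ioa": 85},
    {"participant": "P1", "behavior": "hand raising", "phase": "intervention", "ioa": 95},
    {"participant": "P1", "behavior": "hand raising", "phase": "intervention", "ioa": 90},
]
summary = summarize_ioa(records)
for key, stats in summary.items():
    print(key, stats)
```

A single overall mean could hide low agreement in one phase or for one behavior; grouping first keeps those differences visible.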

OTHER CONSIDERATIONS • More conservative methods should be used • Methods that will overestimate actual agreement should be avoided • If in doubt, report more than one calculation • 80% agreement is the usual benchmark • The higher the better • The acceptable level depends on the complexity of the measurement system

REPORTING IOA • Can use • Narrative • Tables • Graphs • Report how, when, and how often IOA was assessed

VALIDITY • Many types • Are you measuring what you believe you are measuring? • Ensures the data are representative • In ABA, we usually measure: • a socially significant behavior • a dimension of the behavior relevant to the question

THREATS TO VALIDITY • Measuring a behavior other than the behavior of interest • Measuring a dimension that is irrelevant or ill-suited to the reason for measuring the behavior • Measurement artifacts • Must provide evidence that the behavior measured is directly related to the behavior of interest

EXAMPLES • Discontinuous measurement • Poorly scheduled observations • Insensitive or limiting measurement scales

CONCLUSIONS • Reliability and validity of data collection are important • They affect the client • They affect your reputation for good work