Test Blueprints for Adaptive Assessments
Tony Alpert
Oregon Department of Education

Purposes of Blueprints
• Communicates with stakeholders, including test takers, regarding the design and content of the assessment
• Serves as a source of evidence for demonstrating compliance with federal and state laws and regulations
• Serves as a method of codifying more detailed agreements regarding the design and content of the assessment for staff and contractors

Sources of Information Regarding Requirements for Blueprints
• The joint Standards for Educational and Psychological Testing (AERA, APA, NCME)
• Operational Best Practices for Statewide Large-Scale Assessment Programs (CCSSO and ATP)
• Existing Blueprints of Large-Scale Assessment Systems
• Peer Review Guidance (USED)

Examples of Blueprint Content
Joint Standards
• 3.2 Describe the purpose of the test and the domain
• 3.3 Specifications should define the content of the test, proposed number of items, item formats, desired psychometric properties of the items, the item and section arrangement, and time for testing
  • Also describes directions to test takers, test administration and scoring procedures, and other relevant information
• 3.4 Procedure to interpret scores
CCSSO/ATP
• 2.1 Test specifications for each grade and content area
• 2.1.2 In addition to the elements in Standard 3.3, specifications should also cover the use of supporting materials like calculators

Examples of Blueprint Content (cont)
• Standard 3.5: Relevant experts external to the testing program should review the test specifications
• CCSSO/ATP clarifies that these specifications should precede item development

CAT and Blueprints
• CATs are designed to be individualized; blueprints are designed to generalize
• Assessment systems have dealt with this problem when using multiple forms

Issues With Clear Solutions
• Purpose of the test, item types, psychometric approach to scoring, format
• Test length, approximate time required
• Students who are eligible
• Achievement level descriptors

Issues Not Unique to CAT
• Content standards to which the test is aligned
  • State assessments sample content from a larger domain. An individual student does not demonstrate proficiency on every required element of content at every level of complexity
  • Some content may not be assessed well on a summative assessment
• Achievement level descriptors
  • Represent a description of the knowledge and skills students have likely achieved, not necessarily whether they have demonstrated knowledge of each element of content (i.e., 100% correct isn’t required for any test)
  • Validity is tied to educator review and to the standard-setting process

Issues Unique to CAT
• Allocation of Content
  • CATs access item pools and are caged by constraints
  • Constraints can take many forms (e.g., content, cognitive complexity, item types)
  • Interaction between item pool and constraints
• Percent Correct Isn’t Helpful
  • Each student should get about 50% of the items correct
  • P-values for items don’t work either
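The point about percent correct can be illustrated with a short sketch. It assumes a Rasch (one-parameter logistic) model, which the slides do not specify, and an idealized CAT in which every item's difficulty exactly matches the student's ability. The item difficulties and the function names are hypothetical.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model (an assumption)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def expected_percent_correct(theta, difficulties):
    """Expected proportion correct for a student of ability theta."""
    return sum(rasch_p(theta, b) for b in difficulties) / len(difficulties)

# Hypothetical fixed form: every student sees the same five items.
fixed_form = [-1.0, -0.5, 0.0, 0.5, 1.0]

def adaptive_items(theta, n=5):
    """Idealized CAT: each selected item is perfectly targeted to theta."""
    return [theta] * n

for theta in (-1.5, 0.0, 1.5):
    fixed = expected_percent_correct(theta, fixed_form)
    adaptive = expected_percent_correct(theta, adaptive_items(theta))
    print(f"theta={theta:+.1f}  fixed form: {fixed:.2f}  adaptive: {adaptive:.2f}")
```

On the fixed form, expected percent correct rises with ability. On the idealized adaptive test it stays at 0.50 for every student, so it carries no information about achievement.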

Issues Not Unique To CAT – But Complicated For CAT
• Evaluation of Test Specifications
  • Prior to administration of tests, use simulations
  • During and after administration, use real data
  • Determine if discrepancies are due to the algorithm, constraints, item pool, or interaction of the three
• Evaluation of Item Pool
  • Sufficient size
  • Optimally in proportion to depth and breadth as expected in each test
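A pre-administration simulation of the kind mentioned above might look like the following minimal sketch. Everything in it is hypothetical: the strand names, the blueprint ranges, the pool size, the Rasch response model, and the crude step-based ability update, which stands in for a real scoring algorithm. An operational study would use the actual adaptive engine and item pool.

```python
import math
import random
from collections import Counter

# Hypothetical blueprint: (min, max) items per content strand on a 10-item test.
BLUEPRINT = {"algebra": (3, 4), "geometry": (3, 4), "statistics": (2, 4)}
TEST_LENGTH = 10

def make_pool(rng, per_strand=40):
    """Hypothetical item pool of (strand, difficulty) pairs."""
    return [(s, rng.gauss(0.0, 1.0)) for s in BLUEPRINT for _ in range(per_strand)]

def eligible_strands(counts, remaining):
    """Strands the next item may come from without breaking the blueprint."""
    need = {s: max(0, lo - counts[s]) for s, (lo, _hi) in BLUEPRINT.items()}
    if sum(need.values()) >= remaining:
        # Every remaining slot is needed to satisfy a strand minimum.
        return [s for s, n in need.items() if n > 0]
    return [s for s, (_lo, hi) in BLUEPRINT.items() if counts[s] < hi]

def simulate_test(true_theta, pool, rng):
    """Simulate one examinee and return the number of items seen per strand."""
    counts, used, theta, step = Counter(), set(), 0.0, 1.0
    for position in range(TEST_LENGTH):
        strands = eligible_strands(counts, TEST_LENGTH - position)
        candidates = [i for i, (s, _b) in enumerate(pool)
                      if s in strands and i not in used]
        # Select the unused, eligible item closest to the current estimate.
        pick = min(candidates, key=lambda i: abs(pool[i][1] - theta))
        strand, b = pool[pick]
        used.add(pick)
        counts[strand] += 1
        p_correct = 1.0 / (1.0 + math.exp(-(true_theta - b)))
        correct = rng.random() < p_correct
        # Simplified update: move the estimate up or down by a shrinking step.
        theta += step if correct else -step
        step = max(0.25, step * 0.7)
    return counts

def meets_blueprint(counts):
    return all(lo <= counts[s] <= hi for s, (lo, hi) in BLUEPRINT.items())

rng = random.Random(0)
pool = make_pool(rng)
results = [simulate_test(rng.gauss(0.0, 1.0), pool, rng) for _ in range(200)]
compliance = sum(meets_blueprint(c) for c in results) / len(results)
mean_counts = {s: sum(c[s] for c in results) / len(results) for s in BLUEPRINT}
print(f"Simulated tests meeting blueprint: {compliance:.0%}")
print("Mean items per strand:", mean_counts)
```

Comparing the per-strand counts from simulated tests with the blueprint ranges is one way to see whether a discrepancy comes from the selection rule, the constraints, or a pool that is thin in some strand.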

Issues Related To CAT Policy
• Termination Criteria
  • Impacts test length, allocation of content, testing time required, item pool requirements
  • Documentation in specifications should likely be more complex to address these more complex approaches
• Use of Interactive Simulations, Audio and Speech
  • Paper may not be a sufficient medium for test specifications
  • Need to consider test specifications that can only be delivered online
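To see how a termination criterion drives test length and pool requirements, consider this sketch of a precision-based stopping rule. It assumes a Rasch model, where an item's information is p(1 - p) and the standard error of the ability estimate is 1 / sqrt(total information). The target standard error, the maximum length, and the function names are hypothetical.

```python
import math

def item_information(theta, b):
    """Rasch item information at ability theta: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def items_until_se(theta, difficulties, target_se, max_items):
    """Count items administered before SE <= target_se or max_items is hit."""
    info = 0.0
    administered = 0
    for b in difficulties[:max_items]:
        info += item_information(theta, b)
        administered += 1
        if 1.0 / math.sqrt(info) <= target_se:
            break
    return administered

theta = 0.0
well_targeted = [theta] * 60          # pool has items at the student's level
poorly_targeted = [theta + 2.0] * 60  # pool only has much harder items

n_good = items_until_se(theta, well_targeted, target_se=0.5, max_items=60)
n_poor = items_until_se(theta, poorly_targeted, target_se=0.5, max_items=60)
print("Items needed, well-targeted pool:", n_good)
print("Items needed, poorly targeted pool:", n_poor)
```

Under these assumptions the same stopping rule produces a much longer test when the pool lacks items near the student's ability, which is why a variable-length termination rule ties the specifications for test length, testing time, and item pool together.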