PLANNING THE CLASSROOM TEST

The main goal in classroom testing is to obtain valid, reliable and useful information concerning pupil achievement. This requires following a series of steps.

Steps in planning a classroom test
1. Determine the purpose of the test
2. Develop the test specification
3. Select appropriate item types
4. Prepare relevant test items
5. Assemble the test
6. Administer the test
7. Appraise the test
8. Use the results

1. Determining the purpose of the test
• a) Pre-testing (before instruction):
  - Readiness test: determines the prerequisite skills needed for instruction. Limited in scope and relatively low in difficulty.
  - Test of mastery of the planned instruction: similar to the tests that measure the outcomes of instruction.
• b) During instruction:
  - Formative: monitors learning progress. Covers a specific portion of instruction.
  - Diagnostic: tests for persistent learning difficulties. Requires a number of items in each specific area.

• c) End of instruction:
  - Summative: measures the achievement of the intended learning outcomes. Used for assigning grades and for determining the effectiveness of instruction.

2. Developing the test specification (test blueprint)
• The purpose is to make sure that the test measures a representative sample of the tasks. This involves:
1. Preparing a list of instructional objectives. This is limited to those outcomes that can be measured by a classroom test; it does not include performance skills or the affective domain.
2. Outlining the instructional content.
3. Preparing a two-way chart that relates the instructional objectives to the instructional content. The numbers in the chart indicate the number of items, or the relative weight, given to each objective and/or content area.

Example of a table of specification

Content area | Knows: basic terms | Knows: specific facts | Understands: influence on agriculture | Interprets: soil diagrams | Total
Air pressure | 3 | 5 | 2 | 5 | 15
Wind | 4 | 6 | 4 | 1 | 15
Temperature | 5 | 6 | 2 | 0 | 13
Humidity and precipitation | 3 | 3 | 5 | 6 | 17
Total | | | | | 60

Example of a table of specification (some levels of the cognitive domain)

Content area | Knows units | Estimates size of units | Uses units (application) | Converts units | Total
Units of length | 2 | 1 | 2 | 2 | 7
Units of mass | 2 | 2 | 1 | 3 | 8
Units of time | 3 | 0 | 4 | 3 | 10
Total | 7 | 3 | 7 | 8 | 25
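
The row and column totals of a two-way chart are easy to get wrong when items are added or moved, so a small check can help. The sketch below is only an illustration: it uses the figures from the units example, and the function names (blueprint_totals, content_weights) are made up for this purpose.

```python
# Two-way chart from the units example: rows are content areas,
# columns are instructional objectives (number of items per cell).
OBJECTIVES = ["Knows units", "Estimates size of units",
              "Uses units (application)", "Converts units"]
BLUEPRINT = {
    "Units of length": [2, 1, 2, 2],
    "Units of mass": [2, 2, 1, 3],
    "Units of time": [3, 0, 4, 3],
}


def blueprint_totals(blueprint):
    """Return (row totals, column totals, grand total) for a two-way chart."""
    row_totals = {area: sum(cells) for area, cells in blueprint.items()}
    column_totals = [sum(column) for column in zip(*blueprint.values())]
    return row_totals, column_totals, sum(row_totals.values())


def content_weights(blueprint):
    """Relative weight of each content area, as a percentage of all items."""
    row_totals, _, grand_total = blueprint_totals(blueprint)
    return {area: 100 * n / grand_total for area, n in row_totals.items()}


rows, columns, total = blueprint_totals(BLUEPRINT)
weights = content_weights(BLUEPRINT)
print(rows)
print(dict(zip(OBJECTIVES, columns)))
print(total, weights)
```

The percentages show the relative weight the chart gives to each content area, which is the second reading of the numbers mentioned in step 3.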

Selecting appropriate item types
• Classroom tests are generally divided into two main categories:
1) Objective tests: these are highly structured and require pupils to supply a few words or to select from given alternatives.
2) Essay tests: pupils are allowed to organise and present their answer in the form of a composition.
• Which type to use depends on the purpose of the test. Both have advantages and disadvantages.

Objective types
• These are further divided into two:
• Supply items: the pupil is required to supply an answer, e.g. short answer and completion items.
• Selection items: the pupil is required to select from a given number of alternatives, e.g. true/false, matching and multiple choice items.

Comparative advantages of objective and essay type questions

Learning outcomes measured
  Objective test: Efficient for measuring knowledge of facts, understanding and thinking skills. Inappropriate for measuring the ability to select and organise ideas, writing skills and some types of problem solving skills.
  Essay test: Can measure understanding, thinking skills and other complex learning outcomes. Appropriate for measuring the ability to select and organise ideas, writing skills and some types of problem solving skills requiring originality.
Preparation of questions
  Objective test: A large number of items is required. Preparation is difficult.
  Essay test: Only a few questions are needed. Preparation is easy.
Sampling of content
  Objective test: Extensive sampling.
  Essay test: Limited sampling.

Control of responses
  Objective test: Limits responses; avoids the influence of writing skills.
  Essay test: Pupils respond in their own words; writing skill influences the score; guessing is minimised.
Scoring
  Objective test: Quick, easy and consistent.
  Essay test: Slow, subjective, difficult and inconsistent.
Influence on learning
  Objective test: Encourages comprehensive knowledge of specific facts and the ability to make fine discriminations.
  Essay test: Encourages concentration on larger units, with special emphasis on the ability to organise, interpret and express ideas.
Reliability
  Objective test: High.
  Essay test: Typically low.

Essay type questions
• In these, students are free to select, relate and present ideas in their own words. They can be used to measure the ability to evaluate ideas, relate them and express them clearly.
• In terms of the amount of freedom allowed, essay type questions can be classified into:
1. Extended response type
2. Restricted response type

• Example, type 1: Explain your views concerning the performance of the government of national unity in Zanzibar.
• Here the pupil is free to select any factual information, organise the answer, and integrate and evaluate ideas. This freedom makes the extended response question inefficient for measuring more specific learning outcomes.
• Example, type 2: Explain briefly the differences between acids and bases.

• The restriction can be in both content and response, e.g. "Your answer should not exceed two pages" or "Explain one factor affecting the reliability of a test".
• Restricted response essay questions are most useful for measuring learning outcomes that require the interpretation and application of data in a specific area.
Weaknesses
• Limited coverage
• Marking is difficult and less reliable
• They tend to favour verbally fluent pupils

Suggestions for constructing essay type questions
• Essay questions should be used to measure learning outcomes that cannot be satisfactorily measured by objective items.
• Make sure that the question measures the behaviour specified.
• Questions should be phrased so that the task is clearly indicated. Some types of questions: compare, relate cause and effect, generalise, justify, classify, create, apply, analyse.
• Indicate an approximate time limit for each question.

Points to consider in marking essay items
• Prepare an outline of the expected answer in advance.
• Decide how you will handle irrelevant factors (handwriting, spelling, sentence structure etc.).
• Evaluate all answers to one question before going on to another.
• Evaluate answers without looking at the pupil's name.

CONSTRUCTING OBJECTIVE TEST ITEMS

Short answer items
These take the form of a short direct question or an incomplete statement. Learning outcomes measured include knowledge of specific facts, principles and procedures, and manipulative skills.
Advantages
• Easy to construct.

• They are the most appropriate items for measuring the recall of memorised information.
• Guessing is minimised.
Limitations
• Unsuitable for measuring complex learning outcomes.
• Difficult to score if the item is not carefully phrased.

How to minimise these limitations
• Avoid words and phrases that allow more than one correct answer.
• Word the item so that the required answer is brief and specific.
• Do not lift statements directly from the textbook.
• A direct question is generally better than an incomplete statement.
• Avoid too many blanks in a single item.
• Giving an example of how the answer is to be supplied reduces pupils' anxiety and saves time.
• Avoid lengthy statements.

TRUE/FALSE ITEMS
• These have only two possible answers: Right/Wrong, Correct/Incorrect, Yes/No, Fact/Opinion, Agree/Disagree etc. They are used to measure the ability to identify the correctness of factual statements.
Advantages
• Easy to construct.

Disadvantages
• They measure low-level cognitive skills.
• There is a 50% chance of guessing the answer correctly.
Suggestions for constructing T/F items
• Avoid broad general statements.
• Minimise the use of negative statements.
• Use statements that are absolutely true or absolutely false.

MATCHING ITEMS
• These consist of two parallel columns. The items for which a match is sought are called premises. The words or statements in the column from which the selection is made are called responses.
• They measure factual information based on simple association.
Advantages
• Compact form: a large amount of material can be measured.
• Easy to construct.
Disadvantages
• Guessing.
• It is difficult to find a sufficient number of related items in some topics.

Guidelines for constructing matching items
• One column should have more items than the other.
• Premises should be homogeneous.
• The list of premises should be brief.
• The responses should be arranged in a logical order.
• Indicate the basis for matching.
• Place all the items on the same page.

MULTIPLE CHOICE ITEMS
• They are used more than the other types of objective item. They are capable of measuring outcomes from the simple to the complex.
• The statement of the problem is called the stem. The alternatives are called responses: the correct one is the key and the incorrect ones are distracters.
• Advantages: guessing is reduced; more reliable than true/false items.

Disadvantages
• Cannot measure problem solving skills.
• Cannot measure the ability to organise and present ideas.
Guidelines for constructing multiple choice questions
1. Make sure that the stem is meaningful.
2. All alternatives should be grammatically consistent with the stem.
3. Each item should contain only one correct answer. Avoid "all of the above".
4. Make all distracters plausible.
5. The length of the alternatives should be relatively equal.
6. The correct response should appear in a random position.
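
Guideline 6 is hard to follow by hand, because people tend to favour the middle positions. The sketch below shows one possible way to randomise the position of the key. The function name shuffle_alternatives and the sample item are illustrative assumptions, and it assumes that no two alternatives in an item are identical.

```python
import random


def shuffle_alternatives(key, distracters, rng):
    """Return (alternatives in random order, position of the key)."""
    alternatives = [key] + list(distracters)
    rng.shuffle(alternatives)  # in-place shuffle using the supplied generator
    return alternatives, alternatives.index(key)


# A fixed seed is used only so that the same paper can be regenerated later.
rng = random.Random(2024)

# Sample item: "The statement of the problem in a multiple choice item is called the ..."
alternatives, key_position = shuffle_alternatives(
    "stem", ["key", "premise", "response"], rng
)
for letter, text in zip("ABCD", alternatives):
    print(letter, text)
print("Key:", "ABCD"[key_position])
```

Record the key position when the item is shuffled, since the scoring stencil is built from it.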

Assembling the classroom test
• Write test items in such a way that they can be easily modified.
• Review the test several times in order to detect defects.
• Arrange the items in a logical manner.
• Consider subject matter, learning outcomes, difficulty etc.
• A common order by item type: true/false, matching, short answer, multiple choice, essay.

• Reproduce the test in a clear layout so that it can be easily read and scored.
• Provide directions at the beginning of the test. These include the time allowed, the basis for answering, the procedure for recording answers etc.
• Where necessary, provide specific directions for each question.
• Write the directions clearly, to avoid the need for additional verbal explanation.
• The test should be written so that anyone can supervise it without the test writer being present.
• Arrange the items so that pupils do not need to turn pages back and forth.

ADMINISTERING THE TEST
All pupils should be given a fair chance to demonstrate their ability. Ensure favourable conditions for test taking:
• Physical conditions: space, light, temperature, ventilation etc.
• Psychological conditions: avoid anxiety, worry, threats and poor timing (e.g. just before a big event).
• Things to avoid: unnecessary talking, interruptions, giving hints to pupils.
• Discourage cheating.

• The seating arrangement should be such that cheating is minimised. Another way is to prepare two forms of the same test with the items in a different order.

SCORING THE TEST
For objective tests a scoring stencil can be used. When using a stencil, check that only one answer is marked for each item.
Correction for guessing: this is done when pupils do not have sufficient time to complete all the items and when they have been told that there will be a penalty for guessing.
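
The correction formula itself is not in the text of the slide. The sketch below assumes the widely used form, score = R - W / (n - 1), where R is the number of right answers, W the number of wrong answers and n the number of alternatives per item. R, W and the function name are assumptions. Omitted items are counted neither as right nor as wrong.

```python
def corrected_score(right, wrong, n_alternatives):
    """Score corrected for guessing: right - wrong / (n - 1).

    Assumes the standard correction-for-guessing formula; omitted items
    are simply left out of both counts.
    """
    if n_alternatives < 2:
        raise ValueError("an item needs at least two alternatives")
    return right - wrong / (n_alternatives - 1)


# 30 right and 8 wrong on five-alternative multiple choice items
five_choice = corrected_score(30, 8, 5)
# the same counts on true/false items (n = 2): right minus wrong
true_false = corrected_score(30, 8, 2)
print(five_choice, true_false)
```

For true/false items the penalty is one mark per wrong answer, and it shrinks as the number of alternatives grows.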

n: number of alternatives for an item

Item effectiveness
This is determined by analysing the pupils' responses to an item. It attempts to answer the following questions:
1. Did the item function as intended?
2. Was the item of appropriate difficulty?
3. Was the item free of irrelevant clues?
4. Was each of the distracters effective?

Advantages of item analysis
1. It provides a basis for class discussion of test results.
2. It provides a basis for remedial work.
3. It provides a basis for the general improvement of classroom instruction.
4. It provides increased skill in test construction.

Item analysis for norm-referenced (N-R) tests
The method is different from that for criterion-referenced (C-R) tests, as the two types serve different functions.
Procedure: for small classes (N < 20) we compare the upper and lower halves. With larger classes we use the scores of the upper 25% and the lower 25%.
- Rank the papers from the highest to the lowest score.
- Select the upper and lower groups, e.g. the top 10 and the bottom 10 papers.
- For each item, tabulate the number of pupils in the upper group and in the lower group who selected each alternative.
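
The ranking and selection steps can be sketched as follows. The function name split_groups is illustrative. Two details are assumptions, since the slides do not cover them: group sizes are rounded down, and in a small class with an odd number of pupils the middle paper is left out.

```python
def split_groups(scores):
    """Split pupils into upper and lower groups for item analysis.

    scores: dict mapping pupil name -> total test score.
    Classes under 20 are split into halves, larger classes into the
    upper 25% and lower 25%. Sizes are rounded down (an assumption).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first
    n = len(ranked)
    size = n // 2 if n < 20 else n // 4
    return ranked[:size], ranked[n - size:]


# A made-up class of 40 pupils with scores 1..40
scores = {f"pupil{i:02d}": i for i in range(1, 41)}
upper, lower = split_groups(scores)
print(len(upper), upper[0], len(lower), lower[-1])
```

With 40 pupils this gives the top 10 and bottom 10 papers. Pupils with equal scores keep the order in which they were entered.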

• Use the following formula to calculate item difficulty:
  $P = \frac{R}{T}$
• R: total number of pupils who got the item right
• T: total number of pupils who tried the item
From the table, P = 0.7, or 70%.

Computing discriminating power
• Positive discrimination indicates that more pupils in the upper group than in the lower group got the item right.
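
Both indices are simple ratios and can be sketched as below. Item difficulty follows the definitions of R and T given above. The discriminating power formula is not in the slide text, so the usual upper-lower index D = (RU - RL) / (T / 2) is assumed, where RU and RL are the numbers of pupils answering correctly in the upper and lower groups. The counts are hypothetical, chosen only to reproduce the P = 0.7 and D = 0.6 quoted in the slides.

```python
def item_difficulty(right, tried):
    """P = R / T: proportion of pupils who got the item right."""
    return right / tried


def discriminating_power(right_upper, right_lower, tried):
    """D = (RU - RL) / (T / 2), assuming equal-sized upper and lower groups."""
    return (right_upper - right_lower) / (tried / 2)


# Hypothetical counts: 10 pupils in each group, all 20 tried the item,
# 10 of the upper group and 4 of the lower group got it right.
right_upper, right_lower, tried = 10, 4, 20
p = item_difficulty(right_upper + right_lower, tried)
d = discriminating_power(right_upper, right_lower, tried)
print(f"P = {p:.2f}, D = {d:.2f}")
```

D is zero when both groups do equally well, and negative when the lower group does better than the upper group.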

In the example D = 0.6, so we say the item discriminates positively. That is, it discriminates in the same direction as the total score.

Evaluating the effectiveness of the distracters
This can be determined by inspection. A good distracter attracts more pupils from the lower group than from the upper group.

• A distracter selected by no one is said to be ineffective.
• B: poor distracter; it attracts more pupils from the upper group than from the lower group.
• C: ineffective distracter; it attracted no one.
• D: good distracter.
• Item difficulty = 0.3
• Discriminating index =
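
The inspection rules above can be written as a small classifier. This is a sketch only: the counts are hypothetical, arranged to mirror the verdicts for B, C and D, and the "unclear" label for a distracter chosen equally often by both groups is an assumption, because the slides do not cover that case.

```python
def rate_distracter(upper, lower):
    """Rate a distracter from the number of pupils choosing it in each group."""
    if upper == 0 and lower == 0:
        return "ineffective"  # selected by no one
    if lower > upper:
        return "good"         # attracts more of the lower group
    if upper > lower:
        return "poor"         # attracts more of the upper group
    return "unclear"          # equal counts: not covered by the slides


# Hypothetical (upper, lower) counts for the distracters of one item
counts = {"B": (3, 1), "C": (0, 0), "D": (1, 5)}
ratings = {name: rate_distracter(u, l) for name, (u, l) in counts.items()}
print(ratings)
```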

Cautions in using item analysis data
1. Item discriminating power does not indicate item validity, because it is based only on the upper and lower groups.
2. A low index of discrimination does not necessarily indicate a defective item.
3. Item analysis data from a small sample are highly tentative. They can vary from group to group.

Analysis for C-R mastery items
In criterion-referenced (C-R) testing we are interested in the question "to what extent did the test items measure the effects of instruction?" The same test is given before and after instruction and the results are compared. The following formula is used to measure sensitivity to instructional effects:
  $S = \frac{R_A - R_B}{T}$

• R_A: number of pupils who got the item right after instruction
• R_B: number of pupils who got the item right before instruction
• T: total number of pupils who tried the item both times
• Effective items give a value of S between 0 and 1, with 1 the ideal.
• Zero and negative values of S do not reflect the intended effects of instruction.
• Distracters that are never or rarely selected are not effective and should be replaced.
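
A minimal sketch of the sensitivity index, assuming it is computed as S = (RA - RB) / T, which agrees with the definitions above and with the stated range of values. The function name and the counts are hypothetical.

```python
def sensitivity_index(right_after, right_before, tried):
    """S = (RA - RB) / T for one item given before and after instruction."""
    return (right_after - right_before) / tried


# Hypothetical results for three items tried by 20 pupils both times
effective = sensitivity_index(18, 2, 20)    # most pupils learned it
unchanged = sensitivity_index(10, 10, 20)   # instruction made no difference
reversed_ = sensitivity_index(5, 8, 20)     # fewer correct after instruction
print(effective, unchanged, reversed_)
```

The first item is close to the ideal. The other two should be reviewed, since they do not reflect the intended effect of instruction.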

Using marks or grades
• Raw score: the number of points obtained by a pupil according to the marking scheme.
• A raw score has no educational meaning by itself. It can be interpreted in terms of the tasks performed (C-R) or in relation to other pupils' scores (N-R).
Functions of marks or grades
1. They provide objective criteria for assessing student performance.
2. They provide permanent records of achievement for promotion, certification, scholarships etc.

3. They motivate pupils to work hard.
4. They serve as a form of justice.
Limitations of marks or grades
• No mark or grade is perfect.
• They encourage unhealthy competition and cheating.
• They classify students, which can have psychological effects.
• Some of them are subjective and do not reflect pupils' achievement.
• They demoralise weak students.