Chapter 9 Scaling, Reliability and Validity

Chapter Objectives
• Know how and when to use the different forms of rating scales and ranking scales
• Explain stability and consistency, and how they are established
• Explain the different forms of validity
• Discuss what ‘goodness’ of measures means, and why it is necessary to establish it in research

Rating and Ranking Scales
• Rating Scales – have several response categories and are used to elicit responses with regard to the object, event or person studied.
• Ranking Scales – make comparisons between or among objects, events or persons, and elicit the preferred choices and ranking among them.

Rating Scales
• dichotomous scale
• category scale
• Likert scale
• numerical scales
• semantic differential scale
• itemised rating scale
• fixed or constant sum rating scale
• Stapel scale
• graphic rating scale
• consensus scale

Dichotomous Scale
Used to elicit a Yes or No answer, e.g.:
Do you own a car?   Yes   No

Category Scales

Likert Scale
Indicate the extent to which you agree or disagree with the following statements:
My work is very interesting   1 2 3 4 5
Life without my work would be dull   1 2 3 4 5

Semantic Differential Scale
Responsive / Unresponsive
Good / Bad
Courageous / Timid

Numerical Scale
How pleased are you with your new car?
Extremely pleased   7 6 5 4 3 2 1   Extremely displeased

Itemised Rating Scale
An itemised rating scale that does not have a neutral point is an unbalanced rating scale.

Fixed or Constant Sum Rating Scale
Respondents are asked to distribute a given number of points across various items, e.g.:
Fragrance __
Colour __
Shape __
Size __
Texture of lather __
Total points 100
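A constant sum response is only usable if the points add up to the fixed total. The following is a minimal sketch of that check; the function name `is_valid_allocation` and the sample allocation are hypothetical, chosen only for illustration.

```python
def is_valid_allocation(points, total=100):
    """Return True if every allocation is non-negative and they sum to `total`."""
    values = list(points.values())
    return all(v >= 0 for v in values) and sum(values) == total

# Hypothetical response from one respondent (illustrative numbers only).
response = {
    "Fragrance": 30,
    "Colour": 10,
    "Shape": 15,
    "Size": 20,
    "Texture of lather": 25,
}

print(is_valid_allocation(response))
```

A response that fails the check would normally be returned to the respondent or excluded, depending on the study design.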

Stapel Scale
Measures the direction and intensity of the attitude towards the items under study.

Graphic Rating Scale

Ranking Scales
• paired comparison
• forced choice
• comparative scale

Paired Comparison
• Used when, among a small number of objects, respondents are asked to choose between two objects at a time.
• The number of paired choices for n objects is n(n-1)/2.
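The pair count grows quickly with the number of objects, which is why paired comparison suits only small sets. The sketch below checks the n(n-1)/2 formula against an explicit enumeration; the object labels are hypothetical placeholders.

```python
from itertools import combinations

def paired_choices(n):
    """Number of distinct pairs among n objects: n(n-1)/2."""
    return n * (n - 1) // 2

# Hypothetical objects to be compared two at a time.
objects = ["A", "B", "C", "D"]
pairs = list(combinations(objects, 2))

for first, second in pairs:
    print(f"Which do you prefer: {first} or {second}?")

# The enumeration and the formula agree.
print(len(pairs) == paired_choices(len(objects)))
```

With 4 objects the respondent faces 6 choices; with 10 objects the same formula gives 45, which is already a heavy burden on respondents.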

Forced Choice
Rank your preferences among the following magazines, 1 being your preferred choice and 5 being your least preferred:
Australian Financial Review __
Business Review Weekly __
Playboy __
The Economist __
Time __

Comparative Scale
In a volatile financial environment, compared with shares, how useful is it to invest in government bonds?
1 (More useful)   2   3 (About the same)   4   5 (Less useful)

Goodness of Measures
• Reliability measures – How stable and consistent is the measuring instrument?
• Validity measures – Are we measuring the right thing?

Reliability and Validity in Target Shooting

Forms of Reliability and Validity

Reliability
• Stability – the ability of a measure to remain the same over time, despite uncontrollable testing conditions or the state of the respondents themselves
• Internal consistency – indicates how well the items ‘hang together as a set’ and can independently measure the same concept, so that respondents attach the same overall meaning to each of the items

Stability of Measures
• Test-retest reliability – the reliability coefficient obtained with a repetition of the same measure on a second occasion
• Parallel-form reliability – the correlation obtained from responses on two comparable sets of measures (changed for wording and question order) tapping the same construct
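Both stability coefficients are correlations between two sets of scores from the same respondents. The sketch below computes a Pearson correlation for a test-retest case; the scores are hypothetical, and the same function would apply to two parallel forms.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Undefined (division by zero) if either list has no variation.
    return sxy / sqrt(sxx * syy)

# Hypothetical total scores for five respondents on two occasions.
first_occasion = [12, 15, 9, 20, 17]
second_occasion = [13, 14, 10, 19, 18]

test_retest = pearson_r(first_occasion, second_occasion)
print(round(test_retest, 3))
```

The closer the coefficient is to 1, the more stable the measure appears across the two occasions.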

Internal Consistency of Measures
• Inter-item consistency reliability – a test of the consistency of respondents’ answers to all the items in a measure; usually tested by Cronbach’s coefficient alpha
• Split-half reliability – reflects the correlations between two halves of an instrument
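Both internal consistency checks can be sketched in a few lines. The example below computes Cronbach's alpha with the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), and a split-half correlation. The response data are hypothetical, and splitting into odd-numbered and even-numbered items is an assumption made here; other ways of halving an instrument are possible.

```python
from math import sqrt
from statistics import variance

# Hypothetical data: 5 respondents (rows) x 4 items (columns), scored 1 to 5.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

def cronbach_alpha(rows):
    """Cronbach's coefficient alpha from a respondents-by-items table."""
    k = len(rows[0])                      # number of items
    item_columns = list(zip(*rows))       # one tuple of scores per item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_r(rows):
    """Correlation between odd-numbered and even-numbered item totals."""
    odd_half = [sum(row[0::2]) for row in rows]    # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in rows]   # items 2, 4, ...
    return pearson_r(odd_half, even_half)

alpha = cronbach_alpha(responses)
half_r = split_half_r(responses)
print(round(alpha, 3), round(half_r, 3))
```

Higher values of either coefficient suggest that the items hang together as a set; what counts as acceptable depends on the study and is not fixed by the code.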

Types of Validity