Scaling Design Chapter 8 Cooper & Schindler


Scaling Defined
• Procedure for the assignment of numbers to a property of objects in order to impart some of the characteristics of numbers to the properties in question
• We assign numbers to indicants of the properties of objects
  – Use mercury to measure temperature
  – Peer group to rate a person’s supervisory capacity

Scale Classification
• Study objective
• Response scales
• Degree of preference
• Scale properties
• Number of dimensions
• Scale construction

Study Objective
• Measure the characteristics of the respondents who complete it
  – Combine each respondent’s answers to form an indicator of that respondent’s conservatism or political orientation
• Use respondents as judges of the objects or stimuli presented to them
  – Use the same data, but the interest is in how respondents view different things

Response Scales*
• Categorical scales
  – Used when respondents score some object without direct reference to other objects
  – Asked to rate the styling of new autos on a five-point scale
  – Rating scales
• Comparative scales
  – The respondents are asked to choose which one of a pair of cars has more attractive styling
  – Ranking scales

Degree of Preference
• Preference measurement
  – Respondents are asked to choose the object each favors or the solution each would prefer
• Nonpreference measurement
  – Respondents are asked to judge which object has more of some characteristic or which solution takes the most resources, without reflecting any personal preference toward objects or solutions

Scale Properties
• Nominal
• Ordinal
• Interval
• Ratio

Number of Dimensions
• Unidimensional scale
  – Seeks to measure only one attribute of the respondent or object
  – One measure of employee potential is promotability
• Multidimensional scaling
  – Object is described in an attribute space of n dimensions
  – The employee promotability variable is expressed by three distinct dimensions
    • Managerial performance
    • Technical performance

Scale Construction (I)*
• Arbitrary approach
  – Scale is developed on an ad hoc basis
• Consensus scales
  – A panel of judges evaluates the items to be included in the instrument based on relevance to the topic area and lack of ambiguity
• Item analysis
  – Individual items are developed for a test that is given to a group of respondents
  – After administering the test, total scores are calculated
  – Individual items are then analyzed to determine which ones discriminate between persons or objects with high total scores and low total scores

Scale Construction (II)*
• Cumulative scales
  – Are chosen for their conformance to a ranking of items with ascending and descending discriminating power
  – The endorsement of an item representing an extreme position results in the endorsement of all items of less extreme positions
• Factor scales
  – Are constructed from intercorrelations of items
  – Common factors account for the relationships
  – The relationships are measured statistically through factor analysis or cluster analysis

Rating Scales
• Judge properties of objects without reference to other similar objects
• Examples
  – Like-dislike
  – Approve-indifferent-disapprove
• There is little conclusive support for choosing any particular number of scale points
  – The most widely used scales range from three to seven points

Graphical Rating Scale
• The rater checks his or her response at any point along a continuum
• The second scale (II) illustrates the graphic principle but is not a good scale
• Other graphic rating scales use pictures, icons, or other visuals to communicate with the rater
• Variation 1
• Variation 2
• Other Variations

Employee Evaluation Form

(I) “How well does the employee get along with co-workers?” (Place an X at the position along the line that best reflects your judgment)

Always gets along ________________________________ Never gets along

(II) “How well does the employee get along with co-workers?” (Please check)

Always gets along well | Sometimes has trouble | Often has trouble | Always at odds with someone

Variation 1. Boxes replace the line and make it more certain that respondents will choose only one of four points

[ ] Always gets along well   [ ] Sometimes has trouble   [ ] Often has trouble   [ ] Always at odds with someone

Variation 2. Two polar positions shown with a number scale used to show degree of opinion

Gets along well   1   2   3   4   5   Has trouble

Itemized Scale
• Presents a series of statements from which respondents select one as best reflecting their evaluation
• The judgments are ordered in a progression of a property
• This form is more difficult to develop
• It provides more information and meaning to the rater
• It increases reliability because more detailed statements help respondents develop and hold the same frame of reference while they fill out the form

How well does the employee get along with co-workers?
__ Almost always involved in some friction or argument with a co-worker
__ Often at odds with one or more co-workers. The frequency of involvement is clearly above that of the average worker
__ Sometimes gets involved in friction. The frequency of involvement is about equal to that of the average worker
__ Infrequently becomes involved in friction with others, definitely less often than most workers
__ Almost never gets involved in friction situations with other workers

Problems in Using Rating Scales (I)
• Leniency
  – Occurs when a respondent is either an easy rater or a hard rater
  – Design the rating scale to anticipate it
• Central tendency
  – Raters are reluctant to give extreme judgments
  – Efforts to counteract this error
    • Adjust the strength of descriptive adjectives
    • Space the intermediate descriptive phrases farther apart in graphic scales

Problems in Using Rating Scales (II)
• Central tendency (continued)
  – Efforts to counteract this error
    • Provide smaller differences in meaning between steps near the ends of the scale than between the steps near the center
    • Use more points in the scale
• Halo effect
  – Is the systematic bias that the rater introduces by carrying over a generalized impression of the subject from one rating to another
  – Halo is a pervasive error
  – It is difficult to avoid when the property being studied is not clearly defined

Ranking Scales
• Compare two or more objects and make choices among them
• Vote-splitting
  – 40 percent choose model A; 30 percent choose model B; 30 percent choose model C
  – 60 percent of respondents chose some model other than A; perhaps all B and C voters would place A last, preferring either B or C to it
• Ranking
  – Method of paired comparisons
  – Method of rank order

Pair Comparisons
• The respondent can express attitudes unambiguously by choosing between two objects
• Generally, there are more than two stimuli to judge
• The number of judgments required is

  $N = \frac{n(n-1)}{2}$

  N = Number of judgments
  n = Number of stimuli or objects to be judged
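A minimal sketch of the judgment-count formula, to show how quickly the burden grows with the number of stimuli. The function name `paired_comparisons` is illustrative and not from the slides.

```python
def paired_comparisons(n: int) -> int:
    """Return N = n(n - 1) / 2, the number of distinct pairs among n stimuli."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # n(n - 1) is always even, so integer division is exact.
    return n * (n - 1) // 2


if __name__ == "__main__":
    for n in (5, 7, 10, 36):
        print(f"{n} stimuli -> {paired_comparisons(n)} judgments")
```

With 7 stimuli the formula gives 21 judgments, and with 36 it gives 630, which is why the following slides discuss ways to reduce the load.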

Paired-Comparison Problems
• The respondent may get tired and give ill-considered answers or refuse to continue
• Reducing the number of comparisons per respondent without reducing the number of objects being studied can lighten this burden
  – One approach is to present each respondent with only a sample of the stimuli; each pair of objects must be compared an equal number of times
  – Another is to choose a few objects that cover the range of attractiveness at equal intervals; all other stimuli are then compared to these few standard objects
• Example
  – If 36 employees are to be judged, 4 may be selected as standards and the others divided into four groups of eight
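The saving in the 36-employee example can be worked out under one plausible reading of the procedure. The slide does not spell out the counting, so the following is an assumption: members of each group are compared with one another, and every non-standard employee is also compared with each standard. The function names are illustrative.

```python
def pairs(n: int) -> int:
    """Number of distinct pairs among n objects."""
    return n * (n - 1) // 2


def comparisons_with_standards(n_objects: int, n_standards: int, n_groups: int) -> int:
    """Count judgments when a few standard objects anchor the comparison.

    Assumed procedure (not stated on the slide):
      1. the non-standard objects are split into equal groups and compared
         pairwise within each group;
      2. each non-standard object is compared with every standard.
    """
    others = n_objects - n_standards
    if others % n_groups != 0:
        raise ValueError("non-standard objects must divide evenly into groups")
    group_size = others // n_groups
    within_groups = n_groups * pairs(group_size)
    against_standards = others * n_standards
    return within_groups + against_standards


if __name__ == "__main__":
    print("full design:", pairs(36))
    print("with 4 standards, 4 groups:", comparisons_with_standards(36, 4, 4))
```

Under these assumptions the design needs 4 × 28 within-group judgments plus 32 × 4 judgments against the standards, compared with 630 for the full set of pairs.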

Pair-Comparison Data Analysis
• Transitivity
• Union bargaining committee is considering five major demand proposals
  – Law of Comparative Judgment
  – Guilford’s “Composite-standard” method

        A     B     C     D     E   Total
  A    --    36    62   150   130     378
  B   164    --   146   186   170     666
  C   138    54    --   168   150     510
  D    50    14    32    --    82     178
  E    70    30    50   118    --     268
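As a rough consistency check on the table above, the sketch below recomputes the row totals, confirms that the two cells for every pair of proposals add up to the same number of judges, and orders the proposals by total. The slide does not say whether a cell counts preferences for the row or for the column proposal, so the ordering is only "by row total"; which end is most preferred depends on that unstated convention.

```python
from itertools import combinations

# Cell values copied from the slide's table: COUNTS[row][column].
COUNTS = {
    "A": {"B": 36, "C": 62, "D": 150, "E": 130},
    "B": {"A": 164, "C": 146, "D": 186, "E": 170},
    "C": {"A": 138, "B": 54, "D": 168, "E": 150},
    "D": {"A": 50, "B": 14, "C": 32, "E": 82},
    "E": {"A": 70, "B": 30, "C": 50, "D": 118},
}


def row_totals(counts):
    """Sum each row of the paired-comparison table."""
    return {row: sum(cells.values()) for row, cells in counts.items()}


def pair_sums(counts):
    """For every unordered pair, add the two mirrored cells (should equal the judge count)."""
    return {(a, b): counts[a][b] + counts[b][a] for a, b in combinations(sorted(counts), 2)}


def order_by_total(counts):
    """Proposals sorted from largest to smallest row total."""
    totals = row_totals(counts)
    return sorted(totals, key=totals.get, reverse=True)


if __name__ == "__main__":
    print(row_totals(COUNTS))
    print(set(pair_sums(COUNTS).values()))
    print(order_by_total(COUNTS))
```

Every mirrored pair sums to 200, which suggests 200 judges, and the recomputed totals agree with the Total column.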

Method of Rank Order
• Ask respondents to rank order their choices
• This method is faster than paired comparisons and is usually easier and more motivating to the respondent
• With 7 items, for example, paired comparisons would require 21 judgments, whereas rank ordering requires only one ranking of the 7 items
• Drawbacks
  – Respondents may grow careless in ranking 10 or more items
  – The rank ordering is still an ordinal scale with all of its limitations

Rank Order Data Analysis
• There are several simple ways to combine rankings into an overall index
• Means cannot properly be calculated, but it is possible to compute medians
• The sum of rank values will probably give the best simple indication (?)
• Translate ordinal rank data into an interval scale
  – Normalized-rank
  – Comparative-judgment
• Complete ranking is sometimes not needed
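A small sketch of the two simple indices mentioned above, the sum of ranks and the median rank per item. The three respondents and their rankings are made-up illustration data, and the function names are not from the slides.

```python
from statistics import median

# Hypothetical data: each respondent ranks items A, B, C (1 = first choice).
RANKINGS = {
    "r1": {"A": 1, "B": 2, "C": 3},
    "r2": {"A": 2, "B": 1, "C": 3},
    "r3": {"A": 1, "B": 3, "C": 2},
}


def ranks_per_item(rankings):
    """Collect the ranks each item received across respondents."""
    collected = {}
    for ranks in rankings.values():
        for item, rank in ranks.items():
            collected.setdefault(item, []).append(rank)
    return collected


def rank_sums(rankings):
    """Sum of rank values per item (a lower sum means ranked nearer the top)."""
    return {item: sum(r) for item, r in ranks_per_item(rankings).items()}


def rank_medians(rankings):
    """Median rank per item, which is legitimate for ordinal data."""
    return {item: median(r) for item, r in ranks_per_item(rankings).items()}


if __name__ == "__main__":
    print(rank_sums(RANKINGS))
    print(rank_medians(RANKINGS))
```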

Scale Construction Techniques
• Arbitrary scales
• Consensus scales
  – Differential scales
• Item analysis
  – Likert scales
• Cumulative scales
  – Guttman scalogram
• Factor scales
  – Semantic differential

Arbitrary Scales
• Collect several items that are unambiguous and appropriate to a given topic
• Company image example
• Easy to develop, inexpensive, and can be designed to be highly specific
  – The design approach is subjective
• Data analysis

How do you regard (name) company’s reputation
1. As a place to work?            Bad ___ ___ ___ Good
2. As a sponsor of civic projects? Bad ___ ___ ___ Good
3. For ecological concern?        Bad ___ ___ ___ Good
4. As an employer of minorities?  Bad ___ ___ ___ Good

Consensus Scaling 1
• Requires that the items be selected by a panel of judges who evaluate them on
  – relevance to the topic area
  – potential for ambiguity
  – the level of attitude they represent
• Thurstone differential scales
  – Known as the method of Equal-Appearing Intervals
  – Developed to create an interval rating scale for attitude measurement

Consensus Scaling 2
• Development procedures
  – Often 50 or more judges evaluate a large number of statements (one statement per card)
    • Expressing different degrees of favorableness toward an object
  – The judges sort each card into 1 of 11 piles
    • Representing their evaluation of the degree of favorableness that the statement expresses
    • The judge’s agreement or disagreement with the statement is not involved
  – Three of the 11 piles are identified to the judges by the labels “favorable,” “unfavorable,” and “neutral” (1, 6, 11)

Consensus Scaling 3
  – The scale position for a given statement is found by calculating its median score
  – A measure of dispersion (the interquartile range) is calculated for each statement
    • If a given statement has a large interquartile range, it is judged to be too ambiguous to be used in the final scale
  – Those statements with median scores spread evenly from one extreme to the other and with small interquartile ranges are included in the final attitude scale
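The scale-value step can be sketched as follows. Each statement has a list of pile numbers (1 to 11) assigned by the judges; the median gives the scale value and the interquartile range measures disagreement. The judge data, the cutoff `max_iqr`, and the function names are hypothetical, and the quartile convention (Python's default "exclusive" method) is an assumption, since the slides do not say how the quartiles are computed.

```python
from statistics import median, quantiles


def scale_value_and_iqr(pile_numbers):
    """Return (median pile, interquartile range) for one statement."""
    q1, _, q3 = quantiles(pile_numbers, n=4)  # default 'exclusive' method
    return median(pile_numbers), q3 - q1


def is_too_ambiguous(pile_numbers, max_iqr=2.0):
    """Flag a statement whose judges disagree widely (max_iqr is a made-up cutoff)."""
    _, iqr = scale_value_and_iqr(pile_numbers)
    return iqr > max_iqr


if __name__ == "__main__":
    # Hypothetical pile assignments from seven judges for two statements.
    agreed = [8, 9, 9, 9, 10, 10, 11]
    disputed = [2, 4, 6, 8, 9, 10, 11]
    print(scale_value_and_iqr(agreed), is_too_ambiguous(agreed))
    print(scale_value_and_iqr(disputed), is_too_ambiguous(disputed))
```

A real study would use many more judges per statement (the slides mention 50 or more); seven are used here only to keep the arithmetic easy to follow.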

Scale administration 1
• Respondents read approximately 20 statements and select those items with which they agree
• The mean or median value of the chosen scale items is then calculated as the measure of the respondent’s attitude
• The example is part of a classic 50-item scale that was designed to reveal the attitudes of employees toward their employer
  – The scale values are shown here but would not be on the instrument when it is used

Scale administration 2
• Statements are arranged in random order of scale value
• The typical respondent will choose one or several adjoining items (in terms of scale values)
  – If the values are valid and if the scale deals with only one attitude dimension
• Divergences occur because a statement appears to tap a different attitude dimension
• Differential scales are reliable and are more widely used in academic studies
• The method is costly and time-consuming

Scale Value   Statement
10.4   I think this company treats its employees better than any other company does
 8.9   A person can get ahead in this company if he or she tries
 8.5   The company is sincere in wanting to know what its employees think about it
 5.4   I believe accidents will happen no matter what you do about them
 5.1   The workers put as much over on the company as the company puts over on them
 4.1   Soldiering on the job is on the increase
 2.9   My boss gives all the breaks to his lodge and church friends
 2.5   I think the company goes outside to fill good jobs instead of promoting people who are here
 1.5   In the long run this company will “put it over” on you
 1.0   The pay in this company is terrible
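Scoring a respondent on a differential scale reduces to averaging the scale values of the endorsed statements. The sketch below uses three scale values from the list above; the choice of which statements the respondent endorsed is hypothetical.

```python
from statistics import mean, median


def attitude_score(endorsed_scale_values):
    """Return (mean, median) of the scale values of the statements a respondent agreed with."""
    if not endorsed_scale_values:
        raise ValueError("respondent endorsed no statements")
    return mean(endorsed_scale_values), median(endorsed_scale_values)


if __name__ == "__main__":
    # Hypothetical respondent who agrees with the three most favorable statements.
    endorsed = [10.4, 8.9, 8.5]
    avg, mid = attitude_score(endorsed)
    print(f"mean = {avg:.2f}, median = {mid}")
```

A respondent who endorses adjoining statements, as here, gives a mean and median that are close together; widely scattered endorsements would pull them apart and hint that the statements tap more than one dimension.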

Item Analysis
• Evaluate an item based on how well it discriminates between those persons whose total score is high and those whose total score is low
• The most popular type is the summated scale
  – Consists of statements that express either a favorable or unfavorable attitude toward the object of interest
  – The respondent is asked to agree or disagree with each statement
  – Each response is given a numerical score
  – The scores are totaled to measure the respondent’s attitude
• The most frequently used form is the Likert scale

Likert scale
I consider my job rather unpleasant
Strongly Agree (1) | Agree (2) | Neither Agree nor Disagree (3) | Disagree (4) | Strongly Disagree (5)
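A minimal sketch of summated scoring using the coding shown on the slide, where the unfavorable statement is scored 1 for Strongly Agree through 5 for Strongly Disagree. The reverse coding for favorable statements (so that a higher total always means a more favorable attitude) is an assumption; the slide shows only the unfavorable case.

```python
def item_score(position: int, favorable: bool) -> int:
    """Score one response.

    position: 1 = Strongly Agree ... 5 = Strongly Disagree (as printed on the form).
    favorable: True if the statement expresses a favorable attitude.
    Unfavorable statements keep the printed value; favorable ones are
    reversed (assumption) so agreement earns the higher score.
    """
    if not 1 <= position <= 5:
        raise ValueError("position must be between 1 and 5")
    return 6 - position if favorable else position


def total_score(positions, favorable_flags) -> int:
    """Summated score across all statements for one respondent."""
    return sum(item_score(p, f) for p, f in zip(positions, favorable_flags))


if __name__ == "__main__":
    # Hypothetical two-item form: one unfavorable and one favorable statement.
    flags = [False, True]
    print(total_score([5, 1], flags))  # disagrees with unfavorable, agrees with favorable
    print(total_score([1, 5], flags))  # the opposite pattern
```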

Likert Scale Construction 1
• The first step is to collect a large number of statements that meet two criteria
  – Each statement is believed to be relevant to the attitude being studied
  – Each is believed to reflect a favorable or unfavorable position on that attitude
• Each person’s responses are then added to secure a total score
• The next step is to array these total scores and select some part of the highest and lowest total scores
  – The top 25 percent and the bottom 25 percent are used as criterion groups by which to evaluate individual statements (Table 7-2)

Likert Scale Construction 2
• Item analysis involves calculating the mean scores for each scale item among the low scorers and the high scorers
• The item means of the high-score group and the low-score group are then tested for significance by calculating t values
• Finally, the 20 to 25 items that have the greatest t values are selected for inclusion in the final scale
  – Edwards suggests using only those statements whose t value is 1.75 or greater, provided there are 25 or more subjects in each group
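The item-analysis step can be sketched as a two-group t computation for a single item. The formula used here (difference of means over the square root of the summed variance-over-n terms) is a common form and is an assumption, since the slides do not print the formula; the response data are invented and far smaller than the 25 subjects per group the slide recommends.

```python
from math import sqrt
from statistics import mean, variance


def item_t_value(high_group, low_group):
    """t value for one item: high-scorer mean versus low-scorer mean.

    Uses t = (mean_H - mean_L) / sqrt(s_H^2 / n_H + s_L^2 / n_L),
    with sample variances (an assumed, commonly used form).
    """
    standard_error = sqrt(
        variance(high_group) / len(high_group) + variance(low_group) / len(low_group)
    )
    if standard_error == 0:
        raise ValueError("both groups have zero variance; t is undefined")
    return (mean(high_group) - mean(low_group)) / standard_error


def discriminates(high_group, low_group, cutoff=1.75):
    """Apply the 1.75 cutoff quoted on the slide."""
    return item_t_value(high_group, low_group) >= cutoff


if __name__ == "__main__":
    # Hypothetical item scores for the top and bottom quartile of total scorers.
    high = [4, 5, 4, 5]
    low = [1, 2, 1, 2]
    print(round(item_t_value(high, low), 3), discriminates(high, low))
```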

Advantages of Likert Scales
• Easy and quick to construct
• Each item has met an empirical test for discriminating ability
• More reliable than the Thurstone scale
• Provides a greater volume of data than does the Thurstone differential scale
• Easy to use in both respondent-centered and stimulus-centered studies
• It is also treated as an interval scale
• One can develop scales of this type in an arbitrary manner

Cumulative Scales
• The major scale of this type is the Guttman scalogram
• Scalogram analysis is a procedure for determining whether a set of items forms a unidimensional scale
  – A scale is said to be unidimensional if the responses fall into a pattern in which endorsement of the item reflecting the extreme position results also in endorsing all items that are less extreme

1. Style X is good looking
2. I will insist on style X next time because it is great looking
3. The appearance of style X is acceptable to me
4. I prefer style X to other styles

Ideal Scalogram Response Pattern

        Item              Respondent
  2    4    1    3        Score
  X    X    X    X        4
  -    X    X    X        3
  -    -    X    X        2
  -    -    -    X        1
  -    -    -    -        0

Guttman Scale Development 1
• One first defines the universe of content
  – Viewer attitude toward TV advertising
• The second step is to develop items that can be used in a pretest that tells us whether this topic is scalable
  – Guttman suggests that a pretest include 12 or more items, while the final scale may have only 4 to 6 items
  – Pretest respondent numbers may be small, but final scale use should involve 100 or more respondents

Guttman Scale Development 2
• Take the pretest results and order the respondents from top to bottom
  – Most favorable to least favorable
• The next step is to discard those statements that fail to discriminate well between favorable and unfavorable respondents
• Finally, calculate a Coefficient of Reproducibility (CR)
• CR should be 0.90 or better for a scale to be considered unidimensional

  $\text{CR} = 1 - \frac{e}{nN}$

  e is the number of errors
  n is the number of items
  N is the number of cases
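A sketch of the reproducibility calculation. The CR formula is the one given above; the way errors are counted is an assumption, because the slides do not define it. Here each respondent's answers are compared with the ideal cumulative pattern for that respondent's total score, and every mismatched cell counts as one error. The response data are invented.

```python
def ideal_pattern(score: int, n_items: int):
    """Ideal cumulative pattern, with items ordered from most to least extreme.

    A respondent with a given score endorses the `score` least extreme items.
    """
    return [0] * (n_items - score) + [1] * score


def count_errors(responses):
    """Count cells that deviate from each respondent's ideal pattern (assumed error rule)."""
    errors = 0
    for row in responses:
        ideal = ideal_pattern(sum(row), len(row))
        errors += sum(actual != expected for actual, expected in zip(row, ideal))
    return errors


def coefficient_of_reproducibility(responses) -> float:
    """CR = 1 - e / (n * N), with n items and N cases."""
    n_cases = len(responses)
    n_items = len(responses[0])
    return 1 - count_errors(responses) / (n_items * n_cases)


if __name__ == "__main__":
    # Hypothetical pretest: 1 = endorsed, columns ordered most to least extreme.
    data = [
        [1, 1, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [0, 0, 0, 0],
        [1, 0, 1, 1],  # endorses the extreme item but skips a milder one
    ]
    print(count_errors(data), round(coefficient_of_reproducibility(data), 3))
```

Other error-counting conventions exist and can give a different e for the same data, so the rule used should be reported alongside the coefficient.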

Factor Scales
• Include a variety of techniques that have been developed for two problems
  – How to deal with a universe of content that is multidimensional
  – How to uncover underlying dimensions that have not been identified
• These techniques are designed to intercorrelate items so their degree of interdependence may be detected
• The semantic differential is based on factor analysis

Semantic Differential
• Developed by Osgood and his associates
  – Measures the psychological meaning of an object to an individual
• Is based on the proposition that an object can have several dimensions of connotative meaning
  – The meanings are located in a multidimensional property space, called semantic space

Good : : : Bad

Semantic Differential 2
• The scale developers produced a long list of adjective pairs useful for attitude research
• They chose 20 concepts whose psychological meaning they wished to probe
• Some concepts illustrated
  – Person concepts
  – Abstract concepts
  – Event concepts
  – Institutions
  – Physical concepts
• Three factors contributed most to meaningful judgment by respondents

Semantic Differential 3
  – Evaluation
  – Potency or power
  – Activity
• Occasionally, the potency and activity dimensions combined to form dynamism
• Lesser dimensions
  – Stability
  – Tautness
  – Novelty
  – Receptivity

Semantic Differential Development
• The first step is to select the concepts to be studied
  – Nouns, noun phrases, nonverbal stimuli (visual sketches)
  – Concepts are chosen by judgment and reflect the nature of the problem under study
• Select the original bipolar word pairs or tailor-made scales
  – Three criteria should guide their selection
    • The factor composition (Table 7-4)
    • The scale’s relevance to the concepts being judged
    • The scale should be stable across subjects and concepts

SD Illustration

(E) Sociable (7)        Unsociable (1)
(P) Weak (1)            Strong (7)
(A) Active (7)          Passive (1)
(E) Progressive (7)     Regressive (1)
(P) Yielding (1)        Tenacious (7)
(A) Slow (1)            Fast (7)
(E) True (7)            False (1)
(P) Heavy (7)           Light (1)
(A) Hot (7)             Cold (1)
(E) Unsuccessful (1)    Successful (7)

SD Scale for Analyzing Candidates for an Industry Leadership Position
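The pole values in the illustration show that some scales are reversed, so a mark's position must be converted to a score before the evaluation (E), potency (P), and activity (A) items are averaged. The sketch below assumes seven positions per scale counted from the left, and averaging within each factor; both are assumptions, and the respondent's marks are invented.

```python
from statistics import mean

# (factor, value of the LEFT pole) for the ten scales, in the order shown on the slide.
SCALES = [
    ("E", 7),  # Sociable - Unsociable
    ("P", 1),  # Weak - Strong
    ("A", 7),  # Active - Passive
    ("E", 7),  # Progressive - Regressive
    ("P", 1),  # Yielding - Tenacious
    ("A", 1),  # Slow - Fast
    ("E", 7),  # True - False
    ("P", 7),  # Heavy - Light
    ("A", 7),  # Hot - Cold
    ("E", 1),  # Unsuccessful - Successful
]


def factor_scores(positions):
    """Average score per factor from mark positions (1 = leftmost ... 7 = rightmost)."""
    if len(positions) != len(SCALES):
        raise ValueError("one position per scale is required")
    by_factor = {}
    for (factor, left_value), position in zip(SCALES, positions):
        if not 1 <= position <= 7:
            raise ValueError("positions must be between 1 and 7")
        # When the left pole is worth 7, the leftmost mark scores 7, so reverse it.
        score = 8 - position if left_value == 7 else position
        by_factor.setdefault(factor, []).append(score)
    return {factor: mean(scores) for factor, scores in by_factor.items()}


if __name__ == "__main__":
    # Hypothetical respondent who marks the leftmost position on every scale.
    print(factor_scores([1] * 10))
```

Mixing the direction of the poles, as the illustration does, means that a respondent who marks the same side throughout does not get a uniformly high or low profile.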

Advanced Scaling Techniques
• Multidimensional scaling
  – Describes a collection of techniques that deal with property space in a more general manner than the semantic differential
• Conjoint analysis
  – Used to measure complex decision making that requires multiattribute judgments