Research Methods in International Business Chapter 11 Measurement
Research Methods in International Business Chapter 11 Measurement Questions Dr. Alex Settles University of Florida alexander.settles@aalto.fi alex.settles@gmail.com
Learning Objectives Understand… • The role of a preliminary analysis plan in developing measurement questions. • The critical decisions involved in selecting an appropriate measurement scale for a measurement question. • The characteristics and use of various questions based on the categories of scales: rating, ranking, sorting, and other scales. • The factors that influence a specific measurement question.
Research Thought Leader “We increase measurement error when we ask the wrong questions. The questions we ask need to be answerable, clear, unbiased, and easy to answer.” David F. Harris, president, Insight and Measurement; author, The Complete Guide to Writing Questionnaires: How to Get Better Information for Better Decisions
Relationship of Questions to Scales Measurement Question Measurement Scale Data Level
Instrument Design in the Research Process
Instrument Design: Phase 1
Communication Approach Computer Personal Interview Phone Mail/ Fax/ Courier
Instrument Structure Issues Question Structure Unstructured Structured Objective of the Study Participant’s Level of Information Level of Thought on Topic Communication Ease Motivation to Share
Factors Affecting Concealment of Purpose & Sponsor Purpose Concealment Sponsor Concealment Disguised Questions Undisguised Questions Types of information Willingly Shared, Conscious-level Reluctantly shared, conscious-level Knowable, limited-conscious-level Subconscious-level
Preliminary Analysis Plan: Dummy Tables
Summary of Scales by Data Levels
Measurement Questions: Select the Scales
Factors Affecting Measurement Scale Selection • Research objectives • Response types • Number of dimensions • Balanced or unbalanced • Forced or unforced choices • Number of scale points • Rater errors
Factors Affecting Measurement Scale Selection • Attitude scaling is the process of assessing an attitudinal disposition using a number that represents a person’s score on an attitudinal continuum ranging from an extremely favorable disposition to an extremely unfavorable one. • Scaling is the procedure for the assignment of numbers to a property of objects in order to impart some of the characteristics of numbers to the properties in question. • Selecting and constructing a measurement scale requires the consideration of several factors that influence the reliability, validity, and practicality of the scale. These factors are listed in the slide. • Researchers face two types of scaling objectives: 1. to measure characteristics of the participants who participate in the study, and 2. to use participants as judges of the objects or indicants presented to them. • Measurement scales fall into one of four general response types: rating, ranking, categorization, and sorting. These are discussed further on the following slide. • Decisions about the choice of measurement scales are often made with regard to the data properties generated by each scale: nominal, ordinal, interval, and ratio. • Measurement scales are either • unidimensional or multidimensional, • balanced or unbalanced, • forced or unforced.
Research Objectives Measure characteristics of participants Use participants as judges
Response Types • Rating questions • Ranking questions • Categorization questions • Sorting questions
Number of Dimensions • With a unidimensional scale, one seeks to measure only one attribute of the participant or object. One measure of an actor’s star power is his or her ability to “carry” a movie. It is a single dimension. • A multidimensional scale recognizes that an object might be better described with several dimensions. The actor’s star power variable might be better expressed by three distinct dimensions - ticket sales for the last three movies, speed of attracting financial resources, and column-inch/amount of TV coverage of the last three movies.
Balanced or Unbalanced How good an actress is Emma Stone? Balanced: Very bad / Bad / Neither good nor bad / Good / Very good. Unbalanced: Poor / Fair / Good / Very good / Excellent.
Forced or Unforced Choices How good an actress is Emma Stone? Forced: Very bad / Bad / Neither good nor bad / Good / Very good. Unforced: the same choices plus No opinion / Don’t know.
Number of Scale Points How good an actress is Emma Stone? Five points: Very bad / Bad / Neither good nor bad / Good / Very good. Seven points: Very bad / Somewhat bad / A little bad / Neither good nor bad / A little good / Somewhat good / Very good.
Reduce Rater Errors Error of central tendency, error of leniency, error of strictness: • Adjust strength of descriptive adjectives • Space intermediate descriptive phrases farther apart • Provide smaller differences in meaning between terms near the ends of the scale • Use more scale points
Reduce Rater Errors Primacy effect, recency effect: • Reverse order of alternatives periodically or randomly
Reduce Rater Errors Halo effect: • Rate one trait at a time • Reveal one trait per page • Reverse anchors periodically
Factors Affecting Participant Honesty Peacock Ignoramus Unconscious Decision Maker Self-Delusionist Pleaser Syndrome Gamer Disengager
Nature of Attitudes Cognitive: I think oatmeal is healthier than corn flakes for breakfast. Affective: I hate corn flakes. Behavioral: I intend to eat more oatmeal for breakfast.
Predicting Behavior from Attitudes Specific Multiple measures Reference groups Strong Factors Direct Basis
Simple Category Question I plan to purchase a laptop computer in the next 12 months. Yes / No
Multiple-Choice, Single-Response Question What newspaper do you read most often for financial news? East City Gazette / West City Tribune / Regional newspaper / National newspaper / Other (specify: _____)
Multiple-Choice, Multiple-Response Question Check any of the sources you consulted when designing your new home. Online planning services / Magazines / Independent contractor/builder / Designer / Architect / Other (specify: _______)
Likert Scale-based Question The Internet is superior to traditional libraries for comprehensive searches. Strongly Disagree / Disagree / Neither Agree nor Disagree / Agree / Strongly Agree
Likert scale • The Likert scale was developed by Rensis Likert and is the most frequently used variation of the summated rating scale. • Summated rating scales consist of statements that express either a favorable or unfavorable attitude toward the object of interest. • The participant is asked to agree or disagree with each statement. • Each response is given a numerical score to reflect its degree of attitudinal favorableness and the scores may be summed to measure the participant’s overall attitude. • Questions based on Likert-like scales may use 7 or 9 scale points. • They are quick and easy to construct. • The scale produces interval data.
Likert scale • Originally, creating a Likert scale involved a procedure known as item analysis. • Item analysis assesses each item based on how well it discriminates between those people whose total score is high and those whose total score is low. • It involves calculating the mean scores for each scale item among the low scorers and the high scorers. • The mean scores for the high-score and low-score groups are then tested for statistical significance by computing t values. • After finding the t values for each statement, the statements are rank-ordered, and those statements with the highest t values are selected. • Researchers have found that a larger number of items for each attitude object improves the reliability of the scale.
How to perform a Likert Item Analysis 1. Collect statements. 2. Select participant stand-ins. 3. Test statements with stand-ins. 4. Add each participant stand-in’s total score. 5. Array scores from highest to lowest. 6. Calculate a mean score per statement. 7. Test mean scores for statistical significance. 8. Rank-order statements by t value. 9. Select the 20-25 statements with the highest t values.
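The item-analysis steps above can be sketched in Python. Several details here are assumptions rather than the slides' method: the high and low groups are taken as the top and bottom 25% of total scores, the t value uses the unequal-variance (Welch) form, which matches the pooled form when both groups are the same size, and the pilot data are invented for illustration.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance


def item_t_values(scores, fraction=0.25):
    """Return one t value per item, comparing high and low total scorers.

    scores: one list of item scores per participant stand-in.
    fraction: share of stand-ins in each extreme group (assumed cut-off).
    """
    ranked = sorted(scores, key=sum, reverse=True)  # totals, highest first
    k = max(2, int(len(ranked) * fraction))         # need >= 2 for a variance
    high, low = ranked[:k], ranked[-k:]
    t_values = []
    for i in range(len(scores[0])):
        h = [row[i] for row in high]
        l = [row[i] for row in low]
        diff = mean(h) - mean(l)
        se = sqrt(variance(h) / len(h) + variance(l) / len(l))
        if se == 0:
            # No spread in either group: t is undefined, so report 0 for
            # identical means and a signed infinity otherwise.
            t_values.append(0.0 if diff == 0 else copysign(inf, diff))
        else:
            t_values.append(diff / se)
    return t_values


# Hypothetical pilot data: 8 stand-ins, 3 statements, 5-point scale.
pilot = [
    [5, 5, 3], [5, 4, 3], [4, 4, 2], [4, 3, 4],
    [3, 3, 3], [2, 3, 3], [2, 1, 3], [1, 2, 4],
]
t_values = item_t_values(pilot)
# Rank-order statements by t value, most discriminating first.
ranking = sorted(range(len(t_values)), key=lambda i: t_values[i], reverse=True)
```

In this made-up data the third statement gets a negative t value, meaning low scorers rate it slightly higher than high scorers, so it would be among the first statements dropped.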
Evaluating a Scale Statement by Item Analysis
Semantic Differential-based Question
Semantic Differential-based Question • The semantic differential scale measures the psychological meanings of an attitude object using bipolar adjectives. • Researchers use this scale for studies of brand institutional image, employee morale, safety, financial soundness, trust, etc. • The method consists of a set of bipolar rating scales, usually with 7 points, by which one or more participants rate one or more concepts on each scale item. • The scale is based on the proposition that an object can have several dimensions of connotative meaning. The meanings are located in multidimensional property space, called semantic space. • The semantic differential scale is efficient and easy for securing attitudes from a large sample. Attitudes may be measured in both direction and intensity. The total set of responses provides a comprehensive picture of the meaning of an object and a measure of the person doing the rating. It is standardized and produces interval data.
How to Construct an SD Question Select the variable Identify nouns, noun phrases, adjectives, visual stimuli that represent the variable Select bipolar pairs, phrase pairs, or visual pairs (3 for each dimension) Evaluation Potency Activity Create Scoring System Order the Bipolar Pairs Half: Positive left Half: Positive right No pairs for one dimension together
How to Construct an SD Scale
Adapting Semantic Differential-based Scales
Convenience of Reaching the Store from Your Location
Nearby ___: ___: Distant
Short time required to reach store ___: ___: Long time required to reach store
Difficult drive ___: ___: Easy drive
Difficult to find parking place ___: ___: Easy to find parking place
Convenient to other stores I shop ___: ___: Inconvenient to other stores I shop
Products Offered
Wide selection of different kinds of products ___: ___: Limited selection of different kinds of products
Fully stocked ___: ___: Understocked
Undependable products ___: ___: Dependable products
High quality ___: ___: Low quality
Numerous brands ___: ___: Few brands
Unknown brands ___: ___: Well-known brands
SD Scale for Analyzing Industry Association Candidates
Graphic Representation of Semantic Differential-based Analysis
Numerical Rating Question
Numerical Rating Question • Numerical rating questions have equal intervals that separate their numeric scale points. • The verbal anchors serve as the labels for the extreme points. • Numerical scales on which these questions are based are often 5-point scales but may have 7 or 10 points. • The participants write a number from the scale next to each item. • It produces either ordinal or interval data.
Multiple Rating List Question “Please indicate the importance of each service characteristic:”
                                  IMPORTANT           UNIMPORTANT
Fast, reliable repair             7  6  5  4  3  2  1
Service at my location            7  6  5  4  3  2  1
Maintenance by manufacturer       7  6  5  4  3  2  1
Knowledgeable technicians         7  6  5  4  3  2  1
Notification of upgrades          7  6  5  4  3  2  1
Service contract after warranty   7  6  5  4  3  2  1
Multiple Rating List Question • A multiple rating list question is similar to the numerical scale but differs in two ways: 1) it accepts a circled response from the rater, and 2) the layout facilitates visualization of the results. • The advantage is that a mental map of the participant’s evaluations is evident to both the rater and the researcher. • This scale produces interval data.
Stapel Scale-based Question
Stapel Scale-based Question • The Stapel scale is used as an alternative to the semantic differential, especially when it is difficult to find bipolar adjectives that match the investigative question. • In the example, there are three attributes of corporate image. • The scale is composed of the word identifying the image dimension and a set of 10 response categories for each of the three attributes. • Questions based on Stapel scales produce interval data.
Constant-Sum Question
Constant-sum question • The constant-sum question helps researchers to discover proportions. • The participant allocates points to more than one attribute or property indicant, such that they total a constant sum, usually 100 or 10. • Participant precision and patience suffer when too many stimuli are proportioned and summed. • A participant’s ability to add may also be taxed (this problem is overcome in a computer-based measurement instrument, as the computer keeps a running tally of the points allocated). • Its advantage is its compatibility with percent and the fact that alternatives that are perceived to be equal can be so scored. • This type of question produces interval data.
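A computer-based instrument of the kind mentioned above might keep the running tally along these lines. This is a minimal sketch: the function names and the attribute labels are hypothetical, and 100 is used as the constant sum.

```python
def points_remaining(allocation, total=100):
    """Running tally: points the participant still has to allocate."""
    return total - sum(allocation.values())


def constant_sum_shares(allocation, total=100):
    """Convert a completed allocation into proportions.

    Raises ValueError if the points do not add up to the constant sum.
    """
    if points_remaining(allocation, total) != 0:
        raise ValueError("points allocated must total the constant sum")
    return {attribute: points / total for attribute, points in allocation.items()}


# Hypothetical allocation across three attributes.
allocation = {"price": 50, "service": 30, "location": 20}
shares = constant_sum_shares(allocation)

# A partly completed response: the tally tells the participant what is left.
left = points_remaining({"price": 50, "service": 30})
```

Because each share is simply points divided by the constant sum, the result reads directly as a percentage, and two attributes given equal points receive equal shares.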
Verbal Graphic Rating Question
Verbal Graphic Rating Question • The graphic rating scale was originally created to enable researchers to discern FINE differences. • Theoretically, an infinite number of ratings is possible if participants are sophisticated enough to differentiate and record them. • They are instructed to mark their response at any point along a continuum. • Usually, the score is a measure of length from either endpoint. • The difficulty is in coding and analysis. • Verbal graphic rating questions are anchored with words or phrases. This type of question generates interval data. • Visual Graphic rating questions use pictures, icons, or other visuals to communicate with the rater and represent a variety of data types. • Visual Graphic scales are often used with children.
Visual Graphic Rating Question
Visual Graphic Rating Question • The graphic rating scale was originally created to enable researchers to discern FINE differences. • Theoretically, an infinite number of ratings is possible if participants are sophisticated enough to differentiate and record them. • They are instructed to mark their response at any point along a continuum. • Usually, the score is a measure of length from either endpoint. • The difficulty is in coding and analysis. • Visual Graphic rating questions use pictures, icons, or other visuals to communicate with the rater and represent a variety of data types. • Visual Graphic scales are often used with children. • This type of question generates interval data.
Characteristics of Scale Types
Ranking Questions • Paired-comparison • Forced ranking • Comparative
Paired-Comparison Question
Paired-Comparison Question • Using the paired-comparison question, the participant can express attitudes unambiguously by choosing between two objects. • The number of judgments required in a paired comparison is n(n - 1)/2, where n is the number of stimuli or objects to be judged. • Paired comparisons run the risk that participants will tire to the point that they give ill-considered answers or refuse to continue. • Paired comparison questions provide ordinal data.
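To see how quickly the number of judgments grows, the pairs can be enumerated directly. A minimal sketch follows; the stimulus labels are placeholders.

```python
from itertools import combinations


def paired_comparisons(stimuli):
    """List every unordered pair of stimuli a participant must judge."""
    return list(combinations(stimuli, 2))


# Hypothetical stimuli: five objects to be compared.
pairs = paired_comparisons(["A", "B", "C", "D", "E"])
n = 5
expected = n * (n - 1) // 2  # the n(n - 1)/2 formula
```

With five stimuli the enumeration yields ten pairs, which agrees with the formula; with ten stimuli it would already be 45 judgments, which is why participant fatigue becomes a concern.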
Paired Comparison Response Patterns
Forced Ranking Question
Forced Ranking Question • The forced ranking question lists attributes that are ranked relative to each other. • This method is faster than paired comparisons and is usually easier and more motivating to the participant. • With five items, it takes ten paired comparisons to complete the task, but the simple forced ranking of five is easier. • A drawback of this scale is the limited number of stimuli (usually no more than 7) that can be handled by the participant. • This type of question produces ordinal data.
Comparative Questions
Comparative Questions • When using a question based on a comparative scale, the participant compares an object against a standard. • The comparative scale is ideal for such comparisons if the participants are familiar with the standard. Some researchers treat the data produced by comparative scales as interval data since the scoring reflects an interval between the standard and what is being compared, but the text recommends treating the data as ordinal unless the linearity of the variables in question can be supported.
Cumulative scale • With a cumulative scale, a participant’s agreement with one extreme scale item endorses all other items that take a less extreme position. • A pioneering scale of this type was the scalogram. • Scalogram analysis is a procedure for determining whether a set of items forms a unidimensional scale. • A question using this scale as its foundation is unidimensional if the responses fall into a pattern in which endorsement of the item reflecting the extreme position results in endorsing all items that are less extreme. • The scalogram and similar procedures for discovering underlying structure are useful for assessing attitudes and behaviors that are highly structured, such as social distance, organizational hierarchies, and evolutionary product stages.
Ideal Scalogram Pattern
Item 2  Item 4  Item 1  Item 3  Participant Score
X       X       X       X       4
__      X       X       X       3
__      __      X       X       2
__      __      __      X       1
__      __      __      __      0
* X = agree; __ = disagree.
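The cumulative property can be checked mechanically. The sketch below assumes each response pattern lists agreement flags ordered from the most extreme item to the least extreme one; the function names are hypothetical, and this is only a pattern check, not a full scalogram analysis.

```python
def is_cumulative(pattern):
    """True if agreeing with an item implies agreeing with every less extreme item.

    pattern: agreement flags ordered from most extreme to least extreme item.
    """
    # For each neighboring pair, agreement with the more extreme item (a)
    # must be accompanied by agreement with the less extreme one (b).
    return all((not a) or b for a, b in zip(pattern, pattern[1:]))


def participant_score(pattern):
    """Score is the number of items endorsed."""
    return sum(pattern)


# The five ideal patterns for four items, scores 4 down to 0.
ideal = [
    [True, True, True, True],
    [False, True, True, True],
    [False, False, True, True],
    [False, False, False, True],
    [False, False, False, False],
]
scores = [participant_score(p) for p in ideal]

# A pattern that breaks the scale: endorses the most extreme item
# but rejects a less extreme one.
deviant = [True, False, True, True]
```

In an ideal scale the score alone identifies exactly which items were endorsed; a pattern such as the deviant one has the same score as an ideal pattern but cannot be reproduced from it.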
Sorting Questions • Select descriptors (>60, <120) • Create cards, shuffle • Sort cards into piles (between 7 and 11) • Structured sort or unstructured sort • Arrange piles: left = most favorable, right = least favorable
Sorting Questions 1. The basic Q-sort procedure involves the selection of a set of verbal statements, phrases, single words, or photos related to the concept being studied. 2. For statistical stability, the number of cards should not be less than 60, and, for convenience, not be more than 120. 3. After the cards are created, they are shuffled, and the participant is instructed to sort the cards into a set of piles (usually 7 to 11), each pile representing a point on the judgment continuum. • In the case of a structured sort, the distribution of cards allowed in each pile is predetermined. • With an unstructured sort, only the number of piles will be determined. 4. The criteria are arranged so that the left-most pile represents the concept statements that are “most valuable,” “favorable,” and “agreeable.” The right-most pile contains the least favorable criteria.
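The guidelines above lend themselves to a simple check of a completed sort. The sketch below is illustrative only: the function name, the message wording, and the 60-card quota are hypothetical, and the 7-to-11 pile range is treated as a guideline to flag, since the slides describe it as usual rather than required.

```python
def q_sort_problems(piles, quota=None):
    """Return a list of guideline violations for a completed sort.

    piles: lists of card ids, ordered left (most favorable) to right.
    quota: cards allowed per pile for a structured sort, or None for
           an unstructured sort.
    """
    problems = []
    n_cards = sum(len(pile) for pile in piles)
    if not 60 <= n_cards <= 120:
        problems.append(f"{n_cards} cards; guideline is 60 to 120")
    if not 7 <= len(piles) <= 11:
        problems.append(f"{len(piles)} piles; usual range is 7 to 11")
    if quota is not None and [len(pile) for pile in piles] != list(quota):
        problems.append("pile sizes do not match the predetermined distribution")
    return problems


# Hypothetical structured sort: 60 cards over 7 piles.
quota = [3, 7, 12, 16, 12, 7, 3]
cards = iter(range(60))
piles = [[next(cards) for _ in range(size)] for size in quota]

ok = q_sort_problems(piles, quota)

# An unstructured sort only fixes the number of piles, so no quota is passed.
too_few = q_sort_problems([list(range(30)), list(range(30, 60))])
```

Passing `quota=None` models the unstructured sort, where the participant decides how many cards go in each pile.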
Find or Craft Measurement Questions Question Coverage Question wording Question Frame of Reference Response Alternatives
Question Design • Question coverage refers to the number of questions needed to adequately measure the variable as operationally defined. • The more complex the variable, the more questions it is likely to take to provide full coverage. • Question wording means • finding shared vocabulary between the researcher and the participant • avoiding leading questions • avoiding double-barreled questions • avoiding unsupported assumptions within questions
Find or Craft Measurement Questions Question Coverage Personalization Question wording Shared vocabulary Question Frame of Reference Leading questions Double-barreled questions Unsupported assumptions Response Alternatives
Find or Craft Measurement Questions Question Coverage Question wording Question Frame of Reference Response Alternatives Role Behavior time frame Behavior cycle Behavior frequency Memory Decay
Find or Craft Measurement Questions Question Coverage Question wording Question Frame of Reference Response Alternatives Unstructured Structured 90% of responses Recency Effect Primacy Effect Central Tendency
Summary of Issues Related to Measurement Questions
Summary of Issues Related to Measurement Questions (cont.)
Summary of Issues Related to Measurement Questions (cont.)
Instrument Design: Phase 1
Pretest Measurement Questions Participant Surrogates Research Colleagues Unanswerable questions Difficult-to-answer Questions Inappropriate Rating scales Questions missing frame of reference Questions that need instructions Questions that need operational definitions
Key Terms • Acquiescence bias • Attitude scaling • Balanced rating scale • Behavior time frame • Behavior cycle • Behavior frequency • Categorization • Checklist • Comparative scale • Constant-sum scale • Cumulative scale • Dichotomous question • Disguised question • Double-barreled question • Dummy table • Error of central tendency • Error of leniency • Error of strictness • Forced-choice rating scale • Forced ranking scale • Graphic rating scale
Key Terms (cont.) • Halo effect (error) • Hypothetical construct • Leading question • Likert scale • Measurement instrument • Measurement question • Measurement scale • Memory scale • Multidimensional scale • Multiple-choice, multiple-response scale • Multiple-choice, single-response scale • Multiple choice question • Multiple rating list scale • Numerical scale • Paired-comparison scale • Pretesting • Preliminary analysis plan • Primacy effect • Q-sort • Ranking scale
Key Terms (cont.) • Rating scale • Rating question • Recency effect • Scaling • Scalogram analysis • Semantic differential • Simple category scale • Sorting • Stapel scale • Summated rating scale • Structured response • Structured question • Unbalanced rating scale • Unforced-choice rating scale • Unidimensional scale • Unstructured response • Unstructured question • Unsupported assumption question • Visual graphic rating scale