Design and Construction of Measures
Lee Sechrest, Katherine McKnight, & Mei-kuang Chen

How NOT to Construct a measure 11/11/2009 Design and Construction of Measures, EGAD, Orlando, AEA

Focus
- The concerns we are going to talk about should not be avoided
- Think hard(er) about measurement
- Measures should be deliberately and carefully conceptualized, designed, and constructed
- Types of measures and related assumptions, construction, interpretation, reliability and validity issues

Why measures matter: “Garbage in, garbage out”!
- Gang involvement: single item in the instrument
- Treatment dosage (Treatment Received Scale): direct, family, and external types of substance abuse treatment services in the past 90 days
- Family treatment:
  - Work with you at your home?
  - Meet with family members of yours more than one time?
  - Work with members of your family on communication?
  - Hook your family up with services?

What are Composite Variables?
- Two or more measures are combined to form a single value of a measure
- Individual indicators are compiled to a single index based on an underlying model
- Ideally measure multi-dimensional concepts that cannot be captured by a single indicator alone

Simple example
- Count data: The U of A football record in the PAC-10 is now 4 wins and 1 loss
- The U of A football record is now (4 wins/5 games) = .80: a composite measure

Examples from Program Evaluation
- Socioeconomic Status
  - Earned income
  - Education
  - Zip code
- Quality of Life
  - Health status (physical & mental)
  - Employment status
  - Social support
- Quality of Law School
  - LSAT entrance scores
  - Job placement
  - Starting salary

Some things do NOT make sensible composites!

Before constructing a measure
- Think more about your construct
- Theory informs how we construct our measures
  - Indicators of the construct
  - Measurement model (e.g., latent vs. emergent)
- In the end your measure should map onto your understanding of the construct

There is nothing more practical than a good theory

Types of composites
- Measured (observed) variable
- Common factor: latent variable measures
- Quasi-latent variable measures: constructed variables with (implicit) assumed model
- Emergent (constructed) variables with defined weights

Measured (observed) variables
- Defined by the specific measure: e.g., income = paycheck; height = inches; speed = distance/time
- Has no other meaning
- Assumed to be measured without error
  - Is this an accurate assumption?

Common factor/latent variable
- Assume an underlying factor that causes its indicators, i.e., all indicators have a common cause
- Therefore, all indicators are positively correlated
- The value (quantity) of the unobserved (latent) variable can be inferred from the values of its indicators

[Diagram: the latent variable Intelligence and its effects indicators: Vocabulary, Short-term memory, Problem-solving, Pattern perception, Factual knowledge]

Common factor/latent variable (cont’d)
- Because all indicators have variance in common, they are to that degree equivalent
- Therefore, missing indicators can be dealt with by using means of values for observed indicators
- The weights given to indicators are determined by their statistical inter-relationships
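The mean-of-observed-indicators idea can be sketched as follows. This is a minimal illustration, assuming the indicators are already on a common scale (for example, z-scores) and that `None` marks a missing value; the function name and the data are hypothetical, not taken from the presentation.

```python
from statistics import mean


def person_mean_score(indicators):
    """Score a respondent as the mean of the indicators that were observed.

    Assumption: every indicator reflects the same common factor, so any
    observed subset can stand in for the full set.
    """
    observed = [value for value in indicators if value is not None]
    if not observed:
        raise ValueError("no observed indicators to score")
    return mean(observed)


# Hypothetical standardized indicator values for two respondents.
complete = [0.5, 1.0, 0.0, 0.5]
partial = [0.5, None, 0.0, 1.0]  # second indicator is missing

score_complete = person_mean_score(complete)
score_partial = person_mean_score(partial)
```

This shortcut only makes sense under the common factor model; for quasi-latent or emergent variables a dropped indicator changes what the composite means.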

Quasi-Latent & Emergent Variables
- Constructed variables with (implicit) assumed model: variables are constructed from conceptual models that should exhaust their intended meaning
- Weights for indicators cannot be determined statistically because they are not necessarily correlated
- They must be specified:
  - Expert judgment
  - Analytic approach
  - Criterion
  - Arbitrarily

[Diagram: the quasi-latent or emergent variable Quality of life and its causal indicators: Physical Health, Mental Health, Financial Status, Job Satisfaction, Relationship With Spouse]

Quasi-Latent & Emergent Variables (Cont’d)
- Indicators need not be correlated; correlation of indicators is not verified
- Missing values cannot be imputed
- Any indicator omitted or added changes the definition of the variable
- Error is assumed but not corrected for

Example: Gang involvement – What kind of measure?
- Is it a measured variable?
  - “Are you part of a gang?” YES/NO
- Is it a latent variable?
  - What does “gang” mean? Entitativity (Campbell): similarity, proximity, high coefficient of common fate
  - What does involvement mean? Loyalty, frequency of contact
- Is it a quasi-latent or emergent variable?
  - Frequency of defined gang activities (e.g., “tagging,” “jumping” members into the group) + self-identification as a gang member + reported loyalty

Defining the construct
- Think carefully about what it is you want to measure
- Then think carefully about what it is you want to measure and what you want that measure to be used for
- The measure and what it is to be used for ought to be fairly closely related

Defining the construct
- Theoretical framework: weighting and aggregation
- Take method variance into consideration
- To the extent that we rely on a single method of measurement, the variance associated with that method is integral to the definition of the variable

Defining the construct
- There is not a method more common, more used and abused, than self-report
- Are you interested in:
  - Self-reported gang involvement
  - Self-reported depression
  - Self-reported income
  - Self-reported job satisfaction
  - Self-reported social support
  - Self-reported intelligence
  - Self-reported substance use problem

Defining the construct
- Best way to deal with unwanted method variance: multiple measures that are not subject to the same sources of error
- Look at other behavior, not just the behavior of filling out a self-report measure
- “I like a look of anguish, because I know it’s true” (Emily Dickinson)

Behavioral measures
- They are often difficult and expensive to acquire
- They may be useful even if they are available on only a portion of the sample

Creating composite variables
Use our knowledge & conceptions about a given construct to:
- Identify appropriate indicators
- Determine the best method for combining the indicators
  - Linear vs. nonlinear
  - Weighting schemes
  - Scaling

Identifying Key Indicators
- Errors of Omission & Commission
- Latent/Common Factors: universe of correlated indicators, random subset serves equally
- Quasi-Latent & Emergent Variables: may be uncorrelated, defined by their indicators
  - Emergent Variables are the result of the indicators

Examples
- Latent/Common Factors: Intelligence
  - E.g., short-term memory (recall a string of #s), vocabulary, spatial ability…
- Quasi-Latent: Good Teacher
  - Personality, content knowledge, organizational skills, behavioral management…
- Emergent: Livable Cities
  - K-12 education, housing price, clean air, climate, traffic…

Constructing the Algorithm for Combining Indicators
Should be dictated by how we conceive the construct: Adiposity
- Ponderal Index = $\mathrm{height}/\mathrm{weight}^{1/3}$
- Weight/Height Ratio = $\mathrm{weight}/\mathrm{height}$
- Body Mass Index = $\mathrm{weight}/\mathrm{height}^{2}$
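To make the point concrete, the three adiposity indices can be computed side by side. This is a rough sketch, assuming height in metres and weight in kilograms; the two people and the function names are hypothetical and chosen only so that the indices disagree.

```python
def ponderal_index(height, weight):
    """Height divided by the cube root of weight, as defined on the slide."""
    return height / weight ** (1 / 3)


def weight_height_ratio(height, weight):
    """Weight divided by height."""
    return weight / height


def body_mass_index(height, weight):
    """Weight divided by height squared."""
    return weight / height ** 2


# Hypothetical (height in m, weight in kg) pairs with the same BMI of about 25.
short_person = (1.60, 64.0)
tall_person = (1.90, 90.25)

bmi = [body_mass_index(*p) for p in (short_person, tall_person)]
whr = [weight_height_ratio(*p) for p in (short_person, tall_person)]
ponderal = [ponderal_index(*p) for p in (short_person, tall_person)]
```

The two people are tied on BMI, yet the weight/height ratio and the ponderal index each separate them. Each formula carries its own implicit assumption about how adiposity relates to height.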

A measure of adiposity should have the following characteristics (Billewicz et al., 1962):
1) Allow us to rank individuals in order of their true relative adiposity
2) A given value, for each sex, should imply on average the same degree of relative adiposity at all heights
3) Easy to compute and invariant with respect to the units of measurement

Source: Billewicz et al. (1962)

These examples show that an index which contains an implicit assumption about the relationship of adiposity to height cannot be safely used in statistical analyses unless height is introduced as an independent variable. This conclusion is of particular importance in relation to analyses of body build in diseases which have a marked social class gradient. There is no doubt that average stature increases with rise of social status, so that, for example, a disease which occurs more commonly in the upper social strata will tend also to occur more commonly in tall individuals. Incautious application of the ponderal index or the height/weight ratio to groups of sufferers from such a disease could readily give misleading results. -- (Billewicz, Kemsley & Thomson, 1962, p. 186)

Constructing the Algorithm for Combining Indicators
Should be dictated by how we conceive of the construct, and what we know about the phenomenon of interest:
- In humans, WEIGHT is NINE TIMES more variable than HEIGHT, and therefore measures that include both vary disproportionately
- Billewicz et al. (1962) recommend a more useful index: (observed wt.)/(standard wt.)

Common Algorithms: Linear Combination
- Sum the indicators
  - Stressful Life Events
  - Child Behavior Checklists
- Assumptions:
  - Indicators are additive
  - Their contributions are unique/independent
  - Their importance (contribution to the sum) is equal to their variance
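The last assumption is easy to overlook, so here is a small sketch of it. The two indicators and their values are hypothetical; the point is only that in a raw sum the indicator with the larger variance drives the ranking, while standardizing first (unit weighting on z-scores) lets both contribute.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical indicators for five respondents.
income_thousands = [20, 40, 60, 80, 100]  # large variance
education_years = [16, 12, 14, 10, 18]    # small variance

# Raw sum: each indicator is implicitly weighted by its variance.
raw_sum = [a + b for a, b in zip(income_thousands, education_years)]

# Sum of z-scores: both indicators contribute on the same footing.
unit_weighted = [
    a + b for a, b in zip(z_scores(income_thousands), z_scores(education_years))
]

# Respondent order (lowest to highest) under the raw sum.
raw_order = sorted(range(len(raw_sum)), key=lambda i: raw_sum[i])
```

In this made-up data the raw sum reproduces the income ordering exactly, whereas the standardized sum places the second respondent below the first.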

Example: Risk for Heart Disease
* Smoking
* High cholesterol
* High blood pressure
* Body fat
* Diet
* Genetics
Do we sum these? Do we weight them?

Example: Delinquency
* Auto theft
* Stealing < $20
* Vandalism
* Disrespecting authority
* Drug use
* Skipping school
Do we sum these? Do we weight them?

Weighting Schemes
- Implicit vs. explicit
  - Weight by variance vs. determine purposefully
- Unit weighting (z-scores, 1/variance)
- Expert opinion (e.g., Sellin & Wolfgang (1966); US News & World Report; meta-analyses)
  - Stealing item valued at $5 = 1
  - Assault resulting in death = 26
- Analytically (e.g., unstandardized regression weights)
- Criterion (e.g., weighting baseball hits based on their consequence to the game)
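An explicit, expert-derived weighting scheme can be sketched as a weighted sum. Only the two weights quoted on the slide (1 and 26) are used here; the dictionary keys, the function names, and the two offense histories are hypothetical.

```python
# Seriousness weights quoted on the slide from Sellin & Wolfgang (1966).
SERIOUSNESS_WEIGHTS = {
    "steal_item_valued_at_5_dollars": 1,
    "assault_resulting_in_death": 26,
}


def unit_weighted_count(offenses):
    """Count every offense equally."""
    return sum(offenses.values())


def seriousness_score(offenses, weights=SERIOUSNESS_WEIGHTS):
    """Weight each offense count by its expert-judged seriousness."""
    return sum(weights[name] * count for name, count in offenses.items())


# Hypothetical offense histories.
person_a = {"steal_item_valued_at_5_dollars": 5, "assault_resulting_in_death": 0}
person_b = {"steal_item_valued_at_5_dollars": 0, "assault_resulting_in_death": 1}

counts = (unit_weighted_count(person_a), unit_weighted_count(person_b))
scores = (seriousness_score(person_a), seriousness_score(person_b))
```

A simple count ranks person A as more delinquent, while the seriousness weights reverse the order, which is why the weighting choice has to follow from the construct.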

Other Methods for Combining Indicators
- Nonlinear (non-additive)
  - Offensive efficiency = (on-base %) x (slugging avg)
  - BMI = $\mathrm{weight}/\mathrm{height}^{2}$
- Cumulative scaling: indicators ordered via probability of endorsement
  - Attitudes toward immigration*
    - Allow into my country
    - Live in my community
    - Live on my block
    - Live next door
    - My child is allowed to date
    - My child is allowed to marry
- “Hurdles” scaling: must “clear” one indicator to move on to the next, e.g., DSM diagnoses
*Source: http://trochim.human.cornell.edu/kb/scalgutt.htm
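Cumulative (Guttman-type) scaling can be sketched in a few lines. This assumes items are listed from most to least likely to be endorsed and coded 1 (endorsed) or 0 (not endorsed); the function names and response patterns are hypothetical.

```python
def is_cumulative(responses):
    """True when endorsements form a run of 1s followed only by 0s.

    In a perfectly cumulative scale, endorsing a harder item implies
    endorsing every easier item before it.
    """
    return all(a >= b for a, b in zip(responses, responses[1:]))


def scale_score(responses):
    """Number of items endorsed; meaningful when the pattern is cumulative."""
    return sum(responses)


# Six items ordered from easiest to hardest to endorse.
consistent = [1, 1, 1, 0, 0, 0]
inconsistent = [1, 0, 1, 0, 0, 0]

consistent_ok = is_cumulative(consistent)
inconsistent_ok = is_cumulative(inconsistent)
consistent_score = scale_score(consistent)
```

For a consistent pattern the single score identifies exactly which items were endorsed; for an inconsistent one the score no longer carries that information.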

Creating Composite Variables
- Our examples are not exhaustive but illustrative
- Algorithms are limited only by our imagination for combining indicators meaningfully so they accurately reflect our knowledge of the construct we are measuring

Psychometric properties of measures
- Reliability (repeatability, consistency) and validity (are we measuring what we claim?)
- Not inherent properties of a measure
  - Dependent on measurement model
  - Dependent on use of measure (context)

Psychometric consideration
- Reliability
- Validity
  - Construct validity
  - Criterion validity

Reliability
- Index of consistency across items (internal consistency, split-half)
- Index of repeatability over time (test-retest)
- Classical test theory: reliability = true score variance / (true score variance + error variance), where the denominator is the observed score variance
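The classical test theory ratio can be checked with a tiny worked example. The numbers are hypothetical and chosen so that the true scores and errors are exactly uncorrelated, which is what the classical model assumes.

```python
from statistics import pvariance

# Hypothetical true scores and errors for four respondents (uncorrelated by design).
true_scores = [2, 2, -2, -2]
errors = [1, -1, 1, -1]
observed = [t + e for t, e in zip(true_scores, errors)]

true_var = pvariance(true_scores)
error_var = pvariance(errors)
observed_var = pvariance(observed)

# Reliability is the share of observed-score variance due to true scores.
reliability = true_var / observed_var
```

Because the errors are uncorrelated with the true scores, the observed variance equals the sum of the two parts (4 + 1 = 5), and the reliability works out to 4/5.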

[Diagram: reliability; observed score made up of true score and error]

Validity
Extent to which a measure corresponds to the construct it is intended to measure
- Face: looks like it measures the construct
- Content: adequate sampling of full range of content to be measured
- Concurrent: does it measure the concept as it appears “now”
- Predictive: does it measure the concept as it will be at some future point in time

Criterion validity of a measure
- Can be estimated with precision and specified by some statistical process (e.g., a correlation)
- A measure can be considered to have criterion validity for any criterion with which it is correlated
- Criterion validity does not require any understanding of the process underlying the correlation: practical prediction
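As an illustration of "specified by some statistical process", a criterion validity coefficient can be computed as a Pearson correlation between the measure and the criterion. The scores below are hypothetical, and the helper is a plain textbook implementation rather than anything from the presentation.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical composite scores and an external criterion for five cases.
index_scores = [1, 2, 3, 4, 5]
criterion = [2, 4, 5, 4, 5]

validity_coefficient = pearson_r(index_scores, criterion)
```

The coefficient says how well the measure predicts this particular criterion and nothing about why the two are related.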

Construct validity of measures
- Is not about statistics or even psychometrics
- It is about careful thinking, systematic work, careful data collection, and intensive data analysis
- Construct validity cannot be quantified with any precision
- Ultimately, it is a matter of consensus among those who use the measures

Measurement Models: Latent vs. Quasi-latent or Emergent variables
Effects vs. causal indicators (Bollen & Lennox, 1991)
- Effects indicators: ‘latent’ variables, common factors
  - Indicators are the result of the underlying construct
  - Measured indirectly by the indicators (i.e., latent)
- Causal indicators: ‘quasi-latent or emergent’ variables, formative indicators
  - The construct is the result of the indicators
  - Measured directly (i.e., measured)

Validity issues
- Common factor measures: construct validity is the essential issue
- Quasi-latent or emergent variable: utility is the essential issue

Reliability & Latent Variables
- Measurement model:
  - Indicators correlated due to common cause
  - Unidimensional
  - Internal consistency relevant
- Indicators imperfect reflection of underlying cause
  - true score ≠ observed score
[Diagram: Latent Variable with indicators V1, V2, V3, V4]

Reliability & Quasi-latent or Emergent Variables
- Measurement model:
  - Indicators create the construct
    - Multidimensional
    - Internal consistency NOT relevant
  - Construct is defined by indicators
    - Accuracy is relevant
    - true score = observed score
[Diagram: Quasi-latent or emergent Variable with indicators V1, V2, V3, V4]

Reliability of Quasi-latent or Emergent Variables
- Reliability analyses usually limited to Cronbach’s α and retest coefficients (Socan, 2000)
- Cronbach’s α underestimates reliability of multidimensional composites (Lord & Novick, 1974)
- Test-retest of alternate forms (for each indicator), if available (Nunnally & Bernstein, 1994), e.g.,
  - Quality of Life, form 1: Health status, form 1; Financial status, form 1; Job satisfaction, form 1
  - Quality of Life, form 2: Health status, form 2; Financial status, form 2; Job satisfaction, form 2
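Why α is a poor fit for multidimensional composites can be seen in a small sketch. It uses the usual formula α = k/(k − 1) × (1 − Σ item variances / variance of the total); the data sets and the function name are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per indicator,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical indicators that rise and fall together (common-factor-like).
correlated = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 3, 3, 5]]

# Hypothetical indicators that are exactly uncorrelated (emergent-like).
uncorrelated = [[1, 1, -1, -1], [1, -1, 1, -1]]

alpha_correlated = cronbach_alpha(correlated)
alpha_uncorrelated = cronbach_alpha(uncorrelated)
```

α is high for the correlated set and zero for the uncorrelated set, regardless of how accurately each uncorrelated indicator was recorded. The statistic reflects inter-item correlation, which an emergent variable does not require.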

Reliability of multidimensional linear composites (Nunnally & Bernstein, 1994)
$$ r_{yy} = 1 - \frac{\sum_i \sigma_i^2 - \sum_i r_{ii}\,\sigma_i^2}{\sigma_y^2} $$
where
- $\sigma_i^2$ = variance of individual indicators in the linear combination
- $r_{ii}$ = estimates of each indicator’s reliability
- $\sigma_y^2$ = variance of linear combination
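A sketch of how such a composite reliability could be computed follows. It assumes the commonly cited form of the linear-combination formula: one minus the summed error variance of the indicators (each indicator's variance times one minus its reliability) divided by the variance of the composite. The function name, symbol names, and numbers are all hypothetical.

```python
def composite_reliability(variances, reliabilities, composite_variance):
    """Reliability of a sum of indicators.

    Assumed form: 1 - (sum of indicator variances
                       - sum of reliability-weighted indicator variances)
                      / variance of the composite.
    """
    total_variance = sum(variances)
    reliable_variance = sum(r * v for r, v in zip(reliabilities, variances))
    return 1 - (total_variance - reliable_variance) / composite_variance


# Hypothetical two-indicator composite.
indicator_variances = [4.0, 9.0]
indicator_reliabilities = [0.8, 0.9]
covariance = 3.0  # assumed covariance between the two indicators

# Variance of a sum of two variables: var1 + var2 + 2 * covariance.
composite_variance = sum(indicator_variances) + 2 * covariance

r_yy = composite_reliability(
    indicator_variances, indicator_reliabilities, composite_variance
)
```

Unlike α, this estimate does not require the indicators to be highly correlated; it needs a separate reliability estimate for each indicator, for example from alternate forms.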

Reliability vs. Accuracy
- Emergent variables are defined by their indicators: “it is what it is”
- Therefore, they are measured variables
- Therefore, no measurement error in the CTT sense
[Diagram: True & Observed Score]

Reliability vs. Accuracy
Quality of U.S. graduate business schools (US News & World Report, 2002)
- Peer assessment (1 = marginal to 5 = outstanding)
- Recruiter assessment (1 = marginal to 5 = outstanding)
- Average undergraduate GPA
- Average GMAT score
- Acceptance rate
- Average starting salary & bonus
- % employed at graduation
- % employed 3 months after graduation
- Out-of-state tuition & fees
- Total full-time enrollment

Validity issues
- What is being measured?
- To what extent does the composite measure what it is designed to measure?
- Is more than one trait being measured?
- How well does the composite variable correlate with external criteria?

Validity & Latent variables
- Measurement model: indicators imperfect reflection of underlying cause
- “Entitativity” (Campbell, 1958)
- Correspondence ≠ 100%
[Diagram: Latent Variable with indicators V1, V2, V3, V4]

Validity & Quasi-latent or Emergent variables
- Measurement model:
  - Indicators create the construct
  - Construct is defined by indicators: “it is what it is”
- Correspondence = 100%
[Diagram: Quasi-latent or Emergent Variable (indexes) with indicators V1, V2, V3, V4]

Utility of Quasi-latent or Emergent Variables
- Extent to which it measures what it is intended to is not a relevant issue, but…
  - How well does this measure do what it is supposed to do?
  - What does this measure tell us?

Measuring Quality of Business Schools (U.S. News & World Report Rankings)
- Peer assessment (1 = marginal to 5 = outstanding)
- Recruiter assessment (1 = marginal to 5 = outstanding)
- Average undergraduate GPA
- Average GMAT score
- Acceptance rate
- Average starting salary & bonus
- % employed at graduation
- % employed 3 months after graduation
- Out-of-state tuition & fees
- Total full-time enrollment

Is this index useful?
What can this index do for us?
- Help applicants select schools for quality education
  - Index should predict knowledge gained
- Help employers select graduates for quality employees
  - Index should predict quality of work produced

Conclusion
- We need more attention on our measurement models before we focus on content
- Appropriate measures of psychometric properties should be determined by the measurement model
- Emergent variable models reflect different assumptions about reliability & validity as traditionally construed
  - Accuracy & utility are more applicable

Useful Resource
E. Giovannini et al. (2005), “Handbook on Constructing Composite Indicators: Methodology and User Guide”, OECD Statistics Working Papers, 2005/3, OECD Publishing. doi: 10.1787/533411815016