Grading the evidence in systematic reviews of measurement

Grading the evidence in systematic reviews of measurement properties
23 September 2010
Caroline Terwee
Knowledgecenter Measurement Instruments
Department of Epidemiology and Biostatistics
VU University Medical Center
www.kmin-vumc.nl

What is a systematic review of measurement properties?
A review of the measurement properties of all available measurement instruments that aim to measure a particular construct in a particular population. The aim is to select the best instrument for a specific purpose.

What is special about a systematic review of measurement properties (SR-MP)?
• An SR-MP has more than one outcome measure, i.e. multiple measurement properties
• Different studies evaluate different measurement properties, so the number of studies in the analysis differs per measurement property
• The quality of the studies is evaluated per measurement property
• Data synthesis is different per measurement property
• Evidence for one measurement property may come from different studies
Therefore, an SR-MP is actually a collection of separate reviews, one per measurement property.

Methodology of systematic reviews of measurement properties: 10 steps
1. Formulating a research question
2. Performing a literature search
3. Formulating eligibility criteria
4. Selecting abstracts and full-text articles
5. Evaluating the methodological quality of the included studies
6. Data extraction
7. Content comparison
8. Data synthesis: evaluating the quality of the instruments
9. Overall conclusion of the systematic review
10. Reporting a systematic review of measurement properties

Data synthesis: 2 steps
1. Decide on combining studies
• Homogeneity of study characteristics: study population, setting, (language) version of the instrument, mode of administration, design characteristics (time interval)
• Methodological quality
• Consistency of the results of the measurement properties
2. Decide on the analysis
• Quantitative analysis: statistical pooling
• Qualitative analysis: best evidence synthesis

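The slides do not say how statistical pooling would be carried out. As a minimal sketch of one possibility, the function below pools reliability coefficients with a sample-size-weighted mean on the Fisher z scale, a technique borrowed from the meta-analysis of correlations and only an approximation for ICCs. The function name `pooled_coefficient`, the weights `n - 3`, and the input values are all assumptions made for illustration; none come from the presentation.

```python
import math


def pooled_coefficient(estimates):
    """Pool coefficients with a weighted mean on the Fisher z scale.

    estimates: list of (coefficient, sample_size) tuples, with each
    coefficient strictly between -1 and 1 and each sample size above 3.
    The weight n - 3 is the usual inverse-variance weight for a Fisher
    z-transformed correlation (an assumption when applied to ICCs).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for coefficient, sample_size in estimates:
        weight = sample_size - 3
        weighted_sum += weight * math.atanh(coefficient)
        total_weight += weight
    # Back-transform the pooled z value to the original scale.
    return math.tanh(weighted_sum / total_weight)


# Hypothetical inputs, not taken from the slides.
hypothetical_studies = [(0.90, 80), (0.85, 40)]
print(round(pooled_coefficient(hypothetical_studies), 3))
```

Pooling of this kind presupposes the homogeneity checks listed under step 1; when studies differ too much, the qualitative best evidence synthesis is the alternative the slide names.
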
Example: Reliability of the Quebec Back Pain Disability Scale (QBPDS)
Nine studies evaluated reliability; eight were included in the analysis:
1. Kopec et al. J Clin Epidemiol 1996
2. Schoppink et al. Phys Ther 1996
3. Fritz et al. Phys Ther 2001
4. Davidson et al. Phys Ther 2002
5. Mousavi et al. Spine 2006
6. De Beer et al. S Afr J Physiother 2008
7. Melikoglu et al. Spine 2009
8. Hicks et al. Pain Med 2009

Study population, setting, country

Study                     1           2           3        4         5             6          7        8
Age                       40 (10)     44 (12)     79 (5)   40 (12)   42 (9)        55 (17)    37 (10)  45 (15)
% Male                    70          32          29       45        5             64         48       26
Duration complaints (mo)  12 (0-128)  85% >12     7 (6)    84 (108)  ?             50% >6     <1       51 (50)
Country                   NL          Brazil      US       Iran      South Africa  Australia  US       Turkey
Language                  Dutch       Portuguese  English  Persian   Tswana        English    English  Turkish

Setting (column positions not recoverable): general practice; retirement community; orthopaedic/trauma clinic; hospital; PT practice; PT outpatient clinic; com. serv.

Methodological quality: COSMIN checklist
COnsensus-based Standards for the Selection of Health Measurement INstruments
Different boxes for each measurement property, with questions regarding quality aspects.
Box reliability: 14 items to evaluate the quality of a reliability study.

Methodological quality of the studies

Study              1   2    3   4   5   6   7   8
Sample size        89  54   56  31  31  47  23  100
Time interval (d)  7   3-4  11  1   1   42  28  1

Remaining rows (column positions not recoverable), values in slide order:
Missing items: 20% deleted; ?; ?; % deleted; 0; ?; ?
Stable patients: assumable?; based on GPC.; assumable; PT treatment; PT treatment; treatment?
Test conditions: 2 x mail; inter/intra-rater; 2 x mail; ?; 2 x clinic; 2 x mail; 2 x clinic
COSMIN rating: good; fair; poor; fair

Consistency of results: results of the studies

Study                     1           2          3      4         5     6       7     8
ICC                       0.90        0.89-0.93  0.94   0.86      0.91  0.84    0.55  0.92
Duration complaints (mo)  12 (0-128)  85% >12    7 (6)  84 (108)  ?     50% >6  <1    51 (50)

COSMIN rating (column positions not recoverable): good; fair; poor; fair

Step 2: Best evidence synthesis
Data synthesis differs per measurement property.
General guideline: levels of evidence, based on the Cochrane Back Review Group

Level        Rating      Criteria
strong       +++ or ---  Consistent findings in multiple studies of good methodological quality OR in one study of excellent methodological quality
moderate     ++ or --    Consistent findings in multiple studies of fair methodological quality OR in one study of good methodological quality
limited      + or -      One study of fair methodological quality
conflicting  +/-         Conflicting findings
unknown      ?           Only studies of poor methodological quality

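The levels of evidence can be read as a small decision rule. The sketch below is one possible encoding, not the presenter's implementation: the function name `level_of_evidence`, the tuple format, and the strict reading of "conflicting" (any disagreement in direction among the studies that are not of poor quality) are assumptions. Studies of poor quality are set aside before the rule is applied, following the statement elsewhere in the deck that study limitations are used to exclude studies from the analysis.

```python
from collections import Counter


def level_of_evidence(studies):
    """Return (level, rating) for one measurement property.

    studies: list of (quality, direction) tuples, where quality is
    "excellent", "good", "fair" or "poor" and direction is "+" or "-".
    """
    # Assumption: poor-quality studies are excluded from the synthesis.
    usable = [(quality, direction) for quality, direction in studies
              if quality != "poor"]
    if not usable:
        return ("unknown", "?")

    directions = {direction for _, direction in usable}
    if len(directions) > 1:
        # Assumption: any disagreement counts as conflicting findings.
        return ("conflicting", "+/-")

    sign = directions.pop()
    counts = Counter(quality for quality, _ in usable)
    if counts["excellent"] >= 1 or counts["good"] >= 2:
        return ("strong", sign * 3)
    if counts["good"] == 1 or counts["fair"] >= 2:
        return ("moderate", sign * 2)
    return ("limited", sign)


# Three good and four fair studies, all with positive findings.
print(level_of_evidence([("good", "+")] * 3 + [("fair", "+")] * 4))
# ('strong', '+++')
```

The usage line mirrors the counts of the QBPDS reliability example (three studies of good quality, four of fair quality, consistent positive findings), which the deck also rates as strong evidence.
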
Example: Reliability of the Quebec Back Pain Disability Scale (QBPDS)

Study  1     2          3     4     5     6     7     8
ICC    0.90  0.89-0.93  0.94  0.86  0.91  0.84  0.55  0.92

COSMIN rating (column positions not recoverable): good; fair; poor; fair

Consistent findings of good reliability (ICC > 0.70) in three studies of good methodological quality and in four studies of fair methodological quality.
Strong evidence for good reliability (+++).

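The ICC > 0.70 criterion can be checked against the reported values with a few lines of code. This is a sketch under two assumptions: the fused entry "0. 890. 93" is read as the range 0.89-0.93 for study 2, and a study reporting a range meets the criterion only if its lowest value does. The names `icc_by_study` and `meets_threshold` are illustrative.

```python
ICC_THRESHOLD = 0.70  # criterion for good reliability quoted on the slide

# (lowest, highest) ICC per study column; single values are repeated.
# Assumption: study 2 reports the range 0.89-0.93.
icc_by_study = {
    1: (0.90, 0.90),
    2: (0.89, 0.93),
    3: (0.94, 0.94),
    4: (0.86, 0.86),
    5: (0.91, 0.91),
    6: (0.84, 0.84),
    7: (0.55, 0.55),
    8: (0.92, 0.92),
}


def meets_threshold(icc_range, threshold=ICC_THRESHOLD):
    """True when even the lowest reported ICC exceeds the threshold."""
    lowest, _highest = icc_range
    return lowest > threshold


below = [study for study, icc_range in icc_by_study.items()
         if not meets_threshold(icc_range)]
print(below)  # [7]
```

Seven of the eight columns meet the criterion, which matches the number of studies (three good plus four fair) on which the slide bases its conclusion.
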
Other measurement properties
Internal consistency
Consistent findings in multiple studies of good methodological quality, or in one study of excellent methodological quality, that (sub)scales are unidimensional
PLUS
Consistent findings in multiple studies of good methodological quality, or in one study of excellent methodological quality, that Cronbach's alpha is > 0.70

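For readers who want to see what the alpha > 0.70 criterion is applied to, the sketch below computes Cronbach's alpha from raw item scores with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the toy data are hypothetical; the slides give no data or code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents.

    item_scores: one list per respondent, each holding the scores on the
    same k items (k >= 2, at least two respondents, total scores not all
    equal).
    """
    k = len(item_scores[0])
    # Transpose so that each tuple holds all scores on one item.
    per_item = list(zip(*item_scores))
    sum_item_variances = sum(variance(scores) for scores in per_item)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical scores of four respondents on three items.
toy_scores = [
    [1, 2, 1],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 5],
]
alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2), alpha > 0.70)
```

Alpha is only interpretable for a unidimensional (sub)scale, which is why the slide requires evidence of unidimensionality in addition to alpha > 0.70.
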
Content validity
Strong evidence: all items are considered relevant for the construct, purpose, and target population, and the instrument is considered comprehensive.
Moderate evidence: the items are considered relevant for the construct or target population, and the instrument is considered comprehensive.
Limited evidence: only one aspect of content validity is assessed.

Construct validity and responsiveness
The levels of evidence described in the table are applied.
Challenges:
• some studies examine more hypotheses than others
• some hypotheses are more challenging than others
• some comparison instruments are better than others

Summary: similarities and dissimilarities with GRADE
• Study limitations (methodological quality) are taken into account, not to downgrade the level of evidence but to exclude studies from the data analysis
• Inconsistency is taken into account in applying levels of evidence
• Indirectness (generalizability) is usually not taken into account, because there are too many differences in study characteristics and their influence on the measurement properties is unclear
• Imprecision (only one study) is very common; it is taken into account in applying levels of evidence
• Publication bias is not considered, but might be a problem

Systematic reviews of measurement properties

TODAY 13.45, Aula VU: PhD defence of Wieneke Mokkink
COSMIN checklist: COnsensus-based Standards for the Selection of Health Measurement INstruments