Measuring differences in the quality of CABG surgery

Measuring differences in the quality of CABG surgery using quality-adjusted life years
Justin Timbie, Ph.D.
Postdoctoral Fellow, HSR&D Center for Practice Management & Outcomes Research
VA Ann Arbor Healthcare System
January 18, 2008

Outline of talk
• Background: Summary measures of quality.
• Analytic approach: Use quality measures to estimate quality-adjusted life expectancy.
• Illustration: Compare outcomes for 14 hospitals performing CABG surgery.

Background: Summary measures of quality
• Combine mortality, process, and “intermediate” outcome measures.
• Used in pay-for-performance and benefit design.

Number of measures by type and condition:

Type of measure          Diabetes   AMI   CABG surgery
Mortality                -          1     1
Process                  4          8     5
Intermediate outcomes    3          -     -
Complications            -          -     5

AMI = Acute Myocardial Infarction; CABG = Coronary Artery Bypass Graft

Background: Current approaches

Approach                 Commonly used by
Equal weights            CMS
Expert panel weights     The Leapfrog Group
“All-or-nothing”         Institute for Healthcare Improvement
Latent variable model

• Limitations:
  – Weighting has limited (or no) clinical basis.
  – Weighting of mortality vs. process measures is conceptually weak.

Objective
• Translate the information in quality measures into a summary measure of health impact.
• Metric is quality-adjusted life expectancy.
• Theoretical framework for combining morbidity and mortality information.
• Methods:
  1) Map quality measures to health utilities.
  2) Predict failure time given quality data.
  3) Weight survival time by health utilities.

Quality of CABG surgery in Massachusetts

No.   Measure                               Failure rate (%)
1     Preoperative beta-blocker             15.2
2     Use of internal mammary artery        4.9
3     Aspirin at discharge                  5.2
4     Beta-blocker at discharge             15.2
5     Anti-lipids at discharge              21.6
6     Stroke (lasting >72 hours)            1.2
7     Deep sternal wound infection          1.2
8     Renal failure                         2.9
9     Prolonged ventilation (>24 hours)     11.9
10    Re-operation                          2.2

Source: Massachusetts Data Analysis Center (Mass-DAC), 2004.

Quality failures at the patient level

Number of reported quality failures   N (%)
0                                     1928 (48.4)
1                                     1178 (29.6)
2                                     531 (13.3)
3                                     256 (6.4)
4                                     66 (1.7)
5                                     22 (0.55)
6                                     5 (0.13)

Source: Mass-DAC, 2004.

Defining and measuring “utility”
• Measure of a patient’s preference for a health state on a [0, 1] scale.
• Elicited using standard gamble, time tradeoff, and other methods.

Strengths
• Quantifies morbidity in terms of mortality.
• Can compare health states on a cardinal scale.

Challenges
• Burdensome to collect.
• Heterogeneity in estimates caused by elicitation method and population characteristics.
• No estimates for some health states.
• Health states of interest are often multidimensional.

Utilities for CABG complications

Complication                      Duration   Estimate
Reoperation                       30 d       0.78
Prolonged ventilation             30 d       0.76
Renal failure                     30 d       0.63
Deep sternal wound infection      60 d       0.58
Stroke                            2 y*       0.52
Renal failure with hemodialysis   2 y*       0.49

* Two years of mortality data

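Estimating joint utilities
• Estimated for three periods: 0-30 d, 31-60 d, 61 d-2 y.
• Example: Patient had renal failure (resolved) (U = 0.63) and a deep sternal wound infection (DSWI) (U = 0.58).

Method           Formula                              Utility (0-30 days)
Multiplicative   U = 0.63 * 0.58 * 1 * 1              0.37
Additive         U = 1 - (0.37 + 0.42 + 0 + 0)        0.21

In the additive formula, 0.37 and 0.42 are the disutilities (1 - U) of renal failure and DSWI; complications the patient did not have contribute a utility of 1 (disutility of 0).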
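The two aggregation rules in the example can be checked with a few lines of Python. This is a minimal sketch, not the analysis code behind the talk: the function names are hypothetical, and the input list simply repeats the example's utilities (0.63, 0.58, and 1 for each of two absent complications).

```python
from math import prod

def multiplicative_utility(utilities):
    """Joint utility as the product of the individual health-state utilities."""
    return prod(utilities)

def additive_utility(utilities):
    """Joint utility as 1 minus the sum of the disutilities (1 - U)."""
    return 1.0 - sum(1.0 - u for u in utilities)

# Renal failure (resolved) and DSWI; absent complications contribute U = 1.
period_0_30 = [0.63, 0.58, 1.0, 1.0]
print(round(multiplicative_utility(period_0_30), 2))  # 0.37
print(round(additive_utility(period_0_30), 2))        # 0.21
```

The additive rule penalizes co-occurring complications more heavily than the multiplicative rule, which is why the same patient scores 0.21 rather than 0.37.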

Combining utilities and failure time
Patient 719
• Process failures: Preoperative beta blocker
• Complications: Renal failure (resolved), DSWI
• Failure time: > 24 months
[Figure: predicted utilities and predicted failure time for Patient 719; utility (0 to 1) plotted against time in months, with ticks at 1, 2, and 24.]
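Weighting survival time by period-specific utilities might look like the sketch below. It rests on several assumptions not stated on the slide: two years is taken as 730 days, a failure time beyond 24 months is truncated at that horizon, the multiplicative rule is used for the first period, renal failure and DSWI last 30 and 60 days respectively (the durations listed in the utilities table), and the process failure carries no disutility. The function name is hypothetical.

```python
# Period lengths in days for 0-30 d, 31-60 d, and 61 d-2 y (2 y taken as 730 d).
PERIOD_DAYS = [30, 30, 670]

def quality_adjusted_days(period_utilities, failure_time_days, horizon_days=730):
    """Sum utility-weighted days alive across periods, up to the failure time."""
    remaining = min(failure_time_days, horizon_days)
    total = 0.0
    for length, utility in zip(PERIOD_DAYS, period_utilities):
        days_alive = min(length, remaining)
        total += utility * days_alive
        remaining -= days_alive
    return total

# Patient 719 (assumed inputs): both complications in days 0-30,
# DSWI alone in days 31-60, full health afterwards.
utilities_719 = [0.63 * 0.58, 0.58, 1.0]
print(round(quality_adjusted_days(utilities_719, failure_time_days=730), 1))  # 698.4
```

Under these assumptions the two complications cost this patient roughly 32 quality-adjusted days over two years, since both resolve within the first two months.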

Predicting failure time
• Hierarchical Poisson hazard model:
  $h \mid X, Q = \beta_{0i} + \beta_{1i}\,\text{Period 2} + \beta_2 X + \beta_3 Q$
  X = risk factors
  Q = binary quality indicators
  (periods p = 0-30 d, 31-60 d, 61-730 d)
• Predicted failure time $= p(S \mid h_{ij}) \times 2\ \text{y}$
• Bayesian analysis using WinBUGS.
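The step from a fitted hazard to a predicted failure time can be illustrated as follows. This is only a sketch under stated assumptions: the hazard is treated as constant within each of the three periods and expressed per day, p(S | h) is read as the probability of surviving the full two years, and two years is taken as 730 days. The function names and the example hazard values are hypothetical; the talk's model was fitted in WinBUGS, not with this code.

```python
import math

# Days in each period: 0-30 d, 31-60 d, 61-730 d.
PERIOD_DAYS = {"0-30d": 30, "31-60d": 30, "61-730d": 670}

def survival_probability(daily_hazards):
    """Two-year survival under a piecewise-constant hazard: exp(-cumulative hazard)."""
    cumulative = sum(daily_hazards[p] * days for p, days in PERIOD_DAYS.items())
    return math.exp(-cumulative)

def predicted_failure_time_days(daily_hazards, horizon_days=730):
    """Survival probability scaled by the two-year horizon, as on the slide."""
    return survival_probability(daily_hazards) * horizon_days

# Hypothetical per-day hazards, highest just after surgery.
example = {"0-30d": 0.001, "31-60d": 0.0005, "61-730d": 0.0001}
print(round(survival_probability(example), 3))
print(round(predicted_failure_time_days(example), 1))
```

A patient with zero hazard in every period gets the full 730 days; higher hazards, for example from quality failures with positive coefficients, shrink the predicted failure time.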

Defining a benchmark
• “Best quality” hospital:
  – Lowest failure rate on all process measures of quality.
  – Lowest case-mix adjusted complication rates.
• Expected survival f(h), with the hazard evaluated at benchmark quality:
  $h \mid X, Q_{\text{BEST}} = \beta_{0i} + \beta_{1i}\,\text{Period 2} + \beta_2 X + \beta_3 Q_{\text{BEST}}$
  where $Q_{\text{BEST}} = \min[p(Q=1)_i]$ for process measures and $Q_{\text{BEST}} = p(Q=1 \mid \beta_{\text{BEST}})$ for complications (from univariate hierarchical risk-adjustment models).

Defining a benchmark
• Expected (univariate) utilities. Example:
  $EU(\text{stroke}) = p(\text{stroke}=1 \mid \beta_{\text{BEST(stroke)}i}, X)\,U(\text{stroke}) + p(\text{stroke}=0 \mid \beta_{\text{BEST(stroke)}i}, X)\,U(\text{no stroke})$
• “Incremental” QALY = Predicted QALY - Expected QALY
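The expected-utility and incremental-QALY formulas reduce to simple arithmetic, sketched below. The stroke utility of 0.52 comes from the utilities table; the benchmark stroke probability of 0.01, the utility of 1 for "no stroke", and the QALY values passed to `incremental_qaly` are hypothetical illustration values, as are the function names.

```python
def expected_utility(p_event, u_event, u_no_event=1.0):
    """Probability-weighted utility of a complication at the benchmark hospital."""
    return p_event * u_event + (1.0 - p_event) * u_no_event

def incremental_qaly(predicted_qaly, expected_qaly):
    """Predicted QALY minus the QALY expected under benchmark quality."""
    return predicted_qaly - expected_qaly

p_stroke_best = 0.01  # hypothetical benchmark probability of stroke
print(round(expected_utility(p_stroke_best, u_event=0.52), 4))  # 0.9952

# Hypothetical QALYs in days: negative means worse than the benchmark.
print(incremental_qaly(predicted_qaly=700.0, expected_qaly=710.0))  # -10.0
```

A negative incremental QALY means the patient, or a hospital's average patient, is predicted to fare worse than under benchmark quality.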

Results
• Hospital 8: -43 days
• Hospital 5: -7 days
[Figure: incremental QALYs (days) by hospital; horizontal axis runs from -60 to 20.]

Key findings
• Incidence of complications was low, and most are resolved after 2 months, diluting their impact on 2-year QALE.
• Results were largely unaffected by the magnitude of utility estimates or the aggregation method.
• Comparison with other summary measures is needed to put these results in context.

Limitations
• Utility estimation needs refinement.
• Did not include disutilities associated with process measures.
  – Difficulty estimating health impact of discharge medication measures.
• Limited clinical detail in quality measures.
  – Severity of stroke, infections.
  – Contraindications, treatment preferences.

Strengths of approach
• Theoretical framework for combining morbidity and mortality information.
• Weights of quality measures are based on:
  – Survival model regression coefficients
  – Utility weights (patients’ preferences)
• Generalizable to other diseases (e.g., diabetes).
• Orientation shifted from hospital level to patient level.

Acknowledgements
• Sharon-Lise Normand, Ph.D.
• Joseph Newhouse, Ph.D.
• Meredith Rosenthal, Ph.D.
• David Shahian, M.D.
• Massachusetts Data Analysis Center (Mass-DAC) for the use of their data.
• Funding from the Alfred P. Sloan Foundation.