IS THERE A CRACK IN MY CRYSTAL BALL? Critically appraising clinical prediction rules (CPRs)

Clinical Prediction Rules are Everywhere!
(Examples extracted from MedCalc and Qx Calculate)
Setting, then example: what it predicts
Emergency
  • Canadian CT Head Rule: need for CT after head trauma
  • CRB-65, CURB-65: risk of mortality in community-acquired pneumonia
  • Wells Score for DVT, PE: clinical probability of DVT, PE in patients presenting to ER with characteristic symptoms
Medicine/Ambulatory
  • CHADS2, CHA2DS2-VASc: 1-year risk of stroke in afib patients
  • Framingham: 10-year risk of CVD
  • MELD, MELD-Na: risk of 3-month mortality and need for liver transplant in liver failure
  • PESI: risk of death following PE
  • GRACE, TIMI score: short- and long-term mortality after ACS
Surgery
  • EuroSCORE: early mortality following cardiac surgery
  • Gupta Perioperative Cardiac Risk: risk of perioperative MI or cardiac arrest
  • Postoperative Respiratory Failure Risk Calculator: as name suggests
ICU
  • Clinical Pulmonary Infection Score: presence of ventilator-associated pneumonia
Pediatrics!
  • APGAR: neonatal problems including death

Steps to developing a clinical prediction rule (CPR)
1. Derive
2. Validate
   o Optional: refine and re-validate
3. Validate
…
7. Impact analysis

Derivation
1. Define a clinically important outcome that you want to predict
2. Pick a relevant setting and follow a cohort of patients over time
3. Collect baseline data on known and possible risk factors, as well as other commonly collected information

Derivation
4. Using a multivariable statistical model, uncover which patient characteristics predict the outcome
5. Optionally (and optimally), simplify the statistical model into a tool usable by busy clinicians

Validation
Internal
  6. Apply the rule in the derivation cohort to assess its accuracy, keeping in mind that the result is overly optimistic
External
  7. Apply the rule as it is designed to be used (not just the statistical model) in various practice settings, with different patients, over different periods of time, to assess how broadly it applies

Components of a CPR study
1. Study population
2. Predictor variables
3. Outcome
4. Accuracy
5. Transportability

1. Study population
Source
  • Prospective cohort is best
  • Retrospective data collection tends to be inaccurate and incomplete
  • RCTs are poorly generalizable and may miss important predictors
Setting
  • Country, inpatient vs outpatient, tertiary vs primary care, ward (ER vs ICU vs medicine vs surgery)

1. Study population
Patients
  • Inclusion/exclusion
  • Consecutive vs selective enrolment
  • SCRAPP mnemonic
Representation
  • Predictor variables present in a sufficient number of patients to avoid type 2 error in statistical model development (“powered”)

2. Predictor variables
• All variables considered for multivariable analysis described (even if not in final model)
• Reproducible in your practice
• Acceptable inter- and intra-observer variability
• Available at point of decision
• Blind assessment
• Validation studies: definition & ascertainment method comparable to derivation study

3. Outcome
• Clinical importance
• Clear and reproducible definition
• Blind assessment
• Validation studies: definition & ascertainment method comparable to derivation study

4. Accuracy
• Derivation study: description of multivariable analysis
• Handling of missing data
  o Multiple imputation is preferable
• Discrimination
• Calibration
• Internal validation: overfitting/optimism
• Validation studies: definition & ascertainment method comparable to derivation study
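
Discrimination is usually summarized with the C-statistic: the proportion of (patient with outcome, patient without outcome) pairs in which the patient with the outcome received the higher score. The sketch below is only an illustration of that definition; the function name `c_statistic` and the toy scores and outcomes are hypothetical and do not come from any study discussed here.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance between scores and binary outcomes (1 = outcome occurred).

    Counts every (event, non-event) pair; a pair is concordant when the
    event patient has the higher score, and ties count as half.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_score, non_event_score in product(events, non_events):
        if event_score > non_event_score:
            concordant += 1.0
        elif event_score == non_event_score:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: six patients with a risk score and an outcome flag.
toy_scores = [0, 1, 1, 2, 3, 4]
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_c = c_statistic(toy_scores, toy_outcomes)
print(f"C-statistic = {toy_c:.2f}")
```

A value of 0.5 means the score discriminates no better than a coin toss; 1.0 means every patient with the outcome scored higher than every patient without it.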

Let’s talk Statistics

Diagnostic terms
They apply to prognostic studies too!
              Has disease (gold standard)   No disease (gold standard)
  + test      true+                         false+
  - test      false-                        true-
• Diagnostic studies measure a new test’s accuracy in identifying disease according to the tried-and-true gold standard
  o E.g. troponins to identify patients presenting to the ER with chest pain who end up having CAD on angiogram
• The same principles apply almost identically to studies of clinical prediction rules; the tool is the “new test” and the outcome is the “disease”

Diagnostic terms
They apply to prognostic studies too!
              Has disease (gold standard)   No disease (gold standard)
  + test      true+                         false+
  - test      false-                        true-
Some new terms: you have to think of these ones “backwards”
• Sensitivity (Sn): % of patients who have the outcome who “test +”
  o Sn = true+ / (true+ + false-)
• The SnNout rule of thumb: a test with high Sensitivity that is Negative tends to rule out disease
  o E.g. negative d-dimer with low pre-test probability of DVT

Diagnostic terms
They apply to prognostic studies too!
              Has disease (gold standard)   No disease (gold standard)
  + test      true+                         false+
  - test      false-                        true-
Some new terms: you have to think of these ones “backwards”
• Specificity (Sp): % of patients who do not have the outcome who “test -”
  o Sp = true- / (false+ + true-)
• The SpPin rule of thumb: a test with high Specificity that is Positive tends to rule in disease
  o E.g. a positive serologic assay for heparin-induced thrombocytopenia following a high 4Ts score

Some facts about diagnostic stats
• Sensitivity and specificity aren’t especially intuitive to think about
• They’re also not very useful when looked at in isolation from one another
• If only there were a way to combine their info into a single useful value…

Bayesian probability
Uncertainty + new info = clearer picture
• Prior (or pre-test) probability = prior knowledge
• Likelihood ratio = new information
• Posterior (or post-test) probability = clearer picture

Bayesian probability: The likelihood ratio (LR) combines sensitivity and specificity!
• There are two possible LRs derived from any pair of sensitivity and specificity
  o LR+ is the LR used when the test is positive
  o LR- is the LR used when the test is negative
• LR+ = Sn / (1 - Sp)
• LR- = (1 - Sn) / Sp
• LRs are interpreted in the same way as odds ratios (LR > 1 makes the outcome more likely, LR < 1 makes it less likely, LR = 1 changes nothing)
• LRs can be calculated easily by plugging values into MedCalc’s likelihood ratio calculator!

The likelihood ratio: Example from the original CHADS2 study
                        Stroke        No stroke
CHADS2 ≥ 1 (+ test)     92 (true+)    1521 (false+)
CHADS2 = 0 (- test)     2 (false-)    118 (true-)
• Sn = 0.98
• Sp = 0.07
• What do the sensitivity and specificity of this cutoff tell us about CHADS2?
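
The sensitivity and specificity quoted on this slide can be checked directly from the four cells of the 2x2 table. This is a minimal sketch; the function and variable names are illustrative only, and the counts are the ones shown on the slide.

```python
def sensitivity(true_pos, false_neg):
    """Share of patients WITH the outcome who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of patients WITHOUT the outcome who test negative."""
    return true_neg / (true_neg + false_pos)


# Counts from the CHADS2 slide ("+ test" is CHADS2 >= 1, outcome is stroke).
tp, fp = 92, 1521   # CHADS2 >= 1: stroke / no stroke
fn, tn = 2, 118     # CHADS2 = 0:  stroke / no stroke

sn = sensitivity(tp, fn)
sp = specificity(tn, fp)
print(f"Sn = {sn:.2f}, Sp = {sp:.2f}")  # Sn = 0.98, Sp = 0.07
```

At this cutoff almost everyone who went on to have a stroke was flagged, but so was almost everyone who did not.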

The likelihood ratio: Example from the original CHADS2 study
                        Stroke        No stroke
CHADS2 ≥ 1 (+ test)     92 (true+)    1521 (false+)
CHADS2 = 0 (- test)     2 (false-)    118 (true-)
• LR+ = 1.05 (95% CI 1.02-1.09)
• LR- = 0.29 (95% CI 0.07-1.18)
  o Note that the confidence interval crosses 1
• What do these likelihood ratios tell us?
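
For readers who want to see where these numbers come from, the sketch below recomputes both LRs from the 2x2 counts. The slide does not say how its confidence intervals were obtained; the code assumes the common log-scale (delta method) approximation, which happens to reproduce the quoted intervals. The function name `likelihood_ratios` is illustrative only. From the unrounded counts the LR- point estimate comes out at roughly 0.296, which the slide reports as 0.29.

```python
import math


def likelihood_ratios(tp, fp, fn, tn, z=1.96):
    """Return {"LR+": (estimate, lower, upper), "LR-": (...)} from 2x2 counts.

    Confidence limits use the log-scale approximation (an assumption here,
    not something stated on the slide).
    """
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    lr_pos = sn / (1 - sp)
    lr_neg = (1 - sn) / sp

    # Standard errors of ln(LR+) and ln(LR-).
    se_pos = math.sqrt(1 / tp - 1 / (tp + fn) + 1 / fp - 1 / (fp + tn))
    se_neg = math.sqrt(1 / fn - 1 / (tp + fn) + 1 / tn - 1 / (fp + tn))

    def with_ci(lr, se):
        return lr, lr * math.exp(-z * se), lr * math.exp(z * se)

    return {"LR+": with_ci(lr_pos, se_pos), "LR-": with_ci(lr_neg, se_neg)}


# Counts from the CHADS2 slide.
chads2_lrs = likelihood_ratios(tp=92, fp=1521, fn=2, tn=118)
for name, (estimate, lower, upper) in chads2_lrs.items():
    print(f"{name} = {estimate:.3f} (95% CI {lower:.2f}-{upper:.2f})")
```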

A likelihood ratio needs to be applied to prior knowledge to be useful!
• LR without pre-test probability = as good as RRR without absolute baseline risk
• How do I find out my patient’s pre-test probability?
  o Cohort studies of representative individuals with the condition of interest

Applying the CHADS2 ≥ 1 or = 0 LRs
• Remember that
  o If CHADS2 = 0, LR for stroke is 0.29 (95% CI 0.07-1.18)
  o If CHADS2 ≥ 1, LR for stroke is 1.05 (95% CI 1.02-1.09)
• We will use the overall 1-year risk of stroke in this afib cohort, which was 5.4%, as our pre-test probability

Applying the LR to the pre-test probability and spitting out a post-test probability
3 ways:
1. Calculate by hand
2. Use a nomogram
3. Cheat and use MedCalc’s ‘post-test probability’ calculator or a similar online calculator
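
For the by-hand route, the standard steps are: convert the pre-test probability to odds, multiply by the LR, and convert back to a probability. As a worked illustration, using the 5.4% pre-test probability and the LR- point estimate of 0.29 from the CHADS2 example (the symbols $p$ and $O$ are notation introduced here, not taken from the slides):

```latex
\begin{align*}
O_{\text{pre}}  &= \frac{p_{\text{pre}}}{1 - p_{\text{pre}}}
                 = \frac{0.054}{0.946} \approx 0.0571 \\
O_{\text{post}} &= O_{\text{pre}} \times \mathrm{LR}
                 \approx 0.0571 \times 0.29 \approx 0.0166 \\
p_{\text{post}} &= \frac{O_{\text{post}}}{1 + O_{\text{post}}}
                 \approx \frac{0.0166}{1.0166} \approx 0.016 \quad (\text{about } 1.6\%)
\end{align*}
```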

Applying the CHADS2 score
• Pre-test probability = 5.4%
  o If CHADS2 = 0, LR = 0.07-1.18
  o If CHADS2 ≥ 1, LR = 1.02-1.09
• Post-test probability
  o If CHADS2 = 0: 0.4% to 6.3%
  o If CHADS2 ≥ 1: 5.5% to 5.9%
• Does a CHADS2 score of 0 identify afib patients who do not need to be on warfarin?
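
The post-test ranges on this slide can be reproduced by pushing each end of the LR confidence intervals through the odds conversion. A minimal sketch, with an illustrative function name (`post_test_probability`) that is not from the slides:

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability -> odds, apply the LR, convert back to probability."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


PRE_TEST = 0.054  # overall 1-year stroke risk in the cohort, per the slides

# Lower and upper 95% confidence limits of each LR, as quoted on the slide.
lr_limits = {
    "CHADS2 = 0": (0.07, 1.18),
    "CHADS2 >= 1": (1.02, 1.09),
}

post_test_ranges = {
    label: tuple(post_test_probability(PRE_TEST, lr) for lr in limits)
    for label, limits in lr_limits.items()
}

for label, (low, high) in post_test_ranges.items():
    print(f"{label}: {low:.1%} to {high:.1%}")
```

The range for CHADS2 = 0 straddles the 5.4% starting point, which is another way of saying the LR- confidence interval crosses 1.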

5. Transportability
• Across time
• Across the world
• Across settings
• Across methodologies
• Across spectra of disease
• Across follow-up durations

Let’s critically appraise HAS-BLED

1. Study population: HAS-BLED
Source
  Derivation: prospective cohort (for other purpose)
  Validation: prospective cohort
Setting
  Derivation:
    • 35 European countries
    • Inpatients & outpatients
    • University & non-university hospitals
    • Specialized & non-specialized centers
  Validation:
    • Switzerland
    • Single university hospital
    • Inpatients & outpatients
Eligibility criteria
  Derivation:
    • ≥ 18 y
    • ECG/Holter-proven AF during admission or in preceding year
  Validation:
    • ≥ 18 y
    • Receiving OAC at time of discharge or at presentation to ambulatory clinic

1. Study population: HAS-BLED
Patients (SCRAPP)
  Derivation: n = 3978, OAC+antiplatelet, antiplatelet alone or no antithrombotic therapy
    • Mean age ~70 y
    • Female ~45%
    • Afib 100%
      o Mean CHADS2 ~2
    • HTN 70%
    • Renal failure 5%
    • Previous CVA 9%
    • Prior major bleed 1.5%
    • ETOH abuse 4%
  Validation: n = 515
    • Median age ~71 y
    • Female ~35%
    • Afib ~60%
    • PE ~15%
    • Artificial heart valve 8%
    • HTN ~60%
    • Previous CVA ~15%
    • Hx of GI bleed 5%
    • ≥ 1 antiplatelets 30%
    • ETOH abuse 8%
    • 64% started OAC ≥ 3 months before enrolment
Adequate proportion of patients with predictor variables
  Derivation: unlikely for many (e.g. prior major bleed, renal failure, SBP >160)
  Validation: N/A

2. Predictor variables: HAS-BLED
Derivation: description of all predictor variables considered, including those that did not make it into the final tool
• Age >65 y
• Female sex
• Diabetes mellitus
• HF
• COPD
• Valvular heart disease (presence of any regurgitation or gradient over a valve with hemodynamic significance and/or related symptoms)
• Kidney failure
• Prior major bleeding episode
• Clopidogrel use
• OAC
• ETOH use
• HTN
• Thyroid disease
• Liver failure (not in candidate variables section)
• Anemia (not in candidate variables section)
• ASA use (not in candidate variables section)
• Labile INR (not in candidate variables section)

2. Predictor variables: HAS-BLED
Letter: definition in derivation study (points)
• H: Hypertension (SBP >160 mmHg) (1)
• A: Abnormal renal and/or liver function (1 or 2; 1 each)
  o Abnormal renal function: SCr ≥ 200 µmol/L, chronic dialysis or renal transplantation
  o Abnormal liver function: chronic hepatic disease or biochemical evidence of significant hepatic derangement (e.g. bili >2x UNL with ALT/AST/AlkPhos >3x UNL, etc.)
• S: Stroke (previous history, especially lacunar) (1)
• B: Bleeding history or predisposition (anemia) (1)
• L: Labile INR (time in therapeutic INR <60%) (1)
• E: Elderly (>65 y) (1)
• D: Drugs (1 or 2; 1 each)
  o Antiplatelets or NSAIDs

2. Predictor variables: HAS-BLED
Clear and reproducible definition (derivation study)
  Derivation: for some predictor variables but not others (see previous slide)
Comparability of predictor variables in validation study to original derivation definitions
  Validation: assessment of INR lability omitted; other variables defined and assessed as per standard practice at site (unlikely to differ from our practice)
Reliability
  Not tested, but variables with strict definitions should be fine; others (e.g. definition of liver dysfunction) may have wide inter-rater variability
Availability at point of decision
  Labile INR not usually available; the rest is information from history and simple lab measures
Blind assessment
  Yes (prospective cohort)

3. Outcome: HAS-BLED
Clinical importance
  Major bleeding: associated with significant morbidity & mortality
Clear & reproducible definition (derivation study)
  Derivation: defined as bleeding involving any of the following:
    • Hemorrhagic stroke (focal neuro deficit of sudden onset + dx by neurologist + lasting >24 h + caused by bleeding)
    • Hospitalization
    • Hb drop ≥ 20 g/L
    • Blood transfusion
Comparability of validation outcome to original derivation outcome
  Validation: defined as:
    • Fatal bleeding (any death occurring ≤ 7 days after a major bleed in the absence of an alternate cause of death)
    • Symptomatic bleed in a critical area or organ (intracranial, intraspinal, intraocular with threat to vision, retroperitoneal, intra-articular, pericardial, intramuscular with compartment syndrome)
    • Bleed causing Hb drop ≥ 20 g/L
Blind assessment
  Yes (prospective cohort)

4. Accuracy: HAS-BLED
Description of multivariate analysis (derivation)
• Multivariate logistic regression
• Only kidney failure, prior major bleeding & age >65 y had statistically significant predictive power in multivariate analysis (maybe the score should be called ABE)
• Despite nonsignificant results for the other variables, the investigators went ahead and included them in the CPR anyway

4. Accuracy: HAS-BLED
Handling of missing data
  Unknown (used standard collection form; unlikely to be much data missing)
Overfitting (derivation study)
  Derivation: very likely to be overfitted; at least 13 candidate variables were tested on 53 outcome events (rule of thumb is 1 candidate variable per 10 events)
  Validation: N/A
Discrimination
  Derivation: C-statistic = 0.72 in overall population, 0.68 in OAC-only group, 0.78 in OAC+antiplatelet group
  Validation: C-statistic = 0.57 (0.49-0.68) in overall population, 0.58 (0.49-0.68) in afib patients
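
The overfitting concern on this slide is simple arithmetic: outcome events divided by candidate variables, compared against the 10-events-per-variable rule of thumb the slide quotes. A minimal sketch with illustrative names:

```python
def events_per_variable(outcome_events, candidate_variables):
    """Outcome events available per candidate predictor tested."""
    return outcome_events / candidate_variables


def max_candidate_variables(outcome_events, events_needed_per_variable=10):
    """Largest number of candidates the rule of thumb would support."""
    return outcome_events // events_needed_per_variable


# Figures quoted on the slide for the HAS-BLED derivation study.
EVENTS = 53
CANDIDATES = 13  # "at least 13", so the true ratio may be even lower

epv = events_per_variable(EVENTS, CANDIDATES)
supported = max_candidate_variables(EVENTS)
print(f"{epv:.1f} events per candidate variable; "
      f"rule of thumb supports about {supported} candidates")
```

With roughly 4 events per candidate variable, the derivation tested more than twice as many candidates as the rule of thumb would allow.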

4. Accuracy: HAS-BLED
Calibration
  Derivation: not assessed (no formal statistical model due to way score was created)
    Rates by score:
    • 0: 1.13%
    • 1: 1.02%
    • 2: 1.88%
    • 3: 3.74%
    • 4: 8.70%
    • 5: 12.5%
    • 6: 0%
    • 7-9: no patient had this score
  Validation: not assessed (no formal statistical model due to way score was created)
    Rates by score:
    • 0: 2.8%
    • 1-2: 6.9%
    • ≥ 3: 9.5%

5. Transportability: HAS-BLED
Historical
  Derivation: 2003-2004
  Validation: 2008-2009
Geographic
  Derivation: wide European spread
  Validation: Switzerland
Domain
  Inpatient & outpatient
Methodologic
  Derivation: data gathered as part of larger cohort
  Validation: data gathered by research nurse using standardized forms
Spectrum
  Derivation: general afib patients, with or without antithrombotic therapy, nothing specific to certain subgroups (e.g. CKD, post-PCI)
  Validation: general patients requiring oral anticoagulation for any indication
Follow-up interval
  1 year
If the score performed well in steps 1-4 (hint: it really didn’t), the score could be applied to modern general European patients with an indication for antithrombotic therapy (but not necessarily receiving any) to predict 1-year risk of a major bleeding event.