Establishing Efficacy through Randomized Controlled Clinical Trials
Ernst R. Berndt, Ph.D.
MIT and NBER

Goal of Today’s Presentation
• Help you to become critical, informed readers of reports/articles from randomized controlled trials (“RCTs”), and of associated research proposals
• Realize that “the devil is in the details”

Background
Change in health status (symptom & disease metrics) = f(baseline health status, demographics, medical intervention, lifestyle, education, comorbidities, other concomitant medications, history of non-responsiveness to treatment (TX), host of other factors)
or, in shorthand, $\Delta HS = f(HS_0, MI, \text{everything else})$, where HS is health status, HS_0 its baseline value, and MI the medical intervention
Issue: How to measure HS / MI?

Possibilities & Problems
• Retrospective data and use of multivariate statistical analysis
• Prospective “naturalistic” data and use of multivariate statistical analysis
• Prospective randomized controlled (double-blinded?) trial
• Other?

Why Randomize?
• Members of the treatment (“TX”) and control groups tend to be comparable on all variables, known and unknown
• Provides basis for statistical analysis
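The claim that randomized groups tend to be comparable on known and unknown variables can be illustrated with a small simulation. This is only a sketch under stated assumptions: the patients are synthetic, the covariates `age` and `unknown` and the function names `randomize` and `mean_gap` are hypothetical, and assignment is a simple unstratified coin flip rather than any particular trial's procedure.

```python
import random
import statistics


def randomize(n_patients, seed=0):
    """Assign hypothetical patients to TX or control by a fair coin flip."""
    rng = random.Random(seed)
    # Each simulated patient has one measured covariate (age) and one
    # covariate the investigators never observe ("unknown").
    patients = [
        {"age": rng.gauss(55, 12), "unknown": rng.gauss(0, 1)}
        for _ in range(n_patients)
    ]
    tx, control = [], []
    for patient in patients:
        (tx if rng.random() < 0.5 else control).append(patient)
    return tx, control


def mean_gap(tx, control, key):
    """Absolute difference in group means for one covariate."""
    tx_mean = statistics.fmean([p[key] for p in tx])
    control_mean = statistics.fmean([p[key] for p in control])
    return abs(tx_mean - control_mean)


if __name__ == "__main__":
    # Compare balance at several (illustrative) trial sizes.
    for n in (50, 500, 20000):
        tx, control = randomize(n)
        print(n, round(mean_gap(tx, control, "age"), 2),
              round(mean_gap(tx, control, "unknown"), 2))
```

In runs like this the gaps tend to shrink as the number of patients grows, for the unmeasured covariate as well as the measured one. In a small trial a noticeable imbalance can still occur by chance.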

Pitfalls to Randomization
• Timing (before vs. after patient enters trial)
• Keeping track of drop-outs (non-random?)
  – “intent to treat” analysis
  – “last observation carried forward”
• Blinding
• Randomization need not generate “median person”
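The two drop-out conventions named above can be sketched in a few lines. This is a minimal illustration, assuming each patient record holds the arm the patient was randomized to and a list of visit scores with `None` for missed visits. The record layout and the function names `locf` and `itt_endpoint_means` are hypothetical, and real trials specify these rules in the protocol.

```python
def locf(visits):
    """Last observation carried forward: replace each missing score (None)
    with the most recent observed score. Leading gaps stay None."""
    filled = []
    last_seen = None
    for score in visits:
        if score is not None:
            last_seen = score
        filled.append(last_seen)
    return filled


def itt_endpoint_means(patients):
    """Intent-to-treat style summary: group every patient by the arm they
    were ASSIGNED to, using the LOCF-imputed final score as the endpoint."""
    finals_by_arm = {}
    for patient in patients:
        final_score = locf(patient["visits"])[-1]
        # Simplification: a patient with no observation at all has nothing
        # to carry forward and is skipped here, which strict ITT would not do.
        if final_score is not None:
            finals_by_arm.setdefault(patient["assigned"], []).append(final_score)
    return {arm: sum(scores) / len(scores) for arm, scores in finals_by_arm.items()}


# Made-up symptom scores (lower is better); two patients drop out early.
patients = [
    {"assigned": "TX", "visits": [24, 18, 12]},
    {"assigned": "TX", "visits": [26, 22, None]},
    {"assigned": "control", "visits": [25, 24, 23]},
    {"assigned": "control", "visits": [27, None, None]},
]
print(itt_endpoint_means(patients))  # {'TX': 17.0, 'control': 25.0}
```

Carrying the last value forward keeps drop-outs in the analysis, but it assumes their status would not have changed after they left. If drop-out is non-random, that assumption can bias the comparison in either direction.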

On External Validity
• What affects generalizability of study findings to population as a whole?
  – Exclusionary & inclusionary criteria
• How to assess generalizability?
  – Any rigorous test?
• Hawthorne effect present?

P-Values and Significance Levels
Definitions:
                                   Observe    Hypothesize
  MI does make a difference          O+           H+
  MI does not make a difference      O-           H-
• Alpha (significance level): P(O+ | H-) -- false-positive rate, often .05 or .01 -- aka Type I error; a result is called significant when its P-value falls below alpha
• Beta: P(O- | H+) -- false-negative rate, aka Type II error; for given n, a lower alpha error rate => higher beta
• Sensitivity: P(O+ | H+) -- true-positive rate, 1 - beta or “power”
• Specificity: P(O- | H-) -- true-negative rate
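The trade-off between alpha and beta for a given n can be made concrete with a small calculation. The sketch below assumes a two-sided z-test comparing the means of two equal-sized arms with a known common standard deviation. The function names and the example numbers are illustrative assumptions, not values from any trial.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_two_arm(delta, sigma, n_per_arm, alpha=0.05):
    """P(O+ | H+): chance of a significant result when the true difference
    in means is delta (two-sided z-test, known sigma, equal arm sizes)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard-error units.
    shift = delta / (sigma * sqrt(2 / n_per_arm))
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def beta_two_arm(delta, sigma, n_per_arm, alpha=0.05):
    """P(O- | H+): false-negative probability, 1 - power."""
    return 1 - power_two_arm(delta, sigma, n_per_arm, alpha)


if __name__ == "__main__":
    # Same trial size, two significance levels (illustrative numbers).
    for alpha in (0.05, 0.01):
        print(alpha, round(beta_two_arm(0.5, 1.0, 64, alpha), 3))
```

Holding delta, sigma and n fixed, the stricter alpha gives the larger beta, which is the trade-off stated in the definitions. Setting delta to zero makes the "power" equal to alpha, since every significant result is then a false positive.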

Implications
• P-value does not directly tell us the probability that H+ is true
• Classical statisticians “reject the H-” but cannot “accept the H+” (but Bayesians can go further than classical statisticians …)
• For given beta, why do “head to head” RCTs require larger sample sizes than “placebo-controlled” trials?
• Difference between “statistical significance” and “practical importance”
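Two of these points lend themselves to a short numerical sketch. The first function applies Bayes' rule to show that P(H+ | O+) depends on a prior probability that the P-value alone does not supply. The second uses the textbook normal-approximation sample-size formula for comparing two means. One common answer to the head-to-head question is that the expected difference between two active treatments is smaller than the difference between an active treatment and placebo. The priors, effect sizes and function names here are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_h_plus_given_o_plus(prior, power=0.8, alpha=0.05):
    """Bayes' rule: P(H+ | O+) from the prior P(H+), power P(O+ | H+)
    and false-positive rate alpha P(O+ | H-)."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


def n_per_arm(delta, sigma, alpha=0.05, beta=0.20):
    """Approximate patients per arm needed to detect a true mean difference
    delta with a two-sided z-test at level alpha and power 1 - beta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    # Same significant result, different priors on the MI working.
    for prior in (0.5, 0.1, 0.01):
        print(prior, round(prob_h_plus_given_o_plus(prior), 2))
    # Hypothetical effect sizes: drug vs. placebo, then drug vs. drug.
    print(n_per_arm(delta=0.5, sigma=1.0), n_per_arm(delta=0.25, sigma=1.0))
```

Because the required n scales with 1/delta squared, halving the expected difference roughly quadruples the sample size at the same alpha and beta.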

Related P-Value Issues
• Subgroups – must they be specified ahead of time?
  – Data mining and data dredging
  – Bonferroni adjustments to overall distinct subgroup P-values
• Stopping rules
  – Sequential design vs. chance result
  – FDA “confirmatory trials”
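The motivation for a Bonferroni adjustment can be shown with a short calculation: testing many subgroups at an unadjusted alpha inflates the chance of at least one false positive. The sketch below assumes independent subgroup tests for the family-wise error calculation. The function names and the P-values are made up for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent
    tests, each run at level alpha, when no true effect exists."""
    return 1 - (1 - alpha) ** n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Bonferroni adjustment: compare each P-value with alpha divided by
    the number of tests instead of with alpha itself."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical P-values from five subgroup analyses.
subgroup_p_values = [0.004, 0.03, 0.2, 0.012, 0.049]

print(round(familywise_error_rate(0.05, 10), 3))        # 0.401
print(significant_after_bonferroni(subgroup_p_values))  # [True, False, False, False, False]
```

Four of the five made-up subgroups fall below the unadjusted .05 level, but only one survives the adjusted threshold of .01. The adjustment is conservative, which is one reason protocols prefer to pre-specify a small number of subgroups.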

Internal Validity
• Does credible physiological theory suggest mechanism of action?
• Persuasive studies in animals?
• Replicated in other human studies?
  – FDA requirement for two RCTs
• Study undertaken by team with good credentials?
• Was publication in peer-reviewed journal?

Single vs. Multi-Site Trials
• Multi-site (MS) design accelerates patient recruitment
• Less homogeneity of patients and treatments in MS trials, but greater external validity
• MS trials typically simpler
• MS trials give evidence of reproducibility

Quality Control of RCTs
• Very detailed protocol, including endpoint metrics, a hallmark of a good study
  – done ex ante, not ex post
  – Use of “surrogate markers” / intermediate outcomes
• Accuracy of case report forms
• Trial design isolates effect of MI?
• What if randomization “fails” in TX vs. control group on some dimensions?
  – Additional statistical analysis
  – Site effects?
• Keeping track of drop-outs, and reasons for drop-outs
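One way to check whether randomization "failed" on some dimension is to compare baseline covariates across arms with a standardized mean difference. This is a sketch of one common diagnostic, not a method the slides prescribe. The function name, the baseline ages, and the 0.1 rule-of-thumb cutoff mentioned in the comments are assumptions for illustration.

```python
from math import sqrt
import statistics


def standardized_mean_difference(tx_values, control_values):
    """Difference in arm means for one baseline covariate, divided by the
    pooled standard deviation so covariates on different scales compare."""
    mean_diff = statistics.fmean(tx_values) - statistics.fmean(control_values)
    pooled_sd = sqrt(
        (statistics.variance(tx_values) + statistics.variance(control_values)) / 2
    )
    return mean_diff / pooled_sd


# Made-up baseline ages in a very small trial.
tx_ages = [60, 70, 80]
control_ages = [50, 60, 70]

smd = standardized_mean_difference(tx_ages, control_ages)
# Assumed rule of thumb: an absolute value above about 0.1 is often taken
# as a signal to adjust for the covariate in additional analysis.
print(smd)  # 1.0
```

A large value on an important covariate does not invalidate the trial, but it suggests the additional statistical analysis the slide mentions, for example adjusting for that covariate or examining site effects.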

Related Class Topics
• Efficacy vs. effectiveness
• Use of clinical research organizations (CROs) in outsourcing clinical trials
• Regulations on disseminating efficacy findings from RCTs
• Use of retrospective medical claims databases
• Other outcomes (quality of life, costs, economic benefits)