Evidence and P-Values Given Multiple Endpoints, Multiple Comparisons and Multiple Studies

Huque@cder.fda.gov
Division of Biometrics III, FDA
9201 Corporate Blvd, Rockville, Maryland

(Slides to be presented on December 31, 2000, at the Joint Statistical Meeting, New Delhi, sponsored by the International Indian Statistical Association (IISA))

Disclaimer: Views expressed here are those of the presenter and not necessarily of the U.S. Food & Drug Administration

Outline
• Part I: Regulatory concept of evidence (for adequate and well-controlled clinical studies)
• Part II: Multiplicity issue when “lumping”/combining clinical trials
• Part III: Multiplicity in clinical trials given multiple endpoints and treatment groups
  – Growing use of multi-step procedures and closure principle (some examples)
  – Issues and concerns
• Final comments

Approval of New Drug Applications (NDAs) Based On:
• “Substantial evidence of efficacy” for the intended use of the drug
• Benefit outweighs the risk: benefit-risk analysis

Federal Food, Drug, and Cosmetic Act Section 505(d): “Substantial Evidence”
• Evidence consisting of adequate and well-controlled investigations
• Includes clinical investigations by experts qualified by scientific training and experience to evaluate the effectiveness of the drug involved
• Fair and responsible evaluations/conclusions by such experts
• The drug will have the effect it purports or is represented to have under the conditions of use prescribed

21 CFR § 314.126 (a)-(e): Adequate and Well-Controlled Studies
Key Characteristics:
• clear statement of the objectives
• valid design and appropriate control
• proper selection of patients having the disease or condition
• proper treatment assignments (randomization)
• adequate measures to minimize bias (with respect to subjects, observers and analysis of data)
• well-designed and reliable methods of measuring responses
• the design is such that an analysis of data is possible that will be adequate to assess the effects of the drug

Evidence for NDA (New Drug Application) vs. post-NDA submissions may vary
• NDA
• New similar indication of an approved drug
• Change in dosage
• Change in formulation
• Pediatrics
• Generics

An Example: Type I Error Inflation When 2 Studies are Combined
• Consider 2 placebo-controlled trials with equal sample size, equal allocation, equal variance, and an endpoint compared by means
• Null hypothesis: assume the global null, i.e., $H_0: \delta_1 = 0,\ \delta_2 = 0$
• Test statistics for separate analyses: $T_1 \ge Z_{1-\alpha}$, $T_2 \ge Z_{1-\alpha}$
• Test statistic for pooled analysis: $T = (T_1 + T_2)/\sqrt{2} \ge Z_{1-\alpha^2}$
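
A rough numerical check of this inflation can be sketched as follows. It is only an illustration under stated assumptions: $T_1$ and $T_2$ are taken as independent standard normal under the global null, and the decision rule is assumed to be "both studies significant at $\alpha$, or the pooled statistic significant at $\alpha^2$". The function name `combined_rule_error` and the integration settings are hypothetical choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def combined_rule_error(alpha=0.025, steps=20000, upper=10.0):
    """Type I error of the assumed rule:
    (T1 >= z_{1-alpha} and T2 >= z_{1-alpha}) OR (T1 + T2)/sqrt(2) >= z_{1-alpha^2},
    with T1, T2 independent N(0, 1) under the global null."""
    c = STD_NORMAL.inv_cdf(1 - alpha)                  # per-study critical value
    s = sqrt(2) * STD_NORMAL.inv_cdf(1 - alpha ** 2)   # boundary for T1 + T2
    # P(A and B) = integral over t >= c of phi(t) * P(T2 >= max(c, s - t)),
    # approximated with the trapezoid rule on [c, upper].
    h = (upper - c) / steps
    both = 0.0
    for k in range(steps + 1):
        t = c + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        both += weight * STD_NORMAL.pdf(t) * (1 - STD_NORMAL.cdf(max(c, s - t)))
    both *= h
    # P(A) = P(B) = alpha^2, so P(A or B) = 2 * alpha^2 - P(A and B).
    return 2 * alpha ** 2 - both


if __name__ == "__main__":
    alpha = 0.025
    print("two-study rule alone:", alpha ** 2)
    print("two-study rule or pooled test:", combined_rule_error(alpha))
```

Under these assumptions the combined rule's error lies strictly between $\alpha^2$ and $2\alpha^2$, which is the inflation the slide refers to.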

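An Example of Type I Error Inflation: Pooling 2 Studies (After Testing for Single Studies)
[Figure: rejection regions in the $(T_1, T_2)$ plane, bounded by $T_1 = Z_{1-\alpha}$, $T_2 = Z_{1-\alpha}$ and $(T_1 + T_2)/\sqrt{2} = Z_{1-\alpha^2}$; the dashed area marks the Type I error inflation in the pooled analysis]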

Uncontrolled Decision Error Under Partial Null
Scenario: Consider 5 trials with 2 treatments, active and placebo; n=100 patients randomized to each arm. For studies i=1, …, 5, the control response rates $\pi_{0i}$ are the same for all trials, and $\delta_i = (\pi_{Ti} - \pi_{0i})$ are the true differences in response rates.
________________________________________________________________________________
Control   True Treatment    Prob. at least 2 Studies      Prob.# MA (Fixed Eff.)
Rate      Differences       Significant at $\alpha$=.025  Significant at $\alpha$=.05
________________________________________________________________________________
10%       (0, 0, 10%)       .0514                         .1119
10%       (0, 0, 0, 10%)    .3032                         .3814
50%       (0, 0, 10%)       .0312                         .0965
50%       (0, 0, 0, 10%)    .1200                         .1606
________________________________________________________________________________
# 10,000 simulated fixed-effect meta-analyses; the p-value for each MA is from the MH test
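
The "at least 2 studies significant" column can be approximated without simulation. The sketch below is illustrative only: it assumes each study is analyzed by a one-sided pooled-variance two-proportion z-test (the slides do not say which test was used), that studies are independent, and that the scenario is four null trials plus one trial with a 10% true difference. All function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(n, p):
    """Binomial(n, p) probabilities for k = 0..n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def rejection_prob(p_control, p_treat, n=100, alpha=0.025):
    """Exact probability that a one-sided pooled two-proportion z-test
    (an assumed choice of test) rejects, by enumerating all outcomes."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    pmf_c, pmf_t = binom_pmf(n, p_control), binom_pmf(n, p_treat)
    total = 0.0
    for xc in range(n + 1):
        for xt in range(n + 1):
            pooled = (xc + xt) / (2 * n)
            se = sqrt(pooled * (1 - pooled) * 2 / n)
            if se > 0 and (xt - xc) / n / se > z_crit:
                total += pmf_c[xc] * pmf_t[xt]
    return total


def prob_at_least_two(reject_probs):
    """P(at least 2 independent studies reject) given per-study probabilities."""
    p_none = 1.0
    for p in reject_probs:
        p_none *= 1 - p
    p_exactly_one = 0.0
    for i, p in enumerate(reject_probs):
        others = 1.0
        for j, q in enumerate(reject_probs):
            if j != i:
                others *= 1 - q
        p_exactly_one += p * others
    return 1 - p_none - p_exactly_one


if __name__ == "__main__":
    null_study = rejection_prob(0.10, 0.10)    # no true difference
    active_study = rejection_prob(0.10, 0.20)  # 10% true difference
    # Assumed scenario: four null trials and one trial with a true effect.
    print(prob_at_least_two([null_study] * 4 + [active_study]))
```

Because the test and the scenario are assumptions, the printed value should be read as a plausibility check rather than a reproduction of the table.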

Some Recent Work
• Some of these problems can be alleviated by placing consistency constraints on the p-values of the studies that do not show significance at the desired level
• Methodology: for multiple studies, obtain optimal “cut-off points” for p-values that protect the overall $\alpha$ and at the same time give some assurance about the consistency of results across studies, so that the overall result is interpretable

Controlling Decision Error Under Partial Null
Scenario: Consider 3 trials with 2 treatments, active and placebo; n=100 patients randomized to each arm. For studies i=1, 2, 3, the control response rates $\pi_{0i}$ are the same for all trials, and $\delta_i = (\pi_{Ti} - \pi_{0i})$ are the true differences in response rates.
________________________________________________________________________________
Control   True Treatment    Prob. at least 2 Studies      Prob.# at least 2 Studies
Rate      Differences       Significant at $\alpha$=.025  Significant at $\alpha$=.025
________________________________________________________________________________
50%       (0, 0, 30%)       .0492                         .0118
50%       (0, 0%, 20%)      .0415                         .0100
50%       (0, 0, 10%)       .0151                         .0038
________________________________________________________________________________
# and the 3rd study significant at $\alpha$=.25
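
The consistency-constrained rule in the last column can be written as a small decision function. This is a sketch, assuming one-sided p-values, three independent studies, and the reading "at least two studies significant at $\alpha$ and the remaining study significant at the looser level"; the names `meets_rule` and `global_null_error` are illustrative. The second function gives the rule's error under the global null only, assuming uniformly distributed p-values, which is a simpler setting than the partial nulls tabulated above.

```python
def meets_rule(p_values, alpha=0.025, alpha_consistency=0.25):
    """True if at least two p-values are <= alpha and every remaining
    p-value is <= alpha_consistency (the consistency constraint)."""
    significant = [p for p in p_values if p <= alpha]
    others = [p for p in p_values if p > alpha]
    return len(significant) >= 2 and all(p <= alpha_consistency for p in others)


def global_null_error(alpha=0.025, alpha_consistency=0.25):
    """P(rule met) for three independent studies whose p-values are
    uniform on (0, 1), i.e. under the global null."""
    all_three = alpha ** 3
    # Exactly two below alpha, the third between alpha and alpha_consistency.
    exactly_two = 3 * alpha ** 2 * (alpha_consistency - alpha)
    return all_three + exactly_two


if __name__ == "__main__":
    print(meets_rule([0.010, 0.020, 0.200]))  # third study is consistent
    print(meets_rule([0.010, 0.020, 0.400]))  # third study fails the constraint
    print(global_null_error())
```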

Type I Error (Decision Error) for Partial Nulls (Multiple Endpoints Case)
Each individual endpoint planned with power=.90 and $\alpha$=.025 (1-sided)
____________________________________________________________
        T=2         ------------- T=1 -------------
r       Rule 3/3    Rule 2/3    Rule 2/3*   Rule 3/3
____________________________________________________________
0       .020        .044        .006        .001
.4      .023        .046        .0196       .004
.8      .024        .039        .035        .009
1.0     .025
____________________________________________________________
*: 2/3 rule + the p-value for the 3rd endpoint < .15 (1-sided)

Multi-Step & Closure Principle for Handling Multiple Comparison/Multiple Endpoint Problems
• Closure principle used in recent developments of multi-step procedures
  – References: Peritz (1970), Marcus et al (1976), Westfall et al (1999 SAS pub.), Hsu & Berger (1999), Hochberg & Tamhane (1987), Peter Bauer (96, 97, 98), and others
• Allows the use of an $\alpha$-level test at each step to assure Type I FWE of $\alpha$
• Easy to set up and apply
• However, there are issues and concerns

Closure Principle (Peritz, 1970; Marcus et al, 1976)
• Let $W$ be the set of null hypotheses closed under intersection: $w_i, w_j \in W$ implies $w_i \cap w_j \in W$
• Any null hypothesis $w_b$ is tested and rejected at level $\alpha$ by means of a test if and only if all hypotheses $w$ that are included in $w_b$ ($w \subseteq w_b$), with $w \in W$, have been tested and rejected at level $\alpha$
• The probability of making no Type I error by this procedure is at least $(1 - \alpha)$
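
A minimal sketch of a closed testing procedure for a small number of elementary hypotheses follows. The choice of a Bonferroni test for each intersection hypothesis is an assumption made here for illustration (the slides leave the local test open), and the name `closed_test` is hypothetical. An elementary hypothesis is rejected only if every intersection hypothesis that involves it is rejected by its local level-$\alpha$ test.

```python
from itertools import combinations


def closed_test(p_values, alpha=0.05):
    """Closed testing over all intersections of the elementary hypotheses.

    Local test (an assumed choice): Bonferroni, i.e. the intersection of the
    hypotheses in `subset` is rejected when min(p) <= alpha / len(subset).
    Returns one boolean per elementary hypothesis."""
    m = len(p_values)
    indices = range(m)

    def local_reject(subset):
        return min(p_values[i] for i in subset) <= alpha / len(subset)

    rejected = []
    for i in indices:
        # Every intersection hypothesis that involves hypothesis i.
        containing_i = (
            subset
            for size in range(1, m + 1)
            for subset in combinations(indices, size)
            if i in subset
        )
        rejected.append(all(local_reject(subset) for subset in containing_i))
    return rejected


if __name__ == "__main__":
    print(closed_test([0.01, 0.04, 0.03], alpha=0.05))
```

With Bonferroni local tests this closed procedure coincides with Holm's step-down method; other local tests give other multi-step procedures. The enumeration grows as $2^m$, so the sketch is only practical for a handful of hypotheses.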

Applications of the Closure Principle - 3 Clinical Trial Examples
• Example 1: Tests for non-inferiority, statistical superiority, and clinical superiority, all in one trial - Morikawa & Yoshida (1995, JBS)
• Example 2: Tests for dose effects in a clinical trial with a control and K dose groups - Hsu & Berger (JASA, 1999), Hochberg & Tamhane (1987)
• Example 3: Tests for treatment effects in a two-arm clinical trial with an active and a placebo group having two endpoints Y1 and Y2 - JSM 1999

Example 1 - Level $\alpha$ tests for non-inferiority, statistical superiority, and clinical superiority, all in a single trial
For a clinical trial with an active and a control treated group, and $\delta_1 > 0$ and $\delta_2 > 0$:
$H_{03}: (\mu - \mu_0) \le -\delta_1$, $A_3: (\mu - \mu_0) > -\delta_1$ (clinical non-inferiority)
$H_{02}: (\mu - \mu_0) \le 0$, $A_2: (\mu - \mu_0) > 0$ (statistical superiority)
$H_{01}: (\mu - \mu_0) \le \delta_2$, $A_1: (\mu - \mu_0) > \delta_2$ (clinical superiority)
Define, for i=1, 2, 3, $H'_{0i} = \bigcap_{j \le i} H_{0j}$ and $A'_i = \bigcup_{j \le i} A_j$. Then $H'_{03} \subseteq H'_{02} \subseteq H'_{01}$. Therefore, a direct application of the closure principle makes the following test procedure a closed test procedure (next page).

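Example 1 (cont’d)
Let $L = (\bar{Y} - \bar{Y}_0) - t_{\nu;\,1-\alpha}\, S\, (1/n + 1/n_0)^{1/2}$, where $t_{\nu;\,1-\alpha}$ is the $(1-\alpha)$ quantile of the t distribution
Step 1 - Reject $H'_{03}$ if $\{L > -\delta_1\}$ and go to the next step, else stop
Step 2 - Reject $H'_{02}$ if $\{L > -\delta_1\} \cap \{L > 0\}$ and go to the next step, else stop. Note that $\{L > -\delta_1\} \cap \{L > 0\} = \{L > 0\}$
Step 3 - Reject $H'_{01}$ if $\{L > -\delta_1\} \cap \{L > 0\} \cap \{L > \delta_2\}$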
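
The three steps reduce to comparing one lower confidence limit against three thresholds. The sketch below takes the limit $L$ as an input rather than computing it, so no t quantile routine is needed; the function name `claims_supported` and the claim labels are illustrative, not from the slides.

```python
def claims_supported(lower_limit, delta1, delta2):
    """Apply the three nested steps to the lower confidence limit L.

    Returns the claims supported, in the order the steps are taken;
    testing stops at the first step that fails."""
    if delta1 <= 0 or delta2 <= 0:
        raise ValueError("delta1 and delta2 must be positive margins")
    claims = []
    # Step 1: non-inferiority, L > -delta1.
    if lower_limit <= -delta1:
        return claims
    claims.append("clinical non-inferiority")
    # Step 2: statistical superiority, L > 0.
    if lower_limit <= 0:
        return claims
    claims.append("statistical superiority")
    # Step 3: clinical superiority, L > delta2.
    if lower_limit > delta2:
        claims.append("clinical superiority")
    return claims


if __name__ == "__main__":
    # Hypothetical margins and limits, for illustration only.
    for limit in (-1.5, -0.5, 0.5, 2.5):
        print(limit, claims_supported(limit, delta1=1.0, delta2=2.0))
```

Because the hypotheses are nested, every step uses the same level-$\alpha$ limit and no adjustment of $\alpha$ is needed.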

Example 2 - Consider a trial with a control and K dose groups with unknown means $\mu_0, \mu_1, \mu_2, \ldots, \mu_K$
For i=1, 2, ..., K, and $\delta_1 > 0$, $\delta_2 > 0$, consider
$H_{0i}: (\mu_i - \mu_0) \le a$, $A_i: (\mu_i - \mu_0) > a$, where
$a = -\delta_1$ (setup for clinical non-inferiority trials)
$a = 0$ (setup for statistical superiority trials)
$a = \delta_2$ (setup for clinical superiority trials)
CTP using order statistics (next page)
Hsu & Berger (JASA, 1999), Hochberg & Tamhane (1987, Wiley)

Example 2 (Cont’d)
Let $T_i = \{(\bar{Y}_i - \bar{Y}_0) - a\} / \{S\,(1/n_i + 1/n_0)^{1/2}\}$, i=1, 2, ..., K.
Let $T_{(1)} \le T_{(2)} \le \ldots \le T_{(K)}$ be the K order statistics with critical values $T_{(1);\,1-\alpha}, T_{(2);\,1-\alpha}, \ldots, T_{(K);\,1-\alpha}$.
• Step 1: Reject $H_{(0K)}: (\mu_{(K)} - \mu_0) \le a$ in favor of $A_{(K)}: (\mu_{(K)} - \mu_0) > a$ if $T_{(K)} > T_{(K);\,1-\alpha}$ and go to Step 2, else stop.
• Step 2: Reject $H_{(0(K-1))}: (\mu_{(K-1)} - \mu_0) \le a$ in favor of $A_{(K-1)}: (\mu_{(K-1)} - \mu_0) > a$ if $T_{(K-1)} > T_{(K-1);\,1-\alpha}$ and go to Step 3, else stop.
…
• Step K: Reject $H_{(01)}: (\mu_{(1)} - \mu_0) \le a$ in favor of $A_{(1)}: (\mu_{(1)} - \mu_0) > a$ if $T_{(1)} > T_{(1);\,1-\alpha}$
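
The step-down logic can be sketched as a loop over the ordered statistics, largest first. The critical values for the order statistics depend on the joint distribution of the $T_i$ and are not given in the slides, so the sketch takes them as an input; the function name `step_down` and the numbers in the usage example are hypothetical.

```python
def step_down(test_statistics, critical_values):
    """Step-down test on the ordered statistics.

    test_statistics[i] is T_i for dose group i (0-based index).
    critical_values[k] is the critical value for the (k + 1)-th smallest
    order statistic, supplied by the caller.
    Returns the indices of rejected hypotheses in the order they are rejected."""
    if len(test_statistics) != len(critical_values):
        raise ValueError("one critical value per order statistic is required")
    # Group indices sorted so that order[k] has the (k + 1)-th smallest statistic.
    order = sorted(range(len(test_statistics)), key=lambda i: test_statistics[i])
    rejected = []
    for rank in range(len(order) - 1, -1, -1):  # start with the largest statistic
        group = order[rank]
        if test_statistics[group] > critical_values[rank]:
            rejected.append(group)
        else:
            break  # stop at the first non-rejection
    return rejected


if __name__ == "__main__":
    # Hypothetical statistics for K = 3 dose groups and made-up critical values.
    print(step_down([1.0, 3.0, 2.5], [1.7, 2.0, 2.2]))
```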

Example 3: Testing for treatment effects in a two-arm trial with an active and a placebo group having two endpoints Y1 and Y2
Null and alternative hypotheses - endpoint 1: $H_{01}: (\mu_1 - \mu_{01}) \le a_1$, $A_1: (\mu_1 - \mu_{01}) > a_1$
Null and alternative hypotheses - endpoint 2: $H_{02}: (\mu_2 - \mu_{02}) \le a_2$, $A_2: (\mu_2 - \mu_{02}) > a_2$
CTP: Reject $H_{01}$ in favor of $A_1$ if both hypotheses $H_{01}$ and $H_{01} \cap H_{02}$ are rejected, the latter by using a global test procedure (e.g., O’Brien’s GLS test). Similarly for $H_{02}$
CTP based on order statistics (next page)

Example 3 (Cont’d)
Define, for endpoints i=1, 2, $T_i = \{(\bar{Y}_i - \bar{Y}_{0i}) - a_i\} / \{S\,(1/n_i + 1/n_{0i})^{1/2}\}$
Let $T_{(1)} \le T_{(2)}$ be the order statistics for $T_1$ and $T_2$, and let $T_{(1);\,1-\alpha}$, $T_{(2);\,1-\alpha}$ be the critical values for these order statistics.
• Step 1: Reject $H_{(02)}: (\mu_{(2)} - \mu_{(02)}) \le a_{(2)}$ in favor of $A_{(2)}: (\mu_{(2)} - \mu_{(02)}) > a_{(2)}$ if $T_{(2)} > T_{(2);\,1-\alpha}$ and go to Step 2, else stop.
• Step 2: Reject $H_{(01)}: (\mu_{(1)} - \mu_{(01)}) \le a_{(1)}$ in favor of $A_{(1)}: (\mu_{(1)} - \mu_{(01)}) > a_{(1)}$ if $T_{(1)} > T_{(1);\,1-\alpha}$

Example 3 (Cont’d): An Ad-hoc Stepwise Procedure
Step 1: Reject $H_{01}: (\mu_1 - \mu_{01}) \le a_1$ in favor of $A_1: (\mu_1 - \mu_{01}) > a_1$ if $T_1 > t_{1;\,1-\alpha}$ and go to Step 2A. Otherwise, if $T_1 > t_{1;\,1-\alpha'}$ ($\alpha' > \alpha$), go to Step 2B, else stop
Step 2A: Reject $H_{02}: (\mu_2 - \mu_{02}) \le a_2$ in favor of $A_2: (\mu_2 - \mu_{02}) > a_2$ if $T_2 > t_{2;\,1-\alpha^{**}}$
Step 2B: Reject $H_{02}: (\mu_2 - \mu_{02}) \le a_2$ in favor of $A_2: (\mu_2 - \mu_{02}) > a_2$ if $T_2 > t_{2;\,1-\alpha^{*}}$
where $\alpha^{*} < \alpha^{**}$; $\alpha^{*}$ and $\alpha^{**}$ are determined by considering the correlation between the two endpoints and the Type I FWE

Multi-step and Closed Testing Procedures: Some Issues & Concerns
• Directional errors (Type III errors): a problem with two-sided tests, but OK with one-sided tests
• For closed testing procedures, adjusted p-values can be calculated (examples: Huque & Sankoh (1997), JBS; Westfall et al (1999 SAS pub.)). However, the raw and adjusted p-values may differ in rank ordering
• Confidence interval? Sample size and power?
• Different global test procedures, when closed to obtain a specific endpoint result, may give different results
• Tests in a closed testing procedure are generally correlated; bootstrap or permutation techniques are helpful
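
One way to see how adjusted p-values arise from a closed testing procedure: the adjusted p-value of an elementary hypothesis is the largest local p-value over all intersection hypotheses that involve it. The sketch below assumes Bonferroni local tests, which is a choice made here for illustration and not necessarily the method of the cited references; the name `closed_adjusted_p_values` is hypothetical.

```python
from itertools import combinations


def closed_adjusted_p_values(p_values):
    """Adjusted p-values from a closed test with Bonferroni local tests.

    Local p-value of an intersection over `subset` (assumed choice):
    min(1, len(subset) * min(p in subset)). The adjusted p-value for
    hypothesis i is the maximum local p-value over subsets containing i."""
    m = len(p_values)
    adjusted = []
    for i in range(m):
        worst = 0.0
        for size in range(1, m + 1):
            for subset in combinations(range(m), size):
                if i in subset:
                    local_p = min(1.0, size * min(p_values[j] for j in subset))
                    worst = max(worst, local_p)
        adjusted.append(worst)
    return adjusted


if __name__ == "__main__":
    print(closed_adjusted_p_values([0.01, 0.04, 0.03]))
```

With this particular local test the adjusted values keep the ordering of the raw p-values up to ties; with other global tests for the intersections the ordering can change, which is the concern raised in the second bullet.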

Final Comments
• “Substantial evidence” is key to decision making for NDAs
• In general, literature-based meta-analysis has not been helpful in this regard. Interpretation of the p-value is a major problem
• In combining studies, decision errors/Type I errors need to be controlled by proper design and analysis
• Multi-step procedures and the closure principle have in general been helpful in solving multiplicity problems for clinical trials. However, there are issues and concerns
• Use of these methods requires pre-specification at the protocol stage