Medical Epidemiology Critiquing the Research Literature

Critiquing the Research Literature
  • Fundamental questions to ask, systematically
  • Sample critical discussions of two research papers
      – Meningococcal disease and campus bar patronage
      – Diltiazem and mortality

Critiquing the Literature
  • Association: a statistical feature of comparison(s), with six possible explanations:
      – Causation, with exposure promoting disease
      – Chance
      – Selection bias
      – Measurement bias
      – Confounding variable(s)
      – Causation, with disease promoting appearance of the exposure
  Always ask: are there plausible alternative explanations for the data?

Critiquing the Literature
  • Fundamental Questions
      – Do the results claimed by the investigator depend on (explicit or implicit) comparison of different groups?
      – If so, did the research design include such comparison?
      – Are the results claimed by the investigator plausibly explainable by random chance?
      – Are the results the investigator claims to have found plausibly explainable by bias in selecting the study subjects?
      – Are the results the investigator claims to have found plausibly explainable by bias in measuring important variables?
      – Are the results the investigator claims to have found plausibly explainable by failure to control for confounding variables?
      – If the conclusions pertain to cause and effect, does the study design provide assurance that the purported cause actually preceded the supposed outcome?

Critiquing the Literature
What one needs to know
  • What did the researchers want to find out?
  • What study design was used?
      – Don’t take at face value the terms the researchers use to describe their own study design, as these may be misleading.
  • Was randomization used appropriately and successfully?
  • Is the population studied representative of an identifiable and interpretable population of patients?
      – For clinical application, is it representative of my patients?

Critiquing the Literature
  • Are subjects in the comparison group(s) similar, with respect to possibly confounding traits, to subjects in the case or experimental group?
      – Note that possibly confounding traits are variables related to those being compared between groups: health outcome(s) in clinical trials and cohort studies, past exposure(s) in case-control studies.
  • How are levels of exposure and end-points measured?
      – Are the measures valid and reliable?
      – Are they the same for control and case or experimental groups?

Critiquing the Literature
  • Was statistical analysis interpretable and correct?
      – What do the tables and figures mean?
          • Do they address the issues of scientific importance?
      – Were hypothesis tests, p-values, and confidence intervals used to distinguish systematic from chance variation?
          • Are they interpretable and correctly generated?
          • Were they adjusted for potential confounders?
      – What inferences do the authors make?
          • Are they logically correct?
          • Do they involve assumptions that go beyond what the results imply?

Critiquing the Literature
  • How do the various results of the study fit together, and with outside information and common sense?
  • What inferences can be made regarding
      – the study subjects,
      – the population from which the study subjects were drawn,
      – other populations of interest?

For your presentation
  • You may use PowerPoint, transparencies, or handouts. You should have at least a 1-page handout of the outline of your presentation.
  • Mention every point that is in your written critique.
  • Either give a quick overview of the article, pointing out important tables and graphs, and then go through your critique; or go through the article, critiquing as you go along.
  • Grades range from 60% to 100% (there have been lower grades). The grade for the written critique (½ of total) is usually very similar to the oral grade (¼). Participation is different.

2 Secrets
  • Go to the issue where the paper was published and see if there was an accompanying editorial. (JAMA has a page number.)
  • Go to the index of the journal for the 6-9 months after the paper was published and find the letters to the editor and the authors’ response.

Meningococcemia on Campus
Meningococcal Disease in Urbana-Champaign College Students, January 1991 – April 1992
[Figure: timeline of cases by month, with the axis labeled 1991 and 1992. Legend: male, female, community college, fatality.]

Meningococcemia on Campus
  • CASE INVESTIGATION
      – All 8 recovered isolates serogroup C
      – All 9 cases patronized campus bars within 2 weeks prior to illness onset
      – 8 of 9 cases patronized Bar A within 2 weeks prior to illness onset

Meningococcemia on Campus
ANALYTIC STUDIES
  • 1:20 matched case-control study of last 7 cases
  • Nasopharyngeal carriage studies of:
      – university health service: February 1992, May 1992, January 1993
      – campus bar workers: March-May 1992, January 1993
  • Serogrouping of all N. meningitidis positive cultures, and multilocus enzyme electrophoresis (MEE) of all serogroup C isolates from cases and all carriage surveys

Meningococcemia on Campus
CASE-CONTROL STUDY*
  Exposure                                  Cases   Controls   OR     Exact mid-p
  Current cigarette smoking                 4/6     23/117     7.8    .012
  Campus bar exposure, prior two weeks
      Any                                   6/6     73/117     --     .034
      14 hours/week                         5/6     26/117     16.7   .002
      Bar A (any duration)                  5/6     19/117     23.1   .0006
  *Matched to cases by gender, school, year in school, calendar time of bar exposure. One matched set excluded at request of case.
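As a rough check on the case-control table, the crude (unmatched) odds ratio can be computed directly from the exposed/total counts. This is only a sketch: the helper name `crude_odds_ratio` is hypothetical, and because the study was matched, the reported ORs presumably come from an analysis that respects the matching, so the crude values will not reproduce them exactly.

```python
def crude_odds_ratio(cases_exposed, cases_total, controls_exposed, controls_total):
    """Crude (unmatched) odds ratio from exposed/total counts.

    Returns None when a zero cell makes the ratio undefined, as in the
    "Any" bar exposure row, where all 6 cases were exposed.
    """
    cases_unexposed = cases_total - cases_exposed
    controls_unexposed = controls_total - controls_exposed
    if cases_unexposed == 0 or controls_exposed == 0:
        return None
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Rows as read from the table: (label, cases exposed/total, controls exposed/total)
rows = [
    ("Current cigarette smoking", 4, 6, 23, 117),
    ("Any campus bar", 6, 6, 73, 117),
    ("14 hours/week", 5, 6, 26, 117),
    ("Bar A (any duration)", 5, 6, 19, 117),
]

for label, a, n_cases, c, n_controls in rows:
    odds_ratio = crude_odds_ratio(a, n_cases, c, n_controls)
    shown = "undefined" if odds_ratio is None else f"{odds_ratio:.1f}"
    print(f"{label}: crude OR = {shown}")
```

The crude values (about 8.2, 17.5, and 25.8) are somewhat larger than the reported 7.8, 16.7, and 23.1, but they point the same way. With only 6 cases, every estimate rests on one or two unexposed cases.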

Meningococcemia on Campus
CARRIAGE/MEE RESULTS
Meningococcal carriers, number (%)
  Source                                    All Strains    Case-Linked Strains
  Bar workers (n=296)
      1992, 5 bars (n=85)                   32 (37.6)      3 (3.5)
          Bar A (n=22)                      7 (31.8)       2 (9.1)
          Bar B (n=14)                      6 (42.9)       1 (7.1)
      1993, 11 bars (n=211)                 59 (28.0)      0 (0.0)
  Health center (n=1211)                    128 (10.6)     2 (0.2)
      1992 Recent bar patrons (n=411)       58 (14.1)      1 (0.2)
      1992 Others (n=456)                   28 (6.1)       1 (0.2)
      1993 Recent bar patrons (n=192)       37 (19.3)      0 (0.0)
      1993 Others (n=152)                   5 (3.3)        0 (0.0)

Meningococcemia on Campus
Multiple Logistic Model for 1992 Health Center Carriage Data
  Factor                                Estimated OR   Wald 95% CI for OR   p-value
  February vs. May                      1.08           0.93-1.26            .3196
  Alcohol consumption in past week                                          .0012
      1-6 drinks vs. none               1.56           1.19-2.03
      7-15 drinks vs. none              2.42           1.42-4.14
      >15 drinks vs. none               3.77           1.69-8.42
  Type II without antibiotic            2.88           1.71-4.85            .0001
  Bar patronage in past 2 weeks         1.94           1.15-3.25            .0122

Meningococcemia on Campus
CONCLUSIONS
  • Some cases in this outbreak may have followed transmission in Bar A.
  • Elevated levels of carriage of N. meningitidis in campus bar workers, and a campus bar environment, may participate in the spread of meningococcal disease among teens and young adults more generally.

Meningococcal Disease and Campus Bar Patronage
  • Do the results claimed by the investigator depend on (explicit or implicit) comparison of different groups?
      – Yes. Inference of elevated risk associated with exposure to campus bars, and Bar A in particular, depends on the claim of unusually high bar exposure among cases, which implies a reference group.
      – Inference of unusually high prevalence of the organism in Bar A workers implies comparison to some reference “usual” prevalence in the same or a related population.
  • If so, did the research design include such comparison?
      – Yes: explicit case-control design for the bar exposure comparison, and explicit cross-sectional design using workers at other bars and McKinley clients for the oropharyngeal prevalence surveys.

Are the results claimed by the investigator plausibly explainable by random chance?
  – Apparently not. The associations of Bar A with the causal organism and with the disease each generate quite low p-values, so neither association seems due to chance.
  – But one might argue that these p-values are misleadingly low, because any one of a number of bars, restaurants or other venues might have been unusually associated with the organism or the disease. Perhaps an adjustment for multiple comparisons should have been made. However, even a conservative adjustment for as many as 25 venues would have left both associations statistically significant at the conventional 5% level.
  – But one might argue that the investigation of bar attendance was motivated by the apparently unusually consistent presence of that variable among many. Perhaps one should adjust for the number of variables in the case histories.
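The multiple-comparisons argument can be checked with a short Bonferroni calculation. A minimal sketch, assuming the exact mid-p of .0006 reported for Bar A in the case-control table and the hypothetical count of 25 venues mentioned in the argument; the helper name `bonferroni` is ours. The slides do not give the p-value for the carriage association, so only the case-control result is checked here.

```python
def bonferroni(p_value, n_comparisons):
    """Bonferroni-adjusted p-value: multiply by the number of comparisons, cap at 1."""
    return min(1.0, p_value * n_comparisons)


# Bar A case-control p-value, adjusted as if 25 venues had been examined.
adjusted = bonferroni(0.0006, 25)
print(f"adjusted p = {adjusted:.4f}; below 0.05: {adjusted < 0.05}")
```

The adjusted value is 0.015, still under the conventional 5% level, which agrees with the claim for the case-control association.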

Are the results claimed by the investigator plausibly explainable by random chance?
However, beyond the statistical significance of either relationship individually, there is the further “coincidence” that the organism and the disease were not only each related to a campus bar, but related to the same bar.

Are the results the investigator claims to have found plausibly explainable by bias in selecting the study subjects?
  • Case-control study:
      – Controls were matched to cases by school, year, and sex.
      – But one case was dropped from the analysis.
      – UI controls were selected from the University phone book. Many listings were inaccurate.
      – Appropriate Parkland controls would have been randomly chosen 1991 2nd-year students living out of town. However, the controls used were selected from a list of 50 1992 2nd-year students living in town, selected by the college administration.
      – Only about 80% of targeted students were reached and gave interviews.
      – The above matter only if the lost case was less likely, or the potential controls missed were more likely, to be patrons of campus bars, especially Bar A, than other members of their respective groups.

Are the results the investigator claims to have found plausibly explainable by bias in selecting the study subjects?
  • Oropharyngeal prevalence survey:
      – McKinley patrons don’t perfectly mirror UI students. Those with health problems, possibly taking antibiotics that lower carriage, were oversampled. (Though other infections may promote carriage.)
  • Bar employee survey:
      – Employees studied were from selected bars whose owners cooperated, not from all bars. Even among employees of those bars, those studied were volunteers, and may not be representative of all workers from these same bars.
      – A (very) few bar samples were lost in shipment to CDC.
      – Thus, maybe the real comparison was between McKinley patrons who felt lousy and bar workers who felt like showing up at a meeting, whose samples made it to CDC.
  • These things matter only to the (unknown) extent that they are associated with carriage of N. meningitidis, particularly the case-linked strains.

Are the results the investigator claims to have found plausibly explainable by bias in measuring important variables?
  • Case-control study:
      – Though anonymity of responses was promised, controls were not anonymous. “Underage” controls may have been reluctant to acknowledge bar patronage.
      – Controls were asked to recall bar patronage as long as 5 months previously. While cases had their illness to fix memories, controls had no special reason to remember their precise behavior, and were asked about “usual” frequencies and “likely” patronage, rather than actual behavior.
      – However, results from controls were comparable to other UI surveys, including the McKinley Health Center carriage surveys, and such bias would have been expected to affect all bars equally (since the study was roughly age-matched via year in school).

Are the results the investigator claims to have found plausibly explainable by bias in measuring important variables?
  • Case-control study:
      – The two-week periods during which bar exposure was examined did not sufficiently mirror the behavior of the disease under study. While there is no firm consensus on the actual incubation period of meningococcal disease, many believe that some cases incubate for over two weeks, so that the periods of study were too short. Further, exposure to Bar A was reported by one of the cases only <24 hours before symptom onset. This exposure should not have been counted, as it was unlikely to have caused the illness.
      – But one has to choose some reasonable period and apply it equally to cases and controls. Control of Communicable Diseases in Man, an authoritative reference, lists the incubation period for meningococcal meningitis as 2-10 days, commonly 3-4 days. By this definition, the two-week period was liberal. The case patronizing Bar A the day before onset was queried in detail ten months later, and may also have been there earlier and forgotten.

Are the results the investigator claims to have found plausibly explainable by bias in measuring important variables?
  • Oropharyngeal prevalence surveys:
      – These were anonymous, and the requested behavioral data were recent.
      – However, samples from McKinley Health Center controls and bar employees were handled by different lab techniques, which can substantially affect results in culturing N. meningitidis.
          • A comparability study was built into one of the McKinley health surveys, and showed no bias.
      – Most critically, all the carriage studies were at the wrong time. All were done after 8 of the 9 cases had already occurred. Two of the three McKinley surveys and all of the bar employee surveys were done after much of the campus had been immunized. The critical 1992 surveys were done at McKinley at specific times in February and May, while the bar surveys were scattered from mid-March to mid-May.

Are the results the investigator claims to have found plausibly explainable by bias in measuring important variables?
But:
  • Immunization has no documented effect on meningococcal carriage, and none was detected in our carriage studies.
  • A case occurred in late January 1992, 2-3 weeks prior to the initial McKinley survey. An additional (fatal) case occurred in late April 1992, prima facie evidence that the causal organism was still present just prior to the final 1992 McKinley and bar worker surveys.
  • The duration of meningococcal carriage is thought to be relatively brief (several weeks to several months), but carriage over years has been documented (although this was done prior to the period when individual substrains were tracked, so that continuing carriage of the same substrain has not been proven).

Are the results the investigator claims to have found plausibly explainable by failure to control for confounding variables?
Candidate confounders: other variables that are both closely related to Bar A exposure and more strongly associated with disease than Bar A exposure.
  • e.g. different forms and/or sites of close interpersonal contact between the cases or other common individuals.
      – A modest but by no means exhaustive search was made, and nothing turned up.

Meningococcal Disease and Campus Bar Patronage
  • If the conclusions pertain to cause and effect, does the study design provide assurance that the purported cause actually preceded the supposed outcome?
      – No. Case-control and cross-sectional designs generally can’t do that.

Meningococcal Disease and Campus Bar Patronage
  • Conclusions
      – Ours: It is quite likely that exposure at Bar A caused several but by no means all of the cases in this cluster.
      – Yours: Up to you.

Diltiazem and Mortality
  • Objective: To determine whether long-term treatment with diltiazem in patients with prior MI would reduce total mortality (and recurrent cardiac events [MI or cardiac death])
  • Design: randomized, triple-blind clinical trial
  • Eligible patients: aged 25 to 75 with documented MI
  • Therapy: diltiazem or placebo, started 3-15 days after MI
  • Exclusions: indication for diltiazem or contraindications to diltiazem

Diltiazem and Mortality
  • Other outcomes assessed: one-year event rate for patients with 12 preselected covariates (24 subgroups).
  • Sample size estimation: 2000 total for 80% power to detect a 25% reduction in total mortality from a baseline mortality of 20%, with alpha of 0.05 (one-sided).
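The sample size statement can be examined with a standard normal-approximation formula for comparing two proportions. This is a sketch under stated assumptions, not the authors' actual calculation: it uses the unpooled-variance formula, equal allocation, no allowance for dropout or non-compliance, and a hypothetical helper name `n_per_group`.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control, relative_reduction, alpha, power, two_sided=False):
    """Approximate patients per arm needed to detect a relative reduction in an event rate.

    Normal approximation with unpooled variance:
        n = (z_alpha + z_beta)^2 * (p1*q1 + p2*q2) / (p1 - p2)^2
    """
    p_treated = p_control * (1 - relative_reduction)
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = std_normal.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Inputs taken from the slide: 20% baseline mortality, 25% reduction, 80% power.
one_sided = n_per_group(0.20, 0.25, alpha=0.05, power=0.80)
two_sided = n_per_group(0.20, 0.25, alpha=0.05, power=0.80, two_sided=True)
print(f"one-sided: {one_sided} per group, {2 * one_sided} total")
print(f"two-sided: {two_sided} per group, {2 * two_sided} total")
```

Under these assumptions the one-sided design needs roughly 710 patients per arm and the two-sided design roughly 900, so switching to a two-sided test raises the requirement by about a quarter. Both figures also depend entirely on the assumed 20% baseline mortality; a lower actual event rate would push the required numbers up.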

Diltiazem and Mortality
Results
  • 13,618 patients were eligible. 11,152 were excluded. 2,466 were enrolled (18%).
  • Baseline characteristics were similar (Table 1).
  • Follow-up duration was 12-52 months (mean 25).
  • Compliance was 63% at 2 years for diltiazem and placebo (pretty good).
  • The only serious side effects were bradycardia and hypotension (4% in the diltiazem group vs. 2% in placebo; also pretty good).

Diltiazem and Mortality
OUTCOMES
  • Total mortality was virtually identical (RR 1.02, CI 0.82-1.27).
  • Cardiac events: RR 0.90, CI 0.74-1.08.
  • Subgroup analysis was done using the 1-year event rate. The only significant variable was pulmonary congestion, which had a significant interaction with treatment (p = 0.0042; after Bonferroni, p = 0.0042 x 12 = 0.0504).

Diltiazem and Mortality
  • It seems that diltiazem was beneficial to patients without pulmonary congestion (RR = 0.77, CI 0.61-0.98), with no correction for multiplicity, and harmful in those with pulmonary congestion (RR = 1.41, CI 1.01-1.96).
  • It also seems that the higher the degree of pulmonary congestion, the higher the RR (dose-response relationship).
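One way to probe the two subgroup results is to compare them directly on the log scale, recovering each standard error from its confidence interval. This is an illustrative sketch, not the trial's own interaction test: it assumes the reported intervals are 95% intervals that are symmetric on the log scale, and the helper names `se_from_ci` and `interaction_z` are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95):
    """Standard error of log(RR), recovered from a CI assumed symmetric on the log scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (log(upper) - log(lower)) / (2 * z)


def interaction_z(rr_a, ci_a, rr_b, ci_b):
    """z statistic and two-sided p-value for the difference between two independent log RRs."""
    difference = log(rr_b) - log(rr_a)
    se_difference = sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = difference / se_difference
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Subgroup estimates from the slide: without vs. with pulmonary congestion.
z, p = interaction_z(0.77, (0.61, 0.98), 1.41, (1.01, 1.96))
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Under these assumptions the unadjusted p-value comes out at the same order of magnitude as the reported interaction p of 0.0042, so the subgroup difference is not an arithmetic artifact. Whether it survives correction for the number of subgroups examined is the separate question raised in the critique.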

Diltiazem and Mortality
Critique
  • No fault with design, randomization, blinding, bias, confounding (typical of a clinical trial in a major journal).
  • (In that setting the problems are usually with conclusions: “leap of faith”, misplaced emphasis, contradictions, items not reported or glossed over.)
  • Authors claim the primary end points were total mortality and cardiac events.
      – But actually the primary end point was only total mortality, and it was changed later. Also, cardiac events are a less useful clinical endpoint than mortality.

Diltiazem and Mortality
Critique
  • The authors claim that the 12 variables used as covariates were predetermined.
      – But only 42% of the patients received ejection fraction measurement. Might this variable have been added after the fact?
  • Multiplicity: 2 endpoints, 12 variables, 24 subgroups
  • Why the one-year event rate? Did they try other time periods?
  • Data monitoring committee
  • The CIs are very wide for both the groups with and without pulmonary congestion.

Objectives of Subgroup Analysis
  • Support the main finding
  • Check the consistency of the main finding
  • Address specific concerns regarding efficacy or safety in a specific subgroup
  • Generate hypotheses for future studies

Inappropriate Uses of Subgroup Analysis
  • Rescue a negative trial
  • Rescue a harmful trial
  • Data dredging: find interesting results without a prespecified plan or hypothesis

To Avoid Inappropriate Uses of Subgroup Analysis
  • Prespecify the analysis plan
  • Prespecify hypotheses to be tested, based on prior evidence
  • Plan adequate power in the subgroups
  • Avoid the previous pitfalls

Problems with Subgroup Analysis
  1. Low power
  2. Multiplicity
  3. Test for interaction
  4. Comparability of the treatment groups may be compromised
  5. Over-interpretation

Diltiazem and Mortality
Critique
  • Data-derived hypothesis
  • Power analysis
      – Should have been done using 2-sided p-values, like the rest of the study.
      – They should have made provisions for a lower outcome rate, as they should expect that.
  • The conclusion is not valid. The only conclusions that can be drawn are:

Diltiazem and Mortality
  • Diltiazem is not of major benefit (or harm) in preventing death or recurrent cardiac events after MI.
  • The effect of diltiazem at 1 year is different in patients with pulmonary congestion compared to those without pulmonary congestion.
      – The small number of events in these subgroups and the multiplicity of comparisons done preclude a meaningful conclusion regarding the benefit of diltiazem in patients without pulmonary congestion.
      – However, since the rules of evidence should be different with regard to harmful effect, we strongly caution against using diltiazem in patients with pulmonary congestion.

Are we asking for too much?

Compare that conclusion to this one

Here is the <7 weeks category. But where is the >12 weeks category?

Went too far the other way?

Went too far the other way?
  • Statistically significant
  • Dose response
  • Biologic plausibility
  • Consistent with animal studies
  • Consistent with preexisting hypothesis
  • Consistent with the hypothesis they set out to test

So what can you do?
  • Be skeptical. Yes.
  • Be cynical. Maybe.
  • Is medical literature any better than political commercials? Yes.
  • Is it better than newspaper articles? Yes. In what way? The article itself almost always includes all the facts you need to refute its own conclusion.