Chapter 18: Cross-Tabulated Counts, Part A (6/11/2021)
Chapter 18, Part A:
• 18.1 Types of Samples
• 18.2 Naturalistic and Cohort Samples
• 18.3 Chi-Square Test of Association
Types of Samples
I. Naturalistic Samples ≡ simple random sample or complete enumeration of the population
II. Purposive Cohorts ≡ select a fixed number of individuals in each exposure group
III. Case-Control ≡ select a fixed number of diseased and non-diseased individuals
Naturalistic (Type I) Sample: random sample of study base
• How did we study the relationship between CMV (the exposure) and restenosis (the disease) via a naturalistic sample?
• A population was identified and sampled.
• The sample was classified as CMV+ and CMV−.
• Disease occurrence (restenosis) was studied and compared in the groups.
Purposive Cohorts (Type II sample): fixed numbers in exposure groups
• How would we study CMV and restenosis with a purposive cohort design?
• A population of CMV+ individuals would be identified.
  – From this population, select, say, 38 individuals.
• A population of CMV− individuals would be identified.
  – From this population, select, say, 38 individuals.
• Disease occurrence (restenosis) would be studied and compared among the groups.
Case-Control (Type III sample): set number of cases and non-cases
• How would we study CMV and restenosis with a case-control design?
• A population of patients who experienced restenosis (cases) would be identified.
  – From this population, select, say, 38 individuals.
• A population of patients who did not restenose (controls) would be identified.
  – From this population, select, say, 38 individuals.
• The exposure (CMV) would be studied and compared among the groups.
Naturalistic Sample: Illustrative Example
• SRS, N = 585
• Cross-classify education level (categorical exposure) and smoking status (categorical disease)
• Tally R-by-C table ("cross-tab")

          Smoke?
Edu.      +     −    Tot
HS       12    38     50
JC       18    67     85
JC+      27    95    122
UG       32   239    271
Grad      5    52     57
Total    94   491    585
Cross-tabulation (cont.)

          Smoke?
Educ.     +     −    Tot
HS       12    38     50
JC       18    67     85
Some     27    95    122
UG       32   239    271
Grad      5    52     57
Total    94   491    585

Row margins: the Tot column (50, 85, 122, 271, 57)
Column margins: the Total row (94, 491)
Total: N = 585
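As a minimal sketch of how the margins are obtained, the following recomputes the row margins, column margins, and grand total from the cell counts in the cross-tab. The variable names are illustrative choices, not part of the slides.

```python
# Cell counts from the education-by-smoking cross-tab.
# Each tuple is (smoke +, smoke -) for one education level.
observed = {
    "HS": (12, 38),
    "JC": (18, 67),
    "Some": (27, 95),
    "UG": (32, 239),
    "Grad": (5, 52),
}

# Row margins: total count within each education level.
row_margins = {level: sum(counts) for level, counts in observed.items()}

# Column margins: total count within each smoking category.
column_margins = tuple(sum(column) for column in zip(*observed.values()))

# Grand total N.
grand_total = sum(column_margins)

print(row_margins)
print(column_margins)
print(grand_total)
```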
Cross-tabulation of counts
For uniformity, we will always:
• put the exposure variable in rows
• put the disease variable in columns
Exposure / Disease relationship
Use conditional proportions to describe relationships between exposure and disease.
Conditional Proportions: Exposure / Disease Relationship
R-by-2 Table

          +      −     Total
Grp 1    a1     b1     n1
Grp 2    a2     b2     n2
↓        ↓      ↓      ↓
Grp R    aR     bR     nR
Total    m1     m2     N

In naturalistic and cohort samples: row percents!
Example
Prevalence of smoking by education: lower education is associated with higher prevalence (negative association between education and smoking).
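The prevalence figures for this slide are not in the text, so here is a small sketch that recomputes them as row proportions (smokers divided by the row total) from the cross-tab counts. The names are illustrative.

```python
# (smokers, row total) for each education level, from the cross-tab.
counts = {
    "HS": (12, 50),
    "JC": (18, 85),
    "Some": (27, 122),
    "UG": (32, 271),
    "Grad": (5, 57),
}

# Row proportion: prevalence of smoking within each education level.
prevalence = {level: a / n for level, (a, n) in counts.items()}

for level, p in prevalence.items():
    print(f"{level:5s} {p:.4f}")
```

The two highest education levels (UG and Grad) show clearly lower prevalence than the three lower levels, which is the negative association the slide describes.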
Relative Risks
Let group 1 represent the least exposed group.
Illustration: RRs
Note the trend.
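The relative risks themselves are not in the text. The sketch below computes each group's prevalence relative to a reference group. It assumes the first row (HS) plays the role of group 1; that choice is an assumption for illustration, since the slide text does not name the reference row.

```python
# (smokers, row total) for each education level, from the cross-tab.
counts = {
    "HS": (12, 50),
    "JC": (18, 85),
    "Some": (27, 122),
    "UG": (32, 271),
    "Grad": (5, 57),
}

reference = "HS"  # assumed reference group (group 1); not stated in the slides

prevalence = {level: a / n for level, (a, n) in counts.items()}

# Relative risk of each group: its prevalence divided by the reference prevalence.
relative_risk = {level: p / prevalence[reference] for level, p in prevalence.items()}

for level, rr in relative_risk.items():
    print(f"{level:5s} RR = {rr:.2f}")
```

Under this assumption the RRs for UG and Grad fall well below 1, which is one way to read the trend the slide points to.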
k Levels of Disease
Efficacy of echinacea example. Randomized controlled clinical trial: echinacea vs. placebo in treatment of URI
• Exposure ≡ echinacea vs. placebo
• Disease ≡ severity of illness
Source: JAMA 2003, 290(21), 2824-30
Row Percents for Echinacea Example
Echinacea group fared slightly worse than placebo group.
Chi-Square Test of Association
A. H0: no association in population
   Ha: association in population
B. Test statistic
Observed

Degree   Smoke +   Smoke −   Tot
HS          12        38       50
JC          18        67       85
JC+         27        95      122
UG          32       239      271
Grad         5        52       57
Total       94       491      585
Expected

         Smoke +                     Smoke −                       Total
High S   (50 × 94) ÷ 585 = 8.034     (50 × 491) ÷ 585 = 41.966       50
JC       13.658                      71.342                          85
Some     19.603                      102.397                        122
UG       43.545                      227.455                        271
Grad     9.159                       47.841                          57
Total    94                          491                            585
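The expected counts follow the pattern shown in the first row: (row total × column total) ÷ N. A minimal sketch that rebuilds every cell from the margins (names are illustrative):

```python
# Margins from the observed table.
row_totals = {"HS": 50, "JC": 85, "Some": 122, "UG": 271, "Grad": 57}
col_totals = (94, 491)  # (smoke +, smoke -)
n = 585

# Expected count under H0 for each cell: row total * column total / N.
expected = {
    level: tuple(r * c / n for c in col_totals)
    for level, r in row_totals.items()
}

for level, (e_plus, e_minus) in expected.items():
    print(f"{level:5s} {e_plus:8.3f} {e_minus:8.3f}")
```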
Continuity Corrected Chi-Square
• Pearson's ("uncorrected") chi-square
• Yates' continuity-corrected chi-square
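The formulas for this slide are not in the text. For reference, the standard textbook forms of the two statistics are sketched below, with O_i the observed count and E_i the expected count in cell i. The notation is an assumption and may differ from the original slide.

```latex
% Pearson's ("uncorrected") chi-square
X^2_{\text{stat}} = \sum_{\text{all cells}} \frac{(O_i - E_i)^2}{E_i}

% Yates' continuity-corrected chi-square
X^2_{\text{stat},c} = \sum_{\text{all cells}} \frac{\left(\lvert O_i - E_i \rvert - \tfrac{1}{2}\right)^2}{E_i}
```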
Chi-Square Hand Calc.
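The hand calculation itself is not in the text. The sketch below mirrors it for Pearson's (uncorrected) statistic, summing (O − E)² ÷ E over all ten cells of the education-by-smoking table, with df = (R − 1)(C − 1). Names are illustrative.

```python
# Observed counts; each tuple is (smoke +, smoke -) for one education level.
observed = [
    (12, 38),    # HS
    (18, 67),    # JC
    (27, 95),    # Some
    (32, 239),   # UG
    (5, 52),     # Grad
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson's chi-square: sum of (O - E)^2 / E over all cells.
x2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        x2 += (o - e) ** 2 / e

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"X2 = {x2:.2f}, df = {df}")
```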
Chi-Square P-value
• X² stat = 13.20 with 4 df
• Table E: in the 4 df row, bracket the chi-square statistic, then look up the right-tail (P-value) regions
• Example: bracket X² stat between 11.14 (P = .025) and 13.28 (P = .01)
• .01 < P < .025

Right tail   0.975   0.25   0.20   0.15   0.10   0.05   0.025   0.01    0.005
df = 4       0.48    5.39   5.99   6.74   7.78   9.49   11.14   13.28   14.86
Illustration: X² stat = 13.20 with 4 df
The P-value = AUC in the tail beyond X² stat
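To check the table bracket numerically, the right-tail area can be computed directly. The sketch below uses the closed-form tail of the chi-square distribution, which exists only when df is even. The helper name `chi2_right_tail_even_df` is hypothetical, and for odd df a statistics library would be needed instead.

```python
import math


def chi2_right_tail_even_df(x, df):
    """Right-tail area P(X2 > x) for a chi-square distribution with even df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0    # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j   # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


p_value = chi2_right_tail_even_df(13.20, 4)
print(f"P = {p_value:.4f}")
```

The result falls between .01 and .025, in agreement with the bracket read from the table.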
WinPEPI > Compare2 > F1
Input screen (row 5 not visible)
Output
Chi-Square, cont.
1. How the chi-square works. When observed values = expected values, the chi-square statistic is 0. As the differences between observed and expected values grow, evidence against H0 mounts.
2. Avoid chi-square tests in small samples. Do not use a chi-square test when more than 20% of the cells have expected values that are less than 5.
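The small-sample rule in point 2 can be checked mechanically. A minimal sketch, applied to the education-by-smoking table (variable names are illustrative):

```python
# Observed counts; each tuple is (smoke +, smoke -) for one education level.
observed = [
    (12, 38),    # HS
    (18, 67),    # JC
    (27, 95),    # Some
    (32, 239),   # UG
    (5, 52),     # Grad
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under H0, flattened into one list of cells.
expected_cells = [r * c / n for r in row_totals for c in col_totals]

# Share of cells whose expected count is below 5.
share_below_5 = sum(e < 5 for e in expected_cells) / len(expected_cells)

# Rule from the slide: avoid the test when more than 20% of cells have E < 5.
chi_square_ok = share_below_5 <= 0.20

print(f"smallest expected count: {min(expected_cells):.3f}")
print(f"share of cells with E < 5: {share_below_5:.0%}")
print(f"chi-square test acceptable by this rule: {chi_square_ok}")
```

For this table every expected count is above 5, so the rule does not rule out the chi-square test.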
Chi-Square, cont.
3. Supplement chi-squares with descriptive statistics. Chi-square statistics do not quantify effects.
4. For 2-by-2 tables, chi-square and z tests produce identical P-values.
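Point 4 can be illustrated numerically. The sketch below takes two rows of the education table (HS and Grad) as a 2-by-2 table purely for illustration, and compares the uncorrected Pearson chi-square (1 df) with the pooled two-proportion z test (two-sided). It assumes that "z test" on the slide means this pooled, uncorrected version; with a continuity correction on one side only, the match would not be exact.

```python
import math

# Illustrative 2-by-2 table built from two rows of the education table.
a1, n1 = 12, 50   # HS: smokers, row total
a2, n2 = 5, 57    # Grad: smokers, row total

# Pooled two-proportion z test (two-sided).
p1, p2 = a1 / n1, a2 / n2
p_pooled = (a1 + a2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_z = math.erfc(abs(z) / math.sqrt(2))      # 2 * P(Z > |z|)

# Uncorrected Pearson chi-square on the same 2-by-2 table.
observed = [(a1, n1 - a1), (a2, n2 - a2)]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
x2 = sum(
    (o - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
    for i, row in enumerate(observed)
    for j, o in enumerate(row)
)
p_chi = math.erfc(math.sqrt(x2 / 2))        # right tail of chi-square with 1 df

print(f"z^2 = {z ** 2:.4f}, X2 = {x2:.4f}")
print(f"P (z test) = {p_z:.4f}, P (chi-square) = {p_chi:.4f}")
```

The squared z statistic equals the chi-square statistic, so the two P-values coincide.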
Discussion and demo on power and sample size
• For estimation
• For testing
  – Power
  – Sample size