Chapter 10 Hypothesis Testing: Categorical Data Analysis
EPI 809/Spring 2008

Learning Objectives
1. Compare binomial proportions using the Z and χ² tests
2. Explain the χ² test for independence of 2 variables
3. Explain Fisher's test for independence
4. McNemar's test for correlated data
5. Kappa statistic
6. Use of SAS Proc FREQ

Data Types

Qualitative Data
1. Qualitative random variables yield responses that can be put in categories. Example: gender (male, female)
2. Measurement or count reflects the number in each category
3. Nominal (no order) or ordinal (order) scale
4. Data can be collected as continuous but recoded to categorical data. Example: systolic blood pressure (hypotension, normal tension, hypertension)

Hypothesis Tests Qualitative Data

Z Test for Differences in Two Proportions

Hypotheses for Two Proportions

Z Test for Difference in Two Proportions
1. Assumptions
   • Populations are independent
   • Populations follow binomial distribution
   • Normal approximation can be used for large samples (all expected counts ≥ 5)
2. Z-test statistic for two proportions
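
For reference, the usual pooled form of the two-proportion Z statistic can be written as below. The notation is assumed here rather than taken from the slides: X_i is the number of cases among the n_i people sampled in group i, \hat{p}_i is the sample proportion, and \hat{p} is the pooled proportion under H0. This form reproduces the Z = -4.00 of the MA/CA example.

```latex
Z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad
\hat{p}_i = \frac{X_i}{n_i},
\qquad
\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}
```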

Sampling Distribution for Difference Between Proportions

Z Test for Two Proportions: Thinking Challenge
You're an epidemiologist for the US Department of Health and Human Services. You're studying the prevalence of disease X in two states (MA and CA). In MA, 74 of 1500 people surveyed were diseased and in CA, 129 of 1500 were diseased. At the .05 level, does MA have a lower prevalence rate?

Z Test for Two Proportions: Solution
H0: pMA - pCA = 0
Ha: pMA - pCA < 0
α = .05
nMA = 1500, nCA = 1500
Critical Value(s): z = -1.645 (one-sided, lower tail)
Test Statistic: Z = -4.00
Decision: Reject H0 at α = .05
Conclusion: There is evidence that the prevalence in MA is lower than in CA
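
The arithmetic behind this solution can be checked with a short Python sketch. It assumes the pooled form of the Z statistic; the function name `two_proportion_z` is illustrative and uses only the standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: p1 - p2 = 0 (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# MA: 74 of 1500 diseased; CA: 129 of 1500 diseased
z = two_proportion_z(74, 1500, 129, 1500)
z_crit = NormalDist().inv_cdf(0.05)         # lower-tail critical value at alpha = .05
p_lower = NormalDist().cdf(z)               # one-sided p-value for Ha: pMA - pCA < 0

print(f"Z = {z:.2f}, critical value = {z_crit:.3f}")
print("Reject H0" if z < z_crit else "Do not reject H0")
```

Because Ha is one-sided (pMA - pCA < 0), only the lower tail is used for both the critical value and the p-value.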

χ² Test of Independence Between 2 Categorical Variables

Hypothesis Tests Qualitative Data

χ² Test of Independence
1. Shows if a relationship exists between 2 qualitative variables, but does not show causality
2. Assumptions
   • Multinomial experiment
   • All expected counts ≥ 5
3. Uses two-way contingency table

χ² Test of Independence: Contingency Table
1. Shows number of observations from 1 sample jointly in 2 qualitative variables
[Table: levels of variable 1 by levels of variable 2]

χ² Test of Independence: Hypotheses & Statistic
1. Hypotheses
   • H0: Variables are independent
   • Ha: Variables are related (dependent)
2. Test statistic
   $$\chi^2 = \sum_{\text{all cells}} \frac{[n_{ij} - E(n_{ij})]^2}{E(n_{ij})}$$
   where n_ij is the observed count and E(n_ij) is the expected count in row i, column j
   Degrees of freedom: (r - 1)(c - 1), where r is the number of rows and c is the number of columns

χ² Test of Independence: Expected Counts
1. Statistical independence means joint probability equals product of marginal probabilities
2. Compute marginal probabilities and multiply for joint probability
3. Expected count is sample size times joint probability

Expected Count Example
Marginal probability = 112/160
Marginal probability = 78/160
Joint probability = (112/160)(78/160)
Expected count = 160 · (112/160)(78/160) = 54.6

Expected Count Calculation
Expected count = (row total × column total) / sample size

   112 × 78 / 160    112 × 82 / 160
    48 × 78 / 160     48 × 82 / 160
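
The four expected counts can be computed from the margins alone. The sketch below assumes the row totals are 112 and 48 and the column totals are 78 and 82 (which pair is rows and which is columns is an assumption; the arithmetic is the same either way). The function name `expected_counts` is illustrative.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / n."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must sum to the same n")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Margins from the example: n = 160
table = expected_counts([112, 48], [78, 82])
for row in table:
    print([round(e, 1) for e in row])
# prints [54.6, 57.4] then [23.4, 24.6]
```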

χ² Test of Independence: Example on HIV
You randomly sample 286 sexually active individuals and collect information on their HIV status and history of STDs. At the .05 level, is there evidence of a relationship?

χ² Test of Independence: Solution
H0: No relationship
Ha: Relationship
α = .05
df = (2 - 1)(2 - 1) = 1
Critical Value(s): χ² = 3.841
Expected counts (E(n_ij) ≥ 5 in all cells):
   116 × 132 / 286    116 × 154 / 286
   170 × 132 / 286    170 × 154 / 286
Test Statistic: χ² = 54.29 using expected counts rounded to one decimal (54.15 unrounded, as in the SAS output)
Decision: Reject H0 at α = .05
Conclusion: There is evidence of a relationship
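
A minimal Python sketch of this solution, using the cell counts from the chapter's SAS data step (84, 32, 48, 122). Keeping the expected counts unrounded gives about 54.15, matching the SAS output. The p-value shortcut via `erfc` is valid only for df = 1 and is used here to stay within the standard library.

```python
from math import erfc, sqrt

# Observed counts (rows: STDs = 1, 2; columns: HIV = 1, 2)
observed = [[84, 32],
            [48, 122]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi2 / 2))

print(f"smallest expected count = {min(min(r) for r in expected):.1f}")
print(f"chi-square = {chi2:.4f}, df = {df}")
print("Reject H0" if chi2 > 3.841 else "Do not reject H0")
```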

χ² Test of Independence: SAS Code

    /* One record per cell: STDs level, HIV level, cell count */
    data dis;
      input STDs HIV count;
      cards;
    1 1 84
    1 2 32
    2 1 48
    2 2 122
    ;
    run;

    /* WEIGHT treats count as the cell frequency; CHISQ requests the chi-square tests */
    proc freq data=dis order=data;
      weight count;
      tables STDs*HIV / chisq;
    run;

χ² Test of Independence: SAS Output

Statistics for Table of STDs by HIV

Statistic                      DF      Value      Prob
------------------------------------------------------
Chi-Square                      1    54.1502    <.0001
Likelihood Ratio Chi-Square     1    55.7826    <.0001
Continuity Adj. Chi-Square      1    52.3871    <.0001
Mantel-Haenszel Chi-Square      1    53.9609    <.0001
Phi Coefficient                       0.4351
Contingency Coefficient               0.3990
Cramer's V                            0.4351
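
As a cross-check on the PROC FREQ output, the statistics in this table can be recomputed from the four cell counts. The sketch below uses textbook formulas for a 2×2 table (Yates correction for the continuity-adjusted value, (n - 1)·r² for Mantel-Haenszel). These are assumed to match what SAS computes, and the results should agree closely with the printed values.

```python
from math import log, sqrt

observed = [[84, 32], [48, 122]]   # STDs (rows) by HIV (columns)
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
n = sum(rows)

# (observed, expected) pairs for every cell
cells = [(o, rows[i] * cols[j] / n)
         for i, r in enumerate(observed) for j, o in enumerate(r)]

pearson = sum((o - e) ** 2 / e for o, e in cells)
likelihood_ratio = 2 * sum(o * log(o / e) for o, e in cells)
continuity_adj = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)  # Yates, 2x2 only
mantel_haenszel = (n - 1) * pearson / n   # (n - 1) * r^2, with r^2 = chi2 / n in a 2x2 table

(a, b), (c, d) = observed
phi = (a * d - b * c) / sqrt(rows[0] * rows[1] * cols[0] * cols[1])
contingency = sqrt(pearson / (pearson + n))
cramers_v = sqrt(pearson / (n * (min(len(rows), len(cols)) - 1)))

for name, value in [("Chi-Square", pearson),
                    ("Likelihood Ratio Chi-Square", likelihood_ratio),
                    ("Continuity Adj. Chi-Square", continuity_adj),
                    ("Mantel-Haenszel Chi-Square", mantel_haenszel),
                    ("Phi Coefficient", phi),
                    ("Contingency Coefficient", contingency),
                    ("Cramer's V", cramers_v)]:
    print(f"{name:<30}{value:>10.4f}")
```

In a 2×2 table Cramer's V has the same magnitude as the phi coefficient, which is why both appear as 0.4351 in the output.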