The Chi-Square Statistical Test and Research Technique
Claudia Bailey
State University of New York Polytechnic Institute
The Chi-Square Test
• The statistical method chosen for this assignment is the chi-square test.
• “Statistical instruments are used to extract knowledge from the observation of real world phenomena” (Bolboacă, Jäntschi, Sestraş, Sestraş, & Pamfil, 2011, p. 529).
• The chi-square test has been applied in all research areas to assess the distribution of frequencies (Bolboacă et al., 2011).
Chi-Square Test Research Technique
• The chi-square statistical analysis tests hypotheses (Polit & Beck, 2012).
• The chi-square test is a non-parametric (distribution-free) tool designed to analyze group differences when the dependent variable is measured at a nominal level of a known population (Polit & Beck, 2012).
• The data collected are the proportions of cases that fall into the various categories of a contingency table (Polit & Beck, 2012).
Chi-Square Test Research Technique
• “The chi-square test (χ²) can provide information on the observed differences and also detailed information on exactly which categories account for differences found” (McHugh, 2013, p. 143).
• The test evaluates independent variables and multiple group studies (Eck & Ryan, 2008).
• When a comparison is made between samples, the rule is to find the degrees of freedom (Eck & Ryan, 2008).
Chi-Square Test Research Technique
• “The chi-squared test is statistically computed by summing differences between the observed frequencies in each cell and the expected frequencies” (Polit & Beck, 2012, p. 374).
• The expected frequencies are those that would be present if there were no relationship between the variables (Polit & Beck, 2012).
• The calculations needed to compute the chi-square provide considerable information about each group in the study (McHugh, 2013, p. 143).
Requirements of Chi-Square
• The chi-square test is used in research when the following conditions pertain to the research data:
• 1. “The level of measurement of all the variables are nominal or ordinal” (McHugh, 2013, p. 143).
• 2. “The sample size of the study groups are unequal; for the χ², the groups may be of equal size or unequal size” (McHugh, 2013, p. 143).
• 3. The original data were measured at an interval or ratio level but violate an assumption of a parametric test (McHugh, 2013).
Requirements of Chi-Square
• 4. Limitations include difficulty of interpretation when there is a large number of categories (20 or more) in the independent or dependent variables (McHugh, 2013).
• 5. The expected value should be 5 or more in at least 80% of the cells, and no cell should have an expected value of less than one. This condition is most likely to be met when the sample size is at least the number of cells multiplied by 5 (McHugh, 2013, p. 143).
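
The expected-count condition in requirement 5 can be checked mechanically once a table of expected values is available. The following is a minimal sketch, not part of McHugh's article; the function name `expected_count_check` and both example tables are hypothetical and chosen only for illustration.

```python
def expected_count_check(expected):
    """Check the expected-count rule of thumb for a table of expected values.

    `expected` is a list of rows, each a list of expected cell counts.
    Returns (share of cells with expected >= 5, smallest expected value, rule met).
    """
    cells = [value for row in expected for value in row]
    share_at_least_5 = sum(1 for value in cells if value >= 5) / len(cells)
    smallest = min(cells)
    # Rule: at least 80% of cells have expected >= 5, and no cell is below 1.
    rule_met = share_at_least_5 >= 0.80 and smallest >= 1
    return share_at_least_5, smallest, rule_met


# Hypothetical expected counts (illustrative only, not from the case study).
passing = [[12.0, 8.0], [6.0, 4.0], [18.0, 12.0]]   # 5 of 6 cells >= 5, minimum 4
failing = [[0.6, 9.4], [5.4, 84.6]]                 # 3 of 4 cells >= 5, minimum 0.6

for name, table in (("passing", passing), ("failing", failing)):
    share, smallest, ok = expected_count_check(table)
    print(f"{name}: {share:.0%} of cells >= 5, smallest = {smallest}, rule met = {ok}")
```

When the rule is not met, the usual remedy is to combine sparse categories or collect more data before relying on the χ² result.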
The Chi-Square Family of Tests
• The chi-square test is a family of tests, each applying a series of assumptions, used in the statistical analysis of experimental data; the appropriate test depends on the method used to collect the data and the hypothesis tested in the study (Bolboacă et al., 2011).
• The chi-square family of tests consists of the goodness of fit test, the independence test, and the homogeneity test.
• Each test uses a different formula to derive information from the study.
The Chi-Square Test Goodness of Fit
• “The chi-square goodness of fit test is used when a sample is compared to a variable of interest against a population with known parameters” (Franke, Ho, & Christie, 2011, p. 449).
• “Test if a sample of data came from a population with specific distribution (Compares the distribution of a variable with another distribution when the expected frequencies are known)” (Bolboacă et al., 2011, p. 530).
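
A goodness of fit test compares observed category counts with the counts implied by known population proportions. The sketch below is illustrative only: the function name, the counts, and the proportions are all hypothetical, and the degrees of freedom are taken as the number of categories minus one, which assumes no population parameters were estimated from the sample.

```python
def goodness_of_fit_statistic(observed, expected_proportions):
    """Return (chi-square statistic, degrees of freedom) for one categorical variable."""
    total = sum(observed)
    # Expected count per category = sample size x known population proportion.
    expected = [total * p for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # assumes no parameters estimated from the data
    return statistic, df


# Hypothetical sample of 100 cases in three categories.
observed_counts = [30, 50, 20]
# Hypothetical known population proportions for the same categories.
known_proportions = [0.25, 0.50, 0.25]

statistic, df = goodness_of_fit_statistic(observed_counts, known_proportions)
print(f"chi-square = {statistic:.2f}, df = {df}")
```

The statistic is then compared against a critical value of the chi-square distribution with the stated degrees of freedom.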
The Chi-Square Test Homogeneity
• The chi-square test of homogeneity is used to determine whether two or more independent samples differ in their distributions on a single variable of interest (Bolboacă et al., 2011).
• “The homogeneity test determines frequency counts identically distributed across different populations or across sub-groups of the same population in the categorical data” (Bolboacă et al., 2011, p. 531).
The Chi-Square Test Independence
• The chi-square test of independence determines whether two categorical variables in a single sample are independent or associated (Bolboacă et al., 2011).
• Calculating the χ² statistic and comparing it against a critical value from the chi-square distribution assesses whether the association seen between the variables in a particular sample represents an actual relationship between the variables in the population (McHugh, 2013).
Example of Chi-Square Family of Test Formulas
Note. Adapted from Bolboacă, S. D., Jäntschi, L., Sestraş, A. F., Sestraş, R. E., & Pamfil, D. C. (2011). Pearson-Fisher Chi-Square Statistic Revisited. Information, 2(4), 528-545.
The History of the Chi-Square Test
• In 1890 Professor Pearson became a pioneer of statistical techniques, contributing to the chi-square test (Franke, Ho, & Christie, 2011).
• “Professor Pearson initially developed the chi-square test in 1900 and applied it to the test of goodness of fit for frequency curves” (Franke, Ho, & Christie, 2011, p. 449).
• “In 1904, Professor Pearson extended the contingency tables to test for independence between rows and columns” (Franke, Ho, & Christie, 2011, p. 449).
• Professor Pearson established the discipline of mathematical statistics and contributed to the fields of biometrics and meteorology (Franke, Ho, & Christie, 2011).
Professor Karl Pearson
Portrait of Karl Pearson, by Elliott & Fry, 1890.
Born: Carl Pearson, 27 March 1857, Islington, London, England
Died: 27 April 1936 (aged 79), Coldharbour, Surrey, England
Residence: England
Nationality: British
Fields: Lawyer, Germanist, eugenicist, mathematician and statistician (primarily the last)
Institutions: University College London; King's College, Cambridge
Alma mater: University of Cambridge; University of Heidelberg
Academic advisors: Francis Galton
Notable students: Philip Hall; John Wishart; Julia Bell; Nicholas Georgescu-Roegen
Known for: Pearson distribution; Pearson's r; Pearson's chi-squared test; Phi coefficient
Influenced: Albert Einstein, Henry Ludwell Moore, James Arthur Harris
Notable awards: Darwin Medal (1898)
Note. Adapted from https://en.wikipedia.org/wiki/Karl_Pearson
The Historical Chi-Square Debate
• In 1922 Professor R. A. Fisher modified Professor Pearson's chi-square test by applying an additional step to the contingency table, decreasing the degrees of freedom by one unit and by the number of unknown parameters associated with the theoretical distribution (Bolboacă et al., 2011).
• In 1934 Professor Frank Yates proposed a correction to the significance test for the contingency table (Bolboacă et al., 2011).
The Historical Chi-Square Debate
• In 1959 Professor Koehler and Professor Larntz proposed the “use of at least three categories if the number of observations [is] at least 10” (Bolboacă et al., 2011, p. 529).
• The chi-square test is among the most commonly misinterpreted statistical analyses in evaluation and social science research (Franke, Ho, & Christie, 2011).
• “The problem is researchers misapply the results of the chi-square test” (Franke, Ho, & Christie, 2011, p. 449).
• Researchers tend to overinterpret the results, leading to statements that have limited or no statistical support based on the analysis performed (Franke, Ho, & Christie, 2011).
Chi-Square Statistical Article
• The statistical article chosen for the analysis technique is “The Chi-square test of independence” (McHugh, 2013).
• The article applied the chi-square test of independence to a case study. The case study focused on employees working in a healthcare setting with possible exposure to pneumonia: vaccinated employees and employees left unvaccinated because of a vaccine shortage.
The Case Study:
• “The owner of a laboratory wants to keep sick leave as low as possible by keeping employees healthy through disease prevention programs. Many employees have contracted pneumonia leading to sick leave from the disease” (McHugh, 2013, p. 144).
Chi-Square Statistical Article
• “Due to a production problem at the company that produces the pneumonia vaccine, there is only enough vaccine for half the employees” (McHugh, 2013, p. 144).
• A record was kept of the number of employees who contracted pneumonia and the type of pneumonia contracted (McHugh, 2013, p. 144).
• “There are two groups; employees who received the vaccine and employees who did not receive the vaccine” (McHugh, 2013, p. 144).
Application of the Chi-Square Independence Test
• “The company wanted to know if providing the vaccine made a difference?” (McHugh, 2013, p. 145).
• The statistical test for differences when all the variables are nominal is the chi-square independence test.
• The χ² statistic is used to test the question, “Was there a difference in the incidence of pneumonia between the two groups at the end of the winter?” (McHugh, 2013, p. 145).
Table #1. Example of the Occurrence of Pneumonia
• Group 1: “Employees not provided with the vaccine (unvaccinated control group, N = 92)” (McHugh, 2013, p. 145).
• Group 2: “Employees provided with the vaccine (vaccinated experimental group, N = 92)” (McHugh, 2013, p. 145).
• “The independent variable is vaccination status (vaccinated versus unvaccinated). The dependent variable is health outcome with three levels” (McHugh, 2013, p. 145).
• “Contracted pneumonia” (McHugh, 2013, p. 145).
• “Contracted another type of pneumonia” (McHugh, 2013, p. 145).
• “Did not contract pneumonia” (McHugh, 2013, p. 145).
Table #1. Example of Results of the Vaccination Program.
Health Outcome                         Unvaccinated   Vaccinated
Sick with pneumococcal pneumonia       23             5
Sick with nonpneumococcal pneumonia    8              10
No pneumonia                           61             77
Note. Calculation adapted from McHugh, M. L. (2013). The Chi-square test of independence. Biochemia Medica, 23(3), 143-149.
Independence Test Calculation
• The χ² statistic is calculated to test whether the vaccination program made a difference in the health outcomes of the employees (McHugh, 2013).
The Formula:
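
In standard notation, the Pearson chi-square statistic for a contingency table sums one term per cell. The symbols below are labels chosen here rather than taken from the slides: O_ij is the observed count and E_ij the expected count in row i and column j, and r and c are the numbers of rows and columns.

```latex
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
```

Each term is the cell χ² value referred to in the case study, so large terms identify the cells that depart most from what independence would predict.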
Calculation of the Expected
• “The second step in the Chi-square statistic, the “expected” values represent an estimate of how the cases would be distributed if there were no vaccine effect” (McHugh, 2013, p. 146).
• “Expected values must reflect both the incidence of cases in each category and the unbiased distribution of cases if there is no vaccine effect” (McHugh, 2013, p. 146).
• The expected formula:
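
The standard expected-value formula for a contingency table multiplies the two marginal totals of a cell and divides by the sample size. The symbols are assumptions made for this sketch: R_i is the row marginal of row i, C_j is the column marginal of column j, and N is the total number of cases.

```latex
E_{ij} = \frac{R_i \, C_j}{N}
```

Under this formula every expected value depends only on the marginals, which is why the marginals are calculated before the cell χ² values.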
Table #2. Example of Calculation of the Marginals.
Health Outcome                         Not vaccinated (Column 1)   Vaccinated (Column 2)   Row marginal (row sum)
Sick with pneumococcal pneumonia       23                          5                       28
Sick with nonpneumococcal pneumonia    8                           10                      18
Stayed healthy                         61                          77                      138
Column marginal (sum of the column)    92                          92                      N = 184
Note. Calculation adapted from McHugh, M. L. (2013). The Chi-square test of independence. Biochemia Medica, 23(3), 143-149.
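
The marginals in Table #2 and the expected values they imply can be reproduced in a few lines. This is a minimal sketch using the counts as printed in the table; the variable names are assumptions made here. Computed this way, the expected values come out as 14, 9, and 69 in both columns, which differs slightly from figures such as 13.92 quoted from McHugh (2013) elsewhere in the deck, so the output is best read as a check on the arithmetic of the printed counts rather than a reproduction of the article's table.

```python
# Observed counts as printed in Table #2.
# Rows: pneumococcal pneumonia, nonpneumococcal pneumonia, stayed healthy.
# Columns: not vaccinated, vaccinated.
observed = [
    [23, 5],
    [8, 10],
    [61, 77],
]

row_marginals = [sum(row) for row in observed]            # row sums
column_marginals = [sum(col) for col in zip(*observed)]   # column sums
n = sum(row_marginals)                                    # total sample size

# Expected count for each cell = row marginal x column marginal / N.
expected = [[r * c / n for c in column_marginals] for r in row_marginals]

print("Row marginals:   ", row_marginals)
print("Column marginals:", column_marginals)
print("N:", n)
for row in expected:
    print(row)
```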
The Cell χ² Value and the Degrees of Freedom
• χ² formula applied to the case study:
• The cell χ² for the first cell in the study is: “(23 − 13.93)²/13.93 = 5.92. The cell χ² value for each cell value is in parentheses in each of the cells in Table #3” (McHugh, 2013, p. 146).
• Cell expected values and (cell Chi-square values).
• “Once the cell χ² values have been calculated, they are summed to obtain the χ² statistic for the table. In this case, the χ² is 12.35 (rounded)” (McHugh, 2013, p. 146).
• “The Chi-square table requires the table degrees of freedom (df) in order to determine the significance level of the statistic. The degrees of freedom for a χ² table formula is: (Number of rows − 1) × (Number of columns − 1)” (McHugh, 2013, p. 145).
• “As the P-value of the table is less than P < 0.05, the researcher rejects the null hypothesis and accepts the alternate hypothesis: There is a difference in the occurrence of pneumonia between the vaccinated and unvaccinated groups” (McHugh, 2013, p. 145).
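
The steps on this slide (cell χ² values, their sum, the degrees of freedom, and the comparison at the 0.05 level) can be sketched as follows. The function name is an assumption, and the counts are those printed in Table #2. Two caveats apply. First, the p-value and critical value use the closed form exp(−χ²/2), which holds only for 2 degrees of freedom, as in this 3 × 2 table. Second, the printed counts give a statistic of about 13.65 rather than the 12.35 quoted from McHugh (2013); both values exceed the critical value, so the decision to reject the null hypothesis is the same.

```python
import math

# Observed counts as printed in Table #2 (rows: health outcome; columns: not vaccinated, vaccinated).
observed = [
    [23, 5],
    [8, 10],
    [61, 77],
]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count under independence
            statistic += (o - e) ** 2 / e           # cell chi-square value
    df = (len(table) - 1) * (len(table[0]) - 1)     # (rows - 1) x (columns - 1)
    return statistic, df


statistic, df = chi_square_independence(observed)

# For df == 2 only, the chi-square survival function is exp(-x / 2).
assert df == 2, "closed-form p-value below is valid only for 2 degrees of freedom"
p_value = math.exp(-statistic / 2)
critical_value = -2 * math.log(0.05)   # 0.05-level critical value for df = 2

print(f"chi-square = {statistic:.2f}, df = {df}")
print(f"critical value at 0.05 = {critical_value:.3f}, p = {p_value:.4f}")
print("reject null hypothesis" if statistic > critical_value else "fail to reject null hypothesis")
```

For tables with other degrees of freedom, a statistics library or a printed chi-square table would be needed in place of the closed form.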
Table #3. Example of Cell expected values and (cell Chi-square values)
Health outcome                         Not vaccinated   Vaccinated
Sick with pneumococcal pneumonia       13.92 (5.92)     12.57 (4.56)
Sick with nonpneumococcal pneumonia    8.95 (0.10)      9.05 (0.10)
Stayed healthy                         69.12 (0.95)     69.88 (0.73)
Note. Calculation adapted from McHugh, M. L. (2013). The Chi-square test of independence. Biochemia Medica, 23(3), 143-149.
Meaning of Results
• Interpreting the cell χ² values of the case study:
• “It can be seen in Table # 3 that the largest cell χ² value of 5.92 occurs in Cell 1” (McHugh, 2013, p. 146).
• “This is a result of the observed value being 23 while 13.92 were expected. Therefore, this cell has a much larger number of observed cases than expected by chance” (McHugh, 2013, p. 146). “Cell 1 reflects the number of unvaccinated employees who contracted pneumonia. This means the number of unvaccinated employees who contracted pneumonia was significantly greater than expected” (McHugh, 2013, p. 146).
• “The second largest cell χ² value of 4.56 located in Cell 2. The observed cases were lower than expected (Observed = 5, Expected = 12.57). This means that a significantly lower number of vaccinated employees contracted pneumonia [than would be expected] if the vaccine had no effect. No other cell has a cell χ² value greater than 0.99” (McHugh, 2013, p. 146).
• “Few statistical programs provide tables of cell expected and cell χ² values as part of the default. Some programs will produce those tables as an option, and that option should be used to examine the cell χ² values” (McHugh, 2013, p. 146).
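
Where a statistics program does not report cell χ² values, they can be listed and ranked directly. The sketch below is illustrative: the labels and variable names are assumptions, and the counts are those printed in Table #2. With those counts the two pneumococcal cells each contribute about 5.79, not the 5.92 and 4.56 quoted above, but the interpretation is unchanged: the pneumococcal row accounts for most of the statistic, with more cases than expected among the unvaccinated and fewer among the vaccinated.

```python
row_labels = [
    "Sick with pneumococcal pneumonia",
    "Sick with nonpneumococcal pneumonia",
    "Stayed healthy",
]
column_labels = ["Not vaccinated", "Vaccinated"]
# Observed counts as printed in Table #2.
observed = [
    [23, 5],
    [8, 10],
    [61, 77],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

contributions = []
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n          # expected count
        cell_chi2 = (o - e) ** 2 / e                   # this cell's share of the statistic
        direction = "more than expected" if o > e else "fewer than expected"
        contributions.append((cell_chi2, row_labels[i], column_labels[j], direction))

# Largest contributions first; the sort is stable, so ties keep table order.
contributions.sort(key=lambda item: item[0], reverse=True)

for cell_chi2, row_label, column_label, direction in contributions:
    print(f"{cell_chi2:5.2f}  {row_label} / {column_label}: {direction}")
```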
Conclusion
• The chi-square is an analysis tool. The test provides considerable information about the nature of the research data (McHugh, 2013).
References
• Bolboacă, S. D., Jäntschi, L., Sestraş, A. F., Sestraş, R. E., & Pamfil, D. C. (2011). Pearson-Fisher Chi-Square Statistic Revisited. Information, 2(4), 528-545.
• Polit, D. F., & Beck, C. T. (2012). Nursing research: Generating and assessing evidence for nursing practice (9th ed.). Philadelphia, PA: Lippincott, Williams, & Wilkins.
• Franke, T. M., Ho, T., & Christie, C. A. (2011). The Chi-Square Test: Often Used and More Often Misinterpreted. American Journal of Evaluation, 33(3), 448-458.
• Eck, D., & Ryan, J. (2008). The Chi Square Statistic. Retrieved from http://math.hws.edu/javamath/ryan/ChiSquare.html
• McHugh, M. L. (2013). The Chi-square test of independence. Biochemia Medica, 23(3), 143-149.
• https://en.wikipedia.org/wiki/Karl_Pearson