Chi-square (χ²) Fenster

Chi-Square (χ²)
  • Tests of statistical significance for nominal level data. (Note: can also be used for ordinal level data.)

Chi-Square is an elegant and beautiful test
  • The assumptions required to use the test are very weak. That is to say, we do not have to make many assumptions about how the data are distributed.

Chi-Square
  • We ask the following question: are the frequencies empirically obtained (the OBSERVED frequencies) significantly different from those that would have been EXPECTED under some general set of assumptions?

Chi-Square: assumptions required to use the test
  • Samples are randomly selected from the population.
  • EXPECTED frequencies (to be defined later) are greater than 5 in every cell. Even this assumption can be relaxed with the use of Yates' correction.
  • We do NOT need to assume normality.

Chi-Square
  • This may not surprise you. After all, the concept of a normal distribution has no meaning for nominal level data, and chi-square is a test for nominal level data.
  • Chi-square is so popular because of this weak set of assumptions.

Chi-Square
  • Ho for chi-square: if there is no relationship between the dependent and independent variables, the column percentages will not change as we move across levels of the INDEPENDENT variable.
  • Note: we covered this earlier in the course. We said there is no relationship between two variables if the column percentages do not change across categories of the independent variable.

Chi-Square
  • We can compute an EXPECTED set of frequencies from the MARGINAL totals of the dependent and independent variables.
  • To calculate the EXPECTED frequency for a cell, multiply its row total by its column total and divide by the grand total:

      Expected frequency = (Row total × Column total) / Grand total

  • OBSERVED frequencies are those frequencies that are empirically obtained. Those are the frequencies that are given to us.
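The expected-frequency rule can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the function name `expected_frequencies` and the small table of counts are made up for the example.

```python
def expected_frequencies(observed):
    """Return the expected count for every cell of a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected frequency = (row total x column total) / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of counts, used only for illustration.
table = [[10, 20], [30, 40]]
expected = expected_frequencies(table)
print(expected)
```

Note that the expected counts depend only on the marginal totals, not on how the observed counts are spread across the cells.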

Chi-Square
  • χ² = Σ [ (Observed frequency − Expected frequency)² / Expected frequency ]
  • Usually this formula is written:

      χ² = Σ (O − E)² / E

  • The larger the difference between observed and expected frequencies, the larger the value of χ².
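As a rough sketch, the formula translates directly into code. The function name `chi_square` and the counts below are hypothetical and serve only to show that a larger gap between observed and expected counts gives a larger statistic.

```python
def chi_square(observed, expected):
    """Sum (O - E)^2 / E over matching cells of two flat lists of counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the same expected values, two different sets of observations.
expected = [10, 10]
small_gap = chi_square([12, 8], expected)   # observed close to expected
large_gap = chi_square([15, 5], expected)   # observed far from expected
print(small_gap, large_gap)
```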

Chi-Square
  • If you look at a chi-square table, you will see many different χ² distributions. Which one should you use?
  • You use the χ² distribution with the appropriate number of degrees of freedom.
  • For χ², degrees of freedom are given by the following formula:

      df = (r − 1) × (c − 1)

Chi-Square
That is to say:
  (1) We take the number of rows we have and subtract one.
  (2) We take the number of columns we have and subtract one.
  (3) We then multiply the numbers we get from the first two steps.
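The three steps amount to a one-line calculation. The helper name `degrees_of_freedom` is an assumption made for this sketch.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1) x (c - 1) for a table with r rows and c columns."""
    return (n_rows - 1) * (n_cols - 1)


print(degrees_of_freedom(2, 2))  # a 2 x 2 table
print(degrees_of_freedom(3, 3))  # a 3 x 3 table
```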

Chi-Square: logic of the χ² test
  • We do not expect observed and expected frequencies to be EXACTLY the same.
  • Observed and expected values can differ simply because of sampling variability.
  • However, if the value of χ² turns out to be larger than that expected by chance, we shall be in a position to reject the null hypothesis.

Chi-Square: EXAMPLE
  • Let us say one was interested in investigating the relationship between gender and opinions on accountability.
  • Our null hypothesis is that gender makes no difference in attitudes towards accountability.
  • Our research hypothesis is that gender makes a difference in attitudes towards accountability.

Chi-Square
                                                   Gender
Opinion on Accountability                     Male   Female   Row Totals
Accountability good for educational system     126       99          225
Accountability bad for educational system       71      162          233
Col. Totals                                    197      261          458

Chi-Square
  • It is important to note that the numbers in each cell are actual frequencies rather than percentages.
  • Let us go through our six-step hypothesis testing method in this case.

Chi-Square
  • Step 1: State the null hypothesis.
    Ho: Gender does not make a difference in predicting attitudes towards educational accountability.
  • Step 2: State the research hypothesis.
    H1: Gender does make a difference in predicting attitudes towards educational accountability.

Chi-Square
  • Step 3: Select a significance level. Let's choose α = .01.
  • Step 4: Collect and summarize the sample data.
  • Calculation of χ²: compute the EXPECTED FREQUENCIES for EACH CELL.

Chi-Square: computing the EXPECTED FREQUENCIES
  • Cell a, males who believe that accountability is good for the educational system:
      (197 × 225) / 458 = 96.8
  • Cell b, females who believe that accountability is good for the educational system:
      (261 × 225) / 458 = 128.2
  • Cell c, males who believe that accountability is bad for the educational system:
      (197 × 233) / 458 = 100.2
  • Cell d, females who believe that accountability is bad for the educational system:
      (261 × 233) / 458 = 132.8
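The four expected frequencies can be checked with a short script. The dictionary layout and the labels "good", "bad", "male" and "female" are choices made for this sketch; the totals come from the example table.

```python
# Marginal totals from the gender-by-opinion example.
row_totals = {"good": 225, "bad": 233}
col_totals = {"male": 197, "female": 261}
grand_total = 458

# Expected frequency = (row total x column total) / grand total, for each cell.
expected = {
    (opinion, gender): r * c / grand_total
    for opinion, r in row_totals.items()
    for gender, c in col_totals.items()
}

for cell, value in expected.items():
    print(cell, round(value, 1))
```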

Set up a chi-square table

Cell    f observed   f expected   f obs − f exp   (f obs − f exp)²   (f obs − f exp)² / f exp
A              126         96.8            29.2             852.64                      8.808
B               99        128.2           −29.2             852.64                      6.651
C               71        100.2           −29.2             852.64                      8.509
D              162        132.8            29.2             852.64                      6.420
Total          458                            0                                        30.388
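The table can be reproduced with the sketch below, which follows the slide in using expected frequencies rounded to one decimal place. The variable names are assumptions. For comparison, the script also computes the statistic from unrounded expected frequencies, which comes out slightly higher (roughly 30.43); the difference is a rounding effect and does not change the conclusion.

```python
observed = {"A": 126, "B": 99, "C": 71, "D": 162}
expected_rounded = {"A": 96.8, "B": 128.2, "C": 100.2, "D": 132.8}

# Per-cell contribution (O - E)^2 / E, rounded to three decimals as in the table.
contributions = {
    cell: round((observed[cell] - expected_rounded[cell]) ** 2 / expected_rounded[cell], 3)
    for cell in observed
}
chi_square_slide = round(sum(contributions.values()), 3)

# Same statistic with unrounded expected frequencies (row total x column total / N).
expected_exact = {
    "A": 225 * 197 / 458,
    "B": 225 * 261 / 458,
    "C": 233 * 197 / 458,
    "D": 233 * 261 / 458,
}
chi_square_exact = sum(
    (observed[cell] - expected_exact[cell]) ** 2 / expected_exact[cell]
    for cell in observed
)

print(contributions)
print(chi_square_slide, round(chi_square_exact, 2))
```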

Chi-Square
  • Step 5: Obtain the sampling distribution. Look at a chi-square table.
  • We will use the chi-square test with 1 degree of freedom. Why one? df = (r − 1) × (c − 1). We have 2 rows and 2 columns, so we get df = (2 − 1) × (2 − 1) = 1 × 1 = 1.
  • With our choice of α = .01, we get a critical χ² of 6.635 (found in chi-square table, p. 566).
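The critical value can be cross-checked without a printed table. The sketch below relies on a standard fact that holds only for 1 degree of freedom: a χ² variable with 1 df is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function. The function name `p_value_df1` is an assumption, and the shortcut does not apply to tables with more degrees of freedom.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    # With 1 df: P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2))


# The tabled critical value should leave about .01 in the upper tail.
print(p_value_df1(6.635))
```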

Chi-Square
  • If we find a χ² greater than or equal to 6.635, we reject the null hypothesis and conclude that gender does make a difference in predicting attitudes towards educational accountability.
  • If we find a χ² less than 6.635, we fail to reject the null hypothesis: we cannot conclude that gender makes a difference in predicting attitudes towards educational accountability.

Chi-Square
  • Note: all χ² tests are one-tailed tests.
  • Chi-square can only tell you whether a relationship is statistically significant.
  • Chi-square cannot tell you anything about the DIRECTIONALITY of the relationship.
  • You must inspect the column percentages as you move across categories of the independent variable to determine DIRECTIONALITY.
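As an illustration of inspecting column percentages, the sketch below computes them for the gender-by-opinion example, with gender (the independent variable) in the columns. The nested-dictionary layout and the labels are assumptions made for the example.

```python
# Observed counts from the example: opinion in rows, gender in columns.
observed = {
    "good": {"male": 126, "female": 99},
    "bad": {"male": 71, "female": 162},
}

genders = ("male", "female")
col_totals = {g: sum(observed[o][g] for o in observed) for g in genders}

# Column percentage = cell count as a percentage of its column total.
col_pct = {
    o: {g: 100 * observed[o][g] / col_totals[g] for g in genders}
    for o in observed
}

for opinion, row in col_pct.items():
    print(opinion, {g: round(p, 1) for g, p in row.items()})
```

In these data about 64% of the men but only about 38% of the women say accountability is good for the educational system, which gives the direction that the χ² value alone does not.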

Chi-Square
  • Another way to determine DIRECTIONALITY is to look at the RESIDUALS. (You can instruct SPSS to present the residuals in your output file. RESIDUALS are simply the OBSERVED cell count minus the EXPECTED value.)
  • If a RESIDUAL is NEGATIVE, you are getting fewer OBSERVED cases than EXPECTED in that CELL.
  • If a RESIDUAL is POSITIVE, you are getting more OBSERVED cases than EXPECTED in that CELL.

Chi-Square
  • To determine DIRECTIONALITY, look at the SIGN changes of the RESIDUALS as you move across categories of the independent variable.
  • Let us assume that the RESIDUALS start out NEGATIVE and end up POSITIVE. This would imply that the independent variable is related to the dependent variable.
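A small sketch of the residual check, applied to the gender-by-opinion example. It computes unstandardized residuals (observed minus expected) and reduces them to signs; the list layout and variable names are assumptions.

```python
# Rows: accountability good, accountability bad. Columns: male, female.
observed = [[126, 99], [71, 162]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Residual = observed count minus expected count for each cell.
residuals = [
    [observed[i][j] - row_totals[i] * col_totals[j] / n for j in range(2)]
    for i in range(2)
]
signs = [["+" if r > 0 else "-" for r in row] for row in residuals]

for row in signs:
    print(row)
```

Reading across the "good" row, the sign changes from positive for men to negative for women: more men, and fewer women, than expected rate accountability as good.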

Chi-Square
  • Step 6: Make a decision.
  • χ² observed = 30.388 and χ² critical = 6.635.
  • Decision: REJECT Ho, because χ² observed is greater than χ² critical.
  • We easily reject the null hypothesis and conclude that gender does make a difference in predicting attitudes towards educational accountability.

Chi-Square
Two points to note in this example.
  • We had one degree of freedom.
  • By one degree of freedom we mean that only one number in the table is actually free to vary.
  • Assume we know the row and column totals.
  • Once we know one number in a 2 X 2 table, we can find the other three.

Chi-Square
a      b      a+b
c      d      c+d
a+c    b+d    a+b+c+d

  • If I know the row and column totals, there is only one cell that is free to vary.
  • WE ONLY NEED TO KNOW ONE CELL TO KNOW THE ENTIRE TABLE. THIS IS WHY WE HAD ONE DEGREE OF FREEDOM.
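The point can be demonstrated by rebuilding the whole example table from cell a and the marginal totals. The function name `fill_2x2` is made up for this sketch.

```python
def fill_2x2(a, row_totals, col_totals):
    """Recover cells b, c and d of a 2 x 2 table from cell a and the marginals."""
    b = row_totals[0] - a   # first row must sum to its row total
    c = col_totals[0] - a   # first column must sum to its column total
    d = row_totals[1] - c   # second row must sum to its row total
    return [[a, b], [c, d]]


# Cell a and the marginal totals from the gender-by-opinion example.
table = fill_2x2(126, row_totals=(225, 233), col_totals=(197, 261))
print(table)
```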

Chi-Square
  • In our example, f observed − f expected has the same magnitude in every cell (−29.2 or 29.2) because the table had only one degree of freedom.
  • If a table has more than one degree of freedom, f observed − f expected does not necessarily equal the same number in every cell (and generally will not).

Chi-Square
  • How many cells do we need to know in a 3 X 3 table?
  • The formula tells us the answer is (r − 1) × (c − 1) = (3 − 1) × (3 − 1) = 2 × 2 = 4.

Let us see how we get df to equal 4.

a        b        c        a+b+c
d        e        f        d+e+f
g        h        i        g+h+i
a+d+g    b+e+h    c+f+i    a+b+c+d+e+f+g+h+i

Chi-Square
  • Let us say we knew cell a. Could we know all the other cells in the table? Not this time.
  • Let's say we know cells a and b. If we knew cells a and b, then we could find cell c, but we would not know any other cells.
  • Only if we know four cells (a, b, d, and e) would we be able to find the other five cells. This is why we have four degrees of freedom in a 3 X 3 table.
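The same reasoning can be sketched for a 3 X 3 table: given cells a, b, d and e plus the marginal totals, the other five cells follow by subtraction. The function name `fill_3x3` and the counts used to exercise it are hypothetical.

```python
def fill_3x3(a, b, d, e, row_totals, col_totals):
    """Recover the remaining five cells of a 3 x 3 table from a, b, d, e and the marginals."""
    c = row_totals[0] - a - b   # completes row 1
    f = row_totals[1] - d - e   # completes row 2
    g = col_totals[0] - a - d   # completes column 1
    h = col_totals[1] - b - e   # completes column 2
    i = row_totals[2] - g - h   # completes row 3
    return [[a, b, c], [d, e, f], [g, h, i]]


# Hypothetical table with row totals 6, 15, 24 and column totals 12, 15, 18.
table = fill_3x3(1, 2, 4, 5, row_totals=(6, 15, 24), col_totals=(12, 15, 18))
print(table)
```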

Chi-Square
One other point about chi-square.
  • Chi-square can tell you whether a relationship is significant.
  • Chi-square can also tell you which cells are most important in determining the significance of the relationship. In our example we find that all cells contribute to the significance of the relationship.

Set up a chi-square table

Cell    f observed   f expected   f obs − f exp   (f obs − f exp)²   (f obs − f exp)² / f exp
A              126         96.8            29.2             852.64                      8.808
B               99        128.2           −29.2             852.64                      6.651
C               71        100.2           −29.2             852.64                      8.509
D              162        132.8            29.2             852.64                      6.420
Total          458                            0                                        30.388

Chi-Square
  • Three of our cells have individual χ² values greater than the value needed to establish statistical significance for the entire relationship.
  • Since χ² cannot be negative, we can determine whether part of our relationship drives the entire relationship to statistical significance.
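A short sketch of this check: compare each cell's contribution from the table with the critical value of 6.635. The variable names are assumptions; the numbers are the ones reported in the example.

```python
# Per-cell contributions (f obs - f exp)^2 / f exp from the chi-square table.
contributions = {"A": 8.808, "B": 6.651, "C": 8.509, "D": 6.420}
critical_value = 6.635  # alpha = .01, 1 degree of freedom

# Cells whose individual contribution alone exceeds the critical value.
exceeding = [cell for cell, value in contributions.items() if value > critical_value]
print(exceeding)
```

Because every contribution is non-negative, no cell can offset another, so any single cell above the critical value is enough to make the total significant.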

SPSS Command Syntax for Crosstabs
Note: You can get EXPECTED frequencies in SPSS as follows.
  • Go to ANALYZE, then DESCRIPTIVE STATISTICS, then click on CROSSTABS.
  • The dependent variable goes into the row box.
  • The independent variable goes into the column box.
  • Click on CELLS.
  • Click on EXPECTED. (Also click on UNSTANDARDIZED under residuals.) Click CONTINUE.
  • Click on the STATISTICS box.
  • Click on Chi-Square.
  • You may also want to click on the Contingency Coefficient and lambda.