Contingency Tables – Part II – Getting Past Chi-Square?
Measures of Association – A Review
1. What is the difference between a significance test statistic and a measure of association?
   • How are they related?
2. The basic questions about associations between variables:
   a) Does an association exist (vs. independence)?
   b) What is the form (& direction) of the relation?
   c) What is the magnitude (strength) of the relation?
      • Association
      • Effect size
Strength of Association
C. What does “association” mean?
   1) Covariance
   2) Agreement
   3) Predictability (reduction in errors/ignorance)
D. Characteristics of association measures?
   1) Coefficient should range between 0 (= no association) and 1 (or ±1)
   2) Coefficient should not be directly affected by N
   3) Coefficient should be independent of a variable’s scale of measurement (its “metric”)
   4) Coefficient values should be interpretable (intuitively or mathematically)
Strength of Association (cont.)
E. A number of different measures of association (coefficients) are available:
   • Based on different levels of measurement
   • Based on different interpretive models
How to choose among them?
   1) Identify the levels of measurement of both variables
   2) Identify whether you have a clear independent variable: this determines whether you may use a directional or a nondirectional coefficient
   3) Identify which coefficients are most commonly used or most interpretable
Measurement Level Situations:
Association between 2 numerical variables?
   – Coefficient = Pearson’s r
      • r² = proportion of variance “in common”
   – May use Spearman’s rho if data are ranks
Association between 1 categoric and 1 numeric variable? (as in ANOVA)
   – Coefficient of association = eta (η)
      • eta-squared = proportion of variance “between groups”
      • In SPSS, use the Descriptive Statistics > Crosstabs or Compare Means procedures
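To make “proportion of variance between groups” concrete, here is a minimal Python sketch of eta-squared computed as the between-groups sum of squares divided by the total sum of squares. The function name and the toy scores are illustrative assumptions, not taken from the slides or the handout.

```python
def eta_squared(groups):
    """Eta-squared for a dict mapping each category to a list of numeric scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variation of every score around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Variation of each group mean around the grand mean, weighted by group size.
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_between / ss_total


# Hypothetical data: three categories of X, numeric scores on Y.
scores_by_group = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
eta2 = eta_squared(scores_by_group)  # SS_between = 54, SS_total = 60
print(round(eta2, 3))
```

Eta itself is the square root of this value.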
Association between 2 categoric variables
• Different approaches to nonparametric measures of association
   1) Chi-square-based
      Correct for degrees of freedom and sample size
   2) Uncertainty/Errors of Prediction (PRE/PRU)
      Improved predictability of Y given knowledge of X
   3) Concordance/agreement
      Proportion of shared or correspondent values
• Note: coefficients for ordinal and nominal variables are slightly different
   The choice of coefficient is limited by the variable with the lowest level of measurement
Strength of Association (continued)
• Association between 2 nominal variables (or 1 nominal + 1 ordinal variable)
   – Chi-square-derived:
      • Contingency coefficient, C (forget it!)
      • Cramér’s V coefficient: use this for 3 x 3 or larger tables
      • Phi coefficient, Φ: use this for 2 x 2 tables (or 2 x 3 tables)
   – PRE-derived:
      • Lambda (asymmetric) (λyx ≠ λxy)
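The asymmetry of lambda can be seen in a small sketch. This assumes the usual Goodman-Kruskal definition (the proportional reduction in prediction errors when the modal category is guessed within each level of the predictor); the function name and the counts are made up for illustration.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the column variable from the row variable.

    table is a list of rows of cell counts. The result is undefined
    (division by zero) if every case falls in a single column.
    """
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    # Errors made by always guessing the overall modal column.
    errors_without_x = n - max(col_totals)
    # Errors made by guessing the modal column within each row.
    errors_with_x = sum(sum(row) - max(row) for row in table)
    return (errors_without_x - errors_with_x) / errors_without_x


# Hypothetical counts: rows are categories of X, columns are categories of Y.
table = [[40, 10], [30, 20]]
transposed = [list(col) for col in zip(*table)]

lambda_yx = goodman_kruskal_lambda(table)       # predicting Y from X
lambda_xy = goodman_kruskal_lambda(transposed)  # predicting X from Y
print(lambda_yx, lambda_xy)
```

With these counts, knowing X does not change the best guess for Y (the first column is modal in both rows), so the two directions give different values.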
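Phi coefficient formula: $\phi = \sqrt{\chi^2 / N}$
Cramér’s V formula: $V = \sqrt{\chi^2 / \left(N \cdot \min(r - 1,\; c - 1)\right)}$, where r and c are the numbers of rows and columns in the table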
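As a rough illustration, the standard definitions φ = sqrt(χ²/N) and V = sqrt(χ²/(N · min(r − 1, c − 1))) can be computed directly from a table of counts. The sketch below uses plain Python; the function names and the example counts are illustrative assumptions.

```python
import math


def chi_square(table):
    """Return (Pearson chi-square, N) for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def phi(table):
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / n)


def cramers_v(table):
    chi2, n = chi_square(table)
    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


# Hypothetical 2 x 2 table: chi-square = 20, N = 80.
table_2x2 = [[30, 10], [10, 30]]
phi_2x2 = phi(table_2x2)
v_2x2 = cramers_v(table_2x2)
print(phi_2x2, v_2x2)
```

In a 2 x 2 table the smaller dimension minus one equals 1, so V reduces to phi.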
Strength of Association (continued)
• Association between 2 ordinal variables
   – Concordance-based (PRE) statistics:
      • Gamma, γ: most commonly used (note: in 2 x 2 tables, gamma = Yule’s Q)
      • Others? Kendall’s tau; Somers’ d (less used)
   – Rank-order statistics:
      • Spearman’s rho
      • Use if many categories & few ties
      • Must convert scores to ranks
   – Can also use chi-square-based measures
      • Will generally yield lower values
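A minimal sketch of gamma, assuming the usual definition (C − D) / (C + D), where C and D are the numbers of concordant and discordant pairs of cases. The function names and counts are illustrative; both variables are assumed to be coded in the same direction, with rows and columns ordered from low to high.

```python
def goodman_kruskal_gamma(table):
    """Gamma for a table whose rows and columns are both ordered low to high.

    Undefined (division by zero) if there are no concordant or discordant pairs.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):  # compare only with rows below row i
                for m in range(n_cols):
                    pairs = table[i][j] * table[k][m]
                    if m > j:
                        concordant += pairs  # second case is higher on both variables
                    elif m < j:
                        discordant += pairs  # higher on one variable, lower on the other
    return (concordant - discordant) / (concordant + discordant)


def yules_q(table):
    """Yule's Q for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / (a * d + b * c)


# Hypothetical counts.
table_2x2 = [[30, 10], [10, 30]]
gamma_2x2 = goodman_kruskal_gamma(table_2x2)
q_2x2 = yules_q(table_2x2)

table_3x3 = [[20, 10, 5], [10, 20, 10], [5, 10, 20]]
gamma_3x3 = goodman_kruskal_gamma(table_3x3)
print(gamma_2x2, q_2x2, gamma_3x3)
```

Tied pairs (same row or same column) are ignored by gamma, which is one reason it tends to come out higher than chi-square-based coefficients on the same table.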
Nonparametric Measures of Association: Summary Recap
• Nominal variables
   – Phi, Φ, for 2 x 2 tables (or 2 x 3)
   – Cramér’s V for 3 x 3 tables or larger
• Ordinal variables
   – Gamma, γ: most commonly used
   – Yule’s Q: the same statistic in a 2 x 2 table
   – Spearman’s rho if many values & few ties
   – Can also use Phi and Cramér’s V
Nonparametric Measures of Association: Summary (continued)
• Different kinds of coefficients will not yield the same values on the same crosstabulation
• Gamma (& Yule’s Q) will almost always yield higher values than Cramér’s V (& Phi) on the same tables
• Note that 2 x 2 tables (with binary variables) are somewhat of a special case
   – Why?
Nonparametric Measures of Association
a. How to compute them?
   – By hand: see formulas in the textbook
      • Chi-square-based = easiest to compute
      • Gamma = more laborious by hand
      • Note: X & Y variables in the crosstab must be coded in the same direction for ordinal statistics (e.g., Gamma)
   – In SPSS: click the Statistics box in the Crosstabs pop-up menu, then select the appropriate coefficients (Note: do not select them all)
II. Multivariate analysis of associations
• Going beyond bivariate analysis to multivariate analyses
   – We often wish to consider more than two variables at a time because other variables may be involved in more complex patterns
   – Termed “partialling” or “elaborating”: statistically consider
      • Confounding effects of additional variables (“spurious relationships”)
      • Complicating effects of additional variables (“contingent relationships”)
Multivariate Analysis (continued)
   – In cross-tabulations, crosstabs are “nested” within levels of other variables
      • Compute separate sub-crosstabs within each category or level of the 3rd variable
      • See the example on the handout
   – Partialling is only useful when the extra variable is associated with both X and Y
      • Then we wish to remove the extra covariation
      • Otherwise, it’s a waste of time
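The nesting step can be sketched in a few lines of Python: one sub-crosstab of X by Y is built for each level of the third variable Z. The data below are invented to show a spurious relationship and are not the handout example; the function and variable names are likewise illustrative.

```python
from collections import defaultdict


def partial_crosstabs(records):
    """Build one X-by-Y table of counts per level of Z.

    records is an iterable of (x, y, z) tuples; the result maps each
    z level to a dict of {(x, y): count}.
    """
    tables = defaultdict(lambda: defaultdict(int))
    for x, y, z in records:
        tables[z][(x, y)] += 1
    return {z: dict(cells) for z, cells in tables.items()}


# Hypothetical data: within each level of Z, X and Y are independent.
records = (
    [("low", "no", "A")] * 32 + [("low", "yes", "A")] * 8
    + [("high", "no", "A")] * 8 + [("high", "yes", "A")] * 2
    + [("low", "no", "B")] * 2 + [("low", "yes", "B")] * 8
    + [("high", "no", "B")] * 8 + [("high", "yes", "B")] * 32
)

partials = partial_crosstabs(records)

# Pooled (bivariate) table that ignores Z.
pooled = defaultdict(int)
for x, y, _ in records:
    pooled[(x, y)] += 1

for z, cells in sorted(partials.items()):
    print(z, cells)
print("pooled", dict(pooled))
```

In the pooled table, “low” cases are mostly “no” and “high” cases mostly “yes”, but within each level of Z the no:yes ratio is the same for both X categories (4:1 in A, 1:4 in B), so the pooled association disappears once Z is held constant.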