LIS 570 Summarising and presenting data Bivariate analysis
Association
- Example: gender and voting
  - Are gender and party supported associated (related)?
  - Are gender and party supported independent (unrelated)?
  - Are women more likely than men to vote Labor?
  - Are men more likely to vote Liberal?
Association in bivariate data means that certain values of one variable tend to occur more often with some values of the second variable than with other values of that variable (Moore, p. 242).
Cross Tabulation Tables
- Designate the X variable and the Y variable
- Place the values of X across the table
- Draw a column for each X value
- Place the values of Y down the table
- Draw a row for each Y value
- Insert frequencies into each CELL
- Compute totals (MARGINALS) for each column and row
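The steps above can be sketched in a few lines of Python. This is only an illustration: the observations, the function name `cross_tabulate` and the choice of gender as X and party as Y are assumptions made for the example, not data from the course.

```python
from collections import Counter

# Hypothetical observations as (X, Y) pairs: X = gender, Y = party supported.
observations = [
    ("female", "Labor"), ("female", "Labor"), ("female", "Liberal"),
    ("male", "Labor"), ("male", "Liberal"), ("male", "Liberal"),
    ("female", "Labor"), ("male", "Liberal"),
]

def cross_tabulate(pairs):
    """Return cell frequencies, column marginals, row marginals and the grand total."""
    cells = Counter(pairs)                         # frequency in each (x, y) cell
    column_totals = Counter(x for x, _ in pairs)   # marginal for each X value
    row_totals = Counter(y for _, y in pairs)      # marginal for each Y value
    return cells, column_totals, row_totals, len(pairs)

cells, column_totals, row_totals, grand_total = cross_tabulate(observations)

# Print the table with X values across the top and Y values down the side.
x_values = sorted(column_totals)
print("         " + "  ".join(f"{x:>7s}" for x in x_values) + "    total")
for y in sorted(row_totals):
    row = "  ".join(f"{cells[(x, y)]:7d}" for x in x_values)
    print(f"{y:8s} {row}  {row_totals[y]:7d}")
print("total    " + "  ".join(f"{column_totals[x]:7d}" for x in x_values) + f"  {grand_total:7d}")
```

Keeping the marginals separate from the cells makes the next step, percentaging on the column totals, straightforward.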
Determining if a Relationship Exists
- Compute percentages for each value of X (down each column)
  - Base = marginal for each column
- Read the table by comparing values of X for each value of Y
  - Read table across each row
- Terminology
  - strong/weak; positive/negative; linear/curvilinear
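A minimal sketch of the percentaging rule, assuming a small table of made-up frequencies (the counts and the name `column_percentages` are hypothetical). Each column is percentaged on its own marginal, and the result is then read across each row.

```python
# Hypothetical frequencies stored as counts[x_value][y_value]; columns are X (gender).
counts = {
    "female": {"Labor": 60, "Liberal": 40},
    "male": {"Labor": 45, "Liberal": 55},
}

def column_percentages(table):
    """Convert each column of frequencies to percentages of that column's total."""
    result = {}
    for x_value, column in table.items():
        base = sum(column.values())  # the column marginal is the base
        result[x_value] = {y: 100.0 * n / base for y, n in column.items()}
    return result

percentages = column_percentages(counts)

# Read across each row: compare the X groups for one value of Y at a time.
for y_value in ["Labor", "Liberal"]:
    row = "  ".join(f"{x}: {percentages[x][y_value]:.1f}%" for x in counts)
    print(f"{y_value:8s} {row}")
```

Percentaging down the columns puts groups of different sizes on the same footing before they are compared.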
Cross tabulation tables
- Calculate percentages
- Read the table
[Table: Vote by Occupation (De Vaus, pp. 158-160)]
Cross tabulation
- Use column percentages and compare these across the table
- Where there is a difference, this indicates some association
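One way to make "a difference across the table" concrete is the percentage-point gap between two columns in each row. The sketch below assumes column percentages that have already been computed; the figures and the helper name `percentage_differences` are hypothetical.

```python
# Hypothetical column percentages (each column sums to 100).
column_pct = {
    "female": {"Labor": 60.0, "Liberal": 40.0},
    "male": {"Labor": 45.0, "Liberal": 55.0},
}

def percentage_differences(pct, first, second):
    """Percentage-point gap between two columns for each row value."""
    return {y: pct[first][y] - pct[second][y] for y in pct[first]}

differences = percentage_differences(column_pct, "female", "male")
for y_value, gap in differences.items():
    print(f"{y_value}: {gap:+.1f} percentage points (female minus male)")
```

A gap of zero in every row would suggest the two variables are independent in this table; how large a gap counts as meaningful is a judgment this sketch does not make.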
Describing association
- Strength: strong - weak
- Direction: positive - negative
- Nature: linear - curvilinear
Describing association
- Two variables are positively associated when larger values of one tend to be accompanied by larger values of the other.
- The variables are negatively associated when larger values of one tend to be accompanied by smaller values of the other (Moore, p. 254).
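The definition can be checked numerically. As a rough sketch (not a method taken from Moore), the sign of the summed cross-products of deviations from the means shows whether larger values of one variable tend to go with larger or smaller values of the other. The function name and data are hypothetical.

```python
def association_direction(xs, ys):
    """Classify direction from the sign of the summed cross-products of deviations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Positive terms arise when x and y are both above or both below their means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if cross > 0:
        return "positive"
    if cross < 0:
        return "negative"
    return "none"

# Hypothetical data: y rises with x in the first call and falls in the second.
rising = association_direction([1, 2, 3, 4], [2, 4, 5, 9])
falling = association_direction([1, 2, 3, 4], [9, 5, 4, 2])
print(rising, falling)
```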
Describing association
- Scattergram
  - a graph that can be used to show how two interval-level variables are related to one another
[Figure: scattergram with axes Variable N and Variable M; example variables: shoe size and age]
Description of Scattergrams
- Strength of relationship
  - Strong
  - Moderate
  - Low
- Linearity of relationship
  - Linear
  - Curvilinear
- Direction
  - Positive
  - Negative
Description of scatterplots
[Figure: scatterplots of Y against X illustrating strength and direction]
Description of scatterplots
[Figure: scatterplots of Y against X illustrating nature, strength and direction]
Correlation
- Correlation coefficient
  - a number used to describe the strength and direction of association between variables
    - Very strong = .80 through 1
    - Moderately strong = .60 through .79
    - Moderate = .50 through .59
    - Moderately weak = .30 through .49
    - Very weak to no relationship = 0 through .29
- Scale: -1.00 perfect negative correlation; 0.00 no relationship; 1.00 perfect positive correlation
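The bands above translate directly into a lookup. The sketch below makes two assumptions that the slide does not state: the bands are applied to the absolute value of the coefficient, so negative coefficients are graded by size, and a value falling between two printed bands (for example .795) is assigned to the lower band. The function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Map a correlation coefficient to the verbal strength labels listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)  # assumption: strength depends on magnitude, not sign
    if size >= 0.80:
        return "very strong"
    if size >= 0.60:
        return "moderately strong"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "moderately weak"
    return "very weak to no relationship"

for value in (0.8913, -0.55, 0.1):
    direction = "positive" if value > 0 else "negative"
    print(f"r = {value:+.4f}: {describe_strength(value)}, {direction}")
```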
Correlation Coefficients
- Nominal
  - Phi (SPSS Crosstabs)
  - Cramer’s V (SPSS Crosstabs)
- Ordinal (linear)
  - Gamma (SPSS Crosstabs)
- Nominal and interval
  - Eta (SPSS Crosstabs)
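For readers without SPSS, one of the nominal coefficients can be computed by hand. The sketch below calculates Cramer's V from the chi-square statistic using the standard formula V = sqrt(chi-square / (n * (k - 1))), where k is the smaller of the number of rows and columns. The frequencies are hypothetical, and the code assumes every row and column total is greater than zero.

```python
import math

# Hypothetical frequencies: rows are Y (party), columns are X (gender).
table = [
    [60, 45],  # Labor: female, male
    [40, 55],  # Liberal: female, male
]

def chi_square(observed):
    """Return the chi-square statistic and the total number of cases."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Frequency expected in this cell if the variables were independent.
            expected = row_totals[i] * column_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    return statistic, n

def cramers_v(observed):
    """Cramer's V: chi-square rescaled to run from 0 to 1."""
    statistic, n = chi_square(observed)
    smaller_dimension = min(len(observed), len(observed[0]))
    return math.sqrt(statistic / (n * (smaller_dimension - 1)))

v = cramers_v(table)
print(f"Cramer's V = {v:.3f}")
```

For a 2 x 2 table such as this one, Cramer's V equals the absolute value of phi.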
Correlation: Pearson’s r (SPSS Correlate, Bivariate)
- Interval and/or ratio variables
- Pearson product moment coefficient (r)
  - two interval, normally distributed variables
  - assumes a linear relationship
  - can be any number from 0 to -1 or from 0 to +1
  - Sign (+ or -) shows direction
  - Number shows strength
  - Linearity cannot be determined from the coefficient
- Example: r = .8913
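A minimal sketch of the product moment formula, written in plain Python rather than SPSS. The data sets are hypothetical and chosen to illustrate the last point above: a perfectly curvilinear relationship (y = x squared) can still produce r = 0, so the coefficient alone says nothing about linearity.

```python
import math

def pearson_r(xs, ys):
    """Pearson product moment correlation for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfectly linear pattern and a perfectly curved one.
linear = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x**2
print(f"linear: r = {linear:.2f}")
print(f"curved: r = {curved:.2f}")
```

This is why a scattergram should be inspected alongside the coefficient.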
Summary
- Bivariate analysis
- Crosstabulation
  - X: columns; Y: rows
    - calculate percentages for columns
    - read percentages across the rows to observe association
- Correlation and scattergram
  - describe strength and direction of association