3.1 RELATIONSHIP BETWEEN TWO QUANTITATIVE VARIABLES: PEARSON CORRELATION COEFFICIENT
Design and Data Analysis in Psychology II
Susana Sanduvete Chaves
Salvador Chacón Moscoso
DEFINITION
• r_XY
• Coefficient that measures covariation between two variables: how changes in one variable are associated with changes in the other.
• Quantitative variables (interval or ratio scale).
• Linear relationships EXCLUSIVELY.
• Values: -1 ≤ r_XY ≤ +1.
• Interpretation:
  +1: perfect positive correlation (direct association).
  -1: perfect negative correlation (inverse association).
  0: no linear correlation.
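The restriction to linear relationships can be illustrated with a minimal sketch. The helper name `pearson_r` and the toy data are illustrative assumptions, not part of the slides; the function uses the usual deviation-score form of the coefficient.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from deviation scores (illustrative helper, not from the slides)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Toy data (hypothetical): one exact straight line, one exact parabola
xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]
quadratic = [x * x for x in xs]

r_linear = pearson_r(xs, linear)        # perfect positive correlation
r_quadratic = pearson_r(xs, quadratic)  # zero, despite exact dependence
print(r_linear, r_quadratic)
```

In the second case Y is completely determined by X, yet the coefficient is zero because the dependence is not linear.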
Perfect positive correlation: r_XY = +1 (difficult to find in psychology)
Positive correlation: 0 < r_XY < +1
Perfect negative correlation: r_XY = -1 (difficult to find in psychology)
Negative correlation: -1 < r_XY < 0
No correlation: r_XY = 0
Formulas
• Raw scores
• Deviation scores
• Standard scores
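The formulas themselves are not reproduced in the text of this slide. The standard textbook forms are assumed below. They agree with the arithmetic of the worked example, in particular with standard deviations computed using N rather than N - 1.

```latex
% Raw scores
\[
r_{XY} = \frac{N\sum XY - \sum X \sum Y}
              {\sqrt{\left[N\sum X^{2} - \left(\sum X\right)^{2}\right]
                     \left[N\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
\]
% Deviation scores, with x = X - \bar{X} and y = Y - \bar{Y}
\[
r_{XY} = \frac{\sum xy}{\sqrt{\sum x^{2} \sum y^{2}}}
\]
% Standard scores, with Z_X = x / S_X, Z_Y = y / S_Y and S_X = \sqrt{\sum x^{2} / N}
\[
r_{XY} = \frac{\sum Z_X Z_Y}{N}
\]
```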
Example
X: 2 4 6 8 10 12 14 16 18 20
Y: 1 6 8 10 12 10 12 13 10 22
1. Calculate r_XY in raw scores.
2. Calculate r_XY in deviation scores.
3. Calculate r_XY in standard scores.
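The three requested calculations can be cross-checked with a short script. This is a sketch under stated assumptions: the variable names are illustrative, the formulas are the standard raw-score, deviation-score and standard-score forms, and the ten Y values are those consistent with the column totals of the calculation tables (ΣY = 104, ΣXY = 1390, ΣY² = 1342).

```python
import math

# Example data; Y as implied by the column totals of the calculation tables
X = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
Y = [1, 6, 8, 10, 12, 10, 12, 13, 10, 22]
N = len(X)

# 1. Raw scores
sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(a * b for a, b in zip(X, Y))
sum_x2 = sum(a * a for a in X)
sum_y2 = sum(b * b for b in Y)
r_raw = (N * sum_xy - sum_x * sum_y) / math.sqrt(
    (N * sum_x2 - sum_x ** 2) * (N * sum_y2 - sum_y ** 2)
)

# 2. Deviation scores (each score minus its mean)
mean_x, mean_y = sum_x / N, sum_y / N
dx = [a - mean_x for a in X]
dy = [b - mean_y for b in Y]
r_dev = sum(a * b for a, b in zip(dx, dy)) / math.sqrt(
    sum(a * a for a in dx) * sum(b * b for b in dy)
)

# 3. Standard scores (standard deviations computed with N in the denominator)
sd_x = math.sqrt(sum(a * a for a in dx) / N)
sd_y = math.sqrt(sum(b * b for b in dy) / N)
zx = [a / sd_x for a in dx]
zy = [b / sd_y for b in dy]
r_std = sum(a * b for a, b in zip(zx, zy)) / N

print(round(r_raw, 3), round(r_dev, 3), round(r_std, 3))
```

All three routes give the same coefficient, about 0.839, since they are algebraic rearrangements of one another.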
Example: scatter plot
[Figure: scatter plot of the example data]
Example: calculation of r_XY in raw scores

        X     Y    XY    X²    Y²
        2     1     2     4     1
        4     6    24    16    36
        6     8    48    36    64
        8    10    80    64   100
       10    12   120   100   144
       12    10   120   144   100
       14    12   168   196   144
       16    13   208   256   169
       18    10   180   324   100
       20    22   440   400   484
Σ     110   104  1390  1540  1342
Example: calculation of r_XY in raw scores
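The worked substitution is not preserved in the text of this slide. Assuming the standard raw-score formula and using the column totals of the table (N = 10, ΣX = 110, ΣY = 104, ΣXY = 1390, ΣX² = 1540, ΣY² = 1342), the calculation would run as follows:

```latex
\[
r_{XY} = \frac{10(1390) - (110)(104)}
              {\sqrt{\left[10(1540) - 110^{2}\right]\left[10(1342) - 104^{2}\right]}}
       = \frac{2460}{\sqrt{(3300)(2604)}}
       = \frac{2460}{2931.42}
       \approx 0.839
\]
```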
Example: calculation of r_XY in deviation scores

       X     Y     x      y      xy    x²       y²
       2     1    -9   -9.4    84.6    81    88.36
       4     6    -7   -4.4    30.8    49    19.36
       6     8    -5   -2.4    12.0    25     5.76
       8    10    -3   -0.4     1.2     9     0.16
      10    12    -1    1.6    -1.6     1     2.56
      12    10     1   -0.4    -0.4     1     0.16
      14    12     3    1.6     4.8     9     2.56
      16    13     5    2.6    13.0    25     6.76
      18    10     7   -0.4    -2.8    49     0.16
      20    22     9   11.6   104.4    81   134.56
Σ    110   104     0      0     246   330    260.4
Example: calculation of r_XY in deviation scores
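The worked substitution is not preserved in the text of this slide. Assuming the standard deviation-score formula and the totals Σxy = 246, Σx² = 330 and Σy² = 260.4 from the table, it would read:

```latex
\[
r_{XY} = \frac{\sum xy}{\sqrt{\sum x^{2} \sum y^{2}}}
       = \frac{246}{\sqrt{(330)(260.4)}}
       = \frac{246}{293.14}
       \approx 0.839
\]
```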
Example: calculation of r_XY in standard scores

       X     Y     Z_X     Z_Y   Z_X·Z_Y
       2     1  -1.567  -1.842     2.886
       4     6  -1.218  -0.862     1.051
       6     8  -0.870  -0.470     0.409
       8    10  -0.522  -0.078     0.041
      10    12  -0.174   0.314    -0.055
      12    10   0.174  -0.078    -0.014
      14    12   0.522   0.314     0.164
      16    13   0.870   0.510     0.443
      18    10   1.218  -0.078    -0.096
      20    22   1.567   2.273     3.561
Σ    110   104       0       0     8.391
Example: calculation of r_XY in standard scores
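The worked substitution is not preserved in the text of this slide. The tabulated standard scores are reproduced when the standard deviations are computed with N in the denominator, so that convention is assumed here:

```latex
\[
S_X = \sqrt{\frac{\sum x^{2}}{N}} = \sqrt{\frac{330}{10}} \approx 5.745,
\qquad
S_Y = \sqrt{\frac{\sum y^{2}}{N}} = \sqrt{\frac{260.4}{10}} \approx 5.103
\]
\[
r_{XY} = \frac{\sum Z_X Z_Y}{N} = \frac{8.391}{10} \approx 0.839
\]
```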
Significance
• Does the correlation coefficient reflect a real relationship between X and Y, or could that relationship be due to chance?
• Null hypothesis H0: r_XY = 0. The correlation coefficient is drawn from a population whose correlation is zero (ρ_XY = 0).
• Alternative hypothesis H1: r_XY ≠ 0. The correlation coefficient is drawn from a population whose correlation is different from zero (ρ_XY ≠ 0).
Significance
• Formula: [not preserved in the extracted text]
• Interpretation:
  – If the null hypothesis is rejected: the correlation is not drawn from a population in which ρ_XY = 0. A significant relationship between the variables exists.
  – If the null hypothesis is retained: the correlation is compatible with a population in which ρ_XY = 0. No significant relationship between the variables is found.
• Exercise: draw a conclusion about the significance of the correlation in the example.
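A common way to test H0: ρ_XY = 0 is a Student's t statistic with N - 2 degrees of freedom. It is given here as an assumption, since the slide's own formula is not reproduced in the text:

```latex
\[
t = \frac{r_{XY}\sqrt{N-2}}{\sqrt{1-r_{XY}^{2}}},
\qquad \text{df} = N-2
\]
```

Under this assumption, H0 is rejected when |t| is at least the critical value of Student's t for the chosen α and N - 2 degrees of freedom.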
Significance: example
Conclusions: we reject the null hypothesis with a maximum risk of error of 0.05 (α = 0.05). The correlation is not drawn from a population in which ρ_XY = 0. A relationship between the variables exists.
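The conclusion can be checked numerically. This is a sketch only: it assumes the usual t test with N - 2 degrees of freedom (the slide's own calculation is not preserved) and a two-tailed critical value of 2.306 for α = 0.05 and 8 degrees of freedom, taken from a standard t table. Variable names are illustrative.

```python
import math

# Totals from the deviation-score table of the worked example
sum_xy, sum_x2, sum_y2 = 246.0, 330.0, 260.4
n = 10

r = sum_xy / math.sqrt(sum_x2 * sum_y2)

# Assumed test statistic: Student's t with n - 2 degrees of freedom
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Two-tailed critical value for alpha = 0.05 and df = 8 (standard t table)
T_CRITICAL = 2.306

reject_h0 = abs(t) >= T_CRITICAL
print(f"r = {r:.3f}, t = {t:.2f}, reject H0: {reject_h0}")
```

With t of roughly 4.36, well above 2.306, the script agrees with the conclusion that the null hypothesis is rejected at α = 0.05.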
Other questions to be considered
• Correlation does not imply causality.
• Statistical significance depends on sample size (the higher N is, the more likely it is to obtain significance).
• Another possible interpretation is given by the coefficient of determination, r²_XY: the proportion of variability in Y that is 'explained' by X.
• The proportion of variability in Y that is left unexplained by X is called the coefficient of non-determination: 1 - r²_XY.
• Exercise: calculate the coefficient of determination and the coefficient of non-determination and interpret the results.
Coefficient of determination: example
70.4% of the variability in Y is explained by X.
29.6% of the variability in Y is not explained.
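Both percentages follow directly from the squared correlation. A minimal sketch, using the totals of the deviation-score table and illustrative variable names:

```python
# Totals from the deviation-score table of the worked example
sum_xy, sum_x2, sum_y2 = 246.0, 330.0, 260.4

r_squared = sum_xy ** 2 / (sum_x2 * sum_y2)  # coefficient of determination
non_determination = 1 - r_squared            # coefficient of non-determination

print(f"{r_squared:.1%} explained, {non_determination:.1%} unexplained")
```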
What is the final conclusion?
• High effect size (≥ 0.67)
  – Non-significant effect: the non-significance can be due to low statistical power.
  – Significant effect: the effect probably exists.
• Low effect size (≤ 0.18)
  – Non-significant effect: the effect probably does not exist.
  – Significant effect: the statistical significance can be due to excessively high statistical power.
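The decision table can be written as a small lookup. This is a sketch: the function name is hypothetical, the thresholds 0.67 and 0.18 are the ones on the slide, and the slide does not say which effect-size measure they refer to or what to conclude for values in between, so that case is reported as not covered.

```python
def final_conclusion(significant: bool, effect_size: float) -> str:
    """Return the conclusion suggested by the slide's 2x2 table (hypothetical helper)."""
    if effect_size >= 0.67:  # high effect size
        if significant:
            return "The effect probably exists"
        return "The non-significance can be due to low statistical power"
    if effect_size <= 0.18:  # low effect size
        if significant:
            return "The statistical significance can be due to excessively high statistical power"
        return "The effect probably does not exist"
    # Values between the two thresholds are not addressed by the table
    return "Intermediate effect size: not covered by the table"

# Assumption: the coefficient of determination (0.704) is used as the effect size
print(final_conclusion(True, 0.704))
```

For the worked example, a significant result combined with a coefficient of determination of about 0.70 falls in the "effect probably exists" cell, provided that coefficient is the intended effect-size measure.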