Correlation and Prediction Chapter 10

Chapter Outline
• Graphing Correlations: The Scatter Diagram
• Patterns of Correlation
• The Correlation Coefficient
• Issues in Interpreting the Correlation Coefficient
• Prediction
• The Correlation Coefficient and Proportion of Variance Accounted for
• Correlation and Prediction in Research Articles
• Advanced Topic: Multiple Regression in Research Articles

Correlations
• Can be thought of as a descriptive statistic for the relationship between two variables
• Describes the relationship between two equal-interval numeric variables
  – e.g., correlation between amount of time studying and amount learned
  – e.g., correlation between number of years of education and salary

Correlation instruct.uwo.ca/geog/500/correlation_by_6.pdf

Scatter Diagram or Scatter Plot
• Graph showing the pattern of the relationship between two variables

Patterns of Correlation
• A linear correlation
  – relationship between two variables on a scatter diagram roughly approximating a straight line
• Curvilinear correlation
  – any association between two variables other than a linear correlation
  – relationship between two variables that shows up on a scatter diagram as dots following a systematic pattern that is not a straight line
• No correlation
  – no systematic relationship between two variables

Positive and Negative Linear Correlation
• Positive Correlation
  – High scores go with high scores.
  – Low scores go with low scores.
  – Medium scores go with medium scores.
    • e.g., level of education achieved and income
• Negative Correlation
  – High scores go with low scores.
    • e.g., the relationship between fewer hours of sleep and higher levels of stress
• Strength of the Correlation
  – how close the dots on a scatter diagram fall to a simple straight line

Positive Linear correlation

Negative correlation

Zero Correlation ludwig-sun2.unil.ch/~darlene/Rmini/lec/20021031.ppt

Curvilinear Relationship ludwig-sun2.unil.ch/~darlene/Rmini/lec/20021031.ppt

Curvilinear

How Are You Doing?
• What does it mean when two variables have a curvilinear relationship?
• True or False: When two variables are negatively correlated, high scores go with high scores, low scores go with low scores, and medium scores go with medium scores.

The Correlation Coefficient
• Number that gives the exact degree of correlation between two variables
  – can tell you direction and strength
  – uses Z scores to compare scores on different variables
• Z scores allow you to calculate a cross-product that tells you the direction of the correlation.
  – A cross-product is the result of multiplying a score on one variable by a score on the other variable.
  – If you multiply a high Z score (above the mean, so positive) by a high Z score, you will always get a positive cross-product.
  – If you multiply a low Z score (below the mean, so negative) by a low Z score, you will always get a positive cross-product.
  – If you multiply a high Z score by a low Z score, or a low Z score by a high Z score, you will get a negative cross-product.

The Correlation Coefficient (r)
• The sign of r (Pearson correlation coefficient) tells the general trend of a relationship between two variables.
  – A + sign means the correlation is positive.
  – A - sign means the correlation is negative.
• The value of r ranges from -1 to 1.
  – 1 is the highest value a correlation can have.
    • A correlation of 1 or -1 means that the variables are perfectly correlated.
    • 0 = no linear correlation
  – The absolute value of a correlation defines the strength of the correlation regardless of the sign.
    • e.g., -.99 is a stronger correlation than .75
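A small Python sketch of how to read r: the sign gives the direction and the absolute value gives the strength. The function names `direction` and `stronger` are illustrative, not from the chapter.

```python
def direction(r):
    """Return the direction indicated by the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

def stronger(r1, r2):
    """Return whichever correlation is stronger, judged by absolute value."""
    return r1 if abs(r1) >= abs(r2) else r2

# The slide's example: -.99 is stronger than .75 even though it is negative.
print(direction(-0.99), stronger(-0.99, 0.75))
```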

Formula for a Correlation Coefficient
• r = (∑ZxZy)/N
• Zx = Z score for each person on the X variable
• Zy = Z score for each person on the Y variable
• ZxZy = cross-product of Zx and Zy
• ∑ZxZy = sum of the cross-products of the Z scores over all participants in the study
• N = number of participants in the study
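The formula translates directly into code. The sketch below assumes Z scores are computed with the population standard deviation (dividing by N), which is what makes the average cross-product equal Pearson's r. The sleep and stress numbers are hypothetical.

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """r = (sum of Zx * Zy) / N, with Z scores based on the population SD."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sdx, sdy = pstdev(xs), pstdev(ys)
    zx = [(x - mx) / sdx for x in xs]
    zy = [(y - my) / sdy for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / n

# Hypothetical data: hours of sleep and stress rating for five people.
hours_sleep = [5, 6, 7, 8, 9]
stress = [9, 8, 6, 5, 2]

r = pearson_r(hours_sleep, stress)
print(round(r, 2))  # -0.98
```

More sleep goes with less stress in these made-up data, so r is strongly negative.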

Pearson Correlation Coefficient
• Pearson correlation coefficient “r” is the average value of the cross-product of Zx and Zy
• r is a measure of LINEAR ASSOCIATION (Direction: + vs. – and Strength: How much
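Because r measures only linear association, a perfectly systematic curvilinear pattern can still give r = 0. The Python sketch below illustrates this with made-up data where Y is exactly X squared. The helper `pearson_r` is an illustrative implementation of the average cross-product, assuming population standard deviations.

```python
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """Average cross-product of Z scores (population SD, divide by N)."""
    mx, my = mean(xs), mean(ys)
    sdx, sdy = pstdev(xs), pstdev(ys)
    products = [((x - mx) / sdx) * ((y - my) / sdy) for x, y in zip(xs, ys)]
    return sum(products) / len(xs)

# Hypothetical U-shaped data: Y depends entirely on X, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curvilinear = pearson_r(xs, ys)
print(abs(r_curvilinear) < 1e-9)  # True: essentially zero linear correlation
```

The positive cross-products on one side of the U cancel the negative ones on the other side, which is why a scatter diagram should be inspected before r is taken to mean "no relationship".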

Definitional Formula

Computational Formula

Bivariate Correlation

Issues in Interpreting the Correlation Coefficient
• Direction of causality
  – path of causal effect (e.g., X causes Y)
• You cannot determine the direction of causality just because two variables are correlated.

Three Possible Directions of Causality
• Variable X causes variable Y.
  – e.g., less sleep causes more stress
• Variable Y causes variable X.
  – e.g., more stress causes people to sleep less
• There is a third variable that causes both variable X and variable Y.
  – e.g., working longer hours causes both stress and fewer hours of sleep

Ruling Out Some Possible Directions of Causality
• Longitudinal Study
  – a study where people are measured at two or more points in time
    • e.g., evaluating number of hours of sleep at one time point and then evaluating their levels of stress at a later time point
• True Experiment
  – a study in which participants are randomly assigned to a particular level of a variable and then measured on another variable
    • e.g., exposing individuals to varying amounts of sleep in a laboratory environment and then evaluating their stress levels

The Statistical Significance of r
• A correlation is statistically significant if it is unlikely that you could have gotten a correlation as large as the one observed if in fact there were no relationship between the variables.
  – If the probability (p) is less than some small cutoff (e.g., 5% or 1%), the correlation is considered statistically significant.
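One way to make "unlikely if there were no relationship" concrete is an exact permutation test, sketched below in Python. This is a swapped-in illustration and not necessarily the procedure the chapter uses. It re-pairs the Y scores with the X scores in every possible order and counts how often the resulting correlation is at least as large (in absolute value) as the observed one. The data and function names are hypothetical.

```python
from itertools import permutations
from statistics import mean, pstdev

def pearson_r(xs, ys):
    """Average cross-product of Z scores (population SD, divide by N)."""
    mx, my = mean(xs), mean(ys)
    sdx, sdy = pstdev(xs), pstdev(ys)
    products = [((x - mx) / sdx) * ((y - my) / sdy) for x, y in zip(xs, ys)]
    return sum(products) / len(xs)

def permutation_p_value(xs, ys):
    """Two-tailed exact p: share of re-pairings with |r| >= observed |r|."""
    observed = abs(pearson_r(xs, ys))
    at_least_as_large = 0
    total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so floating-point ties count as "as large".
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total

# Hypothetical data: hours of sleep and stress rating for six people.
hours_sleep = [5, 6, 7, 8, 9, 10]
stress = [9, 8, 6, 5, 3, 2]

p = permutation_p_value(hours_sleep, stress)
print(p < 0.05)  # True
```

With six people there are 720 possible pairings, and only the observed order and its exact reverse produce a correlation this extreme, so p falls below both the 5% and 1% cutoffs. Enumerating every pairing is practical only for very small samples.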

Malawi Med J. 2012 Sep; 24(3): 69–71.

Key Points
• Two variables are correlated when they are associated in a clear pattern.
• A scatter diagram displays the relationship between two variables.
• A linear correlation is seen when the dots in a scatter diagram generally follow a straight line.
• In a curvilinear correlation, the dots follow a pattern that does not approximate a straight line.
• When there is no correlation, the dots do not follow a pattern.
• In a positive correlation, the highs go with the highs, the lows with the lows, and the mediums go with the mediums.
• With a negative correlation, the lows go with the highs.
• r is the correlation coefficient and gives you the direction and strength of a correlation.
• r = (∑ZxZy)/N
• The maximum positive value of r = 1 and the maximum negative value of r = -1.
• The closer the correlation is to -1 or 1, the stronger the correlation.
• Correlation does not tell you the direction of causation.
• Prediction model using Z scores: predicted Zy = (β)(Zx), where β is the standardized regression coefficient.
• Prediction model with raw scores: predicted Y = (SDy)(predicted Zy) + My.
• r² = proportion of variance accounted for; it is used to compare linear correlations.
• Correlation coefficients are reported both in the text and in tables of research articles.
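The two prediction models and r² can be sketched in Python as follows. The sketch assumes the coefficient multiplying Zx is the standardized regression coefficient, which in bivariate prediction (one predictor) equals r. It also assumes population standard deviations. The data and the function name `predict_y` are hypothetical.

```python
from statistics import mean, pstdev

def predict_y(x_new, xs, ys):
    """Predict a raw Y score from a raw X score via the Z-score model."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sdx, sdy = pstdev(xs), pstdev(ys)
    r = sum(((x - mx) / sdx) * ((y - my) / sdy) for x, y in zip(xs, ys)) / n
    beta = r                             # assumption: one predictor, so beta = r
    zx_new = (x_new - mx) / sdx          # raw X -> Zx
    predicted_zy = beta * zx_new         # predicted Zy = (beta)(Zx)
    predicted = sdy * predicted_zy + my  # predicted Y = (SDy)(predicted Zy) + My
    return predicted, r, r ** 2

# Hypothetical data: hours of sleep and stress rating for five people.
hours_sleep = [5, 6, 7, 8, 9]
stress = [9, 8, 6, 5, 2]

predicted_stress, r, r_squared = predict_y(8, hours_sleep, stress)
print(round(predicted_stress, 1), round(r_squared, 2))  # 4.3 0.96
```

For someone who sleeps 8 hours, the predicted stress rating is about 4.3, and sleep accounts for about 96% of the variance in stress in these made-up data.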