Correlation: How Strong Is the Linear Relationship? (Lecture)


Correlation: How Strong Is the Linear Relationship?
Lecture 50, Sec. 13.7
Fri, Apr 22, 2005

The Correlation Coefficient
• The correlation coefficient r is a number between -1 and +1.
• It measures the direction and strength of the linear relationship.
• If r > 0, then the relationship is positive. If r < 0, then the relationship is negative.
• The closer r is to +1 or -1, the stronger the relationship.
• The closer r is to 0, the weaker the relationship.

Strong Positive Linear Association
• In this display, r is close to +1.
[Scatterplot of y versus x]

Strong Negative Linear Association
• In this display, r is close to -1.
[Scatterplot of y versus x]

Almost No Linear Association
• In this display, r is close to 0.
[Scatterplot of y versus x]
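The three situations described above (r near +1, near -1, and near 0) can be reproduced with a few numbers. The sketch below is only an illustration: the data sets are made up, and the helper name pearson_r is not from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data, chosen only to illustrate the three cases.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y_up = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]    # rises steadily with x
y_down = [16.1, 13.8, 12.2, 10.1, 7.8, 6.2, 3.9, 2.1]  # falls steadily with x
y_flat = [5, 9, 9, 5, 5, 9, 9, 5]                      # no linear trend

r_up = pearson_r(x, y_up)      # close to +1
r_down = pearson_r(x, y_down)  # close to -1
r_flat = pearson_r(x, y_flat)  # close to 0
print(r_up, r_down, r_flat)
```

The sign of r follows the direction of the trend, and its size shrinks toward 0 when the points show no linear pattern.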

Correlation vs. Cause and Effect
• If the value of r is close to +1 or -1, that indicates that x is a good predictor of y.
• It does not indicate that x causes y.
• The correlation coefficient cannot be used to determine cause and effect.

Correlation vs. Cause and Effect
• There is good reason to believe that the size of a person’s waistline is a predictor of his performance on an algebra test (within the age range 0–21).
• However, increasing your waistline will not help you on an algebra test.
• Conversely, learning more algebra will not increase your waistline.
• So why is there a relationship?

“Third” Variables
• The hidden third variable is age.
• Age causes (to some extent) the waistline to increase.
• Age causes (to some extent) a person to do better on an algebra test.
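A small made-up data set can illustrate how a third variable produces a strong correlation. In the sketch below, both waist size and test score are built as "something proportional to age, plus a small wiggle"; the coefficients, the wiggles, and all variable names are assumptions for illustration, not data from the lecture.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

ages = list(range(4, 20))        # 16 hypothetical people, ages 4 to 19
wiggle_w = [1, -1] * 8           # personal variation in waist size
wiggle_s = [1, 1, -1, -1] * 4    # personal variation in test score

# Both variables depend on age; neither depends on the other.
waist = [40 + 2 * a + w for a, w in zip(ages, wiggle_w)]
score = [5 * a + s for a, s in zip(ages, wiggle_s)]

r_overall = pearson_r(waist, score)         # strong, driven entirely by age
r_leftover = pearson_r(wiggle_w, wiggle_s)  # the non-age parts are unrelated
print(r_overall, r_leftover)
```

Here waist and score are strongly correlated even though neither was computed from the other, and the parts not explained by age show no correlation at all.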

Mixing Populations
• Mixing nonhomogeneous groups can create a misleading correlation coefficient.
• Suppose we gather data on the number of hours spent watching TV each week and the child’s reading level, for 1st, 2nd, and 3rd grade students.

Mixing Populations
• We may get the following results, suggesting a weak positive correlation.
[Scatterplot of reading level versus number of hours of TV]

Mixing Populations
• However, if we separate the points according to grade level, we may see a different picture.
[Scatterplot of reading level versus number of hours of TV, with 1st-, 2nd-, and 3rd-grade points marked separately]

Mixing Populations
• First-grade students by themselves may indicate negative correlation.
[Scatterplot of reading level versus number of hours of TV, 1st-grade points highlighted]

Mixing Populations
• Second-grade students by themselves may also indicate negative correlation.
[Scatterplot of reading level versus number of hours of TV, 2nd-grade points highlighted]

Mixing Populations
• And third-grade students by themselves may indicate negative correlation.
[Scatterplot of reading level versus number of hours of TV, 3rd-grade points highlighted]

Mixing Populations
• So, why did the points in the aggregate indicate a positive relationship?
[Scatterplot of reading level versus number of hours of TV, with 1st-, 2nd-, and 3rd-grade points marked separately]
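The effect can be reproduced with invented numbers. In the sketch below, each grade shows a negative relationship between TV hours and reading level, but higher grades have both more TV hours and higher reading levels, so the pooled data slope upward. All values and names are hypothetical; the lecture's actual plots are not available here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (TV hours per week, reading level) data for each grade.
grades = {
    "1st": ([1, 2, 3], [1.6, 1.3, 1.0]),
    "2nd": ([3, 4, 5], [2.6, 2.3, 2.0]),
    "3rd": ([5, 6, 7], [3.6, 3.3, 3.0]),
}

# Correlation within each grade: negative in every case.
r_within = {g: pearson_r(tv, level) for g, (tv, level) in grades.items()}

# Correlation after pooling all three grades: positive.
tv_all = [t for tv, _ in grades.values() for t in tv]
level_all = [v for _, level in grades.values() for v in level]
r_all = pearson_r(tv_all, level_all)

print(r_within, r_all)
```

In this constructed example, grade level plays the role of a hidden third variable: the differences between the groups, not the pattern inside any one group, produce the positive aggregate correlation.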

Calculating the Correlation Coefficient
• There are many formulas for r.
• The most basic formula is
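The formula on this slide did not survive in the transcript. Judging from the sums that the example goes on to compute (the sums of x, y, x², y², and xy), it is presumably the standard computational form, shown here as an assumption rather than a quotation from the slide:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\sum x^{2} - \left(\sum x\right)^{2}}\;
          \sqrt{n\sum y^{2} - \left(\sum y\right)^{2}}}
```

Here n is the number of (x, y) pairs.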

Example
• Consider again the data

    x    2    3    5    6    9
    y    3    5    9   12   16

Example
• Compute Σx, Σy, Σx², Σy², and Σxy.

           x     y    x²    y²    xy
           2     3     4     9     6
           3     5     9    25    15
           5     9    25    81    45
           6    12    36   144    72
           9    16    81   256   144
    Sum   25    45   155   515   282

Example
• Then compute r.
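The computation itself is missing from the transcript. A minimal sketch, assuming the computational formula r = (nΣxy - ΣxΣy) / (√(nΣx² - (Σx)²) · √(nΣy² - (Σy)²)), reproduces the column sums of the example and the value r = 0.9922 quoted later in the lecture. The variable names are chosen for illustration.

```python
from math import sqrt

x = [2, 3, 5, 6, 9]
y = [3, 5, 9, 12, 16]
n = len(x)

# Column sums, matching the table in the example.
sum_x = sum(x)                               # 25
sum_y = sum(y)                               # 45
sum_x2 = sum(v * v for v in x)               # 155
sum_y2 = sum(v * v for v in y)               # 515
sum_xy = sum(a * b for a, b in zip(x, y))    # 282

# Assumed computational formula for r.
numerator = n * sum_xy - sum_x * sum_y       # 5*282 - 25*45 = 285
denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
r = numerator / denominator

print(round(r, 4))  # 0.9922
```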

An Alternate Formula
• An alternate formula is
• First, compute
• Then compute r.
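The alternate formula is also missing from the transcript, so which form the slide used is unknown. One common alternative, consistent with the standard deviations quoted later in the lecture, is r = Σ(x - x̄)(y - ȳ) / ((n - 1) · s_X · s_Y). The sketch below assumes that form; it is not a reconstruction of the slide.

```python
from statistics import mean, stdev

x = [2, 3, 5, 6, 9]
y = [3, 5, 9, 12, 16]
n = len(x)

# First, compute the means and the sample standard deviations.
mean_x, mean_y = mean(x), mean(y)   # 5 and 9
s_x, s_y = stdev(x), stdev(y)       # about 2.7386 and 5.2440

# Then compute r from the sum of products of deviations.
sum_dev_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # 57
r = sum_dev_products / ((n - 1) * s_x * s_y)

print(round(s_x, 4), round(s_y, 4), round(r, 4))
```

This gives the same r as the sum-based formula, since both are algebraic rearrangements of the same quantity.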

TI-83 – Calculating r
• To calculate r on the TI-83:
  • First, be sure that Diagnostic is turned on.
  • Then, follow the procedure that produces the regression line.
  • In the same window, the TI-83 reports r² and r.
• Use the TI-83 to calculate r in the preceding example.

Let’s Do It!
• Let’s Do It! 13.10, p. 781 – Oil-Change Data.
  • Do part (b) on the TI-83.
• Let’s Do It! 13.11, p. 782 – Data on Milk Production.

The Relationship Between b and r
• It turns out that there is a simple relationship between the slope b of the regression line and the correlation coefficient r: b = r · (s_Y / s_X).

The Relationship Between b and r
• In the previous example, we found s_X = 2.7386 and s_Y = 5.2440.
• We also found r = 0.9922.
• Therefore, the slope is b = r · (s_Y / s_X) = 0.9922 × 5.2440 / 2.7386 ≈ 1.9.

The Relationship Between b and r
• Equivalently, r = b · (s_X / s_Y).
• In our example, s_X = 2.7386, s_Y = 5.2440, and b = 1.9.
• Therefore, r = 1.9 × 2.7386 / 5.2440 ≈ 0.9922.
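The relationship between the slope and the correlation coefficient can be checked numerically on the example data. The sketch below computes the least-squares slope directly, as Σ(x - x̄)(y - ȳ) / Σ(x - x̄)², and compares it with r · (s_Y / s_X). The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

x = [2, 3, 5, 6, 9]
y = [3, 5, 9, 12, 16]

mean_x, mean_y = mean(x), mean(y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # 57
sxx = sum((a - mean_x) ** 2 for a in x)                        # 30
syy = sum((b - mean_y) ** 2 for b in y)                        # 110

b_direct = sxy / sxx          # least-squares slope, 1.9
r = sxy / sqrt(sxx * syy)     # correlation coefficient, about 0.9922

s_x, s_y = stdev(x), stdev(y)
b_from_r = r * s_y / s_x      # slope recovered from r
r_from_b = b_direct * s_x / s_y  # r recovered from the slope

print(round(b_direct, 4), round(b_from_r, 4), round(r_from_b, 4))
```

Both directions agree up to floating-point rounding: the slope is r rescaled by the ratio of the two standard deviations.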