ELEMENTARY STATISTICS Chapter 9 Correlation and Regression C. M. Pascual Chapter 9. Section 9 -1 and 9 -2. Triola, Elementary Statistics, Eighth Edition. Copyright 2001. Addison Wesley Longman 1

Chapter 9 Correlation and Regression
9-1 Overview
9-2 Correlation
9-3 Regression
9-4 Variation and Prediction Intervals
9-5 Multiple Regression
9-6 Modeling

9-1 Overview
Paired Data
• Is there a relationship?
• If so, what is the equation?
• Use the equation for prediction.

9-2 Correlation

Definition
• Correlation exists between two variables when one of them is related to the other in some way.

Assumptions
1. The sample of paired data (x, y) is a random sample.
2. The pairs of (x, y) data have a bivariate normal distribution.

Definition
• Scatterplot (or scatter diagram): a graph in which the paired (x, y) sample data are plotted with a horizontal x axis and a vertical y axis. Each individual (x, y) pair is plotted as a single point.

[Figure: Scatter Diagram of Paired Data]

Positive Linear Correlation
[Figure 9-1, Scatter Plots: (a) Positive, (b) Strong positive, (c) Perfect positive; x on the horizontal axis, y on the vertical axis]

Negative Linear Correlation
[Figure 9-1, Scatter Plots: (d) Negative, (e) Strong negative, (f) Perfect negative; x on the horizontal axis, y on the vertical axis]

No Linear Correlation
[Figure 9-1, Scatter Plots: (g) No correlation, (h) Nonlinear correlation; x on the horizontal axis, y on the vertical axis]

Definition
• Linear Correlation Coefficient r: measures the strength of the linear relationship between paired x and y values in a sample.

Formula 9-1:
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$$

• Calculators can compute r.
• ρ (rho) is the linear correlation coefficient for all paired data in the population.

Notation for the Linear Correlation Coefficient
• n = number of pairs of data presented.
• ∑ denotes the addition of the items indicated.
• ∑x denotes the sum of all x values.
• ∑x² indicates that each x score should be squared and then those squares added.
• (∑x)² indicates that the x scores should be added and the total then squared.
• ∑xy indicates that each x score should first be multiplied by its corresponding y score. After obtaining all such products, find their sum.
• r represents the linear correlation coefficient for a sample.
• ρ represents the linear correlation coefficient for a population.
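The notation above maps directly onto a few running sums. The following is a minimal sketch of Formula 9-1 in Python; the function name `pearson_r` and the five data pairs are illustrative assumptions, not part of the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r computed with Formula 9-1."""
    n = len(xs)                                   # number of pairs
    sum_x = sum(xs)                               # sum of x
    sum_y = sum(ys)                               # sum of y
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of the products xy
    sum_x2 = sum(x * x for x in xs)               # sum of x squared
    sum_y2 = sum(y * y for y in ys)               # sum of y squared
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical paired data, chosen only to exercise the formula.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))
```

Note the difference between `sum_x2` (square first, then add) and `sum_x ** 2` (add first, then square); mixing them up is the usual source of hand-calculation errors.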

Rounding the Linear Correlation Coefficient r
• Round to three decimal places so that it can be compared to critical values in Table A-6.
• Use a calculator or computer if possible.

Interpreting the Linear Correlation Coefficient
• If the absolute value of r exceeds the value in Table A-6, conclude that there is a significant linear correlation.
• Otherwise, there is not sufficient evidence to support the conclusion of a significant linear correlation.

TABLE A-6 Critical Values of the Pearson Correlation Coefficient r

  n     α = .05   α = .01
  4      .950      .999
  5      .878      .959
  6      .811      .917
  7      .754      .875
  8      .707      .834
  9      .666      .798
 10      .632      .765
 11      .602      .735
 12      .576      .708
 13      .553      .684
 14      .532      .661
 15      .514      .641
 16      .497      .623
 17      .482      .606
 18      .468      .590
 19      .456      .575
 20      .444      .561
 25      .396      .505
 30      .361      .463
 35      .335      .430
 40      .312      .402
 45      .294      .378
 50      .279      .361
 60      .254      .330
 70      .236      .305
 80      .220      .286
 90      .207      .269
100      .196      .256

Example 1
• Construct a scatter plot for the given data.

Age, x         43   48   56   61   67   70
Pressure, y   128  120  135  143  141  152

Example 1
Solution: Draw and label the x and y axes, then plot each (x, y) point on the graph.
[Figure: scatter plot of pressure (y) against age (x)]

Example 2
• A statistics professor at a state university wants to see how strong the relationship is between a student's score on a test and his or her grade point average. The data obtained from the sample follow:

Test score, x    98   105   100   100   106    95   116   112
GPA, y          2.1   2.4   3.2   2.7   2.2   2.3   3.8   3.4

Subject   Test score, x   GPA, y       xy      x^2     y^2
1               98          2.1      205.8     9604    4.41
2              105          2.4      252.0    11025    5.76
3              100          3.2      320.0    10000   10.24
4              100          2.7      270.0    10000    7.29
5              106          2.2      233.2    11236    4.84
6               95          2.3      218.5     9025    5.29
7              116          3.8      440.8    13456   14.44
8              112          3.4      380.8    12544   11.56
SUM            832         22.1     2321.1    86890   63.83

Example 2
• Solve for SSxy, SSxx, and SSyy:
SSxy = ∑xy - [(∑x)(∑y)]/n = 2321.1 - [(832)(22.1)]/8 = 22.7
SSxx = ∑x² - (∑x)²/n = 86890 - [(832)²]/8 = 362
SSyy = ∑y² - (∑y)²/n = 63.83 - [(22.1)²]/8 = 2.78

Example 2
• Substitute in the formula and solve for r:
• r = SSxy/(SSxx · SSyy)^0.5
•   = 22.7/[(362)(2.78)]^0.5 = 0.716
• The correlation coefficient suggests a strong positive relationship between the test score and the grade point average.
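The arithmetic in Example 2 can be checked with a short script. This is a sketch only: the variable names are illustrative, and the eight (x, y) pairs are taken from the Example 2 table of sums.

```python
from math import sqrt

# Example 2 data: test score (x) and GPA (y) for the eight subjects.
test_scores = [98, 105, 100, 100, 106, 95, 116, 112]
gpas = [2.1, 2.4, 3.2, 2.7, 2.2, 2.3, 3.8, 3.4]
n = len(test_scores)

sum_x = sum(test_scores)
sum_y = sum(gpas)

# Sums of squares and cross-products, as defined on the slide.
ss_xy = sum(x * y for x, y in zip(test_scores, gpas)) - sum_x * sum_y / n
ss_xx = sum(x * x for x in test_scores) - sum_x ** 2 / n
ss_yy = sum(y * y for y in gpas) - sum_y ** 2 / n

r = ss_xy / sqrt(ss_xx * ss_yy)
print(round(ss_xy, 1), round(ss_xx), round(ss_yy, 2), round(r, 3))
```

Dividing the numerator and denominator of Formula 9-1 by n gives exactly this SS form, so the two routes to r agree.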

Properties of the Linear Correlation Coefficient r
1. -1 ≤ r ≤ 1
2. The value of r does not change if all values of either variable are converted to a different scale.
3. The value of r is not affected by the choice of x and y. Interchange x and y and the value of r will not change.
4. r measures the strength of a linear relationship.
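Properties 1 to 3 are easy to confirm numerically. The sketch below reuses the Example 2 data; the helper `pearson_r` and the rescaling of GPA to a 0 to 100 scale (multiplying by 25) are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r from deviations about the means (algebraically equal to Formula 9-1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_xx = sum((x - mean_x) ** 2 for x in xs)
    ss_yy = sum((y - mean_y) ** 2 for y in ys)
    return ss_xy / sqrt(ss_xx * ss_yy)

test_scores = [98, 105, 100, 100, 106, 95, 116, 112]
gpas = [2.1, 2.4, 3.2, 2.7, 2.2, 2.3, 3.8, 3.4]

r_original = pearson_r(test_scores, gpas)
# Property 2: put GPA on a hypothetical 0-100 scale.
r_rescaled = pearson_r(test_scores, [25 * y for y in gpas])
# Property 3: interchange x and y.
r_swapped = pearson_r(gpas, test_scores)

print(round(r_original, 3), round(r_rescaled, 3), round(r_swapped, 3))
```

One caveat worth stating: a change of scale that reverses direction (multiplying by a negative number) flips the sign of r while leaving its magnitude unchanged.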

Common Errors Involving Correlation
1. Causation: It is wrong to conclude that correlation implies causality.
2. Averages: Averages suppress individual variation and may inflate the correlation coefficient.
3. Linearity: There may be some relationship between x and y even when there is no significant linear correlation.

Common Errors Involving Correlation
[FIGURE 9-2: Scatterplot of Distance above Ground and Time for Object Thrown Upward; horizontal axis Time (seconds), 0 to 8; vertical axis Distance (feet), 0 to 250]
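The linearity error can be made concrete with a thrown object. The sketch below does not use the data behind Figure 9-2; it assumes a hypothetical trajectory d = 128t - 16t² feet sampled once per second, so distance depends completely on time and yet r comes out as zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient r computed with Formula 9-1."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

# Hypothetical trajectory (an assumption, not the figure's data):
# distance above ground in feet at t = 0, 1, ..., 8 seconds.
times = list(range(9))
distances = [128 * t - 16 * t * t for t in times]

r = pearson_r(times, distances)
print(distances)
print(r)
```

The rise and the fall cancel in the numerator of Formula 9-1, which is why a perfect parabolic relationship shows no linear correlation.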

Formal Hypothesis Test
• To determine whether there is a significant linear correlation between two variables
• Two methods
• Both methods let
  H0: ρ = 0 (no significant linear correlation)
  H1: ρ ≠ 0 (significant linear correlation)

Method 1: Test Statistic is t (follows format of earlier chapters)

Test statistic:
$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

Critical values: use Table A-3 with degrees of freedom = n - 2

[Figure 9-4: Method 1, test statistic is t]
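Method 1 can be sketched in a few lines. The example below applies the t formula to the Example 2 result (r = 0.716, n = 8). Table A-3 is not reproduced in these slides, so the critical value 2.447 is taken from a standard t table (two-tailed, α = 0.05, 6 degrees of freedom) and is assumed to match Table A-3; the function name `t_statistic` is also illustrative.

```python
from math import sqrt

def t_statistic(r, n):
    """Method 1 test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / sqrt((1 - r ** 2) / (n - 2))

r, n = 0.716, 8                 # Example 2: test score versus GPA
t = t_statistic(r, n)

# Assumption: standard two-tailed t critical value, alpha = 0.05, df = 6.
critical_t = 2.447
reject_h0 = abs(t) > critical_t

print(round(t, 3), reject_h0)
```

Method 2 reaches the same decision for these data, since 0.716 exceeds the Table A-6 critical value of .707 for n = 8.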

Method 2: Test Statistic is r (uses fewer calculations)
• Test statistic: r
• Critical values: Refer to Table A-6 (no degrees of freedom)
[Figure 9-5: number line from -1 to 1 with critical values r = -0.811 and r = 0.811. Reject ρ = 0 in both tails; fail to reject ρ = 0 between the critical values. Sample data: r = 0.828]

FIGURE 9-3 Testing for a Linear Correlation
1. Start. Let H0: ρ = 0 and H1: ρ ≠ 0.
2. Select a significance level α.
3. Calculate r using Formula 9-1.
4. Choose a method:
   METHOD 1: The test statistic is t = r / √[(1 - r²)/(n - 2)]. Critical values of t are from Table A-3 with n - 2 degrees of freedom.
   METHOD 2: The test statistic is r. Critical values of r are from Table A-6.
5. If the absolute value of the test statistic exceeds the critical values, reject H0: ρ = 0. Otherwise, fail to reject H0.
6. If H0 is rejected, conclude that there is a significant linear correlation. If you fail to reject H0, then there is not sufficient evidence to conclude that there is a linear correlation.
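The Method 2 branch of the flowchart amounts to a table lookup followed by a comparison. The sketch below encodes the Table A-6 values given earlier in the slides; the names `TABLE_A6` and `has_significant_linear_correlation` are illustrative, and sample sizes that fall between tabulated rows are deliberately left unsupported rather than interpolated.

```python
SAMPLE_SIZES = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100]
CRITICAL_05 = [.950, .878, .811, .754, .707, .666, .632, .602, .576, .553,
               .532, .514, .497, .482, .468, .456, .444, .396, .361, .335,
               .312, .294, .279, .254, .236, .220, .207, .196]
CRITICAL_01 = [.999, .959, .917, .875, .834, .798, .765, .735, .708, .684,
               .661, .641, .623, .606, .590, .575, .561, .505, .463, .430,
               .402, .378, .361, .330, .305, .286, .269, .256]

# Table A-6 as {alpha: {n: critical value of r}}.
TABLE_A6 = {
    0.05: dict(zip(SAMPLE_SIZES, CRITICAL_05)),
    0.01: dict(zip(SAMPLE_SIZES, CRITICAL_01)),
}

def has_significant_linear_correlation(r, n, alpha=0.05):
    """Method 2: reject H0 (rho = 0) when |r| exceeds the Table A-6 value."""
    critical_r = TABLE_A6[alpha][n]   # raises KeyError if n or alpha is not tabulated
    return abs(r) > critical_r

print(has_significant_linear_correlation(0.716, 8))        # Example 2
print(has_significant_linear_correlation(0.828, 6))        # Figure 9-5
print(has_significant_linear_correlation(0.716, 8, 0.01))  # stricter alpha
```

At α = 0.01 the Example 2 result no longer clears the critical value (.834), which shows how the conclusion depends on the significance level chosen in step 2.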

Is there a significant linear correlation?
Data from the Garbage Project

x Plastic (lb)   0.27   1.41   2.19   2.83   2.19   1.81   0.85   3.05
y Household         2      3      3      6      4      2      1      5

n = 8, α = 0.05
H0: ρ = 0
H1: ρ ≠ 0
Test statistic is r = 0.842

Is there a significant linear correlation?
n = 8, α = 0.05
H0: ρ = 0
H1: ρ ≠ 0
Test statistic is r = 0.842
Critical values are r = -0.707 and r = 0.707 (Table A-6 with n = 8 and α = 0.05)

Is there a significant linear correlation?
0.842 > 0.707; that is, the test statistic does fall within the critical region.
Therefore, we REJECT H0: ρ = 0 (no correlation) and conclude there is a significant linear correlation between the weights of discarded plastic and household size.
[Figure: number line from -1 to 1 with critical values r = -0.707 and r = 0.707. Reject ρ = 0 in both tails; fail to reject ρ = 0 between the critical values. Sample data: r = 0.842]
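The Garbage Project conclusion can be reproduced end to end. The sketch below assumes the eight (plastic, household size) pairs are paired in the order the slide lists them, and it borrows the standard two-tailed t critical value 2.447 (α = 0.05, 6 degrees of freedom) for the Method 1 cross-check, since Table A-3 is not shown in these slides.

```python
from math import sqrt

# Garbage Project data as read from the slide.
plastic_lb = [0.27, 1.41, 2.19, 2.83, 2.19, 1.81, 0.85, 3.05]   # x
household_size = [2, 3, 3, 6, 4, 2, 1, 5]                        # y
n = len(plastic_lb)

# Formula 9-1.
sum_x, sum_y = sum(plastic_lb), sum(household_size)
sum_xy = sum(x * y for x, y in zip(plastic_lb, household_size))
sum_x2 = sum(x * x for x in plastic_lb)
sum_y2 = sum(y * y for y in household_size)
r = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
)

# Method 2: compare |r| with the Table A-6 value for n = 8, alpha = 0.05.
reject_by_r = abs(r) > 0.707

# Method 1: compare |t| with the assumed t critical value (df = n - 2 = 6).
t = r / sqrt((1 - r ** 2) / (n - 2))
reject_by_t = abs(t) > 2.447

print(round(r, 3), round(t, 2), reject_by_r, reject_by_t)
```

Both methods lead to rejecting H0 here.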

Justification for r Formula
Formula 9-1 is developed from

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$$

where (x̄, ȳ) is the centroid of the sample points.

[FIGURE 9-6: scatterplot with x from 0 to 7 and y from 0 to 24, divided into Quadrants 1 to 4 around the centroid (x̄, ȳ), where x̄ = 3 and ȳ = 11. For the point (7, 23): x - x̄ = 7 - 3 = 4 and y - ȳ = 23 - 11 = 12.]
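A quick numerical check shows that the deviation form above and Formula 9-1 give the same value. The sketch uses the Example 2 data rather than the points in Figure 9-6, which the slides do not list in full; both function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

def r_from_deviations(xs, ys):
    """r as the sum of (x - x_bar)(y - y_bar) over (n - 1) * s_x * s_y."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross_products / ((n - 1) * stdev(xs) * stdev(ys))

def r_formula_9_1(xs, ys):
    """r from the computational form (Formula 9-1)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / (
        sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    )

test_scores = [98, 105, 100, 100, 106, 95, 116, 112]
gpas = [2.1, 2.4, 3.2, 2.7, 2.2, 2.3, 3.8, 3.4]

r_dev = r_from_deviations(test_scores, gpas)
r_comp = r_formula_9_1(test_scores, gpas)
print(round(r_dev, 3), round(r_comp, 3))
```

Each product (x - x̄)(y - ȳ) is positive for points in Quadrants 1 and 3 and negative in Quadrants 2 and 4, so the sign of r reflects which pair of quadrants dominates.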