Chapter 12 Simple Linear Regression and Correlation Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.
12.1 The Simple Linear Regression Model
Linear Relationship: The simplest deterministic mathematical relationship between two variables x and y is a linear relationship $y = \beta_0 + \beta_1 x$. The set of pairs (x, y) for which $y = \beta_0 + \beta_1 x$ determines a straight line with slope $\beta_1$ and y-intercept $\beta_0$.
Terminology: The variable whose value is fixed by the experimenter, denoted x, is the independent (predictor, explanatory) variable. For a fixed x, the second variable is a random variable Y with observed value y, referred to as the dependent (response) variable.
The Simple Linear Regression Model: There exist parameters $\beta_0$, $\beta_1$, and $\sigma^2$ such that for any fixed value of x, the dependent variable is related to x through the model equation $Y = \beta_0 + \beta_1 x + \epsilon$. The quantity $\epsilon$ is a random variable (called the random deviation) with $E(\epsilon) = 0$ and $V(\epsilon) = \sigma^2$.
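As a rough illustration of the model equation, the sketch below draws one response for each fixed x from $Y = \beta_0 + \beta_1 x + \epsilon$ with normally distributed deviations. The parameter values, the x values, and the function name `simulate_responses` are assumptions chosen only for this example; they do not come from the slides.

```python
import random

def simulate_responses(xs, beta0, beta1, sigma, rng):
    """Draw one Y per fixed x from Y = beta0 + beta1*x + eps, eps ~ N(0, sigma^2)."""
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values, chosen only for illustration.
BETA0, BETA1, SIGMA = -1.0, 2.5, 0.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]

rng = random.Random(12)   # fixed seed so repeated runs give the same draws
ys = simulate_responses(xs, BETA0, BETA1, SIGMA, rng)

for x, y in zip(xs, ys):
    mean_y = BETA0 + BETA1 * x   # point on the true regression line
    print(f"x = {x:.1f}  E(Y) = {mean_y:.2f}  observed y = {y:.3f}")
```

Each observed y scatters around the true regression line; setting `sigma` to 0 would reproduce the deterministic linear relationship exactly.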
[Figure: Linear Regression Model, showing an observed point $(x_1, y_1)$ relative to the true regression line.]
[Figure: Distribution of $\epsilon$: normal, mean = 0, standard deviation $\sigma$.]
[Figure: Distribution of Y for different values of x, shown at $x_1$, $x_2$, $x_3$.]
12.2 Estimating Model Parameters
Principle of Least Squares: The vertical deviation of the point $(x_i, y_i)$ from the line $y = b_0 + b_1 x$ is $y_i - (b_0 + b_1 x_i)$. The sum of squared vertical deviations from the points to the line is $f(b_0, b_1) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2$.
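Principle of Least Squares: The least-squares (regression) line for the data is given by $y = \hat\beta_0 + \hat\beta_1 x$, where $\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum x_i y_i - (\sum x_i)(\sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$ and $\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$.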
Ex. Find the equation of the least-squares line for the data

          x     y     xy    x^2
          1     2      2      1
          2     3      6      4
          3     7     21      9
  Sum:    6    12     29     14

$\hat\beta_1 = \frac{29 - (6)(12)/3}{14 - 6^2/3} = 2.5$ and $\hat\beta_0 = \frac{12}{3} - 2.5 \cdot \frac{6}{3} = -1$, so the least-squares line is $y = -1 + 2.5x$.
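The arithmetic in this example can be checked with a short script. This is a minimal sketch using the summation form of the least-squares estimates on the points (1, 2), (2, 3), (3, 7); the function name `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    s_xy = sum_xy - sum_x * sum_y / n    # S_xy
    s_xx = sum_x2 - sum_x ** 2 / n       # S_xx
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope

xs = [1, 2, 3]
ys = [2, 3, 7]
b0, b1 = least_squares_line(xs, ys)
print(f"slope = {b1}, intercept = {b0}")
```

For this data the script reports a slope of 2.5 and an intercept of -1, matching the hand calculation.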
Fitted Values and Residuals: The fitted (predicted) values $\hat{y}_1, \ldots, \hat{y}_n$ are obtained by substituting $x_1, \ldots, x_n$ into the equation of the estimated regression line: $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i$. The residuals $y_i - \hat{y}_i$ are the vertical deviations from the estimated line.
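A small sketch of fitted values and residuals, assuming the three points (1, 2), (2, 3), (3, 7) and the least-squares estimates -1 and 2.5 obtained for them:

```python
xs = [1, 2, 3]
ys = [2, 3, 7]
b0, b1 = -1.0, 2.5   # least-squares estimates for this data set

# Fitted value: substitute each x into the estimated regression line.
fitted = [b0 + b1 * x for x in xs]

# Residual: vertical deviation of each observation from the estimated line.
residuals = [y - y_hat for y, y_hat in zip(ys, fitted)]

for x, y, y_hat, e in zip(xs, ys, fitted, residuals):
    print(f"x = {x}  y = {y}  fitted = {y_hat}  residual = {e}")
print("sum of residuals:", sum(residuals))
```

The residuals from a least-squares line with an intercept always sum to zero (up to rounding), which is a quick sanity check on the fit.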
Error Sum of Squares: The error sum of squares, denoted SSE, is $SSE = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (\hat\beta_0 + \hat\beta_1 x_i)]^2$, and the estimate of $\sigma^2$ is $\hat\sigma^2 = s^2 = \frac{SSE}{n - 2}$.
Computational Formula: A computational formula for SSE is $SSE = \sum y_i^2 - \hat\beta_0 \sum y_i - \hat\beta_1 \sum x_i y_i$.
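The two routes to SSE (sum of squared residuals, and the computational formula built from $\sum y_i^2$, $\sum y_i$, and $\sum x_i y_i$) should agree. The sketch below checks this on the points (1, 2), (2, 3), (3, 7) with estimates -1 and 2.5; the variable names are illustrative only.

```python
xs = [1, 2, 3]
ys = [2, 3, 7]
n = len(xs)
b0, b1 = -1.0, 2.5   # least-squares estimates for this data set

# Definition: sum of squared residuals.
sse_definition = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Computational formula: sum(y^2) - b0*sum(y) - b1*sum(x*y).
sse_shortcut = (sum(y * y for y in ys)
                - b0 * sum(ys)
                - b1 * sum(x * y for x, y in zip(xs, ys)))

s_squared = sse_definition / (n - 2)   # estimate of sigma^2, with n - 2 df

print("SSE (definition):", sse_definition)
print("SSE (shortcut):  ", sse_shortcut)
print("s^2:", s_squared)
```

The computational formula is sensitive to rounding in the estimates, so in hand calculations it is safer to carry extra digits in the intercept and slope.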
Total Sum of Squares: The total sum of squares, denoted SST, is $SST = S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - (\sum y_i)^2 / n$.
Coefficient of Determination: The coefficient of determination, denoted by $r^2$, is given by $r^2 = 1 - \frac{SSE}{SST}$. It is interpreted as the proportion of observed y variation that can be explained by the simple linear regression model.
Regression Sum of Squares: SSR = SST - SSE. The regression sum of squares is interpreted as the amount of variation that is explained by the model. We have $r^2 = 1 - \frac{SSE}{SST} = \frac{SST - SSE}{SST} = \frac{SSR}{SST}$.
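The decomposition SST = SSR + SSE and the resulting coefficient of determination can be sketched as follows, again assuming the points (1, 2), (2, 3), (3, 7) and the fitted line y = -1 + 2.5x:

```python
xs = [1, 2, 3]
ys = [2, 3, 7]
n = len(xs)
b0, b1 = -1.0, 2.5   # least-squares estimates for this data set

y_bar = sum(ys) / n
sst = sum((y - y_bar) ** 2 for y in ys)                          # total variation in y
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))      # unexplained variation
ssr = sst - sse                                                  # variation explained by the model

r_squared = 1 - sse / sst

print(f"SST = {sst}, SSE = {sse}, SSR = {ssr}")
print(f"r^2 = {r_squared:.4f}")
```

Here roughly 89% of the observed variation in y is attributed to the linear relationship with x.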
12.3 Inferences About the Slope Parameter $\beta_1$
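1. The mean of $\hat\beta_1$ is $E(\hat\beta_1) = \beta_1$.
2. The variance and standard deviation of $\hat\beta_1$ are $V(\hat\beta_1) = \frac{\sigma^2}{S_{xx}}$ and $\sigma_{\hat\beta_1} = \frac{\sigma}{\sqrt{S_{xx}}}$, where $S_{xx} = \sum (x_i - \bar{x})^2$. Replacing $\sigma$ by its estimate $s$ gives the estimated standard deviation $s_{\hat\beta_1} = \frac{s}{\sqrt{S_{xx}}}$.
3. The estimator $\hat\beta_1$ has a normal distribution.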
T Variable: The assumptions of the simple linear regression model imply that the standardized variable $T = \frac{\hat\beta_1 - \beta_1}{S / \sqrt{S_{xx}}} = \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}}$ has a t distribution with n - 2 df.
Confidence Interval: A $100(1 - \alpha)\%$ CI for the slope $\beta_1$ of the true regression line is $\hat\beta_1 \pm t_{\alpha/2,\, n-2} \cdot s_{\hat\beta_1}$, where $s_{\hat\beta_1} = s / \sqrt{S_{xx}}$.
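A minimal sketch of the slope interval, assuming the points (1, 2), (2, 3), (3, 7). With only n = 3 observations there is a single degree of freedom, so the critical value is the tabled $t_{0.025,\,1} = 12.706$, hard-coded here as an assumption rather than computed.

```python
import math

xs = [1, 2, 3]
ys = [2, 3, 7]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s = math.sqrt(sse / (n - 2))     # estimate of sigma
se_b1 = s / math.sqrt(s_xx)      # estimated standard deviation of the slope estimator

T_CRIT = 12.706                  # t_{0.025, 1} from a t table (assumed: 95% level, n - 2 = 1 df)
ci = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

print(f"slope = {b1}, standard error = {se_b1:.4f}")
print(f"95% CI for the slope: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval is very wide and includes 0, which reflects how little information three points carry about the slope.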
Hypothesis-Testing Procedures: Null hypothesis: $H_0: \beta_1 = \beta_{10}$. Test statistic value: $t = \frac{\hat\beta_1 - \beta_{10}}{s_{\hat\beta_1}}$.
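Hypothesis-Testing Procedures:

Alternative Hypothesis            Rejection Region for Approx. Level $\alpha$ Test
$H_a: \beta_1 > \beta_{10}$       $t \ge t_{\alpha,\, n-2}$
$H_a: \beta_1 < \beta_{10}$       $t \le -t_{\alpha,\, n-2}$
$H_a: \beta_1 \ne \beta_{10}$     either $t \ge t_{\alpha/2,\, n-2}$ or $t \le -t_{\alpha/2,\, n-2}$

A P-value based on n - 2 df can be calculated as in Chapters 8 and 9.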
Hypothesis-Testing: The model utility test is the test of $H_0: \beta_1 = 0$ versus $H_a: \beta_1 \ne 0$, in which case the test statistic value is the t ratio $t = \frac{\hat\beta_1}{s_{\hat\beta_1}}$.
ANOVA Table:

Source of Variation   df      Sum of Squares   Mean Square           f
Regression            1       SSR              SSR                   SSR / [SSE/(n - 2)]
Error                 n - 2   SSE              s^2 = SSE/(n - 2)
Total                 n - 1   SST
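The model utility t ratio and the ANOVA f ratio are two views of the same test: in simple linear regression, f equals the square of t. The sketch below illustrates this on the points (1, 2), (2, 3), (3, 7); the variable names are illustrative.

```python
import math

xs = [1, 2, 3]
ys = [2, 3, 7]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sst = sum((y - y_bar) ** 2 for y in ys)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
ssr = sst - sse

mse = sse / (n - 2)                    # mean square error, s^2
t_stat = b1 / math.sqrt(mse / s_xx)    # model utility t ratio for H0: beta_1 = 0
f_stat = (ssr / 1) / mse               # ANOVA f ratio: MSR / MSE

print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, f = {f_stat:.4f}")
```

Because the two statistics are equivalent, the f test with 1 and n - 2 df leads to the same conclusion as the two-tailed t test with n - 2 df.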
12.4 Inferences Concerning $\mu_{Y \cdot x^*}$ and the Prediction of Future Y Values
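Let $\hat{Y} = \hat\beta_0 + \hat\beta_1 x^*$, where $x^*$ is some fixed value of x.
1. The mean of $\hat{Y}$ is $E(\hat{Y}) = E(\hat\beta_0 + \hat\beta_1 x^*) = \beta_0 + \beta_1 x^*$.
2. Variance and standard deviation: $V(\hat{Y}) = \sigma^2 \left[ \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}} \right]$ and $\sigma_{\hat{Y}} = \sigma \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$, where $S_{xx} = \sum (x_i - \bar{x})^2$. The estimated standard deviation of $\hat\beta_0 + \hat\beta_1 x^*$ is $s_{\hat{Y}} = s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$.
3. $\hat{Y}$ has a normal distribution.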
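T Variable: The variable $T = \frac{\hat\beta_0 + \hat\beta_1 x^* - (\beta_0 + \beta_1 x^*)}{S_{\hat\beta_0 + \hat\beta_1 x^*}} = \frac{\hat{Y} - (\beta_0 + \beta_1 x^*)}{S_{\hat{Y}}}$, where $S_{\hat{Y}}$ is the estimated standard deviation of $\hat{Y}$, has a t distribution with n - 2 df.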
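Confidence Interval: A $100(1 - \alpha)\%$ CI for $\mu_{Y \cdot x^*}$, the expected value of Y when $x = x^*$, is $\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\, n-2} \cdot s_{\hat\beta_0 + \hat\beta_1 x^*} = \hat{y} \pm t_{\alpha/2,\, n-2} \cdot s_{\hat{Y}}$, where $s_{\hat{Y}}$ is the estimated standard deviation of $\hat\beta_0 + \hat\beta_1 x^*$.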
Prediction Interval: A future value of Y is not a parameter but instead a random variable; its interval of plausible values is referred to as a prediction interval.
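Prediction Interval: A $100(1 - \alpha)\%$ PI for a future Y observation to be made when $x = x^*$ is $\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\, n-2} \cdot s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = \hat{y} \pm t_{\alpha/2,\, n-2} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$.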
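To contrast the confidence interval for the mean response with the prediction interval for a single future observation, the sketch below evaluates both at a hypothetical $x^* = 2.5$ for the points (1, 2), (2, 3), (3, 7). The value of $x^*$ and the hard-coded critical value $t_{0.025,\,1} = 12.706$ (from a t table, 1 df) are assumptions made for the example.

```python
import math

xs = [1, 2, 3]
ys = [2, 3, 7]
n = len(xs)
X_STAR = 2.5      # hypothetical x value at which to estimate and predict
T_CRIT = 12.706   # t_{0.025, 1} from a t table (assumed: 95% level, n - 2 = 1 df)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
s2 = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)

y_hat = b0 + b1 * X_STAR
# Estimated standard deviation of the estimated mean response at x*.
s_yhat = math.sqrt(s2 * (1 / n + (X_STAR - x_bar) ** 2 / s_xx))
# Standard error for predicting one new observation: adds the variance of a single Y.
s_pred = math.sqrt(s2 + s_yhat ** 2)

ci = (y_hat - T_CRIT * s_yhat, y_hat + T_CRIT * s_yhat)
pi = (y_hat - T_CRIT * s_pred, y_hat + T_CRIT * s_pred)

print(f"y_hat = {y_hat}")
print(f"95% CI for the mean response: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI for a future Y:        ({pi[0]:.2f}, {pi[1]:.2f})")
```

The prediction interval is always wider than the confidence interval at the same $x^*$, because it accounts for the random deviation of the new observation in addition to the uncertainty in the estimated line.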
12.5 Correlation
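Sample Correlation Coefficient: The sample correlation coefficient, denoted r, of n pairs $(x_1, y_1), \ldots, (x_n, y_n)$ is $r = \frac{S_{xy}}{\sqrt{S_{xx}} \sqrt{S_{yy}}}$, where $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$, $S_{xx} = \sum (x_i - \bar{x})^2$, and $S_{yy} = \sum (y_i - \bar{y})^2$.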
Ex. Find the correlation coefficient for the points (1, 2), (2, 3), (3, 7) used in the least-squares example. With $S_{xy} = 5$, $S_{xx} = 2$, and $S_{yy} = 14$, $r = \frac{5}{\sqrt{2} \sqrt{14}} = 0.9449$.
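The value 0.9449 can be reproduced from the definition of r. This sketch assumes the points (1, 2), (2, 3), (3, 7) and also checks that swapping the roles of x and y leaves r unchanged; the function name `sample_correlation` is hypothetical.

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation coefficient r = S_xy / (sqrt(S_xx) * sqrt(S_yy))."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

xs = [1, 2, 3]
ys = [2, 3, 7]
r = sample_correlation(xs, ys)
r_swapped = sample_correlation(ys, xs)   # labels of x and y exchanged

print(f"r = {r:.4f}, r with x and y swapped = {r_swapped:.4f}, r^2 = {r * r:.4f}")
```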
Properties of r: Important properties of r
1. The value of r does not depend on which of the two variables under study is labeled x and which is labeled y.
2. The value of r is independent of the units in which x and y are measured.
3. $-1 \le r \le 1$.
Properties of r (continued)
4. r = 1 iff all $(x_i, y_i)$ pairs lie on a straight line with positive slope, and r = -1 iff all $(x_i, y_i)$ pairs lie on a straight line with negative slope.
5. The square of the sample correlation coefficient gives the value of the coefficient of determination that would result from fitting the simple linear regression model.
[Figure: Different values of r. Panels: r near 1; r near 0, no relationship; r near -1; r near 0, nonlinear relationship.]
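The Population Correlation Coefficient: $\rho = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$, where $\mathrm{Cov}(X, Y) = \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y)$ or $\mathrm{Cov}(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy$, depending on whether (X, Y) is discrete or continuous.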
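Estimator: The sample correlation coefficient r is a point estimate of $\rho$, and the corresponding estimator is $\hat\rho = R = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2} \sqrt{\sum (Y_i - \bar{Y})^2}}$.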
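Assumption: The joint probability distribution of (X, Y) is specified by $f(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x - \mu_1}{\sigma_1}\right)^2 - 2\rho \frac{(x - \mu_1)(y - \mu_2)}{\sigma_1 \sigma_2} + \left(\frac{y - \mu_2}{\sigma_2}\right)^2 \right] \right\}$ for $-\infty < x < \infty$, $-\infty < y < \infty$, which is called the bivariate normal probability distribution.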
Testing for the Absence of Correlation: When $H_0: \rho = 0$ is true, the test statistic $T = \frac{R \sqrt{n - 2}}{\sqrt{1 - R^2}}$ has a t distribution with n - 2 df.
Hypothesis-Testing:

Alternative Hypothesis   Rejection Region for Approx. Level $\alpha$ Test
$H_a: \rho > 0$          $t \ge t_{\alpha,\, n-2}$
$H_a: \rho < 0$          $t \le -t_{\alpha,\, n-2}$
$H_a: \rho \ne 0$        either $t \ge t_{\alpha/2,\, n-2}$ or $t \le -t_{\alpha/2,\, n-2}$

A P-value based on n - 2 df can be calculated as described previously.
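The statistic for testing $\rho = 0$ can be computed directly from r and n. The sketch below does so for the points (1, 2), (2, 3), (3, 7) and, as a cross-check, compares it with the model utility t ratio (estimated slope divided by its estimated standard deviation); the two are algebraically the same quantity. The function name `correlation_t_statistic` is hypothetical.

```python
import math

def correlation_t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), for testing H0: rho = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

xs = [1, 2, 3]
ys = [2, 3, 7]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

r = s_xy / math.sqrt(s_xx * s_yy)
t_corr = correlation_t_statistic(r, n)

# Cross-check against the slope-based t ratio from the regression fit.
b1 = s_xy / s_xx
sse = s_yy - b1 * s_xy                      # SSE = S_yy - b1 * S_xy
t_slope = b1 / math.sqrt(sse / (n - 2) / s_xx)

print(f"r = {r:.4f}, t from r = {t_corr:.4f}, t from slope = {t_slope:.4f}")
```

The resulting t value would then be compared with the critical value for n - 2 df from the rejection-region table.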
Other Inferences Concerning $\rho$: When $(X_1, Y_1), \ldots, (X_n, Y_n)$ is a sample from a bivariate normal distribution, the rv $V = \frac{1}{2} \ln\left(\frac{1 + R}{1 - R}\right)$ has approximately a normal distribution with mean $\mu_V = \frac{1}{2} \ln\left(\frac{1 + \rho}{1 - \rho}\right)$ and variance $\sigma_V^2 = \frac{1}{n - 3}$.
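The test statistic for testing $H_0: \rho = \rho_0$ is $Z = \frac{V - \frac{1}{2} \ln[(1 + \rho_0)/(1 - \rho_0)]}{1 / \sqrt{n - 3}}$, where $V = \frac{1}{2} \ln[(1 + R)/(1 - R)]$.

Alternative Hypothesis     Rejection Region for Level $\alpha$ Test
$H_a: \rho > \rho_0$       $z \ge z_\alpha$
$H_a: \rho < \rho_0$       $z \le -z_\alpha$
$H_a: \rho \ne \rho_0$     either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$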
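CI for $\rho$: A $100(1 - \alpha)\%$ CI for $\rho$ is $\left( \frac{e^{2c_1} - 1}{e^{2c_1} + 1},\ \frac{e^{2c_2} - 1}{e^{2c_2} + 1} \right)$, where $c_1$ and $c_2$ are the left and right endpoints, respectively, of the CI for $\mu_V$: $\left( v - \frac{z_{\alpha/2}}{\sqrt{n - 3}},\ v + \frac{z_{\alpha/2}}{\sqrt{n - 3}} \right)$ with $v = \frac{1}{2} \ln\left(\frac{1 + r}{1 - r}\right)$.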
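The transformation-based inferences for $\rho$ can be sketched as follows. The sample values r = 0.6 and n = 28 and the null value $\rho_0 = 0.5$ are hypothetical, chosen only to make the arithmetic easy to follow (the three-point example cannot be used here because n - 3 would be 0). The normal critical value comes from Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def fisher_v(r):
    """Transformation v = (1/2) * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def rho_test_statistic(r, n, rho0):
    """z statistic for H0: rho = rho0, using the approximate normality of V."""
    return (fisher_v(r) - fisher_v(rho0)) * math.sqrt(n - 3)

def rho_confidence_interval(r, n, conf_level=0.95):
    """Approximate CI for rho: build the CI for mu_V, then transform the endpoints back."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    c1 = fisher_v(r) - half_width
    c2 = fisher_v(r) + half_width

    def back_transform(c):
        return (math.exp(2 * c) - 1) / (math.exp(2 * c) + 1)

    return back_transform(c1), back_transform(c2)

# Hypothetical sample summary, for illustration only.
R, N, RHO0 = 0.6, 28, 0.5

z_stat = rho_test_statistic(R, N, RHO0)
lo, hi = rho_confidence_interval(R, N)
print(f"z = {z_stat:.4f}")
print(f"approximate 95% CI for rho: ({lo:.4f}, {hi:.4f})")
```

The back-transformed interval is not symmetric about r, which is expected: the transformation stretches values near -1 and 1 so that the interval always stays inside (-1, 1).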