Definition formulae The total sum of squares denoted
- Slides: 12
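Definition formulae: The total sum of squares, denoted by SSTo, is defined as $\mathrm{SSTo} = \sum (y - \bar{y})^2$. The residual sum of squares, denoted by SSResid, is defined as $\mathrm{SSResid} = \sum (y - \hat{y})^2$, where $\hat{y} = a + bx$ is the value predicted by the least squares line.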
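Calculation formulae: Recall that SSTo and SSResid are generally found as part of the standard output of most statistical packages. They can also be obtained using the following computational formulas: $\mathrm{SSTo} = \sum y^2 - \frac{(\sum y)^2}{n}$ and $\mathrm{SSResid} = \sum y^2 - a \sum y - b \sum xy$, where a and b are the intercept and slope of the least squares line $\hat{y} = a + bx$.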
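As a rough check that the definition formulas and the computational formulas agree, the sketch below evaluates both on a small data set. The data values and all function names are made up for illustration and do not come from the slides.

```python
def least_squares_line(x, y):
    """Return (a, b), the intercept and slope of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sums_of_squares_by_definition(x, y):
    """SSTo and SSResid from the definition formulas."""
    a, b = least_squares_line(x, y)
    y_bar = sum(y) / len(y)
    ss_to = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return ss_to, ss_resid


def sums_of_squares_computational(x, y):
    """SSTo and SSResid from the computational formulas."""
    a, b = least_squares_line(x, y)
    n = len(y)
    sum_y = sum(y)
    sum_y2 = sum(yi ** 2 for yi in y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    ss_to = sum_y2 - sum_y ** 2 / n
    ss_resid = sum_y2 - a * sum_y - b * sum_xy
    return ss_to, ss_resid


# Hypothetical data, chosen only so the arithmetic is easy to follow by hand.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ss_to_def, ss_resid_def = sums_of_squares_by_definition(x, y)
ss_to_comp, ss_resid_comp = sums_of_squares_computational(x, y)
print(round(ss_to_def, 6), round(ss_resid_def, 6))    # 6.0 2.4
print(round(ss_to_comp, 6), round(ss_resid_comp, 6))  # 6.0 2.4
```

The computational formula for SSResid subtracts quantities of similar size, so it can be sensitive to rounding in a and b; when working by hand it is safer to carry extra digits.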
Coefficient of Determination: The coefficient of determination, denoted by $r^2$, gives the proportion of variation in y that can be attributed to an approximate linear relationship between x and y. It is computed as $r^2 = 1 - \frac{\mathrm{SSResid}}{\mathrm{SSTo}}$.
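A minimal sketch of the calculation, assuming the standard relationship between the coefficient of determination and the two sums of squares. The function name and the input values (SSResid = 2.4, SSTo = 6.0) are hypothetical and are not taken from the slides' example.

```python
def coefficient_of_determination(ss_resid, ss_to):
    """Proportion of variation in y attributed to the linear relationship."""
    if ss_to <= 0:
        raise ValueError("SSTo must be positive (y values must not all be equal)")
    return 1 - ss_resid / ss_to


# Hypothetical sums of squares, for illustration only.
r_squared = coefficient_of_determination(ss_resid=2.4, ss_to=6.0)
print(round(r_squared, 3))  # 0.6
```

With these made-up values, about 60% of the observed variation in y would be attributed to the approximate linear relationship with x.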
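Estimated Standard Deviation, $s_e$: The statistic for estimating the variance $\sigma^2$ is $s_e^2 = \frac{\mathrm{SSResid}}{n - 2}$, where $\mathrm{SSResid} = \sum (y - \hat{y})^2$.
Estimated Standard Deviation, $s_e$: The estimate of $\sigma$ is the estimated standard deviation $s_e = \sqrt{s_e^2}$. The number of degrees of freedom associated with estimating $\sigma^2$ or $\sigma$ in simple linear regression is $n - 2$.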
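The estimate can be sketched in a few lines, dividing the residual sum of squares by the n - 2 degrees of freedom and taking the square root. The function name and the numbers used (SSResid = 2.4 from n = 5 observations) are illustrative assumptions only.

```python
import math


def estimated_standard_deviation(ss_resid, n):
    """Return s_e, the estimate of sigma, using n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need more than 2 observations to estimate sigma")
    s_e_squared = ss_resid / (n - 2)  # estimate of the variance sigma^2
    return math.sqrt(s_e_squared)


# Hypothetical values, for illustration only.
s_e = estimated_standard_deviation(ss_resid=2.4, n=5)
print(round(s_e, 3))  # 0.894
```

The result is read as the magnitude of a typical deviation of an observation from the least squares line, in the units of y.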
Example continued: With $r^2 = 0.627$, or 62.7%, we can say that 62.7% of the observed variation in %Fat can be attributed to the probabilistic linear relationship with human age. The magnitude of a typical sample deviation from the least squares line is about 5.75 (%), which is reasonably large compared to the y values themselves. This suggests that the model is useful only in the sense of providing gross ballpark estimates of %Fat for humans based on age.
Properties of the Sampling Distribution of b: When the four basic assumptions of the simple linear regression model are satisfied, the following conditions are met:
1. The mean value of b is $\beta$. Specifically, $\mu_b = \beta$, and hence b is an unbiased statistic for estimating $\beta$.
2. The standard deviation of the statistic b is $\sigma_b = \frac{\sigma}{\sqrt{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$.
3. The statistic b has a normal distribution (a consequence of the error e being normally distributed).
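The first two properties can be illustrated with a small simulation: draw many samples from a population regression model with known slope, compute b for each, and compare the mean and standard deviation of the simulated slopes with $\beta$ and $\sigma / \sqrt{S_{xx}}$. This is only a sketch; the population values (intercept `alpha`, slope `beta`, `sigma`), the x values, and the function names are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def least_squares_slope(x, y):
    """Return b, the slope of the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slopes(x, alpha, beta, sigma, n_samples, seed=0):
    """Simulate y = alpha + beta*x + e with normal e, returning each sample's b."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        y = [alpha + beta * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(least_squares_slope(x, y))
    return slopes


# Hypothetical population model and fixed x values.
x = [1, 2, 3, 4, 5]
beta, sigma = 0.6, 1.0
slopes = simulate_slopes(x, alpha=2.2, beta=beta, sigma=sigma, n_samples=5000)

x_bar = sum(x) / len(x)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
theoretical_sd = sigma / math.sqrt(s_xx)

print("mean of simulated b:", statistics.mean(slopes), "vs beta =", beta)
print("sd of simulated b:", statistics.stdev(slopes), "vs", theoretical_sd)
```

The simulated mean and standard deviation should land close to the theoretical values, with the gap shrinking as `n_samples` grows.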
Estimated Standard Deviation of b: The estimated standard deviation of the statistic b is $s_b = \frac{s_e}{\sqrt{S_{xx}}}$, where $S_{xx} = \sum (x - \bar{x})^2$. When the four basic assumptions of the simple linear regression model are satisfied, the probability distribution of the standardized variable $t = \frac{b - \beta}{s_b}$ is the t distribution with df = n - 2.
Confidence interval for $\beta$: When the four basic assumptions of the simple linear regression model are satisfied, a confidence interval for $\beta$, the slope of the population regression line, has the form $b \pm (t \text{ critical value}) \cdot s_b$, where the t critical value is based on df = n - 2.
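Putting the pieces together, the interval can be sketched as follows. The t critical value is passed in as an argument, as if read from a t table, rather than computed. The data and the function name are hypothetical; 3.182 is the tabled two-sided 95% critical value for df = 3.

```python
import math


def slope_confidence_interval(x, y, t_critical):
    """Return (lower, upper) for the interval b +/- (t critical value) * s_b."""
    n = len(x)
    if n <= 2:
        raise ValueError("need more than 2 observations (df = n - 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx                      # slope of the least squares line
    a = y_bar - b * x_bar                # intercept of the least squares line
    ss_resid = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(ss_resid / (n - 2))  # estimated standard deviation
    s_b = s_e / math.sqrt(s_xx)          # estimated standard deviation of b
    margin = t_critical * s_b
    return b - margin, b + margin


# Hypothetical data with n = 5, so df = 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
lower, upper = slope_confidence_interval(x, y, t_critical=3.182)
print(round(lower, 2), round(upper, 2))  # -0.3 1.5
```

For this made-up sample the interval includes 0, so the data would not rule out a population slope of zero at the 95% confidence level.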