PREDICT 422: Practical Machine Learning, Module 2: Ordinary Least Squares Linear Regression
- Slides: 50
![PREDICT 422: Practical Machine Learning, Module 2: Ordinary Least Squares Linear Regression](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-1.jpg)
PREDICT 422: Practical Machine Learning Module 2: Ordinary Least Squares Linear Regression Lecturer: Nathan Bastian, Section: XXX
![Assignment](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-2.jpg)
Assignment
§ Reading: Ch. 3
§ Activity: Quiz 2, R Lab 2
![References](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-3.jpg)
References
§ An Introduction to Statistical Learning, with Applications in R (2013), by G. James, D. Witten, T. Hastie, and R. Tibshirani.
§ The Elements of Statistical Learning (2009), by T. Hastie, R. Tibshirani, and J. Friedman.
§ Learning from Data: A Short Course (2012), by Y. Abu-Mostafa, M. Magdon-Ismail, and H. Lin.
§ Machine Learning: A Probabilistic Perspective (2012), by K. Murphy.
![Lesson Goals](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-4.jpg)
Lesson Goals:
§ Understand the basic concepts of expectation, variance, and parameter estimation.
§ Understand the basic concepts of statistical decision theory.
§ Understand simple and multiple linear regression as a supervised learning algorithm.
§ Understand ordinary least squares estimation for linear regression models.
§ Understand the basic concepts of k-nearest neighbors regression.
![Overview: Linear Regression](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-5.jpg)
Overview: Linear Regression
§ Linear regression is a simple approach to supervised learning: it assumes that the dependence of Y on X1, X2, …, Xp is linear.
§ Most modern machine learning approaches can be seen as generalizations or extensions of linear regression.
§ When augmented with kernels or other forms of basis function expansion (which replace X with some non-linear function of the inputs), it can also model non-linear relationships.
§ Goal: predict Y from X by f(X).
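As a concrete illustration of predicting Y from X by a linear f(X), here is a minimal Python/NumPy sketch (the course labs use R; the coefficients and data below are simulated purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: Y depends linearly on two predictors (coefficients assumed).
n = 100
X = rng.normal(size=(n, 2))
beta = np.array([2.0, 3.0, -1.0])  # [intercept, slope for X1, slope for X2]
y = beta[0] + X @ beta[1:] + rng.normal(scale=0.5, size=n)

def f(X, beta):
    """The linear predictor f(X) = beta0 + beta1*X1 + ... + betap*Xp."""
    return beta[0] + X @ beta[1:]
```

With the true coefficients in hand, `f` reproduces the systematic part of Y; the rest of the module is about estimating those coefficients from data.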
![Review: Expectation](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-6.jpg)
Review: Expectation
![Review: Expectation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-7.jpg)
Review: Expectation (cont.)
![Review: Variance](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-8.jpg)
Review: Variance
![Review: Frequentist Basics](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-9.jpg)
Review: Frequentist Basics
![Review: Parameter Estimation](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-10.jpg)
Review: Parameter Estimation
§ In practice, we often seek to select a distribution (model) corresponding to our data.
§ If the model is parameterized by some set of values, then this problem is that of parameter estimation.
§ In general, we typically use maximum likelihood estimation (MLE) to obtain parameter estimates.
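For example, under a normal model the MLE has a closed form: the sample mean and the uncorrected sample variance. A Python/NumPy sketch with simulated data (the true parameters 5 and 2 are invented for the demo):

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(loc=5.0, scale=2.0, size=10_000)

# For a normal model the maximum likelihood estimates are closed-form:
mu_hat = data.mean()                      # MLE of the mean
var_hat = ((data - mu_hat) ** 2).mean()   # MLE of the variance (divides by n, not n-1)
```

With 10,000 draws, both estimates land very close to the true values (mean 5, variance 4).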
![Review: Parameter Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-11.jpg)
Review: Parameter Estimation (cont.)
![Review: Parameter Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-12.jpg)
Review: Parameter Estimation (cont.)
![Statistical Decision Theory](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-13.jpg)
Statistical Decision Theory
![Statistical Decision Theory (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-14.jpg)
Statistical Decision Theory (cont.)
![Statistical Decision Theory (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-15.jpg)
Statistical Decision Theory (cont.)
![Linear Regression Model](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-16.jpg)
Linear Regression Model
![Linear Regression Model (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-17.jpg)
Linear Regression Model (cont.)
![Ordinary Least Squares Estimation](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-18.jpg)
Ordinary Least Squares Estimation
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-19.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-20.jpg)
OLS Estimation (cont.)
§ We illustrate the geometry of OLS fitting, where we seek the linear function of X that minimizes the sum of squared residuals from Y.
§ The predictor function corresponds to a plane (hyperplane) in the 3-D space.
§ For accurate prediction, we hope the data lie close to this hyperplane, but they won't lie exactly in it.
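The OLS fit itself can be computed by solving the least squares problem directly. A Python/NumPy sketch on simulated data (coefficients chosen arbitrarily for illustration); note that the residuals come out orthogonal to the columns of the design matrix, which is exactly the projection geometry described above:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Prepend a column of ones for the intercept, then minimize ||y - X beta||^2.
# lstsq is the numerically stable way to solve the normal equations X'X b = X'y.
Xd = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)

residuals = y - Xd @ beta_hat  # orthogonal to every column of Xd
```

`beta_hat` recovers the generating coefficients (1, 2, −3) up to sampling noise.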
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-21.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-22.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-23.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-24.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-25.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-26.jpg)
OLS Estimation (cont.)
![OLS Estimation (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-27.jpg)
OLS Estimation (cont.)
![Population vs. OLS Lines](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-28.jpg)
Population vs. OLS Lines
![Accuracy of Coefficient Estimates](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-29.jpg)
Accuracy of Coefficient Estimates
![Accuracy of the Model: RSE](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-30.jpg)
Accuracy of the Model: RSE
![Accuracy of the Model: R²](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-31.jpg)
Accuracy of the Model: R²
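The two accuracy measures on these slides can be computed directly from the residuals: for simple regression, RSE = sqrt(RSS / (n − 2)), and R² = 1 − RSS/TSS. A Python/NumPy sketch with simulated data (true line and noise level invented for the demo):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.uniform(0, 10, size=n)
y = 4.0 + 1.5 * x + rng.normal(scale=1.0, size=n)  # noise sd = 1

# Simple linear regression by least squares (closed form).
b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

rss = np.sum((y - y_hat) ** 2)
tss = np.sum((y - y.mean()) ** 2)

rse = np.sqrt(rss / (n - 2))  # residual standard error, estimates the noise sd
r2 = 1 - rss / tss            # fraction of variance explained
```

Here the RSE estimates the noise standard deviation (1.0), and R² is close to 1 because the signal dominates the noise.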
![Two Key Questions](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-32.jpg)
Two Key Questions
![](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-33.jpg)
![Are all regression coefficients 0?](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-34.jpg)
Are all regression coefficients 0?
![Deciding on Important Variables](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-35.jpg)
Deciding on Important Variables
§ Best Subset Selection: we compute the OLS fit for all possible subsets of predictors and then choose between them based on some criterion that balances training error with model size.
§ There are 2^p possible models, so we can't examine them all.
§ Instead, we use an automated approach that searches through a subset of all the models.
– Forward Selection
– Backward Selection
![Overview: Forward Selection](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-36.jpg)
Overview: Forward Selection
§ We begin with the null model: a model containing an intercept but no predictors.
§ We fit p simple linear regressions and add to the null model the variable resulting in the lowest RSS.
§ We then add to that model the variable that results in the lowest RSS among all two-variable models.
§ The algorithm continues until some stopping rule is satisfied (e.g., all remaining variables have a p-value greater than some threshold).
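A minimal sketch of the greedy forward-selection loop, using RSS as the selection criterion (Python/NumPy; the data are simulated so that only two of five candidate predictors actually matter):

```python
import numpy as np

def rss_of_fit(X, y):
    """RSS of an OLS fit of y on X (X already includes an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def forward_select(X, y, max_vars):
    """Greedy forward selection: at each step add the predictor whose
    inclusion yields the lowest RSS. Returns chosen column indices in order."""
    n, p = X.shape
    chosen, remaining = [], list(range(p))
    ones = np.ones((n, 1))
    while remaining and len(chosen) < max_vars:
        best_j, best_rss = None, np.inf
        for j in remaining:
            design = np.column_stack([ones, X[:, chosen + [j]]])
            r = rss_of_fit(design, y)
            if r < best_rss:
                best_j, best_rss = j, r
        chosen.append(best_j)
        remaining.remove(best_j)
    return chosen

# Demo: y truly depends only on columns 1 and 3 of five candidates.
rng = np.random.default_rng(4)
X = rng.normal(size=(300, 5))
y = 5.0 * X[:, 1] - 4.0 * X[:, 3] + rng.normal(scale=0.1, size=300)
chosen = forward_select(X, y, max_vars=2)
```

In practice the stopping rule would use p-values or a penalized criterion rather than a fixed `max_vars`; this sketch only shows the greedy search itself.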
![Overview: Backward Selection](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-37.jpg)
Overview: Backward Selection
§ We begin with all variables in the model.
§ We remove the variable with the largest p-value (i.e., the least statistically significant).
§ The new (p – 1)-variable model is fit, and the variable with the largest p-value is removed.
§ The algorithm continues until a stopping rule is reached.
![Qualitative Predictors](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-38.jpg)
Qualitative Predictors
§ Some predictors are not quantitative but qualitative, taking a discrete set of values.
§ These are known as categorical variables, which we can code as indicator variables (dummy variables).
§ Examples: gender, student status, marital status, ethnicity.
![Qualitative Predictors (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-39.jpg)
Qualitative Predictors (cont.)
§ When a qualitative predictor has more than two levels, a single dummy variable cannot represent all possible values.
§ Thus, there will always be one fewer dummy variable than the number of levels in the factor.
– Factor = Ethnicity
– Levels = Asian, Caucasian, African American
– # of Dummy Variables = 3 – 1 = 2
§ The level with no dummy variable is the baseline.
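The dummy coding described above can be sketched as follows (Python/NumPy; the factor values and the choice of baseline are made up for illustration):

```python
import numpy as np

# Hypothetical 3-level factor; "African American" is taken as the baseline.
ethnicity = np.array(["Asian", "Caucasian", "African American",
                      "Asian", "African American"])
levels = ["African American", "Asian", "Caucasian"]  # baseline listed first

# k levels -> k - 1 dummy columns; the baseline level gets all zeros.
dummies = np.column_stack(
    [(ethnicity == lvl).astype(float) for lvl in levels[1:]]
)
```

Each observation gets a row with at most one 1; a baseline observation is the all-zero row, so its effect is absorbed into the intercept.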
![Qualitative Predictors (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-40.jpg)
Qualitative Predictors (cont.)
![Qualitative Predictors (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-41.jpg)
Qualitative Predictors (cont.)
![Extensions of the Linear Model](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-42.jpg)
Extensions of the Linear Model
§ Allow for interaction effects. Note that if an interaction is included in the model, all of its main effects should be included as well (even if not statistically significant).
§ Accommodate non-linear relationships using polynomial regression. For example, you can include transformed versions of the predictors in the model.
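For instance, an interaction model simply adds the product of two predictors as an extra column of the design matrix, alongside both main effects (Python/NumPy sketch on simulated data; the coefficients 1, 2, 3, 4 are invented for the demo):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 4.0 * x1 * x2 + rng.normal(scale=0.1, size=n)

# Interaction model: intercept, both main effects, and their product.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

OLS recovers all four coefficients; a polynomial term such as x1**2 would be added as another column in exactly the same way.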
![Potential Problems](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-43.jpg)
Potential Problems
There are several potential problems that may occur when fitting a linear regression model:
1. Non-linearity of the response-predictor relationships.
2. Correlation of residuals.
3. Non-normality and non-constant variance of the residuals.
4. Outliers (refer to Section 3.3.3 in text).
5. High-leverage points (refer to Section 3.3.3 in text).
6. Collinearity (refer to Section 3.3.3 in text).
![Non-linearity of the Data](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-44.jpg)
Non-linearity of the Data
§ The linear regression model assumes that there is a straight-line relationship between the predictors and the response.
§ If the true relationship is non-linear, then conclusions are suspect.
§ Examine the residual plots: strong patterns (e.g., a U-shape) in the residuals indicate non-linearity in the data.
§ If there are non-linear associations in the data, then use non-linear transformations of the predictors (e.g., log X).
![Correlation of Residuals](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-45.jpg)
Correlation of Residuals
§ An important assumption of the linear regression model is that the residuals are uncorrelated.
§ If there is correlation among the residuals (test with the Durbin-Watson test), then the estimated standard errors will tend to underestimate the true standard errors; this makes the CIs and PIs narrower than they should be.
§ These correlations frequently occur in the context of time series data, so consider employing time series analysis methods (such as ARIMA).
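The Durbin-Watson statistic is easy to compute by hand: it is the sum of squared successive residual differences divided by the RSS, and is near 2 when the residuals are uncorrelated. A Python/NumPy sketch comparing white-noise residuals with strongly autocorrelated ones (the AR coefficient 0.9 is chosen for the demo):

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic: ~2 means no first-order autocorrelation;
    values toward 0 indicate positive autocorrelation."""
    d = np.diff(resid)
    return float((d @ d) / (resid @ resid))

rng = np.random.default_rng(6)
white = rng.normal(size=5000)  # uncorrelated "residuals"

ar = np.empty(5000)            # strongly autocorrelated "residuals"
ar[0] = white[0]
for t in range(1, 5000):
    ar[t] = 0.9 * ar[t - 1] + white[t]
```

For an AR(1) series, DW is roughly 2(1 − ρ), so the autocorrelated series scores far below 2.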
![Non-normality and Non-constant Variance of Residuals](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-46.jpg)
Non-normality and Non-constant Variance of Residuals
§ Another important assumption of the linear regression model is that the residuals are normally distributed and have constant variance across all levels of X.
§ If the residuals are not normally distributed (Anderson-Darling test), you can perform a Box-Cox transformation on the response Y.
§ If there is heteroscedasticity (Breusch-Pagan, Modified Levene, or Special White's tests), then you can consider transforming the response Y. If this doesn't fix the problem, consider computing robust standard errors or conducting weighted least squares regression.
![KNN Regression](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-47.jpg)
KNN Regression
§ K-nearest neighbors (KNN) regression is a non-parametric, flexible approach for performing regression.
§ It is closely related to the KNN classifier discussed last week in Module 1.
§ To predict Y for a given value of X, consider the k closest points to X in the training data and take the average of their responses.
§ If k is small, then KNN is more flexible than linear regression; it will have low bias but high variance.
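The KNN prediction rule above is only a few lines of code (Python/NumPy sketch; the one-dimensional toy data are made up for illustration):

```python
import numpy as np

def knn_predict(x_train, y_train, x0, k):
    """Average the responses of the k training points nearest to x0."""
    nearest = np.argsort(np.abs(x_train - x0))[:k]
    return float(y_train[nearest].mean())

# Toy 1-D training set
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([0.0, 10.0, 20.0, 30.0, 40.0])

pred = knn_predict(x_train, y_train, x0=2.1, k=3)
```

With k = 3 the three nearest points to 2.1 are x = 2, 3, 1, so the prediction is their average response, 20. Larger k smooths more (higher bias, lower variance); k = 1 interpolates the training data.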
![KNN Regression (cont.)](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-48.jpg)
KNN Regression (cont.)
§ The parametric approach (e.g., linear regression) will outperform the non-parametric approach (e.g., KNN regression) if the parametric form that has been selected is close to the true form of f.
§ Here, linear regression achieves a lower test MSE than KNN regression does, since f(X) is in fact linear.
![Generalization of the Linear Model](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-49.jpg)
Generalization of the Linear Model
Now that we have reviewed linear regression models, we will begin to expand their scope to include:
§ Classification problems: logistic regression, support vector machines.
§ Non-linearity: kernel smoothing, splines and generalized additive models; nearest neighbor methods.
§ Interactions: tree-based methods, bagging, boosting, random forests.
§ Regularization: ridge regression and the lasso.
![Summary](https://slidetodoc.com/presentation_image_h/a956edc54693e925cc0d14ad73e841a8/image-50.jpg)
Summary
§ Review of expectation, variance, and parameter estimation.
§ Basic concepts of statistical decision theory.
§ Simple and multiple linear regression as a supervised learning algorithm.
§ Ordinary least squares estimation for linear regression models.
§ Basic concepts of k-nearest neighbors regression.