Chapter 11 Validation of Regression Models
Linear Regression Analysis 5E, Montgomery, Peck & Vining
11.1 Introduction
• A regression equation is not always used for the purpose it was created for.
• Model adequacy checking: residual analysis, lack-of-fit testing, and detection of influential observations. It checks the fit of the model to the available data.
• Model validation: determining whether the model will behave or function as intended in its operating environment.
11.2 Validation Techniques
1. Analysis of model coefficients and predicted values
• Check for "inappropriate" signs on the coefficients
• Check for unusual magnitudes of the coefficients
• Check for stability of the coefficient estimates
• Check the predicted values (do they make sense for the nature of the data?)
2. Collection of new data
• Usually 15-20 new observations are adequate
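The sign and stability checks in item 1 can be illustrated with a minimal sketch. It assumes a single regressor, made-up data, and a leave-one-out refit as the stability check; the function names `fit_line` and `leave_one_out_slopes` are hypothetical, and none of this comes from the textbook examples.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def leave_one_out_slopes(x, y):
    """Slope estimates obtained by refitting with each observation removed."""
    return [
        fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])[1]
        for i in range(len(x))
    ]


# Made-up illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

b0, b1 = fit_line(x, y)
slopes = leave_one_out_slopes(x, y)

# Assumed subject-matter knowledge: the response should increase with x.
sign_ok = b1 > 0
# A wide spread would suggest the estimate depends heavily on single points.
spread = max(slopes) - min(slopes)

print(f"intercept={b0:.3f}, slope={b1:.3f}, sign_ok={sign_ok}")
print(f"leave-one-out slope spread={spread:.3f}")
```

A slope whose sign contradicts subject-matter knowledge, or whose value shifts substantially when one observation is dropped, is a reason to look more closely at the model before using it.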
Example 11.1 The Hald Cement Data
The coefficients of x1 are very similar; the coefficients of x2 and the intercepts are moderately different. How different are the predicted values?
Which model would you prefer?
Example 11.2 The Delivery Time Data
Compare the residual mean square to the average squared prediction error.
New data: average squared prediction error
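The comparison between the residual mean square and the average squared prediction error can be sketched as follows. This is an illustration under stated assumptions: one regressor instead of the two used for the delivery time data, made-up estimation and new data, and hypothetical function names.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def residual_mean_square(x, y, b0, b1, n_params=2):
    """SS_Res / (n - p) on the data used to fit the model."""
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return ss_res / (len(x) - n_params)


def avg_squared_prediction_error(x_new, y_new, b0, b1):
    """Mean of squared prediction errors on data not used in fitting."""
    errors = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x_new, y_new)]
    return sum(errors) / len(errors)


# Made-up estimation data and new data (not the delivery time data).
x_est = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_est = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
x_new = [2.5, 3.5, 4.5, 5.5]
y_new = [5.3, 6.8, 9.3, 10.9]

b0, b1 = fit_line(x_est, y_est)
ms_res = residual_mean_square(x_est, y_est, b0, b1)
aspe = avg_squared_prediction_error(x_new, y_new, b0, b1)

print(f"MS_Res = {ms_res:.4f}")
print(f"average squared prediction error = {aspe:.4f}")
```

An average squared prediction error of roughly the same size as the residual mean square suggests the model predicts new data about as well as it fits the original data; a much larger value suggests it does not.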
How does this compare to the R² for prediction based on PRESS?
11.2 Validation Techniques
3. Data splitting (also known as cross validation)
• Divide the data into two parts: estimation data and prediction data
• The PRESS statistic is an estimate of performance based on data splitting
• We can also use PRESS to compute an R²-type statistic for prediction: R²_prediction = 1 − PRESS / SS_T
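As a minimal sketch, PRESS and the R²-type statistic for prediction can be computed from the ordinary residuals and the leverages, using PRESS = Σ (e_i / (1 − h_ii))². The example assumes a single regressor, for which h_ii = 1/n + (x_i − x̄)² / S_xx, and made-up data; the function name `press_and_r2_prediction` is hypothetical.

```python
def press_and_r2_prediction(x, y):
    """PRESS and 1 - PRESS/SS_T for a one-regressor least-squares fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar

    press = 0.0
    for xi, yi in zip(x, y):
        e_i = yi - (b0 + b1 * xi)                # ordinary residual
        h_ii = 1.0 / n + (xi - xbar) ** 2 / sxx  # leverage of observation i
        press += (e_i / (1.0 - h_ii)) ** 2       # squared PRESS residual

    ss_t = sum((yi - ybar) ** 2 for yi in y)     # total sum of squares
    return press, 1.0 - press / ss_t


# Made-up illustration data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

press, r2_pred = press_and_r2_prediction(x, y)
print(f"PRESS = {press:.4f}, R2_prediction = {r2_pred:.4f}")
```

Each PRESS residual equals the prediction error for observation i from a model fitted without that observation, so no refitting is needed.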
11.2 Validation Techniques
3. Data splitting (also known as cross validation)
• If the time sequence is known, data splitting can be done by time order (common in time series or forecasting)
• Splitting can also follow other characteristics of the data (for example, whether the data are grouped by operator, machine, or location)
• Double cross validation
• Drawbacks?
• A more formal approach?
• The DUPLEX algorithm
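Double cross validation with a time-order split can be sketched as below: the model is fitted on the earlier part and used to predict the later part, then the roles are swapped. This is an illustration only, assuming a single regressor, observations already sorted by time, and made-up data; it is not the DUPLEX algorithm, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def avg_squared_prediction_error(x_fit, y_fit, x_pred, y_pred):
    """Fit on one part, return the mean squared error on the other part."""
    b0, b1 = fit_line(x_fit, y_fit)
    errors = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x_pred, y_pred)]
    return sum(errors) / len(errors)


def double_cross_validation(x, y, split):
    """Split at index `split` (time order), then validate in both directions."""
    xa, ya = x[:split], y[:split]   # earlier observations
    xb, yb = x[split:], y[split:]   # later observations
    early_to_late = avg_squared_prediction_error(xa, ya, xb, yb)
    late_to_early = avg_squared_prediction_error(xb, yb, xa, ya)
    return early_to_late, late_to_early


# Made-up data, assumed to be listed in time order.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [3.2, 4.9, 7.1, 9.0, 10.8, 13.1, 15.2, 16.9]

err_ab, err_ba = double_cross_validation(x, y, split=4)
print(f"fit early, predict late: {err_ab:.4f}")
print(f"fit late, predict early: {err_ba:.4f}")
```

Similar errors in both directions suggest that each half supports a model that predicts the other half about equally well; a large imbalance may indicate that the relationship changed over time or that one half covers a different region of the regressor space.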
Example 11.3 The Delivery Time Data
A portion of Table 11.3, showing the prediction and estimation data determined with DUPLEX.
A portion of Table 11.4 is reproduced here.
Example 11.3 The Delivery Time Data