STAT 250 Dr Kari Lock Morgan Multiple Regression
STAT 250, Dr. Kari Lock Morgan
Multiple Regression
SECTIONS 10.1, 10.3
• Multiple explanatory variables (10.1, 10.3)
Statistics: Unlocking the Power of Data (Lock5)
More than 2 variables!
• For the rest of the course, we’ll finally get beyond one or two variables!
Question of the Day
• How can we predict body fat percentage from easy measurements?
Predicting Body Fat Percentage
• The percentage of a person’s weight that is made up of body fat is often used as an indicator of health and fitness
• Accurate measures of percent body fat are hard to get (for example, immerse the body in water to estimate density, then apply a formula)
• Another option: build a model predicting % body fat based on easy-to-obtain measurements
Body Fat Data
• Measurements were collected on 100 men
• Response variable: percent body fat
• Explanatory variables: Age (in years), Weight (in pounds), Height (in inches), Neck circumference (in cm), Chest circumference (in cm), Abdomen circumference (in cm), Ankle circumference (in cm), Biceps circumference (in cm), Wrist circumference (in cm)
• A sample taken from data provided by Johnson R., "Fitting Percentage of Body Fat to Simple Body Measurements," Journal of Statistics Education, 1996
Multiple Regression
• Multiple regression extends simple linear regression to include multiple explanatory variables: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
• Each $x_i$ is a different explanatory variable
• $k$ is the number of explanatory variables
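To make the model concrete, here is a minimal sketch of how the coefficients could be estimated by least squares in plain Python. The function name `fit_least_squares` and the tiny data set are made up for illustration (they are not the body fat data); in practice, statistical software such as Minitab does this step for you.

```python
def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals."""
    # Add a leading 1 to each row so that b0 acts as the intercept.
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [
        [sum(xi[i] * xi[j] for xi in X) for j in range(p)]
        + [sum(xi[i] * yi for xi, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical data with k = 2 explanatory variables,
# generated exactly from y = 2 + 3*x1 - 1*x2 (no error term).
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
y = [2 + 3 * x1 - 1 * x2 for x1, x2 in rows]

coefs = fit_least_squares(rows, y)
print([round(c, 4) for c in coefs])
```

Because the made-up response has no error term, the fit recovers the intercept and both slopes; with real data the estimates would only approximate the underlying relationship.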
Three Explanatory Variables
• We’ll start with three explanatory variables: age, weight, height
• The regression equation is…
Predicting Percent Body Fat
• What can we do with this? Make predictions, interpret coefficients, inference, interpret $R^2$, and more!
Making Predictions
• If you are male, you can use this to predict your percent body fat!
• (Females can try too, just for practice, but it won’t be accurate. Why not?)
• Age: years, weight: pounds, height: inches
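A prediction is just the fitted equation evaluated at one person's measurements. The sketch below uses the three-variable equation quoted in the "New Model" slide (Bodyfat = 49.6 + 0.1653 Age + 0.2264 Weight - 1.117 Height). The function name `predict_body_fat` and the example man are hypothetical.

```python
def predict_body_fat(age_years, weight_lb, height_in):
    """Predicted percent body fat from the age/weight/height model in these slides."""
    return 49.6 + 0.1653 * age_years + 0.2264 * weight_lb - 1.117 * height_in


# Hypothetical man: 30 years old, 180 pounds, 70 inches tall.
example = predict_body_fat(30, 180, 70)
print(round(example, 1))  # about 17.1 percent body fat
```

The units must match the ones the model was fit with (years, pounds, inches), and the model was fit on data from men only.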
Percent Body Fat
Interpreting Coefficients
• Intercept: a man who is 0 years old, weighs 0 lbs, and is 0 inches tall would have a predicted 49.6% body fat
• Slope: keeping weight and height constant, predicted percent body fat increases by 0.1653 for every additional year
• Keeping age and height constant, predicted percent body fat increases by 0.2264 for every additional pound
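The "keeping the other variables constant" wording can be checked numerically: change one input by one unit, leave the others alone, and compare the two predictions. This is a small sketch using the same three-variable equation from these slides; the helper name and the baseline values are hypothetical.

```python
def predict_body_fat(age_years, weight_lb, height_in):
    """Three-variable model quoted in these slides."""
    return 49.6 + 0.1653 * age_years + 0.2264 * weight_lb - 1.117 * height_in


# Hypothetical baseline measurements.
age, weight, height = 40, 170, 69
baseline = predict_body_fat(age, weight, height)

# One more year, same weight and height.
delta_age = predict_body_fat(age + 1, weight, height) - baseline
# One more pound, same age and height.
delta_weight = predict_body_fat(age, weight + 1, height) - baseline

print(round(delta_age, 4), round(delta_weight, 4))
```

The differences equal the age and weight coefficients no matter which baseline is chosen, which is what the slope interpretation says.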
Interpreting Coefficients
• Which of the following is a correct interpretation?
a) Keeping age and weight constant, height decreases by 1.117 for every additional percent of body fat
b) Keeping age and weight constant, percent body fat decreases by 1.117 for every additional inch
c) Predicted body fat decreases by 1.117 for every additional inch
Minitab Output
Inference
• Are our explanatory variables significant predictors?
• All of the p-values corresponding to the explanatory variables are very small
• Age, weight, and height are all significant predictors of percent body fat (given the other variables in the model)
$R^2$
• $R^2$ is the proportion of the variability in the response variable, Y, that is explained by the fitted model
• For simple linear regression, $R^2 = r^2$ ($R^2$ is just the sample correlation squared)
• $R^2$ is also called the coefficient of determination
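As a rough check of the two statements above, the sketch below computes the proportion of variability explained (one minus the residual sum of squares over the total sum of squares) for a simple linear regression and compares it with the squared sample correlation. The data values and the function name are invented for illustration.

```python
def simple_regression_r2(x, y):
    """Return (R^2 from sums of squares, squared sample correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)  # total variability in y
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Least squares line for one explanatory variable.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Variability left over after using x to predict y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    r_squared = 1 - sse / syy
    r = sxy / (sxx * syy) ** 0.5
    return r_squared, r ** 2


# Made-up data, not from the course.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8]

r_squared, r_sq = simple_regression_r2(x, y)
print(round(r_squared, 4), round(r_sq, 4))
```

The two numbers agree for one explanatory variable. With several explanatory variables, the same "one minus SSE over SST" definition still applies, but there is no single correlation to square.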
$R^2$
• How much does the variability in Y decrease if you know X?
$R^2$
• About 55% of the variability in percent body fat is explained by age, weight, and height
• Can we do better?
Comparing with BMI
• BMI is used more commonly than percent body fat because it is easy to calculate
• Currently, our predicted percent body fat is not using much more information than BMI (just age as an extra predictor)
• What’s wrong with body mass index (BMI) as an indicator of health and fitness?
• How might we improve our model to fix this problem?
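For reference, BMI is weight in kilograms divided by height in meters squared; with pounds and inches the usual conversion factor is 703. The slides do not state this formula, so treat the short sketch below as background, with a hypothetical function name and example values.

```python
def bmi_from_imperial(weight_lb, height_in):
    """Body mass index from pounds and inches (standard 703 conversion factor)."""
    return 703 * weight_lb / height_in ** 2


# Hypothetical man: 180 pounds, 70 inches tall.
bmi = bmi_from_imperial(180, 70)
print(round(bmi, 1))  # about 25.8
```

BMI takes only weight and height as inputs, which is why a model based on age, weight, and height adds just one extra piece of information.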
New Model
• Bodyfat = -55.9 + 0.0067 Age - 0.1724 Weight + 0.099 Height + 1.066 Abdomen
• Anything look odd about this equation???
• Model without Abdomen: Bodyfat = 49.6 + 0.1653 Age + 0.2264 Weight - 1.117 Height
• What’s going on?!?
Significance
• Which explanatory variable(s) are significant?
a) All of them: age, weight, height, abdomen
b) Weight and height
c) Weight, height, abdomen
d) Weight and abdomen
e) Abdomen only
Multiple Regression
• The coefficient for each explanatory variable is the predicted change in y for one unit change in x, given the other explanatory variables in the model!
• The p-value for each coefficient indicates whether it is a significant predictor of y, given the other explanatory variables in the model!
• If explanatory variables are associated with each other, coefficients and p-values will change depending on what else is included in the model
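The last point can be illustrated with a small invented data set in which two explanatory variables are strongly associated. In this sketch (all names and numbers are hypothetical, not the body fat data), the slope for `x1` is positive when `x1` is the only predictor but negative once `x2` is also in the model, similar to the sign change for Weight when Abdomen was added.

```python
def centered_sum(u, v):
    """Sum of products of deviations from the means of u and v."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


# Invented data: x1 and x2 rise together, and y is built as 2*x2 - 0.5*x1.
x1 = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]
x2 = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2 * b - 0.5 * a for a, b in zip(x1, x2)]

s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)

# Simple regression of y on x1 alone.
slope_x1_alone = s1y / s11

# Two-predictor least squares, solving the 2x2 centered normal equations:
#   s11*b1 + s12*b2 = s1y
#   s12*b1 + s22*b2 = s2y
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s2y * s11 - s1y * s12) / det

print(round(slope_x1_alone, 3), round(b1, 3), round(b2, 3))
```

Taken alone, `x1` partly stands in for `x2`, so its slope is positive. Given `x2`, the coefficient of `x1` describes what `x1` adds beyond `x2`, and here that is negative.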
Full Model
Which explanatory variable(s) are significant?
a) All of them
b) Weight and abdomen
c) Neck only
d) Abdomen and wrist
Insignificant Terms
• What should we do with the insignificant variables?
• Keep them in the model?
• Take them out of the model?
• Deciding which variables to keep in the model (variable selection) is an entire subfield of statistics, and beyond the scope of this class
• Want to learn more about it? Take STAT 462!
Explaining Variability
• How much of the variability in percent body fat is explained by this model? Which of the following would tell us this?
a) p-value
b) correlation
c) slope coefficients
d) $R^2$
e) confidence interval
Full Model
Question #2 of the Day
• What will I get on the final exam???
Model Output
• All grades are in percent form (0 – 100)
• You can predict your final exam score based on your performance so far!
• You have a point estimate… what do you really want???
Uncertainty?
Significance
• WileyPlus and Clicker grades are not significant in the model. Does this mean that they are not significantly associated with Final Exam score?
a) Yes
b) No
Significance
• Clicker is still not significant in the model. Does this mean coming to class doesn’t matter?
a) Yes
b) No
Clicker
• Can we conclude that coming to class improves your score on the final exam?
a) Yes
b) No
Multiple Regression
• Coefficients and p-values depend on the other explanatory variables included in the model!!!
Lots More!
• The goal of this class was to expose you to multiple regression as a way to incorporate more than two variables
• This one class does NOT cover everything you should know about regression!
• If you really want to use multiple regression for data analysis, take STAT 462! (Or consult with a statistician)
To Do
• Do HW 10.13 (due Wednesday, 4/19)