Low Birth Weight Analysis of States By Team
Low Birth Weight: Analysis of States. By Team 1: Randy, Matt, Heather, Jen. Copyright (c) 2008 by The McGraw-Hill Companies. This material is intended solely for educational use by licensed users of LearningStats. It may not be copied or resold for profit.
Introduction Tasks: • Discuss a priori logic • Select possible predictors • Run regression with the predictors • Test residuals for normality • Check for heteroscedasticity
Definition of Variables Independent Variable
First and 9th Run (panels: First Run, 9th Run)
Summary of Runs
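The regression runs can be illustrated with a minimal least-squares sketch. It is deliberately simplified to a single predictor, whereas the team's runs used several; the function names and the numbers are hypothetical and are not the team's state data.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1

def residuals(x, y, b0, b1):
    """Observed minus fitted values."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Made-up numbers for illustration only (not the team's data).
predictor = [10.0, 12.0, 14.0, 16.0, 18.0]
lo_weight = [6.9, 7.4, 7.6, 8.3, 8.6]

b0, b1 = fit_line(predictor, lo_weight)
resid = residuals(predictor, lo_weight, b0, b1)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

The residuals returned here are the quantities that the normality and constant-variance checks examine.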
Test for Normality • We tested our residuals for normality • We constructed both a histogram and a residual probability plot • Both graphs suggest that the residuals are approximately normally distributed, with possible outliers
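A residual probability plot pairs the sorted residuals with the quantiles expected from a standard normal distribution; points lying close to a straight line suggest normality. The sketch below computes those pairs and their correlation. The plotting position (i - 0.5) / n is one common convention and an assumption here, as are the function names and the sample residuals.

```python
from statistics import NormalDist, mean

def normal_scores(n):
    """Standard-normal quantiles at plotting positions (i - 0.5) / n."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    num = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    den = (sum((ai - a_bar) ** 2 for ai in a)
           * sum((bi - b_bar) ** 2 for bi in b)) ** 0.5
    return num / den

def probability_plot(resid):
    """Return (normal score, sorted residual) pairs and their correlation."""
    ordered = sorted(resid)
    scores = normal_scores(len(ordered))
    return list(zip(scores, ordered)), pearson(scores, ordered)

# Hypothetical residuals; the last value plays the role of a possible outlier.
sample_resid = [-0.9, -0.4, -0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 0.5, 1.6]
points, r = probability_plot(sample_resid)
print(f"probability-plot correlation: {r:.3f}")
```

A correlation near 1 is consistent with normal residuals, while an outlier shows up as a point that pulls away from the line at one end.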
Multicollinearity • A Variance Inflation Factor (VIF) table and a correlation matrix were constructed • These tables assessed the collinearity between the predictors • Birthrate and Birthrate2 have high VIF values because Birthrate2 is the squared value of Birthrate • UninsChi and UninsTot have high VIF values because the number of uninsured children is part of the total number of uninsured people • On the correlation matrix, Birthrate and Birthrate2, as well as UninsChi and UninsTot, have values greater than .500, which shows that they may be related
Variance Inflation Factors
Correlation Matrix By definition
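To see why a predictor and its own square produce a high VIF, recall that the VIF for predictor j is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. With only two predictors, R_j^2 is simply their squared correlation. The sketch below uses that two-predictor special case; the birthrate values are hypothetical and the team's full model had more predictors.

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    num = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    den = (sum((ai - a_bar) ** 2 for ai in a)
           * sum((bi - b_bar) ** 2 for bi in b)) ** 0.5
    return num / den

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical birthrates (not the team's data) and their squares.
birthrate = [12.0, 13.1, 14.2, 15.0, 16.3, 17.5, 21.0]
birthrate_sq = [b ** 2 for b in birthrate]

r = pearson(birthrate, birthrate_sq)
vif = vif_two_predictors(birthrate, birthrate_sq)
print(f"correlation={r:.4f}, VIF={vif:.1f}")
```

Over a narrow positive range, a variable and its square are almost perfectly linearly related, so the correlation is close to 1 and the VIF is large.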
Constant Variance • Residual plots were made to test whether our residuals have constant variance • Residuals were plotted against each predictor • The results show that the residuals are homoscedastic against every predictor except the two binary variables
Constant Variance Graphs
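For a binary predictor, the residual plot has only two columns of points, so one simple numeric companion to the graph is to compare the residual variance within each group. This is a rough sketch rather than a formal test; the function name, residuals, and indicator values are all hypothetical.

```python
from statistics import variance

def variance_by_group(resid, indicator):
    """Sample variance of the residuals within each level of a binary predictor."""
    groups = {}
    for e, g in zip(resid, indicator):
        groups.setdefault(g, []).append(e)
    return {g: variance(vals) for g, vals in groups.items()}

# Hypothetical residuals and a 0/1 predictor (not the team's data).
resid = [-0.2, 0.1, 0.3, -0.1, -0.1, -1.0, 0.8, 1.2, -1.0]
binary = [0, 0, 0, 0, 0, 1, 1, 1, 1]

by_group = variance_by_group(resid, binary)
ratio = max(by_group.values()) / min(by_group.values())
print(by_group, f"variance ratio={ratio:.1f}")
```

A ratio far from 1 suggests the spread of the residuals differs between the two groups, which is what a non-constant-variance pattern looks like for a binary variable.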
Y actual vs. Y Fitted • The next graph shows our predicted and actual values of LoWeight on a scatter plot • Colorado and Wyoming are clearly outliers • The rest of our data is close to the 45-degree line, which shows that our model fits well and is reflected in a high R²
Y actual vs. Y fitted (labeled outliers: Wyoming, Colorado)
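The link between the 45-degree line and R² can be made concrete: R² = 1 - SSE / SST, so the closer each actual value is to its fitted value, the smaller SSE and the higher R². The sketch below computes R² and finds the observation farthest from the line. The labels and values are hypothetical placeholders, not the team's state data.

```python
from statistics import mean

def r_squared(actual, fitted):
    """R^2 = 1 - SSE / SST for paired actual and fitted values."""
    y_bar = mean(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - y_bar) ** 2 for a in actual)
    return 1.0 - sse / sst

def largest_gap(labels, actual, fitted):
    """Label of the point farthest from the 45-degree line (largest |actual - fitted|)."""
    return max(zip(labels, actual, fitted), key=lambda t: abs(t[1] - t[2]))[0]

# Hypothetical observations for illustration only.
labels = ["State A", "State B", "State C", "State D", "State E"]
actual = [7.0, 8.2, 9.1, 6.5, 8.8]
fitted = [7.1, 8.0, 8.4, 6.6, 8.9]

r2 = r_squared(actual, fitted)
worst = largest_gap(labels, actual, fitted)
print(f"R^2={r2:.3f}, farthest from the 45-degree line: {worst}")
```

Points such as the two labeled outliers would appear here as the observations with the largest gaps between actual and fitted values.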