A Multivariate Regression Analysis of Hospital Stays in a Nosocomial Infection Control Data
Principal Investigator: Dr. Linda B. Hayden
Mentor: Dr. Julian A. Allagan
Team Members: Matthew Hill, Jessica Hathaway, Lilshay Rogers, Heaven Tate
Abstract
In this report we developed and analyzed several linear regression models to predict the hospital stay (length of stay) of patients in the U.S. using the SENIC project data from CDC-Atlanta. We examined several potential explanatory variables and their relationships with the response variable “Stay”, with the goal of determining which leading factors influenced patients' length of stay in this nosocomial (hospital-acquired) infection control data. In particular, our report aimed to answer the following question: given the data, which leading factors help explain the hospital stays of patients in the U.S.? In at least one model, we found that Age and Region influenced the variable “Stay” the most.
Introduction
• In 2011, hospital in-patient expenses accounted for almost one-third of all healthcare expenditures in the United States, compared with prescription medicines, which accounted for about one-fifth of total medical expenses.
• In 2012, there were 36.5 million hospital stays in the US, with an average length of stay of 4.5 days and an average cost of $10,400 per stay.
• Using data exploration and linear regression analysis tools, we examine the association between hospital stays and several factors.
Data
• 1 file: ‘Hospital.txt’ (8 KB)
• 12 variables from 113 US hospital records
• SENIC data, CDC Atlanta
  ✓ Age (Age)
  ✓ Infection Risk (Risk)
  ✓ Routine Culturing Ratio (Culturing)
  ✓ Number of Beds (Bed)
  ✓ Regions (Region)
  ✓ Number of Nurses (Nurse)
Variable Distributions

Variable         Min    1st Q   Median   Mean   3rd Q   Max
Stay (days)      11     8       9        10     11      20
Age (years)      39     51      53       53     56      66
Risk (percent)   1.3    3.7     4.4      4.3    5.2     7.8
Beds             29     106     186      252    312     835
Nurses           14     66      132      173    218     656
Hospital Stay Boxplot
Distribution of the Response Variable Stay (Length of Stay)
Analytical Process
• 11 explanatory variables (covariates)
• OLS with Length of Stay as response
• Stepwise selection by AIC
• 4-6 predictors
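The selection step can be illustrated with a minimal sketch, assuming a forward-only variant of stepwise selection (the slides do not say which direction was used) and the Gaussian form of AIC, n·log(RSS/n) + 2k, where k counts the fitted coefficients. The data below are synthetic and the names `risk`, `age`, and `noise` are placeholders for illustration; they are not the SENIC records.

```python
import math
import random

def ols_rss(X, y):
    """Fit OLS with an intercept via the normal equations; return the residual sum of squares."""
    rows = [[1.0] + list(r) for r in X]          # prepend the intercept column
    k = len(rows[0])
    # Augmented system [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def aic(rss, n, n_params):
    """Gaussian AIC up to an additive constant."""
    return n * math.log(rss / n) + 2 * n_params

def forward_stepwise(data, y, names):
    """Add the predictor that lowers AIC the most; stop when no addition helps."""
    n = len(y)
    selected = []
    best = aic(ols_rss([[] for _ in y], y), n, 1)   # intercept-only model
    while True:
        candidates = []
        for name in names:
            if name in selected:
                continue
            cols = selected + [name]
            X = [[data[c][i] for c in cols] for i in range(n)]
            candidates.append((aic(ols_rss(X, y), n, len(cols) + 1), name))
        if not candidates:
            break
        score, name = min(candidates)
        if score >= best:
            break
        best, selected = score, selected + [name]
    return selected, best

# Synthetic illustration only (NOT the SENIC data): stay depends on risk and age.
random.seed(0)
n = 113
data = {
    "risk":  [random.gauss(4.4, 1.3) for _ in range(n)],
    "age":   [random.gauss(53.0, 4.5) for _ in range(n)],
    "noise": [random.gauss(0.0, 1.0) for _ in range(n)],
}
y = [2.0 + 1.0 * r + 0.3 * a + random.gauss(0.0, 0.5)
     for r, a in zip(data["risk"], data["age"])]

selected, best = forward_stepwise(data, y, list(data))
print("selected predictors:", selected)
```

A stepwise routine that also considers dropping variables at each step would follow the same pattern, with removal candidates scored alongside the additions.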
Investigation: Leading Factors in Length of Stay
• Model A: Risk, Region, Census, Nurses, Age
• Model B / Model C: Risk, Region, Census, Nurses, Age, Xray, Beds
Model A
The predictors help explain about 60% (R² = 0.59) of the variation we observed in length of stay in this model. Moreover, the model is statistically significant overall (p-value < 2.2e-16).
Model B
The predictors help explain about 61% (R² = 0.61) of the variation we observed in length of stay in this model. Moreover, the model is statistically significant overall (p-value < 2.2e-16).
Model C
The predictors help explain about 62% (R² = 0.62) of the variation we observed in length of stay in this model. Moreover, the model is statistically significant overall (p-value < 2.2e-16).
Model Building Process Summary
• Pool of variables: Model A (R-sq = 0.59), Model B (R-sq = 0.61), Model C (R-sq = 0.62)
• Model A chosen: all variables uncorrelated
• Check model assumptions
• Model A as “Best” predictive model
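One common way to compare models of different size is adjusted R-squared, which penalizes each extra predictor. The slides do not report it, so the following is only a sketch: it assumes Model A has 5 predictors with Region entered as a single numeric term, and, as a labeled assumption, that Model C has 7 predictors. The sample size of 113 hospitals comes from the Data slide.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for a model with p predictors plus an intercept, fit to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

n_hospitals = 113                                   # from the Data slide

adj_a = adjusted_r2(0.59, n_hospitals, p=5)         # Model A: 5 listed predictors
adj_c = adjusted_r2(0.62, n_hospitals, p=7)         # ASSUMPTION: Model C has 7 predictors

print(f"Model A adjusted R-sq: {adj_a:.3f}")
print(f"Model C adjusted R-sq (assumed 7 predictors): {adj_c:.3f}")
```

Under these assumptions the adjusted values are roughly 0.57 and 0.59, so the gap between the smallest and largest model narrows once model size is taken into account.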
Regression Output for Our Final Model

Predictor   Coefficient   Standard Error   P-value
Risk        0.54          0.10             1.54e-07
Region      -0.68         0.12             1.21e-07
Census      0.01          0.00             1.18e-06
Nurses      -0.01                          0.00058
Age         0.08          0.03             0.00456

All variables are statistically significant at the 5% level (95% confidence), and the coefficient estimates have very small standard errors.
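To read these coefficients: each one is the change in predicted length of stay (in days) for a one-unit change in that predictor, holding the others fixed. The sketch below computes only differences in predicted stay, because the intercept is not reported; the helper name `predicted_change` and the example changes are hypothetical.

```python
# Coefficients as reported for the final model (intercept not reported).
coef = {"Risk": 0.54, "Region": -0.68, "Census": 0.01, "Nurses": -0.01, "Age": 0.08}

def predicted_change(deltas: dict) -> float:
    """Change in predicted Stay (days) for the given predictor changes, others held fixed."""
    return sum(coef[name] * delta for name, delta in deltas.items())

# Hypothetical scenarios, for illustration only.
drop_risk = predicted_change({"Risk": -1.0})        # infection risk lowered by one point
more_nurses = predicted_change({"Nurses": 50})      # 50 additional nurses
older = predicted_change({"Age": 10})               # average patient age 10 years higher

print(f"Risk -1 point : {drop_risk:+.2f} days")
print(f"Nurses +50    : {more_nurses:+.2f} days")
print(f"Age +10 years : {older:+.2f} days")
```

Because the model is linear, these changes add up: lowering Risk by one point and adding 50 nurses together shift the predicted stay by the sum of the two individual changes.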
Conclusion
• We used a standard ordinary least squares regression model on this data.
• The model indicates that variables such as Risk, Region, and Nurses play bigger roles in affecting patients' length of stay. Age is almost irrelevant.
• Recommendation: Find ways to lower the risk of infection, and perhaps increase the number of nurses.
• Future work: Instead of a stepwise selection criterion, use a machine learning algorithm such as GBM (Gradient Boosting Machine) to find a possibly different set of predictors from the data.
• Future question: What factors contribute the most to the increase in risk of a nosocomial infection?
References
1. Gonzalez JM. National Health Care Expenses in the U.S. Civilian Noninstitutionalized Population, 2011. MEPS Statistical Brief No. 425. Rockville, MD: Agency for Healthcare Research and Quality, 2013. http://meps.ahrq.gov/data_files/publications/st425/stat425.pdf
2. Weiss AJ (Truven Health Analytics), Elixhauser A (AHRQ). Overview of Hospital Stays in the United States, 2012. HCUP Statistical Brief #180. October 2014. Agency for Healthcare Research and Quality, Rockville, MD. http://www.hcup-us.ahrq.gov/reports/statbriefs/sb180-Hospitalizations-United-States-2012.pdf
3. Special issue, “The SENIC Project,” American Journal of Epidemiology 111 (1980), pp. 465-653. Data obtained from Robert W. Haley, M.D., Hospital Infections Program, Center for Infectious Disease, Center for Disease Control, Atlanta, Georgia 30333.
4. Kutner, Nachtsheim, Neter and Li, Applied Linear Statistical Models, 5th ed., McGraw-Hill, 2004.
Questions