FORECASTING AND PREDICTIVE ANALYTICS: Combining Techniques for Better Projections. Rebecca T. Barber, Ph.D., Arizona State University
“IF WE COULD FIRST KNOW WHERE WE ARE AND WHITHER WE ARE TENDING, WE COULD BETTER JUDGE WHAT TO DO AND HOW TO DO IT” Abraham Lincoln
Audience poll: text REBECCABARBE835 to 37607 to join the poll, or go to PollEv.com/rebeccabarbe835, then text the words you think of. What words come to mind when you think about FORECASTING?
Forecasting, defined ■ OED: “A calculation or estimate of future events”; to “predict or estimate …” ■ A technique that looks at a time series of numbers and predicts the future value of the data by looking at the trends.
Forecasting, characteristics ■ Tends to be: – Logical – Rules-driven – Process-oriented – At an aggregate level – TIME is made explicit – Assumes you have all the relevant data
Forecasting, examples ■ Forecast enrollment for fall semester based upon – Applications by subgroup – Applications accepted and historic percentage of new applications likely to be accepted by subgroup – Historic percentage of accepted students who enroll by subgroup – Historic withdrawal rate before census date by subgroup ■ Forecast revenue for a fiscal year using – Enrollment by tuition subgroups – Proposed tuition rates by subgroup – Proposed discount rates by subgroup – Historic percentage non-payments – Historic percentage Fall to Spring retention
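The enrollment example above is a chain of rates applied by subgroup. A minimal sketch of that arithmetic follows; the subgroup names and every figure are hypothetical and chosen only for illustration, since the slides give no numbers here.

```python
# Hypothetical inputs per subgroup; none of these figures come from the slides.
subgroups = {
    "resident": {
        "applications": 1000,
        "accept_rate": 0.80,    # historic share of applications accepted
        "yield_rate": 0.40,     # historic share of accepted students who enroll
        "withdraw_rate": 0.05,  # historic withdrawal rate before census date
    },
    "non_resident": {
        "applications": 500,
        "accept_rate": 0.70,
        "yield_rate": 0.20,
        "withdraw_rate": 0.10,
    },
}


def forecast_enrollment(group):
    """Applications -> accepted -> enrolled -> still enrolled at census."""
    accepted = group["applications"] * group["accept_rate"]
    enrolled = accepted * group["yield_rate"]
    return enrolled * (1 - group["withdraw_rate"])


by_group = {name: forecast_enrollment(g) for name, g in subgroups.items()}
total = sum(by_group.values())

for name, value in by_group.items():
    print(f"{name}: {value:.0f}")
print(f"total: {total:.0f}")
```

Because every step is an explicit rule applied at an aggregate level, a small change in any one rate flows straight through to the total.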
Forecasting, factors related to accuracy ■ How well do we understand the factors contributing to our outcome? ■ How much data is available? ■ Can our forecast affect the thing we are trying to forecast?
Forecasting, techniques ■ Arithmetic combinations – Fall Revenue: 100 students * 10,000 = 1,000,000 – Spring Revenue: 1,000,000 * 0.9465 = 946,500 ■ Less common techniques (in our environment) – Moving Average (simple or weighted) – Exponential Smoothing – Seasonal Decomposition – ARIMA: takes into account past prediction errors
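The arithmetic combination and the two simplest smoothing techniques listed above can be sketched in a few lines. This is an illustration only: the enrollment history is hypothetical, the 0.9465 factor is read here as a fall-to-spring retention rate (the slide does not label it), and seasonal decomposition and ARIMA are not covered.

```python
def simple_moving_average(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)


def weighted_moving_average(series, weights):
    """Forecast the next value; weights run oldest-to-newest over the tail."""
    recent = series[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level


# Arithmetic combination from the slide.
fall_revenue = 100 * 10_000               # 100 students at 10,000 each
spring_revenue = fall_revenue * 0.9465    # assumed fall-to-spring retention

# Hypothetical enrollment history (not from the slides).
history = [100, 104, 103, 108, 110]
sma = simple_moving_average(history, window=3)
wma = weighted_moving_average(history, weights=[1, 2, 3])
ses = exponential_smoothing(history, alpha=0.5)

print(fall_revenue, round(spring_revenue))
print(round(sma, 2), round(wma, 2), round(ses, 3))
```

A larger `alpha` or heavier recent weights make the forecast react faster to the latest observations, at the cost of more volatility.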
PREDICTIVE ANALYTICS
Audience poll: text REBECCABARBE835 to 37607 to join the poll, or go to PollEv.com/rebeccabarbe835, then text the words you think of. What words come to mind when you think about PREDICTIVE ANALYTICS?
Predictive Analytics, defined ■ SAS: “Predictive analytics is the use of data, statistical algorithms and machine learning techniques to identify the likelihood of future outcomes based on historical data.”
Predictive Analytics, characteristics ■ Tends to be – Data mining, machine learning – Starts exploratory – Does not assume you know all the “whys”
Predictive Analytics, Example ■ Predict the likelihood that a specific student will enroll for fall based upon – Demographic information – Grades, test scores, class ranks – Date applied (days before semester start) – FAFSA submission, details of package – Relatives enrolled/alumni – Historic likelihood from that high school – Campus tour, contact with recruiters – Presence/absence of keywords in essay – Applied for a specific program/honors college – Web pages visited; terms searched for – Housing deposit received
Predictive Analytics, factors related to accuracy ■ Data Variety ■ Data Volume ■ Data Completeness
Predictive Analytics, techniques ■ Regression-based (continuous outcome variable) ■ Classification-based (discrete outcome variables)
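As one illustration of a classification-based technique, the sketch below fits a small logistic regression that scores the likelihood that an individual applicant enrolls. It is written in plain Python with batch gradient descent so that it runs without any libraries; the three yes/no predictors, the eight applicants, and the learning settings are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, xi):
    """Probability of enrolling; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


# Hypothetical applicants: [campus_visit, fafsa_received, housing_deposit]
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1],
     [0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = enrolled (made-up outcomes)

weights = fit_logistic(X, y)
scores = [predict_proba(weights, xi) for xi in X]
for xi, s in zip(X, scores):
    print(xi, round(s, 2))
```

The output is a score per student rather than an aggregate count, which is the main contrast with the forecasting examples earlier in the deck.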
COMBINING TECHNIQUES
A project combining the data PROBLEM • Enrollment forecasting model is VERY sensitive to small changes, resulting in high levels of volatility in the predicted enrollment and, by extension, the revenue PROPOSED SOLUTION • Design and build a predictive model that can be used to triangulate and smooth the forecast.
The Data • Detail data on applications / acceptances / enrollments – Date applied / accepted (calculated days to start of term) – CI Score – Campus visit (yes/no) – Honors College applicant (yes/no) – Residency – Transcripts received (yes/no) – FAFSA received (yes/no) • Aggregate data for all other student groups – Week – Level – Residency – Modality
A multi-step hybrid model. Logistic regression on potential new students (FTFTF and transfers): • Cut-offs were assigned • Logistic scores were classified and aggregated by week, etc. SPSS TSMODEL was run by week leading to census date: • Combined ARIMA and seasonal models • Generated predicted value at census date (with 95% confidence intervals)
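The hand-off between the two steps, classifying logistic scores with cut-offs and aggregating them by week, might look roughly like the sketch below. The cut-off values, category labels, and scored applicants are hypothetical; the slides say cut-offs were assigned but do not give them. The time-series step (SPSS TSMODEL) is not reproduced here.

```python
from collections import defaultdict

# Hypothetical (weeks_before_census, logistic_score) pairs, one per applicant.
scored = [
    (10, 0.92), (10, 0.35), (10, 0.61),
    (9, 0.88), (9, 0.15),
    (8, 0.74), (8, 0.55),
]

# Hypothetical cut-offs, highest threshold first.
CUTOFFS = [(0.80, "likely"), (0.50, "possible"), (0.00, "unlikely")]


def classify(score):
    """Return the label of the first cut-off the score meets."""
    for threshold, label in CUTOFFS:
        if score >= threshold:
            return label
    return CUTOFFS[-1][1]


# Aggregate classified applicants into counts per week and category.
weekly = defaultdict(lambda: defaultdict(int))
for week, score in scored:
    weekly[week][classify(score)] += 1
weekly_counts = {week: dict(counts) for week, counts in weekly.items()}

# One weekly series per category is what a time-series model would consume.
likely_series = [weekly_counts[w].get("likely", 0)
                 for w in sorted(weekly_counts, reverse=True)]

print(weekly_counts)
print(likely_series)
```

Aggregating scores into weekly counts turns student-level predictions into the same week-by-week shape as the aggregate data used for the other student groups.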
Visualization ■ Shows trend and seasonality – Dip is census date when all counts reset – Blue line at the end is prediction – Purple is upper and lower confidence intervals
Results ■ Overall accuracy: -1.26% – Accuracy was best on the largest groups (Undergrad Immersion, +0.32%) – Graduate students were overestimated regardless of modality – Undergraduate students were underestimated regardless of modality ■ All actual values were within the confidence intervals.
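A signed percentage error and an interval check of the kind reported above could be computed as in this sketch. The sign convention is an assumption (positive means the actual value exceeded the prediction, i.e. an underestimate); the slides do not state how the percentages were calculated, and all counts below are hypothetical.

```python
def percent_error(predicted, actual):
    """Signed error as a percentage of the actual value.

    Assumed convention: positive = underestimate, negative = overestimate.
    """
    return (actual - predicted) / actual * 100


def within_interval(actual, lower, upper):
    """True if the actual value falls inside the confidence interval."""
    return lower <= actual <= upper


# Hypothetical predictions with 95% interval bounds and actual census counts.
groups = {
    "undergrad": {"predicted": 9900, "lower": 9500, "upper": 10300, "actual": 10000},
    "graduate": {"predicted": 2150, "lower": 1900, "upper": 2400, "actual": 2000},
}

errors = {name: percent_error(g["predicted"], g["actual"])
          for name, g in groups.items()}
covered = {name: within_interval(g["actual"], g["lower"], g["upper"])
           for name, g in groups.items()}

total_predicted = sum(g["predicted"] for g in groups.values())
total_actual = sum(g["actual"] for g in groups.values())
overall = percent_error(total_predicted, total_actual)

for name in groups:
    print(f"{name}: {errors[name]:+.2f}% within interval: {covered[name]}")
print(f"overall: {overall:+.2f}%")
```

Errors of opposite sign in different groups partly cancel in the overall figure, so group-level errors are worth reporting alongside it.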
Thoughts and next steps. Predictive Model: • Consider adding more predictors to the logistic model. Forecast: • Investigate instances of substantial deviation from actual