Use of the SAS High-Performance Forecasting Software to Detect Break in Time Series
Frédéric Picard and Steve Matthews
SAS OPUS, Ottawa, Ontario
November 26, 2015

Outline
• Context
• SAS High Performance Forecasting Software
• Exploration of a Few Options
• Conclusion

Context
• Most time series at Statistics Canada come from surveys repeated over time
• Annual surveys
• Sub-annual surveys (monthly and quarterly)
• Several domains

Methodological Changes in Time Series
We try to minimize changes to surveys, but they are sometimes necessary:
• changing requirements
• changing population characteristics (cell phones, e-questionnaires)
• maintaining or gaining efficiency (sampling error)
An artificial change could be misinterpreted as a meaningful one:
• we prefer to revise past estimates to remove the effect
• the impact is best estimated by conducting a parallel run (costly)

Use of Forecasts for Time Series Break Detection
• Select a model based on past data.
• Use that model to “forecast” the current period, along with a 95% prediction interval.
• Check whether the survey estimate for the current period falls within the interval.
• If the survey estimate is outside the interval, this is an indication of a break.
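
The decision rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the HPF implementation: it assumes a normal-theory prediction interval built from a forecast and its standard error, and the function names and numbers are hypothetical.

```python
from statistics import NormalDist


def prediction_interval(forecast, std_error, level=0.95):
    """Return (lower, upper) for a normal-theory prediction interval."""
    # Two-sided quantile: about 1.96 when level is 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return forecast - z * std_error, forecast + z * std_error


def is_break(estimate, forecast, std_error, level=0.95):
    """Flag the survey estimate if it falls outside the prediction interval."""
    lower, upper = prediction_interval(forecast, std_error, level)
    return not (lower <= estimate <= upper)


# Hypothetical numbers, for illustration only.
print(is_break(estimate=131.0, forecast=120.0, std_error=5.0))  # True
print(is_break(estimate=125.0, forecast=120.0, std_error=5.0))  # False
```

A flagged estimate is an indication of a possible break, not proof of one: with a 95% interval, roughly one unbroken series in twenty would be flagged by chance.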

Statistical Forecasting: Annual Example
Salaries and Wages, Industry domain
ARIMA(1, 1, 0)
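
To make the ARIMA(1, 1, 0) label concrete: the model is an AR(1) on the first differences of the series, so the one-step forecast is the last value plus a fraction of the last change. The sketch below assumes no constant term and estimates the AR coefficient by a simple least-squares ratio; HPF's own estimation may differ, and the series is made up.

```python
def fit_ar1_on_differences(series):
    """Least-squares estimate of phi in d[t] = phi * d[t-1] + e[t],
    where d is the first difference of the series (no constant term)."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    den = sum(d * d for d in diffs[:-1])
    if len(diffs) < 2 or den == 0:
        raise ValueError("not enough variation to estimate phi")
    num = sum(d1 * d0 for d0, d1 in zip(diffs, diffs[1:]))
    return num / den


def forecast_next(series, phi):
    """One-step-ahead ARIMA(1, 1, 0) forecast: last value + phi * last change."""
    return series[-1] + phi * (series[-1] - series[-2])


# Hypothetical annual series, for illustration only.
series = [100.0, 104.0, 106.0, 107.0, 107.5]
phi = fit_ar1_on_differences(series)
print(phi)                         # 0.5
print(forecast_next(series, phi))  # 107.75
```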

Challenges Related to the Volume
• Tens of thousands of survey estimates
• Model selection and estimation
• Manual model selection is a difficult and time-consuming process
• Automated model selection can sometimes fail
• Whether or not to include an auxiliary variable to improve the forecast

SAS High Performance Forecasting Features
• Several options
• Automated model selection
• Fast
• Robust
• GUI
• SAS code (proc HPFdiagnose, HPFengine…)
• GUI can generate the corresponding SAS code

Several Options
• Model selection criteria (MAPE, RMSE, …), in-sample vs. out-of-sample
• Inclusion or not of an auxiliary variable (force it or let HPF decide)
• Transform the data or not
• Several model families available: ESM (exponential smoothing), ARIMAX, UCM (unobserved components), IDM (intermittent demand)
• Outlier detection
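
The two selection criteria named on this slide can rank the same candidate models differently, which is why the choice matters. The following sketch computes MAPE and RMSE for two hypothetical sets of fitted values; the data and model names are invented for illustration. Passing fitted values over the estimation period gives an in-sample criterion, while passing forecasts over held-out periods gives an out-of-sample one.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical data, for illustration only.
actual = [100.0, 200.0, 400.0]
candidates = {
    "model_a": [110.0, 190.0, 400.0],  # small errors on the small values
    "model_b": [100.0, 200.0, 440.0],  # one larger error on the largest value
}

best_by_mape = min(candidates, key=lambda name: mape(actual, candidates[name]))
best_by_rmse = min(candidates, key=lambda name: rmse(actual, candidates[name]))
print(best_by_mape)  # model_b
print(best_by_rmse)  # model_a
```

MAPE weighs errors relative to the level of the series, whereas RMSE penalizes large absolute errors, so the two criteria disagree here.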

Robust
• If ARIMAX models fail, HPF will try simpler models such as ESM.
• It has intermittent demand models for data with many zeroes.
• HPF will (almost) always provide a forecast and a confidence interval if the syntax and file formats are correct.
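
The fallback idea can be illustrated with a small sketch: try the more complex model first and, if it cannot be estimated, fall back to a simpler one. This is not HPF's internal logic; the two toy models, the smoothing weight of 0.3, and all names below are assumptions made for illustration.

```python
def ar1_on_differences(series):
    """Toy 'complex' model: AR(1) on first differences; fails on flat series."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    den = sum(d * d for d in diffs[:-1])
    if len(diffs) < 2 or den == 0:
        raise ValueError("not enough variation to estimate the model")
    phi = sum(d1 * d0 for d0, d1 in zip(diffs, diffs[1:])) / den
    return series[-1] + phi * diffs[-1]


def simple_exponential_smoothing(series, alpha=0.3):
    """Toy 'simple' model: the smoothed level is the one-step forecast."""
    if not series:
        raise ValueError("empty series")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def forecast_with_fallback(series, candidates):
    """Return (model name, forecast) from the first candidate that succeeds."""
    for name, fit in candidates:
        try:
            return name, fit(series)
        except ValueError:
            continue  # this model failed; try the next, simpler one
    raise RuntimeError("no candidate model produced a forecast")


candidates = [("ar1_diff", ar1_on_differences), ("ses", simple_exponential_smoothing)]
print(forecast_with_fallback([100.0, 104.0, 106.0, 107.0, 107.5], candidates))
print(forecast_with_fallback([5.0, 5.0, 5.0, 5.0], candidates))  # falls back to "ses"
```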

For our Project
• Used the GUI to explore the different options and their impact on the models
• Looked at the generated SAS code
• We use the SAS code for production:
  • easier to manage datasets
  • a little more flexible
  • easy to reuse with other datasets or when data are updated

Proc HPFdiagnose and HPFengine
Helpful to determine the best model:
• Does the series have a trend?
• Is the series seasonal?
• Is the auxiliary variable a good predictor of the variable of interest?
• Should we transform the data using the log?
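
As a rough illustration of the seasonality question, one simple heuristic is to look at the autocorrelation of the differenced series at the seasonal lag. This is a sketch of that heuristic only; it is not the test HPFdiagnose applies, and the 0.5 threshold and the synthetic series are assumptions.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given positive lag."""
    n = len(values)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = sum(values) / n
    den = sum((v - mean) ** 2 for v in values)
    if den == 0:
        return 0.0  # a constant series carries no seasonal signal
    num = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return num / den


def looks_seasonal(series, period, threshold=0.5):
    """Heuristic: difference once to remove trend, then check the seasonal lag."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return autocorrelation(diffs, period) > threshold


# Synthetic monthly data: a repeating 12-month pattern plus a linear trend.
pattern = [0, 1, 3, 6, 8, 9, 8, 6, 3, 1, 0, -1]
seasonal_series = [pattern[i % 12] + 0.5 * i for i in range(36)]
trend_only = [float(i) for i in range(36)]

print(looks_seasonal(seasonal_series, period=12))  # True
print(looks_seasonal(trend_only, period=12))       # False
```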

Forecasting a Seasonal Monthly Series
Energy Consumption, Province*Industry domain
ARIMA(0, 1, 0)(2, 0, 0)

Forecasting a Non-Seasonal Monthly Series
Energy Consumption, Province*Industry domain
ESM with trend

Usefulness of an Auxiliary Variable
• Sometimes we have an auxiliary variable.
• It has the potential to increase the precision of the forecast.
• Usually, in order to be useful:
  • it has to be available for the period for which we want to forecast the variable of interest
  • it has to be a good predictor of the variable of interest
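
The two conditions above can be checked with a short sketch. It assumes the auxiliary series has one more observation than the target (the period to forecast) and uses the correlation between period-to-period changes as a crude measure of predictive value. The 0.8 cut-off, the function names, and the data are hypothetical, and this is not how HPF decides whether to keep an input.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def auxiliary_is_usable(target, auxiliary, min_corr=0.8):
    """Check availability for the forecast period and correlation of changes."""
    # Condition 1: the auxiliary value for the period to forecast must exist.
    if len(auxiliary) != len(target) + 1:
        return False
    # Condition 2: changes in the auxiliary should track changes in the target.
    d_target = [b - a for a, b in zip(target, target[1:])]
    d_aux = [b - a for a, b in zip(auxiliary, auxiliary[1:len(target)])]
    return pearson(d_target, d_aux) >= min_corr


# Hypothetical annual data: the auxiliary series already covers the next period.
revenue = [100.0, 110.0, 125.0, 130.0, 150.0]
taxable_revenue = [80.0, 88.0, 101.0, 104.0, 121.0, 126.0]

print(auxiliary_is_usable(revenue, taxable_revenue))       # True
print(auxiliary_is_usable(revenue, taxable_revenue[:-1]))  # False: not yet available
```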

Statistical Forecasting: Annual Example
Total Revenue, Province*Industry domain
ARIMA(0, 1, 0)

Statistical Forecasting: Annual Example
Total Revenue, Province*Industry domain
ARIMA(1, 1, 0) + X(1)

Roles to Variables

Corresponding Code
    proc hpfdiagnose data=… criterion=mape holdout=0;
        transform type=auto;
        id date_SAS interval=YEAR;
        /* variable to forecast */
        forecast Revenue;
        /* auxiliary variable; REQUIRED=MAYBE lets HPF decide whether to keep it */
        input Taxable.Revenue / REQUIRED=MAYBE(POSITIVE);
        arimax identifyorder=both;
        esm;
    run;

Transform Data or Not

Corresponding Code
    proc hpfdiagnose data=… criterion=mape holdout=0;
        /* type=auto lets HPF choose between a log transform and none */
        transform type=auto;
        id date_SAS interval=YEAR;
        forecast Revenue;
        input Taxable.Revenue / REQUIRED=MAYBE(POSITIVE);
        arimax identifyorder=both;
        esm;
    run;

Selection Criterion

Corresponding Code
    /* criterion=mape selects models by MAPE; holdout=0 evaluates it in-sample */
    proc hpfdiagnose data=… criterion=mape holdout=0;
        transform type=auto;
        id date_SAS interval=YEAR;
        forecast Revenue;
        input Taxable.Revenue / REQUIRED=MAYBE(POSITIVE);
        arimax identifyorder=both;
        esm;
    run;

Conclusion
• HPF helped us manage the large number of forecasts
• HPF allowed us to use complex models (with auxiliary variables)
• The GUI was helpful for exploring different options
• HPF is highly automated, but we have the option to intervene

Thank you!
For more information, please contact:
Frederic Picard
Time Series Research and Analysis Centre
Statistics Canada
frederic.picard@canada.ca