ESTP course: Demetra+ for X12 in Daily Work
Ketty Attal-Toubert and Stéphanie Himpens
Insee, France, 16th of November, 2011

Contents of the presentation
› 1. Overview of seasonal adjustment at Insee’s short term statistics department
› 2. Which tools are used?
› 3. Using SAS to implement additional tools
› 4. Using Demetra+ to improve quality
› 5. How to make the process safer: controlling refreshment
› 6. How to make the process safer: filing

An overview of seasonal adjustment at Insee’s short term statistics department

Which series are seasonally adjusted?
› IPI: 91 series, NACE level 3
› Turnover indexes: 440 series, NACE (3-digit) in industry and NACE (5-digit) in other sectors
› All are monthly series.

The seasonal adjustment process
› In line with the ESS Guidelines on seasonal adjustment, we implement two processes:
– Each year, a complete review of the models.
– Each month, a re-estimation of the parameters of the models defined by the annual process. Outliers are re-identified on the last period.
=> Partial concurrent adjustment

The seasonal adjustment process
› Indirect seasonal adjustment: the seasonally and calendar adjusted series of the aggregates are computed as a weighted sum of their constituent seasonally and calendar adjusted series.
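The weighted sum behind indirect adjustment can be sketched in a few lines of Python. This is only an illustration: the series names, values and weights are made up, and dividing by the total weight is an assumption (the slides only say "weighted sum").

```python
def indirect_adjustment(component_series, weights):
    """Aggregate adjusted component series as a weighted sum.

    component_series: {name: list of adjusted values}, all of equal length.
    weights: {name: weight of the component in the aggregate}.
    Assumption: the result is normalised by the total weight.
    """
    names = list(component_series)
    length = len(component_series[names[0]])
    if any(len(component_series[n]) != length for n in names):
        raise ValueError("all component series must have the same length")
    total_weight = sum(weights[n] for n in names)
    return [
        sum(weights[n] * component_series[n][t] for n in names) / total_weight
        for t in range(length)
    ]


# Illustrative data only (not taken from the presentation).
sa_components = {"A": [100.0, 102.0, 101.0], "B": [90.0, 95.0, 97.0]}
weights = {"A": 0.75, "B": 0.25}
aggregate = indirect_adjustment(sa_components, weights)
```

The aggregate is never adjusted on its own: its adjusted values depend only on the adjusted components and their weights.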

Methods and tools
› Since 2011 we have used X12-ARIMA in Demetra+.
› We developed a set of SAS macros to carry out some steps of the annual and monthly campaigns.

The annual seasonal adjustment campaign
The annual campaign is done in 8 stages:
1. Specification of the relevant set of trading-day regressors. Different sets of regressors are considered depending on the sector of activity (retail trade, wholesale trade, services, industry);
2. Automatic identification of ARIMA model and outliers (X12 Demetra+), with the selected TD-regressors. Quality reports on the new and old treatments are computed;
3. Comparison of the quality reports in order to automatically select the best ARIMA model for each series;
4. Interactive check of results and search for the best specifications for problematic series in order to improve quality;
5. New computation of the quality report, if specifications were changed interactively;
6. Graphical comparison between new and old seasonally adjusted series, and return to interactive analysis if needed;
7. Quality report of the treatment at aggregate level (new and old);
8. Final search for better specifications for series which cause a decline in the quality of aggregates.

Annual SA campaign
[Flowchart. Steps: Specification of relevant set of regressors; Automatic identification of model (X12 Demetra+); Comparison of quality reports => Automatic selection of the best models for each series; Interactive search of new specifications; Quality report; Graphical comparison new/old SA series; Quality report for aggregates (old and new treatments). Loop-back conditions: Demetra+ shows poor diagnostics; Unsatisfying quality report; Unsatisfying revisions.]

Monthly SA campaign
1. Update of the raw data file.
2. Seasonal adjustment in Demetra+, launched from SAS.
3. Interactive check in Demetra+ of seasonal adjustment results, and change of some specifications if needed.
4. Export of results.
5. Saving of the workspace, the linked data file and the specifications in an archive directory.

Which tools are used?

Demetra+
› To seasonally adjust the series (but execution is ordered from SAS, through WSACruncher).
› To check quality: detailed examination of results and of diagnostics.
› To modify specifications interactively if quality is not good enough.

Launching Demetra+ from SAS (WSACruncher)
› It makes it possible to launch seasonal adjustment from SAS, the tool in which most of the other treatments are done. It also generates output (result series and matrix view). Results can then be imported in SAS format as soon as seasonal adjustment is finished.
Example: wsacruncher d: demetra+test_1. xml -t -p complete
› => It limits changes of environment in the computing process of indices, which increases the safety of the process. Interactive Demetra+ is only opened to check quality and results.

SAS
› To implement additional tools
› To increase safety and traceability of the processes
=> More details later

Excel
› Input and output of Demetra+:
– Raw data in SAS format are converted to Excel format to be used in Demetra+.
– Output generated by Demetra+ (output series + matrix view in .csv format) is converted to SAS format.
› More attractive presentation of quality report results: SAS tables from the quality report are converted to Excel format. Colours are used to spot problematic series.

Tools used in annual process: summary
SAS
• Building and selection of trading-day regressors
• Creation of specification sets for automated treatment
• Export of raw data to an Excel file
• Launching of seasonal adjustment treatment and generation of output with cruncher
Excel: raw data file
Demetra+
• Execution of seasonal adjustment treatment
• Detailed analysis of the treatment
• Interactive change of specifications. New generation of output
Excel: output of Demetra+
SAS
• Import of results from Excel to SAS format
• Quality report
• Comparison of quality reports from different treatments
• Saving of final annual process results (workspace, data file, specifications)
Excel: results of quality report

Tools used in monthly process: summary
SAS
• Updating of raw data file
• Export of raw data to an Excel file
• Launching of seasonal adjustment refreshment and generation of output with cruncher
Excel: raw data file
Demetra+
• Refreshment of seasonal adjustment
• Detailed analysis of the treatment
• Interactive change of specifications if necessary. New generation of output
Excel: output of Demetra+
SAS
• Import of results from Excel to SAS format
• Saving of monthly process results (workspace, data file, specifications)
• Tool to compare specifications between month m and month (m+1)

Using SAS to implement additional tools

Computing and identifying sets of trading-day regressors (1)
› We developed a specific way to compute our calendar regressors, which takes into account both the French calendar and the specific features of each sector of activity under review.
› In TRAMO-SEATS and X12-ARIMA, predefined regressors consider only two cases: trading-day effect (6 variables: all days are different) or weekday effect (Monday to Friday are similar but different from weekend days).
› Those cases are not suitable for all sectors of activity, for example French retail trade (low activity on Monday, high activity on Saturday).
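The two predefined cases, and a sector-specific grouping of days, can be sketched as contrast regressors in Python. This is a minimal sketch and not the authors' SAS implementation: it uses the usual contrast form (number of days of each type minus the number of reference days), it ignores public holidays, and `grouped_regressors` with its retail grouping is a hypothetical illustration.

```python
import calendar


def weekday_counts(year, month):
    """Number of Mondays..Sundays (index 0..6) in the given month."""
    first_weekday, n_days = calendar.monthrange(year, month)
    counts = [0] * 7
    for day in range(n_days):
        counts[(first_weekday + day) % 7] += 1
    return counts


def trading_day_regressors(year, month):
    """Six contrasts: (# of each day, Monday..Saturday) - (# of Sundays)."""
    c = weekday_counts(year, month)
    return [c[d] - c[6] for d in range(6)]


def weekday_regressor(year, month):
    """One contrast: (# of Monday..Friday) - 5/2 * (# of Saturdays and Sundays)."""
    c = weekday_counts(year, month)
    return sum(c[:5]) - 2.5 * (c[5] + c[6])


def grouped_regressors(year, month, groups, reference=(6,)):
    """Hypothetical sector-specific set: one contrast per group of days.

    Each contrast is the number of days in the group minus the number of
    reference days, scaled by the ratio of the group sizes.
    Assumption: public holidays are NOT handled here.
    """
    c = weekday_counts(year, month)
    ref_total = sum(c[d] for d in reference)
    return [
        sum(c[d] for d in g) - len(g) / len(reference) * ref_total
        for g in groups
    ]


# Hypothetical retail-trade grouping: Monday / Tuesday..Friday / Saturday,
# with Sunday as the reference day.
retail_groups = [(0,), (1, 2, 3, 4), (5,)]
november_2011 = grouped_regressors(2011, 11, retail_groups)
```

Grouping days lets Monday and Saturday have their own effect while keeping fewer regressors than the full six-variable set.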

Computing and identifying sets of trading-day regressors (2)
› Automatic selection is made among several sets of regressors (more choice than the predefined regressors). Criteria for selection: a combination of a Fisher test (the covariance matrix of coefficients is given in the DOS version of X12-ARIMA) and the Akaike criterion (AICC, given in Demetra+).
› Available sets depend on the sector of activity of the series.

Sector of activity                      Number of available sets of TD-regressors
Industry                                3
Retail and wholesale trade, services    4
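The AICC part of this selection can be sketched as follows. It is only a partial illustration: the Fisher test that the slides combine with AICC is not reproduced, the candidate names are hypothetical, and the log-likelihoods are made-up numbers. The formula is the usual small-sample corrected Akaike criterion.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike criterion: -2*logL + 2*k*n / (n - k - 1)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("not enough observations for this many parameters")
    return -2.0 * log_likelihood + 2.0 * n_params * n_obs / (n_obs - n_params - 1)


def best_regressor_set(candidates, n_obs):
    """Return the name of the candidate set with the smallest AICC.

    candidates: {name: (log_likelihood, n_params)}.
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n_obs))


# Made-up values for three hypothetical sets of regressors.
candidates = {
    "no_td": (-250.0, 3),
    "weekday": (-240.0, 4),
    "trading_day": (-238.5, 9),
}
selected = best_regressor_set(candidates, n_obs=120)
```

With these illustrative numbers, the richer six-variable set fits slightly better but is penalised for its extra parameters, so the one-variable weekday set is selected.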

Checking quality (1): an additional tool
Aims of this quality report, built in 2007:
› Select criteria among many and add some complementary diagnostics
› Synthesize this information for each series:
– control of the way the quality criteria are summarized
– easier comparison of the quality of different treatments
› Synthesize information at aggregate level:
– a way of judging the quality of aggregates
– allows a top-down analysis of the quality, to target those series that are “really” problematic
Solution:
› Assign grades to evaluate different aspects of quality;
› Synthesize them for each series (weighted mean of the grades, the weight depending on the importance of the quality criterion);
› Aggregate the grades of the series according to the weight of each series in the aggregate series (economic weight).

Checking quality (2): computing grades
› Grades (numerical values) are computed by SAS programs.
› In the columns below, the grade is 100 if the null hypothesis of the test is not rejected at the 5% level, 50 if it is rejected at 5% but not rejected at the 1% level, and 0 if it is rejected at the 1% level.
› The overall quality of the ARIMA model is computed as a weighted mean of the grades related to the quality of residuals.
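The grading rule and the weighted mean can be sketched in Python (the authors compute them in SAS). The test names and weights below are hypothetical, and treating a p-value exactly equal to a threshold as "not rejected" is an assumption.

```python
def grade_from_p_value(p_value):
    """100 if not rejected at 5%, 50 if rejected at 5% but not at 1%, else 0."""
    if p_value >= 0.05:
        return 100
    if p_value >= 0.01:
        return 50
    return 0


def weighted_mean(grades, weights):
    """Weighted mean of grades; both arguments are dicts keyed by criterion."""
    total = sum(weights[k] for k in grades)
    return sum(grades[k] * weights[k] for k in grades) / total


# Hypothetical residual tests with made-up p-values and weights.
p_values = {"normality": 0.20, "independence": 0.03, "linearity": 0.004}
weights = {"normality": 1, "independence": 2, "linearity": 1}

grades = {test: grade_from_p_value(p) for test, p in p_values.items()}
arima_quality = weighted_mean(grades, weights)
```

The same `weighted_mean` helper could be reused at aggregate level, with series as keys and economic weights as weights.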

Checking quality (3): groups of diagnostics
The different groups of diagnostics are:
› ARIMA modelling quality
› Outlier identification (number of outliers, concentration)
› Quality of decomposition (residual seasonality or residual trading-day effects, orthogonality of components)
› History revisions (+ outliers stability)
Weighted means of grades are computed to evaluate the quality of the aggregates (indirect seasonal adjustment). Results are summed up in an Excel file.

Comparing quality and results
› The SAS output of the quality report is exported to an Excel file. To compare the quality of two different treatments for a given series, the difference between the synthetic grades is calculated => this makes it possible to spot series which show a decline or an improvement in quality.
› The quality report also produces a file which contains the characteristics of the treatment (ARIMA model, transformation, outliers, etc.). It is then possible to spot differences between the two treatments.
› Graphs overlaying results of different treatments are drawn in SAS. It is easy to scroll through the results.
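The comparison of synthetic grades between two treatments can be sketched as a simple difference per series. The series names, grades and the optional `tolerance` parameter are assumptions made for the illustration.

```python
def grade_changes(old_grades, new_grades, tolerance=0.0):
    """Difference of synthetic grades (new - old) for series present in both.

    Returns {series: (difference, status)} where status is one of
    "improved", "declined" or "unchanged".
    """
    changes = {}
    for series in sorted(old_grades.keys() & new_grades.keys()):
        diff = new_grades[series] - old_grades[series]
        if diff > tolerance:
            status = "improved"
        elif diff < -tolerance:
            status = "declined"
        else:
            status = "unchanged"
        changes[series] = (diff, status)
    return changes


# Made-up synthetic grades for three hypothetical series.
old = {"S1": 80.0, "S2": 65.0, "S3": 90.0}
new = {"S1": 85.0, "S2": 50.0, "S3": 90.0}
changes = grade_changes(old, new)
declining = [s for s, (_, status) in changes.items() if status == "declined"]
```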

Graphical comparison of old and new SA series (SAS)

Changing specifications
› It is easy to read and write specifications in SAS (easy link with XML files).
› We create a SAS table containing the specifications.
[Screenshot labels: data file specifications; ARIMA model parameters; series span specifications]

Changing specifications automatically
SAS allows us:
› To create mixed models (for example, old ARIMA models with new regressors).
› To fix parts of models (the selected set of regressors for each series).
› To change specifications in case of bad diagnostics (for example, to test a model with the maximum number of calendar regressors every time residual calendar effects have been detected).

Comparing two sets of specifications (1)
› Do they use the same data file?
› Are the two sets of specifications different or not?
› Do they share the same options to find ARIMA models (initial specifications)?

Comparing two sets of specifications (2)
Two types of specifications can be compared:
› It is possible to compare results: the models used to process the series, regardless of whether they result from an automated identification or were fixed by the user.
› It is also possible to compare initial options: which process leads to this model?
=> Traceability of the history of changes in specifications from month to month.
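A comparison of two sets of specifications can be sketched as a per-series, per-item diff. The dictionary layout and the item names (`arima`, `transformation`, `outliers`) are hypothetical; the actual tool works in SAS on the XML specifications.

```python
def compare_specifications(spec_a, spec_b):
    """Return {series: {item: (value_a, value_b)}} for the items that differ.

    spec_a, spec_b: {series: {item: value}}. A series or item missing on
    one side is reported with None for that side.
    """
    differences = {}
    for series in sorted(spec_a.keys() | spec_b.keys()):
        items_a = spec_a.get(series, {})
        items_b = spec_b.get(series, {})
        changed = {
            item: (items_a.get(item), items_b.get(item))
            for item in sorted(items_a.keys() | items_b.keys())
            if items_a.get(item) != items_b.get(item)
        }
        if changed:
            differences[series] = changed
    return differences


# Made-up result specifications for month m and month m+1.
month_m = {
    "S1": {"arima": "(0,1,1)(0,1,1)", "transformation": "log", "outliers": ["AO2008-11"]},
    "S2": {"arima": "(1,1,0)(0,1,1)", "transformation": "none", "outliers": []},
}
month_m1 = {
    "S1": {"arima": "(0,1,1)(0,1,1)", "transformation": "log", "outliers": ["AO2008-11", "LS2011-09"]},
    "S2": {"arima": "(1,1,0)(0,1,1)", "transformation": "none", "outliers": []},
}
diff = compare_specifications(month_m, month_m1)
```

Running the same function once on result specifications and once on initial specifications would cover the two types of comparison listed above.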

Using Demetra+ to improve quality

Interactive use of Demetra+ (1)
› To do a quick check of all series during annual and monthly campaigns: Main Results menu and Charts menu.

Interactive use of Demetra+ (2)
› For a detailed analysis of problematic series:
– Quality rated « Bad » or « Severe » by Demetra+.
– New automatically identified transformation of the series (logarithm or none) different from the old one.
– Charts looking « strange » or very different from the old ones.
– Degradation of the quality report grades.
Priority is given to series with the highest weight in the global index.

How to make the process safer: controlling refreshment

Refreshment
› Monthly process:
– A new value is added at the end of the series in the linked data file and, generally, previous raw data are revised.
– Seasonal adjustment is done in Demetra+.
=> To take changes in the raw data file into account, refreshment must be performed on the SAProcessing.

2 kinds of specifications
› An initial set of specifications (for example automatic ARIMA model identification, test for transformation…)
› A final set of specifications (what has been found for each part of the model: ARIMA model, transformation, etc.) = specifications from result

Refreshing options
There are different refreshment options (concurrent, partial concurrent, current), but in all cases:
› They read the linked raw data file to update the data contained in the processing.
› They take into account the initial specifications.

Refreshing options
› Example:
– Suppose we decide to run a fully automated process except for outlier identification, which is disabled.
– A « Partial concurrent, all outliers » refreshment is then done.
One might think outlier identification would be performed on the full span of data, since outliers are supposed to be refreshed…

Refreshing options
No new identification is done!
=> No outlier identification was requested in the initial specifications.

Refreshing options
A few remarks:
› In a refreshment process, the treatment applied to each series can differ, depending on the differences in their initial specifications.
› There is no easy way to check what is done for each series.

Refreshing options
What we want (in monthly campaigns):
› The models to be fixed (ARIMA part, outliers before a fixed date, calendar regressors, transformation).
› The outliers to be identified only from a fixed date (Demetra+ does not allow this).
=> SAS

Refreshing options
A new SAProcessing XML file is written in SAS, in which:
› Every item of the specifications is fixed. Result specifications of the annual campaign are turned into initial specifications in the monthly processing, but:
– Outliers which occur after a fixed date (e.g. the earliest date of the raw data revision period) are removed (except for some of them that we want to keep);
– Outlier identification is requested from this fixed date.
› The “Concurrent” refresh option is then selected (= “complete” in the cruncher statement).
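The transformation of annual result specifications into monthly initial specifications can be sketched on a plain dictionary. This is only an illustration of the logic: the real tool rewrites the SAProcessing XML file in SAS, and the dictionary keys used here (`outliers`, `outlier_detection_start`, and so on) are hypothetical names, not Demetra+ fields. Treating an outlier dated exactly at the fixed date as removed is also an assumption.

```python
from datetime import date


def monthly_initial_spec(result_spec, fixed_date, keep=()):
    """Build monthly initial specifications from annual result specifications.

    - Everything in result_spec is carried over unchanged (fixed model).
    - Outliers dated on or after fixed_date are dropped, unless listed in
      keep as (type, date) pairs.
    - Outlier identification is requested from fixed_date onwards.
    """
    kept_outliers = [
        o for o in result_spec["outliers"]
        if o["date"] < fixed_date or (o["type"], o["date"]) in keep
    ]
    spec = dict(result_spec)  # shallow copy: the annual spec is not modified
    spec["outliers"] = kept_outliers
    spec["outlier_detection_start"] = fixed_date
    return spec


# Made-up annual result specification for one series.
annual_result = {
    "arima": "(0,1,1)(0,1,1)",
    "transformation": "log",
    "td_regressors": "retail_set_2",
    "outliers": [
        {"type": "AO", "date": date(2008, 11, 1)},
        {"type": "LS", "date": date(2011, 6, 1)},
        {"type": "AO", "date": date(2011, 9, 1)},
    ],
}
monthly_spec = monthly_initial_spec(
    annual_result,
    fixed_date=date(2011, 5, 1),
    keep={("LS", date(2011, 6, 1))},
)
```

Because every series gets the same explicit rule, a « complete » refresh then behaves identically for all series, which is what the remarks on the previous slides ask for.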

How to make the process safer: filing

Why should the filing of a workspace be made safer? (1)
What we copy:

Why should the storing of a workspace be made safer? (2)
› Data saved in the processing:
– The specifications
– The path of the linked data file
› The path to the user variables is recorded in the Variables XML file.
› Every change in the original file might lead to a change in the filed workspace, which remains linked to the original data file and to the user-variable data file.

What must we do in a filing process to secure our results?
› To secure a workspace:
– The workspace of the monthly or annual process must be copied to an appropriate directory.
– The paths of the linked data files must be changed to point to the copies in the archive directory.
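The two filing steps can be sketched in Python. This is a sketch under strong assumptions: it supposes that the workspace is a directory of XML files in which the data file paths appear as plain text, exactly as passed to the function. The real Demetra+ workspace layout and path encoding may differ, and the function and directory names are hypothetical.

```python
import shutil
from pathlib import Path


def archive_workspace(workspace_dir, data_files, archive_dir):
    """Copy a workspace and its linked data files, then repoint the paths.

    Assumption: paths to data_files are stored as plain text in the
    workspace XML files, written exactly like str(path).
    """
    workspace_dir = Path(workspace_dir)
    archive_dir = Path(archive_dir)

    # Step 1: copy the workspace into the archive directory.
    archived_ws = archive_dir / workspace_dir.name
    shutil.copytree(workspace_dir, archived_ws)

    # Copy the linked data files next to it and remember old -> new paths.
    data_dir = archive_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    new_paths = {}
    for source in map(Path, data_files):
        target = data_dir / source.name
        shutil.copy2(source, target)
        new_paths[str(source)] = str(target)

    # Step 2: rewrite the paths in the archived copy only.
    for xml_file in archived_ws.rglob("*.xml"):
        text = xml_file.read_text(encoding="utf-8")
        for old, new in new_paths.items():
            text = text.replace(old, new)
        xml_file.write_text(text, encoding="utf-8")
    return archived_ws
```

Only the archived copy is rewritten, so the working workspace keeps pointing at the live data file while the archive becomes independent of later revisions.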

Demetra+ workspaces for seasonal adjustment at Insee’s short term statistics department (1)
[Diagram]
Year y: final workspace of annual campaign (initial specifications, result specifications) → filing process → archived workspace (year y)
Month m: workspace of monthly process
– Initial specifications: suppression of outliers from a chosen date (except some of them we want to keep); critical value for outliers detection
– Refresh « complete » with cruncher → result specifications
– Filing process → archived workspace (month m)

Demetra+ workspaces for seasonal adjustment at Insee’s short term statistics department (2)
[Diagram]
Month m: workspace of monthly process: initial specifications → refresh « complete » with cruncher → result specifications + possible interactive changes → filing process → archived workspace (month m)
Month m+1: workspace of monthly process, with updated raw data: initial specifications → refresh « complete » with cruncher → result specifications → filing process → archived workspace (month m+1)