ESSnet on microdata linking and data warehousing in statistical production ESS-net DWH

Content
§ Background ESS-net
§ Challenges
§ Explaining the statistical data warehouse (S-DWH)
§ Elements of the S-DWH
  - Business architecture
  - GSBPM mapping
§ Metadata
§ Organisational aspects

ESSnet Partnership
ESS-net coordinator:
§ Statistics Netherlands (CBS)
Co-partners:
§ Estonia, Italy, Lithuania, Portugal, Sweden, UK
Starting date:
§ 4 October 2010
§ SGA 1: first year, till 3 October 2011
§ SGA 2: last 2 years, till 3 October 2013

General Objectives ESSnet DWH
Provide assistance in: the development and implementation of a maximally efficient statistical process for business and trade statistics, independent of any specific (technical) architecture
Results in daily statistical practice:
§ increase the efficiency of data processing in statistical production systems
§ maximize the reuse of already collected data
Ø a 'data warehouse' approach to statistics

Start SGA 2 Conclusions
§ Data warehousing in statistics is ‘hot’
§ Metadata is found important ... but is also often neglected!
§ The S-DWH is very difficult to compare with a common commercial DWH
§ Visiting NSIs has proven very effective for gathering information AND for sharing knowledge and expertise
Ø Great need for knowledge & expertise

The Challenges
§ Decrease of costs & administrative burden versus increase of efficiency & flexibility
§ Rapidly changing demand for information:
  - growing need for more information on more topics
  - decreasing lifecycle of policymakers, quicker delivery
§ Disclosure of all kinds of new data sources
§ Need for integrated production systems
Ø Make optimal use of all available data sources (existing & new)

The Statistical Data Warehouse
A central data hub to connect and integrate all available data sources, supporting statistical production AND data collection processes by providing:
§ a detailed and correct overview/insight of all available data sources
§ a framework for adequate data governance, including metadata management, confidentiality aspects and data authorisation
§ flexible data storage and data exchange between processes
§ access to registers and sampling frames (BR, etc.)
A central ‘statistical data store’ for managing all available data of interest, regardless of its source, enabling the NSI to produce necessary information (= statistics!) and to (re)use available data to create new data / new outputs.
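
The idea of (re)using data from several sources through a register can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the ESSnet method: the function `link_on_backbone`, the field name `unit_id` and the rule that the first source supplying a variable wins are all hypothetical choices made for this example.

```python
def link_on_backbone(backbone_ids, *sources):
    """Combine records from several sources per register unit.

    Hypothetical sketch: `backbone_ids` stands in for the units of a
    register such as the BR, and every record is a dict with a "unit_id" key.
    """
    linked = {unit_id: {} for unit_id in backbone_ids}
    for source in sources:
        for record in source:
            unit_id = record["unit_id"]
            if unit_id not in linked:
                continue  # unit is not in the register, so it is skipped
            for key, value in record.items():
                if key != "unit_id":
                    # Assumed rule: the first source that supplies a variable wins.
                    linked[unit_id].setdefault(key, value)
    return linked


# Made-up example records; the variable names are illustrative only.
survey = [{"unit_id": "A1", "turnover": 120}]
admin = [
    {"unit_id": "A1", "employees": 8, "turnover": 118},
    {"unit_id": "B2", "employees": 3},
    {"unit_id": "Z9", "employees": 1},
]
linked = link_on_backbone(["A1", "B2"], survey, admin)
```

Passing the survey before the admin source means survey values take priority where both supply the same variable; a real system would need an explicit, documented priority rule.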

[Figure: S-DWH data flow diagram. Labels: Input data (Dataset, Selected sample, Admin data source); Input reference frame; Staging area; Working data; Storage, combination (Microdata, Aggregate Statistics, Backbones (e.g. BR), BB snapshots); Rules for generating samples etc.; Rules for updating BB; Data extracts; Outputs]

Explaining the S-DWH
A system or set of integrated systems, designed to handle the processing of statistical data in the production of statistics, comprising:
§ technical facilities for storing and processing data, receiving data and producing outputs in a flexible way
§ rules for updating the sources for the DWH
§ definitions necessary to achieve those samples / sources
Ø The S-DWH is a concept that provides an architectural model of the statistical data flow, from data collection to statistical output

The S-DWH Business Architecture
§ Conceptualisation of how to build up an S-DWH
§ A common model for the total statistical process and data flow
§ Provide optimal organisation of all structured data, enabling re-use, creation of new data etc.
§ 4 Layers, covering all statistical activities
  ‒ Sources
  ‒ Integration
  ‒ Interpretation & Analysis
  ‒ Data Access / Output
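
As a rough illustration of the four layers, the Python sketch below models them as an ordered enumeration. It assumes a strictly linear flow from sources to output, which is a simplification: the slides stress re-use, so real data may well flow back to earlier layers. The names `Layer`, `next_layer` and `flow_from` are hypothetical.

```python
from enum import IntEnum
from typing import List, Optional


class Layer(IntEnum):
    """The four S-DWH layers, ordered from data collection to output."""
    SOURCES = 1
    INTEGRATION = 2
    INTERPRETATION_AND_ANALYSIS = 3
    DATA_ACCESS = 4


def next_layer(layer: Layer) -> Optional[Layer]:
    """Return the layer that follows `layer`, or None after the last one."""
    if layer is Layer.DATA_ACCESS:
        return None
    return Layer(layer + 1)


def flow_from(layer: Layer) -> List[Layer]:
    """List the layers data passes through, starting at `layer` (assumed linear)."""
    path = [layer]
    following = next_layer(layer)
    while following is not None:
        path.append(following)
        following = next_layer(following)
    return path
```

For example, `flow_from(Layer.INTEGRATION)` lists the Integration, Interpretation & Analysis and Data Access layers in that order.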

The layered architecture of the S-DWH, with focus on the data sources used in each layer
[Figure: layered architecture diagram; one element is marked "Specific for S-DWH"]

Mapping the S-DWH on the GSBPM
Use the GSBPM as a common language to identify and locate the various phases on the 4 S-DWH layers

Managing the S-DWH
The S-DWH is a logically coherent central data store, not necessarily one single physical unit.
Metadata is vital in the governance, satisfying 2 essential needs:
§ to guide statisticians in processing and controlling the statistical data
§ to inform users by giving insight into the exact meaning of the statistical data
The vertical metadata layer makes it possible to search all (meta)data in the 4 layers and, if permitted, gives access to the data.
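
The "search everything, access only if permitted" behaviour of the metadata layer can be sketched in Python as follows. This is a hypothetical illustration under simple assumptions (keyword search on a free-text description, role-based permission); `MetadataEntry`, `MetadataCatalogue` and all field names are invented for the example and do not come from the ESSnet deliverables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    """Hypothetical metadata record describing one dataset in one layer."""
    dataset_id: str
    layer: str
    description: str
    allowed_roles: frozenset = frozenset()


class MetadataCatalogue:
    """Searchable index over the metadata of all layers (illustrative only)."""

    def __init__(self):
        self._entries = {}
        self._data = {}

    def register(self, entry, data):
        """Store a dataset together with the metadata that describes it."""
        self._entries[entry.dataset_id] = entry
        self._data[entry.dataset_id] = data

    def search(self, keyword):
        """Metadata is searchable by everyone, across all layers."""
        needle = keyword.lower()
        return [e for e in self._entries.values()
                if needle in e.description.lower()]

    def fetch(self, dataset_id, role):
        """Return the data itself only if the role is authorised."""
        entry = self._entries[dataset_id]
        if role not in entry.allowed_roles:
            raise PermissionError(f"role {role!r} may not access {dataset_id!r}")
        return self._data[dataset_id]
```

Separating `search` from `fetch` mirrors the two needs named on the slide: anyone can find out what a dataset means, while access to the data itself stays under governance.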

Metadata layer
[Figure: the Metadata Layer shown together with the four layers: Data Access Layer, Interpretation and Data Analysis Layer, Integration Layer, Source Layer]

Metadata - the DNA of the S-DWH
Framework:
§ General metadata definitions
§ Metadata for the S-DWH
§ Use of metadata models
§ Metadata standards & norms
§ Metadata quality & governance
§ Categories & subsets
§ Minimum requirements

S-DWH metadata requirements
[Figure labels: Subsets; Standards & Norms; ISO 11179; Internal rules; Guidelines; Metadata model; S-DWH Gatekeeper]

Centre of knowledge & expertise
Defining and implementing the business model:
§ Organisational aspects
  - Experts from partners and other ESS members
  - Research on current topics
  - Seminar / workshop
§ Financial aspects covered
§ Roll out for more fields of expertise

Organisational aspects
Implementation of an S-DWH has a huge organisational impact:
§ It means: moving from single operations to integrated, generic processes
§ It needs: a redesign of the statistical process
§ It asks: new IT systems, tools, high investments
§ It is: a new way of working
Ø Only changing systems will not do the trick, changing people is the key to success

ESSnet on data warehousing
Thank you!
