SDMX IT Tools Introduction
Jean-Francois LEBLANC, Christian SEBASTIAN
June 2015
Eurostat Unit B3 – IT and standards for data and metadata exchange
Table of contents
1. Where are we?
2. Standardization
3. Why do we need a model?
4. GSBPM Generic Statistical Business Process Model
  1. Phases
  2. Key features
  3. Other uses
5. Standards – Relations
6. GSIM Generic Statistical Information Model
7. SDMX & DDI
8. SDMX
  1. Why?
  2. Benefits
  3. Costs
  4. Opportunities
  5. Impacts
  6. From 1.0 to 2.1
  7. The SDMX components
  8. SDMX in practice
9. Summary
1. Where are we?
• Dramatic changes in the environment of official statistics producers (e.g. data deluge)
• Modernization of statistical information systems seen as a question of survival for the official statistics sector
• Standardization viewed as a key enabler for modernization
• "Standards-based" industrialization of statistical production
2. Standardization
• Why is it necessary?
  • Harmonization
  • Reusability and interoperability
  • Shared solutions across statistical institutes
• What does it imply?
  • Common processes
  • Common tools
  • Common methodologies
2. Standardization
• Industry standards
  • GSBPM - Generic Statistical Business Process Model
  • GSIM - Generic Statistical Information Model
  • SDMX - Statistical Data and Metadata eXchange
  • DDI - Data Documentation Initiative
• Other major standards
  • RDF - Resource Description Framework
  • LOD - Linked Open Data
  • JSON - JavaScript Object Notation
  • XBRL - eXtensible Business Reporting Language
3. Why do we need a model?
• To define and describe statistical processes in a coherent way
• To standardize process terminology
• To compare and benchmark processes within and between organisations
• To identify synergies between processes
• To inform decisions on systems architectures and organisation of resources
4. GSBPM Generic Statistical Business Process Model
• Applicable to all activities undertaken by producers of official statistics that result in data outputs
• Used by national and international statistical organisations
• Independent of data source; can be used for:
  • Surveys / censuses
  • Administrative sources / register-based statistics
  • Mixed sources
4.1 GSBPM - Phases
4.2 GSBPM – Key features
Not a linear model
• Sub-processes do not have to be followed in a strict order
• It is a matrix with many possible paths, including iterative loops within and between phases
• Some iterations of a regular process may skip certain sub-processes
4.3 GSBPM – Other uses
• Harmonizing statistical computing systems
• Facilitating sharing of statistical software
• Framework for process quality management
• Structure for storage of documents
• Measuring operational costs
5. Standards - Relations
[Diagram. Labels: Information concepts; Statistical concepts; Conceptual: GSIM, GSBPM; Statistics production; Practical: Technology, Methods; Statistical production how-to: SDMX, DDI, RDF, ISO-11179, …]
6. GSIM Generic Statistical Information Model
[Diagram: GSIM as the conceptual model; SDMX, DDI and other standards as implementation standards]
7. SDMX & DDI
• DDI offers a very rich model for the documentation of microdata
• SDMX offers a very integrated exchange platform for statistical outputs (IT architectures, tools, web services)
The combined use of both standards could allow a higher level of integration of the complete production process
8. SDMX Statistical Data and Metadata eXchange
[Diagram labels: SDMX, UNSD, World Bank, Eurostat]
8.1 SDMX – Why?
• The exchange of statistical data and metadata is complex, resource intensive and expensive
• In the past, national and international organisations developed their own specific approaches and solutions
• New technologies for machine-to-machine exchange (e.g. XML, web services) brought new opportunities and challenges
SDMX is the global answer to this.
8.2 SDMX - Benefits
• Efficiency
• Reduced burden after low investment
• Consistent and comparable data and metadata messages produced by different organizations
• Harmonized statistical processes, offering new ways of data and metadata exchange (such as data hubs)
• Web-based dissemination formats are provided that are computer "readable" and easier to update
8.3 SDMX - Costs
• Development/maintenance of the SDMX standards and guidelines done by the international sponsoring institutions (supported by NSIs)
• Standards are public and open source
• IT tools are created by sponsoring or other organizations and made freely available
• Capacity building by individual sponsoring institutions
• User community input by means of open process
• Low investment cost – gradual implementation
8.4 SDMX - Opportunities
• Harmonization
  • Across domains
  • Across organizations
• Simplification
  • Streamline data flows
  • Central management (SDMX Registry)
  • Software tools
  • Data sharing
• Standardization
  • Data structures
  • Concepts
  • Code lists
8.5 SDMX - Impacts
• Reduced reporting burden via common formats adopted by international organizations for data and metadata exchange
• User-friendly access when publishing national data and metadata on the web via global standards for data formats, catalogs/registries and associated services
• Improved management and analysis of data via global guidelines for metadata vocabularies and repositories in common formats
• Replicable models and tools for statistical information systems at national levels
8.6 SDMX – From 1.0 to 2.1
• September 2004: Version 1.0
• November 2005: Version 2.0
• February 2008: SDMX accepted at UN level; recognised and supported as the preferred standard
• April 2011: Version 2.1
[Timeline labels: GESMES/TS, SDMX-EDI, SDMX-ML, SDMX Registry]
8.7 The SDMX Components
Describe statistics in a standard way
• Objects and their relationships
  • Data Structure Definition (DSD), Concepts, Code List
• Central management and standard access
  • SDMX Registry, SDMX Web Services
• Cross Domain Concepts
• Cross Domain Code Lists
• Statistical Domains
• Metadata Common Vocabulary
• Push
  • Provider generates and sends file to receiver
• Pull
  • Provider opens web service to data
  • Receiver downloads regularly
• Hub
  • Special case of pull: receiver downloads on end user request
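To make the relationship between a Data Structure Definition, its concepts and its code lists more concrete, here is a minimal Python sketch. It is not the SDMX information model or any SDMX library API: the class names, the `validate_key` helper, the identifiers (`DSD_DEMO`, `CL_FREQ`, `CL_AREA`) and the code values are all assumptions chosen for illustration. The idea shown is only that a DSD lists dimensions, each dimension is a concept tied to a code list, and a series key can be checked against them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeList:
    """A named set of allowed codes (hypothetical, simplified)."""
    id: str
    codes: frozenset


@dataclass(frozen=True)
class Dimension:
    """A concept used as a dimension and restricted to one code list."""
    concept_id: str
    code_list: CodeList


@dataclass
class DataStructureDefinition:
    """Simplified DSD: an ordered list of dimensions."""
    id: str
    dimensions: list

    def validate_key(self, key: dict) -> list:
        """Return the problems found in a series key (empty list if valid)."""
        problems = []
        for dim in self.dimensions:
            if dim.concept_id not in key:
                problems.append(f"missing dimension {dim.concept_id}")
            elif key[dim.concept_id] not in dim.code_list.codes:
                problems.append(
                    f"{key[dim.concept_id]!r} not in code list {dim.code_list.id}"
                )
        # Report dimensions in the key that the DSD does not define.
        known = {dim.concept_id for dim in self.dimensions}
        for extra in sorted(set(key) - known):
            problems.append(f"unknown dimension {extra}")
        return problems


# Illustrative code lists and DSD; identifiers and codes are assumptions.
CL_FREQ = CodeList("CL_FREQ", frozenset({"A", "Q", "M"}))
CL_AREA = CodeList("CL_AREA", frozenset({"FR", "DE", "IT"}))
DSD_DEMO = DataStructureDefinition(
    "DSD_DEMO",
    [Dimension("FREQ", CL_FREQ), Dimension("REF_AREA", CL_AREA)],
)

if __name__ == "__main__":
    print(DSD_DEMO.validate_key({"FREQ": "A", "REF_AREA": "FR"}))
    print(DSD_DEMO.validate_key({"FREQ": "X"}))
```

Because the structure is shared, a receiver holding the same DSD can reject a malformed key before any data is loaded, whichever of the push, pull or hub patterns delivered the message.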
9. Summary
• To enable a modernized statistical production, standards are the key
• Standards at different levels are being used in an increasingly coherent way
• GSBPM and GSIM provide conceptual models and facilitate communication
• SDMX, DDI and other standards provide implementation models which can be used in a coordinated way
• There are now more technologies than just GESMES and XML: a coherent overall model is critical