4. SDMX: Main objects for data exchange
Raynald Palmieri
Eurostat Unit B5: “Central data and metadata services”
SDMX Basics course, March 2016

The SDMX Components
• Describe statistics in a standard way
  – Objects and their relationships: Data Structure Definition (DSD), Concepts, Code List
• Central management and standard access
  – SDMX Registry, SDMX Web Services
• Cross Domain Concepts, Cross Domain Code Lists, Statistical Domains, Metadata Common Vocabulary
• Push: provider generates and sends file to receiver
• Pull: provider opens web service to data; receiver downloads regularly
• Hub: special case of pull; receiver downloads on end user request

Describing the data exchange
[Diagram: Who? What? When? How? Where?]

Dataflows - classification
• Category: Tourism
• Sub categories
• Statistical Tables = data flows

SDMX Implementation steps
• Dataflows
• Concepts & Code lists
• DSD sharing
• SDMX Data Structure Definition

Dataflows - classification
• Categories: Tourism, Capacity, Occupancy
• Dataflows: Night_Spent, Arrival_of_residents, Occupancy_rate

Concepts & Codelists: Tourism Example
• What do we want to exchange?
  – Statistical tables

Preparation phase
SDMX Implementation steps
• Dataflows
• Concepts & Code lists
• DSD sharing
• SDMX Data Structure Definition

Model of the statistical table
[Table: Number; Tourism establishments; Italy; Annual data; 2529]

Model of the statistical table: What do we need to do first?
• Identify the Concepts
  – A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)
• Sources
  – Existing data set tables (from website, from applications)
  – Data Collection Instruments (questionnaires/Excel spreadsheets)
  – Handbooks, User Guides
  – Database Tables
  – Existing Data Structure Definitions (from other organisations)
  – Legislation/Regulation

Identifying the concepts
FREQUENCY, COUNTRY, TOURISM_ACTIVITY, TOURISM_INDICATOR, UNIT, TIME, OBS_VALUE, OBS_STATUS (flags E and P shown in the table)

Concept Scheme

Identify/Define Code Lists
• Purpose of a Code List
  – Constrains the value domain of concepts when used in a structure like a data structure definition
  – Defines a shortened, language-independent representation of the values
  – Gives semantic meaning to the values, possibly in multiple languages
• Agreeing on harmonised code lists is an important aspect of defining a data structure definition

Concepts & Codelists: Tourism Example
SDMX Code List
• A code list is a maintainable SDMX container.
• Each code list is identified uniquely by an ID, a maintenance agency, and a version.
• The name can be provided in several languages.
• Partial code lists can also be exchanged (v2.1). The content of the partial code list is specified in a Constraint.
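The properties listed above can be sketched in a few lines of Python. This is only an illustration, not the SDMX Information Model itself: the class names, the `ESTAT` agency, the version `1.0` and the French labels are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Code:
    """One item of a code list: a short ID plus names keyed by language."""
    id: str
    names: dict  # e.g. {"en": "Annual", "fr": "Annuel"}


@dataclass
class CodeList:
    """A maintainable container, identified by agency, ID and version."""
    agency: str
    id: str
    version: str
    codes: dict = field(default_factory=dict)

    def add(self, code_id, **names):
        self.codes[code_id] = Code(code_id, dict(names))

    def key(self):
        # Identity triple of the container (not an official URN format).
        return (self.agency, self.id, self.version)

    def is_valid(self, value):
        # The code list constrains the value domain of a concept.
        return value in self.codes

    def label(self, code_id, lang="en"):
        return self.codes[code_id].names[lang]


# Hypothetical frequency code list; agency, version and labels are assumed.
cl_freq = CodeList(agency="ESTAT", id="CL_FREQ", version="1.0")
cl_freq.add("A", en="Annual", fr="Annuel")
cl_freq.add("M", en="Monthly", fr="Mensuel")
cl_freq.add("Q", en="Quarterly", fr="Trimestriel")
cl_freq.add("W", en="Weekly", fr="Hebdomadaire")

print(cl_freq.key())
print(cl_freq.is_valid("A"), cl_freq.is_valid("X"))
print(cl_freq.label("A", "fr"))
```

The short code (`A`) is what travels in the data; the language-specific names stay in the code list and are looked up only for display.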

Exercise: Deriving a concept scheme from a table

Proposed solution: Deriving a concept scheme from a table

Data Set Structure
• Computers need to know the structure of data in terms of:
  – Concepts: Dimensionality, Additional metadata, Measures (Observation)
  – Valid content: Code Lists, Non coded format (integer, date, text)

Concepts play roles in a Data Structure
• Comprises
  – Dimensions: Concepts that identify the observation value
  – Attributes: Concepts that add metadata about the observation value (as a value or the context of the value)
  – Measure: Concept that is the observation value
• Representation: any of these may be
  – coded
  – text
  – date/time
  – number
  – etc.

DERIVING A DATA STRUCTURE FROM A TABLE
• Concepts: FREQUENCY, COUNTRY, TOURISM_ACTIVITY, TOURISM_INDICATOR, UNIT, TIME, OBS_VALUE, OBS_STATUS (flags E and P)
• Roles: DIMENSIONS, ATTRIBUTES, MEASURES
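As a rough sketch of what assigning roles means in practice, the Python below splits one table row into a key (dimensions), a value (measure) and attributes. The role assignment is an assumption made for illustration (the slide image carrying the actual mapping is not available here), and the year `2014` and unit code `NBR` are invented sample values.

```python
from dataclasses import dataclass


@dataclass
class DataStructure:
    dimensions: tuple   # concepts that identify the observation value
    attributes: tuple   # concepts that add metadata about the value
    measure: str        # the concept holding the observation value

    def split(self, row):
        """Split a flat table row into key, value and attributes."""
        missing = [d for d in self.dimensions if d not in row]
        if missing:
            raise ValueError(f"missing dimensions: {missing}")
        key = tuple(row[d] for d in self.dimensions)
        attrs = {a: row[a] for a in self.attributes if a in row}
        return key, row[self.measure], attrs


# Assumed role assignment for the tourism example.
tourism_dsd = DataStructure(
    dimensions=("FREQUENCY", "COUNTRY", "TOURISM_ACTIVITY",
                "TOURISM_INDICATOR", "TIME"),
    attributes=("UNIT", "OBS_STATUS"),
    measure="OBS_VALUE",
)

# Sample row; TIME and UNIT values are made up for the example.
row = {
    "FREQUENCY": "A", "COUNTRY": "IT", "TOURISM_ACTIVITY": "B100",
    "TOURISM_INDICATOR": "A003", "TIME": "2014",
    "OBS_VALUE": 2529, "UNIT": "NBR", "OBS_STATUS": "P",
}
key, value, attrs = tourism_dsd.split(row)
print(key)
print(value)
print(attrs)
```

A row that lacks one of the dimensions cannot be identified, which is why `split` rejects it, while attributes are treated as optional here.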

DATA STRUCTURE DEFINITION

DATA STRUCTURE DEFINITION - Summary
• Reference DSD
• Reference Concept Scheme
• Reference Code lists

SDMX Implementation steps
• Dataflows
• Concepts & Code lists
• DSD sharing
• SDMX Data Structure Definition

DSD Sharing: Tourism Example

How to achieve DSD sharing? Use of Constraints
The Constraint can define one or both of:
• the Codes in a Code List that are applicable
  Ex: (A, M, W, Q) → (A)
• the list of series keys that are applicable

  FREQ | COUNTRY | TOURISM_INDICATOR | TOURISM_ACTIVITY
  A    | IT      | A003              | B100

A Constraint can be used to restrict a shared DSD to the subset of its content that is meaningful for a given exchange.
Constraints are usually linked to the dataflows or the provision agreements.
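The two kinds of constraint can be sketched as simple filters. This is a minimal illustration in Python, not an SDMX implementation; the function names are hypothetical, and the codes reuse the example values from the slide.

```python
def allowed_codes(code_list, constraint):
    """Return the codes of a code list that the constraint keeps."""
    return [c for c in code_list if c in constraint]


def series_allowed(series_key, allowed_keys):
    """True if the full series key is listed in the constraint."""
    return tuple(series_key) in allowed_keys


# Shared DSD: FREQ may take any of these codes.
cl_freq = ["A", "M", "W", "Q"]

# Constraint attached to one (hypothetical) dataflow: annual data only.
freq_constraint = {"A"}

# Constraint expressed as a list of applicable series keys, in the order
# FREQ, COUNTRY, TOURISM_INDICATOR, TOURISM_ACTIVITY.
key_constraint = {("A", "IT", "A003", "B100")}

print(allowed_codes(cl_freq, freq_constraint))
print(series_allowed(["A", "IT", "A003", "B100"], key_constraint))
print(series_allowed(["M", "IT", "A003", "B100"], key_constraint))
```

The DSD and its code lists stay unchanged and shared; only the constraint differs from one dataflow or provision agreement to the next.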

SDMX Implementation steps
• Dataflows
• Concepts & Code lists
• DSD sharing
• SDMX Data Structure Definition

DATA STRUCTURE DEFINITION - Design
Data Structure Wizard
• Java desktop application
• Graphical interface for DSD designers
• Maintenance of SDMX v2.0/2.1 data and metadata structures
• Web service to query/submit SDMX registries

SDMX Registry: Designing & Publishing DSDs
• Graphical User Interface
• Web service

Exercise: Consult a DSD
• URL Registry (test purpose): https://webgate.test.ec.europa.eu/sdmxregistry/
• DSD: WASTE_GENER

Exercise: Browse the different objects of the DSD
• Codelists:
  – CL_FREQ
  – CL_GEO_EUCCEFTA
  – CL_WASTE
  – CL_HAZARD
  – CL_NACE_R2_WASTE
• Concept Scheme:
  – CS_WASTE
• DSD:
  – WASTE_GENER
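The same objects could also be requested through a registry web service. The sketch below only builds query URLs following the resource/agency/id/version pattern of the SDMX 2.1 RESTful API; it does not contact any server. The base URL and the `ESTAT` agency ID are placeholders, and whether the test registry exposes such an endpoint is an assumption.

```python
from urllib.parse import urlencode


def structure_url(base, resource, agency, artefact_id,
                  version="latest", **params):
    """Build a structure query path: resource/agency/id/version."""
    url = "/".join([base.rstrip("/"), resource, agency, artefact_id, version])
    if params:
        url += "?" + urlencode(params)
    return url


# Both values are placeholders, not the real registry endpoint or agency.
BASE = "https://registry.example.org/rest"
AGENCY = "ESTAT"

# references=children asks for the artefacts the DSD refers to as well.
dsd = structure_url(BASE, "datastructure", AGENCY, "WASTE_GENER",
                    references="children")
codelists = [structure_url(BASE, "codelist", AGENCY, cl)
             for cl in ("CL_FREQ", "CL_WASTE", "CL_HAZARD")]
concepts = structure_url(BASE, "conceptscheme", AGENCY, "CS_WASTE")

print(dsd)
for url in codelists:
    print(url)
print(concepts)
```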

ADDITIONAL INFORMATION

SDMX Dataset
• DSD: defines the structure
• Dataset = XML file describing the table content according to the DSD

Syntaxes for SDMX datasets
• Based on a common Information Model
• SDMX-EDI (GESMES/TS)
  – EDIFACT syntax
  – Time-series oriented
  – One format for Data Sets
• SDMX-ML 2.0 & 2.1
  – XML syntax
  – Different formats for Data Sets
  – Easier validation (XML based)

SDMX-ML 2.0 formats
• Equivalent formats, based on the same IM
  – Compact SDMX-ML
  – Cross-sectional SDMX-ML
  – Generic SDMX-ML
• Conversions: can be expanded to other formats (e.g. CSV, GESMES, SDMX 2.1)

SDMX data common header

SDMX 2.0 vs 2.1

SDMX-ML formats
Equivalent representations for reporting Datasets
• Version 2.0: 4 data messages, each with a distinct format
  – GenericData
  – CompactData
  – CrossSectionalData
  – UtilityData (phased out in 2.1)
• Version 2.1: there are now 4 data messages, based on two general formats
  – GenericData, GenericTimeSeriesData
  – StructureSpecificData, StructureSpecificTimeSeriesData

Constraint
• Version 2.0
  – Constraint is only available for use in a Registry context
  – Constraint is embedded in the object it constrains
• Version 2.1
  – Constraint is independently maintained
  – The same Constraint can be “used” to constrain multiple objects (Dataflow, Provision agreement, DSD)

Thank you for your attention!
Questions