2 An overview of SDMX What is SDMX

2. An overview of SDMX (What is SDMX? Part I)
Edward Cook, Eurostat Unit B5: “Central data and metadata services”
SDMX Basics course, 1-2 March 2017

A typical production chain
• Data collection is little different from the production of other goods

What are the key features?
[Diagram labels: PRODUCER, MANUFACTURER, CONTRACT]
SPECIFICATIONS
• Type of fruits (oranges)
• Dimensions of the box
• Number of fruits per box

What are the key features?
All the details of the contract are stored in the company offices, to be checked by both parties.
[Diagram labels: MANUFACTURER, PRODUCER, SPECIFICATIONS, COMPANY OFFICE, CONTRACT]

In SDMX …
[Diagram labels: DATA PRODUCER, DATA CONSUMER, PROVISION AGREEMENT, SDMX REGISTRY, DATAFLOW, DATA STRUCTURE DEFINITION]

What is SDMX in more technical terms?

What is SDMX?
• A model to describe statistical data and metadata: statisticians agree to use common descriptions and guidelines
• A standard for automated machine-to-machine communication: driven by these common descriptors, for all to reuse
• A technology supporting standardised IT tools, developed as wide-ranging open source software

Presentation of SDMX
• The SDMX Information Model: What is the information model underlying the data and metadata exchange between the partners?
• Content-oriented guidelines: How to increase interoperability and statistical harmonisation?
• IT Architecture for Data Exchange: How to exchange the data?

The information model

The Information Model:
… is a representation of concepts, relationships, constraints, rules and operations.
… is a formal way to:
- express and design information needs
- communicate with IT people
- give specifications to reporting agents
- document the system
- drive the software

What things does SDMX need to model?
• Statistical data
  - Described through descriptor concepts. These concepts are further classified into dimensions, attributes and measures.
• Metadata
  - Structural metadata (such as concept names, etc.)
  - Reference (or explanatory) metadata
• Data exchange processes
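The three descriptor roles can be illustrated with a small sketch. This is not an SDMX library API, just one way to picture a single observation; the class name, concept identifiers and values below are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Observation:
    """One statistical observation, split into SDMX-style descriptor roles."""
    dimensions: Dict[str, str]      # together they identify the observation
    measure: Optional[float]        # the observed value itself
    attributes: Dict[str, str] = field(default_factory=dict)  # qualify the value

    def key(self) -> str:
        """Join the dimension values, in insertion order, with dots."""
        return ".".join(self.dimensions.values())


# Illustrative values only (not real data).
obs = Observation(
    dimensions={"FREQ": "A", "REF_AREA": "FR", "TIME_PERIOD": "2016"},
    measure=66.7,
    attributes={"OBS_STATUS": "P"},
)
print(obs.key())
```

Dimensions identify the observation, the measure carries the value, and attributes add qualifying information such as an observation status.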

Modelling statistical data

Modelling structural metadata
Data Structure Definition (DSD)
• Identification of dimensions, attributes and measures
• Use of common code lists
• Integration into concept schemes
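A minimal sketch of what a DSD captures, assuming a much simplified model: each component has a role and, optionally, a code list, and a record is checked against them. The class names, the `DEMO_DSD` identifier and the code lists are hypothetical; real DSDs carry far more detail.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Component:
    concept_id: str
    role: str                             # "dimension", "attribute" or "measure"
    codelist: Optional[Set[str]] = None   # None means the component is uncoded


@dataclass
class DataStructureDefinition:
    id: str
    components: List[Component]

    def validate(self, record: Dict[str, object]) -> List[str]:
        """Return a list of problems; an empty list means the record conforms."""
        errors = []
        for comp in self.components:
            value = record.get(comp.concept_id)
            if value is None:
                # In this sketch only attributes are optional.
                if comp.role != "attribute":
                    errors.append(f"missing {comp.role} {comp.concept_id}")
                continue
            if comp.codelist is not None and value not in comp.codelist:
                errors.append(f"{comp.concept_id}: code {value!r} not in code list")
        return errors


# Hypothetical structure with small illustrative code lists.
dsd = DataStructureDefinition("DEMO_DSD", [
    Component("FREQ", "dimension", {"A", "Q", "M"}),
    Component("REF_AREA", "dimension", {"FR", "DE", "IT"}),
    Component("TIME_PERIOD", "dimension"),
    Component("OBS_VALUE", "measure"),
    Component("OBS_STATUS", "attribute", {"A", "P", "E"}),
])

good = {"FREQ": "A", "REF_AREA": "FR", "TIME_PERIOD": "2016", "OBS_VALUE": 66.7}
bad = {"FREQ": "X", "REF_AREA": "FR", "TIME_PERIOD": "2016"}
print(dsd.validate(good))
print(dsd.validate(bad))
```

Because both parties validate against the same structure and the same code lists, a record that passes on the producer side should also be readable on the consumer side.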

Modelling reference metadata
• Quality descriptions
• Process descriptions
• Methodological descriptions
• Administrative descriptions
So much descriptive information. It needs to be expressed in a common, standard way.

The standard way is the Metadata Structure Definition (MSD)
A Metadata Structure Definition describes how metadata sets containing reference metadata are organised. In particular, it defines:
- which metadata concepts are being exchanged;
- how these concepts relate to each other;
- how they are represented (free text or coded values);
- with which object types (agencies, data flows, data providers, subsets of data flows, or others) they are associated.
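The four points above can be pictured with a small sketch: concepts arranged in a hierarchy, each either free text or coded, plus the object types the metadata may be attached to. All class names and identifiers here are hypothetical and the model is heavily simplified.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class MetadataAttribute:
    concept_id: str
    codelist: Optional[Set[str]] = None   # None: free text; otherwise coded values
    children: List["MetadataAttribute"] = field(default_factory=list)


@dataclass
class MetadataStructureDefinition:
    id: str
    target_types: Set[str]                # object types the metadata can attach to
    attributes: List[MetadataAttribute]

    def concept_ids(self) -> List[str]:
        """Flatten the concept hierarchy, parents before children."""
        out: List[str] = []

        def walk(attrs: List[MetadataAttribute]) -> None:
            for attr in attrs:
                out.append(attr.concept_id)
                walk(attr.children)

        walk(self.attributes)
        return out

    def can_attach_to(self, object_type: str) -> bool:
        return object_type in self.target_types


# Hypothetical MSD: identifiers chosen for illustration only.
msd = MetadataStructureDefinition(
    id="DEMO_MSD",
    target_types={"Dataflow", "DataProvider"},
    attributes=[
        MetadataAttribute("CONTACT", children=[MetadataAttribute("CONTACT_EMAIL")]),
        MetadataAttribute("ACCURACY"),
        MetadataAttribute("PUBLICATION_STATUS", codelist={"DRAFT", "FINAL"}),
    ],
)
print(msd.concept_ids())
print(msd.can_attach_to("Dataflow"))
```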

Modelling reference metadata in SDMX

THE CONTENT-ORIENTED GUIDELINES

Content-oriented guidelines
• The content-oriented guidelines are a set of recommendations within the scope of the SDMX standard, intended to maximise interoperability.
• The SDMX standards:
- provide essential support to statisticians;
- maximise the amount of information passed through to users;
- allow automation of the process;
- allow web-service queries.

There are three main areas in the content-oriented guidelines:
1. Statistical subject-matter domains.
2. Cross-domain concepts (and code lists).
3. A Metadata Common Vocabulary.

Statistical subject-matter domains
• The statistical subject-matter domains are a high-level classification of statistical areas.
• They refer to statistical activities that have common characteristics with respect to variables, concepts and methodologies for data collection.
• Examples: price statistics, national accounts, environment statistics or education statistics.
• The classification is intended to cover the universe of official statistics.

Cross-domain concepts
• The cross-domain concepts are a list of statistical concepts related to statistical processes and data quality.
• The list is based on the concepts used by the contributing international organisations.
• The concepts can be used on the data side as well as on the metadata side.

Example of cross-domain concept

• A cross-domain concept may have a code list as its representation.
• This means that the concept can take only a limited set of possible values, enumerated in its corresponding code list.
• The code lists associated with cross-domain concepts are called cross-domain code lists.

Metadata Common Vocabulary
• The Metadata Common Vocabulary (MCV) recommends a common terminology in order to facilitate communication and understanding.
• The MCV is closely linked to the cross-domain concepts: it contains all of these concepts, stating their definitions and context descriptions.

Example of Metadata Common Vocabulary

IT Architecture for data exchange

• Standard formats for the exchange of data and metadata:
- SDMX-ML
• Architectures for data exchange:
- Push
- Pull
- Data-hub
• SDMX Tools

Producer can push them to the manufacturer…
[Diagram: 1 GET SPECIFICATIONS, 2 PREPARE, 3 PUSH]

Push mode

Manufacturer can go and collect the oranges…
[Diagram, four numbered steps; recoverable labels: PREPARE, SEND NOTIFICATION, GOODS ARE READY, PULL]

Pull mode
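In pull mode the consumer fetches the data itself, typically from a web service exposed by the provider. The sketch below builds such a query, assuming the URL pattern of the SDMX 2.1 RESTful API (`/data/{flow}/{key}`, with dimension values separated by `.`, alternatives joined by `+`, and `startPeriod`/`endPeriod` parameters). The base URL, dataflow identifier and dimension names are hypothetical, and `build_data_url` is an illustrative helper, not part of any SDMX tool.

```python
from typing import Dict, List, Optional, Sequence
from urllib.parse import urlencode


def build_data_url(
    base_url: str,
    flow_ref: str,
    dimension_order: Sequence[str],
    selection: Dict[str, List[str]],
    start_period: Optional[str] = None,
    end_period: Optional[str] = None,
) -> str:
    """Build a data query URL: one key position per dimension, in DSD order."""
    parts = []
    for dim in dimension_order:
        # An empty position acts as a wildcard; '+' joins alternative codes.
        parts.append("+".join(selection.get(dim, [])))
    key = ".".join(parts)

    url = f"{base_url.rstrip('/')}/data/{flow_ref}/{key}"
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical endpoint and dataflow, for illustration only.
url = build_data_url(
    "https://example.org/sdmx/rest",
    "DEMO_FLOW",
    ["FREQ", "REF_AREA", "INDICATOR"],
    {"FREQ": ["A"], "REF_AREA": ["FR", "DE"]},
    start_period="2010",
)
print(url)
```

The order of the key positions comes from the data structure definition, which is why both sides need access to the same DSD before a pull query can be formed.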

In some cases, the final client can get the oranges directly from the producer.
[Diagram, four numbered steps; recoverable labels: REQUEST (twice), PREPARE, SEND]

Data Hub
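The three exchange patterns differ mainly in who initiates the transfer and where the data sits. The toy sketch below contrasts them under that reading; the classes, method names and rows are all hypothetical, and no networking or SDMX-ML formatting is involved.

```python
from typing import Dict, List


class Producer:
    def __init__(self, name: str, rows: List[Dict[str, object]]):
        self.name = name
        self._rows = rows
        self.published: List[Dict[str, object]] = []  # what a consumer may pull

    def prepare(self) -> List[Dict[str, object]]:
        """Tag each row with its source, standing in for real data preparation."""
        return [dict(row, source=self.name) for row in self._rows]

    def publish(self) -> str:
        """Pull mode: make the data available and return a notification."""
        self.published = self.prepare()
        return f"{self.name}: data ready"

    def query(self, ref_area: str) -> List[Dict[str, object]]:
        """Hub mode: answer a request on demand; data stays with the producer."""
        return [r for r in self.prepare() if r["REF_AREA"] == ref_area]


class Consumer:
    def __init__(self) -> None:
        self.received: List[Dict[str, object]] = []

    def pull(self, producer: Producer) -> None:
        """Pull mode: the consumer fetches what the producer has published."""
        self.received.extend(producer.published)


def push(producer: Producer, consumer: Consumer) -> None:
    """Push mode: the producer prepares the data and sends it itself."""
    consumer.received.extend(producer.prepare())


class DataHub:
    """Forwards each user request to the producers instead of storing data."""

    def __init__(self, producers: List[Producer]):
        self.producers = producers

    def request(self, ref_area: str) -> List[Dict[str, object]]:
        rows: List[Dict[str, object]] = []
        for producer in self.producers:
            rows.extend(producer.query(ref_area))
        return rows


fr = Producer("NSI_FR", [{"REF_AREA": "FR", "OBS_VALUE": 1.0}])
de = Producer("NSI_DE", [{"REF_AREA": "DE", "OBS_VALUE": 2.0}])
collector = Consumer()

push(fr, collector)               # push: producer-initiated
notification = de.publish()       # pull: producer notifies...
collector.pull(de)                # ...and the consumer collects
hub = DataHub([fr, de])
print(notification, len(collector.received), hub.request("DE"))
```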

SDMX tools
• Eurostat tools at our SDMX Info Space: http://ec.europa.eu/eurostat/web/sdmx-info-space/sdmx-it-tools
• SDMX Data Structure Wizard (used to create, edit and test SDMX artefacts).
• SDMX Converter (converts data files between SDMX formats and other file formats).
• ESS Metadata Handler
• SDMX Reference Infrastructure (SDMX-RI) (a set of tools for connecting your IT systems to the SDMX world).
• SDMX Mapping Assistant (mapping and transcoding of the contents of an existing database to SDMX data structures).