ESCWA SDMX Workshop Session: SDMX and Data

  • Slides: 54

Session Objectives
• At the end of this session you will:
  – Know the SDMX model of a data structure definition
  – Understand the techniques to identify the structure of data
  – Identify the concepts in a simple data set
  – Be able to develop simple data structure definitions using SDMX tools

Data Set

Data Set: Structure

Data Set Structure
• Computers need to know the structure of data in terms of:
  – Concepts
  – Code Lists
  – Dimensionality
  – Additional metadata

First: Identify the Concepts
• A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)

Data Set Structure: Concepts
• Country
• Stock/Flow
• Unit Multiplier
• Unit
• Topic
• Time/Frequency

Data Set Structure: Code Lists
• Concepts: Topic, Country, Flow
• Code Lists:
  – TOPIC: A = Brady Bonds, B = Bank Loans, C = Debt Securities
  – COUNTRY: AR = Argentina, MX = Mexico, ZA = South Africa
  – STOCK/FLOW: 1 = Stock, 2 = Flow

Data Makes Sense
Q, ZA, B, 1, 1999-06-30=16457

Data Set Structure: Defining Multidimensional Structures
• Comprises
  – Dimensions: concepts that identify the observation value
  – Attributes: concepts that add additional metadata about the observation value
  – Measure: concept that is the observation value
• Any of these may have a representation that is coded, text, date/time, number, etc.

Data Set Structure: Concept Usage
• Country (Dimension)
• Stock/Flow (Dimension)
• Unit Multiplier (Attribute)
• Unit (Attribute)
• Time/Frequency (Dimension)
• Topic (Dimension)
• Observation (Measure)

Data Structure Definition
• Key: Dimensions, the concepts that identify the observation
• Group Key: concepts that identify groups of keys
• Attributes: concepts that add metadata
• Measures: concepts that are the observed phenomenon
• Dimensions, Attributes and Measures each take their semantic from a Concept (e.g. Topic, Country, Flow in CONCEPTS)
• Each has a format (Representation), which is either Non-coded or Coded
• A Coded representation has a Code List (e.g. TOPIC: A = Brady Bonds, B = Bank Loans, C = Debt Securities)

Data Makes Sense
Frequency, Country, Topic, Stock/Flow, Time = Observation
Q, ZA, B, 1, 1999-06-30 = 16457
Quarterly, South Africa, Bank Loans, Stocks, 2nd quarter 1999 = 16457
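
The decoding shown on this slide can be sketched in a few lines of Python. This is only an illustration: the code list contents come from the earlier slides, while the identifiers FREQ and STOCK_FLOW, the function name decode, and the comma/equals record layout are assumptions made for the example, not part of any SDMX format.

```python
# Code lists as shown on the slides; FREQ holds only the code used in the example.
CODE_LISTS = {
    "FREQ": {"Q": "Quarterly"},
    "COUNTRY": {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"},
    "TOPIC": {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"},
    "STOCK_FLOW": {"1": "Stock", "2": "Flow"},
}

# Order of the coded dimensions in the record (assumed from the slide).
DIMENSIONS = ["FREQ", "COUNTRY", "TOPIC", "STOCK_FLOW"]


def decode(record: str) -> dict:
    """Turn 'Q, ZA, B, 1, 1999-06-30=16457' into labelled values."""
    key_part, value = record.split("=", 1)
    parts = [p.strip() for p in key_part.split(",")]
    *codes, time = parts  # the last key element is the (non-coded) time
    if len(codes) != len(DIMENSIONS):
        raise ValueError("record does not match the expected dimensions")
    decoded = {
        dim: CODE_LISTS[dim][code]  # KeyError if the code is not in the list
        for dim, code in zip(DIMENSIONS, codes)
    }
    decoded["TIME"] = time
    decoded["OBS_VALUE"] = float(value)
    return decoded


result = decode("Q, ZA, B, 1, 1999-06-30=16457")
print(result)
```

Without the structure (the dimension order and the code lists) the record is just a string of codes; with it, each position gains a meaning.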

Identifying Concepts
• Identifying Concepts - Sources
  – Existing data set tables
    • From website
    • From applications
  – Data Collection Instruments
    • Questionnaires
    • Excel spreadsheets
  – Regulations, Handbooks, User Guides
    • Labour Statistics Convention, 1985 (No. 160), Recommendation, 1985 (No. 170)
    • Council Regulation No: 311/76/EEC of 09/02/1976; OJ: L 039 of 14/02/1976; Compilation of statistics on foreign workers
  – Database Tables
  – Existing Data Structure Definitions
    • From other organisations

Identify Concepts – from website
Measurement = 1,000 Kg
Source: FAO proof of concept project

Concepts
• Measure Type
• Frequency and Time
• Commodity
• Reference Region
• Unit and Unit Multiplier (Measurement = 1,000 Kg)
• Observation Value

Exercise: Identify Concept Role

Concept Role: Reminder
• Dimensions
  – Are the concepts that identify the observation value
• Attributes
  – Are the concepts that add additional metadata about the observation value
• Measure
  – Is the concept that is the observation value

Exercise: Concept Role
• Measure Type (Dimension)
• Frequency and Time (Dimensions)
• Commodity (Dimension)
• Reference Region (Dimension)
• Unit and Unit Multiplier (Attributes) (Measurement = 1,000 Kg)
• Observation Value (Measure)

Data Set and Structure
• Dimension Concept: FREQ, REF_AREA_REG, COMMODITY, MEASURE_TYPE, TIME
• Measure Concept: OBS_VALUE
• Attribute Concept: OBS_STATUS, OBS_CONF, UNIT_MULTIPLIER

Identify/Define Code Lists
• Purpose of a Code List
  – Constrains the value domain of concepts when used in a structure such as a data structure definition
  – Defines a shortened, language-independent representation of the values
  – Gives semantic meaning to the values, possibly in multiple languages
• Agreeing on harmonised code lists is the most difficult aspect of defining a data structure definition
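
The three purposes of a code list can be illustrated with a small Python sketch. The class CodeList, its methods, and the identifier CL_COUNTRY are hypothetical names invented for this example; the English labels are the COUNTRY codes from the earlier slide, and the French labels are added purely to illustrate the multilingual point.

```python
class CodeList:
    """A minimal, hypothetical code list: code -> {language: label}."""

    def __init__(self, list_id, codes):
        self.list_id = list_id
        self.codes = codes

    def is_valid(self, code):
        # Purpose 1: constrain the value domain of a concept.
        return code in self.codes

    def label(self, code, lang="en"):
        # Purpose 3: give the short code a meaning, in several languages.
        # Falls back to English when the requested language is missing.
        names = self.codes[code]
        return names.get(lang, names["en"])


# Purpose 2: the codes themselves (AR, MX, ZA) are short and language independent.
cl_country = CodeList(
    "CL_COUNTRY",
    {
        "AR": {"en": "Argentina", "fr": "Argentine"},
        "MX": {"en": "Mexico", "fr": "Mexique"},
        "ZA": {"en": "South Africa", "fr": "Afrique du Sud"},
    },
)

print(cl_country.is_valid("ZA"), cl_country.label("ZA", "fr"))
```

Exchanging the code ZA rather than a label means sender and receiver do not have to agree on a language, only on the code list.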

Code Lists Required
• Measure Type
• Frequency
• Commodity
• Reference Region
• Unit and Unit Multiplier (Measurement = 1,000 Kg)
Source: FAO proof of concept project

Code Lists

Code Lists (CL_)
For Time Series the SDMX Cross Domain Concepts recommend that all observations have a status code (Concept = OBS_STATUS) and a confidentiality code (Concept = OBS_CONF)

Data Structure Definition

Data Structure Definition - Reminder
• Key: Dimensions, the concepts that identify the observation
• Group Key: concepts that identify groups of keys
• Attributes: concepts that add metadata
• Measures: concepts that are the observed phenomenon
• Dimensions, Attributes and Measures each take their semantic from a Concept
• Each has a format (Representation), which is either Non-coded or Coded
• A Coded representation has a Code List

Data Structure Definition - Agriculture
Data Structure Definition: AGRICULTURE_COMMODITY
• Key (Dimensions)
  – FREQ: Coded, CL_FREQ
  – REF_AREA_REG: Coded, CL_AREA_CTY
  – COMMODITY: Coded, CL_COMMODITY
  – MEASURE_TYPE: Coded, CL_MEASURE_ELEMENT
  – TIME: Non-coded
• Measures
  – OBS_VALUE: Non-coded
• Attributes
  – OBS_STATUS: Coded, CL_OBS_STATUS
  – OBS_CONF: Coded, CL_OBS_CONF
  – UNIT_MULT: Coded, CL_UNIT_MULT
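
As a rough sketch, the AGRICULTURE_COMMODITY structure can be written down as plain Python data and used to check an observation. The pairing of each concept with its code list is read from the order of the names on the slide and is therefore an assumption; the function names and every code value in the sample observation are hypothetical, since the slide does not show the code list contents. Joining dimension values with dots follows the usual SDMX way of writing a series key.

```python
# Concept -> code list id (None means non-coded). Pairings assumed from slide order.
DSD = {
    "id": "AGRICULTURE_COMMODITY",
    "dimensions": {
        "FREQ": "CL_FREQ",
        "REF_AREA_REG": "CL_AREA_CTY",
        "COMMODITY": "CL_COMMODITY",
        "MEASURE_TYPE": "CL_MEASURE_ELEMENT",
        "TIME": None,
    },
    "measures": {"OBS_VALUE": None},
    "attributes": {
        "OBS_STATUS": "CL_OBS_STATUS",
        "OBS_CONF": "CL_OBS_CONF",
        "UNIT_MULT": "CL_UNIT_MULT",
    },
}


def check_observation(obs, dsd=DSD):
    """Return a list of structural problems (empty if none)."""
    problems = []
    required = list(dsd["dimensions"]) + list(dsd["measures"])
    for component in required:
        if component not in obs:
            problems.append(f"missing {component}")
    known = set(required) | set(dsd["attributes"])
    for component in obs:
        if component not in known:
            problems.append(f"unknown {component}")
    return problems


def series_key(obs, dsd=DSD):
    """Join the non-time dimension values in DSD order."""
    return ".".join(str(obs[d]) for d in dsd["dimensions"] if d != "TIME")


# Hypothetical observation: all code values below are placeholders.
obs = {
    "FREQ": "A", "REF_AREA_REG": "XX", "COMMODITY": "C1",
    "MEASURE_TYPE": "M1", "TIME": "2005", "OBS_VALUE": 1234.0,
    "UNIT_MULT": "3",
}
print(check_observation(obs), series_key(obs))
```

This only checks that the right components are present; checking each value against its code list would be a natural next step once the code lists are defined.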

SDMX and Data Formats Exercise: Identify Concepts © Metadata Technology

Exercise: Identify Concepts – from collection instrument Source: UNESCO Institute for Statistics

Data Entry - Table 2.1 Source: UNESCO Institute for Statistics

Data Entry - Table 2.2 Source: UNESCO Institute for Statistics

Exercise: Identify Dimension Concepts – from website Source: International Labor Organisation

Identify Concepts: Table 2A Source: International Labor Organisation

Identify Concepts: Table 2B Source: International Labor Organisation

Identify Concepts: Table 2C Source: International Labor Organisation

Identify Concepts: Table 2D Source: International Labor Organisation

Identify Concepts: Table 2E Source: International Labor Organisation

Dimension Concept

Identify Concepts: Table 2A
• Measure Type
• Reference Area
• Sex
• Time Period
• Frequency

Identify Concepts: Table 2B
• Measure Type
• Economic Activity

Identify Concepts: Table 2C
• Measure Type
• Occupation

Identify Concepts: Table 2D
• Measure Type
• Status in Employment

Identify Concepts: Table 2E
• Measure Type

Exercise: Identify Concepts – from collection instrument
• Reference Area
• Time
Source: UNESCO Institute for Statistics

Dimension Concepts - Tables 2.1/2.2
• Education Level
• Institution Type
• Measure Type
• Sex
• Work Mode
• Programme Orientation
Source: UNESCO Institute for Statistics

Labor Statistics: Data Structure Definition (Incomplete) © Metadata Technology

Education Statistics: Data Structure Definition (Incomplete)
Dimension Concept: Representation
• Frequency (FREQ): CL_FREQ
• Reference Area (REF_AREA): CL_REF_AREA
• Education level (EDUC_LEVEL): CL_EDUCATLVTYP
• Sex (SEX): CL_SEX
• Programme Orientation (PROG_ORIENTATION): CL_PROG_ORIENTATION
• Institution Type (INSTITUTION_TYPE): CL_INSTITUTION_TYPE
• Work Mode (WORK_MODE): CL_WORK_MODE
• Measure Type (MEASURE_TYPE): CL_MEASURE_TYPE
• Time (TIME): Date/Time
Measure Concept: Representation
• Observation Value (OBS_VAL): Numeric

Education Statistics: Data Structure Definition (Incomplete)
Attribute Concept: Assignment Status, Attachment, Representation
• Observation Status (OBS_STATUS): M(andatory), Observation, CL_OBS_STATUS
• Observation Confidentiality (OBS_CONF): C(onditional), Observation, CL_OBS_CONF
• Unit (UNIT): M, Series, CL_UNIT
• Unit Multiplier (UNIT_MULTIPLIER): M, Series, CL_UNIT_MULT
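
One way to picture assignment status and attachment level is a small check that reports which mandatory attributes are missing at a given level. The sketch below copies the four attributes from the slide; the dictionary layout, the function name missing_mandatory, and the sample values "KG" and "A" are hypothetical and chosen only for illustration.

```python
# Attribute definitions as listed on the slide.
# status: "M" = mandatory, "C" = conditional; attachment: where the value is reported.
ATTRIBUTES = {
    "OBS_STATUS": {"status": "M", "attachment": "Observation", "codelist": "CL_OBS_STATUS"},
    "OBS_CONF": {"status": "C", "attachment": "Observation", "codelist": "CL_OBS_CONF"},
    "UNIT": {"status": "M", "attachment": "Series", "codelist": "CL_UNIT"},
    "UNIT_MULTIPLIER": {"status": "M", "attachment": "Series", "codelist": "CL_UNIT_MULT"},
}


def missing_mandatory(level, supplied):
    """List mandatory attributes attached at `level` that are absent from `supplied`."""
    return sorted(
        name
        for name, spec in ATTRIBUTES.items()
        if spec["attachment"] == level
        and spec["status"] == "M"
        and name not in supplied
    )


# A series that reports UNIT but not UNIT_MULTIPLIER (placeholder value).
print(missing_mandatory("Series", {"UNIT": "KG"}))
# An observation with no attributes at all.
print(missing_mandatory("Observation", {}))
```

Attaching UNIT and UNIT_MULTIPLIER at series level means they are stated once per series rather than repeated on every observation, while OBS_STATUS can differ from one observation to the next.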

Identify Concepts from User Guide
