ESCWA SDMX Workshop Session: SDMX and Data
- Slides: 54
Session Objectives
• At the end of this session you will:
– Know the SDMX model of a data structure definition
– Understand the techniques to identify the structure of data
– Identify the concepts in a simple data set
– Be able to develop simple data structure definitions using SDMX tools
Data Set
Data Set: Structure
Data Set Structure
• Computers need to know the structure of data in terms of:
– Concepts
– Code Lists
– Dimensionality
– Additional metadata
First: Identify the Concepts • A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)
Data Set Structure: Concepts
• Country
• Stock/Flow
• Unit Multiplier
• Unit
• Topic
• Time/Frequency
Data Set Structure: Code Lists
• Concepts: Topic, Country, Flow
• Code list TOPIC: A = Brady Bonds; B = Bank Loans; C = Debt Securities
• Code list COUNTRY: AR = Argentina; MX = Mexico; ZA = South Africa
• Code list STOCK/FLOW: 1 = Stock; 2 = Flow
Data Makes Sense
Q, ZA, B, 1, 1999-06-30 = 16457
Data Set Structure: Defining Multidimensional Structures
• Comprises
– Dimensions: concepts that identify the observation value
– Attributes: concepts that add additional metadata about the observation value
– Measure: concept that is the observation value
• Any of these may have one of the following representations:
– coded
– text
– date/time
– number
– etc.
Data Set Structure: Concept Usage
• Country (Dimension)
• Stock/Flow (Dimension)
• Topic (Dimension)
• Time/Frequency (Dimension)
• Unit (Attribute)
• Unit Multiplier (Attribute)
• Observation (Measure)
Data Structure Definition
• Key (Dimensions): concepts that identify the observation
• Group Key: concepts that identify groups of keys
• Attributes: concepts that add metadata
• Measures: concepts that are the observed phenomenon
• Each Dimension, Attribute, and Measure takes its semantic from a Concept (CONCEPTS: Topic, Country, Flow) and has its format from a Representation
• A Representation is either Noncoded (has a format) or Coded (has a code list), for example the code list TOPIC: A = Brady Bonds; B = Bank Loans; C = Debt Securities
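The relationships on this slide can be sketched in code. The following is a minimal illustration only, not the SDMX Information Model itself: the class names, field names, and the identifier `EXAMPLE_DSD` are assumptions made for this sketch, while the concepts and the TOPIC code list come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodeList:
    """A code list: maps each code to its label."""
    id: str
    codes: Dict[str, str]


@dataclass
class Component:
    """A dimension, attribute, or measure that takes its semantic from a concept."""
    concept: str
    code_list: Optional[CodeList] = None  # None stands for a non-coded representation

    @property
    def is_coded(self) -> bool:
        return self.code_list is not None


@dataclass
class DataStructureDefinition:
    id: str
    dimensions: List[Component] = field(default_factory=list)
    attributes: List[Component] = field(default_factory=list)
    measures: List[Component] = field(default_factory=list)


# Code list taken from the slide.
cl_topic = CodeList("TOPIC", {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"})

# "EXAMPLE_DSD" is a hypothetical identifier used only for this sketch.
dsd = DataStructureDefinition(
    id="EXAMPLE_DSD",
    dimensions=[Component("Topic", cl_topic), Component("Time/Frequency")],
    attributes=[Component("Unit"), Component("Unit Multiplier")],
    measures=[Component("Observation")],
)

# List the dimensions that use a coded representation.
coded_dimensions = [c.concept for c in dsd.dimensions if c.is_coded]
print(coded_dimensions)
```

Keeping the code list optional on each component mirrors the Coded / Noncoded split in the diagram: a component either points at a code list or carries only a format.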
Data Makes Sense
Frequency, Country, Topic, Stock/Flow, Time = Observation
Q, ZA, B, 1, 1999-06-30 = 16457
Quarterly, South Africa, Bank Loans, Stocks, 2nd quarter 1999 = 16457
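A small sketch of how the coded record becomes readable once the structure is known. The dimension order and code lists are taken from the slides; the frequency list contains only the one code shown (Q), and the function name `decode` and the record syntax handling are assumptions made for illustration.

```python
# Code lists from the slides (only the codes shown there are included).
CODE_LISTS = {
    "Frequency": {"Q": "Quarterly"},
    "Country": {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"},
    "Topic": {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"},
    "Stock/Flow": {"1": "Stock", "2": "Flow"},
}

# Order of the coded dimensions in the record; Time comes last and is not coded.
DIMENSION_ORDER = ["Frequency", "Country", "Topic", "Stock/Flow"]


def decode(record: str) -> dict:
    """Decode 'codes..., time=value' into labels using the code lists."""
    key_part, value = record.split("=")
    parts = [p.strip() for p in key_part.split(",")]
    *codes, time = parts
    decoded = {
        name: CODE_LISTS[name][code]
        for name, code in zip(DIMENSION_ORDER, codes)
    }
    decoded["Time"] = time
    decoded["Observation"] = float(value)
    return decoded


result = decode("Q, ZA, B, 1, 1999-06-30=16457")
for concept, meaning in result.items():
    print(f"{concept}: {meaning}")
```

Without the dimension order and the code lists, the same string is just a sequence of opaque tokens, which is the point the slide makes.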
Identifying Concepts
• Identifying Concepts - Sources
– Existing data set tables
  • From website
  • From applications
– Data Collection Instruments
  • Questionnaires
  • Excel spreadsheets
– Regulations, Handbooks, User Guides
  • Labour Statistics Convention, 1985 (No. 160), Recommendation, 1985 (No. 170)
  • Council Regulation No: 311/76/EEC of 09/02/1976; OJ: L 039 of 14/02/1976; Compilation of statistics on foreign workers
– Database Tables
– Existing Data Structure Definitions
  • From other organisations
Identify Concepts – from website
Measurement = 1,000 kg
Source: FAO proof of concept project
Concepts
• Measure Type
• Frequency and Time
• Commodity
• Reference Region
• Unit and Unit Multiplier (Measurement = 1,000 kg)
• Observation Value
Exercise: Identify Concept Role
Concept Role: Reminder
• Dimensions
– Are the concepts that identify the observation value
• Attributes
– Are the concepts that add additional metadata about the observation value
• Measure
– Is the concept that is the observation value
Exercise: Concept Role
• Measure Type (Dimension)
• Frequency and Time (Dimensions)
• Commodity (Dimension)
• Reference Region (Dimension)
• Unit and Unit Multiplier, Measurement = 1,000 kg (Attributes)
• Observation Value (Measure)
Data Set and Structure
• Dimension Concepts: FREQ, REF_AREA_REG, COMMODITY, MEASURE_TYPE, TIME
• Measure Concept: OBS_VALUE
• Attribute Concepts: OBS_STATUS, OBS_CONF, UNIT_MULTIPLIER
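As an illustration of what these roles mean in practice, the sketch below splits one flat record into its key (dimensions), its observation value (measure), and its attributes. The concept identifiers are those on the slide; every value in the sample record is made up for the example, and the function name `split_by_role` is hypothetical.

```python
DIMENSIONS = ["FREQ", "REF_AREA_REG", "COMMODITY", "MEASURE_TYPE", "TIME"]
MEASURE = "OBS_VALUE"
ATTRIBUTES = ["OBS_STATUS", "OBS_CONF", "UNIT_MULTIPLIER"]


def split_by_role(record: dict):
    """Return (key, value, attributes) for one flat record."""
    # The key is the ordered tuple of dimension values; it identifies the observation.
    key = tuple(record[d] for d in DIMENSIONS)
    # The measure is the observed value itself.
    value = record[MEASURE]
    # Attributes add metadata about the observation; absent ones are skipped here.
    attributes = {a: record[a] for a in ATTRIBUTES if a in record}
    return key, value, attributes


# All values below are invented for illustration; they are not from the FAO data.
sample = {
    "FREQ": "A",
    "REF_AREA_REG": "XX",
    "COMMODITY": "WHEAT",
    "MEASURE_TYPE": "PROD",
    "TIME": "2005",
    "OBS_VALUE": 123.0,
    "OBS_STATUS": "A",
    "OBS_CONF": "F",
    "UNIT_MULTIPLIER": "3",
}

key, value, attributes = split_by_role(sample)
print(key, value, attributes)
```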
Identify/Define Code Lists
• Purpose of a Code List
– Constrains the value domain of concepts when used in a structure like a data structure definition
– Defines a shortened, language-independent representation of the values
– Gives semantic meaning to the values, possibly in multiple languages
• Agreeing on harmonised code lists is the most difficult aspect of defining a data structure definition
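The three purposes listed above can be seen in a few lines of code. This is a sketch only: the country codes and English labels come from the earlier COUNTRY code list slide, the French labels are added here purely to illustrate multilingual labels, and the function names are hypothetical.

```python
# Code -> labels per language. French labels are added for illustration only.
CL_COUNTRY = {
    "AR": {"en": "Argentina", "fr": "Argentine"},
    "MX": {"en": "Mexico", "fr": "Mexique"},
    "ZA": {"en": "South Africa", "fr": "Afrique du Sud"},
}


def is_valid(code: str) -> bool:
    """The code list constrains the value domain: only listed codes are allowed."""
    return code in CL_COUNTRY


def label(code: str, lang: str = "en") -> str:
    """The same short code carries its meaning in several languages."""
    return CL_COUNTRY[code][lang]


print(is_valid("ZA"), is_valid("South Africa"))
print(label("ZA"), "/", label("ZA", "fr"))
```

The data itself only ever carries the short code (ZA); the label is looked up when the data is displayed.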
Code Lists Required
• Measure Type
• Frequency
• Commodity
• Reference Region
• Unit and Unit Multiplier (Measurement = 1,000 kg)
Source: FAO proof of concept project
Code Lists
Code Lists (CL_)
• For time series, the SDMX Cross Domain Concepts recommend that all observations have a status code (Concept = OBS_STATUS) and a confidentiality code (Concept = OBS_CONF)
Data Structure Definition
Data Structure Definition - Reminder
• Key (Dimensions): concepts that identify the observation
• Group Key: concepts that identify groups of keys
• Attributes: concepts that add metadata
• Measures: concepts that are the observed phenomenon
• Each Dimension, Attribute, and Measure takes its semantic from a Concept and has its format from a Representation
• A Representation is either Noncoded (has a format) or Coded (has a code list)
Data Structure Definition - Agriculture
Data Structure Definition: AGRICULTURE_COMMODITY
• Key (Dimensions)
– FREQ: coded, CL_FREQ
– REF_AREA_REG: coded, CL_AREA_CTY
– COMMODITY: coded, CL_COMMODITY
– MEASURE_TYPE: coded, CL_MEASURE_ELEMENT
– TIME: noncoded
• Measures
– OBS_VALUE: noncoded
• Attributes
– OBS_STATUS: coded, CL_OBS_STATUS
– OBS_CONF: coded, CL_OBS_CONF
– UNIT_MULT: coded, CL_UNIT_MULT
SDMX and Data Formats
Exercise: Identify Concepts
© Metadata Technology
Exercise: Identify Concepts – from collection instrument Source: UNESCO Institute for Statistics
Data Entry - Table 2.1 Source: UNESCO Institute for Statistics
Data Entry - Table 2.2 Source: UNESCO Institute for Statistics
Exercise: Identify Dimension Concepts – from website Source: International Labour Organization
Identify Concepts: Table 2A Source: International Labour Organization
Identify Concepts: Table 2B Source: International Labour Organization
Identify Concepts: Table 2C Source: International Labour Organization
Identify Concepts: Table 2D Source: International Labour Organization
Identify Concepts: Table 2E Source: International Labour Organization
Dimension Concept
Identify Concepts: Table 2A: Measure Type, Reference Area, Sex, Time Period, Frequency
Identify Concepts: Table 2B: Measure Type, Economic Activity
Identify Concepts: Table 2C: Measure Type, Occupation
Identify Concepts: Table 2D: Measure Type, Status in Employment
Identify Concepts: Table 2E: Measure Type
Exercise: Identify Concepts – from collection instrument
• Reference Area
• Time
Source: UNESCO Institute for Statistics
Dimension Concepts - Tables 2.1/2.2
• Education Level
• Institution Type
• Measure Type
• Sex
• Work Mode
• Programme Orientation
Source: UNESCO Institute for Statistics
Labor Statistics: Data Structure Definition (Incomplete) © Metadata Technology
Education Statistics: Data Structure Definition (Incomplete)
Dimension Concept | Representation
Frequency (FREQ) | CL_FREQ
Reference Area (REF_AREA) | CL_REF_AREA
Education level (EDUC_LEVEL) | CL_EDUCATLVTYP
Sex (SEX) | CL_SEX
Programme Orientation (PROG_ORIENTATION) | CL_PROG_ORIENTATION
Institution Type (INSTITUTION_TYPE) | CL_INSTITUTION_TYPE
Work Mode (WORK_MODE) | CL_WORK_MODE
Measure Type (MEASURE_TYPE) | CL_MEASURE_TYPE
Time (TIME) | Date/Time
Measure Concept | Representation
Observation Value (OBS_VAL) | Numeric
Education Statistics: Data Structure Definition (Incomplete)
Attribute Concept | Assignment Status | Attachment | Representation
Observation Status (OBS_STATUS) | M(andatory) | Observation | CL_OBS_STATUS
Observation Confidentiality (OBS_CONF) | C(onditional) | Observation | CL_OBS_CONF
Unit (UNIT) | M | Series | CL_UNIT
Unit Multiplier (UNIT_MULTIPLIER) | M | Series | CL_UNIT_MULT
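The assignment status and attachment level in this table lend themselves to a simple check. The sketch below is an assumption-laden illustration: it treats M as "must be reported" and C as "may be omitted", the function name `missing_mandatory` is hypothetical, and the attribute values in the example (`PERSONS`, `A`) are made up.

```python
# Attribute -> (assignment status, attachment level), as in the table above.
ATTRIBUTES = {
    "OBS_STATUS": ("M", "Observation"),
    "OBS_CONF": ("C", "Observation"),
    "UNIT": ("M", "Series"),
    "UNIT_MULTIPLIER": ("M", "Series"),
}


def missing_mandatory(series_attrs: dict, obs_attrs: dict) -> list:
    """List mandatory attributes that are not reported at their attachment level."""
    missing = []
    for name, (status, level) in ATTRIBUTES.items():
        if status != "M":
            continue  # conditional attributes are treated as optional in this sketch
        reported = series_attrs if level == "Series" else obs_attrs
        if name not in reported:
            missing.append(name)
    return missing


# Invented example values: the series reports a unit but no unit multiplier.
series_attrs = {"UNIT": "PERSONS"}
obs_attrs = {"OBS_STATUS": "A"}

problems = missing_mandatory(series_attrs, obs_attrs)
print(problems)
```

Attaching UNIT and UNIT_MULTIPLIER at series level means they are stated once per series rather than repeated on every observation.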
Identify Concepts from User Guide