BLS Metadata Repository – Issues and Progress
Daniel Gillman
US Bureau of Labor Statistics

Outline
- BLS Programs
- Time Series
- Data Dissemination
- Metadata
- Model
- BLS Repository
Wolfram Data Summit 9/10/2010

The BLS Mission
The Bureau of Labor Statistics (BLS) is the principal fact-finding agency for the Federal Government in the broad field of labor economics and statistics. The BLS collects, processes, analyzes, and disseminates essential statistical data to the public, Congress, Federal agencies, State and local governments, business, and labor.

BLS Programs
8 Major Program Areas
- Inflation & Prices
- Employment
- Unemployment
- Pay & Benefits
- Spending & Time Use
- Productivity
- Workplace Injuries
- International

Time Series
- Measure or index over time
- Index: number relative to fixed point
- 30 series types
- Subset by
  - Industry
  - Occupation
  - Geography (state, county, MSA, etc.)
- Tables
  - Generated from time series data

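The "number relative to fixed point" idea can be illustrated with a minimal sketch. It assumes the fixed point is a base period whose value is rescaled to 100; the function name `to_index`, the base value, and the sample figures are hypothetical and are not BLS data or BLS methodology.

```python
def to_index(values, base_period, base_value=100.0):
    """Rescale a {period: value} series so the base period equals base_value."""
    base = values[base_period]
    if base == 0:
        raise ValueError("base period value must be non-zero")
    return {period: base_value * v / base for period, v in values.items()}


# Made-up figures for illustration only.
prices = {"2008": 50.0, "2009": 52.0, "2010": 55.0}
index = to_index(prices, base_period="2008")

for period, value in index.items():
    print(period, round(value, 1))
```

Each index value reads as a percentage of the base period, which is why series with different units can be compared once indexed.
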
Data Dissemination
- Web site: http://www.bls.gov
- 8 major numbers (m = monthly, q = quarterly)
  - Unemployment rate (m)
  - Consumer price index (m)
  - Producer price index (m)
  - Employment cost index (q)
  - Average hourly earnings (m)
  - Payroll employment (m)
  - Productivity (q)
  - Import price index (m)
- All time series
- Tables

Data Dissemination

Data Dissemination
- Organized by programs
  - Time series in ASCII files by FTP
  - Some tables
  - Crude database search
- Little metadata
  - Web site itself
  - Hidden in FTP directories
  - Handbook of Methods
  - Seasonal adjustment

Data Dissemination
- Requires knowing
  - Organization of BLS
  - Specific surveys or programs
  - Specific series
  - Terms & technical meaning (e.g., earnings)
- Relies on "Series ID"
  - Brittle scheme for identifying series
  - Known by power users

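To make the brittleness concrete, here is a minimal sketch of positional Series ID parsing. The field layout (2-character survey prefix, 1-character seasonal code, 1-character periodicity code, 4-character area code, remaining characters as item code) is an assumption chosen for illustration; `SeriesId`, `parse_series_id`, and the sample ID are hypothetical helpers and input, and other programs may lay out their IDs differently.

```python
from typing import NamedTuple


class SeriesId(NamedTuple):
    survey: str
    seasonal: str
    periodicity: str
    area: str
    item: str


def parse_series_id(series_id: str) -> SeriesId:
    """Split an ID by fixed character positions (assumed layout, see lead-in)."""
    if len(series_id) < 9:
        raise ValueError(f"series ID too short for this layout: {series_id!r}")
    return SeriesId(
        survey=series_id[0:2],
        seasonal=series_id[2],
        periodicity=series_id[3],
        area=series_id[4:8],
        item=series_id[8:],
    )


parsed = parse_series_id("CUUR0000SA0")
print(parsed)
```

Every field depends on character position alone, so a user has to know the layout in advance and a layout that differs by program silently yields wrong fields. That is the dependence on insider knowledge the slide describes.
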
Metadata
- Supports
  - Dissemination
  - Support Data.gov
  - Time series and tables
- Does not support
  - Internal processing
  - Describing survey life-cycle
  - Microdata (respondent level)

Metadata
- Hard to collect
- Need "simple" model
  - Maybe not so easy
- Basic metadata already on FTP sites
- Support finding data by
  - Traditional means: Series ID, BLS structure
  - New means: Subject matter

Metadata
- Previous BLS focus group study
  - Users find data by
    - Time
    - Place
    - Subject (title or keywords)
  - Structure of agencies not known
  - Technical terms not known
- Metadata must support this

Model
- Time Series
- Data Element
- Classification
- Concept
- Naming Convention

Model

BLS Repository
- Under development
- Requirement: fast response
- Testing
  - Flat single table design
  - Using Apache Lucene Solr
    - Open source enterprise search
  - Various interface approaches
    - Visual Basic
    - Java

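A flat single table design can be sketched in plain Python as one record per series with every searchable attribute on the same row. This is an in-memory illustration only, not the Solr configuration under test; the field names, series IDs, and the helpers `search` and `facet_counts` are all hypothetical.

```python
from collections import Counter

# One flat record per series; values are made up for illustration.
SERIES = [
    {"series_id": "X001", "subject": "unemployment", "geography": "national", "frequency": "monthly"},
    {"series_id": "X002", "subject": "unemployment", "geography": "state", "frequency": "monthly"},
    {"series_id": "X003", "subject": "prices", "geography": "national", "frequency": "monthly"},
    {"series_id": "X004", "subject": "productivity", "geography": "national", "frequency": "quarterly"},
]


def search(rows, **facets):
    """Return rows whose fields match every requested facet value."""
    return [r for r in rows if all(r.get(k) == v for k, v in facets.items())]


def facet_counts(rows, field):
    """Count how many rows carry each value of one field."""
    return Counter(r[field] for r in rows)


hits = search(SERIES, geography="national", frequency="monthly")
print([r["series_id"] for r in hits])
print(facet_counts(hits, "subject"))
```

With no joins, each query is a filter over one table, which is the property that makes the design attractive when fast response is the requirement.
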
BLS Repository
- Need term map
  - Common terms to technical terms
  - Definitions for technical terms
- Concept based management
  - Link terms to relevant data
  - Manage multi-faceted search
- Development schedule
  - Still research project

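A term map from common terms to technical terms could be as simple as a lookup table. The sketch below is illustrative: the mappings are assumptions built from the measure names listed earlier in the talk, not the actual BLS term map, and `TERM_MAP` and `technical_terms` are hypothetical names.

```python
# Common term -> technical terms (illustrative mappings only).
TERM_MAP = {
    "inflation": ["Consumer price index", "Producer price index"],
    "pay": ["Average hourly earnings", "Employment cost index"],
    "jobs": ["Payroll employment", "Unemployment rate"],
}


def technical_terms(query):
    """Look up the technical terms for a common term, ignoring case and padding."""
    key = query.strip().lower()
    return TERM_MAP.get(key, [])


print(technical_terms("Inflation"))
print(technical_terms("weather"))
```

A user who searches for "inflation" need not know which index names BLS uses; the map supplies them, and each technical term can then be linked to its definition and to the relevant series.
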
Contact Information
Daniel Gillman
gillman.daniel@bls.gov