Data Documentation Initiative (DDI): Goals and Benefits Mary Vardigan Director, DDI Alliance
Today’s Presentation • What is DDI? • History of the effort • The DDI community • DDI features, benefits, examples • Relationship to SDMX and GSBPM • Future directions
What Is DDI? • An international specification for structured metadata describing social, behavioral, and economic data • A standardized framework to maintain and exchange documentation/metadata • A basis on which to build software tools • Currently expressed in XML – eXtensible Markup Language
History • 1995 -- First international committee established • 2000 -- First DDI version published (aligned with codebooks, XML DTD-based) • 2003 -- DDI 2 published (support for aggregate/tabular data and geography added) • 2003 -- Formation of the DDI Alliance, a self-sustaining membership organization • 2008 -- DDI 3 published (aligned with data lifecycle, XML Schema-based) • 2010 -- DDI rebranding – DDI Codebook (DDI 2 branch) and DDI Lifecycle (DDI 3 branch) development lines
DDI Alliance • 30 members from around the world • Modest institutional membership fee -- entitles members to help to shape the specification • Yearly meetings • Active working groups: Survey Design and Implementation, Qualitative Data • New groups: Paradata, Administrative Data, Disclosure
The DDI Community www.ddialliance.org
The DDI Community
Projects and Organizations Using DDI • Australian Bureau of Statistics • Canadian Research Data Centres (RDC) Program • CESSDA Data Portal • DataFirst at University of Cape Town • The Dataverse Network • European Social Survey (ESS) and ESS-Edu-net • Gallup Europe
Projects and Organizations Using DDI (continued) • General Social Survey • Institute for the Study of Labor – IZA, Germany • LISS -- Longitudinal Internet Studies for the Social Sciences, Netherlands • National Survey of Family Growth, US • UNICEF, ChildInfo – Monitoring the Situation of Children and Women • World Bank -- International Household Survey Network (IHSN) and Microdata Management Toolkit
Products of the DDI Alliance • Specifications for DDI Codebook and Lifecycle • Controlled vocabularies • Tools catalog • Training • Papers and presentations
DDI Tools • Several DDI authoring tools now available • Nesstar • IHSN Microdata Toolkit • Colectica • Tools for DDI in Research Data Centers
DDI Training and Exploration • Training available on demand in fundamentals of DDI • Advanced workshops in Germany in September 2011 • DDI and Longitudinal Data • DDI and Semantic Statistics
DDI Development Lines • DDI Codebook (DDI 2 branch) – Reflects components of social science codebooks – Includes descriptions at the study, file, and variable level • DDI Lifecycle (DDI 3 branch) – Reflects research data lifecycle – Optimized for reuse of metadata
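The three description levels of DDI Codebook (study, file, variable) can be illustrated with a short Python sketch. The XML fragment below is a simplified, namespace-free approximation in the style of DDI Codebook; the study title, file name, and variable are invented for illustration, and a real instance would carry a namespace and many more elements.

```python
import xml.etree.ElementTree as ET

# Simplified fragment in the style of DDI Codebook (assumption: no namespace,
# only a handful of elements). All values here are invented for illustration.
DDI_CODEBOOK = """
<codeBook>
  <stdyDscr>
    <citation><titlStmt><titl>Example Household Survey</titl></titlStmt></citation>
  </stdyDscr>
  <fileDscr>
    <fileTxt><fileName>household.dat</fileName></fileTxt>
  </fileDscr>
  <dataDscr>
    <var name="SEX">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
  </dataDscr>
</codeBook>
"""


def summarize(xml_text):
    """Pull one item from each level: study, file, and variable."""
    root = ET.fromstring(xml_text.strip())
    variables = {}
    for var in root.iterfind("dataDscr/var"):
        # Map each category code to its label.
        categories = {
            cat.findtext("catValu"): cat.findtext("labl")
            for cat in var.iterfind("catgry")
        }
        variables[var.get("name")] = {
            "label": var.findtext("labl"),  # first direct <labl> child
            "categories": categories,
        }
    return {
        "study": root.findtext("stdyDscr/citation/titlStmt/titl"),
        "file": root.findtext("fileDscr/fileTxt/fileName"),
        "variables": variables,
    }


summary = summarize(DDI_CODEBOOK)
print(summary["study"], summary["file"], sorted(summary["variables"]))
```

Because the documentation is structured rather than free text, a few lines of standard-library code are enough to extract labels and category codes for reuse in other tools.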
Research Data Life Cycle
Research Data Life Cycle • Preservation metadata • Confidentiality • Add’l processing • Initial concepts • Questions and answers • Grant info • Questionnaire • Coded instrument • CAI metadata • Paradata • Data specs • Recodes • Summary descriptive info • Terms of use • Citation • Packaging info • Catalog record • Indexing • Related publications • Post-hoc harmonization • Data transformations • Replication code • Publications
[Diagram: DDI as backbone for structured metadata across the lifecycle stages Concept, Collection, Processing, Archive, Distribution, Discovery, Analysis, and Repurposing. Labeled elements include custom tools (e.g., forms-based), CAI tools, MQDS, information extracted from SPSS and similar packages, OAIS packages (SIP, AIP, DIP), distribution packages, web information systems, search engines, statistical packages, and online analysis. Data / documents outside of DDI.]
DDI Lifecycle Features • Machine-actionable • Modular and extensible • Multi-lingual • Aligned with other metadata standards • Can carry data in-line • Focused on metadata reuse
DDI Lifecycle Features
DDI Lifecycle Features • Support for CAI instruments • Support for longitudinal surveys • Focus on comparison, both by design and after-the-fact (harmonization) • Robust record and file linkages for complex data files • Support for geographic content (shape and boundary files) • Capability for registries and question banks
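The ideas of metadata reuse, question banks, and comparison by design can be sketched in Python. This is a conceptual illustration only: the class names, identifiers, and question texts are hypothetical and do not reproduce the actual DDI Lifecycle schema. The point is that a question is described once and then referenced by identifier from each instrument that uses it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Question:
    """A question described once in a shared bank (hypothetical structure)."""
    id: str
    text: str


@dataclass
class Instrument:
    """A questionnaire that holds references to questions, not copies."""
    name: str
    question_ids: list = field(default_factory=list)


# Hypothetical question bank keyed by identifier.
QUESTION_BANK = {
    q.id: q
    for q in [
        Question("q_age", "What is your age?"),
        Question("q_emp", "Are you currently employed?"),
        Question("q_health", "How would you rate your health?"),
    ]
}

# Two waves of a longitudinal survey reuse the same question descriptions.
wave1 = Instrument("Wave 1", ["q_age", "q_emp"])
wave2 = Instrument("Wave 2", ["q_age", "q_emp", "q_health"])


def resolve(instrument, bank):
    """Look up the question text for every reference in an instrument."""
    return [bank[qid].text for qid in instrument.question_ids]


def shared_questions(a, b):
    """Questions comparable by design: same identifier in both instruments."""
    return sorted(set(a.question_ids) & set(b.question_ids))


print(resolve(wave1, QUESTION_BANK))
print(shared_questions(wave1, wave2))
```

Referencing by identifier means a correction to a question's wording is made in one place, and comparability across waves falls out of a simple set intersection.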
Example – Comparative Collaborative Psychiatric Surveys www.icpsr.umich.edu/CPES
Explore a Variable
Compare Questions in Proximity
DDI and SDMX -- Statistical Data and Metadata eXchange • Not competing but complementary • Two standards bodies talking with each other – Recent meetings in Utrecht, Lisbon, and Washington, DC; next one in Luxembourg • Goals and use cases – Document survey process end-to-end – Drill back from macrodata to microdata – Data discovery
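The "drill back from macrodata to microdata" use case can be pictured with a toy Python example. This is not how either standard represents the link; the records, field names, and functions are invented. It only shows the general idea that an aggregate cell keeps enough provenance to find the records behind it.

```python
from collections import defaultdict

# Invented microdata: one record per respondent.
MICRODATA = [
    {"id": "r1", "region": "North", "employed": True},
    {"id": "r2", "region": "North", "employed": False},
    {"id": "r3", "region": "South", "employed": True},
    {"id": "r4", "region": "South", "employed": True},
]


def aggregate(records, by, measure):
    """Build macrodata cells that remember which records they count."""
    cells = defaultdict(lambda: {"count": 0, "source_ids": []})
    for rec in records:
        cell = cells[rec[by]]  # create the cell even if nothing is counted
        if rec[measure]:
            cell["count"] += 1
            cell["source_ids"].append(rec["id"])
    return dict(cells)


def drill_back(records, cell):
    """Return the microdata records behind one aggregate cell."""
    wanted = set(cell["source_ids"])
    return [rec for rec in records if rec["id"] in wanted]


macro = aggregate(MICRODATA, by="region", measure="employed")
print(macro["South"]["count"])
print([rec["id"] for rec in drill_back(MICRODATA, macro["South"])])
```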
Generic Statistical Business Process Model -- GSBPM
GSBPM, SDMX, and DDI
Future Directions • Data model development • Additional controlled vocabularies • Training in DDI and SDMX • New tools – DDI from Blaise • New versions of both Codebook and Lifecycle this year • Work starting on DDI 4.0
Questions? • Mary Vardigan – vardigan@umich.edu