The Library of Online Harmonisations
Martin Friedrichs and Kristi Winters, EDDI 2016. License: CC BY 4.0
Introduction
Harmonising variables is an important part of social science research. But for variable harmonisation work to be of scientific value, the harmonisation process must be documented and published in a precise and transparent way. Even when this is done, these intellectual contributions are often lost because researchers cannot publish their work. Our solution is an online library where digitally documented harmonisation routines are archived and accessed.
Overview L.o.H
Question: “Nice. But what is the relationship to DDI?”
Answer
Such a platform requires metadata and the data harmonisation workflow to be represented in a standardised, machine-actionable way. At present, harmonisation routines arise from different sources, such as SPSS or Stata, or as output from applications like CharmStats. These very different types of harmonisation routines should be represented by a single standard inside the library. We also need a standard for exchanging these “digitally documented harmonisation routines” with tools outside the library. Our idea is to combine these two requirements and use an XML-based representation for both. Which brings us to DDI.
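To make the idea of a single, exchangeable XML representation concrete, here is a minimal sketch of a round trip for one simple recode routine. The element names (HarmonisationRoutine, SourceVariable, TargetVariable, ValueMap) and the example variable names are hypothetical and chosen only for illustration; they are neither DDI elements nor the library's actual format.

```python
import xml.etree.ElementTree as ET


def recode_to_xml(source_var, target_var, value_map):
    """Serialise a simple recode routine to an illustrative XML string.

    All element names below are hypothetical, not taken from DDI.
    """
    root = ET.Element("HarmonisationRoutine")
    ET.SubElement(root, "SourceVariable").text = source_var
    ET.SubElement(root, "TargetVariable").text = target_var
    # One ValueMap element per source value, in a stable order.
    for src, tgt in sorted(value_map.items()):
        ET.SubElement(root, "ValueMap", source=str(src), target=str(tgt))
    return ET.tostring(root, encoding="unicode")


def xml_to_recode(xml_text):
    """Read the routine back; values come back as strings."""
    root = ET.fromstring(xml_text)
    value_map = {
        vm.get("source"): vm.get("target") for vm in root.findall("ValueMap")
    }
    return (
        root.findtext("SourceVariable"),
        root.findtext("TargetVariable"),
        value_map,
    )
```

A routine exported from any of the source tools could, under these assumptions, be stored once in this form and read back by another tool without knowing where it came from.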
DDI
Larry Hoyle and Joachim Wackerow, “DDI AS A COMMON FORMAT FOR EXPORT AND IMPORT FOR STATISTICAL PACKAGES”
In this article, published in IASSIST Quarterly in 2015, the authors present an experiment using Stat/Transfer to exchange datasets between software packages, with DDI Lifecycle as the “standard representation”.
Their conclusion
“Adoption of DDI by tools like Stat/Transfer is encouraging. Basic metadata is transferable among all 5 packages via DDI. The current state still means that some important metadata […] still must be either hand entered into DDI or harvested and entered by user-written code.” (pg. 20)
Digitally documented harmonisation routines: CHARMSTATS
About CharmStats (CS)
Overview CS Datamodel (diagram): Study, Project, InstanceLink, InstanceMap, Instance, AttributeComp, Question, AttributeLink, Variable, AttributeMap, CharLink, Value, CharMap
The Hidden Elements: Measurement, Dimension, Indicator, Variable
Overview CS Datamodel (diagram), rooted at Project:
• InstanceLink, S.Inst., AttributeLink, Measure, CharLink, Category
• InstanceLink, C.Inst., AttributeLink, Dimension, CharLink, Specification
• InstanceLink, O.Inst., AttributeLink, Indicator, CharLink, Prescription
• InstanceLink, D.Inst., AttributeLink, Variable, CharLink, Value
Overview CS Datamodel (diagram, repeated over three slides): the elements of the previous slide, with InstanceMap, AttributeMap and CharMap added.
DDI AND THE CS MODEL
Variable r:VersionableType Variable Name Label VariableType Level MeasureType Question DatasetLabel PID DateLastChanged VariableType VariableName r:Label r:QuestionReference r:Description … Temporal* Geographic* isTemporal isGeographic isWeight
Variable Project InstanceLink InstanceMap Instance AttributeLink AttributeMap Variable CharLink Value CharMap
Instance Label r:Maintainable r:Versionable VariableSchemeType VariableSchemeName r:Label VariableGroupType VariableGroupName TypeOfVariableGroup PublishedSince* ObsoleteSince* Description r:VariableSchemeReference r:UniverseReference r:ConceptReference r:Subject r:Keyword Variable r:VariableReference VariableGroup r:VariableGroupReference isOrdered
Instance V.Scheme V.Group Variable Project InstanceLink InstanceMap Instance AttributeLink AttributeMap Variable CharLink Value CharMap
References, Links and Maps
• InstanceLink: ID, Project, InstanceLinkType
• AttributeLink: ID, Instance, AttributeLinkType
• CharacteristicLink: ID, Attribute, CharacteristicLinkType
• InstanceMap: ID, SourceInstanceLink, TargetInstanceLink, InstanceMapType, Project
• AttributeMap: ID, SourceAttributeLink, TargetAttributeLink, AttributeMapType, InstanceMap
• CharacteristicMap: ID, SourceCharacteristicLink, TargetCharacteristicLink, CharacteristicMapType, AttributeMap
ConceptReferenced Concept RefRelationshipType
<xs:complexType name="ReferenceType">
  <xs:sequence>
    <xs:choice minOccurs="1" maxOccurs="2">
      <xs:element ref="URN"></xs:element>
      <xs:sequence>
        <xs:element ref="Agency"></xs:element>
        <xs:element ref="ID"></xs:element>
        <xs:element ref="Version"></xs:element>
      </xs:sequence>
    </xs:choice>
    <xs:element ref="TypeOfObject"></xs:element>
    <xs:element ref="MaintainableObject" minOccurs="0"></xs:element>
  </xs:sequence>
  <xs:attribute name="isExternal" type="xs:boolean" default="false"></xs:attribute>
  <xs:attribute name="externalReferenceDefaultURI" type="xs:anyURI" use="optional"></xs:attribute>
  <xs:attribute name="isReference" type="xs:boolean" fixed="true"></xs:attribute>
  <xs:attribute name="lateBound" type="xs:boolean" default="false"></xs:attribute>
  <xs:attribute name="lateBoundRestriction" type="VersionType" use="optional"></xs:attribute>
  <xs:attribute name="objectLanguage" type="LanguageList" use="optional"></xs:attribute>
  <xs:attribute name="sourceContext" type="xs:anyURI" use="optional"></xs:attribute>
</xs:complexType>
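The link and map records listed on this slide can be sketched as plain data classes. Field names follow the slide; the Python types, the use of strings for Project, Instance and Attribute, and the helper map_chain are assumptions made only for illustration, not the CharmStats implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceLink:
    id: int
    project: str              # assumed: project identified by name
    instance_link_type: str


@dataclass(frozen=True)
class AttributeLink:
    id: int
    instance: str             # assumed: instance identified by name
    attribute_link_type: str


@dataclass(frozen=True)
class CharacteristicLink:
    id: int
    attribute: str            # assumed: attribute identified by name
    characteristic_link_type: str


@dataclass(frozen=True)
class InstanceMap:
    id: int
    source_instance_link: InstanceLink
    target_instance_link: InstanceLink
    instance_map_type: str
    project: str


@dataclass(frozen=True)
class AttributeMap:
    id: int
    source_attribute_link: AttributeLink
    target_attribute_link: AttributeLink
    attribute_map_type: str
    instance_map: InstanceMap


@dataclass(frozen=True)
class CharacteristicMap:
    id: int
    source_characteristic_link: CharacteristicLink
    target_characteristic_link: CharacteristicLink
    characteristic_map_type: str
    attribute_map: AttributeMap


def map_chain(char_map):
    """Hypothetical helper: walk from a CharacteristicMap up to its project."""
    attr_map = char_map.attribute_map
    inst_map = attr_map.instance_map
    return (char_map.id, attr_map.id, inst_map.id, inst_map.project)
```

Read this way, each map points to its parent map, so a single value-level mapping can always be traced back to the project it belongs to.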
AttributeLink? V.Scheme V.Group r:VariableRef Variable Project InstanceLink InstanceMap Instance AttributeLink AttributeMap Variable CharLink Value CharMap
Problem: We can't refer to r:VariableRef
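For orientation, the ReferenceType schema on the “References, Links and Maps” slide identifies the referenced object either by a URN or by Agency, ID and Version, followed by TypeOfObject. The sketch below builds such a reference element under those rules. The function name build_reference and all example values are hypothetical, and the XML namespace used by real DDI documents is left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_reference(tag, type_of_object, urn=None,
                    agency=None, obj_id=None, version=None):
    """Build a reference element shaped like the ReferenceType schema.

    Hypothetical helper: no namespace handling, no optional attributes.
    """
    triple = (agency, obj_id, version)
    has_triple = all(part is not None for part in triple)
    if any(part is not None for part in triple) and not has_triple:
        raise ValueError("Agency, ID and Version must be given together")
    if urn is None and not has_triple:
        raise ValueError("need a URN or an Agency/ID/Version triple")

    ref = ET.Element(tag)
    # The schema's choice allows the URN, the triple, or both.
    if urn is not None:
        ET.SubElement(ref, "URN").text = urn
    if has_triple:
        ET.SubElement(ref, "Agency").text = agency
        ET.SubElement(ref, "ID").text = obj_id
        ET.SubElement(ref, "Version").text = version
    # TypeOfObject is required and follows the identification.
    ET.SubElement(ref, "TypeOfObject").text = type_of_object
    return ref
```

A reference built this way names the object it points to and that object's type, and nothing else.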
“AttributeLink” Variable V.Scheme V.Group Project InstanceLink InstanceMap Instance AttributeLink AttributeMap Variable CharLink Value CharMap
AttributeMap Variable V.Scheme V.Group Project InstanceLink InstanceMap Instance AttributeLink Variable AttributeMap CharLink Value CharMap V.Scheme V.Group r:VariableRef
Project ResourcePackage Variable V.Scheme V.Group Project InstanceLink InstanceMap Instance AttributeLink Variable AttributeMap CharLink Value CharMap V.Scheme V.Group r:VariableRef
“InstanceLink” ResourcePackage V.Scheme Variable V.Scheme V.Group Project InstanceLink InstanceMap Instance AttributeLink Variable AttributeMap CharLink Value CharMap V.Scheme V.Group r:VariableRef
“InstanceMap” ResourcePackage V.SchemeRef Project InstanceLink InstanceMap Variable V.Scheme V.Group Instance AttributeLink Variable AttributeMap CharLink Value CharMap V.Scheme V.Group r:VariableRef
Conclusion
• Replicating the CS model with DDI exactly as it was implemented is not possible
• Simulating the idea behind the CS model with DDI might be possible
• Further research is necessary
Thank you for your patience!