Data curator in the middle: Curating data for a diverse community of stakeholders
Ruth Geraghty, Data Curator
15th International Digital Curation Conference, Croke Park, Dublin
Background
• Funded by the Atlantic Philanthropies
• To preserve data from their Child and Youth programme (2004–2016)
• Funded 50+ programmes for children in Ireland and Northern Ireland
• Evaluation a condition of funding
• My project: to preserve the data as an ‘evidence base’
Data producer motivations
• COMMUNITY ORGS (30 organisations): want to know what programmes work and what to do to improve them
• FUNDERS (Atlantic Philanthropies + government): want to know what to fund
• EVALUATORS (diverse mix of disciplines): want to do scientific research and publish papers
Data producer motivations (continued)
• Goals across community organisations, funders and evaluators: evidence base, high quality research, apply knowledge
• All of these depend on GOOD DATA
Processing stage: Locating and preparing the data
Step 1: Approach and negotiation with the data copyright holder (the copyright holders never had a copy of their data!)
Step 2: Collaboration with, and support for, university teams
Step 3: Request for copyright permission to share items from standardised measures (in some format)
Challenges of preparing this legacy data are discussed in Geraghty, R. (2017). Curation After the Fact: Practical and Ethical Challenges of Archiving Legacy Evaluation Data. The International Journal of Digital Curation, 12(1), 152–161. Retrieved from https://doi.org/10.2218/ijdc.v12i1.550
Curation stage: Contextual materials for archive
The research on user experience tells us that obstacles to reuse are:
• The degree of effort required to locate the data and to fully make sense of its origins can influence whether a researcher proceeds with re-using it (Gonçalves Curty, 2016)
• Incorrect or incomplete data documentation is a major obstacle for reuse (Yoon, 2016)
• Quality of data documentation is significantly related to users’ level of satisfaction with the reuse experience and increases their trust in the data (Faniel et al., 2015)
I experimented with documentation (based on what I would like to get as a researcher)
Codebooks include:
• Citations for published research tools (i.e. standardised measures)
• Notes on how data was deidentified (e.g. what codes were merged)
• List of variables categorised by domain for quick reference
User Guides include:
• Information on intervention programme
• Information on evaluation design
• Copies of documentation used by study
Sample: http://www.ucd.ie/issda/t4media/0055-00_PFL_User_Guide.V4.pdf
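The codebook's "list of variables categorised by domain" can be pictured as a simple grouping step. The following is only a minimal sketch in Python: the variable names, the domain labels and the `group_by_domain` helper are all hypothetical and are not taken from the actual codebooks.

```python
from collections import defaultdict

# Hypothetical (variable name, domain) pairs; real codebooks will differ.
variables = [
    ("ses_income", "Socio-economic status"),
    ("emp_status", "Employment"),
    ("ses_medcard", "Socio-economic status"),
    ("home_owner", "Housing"),
]


def group_by_domain(pairs):
    """Return {domain: sorted variable names}, with domains in alphabetical order."""
    grouped = defaultdict(list)
    for name, domain in pairs:
        grouped[domain].append(name)
    return {domain: sorted(names) for domain, names in sorted(grouped.items())}


quick_reference = group_by_domain(variables)
for domain, names in quick_reference.items():
    print(f"{domain}: {', '.join(names)}")
```

A listing like this lets a new user scan a dataset with many variables by topic before opening the data itself.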
Final stage: Motivating the new user to engage with the data
• Social science data is most commonly used for answering new research questions, for providing a comparative sample for a new study, or for methodological purposes such as teaching, rather than for replication or validation of the original research
• The ‘issue’ is the size of a social science dataset: a small number of cases x a large number of variables, e.g. multiple variables to capture the respondent’s SES, employment status, ethnicity, family structure, use of local services, health behaviours, education, membership of groups, social supports and participation, civic engagement, financial info, home ownership … all before even measuring outcomes
• Consultation with users: what do you need most?
Across the evaluations there are 15 different ways to measure “parenthood”
1. Parental Cognitions and Conduct Towards Infant Scale
2. Parental Acceptance and Rejection Questionnaire
3. Inventory of Parent Attachment
4. Parenting Problem Checklist
5. Parenting Daily Hassles Scale
6. Parents Evaluation of Developmental Status
7. Pianta Child-Parent Relationship Scale
8. Parental Locus of Control Scale
9. Parental Monitoring
10. Parenting Scale
11. Parenting Styles and Dimensions Questionnaire
12. Parenting Stress Index
13. Parenting Style Inventory
14. Parenting Sense of Competence
15. Parental Stress Scale
Signposting tools for data: The index of standardised measures
Index of measures search engine → Index result → Data archive / Repository
Index result:
• Description of measure
• How to access measure
• Access Irish data
Data archive / Repository:
• Dataset containing variables from this measure
• Contextual materials describing the study
• Reports generated by the analysis of this data
NAME OF FIELD: EXPLANATION
• Title: Official title
• Acronym
• Creator: Authors of measure
• Description: Background, what it measures, development
• Domain: Associated subject terms
• Permission to use: Free to use or cost to use
• Respondent: Who is the respondent, e.g. mother of study child
• Number of items: How many items (gives a sense of how long it will take)
• Training / qualification required to administer measure: None, or detail them
• More information / Access this measure: Source URL should point towards where to purchase/download
• Example of Irish programmes that use this measure in evaluation: Link to report/article using permanent identifier such as DOI or ISBN
• Compare available data from Ireland/Northern Ireland: Link to archived PEI or GUI data using permanent identifier
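The fields above could be held as a simple record type that a search engine filters by domain. This is a minimal sketch in Python under stated assumptions: the class name `MeasureRecord`, the helper `find_by_domain`, the attribute names and the domain terms attached to each measure are hypothetical illustrations, not the schema or the content of the actual index.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MeasureRecord:
    """One entry in an index of standardised measures (illustrative field names)."""
    title: str                                   # Official title
    acronym: Optional[str] = None
    creator: List[str] = field(default_factory=list)   # Authors of measure
    description: str = ""                        # Background, what it measures
    domain: List[str] = field(default_factory=list)    # Associated subject terms
    permission_to_use: str = ""                  # Free to use or cost to use
    respondent: str = ""                         # e.g. mother of study child
    number_of_items: Optional[int] = None
    training_required: str = "None"
    source_url: Optional[str] = None             # Where to purchase/download
    example_reports: List[str] = field(default_factory=list)  # DOIs etc.
    archived_data: List[str] = field(default_factory=list)    # Permanent identifiers


def find_by_domain(records, term):
    """Return the records whose domain terms include `term` (case-insensitive)."""
    wanted = term.lower()
    return [r for r in records if any(wanted == d.lower() for d in r.domain)]


# Titles come from the list of parenting measures; the domain terms are placeholders.
index = [
    MeasureRecord(title="Parenting Stress Index", domain=["parenting"]),
    MeasureRecord(title="Parental Locus of Control Scale", domain=["Parenting"]),
    MeasureRecord(title="Hypothetical Literacy Measure", domain=["education"]),
]

matches = find_by_domain(index, "parenting")
print([r.title for r in matches])
```

Keeping the report and dataset links as lists of permanent identifiers mirrors the last two fields, so one index result can point to several reports and archived datasets.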
Conclusion Social science research data in the archive – breathing new life into ‘old material’ “archives need not be passive agents, trying to fathom what users want. They can actively shape those needs and wants, ideally in an interactive and collaborative manner with re-users” (Bishop, 2014)
Ruth Geraghty
Centre for Effective Services, Dublin
Email: rgeraghty@effectiveservices.org
Twitter: @Rutho. Geraghty