Because good research needs good data
The Digital Curation Lifecycle Model
Joy Davidson and Sarah Jones
Digital Curation Centre, Glasgow
joy.davidson@glasgow.ac.uk
sarah.jones@glasgow.ac.uk
The DCC lifecycle model, Exeter Uni, 19 May 2012

What is data curation?
“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
• Data have importance as the evidential base of scholarly conclusions
• Curation is part of good research practice

Why curate: requirements
• Code of good research conduct: data should be preserved and accessible for 10 years +
• Funders’ data policies: www.dcc.ac.uk/resources/policy-and-legal/funders-data-policies
• Common principles on data policy: www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

Why curate: rewards
• Prevent data loss
• More citations: 69% ↑ (Piwowar, 2007 in PLoS)
• Validation of results
• New research opportunities and collaborations
• Easier to do your work…

DCC curation lifecycle model
Activities to cover are:
• data management planning
• creating data
• metadata & documentation
• selecting what to keep

Conceptualise: planning what to do
Key questions:
• What data will be created?
• How much storage is needed?
• Are there ethical issues that require consent?
How can you help researchers develop appropriate plans?
N.B. many funders expect data management & sharing plans

Guidance:
• Decisions made early on may limit what is possible later
• Make sure researchers are aware of the support available (from colleagues, university, data centres etc.)
• Encourage discussion to help people find the best approach

Data Management and Sharing Plans
Funders typically want a short statement covering:
• What data will be created? (format, types, volume)
• What standards and methodologies will you use? (incl. metadata)
• How will you manage ethics and Intellectual Property?
• What are the plans for data sharing and access?
• What is the strategy for long-term preservation?
DMP guidance: www.dcc.ac.uk/resources/data-management-plans
DMP online: http://dmponline.dcc.ac.uk/

What is DMP Online?
A web-based tool to help researchers write plans
It features:
• Templates based on different requirements
• Tailored guidance (disciplinary, funder etc.)
• Customised exports to a variety of formats
• Ability to share DMPs with others
https://dmponline.dcc.ac.uk

You can customise it for your uni
• Add your logo, colours, URL…
• Select desired questions
• Profile local support
• www.dcc.ac.uk/blog/tailoring-dmp-online-for-your-institution

Creating / collecting data
Key questions:
• What formats are most appropriate to use?
• How will researchers create their data?
  - standards & methodologies to use
  - file naming and version control
  - quality control & assurance
• How will ethical concerns be addressed to protect participants?

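The "file naming and version control" point above can be made concrete with a small helper. This is a minimal sketch only: the pattern `project_description_YYYYMMDD_vNN.ext` and the function name `build_filename` are illustrative assumptions, not a convention prescribed by the DCC or by these slides.

```python
import re
from datetime import date


def build_filename(project: str, description: str, created: date,
                   version: int, ext: str) -> str:
    """Return a predictable, sortable file name.

    Assumed (hypothetical) pattern: project_description_YYYYMMDD_vNN.ext
    """
    def slug(text: str) -> str:
        # Lower-case the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names are safe across systems.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return (f"{slug(project)}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{extension}")


if __name__ == "__main__":
    print(build_filename("Soil Survey", "Site A readings",
                         date(2012, 5, 19), 3, ".CSV"))
```

Putting the date and a zero-padded version number in the name keeps files in order when sorted alphabetically; a research group would substitute whatever convention it has agreed at the start of the project.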
Guidance
• Data could be expensive or impossible to recapture, so make sure it’s sustainable
• Different formats are good for different things: one may be used for analysis, and the data then converted to another for preservation
• Take time to develop processes at the start – it pays off later!
• Excellent guidance on creating data & managing ethics in: www.data-archive.ac.uk/media/2894/managingsharing.pdf

Provide guidance & raise awareness

Metadata & documentation
Key questions:
• What information do users need to understand the data?
  - descriptions of all variables / fields and their values
  - code labels, classification schema, abbreviations list
  - information about the project and data creators
  - tips on usage, e.g. exceptions, quirks, questionable results
• How will this be captured?
• Are there standards that can be used?

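One way to capture the documentation listed above at the time of data creation is a small machine-readable record kept alongside the data file. The sketch below is an assumption-laden illustration: the field names (`title`, `creators`, `variables`, `usage_notes`) and the function `describe_dataset` are hypothetical and do not follow any particular metadata standard; a real project should prefer a standard used in its discipline.

```python
import json


def describe_dataset(title, creators, variables, usage_notes):
    """Build a JSON documentation record for a dataset.

    `variables` is a list of dicts with at least "name" and "description";
    an optional "codes" dict maps coded values to their labels.
    """
    # Refuse to write a record with undocumented variables, so gaps are
    # caught while the people who know the data are still available.
    undocumented = [v["name"] for v in variables if not v.get("description")]
    if undocumented:
        raise ValueError(f"Variables missing a description: {undocumented}")

    record = {
        "title": title,
        "creators": creators,
        "variables": variables,
        "usage_notes": usage_notes,
    }
    return json.dumps(record, indent=2)


if __name__ == "__main__":
    print(describe_dataset(
        title="Example survey (illustrative only)",
        creators=["A. Researcher"],
        variables=[
            {"name": "resp_id", "description": "Anonymised respondent ID"},
            {"name": "q1", "description": "Agreement with statement 1",
             "codes": {"1": "agree", "2": "disagree", "9": "no answer"}},
        ],
        usage_notes=["Code 9 was introduced part-way through collection."],
    ))
```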
Guidance:
• Create metadata at the time – it’s hard to do later
• Develop processes so everyone does the same
• Use standards for interoperability

Appraise: select what needs to be kept
Key questions:
• What has to be kept? (published data, of long-term value)
• What has to be destroyed? (e.g. personal data)
• Is it worth keeping the data? – cost/benefits
• Where will the data be kept?

Guidance:
• Check funder expectations, licences and consent agreements
• Provide advice and guidance to help researchers decide
• Remember, storage is not the only cost in data management!
• Check out the DCC How to guide: www.dcc.ac.uk/resources/how-guides/appraise-select-data

Criteria for appraisal decisions
1. Relevance to mission
2. Scientific, social, cultural, historical value
3. Uniqueness
4. Potential for redistribution
5. Non-replicability
6. Economic case
7. Full documentation

Good data management is about making informed decisions

• http://xkcd.com/949

Any questions?
For DCC guidance, tools and case studies see: www.dcc.ac.uk/resources
Follow us on Twitter: @digitalcuration and #ukdcc