Deciding what to keep and where to keep it
- Slides: 20
Deciding what to keep …and where to keep it
Angus Whyte, Digital Curation Centre
Research Data Management at University of Aberdeen & RGU, 7th October 2014
This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland License
Outline
• Why select, rather than ‘file and forget’?
• Take five steps to inform your choice…
① Think. What could be reused for what purpose?
② Recognise compliance risks
③ (Gu)esstimate long-term value
④ Judge the cost factors
⑤ Decide – what action is needed
• The onus is on you, but it’s a partnership
• So what tools and practical help do you need?
Storage Strategies
Good practice: weigh up risks, value, and costs
• Select, share, safeguard what you can afford to, or dispose of it
• FAIR Principles (www.force11.org/group/fairgroup): Findable, Accessible, Interoperable, Reusable
Bad practice: keep everything until… lost by natural wastage
• Fragmented
• Risking unauthorised disclosure or loss
• Bit rot
• Media degradation
• Obsolescence (software, device, format, media)
• Fire, flood, theft
• Organisation failure
Why not keep it all?
Globally, data volumes are doubling every two years.
John Gantz and David Reinsel (2011), Extracting Value from Chaos, www.emc.com/digital_universe
Data volumes escalate
Volumes are rising faster in data-intensive research domains, e.g. DNA sequence data is doubling every 6-8 months.
“ELIXIR and Open Data”, View from an ELIXIR Node, Barend Mons, ELIXIR Launch event, 18th Dec 2013
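To get a feel for what these doubling times imply, the arithmetic can be sketched as below. The doubling times are the ones quoted on the slides (two years globally, 6-8 months for DNA sequence data). The ten-year horizon and the function name `growth_factor` are assumptions chosen only for illustration, and real growth is unlikely to follow a fixed doubling time.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Growth multiple after `months`, assuming a constant doubling time."""
    return 2 ** (months / doubling_months)


# Illustrative horizon only (an assumption, not a figure from the slides).
HORIZON_MONTHS = 120  # ten years

# Doubling times quoted on the slides.
global_growth = growth_factor(HORIZON_MONTHS, 24)   # doubling every two years
dna_growth_slow = growth_factor(HORIZON_MONTHS, 8)  # doubling every 8 months
dna_growth_fast = growth_factor(HORIZON_MONTHS, 6)  # doubling every 6 months

print(f"Global data: x{global_growth:,.0f}")
print(f"DNA sequence data: x{dna_growth_slow:,.0f} to x{dna_growth_fast:,.0f}")
```

Under these assumptions, ten years means a 32-fold increase globally, against a 32,768-fold to 1,048,576-fold increase for sequence data, which is why "keep it all" scales so badly in data-intensive domains.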
Storage management costs rise long-term
Hardware costs decline, but power and staff costs keep rising.
David Rosenthal, blog.dshr.org/2012/05/lets-just-keep-everything-forever-in.html
While data availability declines
Nature News, 19 Dec 2013, www.nature.com/news/scientists-losing-data-at-a-rapid-rate-1.14416
What to do?
Data appraisal… a ‘later stage’ plan for your data
① Could this data be re-used?
② Must it be kept to manage compliance risk?
③ Should it be kept for its potential value? And…
④ Considering costs…
⑤ Will ✔ or won’t ✗ it be kept, and shared on what terms?
[Diagram: Researchers; guidance & attractive choices; Institutions; managed storage; external repositories]
Step 1 (?) What ‘must’ be kept?
Some data may be part of the research record, as evidence for e.g.…
• Audit purposes
• Health & Safety (lab book)
• Contractual requirement
Jisc Infonet Guidance on Managing Research Records: tools.jiscinfonet.ac.uk/downloads/bcs-rrs/managing-research-records.pdf
What counts here? It depends on the purposes the data has been used for.
Compliance is also about data that won’t be kept, or may only be shared with approved researchers: Research Ethics, Duty of Confidentiality, Data Protection Act, Human Rights Act, Statistics and Registration Service Act.
UK Data Archive: http://www.data-archive.ac.uk/create-manage/consent-ethics/legal
Step 1 (?) What ‘must’ be kept?
What about Funding Body data policies?
• “Data with acknowledged long-term value” (RCUK Common Principles on Data Policy)
• “Data, information and other electronic resources of long-term interest” (ESRC UK Data Archive Collections Development Policy)
• “Where data underpins published research there is much greater expectation that it will be kept” (Ben Ryan, EPSRC)
What counts depends on the data’s value for the purposes it has served or may serve, so consider these as the first step.
Step 1 (?) What ‘must’ be kept?
Don’t forget journal policies…
“An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols promptly available to readers without undue qualifications. … Nature journals reserve the right to refuse publication in cases where authors do not provide adequate assurances that they can comply with the journal's requirements for sharing materials.” http://www.nature.com/authors/policies/availability.html
• “Changemakers are journals with high impact factors…. Progressive policies are not widespread, but are being adopted rapidly” Victoria Stodden, “Re-use and Reproducibility: Opportunities and Challenges”, Open Repositories, 2013
Step 1 What could it be reused for?
Step back and reflect – typical reuse purposes
1. Verification
2. Further analysis
3. Reputation building
4. Resource development
5. Further publications inc. data articles
6. Learning and teaching materials
7. Private reference
Then, relative to these, which data must be kept, and which data and related materials will have significant value?
e.g. High Energy Physics community
Levels of data to preserve: reuse purpose
1) Additional documentation: publication-related information search (e.g. wikis, news forums)
2) Data in a simplified format: outreach, simple training analyses
3) Analysis level software and the data format: full scientific analysis based on existing reconstruction
4) Reconstruction and simulation software and basic level data: full potential of the experimental data
Adapted from: DPHEP Study Group, Towards a Global Effort for Sustainable Data Preservation in High Energy Physics, May 2012. http://arxiv.org/abs/1205.4667
Step 3 What data should have value?
Indicators that data have value
1. Quality of the data and its description: complete, accurate, reliable, valid, representative etc.
2. Demand: high known users, integration potential, reputation, recommendation, appeal
3. Replication difficulty: difficult, costly, or impossible to reproduce
4. Low barriers: legal/ethical, copyright, non-restrictive terms and conditions
5. Rarity: unique copy, or other copies at risk
Which related material does the data depend on for its value?
Step 4 Cost factors
Consider these when deciding what to keep because
• Costs incurred during the project may add to the data’s value
• You need to make sure post-project costs are covered
1. Creation, collection & cleaning
2. Short-term storage & backup
3. Short-term access & security
4. Team communication & development
5. Preservation & long-term access
What action needs to be taken to ensure preservation is costed?
Step 5 Your data appraisal
Establish a clear idea of what data needs to be packaged at the end
1. Title, contributors, description, access rights *
2. Reuse purpose(s)
3. Value for purpose
4. Risk of budget shortfall
5. Keep it or not? *
6. Reasons for disposal *
7. Actions to prepare for preservation or disposal *
* What anyone outside the project most needs to know (but the rest will help)
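The seven appraisal items can be captured as a simple structured record, so that the starred items are never left blank. The following is a minimal sketch, not a prescribed format: the class name `AppraisalRecord`, its field names, the rule that disposal reasons are only required when data is not kept, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AppraisalRecord:
    """Hypothetical record of one dataset's appraisal (field names are assumptions)."""

    # Item 1 (starred on the slide)
    title: str
    contributors: List[str]
    description: str
    access_rights: str
    # Item 5 (starred)
    keep: bool
    # Items 2-4 (unstarred: helpful, but not the core)
    reuse_purposes: List[str] = field(default_factory=list)
    value_for_purpose: str = ""
    budget_shortfall_risk: str = ""
    # Items 6 and 7 (starred)
    disposal_reasons: List[str] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)

    def missing_core_items(self) -> List[str]:
        """Return the names of starred items that are still empty."""
        core = ("title", "contributors", "description", "access_rights", "actions")
        missing = [name for name in core if not getattr(self, name)]
        # Assumption: disposal reasons are only needed when data is not kept.
        if not self.keep and not self.disposal_reasons:
            missing.append("disposal_reasons")
        return missing


# Made-up example values, for illustration only.
record = AppraisalRecord(
    title="Interview transcripts",
    contributors=["A. Researcher"],
    description="Anonymised interview transcripts",
    access_rights="Approved researchers only",
    keep=False,
)
print(record.missing_core_items())  # ['actions', 'disposal_reasons']
```

A record like this could be filled in at the end of a project and handed over with the data, so that anyone outside the project can see what was decided and why.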
Who should help appraise?
RLUK ‘skills gaps’ survey of Subject Librarians & Managers
“…nine key areas where future involvement by Subject Librarians is considered to be important now and is also expected to grow sharply…
1. Ability to advise on preserving research outputs (49% see as essential in 2-5 years; 10% now)
2. Knowledge to advise on data management and curation (48% essential in 2-5 years; 16% now)…”
Mary Auckland (2012), Reskilling Libraries for Research
Who else?
Others who may be involved in appraising research data…
• Domain specialists
• Archives
• Research Office: Business development
• IT Support/Research Computing
• Research Ethics Committee
• Records Management/FOI Compliance
• Facilities Managers (if physical samples involved)
Where should it go?
Institutions are aiming to offer a range of options
• Secure managed storage/disposal
• Institutional data catalogue (most universities are establishing one)
• Institutional data repository (if there is nowhere else it can go)
• Help to find an external repository
Finding external repositories
• General directories: Re3data.org, Databib.org
• Domain-specific directories, e.g. life sciences: Biosharing.org
• Data journal recommendations: Edinburgh research data blog, “Sources of dataset peer review”
• Funding body recommendations, e.g. Wellcome Trust, “Data repositories and database sources”