Sustainability considerations:
1. ISIS Neutron Source
2. DP for HEP

Matthew Viljoen, STFC, UK
APARSEN-EGI workshop: preserving big data for research
Amsterdam Science Park, 4-6 March 2014

Technical considerations of chosen DP solution

For ISIS, the long-term archive is implemented by:
• Data
  - Tessella Safety Deposit Box (SDB)
  - SGI DMF tape frontend (NFS mount)
  - StorageTek SL8500 tape robot
• Metadata
  - ICAT database
  - DataCite DOIs pointing to a copy of the data on a Windows filestore

For DPHEP, the solution will likely be entirely open source.
Sustainable services/solutions: working with bodies such as RDA/APA.

Technical considerations: risks and mitigations

Closed-source commercial solutions: company bankruptcy, unexpected license/cost increases
• Plans in place to migrate out of existing solutions
• Estimate the effort to do this

Lack of retained know-how (staff turnover)
• Documentation!

Bit loss
• Solution-based approach
• Checksum validation. During media migration?

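The checksum validation bullet can be illustrated with a minimal sketch, assuming a simple file-tree copy from old to new media. This is not how SDB or DMF perform fixity checks; the function names (`sha256_of`, `build_manifest`, `verify_against_manifest`), the choice of SHA-256, and the directory layout are all hypothetical and chosen only for illustration. The idea is to record a checksum manifest before migration and re-verify every file on the new media afterwards.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum.

    Hypothetical helper: run this on the source media before migration.
    """
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_against_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems found under root; empty means all files match.

    Hypothetical helper: run this on the target media after migration.
    """
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif sha256_of(target) != expected:
            problems.append(f"checksum mismatch: {rel}")
    return problems
```

In use, the manifest would be built once on the old media, stored alongside the metadata, and checked again after each migration. An empty list from `verify_against_manifest` suggests no files were lost or altered in the copy, under the assumption that the stored manifest itself is intact.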
Future (meta)data formats

Plans in place for migration to future:
• File formats
  - For ISIS we need to implement and test processes to do this. SDB allows for this.
• Metadata schema changes
  - For ISIS, the ICAT metadata schema is community driven. Pressure for backward compatibility.

Sustainability via usability

Training
• Metadata: best practice for annotations, provenance
• How to access data
• Increased implementation and use of Persistent Identifiers. Carrot or stick approach?
BUT different communities have very different training requirements

Documentation
• Tools
• Archive and procedures
• Ensure data is usable by target communities

And finally: costing

Need to have a clear business case for sustainable preservation at the outset

Considerations
• Business case needs to support DP costs
• Preservation length and metrics of success
• Additional research via open data/data reuse
• Negative impact of data loss!