A CIDOC CRM-compatible metadata model for digital preservation
Panos Constantopoulos and Vicky Dritsou
Information Systems and Databases Laboratory
Department of Informatics
Athens University of Economics and Business
Vicky Dritsou, DCEIS '06 - May 2006
Structure of the presentation
• Introduction to Digital Preservation
• Metadata
• Existing proposals
• A conceptual preservation metadata model
• Properties of the model
• Model concepts
• Schema
• The complete model
• Conclusion
• Further research
24 October 2006 CIDOC CRM Workshop
Introduction to Digital Preservation (1/2)
• Two types of perils for digital content exist
  - Physical: physical destruction of file systems, corruption of digital media, fire, earthquake
  - Technological: obsolete systems, non-compatible systems, software and formats
• Physical perils are more straightforward to confront
  - By saving multiple copies of digital content:
    • On different media
    • At different geographic locations
• Technological hazards require a more complex policy to be applied
  - By following the appropriate preservation strategy
Introduction to Digital Preservation (2/2)
• Digital preservation strategies for technological hazards
  - Information migration
  - Technology emulation
  - Technology preservation
  - Backwards compatibility
  - Reliance on standards
  - Encapsulation
  - Transformation to non-digital form
  - Digital archeology
• Most strategies require some information to be collected and stored
  - This is achieved by using metadata
Metadata
• Defined as “data about data” or otherwise “information about information”
• Metadata properties
  - Not necessarily digital
  - Not autonomous
    • Digital information needs to pre-exist
  - Supplementary
  - Dynamic character
• Metadata types
  - Descriptive
  - Structural
  - Administrative
• Preservation metadata
  - They contain elements from all three types
• But which metadata should we choose?
Existing proposals
• Several approaches exist
• We have studied five widely known ones:
  - Dublin Core (DC)
  - Open Archival Information System (OAIS)
  - CURL Exemplars in Digital Archives (CEDARS)
  - Pittsburgh Project (PP)
  - National Library of Australia (NLA)
• Discussion
  - None contains inter-related concepts (they are element lists)
  - DC: access-oriented, inadequate
  - OAIS, CEDARS: very detailed, difficult to use
  - PP: detailed, necessary/optional elements, use instructions
  - NLA: structured elements, object types
A conceptual preservation metadata model
• A parsimonious metadata set derived from
  - comparison of the aforementioned proposals
  - CIDOC CRM
• Metadata elements
  - Title, Identifier, Subject, Language, Type, Format, Technical Equipment
  - Information Carrier, Activity, Right, Actor, Effect, History, Relations
Properties of the model
• Each element forms a concept
• Contains relationships among concepts
• Results in a conceptual model
  - Compatible with CIDOC CRM
    • A small number of new concepts
• Can serve as an application ontology
  - A guide for preservation
• Independent of preservation strategy
  - Elements contain all the information required by each strategy
  - Further details can be added by extending the concepts
Model concepts (1/3)
• Main concept: Digital Object
  - Subclass of E73 Information Object
  - Has attributes: Title, Subject, Type, Size, Identifier, Language, Digital Content
    • Identifiers may be global or local
      - Global identifiers must be unique
    • Digital Content allows separation of content from descriptive/administrative aspects
      - Stored in an Information Carrier
  - Digital Objects can consist of other digital objects (Complex Objects)
  - Type: image, text, sound, multimedia, …
  - Each object type can be formatted in one of a number of specific formats
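The Digital Object concept described on this slide can be sketched in code. The following is only a minimal illustration, not the authors' schema: the class `InformationObject` is a stand-in for CIDOC CRM E73, and the names `Registry`, `object_type`, `object_format`, `parts` and the choice of bytes as the unit of `size` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InformationObject:
    """Illustrative stand-in for CIDOC CRM E73 Information Object."""
    title: str


@dataclass
class Identifier:
    value: str
    is_global: bool = False  # local by default; global ones must be unique


@dataclass
class DigitalContent:
    """Content kept apart from descriptive/administrative attributes."""
    carrier: str  # name of the Information Carrier that stores the content


@dataclass
class DigitalObject(InformationObject):
    subject: str = ""
    object_type: str = ""    # e.g. image, text, sound, multimedia
    object_format: str = ""  # one of the formats available for that type
    size: int = 0            # assumption: size expressed in bytes
    language: str = ""
    identifiers: List[Identifier] = field(default_factory=list)
    content: Optional[DigitalContent] = None
    parts: List["DigitalObject"] = field(default_factory=list)

    def is_complex(self) -> bool:
        """A Complex Object is one that consists of other digital objects."""
        return bool(self.parts)


class Registry:
    """Hypothetical helper that enforces uniqueness of global identifiers."""

    def __init__(self) -> None:
        self._global_ids = set()
        self.objects: List[DigitalObject] = []

    def register(self, obj: DigitalObject) -> None:
        new_ids = [i.value for i in obj.identifiers if i.is_global]
        duplicated = len(new_ids) != len(set(new_ids))
        if duplicated or self._global_ids.intersection(new_ids):
            raise ValueError("global identifiers must be unique")
        self._global_ids.update(new_ids)
        self.objects.append(obj)
```

Local identifiers are deliberately left unchecked here, since the slide only requires uniqueness for global ones.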
Schema (1/3)
Model concepts (2/3)
• Activities have digital objects as input and output, are carried out by Actors and are subject to Rights
• Activity types:
  - Creation
  - Deletion
  - Modification
  - Alteration
  - Copy
  - Read
  - In all of them, except Read and Deletion, we assume that the output is a new object
• We keep the sequence of performed Activities by assigning the appropriate attribute
• Effects can be used as a space-saving device when versions need not be kept
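The rules for Activities stated on this slide can be illustrated with a short sketch. It is an assumption-laden example rather than the model's actual schema: the names `History`, `record`, `kind` and `sequence` are hypothetical, objects and actors are represented by plain strings, and Effects are not modelled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ActivityType(Enum):
    CREATION = "creation"
    DELETION = "deletion"
    MODIFICATION = "modification"
    ALTERATION = "alteration"
    COPY = "copy"
    READ = "read"


# Read and Deletion are the two activity types that yield no new object.
NO_NEW_OUTPUT = {ActivityType.READ, ActivityType.DELETION}


@dataclass
class Activity:
    kind: ActivityType
    actor: str                    # who carried out the activity
    inputs: List[str]             # identifiers of the input digital objects
    output: Optional[str] = None  # identifier of the new object, if any
    rights: List[str] = field(default_factory=list)
    sequence: int = 0             # position in the order of performed activities


class History:
    """Hypothetical log that keeps activities in the order they were performed."""

    def __init__(self) -> None:
        self.activities: List[Activity] = []

    def record(self, kind: ActivityType, actor: str, inputs: List[str],
               output: Optional[str] = None) -> Activity:
        produces_object = kind not in NO_NEW_OUTPUT
        if produces_object and output is None:
            raise ValueError(f"{kind.value} must produce a new object")
        if not produces_object and output is not None:
            raise ValueError(f"{kind.value} does not produce a new object")
        activity = Activity(kind, actor, list(inputs), output,
                            sequence=len(self.activities) + 1)
        self.activities.append(activity)
        return activity
```

Here the sequence attribute is simply a running counter; a timestamp or a link to the preceding activity would serve the same purpose.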
Schema (2/3)
Model concepts (3/3)
• Activities require the appropriate Technical Equipment to be performed
  - Software
  - Hardware
    • These are all specializations of E71 Man-Made Thing
  - The software needed depends on the Type and Format of the object
• Information carrier also requires Technical Equipment
  - For reading the object
  - For writing the object
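The dependency between an object's Type and Format, its carrier, and the Technical Equipment needed can be sketched as a simple lookup. This is a hedged illustration only: `required_equipment`, `software_index` and the field names on `InformationCarrier` are hypothetical, and the base class merely stands in for a specialization of CIDOC CRM E71.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TechnicalEquipment:
    """Illustrative stand-in for a specialization of E71 Man-Made Thing."""
    name: str


@dataclass(frozen=True)
class Software(TechnicalEquipment):
    pass


@dataclass(frozen=True)
class Hardware(TechnicalEquipment):
    pass


@dataclass
class InformationCarrier:
    name: str
    # Equipment the carrier itself needs for reading and for writing objects.
    read_equipment: List[TechnicalEquipment] = field(default_factory=list)
    write_equipment: List[TechnicalEquipment] = field(default_factory=list)


# Maps (object type, object format) to the software needed to handle it.
SoftwareIndex = Dict[Tuple[str, str], List[Software]]


def required_equipment(object_type: str, object_format: str,
                       carrier: InformationCarrier,
                       software_index: SoftwareIndex,
                       writing: bool = False) -> List[TechnicalEquipment]:
    """Collect the software for the type/format plus the carrier's equipment."""
    key = (object_type, object_format)
    if key not in software_index:
        raise KeyError(f"no software recorded for {key}")
    carrier_equipment = (carrier.write_equipment if writing
                         else carrier.read_equipment)
    return list(software_index[key]) + list(carrier_equipment)
```

Recording this mapping as metadata is what lets a later reader find out which equipment must still be available, or emulated, to access an object.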
Schema (3/3)
The complete model
Conclusion
• Metadata elements drawn from existing metadata sets
• Conceptual model for digital preservation
  - Previous works included only lists of metadata elements
  - Extensible as needed
• Compatible with CIDOC CRM
• Digital objects as
  - digital surrogates of non-digital objects
  - cultural objects in their own right
Further research
• Historical processes:
  - interpretation
  - CIDOC CRM domain of application
• Preservation processes:
  - decision and production processes
  - prescription and monitoring
→ Explore differences in modelling requirements
Thank you for your attention!