HATHI TRUST A Shared Digital Repository Digital Repositories
- Slides: 28
HATHI TRUST A Shared Digital Repository Digital Repositories for Preservation and Access Digital Directions 2013 Jeremy York July 22, 2013 Unless otherwise noted, these slides and their contents are licensed under a Creative Commons Attribution Unported License.
Digital repositories • Primary mission to preserve content • Performs actions to this end
Reasons to preserve content • For access • Guard against threats to content – Digitization is an accepted method of preservation reformatting – Digital content deteriorates and is fragile
Reasons to provide access • Meet needs of designated community • Check on integrity of content • Content that is accessible is more likely to be valued and preserved in the future
Reasons access might not be offered • Copyright • Privacy • Licensing • Needs of user community – Content available elsewhere • Technical limitations – Networking and storage requirements
A number of models • Full user access to preserved digital objects • No end-user access to digital objects • Delayed or triggered user access to digital objects • Partial access to digital objects
Requirements to preserve content • OAIS – “An OAIS is an Archive, consisting of an organization... of people and systems that has accepted the responsibility to preserve information and make it available for a Designated Community.” [does not imply unrestricted access]
OAIS • Support information model – Define target of preservation (content data and representation information) – Define metadata needed to preserve, identify, contextualize information (PDI) • Fulfill responsibilities – Accept information from Producers – Obtain control sufficient to preserve – Ensure understandable to designated community – Ensure preservation – Make available to designated community with information supporting authenticity
Ensure preservation • Some strategies: – Transformation – Validation – Checks on integrity – Replication – Choice of formats – Migration
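Two of the strategies above, checks on integrity and replication, can be illustrated together. This is a minimal sketch, not a description of any particular repository's audit process: the site names, the sample bytes, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from typing import Dict


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(replicas: Dict[str, bytes], recorded_digest: str) -> Dict[str, bool]:
    """Compare every replica against the digest recorded at ingest.

    `replicas` maps a hypothetical storage-site name to the bytes held there.
    """
    return {site: sha256_of(data) == recorded_digest for site, data in replicas.items()}


original = b"page 1 image bytes"
recorded = sha256_of(original)  # stored as fixity metadata at ingest

replicas = {
    "site_a": original,
    "site_b": original,
    "site_c": b"page 1 image bytez",  # simulated corruption
}

report = audit_replicas(replicas, recorded)
damaged = [site for site, ok in report.items() if not ok]
```

A replica that fails the check can then be repaired from one that passes, which is why integrity checks and replication are usually planned together.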
TRAC • Starts with “a mission to provide reliable, long-term access to managed digital resources to its designated community, now and into the future” • Encompasses – Organizational Infrastructure – Digital Object Management – Technical Infrastructure
TRAC (2) • Borrows vocabulary from OAIS • Adapts ideas for applying criteria from nestor and Digital Curation Centre – Documentation (evidence) – Transparency – Adequacy – Measurability
Preserve Content [summary diagram] • Mission • Designated Community • OAIS • TRAC • Content Data • Representation Information • Provenance • Reference • Context • Fixity • Access Rights • Preservation Actions • Integrity • Authenticity • Reliability • Transparency • Documentation • Adequacy • Measurability • Organizational Infrastructure • Digital Object Management • Technical Infrastructure
Where does access come in? • Some level of access is necessary – Management, integrity • What is preserved may not be what is most useful to the end user • Implications across the repository
Content formats • Can the content you are preserving be delivered over the Web? – Will you be storing derivative files? – Is some kind of transformation needed? – Do the files offer consistent functionality? • Implications for scale of repository, access systems, changes to services • In HathiTrust: – Limited to 3 formats, largely uniform in technical characteristics • ITU G4 TIFF • JPEG 2000 • Unicode (with and without coordinates)
Storage of information about content • Is information about the object adequately available for both preservation and access? – Structural information – Preservation information with implications for interface • HathiTrust uses METS as a wrapper – Available for preservation and access
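To show how one METS wrapper can serve both preservation and access, here is a small sketch that reads a file inventory and a page sequence from the same document. The sample METS is a hypothetical, heavily trimmed illustration (the file names, IDs, and USE values are assumptions); only the METS and XLink namespaces and element names come from the METS schema.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Hypothetical, minimal METS document for a two-page object.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="image">
      <mets:file ID="IMG1"><mets:FLocat LOCTYPE="OTHER" xlink:href="00000001.jp2"/></mets:file>
      <mets:file ID="IMG2"><mets:FLocat LOCTYPE="OTHER" xlink:href="00000002.jp2"/></mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="ocr">
      <mets:file ID="OCR1"><mets:FLocat LOCTYPE="OTHER" xlink:href="00000001.txt"/></mets:file>
      <mets:file ID="OCR2"><mets:FLocat LOCTYPE="OTHER" xlink:href="00000002.txt"/></mets:file>
    </mets:fileGrp>
  </mets:fileSec>
  <mets:structMap TYPE="physical">
    <mets:div TYPE="volume">
      <mets:div TYPE="page" ORDER="2"><mets:fptr FILEID="IMG2"/><mets:fptr FILEID="OCR2"/></mets:div>
      <mets:div TYPE="page" ORDER="1" LABEL="FRONT_COVER"><mets:fptr FILEID="IMG1"/><mets:fptr FILEID="OCR1"/></mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def files_by_use(root):
    """Preservation view: which files exist, grouped by the fileGrp USE attribute."""
    result = {}
    for grp in root.findall("mets:fileSec/mets:fileGrp", NS):
        hrefs = [
            flocat.get("{http://www.w3.org/1999/xlink}href")
            for flocat in grp.findall("mets:file/mets:FLocat", NS)
        ]
        result[grp.get("USE")] = hrefs
    return result


def page_sequence(root):
    """Access view: pages in reading order, with the file IDs an interface would load."""
    pages = []
    for div in root.findall(".//mets:div[@TYPE='page']", NS):
        file_ids = [fptr.get("FILEID") for fptr in div.findall("mets:fptr", NS)]
        pages.append((int(div.get("ORDER")), div.get("LABEL"), file_ids))
    return sorted(pages, key=lambda page: page[0])


root = ET.fromstring(SAMPLE_METS)
inventory = files_by_use(root)
pages = page_sequence(root)
```

The inventory supports audits of what is stored, while the ordered page list is what a page-turner interface needs; both come from one record.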
Content Package [diagram] • Zip – images – text – Source METS • HT METS
Architecture • ../uc1/pairtree_root/b3/54/34/86/ – b34543486.zip (images, text, Source METS) – b34543486.mets.xml (HT METS)
Storage • Does the storage system support needs for ingest and access? • In HathiTrust: – Need to have fast access to repository systems to support services
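The directory layout above follows the Pairtree convention, in which an identifier is cleaned and then split into two-character directory names. The sketch below is a simplified reading of that convention, not HathiTrust's own code. The function names are hypothetical, and the identifier `b3543486` is an illustrative value chosen because it maps to the `b3/54/34/86` path shown on the slide.

```python
# Characters that Pairtree hex-encodes as ^hh before the one-to-one mapping.
_HEX_ENCODE = set('"*+,<=>?\\^|')
# Second cleaning step: single-character substitutions.
_SINGLE_CHAR_MAP = str.maketrans({"/": "=", ":": "+", ".": ","})


def clean_id(identifier: str) -> str:
    """Make an identifier safe to use as a file-system name."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append("^%02x" % byte)
        else:
            out.append(ch)
    return "".join(out).translate(_SINGLE_CHAR_MAP)


def pairtree_path(identifier: str) -> str:
    """Split the cleaned identifier into two-character path segments."""
    cleaned = clean_id(identifier)
    return "/".join(cleaned[i:i + 2] for i in range(0, len(cleaned), 2))


def storage_prefix(namespace: str, identifier: str) -> str:
    """Assumed layout: <namespace>/pairtree_root/<pair path>."""
    return f"{namespace}/pairtree_root/{pairtree_path(identifier)}"


prefix = storage_prefix("uc1", "b3543486")
ark_path = pairtree_path("ark:/13960/t00000431")
```

Because the path is computed from the identifier alone, any service that knows an object's namespace and identifier can locate its files without a lookup table.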
Security • Data Integrity – Checksum validation, digital object provenance • Physical security – Biometric door systems, locked racks • Network security – Firewalling, vulnerability scanning • Application security – Developer best practices, input validation • Access control…
Differential access to content • Rights database – Ensures appropriate access • Holdings database – Facilitates lawful uses of materials
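A rights database and a holdings database can be combined into a single access decision. The sketch below is purely illustrative: the rights codes, the volume and institution names, and the policy itself are assumptions made for the example and should not be read as HathiTrust's actual rules.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical rights database: volume id -> rights code.
RIGHTS = {"vol1": "pd", "vol2": "pdus", "vol3": "ic"}
# Hypothetical holdings database: volume id -> institutions holding a print copy.
HOLDINGS = {"vol3": {"univ_a"}}


@dataclass(frozen=True)
class User:
    institution: Optional[str]
    in_us: bool


def access_level(volume_id: str, user: User) -> str:
    """Rights check: decide between full view and search-only access."""
    code = RIGHTS.get(volume_id, "und")  # undetermined when no record exists
    if code == "pd":
        return "full_view"
    if code == "pdus":
        return "full_view" if user.in_us else "search_only"
    return "search_only"


def institution_holds(volume_id: str, user: User) -> bool:
    """Holdings check: does the user's institution hold this volume in print?

    Assumed here to be a precondition for some lawful uses of restricted items.
    """
    return user.institution in HOLDINGS.get(volume_id, set())


us_member = User(institution="univ_a", in_us=True)
non_us_guest = User(institution=None, in_us=False)

levels = [
    access_level("vol1", non_us_guest),
    access_level("vol2", us_member),
    access_level("vol2", non_us_guest),
    access_level("vol3", us_member),
]
```

Keeping the two lookups separate means that a change in rights status or in an institution's holdings takes effect without touching the stored objects.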
Authentication/Authorization • Mechanisms to enable differential access, ensure security and appropriate use
User services • Bibliographic and full-text search indexes • Collection-building capabilities • User interfaces
APIs and Datasets • Data API • Bibliographic API • OAI • “Hathifiles” • Datasets
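As an illustration of how a client might address the Bibliographic API, the sketch below only builds a request URL; it makes no network call. The base URL, the `brief`/`full` detail levels, and the list of identifier types are assumptions based on the publicly documented URL pattern and should be checked against the current API documentation before use.

```python
from urllib.parse import quote

# Assumed base URL and parameters; verify against the current API documentation.
BASE_URL = "https://catalog.hathitrust.org/api/volumes"
DETAIL_LEVELS = ("brief", "full")
ID_TYPES = ("oclc", "lccn", "issn", "isbn", "htid", "recordnumber")


def bib_api_url(id_type: str, value, detail: str = "brief") -> str:
    """Build a single-identifier lookup URL that returns JSON."""
    if detail not in DETAIL_LEVELS:
        raise ValueError(f"unknown detail level: {detail!r}")
    if id_type not in ID_TYPES:
        raise ValueError(f"unknown identifier type: {id_type!r}")
    return f"{BASE_URL}/{detail}/{id_type}/{quote(str(value), safe='')}.json"


url = bib_api_url("oclc", 424023)
```

Validating the identifier type before building the URL gives callers an immediate error instead of an unhelpful response from the server.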
More • Quality • User Support • Correction
Provide Access [summary diagram] • Content Formats • Content Package • Architecture • Storage • Security • Authentication • Authorization • Differential Access • Copyright/Agreements • Lawful Uses • Indexes • Services / User Interfaces • APIs and Datasets • Information Quality • User Support • Correction
Preservation and Access [combined summary diagram] • Preservation – Mission – Designated Community – OAIS – TRAC – Content Data – Representation Information – Provenance – Reference – Context – Fixity – Access Rights – Preservation Actions – Integrity – Authenticity – Reliability – Transparency – Documentation – Adequacy – Measurability – Organizational Infrastructure – Digital Object Management – Technical Infrastructure • Access – Content Formats – Content Package – Architecture – Storage – Security – Authentication – Authorization – Differential Access – Copyright/Agreements – Lawful Uses – Indexes – Services / User Interfaces – APIs and Datasets – Information Quality – User Support – Correction
Thank you!
How to find out more • About: http://www.hathitrust.org/about • Twitter: http://twitter.com/hathitrust • Facebook: http://www.facebook.com/hathitrust • Monthly newsletter: – http://www.hathitrust.org/updates – RSS: http://www.hathitrust.org/updates_rss • Contact us: feedback@issues.hathitrust.org • Blogs: http://www.hathitrust.org/blogs – Large-scale Search – Perspectives from HathiTrust