
The Fedora Digital Repository Project and the National Science Digital Library (NSDL)
July 26, 2005
Dean B. Krafft, Cornell University

Fedora: Repository Middleware • A Flexible, Extensible Digital Object Repository Architecture • An architecture and toolkit (like IIS or SQL Server), not a vertical application • Audience: system builders – 12 major university or national (Denmark) digital libraries • DSpace in contrast: a vertical application with a fixed workflow targeted at users • So far incorporated in two commercial products: VTLS’s Vital digital library, and Company X’s product – finalist for a large government contract

Fedora: Project Details • Collaboration of Cornell and UVa • Development team of 10 developers plus leads • Currently implemented in Java; licensed under the Mozilla Public License • Funded by Mellon: starting second 3-year, $1.4M grant • Cornell leads: Sandy Payette & Carl Lagoze • 20,000+ downloads, active user community • Use cases: Digital Asset Management, Scholarly Publishing, Information Network Overlay, Institutional Repository, Digital Archive and Records Management, Digital Library

Fedora Digital Object Model: Component View
• Persistent ID (PID): digital object identifier
• Reserved Datastreams: key object metadata
  – Relations (RELS-EXT)
  – Dublin Core (DC)
  – Audit Trail (AUDIT)
• Datastreams: set of content or metadata items (local or external URL redirects)
• Disseminators (including the Default Disseminator): Web-service methods for distributing views of recombined content
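The component view above can be read as a small data structure. The following is a minimal Python sketch of that reading, not Fedora's actual Java implementation: the class names, field names, PID `demo:1`, and the example URL are all illustrative assumptions. Only the reserved datastream IDs (DC, RELS-EXT, AUDIT) come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Reserved datastream IDs named on the slide.
RESERVED_IDS = ("DC", "RELS-EXT", "AUDIT")


@dataclass
class Datastream:
    """One content or metadata item (hypothetical model)."""
    ds_id: str
    content: Optional[bytes] = None   # set when the item is held locally
    location: Optional[str] = None    # set when the item is an external URL

    def is_external(self) -> bool:
        return self.location is not None


@dataclass
class DigitalObject:
    """A digital object: a persistent ID plus a set of datastreams."""
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)

    def add(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def reserved(self) -> List[Datastream]:
        """Datastreams carrying key object metadata."""
        return [d for i, d in self.datastreams.items() if i in RESERVED_IDS]

    def content_streams(self) -> List[Datastream]:
        """All other datastreams (content or additional metadata)."""
        return [d for i, d in self.datastreams.items() if i not in RESERVED_IDS]


# Illustrative object: two reserved datastreams and one external item.
obj = DigitalObject(pid="demo:1")
obj.add(Datastream("DC", content=b"<dc>...</dc>"))
obj.add(Datastream("RELS-EXT", content=b"<rdf/>"))
obj.add(Datastream("IMAGE", location="http://example.org/image.jpg"))
```

Disseminators are left out of the sketch; in this reading they would be methods that combine one or more datastreams into a delivered view.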

Fedora Repository Service • Set of SOAP/REST services: Manage, Access, Search, Query • Fundamental store is XML, with RDBMS cache (Oracle, MySQL), and RDF triple store for relationship queries • Modular architecture: Manage, Access, Storage, Dissemination, Authentication, Authorization, RDF Resource Index

Fedora 2.0 Capabilities
• Object-to-object Relationships
  – Ontology of common relationships (RDF schema)
  – Relationships stored in special datastream (RELS-EXT)
• Resource Index (RI)
  – RDF-based index of repository (Kowari triple-store)
  – Graph-based index includes:
    • Object properties and Dublin Core
    • Object Relationships and Object Disseminations
  – Powerful querying of graph of inter-related objects
  – REST-based query interface (using RDQL or iTQL)
  – Results in different formats (triples, tuples, SPARQL)
• Fedora 2.1 (August 2005) adds
  – Plug-in Authentication modules
  – Fine-grained Authorization using XACML XML-based policies
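To illustrate what a graph-based index of (subject, predicate, object) triples makes possible, here is a minimal in-memory sketch in Python. It is not the Kowari triple store and does not use RDQL or iTQL syntax. The `match` helper, the object identifiers, and the predicate names are hypothetical, chosen only to show pattern matching over relationship triples.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical index contents: object relationships and a Dublin Core property.
triples: List[Triple] = [
    ("demo:article1", "isMemberOf", "demo:collection1"),
    ("demo:article2", "isMemberOf", "demo:collection1"),
    ("demo:article1", "dc:title", "First article"),
]

# "Which objects are members of demo:collection1?"
members = [s for s, _, _ in match(triples, p="isMemberOf", o="demo:collection1")]
```

A real triple store adds indexing and a query language on top of this idea, but the pattern-with-wildcards model is the same.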

Fedora Service Framework (v2.1 & Planned 2005/6-2006/7)

National Science Digital Library • K-gray Science, Technology, Engineering, and Mathematics (STEM) education • NSF-created brand home for digital resources of known high quality • Community of users, contributors and institutions (as providers and consumers) • Creates context for resources (e.g., lesson plans, standards alignment, ratings, annotations, reviews, brands) • Guides selection & use; not just discovery


Program Details • Major NSF Division of Undergraduate Education program, over $20M/yr funding • Over 120 NSF grants in program • Core Integration collaboration of UCAR, Columbia University and Cornell University • Cornell provides core technical infrastructure: Fedora-based repository, Lucene-based search, nsdl.org portal • Columbia: Shibboleth authentication; SDSC: Storage Resource Broker archive

What Fedora Provides NSDL • Objects: Aggregators (collections), Metadata Providers, Agents, Resources (with local or remote content), Metadata • Relationships: Structural (part of), Equivalence, Membership, arbitrary graph queries • Network overlay architecture: A lens for viewing science content on the net, whether content is local, remote, or archived – it all has a repository-based URL • Web services: disseminations are arbitrary recombinations of content • Authentication/Authorization: Collections and services manage their own repository content
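The membership relationships and repository-based URLs described above can be sketched as a small graph traversal. This Python example is an illustration only: the base URL, the object identifiers, and the `members` and `repository_url` helpers are hypothetical and do not reflect the NSDL repository's actual addressing scheme or API.

```python
from collections import deque
from typing import Dict, List

# Hypothetical base URL; the real repository address is not given in the slides.
BASE_URL = "http://repository.example.org/objects/"

# Membership edges: each object maps to the aggregators it belongs to.
member_of: Dict[str, List[str]] = {
    "resource:1": ["agg:physics"],
    "resource:2": ["agg:physics"],
    "agg:physics": ["agg:science"],
    "resource:3": ["agg:science"],
}


def members(aggregator: str) -> List[str]:
    """All objects that belong to the aggregator, directly or through nested aggregators."""
    # Invert the membership edges so we can walk from aggregator to members.
    children: Dict[str, List[str]] = {}
    for child, parents in member_of.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)

    found: List[str] = []
    seen = {aggregator}
    queue = deque([aggregator])
    while queue:  # breadth-first walk over membership edges
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


def repository_url(object_id: str) -> str:
    """Every object gets a repository-based URL, wherever its content lives."""
    return BASE_URL + object_id
```

Calling `members("agg:science")` returns the nested aggregator and all three resources, and `repository_url` gives each one a uniform address regardless of whether its content is local, remote, or archived.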


Appendix – Additional Information
• Fedora website: http://www.fedora.info
• NSDL website: http://nsdl.org
• An Information Network Overlay Architecture for the NSDL by Lagoze, Krafft et al.: http://www.arxiv.org/abs/cs.DL/0501080
• Fedora: An Architecture for Complex Objects and their Relationships by Lagoze, Payette et al.: http://www.arxiv.org/abs/cs.DL/0501012

Selected Fedora Adopters
• Current Users:
  – National Science Digital Library (NSDL): Core Integration
  – University of Virginia – digital library
  – VTLS – library systems vendor selling Fedora-based product
  – Tufts University – digital library and university records management
  – OhioLINK – statewide consortium of academic libraries
  – Northwestern: Library and Academic Technologies – digital library
  – ARROW: National Library of Australia and Monash University – nationally distributed institutional repository project
  – Royal Library Denmark, National Library, and DTU – integrated national digital library
  – Rutgers University – digital library
  – Indiana University – digital library
  – American Geophysical Union – repository of back issues of journals
  – Library of Congress – National Digital Newspaper Project
  – University of Delaware – digital library
  – Hamilton College – digital library
  – Cornell CIT – Electronic File Cabinet to manage office records
  – Tibetan Buddhist Resource Center – digital library
  – Yale University – manage university records
  – DISA – South Africa, History of Apartheid resistance – record repository
• Interesting new proposals
  – Company X finalist for large government contract
  – Cornell Lab of Ornithology (data + tools + documents)

Fedora Development Consortium
• Advisory Board
  – University of Virginia
  – Tufts
  – VTLS
  – ARROW (Monash University and Nat’l Lib Australia)
  – Harris Corp.
  – Danish Royal Library and DTU
  – Northwestern University
  – NSDL – Core Integration
• Mission
  – Requirements Definition, Specifications, Joint Development
  – Commission of Working Groups
    • Content Modeling
    • Outreach and Education
    • Workflow and Service-Oriented Processes
  – Recommendation for Long-Term sustainability model
    • Governance and Funding
    • Set Fedora Free – full open source model (e.g., public SourceForge)
    • Code Maintenance (UVA until 2012; plan for beyond)