The Fedora Project Update as of January 2004

Ithaca, NY, January 29, 2004
Sandy Payette, Cornell Information Science

The Fedora Project
  • Fedora: Flexible Extensible Digital Object Repository Architecture
  • Fedora use cases
      - Digital library architecture
      - Digital asset management
      - Institutional repository
      - Content management system (CMS)
      - Scholarly publishing
      - Preservation support
  • Open source software
      - Not Red Hat!
      - Mozilla Public License

Fedora History
  • Research (1997-present):
      - DARPA and NSF-funded research project at Cornell
      - Reference implementation developed at Cornell
      - Payette, Lagoze, Dushay
  • First Application (1999-2001):
      - University of Virginia digital library prototype
      - Scale/stress testing for 10,000 objects
  • Open Source Software (2002-present):
      - Andrew W. Mellon Foundation granted UVA and Cornell $1 million to develop a production-quality Fedora system
      - Fedora 1.0 (May 2003)
      - Fedora 1.1 (Aug 2003)
      - Fedora 1.2 (Dec 2003)

Why Fedora?
  • Data model
      - Generic abstraction for heterogeneous digital resources
      - Flexibility to create different “content models”
      - No bifurcation of metadata and content
      - Aggregate both locally stored content and by-reference content
  • Distributed repositories
      - Common data model
      - Common APIs for access and management
      - Federation
  • Web services
      - Fedora is exposed via web services
      - Fedora can interact with other web services
      - Fedora uses WSDL and XML
  • Content repurposing
      - Provide multiple views of content/metadata
      - Dynamic transformations of content/metadata
      - Add new views/transformations over time
  • Object lifecycle and preservation
      - Content versioning
      - Event history
  • Easy integration with other applications and systems
      - Web services with open APIs
      - Does not assume any particular workflow or end-user application

Fedora in Use

Fedora Downloads as of Dec 2003
  • Total downloads: 4960
  • Average downloads per day: 19
  • Countries: 50; organizations: 360
  • Types of organizations:
      - Universities: libraries, IT, academic departments
      - Software and technology companies
      - Defense/military
      - Banks
      - National libraries and archives
      - Publishers
      - Research labs
      - Library automation vendors
      - Scholarly societies

Selected Projects Committed to Fedora
  • University of Virginia: digital library (images, EAD, e-texts)
  • Cornell and UVA: Tibetan and Himalayan Digital Library
  • VTLS (library systems): new commercial product (VITAL)
  • Tufts University: education (VUE/concept maps); digital library
  • Northwestern: academic technologies (images, art, video, e-texts)
  • Indiana University: EVIA Digital Archive (video)
  • Rutgers University: digital library (e-journals, numeric data)
  • New York University: Humanities Computing

A Sampling of Fedora Usage

Some active prototypes and evaluations (that we are tracking)
  • JSTOR-ArtStore-EArchives (Ithaka)
  • Harris Corporation (R&D; government systems; archives)
  • American Geophysical Union
  • National Library of Portugal
  • Monash University with National Library of Australia
  • NSDL at Cornell

Some interesting download sites (from our logs)
  • British Library
  • Society of Biblical Literature
  • National Archives of Australia
  • Office of Defense Resources, Thailand
  • Microsoft
  • Sun Microsystems
  • Apple
  • Cornell Information Technologies (CIT)

Digital Object Model

Digital Object Model: Architectural View
  • Persistent ID (PID): digital object identifier
  • Disseminators (Default, Your Extension): service perspective; methods for disseminating “views” of content
  • Datastreams (items): item perspective; set of content or metadata items
  • System Metadata: internal; key metadata necessary to manage the object

Digital Object Model: 4 Classifications for Datastreams
  • Datastream (Managed): Fedora stores and manages the content bytestream.
  • Datastream (External): Fedora stores a reference (URL) to the content.
  • Datastream (Redirect): Fedora stores a reference (URL) to the content, but will not mediate access to the content.
  • Datastream (XML): Fedora stores a namespaced block of XML content within the Fedora digital object XML file.
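
The four datastream classifications can be sketched as a small data type. This is an illustrative Python sketch, not Fedora code: the names `ControlGroup`, `Datastream`, `stores_content` and `mediates_access` are hypothetical, and the assumption that Fedora mediates access for External datastreams is inferred only from the contrast with Redirect on the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ControlGroup(Enum):
    """The four datastream classifications (enum name is hypothetical)."""
    MANAGED = "Managed"
    EXTERNAL = "External"
    REDIRECT = "Redirect"
    XML = "XML"


@dataclass
class Datastream:
    ds_id: str
    group: ControlGroup
    location: Optional[str] = None   # URL, used by External and Redirect
    content: Optional[bytes] = None  # bytes, used by Managed and XML


def stores_content(ds: Datastream) -> bool:
    # Managed and XML datastreams keep the content inside the repository;
    # External and Redirect keep only a reference (URL).
    return ds.group in (ControlGroup.MANAGED, ControlGroup.XML)


def mediates_access(ds: Datastream) -> bool:
    # Assumption: only Redirect bypasses the repository when content is fetched.
    return ds.group is not ControlGroup.REDIRECT


thumb = Datastream("THUMB", ControlGroup.MANAGED, content=b"...")
video = Datastream("VIDEO", ControlGroup.REDIRECT,
                   location="http://example.org/stream")
```

Here `thumb` is stored and served by the repository, while `video` is only referenced and the client is sent to the URL directly.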

Digital Object Model: Example “content model”
  • PID = uva-lib:100
  • Multiple disseminations
      - Default views: Get Profile, List Items/Get Item, List Methods
      - Image views: Get Thumbnail, Get Medium, Get High
      - Metadata views: Get MARC, Get DC, Get OAI_DC
  • Datastreams: Image (mrsid), DC (xml), Thumbnail (jpeg)
  • System Metadata
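
One way to picture the example object is as a set of datastreams plus named methods that each produce a view from them. The Python sketch below is illustrative only: the binding of each method to a particular datastream is an assumption (the slide does not state it), the byte contents are placeholders, and `list_methods` and `disseminate` are hypothetical helper names.

```python
from typing import Callable, Dict

# Datastreams of the example object uva-lib:100 (contents are placeholders).
datastreams: Dict[str, bytes] = {
    "Image (mrsid)": b"<mrsid bytes>",
    "DC (xml)": b"<dc/>",
    "Thumbnail (jpeg)": b"<jpeg bytes>",
}

# Hypothetical bindings from dissemination method to datastream.
# A real disseminator could also transform the content on the way out.
bindings: Dict[str, Callable[[Dict[str, bytes]], bytes]] = {
    "Get Thumbnail": lambda ds: ds["Thumbnail (jpeg)"],
    "Get High": lambda ds: ds["Image (mrsid)"],
    "Get DC": lambda ds: ds["DC (xml)"],
}


def list_methods() -> list:
    """Return the names of the views this object can disseminate."""
    return sorted(bindings)


def disseminate(method: str) -> bytes:
    """Produce one view of the object's content."""
    if method not in bindings:
        raise KeyError(f"object does not support method: {method}")
    return bindings[method](datastreams)
```

New views can be added later by registering another entry in `bindings` without touching the stored datastreams.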

Digital Object Model: Service Relationships
  • Data Object
      - Persistent ID (PID)
      - System Metadata
      - Datastreams
      - Disseminators
  • Behavior Definition Object
      - Persistent ID (PID)
      - System Metadata
      - Datastreams: Service Definition Metadata (WSDL)
  • Behavior Mechanism Object
      - Persistent ID (PID)
      - System Metadata
      - Datastreams: Service Binding Metadata (WSDL)
  • External Service

Repository System Architecture and Software

Fedora Server Design: 3 Layers
1. Interface
  • Web Service for Access/Search
  • Web Service for Management
  • OAI Provider
2. Application Logic
  • Implements all functionality in terms of the Fedora digital object model.
3. Storage
  • RDBMS
      - Object “cache” for performance
      - Digital object registry
  • XML object serializations
      - Authoritative object with versioning
      - All management operations on XML
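
The storage layer's division of labor (authoritative XML, relational cache for fast lookup) can be sketched as follows. This is a minimal Python illustration using SQLite and in-memory data, not the Fedora implementation: the element name `digitalObject`, the table `object_registry`, and all function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Authoritative store: every saved version of each object's XML serialization.
xml_store = {}  # PID -> list of XML strings, oldest first

# Relational "cache" and registry, rebuilt from the XML whenever it changes.
cache = sqlite3.connect(":memory:")
cache.execute("CREATE TABLE object_registry (pid TEXT PRIMARY KEY, label TEXT)")


def save_object(pid: str, label: str) -> None:
    """Management operations write XML first, then replicate to the cache."""
    root = ET.Element("digitalObject", {"PID": pid, "LABEL": label})
    xml_store.setdefault(pid, []).append(ET.tostring(root, encoding="unicode"))
    replicate(pid)


def replicate(pid: str) -> None:
    """Copy the latest XML version of an object into the relational cache."""
    latest = ET.fromstring(xml_store[pid][-1])
    cache.execute(
        "INSERT OR REPLACE INTO object_registry (pid, label) VALUES (?, ?)",
        (pid, latest.get("LABEL")),
    )
    cache.commit()


def lookup_label(pid: str):
    """Read path: answer from the cache without parsing XML."""
    row = cache.execute(
        "SELECT label FROM object_registry WHERE pid = ?", (pid,)
    ).fetchone()
    return row[0] if row else None
```

Because the XML is authoritative, the cache can be dropped and rebuilt by calling `replicate` for every PID in `xml_store`.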

Fedora Server Architecture

Fedora Repository System: Client and Web Service Interactions
[Diagram. Labels: user, client application, web browser, Frontend, Fedora Repository System, Web Service Dispatch, Backend Web Service, Content Transform Service]

Fedora 1.2: Server Feature Set
  • Management Web Service
      - Identify - generate unique object identifiers (but will accept your identifier)
      - Ingest - object submission in XML format (e.g., METS)
      - Create - interactive object creation via API calls
      - Maintain - interactive object modification and deletion via API calls
      - Export - provides a copy of an object encoded in XML format (e.g., METS)
      - Purge - permanently remove objects from repository
  • Access and Search Web Service
      - Search - locate objects via the default repository index
      - Reflect - describe the disseminations an object can provide
      - Disseminate - deliver a view of an object’s content
  • OAI-PMH Provider Service
      - Request - OAI-DC records
  • Internal Features
      - Modules - system is configurable and modules can be replaced
      - Performance - relational db object cache
      - Storage - XML object wrappers; datastreams in native formats
      - Replicate - XML object store to relational cache
      - Validate - application of integrity rules to objects
      - Secure - basic HTTP authentication and simple access control
      - Preserve - automatic content versioning and audit trail
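
As an illustration of the OAI-PMH provider, a harvester asks for Dublin Core records with a plain HTTP GET whose query string names a protocol verb. The Python sketch below only builds such request URLs. The `verb`, `metadataPrefix` and `identifier` arguments come from the OAI-PMH protocol itself; the base URL and the helper name `oai_request_url` are assumptions, not something stated in the slides.

```python
from urllib.parse import urlencode


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL: base URL plus verb and its arguments."""
    params = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical provider endpoint; substitute the repository's real OAI URL.
BASE = "http://example.org/oai"

# Ask for all records in unqualified Dublin Core.
list_url = oai_request_url(BASE, "ListRecords", metadataPrefix="oai_dc")

# Ask the provider to describe itself.
identify_url = oai_request_url(BASE, "Identify")
```

The response to either request is an XML document that the harvester parses; fetching and parsing are left out of this sketch.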

Fedora 1.2: Clients
  • Fedora Administrator
      - Java Swing client
      - Create/maintain objects
      - Search repository
      - Wizards for behavior objects
  • Web Browser (via Fedora URL syntax)
      - REST-based search
      - REST-based access to objects
  • Command Line Utilities
      - Batch loading
      - Ingest, purge, more
  • Migration Utility
      - General-purpose mass export/ingest
      - Supports upgrading to new versions of Fedora

Fedora Software Package
  • Open Source (Mozilla Public License)
  • 100% Java (Sun Java J2SDK 1.4)
  • Supporting Technologies
      - Apache Tomcat 4.1 and Apache Axis (SOAP)
      - Xerces 2-2.0.2 for XML parsing and validation
      - Saxon 6.5 for XSLT transformation
      - Schematron 1.5 for validation
      - MySQL and Mckoi relational database
      - Oracle 9i support
  • Deployment Platforms
      - Windows 2000, NT, XP
      - Solaris
      - Linux
      - Mac OSX (upcoming)

Fedora Demos

UVA EAD Collections [Search] [Angelica]

UVA Images [image]

Fedora @ Tufts: content maps
Faculty may sketch out their course content, relationships and pathways through this content using a simple set of moveable objects or nodes.
[Screenshot labels: container node, web resource, file node, relationship, notes]
Slide courtesy of David Kahle

Fedora @ Tufts: OKI & FEDORA
Leveraging OKI technical standards will facilitate the sharing, distribution and integration of this new educational tool in educational systems beyond Tufts.
Slide courtesy of David Kahle

Fedora @ Northwestern: Content Models
[Chart relating genre of digital resource to types of behaviors. Labels: Image, Core, Image, Hi-Res, Layered, Geo, Time, Text, Map, A/V, Book, News, EText]
Chart courtesy of Bill Parod

Fedora @ Northwestern: Dissemination
Merge two datastreams: image with metadata
Image courtesy of Bill Parod
[images] [art]

Fedora @ Northwestern: Dissemination
Repurpose datastream: image with Flash zoom viewer
Image courtesy of Bill Parod
[images] [art]

Fedora Administrator [Demo Object]
(Demo runs locally. Not available via public URL.)

Fedora Future

Fedora 1.3 / 2.0 (Jan-Dec 2004)
  • Fedora Object XML (FOXML)
      - New internal storage format
      - Relationships metadata
      - Better support for event history
      - Format identifiers for dynamic service binding and OAI formats
  • Performance
      - Scale testing (benchmark ~10 million objects)
      - Concurrent usage stress
      - Performance tuning as needed (ingest, dissemination)
  • Advanced Access Control
      - Authentication (plug-in modules for common schemes; Shibboleth)
      - XACML policy expression language
      - Fedora policy enforcement module
  • Web forms for easy content submission
  • Batch object modification utility
  • Administrative reporting
  • New ingest and export formats (FOXML, METS 1.3, DIDL)
  • Various enhancements and special requests

Next Development Proposal
  • Fedora R2R - Distributed, Federated Repositories
      - Shared name resolution service
      - Any repository can fulfill a dissemination request within a federation
      - Fedora Proxy Service for distributed virtual repository
      - Federated or distributed searching (Z39.50, OAI, other approaches)?
      - Shared web services (for behaviors)
      - Repositories as Service Registries (like UDDI)
  • Fedora Power Server
      - High performance (>10 million objects)
      - Storage expansion schemes
      - Mirroring and replication
      - Repository clustering
      - Load balancing
      - Preservation feature set
      - Quality of Service (QoS) and fault tolerance?
  • Object Creation Tools
      - Simple workflow utilities based on content models
      - Object “workbenches”
      - Web interface for document/content submission

Questions
www.fedora.info
