Tutorial Semantic Digital Libraries Existing Semantic Digital Libraries
Tutorial – Semantic Digital Libraries
Existing Semantic Digital Libraries Solutions – FEDORA
Sandy Payette, Director, Fedora Project, Cornell University
Dean Krafft, PI, NSDL, Cornell University
Tutorial – Semantic Digital Libraries, May 9, 2007, WWW 2007
Copyright 2006-2007, DERI NUI Galway, University of Vienna, Fraunhofer IPSI, Cornell University
Fedora Semantic Digital Libraries enable …
• Scholarly and Scientific Workbenches
• “Web 2.0” Collaborative Repositories
• Linking Data and Publications
• Museum Exhibits with Lesson Plans
• Blog and wiki
Fedora - Technology Integration
• Semantic
  – Traverse graph
  – Relate
  – Contextualize
  – Inference
  – Query
• Repository
  – Digital Objects
  – Manage
  – Access
  – Versioning
  – Storage
• Process
  – Workflow
  – Messaging
  – Transactions
• Preservation
  – Integrity
  – Monitoring
  – Alerting
  – Migration
  – Replication
The Fedora Project
• Fedora
  – Flexible Extensible Digital Object Repository Architecture
• History
  – Cornell Research (1997-2002)
  – DARPA and NSF-funded research and reference implementations
  – Distributed, Interoperable Repositories (experiments with CNRI)
  – Open Source Project (2002-present)
  – Andrew W. Mellon Foundation (2002-2009)
  – Joint development by Cornell University and University of Virginia
  – Transitioning into non-profit organization (Fedora Commons 501(c)(3))
Fedora Macro Roadmap
Timeline: 2005, Q4 2007, Now, Q2 2009, 2010, 2011 onward
• Fedora Phase 2
  – Semantic Technologies
  – Service Framework
• Fedora Enterprise
  – Workflow Engine and Supporting Tools
  – Message-Oriented Middleware and ESB
  – Distributed Transactions
• Fedora Commons
  – Technical: Evolution of Semantic-Repo-Service Integrated Platform
  – Community Building: Foster Development and Outreach
  – Business Model: Tapping ongoing sources of funding
Relevant Technology Orientation for Fedora
• Service-oriented architecture (SOA)
• Web 2.0
• Semantic Technologies (RDF, OWL)
Fedora Service Framework
SOA
Fedora Technology – Enabling Position
• Semantic Digital Libraries
• Web 2.0 Applications
• Collaborative Applications
Semantic technologies integrated with repositories enable many different applications.
Motivations: Fedora and Semantic Technologies (RDF)
• A natural model for exposing repository as network of objects
  – Object-to-object relationships
  – Relationships to external entities
  – Query the graph; traversal to discover related stuff
• Indexing based on generalizable data model
  – Graph-based data model is a common reduction
  – Avoid fixed schema problems and metadata mud wrestling
• Extensible enrichment of object descriptions
  – Keep overlaying statements from multiple ontologies
  – Organic evolution
• Powerful queries and inference for repository management
  – Transitive relationships among objects
  – Dependency analysis; Detection/Extraction of sub-graphs
  – Provenance of disseminations
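The "transitive relationships among objects" motivation can be illustrated with a small sketch. This is not Fedora code: the triples, the identifiers, and the helper name transitive_targets are all hypothetical, and a real repository would delegate this traversal to its triplestore query language.

```python
from collections import deque

# Hypothetical (subject, predicate, object) triples; identifiers are illustrative only.
TRIPLES = [
    ("obj:1", "isMemberOf", "coll:a"),
    ("coll:a", "isMemberOf", "coll:b"),
    ("coll:b", "isMemberOf", "coll:c"),
    ("obj:2", "isMemberOf", "coll:b"),
]


def transitive_targets(triples, start, predicate):
    """Return every node reachable from `start` by repeatedly following `predicate`."""
    # Build an adjacency map restricted to the predicate of interest.
    adjacency = {}
    for s, p, o in triples:
        if p == predicate:
            adjacency.setdefault(s, []).append(o)

    # Breadth-first traversal; `seen` also guards against cycles in the graph.
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Calling transitive_targets(TRIPLES, "obj:1", "isMemberOf") collects the direct collection and every collection above it, which is the kind of question a flat, fixed-schema index handles poorly.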
RDF in the Fedora Digital Object Model
Digital Objects contain their RDF assertions
• Assert relationships from Fedora base ontology
  – Collection – member
  – Whole – part
  – Equivalence
  – Description Of
  – More…
• Assert relationships/properties from community ontologies
  – isAnnotationOf
  – isRecommendedBy
  – isCertifiedBy
  – More…
Example: Digital Object with “compositional semantics”
RDF “Relationships” Datastream

<foxml:datastream ID="RELS-EXT" CONTROL_GROUP="X">
  <foxml:datastreamVersion ID="RELS-EXT.0" MIMETYPE="text/xml" LABEL="RDF">
    <foxml:xmlContent>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" …>
        <rdf:Description rdf:about="info:fedora/nsdl:100">
          <fedora:isMemberOf rdf:resource="info:fedora/nsdl:nvo-49"/>
          <fedora:isMemberOf rdf:resource="info:fedora/nsdl:physics-48"/>
          <nsdl:reviewedBy rdf:resource="info:fedora/nsdl:ev-120"/>
          <nsdl:hasDataComponent rdf:resource="info:fedora/nsdl:nvo-11"/>
        </rdf:Description>
      </rdf:RDF>
    </foxml:xmlContent>
  </foxml:datastreamVersion>
</foxml:datastream>
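To show how such a datastream reduces to plain triples, here is a minimal sketch using only the Python standard library. The slide elides the fedora and nsdl namespace declarations, so the namespace URIs below are assumptions (the nsdl one is a placeholder), and extract_triples is a hypothetical helper, not part of Fedora.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The RDF payload from the slide. The fedora and nsdl namespace URIs are
# assumptions added here so the fragment parses; the slide elides them.
RELS_EXT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
         xmlns:nsdl="http://example.org/nsdl#">
  <rdf:Description rdf:about="info:fedora/nsdl:100">
    <fedora:isMemberOf rdf:resource="info:fedora/nsdl:nvo-49"/>
    <fedora:isMemberOf rdf:resource="info:fedora/nsdl:physics-48"/>
    <nsdl:reviewedBy rdf:resource="info:fedora/nsdl:ev-120"/>
    <nsdl:hasDataComponent rdf:resource="info:fedora/nsdl:nvo-11"/>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) URI triples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                continue  # literal-valued properties are skipped in this sketch
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.append((subject, predicate, obj))
    return triples
```

Each child element of rdf:Description becomes one triple whose subject is the digital object itself, which is why the object can carry its own assertions.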
Fedora RDF-based Resource Index (RI)
• NOT the core object store - RI is an index of the repository
• Automatic, incremental indexing into triplestore
• Search/query the repository via Fedora RI Query Interface
[Diagram: the RDF Index of Repository is built from the Digital Object Store: RELS-EXT datastream, Fedora model properties, DC datastream]
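The idea of an index that is derived from the objects and updated incrementally can be sketched as a toy in-memory structure. Everything here (the class name ResourceIndex, its methods, the identifiers) is hypothetical and only illustrates the principle; the actual RI sits on a triplestore such as those compared in this tutorial.

```python
class ResourceIndex:
    """Toy in-memory triple index; the object store remains the source of truth."""

    def __init__(self):
        # object id -> set of triples derived from that object's datastreams
        self._by_object = {}

    def index_object(self, object_id, triples):
        """(Re)index one object, replacing whatever was indexed for it before."""
        self._by_object[object_id] = set(triples)

    def remove_object(self, object_id):
        """Drop all triples derived from a purged object."""
        self._by_object.pop(object_id, None)

    def find(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        matches = []
        for triples in self._by_object.values():
            for t in triples:
                if ((s is None or t[0] == s)
                        and (p is None or t[1] == p)
                        and (o is None or t[2] == o)):
                    matches.append(t)
        return sorted(matches)
```

Because triples are keyed by the object they came from, modifying one object only touches that object's entries, and the whole index can be rebuilt from the repository if it is lost.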
RI Graph view (abbreviated) …
RI Implementation: The Triplestore Challenge
• Scalability
  – Few triplestores perform well for 100M+ triples
  – Kowari – we tested to 180M triples
  – MPTStore – we tested to 250M triples
• Performance
  – Jena - easy to run out of memory
  – Sesame Native - slow for complex queries
  – Kowari
    • Fast queries and full-featured query language (iTQL)
    • Instability and corruption problems
  – MPTStore
    • Very fast for SPO queries (limited support for complex queries)
    • Add/modify significantly faster than Kowari
  – Mulgara
    • Fork of Kowari; complex queries; models; inference
    • Major bug fixes for stability and corruption problems
    • New features planned
Use Case: scholarly objects and annotation in the humanities
[Figure: scholarly objects, museum objects, and commercial web content linked by relationships such as yy:certifies and xx:recommends; nodes labeled URI-55 and URI-100]
Use Case: scientific publication and collaboration
[Figure: objects linked by the relationships hasReview, hasDS, and hasDataComponent]
Use Case: Object-Centered Sociality
What is NSDL committed to?
§ NSDL 2.0 as a platform for developing digital library tools
§ Support for communities across the full range of science, technology, engineering and mathematics research, learning and education
§ The library as a shared, collaborative, contributory space
§ Supporting the creation of context around library resources to enhance discovery, use, and understanding
NSDL Semantic Digital Library repository requirements
§ Supports storing both content and metadata
§ Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation
§ Accessible through web service architecture of remixable data sources and transformations
NSDL Data Repository (NDR)
§ Implemented in Fedora 2.2 with MPTStore and journalling
§ Moderately large: 4.7 million digital objects, 250 million RDF triples
§ Digital object types: resources, metadata, agents, metadata providers, aggregators
§ A REST API to allow authenticated access by other applications
§ In production at nsdl.org
NSDL as Semantic Digital Library: collaboration, context, and contribution
§ The NDR and services provide the platform, but we still need the applications
§ Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging
§ Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai
§ Solution 3: Engage with partners and the broader community to build applications to the platform
Expert Voices
§ The NSDL Blogosphere, live at http://expertvoices.nsdl.org
§ Topic-based discussions (e.g. forensics) linked to related library resources
§ A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata
§ WordPress-based multi-user multi-blog application (open source, plug-in architecture)
§ Owner controls publication of entries as NSDL resources and visibility of comments
§ Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata
OurNSDL: NDR-integrated Wiki
§ A community of approved contributors (e.g. teachers, librarians, scientists) is granted edit access on the OurNSDL wiki
§ New resources and metadata are created as wiki pages and reflected into the NDR
§ Non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking, with links reflected back into RDF relationships in NDR
§ User and project pages organize NDR resources, again reflected back into repository as RDF
§ Now implementing MediaWiki extensions; beta release expected 2Q07
NDR Entry for Soft Matter Wiki
[Diagram labels: Existing Collection; Soft Matter Wiki; New Metadata Provider; Member of; Wiki Entry; New Audience MD; Member of; Metadata Provider; Metadata for; Referenced New Resource 1; Annotates; Inferred relationship between resources; Referenced Existing Resource 2]
NSDL 2.0 Ecosystem
[Diagram labels: Archive Service; Search Service; Fedora-based NDR; …; STEM Collections. Protocols: OAI-PMH, HTTP, REST, NDR API]
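As an illustration of the OAI-PMH link in this ecosystem, the sketch below builds a harvesting request URL. The six verbs and the metadataPrefix argument come from the OAI-PMH 2.0 protocol; the helper name oai_pmh_url and the base URL in the usage note are hypothetical, and nothing here is taken from the NDR's actual endpoints.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_pmh_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for `verb` with optional arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # The verb goes first, followed by arguments in the order given.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"
```

For example, oai_pmh_url("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc") asks a (placeholder) provider for all records in unqualified Dublin Core.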
NSDL 2.0 and the Semantic Web
§ NSDL 2.0 applications situate resources in context, aiding both discovery and use
§ Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library
§ Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library