Provenance of scientific information as experienced in DRIVER
6th e-Infrastructure Concertation Event, Lyon, 24th November 2008
Wolfram Horstmann, Bielefeld University / DRIVER
Notions of Provenance
• Where do data objects* originate from?
  – Scientific Work -- examples
    • Instrumentation techniques
      – Manufacturers of hard- and software
    • Methodologies
      – Processes, e.g. gene sequencing
  – Technical/Local -- examples
    • (web)-identifiers
    • Database, repository name
* Primary data, documents, metadata …
Why Provenance?
• Quoting / Citing / Referencing as global scientific principle
  – “Reproducible research”
• Giving credit to authors / creators in distributed environments
• Original location / context has to be known
• Experienced in Grid-Environments [1]
Provenance & Interoperability
• Re-Use / Sharing: “Addressing/Accessing”
  – Common view, common use
  – Unidirectional: No change of data objects!
• Federation: “Discovering in Context”
  – Remote representation of distributed data objects (DOs)
• Aggregation: “Contextualizing”
  – Add unchanged object in a context
• Processing/Annotation: “Changing”
  – Uni- vs. Bidirectional: Change of DOs and remote representation vs. back-storage (e.g. CVS)
Scenarios in DRIVER
Digital Scientific Data
Digital Object Collections
Digital Object Repositories
Digital Information Space
Conventional Web Data
“Simple” Applications
Metadata Infrastructure
Basic Provenance Settings
• Indicate Production Situation
  – Metadata
    • Author, Instrumentation etc.
• Remote Representation
  – Indicate place of origin in remote systems
• Metadata as digital objects / first order citizens
  – Allow lineage representation
    • Credits in remote environments / versioning
Orders of Provenance
• 1st order: Metadata
  – Provenance attached to data
  – Minimal “knowledge” required in application
  – Allow remote handling of data objects
  – Require metadata infrastructure
  – Metadata introduce 2 objects: requires linkage
• 2nd order: context / compounds
  – Express multiple relations between objects
  – May introduce semantic model
Provenance in DRIVER #1
• Simple Objects: OAI-PMH [2]
  – 1st order provenance
    • Metadata: minimum OAI-DC
  – 2nd order provenance
    • DRIVER explicit identifiers for repositories
    • OAI-PMH: inline representation (“about”)
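The inline “about” representation can be illustrated with a minimal sketch, assuming the provenance container described in the OAI-PMH provenance guidelines [2]. The record, repository URLs and identifiers are hypothetical, the metadata payload is omitted, and the helper name origin_chain is invented for this example; it is not DRIVER's implementation.

```python
import xml.etree.ElementTree as ET

# Namespaces of the OAI-PMH 2.0 protocol and of its provenance container.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "prov": "http://www.openarchives.org/OAI/2.0/provenance",
}

# Hypothetical record as an aggregator might re-expose it.
# All URLs and identifiers are made up for illustration.
RECORD = """\
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:aggregator.example.org:123</identifier>
    <datestamp>2008-11-20</datestamp>
  </header>
  <!-- metadata element with the oai_dc payload omitted -->
  <about>
    <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
      <originDescription harvestDate="2008-11-19T10:00:00Z" altered="true">
        <baseURL>http://national.example.org/oai</baseURL>
        <identifier>oai:national.example.org:abc</identifier>
        <datestamp>2008-11-01</datestamp>
        <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
        <originDescription harvestDate="2008-10-30T08:00:00Z" altered="false">
          <baseURL>http://repository.example.org/oai</baseURL>
          <identifier>oai:repository.example.org:42</identifier>
          <datestamp>2008-10-01</datestamp>
          <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
        </originDescription>
      </originDescription>
    </provenance>
  </about>
</record>
"""


def origin_chain(record_xml):
    """Return the harvesting history of a record, most recent hop first."""
    record = ET.fromstring(record_xml)
    chain = []
    node = record.find("oai:about/prov:provenance/prov:originDescription", NS)
    while node is not None:
        chain.append({
            "baseURL": node.findtext("prov:baseURL", namespaces=NS),
            "identifier": node.findtext("prov:identifier", namespaces=NS),
            "harvestDate": node.get("harvestDate"),
            "altered": node.get("altered") == "true",
        })
        # An originDescription may nest the description of an earlier hop.
        node = node.find("prov:originDescription", NS)
    return chain


for hop in origin_chain(RECORD):
    print(hop["baseURL"], hop["identifier"], hop["altered"])
```

The last entry of the chain is the place of origin, which is what a federator or aggregator needs in order to link back to the data provider.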
Semantic/Compound Data
“Semantic” Applications
Provenance in DRIVER #2
• “Enhanced Publications”
  – Research project in DRIVER-II
  – Representation of data / document packages
  – Use of OAI-ORE
Provenance in OAI-ORE
• OAI-ORE: Object Re-Use and Exchange [3]
  – Uses Resource Maps < Named Graphs
  – Uses “lineage” to represent explicit provenance
  – Future: explicit provenance model [6]?
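How “lineage” can express provenance across aggregations is sketched below. The sketch assumes, following the ORE vocabulary [3], that ore:lineage links a proxy to the proxy of the same resource in the aggregation where that resource was discovered. Triples are plain Python tuples rather than a real RDF store, and all example.org URIs and the helper names are hypothetical.

```python
# Namespace of the ORE vocabulary terms.
ORE = "http://www.openarchives.org/ore/terms/"

DATASET = "http://data.example.org/dataset.csv"  # hypothetical resource

# Hypothetical (subject, predicate, object) triples taken from two
# Resource Maps: aggregation A published the dataset first, and
# aggregation B re-used it after finding it in A.
TRIPLES = [
    ("http://a.example.org/aggr", ORE + "aggregates", DATASET),
    ("http://a.example.org/proxy/1", ORE + "proxyFor", DATASET),
    ("http://a.example.org/proxy/1", ORE + "proxyIn", "http://a.example.org/aggr"),
    ("http://b.example.org/aggr", ORE + "aggregates", DATASET),
    ("http://b.example.org/proxy/9", ORE + "proxyFor", DATASET),
    ("http://b.example.org/proxy/9", ORE + "proxyIn", "http://b.example.org/aggr"),
    ("http://b.example.org/proxy/9", ORE + "lineage", "http://a.example.org/proxy/1"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def trace_lineage(triples, proxy):
    """Follow ore:lineage links from a proxy.

    Returns the aggregations the resource passed through, nearest first.
    """
    seen = set()
    path = []
    while proxy is not None and proxy not in seen:
        seen.add(proxy)  # guard against cyclic lineage statements
        path.extend(objects(triples, proxy, ORE + "proxyIn"))
        earlier = objects(triples, proxy, ORE + "lineage")
        proxy = earlier[0] if earlier else None
    return path


print(trace_lineage(TRIPLES, "http://b.example.org/proxy/9"))
```

Because lineage is attached to proxies rather than to the resource itself, the aggregated object stays unchanged while each context records where it was found.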
Summary
• Provenance essential for …
  – Indicating origin in distributed data spaces
    • Accessing / Addressing
    • Federation / Aggregation
    • Processing / Annotation
  – Document and data citation / trace-back
  – 1st order: describing data > metadata
  – 2nd order: describing context > semantic data
Lessons learnt in DRIVER
• Use web-enabled Identification (URI/UDDI etc.)
  – “Dark” databases don't interoperate
• 1st order provenance at place of origin
  – Requires metadata to describe origin
  – Enables a metadata infrastructure
  – Introduces linkage problem
• 2nd order provenance in contexts
  – Requires data provider identification in federators / aggregators in order to link back
  – May require semantic model for context
  – Would benefit from a semantic infrastructure
Resources
[1] On provenance in the eScience / grid-environment
  – http://www.sigmod.org/sigmod/record/issues/0509/p31-special-sw-section-5.pdf
  – In GLITE
    • http://www.cesnet.cz/doc/techzpravy/2007/glite-job-provenance/
    • http://twiki.ipaw.info/bin/view/Challenge
[2] On provenance in OAI-PMH
  – http://www.openarchives.org/OAI/2.0/guidelines-provenance.htm
[3] On provenance in OAI-ORE (referred to as ore:lineage)
  – http://www.openarchives.org/ore/meetings/Soton/ore_beyond_basics.pdf (general)
  – http://www.openarchives.org/ore/1.0/vocabulary (definition)
[4] Named Graphs, Provenance and Trust (Carroll et al.)
  – http://www4.wiwiss.fu-berlin.de/bizer/SWTSGuide/carroll-ISWC2004.pdf
[5] W3C: On provenance in RDF
  – http://www.w3.org/2001/12/attributions/
[6] Open Provenance Model
  – http://eprints.ecs.soton.ac.uk/14979/1/opm.pdf
[7] DRIVER: Digital Repository Infrastructure for European Research
  – http://www.driver-community.eu