Understanding the utility and fitness of Workflow Provenance

- Slides: 8
Understanding the utility and fitness of Workflow Provenance for Experiment Reporting
Pınar Alper, Supervisor: Carole A. Goble

Research
[Figure: diagram linking Data, Analysis (local tools), Results, and Reporting; reporting actions labeled select, recollect, package, share, publish]
• Build a citation string
• Package results by origin
• Document important run parameters
C. Tenopir, S. Allard, et al. Data sharing by scientists: Practices and perceptions. PLoS ONE, 6(6): e21101, 06 2011.

Provenance we have
• Execution provenance (retrospective)
• WF description (prospective)
Generic information:
– Data artefacts, consumption/production relations
– Execution times/status
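
As a rough illustration of the two kinds of provenance listed on this slide, the Python sketch below models a workflow description (prospective) next to an execution trace (retrospective) and walks the consumption/production relations to find where a result came from. All class names, field names, file names, and timestamps are hypothetical; they are not taken from Taverna or from any provenance standard.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Step:
    """Prospective provenance: a step as declared in the workflow description."""
    name: str
    inputs: list
    outputs: list


@dataclass
class StepRun:
    """Retrospective provenance: one recorded execution of a step."""
    step: str
    used: list        # data artefacts consumed
    generated: list   # data artefacts produced
    started: datetime
    ended: datetime
    status: str = "completed"


def lineage(artefact, runs):
    """Return every artefact that `artefact` was (transitively) derived from."""
    producers = {out: run for run in runs for out in run.generated}
    seen = set()
    stack = [artefact]
    while stack:
        run = producers.get(stack.pop())
        if run is None:
            continue  # a workflow input: nothing recorded upstream
        for source in run.used:
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


def undeclared(runs, steps):
    """List executed steps that the workflow description does not declare."""
    declared = {step.name for step in steps}
    return [run.step for run in runs if run.step not in declared]


# Made-up example data.
steps = [
    Step("align", ["sequences"], ["alignment"]),
    Step("search", ["alignment", "index"], ["hits"]),
]
runs = [
    StepRun("align", ["seqs.fa"], ["aligned.fa"],
            datetime(2014, 1, 1, 9, 0), datetime(2014, 1, 1, 9, 5)),
    StepRun("search", ["aligned.fa", "db.idx"], ["hits.csv"],
            datetime(2014, 1, 1, 9, 5), datetime(2014, 1, 1, 9, 7)),
]

print(sorted(lineage("hits.csv", runs)))  # ['aligned.fa', 'db.idx', 'seqs.fa']
print(undeclared(runs, steps))            # []
```

The sketch only captures generic information (artefacts, relations, times, status); it says nothing about what the data means scientifically.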

Provenance that is reported
Scientific Data Provenance
– Origin
– Methodological context
– Scientific context

Motifs
Workflows as implementation artefacts:
• 240 workflows, 4 systems, 10 domains
• A domain-independent characterization of activities
• ~90% characterizable
– Minority (~30%): data-creation
– Majority (~70%): data-preparation (value-copying)
http://purl.org/net/wf-motifs#
D. Garijo, P. Alper, K. Belhajjame, O. Corcho, Y. Gil, C. Goble. Common motifs in scientific workflows: An empirical analysis. Future Generation Computer Systems. ISSN 0167-739X.

Research Framework
I. WF Motifs
• High-level categorization, as Semantic Annotations
• Grey-box
• Based on empirical evidence
• Novelty: Partial transparency
II. Labeling WF
• Process model for labeling
• Motifs inform when to collect and when to propagate labels
• Minimal additional design-time information
• Novelty: Dynamic, domain-specific
III. WF Summaries
• Configurable filters
• Graph re-write primitives
• More informed abstraction w. Motifs
• Novelty: Declarative abstraction and contextual grouping
Groundtruth – user behavior
P. Alper, C. Goble, and K. Belhajjame. 2013. On assisting scientific data curation in collection-based dataflows using labels. In Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science (WORKS '13). ACM, New York, NY, USA, 7-16. DOI=10.1145/2534248.2534249
P. Alper, K. Belhajjame, C. Goble, P. Karagoz. Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotations. IEEE Big Data, July 2013.
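
The idea of motif-informed summarization with graph re-write primitives can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the cited papers: the workflow is assumed to be a set of (source, target) edges, each step is assumed to carry a single motif annotation, and the `eliminate` and `summarize` functions and all step names are hypothetical.

```python
def eliminate(edges, step):
    """Re-write primitive: remove `step` and link its predecessors to its successors."""
    preds = {a for a, b in edges if b == step}
    succs = {b for a, b in edges if a == step}
    kept = {(a, b) for a, b in edges if step not in (a, b)}
    return kept | {(p, s) for p in preds for s in succs}


def summarize(edges, motifs, drop="data-preparation"):
    """Configurable filter: eliminate every step annotated with the `drop` motif."""
    for step, motif in motifs.items():
        if motif == drop:
            edges = eliminate(edges, step)
    return edges


# Made-up workflow: two analysis steps with two preparation steps in between.
edges = {
    ("run_search", "reformat"),
    ("reformat", "filter_columns"),
    ("filter_columns", "cluster"),
}
motifs = {
    "run_search": "data-creation",
    "reformat": "data-preparation",
    "filter_columns": "data-preparation",
    "cluster": "data-creation",
}

print(summarize(edges, motifs))  # {('run_search', 'cluster')}
```

Because the filter is driven by annotations rather than by step names, the same rule applies to any workflow whose steps have been categorized.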

How do I use Taverna
• Workbench: make a wf
• scufl2-api: inquire about details
• Scufl2-wfdesc: we operate on abstract wf description
Issues
• Additional characteristics (port depths, iteration config)
• Annotation support @UI w key-value pairs
• List handling representation
• Resource uniqueness

Thank you!
Pinar ALPER, University of Manchester
Khalid BELHAJJAME, Université Paris Dauphine
Carole A. GOBLE, University of Manchester
Pinar KARAGOZ, Middle East Technical University