Integrated Scientific Workflow Management for the Emulab Network Testbed
Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau
University of Utah, School of Computing
USENIX 2006 / June 3, 2006

This Talk in One Slide
- Current network testbeds
  - …manage the “laboratory”
  - …not the experimentation process.
  - → A big problem for large-scale activities!
- Evolve Emulab for experiments based on scientific workflows
  - Big mutual benefits: testbed ↔ workflow
  - Work in progress

Example: UAV Simulation
- A distributed, real-time application
- Evaluate improvements to real-time middleware
  - vs. CPU load
  - vs. network load
  - 4 research groups × 19 experiments × 56 metrics
[Figure: UAV, Receiver, and ATR components exchanging images and alerts]

Use Emulab
[Figure: Concept → (write “ns” file) → Experiment → (“swap in”) → Emulate]

Problems Solved
- I get machines!
  - 328 PCs, and more
  - Time- & space-shared
  - Loads OS and software
- I get network!
  - Config. topology & quality
- I get to collaborate!
  - Available to researchers and educators worldwide
  - File storage, email, …

Problems Not Solved
- “Now what?”
- Getting off the ground
  - Run all my software
  - Add instrumentation
  - Collect all my data
  - Analyze it
- Scaling up
  - 19 configurations
  - Automation

More Problems Not Solved
- “How did I get here?”
- Over the short term…
  - “Where are the results I got last week?”
  - “How did I get those results anyway?”
  - “What if…?”
- …and the long term
  - Reproducing results
  - Reusing artifacts

Idea: Scientific Workflow
- Managing activities, inputs, and outputs is the job of a scientific workflow system
- Our approach: evolve Emulab with integrated support for scientific workflows
  - Build on existing abstractions & mechanisms
  - Resource focus → user & task focus
  - Users work “within” and “across” experiments

Contributions
- Address demand + opportunity
  - Users need to manage large-scale complexity
  - A symbiotic combination: leverage and impact
- Advance the applicability of testbeds
  - Not just Emulab: e.g., PlanetLab and DETER
- Advance scientific workflow systems
  - Exploit testbed capabilities, e.g., “total control”
  - Address testbed requirements, e.g., flexible use

Issue: Encapsulation
- Current “experiment” model is not fully encapsulating
  - Topology + static events
  - Need everything else!
- Challenge: specification
  - Complete and precise…
  - …w/o huge user burden
- Approach: be automatic
  - E.g., track files used
  - Snapshot, archive, restore
  - User can refine “extent”
[Figure labels: ns file, packages, inputs, outputs, my software, OSes, NFS monitors, Subversion repo., packet monitors, datapository (DB), AJAX GUI, research filesystems]

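The “be automatic” approach (track files used; snapshot, archive, restore) can be illustrated with a small sketch. This is not Emulab's implementation. It assumes a hypothetical list of files that some monitor has reported as used, and every name here (`snapshot`, `restore`, the archive layout) is made up for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot(used_files, archive_dir):
    """Copy each tracked file into archive_dir, keyed by content hash.

    Returns a manifest mapping original path -> SHA-256 digest.
    `used_files` stands in for whatever a file-access monitor reports
    (an assumption; the slide only says "track files used").
    """
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for path in map(Path, used_files):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        blob = archive / digest
        if not blob.exists():  # identical content is stored only once
            shutil.copyfile(path, blob)
        manifest[str(path)] = digest
    return manifest


def restore(manifest, archive_dir, dest_root):
    """Recreate every file in the manifest under dest_root.

    Simplification: files are restored by base name only, so two
    tracked files with the same name would overwrite each other.
    """
    archive = Path(archive_dir)
    dest = Path(dest_root)
    dest.mkdir(parents=True, exist_ok=True)
    for original, digest in manifest.items():
        shutil.copyfile(archive / digest, dest / Path(original).name)
```

Keying the archive by content hash means repeated snapshots of unchanged inputs cost no extra storage, which is one plausible way to keep the user burden low.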
Issue: Definition vs. Execution
- Current “experiment” has multiple roles
  - Definition
  - The thing that you run
- Challenge: representing relationships
  - Multiple runs of one setup
  - Similar configurations
- Approach: a new model of experimentation
  - Separate the roles
  - Evolve the new abstractions

New Model
- Template
- Swapin
- Experiment
- Activity
- Record
[Figure labels: n=2, n=4]

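The slide names five abstractions without spelling out how they relate. The Python sketch below is one possible reading, not the Emulab data model: it assumes a template is the definition, a swap-in instantiates it, each experiment is one run with concrete parameter values, and each run owns its activities and a record. The class fields, the `swap_in` and `run` methods, and the template name `uav` are all hypothetical; the parameter values n=2 and n=4 come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Captured outputs of one experiment run (assumed role)."""
    outputs: dict = field(default_factory=dict)


@dataclass
class Activity:
    """A named step carried out during an experiment (assumed role)."""
    name: str


@dataclass
class Experiment:
    """One run with concrete parameter values."""
    params: dict
    activities: list = field(default_factory=list)
    record: Record = field(default_factory=Record)


@dataclass
class Swapin:
    """A template instantiated on testbed resources; may host several runs."""
    template: "Template" = field(repr=False, compare=False)
    experiments: list = field(default_factory=list)

    def run(self, **overrides):
        # Template defaults apply unless this run overrides them.
        exp = Experiment(params={**self.template.defaults, **overrides})
        self.experiments.append(exp)
        return exp


@dataclass
class Template:
    """The definition: what to run, plus parameter defaults."""
    name: str
    defaults: dict = field(default_factory=dict)
    swapins: list = field(default_factory=list)

    def swap_in(self):
        instance = Swapin(template=self)
        self.swapins.append(instance)
        return instance


template = Template("uav", defaults={"n": 2})
swapin = template.swap_in()
first = swapin.run()       # uses the default, n=2
second = swapin.run(n=4)   # same setup, different parameter value
```

Separating the definition (`Template`) from what is executed (`Experiment`) is what lets multiple runs of one setup, and similar configurations, be represented as related objects rather than unrelated copies.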
Issue: History
- Research and educational plans are dynamic
  - By design & by discovery
- Challenge: safe exploration
  - Fork
  - Back up
- Approach: keep history & support temporal navigation
  - Keep template revisions
  - Track provenance
  - Locate, repeat, and reuse
[Figure labels: rev 1.1; “oops: need new measurements”; “bigger nets”; “what about loss?”; “add params”]

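Keeping template revisions with fork and back-up can be pictured as a tree of revisions. The sketch below is a minimal illustration under that assumption, not the Emulab implementation; the `Revision` class and every revision number other than 1.1 are hypothetical, while the notes reuse the labels from the slide's figure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Revision:
    """One saved state of a template, linked to the revision it came from."""
    rev_id: str
    note: str
    parent: Optional["Revision"] = None
    children: list = field(default_factory=list, repr=False)

    def fork(self, rev_id, note):
        """Start a new line of exploration without altering this revision."""
        child = Revision(rev_id, note, parent=self)
        self.children.append(child)
        return child

    def provenance(self):
        """Return the chain of revision ids from the root down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.rev_id)
            node = node.parent
        return list(reversed(chain))


root = Revision("1.1", "initial template")
params = root.fork("1.2", "add params")
bigger = params.fork("1.3", "bigger nets")
# Back up to 1.2 and explore a different question from there.
loss = params.fork("1.2.1", "what about loss?")
```

Because revisions are never modified in place, backing up is just selecting an earlier node, and `provenance()` answers “how did I get here?” for any revision.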
Implementation in Progress
[Figure labels: Definition, Data Analysis, Execution & History]

Conclusion
- Large and powerful testbeds
  - …enable complex and large-scale activities
  - …lead to complex and large-scale workflow management problems
- Integrated workflow management can leverage the strengths of testbeds
  - Systems approach, and systems challenges
  - → Better testbeds and workflow systems

Thanks!
http://www.emulab.net/

Extra Slides After This Point