VGrADS/LEAD Planning

  • Slides: 6

VGrADS and LEAD Joint Demo
• Basic idea:
  — Execute the “normal” (static) LEAD workflow under vgES
• What is needed?
  — Queue scheduling heuristics
  — Queue prediction
  — Heuristic optimization + queue scheduling + provisioning
    – Connect three strands? Seems like a good idea
  — Performance models for LEAD components
  — Data movement; ensure arrival before task execution
    – Explicit data movement in the workflow
    – Decouple task dependences from allocation (allow contingencies)
  — Contingency scheduling as high priority
    – Submit to queue before data, have plan for job start before data arrives
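The "submit to queue before data" item can be made concrete with a small timing check. The following is only a sketch under stated assumptions: the `Task` fields, the `submit_delay` helper, and the 60-second safety margin are hypothetical names and values chosen for illustration, not part of vgES or LEAD. It assumes a queue-wait prediction and a transfer-time estimate are already available for each task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    transfer_seconds: float        # assumed estimate of input data movement time
    predicted_wait_seconds: float  # assumed queue-wait prediction for the job


def data_slack(task: Task) -> float:
    """Seconds between data arrival and predicted job start.

    Negative slack means the job is predicted to start before its data arrives.
    """
    return task.predicted_wait_seconds - task.transfer_seconds


def submit_delay(task: Task, safety_margin: float = 60.0) -> float:
    """How long to hold the job submission after starting the transfer.

    Zero means the job can be queued at the same time the transfer starts.
    """
    return max(0.0, task.transfer_seconds + safety_margin - task.predicted_wait_seconds)
```

When the predicted wait comfortably exceeds the transfer time, the job can be queued immediately and the transfer overlaps the wait; otherwise the submission is held back just long enough for the data to arrive first.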

Continued
• What is needed?
  — How does this interact with vgDL?
    – What abstractions can VG provide (and not require breaking)?
    – Sort of breaks the current “bind” model
    – Take the envelope of requirements, schedule within that?
    – VG defines a virtual machine; once the virtual machine is provisioned, we schedule onto real resources
  — Need to trade off resources to request vs. capability of those
    – Ask for various combinations, choose the one with least cost
    – Iterative process needed for feedback:
      estimate bounds on schedule, then see if you can provision, then re-estimate, …
    – Can we trade costs (e.g. Itanium vs. Opteron)? Only if you have a common currency
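The "ask for various combinations, choose the one with least cost" loop could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Offer` fields, the relative `speed` factor, the per-node-hour `price` in a single common currency, and the `can_provision` callback are hypothetical and do not reflect any actual vgDL or vgES interface.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Offer:
    arch: str     # e.g. "Itanium" or "Opteron"
    nodes: int    # number of nodes requested
    speed: float  # assumed relative per-node speed
    price: float  # assumed cost per node-hour in a common currency


def runtime_hours(offer: Offer, work_node_hours: float) -> float:
    """Estimated runtime, assuming the work scales perfectly across nodes."""
    return work_node_hours / (offer.nodes * offer.speed)


def cost(offer: Offer, work_node_hours: float) -> float:
    """Total cost of running the work on this offer."""
    return offer.nodes * runtime_hours(offer, work_node_hours) * offer.price


def choose(offers: Iterable[Offer], work_node_hours: float, deadline_hours: float,
           can_provision: Callable[[Offer], bool]) -> Optional[Offer]:
    """Try feasible offers from cheapest to most expensive until one provisions."""
    feasible = [o for o in offers if runtime_hours(o, work_node_hours) <= deadline_hours]
    for offer in sorted(feasible, key=lambda o: cost(o, work_node_hours)):
        if can_provision(offer):  # feedback step: provisioning may be refused
            return offer
    return None
```

Falling through to the next-cheapest offer when provisioning fails stands in for the estimate, provision, re-estimate cycle; a fuller version would also re-estimate the schedule bounds on each pass. Comparing Itanium and Opteron offers only works here because both prices are expressed in the same unit.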

Where do we go next?
• Application-driven research
  — As we’ve always done
  — Figure out what LEAD needs, then try on other apps (e.g. SCEC)
• To do
  — LEAD work thru Dan and Lavanya
  — Build performance modeler
  — Integrate perf modeler into scheduler
  — Do one workflow step at a time? Dennis suggestion
  — Focus on target workflow - Static Run (see Dennis’ slide)
  — Replace “Suresh scheduler” (man in loop) with queue prediction to choose resources
    – Virtual Suresh
  — Intermediate data movement needs to be scheduled

Next steps
• Outline of new scheduler
  — Produce a series of “new” vgDL requests
    – X nodes for Y minutes starting at Z time
    – May be implemented by reservation, statistical prediction gadget, or others
  — Choose the best, or a hybrid of several
    – Run based on a reservation
    – (longer-term) Start on immediately-available resources, reschedule when reservation comes up
• Working group to build new scheduler
  — Rice - Ryan Zhang
  — ISI - probably Gurmeet Singh
  — UCSB - probably Dan Nurmi
  — LEAD - Lavanya or Suresh or Workflow Person TBN
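As a rough illustration of the "X nodes for Y minutes starting at Z time" requests and the "choose the best" step, the sketch below models each candidate as a plain record and picks the one with the earliest finish that still has enough capacity. The `SlotRequest` type, its fields, and the node-minutes capacity test are hypothetical simplifications, not vgDL syntax.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SlotRequest:
    nodes: int         # X: number of nodes
    minutes: int       # Y: duration of the slot
    start_minute: int  # Z: start time, in minutes from now
    source: str        # how the slot is obtained, e.g. "reservation" or "prediction"


def finish_minute(request: SlotRequest) -> int:
    """Minute at which the slot ends."""
    return request.start_minute + request.minutes


def best_request(requests: Iterable[SlotRequest],
                 work_node_minutes: int) -> Optional[SlotRequest]:
    """Earliest-finishing slot whose capacity (nodes x minutes) covers the work."""
    fits = [r for r in requests if r.nodes * r.minutes >= work_node_minutes]
    return min(fits, key=finish_minute, default=None)
```

A hybrid strategy, such as starting on immediately available resources and moving to a reservation later, would need to combine several such slots rather than pick a single one.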

Going Forward
• vgES 2.0
  — Separate find and bind
  — Fold batch queue prediction into vgES, provide est schedule(s) to start
• vgDL
  — Design trades off execution speed for useful resources returned
  — Ryan drafted some recommended changes / additions
  — Need to present those, start a dialog