Omega: flexible, scalable schedulers for large compute clusters
Omega: flexible, scalable schedulers for large compute clusters
  Malte Schwarzkopf (University of Cambridge Computer Lab)
  Andy Konwinski (UC Berkeley)
  Michael Abd-El-Malek (Google)
  John Wilkes (Google)
  Presented by: Yikai Lin, Chao Kong
  Slides adapted from the EuroSys '13 presentation
  EECS 582 – W16
Outline
  1. Problem statement
  2. Related work
  3. Proposed solution
  4. Comparison of approaches
  5. Case study
  6. Conclusion
  7. Discussion
A scheduling problem
In the context of…
Why is it important to solve?
The actual problem: increasing complexity!
Problem statement
  1. Break up the cluster scheduler into independent schedulers
  2. Arbitrate resources between schedulers
Existing solutions
Proposed solution: Omega
Example
Example
Example
Example: Conflict!
Example
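Note on the example: the conflict shown on the slides can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Omega's implementation. The names `CellState`, `Scheduler`, `plan`, and `place` are hypothetical, resources are reduced to CPU cores per machine, and "pick the machine with the most free cores" is a stand-in placement policy. Each scheduler plans against a private copy of the shared cell state and commits optimistically; a commit that no longer fits is rejected, and the scheduler resyncs and retries.

```python
class CellState:
    """Shared cell state: free CPU cores per machine (illustrative only)."""

    def __init__(self, free):
        self.free = dict(free)

    def commit(self, claims):
        """Apply claims {machine: cores} atomically; reject on conflict."""
        # Conflict: some claimed machine no longer has enough free cores.
        if any(self.free[m] < cores for m, cores in claims.items()):
            return False
        for m, cores in claims.items():
            self.free[m] -= cores
        return True


class Scheduler:
    """A scheduler that plans on a private, possibly stale, copy of the cell."""

    def __init__(self, cell):
        self.cell = cell
        self.sync()

    def sync(self):
        # Refresh the local copy from the shared state.
        self.local = dict(self.cell.free)

    def plan(self, cores):
        # Hypothetical policy: choose the machine with the most free cores.
        machine = max(self.local, key=self.local.get)
        return {machine: cores} if self.local[machine] >= cores else None

    def place(self, cores, max_attempts=3):
        """Return (machine, attempts_used); machine is None if nothing fits."""
        for attempt in range(1, max_attempts + 1):
            claims = self.plan(cores)
            if claims is None:
                return None, attempt
            committed = self.cell.commit(claims)
            self.sync()  # resync after every commit attempt
            if committed:
                return next(iter(claims)), attempt
        return None, max_attempts
```

Usage: create one `CellState` and two `Scheduler` objects on it, then call `place` on each in turn. The second scheduler still holds the copy it took before the first one committed, so its first attempt can be rejected and it succeeds only after resyncing.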
Workload characterization: most jobs are batch, but most resources are consumed by service jobs.
Workload characterization
Comparison of approaches
  • Two simulators
    • A lightweight simulator: monolithic scheduler, Mesos, Omega
    • A high-fidelity simulator: Omega
Parameters and metrics
  • Scheduler decision time
  • Metrics
    • Job wait time
    • Scheduler busyness
    • Conflict fraction
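Note on the metrics: the three metrics can be computed from a log of scheduling attempts, as in the sketch below. The definitions used here are assumptions the slide does not spell out: job wait time is taken as the time from submission to the start of the job's first scheduling attempt, scheduler busyness as the fraction of a time window spent making decisions, and conflict fraction as the number of conflicts per successful transaction. The `Attempt` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    """One scheduling attempt for a job (hypothetical log record)."""
    job_id: str
    submitted: float      # time the job was submitted (s)
    started: float        # time this scheduling attempt began (s)
    decision_time: float  # time the scheduler spent deciding (s)
    committed: bool       # False if the transaction hit a conflict


def job_wait_time(attempts):
    """Mean time from submission to each job's first scheduling attempt."""
    first = {}
    for a in attempts:
        if a.job_id not in first or a.started < first[a.job_id].started:
            first[a.job_id] = a
    return sum(a.started - a.submitted for a in first.values()) / len(first)


def scheduler_busyness(attempts, window):
    """Fraction of the window (s) the scheduler spent making decisions."""
    return sum(a.decision_time for a in attempts) / window


def conflict_fraction(attempts):
    """Conflicts per successful transaction (infinite if none succeeded)."""
    conflicts = sum(not a.committed for a in attempts)
    successes = sum(a.committed for a in attempts)
    return conflicts / successes if successes else float("inf")
```

Retried attempts count toward busyness, which is why conflicts and busyness tend to rise together: every rejected transaction costs another decision.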
Experiment 1
  • Comparison in the lightweight simulator
  • Systems: monolithic, Mesos, Omega
  • Dataset: all clusters, 7 simulated days
  • Approach: varying the service scheduler
Monolithic - I
  • Uniform decision time (single logic)
Monolithic - II
  • Fast-path batch decision time
Mesos
Mesos
Omega - I
Omega - II
Experiment 1 - summary
  • The Omega shared-state model performs as well as a (complex) monolithic multi-path scheduler.
Experiment 2
  • Scalability of the shared-state design
  • Dataset: cluster B, 7 simulated days
  • System: Omega
  • Approach: varying the job arrival rate and the number of schedulers
Experiment 2 - summary
  • Omega scales to many schedulers
Case study
  “data from a month’s worth of MapReduce jobs run at Google showed that frequently observed values were 5, 11, 200 and 1,000 workers.”
Case study
  “…number of workers could be chosen automatically if additional resources were available, so that jobs could complete sooner”
Conclusion
  1. Optimistic concurrency over shared state is a viable, attractive approach to cluster scheduling (as shown with Google’s workloads)
  2. Conflicts mean some scheduling work has to be redone, but that is a price worth paying for the parallelism and scalability
  3. Visibility into the global state allows specialized scheduler designs
Discussion
  1. Jobs are divided into roughly two categories: batch and service
  2. Compatibility with non-Google workloads (i.e. short-lived) is unclear
  3. Fairness and starvation are barely studied
  4. Scalability in terms of many different types of schedulers is unclear
  5. The frequency of cell state updates: overhead vs. number of conflicts
  6. Global optimality
Backup slides
Experiment 3
  • Scheduler interference in real Google workloads
  • Dataset: cluster C, 29 days, high-fidelity simulator
  • System: Omega with two schedulers, non-uniform decision time
  • Approach: varying the service scheduler
Experiment 3 - summary
  • Interference is higher in real-world settings.
  [Plots: scheduler busyness; conflict fraction]
Experiment 4
  • Dealing with conflicts
  • Dataset: cluster C, 29 days, high-fidelity simulator
  • System: Omega
  • Approach: varying the scheduling methods
Experiment 4 - summary
  • Default: incremental transactions with fine-grained conflict detection
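Note on the scheduling methods: the sketch below contrasts the default with its alternatives, under assumptions the slide does not state. An incremental transaction keeps every claim that does not conflict, while an all-or-nothing transaction is abandoned as soon as one claim conflicts. Fine-grained detection rejects a claim only if the machine lacks the requested resources; coarse-grained detection also rejects it if the machine changed at all since the scheduler's snapshot, tracked here with a per-machine sequence number. The function `commit` and its parameters are hypothetical, and resources are reduced to CPU cores.

```python
def commit(free, seq, snapshot_seq, claims, incremental=True, fine_grained=True):
    """Try to commit claims [(machine, cores), ...] against the shared state.

    free:         {machine: free cores}, updated in place on success
    seq:          {machine: current sequence number}, bumped on each change
    snapshot_seq: sequence numbers the scheduler saw when it took its copy
    Returns the list of accepted claims (empty if nothing was committed).
    """
    avail = dict(free)  # working copy, so an abandoned transaction leaves no trace
    accepted = []
    for machine, cores in claims:
        conflict = avail[machine] < cores
        if not fine_grained:
            # Coarse-grained: any change since the snapshot counts as a conflict.
            conflict = conflict or seq[machine] != snapshot_seq[machine]
        if conflict:
            if not incremental:
                return []  # all-or-nothing: give up on the whole transaction
            continue       # incremental: drop only the conflicting claim
        avail[machine] -= cores
        accepted.append((machine, cores))
    free.update(avail)
    for machine in {m for m, _ in accepted}:
        seq[machine] += 1
    return accepted
```

With a machine that changed since the snapshot but still has room, fine-grained detection accepts the claim and coarse-grained detection rejects it, which illustrates why the choice of method affects how much work is redone.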