Aalo: Efficient Coflow Scheduling Without Prior Knowledge
- Slides: 30
Aalo: Efficient Coflow Scheduling Without Prior Knowledge
Mosharaf Chowdhury, Ion Stoica
UC Berkeley
Communication is Crucial
Performance: Facebook jobs spend ~25% of runtime on average in intermediate communication [1]
[Figure: a map stage feeding a reduce stage]
As SSD-based and in-memory systems proliferate, the network is likely to become the primary bottleneck
[1] Based on a month-long trace with 320,000 jobs and 150 million tasks, collected from a 3000-machine Facebook production MapReduce cluster.
Flow-Based Solutions
[Figure: timeline from the 1980s to 2015 of flow-based schemes (GPS, WFQ, CSFQ, RED, ECN, XCP, RCP, D3, DCTCP, DeTail, PDQ, D2TCP, pFabric, FCP), grouped by objective: per-flow fairness and flow completion time]
Independent flows cannot capture the collective communication patterns (e.g., shuffle) common in data-parallel applications
Coflow
Communication abstraction for data-parallel applications to express their performance goals:
1. Minimize completion times,
2. Meet deadlines, or
3. Perform fair allocation
Benefits of Inter-Coflow Scheduling
[Figure: Coflow 1 and Coflow 2 on Link 1 and Link 2, with flows of 6, 2, and 3 units; one schedule per policy on a time axis from 0 to 6]
• Fair Sharing: Coflow 1 comp. time = 5, Coflow 2 comp. time = 6
• Smallest-Flow First: Coflow 1 comp. time = 5, Coflow 2 comp. time = 6
• Smallest-Coflow First: Coflow 1 comp. time = 3, Coflow 2 comp. time = 6
Benefits increase with the number of coexisting coflows
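The three schedules can be reproduced with a few lines of Python. This is only a sketch under an assumed flow layout, since the figure itself is not recoverable here: Coflow 1 sends 3 units on Link 1, and Coflow 2 sends 2 units on Link 1 and 6 units on Link 2. That layout was chosen because it yields the completion times quoted on the slide. Each link is assumed to carry 1 unit per time unit, and all helper names are hypothetical.

```python
from collections import defaultdict

# Assumed layout (not stated explicitly on the slide).
FLOWS = [
    {"coflow": 1, "link": 1, "size": 3.0},
    {"coflow": 2, "link": 1, "size": 2.0},
    {"coflow": 2, "link": 2, "size": 6.0},
]

def by_link(flows):
    links = defaultdict(list)
    for f in flows:
        links[f["link"]].append(f)
    return links

def coflow_times(finish):
    """A coflow completes when its last flow completes."""
    out = defaultdict(float)
    for coflow, t in finish:
        out[coflow] = max(out[coflow], t)
    return dict(sorted(out.items()))

def serial(flows, key):
    """Serve the flows on each link one at a time, in `key` order."""
    finish = []
    for link_flows in by_link(flows).values():
        t = 0.0
        for f in sorted(link_flows, key=key):
            t += f["size"]
            finish.append((f["coflow"], t))
    return coflow_times(finish)

def fair_sharing(flows):
    """Active flows on a link split its capacity equally."""
    finish = []
    for link_flows in by_link(flows).values():
        t, done = 0.0, 0.0
        ordered = sorted(link_flows, key=lambda f: f["size"])
        n = len(ordered)
        for i, f in enumerate(ordered):
            # n - i flows are still active, each served at rate 1 / (n - i)
            t += (f["size"] - done) * (n - i)
            done = f["size"]
            finish.append((f["coflow"], t))
    return coflow_times(finish)

def coflow_sizes(flows):
    sizes = defaultdict(float)
    for f in flows:
        sizes[f["coflow"]] += f["size"]
    return sizes

sizes = coflow_sizes(FLOWS)
print(fair_sharing(FLOWS))                                  # {1: 5.0, 2: 6.0}
print(serial(FLOWS, key=lambda f: f["size"]))               # {1: 5.0, 2: 6.0}
print(serial(FLOWS, key=lambda f: sizes[f["coflow"]]))      # {1: 3.0, 2: 6.0}
```

Only the coflow-level ordering lets Coflow 1 finish early without delaying Coflow 2, whose completion is bound by its 6-unit flow either way.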
Varys [1]
Efficiently schedules coflows leveraging complete and future information:
1. The size of each flow (complicated by pipelining between stages),
2. The total number of flows (complicated by speculative executions), and
3. The endpoints of individual flows (complicated by task failures and restarts)
[1] Efficient Coflow Scheduling with Varys, SIGCOMM 2014.
Aalo
Efficiently schedules coflows without complete and future information:
1. The size of each flow (complicated by pipelining between stages),
2. The total number of flows (complicated by speculative executions), and
3. The endpoints of individual flows (complicated by task failures and restarts)
Coflow Scheduling: Minimize Avg. Comp. Time
• Flows on a single link: Smallest-Flow-First with complete knowledge; Least-Attained Service (LAS) without complete knowledge
• Coflows in the entire network: Varys [1] (Smallest-Coflow-First) with complete knowledge; without complete knowledge: ?
LAS: prioritize the flow that has sent the least amount of data
[1] Efficient Coflow Scheduling with Varys, SIGCOMM 2014.
Coflow-Aware LAS (CLAS)
Prioritize the coflow that has sent the least total number of bytes
• The more a coflow has sent, the lower its priority
• Smaller coflows finish faster
Challenges (also shared by LAS)
• Can lead to starvation
• Suboptimal for similar-size coflows
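The CLAS rule amounts to sorting coflows by the bytes they have sent so far. A minimal sketch follows; the function name, the coflow ids, and the tie-break on id are assumptions made for illustration, not part of the slides.

```python
def clas_order(bytes_sent):
    """Return coflow ids from highest to lowest CLAS priority.

    `bytes_sent` maps a coflow id to the total bytes sent so far,
    summed over all flows of that coflow. Ties are broken by id
    (an arbitrary choice for this sketch).
    """
    return sorted(bytes_sent, key=lambda c: (bytes_sent[c], c))

sent = {"A": 40_000_000, "B": 1_500, "C": 250_000}
print(clas_order(sent))   # ['B', 'C', 'A']

# As B keeps sending, its priority drops below C's.
sent["B"] = 300_000
print(clas_order(sent))   # ['C', 'B', 'A']
```

The starvation risk is visible here: a steady stream of new coflows with low byte counts would keep A at the end of the order indefinitely.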
Suboptimal for Similar Coflows
CLAS reduces to fair sharing
• Doesn’t minimize average completion time
• Coflow 1 comp. time = 6, Coflow 2 comp. time = 6
FIFO works well for similar coflows
• Optimal when coflows are identical
• Coflow 1 comp. time = 3, Coflow 2 comp. time = 6
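Taking the completion times on the slide at face value (which implies each coflow needs 3 time units when it has the link to itself), the average completion times under the two policies work out as:

```latex
\begin{align*}
\text{Fair sharing (CLAS):}\quad & \bar{T} = \frac{6 + 6}{2} = 6 \\
\text{FIFO:}\quad & \bar{T} = \frac{3 + 6}{2} = 4.5
\end{align*}
```

FIFO lets the first coflow finish in half the time without delaying the second, so the average drops by 25%.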
Between a “Rock” and a “Hard Place”
• Prioritize across dissimilar coflows
• Schedule similar coflows in FIFO order
Discretized Coflow-Aware LAS (D-CLAS)
Priority discretization
• Change priority when total # of bytes sent exceeds predefined thresholds
Scheduling policies
• FIFO within the same queue
• Prioritization across queues
Weighted sharing across queues
• Guarantees starvation avoidance
[Figure: FIFO queues Q1 (highest priority), Q2, …, QK (lowest priority)]
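A rough sketch of how FIFO within a queue and weighted sharing across queues might combine on a single link. The slide does not give the weights, so the values below are hypothetical, as are the function and coflow names.

```python
from collections import deque

def allocate(queues, weights, capacity=1.0):
    """Split link capacity across non-empty queues in proportion to their weights.

    Within a queue, the coflow at the head (earliest arrival) receives the
    queue's whole share, which is the FIFO part of the policy.
    """
    active = [i for i, q in enumerate(queues) if q]
    total = sum(weights[i] for i in active)
    return {queues[i][0]: capacity * weights[i] / total for i in active}

# Index 0 is Q1 (highest priority); weights are assumed, not from the slides.
queues = [deque(["c3"]), deque(["c1", "c2"]), deque()]
weights = [3, 2, 1]
print(allocate(queues, weights))
```

Because every non-empty queue has a non-zero weight, the head of the lowest-priority queue always makes some progress, which is the starvation-avoidance property named on the slide.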
How to Discretize Priorities?
Exponentially spaced thresholds: A×E^i
• A, E : constants
• 1 ≤ i ≤ K : threshold constant
• K : number of queues
[Figure: FIFO queues Q1 (highest priority) to QK (lowest priority), separated by exponentially growing thresholds up to A×E^(K-1), with ∞ as the upper bound of QK]
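With exponentially spaced thresholds, finding a coflow's queue takes at most K comparisons. The sketch below assumes one reading of the slide: a coflow moves from Q_i to Q_(i+1) once its bytes sent reach A×E^i, for i = 1 … K-1. The default values of A, E, and K are illustrative, not taken from the slides.

```python
def queue_index(bytes_sent, A=10 * 1024**2, E=10, K=10):
    """Return the 1-based queue for a coflow (1 = highest priority).

    Assumes the boundary between Q_i and Q_(i+1) is A * E**i.
    """
    q = 1
    threshold = A * E              # boundary between Q1 and Q2
    while q < K and bytes_sent >= threshold:
        q += 1
        threshold *= E             # next boundary is E times larger
    return q

# Small constants make the boundaries easy to read: 10 and 100.
print(queue_index(9, A=1, E=10, K=3))      # 1
print(queue_index(99, A=1, E=10, K=3))     # 2
print(queue_index(100, A=1, E=10, K=3))    # 3
```

Exponential spacing keeps K small while still covering many orders of magnitude in coflow size.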
Computing Total # of Bytes Sent
D-CLAS requires knowing the total # of bytes sent over all flows of a coflow
• Distributed computation over small time scales is challenging
How much do we lose if we don’t compute the total # of bytes sent?
• D-LAS: make decisions based on the total number of bytes sent locally
D-LAS Far From Optimal!
[Figure: Coflow 1 and Coflow 2 on Link 1 and Link 2, with flows of 6, 2, and 3 units; one schedule per policy on a time axis from 0 to 6]
• D-LAS (decision on # of bytes sent locally): Coflow 1 comp. time = 6, Coflow 2 comp. time = 6
• D-CLAS: Coflow 1 comp. time = 3, Coflow 2 comp. time = 6
Aalo
Efficiently schedules coflows without complete and future information
1. Implement D-CLAS using a centralized architecture
2. Expose a non-blocking coflow API
Aalo Architecture
[Figure: a coordinator and workers, with senders attached to a worker's network interface; D-CLAS scheduling is split into local and global parts, on timescales of μs and milliseconds respectively]
Details
Non-blocking: when a new coflow arrives at an output port
• Put its flow(s) in the lowest-priority queue and schedule them immediately
• No need to sync all flows of a coflow as in Varys
Compute total number of bytes sent
• Workers send info about active coflows periodically
• Coordinator computes total # of bytes sent and relays this info back to workers
• Workers use this info to move coflows across queues
Minimal overhead for small flows
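The coordinator's aggregation step can be sketched as a per-coflow sum over worker reports. The report format (a dict from coflow id to locally sent bytes) and the function name are assumptions for illustration; the slides do not specify the message format.

```python
from collections import Counter

def coordinate(worker_reports):
    """Sum the per-coflow byte counts reported by all workers."""
    totals = Counter()
    for report in worker_reports:
        totals.update(report)      # adds counts key by key
    return dict(totals)

# One report per worker for the current coordination period.
reports = [
    {"c1": 4_000_000, "c2": 500},
    {"c1": 9_000_000},
]
print(coordinate(reports))   # {'c1': 13000000, 'c2': 500}
```

Each worker would then use the relayed totals, rather than its local counts, to decide which queue a coflow belongs in. The first worker alone would see only 4,000,000 bytes for c1 and could keep it at a higher priority than its global total warrants.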
Evaluation
A 3000-machine trace-driven simulation matched against a 100-machine EC2 deployment
1. Can it approach clairvoyant solutions?
2. Can it scale gracefully?
YES
On Par with Clairvoyant Approaches [EC2]
• Versus Per-Flow: Comm. Improv. 1.93X, Job Improv. 1.18X
• Versus Varys: Comm. Improv. 0.89X, Job Improv. 0.91X
Performance Breakdown [EC2]
[Figure: fraction of coflows versus coflow completion time (seconds, 0.01 to 100); legend: Varys, Aalo, Non-Clairvoyant Scheduler]
• Improvements for small coflows by avoiding coordination
• Performance loss for medium coflows by mis-scheduling them
• Similar for large coflows because they are in slow-moving queues
What About Scalability? [EC2]
[Figure: average coordination time (ms) versus # of (emulated) Aalo slaves, up to 100,000; plotted values: 8, 17, 115, 495, 992]
[Figure: normalized completion time of per-flow fairness w.r.t. Aalo versus coordination period (Δ): 10 ms, 100 ms, 1 s, 100 s]
Aalo
Efficiently schedules coflows without complete information
• Makes coflows practical in presence of failures and DAGs
• Improved performance over flow-based approaches
• Provides a simple, non-blocking API
https://github.com/coflow
Mosharaf Chowdhury – mosharaf@umich.edu