Coflow: A Networking Abstraction for Cluster Applications Mosharaf Chowdhury, Ion Stoica UC Berkeley
Cluster Applications Multi-Stage Data Flows » Computation interleaved with communication Computation » Distributed » Runs on many machines Driver Communication » Structured » Between machine groups 2
Communication Abstraction A Flow » Sequence of packets » Independent » Often the unit for network scheduling, traffic engineering, load balancing, etc. Multiple Parallel Flows » Independent » Yet, semantically bound » Shared objective Driver Minimize Completion Time 3
Coflow A collection of flows between two groups of machines that are bound together by application-specific semantics Captures 1. Structure 2. Shared Objective 3. Semantics 4
We Want To… Better schedule the network » Intra-coflow » Inter-coflow Write the communication layer of a new application » Without reinventing the wheel Add unsupported coflows to an application, or Replace an existing coflow implementation » Independent of applications 5
Cluster Applications Coflow API The Network (Physically or Logically Centralized Controller) 6
Goals of the Coflow API 1. Separate intent from mechanisms 2. Convey application-specific semantics to the network 7
Coflow API » create(SHUFFLE) → handle » put(handle, id, content) » get(handle, id) → content » terminate(handle) [Figure: MapReduce job with a driver; labels "Shuffle finishes" and "Job finishes"] 8
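The four calls on this slide can be sketched as a toy, single-process store. This is only an illustration of the call sequence: the class name `ToyCoflowStore`, the integer handles, and the dictionary storage are assumptions made for the sketch, not part of the Coflow API or its implementation.

```python
import itertools


class ToyCoflowStore:
    """Hypothetical in-memory stand-in for a coflow controller (illustration only)."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._coflows = {}  # handle -> {"pattern": str, "data": dict}

    def create(self, pattern):
        # The driver registers a coflow and receives a handle.
        handle = next(self._next_handle)
        self._coflows[handle] = {"pattern": pattern, "data": {}}
        return handle

    def put(self, handle, data_id, content):
        # A sender publishes one piece of data under an id.
        self._coflows[handle]["data"][data_id] = content

    def get(self, handle, data_id):
        # A receiver fetches a piece of data by id.
        return self._coflows[handle]["data"][data_id]

    def terminate(self, handle):
        # The driver tears the coflow down once the job no longer needs it.
        del self._coflows[handle]


store = ToyCoflowStore()
shuffle = store.create("SHUFFLE")            # driver
store.put(shuffle, "map0-part1", b"k1,v1")   # mapper (sender)
received = store.get(shuffle, "map0-part1")  # reducer (receiver)
store.terminate(shuffle)                     # driver
```

A real controller would decide how and when the bytes move; the sketch only shows which role makes which call.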
Flexibility Choice of algorithms » Default » WSS [1] Choice of mechanism » App vs. Network layer » Pull vs. Push [Figure: shuffle coflow between mappers and reducers] 1. Orchestra, SIGCOMM 2011 9
Flexibility @driver b ← create(BCAST) … put(b, id, content) … terminate(b) @mapper get(b, id) … [Figure: driver (JobTracker) broadcasts to mappers; shuffle coflow between mappers and reducers] 10
Flexibility @driver b ← create(BCAST) s ← create(SHUFFLE, ord=[b ~> s]) put(b, id, content) … terminate(b) terminate(s) @mapper get(b, id) put(s, id_s1) … [Figure: driver (JobTracker) broadcasts to mappers; shuffle coflow between mappers and reducers] 11
Throughput-Sensitive Applications After 2 seconds Minimize Completion Time 12
Throughput-Sensitive Applications After 4 seconds After 7 seconds After 2 seconds Minimize Completion Time 13
Throughput-Sensitive Applications Free up resources without hurting application-perceived communication time After 7 seconds After 2 seconds Minimize Completion Time 14
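One way to read this slide: a coflow is done only when its slowest flow is done, so faster flows can be paced down to finish at the same moment, and the bandwidth they give up becomes available to others. The sketch below works through that arithmetic under loud assumptions: each flow has its own link of equal capacity, and the sizes 2, 4, and 7 are illustrative values chosen so the flows would finish after 2, 4, and 7 seconds at full rate. The function name is hypothetical.

```python
from fractions import Fraction


def pace_to_bottleneck(flow_sizes, link_capacity):
    """Slow every flow so that all finish together with the slowest one.

    Assumes one flow per link and the same capacity on every link.
    """
    # The coflow cannot finish before its largest flow does.
    completion = max(Fraction(size) / link_capacity for size in flow_sizes)
    # Rate each flow needs in order to finish exactly at `completion`.
    rates = [Fraction(size) / completion for size in flow_sizes]
    # Capacity left over on each link for other traffic.
    freed = [link_capacity - rate for rate in rates]
    return completion, rates, freed


completion, rates, freed = pace_to_bottleneck([2, 4, 7], link_capacity=1)
```

The coflow still completes after 7 time units, yet the two smaller flows leave 5/7 and 3/7 of their links unused for the whole period.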
Latency-Sensitive Applications HotNets 2012 Top-level Aggregator Mid-level Aggregators Workers 15
Latency-Sensitive Applications HotNets 2012 [Figure: search results for "HotNets 2012" assembled through a top-level aggregator, mid-level aggregators, and workers] Meet Deadline [1, 2] Limit impact to as few requests as possible 1. D3, SIGCOMM 2011 2. PDQ, SIGCOMM 2012 16
One More Thing… 1. Critical Path Scheduling 2. OpenTCP 3. Structured Streams 4. … 17
Coflow A semantically-bound collection of flows Conveys application intent to the network » Allows better management of network resources » Provides greater flexibility in designing applications Mosharaf Chowdhury http://www.mosharaf.com/ UC Berkeley
Critical Path Scheduling Communication of a cluster application is represented by a partially-ordered set of coflows [Figure: graph of coflows labeled S, A, and B] Network allocation takes place among these partially-ordered sets of coflows 19
Coflow API
Operation                                    Caller
create(PATTERN, [opt]) → handle              Driver
put(handle, id, content, [opt]) → result     Sender
get(handle, id, [opt]) → content             Receiver
terminate(handle, [opt]) → result            Driver
20
Throughput-Sensitive Applications Minimize Completion Time [1] [Figure: data flow in the MapReduce framework, with labels Map Stage, Local shuffle finishes, Shuffle finishes, Reduce Stage, Job finishes] 1. Orchestra, SIGCOMM 2011 21
Coflow Resource Allocation 1. Weights [Across Apps] Weighted sharing between coflows @driver shuffle1 ← create(SHUFFLE, weight=1) shuffle2 ← create(SHUFFLE, weight=2) … [Figure: Job 1 and Job 2, each with a shuffle coflow between mappers and reducers] 22
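Weighted sharing can be illustrated with a few lines of arithmetic. The sketch assumes the simplest reading, that each coflow receives a share of a single contended capacity in proportion to its weight; the function name and the unit capacity are hypothetical, and the slides do not specify how a controller actually enforces the split.

```python
from fractions import Fraction


def weighted_shares(weights, capacity):
    """Split `capacity` among coflows in proportion to their weights."""
    total = sum(weights.values())
    return {name: Fraction(capacity) * w / total for name, w in weights.items()}


# Weights taken from the slide: shuffle1 has weight 1, shuffle2 has weight 2.
shares = weighted_shares({"shuffle1": 1, "shuffle2": 2}, capacity=1)
```

With these weights shuffle2 receives twice the share of shuffle1, that is 2/3 versus 1/3 of the contended capacity.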
Coflow Resource Allocation 2. Priorities [Across Apps] Strict priorities @driver shuffle1 ← create(SHUFFLE, pri=3) shuffle2 ← create(SHUFFLE, pri=5) … [Figure: Job 1 and Job 2, each with a shuffle coflow between mappers and reducers] 23
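Strict priorities differ from weights in that a higher-priority coflow is served in full before a lower-priority one gets anything. The sketch below assumes that a larger `pri` value means higher priority, which the slide does not state, and it invents per-coflow demands and a total capacity purely for illustration.

```python
def strict_priority_allocation(coflows, capacity):
    """Serve coflows in priority order; each takes what it needs from what is left.

    `coflows` maps a name to (priority, demand). Larger priority wins
    (an assumption; the slide does not say which direction is higher).
    """
    remaining = capacity
    allocation = {}
    by_priority = sorted(coflows.items(), key=lambda item: item[1][0], reverse=True)
    for name, (_priority, demand) in by_priority:
        granted = min(demand, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


# pri values from the slide; demands (8 and 6) and capacity (10) are made up.
allocation = strict_priority_allocation(
    {"shuffle1": (3, 8), "shuffle2": (5, 6)}, capacity=10
)
```

Here shuffle2 is fully served with 6 units and shuffle1 receives only the 4 units left over.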
Coflow Resource Allocation 3. Dependencies [Within Apps] finishes_before (~>) @driver b ← create(BCAST) shuffle2 ← create(SHUFFLE, ord=[b ~> shuffle2]) agg ← create(AGGR, ord=[shuffle2 ~> agg]) [Figure: Job 1 with shuffle1 between mappers and reducers; Job 2 with broadcast (b) from the driver, shuffle2 between mappers and reducers, and aggregation (agg)] 24
Coflow Resource Allocation Communication of a cluster application is represented by a partially-ordered set of coflows [Figure: graph of coflows labeled S, A, and B] Network allocation takes place among these partially-ordered sets of coflows 25
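The finishes_before (~>) relation defines a partial order over coflows, and any schedule has to respect it. A minimal sketch using Python's standard `graphlib` module (Python 3.9+) derives one valid order from the pairs on this slide; the helper name `coflow_order` and the tuple encoding of `~>` are assumptions made for the example.

```python
from graphlib import TopologicalSorter


def coflow_order(finishes_before):
    """Return one order consistent with (earlier, later) finishes_before pairs."""
    sorter = TopologicalSorter()
    for earlier, later in finishes_before:
        # `later` depends on `earlier` having finished.
        sorter.add(later, earlier)
    return list(sorter.static_order())


# b ~> shuffle2 and shuffle2 ~> agg, as declared by the driver.
order = coflow_order([("b", "shuffle2"), ("shuffle2", "agg")])
```

For this chain there is exactly one valid order: b, then shuffle2, then agg. A cycle in the declared pairs would make `static_order` raise `graphlib.CycleError`.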