CLOVE: How I Learned to Stop Worrying About the Core and Love the Edge
Aditi Ghag 2, Naga Katta 1,2, Mukesh Hira 2, Changhoon Kim 3, Isaac Keslassy 2,4, Jennifer Rexford 1
1 Princeton University, 2 VMware, 3 Barefoot Networks, 4 Technion
Data center load balancing today
Equal-Cost Multi-Path (ECMP) routing:
§ Path(packet) = hash(5-tuple: {src/dst IP, src/dst port, protocol number})
§ Hash collisions
§ Coarse-grained
§ Congestion-oblivious
[Figure: leaf-spine topology with spine switches, leaf switches, and servers]
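The hash-based path selection above can be sketched as follows. This is a minimal illustration, not the hash used by any real switch ASIC (hardware typically uses CRC or vendor-specific hash functions over the packed header fields):

```python
# Illustrative ECMP path selection: hash the 5-tuple, pick one of the
# equal-cost paths. Real switches hash packed header bits in hardware.
import zlib

def ecmp_path(src_ip, dst_ip, src_port, dst_port, proto, num_paths):
    """Map a flow's 5-tuple onto one of num_paths equal-cost paths."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % num_paths

# Two distinct flows may collide on the same path regardless of their
# rates -- this is the congestion-oblivious behavior the slide describes.
p1 = ecmp_path("10.0.0.1", "10.0.1.1", 40001, 80, 6, num_paths=4)
p2 = ecmp_path("10.0.0.2", "10.0.1.2", 40002, 80, 6, num_paths=4)
```

Because the mapping is a pure function of the 5-tuple, every packet of a flow follows the same path, which is what makes ECMP coarse-grained.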
Proposed load balancing schemes
Centralized load balancing (Hedera, MicroTE, SWAN, Fastpass):
§ Slow reaction time
§ Route computation overhead
Proposed load balancing schemes
In-network load balancing (CONGA, HULA):
§ Needs a custom-ASIC data center fabric
§ High capital cost
Proposed load balancing schemes
End-host load balancing
PRESTO:
§ Non-standard forwarding based on shadow MAC labels
§ Controller intervention in case of asymmetry
MPTCP:
§ Incast collapse
§ Changes to the guest VM network stack
vSwitch as the sweet spot
§ Network switches run standard ECMP using the 5-tuple
§ Outer transport source ports are used for ECMP traffic distribution
[Figure: leaf-spine fabric with vSwitches at the hypervisors; encapsulated packet layout: Eth | IP | TCP | Overlay | Eth | IP | Payload]
CLOVE design
§ Path discovery
§ Load balancing flowlets
§ vSwitch load balancing
Path Discovery
§ Outer transport source port maps to a network path
§ Standard ECMP in the physical network
§ Hypervisor learns the source-port-to-path mapping

Dst | SPort
H2 | 5001
H2 | 5002
H2 | 5003
H2 | 5004

[Figure: vSwitch at hypervisor H1 sends data toward hypervisor H2 with a fixed overlay destination port and varying outer source ports; each source port hashes onto a different ECMP path]
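The per-destination table above can be sketched as a small data structure at the source vSwitch. The class and method names here are hypothetical, chosen only to illustrate the mapping the slide describes:

```python
# Hypothetical sketch of the path table a source vSwitch maintains:
# each learned outer transport source port corresponds to one physical
# path through the ECMP fabric toward a destination hypervisor.
class PathTable:
    def __init__(self):
        self.paths = {}  # dst hypervisor -> list of learned source ports

    def learn(self, dst, src_port):
        """Record that src_port reaches dst over a distinct ECMP path."""
        ports = self.paths.setdefault(dst, [])
        if src_port not in ports:
            ports.append(src_port)

# Ports discovered toward hypervisor H2, matching the slide's table.
table = PathTable()
for port in (5001, 5002, 5003, 5004):
    table.learn("H2", port)
```

Once populated, the table gives the vSwitch one "handle" per physical path without any changes to the switches themselves.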
Load balancing flowlets
§ The source vSwitch splits a flow into flowlets, separated by a flowlet gap
§ Each flowlet is sent with a different learned outer source port (5001-5004) and therefore takes a different ECMP path

[Figure: H1's vSwitch sends successive flowlets toward H2, each encapsulated with outer TCP source port 5001, 5002, 5003, or 5004]
Edge-Flowlet
§ Flowlet splitting in the vSwitch
§ Select a learned source port for each flowlet at random
§ Physical switches forward flowlets using ECMP
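The Edge-Flowlet scheme above can be sketched as follows. The gap threshold value and all names are illustrative assumptions, not taken from the paper:

```python
# Minimal sketch of Edge-Flowlet: when the inter-packet gap of a flow
# exceeds the flowlet-gap threshold, start a new flowlet and assign it
# a random learned source port. The 500 us threshold is an assumption.
import random

FLOWLET_GAP = 0.0005  # seconds; illustrative value

class EdgeFlowlet:
    def __init__(self, ports):
        self.ports = ports      # learned source ports, one per path
        self.last_seen = {}     # flow id -> timestamp of last packet
        self.assigned = {}      # flow id -> current outer source port

    def source_port(self, flow, now):
        """Return the outer source port to use for this packet."""
        last = self.last_seen.get(flow)
        if last is None or now - last > FLOWLET_GAP:
            # Gap exceeded: new flowlet, pick a path at random.
            self.assigned[flow] = random.choice(self.ports)
        self.last_seen[flow] = now
        return self.assigned[flow]

lb = EdgeFlowlet([5001, 5002, 5003, 5004])
a = lb.source_port("flow1", 0.0)       # first packet starts a flowlet
b = lb.source_port("flow1", 0.0001)    # within the gap: same port
```

Packets within a flowlet keep the same source port, so they stay on one path and arrive in order; only flowlet boundaries reshuffle paths.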
CLOVE-ECN
§ Congestion-aware balancing based on ECN feedback

1. Src vSwitch detects and forwards flowlets
2. Switches mark ECN on data packets
3. Dst vSwitch relays ECN and src port to src vSwitch
4. Return packet carries ECN and src port for the forward path
5. Src vSwitch adjusts path weights for the src port

Path weight table:
Dst | SPort | Wt
H2 | 5001 | 0.25
H2 | 5002 | 0.25 -> 0.3
H2 | 5003 | 0.25
H2 | 5004 | 0.25 -> 0.3
CLOVE-INT
§ Utilization-aware balancing based on INT feedback

1. Src vSwitch adds INT instructions to flowlets
2. Switches add requested link utilization
3. Dst vSwitch relays link utilization and src port to src vSwitch
4. Return packet carries link utilization for the forward path
5. Src vSwitch updates link utilizations
6. Src vSwitch forwards flowlets on least-utilized paths

Path utilization table:
Dst | SPort | Util
H2 | 5001 | 40
H2 | 5002 | 30
H2 | 5003 | 50
H2 | 5004 | 10
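Steps 5 and 6 can be sketched as a per-port utilization table with a least-utilized selection rule. The class name is hypothetical; the utilization values mirror the table above:

```python
# Sketch of CLOVE-INT path selection: record the INT-reported link
# utilization for each learned source port and send new flowlets on
# the least-utilized path.
class CloveInt:
    def __init__(self, ports):
        self.util = {p: 0.0 for p in ports}

    def on_feedback(self, port, utilization):
        """Update the utilization reported for the path behind `port`."""
        self.util[port] = utilization

    def pick(self):
        """Return the source port whose path is currently least utilized."""
        return min(self.util, key=self.util.get)

# Feedback matching the slide's table: port 5004's path is least loaded.
lb = CloveInt([5001, 5002, 5003, 5004])
for port, u in [(5001, 40), (5002, 30), (5003, 50), (5004, 10)]:
    lb.on_feedback(port, u)
```

With the table values above, the next flowlet would be sent on port 5004.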
Performance evaluation setup
§ Implementation in the ns-2 simulator
§ Realistic web search workload
§ 16 clients on Leaf 1 <-> 16 servers on Leaf 2
§ Topology: 2 spine switches, 2 leaf switches; 4 x 4 Gbps leaf-spine links, 16 x 1 Gbps server links
Symmetric topology
§ 1.4x lower FCT than ECMP
§ 1.1x higher FCT than CONGA
CLOVE-ECN captures 82% of the performance gain between ECMP and CONGA
Asymmetric topology
§ 3x lower FCT than ECMP
§ 1.2x higher FCT than CONGA
CLOVE-ECN captures 80% of the performance gain between ECMP and CONGA
CLOVE highlights
§ Captures 80% of the performance gain of CONGA
§ No changes to data center infrastructure or guest VMs
§ Adapts to asymmetry without any controller input
§ Scalable due to distributed state
§ Modular implementation
Future work § Use packet latency to infer congestion § Adapt flowlet-gap to network conditions § Fine-tune congestion-management algorithm § Stability § Analyze processing overhead
THANK YOU Questions?