Data Center Traffic Engineering Jennifer Rexford Fall 2010


Data Center Traffic Engineering
Jennifer Rexford, Fall 2010 (TTh 1:30-2:50 in COS 302)
COS 561: Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall10/cos561/

Cloud Computing


Cloud Computing
• Elastic resources
  – Expand and contract resources
  – Pay-per-use
  – Infrastructure on demand
• Multi-tenancy
  – Multiple independent users
  – Security and resource isolation
  – Amortize the cost of the (shared) infrastructure
• Flexible service management
  – Resiliency: isolate failure of servers and storage
  – Workload movement: move work to other locations


Cloud Service Models
• Software as a Service
  – Provider licenses applications to users as a service
  – E.g., customer relationship management, e-mail, …
  – Avoid costs of installation, maintenance, patches, …
• Platform as a Service
  – Provider offers software platform for building applications
  – E.g., Google's App Engine
  – Avoid worrying about scalability of platform
• Infrastructure as a Service
  – Provider offers raw computing, storage, and network
  – E.g., Amazon's Elastic Compute Cloud (EC2)
  – Avoid buying servers and estimating resource needs


Multi-Tier Applications
• Applications consist of tasks
  – Many separate components
  – Running on different machines
• Commodity computers
  – Many general-purpose computers
  – Not one big mainframe
  – Easier scaling
[Figure: a front-end server feeding a tree of aggregators, fanning out to workers]


Enabling Technology: Virtualization
• Multiple virtual machines on one physical machine
• Applications run unmodified as on real machine
• VM can migrate from one computer to another

Data Center Network

Status Quo: Virtual Switch in Server


Top-of-Rack Architecture
• Rack of servers
  – Commodity servers
  – And top-of-rack switch
• Modular design
  – Preconfigured racks
  – Power, network, and storage cabling
• Aggregate to the next level

Modularity, Modularity
• Containers
• Many containers



Data Center Network Topology
[Figure: the Internet connects to core routers (CR), which fan out to access routers (AR), Ethernet switches (S), and racks of servers (A); ~1,000 servers per pod]
Key: CR = Core Router, AR = Access Router, S = Ethernet Switch, A = Rack of app. servers


Capacity Mismatch
[Figure: oversubscription in the tree topology: roughly 200:1 at the core routers, 40:1 at the aggregation switches, and 5:1 at the top-of-rack switches]


Data-Center Routing
[Figure: core and access routers form the DC-Layer 3 domain; the Ethernet switches below form DC-Layer 2 islands; each pod of ~1,000 servers is one IP subnet]
Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app. servers


Reminder: Layer 2 vs. Layer 3
• Ethernet switching (layer 2)
  – Cheaper switch equipment
  – Fixed addresses and auto-configuration
  – Seamless mobility, migration, and failover
• IP routing (layer 3)
  – Scalability through hierarchical addressing
  – Efficiency through shortest-path routing
  – Multipath routing through equal-cost multipath
• So, like in enterprises…
  – Data centers often connect layer-2 islands by IP routers
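The equal-cost multipath (ECMP) idea above can be sketched in a few lines: a router hashes a flow's 5-tuple and uses the result to pick among equal-cost next hops, so every packet of a flow follows the same path. This is an illustrative sketch, not any router's actual implementation; the function name, the MD5 hash choice, and the addresses are assumptions.

```python
import hashlib

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow's
    5-tuple, so all packets of a flow take the same path (no reordering)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.md5(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

paths = ["AR-1", "AR-2"]
# Packets of the same flow always map to the same access router.
choice = ecmp_next_hop("10.0.1.5", "10.0.9.7", 43211, 80, "tcp", paths)
assert choice == ecmp_next_hop("10.0.1.5", "10.0.9.7", 43211, 80, "tcp", paths)
```

Hashing per flow (rather than per packet) trades perfect balance for in-order delivery within each TCP connection.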


Load Balancers
• Spread load over server replicas
  – Present a single public address (VIP) for a service
  – Direct each request to a server replica
[Figure: a single virtual IP (192.121.10.x) directing requests to three server replicas (10.10.10.1, 10.10.10.2, 10.10.10.3)]
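A minimal sketch of the idea above, assuming a simple round-robin policy (real load balancers also track server health and session affinity); the class name and all addresses are illustrative:

```python
from itertools import cycle

class LoadBalancer:
    """Toy load balancer: one public VIP, requests spread round-robin
    over replica addresses."""
    def __init__(self, vip, replicas):
        self.vip = vip
        self._next = cycle(replicas)

    def route(self, request):
        # Every request arrives addressed to the VIP and is handed
        # to the replica chosen for it.
        return (request, next(self._next))

lb = LoadBalancer("192.121.10.1", ["10.10.10.1", "10.10.10.2", "10.10.10.3"])
targets = [lb.route(f"req-{i}")[1] for i in range(4)]
# Round-robin cycles through the replicas and wraps back to the first.
```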


Data Center Costs (Monthly Costs)
• Servers: 45%
  – CPU, memory, disk
• Infrastructure: 25%
  – UPS, cooling, power distribution
• Power draw: 15%
  – Electrical utility costs
• Network: 15%
  – Switches, links, transit
http://perspectives.mvdirona.com/2008/11/28/CostOfPowerInLargeScaleDataCenters.aspx


Wide-Area Network
[Figure: clients on the Internet are directed to one of several data centers by DNS-based site selection; each data center has its own router and servers]

Wide-Area Network: Ingress Proxies
[Figure: client requests enter through ingress proxies, which relay them to the data centers' routers and servers]

Data Center Traffic Engineering
Challenges and Opportunities


Traffic Engineering Challenges
• Scale
  – Many switches, hosts, and virtual machines
• Churn
  – Large number of component failures
  – Virtual Machine (VM) migration
• Traffic characteristics
  – High traffic volume and dense traffic matrix
  – Volatile, unpredictable traffic patterns
• Performance requirements
  – Delay-sensitive applications
  – Resource isolation between tenants


Traffic Engineering Opportunities
• Efficient network
  – Low propagation delay and high capacity
• Specialized topology
  – Fat tree, Clos network, etc.
  – Opportunities for hierarchical addressing
• Control over both network and hosts
  – Joint optimization of routing and server placement
  – Can move network functionality into the end host
• Flexible movement of workload
  – Services replicated at multiple servers and data centers
  – Virtual Machine (VM) migration

VL2 Paper
Slides from Changhoon Kim (now at Microsoft)


Virtual Layer 2 Switch: The Illusion of a Huge L2 Switch
[Figure: the physical CR/AR/S hierarchy presented to servers as one big L2 switch providing (1) L2 semantics, (2) uniform high capacity, and (3) performance isolation]


VL2 Goals and Solutions

Objective | Approach | Solution
1. Layer-2 semantics | Employ flat addressing | Name-location separation & resolution service
2. Uniform high capacity between servers | Guarantee bandwidth for hose-model traffic | Flow-based random traffic indirection (Valiant LB)
3. Performance isolation | Enforce hose model using existing mechanisms only | TCP

("Hose": each node has ingress/egress bandwidth constraints)
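The hose model above constrains only each node's total ingress and egress rates, not individual pairwise demands. A small sketch, with hypothetical node names, capacities, and demands, of checking whether a traffic matrix fits within per-node hose constraints:

```python
def within_hose_model(tm, ingress_cap, egress_cap):
    """Check that a traffic matrix respects hose-model constraints:
    each node's total sending rate is within its egress cap and its
    total receiving rate within its ingress cap.
    tm maps (src, dst) pairs to demand (e.g., in Gb/s)."""
    sent, recv = {}, {}
    for (src, dst), rate in tm.items():
        sent[src] = sent.get(src, 0) + rate
        recv[dst] = recv.get(dst, 0) + rate
    return (all(v <= egress_cap[n] for n, v in sent.items()) and
            all(v <= ingress_cap[n] for n, v in recv.items()))

caps = {"x": 1.0, "y": 1.0, "z": 1.0}
tm = {("x", "y"): 0.6, ("x", "z"): 0.3, ("y", "z"): 0.5}
within_hose_model(tm, caps, caps)   # True: no node exceeds its hose
```

Any matrix that passes this check can, in principle, be served by a network provisioned for the hose model, which is what lets VL2 promise uniform capacity without knowing the matrix in advance.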


Name/Location Separation: Cope with host churn with very little overhead
• Switches run link-state routing and maintain only switch-level topology
• Servers use flat names; a directory service resolves names to locations (lookup & response)
• Allows use of low-cost switches
• Protects network and hosts from host-state churn
• Obviates host and switch reconfiguration
[Figure: the directory service maps flat server names (x, y, z) to their ToR switches; packets carry the destination's ToR as the outer address]
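A toy sketch of the directory service described above: a mapping from a server's flat name to the locator of its ToR switch, updated on registration so that VM migration changes only a directory entry, not switch state. The class interface and names are illustrative assumptions, not VL2's actual API:

```python
class DirectoryService:
    """Toy VL2-style directory: maps a server's flat application
    address to the locator address of its ToR switch."""
    def __init__(self):
        self._location = {}

    def register(self, server, tor):
        # Called when a host boots or a VM migrates to a new rack.
        self._location[server] = tor

    def lookup(self, server):
        # Senders resolve the flat name to a ToR locator, then
        # tunnel packets to that ToR.
        return self._location[server]

d = DirectoryService()
d.register("x", "ToR-1")
d.register("y", "ToR-3")
d.lookup("y")              # -> "ToR-3"
d.register("y", "ToR-4")   # VM migration: only the directory changes
d.lookup("y")              # -> "ToR-4"
```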


Clos Network Topology: Offer huge aggregate capacity & multipath at modest cost
[Figure: VL2 Clos topology with intermediate, aggregation, and ToR layers; 20 servers per ToR]

D (# of 10G ports) | Max DC size (# of servers)
48 | 11,520
96 | 46,080
144 | 103,680

With K aggregation switches of D ports each: 20·(DK/4) servers
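The sizing formula above can be checked with a few lines of arithmetic; the function name and the assumption that K = D when unspecified are mine:

```python
def max_servers(d_ports, k_aggr=None, servers_per_tor=20):
    """Max data-center size for a VL2-style Clos network using the
    slide's formula: servers = servers_per_tor * (D*K/4). Assumes the
    number of aggregation switches K equals D when not given."""
    k = d_ports if k_aggr is None else k_aggr
    return servers_per_tor * d_ports * k // 4

sizes = {d: max_servers(d) for d in (48, 96, 144)}
# sizes == {48: 11520, 96: 46080, 144: 103680}, matching the table above
```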


Valiant Load Balancing: Indirection
Cope with arbitrary traffic matrices with very little overhead
• Each flow is bounced off a randomly chosen intermediate switch (IANY) via ECMP + IP anycast: some links carry up-paths to the intermediate, others down-paths to the destination
• Harness huge bisection bandwidth
• Obviate esoteric traffic engineering or optimization
• Ensure robustness to failures
• Work with switch mechanisms available today
• Requirements: (1) must spread traffic, (2) must ensure destination independence (equal-cost multipath forwarding)
[Figure: ToRs T1-T6; traffic from x to y and z is indirected through IANY]
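A minimal sketch of the Valiant load balancing idea above: each flow's path goes through a randomly chosen intermediate switch, independent of the destination, which spreads any traffic matrix across all links. Function and switch names are illustrative:

```python
import random

def vlb_path(src_tor, dst_tor, intermediates, rng=random):
    """Valiant load balancing sketch: send each flow first to a randomly
    chosen intermediate switch, then down to its destination ToR.
    The random bounce spreads arbitrary traffic matrices over all links."""
    return [src_tor, rng.choice(intermediates), dst_tor]

ints = ["Int-1", "Int-2", "Int-3"]
path = vlb_path("T1", "T5", ints)
# path is [src, random intermediate, dst]; only the middle hop varies
```

Picking the intermediate per flow (not per packet) is what lets this coexist with TCP, at the cost of imperfect balance when a few flows are very large.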


VL2 vs. Seattle
• Similar "virtual layer 2" abstraction
  – Flat end-point addresses
  – Indirection through intermediate node
• Enterprise networks (Seattle)
  – Hard to change hosts → directory on the switches
  – Sparse traffic patterns → effectiveness of caching
  – Predictable traffic patterns → no emphasis on TE
• Data center networks (VL2)
  – Easy to change hosts → move functionality to hosts
  – Dense traffic matrix → reduce dependency on caching
  – Unpredictable traffic patterns → ECMP and VLB for TE

Ongoing Research


Research Questions
• What topology to use in data centers?
  – Reducing wiring complexity
  – Achieving high bisection bandwidth
  – Exploiting capabilities of optics and wireless
• Routing architecture?
  – Flat layer-2 network vs. hybrid switch/router
  – Flat vs. hierarchical addressing
• How to perform traffic engineering?
  – Over-engineering vs. adapting to load
  – Server selection, VM placement, or optimizing routing
• Virtualization of NICs, servers, switches, …


Research Questions
• Rethinking TCP congestion control?
  – Low propagation delay and high bandwidth
  – "Incast" problem leading to bursty packet loss
• Division of labor for TE, access control, …
  – VM, hypervisor, ToR, and core switches/routers
• Reducing energy consumption
  – Better load balancing vs. selectively shutting down
• Wide-area traffic engineering
  – Selecting the least-loaded or closest data center
• Security
  – Preventing information leakage and attacks

Discuss