How LinkedIn used TCP Anycast to make the site faster
- Slides: 43
How LinkedIn used TCP Anycast to make the site faster
Ritesh Maheshwari, Shawn Zandi
Anycast
• Anycast provides a distributed service via routing.
• It is not really different from unicast.
• An NLRI object with multiple next-hops.
• It simply works for both TCP and UDP applications (use with caution!).
[Diagram: Bob and three sites (NYC, CHI, SF), each serving www.linkedin.com at 2001:db8::1/56]
Anycast with ECMP
• Not a real issue in today's Internet.
• Consistent per-flow routing is required (per-packet load balancing breaks anycast).
  – Per-flow routing is pretty much standard.
• Most BGP implementations do not load-balance across different AS paths, even when the paths have the same length.
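To illustrate why per-flow hashing keeps an anycast TCP connection on one site while per-packet balancing does not, here is a minimal Python sketch. The next-hop names, the client address, and the use of SHA-256 over the 5-tuple are assumptions for illustration only; real routers use their own hash functions.

```python
import hashlib
from itertools import cycle

# Hypothetical equal-cost next hops toward the same anycast prefix.
NEXT_HOPS = ["nyc", "chi", "sf"]


def per_flow_next_hop(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Pick a next hop from a hash of the 5-tuple, so a flow never moves."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(NEXT_HOPS)
    return NEXT_HOPS[index]


def per_packet_next_hops(packet_count):
    """Round-robin per packet: packets of one flow land on different sites."""
    rotation = cycle(NEXT_HOPS)
    return [next(rotation) for _ in range(packet_count)]


# Hypothetical client flow toward the anycast address from the diagram.
flow = ("2001:db8:ffff::7", 51515, "2001:db8::1", 443)
flow_choices = {per_flow_next_hop(*flow) for _ in range(10)}
packet_choices = set(per_packet_next_hops(10))
print(len(flow_choices), len(packet_choices))  # prints: 1 3
```

With per-flow hashing, all ten packets reach the same site, so the TCP state stays valid. With per-packet round-robin, the same connection would be sprayed across three sites that do not share that state.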
Anycast Complications
• Broken MTU challenges
  – An ICMP message may not reach the intended receiver to report an MTU problem. Adjusting MSS can help.
• RPF checks
• Multiple covering prefixes
  – Only one service address should be covered by each advertised prefix (/24 or /56)
• Monitoring!
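The MSS adjustment mentioned above comes down to simple arithmetic: the TCP payload has to fit in the smallest MTU on the path once the IP and TCP headers are subtracted. A minimal sketch, assuming headers without options (20 bytes for IPv4, 40 for IPv6, 20 for TCP); the 1400-byte MTU is a hypothetical tunnel example, not a value from the talk.

```python
IPV4_HEADER = 20  # bytes, no options
IPV6_HEADER = 40  # bytes, fixed header only
TCP_HEADER = 20   # bytes, no options


def mss_for(mtu, ipv6=False):
    """Largest TCP payload that fits in one packet of the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(mss_for(1500))             # 1460
print(mss_for(1500, ipv6=True))  # 1440
print(mss_for(1400, ipv6=True))  # 1340
```

Advertising a conservative MSS avoids relying on ICMP "packet too big" messages, which with anycast may be delivered to a different site than the one holding the connection.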
But! How to measure Anycast effectiveness?
What is RUM?
JavaScript (client-side code) to measure performance
• DNS time
• Connection time
• First byte time
• Download time
• Page load time
What are PoPs?
Point of Presence / PoP
• Small-scale data centers
• Proxy servers at LinkedIn (ATS)
Without PoPs (Browser to Data Center: 250 ms)
• Connection time: 250 ms
• Server compute time: 500 ms
• First byte time + page download time: 3 to 5 round trips; 5 RTTs = 5 x 250 ms = 1250 ms
• Total = 2000 ms
With PoPs (Browser to PoP: 100 ms; PoP to Data Center: 250 ms)
• Connection time (browser to PoP): 100 ms
• PoP to Data Center: old TCP connection, one round trip
• Server compute time: 500 ms
• First byte time + page download time: 5 RTTs = 5 x 100 ms = 500 ms
• Total = 1100 ms
900 ms gain!
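The totals on the "Without PoPs" and "With PoPs" slides follow from one formula: connection time plus server compute time plus download round trips. The sketch below reproduces that arithmetic using only the three terms the slides add up. The function and parameter names are illustrative, not from the talk.

```python
def page_load_ms(rtt_ms, compute_ms=500, download_round_trips=5):
    """Sum the three terms shown on the slides."""
    connection_ms = rtt_ms                       # one round trip to connect
    download_ms = download_round_trips * rtt_ms  # first byte + page download
    return connection_ms + compute_ms + download_ms


without_pops = page_load_ms(rtt_ms=250)  # browser talks to the data center
with_pops = page_load_ms(rtt_ms=100)     # browser talks to a nearby PoP
print(without_pops, with_pops, without_pops - with_pops)  # 2000 1100 900
```

Because the round-trip time is multiplied by six in this model (one connect plus five download round trips), every millisecond shaved off the browser's RTT is worth six milliseconds of page load.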
How are users assigned to PoPs?
Through DNS: the IP handed out is based on the country of the user's resolver
# California
$ dig +short www.linkedin.com
216.52.242.80
# Spain
$ dig @109.69.8.51 +short www.linkedin.com
91.225.248.80
Should India connect to Singapore or Dublin? How to ensure optimal PoP assignment?
RUM beacons
Fetch a tiny object from each candidate PoP. For each pop_name:
1. Start timer
2. Fetch {pop_name}.perf.linkedin.com/pop/admin
3. Stop timer
Send data back to our servers
• Millions of agents!
• Analyze data to find the “optimal” PoP per country
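The three beacon steps can be sketched as a small timing loop. The real RUM code runs as JavaScript in the browser; this Python version only illustrates the logic. The PoP names `sin` and `dub`, the `https` scheme, and the injectable `fetch` and `clock` parameters are assumptions made so the sketch runs offline.

```python
import time


def beacon_url(pop_name):
    # Host and path pattern come from the slide; the scheme is an assumption.
    return f"https://{pop_name}.perf.linkedin.com/pop/admin"


def time_beacons(pop_names, fetch, clock=time.perf_counter):
    """Return {pop_name: elapsed_ms} for one fetch per candidate PoP."""
    timings = {}
    for pop_name in pop_names:
        start = clock()                                  # 1. start timer
        fetch(beacon_url(pop_name))                      # 2. fetch tiny object
        timings[pop_name] = (clock() - start) * 1000.0   # 3. stop timer
    return timings


def fake_fetch(url):
    """Stand-in for a real HTTP GET so the sketch needs no network."""
    time.sleep(0.001)


sample = time_beacons(["sin", "dub"], fake_fetch)
print(sorted(sample))  # ['dub', 'sin']
```

In a real deployment the resulting timings would be sent back with the rest of the RUM data, tagged with the user's country, so they can be aggregated per country and per PoP.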
We can assign countries to new PoPs!
Country: China, India
PoP: Hong Kong, Dublin, Singapore
Median Beacon Time (ms): 434, 1216, 515, 1368, 1042, 898
We can audit the current assignment!
Columns: Country | Is PoP optimal? | Current PoP | Optimal PoP
Countries: India, Pakistan, Spain, Brazil, Netherlands, UAE, Italy, Mexico, Russia
Cells as extracted (row alignment lost):
Is PoP optimal?: TRUE, FALSE, TRUE
Current PoP: Singapore, Dublin, US West Coast, Dublin
Optimal PoP: Singapore, Dublin, US East Coast, Dublin
Mexico: TRUE, US West Coast
Russia: FALSE, US West Coast, Dublin
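Both uses of the beacon data, picking a PoP for a country and auditing an existing assignment, reduce to comparing median beacon times per (country, PoP) pair. The sketch below shows one way to do that. All sample timings, country codes, and PoP identifiers are made up for illustration and are not LinkedIn's data.

```python
from statistics import median

# Hypothetical beacon samples: (country, pop, fetch_time_ms).
SAMPLES = [
    ("IN", "singapore", 880), ("IN", "singapore", 910), ("IN", "singapore", 950),
    ("IN", "dublin", 1000), ("IN", "dublin", 1050), ("IN", "dublin", 1200),
    ("RU", "us-west", 1500), ("RU", "us-west", 1600),
    ("RU", "dublin", 700), ("RU", "dublin", 760),
]
# Hypothetical current DNS assignment per country.
CURRENT = {"IN": "singapore", "RU": "us-west"}


def optimal_pops(samples):
    """Return {country: pop with the lowest median beacon time}."""
    by_key = {}
    for country, pop, ms in samples:
        by_key.setdefault((country, pop), []).append(ms)
    best = {}
    for (country, pop), times in by_key.items():
        mid = median(times)
        if country not in best or mid < best[country][1]:
            best[country] = (pop, mid)
    return {country: pop for country, (pop, _) in best.items()}


def audit(current, optimal):
    """Return {country: True if the current PoP is the optimal one}."""
    return {country: current[country] == optimal[country] for country in current}


optimal = optimal_pops(SAMPLES)
print(optimal)                  # {'IN': 'singapore', 'RU': 'dublin'}
print(audit(CURRENT, optimal))  # {'IN': True, 'RU': False}
```

The median is used here because the slides report median beacon time; a production analysis might also look at higher percentiles before moving a country.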
LinkedIn Homepage Download Time Improvement
[Chart: percentage improvement (y-axis, 0% to 30%) for India, Pakistan, Singapore, Russia, and Brazil; series: median improvement and 90th percentile improvement]
Plot Twist: Assignment far from optimal
• About 31% of US traffic gets assigned to a suboptimal PoP.
  – 45% of East Coast
• About 10% of traffic globally gets assigned to a suboptimal PoP.
DNS PoP assignment is suboptimal
• Assignment is based on the resolver IP, not the client IP
  [Diagram labels: California, DNS Resolver, PoP US West, New York, PoP US East]
• Bad IP-to-geo databases
  – Resolver is really in NY, but the database says CA
Story so far
1. We built PoPs
2. …used RUM to assign users to optimal PoPs
3. …found DNS-based assignment is suboptimal
Accurate PoP assignment problem
• Bug our DNS providers (31% -> 27%)
• Run our own DNS
How about Anycast?
Anycast – One IP, Multiple Servers
[Diagram: Bob and three PoPs (PoP A, PoP B, PoP C), each labeled 1.1]
✓ Client IP, not resolver IP, is used!
✓ No geo-IP databases
How does Anycast compare to DNS?
Will anycast send more users to the optimal PoP?
➢ Let's test it!
RUM to rescue
For each PoP:
1. Announce the same anycast IP (108.174.13.10)
2. Configure a domain, ac.perf.linkedin.com, to point to 108.174.13.10
RUM to rescue
For each page view:
1. RUM downloads a tiny object: ac.perf.linkedin.com/pop/admin
2. Read the X-Li-Pop response header to record which PoP served the object
3. Send this back to LinkedIn with the RUM data
Data:
1. For each user, the anycast PoP
2. For each user, the optimal PoP (from PoP beacons)
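Given those two data points per user, the comparison metric is the share of users in each region whose anycast PoP matches their optimal PoP. A minimal sketch of that aggregation follows. The records, region names, and PoP identifiers are hypothetical, and the actual analysis pipeline is not described in the slides.

```python
from collections import defaultdict

# Hypothetical RUM records: (region, anycast_pop, optimal_pop).
RECORDS = [
    ("Illinois", "us-east", "us-east"),
    ("Illinois", "us-east", "us-east"),
    ("Illinois", "us-west", "us-east"),
    ("Illinois", "us-east", "us-east"),
    ("Arizona", "us-east", "us-west"),
    ("Arizona", "us-west", "us-west"),
]


def percent_optimal(records):
    """Return {region: % of records where the serving PoP was the optimal one}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for region, served_pop, optimal_pop in records:
        totals[region] += 1
        hits[region] += served_pop == optimal_pop  # True counts as 1
    return {region: 100.0 * hits[region] / totals[region] for region in totals}


print(percent_optimal(RECORDS))  # {'Illinois': 75.0, 'Arizona': 50.0}
```

Running the same function over records where the serving PoP comes from DNS assignment gives the other column of the comparison.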
Results
Region or Country | DNS % Optimal Assignment | Anycast % Optimal Assignment
Illinois | 70 | 90
Florida | 73 | 95
Georgia | 75 | 93
Pennsylvania | 85 | 95
Results
Region or Country | DNS % Optimal Assignment | Anycast % Optimal Assignment
Arizona | 60 | 39
Brazil | 88 | 33
New York | 77 | 74
Fewer hops != Lower Latency
• Carriers prefer to haul packets within their own network
• Peering can create inter-continental shortcuts
[Diagram: Alice and nodes X, Y, Z; two sites labeled 1.1; an inter-continental link]
Maybe DNS wasn’t so bad
• Continent-level assignments
• City / State-level assignments
“Regional” Anycast
• DNS-based, 1 anycast IP per continent
• Ran a RUM experiment, all was fine
[Diagram: Alice and nodes X, Y, Z; address labels 2.2, 1.1, 1.1; an inter-continental link]
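The regional scheme splits the decision in two: DNS picks the continent, and anycast routing picks the PoP within it. A rough sketch of the DNS half is below. The continent codes, the country table, the fallback continent, and the IP addresses (taken from documentation-only ranges) are all hypothetical.

```python
# Hypothetical anycast IPs (documentation ranges), one per continent.
CONTINENT_ANYCAST_IP = {
    "NA": "192.0.2.10",
    "EU": "198.51.100.10",
    "AS": "203.0.113.10",
}
# Hypothetical country-to-continent table used by the DNS layer.
COUNTRY_CONTINENT = {"US": "NA", "ES": "EU", "IN": "AS"}
DEFAULT_CONTINENT = "NA"  # assumed fallback for unknown countries


def resolve(resolver_country):
    """Return the continent-wide anycast IP for a resolver's country."""
    continent = COUNTRY_CONTINENT.get(resolver_country, DEFAULT_CONTINENT)
    return CONTINENT_ANYCAST_IP[continent]


print(resolve("ES"))  # 198.51.100.10
```

A coarse geo-IP error, such as the wrong state, no longer matters under this split, since DNS only has to get the continent right and routing handles the rest.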
USA Ramp Results
[Chart: % traffic going to optimal PoP (y-axis, 50 to 100) by date (x-axis, 20141206 to 20141218); series: Illinois, Florida, North Carolina, Indiana, NY, NJ, VA, WV, LA]
Ramp outside USA: in progress
Story so far
1. We built PoPs
2. …used RUM to assign users to optimal PoPs
3. …found DNS-based assignment is suboptimal
4. …evaluated Anycast as a solution using RUM
5. …now using Anycast to assign users to PoPs
Next play:
• Build more PoPs!
Story: The End
Learnings
• Clients are your measurement agents
• Trust, but verify
• You can have a bigger impact if you collaborate
Next Play
• Keep evaluating Anycast
• Keep building new PoPs
© 2014 LinkedIn Corporation. All Rights Reserved.