How LinkedIn used TCP Anycast to make the site faster
Ritesh Maheshwari, Shawn Zandi

Anycast
• Anycast provides a distributed service via routing.
• It is not really different from unicast.
• NLRI object with multiple next-hops.
• It simply works for both TCP and UDP applications (use with caution!).

[Diagram: Bob and three sites, each announcing the same address for www.linkedin.com]
• NYC: www.linkedin.com, 2001:db8::1/56
• CHI: www.linkedin.com, 2001:db8::1/56
• SF: www.linkedin.com, 2001:db8::1/56

Anycast with ECMP
• Not a real issue in today’s internet.
• Consistent flow routing is required (per-packet load balancing breaks anycast), and it is pretty much standard.
• Most BGP implementations do not load balance across different AS-PATHs, even when the paths have the same length.
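The consistent flow routing requirement can be illustrated with a minimal sketch. It assumes a router that picks among equal-cost next hops by hashing the flow's 5-tuple; SHA-256 is only a stand-in for whatever vendor-specific hash real hardware uses, and the router names and addresses are hypothetical.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Choose a next hop from a hash of the flow's 5-tuple.

    Illustrative only: real routers use their own hash functions.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical equal-cost next hops and one hypothetical TCP flow:
# (source IP, destination IP, protocol, source port, destination port).
next_hops = ["router-a", "router-b", "router-c"]
flow = ("2001:db8:1::7", "2001:db8::1", "tcp", 51514, 443)

# Every packet of the same flow hashes to the same next hop, so the
# TCP connection keeps landing on the same anycast site.
choices = {pick_next_hop(flow, next_hops) for _ in range(1000)}
print(choices)
```

Per-packet load balancing would instead rotate through `next_hops` for each packet, so segments of one TCP connection could reach different sites that hold no state for it.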

Anycast Complications
• Broken MTU challenges: the ICMP message may not reach the intended receiver to report an MTU problem. Adjusting MSS can help.
• RPF checks
• Multiple covering prefixes: only one service address should be covered by each advertised prefix (/24 or /56).
• Monitoring!
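As a rough illustration of the MSS adjustment, the sketch below derives an MSS from a path MTU by subtracting the minimum IP and TCP header sizes (no options). The 1400-byte path MTU is a hypothetical value, not one from the talk.

```python
IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options
IPV6_HEADER_BYTES = 40  # fixed IPv6 header
TCP_HEADER_BYTES = 20   # minimum TCP header, no options


def mss_for_mtu(path_mtu, ipv6=False):
    """Largest TCP payload that fits in one packet of size path_mtu."""
    ip_header = IPV6_HEADER_BYTES if ipv6 else IPV4_HEADER_BYTES
    return path_mtu - ip_header - TCP_HEADER_BYTES


print(mss_for_mtu(1500))             # 1460
print(mss_for_mtu(1500, ipv6=True))  # 1440
print(mss_for_mtu(1400))             # 1360 (hypothetical constrained path)
```

Advertising the smaller MSS up front keeps segments under the path MTU, so the connection does not depend on an ICMP error finding its way back to the right anycast node.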

But! How to measure Anycast effectiveness?

What is RUM?
Real User Monitoring: JavaScript (client-side code) to measure performance
• DNS time
• Connection time
• First byte time
• Download time
• Page load time

What are PoPs?
Point of Presence / PoP
• Small-scale data centers
• Proxy servers at LinkedIn (ATS)

Without PoPs
Browser to data center: 250 ms
• Connection time: 250 ms
• Server compute time: 500 ms
• First byte time + page download time: 3-5 round trips; 5 RTTs = 5 x 250 ms = 1250 ms
• Total = 2000 ms

With PoPs
Browser to PoP: 100 ms; PoP to data center: 250 ms, over an old (already open) TCP connection
• Connection time: 100 ms
• Server compute time: 500 ms
• PoP to data center: one round trip
• First byte time + page download time: 5 RTTs = 5 x 100 ms = 500 ms
• Total = 1100 ms
900 ms gain!
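The arithmetic on the two latency slides can be reproduced with a small sketch. The function name and the simplified model are assumptions: one round trip for the TCP handshake, five round trips for first byte plus download, plus server compute time. Like the slide total, it leaves out the PoP to data center round trip.

```python
def page_load_ms(client_rtt_ms, server_compute_ms, download_round_trips=5):
    """Simplified page load time: handshake + server compute + download."""
    connection = client_rtt_ms                      # one RTT for the TCP handshake
    download = download_round_trips * client_rtt_ms  # first byte + page download
    return connection + server_compute_ms + download


without_pops = page_load_ms(client_rtt_ms=250, server_compute_ms=500)
with_pops = page_load_ms(client_rtt_ms=100, server_compute_ms=500)

print(without_pops)              # 2000
print(with_pops)                 # 1100
print(without_pops - with_pops)  # 900
```

The gain comes entirely from the shorter browser round trip: six round trips at 100 ms instead of 250 ms saves 6 x 150 ms = 900 ms.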

How are users assigned to PoPs?
Through DNS: the IP is handed out based on the country of the user’s resolver.
# California
$ dig +short www.linkedin.com
216.52.242.80
# Spain
$ dig @109.69.8.51 +short www.linkedin.com
91.225.248.80

Should India connect to Singapore or Dublin? How do we ensure optimal PoP assignment?

RUM beacons
Fetch a tiny object from each candidate PoP. For each pop_name:
1. Start timer
2. Fetch {pop_name}.perf.linkedin.com/pop/admin
3. Stop timer
Send the data back to our servers.
• Millions of agents!
• Analyze data to find the “optimal” PoP per country
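The analysis step might look roughly like the sketch below. It assumes beacon results arrive as (country, pop_name, elapsed_ms) tuples and takes "optimal" to mean the PoP with the lowest median beacon time; the function name and the sample values are made up for illustration and are not LinkedIn's pipeline or data.

```python
from collections import defaultdict
from statistics import median


def optimal_pop_per_country(beacons):
    """Return {country: pop_name} using the lowest median beacon time.

    beacons: iterable of (country, pop_name, elapsed_ms) tuples.
    """
    samples = defaultdict(list)
    for country, pop_name, elapsed_ms in beacons:
        samples[(country, pop_name)].append(elapsed_ms)

    medians = defaultdict(dict)
    for (country, pop_name), values in samples.items():
        medians[country][pop_name] = median(values)

    return {
        country: min(per_pop, key=per_pop.get)
        for country, per_pop in medians.items()
    }


# Made-up beacon samples, for illustration only.
beacons = [
    ("IN", "singapore", 880), ("IN", "singapore", 910), ("IN", "singapore", 1500),
    ("IN", "dublin", 1000), ("IN", "dublin", 1050), ("IN", "dublin", 1100),
]

best = optimal_pop_per_country(beacons)
print(best)  # {'IN': 'singapore'}
```

Using the median rather than the mean keeps a single slow beacon (the 1500 ms sample here) from deciding the assignment.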

We can assign countries to new PoPs!
Country: China, India
PoP: Hong Kong, Dublin, Singapore
Median Beacon Time (ms): 434, 1216, 515, 1368, 1042, 898

We can audit the current assignment!
Country: India, Pakistan, Spain, Brazil, Netherlands, UAE, Italy
Is PoP optimal?: TRUE, FALSE, TRUE
Current PoP: Singapore, Dublin, US West Coast, Dublin
Optimal PoP: Singapore, Dublin, US East Coast, Dublin
Mexico: TRUE, US West Coast
Russia: FALSE, US West Coast, Dublin

LinkedIn Homepage Download Time Improvement
[Chart: percentage improvement (0% to 30%) for India, Pakistan, Singapore, Russia, and Brazil; series: median improvement and 90th percentile improvement]

Plot Twist: Assignment far from optimal
• About 31% of US traffic gets assigned to a suboptimal PoP.
  – 45% of East Coast
• About 10% of traffic globally gets assigned to a suboptimal PoP.

DNS PoP assignment is suboptimal
• Assignment is based on resolver IP, not client IP
  [Diagram: California, DNS resolver, PoP US West, New York, PoP US East]
• Bad IP-to-geo databases
  – Resolver is really in NY, but the database says CA

Story so far
1. We built PoPs
2. …used RUM to assign users to optimal PoPs
3. …found DNS-based assignment is suboptimal

Accurate PoP assignment problem
• Bug our DNS providers (31% -> 27%)
• Run our own DNS
How about Anycast?

Anycast – One IP, Multiple Servers
[Diagram: Bob and PoP A, PoP B, PoP C, each announcing 1.1]
✓ Client IP, not resolver IP, is used!
✓ No Geo-IP databases

How does Anycast compare to DNS?
Will anycast send more users to the optimal PoP?
➢ Let’s test it!

RUM to rescue
For each PoP:
1. Announce the same anycast IP (108.174.13.10)
2. Configure a domain ac.perf.linkedin.com to point to 108.174.13.10

RUM to rescue
For each page view:
1. RUM downloads a tiny object: ac.perf.linkedin.com/pop/admin
2. Read the X-Li-Pop response header to record which PoP served the object
3. Send this back to LinkedIn with RUM data
Data:
1. For each user, the anycast PoP
2. For each user, the optimal PoP (from PoP beacons)
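Given those two data points per user, a percent-optimal figure per region could be computed along the lines of this sketch. The record layout, function name, PoP labels, and sample rows are all assumptions for illustration; the same function works for DNS assignment if the DNS-assigned PoP is recorded instead.

```python
from collections import Counter


def percent_optimal(records):
    """Return {region: percentage of users whose assigned PoP is optimal}.

    records: iterable of (region, assigned_pop, optimal_pop) tuples.
    """
    total = Counter()
    hits = Counter()
    for region, assigned_pop, optimal_pop in records:
        total[region] += 1
        if assigned_pop == optimal_pop:
            hits[region] += 1
    return {region: 100.0 * hits[region] / total[region] for region in total}


# Made-up records, for illustration only.
records = [
    ("Illinois", "us-east", "us-east"),
    ("Illinois", "us-west", "us-east"),
    ("Illinois", "us-east", "us-east"),
    ("Illinois", "us-east", "us-east"),
    ("Arizona", "us-east", "us-west"),
    ("Arizona", "us-west", "us-west"),
]

result = percent_optimal(records)
print(result)  # {'Illinois': 75.0, 'Arizona': 50.0}
```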

Results
Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
Illinois            70                         90
Florida             73                         95
Georgia             75                         93
Pennsylvania        85                         95

Results
Region or Country   DNS % Optimal Assignment   Anycast % Optimal Assignment
Arizona             60                         39
Brazil              88                         33
New York            77                         74

Fewer hops != Lower Latency
• Carriers prefer to haul packets within their own network
• Peering can create inter-continental shortcuts
[Diagram: Alice and PoPs X, Y, Z; anycast IP 1.1; an inter-continental link]

Maybe DNS wasn’t so bad
• Continent-level assignments
• City / State level assignments

“Regional” Anycast
• DNS-based, 1 anycast IP per continent
• Ran a RUM experiment, all was fine
[Diagram: Alice and PoPs X (2.2), Y (1.1), Z (1.1); an inter-continental link]

USA Ramp Results
[Chart: % traffic going to optimal PoP (50 to 100) by date, 2014-12-06 to 2014-12-18; series: Illinois, Florida, North Carolina, Indiana, NY, NJ, VA, WV, LA]
Ramp outside USA: in progress

Story so far
1. We built PoPs
2. …used RUM to assign users to optimal PoPs
3. …found DNS-based assignment is suboptimal
4. …evaluated Anycast as a solution using RUM
5. …now using Anycast to assign users to PoPs
Next play:
• Build more PoPs!

Story: The End
Learnings
• Clients are your measurement agents
• Trust, but verify
• You can have a bigger impact if you collaborate
Next Play
• Keep evaluating Anycast
• Keep building new PoPs

© 2014 LinkedIn Corporation. All Rights Reserved.