15-440 Distributed Systems
Within the Internet
Nov. 9, 2011

Topics
- Domain Name System
  - Finding IP address
- Content Delivery Networks
  - Caching content within the network

Domain Name System (DNS)
- Mapping from host names to IP addresses
Distributed database
- Each site (university, large company, ISP, ...) maintains database with its own entries
- Provides server for others to query
Implemented at application layer
- Runs over UDP (normally) or TCP

DNS Name Hierarchy
(Figure: DNS name tree)
    unnamed root
    Top-level domain names: mil, arpa, edu, gov, com, ae (United Arab Emirates), ..., us (United States), ..., zw (Zimbabwe)
    Second-level domain names: in-addr, mit, cmu, berkeley, amazon
    Third-level domain names: cs, ece, www (72.21.194.1)
    Lower levels: ics, greatwhite (128.2.220.10), www (128.2.217.13)
- Both generic (e.g., ".com") and country (e.g., ".jp") domains
- Top-level names managed by NIC
- Other name zones delegated to different entities

DNS Name Terminology
(Figure: the same DNS name tree as on the previous slide.)
- Node: any point in hierarchy
- Zone: a complete subtree
- Name servers: servers that can determine IP addresses within given zone
  - With help from other servers

Programmer's View of DNS
- Conceptually, programmers can view the DNS database as a collection of millions of host entry structures:

    /* DNS host entry structure */
    struct hostent {
        char   *h_name;       /* official domain name of host */
        char  **h_aliases;    /* null-terminated array of domain names */
        int     h_addrtype;   /* host address type (AF_INET) */
        int     h_length;     /* length of an address, in bytes */
        char  **h_addr_list;  /* null-terminated array of in_addr structs */
    };

- in_addr is a struct consisting of a 4-byte IP address
Functions for retrieving host entries from DNS:
- gethostbyname: query key is a DNS domain name.
- gethostbyaddr: query key is an IP address.

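The same two lookups can be sketched in Python, whose standard socket module returns a (hostname, aliaslist, ipaddrlist) triple that lines up with h_name, h_aliases, and h_addr_list. This is a minimal sketch, not the C interface itself; the helper names pack_ipv4, describe_entry, lookup_by_name, and lookup_by_addr are made up for illustration, and the two lookup functions need a working resolver.

```python
import socket


def pack_ipv4(dotted):
    """Return the 4-byte, network-order form of a dotted-decimal address,
    i.e. the bytes an in_addr struct would hold."""
    return socket.inet_aton(dotted)


def describe_entry(entry):
    """Format a (hostname, aliaslist, ipaddrlist) triple, the shape returned
    by socket.gethostbyname_ex() and socket.gethostbyaddr()."""
    name, aliases, addrs = entry
    lines = ["official name: " + name]
    lines += ["alias: " + a for a in aliases]
    lines += ["address: " + ip for ip in addrs]
    return "\n".join(lines)


def lookup_by_name(domain_name):
    """Query key is a DNS domain name (needs a working resolver)."""
    return socket.gethostbyname_ex(domain_name)


def lookup_by_addr(dotted):
    """Query key is an IP address (needs a working resolver)."""
    return socket.gethostbyaddr(dotted)
```

For example, print(describe_entry(lookup_by_name("greatwhite.ics.cmu.edu"))) would list whatever the resolver reports today, which may differ from the 2011 values in these slides.
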
Properties of DNS Host Entries
- Each host entry is an equivalence class of domain names and IP addresses.
Different kinds of mappings are possible:
- Simple case: 1-1 mapping between domain name and IP addr:
  - greatwhite.ics.cmu.edu maps to 128.2.220.10
- Multiple domain names mapped to the same IP address:
  - eecs.mit.edu and cs.mit.edu both map to 18.62.1.6
- Multiple domain names mapped to multiple IP addresses:
  - aol.com and www.aol.com map to multiple IP addrs.
- Some valid domain names don't map to any IP address:
  - for example: ics.cmu.edu

DNS Name Server Hierarchy
(Figure: name tree annotated with the servers for each level)
    unnamed root: a.root-servers.net ... m.root-servers.net
    edu:          a.edu-servers.net ...
    cmu:          ny-server-03.net.cmu.edu, nsauth1.net.cmu.edu, nsauth2.net.cmu.edu
    cs:           AC-DDNS-1.NET.cs.cmu.edu, AC-DDNS-2.NET.cs.cmu.edu, AC-DDNS-3.NET.cs.cmu.edu
    ics:          greatwhite 128.2.220.10
    pdl:          imperial 128.2.189.40
- At each level of hierarchy, have group of servers that are authorized to handle that region of hierarchy
- At bottom of hierarchy, have authority server for specific name

Nominal Root Name Servers
- 13 total

Physical Root Name Servers
- Several root servers have multiple physical servers
- Packets routed to "nearest" server by "Anycast" protocol

DNS Records
Format: (class, name, value, type, TTL)
Database of Resource Records (RRs)
- Classes: IN = Internet
- Each class defines value associated with type
IN Class Types
- A: Address
  - Name = hostname, Value = IP address
- NS: Name Server
  - Name = domain (e.g., cs.cmu.edu)
  - Value = authoritative name server for this domain
- CNAME: Canonical Name (alias)
  - Name = alias name
  - Value = canonical name
- MX: Mail server
  - Value = mail server hostname

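To make the record format concrete, here is a small sketch that stores a few records as (class, name, value, type, TTL) tuples and follows a CNAME to its A records. The record values are copied from the www.google.com dig output elsewhere in these slides; the names RR, RECORDS, and resolve_addresses are hypothetical, and real resolvers handle many cases this toy version ignores.

```python
from collections import namedtuple

# Field order follows the slide's format: (class, name, value, type, TTL).
RR = namedtuple("RR", ["rclass", "name", "value", "rtype", "ttl"])

# Records copied from the dig output for www.google.com in these slides.
RECORDS = [
    RR("IN", "www.google.com.", "www.l.google.com.", "CNAME", 87775),
    RR("IN", "www.l.google.com.", "72.14.204.104", "A", 81),
    RR("IN", "www.l.google.com.", "72.14.204.105", "A", 81),
]


def resolve_addresses(name, records, max_hops=8):
    """Follow CNAME records from alias to canonical name, then return the
    values of the A records found for the final name."""
    for _ in range(max_hops):
        cnames = [r.value for r in records
                  if r.name == name and r.rtype == "CNAME"]
        if not cnames:
            break
        name = cnames[0]
    return [r.value for r in records if r.name == name and r.rtype == "A"]
```

The max_hops bound is a design choice to keep a malformed CNAME loop from spinning forever.
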
Getting DNS Information with dig

    unix> dig greatwhite.ics.cmu.edu
    ;; ANSWER SECTION:
    greatwhite.ics.cmu.edu.  2966  IN  A   128.2.220.10
    ;; AUTHORITY SECTION:
    cs.cmu.edu.              593   IN  NS  AC-DDNS-3.NET.cs.cmu.edu.
    cs.cmu.edu.              593   IN  NS  AC-DDNS-1.NET.cs.cmu.edu.
    cs.cmu.edu.              593   IN  NS  AC-DDNS-2.NET.cs.cmu.edu.

- Performs the same DNS lookup that gethostbyname would
- Lots of command-line options

Tracing Hierarchy (1)
Dig program
- Use flags to find name server (NS)
- Disable recursion so that it operates one step at a time

    unix> dig +norecurse @a.root-servers.net NS greatwhite.ics.cmu.edu
    ;; ADDITIONAL SECTION:
    a.edu-servers.net.  172800  IN  A     192.5.6.30
    c.edu-servers.net.  172800  IN  A     192.26.92.30
    d.edu-servers.net.  172800  IN  A     192.31.80.30
    f.edu-servers.net.  172800  IN  A     192.35.51.30
    g.edu-servers.net.  172800  IN  A     192.42.93.30
    g.edu-servers.net.  172800  IN  AAAA  2001:503:cc2c::2:36   (IPv6 address)
    l.edu-servers.net.  172800  IN  A     192.41.162.30

- All .edu names handled by set of servers

Tracing Hierarchy (2)
- 3 servers handle CMU names

    unix> dig +norecurse @g.edu-servers.net NS greatwhite.ics.cmu.edu
    ;; AUTHORITY SECTION:
    cmu.edu.  172800  IN  NS  ny-server-03.net.cmu.edu.
    cmu.edu.  172800  IN  NS  nsauth1.net.cmu.edu.
    cmu.edu.  172800  IN  NS  nsauth2.net.cmu.edu.

Tracing Hierarchy (3 & 4)
- 3 servers handle CMU CS names

    unix> dig +norecurse @nsauth1.net.cmu.edu NS greatwhite.ics.cmu.edu
    ;; AUTHORITY SECTION:
    cs.cmu.edu.  600  IN  NS  AC-DDNS-2.NET.cs.cmu.edu.
    cs.cmu.edu.  600  IN  NS  AC-DDNS-1.NET.cs.cmu.edu.
    cs.cmu.edu.  600  IN  NS  AC-DDNS-3.NET.cs.cmu.edu.

- Server within CS is "start of authority" for this name

    unix> dig +norecurse @AC-DDNS-2.NET.cs.cmu.edu NS greatwhite.ics.cmu.edu
    ;; AUTHORITY SECTION:
    cs.cmu.edu.  300  IN  SOA  PLANISPHERE.FAC.cs.cmu.edu.

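The four dig steps amount to walking a delegation table from the root toward the name. The sketch below replays that walk offline. The server names are copied from the traces in these slides, but the table is abbreviated, the function name trace is hypothetical, and the example name greatwhite.ics.cs.cmu.edu is an assumption read off the name-tree figure (the dig commands themselves query greatwhite.ics.cmu.edu).

```python
# Zone -> name servers, copied (abbreviated) from the dig traces in these slides.
DELEGATIONS = {
    ".": ["a.root-servers.net."],
    "edu.": ["a.edu-servers.net.", "g.edu-servers.net."],
    "cmu.edu.": ["ny-server-03.net.cmu.edu.", "nsauth1.net.cmu.edu.",
                 "nsauth2.net.cmu.edu."],
    "cs.cmu.edu.": ["AC-DDNS-1.NET.cs.cmu.edu.", "AC-DDNS-2.NET.cs.cmu.edu.",
                    "AC-DDNS-3.NET.cs.cmu.edu."],
}


def trace(name, delegations):
    """Return the (zone, servers) referrals met while walking from the root
    toward `name`, one label at a time, much as the +norecurse queries do."""
    labels = name.rstrip(".").split(".")
    steps = [(".", delegations["."])]
    # Try suffixes from shortest to longest: edu., cmu.edu., cs.cmu.edu., ...
    for i in range(len(labels) - 1, -1, -1):
        zone = ".".join(labels[i:]) + "."
        if zone in delegations:
            steps.append((zone, delegations[zone]))
    return steps
```

Calling trace("greatwhite.ics.cs.cmu.edu.", DELEGATIONS) yields one entry per level of delegation, ending at the cs.cmu.edu servers that hold the authoritative data.
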
Recursive DNS Name Resolution
(Figure: numbered steps 1-10 among Local Server, Root Server, .edu Server, CMU Server, and CMU CS Server, beside a name tree showing greatwhite (128.2.220.10) under edu / cmu / cs / ics and www (208.216.181.15) under com / someplace.)
Nonlocal lookup
- Recursively from root server downward
- Results passed up
Caching
- Results stored in caches along each hop
- Can short-circuit lookup when cached entry present

Iterative DNS Name Resolution
(Figure: numbered steps 1-10 between the Local Server and, in turn, the Root Server, .edu Server, CMU Server, and CMU CS Server, beside a name tree showing greatwhite (128.2.220.10) under edu / cmu / cs / ics and www (208.216.181.15) under com / someplace.)
Nonlocal lookup
- At each step, server returns name of next server down
- Local server directly queries each successive server
Caching
- Local server builds up cache of intermediate translations
- Helps in resolving names xxx.cs.cmu.edu, yy.cmu.edu, and z.edu

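The caching behavior can be sketched as a table whose entries expire after their TTL. This is a toy model under stated assumptions: time is passed in explicitly as `now` (seconds) so the example is deterministic, and query_upstream is a hypothetical callback standing in for the real queries to the root, .edu, CMU, and CS servers.

```python
class DnsCache:
    """Toy resolver cache: entries expire when their TTL (seconds) runs out."""

    def __init__(self):
        self._entries = {}  # name -> (value, absolute expiry time)

    def put(self, name, value, ttl, now):
        self._entries[name] = (value, now + ttl)

    def get(self, name, now):
        """Return the cached value, or None if absent or expired."""
        hit = self._entries.get(name)
        if hit is None:
            return None
        value, expires = hit
        if now >= expires:
            del self._entries[name]
            return None
        return value


def resolve(name, cache, query_upstream, now):
    """Answer from cache when possible; otherwise ask upstream and cache it.
    `query_upstream(name)` is a hypothetical callback returning (value, ttl)."""
    cached = cache.get(name, now)
    if cached is not None:
        return cached
    value, ttl = query_upstream(name)
    cache.put(name, value, ttl, now)
    return value
```

With the TTL of 2966 seconds from the earlier dig output, a second lookup 100 seconds later is served from the cache, while one 3000 seconds later goes upstream again.
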
Reverse DNS
(Figure: name tree with two paths to the same host: edu / cmu / cs / cmcl / kittyhawk (128.2.194.242) and arpa / in-addr / 128 / 2 / 194 / 242.)
Task
- Given IP address, find its name
Method
- Maintain separate hierarchy based on IP names
- Write 128.2.194.242 as 242.194.2.128.in-addr.arpa (octets reversed, so the most specific label comes first)
Managing
- Authority manages IP addresses assigned to it
- E.g., CMU manages name space 2.128.in-addr.arpa

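Building the reverse-lookup name is a small string transformation: reverse the octets and append in-addr.arpa. The sketch below shows it; the function name reverse_name is made up, and Python's standard ipaddress module is used only to validate the input.

```python
import ipaddress


def reverse_name(dotted):
    """Build the in-addr.arpa name for an IPv4 address by reversing its
    octets, so the most specific label comes first as in ordinary DNS names."""
    addr = ipaddress.IPv4Address(dotted)  # raises ValueError on malformed input
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa"
```

The standard library offers the same result directly as ipaddress.ip_address(dotted).reverse_pointer; the hand-written version is here only to show the octet reversal.
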
.arpa Name Server Hierarchy
(Figure: reverse-lookup tree annotated with the servers for each level)
    in-addr.arpa: a.root-servers.net ... m.root-servers.net
    128:          chia.arin.net (dill, henna, indigo, epazote, figwort, ginseng)
    2:            cucumber.srv.cs.cmu.edu, t-ns1.net.cmu.edu, t-ns2.net.cmu.edu
    194:          mango.srv.cs.cmu.edu (peach, banana, blueberry)
    242:          kittyhawk 128.2.194.242
- At each level of hierarchy, have group of servers that are authorized to handle that region of hierarchy

Performance Issues
Challenge
- There's way too much traffic on the Internet
- Popular sites (Google, Amazon, Facebook, ...) get huge amounts of traffic
  - Could become "hot spot"
- It takes much longer to route packets around world than next door
Opportunities
- Services can be replicated
  - Multiple servers / data center
  - Multiple data centers around world
- Content can be cached
How Can this Work?
- Contrary to original Internet model: IP address designates unique host

Server Balancing
DNS Tricks
- Customize DNS response to location
  - Allows distribution by geography
- Return multiple host names / query
  - Client (could) choose one at random
- Update DNS entries with new servers
  - Rotate loading
Within Data Center
- Keep changing binding between IP address and host

Server Balancing Example
DNS Tricks
- Different responses to different servers, short TTLs

    unix1> dig www.google.com
    ;; ANSWER SECTION:
    www.google.com.    87775   IN  CNAME  www.l.google.com.
    www.l.google.com.  81      IN  A      72.14.204.104
    www.l.google.com.  81      IN  A      72.14.204.105
    www.l.google.com.  81      IN  A      72.14.204.147
    www.l.google.com.  81      IN  A      72.14.204.99
    www.l.google.com.  81      IN  A      72.14.204.103

    unix2> dig www.google.com
    ;; ANSWER SECTION:
    www.google.com.    603997  IN  CNAME  www.l.google.com.
    www.l.google.com.  145     IN  A      72.14.204.99
    www.l.google.com.  145     IN  A      72.14.204.103
    www.l.google.com.  145     IN  A      72.14.204.104
    www.l.google.com.  145     IN  A      72.14.204.105
    www.l.google.com.  145     IN  A      72.14.204.147

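The two answers list the same five addresses in different cyclic order, which is what a round-robin server would produce. The sketch below models that rotation; it is an assumption that rotation is what generated these particular answers, and the function names rotated and pick_first are hypothetical.

```python
from collections import deque


def rotated(addresses, steps):
    """Return the address list rotated left by `steps`, the kind of
    reordering a round-robin DNS server applies between answers."""
    ring = deque(addresses)
    ring.rotate(-steps)
    return list(ring)


def pick_first(addresses):
    """Many clients simply try the first address in the answer."""
    return addresses[0]
```

Rotating the first answer left by three positions reproduces the second answer, so clients that take the first address land on different hosts (72.14.204.104 versus 72.14.204.99).
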
Typical Workload (Web Pages)
Multiple (typically small) objects per page
- Frame, body, ads, logos, ...
File sizes
- Heavy-tailed
  - Pareto distribution for tail
  - Lognormal for body of distribution
Embedded references
- Number of embedded objects also Pareto: $\Pr(X > x) = (x / x_m)^{-k}$
This plays havoc with performance. Why? Solutions?
- Lots of small objects & TCP means
  - 3-way handshake
  - Lots of slow starts
  - Extra connection state

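The Pareto tail formula is easy to evaluate directly, which helps show why the distribution is called heavy-tailed. In this sketch the function name pareto_tail is made up, and any parameter values used with it (for instance x_m = 1 and k = 1) are illustrative rather than measured.

```python
def pareto_tail(x, x_m, k):
    """Pr(X > x) for a Pareto distribution with scale x_m and shape k.
    Every sample is at least x_m, so the probability is 1 at or below x_m."""
    if x <= x_m:
        return 1.0
    return (x / x_m) ** (-k)
```

With k = 1, an object 10 times the minimum size still occurs with probability 0.1 and one 100 times the minimum with probability 0.01: the tail shrinks only polynomially in x.
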
Content Distribution Networks (CDNs)
The content providers are the CDN customers.
Content replication
- CDN company installs hundreds of CDN servers throughout Internet
  - Close to users
- CDN replicates its customers' content in CDN servers. When provider updates content, CDN updates servers
CDNs:
- Akamai
- Major ISPs
(Figure: origin server in North America feeds a CDN distribution node, which replicates content to CDN servers in S. America, Europe, and Asia.)

Serving Through CDN
Requirement
- Route HTTP request to CDN node, rather than to original server
Methods
- CDN provider manipulates DNS tables

    unix1> dig www.nfl.com
    ;; ANSWER SECTION:
    www.nfl.com.                300    IN  CNAME  www.nfl.com.edgesuite.net.
    www.nfl.com.edgesuite.net.  13778  IN  CNAME  a989.g.akamai.net.
    a989.g.akamai.net.          20     IN  A      96.7.40.32
    a989.g.akamai.net.          20     IN  A      96.7.40.33

- Rewrite HTML pages
  - <a href="http://www.nfl.com/images/ben_roethlisberger">
- With
  - <a href="http://a989.g.akamai.net/nfl/images/ben_roethlisberger">

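The HTML-rewriting method can be sketched as a regular-expression substitution over href and src attributes. This is only an illustration of the idea: the function name rewrite_links and its parameters are hypothetical, and a production rewriter would use a real HTML parser rather than a regex.

```python
import re


def rewrite_links(html, origin_host, cdn_host, customer):
    """Point href/src URLs on origin_host at cdn_host, prefixing the path
    with a customer tag so the CDN can tell content providers apart."""
    pattern = re.compile(
        r'((?:href|src)=")http://' + re.escape(origin_host) + r"/"
    )
    replacement = r"\g<1>http://" + cdn_host + "/" + customer + "/"
    return pattern.sub(replacement, html)
```

Called as rewrite_links(page, "www.nfl.com", "a989.g.akamai.net", "nfl"), it turns the first link on the slide into the second and leaves links to other hosts untouched.
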
Caching Content in CDN
Simplistic
- Each CDN server caches content that flows through it
Better
- Create DHT among cluster of servers
- Consistent hashing, the idea Chord grew out of, led to founding of Akamai
Challenges
- Usual ones of staleness / consistency / replication
- Handled by TTLs
Effectiveness
- Can't cache dynamic content
  - Responses to individual queries
  - But, even dynamic pages contain static links
- Great for streaming content
  - If multiple clients viewing same programs ~ simultaneously

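One common building block for spreading cached objects over a cluster is consistent hashing. The slides only say "DHT", so treat the following as an assumed illustration, not a description of any particular CDN; the class name HashRing, the server names, and the replica count are all hypothetical.

```python
import bisect
import hashlib


def _point(key):
    """Map a string to a position on the hash ring."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, replicas=50):
        # Each server gets several points on the ring to even out load.
        self._ring = sorted(
            (_point("%s#%d" % (s, i)), s)
            for s in servers for i in range(replicas)
        )
        self._points = [p for p, _ in self._ring]

    def server_for(self, url):
        """Return the server owning the first ring point at or after
        hash(url), wrapping around past the end of the ring."""
        i = bisect.bisect_left(self._points, _point(url)) % len(self._ring)
        return self._ring[i][1]
```

The design point is that adding a server only takes over the keys that now hash closest to its points; every other key stays on the server that already caches it.
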
Summary
DNS is one of the world's largest distributed systems
- Operation and authority delegated hierarchically
- Huge number of queries / second
Many Ways to Reduce / Balance Traffic
- Contrary to simple unique address / host model
- Time- and location-varying DNS entries
- CDNs