Scalable Content Addressable Networks
Prepared by Kuhan Paramsothy
March 5, 2007

High-Level Overview
- Hash tables (which map keys to values) are heavily used in building software applications.
- A Content-Addressable Network (CAN) provides hash-table-like functionality at Internet scale.
- A CAN is:
  - Scalable
  - Robust and fault-tolerant
  - Self-organizing
  - Low-latency
ECE 1770 – Content-Addressable Networks

Hash Tables and CAN
- A hash table is a data structure that efficiently maps keys onto values.
- CANs are a form of distributed, Internet-scale hash table.

What CAN would do for us
- CAN would improve peer-to-peer systems
  - Napster: the process of locating a file is centralized
    - The central repository is expensive to scale and is a single point of failure
  - Gnutella: decentralizes the file location process (the network self-organizes into an application-layer mesh)
    - Requests for files are flooded, which does not scale and may fail to find content
  - Conclusion: P2P systems need a scalable indexing mechanism
- CAN would improve large data repositories
  - These systems need efficient insertion and retrieval
- CAN would enable large-scale name resolution services that are not tied to a naming scheme (i.e., unlike DNS)
  - No more location-dependent naming schemes

Basic Operations Performed On CANs
- Basic operations
  - Insertion (of key, value pairs)
  - Lookup (of key, value pairs)
  - Deletion (of key, value pairs)
- Each CAN node stores
  1. A piece (called a zone) of the entire hash table
  2. Information about a small number of adjacent zones in the table
- Routing in a CAN
  - A request for a key is forwarded by intermediate CAN nodes towards the CAN node whose zone contains that key
- The CAN design is
  - Distributed (requires no centralized control or coordination)
  - Scalable (nodes hold only a small amount of state, which does not grow with the network)
  - Fault-tolerant (nodes can route around failures)
  - Independent of any naming hierarchy
  - Implemented entirely at the application layer

CAN Design
- Centers around a virtual d-dimensional Cartesian coordinate space on a d-torus
- At any time, the entire coordinate space is dynamically partitioned among all the nodes in the system
  - Each node owns a distinct zone

CAN Design (2)
1. To store a (key, value) pair, key K1 is mapped to a point P in the coordinate space via a uniform hash function
2. The pair is then stored at the node that owns the zone in which P lies
3. To retrieve the entry for K1, any node can apply the same hash function to map K1 to P, then fetch the value from the node that owns P
4. A node learns and maintains the IP addresses of the nodes that hold adjoining coordinate zones
Efficient routing is critical to a useful CAN
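The store and retrieve steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `key_to_point`, `Zone`, `owner`, `put` and `get` are hypothetical, SHA-256 stands in for the unspecified uniform hash function, and the owner is found by scanning a global list of zones, whereas a real CAN would route to it.

```python
import hashlib

def key_to_point(key, d=2):
    """Deterministically map a key to a point in [0, 1)^d (assumed hash: SHA-256)."""
    coords = []
    for axis in range(d):
        # Hash the key together with the axis index to get one coordinate per axis.
        digest = hashlib.sha256(f"{axis}:{key}".encode("utf-8")).digest()
        # 48 bits fit exactly in a float, so the result is always below 1.0.
        coords.append(int.from_bytes(digest[:6], "big") / 2**48)
    return tuple(coords)

class Zone:
    """A half-open box [lo, hi) of the coordinate space, owned by one node."""
    def __init__(self, lo, hi):
        self.lo, self.hi = tuple(lo), tuple(hi)
        self.store = {}  # the (key, value) pairs whose points fall in this zone

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

def owner(zones, point):
    """Find the zone containing the point (global scan: a simplification)."""
    return next(z for z in zones if z.contains(point))

def put(zones, key, value):
    owner(zones, key_to_point(key)).store[key] = value

def get(zones, key):
    return owner(zones, key_to_point(key)).store.get(key)

if __name__ == "__main__":
    # Four equal zones covering the unit square.
    zones = [Zone((0, 0), (0.5, 0.5)), Zone((0.5, 0), (1, 0.5)),
             Zone((0, 0.5), (0.5, 1)), Zone((0.5, 0.5), (1, 1))]
    put(zones, "song.mp3", "10.0.0.7")
    print(get(zones, "song.mp3"))
```

Because every node applies the same hash function, any node can compute P for a key without asking anyone else; only the final fetch needs the network.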

Routing in a CAN
- Routing in a CAN works by following the straight-line path through the Cartesian space from the source to the destination coordinates.
- A CAN node maintains a coordinate routing table that holds the IP address and virtual coordinate zone of each of its immediate neighbors in the coordinate space.
- For d dimensions and n nodes:
  - Individual nodes have 2d neighbors
  - Average path length = (d/4) n^(1/d) hops
  - Average path length grows as O(n^(1/d))
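The neighbor count and average path length can be checked on a small idealized model. The sketch below assumes the n nodes form a regular k x ... x k grid of equal zones on the d-torus (so n = k^d), and that each hop goes to whichever neighbor is closest to the destination. The function names are hypothetical and the model is a simplification, not the full routing algorithm.

```python
from itertools import product

def ring_dist(a, b, k):
    """Distance between two positions on a ring of k cells (with wrap-around)."""
    diff = abs(a - b) % k
    return min(diff, k - diff)

def torus_dist(p, q, k):
    """Hop distance on the d-torus: sum of the per-axis ring distances."""
    return sum(ring_dist(a, b, k) for a, b in zip(p, q))

def neighbors(p, k):
    """The 2d grid cells that differ from p by one step along one axis."""
    out = []
    for axis in range(len(p)):
        for step in (-1, 1):
            q = list(p)
            q[axis] = (q[axis] + step) % k
            out.append(tuple(q))
    return out

def route(src, dst, k):
    """Greedy routing: always forward to the neighbor closest to dst."""
    path = [src]
    while path[-1] != dst:
        path.append(min(neighbors(path[-1], k),
                        key=lambda q: torus_dist(q, dst, k)))
    return path

def average_hops(k, d):
    """Average hop count over all ordered (source, destination) pairs."""
    nodes = list(product(range(k), repeat=d))
    total = sum(len(route(s, t, k)) - 1 for s in nodes for t in nodes)
    return total / len(nodes) ** 2

if __name__ == "__main__":
    k, d = 4, 2
    n = k ** d
    print("measured :", average_hops(k, d))
    print("formula  :", (d / 4) * n ** (1 / d))
```

For k = 4 and d = 2 (16 nodes) both lines report 2.0 hops. The match is exact only for even k in this idealized grid; with uneven zones the formula is an estimate.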

Construction of a CAN Overlay
- The entire CAN space is divided amongst the nodes currently in the system
- The incremental construction process takes three steps
  1. The new node finds a node already in the CAN
     - Bootstrapping: CAN bootstrap nodes are associated with a DNS domain name
  2. Using the CAN routing mechanisms, it finds a node whose zone will be split
  3. The neighbors of the split zone are notified so that routing can include the new node
- Node insertion affects only O(number of dimensions) existing nodes
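The zone split in step two can be illustrated as follows. This is a minimal sketch with hypothetical names: it assumes the occupant halves its zone along its longest side (the slides do not say which axis is chosen) and hands over the entries whose points fall in the new half. Bootstrapping and neighbor notification are left out.

```python
class Zone:
    """A half-open box [lo, hi) plus the entries stored in it."""
    def __init__(self, lo, hi):
        self.lo, self.hi = list(lo), list(hi)
        self.items = {}  # key -> (point, value)

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

    def split(self):
        """Halve this zone and return the new half for the joining node."""
        # Assumption of this sketch: split along the longest side
        # (ties go to the lowest axis index).
        axis = max(range(len(self.lo)), key=lambda i: self.hi[i] - self.lo[i])
        mid = (self.lo[axis] + self.hi[axis]) / 2

        new_lo = list(self.lo)
        new_lo[axis] = mid
        new_zone = Zone(new_lo, self.hi)  # upper half goes to the new node
        self.hi[axis] = mid               # the old node keeps the lower half

        # Hand over every entry whose point now lies in the new zone.
        moved = [k for k, (p, _) in self.items.items() if new_zone.contains(p)]
        for key in moved:
            new_zone.items[key] = self.items.pop(key)
        return new_zone

if __name__ == "__main__":
    old = Zone((0, 0), (1, 1))
    old.items["a"] = ((0.25, 0.5), "value-a")
    old.items["b"] = ((0.75, 0.5), "value-b")
    new = old.split()
    print(sorted(old.items), sorted(new.items))
```

Only the split zone and its immediate neighbors change, which is why a join touches a number of nodes that depends on the dimension count rather than on the network size.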

Maintenance of a CAN Overlay
- Graceful node departure: the node explicitly hands over its zone and the associated (key, value) database to one of its neighbors
- Abrupt node disappearance: an immediate takeover algorithm ensures that one of the failed node's neighbors takes over the zone
  - Under normal conditions, a node sends periodic update messages to each of its neighbors, including a list of its neighbors and their zone coordinates
  - Prolonged absence of an update message from a neighbor signals its failure
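Failure detection by missing updates can be sketched as a table of last-seen times. The names below are hypothetical, timestamps are passed in explicitly so the example is deterministic, and the takeover rule (the surviving neighbor with the smallest zone volume wins) is an assumption of this sketch; the slides only say that one neighbor takes over.

```python
class NeighborTable:
    """Tracks when each neighbor last sent an update message."""
    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence treated as failure
        self.last_seen = {}     # neighbor id -> time of last update
        self.zones = {}         # neighbor id -> (lo, hi) zone coordinates

    def record_update(self, neighbor, zone, now):
        self.last_seen[neighbor] = now
        self.zones[neighbor] = zone

    def failed_neighbors(self, now):
        """Neighbors that have been silent for longer than the timeout."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)

def volume(zone):
    """Volume of a (lo, hi) box."""
    lo, hi = zone
    result = 1.0
    for l, h in zip(lo, hi):
        result *= h - l
    return result

def pick_takeover(candidates):
    """Assumed rule: the candidate with the smallest zone takes over.

    candidates maps node id -> (lo, hi); ties are broken by node id.
    """
    return min(candidates, key=lambda n: (volume(candidates[n]), n))

if __name__ == "__main__":
    table = NeighborTable(timeout=30)
    table.record_update("B", ((0.5, 0.0), (1.0, 0.5)), now=0)
    table.record_update("C", ((0.0, 0.5), (0.5, 1.0)), now=0)
    table.record_update("C", ((0.0, 0.5), (0.5, 1.0)), now=25)
    print(table.failed_neighbors(now=40))
```

At time 40 only B has been silent for more than 30 seconds, so only B is reported as failed.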

Design Improvements
- The basic CAN algorithm provides
  - Low per-node state (O(d) for a d-dimensional space)
  - Short path lengths (O(d n^(1/d)) hops for d dimensions and n nodes)
- The problem is that these are application-layer hops, not IP-layer hops
  - The latency of each hop might be substantial

Design Improvements (2)
- Improvement: Multi-dimensioned Coordinate Spaces
  - Increasing the dimensions of the CAN coordinate space reduces the routing path length and path latency, for a small increase in the size of the coordinate routing table
  - Path length scales as O(d n^(1/d))
  - Fault-tolerance improves
- Improvement: Multiple Coordinate Spaces (a.k.a. Multiple Realities)
  - Maintain multiple independent coordinate spaces, with each node in the system being assigned a different zone in each coordinate space (each coordinate space is a reality)
  - Fault-tolerance improves
- Which is better?
  - Increasing the dimensions
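The dimension trade-off can be made concrete with the formulas given on the routing slide: 2d neighbors per node and an average path of (d/4) n^(1/d) hops. The helper below is a hypothetical convenience that simply evaluates those two expressions; it assumes equal-sized zones and says nothing about multiple realities.

```python
def can_costs(n, d):
    """Return (neighbors per node, estimated average path length in hops)."""
    neighbors = 2 * d                  # routing-table size grows linearly in d
    avg_path = (d / 4) * n ** (1 / d)  # path length shrinks as d grows
    return neighbors, avg_path

if __name__ == "__main__":
    n = 2 ** 20  # about one million nodes
    for d in (2, 4, 10):
        neighbors, avg_path = can_costs(n, d)
        print(f"d={d:2d}  neighbors={neighbors:2d}  avg path ~ {avg_path:.1f} hops")
```

For about a million nodes, going from d = 2 to d = 10 cuts the estimate from roughly 512 hops to roughly 10, while the neighbor count grows only from 4 to 20.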

Design Improvements (3)
- Improvement: Better CAN Routing Metrics
  - Have each node measure the network-level round-trip time (RTT) to each of its neighbors
  - Then route messages accordingly
  - Favors lower-latency paths and avoids unnecessarily long hops
- Improvement: Caching and Replication
  - A CAN node can maintain a cache of the data keys it recently accessed
  - A CAN node can replicate a data key at each of its neighboring nodes
  - Both schemes need an associated time-to-live field so that entries eventually expire
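Both ideas on this slide are small enough to sketch. The names are hypothetical, and "route accordingly" is interpreted here as picking the neighbor with the best ratio of progress towards the destination to measured RTT, which is one reasonable reading rather than something the slide specifies. The cache takes explicit timestamps so the example is deterministic.

```python
def best_next_hop(current_dist, neighbors):
    """Pick the neighbor with the highest progress-per-millisecond.

    neighbors is a list of (name, distance_to_destination, rtt_ms).
    Only neighbors that are closer to the destination are considered.
    """
    candidates = [(name, (current_dist - dist) / rtt)
                  for name, dist, rtt in neighbors
                  if dist < current_dist and rtt > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[1])[0]

class TTLCache:
    """A key cache whose entries expire after ttl seconds."""
    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # key -> (value, expiry time)

    def put(self, key, value, now):
        self.entries[key] = (value, now + self.ttl)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if now >= expires:
            del self.entries[key]  # expired: drop it and report a miss
            return None
        return value

if __name__ == "__main__":
    hops = [("near-but-slow", 0.20, 180.0), ("far-but-fast", 0.30, 20.0)]
    print(best_next_hop(0.50, hops))

    cache = TTLCache(ttl=60)
    cache.put("song.mp3", "10.0.0.7", now=0)
    print(cache.get("song.mp3", now=30), cache.get("song.mp3", now=61))
```

In the demo the second neighbor makes less progress per hop but is nine times faster, so it wins; the cached entry is returned at 30 seconds and has expired by 61.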

Related Systems
- Domain Name System
  - CANs are more general than the DNS, because DNS closely ties the naming scheme to the manner in which a name is resolved to an IP address
- Peer-to-Peer
  - A simple example: keys are analogous to a URL
  - Will improve robustness
  - The key difference is that content within the CAN can always be located by any other node, because there is a clear "home" (point) in the CAN for that content and every other node knows what the home is and how to reach it

Discussion
- Security?
  - Better or worse with CAN?
- Any other design improvements?
- Is the communication overhead significant?

References
- A Scalable Content-Addressable Network, Ratnasamy, University of California, Berkeley, http://www.sigcomm.org/sigcomm2001/p13ratnasamy.pdf