
Peer-to-Peer Distributed Search

Peer-to-Peer Networks
A pure peer-to-peer network is a collection of nodes or peers that:
1. Are autonomous: participants do not respect any central control and can join or leave the network at will.
2. Are loosely coupled: they communicate over a general-purpose network such as the Internet, rather than being hard-wired together like the processors in a parallel machine.
3. Are equal in functionality: there is no leader or controlling node.
4. Share resources with one another.
Examples: Napster, Kazaa, BitTorrent, …

Search
• Look up records in a (very large) set of key-value pairs.
  – Associated with each key K is a value V.
  – E.g., K might be the identifier of a document, and V could be the document itself.
• If the size of the key-value data is small, we could use a central node that holds the entire key-value table.
  – All nodes would query the central node when they wanted the value V associated with a given key K.

What if the table is too large?
Solution: Distribute the responsibility
• At any time, only one node among the peers knows the value associated with any given key K.
• Any node can ask the peers for the value V associated with a chosen key K.
Desire
• The value V should be obtained using few messages.

Chord Circles - Placement
• To place a node in the circle, we hash its ID i, and place it at position h(i).
• Key-value pairs are also distributed around the circle using hash function h.
• For a pair (K, V), compute h(K) and place (K, V) at the lowest-numbered node Nj such that h(K) ≤ j. If there is no such node, wrap around to the lowest-numbered node on the circle.
• In the figure:
  – Any (K, V) pair such that 42 < h(K) ≤ 48 would be stored at N48.
  – If h(K) is any of 57, 58, . . . , 63, 0, 1, then (K, V) would be placed at N1.
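The placement rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: positions are taken to be already-hashed values on a circle of 64 positions (consistent with the 57, …, 63, 0, 1 example), and NODES lists only the nodes named in this text, so the figure may contain more. The name responsible_node is hypothetical.

```python
from bisect import bisect_left

# Assumption: only the node positions named in the text, kept sorted.
# The slide's figure may contain additional nodes.
NODES = [1, 8, 14, 42, 48, 51, 56]

def responsible_node(pos, nodes=NODES):
    """Return the lowest-numbered node j with pos <= j.

    If pos is larger than every node, wrap around to the first node.
    """
    idx = bisect_left(nodes, pos)      # first index whose node is >= pos
    return nodes[idx % len(nodes)]     # idx == len(nodes) wraps to nodes[0]

print(responsible_node(45))  # 48, since 42 < 45 <= 48
print(responsible_node(60))  # 1, wraps past N56
print(responsible_node(0))   # 1
```

The modulo on the index is what implements the wrap from the highest node back to N1.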

(Inefficient) Search
Assumption
• Each node knows its successor in the circle.
Search
• For instance, if N8 wants to find V for key K such that h(K) = 54, it can send the request forward around the circle until a node Nj is found such that j ≥ 54.
  – It would be node N56.
• Very inefficient: the number of messages grows linearly with the number of nodes.
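To make the cost concrete, here is a minimal sketch of the successor-only search that also counts messages. It assumes the same reduced node set as the text (only the nodes it names) and a hypothetical SUCCESSOR map standing in for each node's successor link.

```python
# Assumption: only the node positions named in the text.
NODES = [1, 8, 14, 42, 48, 51, 56]
# Each node knows only its successor (the last node wraps to the first).
SUCCESSOR = {n: NODES[(i + 1) % len(NODES)] for i, n in enumerate(NODES)}

def between(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b             # interval wraps past position 0

def linear_search(start, pos):
    """Forward the request successor by successor.

    Returns (node holding pos, number of messages sent).
    """
    current, messages = start, 0
    while not between(pos, current, SUCCESSOR[current]):
        current = SUCCESSOR[current]   # pass the request one step clockwise
        messages += 1
    return SUCCESSOR[current], messages + 1   # final hop reaches the owner

print(linear_search(8, 54))  # (56, 5)
```

With this node set the request visits every node between N8 and N56, which is the behavior the finger tables are meant to avoid.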

Links in Chord Circles
• To speed up the search, each node has a finger table.
  – It gives the first nodes found at distances around the circle that are a power of two.
• Suppose that the hash function h produces m-bit numbers.
  – Node Ni has entries in its finger table for distances 1, 2, 4, 8, . . . , 2^(m-1).
  – The entry for 2^j is the first node we meet after going distance 2^j clockwise around the circle.
Example: Finger table for N8 is
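A finger table can be computed mechanically from the definition. The sketch below assumes m = 6 and, again, only the nodes named in this text; because the figure may contain other nodes, the entries it produces for distances 8 and 16 may differ from the table shown on the slide. The function names are hypothetical.

```python
from bisect import bisect_left

M = 6                                  # assumption: h produces 6-bit positions
NODES = [1, 8, 14, 42, 48, 51, 56]     # assumption: only nodes named in the text

def node_at_or_after(pos, nodes=NODES):
    """First node met at or after pos, going clockwise."""
    return nodes[bisect_left(nodes, pos) % len(nodes)]

def finger_table(i, m=M, nodes=NODES):
    """Map each distance 2**j (j = 0..m-1) to the first node at or
    after position i + 2**j, taken modulo the circle size 2**m."""
    return {2**j: node_at_or_after((i + 2**j) % 2**m, nodes) for j in range(m)}

# For the reduced node set assumed above:
print(finger_table(8))   # {1: 14, 2: 14, 4: 14, 8: 42, 16: 42, 32: 42}
```

Each node stores only m entries, regardless of how many nodes are on the circle.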

Search Using Finger Tables
• Suppose Ni wants to find (K, V) where h(K) = j.
• If (K, V) exists, it will be at the lowest-numbered node that is at least j (wrapping around the circle if no node is numbered j or higher).
Algorithm Idea
• Let Nk be the successor of Ni.
• Check if i < j ≤ k, taken in the circular sense. If yes, (K, V) must be at Nk if it exists. So, end the search and ask Nk to send (K, V).
• Otherwise, consult the finger table to find the highest-numbered node Nh that is less than j (again in the circular sense).
  – Send Nh a message asking it to search for (K, V).
  – Nh behaves the same.
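The test i < j ≤ k needs care when the interval wraps past position 0, for example between N56 and N1. One way to handle this, sketched here with a hypothetical helper and an assumed circle of 2^6 positions, is to compare clockwise distances from i rather than the raw numbers.

```python
def in_half_open(j, i, k, m=6):
    """True if j lies in the circular interval (i, k] on a circle
    with 2**m positions (m = 6 is an assumption for illustration)."""
    size = 2**m
    dist_j = (j - i) % size    # clockwise distance from i to j
    dist_k = (k - i) % size    # clockwise distance from i to k
    if dist_k == 0:            # i == k: a single node owns the whole circle
        return True
    return 0 < dist_j <= dist_k

print(in_half_open(54, 51, 56))  # True: 51 < 54 <= 56
print(in_half_open(54, 8, 14))   # False
print(in_half_open(0, 56, 1))    # True: the interval wraps past 63
```

Measuring distances from i turns the wrapped interval into an ordinary range, so no separate branch for the wraparound case is needed.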

Search Using Finger Tables: Example
• Suppose N8 wants to find (K, V), where h(K) = 54.
• Since the successor of N8 is N14, and 54 ∉ {9, 10, …, 14}, (K, V) is not at N14.
• N8 examines its finger table, and finds that all the entries are below 54.
• Thus it takes the largest, N42, and sends a message to N42 asking it to look for key K and have the result sent to N8.
• N42 finds that 54 ∉ {43, 44, …, 48}, the range between N42 and its successor N48.
• Thus, N42 examines its own finger table, which is:

Search Using Finger Tables: Example
• The last node (in the circular sense) that is less than 54 is N51, so N42 sends a message to N51, asking it to search for (K, V) on behalf of N8.
• N51 finds that 54 is no greater than its successor, N56. Thus, if (K, V) exists, it is at N56.
• N51 sends a request to N56, which replies to N8.
The sequence of messages is shown in the figure.
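The walk from N8 to N56 can be reproduced with a short simulation. This is a sketch under assumptions, not the protocol itself: m = 6, the node set contains only the nodes named in this text, finger tables are recomputed on demand instead of being stored at each node, and the function names are hypothetical.

```python
from bisect import bisect_left

M = 6                                  # assumption: 6-bit positions
SIZE = 2**M
NODES = [1, 8, 14, 42, 48, 51, 56]     # assumption: only nodes named in the text

def node_at_or_after(pos):
    return NODES[bisect_left(NODES, pos) % len(NODES)]

def fingers(i):
    """Finger entries of node i for distances 1, 2, 4, ..., 2**(M-1)."""
    return [node_at_or_after((i + 2**j) % SIZE) for j in range(M)]

def lookup(start, j):
    """Return the nodes visited, ending with the node that holds position j."""
    if j == start:
        return [start]                 # the asking node itself holds position j
    path, i = [start], start
    while True:
        k = fingers(i)[0]              # distance-1 finger is the successor
        dj, dk = (j - i) % SIZE, (k - i) % SIZE
        if dk == 0 or 0 < dj <= dk:    # i < j <= k in the circular sense
            path.append(k)
            return path
        # Otherwise jump to the farthest finger that still precedes j.
        i = max((f for f in fingers(i) if 0 < (f - i) % SIZE < dj),
                key=lambda f: (f - i) % SIZE)
        path.append(i)

print(lookup(8, 54))   # [8, 42, 51, 56]
```

The path matches the example: N8 forwards to N42, N42 to N51, and N51 to N56, which then replies directly to N8.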

Adding New Nodes
• A new node Ni (i.e., a node whose ID hashes to i) wants to join.
• If Ni doesn't know any peer, it is not possible for it to join.
• However, if Ni knows even one peer, Ni can ask that peer what node would be Ni's successor around the circle.
  – To answer, the known peer performs the search algorithm as if it were looking for a key that hashed to i.
  – The node at which this hypothetical key would reside is the successor of Ni.
Suppose that the successor of Ni is Nj. We need to do two things:
1. Change predecessor and successor links, so Ni is properly linked into the circle.
2. Rearrange data so Ni gets all the data at Nj that belongs to Ni.
To avoid concurrency problems, we follow a procedure we will not cover here.
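The two steps can be illustrated with a single-threaded sketch. It deliberately ignores the concurrency procedure mentioned above, locates the successor by walking successor links rather than using finger tables, assumes a circle of 64 positions with distinct node positions, and uses hypothetical names (Node, find_successor, join) throughout.

```python
SIZE = 64                              # assumption: 6-bit positions

class Node:
    def __init__(self, pos):
        self.pos = pos
        self.succ = self               # a lone node is its own successor
        self.pred = self
        self.data = {}                 # hashed key position -> value

def dist(a, b):
    """Clockwise distance from position a to position b."""
    return (b - a) % SIZE

def find_successor(known, pos):
    """Starting at a known peer, find the node that would hold pos."""
    node = known
    while not (node.succ is node or
               0 < dist(node.pos, pos) <= dist(node.pos, node.succ.pos)):
        node = node.succ
    return node.succ

def join(new, known):
    succ = find_successor(known, new.pos)
    pred = succ.pred
    # Step 1: link the new node between pred and succ.
    new.pred, new.succ = pred, succ
    pred.succ = new
    succ.pred = new
    # Step 2: move the pairs in (pred, new] from succ to the new node.
    moving = [k for k in succ.data
              if 0 < dist(pred.pos, k) <= dist(pred.pos, new.pos)]
    for k in moving:
        new.data[k] = succ.data.pop(k)

# Build a small circle, store three pairs at N56, then add N48.
n1 = Node(1)
ring = {1: n1}
for p in (8, 14, 42, 56):
    ring[p] = Node(p)
    join(ring[p], n1)
ring[56].data = {43: "a", 45: "b", 54: "c"}
n48 = Node(48)
join(n48, n1)
print(sorted(n48.data), sorted(ring[56].data))  # [43, 45] [54]
```

After the join, N48 holds the pairs whose positions fall in (42, 48], and N56 keeps only the pair at position 54.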