Fault-tolerant Routing in Peer-to-Peer Systems
James Aspnes, Zoë Diamadi, Gauri Shah
Yale University
PODC 2002

P2P network
[Figure labels: Peers, Resources, Key]
• A collection of peers.
• Peers store resources identified by keys.
• Peers are subject to crash failures.
• Goal: locate resources "efficiently".

Properties of ideal network
• Data availability
• Decentralization
• Fault-tolerance
• Scalability
• Load balancing
• Maintaining network
• Dynamic node addition/deletion
• Self-stabilization
• Efficient searching
• Incorporating geography
• Incorporating locality

Early P2P systems
• Napster: central server bottleneck.
• Gnutella: inefficient flooding.

Tapestry [JKZ '01]
Uses Plaxton's algorithm: node xyz links to *XX, x*X and xy* [* = all digits, X = any digit].
[Figure: example nodes 427, 768, 368, 123, 327, 135, 360]
Correct one digit at a time to reach the target.
Pastry [DR '01] is similar.
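The "correct one digit at a time" idea can be sketched in a few lines. This is only an illustration, not Tapestry's actual routing code: it assumes every identifier in the namespace is occupied by a node, so the digits a real node would leave arbitrary (the X positions) are simply copied from the current node. The names `next_hop` and `route` are hypothetical.

```python
def next_hop(current: str, target: str) -> str:
    """Fix the first digit where current and target differ (assumes equal length)."""
    for i, (c, t) in enumerate(zip(current, target)):
        if c != t:
            # Assumption: the node with this identifier exists.
            return current[:i] + t + current[i + 1:]
    return current  # already at the target


def route(source: str, target: str) -> list[str]:
    """Hop list from source to target, one corrected digit per hop."""
    path = [source]
    while path[-1] != target:
        path.append(next_hop(path[-1], target))
    return path


print(route("427", "368"))  # ['427', '327', '367', '368']
```

Under this assumption the number of hops is at most the number of digits in an identifier.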

CAN [RFHKS '01]
Partition d-dimensional coordinate space into zones.
[Figure: unit square with corners (0, 0), (1, 0), (0, 1), (1, 1) for d = 2, divided into numbered zones]
Nodes own zones, and keys are hashed to them.
Greedy routing: forward to the neighbor closest to the target.

Chord [SMKKB '01]
Nodes and resources mapped to identifier circle.
Routing table: successor nodes at power-of-two distances.
[Figure: identifier circle (n = 8) with a successor table at each node]
Greedy routing: forward to the node in the routing table closest to the target.
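A small sketch of the routing table and greedy step on an identifier circle of size 8. The node set {0, 3, 6} is an assumption read off the figure residue, and the functions `successor`, `finger_table` and `route` are hypothetical names; the routing rule here is a simplified greedy variant, not the full Chord lookup protocol.

```python
from bisect import bisect_left


def successor(nodes, ident, size):
    """First node at or clockwise after ident; nodes must be a sorted list."""
    i = bisect_left(nodes, ident % size)
    return nodes[i % len(nodes)]


def finger_table(nodes, node, bits):
    """Successors of node + 2^i for i = 0 .. bits - 1."""
    size = 1 << bits
    return [successor(nodes, node + (1 << i), size) for i in range(bits)]


def route(nodes, start, key, bits):
    """Greedy path from start to the node that owns key."""
    size = 1 << bits
    owner = successor(nodes, key, size)
    path = [start]
    while path[-1] != owner:
        cur = path[-1]
        remaining = (owner - cur) % size
        # Fingers that move clockwise without passing the owner.
        usable = [f for f in finger_table(nodes, cur, bits)
                  if 0 < (f - cur) % size <= remaining]
        path.append(max(usable, key=lambda f: (f - cur) % size))
    return path


nodes = [0, 3, 6]                  # assumed node set on a circle of size 8
print(finger_table(nodes, 0, 3))   # [3, 3, 6]
print(route(nodes, 0, 5, 3))       # [0, 6]
```

The first finger is always the next node clockwise, so a usable hop exists whenever the current node is not the owner.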

Common underlying structure
• Underlying metric space.
• Nodes embedded in metric space.
• Location determined by key.
• Hashing to balance load.
• Greedy routing.
• O(log n) space at each node.
• O(log n) routing time.

Unifying approach
[Figure: keys and nodes v1 to v4 are mapped by HASH into a virtual overlay network connected by virtual links; a virtual route in the overlay corresponds to an actual route over physical links in the physical network.]

Link Distribution
Each node independently selects k long-hop links as per some distribution.
[Figure: line of nodes; node x has links to x - d1 and x - d2, chosen as per the distribution]

Abstract model
Simple metric space: 1D line.
Hash(key) = metric space location.
Short-hop links: immediate neighbors.
Long-hop links: inverse-distance distribution,
$$\Pr[\text{edge}(u, v)] = \frac{1/d(u, v)}{\sum_{v' \neq u} 1/d(u, v')}.$$
Greedy routing: forward message to the neighbor closest to the target in the metric space.
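The abstract model can be simulated directly. The sketch below is a hedged illustration, not the authors' simulator. It places nodes at integer positions 0..n-1 on a line and gives each node its immediate neighbors plus k long-hop links drawn with probability proportional to 1/d. It then routes greedily. The function names, the seed and the choice to sample with replacement are assumptions.

```python
import random


def sample_long_links(u, n, k, rng):
    """Draw k targets v != u with Pr[v] proportional to 1 / |u - v|."""
    others = [v for v in range(n) if v != u]
    weights = [1.0 / abs(u - v) for v in others]
    return rng.choices(others, weights=weights, k=k)  # with replacement


def build_network(n, k, seed=0):
    """Neighbor sets: short-hop links to u-1 and u+1, plus k long-hop links."""
    rng = random.Random(seed)
    links = {}
    for u in range(n):
        short = [v for v in (u - 1, u + 1) if 0 <= v < n]
        links[u] = set(short) | set(sample_long_links(u, n, k, rng))
    return links


def greedy_route(links, source, target):
    """Always forward to the neighbor closest to the target."""
    path = [source]
    while path[-1] != target:
        cur = path[-1]
        path.append(min(links[cur], key=lambda v: abs(v - target)))
    return path


links = build_network(n=256, k=4, seed=1)
path = greedy_route(links, 0, 255)
print(len(path) - 1, "hops")
```

Because the short-hop neighbor toward the target is always available, every hop strictly reduces the distance, so the loop terminates.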

What do we care about?
• Do we get similar upper bounds on routing time with failures?
• Is it possible to design a link distribution that beats the O(log² n) bound for routing given by the 1/d distribution?
• Can we dynamically construct such a network?

Greedy routing with failures
Analyze message delivery in phases [Kleinberg '99].
A message at node n is in phase i when $2^i \le d(n, t) < 2^{i+1}$, where t is the target.
At most (log n + 1) such phases.
[Figure: target t surrounded by regions labeled Phase 0, Phase 1, Phase 2]
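As a small illustration of the phase definition, the phase of a message is the integer i with 2^i <= d < 2^(i+1), where d is the distance to the target. The helper name `phase` is hypothetical and distances are assumed to be positive integers.

```python
def phase(distance: int) -> int:
    """Index i such that 2**i <= distance < 2**(i + 1)."""
    if distance < 1:
        raise ValueError("a message at the target is not in any phase")
    return distance.bit_length() - 1


for d in (1, 2, 3, 4, 131071):
    print(d, phase(d))
```

With n = 131072 nodes on a line the largest distance is 131071, which falls in phase 16, so the phases used are 0 through 16.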

[1..log n] long-hop links
Suppose each node has k long-hop links.
Average time spent in each phase: O((log n)/k).
With O(log n) such phases, total time: O((log² n)/k).
With failures: suppose each node/link fails with probability (1 - p).
Average time spent in each phase: O((log n)/(pk)).
Total time: O((log² n)/(pk)).
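A rough way to see where the 1/(pk) factor comes from, offered as a heuristic sketch and not as the authors' proof. Assume a single long-hop link leaves the current phase with probability about c / log n for some constant c, and that each of the k links is usable independently with probability p. The symbols c and q are introduced here for illustration only.

```latex
% q: chance that at least one usable link leaves the current phase
% (union-bound approximation, reasonable while pk is at most about log n)
q \approx \frac{c\,p\,k}{\log n}
\quad\Longrightarrow\quad
E[\text{hops per phase}] \approx \frac{1}{q} = \frac{\log n}{c\,p\,k}

% summing over at most (log n + 1) phases
E[\text{total hops}] \lesssim (\log n + 1)\cdot\frac{\log n}{c\,p\,k}
= O\!\left(\frac{\log^2 n}{p\,k}\right)
```

Ignoring constants, n = 131072 (log n = 17) with k = 17 gives log² n / (pk) = 289 / (17p) = 17/p, so halving p from 1 to 0.5 doubles the estimate from 17 to 34.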

Simulation results
[Figure: simulation with n = 131072 nodes and log n = 17 links]
What happens with > log n links?

What do we care about?
• Do we get similar upper bounds on routing time with failures?
• Is it possible to design a link distribution that beats the O(log² n) bound for routing given by the 1/d distribution?
  Lower bound on routing time as a function of number of links per node.
• Can we dynamically construct such a network?

Intuition for lower bound [KUW '88]
Time T needed for a non-increasing real-valued Markov chain $X_0, X_1, X_2, \ldots$ to drop to 1 is bounded by
$$E[T] \le \int_1^{X_0} \frac{dz}{\mu_z},$$
where $\mu_z = E[X_t - X_{t+1} \mid X_t = z]$ is a non-decreasing function of z.
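A short worked instance may help. Take the bound in its usual form, expected time at most the integral of 1/μ_z from 1 to X_0, and assume a chain whose expected drop from state z is z/2 (for example, X_{t+1} uniform on [0, X_t]). This example chain is an assumption chosen for illustration, not one from the talk.

```latex
\mu_z = E[X_t - X_{t+1} \mid X_t = z] = \frac{z}{2}
\quad\text{(non-decreasing in } z\text{)}

E[T] \;\le\; \int_1^{X_0} \frac{dz}{\mu_z}
     \;=\; \int_1^{X_0} \frac{2}{z}\,dz
     \;=\; 2 \ln X_0
```

So a chain that halves on average reaches 1 in O(ln X_0) expected steps, as one would expect.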

Upper bound on time
[Figure: trip from SFO to NYC with points x and z marked]
Starting from x, the average speed at z is $\mu_x$.
$\mu_z$ gives a lower bound on the average crossing speed ($\mu$ is non-decreasing, so $\mu_x \ge \mu_z$).
$1/\mu_z$ gives an upper bound on time.

Lower bound on time
[Figure: trip from SFO to NYC with points x and z marked]
$m_z = \sup \mu_x$, taken over the points x from which z can be crossed, gives an upper bound on the average crossing speed.
$1/m_z$ gives a lower bound on time.
This may give too large an estimate, so condition against high bursts of speed.

Tool for lower bound
Non-increasing Markov chain $X_0, X_1, X_2, \ldots$ with state space S.
Few long jumps: $\Pr[X_t - X_{t+1} \ge U \mid X_t = x] \le \epsilon$.
Upper bound on speed [no long jumps]: $\mu_x = E[X_t - X_{t+1} \mid X_t = x,\ X_t - X_{t+1} < U]$ and $m_z = \sup\{\mu_x : x \in S,\ x \in [z, z+U)\}$.
Time from x [no long jumps]: $T(x) = \int_0^x \frac{dz}{m_z}$.
$$E[\text{Time to reach } 0] \ge \frac{T(X_0)}{\epsilon\, T(X_0) + (1 - \epsilon)}$$

Applying tool to routing
Cannot bound progress of a single node with an arbitrary distribution!
So use an aggregate chain $S_t$ of nodes for the collective behavior of nodes in some range.
[Figure: range of nodes $S_t$ above 0 with links of lengths d1, d2, d3, and the resulting range $S_{t+1}$]
Track $\ln(|S_t|)$ for the recurrence relation.

Lower bounds
Random graph G. Node x has k independent links on average.
x links to (x - 1) and (x + 1).
Expected time to reach 0 from a point chosen uniformly from 1..n:
• 1-sided routing: [formula missing in source]
• 2-sided routing*: [formula missing in source]
[Figure: 1-sided and 2-sided routing, showing which links are ignored]
* Probability of choosing links symmetric about 0 and unimodal.
A bound of order ln² n is worse than the O(ln n) for a tree: the cost of assuming symmetry between nodes.

What do we care about?
• Do we get similar upper bounds on routing time with failures?
• Is it possible to design a link distribution that beats the O(log² n) bound for routing given by the 1/d distribution?
• Can we dynamically construct such a network?

Heuristic for construction
New node chooses neighbors using the inverse-distance distribution.
Links to live nodes closest to chosen ones.
Selects older nodes to point to it.
[Figure legend: new link, adjusted link, initial link, ideal link; points x and y; absent node, new node, older node]
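The first two steps of the heuristic can be sketched as follows. This is a minimal illustration under stated assumptions: positions are integers 0..n-1 on a line, `live` is a sorted list of occupied positions, and the function names are hypothetical. The third step, where older nodes redirect links to the new node, is not modeled here.

```python
import random
from bisect import bisect_left


def closest_live(live, point):
    """Live position nearest to point; live must be sorted and non-empty."""
    i = bisect_left(live, point)
    candidates = live[max(0, i - 1): i + 1]
    return min(candidates, key=lambda v: abs(v - point))


def join(live, x, n, k, rng):
    """Links for a new node at x: ideal targets snapped to live nodes."""
    points = [v for v in range(n) if v != x]
    weights = [1.0 / abs(x - v) for v in points]
    ideal = rng.choices(points, weights, k=k)      # inverse-distance choice
    return [closest_live(live, t) for t in ideal]  # adjust to live nodes


live = [10, 50, 90]
print(join(live, x=40, n=100, k=3, rng=random.Random(0)))
```

Each ideal target may be an absent node, so the link is adjusted to the nearest live position, as in the slide's "ideal link" versus "adjusted link" distinction.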

Open problems
• Does the lower bound generalize to multidimensional metric spaces?
• Does backtracking give a provably good routing bound?
• Design a self-stabilization mechanism.
• Analyze security properties such as anonymity and Byzantine failures.
[Aspnes, Shah '02, submitted to SODA]