

A Measurement Study of Napster and Gnutella
Krishna P. Gummadi, Stefan Saroiu and Steven D. Gribble (U. of Washington)

1. Motivation
§ Lots of research and industrial excitement:
  § Chord [MIT], Tapestry, CAN [UCB], Jxta [SUN], Herald, Past [MSR], Publius [AT&T]
§ A distributed infrastructure largely composed of voluntary, dynamic, ad-hoc membership by peers.
§ Peers have symmetric roles (serving, downloading and routing) throughout the system.
§ No knowledge is available regarding the fundamental characteristics of the peers participating in the network.
§ This knowledge can help in evaluating the effectiveness of different schemes.

2. Measurement Methodology
§ Our measurements proceeded in three stages:
1. Periodically crawl Napster and Gnutella:
  § discover peers, IP addresses, overlay topology, and whatever metadata about peers is available
2. Feed output from the crawl into custom measurement tools:
  § measure bottleneck bandwidth to/from peers using SProbe
  § measure IP latency to/from peers
  § track content and degree of sharing, where possible
3. Sub-sample the population to measure lifetime:
  § track availability of peers at application and IP level

3. Results
How many peers have server-like behavior?
High upstream bandwidth?
§ The majority of peers (>50%) connect through cable or DSL modems.
§ On average, peers have low upstream bandwidths compared to downstream bandwidths, a feature more representative of a client than a server.
Low latency?
§ Large variation in IP-level latencies. For a fraction of peers (~20%), transmission delay << latency, implying congestion.
High availability?
§ Session durations are strikingly similar in both systems. Median session: ~60 mins. Hence, content on a peer is unlikely to be available without replication.
What is the extent of free-riding? How many peers lie about their bandwidth?
§ A large fraction of peers (~25%) choose not to report their bandwidth; they are either unaware of it or have no incentive to report it.
§ Modem (<64 Kbps) users share fewer files and do more downloads than broadband users.
§ Peers have an incentive to report lower bandwidths, and a significant fraction do so.
§ Sharing fewer files: the top 7% of nodes share more than the bottom 75% in Gnutella. 40-60% of peers in Napster share only 10-15% of files.
§ Lack of knowledge is universal.

4. Conclusions
§ Peers' characteristics are very heterogeneous. A system should delegate responsibilities to its peers based on their characteristics.
§ The system should measure the characteristics of a peer rather than rely on self-reports from the peers themselves.

5. Future Work
§ Apply the results of these measurements to evaluate several proposed distributed index systems.
§ Analyze content lifetime patterns and the geographical distribution of the content and peers.

6. More Information
A more complete analysis of the measurements is to appear in Multimedia Computing and Networking (MMCN) 2002. A tech report is available online at: http://www.cs.washington.edu/homes/gummadi/p2ptechreport.pdf
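
The free-riding comparison in the Results (files shared by the top 7% of peers versus the bottom 75%) can be illustrated with a minimal sketch. This is not the authors' tooling: the function name `share_of_files` and the peer counts below are hypothetical, chosen only to show how such a concentration figure might be computed from per-peer shared-file counts gathered by a crawl.

```python
def share_of_files(counts, fraction, top=True):
    """Fraction of all shared files held by the top (or bottom) `fraction`
    of peers, with peers ranked by the number of files they share."""
    if not counts:
        raise ValueError("counts must not be empty")
    # Descending order selects the heaviest sharers first; ascending the lightest.
    ranked = sorted(counts, reverse=top)
    k = round(len(ranked) * fraction)  # number of peers in the selected group
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:k]) / total


# Hypothetical population of 100 peers (made-up numbers, not measured data):
# 60 peers share nothing, 15 share 5 files, 18 share 20 files, 7 share 500 files.
peers = [0] * 60 + [5] * 15 + [20] * 18 + [500] * 7

top7 = share_of_files(peers, 0.07, top=True)
bottom75 = share_of_files(peers, 0.75, top=False)
print(f"top 7% hold {top7:.1%} of files, bottom 75% hold {bottom75:.1%}")
```

Ranking peers once and summing a prefix keeps the calculation simple; a real analysis would also need to decide how to treat peers whose file counts could not be observed.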