Peer-to-Peer: EE 122 Intro to Communication Networks, Fall 2010

Peer-to-Peer
EE 122: Intro to Communication Networks, Fall 2010 (MW 4-5:30 in 101 Barker)
Scott Shenker
TAs: Sameer Agarwal, Sara Alspaugh, Igor Ganichev, Prayag Narula
http://inst.eecs.berkeley.edu/~ee122/
Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson and other colleagues at Princeton and UC Berkeley

Today’s Lecture
• The Opening Act (~30 minutes)
  – Scott talking about Peer-to-Peer Systems
• The Main Event (rest of the time, and beyond)
  – Igor Ganichev talking about Networking Libraries

Ground Rules
• No slides (for me)
  – No laptops for you
• I won’t test you on anything from this lecture
  – This is for context
• I will give you homework on a P2P system
  – But based on material in the book
  – Won’t be hard
  – Some material covered in section

Peer-to-Peer
• Design paradigm: No central control
  – Large number of identical nodes
  – Highly resilient and scalable
  – Just what you need to run a large datacenter
• Economic model: leverage user nodes
  – No need for huge investment
  – Broad geographic distribution
  – Self-scaling
• Will discuss both, but start with economic model…
  – A continuing struggle for control

In the beginning…
• AT&T created the telephone network
• First large-scale person-to-person communication infrastructure
  – The patent dispute over the telephone makes our patent litigation battles seem like child’s play
• The telephone network dominated for two generations…

The Telephone Model
• Functionality controlled by network operator
  – They sink the money into the infrastructure
  – They get to decide what that infrastructure does
  – But a government-regulated company (set ROI, etc.)
• End-user only has “dumb terminal”
  – Legally restricted in its use of that terminal
  – Until the courts finally gave some freedom to users
• Regulated monopoly led to glacial innovation in functionality but extreme reliability and polish
  – Why spend money on features no one knows they want?
  – Spend money improving what people notice (failures)

Then came the Internet…
• End points had complete freedom, and substantial computing power
  – Infrastructure just carried bits
• Completely different economic model
  – Small guys can innovate
  – Big guys run dumb infrastructure (like utilities)
• Result:
  – Rapid innovation in applications (e.g., email, web)
  – Diversity of content (on web)
  – Low barrier to entry
• And finally, even the big boys noticed…

The Empire Strikes Back
• Zipf’s law restores order to the universe
  – Popularity ~ 1/rank
  – Lots of weight at top (people like the same things)
  – Lots of weight in tail (but lots of idiosyncratic tastes)
• A Tale of Two Markets
  – Lots of action in the tail (anyone can play)
  – But only a few really big guys (hard to enter this market)
• High barrier to entry: CDNs
  – Bandwidth
  – Servers
  – Management
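To make "lots of weight at top, lots of weight in tail" concrete, here is a minimal sketch that assumes popularity is exactly proportional to 1/rank and splits total demand between the top-ranked items and everything else. The function name and the catalog sizes are illustrative assumptions, not figures from the lecture.

```python
def zipf_shares(n_items: int, top_k: int) -> tuple[float, float]:
    """Return (head_share, tail_share) of total demand when popularity ~ 1/rank."""
    weights = [1.0 / rank for rank in range(1, n_items + 1)]
    total = sum(weights)
    head = sum(weights[:top_k])  # demand captured by the top_k most popular items
    return head / total, (total - head) / total


# Hypothetical catalog: one million items, of which the top thousand are the "head".
head_share, tail_share = zipf_shares(n_items=1_000_000, top_k=1_000)
print(f"head: {head_share:.0%} of requests, tail: {tail_share:.0%} of requests")
```

Under this assumption a tiny fraction of the catalog draws about half of all requests, while the long tail still accounts for the other half, which is the "two markets" picture on the slide.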

Revenge of the Nerds
• Peer-to-Peer restores the balance
  – Takes “contributed” nodes from participants
  – Together they provide enough aggregate bandwidth
• The key is in coordinating these peer nodes
  – First: Napster (Shawn Fanning)
• Academia followed (as it always does)
• My lecture on how academia has missed out on everything?
  – We are really good at solving problems
  – We are really terrible at figuring out what people want…

Coordination Mechanisms
• Must be:
  – Scalable
  – Fault-tolerant
  – Able to use commodity parts
• A good way to build systems in general!
• Now finished with P2P as economic model
  – Moving on to…

Peer-to-Peer as Design Paradigm
• Once you can coordinate many disparate peers
  – You can certainly coordinate many co-located peers
  – Now the dominant design style in datacenters
  – DHT-like data structures are everywhere
• This is what made Google work: (like Jobs at App)
  – Design as if failure is the typical case
  – Recover from failure only at the highest possible layer
    o If routing fails use another server, don’t wait for routing to recover
    o This is hard to accept for some people…
  – Low cost components
  – Scale out, not up
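The "if routing fails use another server" rule can be sketched as a client that treats every failed request as routine and simply moves on to the next replica. This is only an illustration under stated assumptions: `fetch_with_failover`, `fake_fetch`, and the replica names are hypothetical and do not come from the lecture.

```python
from typing import Callable, Sequence


def fetch_with_failover(replicas: Sequence[str],
                        fetch: Callable[[str], bytes]) -> bytes:
    """Try each replica in order; on a network error, move on instead of waiting."""
    last_error = None
    for server in replicas:
        try:
            return fetch(server)
        except OSError as err:  # ConnectionError and TimeoutError are OSError subclasses
            last_error = err    # failure is the expected case: remember it and continue
    raise RuntimeError("all replicas failed") from last_error


def fake_fetch(server: str) -> bytes:
    """Hypothetical stand-in for a real request; 'replica-a' is always unreachable."""
    if server == "replica-a":
        raise ConnectionError("unreachable")
    return f"data from {server}".encode()


print(fetch_with_failover(["replica-a", "replica-b"], fake_fetch))
```

Recovery happens in the application, the highest layer available here: nothing below it is asked to repair the path to the failed replica.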

P2P Systems Do Three Main Things
• Help user determine which content they want
  – Some form of search
  – P2P form of Google
• Then locate that content
  – Locate where that content is on the Internet
  – P2P form of DNS (map name to location)
• Then download that content
  – P2P form of Akamai

We need P2P forms of
• Search (keyword)
• Directory
• CDN
• What kinds of coordination mechanisms do we need for these tasks?

P2P Search
• Basic approach:
  – Since search can be complicated, just do it on each machine independently, and keep going for as long as you need
• Examples:
  – Broadcast among superpeers
  – Random walk (theory)
• Cannot match efficiency of Google
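The broadcast approach can be sketched as a hop-limited flood in which every peer reached checks only its own local index. This is a toy model under assumptions: the overlay graph, the per-peer indexes, and the `ttl` parameter are all hypothetical, and a real system would exchange messages between machines rather than walk an in-memory dictionary.

```python
from collections import deque


def flood_search(graph, index, start, keyword, ttl):
    """Flood a query outward from `start` for at most `ttl` hops.

    graph: peer -> list of neighbouring peers (the overlay)
    index: peer -> list of file names that peer holds locally
    Returns (peer, file_name) pairs whose name contains `keyword`.
    """
    hits = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        peer, hops = queue.popleft()
        # Each peer runs the search on its own data, independently of the others.
        hits += [(peer, name) for name in index.get(peer, []) if keyword in name]
        if hops == ttl:
            continue  # hop budget exhausted: do not forward any further
        for neighbour in graph.get(peer, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return hits


# Hypothetical overlay of five superpeers and their local indexes.
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
local_index = {"C": ["blue song.mp3"], "E": ["blue moon.mp3", "red.mp3"]}

print(flood_search(overlay, local_index, "A", "blue", ttl=2))
print(flood_search(overlay, local_index, "A", "blue", ttl=3))
```

Raising `ttl` is the "keep going for as long as you need" knob: it finds more matches but touches more peers per query, which is one reason flooding cannot match a centralized index on efficiency.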

P2P Directory
• In most cases, a few centralized servers will do
• If you need to scale further, then use DHT
  – Put/Get interface
• DHT: simple version is consistent hashing
  – Everyone knows set of servers
  – Map key to server using the successor rule
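A minimal sketch of the simple version described above: every client knows the full set of servers, servers and keys are hashed onto the same ring, and a key belongs to its successor, the first server at or after the key's position, wrapping around at the end. The class name, the choice of SHA-1, and the in-memory storage are assumptions made for illustration; the server list is assumed to be non-empty.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Hash a server name or key to an integer position on the ring (SHA-1 assumed)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class ConsistentHashRing:
    """Toy put/get directory in which every client knows the full server set."""

    def __init__(self, servers):
        self._ring = sorted((ring_position(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._ring]
        self._store = {s: {} for s in servers}  # stands in for per-server storage

    def successor(self, key: str) -> str:
        """Successor rule: first server at or after the key's position, wrapping around."""
        i = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[i % len(self._ring)][1]

    def put(self, key: str, value) -> None:
        self._store[self.successor(key)][key] = value

    def get(self, key: str):
        return self._store[self.successor(key)].get(key)


# Hypothetical servers and a name-to-location mapping.
ring = ConsistentHashRing(["server-a", "server-b", "server-c", "server-d"])
ring.put("song.mp3", "10.0.0.7")
print(ring.successor("song.mp3"), ring.get("song.mp3"))
```

The appeal of the successor rule is that removing one server only reassigns the keys that server held; every other key keeps its owner, so the directory can grow and shrink without a global reshuffle.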

P2P Download
• The first key here is self-scaling
• If every person who downloads something also has to upload it to someone else, the system works
• The second key here is asymmetric bandwidth
  – That’s where chunks come in
  – Downloading many chunks
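Self-scaling can be illustrated with the usual fluid-model lower bounds on the time to deliver one file to every peer. This model is a standard textbook simplification rather than something stated on the slides, and all rates and sizes below are made-up example values. It assumes every peer has the same upload and download rate and that chunks can be forwarded as soon as they arrive.

```python
def client_server_time(file_bits, n_peers, server_up, peer_down):
    """Lower bound when the server alone uploads a full copy to each of n_peers."""
    return max(n_peers * file_bits / server_up, file_bits / peer_down)


def p2p_time(file_bits, n_peers, server_up, peer_up, peer_down):
    """Lower bound when every downloader also uploads chunks to other peers."""
    total_up = server_up + n_peers * peer_up  # upload capacity grows with the swarm
    return max(file_bits / server_up,              # server must push one copy out
               file_bits / peer_down,              # each peer must pull one copy in
               n_peers * file_bits / total_up)     # all copies through total upload


# Hypothetical numbers: 1 GB file, 100 Mb/s server, peers with 10 Mb/s down, 1 Mb/s up.
FILE_BITS, SERVER_UP, PEER_DOWN, PEER_UP = 8e9, 100e6, 10e6, 1e6
for n in (10, 1_000, 100_000):
    cs = client_server_time(FILE_BITS, n, SERVER_UP, PEER_DOWN)
    p2p = p2p_time(FILE_BITS, n, SERVER_UP, PEER_UP, PEER_DOWN)
    print(f"{n:>7} peers: client-server >= {cs:,.0f} s, P2P >= {p2p:,.0f} s")
```

In this model the client-server bound grows linearly with the number of peers, while the P2P bound levels off near the file size divided by a single peer's upload rate. Because upload is much slower than download, a peer fills its download link by fetching different chunks from many uploaders in parallel.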

Modern P2P Systems Use a Mixture
• Search to find name (wildcard search)
  – Flood among superpeers
• Directory lookup to find host given exact name
  – DHT-like structure
• Chunked download
  – Self-scaling
  – Asymmetric bandwidth