BitTorrent: The protocol, its background and uses
1. BitTorrent Background
  a) What is BitTorrent?
  b) Who's the author, history
2. The Protocol
  a) Terminology
  b) Distributed Scenario
  c) Structure of .torrent files
  d) Protocol between peers and trackers
3. BitTorrent Applications
  a) BitTorrent Inc, usages throughout industry
BitTorrent
“You get so tired of having your work die,” he says. “I just wanted to make something that people would actually use.”
• The above quote is from Bram Cohen, BitTorrent's author, in an interview with Wired in 2005.
What is BitTorrent? From 10,000 feet
• Efficient content distribution system using file swarming.
• Does not perform all the functions of a typical P2P system, such as searching.
http://www.cs.uiowa.edu/~ghosh/bittorrent.ppt
What is BitTorrent?
• BitTorrent introduced two novel concepts:
  • Rather than providing a search protocol itself, it was designed to integrate seamlessly with the Web: files (torrents) are made available via Web pages, which can be found using standard Web search tools.
  • It enabled so-called file swarming: once a peer starts downloading a file, it immediately makes whatever portion it has downloaded available for sharing.
What is BitTorrent?
• The file-swarming process is enabled through the use of a tracker:
  • an HTTP-based server used to dynamically synchronise and update the peers as they are downloading; it tracks the availability of pieces of the file on the network.
• The tracker can also monitor users' usage on the network: how much do they contribute?
• The protocol then applies a tit-for-tat scheme, which divides bandwidth according to how much a peer contributes to the other peers in the network: if you do not share, you cannot consume.
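The tit-for-tat idea can be illustrated with a deliberately simplified model: upload bandwidth is split in proportion to what each peer has contributed. This is only a sketch of the principle stated on the slide, not the actual choking algorithm used by BitTorrent clients; the function name, peer names and figures are all hypothetical.

```python
def split_upload_bandwidth(total_kbps, contributed_bytes):
    """Divide upload bandwidth among peers in proportion to their contribution.

    contributed_bytes maps a (hypothetical) peer name to the number of bytes
    that peer has sent us. In this simplified model a peer that has sent
    nothing receives no share.
    """
    total_contributed = sum(contributed_bytes.values())
    if total_contributed == 0:
        # Nobody has contributed yet, so there is nothing to reward.
        return {peer: 0.0 for peer in contributed_bytes}
    return {
        peer: total_kbps * given / total_contributed
        for peer, given in contributed_bytes.items()
    }


# Peer A sent three times as much as B; C sent nothing.
shares = split_upload_bandwidth(100, {"A": 300, "B": 100, "C": 0})
print(shares)
```

In this model a peer that shares nothing gets a zero share, which is the "if you do not share, you cannot consume" rule in its bluntest form.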
BitTorrent: Bram Cohen
• Born 1975; computer programmer.
• Engineered large parts of Mojo Nation (mojonation.net), parts of which are similar in flavour to BitTorrent (pre April 2001).
• From April 2001, focused on authoring the peer-to-peer (P2P) BitTorrent protocol and writing the first file-sharing program to use the protocol, also known as BitTorrent.
• He is also the organizer of the San Francisco Bay Area P2P-hackers meeting, and the co-author of Codeville. Currently lives in the San Francisco Bay Area.
Start of BitTorrent: CodeCon
• Cohen unveiled his novel ideas at the first CodeCon conference in 2002.
• CodeCon is a conference for hackers and technology enthusiasts.
• Co-organised by Bram and his roommate Len Sassaman.
• CodeCon was intended to be a low-cost conference (i.e. <$100) with a focus on developers presenting working code, rather than on companies with products to sell.
• It remains an event for those seeking information about new directions in software, though BitTorrent continues to lay claim to the title of "most famous presentation".
Features? • Peer-to-peer in nature
Taxonomy for Distributed Systems
The taxonomy is based on the following factors and their relation to centralization:
1. Resource Discovery: what mechanism is used for discovering resources on a distributed system?
  • Examples: DNS, Napster lookup, Jini LUS, UDDI, Gnutella broadcast, etc.
2. Resource Availability: scalability. Do resources scale with the network? Does access to them scale with the network?
3. Resource Communication: two types:
  • Brokered communication (centralized): communication is passed through a central server; resources do not have direct references to each other.
  • Point to point (decentralized, peer to peer): a direct connection between the sender and the receiver.
Centralization of Point-to-Point Connections
[Figure: two contrasted topologies.]
• True peer to peer (e.g. Gnutella): equal peers, balanced (equal) load on communication.
• Web server: many-to-one relationship between users and the web server, so this can be considered centralized communication.
• A further label in the figure reads "BitTorrent pieces".
Features?
• Peer-to-peer in nature
• Central server called a tracker
• Tracker uses HTTP
• Download and upload at the same time
• Efficiency improves the more a file is downloaded
Downloading Speeds
Download speeds depend on two factors:
• BitTorrent keeps track of how much you contribute to hosting files for the group.
  • The more you share, the faster your downloads.
• The more people trading a file, the more options there are for obtaining its pieces.
  • So, unlike the old Napster, popularity doesn't bog down the process: it gives it a shot of adrenaline.
• Trackers are also more dynamic than Napster servers: they provide updates.
File Swarming
• File swarming allows users to download files at the maximum download capability of their broadband connection.
• It enables simultaneous downloads of pieces of the same file from multiple users.
• This is significant because broadband has a far lower upload bandwidth than download bandwidth:
  • upload bandwidth can be ten times slower than download.
• Connecting to, say, ten peers balances this mismatch and enables full download capacity.
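The arithmetic behind the last bullet can be sketched as follows. The link speeds are hypothetical, and the model assumes each remote peer devotes its whole upload link to us, which real swarms do not guarantee.

```python
def achievable_download_kbps(download_cap_kbps, peer_upload_kbps):
    """Aggregate rate from several uploaders, capped by our own download link."""
    return min(download_cap_kbps, sum(peer_upload_kbps))


# Hypothetical asymmetric link: 10,000 kbit/s down, and every remote peer
# can upload at one tenth of that, 1,000 kbit/s.
one_peer = achievable_download_kbps(10_000, [1_000])
ten_peers = achievable_download_kbps(10_000, [1_000] * 10)
print(one_peer, ten_peers)
```

With a single uploader the download link sits 90% idle; ten uploaders together fill it.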
BitTorrent Protocol
• The BitTorrent protocol is an open specification.
• It can be found in full on the BitTorrent Web site.
• It is updated periodically in order to keep the various BitTorrent applications compatible.
Terminology 1
• Torrent - metadata file containing the information about a file to be shared on the BitTorrent network
• Peer - a participant in the network
• Seed - a peer that has a complete copy of the file (the first seed is probably whoever created the torrent)
• Swarm - the peers that are connected to (interested in) a particular file
• Tracker - server responsible for keeping track of the peers in a swarm
Terminology 2
• Choked - state of a connection when a peer does not wish to upload at this time (perhaps because it already has too many connections)
• Interested - a client is “interested” if it wants to download pieces from another BitTorrent node
• Piece - a fixed-size chunk of a file in BitTorrent; the piece size is typically a power of 2 and depends on the file size; common sizes are 256 KB, 512 KB or 1 MB
• Bencoding - terse format for BitTorrent messages
BitTorrent
A BitTorrent application generally has the following components:
• An 'original' downloader - the seed
• An ordinary web server
• The end-user web browsers, in which users click on:
  • A static 'metainfo' file (a .torrent file)
• The end-user downloading applications (BitTorrent), started by that click
• A BitTorrent tracker
• There are ideally many end users for a single file.
Lectures as .torrent
[Diagram: Seed (Ian T.), Web server (web sites contain .torrent files), Tracker, a user's web browser, and several BitTorrent clients (enthusiastic students).]
1. Ian creates IansLectures.torrent (metadata) and uploads it to a Web site.
2. A user clicks IansLectures.torrent, which launches the BitTorrent client because of the MIME mapping from .torrent to the BitTorrent application.
3. Clients show interest in IansLectures.torrent.
4. The BitTorrent client contacts the specified tracker and finds “interested” clients.
5. Clients connect to each other and to the seed to download pieces.
BitTorrent Messages - Bencoding
• Bencoding is a way to specify and organize data in a terse format. It supports the following 4 types:
  • Strings are encoded as <string length>:<string data>, e.g. 4:spam represents the string "spam"
  • Integers are encoded as i<integer>e, e.g. i3e represents the integer 3
  • Lists are encoded as l<bencoded values>e, e.g. l4:spam4:eggse represents the list of two strings [ "spam", "eggs" ]
  • Dictionaries are encoded as d<bencoded string><bencoded element>e (note that keys must be bencoded strings), e.g. d4:spaml1:a1:bee represents the dictionary { "spam" => [ "a", "b" ] }
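The four encoding rules above are small enough to sketch in a few lines of Python. This is a minimal illustrative encoder, not the reference implementation; the function name bencode is an assumption. Dictionary keys are written in sorted order, as the published specification requires.

```python
def bencode(value):
    """Encode an int, str/bytes, list or dict using the four bencoding rules."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, but bencoding has no boolean type.
        raise TypeError("booleans cannot be bencoded")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        # <string length>:<string data>
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        pairs = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be strings")
            pairs.append((key, item))
        # Keys are emitted in sorted order of their raw bytes.
        pairs.sort(key=lambda pair: pair[0])
        body = b"".join(bencode(key) + bencode(item) for key, item in pairs)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(bencode("spam"))
print(bencode(3))
print(bencode(["spam", "eggs"]))
print(bencode({"spam": ["a", "b"]}))
```

The four calls reproduce the four examples from the slide.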
.torrent Files
The content of a ".torrent" file is a bencoded dictionary, containing:
• announce: the URL of the tracker (string); later versions have lists of trackers.
• info: a dictionary that describes the file(s) of the torrent and contains the following:
  • name: name for the file
  • piece length: number of bytes in each piece (integer)
  • pieces: string consisting of the concatenation of all 20-byte SHA1 hash values, one per piece (byte string)
• The format differs between a single file (as above) and many files: in the multi-file case a files list has one entry per file, and each entry uses path to identify its file uniquely, since name alone no longer can.
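As a rough sketch of how the pieces value is produced, the snippet below splits some content into fixed-size pieces and concatenates their SHA1 digests. The tracker URL, file name and content are made up for illustration, and only the keys listed above are shown; a real metainfo file carries more.

```python
import hashlib


def piece_hashes(data, piece_length):
    """Concatenate the 20-byte SHA1 digests of each piece of data."""
    digests = []
    for start in range(0, len(data), piece_length):
        piece = data[start:start + piece_length]  # the last piece may be shorter
        digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


# Hypothetical single-file torrent.
content = b"x" * 600_000
piece_length = 256 * 1024  # 256 KB, a power of 2

metainfo = {
    "announce": "http://tracker.example.org/announce",  # made-up tracker URL
    "info": {
        "name": "IansLectures.zip",  # made-up file name
        "piece length": piece_length,
        "pieces": piece_hashes(content, piece_length),
    },
}

piece_count = len(metainfo["info"]["pieces"]) // 20
print(piece_count)
```

600,000 bytes at 256 KB per piece gives two full pieces and one short one, so the pieces string is 3 × 20 = 60 bytes long.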
BitTorrent - Trackers
• Centralised: all clients go to one server.
• The BitTorrent solution: customers help distribute content.
• Their contribution grows at the same rate as their demand, creating limitless scalability for a fixed cost.
• The tracker maintains the process.
Tracker Scenario
[Diagram: a Seed, a Tracker and three peers BT1, BT2 and BT3.]
• Step 1: the seed sends out pieces 1, 2 and 3; the tracker is updated.
• Step 2: the seed sends out pieces 4, 5 and 6, while pieces 1, 2 and 3 are exchanged among the peers.
Tracker GET Request
Peer -> Tracker
• info_hash - 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file.
• peer_id - string of length 20 containing the ID of the downloader, generated at random at the start of a new download.
• ip - IP (or DNS name) of the peer.
• port - port number for the peer; the client tries port 6881 and, if that port is taken, tries 6882, then 6883, etc., and gives up after 6889.
• uploaded - total amount uploaded so far.
• downloaded - total amount downloaded so far.
• left - number of bytes this peer still has to download.
• event - optional key which maps to started, completed, or stopped (or empty, which is the same as not being present).
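A minimal sketch of how a client might assemble this GET request is shown below. The helper name announce_url, the tracker URL and the sample info value are hypothetical; the optional ip key is left out, and no network request is made.

```python
import hashlib
import secrets
from urllib.parse import urlencode


def announce_url(tracker_url, bencoded_info, peer_id, port,
                 uploaded, downloaded, left, event=None):
    """Build the tracker GET URL from the keys listed above."""
    params = {
        # 20-byte SHA1 of the bencoded info dictionary (raw bytes, URL-escaped).
        "info_hash": hashlib.sha1(bencoded_info).digest(),
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
    }
    if event:
        # Optional: started, completed or stopped.
        params["event"] = event
    return tracker_url + "?" + urlencode(params)


# Hypothetical values for illustration only.
example_info = b"d4:name4:spame"
peer_id = secrets.token_bytes(20)  # random 20-byte ID for this download
url = announce_url("http://tracker.example.org/announce", example_info,
                   peer_id, 6881, 0, 0, 1000, event="started")
print(url)
```

urlencode percent-escapes the raw bytes of info_hash and peer_id, which is needed because neither is printable text in general.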
Tracker Response
Tracker -> Peer
• Tracker responses are bencoded dictionaries.
• If a tracker response has a key failure reason, then that maps to a human-readable string which explains why the query failed, and no other keys are required.
• Otherwise, it must have two keys:
  • interval, which maps to the number of seconds the downloader should wait between regular rerequests
  • peers, which maps to a list of dictionaries corresponding to peers, each of which contains the keys peer id, ip, and port, which map to the peer's self-selected ID, IP address or DNS name as a string, and port number, respectively.
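The response rules above can be sketched with a small decoder. Both bdecode and parse_tracker_response are illustrative names rather than part of any real client, the sample response is hand-built, and only the dictionary form of the peers list described on the slide is handled.

```python
def bdecode(data, pos=0):
    """Decode one bencoded value starting at data[pos]; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencoded data at offset {pos}")


def parse_tracker_response(body):
    """Return (interval, [(ip, port, peer_id), ...]) or raise on failure."""
    response, _ = bdecode(body)
    if b"failure reason" in response:
        raise RuntimeError(response[b"failure reason"].decode("utf-8", "replace"))
    peers = [
        (peer[b"ip"].decode("ascii"), peer[b"port"], peer[b"peer id"])
        for peer in response[b"peers"]
    ]
    return response[b"interval"], peers


# Hand-built sample response with one peer (all values are made up).
sample = (b"d8:intervali1800e5:peersl"
          b"d2:ip8:10.0.0.27:peer id20:" + b"A" * 20 + b"4:porti6881ee"
          b"ee")
interval, peers = parse_tracker_response(sample)
print(interval, peers)
```

A client would wait interval seconds before its next regular rerequest and open connections to the returned peers in the meantime.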
Scenario
[Diagram sequence: a Tracker, a Web server hosting a web page with a link to the .torrent, three peers A, B and C (each labelled Leech or Seed), and the downloader “US”.]
1. The downloader fetches the .torrent from the web page.
2. The downloader sends a get-announce request to the tracker.
3. The tracker responds with a peer list.
4. The downloader shakes hands with the peers.
5. Pieces are exchanged between the downloader and the peers.
6. The announce and peer-list exchange with the tracker repeats while pieces continue to flow.
Strengths
• Better bandwidth utilization
• Speeds not seen before: up to 7 MB/s from the Internet
• Limits free riding: tit-for-tat
• Limits leech attacks: coupling of upload and download
• Spurious files are not propagated
• Ability to resume a download
• Open-source implementations!
Potential Drawbacks
• Small files: latency, overhead
• Scalability
  • Millions of peers: tracker behavior (uses 1/1000 of bandwidth)
  • Single point of failure: although there can be many trackers, only one tracker is assigned to each torrent file
  • Difficult to load balance
  • Solved later by having lists of alternative trackers
• Robustness
  • System progress depends on the altruistic nature of seeds (and peers)
  • Malicious attacks and leeches
Who Uses It?
• 160 million clients, 100 million active users
• According to their website, the company has announced partnerships with some 55 companies, including:
BitTorrent: summary
1. BitTorrent
  a) Underlying file sharing protocol
  b) Role of the .torrent
  c) Use and role of the tracker
  d) BitTorrent scenario
  e) How file swarming works