Attacking Kad Network
2009-03-04, Hongil Kim

E. Chan-Tin, P. Wang, J. Tyra, T. Malchow, D. Foo Kune, N. Hopper, Y. Kim, "Attacking the Kad Network - Real World Evaluation and High Fidelity Simulation using DVN", Wiley Security and Communication Networks, 2009

P2P Applications
  • File sharing: Napster, Gnutella, BitTorrent, etc.
  • Recent commercial applications
      ◦ Skype
      ◦ BitTorrent becomes legit
      ◦ P2P TV by Yahoo Japan
  • Research community
      ◦ P2P file and archival systems: Ivy, Kosha, OceanStore, CFS
      ◦ Web caching: Squirrel, Coral
      ◦ Multicast systems: SCRIBE
      ◦ P2P DNS: CoDNS and CoDoNS
      ◦ Internet routing: RON
      ◦ Next-generation Internet architecture: i3

P2P Systems
  • How to find the desired information?
      ◦ Centralized structured: Napster
      ◦ Decentralized unstructured: Gnutella
      ◦ Decentralized structured: Distributed Hash Table (DHT)
          ▪ Content addressable!
  • A DHT provides a hash table's simple put/get interface
      ◦ Insert a data object, i.e., a key-value pair (k, v)
      ◦ Retrieve the value v using key k
  [Figure: lookup in Napster (Napster.com, Match), Gnutella (QueryHit, Download) and a DHT (retrieve(K1)); P: a node looking for a file; O: offerer of the file; K, V: key and value]
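
The put/get interface described on this slide can be sketched as a single-process toy. This is only an illustration under stated assumptions: the class name `ToyDHT`, the helper `key_for`, and the use of SHA-1 to derive keys are hypothetical choices, not taken from the slides, and a real DHT spreads the table across many nodes.

```python
import hashlib


class ToyDHT:
    """Single-process stand-in for a DHT (hypothetical, for illustration only)."""

    def __init__(self):
        # In a real DHT this table is partitioned across nodes.
        self._store = {}

    @staticmethod
    def key_for(name: str) -> int:
        # Content addressing: the key is derived from the name itself.
        # SHA-1 is an assumption made for this sketch.
        return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

    def put(self, k: int, v: str) -> None:
        # Insert a data object, i.e. a key-value pair (k, v).
        self._store.setdefault(k, []).append(v)

    def get(self, k: int) -> list:
        # Retrieve the values stored under key k (empty list if none).
        return list(self._store.get(k, []))


dht = ToyDHT()
k1 = ToyDHT.key_for("song.mp3")
dht.put(k1, "offerer O")   # O publishes the file
found = dht.get(k1)        # P looks the file up by key
```

Because the key is computed from the content name, P needs no central index to know where to ask, which is the point of "content addressable".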

DHT: Terminologies
  • Every node has a unique ID: nodeID
  • Every object has a unique ID: key
  • Keys and nodeIDs are logically arranged on a ring (ID space)
  • A data object is stored at its root(key) and several replica roots
      ◦ root(key): the closest nodeID to the key (or the successor of k)
  • Range: the set of keys that a node is responsible for
  • Routing table size: O(log(N))
  • Routing delay: O(log(N)) hops
  • Content addressable!
  [Figure: ID ring with nodes C, Q, A, X, D, B, Y, R and a key k storing (k, v)]
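
As a rough illustration of root(key) under the "successor of k" reading, the sketch below picks the first nodeID at or after the key on a small ring and wraps around at the end. The 8-bit ID space, the function names, and the choice of the next nodes on the ring as replica roots are assumptions made for the example, not details from the slides.

```python
from bisect import bisect_left

ID_BITS = 8  # toy ID space of 256 positions (assumption for the example)


def successor_root(node_ids, key):
    """Return the first nodeID at or after key on the ring, wrapping around."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key % (1 << ID_BITS))
    return ring[i % len(ring)]


def replica_roots(node_ids, key, r):
    """Return the root and the following nodes on the ring, r nodes in total."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key % (1 << ID_BITS))
    return [ring[(i + j) % len(ring)] for j in range(min(r, len(ring)))]


nodes = [10, 80, 150, 220]
root = successor_root(nodes, 100)       # 150 is the first node at or after 100
wrapped = successor_root(nodes, 230)    # past the last node, wraps to 10
replicas = replica_roots(nodes, 200, 3)
```

Each node's range is then the arc of keys between its predecessor and itself.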

Target P2P System
  • Kad
      ◦ A peer-to-peer DHT based on Kademlia
  • Kad Network
      ◦ Overnet: an overlay built on top of eDonkey clients
          ▪ Used by P2P bots
      ◦ Overlay built using eD2K series clients
          ▪ eMule, aMule, MLDonkey
          ▪ Over 1 million nodes, many more firewalled users
      ◦ BT series clients
          ▪ Overlay on Azureus
          ▪ Overlay on Mainline and BitComet

Kademlia Protocol
  • d(X, Y) = X XOR Y
  • An entry in the i-th k-bucket shares at least an i-bit prefix with the nodeID
  • k = 20 in Overnet
  • Add a new contact if the k-bucket is not full
  • Parallel, iterative, prefix-matching routing
  • Replica roots: the k closest nodes
  [Figure: binary-tree routing table with 8-bit example nodeIDs (e.g. 10101100), k-buckets holding (nodeID, IP address) contacts, and a find/store lookup for 11001010]
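
The XOR metric and the "k closest nodes" rule can be sketched in a few lines. The 8-bit IDs below are taken from the slide's figure; the function names are hypothetical, and the sketch ignores networking, buckets and timeouts.

```python
def xor_distance(x: int, y: int) -> int:
    """Kademlia distance: d(X, Y) = X XOR Y, compared as an integer."""
    return x ^ y


def shared_prefix_len(x: int, y: int, bits: int = 8) -> int:
    """Number of leading bits that x and y have in common."""
    return bits - (x ^ y).bit_length()


def k_closest(contacts, key, k=20):
    """Replica roots: the k contacts with the smallest XOR distance to key."""
    return sorted(contacts, key=lambda c: xor_distance(c, key))[:k]


node = 0b10101100
key = 0b11001010
contacts = [0b11000100, 0b11001011, 0b10010100, 0b01001011]

prefix = shared_prefix_len(node, key)   # the two IDs agree only on the first bit
closest = k_closest(contacts, key, k=2)
```

A longer shared prefix always means a smaller XOR distance, which is why prefix-matching routing and XOR-closest lookup agree.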

Kad Protocol
  • No restriction on nodeID
  • Replica root: |r, k| < …
  • k-buckets with index [0, 4] can be split if a new contact is added to a full bucket
  • Wide routing table, hence a short routing path
  • The k-bucket at the i-th level covers 1/2^i of the ID space
  • A learns of new nodes by asking other nodes or by being contacted by them
  • Hello_req is used for liveness
      ◦ Routing requests can be used as well
  [Figure: routing table tree of node 10101100 with bucket indices 0 to 15]

Vulnerabilities of Kad
  • No admission control, no verifiable binding
      ◦ An attacker can launch a Sybil attack by generating an arbitrary number of IDs
  • Eclipse attack
      ◦ Stay long enough: Kad prefers long-lived contacts
      ◦ (ID, IP) update: a Kad client will update the IP for a given ID without any verification
  • Termination condition
      ◦ A query terminates when A receives 300 matches
  • Timeout
      ◦ When M returns many contacts close to K, A contacts only those nodes and times out

Actual Attack
  • Preparation phase
      ◦ Backpointer hijacking: for every node A, attacker M
          ▪ Learns A's routing table by sending appropriate queries
          ▪ Then changes A's routing table by sending the message "Hello, B, IP_M"
  [Figure: M sends "Hello, B, IP_M" to A; in A's routing table the entry for B changes from IP_B to IP_M; label: 0xD00D]
  • Execution phase
      ◦ Provide many non-existing contacts
          ▪ Fact: a query will time out after trying 25 contacts

Screen Shots

Summary of Estimated Cost
  • Assumption
      ◦ Total 1M nodes
      ◦ 800 routing table entries
      ◦ 100 Mbps network link
  • Preparation cost
      ◦ 41.2 GB of bandwidth to hijack 30% of the routing table
      ◦ Takes 55 minutes with a 100 Mbps link
  • Query prevention
      ◦ A 100 Mbps link is sufficient to stop 65% of all query messages in the network
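
The 55-minute figure follows directly from the other two numbers on this slide. The check below assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and a fully utilised link; the function name is hypothetical.

```python
def transfer_minutes(gigabytes: float, link_mbps: float) -> float:
    """Time to push `gigabytes` of traffic through a `link_mbps` link."""
    bits = gigabytes * 1e9 * 8            # decimal gigabytes to bits
    seconds = bits / (link_mbps * 1e6)    # link rate in bits per second
    return seconds / 60


# 41.2 GB over 100 Mbps: 3296 seconds, just under 55 minutes
minutes = transfer_minutes(41.2, 100)
```

With binary gigabytes (2^30 bytes) the same transfer would take about 59 minutes, so the slide's number is consistent with decimal units.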

Large scale simulation
  • 11,303 ~ 16,105 Kad nodes running on ~500 PlanetLab machines
  • Comparison between expected and measured
      ◦ Keyword query failures
      ◦ Number of messages used to attack one node
      ◦ Bandwidth usage

Self reflection attack
  • Fill node A's routing table with A itself
  [Figure: A's routing table with entries (C, IP_C) … (G, IP_G); labels: "Hello, X, IP_A", "Attack"]
  • ≈ 100% of queries failed after the attack
  • Nodes can recover slowly
  • Second round of attack

Mitigations
  • Identity authentication
    (values listed in the order: Secure, Persistent ID, Incrementally deployable)
      ◦ Verify the liveness of old IP: No, Yes
      ◦ Drop Hello with new IP: Yes, No, Yes
      ◦ ID=hash(IP): Yes, No, No
      ◦ ID=hash(Public Key): Yes, No
  • Routing correctness
      ◦ Independent parallel routes
          ▪ Incrementally deployable

    Backpointers   Current method   Independent parallel routes
    40%            98% fail         45% fail
    10%            59.5% fail       1.7% fail
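
To make the ID=hash(IP) row concrete, here is a minimal sketch of the check a receiver could apply to an incoming Hello. Kad node IDs are 128 bits; everything else (SHA-256 truncated to 128 bits, the function names, the example addresses from the documentation range 192.0.2.0/24) is an assumption for illustration and not the scheme evaluated in the paper.

```python
import hashlib

ID_BITS = 128  # Kad node IDs are 128 bits


def id_from_ip(ip: str) -> int:
    """Derive a node ID from an IP address (hash choice is an assumption)."""
    digest = hashlib.sha256(ip.encode("ascii")).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def accept_hello(claimed_id: int, sender_ip: str) -> bool:
    """Accept a Hello only if the claimed ID is bound to the sender's IP."""
    return claimed_id == id_from_ip(sender_ip)


ip_b = "192.0.2.10"   # honest node B
ip_m = "192.0.2.66"   # attacker M
id_b = id_from_ip(ip_b)

honest = accept_hello(id_b, ip_b)     # B announces its own ID
hijack = accept_hello(id_b, ip_m)     # M sends "Hello, B, IP_M"
```

The same sketch shows why this method does not give a persistent ID: when a node's IP address changes, `id_from_ip` yields a different ID.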

Thank you. Any questions?