On the Anonymity of Anonymity Systems Andrei Serjantov




































- Slides: 36
On the Anonymity of Anonymity Systems Andrei Serjantov schnur@gmail.com (anonymous)
Outline
• Anonymity informally
  – Anonymity Properties
• Anonymity of Existing Implementations
  – Analysis
• Probability, Entropy
• Attacks
  – Low Latency
  – Intersection
• Conclusion
What is Anonymity?
Actually, we assume humans are tied to computers and anonymize those.
Anonymity does not hide the presence of the individuals/computers, just their identity.
Anonymity System This guy does not even know he is on the internet(!)
Anonymity Properties I: Receiver Untraceability
[Figure: sender A, receiver B]
Senders are observable, i.e. the attacker knows that A sent a message to someone.
Receivers are not observable, i.e. the attacker does not know whether B received a message.
Anonymity Properties II: Sender Untraceability B A Senders unobservable….
Anonymity Properties III: Unlinkability
[Figure: A, B]
Senders and Receivers are observable, but it is not clear who is talking to whom.
Anonymous from Whom? (threat model)
• The observer:
  – Can compromise (almost) everything but two users of the system
  – Observes and modifies all network traffic
  – Observes all network traffic (Global Passive Adversary)
  – Observes some network traffic
  – Is the service the user is accessing
Properties
• A mix cascade guarantees that a global active attacker cannot distinguish two honest users who send one message each between time t and t’.
  – e.g. mixing votes
• DC-net
  – (both sender and receiver anonymity)
• Can be expressed formally
Anonymity of Existing Implementations Mixes Mix Systems
Timed Mix
Mix System
[Figure: a Sender's message (M, 0101011), addressed to R, is forwarded through mixes A and B to the Receiver. Legend: R - Receiver, A - Mix, B - Mix]
Doing Things Anonymously
• Can provide guarantees for those who wish to send one message < 32 K, and suffer the consequences of it not reaching the receiver
• Real life is not like that
  – Anonymous email (Mixmaster, Mixminion)
    • Send and receive anonymous emails
  – Web Browsing (JAP, TOR, Tarzan, Morphmix)
    • Wide file size distribution
    • Low latency
Anonymity Analysis of Existing Systems
• Define a system, and an adversary
• Take inputs into the system
  – e.g. web request message stream
  – Email interaction
• Compute observation
Hence figure out how vulnerable the anonymity of a certain activity is to a particular adversary.
Inputs, Model, Observation
[Figure with four panels. System: mixes M1 and M2 (transition semantics model of the mixes). Inputs: Sender 1, Sender 2, Sender 3 sending to R1, R2, R3. Attacker: Global Passive Adversary. Observation: the traffic between Senders 1-3, M1, M2 and R1-R3.]
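Mix Network
[Figure: mix network with nodes labelled A, B, Q, C, D, R]
Traditionally {A, B, C, D}
Timed Mix
[Figure: A, B, C, D]
{A, B, C, D}
Mix Network
[Figure: mix network with nodes labelled A, B, Q, C, D, R]
Traditionally {A, B, C, D}
The message arriving at R is much more likely to be from D than from A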
Pool Mix
• M messages stay in the mix at each round
• Messages to be sent are picked from both the N and the M
• A message might stay in the mix for a very long time (but the probability of this happening is very small)
[Figure: pool mix with labels M, N+M, N, N]
• The anonymity set of a message leaving at round i includes the senders who sent messages processed during previous rounds
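One round of the pool mix described above can be sketched as follows. This is a minimal illustration, assuming messages are opaque strings and that the mix picks uniformly at random; the function name `pool_mix_round` and the message labels are hypothetical.

```python
import random

def pool_mix_round(pool, incoming, rng):
    """Run one round of a pool mix.

    pool     -- the M messages retained from earlier rounds
    incoming -- the N messages that arrived during this round
    Returns (sent, new_pool): N messages to forward and M messages kept.
    """
    candidates = pool + incoming      # N + M messages are in the mix
    rng.shuffle(candidates)           # uniform random order hides arrival order
    n_out = len(incoming)             # forward as many messages as arrived
    return candidates[:n_out], candidates[n_out:]

rng = random.Random(0)                # fixed seed so the sketch is repeatable
pool = ["p1", "p2"]                   # M = 2 (hypothetical labels)
sent, pool = pool_mix_round(pool, ["a", "b", "c", "d"], rng)   # N = 4
```

Because the outgoing messages are drawn from both the old pool and the new arrivals, a message sent now may have been submitted several rounds ago.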
Adding Probabilities
• Let us attach to each event the probability of that event having occurred
• Call this the Anonymity Probability Distribution
• So {A, B, C, D} could become:
  – {(A, ¼), (B, ¼), (C, ¼), (D, ¼)}
  – Or, {(A, 0.5), (B, 0.1), (C, 0.1), (D, 0.3)}
• The probability distribution you come up with will depend on your observation (+ knowledge, computational power…)
Entropy
• Ok, what can we do with the probability distribution afterwards?
• From information theory: entropy is the information content of a probability distribution
• Can use this for:
  – Measuring anonymity
  – Expressing new attacks (ones which do not modify the set, but modify the distribution)
  – Comparing effectiveness of attacks
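As a concrete illustration, the Shannon entropy of the two example distributions over {A, B, C, D} can be computed directly. This is a small sketch; the helper name `entropy_bits` is hypothetical.

```python
from math import log2

def entropy_bits(distribution):
    """Shannon entropy H = -sum(p * log2(p)) of a {sender: probability} map."""
    return -sum(p * log2(p) for p in distribution.values() if p > 0)

uniform = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
skewed = {"A": 0.5, "B": 0.1, "C": 0.1, "D": 0.3}

h_uniform = entropy_bits(uniform)   # 2 bits: four equally likely senders
h_skewed = entropy_bits(skewed)     # about 1.69 bits: less uncertainty for the attacker
```

Both distributions have the same anonymity set, yet the skewed one has lower entropy, which is the kind of difference a set-based measure cannot express.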
Pool Mix Revisited
• Could not previously compare a pool mix with other mixes
• Now we can!
• Compute the entropy of the geometric distribution
• Pool mix with 100 inputs and 10 “feedbacks” is equivalent to a standard mix with 140 inputs(!!!)
• But, average delay of a message going through a pool mix is greater
• In the above example, 9% chance “of staying for another round”
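The 140-input and 9% figures can be checked with a short calculation. The sketch below rests on simplifying assumptions that are not stated on the slide: the mix is in steady state, every round receives exactly N messages from N distinct senders, and the history of rounds is unbounded. The function name `pool_mix_effective_size` is hypothetical, and the pool size must be greater than zero.

```python
from math import log2

def pool_mix_effective_size(n_inputs, n_pool):
    """Return (p_stay, entropy in bits, equivalent standard-mix size)."""
    p_leave = n_inputs / (n_inputs + n_pool)   # chance a message leaves in a given round
    p_stay = 1 - p_leave                       # chance it stays for another round
    # Entropy of the geometric distribution over "how many rounds ago was it sent"
    h_rounds = (-p_stay * log2(p_stay) - p_leave * log2(p_leave)) / p_leave
    # Plus log2(N) bits for which of the N senders of that round it was
    h_total = log2(n_inputs) + h_rounds
    return p_stay, h_total, 2 ** h_total

p_stay, h_total, effective = pool_mix_effective_size(100, 10)
# p_stay is about 0.09 and effective is about 140, in line with the slide
```

Raising 2 to the entropy gives the size of the uniform anonymity set with the same entropy, which is what makes the comparison with a standard mix possible.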
Mix Networks
• Can also compute the anonymity probability distribution in mix networks
• Model and details in [Ser 04]
[Figure: mix network with nodes labelled A, B, Q, C, D, R]
{(A, 0.125), (B, 0.125), (C, 0.25), (D, 0.5)}
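One way such a distribution can arise is sketched below. The topology is an assumption chosen to reproduce the listed numbers, not a reconstruction of the figure: A and B enter mix M1, its output and C enter M2, and its output and D enter M3, which delivers to R. Each mix is assumed to output any of its inputs with equal probability; the names `MIX_INPUTS` and `sender_distribution` are hypothetical.

```python
from fractions import Fraction
from math import log2

# Hypothetical topology: each mix maps to the list of things feeding it.
MIX_INPUTS = {
    "M1": ["A", "B"],
    "M2": ["M1", "C"],
    "M3": ["M2", "D"],
}

def sender_distribution(node, mix_inputs):
    """Probability that a message leaving `node` originated at each sender."""
    if node not in mix_inputs:                 # a sender, not a mix
        return {node: Fraction(1)}
    inputs = mix_inputs[node]
    dist = {}
    for src in inputs:                         # each input link is equally likely
        for sender, p in sender_distribution(src, mix_inputs).items():
            dist[sender] = dist.get(sender, Fraction(0)) + p / len(inputs)
    return dist

dist = sender_distribution("M3", MIX_INPUTS)   # what arrives at R
entropy = -sum(float(p) * log2(float(p)) for p in dist.values())
```

Under these assumptions the entropy is 1.75 bits, compared with 2 bits for the plain set {A, B, C, D}.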
Impact of Low Latency and Repeated Communication
– Packet Counting
– Intersection
Connection-based Anonymity Systems
• A number of nodes
  – Nodes do not mix, but do onion encryption
• Packets are forwarded along links
• All packets of a connection are forwarded via the same sequence of nodes
[Figure: “Classical” Network and P2P anonymity system]
The Packet Counting Attack I
• Connection-based Anonymity Systems split the data up into many fairly small packets (<1 K)
• All packets of an anonymous connection travel down the same path
• Thus, counting the packets may reveal which connections go where
• Only coarse-grained packet counting is required
Packet Counting II
• Observe the mix for time t and count packets on each link
• Correlate incoming and outgoing links
  – 1075 and 1076
[Figure: per-link packet counts 3056, 2497, 2748, 2850, 1804, 1353, 1076, 1075]
• Ok if:
  – d (mix delay) << t
  – t is much smaller than interval between new connections starting
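The correlation step can be sketched as a comparison of per-link counts. The counts below are the ones on the slide, but which link each count belongs to is an assumption, as are the link names, the tolerance and the function name `correlate_links`.

```python
def correlate_links(incoming, outgoing, tolerance):
    """Pair incoming and outgoing links whose packet counts differ by at most `tolerance`."""
    matches = []
    for in_link, in_count in incoming.items():
        for out_link, out_count in outgoing.items():
            if abs(in_count - out_count) <= tolerance:
                matches.append((in_link, out_link))
    return matches

# Counts are from the slide; the assignment of counts to links is hypothetical.
incoming = {"in1": 3056, "in2": 2748, "in3": 1804, "in4": 1075}
outgoing = {"out1": 2497, "out2": 2850, "out3": 1353, "out4": 1076}

matches = correlate_links(incoming, outgoing, tolerance=5)
```

With this assignment the only pair within the tolerance is the 1075/1076 one, which is the match the slide highlights.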
Packet Counting – Key Observation
• Packet counting works if the whole connection is lone
  – i.e. if it is the only connection on all the links (from the client to the server) it passes through
This case may be attackable; we consider it not to be
Packet Counting – Results
• Hence, we need 2 or more connections on as many links as possible
• In our paper (ESORICS 2003) we define this formally
• Then simulate, showing that
  – E.g. 100 nodes, 100 connections via 2-4 nodes: 92% of connections are lone (P2P scenario)
  – E.g. 20 nodes, 200 connections via 2-4 nodes: 2.5% of connections are lone (classic network)
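A rough simulation in the same spirit is sketched below. It is not the paper's model and is not expected to reproduce the 92% and 2.5% figures: it assumes each connection picks 2-4 distinct nodes uniformly at random, counts only the directed links between those nodes (client and server links are ignored), and calls a connection lone when no other connection uses any of its links. The function name `fraction_lone` is hypothetical.

```python
import random
from collections import Counter

def fraction_lone(n_nodes, n_connections, rng, min_hops=2, max_hops=4):
    """Estimate the fraction of connections that share no link with any other."""
    paths = []
    for _ in range(n_connections):
        length = rng.randint(min_hops, max_hops)      # number of nodes on the path
        nodes = rng.sample(range(n_nodes), length)    # distinct nodes in random order
        paths.append(list(zip(nodes, nodes[1:])))     # directed node-to-node links
    load = Counter(link for links in paths for link in links)
    lone = sum(all(load[link] == 1 for link in links) for links in paths)
    return lone / n_connections

p2p = fraction_lone(100, 100, random.Random(0))       # many nodes, few connections
classic = fraction_lone(20, 200, random.Random(0))    # few nodes, many connections
```

Even in this simplified model the qualitative trend holds: spreading few connections over many nodes leaves most of them lone, while concentrating many connections on few nodes does not.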
Repeated Communication
[Figure: Alice and B “Steves” send into a threshold mix (threshold B+1); outputs go to M and to N. A second view shows M and N as seen by the attacker.]
The Model
Simplification introduced by the model Alice
The Results (1000 rounds, B=10)
[Plot: axes labelled “Receivers, r” and “P(Estimate)”; caption: Estimate of probability of Alice sending to r]
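An estimate of this kind can be illustrated with a simple averaging attack. The sketch below is a generic statistical-disclosure-style estimator, not necessarily the model used in the talk. It assumes 20 receivers, that Alice sends one message per round to one of two fixed receivers, that the B other messages go to uniformly random receivers, and that the attacker knows this background distribution. The function name `estimate_alice` and the receiver numbering are hypothetical.

```python
import random

def estimate_alice(n_rounds, n_background, n_receivers, alice_receivers, rng):
    """Estimate P(Alice sends to r) for every receiver r from observed batches."""
    totals = [0] * n_receivers
    for _ in range(n_rounds):
        totals[rng.choice(alice_receivers)] += 1       # Alice's one message
        for _ in range(n_background):                  # the B other messages
            totals[rng.randrange(n_receivers)] += 1
    expected_background = n_background / n_receivers   # known uniform background
    # Average messages per round to r, minus the share explained by the others
    return [t / n_rounds - expected_background for t in totals]

rng = random.Random(0)
estimates = estimate_alice(1000, 10, 20, alice_receivers=[0, 1], rng=rng)
likely = sorted(range(20), key=lambda r: estimates[r], reverse=True)[:2]
```

Each single round hides Alice's message among B others, but the averages over many rounds separate her receivers from the background.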
The Results
The Results
Conclusions
• Anonymity is a security property – not just privacy
• Analysis of anonymity properties is important
  – Has been a neglected area
  – Uses tools from other fields (graph theory, probability)
• Plenty of applications
  – Identity management
  – Electronic voting
  – Anonymous email (whistle blowing)