On the Anonymity of Anonymity Systems Andrei Serjantov

On the Anonymity of Anonymity Systems Andrei Serjantov schnur@gmail.com (anonymous)

Outline
• Anonymity informally
  – Anonymity Properties
• Anonymity of Existing Implementations
  – Analysis
• Probability, Entropy
• Attacks
  – Low Latency
  – Intersection
• Conclusion

What is Anonymity?
Actually, we assume humans are tied to computers, and we anonymize the computers.
Anonymity does not hide the presence of the individuals/computers, just their identity.

Anonymity System
This guy does not even know he is on the internet(!)

Anonymity Properties I: Receiver Untraceability
(Diagram: sender A, receiver B)
• Senders are observable, i.e. the attacker knows that A sent a message to someone
• Receivers are not observable, i.e. the attacker does not know if B received a message

Anonymity Properties II: Sender Untraceability
(Diagram labels: B, A)
Senders unobservable…

Anonymity Properties III: Unlinkability
(Diagram labels: A, B)
Senders and receivers are observable, but it is not clear who is talking to whom

Anonymous from Whom? (threat model)
• The observer:
  – Can compromise (almost) everything but two users of the system
  – Observes and modifies all network traffic
  – Observes all network traffic
    • Global Passive Adversary
  – Observes some network traffic
  – Is the service the user is accessing

Properties
• A mix cascade guarantees that a global active attacker cannot distinguish two honest users who send one message each between time t and t’.
  – e.g. mixing votes
• DC-net
  – (both sender and receiver anonymity)
• Can be expressed formally

Anonymity of Existing Implementations
Mixes
Mix Systems

Timed Mix

Mix System
(Diagram: Sender → mix A → mix B → Receiver R. The message "M, 0101011" travels in nested layers addressed to A, B and R; each hop removes one layer.)

Doing Things Anonymously
• Can provide guarantees for those who wish to send one message < 32 K, and suffer the consequences of it not reaching the receiver
• Real life is not like that
  – Anonymous email (Mixmaster, Mixminion)
    • Send and receive anonymous emails
  – Web Browsing (JAP, TOR, Tarzan, Morphmix)
    • Wide file size distribution
    • Low latency

Anonymity Analysis of Existing Systems
• Define a system, and an adversary
• Take inputs into the system
  – e.g. web request message stream
  – Email interaction
• Compute observation
Hence figure out how vulnerable the anonymity of a certain activity is to a particular adversary.

Inputs, Model, Observation
• System: mixes M1 and M2 (transition semantics model of the mixes)
• Inputs: Sender 1, Sender 2 and Sender 3, sending to receivers R1, R2 and R3
• Attacker: Global Passive Adversary
• Observation: (diagram of the traffic between Sender 1 to 3, M1, M2 and R1 to R3 as seen by the attacker)

Mix Network
(Diagram labels: A, B, Q, C, D, R)
Traditionally, the anonymity set is {A, B, C, D}

Timed Mix
(Diagram labels: A, B, C, D)
Anonymity set: {A, B, C, D}

Mix Network
(Diagram labels: A, B, Q, C, D, R)
Traditionally, the anonymity set is {A, B, C, D}
The message arriving at R is much more likely to be from D than from A

Pool Mix
• M messages stay in the mix at each round
• Messages to be sent are picked from both the N and the M
• A message might stay in the mix for a very long time (but the probability of this happening is very small)
(Diagram labels: M, N+M, N, N)
• The anonymity set of a message leaving at round i includes the senders who sent messages processed during previous rounds

Adding Probabilities
• Let us attach to each event the probability of that event having occurred
• Call this the Anonymity Probability Distribution
• So {A, B, C, D} could become:
  – {(A, ¼), (B, ¼), (C, ¼), (D, ¼)}
  – Or, {(A, 0.5), (B, 0.1), (C, 0.1), (D, 0.3)}
• The probability distribution you come up with will depend on your observation (+ knowledge, computational power…)

Entropy
• Ok, what can we do with the probability distribution afterwards?
• From information theory, entropy is the information content of a probability distribution
• Can use this for:
  – Measuring anonymity
  – Expressing new attacks (ones which do not modify the set, but modify the distribution)
  – Comparing effectiveness of attacks
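As a small illustration of the measurement step, the sketch below computes the Shannon entropy H = -Σ p·log2(p) of the two example distributions from the "Adding Probabilities" slide. The function name `entropy_bits` and the "effective size" 2^H reading are illustrative choices, not taken from the talk.

```python
import math

def entropy_bits(dist):
    """Shannon entropy, in bits, of a mapping sender -> probability."""
    total = sum(dist.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Terms with p == 0 contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# The two example distributions from the slides.
uniform = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
skewed = {"A": 0.5, "B": 0.1, "C": 0.1, "D": 0.3}

h_uniform = entropy_bits(uniform)  # 2 bits: four equally likely senders
h_skewed = entropy_bits(skewed)    # about 1.69 bits: same set, less anonymity

# 2**H can be read as the size of a uniform set giving the same entropy.
effective_size = 2 ** h_skewed
print(h_uniform, h_skewed, effective_size)
```

Both distributions have the same anonymity set, yet the skewed one has lower entropy, which is the kind of difference a set-based measure cannot express.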

Pool Mix Revisited
• Could not previously compare a pool mix with other mixes
• Now we can!
• Compute the entropy of the geometric distribution
• Pool mix with 100 inputs and 10 “feedbacks” is equivalent to a standard mix with 140 inputs(!!!)
• But, average delay of a message going through a pool mix is greater
• In the above example, 9% chance “of staying for another round”
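The 140-input figure can be checked with a short calculation. The sketch below rests on assumptions not spelled out on the slide: exactly N = 100 new messages arrive every round, the pool holds M = 10, the mix has been running long enough to be in steady state, and every message in the mix is equally likely to be sent. Under those assumptions a particular input that arrived `age` rounds ago is the observed output with probability (1/(N+M))·(M/(N+M))^age.

```python
import math

def pool_mix_entropy_bits(n_new, pool, rounds=200):
    """Entropy (bits) of the sender distribution for one output message.

    Assumes n_new arrivals per round and a long-running mix; `rounds`
    truncates the geometric tail, which is negligible by then.
    """
    batch = n_new + pool   # messages inside the mix when it fires
    stay = pool / batch    # chance a message stays for another round
    entropy = 0.0
    for age in range(rounds):
        p = (1.0 / batch) * stay ** age  # one particular input of this age
        if p == 0.0:
            break
        entropy -= n_new * p * math.log2(p)  # n_new inputs share this p
    return entropy

stay_probability = 10 / (100 + 10)       # about 0.09, the 9% on the slide
entropy = pool_mix_entropy_bits(100, 10)
equivalent_inputs = 2 ** entropy         # size of a uniform set with equal entropy
print(stay_probability, entropy, equivalent_inputs)
```

With these assumptions the entropy has the closed form log2(N+M) + (M/N)·log2((N+M)/M), which for N = 100 and M = 10 corresponds to a uniform set of roughly 140 senders.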

Mix Networks
• Can also compute the anonymity probability distribution in mix networks
• Model and details in [Ser 04]
(Diagram labels: A, B, Q, C, D, R)
{(A, 0.125), (B, 0.125), (C, 0.25), (D, 0.5)}
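The distribution above can be reproduced by walking backwards from the observed output and splitting probability evenly over the inputs of each mix. The topology used here is a guess, since the diagram is not available: a chain of three two-input mixes in which A and B feed the first mix, its output and C feed the second, and that output and D feed the third, which delivers to R. The names `network`, `mix1` to `mix3` and `sender_distribution` are hypothetical, and the full model in [Ser 04] may differ.

```python
from fractions import Fraction

# Assumed topology: each mix maps to its inputs (senders or other mixes).
network = {
    "mix1": ["A", "B"],
    "mix2": ["mix1", "C"],
    "mix3": ["mix2", "D"],
}

def sender_distribution(network, mix, weight=Fraction(1)):
    """Probability that each sender originated a message leaving `mix`.

    Assumes every input of a mix is equally likely to be the source.
    """
    dist = {}
    inputs = network[mix]
    share = weight / len(inputs)
    for source in inputs:
        if source in network:
            # The input is itself a mix: recurse with the reduced weight.
            for sender, p in sender_distribution(network, source, share).items():
                dist[sender] = dist.get(sender, Fraction(0)) + p
        else:
            dist[source] = dist.get(source, Fraction(0)) + share
    return dist

distribution = sender_distribution(network, "mix3")
print({sender: float(p) for sender, p in sorted(distribution.items())})
```

Under this assumed topology D is four times as likely as A to be the sender, matching the earlier remark that the message arriving at R is much more likely to be from D.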

Impact of Low Latency and Repeated Communication
– Packet Counting
– Intersection

Connection-based Anonymity Systems
• A number of nodes
  – Nodes do not mix, but do onion encryption
• Packets are forwarded along links
• All packets of a connection are forwarded via the same sequence of nodes
(Diagrams: “Classical” Network; P2P anonymity system)

The Packet Counting Attack I
• Connection-based Anonymity Systems split the data up into many fairly small packets (< 1 K)
• All packets of an anonymous connection travel down the same path
• Thus, counting the packets may reveal which connections go where
• Merely coarse-grained packet counting required

Packet Counting II
• Observe the mix for time t and count packets on each link
• Correlate incoming and outgoing links
  – 1075 and 1076
(Diagram: a node whose links carry the packet counts 3056, 2497, 2748, 2850, 1804, 1353, 1076, 1075)
• Ok if:
  – d (mix delay) << t
  – t is much smaller than interval between new connections starting
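The correlation step can be sketched as a nearest-count match between incoming and outgoing links. The slide does not say which counts belong to which link, so the assignment below is hypothetical, as are the link names, the `correlate_links` helper and the tolerance of 5 packets.

```python
def correlate_links(incoming, outgoing, tolerance=5):
    """Pair each incoming link with the outgoing link of closest packet count.

    A pair is reported only if the counts differ by at most `tolerance`.
    """
    matches = {}
    for in_link, in_count in incoming.items():
        best = min(outgoing, key=lambda out: abs(outgoing[out] - in_count))
        if abs(outgoing[best] - in_count) <= tolerance:
            matches[in_link] = best
    return matches

# Hypothetical assignment of the slide's counts to links.
incoming = {"in1": 3056, "in2": 2497, "in3": 1804, "in4": 1075}
outgoing = {"out1": 2748, "out2": 2850, "out3": 1353, "out4": 1076}

matches = correlate_links(incoming, outgoing)
print(matches)
```

Only the link carrying 1075 packets finds a partner (the one carrying 1076). The other links carry several connections at once, so their totals do not line up, which is the situation the next slides rely on.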

Packet Counting – Key Observation
• Packet counting works if the whole connection is lone, i.e. if it is the only connection on all the links (from the client to the server) it passes through
This case may be attackable; we consider it not to be

Packet Counting – Results
• Hence, we need 2 or more connections on as many links as possible
• In our paper (ESORICS 2003) we define this formally
• Then simulate, showing that
  – E.g. 100 nodes, 100 connections via 2-4 nodes: 92% of connections are lone (P2P scenario)
  – E.g. 20 nodes, 200 connections via 2-4 nodes: 2.5% of connections lone (classic network)
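A rough way to get a feel for these numbers is a small Monte Carlo sketch. It is not the simulation from the ESORICS 2003 paper: it assumes a fully connected network, paths of 2 to 4 distinct nodes chosen uniformly at random, directed node-to-node links only, and it calls a connection lone when no other connection shares any of its links. Because the paper's formal definition and topology may differ, the output should not be expected to match the percentages on the slide.

```python
import random

def lone_fraction(num_nodes, num_connections, min_nodes=2, max_nodes=4, seed=0):
    """Fraction of connections that are alone on every link of their path."""
    rng = random.Random(seed)
    paths = []
    link_use = {}
    for _ in range(num_connections):
        length = rng.randint(min_nodes, max_nodes)
        nodes = rng.sample(range(num_nodes), length)  # distinct nodes
        links = list(zip(nodes, nodes[1:]))           # directed node-to-node links
        paths.append(links)
        for link in links:
            link_use[link] = link_use.get(link, 0) + 1
    lone = sum(1 for links in paths if all(link_use[l] == 1 for l in links))
    return lone / num_connections

p2p_like = lone_fraction(num_nodes=100, num_connections=100)
classic_like = lone_fraction(num_nodes=20, num_connections=200)
print(p2p_like, classic_like)
```

Even in this simplified setting the qualitative picture holds: many nodes with few connections leave most connections lone, while few nodes with many connections force links to be shared.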

Repeated Communication
(Diagram labels: Alice, Steves, B, Threshold B+1, To M, To N; second panel: As seen by the attacker, M, N)

The Model

Simplification introduced by the model Alice

The Results (1000 rounds, B=10)
(Plot: P(Estimate) against receivers r; estimate of the probability of Alice sending to r)
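The model behind these plots is not recoverable from the slide text, so the following is only a sketch of one simple estimator of this kind, in the spirit of statistical disclosure attacks. It assumes that in every round Alice sends one message to one of her usual receivers, that B = 10 other senders each pick a receiver uniformly at random, and that the attacker sees only which receivers got messages in each round. The function name, the 20 receivers and Alice's two contacts are hypothetical.

```python
import random

def estimate_alice_receivers(num_receivers=20, alice_contacts=(0, 1),
                             background=10, rounds=1000, seed=1):
    """Estimate P(Alice sends to r) from receiver counts over many rounds."""
    rng = random.Random(seed)
    batch = background + 1  # threshold B+1: Alice plus B other senders
    counts = [0] * num_receivers
    for _ in range(rounds):
        counts[rng.choice(alice_contacts)] += 1      # Alice's message
        for _ in range(background):                  # everyone else
            counts[rng.randrange(num_receivers)] += 1
    observed = [c / (rounds * batch) for c in counts]
    uniform = 1.0 / num_receivers
    # Subtract the assumed-uniform background share from the observed mix.
    return [batch * o - background * uniform for o in observed]

estimates = estimate_alice_receivers()
likely = sorted(range(len(estimates)), key=lambda r: estimates[r], reverse=True)[:2]
print(likely, [round(estimates[r], 2) for r in likely])
```

After 1000 rounds Alice's two contacts stand out clearly from the background, even though no single round reveals whom she is talking to.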

The Results

Conclusions
• Anonymity is a security property, not just privacy
• Analysis of anonymity properties is important
  – Has been a neglected area
  – Uses tools from other fields (graph theory, probability)
• Plenty of applications
  – Identity management
  – Electronic voting
  – Anonymous email (whistle blowing)