Random walks and analysis of algorithms in cryptography
Ilya Mironov, Stanford University

Talk overview
- Cryptanalysis: RC4 stream cipher
  - card shuffling
  - brute force attack
- Broadcast encryption
  - analysis
  - optimization
- Other work

RC4 stream cipher
- RC stands for "Ron's Code"; designed in 1987 by Ron Rivest.
- Several design goals:
  - speed
  - support of 8-bit architecture
  - simplicity (to circumvent export regulations)

Abridged history of [alleged] RC4™
- 1994: leaked to the cypherpunks mailing list
- 1995: first weakness (USENET post)
- 1996: appeared in "Applied Cryptography" by B. Schneier as "alleged RC4"
- 1997: first published analysis
- MS theses: 3; Ph.D. thesis: 1

Usage
- SSL/TLS
- Windows, Lotus Notes, Oracle, etc.
- Cellular Digital Packet Data
- OpenBSD pseudo-random number generator

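Encryption
- The key initializes the state; the state produces a keystream (0001111010101…).
- plain text XOR keystream = cipher text

Decryption
- The same key produces the same state and the same keystream (0001111010101…).
- cipher text XOR keystream = plain text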
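
The two diagrams describe one operation applied twice. A minimal Python sketch of that symmetry, assuming the keystream is already available as bytes; the function name `xor_with_keystream` and the sample values are illustrative and do not come from the talk:

```python
def xor_with_keystream(data: bytes, keystream: bytes) -> bytes:
    """XOR each data byte with the corresponding keystream byte."""
    if len(keystream) < len(data):
        raise ValueError("keystream is shorter than the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


# Illustrative values only; a real keystream comes from the cipher state.
keystream = bytes([0x1E, 0xA5, 0x3C, 0x77, 0x09])
plaintext = b"hello"
ciphertext = xor_with_keystream(plaintext, keystream)
recovered = xor_with_keystream(ciphertext, keystream)  # same call decrypts
```

Because XOR is its own inverse, the receiver needs nothing beyond the key that regenerates the keystream.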

Security requirement
- Indistinguishability from a perfect source of randomness: given part of the output stream, it is impossible to distinguish it from a random string.

Second byte [MS01]
- The second byte of RC4 output is 0 with twice the expected probability (about 2/256 instead of 1/256).
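
The second-byte bias can be observed empirically. The following Python sketch is only an illustration under stated assumptions: random 16-byte keys, a fixed seed, 20,000 trials, and helper names (`rc4_keystream`, `second_byte_zero_count`) that are not from the talk.

```python
import random


def rc4_keystream(key: bytes, n: int) -> bytes:
    """Return the first n bytes of RC4 output for the given key."""
    S = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):  # output generation
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)


def second_byte_zero_count(trials: int, seed: int = 1) -> int:
    """Count how often the second output byte is 0 over random 16-byte keys."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(trials):
        key = bytes(rng.randrange(256) for _ in range(16))
        if rc4_keystream(key, 2)[1] == 0:
            zeros += 1
    return zeros


trials = 20000
zeros = second_byte_zero_count(trials)
print(zeros, "zeros observed; a uniform byte would give about", trials / 256)
```

With a truly random second byte the count would be near trials/256; the biased output should land near twice that.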

Related key attack [FMS01]
- Wired Equivalent Privacy (WEP) protocol, part of the 802.11b standard: using keys with known prefixes is BAD.
- Every packet is encrypted under a public IV followed by the same secret key:
    IV1, key -> 001010…
    IV2, key -> 1010110001…
    IV3, key -> 010111…
    IV4, key -> 101010…
- From the IVs and the outputs the attacker recovers the key.

Recommendation
- Discard the first 256 bytes of RC4 output [RSA, MS].
- Is this enough?

RC4 internal state
- Permutation S on 256 bytes, for example: 21 123 134 24 91 218 13 250 138 53 …
- Two indices i, j
- log2(256! × 256^2) ≈ 1700 bits
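
The state-size figure can be checked directly: 256! permutations times 256 × 256 values of the index pair (i, j). A small Python check; the function name `state_bits` is illustrative:

```python
import math


def state_bits(n: int = 256) -> float:
    """log2 of the number of states: n! permutations times n^2 index pairs."""
    log2_factorial = math.lgamma(n + 1) / math.log(2)  # lgamma(n + 1) = ln(n!)
    return log2_factorial + 2 * math.log2(n)


bits = state_bits(256)
print(round(bits, 1))
```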

Key scheduling algorithm (all arithmetic is mod 256; key[i] denotes the key repeated cyclically to 256 bytes)
    for i := 0 to 255
        S[i] := i
    j := 0
    for i := 0 to 255
        j := j + S[i] + key[i]
        swap(S[i], S[j])

Pseudo-random number generator
    i := 0
    j := 0
    repeat
        i := i + 1
        j := j + S[i]
        swap(S[i], S[j])
        output(S[S[i] + S[j]])
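
The two routines translate almost line for line into Python. This is a sketch for study, not production code; the function names are illustrative, and the key is indexed as key[i mod len(key)], which is how short keys are conventionally handled in RC4.

```python
def key_schedule(key: bytes) -> list:
    """Key scheduling: start from the identity permutation and mix in the key."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S


def generate(S: list, n: int) -> bytes:
    """Pseudo-random number generator: return n output bytes."""
    S = S[:]  # work on a copy so the caller's state is untouched
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the keystream."""
    stream = generate(key_schedule(key), len(data))
    return bytes(d ^ k for d, k in zip(data, stream))


ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
```

The pair ("Key", "Plaintext") is a widely circulated RC4 test vector, which makes it a convenient sanity check for any reimplementation.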

Both RC4's routines
    key scheduling:
        for i := 0 to 255
            S[i] := i
        j := 0
        for i := 0 to 255
            j := j + S[i] + key[i]
            swap(S[i], S[j])
    pseudo-random number generator:
        i, j := 0
        repeat
            i := i + 1
            j := j + S[i]
            swap(S[i], S[j])
            output(S[S[i] + S[j]])

Both RC4's routines, with the update of j idealized as a random choice
    key scheduling:
        for i := 0 to 255
            S[i] := i
        for i := 0 to 255
            j := random(256)        (replaces j := j + S[i] + key[i])
            swap(S[i], S[j])
    pseudo-random number generator:
        i := 0
        repeat
            i := i + 1
            j := random(256)        (replaces j := j + S[i])
            swap(S[i], S[j])

Idealization of RC4 (n = 256 for RC4 itself)
    for i := 0 to n - 1
        S[i] := i
    i := 0
    repeat
        i := i + 1
        j := random(n)
        swap(S[i], S[j])


Exchange shuffle
- RC4 card shuffling: swap(S[i], S[j]) for i = 0, 1, …, n - 1, with j chosen at random among all n positions.
- When i = n - 1 the permutation is not random.

Perfect shuffling
- The textbook algorithm to shuffle cards: swap(S[i], S[j]) for i = 0, 1, …, n - 1, with j chosen at random among positions i, …, n - 1.
- When i = n - 1 the permutation is perfectly random.

Why is it not random?
- n! does not divide n^n.
- Sign of the permutation: the sign changes at each step with probability 1 - 1/n.
- Positions of individual cards are predictable.
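
For a tiny deck the difference between the two shuffles can be enumerated exhaustively. The Python sketch below does this for n = 3 (an illustrative choice); the helper names are not from the talk.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def permutation_sign(perm) -> int:
    """Sign of a permutation, computed by counting inversions."""
    n = len(perm)
    inversions = sum(perm[a] > perm[b] for a in range(n) for b in range(a + 1, n))
    return -1 if inversions % 2 else 1


def shuffle_outcomes(n: int, perfect: bool) -> Counter:
    """Count final permutations over every possible sequence of choices of j."""
    # Textbook shuffle: j in i..n-1. Exchange shuffle: j in 0..n-1.
    choices = [range(i, n) if perfect else range(n) for i in range(n)]
    counts = Counter()
    for js in product(*choices):
        S = list(range(n))
        for i, j in enumerate(js):
            S[i], S[j] = S[j], S[i]
        counts[tuple(S)] += 1
    return counts


n = 3
exchange = shuffle_outcomes(n, perfect=False)  # n^n = 27 equally likely runs
textbook = shuffle_outcomes(n, perfect=True)   # n! = 6 equally likely runs
mean_sign = Fraction(
    sum(permutation_sign(p) * c for p, c in exchange.items()), n ** n
)
print(dict(exchange), mean_sign)
```

Since 27 runs cannot be split evenly over 6 permutations, some permutations must occur more often than others, and the average sign comes out nonzero.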

First byte of RC4 output
- The first byte, S[S[1] + S[S[1]]], is biased.

Distinguisher
- Less than 2,000 to recognize a nonrandom output with 10% error

Mixing time
- The permutation becomes more and more random.
- [Plot: nonrandomness versus time]

Variation distance
- Between two distributions P and Q on S: $d(P, Q) = \frac{1}{2} \sum_{s \in S} |P(s) - Q(s)|$
- [Plot: variation distance versus time]
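
For finite distributions the definition is a one-line computation. A minimal Python sketch, assuming distributions are given as dictionaries from outcomes to probabilities (a representation chosen here for illustration):

```python
def variation_distance(P: dict, Q: dict) -> float:
    """Half the L1 distance between two distributions on a finite set."""
    support = set(P) | set(Q)
    return 0.5 * sum(abs(P.get(s, 0.0) - Q.get(s, 0.0)) for s in support)


fair = {"H": 0.5, "T": 0.5}
biased = {"H": 0.75, "T": 0.25}
d = variation_distance(fair, biased)
print(d)
```

The value ranges from 0 for identical distributions to 1 for distributions with disjoint support.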

The end of the beginning of RC4
- What is the sufficient number of swaps for the permutation to become random?
- Find t such that d(P_t, U) < …

Card shuffling
To shuffle 52 cards:
- 7 riffle shuffles
- ~100 random transpositions
- ~30,000 adjacent transpositions
- exchange (RC4) shuffles?

Lower bound
- Sign of the permutation: after t rounds the sign can be predicted with a bias of about e^(-2t).
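
A short calculation shows where e^(-2t) comes from. Assume the idealized model (j uniform and independent at every step) and take one round to mean n swaps; both are assumptions made for this sketch. Each swap flips the sign with probability 1 - 1/n, so the expected sign is multiplied by 1/n - (1 - 1/n) = -(1 - 2/n) per step.

```python
import math


def sign_bias(n: int, rounds: int) -> float:
    """Absolute value of the expected sign after rounds * n idealized swaps."""
    return (1 - 2 / n) ** (n * rounds)


for t in (1, 2, 4):
    print(t, sign_bias(256, t), math.exp(-2 * t))
```

The bias decays exponentially in the number of rounds, which is why the sign only yields a lower bound linear in n.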

Upper bound
Checking argument:
1. Initially all cards are unchecked.
2. Check S[i] if
   - either i = j
   - or S[j] is checked.
3. Keep doing this until all cards are checked.

Checking argument
- [Diagram: positions i and j]
- S[i] is indistinguishable from other checked cards.

Checking argument
- It takes O(n log n) steps to check all cards.
- This gives an upper bound.

Mixing time
- at least Ω(n)
- at most O(n log n)
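
The checking process is easy to simulate. The Python sketch below follows the rule literally (check the card at position i if i = j or the card at position j is already checked) on the idealized shuffle; n = 64, the seed, and the function name are illustrative assumptions, and the printout is only meant for comparison against n log n.

```python
import math
import random


def steps_to_check_all(n: int, rng: random.Random) -> int:
    """Run the idealized shuffle until every card has been checked."""
    S = list(range(n))
    checked = set()  # identities of checked cards; they move with the swaps
    i = 0
    steps = 0
    while len(checked) < n:
        i = (i + 1) % n
        j = rng.randrange(n)
        if i == j or S[j] in checked:
            checked.add(S[i])
        S[i], S[j] = S[j], S[i]
        steps += 1
    return steps


rng = random.Random(0)
n = 64
runs = [steps_to_check_all(n, rng) for _ in range(20)]
average = sum(runs) / len(runs)
print(average, "steps on average; n log n =", n * math.log(n))
```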

What if n = 256?
- Optimistically (go with the lower bound): mixes in 4 × 256 steps
- Conservatively (use the upper bound): mixes in 16 × 256 steps

New development
- E. Mossel, A. Sinclair, Y. Peres (Berkeley): the upper bound is tight, mixing time = Θ(n log n).
- Distinguisher: look at the cards from the left half.


Backtracking
    j := S[1]
    t := S[1] + S[j]
    output S[t]
    j := j + S[2]
    t := S[2] + S[j]
    output S[t]
    j := j + S[3]
    t := S[3] + S[j]
    output S[t]
Entries are guessed in the order they are needed: S[1], S[j], S[2], S[j], S[3], S[j], …
When a guess contradicts the observed output, the search backs up to the previous guess and tries another value.

Cost of backtracking
- Keep guessing until there is a critical mass of ≈ 100 entries.
- Each guess is ≈ 8 bits, which multiplies the running time by 2^8.
- Estimated running time ~ 2^800 (for comparison, there are 2^200 particles in the universe).

Improvement
    j := S[1]
    t := S[1] + S[j]
    output S[t]
    j := j + S[2]
    t := S[2] + S[j]
    output S[t]
    j := j + S[3]
    t := S[3] + S[j]
    output S[t]
Guessed entries: S[1], S[2], S[3], …

Running time of improved algorithm
- Much more intricate analysis of an unbalanced tree
- Estimated at less than 2^600

Why is it interesting?
- What about "short RC4" with a 64-byte permutation?
- Its internal state has size 300 bits.
- 64-byte RC4 is secure against the old attack, borderline under the new attack.


Broadcast encryption: one shared key
- The source and all receivers hold the same key k.
- Very little overhead
- One rogue user compromises the whole system

Broadcast encryption: one key per receiver
- Receivers hold individual keys k1, k2, k3, …, kn; the source knows all of them.
- Broadcast: E[k1, k], E[k2, k], …, E[kn, k], E[k, M]
- Simple user revocation
- Too many keys
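
The per-receiver scheme can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the hash-based XOR "cipher" is a toy stand-in for E (not secure, and not from the talk), and the names `broadcast` and `receive` are hypothetical.

```python
import hashlib
import os


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy XOR cipher with a SHA-256 derived pad (illustration only, not secure)."""
    pad = b""
    counter = 0
    while len(pad) < len(data):
        pad += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ p for d, p in zip(data, pad))


toy_decrypt = toy_encrypt  # XOR with the same pad inverts itself


def broadcast(user_keys: dict, revoked: set, message: bytes):
    """Header holds E[k_u, k] for each non-revoked user u; body is E[k, M]."""
    session_key = os.urandom(16)
    header = {u: toy_encrypt(ku, session_key)
              for u, ku in user_keys.items() if u not in revoked}
    return header, toy_encrypt(session_key, message)


def receive(user: int, user_key: bytes, header: dict, body: bytes):
    """Return the message, or None if the user has no header entry."""
    if user not in header:
        return None
    session_key = toy_decrypt(user_key, header[user])
    return toy_decrypt(session_key, body)


user_keys = {u: os.urandom(16) for u in range(8)}
header, body = broadcast(user_keys, revoked={2, 5}, message=b"pay-per-view frame")
```

Revoking a user is just leaving out one header entry, but the header grows linearly with the number of receivers, which is the cost the later slides try to reduce.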

Subset-cover framework (Naor-Lotspiech'01)
- Receivers are grouped into overlapping subsets S1, S2, …, S8; each subset Si has its own key ki.
- A receiver knows the keys of the subsets it belongs to. Example: receiver u lies in S3, S4 and S5, so u knows keys k3, k4, k5.

Key distribution
- Based on some formal characteristic, e.g., a DVD's serial number
- Using some real-life descriptors:
  - Microsoft employees
  - researchers
  - California state residents
  - Ph.D.'s

Broadcast using subset cover
- The intended receivers are covered by the subsets S1, S3, S5, S6, S8, S10.
- The header uses k1, k3, k5, k6, k8, k10.

Subtree difference
- All receivers are associated with the leaves of a full binary tree.
- Keys are labeled by positions in the tree: k00, k01, …, k0…0, k0…1, …, k1…1.

Subtree differences
- Special set S_{i,j}: the leaves in the subtree of node i that are not in the subtree of its descendant j.

Greedy algorithm
- Easy greedy algorithm for constructing a subtree cover for any set of revoked users:
  1. Find a node such that both of its children have exactly one revoked descendant.
  2. Add (at most) two sets to the cover.
  3. Revoke the entire subtree.
- Step 2 could add fewer than two sets.
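
One way to read these steps is as a bottom-up recursion over the tree. The Python sketch below is such a reading, not the talk's own code: nodes use heap numbering (root 1, children 2v and 2v + 1), a pair (i, j) stands for the set of leaves under i but not under j, and all function names are illustrative.

```python
def greedy_cover(depth: int, revoked: set) -> list:
    """Return pairs (i, j), each standing for the leaves under i but not under j.

    Leaves are numbered 2**depth .. 2**(depth + 1) - 1.
    """
    first_leaf = 1 << depth
    cover = []

    def collapse(v: int):
        """Return the single node under v holding all revoked leaves, or None."""
        if v >= first_leaf:
            return v if v in revoked else None
        left, right = collapse(2 * v), collapse(2 * v + 1)
        if left is None or right is None:
            return left if right is None else right
        # Both children hold exactly one (possibly collapsed) revoked node.
        if left != 2 * v:
            cover.append((2 * v, left))
        if right != 2 * v + 1:
            cover.append((2 * v + 1, right))
        return v  # from now on the whole subtree of v counts as revoked

    top = collapse(1)
    if top is None:
        raise ValueError("no revoked users: one all-receivers set is needed")
    if top != 1:
        cover.append((1, top))
    return cover


def leaves_under(v: int, depth: int) -> set:
    """All leaf numbers in the subtree rooted at v."""
    lo = hi = v
    while lo < (1 << depth):
        lo, hi = 2 * lo, 2 * hi + 1
    return set(range(lo, hi + 1))


depth = 4
revoked = {19, 20, 27}
cover = greedy_cover(depth, revoked)
covered = [leaf for i, j in cover
           for leaf in leaves_under(i, depth) - leaves_under(j, depth)]
print(cover)
```

Each merge of two revoked nodes adds at most two sets and the final step adds at most one, so R revoked users never need more than 2R - 1 sets.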

Analysis of this algorithm
R: number of revoked users; C: number of sets in the cover
- C ≤ 2R - 1
- Averaged over sets of fixed size [NNL'01]: E[C] ≤ 1.38 R
- Simulation experiments give [NNL'01]: E[C] ~ 1.25 R
- If a user is revoked with probability p « 1: E[C] ≈ 1.24511 E[R]

Exact formula [formula not preserved in the slide text]

Mellin transform

Asymptotic
- [Plot: E[C]/E[R] versus p; the ratio approaches 1.24511 for small p]
- Close-up: the ratio stays between 1.2451114… and 1.2451134…, around 3 log2(4/3).
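
The constant 3 log2(4/3) can be evaluated directly and compared with the two bounds read off the slide; a one-line Python check:

```python
import math

limit = 3 * math.log2(4 / 3)
print(f"{limit:.7f}")
```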


Halevy-Shamir scheme
- Noticed that subtree differences are decomposable.
- Fewer special sets reduce the memory requirement on receivers.

Improvement
- For practical parameters, saves an additional 20% compared to the Halevy-Shamir scheme.
- This is joint work with N. Alon, D. Halevy, A. Shamir.


Other work
- New classes of hash functions and analysis of a construction for hash functions [Eurocrypt'01]
- Crypto and game theory in peer-to-peer filesharing networks [EC'01, FC'02]
- Construction of short signatures based on discrete logarithm [CT-RSA'03]