A Timing-Resistant Elliptic Curve Backdoor in RSA
The 3rd International SKLOIS Conference on Information Security and Cryptology (Inscrypt 2007), Aug. 31 – Sep. 5, 2007, Xining, China
Moti M. Yung, joint work with Adam L. Young
What is Kleptography?
- Kleptography is the study of stealing information securely and subliminally.
- Kleptography is dedicated to researching ways of obtaining such data in an undetectable fashion with high security guarantees.
- It is a formal cryptographic study of backdoor designs.
What is the goal of kleptography?
- To develop a robust backdoor within a cryptosystem that:
  1) Provides the attacker with the desired secret information (e.g., the private key of the unwary user)
  2) Cannot be detected in black-box implementations (I/O access only) except by the attacker
  3) If a reverse-engineer (i.e., not the attacker) breaches the black box, then the previously stolen information remains confidential (secure against reverse-engineering). Ideally, confidentiality holds going forward as well.
- The successful reverse-engineer will learn that the attack is carried out, BUT will be unable to use the backdoor.
- It is the design of cryptographic Trojan horses that are robust against reverse-engineering.
Background – Crypto ’96, Eurocrypt ’97 and Crypto ’97
- We introduced the notion of kleptography [YY96, YY97].
- An asymmetric backdoor is a covert backdoor that can only be used by the attacker, even if the full specification is made public.
- The first example: an RSA composite that hides one of its primes in its high-order bits (in the random oracle model, ROM).
- A kleptographic attack (SETUP) must be indistinguishable:
  - Under black-box queries, the device with the backdoor must appear to be the same as the device without the backdoor.
  - The information hidden in the composite must be well encrypted.
  - The encrypting RSA modulus was half the size.
- We also showed DH-based klepto attacks.
Normal RSA Key Generation
Let e be the public RSA exponent that is shared by all the users (e.g., e is often taken to be 2^16 + 1).
1) choose a large number p randomly (e.g., p is 512 bits long)
2) if p is composite or gcd(e, p − 1) ≠ 1 then goto step 1
3) choose a large number q randomly
4) if q is composite or gcd(e, q − 1) ≠ 1 then goto step 3
5) output the public key (n = pq, e) and the private key p
- Note that the private exponent d is found by solving for (d, k) in ed + kφ(n) = 1 (using the extended Euclidean algorithm).
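The five steps above can be sketched in Python as follows. This is a minimal illustration, not OpenSSL's generator: the helper names, the 20 Miller-Rabin rounds, the small key size, and the choice to force the top bit and oddness of each candidate are all assumptions of this sketch.

```python
import math
import secrets

def is_probable_prime(n, rounds=20):
    """Miller-Rabin test with random bases (rounds is an assumed setting)."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2          # base in [2, n-2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                          # a witnesses compositeness
    return True

def random_rsa_prime(bits, e):
    """Steps 1-2 (and 3-4): redraw until prime with gcd(e, p-1) = 1."""
    while True:
        # Assumption: force the top bit (exact length) and the low bit (odd).
        p = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(p) and math.gcd(e, p - 1) == 1:
            return p

def rsa_keygen(bits=256, e=2**16 + 1):
    """Step 5: return ((n, e), (p, q, d)); d comes from ed + k*phi(n) = 1."""
    p = random_rsa_prime(bits // 2, e)
    q = random_rsa_prime(bits // 2, e)
    while q == p:
        q = random_rsa_prime(bits // 2, e)
    d = pow(e, -1, (p - 1) * (q - 1))             # modular inverse, Python 3.8+
    return (p * q, e), (p, q, d)
```

Calling `rsa_keygen(1024)` would correspond to the 512-bit primes mentioned on the slide; the default here is kept small so the sketch runs quickly.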
RSA Encryption/Decryption
- Let d be the private exponent where ed ≡ 1 mod (p − 1)(q − 1).
- Let Z_n^* denote the set of numbers in {1, 2, 3, …, n − 1} that are relatively prime to n.
- To encrypt m ∈ Z_n^* compute: c = m^e mod n
- To decrypt the ciphertext c compute: m = c^d mod n
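A small worked instance of the two formulas, using toy primes chosen only for illustration (p = 61, q = 53, e = 17 are assumptions of this sketch and far too small for real use):

```python
import math

def rsa_encrypt(m, e, n):
    """c = m^e mod n for m in Z_n^* (textbook RSA, no padding)."""
    if not (0 < m < n) or math.gcd(m, n) != 1:
        raise ValueError("m must lie in Z_n^*")
    return pow(m, e, n)

def rsa_decrypt(c, d, n):
    """m = c^d mod n."""
    return pow(c, d, n)

# Toy parameters (illustrative only).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # ed = 1 mod (p-1)(q-1)

m = 65
c = rsa_encrypt(m, e, n)
assert rsa_decrypt(c, d, n) == m
```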
Kleptographic RSA Key Generation
- The key generation algorithm is modified to contain a cryptotrojan. The cryptotrojan contains the attacker’s public key y.
- This is an earlier version of the attack [YY96, YY97]; more mature versions exist [YY04, YY05].
1) choose a large value s randomly (e.g., 512 bits)
2) compute p = H(s) where H is a cryptographic one-way function
3) if p is composite then goto step 1
4) choose a large value RND randomly
5) compute c as the asymmetric encryption of s under y [1/2 size modulus]
6) solve for (q, r) in (c || RND) = pq + r
7) if q is composite then goto step 1
8) output the public key (n = pq, e) and the private key p
- Note that n is about 1024 bits in length.
Recovering the RSA Private Key
- The private key is recovered as follows:
  - The attacker obtains the public key (n, e) of the user.
  - Let u be the 512 uppermost bits of n.
  - The attacker sets c1 = u and c2 = u + 1 (c2 accounts for a potential borrow bit having been taken in the computation n = pq = (c || RND) − r).
  - The attacker decrypts c1 and c2 to get s1 and s2, respectively.
  - Either p1 = H(s1) or p2 = H(s2) will divide n evenly.
- Only the attacker can perform this operation since only the attacker knows the needed private decryption key.
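The borrow argument behind c1 = u and c2 = u + 1 is plain integer arithmetic, and can be checked with a toy-sized sketch. The word size W = 32, the function names, and the random stand-ins for c and p are assumptions of this illustration; no encryption or hashing is performed and q is not tested for primality, since only the relation n = (c || RND) − r matters here.

```python
import random

W = 32  # toy half-size in bits (the slides use 512)

def embed(c, rnd, p):
    """Solve (c || RND) = p*q + r and return (n, r) with n = p*q."""
    nc = (c << W) | rnd
    q, r = divmod(nc, p)
    return p * q, r

def candidates_from_n(n):
    """Upper bits u of n and u + 1; one of them equals c."""
    u = n >> W
    return (u, u + 1)

# Since r < p < 2^W, subtracting r from (c || RND) changes the upper
# half by at most one borrow, so c is either u or u + 1.
rng = random.Random(1)
c = rng.getrandbits(W) | (1 << (W - 1))
rnd = rng.getrandbits(W)
p = rng.getrandbits(W) | (1 << (W - 1)) | 1
n, r = embed(c, rnd, p)
assert c in candidates_from_n(n)
```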
Definition of a SETUP
A SETUP attack is an algorithmic modification C' of a cryptosystem C with the following properties:
1) Halting Correctness: C and C' are efficient algorithms.
2) Output Indistinguishability: The outputs of C and C' are computationally indistinguishable to all efficient algorithms except for the attacker A.
3) Confidentiality of C: The outputs of C do not compromise the security of the cryptosystem that C implements.
4) Confidentiality of C': The outputs of C' only compromise the security of the cryptosystem that C' implements with respect to the attacker A.
5) Ability to compromise C': With overwhelming probability the attacker A can decrypt, forge, or otherwise cryptanalyze at least one private output of C' given a sufficient number of public outputs of C'.
Formal Aspects
- There is a formal “security model and definitions” [CTRSA-05].
- The design employs tools of modern cryptography: indistinguishability, the random oracle assumption regarding strong hash functions, etc.
- There is a proof of security of the design (in the model).
- It is “fun” to use formal methodology and proof techniques to prove the “security of klepto”, which gives us a new notion in modern cryptography: that of “provable insecurity”.
- As mentioned, the trapdoor is half the size. What’s next? (This may be a problem.)
Background - Intuition Behind SAC ’05
- Elliptic Curve Cryptography gives smaller ciphertexts (with point compression) than RSA [RSA78] with a comparable security parameter.
  → The use of ECC in kleptography [YY96, YY97].
- BUT how do we ensure the backdoor is hidden with the ECC approach? This is what SAC ’05 solved.
- IDEA: The use of a twisted pair of binary curves gives a DH key exchange value that is (essentially) a bit string selected uniformly at random.
  → This suggests that we can embed a kleptographic DH key exchange value in the upper order bits of n = pq and achieve indistinguishability of backdoor RSA key pairs vs. “normal” RSA key pairs.
- Then: derive the prime from the exchanged key (in the random oracle model).
Binary Curves Background
- Curve E_{a,b} is given by the Weierstrass equation y^2 + xy = x^3 + ax^2 + b
- Here the coefficients a and b are in F_{2^m} and b ≠ 0.
- m is prime to avoid the GHS attack [GHS02].
- The curve must provide a suitable setting for EC-DDH [JN03].
- The curve must satisfy the MOV condition.
Property of Twisted Curves
- Recall that if the trace Tr_{F_{2^m}/F_2}(a) ≠ Tr_{F_{2^m}/F_2}(a') then E_{a,b} and E_{a',b} are “twists” of one another [Ka86, Ka88, Ka91].
- When two such curves form a twist, then for every x ∈ F_{2^m} there exists a y ∈ F_{2^m} such that (x, y) is a point on E_{a,b} or E_{a',b}.
- The two possibilities are as follows:
  - Either (x, y) and (x, x + y) are points on the same curve, or
  - (x, y) = (0, √b) is on both curves.
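The twist property can be checked exhaustively over a small field. The sketch below uses F_{2^5} with the irreducible polynomial z^5 + z^2 + 1 and the pair a = 0, a' = 1 (which have different traces because m is odd); the field size, the brute-force search, and the helper names are assumptions of this illustration and have nothing to do with the m = 257 implementation.

```python
M = 5               # toy extension degree (the talk uses m = 257)
POLY = 0b100101     # z^5 + z^2 + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of F_{2^M} in polynomial-basis representation."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # degree reached M: reduce modulo POLY
            a ^= POLY
    return r

def ys_on_curve(x, a, b):
    """All y with y^2 + xy = x^3 + a*x^2 + b (brute force over the field)."""
    x2 = gf_mul(x, x)
    rhs = gf_mul(x2, x) ^ gf_mul(a, x2) ^ b
    return [y for y in range(1 << M)
            if (gf_mul(y, y) ^ gf_mul(x, y)) == rhs]

def check_twist(b):
    """True if E_{0,b} and E_{1,b} split the x-coordinates as the slide states."""
    for x in range(1 << M):
        y0, y1 = ys_on_curve(x, 0, b), ys_on_curve(x, 1, b)
        if x == 0:
            # single point (0, sqrt(b)), shared by both curves
            if len(y0) != 1 or y0 != y1:
                return False
        else:
            on = y0 or y1
            # exactly one curve holds x, with the two points (x, y), (x, x + y)
            if (y0 and y1) or len(on) != 2 or (on[0] ^ on[1]) != x:
                return False
    return True
```

Summing over all x, the two curves together hold 2^(m+1) affine points, which is why a compressed point (x plus one bit) drawn from the pair can fill the space of (m+1)-bit strings.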
Timing Attacks
- So we solved the “space gap issue” in general. Now it is time to deal with timing attacks.
- QUESTION: Is it true that klepto RSA keygen is always slower than regular keygen? [Kucner, Kutyłowski ’01] indeed showed timing attacks.
- First Answer: a positive answer, i.e., backdoor computations are “extra” computations, always seemingly above and beyond what is “needed” to generate random RSA prime pairs. To dispute this one can develop a general theory (but running times are difficult to compare, and this may lead to an asymptotic but non-concrete result).
- Algorithmic Engineering Methodology: On the other hand, one methodology is to concentrate on one “accepted” specific algorithm and try to beat it via “algorithmic engineering” (the algorithm and its code are the adversary to overcome).
What we will show in fact:
- So here we rather chose a “common” implementation of RSA key generation (OpenSSL) and developed an attack against it. (This is an established implementation, not likely to be changed frequently.)
- Second Answer: We demonstrate a negative answer to the above question in this case, giving a concrete practical answer (a crypto engineering type of treatment).
- How: a klepto implementation based on OpenSSL that is not slower than OpenSSL RSA key generation.
Our contributions
- We present the first timing-resistant asymmetric backdoor in RSA (the work by Crépeau-Slakmon at CT-RSA ’03 constituted symmetric backdoors [CS03], where the attack exploits symmetric keys and is not immune to reverse engineering). The backdoor we present is based on [YY05] but is substantially redesigned.
- We present the first benchmarks of the backdoor’s running time.
- We present a backdoor (that we built into OpenSSL) that has a running time that is even faster than normal OpenSSL RSA key generation, and we show that the particulars of the prime incremental search method/trial division can have a very significant impact on timing-resistance engineering.
Highlights of Carefully Chosen Speed-Ups
- [YY05] does NOT do the following:
  - use incremental prime search/trial division
  - minimize the number of Miller-Rabin iterations (Damgård-Landrock-Pomerance)
  - exploit Non-Adjacent Form (NAF) for scalar multiplication
- At first we made the following improvements:
  - use the OpenSSL incremental search/fast trial division
  - minimize the number of Miller-Rabin iterations
  - use wNAF splitting [Mö02] in the asymmetric backdoor
  - eliminate point halving during key generation ([YY05] uses point halving)
Highlights continued…
- Even after the above changes, our backdoor key generator was still slower than OpenSSL RSA key generation.
- Main discovery:
  - We then devised an extended incremental search algorithm (a modification of the OpenSSL incremental search/fast trial division algorithm).
  - We show this design allows our construction to resist timing analysis by running faster than OpenSSL RSA key generation.
- The work is based on actual implementation experiments (all done by A. Young).
Parameters for Experiment
- m is the prime 257 (avoids the GHS attack)
- We use a = 0 in the curve E_{a,b} = E_{0,b} and a = 1 in the curve E_{a,b} = E_{1,b}
- The irreducible trinomial is f(z) = z^257 + z^12 + 1
- # of points on E_{0,b} is 4·q0, # of points on E_{1,b} is 2·q1
- We used the following values for b, q0, and q1 (hexadecimal):
  b = 197D4C3C909B4C8EAC18BB296C11BFB18C80B37C0C62AFD8E5F00104C46EEAF0B
  q0 = 800000000000000005EB3E3179500E2B5D2F8EA6DCC363C1F
  q1 = FFFFFFFFFFFFFFFF429839D0D5FE3A945A0E2B24679387C3
Speed-ups in Elliptic Curve Key Exchange
- There are four bases: G_0, G_1, Y_0, Y_1.
- A curve a is selected and then the wNAF splitting values for G_a, Y_a are “loaded”.
- This enables fast scalar multiplication over the base points G_a, Y_a.
- “Alice” chooses the scalar K randomly such that K·G_a is a random bit string (a was chosen randomly in proportion to the # of points on each curve).
- Alice reduces K to get k and computes the shared DH secret k·Y_a.
- This is faster than [YY05], in which k was chosen first and then point halving was applied to k to get K.
DH key exchange over a twist in GF(2^m) without halving
GenDHExchangeValues():
Input: none
Output: spub, spriv ∈ {0, 1}^(m+1)
1. with probability (4·q0 − 1)/2^(m+1) set a = 0 and with probability (2·q1 − 1)/2^(m+1) set a = 1
2. if a = 0 set (r, cof) = (q0, 4) else set (r, cof) = (q1, 2)
3. set groupg = NewCurve(p, a, b) and set groupy = NewCurve(p, a, b)
4. if a = 1 then
5.    LoadwNAFsplit(groupg, groupValsG1, groupPointsG1)
6.    LoadwNAFsplit(groupy, groupValsY1, groupPointsY1)
7. else
8.    LoadwNAFsplit(groupg, groupValsG0, groupPointsG0)
9.    LoadwNAFsplit(groupy, groupValsY0, groupPointsY0)
10. SetGenerator(groupg, G_a, 2(2 − a)r, cof)
11. SetGenerator(groupy, Y_a, r, cof)
12. choose K uniformly at random such that 0 < K < 2(2 − a)r
13. set k = K
14. if (K ≥ r) then set k = FastKmodr(K, r)
15. P = FastPointMul(groupg, K) /* compute P = K·G_a */
16. spub = PointCompress(groupg, P)
17. P = FastPointMul(groupy, k) /* compute P = k·Y_a */
18. spriv = PointCompress(groupg, P)
19. return (spub, spriv)
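Steps 1, 12 and 14 involve only integer arithmetic and can be sketched without any curve code. In the sketch below, `choose_curve` and `reduce_scalar` are hypothetical helper names, and the toy values m = 5, q0 = q1 = 11 were picked only because they satisfy (4·q0 − 1) + (2·q1 − 1) = 2^(m+1); they are not claimed to be the orders of an actual curve pair. `reduce_scalar` uses a plain modular reduction as a stand-in for FastKmodr.

```python
import secrets

def choose_curve(m, q0, q1, u=None):
    """Step 1: pick a = 0 with probability (4*q0 - 1)/2^(m+1), else a = 1.

    u may be supplied for testing; otherwise it is drawn uniformly.
    """
    total = 1 << (m + 1)
    if (4 * q0 - 1) + (2 * q1 - 1) != total:
        raise ValueError("non-identity point counts must sum to 2^(m+1)")
    if u is None:
        u = secrets.randbelow(total)
    return 0 if u < 4 * q0 - 1 else 1

def choose_scalar(a, r):
    """Step 12: K uniform with 0 < K < 2(2 - a)r."""
    return secrets.randbelow(2 * (2 - a) * r - 1) + 1

def reduce_scalar(K, r):
    """Steps 13-14: k = K, reduced modulo r when K >= r."""
    return K % r if K >= r else K

# Toy run with illustrative numbers (assumed, not from a real curve).
a = choose_curve(5, 11, 11)
r = 11
K = choose_scalar(a, r)
k = reduce_scalar(K, r)
```

Weighting the choice of curve by its number of non-identity points is what makes the compressed public value spread over all 2^(m+1) bit strings rather than favouring one curve.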
Recovering the DH secret
RecoverDHSharedSecret(spub, x0, x1):
Input: spub ∈ {0, 1}^(m+1) and EC private keys x0, x1
Output: spriv ∈ {0, 1}^(m+1)
1. set (v, r, cof) = (0, q0, 4)
2. ((U, V), w) = PointDecompress(0, spub)
3. if (w = 0) then
4.    compute ((U, V), w) = PointDecompress(1, spub)
5.    set (v, r, cof) = (1, q1, 2)
6. set a = v and set group = NewCurve(p, a, b)
7. let P1 be the point corresponding to (U, V)
8. P1 = PointDouble(group, P1)
9. if (v = 0) then set P1 = PointDouble(group, P1)
10. SetGenerator(group, P1, r, cof)
11. P2 = PointMul(group, x_a) /* compute P2 = x_a·P1 */
12. return spriv = PointCompress(group, P2)
Backdoor Design Strategy
- Our strategy was twofold:
  - Model the backdoor RSA key generator after the existing OpenSSL RSA prime pair generator.
  - Determine how to make the backdoor run faster via efficiency improvements. Failing that, determine a minimal change (ideally) that enables the backdoor key generator to run faster than OpenSSL.
- We found the OpenSSL RSA prime pair generator to be quite fast. We found that we had to change the prime search algorithm a little to make the backdoor key generator run faster than the OpenSSL key generator.
- Again: this is crypto engineering/algorithms engineering.
Deciding Acceptable Primes
- We use PrimeChecksForSize, which takes as input the bit length of a candidate prime and returns a positive integer specifying the number of iterations of Miller-Rabin [Mi76, Ra80].
- The number of Miller-Rabin iterations is based on the “average case error estimates for the strong probable prime test” [DLP93].
- For 1024, 2048, and 4096 bit numbers it does 2, 2, and 3 Miller-Rabin iterations, respectively.
- IsAcceptablePrimeFast(e, len, p1) returns true if and only if p1 is an acceptable RSA prime. len is the required bit length of p1 and e is the RSA exponent.
Incremental Search/Fast Trial Division Tricks
- Goal: fast incremental search for a prime p1 such that p1 − 1 is not divisible by small primes (so check that p1 and p1 − 1 aren’t divisible by small primes).
- We use a 2048-element array primes[ ] for trial division.
- Candidate prime p1 is already odd and is reduced modulo primes[1], primes[2], …, primes[2047], and the results are stored in the mods[ ] array.
- The 32-bit word delta is successively incremented by 2.
- The candidate p1 + delta passes the fast trial division if for i = 1 to 2047: (mods[i] + delta) mod primes[i] > 1
- OpenSSL gives up on p1 if, before delta overflows, a candidate passes trial division but is found to be composite.
- We change this to give up on p1 only if delta overflows.
Namely...
- The search for the prime increments the candidate by 2 as long as the incremented value keeps the same bit length.
- The update of the trial division is faster.
- Now, though, the prime drawing method is skewed. We point this out and postulate, as an assumption, that it does not change the distribution much: prove it or use it to attack the method.
Using Incremental Search and Fast Trial Division
GenPrimeWithPRNGFast(seed, key, D, len, e):
Input: seed, key, D ∈ {0, 1}^128, required bit length len, RSA exponent e
Output: An acceptable prime p1
1. PseudoRndGenAES128_1(seed, key, D) /* initialize the PRNG, D is the “date” */
2. p1 = PseudoRndGenAES128_0(len/8) /* return next len/8 bytes of PRNG stream */
3. set the 2 most significant bits and the least significant bit of p1 to 1
4. for i = 1 to NUMPRIMES − 1 step 1 do: /* NUMPRIMES = 2048 */
5.    mods[i] = p1 mod primes[i] /* primes[ ] stores the first 2048 primes */
6. delta = 0
7. for i = 1 to NUMPRIMES − 1 step 1 do:
8.    if ((mods[i] + delta) mod primes[i] ≤ 1)
9.       delta = delta + 2
10.      if (delta > MAXDELTA) then goto step 2
11.      goto step 7
12. if (IsAcceptablePrimeFast(e, len, p1 + delta) = false)
13.    delta = delta + 2
14.    if (delta > MAXDELTA) then goto step 2
15.    goto step 7
16. PseudoRndGenAES128_2() /* zeroize values and free memory */
17. set p1 = p1 + delta and return p1
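The search loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the OpenSSL code: it uses Python's `random` module in place of the AES-based PRNG, a 64-entry prime table instead of 2048, 8 Miller-Rabin rounds, and a hypothetical `extended` flag that switches between the extended search (give up on p1 only when delta exceeds the limit) and the stock behaviour described on the earlier slide (redraw p1 as soon as a sieved candidate turns out composite).

```python
import math
import random

def first_primes(count):
    """Return the first `count` primes by simple trial division."""
    primes, cand = [], 2
    while len(primes) < count:
        if all(cand % p for p in primes if p * p <= cand):
            primes.append(cand)
        cand += 1
    return primes

PRIMES = first_primes(64)      # toy table; the slides use 2048 primes

def miller_rabin(n, rounds, rng):
    """Miller-Rabin test for odd n > 3 with `rounds` random bases."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(rng.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def is_acceptable_prime(e, bits, cand, rng, rounds=8):
    """Right bit length, gcd(e, cand - 1) = 1, and probably prime."""
    return (cand.bit_length() == bits and math.gcd(e, cand - 1) == 1
            and miller_rabin(cand, rounds, rng))

def gen_prime_incremental(bits, e, rng, max_delta=1 << 16, extended=True):
    while True:
        # draw p1 with the 2 top bits and the low bit set
        p1 = rng.getrandbits(bits) | (0b11 << (bits - 2)) | 1
        mods = [p1 % sp for sp in PRIMES[1:]]      # primes[0] = 2 is skipped
        delta = 0
        while delta <= max_delta:
            # fast trial division on cached residues: reject if a small
            # prime divides p1 + delta or p1 + delta - 1
            if all((r + delta) % sp > 1 for r, sp in zip(mods, PRIMES[1:])):
                if is_acceptable_prime(e, bits, p1 + delta, rng):
                    return p1 + delta
                if not extended:
                    break                           # stock: redraw p1
            delta += 2
```

The extended variant reuses the cached residues in `mods` after a failed primality test instead of paying for a fresh candidate and a fresh round of reductions, which is the saving the slides attribute to the modified search.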
Final Building Blocks
DHSecretToPRNGParams(spriv) outputs (seed, key, D)
- Recall that |spriv| = m + 1 bits. In our case m = 257.
- Right shifts spriv by 2 bits to derive 256 bits for use in the PRNG.
- 128 of these bits are used for key and the other 128 for seed.
- The “date” value D is a fixed (randomly selected) 128-bit value.
GenPrimeWithOracleIncr(spriv, len, e) returns an acceptable prime p1
- Calls DHSecretToPRNGParams(spriv) to get (seed, key, D)
- Outputs p1 where p1 = GenPrimeWithPRNGFast(seed, key, D, len, e)
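The bit handling in DHSecretToPRNGParams can be sketched as below. The slide does not say which 128 bits become the key and which the seed, nor the byte order, so this sketch assumes big-endian bytes with the high half as key and the low half as seed; the function name and the caller-supplied `date` argument are likewise illustrative.

```python
M = 257  # field extension degree from the slides

def dh_secret_to_prng_params(spriv, date):
    """Map an (m+1)-bit DH secret to (seed, key, D), each 128 bits.

    Assumptions of this sketch: big-endian byte order, high 128 bits
    used as key, low 128 bits used as seed.
    """
    if not 0 <= spriv < (1 << (M + 1)):
        raise ValueError("spriv must fit in m + 1 bits")
    if len(date) != 16:
        raise ValueError("D must be 128 bits")
    bits256 = spriv >> 2                  # drop the 2 low bits: 258 -> 256 bits
    raw = bits256.to_bytes(32, "big")
    key, seed = raw[:16], raw[16:]
    return seed, key, date
```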
Backdoor Key Pair Generator
GetPrimesFast(bits, e, spub, spriv):
1. len = bits/2
2. p1 = GenPrimeWithOracleIncr(spriv, len, e)
3. θ = bits − (8 + m + 1)
4. choose r1 ∈_R {0, 1}^7 and r2 ∈_R {0, 1}^θ
5. set nc = 1 || r1 || spub || r2
6. solve for (q1, r) in nc = q1·p1 + r
7. if (|q1| ≠ len or the 2nd most significant bit of q1 is not 1) then goto step 4
8. set the least significant bit of q1 to 1
9. for i = 1 to NUMPRIMES − 1 step 1 do:
10.    mods[i] = q1 mod primes[i]
11. delta = 0
12. for i = 1 to NUMPRIMES − 1 step 1 do:
13.    if ((mods[i] + delta) mod primes[i] ≤ 1)
14.       delta = delta + 2
15.       if (delta > MAXDELTA) then goto step 4
16.       goto step 12
17. if (IsAcceptablePrimeFast(e, len, q1 + delta) = false)
18.    delta = delta + 2
19.    if (delta > MAXDELTA) then goto step 4
20.    goto step 12
21. set q1 = q1 + delta and return pair of acceptable RSA primes (p1, q1)
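The bit layout of nc in step 5 can be made concrete with a short sketch. Here `theta` is this sketch's name for bits − (8 + m + 1), the bit length of r2 implied by step 3; the helper names are hypothetical, and the random stand-ins for spub and p1 carry no cryptographic meaning. The final check illustrates, as an observation about the arithmetic rather than a statement of the authors' recovery procedure, that replacing nc by n = p1·(q1 + delta) disturbs the spub field by at most one carry or borrow when delta is small.

```python
import random

M = 257  # so spub occupies m + 1 = 258 bits

def build_nc(bits, spub, rng):
    """nc = 1 || r1 || spub || r2 with |r1| = 7 and |r2| = theta."""
    theta = bits - (8 + M + 1)
    r1 = rng.getrandbits(7)
    r2 = rng.getrandbits(theta)
    head = (1 << 7) | r1                       # leading 1 bit, then r1
    return (((head << (M + 1)) | spub) << theta) | r2

def spub_field(n, bits):
    """Read the (m+1)-bit field that sits above the low theta bits."""
    theta = bits - (8 + M + 1)
    return (n >> theta) & ((1 << (M + 1)) - 1)

rng = random.Random(5)
bits = 1024
spub = rng.getrandbits(M + 1)                  # stand-in for a compressed point
nc = build_nc(bits, spub, rng)
assert nc.bit_length() == bits and spub_field(nc, bits) == spub

# n differs from nc by -r + p1*(small amount), far below 2^theta in size,
# so the field read back from n is spub, spub - 1 or spub + 1 (mod 2^(m+1)).
p1 = rng.getrandbits(512) | (0b11 << 510) | 1  # stand-in, not tested for primality
q1 = (nc // p1) | 1
n = p1 * (q1 + 2 * rng.randrange(1 << 19))
assert (spub_field(n, bits) - spub) % (1 << (M + 1)) in (0, 1, (1 << (M + 1)) - 1)
```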
Security
- We replaced the random oracle in [YY05] with a heuristic PRNG (for provability, assume it is a random oracle).
- The PRNG is based on ANSI X9.17 with DES replaced with AES-128.
- Security is therefore a heuristic argument.
- RECALL POSSIBLE WEAKNESS: In comparison to OpenSSL we changed the incremental search/fast trial division algorithm (to get speed).
- SO, it may be possible to sample prime pairs and see a difference w.r.t. OpenSSL.
- We leave this issue as open.
- OBSERVATION: This suggests a cryptanalyst could/should focus on differences resulting from differing incremental search/trial division strategies.
Performance ave. for Fast SETUP Alg.
Columns: |n| = 1024, |n| = 2048, |n| = 4096 (10,000 trials each)
Block size for wNAF: 8, 8, 8
Value for wNAF: 4, 4, 4
Ave. # inversions/RSA key: 43.00, 43.13
Ave. # additions/RSA key: 84.33, 84.28, 84.29
Ave. # doublings/RSA key: 14.42, 14.41
Ave. RSA key gen. time (sec): 0.111, 0.768, 8.723
Ave. RSA key rec. time (sec): 0.048, 0.374, 4.308
Table 1. Performance Averages for Fast SETUP Algorithm
Pentium 4, 2.79 GHz, 512 MB RAM, Ubuntu Linux
Average RSA key generation/recovery time
RSA Key Generation Method               |n| = 1024   |n| = 2048   |n| = 4096
                                        (10,000 trials each)
1) Modified SAC ’05 Gen (see [34])      0.233        1.164        10.171
2) Modified SAC ’05 Rec                 0.116        0.740        7.390
3) Fast SETUP Algorithm Gen             0.111        0.768        8.723
4) Fast SETUP Algorithm Rec             0.048        0.374        4.308
5) Normal OpenSSL Gen                   0.131        0.937        9.294
6) OpenSSL with ext. incr. srch. Gen    0.082        0.733        8.632
Table 2. Average RSA key generation/recovery time in seconds
Conclusion
- We presented an asymmetric backdoor in RSA key generation that uses a twisted pair of binary curves.
- We presented an algorithms engineering approach to investigating the speed of an alternative (kleptographic) implementation against an established system.
- We presented a number of speed-ups, one of which enables the backdoor key pair generator to run faster than the OpenSSL RSA key generator.
- This suggests that asymmetric backdoors in RSA can resist timing analysis, but this is an example against one algorithm (more work is needed on this problem).
- Note also: We used kleptographic motivation as an engine for deriving new algorithmic ideas at various levels (special implementation speed-ups, better use of theory, faster methods that can be used elsewhere).
References
[CS03] C. Crépeau, A. Slakmon. Simple Backdoors for RSA Key Generation. In The Cryptographers' Track at the RSA Conference, pages 403-416, 2003.
[DLP93] I. Damgård, P. Landrock, C. Pomerance. Average Case Error Estimates for the Strong Probable Prime Test. In Math. Comp., 61(203):177-194, 1993.
[GHS02] P. Gaudry, F. Hess, N. Smart. Constructive and Destructive Facets of Weil Descent on Elliptic Curves. In Journal of Cryptology, v. 15, pages 19-46, 2002.
[JN03] A. Joux, K. Nguyen. Separating DDH from CDH in Cryptographic Groups. In Journal of Cryptology, v. 16, n. 4, pages 239-247, 2003.
[Ka86] B. S. Kaliski. A Pseudo-Random Bit Generator Based on Elliptic Logarithms. In Advances in Cryptology—Crypto '86, pages 84-103, 1986.
[Ka88] B. S. Kaliski. Elliptic Curves and Cryptography: A Pseudorandom Bit Generator and Other Tools. Ph.D. Thesis, MIT, Feb. 1988.
[Ka91] B. S. Kaliski. One-Way Permutations on Elliptic Curves. In Journal of Cryptology, v. 3, n. 3, pages 187-199, 1991.
[Mi76] G. L. Miller. Riemann's Hypothesis and Tests for Primality. In J. Comp. Syst. Sci., v. 13, n. 3, pages 300-317, 1976.
[Mö02] B. Möller. Improved Techniques for Fast Exponentiation. In Information Security and Cryptology—ICISC '02, pages 298-312, 2002.
[Mö04] B. Möller. A Public-Key Encryption Scheme with Pseudo-Random Ciphertexts. In ESORICS '04, pages 335-351, 2004.
[Ra80] M. O. Rabin. Probabilistic Algorithms for Testing Primality. In J. Number Th., v. 12, pages 128-138, 1980.
[RSA78] R. Rivest, A. Shamir, L. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. CACM 21(2), 1978.
[YY96] A. Young, M. Yung. The Dark Side of Black-Box Cryptography, or: Should we trust Capstone? In Advances in Cryptology—Crypto '96, pages 89-103, 1996.
[YY97] A. Young, M. Yung. Kleptography: Using Cryptography Against Cryptography. In Advances in Cryptology—Eurocrypt '97, pages 62-74, 1997.
[YY05] A. Young, M. Yung. A Space Efficient Backdoor in RSA and its Applications. In Selected Areas in Cryptography—SAC 2005.