PRACTICAL ISSUES WITH DIFFIE-HELLMAN KEY EXCHANGE (DH)
JUNWU LIU
Most content of this presentation comes from: Adrian, David, et al. "Imperfect Forward Secrecy: How Diffie-Hellman Fails in Practice." Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 2015.

REVIEW (TEXTBOOK DH)
DH is the first public-key algorithm, invented in 1976. Its security is based on the difficulty of computing discrete logarithms in a finite field. The protocol goes as follows:
1. Alice and Bob agree on a large prime p and an element g that is primitive mod p.
2. Alice chooses a random large integer a and sends Bob c1 = g^a mod p.
3. Bob chooses a random large integer b and sends Alice c2 = g^b mod p.
4. Alice computes k = c2^a mod p. Bob computes k = c1^b mod p.
Both arrive at the same k = g^(ab) mod p. It is easy to compute c1 and c2, but HARD to recover a and b from them (no polynomial-time algorithm is known).
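The four steps above can be sketched in a few lines of Python. This is a toy illustration only: the group p = 23, g = 5 is far too small for real use, and the helper names `dh_keypair` and `dh_shared` are made up for this example.

```python
import secrets

def dh_keypair(p, g):
    """Pick a random secret exponent and return (secret, g^secret mod p)."""
    secret = secrets.randbelow(p - 2) + 1   # 1 <= secret <= p - 2
    return secret, pow(g, secret, p)

def dh_shared(their_public, my_secret, p):
    """Raise the other party's public value to our own secret exponent."""
    return pow(their_public, my_secret, p)

# Toy public parameters (assumption for illustration; real primes are 2048+ bits).
p, g = 23, 5

a, c1 = dh_keypair(p, g)   # Alice sends c1 = g^a mod p
b, c2 = dh_keypair(p, g)   # Bob sends c2 = g^b mod p

k_alice = dh_shared(c2, a, p)   # c2^a mod p
k_bob = dh_shared(c1, b, p)     # c1^b mod p
```

Both sides end up with the same value because c2^a = g^(ab) = c1^b (mod p); an eavesdropper sees only p, g, c1 and c2.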

REVIEW CONT.
Public parameters: a large prime p and an element g.
Alice -> Bob: c1 = g^a mod p
Bob -> Alice: c2 = g^b mod p
Both compute: k = g^(ab) mod p

NUMBER FIELD SIEVE (NFS) DISCRETE LOG ALGORITHM
This algorithm requires a good understanding of algebraic number theory, which is beyond my current mathematical level, so I can't describe it in detail. But we don't need to understand its internals; only two things are relevant to our subject:
- NFS is the most efficient known discrete log algorithm, but it is still not a polynomial-time algorithm.
- The first three steps, called precomputation, depend only on the prime p. Once finished, their output can be reused to compute the individual logs of many targets in the last stage. So a single large precomputation on p can be used to efficiently break all DH exchanges made with that prime.
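The "precompute once per prime, then answer many targets cheaply" structure can be illustrated with a much simpler algorithm. The sketch below uses baby-step giant-step, not NFS, purely as a stand-in: its table depends only on the group (p, g) and is reused for every target. The function names and the toy group p = 23, g = 5 are assumptions for illustration.

```python
from math import isqrt

def precompute(p, g):
    """Build a baby-step table; it depends only on the group (p, g)."""
    m = isqrt(p - 1) + 1
    table = {}
    value = 1
    for j in range(m):
        table.setdefault(value, j)   # maps g^j mod p -> j
        value = value * g % p
    giant = pow(g, -m, p)            # g^(-m) mod p (needs Python 3.8+)
    return m, table, giant

def individual_log(target, p, pre):
    """Find x with g^x = target (mod p), reusing the shared precomputation."""
    m, table, giant = pre
    gamma = target % p
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * giant % p
    return None                      # target is not a power of g

# One precomputation for the group, many cheap per-target lookups.
p, g = 23, 5
pre = precompute(p, g)
logs = {target: individual_log(target, p, pre) for target in (10, 4, 18)}
```

The cost split is nothing like that of NFS, but the reuse pattern is the same: the expensive part is paid once per prime, and each additional target is comparatively cheap.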

EFFICIENCY
For one 512-bit DH prime, NFS took 0.9 core-years for polynomial selection, 2.5 core-years for sieving, and 7.7 core-years for linear algebra, so about 11 core-years of precomputation overall. After precomputation, the final descent stage needs only about 10 core-minutes to compute an individual discrete log, and the wall-clock time can be reduced by using more cores. So we now know that, once the precomputation for a 512-bit DH prime is done, we can compute individual logs in acceptable time. What's next?

TLS HANDSHAKE
Let's move to the real world and recall the 4 phases of the TLS handshake.
Phase 1: the client sends a client_hello message to the server with several parameters. The only one worth noting here is cipher_suites, a list of the cryptographic options supported on the client side, sorted with the client's first preference first. The server replies with a server_hello message that contains the same kinds of parameters, including the ciphersuite it selected from that list. TLS specifies ciphersuites supporting multiple varieties of DH; textbook DH with unrestricted strength is called ephemeral DH, or DHE.
Phase 2: if they decide to use DHE, the server is responsible for selecting the DH parameters. It chooses a group (p, g), computes g^b, and sends a server_key_exchange message containing a signature over the tuple (p, g, g^b). It then sends a server_done message and waits for the response.

TLS HANDSHAKE CONT.
Phase 3: the client verifies the server's certificate and signature and responds with a client_key_exchange message containing g^a.
Phase 4: the client sends a finished message that includes a keyed MAC of its view of the handshake transcript, which lets the server confirm that the handshake was executed properly on the client side. The server does the same in return. The handshake is then complete, and the client and server can exchange data.

TLS HANDSHAKE CONT.
client -> server: Hello, […, DHE, …]
server -> client: ServerKeyExchange, cert, sign(p, g, g^b)
server -> client: ServerDone
client -> server: ClientKeyExchange, g^a
client <-> server: Finished
- Problem?
- Protocol flaw: the server does not sign the chosen ciphersuite.
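To make the flaw concrete, here is a minimal sketch in which the signed data covers only the tuple (p, g, g^b), as described above. HMAC with a shared key is used purely as a stand-in for the server's certificate-backed signature, and all names and toy values are assumptions for illustration, not the TLS wire format.

```python
import hashlib
import hmac

# Stand-in for the server's signing key (real TLS uses an asymmetric signature).
SERVER_KEY = b"stand-in for the server's private signing key"

def encode_params(p, g, gb):
    """Serialize the tuple (p, g, g^b); note that no ciphersuite is included."""
    return b"|".join(str(x).encode() for x in (p, g, gb))

def sign_server_key_exchange(p, g, gb):
    return hmac.new(SERVER_KEY, encode_params(p, g, gb), hashlib.sha256).digest()

def client_verify(p, g, gb, signature, client_suite):
    # client_suite is deliberately unused: nothing in the signed data binds
    # the ciphersuite, which is exactly the flaw.
    expected = hmac.new(SERVER_KEY, encode_params(p, g, gb), hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)

# The server believes it negotiated DHE_EXPORT and signs a weak group
# (toy numbers standing in for a 512-bit prime).
weak_p, g, gb = 23, 5, 10
signature = sign_server_key_exchange(weak_p, g, gb)

# The client believes it negotiated DHE, yet the signature still verifies.
accepted = client_verify(weak_p, g, gb, signature, client_suite="DHE")
```

Because the ciphersuite never enters the signed bytes, the same signature is valid whichever suite each side thinks was chosen.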

LOGJAM ATTACK
TLS supports DH as one of several key exchange methods, most commonly with a 1024-bit prime. A man-in-the-middle can downgrade TLS clients to reduced-strength DH with any server that allows DHE_EXPORT ciphersuites. DHE_EXPORT is restricted to primes no longer than 512 bits, to comply with 1990s U.S. export restrictions on cryptography. Although these restrictions are no longer in effect, many servers still support DHE_EXPORT for backwards compatibility. Then, by computing the 512-bit discrete log, the attacker can learn the session key and read or modify the contents at will.
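The last step, recovering the session key from a weak group, can be sketched with toy numbers. Brute-force search stands in for the NFS descent here, and the 17-bit prime 65537 stands in for a 512-bit export-grade prime; both are assumptions chosen so that the example runs instantly.

```python
import secrets

def brute_force_log(target, p, g):
    """Stand-in for NFS: try every exponent (only feasible for a toy p)."""
    value = 1
    for x in range(p - 1):
        if value == target:
            return x
        value = value * g % p
    raise ValueError("target is not a power of g")

def logjam_demo(p, g):
    """Return (attacker_key, client_key) for one simulated exchange."""
    b = secrets.randbelow(p - 2) + 1   # server's secret exponent
    a = secrets.randbelow(p - 2) + 1   # client's secret exponent
    gb = pow(g, b, p)                  # visible in ServerKeyExchange
    ga = pow(g, a, p)                  # visible in ClientKeyExchange

    # The attacker sees only p, g, gb and ga.
    recovered_b = brute_force_log(gb, p, g)
    attacker_key = pow(ga, recovered_b, p)

    client_key = pow(gb, a, p)         # what the client derives
    return attacker_key, client_key

attacker_key, client_key = logjam_demo(p=65537, g=3)
```

The attacker's value matches the client's because (g^a)^b = g^(ab) mod p. With a real 512-bit prime, the same step is what the precomputed NFS data makes affordable.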

LOGJAM ATTACK CONT.
client -> MITM: Hello, […, DHE, …]
MITM -> server: Hello, […, DHE_EXPORT, …] (the MITM rewrites the ciphersuite field)
server -> client: ServerKeyExchange, cert, sign(p_512, g, g^b); ServerDone
(The client will interpret the tuple as valid DHE parameters chosen by the server, because the structure of this message is the same as that of a DHE message.)
client -> MITM: ClientKeyExchange, g^a
MITM: calculate b using NFS, get g^(ab)
client <-> MITM: Finished
As I mentioned before, the server does not sign the chosen ciphersuite, so the client cannot tell that the message has been modified. But in the last phase, the client and server each have to send a keyed MAC of their view of the handshake transcript, so the attacker has to pretend to be the server to complete the handshake with the client, because the client and server have different

STILL HAVE PROBLEM…
How do we compute the shared secret g^(ab) in real time?
Generating primes with special properties is computationally burdensome, so many implementations use fixed or standardized DH parameters. 92% of HTTPS servers that allow DHE_EXPORT use one of the two most popular primes. So after precomputation on those primes, we only need about 10 core-minutes to compute each individual discrete log.
But even 1 minute is still not good enough: browsers have shorter timeouts, and we need to send the Finished message before they expire. There are a couple of ways to do this:
- Attack command-line clients. These clients normally run unattended, so they have long timeouts or none at all, and the attacker can finish computing g^(ab) before it is too late.
- For some browsers (this method does not work on every browser), send TLS warning alerts, which the browser ignores but which reset the handshake timer, to keep the connection alive.
- If the client supports the TLS False Start extension, it sends early application data without waiting for the server's Finished message, in order to reduce connection latency. So even though the attacker still cannot send the Finished message in time, it can get some useful information from this data.

IF WE CANNOT DOWNGRADE…?
Because standard primes are reused, our method still works. For 768-bit DH and 1024-bit DH (mainstream), the estimated costs are shown below:

         Sieving core-years   Linear algebra core-years   Descent core-time
DH-512   2.5                  7.7                         10 mins
DH-768   8,000                28,500                      2 days
DH-1024  10,000,000           35,000,000                  30 days

Although 45M core-years is a huge number, it is within reach of an attacker with nation-state support. The latest supercomputer, Sunway TaihuLight, has more than 10M cores. Special-purpose hardware also works and is both cheaper and quicker: a special-purpose machine costing hundreds of millions of dollars can complete the precomputation for one 1024-bit prime every year.
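As a rough back-of-the-envelope check on these figures, the sketch below adds up the precomputation cost per key size and divides by a core count. The DH-1024 entries are the ones that sum to the 45M core-years quoted above. The perfect-scaling assumption is mine and is optimistic, since linear algebra in particular does not parallelize that cleanly.

```python
# Core-year estimates for the precomputation stages.
COSTS = {
    "DH-512": {"sieving": 2.5, "linear_algebra": 7.7},
    "DH-768": {"sieving": 8_000, "linear_algebra": 28_500},
    "DH-1024": {"sieving": 10_000_000, "linear_algebra": 35_000_000},
}

def precomputation_core_years(group):
    """Total precomputation cost (sieving + linear algebra) in core-years."""
    cost = COSTS[group]
    return cost["sieving"] + cost["linear_algebra"]

def wall_clock_years(group, cores):
    """Idealized wall-clock time, assuming perfect scaling across all cores."""
    return precomputation_core_years(group) / cores

total_1024 = precomputation_core_years("DH-1024")
years_on_10m_cores = wall_clock_years("DH-1024", 10_000_000)
```

Under that idealized assumption, a machine with 10M cores would need about 4.5 years per 1024-bit prime, which is why special-purpose hardware is the more plausible route.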

IT IS WORTH IT
Precomputation for a single 1024-bit prime allows passive decryption of connections to 66% of VPN servers and 26% of SSH servers. Precomputation for a second common 1024-bit prime allows passive decryption of connections to 18% of the top 1M HTTPS domains.

MITIGATIONS
Logjam attack:
- Raise minimum DH lengths.
Attack on 1024-bit discrete log:
- Move to elliptic curve cryptography, such as ECDH.
- Use larger primes.
- Do not use common primes; generate a new one.
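The first mitigation amounts to a size check on the server-chosen prime before the client continues the handshake. A minimal sketch, assuming a 2048-bit policy threshold (the threshold and the function name are assumptions for illustration, not values from the slides):

```python
MIN_DH_BITS = 2048   # assumed policy threshold

def accept_dh_group(p, min_bits=MIN_DH_BITS):
    """Client-side check: refuse server-chosen primes below the minimum size."""
    return p.bit_length() >= min_bits

# Sized integers standing in for primes (not actual primes).
export_sized = (1 << 511) + 1     # 512 bits, as in DHE_EXPORT
legacy_sized = (1 << 1023) + 1    # 1024 bits
modern_sized = (1 << 2047) + 1    # 2048 bits

decisions = [accept_dh_group(x) for x in (export_sized, legacy_sized, modern_sized)]
```

A check like this stops the downgrade regardless of which ciphersuite the client believes was negotiated, because it looks at the prime actually received.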

Questions?