Introduction to Practical Cryptography Hash Functions


Agenda
• Hash Functions
  – Properties
  – Uses
  – General Design
  – Examples: MD4, MD5, SHA-0, SHA-1, SHA-256
  – Collision Attacks
  – General Method
  – SHA-3 competition

Hash Properties
• Map bit strings of arbitrary length to fixed-length outputs
  – h = hash(m), where h is short and of fixed length
• Not injective, but collisions are unlikely
  – Example: 2^160 possible values
• Computationally infeasible to generate collisions
• Computationally infeasible to invert

Hash Properties
• Preimage resistant: given h, hard to find m such that h = hash(m)
• Second-preimage resistant: given m1, hard to find m2 ≠ m1 such that hash(m1) = hash(m2)
• Collision resistant: hard to find any m1 and m2, m2 ≠ m1, such that hash(m1) = hash(m2)

In Practice
• Heuristics
• Simple operations, for performance:
  – Iterative: a series of rounds
  – Diffusion through logical operations, addition, shifts, rotates

Uses
• Data integrity:
  – Error detection
    • An attacker can still modify the file and recompute the hash
  – Forgery, modification: MAC
• Shorten data for signing
• Data structures:
  – Hash list
  – Hash (Merkle) tree
  – Hash table

Hash - Uses
• Integrity: error detection
  – Hash the file/message
  – An attacker can still modify the file and recompute the hash
• Integrity: prevent forgery, modification
  – MAC (keyed hash)
• Authentication
  – Signature: hash the data to shorten it (for efficiency), then sign the hash with a public-key algorithm
  – Append a shared secret to the unencrypted data, then hash
• Random bits
• Data structures:
  – Hash list
  – Hash (Merkle) tree
  – Hash table

MAC
• Message Authentication Code: a keyed hash
• Examples:
  – Encrypt the hash with a symmetric-key cipher
  – HMAC: H((K ⊕ c1) || H((K ⊕ c2) || m))

CBC-MAC
• Need to be careful in designing a MAC
• Setting: without knowing the key k, an attacker can ask for the CBC-MAC tag of any message
• Goal: produce a valid MAC for a new message
  – Use two message-tag pairs (m, t) and (m', t')
  – m and m' are single blocks (can be random)
  – (m || (t ⊕ m'), t') is a valid message-tag pair
  – That is, the CBC-MAC of m || (t ⊕ m') is t' (see next slide)

CBC-MAC
• Two message-tag pairs (m, t) and (m', t')
• m and m' are single blocks (can be random)
• The CBC-MAC of m || (t ⊕ m') is also t':
  – When computing CBC, the result after block m is t (the "ciphertext" from the block cipher)
  – Continuing in CBC mode, t is XORed with the next block, t ⊕ m'
  – t ⊕ m' ⊕ t = m', and this result is the next input to the block cipher
  – Thus the output is t'
• Therefore (m || (t ⊕ m'), t') is a valid message-tag pair
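The forgery can be checked with a small sketch. It assumes a stand-in for the block cipher: `toy_block_encrypt` is a hypothetical keyed function built from SHA-256, not a real cipher. That is enough here because CBC-MAC only uses the forward (encryption) direction, and the forgery does not depend on which keyed function is used.

```python
import hashlib

BLOCK = 16  # block size in bytes (assumption for this sketch)


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for a block cipher: a keyed function on 16-byte blocks."""
    assert len(block) == BLOCK
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """Plain CBC-MAC with a zero IV; the tag is the last 'ciphertext' block."""
    assert len(msg) % BLOCK == 0
    state = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        state = toy_block_encrypt(key, xor(state, msg[i:i + BLOCK]))
    return state


key = b"known only to the MAC oracle"

# The attacker obtains tags for two single-block messages.
m = b"A" * BLOCK
m_prime = b"B" * BLOCK
t = cbc_mac(key, m)
t_prime = cbc_mac(key, m_prime)

# Forged two-block message m || (t XOR m'); its tag is t', computed without the key.
forged = m + xor(t, m_prime)
print(cbc_mac(key, forged) == t_prime)
```

The forged message is new (it was never submitted to the oracle), yet its tag is already known to the attacker.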

MAC
• MAC: H(Key | message)
• Append attack
  – With most current hashes, which "chain" blocks, an attacker can get a valid MAC H(Key | message') for an extended message message' without knowing the key
• Put the secret at the end of the message
• Don't use all the output bits as the MAC

HMAC
• Append 0's to the key until it has 512 bits
• XOR with a constant and prepend to the data
• Hash
• XOR the key with a different constant and prepend to the result
• Hash again
• H( ((key || 0…0) ⊕ const1) || H( ((key || 0…0) ⊕ const2) || data) )
• If H is secure, so is HMAC:
  – Collision resistant
  – Given HMAC(key, x), can't compute HMAC(key, y) without knowing the key
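The steps above can be written out by hand and compared with Python's standard `hmac` module. This is a minimal sketch assuming SHA-256 as H (64-byte, i.e. 512-bit, blocks) and the usual HMAC constants 0x36 (inner) and 0x5c (outer); the function name `hmac_sha256_manual` is illustrative only.

```python
import hashlib
import hmac

BLOCK_SIZE = 64          # bytes: 512-bit block of SHA-256
IPAD, OPAD = 0x36, 0x5C  # inner and outer constants


def hmac_sha256_manual(key: bytes, data: bytes) -> bytes:
    # Keys longer than one block are hashed first (standard HMAC rule).
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()
    # Append 0's to the key until it fills one block.
    key = key.ljust(BLOCK_SIZE, b"\x00")
    inner_key = bytes(k ^ IPAD for k in key)
    outer_key = bytes(k ^ OPAD for k in key)
    # Inner hash over (key XOR ipad) || data, then outer hash over (key XOR opad) || inner.
    inner = hashlib.sha256(inner_key + data).digest()
    return hashlib.sha256(outer_key + inner).digest()


key, data = b"shared key", b"message to authenticate"
print(hmac_sha256_manual(key, data) == hmac.new(key, data, hashlib.sha256).digest())
```

In practice the library call is the one to use; the manual version only makes the two nested hashes visible.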

Generating "Random" Bits
• Hash together information containing some randomness:
  – Copy of every keystroke, mouse event
  – Time between keystrokes
  – Disk seek latency, sector number, etc.
  – Contents of the display
  – Audio input
  – Packet inter-arrival latency, CPU load
• Hash all this information together
• Allows generation of pseudo-random data

Challenge-Response without Encryption
Alice                               Bob
        Ra         -------->
        <--------  H(K | Ra), Rb
        H(K | Rb)  -------->
• H is a hash function
• K is a shared secret
• Ra, Rb are random values
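A minimal sketch of the exchange, assuming SHA-256 for H and simple concatenation for "K | R". The function name `respond` and the 16-byte nonce size are illustrative choices, not part of the slides.

```python
import hashlib
import hmac
import secrets


def respond(shared_secret: bytes, challenge: bytes) -> bytes:
    """Compute H(K | R) for a received challenge R."""
    return hashlib.sha256(shared_secret + challenge).digest()


K = b"secret shared by Alice and Bob"

# Message 1: Alice -> Bob, random challenge Ra.
ra = secrets.token_bytes(16)

# Message 2: Bob -> Alice, H(K | Ra) plus his own challenge Rb.
bob_reply = respond(K, ra)
rb = secrets.token_bytes(16)

# Alice checks Bob's reply by recomputing it (constant-time comparison).
alice_accepts_bob = hmac.compare_digest(bob_reply, respond(K, ra))

# Message 3: Alice -> Bob, H(K | Rb); Bob checks it the same way.
alice_reply = respond(K, rb)
bob_accepts_alice = hmac.compare_digest(alice_reply, respond(K, rb))

print(alice_accepts_bob, bob_accepts_alice)
```

Neither K nor any encryption of it crosses the wire; each side only proves it can compute the hash over a fresh random value.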

Storing Passwords
• Idea: hash the password and store the result
• When the user enters a password, hash it and compare with the stored value

Password Storing
• Old Unix password algorithm:
  – Store hash of user password
  – Hash typed password, compare with stored hash
  – First 8 bytes of password are the secret key
  – Then encrypt an all-zeroes block with a DES-like algorithm
  – Salt: 12-bit random number (used in the modified DES)
  – Salt stored with hashed result
• Later versions used MD5, Blowfish

Hash List
[Figure: hash list. Messages m1 … m8 are hashed individually to h^1_1 … h^1_8, and a single top hash h^2_1 is computed over the list of hashes.]

Hash Tree
[Figure: binary hash (Merkle) tree over messages m1 … m8. Level 1: leaf hashes h^1_1 … h^1_8; level 2: h^2_1 … h^2_4; level 3: h^3_1, h^3_2; root: h^4_1.]
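A hash tree over a list of messages can be sketched in a few lines. The choice of SHA-256, the rule of duplicating the last node on odd-sized levels, and the name `merkle_root` are assumptions of this sketch; real designs also usually tag leaf and inner nodes differently.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(messages: list[bytes]) -> bytes:
    """Hash each message, then repeatedly hash pairs of nodes until one root remains."""
    if not messages:
        raise ValueError("need at least one message")
    level = [h(m) for m in messages]  # level 1: leaf hashes
    while len(level) > 1:
        if len(level) % 2:            # assumption: duplicate the last node if the level is odd
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


msgs = [f"m{i}".encode() for i in range(1, 9)]  # m1 ... m8, as in the figure
root = merkle_root(msgs)
print(root.hex())

# Changing any single message changes the root.
tampered = list(msgs)
tampered[4] = b"m5 (modified)"
print(merkle_root(tampered) != root)
```

With 8 messages the loop runs three times, matching the levels 2, 3, and 4 in the figure.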

General Structure
• Message m is padded to M, a multiple of a fixed-length block
• M is divided into segments m1, m2, …, mn
[Figure: Merkle-Damgard chain. IV and m1 enter F; each output and the next block mi enter the next F; the last output is the hash value.]
• Merkle-Damgard, 1989
• F is called the compression function
  – Takes as inputs mi and the output of the previous iteration
  – Typically a series of rounds
  – Output is called a "chaining variable"
  – Typically, a function operates on the chaining variables, then adds in mi

General Structure
• Padding
  – "100…0"
  – MD4, MD5: last 64 bits hold the length of the unpadded message in bits (low-order word first)
  – SHA*: last bits hold the length of the message

General Structure
[Figure: Merkle-Damgard chain from IV through F over m1, m2, …, mn to the hash value.]
• Avalanche: all output bits depend on all input bits
• Diffusion: ideally, a change to one input bit changes each output bit with probability ½
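The diffusion goal can be observed empirically. The sketch below flips one input bit and counts how many output bits change, using SHA-256 from `hashlib` as an example hash (an assumption; any of the hashes discussed here would serve). Roughly half of the 256 output bits are expected to change.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


msg = bytearray(b"Introduction to Practical Cryptography")
flipped = bytearray(msg)
flipped[0] ^= 0x01  # flip a single input bit

d1 = hashlib.sha256(msg).digest()
d2 = hashlib.sha256(flipped).digest()
changed = bit_difference(d1, d2)
print(f"{changed} of {8 * len(d1)} output bits changed")
```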

MD4
• Rivest, RFC 1320
• Fast in software
• Simple to program
• Memory efficient: no large data structures

MD4 Notation
• word = 32 bits
• Message m
• XY = X AND Y
• X v Y = X OR Y

MD4
• m' = m 100…0
  – Append a 1 followed by 0's
  – Pad m until it is 64 bits short of a multiple of 512
  – The message is always padded (i.e., even a message of 448 bits is padded)
• M[0 … N-1] = m' with the 64-bit length of m (in bits, low-order word first) appended to it
  – M is an array of 32-bit words; N is a multiple of 16
• Four-word buffer (A, B, C, D) initialized to (low-order bytes first):
  – A: 01 23 45 67
  – B: 89 ab cd ef
  – C: fe dc ba 98
  – D: 76 54 32 10
  – The hex digits just count from 0 to f and back

MD4 – Internal Functions
• F(X, Y, Z) = XY v not(X) Z
  – Bitwise conditional: if X then Y else Z
• G(X, Y, Z) = XY v XZ v YZ
  – Bitwise majority function: in bit positions where 2 or more inputs are 1, the output is 1, else the output is 0
• H(X, Y, Z) = X ⊕ Y ⊕ Z
  – Bit positions with an odd number of 1's are 1, the rest are 0
• Note: if the bits of X, Y, and Z are independent and unbiased, each bit of F(X, Y, Z) and each bit of G(X, Y, Z) will also be independent and unbiased
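The three functions translate directly into bitwise operations on 32-bit words. This is a small sketch (the `MASK` constant keeps Python's unbounded integers within 32 bits) with a truth-table check of the descriptions above.

```python
from itertools import product

MASK = 0xFFFFFFFF  # keep results within one 32-bit word


def F(x: int, y: int, z: int) -> int:
    """Bitwise conditional: if x then y else z."""
    return ((x & y) | (~x & z)) & MASK


def G(x: int, y: int, z: int) -> int:
    """Bitwise majority of x, y, z."""
    return ((x & y) | (x & z) | (y & z)) & MASK


def H(x: int, y: int, z: int) -> int:
    """Bitwise parity (XOR) of x, y, z."""
    return (x ^ y ^ z) & MASK


# Check all 8 single-bit input combinations against the verbal descriptions.
for x, y, z in product((0, 1), repeat=3):
    assert F(x, y, z) == (y if x else z)
    assert G(x, y, z) == (1 if x + y + z >= 2 else 0)
    assert H(x, y, z) == (x + y + z) % 2
print("truth tables match")
```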

MD4
for (i = 0 to N/16 - 1) {
    /* Copy block i into X. */
    for (j = 0 to 15) { X[j] = M[i*16 + j] }
    /* Save A, B, C, D. */
    AA = A
    BB = B
    CC = C
    DD = D
    /* Combine message block with A, B, C, D. */
    Round 1
    Round 2
    Round 3
    Increment A, B, C, D by their values (AA, BB, CC, DD) at the start of the iteration
} /* end for i */
Output A, B, C, D
• The loop body is the compression function
• A, B, C, D is the chaining variable

MD4 Round 1
• Function operates on the chaining variable, adds in the message block
/* [abcd k s] denotes a = (a + F(b, c, d) + X[k]) <<< s. */
/* 16 operations. */
[ABCD 0 3]   [DABC 1 7]   [CDAB 2 11]   [BCDA 3 19]
[ABCD 4 3]   [DABC 5 7]   [CDAB 6 11]   [BCDA 7 19]
[ABCD 8 3]   [DABC 9 7]   [CDAB 10 11]  [BCDA 11 19]
[ABCD 12 3]  [DABC 13 7]  [CDAB 14 11]  [BCDA 15 19]
• Note: each word rotates through each of the four positions for each value of s
• X[k] (M) is combined with A, B, C, D
• Words are used sequentially in round 1 (i.e., k = 0, 1, 2, …, 15 in order)

MD4 Round 2
/* [abcd k s] denotes a = (a + G(b, c, d) + X[k] + 0x5A827999) <<< s. */
/* 16 operations. */
[ABCD 0 3]  [DABC 4 5]  [CDAB 8 9]   [BCDA 12 13]
[ABCD 1 3]  [DABC 5 5]  [CDAB 9 9]   [BCDA 13 13]
[ABCD 2 3]  [DABC 6 5]  [CDAB 10 9]  [BCDA 14 13]
[ABCD 3 3]  [DABC 7 5]  [CDAB 11 9]  [BCDA 15 13]
• Word ordering altered from round 1

MD4 Round 3
/* [abcd k s] denotes a = (a + H(b, c, d) + X[k] + 0x6ED9EBA1) <<< s. */
/* 16 operations. */
[ABCD 0 3]  [DABC 8 9]   [CDAB 4 11]  [BCDA 12 15]
[ABCD 2 3]  [DABC 10 9]  [CDAB 6 11]  [BCDA 14 15]
[ABCD 1 3]  [DABC 9 9]   [CDAB 5 11]  [BCDA 13 15]
[ABCD 3 3]  [DABC 11 9]  [CDAB 7 11]  [BCDA 15 15]
• Word ordering partially altered from round 2

MD4 – End of Loop Addition
/* Increment each of A, B, C, D by the value it had before this block was started. */
A = A + AA
B = B + BB
C = C + CC
D = D + DD
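Putting padding, the three rounds, and the final addition together gives a compact MD4. This is an educational sketch following RFC 1320 (MD4 is broken and should not be used for security); the `ROUNDS` table layout and helper names are choices made here, not taken from the slides.

```python
import struct

MASK = 0xFFFFFFFF


def rotl(x: int, s: int) -> int:
    """Rotate a 32-bit word left by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK


def F(x, y, z): return (x & y) | (~x & z)
def G(x, y, z): return (x & y) | (x & z) | (y & z)
def H(x, y, z): return x ^ y ^ z


# Per round: (function, additive constant, shift amounts, order of message words).
ROUNDS = (
    (F, 0x00000000, (3, 7, 11, 19), tuple(range(16))),
    (G, 0x5A827999, (3, 5, 9, 13), (0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15)),
    (H, 0x6ED9EBA1, (3, 9, 11, 15), (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)),
)


def md4(message: bytes) -> bytes:
    # Padding: a 1 bit, then 0 bits up to 448 mod 512, then the 64-bit length in bits.
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack("<Q", bit_len)

    a, b, c, d = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    for off in range(0, len(padded), 64):
        X = struct.unpack("<16I", padded[off:off + 64])  # 16 little-endian words
        aa, bb, cc, dd = a, b, c, d                      # save chaining variable
        for func, const, shifts, order in ROUNDS:
            for step, k in enumerate(order):
                # [abcd k s]: a = (a + f(b, c, d) + X[k] + const) <<< s
                a = rotl((a + func(b, c, d) + X[k] + const) & MASK, shifts[step % 4])
                # Rotate roles so the next step is [DABC ...], then [CDAB ...], ...
                a, b, c, d = d, a, b, c
        # End-of-loop addition.
        a, b, c, d = (a + aa) & MASK, (b + bb) & MASK, (c + cc) & MASK, (d + dd) & MASK
    return struct.pack("<4I", a, b, c, d)


print(md4(b"abc").hex())
```

The test values used to check the sketch are the RFC 1320 test-suite digests for the empty string, "a", and "abc".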

MD4 Constants
• 5A827999: 32-bit constant representing the square root of 2 (the integer part of 2^30 · √2). The octal value is 013240474631.
• 6ED9EBA1: 32-bit constant representing the square root of 3 (the integer part of 2^30 · √3). The octal value is 015666365641.
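Both constants can be reproduced with exact integer arithmetic. The sketch uses `math.isqrt`, noting that the integer part of 2^30 · √n equals the integer square root of n · 2^60.

```python
import math

# floor(2^30 * sqrt(n)) == isqrt(n * 2^60), computed without floating point.
sqrt2_const = math.isqrt(2 << 60)
sqrt3_const = math.isqrt(3 << 60)

print(f"{sqrt2_const:08X} {sqrt2_const:012o}")
print(f"{sqrt3_const:08X} {sqrt3_const:012o}")
```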

MD5
• Designed to replace MD4
  – First two and last two rounds of MD4 attacked
• RFC 1321, Rivest
for (i = 0 to N/16 - 1) {
    /* Copy block i into X. */
    for (j = 0 to 15) { X[j] = M[i*16 + j] }
    /* Save A, B, C, D. */
    AA = A
    BB = B
    CC = C
    DD = D
    /* Combine message block with A, B, C, D. */
    Round 1
    Round 2
    Round 3
    Round 4
    Increment A, B, C, D by their values at the start of the iteration
} /* end for i */
Output A, B, C, D

MD5 Changes to MD4
1. A fourth round was added.
2. Each step now has a unique additive constant.
3. The function G changed from (XY v XZ v YZ) to (XZ v Y not(Z)), which is less symmetric.
4. The order in which input words are accessed in rounds 2 and 3 was changed, to make these patterns less like each other.
5. The shift amounts in each round were changed to produce a faster avalanche effect. The shifts in different rounds are distinct.

MD5 Internal Functions
• F(X, Y, Z) = XY v not(X) Z
• G(X, Y, Z) = XZ v Y not(Z)
• H(X, Y, Z) = X ⊕ Y ⊕ Z
• I(X, Y, Z) = Y ⊕ (X v not(Z))

MD5
• Uses a 64-element table T[1 … 64] constructed from the sine function
• T[i] equals the integer part of 4294967296 times abs(sin(i)), where i is in radians
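The table can be generated directly from that definition. In this sketch the list carries a dummy entry at index 0 so that `T[i]` matches the 1-based indexing used on the slides.

```python
import math

# T[i] = integer part of 2^32 * |sin(i)|, i in radians, for i = 1 ... 64.
T = [0] + [int(4294967296 * abs(math.sin(i))) for i in range(1, 65)]

print(f"T[1]  = 0x{T[1]:08x}")
print(f"T[64] = 0x{T[64]:08x}")
```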

MD5 Round 1
/* [abcd k s i] denotes a = b + ((a + F(b, c, d) + X[k] + T[i]) <<< s). */
/* 16 operations. */
[ABCD 0 7 1]    [DABC 1 12 2]    [CDAB 2 17 3]    [BCDA 3 22 4]
[ABCD 4 7 5]    [DABC 5 12 6]    [CDAB 6 17 7]    [BCDA 7 22 8]
[ABCD 8 7 9]    [DABC 9 12 10]   [CDAB 10 17 11]  [BCDA 11 22 12]
[ABCD 12 7 13]  [DABC 13 12 14]  [CDAB 14 17 15]  [BCDA 15 22 16]
• A constant is added in each step, and it varies from step to step

MD5 Round 2
/* [abcd k s i] denotes a = b + ((a + G(b, c, d) + X[k] + T[i]) <<< s). */
/* 16 operations. */
[ABCD 1 5 17]   [DABC 6 9 18]   [CDAB 11 14 19]  [BCDA 0 20 20]
[ABCD 5 5 21]   [DABC 10 9 22]  [CDAB 15 14 23]  [BCDA 4 20 24]
[ABCD 9 5 25]   [DABC 14 9 26]  [CDAB 3 14 27]   [BCDA 8 20 28]
[ABCD 13 5 29]  [DABC 2 9 30]   [CDAB 7 14 31]   [BCDA 12 20 32]
• Shift amounts are not reused across rounds

MD5 Round 3
/* [abcd k s i] denotes a = b + ((a + H(b, c, d) + X[k] + T[i]) <<< s). */
/* 16 operations. */
[ABCD 5 4 33]   [DABC 8 11 34]   [CDAB 11 16 35]  [BCDA 14 23 36]
[ABCD 1 4 37]   [DABC 4 11 38]   [CDAB 7 16 39]   [BCDA 10 23 40]
[ABCD 13 4 41]  [DABC 0 11 42]   [CDAB 3 16 43]   [BCDA 6 23 44]
[ABCD 9 4 45]   [DABC 12 11 46]  [CDAB 15 16 47]  [BCDA 2 23 48]
• Word ordering altered from round 2 to a greater extent than in MD4

MD5 Round 4
/* [abcd k s i] denotes a = b + ((a + I(b, c, d) + X[k] + T[i]) <<< s). */
/* 16 operations. */
[ABCD 0 6 49]   [DABC 7 10 50]   [CDAB 14 15 51]  [BCDA 5 21 52]
[ABCD 12 6 53]  [DABC 3 10 54]   [CDAB 10 15 55]  [BCDA 1 21 56]
[ABCD 8 6 57]   [DABC 15 10 58]  [CDAB 6 15 59]   [BCDA 13 21 60]
[ABCD 4 6 61]   [DABC 11 10 62]  [CDAB 2 15 63]   [BCDA 9 21 64]
• One more round than MD4

MD5 Addition
A = A + AA
B = B + BB
C = C + CC
D = D + DD
• Same as MD4

RIPEMD, Haval
• RIPEMD: modified MD4
  – Rotation changed
  – Order of message words altered
  – Two instances run in parallel with different constants; at the end of each block, the output of each is added to the chaining variables
• Haval: modified MD5
  – Processes 1024-bit message blocks instead of 512
  – Internal functions take 7 variables, nonlinear
  – Permutes input to round
  – Chaining variable: 8 segments instead of 4
  – Different constants

SHA-0
• Input m < 2^64 bits
• Output 160 bits
• Padded, processed in 512-bit blocks
• Each iteration takes a 160-bit chaining variable and a 512-bit block, and outputs a 160-bit chaining value
  – First chaining value is a fixed constant (IV)
  – Last chaining value is the output

SHA-0
• Pad m to a multiple of 512 bits
  – Append 1, 0's, length of m
• N 512-bit blocks
for (j = 0 to N-1) {
    512-bit block divided into 16 32-bit segments: m[0], m[1], …, m[15]
    Expand to 80 32-bit segments:
        for i = 16 to 79: m[i] = m[i-3] ⊕ m[i-8] ⊕ m[i-14] ⊕ m[i-16]
    80 32-bit words processed by round function
}
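The message expansion is short enough to sketch directly. The function below takes a `rotate_bits` parameter (a name chosen here): 0 gives the SHA-0 recurrence above, and 1 gives the SHA-1 variant, whose only change to the expansion is a 1-bit left rotation of each new word.

```python
MASK = 0xFFFFFFFF


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits (n = 0 leaves it unchanged)."""
    return ((x << n) | (x >> (32 - n))) & MASK


def expand(block_words: list[int], rotate_bits: int = 0) -> list[int]:
    """Expand 16 message words to 80: SHA-0 with rotate_bits=0, SHA-1 with rotate_bits=1."""
    if len(block_words) != 16:
        raise ValueError("a 512-bit block has exactly 16 32-bit words")
    w = list(block_words)
    for i in range(16, 80):
        x = w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]
        w.append(rotl(x, rotate_bits))
    return w


block = list(range(1, 17))          # toy block: words 1 ... 16
sha0_w = expand(block)              # SHA-0 expansion
sha1_w = expand(block, rotate_bits=1)  # SHA-1 expansion
print(len(sha0_w), sha0_w[16], sha1_w[16])
```

Without the rotation, bit j of every expanded word depends only on bit j of the message words, which is what the SHA-0 attacks later in the deck exploit.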

SHA-0 Round
[Figure: one SHA-0 round, rotating the segments of the chaining variable.]
• a, b, c, d, e are the chaining variable
• ki is a round constant
• Function f varies per round (see next slide)
• The function operates on the chaining variables

SHA-0 Round Function


SHS
• NIST FIPS 180-2, Secure Hash Standard (SHS), 2002
• SHA-1: message < 2^64 bits, 160-bit output
• SHA-256: message < 2^64 bits, 256-bit output
• SHA-384: message < 2^128 bits, 384-bit output
• SHA-512: message < 2^128 bits, 512-bit output

SHA-1 and SHA-256 Padding
• Message M of l bits
• Pad to a multiple of 512 bits
  – Append a 1
  – Append k 0's, where k is the smallest non-negative solution of l + 1 + k ≡ 448 mod 512
  – Append 64 bits equal to the binary representation of l
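For byte-oriented messages the padding rule can be sketched as below. The name `sha_pad` is illustrative; the sketch assumes the message length is a whole number of bytes, so the appended 1 bit and the first seven 0 bits together form the byte 0x80.

```python
import struct


def sha_pad(message: bytes) -> bytes:
    """Pad as in SHA-1/SHA-256: a 1 bit, k zero bits, then the 64-bit big-endian length l."""
    l = 8 * len(message)           # message length in bits
    k = (448 - (l + 1)) % 512      # smallest k >= 0 with l + 1 + k = 448 mod 512
    zero_bytes = (k - 7) // 8      # 7 of the k zero bits live inside the 0x80 byte
    return message + b"\x80" + b"\x00" * zero_bytes + struct.pack(">Q", l)


padded = sha_pad(b"abc")
print(len(padded), padded[-8:].hex())
```

A 56-byte (448-bit) message needs a second block, since the 1 bit and the length no longer fit in the first one.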

SHA-1 and SHA-256 Processing
• N 512-bit blocks
• Block denoted by M(i)
• 32-bit segment of block denoted by Mj(i)
  – i = ith block
  – j = jth 32-bit segment of the ith block, j = 0, 1, …, 15

SHA-1
• Internal functions, each operating on three 32-bit words

SHA-1 Constants used in the rounds:


SHA-1
• Initialization of the array containing the hash value
• Five 32-bit words

SHA-1 Algorithm
Pad the message M
Break into N 512-bit blocks
Initialize H
for i = 1 to N {
    Populate W with block i and rotate
    Initialize intermediate variables a, b, c, d, e
    80 rounds
    Update H
}
Output H

SHA-1 for i = 1 to N: Change from SHA-0, rotate 1 bit


SHA-1 Operating on chaining variable then adding in message block


SHA-1 Update chaining variable end of for i loop Output :


SHA-256
Pad the message M
Break into N 512-bit blocks
Initialize H
for i = 1 to N {
    Populate W with block i and rotate
    Initialize intermediate variables a, b, c, d, e, f, g, h
    64 rounds
    Update H
}
Output H

SHA-256
• Logical functions; inputs are 32-bit words
• Change from SHA-1

SHA-256
• Constants used in the rounds: 64 32-bit constants K0, K1, …, K63
• Change from SHA-1

SHA-256
• Initialization of the chaining variable
• Eight 32-bit words

SHA-256 for i = 1 to N: Change from SHA 1


SHA-256 Operating on chaining variable then adding in message block Change from SHA 1


SHA-256 Update chaining variable end of for i loop Output :


Collisions - Uses
• Generate meaningful files with the same hash
• Example: code download
  – Replace the code; the hash file remains unchanged

Birthday Paradox
• 23 people in a room
• Probability that 2 have the same birthday is about 50%
In general:
• Given a "random" mapping H(xi) = yi
• n inputs, k possible outputs
• n(n-1)/2 possible input pairs
• Probability that H(xi) = H(xj) is 1/k
• Approximation:
  – Need about k/2 pairs for a 50% probability of a match, so want n > k^(1/2)
• Hash functions: want low probability of a match
• The larger k is, the more inputs an attacker must try
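Both the 23-person figure and the square-root rule can be checked numerically. The sketch below computes the exact match probability, then runs a birthday search on SHA-256 truncated to 16 bits (k = 2^16; the truncation and the counter-based inputs are assumptions made to keep the search instant).

```python
import hashlib


def birthday_probability(n: int, k: int) -> float:
    """Probability that at least two of n uniformly random values out of k coincide."""
    p_no_match = 1.0
    for i in range(n):
        p_no_match *= (k - i) / k
    return 1.0 - p_no_match


def find_truncated_collision(bits: int = 16):
    """Try inputs "0", "1", "2", ... until two share the same truncated SHA-256 value."""
    seen = {}
    nbytes = bits // 8
    counter = 0
    while True:
        msg = str(counter).encode()
        tag = hashlib.sha256(msg).digest()[:nbytes]
        if tag in seen:
            return seen[tag], msg, counter + 1
        seen[tag] = msg
        counter += 1


print(f"{birthday_probability(23, 365):.4f}")
a, b, tries = find_truncated_collision()
print(a, b, tries)
```

With k = 2^16 the search is expected to need on the order of √k = 256 inputs, far fewer than the 65,536 possible outputs.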

Collisions - Uses
• Generate certificates with the same hash
• X.509 certificate:
  – Name, usage, extension…
  – RSA public key: (n, e)
  – Signature of CA
• Attack
  – Two certificates identical except for the RSA modulus: n1, n2 with n1 ≠ n2
  – MD5(n1) = MD5(n2)
• Lenstra, Wang, de Weger, Colliding X.509 Certificates, eprint, March 2005
• Certificates available at http://www.win.tue.nl/~bdeweger/CollidingCertificates/
• The CA certificate is self-signed
• The MD5 hashes are not identical when the 4-byte ASN.1 header is included
* Certificate image from Yin, ACNS 2005

Collisions - Differentials
• Recall differential cryptanalysis from block ciphers
  – Look at the Δ of inputs and the Δ of outputs after each round
• In block ciphers, the mapping is 1 to 1
  – Never have Δin ≠ 0 giving Δout = 0
• In hash functions, the round output is shorter than the input
  – There exists Δin ≠ 0 giving Δout = 0
  – Need to find these Δ's

Collisions - Differentials
• Δm ≠ 0 for two inputs can produce Δ = 0 in an intermediate value of the hash
• MD4, MD5, SHA* … are of the form:
M = [m0, m1, …, mn]
for (i = 0 to n) {
    Process mi, produce Hi (the chaining variable)
    Process mi+1, combine with Hi to get Hi+1
    …
}
• Furthermore, mi is only added into the chaining variables

Collisions - Differentials
• M = [m0, m1, …, mn]
• M' = [m'0, m'1, …, m'n]
• Find m0, m'0 that produce the same chaining variable
• Set mi = m'i for the remaining blocks
• Padding: MD4, MD5, and SHA* append the message length, so M and M' must have the same length for the final padded blocks to match
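The "collide in the first block, then reuse the remaining blocks" idea can be demonstrated on a toy chained hash. Everything here is a deliberately weak stand-in: `compress` is SHA-256 truncated to a 16-bit chaining value so that a first-block collision can be found by brute force in a moment, and the block contents are arbitrary.

```python
import hashlib

BLOCK = 8             # toy block size in bytes
IV = b"\x00\x00"      # toy 16-bit initial chaining value


def compress(chain: bytes, block: bytes) -> bytes:
    """Toy compression function F with a 16-bit output (far too small to be secure)."""
    return hashlib.sha256(chain + block).digest()[:2]


def toy_hash(blocks: list[bytes]) -> bytes:
    """Chain F over the blocks, starting from the IV."""
    chain = IV
    for blk in blocks:
        chain = compress(chain, blk)
    return chain


def find_first_block_collision() -> tuple[bytes, bytes]:
    """Birthday search for m0 != m0' with F(IV, m0) == F(IV, m0')."""
    seen = {}
    counter = 0
    while True:
        blk = counter.to_bytes(BLOCK, "big")
        out = compress(IV, blk)
        if out in seen:
            return seen[out], blk
        seen[out] = blk
        counter += 1


m0, m0_prime = find_first_block_collision()

# Identical remaining blocks (equal-length messages, so any length padding matches too).
suffix = [b"block 01", b"block 02", b"pad+len."]
M = [m0] + suffix
M_prime = [m0_prime] + suffix
print(m0 != m0_prime, toy_hash(M) == toy_hash(M_prime))
```

Once the chaining variables agree after the first block, every later iteration sees identical inputs, so the collision survives any common suffix.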

Collisions - Differentials
• Processing message M in blocks
M = [m0, m1, …, mn]
M' = [m'0, m'1, …, m'n]
• Find messages that produce Δx through block i with high probability
• Set the jth block of the messages to cancel Δx
• The remaining blocks of both messages can be equal

Existing Hashes: General Structure
• Message m is padded to M, a multiple of a fixed-length block
• M is divided into segments m1, m2, …, mn
[Figure: Merkle-Damgard chain from IV through F over m1, m2, …, mn to the hash value.]
• F is called the compression function
  – Takes as inputs mi and the output of the previous iteration
  – Typically a series of rounds
  – Output is called a "chaining variable"

Collisions
• Find M1 ≠ M2 that produce Δx through block i with high probability
• Set subsequent blocks of M1, M2 to cancel Δx
• The remaining blocks of M1, M2 can be equal
[Figure: two chains of F from the same IV, one over the blocks of M1 and one over the blocks of M2. After block i the chaining values x^1_i and x^2_i differ by Δx; after block i+1, x^1_{i+1} = x^2_{i+1} = Z; both chains end in the same hash H.]

Workload of Known Attacks (2005)
Hash function    Expected strength    Known attacks
MD4              2^64                 O(3)
MD5              2^64                 O(2^30+)
SHA-0            2^80                 O(2^39)
SHA-1            2^80                 O(2^63)

Collisions
• Wang, Feng, Lai, Yu, 2004
• Collisions for MD4, MD5, Haval-256, RIPEMD
• Message modification
  – Find a differential path that produces a possible collision
  – Identify the conditions for the path to hold
  – Modify message words to follow the path in the first round (and in the second round as much as possible)
• Setting messages in this manner increases the chance that the collision holds

MD4 Collisions
• M' = M + C
• C = (0, 2^31, -2^28 + 2^31, 0, 0, 0, 0, 0, 0, 0, 0, 0, -2^16, 0, 0, 0)
• MD4(M) = MD4(M')
• mi for i = 1, 2, 12 differ between M and M'

MD4 Collision Example
m0[16] = {
    0xa8b1b641, 0x88d2ecaf, 0xb7d7c1a1, 0x99044241,
    0xffef1639, 0x1934bdcf, 0x30e2adb8, 0x252ac4b4,
    0x7bad86a5, 0x7883f30e, 0x8b37f23b, 0xd694dce0,
    0x701d8b69, 0x045095eb, 0x92012e03, 0x71ed419e }
m1[16] = {
    0xa8b1b641, 0x08d2ecaf, 0x27d7c1a1, 0x99044241,
    0xffef1639, 0x1934bdcf, 0x30e2adb8, 0x252ac4b4,
    0x7bad86a5, 0x7883f30e, 0x8b37f23b, 0xd694dce0,
    0x701c8b69, 0x045095eb, 0x92012e03, 0x71ed419e }
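The two example blocks can be checked against the differential from the MD4 Collisions slide. The sketch below only compares the message words modulo 2^32; it does not run MD4 itself, so it confirms the difference pattern (words 1, 2, and 12) rather than the collision.

```python
m0 = [0xa8b1b641, 0x88d2ecaf, 0xb7d7c1a1, 0x99044241,
      0xffef1639, 0x1934bdcf, 0x30e2adb8, 0x252ac4b4,
      0x7bad86a5, 0x7883f30e, 0x8b37f23b, 0xd694dce0,
      0x701d8b69, 0x045095eb, 0x92012e03, 0x71ed419e]
m1 = [0xa8b1b641, 0x08d2ecaf, 0x27d7c1a1, 0x99044241,
      0xffef1639, 0x1934bdcf, 0x30e2adb8, 0x252ac4b4,
      0x7bad86a5, 0x7883f30e, 0x8b37f23b, 0xd694dce0,
      0x701c8b69, 0x045095eb, 0x92012e03, 0x71ed419e]

MOD = 2**32

# Word-wise differences m1 - m0 (mod 2^32), keeping only the words that differ.
diffs = {i: (b - a) % MOD for i, (a, b) in enumerate(zip(m0, m1)) if a != b}

# Nonzero entries of C: 2^31 at word 1, -2^28 + 2^31 at word 2, -2^16 at word 12.
expected = {1: 2**31, 2: (-2**28 + 2**31) % MOD, 12: (-2**16) % MOD}

print(sorted(diffs), diffs == expected)
```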

MD5 – Prior Collisions
• Den Boer and Bosselaers: pseudo-collision for MD5 (the same message with different IVs)
• Dobbertin: two different 512-bit messages that collide under a chosen initial value different from that in MD5

MD5 Collisions
• Messages M || N and M' || N'; M, N each 512 bits
• N, N' cancel the differential from M, M'
• Uses the initial IV of MD5
• Two 1024-bit messages
• ~1 hour to find the M's, 5-15 minutes to find the N's

SHA-0 Collisions
• Joux, 2004
• Pair of 2048-bit inputs
• 80K CPU hours (3 weeks)
• Work equivalent of 2^51 hashes
• Local differentials of 6 steps

SHA-0 Collisions
• Wang, Yu, Yin, 2005
• Full collision in equivalent work of 2^39 hashes
• Impossible differential in rounds 2-4
• Local collisions in round 1
• Conditions under which this differential path holds

SHA-0 Local Collisions
• Wang, 1997
• Let m[i,j] = jth bit of the ith message word
• 6-step local collision that can start at any step i
  – Used to construct full collisions
• If a message difference in bit j first occurs in step i:
  – The difference will affect chaining variables a, b, c, d, e consecutively in the next five steps
  – To offset these differences and reach a local collision, differences are introduced in subsequent message words
• The probability associated with the local collision depends on the boolean function, j, and conditions on the message bits
• One attack:
  – j = 2
  – Conditions m[i,2] = ¬m[i+1,7] and m[i,2] = ¬m[i+2,2]

SHA-0 Local Collision nc = no carry


SHA-0 Collisions
• Biham and Chen, 2004
  – Started at i > 17
  – Start at i = 22, work O(2^56) hashes
  – Near collisions with work O(2^40) hashes
• Biham and Chen (2004), Joux (2004), Wang and Yu (2005)
  – Multi-block collisions
  – Use near collisions in several blocks to produce an overall collision
  – Used with the above, Joux had the first full collision, work O(2^51) hashes

SHA-1 Collisions
• Rijmen and Oswald, 2005: collisions for a version reduced to 53 rounds
• Wang, Yin, Yu, 2005: collisions without padding on the full 80 rounds, work O(2^69) hashes
• Wang, Yao, 2005: improved the above attack to work O(2^63) hashes
• Rechberger and De Cannière, 2006: way to choose part of the message

SHA-256
• Round function has a local collision with probability in the range 2^-9 to 2^-39
• Message expansion is more complicated than in SHA-0, SHA-1
  – Expansion of a block from 16 to 64 words

Colliding Certificates
• Fill in all fields except for
  – RSA public key modulus, n
  – Signature (except for the first zero byte, which prevents the bit string from being a negative integer)
• Three requirements:
  – Compliant with X.509 and the ASN.1 DER
  – Byte lengths of the modulus, n, and the public exponent, e, fixed in advance
    • Can fix e as the "Fermat-4" number e = 65537; the same e is used in both certificates
  – Position where the public key modulus starts is an exact multiple of 64 bytes after the beginning of the "to be signed" part
    • Do this by adding dummy information to the subject Distinguished Name
    • i.e., the part prior to n should be an integral number of message blocks, to which the message blocks containing n are appended when computing the MD5 hash
• Run MD5 on the first portion of the "to be signed" part, truncated at the position where n starts
  – This input to MD5 is an exact multiple of 512 bits
  – Suppress the padding normally used in MD5; use the output as the IV for the next step
  – i.e., this is the chaining variable input to the iteration in which n starts to be processed
• Construct two different 1024-bit strings b1 and b2 for which the MD5 compression function, with the IV from the previous step, produces a collision

Colliding Certificates
• Construct two RSA moduli from b1 and b2 by appending to each the same 1024-bit string b
  – Generate random primes p1 and p2 of ~512 bits, such that e is coprime to p1 − 1 and p2 − 1
  – Compute b0 between 0 and p1·p2 such that p1 | b1·2^1024 + b0 and p2 | b2·2^1024 + b0
  – For k = 0, 1, 2, …:
    • Compute b = b0 + k·p1·p2
    • Check whether both q1 = (b1·2^1024 + b)/p1 and q2 = (b2·2^1024 + b)/p2 are primes
    • Check whether e is coprime to both q1 − 1 and q2 − 1
  – If k is such that b ≥ 2^1024, restart with new random primes p1, p2
  – Stop when q1 and q2 have been found; output
    • n1 = b1·2^1024 + b
    • n2 = b2·2^1024 + b
    • p1, p2, q1, q2
• Expect that this algorithm will produce, in a reasonable time, two RSA moduli n1 = p1·q1 and n2 = p2·q2 that form an MD5 collision with the specified IV
  – When p1 and p2 are around 500 bits in size, it usually takes a few minutes
  – A few days with 512-bit p1, p2 and 1536-bit q1, q2 (search space for k nearly empty)
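The step that computes b0 is an application of the Chinese Remainder Theorem. The sketch below shows only that step, with tiny stand-in values (16-bit "collision blocks", 3-digit primes, and a 32-bit shift instead of 1024) chosen here purely for illustration; the function name `crt_b0` is hypothetical, and the prime search over k is left out.

```python
def crt_b0(b1: int, b2: int, p1: int, p2: int, shift_bits: int) -> int:
    """Return b0 in [0, p1*p2) with p1 | b1*2^shift + b0 and p2 | b2*2^shift + b0.

    Requires p1 and p2 to be distinct primes (so p1 is invertible modulo p2).
    """
    r1 = (-b1 << shift_bits) % p1   # b0 must be congruent to -b1 * 2^shift mod p1
    r2 = (-b2 << shift_bits) % p2   # b0 must be congruent to -b2 * 2^shift mod p2
    # CRT: b0 = r1 + p1 * t, with t chosen so that b0 is congruent to r2 mod p2.
    t = ((r2 - r1) * pow(p1, -1, p2)) % p2
    return r1 + p1 * t


# Toy parameters (assumptions for the demo, not the sizes used in the attack).
b1, b2 = 0xBEEF, 0xCAFE
p1, p2 = 1009, 1013
SHIFT = 32

b0 = crt_b0(b1, b2, p1, p2, SHIFT)
print(b0, ((b1 << SHIFT) + b0) % p1, ((b2 << SHIFT) + b0) % p2)

# Every b = b0 + k*p1*p2 keeps both divisibility conditions, which is what the loop over k uses.
b = b0 + 5 * p1 * p2
print(((b1 << SHIFT) + b) % p1 == 0, ((b2 << SHIFT) + b) % p2 == 0)
```

In the full procedure each candidate b below 2^1024 would then be tested for whether both quotients q1 and q2 are prime.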

Colliding Certificates
• Insert the modulus n1 into the certificate; the "to be signed" part is now complete
• Compute the MD5 hash of the entire "to be signed" part (including MD5 padding, with the standard MD5 IV)
• Apply PKCS#1 v1.5 padding, and perform a modular exponentiation using the issuing Certification Authority's private key
  – This gives the signature, which is added to the certificate
  – The first certificate is now complete
• Second certificate: use n2 as the public key modulus; the signature is still valid

NIST Hash Workshop
• Proposed competition: SHA-3
  – Call for proposals: Nov. 2007
  – Allow time for public comments on hash function requirements
  – Review submissions starting in 2009
  – Select winner at the end of 2011
  – Standard released in 2012
• Then time to integrate into applications
• 6+ years before standard replacement

SHA-3 Requirements
• NIST does not currently plan to withdraw SHA-2
• SHA-3 can be directly substituted for SHA-2 in current applications
  – Must provide message digests of 224, 256, 384, and 512 bits to allow substitution for the SHA-2 family
• The 160-bit hash value produced by SHA-1 is becoming too small to use for digital signatures
  – A 160-bit replacement hash algorithm is not contemplated

SHA-3 Requirements
• Certain properties of the SHA-2 hash functions must be preserved
  – Input parameters, output sizes
  – Collision resistance, preimage resistance, second-preimage resistance
  – "One-pass" streaming mode of execution
• A successful attack on the SHA-2 hash functions should be unlikely to be applicable to SHA-3
• Be suitably flexible for a wide variety of implementations
• May offer additional properties
  – Randomized hashing
  – Parallelizable
  – More efficient to implement on some platforms
  – More suitable for certain applications
  – May avoid some of the incidental "generic" properties (such as length extension) of the Merkle-Damgard construct that often result in insecure applications
• For interoperability, prefer a single hash algorithm family (different size message digests should be internally generated in as similar a manner as possible)

Approaches
• Augment existing hash functions:
  – Add bit count to input, as part of the chaining value
  – Padding: 1, 0's, hash size, message length
• New algorithms, with an approach similar to past hashes (chaining)
• New approaches that avoid Merkle-Damgard chaining
• Algorithms related to mathematically hard problems

Alternatives
• CMC mode
  – Forward and backward diffusion
  – Use last block
  – Inefficient: memory requirements
  – More than double the computational work
  [Figure: CMC mode over blocks m1 … m4, producing the hash.]
• Binary tree
  – Block cipher, PRP construct as nodes
  – Double the computational work
  [Figure: binary tree over blocks m1 … m4, producing the hash.]

Alternatives
• Hybrid
  – CMC mode or chaining on segments
  – A subset of the output forms the next layer
[Figure: several segments of m blocks each, feeding the next layer.]

Segment Processing
• CMC mode:
  – Forward and backward diffusion across blocks
  – A collision implies a weakness in the block cipher
[Figure: CMC mode over blocks m1 … m4, producing the segment output.]

Segment Processing
[Figure: elastic chaining over 128-bit blocks, from the IV and the input to the segment output.]
• Elastic Chaining mode:
  – Only forward diffusion … but less memory/intermediate state than CMC mode
  – If the key is based on the first block, a collision implies a weakness in the block cipher
  – Use an elastic version of a 128-bit cipher or a 256-bit cipher

Segment Processing
• Don't use existing forward chaining within a segment
  – The differential attack applies to the segment
[Figure: two chains of F from the same IV over segment blocks m^1_1 … m^1_l and m^2_1 … m^2_l. The chaining values differ by Δx after the second block and are equal (x^1_3 = x^2_3 = Z) after the third, giving the same segment output.]