
CSC 482/582: Computer Security Hash Functions CSC 482/582: Computer Security Slide #1

Topics
1. Hash Functions
2. Applications of Hash Functions
3. Secure Hash Functions
4. Collision Attacks
5. Pre-Image Attacks
6. Current Hash Functions
7. HMAC: Keyed Hash Functions

Hash Functions
Hash function h: M → MD
  • Input M: variable-length message
  • Output MD: fixed-length "message digest" of the input
Many inputs produce the same output (called a hash collision)
  • Limited number of outputs; infinite number of inputs
  • Avalanche effect: small input change -> big output change
Example hash function
  • Sum the 32-bit words of the message mod $2^{32}$
Diagram: M → h → MD = h(M)
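The sum-of-words example hash can be sketched in a few lines of Python. The zero-padding rule and big-endian word order below are assumptions (the slide does not specify them), and the name `sum32_hash` is hypothetical. The sketch also shows why this function is not secure: reordering the words of a message leaves the sum unchanged, so collisions are trivial to construct.

```python
import struct

def sum32_hash(message: bytes) -> int:
    """Toy hash: sum the message's 32-bit words modulo 2**32."""
    # Assumption: zero-pad to a multiple of 4 bytes, big-endian words.
    padded = message + b"\x00" * (-len(message) % 4)
    words = struct.unpack(f">{len(padded) // 4}I", padded)
    return sum(words) % 2**32

# Swapping two words gives a different message with the same digest.
m1 = b"AAAABBBB"
m2 = b"BBBBAAAA"
print(sum32_hash(m1) == sum32_hash(m2))  # the two digests collide
```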

Applications of Hash Functions
Verifying file integrity
  • How do you know that a file you downloaded was not corrupted during download?
Storing passwords (confidentiality)
  • To avoid compromise of all passwords by an attacker who has gained admin access, store hashes of passwords.
  • Additional features are needed for secure password storage.
Digital signatures (authentication)
  • Cryptographic verification that data was downloaded from the intended source and not modified.
  • Used for operating system patches and packages.
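A minimal sketch of the file-integrity use case, assuming the publisher distributes a SHA-256 digest alongside the file. The function names and the sample data are hypothetical; only `hashlib.sha256` and `hmac.compare_digest` come from the Python standard library.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify_download(data: bytes, published_digest: str) -> bool:
    """Compare the digest of the downloaded bytes with the published one."""
    return hmac.compare_digest(sha256_hex(data), published_digest.lower())

original = b"example package contents"   # stand-in for a downloaded file
published = sha256_hex(original)          # digest the publisher would post
corrupted = original[:-1] + b"X"          # one byte damaged in transit

print(verify_download(original, published))   # True
print(verify_download(corrupted, published))  # False
```

This check only detects accidental corruption. If an attacker can replace both the file and the published digest, a keyed construction such as HMAC or a digital signature is needed.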

Why attack hash functions?
Create a forged security certificate to
  • Make a phishing site appear legitimate.
  • Bypass code signing checks on updates.
Distribute malware
  • Replace a legitimate app with a malware app.
  • Ensure both apps have the same legitimate hash value, so victims cannot distinguish between them.
Forge digital signatures
  • Replace a contract where the victim pays $50 to the attacker with one where the victim pays $5,000.

Flame Malware
Cyber espionage tool discovered in 2012
  • Records audio, screenshots, Bluetooth, and file data.
  • Exfiltrates data via an SSL-encrypted channel.
Bypassed code signing security in MS Windows
  • Used a hash collision to create a certificate apparently signed by the Microsoft Certificate Authority.
  • Malware was digitally signed with the forged certificate.
  • Code signing accepted the malware as valid because the certificate was apparently signed by the MS CA.
Attack could be used as a MITM attack on MS Update
  • Attacker substitutes malware for a Windows patch.

Avalanche Effect
The avalanche effect is shown when a small change to the input of a block cipher or hash function makes a large change in the output.
Hashing "Cryptography":
  MD5 (128-bit) = 64ef07ce3e4b420c334227eecb3b3f4c
  SHA1 (160-bit) = b804ec5a0d83d19d8db908572f51196505d09f98
Hashing "Cryptography1":
  MD5 (128-bit) = 443d4fb1fedeb86b69582169c2719c24
  SHA1 (160-bit) = 838498e48147106062a64c523ddfe11bd07a5eac
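The avalanche effect can be measured by counting how many output bits change between the two inputs. The sketch below uses Python's `hashlib`; the helper names `differing_bits` and `avalanche` are hypothetical. For a well-behaved hash, roughly half of the output bits are expected to differ.

```python
import hashlib

def differing_bits(x: bytes, y: bytes) -> int:
    """Count the bit positions at which two equal-length digests differ."""
    return sum(bin(a ^ b).count("1") for a, b in zip(x, y))

def avalanche(algorithm: str, m1: bytes, m2: bytes) -> tuple[int, int]:
    """Return (differing bits, total bits) for the digests of m1 and m2."""
    d1 = hashlib.new(algorithm, m1).digest()
    d2 = hashlib.new(algorithm, m2).digest()
    return differing_bits(d1, d2), len(d1) * 8

for algorithm in ("md5", "sha1"):
    changed, total = avalanche(algorithm, b"Cryptography", b"Cryptography1")
    print(f"{algorithm}: {changed} of {total} output bits differ")
```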

Secure Hash Function
A function h = hash(m) must have 3 properties to be secure:
1. Pre-image resistance: Given a hash h, it should be difficult to find any message m such that h = hash(m). Functions that lack this property are vulnerable to pre-image attacks.
2. Second pre-image resistance: Given an input m1, it should be difficult to find another input m2 such that m1 ≠ m2 and hash(m1) = hash(m2). Functions that lack this property are vulnerable to second pre-image attacks.
3. Collision resistance: It should be difficult to find two different messages m1 and m2 such that hash(m1) = hash(m2). Such a pair is called a cryptographic hash collision. This property is sometimes referred to as strong collision resistance. It requires a hash value at least twice as long as that required for pre-image resistance; otherwise collisions may be found by a birthday attack.

Pre-image Attacks
A pre-image attack attempts to find a message m that has a specific hash value h, such that h = hash(m).
  • A brute force attack is possible with $2^n$ operations, where n is the length of the hash value in bits.
  • For n >= 64, brute force is considered infeasible.
Pre-image attacks better than brute force would allow
  • Forgery of digital signatures.
  • Better cracking of hashed passwords.
No practical pre-image attacks exist against widely used hash functions.
  • The best known MD5 pre-image attack is theoretical, requiring about $2^{123.4}$ operations.
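To make the $2^n$ brute-force cost concrete, the sketch below attacks a deliberately weakened hash: SHA-256 truncated to n = 16 bits, so about $2^{16}$ guesses are expected. The truncation, the counter-based candidate messages, and the function names are assumptions chosen for illustration; the same search against a full-length digest is infeasible.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-256, used here as a toy n-bit hash."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def brute_force_preimage(target: int, bits: int) -> tuple[bytes, int]:
    """Try candidate messages "1", "2", ... until one hashes to target."""
    for attempts in count(1):
        candidate = str(attempts).encode()
        if truncated_hash(candidate, bits) == target:
            return candidate, attempts

BITS = 16
target = truncated_hash(b"secret message", BITS)
preimage, attempts = brute_force_preimage(target, BITS)
print(f"found pre-image {preimage!r} after {attempts} attempts")
```

Each extra bit of output doubles the expected work, which is why the search stops being practical long before n reaches the sizes used by real hash functions.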

Collision Attacks
A collision attack attempts to find two different messages m1 and m2 such that hash(m1) = hash(m2).
Collisions must exist because there are more inputs than fixed-size outputs for hash functions.
Pigeonhole principle: if there are n containers for n+1 objects, then at least one container will hold at least 2 objects.
Collision attacks do not impact password hashing, but do allow forged certificates and signatures.

The Birthday Paradox
The birthday paradox concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday.
By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367. However, 99% probability is reached with just 57 people, and 50% probability with 23 people.
The birthday paradox is a violation of our intuition, not a true paradox. It arises because the chance of shared birthdays increases with the number of unique pairs of people, which is n(n-1)/2 for n people.
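The 23-person and 57-person figures can be checked directly. The sketch below assumes 365 equally likely birthdays (ignoring leap days, which the 367 figure accounts for); the function name is hypothetical.

```python
def shared_birthday_probability(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= max(days - i, 0) / days
    return 1.0 - p_all_distinct

for n in (22, 23, 56, 57):
    print(n, round(shared_birthday_probability(n), 4))
```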

Birthday Attack
A birthday attack exploits the mathematics behind the birthday problem to find hash collisions.
  • Suppose a hash function h has a b-bit output.
  • Therefore there are $2^b$ possible hash values.
Attacker generates many random messages
  • Computes the hash of each one.
  • Searches for pairs of messages with the same hash value.
  • By similar mathematics as in the birthday problem, the attacker can find a collision with about $2^{b/2}$ messages.
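A birthday attack is easy to demonstrate against a short digest. The sketch below assumes a toy hash made by truncating SHA-256 to b = 24 bits, so a collision is expected after roughly $2^{12}$ messages rather than $2^{24}$. The message format and function names are hypothetical, and the messages are generated from a counter rather than at random so that the run is repeatable.

```python
import hashlib
from itertools import count

def truncated_hash(message: bytes, bits: int) -> int:
    """Top `bits` bits of SHA-256, used here as a toy b-bit hash."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def birthday_collision(bits: int) -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two of them share a hash value."""
    seen: dict[int, bytes] = {}   # hash value -> message that produced it
    for attempts in count(1):
        message = f"message-{attempts}".encode()
        h = truncated_hash(message, bits)
        if h in seen:
            return seen[h], message, attempts
        seen[h] = message

BITS = 24
m1, m2, attempts = birthday_collision(BITS)
print(f"{m1!r} and {m2!r} collide after {attempts} messages")
```

The dictionary trades memory for time: every hash value seen so far is stored so that each new message is compared against all earlier ones in constant time.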

Birthday Attack Analysis
The birthday attack procedure follows these steps:
1. Randomly generate a sequence of plaintexts $X_1, X_2, X_3, \ldots$
2. For each $X_i$ compute $y_i = h(X_i)$ and test whether $y_i = y_j$ for some $j < i$.
3. Stop as soon as a collision has been found.
If there are $m$ possible hash values, the probability that the $i$th plaintext does not collide with any of the previous $i - 1$ plaintexts is $1 - (i - 1)/m$.
The probability $F_k$ that the attack fails (no collisions) after $k$ plaintexts is
$F_k = (1 - 1/m)(1 - 2/m)(1 - 3/m) \cdots (1 - (k - 1)/m)$
Using the standard approximation $1 - x \approx e^{-x}$,
$F_k \approx e^{-(1/m + 2/m + 3/m + \cdots + (k-1)/m)} = e^{-k(k-1)/(2m)}$
The attack succeeds/fails with probability ½ when $F_k = 1/2$, that is,
$e^{-k(k-1)/(2m)} = 1/2$, so $k(k-1) = 2m \ln 2$ and $k \approx \sqrt{2 \ln 2}\, m^{1/2} \approx 1.17\, m^{1/2}$
We conclude that a hash function with b-bit values ($m = 2^b$) provides ~b/2 bits of security.
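The approximation $k \approx 1.17\, m^{1/2}$ can be compared with the exact product for $F_k$. The sketch below finds the smallest $k$ with $F_k \le 1/2$; the function name is hypothetical.

```python
import math

def attempts_for_half(m: int) -> int:
    """Smallest k such that F_k = (1 - 1/m)...(1 - (k-1)/m) <= 1/2."""
    f = 1.0   # F_1 = 1: a single plaintext cannot collide
    k = 1
    while f > 0.5:
        f *= 1 - k / m   # multiply in the factor (1 - k/m) to get F_{k+1}
        k += 1
    return k

for m in (365, 2**16, 2**20):
    exact = attempts_for_half(m)
    approx = 1.17 * math.sqrt(m)
    print(f"m={m}: exact k={exact}, 1.17*sqrt(m)={approx:.1f}")
```

For m = 365 the exact answer is the familiar 23 people, against an estimate of about 22.4.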

Merkle–Damgård construction
  • Select a one-way compression function f(m, d).
  • Apply it repeatedly to the fixed-size blocks m_i of the message.
  • Use the output d_i of the previous stage as the second input to the next stage.
  • Start with initialization vector d_0 = IV.
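The iteration can be sketched as a loop over message blocks. Everything concrete below is an assumption made for illustration: the 16-byte block size, the all-zero IV, the zero padding, and a compression function built from truncated SHA-256. Real designs such as MD5 and SHA-2 use purpose-built compression functions and also encode the message length in the padding.

```python
import hashlib

BLOCK_SIZE = 16     # bytes per message block (toy value)
IV = bytes(8)       # d_0: hypothetical all-zero initialization vector

def compress(block: bytes, chaining: bytes) -> bytes:
    """Toy compression function f(m_i, d_{i-1}) -> d_i (8 bytes)."""
    return hashlib.sha256(chaining + block).digest()[:8]

def md_hash(message: bytes) -> bytes:
    """Iterate the compression function over the message blocks."""
    # Assumption: zero-pad the final block; real hashes add the length too.
    padded = message + b"\x00" * (-len(message) % BLOCK_SIZE)
    d = IV
    for i in range(0, len(padded), BLOCK_SIZE):
        d = compress(padded[i:i + BLOCK_SIZE], d)
    return d

print(md_hash(b"a message spanning more than one block").hex())
```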

Length Extension Attacks
Let M and N be messages, such that
  • M is the sequence of blocks m_1 through m_k
  • N is the sequence of blocks m_1 through m_{k+1}
  • That is, N is M with the addition of block m_{k+1}
Merkle–Damgård construction means
  • h(M), the hash of message M, is also
  • the intermediate value after k blocks in computing h(N).
How could this be harmful?
  • Suppose messages are authenticated using h(X || M).
  • X is a secret known only to Alice and Bob, but still
  • Eve can create message N and compute h(X || N) from h(X || M) alone.
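The attack can be demonstrated on a toy Merkle–Damgård hash. The compression function, block size, IV, and sample messages below are all hypothetical, and the toy hash handles whole blocks only, with no padding. Against real hashes such as MD5 or SHA-256 the attacker must also account for the length padding, but the idea is the same: the published tag is a valid internal state from which hashing can resume.

```python
import hashlib

BLOCK_SIZE = 16
IV = bytes(8)

def compress(block: bytes, chaining: bytes) -> bytes:
    """Toy compression function (an assumption, not a real design)."""
    return hashlib.sha256(chaining + block).digest()[:8]

def md_hash(message: bytes, state: bytes = IV) -> bytes:
    """Toy Merkle-Damgard hash over whole blocks, starting from `state`."""
    if len(message) % BLOCK_SIZE != 0:
        raise ValueError("toy hash accepts whole blocks only")
    for i in range(0, len(message), BLOCK_SIZE):
        state = compress(message[i:i + BLOCK_SIZE], state)
    return state

# Alice authenticates M with the secret X by sending the tag h(X || M).
secret = b"alice-bob-secret"     # X: never seen by Eve
message = b"pay eve $50 now."    # M
tag = md_hash(secret + message)

# Eve knows only `message` and `tag`. She resumes hashing from the tag.
extension = b" and $5000 more!"  # the extra block m_{k+1}
forged_tag = md_hash(extension, state=tag)

# The forged tag matches what Alice or Bob would compute for X || N.
print(forged_tag == md_hash(secret + message + extension))  # True
```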

Chosen Prefix Attacks
A collision attack on Merkle–Damgård hashes, in which
  • Attacker begins with two different prefixes p1, p2
  • Attempts to find two suffixes m1 and m2
  • such that hash(p1 ∥ m1) = hash(p2 ∥ m2).
Such an attack allows custom creation of two completely different documents with identical hashes.
Example attack
  • Attacker creates two SSL certificate files for two different domains but with identical hashes.
  • Attacker asks a CA to sign the certificate for one domain.
  • Attacker creates a phishing site for the other domain.
  • User's browser successfully validates the phishing site certificate, since the digital signature is valid for both domains.

Message-Digest Algorithm 5 (MD5)
  • Developed by Ron Rivest in 1991
  • Uses 128-bit hash values
  • Merkle–Damgård construction
  • Still widely used in legacy applications even though collision vulnerabilities allow forgery of digital signatures and SSL certificates.
  • Attacks can find collisions in seconds on a PC.
  • Best attack only requires $2^{18}$ attempts.

MD5 Collision Attack History
1. Initial attacks (2004) could only find collisions in files that differed only in the last few bytes.
2. Early attacks (2008) used a cluster of 200 PS3s for a couple of days.
3. Current attacks can find a collision in seconds on a single PC.
Lesson: Cryptanalytic attacks always improve. Change algorithms before they do.

Secure Hash Algorithm (SHA-1)
  • Developed by NSA; approved as federal standard by NIST
  • SHA-0 (1993) and SHA-1 (1995)
  • 160-bit hash values
  • Merkle–Damgård construction
  • SHA-1 developed to correct insecurity of SHA-0
  • SHA-1 certificates deprecated by browsers in 2017
  • Google and CWI Amsterdam announced the first SHA-1 collision in Feb 2017.
  • Required roughly 6,500 CPU-years plus 110 GPU-years of computation.

SHA-2
  • Developed by NSA; approved as federal standard by NIST
  • SHA-2 (2001)
  • 224, 256, 384, or 512-bit hash values
  • Merkle–Damgård construction
  • Current recommended hash function for security applications like digital signatures or SSL certificates.
  • Cryptanalysts are making progress, but there are no breaks.
  • Collisions can only be found if the hash algorithm is modified by reducing the number of rounds from 80 to 46 (SHA-512) or from 64 to 41 (SHA-256).

SHA-3
  • Winner of open NIST competition (2007-2012)
  • Final standard released in August 2015.
  • SHA-3 (2012)
  • 224, 256, 384, or 512-bit hash values
  • Keccak was the winning algorithm out of a field of 64.
  • An alternative to SHA-2
  • Not a replacement, as SHA-2 is not broken.
  • Built on a sponge function instead of the Merkle–Damgård construction used by MD5, SHA-1, and SHA-2, so that the same cryptanalytic techniques will not work against SHA-3.

Argon2
  • Winner of the Password Hashing Competition (2013-2015)
  • Competition hosted by cryptographers, not NIST.
  • Goal: hash function for password storage that is slow on CPUs, GPUs (unlike scrypt), and FPGAs (unlike bcrypt) and that is resistant to lookup table attacks.
  • Argon2
  • Scalable time and memory requirements.
  • Built-in 128-bit nonce protects against lookup table attacks.
  • Configurable output size (128 bits is the default).

HMAC
A keyed-hash message authentication code (HMAC) is the use of a hash function for calculating a message authentication code (MAC) based on a message in combination with a secret cryptographic key.
HMAC protects against threat models in which attackers have the ability to modify hash values.
  • If an attacker could modify data, then he could change both the file and its hash value, causing the victim to think that the file was downloaded correctly when in fact the attacker substituted a different file.
  • This threat model allows an attack on hashes without finding a collision or pre-image.

Why not use h(k ∥ m) as HMAC?
Length extension attacks on Merkle–Damgård construction hashes allow an attacker to append data s to the end of message m and create a valid tag h(k ∥ m ∥ s) for m ∥ s without knowing k.
Most widely used hashes are vulnerable to this attack
  • MD5, SHA-1, SHA-256, SHA-512

HMAC Algorithm
HMAC-h(k, m) = h((k ⊕ opad) ∥ h((k ⊕ ipad) ∥ m))
  • k is the secret key
  • m is the message
  • h is a hash function like SHA-2
  • ipad (inner padding) is 00110110 repeated.
  • opad (outer padding) is 01011100 repeated.
An attacker can't generate an HMAC for any message m without knowing key k.
  • Algorithm prevents length extension attacks.
  • Commonly used to protect authentication cookies.
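The formula can be written out by hand and checked against Python's standard `hmac` module. The sketch below uses SHA-256 as h. The key preparation step (hash the key if it is longer than the hash block size, then zero-pad it to the block size) follows the HMAC standard rather than the slide, and the function name `hmac_sha256` is hypothetical.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA256 computed as h((k xor opad) || h((k xor ipad) || m))."""
    block_size = hashlib.sha256().block_size   # 64 bytes for SHA-256
    # Key preparation: shorten long keys, then zero-pad to one block.
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")
    ipad_key = bytes(b ^ 0x36 for b in key)    # 0x36 = 00110110
    opad_key = bytes(b ^ 0x5C for b in key)    # 0x5C = 01011100
    inner = hashlib.sha256(ipad_key + message).digest()
    return hashlib.sha256(opad_key + inner).digest()

key, message = b"shared secret", b"session=alice; admin=false"
tag = hmac_sha256(key, message)
print(tag == hmac.new(key, message, hashlib.sha256).digest())  # True
```

In practice the library function should be used instead of a hand-written version, and tags should be compared with `hmac.compare_digest` to avoid timing leaks.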

Key Points
1. Hashes are one-way functions h = hash(m) that
   1. Produce same-sized h for any input m.
   2. Avalanche effect: small change in m -> big change in h.
2. Secure hash functions must be resistant to attacks
   1. Collision attacks
   2. Pre-image attacks
3. Attacks allow threats to forge certificates & signatures.
4. Widely used hash functions
   1. MD5, SHA-1, SHA-2, SHA-3
5. Current state of hash function security
   1. Some widely used hashes (MD5, SHA-1) are broken.
   2. Use SHA-2 with 256 or more bits now.
6. Keyed hash functions cannot be computed by an attacker
   HMAC-h(k, m) = h((k´ ⊕ opad) ∥ h((k´ ⊕ ipad) ∥ m))
