HASHING II
CS 2110 Spring 2018



















- Slides: 19


Hash Functions

Requirements:
1) deterministic
2) return a number in [0..n]

Properties of a good hash:
1) fast
2) collision-resistant
3) evenly distributed
4) hard to invert
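The two requirements can be sketched in Java as follows. This is a minimal illustration, not code from the slides: `indexFor` is a hypothetical helper, and it assumes a table with n buckets indexed 0 to n-1. It relies on `hashCode()` being deterministic and on `Math.floorMod` to keep negative hash codes in range.

```java
class HashIndexSketch {
    // Hypothetical helper: maps a key to a bucket index in [0, n).
    static int indexFor(Object key, int n) {
        // hashCode() may be negative; floorMod always returns a value in [0, n).
        return Math.floorMod(key.hashCode(), n);
    }

    public static void main(String[] args) {
        // Deterministic: the same key and table size give the same index.
        System.out.println(indexFor("CA", 6) == indexFor("CA", 6));
    }
}
```

The other properties (speed, even distribution, resistance to collisions and inversion) depend on the quality of `hashCode()` itself and are not addressed by this sketch.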

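Hash Table

[Figure: add("CA") sends CA through the hash function; the figure labels the result "5 mod 6", and CA is stored at index 5 of the array b (indices 0 to 5), alongside MA at index 1 and NY at index 4.]

Two ways of handling collisions:
1. Chaining
2. Open Addressing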

HashSet and HashMap

Set<V>{
    boolean add(V value);
    boolean contains(V value);
    boolean remove(V value);
}

Map<K, V>{
    V put(K key, V value);
    V get(K key);
    V remove(K key);
}
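A minimal sketch of the three set operations using chaining, assuming a fixed number of buckets and no resizing. The class name `ChainingSetSketch` and its internals are hypothetical; `java.util.HashSet` is implemented differently.

```java
import java.util.LinkedList;

class ChainingSetSketch<V> {
    private final LinkedList<V>[] buckets; // one chain per bucket
    private int size = 0;

    @SuppressWarnings("unchecked")
    ChainingSetSketch(int capacity) {
        buckets = new LinkedList[capacity];
        for (int i = 0; i < capacity; i++) buckets[i] = new LinkedList<>();
    }

    // Picks the chain that the value hashes to.
    private LinkedList<V> chainFor(V value) {
        return buckets[Math.floorMod(value.hashCode(), buckets.length)];
    }

    // Adds the value unless it is already present; returns true if added.
    boolean add(V value) {
        LinkedList<V> chain = chainFor(value);
        if (chain.contains(value)) return false;
        chain.add(value);
        size++;
        return true;
    }

    boolean contains(V value) {
        return chainFor(value).contains(value);
    }

    // Removes the value from its chain; returns true if it was present.
    boolean remove(V value) {
        boolean removed = chainFor(value).remove(value);
        if (removed) size--;
        return removed;
    }

    int size() { return size; }
}
```

Every operation walks a single chain, so its cost is proportional to the length of that chain.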

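Remove

Operations, in order: put('a'), put('b'), put('c'), put('d'), get('d'), remove('c'), get('d'), put('e')

[Figure: two tables with indices 0 to 3. Chaining: index 0 shows a, c, d; index 1 shows e; index 3 shows b. Open Addressing: the slots show the elements a, e, c, d, b.]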
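Why remove needs care under open addressing can be sketched as below. This is one common approach (linear probing with a "tombstone" marker for removed slots), assumed here for illustration; the slides do not say which scheme they use. The class and method names are hypothetical, and `get` reports only membership.

```java
class OpenAddressingSketch {
    private static final Object TOMBSTONE = new Object(); // marks a removed slot
    private final Object[] slots;

    OpenAddressingSketch(int capacity) { slots = new Object[capacity]; }

    private int home(Object v) { return Math.floorMod(v.hashCode(), slots.length); }

    // Returns the index holding v, or -1 if v is absent.
    private int find(Object v) {
        int start = home(v);
        for (int k = 0; k < slots.length; k++) {
            int i = (start + k) % slots.length;
            if (slots[i] == null) return -1; // never-used slot: v cannot lie beyond it
            if (slots[i] != TOMBSTONE && slots[i].equals(v)) return i;
        }
        return -1;
    }

    // Stores v in the first free or tombstoned slot on its probe sequence.
    boolean put(Object v) {
        if (find(v) >= 0) return false;
        int start = home(v);
        for (int k = 0; k < slots.length; k++) {
            int i = (start + k) % slots.length;
            if (slots[i] == null || slots[i] == TOMBSTONE) { slots[i] = v; return true; }
        }
        throw new IllegalStateException("table full; this sketch does not resize");
    }

    boolean get(Object v) { return find(v) >= 0; }

    // Leaves a tombstone so later probes keep searching past this slot.
    boolean remove(Object v) {
        int i = find(v);
        if (i < 0) return false;
        slots[i] = TOMBSTONE;
        return true;
    }
}
```

If remove simply set the slot back to null, a later get for an element stored further along the same probe sequence would stop early and wrongly report it missing.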

Time Complexity (no resizing)

Collision Handling    put(v)    get(v)    remove(v)
Chaining
Open Addressing

Load Factor

Load factor: λ = (number of entries) / (length of the array)

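Expected Chain Length

[Figure: a chained hash table with indices 0 to 5. Index 1 holds a, e; index 3 holds b; index 5 holds c, d; the other buckets are empty.]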
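A small Java sketch relating chain length to load factor, assuming the usual definition (number of entries divided by number of buckets) and assuming the figure shows chains of length 2, 1, and 2 in buckets 1, 3, and 5. The helper names are hypothetical.

```java
class LoadFactorSketch {
    // Load factor: entries stored divided by number of buckets.
    static double loadFactor(int entries, int buckets) {
        return (double) entries / buckets;
    }

    // Mean chain length taken over all buckets, including empty ones.
    static double averageChainLength(int[] chainLengths) {
        int total = 0;
        for (int len : chainLengths) total += len;
        return (double) total / chainLengths.length;
    }

    public static void main(String[] args) {
        int[] chains = {0, 2, 0, 1, 0, 2}; // assumed reading of the figure
        System.out.println(averageChainLength(chains));
        System.out.println(loadFactor(5, 6));
    }
}
```

Averaged over all buckets, the chain length always equals the load factor. With a hash function that spreads keys evenly, that average is also the expected length of the chain any one key lands in.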

Expected Time Complexity (no resizing)

Collision Handling    put(v)    get(v)    remove(v)
Chaining
Open Addressing

Expected Number of Probes

[Figure: an array with indices 0 to 5.]
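A sketch of the standard textbook estimate, which may differ from the derivation given in lecture. It assumes each probe is equally likely to hit any slot, so a probe finds an occupied slot with probability about λ (the load factor). An unsuccessful search then needs roughly 1 + λ + λ² + ... = 1/(1 - λ) probes in expectation. The method names are hypothetical.

```java
class ProbeCountSketch {
    // Closed form 1 / (1 - lambda), valid for 0 <= lambda < 1.
    static double expectedProbes(double lambda) {
        if (lambda < 0 || lambda >= 1) {
            throw new IllegalArgumentException("need 0 <= lambda < 1");
        }
        return 1.0 / (1.0 - lambda);
    }

    // Partial sum 1 + lambda + lambda^2 + ... with the given number of terms.
    static double seriesEstimate(double lambda, int terms) {
        double sum = 0;
        double term = 1;
        for (int i = 0; i < terms; i++) {
            sum += term;
            term *= lambda;
        }
        return sum;
    }
}
```

At λ = 0.5 the estimate is 2 probes and at λ = 0.75 it is 4, which is why open addressing depends on keeping the load factor bounded away from 1.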

Expected Time Complexity (no resizing)

Collision Handling    put(v)    get(v)    remove(v)
Chaining
Open Addressing

Assuming constant load factor.
We need to dynamically resize!

Amortized Analysis

In an amortized analysis, the time required to perform a sequence of operations is averaged over all the operations.

It can be used to calculate the average cost of an operation.

Amortized Analysis of put
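The amortized cost of put can be illustrated by counting the work done by resizing. The sketch below makes simplifying assumptions that are not from the slides: the array doubles only when completely full, and the cost of a resize is the number of elements copied. The method name is hypothetical.

```java
class AmortizedPutSketch {
    // Total element copies caused by resizing during n puts, starting from
    // the given capacity and doubling whenever the array is full.
    static long copiesForPuts(int n, int initialCapacity) {
        long copies = 0;
        int capacity = initialCapacity;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size; // every existing element moves to the new array
                capacity *= 2;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        // 16 puts from capacity 1 trigger resizes at sizes 1, 2, 4, 8.
        System.out.println(copiesForPuts(16, 1)); // 1 + 2 + 4 + 8 = 15
    }
}
```

Starting from capacity 1, the total number of copies stays below 2n, so the resizing work averaged over n puts is a constant per put, even though an individual put that triggers a resize costs time proportional to the current size.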

Expected Time Complexity (with dynamic resizing)

Collision Handling    put(v)    get(v)    remove(v)
Chaining
Open Addressing

Cuckoo Hashing

Cuckoo Hashing

Alternative solution to collisions.
Assume you have two hash functions H1 and H2.

element    H1    H2
a           0     5
b           9     2
c          17    10
d          11     3
e           5    13

[Figure: a hash table with indices 0 to 5 holding the elements a to e as they are inserted and displaced.]

What if there are loops?
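A minimal sketch of cuckoo insertion with a single table of 6 slots, assuming the slide's table reads H1 = 0, 9, 17, 11, 5 and H2 = 5, 2, 10, 3, 13 for a through e, with each value reduced mod 6. The class, the `slideExample` factory, and the kick limit used to detect loops are all hypothetical choices for illustration.

```java
import java.util.function.ToIntFunction;

class CuckooSketch {
    private final String[] slots;
    private final ToIntFunction<String> h1, h2;

    CuckooSketch(int capacity, ToIntFunction<String> h1, ToIntFunction<String> h2) {
        this.slots = new String[capacity];
        this.h1 = h1;
        this.h2 = h2;
    }

    private int i1(String v) { return Math.floorMod(h1.applyAsInt(v), slots.length); }
    private int i2(String v) { return Math.floorMod(h2.applyAsInt(v), slots.length); }

    // Returns the slot holding v, or -1; only two locations need checking.
    int indexOf(String v) {
        if (v.equals(slots[i1(v)])) return i1(v);
        if (v.equals(slots[i2(v)])) return i2(v);
        return -1;
    }

    // Inserts v, evicting occupants to their other location as needed.
    boolean put(String v) {
        if (indexOf(v) >= 0) return false;
        String cur = v;
        int pos = i1(cur);
        for (int kicks = 0; kicks <= slots.length; kicks++) {
            String evicted = slots[pos];
            slots[pos] = cur;
            if (evicted == null) return true;
            cur = evicted; // the evicted element moves to its other location
            pos = (pos == i1(cur)) ? i2(cur) : i1(cur);
        }
        // Probable loop: cur is left unplaced. A full implementation would
        // rehash with new functions or resize instead of giving up.
        throw new IllegalStateException("too many evictions while placing " + cur);
    }

    // Assumed reading of the slide's table; only valid for "a" through "e".
    static CuckooSketch slideExample() {
        String names = "abcde";
        int[] v1 = {0, 9, 17, 11, 5};
        int[] v2 = {5, 2, 10, 3, 13};
        return new CuckooSketch(6, s -> v1[names.indexOf(s)], s -> v2[names.indexOf(s)]);
    }
}
```

Inserting a through e in order into `slideExample()` ends with a at 0, b at 2, d at 3, c at 4, and e at 5: d evicts c from slot 5, then e evicts d, which in turn evicts b. The kick limit is one simple answer to the loop question; lookups stay cheap because each element can only be in one of two slots.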

Complexity of Cuckoo Hashing

Worst Case:

Collision Handling    put(v)    get(v)    remove(v)
Chaining
Open Addressing
Cuckoo Hashing

Expected Case:

Collision Handling    put(v)    get(v)    remove(v)
Chaining
Open Addressing
Cuckoo Hashing

Bloom Filters

Assume we only want to implement a set.
What if you had stored the value at "all" hash locations (instead of one)?

element    H1    H2
a           0     5
b           9     2
c          17    10
d          11     3
e           5    13

[Figure: an array with indices 0 to 5 in which the slots that elements hash to are marked.]
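A minimal Bloom filter sketch in Java, assuming two hash functions and one bit per slot. The two hash functions here are hypothetical and are derived from `hashCode()` only for brevity; a practical filter would use more independent functions and a bit array sized for the expected number of elements.

```java
import java.util.BitSet;

class BloomFilterSketch {
    private final BitSet bits;
    private final int m; // number of bits

    BloomFilterSketch(int m) {
        this.m = m;
        this.bits = new BitSet(m);
    }

    // Two hypothetical hash functions, both derived from hashCode().
    private int h1(Object v) { return Math.floorMod(v.hashCode(), m); }
    private int h2(Object v) { return Math.floorMod(v.hashCode() * 31 + 17, m); }

    // Marks every hash location of v; the value itself is not stored.
    void add(Object v) {
        bits.set(h1(v));
        bits.set(h2(v));
    }

    // false means definitely absent; true means possibly present.
    boolean mightContain(Object v) {
        return bits.get(h1(v)) && bits.get(h2(v));
    }
}
```

Because only bits are stored, the filter can report a false positive when other elements happen to set both of a value's bits, but it never reports a false negative. This sketch has no remove: clearing a bit could erase evidence of another element.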

Features of Bloom Filters