Linear probing
When inserting into the hash table (and also when searching): if the hash function results in a collision, try the next slot; if that also results in a collision, try the slot after that, and so on (if slot 3 is full, try slot 4, then slot 5, then slot 6, ...), using the following hash function built from an auxiliary hash function $h'$:
$h(k, i) = (h'(k) + i) \bmod m$ for all $i = 0, 1, \dots, m - 1$
• $h'(k)$: normal hash function as before
• $i$: number of slots to skip for probe $i$
• $\bmod\ m$: makes sure the result is mapped overall to one of the $m$ slots in the hash table
CSC 317
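Linear probing
Example: insert keys 2, 6, 9, 16 into a table with 7 slots (0 to 6)
Slot  Key
0
1
2     2
3     9
4     16
5
6     6
The placement is consistent with $h'(k) = k \bmod 7$: keys 9 and 16 both collide at slot 2 and move on to slots 3 and 4.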
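The example with keys 2, 6, 9, 16 can be reproduced with a minimal sketch of linear probing. The auxiliary hash function h'(k) = k mod m and the function names `linear_probe_insert` and `linear_probe_search` are assumptions made for illustration; they are not given on the slides.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key using h(k, i) = (h'(k) + i) mod m and return the slot used."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m  # assumption: h'(k) = k mod m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


def linear_probe_search(table: List[Optional[int]], key: int) -> Optional[int]:
    """Follow the same probe sequence; return the slot of key, or None."""
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:
            return None  # an empty slot ends an unsuccessful search
        if table[slot] == key:
            return slot
    return None


table: List[Optional[int]] = [None] * 7
for key in (2, 6, 9, 16):
    linear_probe_insert(table, key)
print(table)  # [None, None, 2, 9, 16, None, 6]
```

Searching follows exactly the same probe sequence as inserting, which is why a cluster of adjacent occupied slots slows down both operations.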
Linear probing
• Pro: easy to implement
• Con: can result in primary clustering: keys cluster by taking adjacent slots in the hash table, since each collision triggers a search for the next available slot
Consequence: longer search time
What if we insert 2 or 3 slots away instead of 1 slot away?
Linear probing
What if we insert 2 or 3 slots away instead of 1 slot away?
Answer: the problem remains. If two keys initially map to the same hash slot, they have identical probe sequences, since the offset of the next slot to check doesn’t depend on the key.
What to do? Make the offset depend on the key through a second hash function: double hashing!
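To see why a larger fixed step does not help, here is a small sketch that lists the slots probed with a constant stride. The initial hash h'(k) = k mod m, the table size 7, and the helper name `fixed_stride_probes` are assumptions for illustration.

```python
from typing import List


def fixed_stride_probes(key: int, m: int, stride: int, count: int) -> List[int]:
    """First `count` slots probed when every probe skips `stride` slots."""
    start = key % m  # assumption: h'(k) = k mod m
    return [(start + i * stride) % m for i in range(count)]


m = 7
# Keys 2 and 9 both start at slot 2, so their probe sequences are identical.
print(fixed_stride_probes(2, m, 3, 4))  # [2, 5, 1, 4]
print(fixed_stride_probes(9, m, 3, 4))  # [2, 5, 1, 4]
```

Whatever the stride, two keys with the same starting slot keep competing for the same slots in the same order.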
Double hashing
Definition: $h(k, i) = (h_1(k) + i \cdot h_2(k)) \bmod m$
• $h_1$, $h_2$ are now two hash functions
• $i \cdot h_2(k)$ indicates the number of slots to skip (it now depends on both the probe number $i$ and the 2nd hash function)
• $\bmod\ m$ makes sure the result is mapped overall to one of the $m$ slots in the hash table
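A minimal sketch of double hashing follows. The two hash functions h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)), the table size m = 11, and the keys 5, 16, 27 are assumptions chosen for illustration; they are not the functions or keys used on the slides.

```python
from typing import List, Optional


def double_hash_probes(key: int, m: int, count: int) -> List[int]:
    """First `count` slots of h(k, i) = (h1(k) + i * h2(k)) mod m."""
    h1 = key % m            # assumption: first hash function
    h2 = 1 + key % (m - 1)  # assumption: second hash function, never 0
    return [(h1 + i * h2) % m for i in range(count)]


def double_hash_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key into the first free slot of its probe sequence."""
    m = len(table)
    for slot in double_hash_probes(key, m, m):
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found")


m = 11
# Keys 16 and 27 both start at slot 5 but then diverge.
print(double_hash_probes(16, m, 4))  # [5, 1, 8, 4]
print(double_hash_probes(27, m, 4))  # [5, 2, 10, 7]

table: List[Optional[int]] = [None] * m
for key in (5, 16, 27):
    double_hash_insert(table, key)
```

With m prime and h2(k) between 1 and m - 1, the step size shares no factor with m, so each probe sequence visits every slot once.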
Double hashing
Example: keys to insert: 14, 17, 25, 3 into a table with 11 slots (0 to 10)
Slot  Key
0
1
2
3     14
4
5
6     17
7     3
8     25
9
10
Double hashing: summary
• Better approach than linear probing, because the number of slots to skip when there is a collision depends on the key
• For this reason, more useful in practice
Analysis open addressing
Theorem: assuming uniform hashing (each probe sequence equally likely) and load factor $\alpha = n/m < 1$ ($n$ keys stored in $m$ slots), the expected number of probes in an unsuccessful search is at most $1/(1 - \alpha)$.
Analysis open addressing
Example 1:
• Assume that the table is half full: $\alpha = 0.5$
• Two probes on average: $1/(1 - 0.5) = 2$
Example 2:
• Assume that the table is pretty full: $\alpha = 0.9$
• 10 probes on average: $1/(1 - 0.9) = 10$
Conclusions:
• Average number of probes increases as $\alpha$ increases
• What happens when $\alpha$ is close to 1?
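The two examples, and the behaviour as the load factor approaches 1, can be checked numerically with a short sketch that evaluates 1/(1 - α). The function name `expected_probes_bound` is an assumption for illustration.

```python
def expected_probes_bound(alpha: float) -> float:
    """Bound 1 / (1 - alpha) on expected probes in an unsuccessful search."""
    if not 0 <= alpha < 1:
        raise ValueError("load factor must satisfy 0 <= alpha < 1")
    return 1.0 / (1.0 - alpha)


# The bound grows without limit as the table fills up.
for alpha in (0.5, 0.9, 0.99):
    print(alpha, expected_probes_bound(alpha))
```

Going from a 90% full table to a 99% full table multiplies the bound by about ten, which is why open-addressing tables are resized well before they fill up.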
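Intuition for analysis
No formal proof, just the concept.
• Probability that the 1st probe leads to an occupied slot in the hash table: Pr(first probe slot occupied) = $n/m = \alpha$
• Probability that the 2nd probe leads to an occupied slot in the hash table, given that the 1st did: $(n - 1)/(m - 1)$
Why? The 2nd probe goes to one of the remaining $m - 1$ slots, of which $n - 1$ are occupied.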
Intuition for analysis
• Probability that both the 1st and 2nd probes lead to occupied slots in the hash table: Pr(first and second probe occupied) = $\frac{n}{m} \cdot \frac{n - 1}{m - 1} \le \left(\frac{n}{m}\right)^2 = \alpha^2$
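Intuition for analysis
Putting it together:
• We always make the first probe: 1
• Prob. that the first probe hits an occupied slot (so a second probe is needed): $\alpha$
• Prob. that the second probe also hits an occupied slot (so a third probe is needed): at most $\alpha^2$
• …
• Putting this together, the expected number of probes is bounded above by $1 + \alpha + \alpha^2 + \alpha^3 + \cdots = \frac{1}{1 - \alpha}$ (geometric series with $\alpha < 1$)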
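The step from probabilities to an expected value can be written out as follows. This is a sketch of the standard argument, where $X$ (a symbol introduced here, not on the slides) denotes the number of probes in an unsuccessful search, and $\Pr(X \ge i)$ is the probability that the first $i - 1$ probes all hit occupied slots.

```latex
\begin{align*}
\Pr(X \ge i) &= \frac{n}{m} \cdot \frac{n-1}{m-1} \cdots \frac{n-i+2}{m-i+2}
              \le \left(\frac{n}{m}\right)^{i-1} = \alpha^{i-1}, \\
E[X] &= \sum_{i \ge 1} \Pr(X \ge i)
      \le \sum_{i \ge 1} \alpha^{i-1}
      = \sum_{j \ge 0} \alpha^{j}
      = \frac{1}{1 - \alpha}, \qquad 0 \le \alpha < 1.
\end{align*}
```

The first term of the sum is the probe that is always made, and each later term is the chance that one more probe is needed.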
Analysis open addressing
• Successful search: more involved, but can’t be worse than unsuccessful search
• We won’t analyse it
Analysis open addressing
Does the expected number of probes always hold in practice? Answer: No.
• It depends on the open addressing approach and on whether the uniform hashing assumption is achieved.
• Double hashing is in practice better than linear probing.
Universal hashing
• If an adversary learns the hash function, they can exploit the system by sending data that all maps to the same slot in the hash table (slowing down or halting the system)
• Why? Search time becomes Θ(n)
Solutions:
• Cryptographic hash function that is very hard to decipher
• Randomly choose the hash function (independently of the keys) from a whole family of hash functions, so that the adversary doesn’t know which function was chosen
Universal hashing
• There are practical ways of designing so-called universal hash functions, so that the keys spread evenly over the slots and the average of $1/(1 - \alpha)$ probes holds
• Example: pick a random prime number p larger than the universe of keys, with p > m
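One common way to build such a family, sketched here as an assumption about what the example is pointing at, is the textbook construction h(k) = ((a·k + b) mod p) mod m with a prime p larger than every key and random a, b. The slide only mentions the prime p; the parameters a and b, the values m = 11 and p = 101, and the name `make_universal_hash` are illustrative.

```python
import random
from typing import Callable


def make_universal_hash(m: int, p: int, rng: random.Random) -> Callable[[int], int]:
    """Draw h(k) = ((a*k + b) mod p) mod m at random from the family."""
    a = rng.randrange(1, p)  # a is chosen from 1..p-1
    b = rng.randrange(0, p)  # b is chosen from 0..p-1

    def h(key: int) -> int:
        return ((a * key + b) % p) % m

    return h


rng = random.Random()  # unseeded, so an adversary cannot predict a and b
h = make_universal_hash(m=11, p=101, rng=rng)  # assumes all keys are below 101
slots = [h(k) for k in range(100)]
```

Because a and b are drawn when the table is created, independently of the keys, an adversary cannot prepare in advance a set of keys that all land in the same slot.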