CSE 326 Hashing Richard Anderson (instead of Martin Tompa)
- Slides: 13
Chaining review
- H(k) = k mod 17
- [Figure: hash table with cells 0 to 16, and a worksheet with a column for k and rows A to K]
- Note: Collect twelve birthdays from students and hash into the table with chaining.
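A minimal sketch of chaining with H(k) = k mod 17, assuming each cell holds a Python list as its chain. The function names and the sample keys are hypothetical, chosen only so that two keys land in the same cell.

```python
M = 17  # table size from the slide

def chained_insert(table, key):
    """Append key to the chain stored in cell H(key) = key mod len(table)."""
    table[key % len(table)].append(key)

def chained_lookup(table, key):
    """Search only the chain in the cell that key hashes to."""
    return key in table[key % len(table)]

table = [[] for _ in range(M)]
# Hypothetical keys: 20 and 37 both hash to cell 3, so they share a chain.
for k in (20, 37, 5, 16):
    chained_insert(table, k)
```

A colliding key simply extends the chain, so no other cell is disturbed.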
Open address hashing
- Store all elements in the table. If a cell is occupied, try another cell.
- Linear probing: try cells H(k), (H(k) + 1) mod m, (H(k) + 2) mod m, ...
Open Address Hashing
H(k) = k mod 17
Key   k    H(k)
A     53   2
B     41   7
C     91   6
D     75   7
E     13   13
F     6    6
G     43   9
H     67   16
I     88   3
J     36   2
K     40   6
Table (cell: key): 2: A, 3: I, 4: J, 6: C, 7: B, 8: D, 9: F, 10: K, 13: E, 16: H; all other cells empty.
Note: Step through example. Emphasize the growing regions.
Note: Collect twelve birthdays from students and hash into the table with chaining.
Open address hashing
Lookup(K) {
    p = H(K);
    loop {
        if (A[p] is empty) return false;
        if (A[p] == K) return true;
        p = (p + 1) mod m;
    }
}
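The lookup above can be sketched in Python together with a matching insert. This is one possible rendering, not the course's code: the function names are hypothetical, and the bound of m probes is an added assumption so that a completely full table cannot loop forever. The keys are those listed on the example slide, with G left out because it does not appear in that slide's table.

```python
EMPTY = None

def probe_insert(A, key):
    """Insert key with linear probing; return the cell it ends up in."""
    m = len(A)
    p = key % m
    for _ in range(m):                 # at most m probes (added safeguard)
        if A[p] is EMPTY or A[p] == key:
            A[p] = key
            return p
        p = (p + 1) % m
    raise RuntimeError("table is full")

def probe_lookup(A, key):
    """Follow the same probe sequence until key or an empty cell is found."""
    m = len(A)
    p = key % m
    for _ in range(m):
        if A[p] is EMPTY:
            return False
        if A[p] == key:
            return True
        p = (p + 1) % m
    return False

A = [EMPTY] * 17
keys = [53, 41, 91, 75, 13, 6, 67, 88, 36, 40]   # A..F, H..K from the slide
cells = [probe_insert(A, k) for k in keys]
```

Keys 91, 41, 75, 6, and 40 end up in the contiguous cells 6 through 10, which is the kind of growing region the slide note asks to emphasize.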
Open address hashing issues
Issues:
- Clumping
- Cost per operation
- Deletion
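Deletion is an issue because emptying a cell can cut the probe path to keys stored beyond it. One common remedy, shown here as a sketch and not necessarily the approach taken in lecture, is to mark deleted cells with a tombstone. The sentinel names EMPTY and DELETED and all function names are hypothetical, and keys are assumed to be distinct.

```python
EMPTY = object()     # cell has never held a key
DELETED = object()   # tombstone: cell once held a key

def insert(A, key):
    """Place key in the first empty or tombstoned cell on its probe path."""
    m = len(A)
    p = key % m
    for _ in range(m):
        if A[p] is EMPTY or A[p] is DELETED:
            A[p] = key
            return p
        p = (p + 1) % m
    raise RuntimeError("table is full")

def lookup(A, key):
    """Tombstones do not stop the search; only a truly empty cell does."""
    m = len(A)
    p = key % m
    for _ in range(m):
        if A[p] is EMPTY:
            return False
        if A[p] == key:          # sentinels never compare equal to a key
            return True
        p = (p + 1) % m
    return False

def delete(A, key):
    """Replace key with a tombstone so later keys stay reachable."""
    m = len(A)
    p = key % m
    for _ in range(m):
        if A[p] is EMPTY:
            return False
        if A[p] == key:
            A[p] = DELETED
            return True
        p = (p + 1) % m
    return False

A = [EMPTY] * 17
for k in (41, 75, 24):   # hypothetical keys; all three hash to cell 7
    insert(A, k)
removed = delete(A, 75)  # 75 sat in cell 8, between 41 and 24
```

After the deletion, a lookup for 24 still succeeds because the search walks past the tombstone in cell 8, and a later insert may reuse that cell.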
Double hashing
- Use separate hash functions for the first probe and for collision resolution.
- Probe sequence: H1(k), (H1(k) + H2(k)) mod m, (H1(k) + 2 H2(k)) mod m, (H1(k) + 3 H2(k)) mod m, ...
- Note: Return to earlier slide to update the access code.
Double hashing example
- H1(k) = day mod 17
- H2(k) = month
- [Figure: hash table with cells 0 to 16, and a worksheet with columns month, day, H1(k) and rows A to K]
- Note: Collect twelve birthdays (month and day) from students and hash into the table with chaining. WRITE MONTHS AS NUMBERS.
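A sketch of this example in Python, using H1(k) = day mod 17 and H2(k) = month as on the slide. The function names and the sample birthdays are hypothetical. Because 17 is prime and every month number lies between 1 and 12, each step size is coprime to the table size, so a probe sequence visits every cell once.

```python
M = 17

def probe_sequence(month, day, m=M):
    """Cells tried for a birthday: H1, H1 + H2, H1 + 2*H2, ... (all mod m)."""
    h1 = day % m
    h2 = month
    return [(h1 + i * h2) % m for i in range(m)]

def dh_insert(A, month, day):
    """Store the birthday in the first free cell of its probe sequence."""
    for p in probe_sequence(month, day, len(A)):
        if A[p] is None:
            A[p] = (month, day)
            return p
    raise RuntimeError("table is full")

def dh_lookup(A, month, day):
    """Same loop as linear probing, but the step is H2(k) instead of 1."""
    for p in probe_sequence(month, day, len(A)):
        if A[p] is None:
            return False
        if A[p] == (month, day):
            return True
    return False

A = [None] * M
# Hypothetical birthdays as (month, day); the first three all have H1 = 7.
birthdays = [(3, 24), (11, 7), (3, 7), (6, 10)]
cells = [dh_insert(A, mo, d) for mo, d in birthdays]
```

The first three birthdays collide at cell 7, but different months send them along different paths, which is what reduces clumping compared with a fixed step of 1.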
Double hashing vs. single hashing
- Load factor α, cost per operation
- Single hashing
- Double hashing
- [Figure: labels Single, Double, and α]
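The slide's own formulas did not survive extraction. As a stand-in, the sketch below evaluates the classical textbook approximations for expected probes per search at load factor α, under the assumption that "single hashing" means linear probing and that double hashing behaves like uniform hashing. The function names are hypothetical, and these may not be the exact expressions shown in lecture.

```python
import math

def linear_probing_cost(alpha):
    """Approximate expected probes (successful, unsuccessful), linear probing."""
    hit = 0.5 * (1 + 1 / (1 - alpha))
    miss = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    return hit, miss

def double_hashing_cost(alpha):
    """Approximate expected probes (successful, unsuccessful), double hashing."""
    hit = (1 / alpha) * math.log(1 / (1 - alpha))
    miss = 1 / (1 - alpha)
    return hit, miss

for alpha in (0.5, 0.75, 0.9):
    print(alpha, linear_probing_cost(alpha), double_hashing_cost(alpha))
```

Under these approximations the two schemes are close at α = 0.5, while near α = 0.9 an unsuccessful search costs roughly 50 probes with linear probing against roughly 10 with double hashing.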
Trade offs between chaining and open addressing
                     Chaining    Open Addressing
Space
Time
Deletions
Coding complexity
High load factor
Hash Functions
- Function
- Efficient
- Uniform mapping to range
- Avoids systematic collisions
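A small illustration of a systematic collision, assuming the simple hash H(k) = k mod m. The helper name and the key set are hypothetical. When every key shares a factor with the table size, only a fraction of the cells is ever used.

```python
def cells_used(keys, m):
    """Number of distinct cells that H(k) = k mod m assigns to the keys."""
    return len({k % m for k in keys})

keys = range(0, 400, 4)          # hypothetical keys: 100 multiples of 4

used_16 = cells_used(keys, 16)   # 4 divides 16, so only cells 0, 4, 8, 12 occur
used_17 = cells_used(keys, 17)   # 17 is prime, so all 17 cells occur
```

This is one reason a prime table size such as 17 is a common choice when the hash is a plain remainder.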
Hashing strings
- String
- Suppose
- Fact:
- So:
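The formulas on this slide did not survive extraction. A common way to hash a string, shown here as an assumption and not as the slide's content, treats the characters as digits of a number in some base and reduces mod m after every step. This relies on the fact that (a * b + c) mod m equals ((a mod m) * b + c) mod m, so the intermediate value never grows large. The function name and the default base of 128 are hypothetical choices.

```python
def string_hash(s, m, base=128):
    """Polynomial hash of s evaluated by Horner's rule, reduced mod m each step."""
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % m
    return h

def string_hash_direct(s, m, base=128):
    """Same value computed the slow way: build the full number, then reduce."""
    total = 0
    for i, ch in enumerate(s):
        total += ord(ch) * base ** (len(s) - 1 - i)
    return total % m
```

Both functions return the same cell for every string; the first does so without ever forming a number larger than about m times the base.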