Extendible Hashing Primarily used for storage of files

Extendible Hashing • Primarily used for storage of files on disk – Number of disk accesses is important factor (only 2 needed for search) • The hash table grows as required • Use a directory of pointers to buckets – If directory has (global) depth g then it is organized according to the first g bits – One entry for every possible value with g bits – Entry for bit pattern b 1…bg has a pointer to bucket containing all records with keys k for which h(k) starts with bit pattern b 1…bg – Some other versions use bit pattern at end, not start. – Each bucket has its own local depth l (l ≤ g): the number of bits used for that bucket COSC 2 P 03 Week 12 1

Extendible Hashing Example • Suppose that g=2 and bucket size = 4. • Suppose that we have records with these keys and hash function h(key) = key mod 64: key 288 8 1064 120 148 204 641 700 258 1586 44 h(key) = key mod 64 32 8 40 56 20 12 1 60 2 50 44 COSC 2 P 03 Week 12 bit pattern 100000 001000 111000 010100 001100 000001 111100 000010 110010 101010 2

Extendible Hashing Example – directory and bucket structure g=2 00 01 10 11 l=2 l=2 8 148 288 120 204 1064 700 641 44 1586 258 COSC 2 P 03 Week 12 3

Bucket and directory split • Insert 68 • 68 mod 64 = 000100 g=3 000 001 010 011 100 101 110 111 l=3 l=2 l=2 641 8 148 288 120 258 204 1064 700 44 1586 68 COSC 2 P 03 Week 12 4

Bucket split – no directory split • Insert 48 and 575 • 48 mod 64 = 48 = 110000 • 575 mod 64 = 63 = 111111 g=3 000 001 010 011 100 101 110 111 l=3 l=2 l=3 641 8 148 288 1586 120 258 204 1064 48 700 68 44 COSC 2 P 03 Week 12 575 5

Multiple splits • • • Insert 16, 18, 22, 23 16 mod 64 = 16 = 010000 18 mod 64 = 18 = 010010 22 mod 64 = 22 = 010110 23 mod 64 = 23 = 010111 Setting l=3 gives this intermediate (partial) picture… 010 l=3 011 l=3 148 16 Continue to next page… 18 22 COSC 2 P 03 Week 12 23 6

Multiple splits, continued • Setting l=4 (and thus g=4) gives this final result… g=4 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111 l=3 l=4 641 8 16 258 204 18 68 l=3 l=2 l=3 148 288 1586 120 22 1064 48 700 23 44 COSC 2 P 03 Week 12 575 7
- Slides: 7