
Hashing: Collision Resolution Schemes
• Collision Resolution Techniques
• Introduction to Separate Chaining
• Collision Resolution using Separate Chaining
• Introduction to Collision Resolution using Open Addressing

Collision Resolution Techniques
• There are three broad ways of resolving collisions:
• 1. Separate chaining: a linked-list-based implementation.
• 2. Open addressing: an array-based implementation.
    (i) Linear probing (linear search)
    (ii) Quadratic probing (nonlinear search)
    (iii) Random increments/decrements
    (iv) Rehashing (double hashing)
• 3. Bucket methods: usually a combination of (1) and (2).

Introduction to Separate Chaining
• The hash table is implemented as an array of linked lists.
• Inserting an item, r, at index i is simply insertion into the linked list at position i.
• Synonyms (items with the same hash address) are chained in the same linked list.
• Retrieval of an item, r, with hash address i is simply retrieval from the linked list at position i.
• Deletion of an item, r, with hash address i is simply deleting r from the linked list at position i.
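The three operations above can be sketched as follows. This is a minimal illustration, not the slides' implementation: the class name ChainedHashTable, its method names, and the choice of String keys with an additive character-code hash are all assumptions made for the example.

```java
import java.util.LinkedList;

// Minimal separate-chaining table for String keys (illustrative sketch only).
class ChainedHashTable {
    private final LinkedList<String>[] table;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int size) {
        table = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            table[i] = new LinkedList<>();
        }
    }

    // Assumed hash function: sum of character codes modulo the table size.
    private int index(String key) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i);
        }
        return sum % table.length;
    }

    // Insertion goes into the list at the key's hash address; synonyms share it.
    void insert(String key) {
        table[index(key)].addLast(key);
    }

    // Retrieval only searches the one list at the key's hash address.
    boolean contains(String key) {
        return table[index(key)].contains(key);
    }

    // Deletion removes the key from that same list; returns false if absent.
    boolean remove(String key) {
        return table[index(key)].remove(key);
    }

    // Number of items currently chained at position i.
    int chainLength(int i) {
        return table[i].size();
    }
}
```

Because every operation touches only one list, the cost of each depends on the length of that chain rather than on the size of the whole table.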

Separate Chaining with String Keys
• Recall that search keys can be numbers, strings or some other object.
• The following Java method hashes a string key by summing its character codes and reducing the sum modulo the table size:

    public static int hash(String key, int tableSize) {
        int hashVal = 0;
        // Add up the character codes of the key.
        for (int i = 0; i < key.length(); i++) {
            hashVal += key.charAt(i);
        }
        return hashVal % tableSize;
    }

• The following class describes commodity items:

    class CommodityItem {
        String name;     // commodity name
        int quantity;    // commodity quantity needed
        double price;    // commodity price
    }

Example 1: Separate Chaining
• Devise an appropriate hash function and use it to load the information about the following commodity items into a hash table of size 13 using separate chaining.

    Item       Qty   Price
    onion       1    10.0
    tomato      1     8.50
    cabbage     3     3.50
    carrot      1     5.50
    okra        1     6.50
    mellon      2    10.0
    potato      2     7.50
    banana      3     4.0
    olive       2    15.0
    salt        2     2.50
    cucumber    3     4.50
    mushroom    3     5.50
    orange      2     3.00

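Example 1: Separate Chaining (cont'd)

    Item       Qty   Price   h(key)
    onion       1    10.0      1
    tomato      1     8.50    10
    cabbage     3     3.50     4
    carrot      1     5.50     1
    okra        1     6.50     0
    mellon      2    10.0     10
    potato      2     7.50     0
    banana      3     4.0     11
    olive       2    15.0     10
    salt        2     2.50     7
    cucumber    3     4.50     9
    mushroom    3     5.50     6
    orange      2     3.00    12

• Items with the same h(key) are chained in the same list: index 0 holds okra and potato, index 1 holds onion and carrot, and index 10 holds tomato, mellon and olive. Indices 2, 3, 5 and 8 stay empty.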
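The h(key) column can be reproduced with the additive string hash from the earlier slide. The sketch below is illustrative: the class name CommodityChains and the helper buildChains are assumptions, and only item names are stored for brevity. Keys are written in lowercase; the tabulated value 11 for banana corresponds to the lowercase spelling (with a capital B the same function gives 5).

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class CommodityChains {
    // Additive hash: sum of character codes modulo the table size.
    static int hash(String key, int tableSize) {
        int hashVal = 0;
        for (int i = 0; i < key.length(); i++) {
            hashVal += key.charAt(i);
        }
        return hashVal % tableSize;
    }

    // Builds one chain of names per table index, appending at the tail.
    static List<List<String>> buildChains(String[] names, int tableSize) {
        List<List<String>> chains = new ArrayList<>();
        for (int i = 0; i < tableSize; i++) {
            chains.add(new LinkedList<>());
        }
        for (String name : names) {
            chains.get(hash(name, tableSize)).add(name);
        }
        return chains;
    }

    public static void main(String[] args) {
        String[] names = {"onion", "tomato", "cabbage", "carrot", "okra", "mellon",
                "potato", "banana", "olive", "salt", "cucumber", "mushroom", "orange"};
        List<List<String>> chains = buildChains(names, 13);
        // Print every index with its chain, e.g. "10: [tomato, mellon, olive]".
        for (int i = 0; i < chains.size(); i++) {
            System.out.println(i + ": " + chains.get(i));
        }
    }
}
```

Appending at the tail is one possible choice; inserting at the head of each list would give the same chains in reverse order.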

Introduction to Open Addressing
• In this method the entries are placed inside the array itself.
• The probe sequence is essentially a sequence of functions {h_0, h_1, h_2, …, h_(n-1)} where h_i: K -> {0, 1, …, n-1}.
• To insert item r, we examine array locations h_0(r), h_1(r), h_2(r), … in order until an empty location is found.
• Similarly, to find item r, we examine the same sequence of locations in the same order.

Introduction to Open Addressing (cont'd)
• The most common probe sequences are of the form h_i(r) = (h(r) + c(i)) mod n, for i = 0, 1, …, n-1.
• The function c(i) is required to have the following two properties:
    · Property 1: c(0) = 0.
    · Property 2: The set of values {c(0) mod n, c(1) mod n, c(2) mod n, …, c(n-1) mod n} must contain every integer between 0 and n-1 inclusive.

Open Addressing: Linear Probing
• Linear probe: here the function c(i) is a linear function of i: c(i) = a*i + b.
• Property 1 requires that c(0) = 0. Therefore, b must be zero.
• For c(i) = a*i to satisfy Property 2, a and n must be relatively prime.
• The linear probing sequence that is usually used is h_i(r) = (h(r) + i) mod n, for i = 0, 1, 2, …, n-1.
• The record is inserted at the first empty slot in this sequence. If no empty slot is found, the hash table is full and the insertion fails.
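The claim that c(i) = a*i satisfies Property 2 only when a and n are relatively prime can be checked by brute force for small values. The sketch below is only an illustration; the class name ProbeCoverage and both method names are assumptions, and a and n are assumed to be small positive integers so that a*i does not overflow.

```java
class ProbeCoverage {
    // Greatest common divisor by Euclid's algorithm.
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // True if {(a*i) mod n : i = 0..n-1} contains every value 0..n-1,
    // that is, if c(i) = a*i satisfies Property 2 for table size n.
    static boolean coversAllSlots(int a, int n) {
        boolean[] seen = new boolean[n];
        int distinct = 0;
        for (int i = 0; i < n; i++) {
            int offset = (a * i) % n;
            if (!seen[offset]) {
                seen[offset] = true;
                distinct++;
            }
        }
        return distinct == n;
    }

    public static void main(String[] args) {
        System.out.println(coversAllSlots(1, 13)); // a = 1: the usual linear probe
        System.out.println(coversAllSlots(5, 12)); // gcd(5, 12) = 1
        System.out.println(coversAllSlots(2, 12)); // gcd(2, 12) = 2, only even offsets
    }
}
```

With a = 2 and n = 12 the offsets are 0, 2, 4, 6, 8, 10 and then repeat, so half of the table can never be reached from a given home position.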

Example 2: Linear Probing
• Use the hash function h(r) = r.id % 13 to load the following records into an array of size 13.

    Al-Otaibi, Ziyad                      1.73   985926
    Al-Turki, Musab Ahmad Bakeer          1.60   970876
    Al-Saegh, Radha Mahdi                 1.58   980962
    Al-Shahrani, Adel Saad                1.80   986074
    Al-Awami, Louai Adnan Muhammad        1.73   970728
    Al-Amer, Yousuf Jauwad                1.66   994593
    Al-Helal, Husain Ali AbdulMohsen      1.70   996321

• Then insert the following records using linear probing to resolve collisions, if any.

    Al-Najjar, Khaled Ziyad               1.69   987615
    Al-Ali, Amr Ali Zaid                  1.79   987630
    Al-Ramadi, Husam Yahya                1.58   987602

Example 2: Linear Probing (cont'd)

    Index   Record
    0
    1       Husain
    2       Yousuf
    3
    4
    5       Louai
    6       Ziyad
    7       Khaled
    8       Radha
    9       Amr
    10      Musab
    11      Adel
    12      Husam
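The table for Example 2 can be reproduced with a short linear-probing insert. This is a sketch under stated assumptions: only the record ids are stored, the class name LinearProbingDemo, the EMPTY marker value of -1, and the helper loadExample2 are invented for the illustration.

```java
import java.util.Arrays;

class LinearProbingDemo {
    static final int EMPTY = -1; // assumed marker for an unused slot

    // Probes h_i = (id % n + i) mod n for i = 0..n-1 and stores id in the
    // first empty slot. Returns the slot used, or -1 if the table is full.
    static int insert(int[] table, int id) {
        int n = table.length;
        int home = id % n;
        for (int i = 0; i < n; i++) {
            int slot = (home + i) % n;
            if (table[slot] == EMPTY) {
                table[slot] = id;
                return slot;
            }
        }
        return -1;
    }

    // Loads the seven initial ids and then the three later ids, in slide order.
    static int[] loadExample2() {
        int[] table = new int[13];
        Arrays.fill(table, EMPTY);
        int[] ids = {985926, 970876, 980962, 986074, 970728, 994593, 996321,
                     987615, 987630, 987602};
        for (int id : ids) {
            insert(table, id);
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = loadExample2();
        for (int i = 0; i < table.length; i++) {
            System.out.println(i + ": " + (table[i] == EMPTY ? "" : String.valueOf(table[i])));
        }
    }
}
```

The first seven ids land in their home positions (6, 10, 8, 11, 5, 2, 1). The last three collide: 987615 has home 5 and ends up at 7, 987630 has home 7 and ends up at 9, and 987602 has home 5 and is pushed all the way to 12.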

Linear Probing: Some Notes
• Notice from this table that a large cluster (indices 5 through 12) has already formed.
• In general, the empty cell immediately following a cluster has a higher chance of being filled than other empty cells, because every key that hashes anywhere into the cluster ends up there.
• The larger the cluster, the higher the probability of long probe sequences.
• This is one disadvantage of linear probing. Other methods attempt to improve on this.

Introduction to Retrieval & Deletion
• Retrieval: To search for a record we:
    · Calculate its hash value.
    · Check that location of the array for the record.
    · If found, return the record.
    · If not, keep probing until the record is found or an empty table location is reached.
• Attempting to retrieve a non-existent record can be very expensive, since the search stops only at an empty location.
• Deletion:
    · In open addressing, the location where a record is stored is not necessarily its home position.
    · We cannot just set the location of a deleted record to empty, because that would cut off the search for any record stored further along the same probe sequence.
    · A special flag or key value is needed to mark the locations of deleted records.

Example 3: Retrieval & Deletion
• Consider the following hash table constructed in Example 2:

    Index   Record
    0
    1       Husain
    2       Yousuf
    3
    4
    5       Louai
    6       Ziyad
    7       Khaled
    8       Radha
    9       Amr
    10      Musab
    11      Adel
    12      Husam

• Delete Khaled's record (id 987615), then retrieve the record for Amr and then that of Husam.

Example 3: Retrieval & Deletion (cont'd)

    Index   Record
    0
    1       Husain
    2       Yousuf
    3
    4
    5       Louai
    6       Ziyad
    7       ?       (slot of the deleted record)
    8       Radha
    9       Amr
    10      Musab
    11      Adel
    12      Husam
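The retrieval and deletion rules can be traced on this example with a deleted-slot marker. The following is a minimal sketch, not the slides' implementation: the class name TombstoneDemo, the marker values EMPTY = -1 and DELETED = -2, and the helper buildExample2 are assumptions, ids are assumed non-negative, and only ids are stored.

```java
import java.util.Arrays;

class TombstoneDemo {
    static final int EMPTY = -1;   // never-used slot: a search may stop here
    static final int DELETED = -2; // marks a deleted record: a search must continue past it

    // Linear-probing insert into the first EMPTY slot (reuse of DELETED
    // slots is omitted to keep the sketch short).
    static int insert(int[] table, int id) {
        int n = table.length;
        for (int i = 0; i < n; i++) {
            int slot = (id % n + i) % n;
            if (table[slot] == EMPTY) {
                table[slot] = id;
                return slot;
            }
        }
        return -1;
    }

    // Returns the slot holding id, or -1 if an EMPTY slot is reached first
    // or the whole table has been probed.
    static int find(int[] table, int id) {
        int n = table.length;
        for (int i = 0; i < n; i++) {
            int slot = (id % n + i) % n;
            if (table[slot] == EMPTY) {
                return -1;
            }
            if (table[slot] == id) {
                return slot;
            }
            // Occupied by another id, or DELETED: keep probing.
        }
        return -1;
    }

    // Marks the record's slot as DELETED instead of EMPTY.
    static boolean delete(int[] table, int id) {
        int slot = find(table, id);
        if (slot < 0) {
            return false;
        }
        table[slot] = DELETED;
        return true;
    }

    // Rebuilds the table of Example 2 from its ids, in slide order.
    static int[] buildExample2() {
        int[] table = new int[13];
        Arrays.fill(table, EMPTY);
        int[] ids = {985926, 970876, 980962, 986074, 970728, 994593, 996321,
                     987615, 987630, 987602};
        for (int id : ids) {
            insert(table, id);
        }
        return table;
    }

    public static void main(String[] args) {
        int[] table = buildExample2();
        delete(table, 987615);                   // Khaled, stored at index 7
        System.out.println(find(table, 987630)); // Amr: probes 7, 8, 9
        System.out.println(find(table, 987602)); // Husam: probes 5 through 12
    }
}
```

Both searches pass through index 7. Had that slot been reset to EMPTY rather than marked, the search for Amr would stop at its very first probe and the search for Husam at its third, and both records would wrongly be reported missing.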

Exercises
1. For linear probing with c(i) = a*i, we discussed that Property 2 is satisfied only when a and n are relatively prime. Explain in plain language what the requirement of being relatively prime means.
2. Consider the general probe sequence h_i(r) = (h(r) + c(i)) mod n. Are we sure that if c(i) satisfies Property 2, then h_i(r) will cover all n hash table locations 0, 1, …, n-1? Explain.
3. Suppose you are given k records to be loaded into a hash table of size n, with k < n, using linear probing. Does the order in which these records are loaded matter for retrieval and insertion? Explain.
4. A prime number is always the best choice of a hash table size. Is this statement true or false? Justify your answer either way.