Hashing • Hashing is another method for storing and searching data. – Hashing makes it easy to add and remove elements from a data structure. – The worst-case time to locate a key is linear: Θ(n). – Java’s standard hash table class is java.util.Hashtable.
Hashing • Hashing usually implements a data structure called a hash table. – A hash table is an effective data structure for inserting, finding, and deleting data by key. – A hash table is a generalization of an array. – A hash table requires a key to access data.
Hashing – A hash table uses an array whose length is proportional to the number of keys actually stored. – The array index is computed from the key, rather than using the key itself as the array index. • The key is a unique identifying value.
Hashing Functions • Hashing requires the use of a hashing function. – The purpose of the hashing function is to compute the storage slot from the key. • Maps key values to array indices. – This calculation reduces the range of array indices that need to be handled.
Hashing Functions – If a hashing function groups key values together, this is called clustering of the keys. • A good hashing function distributes the key values uniformly through the array’s index range. • Any hashing function that results in clustering should be changed. • A good hashing function has an equal likelihood of hashing a key into any of the slots. • java.util.Hashtable relies on the hashCode method of each key.
Hashing Functions • The division hash function depends upon the remainder of division. – Math.abs(H(k)) % table.length – When using the division hash function, it is best to have a table size that is a prime number of the form 4n + 3. – Using the division hash function can result in many collisions.
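The division hash function can be sketched in a few lines of Java. The class and method names here are illustrative assumptions. The sketch also swaps Math.abs(...) % for Math.floorMod: Math.abs(Integer.MIN_VALUE) is still negative, so the slide's formula can return a negative index for that one hash code, while floorMod always lands in the range 0 to table.length - 1. For negative hash codes the two formulas pick different (but equally valid) slots.

```java
// Sketch of the division hash function; names are illustrative assumptions.
final class DivisionHash {
    // Maps any key to a slot index in [0, tableLength).
    static int slotFor(Object key, int tableLength) {
        // floorMod never returns a negative result for a positive modulus,
        // so it also copes with hashCode() == Integer.MIN_VALUE.
        return Math.floorMod(key.hashCode(), tableLength);
    }

    public static void main(String[] args) {
        int tableLength = 11; // a prime of the form 4n + 3 (n = 2)
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> slot " + slotFor(key, tableLength));
        }
    }
}
```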
Hashing Functions • The mid-square hash function converts the key to an integer, then squares the key. The function returns the middle digits of the result. • The multiplicative hash function converts the key to an integer and multiplies it by a constant less than one. The function returns the first few digits of the fractional part of the result.
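A minimal sketch of both functions, working on decimal digits as the slide describes. The class name, the method names, the digits parameter, and the constant 0.6180339887 (a commonly suggested multiplier, roughly (√5 - 1) / 2) are assumptions for illustration only.

```java
// Sketch of the mid-square and multiplicative hash functions.
// All names and the constant A are illustrative assumptions.
final class HashFunctions {
    // A constant less than one; this particular value is an assumption.
    static final double A = 0.6180339887;

    // Squares the key and returns the middle `digits` decimal digits.
    static int midSquare(int key, int digits) {
        String squared = Long.toString((long) key * key);
        if (squared.length() <= digits) {
            return Integer.parseInt(squared); // too short to have a middle
        }
        int start = (squared.length() - digits) / 2;
        return Integer.parseInt(squared.substring(start, start + digits));
    }

    // Multiplies the key by A and returns the first `digits` decimal
    // digits of the fractional part of the product.
    static int multiplicative(int key, int digits) {
        double product = key * A;
        double fraction = product - Math.floor(product);
        return (int) (fraction * Math.pow(10, digits));
    }

    public static void main(String[] args) {
        // 1234 * 1234 = 1522756, whose middle three digits are 227.
        System.out.println(midSquare(1234, 3));
        System.out.println(multiplicative(123456, 3));
    }
}
```

Either result would still need to be reduced to the table's index range, for example with the division function.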
Example [Figure: the actual keys K = {k1, k2, k3, k4, k5}, drawn from the universe of keys U, are mapped by H to table slots 0 through m-1; the slots shown are H(k1), H(k4), H(k2), and H(k3).]
Collisions • A collision occurs when the hashing function calculates the same array index for two different keys and one of them is already stored at that index. – Two keys hash to the same slot.
Collision Example [Figure: the same diagram, with H(k2) = H(k5), so keys k2 and k5 collide in one slot of the table.]
Open Addressing • Open addressing ensures that all elements are stored directly into the hash table. – Every table slot contains either data or null. – The problem is that the table can fill up. – The good thing is that there are no external storage locations for the table elements.
Open Addressing – Open addressing resolves collisions by probing for another open slot inside the table; linear probing and double hashing are two such methods.
Linear Probing • Linear probing resolves collisions by placing the data into the next open slot in the table. • If the slot after the hashed slot is open, the data is stored there. • If that slot is not open, the algorithm looks at each following slot (index) in turn until an open slot is found.
Linear Probing – It is difficult to delete items from a hash table that uses open addressing. • A deleted slot cannot simply be set to null, because a later search would stop at the null and miss keys stored beyond it. Instead, place a special Deleted marker into the emptied slot. – If H’(k) is the ordinary hash function, the linear probing hash function is: • H(k, i) = (H’(k) + i) % m where i = 0, 1, 2, … , m - 1 and m is the number of elements that can be stored in the table.
Linear Probing – A problem associated with linear probing is called primary clustering. • Primary clustering occurs when many items hash into the same slot and long runs of slots are filled up. • This results in increased search times.
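The probing rule and the Deleted marker can be sketched as a small Java class. This is an illustration under stated assumptions rather than the course's implementation: the class name, the method names, and the use of Math.floorMod for H’(k) are all hypothetical choices.

```java
// Sketch of open addressing with linear probing.
// Class and method names are illustrative assumptions.
final class LinearProbingTable {
    // Sentinel stored in a slot whose key was removed (the Deleted marker).
    private static final Object DELETED = new Object();
    private final Object[] slots;

    LinearProbingTable(int m) {
        slots = new Object[m];
    }

    // H(k, i) = (H'(k) + i) % m
    private int probe(Object key, int i) {
        int home = Math.floorMod(key.hashCode(), slots.length);
        return (home + i) % slots.length;
    }

    // Returns the slot holding key, or -1 if the key is absent.
    int indexOf(Object key) {
        for (int i = 0; i < slots.length; i++) {
            int j = probe(key, i);
            if (slots[j] == null) {
                return -1; // a never-used slot ends the probe sequence
            }
            if (slots[j] != DELETED && slots[j].equals(key)) {
                return j;
            }
        }
        return -1;
    }

    // Returns false only when no open or deleted slot is left.
    boolean insert(Object key) {
        if (indexOf(key) >= 0) {
            return true; // already stored
        }
        for (int i = 0; i < slots.length; i++) {
            int j = probe(key, i);
            if (slots[j] == null || slots[j] == DELETED) {
                slots[j] = key;
                return true;
            }
        }
        return false;
    }

    // Marks the slot as deleted instead of writing null into it.
    boolean remove(Object key) {
        int j = indexOf(key);
        if (j < 0) {
            return false;
        }
        slots[j] = DELETED;
        return true;
    }
}
```

With m = 7, the keys 3, 10, and 17 all hash to slot 3 and end up in slots 3, 4, and 5, which is a small instance of primary clustering. Removing 10 leaves a marker in slot 4, so a search for 17 still probes past it.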
Linear Probing [Figure: the same diagram, with table slots 0 through m-1 and H(k2) = H(k5), illustrating a collision to be resolved by linear probing.]
Double Hashing • Double hashing is one of the best methods for dealing with collisions. – The slot location is calculated based upon the hash function H1(k). If the slot is full, then a second hash function H2(k) is calculated and combined with the first hash function to determine a new slot, H(k, i).
Double Hashing – Assume that: • H1(k) = Math.abs(H(k)) % table.length • H2(k) = 1 + Math.abs(H(k)) % (table.length - x) where x is a small value: 1, 2, or 3. – Then: • H(k, i) = (H1(k) + i * H2(k)) % m
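The probe sequence these formulas generate can be sketched as follows. The class and method names are illustrative assumptions, m is taken to be table.length, and the hash value is assumed not to be Integer.MIN_VALUE, for which Math.abs returns a negative number.

```java
import java.util.Arrays;

// Sketch of the double hashing probe sequence.
// Names are illustrative; assumes hash != Integer.MIN_VALUE.
final class DoubleHashing {
    // H1(k) = Math.abs(H(k)) % table.length
    static int h1(int hash, int m) {
        return Math.abs(hash) % m;
    }

    // H2(k) = 1 + Math.abs(H(k)) % (table.length - x)
    static int h2(int hash, int m, int x) {
        return 1 + Math.abs(hash) % (m - x);
    }

    // H(k, i) = (H1(k) + i * H2(k)) % m, computed in long to avoid overflow.
    static int probe(int hash, int i, int m, int x) {
        return (int) ((h1(hash, m) + (long) i * h2(hash, m, x)) % m);
    }

    // Slots visited for i = 0, 1, ..., m - 1.
    static int[] probeSequence(int hash, int m, int x) {
        int[] sequence = new int[m];
        for (int i = 0; i < m; i++) {
            sequence[i] = probe(hash, i, m, x);
        }
        return sequence;
    }

    public static void main(String[] args) {
        // hash 25, m = 11, x = 2: H1 = 3 and H2 = 8.
        // Prints [3, 0, 8, 5, 2, 10, 7, 4, 1, 9, 6]
        System.out.println(Arrays.toString(probeSequence(25, 11, 2)));
    }
}
```

Because m = 11 is prime and H2 is never zero, the sequence visits every slot exactly once, unlike linear probing's run of adjacent slots.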
Double Hashing [Figure: the same diagram, with H(k2) = H(k5) colliding and k5 placed in a different slot, labeled H(k5), near the top of the table.]
External Chaining • In external chaining, the hash table contains an array in which each component can hold more than one element of the hash table. – Essentially, a multidimensional array or a linked list of elements can exist for each table slot. • The typical implementation is that each slot contains a linked list.
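A minimal sketch of the typical implementation, with one linked list per slot. The class and method names are illustrative assumptions, and Math.floorMod stands in for the hash function.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch of a hash table with external chaining.
// Class and method names are illustrative assumptions.
final class ChainedTable<K> {
    private final List<LinkedList<K>> slots = new ArrayList<>();
    private int size;

    ChainedTable(int m) {
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<>()); // one empty chain per slot
        }
    }

    // Picks the chain for the slot the key hashes to.
    private LinkedList<K> chainFor(K key) {
        return slots.get(Math.floorMod(key.hashCode(), slots.size()));
    }

    // Returns false if the key was already stored.
    boolean add(K key) {
        LinkedList<K> chain = chainFor(key);
        if (chain.contains(key)) {
            return false;
        }
        chain.add(key);
        size++;
        return true;
    }

    boolean contains(K key) {
        return chainFor(key).contains(key);
    }

    // Deletion just unlinks the element; no Deleted marker is needed.
    boolean remove(K key) {
        if (chainFor(key).remove(key)) {
            size--;
            return true;
        }
        return false;
    }

    // Number of stored elements divided by the number of slots.
    double loadFactor() {
        return (double) size / slots.size();
    }
}
```

Storing seven keys in a table of three slots works without any probing; the chains simply grow, and the ratio of elements to slots exceeds 1.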
External Chaining [Figure: the same diagram, with k5 hashing to the same slot as k2 and stored in that slot's chain rather than elsewhere in the table.]
Load Factor • The load factor is a fraction that represents the number of elements stored in the table divided by the size of the table’s array. – α = (the number of elements stored in the table) / (the size of the table’s array)
Load Factor – If open addressing is used, then each table slot holds at most one element, therefore, the load factor can never be greater than 1. – If external chaining is used, then each table slot can hold many elements, therefore, the load factor may be greater than 1.
Hashing Analysis • The worst case analysis for hashing is the case where every key is hashed into the same slot. – Θ(n): linear time. • The average time can be much faster.
Average Search Analysis • Searching with linear probing (α is the load factor). – For a table that is not near full: • ½ (1 + 1 / (1 - α)) – For a table that is full or near full: • Math.sqrt(n * π / 8) • Searching with double hashing. – (-ln(1 - α)) / α where ln is the natural logarithm • Searching with chained hashing. – 1 + (α / 2) • See Figure 11.6 in Main, page 561.
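To get a feel for these formulas, they can be evaluated directly. The sketch below transcribes them as written on the slide; the class and method names are illustrative assumptions, and the open addressing formulas assume 0 < α < 1.

```java
// Sketch that evaluates the average search formulas above.
// Names are illustrative; open addressing assumes 0 < alpha < 1.
final class SearchCost {
    // alpha = elements stored / size of the table's array
    static double loadFactor(int elements, int tableSize) {
        return (double) elements / tableSize;
    }

    // Linear probing, table not near full: 1/2 (1 + 1 / (1 - alpha))
    static double linearProbing(double alpha) {
        return 0.5 * (1.0 + 1.0 / (1.0 - alpha));
    }

    // Linear probing, table full or near full: sqrt(n * pi / 8)
    static double linearProbingFull(int n) {
        return Math.sqrt(n * Math.PI / 8.0);
    }

    // Double hashing: -ln(1 - alpha) / alpha
    static double doubleHashing(double alpha) {
        return -Math.log(1.0 - alpha) / alpha;
    }

    // Chained hashing: 1 + alpha / 2
    static double chaining(double alpha) {
        return 1.0 + alpha / 2.0;
    }

    public static void main(String[] args) {
        double alpha = loadFactor(50, 100); // a half-full table
        System.out.println("linear probing: " + linearProbing(alpha));
        System.out.println("double hashing: " + doubleHashing(alpha));
        System.out.println("chaining:       " + chaining(alpha));
    }
}
```

At α = 0.5 the three methods cost about 1.5, 1.39, and 1.25 elements examined per search, so the differences only become large as the table fills.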
Coding Example • The Search Times program demonstrates linear search, binary search, and hashing. – The hashing uses the Hashtable class.
Hashing • Java provides the Hashtable class, but it also provides two other classes. – The HashMap class implements the Map interface using a hash table. – The HashSet class implements the Set interface using a hash table.
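A short sketch of the three classes in use. The class name, the helper methods, and the sample words are illustrative assumptions; only the java.util classes themselves come from the standard library.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Map;
import java.util.Set;

// Sketch showing Hashtable, HashMap, and HashSet side by side.
// The class name, helpers, and sample data are illustrative assumptions.
final class HashCollectionsDemo {
    // HashMap: maps each word to the number of times it occurs.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // HashSet: keeps each word once, with no associated value.
    static Set<String> distinctWords(String[] words) {
        Set<String> seen = new HashSet<>();
        for (String word : words) {
            seen.add(word);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] words = {"to", "be", "or", "not", "to", "be"};
        // Hashtable: the older synchronized class, here copied from a Map.
        Hashtable<String, Integer> table = new Hashtable<>(countWords(words));
        System.out.println(table.get("to"));             // prints 2
        System.out.println(distinctWords(words).size()); // prints 4
    }
}
```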