
Data Structures for Java
William H. Ford, William R. Topp
Chapter 21: Hashing as a Map Implementation
Bret Ford
© 2005, Prentice Hall
© 2005 Pearson Education, Inc., Upper Saddle River, NJ. All rights reserved.

Introduction to Hashing
  • A hash table distributes elements in a series of linked lists, referred to as buckets.
  • A hash function maps a value to an index in the table. The function provides access to an element much like an index provides access to an array element.
  • Like a binary search tree, a hash table provides an implementation of the Set and Map interfaces.

Introduction to Hashing (continued)
  • A binary search tree can access data stored by value with O(log₂ n) average search time.
  • We would like to design a storage structure that yields O(1) average retrieval time. In this way, access to an item is independent of the number of other items in the collection.

Introduction to Hashing (continued)
  • A hash table is an array of references. Associated with the table is a hash function that takes a key as an argument and returns an integer value.
  • By using the remainder after dividing the hash value by the table size, we have a mapping of the key to an index in the table.

Introduction to Hashing (concluded)
Hash value: hashValue = hf(key)
Hash table index: hashValue % n

Using a Hash Function
  • Consider the hash function hf(x) = x, where x is a nonnegative integer (the identity function). Assume the table is the array tableEntry with n = 7 elements.

Using a Hash Function (concluded)
  • With hash function hf() and table size n, the table index for a key is i = hf(key) % n. With the identity function, collisions occur for any two keys that differ by a multiple of n.
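The index calculation can be tried out with a few keys. The following is a minimal sketch, not code from the text: the class name IdentityHashDemo and the sample keys are illustrative assumptions. It uses the identity hash function and a table of n = 7 entries.

```java
// Sketch only: IdentityHashDemo and the sample keys are illustrative
// assumptions, not part of the original slides.
final class IdentityHashDemo {
    // identity hash function hf(x) = x for nonnegative x
    static int hf(int x) {
        return x;
    }

    // table index i = hf(key) % n
    static int tableIndex(int key, int n) {
        return hf(key) % n;
    }

    public static void main(String[] args) {
        int n = 7;
        int[] keys = {4, 11, 18, 22};
        // 4, 11 and 18 differ by multiples of 7, so they collide at index 4
        for (int key : keys)
            System.out.println(key + " -> " + tableIndex(key, n));
    }
}
```

Keys 4, 11 and 18 all land at index 4, while 22 lands at index 1.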

Designing Hash Functions
  • Some general design principles guide the creation of all hash functions.
    • Evaluating a hash function should be efficient.
    • A hash function should produce uniformly distributed hash values. This spreads the hash table indices around the table, which helps minimize collisions.

Designing Hash Functions (continued)
  • The Java programming language provides a general hashing function with the hashCode() method in the Object superclass.
    public int hashCode() { … }

Designing Hash Functions (continued)
  • Object's hashCode() converts the internal address of the object into an integer value, which has limited application since two different objects will normally have different values for hashCode(), even if they store the same data.
    // strings one and two are the same; under Object's hashCode() that
    // would not be so for the integer values one.hashCode() and two.hashCode()
    String one = "java", two = "java";
  • The String class itself overrides hashCode(), so in practice equal strings return equal hash values.

Designing Hash Functions (continued)
  • The Integer class provides the identity function for hashCode().
    public int hashCode() { return value; }
  • Unless the integer data has random characteristics, this is not a good hash function.

Designing Hash Functions (continued)
  • In the majority of hash-table applications, the key is a string.
  • To create an efficient hash function, we must combine the sequence of characters in the string to form an integer.
    // s holds the characters of the string and n is its length
    public int hashCode() {
        int hash = 0;
        for (int i = 0; i < n; i++)
            hash = 31*hash + s[i];
        return hash;
    }

Designing Hash Functions (concluded)
The following are hash code values for three different strings. The value for string strB is a negative number due to integer overflow.
    String strA = "and", strB = "uncharacteristically", strC = "algorithm";
    hashValue = strA.hashCode();    // hashValue = 96727
    hashValue = strB.hashCode();    // hashValue = -2112884372
    hashValue = strC.hashCode();    // hashValue = 225490031
In general, a hash function may result in integer overflow and return a negative number. The following calculation ensures that the table index is nonnegative.
    tableIndex = (hashValue & Integer.MAX_VALUE) % tableSize
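The character-combining hash and the nonnegative index calculation can be checked side by side. This is a sketch under stated assumptions: the class StringHashDemo, the helper names polyHash and tableIndex, and the table size 17 are hypothetical choices for illustration. The multiplier 31 and the masking with Integer.MAX_VALUE follow the slides.

```java
// Sketch only: StringHashDemo, polyHash and tableIndex are hypothetical names.
final class StringHashDemo {
    // combine the characters of s as hash = 31*hash + s[i]
    static int polyHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++)
            hash = 31 * hash + s.charAt(i);   // int overflow wraps silently
        return hash;
    }

    // clear the sign bit before taking the remainder, so the index
    // is always in the range 0 .. tableSize-1
    static int tableIndex(int hashValue, int tableSize) {
        return (hashValue & Integer.MAX_VALUE) % tableSize;
    }

    public static void main(String[] args) {
        String[] words = {"and", "uncharacteristically", "algorithm"};
        int tableSize = 17;   // assumed table size for the demonstration
        for (String w : words) {
            int h = polyHash(w);
            System.out.println(w + ": hash = " + h
                + ", matches String.hashCode() = " + (h == w.hashCode())
                + ", index = " + tableIndex(h, tableSize));
        }
    }
}
```

Masking with Integer.MAX_VALUE is used instead of Math.abs() because Math.abs(Integer.MIN_VALUE) is still negative.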

User-Defined Hash Functions
  • To create a custom hash function, a class overrides the method hashCode().
  • For the Time24 class, the hash value for an object is its time converted to minutes. Since hour and minute are normalized to fall within the ranges 0 to 23 and 0 to 59 respectively, each time has a unique hash value.
    public int hashCode() {
        // hash value is time in minutes;
        // as normalized time, value is nonnegative
        return hour*60 + minute;
    }

User-Defined Hash Functions (continued)
  • The custom hash function for Product objects must mix the bits for the serial number to create a random value.
    public class Product {
        // last 4 digits record year in which the product was made.
        // identity hash function is not sufficient
        private int serialNum;
        . . .
    }

User-Defined Hash Functions (concluded)
    public class Product {
        private int serialNum;
        . . .
        public int hashCode() {
            // assign serialNum to a long variable
            long hashValue = serialNum;
            // square to obtain a nonnegative long integer
            hashValue *= hashValue;
            // return the remainder after dividing
            // by the largest int value; its bits
            // are "jumbled up"
            return (int)(hashValue % Integer.MAX_VALUE);
        }
    }
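The squaring step relies on long arithmetic, because the square of an int can exceed the int range. A minimal sketch of the same calculation as a standalone helper follows. The class name ProductHashDemo and the static method hash are assumptions made for illustration, and the sample serial numbers are arbitrary.

```java
// Sketch only: ProductHashDemo.hash is a hypothetical standalone version
// of the Product hashCode() calculation.
final class ProductHashDemo {
    static int hash(int serialNum) {
        long hashValue = serialNum;   // widen before squaring
        hashValue *= hashValue;       // nonnegative, and fits in a long
        return (int) (hashValue % Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // 46341 * 46341 = 2147488281, which no longer fits in an int
        System.out.println(hash(46341));   // 2147488281 % 2147483647 = 4634
        System.out.println(hash(-5));      // 25
    }
}
```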

Designing Hash Tables
  • When two or more data items hash to the same table index, they cannot occupy the same position in the table.
  • We are left with the option of locating one of the items at another position in the table (linear probing) or of redesigning the table to store a sequence of colliding keys at each index (chaining with separate lists).

Linear Probing
  • The hash table is an array of elements with an associated hash function.
  • To add an item:
    • Initially, tag each entry in the table as "empty".
    • Apply the hash function to the key and take the remainder after dividing the value by the table size to obtain a table index. If the entry is empty, insert the item.
    • Otherwise, start at the next table index and scan successive indices, wrapping around to the start of the table after probing the last table entry. An insertion occurs at the first open location.

Linear Probing (continued)
  • If the search returns to the original hash location without finding an open slot, the table is full, and the linear probing algorithm throws an exception.
    tableIndex = x % 11

Linear Probing (continued)
    // compute hash index of item for a table of size n
    int index = (item.hashCode() & Integer.MAX_VALUE) % n, origIndex;
    // save the original hash index
    origIndex = index;
    // cycle through the table looking for an empty slot, a
    // match or a table full condition (origIndex == index).
    do {
        // test whether the table slot is empty or the key matches
        // the data field of the table entry
        if table[index] is empty
            insert item in table at table[index] and return
        else if table[index] matches item
            return
        // begin a probe starting at the next table location
        index = (index+1) % n;
    } while (index != origIndex);
    // we have gone around table without finding match or open slot
    throw new BufferOverflowException();
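The pseudocode above can be turned into a small runnable class. The following is a sketch under assumptions, not the book's implementation: the class name LinearProbeDemo is hypothetical, the table stores Integer values, a null slot stands for "empty", and add() returns the index where the item resides so that the probe sequence can be observed.

```java
import java.nio.BufferOverflowException;

// Sketch only: LinearProbeDemo is a hypothetical class; null marks an empty slot.
final class LinearProbeDemo {
    private final Integer[] table;

    LinearProbeDemo(int n) {
        table = new Integer[n];   // all slots start out empty (null)
    }

    // insert item if absent; return the index at which item is stored
    int add(int item) {
        int n = table.length;
        int index = (Integer.hashCode(item) & Integer.MAX_VALUE) % n;
        int origIndex = index;
        do {
            if (table[index] == null) {   // open slot: insert here
                table[index] = item;
                return index;
            }
            if (table[index] == item)     // already present (unboxed compare)
                return index;
            index = (index + 1) % n;      // probe the next slot, wrapping around
        } while (index != origIndex);
        // went all the way around without finding a match or an open slot
        throw new BufferOverflowException();
    }

    public static void main(String[] args) {
        LinearProbeDemo t = new LinearProbeDemo(11);
        int[] items = {54, 77, 94, 89, 14, 45, 35, 76};
        for (int item : items)
            System.out.println(item + " stored at index " + t.add(item));
    }
}
```

With a table of size 11, the key 45 hashes to index 1, which 89 already occupies, so it settles at index 2. The key 76 hashes to index 10 and has to probe past six occupied slots before landing at index 5.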

Linear Probing (concluded)
  • If the size of the table is large relative to the number of items, linear probing works well, because a good hash function generates indices that are evenly distributed over the table range, and collisions will be minimal. As the ratio of table size to the number of items approaches 1, the algorithm deteriorates to the sequential search.

Chaining with Separate Lists
  • Chaining with separate lists defines the hash table as an indexed sequence of linked lists. Each list, called a bucket, holds a set of items that hash to the same table location.

Chaining with Separate Lists (continued)
  • A bucket is a singly linked list. Each entry of the array references the first node in a sequence of items that hash to the table index. A node has the familiar structure with two fields, one for the value and one for the reference to the next node.

Chaining with Separate Lists (continued)
  • To add object item, use the hash function to identify the index of the appropriate bucket in the array (table).
    • If table[i] is null, add item as the first entry in the list.
    • Otherwise begin with the first node, entry = table[i], and compare item with entry.nodeValue. If there is no match, continue the scan with node entry.next, and so forth. If item is not in the list, add it to the front of the list.

Chaining with Separate Lists (continued)
Consider the following sequence of eight elements {54, 77, 94, 89, 14, 45, 35, 76} with the identity hash function and tableSize = 11. The figure displays the lists. Each entry in a table includes the number of probes to add the element.
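Since the figure is not reproduced here, the buckets for this sequence can be rebuilt with a short program. This is a sketch under assumptions: the class ChainingDemo is hypothetical, it uses java.util.LinkedList for each bucket rather than hand-built nodes, and an empty bucket is an empty list instead of a null reference. Front insertion and the identity hash with tableSize = 11 follow the slides.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch only: ChainingDemo is a hypothetical class that uses library lists
// in place of the hand-built nodes described in the slides.
final class ChainingDemo {
    static List<LinkedList<Integer>> build(int[] items, int tableSize) {
        List<LinkedList<Integer>> table = new ArrayList<>();
        for (int i = 0; i < tableSize; i++)
            table.add(new LinkedList<>());
        for (int item : items) {
            // identity hash function, so the bucket index is item % tableSize
            LinkedList<Integer> bucket = table.get(item % tableSize);
            if (!bucket.contains(item))
                bucket.addFirst(item);   // new items go to the front of the list
        }
        return table;
    }

    public static void main(String[] args) {
        int[] items = {54, 77, 94, 89, 14, 45, 35, 76};
        List<LinkedList<Integer>> table = build(items, 11);
        for (int i = 0; i < table.size(); i++)
            System.out.println(i + ": " + table.get(i));
    }
}
```

Only buckets 1 and 10 hold more than one element: 89 and 45 both hash to index 1, and 54 and 76 both hash to index 10.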

Chaining with Separate Lists (concluded)
  • Chaining with separate lists is generally faster than linear probing since chaining only searches items that hash to the same table location.
  • With linear probing, the number of table entries is limited to the table size, whereas the linked lists used in chaining grow as necessary.
  • To delete an element, just erase it from the associated list.

Rehashing
  • As the number of entries in the hash table increases, search performance deteriorates. Rehashing increases the hash table size when the number of entries in the table is a specified percentage of its size.

A Hash Table as a Collection
  • The generic class Hash stores elements in a hash table using chaining with separate lists and implements the Collection interface.
    • hashCode() must be provided by the generic type.
    • The constructor creates a hash table with initial size 17. The table grows as rehashing occurs.
    • The method toString() returns a comma-separated list that, by the nature of hashing, is not ordered.


Hash Class Implementation
  • The hash table is an array whose elements each reference the first node in a singly linked list.
  • Define an inner class Entry with an integer field hashValue that stores the hash code value and avoids recomputing the hash function during rehashing.
    hashValue = item.hashCode() & Integer.MAX_VALUE;

Entry Inner Class
    private static class Entry<T> {
        // value in the hash table
        T value;
        // save value.hashCode() & Integer.MAX_VALUE
        int hashValue;
        // next entry in the linked list
        // of colliding values
        Entry<T> next;

        // entry with given data and node value
        Entry(T value, int hashValue, Entry<T> next) {
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
    }

Hash Class Instance Variables
  • The Entry array, table, defines the singly linked lists that store the elements.
  • The integer variable hashTableSize specifies the number of entries in the table.
  • The variable tableThreshold has the value
    (int)(table.length * MAX_LOAD_FACTOR)
    where the double constant MAX_LOAD_FACTOR specifies the maximum allowed ratio of the number of elements in the table to the table size.

Hash Class Instance Variables (concluded)
  • MAX_LOAD_FACTOR = 0.75 (number of hash table entries is 75% of the table size) is generally a good value. When the number of elements in the table equals tableThreshold, a rehash occurs.
  • The variable modCount is used by iterators to determine whether external updates may have invalidated the scan.

Hash Class Constructor
  • The Hash class constructor creates the 17-element array table with 17 empty lists. A rehash will first occur when the hash collection size equals 12.
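The value 12 comes from truncating 17 * 0.75 = 12.75 to an int. The sketch below tabulates the threshold for the first few table sizes. The class name RehashSchedule is an assumption, and the growth rule 2*table.length + 1 is the one the slides give for add().

```java
// Sketch only: RehashSchedule is a hypothetical helper for the arithmetic.
final class RehashSchedule {
    static final double MAX_LOAD_FACTOR = 0.75;

    // number of elements at which a table of this length is rehashed
    static int threshold(int tableLength) {
        return (int) (tableLength * MAX_LOAD_FACTOR);
    }

    // size of the table after a rehash
    static int nextTableSize(int tableLength) {
        return 2 * tableLength + 1;
    }

    public static void main(String[] args) {
        int length = 17;   // initial table size
        for (int i = 0; i < 3; i++) {
            System.out.println("table size " + length
                + ", rehash at " + threshold(length) + " elements");
            length = nextTableSize(length);
        }
    }
}
```

The table sizes run 17, 35, 71 and the corresponding thresholds are 12, 26 and 53.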

Hash Class Outline
    public class Hash<T> implements Collection<T> {
        // the hash table
        private Entry[] table;
        private int hashTableSize;
        private final double MAX_LOAD_FACTOR = .75;
        private int tableThreshold;
        // for iterator consistency checks
        private int modCount = 0;

        // construct an empty hash table with 17 buckets
        public Hash() {
            table = new Entry[17];
            hashTableSize = 0;
            tableThreshold = (int)(table.length * MAX_LOAD_FACTOR);
        }
        . . .
    }

Hash Class add()
  • The algorithm for add():
    • Compute the hash index for the parameter item and scan the list to see if item is currently in the hash table. If so, return false.
    • Create a new Entry with value item and insert it at the front of the list.
    • hashValue is assigned to the entry so it will not have to be computed when rehashing occurs.

Hash Class add() (continued)
  • Increment hashTableSize and modCount. If hashTableSize ≥ tableThreshold, call rehash(). The size of the new table is
    2*table.length + 1

Hash add()
    // add item to the hash table if it is not
    // already present and return true; otherwise,
    // return false
    public boolean add(T item) {
        // compute the hash table index
        int hashValue = item.hashCode() & Integer.MAX_VALUE,
            index = hashValue % table.length;
        Entry<T> entry;

        // entry references the front of a linked
        // list of colliding values
        entry = table[index];

        // scan the linked list and return false
        // if item is in list
        while (entry != null) {
            if (entry.value.equals(item))
                return false;
            entry = entry.next;
        }

        // we will add item, so increment modCount
        modCount++;

        // create the new table entry so its successor
        // is the current head of the list
        entry = new Entry<T>(item, hashValue, (Entry<T>)table[index]);
        // add it at the front of the linked list
        // and increment the size of the hash table
        table[index] = entry;
        hashTableSize++;

        if (hashTableSize >= tableThreshold)
            rehash(2*table.length + 1);

        // a new entry is added
        return true;
    }

Hash Class rehash()
  • The method rehash() takes the size of the new hash table as an argument and performs rehashing.
  • Create a new table with the specified size and cycle through the nodes in the original table. For each node, use the hashValue field modulo the new table size to hash to the new index. Insert the node at the front of the linked list.

Hash Class rehash() (continued)
    private void rehash(int newTableSize) {
        // allocate the new hash table and
        // record a reference to the current
        // one in oldTable
        Entry[] newTable = new Entry[newTableSize],
                oldTable = table;
        Entry<T> entry, nextEntry;
        int index;

        // cycle through the current hash table
        for (int i = 0; i < table.length; i++) {
            // record the current entry
            entry = table[i];
            // see if there is a linked list present
            if (entry != null) {
                // have at least one element in a linked list
                do {
                    // record the next entry in the
                    // original linked list
                    nextEntry = entry.next;
                    // compute the new table index
                    index = entry.hashValue % newTableSize;
                    // insert entry at the front of the
                    // new table's linked list at
                    // location index
                    entry.next = newTable[index];
                    newTable[index] = entry;
                    // assign the next entry in the
                    // original linked list to entry
                    entry = nextEntry;
                } while (entry != null);
            }
        }

        // the table is now newTable
        table = newTable;
        // update the table threshold
        tableThreshold = (int)(table.length * MAX_LOAD_FACTOR);
        // let garbage collection get rid of oldTable
        oldTable = null;
    }

Hash remove()
  • Compute the hash table index. Using variables prev and curr that move through the linked list in tandem, search for item. If not present, return false; otherwise, remove item from the list. If prev == null, this involves updating table[index] to reference the successor to the front of the list. Decrement hashTableSize, increment modCount, and return true.

Hash remove() (continued)
    public boolean remove(Object item) {
        // compute the hash table index
        int index = (item.hashCode() & Integer.MAX_VALUE) % table.length;
        Entry<T> curr, prev;

        // curr references the front of a
        // linked list of colliding values;
        // initialize prev to null
        curr = table[index];
        prev = null;

        // scan the linked list for item
        while (curr != null)
            if (curr.value.equals(item)) {
                // we have located item and will remove
                // it; increment modCount
                modCount++;
                // if prev is not null, curr is not the front
                // of the list; just skip over curr
                if (prev != null)
                    prev.next = curr.next;
                else
                    // curr is front of the list; the
                    // new front of the list is curr.next
                    table[index] = curr.next;
                // decrement hash table size and return true
                hashTableSize--;
                return true;
            }
            else {
                // move prev and curr forward
                prev = curr;
                curr = curr.next;
            }

        return false;
    }

Hash Class Iterators
  • Search the hash table for the first nonempty bucket in the array of linked lists. Once the bucket is located, the iterator traverses all of the elements in the corresponding linked list and then continues the process by looking for the next nonempty bucket. The iterator reaches the end of the table when it reaches the end of the list for the last nonempty bucket.

Hash Class Iterators (continued)
  • Iterator objects are instances of the inner class IteratorImpl whose variables are:
    • The integer index that identifies the current bucket (table[index]) scanned by the iterator.
    • The Entry reference next pointing to the current node in the current bucket.
    • The variable lastReturned that references the last value returned by next().
    • The iterator variable expectedModCount used in conjunction with the collection variable modCount.

Hash Class Iterators (continued)
    // inner class that implements hash table iterators
    private class IteratorImpl implements Iterator<T> {
        // next entry to return
        Entry<T> next;
        // to check iterator consistency
        int expectedModCount;
        // index of current bucket
        int index;
        // reference to the last value returned by next()
        T lastReturned;
        . . .
    }

Hash Class Iterators (continued)
  • The elements enter the collection in the order (19, 32, 11, 27) using the identity hash function. The iterator visits the elements in the order (11, 32, 27, 19).

Hash Iterator Constructor
  • A loop iterates up the list of buckets until it locates the first nonempty bucket. The loop variable i becomes the initial value for index and table[i] references the front of the list. This is the initial value for next.

Hash Iterator Constructor (concluded)
    IteratorImpl() {
        int i = 0;
        Entry<T> n = null;

        // the expected modCount starts at modCount
        expectedModCount = modCount;

        // find the first nonempty bucket
        if (hashTableSize != 0)
            while (i < table.length && ((n = table[i]) == null))
                i++;

        next = n;
        index = i;
        lastReturned = null;
    }

Hash Iterator next()
  • The method next() first determines that the operation is valid by checking that modCount and expectedModCount are equal and that we are not at the end of the hash table.
  • If the iterator is in a consistent state, next() saves entry.value in lastReturned and uses a loop index i and entry to perform the iterator scan for the subsequent element in the hash table. The return value is lastReturned.

Hash Iterator next() (continued)
    public T next() {
        // check for iterator consistency
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        // we will return the value in Entry object next
        Entry<T> entry = next;

        // if entry is null, we are at the end of the table
        if (entry == null)
            throw new NoSuchElementException();

        // capture the value we will return
        lastReturned = entry.value;

        // move to the next entry in the current
        // linked list
        Entry<T> n = entry.next;
        // record the current bucket index
        int i = index;

        if (n == null) {
            // we are at the end of a bucket; search for the
            // next nonempty bucket
            i++;
            while (i < table.length && (n = table[i]) == null)
                i++;
        }

        index = i;
        next = n;

        return lastReturned;
    }

Hash Iterator remove()
  • The remove() method first determines that the operation is valid by checking that lastReturned is not null and that modCount and expectedModCount are equal.
  • If all is well, the iterator remove() method calls the Hash class remove() method with lastReturned as the argument. By assigning to expectedModCount the current value of modCount, the iterator remains consistent.

Hash Iterator remove() (concluded)
    public void remove() {
        // check for a missing call to next() or previous()
        if (lastReturned == null)
            throw new IllegalStateException(
                "Iterator call to next() " +
                "required before calling remove()");
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();

        // remove lastReturned by calling remove() in Hash;
        // this call will increment modCount
        Hash.this.remove(lastReturned);
        expectedModCount = modCount;
        lastReturned = null;
    }

The HashMap Collection
  • The design of the HashMap collection is similar to the implementation of TreeMap. A HashMap is not ordered since the position of elements depends on hashing the keys. This affects the method toString(), which returns a listing of the elements based on the iterator order.

The HashMap Collection (continued)
  • The HashMap class stores elements in a hash table containing linked lists of Entry objects. The inner class Entry contains key-value pairs.

The HashMap Collection (continued)
  • The inner class Entry implements the Map.Entry interface, which defines the methods getKey(), getValue() and setValue(). A toString() method returns a representation of an entry in the format "key=value". The constructor has arguments for each field in the node.

Entry Class (partial listing)
    static class Entry<K, V> implements Map.Entry<K, V> {
        K key;
        V value;
        Entry<K, V> next;
        int hashValue;

        // make a new entry with given key, value
        Entry(K key, V value, int hashValue, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
        . . .
    }

Accessing Entries in a HashMap
  • The methods get() and containsKey() take a key reference argument and must locate a corresponding entry in the map.
  • This task is performed by the private HashMap method getEntry() which takes a key as an argument, applies the hash function to the

Accessing Entries in a HashMap (continued)

    // return a reference to the entry with the specified key
    // if there is one in the hash map; otherwise, return null
    public Entry<K, V> getEntry(K key) {
        int index = (key.hashCode() & Integer.MAX_VALUE) % table.length;
        Entry<K, V> entry;

        entry = table[index];
        while (entry != null) {
            if (entry.key.equals(key))
                return entry;
            entry = entry.next;
        }
        return null;
    }
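The index computation in getEntry() masks the hash code before taking the remainder. A small sketch of why that step matters follows; the names BucketIndexDemo and indexFor are illustrative and do not come from the slides. In Java, hashCode() may return a negative int and the % operator keeps the sign of its left operand, so an unmasked remainder could be a negative, invalid array index.

```java
// Illustrative sketch; BucketIndexDemo and indexFor are hypothetical names.
class BucketIndexDemo {
    // Same computation as in getEntry(): clear the sign bit, then reduce
    // the non-negative result modulo the table length.
    static int indexFor(Object key, int tableLength) {
        return (key.hashCode() & Integer.MAX_VALUE) % tableLength;
    }

    public static void main(String[] args) {
        Integer key = -7;   // Integer.hashCode() is the int value itself
        // Unmasked remainder is negative, so it cannot index an array.
        System.out.println(key.hashCode() % 11);
        // Masked remainder always falls in the range 0 .. tableLength-1.
        System.out.println(indexFor(key, 11));
    }
}
```

Masking with Integer.MAX_VALUE is preferable to Math.abs() here, because Math.abs(Integer.MIN_VALUE) is still negative.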

Accessing Entries in a HashMap (concluded)

    // returns the value that corresponds to
    // the specified key
    public V get(K key) {
        Entry<K, V> p = getEntry(key);

        if (p == null)
            return null;
        else
            return p.value;
    }

Updating Entries in a HashMap
- The method put() updates the HashMap. Construct a table index by applying the hash function to the key and scan the linked list for a match with the key. If a match occurs, apply setValue() and return its result.
- If key does not occur in the list, insert a new Entry object at the front of the linked list. If the hash map size has reached the table threshold, apply rehashing. Conclude by returning null.

Updating Entries in a HashMap (continued)

    // assigns value as the value associated with key
    // in this map and returns the previous value
    // associated with the key, or null if there
    // was no mapping for the key
    public V put(K key, V value) {
        // compute the hash table index
        int hashValue = key.hashCode() & Integer.MAX_VALUE,
            index = hashValue % table.length;
        Entry<K, V> entry;

        // entry references the front of a linked
        // list of colliding values
        entry = table[index];

Updating Entries in a HashMap (continued)

        // scan the linked list. if key matches the key in an
        // entry, return entry.setValue(value). this
        // replaces the value in the entry and returns the
        // previous value
        while (entry != null) {
            if (entry.key.equals(key))
                return entry.setValue(value);
            entry = entry.next;
        }

        // we will add item, so increment modCount
        modCount++;

        // create the new table entry so its successor
        // is the current head of the list
        entry = new Entry<K, V>(key, value, hashValue,
                                (Entry<K, V>)table[index]);

Updating Entries in a HashMap (concluded)

        // add it at the front of the linked list
        // and increment the size of the hash map
        table[index] = entry;
        hashMapSize++;

        if (hashMapSize >= tableThreshold)
            rehash(2*table.length + 1);

        return null;    // a new entry is inserted
    }
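To see the put() fragments working together, here is a compact, self-contained sketch of a chained hash map. It is not the book's class: the name SketchHashMap, the initial table length of 17, the 0.75 load factor used to set the threshold, and the body of rehash() are all assumptions made for illustration. Only the overall flow (mask, index, scan, insert at front, grow to 2*length + 1) follows the slides.

```java
// A minimal sketch of a chained hash map; not the book's full class.
// Table length 17, load factor 0.75 and the rehash() body are assumptions.
class SketchHashMap<K, V> {
    static class Entry<K, V> {
        final K key;
        V value;
        final int hashValue;
        Entry<K, V> next;

        Entry(K key, V value, int hashValue, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.hashValue = hashValue;
            this.next = next;
        }
    }

    private static final double MAX_LOAD_FACTOR = 0.75;   // assumed value
    private Entry<K, V>[] table = newTable(17);
    private int size = 0;
    private int threshold = (int) (17 * MAX_LOAD_FACTOR);

    @SuppressWarnings("unchecked")
    private static <K, V> Entry<K, V>[] newTable(int length) {
        return (Entry<K, V>[]) new Entry[length];
    }

    public V get(K key) {
        int hashValue = key.hashCode() & Integer.MAX_VALUE;
        for (Entry<K, V> e = table[hashValue % table.length]; e != null; e = e.next)
            if (e.key.equals(key))
                return e.value;
        return null;
    }

    public V put(K key, V value) {
        int hashValue = key.hashCode() & Integer.MAX_VALUE;
        int index = hashValue % table.length;
        for (Entry<K, V> e = table[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {        // key present: replace the value
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // key absent: link a new entry at the front of the bucket
        table[index] = new Entry<>(key, value, hashValue, table[index]);
        size++;
        if (size >= threshold)
            rehash(2 * table.length + 1);
        return null;
    }

    // Relink every entry into a larger table, reusing the stored hashValue
    // so that hashCode() is not called again.
    private void rehash(int newLength) {
        Entry<K, V>[] old = table;
        table = newTable(newLength);
        threshold = (int) (newLength * MAX_LOAD_FACTOR);
        for (Entry<K, V> head : old) {
            while (head != null) {
                Entry<K, V> next = head.next;
                int index = head.hashValue % newLength;
                head.next = table[index];
                table[index] = head;
                head = next;
            }
        }
    }

    public int size() { return size; }
    int tableLength() { return table.length; }
}
```

Storing hashValue in each entry, as the Entry class in the slides does, is what lets a rehash redistribute entries without recomputing hash codes.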

Summary of HashMap Design

HashSet Class
- The HashSet class uses a HashMap by composition. The class defines a static Object reference called PRESENT. This becomes the value component for each entry in the map. The constant reference serves as a dummy placeholder in an entry pair.
- Declare a private instance variable map of type HashMap having T as the type of the set elements and Object as the value type. The constructor instantiates the map collection. This has the effect of creating an empty set.

HashSet Class (continued)

    public class HashSet<T> implements Set<T> {
        // value for each key in the map
        private static final Object PRESENT = new Object();

        // set implemented using a hash map
        private HashMap<T, Object> map;

        // create an empty set object
        public HashSet() {
            map = new HashMap<T, Object>();
        }
        . . .
    }

HashSet add()
- The set methods are implemented with map methods that use the entry <item, PRESENT> as the argument.
- add() uses the map method put(). If a duplicate exists, then put() simply updates the value field of the entry to PRESENT, which is its current value. The map method returns null if a new element is added, so a return value of null indicates that add() inserted item.

HashSet add() (concluded)

    public boolean add(T item) {
        return map.put(item, PRESENT) == null;
    }

HashSet iterator()
- The HashSet iterator must traverse the keys in the map. Implement the method iterator() by returning an iterator for the key set collection view of the map.

    // returns an iterator for the elements in the set
    public Iterator<T> iterator() {
        return map.keySet().iterator();
    }

HashSet remove()
- The HashSet remove() method calls the remove() method for the map. To determine whether an element was removed from the set, verify that the return value from the map remove() call is the reference PRESENT.

    public boolean remove(Object obj) {
        return map.remove(obj) == PRESENT;
    }
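The add(), iterator() and remove() fragments can be assembled into one runnable sketch of a set built by composition. To keep the example self-contained it wraps java.util.HashMap rather than the book's HashMap class; that substitution, the class name SketchHashSet, and the extra contains() and size() methods are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Iterator;

// Sketch of set-by-composition; java.util.HashMap stands in for the
// book's HashMap class (an assumption made so the example runs alone).
class SketchHashSet<T> implements Iterable<T> {
    // shared dummy value stored with every key
    private static final Object PRESENT = new Object();

    private final HashMap<T, Object> map = new HashMap<>();

    // put() returns null only when the key was not already in the map
    public boolean add(T item) {
        return map.put(item, PRESENT) == null;
    }

    // remove() returns the old value (PRESENT) only when the key existed
    public boolean remove(Object obj) {
        return map.remove(obj) == PRESENT;
    }

    public boolean contains(Object obj) {
        return map.containsKey(obj);
    }

    public int size() {
        return map.size();
    }

    // iterate over the keys, which are the set elements
    @Override
    public Iterator<T> iterator() {
        return map.keySet().iterator();
    }
}
```

Because every entry shares the single PRESENT object, the set costs one reference per element beyond what the map itself needs.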

Hash Table Performance
- A good hash function provides a uniform distribution of hash values.
- Hash table performance is measured by using the load factor λ = n/m, where n is the number of elements in the hash table and m is the number of buckets.
    - For linear probe, 0 ≤ λ ≤ 1.
    - For chaining with separate lists, it is possible that λ > 1.

Hash Table Performance (continued)
- The worst case for linear probe or chaining with separate lists occurs when all data items hash to the same table location. If the table contains n elements, the search time is O(n), no better than that for the sequential search.

Hash Table Performance (continued)
- Assume that the hash function uniformly distributes indices around the hash table.
- We can expect λ = n/m elements in each bucket. On the average, an unsuccessful search makes λ comparisons before arriving at the end of a list and returning failure.
- Mathematical analysis shows that the average number of probes for a successful search is approximately 1 + λ/2.

Hash Table Performance (concluded)
- Assume the number of elements n in the hash table is bounded by some amount, say, R*m, where m is the table size.
- In this case, λ = n/m ≤ (R*m)/m = R, and the following relationships hold for the average cases, so the average running time is O(1)!

    S ≈ 1 + λ/2 ≤ 1 + R/2    (Successful Search)
    U = λ ≤ R                (Unsuccessful Search)
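The two estimates can be checked empirically. The sketch below is an illustration, not part of the original slides: it assigns n items to m buckets with java.util.Random (standing in for a uniform hash function), then counts comparisons for chaining with separate lists. The names ChainProbeDemo and averageComparisons, the seed, and the values of n and m are arbitrary choices.

```java
import java.util.Random;

// Sketch: simulate chaining with uniformly chosen buckets and measure
// the average number of comparisons. Names and parameters are illustrative.
class ChainProbeDemo {
    // returns {average successful comparisons, average unsuccessful comparisons}
    static double[] averageComparisons(int n, int m, long seed) {
        int[] chainLength = new int[m];
        Random rnd = new Random(seed);
        for (int i = 0; i < n; i++)
            chainLength[rnd.nextInt(m)]++;      // uniform hashing assumption

        long successful = 0;    // finding the k-th node of a chain costs k comparisons
        long unsuccessful = 0;  // a miss scans one entire chain
        for (int len : chainLength) {
            successful += (long) len * (len + 1) / 2;
            unsuccessful += len;
        }
        // successful: averaged over the n stored items
        // unsuccessful: averaged over the m equally likely buckets
        return new double[] { (double) successful / n, (double) unsuccessful / m };
    }

    public static void main(String[] args) {
        int n = 10000, m = 5000;
        double lambda = (double) n / m;
        double[] avg = averageComparisons(n, m, 42L);
        System.out.printf("lambda = %.2f%n", lambda);
        System.out.printf("successful:   measured %.3f, estimate 1 + lambda/2 = %.3f%n",
                          avg[0], 1 + lambda / 2);
        System.out.printf("unsuccessful: measured %.3f, estimate lambda = %.3f%n",
                          avg[1], lambda);
    }
}
```

The unsuccessful average equals λ exactly, since the chain lengths always sum to n. The successful average only approximates 1 + λ/2 and varies slightly with the seed.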

Evaluating Ordered and Unordered Sets and Maps
- Use an ordered set or map if an iteration should return elements in order (average search time O(log₂ n)).
- Use an unordered set or map when fast access and updates are needed without any concern for the ordering of elements (average search time O(1)).
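The difference in iteration order is easy to observe. The following sketch uses java.util.TreeMap and java.util.HashMap as stand-ins for the book's ordered and unordered maps; the class name OrderingDemo and the sample keys are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch comparing iteration order; java.util classes stand in for the
// book's TreeMap and HashMap.
class OrderingDemo {
    // copies the keys out in whatever order the map's iterator yields them
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> ordered = new TreeMap<>();
        Map<String, Integer> unordered = new HashMap<>();
        String[] words = { "pear", "apple", "fig", "kiwi" };
        for (int i = 0; i < words.length; i++) {
            ordered.put(words[i], i);
            unordered.put(words[i], i);
        }
        System.out.println(ordered);    // keys appear in ascending order
        System.out.println(unordered);  // order depends on hashing; do not rely on it
    }
}
```

Both maps hold the same entries; only the TreeMap guarantees that toString() and iteration list them in key order.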

Timing Example
- Program SearchComp.java:
    - Reads a file of 25025 randomly ordered words and inserts each word into a TreeSet and into a HashSet.
    - Determines the amount of time required to build both of the data structures.
    - Shuffles the input from the file and times a search of the TreeSet and HashSet for each word in the shuffled input.
    - Displays the time required for each search technique.

Timing Example (concluded)
Run:
    Number of words is 25025
    Built TreeSet in 0.078 seconds
    Built HashSet in 0.047 seconds
    TreeSet search time is 0.078 seconds
    HashSet search time is 0.016 seconds
Note that the HashSet search time is considerably better than that for a TreeSet.
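A rough sketch of a comparable experiment is given below. It is not the book's SearchComp.java: it uses java.util.TreeSet and java.util.HashSet, and because the original word file is not available it generates 25025 distinct pseudo-words instead. The class name SearchCompSketch and its helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Sketch of the experiment using java.util collections and generated
// words; the book's SearchComp.java and its input file are not shown here.
class SearchCompSketch {
    // builds n distinct pseudo-words in random order (stand-ins for the word file)
    static List<String> makeWords(int n, long seed) {
        List<String> words = new ArrayList<>();
        for (int i = 0; i < n; i++)
            words.add("w" + Integer.toString(i, 36));
        Collections.shuffle(words, new Random(seed));
        return words;
    }

    // returns elapsed seconds to look up every word; throws if one is missing
    static double timeSearch(Set<String> set, List<String> words) {
        long start = System.nanoTime();
        for (String w : words)
            if (!set.contains(w))
                throw new IllegalStateException("missing " + w);
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        List<String> words = makeWords(25025, 1L);
        System.out.println("Number of words is " + words.size());

        long t0 = System.nanoTime();
        Set<String> tree = new TreeSet<>(words);
        long t1 = System.nanoTime();
        Set<String> hash = new HashSet<>(words);
        long t2 = System.nanoTime();
        System.out.printf("Built TreeSet in %.3f seconds%n", (t1 - t0) / 1e9);
        System.out.printf("Built HashSet in %.3f seconds%n", (t2 - t1) / 1e9);

        // search in a different random order than the build order
        Collections.shuffle(words, new Random(2L));
        System.out.printf("TreeSet search time is %.3f seconds%n", timeSearch(tree, words));
        System.out.printf("HashSet search time is %.3f seconds%n", timeSearch(hash, words));
    }
}
```

Timings from a single run like this depend on the machine, the JVM, and JIT warm-up, so the numbers should be read as indicative only.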