Hash Tables in C Steven Hansen Overview What
Hash Tables in C Steven Hansen
Overview • • What is a hash table? What is a hash? What are they useful for? Are they easy to use?
What is a hash table • A hash table is an array that uses calculated “hash” values to perform lookups. • Every hash table is made up of “buckets” that hold the values you wish to store.
What is a hash? • A hash is a value that is calculated using the data or the “key” • The key is whatever data you are using to lookup other data in the table. • E. g. Using a name as a key for a phone number.
How do I calculate a hash? • You use an algorithm that uses the data in the key to create a hash value suitable for your needs. • There are many algorithms you can use, some better than others. • You can use a modulus to keep the hash within range.
Example Algorithms • for (h=0, i=0; i<len; ++i) h += key[i]; • for (h=0, i=0; i<len; ++i) h ^= key[i]; • for (h=0, i=0; i<len; ++i) h *= key[i]; • None of these are very good algorithms. • From http: //burtle. net/bob/hashfaq. html
How easy is a hash table to setup and use? • Quite easy, just create an array and hash data into it! • You can treat it just like any other array when you are trying to find data. • To find data you re-calculate the hash and look in that bucket.
What are the problems? • Collisions – When two pieces of data want to occupy the same space. – This is solved trivially, you can create an array of linked lists and add data to the list instead of the bucket. – Or you could just move the data down into an open slot and search for it the same way. • If a hashing algorithm is slow or creates too many collisions the efficiency is degraded.
What makes C different? • Memory management. – To create a dynamically allocated object you must use malloc which returns a void pointer to the object. You then have to cast it as the correct pointer type – e. g. class *ex = (* class)malloc(sizeof(class)) – This creates the object ex of type class. – To clear the memory you would use free(ex)
Example of a basic hash • Data: [1, 2, 3, … , 40] • Algorithm: hash = i % 10; • Hash Table = int table[10];
Bucket Data 0 10 20 30 40 1 1 11 21 31 2 2 12 22 32 3 3 13 23 33 4 4 14 24 34 5 5 15 25 35 6 6 16 26 36 7 7 17 27 37 8 8 18 28 38 9 9 19 29 39
Perfect Hash • A perfect hash is an algorithm that has no collisions. • These are usually found when a hashing algorithm is created for known data.
Cryptographic Hashing • MD 4, MD 5, SHA, Snefru, RIPE-MD • Secure hashes with many uses – Password verification – Used to uniquely identify files
Resources • http: //burtle. net/bob/hashfaq. ht ml – Excellent resource for hashing algorithms • Section 2. 9 of the textbook – Explains the basics of hashing and some collision resolution • http: //en. wikipedia. org/wiki/Hash_table – Good start for understanding hash tables
- Slides: 14