CS 112 Lab 8 - Hash Tables

Open Addressed hash tables

We implement and compare performance of three collision handling methods for open addressed hash tables.

An algorithm to insert data (in this example data are strings) into a hash table is the following

  1. Construct a sufficiently large bucket array to store all data references, the size of the bucket array 'm' is typically chosen to be a prime number.

  2. Given data item 'd': compute its hash-key 'HK(d)', this is an integer.

  3. Initialize probe = 1.

  4. Compute the hash-function corresponding to HK(d): 'HF( HK(d), probe )'
    HF( HK(d), probe ) is an integer in the range [0, num_buckets - 1] .
    (three methods to compute the hash probe sequence HF are given below)

  5. Check if the bucket corresponding to HF( HK(d), probe ) is occupied

    If the bucket is occupied (i.e., collsion): increment probe and repeat step 4
    Else: insert data d into this bucket and goto the next data item.

Three standard probing schemes to compute the hash probe sequence HF are,

  1. Linear probing:

    HF_linear( HK(d), probe ) = ( HK(d) + probe ) mod m

  2. Quadratic probing:
    fix c1, c2 as two sufficiently large prime numbers (you can use this applet to generate prime numbers)

    HF_quadratic( HK(d), probe ) = ( HK(d) + c1*probe + c2*probe^2 ) mod m

  3. Double hashing:

    h1 = HK(d) mod m
    h2 = 1 + ( HK(d) mod (m-2) )

    HF_double( HK(d), probe ) = ( h1 + probe * h2 ) mod m

TASKS:

  1. Complete hashFunction() methods for QuadraticProbeHashTable and DoubleHashingTable classes

  2. Finish insertData() method in HashTable class

NOTE: your performance numbers should be close to those displayed below.

TRY: can you find a better hash function with fewer collisions?

Collision-handling performance comparison


Java code (1 of 2): HashTable.java

package lab08;
 
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics;
 
class LinearProbeHashTable extends HashTable {
    public LinearProbeHashTable(){ super();    }
    int hashFunction(int hashKey, int probe){
        int hashValue = (hashKey + probe) % numBuckets;
        return hashValue;
    }
    String getDescription(){ return "Linear Probing"; }
}
 
class QuadraticProbeHashTable extends HashTable{
    int c1, c2;
    QuadraticProbeHashTable() {
        super();
        c1 = 191747; c2 = 89989; // Choose two large primes
    }
    int hashFunction(int hashKey, int probe){
        // TASK: Set Quadratic probe hashValue below.
        int hashValue = 0;
        return hashValue;
    }
    String getDescription(){ return "Quadratic Probing"; }
}
 
class DoubleHashingTable extends HashTable{
    DoubleHashingTable() {
        super();
    }
    int hashFunction(int hashKey, int probe){
        // TASK: Set Double hashing hashValue below.
        int hashValue = 0;
        return hashValue;
    }
    String getDescription(){ return "Double Hashing"; }
}
 
public abstract class HashTable {
    String [] hashBuckets;
    int numBuckets;
    
    int num_collisions, num_operations; // Keep track of collisions and operations
    public HashTable(){
        numBuckets = 6203; // A prime number approximately = 2 * number of data.
        hashBuckets = new String[numBuckets];
        num_collisions = 0;
        num_operations = 0;
    }
    
    static int getHashKey(String data){
        // Gets an integer key by summing the char values in input data.
        int hashKey = 0;
        for(int i = 0; i < data.length(); i++)
            hashKey += data.charAt(i);
        return hashKey;
    }
    
    abstract int hashFunction(int hashKey, int probe);
    
    void insertData(int dataHashKey, String newData) throws Exception {
        num_operations++;  // Increment the operations counter.
        
        int probe = 1;
        
        // TASK:
        //   Use hashFunction() to obtain hash bucket index. 
        //   Use a while loop to probe successive buckets until an empty bucket  
        //   is found (increment probe and num_collisions within the while loop).
        //   Ensure to not have an infinite loop if the probe sequence loops or 
        //   the hashBucket array is full. (Hint: keep track of the loop count and use that.)
        //
        //   When an empty bucket is found, insert newData into that bucket.
        
        //int hashBucket = /*????*/;
    }
    
    
    void paint(Graphics g, int left, int top, int right, int bottom){
        top = top + 30;
        g.setColor(Color.BLACK); g.setFont(new Font("Verdana",Font.BOLD,15));
        g.drawString( getDescription()+" ouccupied buckets,", left, top - 5 );
        g.setColor(Color.RED);
        g.drawString(""+num_collisions+" collisions for "+num_operations+" insertions", 
                left + 350, top - 5); 
        
        int nrows = 40;
        int ncols = (int)Math.ceil(1.0*numBuckets/nrows);
        int cellw = Math.min((right-left)/ncols, (bottom-top)/nrows);
        int counter = 0;
        int y = top;
        for(int i = 0; i < nrows && counter < numBuckets; i++, y += cellw){
            int x = left;
            for(int j = 0; j < ncols && counter < numBuckets; j++, x += cellw){
                if( hashBuckets[counter++] != null )
                    g.setColor(Color.ORANGE);
                else
                    g.setColor(Color.WHITE);
                g.fillRect(x, y, cellw, cellw);
            }
        }
    }
    abstract String getDescription();
}

Java code (2 of 2): HashTableClient.java

Download here.

Solutions

 

URL

http://cs-people.bu.edu/tvashwin/cs112_spring09/lab08.html