Wednesday, 23 October 2013

How HashMap works in Java.


HashMap works on the principle of hashing. A HashMap stores keys and associated values.
Using hashcode keys are assigned bucket(array index) and both key and value gets stored in it.
HashMap allows null key as well as null value(s).

HashMap implementation ensures that its array length is always equal to at least max( size, capacity ) / load_factor. Default load factor for HashMap is 0.75 and default capacity is 16. So, the default (new HashMap<>()) size of array of entries is 16 / 0.75 = 21.33 ~ 22.
In case this size fills HashMap doubles it's size and rehashes all the entries.
Equals method implies Hashcode i.e.  equals() => hashcode().(equals and hashcode contract)

public V put(K key, V value)
Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.

How get() method works in HashMap?
Using key as input HashMap calculates its hashcode() and then locates the key and its associated value in bucket.

Connecting question is: What if two keys have same hashcode()?
Two keys having same hashcode() will land in same bucket.
So if you know Hashing it's a matter of collision resolution, in case of a collision in HashMap, key-value pairs are stored in a linked array, so once there is a collision one needs to locate the bucket using hashcode() and then using equals() method locate key and return its associated value.

hashcode() => locate bucket => equals() => identify key based on equals() and return its associated value.

Internally HashMap implements a static class Entry:
transient Entry <K,V> [] table;

Data structure for storing Key-value pairs:

static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        int hash;

        /**
         * Creates new entry.
         */
        Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }
//other code emitted


So whenever there is a collision it stores key-value pair(s) in a new Entry and gives reference of already stored key-value to the 'Entry<K,V> next' variable.
bucketindex calculated based on the hash of key and then key-values are stored based on this calculation.
code snippet:

Entry<K,V> e = table[bucketIndex];  //if there is already an entry at bucketIndex then reference it to next
table[bucketIndex] = new Entry<>(hash, key, value, e); 

//new Entry code
 Entry(int h, K k, V v, Entry<K,V> n) {
            value = v;
            next = n;
            key = k;
            hash = h;
        }