2018-04-22 HashMap底层原理分析

阿里电话面试问到了关于HashMap底层的实现，虽然之前看过底层源码，但是还是回答的不够清晰。所以赶紧去查一下，这里做个笔记记录一下，捋顺思路。

java数据结构中有数组和链表这两种结构
数组:存储在内存中是连续的，因为是连续内存空间，所以很容易通过下标索引查询到，但是插入和删除困难。
链表:链表的元素在内存中的存储位置是不确定的，分散存储，没有索引下标，所以查询寻址难，而插入和删除容易（只需要移动指针）
综合这两者的优点，摒弃缺点，做成了哈希表。
图1

1226017-20170828223332265-962231004.png

图2

1226017-20170828223404171-54056261.png

哈希表又称为数组链表，底层还是数组但数组中每一项是一个链表。

下面我们对HashMap的源码进行分析

一、首先看一下put方法（通过put方法基本能看出hashmap结构）

    public V put(K key, V value) {
        //如果table数组为空数组{}，进行数组填充（为table分配实际内存空间），入参为threshold，此时threshold为initialCapacity 默认是1<<4(16)
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);//创建null的key
        int hash = hash(key);//计算key的hash值
        int i = indexFor(hash, table.length);//通过key的hash值计算在数组table中的位置
        //下边循环判断该数组 i 位置的链表中是否已经存在该key，如果存在则直接将旧值覆盖
        for (Entry e = table[i]; e != null; e = e.next) {
            Object k;
            //判断该条链上是否有hash值相同且key相同的    
            //若存在相同，则直接覆盖value，返回旧value  
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        //修改次数增加1
        modCount++;
        //在数组的 i 处增加节点entry（这里是在链表头部插入，作为链表头）
        addEntry(hash, key, value, i);
        return null;
    }

通过put代码，我们可以清晰看出保存过程
1，通过判断table为空的，则初始化数组

private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);
        
        //这行是计算阀值，当达到该阀值，则默认扩容capacity * loadFactor为16 * 0.75=12
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }

通过roundUpToPowerOf2(toSize)可以计算出大于或等于toSize的最接近toSize的2的n次幂，比如toSize=13,则capacity=16;to_size=16,capacity=16;to_size=17,capacity=32.

private static int roundUpToPowerOf2(int number) {
        // assert number >= 0 : "number must be non-negative";
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

该方法Integer.highestOneBit是用来获取最左边的bit（其他bit位为0）所代表的数值.

2，第一步判断key为null，则putForNullKey(value);

private V putForNullKey(V value) {
        for (Entry e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0);
        return null;
    }

从代码可以看出，如果key为null的值，默认就存储到table[0]数组第一位。然后遍历table[0]的链表的每个节点Entry，如果发现其中存在节点Entry的key为null，就替换新的value，然后返回旧的value，如果没发现key等于null的节点Entry，就增加新的节点
3，计算key的hash，得到k的hash值

final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

该函数用了很多异或，移动运算，对key的hashcode进行进一步运算，来保证最终获取的存储位置尽量分布均匀

4，通过hash值计算index（数组table所处的位置）

static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

h & (length-1);就相当于h %length。也就是用hash值对table数组的长度取模，这里采用了位运算，提高性能
最终就可以拿到存储的下标

5，然后通过下标i，table[i]拿到table 的i位置的链表头节点。循环链表，如果发现hash，key都相同的节点时，就替换为新的value，然后返回旧的value，只有hash相同时，循环内并没有做任何处理

for (Entry e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

6，通过addEntry方法来新插入一个节点。

    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);//当size超过临界阈值threshold，并且即将发生哈希冲突时进行扩容2倍
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }

通过以上代码能够得知，当size大于阈值的时候，需要进行数组扩容，扩容时，需要新建一个长度为之前数组2倍的新的数组，然后将当前的Entry数组中的元素全部传输过去，扩容后的新数组长度为之前的2倍，所以扩容相对来说是个耗资源的操作。

void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

//initHashSeedAsNeeded方法是判断是否需要重新计算hash值
final boolean initHashSeedAsNeeded(int capacity) {
        boolean currentAltHashing = hashSeed != 0;
        boolean useAltHashing = sun.misc.VM.isBooted() &&
                (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        boolean switching = currentAltHashing ^ useAltHashing;
        if (switching) {
            hashSeed = useAltHashing
                ? sun.misc.Hashing.randomHashSeed(this)
                : 0;
        }
        return switching;
    }

2018-04-22 HashMap底层原理分析

一、首先看一下put方法（通过put方法基本能看出hashmap结构）

你可能感兴趣的:(2018-04-22 HashMap底层原理分析)