The HashMap Constructors
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // tableSizeFor rounds the requested capacity up to a power of two;
    // threshold holds that capacity until the table is first allocated.
    this.threshold = tableSizeFor(initialCapacity);
}
/**
 * Constructs an empty HashMap with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param initialCapacity the initial capacity.
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
/**
 * Constructs an empty HashMap with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
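A quick usage sketch (the variable names are mine) of what each constructor implies:

import java.util.HashMap;
import java.util.Map;

public class ConstructorDemo {
    public static void main(String[] args) {
        // Default: capacity 16, load factor 0.75; the table itself is
        // allocated lazily on the first put.
        Map<String, Integer> a = new HashMap<>();

        // initialCapacity = 1000 is rounded up to 1024 (see tableSizeFor below).
        Map<String, Integer> b = new HashMap<>(1000);

        // Custom load factor: with capacity 1024 and loadFactor 0.5,
        // resizing triggers once 512 entries have been inserted.
        Map<String, Integer> c = new HashMap<>(1000, 0.5f);

        // Invalid arguments throw IllegalArgumentException:
        // new HashMap<>(-1);       negative capacity
        // new HashMap<>(16, 0f);   non-positive load factor
    }
}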
There are two parameters here: initialCapacity, the initial capacity, and loadFactor, the load factor. The initial capacity you pass in is not necessarily the HashMap's real initial capacity: the table capacity must be a power of two, so if you ask for a capacity of 1000 with new HashMap(1000), the capacity you actually get is 1024. Why? See the method below.
/**
 * Returns a power of two size for the given target capacity.
 */
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
Passing 1000 to this method returns 1024, passing 17 returns 32, and passing 64 returns 64, so the method returns the smallest power of two that is greater than or equal to its argument.
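To see why the bit tricks work, here is a standalone copy of the same logic (with MAXIMUM_CAPACITY hard-coded to 1 << 30, its value in HashMap), tracing the intermediate values for cap = 1000:

public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as HashMap.tableSizeFor: smear the highest set bit of
    // (cap - 1) into every lower position, producing all ones, then add 1.
    static int tableSizeFor(int cap) {
        int n = cap - 1;  // 1000 -> 999 = 0b11_1110_0111
        n |= n >>> 1;     //               0b11_1111_0111
        n |= n >>> 2;     //               0b11_1111_1111 (already all ones)
        n |= n >>> 4;     // no further change for this input
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(1000)); // 1024
        System.out.println(tableSizeFor(17));   // 32
        System.out.println(tableSizeFor(64));   // 64: starting from cap - 1
                                                // keeps exact powers of two unchanged
    }
}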
HashMap's Storage Structure
Why the Default Load Factor Is 0.75
Speaking of the load factor, we have to mention another HashMap field: threshold. When put is called, HashMap first checks whether the element count size has reached threshold (in JDK 1.7's addEntry, the target bucket must also be non-empty); if so, the table is resized. So how is threshold computed?
void addEntry(int hash, K key, V value, int bucketIndex) {
    // Double the table once size reaches threshold and the target bucket
    // is already occupied.
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
So we can see that threshold = capacity * loadFactor. The larger the loadFactor, the larger the threshold, and the later resizing happens, which means higher utilization of HashMap's backing array; conversely, a smaller loadFactor means lower utilization. But higher space utilization also means a higher probability of hash collisions, so loadFactor should be neither too large nor too small; 0.75 is a reasonable compromise.
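A minimal arithmetic sketch (plain math, not HashMap internals) of how loadFactor moves the resize point for the default capacity of 16:

public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16; // default capacity
        for (float loadFactor : new float[] {0.5f, 0.75f, 1.0f}) {
            int threshold = (int) (capacity * loadFactor);
            System.out.printf("capacity=%d loadFactor=%.2f -> resize after %d entries%n",
                              capacity, loadFactor, threshold);
        }
        // capacity=16 loadFactor=0.50 -> resize after 8 entries
        // capacity=16 loadFactor=0.75 -> resize after 12 entries
        // capacity=16 loadFactor=1.00 -> resize after 16 entries
    }
}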
HashMap's Resize Mechanism
In JDK 1.7, HashMap inserts into each bucket's linked list at the head (head insertion).
Below is the JDK 1.7 transfer code that runs during a resize:
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry e : table) {
        while (null != e) {
            Entry next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            // code point (1)
            int i = indexFor(e.hash, newCapacity);
            // the next three lines perform the head insertion
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
Why is e.next = newTable[i] needed? Without that line, assigning newTable[i] = e directly would overwrite whatever element was already at newTable[i], cutting off the rest of that bucket's chain.
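To make head insertion concrete, here is a minimal, hypothetical Node/bucket sketch (a stand-in for HashMap.Entry, not the real class) showing why the old head must be linked behind the new node first:

public class HeadInsertDemo {
    // Simplified stand-in for HashMap.Entry: just a key and a next pointer.
    static class Node {
        final int key;
        Node next;
        Node(int key) { this.key = key; }
    }

    public static void main(String[] args) {
        Node[] newTable = new Node[4];
        int i = 3;

        for (int key : new int[] {3, 7, 11}) {
            Node e = new Node(key);
            e.next = newTable[i];  // link the existing chain behind the new node first
            newTable[i] = e;       // only then make the new node the bucket head
        }

        // Head insertion reverses arrival order: prints 11 7 3
        for (Node e = newTable[i]; e != null; e = e.next) {
            System.out.print(e.key + " ");
        }
    }
}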
The circular (dead) linked list problem caused by concurrent resizing.
Suppose threads A and B resize at the same time and both reach code point (1) above. Thread B completes its resize first, leaving the HashMap in the state shown in the figure below.
Thread A then executes e.next = newTable[i];
this line points Entry(3,A)'s next at Entry(7,C). It continues with newTable[i] = e;
which puts Entry(3,A) at the head of the list at table[3]. At this point a closed cycle has formed: when thread B finished its resize, Entry(7)'s next pointed at Entry(3), and now thread A, during its own resize, has pointed Entry(3)'s next back at Entry(7),
as shown in the figure below.
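For intuition, the interleaving can be replayed single-threaded on the same hypothetical Node type from the sketch above: we let "thread B" reverse the shared chain, then replay "thread A"'s loop and detect the resulting cycle.

public class ResizeCycleDemo {
    // Hypothetical stand-in for JDK 1.7's HashMap.Entry.
    static class Node {
        final int key;
        Node next;
        Node(int key) { this.key = key; }
    }

    public static void main(String[] args) {
        // Shared chain in the old table: 3 -> 7 (both keys land in the
        // same bucket of the new table in this scenario).
        Node n3 = new Node(3), n7 = new Node(7);
        n3.next = n7;

        // Thread A has executed "Entry next = e.next;" and is paused
        // at code point (1).
        Node e = n3;
        Node next = n7;

        // Thread B runs its entire transfer; head insertion reverses the
        // shared chain to 7 -> 3.
        n7.next = n3;
        n3.next = null;

        // Thread A resumes against its own new table, using stale locals.
        Node[] newTable = new Node[4];
        int i = 3;
        e.next = newTable[i];   // 3.next = null
        newTable[i] = e;        // bucket head = 3
        e = next;               // e = 7
        while (e != null) {     // remainder of the normal transfer loop
            next = e.next;      // first pass reads 7.next == 3 (set by B!)
            e.next = newTable[i];
            newTable[i] = e;
            e = next;           // second pass relinks 3.next = 7: cycle closed
        }

        // Floyd's tortoise-and-hare confirms the bucket is now circular;
        // a get() that misses on this bucket would traverse it forever.
        Node slow = newTable[i], fast = newTable[i];
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {
                System.out.println("Cycle detected in bucket " + i);
                return;
            }
        }
    }
}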