Map Source Code Analysis: HashMap
Map Source Code Analysis: HashMap Red-Black Tree
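The two previous articles analyzed HashMap's array + linked list/red-black tree data structure, along with the methods for creating the map and for adding, removing, and looking up nodes. This article looks at the collection views and the iteration methods in HashMap.
HashMap has three collection-typed member fields, keySet, values and entrySet, which are obtained through HashMap#keySet, HashMap#values and HashMap#entrySet. (In the JDK 8 source, keySet and values are declared in the parent class AbstractMap, and entrySet in HashMap itself.)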
transient Set<K> keySet;
transient Collection<V> values;
transient Set<Map.Entry<K,V>> entrySet;
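These three fields are also used very often when working with HashMap. They represent the map's key set, its value collection, and its key-value entry set, respectively.
In fact, all three differ from ordinary collections. They correspond to three member classes of HashMap: HashMap.KeySet, HashMap.Values and HashMap.EntrySet. Unlike the collection classes we usually see, such as ArrayList and HashSet, they hold no elements of their own.
Take keySet as an example. We access it through the HashMap#keySet method.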
public Set<K> keySet() {
    Set<K> ks = keySet;
    if (ks == null) {
        ks = new KeySet();
        keySet = ks;
    }
    return ks;
}
As we can see, the method simply returns a HashMap.KeySet object, creating it on first use. Nowhere in the HashMap code is any element ever added to keySet.
final class KeySet extends AbstractSet<K> {
    public final int size()                 { return size; }
    public final void clear()               { HashMap.this.clear(); }
    public final Iterator<K> iterator()     { return new KeyIterator(); }
    public final boolean contains(Object o) { return containsKey(o); }
    public final boolean remove(Object key) {
        return removeNode(hash(key), key, null, false, true) != null;
    }
    public final Spliterator<K> spliterator() {
        return new KeySpliterator<>(HashMap.this, 0, -1, 0, 0);
    }
    public final void forEach(Consumer<? super K> action) {
        Node<K,V>[] tab;
        if (action == null)
            throw new NullPointerException();
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next)
                    action.accept(e.key);
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }
}
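Looking further into the HashMap.KeySet class, we find that every operation on keySet is ultimately implemented through HashMap's own methods, so what is actually being operated on is always HashMap's data structure.
values and entrySet work the same way, so they are not covered in detail here.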
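The view behavior described above can be observed from the outside. The following is a minimal sketch, assuming a hypothetical class name KeySetViewDemo that is not part of the original article: a key set obtained before a put still sees the new key, and removing through the key set removes the node from the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, used only for illustration.
class KeySetViewDemo {
    // Returns {view size after a later put, map size after removing through the view}.
    static int[] run() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        Set<String> keys = map.keySet(); // a view object, it stores no elements itself
        map.put("b", 2);                 // inserted after the view was obtained
        int viewSize = keys.size();      // KeySet#size simply returns HashMap's size field
        keys.remove("a");                // KeySet#remove delegates to HashMap#removeNode
        return new int[] { viewSize, map.size() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "2 1"
    }
}
```

Because the view keeps no copy of the keys, there is nothing to keep in sync: every call reads or changes the map's own table.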
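Next we look at, analyze and compare the ways of iterating over a HashMap.
(1) Iterator analysis
HashMap has three iterator classes, KeyIterator, ValueIterator and EntryIterator, which correspond to the keySet, values and entrySet fields respectively. All three extend HashIterator, and all of their operations are built on HashIterator.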
abstract class HashIterator {
    Node<K,V> next;        // next entry to return
    Node<K,V> current;     // current entry
    int expectedModCount;  // for fast-fail
    int index;             // current slot

    HashIterator() {
        expectedModCount = modCount;
        Node<K,V>[] t = table;
        current = next = null;
        index = 0;
        if (t != null && size > 0) { // advance to first entry
            do {} while (index < t.length && (next = t[index++]) == null);
        }
    }

    public final boolean hasNext() {
        return next != null;
    }

    final Node<K,V> nextNode() {
        Node<K,V>[] t;
        Node<K,V> e = next;
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        if (e == null)
            throw new NoSuchElementException();
        if ((next = (current = e).next) == null && (t = table) != null) {
            do {} while (index < t.length && (next = t[index++]) == null);
        }
        return e;
    }

    public final void remove() {
        Node<K,V> p = current;
        if (p == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        current = null;
        K key = p.key;
        removeNode(hash(key), key, null, false, false);
        expectedModCount = modCount;
    }
}

final class KeyIterator extends HashIterator
    implements Iterator<K> {
    public final K next() { return nextNode().key; }
}

final class ValueIterator extends HashIterator
    implements Iterator<V> {
    public final V next() { return nextNode().value; }
}

final class EntryIterator extends HashIterator
    implements Iterator<Map.Entry<K,V>> {
    public final Map.Entry<K,V> next() { return nextNode(); }
}
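(2) Ways to loop
From HashIterator#nextNode and HashIterator#remove above, we can see that during a traversal the map must not be structurally modified by anything other than HashIterator#remove (overwriting the value of an existing node does not count). Otherwise a ConcurrentModificationException is thrown.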
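To make the rule above concrete, here is a small sketch, assuming a hypothetical class FailFastDemo and a four-entry sample map that do not appear in the original article. It removes entries once through the iterator and once through the map itself while a traversal is in progress.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class FailFastDemo {
    static Map<Integer, Integer> sample() {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < 4; i++) {
            map.put(i, i * 10);
        }
        return map;
    }

    // Removing through the iterator is allowed: remove() resets expectedModCount.
    static int removeViaIterator() {
        Map<Integer, Integer> map = sample();
        for (Iterator<Integer> it = map.keySet().iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return map.size();
    }

    // Removing through the map changes modCount behind the iterator's back,
    // so the following call to next() throws.
    static boolean removeViaMapThrows() {
        Map<Integer, Integer> map = sample();
        try {
            for (Integer key : map.keySet()) {
                map.remove(key);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeViaIterator());  // prints 2
        System.out.println(removeViaMapThrows()); // prints true
    }
}
```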
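From the iterator section we know that there are three iterator-based ways to loop, shown below. All three are built on HashIterator, so the traversal itself costs the same. However, the first one needs an extra HashMap#get lookup to obtain each value, and the second one cannot obtain the key.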
Map<Integer, Integer> map = new HashMap<>();
for (Iterator<Integer> iterator = map.keySet().iterator(); iterator.hasNext(); ) {
    Integer key = iterator.next();
    Integer value = map.get(key);
}
for (Iterator<Integer> iterator = map.values().iterator(); iterator.hasNext(); ) {
    Integer value = iterator.next();
}
for (Iterator<Map.Entry<Integer, Integer>> iterator = map.entrySet().iterator(); iterator.hasNext(); ) {
    Map.Entry<Integer, Integer> entry = iterator.next();
    Integer key = entry.getKey();
    Integer value = entry.getValue();
}
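The enhanced for loop is only syntactic sugar. Under the hood it still goes through an iterator, so it behaves the same as the iterator approach.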
Map<Integer, Integer> map = new HashMap<>();
for (Integer key : map.keySet()) {
    Integer value = map.get(key);
}
for (Integer value : map.values()) {
}
for (Map.Entry<Integer, Integer> entry : map.entrySet()) {
    Integer key = entry.getKey();
    Integer value = entry.getValue();
}
Take HashMap#forEach as an example. It follows the functional programming style: it walks the table and calls BiConsumer#accept for every node.
@Override
public void forEach(BiConsumer<? super K, ? super V> action) {
    Node<K,V>[] tab;
    if (action == null)
        throw new NullPointerException();
    if (size > 0 && (tab = table) != null) {
        int mc = modCount;
        for (int i = 0; i < tab.length; ++i) {
            for (Node<K,V> e = tab[i]; e != null; e = e.next)
                action.accept(e.key, e.value);
        }
        if (modCount != mc)
            throw new ConcurrentModificationException();
    }
}
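It is normally used together with a lambda expression. As with an anonymous class, a lambda can only capture local variables that are effectively final, so it cannot reassign a local variable of the enclosing method.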
Map<Integer, Integer> map = new HashMap<>();
map.forEach((key, value) -> {});
map.keySet().forEach(key -> {
    Integer value = map.get(key);
});
map.values().forEach(value -> {});
map.entrySet().forEach((entry) -> {
    Integer key = entry.getKey();
    Integer value = entry.getValue();
});
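The restriction on outer variables is easiest to see when trying to accumulate a result. The sketch below assumes a hypothetical helper class ForEachSumDemo (not from the original article) and uses a one-element array as a mutable holder, which is one common workaround among several.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class ForEachSumDemo {
    static int sumValues(Map<String, Integer> map) {
        // int sum = 0;
        // map.forEach((k, v) -> sum += v); // does not compile: sum is not effectively final
        int[] sum = {0}; // the array reference never changes, only its content does
        map.forEach((key, value) -> sum[0] += value);
        return sum[0];
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(sumValues(map)); // prints 6
    }
}
```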
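HashMap#compute is similar to HashMap#put: both insert a node or overwrite an existing one. The difference is that put stores a fixed value, whereas compute derives the value from the key and the node's old value (oldValue, which is null when no node exists).
As a special case, if the computed value is null, the node is removed.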
@Override
public V compute(K key,
                 BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    if (remappingFunction == null)
        throw new NullPointerException();
    int hash = hash(key);
    Node<K,V>[] tab; Node<K,V> first; int n, i;
    int binCount = 0;
    TreeNode<K,V> t = null;
    Node<K,V> old = null;
    if (size > threshold || (tab = table) == null ||
        (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((first = tab[i = (n - 1) & hash]) != null) {
        if (first instanceof TreeNode)
            old = (t = (TreeNode<K,V>)first).getTreeNode(hash, key);
        else {
            Node<K,V> e = first; K k;
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k)))) {
                    old = e;
                    break;
                }
                ++binCount;
            } while ((e = e.next) != null);
        }
    }
    V oldValue = (old == null) ? null : old.value;
    // compute the new value from the key and oldValue
    V v = remappingFunction.apply(key, oldValue);
    if (old != null) {
        if (v != null) {
            old.value = v;
            afterNodeAccess(old);
        }
        else
            // the computed value is null: remove the node
            removeNode(hash, key, null, false, true);
    }
    else if (v != null) {
        if (t != null)
            t.putTreeVal(this, tab, hash, key, v);
        else {
            tab[i] = newNode(hash, key, v, first);
            if (binCount >= TREEIFY_THRESHOLD - 1)
                treeifyBin(tab, hash);
        }
        ++modCount;
        ++size;
        afterNodeInsertion(true);
    }
    return v;
}
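The three outcomes of compute (overwrite, insert, remove) can be sketched from the caller's side as follows. The class name ComputeDemo and the sample keys are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class ComputeDemo {
    static Map<String, Integer> run() {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("apple", 1);
        // Node exists: oldValue is 1, the node's value is overwritten with 2.
        stock.compute("apple", (k, old) -> old == null ? 1 : old + 1);
        // No node: oldValue is null, a new node with value 1 is inserted.
        stock.compute("pear", (k, old) -> old == null ? 1 : old + 1);
        // The function returns null: the existing node is removed.
        stock.compute("apple", (k, old) -> null);
        return stock;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints {pear=1}
    }
}
```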
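HashMap#computeIfPresent only acts when a node exists for the key and that node's value is not null. The rest of the logic is the same as in HashMap#compute.
As a special case, if the computed value is null, the node is removed.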
public V computeIfPresent(K key,
                          BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    if (remappingFunction == null)
        throw new NullPointerException();
    Node<K,V> e; V oldValue;
    int hash = hash(key);
    // a node is found and its value is not null
    if ((e = getNode(hash, key)) != null &&
        (oldValue = e.value) != null) {
        V v = remappingFunction.apply(key, oldValue);
        if (v != null) {
            e.value = v;
            afterNodeAccess(e);
            return v;
        }
        else
            removeNode(hash, key, null, false, true);
    }
    return null;
}
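A short sketch of the four cases handled by computeIfPresent, assuming a hypothetical class ComputeIfPresentDemo and sample keys chosen for this illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class ComputeIfPresentDemo {
    static Map<String, Integer> run() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("n", null);
        m.computeIfPresent("a", (k, v) -> v + 10); // node with non-null value: becomes 11
        m.computeIfPresent("b", (k, v) -> null);   // function returns null: node removed
        m.computeIfPresent("n", (k, v) -> 99);     // value is null: function is not called
        m.computeIfPresent("x", (k, v) -> 99);     // no node: function is not called
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = run();
        System.out.println(m.get("a"));         // prints 11
        System.out.println(m.containsKey("b")); // prints false
        System.out.println(m.containsKey("x")); // prints false
    }
}
```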
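HashMap#computeIfAbsent only acts when no node exists for the key or when that node's value is null. The rest of the logic is the same as in HashMap#compute.
In addition, since the key has no usable value, there is naturally no oldValue argument: the function that produces the value takes the key as its only parameter.
As a special case, if the computed value is null, nothing is done.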
@Override
public V computeIfAbsent(K key,
                         Function<? super K, ? extends V> mappingFunction) {
    if (mappingFunction == null)
        throw new NullPointerException();
    int hash = hash(key);
    Node<K,V>[] tab; Node<K,V> first; int n, i;
    int binCount = 0;
    TreeNode<K,V> t = null;
    Node<K,V> old = null;
    if (size > threshold || (tab = table) == null ||
        (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((first = tab[i = (n - 1) & hash]) != null) {
        if (first instanceof TreeNode)
            old = (t = (TreeNode<K,V>)first).getTreeNode(hash, key);
        else {
            Node<K,V> e = first; K k;
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k)))) {
                    old = e;
                    break;
                }
                ++binCount;
            } while ((e = e.next) != null);
        }
        V oldValue;
        // a node is found and its value is not null: do nothing and return it
        if (old != null && (oldValue = old.value) != null) {
            afterNodeAccess(old);
            return oldValue;
        }
    }
    V v = mappingFunction.apply(key);
    if (v == null) {
        return null;
    } else if (old != null) {
        old.value = v;
        afterNodeAccess(old);
        return v;
    }
    else if (t != null)
        t.putTreeVal(this, tab, hash, key, v);
    else {
        tab[i] = newNode(hash, key, v, first);
        if (binCount >= TREEIFY_THRESHOLD - 1)
            treeifyBin(tab, hash);
    }
    ++modCount;
    ++size;
    afterNodeInsertion(true);
    return v;
}
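A typical use of computeIfAbsent is building a map of lists. The sketch below is an illustration under assumptions: the class GroupByLengthDemo and its methods are hypothetical and not taken from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class GroupByLengthDemo {
    static Map<Integer, List<String>> groupByLength(List<String> words) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String w : words) {
            // The list is created only for the first word of each length;
            // later words get the existing list back and append to it.
            groups.computeIfAbsent(w.length(), k -> new ArrayList<>()).add(w);
        }
        return groups;
    }

    // A mapping function that returns null leaves the map untouched.
    static boolean nullResultInsertsNothing() {
        Map<String, Integer> m = new HashMap<>();
        Integer result = m.computeIfAbsent("x", k -> null);
        return result == null && !m.containsKey("x");
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> g = groupByLength(Arrays.asList("a", "bb", "cc"));
        System.out.println(g.get(2));                   // prints [bb, cc]
        System.out.println(nullResultInsertsNothing()); // prints true
    }
}
```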
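HashMap#merge is similar to HashMap#compute: it looks up the node by key, overwrites it if found, and inserts a new node otherwise.
The difference lies in the arguments passed to the BiFunction's apply method. HashMap#compute passes the key and oldValue (the value of the matching node), whereas HashMap#merge passes oldValue and the value argument. If oldValue is null or the node does not exist, the value argument is used directly as the node's value. Likewise, as a special case, if the computed value is null, the node is removed.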
public V merge(K key, V value,
               BiFunction<? super V, ? super V, ? extends V> remappingFunction) {
    if (value == null)
        throw new NullPointerException();
    if (remappingFunction == null)
        throw new NullPointerException();
    int hash = hash(key);
    Node<K,V>[] tab; Node<K,V> first; int n, i;
    int binCount = 0;
    TreeNode<K,V> t = null;
    Node<K,V> old = null;
    if (size > threshold || (tab = table) == null ||
        (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((first = tab[i = (n - 1) & hash]) != null) {
        if (first instanceof TreeNode)
            old = (t = (TreeNode<K,V>)first).getTreeNode(hash, key);
        else {
            Node<K,V> e = first; K k;
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k)))) {
                    old = e;
                    break;
                }
                ++binCount;
            } while ((e = e.next) != null);
        }
    }
    if (old != null) {
        V v;
        if (old.value != null)
            // compute the new value from oldValue and the value argument
            v = remappingFunction.apply(old.value, value);
        else
            // oldValue is null: use the value argument directly
            v = value;
        if (v != null) {
            old.value = v;
            afterNodeAccess(old);
        }
        else
            // the computed value is null: remove the node
            removeNode(hash, key, null, false, true);
        return v;
    }
    if (value != null) {
        if (t != null)
            t.putTreeVal(this, tab, hash, key, value);
        else {
            tab[i] = newNode(hash, key, value, first);
            if (binCount >= TREEIFY_THRESHOLD - 1)
                treeifyBin(tab, hash);
        }
        ++modCount;
        ++size;
        afterNodeInsertion(true);
    }
    return value;
}
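Counting occurrences is a common use of merge, because the first occurrence takes the value argument directly and later ones are combined with oldValue. The following is a sketch only; the class MergeCountDemo and its methods are hypothetical names introduced for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class MergeCountDemo {
    static Map<String, Integer> count(String... words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            // No node yet: 1 is stored. Node exists: Integer.sum(oldValue, 1) is stored.
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // A remapping function that returns null removes the existing node.
    static boolean nullResultRemoves() {
        Map<String, Integer> counts = count("a");
        counts.merge("a", 1, (oldValue, value) -> null);
        return !counts.containsKey("a");
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("a", "b", "a");
        System.out.println(counts.get("a"));     // prints 2
        System.out.println(counts.get("b"));     // prints 1
        System.out.println(nullResultRemoves()); // prints true
    }
}
```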
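HashMap#replace(K, V) overwrites the value of the node matching the key argument with the value argument. If no such node exists, nothing is inserted and null is returned.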
@Override
public V replace(K key, V value) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) != null) {
        V oldValue = e.value;
        e.value = value;
        afterNodeAccess(e);
        return oldValue;
    }
    return null;
}
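HashMap#replace(K, V, V) overwrites the node's value with newValue, but only when a node exists for the key argument and its current value equals the oldValue argument.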
@Override
public boolean replace(K key, V oldValue, V newValue) {
    Node<K,V> e; V v;
    if ((e = getNode(hash(key), key)) != null &&
        ((v = e.value) == oldValue || (v != null && v.equals(oldValue)))) {
        e.value = newValue;
        afterNodeAccess(e);
        return true;
    }
    return false;
}
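The two replace overloads can be compared side by side. This is a minimal sketch, assuming a hypothetical class ReplaceDemo; the returned list only collects the observed results for inspection.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class ReplaceDemo {
    static List<Object> run() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        Integer prev = m.replace("a", 2);    // node exists: returns the old value 1
        Integer none = m.replace("x", 9);    // no node: returns null, inserts nothing
        boolean miss = m.replace("a", 1, 3); // current value is 2, not 1: no change
        boolean hit = m.replace("a", 2, 3);  // current value matches: becomes 3
        return Arrays.<Object>asList(prev, none, miss, hit, m.get("a"), m.containsKey("x"));
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [1, null, false, true, 3, false]
    }
}
```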
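HashMap#replaceAll calls the BiFunction's apply method with each node's key and value, and overwrites the node's value with the returned result. Note that modCount is checked after the traversal, and a ConcurrentModificationException is thrown if it has changed.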
@Override
public void replaceAll(BiFunction<? super K, ? super V, ? extends V> function) {
    Node<K,V>[] tab;
    if (function == null)
        throw new NullPointerException();
    if (size > 0 && (tab = table) != null) {
        int mc = modCount;
        for (int i = 0; i < tab.length; ++i) {
            for (Node<K,V> e = tab[i]; e != null; e = e.next) {
                e.value = function.apply(e.key, e.value);
            }
        }
        if (modCount != mc)
            throw new ConcurrentModificationException();
    }
}
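As a rough illustration of replaceAll and its modCount check, the sketch below doubles every value and then shows a function that inserts a new key during the traversal. The class ReplaceAllDemo and the key "extra" are assumptions made for this example.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, used only for illustration.
class ReplaceAllDemo {
    static Map<String, Integer> doubled() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.replaceAll((k, v) -> v * 2); // only values change, modCount stays the same
        return m;
    }

    // Inserting a new key inside the function changes modCount,
    // which replaceAll detects once the traversal has finished.
    static boolean structuralChangeThrows() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        try {
            m.replaceAll((k, v) -> {
                m.put("extra", 0); // first call inserts a node, later calls only overwrite
                return v;
            });
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> m = doubled();
        System.out.println(m.get("a") + " " + m.get("b")); // prints "2 4"
        System.out.println(structuralChangeThrows());      // prints true
    }
}
```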