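HashSet is a collection of elements that implements the Set interface. It is a fast, duplicate-free collection; its inheritance hierarchy is shown in Figure 1.
Figure 1. HashSet inheritance hierarchy
HashSet implements the Set interface (by way of AbstractSet). Other members of the Set family include TreeSet, which also implements it, and SortedSet, which is a sub-interface of it. They all share one basic, standard property: they contain no duplicate elements. As the name suggests, HashSet relies on hashing both to deduplicate elements and to make operations on them fast.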
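As a small illustration of the no-duplicates property, the sketch below copies a list with repeated values into a HashSet. The class name `DedupDemo` and the helper `distinct` are hypothetical names chosen for this example only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupDemo {
    // Hypothetical helper: copies the input into a HashSet, which drops duplicates.
    static Set<String> distinct(List<String> items) {
        return new HashSet<>(items);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b");
        Set<String> unique = distinct(words);

        System.out.println(unique.size());        // 3
        System.out.println(unique.contains("a")); // true
        System.out.println(unique.add("a"));      // false: "a" is already present
    }
}
```

Note that the iteration order of the resulting set is unspecified, so the example only inspects its size and membership.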
How does HashSet achieve this? Its fields and constructors already give a hint.
private transient HashMap<E,Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
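Look at these two fields first. HashSet holds a HashMap, which already suggests that the two are related. The second field supports that guess: it looks like a value that is meant to be stored in the HashMap.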
/**
* Constructs a new, empty set; the backing HashMap instance has
* default initial capacity (16) and load factor (0.75).
*/
public HashSet() {
map = new HashMap<>();
}
/**
* Constructs a new set containing the elements in the specified
* collection. The HashMap is created with default load factor
* (0.75) and an initial capacity sufficient to contain the elements in
* the specified collection.
*
* @param c the collection whose elements are to be placed into this set
* @throws NullPointerException if the specified collection is null
*/
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
addAll(c);
}
/**
* Constructs a new, empty set; the backing HashMap instance has
* the specified initial capacity and the specified load factor.
*
* @param initialCapacity the initial capacity of the hash map
* @param loadFactor the load factor of the hash map
* @throws IllegalArgumentException if the initial capacity is less
* than zero, or if the load factor is nonpositive
*/
public HashSet(int initialCapacity, float loadFactor) {
map = new HashMap<>(initialCapacity, loadFactor);
}
/**
* Constructs a new, empty set; the backing HashMap instance has
* the specified initial capacity and default load factor (0.75).
*
* @param initialCapacity the initial capacity of the hash table
* @throws IllegalArgumentException if the initial capacity is less
* than zero
*/
public HashSet(int initialCapacity) {
map = new HashMap<>(initialCapacity);
}
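These constructors should look familiar. They closely mirror HashMap's constructors, or, more precisely, each of them simply constructs a HashMap. That is because HashSet is, in essence, implemented on top of HashMap; a look at HashSet's key methods makes this even clearer.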
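To make the collection-based constructor more concrete, the sketch below evaluates the same expression it passes to HashMap, `Math.max((int) (c.size()/.75f) + 1, 16)`, for a few sizes. The class name `CapacityDemo` and the helper `initialCapacityFor` are hypothetical names used only for this illustration.

```java
public class CapacityDemo {
    // Same expression as in HashSet(Collection): request enough capacity so that
    // size / capacity stays below the 0.75 load factor, but never less than 16.
    static int initialCapacityFor(int size) {
        return Math.max((int) (size / .75f) + 1, 16);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacityFor(0));   // 16
        System.out.println(initialCapacityFor(12));  // 17
        System.out.println(initialCapacityFor(100)); // 134
    }
}
```

These values are only the capacities requested from HashMap; HashMap rounds the request up to a power of two when it allocates its table.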
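When a HashSet is constructed, it first creates a HashMap and uses that HashMap as the container for its data. As introduced earlier, HashMap is a key-value container whose elements are stored as Node<K,V> nodes.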
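The key methods of HashSet are listed below. As you can see, every operation works on the key. The PRESENT field defined earlier exists only to fill the value part of each Node in the HashMap.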
/**
* Returns an iterator over the elements in this set. The elements
* are returned in no particular order.
*
* @return an Iterator over the elements in this set
* @see ConcurrentModificationException
*/
public Iterator<E> iterator() {
return map.keySet().iterator();
}
/**
* Returns the number of elements in this set (its cardinality).
*
* @return the number of elements in this set (its cardinality)
*/
public int size() {
return map.size();
}
/**
* Returns true if this set contains no elements.
*
* @return true if this set contains no elements
*/
public boolean isEmpty() {
return map.isEmpty();
}
/**
* Returns true if this set contains the specified element.
* More formally, returns true if and only if this set
* contains an element e such that
* (o==null ? e==null : o.equals(e)).
*
* @param o element whose presence in this set is to be tested
* @return true if this set contains the specified element
*/
public boolean contains(Object o) {
return map.containsKey(o);
}
/**
* Adds the specified element to this set if it is not already present.
* More formally, adds the specified element e to this set if
* this set contains no element e2 such that
* (e==null ? e2==null : e.equals(e2)).
* If this set already contains the element, the call leaves the set
* unchanged and returns false.
*
* @param e element to be added to this set
* @return true if this set did not already contain the specified
* element
*/
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
/**
* Removes the specified element from this set if it is present.
* More formally, removes an element e such that
* (o==null ? e==null : o.equals(e)),
* if this set contains such an element. Returns true if
* this set contained the element (or equivalently, if this set
* changed as a result of the call). (This set will not contain the
* element once the call returns.)
*
* @param o object to be removed from this set, if present
* @return true if the set contained the specified element
*/
public boolean remove(Object o) {
return map.remove(o)==PRESENT;
}
/**
* Removes all of the elements from this set.
* The set will be empty after this call returns.
*/
public void clear() {
map.clear();
}
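HashSet does not need to reimplement these methods; it simply borrows HashMap's. The outer HashSet is essentially a proxy: the actual work is delegated to the HashMap instance it creates internally. This makes HashSet easier and more efficient to implement, and it avoids the redundancy of writing nearly the same logic a second time.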
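The delegation pattern can be reproduced in a few lines. The following is a minimal sketch, not the JDK implementation: `MiniHashSet` is a hypothetical class that wraps a HashMap and stores a shared `PRESENT` object as every value, in the same way as the methods listed above.

```java
import java.util.HashMap;
import java.util.Iterator;

// Hypothetical, simplified set that delegates every operation to a HashMap.
public class MiniHashSet<E> implements Iterable<E> {
    // Shared dummy value stored for every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the key was not present before.
    public boolean add(E e) { return map.put(e, PRESENT) == null; }

    // remove returns the removed value, which is PRESENT only if the key existed.
    public boolean remove(Object o) { return map.remove(o) == PRESENT; }

    public boolean contains(Object o) { return map.containsKey(o); }

    public int size() { return map.size(); }

    @Override
    public Iterator<E> iterator() { return map.keySet().iterator(); }

    public static void main(String[] args) {
        MiniHashSet<String> set = new MiniHashSet<>();
        System.out.println(set.add("x"));      // true
        System.out.println(set.add("x"));      // false: already present
        System.out.println(set.contains("x")); // true
        System.out.println(set.remove("x"));   // true
        System.out.println(set.size());        // 0
    }
}
```

Because every key maps to the same `PRESENT` instance, the wrapper allocates no extra object per element beyond what the HashMap itself needs.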
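HashSet is a data structure built for fast operations: in the ideal case, add, remove, and contains run in constant time, which makes it a common choice for storing distinct elements. HashSet itself is not thread-safe, and using it from multiple threads can lead to concurrent-access problems. To use a Set in a multithreaded environment, one option is to build a similar ConcurrentHashSet on top of ConcurrentHashMap.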
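One way to obtain such a concurrent set without writing a new class is `ConcurrentHashMap.newKeySet()`, available since Java 8, which returns a Set view backed by a ConcurrentHashMap. The sketch below is only an illustration; `ConcurrentSetDemo` and `fillConcurrently` are hypothetical names, and the thread and element counts are arbitrary.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentSetDemo {
    // Hypothetical helper: several threads add the same values 0..perThread-1
    // to one shared set backed by a ConcurrentHashMap.
    static Set<Integer> fillConcurrently(int threads, int perThread) {
        Set<Integer> set = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    set.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        // Every thread inserts the same 1000 values, so the set holds 1000 elements.
        System.out.println(fillConcurrently(4, 1000).size()); // 1000
    }
}
```

`Collections.newSetFromMap(new ConcurrentHashMap<>())` is an older alternative that follows the same idea of wrapping a concurrent map in a Set.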