Java容器类框架分析(5)HashSet源码分析

概述

在分析HashSet源码前，先看看HashSet的继承关系

HashSet继承关系

从上图可以看出，HashSet继承自AbstractSet，实现了Set接口，接着看一下源码中的注释

This class implements the Set interface, backed by a hash table
(actually a HashMapinstance). It makes no guarantees as to the
iteration order of the set; in particular, it does not guarantee that the
order will remain constant over time. This class permits the null element.
HashSet实现了Set接口，内部有一个哈希表支撑(实际上就是一个HashMap实例)，它不保证迭代的顺序；尤其是，随着时间的变化，它不能保证set的迭代顺序保持不变。允许插入空值。

到此发现，HashSet实际上可以拆分成Hash跟Set,Hash指的是HashMap，Set则是指实现了Set接口，这样看来，HashSet的实现其实就比较简单了，下面开始分析源码。

正文

成员变量

//序列化ID
 static final long serialVersionUID = -5024744406713321676L;
//内置的HashMap
 private transient HashMap map;

 // 就是一个傀儡，填充HashMap的Value而已，没有实际意义
 private static final Object PRESENT = new Object();

构造方法

空的构造方法

初始化一个空的HashMap

    public HashSet() {
        map = new HashMap<>();
    }

带有容量的构造方法

HashMap给定一个容量

 public HashSet(int initialCapacity) {
        map = new HashMap<>(initialCapacity);
    }

带有容量跟负载因子的构造方法

  public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }

带有容量跟负载因子，以及Value类型区分

dummy作为Value是基本类型跟引用类型，注意此处初始化的是一个LinkedHashMap

  HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }

通过一个集合初始化

    public HashSet(Collection c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }

调用addAll方法

    public boolean addAll(Collection c) {
        boolean modified = false;
        //循环遍历
        for (E e : c)
        //如果set中没有此元素，添加成功
            if (add(e))
                modified = true;
        return modified;
    }

增加元素

添加一个元素，如果Map中存在，返回false，否则返回true

  public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

看一下Map的put方法

public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = sun.misc.Hashing.singleWordWangJenkinsHash(key);
        int i = indexFor(hash, table.length);
        for (HashMapEntry e = table[i]; e != null; e = e.next) {
            Object k;
        //这里比较了hash值跟equals方法
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

所以Set元素必须复写hashcode跟equals方法，不然会导致元素错乱

删除元素

  public boolean remove(Object o) {
  //直接调用map的方法
        return map.remove(o)==PRESENT;
    }

clear

 public void clear() {
 //调用map的Clear方法
        map.clear();
    }

contains方法

   public boolean contains(Object o) {
   调用map的contains方法
        return map.containsKey(o);
    }

isEmpty

  public boolean isEmpty() {
  //调用map的isEmpty方法
        return map.isEmpty();
    }

迭代

 public Iterator iterator() {
 //因为不需要value，所以只是调用了keySet的iterator
        return map.keySet().iterator();
    }

分析了一下，其实最终的底层实现都是在调用HashMap的方法，所以了解了HashMap的源码之后，HashSet其实就会比较简单了

总结

HashSet是非线程安全的，允许插入空元素
HashSet不允许重复元素
HashSet的Key需要复写hashcode跟equals方法