HashSet in Java


1. The class documentation (Javadoc) is as follows:

java.util
Class HashSet<E>
java.lang.Object
java.util.AbstractCollection<E>
java.util.AbstractSet<E>
java.util.HashSet<E>

Type Parameters: 
E - the type of elements maintained by this set 
All Implemented Interfaces: 
Serializable, Cloneable, Iterable<E>, Collection<E>, Set<E>
Direct Known Subclasses: 
JobStateReasons, LinkedHashSet 
--------------------------------------------------------------------------------
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, Serializable
This class implements the Set interface, backed by a hash table (actually a HashMap instance). 
It makes no guarantees as to the iteration order of the set; in particular,
it does not guarantee that the order will remain constant over time. This class permits the null element. 
This class offers constant time performance for the basic operations 
(add, remove, contains and size), assuming the hash function disperses the elements properly 
among the buckets. Iterating over this set requires time proportional to the sum of the HashSet
instance's size (the number of elements) plus the "capacity" of the backing HashMap instance
(the number of buckets). Thus, it's very important not to set the initial capacity too high 
(or the load factor too low) if iteration performance is important. 
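
To make the "backed by a HashMap" remark concrete, here is a minimal sketch of how a set can be layered on top of a map. The class name SimpleHashSet and the PRESENT marker are illustrative names chosen for this example. The sketch leaves out most of what the real java.util.HashSet provides (iteration, cloning, serialization), so treat it as an approximation of the idea rather than the JDK source.

```java
import java.util.HashMap;

// Illustrative sketch only: a tiny set that stores its elements as the keys of a HashMap.
class SimpleHashSet<E> {
    // Shared dummy value associated with every key; only the keys carry information.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true if the element was not already present.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // Returns true if the element was present and has been removed.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>();
        System.out.println(set.add("a"));   // true: "a" was absent
        System.out.println(set.add("a"));   // false: "a" is already a key
        System.out.println(set.add(null));  // true: HashMap permits one null key
        System.out.println(set.size());     // 2
    }
}
```

Because every operation delegates to the map, the set inherits the map's performance characteristics, including the cost of iterating over empty buckets.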

Note that this implementation is not synchronized. If multiple threads access a hash set
concurrently, and at least one of the threads modifies the set, it must be synchronized 
externally. This is typically accomplished by synchronizing on some object that naturally 
encapsulates the set. If no such object exists, the set should be "wrapped" using the 
Collections.synchronizedSet method. This is best done at creation time, to prevent accidental 
unsynchronized access to the set:

Set s = Collections.synchronizedSet(new HashSet(...));

The iterators returned by this class's 
iterator method are fail-fast: if the set is modified at any time after the iterator is created, 
in any way except through the iterator's own remove method, the Iterator throws a 
ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator 
fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an 
undetermined time in the future. 

Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally 
speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent 
modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. 
Therefore, it would be wrong to write a program that depended on this exception for its 
correctness: the fail-fast behavior of iterators should be used only to detect bugs. 
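
The fail-fast behavior described above can be observed with a small experiment. The sketch below, using a hypothetical class name FailFastDemo, first removes elements through the set itself during iteration and then shows the supported alternative, the iterator's own remove method. As the documentation stresses, the exception is raised on a best-effort basis, so the first method is a diagnostic demonstration and not something to rely on.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class FailFastDemo {
    // Modifies the set directly while iterating over it. Returns true if the
    // iterator noticed the change and threw ConcurrentModificationException.
    static boolean modifyDuringIteration() {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            set.add(i);
        }
        try {
            for (Integer n : set) {
                set.remove(n); // structural change that bypasses the iterator
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // Removes the even numbers through the iterator, which is the supported way.
    static Set<Integer> removeEvens() {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            set.add(i);
        }
        Iterator<Integer> it = set.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println("exception detected: " + modifyDuringIteration());
        System.out.println("odd numbers left: " + removeEvens().size());
    }
}
```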

This class is a member of the Java Collections Framework.

Since: 
1.2 
See Also: 
Collection, Set, TreeSet, HashMap, Serialized Form 

2. Main methods

  • The add method of HashSet
public boolean add(E e)
Adds the specified element to this set if it is not already present. More formally, 
adds the specified element e to this set if this set contains no element e2 such that
(e==null ? e2==null : e.equals(e2)).
If this set already contains the element, the call leaves the set unchanged and returns 
false.
Specified by: 
add in interface Collection<E> 
Specified by: 
add in interface Set<E> 
Overrides: 
add in class AbstractCollection<E> 
Parameters: 
e - element to be added to this set 
Returns: 
true if this set did not already contain the specified element 

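Translator's note: if the specified element is not yet in the set, it is added and the call returns true. If the set already contains the element, the call leaves the set unchanged and returns false.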
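
A short sketch of the return value of add, using a hypothetical helper addTwice that adds the same element twice and reports both results:

```java
import java.util.HashSet;
import java.util.Set;

class AddReturnValueDemo {
    // Adds the same element twice and returns the two results of add().
    static boolean[] addTwice(Set<String> set, String element) {
        boolean first = set.add(element);
        boolean second = set.add(element);
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();

        boolean[] r = addTwice(set, "java");
        System.out.println(r[0] + " " + r[1]); // true false

        // null is a permitted element and follows the same rule.
        boolean[] n = addTwice(set, null);
        System.out.println(n[0] + " " + n[1]); // true false

        System.out.println(set.size());        // 2
    }
}
```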

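3. Interview question

  • Q: Given a string A (not necessarily all letters) and its length n, where the string is guaranteed to contain a repeated character, design an efficient algorithm that finds the first character to appear a second time.
    Test sample:
    "qywyer23tdd",11
    Output: y
    Analysis:
    There are several ways to solve this. Two of them:
    1. Use the properties of a data structure, such as the HashSet discussed here.
    2. Since the number of printable characters is limited, use an array to record the characters already seen. When a character is encountered again, output it.
    The first approach is implemented below: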
package grammer;

import java.util.HashSet;
import java.util.Set;

public class TestHashSet {
    public static void main(String[] args) {
        getSecondDisplay("qywyer23tdd");
    }

    // Prints the first character that appears for the second time in str.
    public static void getSecondDisplay(String str) {
        // Characters seen so far; add() returns false if the character is already present.
        Set<Character> seen = new HashSet<>();

        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (!seen.add(c)) {
                System.out.println(c);
                return;
            }
        }
    }
}

Output:
y
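
The second approach mentioned above, recording seen characters in an array, is not implemented in the original post. A minimal sketch could look like the following. The class name FirstRepeatByArray and the decision to throw an exception when no character repeats are assumptions of this example, since the problem statement guarantees that a repeat exists.

```java
class FirstRepeatByArray {
    // Returns the first character that occurs for the second time, scanning left to right.
    static char firstRepeated(String str) {
        // One flag per possible char value (0 to 65535), so any char can index the array.
        boolean[] seen = new boolean[Character.MAX_VALUE + 1];
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (seen[c]) {
                return c;
            }
            seen[c] = true;
        }
        // Not reachable for inputs that satisfy the problem's guarantee.
        throw new IllegalArgumentException("no repeated character in: " + str);
    }

    public static void main(String[] args) {
        System.out.println(firstRepeated("qywyer23tdd")); // y
    }
}
```

Compared with the HashSet version, this avoids boxing each char into a Character, at the cost of allocating a fixed 65536-entry array up front.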

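  • HashSet does not guarantee iteration order (that is, the order in which elements are returned is unrelated to the order in which they were inserted)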
package grammer;

import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class TestHashSet {
    static Set<String> hashSet = new HashSet<>();

    public static void main(String[] args) {
        displayHashSet();
    }

    // Prints the contents of the HashSet
    public static void displayHashSet() {
        hashSet.add("My");
        hashSet.add("name");
        hashSet.add("is");
        hashSet.add("LittleLawson");
        Iterator<String> iterator = hashSet.iterator();

        // The output order may differ from the insertion order
        while (iterator.hasNext()) {
            System.out.println(iterator.next());
        }
    }
}
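
When a predictable order is needed, the related classes named in the documentation above can be used instead. The following sketch (the class name OrderedSetDemo and its helper methods are illustrative) inserts the same four words into a LinkedHashSet, which iterates in insertion order, and into a TreeSet, which iterates in the natural order of its elements.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class OrderedSetDemo {
    // Collects the words through a LinkedHashSet, which keeps insertion order.
    static List<String> insertionOrder(String... words) {
        Set<String> set = new LinkedHashSet<>();
        for (String w : words) {
            set.add(w);
        }
        return new ArrayList<>(set);
    }

    // Collects the words through a TreeSet, which sorts by String.compareTo.
    static List<String> sortedOrder(String... words) {
        Set<String> set = new TreeSet<>();
        for (String w : words) {
            set.add(w);
        }
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder("My", "name", "is", "LittleLawson"));
        // [My, name, is, LittleLawson]
        System.out.println(sortedOrder("My", "name", "is", "LittleLawson"));
        // [LittleLawson, My, is, name] (uppercase letters sort before lowercase)
    }
}
```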

4. Summary

  • Backed by a HashMap
  • Not thread-safe
  • Iteration order is not guaranteed
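
For the thread-safety point, one option is the Collections.synchronizedSet wrapper. The sketch below uses a hypothetical class SynchronizedSetDemo in which two threads add disjoint ranges of integers to a wrapped set. Note that iteration over the wrapper still has to be guarded by synchronizing on the set manually.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {
    // Two threads each add perThread distinct integers; returns the number of
    // elements counted by iterating over the set afterwards.
    static int addFromTwoThreads(int perThread) {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());

        Thread t1 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) set.add(i);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) set.add(perThread + i);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }

        // Individual calls such as add() are synchronized by the wrapper,
        // but iteration must be wrapped in a synchronized block by the caller.
        int count = 0;
        synchronized (set) {
            for (Integer ignored : set) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(addFromTwoThreads(1000)); // 2000
    }
}
```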

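5. Open questions

  • What is the difference between size and capacity?
    The size is the number of elements in the set; the capacity is the number of buckets in the backing HashMap. The LinkedHashSet documentation contrasts the two: "Iteration over a LinkedHashSet requires time proportional to the size of the set, regardless of its capacity. Iteration over a HashSet is likely to be more expensive, requiring time proportional to its capacity."
  • How is HashSet implemented?
    When an element is added, HashSet first calls the element's hashCode method to obtain its hash value, then applies shifts and other operations to that value to compute the element's storage position in the hash table.
    Case 1: if no element is currently stored at the computed position, the new element is stored there directly.
    Case 2: if the computed position already holds other elements, the element's equals method is called to compare it with the elements at that position. If equals returns true, the element is treated as a duplicate and is not added. If equals returns false, the element is added.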
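
The practical consequence of the two cases above is that duplicate detection depends on how the element class defines hashCode and equals. The sketch below uses two hypothetical classes: Point overrides both methods, while RawPoint inherits the identity-based versions from Object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeDemo {
    // Hypothetical value class: equal coordinates mean equal objects.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Hypothetical class that keeps Object's identity-based equals and hashCode.
    static final class RawPoint {
        final int x;
        final int y;

        RawPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static int sizeWithPoints() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // same hash, equals returns true: rejected
        return set.size();
    }

    static int sizeWithRawPoints() {
        Set<RawPoint> set = new HashSet<>();
        set.add(new RawPoint(1, 2));
        set.add(new RawPoint(1, 2)); // distinct objects, equals returns false: both kept
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithPoints());    // 1
        System.out.println(sizeWithRawPoints()); // 2
    }
}
```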
