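I have been reading the HashMap source code lately and found the comments well written, so I decided to try translating them.
Note: I only translated the comment at the top of the HashMap class; the later comments will have to wait until I have time.
Note: this is based on JDK 1.8.
If a linked list in the figure above ends up holding too many entries, JDK 1.8 stores that bucket as a red-black tree instead, to improve lookup (get) performance.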
public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable
Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
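HashMap is a hash table based implementation of the Map interface. It provides all of the optional map operations and allows null values and a null key. (HashMap is roughly the same as Hashtable, except that HashMap is unsynchronized and allows nulls.) HashMap makes no guarantee about the order of the map (that is, iteration order need not match the order of the put calls), and it does not guarantee that the order stays the same over time (after a rehash, for example, it is quite likely to change).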
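To make the null handling concrete, here is a minimal sketch. The class name NullKeyDemo and its build() helper are made up for illustration; only the java.util calls are real.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class NullKeyDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> m = new HashMap<>();
        m.put(null, 1);    // a single null key is permitted
        m.put("a", null);  // null values are permitted
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = build();
        System.out.println(m.get(null));         // 1
        System.out.println(m.containsKey("a"));  // true
        System.out.println(m.get("a"));          // null
    }
}
```

Because get returns null both for a missing key and for a key mapped to null, containsKey is the way to tell the two cases apart.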
This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the “capacity” of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
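HashMap offers constant-time performance for the basic operations (put, get), assuming the hash function spreads the elements evenly across the buckets. Iterating over the collection views takes time proportional to the capacity of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance matters, do not set the initial capacity too high, and do not set the load factor too low. (The lower the load factor, the more empty buckets there are. If the load factor is too high, collisions become more likely and a single bucket ends up holding many elements, stored either as a linked list or as a red-black tree.)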
An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
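Two parameters affect the performance of a HashMap instance: the initial capacity and the load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the table is created. The load factor measures how full the hash table is allowed to get before its capacity is increased automatically. When the number of entries in the table exceeds the product of the load factor and the current capacity, the table is rehashed (that is, its internal data structures are rebuilt) so that it has approximately twice as many buckets.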
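The resize rule above can be sketched as a small model. This is not the JDK implementation; ResizeModel and resizeCount are hypothetical names, and the model only counts how often "size > capacity * loadFactor" would trigger a doubling.

```java
// Simplified model of the rule described above, not HashMap's actual code.
class ResizeModel {
    /** Counts doublings while inserting `entries` distinct keys one at a time. */
    static int resizeCount(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int resizes = 0;
        for (int size = 1; size <= entries; size++) {
            // Resize when the entry count exceeds capacity * loadFactor.
            while (size > capacity * loadFactor) {
                capacity *= 2;
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(resizeCount(12, 16, 0.75f));   // 0: 12 does not exceed 16 * 0.75
        System.out.println(resizeCount(13, 16, 0.75f));   // 1: the 13th entry exceeds 12
        System.out.println(resizeCount(100, 16, 0.75f));  // 4: 16 -> 32 -> 64 -> 128 -> 256
    }
}
```

With the default capacity of 16 and load factor of 0.75, the model says the first doubling happens on the 13th entry.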
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
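As a general rule, the default load factor of 0.75 (the table grows once about 3/4 of its capacity is in use) strikes a good balance between time and space costs. A higher load factor reduces the space overhead but increases the lookup cost (which shows up in most HashMap operations, including put and get). When setting the initial capacity, take into account the expected number of entries in the map and the load factor, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur.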
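As a worked example of that last rule, the sketch below picks an initial capacity strictly greater than the expected entry count divided by the load factor. CapacitySizing and capacityFor are hypothetical helpers, and the figure of 1000 entries is an arbitrary assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration.
class CapacitySizing {
    /** Smallest int strictly greater than expectedEntries / loadFactor. */
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;       // assumed maximum number of entries
        float loadFactor = 0.75f;  // the default load factor
        int capacity = capacityFor(expected, loadFactor);
        System.out.println(capacity);  // 1334

        // HashMap(int initialCapacity, float loadFactor) is a real constructor.
        Map<Integer, String> m = new HashMap<>(capacity, loadFactor);
        for (int i = 0; i < expected; i++) {
            m.put(i, "v" + i);
        }
        System.out.println(m.size());  // 1000
    }
}
```

HashMap may round the requested capacity up internally, which only makes the table larger than asked for, so the rule still holds.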
If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same {@code hashCode()} is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.
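If many key-value mappings are going to be stored in a HashMap instance, creating it with a sufficiently large capacity is more efficient than letting it rehash automatically as it grows. Note that having many keys with the same hashCode is a sure way to slow down a hash table. (The English comment is rather playful here. In Chinese we would probably write it as: never let many keys share the same hashCode, because that degrades the hash table's performance.) To reduce the impact of this situation (many keys with the same hashCode), when the keys implement Comparable, this class may use the comparison order among those keys to help break ties.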
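The worst case described above can be reproduced with a key type whose hashCode is deliberately constant. BadKey and CollidingKeys are hypothetical names invented for this sketch. The map stays correct; it just loses the benefit of hashing, and implementing Comparable gives HashMap an ordering it may use inside the crowded bucket.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class for illustration.
class CollidingKeys {
    /** Key whose hashCode is constant, so every instance collides. */
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }  // deliberately bad

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // Gives HashMap an order it may use to break ties among colliding keys.
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> m = new HashMap<>();
        for (int i = 0; i < n; i++) {
            m.put(new BadKey(i), i);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> m = fill(10_000);
        System.out.println(m.size());                 // 10000
        System.out.println(m.get(new BadKey(1234)));  // 1234
    }
}
```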
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map.
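Note that this HashMap implementation is not synchronized. If multiple threads access a hash map concurrently and at least one of them modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more key-value mappings; merely changing the value associated with a key the map already contains is not a structural modification.) This is typically done by synchronizing on some object that naturally encapsulates the map (this sentence was translated by @liu_shi_jun, thanks).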
If no such object exists, the map should be “wrapped” using the {@link Collections#synchronizedMap Collections.synchronizedMap} method. This is best done at creation time, to prevent accidental unsynchronized access to the map: Map m = Collections.synchronizedMap(new HashMap(…));
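(This paragraph continues from the last sentence of the previous one.) If no such object (an object that encapsulates the map) exists, the map should be wrapped using Collections#synchronizedMap. To prevent accidental unsynchronized access to the map, it is best to do this at creation time: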
Map m = Collections.synchronizedMap(new HashMap(…));
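A slightly fuller sketch of the same idea follows. SyncMapDemo and countConcurrently are hypothetical names, and the thread and key counts are arbitrary. One detail worth showing is that iterating over the wrapped map still needs a manual synchronized block on the wrapper, as the Collections.synchronizedMap documentation requires.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class for illustration.
class SyncMapDemo {
    /** Several threads put distinct keys; returns how many keys are seen afterwards. */
    static int countConcurrently(int threads, int perThread) {
        // Wrap at creation time so no unsynchronized reference escapes.
        Map<Integer, Integer> m = Collections.synchronizedMap(new HashMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    m.put(base + i, i);  // each call locks the wrapper
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int seen = 0;
        synchronized (m) {  // iteration must be guarded manually
            for (Integer key : m.keySet()) {
                seen++;
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 1000));  // 4000
    }
}
```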
The iterators returned by all of this class’s “collection view methods” are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the iterator will throw a {@link ConcurrentModificationException}. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
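The iterators returned by all of this class's collection view methods are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way other than through the iterator's own remove method, the iterator throws a ConcurrentModificationException. So when faced with concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at some undetermined time in the future.
Note that the fail-fast behavior of an iterator cannot be guaranteed. Generally speaking, it is impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. It is therefore wrong to write a program that depends on this exception for its correctness; the fail-fast behavior of iterators should be used only to detect bugs.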
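The difference between modifying through the map and modifying through the iterator can be seen in a small single-threaded sketch. FailFastDemo and its methods are hypothetical names. The exception is caught here only to display it, not as something a real program should rely on.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class for illustration.
class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        m.put("c", 3);
        return m;
    }

    /** Removes through the map while iterating; reports whether the iterator failed. */
    static boolean removeViaMapFails() {
        Map<String, Integer> m = sample();
        try {
            for (String key : m.keySet()) {
                m.remove(key);  // structural modification behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Removes odd values through the iterator's own remove method. */
    static int removeViaIterator() {
        Map<String, Integer> m = sample();
        Iterator<Map.Entry<String, Integer>> it = m.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove();  // the one permitted way to remove during iteration
            }
        }
        return m.size();
    }

    public static void main(String[] args) {
        System.out.println(removeViaMapFails());  // true in this single-threaded case
        System.out.println(removeViaIterator());  // 1
    }
}
```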
If anything is translated poorly, corrections are very welcome.