Husky Yue Daily Record: HashMap, Hashtable, ConcurrentHashMap Source Code Annotations

1. HashMap Source Code Annotation

initCapacity: 16
loadFactor: 0.75
resize: doubles the capacity

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
static final int MAXIMUM_CAPACITY = 1 << 30;        // 1073741824
// inside resize():
if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
        oldCap >= DEFAULT_INITIAL_CAPACITY)
    newThr = oldThr << 1; // double threshold
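
Seen from the outside, these constants mean that a map created with the no-argument constructor doubles its table on the put that pushes its size past 16 * 0.75 = 12. A minimal sketch (the class name and loop bound are illustrative, not from the JDK sources):

import java.util.HashMap;
import java.util.Map;

public class HashMapResizeDemo {
    public static void main(String[] args) {
        // Default table: 16 buckets, load factor 0.75, so the resize threshold is 12.
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 1; i <= 13; i++) {
            map.put(i, i); // the 13th put exceeds the threshold and the table doubles to 32
        }
        System.out.println(map.size()); // 13; the capacity itself is not exposed by the public API
    }
}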

HashMap is under the package java.util; the annotations of the class are summarized below:

Firstly, it is a hash table based implementation of the Map interface.

  • This implementation provides all of the optional map operations, and permits null values and the null key.
  • The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.
  • This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

Secondly, this implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.

  • Iteration over collection views requires time proportional to the “capacity” of the
    HashMap instance (the number of buckets) plus its size (the number
    of key-value mappings).
  • Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

Thirdly, an instance of HashMap has two parameters that affect its performance:

  • initial capacity and load factor.
  • The capacity is the number of buckets in the hash table, and the initial
    capacity is simply the capacity at the time the hash table is created.
  • The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

Fourthly, as a general rule, the default load factor (.75) offers a good tradeoff between time and space costs.

  • Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).
  • The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations.
  • If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

Fifthly, if many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table.

  • Note that using many keys with the same {@code hashCode()} is a sure way to slow down performance of any hash table.
  • To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.
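
One way to apply this advice is to derive the initial capacity from the expected number of mappings, mirroring the "maximum entries divided by load factor" rule above. The helper below is only a sketch; the name withExpectedSize and the 10,000-entry figure are illustrative:

import java.util.HashMap;
import java.util.Map;

public class PresizedHashMap {
    // Illustrative helper: choose an initial capacity large enough that the
    // default load factor (0.75) never forces a rehash for expectedSize entries.
    static <K, V> Map<K, V> withExpectedSize(int expectedSize) {
        int initialCapacity = (int) (expectedSize / 0.75f) + 1;
        return new HashMap<>(initialCapacity);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = withExpectedSize(10_000);
        for (int i = 0; i < 10_000; i++) {
            m.put("key" + i, i); // no automatic rehashing occurs while filling the map
        }
        System.out.println(m.size());
    }
}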

Sixthly, note that this implementation is not synchronized.

  • If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
  • A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.
  • This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be “wrapped” using the {@link Collections#synchronizedMap Collections.synchronizedMap} method. This is best done at creation time, to prevent accidental unsynchronized access to the map:
Map m = Collections.synchronizedMap(new HashMap(...));
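
A slightly fuller sketch of the wrapping idiom (generic types and the iteration block are added for illustration; note that compound actions and iteration over the wrapper must still be synchronized manually on the wrapper itself):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    public static void main(String[] args) {
        // Wrap at creation time so no unsynchronized reference to the HashMap escapes.
        Map<String, Integer> m = Collections.synchronizedMap(new HashMap<>());
        m.put("a", 1);

        // Iteration over a collection view still requires external synchronization.
        synchronized (m) {
            for (Map.Entry<String, Integer> e : m.entrySet()) {
                System.out.println(e.getKey() + " = " + e.getValue());
            }
        }
    }
}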

Seventhly, the iterators returned by all of this class's "collection view methods" are fail-fast:

  • if the map is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the iterator will throw a
    {@link ConcurrentModificationException}. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

Finally, note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.

  • Fail-fast iterators throw ConcurrentModificationException on a best-effort basis.
  • Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs, as the sketch below illustrates.
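
A small sketch of both behaviors, the fail-fast exception and the safe removal through the iterator's own remove method (class name and keys are illustrative; since the exception is only best-effort, treat the first half as typical rather than guaranteed behavior):

import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        // Structural modification during iteration typically triggers the exception
        // on the next call into the iterator.
        try {
            for (String key : map.keySet()) {
                map.remove("a");
            }
        } catch (ConcurrentModificationException expected) {
            System.out.println("fail-fast: " + expected);
        }

        // Removing through the iterator itself is the supported way.
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getKey().equals("b")) {
                it.remove();
            }
        }
        System.out.println(map); // {} ("a" was removed before the exception, "b" via the iterator)
    }
}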

2. Hashtable Source Code Annotation

initCapacity: 11
loadFactor: 0.75
resize: doubles plus one ((oldCapacity << 1) + 1)

public Hashtable() {
    this(11, 0.75f);                          // default: 11 buckets, load factor 0.75
}
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
int newCapacity = (oldCapacity << 1) + 1;     // inside rehash(): grow to 2n + 1

Hashtable implements a hash table, which maps keys to values. Any non-null object can be used as a key or as a value.

Firstly, to successfully store and retrieve objects from a hashtable, the objects used as keys must implement the hashCode method and the equals method.

  • An instance of Hashtable has two parameters that affect its performance: initial capacity and load factor.

  • The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created.

  • Note that the hash table is open: in the case of a “hash collision”, a single bucket stores multiple entries, which must be searched sequentially.

  • The load factor is a measure of how full the hashtable is allowed to get before its capacity is automatically increased.

  • The initial capacity and load factor parameters are merely hints to the implementation. The exact details as to when and whether the rehash method is invoked are implementation-dependent.

  • Generally, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the time cost to look up an entry (which is reflected in most Hashtable operations, including get and put).

  • The initial capacity controls a tradeoff between wasted space and the need for rehash operations, which are time-consuming. No rehash operations will ever occur if the initial capacity is greater than the maximum number of entries the Hashtable will contain divided by its load factor.

  • However, setting the initial capacity too high can waste space.

    If many entries are to be made into a Hashtable, creating it with a sufficiently large capacity may allow the entries to be inserted more efficiently than letting it perform automatic rehashing as needed to grow the table.

  • This example creates a hashtable of numbers. It uses the names of the numbers as keys:

   Hashtable<String, Integer> numbers
     = new Hashtable<String, Integer>();
   numbers.put("one", 1);
   numbers.put("two", 2);
   numbers.put("three", 3);
  • To retrieve a number, use the following code:
   Integer n = numbers.get("two");
   if (n != null) {
      System.out.println("two = " + n);
   }

Secondly, the iterators returned by the iterator method of the collections returned by all of this class's "collection view methods" are fail-fast:

  • if the Hashtable is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the iterator will throw a {@link ConcurrentModificationException}. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
  • The Enumerations returned by Hashtable’s keys and elements methods are not fail-fast (see the sketch after this list).
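
A short sketch of the contrast (names are illustrative): removing an entry while walking a legacy Enumeration does not raise ConcurrentModificationException, whereas the same modification during Iterator-based traversal typically would.

import java.util.Enumeration;
import java.util.Hashtable;

public class EnumerationNotFailFast {
    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("one", 1);
        table.put("two", 2);

        // The legacy Enumeration view is not fail-fast: structural modification
        // during traversal does not throw ConcurrentModificationException.
        Enumeration<String> keys = table.keys();
        while (keys.hasMoreElements()) {
            keys.nextElement();
            table.remove("two"); // no exception here, unlike an Iterator over keySet()
        }
        System.out.println(table); // {one=1}
    }
}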

Thirdly, note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.

  • Fail-fast iterators throw ConcurrentModificationException on a best-effort basis.
  • Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.

As of the Java 2 platform v1.2, this class was retrofitted to implement the {@link Map} interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, {@code Hashtable} is synchronized. If a thread-safe implementation is not needed, it is recommended to use {@link HashMap} in place of {@code Hashtable}. If a thread-safe highly-concurrent implementation is desired, then it is recommended to use {@link java.util.concurrent.ConcurrentHashMap} in place of {@code Hashtable}.

3. ConcurrentHashMap Source Code Annotation

ConcurrentHashMap is under package java.util.concurrent.
It is a hash table supporting full concurrency of retrievals and high expected concurrency for updates.

  • This class obeys the same functional specification as {@link java.util.Hashtable}, and includes versions of methods corresponding to each method of {@code Hashtable}.
  • However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access.
  • This class is fully interoperable with {@code Hashtable} in programs that rely on its thread safety but not on its synchronization details.

Firstly, retrieval operations (including {@code get}) generally do not block, so they may overlap with update operations (including {@code put} and {@code remove}). Retrievals reflect the results of the most recently completed update operations holding upon their onset.

  • More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value. For aggregate operations such as {@code putAll} and {@code clear}, concurrent retrievals may reflect insertion or removal of only some entries.
  • Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw {@link java.util.ConcurrentModificationException ConcurrentModificationException} (see the sketch after this list).
  • However, iterators are designed to be used by only one thread at a time.
  • Bear in mind that the results of aggregate status methods including {@code size}, {@code isEmpty}, and {@code containsValue} are typically useful only when a map is not undergoing concurrent updates in other threads.
  • Otherwise the results of these methods reflect transient states that may be adequate for monitoring or estimation purposes, but not for program control.
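
A minimal sketch of the weakly consistent behavior described above (thread body, key names, and counts are illustrative):

import java.util.concurrent.ConcurrentHashMap;

public class WeaklyConsistentDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1_000; i++) {
                map.put("k" + i, i); // updates never block readers
            }
        });
        writer.start();

        // Iteration never throws ConcurrentModificationException; it reflects some
        // state of the table at or since the moment the iterator was created.
        int seen = 0;
        for (String key : map.keySet()) {
            seen++;
        }
        // size() taken during concurrent updates is only an estimate for monitoring.
        System.out.println("entries seen: " + seen + ", size() now: " + map.size());
        writer.join();
    }
}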

Secondly, the table is dynamically expanded when there are too many collisions (i.e., keys that have distinct hash codes but fall into the same slot modulo the table size), with the expected average effect of maintaining roughly two bins per mapping (corresponding to a 0.75 load factor threshold for resizing).

  • There may be much variance around this average as mappings are added and removed, but overall, this maintains a commonly accepted time/space tradeoff for hash tables.
  • However, resizing this or any other kind of hash table may be a relatively slow operation. When possible, it is a good idea to provide a size estimate as an optional {@code initialCapacity} constructor argument. An additional optional {@code loadFactor} constructor argument provides a further means of customizing initial table capacity by specifying the table density to be used in calculating the amount of space to allocate for the given number of elements (see the constructor sketch after this list).
  • Also, for compatibility with previous versions of this class, constructors may optionally specify an expected {@code concurrencyLevel} as an additional hint for internal sizing.
  • Note that using many keys with exactly the same {@code hashCode()} is a sure way to slow down performance of any hash table.
  • To ameliorate impact, when keys are {@link Comparable}, this class may use comparison order among keys to help break ties.
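
These sizing hints correspond to the public constructors. A hedged sketch (the concrete numbers are arbitrary):

import java.util.concurrent.ConcurrentHashMap;

public class SizingHints {
    public static void main(String[] args) {
        // Size estimate only.
        ConcurrentHashMap<String, Integer> a = new ConcurrentHashMap<>(1_000);

        // Size estimate plus the table density (load factor) used for initial sizing.
        ConcurrentHashMap<String, Integer> b = new ConcurrentHashMap<>(1_000, 0.75f);

        // Legacy-compatible form with an expected concurrency level as a further hint.
        ConcurrentHashMap<String, Integer> c = new ConcurrentHashMap<>(1_000, 0.75f, 16);

        System.out.println(a.isEmpty() && b.isEmpty() && c.isEmpty()); // true
    }
}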

Thirdly, a {@link Set} projection of a ConcurrentHashMap may be created (using {@link #newKeySet()} or {@link #newKeySet(int)}), or viewed (using {@link #keySet(Object)}), when only keys are of interest and the mapped values are (perhaps transiently) not used or all take the same mapping value.
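
A short sketch of both projections (the keys and the default value 0 are illustrative):

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeySetProjections {
    public static void main(String[] args) {
        // A standalone concurrent Set, internally backed by a ConcurrentHashMap.
        Set<String> created = ConcurrentHashMap.newKeySet();
        created.add("alpha");

        // A key view of an existing map; add() inserts the supplied default value.
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        ConcurrentHashMap.KeySetView<String, Integer> view = map.keySet(0);
        view.add("beta"); // puts ("beta", 0) into the backing map

        System.out.println(created + " " + map); // [alpha] {beta=0}
    }
}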

Fourthly, a ConcurrentHashMap can be used as a scalable frequency map (a form of histogram or multiset) by using {@link java.util.concurrent.atomic.LongAdder} values and initializing via {@link #computeIfAbsent computeIfAbsent}. For example, to add a count to a {@code ConcurrentHashMap} of {@code LongAdder} values, create the counter with {@code computeIfAbsent} and then increment it, as in the sketch below.
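
A runnable version of that idiom (the words being counted are illustrative):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class FrequencyMapDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, LongAdder> freqs = new ConcurrentHashMap<>();
        for (String word : new String[] {"map", "set", "map"}) {
            // Atomically create the counter on first sight, then bump it;
            // safe even when many threads count the same key concurrently.
            freqs.computeIfAbsent(word, k -> new LongAdder()).increment();
        }
        System.out.println(freqs.get("map").sum()); // 2
    }
}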

Fifthly, this class and its views and iterators implement all of the optional methods of the {@link Map} and {@link Iterator} interfaces.

Sixthly, like {@link Hashtable} but unlike {@link HashMap}, this class does not allow {@code null} to be used as a key or value.

Seventhly, ConcurrentHashMaps support a set of sequential and parallel bulk operations that, unlike most {@link Stream} methods, are designed to be safely, and often sensibly, applied even with maps that are being concurrently updated by other threads; for example, when computing a snapshot summary of the values in a shared registry.

  • There are three kinds of operation, each with four forms, accepting functions with Keys, Values, Entries, and (Key, Value) arguments and/or return values. Because the elements of a ConcurrentHashMap are not ordered in any particular way, and may be processed in different orders in different parallel executions, the correctness of supplied functions should not depend on any ordering, or on any other objects or values that may transiently change while computation is in progress; and except for forEach actions, should ideally be side-effect-free.
  • Bulk operations on {@link java.util.Map.Entry} objects do not support method {@code setValue}.
  1. forEach: Perform a given action on each element.
    A variant form applies a given transformation on each element before performing the action.

  2. search: Return the first available non-null result of applying a given function on each element; skipping further search when a result is found.

  3. reduce: Accumulate each element. The supplied reduction function cannot rely on ordering (more formally, it should be both associative and commutative).

There are five variants:

  1. Plain reductions. (There is not a form of this method for (key, value) function arguments since there is no corresponding return type.)
  2. Mapped reductions that accumulate the results of a given function applied to each element.
  3. Reductions to scalar doubles, longs, and ints, using a given basis value.

These bulk operations accept a {@code parallelismThreshold} argument. Methods proceed sequentially if the current map size is estimated to be less than the given threshold.
Using a value of {@code Long.MAX_VALUE} suppresses all parallelism. Using a value of {@code 1} results in maximal parallelism by partitioning into enough subtasks to fully utilize the {@link ForkJoinPool#commonPool()} that is used for all parallel computations.
Normally, you would initially choose one of these extreme values, and then measure performance of using in-between values that trade off overhead versus throughput.
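
A hedged sketch of the three kinds of bulk operation, run sequentially with a Long.MAX_VALUE threshold (the data and lambdas are illustrative):

import java.util.concurrent.ConcurrentHashMap;

public class BulkOpsDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        // forEach: perform an action on each (key, value) pair.
        map.forEach(Long.MAX_VALUE, (k, v) -> System.out.println(k + "=" + v));

        // search: the first non-null result wins; returning null means "keep looking".
        String found = map.search(Long.MAX_VALUE, (k, v) -> v > 1 ? k : null);

        // reduce: a plain reduction of the values to their sum.
        Integer sum = map.reduceValues(Long.MAX_VALUE, Integer::sum);

        System.out.println(found + " " + sum); // e.g. "b 6" (search order is unspecified)
    }
}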

Eighthly, the concurrency properties of bulk operations follow from those of ConcurrentHashMap: Any non-null result returned from {@code get(key)} and related access methods bears a happens-before relation with the associated insertion or update.

  • The result of any bulk operation reflects the composition of these per-element relations (but is not necessarily atomic with respect to the map as a whole unless it is somehow known to be quiescent).
  • Conversely, because keys and values in the map are never null, null serves as a reliable atomic indicator of the current lack of any result.
  • To maintain this property, null serves as an implicit basis for all non-scalar reduction operations.
  • For the double, long, and int versions, the basis should be one that, when combined with any other value, returns that other value (more formally, it should be the identity element for the reduction).
  • Most common reductions have these properties; for example, computing a sum with basis 0 or a minimum with basis MAX_VALUE.

Ninthly, search and transformation functions provided as arguments should similarly return null to indicate the lack of any result (in which case it is not used).

  • In the case of mapped reductions, this also enables transformations to serve as filters, returning null (or, in the case of primitive specializations, the identity basis) if the element should not be combined. You can create compound transformations and filterings by composing them yourself under this “null means there is nothing there now” rule before using them in search or reduce operations, as sketched below.
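
For example, a mapped reduction whose transformer doubles as a filter (the threshold and data are illustrative):

import java.util.concurrent.ConcurrentHashMap;

public class FilteringReduction {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 5);
        map.put("b", 20);
        map.put("c", 30);

        // Returning null from the transformer excludes the element from the
        // reduction ("null means there is nothing there now").
        Integer bigSum = map.reduceValues(Long.MAX_VALUE,
                v -> v >= 10 ? v : null, // drop small values
                Integer::sum);

        System.out.println(bigSum); // 50
    }
}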

Tenthly, methods accepting and/or returning Entry arguments maintain key-value associations. They may be useful for example when finding the key for the greatest value. Note that "plain" Entry arguments can be supplied using {@code new AbstractMap.SimpleEntry(k,v)}.
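
For instance, finding the key of the greatest value with an Entry-based reduction (a sketch with illustrative data):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MaxValueEntry {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 7);
        map.put("c", 3);

        // Keep whichever entry carries the larger value; the key-value association survives.
        Map.Entry<String, Integer> max = map.reduceEntries(Long.MAX_VALUE,
                (e1, e2) -> e1.getValue() >= e2.getValue() ? e1 : e2);

        System.out.println(max.getKey() + " -> " + max.getValue()); // b -> 7
    }
}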

Eleventhly, bulk operations may complete abruptly, throwing an exception encountered in the application of a supplied function. Bear in mind when handling such exceptions that other concurrently executing functions could also have thrown exceptions, or would have done so if the first exception had not occurred.

Speedups for parallel compared to sequential forms are common but not guaranteed. Parallel operations involving brief functions on small maps may execute more slowly than sequential forms if the underlying work to parallelize the computation is more expensive than the computation itself. Similarly, parallelization may not lead to much actual parallelism if all processors are busy performing unrelated tasks.

  • All arguments to all task methods must be non-null.

  • This class is a member of the Java Collections Framework.
