Java ArrayList: Implementation Principles and Source Code Walkthrough

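Introduction to ArrayList:

ArrayList is one of the List implementations in the Java Collections Framework. It is a list whose size changes dynamically, and its underlying data structure is an Object[] array that can hold elements of any object type, including null.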

/**
 * Resizable-array implementation of the List interface. Implements all
 * optional list operations, and permits all elements, including null.
 * In addition to implementing the List interface, this class provides
 * methods to manipulate the size of the array that is used internally to store
 * the list. (This class is roughly equivalent to Vector, except that
 * it is unsynchronized.)
 *

 * The size, isEmpty, get, set, iterator, and listIterator operations run in
 * constant time. The add operation runs in amortized constant time, that is,
 * adding n elements requires O(n) time. All of the other operations run in
 * linear time (roughly speaking). The constant factor is low compared to that
 * for the LinkedList implementation.
 *

 * Each ArrayList instance has a capacity. The capacity is the size of the
 * array used to store the elements in the list. It is always at least as
 * large as the list size. As elements are added to an ArrayList, its capacity
 * grows automatically. The details of the growth policy are not specified
 * beyond the fact that adding an element has constant amortized time cost.
 *

 * An application can increase the capacity of an ArrayList instance before
 * adding a large number of elements using the ensureCapacity operation. This
 * may reduce the amount of incremental reallocation.
 *

 * Note that this implementation is not synchronized. If multiple threads
 * access an ArrayList instance concurrently, and at least one of the threads
 * modifies the list structurally, it must be synchronized externally. (A
 * structural modification is any operation that adds or deletes one or more
 * elements, or explicitly resizes the backing array; merely setting the value
 * of an element is not a structural modification.) This is typically
 * accomplished by synchronizing on some object that naturally encapsulates
 * the list.
 *
 * If no such object exists, the list should be "wrapped" using the
 * {@link Collections#synchronizedList Collections.synchronizedList} method.
 * This is best done at creation time, to prevent accidental unsynchronized
 * access to the list:
 *

 *   List list = Collections.synchronizedList(new ArrayList(...));
 * 
 *

 * The iterators returned by this class's {@link #iterator() iterator} and
 * {@link #listIterator(int) listIterator} methods are fail-fast: if the list
 * is structurally modified at any time after the iterator is created, in any
 * way except through the iterator's own {@link ListIterator#remove() remove}
 * or {@link ListIterator#add(Object) add} methods, the iterator will throw a
 * {@link ConcurrentModificationException}. Thus, in the face of concurrent
 * modification, the iterator fails quickly and cleanly, rather than risking
 * arbitrary, non-deterministic behavior at an undetermined time in the future.
 *
 * (Author's note: fail-fast is an error-detection mechanism of the Java
 * collections. A typical case: while thread A traverses a collection through
 * an iterator, another thread changes the collection's contents, and thread
 * A's next access throws ConcurrentModificationException. The same exception
 * is also thrown within a single thread if the list is structurally modified
 * directly, rather than through the iterator, during iteration.)
 *

 * Note that the fail-fast behavior of an iterator cannot be guaranteed as it
 * is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification. Fail-fast iterators
 * throw {@code ConcurrentModificationException} on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this
 * exception for its correctness: the fail-fast behavior of iterators should
 * be used only to detect bugs.
 *

 * This class is a member of the Java Collections Framework.
 *
 * @author Josh Bloch
 * @author Neal Gafter
 * @see Collection
 * @see List
 * @see LinkedList
 * @see Vector
 * @since 1.2
 */
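The fail-fast behavior described in the Javadoc above can be observed even in a single thread. The following is a minimal sketch, not JDK code: the class name FailFastDemo, its two helper methods, and the flower names are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class FailFastDemo {

    // Removes the target through the list itself while a for-each loop
    // (which uses an iterator internally) is running.
    // Returns true if the iterator detected the structural modification.
    static boolean removeThroughList(List<String> list, String target) {
        try {
            for (String s : list) {
                if (s.equals(target)) {
                    list.remove(s); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes the target through the iterator's own remove(), the supported way.
    static void removeThroughIterator(List<String> list, String target) {
        for (Iterator<String> it = list.iterator(); it.hasNext();) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("rose", "lily", "tulip"));
        System.out.println(removeThroughList(a, "rose")); // true

        List<String> b = new ArrayList<>(Arrays.asList("rose", "lily", "tulip"));
        removeThroughIterator(b, "rose");
        System.out.println(b); // [lily, tulip]
    }
}
```

Detection is best-effort, as the Javadoc says: removing the second-to-last element in the first method would end the loop without any exception, so the exception should only be treated as a debugging aid.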

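ArrayList member variables:

1.DEFAULT_CAPACITY: the default initial capacity

2.EMPTY_ELEMENTDATA: the shared empty array used by ArrayList instances that hold no elements

3.elementData: the array that actually stores the ArrayList elements

4.size: the number of elements the ArrayList contains (not the length of elementData, which is the capacity)

5.MAX_ARRAY_SIZE: the maximum array size ArrayList will allocate, namely Integer.MAX_VALUE - 8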

	/**
	 * Default initial capacity.
	 */
	private static final int DEFAULT_CAPACITY = 10;

	/**
	 * Shared empty array instance used for empty instances.
	 */
	private static final Object[] EMPTY_ELEMENTDATA = {};

	/**
	 * The array buffer into which the elements of the ArrayList are stored. The
	 * capacity of the ArrayList is the length of this array buffer. Any empty
	 * ArrayList with elementData == EMPTY_ELEMENTDATA will be expanded to
	 * DEFAULT_CAPACITY when the first element is added.
	 */
	private transient Object[] elementData;

	/**
	 * The size of the ArrayList (the number of elements it contains).
	 * 
	 * @serial
	 */
	private int size;

	/**
	 * The maximum size of array to allocate. Some VMs reserve some header words in
	 * an array. Attempts to allocate larger arrays may result in
	 * OutOfMemoryError: Requested array size exceeds VM limit
	 */
	private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

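ArrayList constructors:

1.ArrayList(int initialCapacity): constructs a list with the specified initial capacity

2.ArrayList(): the no-argument constructor; the backing array is expanded to the default capacity of 10 when the first element is added

3.ArrayList(Collection<? extends E> c): constructs a list from the given collection; the elements keep the order in which that collection's iterator returns them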

	/**
	 * Constructs an empty list with the specified initial capacity.
	 * 
	 * @param initialCapacity
	 *            the initial capacity of the list
	 * @throws IllegalArgumentException
	 *             if the specified initial capacity is negative
	 */
	public ArrayList(int initialCapacity) {
		super();
		if (initialCapacity < 0)
			throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
		this.elementData = new Object[initialCapacity];
	}

	/**
	 * Constructs an empty list with an initial capacity of ten. The backing
	 * array starts out empty and is expanded to capacity 10 when the first
	 * element is added.
	 */
	public ArrayList() {
		super();
		this.elementData = EMPTY_ELEMENTDATA;
	}

	/**
	 * Constructs a list containing the elements of the specified collection, in
	 * the order they are returned by the collection's iterator.
	 * 
	 * @param c
	 *            the collection whose elements are to be placed into this list
	 * @throws NullPointerException
	 *             if the specified collection is null
	 */
	public ArrayList(Collection<? extends E> c) {
		elementData = c.toArray();
		size = elementData.length;
		// c.toArray might (incorrectly) not return Object[] (see 6260652)
		if (elementData.getClass() != Object[].class)
			elementData = Arrays.copyOf(elementData, size, Object[].class);
	}
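The three constructors can be exercised as follows. This is a small sketch for illustration: ConstructorDemo, copyOf, and rejectsCapacity are hypothetical names introduced here, and LinkedHashSet is chosen only because its iteration order is predictable.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class ConstructorDemo {

    // Copies a collection into a new ArrayList. The element order follows
    // the order in which the source collection's iterator returns them.
    static List<String> copyOf(Collection<String> source) {
        return new ArrayList<>(source);
    }

    // Returns true if ArrayList(int) rejects the given initial capacity.
    static boolean rejectsCapacity(int initialCapacity) {
        try {
            new ArrayList<String>(initialCapacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Collection<String> flowers = new LinkedHashSet<>();
        flowers.add("rose");
        flowers.add("lily");
        flowers.add("tulip");

        System.out.println(copyOf(flowers));                    // [rose, lily, tulip]
        System.out.println(rejectsCapacity(-1));                // true
        System.out.println(new ArrayList<String>(100).size());  // 0: capacity is not size
    }
}
```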

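How ArrayList grows automatically:

When a new element is inserted, ArrayList checks whether the backing array needs to grow and, if so, grows it before storing the element.

Methods that insert elements:

1.add(E e): appends one element to the end of the ArrayList; amortized O(1)

2.add(int index, E element): inserts one element at the specified position; O(n)

3.addAll(Collection<? extends E> c): appends a collection to the end of the ArrayList; O(m) for m added elements, amortized

4.addAll(int index, Collection<? extends E> c): inserts a collection at the specified position; O(n + m)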

	/**
	 * Appends the specified element to the end of this list.
	 * 
	 * @param e
	 *            element to be appended to this list
	 * @return true (as specified by {@link Collection#add})
	 */
	public boolean add(E e) {
		ensureCapacityInternal(size + 1); // Increments modCount!!
		elementData[size++] = e;
		return true;
	}

	/**
	 * Inserts the specified element at the specified position in this list.
	 * Shifts the element currently at that position (if any) and any subsequent
	 * elements to the right (adds one to their indices).
	 *
	 * @param index
	 *            index at which the specified element is to be inserted
	 * @param element
	 *            element to be inserted
	 * @throws IndexOutOfBoundsException
	 *             {@inheritDoc}
	 */
	public void add(int index, E element) {
		rangeCheckForAdd(index);

		ensureCapacityInternal(size + 1); // Increments modCount!!
		System.arraycopy(elementData, index, elementData, index + 1, size - index);
		elementData[index] = element;
		size++;
	}
	/**
	 * Appends all of the elements in the specified collection to the end of
	 * this list, in the order that they are returned by the specified
	 * collection's Iterator. The behavior of this operation is undefined if the
	 * specified collection is modified while the operation is in progress.
	 * (This implies that the behavior of this call is undefined if the
	 * specified collection is this list, and this list is nonempty.)
	 *
	 * @param c
	 *            collection containing elements to be added to this list
	 * @return true if this list changed as a result of the call
	 * @throws NullPointerException
	 *             if the specified collection is null
	 */
	public boolean addAll(Collection<? extends E> c) {
		Object[] a = c.toArray();
		int numNew = a.length;
		ensureCapacityInternal(size + numNew); // Increments modCount
		System.arraycopy(a, 0, elementData, size, numNew);
		size += numNew;
		return numNew != 0;
	}

	/**
	 * Inserts all of the elements in the specified collection into this list,
	 * starting at the specified position. Shifts the element currently at that
	 * position (if any) and any subsequent elements to the right (increases
	 * their indices). The new elements will appear in the list in the order
	 * that they are returned by the specified collection's iterator.
	 *
	 * @param index
	 *            index at which to insert the first element from the specified
	 *            collection
	 * @param c
	 *            collection containing elements to be added to this list
	 * @return true if this list changed as a result of the call
	 * @throws IndexOutOfBoundsException
	 *             {@inheritDoc}
	 * @throws NullPointerException
	 *             if the specified collection is null
	 */
	public boolean addAll(int index, Collection<? extends E> c) {
		rangeCheckForAdd(index);

		Object[] a = c.toArray();
		int numNew = a.length;
		ensureCapacityInternal(size + numNew); // Increments modCount

		int numMoved = size - index;
		if (numMoved > 0)
			System.arraycopy(elementData, index, elementData, index + numNew, numMoved);

		System.arraycopy(a, 0, elementData, index, numNew);
		size += numNew;
		return numNew != 0;
	}
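The System.arraycopy call in add(int index, E element) is what shifts the tail of the array one slot to the right. A minimal sketch of that single step on a plain int array is shown below; ShiftInsertDemo and its insert method are hypothetical names, and the sketch assumes the caller has already made sure the array has a free slot, which is the job ensureCapacityInternal does in the real class.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the JDK.
class ShiftInsertDemo {

    // Inserts value at index among the first `size` used slots of data.
    // Assumes data.length > size, i.e. capacity was ensured beforehand.
    static void insert(int[] data, int size, int index, int value) {
        // Move data[index .. size-1] to data[index+1 .. size].
        // System.arraycopy handles overlapping source and destination correctly.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 4, 5, 0}; // four used slots, one free slot
        insert(data, 4, 2, 3);
        System.out.println(Arrays.toString(data)); // [1, 2, 3, 4, 5]
    }
}
```

The number of elements copied is size - index, which is why inserting near the front of a long list costs more than inserting near the end.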

Let us use the add(E) method, which appends a single new element, to walk through what the source code does:

	/**
	 * Appends the specified element to the end of this list.
	 * 
	 * @param e
	 *            element to be appended to this list
	 * @return true (as specified by {@link Collection#add})
	 */
	public boolean add(E e) {
		ensureCapacityInternal(size + 1); // Increments modCount!!
		elementData[size++] = e;
		return true;
	}
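First, add checks whether the array needs to grow by calling ensureCapacityInternal(int minCapacity). The argument minCapacity is size + 1, that is, the current number of elements plus the one element being inserted.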

	private void ensureCapacityInternal(int minCapacity) {
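		// If the backing array is still the shared EMPTY_ELEMENTDATA
		if (elementData == EMPTY_ELEMENTDATA) {
			// raise minCapacity to at least the default capacity (10)
			minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
		}
		// Check whether the current capacity is sufficient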
		ensureExplicitCapacity(minCapacity);
	}
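Suppose the program creates an ArrayList with the no-argument constructor: List flowerList = new ArrayList<>();

At this point the elementData of flowerList is EMPTY_ELEMENTDATA, so the required capacity minCapacity is set to Math.max(DEFAULT_CAPACITY, minCapacity). Under this assumption, minCapacity = max(10, 1) = 10.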

Next, ensureExplicitCapacity(int minCapacity) checks whether the capacity is sufficient and whether the array has to grow:

	private void ensureExplicitCapacity(int minCapacity) {
		modCount++;

		// overflow-conscious code
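		/**
		 * Decide whether the array must grow. For example, if 12 elements
		 * need to fit (minCapacity = 12) while the backing array has the
		 * default capacity (elementData.length = 10), then 12 > 10 and the
		 * array must grow.
		 */
		if (minCapacity - elementData.length > 0)
			grow(minCapacity);// grow is the core of the growth mechanism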
	}
Here minCapacity = 10 while elementData.length = 0, so the array must grow,

and the grow(int minCapacity) method is called:

	/**
	 * Increases the capacity to ensure that it can hold at least the number of
	 * elements specified by the minimum capacity argument.
	 * 
	 * @param minCapacity
	 *            the desired minimum capacity
	 */
	private void grow(int minCapacity) {
		// overflow-conscious code
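		int oldCapacity = elementData.length;// capacity before growing
		/**
		 * Note: << is the left shift operator; num << 1 equals num * 2.
		 *       >> is the signed right shift operator; for non-negative num,
		 *       num >> 1 equals num / 2.
		 *       >>> is the unsigned right shift, which fills the vacated high
		 *       bits with 0 regardless of sign.
		 *       The new capacity is oldCapacity + oldCapacity / 2.
		 */
		int newCapacity = oldCapacity + (oldCapacity >> 1);
		if (newCapacity - minCapacity < 0) // if 1.5x is still less than required,
			newCapacity = minCapacity;// grow straight to the required capacity
		if (newCapacity - MAX_ARRAY_SIZE > 0)// if the new capacity exceeds the maximum array size,
			newCapacity = hugeCapacity(minCapacity);// recompute it from minCapacity
		// minCapacity is usually close to size, so this is a win:
		// allocate the larger array and copy the existing elements into it
		elementData = Arrays.copyOf(elementData, newCapacity);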
	}

	private static int hugeCapacity(int minCapacity) {
		if (minCapacity < 0) // overflow: the requested capacity exceeded the int range and wrapped to a negative value
			throw new OutOfMemoryError();
		return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;// otherwise cap at Integer.MAX_VALUE or MAX_ARRAY_SIZE
	}

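Under this assumption the capacity of the ArrayList grows to 10. Control then returns to add(E), which executes elementData[size++] = e; to store the new element e at elementData[size], after which the element count size increases by 1.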
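To see how the 1.5x rule plays out over many appends, the capacity sequence can be simulated without touching ArrayList internals. The sketch below is a simplified model of the JDK 7 logic quoted above, under two stated assumptions: it ignores the MAX_ARRAY_SIZE and overflow branches, and it treats a capacity of 0 as standing in for elementData == EMPTY_ELEMENTDATA. GrowthDemo and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation class; a simplified model, not the JDK code itself.
class GrowthDemo {
    static final int DEFAULT_CAPACITY = 10;

    // Mirrors the core of grow(): 1.5x the old capacity, or minCapacity if larger.
    // Assumption: the MAX_ARRAY_SIZE and overflow handling is left out.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        return Math.max(newCapacity, minCapacity);
    }

    // Simulates n single-element appends to a list created with the
    // no-argument constructor and records the capacity after each growth.
    static List<Integer> capacitiesWhileAdding(int n) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 0; // stands in for the shared empty array
        for (int size = 0; size < n; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity > capacity) {
                capacity = nextCapacity(capacity, minCapacity);
                capacities.add(capacity);
            }
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesWhileAdding(50)); // [10, 15, 22, 33, 49, 73]
    }
}
```

Fifty appends trigger only six reallocations in this model, which is the intuition behind the amortized constant cost of add(E).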

The other three insertion methods follow the same sequence: check whether the ArrayList needs to grow, determine the new capacity, perform the growth, and only then insert.

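Methods for accessing, modifying, and removing elements:

1.Access: get(int index) returns the element at the specified position; O(1)

2.Modify: set(int index, E element) replaces the element at the specified position; O(1)

3.Remove: remove(int index) removes the element at the specified position; O(n), because the elements after it are shifted left

4.Remove: remove(Object o) first finds the position of the element to remove and then removes it; O(n)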

	/**
	 * Returns the element at the specified position in this list.
	 *
	 * @param index
	 *            index of the element to return
	 * @return the element at the specified position in this list
	 * @throws IndexOutOfBoundsException
	 *             {@inheritDoc}
	 */
	public E get(int index) {
		rangeCheck(index);

		return elementData(index);
	}

	/**
	 * Replaces the element at the specified position in this list with the
	 * specified element.
	 *
	 * @param index
	 *            index of the element to replace
	 * @param element
	 *            element to be stored at the specified position
	 * @return the element previously at the specified position
	 * @throws IndexOutOfBoundsException
	 *             {@inheritDoc}
	 */
	public E set(int index, E element) {
		rangeCheck(index);

		E oldValue = elementData(index);
		elementData[index] = element;
		return oldValue;
	}
	/**
	 * Removes the element at the specified position in this list. Shifts any
	 * subsequent elements to the left (subtracts one from their indices).
	 *
	 * @param index
	 *            the index of the element to be removed
	 * @return the element that was removed from the list
	 * @throws IndexOutOfBoundsException
	 *             {@inheritDoc}
	 */
	public E remove(int index) {
		rangeCheck(index);

		modCount++;
		E oldValue = elementData(index);

		int numMoved = size - index - 1;
		if (numMoved > 0)
			System.arraycopy(elementData, index + 1, elementData, index, numMoved);
		elementData[--size] = null; // clear to let GC do its work

		return oldValue;
	}

	/**
	 * Removes the first occurrence of the specified element from this list, if
	 * it is present. If the list does not contain the element, it is unchanged.
	 * More formally, removes the element with the lowest index i such
	 * that
	 * (o==null ? get(i)==null : o.equals(get(i)))
	 * (if such an element exists). Returns true if this list contained
	 * the specified element (or equivalently, if this list changed as a result
	 * of the call).
	 *
	 * @param o
	 *            element to be removed from this list, if present
	 * @return true if this list contained the specified element
	 */
	public boolean remove(Object o) {
		if (o == null) {// the element to remove is null
			for (int index = 0; index < size; index++)
				if (elementData[index] == null) {
					fastRemove(index);
					return true;
				}
		} else {// the element to remove is not null
			for (int index = 0; index < size; index++)
				if (o.equals(elementData[index])) {
					fastRemove(index);
					return true;
				}
		}
		return false;
	}
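As the code shows, ArrayList's access, modification, and removal methods are plain array operations addressed by index.

The introduction noted that get(index) and set(index, element) run in constant time.

Accessing or modifying the element at any index takes, in principle, the same amount of time for an ArrayList, so both are constant-time operations.

add(int index, E element), by contrast, is O(n). An insertion may happen at any position, and unless it is at the end of the ArrayList, the element at the insertion point and every element after it must shift right, so the cost grows linearly with the number of elements moved.

remove(int index) behaves like add(int index, E element): after the element at the specified position is removed, the elements after it must shift left to fill the gap, so it is also O(n).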
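Because remove is overloaded on int and Object, a list of Integer values is a common place to mix the two up. The sketch below shows the difference; RemoveDemo and its helper methods are hypothetical names used only for this illustration, and each helper works on a copy so the input list is left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class RemoveDemo {

    // Calls remove(int index): removes whatever is stored at that position.
    static List<Integer> removeAtIndex(List<Integer> list, int index) {
        List<Integer> copy = new ArrayList<>(list);
        copy.remove(index);
        return copy;
    }

    // Calls remove(Object o): removes the first element equal to the value.
    // Boxing explicitly with Integer.valueOf selects the Object overload.
    static List<Integer> removeValue(List<Integer> list, int value) {
        List<Integer> copy = new ArrayList<>(list);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(10, 20, 30, 1);
        System.out.println(removeAtIndex(numbers, 1)); // [10, 30, 1]
        System.out.println(removeValue(numbers, 1));   // [10, 20, 30]
    }
}
```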

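Summary:

1.ArrayList is backed by an array and is unsynchronized, so it is not thread-safe;

2.The size of an ArrayList adjusts dynamically;

3.ArrayList offers fast access by index, while inserting and removing elements anywhere but the end is slow.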
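For the thread-safety point in the summary, the Javadoc's own suggestion is to wrap the list with Collections.synchronizedList. The following is a rough sketch of that approach; SynchronizedListDemo and addConcurrently are hypothetical names, and the thread and element counts are arbitrary.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of the JDK.
class SynchronizedListDemo {

    // Starts the given number of threads, each appending perThread elements
    // to one shared synchronized list, and returns the final size.
    static int addConcurrently(int threads, int perThread) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each add is guarded by the wrapper's lock
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(addConcurrently(4, 1000)); // 4000
    }
}
```

With a plain ArrayList in place of the wrapper, the same program could lose elements or throw ArrayIndexOutOfBoundsException. Note that the wrapper synchronizes individual method calls only; iterating over it still requires a synchronized block on the list.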
