你不知道的LinkedList(一)：基于jdk1.8的LinkdeList源码分析

[toc]
在对ArrayList源码有过了解之后，现在对LinkedList源码进行相应的分析。

1.结构及成员变量

1.1基本结构

linkedList本质是实现了一个双向链表。其类继承关系如下图：

image.png

可以看到LinkedList继承了AbstractSequentialList,实现了List, Deque, Cloneable, java.io.Serializable。比较特别的是实现了Deque双端队列，这是队列基于链表的一种实现。

/**
 * Doubly-linked list implementation of the {@code List} and {@code Deque}
 * interfaces.  Implements all optional list operations, and permits all
 * elements (including {@code null}).
 *
 * All of the operations perform as could be expected for a doubly-linked
 * list.  Operations that index into the list will traverse the list from
 * the beginning or the end, whichever is closer to the specified index.
 *
 * 
Note that this implementation is not synchronized.
 * If multiple threads access a linked list concurrently, and at least
 * one of the threads modifies the list structurally, it must be
 * synchronized externally.  (A structural modification is any operation
 * that adds or deletes one or more elements; merely setting the value of
 * an element is not a structural modification.)  This is typically
 * accomplished by synchronizing on some object that naturally
 * encapsulates the list.
 *
 * If no such object exists, the list should be "wrapped" using the
 * {@link Collections#synchronizedList Collections.synchronizedList}
 * method.  This is best done at creation time, to prevent accidental
 * unsynchronized access to the list:
 *   List list = Collections.synchronizedList(new LinkedList(...));
 *
 * The iterators returned by this class's {@code iterator} and
 * {@code listIterator} methods are fail-fast: if the list is
 * structurally modified at any time after the iterator is created, in
 * any way except through the Iterator's own {@code remove} or
 * {@code add} methods, the iterator will throw a {@link
 * ConcurrentModificationException}.  Thus, in the face of concurrent
 * modification, the iterator fails quickly and cleanly, rather than
 * risking arbitrary, non-deterministic behavior at an undetermined
 * time in the future.
 *
 * 
Note that the fail-fast behavior of an iterator cannot be guaranteed
 * as it is, generally speaking, impossible to make any hard guarantees in the
 * presence of unsynchronized concurrent modification.  Fail-fast iterators
 * throw {@code ConcurrentModificationException} on a best-effort basis.
 * Therefore, it would be wrong to write a program that depended on this
 * exception for its correctness:   the fail-fast behavior of iterators
 * should be used only to detect bugs.
 *
 * This class is a member of the
 * 
 * Java Collections Framework.
 *
 * @author  Josh Bloch
 * @see     List
 * @see     ArrayList
 * @since 1.2
 * @param  the type of elements held in this collection
 */

public class LinkedList
    extends AbstractSequentialList
    implements List, Deque, Cloneable, java.io.Serializable
{

}

其注释大意为，双向链表实现了List和Deque接口，并实现了所有可选的方法，可以允许元素为null。
对于一个双向链表，所有操作都按照预期那样，索引到列表的操作将从列表开始或者结束位置（以距离索引更近的位置为准）来遍历该列表。
需要注意的是这个链表没有采用synchronized实现。如果多线程并发的访问一个链表，并且至少有一个线程修改了链表的结构，那么它必须采用同步的方式。（结构修改是指添加或者删除一个或者多个元素的任何操作。仅仅设置元素的值并不是结构修改）。这通常是通过对自然封装的列表对象进行同步来实现。
如果没有这样的对象，那么最好是用Collections.synchronizedList方法。

 List list = Collections.synchronizedList(new LinkedList(...));

迭代器是fail-fast方法实现的，如果其结构被修改，那么在使用迭代的过程中将会出现ConcurrentModificationException异常。因此，在并发情况下修改，迭代器是回快速失败。
需要注意的是，迭代器的fail-fast行为不能得到任何保证，因为一般来说，在非同步的并发修改时不可能得到任何担保。

1.2 成员变量

transient int size = 0;

/**
 * Pointer to first node.
 * Invariant: (first == null && last == null) ||
 *            (first.prev == null && first.item != null)
 */
transient Node first;

/**
 * Pointer to last node.
 * Invariant: (first == null && last == null) ||
 *            (last.next == null && last.item != null)
 */
transient Node last;

linkedList的成员变量主要有三个，分别是表示链表长度的size，以及链表头和尾的指针first和last。注意这三个变量都是采用transient修饰，序列化的时候这些属性回呗忽略。

1.3 Node

linkedList的元素是由一个内部类构成：

private static class Node {
    E item;
    Node next;
    Node prev;

    Node(Node prev, E element, Node next) {
        this.item = element;
        this.next = next;
        this.prev = prev;
    }
}

这个类支持泛型，然后有两个指针，一个next指向后一个元素，一个指针prev指向前一个元素。

2.LinkedList的数据结构及其其基本操作

2.1基本数据结构

那么我们在对代码进行了解之后，可以发现，LinkedList实际上就是一个以Node节点为基础的双向链表，在这个链表中还有两个指针first和last。假定存在一个元素为1、2、3的linkedList，其构成如下图：

image.png

我们可以看到其内部的first和last指针分别指向这个双向链表的首尾位置。然后每个node之间的连接关系也如上图所示。

2.2 构造方法

LinkedList提供两个构造方法，分别是：

    /**
     * Constructs an empty list.
     */
    public LinkedList() {
    }

这是一个空的构造方法，此时Linked的各个指针均为空，只是创建了一个LinkedList的对象。
当然，也可以从一个集合中产生一个linkedList。

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * @param  c the collection whose elements are to be placed into this list
     * @throws NullPointerException if the specified collection is null
     */
    public LinkedList(Collection c) {
        this();
        addAll(c);
    }

实际上这个方法还是调用之前的空构造方法，再采用addAll方法添加元素。

    public boolean addAll(Collection c) {
        return addAll(size, c);
    }
    
    
    public boolean addAll(int index, Collection c) {
        checkPositionIndex(index);

        Object[] a = c.toArray();
        int numNew = a.length;
        if (numNew == 0)
            return false;

        Node pred, succ;
        if (index == size) {
            succ = null;
            pred = last;
        } else {
            succ = node(index);
            pred = succ.prev;
        }

        for (Object o : a) {
            @SuppressWarnings("unchecked") E e = (E) o;
            Node newNode = new Node<>(pred, e, null);
            if (pred == null)
                first = newNode;
            else
                pred.next = newNode;
            pred = newNode;
        }

        if (succ == null) {
            last = pred;
        } else {
            pred.next = succ;
            succ.prev = pred;
        }

        size += numNew;
        modCount++;
        return true;
    }

需要注意的是此时采用了一个pred指针，用来指向上一次添加的节点，这样再加入新节点的时候就可以直接将pred设置为新增加节点的prev指针。
这个算法可以作为平时刷leetcode的时候的重要参考。

2.3 add(E e)

/**
 * Appends the specified element to the end of this list.
 *
 * This method is equivalent to {@link #addLast}.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    linkLast(e);
    return true;
}

实际上这个用得最多的add方法，实际上是调用的尾插法linkLast:

/**
 * Links e as last element.
 */
void linkLast(E e) {
    final Node l = last;
    final Node newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}

引入了一个指针l，这个指针先找到last。通过Node的构造方法，将prev指向l，之后变更last为new出来的新的node节点。
此时通过l==null判断此时list中是否为空，如果l==null则说明没有任何元素，那么first的指针也会指向newNode。反之则将l的next指向newNode。
需要注意的是，这种修改会到modCount增加。这是实现fail-fast机制的基础。
modCount 在其继承的抽象类中。
此过程可以通过如下图表示：
假定我们在linkedList中插入 1 和 2。我们来看看这个过程。
首先我们在LinkedList中添加1，由于最开始这个List为空，因此添加1的时候会导致fist和last都指向这个节点，Node上的next和prev值为空。

image.png

当我们再次添加节点2的时候，再来看这个过程。
在这个时候，l的值等于last，指向了节点1。然后new一个新的节点2，这一步的时候，node2的prev指针就指向了l。再将last指向这个新的node。此时如下所示：

image.png

再此之后再判断，l此时不为null，那么将l的next指针指向这个新的node。

image.png

之后再把size加1,modCount加1，方法执行完成并回收局部变量。
最终如下：

image.png

上面即使LinkedList再尾部添加一个元素的过程。

2.4 add(int index, E element)

在指定索引的位置插入元素。其代码如下：

/**
 * Inserts the specified element at the specified position in this list.
 * Shifts the element currently at that position (if any) and any
 * subsequent elements to the right (adds one to their indices).
 *
 * @param index index at which the specified element is to be inserted
 * @param element element to be inserted
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public void add(int index, E element) {
    checkPositionIndex(index);

    if (index == size)
        linkLast(element);
    else
        linkBefore(element, node(index));
}

实质上是对index判断是否为size大小，如果与size一致，因为index实际上从0开始，而size从1开始计数，那么就说明应该是直接在尾部插入，直接就调用尾插法即可。反之则调用linkBefore方法。

/**
 * Inserts element e before non-null Node succ.
 */
void linkBefore(E e, Node succ) {
    // assert succ != null;
    final Node pred = succ.prev;
    final Node newNode = new Node<>(pred, e, succ);
    succ.prev = newNode;
    if (pred == null)
        first = newNode;
    else
        pred.next = newNode;
    size++;
    modCount++;
}

还需要注意的是此时还有个 node(index)方法。

/**
 * Returns the (non-null) Node at the specified element index.
 */
Node node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

这个方法也是一个在LinkedList中非常常用的方法。实际上是个根据index查找Node的方法。此时如果index小于size的一半，则从链表头开始查找。如果大于size的一般，则从链表尾部开始查找。
而且这个判断的位置采用的是位移计算>>1。这个是我们自己在写代码的时候需要学习的，位移计算的效率是最高的。

在linkBefore方法中，其执行过程如下图所示。
假定我们有一个linkedList数组，其中有1、2、3共3个元素。现在需要将4插入到index为1的位置。
首先，调用node(1)方法，定义一个指针succ指向这个元素。

image.png

定义一个指针pred指向succ的prev节点。之后创建一个新节点，其prev为pred指向的节点，next为succ指向的节点。

image.png

此时将succ的prev指针指向newNode。

image.png

再进行判断，此时pred不为null，所以将pred的next指针指向newNode。

image.png

这样添加操作就执行完成，对size和modCount进行加1操作。之后再回收变量。操作完成之后的链表如下：

image.png

这样一个linkedList的指定位置插入操作就执行完成。

2.5 get(int index)

我们再看看linked的get方法：

/**
 * Returns the element at the specified position in this list.
 *
 * @param index index of the element to return
 * @return the element at the specified position in this list
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}

private void checkElementIndex(int index) {
    if (!isElementIndex(index))
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}


/**
 * Tells if the argument is the index of an existing element.
 */
private boolean isElementIndex(int index) {
    return index >= 0 && index < size;
}

可以看到在执行过程中首先对index进行判断，是不是一个合法的index。index的范围应该在0-size之间。否则返回IndexOutOfBoundsException异常。
之后再调用上文提到的node方法。

/**
 * Returns the (non-null) Node at the specified element index.
 */
Node node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

判断index与size的一半进行对比。如果小则从前面查找，如果大则从尾部查找。
这个方法的时间复杂度是n。比较低效。

2.6 remove()

remove操作代码如下：

/**
 * Removes the element at the specified position in this list.  Shifts any
 * subsequent elements to the left (subtracts one from their indices).
 * Returns the element that was removed from the list.
 *
 * @param index the index of the element to be removed
 * @return the element previously at the specified position
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E remove(int index) {
    checkElementIndex(index);
    return unlink(node(index));
}

同样需要执行checkElementIndex。之后再执行unlink方法。

/**
 * Unlinks non-null node x.
 */
E unlink(Node x) {
    // assert x != null;
    final E element = x.item;
    final Node next = x.next;
    final Node prev = x.prev;

    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }

    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }

    x.item = null;
    size--;
    modCount++;
    return element;
}

需要注意的是，unlink方法，中间有个node(index)方法进行查找。这样remove也是一个耗时的过程。
unlink方法本质就是对元素的前后节点指针的修改，之后将移除的元素内部的指针也修改为null以便GC回收，最后size--,modCount再加1。

2.7 其他方法

LinkedList还有很多其他的方法。另外还实现了Deque接口，需要实现一些对队列的操作。

addFirst(E e) 在队列头部添加元素。
addLast(E e) 在队列尾部添加元素。
boolean offerFirst(E e) 在队列头部插入元素，并返回true。
boolean offerLast(E e) 在队列尾部插入元素并返回true。
removeFirst() 移除队列头部元素。
removeLast(); 移除队列尾部元素。
E pollFirst() 取出队列头部元素。并在队列中删除。
E pollLast() 取出队列尾部元素。并在队列中删除。
E getFirst() 得到队列头部元素。不改变队列。如果队列为空抛出异常。
E getLast()得到队列尾部元素，不改变队列。如果队列为空则抛出异常。
E peekFirst()得到队列头部元素，不改变队列，如果为空则返回null。
E peekLast() 得到队列尾部元素，不改变队列，如果为空则返回null。
size() 得到队列的长度。
boolean contains(Object o) 判断是否包含某个元素，这个效率比较低下。
push(E e) 等价于addFirst。
peek() 得到队列头部元素。
poll() 得到队列头部元素并移除。

等等，还有很多方法，由于并不常用就不一一介绍了。

3 总结

本文对LinkedList的源码进行了分析，个人认为其链表的操作方法，是我们在leetcode上解决链表问题的时候值得参考的地方。
另外，LinkedList的性能比较低下，我们对LinkedList的理解实际上是存在很多误区的。尤其是查找的过程性能非常低。但是我们对LinkedList的remove等操作又离不开这个寻址过程。个人认为LinkedList除了可以作为队列之外，本身并没有太多的使用价值。
在下一篇文章中，将对LinkedList的性能进行分析。也许会改变一直以来大家的认知。

你不知道的LinkedList(一)：基于jdk1.8的LinkdeList源码分析

1.结构及成员变量

1.1基本结构

1.2 成员变量

1.3 Node

2.LinkedList的数据结构及其其基本操作

2.1基本数据结构

2.2 构造方法

2.3 add(E e)

2.4 add(int index, E element)

2.5 get(int index)

2.6 remove()

2.7 其他方法

3 总结

你可能感兴趣的:(你不知道的LinkedList(一)：基于jdk1.8的LinkdeList源码分析)