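(All analysis below is based on the JDK 1.8 source.)
When it comes to locks in Java, the most commonly used methods include ReentrantLock's lock and unlock. So how are these two methods implemented? Today we will work through them and, along the way, get to know a key class in the java.util.concurrent package: AbstractQueuedSynchronizer.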
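As a reminder of how these two methods are normally used, here is a minimal sketch. The class name LockUsageSketch and its counter are made up for illustration; the only point is the lock / try / finally / unlock shape.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: a counter guarded by a ReentrantLock.
final class LockUsageSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int counter = 0;

    void increment() {
        lock.lock();           // blocks until the lock is held
        try {
            counter++;         // critical section
        } finally {
            lock.unlock();     // always release, even if the body throws
        }
    }

    int get() {
        lock.lock();
        try {
            return counter;
        } finally {
            lock.unlock();
        }
    }

    // Four threads each increment 1000 times; the lock keeps the count exact.
    static int demo() {
        LockUsageSketch s = new LockUsageSketch();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    s.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return s.get();
    }
}
```

Calling LockUsageSketch.demo() should return 4000, since every increment runs while the lock is held.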
Here are the lock and unlock methods in ReentrantLock:
public void lock() {
    sync.lock();
}
public void unlock() {
    sync.release(1);
}
Done, time to pack up!
(Just kidding.)
So ReentrantLock is only a puppet; the one actually in charge is sync. Let's look at the source of Sync:
abstract static class Sync extends AbstractQueuedSynchronizer {
    /**
     * Performs {@link Lock#lock}. The main reason for subclassing
     * is to allow fast path for nonfair version.
     */
    abstract void lock();
    /**
     * Performs non-fair tryLock. tryAcquire is implemented in
     * subclasses, but both need nonfair try for trylock method.
     */
    final boolean nonfairTryAcquire(int acquires) {
        final Thread current = Thread.currentThread();
        int c = getState();
        if (c == 0) {
            if (compareAndSetState(0, acquires)) {
                setExclusiveOwnerThread(current);
                return true;
            }
        }
        else if (current == getExclusiveOwnerThread()) {
            int nextc = c + acquires;
            if (nextc < 0) // overflow
                throw new Error("Maximum lock count exceeded");
            setState(nextc);
            return true;
        }
        return false;
    }
    protected final boolean tryRelease(int releases) {
        int c = getState() - releases;
        if (Thread.currentThread() != getExclusiveOwnerThread())
            throw new IllegalMonitorStateException();
        boolean free = false;
        if (c == 0) {
            free = true;
            setExclusiveOwnerThread(null);
        }
        setState(c);
        return free;
    }
    protected final boolean isHeldExclusively() {
        // While we must in general read state before owner,
        // we don't need to do so to check if current thread is owner
        return getExclusiveOwnerThread() == Thread.currentThread();
    }
    final boolean isLocked() {
        return getState() != 0;
    }
}
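The hold-count bookkeeping in nonfairTryAcquire and tryRelease (state goes up on each re-entrant acquire and the lock is only freed when it drops back to 0) can be observed through ReentrantLock's public API. The following is a small sketch; the class name HoldCountSketch is invented for this example.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: watches the hold count change on lock/unlock.
final class HoldCountSketch {
    // Records getHoldCount() after lock, lock, unlock, unlock.
    static int[] demo() {
        ReentrantLock lock = new ReentrantLock();
        int[] counts = new int[4];
        lock.lock();
        counts[0] = lock.getHoldCount();   // first acquire: state 0 -> 1
        lock.lock();
        counts[1] = lock.getHoldCount();   // re-entrant acquire: state 1 -> 2
        lock.unlock();
        counts[2] = lock.getHoldCount();   // still held: state 2 -> 1
        lock.unlock();
        counts[3] = lock.getHoldCount();   // fully released: state 1 -> 0
        return counts;
    }

    // tryRelease throws if the caller is not the owner thread.
    static boolean unlockWithoutOwnershipThrows() {
        ReentrantLock lock = new ReentrantLock();
        try {
            lock.unlock();
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }
}
```

The recorded counts are 1, 2, 1, 0, and unlocking a lock the current thread does not own raises IllegalMonitorStateException, matching the owner check in tryRelease.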
Sync has two subclasses, which implement the fair and the non-fair lock respectively.
/**
 * Sync object for non-fair locks
 */
static final class NonfairSync extends Sync {
    private static final long serialVersionUID = 7316153563782823691L;
    /**
     * Performs lock. Try immediate barge, backing up to normal
     * acquire on failure.
     */
    final void lock() {
        if (compareAndSetState(0, 1))
            setExclusiveOwnerThread(Thread.currentThread());
        else
            acquire(1);
    }
    protected final boolean tryAcquire(int acquires) {
        return nonfairTryAcquire(acquires);
    }
}
/**
 * Sync object for fair locks
 */
static final class FairSync extends Sync {
    private static final long serialVersionUID = -3000897897090466540L;
    final void lock() {
        acquire(1);
    }
    /**
     * Fair version of tryAcquire. Don't grant access unless
     * recursive call or no waiters or is first.
     */
    protected final boolean tryAcquire(int acquires) {
        final Thread current = Thread.currentThread();
        int c = getState();
        if (c == 0) {
            if (!hasQueuedPredecessors() &&
                compareAndSetState(0, acquires)) {
                setExclusiveOwnerThread(current);
                return true;
            }
        }
        else if (current == getExclusiveOwnerThread()) {
            int nextc = c + acquires;
            if (nextc < 0)
                throw new Error("Maximum lock count exceeded");
            setState(nextc);
            return true;
        }
        return false;
    }
}
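From the caller's side, which of the two Sync subclasses is used is decided by the ReentrantLock constructor. A tiny sketch (the class name FairnessChoiceSketch is made up):

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: shows how fairness is selected at construction time.
final class FairnessChoiceSketch {
    static boolean[] demo() {
        ReentrantLock nonfair = new ReentrantLock();    // default: non-fair
        ReentrantLock fair = new ReentrantLock(true);   // explicitly fair
        return new boolean[] { nonfair.isFair(), fair.isFair() };
    }
}
```

The no-argument constructor gives a non-fair lock; passing true requests the fair variant.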
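Sync implements some of the methods, but the overall framework is laid down behind the scenes by AbstractQueuedSynchronizer (AQS). The inheritance hierarchy is shown in the figure below:
head: volatile Node. The head of the wait queue, lazily initialized; its thread and prev are both null. The first node eligible to compete for the lock is head's successor.
tail: volatile Node. The tail of the wait queue, lazily initialized.
state: volatile int. The synchronization state.
spinForTimeoutThreshold: static final long. Below this threshold, spinning is considered more efficient than a timed park. The value is 1000 nanoseconds.
exclusiveOwnerThread: transient Thread (declared in AbstractOwnableSynchronizer, the parent class of AQS). The thread that currently holds the lock exclusively. It is transient because serializing it would be meaningless: a thread from the current host is of no use after deserialization.
waitStatus values: CANCELLED (1), SIGNAL (-1), CONDITION (-2), PROPAGATE (-3), and 0 for a node in none of these states.
Usually it is enough to check the sign of waitStatus: a negative value means the node is still in a valid waiting or signalling state, while a value greater than 0 means it has been cancelled. There is no need to write many "waitStatus == *" comparisons.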
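So how can a single AQS be enough to implement a lock? The figure below shows one possible call sequence when two threads call lock on a ReentrantLock:
Next, we will walk through the source code in chronological order.
compareAndSetState delegates to Unsafe, whose implementation is native (C++) and ultimately relies on an atomic compare-and-swap instruction provided by the hardware rather than on an operating-system call.
Here is the source of compareAndSetState:
protected final boolean compareAndSetState(int expect, int update) {
    // See below for intrinsics setup to support this
    return unsafe.compareAndSwapInt(this, stateOffset, expect, update);
}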
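Since no thread has modified the AQS state yet, Thread1's CAS succeeds and the state becomes 1.
At this point Thread2 also wants the lock, so it calls compareAndSetState(0, 1) as well. But because the AQS state is already 1, Thread2's CAS fails.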
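The success-then-failure pattern is easy to reproduce outside AQS. Because compareAndSetState is protected, the sketch below uses an AtomicInteger as a stand-in for the state field; that substitution, and the class name CasSketch, are assumptions made purely for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: AtomicInteger stands in for the AQS state field.
final class CasSketch {
    static boolean[] demo() {
        AtomicInteger state = new AtomicInteger(0);
        // "Thread1": state is 0, so the swap to 1 succeeds.
        boolean first = state.compareAndSet(0, 1);
        // "Thread2": state is already 1, so expecting 0 fails and nothing changes.
        boolean second = state.compareAndSet(0, 1);
        return new boolean[] { first, second };
    }
}
```

The first call returns true and the second returns false, which is exactly why Thread2 has to fall through to acquire(1).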
Having shut out every other thread from updating the AQS state via CAS, Thread1 can now call setExclusiveOwnerThread to declare "I'm the one in charge now". Here is setExclusiveOwnerThread, which is very simple:
protected final void setExclusiveOwnerThread(Thread thread) {
    exclusiveOwnerThread = thread;
}
Let's revisit the lock method:
final void lock() {
    if (compareAndSetState(0, 1))
        setExclusiveOwnerThread(Thread.currentThread());
    else
        acquire(1);
}
After its compareAndSetState fails, Thread2 can only become a waiting candidate through acquire. This logic is the heart of the exclusive lock and is worth going through slowly.
Here is the source of acquire:
public final void acquire(int arg) {
    if (!tryAcquire(arg) &&
        acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
        selfInterrupt();
}
static void selfInterrupt() {
    Thread.currentThread().interrupt();
}
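tryAcquire is a hook method of AQS that reports whether the resource was acquired. It is not declared abstract (the default implementation throws UnsupportedOperationException), so concrete subclasses are expected to override it, as the NonfairSync class in ReentrantLock shown at the beginning does. acquireQueued(addWaiter(Node.EXCLUSIVE), arg) implements the logic of blocking and waiting when the resource could not be acquired.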
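To make the division of labor concrete, here is a minimal non-reentrant mutex built on AQS, loosely following the pattern shown in the AQS class documentation. The class name SimpleMutex and the demo helper are hypothetical; treat it as a sketch, not as how ReentrantLock itself is written.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Hypothetical example: the subclass only defines what "acquired" means;
// queuing, parking and waking are inherited from AQS.
final class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int arg) {
            if (compareAndSetState(0, 1)) {          // 0 = free, 1 = held
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int arg) {
            if (getExclusiveOwnerThread() != Thread.currentThread())
                throw new IllegalMonitorStateException();
            setExclusiveOwnerThread(null);
            setState(0);
            return true;                              // lock is now free
        }
    }

    private final Sync sync = new Sync();

    void lock() { sync.acquire(1); }
    boolean tryLock() { return sync.tryAcquire(1); }
    void unlock() { sync.release(1); }

    // Runs tryLock on a fresh thread and reports whether it succeeded.
    private static boolean tryFromOtherThread(SimpleMutex m) {
        boolean[] got = new boolean[1];
        Thread t = new Thread(() -> {
            got[0] = m.tryLock();
            if (got[0]) m.unlock();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return got[0];
    }

    static boolean[] demo() {
        SimpleMutex m = new SimpleMutex();
        m.lock();
        boolean whileHeld = tryFromOtherThread(m);      // lock is taken
        m.unlock();
        boolean afterRelease = tryFromOtherThread(m);   // lock is free again
        return new boolean[] { whileHeld, afterRelease };
    }
}
```

While the main thread holds the mutex, another thread's tryLock fails; once it is released, the same attempt succeeds. Everything else (the queue, park and unpark) comes from AQS unchanged.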
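While waiting, we run into the wait queue inside AQS. In this queue head is an empty (dummy) node that carries no thread, and tail is a reference to the last node. The first node after head has the highest priority to acquire the resource.
For the sake of discussion, assume the queue already contains one waiting node (a node's status defaults to 0; for an analysis of node state transitions, see the other article mentioned at the end), as shown in the figure below:
Thread2 now calls tryAcquire once, but since the lock is still held by Thread1, tryAcquire fails. Thread2 has to block and wait to be woken up.
Thread2 first calls addWaiter, which creates a node for the current thread in the given mode and appends it to the wait queue. The argument is either Node.EXCLUSIVE or Node.SHARED. If tail is not null, it tries once to update tail with CAS; if that CAS fails or tail is null, it falls back to enq (not covered in this article), which loops until the node has been installed as the new tail.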
private Node addWaiter(Node mode) {
    Node node = new Node(Thread.currentThread(), mode);
    // Try the fast path of enq; backup to full enq on failure
    Node pred = tail;
    if (pred != null) {
        node.prev = pred;
        if (compareAndSetTail(pred, node)) {
            pred.next = node;
            return node;
        }
    }
    enq(node);
    return node;
}
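The enqueue pattern (lazily create a dummy head, then CAS the tail in a retry loop) can be sketched independently of AQS. The version below is a simplified stand-in, not the JDK code: TinyWaitQueue, its Node type, and the use of AtomicReference and String owners are all assumptions for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of the AQS wait queue's enqueue step.
final class TinyWaitQueue {
    static final class Node {
        final String owner;        // stands in for Node.thread
        volatile Node prev;
        volatile Node next;
        Node(String owner) { this.owner = owner; }
    }

    final AtomicReference<Node> head = new AtomicReference<>();
    final AtomicReference<Node> tail = new AtomicReference<>();

    Node enqueue(String owner) {
        Node node = new Node(owner);
        for (;;) {
            Node t = tail.get();
            if (t == null) {
                // Queue not initialized yet: install a dummy head, then retry.
                Node dummy = new Node(null);
                if (head.compareAndSet(null, dummy)) {
                    tail.set(dummy);
                }
            } else {
                node.prev = t;                      // link backwards first
                if (tail.compareAndSet(t, node)) {  // publish as the new tail
                    t.next = node;                  // forward link comes last
                    return node;
                }
                // CAS lost a race with another enqueuer: loop and retry.
            }
        }
    }

    static String demo() {
        TinyWaitQueue q = new TinyWaitQueue();
        q.enqueue("existing");
        q.enqueue("Thread2");
        StringBuilder sb = new StringBuilder("head");
        for (Node n = q.head.get().next; n != null; n = n.next) {
            sb.append(" -> ").append(n.owner);
        }
        return sb.toString();
    }
}
```

Note the order of the writes: prev is set before the CAS and next only after it. That is why a reader walking forward may briefly see a null next, while a backward walk from tail is always complete.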
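After addWaiter returns, the wait queue looks like this:
Once the node has been created, Thread2 blocks in acquireQueued until another thread wakes it. If it is interrupted while waiting, acquireQueued keeps waiting and returns true once the lock has been acquired, so that acquire can restore the interrupt status by calling selfInterrupt. Here is the source of acquireQueued: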
/**
 * Acquires in exclusive uninterruptible mode for thread already in
 * queue. Used by condition wait methods as well as acquire.
 *
 * @param node the node
 * @param arg the acquire argument
 * @return {@code true} if interrupted while waiting
 */
final boolean acquireQueued(final Node node, int arg) {
    boolean failed = true;
    try {
        boolean interrupted = false;
        for (;;) {
            final Node p = node.predecessor();
            if (p == head && tryAcquire(arg)) {
                setHead(node);
                p.next = null; // help GC
                failed = false;
                return interrupted;
            }
            if (shouldParkAfterFailedAcquire(p, node) &&
                parkAndCheckInterrupt())
                interrupted = true;
        }
    } finally {
        if (failed)
            cancelAcquire(node);
    }
}
private void setHead(Node node) {
    head = node;
    node.thread = null;
    node.prev = null;
}
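The "remember the interrupt, keep waiting, re-assert it afterwards" behavior can be observed from outside. In the sketch below (the class name InterruptDuringLockSketch is made up), a waiter is interrupted while queued on lock(); it still only gets the lock after the holder releases it, and then finds its interrupt status set.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: lock() is uninterruptible, but the interrupt is not lost.
final class InterruptDuringLockSketch {
    static boolean demo() {
        ReentrantLock lock = new ReentrantLock();
        boolean[] sawInterrupt = new boolean[1];
        Thread waiter = new Thread(() -> {
            lock.lock();   // queues behind the main thread
            try {
                // selfInterrupt() restored the status after the acquire.
                sawInterrupt[0] = Thread.currentThread().isInterrupted();
            } finally {
                lock.unlock();
            }
        });

        lock.lock();
        try {
            waiter.start();
            // Wait until the waiter is actually in the AQS queue.
            while (!lock.hasQueuedThread(waiter)) {
                Thread.yield();
            }
            waiter.interrupt();   // does not make lock() return early
        } finally {
            lock.unlock();        // only now can the waiter acquire
        }

        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sawInterrupt[0];
    }
}
```

Code that needs to abandon the wait on interrupt would use lockInterruptibly() instead, which is a different code path from the one discussed here.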
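Overall, acquireQueued repeatedly fetches the predecessor of the current node and takes one of two branches:
Branch 1: if the predecessor is head and tryAcquire succeeds, the current node becomes the new head and the method returns.
Branch 2: otherwise, shouldParkAfterFailedAcquire decides whether the thread may block, and if so parkAndCheckInterrupt parks it.
The code for branch 1 has already been covered, so let's look at branch 2:
shouldParkAfterFailedAcquire checks and updates the node status and returns whether the current thread may block. Here is the source: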
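private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
    int ws = pred.waitStatus;
    if (ws == Node.SIGNAL)
        /*
         * The predecessor is already marked SIGNAL, i.e. it has been
         * asked to wake this node when it releases, so after a failed
         * acquire the thread can safely park: return true.
         */
        return true;
    if (ws > 0) {
        /*
         * The predecessor is CANCELLED: walk backwards to find a
         * non-cancelled node and use it as prev.
         */
        do {
            node.prev = pred = pred.prev;
        } while (pred.waitStatus > 0);
        pred.next = node;
    } else {
        /*
         * Status is 0 or PROPAGATE: set the predecessor's status to
         * SIGNAL. The caller will retry the acquire before parking.
         */
        compareAndSetWaitStatus(pred, ws, Node.SIGNAL);
    }
    return false;
}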
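Because the caller is an infinite loop that keeps calling tryAcquire, if this method returns false the first time, pred's status will then become SIGNAL, and the method will eventually return true (unless the lock is acquired first).
In our example Thread2 is not interrupted; it joins the queue and waits. The wait queue now looks like this:
Once shouldParkAfterFailedAcquire returns true, parkAndCheckInterrupt is called to suspend the current thread: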
private final boolean parkAndCheckInterrupt() {
    LockSupport.park(this);
    return Thread.interrupted();
}
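Two details of parkAndCheckInterrupt are worth seeing in isolation: park returns when the thread is interrupted, and Thread.interrupted() both reports and clears the interrupt status. The sketch below is only an illustration (the class name ParkInterruptSketch is hypothetical) and runs on a helper thread so that the caller's own interrupt status is left alone.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo class: park() wakes on interrupt; interrupted() clears the flag.
final class ParkInterruptSketch {
    static boolean[] demo() {
        boolean[] observed = new boolean[2];
        Thread t = new Thread(() -> {
            Thread.currentThread().interrupt();   // interrupt status is now set
            LockSupport.park();                   // does not stay blocked while interrupted
            observed[0] = Thread.interrupted();   // reports the interrupt and clears it
            observed[1] = Thread.interrupted();   // status was cleared by the previous call
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }
}
```

Because the status is cleared here, acquireQueued has to remember it in its local interrupted variable; otherwise the next park in the loop would return immediately every time.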
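If the loop exits abnormally, that is, an exception is thrown before the lock has been acquired (failed is still true), cancelAcquire is called. When the lock is acquired normally, failed is false and cancelAcquire is skipped. cancelAcquire skips over cancelled predecessors, marks the current node as CANCELLED, and unlinks it from the queue.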
/**
 * Cancels an ongoing attempt to acquire.
 *
 * @param node the node
 */
private void cancelAcquire(Node node) {
    // Ignore if node doesn't exist
    if (node == null)
        return;
    node.thread = null;
    // Skip cancelled predecessors
    Node pred = node.prev;
    while (pred.waitStatus > 0)
        node.prev = pred = pred.prev;
    // predNext is the apparent node to unsplice. CASes below will
    // fail if not, in which case, we lost race vs another cancel
    // or signal, so no further action is necessary.
    Node predNext = pred.next;
    // Can use unconditional write instead of CAS here.
    // After this atomic step, other Nodes can skip past us.
    // Before, we are free of interference from other threads.
    node.waitStatus = Node.CANCELLED;
    // If we are the tail, remove ourselves.
    if (node == tail && compareAndSetTail(node, pred)) {
        compareAndSetNext(pred, predNext, null);
    } else {
        // If successor needs signal, try to set pred's next-link
        // so it will get one. Otherwise wake it up to propagate.
        int ws;
        if (pred != head &&
            ((ws = pred.waitStatus) == Node.SIGNAL ||
             (ws <= 0 && compareAndSetWaitStatus(pred, ws, Node.SIGNAL))) &&
            pred.thread != null) {
            Node next = node.next;
            if (next != null && next.waitStatus <= 0)
                compareAndSetNext(pred, predNext, next);
        } else {
            unparkSuccessor(node);
        }
        node.next = node; // help GC
    }
}
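With plain lock() cancellation is rare, but the timed and interruptible acquire variants in AQS go through the same cancelAcquire when they give up. The following sketch uses a timed tryLock to show a waiter leaving the queue without ever getting the lock; the class name TimedAcquireCancelSketch and the 50 ms timeout are arbitrary choices for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: a timed-out waiter is removed from the wait queue.
final class TimedAcquireCancelSketch {
    // Returns { 1 if the waiter got the lock else 0, queue length after it gave up }.
    static int[] demo() {
        ReentrantLock lock = new ReentrantLock();
        boolean[] acquired = new boolean[1];
        int queuedAfterTimeout;

        lock.lock();   // held for the whole experiment
        try {
            Thread waiter = new Thread(() -> {
                try {
                    acquired[0] = lock.tryLock(50, TimeUnit.MILLISECONDS);
                    if (acquired[0]) {
                        lock.unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            waiter.start();
            waiter.join();   // waiter has timed out and cancelled its node
            queuedAfterTimeout = lock.getQueueLength();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.unlock();
        }
        return new int[] { acquired[0] ? 1 : 0, queuedAfterTimeout };
    }
}
```

The waiter never acquires the lock, and after it times out getQueueLength() reports no waiting threads, i.e. its cancelled node no longer counts as a waiter.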
unlock simply forwards to release, whose source is:
public final boolean release(int arg) {
    if (tryRelease(arg)) {
        Node h = head;
        if (h != null && h.waitStatus != 0)
            unparkSuccessor(h);
        return true;
    }
    return false;
}
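release first calls tryRelease to give back the resource; if the lock is now completely free, it tries to wake the first node after head via unparkSuccessor. To do so, unparkSuccessor finds the first waiting (non-cancelled) node, scanning backwards from tail if head's next node is missing or cancelled, and calls LockSupport.unpark on that node's thread. The implementation: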
private void unparkSuccessor(Node node) {
    /*
     * If status is negative (i.e., possibly needing signal) try
     * to clear in anticipation of signalling. It is OK if this
     * fails or if status is changed by waiting thread.
     */
    int ws = node.waitStatus;
    if (ws < 0)
        compareAndSetWaitStatus(node, ws, 0);
    /*
     * Thread to unpark is held in successor, which is normally
     * just the next node. But if cancelled or apparently null,
     * traverse backwards from tail to find the actual
     * non-cancelled successor.
     */
    Node s = node.next;
    if (s == null || s.waitStatus > 0) {
        s = null;
        for (Node t = tail; t != null && t != node; t = t.prev)
            if (t.waitStatus <= 0)
                s = t;
    }
    if (s != null)
        LockSupport.unpark(s.thread);
}
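Now suppose that after Thread1 releases the lock, the current first node acquires it and, when finished, releases it too; Thread2 is then unparked by that first node. The queue now looks like this:
Once Thread2 is unparked, it continues the loop in acquireQueued. Since it is now the first node, as soon as its tryAcquire succeeds, Thread2 can finally turn the tables and hold the lock!
All of this looks lovely, so why is this lock called non-fair?
Answer: because lock first attempts a CAS. If the lock has just been released by another thread at that moment, the CAS succeeds and the caller grabs the lock ahead of the node at the front of the queue, i.e. it jumps the queue.
The only difference between the fair and non-fair locks is that FairSync implements lock and tryAcquire differently: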
/**
 * Sync object for fair locks
 */
static final class FairSync extends Sync {
    private static final long serialVersionUID = -3000897897090466540L;
    final void lock() {
        acquire(1);
    }
    /**
     * Fair version of tryAcquire. Don't grant access unless
     * recursive call or no waiters or is first.
     */
    protected final boolean tryAcquire(int acquires) {
        final Thread current = Thread.currentThread();
        int c = getState();
        if (c == 0) {
            if (!hasQueuedPredecessors() &&
                compareAndSetState(0, acquires)) {
                setExclusiveOwnerThread(current);
                return true;
            }
        }
        else if (current == getExclusiveOwnerThread()) {
            int nextc = c + acquires;
            if (nextc < 0)
                throw new Error("Maximum lock count exceeded");
            setState(nextc);
            return true;
        }
        return false;
    }
}
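Compared with the non-fair lock, the fair lock's lock method does not attempt a CAS first, which avoids the queue-jumping described at the end of the previous section, at the cost of some loss in throughput.
Below is the non-fair lock's tryAcquire logic. (It lives in Sync as nonfairTryAcquire, which makes the code structure look a bit muddled; later JDK releases reorganize this so that FairSync and NonfairSync each implement their own tryAcquire, which reads much more naturally.)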
// nonfairTryAcquire, used by the non-fair lock
final boolean nonfairTryAcquire(int acquires) {
    final Thread current = Thread.currentThread();
    int c = getState();
    if (c == 0) {
        if (compareAndSetState(0, acquires)) {
            setExclusiveOwnerThread(current);
            return true;
        }
    }
    else if (current == getExclusiveOwnerThread()) {
        int nextc = c + acquires;
        if (nextc < 0) // overflow
            throw new Error("Maximum lock count exceeded");
        setState(nextc);
        return true;
    }
    return false;
}
As you can see, the only difference is that the fair lock calls hasQueuedPredecessors before the CAS:
public final boolean hasQueuedPredecessors() {
    // The correctness of this depends on head being initialized
    // before tail and on head.next being accurate if the current
    // thread is first in queue.
    Node t = tail; // Read fields in reverse initialization order
    Node h = head;
    Node s;
    return h != t &&
        ((s = h.next) == null || s.thread != Thread.currentThread());
}
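This method checks whether any thread is queued ahead of the current thread. It returns false in the following cases:
The queue is empty or has not been initialized yet (head == tail).
The first node after head belongs to the current thread.
Only when hasQueuedPredecessors returns false, i.e. no thread is ahead of the current one, may the current thread try to acquire the resource with CAS.
Apart from the lock and tryAcquire methods of the Sync object, the fair and non-fair locks are implemented identically, so we won't repeat the rest here.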
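The queue order that hasQueuedPredecessors protects can be made visible with a fair lock and two waiters that are forced to enqueue one after the other. This is a sketch only: the class name FairOrderSketch and the thread names are invented, and since no thread tries to barge in this setup it demonstrates the FIFO hand-off rather than proving fairness under contention.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: waiters on a fair lock are served in queue order.
final class FairOrderSketch {
    static List<String> demo() {
        ReentrantLock lock = new ReentrantLock(true);   // fair lock
        List<String> order = new CopyOnWriteArrayList<>();
        Runnable body = () -> {
            lock.lock();
            try {
                order.add(Thread.currentThread().getName());
            } finally {
                lock.unlock();
            }
        };
        Thread first = new Thread(body, "first");
        Thread second = new Thread(body, "second");

        lock.lock();
        try {
            first.start();
            while (!lock.hasQueuedThread(first)) {    // "first" is now in the queue
                Thread.yield();
            }
            second.start();
            while (!lock.hasQueuedThread(second)) {   // "second" is queued behind it
                Thread.yield();
            }
        } finally {
            lock.unlock();   // wakes head's successor, i.e. "first"
        }

        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }
}
```

The recorded order is first, then second: each release wakes the node right behind head, and a queued thread only attempts tryAcquire once its predecessor is head.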
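I have read a number of Java source walkthroughs before; they usually go through a class method by method, which makes it hard to see how the methods fit together. This article instead tries to understand the implementation details of AbstractQueuedSynchronizer by following ReentrantLock's lock and unlock methods. For reasons of length, the node state transitions in AQS and the shared-mode implementation are not covered in detail. A friend of mine happens to have written an article on the details of AQS node state transitions, which is worth reading next.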