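In concurrent Java programming, deadlock deserves special attention. Once a deadlock occurs, the usual remedy is to restart the process, and a restart loses whatever data was in flight. Understanding how deadlocks form, how to diagnose them, and how to prevent them is therefore important.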
package cn.sxt.game;

/*
 * Deadlock demo: two threads take the same two locks in opposite order.
 */
public class DeadLock {
    public static void main(String[] args) {
        Object o1 = new Object();
        Object o2 = new Object();
        Thread t1 = new MyThread1(o1, o2);
        Thread t2 = new MyThread2(o1, o2);
        t1.setName("t1"); // name the thread t1
        t2.setName("t2"); // name the thread t2
        t1.start();
        t2.start();
    }
}

class MyThread1 extends Thread {
    Object o1;
    Object o2;

    public MyThread1(Object o1, Object o2) {
        this.o1 = o1;
        this.o2 = o2;
    }

    @Override
    public void run() {
        synchronized (o1) {
            try {
                System.out.println("acquired resource 1");
                sleep(1000); // wait 1 second so the other thread can take its lock
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            // blocks here: o2 is held by the other thread
            synchronized (o2) {
            }
        }
    }
}

class MyThread2 extends Thread {
    Object o1;
    Object o2;

    public MyThread2(Object o1, Object o2) {
        this.o1 = o1;
        this.o2 = o2;
    }

    @Override
    public void run() {
        synchronized (o2) {
            try {
                System.out.println("acquired resource 2");
                sleep(1000); // wait 1 second so the other thread can take its lock
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            // blocks here: o1 is held by the other thread
            synchronized (o1) {
            }
        }
    }
}
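In the code above, we start two threads that compete for two resources, each guarded by a different lock object (o1 and o2, both plain Object instances). The first thread enters its synchronized block, takes the o1 lock, sleeps for 1 second, and then tries to take the o2 lock. The second thread enters its synchronized block, takes the o2 lock, sleeps for 1 second, and then tries to take the o1 lock. At that point the thread holding o1 is trying to acquire o2, while the thread holding o2 is trying to acquire o1. Neither can obtain the other's lock, so both block forever and the program hangs.
This situation is a deadlock.
The deadlock in this code is one we created on purpose, so we know exactly where it is. But what about a production environment? If the system hangs, how do we find out which piece of code is at fault, and whether the problem is a deadlock at all? In other words, how do we detect a deadlock?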
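2. How to detect a deadlock?
Deadlocks are extremely hard to find by manual inspection, so the JDK ships command-line tools for examining the threads of a Java process and checking them for deadlocks. jps lists the process IDs of running Java programs (on Linux the PID can of course be obtained in other ways), and jstack <pid> prints the thread stacks of that process and reports any deadlock it finds.
Using the jps command on Windows: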
C:\Users\16047>jps
1488 Jps
3028 Launcher
9188 DeadLock
2632
C:\Users\16047>jstack 9188
2020-04-14 09:05:35
Full thread dump Java HotSpot(TM) 64-Bit Server VM (13.0.1+9 mixed mode, sharing):
Threads class SMR info:
_java_thread_list=0x00000254ffd7b290, length=13, elements={
0x00000254ffa11000, 0x00000254ffa14800, 0x00000254ffa37000, 0x00000254ffa38000,
0x00000254ffa39800, 0x00000254ffae1000, 0x00000254ffa42800, 0x00000254ff9f5800,
0x00000254ffced800, 0x00000254ffcf7000, 0x00000254ffd04800, 0x00000254ffd3f800,
0x00000254f8011800
}
"Reference Handler" #2 daemon prio=10 os_prio=2 cpu=0.00ms elapsed=108.48s tid=0x00000254ffa11000 nid=0x28e4 waiting on condition [0x00000065b3afe000]
java.lang.Thread.State: RUNNABLE
at java.lang.ref.Reference.waitForReferencePendingList(java.base@13.0.1/Native Method)
at java.lang.ref.Reference.processPendingReferences(java.base@13.0.1/Reference.java:241)
at java.lang.ref.Reference$ReferenceHandler.run(java.base@13.0.1/Reference.java:213)
"Finalizer" #3 daemon prio=8 os_prio=1 cpu=0.00ms elapsed=108.48s tid=0x00000254ffa14800 nid=0x4204 in Object.wait() [0x00000065b3bfe000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(java.base@13.0.1/Native Method)
- waiting on <0x0000000089d0aec8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@13.0.1/ReferenceQueue.java:155)
- locked <0x0000000089d0aec8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@13.0.1/ReferenceQueue.java:176)
at java.lang.ref.Finalizer$FinalizerThread.run(java.base@13.0.1/Finalizer.java:170)
"Signal Dispatcher" #4 daemon prio=9 os_prio=2 cpu=0.00ms elapsed=108.46s tid=0x00000254ffa37000 nid=0x181c runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Attach Listener" #5 daemon prio=5 os_prio=2 cpu=31.25ms elapsed=108.46s tid=0x00000254ffa38000 nid=0x2bb4 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #6 daemon prio=9 os_prio=2 cpu=62.50ms elapsed=108.46s tid=0x00000254ffa39800 nid=0x25c0 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"C1 CompilerThread0" #9 daemon prio=9 os_prio=2 cpu=46.88ms elapsed=108.46s tid=0x00000254ffae1000 nid=0x2988 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
No compile task
"Sweeper thread" #10 daemon prio=9 os_prio=2 cpu=0.00ms elapsed=108.46s tid=0x00000254ffa42800 nid=0x3ce8 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Common-Cleaner" #11 daemon prio=8 os_prio=1 cpu=0.00ms elapsed=108.44s tid=0x00000254ff9f5800 nid=0x3b78 in Object.wait() [0x00000065b41fe000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(java.base@13.0.1/Native Method)
- waiting on <0x0000000089db21c0> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(java.base@13.0.1/ReferenceQueue.java:155)
- locked <0x0000000089db21c0> (a java.lang.ref.ReferenceQueue$Lock)
at jdk.internal.ref.CleanerImpl.run(java.base@13.0.1/CleanerImpl.java:148)
at java.lang.Thread.run(java.base@13.0.1/Thread.java:830)
at jdk.internal.misc.InnocuousThread.run(java.base@13.0.1/InnocuousThread.java:134)
"Monitor Ctrl-Break" #12 daemon prio=5 os_prio=0 cpu=15.63ms elapsed=108.36s tid=0x00000254ffced800 nid=0x2a38 runnable [0x00000065b43fe000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.SocketDispatcher.read0(java.base@13.0.1/Native Method)
at sun.nio.ch.SocketDispatcher.read(java.base@13.0.1/SocketDispatcher.java:46)
at sun.nio.ch.NioSocketImpl.tryRead(java.base@13.0.1/NioSocketImpl.java:262)
at sun.nio.ch.NioSocketImpl.implRead(java.base@13.0.1/NioSocketImpl.java:313)
at sun.nio.ch.NioSocketImpl.read(java.base@13.0.1/NioSocketImpl.java:351)
at sun.nio.ch.NioSocketImpl$1.read(java.base@13.0.1/NioSocketImpl.java:802)
at java.net.Socket$SocketInputStream.read(java.base@13.0.1/Socket.java:937)
at sun.nio.cs.StreamDecoder.readBytes(java.base@13.0.1/StreamDecoder.java:297)
at sun.nio.cs.StreamDecoder.implRead(java.base@13.0.1/StreamDecoder.java:339)
at sun.nio.cs.StreamDecoder.read(java.base@13.0.1/StreamDecoder.java:188)
- locked <0x0000000089b125a0> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(java.base@13.0.1/InputStreamReader.java:185)
at java.io.BufferedReader.fill(java.base@13.0.1/BufferedReader.java:161)
at java.io.BufferedReader.readLine(java.base@13.0.1/BufferedReader.java:326)
- locked <0x0000000089b125a0> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(java.base@13.0.1/BufferedReader.java:392)
at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:64)
"Service Thread" #13 daemon prio=9 os_prio=0 cpu=0.00ms elapsed=108.36s tid=0x00000254ffcf7000 nid=0x42b4 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"t1" #14 prio=5 os_prio=0 cpu=0.00ms elapsed=108.35s tid=0x00000254ffd04800 nid=0x3384 waiting for monitor entry [0x00000065b46fe000]
java.lang.Thread.State: BLOCKED (on object monitor)
at cn.sxt.game.MyThread1.run(DeadLock.java:36)
- waiting to lock <0x0000000089ccb560> (a java.lang.Object)
- locked <0x0000000089ccb550> (a java.lang.Object)
"t2" #15 prio=5 os_prio=0 cpu=0.00ms elapsed=108.35s tid=0x00000254ffd3f800 nid=0x3c18 waiting for monitor entry [0x00000065b47ff000]
java.lang.Thread.State: BLOCKED (on object monitor)
at cn.sxt.game.MyThread2.run(DeadLock.java:59)
- waiting to lock <0x0000000089ccb550> (a java.lang.Object)
- locked <0x0000000089ccb560> (a java.lang.Object)
"DestroyJavaVM" #16 prio=5 os_prio=0 cpu=156.25ms elapsed=108.35s tid=0x00000254f8011800 nid=0x2780 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=2 cpu=0.00ms elapsed=108.48s tid=0x00000254ffa10000 nid=0x2544 runnable
"GC Thread#0" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254f8055800 nid=0x117c runnable
"G1 Main Marker" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254f8066800 nid=0x3098 runnable
"G1 Conc#0" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254f8068800 nid=0x24e4 runnable
"G1 Refine#0" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254ff88a800 nid=0x127c runnable
"G1 Young RemSet Sampling" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254ff88c000 nid=0x1900 runnable
"VM Periodic Task Thread" os_prio=2 cpu=0.00ms elapsed=108.36s tid=0x00000254ffcfa800 nid=0x230c waiting on condition
JNI global refs: 16, weak refs: 0
// a deadlock was found
Found one Java-level deadlock:
=============================
"t1":
waiting to lock monitor 0x00000254ffa1df00 (object 0x0000000089ccb560, a java.lang.Object),
which is held by "t2"
"t2":
waiting to lock monitor 0x00000254ffa1dd00 (object 0x0000000089ccb550, a java.lang.Object),
which is held by "t1"
Java stack information for the threads listed above:
===================================================
"t1":
at cn.sxt.game.MyThread1.run(DeadLock.java:36)
// waiting for lock 0x0000000089ccb560
- waiting to lock <0x0000000089ccb560> (a java.lang.Object)
// holding lock 0x0000000089ccb550
- locked <0x0000000089ccb550> (a java.lang.Object)
"t2":
at cn.sxt.game.MyThread2.run(DeadLock.java:59)
// waiting for lock 0x0000000089ccb550
- waiting to lock <0x0000000089ccb550> (a java.lang.Object)
// holding lock 0x0000000089ccb560
- locked <0x0000000089ccb560> (a java.lang.Object)
// one deadlock found
Found 1 deadlock.
t1: waiting to lock <0x0000000089ccb560>, locked <0x0000000089ccb550>
t2: waiting to lock <0x0000000089ccb550>, locked <0x0000000089ccb560>
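We first use the jps command to find the Java process ID, then run jstack <pid> to print that process's thread stacks. In the last part of the output, jstack reports that it found one deadlock and gives the details: thread t1 (we named the thread t1 ourselves; in general, giving threads meaningful names makes troubleshooting easier) holds the lock on the java.lang.Object at 0x0000000089ccb550 and is waiting for the lock at 0x0000000089ccb560, which is held by t2. Meanwhile t2 is in the mirror-image situation: it holds 0x0000000089ccb560 and is waiting for 0x0000000089ccb550. The comments we added to the output say the same thing.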
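A similar check can also be made from inside the running JVM. The following is a minimal sketch, assuming the standard java.lang.management API (ThreadMXBean.findDeadlockedThreads) is supported by the JVM in use. The class DeadlockProbe and its methods are hypothetical names introduced here for illustration and are not part of the demo above. It reproduces the t1/t2 deadlock with daemon threads and then asks the JVM which threads are deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper; class and method names are illustrative only.
final class DeadlockProbe {

    // Names of the threads the JVM currently reports as deadlocked (empty if none).
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) { // a thread may have exited in the meantime
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    // Starts a daemon thread that locks `first`, waits for its peer, then wants `second`.
    static void startWorker(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // proceed only once both first locks are held
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // never reached: the peer holds `second`
                }
            }
        }, name);
        t.setDaemon(true); // daemon, so the stuck threads do not keep the JVM alive
        t.start();
    }

    // Provokes the t1/t2 deadlock and polls until it is reported or the timeout passes.
    static List<String> provokeAndDetect(long timeoutMillis) {
        Object o1 = new Object();
        Object o2 = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startWorker("t1", o1, o2, bothHoldFirst);
        startWorker("t2", o2, o1, bothHoldFirst);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            names = deadlockedThreadNames();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + provokeAndDetect(5000));
    }
}
```

A service could call something like deadlockedThreadNames() from a periodic health check and log the result, which gives an early signal without anyone having to run jstack by hand.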
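So what should we do once a deadlock has happened? The simplest approach is to restart the process, then fix the code that the jstack stack traces point to and redeploy. There are also more advanced strategies, such as rolling execution back to the state before the deadlock and then letting the threads enter the synchronized blocks one after another.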
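What causes a deadlock?
In general, a deadlock can occur only when all of the following conditions hold:
Mutual exclusion: a resource can be used by only one thread at a time.
Hold and wait: a thread that blocks while requesting a resource keeps the resources it already holds.
No preemption: a resource that a thread has acquired cannot be forcibly taken away before the thread is done with it.
Circular wait: several threads form a closed chain in which each waits for a resource held by the next.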
All four conditions are necessary for a deadlock, so breaking any one of them is enough to prevent it.
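As an illustration of breaking the circular wait condition, here is a minimal sketch in which both threads acquire the locks in the same order (o1 first, then o2). The class OrderedLocking and its methods are hypothetical names used only for this example, and the sleep is shortened to 100 ms to keep the run brief.

```java
// Hypothetical variant of the demo; class and method names are illustrative only.
final class OrderedLocking {

    // Every worker takes o1 first and o2 second, so no circular wait can form.
    static Thread worker(String name, Object o1, Object o2) {
        return new Thread(() -> {
            synchronized (o1) {
                try {
                    Thread.sleep(100); // hold o1 for a while, as the demo does
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (o2) {
                    System.out.println(name + " acquired both locks");
                }
            }
        }, name);
    }

    // Returns true if both threads have finished by the time the joins return.
    static boolean runBoth(long timeoutMillis) {
        Object o1 = new Object();
        Object o2 = new Object();
        Thread t1 = worker("t1", o1, o2);
        Thread t2 = worker("t2", o1, o2);
        t1.start();
        t2.start();
        try {
            t1.join(timeoutMillis);
            t2.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("finished without deadlock: " + runBoth(5000));
    }
}
```

Whichever thread gets o1 first runs to completion while the other waits at the outer synchronized block without holding anything, so the wait chain can never close into a cycle.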
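When writing code, we should do our best to avoid deadlocks. If one is suspected, the jstack command can be used to check whether any threads are deadlocked.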
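One way to avoid deadlock in code where a fixed lock order is hard to enforce is to weaken the hold and wait condition: a thread that cannot get its second lock in time releases the first one and retries. The synchronized keyword cannot do this, so the sketch below swaps in java.util.concurrent.locks.ReentrantLock and its timed tryLock. The class TryLockDemo, its methods, the 50 ms timeout, and the retry count are all assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example; class and method names are illustrative only.
final class TryLockDemo {

    // Runs `action` while holding both locks. If the second lock cannot be
    // obtained in time, the first is released before the next attempt.
    static boolean runWithBothLocks(Lock first, Lock second, Runnable action, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                if (first.tryLock(50, TimeUnit.MILLISECONDS)) {
                    try {
                        if (second.tryLock(50, TimeUnit.MILLISECONDS)) {
                            try {
                                action.run();
                                return true;
                            } finally {
                                second.unlock();
                            }
                        }
                    } finally {
                        first.unlock(); // do not keep holding while waiting
                    }
                }
                // Random back-off so the two threads do not retry in lockstep.
                Thread.sleep(ThreadLocalRandom.current().nextInt(1, 20));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    // Two threads request the locks in opposite order; returns how many completed.
    static int demo() {
        Lock a = new ReentrantLock();
        Lock b = new ReentrantLock();
        AtomicInteger completed = new AtomicInteger();
        Thread t1 = new Thread(() -> runWithBothLocks(a, b, completed::incrementAndGet, 100), "t1");
        Thread t2 = new Thread(() -> runWithBothLocks(b, a, completed::incrementAndGet, 100), "t2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("threads completed: " + demo());
    }
}
```

The trade-off is that a thread may need several attempts, and the caller has to decide what to do when runWithBothLocks gives up and returns false. Where a consistent lock order is practical, it is usually the simpler choice.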