About Deadlocks and How to Check Whether Threads Are Deadlocked

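In Java concurrent programming, one problem deserves special attention: deadlock. Once a deadlock occurs, the usual remedy is a restart, and a restart loses whatever data was in flight. Understanding how deadlocks form, how to diagnose them, and how to prevent them is therefore important.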

  1. What is a deadlock?
package cn.sxt.game;
/* Deadlock
*
* */
public class DeadLock {
    public static void main(String[] args) {
        Object o1=new Object();
        Object o2=new Object();
        Thread t1=new MyThread1(o1,o2);
        Thread t2=new MyThread2(o1,o2);
        t1.setName("t1");// name the thread t1
        t2.setName("t2");// name the thread t2
        t1.start();
        t2.start();
    }
}
class MyThread1 extends Thread{
    Object o1;
    Object o2;
    public MyThread1(Object o1,Object o2){
        this.o1=o1;
        this.o2=o2;
    }

    @Override
    public void run() {
        synchronized (o1){
            try {
                System.out.println("Acquired resource 1");
                sleep(1000);// wait 1 second so the other thread can acquire its lock
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            synchronized (o2){

            }
        }
    }
}
class MyThread2 extends Thread{
    Object o1;
    Object o2;
    public MyThread2(Object o1,Object o2){
        this.o1=o1;
        this.o2=o2;
    }

    @Override
    public void run() {
        synchronized (o2){
            try {
                System.out.println("Acquired resource 2");
                sleep(1000);// wait 1 second so the other thread can acquire its lock
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            synchronized (o1){

            }
        }
    }
}

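In the code above, we start two threads that compete for two resources, each guarded by a different lock object (o1 and o2, both plain Object instances). The first thread enters its synchronized block, acquires the o1 lock, and sleeps for 1 second before trying to acquire the o2 lock. The second thread enters its synchronized block, acquires the o2 lock, and sleeps for 1 second before trying to acquire the o1 lock. At that point the thread holding o1 is trying to get o2, while the thread holding o2 is trying to get o1. Neither can obtain the other's lock, so the two are stuck in a standoff and the program hangs.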

This situation is a deadlock.

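The deadlock here is one we created on purpose, so we know where it is. But what about a production environment? If the system hangs, how do we know which piece of code is at fault, and whether a deadlock is the cause? In other words, how do we detect a deadlock?
  2. How do you detect a deadlock?
Deadlocks are very hard to find by manual inspection, so the JDK provides commands for examining the threads of a Java process and checking for deadlocks. jps lists the process IDs of Java programs (on Linux you can also obtain the ID by other means), and jstack followed by the process ID prints the stack information of that process and reports any deadlock it finds.
Running the jps command on Windows: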

C:\Users\16047>jps
        1488 Jps
        3028 Launcher
        9188 DeadLock
        2632

        C:\Users\16047>jstack 9188
        2020-04-14 09:05:35
        Full thread dump Java HotSpot(TM) 64-Bit Server VM (13.0.1+9 mixed mode, sharing):

        Threads class SMR info:
        _java_thread_list=0x00000254ffd7b290, length=13, elements={
        0x00000254ffa11000, 0x00000254ffa14800, 0x00000254ffa37000, 0x00000254ffa38000,
        0x00000254ffa39800, 0x00000254ffae1000, 0x00000254ffa42800, 0x00000254ff9f5800,
        0x00000254ffced800, 0x00000254ffcf7000, 0x00000254ffd04800, 0x00000254ffd3f800,
        0x00000254f8011800
        }

        "Reference Handler" #2 daemon prio=10 os_prio=2 cpu=0.00ms elapsed=108.48s tid=0x00000254ffa11000 nid=0x28e4 waiting on condition  [0x00000065b3afe000]
        java.lang.Thread.State: RUNNABLE
        at java.lang.ref.Reference.waitForReferencePendingList(java.base@13.0.1/Native Method)
        at java.lang.ref.Reference.processPendingReferences(java.base@13.0.1/Reference.java:241)
        at java.lang.ref.Reference$ReferenceHandler.run(java.base@13.0.1/Reference.java:213)

        "Finalizer" #3 daemon prio=8 os_prio=1 cpu=0.00ms elapsed=108.48s tid=0x00000254ffa14800 nid=0x4204 in Object.wait()  [0x00000065b3bfe000]
        java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@13.0.1/Native Method)
        - waiting on <0x0000000089d0aec8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@13.0.1/ReferenceQueue.java:155)
        - locked <0x0000000089d0aec8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@13.0.1/ReferenceQueue.java:176)
        at java.lang.ref.Finalizer$FinalizerThread.run(java.base@13.0.1/Finalizer.java:170)

        "Signal Dispatcher" #4 daemon prio=9 os_prio=2 cpu=0.00ms elapsed=108.46s tid=0x00000254ffa37000 nid=0x181c runnable  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE

        "Attach Listener" #5 daemon prio=5 os_prio=2 cpu=31.25ms elapsed=108.46s tid=0x00000254ffa38000 nid=0x2bb4 waiting on condition  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE

        "C2 CompilerThread0" #6 daemon prio=9 os_prio=2 cpu=62.50ms elapsed=108.46s tid=0x00000254ffa39800 nid=0x25c0 waiting on condition  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE
        No compile task

        "C1 CompilerThread0" #9 daemon prio=9 os_prio=2 cpu=46.88ms elapsed=108.46s tid=0x00000254ffae1000 nid=0x2988 waiting on condition  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE
        No compile task

        "Sweeper thread" #10 daemon prio=9 os_prio=2 cpu=0.00ms elapsed=108.46s tid=0x00000254ffa42800 nid=0x3ce8 runnable  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE

        "Common-Cleaner" #11 daemon prio=8 os_prio=1 cpu=0.00ms elapsed=108.44s tid=0x00000254ff9f5800 nid=0x3b78 in Object.wait()  [0x00000065b41fe000]
        java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@13.0.1/Native Method)
        - waiting on <0x0000000089db21c0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@13.0.1/ReferenceQueue.java:155)
        - locked <0x0000000089db21c0> (a java.lang.ref.ReferenceQueue$Lock)
        at jdk.internal.ref.CleanerImpl.run(java.base@13.0.1/CleanerImpl.java:148)
        at java.lang.Thread.run(java.base@13.0.1/Thread.java:830)
        at jdk.internal.misc.InnocuousThread.run(java.base@13.0.1/InnocuousThread.java:134)

        "Monitor Ctrl-Break" #12 daemon prio=5 os_prio=0 cpu=15.63ms elapsed=108.36s tid=0x00000254ffced800 nid=0x2a38 runnable  [0x00000065b43fe000]
        java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.SocketDispatcher.read0(java.base@13.0.1/Native Method)
        at sun.nio.ch.SocketDispatcher.read(java.base@13.0.1/SocketDispatcher.java:46)
        at sun.nio.ch.NioSocketImpl.tryRead(java.base@13.0.1/NioSocketImpl.java:262)
        at sun.nio.ch.NioSocketImpl.implRead(java.base@13.0.1/NioSocketImpl.java:313)
        at sun.nio.ch.NioSocketImpl.read(java.base@13.0.1/NioSocketImpl.java:351)
        at sun.nio.ch.NioSocketImpl$1.read(java.base@13.0.1/NioSocketImpl.java:802)
        at java.net.Socket$SocketInputStream.read(java.base@13.0.1/Socket.java:937)
        at sun.nio.cs.StreamDecoder.readBytes(java.base@13.0.1/StreamDecoder.java:297)
        at sun.nio.cs.StreamDecoder.implRead(java.base@13.0.1/StreamDecoder.java:339)
        at sun.nio.cs.StreamDecoder.read(java.base@13.0.1/StreamDecoder.java:188)
        - locked <0x0000000089b125a0> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(java.base@13.0.1/InputStreamReader.java:185)
        at java.io.BufferedReader.fill(java.base@13.0.1/BufferedReader.java:161)
        at java.io.BufferedReader.readLine(java.base@13.0.1/BufferedReader.java:326)
        - locked <0x0000000089b125a0> (a java.io.InputStreamReader)
        at java.io.BufferedReader.readLine(java.base@13.0.1/BufferedReader.java:392)
        at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:64)

        "Service Thread" #13 daemon prio=9 os_prio=0 cpu=0.00ms elapsed=108.36s tid=0x00000254ffcf7000 nid=0x42b4 runnable  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE

        "t1" #14 prio=5 os_prio=0 cpu=0.00ms elapsed=108.35s tid=0x00000254ffd04800 nid=0x3384 waiting for monitor entry  [0x00000065b46fe000]
        java.lang.Thread.State: BLOCKED (on object monitor)
        at cn.sxt.game.MyThread1.run(DeadLock.java:36)
        - waiting to lock <0x0000000089ccb560> (a java.lang.Object)
        - locked <0x0000000089ccb550> (a java.lang.Object)

        "t2" #15 prio=5 os_prio=0 cpu=0.00ms elapsed=108.35s tid=0x00000254ffd3f800 nid=0x3c18 waiting for monitor entry  [0x00000065b47ff000]
        java.lang.Thread.State: BLOCKED (on object monitor)
        at cn.sxt.game.MyThread2.run(DeadLock.java:59)
        - waiting to lock <0x0000000089ccb550> (a java.lang.Object)
        - locked <0x0000000089ccb560> (a java.lang.Object)

        "DestroyJavaVM" #16 prio=5 os_prio=0 cpu=156.25ms elapsed=108.35s tid=0x00000254f8011800 nid=0x2780 waiting on condition  [0x0000000000000000]
        java.lang.Thread.State: RUNNABLE

        "VM Thread" os_prio=2 cpu=0.00ms elapsed=108.48s tid=0x00000254ffa10000 nid=0x2544 runnable

        "GC Thread#0" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254f8055800 nid=0x117c runnable

        "G1 Main Marker" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254f8066800 nid=0x3098 runnable

        "G1 Conc#0" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254f8068800 nid=0x24e4 runnable

        "G1 Refine#0" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254ff88a800 nid=0x127c runnable

        "G1 Young RemSet Sampling" os_prio=2 cpu=0.00ms elapsed=108.49s tid=0x00000254ff88c000 nid=0x1900 runnable
        "VM Periodic Task Thread" os_prio=2 cpu=0.00ms elapsed=108.36s tid=0x00000254ffcfa800 nid=0x230c waiting on condition

        JNI global refs: 16, weak refs: 0

        // one deadlock found
        Found one Java-level deadlock:
        =============================
        "t1":
        waiting to lock monitor 0x00000254ffa1df00 (object 0x0000000089ccb560, a java.lang.Object),
        which is held by "t2"
        "t2":
        waiting to lock monitor 0x00000254ffa1dd00 (object 0x0000000089ccb550, a java.lang.Object),
        which is held by "t1"

        Java stack information for the threads listed above:
        ===================================================
        "t1":
        at cn.sxt.game.MyThread1.run(DeadLock.java:36)
        // waiting for lock 0x0000000089ccb560
        - waiting to lock <0x0000000089ccb560> (a java.lang.Object)
        // holding lock 0x0000000089ccb550
        - locked <0x0000000089ccb550> (a java.lang.Object)
        "t2":
        at cn.sxt.game.MyThread2.run(DeadLock.java:59)
        // waiting for lock 0x0000000089ccb550
        - waiting to lock <0x0000000089ccb550> (a java.lang.Object)
        // holding lock 0x0000000089ccb560
        - locked <0x0000000089ccb560> (a java.lang.Object)
        // one deadlock found
        Found 1 deadlock.
t1: waiting to lock <0x0000000089ccb560>, locked <0x0000000089ccb550>
t2: waiting to lock <0x0000000089ccb550>, locked <0x0000000089ccb560>

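We first use jps to find the Java process ID, then run jstack with that process ID to print the thread stacks of the process. In the last part of the output, jstack reports that it found a deadlock and gives the details: thread t1 (we named the thread t1 ourselves; in general, giving threads meaningful names makes troubleshooting much easier) holds the lock on the Object at 0x0000000089ccb550 and is waiting for the lock at 0x0000000089ccb560, which is held by t2. t2 is the mirror image: it holds the lock at 0x0000000089ccb560 and is waiting for the lock at 0x0000000089ccb550. The comments we added to the output mark the same thing.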
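
The same check that jstack performs can also be run from inside the JVM. The following is a minimal sketch, assuming the standard java.lang.management API is available in your runtime. The class name DeadlockProbe and its methods are hypothetical helpers made up for illustration, and the demo threads are daemons so that the JVM can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper for illustration; not part of the original article.
class DeadlockProbe {

    // Describes every thread the JVM reports as deadlocked (empty array if none).
    static String[] describeDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return new String[0];
        }
        ThreadInfo[] infos = bean.getThreadInfo(ids);
        String[] lines = new String[infos.length];
        for (int i = 0; i < infos.length; i++) {
            ThreadInfo info = infos[i]; // null if the thread has already ended
            lines[i] = (info == null)
                    ? "(thread no longer alive)"
                    : info.getThreadName() + " is waiting for " + info.getLockName()
                      + " held by " + info.getLockOwnerName();
        }
        return lines;
    }

    // Demo only: starts two daemon threads that take the same two locks in opposite order.
    static void startDeadlockedPair() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startDaemon("t1", a, b, bothHoldFirst);
        startDaemon("t2", b, a, bothHoldFirst);
    }

    private static void startDaemon(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // never reached: the other thread holds 'second'
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlockedPair();
        String[] found = describeDeadlocks();
        while (found.length == 0) { // poll until both threads are blocked
            Thread.sleep(100);
            found = describeDeadlocks();
        }
        for (String line : found) {
            System.out.println(line);
        }
    }
}
```

A check like describeDeadlocks() could be called periodically from a monitoring thread and its result logged, which gives roughly the information in the "Found one Java-level deadlock" section without running jstack by hand.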

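So what should you do once a deadlock has occurred? The simplest approach is to restart, then fix the code identified by the stack traces in the jstack output and redeploy. There are also more advanced strategies, such as rolling the deadlocked threads back to their state before the deadlock and then letting them enter the synchronized blocks in order.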
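
A full rollback mechanism is beyond a short example, but a simpler relative of the idea is to let a thread give up and release what it holds instead of waiting forever. The sketch below is an assumption-laden illustration, not the approach used in the article's code: it swaps synchronized for java.util.concurrent.locks.ReentrantLock, because only the latter offers a timed tryLock. The class name BackoffLocking and the method runWithBoth are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper for illustration; not part of the original article.
class BackoffLocking {

    // Runs 'action' while holding both locks. Returns false, holding nothing,
    // if either lock cannot be acquired within the timeout.
    static boolean runWithBoth(ReentrantLock first, ReentrantLock second,
                               long timeoutMillis, Runnable action) {
        try {
            if (!first.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
            try {
                if (!second.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false; // back off; the finally below releases 'first'
                }
                try {
                    action.run();
                    return true;
                } finally {
                    second.unlock();
                }
            } finally {
                first.unlock();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        ReentrantLock o1 = new ReentrantLock();
        ReentrantLock o2 = new ReentrantLock();
        boolean done = runWithBoth(o1, o2, 500,
                () -> System.out.println("holding both locks"));
        System.out.println("completed: " + done);
    }
}
```

When runWithBoth returns false, the caller can retry, ideally after a short random delay so that two competing threads do not keep colliding in lockstep.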

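  3. What conditions cause a deadlock?
    In general, a deadlock can occur only when all of the following conditions hold:

    Mutual exclusion: a resource can be used by only one thread at a time.

    Hold and wait: a thread that is blocked while requesting a resource keeps the resources it has already acquired.

    No preemption: resources a thread has acquired cannot be forcibly taken away before it has finished using them.

    Circular wait: several threads form a head-to-tail cycle in which each waits for a resource held by the next.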

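A deadlock requires all four of these necessary conditions, so breaking any one of them is generally enough to keep it from happening.
When writing code we should do our best to avoid deadlocks, and if one does occur, the jstack command can be used to check whether threads are deadlocked.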
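
One common way to break the circular-wait condition is to make every thread acquire the locks in the same order. The sketch below reworks the idea of the earlier example under that assumption: both threads take o1 first and o2 second, so no cycle can form. The class name OrderedLocks and its methods are hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical rework for illustration; not part of the original article.
class OrderedLocks {
    private static final Object o1 = new Object();
    private static final Object o2 = new Object();

    // Every caller locks o1 before o2, so a circular wait cannot form.
    static void work(String name, AtomicInteger completed) {
        synchronized (o1) {
            synchronized (o2) {
                completed.incrementAndGet();
                System.out.println(name + " holds both locks");
            }
        }
    }

    // Starts two threads, waits for both, and returns how many finished their work.
    static int runBoth() {
        AtomicInteger completed = new AtomicInteger();
        Thread t1 = new Thread(() -> work("t1", completed), "t1");
        Thread t2 = new Thread(() -> work("t2", completed), "t2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("threads completed: " + runBoth());
    }
}
```

With a fixed order, whichever thread gets o1 first can always go on to take o2, finish, and release both, after which the other thread proceeds.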
