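A deadlock is a situation in which two or more threads in a program are blocked forever. It can only arise when there are at least two threads and at least two shared resources. Below is a simple example I wrote that produces a deadlock; let's analyze how it happens.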
Java Deadlock Example
package com.journaldev.threads;

public class ThreadDeadlock {

    public static void main(String[] args) throws InterruptedException {
        Object obj1 = new Object();
        Object obj2 = new Object();
        Object obj3 = new Object();

        // Each thread locks its first object, then its second, forming a cycle:
        // t1: obj1 -> obj2, t2: obj2 -> obj3, t3: obj3 -> obj1
        Thread t1 = new Thread(new SyncThread(obj1, obj2), "t1");
        Thread t2 = new Thread(new SyncThread(obj2, obj3), "t2");
        Thread t3 = new Thread(new SyncThread(obj3, obj1), "t3");

        t1.start();
        Thread.sleep(5000);
        t2.start();
        Thread.sleep(5000);
        t3.start();
    }
}

class SyncThread implements Runnable {
    private Object obj1;
    private Object obj2;

    public SyncThread(Object o1, Object o2) {
        this.obj1 = o1;
        this.obj2 = o2;
    }

    @Override
    public void run() {
        String name = Thread.currentThread().getName();
        System.out.println(name + " acquiring lock on " + obj1);
        synchronized (obj1) {
            System.out.println(name + " acquired lock on " + obj1);
            work();
            System.out.println(name + " acquiring lock on " + obj2);
            // Nested lock: obj1 is still held while waiting for obj2.
            synchronized (obj2) {
                System.out.println(name + " acquired lock on " + obj2);
                work();
            }
            System.out.println(name + " released lock on " + obj2);
        }
        System.out.println(name + " released lock on " + obj1);
        System.out.println(name + " finished execution.");
    }

    // Simulates 30 seconds of work while the lock is held.
    private void work() {
        try {
            Thread.sleep(30000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
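In the example above, the SyncThread class implements the Runnable interface, and its run() method has to lock two Objects in sequence, using nested synchronized blocks, in order to proceed.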
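In the main method, I start three threads running SyncThread, arranged so that each one shares a resource with the next: t1 uses obj1 and obj2, t2 uses obj2 and obj3, and t3 uses obj3 and obj1. Once they are running, the following can happen: a thread locks its first object and carries on, but when it then tries to lock its second object, that object may already be locked by another thread, so it has to wait. This can create a circular dependency on the shared resources among the threads, and the result is a deadlock.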
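When I run the code above, it produces the following output. As you can see, the program hangs because of the deadlock: every thread prints its second "acquiring lock" message and none ever gets past it.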
t1 acquiring lock on java.lang.Object@6d9dd520
t1 acquired lock on java.lang.Object@6d9dd520
t2 acquiring lock on java.lang.Object@22aed3a5
t2 acquired lock on java.lang.Object@22aed3a5
t3 acquiring lock on java.lang.Object@218c2661
t3 acquired lock on java.lang.Object@218c2661
t1 acquiring lock on java.lang.Object@22aed3a5
t2 acquiring lock on java.lang.Object@218c2661
t3 acquiring lock on java.lang.Object@6d9dd520
In this small example the deadlock is easy to see, but in a real application finding and debugging a deadlock is very difficult.
Analyzing the Deadlock
To analyze a deadlock, we need to look at a Java thread dump of the application. We can take one with the VisualVM profiler or with jstack.
Below is the thread dump taken from the deadlocked program above.
2012-12-27 19:08:34
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.5-b02 mixed mode):

"Attach Listener" daemon prio=5 tid=0x00007fb0a2814000 nid=0x4007 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" prio=5 tid=0x00007fb0a2801000 nid=0x1703 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"t3" prio=5 tid=0x00007fb0a204b000 nid=0x4d07 waiting for monitor entry [0x000000015d971000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.journaldev.threads.SyncThread.run(ThreadDeadlock.java:41)
    - waiting to lock <0x000000013df2f658> (a java.lang.Object)
    - locked <0x000000013df2f678> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:722)

"t2" prio=5 tid=0x00007fb0a1073000 nid=0x4207 waiting for monitor entry [0x000000015d209000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.journaldev.threads.SyncThread.run(ThreadDeadlock.java:41)
    - waiting to lock <0x000000013df2f678> (a java.lang.Object)
    - locked <0x000000013df2f668> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:722)

"t1" prio=5 tid=0x00007fb0a1072000 nid=0x5503 waiting for monitor entry [0x000000015d86e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at com.journaldev.threads.SyncThread.run(ThreadDeadlock.java:41)
    - waiting to lock <0x000000013df2f668> (a java.lang.Object)
    - locked <0x000000013df2f658> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:722)

"Service Thread" daemon prio=5 tid=0x00007fb0a1038000 nid=0x5303 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=5 tid=0x00007fb0a1037000 nid=0x5203 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=5 tid=0x00007fb0a1016000 nid=0x5103 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=5 tid=0x00007fb0a4003000 nid=0x5003 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=5 tid=0x00007fb0a4800000 nid=0x3f03 in Object.wait() [0x000000015d0c0000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000013de75798> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
    - locked <0x000000013de75798> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=5 tid=0x00007fb0a4002000 nid=0x3e03 in Object.wait() [0x000000015cfbd000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x000000013de75320> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:503)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x000000013de75320> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=5 tid=0x00007fb0a2049800 nid=0x3d03 runnable

"GC task thread#0 (ParallelGC)" prio=5 tid=0x00007fb0a300d800 nid=0x3503 runnable
"GC task thread#1 (ParallelGC)" prio=5 tid=0x00007fb0a2001800 nid=0x3603 runnable
"GC task thread#2 (ParallelGC)" prio=5 tid=0x00007fb0a2003800 nid=0x3703 runnable
"GC task thread#3 (ParallelGC)" prio=5 tid=0x00007fb0a2004000 nid=0x3803 runnable
"GC task thread#4 (ParallelGC)" prio=5 tid=0x00007fb0a2005000 nid=0x3903 runnable
"GC task thread#5 (ParallelGC)" prio=5 tid=0x00007fb0a2005800 nid=0x3a03 runnable
"GC task thread#6 (ParallelGC)" prio=5 tid=0x00007fb0a2006000 nid=0x3b03 runnable
"GC task thread#7 (ParallelGC)" prio=5 tid=0x00007fb0a2006800 nid=0x3c03 runnable

"VM Periodic Task Thread" prio=5 tid=0x00007fb0a1015000 nid=0x5403 waiting on condition

JNI global references: 114

Found one Java-level deadlock:
=============================
"t3":
  waiting to lock monitor 0x00007fb0a1074b08 (object 0x000000013df2f658, a java.lang.Object),
  which is held by "t1"
"t1":
  waiting to lock monitor 0x00007fb0a1010f08 (object 0x000000013df2f668, a java.lang.Object),
  which is held by "t2"
"t2":
  waiting to lock monitor 0x00007fb0a1012360 (object 0x000000013df2f678, a java.lang.Object),
  which is held by "t3"

Java stack information for the threads listed above:
===================================================
"t3":
    at com.journaldev.threads.SyncThread.run(ThreadDeadlock.java:41)
    - waiting to lock <0x000000013df2f658> (a java.lang.Object)
    - locked <0x000000013df2f678> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:722)
"t1":
    at com.journaldev.threads.SyncThread.run(ThreadDeadlock.java:41)
    - waiting to lock <0x000000013df2f668> (a java.lang.Object)
    - locked <0x000000013df2f658> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:722)
"t2":
    at com.journaldev.threads.SyncThread.run(ThreadDeadlock.java:41)
    - waiting to lock <0x000000013df2f678> (a java.lang.Object)
    - locked <0x000000013df2f668> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:722)

Found 1 deadlock.
The dump above clearly shows that the deadlock was caused by the threads competing for shared resources.
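To analyze it, look at the threads in the BLOCKED state and at the resource each one is waiting to lock. Every resource has a unique ID, so we can find the thread that already holds the lock on it. For example, thread "t3" is waiting to lock resource 0x000000013df2f658, but that resource is already locked by thread "t1".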
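The same "who waits for what, held by whom" information can also be read from inside a running JVM. The sketch below is an addition to the article, not part of the original: it uses the standard java.lang.management.ThreadMXBean API to list deadlocked threads. The class name DeadlockReport, its methods, and the deliberately deadlocked two-thread demo are hypothetical and exist only for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper for illustration; not part of the original article.
class DeadlockReport {

    // Returns one line per deadlocked thread, or an empty list if there is no deadlock.
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when nothing is deadlocked
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add(info.getThreadName() + " (" + info.getThreadState() + ") waits for "
                    + info.getLockName() + " held by " + info.getLockOwnerName());
        }
        return lines;
    }

    // Starts a daemon thread that locks 'first', waits until its partner
    // holds its own first lock, and only then tries to lock 'second'.
    static void startLocker(String name, Object first, Object second, CountDownLatch bothHold) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHold.countDown();
                try {
                    bothHold.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the partner thread holds 'second'.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads let the JVM exit despite the deadlock
        t.start();
    }

    // Deadlocks two threads on purpose, then polls until the JVM reports them.
    static List<String> demo() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHold = new CountDownLatch(2);
        startLocker("t1", a, b, bothHold);
        startLocker("t2", b, a, bothHold);
        List<String> report = describeDeadlocks();
        try {
            for (int i = 0; i < 50 && report.isEmpty(); i++) {
                Thread.sleep(100);
                report = describeDeadlocks();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return report;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Calling describeDeadlocks() periodically from a monitoring thread is one possible way to notice a deadlock without taking a dump by hand. The wording of each report line is a choice made for this sketch, not a standard format.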
Once we have worked out what caused the deadlock, we need to change the code to avoid it.
Avoiding Deadlock
Here are some guidelines that help us avoid most deadlock situations.
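Avoid nested locks: This is the most common cause of deadlocks. Once you hold a lock on one resource, avoid locking another. If you only ever depend on a single resource at a time, a deadlock is practically impossible. For example, the run() method can be rewritten to release its first lock before acquiring the second, which eliminates the nested lock and with it the deadlock.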
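Lock only what you need: Acquire a lock on a resource only when you really depend on it. In the example above I lock the whole Object, but if I only needed one of its fields, I should lock on that specific field rather than on the whole object.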
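Avoid waiting indefinitely: If two threads each wait indefinitely for the other to finish through a thread join, you can get a deadlock. So when your thread has to wait for another thread to finish, it is best to give the join a maximum wait time.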
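The first and third guidelines can be sketched together. The code below is a hypothetical rewrite of the article's SyncThread, not the author's own listing: run() releases its first lock before requesting the second, and the caller waits with Thread.join(long) instead of an unbounded join(). The class name SequentialSyncThread, the runAll helper, and the short 50 ms work time are assumptions made so the example finishes quickly. Note that this rewrite only applies when the task does not need both resources at the same moment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the article's SyncThread; names and timings are illustrative.
class SequentialSyncThread implements Runnable {
    private final Object obj1;
    private final Object obj2;

    SequentialSyncThread(Object o1, Object o2) {
        this.obj1 = o1;
        this.obj2 = o2;
    }

    @Override
    public void run() {
        String name = Thread.currentThread().getName();
        synchronized (obj1) {
            System.out.println(name + " acquired lock on " + obj1);
            work();
        }
        // obj1 is released here, before obj2 is requested, so no thread
        // ever holds one lock while waiting for another.
        System.out.println(name + " released lock on " + obj1);
        synchronized (obj2) {
            System.out.println(name + " acquired lock on " + obj2);
            work();
        }
        System.out.println(name + " released lock on " + obj2);
        System.out.println(name + " finished execution.");
    }

    // Simulates a short piece of work while a lock is held.
    private void work() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs three threads over the same circular resource layout as the article
    // and joins each with a time limit. Returns true if all of them finished.
    // joinTimeoutMillis must be positive: join(0) would wait forever.
    static boolean runAll(long joinTimeoutMillis) {
        Object obj1 = new Object();
        Object obj2 = new Object();
        Object obj3 = new Object();
        List<Thread> threads = new ArrayList<>();
        threads.add(new Thread(new SequentialSyncThread(obj1, obj2), "t1"));
        threads.add(new Thread(new SequentialSyncThread(obj2, obj3), "t2"));
        threads.add(new Thread(new SequentialSyncThread(obj3, obj1), "t3"));
        for (Thread t : threads) {
            t.start();
        }
        boolean allFinished = true;
        try {
            for (Thread t : threads) {
                t.join(joinTimeoutMillis); // bounded wait instead of join()
                if (t.isAlive()) {
                    allFinished = false; // timed out; the caller can react
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return allFinished;
    }

    public static void main(String[] args) {
        System.out.println("all threads finished: " + runAll(5000));
    }
}
```

With the bounded join, a thread that is still alive after the timeout shows up as a false return value rather than as a caller that hangs, which gives the program a chance to log the problem or take a thread dump.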
Translated from the English original by newhottopic.com.