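This post records a problem I ran into in 2016 while using the LitePal library. I am digging it up to keep a record; the current LitePal seems to have changed this implementation.
At the time the project used LitePal for database storage, and the User class was persisted in the database. During initialization the app read the User record from the database asynchronously, while the main thread initialized a default User instance to use in the meantime. This sequence deadlocked with very high probability, and the cause turned out to be a design problem in the library. The analysis and conclusion follow:
Log: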
"main" prio=5 tid=1 Blocked
| group="main" sCount=1 dsCount=0 obj=0x746f9a50 self=0xf4736a00
| sysTid=17043 nice=0 cgrp=bg_non_interactive sched=0/0 handle=0xf71b8de4
| state=S schedstat=( 1381247238 86682835 492 ) utm=127 stm=11 core=5 HZ=100
| stack=0xff657000-0xff659000 stackSize=8MB
| held mutexes=
at com.meizu.lifekit.entity.UserSingleton.(UserSingleton.java:22)
waiting to lock <0x0fd0e4f2> (a java.lang.Class) held by thread 16
at com.meizu.lifekit.entity.UserSingleton.(UserSingleton.java:21)
at com.meizu.lifekit.entity.UserSingleton$InstanceHolder.(UserSingleton.java:180)
at com.meizu.lifekit.entity.UserSingleton.getInstance(UserSingleton.java:30)
at com.meizu.lifekit.data.mainpage.NewHomeFragment.(NewHomeFragment.java:82)
at java.lang.Class.newInstance!(Native method)
TID16:
"LifeKitApplication" prio=5 tid=16 Blocked
| group="main" sCount=1 dsCount=0 obj=0x132db8e0 self=0xf395b900
| sysTid=17067 nice=0 cgrp=bg_non_interactive sched=0/0 handle=0xd9418930
| state=S schedstat=( 2762923 1967693 9 ) utm=0 stm=0 core=1 HZ=100
| stack=0xd9316000-0xd9318000 stackSize=1038KB
| held mutexes=
kernel: __switch_to+0x74/0x8c
kernel: futex_wait_queue_me+0xd8/0x168
kernel: futex_wait+0xe4/0x234
kernel: do_futex+0x184/0xa14
kernel: compat_SyS_futex+0x7c/0x168
kernel: el0_svc_naked+0x20/0x28
native: #00 pc 00017698 /system/lib/libc.so (syscall+28)
native: #01 pc 000e8985 /system/lib/libart.so (_ZN3art17ConditionVariable4WaitEPNS_6ThreadE+80)
native: #02 pc 0029fb2b /system/lib/libart.so (_ZN3art7Monitor4LockEPNS_6ThreadE+394)
native: #03 pc 002a25cb /system/lib/libart.so (_ZN3art7Monitor12MonitorEnterEPNS_6ThreadEPNS_6mirror6ObjectE+270)
native: #04 pc 002d775d /system/lib/libart.so (_ZN3art10ObjectLockINS_6mirror6ObjectEEC2EPNS_6ThreadENS_6HandleIS2_EE+24)
native: #05 pc 0012bfdf /system/lib/libart.so (_ZN3art11ClassLinker15InitializeClassEPNS_6ThreadENS_6HandleINS_6mirror5ClassEEEbb.part.593+94)
native: #06 pc 0012ce93 /system/lib/libart.so (_ZN3art11ClassLinker17EnsureInitializedEPNS_6ThreadENS_6HandleINS_6mirror5ClassEEEbb+82)
native: #07 pc 002aea0d /system/lib/libart.so (_ZN3artL18Class_classForNameEP7_JNIEnvP7_jclassP8_jstringhP8_jobject+452)
native: #08 pc 0025eb19 /data/dalvik-cache/arm/system@[email protected] (Java_java_lang_Class_classForName__Ljava_lang_String_2ZLjava_lang_ClassLoader_2+132)
at java.lang.Class.classForName!(Native method)
waiting to lock <0x081409ec> (a java.lang.Class) held by thread 1
at java.lang.Class.forName(Class.java:324)
at java.lang.Class.forName(Class.java:285)
at org.litepal.LitePalBase.getSupportedFields(LitePalBase.java:170)
at org.litepal.crud.DataHandler.query(DataHandler.java:124)
at org.litepal.crud.QueryHandler.onFindLast(QueryHandler.java:96)
at org.litepal.crud.DataSupport.findLast(DataSupport.java:576)
locked <0x0fd0e4f2> (a java.lang.Class)
at org.litepal.crud.DataSupport.findLast(DataSupport.java:561)
locked <0x0fd0e4f2> (a java.lang.Class)
at com.meizu.lifekit.LifeKitApplication$ApplicationHandler.handleMessage(LifeKitApplication.java:117)
at android.os.Handler.dispatchMessage(Handler.java:111)
at android.os.Looper.loop(Looper.java:207)
at android.os.HandlerThread.run(HandlerThread.java:61)
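Cause: the User table class is instantiated while the User table is being queried, and the two threads each hold a lock the other needs, which produces a deadlock.
Instantiating User holds the User.class lock and requests the lock of its parent class, DataSupport.class. Querying the User table holds the DataSupport.class lock and requests the User.class lock. Each side holds the resource the other is waiting for and never releases it, so both threads are stuck.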
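The lock ordering described above can be reproduced in plain Java. The following is a minimal sketch, not LitePal code: `Base` and `Model` are hypothetical stand-ins for DataSupport and User, and the two latches exist only to force the unlucky interleaving that the real app hit by timing. Because VMs differ in how they implement the class-initialization lock, the sketch does not rely on it colliding with the Class monitor; instead `Model`'s static initializer explicitly calls a synchronized static method on `Base`, which recreates the same "initializing thread wants the parent's class lock" step.

```java
import java.util.concurrent.CountDownLatch;

class ClassInitDeadlockDemo {
    // The latches only force the bad interleaving; they are not part of the bug.
    static final CountDownLatch queryHoldsLock = new CountDownLatch(1);
    static final CountDownLatch initStarted = new CountDownLatch(1);

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical stand-in for DataSupport: static synchronized methods lock Base.class.
    static class Base {
        static synchronized void findLast(String className) throws ClassNotFoundException {
            queryHoldsLock.countDown();   // this thread now holds the Base.class monitor
            awaitQuietly(initStarted);    // wait until Model's initializer is running elsewhere
            Class.forName(className);     // must wait for Model's initialization to finish
        }

        static synchronized void touch() { }
    }

    // Hypothetical stand-in for User.
    static class Model extends Base {
        static {
            initStarted.countDown();      // Model is now being initialized by this thread
            awaitQuietly(queryHoldsLock); // make sure the query thread holds Base.class
            Base.touch();                 // needs the Base.class monitor: blocks forever
        }
    }

    // Returns true if both threads are still blocked after waiting timeoutMs on each.
    static boolean reproduce(long timeoutMs) {
        String name = Model.class.getName(); // a class literal does not trigger initialization
        Thread query = new Thread(() -> {
            try {
                Base.findLast(name);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }, "query");
        Thread init = new Thread(() -> new Model(), "init");
        query.setDaemon(true);               // daemon threads let the JVM exit despite the deadlock
        init.setDaemon(true);
        query.start();
        init.start();
        try {
            query.join(timeoutMs);
            init.join(timeoutMs);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return query.isAlive() && init.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads blocked: " + reproduce(1000));
    }
}
```

The "query" thread plays the role of thread 16 in the log (it holds the class lock inside `findLast` and is blocked in `Class.forName`), and the "init" thread plays the role of the main thread (it is in the middle of initializing the model class and is blocked on the parent's class lock).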
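Here is what happens when the User class is instantiated. The main thread starts initializing the User class and holds the User.class lock, so User.class is in the being_initialized state. It then finds that DataSupport is the parent class and tries to initialize DataSupport (that is, to move it into being_initialized). However, DataSupport.class was already accessed by the query on the worker thread, which started initializing it first: it is already in being_initialized, but the query thread in turn has to wait for User's being_initialized state to complete. So the main thread, while instantiating User, holds the User.class lock and waits for the DataSupport.class lock. The worker thread, while querying User, initialized DataSupport first, holds the DataSupport.class lock, and waits for the User.class lock. The two locks are acquired in opposite order, and the result is a deadlock. If this counts as a deadlock in the program's own logic, then with LitePal we cannot query a table while another thread is instantiating that table's class. I would call it a design flaw in the library.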
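One way to live with this restriction is to make sure the model class is fully initialized on a single thread before any concurrent query starts. The following is a minimal sketch of that idea under stated assumptions: `Base` and `Model` are again hypothetical stand-ins for DataSupport and User, the method name `bothFinish` is made up for illustration, and none of this has been checked against LitePal itself.

```java
class EagerInitSketch {
    // Hypothetical stand-in for DataSupport.
    static class Base {
        static synchronized void findLast(String className) throws ClassNotFoundException {
            Class.forName(className); // returns immediately once the class is initialized
        }

        static synchronized void touch() { }
    }

    // Hypothetical stand-in for User; its initializer needs the Base.class monitor.
    static class Model extends Base {
        static {
            Base.touch();
        }
    }

    // Returns true if both worker threads finish within the timeout.
    static boolean bothFinish(long timeoutMs) {
        String name = Model.class.getName();
        try {
            // Run Model's static initializer here, before any worker thread exists,
            // so no thread can ever be caught in the middle of initializing it.
            Class.forName(name);

            Thread query = new Thread(() -> {
                try {
                    Base.findLast(name);
                } catch (ClassNotFoundException e) {
                    throw new IllegalStateException(e);
                }
            }, "query");
            Thread init = new Thread(() -> new Model(), "init");
            query.setDaemon(true);
            init.setDaemon(true);
            query.start();
            init.start();
            query.join(timeoutMs);
            init.join(timeoutMs);
            return !query.isAlive() && !init.isAlive();
        } catch (ClassNotFoundException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + bothFinish(2000));
    }
}
```

In the app described here, the equivalent would be to touch the User class on the main thread before posting the asynchronous query. That only removes the class-initialization half of the cycle; whether it is enough for a given LitePal version is an assumption that would need to be verified.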
Why a lock is held during class initialization, reference link: https://yq.aliyun.com/articles/73595
The deadlock issue I filed on Guo Lin's GitHub at the time