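One thing first: the CSDN blog editor is really bad, and it had a serious vulnerability (it could be used to steal CSDN accounts; I tested that successfully myself, and I'll skip the details, haha). I reported it to them, they fixed it and sent me a T-shirt and a backpack, and then after an upgrade the vulnerability came back.
The T-shirt is dirty and the backpack is worn out, so it's time for new ones. I contacted customer service over QQ, with no real answer so far.....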
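Now, on to the actual topic.
The floating-likes animation in our live room is drawn with a SurfaceView. When the live room Activity goes to the background (and the SurfaceView's surface is destroyed), an ANR shows up in rare cases. Reading the SurfaceView source, I found a pitfall: a lot of people actually use it the wrong way, and the only reason they haven't hit the ANR is luck.
1. How to find the ANR log
As soon as the ANR appeared, my first thought was to get hold of the ANR log. It can be pulled with the following command:
adb pull /data/anr/traces.txt
That downloads the ANR log to your computer.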
2. Analyzing the ANR log
Open the ANR log and you can see the stack of the main thread:
"main" prio=5 tid=1 Waiting
| group="main" sCount=1 dsCount=0 obj=0x76c353e8 self=0xf4a64500
| sysTid=7047 nice=-11 cgrp=default sched=0/0 handle=0xf7276b4c
| state=S schedstat=( 0 0 0 ) utm=2795 stm=388 core=7 HZ=100
| stack=0xff030000-0xff032000 stackSize=8MB
| held mutexes=
at java.lang.Object.wait!(Native method)
- waiting on <0x03fd06cb> (a java.lang.Object)
at java.lang.Thread.parkFor$(Thread.java:1220)
- locked <0x03fd06cb> (a java.lang.Object)
at sun.misc.Unsafe.park(Unsafe.java:299)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:810)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:843)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1172)
at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:196)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:257)
at android.view.SurfaceView.updateWindow(SurfaceView.java:638)
at android.view.SurfaceView.onWindowVisibilityChanged(SurfaceView.java:316)
at android.view.View.dispatchWindowVisibilityChanged(View.java:10434)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1328)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1328)
at android.view.ViewGroup.dispatchWindowVisibilityChanged(ViewGroup.java:1328)
... repeated 1 times
at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:1750)
at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:1437)
at android.view.ViewRootImpl$TraversalRunnable.run(ViewRootImpl.java:7397)
at android.view.Choreographer$CallbackRecord.run(Choreographer.java:920)
at android.view.Choreographer.doCallbacks(Choreographer.java:695)
at android.view.Choreographer.doFrame(Choreographer.java:631)
at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:906)
at android.os.Handler.handleCallback(Handler.java:739)
at android.os.Handler.dispatchMessage(Handler.java:95)
at android.os.Looper.loop(Looper.java:158)
at android.app.ActivityThread.main(ActivityThread.java:7237)
at java.lang.reflect.Method.invoke!(Native method)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1230)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1120)
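A real traces.txt contains every thread of the process, so it can help to pull out just the "main" block before reading it. The following is a minimal sketch in plain Java, assuming the layout shown above (a quoted thread name on the header line, and a blank line or the next quoted header ending the block). AnrTraceReader and its methods are hypothetical helpers, not part of any Android tool.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for reading a traces.txt dump; not an Android API. */
final class AnrTraceReader {
    private AnrTraceReader() {}

    /**
     * Returns the lines of the thread block whose header starts with the quoted
     * thread name. The block ends at the first blank line or the next header.
     */
    static List<String> threadBlock(String traces, String threadName) {
        List<String> block = new ArrayList<>();
        String header = "\"" + threadName + "\" ";
        boolean inside = false;
        for (String raw : traces.split("\\R")) {
            String line = raw.trim();
            if (!inside) {
                if (line.startsWith(header)) {
                    inside = true;
                    block.add(line);
                }
            } else {
                if (line.isEmpty() || line.startsWith("\"")) {
                    break;
                }
                block.add(line);
            }
        }
        return block;
    }

    /** True if any stack frame ("at ...") in the block contains the given fragment. */
    static boolean blockedIn(List<String> block, String fragment) {
        for (String line : block) {
            if (line.startsWith("at ") && line.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

Calling threadBlock(text, "main") and then blockedIn(block, "SurfaceView.updateWindow") is enough to confirm that the main thread is parked inside updateWindow, which is the frame to look at in the next section.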
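3. Analyzing the SurfaceView source code
The log shows that the ANR is caused by SurfaceView, which is exactly what the floating-likes animation in the live room is built on. The main thread is blocked forever at ReentrantLock.lock() inside SurfaceView.updateWindow, and that is what triggers the ANR.
Opening the SurfaceView source, updateWindow does indeed call mSurfaceLock.lock().
mSurfaceLock is defined like this: final ReentrantLock mSurfaceLock = new ReentrantLock();
So somewhere an unlock must have been skipped, leaving lock() unable to ever acquire the lock. That reminded me that the Canvas is locked as well, and that the developer is responsible for unlocking it promptly.
The canvas drawing code itself looked fine, though, and doing the unlock in finally is correct too: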
Canvas canvas = mHolder.lockCanvas();
if (canvas != null) {
    try {
        for (Heart heart : mHeartArray) {
            canvas.drawBitmap(heart.bitmap, null, heart.dst, mPaint);
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        mHolder.unlockCanvasAndPost(canvas);
    }
}
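I kept switching the Activity between foreground and background, because the surface of a SurfaceView is destroyed when it becomes invisible and recreated when it becomes visible again. Eventually the ANR reproduced, and I saw this exception:
Surface has already been released.
So I started reading the source in detail, beginning with the implementation of unlockCanvasAndPost, because the unlock might be what failed.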
// Implementation of SurfaceHolder inside SurfaceView
@Override
public void unlockCanvasAndPost(Canvas canvas) {
    mSurface.unlockCanvasAndPost(canvas);
    mSurfaceLock.unlock();
}
// Surface class
public void unlockCanvasAndPost(Canvas canvas) {
    synchronized (mLock) {
        checkNotReleasedLocked();
        //...
    }
}
// This is where the exception is thrown. If it is thrown here, mSurfaceLock.unlock() is never executed, and the next lock() ends in an ANR.
// The exception is thrown when mNativeObject == 0, so the next question is when mNativeObject gets set to 0.
private void checkNotReleasedLocked() {
    if (mNativeObject == 0) {
        throw new IllegalStateException("Surface has already been released.");
    }
}
// This is the method that can set mNativeObject to 0; next, find out who calls it.
private void setNativeObjectLocked(long ptr) {
    //...
    mNativeObject = ptr;
    //...
}
// A search shows that setNativeObjectLocked(0) is called here.
@Deprecated
public void transferFrom(Surface other) {
    if (other != this) {
        //...
        other.setNativeObjectLocked(0);
        //...
    }
}
// SurfaceView calls transferFrom here
/** @hide */
protected void updateWindow(boolean force, boolean redrawNeeded) {
    mSurfaceLock.lock();
    try {
        //...
    } finally {
        mSurfaceLock.unlock();
    }
    try {
        ....
        if (mSurfaceCreated && (surfaceChanged || (!visible && visibleChanged))) {
            mSurfaceCreated = false;
            if (mSurface.isValid()) {
                callbacks = getSurfaceCallbacks();
                for (SurfaceHolder.Callback c : callbacks) {
                    c.surfaceDestroyed(mSurfaceHolder);
                }
            }
        }
        mSurface.transferFrom(mNewSurface);
        ....
    } finally {
        //...
    }
}
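The SurfaceView lifecycle callbacks (SurfaceHolder.Callback) are:
surfaceCreated: called when the view goes from invisible to visible
surfaceChanged: called when the format or size of the surface changes
surfaceDestroyed: called when the view goes from visible to invisible
Following the steps that reproduce the bug: tapping the chat button opens the chat page, the live room becomes invisible, the SurfaceView's surface is destroyed, and surfaceDestroyed is called.
As the code above shows, surfaceDestroyed is called back first, and then mSurface.transferFrom(mNewSurface) runs, after which mNativeObject can be 0.
If unlockCanvasAndPost happens to be called right at that moment, it throws, mSurfaceLock.unlock() is skipped, and the next time the SurfaceView's window is updated the main thread hits the ANR.
Cause of the ANR, in short: the SurfaceView was destroyed between lockCanvas and unlockCanvasAndPost, so the unlock failed and the lock was never released, which leaves the main thread stuck as in a deadlock.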
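To summarize how this ANR happens:
Step 1: mHolder.lockCanvas() runs and acquires the lock successfully.
Step 2: at that exact moment the SurfaceView is destroyed; surfaceDestroyed runs and mNativeObject is set to 0.
Step 3: unlockCanvasAndPost is called, but because mNativeObject is 0 it throws, and the lock is not released.
Step 4: the SurfaceView is recreated and tries to lock; since the previous lock was never released, it waits forever.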
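The four steps above can be replayed in plain Java, without Android, to see that a ReentrantLock leaked this way really does stay locked. This is only a sketch under stated assumptions: FakeHolder is a hypothetical stand-in for the holder and Surface pair, it keeps the same "check first, unlock second" order as the excerpts, and step 2 is triggered from the drawing thread itself so the run is deterministic.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Plain-Java replay of the leaked lock; every name here is hypothetical. */
final class SurfaceLockLeakDemo {
    private SurfaceLockLeakDemo() {}

    /** Mimics the holder: lockCanvas() takes the lock, unlockCanvasAndPost() releases it. */
    static final class FakeHolder {
        final ReentrantLock surfaceLock = new ReentrantLock();
        private volatile long nativeObject = 1; // non-zero means the surface is alive

        Object lockCanvas() {
            surfaceLock.lock();
            return new Object(); // stands in for the Canvas
        }

        void unlockCanvasAndPost(Object canvas) {
            // Same order as in the excerpts: the validity check runs before the unlock.
            if (nativeObject == 0) {
                throw new IllegalStateException("Surface has already been released.");
            }
            surfaceLock.unlock();
        }

        void release() {
            nativeObject = 0;
        }
    }

    /** Runs steps 1-4 and reports whether the calling thread could still take the lock. */
    static boolean mainThreadCanLockAfterLeak() {
        FakeHolder holder = new FakeHolder();
        Thread drawThread = new Thread(() -> {
            Object canvas = holder.lockCanvas();       // step 1: lock acquired
            holder.release();                          // step 2: surface destroyed mid-draw
            try {
                holder.unlockCanvasAndPost(canvas);    // step 3: throws, unlock is skipped
            } catch (IllegalStateException e) {
                // swallowed, like the catch block in the drawing code
            }
        });
        drawThread.start();
        try {
            drawThread.join();
            // step 4: the real code calls lock() and waits forever; a timeout is used here
            boolean acquired = holder.surfaceLock.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                holder.surfaceLock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The demo uses tryLock with a timeout only so that it terminates. The main thread in the trace calls the blocking lock(), which is why the app stops responding instead of failing fast.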
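The fix has two steps:
1. Add a synchronized block around the canvas drawing, so that the whole lock, draw, unlock sequence runs as one unit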
synchronized (this) {
    if (mDrawFlag) {
        Canvas canvas = mHolder.lockCanvas();
        if (canvas != null) {
            try {
                for (Heart heart : mHeartArray) {
                    canvas.drawBitmap(heart.bitmap, null, heart.dst, mPaint);
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                try {
                    mHolder.unlockCanvasAndPost(canvas);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
2. In surfaceDestroyed, clear the draw flag inside the same synchronized lock
@Override
public void surfaceDestroyed(SurfaceHolder holder) {
    synchronized (this) {
        mDrawFlag = false;
    }
}
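In short, the fix for this ANR is to prevent the SurfaceView from being destroyed between lockCanvas and unlockCanvasAndPost, which is what the synchronized blocks in the two places above do.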
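To see why the two synchronized blocks are enough, here is a plain-Java sketch of the same idea, again without Android. GuardedDrawDemo, drawMonitor and the other names are hypothetical. One deliberate difference from the snippets above: the sketch synchronizes on a dedicated monitor object rather than on this, on the assumption that the draw loop and the callback may live in different objects, in which case synchronized (this) would lock two different monitors and protect nothing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Plain-Java sketch of the guarded draw pass; every name here is hypothetical. */
final class GuardedDrawDemo {
    private final Object drawMonitor = new Object();            // shared by both code paths
    private final ReentrantLock surfaceLock = new ReentrantLock();
    private boolean drawFlag = true;                             // guarded by drawMonitor
    private volatile boolean released = false;                   // surface is gone once true

    /** One draw pass; returns false when drawing has been switched off. */
    boolean drawOnce(CountDownLatch midDraw, long workMillis) {
        synchronized (drawMonitor) {
            if (!drawFlag) {
                return false;
            }
            surfaceLock.lock();                                  // stands in for lockCanvas()
            try {
                midDraw.countDown();                             // now between lock and unlock
                Thread.sleep(workMillis);                        // pretend to draw
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                unlockCanvasAndPost();
            }
            return true;
        }
    }

    private void unlockCanvasAndPost() {
        if (released) {
            throw new IllegalStateException("Surface has already been released.");
        }
        surfaceLock.unlock();
    }

    /** Callback side: waits for any running draw pass, then switches drawing off. */
    void surfaceDestroyed() {
        synchronized (drawMonitor) {
            drawFlag = false;
        }
        released = true; // simulates the framework releasing the surface after the callback
    }

    /** Returns { lock is free after destroy, later draw pass was skipped }. */
    static boolean[] run() {
        GuardedDrawDemo demo = new GuardedDrawDemo();
        CountDownLatch midDraw = new CountDownLatch(1);
        Thread drawThread = new Thread(() -> demo.drawOnce(midDraw, 100));
        drawThread.start();
        try {
            midDraw.await();          // the draw thread is now holding the "canvas"
            demo.surfaceDestroyed();  // blocks until that draw pass has unlocked
            drawThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean lockFree = !demo.surfaceLock.isLocked();
        boolean laterDrawSkipped = !demo.drawOnce(new CountDownLatch(1), 0);
        return new boolean[] { lockFree, laterDrawSkipped };
    }
}
```

Because surfaceDestroyed cannot return while a draw pass is between lock and unlock, the release can only happen after the unlock, and any later draw pass sees the cleared flag and never locks at all. The trade-off is that the thread calling surfaceDestroyed waits for the current draw pass, so each pass should stay short.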