blockCanary原理

blockCanary

对于android里面的性能优化,最主要的问题就是UI线程的阻塞导致的,对于如何准确的计算UI的绘制所耗费的时间,是非常有必要的,blockCanary是基于这个需求出现的,同样的,也是基于LeakCanary,和LeakCanary有着显示页面和堆栈信息。

使用

首先在gradle引入

implementation 'com.github.markzhai:blockcanary-android:1.5.0'

然后Application里面进行初始化和start

 BlockCanary.install(this, new BlockCanaryContext()).start();

原理:

其中BlockCanaryContext表示的就是我们监测的某些参数,包括卡顿的阈值、输出文件的路径等等

//默认卡顿阈值为1000ms
public int provideBlockThreshold() {
        return 1000;
    }
//输出的log
 public String providePath() {
        return "/blockcanary/";
    }
//支持文件上传
public void upload(File zippedFile) {
        throw new UnsupportedOperationException();
    }
//可以在卡顿提供自定义操作
@Override
    public void onBlock(Context context, BlockInfo blockInfo) {

    }

其中,init只是创建出BlockCanary实例。主要是start方法的操作。

   /**
     * Start monitoring.
     */
    public void start() {
        if (!mMonitorStarted) {
            mMonitorStarted = true;
            Looper.getMainLooper().setMessageLogging(mBlockCanaryCore.monitor);
        }
    }

其实就是给主线程的Looper设置一个monitor。
我们可以先看看主线程的looper实现。

public static void loop() {
    ...
    for (;;) {
        ...
        // This must be in a local variable, in case a UI event sets the logger
        Printer logging = me.mLogging;
        if (logging != null) {
            logging.println(">>>>> Dispatching to " + msg.target + " " +
                    msg.callback + ": " + msg.what);
        }
        msg.target.dispatchMessage(msg);
        if (logging != null) {
            logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
        }
        ...
    }
}

在上面的loop循环的代码中,msg.target.dispatchMessage就是我们UI线程收到每一个消息需要执行的操作,都在其内部执行。
系统也在其执行的前后都会执行logging类的print的方法,这个方法是我们可以自定义的。所以只要我们在运行的前后都添加一个时间戳,用运行后的时间减去运行前的时间,一旦这个时间超过了我们设定的阈值,那么就可以说这个操作卡顿,阻塞了UI线程,最后通过dump出此时的各种信息,来分析各种性能瓶颈。
那么接下来可以看看这个monitor的println方法。

@Override
    public void println(String x) {
        //如果当前是在调试中,那么直接返回,不做处理
        if (mStopWhenDebugging && Debug.isDebuggerConnected()) {
            return;
        }
        if (!mPrintingStarted) {
            //执行操作前
            mStartTimestamp = System.currentTimeMillis();
            mStartThreadTimestamp = SystemClock.currentThreadTimeMillis();
            mPrintingStarted = true;
            startDump();
        } else {
            //执行操作后
            final long endTime = System.currentTimeMillis();
            mPrintingStarted = false;
            //是否卡顿
            if (isBlock(endTime)) {
                notifyBlockEvent(endTime);
            }
            stopDump();
        }
    }

在ui操作执行前,将会记录当前的时间戳,同时会startDump。
在ui操作执行后,将会计算当前是否卡顿了,如果卡顿了,将会回调到onBlock的onBlock方法。同时将会停止dump。
为什么操作之前就开启了startDump,而操作执行之后就stopDump呢?

private void startDump() {
        if (null != BlockCanaryInternals.getInstance().stackSampler) {
            BlockCanaryInternals.getInstance().stackSampler.start();
        }

        if (null != BlockCanaryInternals.getInstance().cpuSampler) {
            BlockCanaryInternals.getInstance().cpuSampler.start();
        }
    }
public void start() {
        if (mShouldSample.get()) {
            return;
        }
        mShouldSample.set(true);

        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
        HandlerThreadFactory.getTimerThreadHandler().postDelayed(mRunnable,
                BlockCanaryInternals.getInstance().getSampleDelay());
    }

其实startDump的时候并没有马上start,而是会postDelay一个runnable,这个runnable就是执行dump的真正的操作,delay的时间就是我们设置的阈值的0.8
也就是,一旦我们的stop在设置的延迟时间之前执行,就不会真正的执行dump操作。

 public void stop() {
        if (!mShouldSample.get()) {
            return;
        }
        mShouldSample.set(false);
        HandlerThreadFactory.getTimerThreadHandler().removeCallbacks(mRunnable);
    }

只有当stop操作在设置的延迟时间之后执行,才会执行dump操作。

 private Runnable mRunnable = new Runnable() {
        @Override
        public void run() {
            doSample();

            if (mShouldSample.get()) {
                HandlerThreadFactory.getTimerThreadHandler()
                        .postDelayed(mRunnable, mSampleInterval);
            }
        }
    };

这个doSameple分别会dump出stack信息和cpu信息。
cpu:

  try {
            cpuReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/stat")), BUFFER_SIZE);
            String cpuRate = cpuReader.readLine();
            if (cpuRate == null) {
                cpuRate = "";
            }

            if (mPid == 0) {
                mPid = android.os.Process.myPid();
            }
            pidReader = new BufferedReader(new InputStreamReader(
                    new FileInputStream("/proc/" + mPid + "/stat")), BUFFER_SIZE);
            String pidCpuRate = pidReader.readLine();
            if (pidCpuRate == null) {
                pidCpuRate = "";
            }

            parse(cpuRate, pidCpuRate);
        }

stack:

 protected void doSample() {
        StringBuilder stringBuilder = new StringBuilder();

        for (StackTraceElement stackTraceElement : mCurrentThread.getStackTrace()) {
            stringBuilder
                    .append(stackTraceElement.toString())
                    .append(BlockInfo.SEPARATOR);
        }

        synchronized (sStackMap) {
            if (sStackMap.size() == mMaxEntryCount && mMaxEntryCount > 0) {
                sStackMap.remove(sStackMap.keySet().iterator().next());
            }
            sStackMap.put(System.currentTimeMillis(), stringBuilder.toString());
        }
    }

这样,整个blockCanary的执行过程就完毕了。

ANR

当卡顿时间大于一定值之后,将会造成ANR,那么Android系统的ANR是如何检测出来的呢?其实就是通过Watchdog来实现的,这个Watchdog是一个线程。

public class Watchdog extends Thread {
}

我们主要看一下其中的run方法的实现。

 @Override
    public void run() {
        boolean waitedHalf = false;
        while (true) {
            final ArrayList blockedCheckers;
            final String subject;
            final boolean allowRestart;
            int debuggerWasConnected = 0;
            synchronized (this) {
                long timeout = CHECK_INTERVAL;
                // Make sure we (re)spin the checkers that have become idle within
                // this wait-and-check interval
                for (int i=0; i 0) {
                    debuggerWasConnected--;
                }

                // NOTE: We use uptimeMillis() here because we do not want to increment the time we
                // wait while asleep. If the device is asleep then the thing that we are waiting
                // to timeout on is asleep as well and won't have a chance to run, causing a false
                // positive on when to kill things.
                long start = SystemClock.uptimeMillis();
                while (timeout > 0) {
                    if (Debug.isDebuggerConnected()) {
                        debuggerWasConnected = 2;
                    }
                    try {
                        wait(timeout);
                    } catch (InterruptedException e) {
                        Log.wtf(TAG, e);
                    }
                    if (Debug.isDebuggerConnected()) {
                        debuggerWasConnected = 2;
                    }
                    timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
                }

在这个run方法中,会开启一个死循环,主要用于持续检测ANR

while (true) {
 
 }

通过wait,设置每一次休眠时间,

 long start = SystemClock.uptimeMillis();
                while (timeout > 0) {
                    if (Debug.isDebuggerConnected()) {
                        debuggerWasConnected = 2;
                    }
                    try {
                        wait(timeout);
                    } catch (InterruptedException e) {
                        Log.wtf(TAG, e);
                    }
                    if (Debug.isDebuggerConnected()) {
                        debuggerWasConnected = 2;
                    }
                    timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
                }

当timeout计算完毕之后,会尝试获取当前各个的线程的状态

final int waitState = evaluateCheckerCompletionLocked();
   private int evaluateCheckerCompletionLocked() {
        int state = COMPLETED;
        for (int i=0; i
public int getCompletionStateLocked() {
            if (mCompleted) {
                return COMPLETED;
            } else {
                long latency = SystemClock.uptimeMillis() - mStartTime;
                if (latency < mWaitMax/2) {
                    return WAITING;
                } else if (latency < mWaitMax) {
                    return WAITED_HALF;
                }
            }
            return OVERDUE;
        }

一旦有线程等待时间超过了最大等待时间,则表示当前已经有ANR。需要dump此时的堆栈信息。

if (waitState == COMPLETED) {
                    // The monitors have returned; reset
                    waitedHalf = false;
                    continue;
                } else if (waitState == WAITING) {
                    // still waiting but within their configured intervals; back off and recheck
                    continue;
                } else if (waitState == WAITED_HALF) {
                    if (!waitedHalf) {
                        // We've waited half the deadlock-detection interval.  Pull a stack
                        // trace and wait another half.
                        ArrayList pids = new ArrayList();
                        pids.add(Process.myPid());
                        ActivityManagerService.dumpStackTraces(true, pids, null, null,
                            getInterestingNativePids());
                        waitedHalf = true;
                    }
                    continue;
                }

此外,还有一个第三方库,ANRWatchDog,也是用来检测Anr的,其实原理更加简单,

public void run() {
        setName("|ANR-WatchDog|");

        int lastTick;
        int lastIgnored = -1;
        while (!isInterrupted()) {
            lastTick = _tick;
            _uiHandler.post(_ticker);
            try {
                Thread.sleep(_timeoutInterval);
            }
            catch (InterruptedException e) {
                _interruptionListener.onInterrupted(e);
                return ;
            }

            // If the main thread has not handled _ticker, it is blocked. ANR.
            if (_tick == lastTick) {
                if (!_ignoreDebugger && Debug.isDebuggerConnected()) {
                    if (_tick != lastIgnored)
                        Log.w("ANRWatchdog", "An ANR was detected but ignored because the debugger is connected (you can prevent this with setIgnoreDebugger(true))");
                    lastIgnored = _tick;
                    continue ;
                }

                ANRError error;
                if (_namePrefix != null)
                    error = ANRError.New(_namePrefix, _logThreadsWithoutStackTrace);
                else
                    error = ANRError.NewMainOnly();
                _anrListener.onAppNotResponding(error);
                return;
            }
        }
    }

它会在线程中,利用uiHanlder抛出一个计数器,然后wait指定时间,一旦等待时间到达,那么它会检查计数的值是否发生改变,如果没有发生改变,表示uiHandler的计算方法并没有执行到。也就是出现了Anr,此时需要dump堆栈信息。

你可能感兴趣的:(Android)