An In-Depth Look at the Handler Mechanism

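I came across this question on Zhihu: "Why doesn't the Android main thread freeze because of the infinite loop in Looper.loop()?" That prompted me to reread the Handler source. Android's message mechanism is built on a pipe plus epoll: when messages are pending they are executed one after another, and when there are none the thread calls epoll_wait and waits to be woken. Because lifecycle callbacks and UI drawing in Android are all driven through Handler, the loop itself does not block or freeze the app.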

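Android uses an infinite loop so that the main thread never exits during its lifetime. Inside the loop, when there is nothing to process, the thread sleeps in epoll while still listening for events; a sleeping thread gives up the CPU, which saves system resources.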

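When next() fetches a message and nothing is currently due, it calls pollOnce on the native Looper, which uses epoll_wait to watch the pipe created earlier, and the main thread goes to sleep.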

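Waking it up is the job of some other thread. Those familiar with the Android source know that AMS delivers lifecycle messages to an Activity over Binder. ActivityThread holds the Binder server side, ApplicationThread, which runs on a Binder thread. When AMS sends a message, ApplicationThread posts it to the main thread through the main thread's Handler; a byte is written to the pipe, which wakes epoll_wait and therefore the main thread.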

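Touch events work in a similar way: IMS, in cooperation with WMS, delivers input events to the app process. The events themselves arrive over an InputChannel (a socket pair whose file descriptor is registered with the main thread's Looper) rather than over the wake pipe, but data arriving on that descriptor wakes the main thread from epoll_wait in the same manner. So the main thread simply sleeps until it is woken; it does not freeze just because it is the one thread running the loop.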
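The idea that an infinite loop can sit idle without burning CPU, and be woken by another thread, can be pictured with a plain Java sketch. This is only an analogy under stated assumptions: `SleepingLoopSketch`, `post`, `quit` and `runDemo` are hypothetical names, and a `LinkedBlockingQueue` stands in for the pipe + epoll pair that Android actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a looper thread; this is not Android's Looper.
class SleepingLoopSketch {
    private static final Runnable QUIT = () -> { };
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    final List<String> log = new ArrayList<>();   // written only by the loop thread
    final Thread loopThread = new Thread(this::loop, "sketch-main");

    private void loop() {
        try {
            for (;;) {                         // the "infinite loop"
                Runnable task = queue.take();  // blocks while the queue is empty
                if (task == QUIT) {
                    return;
                }
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Called from other threads; adding an element wakes the blocked loop.
    void post(Runnable task) { queue.add(task); }

    void quit() { queue.add(QUIT); }

    /** Posts two tasks from the calling thread, quits, and returns what the loop ran. */
    static List<String> runDemo() {
        SleepingLoopSketch s = new SleepingLoopSketch();
        s.loopThread.start();
        s.post(() -> s.log.add("message-1"));
        s.post(() -> s.log.add("message-2"));
        s.quit();
        try {
            s.loopThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return s.log;
    }
}
```

While the queue is empty the loop thread is parked inside `take()`, which is the role epoll_wait plays for the real main thread.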

1. Creating the Looper

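An earlier Java-level article, "A Brief Look at the Relationship Between Handler, Looper, Message and MessageQueue" (Handler,Looper,Message,MessageQueue之间关系浅析), covered how these main classes relate to one another. Starting from Looper.prepare(), we know that this is where the Looper and its MessageQueue are initialized.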

public final class Looper {
    private Looper(boolean quitAllowed) {
        mQueue = new MessageQueue(quitAllowed);
        mThread = Thread.currentThread();
    }
}
public final class MessageQueue {
    // Native method that initializes the MessageQueue
    private native static long nativeInit();
    MessageQueue(boolean quitAllowed) {
        mQuitAllowed = quitAllowed;
        mPtr = nativeInit();
    }
}
Note the native method here. Its corresponding JNI function is:

/*@path \frameworks\base\core\jni\android_os_MessageQueue.cpp*/
static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
    // Create a native NativeMessageQueue
    NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
    if (!nativeMessageQueue) {
        jniThrowRuntimeException(env, "Unable to allocate native queue");
        return 0;
    }

    // Increment the strong reference count
    nativeMessageQueue->incStrong(env);
    // Return the address of the newly created NativeMessageQueue
    return reinterpret_cast<jlong>(nativeMessageQueue);
}
This creates a native NativeMessageQueue object and passes its address back to the Java layer; in other words, the mPtr field in MessageQueue points to the native NativeMessageQueue object.

Next, the initialization of NativeMessageQueue:

/*@path \frameworks\base\core\jni\android_os_MessageQueue.cpp*/
class NativeMessageQueue : public MessageQueue {
public:
    NativeMessageQueue();
    virtual ~NativeMessageQueue();

    virtual void raiseException(JNIEnv* env, const char* msg, jthrowable exceptionObj);

    void pollOnce(JNIEnv* env, int timeoutMillis);

    void wake();

private:
    bool mInCallback;
    jthrowable mExceptionObj;
};

// Constructor:
NativeMessageQueue::NativeMessageQueue() : mInCallback(false), mExceptionObj(NULL) {
    // Similar to the ThreadLocal mechanism
    mLooper = Looper::getForThread();
    if (mLooper == NULL) {
        // Create the Looper
        mLooper = new Looper(false);
        Looper::setForThread(mLooper);
    }
}
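So the NativeMessageQueue constructor creates the (native) Looper, whereas on the Java side it is the Looper constructor that creates the MessageQueue: exactly the reverse. A ThreadLocal-like mechanism is used here as well, so that each thread has its own Looper. Next, the Looper constructor: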

Looper is declared in \system\core\include\utils\Looper.h; its constructor is:

/*@path \system\core\libutils\Looper.cpp*/
// allowNonCallbacks is passed in as false
Looper::Looper(bool allowNonCallbacks) :
        mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
        mResponseIndex(0), mNextMessageUptime(LLONG_MAX) {
    int wakeFds[2];

    // Create a pipe
    int result = pipe(wakeFds);
    // Keep the pipe's read and write file descriptors
    mWakeReadPipeFd = wakeFds[0];
    mWakeWritePipeFd = wakeFds[1];

    // Put both ends in non-blocking mode (O_NONBLOCK)
    result = fcntl(mWakeReadPipeFd, F_SETFL, O_NONBLOCK);
    result = fcntl(mWakeWritePipeFd, F_SETFL, O_NONBLOCK);

    mIdling = false;

    // Create an epoll instance
    mEpollFd = epoll_create(EPOLL_SIZE_HINT);

    // Event structure used to register the fd; epoll_wait later reports ready events in structures of this type
    struct epoll_event eventItem;
    memset(& eventItem, 0, sizeof(epoll_event));  // Zero the structure (no memory is allocated here)

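    // EPOLLIN: report when the fd has data to read. Without EPOLLET this is
    // level-triggered, so epoll_wait keeps reporting the fd until the pending
    // data has been read; that is why awoken() drains the pipe.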
    eventItem.events = EPOLLIN;
    eventItem.data.fd = mWakeReadPipeFd;
    // Register mWakeReadPipeFd with the epoll instance
    result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeReadPipeFd, & eventItem);
}
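A pipe has been created and its read descriptor added to the watch list, so epoll can tell when there is new data to read. At this point the initialization triggered by Looper.prepare() is complete, on both the Java and the native side.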
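The "one Looper per thread" bookkeeping mentioned for both layers can be sketched in plain Java with a ThreadLocal. This is a simplified illustration, not the framework source: the class name and the helper `freshThreadSeesNoLooper` are hypothetical, and the native side uses its own thread-specific storage rather than java.lang.ThreadLocal.

```java
// Hypothetical miniature of per-thread Looper bookkeeping; names are illustrative.
class ThreadLocalLooperSketch {
    private static final ThreadLocal<ThreadLocalLooperSketch> sThreadLocal =
            new ThreadLocal<>();

    final Thread thread = Thread.currentThread();

    private ThreadLocalLooperSketch() { }

    /** Creates the calling thread's looper; a second call on the same thread fails. */
    static void prepare() {
        if (sThreadLocal.get() != null) {
            throw new IllegalStateException("Only one looper may be created per thread");
        }
        sThreadLocal.set(new ThreadLocalLooperSketch());
    }

    /** Returns the calling thread's looper, or null if prepare() was never called on it. */
    static ThreadLocalLooperSketch myLooper() {
        return sThreadLocal.get();
    }

    /** Checks that a brand-new thread, which never called prepare(), has no looper. */
    static boolean freshThreadSeesNoLooper() {
        boolean[] sawNull = new boolean[1];
        Thread t = new Thread(() -> sawNull[0] = (myLooper() == null));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return sawNull[0];
    }
}
```

Because the value lives in thread-local storage, a looper prepared on one thread is invisible to every other thread.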


2. Sending Messages with Handler

The Handler send methods generally end up calling:

public boolean sendMessageAtTime(Message msg, long uptimeMillis) {
    MessageQueue queue = mQueue;
    if (queue == null) {
        RuntimeException e = new RuntimeException(
                this + " sendMessageAtTime() called with no mQueue");
        Log.w("Looper", e.getMessage(), e);
        return false;
    }
    return enqueueMessage(queue, msg, uptimeMillis);
}
Next, enqueueMessage:

private boolean enqueueMessage(MessageQueue queue, Message msg, long uptimeMillis) {
    msg.target = this;
    if (mAsynchronous) {
        msg.setAsynchronous(true);
    }
    return queue.enqueueMessage(msg, uptimeMillis);
}
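Before MessageQueue's enqueue method is called there is a check on mAsynchronous, which decides whether the message is marked asynchronous. Messages sent through a Handler in ordinary application code are all synchronous.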

This brings in synchronous messages, asynchronous messages and barriers; see http://www.cnblogs.com/angeldevil/p/3340644.html
Here is the enqueueSyncBarrier method in MessageQueue (framework layer):

/*@path \frameworks\base\core\java\android\os\MessageQueue.java*/
int enqueueSyncBarrier(long when) {
    // Enqueue a new sync barrier token.
    // We don't need to wake the queue because the purpose of a barrier is to stall it.
    synchronized (this) {
        // mNextBarrierToken is incremented on every call, so each barrier gets a unique token
        final int token = mNextBarrierToken++;
        // Obtain a Message from the message pool
        final Message msg = Message.obtain();
        // Mark it as in use
        msg.markInUse();
        msg.when = when;
        msg.arg1 = token;

        Message prev = null;
        Message p = mMessages;
        // A non-zero when means the barrier is delayed: find the insertion point by time
        if (when != 0) {
            while (p != null && p.when <= when) {
                prev = p;
                p = p.next;
            }
        }
        // Insert msg into the message queue
        if (prev != null) { // invariant: p == prev.next
            msg.next = p;
            prev.next = msg;
        } else {
            msg.next = p;
            mMessages = msg;
        }
        return token;
    }
}
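This obtains a Message, fills in its fields, and inserts it into the message queue. Note that msg.target is never set, so a message inserted this way has a null target. Here is how next() treats it specially when processing messages: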

Message prevMsg = null;
Message msg = mMessages;
if (msg != null && msg.target == null) {
    // Stalled by a barrier.  Find the next asynchronous message in the queue.
    do {
        prevMsg = msg;
        msg = msg.next;
    } while (msg != null && !msg.isAsynchronous());
}
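As soon as the message at the head has a null target, next() stops running synchronous messages in order. The barrier acts as an interceptor: next() looks for the next asynchronous message behind it, dequeues it and runs it. In other words, asynchronous messages work by inserting a barrier into the queue that holds back synchronous messages while asynchronous ones are still taken out and executed. This lets asynchronous messages run ahead of synchronous ones (each still waits until its own when time is due). Synchronous messages keep waiting until removeSyncBarrier is called.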
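The barrier behavior described above can be reproduced with a small self-contained model. This is a sketch under simplifying assumptions, not the framework code: `BarrierSketch` and its methods are hypothetical names, time ordering is ignored, the barrier always goes to the head, and `removeBarrier` removes every barrier instead of matching a token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of MessageQueue's barrier handling; timing is ignored.
class BarrierSketch {
    static final class Msg {
        final String name;
        final boolean isBarrier;  // stands in for "target == null"
        final boolean async;

        Msg(String name, boolean isBarrier, boolean async) {
            this.name = name;
            this.isBarrier = isBarrier;
            this.async = async;
        }
    }

    private final List<Msg> queue = new ArrayList<>();

    void enqueue(String name, boolean async) {
        queue.add(new Msg(name, false, async));
    }

    // Simplification: the barrier is always placed at the head of the queue.
    void postBarrier() {
        queue.add(0, new Msg("barrier", true, false));
    }

    void removeBarrier() {
        queue.removeIf(m -> m.isBarrier);
    }

    /** Returns the next runnable message's name, or null if the loop would have to wait. */
    String next() {
        if (queue.isEmpty()) {
            return null;
        }
        int i = 0;
        if (queue.get(0).isBarrier) {
            // Stalled by a barrier: skip synchronous messages, find the first async one.
            do {
                i++;
            } while (i < queue.size() && !queue.get(i).async);
            if (i == queue.size()) {
                return null;  // only synchronous messages are left behind the barrier
            }
        }
        return queue.remove(i).name;
    }
}
```

With a barrier at the head, `next()` hands out only asynchronous messages; once the barrier is removed, the synchronous messages come out in their original order.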

Back to enqueueMessage:

boolean enqueueMessage(Message msg, long when) {
    // Every message enqueued here must have a target; a null target is reserved for barriers
    if (msg.target == null) {
        throw new IllegalArgumentException("Message must have a target.");
    }
    if (msg.isInUse()) {
        throw new IllegalStateException(msg + " This message is already in use.");
    }

    synchronized (this) {
        msg.markInUse();
        msg.when = when;
        Message p = mMessages;
        boolean needWake;
        // Insert the Message at the appropriate position
        if (p == null || when == 0 || when < p.when) {
            // New head, wake up the event queue if blocked.
            msg.next = p;
            mMessages = msg;
            needWake = mBlocked;
        } else {
            // Inserted within the middle of the queue.  Usually we don't have to wake
            // up the event queue unless there is a barrier at the head of the queue
            // and the message is the earliest asynchronous message in the queue.

            // When needWake ends up true, nativeWake is called below; this matters later
            needWake = mBlocked && p.target == null && msg.isAsynchronous();
            Message prev;
            for (;;) {
                prev = p;
                p = p.next;
                if (p == null || when < p.when) {
                    break;
                }
                if (needWake && p.isAsynchronous()) {
                    needWake = false;
                }
            }
            msg.next = p; // invariant: p == prev.next
            prev.next = msg;
        }

        // We can assume mPtr != 0 because mQuitting is false.
        if (needWake) {
            nativeWake(mPtr);
        }
    }
    return true;
}

3. Message Processing

Messages are processed in Looper's loop() function, and the key step in loop() is taking a Message out of the queue with MessageQueue's next() method.

Here is next():

Message next() {
    // As analyzed earlier, mPtr points to the NativeMessageQueue object created on the native side
    final long ptr = mPtr;
    if (ptr == 0) {
        return null;
    }

    int pendingIdleHandlerCount = -1; // -1 only during first iteration
    int nextPollTimeoutMillis = 0;
    for (;;) {
        if (nextPollTimeoutMillis != 0) {
            Binder.flushPendingCommands();
        }

        // ********** The key call ********** //
        nativePollOnce(ptr, nextPollTimeoutMillis);

        synchronized (this) {
            // Try to retrieve the next message.  Return if found.
            final long now = SystemClock.uptimeMillis();
            Message prevMsg = null;
            Message msg = mMessages;
            // As seen earlier, this is the barrier check: skip ahead to the next asynchronous message
            if (msg != null && msg.target == null) {
                do {
                    prevMsg = msg;
                    msg = msg.next;
                } while (msg != null && !msg.isAsynchronous());
            }
            if (msg != null) {
                // The next message is not due yet
                if (now < msg.when) {
                    // nextPollTimeoutMillis is how long until the earliest message is due
                    nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
                } else {
                    // Dequeue a Message: a simple singly linked list removal
                    mBlocked = false;
                    // prevMsg != null means msg was found behind a barrier (an asynchronous message)
                    if (prevMsg != null) {
                        prevMsg.next = msg.next;
                    } else {
                        mMessages = msg.next;
                    }
                    msg.next = null;
                    return msg;
                }
            } else {
                // The queue is empty: -1 means wait indefinitely
                nextPollTimeoutMillis = -1;
            }

            // Process the quit message now that all pending messages have been handled.
            if (mQuitting) {
                dispose();
                return null;
            }

            // If first time idle, then get the number of idlers to run.
            // Idle handles only run if the queue is empty or if the first message
            // in the queue (possibly a barrier) is due to be handled in the future.
            // When the queue is empty, or no message is due yet (the loop is idle), run the IdleHandlers
            if (pendingIdleHandlerCount < 0
                    && (mMessages == null || now < mMessages.when)) {
                pendingIdleHandlerCount = mIdleHandlers.size();
            }
            if (pendingIdleHandlerCount <= 0) {
                // No idle handlers to run.  Loop and wait some more.
                mBlocked = true;
                continue;
            }

            if (mPendingIdleHandlers == null) {
                mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
            }
            mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
        }

        // Run the idle handlers.
        // We only ever reach this code block during the first iteration.
        for (int i = 0; i < pendingIdleHandlerCount; i++) {
            final IdleHandler idler = mPendingIdleHandlers[i];
            mPendingIdleHandlers[i] = null; // release the reference to the handler

            boolean keep = false;
            try {
                keep = idler.queueIdle();
            } catch (Throwable t) {
                Log.wtf("MessageQueue", "IdleHandler threw exception", t);
            }

            if (!keep) {
                synchronized (this) {
                    mIdleHandlers.remove(idler);
                }
            }
        }

        // Reset the idle handler count to 0 so we do not run them again.
        pendingIdleHandlerCount = 0;

        // While calling an idle handler, a new message could have been delivered
        // so go back and look again for a pending message without waiting.
        nextPollTimeoutMillis = 0;
    }
}
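The three values that next() feeds into nativePollOnce (0, -1, or a positive delay) can be summarized in one small function. This is a sketch for illustration only: `PollTimeoutSketch` and `nextPollTimeout` are hypothetical names, and returning 0 for a due message is a simplification, since the real next() returns the message instead of polling again.

```java
// Hypothetical condensation of how next() chooses its poll timeout.
class PollTimeoutSketch {
    /**
     * @param headWhen uptime in ms at which the next eligible message is due,
     *                 or null when no eligible message is queued
     * @param now      current uptime in ms
     * @return -1 to wait until woken, 0 for "a message is ready, do not wait",
     *         otherwise the number of milliseconds to sleep
     */
    static int nextPollTimeout(Long headWhen, long now) {
        if (headWhen == null) {
            return -1;  // empty queue: sleep until someone calls wake
        }
        if (now < headWhen) {
            // Not due yet: sleep for the remaining time, clamped to the int range.
            return (int) Math.min(headWhen - now, Integer.MAX_VALUE);
        }
        return 0;       // due now
    }
}
```

For example, a message due at uptime 150 when the current uptime is 100 yields a 50 ms sleep, while an empty queue yields -1.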
Now the native method nativePollOnce, which maps to NativeMessageQueue::pollOnce in the native layer:

void NativeMessageQueue::pollOnce(JNIEnv* env, int timeoutMillis) {
    mInCallback = true;
    mLooper->pollOnce(timeoutMillis);
    mInCallback = false;
    if (mExceptionObj) {
        env->Throw(mExceptionObj);
        env->DeleteLocalRef(mExceptionObj);
        mExceptionObj = NULL;
    }
}
The native layer in turn calls Looper::pollOnce:

int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
    int result = 0;
    for (;;) {
        while (mResponseIndex < mResponses.size()) {
            const Response& response = mResponses.itemAt(mResponseIndex++);
            int ident = response.request.ident;
            if (ident >= 0) {
                int fd = response.request.fd;
                int events = response.events;
                void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE
                ALOGD("%p ~ pollOnce - returning signalled identifier %d: "
                        "fd=%d, events=0x%x, data=%p",
                        this, ident, fd, events, data);
#endif
                if (outFd != NULL) *outFd = fd;
                if (outEvents != NULL) *outEvents = events;
                if (outData != NULL) *outData = data;
                return ident;
            }
        }

        if (result != 0) {
#if DEBUG_POLL_AND_WAKE
            ALOGD("%p ~ pollOnce - returning result %d", this, result);
#endif
            if (outFd != NULL) *outFd = 0;
            if (outEvents != NULL) *outEvents = 0;
            if (outData != NULL) *outData = NULL;
            return result;
        }

        // Nothing to return yet: wait in pollInner
        result = pollInner(timeoutMillis);
    }
}
which then calls pollInner:

/*@path \system\core\libutils\Looper.cpp*/
// timeoutMillis is the nextPollTimeoutMillis passed down from Java
int Looper::pollInner(int timeoutMillis) {
    // timeoutMillis: how long until the next message in the Java-layer queue is due
    // mNextMessageUptime: when the next message in the native queue (mMessageEnvelopes) is due
    if (timeoutMillis != 0 && mNextMessageUptime != LLONG_MAX) {
        nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
        int messageTimeoutMillis = toMillisecondTimeoutDelay(now, mNextMessageUptime);
        // Reconcile the two: use the smaller timeout
        if (messageTimeoutMillis >= 0
            && (timeoutMillis < 0 || messageTimeoutMillis < timeoutMillis)) {
            timeoutMillis = messageTimeoutMillis;
        }
    }

    // Poll.
    int result = POLL_WAKE;
    mResponses.clear();
    mResponseIndex = 0;

    // We are about to idle.
    mIdling = true;

    // ****************** epoll_wait is called here ********************** //
    struct epoll_event eventItems[EPOLL_MAX_EVENTS];
    // Wait for events
    int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);

    // No longer idling.
    mIdling = false;

    // Acquire lock.
    mLock.lock();

    // On error, jump straight to Done
    if (eventCount < 0) {
        if (errno == EINTR) {
            goto Done;
        }
        result = POLL_ERROR;
        goto Done;
    }

    // Check for poll timeout.
    if (eventCount == 0) {
        result = POLL_TIMEOUT;
        goto Done;
    }

    // Events occurred: walk through eventItems
    for (int i = 0; i < eventCount; i++) {
        int fd = eventItems[i].data.fd;
        uint32_t epollEvents = eventItems[i].events;
        if (fd == mWakeReadPipeFd) {
            if (epollEvents & EPOLLIN) {
                // Call awoken() to drain the wake pipe
                awoken();
            } else {
            }
        } else {
            // Otherwise the fd was probably added through Looper's addFd function
            ssize_t requestIndex = mRequests.indexOfKey(fd);
            if (requestIndex >= 0) {
                int events = 0;
                if (epollEvents & EPOLLIN) events |= EVENT_INPUT;
                if (epollEvents & EPOLLOUT) events |= EVENT_OUTPUT;
                if (epollEvents & EPOLLERR) events |= EVENT_ERROR;
                if (epollEvents & EPOLLHUP) events |= EVENT_HANGUP;
                pushResponse(events, mRequests.valueAt(requestIndex));
            } else {
                ALOGW("Ignoring unexpected epoll events 0x%x on fd %d that is "
                              "no longer registered.", epollEvents, fd);
            }
        }
    }
    Done: ;

    .........
}
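The timeout reconciliation at the top of pollInner is easy to misread because of the -1 and 0 special cases, so here is the same decision as a small Java function. It is a sketch under stated assumptions: the names are hypothetical, everything is in milliseconds (the native code works from nanosecond timestamps), and `Long.MAX_VALUE` stands in for LLONG_MAX.

```java
// Hypothetical Java rendering of the timeout reconciliation in Looper::pollInner.
class PollInnerTimeoutSketch {
    // Stands in for mNextMessageUptime == LLONG_MAX (no native message pending).
    static final long NO_NATIVE_MESSAGE = Long.MAX_VALUE;

    /**
     * @param timeoutMillis        timeout from the Java queue: 0 = do not block,
     *                             -1 = block indefinitely, positive = milliseconds
     * @param nowMillis            current time
     * @param nextNativeMessageMs  due time of the earliest native message,
     *                             or NO_NATIVE_MESSAGE
     * @return the timeout that would be handed to epoll_wait
     */
    static int effectiveTimeout(int timeoutMillis, long nowMillis, long nextNativeMessageMs) {
        if (timeoutMillis != 0 && nextNativeMessageMs != NO_NATIVE_MESSAGE) {
            // Delay until the native message is due; already-due messages give 0.
            long delay = Math.max(0, nextNativeMessageMs - nowMillis);
            int messageTimeout = (int) Math.min(delay, Integer.MAX_VALUE);
            // Take the native delay if the Java side would wait forever or longer.
            if (timeoutMillis < 0 || messageTimeout < timeoutMillis) {
                timeoutMillis = messageTimeout;
            }
        }
        return timeoutMillis;
    }
}
```

The thread therefore sleeps no longer than the sooner of the two queues requires, and a Java-side timeout of 0 is never lengthened.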
Leaving the Done section aside for now, here is awoken:

/*@path \system\core\libutils\Looper.cpp*/
void Looper::awoken() {
    char buffer[16];
    ssize_t nRead;
    do {
        // Read data from mWakeReadPipeFd
        nRead = read(mWakeReadPipeFd, buffer, sizeof(buffer));
    } while ((nRead == -1 && errno == EINTR) || nRead == sizeof(buffer));
}

void Looper::wake() {
    ssize_t nWrite;
    do {
        // Write a 'W' to mWakeWritePipeFd
        nWrite = write(mWakeWritePipeFd, "W", 1);
    } while (nWrite == -1 && errno == EINTR);

    if (nWrite != 1) {
        if (errno != EAGAIN) {
            ALOGW("Could not write wake signal, errno=%d", errno);
        }
    }
}
wake and awoken come as a pair in Looper: wake writes a 'W' to mWakeWritePipeFd, so epoll_wait sees an event; awoken then reads from mWakeReadPipeFd, consuming the characters that were written.
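The wake/awoken pairing can be tried out on a desktop JVM with java.nio, where a Pipe plus a Selector plays roughly the role of pipe() plus epoll. This is only an analogue, not Android code: `WakePipeSketch`, `pollOnce` and `demo` are hypothetical names, and how the Selector is implemented underneath depends on the platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical Java NIO analogue of the native wake pipe.
class WakePipeSketch {
    private final Pipe pipe;
    private final Selector selector;

    WakePipeSketch() throws IOException {
        pipe = Pipe.open();                      // role of pipe(wakeFds)
        pipe.source().configureBlocking(false);  // role of O_NONBLOCK on the read end
        pipe.sink().configureBlocking(false);    // role of O_NONBLOCK on the write end
        selector = Selector.open();              // role of epoll_create
        pipe.source().register(selector, SelectionKey.OP_READ); // role of epoll_ctl ADD
    }

    /** Role of Looper::wake(): write one byte so the sleeper becomes ready. */
    void wake() throws IOException {
        pipe.sink().write(ByteBuffer.wrap(new byte[] {'W'}));
    }

    /** Waits (-1 = forever, 0 = no wait, else ms); drains the pipe and returns true if woken. */
    boolean pollOnce(long timeoutMillis) throws IOException {
        if (timeoutMillis < 0) {
            selector.select();
        } else if (timeoutMillis == 0) {
            selector.selectNow();
        } else {
            selector.select(timeoutMillis);
        }
        boolean woken = !selector.selectedKeys().isEmpty();
        selector.selectedKeys().clear();
        if (woken) {
            awoken();
        }
        return woken;
    }

    /** Role of Looper::awoken(): keep reading while the buffer comes back full. */
    private void awoken() throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        int n;
        do {
            buffer.clear();
            n = pipe.source().read(buffer);
        } while (n == buffer.capacity());
    }

    void close() throws IOException {
        selector.close();
        pipe.source().close();
        pipe.sink().close();
    }

    /** Returns {idle poll, poll woken by another thread, poll after two wakes, poll after drain}. */
    static boolean[] demo() {
        try {
            WakePipeSketch looper = new WakePipeSketch();
            boolean first = looper.pollOnce(50);   // nothing written yet
            Thread other = new Thread(() -> {
                try {
                    looper.wake();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            other.start();
            boolean second = looper.pollOnce(-1);  // sleeps until the other thread writes
            other.join();
            looper.wake();
            looper.wake();                         // two wakes are drained by one awoken
            boolean third = looper.pollOnce(1000);
            boolean fourth = looper.pollOnce(0);
            looper.close();
            return new boolean[] {first, second, third, fourth};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Note how two consecutive wake calls result in a single wake-up: the drain loop consumes every pending byte, so the following poll finds nothing to read.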

So when is wake called?

Earlier, in MessageQueue's enqueueMessage, there was this fragment:

// We can assume mPtr != 0 because mQuitting is false.
if (needWake) {
    nativeWake(mPtr);
}

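nativeWake maps to wake, and it is called when needWake is true. needWake is true when the Java message queue is blocked and a new message is inserted at the head of the queue; wake must then be called to rouse epoll_wait, because the timeout now differs from the value computed before and has to be recalculated. The other case is when the queue is blocked with a barrier at its head and the new message is asynchronous; that also requires a wake-up (unless the queue already holds an asynchronous message due before the new one, in which case no wake-up is needed). The native-layer message queue works in a similar way.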
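The needWake rules can be checked against a small model of the insertion logic. This is a sketch that mirrors the structure of enqueueMessage under simplifying assumptions: `NeedWakeSketch` and `Msg` are hypothetical names, a list replaces the linked list, `blocked` stands in for mBlocked, and `isBarrier` stands in for "target == null".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the needWake decision in MessageQueue.enqueueMessage.
class NeedWakeSketch {
    static final class Msg {
        final long when;
        final boolean isBarrier;
        final boolean async;

        Msg(long when, boolean isBarrier, boolean async) {
            this.when = when;
            this.isBarrier = isBarrier;
            this.async = async;
        }
    }

    final List<Msg> queue = new ArrayList<>();  // kept sorted by when
    boolean blocked;                            // true while the loop sleeps in the poll

    /** Inserts msg in time order and reports whether the sleeping loop must be woken. */
    boolean enqueue(Msg msg) {
        if (queue.isEmpty() || msg.when == 0 || msg.when < queue.get(0).when) {
            // New head: the previously computed timeout is stale, so wake if blocked.
            queue.add(0, msg);
            return blocked;
        }
        // Middle insert: wake only for an async message behind a barrier at the head.
        boolean needWake = blocked && queue.get(0).isBarrier && msg.async;
        int i = 1;
        while (i < queue.size() && msg.when >= queue.get(i).when) {
            if (needWake && queue.get(i).async) {
                needWake = false;  // an earlier async message already governs the wake-up
            }
            i++;
        }
        queue.add(i, msg);
        return needWake;
    }
}
```

A message that lands at the head of a blocked queue reports true, an ordinary message inserted further back reports false, and behind a barrier only the earliest asynchronous message reports true.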

Next, the Done section of pollInner:

int Looper::pollInner(int timeoutMillis) {
    .........
    Done: ;
    // Dispatch the native messages that are due
    mNextMessageUptime = LLONG_MAX;
    while (mMessageEnvelopes.size() != 0) {
        nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
        const MessageEnvelope& messageEnvelope = mMessageEnvelopes.itemAt(0);
        if (messageEnvelope.uptime <= now) {
            // Remove the envelope from the list.
            // We keep a strong reference to the handler until the call to handleMessage
            // finishes.  Then we drop it so that the handler can be deleted *before*
            // we reacquire our lock.
            { // obtain handler
                sp<MessageHandler> handler = messageEnvelope.handler;
                Message message = messageEnvelope.message;
                mMessageEnvelopes.removeAt(0);
                mSendingMessage = true;
                mLock.unlock();

                // Call handleMessage to process the message
                handler->handleMessage(message);
            } // release handler

            mLock.lock();
            mSendingMessage = false;
            result = POLL_CALLBACK;
        } else {
            // The last message left at the head of the queue determines the next wakeup time.
            mNextMessageUptime = messageEnvelope.uptime;
            break;
        }
    }

    // Release lock.
    mLock.unlock();

    // Invoke all response callbacks.
    for (size_t i = 0; i < mResponses.size(); i++) {
        Response& response = mResponses.editItemAt(i);
        if (response.request.ident == POLL_CALLBACK) {
            int fd = response.request.fd;
            int events = response.events;
            void* data = response.request.data;
            int callbackResult = response.request.callback->handleEvent(fd, events, data);
            if (callbackResult == 0) {
                removeFd(fd);
            }
            // Clear the callback reference in the response structure promptly because we
            // will not clear the response vector itself until the next poll.
            response.request.callback.clear();
            result = POLL_CALLBACK;
        }
    }
    return result;
}
