I came across this question on Zhihu: "Why doesn't the Android main thread get stuck in the infinite loop inside Looper.loop()?" That prompted me to reread the Handler source. Android's message mechanism is essentially a pipe plus epoll (see epoll for background): when there are messages, they are processed one by one; when there are none, the thread calls epoll_wait and sleeps until it is woken. Since lifecycle callbacks and UI drawing in Android are all driven through this Handler mechanism, the loop never ends up blocked or frozen.

To keep the main thread alive for the app's entire lifetime, Android runs it in an infinite loop. Inside that loop, when there is no work to do, the thread parks in epoll; sleeping yields the CPU, so idling costs essentially no system resources.

When next() finds no message to process, it calls the native Looper's pollOnce(), which uses epoll_wait to monitor the pipe created earlier, and the main thread goes to sleep.

Waking it up is the job of other threads. If you know the Android source, AMS delivers lifecycle messages to an Activity over Binder; ActivityThread holds the Binder server side, ApplicationThread, which runs on a Binder thread. When AMS sends a message, ApplicationThread posts it to the main thread through the main thread's Handler, which writes a byte into the pipe; that wakes epoll_wait and, with it, the main thread.

Similarly, for touch events IMS sends messages to the app process via WMS, also over Binder, and that too wakes the main thread. So the main thread only ever sleeps waiting to be woken; having a single looping thread does not mean it is stuck.
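For context, here is the core of Looper.loop(), trimmed to its essentials (logging and bookkeeping removed); the key point is that the "infinite" loop spends its idle time blocked inside queue.next():

```java
// Trimmed sketch of Looper.loop(); not the full framework source.
public static void loop() {
    final Looper me = myLooper();
    final MessageQueue queue = me.mQueue;
    for (;;) {
        Message msg = queue.next(); // may block in native epoll_wait when idle
        if (msg == null) {
            // No message means the message queue is quitting.
            return;
        }
        msg.target.dispatchMessage(msg); // deliver to the Handler that sent it
        msg.recycleUnchecked();          // put the Message back into the pool
    }
}
```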
1. Creating the Looper
From the earlier Java-layer walkthrough, "Handler,Looper,Message,MessageQueue之间关系浅析", we already know roughly how these classes relate. Everything starts at Looper.prepare(), which is where the Looper and its MessageQueue get initialized.
```java
public final class Looper {
    private Looper(boolean quitAllowed) {
        mQueue = new MessageQueue(quitAllowed);
        mThread = Thread.currentThread();
    }
}

public final class MessageQueue {
    // Native method that initializes the native side of the MessageQueue
    private native static long nativeInit();

    MessageQueue(boolean quitAllowed) {
        mQuitAllowed = quitAllowed;
        mPtr = nativeInit();
    }
}
```

Note the native method nativeInit().
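Before diving into the native side, for reference here is the canonical way application code drives prepare()/loop() on its own thread (adapted from the example in the Looper documentation):

```java
// Classic worker-thread Looper pattern, as shown in the Looper javadoc.
class LooperThread extends Thread {
    public Handler mHandler;

    @Override
    public void run() {
        Looper.prepare();                // creates the Looper + MessageQueue for this thread
        mHandler = new Handler(Looper.myLooper()) {
            @Override
            public void handleMessage(Message msg) {
                // process incoming messages here, on this worker thread
            }
        };
        Looper.loop();                   // enter the loop; blocks until quit() is called
    }
}
```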
Now the JNI function behind nativeInit():

```cpp
/* @path \frameworks\base\core\jni\android_os_MessageQueue.cpp */
static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
    // Create a native NativeMessageQueue
    NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
    if (!nativeMessageQueue) {
        jniThrowRuntimeException(env, "Unable to allocate native queue");
        return 0;
    }
    // Bump the strong reference count
    nativeMessageQueue->incStrong(env);
    // Hand the address of the new NativeMessageQueue back to Java
    return reinterpret_cast<jlong>(nativeMessageQueue);
}
```

A native NativeMessageQueue object is created here and its address is returned to the Java side; in other words, the mPtr field of the Java MessageQueue is really a pointer to this native NativeMessageQueue.
Next, the initialization of NativeMessageQueue:
```cpp
/* @path \frameworks\base\core\jni\android_os_MessageQueue.cpp */
class NativeMessageQueue : public MessageQueue {
public:
    NativeMessageQueue();
    virtual ~NativeMessageQueue();

    virtual void raiseException(JNIEnv* env, const char* msg, jthrowable exceptionObj);
    void pollOnce(JNIEnv* env, int timeoutMillis);
    void wake();

private:
    bool mInCallback;
    jthrowable mExceptionObj;
};

// Constructor:
NativeMessageQueue::NativeMessageQueue() : mInCallback(false), mExceptionObj(NULL) {
    // Per-thread lookup, much like Java's ThreadLocal
    mLooper = Looper::getForThread();
    if (mLooper == NULL) {
        // Create the native Looper
        mLooper = new Looper(false);
        Looper::setForThread(mLooper);
    }
}
```

So the NativeMessageQueue constructor creates the (native) Looper; on the Java side it is the other way around: the Looper constructor creates the MessageQueue. A ThreadLocal-like mechanism (getForThread/setForThread) again keeps the Looper per-thread. On to the native Looper constructor:
Looper is declared in \system\core\include\utils\Looper.h; its constructor looks like this:
```cpp
/* @path \system\core\libutils\Looper.cpp */
// The allowNonCallbacks passed in here is false
Looper::Looper(bool allowNonCallbacks) :
        mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
        mResponseIndex(0), mNextMessageUptime(LLONG_MAX) {
    int wakeFds[2];
    // Create a pipe
    int result = pipe(wakeFds);

    // Grab the pipe's read and write file descriptors
    mWakeReadPipeFd = wakeFds[0];
    mWakeWritePipeFd = wakeFds[1];

    // Put both ends into O_NONBLOCK (non-blocking) mode
    result = fcntl(mWakeReadPipeFd, F_SETFL, O_NONBLOCK);
    result = fcntl(mWakeWritePipeFd, F_SETFL, O_NONBLOCK);

    mIdling = false;

    // Create an epoll instance
    mEpollFd = epoll_create(EPOLL_SIZE_HINT);

    // epoll fills this event structure with the events that fired
    struct epoll_event eventItem;
    memset(&eventItem, 0, sizeof(epoll_event)); // zero out the struct
    // EPOLLIN: fires when the other end has written data to read
    // (awoken() later drains the read end so leftover wake bytes don't accumulate)
    eventItem.events = EPOLLIN;
    eventItem.data.fd = mWakeReadPipeFd;
    // Register: add mWakeReadPipeFd to the epoll interest list
    result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeReadPipeFd, &eventItem);
}
```

This creates a pipe and registers its read end with epoll, listening for incoming data. With that, the Looper initialization kicked off from the Java side is complete.
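The pipe+epoll pair has no direct equivalent in the Java SDK, but java.nio offers an analogous pair: Pipe and Selector. The sketch below is *not* the framework's implementation, just a minimal stand-in that reproduces the same wait/wake/drain protocol (playing the roles of epoll_wait, wake() and awoken()):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Minimal analogue of the native Looper's pipe + epoll setup, using java.nio.
// Selector.select() stands in for epoll_wait; writing one byte into the pipe's
// sink stands in for Looper::wake().
public final class WakeDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);          // like O_NONBLOCK on the read fd
        Selector selector = Selector.open();             // like epoll_create()
        pipe.source().register(selector, SelectionKey.OP_READ); // like EPOLL_CTL_ADD + EPOLLIN

        Thread waker = new Thread(() -> {
            try {
                Thread.sleep(1000);
                // "wake()": one byte into the write end is enough to wake the selector.
                pipe.sink().write(ByteBuffer.wrap(new byte[]{'W'}));
            } catch (Exception ignored) { }
        });
        waker.start();

        System.out.println("sleeping like epoll_wait...");
        int readyCount = selector.select();              // blocks until the wake byte arrives
        System.out.println("woken, ready channels = " + readyCount);

        // "awoken()": drain the pipe so the next select() sleeps again.
        ByteBuffer buf = ByteBuffer.allocate(16);
        while (pipe.source().read(buf) > 0) {
            buf.clear();
        }
        waker.join();
    }
}
```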
2. Sending messages with a Handler
Every send/post variant on Handler eventually funnels into:
```java
public boolean sendMessageAtTime(Message msg, long uptimeMillis) {
    MessageQueue queue = mQueue;
    if (queue == null) {
        RuntimeException e = new RuntimeException(
                this + " sendMessageAtTime() called with no mQueue");
        Log.w("Looper", e.getMessage(), e);
        return false;
    }
    return enqueueMessage(queue, msg, uptimeMillis);
}
```

which calls enqueueMessage:
```java
private boolean enqueueMessage(MessageQueue queue, Message msg, long uptimeMillis) {
    msg.target = this;
    if (mAsynchronous) {
        msg.setAsynchronous(true);
    }
    return queue.enqueueMessage(msg, uptimeMillis);
}
```

Before handing off to MessageQueue.enqueueMessage(), note the mAsynchronous check: it decides whether the message is flagged as asynchronous. Messages sent through an ordinary Handler are synchronous.
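For completeness, here is how app code can produce asynchronous messages; doWork() is a hypothetical placeholder, Message.setAsynchronous() has been public since API 22, and Handler.createAsync() since API 28:

```java
// Two ways to produce asynchronous messages (their purpose becomes clear
// with sync barriers, shown next):

// 1. Flag an individual message (public since API 22):
Message msg = Message.obtain(handler, () -> doWork()); // doWork() is a placeholder
msg.setAsynchronous(true);
handler.sendMessageAtTime(msg, SystemClock.uptimeMillis());

// 2. Make every message from a Handler asynchronous (API 28+):
Handler asyncHandler = Handler.createAsync(Looper.getMainLooper());
asyncHandler.post(() -> doWork());
```

Asynchronous messages only matter once a sync barrier is sitting in the queue, and barriers are installed by MessageQueue.enqueueSyncBarrier() (exposed in later releases as the hidden postSyncBarrier()):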
```java
/* @path \frameworks\base\core\java\android\os\MessageQueue.java */
int enqueueSyncBarrier(long when) {
    // Enqueue a new sync barrier token.
    // We don't need to wake the queue because the purpose of a barrier is to stall it.
    synchronized (this) {
        // The token is incremented on every call, so each barrier has a unique ID
        final int token = mNextBarrierToken++;
        // Grab a Message from the pool
        final Message msg = Message.obtain();
        // Mark it as in use
        msg.markInUse();
        msg.when = when;
        msg.arg1 = token;

        Message prev = null;
        Message p = mMessages;
        // If when is non-zero the barrier is delayed; find the right
        // time-ordered insertion point
        if (when != 0) {
            while (p != null && p.when <= when) {
                prev = p;
                p = p.next;
            }
        }
        // Splice the barrier into the message queue
        if (prev != null) { // invariant: p == prev.next
            msg.next = p;
            prev.next = msg;
        } else {
            msg.next = p;
            mMessages = msg;
        }
        return token;
    }
}
```

A Message is obtained, configured, and spliced into the queue, but crucially msg.target is never set: a barrier is a message with a null target. Here is the special handling it gets in next():
```java
Message prevMsg = null;
Message msg = mMessages;
if (msg != null && msg.target == null) {
    // Stalled by a barrier.  Find the next asynchronous message in the queue.
    do {
        prevMsg = msg;
        msg = msg.next;
    } while (msg != null && !msg.isAsynchronous());
}
```

As soon as next() sees a message whose target is null, it stops walking the synchronous messages in order and instead acts as an interceptor: it skips ahead to the next asynchronous message and dequeues that one. So the mechanism behind asynchronous messages is simply a barrier inserted into the queue to stall the synchronous ones, letting the asynchronous ones be pulled out and executed first. While the barrier is in place, asynchronous messages run ahead of synchronous ones; the synchronous messages stay blocked until removeSyncBarrier() is called.
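This barrier-plus-async pattern is exactly how the framework prioritizes UI traversals. The sketch below is simplified from ViewRootImpl (hidden framework APIs, shown for illustration only; field names follow the AOSP source, and in the era of this article the queue method was enqueueSyncBarrier rather than postSyncBarrier):

```java
// Simplified from ViewRootImpl (hidden framework APIs; illustration only).
void scheduleTraversals() {
    if (!mTraversalScheduled) {
        mTraversalScheduled = true;
        // Stall every pending synchronous message on the main thread's queue...
        mTraversalBarrier = mHandler.getLooper().getQueue().postSyncBarrier();
        // ...then schedule the traversal as an *asynchronous* callback,
        // which is allowed to jump the barrier.
        mChoreographer.postCallback(
                Choreographer.CALLBACK_TRAVERSAL, mTraversalRunnable, null);
    }
}

void doTraversal() {
    if (mTraversalScheduled) {
        mTraversalScheduled = false;
        // Lift the barrier so ordinary synchronous messages flow again.
        mHandler.getLooper().getQueue().removeSyncBarrier(mTraversalBarrier);
        performTraversals();
    }
}
```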
Back to enqueueMessage:
```java
boolean enqueueMessage(Message msg, long when) {
    // Normal messages must have a target; barriers (target == null) can only
    // be enqueued through enqueueSyncBarrier, so they cannot sneak in here
    if (msg.target == null) {
        throw new IllegalArgumentException("Message must have a target.");
    }
    if (msg.isInUse()) {
        throw new IllegalStateException(msg + " This message is already in use.");
    }

    synchronized (this) {
        msg.markInUse();
        msg.when = when;
        Message p = mMessages;
        boolean needWake;
        // Insert the Message at the right time-ordered position
        if (p == null || when == 0 || when < p.when) {
            // New head, wake up the event queue if blocked.
            msg.next = p;
            mMessages = msg;
            needWake = mBlocked;
        } else {
            // Inserted within the middle of the queue.  Usually we don't have to wake
            // up the event queue unless there is a barrier at the head of the queue
            // and the message is the earliest asynchronous message in the queue.
            // When needWake ends up true, nativeWake() is called below
            needWake = mBlocked && p.target == null && msg.isAsynchronous();
            Message prev;
            for (;;) {
                prev = p;
                p = p.next;
                if (p == null || when < p.when) {
                    break;
                }
                if (needWake && p.isAsynchronous()) {
                    needWake = false;
                }
            }
            msg.next = p; // invariant: p == prev.next
            prev.next = msg;
        }

        // We can assume mPtr != 0 because mQuitting is false.
        if (needWake) {
            nativeWake(mPtr);
        }
    }
    return true;
}
```
3. Processing messages
Messages are processed by Looper's loop(); the crux of loop() is pulling the next Message off the queue via MessageQueue.next().

Here is next():
```java
Message next() {
    // As analyzed earlier, mPtr points at the native NativeMessageQueue object
    final long ptr = mPtr;
    if (ptr == 0) {
        return null;
    }

    int pendingIdleHandlerCount = -1; // -1 only during first iteration
    int nextPollTimeoutMillis = 0;
    for (;;) {
        if (nextPollTimeoutMillis != 0) {
            Binder.flushPendingCommands();
        }

        // ********** The important call: may block in native epoll_wait **********
        nativePollOnce(ptr, nextPollTimeoutMillis);

        synchronized (this) {
            // Try to retrieve the next message.  Return if found.
            final long now = SystemClock.uptimeMillis();
            Message prevMsg = null;
            Message msg = mMessages;
            // As shown above, a null target means a barrier: skip ahead to the
            // next asynchronous message in the queue
            if (msg != null && msg.target == null) {
                do {
                    prevMsg = msg;
                    msg = msg.next;
                } while (msg != null && !msg.isAsynchronous());
            }
            if (msg != null) {
                if (now < msg.when) {
                    // The head message is not due yet: nextPollTimeoutMillis is
                    // how long until the next message becomes ready
                    nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
                } else {
                    // A message is due: unlink it (simple singly-linked-list
                    // removal) and return it
                    mBlocked = false;
                    // prevMsg != null corresponds to the asynchronous case
                    if (prevMsg != null) {
                        prevMsg.next = msg.next;
                    } else {
                        mMessages = msg.next;
                    }
                    msg.next = null;
                    return msg;
                }
            } else {
                // Queue is empty: -1 means wait indefinitely
                nextPollTimeoutMillis = -1;
            }

            // Process the quit message now that all pending messages have been handled.
            if (mQuitting) {
                dispose();
                return null;
            }

            // If first time idle, then get the number of idlers to run.
            // Idle handles only run if the queue is empty or if the first message
            // in the queue (possibly a barrier) is due to be handled in the future.
            // I.e. IdleHandlers run only while the queue is idle right now
            if (pendingIdleHandlerCount < 0
                    && (mMessages == null || now < mMessages.when)) {
                pendingIdleHandlerCount = mIdleHandlers.size();
            }
            if (pendingIdleHandlerCount <= 0) {
                // No idle handlers to run.  Loop and wait some more.
                mBlocked = true;
                continue;
            }

            if (mPendingIdleHandlers == null) {
                mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
            }
            mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
        }

        // Run the idle handlers.
        // We only ever reach this code block during the first iteration.
        for (int i = 0; i < pendingIdleHandlerCount; i++) {
            final IdleHandler idler = mPendingIdleHandlers[i];
            mPendingIdleHandlers[i] = null; // release the reference to the handler

            boolean keep = false;
            try {
                keep = idler.queueIdle();
            } catch (Throwable t) {
                Log.wtf("MessageQueue", "IdleHandler threw exception", t);
            }

            if (!keep) {
                synchronized (this) {
                    mIdleHandlers.remove(idler);
                }
            }
        }

        // Reset the idle handler count to 0 so we do not run them again.
        pendingIdleHandlerCount = 0;

        // While calling an idle handler, a new message could have been delivered
        // so go back and look again for a pending message without waiting.
        nextPollTimeoutMillis = 0;
    }
}
```

Note the IdleHandler pass at the bottom: registered handlers run only while the queue is idle.
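That pass is public API, and it is handy for deferring non-urgent work until the looper has nothing else to do. A small example (preloadCaches() is a hypothetical stand-in):

```java
// Defer one-shot work until the current thread's queue goes idle.
Looper.myQueue().addIdleHandler(new MessageQueue.IdleHandler() {
    @Override
    public boolean queueIdle() {
        preloadCaches(); // hypothetical deferred work
        return false;    // false = remove this IdleHandler after one run
    }
});
```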
Now look at the native method nativePollOnce, which corresponds to NativeMessageQueue::pollOnce:

```cpp
void NativeMessageQueue::pollOnce(JNIEnv* env, int timeoutMillis) {
    mInCallback = true;
    mLooper->pollOnce(timeoutMillis);
    mInCallback = false;
    if (mExceptionObj) {
        env->Throw(mExceptionObj);
        env->DeleteLocalRef(mExceptionObj);
        mExceptionObj = NULL;
    }
}
```

So the native side simply delegates to Looper::pollOnce:
```cpp
int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
    int result = 0;
    for (;;) {
        // First hand out any responses left over from a previous poll
        while (mResponseIndex < mResponses.size()) {
            const Response& response = mResponses.itemAt(mResponseIndex++);
            int ident = response.request.ident;
            if (ident >= 0) {
                int fd = response.request.fd;
                int events = response.events;
                void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE
                ALOGD("%p ~ pollOnce - returning signalled identifier %d: "
                        "fd=%d, events=0x%x, data=%p", this, ident, fd, events, data);
#endif
                if (outFd != NULL) *outFd = fd;
                if (outEvents != NULL) *outEvents = events;
                if (outData != NULL) *outData = data;
                return ident;
            }
        }

        if (result != 0) {
#if DEBUG_POLL_AND_WAKE
            ALOGD("%p ~ pollOnce - returning result %d", this, result);
#endif
            if (outFd != NULL) *outFd = 0;
            if (outEvents != NULL) *outEvents = 0;
            if (outData != NULL) *outData = NULL;
            return result;
        }

        // The real work happens here
        result = pollInner(timeoutMillis);
    }
}
```

which in turn calls pollInner:
```cpp
/* @path \system\core\libutils\Looper.cpp */
// timeoutMillis is the nextPollTimeoutMillis passed down from the Java side
int Looper::pollInner(int timeoutMillis) {
    // timeoutMillis: how long the Java-side queue can sleep until its next message
    // mNextMessageUptime: when the native queue (mMessageEnvelopes) next has work
    if (timeoutMillis != 0 && mNextMessageUptime != LLONG_MAX) {
        nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
        int messageTimeoutMillis = toMillisecondTimeoutDelay(now, mNextMessageUptime);
        // Reconcile the two: sleep only until whichever deadline comes first
        if (messageTimeoutMillis >= 0
                && (timeoutMillis < 0 || messageTimeoutMillis < timeoutMillis)) {
            timeoutMillis = messageTimeoutMillis;
        }
    }

    // Poll.
    int result = POLL_WAKE;
    mResponses.clear();
    mResponseIndex = 0;

    // We are about to idle.
    mIdling = true;

    // ****************** The epoll_wait call ******************
    struct epoll_event eventItems[EPOLL_MAX_EVENTS];
    int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);

    // No longer idling.
    mIdling = false;

    // Acquire lock.
    mLock.lock();

    // Bail out on error
    if (eventCount < 0) {
        if (errno == EINTR) {
            goto Done;
        }
        result = POLL_ERROR;
        goto Done;
    }

    // Check for poll timeout.
    if (eventCount == 0) {
        result = POLL_TIMEOUT;
        goto Done;
    }

    // Something happened: walk the fired events
    for (int i = 0; i < eventCount; i++) {
        int fd = eventItems[i].data.fd;
        uint32_t epollEvents = eventItems[i].events;
        if (fd == mWakeReadPipeFd) {
            if (epollEvents & EPOLLIN) {
                // The wake pipe became readable: drain it
                awoken();
            } else {
                // unexpected event on the wake pipe
            }
        } else {
            // Otherwise the fd was probably registered via Looper::addFd()
            ssize_t requestIndex = mRequests.indexOfKey(fd);
            if (requestIndex >= 0) {
                int events = 0;
                if (epollEvents & EPOLLIN) events |= EVENT_INPUT;
                if (epollEvents & EPOLLOUT) events |= EVENT_OUTPUT;
                if (epollEvents & EPOLLERR) events |= EVENT_ERROR;
                if (epollEvents & EPOLLHUP) events |= EVENT_HANGUP;
                pushResponse(events, mRequests.valueAt(requestIndex));
            } else {
                ALOGW("Ignoring unexpected epoll events 0x%x on fd %d that is "
                        "no longer registered.", epollEvents, fd);
            }
        }
    }

Done: ;
    // ... (covered below)
}
```

Setting the Done section aside for a moment, here are awoken() and wake():
```cpp
/* @path \system\core\libutils\Looper.cpp */
void Looper::awoken() {
    char buffer[16];
    ssize_t nRead;
    do {
        // Drain the read end of the wake pipe
        nRead = read(mWakeReadPipeFd, buffer, sizeof(buffer));
    } while ((nRead == -1 && errno == EINTR) || nRead == sizeof(buffer));
}

void Looper::wake() {
    ssize_t nWrite;
    do {
        // Write a single 'W' into the write end of the wake pipe
        nWrite = write(mWakeWritePipeFd, "W", 1);
    } while (nWrite == -1 && errno == EINTR);

    if (nWrite != 1) {
        if (errno != EAGAIN) {
            ALOGW("Could not write wake signal, errno=%d", errno);
        }
    }
}
```

wake() and awoken() are a matched pair: wake() writes a single 'W' into mWakeWritePipeFd, which makes the epoll_wait above report a readable event; awoken() then reads from mWakeReadPipeFd and consumes that byte so the pipe is empty again.
So when does wake() get called?
Recall this fragment from MessageQueue.enqueueMessage():
```java
// We can assume mPtr != 0 because mQuitting is false.
if (needWake) {
    nativeWake(mPtr);
}
```
nativeWake maps to the native wake(), and it runs when needWake is true, which happens in two situations. First, the queue is currently blocked and the new message becomes the new head: the timeout previously handed to epoll_wait was computed against the old head, so the sleeping thread must be woken to recompute it. Second, the head of the queue is a barrier and the new message is the earliest asynchronous message in the queue; the loop must be woken so that message can jump the barrier. (Note the reset inside the insertion loop: if an asynchronous message already sits ahead of the new one, needWake flips back to false, because a wakeup was already arranged for that earlier message.) The native message queue follows the same logic.
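The head-insertion case is easy to trigger from app code: Handler.postAtFrontOfQueue() enqueues with when == 0, so the new message becomes the queue head and a blocked looper is woken immediately:

```java
// when == 0 puts this message at the head of the queue; if the looper thread
// is parked in epoll_wait, enqueueMessage sets needWake and calls nativeWake().
handler.postAtFrontOfQueue(() -> Log.d("WakeDemo", "runs before all queued messages"));
```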
Finally, the Done section of pollInner:
```cpp
int Looper::pollInner(int timeoutMillis) {
    // .........
Done: ;
    // Deliver due native messages to their MessageHandlers
    mNextMessageUptime = LLONG_MAX;
    while (mMessageEnvelopes.size() != 0) {
        nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
        const MessageEnvelope& messageEnvelope = mMessageEnvelopes.itemAt(0);
        if (messageEnvelope.uptime <= now) {
            // Remove the envelope from the list.
            // We keep a strong reference to the handler until the call to handleMessage
            // finishes.  Then we drop it so that the handler can be deleted *before*
            // we reacquire our lock.
            { // obtain handler
                sp<MessageHandler> handler = messageEnvelope.handler;
                Message message = messageEnvelope.message;
                mMessageEnvelopes.removeAt(0);
                mSendingMessage = true;
                mLock.unlock();

                // Let the handler process the message
                handler->handleMessage(message);
            } // release handler

            mLock.lock();
            mSendingMessage = false;
            result = POLL_CALLBACK;
        } else {
            // The last message left at the head of the queue determines the next wakeup time.
            mNextMessageUptime = messageEnvelope.uptime;
            break;
        }
    }

    // Release lock.
    mLock.unlock();

    // Invoke all response callbacks.
    for (size_t i = 0; i < mResponses.size(); i++) {
        Response& response = mResponses.editItemAt(i);
        if (response.request.ident == POLL_CALLBACK) {
            int fd = response.request.fd;
            int events = response.events;
            void* data = response.request.data;
            int callbackResult = response.request.callback->handleEvent(fd, events, data);
            if (callbackResult == 0) {
                // A callback returning 0 unregisters the fd
                removeFd(fd);
            }
            // Clear the callback reference in the response structure promptly because we
            // will not clear the response vector itself until the next poll.
            response.request.callback.clear();
            result = POLL_CALLBACK;
        }
    }
    return result;
}
```
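The mRequests/mResponses machinery above serves file descriptors registered through the native Looper::addFd(). Since API 23, Java code can ride the same mechanism through the public MessageQueue.addOnFileDescriptorEventListener(). A sketch, where the pipe is just a convenient fd to watch (createPipe() throws IOException):

```java
// Watch an fd from Java; under the hood this goes through the native
// Looper::addFd() / epoll path shown above.
ParcelFileDescriptor[] pipe = ParcelFileDescriptor.createPipe(); // [0] = read end, [1] = write end
Looper.getMainLooper().getQueue().addOnFileDescriptorEventListener(
        pipe[0].getFileDescriptor(),
        MessageQueue.OnFileDescriptorEventListener.EVENT_INPUT,
        (fd, events) -> {
            // Runs on the main thread when the fd becomes readable.
            Log.d("FdDemo", "fd readable, events=" + events);
            return MessageQueue.OnFileDescriptorEventListener.EVENT_INPUT; // keep listening
        });
```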