知乎上看到这样一个问题Android中为什么主线程不会因为Looper.loop()里的死循环卡死?,于是试着对Handler源码重新看了一下,其实Android的消息机制是Pipe+epoll(了解epoll),有消息时则依次执行,没消息时调用epoll.wait等待唤醒;由于Android中生命周期、UI绘制都是动过Handler实现的,因此自然不会发生阻塞卡死。
Android为了保证主线程在生命周期内不退出,所以用了无限循环,在循环中,如果没有事件需要处理,则使用epoll进行休眠时监听,休眠会让线程让出CPU,因此节省系统资源。
而在next取消息时,如果当前没有消息要处理,则会调用本地层的Looper的pollOnce来epoll_wait监听之前创建的Pipe,主线程进入休眠状态;
而唤醒则需要别的线程进行唤醒;熟悉Android源码的知道,AMS向Activity发送生命周期消息是通过Binder来实现的,ActivityThread中有Binder的服务器端即ApplicationThread,它是运行在Binder线程中的;当AMS发送消息时,ApplicationThread会通过主线程的Handler向主线程发送消息,通过往pipe里面写入字节,来将epoll_wait唤醒,即将主线程唤醒;
而相仿的,当有触摸事件时,IMS通过WMS来向应用进程发送消息,它也是通过Binder来传递的,同样也会将主线程唤醒;因此来说主线程只会休眠等待唤醒,而并不会只有一个主线程而卡死。
1、创建Looper
之间Java层的简单分析Handler,Looper,Message,MessageQueue之间关系浅析,可以知道这几个主要类之间的简单的关系,从Looper.prepare()开始,知道这里会对Looper及MessageQueue进行初始化。
public final class Looper {
private Looper(boolean quitAllowed) {
mQueue = new MessageQueue(quitAllowed);
mThread = Thread.currentThread();
}
}
public final class MessageQueue {
// 本地方法对MessageQueue进行初始化
private native static long nativeInit();
MessageQueue(boolean quitAllowed) {
mQuitAllowed = quitAllowed;
mPtr = nativeInit();
}
}
可以看到这里有个native方法;查看其对应的JNI函数:
/*@path \frameworks\base\core\jni\android_os_MessageQueue.cpp*/
static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
// 创建一个本地NativeMessageQueue
NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
if (!nativeMessageQueue) {
jniThrowRuntimeException(env, "Unable to allocate native queue");
return 0;
}
// 增加强引用计数
nativeMessageQueue->incStrong(env);
// 返回新创建的NativeMessageQueue的地址
return reinterpret_cast(nativeMessageQueue);
}
可以看到这里创建了一个本地的 nativeMessageQueue 对象,并将地址传递给Java层,也就是MessageQueue中的ptr变量指向的就是本地的nativeMessageQueue
对象;
继续来看nativeMessageQueue 的初始化:
/*@path \frameworks\base\core\jni\android_os_MessageQueue.cpp*/
class NativeMessageQueue : public MessageQueue {
public:
NativeMessageQueue();
virtual ~NativeMessageQueue();
virtual void raiseException(JNIEnv* env, const char* msg, jthrowable exceptionObj);
void pollOnce(JNIEnv* env, int timeoutMillis);
void wake();
private:
bool mInCallback;
jthrowable mExceptionObj;
};
// 构造函数:
NativeMessageQueue::NativeMessageQueue() : mInCallback(false), mExceptionObj(NULL) {
// 类似于ThreadLocal机制
mLooper = Looper::getForThread();
if (mLooper == NULL) {
// 创建Looper
mLooper = new Looper(false);
Looper::setForThread(mLooper);
}
}
可以看到在NativeMessageQueue的构造函数中创建了(本地端的)Looper,在Java端,则是在Looper的构造函数中创建的MessageQueue,两者刚好相反。这里同样使用类似于ThreadLocal机制,来保证Looper的线程性。继续来看Looper构造函数:
Looper的定义在\system\core\include\utils\Looper.h,构造函数如下:
/*@path \system\core\libutils\Looper.cpp*/
// 传进来的allowNonCallbacks值为false
Looper::Looper(bool allowNonCallbacks) :
mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
mResponseIndex(0), mNextMessageUptime(LLONG_MAX) {
int wakeFds[2];
// 可以看到创建一个Pipe管道
int result = pipe(wakeFds);
// 获取管道的读写文件描述符
mWakeReadPipeFd = wakeFds[0];
mWakeWritePipeFd = wakeFds[1];
// 读写均设置为O_NONBLOCK非阻塞模式
result = fcntl(mWakeReadPipeFd, F_SETFL, O_NONBLOCK);
result = fcntl(mWakeWritePipeFd, F_SETFL, O_NONBLOCK);
mIdling = false;
// 创建一个epoll的句柄
mEpollFd = epoll_create(EPOLL_SIZE_HINT);
// 定义的event事件数据结构,epoll会将已经发生的事件赋值到eventItem数据结构中
struct epoll_event eventItem;
memset(& eventItem, 0, sizeof(epoll_event)); // 分配内存
// 设置为EPOLLIN:EPOLLIN事件则只有当对端有数据写入时才会触发,
// 所以触发一次后需要不断读取所有数据直到读完EAGAIN为止。
// 否则剩下的数据只有在下次对端有写入时才能一起取出来了。
eventItem.events = EPOLLIN;
eventItem.data.fd = mWakeReadPipeFd;
// 注册,将mWakeReadPipeFd加入到监控链表
result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeReadPipeFd, & eventItem);
}
创建了一个Pipe管道,并将其读文件描述符加入到监控列表中,监听是否有新数据要读。至此,Java层的Looper初始化工作完成。
二、Handler发送消息
Handler发送消息最终调用的都是:
public boolean sendMessageAtTime(Message msg, long uptimeMillis) {
MessageQueue queue = mQueue;
if (queue == null) {
RuntimeException e = new RuntimeException(
this + " sendMessageAtTime() called with no mQueue");
Log.w("Looper", e.getMessage(), e);
return false;
}
return enqueueMessage(queue, msg, uptimeMillis);
}
继续看enqueueMessage:
private boolean enqueueMessage(MessageQueue queue, Message msg, long uptimeMillis) {
msg.target = this;
if (mAsynchronous) {
msg.setAsynchronous(true);
}
return queue.enqueueMessage(msg, uptimeMillis);
}
再调用MessageQueue的入消息队列方法之前,看到有个mAsynchronous条件判断,这个用于判断事件是否是异步事件。平时代码中使用handler发送的消息均为同步事件。
/*@path \frameworks\base\core\java\android\os\MessageQueue.java*/
int enqueueSyncBarrier(long when) {
// Enqueue a new sync barrier token.
// We don't need to wake the queue because the purpose of a barrier is to stall it.
synchronized (this) {
// 每一次调用enqueueSyncBarrier都会自增1,这样每一个异步事件都会有一个唯一的ID
final int token = mNextBarrierToken++;
// 从Message池中获取一个Message
final Message msg = Message.obtain();
// 设置为inuse
msg.markInUse();
msg.when = when;
msg.arg1 = token;
Message prev = null;
Message p = mMessages;
// 如果when不为0,表示event是delay的,按照时间选择出最佳的插入位置
if (when != 0) {
while (p != null && p.when <= when) {
prev = p;
p = p.next;
}
}
// 将msg插入到消息队列中
if (prev != null) { // invariant: p == prev.next
msg.next = p;
prev.next = msg;
} else {
msg.next = p;
mMessages = msg;
}
return token;
}
}
这里获取了一个Message,并设置相关参数,然后把它插入到消息队列中,注意这里没有设置msg的target,调用该方法插入的事件的target为null。来看在消息处理next()时对其的特殊处理:
Message prevMsg = null;
Message msg = mMessages;
if (msg != null && msg.target == null) {
// Stalled by a barrier. Find the next asynchronous message in the queue.
do {
prevMsg = msg;
msg = msg.next;
} while (msg != null && !msg.isAsynchronous());
}
可以看到,一旦遇到msg的target为null的时候,此时不继续顺序执行同步事件,而是起到一个拦截器的作用,找到后面一个异步事件然后将其出队列进行执行。也就是
异步消息的执行原理
就是向同步事件队列插入一个Barriar进行拦截,等待异步事件被取出执行。这里保证了异步事件会优于同步事件而立即被执行。同步事件会一直等待,直至调用removeSyncBarrier方法。
继续来看enqueueMessage:
boolean enqueueMessage(Message msg, long when) {
// 这里是同步事件入队列,保证了异步事件不会混淆进来
if (msg.target == null) {
throw new IllegalArgumentException("Message must have a target.");
}
if (msg.isInUse()) {
throw new IllegalStateException(msg + " This message is already in use.");
}
synchronized (this) {
msg.markInUse();
msg.when = when;
Message p = mMessages;
boolean needWake;
// 将Message插入到合适的位置
if (p == null || when == 0 || when < p.when) {
// New head, wake up the event queue if blocked.
msg.next = p;
mMessages = msg;
needWake = mBlocked;
} else {
// Inserted within the middle of the queue. Usually we don't have to wake
// up the event queue unless there is a barrier at the head of the queue
// and the message is the earliest asynchronous message in the queue.
// 这里的needWake为true将会调用nativeWake,将会很有用处
needWake = mBlocked && p.target == null && msg.isAsynchronous();
Message prev;
for (;;) {
prev = p;
p = p.next;
if (p == null || when < p.when) {
break;
}
if (needWake && p.isAsynchronous()) {
needWake = false;
}
}
msg.next = p; // invariant: p == prev.next
prev.next = msg;
}
// We can assume mPtr != 0 because mQuitting is false.
if (needWake) {
nativeWake(mPtr);
}
}
return true;
}
三、消息处理
消息处理是Looper中的循环处理函数loop(),而loop的关键是通过MessageQueue的next方法中取出消息Message;
来看next;
Message next() {
// 这个ptr之前分析过,指向native端创建的nativeMessageQueue对象
final long ptr = mPtr;
if (ptr == 0) {
return null;
}
int pendingIdleHandlerCount = -1; // -1 only during first iteration
int nextPollTimeoutMillis = 0;
for (;;) {
if (nextPollTimeoutMillis != 0) {
Binder.flushPendingCommands();
}
// ********** 重要的方法 *********** //
nativePollOnce(ptr, nextPollTimeoutMillis);
synchronized (this) {
// Try to retrieve the next message. Return if found.
final long now = SystemClock.uptimeMillis();
Message prevMsg = null;
Message msg = mMessages;
// 由前面知道这里是barrier,用于处理队列中的异步事务
if (msg != null && msg.target == null) {
do {
prevMsg = msg;
msg = msg.next;
} while (msg != null && !msg.isAsynchronous());
}
if (msg != null) {
// 表示现在没有事件需要执行
if (now < msg.when) {
// 可以看到nextPollTimeoutMillis表示距离最近ready的Message执行有多久时间
nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
} else {
// 取出一个Message,简单的单链表删除操作
mBlocked = false;
// 这个对应异步事件
if (prevMsg != null) {
prevMsg.next = msg.next;
} else {
mMessages = msg.next;
}
msg.next = null;
return msg;
}
} else {
// 事件队列为空时,设置为-1
nextPollTimeoutMillis = -1;
}
// Process the quit message now that all pending messages have been handled.
if (mQuitting) {
dispose();
return null;
}
// If first time idle, then get the number of idlers to run.
// Idle handles only run if the queue is empty or if the first message
// in the queue (possibly a barrier) is due to be handled in the future.
// 当消息队列为空,或者当前没有消息需要执行时,即当前空闲状态下,执行IdleHandler的事件
if (pendingIdleHandlerCount < 0
&& (mMessages == null || now < mMessages.when)) {
pendingIdleHandlerCount = mIdleHandlers.size();
}
if (pendingIdleHandlerCount <= 0) {
// No idle handlers to run. Loop and wait some more.
mBlocked = true;
continue;
}
if (mPendingIdleHandlers == null) {
mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
}
mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
}
// Run the idle handlers.
// We only ever reach this code block during the first iteration.
for (int i = 0; i < pendingIdleHandlerCount; i++) {
final IdleHandler idler = mPendingIdleHandlers[i];
mPendingIdleHandlers[i] = null; // release the reference to the handler
boolean keep = false;
try {
keep = idler.queueIdle();
} catch (Throwable t) {
Log.wtf("MessageQueue", "IdleHandler threw exception", t);
}
if (!keep) {
synchronized (this) {
mIdleHandlers.remove(idler);
}
}
}
// Reset the idle handler count to 0 so we do not run them again.
pendingIdleHandlerCount = 0;
// While calling an idle handler, a new message could have been delivered
// so go back and look again for a pending message without waiting.
nextPollTimeoutMillis = 0;
}
}
来看本地方法nativePollOnce,其对应于native层的NativeMessageQueue::pollOnce:
void NativeMessageQueue::pollOnce(JNIEnv* env, int timeoutMillis) {
mInCallback = true;
mLooper->pollOnce(timeoutMillis);
mInCallback = false;
if (mExceptionObj) {
env->Throw(mExceptionObj);
env->DeleteLocalRef(mExceptionObj);
mExceptionObj = NULL;
}
}
可以看到本地层其实调用的是Looper::pollOnce:
int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
int result = 0;
for (;;) {
while (mResponseIndex < mResponses.size()) {
const Response& response = mResponses.itemAt(mResponseIndex++);
int ident = response.request.ident;
if (ident >= 0) {
int fd = response.request.fd;
int events = response.events;
void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - returning signalled identifier %d: "
"fd=%d, events=0x%x, data=%p",
this, ident, fd, events, data);
#endif
if (outFd != NULL) *outFd = fd;
if (outEvents != NULL) *outEvents = events;
if (outData != NULL) *outData = data;
return ident;
}
}
if (result != 0) {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - returning result %d", this, result);
#endif
if (outFd != NULL) *outFd = 0;
if (outEvents != NULL) *outEvents = 0;
if (outData != NULL) *outData = NULL;
return result;
}
//
result = pollInner(timeoutMillis);
}
}
接着调用pollInner:
/*@path \system\core\libutils\Looper.cpp*/
// timeoutMillis传入的即为nextPollTimeoutMillis
int Looper::pollInner(int timeoutMillis) {
// timeoutMillis表示Java层消息队列距离最近事件空闲时间
// mNextMessageUptime表示native层本地消息队列mMessageEnvelopes空闲时间
if (timeoutMillis != 0 && mNextMessageUptime != LLONG_MAX) {
nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
int messageTimeoutMillis = toMillisecondTimeoutDelay(now, mNextMessageUptime);
// 协调时间,设置timeoutMillis为两者最小值
if (messageTimeoutMillis >= 0
&& (timeoutMillis < 0 || messageTimeoutMillis < timeoutMillis)) {
timeoutMillis = messageTimeoutMillis;
}
}
// Poll.
int result = POLL_WAKE;
mResponses.clear();
mResponseIndex = 0;
// We are about to idle.
mIdling = true;
// ****************** 这里调用epoll_wait ********************** //
struct epoll_event eventItems[EPOLL_MAX_EVENTS];
// 监控事件
int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);
// No longer idling.
mIdling = false;
// Acquire lock.
mLock.lock();
// 出错直接返回
if (eventCount < 0) {
if (errno == EINTR) {
goto Done;
}
result = POLL_ERROR;
goto Done;
}
// Check for poll timeout.
if (eventCount == 0) {
result = POLL_TIMEOUT;
goto Done;
}
// 表示有事件发生,遍历eventItems
for (int i = 0; i < eventCount; i++) {
int fd = eventItems[i].data.fd;
uint32_t epollEvents = eventItems[i].events;
if (fd == mWakeReadPipeFd) {
if (epollEvents & EPOLLIN) {
// 调用awoken
awoken();
} else {
}
} else {
// 否则表示可能是通过Looper的addFD函数添加的
ssize_t requestIndex = mRequests.indexOfKey(fd);
if (requestIndex >= 0) {
int events = 0;
if (epollEvents & EPOLLIN) events |= EVENT_INPUT;
if (epollEvents & EPOLLOUT) events |= EVENT_OUTPUT;
if (epollEvents & EPOLLERR) events |= EVENT_ERROR;
if (epollEvents & EPOLLHUP) events |= EVENT_HANGUP;
pushResponse(events, mRequests.valueAt(requestIndex));
} else {
ALOGW("Ignoring unexpected epoll events 0x%x on fd %d that is "
"no longer registered.", epollEvents, fd);
}
}
}
Done: ;
.........
}
先不看Done部分,来看awoken:
/*@path \system\core\libutils\Looper.cpp*/
void Looper::awoken() {
char buffer[16];
ssize_t nRead;
do {
// 从mWakeReadPipeFd读取数据
nRead = read(mWakeReadPipeFd, buffer, sizeof(buffer));
} while ((nRead == -1 && errno == EINTR) || nRead == sizeof(buffer));
}
void Looper::wake() {
ssize_t nWrite;
do {
// 往mWakeWritePipeFd写入一个'W'
nWrite = write(mWakeWritePipeFd, "W", 1);
} while (nWrite == -1 && errno == EINTR);
if (nWrite != 1) {
if (errno != EAGAIN) {
ALOGW("Could not write wake signal, errno=%d", errno);
}
}
}
Looper中wake和awoken是成对存在的,wake往mWakeWritePipeFd写入一个'W',则epoll_wait时会监听到有事件发生;而awoken则从mWakeReadPipeFd读取数据,消耗掉该写入的字符。
那么调用wake的时机是什么?
在前面MessageQueue的enqueueMessage中,有这么一段:
// We can assume mPtr != 0 because mQuitting is false.
if (needWake) {
nativeWake(mPtr);
}
nativeWake对应的就是wake;而nativeWake调用的时机,是needWake为true;而needWake为true的情况时如果当前Java事件队列blocked,而当前队列头部插入一个新事件,这个时候需要调用wake来唤醒epoll_wait;因为此时timeout事件会相比之间值发生改变,需要重新计算;另一种情况时如果当前插入了异步事件,也需要唤醒(注意,如果消息队列中有异步消息并且执行时间在新消息之前,所以不需要唤醒)。对于native层的消息队列,也是类似。
接着来看pollInner中的Done部分:
int Looper::pollInner(int timeoutMillis) {
.........
Done: ;
// 调用Message的Callback
mNextMessageUptime = LLONG_MAX;
while (mMessageEnvelopes.size() != 0) {
nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
const MessageEnvelope& messageEnvelope = mMessageEnvelopes.itemAt(0);
if (messageEnvelope.uptime <= now) {
// Remove the envelope from the list.
// We keep a strong reference to the handler until the call to handleMessage
// finishes. Then we drop it so that the handler can be deleted *before*
// we reacquire our lock.
{ // obtain handler
sp handler = messageEnvelope.handler;
Message message = messageEnvelope.message;
mMessageEnvelopes.removeAt(0);
mSendingMessage = true;
mLock.unlock();
// 调用handleMessage处理事件
handler->handleMessage(message);
} // release handler
mLock.lock();
mSendingMessage = false;
result = POLL_CALLBACK;
} else {
// The last message left at the head of the queue determines the next wakeup time.
mNextMessageUptime = messageEnvelope.uptime;
break;
}
}
// Release lock.
mLock.unlock();
// Invoke all response callbacks.
for (size_t i = 0; i < mResponses.size(); i++) {
Response& response = mResponses.editItemAt(i);
if (response.request.ident == POLL_CALLBACK) {
int fd = response.request.fd;
int events = response.events;
void* data = response.request.data;
int callbackResult = response.request.callback->handleEvent(fd, events, data);
if (callbackResult == 0) {
removeFd(fd);
}
// Clear the callback reference in the response structure promptly because we
// will not clear the response vector itself until the next poll.
response.request.callback.clear();
result = POLL_CALLBACK;
}
}
return result;
}