Preface
The Message Mechanism
As we all know, Android is message-driven: starting an Activity and a whole series of other operations are implemented through the message mechanism. At the Java layer the mechanism is built mainly from a few classes:
- Message representation: Message
- Message queue: MessageQueue
- Message loop: Looper
- Message dispatch: Handler
Since Android 4.2, the native layer can also use a message mechanism to drive some of its own work. The two main C++ classes involved are Looper and NativeMessageQueue; the relevant source files are (paths may differ between versions):
1. system\core\libutils\Looper.cpp
2. system\core\include\utils\Looper.h
3. frameworks\base\core\jni\android_os_MessageQueue.cpp
epoll
Before diving into the source, the reader needs some familiarity with Linux's epoll. epoll is an I/O multiplexing mechanism: I/O multiplexing lets a single process monitor multiple file descriptors, and as soon as one becomes ready (typically readable or writable) the program is notified so it can perform the corresponding read or write. epoll, like select and poll, is essentially synchronous I/O, because the caller must do the read or write itself once the event is ready — that read/write step blocks. Asynchronous I/O, by contrast, does the transfer for you: its implementation copies the data from kernel space to user space.
epoll exposes three interfaces, all of which appear in the source analysis below; for details see the article "IO多路复用之epoll总结".
#include <sys/epoll.h>
int epoll_create(int size); //create an epoll instance and return its fd
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event); //register, modify, or remove watched fds
int epoll_wait(int epfd, struct epoll_event * events, int maxevents, int timeout); //wait for events to occur
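To make the three calls concrete, here is a minimal, self-contained sketch (an illustration under assumed names, not AOSP code) that watches a pipe's read end and waits for it to become readable:

#include <sys/epoll.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int pipefd[2];
    pipe(pipefd);

    int epfd = epoll_create(1);                      // the size hint is ignored by modern kernels
    struct epoll_event ev = {};
    ev.events = EPOLLIN;                             // notify when readable
    ev.data.fd = pipefd[0];
    epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev);  // register the read end

    write(pipefd[1], "x", 1);                        // make the read end readable

    struct epoll_event out[8];
    int n = epoll_wait(epfd, out, 8, 1000);          // block for up to 1000 ms
    printf("%d fd(s) ready\n", n);                   // prints: 1 fd(s) ready

    close(pipefd[0]); close(pipefd[1]); close(epfd);
    return 0;
}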
Looper Members
This article analyzes the message mechanism in three parts: initialization, message sending, and the message loop and dispatch. First let's look at the main member variables and functions of the native Looper.h (under system\core\include\utils). Why study Looper rather than android_os_MessageQueue? Because android_os_MessageQueue ultimately delegates everything to Looper's member functions and contains no complicated logic of its own.
The main members of Looper.h follow (feel free to skim this now and return after the source analysis).
const bool mAllowNonCallbacks; //whether watched fds may omit a callback; false for the Looper created here
int mWakeEventFd; //fd created via eventfd() and registered with epoll; writing to/reading from it wakes and quiets the loop
int mEpollFd; //fd of the epoll instance created by epoll_create, needed to register fds and wait for I/O events
bool mEpollRebuildRequired; //whether the epoll set needs to be rebuilt
void rebuildEpollLocked(); //re-create the epoll instance via epoll_create
void scheduleEpollRebuildLocked(); //set mEpollRebuildRequired to true so the epoll set gets rebuilt
//map-like KeyedVector storing the other fds epoll watches; the key is the fd,
//the value a Request holding the fd, its callback, and so on
KeyedVector<int, Request> mRequests;
struct Request { //request data structure
int fd; //file descriptor
int ident; //identifier returned by pollOnce when there is no callback
int events; //events to watch
int seq; //sequence number
sp<LooperCallback> callback; //callback invoked when an event fires
void* data;
void initEventItem(struct epoll_event* eventItem) const;
};
Vector<Response> mResponses; //Requests whose watched fds fired events and still need handling
void pushResponse(int events, const Request& request); //push an entry onto mResponses
struct Response { //response data structure
int events; //events that occurred
Request request; //the originating Request
};
Vector<MessageEnvelope> mMessageEnvelopes; //the native-layer message queue
nsecs_t mNextMessageUptime; //when the next native message is due
void sendMessageAtTime(nsecs_t uptime, const sp<MessageHandler>& handler,
const Message& message); //post a message onto the native message queue
struct MessageEnvelope { //envelope wrapping a native message
MessageEnvelope() : uptime(0) { }
MessageEnvelope(nsecs_t uptime, const sp<MessageHandler> handler,
const Message& message) : uptime(uptime), handler(handler), message(message) {
}
nsecs_t uptime;
sp<MessageHandler> handler;
Message message;
};
int pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData); //delegates to pollInner
int pollInner(int timeoutMillis); //blocks in epoll_wait and handles whatever events fire
void wake(); //wake epoll_wait by writing to mWakeEventFd
void awoken(); //drain mWakeEventFd; called from pollInner
//register additional fds for epoll to watch
int addFd(int fd, int ident, int events, Looper_callbackFunc callback, void* data);
int addFd(int fd, int ident, int events, const sp<LooperCallback>& callback, void* data);
//possible results of pollInner and pollOnce
enum {
POLL_WAKE = -1,
POLL_CALLBACK = -2,
POLL_TIMEOUT = -3,
POLL_ERROR = -4,
};
1. Initialization of the Message Mechanism
An Android app's main thread starts in ActivityThread.java's main function, and that is also where the message mechanism is initialized and its loop begins.
public static void main(String[] args) {
//......
//initialize the Looper
Looper.prepareMainLooper();
ActivityThread thread = new ActivityThread();
thread.attach(false);
if (sMainThreadHandler == null) {
sMainThreadHandler = thread.getHandler();
}
if (false) {
Looper.myLooper().setMessageLogging(new
LogPrinter(Log.DEBUG, "ActivityThread"));
}
// End of event ActivityThreadMain.
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
//start the Looper loop
Looper.loop();
throw new RuntimeException("Main thread loop unexpectedly exited");
}
The main thread initializes its Looper via Looper.prepareMainLooper() and starts the message loop and dispatch by calling Looper.loop().
public static void prepareMainLooper() {
prepare(false);
synchronized (Looper.class) {
if (sMainLooper != null) {
throw new IllegalStateException("The main Looper has already been prepared.");
}
sMainLooper = myLooper();
}
}
private static void prepare(boolean quitAllowed) {
if (sThreadLocal.get() != null) {
throw new RuntimeException("Only one Looper may be created per thread");
}
//store the Looper object in a ThreadLocal
sThreadLocal.set(new Looper(quitAllowed));
}
prepareMainLooper() news up a Looper and puts it in a ThreadLocal. Native Looper initialization does the same thing — a new Looper stored in thread-local storage — as the source below will show.
private Looper(boolean quitAllowed) {
mQueue = new MessageQueue(quitAllowed);
mThread = Thread.currentThread();
}
The Looper constructor initializes the message queue and the mThread variable.
private native static long nativeInit(); //initialize the native layer's MessageQueue and Looper
MessageQueue(boolean quitAllowed) {
mQuitAllowed = quitAllowed;
mPtr = nativeInit(); //call nativeInit
}
MessageQueue's constructor calls nativeInit, a native method — and the entry point for initializing the native-layer message mechanism.
static jlong android_os_MessageQueue_nativeInit(JNIEnv* env, jclass clazz) {
NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue(); //create a NativeMessageQueue object
if (!nativeMessageQueue) {
jniThrowRuntimeException(env, "Unable to allocate native queue");
return 0;
}
nativeMessageQueue->incStrong(env);
return reinterpret_cast<jlong>(nativeMessageQueue);
}
nativeInit lives in frameworks\base\core\jni\android_os_MessageQueue.cpp. It creates a NativeMessageQueue and returns the object's pointer to the Java layer, where MessageQueue stores it in mPtr.
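Returning a raw native pointer to Java as a jlong and casting it back on later calls is a standard JNI pattern. A minimal sketch (the names are illustrative assumptions, not the AOSP helpers):

#include <jni.h>

struct MyNativeObject { void doSomething() {} };  // hypothetical native peer

static jlong nativeCreate(JNIEnv* env, jclass clazz) {
    // Java keeps this value in a long field such as mPtr
    return reinterpret_cast<jlong>(new MyNativeObject());
}

static void nativeUse(JNIEnv* env, jclass clazz, jlong ptr) {
    // recover the pointer on each subsequent native call
    reinterpret_cast<MyNativeObject*>(ptr)->doSomething();
}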
Next, look at the NativeMessageQueue constructor:
NativeMessageQueue::NativeMessageQueue() :
mPollEnv(NULL), mPollObj(NULL), mExceptionObj(NULL) {
mLooper = Looper::getForThread();
if (mLooper == NULL) {
mLooper = new Looper(false);
Looper::setForThread(mLooper);
}
}
NativeMessageQueue's constructor likewise creates a Looper and stores it thread-locally (via getForThread and setForThread), much like the Java layer's use of ThreadLocal.
Next, how Looper::getForThread() and Looper::setForThread(mLooper) store the Looper in thread-local storage:
void Looper::setForThread(const sp<Looper>& looper) {
sp<Looper> old = getForThread(); // also has side-effect of initializing TLS
if (looper != NULL) {
looper->incStrong((void*)threadDestructor);
}
//store the Looper in TLS
pthread_setspecific(gTLSKey, looper.get());
if (old != NULL) {
old->decStrong((void*)threadDestructor);
}
}
sp<Looper> Looper::getForThread() {
int result = pthread_once(& gTLSOnce, initTLSKey);
LOG_ALWAYS_FATAL_IF(result != 0, "pthread_once failed");
//fetch the Looper from TLS
return (Looper*)pthread_getspecific(gTLSKey);
}
The Looper object is stored and fetched with int pthread_setspecific(pthread_key_t key, const void *value) and void *pthread_getspecific(pthread_key_t key). This is Linux TLS (thread-local storage), the counterpart of Java's ThreadLocal.
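A minimal sketch of the pthread TLS API (illustrative, not AOSP code) — each thread reads back its own value under the same key, just like a Java ThreadLocal:

#include <pthread.h>
#include <cstdio>

static pthread_key_t gKey;
static pthread_once_t gOnce = PTHREAD_ONCE_INIT;

static void initKey() { pthread_key_create(&gKey, nullptr); }

static void* worker(void*) {
    pthread_once(&gOnce, initKey);
    pthread_setspecific(gKey, (void*)"worker value");
    printf("worker sees: %s\n", (const char*)pthread_getspecific(gKey));
    return nullptr;
}

int main() {
    pthread_once(&gOnce, initKey);
    pthread_setspecific(gKey, (void*)"main value");
    pthread_t t;
    pthread_create(&t, nullptr, worker, nullptr);
    pthread_join(t, nullptr);
    // main's slot is untouched by the worker's pthread_setspecific
    printf("main sees: %s\n", (const char*)pthread_getspecific(gKey));
    return 0;
}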
Next, the Looper constructor:
Looper::Looper(bool allowNonCallbacks) : //mAllowNonCallbacks is false for the Looper created during initialization
mAllowNonCallbacks(allowNonCallbacks), mSendingMessage(false),
mPolling(false), mEpollFd(-1), mEpollRebuildRequired(false),
mNextRequestSeq(0), mResponseIndex(0), mNextMessageUptime(LLONG_MAX)
{
//create an eventfd; the returned mWakeEventFd is the fd epoll will watch
mWakeEventFd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
LOG_ALWAYS_FATAL_IF(mWakeEventFd < 0, "Could not make wake event fd: %s",
strerror(errno));
AutoMutex _l(mLock);
rebuildEpollLocked(); //create the epoll instance and register the eventfd with it
}
void Looper::rebuildEpollLocked() {
// Close old epoll instance if we have one.
if (mEpollFd >= 0) {
#if DEBUG_CALLBACKS
ALOGD("%p ~ rebuildEpollLocked - rebuilding epoll set", this);
#endif
close(mEpollFd);
}
//create the epoll instance and keep its descriptor
mEpollFd = epoll_create(EPOLL_SIZE_HINT);
LOG_ALWAYS_FATAL_IF(mEpollFd < 0, "Could not create epoll instance: %s", strerror(errno));
struct epoll_event eventItem;
memset(& eventItem, 0, sizeof(epoll_event)); // zero out unused members of data field union
//EPOLLIN: the watched file descriptor has data to read
eventItem.events = EPOLLIN;
eventItem.data.fd = mWakeEventFd;
//add mWakeEventFd to the epoll set, triggering when the fd becomes readable
int result = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeEventFd, & eventItem);
LOG_ALWAYS_FATAL_IF(result != 0, "Could not add wake event fd to epoll instance: %s",
strerror(errno));
//re-register the other fds the epoll instance watches
for (size_t i = 0; i < mRequests.size(); i++) {
const Request& request = mRequests.valueAt(i);
struct epoll_event eventItem;
request.initEventItem(&eventItem);
int epollResult = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, request.fd, & eventItem);
if (epollResult < 0) {
ALOGE("Error adding epoll events for fd %d while rebuilding epoll set: %s",
request.fd, strerror(errno));
}
}
}
At the start, the constructor assigns the incoming allowNonCallbacks to mAllowNonCallbacks — false here, as we saw. The flag is used later: it marks whether a watched fd is allowed to have no callback.
The constructor calls eventfd (see the reference "linux eventfd" for background) to create a file descriptor used for event notification; the EFD_NONBLOCK flag makes its I/O non-blocking, and the returned mWakeEventFd is the descriptor epoll watches next. In earlier versions a pipe was used as the epoll target instead.
The native Looper is built on epoll: epoll_create creates an epoll instance whose descriptor is saved in mEpollFd, and epoll_ctl then subscribes to I/O events on mWakeEventFd — specifically EPOLLIN, i.e. the eventfd has data to read. Later on, epoll watches other fds as well; Looper provides the addFd() function to register them.
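A minimal eventfd sketch (an illustration, not AOSP code): writes add to a kernel-side counter and make the fd readable, and a single read drains the counter — exactly the wake()/awoken() pairing analyzed below:

#include <sys/eventfd.h>
#include <unistd.h>
#include <cstdio>
#include <cstdint>

int main() {
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);

    uint64_t inc = 1;
    write(efd, &inc, sizeof(inc));  // "wake": fd becomes readable (counter = 1)
    write(efd, &inc, sizeof(inc));  // counter accumulates to 2

    uint64_t counter = 0;
    read(efd, &counter, sizeof(counter));  // "awoken": drains the counter
    printf("counter=%llu\n", (unsigned long long)counter);  // prints: counter=2

    close(efd);
    return 0;
}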
Next we look at Looper's two addFd overloads, because the message loop later has to handle events on the fds registered through them. Readers may skip this (and the fd-event handling later on) without losing the thread of the message-mechanism analysis.
int addFd(int fd, int ident, int events, ALooper_callbackFunc callback, void* data);
int addFd(int fd, int ident, int events, const sp<LooperCallback>& callback, void* data);
fd is the descriptor to watch; ident identifies the event being watched and must either be >= 0 or be ALOOPER_POLL_CALLBACK (-2); events are the events to watch; callback is invoked when an event fires.
Here is the addFd code.
int Looper::addFd(int fd, int ident, int events, const sp<LooperCallback>& callback, void* data) {
#if DEBUG_CALLBACKS
ALOGD("%p ~ addFd - fd=%d, ident=%d, events=0x%x, callback=%p, data=%p", this, fd, ident,
events, callback.get(), data);
#endif
if (!callback.get()) {
if (! mAllowNonCallbacks) {
ALOGE("Invalid attempt to set NULL callback but not allowed for this looper.");
return -1; //a NULL callback with mAllowNonCallbacks false fails immediately
}
if (ident < 0) {
ALOGE("Invalid attempt to set NULL callback with ident < 0.");
return -1;
}
} else {
ident = POLL_CALLBACK;
}
{ // acquire lock
AutoMutex _l(mLock);
//wrap everything into a Request object
Request request;
request.fd = fd;
request.ident = ident;
request.events = events;
request.seq = mNextRequestSeq++;
request.callback = callback;
request.data = data;
if (mNextRequestSeq == -1) mNextRequestSeq = 0; // reserve sequence number -1
struct epoll_event eventItem;
request.initEventItem(&eventItem); //initialize the epoll_event
ssize_t requestIndex = mRequests.indexOfKey(fd); //check whether the fd is already being watched
if (requestIndex < 0) {
//register the fd with epoll
int epollResult = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, fd, & eventItem);
if (epollResult < 0) {
ALOGE("Error adding epoll events for fd %d: %s", fd, strerror(errno));
return -1;
}
//store the Request in mRequests (a KeyedVector, similar to a Map), keyed by the watched fd
mRequests.add(fd, request);
} else {
int epollResult = epoll_ctl(mEpollFd, EPOLL_CTL_MOD, fd, & eventItem);
if (epollResult < 0) {
if (errno == ENOENT) { //the old fd may already have been closed; register it again
// Tolerate ENOENT because it means that an older file descriptor was
// closed before its callback was unregistered and meanwhile a new
// file descriptor with the same number has been created and is now
// being registered for the first time. This error may occur naturally
// when a callback has the side-effect of closing the file descriptor
// before returning and unregistering itself. Callback sequence number
// checks further ensure that the race is benign.
//
// Unfortunately due to kernel limitations we need to rebuild the epoll
// set from scratch because it may contain an old file handle that we are
// now unable to remove since its file descriptor is no longer valid.
// No such problem would have occurred if we were using the poll system
// call instead, but that approach carries others disadvantages.
#if DEBUG_CALLBACKS
ALOGD("%p ~ addFd - EPOLL_CTL_MOD failed due to file descriptor "
"being recycled, falling back on EPOLL_CTL_ADD: %s",
this, strerror(errno));
#endif
epollResult = epoll_ctl(mEpollFd, EPOLL_CTL_ADD, fd, & eventItem); //register the fd again
if (epollResult < 0) {
ALOGE("Error modifying or adding epoll events for fd %d: %s",
fd, strerror(errno));
return -1; //registration failed
}
scheduleEpollRebuildLocked(); //schedule a rebuild of the epoll set
} else {
ALOGE("Error modifying epoll events for fd %d: %s", fd, strerror(errno));
return -1;
}
}
mRequests.replaceValueAt(requestIndex, request); //replace with the new Request
}
} // release lock
return 1;
}
This is where mAllowNonCallbacks comes in: when it is true, callback may be NULL and pollOnce returns the ident as its result; when it is false a NULL callback is rejected. And whenever callback is non-NULL, the passed-in ident is ignored (it is overwritten with POLL_CALLBACK).
Since mAllowNonCallbacks was set to false at initialization, adding an fd with a NULL callback fails, and every successfully added fd ends up with ident == POLL_CALLBACK — which is crucial to how events on these fds are handled below.
addFd then wraps the fd, event set, ident and so on into a Request and looks the fd up in mRequests (a KeyedVector, similar to a Map). If the fd is not yet watched, it is registered via epoll_ctl(EPOLL_CTL_ADD) and the Request is stored in mRequests; if it is already watched, the registration is updated via epoll_ctl(EPOLL_CTL_MOD) and mRequests.replaceValueAt swaps in the new Request.
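For a sense of how addFd is used in practice, here is a hedged sketch using the NDK's public ALooper API (a thin wrapper over Looper::addFd); the pipe fd and callback are illustrative assumptions:

#include <android/looper.h>
#include <unistd.h>

static int onPipeReadable(int fd, int events, void* data) {
    char buf[64];
    read(fd, buf, sizeof(buf));  // drain the pipe
    return 1;                    // 1 = keep the callback registered, 0 = unregister
}

void watchPipe(int pipeReadFd) {
    ALooper* looper = ALooper_forThread();  // the calling thread's Looper
    ALooper_addFd(looper, pipeReadFd, ALOOPER_POLL_CALLBACK,
                  ALOOPER_EVENT_INPUT,      // wake when the fd becomes readable
                  onPipeReadable, nullptr);
}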
2. Sending Messages
The Java layer has many message-sending entry points, but they all eventually reach MessageQueue.enqueueMessage.
boolean enqueueMessage(Message msg, long when) {
if (msg.target == null) {
throw new IllegalArgumentException("Message must have a target.");
}
if (msg.isInUse()) {
throw new IllegalStateException(msg + " This message is already in use.");
}
synchronized (this) {
if (mQuitting) {
IllegalStateException e = new IllegalStateException(
msg.target + " sending message to a Handler on a dead thread");
Log.w(TAG, e.getMessage(), e);
msg.recycle();
return false;
}
msg.markInUse();
msg.when = when;
Message p = mMessages;
boolean needWake;
//msg becomes the new head in three cases: p is null (the queue is empty),
//the new message's execution time (when) is 0, or it runs earlier than the
//message at the head of the queue; needWake then depends on mBlocked
if (p == null || when == 0 || when < p.when) {
// New head, wake up the event queue if blocked.
msg.next = p;
mMessages = msg;
needWake = mBlocked; //wake only if the queue is currently blocked
} else {
//otherwise insert the message at the position its when dictates;
//a wake is only potentially needed when the queue is blocked, the head is a
//barrier (target == null), and the new message is asynchronous
needWake = mBlocked && p.target == null && msg.isAsynchronous();
Message prev;
for (;;) {
prev = p;
p = p.next;
if (p == null || when < p.when) {
break;
}
if (needWake && p.isAsynchronous()) {
needWake = false; //an earlier asynchronous message is already queued, so no wake is needed
}
}
msg.next = p; // invariant: p == prev.next
prev.next = msg;
}
//call nativeWake so nativePollOnce stops waiting
if (needWake) {
nativeWake(mPtr);
}
}
return true;
}
So sending a message inserts it at the right place in the queue based on the queue's state and the message's execution time (when). The message becomes the new head in three cases: the queue is empty; its when is 0; or it runs earlier than the current head message. Otherwise it is inserted at the position its when dictates. With that, the message has been posted.
Now look at the condition under which nativeWake breaks nativePollOnce out of its wait: broadly, the queue must currently be blocked (mBlocked) and the new message must land at the head of the queue (or, behind a barrier, be the earliest asynchronous message) — only then is the waiting epoll instance woken.
On to the C++ side of nativeWake:
//android_os_MessageQueue.cpp
static void android_os_MessageQueue_nativeWake(JNIEnv* env, jclass clazz, jlong ptr) {
NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
nativeMessageQueue->wake();
}
void NativeMessageQueue::wake() {
mLooper->wake();
}
//Looper.cpp
void Looper::wake() {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ wake", this);
#endif
uint64_t inc = 1;
//write one uint64_t to the eventfd to wake up epoll
ssize_t nWrite = TEMP_FAILURE_RETRY(write(mWakeEventFd, &inc, sizeof(uint64_t)));
if (nWrite != sizeof(uint64_t)) {
if (errno != EAGAIN) {
ALOGW("Could not write wake signal: %s", strerror(errno));
}
}
}
void Looper::awoken() {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ awoken", this);
#endif
uint64_t counter;
//read one uint64_t back out, draining the counter
TEMP_FAILURE_RETRY(read(mWakeEventFd, &counter, sizeof(uint64_t)));
}
nativeWake lands in the native Looper's wake function, which writes a uint64_t into the eventfd; since the eventfd now has data to read, epoll_wait wakes from its wait. The matching read happens in awoken, which is called after epoll_wait returns — it merely drains the data so processing can continue. Where awoken is invoked is covered below.
Java is not the only layer that can send messages: the native layer sends them through Looper's sendMessageAtTime.
void Looper::sendMessageAtTime(nsecs_t uptime, const sp<MessageHandler>& handler,
const Message& message) {
#if DEBUG_CALLBACKS
ALOGD("%p ~ sendMessageAtTime - uptime=%" PRId64 ", handler=%p, what=%d",
this, uptime, handler.get(), message.what);
#endif
size_t i = 0;
{ // acquire lock
AutoMutex _l(mLock);
//insert in order of execution time (uptime)
size_t messageCount = mMessageEnvelopes.size();
while (i < messageCount && uptime >= mMessageEnvelopes.itemAt(i).uptime) {
i += 1;
}
MessageEnvelope messageEnvelope(uptime, handler, message);
mMessageEnvelopes.insertAt(messageEnvelope, i, 1);
// Optimization: If the Looper is currently sending a message, then we can skip
// the call to wake() because the next thing the Looper will do after processing
// messages is to decide when the next wakeup time should be. In fact, it does
// not even matter whether this code is running on the Looper thread.
if (mSendingMessage) {
return;
}
} // release lock
// if the new message ended up at the head of the queue, wake epoll_wait
if (i == 0) {
wake();
}
}
So a native message is wrapped in a MessageEnvelope holding the Message, the Handler that will process it, and its execution time uptime, then queued in time order; if it lands at the front of the queue, wake is called to rouse epoll_wait. The native send path closely mirrors the Java one.
From the Java and native analyses above: both layers enqueue messages sorted by execution time (when / uptime) and then decide whether to wake the waiting epoll_wait so the newly inserted message gets handled; the wake is performed by writing to the epoll-watched mWakeEventFd.
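To round off the send path, here is a hedged sketch of posting a native message, assuming the MessageHandler/sendMessage declarations in utils/Looper.h; TickHandler is an illustrative name:

#include <cstdio>
#include <utils/Looper.h>

using namespace android;

struct TickHandler : public MessageHandler {  // hypothetical handler
    void handleMessage(const Message& message) override {
        // runs on the Looper's thread once the message is due
        printf("native message what=%d\n", message.what);
    }
};

void postTick(const sp<Looper>& looper) {
    sp<MessageHandler> handler = new TickHandler();
    looper->sendMessage(handler, Message(1));  // due immediately
    // due in 500 ms (nanoseconds); delegates to sendMessageAtTime
    looper->sendMessageDelayed(500 * 1000000LL, handler, Message(2));
}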
3. The Message Loop and Handling
The message loop and dispatch happen in the Java-layer Looper.loop(), which we analyze next.
public static void loop() {
final Looper me = myLooper();
if (me == null) {
throw new RuntimeException("No Looper; Looper.prepare() wasn't called on this thread.");
}
final MessageQueue queue = me.mQueue;
// Make sure the identity of this thread is that of the local process,
// and keep track of what that identity token actually is.
Binder.clearCallingIdentity();
final long ident = Binder.clearCallingIdentity();
for (;;) {
Message msg = queue.next(); // fetch the next message; this may block
if (msg == null) {
// No message indicates that the message queue is quitting.
return;
}
// This must be in a local variable, in case a UI event sets the logger
Printer logging = me.mLogging;
if (logging != null) {
logging.println(">>>>> Dispatching to " + msg.target + " " +
msg.callback + ": " + msg.what);
}
//dispatch the message to the Handler that owns it
msg.target.dispatchMessage(msg);
if (logging != null) {
logging.println("<<<<< Finished to " + msg.target + " " + msg.callback);
}
// Make sure that during the course of dispatching the
// identity of the thread wasn't corrupted.
final long newIdent = Binder.clearCallingIdentity();
if (ident != newIdent) {
Log.wtf(TAG, "Thread identity changed from 0x"
+ Long.toHexString(ident) + " to 0x"
+ Long.toHexString(newIdent) + " while dispatching to "
+ msg.target.getClass().getName() + " "
+ msg.callback + " what=" + msg.what);
}
//recycle the message once it has been handled; the queue is a linked list, so unrecycled messages would keep growing memory use
msg.recycleUnchecked();
}
}
loop is an infinite loop: each iteration fetches the next message via queue.next(), dispatches it through msg.target.dispatchMessage(msg) to the Handler that owns it, and recycles the message once handled. queue.next() may block because it ultimately waits in epoll_wait.
Next, the MessageQueue.next() function.
Message next() {
// Return here if the message loop has already quit and been disposed.
// This can happen if the application tries to restart a looper after quit
// which is not supported.
final long ptr = mPtr;
if (ptr == 0) {
return null;
}
int pendingIdleHandlerCount = -1; // -1 only during first iteration
int nextPollTimeoutMillis = 0;
for (;;) {
if (nextPollTimeoutMillis != 0) {
Binder.flushPendingCommands();
}
//call nativePollOnce and wait up to nextPollTimeoutMillis; this is where the thread may block
nativePollOnce(ptr, nextPollTimeoutMillis);
synchronized (this) {
// Try to retrieve the next message. Return if found.
final long now = SystemClock.uptimeMillis();
Message prevMsg = null;
Message msg = mMessages;
//if the head message is a barrier, walk forward to the first asynchronous message
if (msg != null && msg.target == null) {
// Stalled by a barrier. Find the next asynchronous message in the queue.
do {
prevMsg = msg;
msg = msg.next;
} while (msg != null && !msg.isAsynchronous());
}
if (msg != null) {
if (now < msg.when) {
// Next message is not ready. Set a timeout to wake up when it is ready.
//the next message is not due yet; compute how long until it is —
//this is also how long nativePollOnce waits on the next iteration
nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE);
} else {
// Got a message.
mBlocked = false;
if (prevMsg != null) {
prevMsg.next = msg.next;
} else {
mMessages = msg.next;
}
msg.next = null;
if (DEBUG) Log.v(TAG, "Returning message: " + msg);
msg.markInUse();
return msg; //hand the message back to Looper for dispatch
}
} else {
// No more messages.
nextPollTimeoutMillis = -1;
}
// Process the quit message now that all pending messages have been handled.
if (mQuitting) {
dispose();
return null;
}
// Run registered IdleHandlers: when there are no messages due, Looper lets IdleHandlers do work such as garbage collection
if (pendingIdleHandlerCount < 0
&& (mMessages == null || now < mMessages.when)) {
pendingIdleHandlerCount = mIdleHandlers.size();
}
if (pendingIdleHandlerCount <= 0) {
// No idle handlers to run. Loop and wait some more.
//no IdleHandlers to run: mark the queue blocked (mBlocked = true below) and continue into the next iteration to wait
mBlocked = true;
continue;
}
if (mPendingIdleHandlers == null) {
mPendingIdleHandlers = new IdleHandler[Math.max(pendingIdleHandlerCount, 4)];
}
mPendingIdleHandlers = mIdleHandlers.toArray(mPendingIdleHandlers);
}
// Run the idle handlers.
// We only ever reach this code block during the first iteration.
for (int i = 0; i < pendingIdleHandlerCount; i++) {
final IdleHandler idler = mPendingIdleHandlers[i];
mPendingIdleHandlers[i] = null; // release the reference to the handler
boolean keep = false;
try {
keep = idler.queueIdle();
} catch (Throwable t) {
Log.wtf(TAG, "IdleHandler threw exception", t);
}
if (!keep) {
synchronized (this) {
mIdleHandlers.remove(idler);
}
}
}
// Reset the idle handler count to 0 so we do not run them again.
pendingIdleHandlerCount = 0;
// While calling an idle handler, a new message could have been delivered
// so go back and look again for a pending message without waiting.
nextPollTimeoutMillis = 0; //reset to 0 so the next poll returns immediately
}
}
Once in the native layer, nativePollOnce decides — based on the native message queue and nextPollTimeoutMillis — whether to block in epoll_wait; this is covered shortly.
When nativePollOnce returns, next() looks at the head of the Java message queue. If the head is a barrier (a message with target == null), it walks forward to the first asynchronous message. If there is no message to run, nextPollTimeoutMillis is set to -1. Otherwise the head message's execution time (when) is checked: if it is due, the message is marked in use, unlinked from the queue, and returned; if not, nextPollTimeoutMillis = (int) Math.min(msg.when - now, Integer.MAX_VALUE), i.e. how long until that message is due.
Two variables in next() deserve attention: nextPollTimeoutMillis and mBlocked.
nextPollTimeoutMillis is the argument passed to nativePollOnce: -1 means there are no pending messages (wait indefinitely), while a value >= 0 is how long until the next Java message is due (0 meaning poll without waiting).
mBlocked records whether the thread is blocked: it is set to false when a message is ready to be handled immediately, and to true when there is no message (or the next one is not yet due) and there are no IdleHandlers left to run.
Next, the nativePollOnce function.
static void android_os_MessageQueue_nativePollOnce(JNIEnv* env, jobject obj,
jlong ptr, jint timeoutMillis) {
NativeMessageQueue* nativeMessageQueue = reinterpret_cast<NativeMessageQueue*>(ptr);
nativeMessageQueue->pollOnce(env, obj, timeoutMillis);
}
void NativeMessageQueue::pollOnce(JNIEnv* env, jobject pollObj, int timeoutMillis) {
mPollEnv = env;
mPollObj = pollObj;
mLooper->pollOnce(timeoutMillis);
mPollObj = NULL;
mPollEnv = NULL;
if (mExceptionObj) {
env->Throw(mExceptionObj);
env->DeleteLocalRef(mExceptionObj);
mExceptionObj = NULL;
}
}
inline int pollOnce(int timeoutMillis) {
return pollOnce(timeoutMillis, NULL, NULL, NULL);
}
int Looper::pollOnce(int timeoutMillis, int* outFd, int* outEvents, void** outData) {
int result = 0;
for (;;) { //loops until pollInner has run once, then exits
while (mResponseIndex < mResponses.size()) {
const Response& response = mResponses.itemAt(mResponseIndex++);
int ident = response.request.ident;
//per the analysis above, ident is always POLL_CALLBACK (-2) here, so this branch is not taken
if (ident >= 0) {
int fd = response.request.fd;
int events = response.events;
void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - returning signalled identifier %d: "
"fd=%d, events=0x%x, data=%p",
this, ident, fd, events, data);
#endif
if (outFd != NULL) *outFd = fd;
if (outEvents != NULL) *outEvents = events;
if (outData != NULL) *outData = data;
return ident;
}
}
if (result != 0) { //entered on the second pass through the loop; returns, ending the loop
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - returning result %d", this, result);
#endif
if (outFd != NULL) *outFd = 0;
if (outEvents != NULL) *outEvents = 0;
if (outData != NULL) *outData = NULL;
return result;
}
//call the pollInner function
result = pollInner(timeoutMillis);
}
}
So nativePollOnce bottoms out in Looper::pollOnce, which in turn calls pollInner.
int Looper::pollInner(int timeoutMillis) {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - waiting: timeoutMillis=%d", this, timeoutMillis);
#endif
//use the earlier of the next native message and the next Java message as the epoll timeout
if (timeoutMillis != 0 && mNextMessageUptime != LLONG_MAX) {
nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
//compute how long until the next native message is due
int messageTimeoutMillis = toMillisecondTimeoutDelay(now, mNextMessageUptime);
//take the native timeout when it is >= 0 and either the Java layer has no messages (timeoutMillis < 0) or the native message is due sooner
if (messageTimeoutMillis >= 0
&& (timeoutMillis < 0 || messageTimeoutMillis < timeoutMillis)) {
timeoutMillis = messageTimeoutMillis;
}
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - next message in %" PRId64 "ns, adjusted timeout: timeoutMillis=%d",
this, mNextMessageUptime - now, timeoutMillis);
#endif
}
// Poll.
int result = POLL_WAKE;
mResponses.clear();
mResponseIndex = 0;
// We are about to idle.
mPolling = true;
//epoll waits here for watched events to occur
struct epoll_event eventItems[EPOLL_MAX_EVENTS];
int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis);
// No longer idling.
mPolling = false;
// Acquire lock.
mLock.lock();
// Rebuild epoll set if needed.
if (mEpollRebuildRequired) {
mEpollRebuildRequired = false;
rebuildEpollLocked();
goto Done;
}
// an event count below 0 means an error occurred
if (eventCount < 0) {
if (errno == EINTR) {
goto Done;
}
ALOGW("Poll failed with an unexpected error: %s", strerror(errno));
result = POLL_ERROR; //POLL_ERROR = -4
goto Done;
}
// an event count of 0 means the wait timed out
if (eventCount == 0) {
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - timeout", this);
#endif
result = POLL_TIMEOUT; //POLL_TIMEOUT = -3
goto Done;
}
// a positive event count means events fired; process them
#if DEBUG_POLL_AND_WAKE
ALOGD("%p ~ pollOnce - handling events from %d fds", this, eventCount);
#endif
for (int i = 0; i < eventCount; i++) {
int fd = eventItems[i].data.fd;
uint32_t epollEvents = eventItems[i].events;
if (fd == mWakeEventFd) { //wake() was called — e.g. a new message was enqueued
if (epollEvents & EPOLLIN) {
awoken(); //call awoken to drain the eventfd
} else {
ALOGW("Ignoring unexpected epoll events 0x%x on wake event fd.", epollEvents);
}
} else { //handle events on the other watched fds
ssize_t requestIndex = mRequests.indexOfKey(fd); //look up the Request for the fd whose I/O event fired
if (requestIndex >= 0) {
int events = 0;
if (epollEvents & EPOLLIN) events |= EVENT_INPUT;
if (epollEvents & EPOLLOUT) events |= EVENT_OUTPUT;
if (epollEvents & EPOLLERR) events |= EVENT_ERROR;
if (epollEvents & EPOLLHUP) events |= EVENT_HANGUP;
pushResponse(events, mRequests.valueAt(requestIndex)); //push the pending Request onto mResponses (a Vector)
} else {
ALOGW("Ignoring unexpected epoll events 0x%x on fd %d that is "
"no longer registered.", epollEvents, fd);
}
}
}
Done: ;
// Invoke pending message callbacks.
mNextMessageUptime = LLONG_MAX;
while (mMessageEnvelopes.size() != 0) {
nsecs_t now = systemTime(SYSTEM_TIME_MONOTONIC);
//take the head native message; if it is due, dispatch it to its handler
const MessageEnvelope& messageEnvelope = mMessageEnvelopes.itemAt(0);
if (messageEnvelope.uptime <= now) {
// Remove the envelope from the list.
// We keep a strong reference to the handler until the call to handleMessage
// finishes. Then we drop it so that the handler can be deleted *before*
// we reacquire our lock.
{ // obtain handler
sp<MessageHandler> handler = messageEnvelope.handler;
Message message = messageEnvelope.message;
mMessageEnvelopes.removeAt(0);
mSendingMessage = true;
mLock.unlock();
#if DEBUG_POLL_AND_WAKE || DEBUG_CALLBACKS
ALOGD("%p ~ pollOnce - sending message: handler=%p, what=%d",
this, handler.get(), message.what);
#endif
handler->handleMessage(message); //dispatch the native Message
} // release handler
mLock.lock();
mSendingMessage = false;
result = POLL_CALLBACK;
} else {
// The last message left at the head of the queue determines the next wakeup time.
mNextMessageUptime = messageEnvelope.uptime; //record when the next native message is due
break;
}
}
// Release lock.
mLock.unlock();
//handle every fd that had an I/O event by invoking its callback
// Invoke all response callbacks.
for (size_t i = 0; i < mResponses.size(); i++) {
Response& response = mResponses.editItemAt(i);
if (response.request.ident == POLL_CALLBACK) {
int fd = response.request.fd;
int events = response.events;
void* data = response.request.data;
#if DEBUG_POLL_AND_WAKE || DEBUG_CALLBACKS
ALOGD("%p ~ pollOnce - invoking fd event callback %p: fd=%d, events=0x%x, data=%p",
this, response.request.callback.get(), fd, events, data);
#endif
// Invoke the callback. Note that the file descriptor may be closed by
// the callback (and potentially even reused) before the function returns so
// we need to be a little careful when removing the file descriptor afterwards.
int callbackResult = response.request.callback->handleEvent(fd, events, data); //invoke the callback
if (callbackResult == 0) {
removeFd(fd, response.request.seq); //unregister the fd
}
// Clear the callback reference in the response structure promptly because we
// will not clear the response vector itself until the next poll.
response.request.callback.clear(); //drop the callback reference
result = POLL_CALLBACK; //POLL_CALLBACK = -2
}
}
return result;
}
From the analysis above we know that Java-layer messages live in the Java MessageQueue's mMessages and native messages in the native Looper's mMessageEnvelopes — two message queues, each sorted by time. timeoutMillis is how long until the next Java message is due; mNextMessageUptime is when the next native message is due. If timeoutMillis is 0, epoll_wait is simply given a timeout of 0; if timeoutMillis is -1 (the Java layer has no messages), the timeout is derived from mNextMessageUptime, so the native layer's next message decides how long epoll_wait waits; otherwise the sooner of the two wins. For example, with timeoutMillis == -1 and a native message due in 100 ms, epoll_wait is called with a 100 ms timeout.
epoll_wait returns the number of watched file descriptors with pending I/O events (eventCount); there are three cases:
- eventCount < 0: an error occurred
- eventCount == 0: the wait timed out; no watched fd had an event
- eventCount > 0: one or more watched fds have events
The first two cases jump straight to the code after the Done label via goto. In the third case each event is processed in a loop: an EPOLLIN event on mWakeEventFd means calling awoken (this is where the awoken function discussed earlier runs); any other fd must have been registered through addFd, so the event is wrapped into a Response and pushed onto the mResponses queue (a Vector).
Next comes the Done section. It first takes the head native message from mMessageEnvelopes; if its execution time has arrived, the MessageHandler stored inside the envelope handles it via handleMessage, the message is removed from the native queue, and result is set to POLL_CALLBACK; otherwise mNextMessageUptime is set to when the native queue's next message is due.
Finally it iterates over mResponses (just filled by pushResponse): each entry with response.request.ident == POLL_CALLBACK has its registered callback's handleEvent(fd, events, data) invoked and is then cleared from the mResponses queue. After this pass, anything left in mResponses would have ident >= 0 and a NULL callback — but since NativeMessageQueue created its Looper with mAllowNonCallbacks false, mResponses is guaranteed to be empty once this handling completes.
Back in pollOnce, which is a for loop: pollInner has already handled every Response with ident == POLL_CALLBACK, so on the second pass through the loop a non-empty mResponses would yield a Response with ident >= 0, whose ident would be returned for pollOnce's caller to handle. Here, though, pollOnce is called from NativeMessageQueue, which ignores the return value — and with mAllowNonCallbacks false that branch can never be entered anyway. pollInner never returns 0 (its result is always negative), so pollOnce's for loop runs exactly twice and returns on the second pass.
4. Summary
This walkthrough leans on a good deal of Linux knowledge and will be heavy going without that background. It shows very clearly how Android builds its framework layer on top of Linux kernel facilities, which is a great help when studying Android internals.
References
- IO多路复用之epoll总结 (a summary of epoll-based I/O multiplexing)
- TLS--线程局部存储 (thread-local storage)
- linux eventfd
- Android消息处理机制(Handler、Looper、MessageQueue与Message) (Android message handling: Handler, Looper, MessageQueue and Message)