Android 5.0 Source Code Study --- Inter-Process Communication: Linux Kernel and Native Source Analysis

What is inter-process communication?

In general, each process has its own address space; you can convince yourself of this by looking at the series of mappings that are set up when an executable is loaded into memory. As a consequence, if we have two processes, A and B, data declared in process A is not available to process B, and process B cannot see events that happen in process A, and vice versa. If A and B are to cooperate on a task, there must be a way to communicate data and events between the two processes.

Processes differ from threads here: threads within the same process can share data, and the programmer only has to take care of synchronizing access to it; separate processes cannot, so an explicit inter-process communication mechanism is required, as the small illustration below shows.
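A minimal, self-contained illustration (plain POSIX code, not taken from the Android source): after fork(), the child's write to a global variable is invisible to the parent, because each process works on its own copy of the address space.

#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int g_counter = 0;

int main() {
    pid_t pid = fork();
    if (pid == 0) {                // child: modifies its own copy of the page
        g_counter = 100;
        return 0;
    }
    waitpid(pid, nullptr, 0);      // wait for the child to finish
    printf("parent sees g_counter = %d\n", g_counter);   // still prints 0
    return 0;
}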

How Android's IPC mechanism is designed:

Think about where IPC is normally needed in an Android system: a Service communicating with other components, such as an Activity.

The designers adopted the Binder mechanism for inter-process communication (a "binder" is literally an adhesive). In Android, three kinds of participants are involved in IPC:

1. Server

2. Client

3. Service Manager

Now let's walk through the communication between a Service A and an Activity B, assuming they live in different processes.

First the system starts a Service Manager. This component runs in its own process, independent of the components that need to communicate, and it turns itself into a daemon. Service A registers itself with the Service Manager and acts as the Server, waiting for requests to arrive; Activity B acts as the Client. When the two need to talk, the Service Manager coordinates the interaction between them.

Below we unpack the Binder mechanism step by step, following the process just described.

The role of the Service Manager:

As briefly mentioned above, the Service Manager is a daemon; think of it as an administrator. It offers Servers a remote interface through which they register themselves, and offers Clients a query interface through which they look up a Server's service. It helps to view one inter-process exchange as one act of service, with one process serving another; the Service Manager is the platform on which processes offer each other those services. Here is the Service Manager's entry point:

int main(int argc, char **argv)
{
    struct binder_state *bs;

    bs = binder_open(128*1024);
    if (!bs) {
        ALOGE("failed to open binder driver\n");
        return -1;
    }

    if (binder_become_context_manager(bs)) {
        ALOGE("cannot become context manager (%s)\n", strerror(errno));
        return -1;
    }
    //........................... (security / SELinux checks omitted)
    binder_loop(bs, svcmgr_handler);

    return 0;
}
Let's look at the first part:

    struct binder_state *bs;

    bs = binder_open(128*1024);
struct binder_state
{
    int fd;        // file descriptor for /dev/binder
    void *mapped;  // start address of the mapping in this process's address space
    size_t mapsize;// size of the mapping
};
This call opens the Binder device file and fills the information into the binder_state structure bs. Here is binder_open:

struct binder_state *binder_open(size_t mapsize)
{
    struct binder_state *bs;
    struct binder_version vers;

    bs = malloc(sizeof(*bs));
    if (!bs) {
        errno = ENOMEM;
        return NULL;
    }

    bs->fd = open("/dev/binder", O_RDWR);
    if (bs->fd < 0) {
        fprintf(stderr,"binder: cannot open device (%s)\n",
                strerror(errno));
        goto fail_open;
    }

    if ((ioctl(bs->fd, BINDER_VERSION, &vers) == -1) ||
        (vers.protocol_version != BINDER_CURRENT_PROTOCOL_VERSION)) {
        fprintf(stderr, "binder: driver version differs from user space\n");
        goto fail_open;
    }

    bs->mapsize = mapsize;
    bs->mapped = mmap(NULL, mapsize, PROT_READ, MAP_PRIVATE, bs->fd, 0);
    if (bs->mapped == MAP_FAILED) {
        fprintf(stderr,"binder: cannot map device (%s)\n",
                strerror(errno));
        goto fail_map;
    }

    return bs;

fail_map:
    close(bs->fd);
fail_open:
    free(bs);
    return NULL;
}
As we can see, binder_open mainly fills in the members of the bs structure: the file descriptor, the start address of the mapping in the process's address space, and the size of the mapping. This structure will play a crucial role in all later operations on that memory.

One statement in the function deserves attention:

    bs->fd = open("/dev/binder", O_RDWR);
On the kernel side, this open of /dev/binder causes the Binder driver to create a struct binder_proc that stores the context information of the process that opened the device file.

This structure contains four important members, each of them a red-black tree:

threads: the binder threads of this process that serve requests
nodes: the Binder entities (binder_node) owned by this process
refs_by_desc and refs_by_node: the Binder references held by this process, indexed by handle and by target node respectively

Here we see the first use of the device file we went to such lengths to open: the driver uses it to keep important information about the requesting process, information that will matter throughout the rest of the communication. A simplified sketch of binder_proc is shown below.
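The following is a hedged, heavily abridged sketch of struct binder_proc; the field names follow the Binder driver source (drivers/staging/android/binder.c), but the real structure has many more members, and the kernel types rb_root and list_head are stubbed out here so the sketch stands on its own.

#include <cstddef>

struct rb_root   { void *rb_node; };              // stand-in for the kernel type
struct list_head { list_head *next, *prev; };     // stand-in for the kernel type

struct binder_proc_sketch {
    int pid;                      // pid of the process that opened /dev/binder
    rb_root threads;              // binder_thread entries, one per binder thread
    rb_root nodes;                // Binder entities (binder_node) owned by this process
    rb_root refs_by_desc;         // Binder references (binder_ref), keyed by handle
    rb_root refs_by_node;         // the same references, keyed by the target node
    void *buffer;                 // kernel-side start of the mmap'ed transaction buffer
    size_t buffer_size;           // size of that buffer
    list_head todo;               // work items queued for this process
};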

Back in binder_open, the next statement after opening the device file is:

    bs->mapped = mmap(NULL, mapsize, PROT_READ, MAP_PRIVATE, bs->fd, 0);
This call is the essence of the Binder mechanism. It reserves a region in the process's address space and at the same time maps a region inside the Linux kernel onto the same pages. The benefit is that data from the Client process only needs to be copied once, into the kernel buffer; the Server then shares that buffer with the kernel. A whole transaction therefore costs a single memory copy, which is what makes Binder efficient, and this is the very core of the IPC mechanism.

The mapping we request should not exceed 4 MB; the kernel function binder_mmap checks this and clamps larger requests, as the sketch below illustrates.
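A minimal sketch of the effect of that check (not the literal driver source; in the kernel, binder_mmap shrinks vma->vm_end in place rather than calling a helper like this):

#include <cstddef>

constexpr std::size_t SZ_4M = 4 * 1024 * 1024;

// Requests larger than 4 MB are silently clamped to 4 MB instead of rejected.
std::size_t binder_mmap_clamped_size(std::size_t vm_start, std::size_t vm_end) {
    if (vm_end - vm_start > SZ_4M)
        vm_end = vm_start + SZ_4M;   // same effect as the check in binder_mmap
    return vm_end - vm_start;        // the size the driver actually uses
}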

Next we step into:

int binder_become_context_manager(struct binder_state *bs)  
{  
    return ioctl(bs->fd, BINDER_SET_CONTEXT_MGR, 0);  
} 
The purpose of this call is to tell the driver that the Service Manager is to be the context manager of the Binder mechanism, i.e., the daemon that administers it and that every other process reaches through handle 0.

A word about the ioctl() function:

ioctl is the function through which a device driver manages a device's I/O channel.

On Android, the call lands in the Binder driver's binder_ioctl function in the Linux kernel; here is a brief look at what it involves.

Right now the ultimate purpose of the call is to register the Service Manager as that manager daemon; along the way the function involves some important structures, in particular:

struct binder_thread

It organizes the threads the driver associates with this process. Among its members, transaction_stack records the transaction the thread is currently handling, and todo is the list of work items posted to the thread; both will be used later to orchestrate the Server/Client communication. A simplified sketch follows.
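A hedged, abridged sketch of struct binder_thread (field names follow the driver source; statistics, wait queues, and error fields are omitted, and the kernel types are stubbed so the sketch stands alone):

struct rb_node_stub   { void *left, *right; };          // stand-in for the kernel rb_node
struct list_head_stub { list_head_stub *next, *prev; }; // stand-in for list_head

struct binder_thread_sketch {
    struct binder_proc *proc;                      // back pointer to the owning binder_proc
    rb_node_stub rb_node;                          // links this thread into proc->threads
    int pid;                                       // kernel tid of the thread
    int looper;                                    // state flags (entered/registered looper, ...)
    struct binder_transaction *transaction_stack;  // transaction currently being handled
    list_head_stub todo;                           // work items posted to this thread
};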

Once the Service Manager has become the manager, it enters a loop; after Servers have registered, the loop keeps waiting for Client requests. Back in main we reach the last step. A small aside first: the part shown as a comment in main above is the security-check mechanism that Android 5.0 adds compared with Android 2.0, which we will not describe further. The last step calls binder_loop(); here is its implementation:

void binder_loop(struct binder_state *bs, binder_handler func)
{
    int res;
    struct binder_write_read bwr;
    uint32_t readbuf[32];

    bwr.write_size = 0;
    bwr.write_consumed = 0;
    bwr.write_buffer = 0;

    readbuf[0] = BC_ENTER_LOOPER;
    binder_write(bs, readbuf, sizeof(uint32_t));

    for (;;) {
        bwr.read_size = sizeof(readbuf);
        bwr.read_consumed = 0;
        bwr.read_buffer = (uintptr_t) readbuf;

        res = ioctl(bs->fd, BINDER_WRITE_READ, &bwr);

        if (res < 0) {
            ALOGE("binder_loop: ioctl failed (%s)\n", strerror(errno));
            break;
        }

        res = binder_parse(bs, 0, (uintptr_t) readbuf, bwr.read_consumed, func);
        if (res == 0) {
            ALOGE("binder_loop: unexpected reply?!\n");
            break;
        }
        if (res < 0) {
            ALOGE("binder_loop: io error %d %s\n", res, strerror(errno));
            break;
        }
    }
}
Let's look at the binder_write_read structure:

struct binder_write_read {  
    signed long write_size; /* bytes to write */  
    signed long write_consumed; /* bytes consumed by driver */  
    unsigned long   write_buffer;  
    signed long read_size;  /* bytes to read */  
    signed long read_consumed;  /* bytes consumed by driver */  
    unsigned long   read_buffer;  
};  
At the top of binder_loop we define a binder_write_read variable named bwr; this structure carries the data for every read/write exchange with the driver.

Then comes for (;;), an infinite loop: the Service Manager simply loops forever, waiting for requests.

How does a Service become a Server, and how does it register itself with the Service Manager?

The previous section looked at how the Service Manager becomes the manager of the communication. This section examines in detail how a Service, the serving side of the communication, registers itself with the Service Manager and becomes a Server. We will use MediaPlayerService as the example.

We know that the inter-process communication is done in native code, so MediaPlayerService has a corresponding native parent class, BnMediaPlayerService, which handles the IPC side of things.

The main function that starts MediaPlayerService contains these key lines:

        sp<ProcessState> proc(ProcessState::self());
        sp<IServiceManager> sm = defaultServiceManager();
        MediaPlayerService::instantiate();
        ProcessState::self()->startThreadPool();
        IPCThreadState::self()->joinThreadPool();
The first line introduces ProcessState, which deserves an introduction here. In fact, BnMediaPlayerService uses IPCThreadState to receive the requests the Client sends over, and IPCThreadState in turn relies on ProcessState to interact with the Binder driver (interacting with the driver essentially means reading and writing the mapped memory; inter-process communication here is exactly that process of reading and writing memory).

Let's step into ProcessState::self():

sp<ProcessState> ProcessState::self()
{
    Mutex::Autolock _l(gProcessMutex);
    if (gProcess != NULL) {
        return gProcess;
    }
    gProcess = new ProcessState;
    return gProcess;
}
The call returns gProcess, the process-wide singleton. As we said earlier, this is the object through which we will later drive the Binder driver, i.e., read and write the mapped memory. Now look at ProcessState's constructor:

ProcessState::ProcessState()
    : mDriverFD(open_driver())
    , mVMStart(MAP_FAILED)
    , mManagesContexts(false)
    , mBinderContextCheckFunc(NULL)
    , mBinderContextUserData(NULL)
    , mThreadPoolStarted(false)
    , mThreadPoolSeq(1)
{
    if (mDriverFD >= 0) {
#if !defined(HAVE_WIN32_IPC)
        mVMStart = mmap(0, BINDER_VM_SIZE, PROT_READ, MAP_PRIVATE | MAP_NORESERVE, mDriverFD, 0);
        if (mVMStart == MAP_FAILED) {
            // *sigh*
            ALOGE("Using /dev/binder failed: unable to mmap transaction memory.\n");
            close(mDriverFD);
            mDriverFD = -1;
        }
#else
        mDriverFD = -1;
#endif
    }

    LOG_ALWAYS_FATAL_IF(mDriverFD < 0, "Binder driver could not be opened.  Terminating.");
}
The constructor makes everything clear. With the previous section as background we immediately spot open_driver() and mmap(): one opens the driver file, the other establishes the mapping between process memory and kernel memory.

Here is the implementation of open_driver() used here:

static int open_driver()
{
    int fd = open("/dev/binder", O_RDWR);
    if (fd >= 0) {
        fcntl(fd, F_SETFD, FD_CLOEXEC);
        int vers = 0;
        status_t result = ioctl(fd, BINDER_VERSION, &vers);
        if (result == -1) {
            ALOGE("Binder ioctl to obtain version failed: %s", strerror(errno));
            close(fd);
            fd = -1;
        }
        if (result != 0 || vers != BINDER_CURRENT_PROTOCOL_VERSION) {
            ALOGE("Binder driver protocol does not match user space protocol!");
            close(fd);
            fd = -1;
        }
        size_t maxThreads = 15;
        result = ioctl(fd, BINDER_SET_MAX_THREADS, &maxThreads);
        if (result == -1) {
            ALOGE("Binder ioctl to set max threads failed: %s", strerror(errno));
        }
    } else {
        ALOGW("Opening '/dev/binder' failed: %s\n", strerror(errno));
    }
    return fd;
}
After the open there are two ioctl() calls. As explained before, ioctl is how we exchange data with the driver: the first call reads back the driver's protocol version and compares it with the user-space one, and the second writes the maximum number of Binder threads, 15. Since MediaPlayerService acts as a Server here, note that this value is the maximum number of additional Binder threads the driver may ask the process to spawn for handling requests, not a hard limit of 15 concurrent Clients.

Returning to the ProcessState constructor, the next call is mmap; its second argument is

#define BINDER_VM_SIZE ((1*1024*1024) - (4096 *2))
With this value, the process reserves (1*1024*1024) - (4096*2) bytes of its address space for Binder transactions. At this point the process has the basic ability to exchange data through that memory.

Back in the main function that starts MediaPlayerService, the next step is:

        sp<IServiceManager> sm = defaultServiceManager();
What this returns is essentially a BpServiceManager containing a Binder reference whose handle is 0; for our purposes, just think of it as getting hold of a Service Manager proxy. A hedged sketch of defaultServiceManager is shown below.
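For reference, the implementation of defaultServiceManager looks roughly like this (abridged from memory of frameworks/native/libs/binder/IServiceManager.cpp, so treat it as a sketch rather than verbatim source). getContextObject(NULL) boils down to getStrongProxyForHandle(0), i.e. a BpBinder wrapping handle 0, which interface_cast<IServiceManager> then wraps into a BpServiceManager:

sp<IServiceManager> defaultServiceManager()
{
    if (gDefaultServiceManager != NULL) return gDefaultServiceManager;

    {
        AutoMutex _l(gDefaultServiceManagerLock);
        while (gDefaultServiceManager == NULL) {
            gDefaultServiceManager = interface_cast<IServiceManager>(
                ProcessState::self()->getContextObject(NULL));
            // The context manager may not be registered yet; retry until it is.
            if (gDefaultServiceManager == NULL)
                sleep(1);
        }
    }

    return gDefaultServiceManager;
}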

With that proxy in hand, the MediaPlayerService main function executes:

MediaPlayerService::instantiate();
void MediaPlayerService::instantiate() {
    defaultServiceManager()->addService(
            String16("media.player"), new MediaPlayerService());
}
This is the registration we have been talking about: MediaPlayerService is registered with the Service Manager through the addService method:

    virtual status_t addService(const String16& name, const sp<IBinder>& service,
            bool allowIsolated)
    {
        Parcel data, reply;
        data.writeInterfaceToken(IServiceManager::getInterfaceDescriptor());
        data.writeString16(name);
        data.writeStrongBinder(service);
        data.writeInt32(allowIsolated ? 1 : 0);
        status_t err = remote()->transact(ADD_SERVICE_TRANSACTION, data, &reply);
        return err == NO_ERROR ? reply.readExceptionCode() : err;
    }
Throughout this registration we repeatedly see Parcel being used.

First we write an RPC header into the Parcel; it consists of a number (the strict-mode policy) and a string (the interface descriptor):

data.writeInterfaceToken(IServiceManager::getInterfaceDescriptor());

Then another string, the service name, is written into the Parcel:

data.writeString16(name);

Now comes the important step:

data.writeStrongBinder(service);
status_t Parcel::writeStrongBinder(const sp<IBinder>& val)
{
    return flatten_binder(ProcessState::self(), val, this);
}
status_t flatten_binder(const sp<ProcessState>& /*proc*/,
    const sp<IBinder>& binder, Parcel* out)
{
    flat_binder_object obj;
    obj.flags = 0x7f | FLAT_BINDER_FLAG_ACCEPTS_FDS;
    if (binder != NULL) {
        IBinder *local = binder->localBinder();
        if (!local) {
            BpBinder *proxy = binder->remoteBinder();
            if (proxy == NULL) {
                ALOGE("null proxy");
            }
            const int32_t handle = proxy ? proxy->handle() : 0;
            obj.type = BINDER_TYPE_HANDLE;
            obj.binder = 0; /* Don't pass uninitialized stack data to a remote process */
            obj.handle = handle;
            obj.cookie = 0;
        } else {
            obj.type = BINDER_TYPE_BINDER;
            obj.binder = reinterpret_cast<uintptr_t>(local->getWeakRefs());
            obj.cookie = reinterpret_cast<uintptr_t>(local);
        }
    } else {
        obj.type = BINDER_TYPE_BINDER;
        obj.binder = 0;
        obj.cookie = 0;
    }

    return finish_flatten_binder(binder, obj, out);
}
First, the flat_binder_object structure; just think of it as a Binder object in flattened (serialized) form:

struct flat_binder_object {  
    /* 8 bytes for large_flat_header. */  
    unsigned long       type;  
    unsigned long       flags;  
  
    /* 8 bytes of data. */  
    union {  
        void        *binder;    /* local object */  
        signed long handle;     /* remote object */  
    };  
  
    /* extra data associated with local object */  
    void            *cookie;  
}; 
In flatten_binder, the unsigned long flags member is initialized first: 0x7f is the lowest scheduling priority at which a thread handling a request aimed at this Binder entity may run, and FLAT_BINDER_FLAG_ACCEPTS_FDS means the entity accepts file descriptors; when it receives one, the driver opens the file in the receiving process. The flag layout is sketched below.
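For reference, the flag layout in the kernel's binder.h looks roughly like this (values quoted from memory, so verify them against your tree):

enum {
    FLAT_BINDER_FLAG_PRIORITY_MASK = 0xff,   /* low byte: minimum scheduling priority */
    FLAT_BINDER_FLAG_ACCEPTS_FDS   = 0x100,  /* this entity accepts file descriptors  */
};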

Tracing the const sp<IBinder>& binder argument back up the call chain, it is the MediaPlayerService instance and therefore never NULL, so the function is guaranteed to take the local-binder branch:

            obj.type = BINDER_TYPE_BINDER;
            obj.binder = reinterpret_cast<uintptr_t>(local->getWeakRefs());
            obj.cookie = reinterpret_cast<uintptr_t>(local);
The pointer local, i.e., the address of the Binder entity, is stored in the cookie member of the flat_binder_object. Once the three members are filled in, finish_flatten_binder is called:

finish_flatten_binder(binder, obj, out)
inline static status_t finish_flatten_binder(  
    const sp<IBinder>& binder, const flat_binder_object& flat, Parcel* out)
{  
    return out->writeObject(flat, false);  
} 
Finally the flat_binder_object we prepared, i.e., the Binder object, is written into the Parcel.

Now we can return to addService for its last step:

        status_t err = remote()->transact(ADD_SERVICE_TRANSACTION, data, &reply);
This call ultimately ends up in:

status_t IPCThreadState::transact(int32_t handle,
                                  uint32_t code, const Parcel& data,
                                  Parcel* reply, uint32_t flags)
{
    status_t err = data.errorCheck();

    flags |= TF_ACCEPT_FDS;

    IF_LOG_TRANSACTIONS() {
        TextOutput::Bundle _b(alog);
        alog << "BC_TRANSACTION thr " << (void*)pthread_self() << " / hand "
            << handle << " / code " << TypeCode(code) << ": "
            << indent << data << dedent << endl;
    }
    
    if (err == NO_ERROR) {
        LOG_ONEWAY(">>>> SEND from pid %d uid %d %s", getpid(), getuid(),
            (flags & TF_ONE_WAY) == 0 ? "READ REPLY" : "ONE WAY");
        err = writeTransactionData(BC_TRANSACTION, flags, handle, code, data, NULL);
    }
    
    if (err != NO_ERROR) {
        if (reply) reply->setError(err);
        return (mLastError = err);
    }
    
    if ((flags & TF_ONE_WAY) == 0) {
        #if 0
        if (code == 4) { // relayout
            ALOGI(">>>>>> CALLING transaction 4");
        } else {
            ALOGI(">>>>>> CALLING transaction %d", code);
        }
        #endif
        if (reply) {
            err = waitForResponse(reply);
        } else {
            Parcel fakeReply;
            err = waitForResponse(&fakeReply);
        }
        #if 0
        if (code == 4) { // relayout
            ALOGI("<<<<<< RETURNING transaction 4");
        } else {
            ALOGI("<<<<<< RETURNING transaction %d", code);
        }
        #endif
        
        IF_LOG_TRANSACTIONS() {
            TextOutput::Bundle _b(alog);
            alog << "BR_REPLY thr " << (void*)pthread_self() << " / hand "
                << handle << ": ";
            if (reply) alog << indent << *reply << dedent << endl;
            else alog << "(none requested)" << endl;
        }
    } else {
        err = waitForResponse(NULL, NULL);
    }
    
    return err;
}
Let's analyze this function. Earlier, in addService, we filled a Parcel object named data with information; recall:

        Parcel data, reply;
        data.writeInterfaceToken(IServiceManager::getInterfaceDescriptor());
        data.writeString16(name);
        data.writeStrongBinder(service);
        data.writeInt32(allowIsolated ? 1 : 0);
Into it we put a good deal of information about the MediaPlayerService instance. Now, inside transact, we call writeTransactionData:

status_t IPCThreadState::writeTransactionData(int32_t cmd, uint32_t binderFlags,
    int32_t handle, uint32_t code, const Parcel& data, status_t* statusBuffer)
{
    binder_transaction_data tr;

    tr.target.ptr = 0; /* Don't pass uninitialized stack data to a remote process */
    tr.target.handle = handle;
    tr.code = code;
    tr.flags = binderFlags;
    tr.cookie = 0;
    tr.sender_pid = 0;
    tr.sender_euid = 0;
    
    const status_t err = data.errorCheck();
    if (err == NO_ERROR) {
        tr.data_size = data.ipcDataSize();
        tr.data.ptr.buffer = data.ipcData();
        tr.offsets_size = data.ipcObjectsCount()*sizeof(binder_size_t);
        tr.data.ptr.offsets = data.ipcObjects();
    } else if (statusBuffer) {
        tr.flags |= TF_STATUS_CODE;
        *statusBuffer = err;
        tr.data_size = sizeof(status_t);
        tr.data.ptr.buffer = reinterpret_cast<uintptr_t>(statusBuffer);
        tr.offsets_size = 0;
        tr.data.ptr.offsets = 0;
    } else {
        return (mLastError = err);
    }
    
    mOut.writeInt32(cmd);
    mOut.write(&tr, sizeof(tr));
    
    return NO_ERROR;
}

The job of writeTransactionData is obvious: it fills the description of data into the binder_transaction_data tr (the exact layout need not concern us here), and data is precisely the Parcel describing the MediaPlayerService instance.

At the end of writeTransactionData, the binder_transaction_data tr we just filled in is written into mOut, preceded by the BC_TRANSACTION command code. A hedged sketch of binder_transaction_data follows.
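For orientation, here is a hedged sketch of binder_transaction_data; the field names follow the kernel's binder.h, but the binder_uintptr_t / binder_size_t types are shown as plain integer types, and the exact layout should be checked against your tree:

#include <cstdint>
#include <sys/types.h>

struct binder_transaction_data_sketch {
    union {
        uint32_t  handle;       /* target handle when sending (0 = Service Manager)        */
        uintptr_t ptr;          /* target entity when the driver delivers the transaction  */
    } target;
    uintptr_t cookie;           /* opaque cookie of the target entity                      */
    uint32_t  code;             /* e.g. ADD_SERVICE_TRANSACTION                            */
    uint32_t  flags;            /* e.g. TF_ACCEPT_FDS, TF_ONE_WAY                          */
    pid_t     sender_pid;       /* filled in by the driver                                 */
    uid_t     sender_euid;      /* filled in by the driver                                 */
    size_t    data_size;        /* size of the serialized Parcel data                      */
    size_t    offsets_size;     /* size of the offsets array (one entry per Binder object) */
    union {
        struct {
            uintptr_t buffer;   /* points at the Parcel data (data.ipcData())              */
            uintptr_t offsets;  /* points at the object offsets (data.ipcObjects())        */
        } ptr;
        uint8_t buf[8];
    } data;
};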

Next, transact enters waitForResponse:

status_t IPCThreadState::waitForResponse(Parcel *reply, status_t *acquireResult)
{
    int32_t cmd;
    int32_t err;

    while (1) {
        if ((err=talkWithDriver()) < NO_ERROR) break;
        err = mIn.errorCheck();
        if (err < NO_ERROR) break;
        if (mIn.dataAvail() == 0) continue;
        
        cmd = mIn.readInt32();
        
        IF_LOG_COMMANDS() {
            alog << "Processing waitForResponse Command: "
                << getReturnString(cmd) << endl;
        }

        switch (cmd) {
        case BR_TRANSACTION_COMPLETE:
            if (!reply && !acquireResult) goto finish;
            break;
        
        case BR_DEAD_REPLY:
            err = DEAD_OBJECT;
            goto finish;

        case BR_FAILED_REPLY:
            err = FAILED_TRANSACTION;
            goto finish;
        
        case BR_ACQUIRE_RESULT:
            {
                ALOG_ASSERT(acquireResult != NULL, "Unexpected brACQUIRE_RESULT");
                const int32_t result = mIn.readInt32();
                if (!acquireResult) continue;
                *acquireResult = result ? NO_ERROR : INVALID_OPERATION;
            }
            goto finish;
        
        case BR_REPLY:
            {
                binder_transaction_data tr;
                err = mIn.read(&tr, sizeof(tr));
                ALOG_ASSERT(err == NO_ERROR, "Not enough command data for brREPLY");
                if (err != NO_ERROR) goto finish;

                if (reply) {
                    if ((tr.flags & TF_STATUS_CODE) == 0) {
                        reply->ipcSetDataReference(
                            reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
                            tr.data_size,
                            reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
                            tr.offsets_size/sizeof(binder_size_t),
                            freeBuffer, this);
                    } else {
                        err = *reinterpret_cast<const status_t*>(tr.data.ptr.buffer);
                        freeBuffer(NULL,
                            reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
                            tr.data_size,
                            reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
                            tr.offsets_size/sizeof(binder_size_t), this);
                    }
                } else {
                    freeBuffer(NULL,
                        reinterpret_cast<const uint8_t*>(tr.data.ptr.buffer),
                        tr.data_size,
                        reinterpret_cast<const binder_size_t*>(tr.data.ptr.offsets),
                        tr.offsets_size/sizeof(binder_size_t), this);
                    continue;
                }
            }
            goto finish;

        default:
            err = executeCommand(cmd);
            if (err != NO_ERROR) goto finish;
            break;
        }
    }

finish:
    if (err != NO_ERROR) {
        if (acquireResult) *acquireResult = err;
        if (reply) reply->setError(err);
        mLastError = err;
    }
    
    return err;
}

At the top of the loop, talkWithDriver() is called:

status_t IPCThreadState::talkWithDriver(bool doReceive)
{
    if (mProcess->mDriverFD <= 0) {
        return -EBADF;
    }
    
    binder_write_read bwr;
    
    // Is the read buffer empty?
    const bool needRead = mIn.dataPosition() >= mIn.dataSize();
    
    // We don't want to write anything if we are still reading
    // from data left in the input buffer and the caller
    // has requested to read the next data.
    const size_t outAvail = (!doReceive || needRead) ? mOut.dataSize() : 0;
    
    bwr.write_size = outAvail;
    bwr.write_buffer = (uintptr_t)mOut.data();

    // This is what we'll read.
    if (doReceive && needRead) {
        bwr.read_size = mIn.dataCapacity();
        bwr.read_buffer = (uintptr_t)mIn.data();
    } else {
        bwr.read_size = 0;
        bwr.read_buffer = 0;
    }

    IF_LOG_COMMANDS() {
        TextOutput::Bundle _b(alog);
        if (outAvail != 0) {
            alog << "Sending commands to driver: " << indent;
            const void* cmds = (const void*)bwr.write_buffer;
            const void* end = ((const uint8_t*)cmds)+bwr.write_size;
            alog << HexDump(cmds, bwr.write_size) << endl;
            while (cmds < end) cmds = printCommand(alog, cmds);
            alog << dedent;
        }
        alog << "Size of receive buffer: " << bwr.read_size
            << ", needRead: " << needRead << ", doReceive: " << doReceive << endl;
    }
    
    // Return immediately if there is nothing to do.
    if ((bwr.write_size == 0) && (bwr.read_size == 0)) return NO_ERROR;

    bwr.write_consumed = 0;
    bwr.read_consumed = 0;
    status_t err;
    do {
        IF_LOG_COMMANDS() {
            alog << "About to read/write, write size = " << mOut.dataSize() << endl;
        }
#if defined(HAVE_ANDROID_OS)
        if (ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr) >= 0)
            err = NO_ERROR;
        else
            err = -errno;
#else
        err = INVALID_OPERATION;
#endif
        if (mProcess->mDriverFD <= 0) {
            err = -EBADF;
        }
        IF_LOG_COMMANDS() {
            alog << "Finished read/write, write size = " << mOut.dataSize() << endl;
        }
    } while (err == -EINTR);

    IF_LOG_COMMANDS() {
        alog << "Our err: " << (void*)(intptr_t)err << ", write consumed: "
            << bwr.write_consumed << " (of " << mOut.dataSize()
                        << "), read consumed: " << bwr.read_consumed << endl;
    }

    if (err >= NO_ERROR) {
        if (bwr.write_consumed > 0) {
            if (bwr.write_consumed < mOut.dataSize())
                mOut.remove(0, bwr.write_consumed);
            else
                mOut.setDataSize(0);
        }
        if (bwr.read_consumed > 0) {
            mIn.setDataSize(bwr.read_consumed);
            mIn.setDataPosition(0);
        }
        IF_LOG_COMMANDS() {
            TextOutput::Bundle _b(alog);
            alog << "Remaining data size: " << mOut.dataSize() << endl;
            alog << "Received commands from driver: " << indent;
            const void* cmds = mIn.data();
            const void* end = mIn.data() + mIn.dataSize();
            alog << HexDump(cmds, mIn.dataSize()) << endl;
            while (cmds < end) cmds = printReturnCommand(alog, cmds);
            alog << dedent;
        }
        return NO_ERROR;
    }
    
    return err;
}
talkWithDriver() is where we actually interact with the Binder driver, i.e., where the memory reads and writes that implement data sharing take place. Its purpose here is to deliver the data we have spent so long preparing into the process that hosts the Service Manager; that data describes the MediaPlayerService instance, and the whole point of the exercise is to register MediaPlayerService with the Service Manager.

But the Service Manager and MediaPlayerService do not live in the same process, so this registration itself requires inter-process communication.

Now, inside talkWithDriver():

We first hand over the mOut buffer we prepared earlier:

    bwr.write_size = outAvail;
    bwr.write_buffer = (uintptr_t)mOut.data();
Immediately after that:

if (doReceive && needRead) {
        bwr.read_size = mIn.dataCapacity();
        bwr.read_buffer = (uintptr_t)mIn.data();
    } else {
        bwr.read_size = 0;
        bwr.read_buffer = 0;
    }
The if condition holds here (doReceive defaults to true and the read buffer is empty at this point), so we execute:

        bwr.read_size = mIn.dataCapacity();
        bwr.read_buffer = (uintptr_t)mIn.data();
Next comes:

ioctl(mProcess->mDriverFD, BINDER_WRITE_READ, &bwr)
We already know this function well: it exchanges data with the driver, using the BINDER_WRITE_READ command with bwr as the payload.

Stepping into the driver's binder_ioctl, let's focus on the branch that handles BINDER_WRITE_READ:

    case BINDER_WRITE_READ: {  
        struct binder_write_read bwr;  
        if (size != sizeof(struct binder_write_read)) {  
            ret = -EINVAL;  
            goto err;  
        }  
        if (copy_from_user(&bwr, ubuf, sizeof(bwr))) {  
            ret = -EFAULT;  
            goto err;  
        }  
        if (binder_debug_mask & BINDER_DEBUG_READ_WRITE)  
            printk(KERN_INFO "binder: %d:%d write %ld at %08lx, read %ld at %08lx\n",  
            proc->pid, thread->pid, bwr.write_size, bwr.write_buffer, bwr.read_size, bwr.read_buffer);  
        if (bwr.write_size > 0) {  
            ret = binder_thread_write(proc, thread, (void __user *)bwr.write_buffer, bwr.write_size, &bwr.write_consumed);  
            if (ret < 0) {  
                bwr.read_consumed = 0;  
                if (copy_to_user(ubuf, &bwr, sizeof(bwr)))  
                    ret = -EFAULT;  
                goto err;  
            }  
        }  
        if (bwr.read_size > 0) {  
            ret = binder_thread_read(proc, thread, (void __user *)bwr.read_buffer, bwr.read_size, &bwr.read_consumed, filp->f_flags & O_NONBLOCK);  
            if (!list_empty(&proc->todo))  
                wake_up_interruptible(&proc->wait);  
            if (ret < 0) {  
                if (copy_to_user(ubuf, &bwr, sizeof(bwr)))  
                    ret = -EFAULT;  
                goto err;  
            }  
        }  
        if (binder_debug_mask & BINDER_DEBUG_READ_WRITE)  
            printk(KERN_INFO "binder: %d:%d wrote %ld of %ld, read return %ld of %ld\n",  
            proc->pid, thread->pid, bwr.write_consumed, bwr.write_size, bwr.read_consumed, bwr.read_size);  
        if (copy_to_user(ubuf, &bwr, sizeof(bwr))) {  
            ret = -EFAULT;  
            goto err;  
        }  
        break;  
    }  
Since we placed the mOut buffer into bwr earlier, bwr.write_size > 0 here,

so the function enters

binder_thread_write(proc, thread, (void __user *)bwr.write_buffer, bwr.write_size, &bwr.write_consumed);
A note on the thread parameter: it is the binder_thread the driver maintains for the calling thread, i.e., the MediaPlayerService thread that issued this ioctl (looked up, or created, in the caller's binder_proc); it is not the Service Manager's thread.

The key part of binder_thread_write, for the BC_TRANSACTION command, is:

            struct binder_transaction_data tr;  
  
            if (copy_from_user(&tr, ptr, sizeof(tr)))  
                return -EFAULT;  
            ptr += sizeof(tr);  
            binder_transaction(proc, thread, &tr, cmd == BC_REPLY);  
            break;  
As we can see, the user-space data is first copied into binder_transaction_data tr, and then binder_transaction(proc, thread, &tr, cmd == BC_REPLY) is called.

This function does the following (see the pseudocode sketch below):

1. It allocates a pending transaction t and a pending work item tcomplete and initializes them; the work to be carried out later is to process the data, which describes the MediaPlayerService instance we have been passing along.

2. It allocates a buffer out of the Service Manager's mapped Binder memory and copies the data into it.

3. Finally it queues the work and wakes the Service Manager up to process the data.

So far we have been on the writing side, in binder_thread_write. Once the Service Manager wakes up, the writing is done and it is time to read the data into the Service Manager, so binder_thread_read runs in its context. Briefly, that function's final purpose is to hand the transaction data, now sitting in a buffer in the Service Manager's process space, to the Service Manager's read loop.
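The following is hedged pseudocode of the BC_TRANSACTION path inside binder_transaction, greatly simplified (names follow the driver source, error handling omitted); it is a summary, not the real function body:

/* proc/thread describe the sender (MediaPlayerService's calling thread);
 * because target.handle is 0, the target is the context manager, i.e. the
 * Service Manager's binder_proc. */
static void binder_transaction_sketch(void)
{
    /* 1. Resolve the target: handle 0 -> binder_context_mgr_node -> target_proc.  */
    /* 2. Allocate a binder_transaction t and a binder_work tcomplete.             */
    /* 3. binder_alloc_buf(target_proc, ...): carve a buffer out of the memory     */
    /*    the Service Manager mmap'ed, and copy tr->data.ptr.buffer and            */
    /*    tr->data.ptr.offsets into it -- this is the single copy.                 */
    /* 4. Walk the offsets and translate each flat_binder_object: a local          */
    /*    BINDER_TYPE_BINDER entity becomes a binder_node plus a binder_ref in     */
    /*    the target, so the Service Manager receives a handle, not a pointer.     */
    /* 5. Queue t on the target's todo list, queue tcomplete on the sender's       */
    /*    todo list, and wake_up_interruptible() the target's wait queue.          */
}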

Once this step completes, control returns to binder_loop in the Service Manager:

void binder_loop(struct binder_state *bs, binder_handler func)
{
    int res;
    struct binder_write_read bwr;
    uint32_t readbuf[32];

    bwr.write_size = 0;
    bwr.write_consumed = 0;
    bwr.write_buffer = 0;

    readbuf[0] = BC_ENTER_LOOPER;
    binder_write(bs, readbuf, sizeof(uint32_t));

    for (;;) {
        bwr.read_size = sizeof(readbuf);
        bwr.read_consumed = 0;
        bwr.read_buffer = (uintptr_t) readbuf;

        res = ioctl(bs->fd, BINDER_WRITE_READ, &bwr);

        if (res < 0) {
            ALOGE("binder_loop: ioctl failed (%s)\n", strerror(errno));
            break;
        }

        res = binder_parse(bs, 0, (uintptr_t) readbuf, bwr.read_consumed, func);
        if (res == 0) {
            ALOGE("binder_loop: unexpected reply?!\n");
            break;
        }
        if (res < 0) {
            ALOGE("binder_loop: io error %d %s\n", res, strerror(errno));
            break;
        }
    }
}
In the loop we call binder_parse to parse what came back in readbuf:

int binder_parse(struct binder_state *bs, struct binder_io *bio,
                 uintptr_t ptr, size_t size, binder_handler func)
{
    int r = 1;
    uintptr_t end = ptr + (uintptr_t) size;

    while (ptr < end) {
        uint32_t cmd = *(uint32_t *) ptr;
        ptr += sizeof(uint32_t);
#if TRACE
        fprintf(stderr,"%s:\n", cmd_name(cmd));
#endif
        switch(cmd) {
        case BR_NOOP:
            break;
        case BR_TRANSACTION_COMPLETE:
            break;
        case BR_INCREFS:
        case BR_ACQUIRE:
        case BR_RELEASE:
        case BR_DECREFS:
#if TRACE
            fprintf(stderr,"  %p, %p\n", (void *)ptr, (void *)(ptr + sizeof(void *)));
#endif
            ptr += sizeof(struct binder_ptr_cookie);
            break;
        case BR_TRANSACTION: {
            struct binder_transaction_data *txn = (struct binder_transaction_data *) ptr;
            if ((end - ptr) < sizeof(*txn)) {
                ALOGE("parse: txn too small!\n");
                return -1;
            }
            binder_dump_txn(txn);
            if (func) {
                unsigned rdata[256/4];
                struct binder_io msg;
                struct binder_io reply;
                int res;

                bio_init(&reply, rdata, sizeof(rdata), 4);
                bio_init_from_txn(&msg, txn);
                res = func(bs, txn, &msg, &reply);
                binder_send_reply(bs, &reply, txn->data.ptr.buffer, res);
            }
            ptr += sizeof(*txn);
            break;
        }
        //........
        }
    }

    return r;
}
binder_parse also initializes the reply structure; once parsing is done, the data is ready for the Service Manager to use. Returning all the way to the Service Manager's startup main function, the function that actually consumes the parsed data is the handler passed to binder_loop, svcmgr_handler:

int svcmgr_handler(struct binder_state *bs,
                   struct binder_transaction_data *txn,
                   struct binder_io *msg,
                   struct binder_io *reply)
{
    struct svcinfo *si;
    uint16_t *s;
    size_t len;
    uint32_t handle;
    uint32_t strict_policy;
    int allow_isolated;

    //ALOGI("target=%x code=%d pid=%d uid=%d\n",
    //  txn->target.handle, txn->code, txn->sender_pid, txn->sender_euid);

    if (txn->target.handle != svcmgr_handle)
        return -1;

    if (txn->code == PING_TRANSACTION)
        return 0;

    // Equivalent to Parcel::enforceInterface(), reading the RPC
    // header with the strict mode policy mask and the interface name.
    // Note that we ignore the strict_policy and don't propagate it
    // further (since we do no outbound RPCs anyway).
    strict_policy = bio_get_uint32(msg);
    s = bio_get_string16(msg, &len);
    if (s == NULL) {
        return -1;
    }

    if ((len != (sizeof(svcmgr_id) / 2)) ||
        memcmp(svcmgr_id, s, sizeof(svcmgr_id))) {
        fprintf(stderr,"invalid id %s\n", str8(s, len));
        return -1;
    }

    if (sehandle && selinux_status_updated() > 0) {
        struct selabel_handle *tmp_sehandle = selinux_android_service_context_handle();
        if (tmp_sehandle) {
            selabel_close(sehandle);
            sehandle = tmp_sehandle;
        }
    }

    switch(txn->code) {
    case SVC_MGR_GET_SERVICE:
    case SVC_MGR_CHECK_SERVICE:
        s = bio_get_string16(msg, &len);
        if (s == NULL) {
            return -1;
        }
        handle = do_find_service(bs, s, len, txn->sender_euid, txn->sender_pid);
        if (!handle)
            break;
        bio_put_ref(reply, handle);
        return 0;

    case SVC_MGR_ADD_SERVICE:
        s = bio_get_string16(msg, &len);
        if (s == NULL) {
            return -1;
        }
        handle = bio_get_ref(msg);
        allow_isolated = bio_get_uint32(msg) ? 1 : 0;
        if (do_add_service(bs, s, len, handle, txn->sender_euid,
            allow_isolated, txn->sender_pid))
            return -1;
        break;

    case SVC_MGR_LIST_SERVICES: {
        uint32_t n = bio_get_uint32(msg);

        if (!svc_can_list(txn->sender_pid)) {
            ALOGE("list_service() uid=%d - PERMISSION DENIED\n",
                    txn->sender_euid);
            return -1;
        }
        si = svclist;
        while ((n-- > 0) && si)
            si = si->next;
        if (si) {
            bio_put_string16(reply, si->name);
            return 0;
        }
        return -1;
    }
    default:
        ALOGE("unknown code %d\n", txn->code);
        return -1;
    }

    bio_put_uint32(reply, 0);
    return 0;
}
The core of it is this call:

 do_add_service(bs, s, len, handle, txn->sender_euid,allow_isolated, txn->sender_pid)
Its implementation is simple: it writes a reference to the MediaPlayerService Binder entity, mainly its name and handle, into a struct svcinfo and inserts that entry at the head of the svclist linked list. Later, when a Client asks the Service Manager for a service by name, the Service Manager can simply return the corresponding handle. A hedged sketch of svcinfo and do_add_service follows.
From here we return level by level, eventually arriving back in MediaPlayerService::instantiate; at this point IServiceManager::addService has finally finished.
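For reference, a hedged sketch of the bookkeeping in service_manager.c (field order and details are quoted from memory; the real code also runs an SELinux check, svc_can_register, and handles re-registration of an existing name):

#include <cstdint>
#include <cstddef>

struct svcinfo_sketch {
    struct svcinfo_sketch *next;   /* singly linked list, the head pointer is svclist */
    uint32_t handle;               /* Binder reference handle for this service        */
    int allow_isolated;            /* may isolated (sandboxed) processes use it?      */
    size_t len;                    /* name length, in 16-bit code units               */
    uint16_t name[1];              /* service name, e.g. "media.player"               */
};

/* do_add_service, in essence: validate the name, allocate an svcinfo, record
 * the name and the handle received from the driver, then link the entry at the
 * head of svclist so that a later SVC_MGR_CHECK_SERVICE can find it by name
 * and return the handle. */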

Let's review the whole process and tidy it up:

1. We use ProcessState as the interface to the driver. In its constructor we open the driver file and set up the memory mapping; through ioctl we initialize some parameters, including the maximum number of Binder threads, and mmap maps BINDER_VM_SIZE bytes of the Binder buffer into the current process.

2. We write the description of the MediaPlayerService instance into a Parcel and wrap that Parcel into a Binder transaction.

3. A buffer is carved out of the Service Manager's process space (its Binder mapping) and the transaction data is written into it.

4. The buffered data is parsed, and an entry for MediaPlayerService is linked at the head of svclist so that Clients can look it up later.

5. The buffer is released.

That completes addService. It looks complicated, but every single step matters.

Once addService has finished, the Service is registered with the Service Manager. Back in the Service's startup main function, the next calls are:

ProcessState::self()->startThreadPool();  
IPCThreadState::self()->joinThreadPool();  
void ProcessState::startThreadPool()
{
    AutoMutex _l(mLock);
    if (!mThreadPoolStarted) {
        mThreadPoolStarted = true;
        spawnPooledThread(true);
    }
}
This calls spawnPooledThread:

void ProcessState::spawnPooledThread(bool isMain)
{
    if (mThreadPoolStarted) {
        String8 name = makeBinderThreadName();
        ALOGV("Spawning new pooled thread, name=%s\n", name.string());
        sp<Thread> t = new PoolThread(isMain);
        t->run(name.string());
    }
}
This creates a new thread. PoolThread inherits from Thread, and its run() ultimately calls the subclass's threadLoop():

    virtual bool threadLoop()
    {
        IPCThreadState::self()->joinThreadPool(mIsMain);
        return false;
    }
Following into joinThreadPool():
void IPCThreadState::joinThreadPool(bool isMain)
{
    LOG_THREADPOOL("**** THREAD %p (PID %d) IS JOINING THE THREAD POOL\n", (void*)pthread_self(), getpid());

    mOut.writeInt32(isMain ? BC_ENTER_LOOPER : BC_REGISTER_LOOPER);
    
    // This thread may have been spawned by a thread that was in the background
    // scheduling group, so first we will make sure it is in the foreground
    // one to avoid performing an initial transaction in the background.
    set_sched_policy(mMyThreadId, SP_FOREGROUND);
        
    status_t result;
    do {
        processPendingDerefs();
        // now get the next command to be processed, waiting if necessary
        result = getAndExecuteCommand();

        if (result < NO_ERROR && result != TIMED_OUT && result != -ECONNREFUSED && result != -EBADF) {
            ALOGE("getAndExecuteCommand(fd=%d) returned unexpected error %d, aborting",
                  mProcess->mDriverFD, result);
            abort();
        }
        
        // Let this thread exit the thread pool if it is no longer
        // needed and it is not the main process thread.
        if(result == TIMED_OUT && !isMain) {
            break;
        }
    } while (result != -ECONNREFUSED && result != -EBADF);

    LOG_THREADPOOL("**** THREAD %p (PID %d) IS LEAVING THE THREAD POOL err=%p\n",
        (void*)pthread_self(), getpid(), (void*)result);
    
    mOut.writeInt32(BC_EXIT_LOOPER);
    talkWithDriver(false);
}
This function ends up in an infinite loop that interacts with the Binder driver: it calls talkWithDriver to wait for Client requests and then executeCommand to handle each one. With this, the Server's registration and startup are complete.
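Both steps are wrapped by IPCThreadState::getAndExecuteCommand(); the following is an abridged sketch quoted from memory of the AOSP 5.0 source (logging, thread-count bookkeeping, and error handling trimmed), so treat it as an illustration rather than verbatim code:

status_t IPCThreadState::getAndExecuteCommand()
{
    status_t result = talkWithDriver();       // block in the driver until work arrives
    if (result >= NO_ERROR) {
        size_t IN = mIn.dataAvail();
        if (IN < sizeof(int32_t)) return result;
        int32_t cmd = mIn.readInt32();        // e.g. BR_TRANSACTION from a Client
        result = executeCommand(cmd);         // dispatches to BBinder::transact ->
                                              // BnMediaPlayerService::onTransact
    }
    return result;
}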
How does a Client obtain a Service's interface through the Service Manager?

So far we have used MediaPlayerService as the server-side example; now we use the mediaplayer client to illustrate the other side. MediaPlayer inherits from IMediaDeathNotifier, and the first function in IMediaDeathNotifier's implementation is getMediaPlayerService(); clearly this is where obtaining the Service interface begins:

const sp<IMediaPlayerService>& IMediaDeathNotifier::getMediaPlayerService()
{
    ALOGV("getMediaPlayerService");
    Mutex::Autolock _l(sServiceLock);
    if (sMediaPlayerService == 0) {
        sp<IServiceManager> sm = defaultServiceManager();
        sp<IBinder> binder;
        do {
            binder = sm->getService(String16("media.player"));
            if (binder != 0) {
                break;
            }
            ALOGW("Media player service not published, waiting...");
            usleep(500000); // 0.5 s
        } while (true);

        if (sDeathNotifier == NULL) {
            sDeathNotifier = new DeathNotifier();
        }
        binder->linkToDeath(sDeathNotifier);
        sMediaPlayerService = interface_cast<IMediaPlayerService>(binder);
    }
    ALOGE_IF(sMediaPlayerService == 0, "no media player service!?");
    return sMediaPlayerService;
}

The while loop keeps calling sm->getService to fetch the service named "media.player", i.e., MediaPlayerService. Why an endless loop? Because at this point MediaPlayerService may not have started yet; so if the Binder interface that comes back is NULL, we sleep for 0.5 seconds and try again. This is the standard way of obtaining a Service interface.

Let's go straight into getService:

    virtual sp<IBinder> getService(const String16& name) const
    {  
        unsigned n;  
        for (n = 0; n < 5; n++){  
            sp<IBinder> svc = checkService(name);
            if (svc != NULL) return svc;  
            LOGI("Waiting for service %s...\n", String8(name).string());  
            sleep(1);  
        }  
        return NULL;  
    }  
We can see that getService in turn calls checkService():

    virtual sp<IBinder> checkService( const String16& name) const
    {
        Parcel data, reply;
        data.writeInterfaceToken(IServiceManager::getInterfaceDescriptor());
        data.writeString16(name);
        remote()->transact(CHECK_SERVICE_TRANSACTION, data, &reply);
        return reply.readStrongBinder();
    }
Just as in the previous section, two Parcels are prepared first, and the RPC header and the service name "media.player" are written into data. What happens after the call to transact() is essentially the same as in the previous section, so we will not repeat it here.

In the end the Client obtains a Service proxy and can start using the service: reply.readStrongBinder() turns the handle returned by the Service Manager into a BpBinder, and interface_cast<IMediaPlayerService> wraps that into a BpMediaPlayerService proxy.
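A hedged sketch of that last step (abridged from memory of Parcel::unflatten_binder and ProcessState::getStrongProxyForHandle; treat it as an illustration rather than verbatim source):

status_t unflatten_binder(const sp<ProcessState>& proc,
                          const Parcel& in, sp<IBinder>* out)
{
    const flat_binder_object* flat = in.readObject(false);
    if (flat) {
        switch (flat->type) {
        case BINDER_TYPE_BINDER:
            // Same-process case: the cookie holds the real object's address.
            *out = reinterpret_cast<IBinder*>(flat->cookie);
            return finish_unflatten_binder(NULL, *flat, in);
        case BINDER_TYPE_HANDLE:
            // Remote case: wrap the handle in a BpBinder proxy (created, or
            // reused, by getStrongProxyForHandle).
            *out = proc->getStrongProxyForHandle(flat->handle);
            return finish_unflatten_binder(
                static_cast<BpBinder*>(out->get()), *flat, in);
        }
    }
    return BAD_TYPE;
}

interface_cast<IMediaPlayerService>(binder) then wraps that BpBinder in a BpMediaPlayerService, whose member functions marshal their arguments into a Parcel and call remote()->transact(), just as addService did on the way in.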
