[Android study notes] How daemons declared in init.rc are started

In init.rc, a series of services are declared with the service keyword.

The readme describes services as follows (see system/core/init/readme.txt for details):

Services
--------
Services are programs which init launches and (optionally) restarts
when they exit.  Services take the form of:

service <name> <pathname> [ <argument> ]*
   <option>
   <option>
   ...

Options
-------
Options are modifiers to services.  They affect how and when init
runs the service.

critical
   This is a device-critical service. If it exits more than four times in
   four minutes, the device will reboot into recovery mode.

disabled
   This service will not automatically start with its class.
   It must be explicitly started by name.

setenv <name> <value>
   Set the environment variable <name> to <value> in the launched process.

socket <name> <type> <perm> [ <user> [ <group> ] ]
   Create a unix domain socket named /dev/socket/<name> and pass
   its fd to the launched process.  <type> must be "dgram", "stream" or "seqpacket".
   User and group default to 0.

user <username>
   Change to username before exec'ing this service.
   Currently defaults to root.  (??? probably should default to nobody)
   Currently, if your process requires linux capabilities then you cannot use
   this command. You must instead request the capabilities in-process while
   still root, and then drop to your desired uid.

group <groupname> [ <groupname> ]*
   Change to groupname before exec'ing this service.  Additional
   groupnames beyond the (required) first one are used to set the
   supplemental groups of the process (via setgroups()).
   Currently defaults to root.  (??? probably should default to nobody)

oneshot
   Do not restart the service when it exits.

class <name>
   Specify a class name for the service.  All services in a
   named class may be started or stopped together.  A service
   is in the class "default" if one is not specified via the
   class option.

onrestart
   Execute a Command (see below) when service restarts.

As this shows, the key options of a service are its class (class), its start behavior (oneshot, disabled, critical), and its user and group (user, group).

Take surfaceflinger as an example:

service surfaceflinger /system/bin/surfaceflinger
    class main
    user system
    group graphics
    onrestart restart zygote

Its class is main, its user is system, its group is graphics, and it is not declared disabled, so surfaceflinger is started when the main class is started.
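How option lines like these might end up in a per-service record can be sketched in a few lines of C. This is only an illustration: the struct, the flag values and every demo_* name are hypothetical, and the real parser in init handles many more options.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits; init has its own SVC_* flags. */
enum { DEMO_DISABLED = 1 << 0, DEMO_ONESHOT = 1 << 1, DEMO_CRITICAL = 1 << 2 };

struct demo_service {
    char classname[32];   /* "default" unless a class option is given */
    char user[32];        /* defaults to root, as the readme states */
    char group[32];
    unsigned flags;
};

void demo_service_init(struct demo_service *svc)
{
    assert(svc != NULL);
    memset(svc, 0, sizeof(*svc));
    strcpy(svc->classname, "default");
    strcpy(svc->user, "root");
    strcpy(svc->group, "root");
}

/* Apply one option line such as "class main" or "disabled".
 * Returns 0 on success, -1 for an option this sketch does not handle. */
int demo_apply_option(struct demo_service *svc, const char *line)
{
    char key[32] = "", val[32] = "";
    int n = sscanf(line, " %31s %31s", key, val);

    if (n < 1) return -1;
    if (!strcmp(key, "disabled")) { svc->flags |= DEMO_DISABLED; return 0; }
    if (!strcmp(key, "oneshot"))  { svc->flags |= DEMO_ONESHOT;  return 0; }
    if (!strcmp(key, "critical")) { svc->flags |= DEMO_CRITICAL; return 0; }
    if (n < 2) return -1;
    if (!strcmp(key, "class")) { strcpy(svc->classname, val); return 0; }
    if (!strcmp(key, "user"))  { strcpy(svc->user, val);      return 0; }
    if (!strcmp(key, "group")) { strcpy(svc->group, val);     return 0; }
    return -1;
}
```

Feeding it the three option lines of surfaceflinger leaves the flags at 0, which is what makes the service eligible to start together with its class.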

Where is the main class started?

Searching for the class_start keyword shows that it is started in on boot:

on boot
    # unrelated lines omitted ...
    class_start core
    class_start main

The readme says the following about class_start:

class_start <serviceclass>
   Start all services of the specified class if they are
   not already running.

class_stop <serviceclass>
   Stop all services of the specified class if they are
   currently running.

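So once the boot trigger fires during system startup, the daemons of class core and class main that are not declared disabled are started by the system.

For the details you have to read how init parses init.rc; that is set aside for now.

That leaves one question: how are the daemons that are declared disabled started?

Take bootanimation as an example: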

service bootanim /system/bin/bootanimation
    class main
    user graphics
    group graphics
    disabled
    oneshot

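It is also in class main, its user and group are both graphics, and it is disabled, so class_start main does not start it. Where is it started, then?

As covered in an earlier post, it is started from SurfaceFlinger's readyToRun function (you can also just search for the bootanim keyword):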

status_t SurfaceFlinger::readyToRun()
{
    // ... omitted ...
    if (SurfaceFlinger::sBootanimEnable) {
        // start boot animation
        LOGI("start bootanim!");
        property_set("ctl.start", "bootanim");
    }
    return NO_ERROR;
}


This actually calls the property_set function in system/core/libcutils/properties.c:

#ifdef HAVE_LIBC_SYSTEM_PROPERTIES

#define _REALLY_INCLUDE_SYS__SYSTEM_PROPERTIES_H_
#include <sys/_system_properties.h>

int property_set(const char *key, const char *value)
{
    return __system_property_set(key, value);
}

__system_property_set lives in bionic/libc/bionic/system_properties.c:


int __system_property_set(const char *key, const char *value)
{
    int err;
    int tries = 0;
    int update_seen = 0;
    prop_msg msg;

    if(key == 0) return -1;
    if(value == 0) value = "";
    if(strlen(key) >= PROP_NAME_MAX) return -1;
    if(strlen(value) >= PROP_VALUE_MAX) return -1;

    memset(&msg, 0, sizeof msg);
    msg.cmd = PROP_MSG_SETPROP;
    strlcpy(msg.name, key, sizeof msg.name);
    strlcpy(msg.value, value, sizeof msg.value);

    err = send_prop_msg(&msg);
    if(err < 0) {
        return err;
    }

    return 0;
}


It packs the key, the value and the command into a msg and hands it to send_prop_msg, which sends the msg over a socket to property_service on the other end.
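The packing step can be tried outside of bionic with a small stand-alone sketch. The struct layout, the two size limits and the command value below are assumptions made for illustration (the real definitions come from bionic's private system properties header), and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Assumed sizes and command value, for illustration only. */
#define DEMO_NAME_MAX    32
#define DEMO_VALUE_MAX   92
#define DEMO_MSG_SETPROP 1

struct demo_prop_msg {
    unsigned cmd;
    char name[DEMO_NAME_MAX];
    char value[DEMO_VALUE_MAX];
};

/* Fill msg from key/value. Returns 0 on success, or -1 if either string
 * would not fit together with its terminating NUL byte. */
int demo_pack_setprop(struct demo_prop_msg *msg, const char *key, const char *value)
{
    assert(msg != NULL);
    if (key == NULL) return -1;
    if (value == NULL) value = "";
    if (strlen(key) >= DEMO_NAME_MAX) return -1;
    if (strlen(value) >= DEMO_VALUE_MAX) return -1;

    memset(msg, 0, sizeof(*msg));
    msg->cmd = DEMO_MSG_SETPROP;
    memcpy(msg->name, key, strlen(key) + 1);      /* length checked above */
    memcpy(msg->value, value, strlen(value) + 1);
    return 0;
}
```

Because the whole fixed-size struct is sent in one send() call, the receiver can reject anything whose length differs from sizeof the message, which is the check handle_property_set_fd performs.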

static int send_prop_msg(prop_msg *msg)
{
    struct pollfd pollfds[1];
    struct sockaddr_un addr;
    socklen_t alen;
    size_t namelen;
    int s;
    int r;
    int result = -1;
    s = socket(AF_LOCAL, SOCK_STREAM, 0);
    if(s < 0) {
        return result;
    }

    memset(&addr, 0, sizeof(addr));
    namelen = strlen(property_service_socket); // the property_service socket is at /dev/socket/property_service
    strlcpy(addr.sun_path, property_service_socket, sizeof addr.sun_path);
    addr.sun_family = AF_LOCAL;
    alen = namelen + offsetof(struct sockaddr_un, sun_path) + 1;

    if(TEMP_FAILURE_RETRY(connect(s, (struct sockaddr *) &addr, alen) < 0)) {
        close(s);
        return result;
    }
    r = TEMP_FAILURE_RETRY(send(s, msg, sizeof(prop_msg), 0));

    if(r == sizeof(prop_msg)) {
        // We successfully wrote to the property server but now we
        // wait for the property server to finish its work.  It
        // acknowledges its completion by closing the socket so we
        // poll here (on nothing), waiting for the socket to close.
        // If you 'adb shell setprop foo bar' you'll see the POLLHUP
        // once the socket closes.  Out of paranoia we cap our poll
        // at 250 ms.
        pollfds[0].fd = s;
        pollfds[0].events = 0;
        r = TEMP_FAILURE_RETRY(poll(pollfds, 1, 250 /* ms */));
        if (r == 1 && (pollfds[0].revents & POLLHUP) != 0) {
            result = 0;
        } else {
            // Ignore the timeout and treat it like a success anyway.
            // The init process is single-threaded and its property
            // service is sometimes slow to respond (perhaps it's off
            // starting a child process or something) and thus this
            // times out and the caller thinks it failed, even though
            // it's still getting around to it.  So we fake it here,
            // mostly for ctl.* properties, but we do try and wait 250
            // ms so callers who do read-after-write can reliably see
            // what they've written.  Most of the time.
            // TODO: fix the system properties design.
            result = 0;
        }
    }

    close(s);
    return result;
}
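The "acknowledge by closing the socket" idea from the comments in send_prop_msg can be reproduced with a socketpair. The following is a minimal sketch under that assumption; demo_wait_for_hangup and demo_ack_by_close are hypothetical names, and the second function only stands in for a server that closes its end once it is done.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the peer hung up within timeout_ms, 0 on timeout, -1 on error. */
int demo_wait_for_hangup(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int r;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = 0;      /* POLLHUP is reported even when no events are requested */
    pfd.revents = 0;
    r = poll(&pfd, 1, timeout_ms);
    if (r < 0) return -1;
    if (r == 1 && (pfd.revents & POLLHUP)) return 1;
    return 0;
}

/* One end of a socketpair plays the server and closes; the other end
 * plays the client and waits for that close. */
int demo_ack_by_close(void)
{
    int sv[2], r;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return -1;
    close(sv[1]);                          /* the "server" acknowledges by closing */
    r = demo_wait_for_hangup(sv[0], 250);
    close(sv[0]);
    return r;
}
```

Unlike this sketch, send_prop_msg treats a timeout as success too, for the reason given in its own comment: init is single-threaded and may simply be slow to respond.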

At this point property_service deserves a few words. It is created by the init process: in the main function of init.c,

queue_builtin_action(property_service_init_action, "property_service_init");

eventually calls

void start_property_service(void)
{
    int fd;

    load_properties_from_file(PROP_PATH_SYSTEM_BUILD);
    load_properties_from_file(PROP_PATH_SYSTEM_DEFAULT);
    load_properties_from_file(PROP_PATH_LOCAL_OVERRIDE);
    /* Read persistent properties after all default values have been loaded. */
    load_persistent_properties();

    fd = create_socket(PROP_SERVICE_NAME, SOCK_STREAM, 0666, 0, 0);
    if(fd < 0) return;
    fcntl(fd, F_SETFD, FD_CLOEXEC);
    fcntl(fd, F_SETFL, O_NONBLOCK);

    listen(fd, 8);
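    property_set_fd = fd; // remember the property service fd
}

This simply creates the socket. Note that PROP_SERVICE_NAME here is "property_service".

How does property_service receive the request?

At the end of the main function in init.c there is a polling loop: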

int main(int argc, char **argv)
{
  //.....
 
  for(;;) {
        int nr, i, timeout = -1;

        execute_one_command();
        restart_processes();

        if (!property_set_fd_init && get_property_set_fd() > 0) { // check whether property_service has been initialized
            ufds[fd_count].fd = get_property_set_fd();
            ufds[fd_count].events = POLLIN;
            ufds[fd_count].revents = 0;
            fd_count++;          // if so, its fd joins the poll set
            property_set_fd_init = 1;
        }
        if (!signal_fd_init && get_signal_fd() > 0) {
            ufds[fd_count].fd = get_signal_fd();
            ufds[fd_count].events = POLLIN;
            ufds[fd_count].revents = 0;
            fd_count++;
            signal_fd_init = 1;
        }
        if (!keychord_fd_init && get_keychord_fd() > 0) {
            ufds[fd_count].fd = get_keychord_fd();
            ufds[fd_count].events = POLLIN;
            ufds[fd_count].revents = 0;
            fd_count++;
            keychord_fd_init = 1;
        }

        if (process_needs_restart) {
            timeout = (process_needs_restart - gettime()) * 1000;
            if (timeout < 0)
                timeout = 0;
        }
        if (!action_queue_empty() || cur_action)
            timeout = 0;

#if BOOTCHART
        if (bootchart_count > 0) {
            if (timeout < 0 || timeout > BOOTCHART_POLLING_MS)
                timeout = BOOTCHART_POLLING_MS;
            if (bootchart_step() < 0 || --bootchart_count == 0) {
                bootchart_finish();
                bootchart_count = 0;
            }
        }
#endif

        nr = poll(ufds, fd_count, timeout); // wait for events
        if (nr <= 0)
            continue;

        for (i = 0; i < fd_count; i++) {
            if (ufds[i].revents == POLLIN) {
                if (ufds[i].fd == get_property_set_fd()) // property_service fd: call handle_property_set_fd
                    handle_property_set_fd();
                else if (ufds[i].fd == get_keychord_fd())
                    handle_keychord();
                else if (ufds[i].fd == get_signal_fd())
                    handle_signal();
            }
        }
    }

    return 0;
}
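Stripped of the init specifics, the loop is a poll-and-dispatch pattern: register descriptors, wait, and call the handler that belongs to each readable descriptor. A minimal sketch of one iteration follows; the handler table and all demo_* names are hypothetical, and init itself compares descriptors against get_property_set_fd() and friends instead of using a table.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

typedef void (*demo_handler)(int fd);

/* Poll nfds descriptors once and call handlers[i] for every readable one.
 * Returns the number of handlers invoked, or -1 if poll() fails. */
int demo_poll_once(struct pollfd *fds, demo_handler *handlers, int nfds, int timeout_ms)
{
    int i, called = 0;
    int nr;

    assert(fds != NULL && handlers != NULL);
    nr = poll(fds, nfds, timeout_ms);
    if (nr < 0) return -1;
    for (i = 0; i < nfds && nr > 0; i++) {
        if (fds[i].revents & POLLIN) {
            handlers[i](fds[i].fd);
            called++;
        }
    }
    return called;
}

/* Example handler: read one byte and count the call. */
int demo_hits = 0;

void demo_on_readable(int fd)
{
    char c;
    if (read(fd, &c, 1) == 1) demo_hits++;
}
```

One deliberate difference: the sketch tests revents with a bit mask, while the loop quoted above compares revents to POLLIN exactly, so it skips a descriptor that reports POLLIN together with another flag.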

In other words, when someone sends a message to /dev/socket/property_service, handle_property_set_fd is called here to process it.


void handle_property_set_fd()
{
    prop_msg msg;
    int s;
    int r;
    int res;
    struct ucred cr;
    struct sockaddr_un addr;
    socklen_t addr_size = sizeof(addr);
    socklen_t cr_size = sizeof(cr);

    if ((s = accept(property_set_fd, (struct sockaddr *) &addr, &addr_size)) < 0) {
        return;
    }
    /* Check socket options here */
    if (getsockopt(s, SOL_SOCKET, SO_PEERCRED, &cr, &cr_size) < 0) {
        close(s);
        ERROR("Unable to recieve socket options\n");
        return;
    }

    r = TEMP_FAILURE_RETRY(recv(s, &msg, sizeof(msg), 0));
    if(r != sizeof(prop_msg)) {
        ERROR("sys_prop: mis-match msg size recieved: %d expected: %d errno: %d\n",
              r, sizeof(prop_msg), errno);
        close(s);
        return;
    }
    switch(msg.cmd) { // dispatch on msg.cmd; here it is a setprop, and the name is ctl.start
    case PROP_MSG_SETPROP:
        msg.name[PROP_NAME_MAX-1] = 0;
        msg.value[PROP_VALUE_MAX-1] = 0;

        if(memcmp(msg.name,"ctl.",4) == 0) {
            // Keep the old close-socket-early behavior when handling
            // ctl.* properties.
            close(s);
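            if (check_control_perms(msg.value, cr.uid, cr.gid)) { // permission check: root and system always pass; other callers must be explicitly allowed for this service
                handle_control_message((char*) msg.name + 4, (char*) msg.value); // handle the request; the string passed in is "start"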
            } else {
                ERROR("sys_prop: Unable to %s service ctl [%s] uid:%d gid:%d pid:%d\n",
                        msg.name + 4, msg.value, cr.uid, cr.gid, cr.pid);
            }
        } else {
            if (check_perms(msg.name, cr.uid, cr.gid)) {
                property_set((char*) msg.name, (char*) msg.value);
            } else {
                ERROR("sys_prop: permission denied uid:%d  name:%s\n",
                      cr.uid, msg.name);
            }

            // Note: bionic's property client code assumes that the
            // property server will not close the socket until *AFTER*
            // the property is written to memory.
            close(s);
        }
        break;
    default:
        close(s);
        break;
    }
}
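The permission check for ctl.* requests is not quoted in this post, so the following is only a sketch of how such a check could look, assuming a small table keyed by service name. The table entry, the either-uid-or-gid rule and all demo_* names are assumptions and should be compared against property_service.c in the tree you are reading.

```c
#include <assert.h>
#include <string.h>

#define DEMO_AID_ROOT   0
#define DEMO_AID_SYSTEM 1000

/* Hypothetical ACL: which non-privileged uid/gid may control which service. */
struct demo_control_perm { const char *service; unsigned uid; unsigned gid; };

static const struct demo_control_perm demo_control_perms[] = {
    { "demo-daemon", 2000, 2001 },   /* made-up entry */
    { NULL, 0, 0 },
};

/* Returns 1 if the caller may control the named service, 0 otherwise. */
int demo_check_control_perms(const char *name, unsigned uid, unsigned gid)
{
    int i;

    assert(name != NULL);
    if (uid == DEMO_AID_ROOT || uid == DEMO_AID_SYSTEM)
        return 1;                    /* privileged callers always pass */

    for (i = 0; demo_control_perms[i].service; i++) {
        if (strcmp(demo_control_perms[i].service, name) != 0)
            continue;
        if ((uid && demo_control_perms[i].uid == uid) ||
            (gid && demo_control_perms[i].gid == gid))
            return 1;
    }
    return 0;
}
```

For the case followed here no table entry is needed: surfaceflinger runs as user system, so its ctl.start request for bootanim passes on the first test.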

handle_control_message is in system/core/init/init.c:


void handle_control_message(const char *msg, const char *arg)
{
    if (!strcmp(msg,"start")) { // our case: start
        msg_start(arg); 
    } else if (!strcmp(msg,"stop")) {
        msg_stop(arg);
    } else if (!strcmp(msg,"restart")) {
        msg_stop(arg);
        msg_start(arg);
    } else {
        ERROR("unknown control msg '%s'\n", msg);
    }
}

static void msg_start(const char *name)
{
    struct service *svc;
    char *tmp = NULL;
    char *args = NULL;

    // find the service in the service list that was built while parsing init.rc
    if (!strchr(name, ':'))
        svc = service_find_by_name(name);
    else {
        tmp = strdup(name);
        args = strchr(tmp, ':');
        *args = '\0';
        args++;

        svc = service_find_by_name(tmp);
    }

    if (svc) {
        service_start(svc, args); // start the service
    } else {
        ERROR("no such service '%s'\n", name);
    }
    if (tmp)
        free(tmp);
}
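The "name:args" convention that msg_start handles can be isolated into a small helper. This is a sketch rather than init's code: demo_split_service_arg is a hypothetical name, and unlike msg_start it also reports allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split "name:args" at the first ':'.
 * Returns a malloc'ed copy that holds the service name (the caller frees it).
 * *args points into that copy just after the ':', or is NULL when the
 * input contains no ':'. Returns NULL if the allocation fails. */
char *demo_split_service_arg(const char *spec, const char **args)
{
    char *copy, *colon;

    assert(spec != NULL && args != NULL);
    copy = malloc(strlen(spec) + 1);
    if (copy == NULL) return NULL;
    strcpy(copy, spec);

    colon = strchr(copy, ':');
    if (colon != NULL) {
        *colon = '\0';        /* terminate the name part in place */
        *args = colon + 1;    /* the rest is the dynamic argument string */
    } else {
        *args = NULL;
    }
    return copy;
}
```

For "bootanim" the argument pointer stays NULL, which is why service_start takes the plain execve branch in the boot animation case.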

Next, look at service_start. The main things this function does are add environment variables, fork a child process, create sockets, and then exec.

void service_start(struct service *svc, const char *dynamic_args)
{
    struct stat s;
    pid_t pid;
    int needs_console;
    int n;

        /* starting a service removes it from the disabled or reset
         * state and immediately takes it out of the restarting
         * state if it was in there
         */
    svc->flags &= (~(SVC_DISABLED|SVC_RESTARTING|SVC_RESET));
    svc->time_started = 0;

        /* running processes require no additional work -- if
         * they're in the process of exiting, we've ensured
         * that they will immediately restart on exit, unless
         * they are ONESHOT
         */
    if (svc->flags & SVC_RUNNING) { // the service is already running, so return
        return;
    }
    needs_console = (svc->flags & SVC_CONSOLE) ? 1 : 0; // whether the service declares the console option
    if (needs_console && (!have_console)) {
        ERROR("service '%s' requires console\n", svc->name);
        svc->flags |= SVC_DISABLED;
        return;
    }

    if (stat(svc->args[0], &s) != 0) {
        ERROR("cannot find '%s', disabling '%s'\n", svc->args[0], svc->name);
        svc->flags |= SVC_DISABLED;
        return;
    }

    if ((!(svc->flags & SVC_ONESHOT)) && dynamic_args) {
        ERROR("service '%s' must be one-shot to use dynamic args, disabling\n",
               svc->args[0]);
        svc->flags |= SVC_DISABLED;
        return;
    }

    NOTICE("starting '%s'\n", svc->name);

    pid = fork(); // fork the child process

    if (pid == 0) {
        struct socketinfo *si;
        struct svcenvinfo *ei;
        char tmp[32];
        int fd, sz;

        if (properties_inited()) { // the property area was already initialized earlier in init's main
            get_property_workspace(&fd, &sz); // the workspace is at /dev/__properties__
            sprintf(tmp, "%d,%d", dup(fd), sz);
            add_environment("ANDROID_PROPERTY_WORKSPACE", tmp);
        }

        for (ei = svc->envvars; ei; ei = ei->next) // setenv options in the service declaration are added to the environment
            add_environment(ei->name, ei->value);

        for (si = svc->sockets; si; si = si->next) { // socket options in the service declaration create sockets; zygote, for example, declares: socket zygote stream 666
            int socket_type = (
                    !strcmp(si->type, "stream") ? SOCK_STREAM :
                        (!strcmp(si->type, "dgram") ? SOCK_DGRAM : SOCK_SEQPACKET));
            int s = create_socket(si->name, socket_type,
                                  si->perm, si->uid, si->gid);
            if (s >= 0) {
                publish_socket(si->name, s); // publish the socket fd through the environment
            }
        }
        if (svc->ioprio_class != IoSchedClass_NONE) { // sets the I/O scheduling priority if the service declares one; I found no example of this in init.rc
            if (android_set_ioprio(getpid(), svc->ioprio_class, svc->ioprio_pri)) {
                ERROR("Failed to set pid %d ioprio = %d,%d: %s\n",
                      getpid(), svc->ioprio_class, svc->ioprio_pri, strerror(errno));
            }
        }

        if (needs_console) {
            setsid();
            open_console();
        } else {
            zap_stdio();
        }

#if 0
        for (n = 0; svc->args[n]; n++) {
            INFO("args[%d] = '%s'\n", n, svc->args[n]);
        }
        for (n = 0; ENV[n]; n++) {
            INFO("env[%d] = '%s'\n", n, ENV[n]);
        }
#endif

        setpgid(0, getpid());
    /* as requested, set our gid, supplemental gids, and uid */
        /* The service's gid and uid are mapped from its user and group
         * options; see case K_user: in parse_line_service in init_parser.c.
         * In short, both come from decode_uid, which looks the name up in the
         * global android_ids table. For example, graphics is AID_GRAPHICS,
         * so bootanim's uid and gid are both 1003. */
        if (svc->gid) {
            if (setgid(svc->gid) != 0) {
                ERROR("setgid failed: %s\n", strerror(errno));
                _exit(127);
            }
        }
        if (svc->nr_supp_gids) {
            if (setgroups(svc->nr_supp_gids, svc->supp_gids) != 0) {
                ERROR("setgroups failed: %s\n", strerror(errno));
                _exit(127);
            }
        }
        if (svc->uid) {
            if (setuid(svc->uid) != 0) {
                ERROR("setuid failed: %s\n", strerror(errno));
                _exit(127);
            }
        }
        // next comes the exec: it runs the binary with its arguments, which lands in main() of bootanimation_main.cpp
        if (!dynamic_args) {
            if (execve(svc->args[0], (char**) svc->args, (char**) ENV) < 0) {
                ERROR("cannot execve('%s'): %s\n", svc->args[0], strerror(errno));
            }
        } else {
            char *arg_ptrs[INIT_PARSER_MAXARGS+1];
            int arg_idx = svc->nargs;
            char *tmp = strdup(dynamic_args);
            char *next = tmp;
            char *bword;

            /* Copy the static arguments */
            memcpy(arg_ptrs, svc->args, (svc->nargs * sizeof(char *)));

            while((bword = strsep(&next, " "))) {
                arg_ptrs[arg_idx++] = bword;
                if (arg_idx == INIT_PARSER_MAXARGS)
                    break;
            }
            arg_ptrs[arg_idx] = '\0';
            execve(svc->args[0], (char**) arg_ptrs, (char**) ENV);
        }
        _exit(127);
    }

    if (pid < 0) {
        ERROR("failed to start '%s'\n", svc->name);
        svc->pid = 0;
        return;
    }
    svc->time_started = gettime();
    svc->pid = pid;
    svc->flags |= SVC_RUNNING; // record the service state and start time

    if (properties_inited())
        notify_service_state(svc->name, "running"); // publish that the service is running
}
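The fork-then-exec core of service_start can be exercised on any Linux machine with a much smaller function. This is a sketch under simplifying assumptions: demo_spawn_and_wait is a hypothetical name, it does not change uid or gid (that requires root), and it blocks in waitpid(), whereas init records the pid and goes back to its poll loop.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv[0] with the given environment, and wait for the child.
 * Returns the child's exit status, 127 if the exec failed, or -1 on any
 * other error. */
int demo_spawn_and_wait(char *const argv[], char *const envp[])
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child. A real service would call setgid, setgroups and setuid
         * here, in that order, before the exec. */
        execve(argv[0], argv, envp);
        _exit(127);               /* reached only if execve failed */
    }

    /* Parent. */
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The order of the credential calls matters: once setuid has dropped root, the process can no longer change its groups, which is why service_start sets the gid and supplemental groups first.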

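This is how surfaceflinger, through property_set, brings up bootanim to play the boot animation.


As mentioned earlier, on boot uses class_start to bring up the services of class main and core. Here is how that works.
In my view, the essence of init's startup is the parsing of init.rc, which is implemented in init_parser.c.
Services are parsed by the parse_service and parse_line_service functions, which add the service to the service list and fill in the service struct from the values parsed for each keyword.
The command behind each keyword is listed in keywords.h, where we find the following definition: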
    KEYWORD(class_start, COMMAND, 1, do_class_start)
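A table like this is typically built with a macro that is expanded once per row. The sketch below shows the idea with a reduced row format of (name, minimum argument count, handler); the macro names, the row format and the demo_* handlers are all hypothetical and simpler than what keywords.h actually does.

```c
#include <assert.h>
#include <string.h>

typedef int (*demo_cmd_fn)(int nargs, char **args);

/* Stand-in handlers: the first one just records which class was requested. */
const char *demo_started_class = NULL;

int demo_do_class_start(int nargs, char **args)
{
    (void)nargs;
    demo_started_class = args[1];
    return 0;
}

int demo_do_class_stop(int nargs, char **args)
{
    (void)nargs;
    (void)args;
    return 0;
}

/* One row per keyword: name, minimum number of arguments, handler. */
#define DEMO_KEYWORDS(X) \
    X(class_start, 1, demo_do_class_start) \
    X(class_stop,  1, demo_do_class_stop)

struct demo_keyword { const char *name; int nargs; demo_cmd_fn func; };

#define DEMO_ROW(sym, n, fn) { #sym, n, fn },
static const struct demo_keyword demo_keywords[] = { DEMO_KEYWORDS(DEMO_ROW) };
#undef DEMO_ROW

/* Look up args[0] and run its handler.
 * Returns -1 for an unknown keyword or too few arguments. */
int demo_run_command(int nargs, char **args)
{
    size_t i;

    assert(nargs >= 1 && args != NULL);
    for (i = 0; i < sizeof(demo_keywords) / sizeof(demo_keywords[0]); i++) {
        if (strcmp(demo_keywords[i].name, args[0]) == 0) {
            if (nargs - 1 < demo_keywords[i].nargs) return -1;
            return demo_keywords[i].func(nargs, args);
        }
    }
    return -1;
}
```

The handler receives the keyword itself as args[0], which matches the way do_class_start reads the class name from args[1].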

That is, once on boot is triggered, do_class_start is called to start the corresponding services:

int do_class_start(int nargs, char **args)
{
        /* Starting a class does not start services
         * which are explicitly disabled.  They must
         * be started individually.
         */
    service_for_each_class(args[1], service_start_if_not_disabled);
    return 0;
}
void service_for_each_class(const char *classname,
                            void (*func)(struct service *svc))
{
    struct listnode *node;
    struct service *svc;
    list_for_each(node, &service_list) {
        svc = node_to_item(node, struct service, slist);
        if (!strcmp(svc->classname, classname)) {
            func(svc);
        }
    }
}

static void service_start_if_not_disabled(struct service *svc)
{
    if (!(svc->flags & SVC_DISABLED)) {
        service_start(svc, NULL);
    }
}

In effect, it looks up the services in the service list by class name and, for each one that is not disabled, calls service_start to start it.
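The filtering rule is easy to check with a plain array in place of init's linked list. In this sketch the struct, the flag value and the demo_* names are hypothetical, and "starting" a service just sets a field.

```c
#include <assert.h>
#include <string.h>

#define DEMO_SVC_DISABLED 0x01   /* assumed bit value, for illustration */

struct demo_svc {
    const char *name;
    const char *classname;
    unsigned flags;
    int started;
};

/* Mark every non-disabled service of the given class as started.
 * Returns how many services were started. */
int demo_class_start(struct demo_svc *list, int n, const char *classname)
{
    int i, count = 0;

    assert(list != NULL && classname != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(list[i].classname, classname) != 0)
            continue;                       /* other class */
        if (list[i].flags & DEMO_SVC_DISABLED)
            continue;                       /* must be started by name */
        list[i].started = 1;
        count++;
    }
    return count;
}
```

With surfaceflinger (class main, no flags) and bootanim (class main, disabled) in the list, starting class main starts only surfaceflinger, which is the behavior described above.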

From there on, the flow is the same as the ctl.start path above.
