[Apollo] Using the Supervisor Component

Supervisor

Supervisor is a client/server system for controlling a set of processes on UNIX-like operating systems.

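supervisord (server): responds to commands from the client, starts and stops processes, monitors them, restarts processes that crash or exit, records process logs, and generates and handles events at points in a process's lifecycle. supervisord reads its settings from a configuration file, usually /etc/supervisord.conf.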

supervisorctl (client): a command-line shell that sends the user's control commands to the server.

The server and client communicate over a socket (a UNIX domain socket or a TCP socket, depending on the configuration).
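
To make the socket link concrete, here is a minimal sketch of pointing supervisorctl at a specific UNIX domain socket. The path /tmp/supervisor.sock is an assumption taken from the stock echo_supervisord_conf template, and server_url and ctl are hypothetical helper names; the Apollo conf files may configure a different socket.

```shell
# Assumption: this is the socket path from the stock echo_supervisord_conf
# template; the Apollo conf files may point somewhere else.
SUPERVISOR_SOCK="${SUPERVISOR_SOCK:-/tmp/supervisor.sock}"

# Turn a socket path into the unix:// URL form that supervisorctl accepts.
server_url() {
    printf 'unix://%s\n' "$1"
}

# Hypothetical helper: run one supervisorctl command against that socket.
# -s (--serverurl) overrides the serverurl read from the config file.
ctl() {
    if [ ! -S "$SUPERVISOR_SOCK" ]; then
        echo "no supervisord socket at $SUPERVISOR_SOCK" >&2
        return 1
    fi
    supervisorctl -s "$(server_url "$SUPERVISOR_SOCK")" "$@"
}
```

With supervisord running, `ctl status` would then behave like a plain `supervisorctl status`, only with the server address spelled out explicitly.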

Using the supervisor component in Apollo

The supervisor component is installed by docker/build/installers/install_supervisor.sh:

# Fail on first error.
set -e

apt-get install -y supervisor
# Add supervisord config file
echo_supervisord_conf > /etc/supervisord.conf

Server side

Listing the processes shows that supervisord is already running and that its configuration file is /apollo/modules/tools/supervisord/dev.conf:

ubuntu@in_dev_docker:/apollo$ ps aux | grep supervisor
root       193  0.2  0.0  49900 14312 ?        Ss   10:29   0:02 /usr/bin/python /usr/local/bin/supervisord -c /apollo/modules/tools/supervisord/dev.conf

Startup: supervisord is launched in scripts/bootstrap.sh:

# Setup supervisord.
    if [ "$HOSTNAME" == "in_release_docker" ]; then
        supervisord -c /apollo/modules/tools/supervisord/release.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with release conf"
    else
        supervisord -c /apollo/modules/tools/supervisord/dev.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with dev conf"
    fi
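
The branch above boils down to a hostname-to-config mapping. A minimal sketch of that mapping as a standalone function, where the name supervisord_conf_for is hypothetical and not part of bootstrap.sh:

```shell
# Print the supervisord config path for a given hostname, mirroring the
# release/dev branch in bootstrap.sh. The function name is hypothetical.
supervisord_conf_for() {
    conf_dir=/apollo/modules/tools/supervisord
    if [ "$1" = "in_release_docker" ]; then
        echo "$conf_dir/release.conf"
    else
        echo "$conf_dir/dev.conf"
    fi
}

# Show which config each container type would get.
supervisord_conf_for in_release_docker
supervisord_conf_for in_dev_docker
```

Any hostname other than in_release_docker falls through to dev.conf, which matches what the ps output earlier shows for the dev container.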

Take a look at dev.conf, using routing as an example:

[program:routing]
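command=/apollo/bazel-bin/modules/routing/routing --flagfile=/apollo/modules/routing/conf/routing.conf ; command that starts routing
autostart=false ; do not start this program automatically when supervisord starts
numprocs=1  ; number of instances; supervisord starts numprocs routing processes
exitcodes=0 ; exit codes that autorestart treats as "expected"
stopsignal=INT ; signal sent to the process when it is stopped
startretries=10 ; number of start attempts before giving up
autorestart=unexpected ; restart automatically when the process exits with an unexpected exit code
redirect_stderr=true ; stderr is redirected to stdout, so it ends up in the file named by stdout_logfile
stdout_logfile=/apollo/data/log/routing.out ; log file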
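
How autorestart and exitcodes interact is easiest to see as a small decision function. The sketch below is a simplified model of that rule, not supervisord's own code; should_restart is a hypothetical name, and it ignores the separate startretries logic that applies while a process is still starting up.

```shell
# should_restart AUTORESTART EXITCODES EXIT_CODE
# Prints "yes" or "no". EXITCODES is a comma-separated list such as "0" or "0,2".
# Simplified model: only covers a process that exited after it was running.
should_restart() {
    policy="$1"
    expected="$2"
    code="$3"
    case "$policy" in
        true)  echo yes ;;   # always restart
        false) echo no ;;    # never restart
        unexpected)
            # Restart only if the exit code is NOT one of the expected codes.
            old_ifs="$IFS"
            IFS=','
            for c in $expected; do
                if [ "$c" = "$code" ]; then
                    IFS="$old_ifs"
                    echo no
                    return 0
                fi
            done
            IFS="$old_ifs"
            echo yes
            ;;
        *)
            echo "unknown autorestart value: $policy" >&2
            return 1
            ;;
    esac
}

# With the routing settings above (autorestart=unexpected, exitcodes=0):
should_restart unexpected 0 0   # clean exit
should_restart unexpected 0 1   # crash
```

Under the routing configuration, a clean exit with code 0 leaves the process stopped, while any other exit code triggers a restart.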

Client side

scripts/bootstrap.sh, the script that starts Dreamview, shows that both the dreamview and monitor processes are started through supervisorctl:

# Start monitor.
supervisorctl start monitor > /dev/null
# Start dreamview.
bash scripts/voice_detector.sh start
supervisorctl start dreamview > /dev/null
echo "Dreamview is running at http://localhost:8888"

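Dreamview controls the individual module processes through supervisorctl; the commands are defined in modules/dreamview/conf/hmi.conf.
Taking localization as an example, switching the localization module on or off in the HMI runs the following commands: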

modules {
  key: "localization"
  value: {
    display_name: "Localization"
    supported_commands {
      key: "start"
      value: "supervisorctl start localization &"
    }
    supported_commands {
      key: "stop"
      value: "supervisorctl stop localization &"
    }
  }
}

Usage

supervisorctl supports the following commands:

supervisor> help

default commands (type help <topic>):
=====================================
add    exit      open  reload  restart   start   tail   
avail  fg        pid   remove  shutdown  status  update 
clear  maintail  quit  reread  signal    stop    version
  • Check process status
ubuntu@in_dev_docker:/apollo$ sudo supervisorctl status
canbus                           STOPPED   Not started
conti_radar                      STOPPED   Not started
control                          STOPPED   Not started
dreamview                        RUNNING   pid 266, uptime 1:08:45
gps                              STOPPED   Not started
localization                     STOPPED   Not started
mobileye                         STOPPED   Not started
monitor                          RUNNING   pid 207, uptime 1:08:46
navigation_control               STOPPED   Not started
navigation_localization          STOPPED   Not started
navigation_perception            STOPPED   Not started
navigation_planning              STOPPED   Not started
navigation_prediction            STOPPED   Not started
navigation_routing               STOPPED   Not started
navigation_server                STOPPED   Not started
perception                       STOPPED   Not started
planning                         STOPPED   Not started
prediction                       STOPPED   Not started
routing                          STOPPED   Not started
third_party_perception           STOPPED   Not started
  • View process logs
ubuntu@in_dev_docker:/apollo$ sudo supervisorctl                 
canbus                           STOPPED   Not started
conti_radar                      STOPPED   Not started
control                          STOPPED   Not started
dreamview                        RUNNING   pid 266, uptime 1:10:49
gps                              STOPPED   Not started
localization                     STOPPED   Not started
mobileye                         STOPPED   Not started
monitor                          RUNNING   pid 207, uptime 1:10:50
navigation_control               STOPPED   Not started
navigation_localization          STOPPED   Not started
navigation_perception            STOPPED   Not started
navigation_planning              STOPPED   Not started
navigation_prediction            STOPPED   Not started
navigation_routing               STOPPED   Not started
navigation_server                STOPPED   Not started
perception                       STOPPED   Not started
planning                         STOPPED   Not started
prediction                       STOPPED   Not started
routing                          STOPPED   Not started
third_party_perception           STOPPED   Not started
supervisor> tail -f dreamview 
==> Press Ctrl-C to exit <==
pty
E0621 11:40:39.131963   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.232152   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.332326   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.431500   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.531697   266 hdmap_util.cc:120] RelativeMap is empty
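
Both kinds of output above are easy to post-process in a script. The following is a minimal sketch that works on captured text only; running_programs and error_lines are hypothetical helper names, and the sample data is abbreviated from the transcripts above.

```shell
# Print the names of all programs whose state column is RUNNING.
# Reads `supervisorctl status` output on stdin.
running_programs() {
    awk '$2 == "RUNNING" { print $1 }'
}

# Keep only glog ERROR/FATAL lines, which start with E or F followed by MMDD.
# `|| true` keeps the exit status zero when nothing matches.
error_lines() {
    grep -E '^[EF][0-9]{4} ' || true
}

# Abbreviated sample of the status output shown above.
sample_status='canbus                           STOPPED   Not started
dreamview                        RUNNING   pid 266, uptime 1:08:45
monitor                          RUNNING   pid 207, uptime 1:08:46
routing                          STOPPED   Not started'

# One line from the dreamview log above.
sample_log='E0621 11:40:39.131963   266 hdmap_util.cc:120] RelativeMap is empty'

printf '%s\n' "$sample_status" | running_programs
printf '%s\n' "$sample_log" | error_lines
```

On a live system the same filters could be fed from `supervisorctl status | running_programs` and `supervisorctl tail dreamview | error_lines`; without -f, tail prints the end of the log and returns instead of following it.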

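Note: to start or stop processes, it is still best to use Dreamview. If Dreamview itself is misbehaving and you still need to debug, you can issue the commands from the supervisor command line instead.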
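
For the fallback case where modules have to be driven by hand, a small wrapper can guard against typos. This is only a sketch: module_ctl and the DRY_RUN switch are hypothetical and not part of Apollo, while the start, stop, restart and status subcommands are the ones listed in the help output above.

```shell
# module_ctl ACTION MODULE
# Hypothetical wrapper around supervisorctl for manual debugging.
# With DRY_RUN=1 it prints the command instead of running it.
module_ctl() {
    action="$1"
    module="$2"
    case "$action" in
        start|stop|restart|status) ;;
        *)
            echo "usage: module_ctl {start|stop|restart|status} MODULE" >&2
            return 2
            ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "supervisorctl $action $module"
    else
        supervisorctl "$action" "$module"
    fi
}

# Preview the commands for restarting localization by hand.
DRY_RUN=1 module_ctl stop localization
DRY_RUN=1 module_ctl start localization
```

Dropping DRY_RUN=1 runs the real commands, which may need sudo inside the container, as in the status example above.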
