Supervisor Notes

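I recently wrote a wiki watchdog (wiki-watchdog) that monitors changes to a wiki and posts them to a group chat through a DingTalk bot. The script is not very robust and may crash from time to time, so I needed a service that brings it back up promptly after a crash. After some searching, Supervisor turned out to be a good fit.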

Introduction

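Supervisor is a tool for managing and monitoring processes on UNIX-like operating systems. It is written in Python and follows a typical client/server architecture, made up of: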

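  • supervisord is the server side; it starts and supervises the managed processes;
  • supervisorctl is the client; it connects to the server and operates on the processes indirectly;
  • echo_supervisord_conf prints a sample configuration file whose comments explain each option in detail. It reads like documentation, which is friendly to newcomers like me.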

Installation

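Because Supervisor is written in Python, it can be installed with easy_install or pip. Ubuntu users can also install it conveniently with apt. I will not go into the details here, since tutorials are easy to find online.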

Loading the configuration

When supervisord is started without an explicit configuration file (that is, without the -c option), it searches the following paths.

$CWD/supervisord.conf
$CWD/etc/supervisord.conf
/etc/supervisord.conf
/etc/supervisor/supervisord.conf (since Supervisor 3.3.0)
../etc/supervisord.conf (Relative to the executable)
../supervisord.conf (Relative to the executable)

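Here $CWD is the directory from which the supervisord command is run, while "Relative to the executable" means relative to the directory that contains the supervisord executable.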
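
The lookup can be sketched as a small shell function. This is only an illustration of the search order listed above, not Supervisor's own code; the function name find_supervisord_conf and its two arguments are made up for this example.

```shell
# Hypothetical helper (not part of Supervisor): print the first config
# file that exists, following the search order listed above.
# $1: working directory ($CWD)
# $2: directory that contains the supervisord executable
find_supervisord_conf() {
    cwd=$1
    exe_dir=$2
    for candidate in \
        "$cwd/supervisord.conf" \
        "$cwd/etc/supervisord.conf" \
        "/etc/supervisord.conf" \
        "/etc/supervisor/supervisord.conf" \
        "$exe_dir/../etc/supervisord.conf" \
        "$exe_dir/../supervisord.conf"
    do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Nothing found: the caller has to pass a config file explicitly.
    return 1
}
```

It could be called as find_supervisord_conf "$PWD" "$(dirname "$(command -v supervisord)")" to see which file a plain supervisord invocation would likely pick up.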

Managing the configuration

The end of /etc/supervisor/supervisord.conf contains the following passage.

; The [include] section can just contain the "files" setting.  This
; setting can list multiple files (separated by whitespace or
; newlines).  It can also contain wildcards.  The filenames are
; interpreted as relative to this file.  Included files *cannot*
; include files themselves.

[include]
files = /etc/supervisor/conf.d/*.conf

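So when managing multiple processes, we can simply drop one file ending in .conf per program into the conf.d directory. Supervisor reads these files when supervisord starts; for a server that is already running, supervisorctl reread followed by supervisorctl update picks up the changes, and Supervisor then manages our services.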
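
As a rough sketch of that workflow, the helper below copies a program config into the include directory and then asks a running supervisord to pick it up. The function name install_program_conf and the CONF_DIR variable are assumptions made for this example; CONF_DIR defaults to the directory from the [include] section above.

```shell
# Hypothetical helper, not part of Supervisor.
# CONF_DIR defaults to the directory named in the [include] section above.
CONF_DIR=${CONF_DIR:-/etc/supervisor/conf.d}

# $1: path to a program config file; it must end in .conf
install_program_conf() {
    src=$1
    case "$src" in
        *.conf) ;;
        *)
            echo "skipped: $src does not end in .conf, so the wildcard would not match it" >&2
            return 1
            ;;
    esac
    cp "$src" "$CONF_DIR/" || return 1
    # Ask a running supervisord to re-scan its config files and apply the
    # changes. This step is skipped when supervisorctl is not installed.
    if command -v supervisorctl >/dev/null 2>&1; then
        supervisorctl reread && supervisorctl update
    fi
}
```

Writing into /etc/supervisor/conf.d usually requires root, so in practice the call would look like install_program_conf ./mytest.conf from a root shell.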

Managing processes

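Supervisor exists to manage our services for us, so the configuration files in conf.d deserve care. The output of the echo_supervisord_conf command mentioned earlier includes the following sample section.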

; The below sample program section shows all possible program subsection values,
; create one or more 'real' program: sections to be able to control them under
; supervisor.

;[program:theprogramname]
;command=/bin/cat              ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1                    ; number of processes copies to start (def 1)
;directory=/tmp                ; directory to cwd to before exec (def no cwd)
;umask=022                     ; umask for process (default None)
;priority=999                  ; the relative start priority (default 999)
;autostart=true                ; start at supervisord start (default: true)
;startsecs=1                   ; # of secs prog must stay up to be running (def. 1)
;startretries=3                ; max # of serial start failures when starting (default 3)
;autorestart=unexpected        ; when to restart if exited after running (def: unexpected)
;exitcodes=0,2                 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT               ; signal used to kill process (default TERM)
;stopwaitsecs=10               ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false             ; send stop signal to the UNIX process group (default false)
;killasgroup=false             ; SIGKILL the UNIX process group (def false)
;user=chrism                   ; setuid to this UNIX account to run the program
;redirect_stderr=true          ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path        ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10     ; # of stdout logfile backups (default 10)
;stdout_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
;stdout_events_enabled=false   ; emit events on stdout writes (default false)
;stderr_logfile=/a/path        ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10     ; # of stderr logfile backups (default 10)
;stderr_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
;stderr_events_enabled=false   ; emit events on stderr writes (default false)
;environment=A="1",B="2"       ; process environment additions (def no adds)
;serverurl=AUTO                ; override serverurl computation (childutils)

There are many options, but we do not need all of them. The example below shows how a program configuration file is written.

[program:mytest]
command=bash /etc/supervisor/conf.d/mybash.sh
directory=/etc/supervisor/conf.d
stdout_logfile=/tmp/stdout.log
startsecs=0
autostart=false
autorestart=false

In my testing, it works best to give the script in command as an absolute path.
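
For the crash-recovery use case from the start of these notes, the relevant switch is autorestart. Below is a minimal sketch that prints such a program section. The function name render_program_conf and the log location are assumptions for illustration, and the option names are the ones from the sample section above. This is not the actual wiki-watchdog setup.

```shell
# Hypothetical helper: print a [program:x] section that restarts the
# command whenever it exits after having started successfully.
# $1: program name
# $2: command line, using an absolute path
render_program_conf() {
    # The log path below is an assumption; adjust it to your system.
    cat <<EOF
[program:$1]
command=$2
autostart=true
autorestart=true
startsecs=1
startretries=3
redirect_stderr=true
stdout_logfile=/var/log/$1.log
EOF
}
```

The output can be redirected into a file under conf.d, for example render_program_conf mytest "bash /etc/supervisor/conf.d/mybash.sh" > /etc/supervisor/conf.d/mytest.conf. With autorestart=true the process is restarted on every exit, whereas the default value unexpected restarts it only when the exit code is not listed in exitcodes.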

Example

Before letting Supervisor manage a process for us, we need to make sure the supervisord service is actually running.

root@Server218 ~# ps aux | grep supervisor
root     20090  0.0  0.9  59580 19464 ?        Ss   19:58   0:00 /usr/bin/python /usr/bin/supervisord -n -c /etc/supervisor/supervisord.conf
root     22776  0.0  0.0  14196   860 pts/1    S    20:51   0:00 grep --color=auto supervisor

Other processes are managed through the client, supervisorctl, using the following form:

supervisorctl start programname
supervisorctl stop programname
supervisorctl restart programname

Next, a simple shell script that takes a little while to run.

#!/usr/bin/env bash
# Count once per second so the process stays alive for about 100 seconds.
i=1
while [ "$i" -le 100 ]
do
    i=$((i + 1))
    echo "$i"
    sleep 1
done

Start the program:

root@Server218 /e/s/conf.d# supervisorctl start mytest
mytest: started

Check the status:

root@Server218 /e/s/conf.d# supervisorctl
mytest                           RUNNING   pid 22937, uptime 0:00:21
supervisor> status
mytest                           RUNNING   pid 22937, uptime 0:00:27
supervisor> help

default commands (type help <topic>):
=====================================
add    exit      open  reload  restart   start   tail
avail  fg        pid   remove  shutdown  status  update
clear  maintail  quit  reread  signal    stop    version
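
The status lines are easy to consume from a script. Below is a small sketch that assumes the column layout shown in the session above (program name, then state, then details); program_state is a made-up helper name.

```shell
# Hypothetical helper: read `supervisorctl status` output on stdin and
# print the state column (RUNNING, STOPPED, ...) for one program.
# $1: program name. Returns non-zero if the program is not listed.
program_state() {
    awk -v name="$1" '$1 == name { print $2; found = 1 } END { exit !found }'
}
```

A typical call would be supervisorctl status | program_state mytest, which makes it simple to branch on the state in a monitoring script.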

Stop the program:

root@Server218 /e/s/conf.d# supervisorctl stop mytest
mytest: stopped

That completes the management of an external service.

Problems encountered

1 supervisor.sock refused connection.
Fix: restart the supervisord service.

2 unix:///tmp/supervisor.sock no such file
Fix: add permissions with chmod 777 /xxx/supervisor.sock

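In addition, move every path out of /tmp: change /tmp/supervisor.sock to /var/run/supervisor.sock, /tmp/supervisord.log to /var/log/supervisor.log, and /tmp/supervisord.pid to /var/run/supervisor.pid. Otherwise Linux may clean these files up automatically.

3 Startup fails with IOError: [Errno 13] Permission denied: '/var/log/supervisord.log'
Fix: make the file or directory writable, then remember to restart the supervisord service.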
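
The path changes described under item 2 can be applied in one pass with sed. This is a sketch that assumes the config still uses the /tmp paths named there; move_tmp_paths is a made-up helper name, and it prints the result to stdout rather than editing the file in place.

```shell
# Hypothetical helper: print a copy of a supervisord config in which the
# /tmp paths are replaced by the /var/run and /var/log paths from item 2.
# $1: path of the config file to read
move_tmp_paths() {
    sed \
        -e 's#/tmp/supervisor\.sock#/var/run/supervisor.sock#g' \
        -e 's#/tmp/supervisord\.log#/var/log/supervisor.log#g' \
        -e 's#/tmp/supervisord\.pid#/var/run/supervisor.pid#g' \
        "$1"
}
```

After reviewing the output, it can replace the original config file. The socket path appears in both the server and the client settings, so supervisord has to be restarted afterwards for supervisorctl to find the new socket.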
