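In real projects, scripts often need to run permanently in the background. Taking a PHP script as an example, the simplest approach is:
$ nohup php cli.php &
This keeps the script running in the background after the terminal is closed or Ctrl+C is pressed, but it cannot restart the script automatically when it exits abnormally.
Supervisor is a general-purpose process management program written in Python. It turns an ordinary command-line process into a background daemon, monitors its state, and restarts it automatically when it exits unexpectedly.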
Official site: http://supervisord.org/
Environment used in this article:
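To make the "monitor and restart" idea concrete, here is a toy Python 3 sketch of a restart loop. It is not how Supervisor is implemented; the function name run_with_restart and its parameters are made up for illustration, and the expected exit codes 0 and 2 simply mirror the defaults quoted later in this article.

```python
import subprocess
import sys


def run_with_restart(argv, max_restarts=3, expected_exit_codes=(0, 2)):
    """Run argv and start it again whenever it exits with an unexpected code.

    Returns the number of times the command was started.
    """
    starts = 0
    while True:
        starts += 1
        exit_code = subprocess.call(argv)
        if exit_code in expected_exit_codes:
            break  # clean exit: nothing to restart
        if starts > max_restarts:
            break  # give up once the allowed number of restarts is used
    return starts


if __name__ == "__main__":
    # A child that always fails, so every run counts as a crash.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_with_restart(failing, max_restarts=2))
```

A real supervisor also has to daemonize, capture logs, and handle signals, which is exactly the work Supervisor takes off your hands.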
- CentOS release 6.8 (Final)
- Python 2.6.6
- pip 7.1.0 from /usr/lib/python2.6/site-packages (python 2.6)
- supervisor 3.3.4
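Installation
Platform requirements
Quoted from the official site (http://supervisord.org/introduction.html#platform-requirements):
Supervisor has been tested and is known to run on Linux (Ubuntu 9.10), Mac OS X (10.4/10.5/10.6), Solaris (10 for Intel), and FreeBSD 6.1. It will likely work fine on most UNIX systems. Supervisor will not run at all under any version of Windows. Supervisor is known to work with Python 2.4 or later but will not work under any version of Python 3.
This quote describes the 3.3.x series used in this article; Supervisor 4.0 and later also run on Python 3.
My environment: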
$ python -V
Python 2.6.6
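Installing
There are several ways to install:
1. With easy_install (requires setuptools):
$ easy_install supervisor
2. With pip (requires pip; recommended):
$ pip install supervisor
3. On Debian / Ubuntu, directly through apt:
$ apt-get install supervisor
This article was tested with the pip installation only; the other methods were not tested.
Generating the configuration file after installation:
1. After installing with easy_install or pip, run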
$ mkdir /etc/supervisor
$ echo_supervisord_conf > /etc/supervisor/supervisord.conf
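to generate the configuration file.
2. After installing with apt-get, Supervisor's main configuration file is at:
/etc/supervisor/supervisord.conf
The default main configuration file is not complete on its own. Child-process configuration files live in:
/etc/supervisor/conf.d/*.conf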
Starting Supervisor
First, edit the [include] section at the end of supervisord.conf:
[include]
files = /etc/supervisor/conf.d/*.conf
This enables child configuration files, so the main configuration file no longer has to change.
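As a rough illustration of what the files glob does, the Python 3 sketch below reads a main file and then every file matched by its [include] pattern. It only imitates the idea with the standard configparser module; the helper names load_with_includes and demo are hypothetical, and Supervisor's own parser may differ in details.

```python
import configparser
import glob
import os
import tempfile


def load_with_includes(main_path):
    """Parse main_path plus every file matched by its [include] files setting."""
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read(main_path)
    if parser.has_option("include", "files"):
        base_dir = os.path.dirname(os.path.abspath(main_path))
        for pattern in parser.get("include", "files").split():
            # Relative patterns are resolved against the main file's directory.
            for path in sorted(glob.glob(os.path.join(base_dir, pattern))):
                parser.read(path)
    return parser


def demo():
    """Build a throwaway config tree and return the section names found."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "conf.d"))
        with open(os.path.join(root, "supervisord.conf"), "w") as f:
            f.write("[supervisord]\nnodaemon=false\n\n"
                    "[include]\nfiles = conf.d/*.conf\n")
        with open(os.path.join(root, "conf.d", "echo_time.conf"), "w") as f:
            f.write("[program:echo_time]\ncommand=sh /tmp/echo_time.sh\n")
        main_path = os.path.join(root, "supervisord.conf")
        return load_with_includes(main_path).sections()


if __name__ == "__main__":
    print(demo())
```

Keeping each program in its own file under conf.d means adding or removing a program never touches supervisord.conf itself.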
The usual ways to start it are:
1. If Supervisor was installed with pip or easy_install, simply run:
$ supervisord
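By default, supervisord looks for its configuration file in the following locations: /usr/etc/supervisord.conf, /usr/supervisord.conf, supervisord.conf, etc/supervisord.conf, /etc/supervisord.conf, /etc/supervisor/supervisord.conf
To specify the main configuration file explicitly, use the -c option: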
$ supervisord -c /etc/supervisor/supervisord.conf
Check the installed version:
$ supervisord -v
3.3.4
Then check Supervisor's status:
$ supervisorctl status
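Note: supervisord is the main daemon process, and supervisorctl is the client tool that sends commands to the daemon.
2. If Supervisor was installed with apt-get, it can be started directly with:
$ /etc/init.d/supervisor start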
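The lookup order described above (an explicit -c path wins, otherwise the first existing file in the search list is used) can be sketched in Python 3 as follows. The list is copied from this article rather than from Supervisor's source, and find_config is a hypothetical helper, not part of Supervisor.

```python
import os
import tempfile

# Search locations as listed in this article; relative entries are
# resolved against the current working directory.
SEARCH_PATH = [
    "/usr/etc/supervisord.conf",
    "/usr/supervisord.conf",
    "supervisord.conf",
    "etc/supervisord.conf",
    "/etc/supervisord.conf",
    "/etc/supervisor/supervisord.conf",
]


def find_config(candidates=SEARCH_PATH, explicit=None):
    """Return the explicit path if given, else the first candidate that exists."""
    if explicit is not None:
        return explicit  # corresponds to passing -c on the command line
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def demo():
    """Show that the first existing candidate wins."""
    with tempfile.TemporaryDirectory() as root:
        missing = os.path.join(root, "missing.conf")
        present = os.path.join(root, "supervisord.conf")
        open(present, "w").close()
        return find_config([missing, present]) == present


if __name__ == "__main__":
    print(find_config())
```

Passing -c explicitly avoids surprises when several of these files exist on the same machine.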
Usage example
We use a simple script, /tmp/echo_time.sh, as the example:
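#!/bin/bash
# Print a timestamp every two seconds.
# %M is minutes and %S is seconds; lowercase %m and %s would print the month and the Unix epoch seconds.
while true; do
    echo "$(date +%Y-%m-%d,%H:%M:%S)"
    sleep 2
done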
Add a child-process configuration file, echo_time.conf, under /etc/supervisor/conf.d/:
[program:echo_time]
command=sh /tmp/echo_time.sh
priority=999 ; the relative start priority (default 999)
autostart=true ; start at supervisord start (default: true)
autorestart=true ; always restart after exit (default: unexpected)
startsecs=10 ; secs the program must stay up to count as started (default 1)
startretries=3 ; max # of serial start failures (default 3)
exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT ; signal used to kill process (default TERM)
stopwaitsecs=10 ; max num secs to wait before SIGKILL (default 10)
user=root ; setuid to this UNIX account to run the program
; log_stdout, log_stderr, logfile, logfile_maxbytes and logfile_backups are
; Supervisor 2.x option names; 3.x uses the stdout_*/stderr_* options instead.
stdout_logfile=/tmp/echo_time.stdout.log ; stdout log path
stdout_logfile_maxbytes=20MB ; max stdout log size before rotation (default 50MB)
stdout_logfile_backups=20 ; number of rotated stdout log backups (default 10)
Then start the program:
$ supervisorctl reread
$ supervisorctl update
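The first command re-reads the configuration files; the second applies the changes to the process groups. After update runs, the output is: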
echo_time: added process group
The echo_time script we just added is now running as a long-lived process. You can follow it through the log:
$ tail -f /tmp/echo_time.stdout.log
2018-12-22,14:12:1545459550
2018-12-22,14:12:1545459552
2018-12-22,14:12:1545459554
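(The sample log lines above were captured with the format string %H:%m:%s, so the two fields after the hour are the month and the Unix epoch seconds rather than minutes and seconds.)
You can also use supervisorctl status to check the child process: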
$ supervisorctl status
echo_time RUNNING pid 28206, uptime 0:00:11
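If you want to consume this output from a script, a status line can be split into fields with a few lines of Python 3. This is a minimal sketch based only on the line format shown above; parse_status_line is a hypothetical helper, and lines for states other than RUNNING carry a different detail text that is returned unparsed.

```python
def parse_status_line(line):
    """Split one `supervisorctl status` line into name, state, and detail."""
    parts = line.split(None, 2)
    name, state = parts[0], parts[1]
    detail = parts[2] if len(parts) > 2 else ""
    info = {"name": name, "state": state, "detail": detail}
    if state == "RUNNING" and detail.startswith("pid "):
        # Detail looks like "pid 28206, uptime 0:00:11".
        pid_text, _, uptime_text = detail.partition(",")
        info["pid"] = int(pid_text.split()[1])
        info["uptime"] = uptime_text.replace("uptime", "").strip()
    return info


if __name__ == "__main__":
    print(parse_status_line("echo_time RUNNING pid 28206, uptime 0:00:11"))
```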
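Configuration files
Main configuration
Main configuration file name: supervisord.conf
It can be generated by running echo_supervisord_conf. In most cases this file does not need to change: apart from the [include] section at the end, keep the defaults.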
[unix_http_server]
file=/tmp/supervisor.sock ; the path to the socket file
;chmod=0700 ; socket file mode (default 0700)
;chown=nobody:nogroup ; socket file uid:gid owner
;username=user ; default is no username (open server)
;password=123 ; default is no password (open server)
;[inet_http_server] ; configure the web interface
;port=127.0.0.1:9001 ; ip_address:port specifier; use *:port to listen on all interfaces
;username=user ; default is no username (open server)
;password=123 ; default is no password (open server)
[supervisord]
logfile=/tmp/supervisord.log ; main log file; default $CWD/supervisord.log
logfile_maxbytes=50MB ; max main logfile bytes b4 rotation; default 50MB
logfile_backups=10 ; # of main logfile backups; 0 means none, default 10
loglevel=info ; log level; default info; others: debug,warn,trace
pidfile=/tmp/supervisord.pid ; supervisord pid file
nodaemon=false ; run in the foreground if true; default false (background)
minfds=1024 ; min. avail startup file descriptors; default 1024
minprocs=200 ; min. avail process descriptors;default 200
;umask=022 ; process file creation umask; default 022
;user=chrism ; default is current user, required if root
;identifier=supervisor ; supervisord identifier, default is 'supervisor'
;directory=/tmp ; default is not to cd during start
;nocleanup=true ; don't clean up tempfiles at start; default false
;childlogdir=/tmp ; 'AUTO' child log dir, default $TEMP
;environment=KEY="value" ; key value pairs to add to environment
;strip_ansi=false ; strip ansi escape codes in logs; def. false
; The rpcinterface:supervisor section must remain in the config file for
; RPC (supervisorctl/web interface) to work. Additional interfaces may be
; added by defining them in separate [rpcinterface:x] sections.
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
; Configure supervisorctl.
; Its settings must match those in either the unix_http_server
; or inet_http_server section.
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris ; should be same as in [*_http_server] if set
;password=123 ; should be same as in [*_http_server] if set
;prompt=mysupervisor ; cmd line prompt (default "supervisor")
;history_file=~/.sc_history ; use readline history if available
; Below is a sample child-process (program) section
; Create one or more 'real' program: sections to be able to control them under
; supervisor.
;[program:theprogramname]
;command=/bin/cat ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=999 ; the relative start priority (default 999)
;autostart=true ; start at supervisord start (default: true)
;startsecs=1 ; # of secs prog must stay up to be running (def. 1)
;startretries=3 ; max # of serial start failures when starting (default 3)
;autorestart=unexpected ; when to restart if exited after running (def: unexpected)
;exitcodes=0,2 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false ; send stop signal to the UNIX process group (default false)
;killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=chrism ; setuid to this UNIX account to run the program
;redirect_stderr=true ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10 ; # of stdout logfile backups (0 means none, default 10)
;stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
;stdout_events_enabled=false ; emit events on stdout writes (default false)
;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10 ; # of stderr logfile backups (0 means none, default 10)
;stderr_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
;stderr_events_enabled=false ; emit events on stderr writes (default false)
;environment=A="1",B="2" ; process environment additions (def no adds)
;serverurl=AUTO ; override serverurl computation (childutils)
; The sample eventlistener section below shows all possible eventlistener
; subsection values. Create one or more 'real' eventlistener: sections to be
; able to handle event notifications sent by supervisord.
;[eventlistener:theeventlistenername]
;command=/bin/eventlistener ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1 ; number of processes copies to start (def 1)
;events=EVENT ; event notif. types to subscribe to (req'd)
;buffer_size=10 ; event buffer queue size (default 10)
;directory=/tmp ; directory to cwd to before exec (def no cwd)
;umask=022 ; umask for process (default None)
;priority=-1 ; the relative start priority (default -1)
;autostart=true ; start at supervisord start (default: true)
;startsecs=1 ; # of secs prog must stay up to be running (def. 1)
;startretries=3 ; max # of serial start failures when starting (default 3)
;autorestart=unexpected ; autorestart if exited after running (def: unexpected)
;exitcodes=0,2 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT ; signal used to kill process (default TERM)
;stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false ; send stop signal to the UNIX process group (default false)
;killasgroup=false ; SIGKILL the UNIX process group (def false)
;user=chrism ; setuid to this UNIX account to run the program
;redirect_stderr=false ; redirect_stderr=true is not allowed for eventlisteners
;stdout_logfile=/a/path ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10 ; # of stdout logfile backups (0 means none, default 10)
;stdout_events_enabled=false ; emit events on stdout writes (default false)
;stderr_logfile=/a/path ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10 ; # of stderr logfile backups (0 means none, default 10)
;stderr_events_enabled=false ; emit events on stderr writes (default false)
;environment=A="1",B="2" ; process environment additions
;serverurl=AUTO ; override serverurl computation (childutils)
; The sample group section below shows all possible group values. Create one
; or more 'real' group: sections to create "heterogeneous" process groups.
;[group:thegroupname]
;programs=progname1,progname2 ; each refers to 'x' in [program:x] definitions
;priority=999 ; the relative start priority (default 999)
; Configure include files
; The [include] section can just contain the "files" setting. This
; setting can list multiple files (separated by whitespace or
; newlines). It can also contain wildcards. The filenames are
; interpreted as relative to this file. Included files *cannot*
; include files themselves.
[include]
; both .ini and .conf files are supported
files = relative/directory/*.ini
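The commented [program:theprogramname] block in the sample above doubles as a list of option names. The Python 3 sketch below checks a program section against that list and reports keys that do not appear in it. The set is copied from this sample only, so it may be incomplete for other Supervisor versions; unknown_program_options is a hypothetical helper, not a Supervisor API.

```python
import configparser

# Option names taken from the commented [program:theprogramname] sample above.
KNOWN_PROGRAM_OPTIONS = {
    "command", "process_name", "numprocs", "directory", "umask", "priority",
    "autostart", "startsecs", "startretries", "autorestart", "exitcodes",
    "stopsignal", "stopwaitsecs", "stopasgroup", "killasgroup", "user",
    "redirect_stderr", "stdout_logfile", "stdout_logfile_maxbytes",
    "stdout_logfile_backups", "stdout_capture_maxbytes",
    "stdout_events_enabled", "stderr_logfile", "stderr_logfile_maxbytes",
    "stderr_logfile_backups", "stderr_capture_maxbytes",
    "stderr_events_enabled", "environment", "serverurl",
}

SAMPLE = """\
[program:echo_time]
command=sh /tmp/echo_time.sh
log_stdout=true ; old-style option name
stdout_logfile=/tmp/echo_time.stdout.log
"""


def unknown_program_options(config_text):
    """Map each [program:x] section to its option names missing from the sample."""
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(config_text)
    report = {}
    for section in parser.sections():
        if section.startswith("program:"):
            extras = sorted(set(parser.options(section)) - KNOWN_PROGRAM_OPTIONS)
            if extras:
                report[section] = extras
    return report


if __name__ == "__main__":
    print(unknown_program_options(SAMPLE))
```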
Child-process configuration files
They usually go in the /etc/supervisor/conf.d/ directory, one configuration file per script.
Option reference:
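; * marks required options
; *[program:application name]
[program:cat]
; *Command to run. For a program started with python, use e.g. python /home/test.py.
; Avoid placing the program under /home/user/: accounts other than user usually cannot access it.
command=/bin/cat
; When numprocs is 1, process_name=%(program_name)s is enough;
; when numprocs >= 2, use %(program_name)s_%(process_num)02d
process_name=%(program_name)s
; Number of processes
numprocs=1
; Working directory. For /home/supervisor_test/test1.py,
; setting directory to /home/supervisor_test
; lets command be just python test1.py;
; otherwise command must use the absolute path
directory=/tmp
; umask 022 clears the bits --- -w- -w-, so new files get at most rwx r-x r-x
umask=022
; Priority: higher values start last and are shut down first; default 999
priority=999
; If true, the program starts automatically when supervisord starts
autostart=true
; *Automatic restart (true: always; unexpected: only when the exit code is not in exitcodes; false: never)
autorestart=true
; Seconds the program must stay running after launch to be considered started; default 1
startsecs=10
; Number of start attempts; default 3
startretries=3
; Expected exit codes; with autorestart=unexpected, any other exit code triggers a restart. Default 0,2
exitcodes=0,2
; Stop signal, default TERM
; Interrupt: INT (like Ctrl+C) (kill -INT pid); gives the program a chance to write out files or logs before exiting (recommended)
; Terminate: TERM (kill -TERM pid)
; Hang up: HUP (kill -HUP pid); note this differs from Ctrl+Z / kill -STOP pid
; Graceful quit: QUIT (kill -QUIT pid)
; KILL, USR1, USR2; for other signals see kill -l
stopsignal=TERM
; Seconds to wait after the stop signal before sending SIGKILL
stopwaitsecs=10
; *Run as the root user
user=root
; Log redirection: redirect_stderr=true sends stderr to the stdout log
redirect_stderr=false
stdout_logfile=/a/path
stdout_logfile_maxbytes=1MB
stdout_logfile_backups=10
stdout_capture_maxbytes=1MB
stderr_logfile=/a/path
stderr_logfile_maxbytes=1MB
stderr_logfile_backups=10
stderr_capture_maxbytes=1MB
; Environment variables
environment=A="1",B="2"
serverurl=AUTO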
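The interaction between autorestart and exitcodes is easy to get backwards, so here it is as a small Python 3 decision function. It is a sketch of the documented rule for a process that exits after it has reached the running state, not Supervisor's actual code; should_restart is a hypothetical name.

```python
def should_restart(autorestart, exit_code, exitcodes=(0, 2)):
    """Decide whether an exited process should be restarted.

    autorestart is the config value as a string: "true", "false",
    or "unexpected".
    """
    if autorestart == "true":
        return True  # restart no matter how the process exited
    if autorestart == "false":
        return False  # never restart automatically
    if autorestart == "unexpected":
        # Restart only when the exit code is NOT one of the expected ones.
        return exit_code not in exitcodes
    raise ValueError("unsupported autorestart value: %r" % (autorestart,))


if __name__ == "__main__":
    for mode in ("true", "false", "unexpected"):
        print(mode, [should_restart(mode, code) for code in (0, 1, 2)])
```

For a long-running worker that should never stop, autorestart=true is the simplest choice; unexpected suits jobs that are allowed to finish cleanly.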
Simplified template
[program:echo_time]
command=sh /tmp/echo_time.sh
autostart=true
autorestart=true
startsecs=10
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
; log_stdout, log_stderr, logfile, logfile_maxbytes and logfile_backups
; (Supervisor 2.x option names) are omitted; 3.x uses stdout_*/stderr_* options.
stdout_logfile=/tmp/echo_time.stdout.log
stdout_logfile_maxbytes=20MB
stdout_logfile_backups=20
If you add the following two lines to the template above:
process_name=%(process_num)02d
numprocs=3
then after running supervisorctl update, three processes are started:
$ supervisorctl status
echo_time:00 RUNNING pid 26762, uptime 2:06:12
echo_time:01 RUNNING pid 26866, uptime 2:06:02
echo_time:02 RUNNING pid 27975, uptime 2:05:52
This is very useful when several consumer processes are needed.
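The names in the status output above follow from expanding the process_name expression once per process number. The Python 3 sketch below reproduces that expansion with ordinary % formatting. It assumes numbering starts at 0 and that status shows group:name when the two differ, as in the output above; expand_process_names is a hypothetical helper.

```python
def expand_process_names(program_name, process_name_expr, numprocs, group_name=None):
    """Return the names `supervisorctl status` would show for one program."""
    group_name = group_name or program_name
    names = []
    for process_num in range(numprocs):
        name = process_name_expr % {
            "program_name": program_name,
            "group_name": group_name,
            "process_num": process_num,
        }
        # A process whose name equals its group is shown without the prefix.
        names.append(name if name == group_name else "%s:%s" % (group_name, name))
    return names


if __name__ == "__main__":
    print(expand_process_names("echo_time", "%(process_num)02d", 3))
```

Each process needs a distinct name, which is why process_name must contain process_num as soon as numprocs is greater than 1.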
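Copyright notice: this is an original article, published on the WeChat official account 飞鸿影的博客 (fhyblog) and on cnblogs (博客园). Reposting requires the author's permission.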
Command-line programs
supervisord
supervisord is the main process.
Run supervisord -h to see the help text. For example:
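-c/--configuration FILENAME ; specify the configuration file
-n/--nodaemon ; run in the foreground (for debugging)
-v/--version ; print version information
-u/--user USER ; run as the given user (or user ID)
-m/--umask UMASK ; umask for child processes (default 022)
-l/--logfile FILENAME ; specify the log file
-e/--loglevel LEVEL ; specify the log level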
supervisorctl
supervisorctl is the client program, used to send commands to supervisord.
Run supervisorctl -h to see the help text. What we mostly care about are its action commands:
$ supervisorctl help
default commands (type help <topic>):
=====================================
add exit open reload restart start tail
avail fg pid remove shutdown status update
clear maintail quit reread signal stop version
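These commands are essential for controlling child processes. For example:
reread ; re-read the configuration files without applying the changes
update ; apply configuration changes: newly added programs join a process group and, if autostart=true, are started
status ; show the status of all processes
status <name> ; show the status of one process
start all ; start all child processes
start <name> ; start one child process
restart all ; restart all child processes
restart <name> ; restart one child process
stop all ; stop all child processes
stop <name> ; stop one child process
reload ; restart supervisord
add <name> ; add a child process to a process group
remove <name> ; remove a child process from its process group; it must be stopped first. Note: after removal, reread and update are needed before the process can run again
supervisord has the concept of a process group: a child process can only run once it belongs to a process group.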
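Conceptually, reread compares the configuration on disk with what supervisord currently has loaded, and update then acts on the difference. The Python 3 sketch below shows that comparison on plain dictionaries. It only illustrates the idea; diff_groups and the sample settings are hypothetical and do not reflect Supervisor's internal data structures.

```python
def diff_groups(loaded, on_disk):
    """Compare {group name: settings} mappings.

    Returns (added, changed, removed) as sorted lists of group names.
    """
    added = sorted(set(on_disk) - set(loaded))
    removed = sorted(set(loaded) - set(on_disk))
    changed = sorted(
        name for name in set(loaded) & set(on_disk)
        if loaded[name] != on_disk[name]
    )
    return added, changed, removed


if __name__ == "__main__":
    loaded = {"echo_time": {"command": "sh /tmp/echo_time.sh", "numprocs": "1"}}
    on_disk = {
        "echo_time": {"command": "sh /tmp/echo_time.sh", "numprocs": "3"},
        "worker": {"command": "php cli.php"},
    }
    print(diff_groups(loaded, on_disk))
```

Groups in the added list are the ones update would add and start (subject to autostart); removed groups are stopped and dropped.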
supervisorctl also supports an interactive shell:
$ supervisorctl
echo_time RUNNING pid 27188, uptime 0:05:09
supervisor> version
3.3.4
supervisor>
Web interface
Uncomment these four lines in supervisord.conf:
[inet_http_server] ; inet (TCP) server disabled by default
port=127.0.0.1:9001 ; ip_address:port specifier, *:port for all iface
username=user ; default is no username (open server)
password=123 ; default is no password (open server)
By default it listens on 127.0.0.1:9001. For easier testing here, change it to:
port=*:9001
Then restart the main supervisord process:
$ supervisorctl reload
Open http://myip:9001 in a browser and enter the username and password to see the web page.
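Note: if the [inet_http_server] line itself is left commented out while the lines below it are uncommented, the username and password lines fall under the preceding [unix_http_server] section. As a result, authentication is required not only for the web interface but also for supervisorctl on the command line, and you have to enter the username and password in the interactive shell before doing anything else.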
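The same HTTP server also exposes Supervisor's XML-RPC interface at the /RPC2 path, so it can be queried from a script. The Python 3 sketch below uses the standard xmlrpc.client module; the host, port, username, and password are the example values from this article, and rpc_url, connect, and summarize are hypothetical helpers. No request is sent until a method is actually called on the proxy.

```python
import xmlrpc.client


def rpc_url(host, port, username=None, password=None):
    """Build the XML-RPC endpoint URL, with optional basic-auth credentials."""
    auth = ""
    if username is not None:
        # Credentials with special characters would need URL quoting.
        auth = "%s:%s@" % (username, password or "")
    return "http://%s%s:%d/RPC2" % (auth, host, port)


def connect(host="127.0.0.1", port=9001, username="user", password="123"):
    """Return a proxy for the server; the connection is made on first call."""
    return xmlrpc.client.ServerProxy(rpc_url(host, port, username, password))


def summarize(process_info_list):
    """Reduce supervisor.getAllProcessInfo() results to (name, statename) pairs."""
    return [(info["name"], info["statename"]) for info in process_info_list]


# Usage against a running supervisord with [inet_http_server] enabled:
#   server = connect()
#   print(server.supervisor.getState())
#   print(summarize(server.supervisor.getAllProcessInfo()))
```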
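Other issues
1. No pip in a CentOS 6 Docker environment
Solution: install the EPEL repository first.
EPEL (http://fedoraproject.org/wiki/EPEL) is a project created by the Fedora community that provides high-quality packages for RHEL and derivatives such as CentOS and Scientific Linux.
First install the EPEL repository: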
$ yum -y install epel-release
Then install pip:
$ yum -y install python-pip
Check the version:
$ pip -V
pip 7.1.0 from /usr/lib/python2.6/site-packages (python 2.6)
Reference: https://www.cnblogs.com/saolv/p/6963314.html
2. On CentOS 6, Supervisor's commands fail right after installation. For example, running:
$ echo_supervisord_conf
produces:
Traceback (most recent call last):
File "/usr/bin/echo_supervisord_conf", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/pkg_resources.py", line 2603, in <module>
File "/usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/pkg_resources.py", line 666, in require
File "/usr/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/pkg_resources.py", line 565, in resolve
pkg_resources.DistributionNotFound: meld3>=0.6.5
Cause: the meld3 package installed by pip is unusable. Install it manually:
wget https://pypi.python.org/packages/source/m/meld3/meld3-1.0.2.tar.gz
tar -zxf meld3-1.0.2.tar.gz
cd meld3-1.0.2
python setup.py install
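Thanks to https://www.cnblogs.com/hubery/p/5653007.html for this method! This problem bothered me for a long time. None of the solutions on the GitHub issue page (https://github.com/Supervisor/meld3/issues/23) worked, and they even left pip unusable.
3. pip became unusable after following the meld3 solutions from GitHub: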
Traceback (most recent call last):
File "/usr/bin/pip", line 5, in <module>
from pkg_resources import load_entry_point
File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 957, in <module>
class Environment:
File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 961, in Environment
self, search_path=None, platform=get_supported_platform(),
File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 188, in get_supported_platform
plat = get_build_platform()
File "/usr/lib/python2.6/site-packages/pkg_resources/__init__.py", line 391, in get_build_platform
from sysconfig import get_platform
ImportError: No module named sysconfig
Solution: remove the pkg_resources packages under site-packages, then reinstall python-setuptools:
rm -rf /usr/lib/python2.6/site-packages/pkg_resources*
yum reinstall python-setuptools
Reference: https://stackoverflow.com/questions/50742538/importerror-no-module-named-sysconfig-cant-get-pip-working
4. Running the supervisord command fails with:
Traceback (most recent call last):
File "/usr/local/bin/supervisord", line 6, in <module>
from pkg_resources import load_entry_point
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3019, in <module>
@_call_aside
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3003, in _call_aside
f(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 3032, in _initialize_master_working_set
working_set = WorkingSet._build_master()
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 655, in _build_master
ws.require(__requires__)
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 963, in require
needed = self.resolve(parse_requirements(requirements))
File "/usr/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 849, in resolve
raise DistributionNotFound(req, requirers)
pkg_resources.DistributionNotFound: The 'supervisor==3.2.0' distribution was not found and is required by the application
Solution:
1. Find where supervisord is:
whereis supervisord
supervisord: /usr/local/bin/supervisord
2. Run
rm -rf /usr/local/bin/supervisord
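The cause is that the supervisor package was uninstalled, but the supervisord script was left behind.
Alternatively, reinstall the version named in the error message: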
pip install supervisor==3.2.0
Details: https://hooklife.me/linux/Supervisor%E5%9C%A8deepin%E5%AE%89%E8%A3%85%E3%80%81%E5%8D%B8%E8%BD%BD%E4%B8%8E%E4%BD%BF%E7%94%A8/
References
1. Supervisor: A Process Control System - Supervisor 3.3.4 documentation
http://supervisord.org/index.html
2. A brief introduction to the process manager supervisor - jyzhou - cnblogs
https://www.cnblogs.com/zhoujinyi/p/6073705.html
3. DevOps
https://mp.weixin.qq.com/s/mqrkAEaGFKJy-4PQadJJ9w?
4. Using supervisor - 回首郑板桥 - cnblogs
https://www.cnblogs.com/hubery/p/5653007.html