http://zhangweide.cn/archive/2013/supervisor-note.html
Supervisord
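Supervisord put an end to this headache of mine: it can watch over any process for you. Of course, if the supervisord process itself dies, everything is over. In practice, it has run very well in production for more than three months, and I have not seen a single process go down.
Installing Supervisord on CentOS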
# yum install python-setuptools
# easy_install supervisor
Create the configuration file
# echo_supervisord_conf > /etc/supervisord.conf
Edit the configuration file
# vi /etc/supervisord.conf
Append the following at the end:
[program:chat]
command=python /data0/htdocs/chat/main.py
priority=1
numprocs=1
autostart=true
autorestart=true
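Configuration notes:
command the command to run
priority the start priority (lower values start first and shut down last)
numprocs how many processes to start
autostart whether the program is started automatically when supervisor starts
autorestart restart the program automatically when it exits; be sure to enable this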
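As a minimal sketch of how the options above fit together, the following Python snippet renders a `[program:x]` section with the standard `configparser` module. The function name `render_program_section` and its defaults are assumptions made for illustration; they simply mirror the sample values in this post and are not part of Supervisord itself.

```python
import configparser
import io


def render_program_section(name, command, priority=1, numprocs=1,
                           autostart=True, autorestart=True):
    """Return the text of a [program:<name>] section for supervisord.conf."""
    parser = configparser.RawConfigParser()
    section = "program:" + name
    parser.add_section(section)
    parser.set(section, "command", command)
    parser.set(section, "priority", str(priority))
    parser.set(section, "numprocs", str(numprocs))
    # Booleans are written as the lowercase strings used in the sample config.
    parser.set(section, "autostart", "true" if autostart else "false")
    parser.set(section, "autorestart", "true" if autorestart else "false")
    buf = io.StringIO()
    # Write key=value without spaces, matching the style used in this post.
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_program_section("chat", "python /data0/htdocs/chat/main.py"))
```

The printed text can be appended to /etc/supervisord.conf by hand; the sketch only builds the section and does not touch the file.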
Start Supervisord
# supervisord
View the help
[root@localhost core]# supervisord --help
supervisord -- run a set of applications as daemons.
Usage: /usr/bin/supervisord [options]
Options:
-c/--configuration FILENAME -- configuration file
-n/--nodaemon -- run in the foreground (same as 'nodaemon true' in config file)
-h/--help -- print this usage message and exit
-v/--version -- print supervisord version number and exit
-u/--user USER -- run supervisord as this user (or numeric uid)
-m/--umask UMASK -- use this umask for daemon subprocess (default is 022)
-d/--directory DIRECTORY -- directory to chdir to when daemonized
-l/--logfile FILENAME -- use FILENAME as logfile path
-y/--logfile_maxbytes BYTES -- use BYTES to limit the max size of logfile
-z/--logfile_backups NUM -- number of backups to keep when max bytes reached
-e/--loglevel LEVEL -- use LEVEL as log level (debug,info,warn,error,critical)
-j/--pidfile FILENAME -- write a pid file for the daemon process to FILENAME
-i/--identifier STR -- identifier used for this instance of supervisord
-q/--childlogdir DIRECTORY -- the log directory for child process logs
-k/--nocleanup -- prevent the process from performing cleanup (removal of
                  old automatic child log files) at startup.
-a/--minfds NUM -- the minimum number of file descriptors for start success
-t/--strip_ansi -- strip ansi escape codes from process output
--minprocs NUM -- the minimum number of processes available for start success
--profile_options OPTIONS -- run supervisord under profiler and output
                             results based on OPTIONS, which is a comma-sep'd
                             list of 'cumulative', 'calls', and/or 'callers',
                             e.g. 'cumulative,callers')
Specify the configuration file at startup
# supervisord -c /etc/supervisord.conf
Enter ctl mode
# supervisorctl
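Basic commands in ctl
help show help for the commands
status show process status
stop XXX stop a process
start XXX start a process
restart XXX restart a process
reload load the latest configuration file, stop the existing processes, then start and manage all processes according to the new configuration
update apply the latest configuration file: start processes whose configuration is new or has changed; processes whose configuration is unchanged are not affected and are not restarted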
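The `status` command is the one most likely to be scripted, so here is a small Python sketch that turns its text output into a dictionary and lists the programs that are not running. The function names and the `SAMPLE` text are hypothetical; the sample is only shaped like typical `supervisorctl status` output (name, state, then details) and was not captured from a real run.

```python
def parse_status(text):
    """Parse `supervisorctl status`-style text into {program_name: state}."""
    states = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        name, state = parts[0], parts[1]
        states[name] = state
    return states


def not_running(text):
    """Return the sorted names of programs whose state is not RUNNING."""
    return sorted(n for n, s in parse_status(text).items() if s != "RUNNING")


# Hypothetical sample, shaped like typical `supervisorctl status` output.
SAMPLE = """\
chat                             RUNNING   pid 14456, uptime 0:03:12
nginx                            FATAL     Exited too quickly (process log may have details)
"""

if __name__ == "__main__":
    print(not_running(SAMPLE))
```

In a real script the text would presumably come from running `supervisorctl status` (for example through `subprocess.run` with captured output) rather than from a hard-coded string.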
Testing
Here I use supervising the nginx process as a demonstration. First, add the following to /etc/supervisord.conf:
[program:nginx]
command=/usr/local/nginx/sbin/nginx
priority=1
numprocs=1
autostart=true
autorestart=true
Then start supervisord:
[root@localhost core]# supervisord -c /etc/supervisord.conf
[root@localhost core]# ps -le | grep supervisord
1 S 0 14035 1 0 80 0 - 48722 poll_s ? 00:00:00 supervisord
Check the nginx processes:
[root@localhost core]# ps -le | grep nginx
1 S 0 14037 1 0 80 0 - 56260 rt_sig ? 00:00:00 nginx
5 S 99 14038 14037 0 80 0 - 56363 ep_pol ? 00:00:00 nginx
5 S 99 14039 14037 0 80 0 - 56300 ep_pol ? 00:00:00 nginx
5 S 99 14040 14037 0 80 0 - 56300 ep_pol ? 00:00:00 nginx
Kill the nginx process:
[root@localhost core]# kill -9 14037
Then check the nginx processes again:
[root@localhost core]# ps -le | grep nginx
5 S 99 14038 1 0 80 0 - 56363 ep_pol ? 00:00:00 nginx
5 S 99 14039 1 0 80 0 - 56300 ep_pol ? 00:00:00 nginx
4 S 0 14456 14035 0 80 0 - 56259 hrtime ? 00:00:00 nginx
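nginx is back from the dead: supervisord (PID 14035) has started a new nginx process with PID 14456 to replace the killed 14037, while the old workers 14038 and 14039 have been re-parented to PID 1. Done!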
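To make the restart behaviour in this test easier to picture, here is a toy Python loop that starts a child process, waits for it to exit, and starts it again. It is only a sketch of the general idea behind autorestart, not how Supervisord is actually implemented; the function `supervise`, its parameters, and the instantly exiting child are all assumptions for illustration.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, delay=0.1):
    """Run argv, and each time it exits start it again, max_restarts times.

    Returns the list of PIDs that were started, in order.
    """
    pids = []
    for _ in range(max_restarts + 1):
        child = subprocess.Popen(argv)
        pids.append(child.pid)
        child.wait()        # blocks until the direct child exits
        time.sleep(delay)   # short pause before the next start
    return pids


if __name__ == "__main__":
    # Hypothetical child that exits at once, standing in for a crashed program.
    crashing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(crashing, max_restarts=2))
```

A loop like this only ever sees its direct child, which is why programs run under a supervisor are normally kept in the foreground instead of daemonizing themselves.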
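Web management
Supervisord lets you manage processes and view their status through a web interface; this has to be enabled in the configuration file.
Find the [inet_http_server] section and change it to: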
[inet_http_server] ; inet (TCP) server disabled by default
port=*:9001 ; (ip_address:port specifier, *:port for all iface)
username=admin ; (default is no username (open server))
password=123 ; (default is no password (open server))
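Pay attention to the port field: *:9001 listens on all interfaces, so any IP can reach it; to bind to a single IP, write xx.xx.xx.xx:9001 instead. If you have iptables enabled, remember to add a rule that allows the port specified in port.
Then save the configuration and restart supervisord.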
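Once the HTTP server is enabled, the same port also answers XML-RPC requests at the /RPC2 path, provided the `[rpcinterface:supervisor]` section from the generated configuration file is left in place. The following Python sketch queries the process list that way. The helper names `rpc_url` and `list_process_states` are assumptions made for illustration; `supervisor.getAllProcessInfo` is the XML-RPC method being called.

```python
import xmlrpc.client
from urllib.parse import quote


def rpc_url(host, port, username, password):
    """Build the XML-RPC endpoint URL with HTTP basic-auth credentials."""
    return "http://%s:%s@%s:%d/RPC2" % (
        quote(username, safe=""), quote(password, safe=""), host, port)


def list_process_states(url):
    """Ask supervisord for all processes and return {name: statename}."""
    server = xmlrpc.client.ServerProxy(url)
    return {info["name"]: info["statename"]
            for info in server.supervisor.getAllProcessInfo()}
```

With the sample credentials from this post, a call might look like `list_process_states(rpc_url("127.0.0.1", 9001, "admin", "123"))`; the host is an assumption, and the call needs a running supervisord to answer it.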
https://github.com/mlazarov/supervisord-monitor