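Supervisor Tutorial

Overview

(1) Written in Python and easy to install.
(2) A process management tool: it makes it easy to start, stop, and restart user-defined processes, and it automatically restarts processes that exit unexpectedly. It needs only a small amount of configuration and includes a web interface where status and logs are easy to read.
(3) Components: supervisord [the server; Supervisor is started through it]
                supervisorctl [the client; used to run commands such as stop]
(4) Official documentation: http://supervisord.org/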
    

Installation

Supervisor is installed like any other third-party Python package, so the details are not covered here.

pip install supervisor

Configuration & Usage

Viewing the default configuration

Run

    echo_supervisord_conf

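This prints the default configuration. In general, do not treat that output as something to edit in place; redirect it to a file instead, and give each managed process its own configuration file that adds to or overrides the defaults.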

    echo_supervisord_conf > /etc/supervisord.conf

Default configuration explained

[unix_http_server]
;file=/tmp/supervisor.sock   ; (the path to the socket file)
;Recommended: move this to /var/run so the system does not delete it when cleaning /tmp
file=/var/run/supervisor.sock   ; (the path to the socket file)
;chmod=0700                 ; socket file mode (default 0700)
;chown=nobody:nogroup       ; socket file uid:gid owner
;username=user              ; (default is no username (open server))
;password=123               ; (default is no password (open server))

;[inet_http_server]         ; inet (TCP) server disabled by default
;port=127.0.0.1:9001        ; (ip_address:port specifier, *:port for all iface)
;username=user              ; (default is no username (open server))
;password=123               ; (default is no password (open server))
...

[supervisord]
;logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
;Recommended: move this to /var/log so the system does not delete it when cleaning /tmp
logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB        ; log file size limit (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10           ; number of rotated log files to keep (num of main logfile rotation backups;default 10)
loglevel=info                ; (log level;default info; others: debug,warn,trace)
;pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
;Recommended: move this to /var/run so the system does not delete it when cleaning /tmp
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
;User that supervisord runs as. Avoid starting it as root unless you are sure that is what you want.
;user=chrism                 ; (default is current user, required if root)
nodaemon=false               ; (start in foreground if true;default false)
minfds=1024                  ; (min. avail startup file descriptors;default 1024)
minprocs=200                 ; (min. avail process descriptors;default 200)
;umask=022                   ; (process file creation umask;default 022)
;identifier=supervisor       ; (supervisord identifier, default is 'supervisor')
;directory=/tmp              ; (default is not to cd during start)
;nocleanup=true              ; (don't clean up tempfiles at start;default false)
;childlogdir=/tmp            ; ('AUTO' child log dir, default $TEMP)
;environment=KEY="value"     ; (key value pairs to add to environment)
;strip_ansi=false            ; (strip ansi escape codes in logs; def. false)

[supervisorctl]
; Must match the settings in 'unix_http_server'
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL  for a unix socket
;Recommended: move this to /var/run so the system does not delete it when cleaning /tmp
serverurl=unix:///var/run/supervisor.sock ; use a unix:// URL  for a unix socket
;serverurl=http://127.0.0.1:9001 ; use an http:// url to specify an inet socket
;username=chris              ; should be same as http_username if set
;password=123                ; should be same as http_password if set

;[program:theprogramname]
;command=/bin/cat              ; the program (relative uses PATH, can take args)
;process_name=%(program_name)s ; process_name expr (default %(program_name)s)
;numprocs=1                    ; number of processes copies to start (def 1)
;directory=/tmp                ; directory to cwd to before exec (def no cwd)
;umask=022                     ; umask for process (default None)
;priority=999                  ; the relative start priority (default 999)
;autostart=true                ; start at supervisord start (default: true)
;startsecs=1                   ; # of secs prog must stay up to be running (def. 1)
;startretries=3                ; max # of serial start failures when starting (default 3)
;autorestart=unexpected        ; when to restart if exited after running (def: unexpected)
;exitcodes=0,2                 ; 'expected' exit codes used with autorestart (default 0,2)
;stopsignal=QUIT               ; signal used to kill process (default TERM)
;stopwaitsecs=10               ; max num secs to wait b4 SIGKILL (default 10)
;stopasgroup=false             ; send stop signal to the UNIX process group (default false)
;killasgroup=false             ; SIGKILL the UNIX process group (def false)
;user=chrism                   ; setuid to this UNIX account to run the program
;redirect_stderr=true          ; redirect proc stderr to stdout (default false)
;stdout_logfile=/a/path        ; stdout log path, NONE for none; default AUTO
;stdout_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stdout_logfile_backups=10     ; # of stdout logfile backups (default 10)
;stdout_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
;stdout_events_enabled=false   ; emit events on stdout writes (default false)
;stderr_logfile=/a/path        ; stderr log path, NONE for none; default AUTO
;stderr_logfile_maxbytes=1MB   ; max # logfile bytes b4 rotation (default 50MB)
;stderr_logfile_backups=10     ; # of stderr logfile backups (default 10)
;stderr_capture_maxbytes=1MB   ; number of bytes in 'capturemode' (default 0)
;stderr_events_enabled=false   ; emit events on stderr writes (default false)
;environment=A="1",B="2"       ; process environment additions (def no adds)
;serverurl=AUTO                ; override serverurl computation (childutils)

;[group:thegroupname]
;programs=progname1,progname2  ; each refers to 'x' in [program:x] definitions
;priority=999                  ; the relative start priority (default 999)

[include]
files = /etc/supervisor/*.conf

The configuration file is well commented and straightforward, so it is not described further here.
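One setting that is easy to get wrong is the pairing noted in the comments above: the serverurl in [supervisorctl] has to point at the socket file declared in [unix_http_server]. The following is a minimal sketch of that check. It uses Python's standard configparser rather than Supervisor's own parser (an assumption that holds for simple files like this one), and the helper name socket_settings_match is hypothetical.

```python
import configparser

# A trimmed sample using the recommended /var/run paths from the listing above.
SAMPLE_CONF = """
[unix_http_server]
file=/var/run/supervisor.sock   ; (the path to the socket file)

[supervisorctl]
serverurl=unix:///var/run/supervisor.sock ; use a unix:// URL  for a unix socket
"""


def socket_settings_match(conf_text):
    """Return True if [supervisorctl] serverurl points at the [unix_http_server] socket."""
    # ';' starts an inline comment when preceded by whitespace, as in supervisord.conf.
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",), strict=False)
    parser.read_string(conf_text)
    socket_path = parser.get("unix_http_server", "file")
    server_url = parser.get("supervisorctl", "serverurl")
    return server_url == "unix://" + socket_path


print(socket_settings_match(SAMPLE_CONF))
```

If the two values drift apart (for example, one still points at /tmp), supervisorctl will not be able to reach the running supervisord.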

Starting the server

Now let's start the supervisor service.

supervisord -c /etc/supervisord.conf

Check whether supervisord is running:

ps aux|grep supervisor
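The same check can be scripted by reading the pidfile configured earlier (/var/run/supervisord.pid) and probing that PID. This is only a sketch for POSIX systems; the function name pid_is_running is hypothetical and not part of Supervisor.

```python
import os


def pid_is_running(pidfile):
    """Return True if the PID stored in pidfile belongs to a live process."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed pidfile: treat as not running.
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks that the PID exists
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

For example, pid_is_running("/var/run/supervisord.pid") should return True while the daemon is up. A stale pidfile left behind after a crash gives False, unless the operating system has already reused that PID for another process.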

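Project configuration and running

With supervisord running, we can now add configuration files for the processes we want to manage. All of the settings could go into supervisord.conf, but that is not recommended. Instead, use include to keep each program (or group) in its own configuration file.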

[include]
files = /etc/supervisor/*.conf

Below is the configuration file for azkaban (tested by the author):

/etc/supervisor/azkaban.conf

[program:azkaban]
command=java -Xmx4G -Dlog4j.configuration=file:bin/internal/../../conf/log4j.properties -Dlog4j.log.dir=bin/internal/../../logs -server -Dcom.sun.management.jmxremote -Djava.io.tmpdir=/tmp -Dexecutorport=12321 -Dserverpath=/opt/azkaban/azkaban-web-server -Djava.library.path=/usr/bin/hadoop/lib/native/Linux-amd64-64 -cp :bin/internal/../../lib/activation-1.1.jar:bin/internal/../../lib/animal-sniffer-annotations-1.14.jar:bin/internal/../../lib/antlr-2.7.2.jar:bin/internal/../../lib/aopalliance-1.0.jar:bin/internal/../../lib/az-core-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/az-flow-trigger-dependency-plugin-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/az-hdfs-viewer-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/az-jobsummary-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/azkaban-common-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/azkaban-db-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/azkaban-spi-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/azkaban-web-server-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/az-reportal-3.62.0-7-g4f2f631.jar:bin/internal/../../lib/c3p0-0.9.1.1.jar:bin/internal/../../lib/cglib-nodep-2.2.jar:bin/internal/../../lib/checker-compat-qual-2.0.0.jar:bin/internal/../../lib/commons-beanutils-1.7.0.jar:bin/internal/../../lib/commons-chain-1.1.jar:bin/internal/../../lib/commons-codec-1.9.jar:bin/internal/../../lib/commons-collections-3.2.2.jar:bin/internal/../../lib/commons-compress-1.2.jar:bin/internal/../../lib/commons-dbcp2-2.1.1.jar:bin/internal/../../lib/commons-dbutils-1.5.jar:bin/internal/../../lib/commons-digester-1.8.jar:bin/internal/../../lib/commons-fileupload-1.2.1.jar:bin/internal/../../lib/commons-io-2.4.jar:bin/internal/../../lib/commons-jexl-2.1.1.jar:bin/internal/../../lib/commons-lang-2.6.jar:bin/internal/../../lib/commons-lang3-3.4.jar:bin/internal/../../lib/commons-logging-1.2.jar:bin/internal/../../lib/commons-math3-3.0.jar:bin/internal/../../lib/commons-pool2-2.4.2.jar:bin/internal/../../lib/commons-validator-1.3.1.jar:bin/internal/../../lib/data-1.15.7.jar:bin/internal/../../lib/data-transform-1.15.7.jar:bin/internal/../../lib/dom4j-1.1.jar:bin/internal/../../lib/error_prone_annotations-2.1.3.jar:bin/internal/../../lib/gson-2.8.1.jar:bin/internal/../../lib/guava-23.6-jre.jar:bin/internal/../../lib/guice-4.1.0.jar:bin/internal/../../lib/httpclient-4.5.3.jar:bin/internal/../../lib/httpcore-4.4.6.jar:bin/internal/../../lib/j2objc-annotations-1.1.jar:bin/internal/../../lib/jackson-annotations-2.8.2.jar:bin/internal/../../lib/jackson-core-2.8.2.jar:bin/internal/../../lib/jackson-core-asl-1.9.5.jar:bin/internal/../../lib/jackson-databind-2.8.2.jar:bin/internal/../../lib/jackson-mapper-asl-1.9.5.jar:bin/internal/../../lib/javax.inject-1.jar:bin/internal/../../lib/javax.servlet-api-3.0.1.jar:bin/internal/../../lib/jcl-over-slf4j-1.7.6.jar:bin/internal/../../lib/jetty-6.1.26.jar:bin/internal/../../lib/jetty-util-6.1.26.jar:bin/internal/../../lib/joda-time-2.10.jar:bin/internal/../../lib/jopt-simple-4.3.jar:bin/internal/../../lib/json-20070829.jar:bin/internal/../../lib/jsr305-1.3.9.jar:bin/internal/../../lib/li-jersey-uri-1.15.7.jar:bin/internal/../../lib/log4j-1.2.17.jar:bin/internal/../../lib/logback-classic-1.1.4.jar:bin/internal/../../lib/logback-core-1.1.4.jar:bin/internal/../../lib/logging-interceptor-3.3.0.jar:bin/internal/../../lib/mail-1.4.5.jar:bin/internal/../../lib/metrics-core-3.1.0.jar:bin/internal/../../lib/metrics-jvm-3.1.0.jar:bin/internal/../../lib/mina-core-1.1.7.jar:bin/internal/../../lib/mysql-connector-java-5.1.28.jar:bin/internal/../../lib/netty-3.2.3.Final.jar:bin/internal/../../lib/okhttp-3.3.0.jar:bin/internal/../../lib/okhttp-apache-3.3.0.jar:bin/internal/../../lib/okio-1.8.0.jar:bin/internal/../../lib/oro-2.0.8.jar:bin/internal/../../lib/parseq-1.3.6.jar:bin/internal/../../lib/pegasus-common-1.15.7.jar:bin/internal/../../lib/quartz-2.2.1.jar:bin/internal/../../lib/r2-1.15.7.jar:bin/internal/../../lib/restli-common-1.15.7.jar:bin/internal/../../lib/restli-server-1.15.7.jar:bin/internal/../../lib/retrofit-2.0.2.jar:bin/internal/../../lib/servlet-api-2.5-20081211.jar:bin/internal/../../lib/slf4j-api-1.7.18.jar:bin/internal/../../lib/slf4j-log4j12-1.7.18.jar:bin/internal/../../lib/snakeyaml-1.18.jar:bin/internal/../../lib/snappy-0.3.jar:bin/internal/../../lib/sslext-1.2-0.jar:bin/internal/../../lib/struts-core-1.3.8.jar:bin/internal/../../lib/struts-taglib-1.3.8.jar:bin/internal/../../lib/struts-tiles-1.3.8.jar:bin/internal/../../lib/velocity-1.7.jar:bin/internal/../../lib/velocity-tools-2.0.jar:bin/internal/../../extlib/*.jar:bin/internal/../../plugins/*/*.jar:/usr/bin/hadoop/conf:/usr/bin/hadoop/* azkaban.webapp.AzkabanWebServer -conf bin/internal/../../conf
directory=/opt/azkaban/azkaban-web-server
user=root
autostart=true
startsecs=0
autorestart=unexpected
startretries=0
exitcodes=0
numprocs=1
logfile=/opt/azkaban/azkaban-web-server/logs/supervisor/azkaban.log
log_stderr=true
redirect_stderr=true
stdout_logfile=/opt/azkaban/azkaban-web-server/logs/supervisor/azkaban.log

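Once the configuration is in place, start supervisord (if it is already running from the earlier step, load the new program with supervisorctl update instead):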

supervisord -c /etc/supervisord.conf

Checking the running status

$ supervisorctl status

Output:
azkaban         RUNNING   pid 62040, uptime 0:10:09

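Open a browser and go to 127.0.0.1:9001 (this requires the [inet_http_server] section to be enabled). Enter the username and password if they are set under inet_http_server in the configuration file, and you will see the following interface: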

[Image: Supervisor web interface]
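The HTTP server behind the web page also exposes Supervisor's XML-RPC interface at /RPC2, so the same status information can be read from a script. The sketch below assumes [inet_http_server] is listening on 127.0.0.1:9001 as in the sample configuration. The helper names format_process and fetch_status are hypothetical; supervisor.getAllProcessInfo is the XML-RPC method described in the official documentation.

```python
from xmlrpc.client import ServerProxy


def format_process(info):
    """Render one process-info dict roughly the way `supervisorctl status` does."""
    return "%-15s %-9s %s" % (info["name"], info["statename"], info["description"])


def fetch_status(url="http://127.0.0.1:9001/RPC2"):
    """Ask a running supervisord for all processes and return one line per process.

    If a username and password are configured, they can be embedded in the URL,
    e.g. http://user:123@127.0.0.1:9001/RPC2.
    """
    server = ServerProxy(url)
    return [format_process(p) for p in server.supervisor.getAllProcessInfo()]
```

Calling fetch_status() only works while supervisord is running with the inet server enabled; otherwise the call raises a connection error.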

 

Using supervisorctl

After the service has started, run:

supervisorctl -c /etc/supervisord.conf

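Output:
azkaban         RUNNING   pid 62040, uptime 0:10:09

If this succeeds, you enter the supervisorctl shell, which provides the following commands:

status    # show program status
stop azkaban   # stop the azkaban program
start azkaban  # start the azkaban program
restart azkaban    # restart the azkaban program
reread    # re-read the configuration files to pick up changes, without starting newly added programs
update    # apply configuration changes: start newly added programs and restart those whose configuration changed

After running a command, the change is visible in the web interface. For example, to stop a program: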

stop azkaban

You can also skip the supervisorctl shell and run the commands directly from a bash terminal:

$ supervisorctl status
$ supervisorctl stop azkaban
$ supervisorctl start azkaban
$ supervisorctl restart azkaban
$ supervisorctl reread
$ supervisorctl update 
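Because these one-shot commands write plain text to standard output, they are easy to drive from a script. The following sketch runs supervisorctl status through subprocess and turns the output into a name-to-state mapping. The function names parse_status and current_states are hypothetical, and the parsing assumes the column layout shown in the sample output above.

```python
import subprocess


def parse_status(output):
    """Map process name -> state from the text printed by `supervisorctl status`."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            states[fields[0]] = fields[1]
    return states


def current_states(conf="/etc/supervisord.conf"):
    """Run `supervisorctl status` against the given config and parse its output."""
    result = subprocess.run(
        ["supervisorctl", "-c", conf, "status"],
        capture_output=True,
        text=True,
    )
    return parse_status(result.stdout)


sample = "azkaban         RUNNING   pid 62040, uptime 0:10:09"
print(parse_status(sample))  # {'azkaban': 'RUNNING'}
```

A monitoring script could call current_states() periodically and alert when a program is not in the RUNNING state.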

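Managing multiple processes

According to the official documentation, a [program:x] section actually represents a group of processes that share the same characteristics, which means one [program:x] can start several processes. The members of the group are determined by the numprocs and process_name parameters. The following example shows what that means.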

; The program name; use it when managing these processes with supervisorctl
[program:foo]

; Python string expressions in command can pass different arguments to each process
command=python server.py --port=90%(process_num)02d

directory=/home/python/tornado_server ; change to this working directory before running command

; If numprocs is not 1, the process_name expression must include process_num to tell the processes apart
numprocs=2
process_name=%(program_name)s_%(process_num)02d

user=oxygen                 ; run the process as user oxygen

autorestart=true  ; restart the program automatically whenever it exits

redirect_stderr=true      ; redirect stderr to the stdout log

stdout_logfile = /var/log/supervisord/tornado_server.log
loglevel=info

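The example above starts two processes. Because process_num starts at 0 by default (numprocs_start=0), their process_name values are foo:foo_00 and foo:foo_01; setting numprocs_start=1 would give foo:foo_01 and foo:foo_02. In this way a single [program:x] section can start a group of very similar processes.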
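The expansion of process_name and command is ordinary Python %-formatting applied once per process number. The sketch below mimics that expansion for the foo example. It is an illustration rather than Supervisor's actual code, and the function name expand_program is hypothetical. Note that process_num starts at numprocs_start, which defaults to 0 in Supervisor, so the default names end in 00 and 01; passing numprocs_start=1 yields 01 and 02.

```python
def expand_program(program_name, numprocs, process_name, command, numprocs_start=0):
    """Return (process_name, command) pairs, one per process in the program group."""
    procs = []
    for process_num in range(numprocs_start, numprocs_start + numprocs):
        values = {"program_name": program_name, "process_num": process_num}
        # Both settings are expanded with the same %-style substitution.
        procs.append((process_name % values, command % values))
    return procs


for name, cmd in expand_program(
    "foo",
    2,
    "%(program_name)s_%(process_num)02d",
    "python server.py --port=90%(process_num)02d",
):
    print(name, "->", cmd)
# foo_00 -> python server.py --port=9000
# foo_01 -> python server.py --port=9001
```

Each process therefore gets its own port, and supervisorctl addresses them as foo:foo_00 and foo:foo_01.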

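Supervisor also offers a second way to manage process groups, which lets a single supervisorctl command act on a set of processes. Unlike the process group formed by one [program:x], the members here are separate [program:x] sections.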

[group:thegroupname]
programs=progname1,progname2  ; each refers to 'x' in [program:x] definitions
priority=999                  ; the relative start priority (default 999)

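Once this configuration is added, the process names of progname1 and progname2 become thegroupname:progname1 and thegroupname:progname2. From then on, use these names to manage the processes instead of the earlier progname1.

After that, supervisorctl stop thegroupname: stops both progname1 and progname2, while supervisorctl stop thegroupname:progname1 stops only progname1.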
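The naming rule can be summarized in a few lines of code. This is a toy model of how a group:name target selects processes, written only to illustrate the rule described above. The function resolve_target and the groups mapping are hypothetical and not part of Supervisor.

```python
def resolve_target(target, groups):
    """Return the fully qualified process names selected by a 'group:name' target.

    groups maps a group name to the list of program names it contains.
    """
    group, _, name = target.partition(":")
    members = groups.get(group, [])
    if name == "":
        # 'thegroupname:' with nothing after the colon selects the whole group.
        return [f"{group}:{member}" for member in members]
    if name in members:
        return [f"{group}:{name}"]
    return []


groups = {"thegroupname": ["progname1", "progname2"]}
print(resolve_target("thegroupname:", groups))
print(resolve_target("thegroupname:progname1", groups))
```

In this model the bare name progname1 selects nothing, which mirrors the point that grouped programs must be addressed through their group name.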

Starting Supervisor at boot

On Linux distributions that run the script /etc/rc.local at boot, adding the start command there is enough.


# On Ubuntu, add the following
/usr/local/bin/supervisord -c /etc/supervisord.conf

# On CentOS, add the following
/usr/bin/supervisord -c /etc/supervisord.conf

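Notes

If the command in a project configuration file fails when written in the usual way, start the target program manually first.

Once it is running, run:

ps aux | grep targetProcess

Then copy the launch command shown in the output into the command setting.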
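On Linux the full launch command can also be read from /proc/<pid>/cmdline, which avoids ps truncating very long command lines such as the Java classpath in the azkaban example. The following is a sketch under that assumption. The function names cmdline_to_command and command_of are hypothetical, and shlex.join requires Python 3.8 or later.

```python
import shlex


def cmdline_to_command(raw):
    """Convert the NUL-separated bytes of /proc/<pid>/cmdline to a shell-quoted string."""
    parts = raw.split(b"\x00")
    if parts and parts[-1] == b"":
        parts.pop()  # the kernel terminates the last argument with a NUL
    return shlex.join(p.decode() for p in parts)


def command_of(pid):
    """Read the launch command of a running process (Linux only)."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        return cmdline_to_command(f.read())
```

The returned string can be pasted after command= in the [program:x] section. Arguments that contain spaces come back quoted, so they still parse as one argument.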
