Cloud Foundry Study (1): BOSH and Monit

We usually deploy Cloud Foundry with BOSH. The bosh vms command shows the state of each node, as shown below:

[Figure 1: output of bosh vms]


This view shows the state of every node at a glance (running, failing, and so on), and all of this information is obtained through Monit.


What is Monit?

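Monit is a tool for monitoring Unix-like systems (such as Linux, BSD, OS X, and Solaris). It is very easy to install and lightweight (about 500 KB), and it does not depend on any third-party programs, plugins, or libraries. Even so, Monit covers system-wide monitoring, process status monitoring, file system change monitoring, email notification, and custom actions for core services. Easy installation, a small footprint, and a strong feature set make Monit an ideal fallback monitoring tool. Monit also includes an embedded HTTP(S) web interface, so the services it watches can be viewed in a browser.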


Monit in BOSH

Log in to any Cloud Foundry node and you will find the executable /var/vcap/bosh/bin/monit. Run monit -h to see what monit can do:
root@ubuntu:/var/vcap/bosh/bin# ./monit -h
Usage: monit [options] {arguments}
Options are as follows:
 -c file       Use this control file
 -d n          Run as a daemon once per n seconds
 -g name       Set group name for start, stop, restart, monitor and unmonitor
 -l logfile    Print log information to this file
 -p pidfile    Use this lock file in daemon mode
 -s statefile  Set the file monit should write state information to
 -I            Do not run in background (needed for run from init)
 -t            Run syntax check for the control file
 -v            Verbose mode, work noisy (diagnostic output)
 -H [filename] Print SHA1 and MD5 hashes of the file or of stdin if the
               filename is omited; monit will exit afterwards
 -V            Print version number and patchlevel
 -h            Print this text
Optional action arguments for non-daemon mode are as follows:
 start all           - Start all services
 start name          - Only start the named service
 stop all            - Stop all services
 stop name           - Only stop the named service
 restart all         - Stop and start all services
 restart name        - Only restart the named service
 monitor all         - Enable monitoring of all services
 monitor name        - Only enable monitoring of the named service
 unmonitor all       - Disable monitoring of all services
 unmonitor name      - Only disable monitoring of the named service
 reload              - Reinitialize monit
 status              - Print full status information for each service
 summary             - Print short status information for each service
 quit                - Kill monit daemon process
 validate            - Check all services and start if not running
 procmatch <pattern> - Test process matching pattern

(Action arguments operate on services defined in the control file)

Monit can do more than monitor services: it can also start, stop, and restart them (start, stop, restart, ...), which makes it quite powerful.
First, monitoring. Run: monit summary
root@ubuntu:/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 7m

Process 'nats'                      running
Process 'nats_stream_forwarder'     running
Process 'etcd'                      running
Process 'hm9000_listener'           running
Process 'hm9000_fetcher'            running
Process 'hm9000_analyzer'           running
Process 'hm9000_sender'             running
Process 'hm9000_metrics_server'     running
Process 'hm9000_api_server'         running
Process 'hm9000_evacuator'          running
Process 'hm9000_shredder'           running
Process 'cloud_controller_ng'       running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc'                  running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_clock'    running
Process 'uaa'                       running
Process 'consul_template'           running
File 'haproxy_config'               accessible
Process 'haproxy'                   running
Process 'gorouter'                  running
Process 'warden'                    running
Process 'dea_next'                  running
Process 'dir_server'                running
Process 'loggregator_trafficcontroller' running
Process 'doppler'                   running
Process 'metron_agent'              running
Process 'dea_logging_agent'         running
Process 'etcd_metrics_server'       running
Process 'consul_agent'              running
Process 'route_registrar'           running
Process 'postgres'                  running
System 'system_ubuntu'              running
The output above lists the state of every monitored Cloud Foundry component.
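When this output has to be checked by a script rather than by eye, it can be turned into a small lookup table. The following is a minimal sketch, not part of BOSH or Monit: the function names parse_monit_summary and unhealthy are made up for illustration, and the line format is assumed to match what Monit 5.2.4 prints above (other versions may differ).

```python
import re

# Matches lines such as:  Process 'nats'        running
# (format assumed from the Monit 5.2.4 output shown in this article)
SUMMARY_LINE = re.compile(r"^(?P<kind>\w+) '(?P<name>[^']+)'\s+(?P<status>.+?)\s*$")


def parse_monit_summary(text):
    """Return {service name: (kind, status)} parsed from `monit summary` output."""
    services = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            services[match.group("name")] = (match.group("kind"), match.group("status"))
    return services


def unhealthy(services):
    """Return the sorted names whose status is neither 'running' nor 'accessible'."""
    healthy = {"running", "accessible"}
    return sorted(name for name, (_, status) in services.items() if status not in healthy)


# Hand-written sample in the same format as the article's output.
sample = """The Monit daemon 5.2.4 uptime: 1h 11m

Process 'nats'                      not monitored - restart pending
Process 'nats_stream_forwarder'     not monitored
Process 'etcd'                      running
File 'haproxy_config'               accessible
System 'system_ubuntu'              running
"""

if __name__ == "__main__":
    print(unhealthy(parse_monit_summary(sample)))
```

The header line ("The Monit daemon ...") is skipped because it does not have a quoted service name in second position.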

To restart a service, run: monit restart <service name>
root@ubuntu:/var/vcap/bosh/bin# ./monit restart nats
root@ubuntu:/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 11m

Process 'nats'                      not monitored - restart pending
Process 'nats_stream_forwarder'     not monitored
Process 'etcd'                      running
Process 'hm9000_listener'           running
Process 'hm9000_fetcher'            running
Process 'hm9000_analyzer'           running
Process 'hm9000_sender'             running
Process 'hm9000_metrics_server'     running
Process 'hm9000_api_server'         running
Process 'hm9000_evacuator'          running
Process 'hm9000_shredder'           running
Process 'cloud_controller_ng'       running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc'                  running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_clock'    running
Process 'uaa'                       running
Process 'consul_template'           running
File 'haproxy_config'               accessible
Process 'haproxy'                   running
Process 'gorouter'                  running
Process 'warden'                    running
Process 'dea_next'                  running
Process 'dir_server'                running
Process 'loggregator_trafficcontroller' running
Process 'doppler'                   running
Process 'metron_agent'              running
Process 'dea_logging_agent'         running
Process 'etcd_metrics_server'       running
Process 'consul_agent'              running
Process 'route_registrar'           running
Process 'postgres'                  running
System 'system_ubuntu'              running
root@ubuntu:/var/vcap/bosh/bin#

After a short while, the nats service is running again.
root@ubuntu:/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 11m

Process 'nats'                      running
Process 'nats_stream_forwarder'     running

In short, monit offers many more commands; they are not all listed here.
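The restart shown earlier takes a while to complete, so a script that restarts a service usually has to poll monit summary until the service reports running again. Below is a rough sketch of such a loop. The helper names (service_status, run_summary, wait_until_running) and the timeout values are assumptions for illustration; only the monit path and the summary format come from the article, and it is assumed that monit writes the summary to standard output.

```python
import subprocess
import time

MONIT = "/var/vcap/bosh/bin/monit"  # path taken from the article


def service_status(name, summary_text):
    """Return the status string for `name` in `monit summary` output, or None."""
    marker = "'%s'" % name
    for line in summary_text.splitlines():
        parts = line.split(None, 2)  # kind, quoted name, status
        if len(parts) == 3 and parts[1] == marker:
            return parts[2].strip()
    return None


def run_summary():
    """Run `monit summary` on the node and return its output (needs root)."""
    result = subprocess.run([MONIT, "summary"], capture_output=True, text=True, check=True)
    return result.stdout


def wait_until_running(name, get_summary=run_summary, timeout=120, interval=5, sleep=time.sleep):
    """Poll until `name` reports 'running'. Return True on success, False on timeout."""
    waited = 0
    while True:
        if service_status(name, get_summary()) == "running":
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

On a node this could be used as wait_until_running("nats") right after ./monit restart nats. The get_summary and sleep parameters exist only so the loop can be exercised without a real monit daemon.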

Customizing Monit

In BOSH, the monit configuration file monitrc is stored in /var/vcap/bosh/etc. Its content looks like this:
root@ubuntu:/var/vcap/bosh/etc# cat monitrc
set daemon 10
set logfile /var/vcap/monit/monit.log
set httpd port 2822 and use address 10.0.0.112
  allow cleartext /var/vcap/monit/monit.user

include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc

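set daemon 10 sets the check interval to 10 seconds.
allow cleartext /var/vcap/monit/monit.user names the file that holds the login user name and password.
set httpd port 2822 and use address 10.0.0.112 sets the address and port of the web server (opening the web page for a more direct view of the monitoring data is covered later).
include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc
These two statements pull in other configuration files; wildcards are allowed.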
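To see which settings and which included files are actually in effect on a node, the directives explained above can be read back with a few regular expressions. This is only an illustrative sketch: parse_monitrc and expand_includes are hypothetical helpers, they understand just the directives used in this article, and they are not a full parser for Monit's control file syntax.

```python
import glob
import re


def parse_monitrc(text):
    """Extract the poll interval, httpd settings and include patterns from monitrc text."""
    config = {"daemon": None, "port": None, "address": None, "includes": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        daemon = re.match(r"set\s+daemon\s+(\d+)", line)
        httpd = re.match(r"set\s+httpd\s+port\s+(\d+)(?:\s+and\s+use\s+address\s+(\S+))?", line)
        include = re.match(r"include\s+(\S+)", line)
        if daemon:
            config["daemon"] = int(daemon.group(1))
        elif httpd:
            config["port"] = int(httpd.group(1))
            config["address"] = httpd.group(2)
        elif include:
            config["includes"].append(include.group(1))
    return config


def expand_includes(patterns):
    """Return the files currently matched by each include pattern."""
    files = []
    for pattern in patterns:
        files.extend(sorted(glob.glob(pattern)))
    return files


# The monitrc content shown in the article.
MONITRC = """set daemon 10
set logfile /var/vcap/monit/monit.log
set httpd port 2822 and use address 10.0.0.112
  allow cleartext /var/vcap/monit/monit.user

include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc
"""

if __name__ == "__main__":
    config = parse_monitrc(MONITRC)
    print(config)
    print(expand_includes(config["includes"]))
```

The files are sorted here only to make the listing stable; no claim is made about the order in which Monit itself loads them.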

Opening the Web Page

1. In the monitrc file above, set:
set httpd port 2822 and use address 10.0.0.112
Here 2822 is a port number of your choice and 10.0.0.112 is the IP address of this machine.

2. Update the firewall
iptables -A INPUT -p tcp --dport 2822 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 2822 -j ACCEPT
 
  

To make the firewall settings persistent:

Edit /etc/network/interfaces and add the following:

pre-up iptables-restore < /etc/iptables.rules
post-down iptables-restore < /etc/iptables.downrules


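The edited file looks similar to the following. Note that the post-down line reads /etc/iptables.downrules, a file this article does not create; either create it or leave that line out.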

auto eth0
iface eth0 inet dhcp
  pre-up iptables-restore < /etc/iptables.rules
  post-down iptables-restore < /etc/iptables.downrules


Then save the current rules to /etc/iptables.rules, the file that the pre-up line loads:

sudo sh -c "iptables-save -c > /etc/iptables.rules" 
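A quick way to confirm that the firewall now lets traffic through is a plain TCP connection attempt from another machine. The sketch below assumes Monit is already listening on the address and port from monitrc; port_is_open is a hypothetical helper name, and a successful connect only shows that the port is reachable, not that the login works.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port taken from the article's monitrc.
    print(port_is_open("10.0.0.112", 2822))
```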





3. Reload the monit configuration
monit reload

Open http://10.0.0.112:2822 in a browser and log in with the user name and password from /var/vcap/monit/monit.user (the path defined in monitrc); the Monit status page is then displayed.
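The same page can also be fetched from a script with HTTP Basic authentication. The following is a sketch under a few assumptions: the credentials file is taken to contain a line of the form username:password (the format Monit expects for allow cleartext), and the helper names read_credentials, build_request, and fetch_status_page are made up for this example.

```python
import base64
import urllib.request


def read_credentials(path="/var/vcap/monit/monit.user"):
    """Return (user, password) from the first 'username:password' line of the file."""
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and ":" in line:
                user, password = line.split(":", 1)
                return user, password
    raise ValueError("no username:password line found in %s" % path)


def build_request(url, user, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


def fetch_status_page(url="http://10.0.0.112:2822/"):
    """Fetch the Monit web page and return (HTTP status code, body text)."""
    user, password = read_credentials()
    with urllib.request.urlopen(build_request(url, user, password), timeout=10) as response:
        return response.status, response.read().decode("utf-8", "replace")


if __name__ == "__main__":
    status, body = fetch_status_page()
    print(status, len(body))
```

The credentials file is readable only on the node itself, so fetch_status_page is meant to be run there or with the credentials copied by hand.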





