Monitoring keepalived master/backup state and split-brain with Zabbix

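Contents

  • Monitoring keepalived master/backup state and split-brain with Zabbix
    • Environment
      • 1. Write the scripts keepalived uses to track master/backup state
        • Script on the master host
        • Script on the slave host
      • 2. Add the scripts to the keepalived configuration
        • 2.1. Configure keepalived on the master
        • 2.2. Configure keepalived on the backup
        • First test round (does keepalived track the haproxy load balancer?)
      • 3. Monitor keepalived
        • 3.1. Install the agent on the slave host (haproxy2)
        • 3.2. Write the script the agent runs (keep scripts in one dedicated directory, here /scripts, not under a user's home directory, to avoid permission problems later)
        • 3.3. Add the host
        • 3.4. Create the item
        • 3.5. Create the trigger
        • 3.6. Second test round
        • 3.7. Email notification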

Environment

Server    IP address       Services / stack           OS
zabbix    192.168.195.130  LAMP stack, zabbix_server  CentOS 8
haproxy1  192.168.195.133  keepalived, haproxy        CentOS 8
haproxy2  192.168.195.134  keepalived, haproxy        CentOS 8
web1      192.168.195.135  http                       CentOS 8
web2      192.168.195.136  http                       CentOS 8

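Note: the steps below do not repeat the details of deploying Zabbix, defining custom items, or configuring haproxy load balancing. Those are covered in the following posts:

Deploying the Zabbix monitoring service (CSDN blog)

Custom monitoring with Zabbix (碳烤小肥杨…'s blog, CSDN)

haproxy load balancing (CSDN blog)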

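1. Write the scripts keepalived uses to track master/backup state

keepalived uses these scripts to track the state of the load balancer.

Script on the master host
//This script checks whether a haproxy process exists on the master host. If none exists, haproxy has failed and can no longer serve traffic, so the script stops keepalived as soon as the haproxy process count drops below 1, which makes this host release the VIP.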

[root@haproxy1 ~]# mkdir /scripts && cd /scripts
[root@haproxy1 scripts]# vim check_haproxy.sh
[root@haproxy1 scripts]# cat check_haproxy.sh
#!/bin/bash

haproxy_status=$(ps -ef|grep -Ev "grep|$0"|grep '\bhaproxy\b'|wc -l)
if [ $haproxy_status -lt 1 ];then
    systemctl stop keepalived
fi

[root@haproxy1 scripts]# chmod +x check_haproxy.sh
[root@haproxy1 scripts]# ll
total 4
-rwxr-xr-x 1 root root 148 Oct 13 21:21 check_haproxy.sh
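Script on the slave host
//This script acts on the state this host has just entered (master | backup). When the host becomes master, the first branch runs: if the haproxy process count is below 1, it starts haproxy so that load balancing continues. When the host goes back to backup, the second branch runs: if the haproxy process count is above 0, it stops haproxy, so that it cannot conflict with haproxy on the master host and keep traffic from reaching the backend web hosts.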

[root@haproxy2 ~]# mkdir /scripts && cd /scripts
[root@haproxy2 scripts]# vim notify.sh
[root@haproxy2 scripts]# cat notify.sh 
#!/bin/bash

case "$1" in
  master)
        haproxy_status=$(ps -ef|grep -Ev "grep|$0"|grep '\bhaproxy\b'|wc -l)
        if [ $haproxy_status -lt 1 ];then
            systemctl start haproxy
        fi
  ;;
  backup)
        haproxy_status=$(ps -ef|grep -Ev "grep|$0"|grep '\bhaproxy\b'|wc -l)
        if [ $haproxy_status -gt 0 ];then
            systemctl stop haproxy
        fi
  ;;
  *)
        echo "Usage:$0 master|backup VIP"
  ;;
esac
[root@haproxy2 scripts]# chmod +x notify.sh
[root@haproxy2 scripts]# ls
notify.sh
[root@haproxy2 scripts]# ll
total 4
-rwxr-xr-x 1 root root 461 Oct 13 21:25 notify.sh
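
To see at a glance what notify.sh does in each state, its two branches can be written as a function that only prints the intended action. This is a dry-run sketch, not a replacement for the script: `notify_action` is a hypothetical name, and the haproxy process count is passed in as an argument rather than measured.

```shell
#!/bin/bash
# Sketch only: notify_action is an illustrative helper, not part of notify.sh.

# notify_action STATE COUNT
#   STATE - the state passed by keepalived ("master" or "backup")
#   COUNT - number of running haproxy processes
# Prints the action notify.sh would take; changes nothing on the system.
notify_action() {
    local state="$1" count="$2"
    case "$state" in
        master)
            # Became master: haproxy must be running to serve the VIP.
            if [ "$count" -lt 1 ]; then echo "start haproxy"; else echo "none"; fi
            ;;
        backup)
            # Back to backup: haproxy is stopped again.
            if [ "$count" -gt 0 ]; then echo "stop haproxy"; else echo "none"; fi
            ;;
        *)
            echo "usage"
            return 1
            ;;
    esac
}
```

For example, `notify_action master 0` prints `start haproxy` and `notify_action backup 1` prints `stop haproxy`, which are the two transitions exercised in the tests later in this article.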

2. Add the scripts to the keepalived configuration

2.1. Configure keepalived on the master
[root@haproxy1 ~]# vim /etc/keepalived/keepalived.conf  
[root@haproxy1 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id haproxy1
}

vrrp_script haproxy_check {
    script "/scripts/check_haproxy.sh"
    interval 1
    fall 3
    weight -40
}

vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 80
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678
    }
    virtual_ipaddress {
        192.168.195.100
    }
    track_script {
        haproxy_check
    }
}

virtual_server 192.168.195.100 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP

    real_server 192.168.195.133 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.195.134 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

[root@haproxy1 ~]# systemctl restart keepalived.service
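
The vrrp_script block sets `weight -40`, which only matters when the tracked script reports failure through a non-zero exit status. check_haproxy.sh as written stops keepalived itself and appears to always exit 0, so the failover in this article does not depend on the weight. As a worked example of the arithmetic for the weight-based case, the sketch below computes the effective priority. `effective_priority` is a hypothetical helper, and the numbers are the priorities used in this article (100 on the master, 80 on the backup).

```shell
#!/bin/bash
# Sketch only: effective_priority is an illustrative helper, not a keepalived command.

# effective_priority BASE WEIGHT FAILED
#   BASE   - the priority configured in vrrp_instance
#   WEIGHT - the vrrp_script weight (negative in this article)
#   FAILED - 1 if the tracked script counts as failed, 0 otherwise
# A negative weight is added to the priority only while the script is failed.
effective_priority() {
    local base="$1" weight="$2" failed="$3"
    if [ "$failed" -eq 1 ] && [ "$weight" -lt 0 ]; then
        echo $(( base + weight ))
    else
        echo "$base"
    fi
}

master=$(effective_priority 100 -40 1)   # master whose check has failed
backup=80                                # backup priority used in this article
if [ "$master" -lt "$backup" ]; then
    echo "master=$master backup=$backup -> backup takes over"
else
    echo "master=$master backup=$backup -> master keeps the VIP"
fi
```

With `interval 1` and `fall 3`, the script would have to fail three consecutive runs, roughly three seconds, before it counts as failed and the priority drops from 100 to 60.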
2.2. Configure keepalived on the backup
[root@haproxy2 ~]# vim /etc/keepalived/keepalived.conf 
[root@haproxy2 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id haproxy2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 80
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 12345678
    }
    virtual_ipaddress {
        192.168.195.100
    }
    notify_master "/scripts/notify.sh master"
    notify_backup "/scripts/notify.sh backup"
}

virtual_server 192.168.195.100 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    persistence_timeout 50
    protocol TCP

    real_server 192.168.195.133 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.195.134 80 {
        weight 1
        TCP_CHECK {
            connect_port 80
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

[root@haproxy2 ~]# systemctl restart keepalived.service
First test round (does keepalived track the haproxy load balancer?)

Service state before the test

Master host
//keepalived and haproxy are both running; check the VIP
[root@haproxy1 ~]# systemctl is-active haproxy.service 
active
[root@haproxy1 ~]# systemctl is-active keepalived.service 
active
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100
    inet 192.168.195.100/32 scope global ens160
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
1
[root@haproxy1 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:80                        0.0.0.0:*           
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                          0.0.0.0:8189                      0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*

Slave host
//haproxy is stopped; keepalived stays running
[root@haproxy2 ~]# systemctl is-active haproxy.service 
inactive
[root@haproxy2 ~]# systemctl is-active keepalived.service 
active
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
0
[root@haproxy2 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*

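During the test

Simulate haproxy on the master host (haproxy1) going down under overload

//Once haproxy is stopped, the script tracked in the keepalived configuration sees that the haproxy process is gone and stops keepalived. The VIP is released and moves to the slave host (haproxy2), which becomes the new master.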
[root@haproxy1 ~]# systemctl stop haproxy.service 
[root@haproxy1 ~]# systemctl is-active haproxy.service 
inactive
[root@haproxy1 ~]# systemctl is-active keepalived.service 
inactive
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
0
[root@haproxy1 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*

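//Now check haproxy and the VIP on the slave host (haproxy2). The VIP has moved to this host. Having become the new master, it runs the script configured in keepalived for the master role, which starts haproxy so that load balancing continues.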
[root@haproxy2 ~]# systemctl is-active haproxy.service 
active
[root@haproxy2 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:8189                      0.0.0.0:*           
LISTEN        0             128                          0.0.0.0:80                        0.0.0.0:*           
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*           
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100
    inet 192.168.195.100/32 scope global ens160
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
1

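//After the operators have repaired the original master (haproxy1) and haproxy is running there again, start keepalived on it as well. haproxy1 takes the VIP back and becomes master again, and the slave host gives up the master role.

Master host (haproxy1)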
[root@haproxy1 ~]# systemctl start haproxy.service keepalived.service 
[root@haproxy1 ~]# systemctl is-active haproxy.service 
active
[root@haproxy1 ~]# systemctl is-active keepalived.service 
active
[root@haproxy1 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:80                        0.0.0.0:*           
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                          0.0.0.0:8189                      0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*           
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100
    inet 192.168.195.100/32 scope global ens160
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
1

Slave host (haproxy2)
[root@haproxy2 ~]# systemctl is-active haproxy.service 
inactive
[root@haproxy2 ~]# systemctl is-active keepalived.service 
active
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
0
[root@haproxy2 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*
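
Each step of this test round repeats the same three observations per host: the haproxy state, the keepalived state, and whether the VIP is present. The sketch below folds them into a single verdict. `node_role` is a hypothetical helper; its inputs are the strings printed by `systemctl is-active` and the number printed by the `wc -l` pipeline above.

```shell
#!/bin/bash
# Sketch only: node_role is an illustrative helper, not part of the original setup.

# node_role HAPROXY_STATE KEEPALIVED_STATE VIP_COUNT
# Prints master, backup, out-of-service, or inconsistent.
node_role() {
    local haproxy="$1" keepalived="$2" vip="$3"
    if [ "$keepalived" != "active" ]; then
        # keepalived is stopped, so this host should no longer hold the VIP.
        if [ "$vip" -eq 0 ]; then echo "out-of-service"; else echo "inconsistent"; fi
    elif [ "$vip" -ge 1 ] && [ "$haproxy" = "active" ]; then
        echo "master"
    elif [ "$vip" -eq 0 ] && [ "$haproxy" != "active" ]; then
        echo "backup"
    else
        echo "inconsistent"
    fi
}

# Live use on a host (commented out so the sketch has no side effects):
# node_role "$(systemctl is-active haproxy)" "$(systemctl is-active keepalived)" \
#           "$(ip a show ens160 | grep 192.168.195.100 | wc -l)"
```

Fed the values from the transcripts above, haproxy1 after `systemctl stop haproxy` (`inactive inactive 0`) comes out as `out-of-service`, and haproxy2 at the same moment (`active active 1`) comes out as `master`.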

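3. Monitor keepalived

keepalived is monitored on the backup server, through a Zabbix custom item.
The item reports whether the VIP (192.168.195.100) is present on the backup.

The VIP shows up on the backup in two situations:

  • a split-brain has occurred
  • a normal master/backup switchover has taken place

The item therefore only signals that a split-brain is possible. It cannot confirm one, because a normal switchover also moves the VIP to the backup.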
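
Telling the two cases apart needs one observation the backup cannot make on its own: whether the master still holds the VIP. The following is a sketch of that decision, assuming both values have already been collected (for example by running the same VIP check on each host). `classify_vip_state` is a hypothetical name and is not part of the monitoring set up in this article.

```shell
#!/bin/bash
# Sketch only: classify_vip_state is an illustrative helper.

# classify_vip_state MASTER_HAS_VIP BACKUP_HAS_VIP   (each argument is 0 or 1)
classify_vip_state() {
    local master="$1" backup="$2"
    case "${master}${backup}" in
        10) echo "normal" ;;        # only the master holds the VIP
        01) echo "failover" ;;      # only the backup holds the VIP
        11) echo "split-brain" ;;   # both hosts claim the VIP
        00) echo "no-vip" ;;        # nobody holds the VIP
        *)  echo "unknown"; return 1 ;;
    esac
}
```

Only the `11` case is a split-brain; the `01` case is what the item in this article also reports as 1.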

The monitoring script is as follows:

[root@haproxy2 ~]# cd /scripts/
[root@haproxy2 scripts]# vim check_keepalived.sh
[root@haproxy2 scripts]# chmod +x check_keepalived.sh
[root@haproxy2 scripts]# cat check_keepalived.sh
#!/bin/bash

if [ `ip a show ens160 | grep 192.168.195.100 | wc -l` -ne 0 ]
then
    echo "1"
else
    echo "0"
fi
[root@haproxy2 scripts]# ll
total 8
-rwxr-xr-x 1 root root 115 Oct 14 00:06 check_keepalived.sh
-rwxr-xr-x 1 root root 444 Oct 13 22:01 notify.sh	
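
The `grep 192.168.195.100` in the script matches substrings and treats each dot as "any character". That is harmless for this address, but with a VIP such as 192.168.195.10 the same pattern would also match 192.168.195.100. A stricter variant, as a sketch: it reads the one-line-per-address output of `ip -o -4 addr show` on standard input and compares the address field exactly. `has_vip` is a hypothetical name; the interface and VIP are the ones used in this article.

```shell
#!/bin/bash
# Sketch only: has_vip is an illustrative helper, not the script used in this article.

# has_vip VIP
# Reads `ip -o -4 addr show dev IFACE` output on stdin and prints 1 if VIP
# is configured on the interface, 0 otherwise.
has_vip() {
    local vip="$1"
    # Field 4 of each line is ADDRESS/PREFIX; drop the prefix length and
    # require a fixed-string, whole-line match.
    if awk '{print $4}' | cut -d/ -f1 | grep -Fxq -- "$vip"; then
        echo 1
    else
        echo 0
    fi
}

# Live use on the backup (commented out so the sketch has no side effects):
# ip -o -4 addr show dev ens160 | has_vip 192.168.195.100
```

The output is the same 0/1 value as check_keepalived.sh, so it could feed the same Zabbix item.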
3.1. Install the agent on the slave host (haproxy2)
//Download Zabbix
[root@haproxy2 ~]# wget https://cdn.zabbix.com/zabbix/sources/stable/6.4/zabbix-6.4.6.tar.gz
--2023-10-14 00:47:19--  https://cdn.zabbix.com/zabbix/sources/stable/6.4/zabbix-6.4.6.tar.gz
Resolving cdn.zabbix.com (cdn.zabbix.com)... 172.67.69.4, 104.26.6.148, 104.26.7.148, ...
Connecting to cdn.zabbix.com (cdn.zabbix.com)|172.67.69.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 43744978 (42M) [application/octet-stream]
Saving to: ‘zabbix-6.4.6.tar.gz.1’

zabbix-6.4.6.tar.gz.1       100%[==========================================>]  41.72M   600KB/s    in 2m 0s   

2023-10-14 00:49:20 (356 KB/s) - ‘zabbix-6.4.6.tar.gz.1’ saved [43744978/43744978]

[root@haproxy2 ~]# ls
anaconda-ks.cfg  haproxy-2.7.10  haproxy-2.7.10.tar.gz  zabbix-6.4.6.tar.gz

//Unpack the Zabbix archive and create the zabbix user
[root@haproxy2 ~]# tar xf zabbix-6.4.6.tar.gz -C /usr/local/
[root@haproxy2 ~]# cd /usr/local/ && ls
bin  etc  games  haproxy  include  lib  lib64  libexec  sbin  share  src  zabbix-6.4.6
[root@haproxy2 local]# cd zabbix-6.4.6/
[root@haproxy2 zabbix-6.4.6]# useradd -r -M -s /sbin/nologin zabbix

//Install the packages needed to build from source
[root@haproxy2 zabbix-6.4.6]# yum -y install gcc gcc-c++ make

//Run configure from inside the zabbix-6.4.6 directory
[root@haproxy2 zabbix-6.4.6]# ./configure --enable-agent
(output omitted)
***********************************************************
*            Now run 'make install'                       *
*                                                         *
*            Thank you for using Zabbix!                  *
*              <http://www.zabbix.com>                    *
***********************************************************
This banner at the end means configure succeeded, and you can go straight on to make install.

[root@haproxy2 zabbix-6.4.6]# make install

//Edit the agent configuration file
[root@haproxy2 zabbix-6.4.6]# vim /usr/local/etc/zabbix_agentd.conf
[root@haproxy2 zabbix-6.4.6]# grep -A2 '# ServerActive=' /usr/local/etc/zabbix_agentd.conf
# ServerActive=

ServerActive=192.168.195.130   //set to the Zabbix server's IP
[root@haproxy2 zabbix-6.4.6]# grep -A2 '# Server=' /usr/local/etc/zabbix_agentd.conf
# Server=

Server=192.168.195.130        //set to the Zabbix server's IP
[root@haproxy2 zabbix-6.4.6]# grep -A2 '# Hostname=' /usr/local/etc/zabbix_agentd.conf
# Hostname=

Hostname=note2          //set the host name; it must be unique across all monitored hosts
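
The three edits above were made by hand in vim. For repeatable setups the same changes can be scripted. The following is a sketch that assumes GNU sed (as shipped with CentOS), with `set_conf_key` as a hypothetical helper name. It rewrites an existing uncommented `KEY=` line or appends one, and leaves the commented `# KEY=` documentation lines untouched.

```shell
#!/bin/bash
# Sketch only: set_conf_key is an illustrative helper, not a Zabbix tool.

# set_conf_key FILE KEY VALUE
# Sets KEY=VALUE in FILE. VALUE must not contain "|" or "&", which are
# special in the sed replacement used here.
set_conf_key() {
    local file="$1" key="$2" value="$3"
    if grep -q "^${key}=" "$file"; then
        # An active line exists: rewrite it in place.
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        # No active line yet: append one.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# The edits from this section (commented out so the sketch has no side effects):
# set_conf_key /usr/local/etc/zabbix_agentd.conf Server 192.168.195.130
# set_conf_key /usr/local/etc/zabbix_agentd.conf ServerActive 192.168.195.130
# set_conf_key /usr/local/etc/zabbix_agentd.conf Hostname note2
```

Because the pattern is anchored as `^KEY=`, setting `Server` does not touch the `ServerActive` line.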

//Enable zabbix_agentd at boot: copy the service file already set up on the Zabbix server to the slave host (haproxy2)
[root@zabbix ~]# scp /usr/lib/systemd/system/zabbix_agentd.service root@192.168.195.134:/usr/lib/systemd/system/
root@192.168.195.134's password: 
zabbix_agentd.service                                                               100%  227   211.4KB/s   00:00
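
The content of the copied unit file is not shown in this article. For readers who have no such file to copy, a unit along the following lines is one plausible shape for a source-built agent. Everything in it is an assumption: the binary and configuration paths are the defaults of a build configured without `--prefix`, and the PID file is the agent's default `PidFile` location, so adjust them if your build differs. `write_agent_unit` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: the unit content below is assumed, not taken from the article.

# write_agent_unit PATH
# Writes a minimal systemd unit for a source-built zabbix_agentd to PATH.
write_agent_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Zabbix agent (source install)
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/sbin/zabbix_agentd -c /usr/local/etc/zabbix_agentd.conf
PIDFile=/tmp/zabbix_agentd.pid

[Install]
WantedBy=multi-user.target
EOF
}

# Example (commented out so the sketch has no side effects):
# write_agent_unit /usr/lib/systemd/system/zabbix_agentd.service
```

`Type=forking` is used because the agent daemonizes by default. After writing the file, the `systemctl daemon-reload` and `systemctl enable --now` steps are the same as in the transcript.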

[root@haproxy2 ~]# systemctl daemon-reload 
[root@haproxy2 ~]# systemctl enable --now zabbix_agentd.service 
Created symlink /etc/systemd/system/multi-user.target.wants/zabbix_agentd.service → /usr/lib/systemd/system/zabbix_agentd.service.
[root@haproxy2 ~]# ss -antl
State         Recv-Q        Send-Q                Local Address:Port                  Peer Address:Port        
LISTEN        0             128                         0.0.0.0:10050                      0.0.0.0:*           
LISTEN        0             128                         0.0.0.0:22                         0.0.0.0:*           
LISTEN        0             128                            [::]:22                            [::]:*
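The agent started successfully (port 10050 is listening).
3.2. Write the script the agent runs (keep scripts in one dedicated directory, here /scripts, not under a user's home directory, to avoid permission problems later)
//This script reports whether the VIP is present on this host. If the VIP is on the slave host (haproxy2), haproxy on the master host (haproxy1) may have failed, or a split-brain may have occurred, and the script prints 1. Otherwise it prints 0.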

[root@haproxy2 ~]# cd /scripts/
[root@haproxy2 scripts]# vim check_keepalived.sh 
[root@haproxy2 scripts]# cat check_keepalived.sh
#!/bin/bash

if [ `ip a show ens160 | grep 192.168.195.100 | wc -l` -ne 0 ]
then
    echo "1"
else
    echo "0"
fi
[root@haproxy2 scripts]# chmod +x check_keepalived.sh
[root@haproxy2 scripts]# ./check_keepalived.sh   //0 means the VIP is not on this host
0

//Edit the agent configuration file and define the custom item
[root@haproxy2 scripts]# vim /usr/local/etc/zabbix_agentd.conf
[root@haproxy2 scripts]# tail -1 /usr/local/etc/zabbix_agentd.conf
UserParameter=check_keepalived,/bin/bash /scripts/check_keepalived.sh

//The configuration file changed, so restart the agent to make it reload the file
[root@haproxy2 scripts]# systemctl restart zabbix_agentd.service

//With the custom item defined, check from the server side that the value can be read from the monitored host
[root@client ~]# zabbix_get -s 192.168.195.134 -k check_keepalived
0      //value received

Configuration on the host is complete.

3.3. Add the host

[Figures 1-3: screenshots, images not preserved]

3.4. Create the item

[Figures 4-6: screenshots, images not preserved]

3.5. Create the trigger

[Figures 7-10: screenshots, images not preserved]

3.6. Second test round

Simulate haproxy on the master host (haproxy1) going down under overload
[Figure 11: screenshot, image not preserved]

Master host (haproxy1)
[root@haproxy1 ~]# systemctl stop haproxy.service
[root@haproxy1 ~]# systemctl is-active haproxy.service 
inactive
[root@haproxy1 ~]# systemctl is-active keepalived.service 
inactive
[root@haproxy1 ~]# ss -antl
State         Recv-Q        Send-Q                 Local Address:Port                 Peer Address:Port        
LISTEN        0             128                          0.0.0.0:22                        0.0.0.0:*           
LISTEN        0             128                             [::]:22                           [::]:*           
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100
[root@haproxy1 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
0

Slave host (haproxy2)
[root@haproxy2 ~]# systemctl is-active haproxy.service 
active
[root@haproxy2 ~]# systemctl is-active keepalived.service 
active
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100
    inet 192.168.195.100/32 scope global ens160
[root@haproxy2 ~]# ip a show ens160 | grep 192.168.195.100 | wc -l
1
[root@haproxy2 ~]# ss -antl
State         Recv-Q        Send-Q                Local Address:Port                  Peer Address:Port        
LISTEN        0             128                         0.0.0.0:8189                       0.0.0.0:*           
LISTEN        0             128                         0.0.0.0:10050                      0.0.0.0:*           
LISTEN        0             128                         0.0.0.0:80                         0.0.0.0:*           
LISTEN        0             128                         0.0.0.0:22                         0.0.0.0:*           
LISTEN        0             128                            [::]:22                            [::]:*

[Figure 12: screenshot, image not preserved]

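3.7. Email notification

To have alerts sent by email, follow the detailed steps in this post:

Configuring email alerts in Zabbix (defining media types, configuring actions), 碳烤小肥杨…'s blog, CSDN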

[Figure 13: screenshot, image not preserved]
