18.3/18.4/18.5 Configuring a high-availability cluster with keepalived

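Configuring a high-availability cluster with keepalived

  • Prepare two machines, 130 and 132; 130 is the master and 132 is the backup
  • Run yum install -y keepalived on both machines
  • Install nginx on both machines; nginx has already been compiled and installed on 130, so 132 needs it installed with yum: yum install -y nginx
  • Set the VIP to 100
  • Edit the keepalived configuration file on 130 (contents available from the linked address)
  • Edit the monitoring script on 130 (contents available from the linked address)
  • Give the script 755 permissions
  • Start the service on 130: systemctl start keepalived
  • Edit the configuration file on 132 (contents available from the linked address)
  • Edit the monitoring script on 132 (contents available from the linked address)
  • Give the script 755 permissions
  • Start the service on 132 as well: systemctl start keepalived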

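Configuring the high-availability cluster with keepalived


  1. First prepare two machines and install keepalived on both
  • The keepalived package provides a service of its own, and that service is what implements high availability
Machine A: install keepalived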
[root@hanfeng ~]# yum install -y keepalived

Machine B: install keepalived
[root@hf-01 ~]# yum install -y keepalived
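  1. nginx is used here as the service to make highly available. It is a good demo target because many companies run nginx as a load balancer in production
  • If nginx goes down, none of the backend web servers can be reached, even if they are all healthy
  1. If machines A and B do not have nginx installed, install it directly with yum
  • If nginx was already installed as part of an LNMP setup (built from source), it does not need to be installed again
  • An nginx installed with yum is easy to tell apart from a source build (PS: if yum cannot install it, add the EPEL repository first: yum install -y epel-release)
  • Installing nginx from source
  • Common errors when installing nginx from source
Machine A: install nginx from source
(PS: if the configure step reports that it cannot initialize, some packages are missing: yum install -y gcc)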

Machine B: install nginx with yum
[root@hf-01 ~]# yum install -y epel-release
[root@hf-01 ~]# yum install -y nginx
[root@hf-01 ~]# systemctl start nginx
[root@hf-01 ~]# ps aux |grep nginx
root      2825  0.0  0.2 123300  2100 ?        Ss   11:40   0:00 nginx: master process /usr/sbin/nginx
nginx     2826  0.0  0.3 123764  3120 ?        S    11:40   0:00 nginx: worker process
root      2828  0.0  0.0 112656   992 pts/0    R+   11:40   0:00 grep --color=auto nginx
[root@hf-01 ~]# 
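  1. Edit the keepalived configuration file (contents available from the linked address)
  • The default configuration file is /etc/keepalived/keepalived.conf
  • Shortcut for emptying the file: > !$ (!$ expands to the last argument of the previous command)
Machine A: edit the configuration file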
[root@hanfeng ~]# ls /etc/keepalived/keepalived.conf 
/etc/keepalived/keepalived.conf
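[root@hanfeng ~]# > !$        # empties the file
> /etc/keepalived/keepalived.conf
[root@hanfeng ~]# cat /etc/keepalived/keepalived.conf
[root@hanfeng ~]# vim /etc/keepalived/keepalived.conf   # paste in the contents from the linked address
Paste the copied contents in.
Only the NIC name needs changing, and the floating IP must be set to 192.168.74.100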
####################### #  Global configuration #######################
global_defs {                            # global_defs marks the global configuration block
   notification_email {               # notification_email sets the alert recipients
[email protected]           # several addresses are allowed, one per line
   }
   notification_email_from [email protected]    # sender address for alert mail
   smtp_server 127.0.0.1                # SMTP server address
   smtp_connect_timeout 30            # timeout for connecting to the SMTP server
   router_id LVS_DEVEL
}

###################### #  VRRP configuration ######################
vrrp_script chk_nginx {               
    script "/usr/local/sbin/check_ng.sh"   # script that checks whether the service is healthy
    interval 3                      # run the check every 3 seconds
}
vrrp_instance VI_1 {                        # VRRP instance block; VI_1 is the instance name
    state MASTER                           # this node's initial role
    interface eno16777736                        # NIC used for VRRP advertisements; use your own interface name
    virtual_router_id 51                   # virtual router ID; must match the backup
    priority 100                              # priority; master and backup use different values
    advert_int 1                            # interval in seconds between MASTER/BACKUP sync checks
    authentication {                       # authentication settings
        auth_type PASS                   # password authentication
        auth_pass aminglinux>com   # the password is a plain string
    }
    virtual_ipaddress {                  # virtual IP (VIP), also called the floating IP
        192.168.74.100   # change to 192.168.74.100
    }
    track_script {               # attach the check script
        chk_nginx            
    }
}

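Save and exit.
  • virtual_ipaddress: the VIP for short. With two machines, one master and one backup, the master normally serves traffic; when the master goes down, the backup takes over and starts nginx. Which IP should clients use then, and which IP should the domain resolve to? If the domain resolved to the master's own IP, it would be unreachable once the master is down. So a shared IP is defined that either node can hold; it can be taken down on one node and configured on the other at any time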
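Because the `interface` directive has to name a NIC that actually exists on the machine, it can help to check it before starting the service. The following is a minimal sketch, assuming the config layout shown above; `iface_from_conf` and `iface_exists` are hypothetical helper names, not part of keepalived.

```shell
#!/bin/bash
# Hypothetical helpers (not shipped with keepalived).

# Print the value of the first "interface" directive in a keepalived config.
iface_from_conf() {
    awk '$1 == "interface" { print $2; exit }' "$1"
}

# Succeed if interface $1 appears in the list given as the remaining
# arguments; with no list, fall back to the names in /sys/class/net.
iface_exists() {
    local want=$1; shift
    local candidates=("$@")
    if [ ${#candidates[@]} -eq 0 ]; then
        candidates=($(ls /sys/class/net 2>/dev/null))
    fi
    local name
    for name in "${candidates[@]}"; do
        [ "$name" = "$want" ] && return 0
    done
    return 1
}
```

A possible use on the real machine: `iface_exists "$(iface_from_conf /etc/keepalived/keepalived.conf)" || echo "interface not found"`.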
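  1. Define the monitoring script (contents available from the linked address)
  • The script path is set in the keepalived configuration file: /usr/local/sbin/check_ng.sh
Machine A: define the monitoring script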
[root@hanfeng ~]# vim /usr/local/sbin/check_ng.sh

#!/bin/bash
# Timestamp used in the log entry
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If the count is 0, start nginx and count again;
# if it is still 0, nginx cannot start, so stop keepalived
if [ "$n" -eq 0 ]; then
        /etc/init.d/nginx start
        n2=$(ps -C nginx --no-heading | wc -l)
        if [ "$n2" -eq 0 ]; then
                echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
                systemctl stop keepalived
        fi
fi

Save and exit.
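The flow of the check script (count, try a restart, count again, give up the VIP) can be exercised without touching nginx or keepalived. This is only a dry-run sketch: `check_once` and the `*_stub` functions are hypothetical names introduced for illustration, and the stubs simulate nginx with a variable instead of real processes.

```shell
#!/bin/bash
# Dry-run sketch of the check flow. The three arguments are names of shell
# functions standing in for "count nginx processes", "start nginx" and
# "stop keepalived".
check_once() {
    local count_cmd=$1 start_cmd=$2 stop_cmd=$3
    if [ "$("$count_cmd")" -eq 0 ]; then
        "$start_cmd"
        if [ "$("$count_cmd")" -eq 0 ]; then
            # Restart failed: stop keepalived so the backup takes the VIP.
            "$stop_cmd"
            echo "failover"
        else
            echo "restarted"
        fi
    else
        echo "healthy"
    fi
}

# Hypothetical stubs: NGINX_UP plays the role of the process count.
NGINX_UP=0
count_stub()      { echo "$NGINX_UP"; }
start_ok_stub()   { NGINX_UP=1; }
start_fail_stub() { :; }
stop_stub()       { :; }
```

Calling `check_once count_stub start_fail_stub stop_stub` with `NGINX_UP=0` walks the branch in which nginx cannot be restarted and keepalived is stopped.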
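  • "Split-brain": in a high-availability (HA) system, when the heartbeat link between the two nodes is cut, an HA system that used to act as one coordinated whole splits into two independent nodes. Having lost contact, each assumes the other has failed. The HA software on both nodes then fights over the shared resources and the application service, with serious consequences: either the shared resources are divided up and the service cannot start on either side, or the service starts on both sides and both write to the shared storage at once, corrupting data
  • How do you detect split-brain?
    • Check on each machine whether it currently holds the virtual IP. If both do, split-brain has occurred, which shows that the two machines cannot communicate. The cause is that neither server can detect the state of the other members of the group (heartbeat requests go unanswered), so each decides on its own that the other is down and claims the virtual IP. Split-brain must not be allowed to persist; to fix it, check the firewall settings (or turn the firewall off), or use a serial-line heartbeat.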
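The "does each node hold the VIP" check described above can be scripted. A minimal sketch, assuming the output of `ip addr` from each node has been saved to a file; `holds_vip`, `vip_status`, and the file arguments are hypothetical names for illustration.

```shell
#!/bin/bash
# Succeed if the `ip addr` text on stdin lists address $1.
holds_vip() {
    grep -qF "inet $1/"
}

# Compare saved `ip addr` output from both nodes and report who holds
# the VIP: "ok" (exactly one), "split-brain" (both) or "no-owner".
vip_status() {
    local vip=$1 master_dump=$2 backup_dump=$3 owners=0
    holds_vip "$vip" < "$master_dump" && owners=$((owners + 1))
    holds_vip "$vip" < "$backup_dump" && owners=$((owners + 1))
    case $owners in
        0) echo "no-owner" ;;
        1) echo "ok" ;;
        2) echo "split-brain" ;;
    esac
}
```

For example, after running `ip addr > /tmp/a.txt` on machine A and `ip addr > /tmp/b.txt` on machine B and copying both files to one host, `vip_status 192.168.74.100 /tmp/a.txt /tmp/b.txt` reports the result.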
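  1. After creating the script, change its permissions (without the permission change the script cannot be run automatically, and keepalived will not work properly)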
[root@hanfeng ~]# chmod 755 /usr/local/sbin/check_ng.sh
[root@hanfeng ~]# 
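  1. Start the keepalived service and check that it started (PS: if it fails to start, the firewall may still be running or its rules may be blocking it)
  • systemctl stop firewalld stops firewalld
  • iptables -nvL lists the current rules
  • setenforce 0 temporarily disables SELinux enforcement
  • getenforce should now report Permissive
  • Start keepalived again and its processes should appear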
[root@hanfeng ~]# systemctl start keepalived
[root@hanfeng ~]# ps aux |grep keepalived
root      2970  0.0  0.1 121324  1404 ?        Ss   07:14   0:00 /usr/sbin/keepalived -D
root      2971  0.0  0.2 123396  2356 ?        S    07:14   0:00 /usr/sbin/keepalived -D
root      2972  0.0  0.2 123396  2384 ?        S    07:14   0:00 /usr/sbin/keepalived -D
root      2974  0.0  0.0 112672   988 pts/1    R+   07:14   0:00 grep --color=auto keepalived
[root@hanfeng ~]# 
  1. Check the nginx processes
[root@hanfeng ~]# ps aux |grep nginx
root      3004  0.0  0.2 123372  2108 ?        Ss   07:18   0:00 nginx: master process /usr/sbin/nginx
nginx     3005  0.0  0.3 123836  3148 ?        S    07:18   0:00 nginx: worker process
root      3007  0.0  0.0 112672   984 pts/1    R+   07:19   0:00 grep --color=auto nginx
[root@hanfeng ~]# 
  1. Now stop the nginx service
  • /etc/init.d/nginx stop
[root@hanfeng ~]# /etc/init.d/nginx stop
Stopping nginx (via systemctl):                            [  OK  ]
[root@hanfeng ~]#
  1. Check the nginx processes again; nginx has been started again automatically
[root@hanfeng ~]# ps aux |grep nginx
root      6238  0.0  0.0  20996   628 ?        Ss   08:07   0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody    6242  0.0  0.3  23440  3212 ?        S    08:07   0:00 nginx: worker process
nobody    6243  0.0  0.3  23440  3212 ?        S    08:07   0:00 nginx: worker process
root      6263  0.0  0.0 112676   980 pts/0    R+   08:07   0:00 grep --color=auto nginx
[root@hanfeng ~]# 

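  1. keepalived log file path
  • /var/log/messages
  1. Use ip add to view the IP addresses rather than ifconfig, because ifconfig does not show the VIP (note: the sample output below lists 192.168.133.100, whereas the configuration above sets 192.168.74.100)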
[root@hanfeng ~]# ip add
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno16777736:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:c7:05:28 brd ff:ff:ff:ff:ff:ff
    inet 192.168.74.130/24 brd 192.168.74.255 scope global dynamic eno16777736
       valid_lft 1158sec preferred_lft 1158sec
    inet 192.168.133.100/32 scope global eno16777736
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fec7:528/64 scope link 
       valid_lft forever preferred_lft forever
[root@hanfeng ~]# 
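  1. Check that the firewall and SELinux are turned off on both A and B; if they are not, the experiment may fail
  • systemctl stop firewalld stops firewalld
  • iptables -nvL lists the current rules
  • setenforce 0 temporarily disables SELinux enforcement
  • getenforce should report Permissive

That completes the configuration of the master, machine A.

Backup machine configuration

  1. On machine B, install nginx and keepalived with yum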
[root@hf-01 ~]# yum install -y epel-release
[root@hf-01 ~]# yum install -y nginx

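  1. Turn off the firewall and SELinux on machine B
  • iptables -F flushes the rules
  • setenforce 0 temporarily disables SELinux enforcement
  1. Write machine B's keepalived configuration file (contents available from the linked address), using the same virtual IP as the master
First empty the configuration file that ships with keepalived on machine B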
[root@hf-01 ~]# > /etc/keepalived/keepalived.conf 
[root@hf-01 ~]# cat !$
cat /etc/keepalived/keepalived.conf
[root@hf-01 ~]# 

Then paste in the configuration and set the virtual IP to match the master
[root@hf-01 ~]# vim /etc/keepalived/keepalived.conf


global_defs {
   notification_email {
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP        # differs from the master
    interface eno16777736        # must match this machine's NIC, otherwise keepalived will not start
    virtual_router_id 51        # same as the master
    priority 90            # priority; must be lower than the master's
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass aminglinux>com
    }
    virtual_ipaddress {
        192.168.74.100        # change to 192.168.74.100
    }
    }
    track_script {
        chk_nginx
    }
}

Save and exit.
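The two configurations must agree on `virtual_router_id`, and the master must have the higher `priority`. A small sketch of checking that from copies of both files follows; `conf_value` and `check_pair` are hypothetical helpers, and the sketch assumes each directive appears once with its value as the second field.

```shell
#!/bin/bash
# Print the value of directive $2 from keepalived config file $1.
conf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Check the pairing rules: same virtual_router_id, master priority higher.
check_pair() {
    local master=$1 backup=$2
    if [ "$(conf_value "$master" virtual_router_id)" != "$(conf_value "$backup" virtual_router_id)" ]; then
        echo "virtual_router_id differs"
        return 1
    fi
    if [ "$(conf_value "$master" priority)" -le "$(conf_value "$backup" priority)" ]; then
        echo "master priority must be higher"
        return 1
    fi
    echo "pair looks consistent"
}
```

Usage would be along the lines of `check_pair master.conf backup.conf`, where the two file names are placeholders for copies of each node's /etc/keepalived/keepalived.conf.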
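  1. Define the monitoring script at the path already set in the keepalived configuration (contents available from the linked address)
  • This script differs slightly from the master's: the command that starts nginx is different, because one nginx was installed with yum and the other was built from source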
[root@hf-01 ~]# vim /usr/local/sbin/check_ng.sh

#!/bin/bash
# Timestamp used in the log entry
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If the count is 0, start nginx and count again;
# if it is still 0, nginx cannot start, so stop keepalived
if [ "$n" -eq 0 ]; then
        systemctl start nginx
        n2=$(ps -C nginx --no-heading | wc -l)
        if [ "$n2" -eq 0 ]; then
                echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
                systemctl stop keepalived
        fi
fi
Save and exit.
  1. Change the script's permissions to 755
[root@hf-01 ~]# chmod 755 /usr/local/sbin/check_ng.sh
[root@hf-01 ~]# 

  1. Start the keepalived service on machine B
  • systemctl start keepalived
[root@hf-01 ~]# systemctl start keepalived
[root@hf-01 ~]# ps aux |grep keep
root      2814  0.0  0.1 121324  1396 ?        Ss   07:10   0:00 /usr/sbin/keepalived -D
root      2815  0.0  0.2 121324  2740 ?        S    07:10   0:00 /usr/sbin/keepalived -D
root      2816  0.0  0.2 121324  2324 ?        S    07:10   0:00 /usr/sbin/keepalived -D
root      2827  0.0  0.0 112672   980 pts/0    R+   07:10   0:00 grep --color=auto keep
[root@hf-01 ~]# 

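How to tell the master's nginx from the backup's?

  • Machine A runs the nginx built from source (PS: this is the virtual host from the existing LNMP environment)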
[root@hanfeng ~]# cat /usr/local/nginx/conf/vhost/aaa.com.conf
server
{
    listen 80 default_server; 
    server_name aaa.com;
    index index.html index.htm index.php;
    root /data/wwwroot/default;
 location ~ \.php$
    {
        include fastcgi_params;
        fastcgi_pass unix:/tmp/champ.sock;
       #fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME /data/wwwroot/default$fastcgi_script_name;
    }

}

[root@hanfeng ~]# cat /data/wwwroot/default/index.html 
This is the default sete.
[root@cham002 ~]# vim /data/wwwroot/default/index.html 
master This is the default sete.
[root@cham002 ~]# 

  • Machine B runs the nginx installed with yum
    • The default index page is /usr/share/nginx/html/index.html
[root@hf-02~]# cat /usr/share/nginx/html/index.html 
[root@hf-02 ~]# vim /usr/share/nginx/html/index.html 
backup backup.
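![Image description](https://static.oschina.net/uploads/img/201801/28233716_ovDw.png "Accessing the backup machine's IP")
  • Visiting the VIP 192.168.74.100 shows the same content as the master (machine A), which means the request reached the master and the VIP is currently on the master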
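Since the two index pages were edited to begin with "master" and "backup", the node currently answering on the VIP can also be identified from the command line. A minimal sketch follows; `node_from_body` is a hypothetical helper that only looks at the first word of the page.

```shell
#!/bin/bash
# Classify a fetched page by its first word. This relies on the index.html
# edits above ("master ..." on machine A, "backup ..." on machine B).
node_from_body() {
    local first
    read -r first _ || true
    case $first in
        master) echo "master" ;;
        backup) echo "backup" ;;
        *)      echo "unknown" ;;
    esac
}
```

With curl installed, `curl -s http://192.168.74.100/ | node_from_body` should name the node that holds the VIP at that moment.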

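Problem: machine B cannot bring nginx back up?

  • After nginx is stopped on machine B, keepalived fails to bring it back up
  • Solution:
    • Set the 755 permission on the script again; after that keepalived can bring nginx back up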

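Testing high availability

  1. To simulate the master going down in production, the simplest and most direct method is to stop the keepalived service
  2. Stop the keepalived service on the master (machine A)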
[root@hanfeng ~]# systemctl stop keepalived
[root@hanfeng ~]# 
  1. Check that the VIP has been released on machine A
[root@hanfeng ~]# ip add
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno16777736:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:c7:05:28 brd ff:ff:ff:ff:ff:ff
    inet 192.168.74.130/24 brd 192.168.74.255 scope global dynamic eno16777736
       valid_lft 1219sec preferred_lft 1219sec
    inet6 fe80::20c:29ff:fec7:528/64 scope link 
       valid_lft forever preferred_lft forever
[root@hanfeng ~]# 
  1. Check that the backup (machine B) now holds the VIP
[root@hf-01 ~]# ip add
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno16777736:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:ff:fe:93 brd ff:ff:ff:ff:ff:ff
    inet 192.168.74.129/24 brd 192.168.74.255 scope global eno16777736
       valid_lft forever preferred_lft forever
    inet 192.168.74.100/32 scope global eno16777736
       valid_lft forever preferred_lft forever
    inet 192.168.74.150/24 brd 192.168.74.255 scope global secondary eno16777736:0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:feff:fe93/64 scope link 
       valid_lft forever preferred_lft forever
[root@hf-01 ~]# 
  1. Check the log on machine B
[root@hf-01 ~]# tail /var/log/messages
Jan 29 08:20:39 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:39 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:39 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:39 hanfeng avahi-daemon[568]: Registering new address record for 192.168.74.100 on eno16777736.IPv4.
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
[root@hf-01 ~]# 
  1. Open the VIP in a browser; the page now comes from the backup machine

  2. Start the keepalived service on the master (machine A) again; the VIP moves back immediately

[root@hanfeng ~]# systemctl start keepalived
[root@hanfeng ~]# ip add
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno16777736:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:c7:05:28 brd ff:ff:ff:ff:ff:ff
    inet 192.168.74.130/24 brd 192.168.74.255 scope global dynamic eno16777736
       valid_lft 1577sec preferred_lft 1577sec
    inet 192.168.74.100/32 scope global eno16777736
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fec7:528/64 scope link 
       valid_lft forever preferred_lft forever
[root@hanfeng ~]# 
  1. Check how the log on machine B has changed
[root@hf-01 ~]# tail /var/log/messages
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:30:01 hanfeng systemd: Started Session 49 of user root.
Jan 29 08:30:01 hanfeng systemd: Starting Session 49 of user root.
Jan 29 08:30:51 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) Received advert with higher priority 100, ours 90
Jan 29 08:30:51 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jan 29 08:30:51 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) removing protocol VIPs.
Jan 29 08:30:51 hanfeng avahi-daemon[568]: Withdrawing address record for 192.168.74.100 on eno16777736.
[root@hf-01 ~]# 
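The state changes are easy to miss among the gratuitous ARP lines. The following sketch filters them out of the log; `vrrp_transitions` is a hypothetical helper name, and it assumes the syslog line format shown in the output above.

```shell
#!/bin/bash
# Print "month day time state" for each VRRP state change read from stdin.
vrrp_transitions() {
    awk '/Entering [A-Z]+ STATE/ {
        for (i = 1; i < NF; i++)
            if ($i == "Entering") { print $1, $2, $3, $(i + 1); break }
    }'
}
```

On the machine itself this could be run as `vrrp_transitions < /var/log/messages`.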

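## Summary

  • In production you may run 2-3 backup nodes; give each a different priority in /etc/keepalived/keepalived.conf (vim /etc/keepalived/keepalived.conf). The higher the priority value, the higher the precedence. Besides nginx, keepalived can also provide high availability for MySQL (for MySQL, the data on both sides must be kept consistent).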
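To illustrate how priorities decide which node holds the VIP when there are several backups, here is a deliberately simplified sketch. `elect_master` is a hypothetical helper and the node names are made up; real VRRP additionally has tie-breaking and preemption rules that this ignores.

```shell
#!/bin/bash
# Read "name priority" lines on stdin and print the name with the highest
# priority. Simplified model only: no tie-breaking, no preemption settings.
elect_master() {
    sort -k2,2nr | head -n 1 | awk '{ print $1 }'
}
```

For instance, `printf 'A 100\nB 90\nC 80\n' | elect_master` picks A; drop the A line to model the master going down and B is picked instead.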

Reposted from: https://my.oschina.net/u/3707314/blog/1612865
