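Configuring a high-availability cluster with keepalived
- Prepare two machines, 130 and 132; 130 is the master and 132 is the backup
- On both machines, run yum install -y keepalived
- Install nginx on both machines; 130 already has nginx compiled from source, so only 132 needs a yum install: yum install -y nginx
- Set the VIP to 100
- Edit the keepalived configuration file on 130 (content source link)
- Edit the monitoring script on 130 (content source link)
- Give the script 755 permissions
- Start the service on 130: systemctl start keepalived
- Edit the configuration file on 132 (content source link)
- Edit the monitoring script on 132 (content source link)
- Give the script 755 permissions
- Start the service on 132 as well: systemctl start keepalived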
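keepalived high-availability cluster setup
- First prepare two machines and install keepalived on both
- keepalived ships a service of its own, and that service is what provides the high availability
Machine A: install keepalived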
[root@hanfeng ~]# yum install -y keepalived
Machine B: install keepalived
[root@hf-01 ~]# yum install -y keepalived
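- nginx is the service made highly available here. It is used for the demonstration because many companies run nginx as a load balancer in production
- If nginx goes down, none of the back-end web servers can be reached, even if they are all healthy
- If machines A and B do not have nginx yet, install it directly with yum
- If nginx was already installed as part of an LNMP setup (built from source), there is no need to install it again
- An nginx installed with yum is easy to tell apart from a source build (PS: if yum cannot install it directly, add the EPEL repository first: yum install -y epel-release)
- Installing nginx from source
- Common errors when installing nginx from source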
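One way to tell the two kinds of install apart is by the path of the nginx binary. The sketch below assumes the two locations that appear in this article's ps output (/usr/sbin/nginx for the yum package, /usr/local/nginx/ for the source build); the function name nginx_install_kind is made up for illustration, and any other layout is reported as unknown.

```shell
#!/bin/bash
# Classify an nginx binary path as a package (yum) or source install.
# The two prefixes are assumptions taken from the ps output in this article.
nginx_install_kind() {
    case "$1" in
        /usr/local/nginx/*) echo "source"  ;;
        /usr/sbin/nginx)    echo "package" ;;
        *)                  echo "unknown" ;;
    esac
}
```

The path to pass in can be read from the "nginx: master process" line of ps aux |grep nginx.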
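Machine A: nginx installed from source
(PS: if initialization fails during the build, some packages are probably missing: yum install -y gcc)
Machine B: nginx installed with yum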
[root@hf-01 ~]# yum install -y epel-release
[root@hf-01 ~]# yum install -y nginx
[root@hf-01 ~]# systemctl start nginx
[root@hf-01 ~]# ps aux |grep nginx
root 2825 0.0 0.2 123300 2100 ? Ss 11:40 0:00 nginx: master process /usr/sbin/nginx
nginx 2826 0.0 0.3 123764 3120 ? S 11:40 0:00 nginx: worker process
root 2828 0.0 0.0 112656 992 pts/0 R+ 11:40 0:00 grep --color=auto nginx
[root@hf-01 ~]#
- Next, edit the keepalived configuration file (content source link)
- The default configuration file is /etc/keepalived/keepalived.conf
- A shortcut for emptying the file: > !$
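Note that > !$ relies on bash history expansion (!$ is the last argument of the previous command), so it only works at an interactive prompt. In a script the path has to be written out. A minimal sketch, in which the function name empty_file and the .bak backup suffix are assumptions and not part of the original steps:

```shell
#!/bin/bash
# Back up a file, then truncate it to zero bytes.
# The ".bak" suffix is an arbitrary choice for this sketch.
empty_file() {
    local f=$1
    [ -f "$f" ] || return 1          # refuse to act on a missing file
    cp -p "$f" "$f.bak" && : > "$f"  # ":" is a no-op; the redirect truncates
}
```

For example, empty_file /etc/keepalived/keepalived.conf leaves the packaged default in keepalived.conf.bak in case it is needed later.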
Machine A: edit the configuration file
[root@hanfeng ~]# ls /etc/keepalived/keepalived.conf
/etc/keepalived/keepalived.conf
[root@hanfeng ~]# > !$ # empties the file directly
> /etc/keepalived/keepalived.conf
[root@hanfeng ~]# cat /etc/keepalived/keepalived.conf
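[root@hanfeng ~]# vim /etc/keepalived/keepalived.conf # get the contents from the source link
Paste the copied contents into the file.
Only the NIC name and the floating IP (192.168.74.100) need to be changed.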
#######################
# Global configuration
#######################
global_defs { # global_defs marks the global configuration block
    notification_email { # notification_email sets the alert e-mail recipients
        [email protected] # several addresses are allowed, one per line
    }
    notification_email_from [email protected] # sender address of the alert e-mail
    smtp_server 127.0.0.1 # SMTP server address
    smtp_connect_timeout 30 # timeout for connecting to the SMTP server
    router_id LVS_DEVEL
}
######################
# VRRP configuration
######################
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh" # health check, implemented as a script
    interval 3 # run the check every 3 seconds
}
vrrp_instance VI_1 { # VRRP instance block; VI_1 is the instance name
    state MASTER # this node takes the MASTER role
    interface eno16777736 # interface used for VRRP advertisements; use your own NIC name
    virtual_router_id 51 # virtual router ID; must match the backup
    priority 100 # priority; master and backup use different values
    advert_int 1 # interval between MASTER and BACKUP synchronization checks, in seconds
    authentication { # authentication settings
        auth_type PASS # the authentication type is PASS
        auth_pass aminglinux>com # the password is a plain string
    }
    virtual_ipaddress { # virtual IP address (VIP), also called the floating IP
        192.168.74.100 # changed to 192.168.74.100
    }
    track_script { # load the check script
        chk_nginx
    }
}
Save and exit.
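Because the interface name is the setting most likely to differ between machines, it can be worth checking that the name written in the config really exists on the host. A small sketch; conf_interface and iface_exists are hypothetical helper names, and the optional second argument exists only so the check can be tried against a scratch directory instead of the real /sys/class/net:

```shell
#!/bin/bash
# Print the value of the first "interface" directive in a keepalived config.
conf_interface() {
    awk '$1 == "interface" { print $2; exit }' "$1"
}

# Succeed if a network interface with that name exists.
# sysdir defaults to the real sysfs location; overriding it is only
# meant for trying the function out.
iface_exists() {
    local name=$1 sysdir=${2:-/sys/class/net}
    [ -n "$name" ] && [ -e "$sysdir/$name" ]
}
```

A possible use before starting the service: iface_exists "$(conf_interface /etc/keepalived/keepalived.conf)" || echo "interface not found".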
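- virtual_ipaddress is the VIP. Of the two machines, one is the master and one is the backup. Normally the master serves requests; when it goes down, the backup takes over and starts nginx. Which IP should clients use then, and which IP should the domain name resolve to? If the domain resolved to the master's own IP, it would become unreachable as soon as the master went down. So a shared IP is defined that both the master and the backup can use, and it can be taken off one machine and configured on the other at any time
- Define the monitoring script (content source link)
- The script path is set in the keepalived configuration file: /usr/local/sbin/check_ng.sh
Machine A: define the monitoring script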
[root@hanfeng ~]# vim /usr/local/sbin/check_ng.sh
#!/bin/bash
# Timestamp used in the log
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are none, start nginx and count the processes again.
# If the count is still 0, nginx cannot start, so stop keepalived.
if [ "$n" -eq 0 ]; then
    /etc/init.d/nginx start
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
Save and exit.
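The script above stops keepalived itself when nginx cannot be restarted. A variation, shown here only as a sketch and not as the method this article uses, is to have the check report through its exit status; as far as I understand, keepalived treats a non-zero exit from a tracked vrrp_script as a failed check. The helper names count_procs and check_or_restart are hypothetical.

```shell
#!/bin/bash
# Count running processes with the given command name.
count_procs() {
    ps -C "$1" --no-headers 2>/dev/null | wc -l
}

# Return 0 if the service is running or could be restarted, 1 otherwise.
# Usage: check_or_restart NAME START_COMMAND [ARGS...]
check_or_restart() {
    local name=$1
    shift
    [ "$(count_procs "$name")" -gt 0 ] && return 0
    "$@" >/dev/null 2>&1               # one restart attempt
    [ "$(count_procs "$name")" -gt 0 ] # final status is the function's result
}
```

On machine A the last line of such a script would be check_or_restart nginx /etc/init.d/nginx start, so that the script's exit status is the result of the check.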
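- "Split brain": in a high-availability (HA) system, when the heartbeat link between the two nodes is cut, a system that used to act as one coordinated whole splits into two independent nodes. Having lost contact, each assumes the other has failed. The HA software on both nodes then fights over the shared resources and tries to start the application services, with serious consequences: either the shared resources are carved up and the service comes up on neither side, or the service comes up on both sides and both read and write the shared storage at once, corrupting data
- How can split brain be detected?
- Check on each machine whether it currently holds the virtual IP. If both hold it, split brain has occurred, which means the two machines cannot communicate. The cause is that neither server can detect the state of the other servers in the group (heartbeat requests get no response), so each decides on its own that the other is down and takes over the virtual IP. Split brain must not be allowed to happen; to resolve it, check the firewall settings (turn the firewall off) or use a serial link for the heartbeat.
- After creating the script, change its permissions (without execute permission the script cannot be run automatically, and keepalived will not work properly)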
[root@hanfeng ~]# chmod 755 /usr/local/sbin/check_ng.sh
[root@hanfeng ~]#
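- Start the keepalived service and check that it started (PS: if it does not start, the firewall may still be running or its rules may be blocking it)
- systemctl stop firewalld stops firewalld
- iptables -nvL lists the current rules
- setenforce 0 turns SELinux off temporarily
- getenforce shows whether the mode is Permissive
- Start keepalived again now, and the keepalived processes will show up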
[root@hanfeng ~]# systemctl start keepalived
[root@hanfeng ~]# ps aux |grep keepalived
root 2970 0.0 0.1 121324 1404 ? Ss 07:14 0:00 /usr/sbin/keepalived -D
root 2971 0.0 0.2 123396 2356 ? S 07:14 0:00 /usr/sbin/keepalived -D
root 2972 0.0 0.2 123396 2384 ? S 07:14 0:00 /usr/sbin/keepalived -D
root 2974 0.0 0.0 112672 988 pts/1 R+ 07:14 0:00 grep --color=auto keepalived
[root@hanfeng ~]#
- Check the nginx processes
[root@hanfeng ~]# ps aux |grep nginx
root 3004 0.0 0.2 123372 2108 ? Ss 07:18 0:00 nginx: master process /usr/sbin/nginx
nginx 3005 0.0 0.3 123836 3148 ? S 07:18 0:00 nginx: worker process
root 3007 0.0 0.0 112672 984 pts/1 R+ 07:19 0:00 grep --color=auto nginx
[root@hanfeng ~]#
- Now stop the nginx service
- /etc/init.d/nginx stop
[root@hanfeng ~]# /etc/init.d/nginx stop
Stopping nginx (via systemctl): [ OK ]
[root@hanfeng ~]#
- Check the nginx processes again: nginx has been started again automatically
[root@hanfeng ~]# ps aux |grep nginx
root 6238 0.0 0.0 20996 628 ? Ss 08:07 0:00 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
nobody 6242 0.0 0.3 23440 3212 ? S 08:07 0:00 nginx: worker process
nobody 6243 0.0 0.3 23440 3212 ? S 08:07 0:00 nginx: worker process
root 6263 0.0 0.0 112676 980 pts/0 R+ 08:07 0:00 grep --color=auto nginx
[root@hanfeng ~]#
- Path of the keepalived log file
- /var/log/messages
- Use the ip add command to view the IP addresses, not ifconfig, because ifconfig does not show the VIP 192.168.133.100
[root@hanfeng ~]# ip add
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno16777736: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:c7:05:28 brd ff:ff:ff:ff:ff:ff
inet 192.168.74.130/24 brd 192.168.74.255 scope global dynamic eno16777736
valid_lft 1158sec preferred_lft 1158sec
inet 192.168.133.100/32 scope global eno16777736
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fec7:528/64 scope link
valid_lft forever preferred_lft forever
[root@hanfeng ~]#
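The same check can be scripted, which is handy for the split-brain test described earlier (run it on both machines; if both report the VIP, the cluster is split). A minimal sketch that reads ip addr output on standard input; the function name has_vip is an assumption made for this example:

```shell
#!/bin/bash
# Succeed if the given IPv4 address appears in `ip addr` output on stdin.
# Matching "inet <address>/" avoids hits on inet6 lines and on longer
# addresses that merely start with the same digits.
has_vip() {
    local vip=$1
    grep -qF "inet $vip/"
}
```

For example: ip -4 addr show | has_vip 192.168.74.100 && echo "this machine holds the VIP".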
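- Check that the firewall and SELinux are turned off on both A and B; if they are not, the experiment may fail
- systemctl stop firewalld stops firewalld
- iptables -nvL lists the current rules
- setenforce 0 turns SELinux off temporarily
- getenforce shows whether the mode is Permissive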
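These checks can be bundled into one pre-flight function. The sketch below only interprets the strings printed by getenforce and systemctl is-active firewalld; it changes nothing on the system. The names selinux_not_enforcing and preflight are made up for illustration.

```shell
#!/bin/bash
# Succeed if the SELinux mode string (as printed by getenforce)
# means the policy is not being enforced.
selinux_not_enforcing() {
    case "$1" in
        Permissive|Disabled) return 0 ;;
        *)                   return 1 ;;
    esac
}

# Usage: preflight SELINUX_MODE FIREWALLD_STATE
# Prints one line per problem found and returns non-zero if there is any.
preflight() {
    local selinux_mode=$1 firewalld_state=$2 rc=0
    if ! selinux_not_enforcing "$selinux_mode"; then
        echo "SELinux is $selinux_mode: run 'setenforce 0'"
        rc=1
    fi
    if [ "$firewalld_state" = "active" ]; then
        echo "firewalld is active: run 'systemctl stop firewalld'"
        rc=1
    fi
    return $rc
}
```

On a real host it would be called as preflight "$(getenforce)" "$(systemctl is-active firewalld)" on both A and B.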
That completes the configuration of the master, machine A.
Backup machine configuration
- Install nginx and keepalived on machine B with yum
[root@hf-01 ~]# yum install -y epel-release
[root@hf-01 ~]# yum install -y nginx
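- Turn off the firewall and SELinux on machine B
- iptables -F flushes the rules
- setenforce 0 turns SELinux off temporarily
- Write machine B's keepalived configuration file (content source link); the virtual IP is the same as on the master
First empty the configuration file that ships with keepalived on machine B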
[root@hf-01 ~]# > /etc/keepalived/keepalived.conf
[root@hf-01 ~]# cat !$
cat /etc/keepalived/keepalived.conf
[root@hf-01 ~]#
Then paste in the configuration and set the virtual IP to the same value as on the master
[root@hf-01 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id LVS_DEVEL
}
vrrp_script chk_nginx {
    script "/usr/local/sbin/check_ng.sh"
    interval 3
}
vrrp_instance VI_1 {
    state BACKUP # differs from the master, which uses MASTER
    interface eno16777736 # must match this machine's NIC name, or keepalived will not start
    virtual_router_id 51 # same as on the master
    priority 90 # priority; must be lower than the master's
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass aminglinux>com
    }
    virtual_ipaddress {
        192.168.74.100 # changed to 192.168.74.100
    }
    track_script {
        chk_nginx
    }
}
Save and exit.
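With both files written, the settings that must agree (virtual_router_id, auth_pass) and the one that must differ (priority) can be compared mechanically. A rough sketch, assuming each directive sits on its own line as in the configs above; conf_value and check_pair are hypothetical helper names, and the check expects both files to define priority.

```shell
#!/bin/bash
# Print the value of the first occurrence of a one-word directive.
conf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Compare a master and a backup keepalived.conf.
# Prints one line per problem and returns non-zero if any was found.
check_pair() {
    local m=$1 b=$2 rc=0
    if [ "$(conf_value "$m" virtual_router_id)" != "$(conf_value "$b" virtual_router_id)" ]; then
        echo "virtual_router_id differs"
        rc=1
    fi
    if [ "$(conf_value "$m" auth_pass)" != "$(conf_value "$b" auth_pass)" ]; then
        echo "auth_pass differs"
        rc=1
    fi
    if [ "$(conf_value "$m" priority)" -le "$(conf_value "$b" priority)" ]; then
        echo "master priority must be higher than backup priority"
        rc=1
    fi
    return $rc
}
```

To use it, copy the backup's file next to the master's (for example with scp) and run check_pair master.conf backup.conf; both file names here are placeholders.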
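- Define the monitoring script at the path already set in the keepalived configuration (content source link)
- This script differs slightly from the one on the master: the command that starts nginx is different, because one nginx was installed with yum and the other from source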
[root@hf-01 ~]# vim /usr/local/sbin/check_ng.sh
#!/bin/bash
# Timestamp used in the log
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Count the nginx processes
n=$(ps -C nginx --no-heading | wc -l)
# If there are none, start nginx and count the processes again.
# If the count is still 0, nginx cannot start, so stop keepalived.
if [ "$n" -eq 0 ]; then
    systemctl start nginx
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
Save and exit.
- Change the script's permissions to 755
[root@hf-01 ~]# chmod 755 /usr/local/sbin/check_ng.sh
[root@hf-01 ~]#
- Start the keepalived service on machine B
- systemctl start keepalived
[root@hf-01 ~]# systemctl start keepalived
[root@hf-01 ~]# ps aux |grep keep
root 2814 0.0 0.1 121324 1396 ? Ss 07:10 0:00 /usr/sbin/keepalived -D
root 2815 0.0 0.2 121324 2740 ? S 07:10 0:00 /usr/sbin/keepalived -D
root 2816 0.0 0.2 121324 2324 ? S 07:10 0:00 /usr/sbin/keepalived -D
root 2827 0.0 0.0 112672 980 pts/0 R+ 07:10 0:00 grep --color=auto keep
[root@hf-01 ~]#
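How to tell the master's nginx from the backup's?
- Machine A runs nginx built from source (PS: this is the virtual host from the LNMP environment configured earlier)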
[root@hanfeng ~]# cat /usr/local/nginx/conf/vhost/aaa.com.conf
server
{
listen 80 default_server;
server_name aaa.com;
index index.html index.htm index.php;
root /data/wwwroot/default;
location ~ \.php$
{
include fastcgi_params;
fastcgi_pass unix:/tmp/champ.sock;
#fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /data/wwwroot/default$fastcgi_script_name;
}
}
[root@hanfeng ~]# cat /data/wwwroot/default/index.html
This is the default sete.
[root@cham002 ~]# vim /data/wwwroot/default/index.html
master This is the default sete.
[root@cham002 ~]#
- Machine B runs nginx installed with yum
- The default index page is /usr/share/nginx/html/index.html
[root@hf-02~]# cat /usr/share/nginx/html/index.html
[root@hf-02 ~]# vim /usr/share/nginx/html/index.html
backup backup.
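![Image description](https://static.oschina.net/uploads/img/201801/28233716_ovDw.png "Accessing the backup machine's IP")
- Visiting the VIP 192.168.74.100 shows the same content as the master (machine A), which means requests are reaching the master and the VIP is on the master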
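Problem: machine B cannot bring nginx back up?
- After nginx is stopped on machine B, keepalived fails to start it again
- Fix:
- Set the 755 permission on the script again; after that keepalived can bring nginx back up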
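A quick way to rule this problem out is to test the script's execute bit before starting keepalived. A minimal sketch; script_is_runnable is a hypothetical name:

```shell
#!/bin/bash
# Succeed if the script exists and is executable by the current user;
# otherwise print a hint, including the chmod fix used in this article.
script_is_runnable() {
    local f=$1
    if [ ! -f "$f" ]; then
        echo "$f: not found"
        return 1
    fi
    if [ ! -x "$f" ]; then
        echo "$f: not executable, run: chmod 755 $f"
        return 1
    fi
}
```

For example: script_is_runnable /usr/local/sbin/check_ng.sh && systemctl start keepalived.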
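Testing high availability
- To simulate the master going down in production, the simplest and most direct way is to stop the keepalived service
- Stop the keepalived service on the master (machine A)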
[root@hanfeng ~]# systemctl stop keepalived
[root@hanfeng ~]#
- On machine A, the VIP has been released
[root@hanfeng ~]# ip add
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno16777736: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:c7:05:28 brd ff:ff:ff:ff:ff:ff
inet 192.168.74.130/24 brd 192.168.74.255 scope global dynamic eno16777736
valid_lft 1219sec preferred_lft 1219sec
inet6 fe80::20c:29ff:fec7:528/64 scope link
valid_lft forever preferred_lft forever
[root@hanfeng ~]#
- The backup (machine B) now holds the VIP
[root@hf-01 ~]# ip add
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno16777736: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:ff:fe:93 brd ff:ff:ff:ff:ff:ff
inet 192.168.74.129/24 brd 192.168.74.255 scope global eno16777736
valid_lft forever preferred_lft forever
inet 192.168.74.100/32 scope global eno16777736
valid_lft forever preferred_lft forever
inet 192.168.74.150/24 brd 192.168.74.255 scope global secondary eno16777736:0
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:feff:fe93/64 scope link
valid_lft forever preferred_lft forever
[root@hf-01 ~]#
- Check the log on machine B
[root@hf-01 ~]# tail /var/log/messages
Jan 29 08:20:39 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:39 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:39 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:39 hanfeng avahi-daemon[568]: Registering new address record for 192.168.74.100 on eno16777736.IPv4.
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
[root@hf-01 ~]#
- Start keepalived on the master (machine A) again; the VIP moves back to machine A
[root@hanfeng ~]# systemctl start keepalived
[root@hanfeng ~]# ip add
1: lo: mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno16777736: mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:c7:05:28 brd ff:ff:ff:ff:ff:ff
inet 192.168.74.130/24 brd 192.168.74.255 scope global dynamic eno16777736
valid_lft 1577sec preferred_lft 1577sec
inet 192.168.74.100/32 scope global eno16777736
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fec7:528/64 scope link
valid_lft forever preferred_lft forever
[root@hanfeng ~]#
- Check how the log on machine B changes
[root@hf-01 ~]# tail /var/log/messages
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:20:44 hanfeng Keepalived_vrrp[7886]: Sending gratuitous ARP on eno16777736 for 192.168.74.100
Jan 29 08:30:01 hanfeng systemd: Started Session 49 of user root.
Jan 29 08:30:01 hanfeng systemd: Starting Session 49 of user root.
Jan 29 08:30:51 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) Received advert with higher priority 100, ours 90
Jan 29 08:30:51 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jan 29 08:30:51 hanfeng Keepalived_vrrp[7886]: VRRP_Instance(VI_1) removing protocol VIPs.
Jan 29 08:30:51 hanfeng avahi-daemon[568]: Withdrawing address record for 192.168.74.100 on eno16777736.
[root@hf-01 ~]#
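When the log is long, the state changes are easier to spot if everything else is filtered out. The sketch below assumes syslog lines shaped like the ones shown above (time in the third field, the text "Entering ... STATE"); the function name vrrp_transitions is made up for this example.

```shell
#!/bin/bash
# Print "TIME STATE" for every VRRP state change found on stdin.
vrrp_transitions() {
    awk '/Entering [A-Z]+ STATE/ {
        for (i = 1; i < NF; i++)
            if ($i == "Entering") print $3, $(i + 1)
    }'
}
```

Typical use: vrrp_transitions < /var/log/messages.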
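## Summary
- In production there may be 2-3 machines in the backup role. Give each one a different priority in /etc/keepalived/keepalived.conf (vim /etc/keepalived/keepalived.conf); the higher the priority value, the higher the precedence. Besides nginx, keepalived can also provide a high-availability cluster for MySQL (for MySQL high availability, the data on both sides must be kept consistent).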
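The rule that the highest priority wins can be illustrated with a toy function. It only mimics the priority comparison among the nodes that are still alive; real VRRP elections also involve advertisement timing and tie-breaking, which are not modeled here. The node names and the function name elect_master are invented for the example.

```shell
#!/bin/bash
# Given "name:priority" arguments for the nodes that are currently alive,
# print the name with the highest priority.
# Ties are not handled in this sketch.
elect_master() {
    printf '%s\n' "$@" | sort -t: -k2,2nr | head -n 1 | cut -d: -f1
}
```

With one master and two backups, elect_master a:100 b:90 c:80 picks a; once a is down, elect_master b:90 c:80 picks b.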