HAProxy provides load balancing and health checks for the web service, Pacemaker provides high availability for HAProxy, and Corosync carries the heartbeat messages between the nodes.
Pacemaker is a cluster resource manager. It uses the messaging and membership capabilities of a cluster infrastructure layer (OpenAIS, Heartbeat, or Corosync) to detect and recover from node-level and resource-level failures, maximizing the availability of cluster services (also called resources).
Of the two, Corosync handles heartbeat detection and Pacemaker handles resource failover; used together they automate the management of a high-availability setup. Heartbeat detection checks whether a server is still providing service: as soon as it stops responding it is considered dead, and once a server is detected as dead its resources are moved to another node.
Corosync is the open-source component running at the heartbeat (messaging) layer; Pacemaker is the open-source component running at the resource-management layer.
Host | IP | Services |
---|---|---|
server1 | 172.25.1.1 | corosync,pacemaker,haproxy |
server2 | 172.25.1.2 | httpd |
server3 | 172.25.1.3 | httpd |
server4 | 172.25.1.4 | corosync,pacemaker,haproxy |
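Hostname resolution between the nodes is assumed throughout; a minimal /etc/hosts sketch on every machine, taken from the table above:
172.25.1.1 server1
172.25.1.2 server2
172.25.1.3 server3
172.25.1.4 server4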
tar zxf haproxy-1.6.11.tar.gz # unpack the source tarball
yum install rpm-build -y # install the rpm build tool
rpmbuild -tb haproxy-1.6.11.tar.gz # build an rpm package directly from the tarball
cd /root/rpmbuild/RPMS/x86_64/
rpm -qpl haproxy-1.6.11-1.x86_64.rpm (list the files the package will install)
rpm -ivh haproxy-1.6.11-1.x86_64.rpm (install the package)
cd /mnt/haproxy-1.6.11/examples
cp content-sw-sample.cfg /etc/haproxy/haproxy.cfg # copy the sample configuration into place
grep 200 /etc/passwd # make sure uid 200 is not already in use
groupadd -g 200 haproxy # create a haproxy group with gid 200
useradd -u 200 -g 200 -M haproxy # create the haproxy user (no home directory), matching the uid/gid 200 used in the config
id haproxy
[root@server1 haproxy]# cat haproxy.cfg
#
# This is a sample configuration. It illustrates how to separate static objects
# traffic from dynamic traffic, and how to dynamically regulate the server load.
#
# It listens on 192.168.1.10:80, and directs all requests for Host 'img' or
# URIs starting with /img or /css to a dedicated group of servers. URIs
# starting with /admin/stats deliver the stats page.
#
global
maxconn 10000
stats socket /var/run/haproxy.stat mode 600 level admin
log 127.0.0.1 local0
uid 200
gid 200
chroot /var/empty
daemon
# The public 'www' address in the DMZ
frontend public
bind *:80 name clear
#bind 192.168.1.10:443 ssl crt /etc/haproxy/haproxy.pem
mode http
log global
option httplog
option dontlognull
monitor-uri /monitoruri
maxconn 8000
timeout client 30s
stats uri /admin/stats
#use_backend static if { hdr_beg(host) -i img }
#use_backend static if { path_beg /img /css }
default_backend static
# The static backend for 'Host: img', /img and /css.
backend static
mode http
balance roundrobin
option prefer-last-server
retries 2
option redispatch
timeout connect 5s
timeout server 5s
#option httpchk HEAD /favicon.ico
server statsrv1 172.25.1.2:80 check inter 1000
server statsrv2 172.25.1.3:80 check inter 1000
server backup 172.25.1.1:8080 backup
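The two backends referenced above are plain httpd servers on server2 and server3; a minimal sketch (the index page content is only an assumption, used to tell the servers apart):
[root@server2 ~]# yum install -y httpd
[root@server2 ~]# echo server2 > /var/www/html/index.html
[root@server2 ~]# /etc/init.d/httpd start
# repeat on server3 with its own page content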
[root@server1 ~]# /etc/init.d/haproxy start # for testing only; be sure to stop it afterwards, since pacemaker will control starting haproxy from now on
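A quick check of load balancing and the health check before handing control to pacemaker (a sketch; assumes the test pages above):
[root@foundation1 ~]# for i in $(seq 4); do curl -s 172.25.1.1; done # responses should alternate between server2 and server3
[root@foundation1 ~]# curl -s 172.25.1.1/monitoruri # HAProxy's monitor URI, configured above
[root@server1 ~]# /etc/init.d/haproxy stop # stop it again before configuring pacemaker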
[root@server1 ~]# yum install -y corosync # requires the HighAvailability yum repository to be configured
[root@server1 ~]# yum install -y crmsh-1.2.6-0.rc2.2.1.x86_64.rpm pssh-2.3.1-2.1.x86_64.rpm
# installs the crm command-line shell (crmsh) and its pssh dependency
cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
vim /etc/corosync/corosync.conf
#Please read the corosync.conf.5 manual page
compatibility: whitetank
service {
ver:0 ### with ver 0, corosync starts pacemaker automatically
name:pacemaker
}
aisexec { ### user and group the ais subsystem runs as; this block is optional
user:root
group:root
}
totem {
version: 2 # protocol version; 2 is the only valid value
secauth: off # cluster authentication/encryption, disabled here
threads: 0 # worker threads; can speed up multicast handling and secauth processing
interface {
ringnumber: 0 # must be unique when more than one interface/ring is used
bindnetaddr: 172.25.1.0 # network segment to bind to (the nodes are on 172.25.1.0/24)
mcastaddr: 226.94.1.1 # multicast address
mcastport: 5405 # multicast port
ttl: 1 # TTL of the multicast packets
}
}
Start corosync on server1 and server4 (both nodes need the same packages installed and the same configuration).
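A sketch of getting the same configuration onto server4 and bringing corosync up on both nodes (assumes root ssh access between the nodes):
[root@server1 corosync]# scp /etc/corosync/corosync.conf server4:/etc/corosync/
[root@server1 corosync]# /etc/init.d/corosync start
[root@server4 ~]# /etc/init.d/corosync start
[root@server1 corosync]# corosync-cfgtool -s # verify that the ring is active and shows no faults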
Then run crm status to check the state of server1 and server4:
[root@server1 corosync]# crm status
Last updated: Sun Oct 7 18:24:24 2018
Last change: Sun Oct 7 16:12:18 2018 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server4 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes # two nodes, two expected votes
0 Resources configured
Online: [ server1 server4 ] # server1 and server4 are exchanging heartbeat information
[root@server1 ~]# yum install -y pacemaker
crmsh provides a command-line interface for configuring cluster resources.
List the resource-agent classes supported by the current cluster:
[root@server1 corosync]# crm ra classes
lsb
ocf / heartbeat pacemaker
service
stonith
Configuring HAProxy with crm
To delete a configured resource, first stop it under crm resource, then run delete <name> under configure, as sketched below.
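For example, if the vip resource defined further down ever needs to be removed, the sequence would look roughly like this (a sketch; vip is only the example name):
crm(live)# resource
crm(live)resource# stop vip
crm(live)resource# up
crm(live)# configure
crm(live)configure# delete vip
crm(live)configure# commit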
crm(live)# help
This is crm shell, a Pacemaker command line interface.
Available commands:
cib manage shadow CIBs
resource resources management
configure CRM cluster configuration
node nodes management
options user preferences
history CRM cluster history
site Geo-cluster support
ra resource agents information center
status show cluster status
help,? show help (help topics for list of topics)
end,cd,up go back one level
quit,bye,exit exit the program
# help works at every level: whichever sub-level of crm you are in, help shows the relevant documentation
crm(live)# ra # enter the resource-agent level
crm(live)ra# help # lists the sub-commands available at this level
This level contains commands which show various information about
the installed resource agents. It is available both at the top
level and at the `configure` level.
Available commands:
classes list classes and providers
list list RA for a class (and provider)
meta show meta data for a RA
providers show providers for a RA and a class
help show help (help topics for list of topics)
end go back one level
quit exit the program
crm(live)ra# classes # classes lists the supported RA classes
lsb
ocf / heartbeat pacemaker
service
stonith
crm(live)ra# providers IPaddr2 # show which provider(s) supply this RA
heartbeat
crm(live)ra# meta ocf:heartbeat:IPaddr2 # show the RA's metadata; only a short excerpt is reproduced below
ip* (string): IPv4 or IPv6 address
The IPv4 (dotted quad notation) or IPv6 address (colon hexadecimal notation)
example IPv4 "192.168.1.1".
example IPv6 "2001:db8:DC28:0:0:FC57:D4C8:1FFF".
nic (string): Network interface
The base network interface on which the IP address will be brought
online.
If left empty, the script will try and determine this from the
routing table.
Do NOT specify an alias interface in the form eth0:1 or anything here;
rather, specify the base interface only.
If you want a label, see the iflabel parameter.
Prerequisite:
There must be at least one static IP address, which is not managed by
the cluster, assigned to the network interface.
If you can not assign any static IP address on the interface,
modify this kernel parameter:
sysctl -w net.ipv4.conf.all.promote_secondaries=1 # (or per device)
Operations' defaults (advisory minimum):
start timeout=20s
stop timeout=20s
status timeout=20s interval=10s
monitor timeout=20s interval=10s
Run crm to enter the crm shell:
crm(live)# configure # enter the configuration level
crm(live)configure# show # show the complete current configuration
node server1
node server4
property $id="cib-bootstrap-options" \
dc-version="1.1.10-14.el6-368c726" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
crm(live)configure# property stonith-enabled=False
# no stonith device has been defined yet, so stonith has to be disabled for now
crm(live)configure# verify
crm(live)configure# primitive vip ocf:heartbeat:IPaddr2 params ip=172.25.1.100 nic=eth0 cidr_netmask=24 op monitor interval=1min
# define the vip resource with the ocf:heartbeat:IPaddr2 agent; params passes the agent parameters: the VIP 172.25.1.100, the NIC it is bound to, and a /24 netmask; op overrides the default operation settings (monitor every minute)
crm(live)configure# verify
crm(live)configure# commit # commit the changes
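After the commit, the address should be visible on whichever node is running the resource; a quick check (a sketch):
[root@server1 ~]# ip addr show eth0 | grep 172.25.1.100
[root@server1 ~]# crm status # shows which node currently holds vip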
crm(live)# configure
crm(live)configure# primitive haproxy lsb:haproxy op monitor interval=1min
# define the haproxy resource with the lsb:haproxy init script, monitored every minute
crm(live)configure# property no-quorum-policy=ignore # ignore loss of quorum: in a two-node cluster, losing one node would otherwise stop all resources
crm(live)configure# verify
crm(live)configure# commit # commit the changes
crm(live)configure# group hagroup vip haproxy # group vip and haproxy so they always run together on the same node
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# exit
bye
The crm_mon monitoring command shows the state of the resources in the cluster.
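crm_mon refreshes continuously; for a one-shot snapshot instead:
[root@server1 ~]# crm_mon -1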
View the configuration that has been added:
[root@server1 corosync]# crm configure show
primitive haproxy lsb:haproxy \
op monitor interval="1min"
primitive vip ocf:heartbeat:IPaddr2 \
params ip="172.25.1.100" nic="eth0" cidr_netmask="24" \
op monitor interval="1min"
group hagroup vip haproxy
property $id="cib-bootstrap-options" \
dc-version="1.1.10-14.el6-368c726" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
stonith-enabled="False" \
no-quorum-policy="ignore"
Check the cluster status:
[root@server1 corosync]# crm status
Last updated: Sun Oct 7 18:31:15 2018
Last change: Sun Oct 7 16:12:18 2018 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server4 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ server1 server4 ]
Resource Group: hagroup
vip (ocf::heartbeat:IPaddr2): Started server4
haproxy (lsb:haproxy): Started server4
crm node standby server1 # put server1 into standby (its resources move to the other node)
crm node online server1 # bring server1 back online
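A simple failover check (a sketch; the curl loop runs on a client such as foundation1 while the active node is switched):
[root@foundation1 ~]# while true; do curl -s 172.25.1.100; sleep 1; done
# in another terminal, put the node currently holding hagroup into standby;
# the loop should keep getting answers once the resources move to the other node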
(In this setup the fence daemon is installed on the physical host, foundation1.)
[root@foundation1 ~]# yum install fence-virtd.x86_64 fence-virtd-libvirt.x86_64 fence-virtd-multicast.x86_64 -y
mkdir /etc/cluster
dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1 ### generate a random 128-byte key
[root@foundation1 Desktop]# fence_virtd -c
Module search path [/usr/lib64/fence-virt]:
Available backends:
libvirt 0.1
Available listeners:
multicast 1.2
Listener modules are responsible for accepting requests
from fencing clients.
Listener module [multicast]: ### listener mode
The multicast listener module is designed for use environments
where the guests and hosts may communicate over a network using
multicast.
The multicast address is the address that a client will use to
send fencing requests to fence_virtd.
Multicast IP Address [225.0.0.12]: ### multicast address
Using ipv4 as family.
Multicast IP Port [1229]: ### port; can be changed if needed
Setting a preferred interface causes fence_virtd to listen only
on that interface. Normally, it listens on all interfaces.
In environments where the virtual machines are using the host
machine as a gateway, this *must* be set (typically to virbr0).
Set to 'none' for no interface.
Interface [br0]: br0 ### set this according to the interface/bridge name on your host
The key file is the shared key information which is used to
authenticate fencing requests. The contents of this file must
be distributed to each physical host and virtual machine within
a cluster.
Key File [/etc/cluster/fence_xvm.key]:
Backend modules are responsible for routing requests to
the appropriate hypervisor or management layer.
Backend module [libvirt]:
Configuration complete.
=== Begin Configuration ===
fence_virtd {
listener = "multicast";
backend = "libvirt";
module_path = "/usr/lib64/fence-virt";
}
listeners {
multicast {
key_file = "/etc/cluster/fence_xvm.key";
address = "225.0.0.12";
interface = "br0";
family = "ipv4";
port = "1229";
}
}
backends {
libvirt {
uri = "qemu:///system";
}
}
=== End Configuration ===
Replace /etc/fence_virt.conf with the above [y/N]? y # overwrite the existing configuration file
systemctl restart fence_virtd.service ### restart the fence daemon; its configuration lives in /etc/fence_virt.conf
scp /etc/cluster/fence_xvm.key [email protected]:/etc/cluster/
scp /etc/cluster/fence_xvm.key [email protected]:/etc/cluster/
Run stonith_admin -I on server1 and server4 to check whether the fence_xvm agent is available; if it is not listed, install the following package:
[root@server1 corosync]# yum install fence-virt-0.2.3-15.el6.x86_64
[root@server1 corosync]# stonith_admin -I
fence_xvm
fence_virt
fence_pcmk
fence_legacy
4 devices found
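With the key distributed and fence_virtd running on the physical host, the agent can be tested directly from a cluster node (a sketch):
[root@server1 corosync]# fence_xvm -o list # should list the virtual machines (e.g. test1 and test4) managed by fence_virtd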
Add the fence resource on server4 (all of the configuration could just as well be done from a single node; doing it on server4 here simply demonstrates that the two nodes stay in sync, since committed changes are replicated between them).
[root@server4 ~]# crm
crm(live)# configure
crm(live)configure# primitive vmfence stonith:fence_xvm params pcmk_host_map="server1:test1;server4:test4" op monitor interval=1min
# define the fence resource; in the host map server1:test1, server1 is the cluster node name and test1 is the actual virtual machine (domain) name
crm(live)configure# property stonith-enabled=true # re-enable stonith now that a fence device is defined
crm(live)configure# verify
crm(live)configure# commit # commit the changes
crm(live)# configure
crm(live)configure# show
node server1
node server4 \
attributes standby="off"
primitive haproxy lsb:haproxy \
op monitor interval="1min"
primitive vip ocf:heartbeat:IPaddr2 \
params ip="172.25.1.100" cidr_netmask="24" \
op monitor interval="1min"
primitive vmfence stonith:fence_xvm \
params pcmk_host_map="server1:test1;server4:test4" \
op monitor interval="1min"
group hagroup vip haproxy
property $id="cib-bootstrap-options" \
dc-version="1.1.10-14.el6-368c726" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
stonith-enabled="true" \
no-quorum-policy="ignore"
[root@server1 ~]# /etc/init.d/corosync restart
[root@server4 ~]# /etc/init.d/corosync restart
[root@server1 corosync]# crm status
Last updated: Sun Oct 7 18:37:04 2018
Last change: Sun Oct 7 16:12:18 2018 via cibadmin on server1
Stack: classic openais (with plugin)
Current DC: server4 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ server1 server4 ]
Resource Group: hagroup
vip (ocf::heartbeat:IPaddr2): Started server4
haproxy (lsb:haproxy): Started server4
vmfence (stonith:fence_xvm): Started server1
[root@server4 ~]# echo c > /proc/sysrq-trigger
This simulates a kernel crash on server4; after a short while server4 is fenced and reboots automatically, and the resources fail over to server1.
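Once server4 has been fenced and rebooted, the recovery can be confirmed (a sketch):
[root@server1 ~]# crm status # vip and haproxy should now be running on server1
[root@foundation1 ~]# curl -s 172.25.1.100 # the service remains reachable through the VIP
[root@server4 ~]# /etc/init.d/corosync start # rejoin the cluster after the reboot if corosync is not set to start at boot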