Corosync+Pacemaker+NFS+Httpd high-availability web cluster (configured with the crmsh resource management tool)
Stack: crmsh (Corosync + Pacemaker) + NFS + httpd
Cluster node 1: 192.168.88.132 cen7.field.com
Cluster node 2: 192.168.88.133 node2.field.com
Cluster node 3: 192.168.88.134 node1.field.com
VIP: 192.168.88.188; resource agent: ocf:heartbeat:IPaddr2
NFS server: node1.field.com; resource agent: ocf:heartbeat:Filesystem
Web servers: cen7.field.com, node2.field.com, node1.field.com; resource agent: systemd:httpd
Prerequisites for building the cluster
(1) Time synchronization across all nodes;
(2) Nodes must be able to reach each other using the hostnames they are currently configured with;
(3) Decide whether a quorum (arbitration) device will be used;
A sketch of items (1) and (2) follows below. Reference: "Configuring Corosync+Pacemaker with the pcs resource management tool".
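A minimal sketch of prerequisites (1) and (2), assuming chrony is used for time synchronization and that name resolution goes through /etc/hosts (the exact entries below are inferred from the node list above and would be applied on every node):
cat >> /etc/hosts <<EOF
192.168.88.132 cen7.field.com cen7
192.168.88.133 node2.field.com node2
192.168.88.134 node1.field.com node1
EOF
yum install -y chrony
systemctl start chronyd
systemctl enable chronyd
chronyc sources      #confirm that a time source is reachable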
I. Install and configure Corosync
1. Configure Corosync on all cluster nodes
[root@node1 ~]# vim /etc/resolv.conf
[root@node1 ~]# yum install corosync pacemaker -y
You can check which files were installed with rpm -ql:
[root@cen7 ~]# rpm -ql corosync|less
[root@cen7 ~]# rpm -ql pacemaker|less
Check the Corosync configuration files:
[root@cen7 ~]# cd /etc/corosync/
[root@cen7 corosync]# ls
corosync.conf.example corosync.conf.example.udpu corosync.xml.example uidgid.d
1) Copy the example configuration file and edit it:
[root@cen7 corosync]# cp corosync.conf.example corosync.conf
[root@cen7 corosync]# vim corosync.conf
[root@cen7 corosync]# grep -v '^[[:space:]]*#' corosync.conf
totem {
    #totem layer: defines the cluster messaging protocol
    version: 2
    crypto_cipher: aes128
    crypto_hash: sha1
    secauth: on
    #enable authentication and encryption of cluster traffic
    interface {
        #interface: defines the interface used for cluster messaging
        ringnumber: 0
        #ring 0 is the primary heartbeat interface
        bindnetaddr: 192.168.88.0
        #network address to bind to; any address in this subnet belongs to ring 0
        mcastaddr: 239.188.1.188
        #multicast address and port
        mcastport: 5405
        ttl: 1
    }
}
nodelist {
    #nodelist: defines the cluster nodes
    node {
        ring0_addr: 192.168.88.132
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.88.133
        nodeid: 2
    }
    node {
        ring0_addr: 192.168.88.134
        nodeid: 3
    }
}
logging {
    #logging subsystem
    fileline: off
    to_stderr: no
    #whether to send log messages to standard error
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    #log file path
    to_syslog: no
    #whether to send log messages to syslog; for I/O performance it is best to enable only one log target
    debug: off
    #whether to log debug information
    timestamp: on
    #whether to record timestamps
    #Note: each timestamp requires a system call to read the current time; with a very large log volume these repeated calls may affect performance
    logger_subsys {
        #also log the QUORUM subsystem
        subsys: QUORUM
        debug: off
    }
}
quorum {
    provider: corosync_votequorum
    #defines the voting (quorum) system
}
[root@cen7 corosync]# ls
corosync.conf corosync.conf.example corosync.conf.example.udpu corosync.xml.example uidgid.d
2) Generate the authentication key used for inter-node communication:
[root@cen7 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 920).
Press keys on your keyboard to generate entropy (bits = 1000).
Writing corosync key to /etc/corosync/authkey.
[root@cen7 corosync]# ls
authkey corosync.conf.example corosync.xml.example
corosync.conf corosync.conf.example.udpu uidgid.d
3) Configure the remaining nodes:
All nodes use the same Corosync configuration file and authentication key, so copying corosync.conf and authkey to node2 and node1 completes their configuration.
[root@cen7 corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
#scp -p preserves the original permissions and timestamps
authkey 100% 128 75.3KB/s 00:00
corosync.conf 100% 3031 1.7MB/s 00:00
[root@cen7 corosync]# scp -p authkey corosync.conf node1:/etc/corosync/
authkey 100% 128 49.4KB/s 00:00
corosync.conf 100% 3131 1.1MB/s 00:00
2. Confirm the Corosync status on each cluster node
1) Start Corosync on cen7 and confirm its status
[root@cen7 corosync]# systemctl start corosync.service
[root@cen7 corosync]# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
Active: active (running) since 四 2018-08-02 07:31:20 CST; 1min 7s ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Process: 1954 ExecStop=/usr/share/corosync/corosync stop (code=exited, status=0/SUCCESS)
Process: 1970 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
Main PID: 1981 (corosync)
CGroup: /system.slice/corosync.service
└─1981 corosync
8月 02 07:31:20 cen7.field.com systemd[1]: Starting Corosync Cluster Engine...
8月 02 07:31:20 cen7.field.com corosync[1970]: Starting Corosync Cluster Engine (corosync):… ]
8月 02 07:31:20 cen7.field.com systemd[1]: Started Corosync Cluster Engine.
Hint: Some lines were ellipsized, use -l to show in full.
Check whether the Corosync engine started correctly:
[root@cen7 corosync]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Check whether the initial membership notifications were sent correctly:
[root@cen7 corosync]# grep TOTEM /var/log/cluster/corosync.log
The following log entries show that Corosync has started normally:
[root@cen7 corosync]# tail /var/log/cluster/corosync.log
Aug 02 07:31:30 [1980] cen7.field.com corosync notice [QUORUM] This node is within the non-primary component and will NOT provide any services.
Aug 02 07:31:30 [1980] cen7.field.com corosync notice [QUORUM] Members[1]: 1
Aug 02 07:31:30 [1980] cen7.field.com corosync notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 02 07:31:32 [1980] cen7.field.com corosync notice [TOTEM ] A new membership (192.168.88.132:24) was formed. Members joined: 2
Aug 02 07:31:32 [1980] cen7.field.com corosync notice [QUORUM] This node is within the primary component and will provide service.
Aug 02 07:31:32 [1980] cen7.field.com corosync notice [QUORUM] Members[2]: 1 2
Aug 02 07:31:32 [1980] cen7.field.com corosync notice [MAIN ] Completed service synchronization, ready to provide service.
Aug 02 07:31:49 [1980] cen7.field.com corosync notice [TOTEM ] A new membership (192.168.88.132:28) was formed. Members joined: 3
Aug 02 07:31:49 [1980] cen7.field.com corosync notice [QUORUM] Members[3]: 1 2 3
Aug 02 07:31:49 [1980] cen7.field.com corosync notice [MAIN ] Completed service synchronization, ready to provide service.
Corosync ships with a number of built-in utilities, listed below (shown via tab completion):
[root@cen7 corosync]# corosync-
corosync-blackbox corosync-cmapctl corosync-keygen corosync-quorumtool
corosync-cfgtool corosync-cpgtool corosync-notifyd corosync-xmlproc
corosync-cfgtool displays and configures Corosync; "corosync-cfgtool -s" shows the current ring (heartbeat) status of this node:
[root@cen7 corosync]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 192.168.88.132
status = ring 0 active with no faults
[root@cen7 corosync]#
2) Start Corosync on node2 and confirm its status
[root@node2 corosync]# systemctl start corosync.service
[root@node2 corosync]# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
Active: active (running) since 四 2018-08-02 07:31:33 CST; 48s ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Process: 2129 ExecStop=/usr/share/corosync/corosync stop (code=exited, status=0/SUCCESS)
Process: 2145 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
Main PID: 2156 (corosync)
CGroup: /system.slice/corosync.service
└─2156 corosync
8月 02 07:31:32 node2.field.com systemd[1]: Starting Corosync Cluster Engine...
8月 02 07:31:33 node2.field.com corosync[2145]: Starting Corosync Cluster Engine (corosync)… ]
8月 02 07:31:33 node2.field.com systemd[1]: Started Corosync Cluster Engine.
Hint: Some lines were ellipsized, use -l to show in full.
[root@node2 corosync]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
id = 192.168.88.133
status = ring 0 active with no faults
[root@cen7 corosync]# corosync-cmapctl |grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.88.132)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.88.133)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 2
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
runtime.totem.pg.mrp.srp.members.3.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.3.ip (str) = r(0) ip(192.168.88.134)
runtime.totem.pg.mrp.srp.members.3.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.3.status (str) = joined
3) Start Corosync on node1 and confirm its status
[root@node1 corosync]# systemctl start corosync.service
[root@node1 corosync]# systemctl status corosync.service
● corosync.service - Corosync Cluster Engine
Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
Active: active (running) since 四 2018-08-02 07:31:50 CST; 5s ago
Docs: man:corosync
man:corosync.conf
man:corosync_overview
Process: 1909 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
Main PID: 1920 (corosync)
CGroup: /system.slice/corosync.service
└─1920 corosync
8月 02 07:31:49 node1.field.com systemd[1]: Starting Corosync Cluster Engine...
8月 02 07:31:50 node1.field.com corosync[1909]: Starting Corosync Cluster Engine (corosync): [ 确定 ]
8月 02 07:31:50 node1.field.com systemd[1]: Started Corosync Cluster Engine.
[root@node1 corosync]# corosync-cfgtool -s
Printing ring status.
Local node ID 3
RING ID 0
id = 192.168.88.134
status = ring 0 active with no faults
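With all three rings reporting "no faults", quorum and membership can also be double-checked with corosync-quorumtool, one of the built-in utilities listed above; for example (run on any node, output depends on the environment):
corosync-quorumtool -s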
II. Start Pacemaker and confirm cluster node status
1. Configure and start Pacemaker
[root@cen7 corosync]# vim /etc/sysconfig/pacemaker
PCMK_logfile=/var/log/pacemaker.log
#add a dedicated log file for Pacemaker
Use Ansible to manage the cluster: add the cluster nodes to /etc/ansible/hosts to make them easier to manage
#or simply run systemctl start|stop|restart xxxx on each node
[root@cen7 corosync]# vim /etc/ansible/hosts
[hacluster]
192.168.88.132
192.168.88.133
192.168.88.134
Start Pacemaker on all nodes:
[root@cen7 corosync]# ansible hacluster -m service -a 'name=pacemaker state=started enabled=yes'
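To confirm from cen7 that Corosync and Pacemaker are active on every node, an Ansible ad-hoc check against the same [hacluster] group can be used; a quick sketch:
ansible hacluster -m shell -a 'systemctl is-active corosync pacemaker'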
2. Confirm that Pacemaker is running correctly
[root@cen7 corosync]# systemctl status pacemaker.service
● pacemaker.service - Pacemaker High Availability Cluster Manager
Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; enabled; vendor preset: disabled)
Active: active (running) since 四 2018-08-02 08:18:26 CST; 4min 46s ago
Docs: man:pacemakerd
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/index.html
Main PID: 3510 (pacemakerd)
CGroup: /system.slice/pacemaker.service
├─3510 /usr/sbin/pacemakerd -f
├─3523 /usr/libexec/pacemaker/cib
├─3524 /usr/libexec/pacemaker/stonithd
├─3525 /usr/libexec/pacemaker/lrmd
├─3526 /usr/libexec/pacemaker/attrd
├─3527 /usr/libexec/pacemaker/pengine
└─3528 /usr/libexec/pacemaker/crmd
8月 02 08:18:26 cen7.field.com systemd[1]: Started Pacemaker High Availability Cluster Manager.
8月 02 08:18:26 cen7.field.com systemd[1]: Starting Pacemaker High Availability Cluster Manager...
8月 02 08:18:27 cen7.field.com pacemakerd[3510]: notice: Additional logging available in /var/log/pacemaker.log
[root@node2 corosync]# ps aux|tail
root 3723 0.0 1.8 104268 6272 ? Ss 08:18 0:00 /usr/sbin/pacemakerd -f
haclust+ 3732 0.0 4.1 106156 13956 ? Ss 08:18 0:00 /usr/libexec/pacemaker/cib
root 3733 0.1 2.0 107312 6908 ? Ss 08:18 0:00 /usr/libexec/pacemaker/stonithd
root 3734 0.0 1.3 98776 4352 ? Ss 08:18 0:00 /usr/libexec/pacemaker/lrmd
haclust+ 3735 0.0 1.9 127976 6556 ? Ss 08:18 0:00 /usr/libexec/pacemaker/attrd
haclust+ 3736 0.0 1.0 80480 3484 ? Ss 08:18 0:00 /usr/libexec/pacemaker/pengine
haclust+ 3737 0.0 2.3 140292 7892 ? Ss 08:18 0:00 /usr/libexec/pacemaker/crmd
root 3823 0.0 0.0 0 0 ? S 08:22 0:00 [kworker/0:2]
root 3825 0.0 0.5 155324 1876 pts/2 R+ 08:24 0:00 ps aux
root 3826 0.0 0.1 108200 664 pts/2 S+ 08:24 0:00 tail
Check whether Pacemaker started correctly:
[root@node2 corosync]# grep pcmk_startup /var/log/cluster/corosync.log
[root@node2 corosync]# crm_m
crm_master crm_mon
crm_mon is a component shipped with Pacemaker that monitors the running state of the cluster's resources and related information:
[root@node2 corosync]# crm_mon
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 08:25:22 2018
Last change: Thu Aug 2 08:18:46 2018 by hacluster via crmd on cen7.field.com
3 nodes configured
0 resources configured
Online: [ cen7.field.com node1.field.com node2.field.com ]
#as shown, all three nodes are online
No active resources
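Note that crm_mon runs as a continuously refreshing monitor (exit with Ctrl+C); to print the cluster status once and return to the shell, the one-shot form can be used:
crm_mon -1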
[root@node2 corosync]# crm
crmadmin crm_error crm_mon crm_resource crm_standby
crm_attribute crm_failcount crm_node crm_shadow crm_ticket
crm_diff crm_master crm_report crm_simulate crm_verify
#Pacemaker ships with many crm_* commands; use -h to see how each tool is used
Note: Pacemaker's own crm_* commands are enough to perform every operation; they are just less convenient and user-friendly than crmsh.
crm_node: management tool for cluster-node related operations
[root@node2 corosync]# crm_node -h
Show usage help
[root@node2 corosync]# crm_node -n
node2.field.com
Print the name of the local node
[root@node2 corosync]# crm_node -l
List all member nodes:
3 node1.field.com member
2 node2.field.com member
1 cen7.field.com member
The "crm_verify -L -V" command checks whether the configuration is valid:
[root@node2 corosync]# crm_verify -L -V
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
#These errors appear because Pacemaker enables STONITH by default, while the current cluster has no STONITH device, so the default configuration is unusable; STONITH can be disabled for now with a single property, shown below and applied in section VI.
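The property in question is the crmsh one-liner used in section VI below, shown here for reference:
crm configure property stonith-enabled=false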
III. Install crmsh to manage the cluster
[root@cen7 ~]# ls
anaconda-ks.cfg crmsh-3.0.0-6.2.noarch.rpm crmsh-scripts-3.0.0-6.2.noarch.rpm pssh-2.3.1-7.3.noarch.rpm python-parallax-1.0.0a1-7.1.noarch.rpm python-pssh-2.3.1-7.3.noarch.rpm
[root@cen7 ~]# yum -y install *.rpm
已安装:
crmsh.noarch 0:3.0.0-6.2 crmsh-scripts.noarch 0:3.0.0-6.2 pssh.noarch 0:2.3.1-7.3
python-parallax.noarch 0:1.0.0a1-7.1 python-pssh.noarch 0:2.3.1-7.3
作为依赖被安装:
python-dateutil.noarch 0:1.5-7.el7 python-lxml.x86_64 0:3.2.1-4.el7 rsync.x86_64 0:3.1.2-4.el7
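A quick way to confirm that crmsh is installed and can talk to the running cluster (Corosync and Pacemaker are already up at this point):
crm status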
IV. Configure the NFS server
1. Create the NFS shared directory
[root@node1 ~]# mkdir /www/hadocs -pv
mkdir: 已创建目录 "/www"
mkdir: 已创建目录 "/www/hadocs"
2. Create a test page
[root@node1 ~]# echo "
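The echo command above writes a simple test page into the shared directory; any placeholder content will do for testing, for example (the page text here is only an illustrative assumption):
echo "NFS test page on node1.field.com" > /www/hadocs/index.html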
3. Edit the NFS server configuration file /etc/exports to specify the directory to share and its permissions
[root@node1 hadocs]# vim /etc/exports
/www/hadocs 192.168.88.0/24(rw)
4. Start the NFS service
[root@node1 hadocs]# service nfs start
Redirecting to /bin/systemctl start nfs.service
[root@node1 hadocs]# service nfs status
Redirecting to /bin/systemctl status nfs.service
● nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled)
Active: active (exited) since 四 2018-08-02 08:46:26 CST; 3s ago
Process: 2223 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
Process: 2219 ExecStartPre=/bin/sh -c /bin/kill -HUP `cat /run/gssproxy.pid` (code=exited, status=0/SUCCESS)
Process: 2217 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
Main PID: 2223 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/nfs-server.service
8月 02 08:46:25 node1.field.com systemd[1]: Starting NFS server and services...
8月 02 08:46:26 node1.field.com systemd[1]: Started NFS server and services.
5. Enable the service at boot
[root@node1 hadocs]# chkconfig nfs on
注意:正在将请求转发到“systemctl enable nfs.service”。
Created symlink from /etc/systemd/system/multi-user.target.wants/nfs-server.service to /usr/lib/systemd/system/nfs-server.service.
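The export can be double-checked before configuring the web nodes, for example:
exportfs -v                      #on node1: list the active exports
showmount -e 192.168.88.134      #from any other node: query the export list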
V. Install and configure httpd on each node and verify it
1. Install and configure httpd on the cen7 node
1) Install httpd with yum
[root@cen7 ~]# yum install -y httpd
2) Mount the NFS shared directory on the httpd document root
[root@cen7 ~]# mount -t nfs 192.168.88.134:/www/hadocs /var/www/html/
3) Confirm the mount succeeded
[root@cen7 ~]# mount | grep nfs
192.168.88.134:/www/hadocs on /var/www/html type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.88.132,local_lock=none,addr=192.168.88.134)
4) Start httpd and test that the page served from the NFS share is reachable
[root@cen7 ~]# systemctl start httpd.service
Fetch the home page with curl to confirm that the NFS export and mount are working:
[root@cen7 ~]# curl 192.168.88.132
5) After a successful test, stop httpd, enable it at boot, and unmount the NFS share
[root@cen7 ~]# systemctl stop httpd.service
[root@cen7 ~]# systemctl enable httpd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
[root@cen7 ~]# umount /var/www/html/
2. Install httpd on node1 and node2
Note: enable httpd at boot, otherwise Pacemaker may fail to find the httpd resource agent.
(The author initially did not enable httpd at boot, and Pacemaker could not find the httpd resource agent.)
[root@node2 corosync]# yum install httpd -y
[root@node2 corosync]# systemctl enable httpd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
[root@node1 hadocs]# yum install httpd -y
[root@node1 hadocs]# systemctl enable httpd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/httpd.service to /usr/lib/systemd/system/httpd.service.
VI. Configure the Corosync+Pacemaker+NFS+httpd HA cluster with crmsh
1. Use crm interactive mode: disable STONITH
[root@cen7 ~]# crm
crm(live)# configure
#configure: enter the configuration sub-shell, used to set cluster parameters
#in configure mode, the show command displays the current configuration
crm(live)configure# show
node 1: cen7.field.com
node 2: node2.field.com
node 3: node1.field.com
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.18-11.el7_5.3-2b07d5c5a9 \
cluster-infrastructure=corosync
crm(live)configure# cd
#STONITH is enabled by default, but the current cluster has no STONITH device, so the default configuration is unusable; disable STONITH first with the following command
crm(live)# configure property stonith-enabled=false
crm(live)# configure verify
#check whether the current configuration is valid, equivalent to crm_verify -L
crm(live)# configure commit
#commit the configuration to make it take effect; changes made in the configure sub-shell are not saved until committed (the one-shot "configure property" command above was applied immediately, which is why crmsh reports nothing to commit below)
INFO: apparently there is nothing to commit
INFO: try changing something first
crm(live)# configure
crm(live)configure# show
node 1: cen7.field.com
node 2: node2.field.com
node 3: node1.field.com
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.18-11.el7_5.3-2b07d5c5a9 \
cluster-infrastructure=corosync \
stonith-enabled=false
Run "crm_verify -L -V" again to confirm the configuration; no errors are reported this time
[root@cen7 ~]# crm_verify -L -V
2. Define the cluster resources and constraints
1) Define the VIP resource: ocf:heartbeat:IPaddr2
#primitive defines a resource
#monitor defines the resource monitoring operation: interval is how often the monitor runs, timeout is how long the monitor may take before it is considered failed
The following command configures the virtual IP address "192.168.88.188" with a 30s monitor interval and a 20s timeout:
crm(live)configure# primitive webip ocf:heartbeat:IPaddr2 params ip="192.168.88.188" op monitor interval=30s timeout=20s
crm(live)configure# verify
crm(live)configure# show
node 1: cen7.field.com
node 2: node2.field.com
node 3: node1.field.com
primitive webip IPaddr2 \
params ip=192.168.88.188 \
op monitor interval=30s timeout=20s
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.18-11.el7_5.3-2b07d5c5a9 \
cluster-infrastructure=corosync \
stonith-enabled=false
2) Define the httpd resource: services installed with yum are managed by systemd; use ra --> classes --> list systemd to confirm that an httpd agent is available
crm(live)# ra
#resource agents: show which resource agents are available
crm(live)ra# classes
#resource agent classes: lsb, ocf, systemd, service; OCF resource agent providers: heartbeat, openstack, pacemaker
lsb
ocf / .isolation heartbeat openstack pacemaker
service
systemd
crm(live)ra# list systemd
NetworkManager NetworkManager-dispatcher
....
httpd  -->httpd appears in the systemd resource agent list          initrd-cleanup
.....
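The parameters and default operation timeouts of an agent can also be inspected from the ra sub-shell with the info command, for example:
crm(live)ra# info systemd:httpd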
#The following command defines the httpd resource with a 20s monitor interval and a 100s timeout
crm(live)configure# primitive webserver systemd:httpd op monitor interval=20s timeout=100s
crm(live)configure# edit
crm(live)configure# verify
3) Define the NFS filesystem resource: filesystem resources belong to the ocf class
Resource configuration options:
params (resource parameters): for a Filesystem resource the required parameters are device (the device or NFS export), directory (the mount point) and fstype (the filesystem type)
op start options: timeout (how long the start operation may take)
op stop options: timeout (how long the stop operation may take)
op monitor options: interval (how often the resource is checked) and timeout (how long a check may take before it is considered failed)
#The following command creates the NFS filesystem resource: it mounts the NFS export "192.168.88.134:/www/hadocs" on /var/www/html, with a 60s start timeout, a 60s stop timeout, a 20s monitor interval and a 40s monitor timeout
crm(live)configure# primitive webstore ocf:heartbeat:Filesystem params device="192.168.88.134:/www/hadocs" directory="/var/www/html" fstype="nfs" op start timeout=60s op stop timeout=60s op monitor interval=20s timeout=40s
crm(live)configure# verify
4) Define a colocation constraint: the resources must run on the same node
Constraint types:
colocation #colocation constraint: specifies which resources are bound together and run on the same node
order #ordering constraint: specifies the start order of the resources in the colocation; note that the start order is the reverse of the order in which the colocation lists them
location #location constraint: specifies which nodes a resource prefers to run on
Constraints are defined here in the order: colocation --> order --> location
"inf:" means a score of positive infinity; "-inf:" means negative infinity, and two resources colocated with -inf: never run on the same node.
#The following constraint gives webserver an infinite colocation score with (webip webstore), i.e. the three resources always run on the same node.
crm(live)configure# colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore)
crm(live)configure# verify
crm(live)configure# show
node 1: cen7.field.com
node 2: node2.field.com
node 3: node1.field.com
primitive webip IPaddr2 \
params ip=192.168.88.188 \
op monitor interval=30s timeout=20s
primitive webserver systemd:httpd \
op monitor interval=20s timeout=100s
primitive webstore Filesystem \
params device="192.168.88.134:/www/hadocs" directory="/var/www/html" fstype=nfs \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0 \
op monitor interval=20s timeout=40s
colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.18-11.el7_5.3-2b07d5c5a9 \
cluster-infrastructure=corosync \
stonith-enabled=false
5) Define the ordering constraints
crm(live)configure# show xml
#show xml displays the complete configuration in XML format
#use help order to see how ordering constraints are defined
crm(live)configure# help order
#Mandatory (the default) means a hard constraint; the following line forces the webip resource to start before the webstore resource
crm(live)configure# order webstore_after_webip Mandatory: webip webstore
crm(live)configure# verify
#The following line forces the webstore resource to start before the webserver resource
crm(live)configure# order webserver_after_webstore Mandatory: webstore webserver
crm(live)configure# verify
crm(live)configure# show xml
crm(live)configure# verify
crm(live)configure# commit
Check the resource status: all three resources have been started on the cen7 node
crm(live)# status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:25:20 2018
Last change: Thu Aug 2 09:24:58 2018 by root via cibadmin on cen7.field.com
3 nodes configured
3 resources configured
Online: [ cen7.field.com node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started cen7.field.com
webserver (systemd:httpd): Started cen7.field.com
webstore (ocf::heartbeat:Filesystem): Started cen7.field.com
3. Confirm that the HA cluster is configured correctly
1) Confirm that the VIP has been brought up:
As shown below, the VIP 192.168.88.188 is up on cen7
[root@cen7 ~]# ip addr list| grep ens
2: ens32:
inet 192.168.88.132/24 brd 192.168.88.255 scope global noprefixroute ens32
inet 192.168.88.188/24 brd 192.168.88.255 scope global secondary ens32
2) Confirm the NFS share is correctly mounted on cen7
[root@cen7 ~]# mount| grep nfs
192.168.88.134:/www/hadocs on /var/www/html type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.88.132,local_lock=none,addr=192.168.88.134)
3) Confirm httpd is running on cen7
[root@cen7 ~]# ss -tnl|grep 80
LISTEN 0 128 :::80 :::*
4) Confirm the web page is reachable
[root@cen7 ~]# curl http://192.168.88.188/
4. Test the availability of the highly available web service
1) Manually put the current node into standby and verify that the resources fail over
"crm node standby" puts the current node into standby:
[root@cen7 ~]# crm node standby
"crm status" shows cluster and node status information:
[root@cen7 ~]# crm status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:28:39 2018
Last change: Thu Aug 2 09:28:29 2018 by root via crm_attribute on cen7.field.com
3 nodes configured
3 resources configured
Node cen7.field.com: standby --> the cen7 node has switched to standby
Online: [ node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started node1.field.com
webserver (systemd:httpd): Started node1.field.com
webstore (ocf::heartbeat:Filesystem): Started node1.field.com
#As shown, all three resources have moved to node1
Confirm that the VIP failed over: the VIP 192.168.88.188 is now up on node1
[root@node1 hadocs]# ip addr list| grep ens
2: ens34:
inet 192.168.88.134/24 brd 192.168.88.255 scope global noprefixroute ens34
inet 192.168.88.188/24 brd 192.168.88.255 scope global secondary ens34
Confirm the NFS share is correctly mounted on node1
[root@node1 hadocs]# mount | grep nfs
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
192.168.88.134:/www/hadocs on /var/www/html type nfs4 (rw,relatime,vers=4.1,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.88.134,local_lock=none,addr=192.168.88.134)
Confirm the web page is reachable
[root@node1 hadocs]# curl 192.168.88.188
[root@node1 hadocs]#
2) Bring the standby node back online; with no location constraint defined, it does not take the resources back
"crm node online" brings the current node back online:
[root@cen7 ~]# crm node online
[root@cen7 ~]# crm status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:32:07 2018
Last change: Thu Aug 2 09:32:05 2018 by root via crm_attribute on cen7.field.com
3 nodes configured
3 resources configured
Online: [ cen7.field.com node1.field.com node2.field.com ] --> the cen7 node is back online
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started node1.field.com
webserver (systemd:httpd): Started node1.field.com
webstore (ocf::heartbeat:Filesystem): Started node1.field.com
#the resources are still running on node1
5. Define a location constraint
1) Specify which nodes a resource prefers to run on: define a location preference score for a single node
crm(live)# configure
#help location shows how location constraints are defined
crm(live)configure# help location
#The following gives the webip resource a location preference score of 100 on cen7.field.com (no such preference exists by default).
crm(live)configure# location webservice_prefer_cen7 webip 100: cen7.field.com
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
Check whether cen7 takes the resources back:
crm(live)# status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:36:01 2018
Last change: Thu Aug 2 09:35:53 2018 by root via cibadmin on cen7.field.com
3 nodes configured
3 resources configured
Online: [ cen7.field.com node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started cen7.field.com
webserver (systemd:httpd): Starting cen7.field.com
webstore (ocf::heartbeat:Filesystem): Started cen7.field.com
#After cen7 is defined as the preferred node, it has taken the resources back.
If cen7 is put into standby again and then brought back online, it still takes the resources back:
[root@cen7 ~]# crm node standby
[root@cen7 ~]# crm status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:37:20 2018
Last change: Thu Aug 2 09:37:04 2018 by root via crm_attribute on cen7.field.com
3 nodes configured
3 resources configured
Node cen7.field.com: standby
Online: [ node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started node1.field.com
webserver (systemd:httpd): Started node1.field.com
webstore (ocf::heartbeat:Filesystem): Started node1.field.com
[root@cen7 ~]# crm node online
[root@cen7 ~]# crm status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:38:11 2018
Last change: Thu Aug 2 09:37:26 2018 by root via crm_attribute on cen7.field.com
3 nodes configured
3 resources configured
Online: [ cen7.field.com node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started cen7.field.com
webserver (systemd:httpd): Started cen7.field.com
webstore (ocf::heartbeat:Filesystem): Started cen7.field.com
2) Define a default resource stickiness for the cluster
With a default resource stickiness of 50, the three colocated resources on a node accumulate 3*50 = 150 of stickiness, which outweighs the location preference score of 100, so after standby and online cen7 no longer takes the resources back (the scores can be inspected with crm_simulate, shown after the configure block below).
crm(live)# configure
crm(live)configure# property default-resource-stickiness=50
crm(live)configure# commit
ERROR: Warnings found during check: config may not be valid
Do you still want to commit (y/n)? y
crm(live)configure# cd
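Before repeating the standby/online test, the allocation scores behind this behaviour can be inspected with Pacemaker's crm_simulate (a rough check, run on any node; -s shows the scores, -L uses the live cluster state):
crm_simulate -sL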
[root@cen7 ~]# crm node standby
[root@cen7 ~]# crm status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:40:28 2018
Last change: Thu Aug 2 09:39:54 2018 by root via crm_attribute on cen7.field.com
3 nodes configured
3 resources configured
Node cen7.field.com: standby
Online: [ node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started node1.field.com
webserver (systemd:httpd): Started node1.field.com
webstore (ocf::heartbeat:Filesystem): Started node1.field.com
[root@cen7 ~]# crm node online
[root@cen7 ~]# crm status
Stack: corosync
Current DC: cen7.field.com (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Thu Aug 2 09:41:53 2018
Last change: Thu Aug 2 09:40:37 2018 by root via crm_attribute on cen7.field.com
3 nodes configured
3 resources configured
Online: [ cen7.field.com node1.field.com node2.field.com ]
Full list of resources:
webip (ocf::heartbeat:IPaddr2): Started node1.field.com
webserver (systemd:httpd): Started node1.field.com
webstore (ocf::heartbeat:Filesystem): Started node1.field.com
#As shown, the resources are no longer taken back: cen7's location preference of 100 is now outweighed by the accumulated stickiness, so the preferred node no longer wins.
Appendix: the complete crm resource configuration
crm(live)# configure
crm(live)configure# show
node 1: cen7.field.com \
attributes standby=off
node 2: node2.field.com
node 3: node1.field.com
primitive webip IPaddr2 \
params ip=192.168.88.188 \
op monitor interval=30s timeout=20s
primitive webserver systemd:httpd \
op monitor interval=20s timeout=100s
primitive webstore Filesystem \
params device="192.168.88.134:/www/hadocs" directory="/var/www/html" fstype=nfs \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0 \
op monitor interval=20s timeout=40s
order webserver_after_webstore Mandatory: webstore webserver
colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
location webservice_prefer_cen7 webip 100: cen7.field.com
order webstore_after_webip Mandatory: webip webstore
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.18-11.el7_5.3-2b07d5c5a9 \
cluster-infrastructure=corosync \
stonith-enabled=false \
default-resource-stickiness=50