Phase 3: This time, let's test an abnormal (crash) exit
Ways to simulate a crash:
Method 1: suspend the virtual machine
Method 2: echo c > /proc/sysrq-trigger
Restore all nodes first (omitted)
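A note on method 2: the sysrq trigger only works when magic sysrq is enabled, and `echo c` causes an immediate kernel panic, so it must only be run on disposable test VMs. A quick check, as a sketch (the destructive commands are left as comments):

```shell
# Check whether magic sysrq is enabled; non-zero means (at least partly) enabled.
# This file only exists on Linux.
if [ -r /proc/sys/kernel/sysrq ]; then
    cat /proc/sys/kernel/sysrq
fi
# If it printed 0, enable it first:
#   echo 1 > /proc/sys/kernel/sysrq
# Then, on the node to crash (destructive -- test VMs only):
#   echo c > /proc/sysrq-trigger   # immediate kernel panic, no clean shutdown
```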
[root@web1 ~]# cman_tool status
……
Nodes: 4
Expected votes: 8
Total votes: 8
Node votes: 2
Quorum: 5
……
===== Step 1: Simulate a failure on node web3 =====
[root@web1 ~]# ssh root@web3 'echo c>/proc/sysrq-trigger'
Sep 22 22:40:18 web1 openais[4253]: [TOTEM] The token was lost in the OPERATIONAL state.
# Note there was no message about web3 leaving before this point; compare with the clean-shutdown logs in Phase 2.
Sep 22 22:40:18 web1 openais[4253]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Sep 22 22:40:18 web1 openais[4253]: [TOTEM] Transmit multicast socket send buffer size
……
Sep 22 22:40:30 web1 openais[4253]: [SYNC ] This node is within the primary component and will provide service.
Sep 22 22:40:30 web1 openais[4253]: [TOTEM] entering OPERATIONAL state.
Sep 22 22:40:30 web1 openais[4253]: [CLM ] got nodejoin message 192.168.1.201
Sep 22 22:40:30 web1 openais[4253]: [CLM ] got nodejoin message 192.168.1.202
Sep 22 22:40:30 web1 openais[4253]: [CLM ] got nodejoin message 192.168.1.204
Sep 22 22:40:30 web1 openais[4253]: [CPG ] got joinlist message from node 1
Sep 22 22:40:30 web1 openais[4253]: [CPG ] got joinlist message from node 2
Sep 22 22:40:30 web1 openais[4253]: [CPG ] got joinlist message from node 4
Sep 22 22:40:35 web1 fenced[4272]: fencing node "web3.rocker.com"
Sep 22 22:40:35 web1 fenced[4272]: fence "web3.rocker.com" failed
Sep 22 22:40:40 web1 fenced[4272]: fencing node "web3.rocker.com"
Sep 22 22:40:40 web1 fenced[4272]: fence "web3.rocker.com" failed
Sep 22 22:40:45 web1 fenced[4272]: fencing node "web3.rocker.com"
Sep 22 22:40:45 web1 fenced[4272]: fence "web3.rocker.com" failed
# We are using a manual fence device: when the cluster notices web3 has dropped out, it asks the administrator to fence web3.
First, let's look at the node status.
[root@web2 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 22:42:33 2014
Member Status: Quorate # the cluster is still quorate
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online, rgmanager
web2.rocker.com 2 Online, Local, rgmanager
web3.rocker.com 3 Offline
web4.rocker.com 4 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web3.rocker.com started
# But the service has not failed over yet!
Check the quorum:
[root@web2 ~]# cman_tool status
……
Nodes: 3
Expected votes: 8
Total votes: 6
Node votes: 2
Quorum: 5 # the difference shows up here
# Total votes changed, meaning web3 can no longer vote, yet Quorum did not change.
……
Let's test it.
===== Step 2: Manually fence node web3 =====
[root@web2 ~]# fence_ack_manual -n web3.rocker.com
Warning: If the node "web3.rocker.com" has not been manually fenced
(i.e. power cycled or disconnected from shared storage devices)
the GFS file system may become corrupted and all its data
unrecoverable! Please verify that the node shown above has
been reset or disconnected from storage.
Are you certain you want to continue? [yN] y
can't open /tmp/fence_manual.fifo: No such file or directory
[root@web2 ~]# touch /tmp/fence_manual.fifo
[root@web2 ~]# fence_ack_manual -n web3.rocker.com -e
Warning: If the node "web3.rocker.com" has not been manually fenced
(i.e. power cycled or disconnected from shared storage devices)
the GFS file system may become corrupted and all its data
unrecoverable! Please verify that the node shown above has
been reset or disconnected from storage.
Are you certain you want to continue? [yN] y
done
# Fence succeeded.
# tail /var/log/messages
Sep 22 22:52:25 web1 fenced[4272]: fence "web3.rocker.com" overridden by administrator intervention
Check the node status again.
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 22:54:47 2014
Member Status: Quorate
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online, Local, rgmanager
web2.rocker.com 2 Online, rgmanager
web3.rocker.com 3 Offline
web4.rocker.com 4 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web2.rocker.com started
# The service has failed over.
Test it.
Check the quorum:
[root@web1 ~]# cman_tool status
……
Nodes: 3
Expected votes: 8
Total votes: 6
Node votes: 2
Quorum: 5 # Quorum is unchanged, because Expected votes did not change either
……
===== Step 3: Keep kicking! This time suspend the web2 virtual machine =====
Sep 22 22:57:14 web1 openais[4253]: [TOTEM] The token was lost in the OPERATIONAL state.
Sep 22 22:57:14 web1 openais[4253]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Sep 22 22:57:14 web1 openais[4253]: [TOTEM] Transmit multicast socket send buffer size (221184 bytes).
Sep 22 22:57:14 web1 openais[4253]: [TOTEM] entering GATHER state from 2.
……
Sep 22 22:57:26 web1 openais[4253]: [CMAN ] quorum lost, blocking activity
……
Sep 22 22:57:26 web1 openais[4253]: [TOTEM] entering OPERATIONAL state.
Sep 22 22:57:26 web1 openais[4253]: [CLM ] got nodejoin message 192.168.1.201
Sep 22 22:57:26 web1 ccsd[4247]: Cluster is not quorate. Refusing connection.
Sep 22 22:57:26 web1 openais[4253]: [CLM ] got nodejoin message 192.168.1.204
Sep 22 22:57:26 web1 ccsd[4247]: Error while processing connect: Connection refused
Sep 22 22:57:26 web1 openais[4253]: [CPG ] got joinlist message from node 1
Sep 22 22:57:26 web1 ccsd[4247]: Invalid descriptor specified (-111).
Sep 22 22:57:26 web1 openais[4253]: [CPG ] got joinlist message from node 4
Sep 22 22:57:26 web1 ccsd[4247]: Someone may be attempting something evil.
Sep 22 22:57:26 web1 ccsd[4247]: Error while processing get: Invalid request descriptor
# Sweet!!! There it is at last!!! The cluster is blocked.
web1's shell prints an error:
Message from syslogd@ at Mon Sep 22 22:57:26 2014 ...
web1 clurgmgrd[4312]:
[root@web1 ~]# clustat
Service states unavailable: Operation requires quorum
Cluster Status for mycluster @ Mon Sep 22 22:58:05 2014
Member Status: Inquorate # the cluster is blocked!
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online,Local
web2.rocker.com 2 Offline
web3.rocker.com 3 Offline
web4.rocker.com 4 Online
# The service was not failed over.
[root@web1 ~]# cman_tool status
……
Nodes: 2
Expected votes: 8
Total votes: 4
Node votes: 2
Quorum: 5 Activity blocked
# Because Quorum > Total votes, the cluster is blocked.
……
************************************************************************
Phase 4: Configuring qdisk
This is exactly the situation qdisk exists for. We need to configure a quorum disk (we already did, while creating the cluster with system-config-cluster) and start the qdiskd service.
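For reference, the quorum-disk definition in /etc/cluster/cluster.conf looks roughly like this. This is only a sketch: the interval and tko values are illustrative assumptions; votes="2" and label="myqdisk" match the values used elsewhere in this walkthrough, and the label must match the one written by mkqdisk.

```
<!-- Sketch of the quorum-disk stanza in /etc/cluster/cluster.conf;
     interval/tko are illustrative assumptions -->
<quorumd interval="1" tko="10" votes="2" label="myqdisk"/>
```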
On node ss, create two partitions on /dev/sdb (omitted).
On node ss, edit /etc/tgt/targets.conf and add a target:
backing-store /dev/sdb
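The backing-store line above sits inside a target block; a minimal targets.conf sketch (the IQN name here is an assumption):

```
# /etc/tgt/targets.conf -- minimal sketch; the IQN name is an assumption
<target iqn.2014-09.com.rocker.ss:qdisk>
    backing-store /dev/sdb    # LUN exported to the web nodes
</target>
```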
Start the iSCSI target service on ss:
[root@ss ~]# service tgtd start
Starting SCSI target daemon: Starting target framework daemon
Have every web node start the iSCSI initiator, then discover and log in to the target:
[root@web1 ~]# for i in web1 web2 web3 web4;do ssh root@$i 'service iscsi start ; iscsiadm -m discovery -t sendtargets -p 192.168.1.205 ; iscsiadm -m node -p 192.168.1.205 -l';done
Check which device the LUN attached as:
[root@web1 ~]# dmesg
sd 1:0:0:1: Attached scsi disk sdb
sd 1:0:0:1: Attached scsi generic sg6 type 0
[root@web1 ~]# fdisk /dev/sdb -l
Disk /dev/sdb: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 63 506016 83 Linux
/dev/sdb2 64 126 506047+ 83 Linux
Create the qdisk:
Doing this on any one web node is enough.
[root@web1 ~]# mkqdisk -c /dev/sdb1 -l myqdisk
mkqdisk v0.6.0
Writing new quorum disk label 'myqdisk' to /dev/sdb1.
WARNING: About to destroy all data on /dev/sdb1; proceed [N/y] ? y
Initializing status block for node 1...
……
Initializing status block for node 16...
Start the qdiskd service on all nodes:
[root@web1 ~]# for i in web1 web2 web3 web4;do ssh root@$i 'service qdiskd start';done
Starting the Quorum Disk Daemon: [ OK ]
Starting the Quorum Disk Daemon: [ OK ]
Starting the Quorum Disk Daemon: [ OK ]
Starting the Quorum Disk Daemon: [ OK ]
Check the quorum:
[root@web1 ~]# cman_tool status
……
Nodes: 4
Expected votes: 8
Quorum device votes: 2 # the qdisk is contributing votes now
Total votes: 10
Node votes: 2
Quorum: 6
……
===== Step 1: Crash web3 =====
[root@web3 ~]# echo c>/proc/sysrq-trigger
[root@web1 ~]# clustat
Cluster Status for mycluster @ Tue Sep 23 08:55:41 2014
Member Status: Quorate # the cluster is still quorate
Member Name ID Status
------ ---- ---- ------
web1.rocker.com 1 Online, Local, rgmanager
web2.rocker.com 2 Online, rgmanager
web3.rocker.com 3 Offline
web4.rocker.com 4 Online, rgmanager
/dev/disk/by-id/scsi-1IET_00010001-p 0 Online, Quorum Disk
# The qdisk is up.
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web2.rocker.com started
# The service failed over.
[root@web1 ~]# cman_tool status
……
Nodes: 3
Expected votes: 8
Quorum device votes: 2
Total votes: 8
Node votes: 2
Quorum: 6
……
===== Step 2: Crash web2 =====
[root@web2 ~]# echo c>/proc/sysrq-trigger
[root@web1 ~]# clustat
Cluster Status for mycluster @ Tue Sep 23 08:58:25 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
web1.rocker.com 1 Online, Local, rgmanager
web2.rocker.com 2 Offline
web3.rocker.com 3 Offline
web4.rocker.com 4 Online, rgmanager
/dev/disk/by-id/scsi-1IET_00010001-p 0 Online, Quorum Disk
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web1.rocker.com started
[root@web1 ~]# cman_tool status
……
Nodes: 2
Expected votes: 8
Quorum device votes: 2
Total votes: 6
Node votes: 2
Quorum: 6 # this time the cluster does not block
……
===== Step 3: Crash web1 =====
[root@web1 ~]# echo c>/proc/sysrq-trigger
web4 clurgmgrd[4505]:
[root@web4 ~]# clustat
Service states unavailable: Operation requires quorum
Cluster Status for mycluster @ Tue Sep 23 09:02:03 2014
Member Status: Inquorate # the cluster is blocked
Member Name ID Status
------ ---- ---- ------
web1.rocker.com 1 Offline
web2.rocker.com 2 Offline
web3.rocker.com 3 Offline
web4.rocker.com 4 Online, Local
/dev/disk/by-id/scsi-1IET_00010001-p 0 Online
# The service was not failed over.
[root@web4 ~]# cman_tool status
……
Nodes: 1
Expected votes: 8
Quorum device votes: 2
Total votes: 4
Node votes: 2
Quorum: 6 Activity blocked # blocked
……
V. Conclusions:
1) On a clean shutdown, whether or not a failover is needed, the cluster automatically removes the shut-down node and recalculates Quorum, so losing more than half of the members never blocks the cluster.
2) On an abnormal crash, when the cluster detects that a node has dropped out, it asks the fence device to isolate it; Total votes is recalculated, but Quorum is not, so once Total votes falls below Quorum the cluster blocks.
3) Configuring a qdisk effectively gives the vote count a reserve: its votes are added to Total votes.
On the relationship between Quorum, votes, and Total votes:
Take 4 nodes, 2 votes per node, and a qdisk with 2 votes.
Total votes = node votes + qdisk votes; here Total votes = 4 x 2 + 2 = 10.
Expected votes = the combined node votes with every node up, plus the qdisk votes.
Quorum = Expected votes / 2 + 1; here Quorum = 10/2 + 1 = 6.
When the cluster detects a cleanly shut-down node, it recalculates both Total votes and Quorum. For example, with node3 shut down, Total votes = 2 x 3 + 2 = 8 and Quorum = 8/2 + 1 = 5.
When a node drops out of the cluster abnormally, Total votes is recalculated but Quorum is not. For example, when node2 here crashed, the cluster notified the fence device to isolate it and only then moved the service within the failover domain; Total votes = 2 x 3 + 2 = 8, but Quorum stayed at 6. When yet another node crashes, the recalculated Total votes drops below Quorum and the cluster blocks.
From this one might infer that raising the qdisk's votes lets the cluster keep working with only one server left, but 4 votes is not enough: Expected votes would be 8 + 4 = 12, Quorum 12/2 + 1 = 7, and a lone node plus the qdisk has only 2 + 4 = 6. The qdisk needs at least 6 votes (the combined votes of the other three nodes): then Expected votes = 14, Quorum = 8, and a lone node plus the qdisk reaches 2 + 6 = 8.
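The arithmetic above can be checked with a small script (a sketch using this cluster's numbers; the last-man-standing case assumes Quorum stays frozen at the value computed from the initial Expected votes, as the crash tests showed):

```shell
#!/bin/sh
# Quorum arithmetic for this cluster: 4 nodes, 2 votes per node.
NODES=4
NODE_VOTES=2

quorum() {                      # Quorum = Expected votes / 2 + 1
    echo $(( $1 / 2 + 1 ))
}

# For each qdisk vote count, can one surviving node plus the qdisk
# still reach the Quorum computed from the initial Expected votes?
for QDISK in 2 4 6; do
    EXPECTED=$(( NODES * NODE_VOTES + QDISK ))
    Q=$(quorum "$EXPECTED")
    LAST_MAN=$(( NODE_VOTES + QDISK ))
    if [ "$LAST_MAN" -ge "$Q" ]; then STATE=quorate; else STATE=blocked; fi
    echo "qdisk=$QDISK expected=$EXPECTED quorum=$Q last-man=$LAST_MAN -> $STATE"
done
```

With 2 or 4 qdisk votes a lone survivor stays blocked; 6 votes (matching the other nodes' combined votes) is the first value that keeps a single node quorate.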
Some sources say that a cluster on a GFS filesystem will block as soon as only one node remains. I don't understand the mechanism behind this, or how to set up an experiment for it, so pointers and corrections are welcome.