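IV. Testing
Debugging tools:
cman_tool, clustat, clusvcadm, init 0
Test goals:
1. Verify that resources fail over between nodes
2. Explore the quorum and fence mechanisms
Test approach:
With the qdiskd service off, check the quorum votes, then shut down (or simulate a crash of) the nodes one by one, watching the logs and the quorum;
With the qdiskd service on, check the quorum votes, then shut down (or simulate a crash of) the nodes one by one, watching the logs and the quorum;
Phase 1: (removing nodes that do not hold the resource)
All nodes are healthy, the resource is running on web3, and qdisk is not enabled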
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 21:44:07 2014
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
web1.rocker.com 1 Online, Local,rgmanager
web2.rocker.com 2 Online, rgmanager
web3.rocker.com 3 Online, rgmanager
web4.rocker.com 4 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web3.rocker.com started
[root@web1 ~]# cman_tool status
……
Nodes: 4
Expected votes: 8
Total votes: 8
Node votes: 2
Quorum: 5
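# So Quorum = (Total votes / 2) + 1 with integer division: 8 / 2 + 1 = 5. The documentation says the cluster is suspended once the votes drop below this value; let's test whether that is really the case.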
……
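The figures in this output (Total votes 8, Quorum 5) are consistent with a simple majority rule, which can be sketched in shell. `quorum_for` is a hypothetical helper written for this note, not a cman command, and the rule itself is inferred from the numbers shown here rather than taken from the cman source.

```shell
#!/bin/sh
# Hypothetical helper (not part of cman): majority quorum for a vote total.
quorum_for() {
    # Integer division by two, plus one: a strict majority of the votes.
    echo $(( $1 / 2 + 1 ))
}

# Four nodes with 2 votes each give 8 votes in total.
quorum_for 8
```

The same helper can be fed the Total votes value read from `cman_tool status` after each shutdown to compare against the reported Quorum.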
=====Step1: shut down the web2 node=====
[root@web1 ~]# ssh root@web2 'halt'
Check the log
[root@web1 ~]# tail /var/log/messages
Sep 22 21:49:59 web1 clurgmgrd[4807]:
Sep 22 21:50:17 web1 openais[4555]: [TOTEM] The token was lost in the OPERATIONAL state.
Sep 22 21:50:17 web1 openais[4555]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Sep 22 21:50:17 web1 openais[4555]: [TOTEM] Transmit multicast socket send buffer size (221184 bytes).
Sep 22 21:50:17 web1 openais[4555]: [TOTEM] entering GATHER state from 2.
Sep 22 21:50:29 web1 openais[4555]: [TOTEM] entering GATHER state from 0.
Sep 22 21:50:29 web1 openais[4555]: [TOTEM] Creating commit token because I am the rep.
Sep 22 21:50:29 web1 openais[4555]: [TOTEM] Storing new sequence id for ring 14
Sep 22 21:50:29 web1 openais[4555]: [TOTEM] entering COMMIT state.
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] entering RECOVERY state.
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] position [0] member 192.168.1.201:
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] previous ring seq 16 rep 192.168.1.201
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] aru bc high delivered bc received flag 1
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] position [1] member 192.168.1.203:
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] previous ring seq 16 rep 192.168.1.201
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] aru bc high delivered bc received flag 1
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] position [2] member 192.168.1.204:
Sep 22 21:50:30 web1 openais[4555]: [TOTEM] previous ring seq 16 rep 192.168.1.201
Sep 22 21:50:31 web1 openais[4555]: [TOTEM] aru bc high delivered bc received flag 1
Sep 22 21:50:31 web1 openais[4555]: [TOTEM] Did not need to originate any messages in recovery.
Sep 22 21:50:31 web1 openais[4555]: [TOTEM] Sending initial ORF token
Sep 22 21:50:31 web1 last message repeated 2 times
Sep 22 21:50:32 web1 openais[4555]: [CLM ] CLM CONFIGURATION CHANGE
Sep 22 21:50:32 web1 openais[4555]: [CLM ] New Configuration:
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.201)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.203)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.204)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] Members Left:
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.202)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] Members Joined:
Sep 22 21:50:32 web1 openais[4555]: [CLM ] CLM CONFIGURATION CHANGE
Sep 22 21:50:32 web1 openais[4555]: [CLM ] New Configuration:
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.201)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.203)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.204)
Sep 22 21:50:32 web1 kernel: dlm: closing connection to node 2
Sep 22 21:50:32 web1 openais[4555]: [CLM ] Members Left:
Sep 22 21:50:32 web1 openais[4555]: [CLM ] Members Joined:
Sep 22 21:50:32 web1 openais[4555]: [SYNC ] This node is within the primary component and will provide service.
Sep 22 21:50:32 web1 openais[4555]: [TOTEM] entering OPERATIONAL state.
Sep 22 21:50:32 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.201
Sep 22 21:50:32 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.203
Sep 22 21:50:32 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.204
Sep 22 21:50:32 web1 openais[4555]: [CPG ] got joinlist message from node 4
Sep 22 21:50:32 web1 openais[4555]: [CPG ] got joinlist message from node 1
Sep 22 21:50:32 web1 openais[4555]: [CPG ] got joinlist message from node 3
# The log shows the cluster membership being re-formed
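Reading the membership change out of a long log by eye is error-prone, so here is a minimal sketch that prints only the addresses listed under "Members Left:". `members_left` is a hypothetical helper, and it assumes the openais line layout shown in this log (an `r(0) ip(...)` entry per member between the "Members Left:" and "Members Joined:" lines).

```shell
#!/bin/sh
# Hypothetical helper: print the addresses that appear under "Members Left:"
# in openais [CLM] log lines read from stdin.
members_left() {
    awk '
        /\[CLM *\] Members Left:/   { in_left = 1; next }
        /\[CLM *\] Members Joined:/ { in_left = 0 }
        in_left && match($0, /ip\([0-9.]+\)/) {
            # Drop the leading "ip(" and the trailing ")".
            print substr($0, RSTART + 3, RLENGTH - 4)
        }
    '
}

# Sample lines taken from the log above.
members_left <<'EOF'
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.204)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] Members Left:
Sep 22 21:50:32 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.202)
Sep 22 21:50:32 web1 openais[4555]: [CLM ] Members Joined:
EOF
```

On a live node the same function could be run as `members_left < /var/log/messages`.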
Check the members
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 21:50:54 2014
Member Status: Quorate # the cluster is available
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online,Local, rgmanager
web2.rocker.com 2 Offline
web3.rocker.com 3 Online,rgmanager
web4.rocker.com 4 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web3.rocker.com started
Check the quorum
[root@web1 ~]# cman_tool status
……
Nodes: 3
Expected votes: 8
Total votes: 6
Node votes: 2
Quorum: 4
……
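# Compare with the values before the shutdown (Total votes = 8, Quorum = 5): Total votes is now 6, down by web2's 2 votes, and the quorum has been recalculated as well. In other words, the quorum floats.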
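To make this before/after comparison repeatable, the relevant fields can be pulled out of the status output with a small filter. `cman_field` is a hypothetical helper; it assumes the `Name: value` layout shown in the listings here and is fed a copy of that output rather than a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the first word of one "Name: value" field
# from `cman_tool status` output read from stdin.
cman_field() {
    awk -F': *' -v key="$1" '$1 == key { split($2, parts, " "); print parts[1] }'
}

# Status lines copied from the output after web2 was shut down.
status='Nodes: 3
Expected votes: 8
Total votes: 6
Node votes: 2
Quorum: 4'

total=$(printf '%s\n' "$status" | cman_field 'Total votes')
quorum=$(printf '%s\n' "$status" | cman_field 'Quorum')
echo "total=$total quorum=$quorum margin=$(( total - quorum ))"
```

On a cluster node the input would come from the real command, for example `cman_tool status | cman_field 'Quorum'`.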
=====Step2: shut down the web4 node=====
[root@web1 ~]# ssh root@web4 'halt'
Check the node status
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 21:55:26 2014
Member Status: Quorate # the cluster is available
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online,Local, rgmanager
web2.rocker.com 2 Offline
web3.rocker.com 3 Online, rgmanager
web4.rocker.com 4 Offline
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web3.rocker.com started
Check the quorum
[root@web1 ~]# cman_tool status
……
Nodes: 2
Expected votes: 8
Total votes: 4
Node votes: 2
Quorum: 3
……
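# As you can see, the quorum keeps floating, which means the cluster as a whole stays up even with a single node left. Note that this only holds when the nodes are shut down cleanly.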
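The arithmetic behind this observation can be sketched as follows. Both helpers are hypothetical, and the last check rests on an assumption that this part of the test does not verify: that after an unclean failure the quorum is not lowered and stays at its original value of 5.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; neither is a cman command.
quorum_for() { echo $(( $1 / 2 + 1 )); }   # majority of the given votes
is_quorate() { [ "$1" -ge "$2" ]; }        # votes present >= quorum

votes_per_node=2
# Clean shutdowns: the quorum is recomputed from the remaining votes.
for nodes in 4 3 2 1; do
    votes=$(( nodes * votes_per_node ))
    q=$(quorum_for "$votes")
    if is_quorate "$votes" "$q"; then state=quorate; else state=inquorate; fi
    echo "nodes=$nodes votes=$votes quorum=$q $state"
done

# Assumption: two nodes crash instead, so the quorum stays at 5.
if is_quorate 4 5; then
    echo "votes=4 fixed quorum=5 quorate"
else
    echo "votes=4 fixed quorum=5 inquorate"
fi
```

Under the recomputed rule every row, including the single-node one, comes out quorate, which matches what the listings above show for clean shutdowns.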
*****************************************************************
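Phase 2: this time, shut down the node that holds the resource
First bring web2 and web4 back: power them on and start the cluster services
Log: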
Sep 22 22:03:17 web1 kernel: dlm: got connection from 2
Sep 22 22:03:20 web1 kernel: dlm: got connection from 4
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 22:03:50 2014
Member Status: Quorate
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online, Local, rgmanager
web2.rocker.com 2 Online,rgmanager
web3.rocker.com 3 Online,rgmanager
web4.rocker.com 4 Online,rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web3.rocker.com started
[root@web1 ~]# cman_tool status
……
Nodes: 4
Expected votes: 8
Total votes: 8
Node votes: 2
Quorum: 5
……
===Step1: now shut down web3, which holds the resource, and see how the cluster behaves===
[root@web1 ~]# ssh root@web3 'halt'
[root@web1 ~]# tail /var/log/messages
Sep 22 22:05:31 web1 clurgmgrd[4807]:
Sep 22 22:06:04 web1 openais[4555]: [TOTEM] The token was lost in the OPERATIONAL state.
Sep 22 22:06:04 web1 openais[4555]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Sep 22 22:06:04 web1 openais[4555]: [TOTEM] Transmit multicast socket send buffer size (221184 bytes).
Sep 22 22:06:04 web1 openais[4555]: [TOTEM] entering GATHER state from 2.
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] entering GATHER state from 11.
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] Creating commit token because I am the rep.
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] Storing new sequence id for ring 24
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] entering COMMIT state.
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] entering RECOVERY state.
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] position [0] member 192.168.1.201:
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] previous ring seq 32 rep 192.168.1.201
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] aru 8d high delivered 8d received flag 1
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] position [1] member 192.168.1.202:
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] previous ring seq 32 rep 192.168.1.201
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] aru 8d high delivered 8d received flag 1
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] position [2] member 192.168.1.204:
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] previous ring seq 32 rep 192.168.1.201
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] aru 8d high delivered 8d received flag 1
Sep 22 22:06:16 web1 kernel: dlm: closing connection to node 3
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] Did not need to originate any messages in recovery.
Sep 22 22:06:16 web1 openais[4555]: [TOTEM] Sending initial ORF token
Sep 22 22:06:17 web1 openais[4555]: [CLM ] CLM CONFIGURATION CHANGE
Sep 22 22:06:17 web1 openais[4555]: [CLM ] New Configuration:
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.201)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.202)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.204)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] Members Left:
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.203)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] Members Joined:
Sep 22 22:06:17 web1 openais[4555]: [CLM ] CLM CONFIGURATION CHANGE
Sep 22 22:06:17 web1 openais[4555]: [CLM ] New Configuration:
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.201)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.202)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] r(0)ip(192.168.1.204)
Sep 22 22:06:17 web1 openais[4555]: [CLM ] Members Left:
Sep 22 22:06:17 web1 openais[4555]: [CLM ] Members Joined:
Sep 22 22:06:17 web1 openais[4555]: [SYNC ] This node is within the primary component and will provide service.
Sep 22 22:06:17 web1 openais[4555]: [TOTEM] entering OPERATIONAL state.
Sep 22 22:06:17 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.201
Sep 22 22:06:17 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.202
Sep 22 22:06:17 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.204
Sep 22 22:06:17 web1 openais[4555]: [CPG ] got joinlist message from node 4
Sep 22 22:06:18 web1 openais[4555]: [CPG ] got joinlist message from node 1
Sep 22 22:06:18 web1 openais[4555]: [CPG ] got joinlist message from node 2
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 22:06:37 2014
Member Status: Quorate # the cluster is available
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online,Local, rgmanager
web2.rocker.com 2 Online, rgmanager
web3.rocker.com 3 Offline
web4.rocker.com 4 Online,rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web2.rocker.com started
# The cluster resource has moved to the web2 node; test that it works.
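When this failover check is repeated for each step, it helps to extract just the current owner of the service instead of reading the whole table. `service_owner` is a hypothetical helper that assumes the owner is the second whitespace-separated column of the service row, as in the listings here.

```shell
#!/bin/sh
# Hypothetical helper: print the owner column of one service row
# from `clustat` output read from stdin.
service_owner() {
    awk -v svc="$1" '$1 == svc { print $2 }'
}

# Service row copied from the clustat output above.
printf '%s\n' 'service:myservice web2.rocker.com started' |
    service_owner service:myservice
```

On a cluster node this would be used as `clustat | service_owner service:myservice`, and the result compared with the node that was just shut down.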
Check the quorum
[root@web1 ~]# cman_tool status
……
Nodes: 3
Expected votes: 8
Total votes: 6
Node votes: 2
Quorum: 4
……
# Compared with the earlier values, the quorum has floated here as well.
=====Step2: shut down web2=====
[root@web1 ~]# ssh root@web2 'halt'
[root@web1 ~]# tail /var/log/messages
Sep 22 22:09:30 web1 clurgmgrd[4807]:
Sep 22 22:09:30 web1 clurgmgrd[4807]:
Sep 22 22:09:33 web1 avahi-daemon[3919]: Registering new address record for 192.168.1.200 on eth0. # the resource has moved to the web1 node
Sep 22 22:09:34 web1 clurgmgrd[4807]:
……
Sep 22 22:10:15 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.201
Sep 22 22:10:15 web1 openais[4555]: [CLM ] got nodejoin message 192.168.1.204
Sep 22 22:10:15 web1 openais[4555]: [CPG ] got joinlist message from node 4
Sep 22 22:10:15 web1 openais[4555]: [CPG ] got joinlist message from node 1
Check the node status
[root@web1 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 22:11:25 2014
Member Status: Quorate
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Online,Local, rgmanager
web2.rocker.com 2 Offline
web3.rocker.com 3 Offline
web4.rocker.com 4 Online,rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web1.rocker.com started
Test
=====Step3: shut down the web1 node=====
[root@web4 ~]# ssh root@web1 'halt'
[root@web4 ~]# clustat
Cluster Status for mycluster @ Mon Sep 22 22:31:02 2014
Member Status: Quorate # the cluster is still available
Member Name ID Status
------ ---- ----------
web1.rocker.com 1 Offline
web2.rocker.com 2 Offline
web3.rocker.com 3 Offline
web4.rocker.com 4 Online,Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:myservice web4.rocker.com started
Test
### Post length limit reached ###
### Continued in the next post ###