RHEL 5.4 Heartbeat Installation (Part 2: Testing and Monitoring)

9) Start and monitor the services

Start the heartbeat service on both HA1 and HA2:

/etc/init.d/heartbeat start

Monitor the service:

First, check the messages log on HA1:

# cat /var/log/messages

Sep 19 15:56:37 HA1 heartbeat: [26814]: info: Version 2 support: false

Sep 19 15:56:37 HA1 heartbeat: [26814]: WARN: Logging daemon is disabled --enabling logging daemon is recommended

Sep 19 15:56:37 HA1 heartbeat: [26814]: info: **************************

Sep 19 15:56:37 HA1 heartbeat: [26814]: info: Configuration validated. Starting heartbeat 3.0.2

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: heartbeat: version 3.0.2

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: Heartbeat generation: 1284708296

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: G_main_add_TriggerHandler: Added signal manual handler

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: G_main_add_TriggerHandler: Added signal manual handler

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: G_main_add_SignalHandler: Added signal handler for signal 17

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: Local status now set to: 'up'

Sep 19 15:56:41 HA1 heartbeat: [26815]: info: Link ha1:eth1 up.

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Link ha2:eth1 up.

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Status update for node ha2: status up

Sep 19 15:56:47 HA1 harc[26822]: info: Running /usr/etc/ha.d//rc.d/status status

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Comm_now_up(): updating status to active

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Local status now set to: 'active'

Sep 19 15:56:48 HA1 heartbeat: [26815]: info: Status update for node ha2: status active

Sep 19 15:56:48 HA1 harc[26842]: info: Running /usr/etc/ha.d//rc.d/status status

Sep 19 15:57:04 HA1 heartbeat: [26815]: info: remote resource transition completed.

Sep 19 15:57:04 HA1 heartbeat: [26815]: info: remote resource transition completed.

Sep 19 15:57:04 HA1 heartbeat: [26815]: info: Initial resource acquisition complete (T_RESOURCES(us))

Sep 19 15:57:04 HA1 IPaddr[26898]: INFO:  Resource is stopped

Sep 19 15:57:04 HA1 heartbeat: [26862]: info: Local Resource acquisition completed.

Sep 19 15:57:04 HA1 harc[26941]: info: Running /usr/etc/ha.d//rc.d/ip-request-resp ip-request-resp

Sep 19 15:57:04 HA1 ip-request-resp[26941]: received ip-request-resp IPaddr::172.16.6.66/21/eth0 OK yes

Sep 19 15:57:04 HA1 ResourceManager[26964]: info: Acquiring resource group: ha1 IPaddr::172.16.6.66/21/eth0 test

Sep 19 15:57:05 HA1 IPaddr[26992]: INFO:  Resource is stopped

Sep 19 15:57:05 HA1 ResourceManager[26964]: info: Running /etc/ha.d/resource.d/IPaddr 172.16.6.66/21/eth0 start

Sep 19 15:57:05 HA1 IPaddr[27077]: INFO: Using calculated netmask for 172.16.6.66: 255.255.248.0

Sep 19 15:57:05 HA1 IPaddr[27077]: INFO: eval ifconfig eth0:0 172.16.6.66 netmask 255.255.248.0 broadcast 172.16.7.255

Sep 19 15:57:05 HA1 IPaddr[27051]: INFO:  Success

Sep 19 15:57:05 HA1 logger: /etc/ha.d/resource.d/test called with status

Sep 19 15:57:05 HA1 ResourceManager[26964]: info: Running /etc/ha.d/resource.d/test  start

Sep 19 15:57:05 HA1 logger: /etc/ha.d/resource.d/test called with start

 

 

The log shows that both HA1 and HA2 have come up, that our test script has run, and that the IP address 172.16.6.66 has been brought up.
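
Reading the whole log by eye gets tedious, so here is a minimal sketch that greps for three milestone lines taken from the excerpt above. The function name check_startup is made up for this example, and the patterns assume the log wording of heartbeat 3.0.2 as shown here; other versions may word these lines differently.

```shell
# check_startup LOG_FILE: report whether the key startup milestones
# from the excerpt above appear in LOG_FILE. Returns 0 only if all do.
check_startup() {
    log_file=$1
    rc=0
    for pattern in \
        "Local status now set to: 'active'" \
        "Initial resource acquisition complete" \
        "IPaddr.*INFO:  *Success"
    do
        if grep -q -- "$pattern" "$log_file"; then
            echo "OK      $pattern"
        else
            echo "MISSING $pattern"
            rc=1
        fi
    done
    return $rc
}
```

On the primary, "check_startup /var/log/messages" should print three OK lines. On the backup node the IPaddr line is expected to be MISSING, because the backup does not bring up the address during a normal start.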

Next, look at ha-log:

 

[root@HA1 ~]# cat /var/log/ha-log

Sep 19 15:56:37 HA1 heartbeat: [26814]: info: Version 2 support: false

Sep 19 15:56:37 HA1 heartbeat: [26814]: WARN: Logging daemon is disabled --enabling logging daemon is recommended

Sep 19 15:56:37 HA1 heartbeat: [26814]: info: **************************

Sep 19 15:56:37 HA1 heartbeat: [26814]: info: Configuration validated. Starting heartbeat 3.0.2

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: heartbeat: version 3.0.2

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: Heartbeat generation: 1284708296

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: G_main_add_TriggerHandler: Added signal manual handler

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: G_main_add_TriggerHandler: Added signal manual handler

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: G_main_add_SignalHandler: Added signal handler for signal 17

Sep 19 15:56:37 HA1 heartbeat: [26815]: info: Local status now set to: 'up'

Sep 19 15:56:41 HA1 heartbeat: [26815]: info: Link ha1:eth1 up.

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Link ha2:eth1 up.

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Status update for node ha2: status up

harc[26822]:    2010/09/19_15:56:47 info: Running /usr/etc/ha.d//rc.d/status status

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Comm_now_up(): updating status to active

Sep 19 15:56:47 HA1 heartbeat: [26815]: info: Local status now set to: 'active'

Sep 19 15:56:48 HA1 heartbeat: [26815]: info: Status update for node ha2: status active

harc[26842]:    2010/09/19_15:56:48 info: Running /usr/etc/ha.d//rc.d/status status

Sep 19 15:57:04 HA1 heartbeat: [26815]: info: remote resource transition completed.

Sep 19 15:57:04 HA1 heartbeat: [26815]: info: remote resource transition completed.

Sep 19 15:57:04 HA1 heartbeat: [26815]: info: Initial resource acquisition complete (T_RESOURCES(us))

IPaddr[26898]:         2010/09/19_15:57:04 INFO:  Resource is stopped

Sep 19 15:57:04 HA1 heartbeat: [26862]: info: Local Resource acquisition completed.

harc[26941]:    2010/09/19_15:57:04 info: Running /usr/etc/ha.d//rc.d/ip-request-resp ip-request-resp

ip-request-resp[26941]:  2010/09/19_15:57:04 received ip-request-resp IPaddr::172.16.6.66/21/eth0 OK yes

ResourceManager[26964]:     2010/09/19_15:57:04 info: Acquiring resource group: ha1 IPaddr::172.16.6.66/21/eth0 test

IPaddr[26992]:         2010/09/19_15:57:05 INFO:  Resource is stopped

ResourceManager[26964]:     2010/09/19_15:57:05 info: Running /etc/ha.d/resource.d/IPaddr 172.16.6.66/21/eth0 start

IPaddr[27077]:         2010/09/19_15:57:05 INFO: Using calculated netmask for 172.16.6.66: 255.255.248.0

IPaddr[27077]:         2010/09/19_15:57:05 INFO: eval ifconfig eth0:0 172.16.6.66 netmask 255.255.248.0 broadcast 172.16.7.255

IPaddr[27051]:         2010/09/19_15:57:05 INFO:  Success

ResourceManager[26964]:     2010/09/19_15:57:05 info: Running /etc/ha.d/resource.d/test  start

 

The content is much the same as in messages.

 

The log on HA2:

Sep 19 23:57:24 HA2 heartbeat: [14041]: info: Version 2 support: false

Sep 19 23:57:24 HA2 heartbeat: [14041]: WARN: Logging daemon is disabled --enabling logging daemon is recommended

Sep 19 23:57:24 HA2 heartbeat: [14041]: info: **************************

Sep 19 23:57:24 HA2 heartbeat: [14041]: info: Configuration validated. Starting heartbeat 3.0.2

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: heartbeat: version 3.0.2

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: Heartbeat generation: 1284893027

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: G_main_add_TriggerHandler: Added signal manual handler

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: G_main_add_TriggerHandler: Added signal manual handler

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: G_main_add_SignalHandler: Added signal handler for signal 17

Sep 19 23:57:24 HA2 heartbeat: [14042]: info: Local status now set to: 'up'

Sep 19 23:57:26 HA2 heartbeat: [14042]: info: Link ha1:eth1 up.

Sep 19 23:57:26 HA2 heartbeat: [14042]: info: Status update for node ha1: status up

Sep 19 23:57:26 HA2 heartbeat: [14042]: info: Link ha2:eth1 up.

Sep 19 23:57:26 HA2 harc[14049]: info: Running /usr/etc/ha.d//rc.d/status status

Sep 19 23:57:26 HA2 heartbeat: [14042]: info: Comm_now_up(): updating status to active

Sep 19 23:57:26 HA2 heartbeat: [14042]: info: Local status now set to: 'active'

Sep 19 23:57:26 HA2 heartbeat: [14042]: info: Status update for node ha1: status active

Sep 19 23:57:26 HA2 harc[14067]: info: Running /usr/etc/ha.d//rc.d/status status

Sep 19 23:57:42 HA2 heartbeat: [14042]: info: local resource transition completed.

Sep 19 23:57:42 HA2 heartbeat: [14042]: info: Initial resource acquisition complete (T_RESOURCES(us))

Sep 19 23:57:42 HA2 heartbeat: [14086]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys ha2] to acquire.

Sep 19 23:57:43 HA2 heartbeat: [14042]: info: remote resource transition completed.

 

 

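You will see: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys ha2] to acquire.

This means ha2 has no local resources of its own. It acts as the backup server and stays idle, only listening for heartbeats from the primary server until the primary fails.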
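
Which node owns which resources comes from the haresources file, where each line starts with the preferred node name followed by its resources (compare the "Acquiring resource group: ha1 IPaddr::172.16.6.66/21/eth0 test" line in the HA1 log). The sketch below prints the resources listed for a given node. The function name list_resources is invented for this example, the file location is an assumption you should adjust to your install, and backslash-continued lines are not handled.

```shell
# list_resources FILE NODE: print the resources on each haresources line
# whose first field (the preferred node) equals NODE.
# Simplification: backslash line continuations are not joined.
list_resources() {
    awk -v node="$2" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        NF == 0          { next }          # skip blank lines
        $1 == node {
            $1 = ""                        # drop the node name
            sub(/^ +/, "")                 # and the separator left behind
            print
        }
    ' "$1"
}
```

For the cluster in this article, "list_resources /etc/ha.d/haresources ha2" printing nothing would be consistent with the "No local resources" message on HA2.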

 

# tcpdump -i eth1 -n -p udp port 694

Running tcpdump on eth1 shows the heartbeat broadcasts passing over the link, as follows:

23:06:49.576155 IP 10.0.0.1.40661 > 10.0.0.255.ha-cluster: UDP, length 174

23:06:49.734999 IP 10.0.0.2.50487 > 10.0.0.255.ha-cluster: UDP, length 174

23:06:50.324281 IP 10.0.0.1.40661 > 10.0.0.255.ha-cluster: UDP, length 167

23:06:50.324283 IP 10.0.0.1.40661 > 10.0.0.255.ha-cluster: UDP, length 174

23:06:50.486151 IP 10.0.0.2.50487 > 10.0.0.255.ha-cluster: UDP, length 174
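
To confirm that both nodes are actually sending heartbeats, the capture can be summarised per sender. The following is a small sketch, not part of heartbeat itself: count_heartbeats is a made-up helper that reads tcpdump text output in the format shown above and counts packets per source address.

```shell
# count_heartbeats: read tcpdump text output on stdin and print
# "SOURCE_IP PACKET_COUNT" for each sender, sorted by address.
count_heartbeats() {
    awk '$2 == "IP" && $4 == ">" {
             src = $3
             sub(/\.[0-9]+$/, "", src)     # strip the trailing source port
             count[src]++
         }
         END { for (ip in count) print ip, count[ip] }' | sort
}
```

One way to use it is "tcpdump -i eth1 -n -p -c 20 udp port 694 | count_heartbeats", which stops after 20 packets. If only one address shows up, the other node is probably not broadcasting on this link.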

 

10) Simulate a failure

Now pull the power cord of the primary server to simulate a crash, while another machine pings 172.16.6.66 continuously.
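
A plain ping works, but it does not make the moment of the outage and the recovery easy to spot. As a rough sketch, the helper below runs a probe command repeatedly and prints a timestamp only when the result changes. The name watch_vip and the WATCH_INTERVAL variable are inventions for this example.

```shell
# watch_vip COUNT PROBE...: run PROBE COUNT times, pausing between runs,
# and print a timestamped line each time the result flips between up and down.
# WATCH_INTERVAL (seconds, default 1) is a hypothetical knob for this sketch.
watch_vip() {
    count=$1
    shift
    last=""
    i=0
    while [ "$i" -lt "$count" ]; do
        if "$@" >/dev/null 2>&1; then state=up; else state=down; fi
        if [ "$state" != "$last" ]; then
            echo "$(date '+%H:%M:%S') $state"
            last=$state
        fi
        i=$((i + 1))
        sleep "${WATCH_INTERVAL:-1}"
    done
    return 0
}
```

On the third machine, something like "watch_vip 120 ping -c 1 -W 1 172.16.6.66" probes for about two minutes. The gap between the "down" line and the next "up" line gives an approximate failover time.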

 

The following appears in ha-log on ha2:

Sep 20 00:03:47 HA2 heartbeat: [14042]: WARN: node ha1: is dead

Sep 20 00:03:47 HA2 heartbeat: [14042]: WARN: No STONITH device configured.

Sep 20 00:03:47 HA2 heartbeat: [14042]: WARN: Shared disks are not protected.

Sep 20 00:03:47 HA2 heartbeat: [14042]: info: Resources being acquired from ha1.

Sep 20 00:03:47 HA2 heartbeat: [14042]: info: Link ha1:eth1 dead.

harc[14105]:    2010/09/20_00:03:48 info: Running /usr/etc/ha.d//rc.d/status status

Sep 20 00:03:48 HA2 heartbeat: [14106]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys ha2] to acquire.

mach_down[14135]:        2010/09/20_00:03:48 info: Taking over resource group IPaddr::172.16.6.66/21/eth0

ResourceManager[14162]:     2010/09/20_00:03:48 info: Acquiring resource group: ha1 IPaddr::172.16.6.66/21/eth0 test

IPaddr[14190]:         2010/09/20_00:03:48 INFO:  Resource is stopped

ResourceManager[14162]:     2010/09/20_00:03:48 info: Running /etc/ha.d/resource.d/IPaddr 172.16.6.66/21/eth0 start

IPaddr[14275]:         2010/09/20_00:03:48 INFO: Using calculated netmask for 172.16.6.66: 255.255.248.0

IPaddr[14275]:         2010/09/20_00:03:48 INFO: eval ifconfig eth0:0 172.16.6.66 netmask 255.255.248.0 broadcast 172.16.7.255

IPaddr[14249]:         2010/09/20_00:03:48 INFO:  Success

ResourceManager[14162]:     2010/09/20_00:03:48 info: Running /etc/ha.d/resource.d/test  start

mach_down[14135]:        2010/09/20_00:03:48 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

mach_down[14135]:        2010/09/20_00:03:49 info: mach_down takeover complete for node ha1.

Sep 20 00:03:49 HA2 heartbeat: [14042]: info: mach_down takeover complete.

 

 

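Note the line

node ha1: is dead

which shows that ha1 has gone down.

The resource script is first called with the status argument, and then the test script is started with the start argument, which completes the failover.

The messages log also shows

logger: /etc/ha.d/resource.d/test called with start

which shows that our test script is now running on HA2.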
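
The status-then-start sequence is easier to follow with a toy resource in hand. The following is a hypothetical stand-in, not the test script used in this article: it keeps its state in a marker file and answers the same three arguments. The convention that status prints "running" while the resource is active is an assumption based on how init-style resource scripts usually behave; check the heartbeat documentation for your version.

```shell
# Hypothetical stand-in for a resource script; not the article's test script.
# STATE_FILE is an invented marker file that records whether we are "running".
STATE_FILE=${STATE_FILE:-/tmp/test-resource.state}

test_resource() {
    case "$1" in
        start)  touch "$STATE_FILE"; echo "started" ;;
        stop)   rm -f "$STATE_FILE"; echo "stopped" ;;
        status) if [ -e "$STATE_FILE" ]; then echo "running"; else echo "stopped"; fi ;;
        *)      echo "usage: test_resource {start|stop|status}" >&2; return 1 ;;
    esac
}
```

Calling "test_resource status" before and after "test_resource start" mirrors what the log shows: the resource is found stopped first and is only then started.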

Now check the IP address:

eth0:0 on HA2 now holds 172.16.6.66.
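
The same check can be scripted. This is a sketch under the assumption that the iproute "ip" command is available on the node; has_addr is a made-up helper that scans the one-line-per-address output of "ip -o -4 addr show" for an exact address match.

```shell
# has_addr ADDRESS: succeed if ADDRESS appears as an IPv4 address in the
# `ip -o -4 addr show` output supplied on stdin (prefix length is ignored).
has_addr() {
    awk -v want="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), parts, "/")
                    if (parts[1] == want) found = 1
                }
        }
        END { exit (found ? 0 : 1) }'
}
```

For example, "ip -o -4 addr show dev eth0 | has_addr 172.16.6.66 && echo present" on HA2 should print "present" after the takeover and nothing before it.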

 

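Once the failover is complete, the backup server goes back to monitoring the primary server's heartbeat. When the primary server comes back up, the services are moved back to it.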

 

The test succeeded.

If you do not want the primary server to take the resources back automatically after it recovers, add this line to ha.cf:
auto_failback off
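
For reference, "auto_failback on" moves resources back to the preferred node once it returns, while "auto_failback off" leaves them on the node that currently holds them. The sketch below reports which value a given ha.cf sets. get_auto_failback is a made-up helper name, and the ha.cf path you pass to it depends on your install.

```shell
# get_auto_failback FILE: print the auto_failback value set in FILE,
# or "unset" if no uncommented auto_failback line is present.
# If the directive appears more than once, the last occurrence is reported.
get_auto_failback() {
    awk '$1 == "auto_failback" { value = $2 }
         END { print (value == "" ? "unset" : value) }' "$1"
}
```

Running "get_auto_failback /etc/ha.d/ha.cf" on both nodes is a quick way to confirm they agree before repeating the power-off test.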

 
