HA cluster troubleshooting and related configuration

[root@133 ha.d]# cat /var/log/ha-log 
(Problem 1:) Feb 15 17:18:12 133 heartbeat: [3952]: ERROR: Client child command [/usr/lib/heartbeat/ipfail] is not executable
Feb 15 17:18:12 133 heartbeat: [3952]: ERROR: Heartbeat not started: configuration error.
Feb 15 17:18:12 133 heartbeat: [3952]: ERROR: Configuration error, heartbeat not started.
Feb 15 17:18:23 133 heartbeat: [4094]: ERROR: Client child command [/usr/lib/heartbeat/ipfail] is not executable
Feb 15 17:18:23 133 heartbeat: [4094]: ERROR: Heartbeat not started: configuration error.
Feb 15 17:18:23 133 heartbeat: [4094]: ERROR: Configuration error, heartbeat not started.
Feb 15 17:20:34 133 heartbeat: [4253]: ERROR: Client child command [/usr/lib/heartbeat/ipfail] is not executable
Feb 15 17:20:34 133 heartbeat: [4253]: ERROR: Heartbeat not started: configuration error.
Feb 15 17:20:34 133 heartbeat: [4253]: ERROR: Configuration error, heartbeat not started.
(Problem 2:) Feb 15 17:25:24 133 heartbeat: [4419]: ERROR: Bad permissions on keyfile [/etc/ha.d//authkeys], 600 recommended.
Feb 15 17:25:24 133 heartbeat: [4419]: ERROR: Authentication configuration error.
Feb 15 17:25:24 133 heartbeat: [4419]: ERROR: Configuration error, heartbeat not started.
Feb 15 17:25:55 133 heartbeat: [4571]: info: Pacemaker support: false
(Problem 3:) Feb 15 17:25:55 133 heartbeat: [4571]: ERROR: Current node [133] not in configuration!
Feb 15 17:25:55 133 heartbeat: [4571]: info: By default, cluster nodes are named by `uname -n` and must be declared with a 'node' directive in the ha.cf file.
Feb 15 17:25:55 133 heartbeat: [4571]: info: See also: http://linux-ha.org/wiki/Ha.cf#node_directive
Feb 15 17:25:55 133 heartbeat: [4571]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Feb 15 17:25:55 133 heartbeat: [4571]: ERROR: Configuration error, heartbeat not started.
Feb 15 17:29:23 133 heartbeat: [4760]: info: Pacemaker support: false

Problem 4: after starting heartbeat on the host, the virtual IP never shows up in the output of ip addr.

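Solutions:
Problem 1:
The log says: Client child command [/usr/lib/heartbeat/ipfail] is not executable. In other words, heartbeat cannot execute the file at that path. Running find / -name ipfail showed that the file actually lives at /usr/lib64/heartbeat/ipfail, so the respawn line in ha.cf has to point there. When configuring an HA cluster, pay attention to whether the CentOS install is 32-bit or 64-bit; uname -i shows which one it is.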
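The lookup described above can be sketched as a small shell helper. This is only an illustration: the function name find_ipfail is hypothetical, and the default search roots /usr/lib64 and /usr/lib are an assumption based on the two paths mentioned in this post.

```shell
# find_ipfail [ROOT...]: print the first executable ipfail found under
# ROOT/heartbeat/. With no arguments it checks /usr/lib64 and /usr/lib
# (assumed defaults, taken from the two paths seen in this post).
find_ipfail() {
    [ "$#" -gt 0 ] || set -- /usr/lib64 /usr/lib
    for root in "$@"; do
        candidate="$root/heartbeat/ipfail"
        # Must be a regular file with the execute bit set.
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Calling find_ipfail with no arguments on the affected host should print the path to put after "respawn hacluster" in ha.cf, or return non-zero if ipfail is not installed in either location.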
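Problem 2:
Running chmod 600 /etc/ha.d/authkeys is enough. As the log shows, heartbeat refuses to start while the key file has looser permissions.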
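A minimal sketch of the same fix as a re-runnable check. The function name ensure_keyfile_mode is hypothetical, and stat -c is the GNU coreutils form, which is assumed to be available as on CentOS.

```shell
# ensure_keyfile_mode [FILE]: make sure FILE (default /etc/ha.d/authkeys)
# has mode 600, then print the resulting mode.
ensure_keyfile_mode() {
    keyfile="${1:-/etc/ha.d/authkeys}"
    if [ ! -f "$keyfile" ]; then
        echo "missing key file: $keyfile" >&2
        return 1
    fi
    mode=$(stat -c '%a' "$keyfile")   # GNU stat: octal permission bits
    if [ "$mode" != "600" ]; then
        chmod 600 "$keyfile"
    fi
    stat -c '%a' "$keyfile"
}
```

Run as root, ensure_keyfile_mode with no argument would tighten /etc/ha.d/authkeys only when needed and leave an already correct file untouched.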
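Problem 3:
"ERROR: Current node [133] not in configuration!" means the local node is not declared in the configuration. As the log itself notes, nodes are named by uname -n, so the main thing to check is that the host name is used consistently. Review the three files under /etc/ha.d: authkeys, haresources and ha.cf.
vim haresources
132  192.168.116.135/24/eth0:1 nginx    (the first field, 132, is the node name)
vim ha.cf
auto_failback on
node 133    (node name)
node 132    (node name)

ping 192.168.116.2
In short, check every place where the master and slave names are used. Also do not forget to add both names to /etc/hosts.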
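The consistency check for Problem 3 can be sketched as follows. The helper name node_declared is hypothetical; it only looks at uncommented node lines and makes no attempt to parse the rest of ha.cf.

```shell
# node_declared NAME FILE: succeed if FILE contains an uncommented
# "node" directive that lists NAME as one of its arguments.
node_declared() {
    awk -v name="$1" '
        $1 == "node" {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$2"
}
```

On a cluster host this could be used as: node_declared "$(uname -n)" /etc/ha.d/ha.cf || echo "local node is missing from ha.cf". The same name should also resolve through /etc/hosts on both machines.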
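Problem 4:
Check whether port 80 on the system is already in use by another process, since nginx, the resource listed after the virtual IP in haresources, needs that port.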
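One way to check the port, sketched under the assumption that the ss tool from iproute is installed. The helper name port_listening is hypothetical; it reads ss -ltn style output on standard input so that it can be tried against saved output as well.

```shell
# port_listening PORT: read `ss -ltn` style output on stdin and succeed
# if any LISTEN socket is bound to PORT (IPv4 or IPv6).
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # Local address is field 4; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example: ss -ltn | port_listening 80 && echo "port 80 is already in use". If something other than nginx holds the port, stop it before starting heartbeat, then look for the virtual IP again with ip addr.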
All of the relevant configuration follows.
hosts
192.168.116.133 133    # master
192.168.116.132 132    # slave
authkeys
auth 3
#1 crc
#2 sha1 HI!
3 md5 Hello!

haresources
132  192.168.116.135/24/eth0:1 nginx
ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
ucast eth0 192.168.116.133
auto_failback on
node 133
node 132
ping 192.168.116.2
respawn hacluster /usr/lib64/heartbeat/ipfail
If you spot any mistakes, corrections are welcome.
