1. Environment
Server A (primary): 192.85.1.175
Server B (backup): 192.85.1.176
MySQL version: 5.1.61
OS version: Ubuntu 10.10 x86
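2. Installing heartbeat
1) Install heartbeat
sudo apt-get install heartbeat
2) Configuration notes
heartbeat keeps its configuration in the /etc/ha.d directory.
After installation, three configuration files are required: ha.cf, haresources, and authkeys.
None of them exists in /etc/ha.d at this point, so they must be created. Sample copies of
ha.cf, haresources, and authkeys can be found in /usr/share/doc/heartbeat; simply copy them into
/etc/ha.d.
Any sample shipped as a *.gz file must be decompressed with the gunzip command.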
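The copy-and-decompress step can be scripted. The following is a minimal sketch, not an official tool: the function name install_ha_samples is hypothetical, and it assumes that each sample file is present either as a plain file or as a .gz archive in the source directory.

```shell
# install_ha_samples SRC DEST
# Copy the three sample files from SRC into DEST, decompressing any
# that are shipped as .gz. (Hypothetical helper, not part of heartbeat.)
install_ha_samples() {
    src=$1
    dest=$2
    for name in ha.cf haresources authkeys; do
        if [ -f "$src/$name.gz" ]; then
            # gunzip -c writes the decompressed data to stdout
            gunzip -c "$src/$name.gz" > "$dest/$name" || return 1
        elif [ -f "$src/$name" ]; then
            cp "$src/$name" "$dest/$name" || return 1
        else
            echo "missing sample file: $name" >&2
            return 1
        fi
    done
    # authkeys must be readable and writable by its owner only
    chmod 600 "$dest/authkeys"
}
```

Run as root, the call would look like: install_ha_samples /usr/share/doc/heartbeat /etc/ha.d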
3. Configuration on server 175:
(1) /etc/hosts contents:
192.85.1.175 primary # Added by NetworkManager
(2) ha.cf contents (main configuration file):
#
# There are lots of options in this file. All you have to have is a set
# of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
# and a value for "auto_failback".
#
# ATTENTION: As the configuration file is read line by line,
# THE ORDER OF DIRECTIVE MATTERS!
#
# In particular, make sure that the udpport, serial baud rate
# etc. are set before the heartbeat media are defined!
# debug and log file directives go into effect when they
# are encountered.
#
# All will be fine if you keep them ordered as in this example.
#
# Note on logging:
# If all of debugfile, logfile and logfacility are not defined,
# logging is the same as use_logd yes. In other case, they are
# respectively effective. If deterring the logging to syslog,
# logfacility must be "none".
#
# File to write debug messages to
debugfile /var/log/ha-debug # debug log file
#
# File to write other messages to
#
logfile /var/log/ha-log # runtime log file
#
# Facility to use for syslog()/logger
#
logfacility local0 # syslog facility
#
# A note on specifying "how long" times below...
#
# The default time unit is seconds
# 10 means ten seconds
#
# You can also specify them in milliseconds
# 1500ms means 1.5 seconds
#
# keepalive: how long between heartbeats?
#
keepalive 2 # heartbeat interval; 2 means 2 seconds, 200ms would mean 200 milliseconds
#
# deadtime: how long-to-declare-host-dead?
#
# If you set this too low you will get the problematic
# split-brain (or cluster partition) problem.
# See the FAQ for how to use warntime to tune deadtime.
#
deadtime 30 # node dead time: if no heartbeat arrives for 30 seconds, the other node is considered dead
#
# warntime: how long before issuing "late heartbeat" warning?
# See the FAQ for how to use warntime to tune deadtime.
#
warntime 10 # warning time
#
# Very first dead time (initdead)
#
# On some machines/OSes, etc. the network takes a while to come up
# and start working right after you've been rebooted. As a result
# we have a separate dead time for when things first come up.
# It should be at least twice the normal dead time.
#
initdead 120 # initial dead time after startup
#
# What UDP port to use for bcast/ucast communication?
#
udpport 694 # UDP port used for heartbeat messages
#
# What interfaces to broadcast heartbeats over?
#
bcast eth0 # Linux; sends heartbeats by UDP broadcast, recommended when there is more than one backup node
#bcast eth1 eth2 # Linux
#bcast le0 # Solaris
#bcast le1 le2 # Solaris
#
# Set up a multicast heartbeat medium
# mcast [dev] [mcast group] [port] [ttl] [loop]
#
# [dev] device to send/rcv heartbeats on
# [mcast group] multicast group to join (class D multicast address
#               224.0.0.0 - 239.255.255.255)
# [port] udp port to sendto/rcvfrom (set this value to the
#        same value as "udpport" above)
# [ttl] the ttl value for outbound heartbeats. This affects
#       how far the multicast packet will propagate. (0-255)
#       Must be greater than zero.
# [loop] toggles loopback for outbound multicast heartbeats.
#        if enabled, an outbound packet will be looped back and
#        received by the interface it was sent on. (0 or 1)
#        Set this value to zero.
#
#mcast eth0 225.0.0.1 694 1 0
#
# Set up a unicast / udp heartbeat medium
# ucast [dev] [peer-ip-addr]
#
# [dev] device to send/rcv heartbeats on
# [peer-ip-addr] IP address of peer to send packets to
#
ucast eth0 192.85.1.176 # unicast heartbeats to the peer (server B); a ucast line naming the local address has no effect
auto_failback on # on: when the primary recovers it takes the resources back; off: the primary only takes them back after the backup fails
watchdog /dev/watchdog # watchdog: if this node fails to send a heartbeat for more than 1 minute, it reboots itself
#
# Tell what machines are in the cluster
# node nodename ... -- must match uname -n
node primary # primary node name; must match the output of uname -n
node backup # backup node name
#
# Less common options...
#
# Treats 10.10.10.254 as a pseudo-cluster-member
# Used together with ipfail below...
# note: don't use a cluster node as ping node
#
ping 192.85.1.1 # ping the gateway to check that the network path is healthy
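The timing directives depend on one another, so it can help to check them before restarting heartbeat. The sketch below is an illustration only: the helper names ha_ms and check_ha_timings are hypothetical. The rule that initdead is at least twice deadtime comes from the sample file's own comments, while the ordering keepalive < warntime < deadtime is an assumption about sensible values rather than something heartbeat enforces.

```shell
# ha_ms FILE KEY
# Print the value of the first active (uncommented) KEY directive in
# milliseconds. "2" means seconds, "200ms" means milliseconds.
ha_ms() {
    value=$(awk -v key="$2" '$1 == key { print $2; exit }' "$1")
    case $value in
        '')  echo "$2 is not set in $1" >&2; return 1 ;;
        *ms) echo "${value%ms}" ;;
        *)   echo "$((value * 1000))" ;;
    esac
}

# check_ha_timings FILE
# Verify keepalive < warntime < deadtime and initdead >= 2 * deadtime.
check_ha_timings() {
    keepalive=$(ha_ms "$1" keepalive) || return 1
    warntime=$(ha_ms "$1" warntime) || return 1
    deadtime=$(ha_ms "$1" deadtime) || return 1
    initdead=$(ha_ms "$1" initdead) || return 1
    if [ "$keepalive" -ge "$warntime" ]; then
        echo "keepalive should be shorter than warntime" >&2
        return 1
    fi
    if [ "$warntime" -ge "$deadtime" ]; then
        echo "warntime should be shorter than deadtime" >&2
        return 1
    fi
    if [ "$initdead" -lt "$((2 * deadtime))" ]; then
        echo "initdead should be at least twice deadtime" >&2
        return 1
    fi
    echo "timings ok"
}
```

With the values used here (keepalive 2, warntime 10, deadtime 30, initdead 120), running check_ha_timings /etc/ha.d/ha.cf passes all three checks.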
(3) haresources (resource configuration file)
primary 192.85.1.177/24 http mysql phpmyadmin # virtual IP, followed by the resources served through it (separated by spaces)
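Each resource named in haresources has to correspond to a script that heartbeat can run, normally found in /etc/ha.d/resource.d or /etc/init.d. The following sketch lists resources that have no matching script. The function names are hypothetical, it assumes resources are separated by whitespace, and it does not handle lines continued with a backslash.

```shell
# list_resources FILE
# Print one resource per line, skipping comments and the node name.
list_resources() {
    sed 's/#.*//' "$1" | awk '{ for (i = 2; i <= NF; i++) print $i }'
}

# check_resources FILE DIR...
# Report every resource that has no executable script in any DIR.
check_resources() {
    file=$1
    shift
    status=0
    for res in $(list_resources "$file"); do
        # "name::arg" passes arguments to the script called "name"
        script=${res%%::*}
        # A bare IP address is shorthand for the IPaddr resource script
        case $script in
            [0-9]*.[0-9]*.[0-9]*.[0-9]*) script=IPaddr ;;
        esac
        found=no
        for dir in "$@"; do
            if [ -x "$dir/$script" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "no script for resource: $script"
            status=1
        fi
    done
    return $status
}
```

A typical call would be: check_resources /etc/ha.d/haresources /etc/ha.d/resource.d /etc/init.d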
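(4) authkeys (authentication configuration file)
# Communication key; the file contents must be identical on both machines
auth 3
3 md5 Hello
# authkeys must be readable and writable by its owner only: chmod 600 ./authkeys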
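Writing authkeys and setting its permissions can be done in one step. This is a minimal sketch: write_authkeys and random_key are hypothetical helper names, the key index 3 and the md5 method simply mirror the file shown here, and random_key assumes that md5sum and /dev/urandom are available.

```shell
# write_authkeys FILE KEY
# Write a two-line authkeys file using key index 3 and md5, then make
# sure only the owner can read or write it.
write_authkeys() {
    file=$1
    key=$2
    # Create the file under a restrictive umask so it is never world-readable
    ( umask 077 && printf 'auth 3\n3 md5 %s\n' "$key" > "$file" ) || return 1
    # chmod also covers the case where the file already existed
    chmod 600 "$file"
}

# random_key
# Print 32 hexadecimal characters derived from /dev/urandom.
random_key() {
    head -c 64 /dev/urandom | md5sum | cut -d ' ' -f 1
}
```

For example, write_authkeys /etc/ha.d/authkeys "$(random_key)" on one node, after which the resulting file is copied unchanged to the other node.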
4. Configuration on server 176:
(1) /etc/hosts contents:
192.85.1.176 backup # Added by NetworkManager
(2) ha.cf contents:
Server B uses the same ha.cf as server A, including the sample file's explanatory comments; only the ucast peer address differs. The active directives are:
debugfile /var/log/ha-debug # debug log file
logfile /var/log/ha-log # runtime log file
logfacility local0 # syslog facility
keepalive 2 # heartbeat interval; 2 means 2 seconds, 200ms would mean 200 milliseconds
deadtime 30 # node dead time: if no heartbeat arrives for 30 seconds, the other node is considered dead
warntime 10 # warning time
initdead 120 # initial dead time after startup
udpport 694 # UDP port used for heartbeat messages
bcast eth0 # Linux; sends heartbeats by UDP broadcast, recommended when there is more than one backup node
ucast eth0 192.85.1.175 # unicast heartbeats to the peer (server A); a ucast line naming the local address has no effect
auto_failback on # on: when the primary recovers it takes the resources back; off: the primary only takes them back after the backup fails
watchdog /dev/watchdog # watchdog: if this node fails to send a heartbeat for more than 1 minute, it reboots itself
node primary # primary node name; must match the output of uname -n
node backup # backup node name
ping 192.85.1.1 # ping the gateway to check that the network path is healthy
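Because each node directive has to match the output of uname -n on the corresponding machine, a quick check on every node can catch a mismatch before heartbeat is started. The sketch below uses a hypothetical function name, check_node_name, and takes an optional second argument so that a host name other than the local one can be checked.

```shell
# check_node_name FILE [HOSTNAME]
# Succeed if HOSTNAME (default: uname -n) appears on an active
# "node" line of the given ha.cf file.
check_node_name() {
    file=$1
    host=${2:-$(uname -n)}
    # Collect every name listed on uncommented "node" lines
    nodes=$(sed 's/#.*//' "$file" | awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }')
    for n in $nodes; do
        if [ "$n" = "$host" ]; then
            echo "ok: $host is listed as a node"
            return 0
        fi
    done
    echo "error: $host is not listed as a node in $file" >&2
    return 1
}
```

On server B, check_node_name /etc/ha.d/ha.cf should succeed only if uname -n prints backup.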
(3) haresources
primary 192.85.1.177/24 http mysql phpmyadmin # virtual IP, followed by the resources served through it (separated by spaces)
(4) authkeys
# Communication key; the file contents must be identical on both machines
auth 3
3 md5 Hello
# authkeys must be readable and writable by its owner only: chmod 600 ./authkeys
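Since authkeys must be identical on both machines, and haresources is the same on both nodes in this setup, a byte-for-byte comparison is a cheap safeguard. The sketch below assumes the peer's files have already been copied into a local directory by some means, such as scp run as root; the function name same_config is hypothetical.

```shell
# same_config LOCAL_DIR PEER_DIR
# Compare haresources and authkeys in the two directories and report
# whether each pair is byte-for-byte identical.
same_config() {
    status=0
    for name in haresources authkeys; do
        if cmp -s "$1/$name" "$2/$name"; then
            echo "$name: identical"
        else
            echo "$name: DIFFERENT"
            status=1
        fi
    done
    return $status
}
```

For example, after copying server A's /etc/ha.d to /tmp/peer-ha.d on server B, run: same_config /etc/ha.d /tmp/peer-ha.d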
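5. Starting, stopping, and testing the HA service
Start HA: service heartbeat start, or /etc/init.d/heartbeat start
Stop HA: service heartbeat stop, or /etc/init.d/heartbeat stop
heartbeat is already set up to start automatically when the system boots.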
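Whether heartbeat really starts at boot can be confirmed by looking for its start link. This sketch assumes SysV-style rc links and that the default runlevel directory is /etc/rc2.d, as is usual on Ubuntu; the function name boot_enabled is hypothetical.

```shell
# boot_enabled SERVICE [RCDIR]
# Succeed if RCDIR (default /etc/rc2.d) contains a start link such as
# S20heartbeat for the given service.
boot_enabled() {
    service=$1
    rcdir=${2:-/etc/rc2.d}
    for link in "$rcdir"/S[0-9][0-9]"$service"; do
        # If nothing matches, the pattern stays literal and -e is false
        if [ -e "$link" ]; then
            echo "$service starts at boot via $link"
            return 0
        fi
    done
    echo "$service has no start link in $rcdir"
    return 1
}
```

Running boot_enabled heartbeat on each node shows whether the start link is present.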
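Testing heartbeat with the HTTP service
First start the httpd service (on Ubuntu the Apache service is normally named apache2):
# service httpd start
Create a test HTML file on each host (for example, with content that identifies the host) and place it in the /var/www/html/ directory.
Start heartbeat on node1 (primary), then check its state with: service heartbeat status
For example, opening http://192.85.1.177/phpmyadmin directly should let you log in and manage the database through the virtual IP.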
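To see which host answers on the virtual IP while heartbeat is stopped on one node, a small polling loop can help. This is a sketch only: fetch_page and watch_vip are hypothetical names, curl is assumed to be installed, and the test page path in the usage line is a made-up example.

```shell
# fetch_page URL
# Print the first line of the page at URL (assumes curl is installed).
fetch_page() {
    curl -s --max-time 2 "$1" | head -n 1
}

# watch_vip URL COUNT [INTERVAL]
# Fetch URL COUNT times, INTERVAL seconds apart (default 1), and print
# a timestamp with the first line of each response.
watch_vip() {
    url=$1
    count=$2
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        body=$(fetch_page "$url") || body=""
        echo "$(date +%H:%M:%S) ${body:-NO RESPONSE}"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

For example, run watch_vip http://192.85.1.177/test.html 60 on a third machine, then stop heartbeat on the primary. If the two test pages differ, the output should switch from the primary's page to the backup's once the backup takes over the virtual IP.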