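High-availability cluster test for the LVS Director
Objectives:
1. Test the high-availability cluster formed by the two Directors
2. Observe how heartbeat-ldirectord checks the health of the back-end Real Servers
Environment: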
Redhat 5.8
VIP 192.168.0.2
Real Server:
RS1 192.168.0.5
RS2 192.168.0.6
Director:
node1 192.168.0.11
node2 192.168.0.12
RPM packages required:
heartbeat-2.1.4-9.el5.i386.rpm
heartbeat-stonith-2.1.4-10.el5.i386.rpm
heartbeat-gui-2.1.4-9.el5.i386.rpm
libnet-1.1.4-3.el5.i386.rpm
heartbeat-ldirectord-2.1.4-9.el5.i386.rpm
perl-MailTools-1.77-1.el5.noarch.rpm
heartbeat-pils-2.1.4-10.el5.i386.rpm
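Also have the system installation disc ready to serve as the yum repository.
The detailed procedure follows.
I. Configure the Real Servers first
1. Synchronize the time on both Real Servers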
# hwclock -s
2. Install Apache
# yum -y install httpd
Provide a web page on each of the two Real Servers
- [root@RS1 ~]# echo "Real Server 1" > /var/www/html/index.html
- [root@RS2 ~]# echo "Real Server 2" > /var/www/html/index.html
- [root@RS1 ~]# vi /etc/httpd/conf/httpd.conf
- Change: ServerName RS1.yue.com
- [root@RS2 ~]# vi /etc/httpd/conf/httpd.conf
- Change: ServerName RS2.yue.com
# /etc/init.d/httpd start
3. Set the relevant kernel parameters on RS1
- [root@RS1 ~]# echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore
- [root@RS1 ~]# echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
- [root@RS1 ~]# echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
- [root@RS1 ~]# echo 2 > /proc/sys/net/ipv4/conf/eth0/arp_announce
- [root@RS1 ~]# ifconfig lo:0 192.168.0.2 broadcast 192.168.0.255 netmask 255.255.255.255 up
- [root@RS1 ~]# ifconfig
- eth0 Link encap:Ethernet HWaddr 00:0C:29:7E:8B:C6
- inet addr:192.168.0.5 Bcast:192.168.0.255 Mask:255.255.255.0
- UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:2719 errors:0 dropped:0 overruns:0 frame:0
- TX packets:3628 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:1000
- RX bytes:200533 (195.8 KiB) TX bytes:644821 (629.7 KiB)
- Interrupt:67 Base address:0x2000
- lo Link encap:Local Loopback
- inet addr:127.0.0.1 Mask:255.0.0.0
- UP LOOPBACK RUNNING MTU:16436 Metric:1
- RX packets:71 errors:0 dropped:0 overruns:0 frame:0
- TX packets:71 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:0
- RX bytes:5699 (5.5 KiB) TX bytes:5699 (5.5 KiB)
- lo:0 Link encap:Local Loopback
- inet addr:192.168.0.2 Mask:255.255.255.255
- UP LOOPBACK RUNNING MTU:16436 Metric:1
- [root@RS1 ~]# elinks -dump http://192.168.0.5
- Real Server 1
- [root@RS1 ~]# elinks -dump http://192.168.0.2
- Real Server 1
- [root@RS1 ~]# route add -host 192.168.0.2 dev lo:0
- [root@RS1 ~]# route -n
- Kernel IP routing table
- Destination Gateway Genmask Flags Metric Ref Use Iface
- 192.168.0.2 0.0.0.0 255.255.255.255 UH 0 0 0 lo
- 192.168.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
- 169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 eth0
- 0.0.0.0 192.168.0.1 0.0.0.0 UG 0 0 0 eth0
Enable the service at boot
- [root@RS1 ~]# chkconfig --add httpd
- [root@RS1 ~]# chkconfig httpd on
- [root@RS1 ~]# chkconfig --list httpd
- httpd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
4. Apply the same settings on RS2
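Since step 3 has to be repeated on RS2, the commands can be collected in one place. The following is a minimal sketch, not part of the original procedure: the function name `rs_dr_commands` and its parameters are hypothetical, and the defaults are simply the VIP, interface and broadcast address used above. It only prints the commands so they can be reviewed first.

```shell
#!/bin/bash
# Sketch: print (not run) the commands that prepare a Real Server for LVS-DR.
# rs_dr_commands is a hypothetical helper; defaults are the values from this article.
rs_dr_commands() {
    local vip="${1:-192.168.0.2}" dev="${2:-eth0}" bcast="${3:-192.168.0.255}" scope
    for scope in "$dev" all; do
        # arp_ignore=1: answer ARP only for addresses configured on the receiving interface.
        echo "echo 1 > /proc/sys/net/ipv4/conf/$scope/arp_ignore"
        # arp_announce=2: use the best local source address in ARP requests,
        # so the VIP bound to lo:0 is not advertised.
        echo "echo 2 > /proc/sys/net/ipv4/conf/$scope/arp_announce"
    done
    # Bind the VIP to a loopback alias and add a host route through it.
    echo "ifconfig lo:0 $vip broadcast $bcast netmask 255.255.255.255 up"
    echo "route add -host $vip dev lo:0"
}

rs_dr_commands    # review the output before running it as root
```

On RS2 the printed commands could be checked and then executed as root, for example with `rs_dr_commands | sh`.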
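II. Configure the Directors
Each node's host name must match the output of "uname -n".
1. Configure node1 first
Synchronize the time:
# hwclock -s
Host name resolution:
# vi /etc/hosts    # add the following entries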
192.168.0.11 node1.yue.com node1
192.168.0.12 node2.yue.com node2
Host name:
# hostname node1.yue.com
- [root@node1 ~]# vi /etc/sysconfig/network
- NETWORKING=yes
- NETWORKING_IPV6=no
- HOSTNAME=node1.yue.com
IP address:
- [root@node1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth0
- # Advanced Micro Devices [AMD] 79c970 [PCnet32 LANCE]
- DEVICE=eth0
- BOOTPROTO=none
- ONBOOT=yes
- HWADDR=00:0c:29:e7:3d:5c
- IPADDR=192.168.0.11
- GATEWAY=192.168.0.1
- NETMASK=255.255.255.0
Mutual SSH trust between the two nodes:
- [root@node1 ~]# ssh-keygen -t rsa
- Generating public/private rsa key pair.
- Enter file in which to save the key (/root/.ssh/id_rsa):    # press Enter to accept the default path
- Created directory '/root/.ssh'.
- Enter passphrase (empty for no passphrase):    # press Enter to leave the passphrase empty
- Enter same passphrase again:
- Your identification has been saved in /root/.ssh/id_rsa.
- Your public key has been saved in /root/.ssh/id_rsa.pub.
- The key fingerprint is:
- 0f:c8:62:6b:2e:68:4c:8b:ce:0f:25:52:23:93:c7:0a root@node1.yue.com
- [root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node2.yue.com    # copy the public key to node2 (by default into .ssh under root's home directory)
- The authenticity of host 'node2.yue.com (192.168.0.12)' can't be established.
- RSA key fingerprint is 9d:d9:14:94:81:c2:7b:d5:7b:af:2c:64:58:8f:e3:49.
- Are you sure you want to continue connecting (yes/no)? yes    # confirm the connection
- root@node2.yue.com's password:    # enter node2's root password
- Now try logging into the machine, with "ssh 'root@node2.yue.com'", and check in:
- .ssh/authorized_keys
- to make sure we haven't added extra keys that you weren't expecting.
Test the result:
- [root@node1 ~]# ssh node2 'ifconfig'    # run a command on node2 remotely
- The authenticity of host 'node2 (192.168.0.12)' can't be established.
- RSA key fingerprint is 9d:d9:14:94:81:c2:7b:d5:7b:af:2c:64:58:8f:e3:49.
- Are you sure you want to continue connecting (yes/no)? yes
- Warning: Permanently added 'node2' (RSA) to the list of known hosts.
- eth0 Link encap:Ethernet HWaddr 00:0C:29:D9:75:DF
- inet addr:192.168.0.12 Bcast:192.168.0.255 Mask:255.255.255.0
- UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
- RX packets:629 errors:0 dropped:0 overruns:0 frame:0
- TX packets:528 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:1000
- RX bytes:60497 (59.0 KiB) TX bytes:56572 (55.2 KiB)
- Interrupt:67 Base address:0x2000
- lo Link encap:Local Loopback
- inet addr:127.0.0.1 Mask:255.0.0.0
- UP LOOPBACK RUNNING MTU:16436 Metric:1
- RX packets:10 errors:0 dropped:0 overruns:0 frame:0
- TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
- collisions:0 txqueuelen:0
- RX bytes:692 (692.0 b) TX bytes:692 (692.0 b)
2. Apply the corresponding configuration on node2
3. Install the required packages:
- [root@node1 tmp]# ls
- heartbeat-2.1.4-9.el5.i386.rpm
- heartbeat-stonith-2.1.4-10.el5.i386.rpm
- heartbeat-gui-2.1.4-9.el5.i386.rpm
- libnet-1.1.4-3.el5.i386.rpm
- heartbeat-ldirectord-2.1.4-9.el5.i386.rpm
- perl-MailTools-1.77-1.el5.noarch.rpm
- heartbeat-pils-2.1.4-10.el5.i386.rpm
- [root@node1 tmp]# yum --nogpgcheck localinstall *.rpm
- [root@node1 tmp]# chkconfig --list ldirectord
- ldirectord 0:off 1:off 2:off 3:on 4:off 5:on 6:off
- [root@node1 tmp]# chkconfig ldirectord off
- [root@node2 tmp]# yum --nogpgcheck localinstall *.rpm
- [root@node2 tmp]# chkconfig ldirectord off
- [root@node2 tmp]# chkconfig --list ldirectord
- ldirectord 0:off 1:off 2:off 3:off 4:off 5:off 6:off
Configuration files:
- [root@node1 tmp]# cp /usr/share/doc/heartbeat-ldirectord-2.1.4/ldirectord.cf /etc/ha.d/
- [root@node1 ~]# cd /usr/share/doc/heartbeat-2.1.4/
- [root@node1 heartbeat-2.1.4]# cp ha.cf authkeys haresources /etc/ha.d/
- [root@node1 heartbeat-2.1.4]# cd /etc/ha.d/
- [root@node1 ha.d]# chmod 600 authkeys    # the permissions must be changed, otherwise heartbeat reports an error at start-up
(1)
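# vi /etc/ha.d/ha.cf
- logfile /var/log/ha-log
- logfacility local0
- keepalive 2    # interval between heartbeat messages, in seconds
- deadtime 30    # declare the peer dead after this many seconds without a heartbeat
- warntime 10    # log a late-heartbeat warning after this many seconds
- initdead 120    # dead time applied at first start-up; usually at least twice deadtime
- udpport 694
- bcast eth0 # Linux; send heartbeat messages by broadcast on eth0
- auto_failback on    # whether resources move back automatically when the preferred node returns
- node node1.yue.com    # node list; the names must match the definitions in /etc/hosts
- node node2.yue.com
- ping 192.168.0.1    # a ping node (usually the nearest gateway) used to check this node's own connectivity, which helps decide whether the peer is really dead
- compression bz2    # compress the messages that are sent
- compression_threshold 2    # lower bound: only messages larger than 2 KB are compressed
- Also add this line: crm respawn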
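The timing directives above depend on one another, so a quick consistency check before starting heartbeat can catch typos. This is a minimal sketch under stated assumptions: `check_ha_timing` is a hypothetical helper, it assumes the values are plain seconds (no `ms` suffix), and the rules it applies (keepalive < deadtime, warntime < deadtime, initdead at least twice deadtime) are rules of thumb rather than checks performed by heartbeat itself.

```shell
#!/bin/bash
# Sketch: sanity-check the timing directives of an ha.cf file.
# check_ha_timing is a hypothetical helper; values are assumed to be whole seconds.
check_ha_timing() {
    awk '
        $1 == "keepalive" { k = $2 }
        $1 == "warntime"  { w = $2 }
        $1 == "deadtime"  { d = $2 }
        $1 == "initdead"  { i = $2 }
        END {
            if (k == "" || d == "" || i == "") { print "missing directive"; exit 1 }
            if (!(k + 0 < d + 0)) { print "deadtime should exceed keepalive"; exit 1 }
            if (w != "" && !(w + 0 < d + 0)) { print "warntime should be below deadtime"; exit 1 }
            if (i + 0 < 2 * d) { print "initdead should be at least 2 x deadtime"; exit 1 }
            print "timing ok"
        }' "$1"
}
```

It would be called as `check_ha_timing /etc/ha.d/ha.cf`; commented-out lines are ignored because their first field does not match a directive name.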
(2)
- [root@node1 ha.d]# dd if=/dev/urandom count=1 bs=512 | md5sum
- 1+0 records in
- 1+0 records out
- 512 bytes (512 B) copied, 0.00025866 seconds, 2.0 MB/s
- 4faf8724bc49da78b21fc04ceb7b5bc3 -
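# vi /etc/ha.d/authkeys
- #auth 1    # which authentication method to use, referenced by its number
- #1 crc
- #2 sha1 HI!
- #3 md5 Hello!
- auth 1    # use the method numbered 1
- 1 sha1 1daea09a52368d9fe65a37163d4ae3ea    # method 1 is sha1; the string is the shared key (any random string, such as one produced by the command above)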
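The key generation, the file contents and the permission change can be combined into one step. The sketch below is only an illustration: `make_authkeys` is a hypothetical helper, and it reuses the same `dd | md5sum` approach shown above to obtain a random 32-character key.

```shell
#!/bin/bash
# Sketch: write an authkeys file with a freshly generated sha1 key.
# make_authkeys is a hypothetical helper; $1 is the output path.
make_authkeys() {
    local out="$1" key
    # Same idea as above: hash 512 random bytes and keep only the hex digest.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{print $1}')
    # Create the file with a restrictive umask so the key is never world-readable.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$out" )
    chmod 600 "$out"    # heartbeat refuses to start if authkeys is not mode 600
}
```

For example, `make_authkeys /etc/ha.d/authkeys` on node1; the same file must then be copied to node2 so both nodes share the key.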
(3).
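# vi /etc/ha.d/ldirectord.cf
- virtual=192.168.0.2:80    # the VIP
-     real=192.168.0.5:80 gate    # gate: direct routing (the DR model)
-     real=192.168.0.6:80 gate
-     fallback=127.0.0.1:80 gate    # where requests go if both Real Servers are down; here the Director itself, which can show clients a notice
-     service=http    # protocol used to check the back-end Real Servers
-     request="test.html"    # page requested by the check
-     receive="Real Server OK"    # content expected in that page
-     scheduler=rr
-     # persistent=600    # persistence timeout (disabled here)
-     netmask=255.255.255.255    # netmask applied to the virtual service (in ldirectord this sets the persistence granularity, not a broadcast domain)
-     protocol=tcp
-     checktype=negotiate    # check method: negotiate (request the page and compare the response)
-     checkport=80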
Provide the check page:
- [root@RS1 ~]# vi /var/www/html/test.html
Real Server OK
- [root@RS2 ~]# vi /var/www/html/test.html
Real Server OK
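Before relying on ldirectord, the check page can be probed by hand in roughly the way the negotiate check does: request `test.html` and look for the expected text. This is a minimal sketch, not ldirectord's actual implementation. The helper names are hypothetical, `curl` is an assumption (the article itself uses elinks), and the match is a fixed string whereas ldirectord treats `receive` as a regular expression.

```shell
#!/bin/bash
# Sketch: emulate the request/receive check by hand.
# body_matches and check_real_server are hypothetical helpers.

# Succeeds if the text on stdin contains the expected fixed string $1.
body_matches() {
    grep -q -F -- "$1"
}

# $1: real server address (host or host:port), $2: expected text.
check_real_server() {
    local rs="$1" expect="${2:-Real Server OK}"
    if curl -s --max-time 5 "http://$rs/test.html" | body_matches "$expect"; then
        echo "$rs up"
    else
        echo "$rs down"
    fi
}
```

Running `check_real_server 192.168.0.5` and `check_real_server 192.168.0.6` from a Director should report both servers as up once the pages are in place.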
Copy the configuration files to node2:
- [root@node1 ha.d]# scp authkeys ha.cf haresources ldirectord.cf node2:/etc/ha.d/
- authkeys 100% 692 0.7KB/s 00:00
- ha.cf 100% 10KB 10.4KB/s 00:00
- haresources 100% 5905 5.8KB/s 00:00
- ldirectord.cf 100% 7689 7.5KB/s 00:00
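Start heartbeat:
The start-up order matters: start heartbeat on node1 first, then start heartbeat on node2 remotely from node1. Stopping heartbeat on node2 must likewise be done remotely from node1.
- [root@node1 ha.d]# /etc/init.d/heartbeat start    # start on node1 first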
- Starting High-Availability services: [ OK ]
- [root@node1 ha.d]# ssh node2 '/etc/init.d/heartbeat start'    # start heartbeat on node2 remotely from node1
- Starting High-Availability services:
- [ OK ]
Check the current status of the cluster nodes:
- [root@node1 ~]# crm_mon -1    # show the current cluster status once, then exit
- ============
- Last updated: Sun Aug 5 09:12:40 2012
- Current DC: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70)
- 2 Nodes configured.    # two nodes
- 0 Resources configured.    # no resources yet
- ============
- Node: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70): online
- Node: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff): online
- [root@node1 ~]# netstat -tnlp    # check that port 5560 is listening
- tcp 0 0 0.0.0.0:5560 0.0.0.0:* LISTEN 31039/mgmtd
- [root@node1 ~]# crmadmin --status node1.yue.com    # check the node status
- Status of crmd@node1.yue.com: S_IDLE (ok)    # this node is the DC
- [root@node1 ~]# crmadmin --status node2.yue.com
- Status of crmd@node2.yue.com: S_NOT_DC (ok)
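When the status check is repeated often, the node list can be pulled out of the `crm_mon -1` output with a small filter. A minimal sketch, assuming the output format shown above; `online_nodes` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: list the nodes that crm_mon reports as online.
# online_nodes is a hypothetical helper; it reads `crm_mon -1` output on stdin.
online_nodes() {
    # A node line looks like: "Node: node1.yue.com (<uuid>): online"
    awk '$1 == "Node:" && $NF == "online" { print $2 }'
}
```

Typical use would be `crm_mon -1 | online_nodes`, which should print both node names while the cluster is healthy.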
- [root@node1 ~]# tail -1 /etc/passwd    # the hacluster user exists; set a password for it next
- hacluster:x:101:157:heartbeat user:/var/lib/heartbeat/cores/hacluster:/sbin/nologin
- [root@node1 ~]# passwd hacluster
- Changing password for user hacluster.
- New UNIX password:    # enter the password
- BAD PASSWORD: it is based on a dictionary word
- Retype new UNIX password:    # enter it again
- passwd: all authentication tokens updated successfully.
Configure the resources:
web_ip
web_ldirectord
[root@node1 ~]# hb_gui &
[1] 3535
Create a resource group: Web_server
Create the resources inside the group:
Create the resource web_ip
Create the resource web_ldirectord
Start the resources:
- [root@node1 ~]# crm_mon -1
- ============
- Last updated: Sun Aug 5 10:03:17 2012
- Current DC: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70): online
- Node: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff): online
- Resource Group: Web_server
- web_ldirectord (ocf::heartbeat:ldirectord): Started node1.yue.com
- web_ip (ocf::heartbeat:IPaddr2): Started node1.yue.com
- [root@node1 ~]# ip addr show
- 1: lo: mtu 16436 qdisc noqueue
- link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
- inet 127.0.0.1/8 scope host lo
- 2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
- link/ether 00:0c:29:e7:3d:5c brd ff:ff:ff:ff:ff:ff
- inet 192.168.0.11/24 brd 192.168.0.255 scope global eth0
- inet 192.168.0.2/32 brd 192.168.0.255 scope global eth0
- [root@node1 ~]# ipvsadm -Ln
- IP Virtual Server version 1.2.1 (size=4096)
- Prot LocalAddress:Port Scheduler Flags
- -> RemoteAddress:Port Forward Weight ActiveConn InActConn
- TCP 192.168.0.2:80 rr
- -> 192.168.0.5:80 Route 1 0 3
- -> 192.168.0.6:80 Route 1 0 2
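At this point, test from a browser: open http://192.168.0.2 and check that the page loads and that requests are balanced across the two Real Servers.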
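Refreshing a browser gives only a rough impression, so the distribution can also be counted from the command line. A minimal sketch, assuming `curl` is available on the client (the article itself uses a browser and elinks); `tally_responses` is a hypothetical helper that counts identical response bodies.

```shell
#!/bin/bash
# Sketch: count how often each distinct response body was returned.
# tally_responses is a hypothetical helper; it reads one body per line on stdin.
tally_responses() {
    # uniq -c prefixes each distinct line with its count; awk trims the padding.
    sort | uniq -c | awk '{ n = $1; $1 = ""; sub(/^ /, ""); print n, $0 }'
}
```

For example, `for i in 1 2 3 4 5 6 7 8 9 10; do curl -s http://192.168.0.2/; done | tally_responses` sends ten requests to the VIP. With the rr scheduler and both Real Servers healthy, the two pages would be expected to appear about equally often.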
Put node1 into Standby and check whether the resources move to node2:
- [root@node2 ~]# ipvsadm -Ln
- IP Virtual Server version 1.2.1 (size=4096)
- Prot LocalAddress:Port Scheduler Flags
- -> RemoteAddress:Port Forward Weight ActiveConn InActConn
- TCP 192.168.0.2:80 rr
- -> 192.168.0.5:80 Route 1 0 0
- -> 192.168.0.6:80 Route 1 0 0    # the virtual service is now active on node2
- [root@node2 ~]# ip addr show
- 1: lo: mtu 16436 qdisc noqueue
- link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
- inet 127.0.0.1/8 scope host lo
- 2: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
- link/ether 00:0c:29:d9:75:df brd ff:ff:ff:ff:ff:ff
- inet 192.168.0.12/24 brd 192.168.0.255 scope global eth0
- inet 192.168.0.2/32 brd 192.168.0.255 scope global eth0    # the VIP is now on node2
- [root@node2 ~]# crm_mon -1
- ============
- Last updated: Sun Aug 5 10:34:17 2012
- Current DC: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff)
- 2 Nodes configured.
- 1 Resources configured.
- ============
- Node: node1.yue.com (5d29dca8-514e-441d-8619-8c395db8cb70): standby
- Node: node2.yue.com (8f2d3cdb-f19e-493d-85be-f5214f7615ff): online
- Resource Group: Web_server
- web_ldirectord (ocf::heartbeat:ldirectord): Started node2.yue.com
- web_ip (ocf::heartbeat:IPaddr2): Started node2.yue.com
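The failover check above amounts to asking which Director currently holds the VIP. That question can be scripted by filtering the `ip addr show` output. A minimal sketch; `holds_vip` is a hypothetical helper, and the loop in the usage note relies on the passwordless SSH set up earlier.

```shell
#!/bin/bash
# Sketch: succeed if the `ip addr show` output on stdin contains the given VIP.
# holds_vip is a hypothetical helper; $1 is the VIP without prefix length.
holds_vip() {
    awk -v vip="$1" '
        # Address lines look like: "inet 192.168.0.2/32 brd ... scope global eth0"
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
}
```

From node1, something like `for n in node1 node2; do ssh "$n" ip addr show | holds_vip 192.168.0.2 && echo "$n holds the VIP"; done` should name exactly one node, and that node should change after the standby switch.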
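III. Observe how heartbeat-ldirectord checks the health of the back-end Real Servers
Stop one of the Real Servers, then open http://192.168.0.2 in a browser.
Refresh the page several times to see whether heartbeat-ldirectord has detected the failure; if it has, only the page of the remaining Real Server should be returned.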
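The same observation can be made on the active Director by watching the IPVS table rather than the browser. A minimal sketch, assuming the `ipvsadm -Ln` output format shown earlier; `active_real_servers` is a hypothetical helper. Depending on ldirectord's quiescent setting, a failed server may be removed from the table or kept with weight 0, so the filter treats both cases as inactive.

```shell
#!/bin/bash
# Sketch: list real servers that are present in the IPVS table with weight > 0.
# active_real_servers is a hypothetical helper; it reads `ipvsadm -Ln` output on stdin.
active_real_servers() {
    # Real server lines look like: "-> 192.168.0.5:80 Route 1 0 3"
    # The header line "-> RemoteAddress:Port ..." is skipped by the digit test.
    awk '$1 == "->" && $2 ~ /^[0-9]/ && $4 + 0 > 0 { print $2 }'
}
```

Running `ipvsadm -Ln | active_real_servers` before and after stopping httpd on one Real Server should show that server dropping out of the list once the check fails.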