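As the business behind our site's XX service has grown, we need cross-IDC disaster recovery so that an unexpected failure does not interrupt normal access to the site. I read through a number of approaches online and, adapting them to our own setup, ran a drill on test servers in our LAN.
The test went well, so here is a write-up of the LAN deployment: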
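Approach:
If server A fails, server B takes over A's work right away, so users are not affected.
Once server A recovers, it takes the work back from server B, and server B returns to standby.
Method:
Both servers are configured with the same virtual IP (VIP). The primary binds the VIP first. When the primary fails, the standby takes over the VIP automatically and refreshes the gateway's ARP entry for it. When the primary recovers, the standby releases the VIP, and the primary binds it again and refreshes the gateway's entry once more.
Topology:
Primary server: IP 192.168.190.199
Standby server: IP 192.168.190.208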
vip:192.168.190.88
gateWay=192.168.190.254
netMask=255.255.255.0
bcast=192.168.190.255
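Taking over an address by refreshing the gateway's ARP entry only works when the VIP sits in the same subnet as the gateway. The following is a minimal sketch of that sanity check in plain sh; the helper names ip_to_int and same_subnet are made up for this illustration and are not part of the scripts in this post.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    # Split the address on dots into the positional parameters.
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when two addresses fall in the same network under the given mask.
same_subnet() {
    a=$(ip_to_int "$1"); b=$(ip_to_int "$2"); m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}

vip=192.168.190.88
gateWay=192.168.190.254
netMask=255.255.255.0

if same_subnet "$vip" "$gateWay" "$netMask"; then
    echo "VIP and gateway share a subnet"
else
    echo "VIP is outside the gateway's subnet"
fi
```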
Steps:
1. Bind the virtual IP 192.168.190.88, the address that www.test.com points to, on the primary:
/sbin/ifconfig eth0:1 192.168.190.88 broadcast 192.168.190.255 netmask 255.255.255.0 up
/sbin/route add -host 192.168.190.88 dev eth0:1
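On systems that ship iproute2 instead of ifconfig, the same bind and release can probably be expressed with ip addr. This is only a sketch under assumptions: the prefix length 24 is the netmask 255.255.255.0 rewritten, and DRY_RUN and run are hypothetical helpers added here so the commands can be printed and reviewed before anything is changed.

```shell
#!/bin/sh
vip=192.168.190.88
bcast=192.168.190.255
prefix=24      # assumption: netmask 255.255.255.0 as a prefix length
dev=eth0

# Hypothetical switch: with DRY_RUN=1 (the default here) commands are
# printed instead of executed. Set DRY_RUN=0 and run as root to apply them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Add the VIP as an extra address on eth0, labelled eth0:1.
bind_vip() {
    run ip addr add "${vip}/${prefix}" brd "$bcast" dev "$dev" label "${dev}:1"
}

# Remove the VIP from eth0 again.
release_vip() {
    run ip addr del "${vip}/${prefix}" dev "$dev"
}

bind_vip
release_vip
```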
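2. Refresh the gateway's ARP entry for the VIP. This assumes the iputils arping, where -I selects the interface and -c limits the number of requests; without -c the command keeps sending until it is interrupted.
/sbin/arping -I eth0 -c 3 -s 192.168.190.88 192.168.190.254 > /dev/null 2>&1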
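3. When a failure occurs, the primary 192.168.190.199 releases the VIP 192.168.190.88 and the standby 192.168.190.208 takes it over.
On the primary, release the VIP:
/sbin/ifconfig eth0:1 192.168.190.88 broadcast 192.168.190.255 netmask 255.255.255.0 down
On the standby, bind the VIP with the commands from step 1, then refresh the gateway:
/sbin/arping -I eth0 -c 3 -s 192.168.190.88 192.168.190.254 > /dev/null 2>&1
4. Requests for www.test.com now reach the standby server 192.168.190.208.
5. Restart the web server on the standby.
6. Once the service on the primary 192.168.190.199 is back to normal, the standby 192.168.190.208 releases the VIP and the primary binds 192.168.190.88 again.
On the standby, release the VIP:
/sbin/ifconfig eth0:1 192.168.190.88 broadcast 192.168.190.255 netmask 255.255.255.0 down
On the primary, bind the VIP with the commands from step 1, then refresh the gateway:
/sbin/arping -I eth0 -c 3 -s 192.168.190.88 192.168.190.254 > /dev/null 2>&1
7. Restart the web service on both the primary and the standby.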
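After a manual switch it is worth confirming which machine actually holds the VIP. A rough sketch, assuming iproute2 is installed: list_addrs and holds_vip are hypothetical helper names, and the listing is kept in its own function so it can be replaced when testing.

```shell
#!/bin/sh
vip=192.168.190.88

# List the IPv4 addresses on this machine, one per line.
list_addrs() {
    ip -o -4 addr show
}

# Succeed when the VIP is currently bound on this machine.
# -F treats the pattern as a fixed string, so the dots match literally.
holds_vip() {
    list_addrs | grep -q -F -- " ${vip}/"
}
```

Running `holds_vip && echo "VIP is bound here"` on each server should print the message on exactly one of them.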
Primary server 192.168.190.199
Switchover script autoSwitchMain.sh:
#!/bin/sh
#############################################################
#desc: automatic service failover when a server goes down
#author:gaozhonghui
#mail:[email protected]
#date:20121101
#############################################################
vip=192.168.190.88
gateWay=192.168.190.254
netMask=255.255.255.0
bcast=192.168.190.255
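# Bind the VIP to the eth0:1 alias and add a host route for it.
function_bind_vip1(){
    /sbin/ifconfig eth0:1 ${vip} broadcast ${bcast} netmask ${netMask} up
    /sbin/route add -host ${vip} dev eth0:1
}
# Take the eth0:1 alias down, releasing the VIP.
function_remove_vip1(){
    /sbin/ifconfig eth0:1 ${vip} broadcast ${bcast} netmask ${netMask} down
}
# Tell the gateway which machine now answers for the VIP.
# Assumes the iputils arping: -I picks the interface, -c 3 stops after three requests.
function_vip_arping1(){
    /sbin/arping -I eth0 -c 3 -s ${vip} ${gateWay} > /dev/null 2>&1
}
# Reload nginx (kept for manual use; the loop below does not call it).
function_restart_nginx(){
    /web/webserver/nginx/sbin/nginx -s reload
}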
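# "Y" while this machine holds the VIP, "N" otherwise.
bind_time_vip="N"
while true
do
    # Probe the primary's own web service on its real IP.
    httpCode_rip1=`/usr/bin/curl -o /dev/null -s -w '%{http_code}' http://192.168.190.199`
    if [ "${httpCode_rip1}" = "200" ]
    then
        # Healthy: bind the VIP if we do not hold it yet.
        if [ "${bind_time_vip}" = "N" ]
        then
            function_bind_vip1
            function_vip_arping1
            bind_time_vip="Y"
        fi
        # Keep announcing the VIP on every pass.
        function_vip_arping1
    else
        # Unhealthy: release the VIP so the standby can take it.
        if [ "${bind_time_vip}" = "Y" ]
        then
            function_remove_vip1
            bind_time_vip="N"
        fi
    fi
    sleep 10
done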
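The loop above switches on a single probe, so one slow or dropped request is enough to move the VIP. One possible refinement, sketched here under assumptions, is to retry a few times before declaring the primary unhealthy. probe and is_healthy are hypothetical names; the 5-second limit (curl's -m option) and the default of 3 attempts are arbitrary illustration values, not tuned settings.

```shell
#!/bin/sh
# Print the HTTP status code for a URL, giving up after 5 seconds.
probe() {
    /usr/bin/curl -o /dev/null -s -m 5 -w '%{http_code}' "$1"
}

# Succeed if any of the attempts returns 200.
# $1 = URL, $2 = number of attempts (defaults to 3).
is_healthy() {
    url=$1; attempts=${2:-3}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        [ "$(probe "$url")" = "200" ] && return 0
        i=$((i + 1))
    done
    return 1
}
```

In the loop, the status comparison could then become `if is_healthy http://192.168.190.199 3`, with the rest of the branch logic unchanged.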
Then start the script as a background daemon on Linux:
/usr/bin/nohup /bin/sh /home/Gzh/shell/autoSwitchMain.sh > /dev/null 2>&1 &
Standby server 192.168.190.208:
Switchover script autoSwitchSlave.sh:
#!/bin/sh
#############################################################
#desc: automatic service failover when a server goes down
#author:gaozhonghui
#mail:[email protected]
#date:20121101
#############################################################
vip=192.168.190.88
gateWay=192.168.190.254
netMask=255.255.255.0
bcast=192.168.190.255
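# Bind the VIP to the eth0:1 alias and add a host route for it.
function_bind_vip1(){
    /sbin/ifconfig eth0:1 ${vip} broadcast ${bcast} netmask ${netMask} up
    /sbin/route add -host ${vip} dev eth0:1
}
# Take the eth0:1 alias down, releasing the VIP.
function_remove_vip1(){
    /sbin/ifconfig eth0:1 ${vip} broadcast ${bcast} netmask ${netMask} down
}
# Tell the gateway which machine now answers for the VIP.
# Assumes the iputils arping: -I picks the interface, -c 3 stops after three requests.
function_vip_arping1(){
    /sbin/arping -I eth0 -c 3 -s ${vip} ${gateWay} > /dev/null 2>&1
}
# Reload nginx (kept for manual use; the loop below does not call it).
function_restart_nginx(){
    /web/webserver/nginx/sbin/nginx -s reload
}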
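# "Y" while this machine holds the VIP, "N" otherwise.
bind_time_vip="N"
while true
do
    # Probe the primary's web service on its real IP.
    httpCode_rip1=`/usr/bin/curl -o /dev/null -s -w '%{http_code}' http://192.168.190.199`
    if [ "${httpCode_rip1}" = "200" ]
    then
        # Primary is healthy: hand the VIP back if we hold it.
        if [ "${bind_time_vip}" = "Y" ]
        then
            function_remove_vip1
            bind_time_vip="N"
        fi
        # No ARP announcement here: the standby does not own the VIP in this state.
    else
        # Primary is down: take over the VIP and announce it.
        if [ "${bind_time_vip}" = "N" ]
        then
            function_bind_vip1
            function_vip_arping1
            bind_time_vip="Y"
        fi
    fi
    sleep 10
done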
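The two loops are mirror images of each other: both probe the primary, and they differ only in which outcome binds the VIP and which releases it. The sketch below writes that decision down as one function so it can be checked without touching any interface. next_action is a hypothetical helper, and the sketch assumes the standby sends no ARP announcements while the primary is healthy.

```shell
#!/bin/sh
# Decide what a node should do next.
# $1 = role ("primary" or "standby")
# $2 = HTTP status code returned by the primary
# $3 = "Y" if this node currently holds the VIP, otherwise "N"
# Prints one of: bind, release, announce, none.
next_action() {
    role=$1; code=$2; holds=$3
    if [ "$code" = "200" ]; then primary_ok=Y; else primary_ok=N; fi
    case "${role}:${primary_ok}:${holds}" in
        primary:Y:N) echo bind ;;      # primary is healthy and takes the VIP
        primary:Y:Y) echo announce ;;  # primary keeps refreshing the gateway
        primary:N:Y) echo release ;;   # primary is failing and gives up the VIP
        standby:N:N) echo bind ;;      # standby takes over
        standby:Y:Y) echo release ;;   # standby hands the VIP back
        *)           echo none ;;      # nothing to change
    esac
}

next_action primary 200 N
next_action standby 000 N
```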
Start it as a background daemon:
/usr/bin/nohup /bin/sh /home/Gzh/shell/autoSwitchSlave.sh > /dev/null 2>&1 &
References:
http://blog.s135.com/post/379/