Heartbeat + DRBD + MySQL + NFS Deployment Guide

System Environment

OS: CentOS 6.6

Architecture: x86_64


Software Environment

heartbeat-3.0.4-2

drbd-8.4.3

nfs-utils-1.2.3-26


Deployment Environment

Role / IP

VIP 192.168.1.13 (the service address on the internal network)

data-09.com br0: 192.168.1.9

data-11.com br0: 192.168.1.11



1. DRBD

Note: DRBD can use a whole disk, a partition, or a logical volume as its backing device, but the device must not already contain a filesystem.

1) Install build dependencies (on both nodes)

[root@data-09 ~]# yum install gcc gcc-c++ make glibc flex kernel-devel kernel-headers

2) Build and install DRBD

[root@data-09 ~]# wget http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz

[root@data-09 ~]# tar zxvf drbd-8.4.3.tar.gz

[root@data-09 ~]# cd drbd-8.4.3

[root@data-09 ~]# ./configure --prefix=/usr/local/tdoa/drbd --with-km

[root@data-09 ~]# make KDIR=/usr/src/kernels/2.6.32-279.el6.x86_64/
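The KDIR above is tied to one specific kernel build. As a sketch (assuming the kernel-devel package matching the running kernel is installed), the path can be derived from `uname -r` instead of being hard-coded:

```shell
# Derive the kernel source directory from the running kernel instead of
# hard-coding 2.6.32-279.el6.x86_64 (assumes matching kernel-devel is installed).
KDIR="/usr/src/kernels/$(uname -r)"
echo "would run: make KDIR=$KDIR"
```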

[root@data-09 ~]# make install

[root@data-09 ~]# mkdir -p /usr/local/tdoa/drbd/var/run/drbd

[root@data-09 ~]# cp /usr/local/tdoa/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d

Load the DRBD module:

[root@data-09 ~]# modprobe drbd

3) Configure DRBD

The configuration file is identical on both nodes:

[root@data-09 ~]# cat /usr/local/tdoa/drbd/etc/drbd.conf

resource r0{

protocol C;

startup { wfc-timeout 0; degr-wfc-timeout 120;}

disk { on-io-error detach;}

net{

  timeout 60;

  connect-int 10;

  ping-int 10;

  max-buffers 2048;

  max-epoch-size 2048;

}

syncer { rate 100M;}

on data-09.com{

  device /dev/drbd0;

  disk   /dev/data/data_lv;

  address 192.168.1.9:7788;

  meta-disk internal;

}

on data-11.com{

  device /dev/drbd0;

  disk   /dev/data/data_lv;

  address 192.168.1.11:7788;

  meta-disk internal;

}

4) Initialize the r0 resource and bring it up

Run on both nodes:

[root@data-09 ~]# drbdadm create-md r0

[root@data-09 ~]# drbdadm up r0

The status on data-09.com and data-11.com should now look like this:

[root@data-09 ~]# cat /proc/drbd

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2014-02-26 07:26:07

 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
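The cs (connection state), ro (roles), and ds (disk states) fields in that output can be pulled out mechanically. A small helper, purely illustrative (not part of DRBD), that extracts them from a /proc/drbd status line:

```shell
# Extract connection state, roles, and disk states from a /proc/drbd line.
drbd_status_fields() {
  sed -n 's/.*cs:\([^ ]*\) ro:\([^ ]*\) ds:\([^ ]*\).*/\1 \2 \3/p'
}

echo " 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----" \
  | drbd_status_fields
# → Connected Secondary/Secondary UpToDate/UpToDate
```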


5) Promote data-09.com to primary and enable the service at boot

[root@data-09 ~]# drbdadm primary --force r0

The status on data-09.com should now look like this:

[root@data-09 ~]# cat /proc/drbd

version: 8.4.3 (api:1/proto:86-101)

GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2014-02-26 07:28:26

 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

ns:4 nr:0 dw:4 dr:681 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0


Note: the DRBD service must be set to start at boot (chkconfig drbd on, on both nodes).



2. NFS

  yum install nfs-utils portmap -y   # install the NFS services

  vim /etc/exports

/usr/local/tdoa/data/attach     192.168.100.0/24(rw,no_root_squash)

/usr/local/tdoa/data/attachment 192.168.100.0/24(rw,no_root_squash)
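One classic /etc/exports pitfall is worth guarding against here: a space between the client spec and the option list (`192.168.100.0/24 (rw)`) exports the share to the whole world with those options, while the client spec silently gets the defaults. A quick check (hypothetical helper, not part of nfs-utils) that flags such lines:

```shell
# Flag /etc/exports lines where whitespace separates the client spec from
# the option parentheses -- a silent "export to everyone" mistake.
check_exports_spacing() {
  if grep -nE '[^[:space:]][[:space:]]+\(' "$1"; then
    return 1   # at least one suspicious line, printed with its line number
  fi
  return 0
}
```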

  service rpcbind restart

  service nfs restart

  chkconfig rpcbind on

  chkconfig nfs off    # heartbeat will manage NFS, so do not start it at boot

  service nfs stop

  Verify that a front-end web server can mount the exports and write to them, then stop the NFS service.
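That mount-and-write verification can be scripted. A minimal sketch, with the VIP and export path taken from this document's environment (the helper name is made up):

```shell
# Return 0 if a file can be created (and removed) inside the given directory.
check_writable() {
  local probe="$1/.nfs_write_test.$$"
  touch "$probe" 2>/dev/null && rm -f "$probe"
}

# On a front-end web server, the test would look like:
#   mount -t nfs 192.168.1.13:/usr/local/tdoa/data/attach /mnt/attach
#   check_writable /mnt/attach && echo "export is writable"
#   umount /mnt/attach
```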


3. MySQL

1. Create the high-availability directory /usr/local/data

  The data5 subdirectory holds the database files.

2. On the heartbeat primary, move MySQL's data directory to /usr/local/data/data5.

3. After MySQL is installed on both heartbeat nodes, switch the DRBD partition over to the standby and confirm that MySQL works there as well.

Demote the current primary to secondary:

[root@data-09 /]# drbdadm secondary r0

[root@data-09 /]# cat /proc/drbd

On the standby data-11.com, promote it to primary:

[root@data-11 /]# drbdadm primary r0



4. heartbeat

(1.1) Install heartbeat with YUM

[root@data-09 ~]# wget http://mirrors.sohu.com/fedora-epel/6Server/x86_64/epel-release-6-8.noarch.rpm

[root@data-09 ~]# rpm -ivh epel-release-6-8.noarch.rpm

[root@data-09 ~]# yum install heartbeat -y

  

 (1.2) Install heartbeat from local RPM packages

      1.yum install "liblrm.so.2()(64bit)"

      2.rpm -ivh PyXML-0.8.4-19.el6.x86_64.rpm

      3.rpm -ivh perl-TimeDate-1.16-13.el6.noarch.rpm

      4.rpm -ivh resource-agents-3.9.5-12.el6_6.1.x86_64.rpm

      5.rpm -ivh cluster-glue-1.0.5-6.el6.x86_64.rpm

      6.rpm -ivh cluster-glue-libs-1.0.5-6.el6.x86_64.rpm

      7.rpm -ivh heartbeat-libs-3.0.4-2.el6.x86_64.rpm heartbeat-3.0.4-2.el6.x86_64.rpm

      Note: heartbeat-libs and heartbeat must be installed in the same transaction.

    

(2) Configure heartbeat

The three configuration files (ha.cf, authkeys, haresources) are identical on both nodes:

  cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/

  cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/

  cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/


vim /etc/ha.d/ha.cf

#############################################

logfile /var/log/ha-log       #log file

logfacility     local0        #syslog facility

keepalive 2                   #heartbeat interval in seconds

deadtime 5                    #declare the peer dead after this many seconds

ucast eth3 75.0.2.33          #heartbeat NIC and the peer's IP (the only line that differs on the standby)

auto_failback off             #do not move resources back automatically when the primary recovers

node oa-mysql.com oa-haproxy.com   #the hostnames (uname -n) of the two nodes

###############################################################################

vim /etc/ha.d/authkeys      #this file must be mode 600

######################

auth 3              #use method 3, MD5

#1 crc

#2 sha1 HI!

3 md5 heartbeat

######################
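heartbeat refuses to start unless authkeys is mode 600. A one-line check (GNU coreutils `stat`, as shipped on CentOS; the helper name is illustrative):

```shell
# True when the given file is mode 600 (heartbeat's requirement for authkeys).
authkeys_mode_ok() {
  [ "$(stat -c %a "$1")" = "600" ]
}

# authkeys_mode_ok /etc/ha.d/authkeys || chmod 600 /etc/ha.d/authkeys
```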

vim /etc/ha.d/haresources

#########################################################################

data-09.com IPaddr::192.168.1.13/24/br0 drbddisk::r0 Filesystem::/dev/drbd0::/usr/local/data::ext4 mysql nfs

Explanation: primary node's hostname, VIP/prefix/NIC to bind, DRBD resource, DRBD device::mount point::filesystem, then the mysql and nfs resource scripts.
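Each bare service name at the end of that haresources line (here mysql and nfs) must resolve to an executable script, searched first in /etc/ha.d/resource.d and then in /etc/init.d. A sketch of that lookup (directories are passed as arguments so the function is easy to exercise):

```shell
# Print the path of the first executable named $1 found in the given dirs.
find_resource_script() {
  local name="$1" d
  shift
  for d in "$@"; do
    if [ -x "$d/$name" ]; then
      echo "$d/$name"
      return 0
    fi
  done
  return 1
}

# find_resource_script nfs /etc/ha.d/resource.d /etc/init.d
```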


(5) Create the drbddisk, nfs, and mysql resource scripts and make them executable (all three must live in /etc/ha.d/resource.d)

[root@data-09 ~]# cat /etc/ha.d/resource.d/drbddisk

##################################################################

#!/bin/bash

#

# This script is intended to be used as resource script by heartbeat

#

# Copyright 2003-2008 LINBIT Information Technologies

# Philipp Reisner, Lars Ellenberg

#

###

DEFAULTFILE="/etc/default/drbd"

DRBDADM="/sbin/drbdadm"

if [ -f $DEFAULTFILE ]; then

 . $DEFAULTFILE

fi

if [ "$#" -eq 2 ]; then

 RES="$1"

 CMD="$2"

else

 RES="all"

 CMD="$1"

fi

## EXIT CODES

# since this is a "legacy heartbeat R1 resource agent" script,

# exit codes actually do not matter that much as long as we conform to

#  http://wiki.linux-ha.org/HeartbeatResourceAgent

# but it does not hurt to conform to lsb init-script exit codes,

# where we can.

#  http://refspecs.linux-foundation.org/LSB_3.1.0/

#LSB-Core-generic/LSB-Core-generic/iniscrptact.html

####

drbd_set_role_from_proc_drbd()

{

local out

if ! test -e /proc/drbd; then

ROLE="Unconfigured"

return

fi

dev=$( $DRBDADM sh-dev $RES )

minor=${dev#/dev/drbd}

if [[ $minor = *[!0-9]* ]] ; then

# sh-minor is only supported since drbd 8.3.1

minor=$( $DRBDADM sh-minor $RES )

fi

if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then

ROLE=Unknown

return

fi

if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then

set -- $out

ROLE=${5%/**}

: ${ROLE:=Unconfigured} # if it does not show up

else

ROLE=Unknown

fi

}

case "$CMD" in

   start)

# try several times, in case heartbeat deadtime

# was smaller than drbd ping time

try=6

while true; do

$DRBDADM primary $RES && break

let "--try" || exit 1 # LSB generic error

sleep 1

done

;;

   stop)

# heartbeat (haresources mode) will retry failed stop

# for a number of times in addition to this internal retry.

try=3

while true; do

$DRBDADM secondary $RES && break

# We used to lie here, and pretend success for anything != 11,

# to avoid the reboot on failed stop recovery for "simple

# config errors" and such. But that is incorrect.

# Don't lie to your cluster manager.

# And don't do config errors...

let --try || exit 1 # LSB generic error

sleep 1

done

;;

   status)

if [ "$RES" = "all" ]; then

   echo "A resource name is required for status inquiries."

   exit 10

fi

ST=$( $DRBDADM role $RES )

ROLE=${ST%/**}

case $ROLE in

Primary|Secondary|Unconfigured)

# expected

;;

*)

# unexpected. whatever...

# If we are unsure about the state of a resource, we need to

# report it as possibly running, so heartbeat can, after failed

# stop, do a recovery by reboot.

# drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is

# suddenly readonly.  So we retry by parsing /proc/drbd.

drbd_set_role_from_proc_drbd

esac

case $ROLE in

Primary)

echo "running (Primary)"

exit 0 # LSB status "service is OK"

;;

Secondary|Unconfigured)

echo "stopped ($ROLE)"

exit 3 # LSB status "service is not running"

;;

*)

# NOTE the "running" in below message.

# this is a "heartbeat" resource script,

# the exit code is _ignored_.

echo "cannot determine status, may be running ($ROLE)"

exit 4 #  LSB status "service status is unknown"

;;

esac

;;

   *)

echo "Usage: drbddisk [resource] {start|stop|status}"

exit 1

;;

esac

exit 0

##############################################################

[root@data-09 ~]# cat /etc/ha.d/resource.d/nfs

killall -9 nfsd; /etc/init.d/nfs restart; exit 0

For mysql, the init script that ships with MySQL is sufficient:

 cp /etc/init.d/mysql /etc/ha.d/resource.d/

               Note: the nfs, mysql, and drbddisk scripts all need execute permission (chmod +x).

 (6) Start heartbeat

[root@data-09 ~]# service heartbeat start    (start on both nodes)

[root@data-09 ~]# chkconfig heartbeat off

Note: autostart is deliberately disabled; after a server reboot, heartbeat must be started by hand.

5. Testing

From another Linux client, mount the VIP 192.168.1.13. A successful mount means the NFS + DRBD + Heartbeat stack is working.


Testing DRBD + Heartbeat + NFS availability:

1. While copying a file into the mounted /tmp directory, abruptly reboot the primary DRBD server. The transfer resumes where it left off, although the drbd + heartbeat switchover itself takes some time.

2. If the primary's eth0 is brought down with ifdown and the secondary is then promoted and mounted manually, the files copied on the old primary are indeed synced over. After the primary's eth0 is restored, however, the primary/secondary relationship does not recover automatically: DRBD reports "Split-Brain detected, dropping connection!" and both nodes go StandAlone. This is the classic split brain, and the DRBD project recommends recovering from it manually (in production the odds of hitting it are low, since nobody deliberately disturbs a live server).


Manual split-brain recovery:

i. On the node whose changes are to be discarded:

1. drbdadm secondary r0

2. drbdadm disconnect all

3. drbdadm -- --discard-my-data connect r0

ii. On the surviving primary:

1. drbdadm disconnect all

2. drbdadm connect r0
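The two recovery sequences above can be captured in a tiny dispatcher, shown here only as an illustration (the function name and the "victim"/"survivor" labels are made up; r0 is this document's resource):

```shell
# Print the DRBD split-brain recovery commands for one side.
# "victim" = node whose local changes are discarded; "survivor" = the other.
split_brain_cmds() {
  case "$1" in
    victim)
      printf '%s\n' \
        "drbdadm secondary r0" \
        "drbdadm disconnect all" \
        "drbdadm -- --discard-my-data connect r0"
      ;;
    survivor)
      printf '%s\n' \
        "drbdadm disconnect all" \
        "drbdadm connect r0"
      ;;
    *)
      echo "usage: split_brain_cmds victim|survivor" >&2
      return 1
      ;;
  esac
}
```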


3. If the primary suffers a hardware failure and the secondary must be promoted to primary, proceed as follows:

On the primary, first unmount the DRBD device:

umount /tmp


Demote the host to secondary:

[root@data-09 /]# drbdadm secondary r0

[root@data-09 /]# cat /proc/drbd

1: cs:Connected st:Secondary/Secondary ds:UpToDate/UpToDate C r-----

Both hosts are now secondaries.

On the standby data-11.com, promote it to primary:

[root@data-11 /]# drbdadm primary r0

[root@data-11 /]# cat /proc/drbd

1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r-----


Known issue:

heartbeat (in haresources mode) does not monitor the resources themselves. If drbd or nfs dies, nothing happens; heartbeat only acts once it considers the peer machine dead, i.e. failover is triggered only by a machine crash or a network outage. For resource-level monitoring there is an alternative stack: corosync + pacemaker.
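Until such a migration, a crude workaround (entirely hypothetical, not part of heartbeat) is a cron job on the active node that polls the managed services and at least logs when one has died:

```shell
# Report managed services that are down; returns non-zero if any failed.
# `service` is the SysV helper used throughout this document.
check_resources() {
  local failed=0 svc
  for svc in "$@"; do
    if ! service "$svc" status >/dev/null 2>&1; then
      echo "resource down: $svc"
      failed=1
    fi
  done
  return $failed
}

# Example crontab entry on the active node:
# * * * * * root /usr/local/sbin/check_resources mysql nfs >> /var/log/ha-watch.log
```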

