Red Hat Enterprise Clustering and Storage Management: DRBD + Heartbeat + NFS Implementation in Detail
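Background
This lab deploys a DRBD + Heartbeat + NFS environment to build a highly available (HA) file server cluster. In this design DRBD keeps the data on the two servers complete and consistent. DRBD works much like RAID-1 over the network: when data is written to the local file system, it is also sent to a second host on the network and recorded there in the same form on another file system, so the primary and standby nodes stay synchronized in real time. If the local primary server fails, the standby server still holds an identical copy of the data and can keep serving it. In an HA setup DRBD can therefore take the place of a shared disk array, because the data exists on both the primary and the standby server. On failover the remote host simply uses its own copy to provide the same service as the primary, and clients should not notice that the primary has failed.
Simplified topology
Package download: http://down.51cto.com/data/402474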

[Figure 1: simplified topology diagram]

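Implementation steps:
I. Basic network configuration
1.1 node1: basic configuration and new disk partition
1.1.1 Check system information and synchronize the time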

[root@node1 ~]# uname -rv

2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:54 EDT 2009

[root@node1 ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 5.4 (Tikanga)

[root@node1 ~]# hwclock -s

[root@node1 ~]# date

Wed Feb 8 13:55:50 CST 2012

1.1.2 Check the hostname; set and verify the IP address

[root@node1 ~]# hostname

node1.junjie.com

[root@node1 ~]# cat /etc/sysconfig/network

NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=node1.junjie.com

[root@node1 ~]# setup

[Figure 2: screenshot of the setup utility]

[root@node1 ~]# service network restart

Shutting down interface eth0: [ OK ]

Shutting down loopback interface: [ OK ]

Bringing up loopback interface: [ OK ]

Bringing up interface eth0: [ OK ]

[root@node1 ~]# ifconfig eth0

eth0 Link encap:Ethernet HWaddr 00:0C:29:AE:83:D1

inet addr:192.168.101.211 Bcast:192.168.101.255 Mask:255.255.255.0

1.1.3 Configure /etc/hosts (no DNS is used)

[root@node1 ~]# echo "192.168.101.211 node1.junjie.com node1" >>/etc/hosts

[root@node1 ~]# echo "192.168.101.212 node2.junjie.com node2" >>/etc/hosts
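Running the two echo commands a second time would append duplicate lines. As a minimal sketch (not part of the original procedure), the helper below appends an entry only when its IP is not already present. The function name add_host_entry and the HOSTS_FILE variable are hypothetical; the demo writes to a scratch file rather than /etc/hosts.

```shell
# Append a hosts entry only when the IP is not already listed.
# HOSTS_FILE defaults to a scratch file; set it to /etc/hosts on a real node.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    # $1 = IP address, $2 = fully qualified name, $3 = short alias
    if ! awk -v ip="$1" '$1 == ip { found = 1 } END { exit !found }' "$HOSTS_FILE" 2>/dev/null; then
        printf '%s %s %s\n' "$1" "$2" "$3" >> "$HOSTS_FILE"
    fi
}

add_host_entry 192.168.101.211 node1.junjie.com node1
add_host_entry 192.168.101.212 node2.junjie.com node2
add_host_entry 192.168.101.211 node1.junjie.com node1   # repeated call is a no-op
cat "$HOSTS_FILE"
```

The same function can be run unchanged on node2, since both nodes need both entries.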

1.1.4 Create a new partition to serve as the DRBD backing device

[root@node1 ~]# fdisk -l

 

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

 

Device Boot Start End Blocks Id System

/dev/sda1 * 1 13 104391 83 Linux

/dev/sda2 14 1318 10482412+ 83 Linux

/dev/sda3 1319 1579 2096482+ 82 Linux swap / Solaris

[root@node1 ~]# fdisk /dev/sda

Keystrokes at the fdisk prompt, in order: p, n, p, Enter (accept the default start), +1000M, p, w

[root@node1 ~]# fdisk -l

 

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

 

Device Boot Start End Blocks Id System

/dev/sda1 * 1 13 104391 83 Linux

/dev/sda2 14 1318 10482412+ 83 Linux

/dev/sda3 1319 1579 2096482+ 82 Linux swap / Solaris

/dev/sda4 1580 1702 987997+ 83 Linux

[root@node1 ~]# partprobe /dev/sda

[root@node1 ~]# cat /proc/partitions

major minor #blocks name

 

8 0 20971520 sda

8 1 104391 sda1

8 2 10482412 sda2

8 3 2096482 sda3

8 4 987997 sda4
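The fdisk dialogue above can be written down as a list of answers, one per line, which is handy when the same partition has to be created on the second node. The sketch below only prints that list; the function name fdisk_answers is hypothetical, and the pipe into fdisk is left commented out because it rewrites the partition table.

```shell
# Print the answers typed at the fdisk prompts, one per line.
fdisk_answers() {
    # p: print table, n: new partition, p: primary,
    # empty line: accept the default first cylinder,
    # +1000M: partition size, p: print again, w: write and quit
    printf '%s\n' p n p '' +1000M p w
}

fdisk_answers
# Destructive, so shown only as a comment:
# fdisk_answers | fdisk /dev/sda
```

This matches the layout in this lab, where three primary partitions already exist and fdisk selects partition 4 on its own; a disk with a different layout would prompt for a partition number as well.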

1.2 node2: basic configuration and new disk partition

1.2.1 Check system information and synchronize the time

[root@node2 ~]# uname -rv

2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:54 EDT 2009

[root@node2 ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 5.4 (Tikanga)

[root@node2 ~]# hwclock -s

[root@node2 ~]# date

Wed Feb 8 14:02:22 CST 2012

1.2.2 Check the hostname; set and verify the IP address

[root@node2 ~]# hostname

node2.junjie.com

[root@node2 ~]# cat /etc/sysconfig/network

NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=node2.junjie.com

[root@node2 ~]# setup

[Figure 3: screenshot of the setup utility]

[root@node2 ~]# service network restart

Shutting down interface eth0: [ OK ]

Shutting down loopback interface: [ OK ]

Bringing up loopback interface: [ OK ]

Bringing up interface eth0: [ OK ]

[root@node2 ~]# ifconfig eth0

eth0 Link encap:Ethernet HWaddr 00:0C:29:D1:D4:32

inet addr:192.168.101.212 Bcast:192.168.101.255 Mask:255.255.255.0

1.2.3 Configure /etc/hosts (no DNS is used)

[root@node2 ~]# echo "192.168.101.211 node1.junjie.com node1" >>/etc/hosts

[root@node2 ~]# echo "192.168.101.212 node2.junjie.com node2" >>/etc/hosts

1.2.4 Create a new partition to serve as the DRBD backing device

[root@node2 ~]# fdisk -l

 

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

 

Device Boot Start End Blocks Id System

/dev/sda1 * 1 13 104391 83 Linux

/dev/sda2 14 1318 10482412+ 83 Linux

/dev/sda3 1319 1579 2096482+ 82 Linux swap / Solaris

[root@node2 ~]# fdisk /dev/sda

Keystrokes at the fdisk prompt, in order: p, n, p, Enter (accept the default start), +1000M, p, w

[root@node2 ~]# fdisk -l

 

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

 

Device Boot Start End Blocks Id System

/dev/sda1 * 1 13 104391 83 Linux

/dev/sda2 14 1318 10482412+ 83 Linux

/dev/sda3 1319 1579 2096482+ 82 Linux swap / Solaris

/dev/sda4 1580 1702 987997+ 83 Linux

[root@node2 ~]# partprobe /dev/sda

[root@node2 ~]# cat /proc/partitions

major minor #blocks name

 

8 0 20971520 sda

8 1 104391 sda1

8 2 10482412 sda2

8 3 2096482 sda3

8 4 987997 sda4

1.3 Configure SSH keys on node1 and node2, so that either node can later run commands on the other directly.

1.3.1 Configure the SSH key on node1

[root@node1 ~]# ssh-keygen -t rsa

[root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub [email protected]

1.3.2 Configure the SSH key on node2

[root@node2 ~]# ssh-keygen -t rsa

[root@node2 ~]# ssh-copy-id -i .ssh/id_rsa.pub [email protected]

II. DRBD installation and configuration
Perform the following on both node1 and node2.
The packages I downloaded (placed in /root/):

drbd83-8.3.8-1.el5.centos.i386.rpm

kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
 

2.1 Install the DRBD packages

[root@node1 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm

[root@node1 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

 

[root@node2 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm

[root@node2 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

2.2 Load the DRBD module

[root@node1 ~]# modprobe drbd

[root@node1 ~]# lsmod | grep drbd

drbd 228528 0

[root@node1 ~]#

 

[root@node2 ~]# modprobe drbd

[root@node2 ~]# lsmod | grep drbd

drbd 228528 0

[root@node2 ~]#

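2.3 Edit the configuration files
When DRBD runs, it reads the configuration file /etc/drbd.conf, which describes the mapping between DRBD devices and disk partitions.

2.3.1 Configure node1 as follows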

 

[Figure 4: configuration file screenshot]

[root@node1 ~]# cd /etc/drbd.d/

[root@node1 drbd.d]# ll

total 4

-rwxr-xr-x 1 root root 1418 Jun 4 2010 global_common.conf

[root@node1 drbd.d]# cp global_common.conf global_common.conf.bak

[Figure 5: configuration file screenshot]

[Figure 6: configuration file screenshot]
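The edited files appear here only as screenshots, so their exact contents are not reproduced in the text. As a rough sketch of what a resource file such as /etc/drbd.d/nfs.res could look like in DRBD 8.3 syntax, the snippet below writes a candidate into a scratch directory. The resource name, device, backing partition, host names, and IP addresses come from the commands in this article; TCP port 7789 and the placement of protocol C inside the resource block are assumptions, and the author's real file may differ.

```shell
# Write a candidate nfs.res into a scratch directory for review.
# Assumptions: port 7789, and "protocol C" set at resource level.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/nfs.res" <<'EOF'
resource nfs {
    protocol C;
    on node1.junjie.com {
        device    /dev/drbd0;
        disk      /dev/sda4;
        address   192.168.101.211:7789;
        meta-disk internal;
    }
    on node2.junjie.com {
        device    /dev/drbd0;
        disk      /dev/sda4;
        address   192.168.101.212:7789;
        meta-disk internal;
    }
}
EOF

cat "$CONF_DIR/nfs.res"
```

The names after "on" have to match the output of uname -n on each node, which is why the fully qualified host names are used.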

2.3.2 Copy the configuration to node2:

[root@node1 drbd.d]# scp /etc/drbd.conf node2:/etc/

drbd.conf 100% 133 0.1KB/s 00:00

[root@node1 drbd.d]# scp /etc/drbd.d/* node2:/etc/drbd.d/

global_common.conf 100% 427 0.4KB/s 00:00

global_common.conf.bak 100% 1418 1.4KB/s 00:00

nfs.res 100% 330 0.3KB/s 00:00

[root@node1 drbd.d]#

2.4 Check the configuration

[root@node1 drbd.d]# drbdadm adjust nfs

0: Failure: (119) No valid meta-data signature found.

 

==> Use 'drbdadm create-md res' to initialize meta-data area. <==

 

Command 'drbdsetup 0 disk /dev/sda4 /dev/sda4 internal --set-defaults --create-device --fencing=resource-only --on-io-error=detach' terminated with exit code 10

[root@node1 drbd.d]# drbdadm adjust nfs

drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.

[root@node1 drbd.d]#

2.5 Create the nfs resource

2.5.1 Create the nfs resource on node1

[root@node1 drbd.d]# drbdadm create-md nfs

Writing meta data...

initializing activity log

NOT initialized bitmap

New drbd meta data block successfully created.

[root@node1 drbd.d]# ll /dev/drbd0

brw-r----- 1 root disk 147, 0 Feb 8 14:27 /dev/drbd0

2.5.2 Create the nfs resource on node2

[root@node1 drbd.d]# ssh node2.junjie.com 'drbdadm create-md nfs'

NOT initialized bitmap

Writing meta data...

initializing activity log

New drbd meta data block successfully created.

[root@node1 drbd.d]# ssh node2.junjie.com 'ls -l /dev/drbd0'

brw-r----- 1 root disk 147, 0 Feb 8 14:19 /dev/drbd0

2.6 Start the DRBD service

[root@node1 drbd.d]# service drbd start

Starting DRBD resources: drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.

[root@node1 drbd.d]# ssh node2.junjie.com 'service drbd start'

Starting DRBD resources: drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.

[root@node1 drbd.d]#

2.7 Check the DRBD status and enable the service at boot

[root@node1 drbd.d]# service drbd status

drbd driver loaded OK; device status:

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by [email protected], 2010-06-04 08:04:16

m:res cs ro ds p mounted fstype

0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C

[root@node1 drbd.d]# ssh node2.junjie.com 'service drbd status'

drbd driver loaded OK; device status:

version: 8.3.8 (api:88/proto:86-94)

GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by [email protected], 2010-06-04 08:04:16

m:res cs ro ds p mounted fstype

0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C

[root@node1 drbd.d]#

 

[root@node1 drbd.d]# drbd-overview

0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C r----

[root@node1 drbd.d]# ssh node2.junjie.com 'drbd-overview'

0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C r----

[root@node1 drbd.d]#
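The status line packs the connection state, the roles, and the disk states into fixed columns, each written as local/peer. A small sketch of reading those columns in a script, assuming the drbd-overview layout shown above (the function names local_role and disks_in_sync are hypothetical):

```shell
# Print the local role (Primary or Secondary) for each status line on stdin.
local_role() {
    # field 3 is "LocalRole/PeerRole", e.g. "Secondary/Primary"
    awk '{ split($3, role, "/"); print role[1] }'
}

# Exit 0 only when every line reports UpToDate on both sides (field 4).
disks_in_sync() {
    awk '$4 != "UpToDate/UpToDate" { bad = 1 } END { exit bad + 0 }'
}

sample='  0:nfs  Connected Secondary/Secondary Inconsistent/Inconsistent C r----'
printf '%s\n' "$sample" | local_role
printf '%s\n' "$sample" | disks_in_sync || echo "not yet in sync"
```

On a node, the sample line would be replaced by the real command output, for example drbd-overview | local_role.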

 

[root@node1 drbd.d]# chkconfig drbd on

[root@node1 drbd.d]# chkconfig --list drbd

drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off

[root@node1 drbd.d]# ssh node2.junjie.com 'chkconfig drbd on'

[root@node1 drbd.d]# ssh node2.junjie.com 'chkconfig --list drbd'

drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off

2.8 On node1 (the primary node), run the following and check the mount information.

[root@node1 drbd.d]# mkdir /mnt/nfs

[root@node1 drbd.d]# ssh node2.junjie.com 'mkdir /mnt/nfs'

[root@node1 drbd.d]# drbdsetup /dev/drbd0 primary -o

[root@node1 drbd.d]# mkfs.ext3 /dev/drbd0

[root@node1 drbd.d]# mount /dev/drbd0 /mnt/nfs/

[Figure 7: screenshot of the mount information]

 

DRBD is now configured successfully.
III. NFS configuration

On both servers, edit the NFS configuration file and the NFS startup script as follows:

[Figures 8 and 9: screenshots of the NFS configuration file and the NFS startup script]
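Because the NFS changes are shown only as screenshots, the exact export line is not available in the text. A minimal sketch of an /etc/exports entry for the DRBD-backed directory follows; the client subnet and the option list are assumptions chosen for illustration (no_root_squash is at least consistent with the root-owned files the client creates later in the test), and the startup script change is not reconstructed here.

```shell
# Write a candidate exports line to a scratch file for review.
# Assumptions: the 192.168.101.0/24 client range and the option list.
EXPORTS_FILE="${EXPORTS_FILE:-$(mktemp)}"
printf '%s\n' '/mnt/nfs 192.168.101.0/24(rw,sync,no_root_squash)' > "$EXPORTS_FILE"
cat "$EXPORTS_FILE"

# On a real node, after copying the line into /etc/exports:
# exportfs -r    # re-read /etc/exports
```

Whatever the final line is, it should be identical on both nodes so that the export looks the same to clients after a failover.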

 
IV. Heartbeat configuration
Perform the following on both node1 and node2:
1. Install the Heartbeat packages

[root@node1 ~]# yum localinstall -y heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck

[root@node1 ~]#

 

[root@node2 ~]# yum localinstall -y heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck

[root@node2 ~]#

 
2. Copy the sample configuration files

[root@node1 ~]# cd /usr/share/doc/heartbeat-2.1.4/

[root@node1 heartbeat-2.1.4]# cp authkeys ha.cf haresources /etc/ha.d/

[root@node1 heartbeat-2.1.4]# cd /etc/ha.d/

3. Edit the configuration files

[Figures 10 to 12: screenshots of the edited Heartbeat configuration files]
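The three Heartbeat files copied in step 2 (ha.cf, authkeys, haresources) are edited in screenshots only. The sketch below writes one plausible set into a scratch directory. The node names, the virtual IP 192.168.101.210 on eth0, the DRBD resource nfs, and the ext3 mount at /mnt/nfs come from this article; the timing values, the crc authentication method, and the use of the nfs init script as the final resource are assumptions, and the author's files may differ.

```shell
# Write candidate Heartbeat files into a scratch directory for review.
# Assumptions: timing values, crc auth, and "nfs" as the service script.
HA_DIR="${HA_DIR:-$(mktemp -d)}"

cat > "$HA_DIR/ha.cf" <<'EOF'
logfile /var/log/ha-log
keepalive 2
deadtime 10
udpport 694
bcast eth0
auto_failback on
node node1.junjie.com
node node2.junjie.com
EOF

cat > "$HA_DIR/authkeys" <<'EOF'
auth 1
1 crc
EOF
chmod 600 "$HA_DIR/authkeys"   # Heartbeat refuses an authkeys file others can read

# Preferred node, virtual IP, DRBD promotion, file system mount, NFS service
cat > "$HA_DIR/haresources" <<'EOF'
node1.junjie.com IPaddr::192.168.101.210/24/eth0 drbddisk::nfs Filesystem::/dev/drbd0::/mnt/nfs::ext3 nfs
EOF

ls -l "$HA_DIR"
```

Heartbeat starts the resources on the haresources line from left to right and stops them in reverse, so the address and the DRBD device come up before the file system and NFS. The haresources file has to be the same on both nodes.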

V. Testing
1. On the test client, mount 192.168.101.210:/mnt/nfs at the local directory /data

[root@client ~]# mkdir /data

[root@client ~]# mount 192.168.101.210:/mnt/nfs/ /data/

[root@client ~]# cd /data/

[root@client data]# ll

total 20

-rw-r--r-- 1 root root 4 Feb 8 17:41 f1

drwx------ 2 root root 16384 Feb 8 14:57 lost+found

[root@client data]# touch f-client-1

[root@client data]# ll

total 20

-rw-r--r-- 1 root root 0 Feb 8 19:50 f-client-1

-rw-r--r-- 1 root root 4 Feb 8 17:41 f1

drwx------ 2 root root 16384 Feb 8 14:57 lost+found

[root@client data]# cd

[root@client ~]#

2. On the test client, create a test shell script that touches a file once per second

[Figure 13: screenshot of the test script]
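The script itself is shown only as a screenshot. Judging from the output it prints later in this section, it could look roughly like the sketch below. The function name nfs_probe and its optional count and interval arguments are hypothetical additions so the loop can be tried against a scratch directory; the original presumably loops forever against /data.

```shell
# Touch a file in DIR once per INTERVAL seconds, logging before and after.
# Usage: nfs_probe DIR [COUNT] [INTERVAL]; COUNT 0 means loop until interrupted.
nfs_probe() {
    dir="$1"; count="${2:-0}"; interval="${3:-1}"
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        echo "---> trying touch x : $(date)"
        touch "$dir/x"
        echo "<----- done touch x : $(date)"
        echo
        i=$((i + 1))
        sleep "$interval"
    done
}

# Demo against a scratch directory; on the client this would be: nfs_probe /data
demo_dir=$(mktemp -d)
nfs_probe "$demo_dir" 2 0
```

Printing a timestamp on both sides of the touch is what makes the test useful: if the NFS server pauses during a failover, the gap between the "trying" and "done" lines shows roughly how long the client was blocked.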

 

3. Stop the heartbeat service on the primary node node1; the standby node node2 takes over the service

[root@node1 ha.d]# service heartbeat stop

Stopping High-Availability services:

[ OK ]

[root@node1 ha.d]# drbd-overview

0:nfs Connected Secondary/Primary UpToDate/UpToDate C r----

[root@node1 ha.d]# ifconfig eth0:0

eth0:0 Link encap:Ethernet HWaddr 00:0C:29:AE:83:D1

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

Interrupt:67 Base address:0x2000

 

[root@node1 ha.d]#

 

[root@node2 ha.d]# drbd-overview

0:nfs Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 950M 18M 885M 2%

[root@node2 ha.d]# ifconfig eth0:0

eth0:0 Link encap:Ethernet HWaddr 00:0C:29:D1:D4:32

inet addr:192.168.101.210 Bcast:192.168.101.254 Mask:255.255.255.0

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

Interrupt:67 Base address:0x2000

 

[root@node2 ha.d]#

4. Run the nfs.sh test script on the client; it keeps printing output like the following:

[root@client ~]# ./nfs.sh

---> trying touch x : Wed Feb 8 20:00:58 CST 2012

<----- done touch x : Wed Feb 8 20:00:58 CST 2012

 

---> trying touch x : Wed Feb 8 20:00:59 CST 2012

<----- done touch x : Wed Feb 8 20:00:59 CST 2012

 

---> trying touch x : Wed Feb 8 20:01:00 CST 2012

<----- done touch x : Wed Feb 8 20:01:00 CST 2012

 

---> trying touch x : Wed Feb 8 20:01:01 CST 2012

<----- done touch x : Wed Feb 8 20:01:01 CST 2012

 

---> trying touch x : Wed Feb 8 20:01:02 CST 2012

<----- done touch x : Wed Feb 8 20:01:02 CST 2012

 

---> trying touch x : Wed Feb 8 20:01:03 CST 2012

<----- done touch x : Wed Feb 8 20:01:03 CST 2012

 

---> trying touch x : Wed Feb 8 20:01:04 CST 2012

<----- done touch x : Wed Feb 8 20:01:04 CST 2012

 

---> trying touch x : Wed Feb 8 20:01:05 CST 2012

<----- done touch x : Wed Feb 8 20:01:05 CST 2012

 
 
5. Check the mount information on the client; the share is still usable:

[root@client ~]# mount

/dev/sda2 on / type ext3 (rw)

proc on /proc type proc (rw)

sysfs on /sys type sysfs (rw)

devpts on /dev/pts type devpts (rw,gid=5,mode=620)

/dev/sda1 on /boot type ext3 (rw)

tmpfs on /dev/shm type tmpfs (rw)

none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)

sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

nfsd on /proc/fs/nfsd type nfsd (rw)

192.168.101.210:/mnt/nfs on /data type nfs (rw,addr=192.168.101.210)

[root@client ~]#

[root@client ~]# ll /data/

total 20

-rw-r--r-- 1 root root 0 Feb 8 19:50 f-client-1

-rw-r--r-- 1 root root 4 Feb 8 17:41 f1

drwx------ 2 root root 16384 Feb 8 14:57 lost+found

[root@client ~]#

 
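At this point node2 has taken over the service and the lab does what it set out to do. You can also create files by hand in the NFS mount directory and switch the DRBD service back and forth between node1 and node2 for further testing.
6. Restore node1 as the primary node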

[root@node1 ha.d]# service heartbeat start

Starting High-Availability services:

2012/02/08_20:04:49 INFO: Resource is stopped

[ OK ]

[root@node1 ha.d]# drbd-overview

0:nfs Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 950M 18M 885M 2%

[root@node1 ha.d]#

DRBD + Heartbeat + NFS is now up and running.
(End)

--xjzhujunjie

--2012/05/08