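Basics:
DRBD disk mirroring
DRBD (Distributed Replicated Block Device) consists of a kernel module and supporting scripts, and is used to build high-availability clusters. It works by mirroring an entire block device over the network, so you can keep a real-time mirror of a local block device on a remote machine. You can think of it as RAID 1 over the network.
How it works:
DRBD takes each write, stores it on the local disk, and sends it to the other host, which then writes it to its own disk. In the single-primary setup used here, only one node at a time has read-write access, which is enough for a typical failover high-availability cluster.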
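DRBD protocols:
A: a write is considered complete once the data has been written to the local disk and handed to the network.
B: a write is considered complete once the peer acknowledges that it has received the data.
C: a write is considered complete once the peer acknowledges that it has written the data to disk.
Protocol C is the most widely used.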
The three DRBD processes:
drbd0_worker: the main process
drbd0_asender: the sending process for drbd0 on the primary
drbd0_receiver: the receiving process for drbd0 on the secondary
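Points to note before configuring DRBD:
- The DRBD device must be switched to the primary role before it can be mounted.
- Only one of the two nodes can be primary at any given time; the other is secondary, and the DRBD device cannot be mounted on the secondary.
- The two partitions being synchronized should be the same size so that no disk space is wasted, since a DRBD mirror behaves like RAID 1 over the network.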
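The first two points can be turned into a small guard before mounting. This is only a sketch, not part of the original setup: local_role and mount_if_primary are hypothetical helper names, and it assumes the DRBD 8.3 status format of /proc/drbd (a line for minor 0 that contains an ro: field).

```shell
#!/bin/sh
# Sketch: refuse to mount unless the local DRBD role is Primary.
# The optional status-file argument exists only so the helpers can be
# tried against a saved copy; on a real node it defaults to /proc/drbd.

# Print the local role (the part of "ro:" before the slash) for minor 0.
local_role() {
    status_file="${1:-/proc/drbd}"
    sed -n 's|^ *0: .*ro:\([A-Za-z]*\)/.*|\1|p' "$status_file"
}

# $1 = device, $2 = mount point, $3 = optional status file.
# Mount only when this node is Primary; otherwise explain and fail.
mount_if_primary() {
    role="$(local_role "$3")"
    if [ "$role" = "Primary" ]; then
        mount "$1" "$2"
    else
        echo "refusing to mount $1: local role is '${role:-unknown}'" >&2
        return 1
    fi
}
```

On a node it would be called as mount_if_primary /dev/drbd0 /data; on the secondary it returns a non-zero status instead of attempting the mount.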
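Test environment:
- Two nodes (node1: 192.168.1.110, node2: 192.168.1.111), both names mapped in /etc/hosts
- Two disks (one additional disk of the same size on each of node1 and node2)
- Clocks kept in sync on both nodes
- SELinux disabled, and port 7788 open in the firewall (or the firewall turned off)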
Installation:
Install the third-party repository (ELRepo):
rpm -Uvh http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
yum makecache
yum install *gcc* make*    # dependencies needed for the source build
Install DRBD (yum):
[root@node1 ~]# yum -y install drbd83-utils kmod-drbd83
[root@node2 ~]# yum -y install drbd83-utils kmod-drbd83
On both node1 and node2, also run "yum install kernel-devel kernel kernel-headers flex"; otherwise you will get the error "FATAL: Module drbd not found". Reboot after the installation.
Load the drbd module manually:
depmod
Note: after a Linux kernel module has been built and put in place, depmod must be run before modprobe; it generates a new modules.dep.
/sbin/modprobe drbd
lsmod | grep drbd
Install DRBD (from source):
DRBD has two parts: the kernel module and the user-space management tools.
[root@node1 ~]# wget http://oss.linbit.com/drbd/8.3/drbd-8.3.16.tar.gz && tar -zxvf drbd-8.3.16.tar.gz
[root@node1 ~]# cd drbd-8.3.16
[root@node1 drbd-8.3.16]# ./configure --prefix=/usr/local/drbd --with-km    # --with-km enables the drbd kernel module (see ./configure --help)
[root@node1 drbd-8.3.16]# make KDIR=/usr/src/kernels/2.6.32-358.el6.x86_64/    # KDIR is the path to your own kernel tree
[root@node1 drbd-8.3.16]# make install
[root@node1 drbd-8.3.16]# mkdir -p /usr/local/drbd/var/run/drbd    # directory DRBD needs at run time
[root@node1 drbd-8.3.16]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/    # add the init script
[root@node1 drbd-8.3.16]# chkconfig --add drbd && chkconfig drbd on    # start at boot
Install the drbd module:
[root@node1 drbd-8.3.16]# cd drbd
[root@node1 drbd]# make clean
[root@node1 drbd]# make KDIR=/usr/src/kernels/2.6.32-358.el6.x86_64/
[root@node1 drbd]# cp drbd.ko /lib/modules/`uname -r`/kernel/lib/
[root@node1 drbd]# depmod
[root@node1 drbd]# /sbin/modprobe drbd
[root@node1 drbd]# lsmod | grep drbd
DRBD partition:
Create a dedicated partition for DRBD on both nodes:
[root@node1 ~]# fdisk -cu /dev/sdb
[root@node2 ~]# fdisk -cu /dev/sdb
For example:
[root@node1 yum.repos.d]# fdisk -cu /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x2a0f1472.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): p
Disk /dev/sdb: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders, total 4194304 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x2a0f1472
   Device Boot      Start         End      Blocks   Id  System
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (2048-4194303, default 2048):
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-4194303, default 4194303):
Using default value 4194303
Command (m for help): w
The partition table has been altered!
mkfs.ext4 /dev/sdb1
Create the resource configuration for the distributed replicated block device:
[root@node1 ~]# vi /etc/drbd.d/clusterdb.res
resource clusterdb {
  startup {
    wfc-timeout 30;
    outdated-wfc-timeout 20;
    degr-wfc-timeout 30;
  }
  net {
    cram-hmac-alg sha1;
    shared-secret sync_disk;
  }
  syncer {
    rate 10M;
    al-extents 257;
    on-no-data-accessible io-error;
  }
  on node1 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 192.168.1.110:7788;
    flexible-meta-disk internal;
  }
  on node2 {
    device /dev/drbd0;
    disk /dev/sdb1;
    address 192.168.1.111:7788;
    meta-disk internal;
  }
}
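Make sure both nodes can resolve each other through /etc/hosts:
/etc/hosts
192.168.1.110 node1 node1.example.com
192.168.1.111 node2 node2.example.com
Time synchronization (run periodically from cron; entries edited with crontab -e take no user field):
crontab -e
5 * * * * /usr/sbin/ntpdate time-nw.nist.gov
:wq
crontab -l
5 * * * * /usr/sbin/ntpdate time-nw.nist.gov
service crond restart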
Copy the DRBD configuration file to node2:
[root@node1 ~]# scp /etc/drbd.d/clusterdb.res node2:/etc/drbd.d/clusterdb.res
Initialize the DRBD metadata on both nodes:
[root@node1 ~]# drbdadm create-md clusterdb
[root@node2 ~]# drbdadm create-md clusterdb
You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/sdb1 at byte offset 2146430976
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
Error:
md_offset 2146758656
al_offset 2146725888
bm_offset 2146660352
Found ext3 filesystem
2096448 kB data area apparently used
2096348 kB left usable by current configuration
Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
* use external meta data (recommended)
* shrink that filesystem first
* zero out the device (destroy the filesystem)
Operation refused.
Command 'drbdmeta 0 v08 /dev/sdb1 internal create-md' terminated with exit code 40
drbdadm create-md clusterdb: exited with code 40
Fix: the partition already holds a file system that leaves no room for the internal metadata. Zero out the start of the partition (this destroys that file system), then initialize the metadata again:
dd if=/dev/zero of=/dev/sdb1 bs=1M count=1
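Start the DRBD service on both nodes:
[root@node1 ~]# service drbd start
[root@node2 ~]# service drbd start
Use drbdadm to make node1 the primary node:
[root@node1 ~]# drbdadm -- --overwrite-data-of-peer primary all
Note: on the node that is to become primary you can also use one of the following commands:
drbdsetup primary all
or
[root@node1 ~]# drbdsetup /dev/drbd0 primary -o
Check the status again afterwards and you will see that data synchronization has started.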
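Check whether the initial disk synchronization has finished, and confirm that you are on the primary node:
[root@node1 yum.repos.d]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2013-09-27 15:59:12
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:78848 nr:0 dw:0 dr:79520 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:2017180
        [>....................] sync'ed:  4.0% (2017180/2096028)K
        finish: 0:02:58 speed: 11,264 (11,264) K/sec
Once the synchronization has finished, oos drops to 0:
    ns:1081628 nr:0 dw:33260 dr:1048752 al:14 bm:64 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Note: /proc/drbd shows the current DRBD state. On the status line, ro (st in older DRBD versions) gives the roles of the local and peer node; both are Secondary until one node is promoted. ds gives the disk states of the two nodes.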
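The fields of the status output can also be read by a script. The following is a rough sketch under the assumption that the status keeps the 8.3 key:value layout shown above; drbd_field and sync_eta_seconds are hypothetical helper names, and the sample line is copied from the output above rather than read from a live node.

```shell
#!/bin/sh
# Sketch: pull the cs, ro and ds fields out of a /proc/drbd status line,
# and estimate the remaining sync time from the oos and speed figures.

# Print the value of one "key:value" field (e.g. cs, ro, ds, oos).
# $1 = field name; the status text is read from stdin.
drbd_field() {
    tr ' ' '\n' | sed -n "s/^$1://p" | head -n 1
}

# Remaining seconds = out-of-sync amount / current speed.
# $1 = oos in K, $2 = speed in K/sec (thousands separator removed).
sync_eta_seconds() {
    echo $(( $1 / $2 ))
}

sample=' 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----- ns:78848 nr:0 dw:0 dr:79520 al:0 bm:4 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:2017180'

echo "connection: $(echo "$sample" | drbd_field cs)"
echo "roles:      $(echo "$sample" | drbd_field ro)"
echo "disks:      $(echo "$sample" | drbd_field ds)"
echo "eta (s):    $(sync_eta_seconds "$(echo "$sample" | drbd_field oos)" 11264)"
```

With the sample values the estimate is 179 seconds (2017180 / 11264, integer division), which is close to the finish: 0:02:58 that DRBD itself reports.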
Create a file system on the distributed replicated block device:
[root@node1 yum.repos.d]# /sbin/mkfs.ext4 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
131072 inodes, 524007 blocks
26200 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=536870912
16 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
Mount the DRBD device on the primary node:
mkdir /data
mount /dev/drbd0 /data
[root@node1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_unixmencentos65-lv_root
                       19G  3.6G   15G  20% /
tmpfs                 1.2G   44M  1.2G   4% /dev/shm
/dev/sda1             485M   80M  380M  18% /boot
/dev/drbd0            2.0G   36M  1.9G   2% /data
Create a test file:
[root@node1 ~]# dd if=/dev/zero of=/data/tempfile1.tmp bs=104857600 count=2    # creates a 200 MB test file
[root@node1 data]# ll
total 204816
drwx------ 2 root root     16384 Jan 25 00:14 lost+found
-rw-r--r-- 1 root root 209715200 Jan 25 00:15 tempfile1.tmp
Switch the primary and secondary roles:
(On the primary node, unmount the data directory and demote the node to secondary.)
(On the secondary node, promote it to primary and mount the data directory.)
[root@node1 ~]# umount /data    # unmount the DRBD device
[root@node1 ~]# drbdadm secondary clusterdb    # demote node1 from primary to secondary
[root@node1 data]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:774292 nr:8 dw:476096 dr:300622 al:123 bm:23 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@node2 ~]# drbdadm -- --overwrite-data-of-peer primary all    # promote node2 to primary; with both disks UpToDate, plain "drbdadm primary clusterdb" is enough
[root@node2 ~]# mount /dev/drbd0 /data    # mount drbd0 on /data
[root@node2 data]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:32 nr:774292 dw:774324 dr:2030 al:2 bm:23 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Check the data directory:
[root@node2 data]# ll
total 204816
drwx------ 2 root root     16384 Jan 25 02:24 lost+found
-rw-r--r-- 1 root root 209715200 Jan 25 02:25 tempfile1.tmp
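The listing only confirms that the file name and size survived the switchover. A stricter check, offered here as an optional sketch rather than a step from the original walkthrough, is to record a checksum on node1 before unmounting and compare it with the one computed on node2 afterwards. file_fingerprint is a hypothetical helper name, and the two temporary files below merely stand in for the two views of /data/tempfile1.tmp.

```shell
#!/bin/sh
# Sketch: compare a file's checksum before and after the switchover.
# cksum prints "<CRC> <size>" (no file name) when it reads from stdin.

file_fingerprint() {
    cksum < "$1"
}

# Demo with two temporary files standing in for the copy seen on node1
# (before the switchover) and on node2 (after it). On real nodes, run
# file_fingerprint /data/tempfile1.tmp on each side and compare.
a="$(mktemp)"; b="$(mktemp)"
printf 'replicated payload\n' > "$a"
cp "$a" "$b"
if [ "$(file_fingerprint "$a")" = "$(file_fingerprint "$b")" ]; then
    echo "checksums match"
else
    echo "checksums differ"
fi
rm -f "$a" "$b"
```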
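Note:
Always unmount the mounted directory before demoting the primary node to secondary; otherwise the command fails with an error.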
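The ordering in this note (unmount before demoting, promote before mounting) can be captured in a small wrapper. This is a sketch only: run, demote and promote are hypothetical helper names, DRY_RUN is an assumed switch that defaults to printing the commands instead of executing them, and promote uses plain drbdadm primary, which assumes both disks are already UpToDate.

```shell
#!/bin/sh
# Sketch of the manual switchover. With DRY_RUN=1 (the default here) the
# commands are only printed, so the script is safe to try anywhere.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run on the current primary: unmount first, then demote.
# $1 = resource name, $2 = mount point
demote() {
    run umount "$2" || return 1
    run drbdadm secondary "$1"
}

# Run on the node taking over: promote first, then mount.
# $1 = resource name, $2 = DRBD device, $3 = mount point
promote() {
    run drbdadm primary "$1" || return 1
    run mount "$2" "$3"
}

demote clusterdb /data
promote clusterdb /dev/drbd0 /data
```

In practice demote would run on node1 and promote on node2; both are called in one script here only to show the order of the four commands.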
Translated from: http://www.unixmen.com/configure-drbd-centos-6-5/
See also: http://geekpeek.net/drbd-how-to-configure-drbd-on-centos/
DRBD split-brain handling: http://www.tuicool.com/articles/Zv22i2j
DRBD introduction, working principle and split-brain recovery: http://bruce007.blog.51cto.com/7748327/1330959
DRBD split-brain handling: http://itindex.net/detail/50197-drbd
Source download: http://oss.linbit.com/drbd/