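Installing and Configuring DRBD
Definition:
DRBD (Distributed Replicated Block Device) is a distributed storage system that sits in the storage layer of the Linux kernel. It replicates a block device, and with it the file system and data stored there, between two Linux servers over the network, much like a network RAID 1.
If one of the two servers loses power or crashes, the other still holds a copy of the data. How complete that copy is depends on the replication protocol in use.
When data is written to the file system on the local primary node, it is also sent over the network to the remote secondary node. The two nodes stay in sync over TCP/IP, so when the primary fails, the secondary holds the same data and can take over the service. Liveness detection and automatic failover between the two nodes can be handled by Heartbeat without manual intervention (Heartbeat is left out of this article).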
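The three DRBD replication protocols:
Protocol A (asynchronous replication): a write is considered complete as soon as the primary has finished its local write, without regard to whether the secondary has written the data. Data loss is likely if the primary fails.
Protocol B (semi-synchronous replication): a write is considered complete once the primary has finished its local write and has sent the data to the secondary, without waiting for the secondary to write it to disk. Data loss is still possible.
Protocol C (synchronous replication): a write is considered complete only after both the primary and the secondary have written the data. The failure of a single node does not lose data, which keeps the two copies consistent.
Protocol C is the usual choice.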
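Use cases:
DRBD + Heartbeat + MySQL: high availability for MySQL with automatic failover
DRBD + Heartbeat + NFS: high availability for NFS, used as the shared back-end storage of a cluster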
Environment:
[root@scj ~]# cat /etc/issue
CentOS release 6.4 (Final)
Kernel \r on an \m

[root@scj ~]# uname -r
2.6.32-358.el6.i686

Host   | IP address      | Host name     | Role      |
dbm135 | 192.168.186.135 | dbm135.51.com | primary   |
dbm134 | 192.168.186.134 | dbm134.51.com | secondary |
Preparation:
Disable iptables and SELinux: (dbm135, dbm134)
[root@scj ~]# service iptables stop
[root@scj ~]# setenforce 0
[root@scj ~]# vi /etc/sysconfig/selinux
---------------
SELINUX=disabled
---------------
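Since setenforce 0 only lasts until the next reboot, it can help to confirm that the persistent setting was really changed. The sketch below is one way to do that; the function names are made up for this illustration, and the file path is passed as an argument rather than hard-coded.

```shell
# Print the SELINUX= mode configured in the given SELinux config file.
# The path is an argument so the function can be pointed at
# /etc/sysconfig/selinux (as used above) or at a test copy.
selinux_config_mode() {
    sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "$1" | tail -n 1
}

# Succeed only when the persistent setting is "disabled".
selinux_is_disabled() {
    [ "$(selinux_config_mode "$1")" = "disabled" ]
}
```

For example, selinux_is_disabled /etc/sysconfig/selinux || echo "SELinux is still enabled after reboot" can be run on both nodes.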
Set the host names:
dbm135
[root@scj ~]# hostname dbm135.51.com
[root@scj ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=dbm135.51.com

dbm134
[root@scj ~]# hostname dbm134.51.com
[root@scj ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=dbm134.51.com
Edit /etc/hosts: (dbm135, dbm134)
192.168.186.135 dbm135.51.com dbm135
192.168.186.134 dbm134.51.com dbm134
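A quick way to confirm that both entries are present on each node is sketched below. The function name hosts_has_entry is hypothetical, and the hosts file path is passed in so the check can also be run against a copy.

```shell
# Succeed if the hosts file ($1) maps the IP address ($2) to the name ($3).
hosts_has_entry() {
    awk -v ip="$2" -v name="$3" '
        $1 == ip {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit !found }' "$1"
}
```

Running hosts_has_entry /etc/hosts 192.168.186.134 dbm134.51.com on dbm135 (and the mirror-image check on dbm134) catches a forgotten or mistyped line before DRBD is started.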
Time synchronization: (dbm135, dbm134)
[root@scj ~]# /usr/sbin/ntpdate pool.ntp.org    # this can also be set up as a scheduled (cron) job
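The scheduled job mentioned above could look roughly like this. The 30-minute schedule and the output redirection are assumptions chosen for illustration, not values from the original setup, and ntp_cron_line is a made-up helper name.

```shell
# Emit a crontab line that runs ntpdate on a fixed schedule.
# The schedule (every 30 minutes) and the discarded output are
# illustrative assumptions; adjust them to local policy.
ntp_cron_line() {
    server=${1:-pool.ntp.org}
    printf '*/30 * * * * /usr/sbin/ntpdate %s >/dev/null 2>&1\n' "$server"
}
```

One way to install it is ( crontab -l 2>/dev/null; ntp_cron_line ) | crontab - which appends the line to root's existing crontab.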
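Network: (dbm135, dbm134)
DRBD replication is demanding on the network, especially under heavy write load when a lot of data has to be synchronized. It is best to place both dbm servers in the same data center and replicate over the internal network.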
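Disk layout:
The (disk) partitions on the two nodes must be the same size.
To allow for the database size and its future growth, an LVM logical volume is used here.
Partitioning: (dbm135)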
[root@scj ~]# fdisk -l

Disk /dev/sda: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00051b9f

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              64        2611    20458496   8e  Linux LVM

Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/VolGroup-lv_root: 19.9 GB, 19872612352 bytes
255 heads, 63 sectors/track, 2416 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/mapper/VolGroup-lv_swap: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
The fdisk -l output shows an unused 10.7 GB device, /dev/sdb. Create the logical volume on /dev/sdb:
[root@scj ~]# pvcreate /dev/sdb              # create the physical volume
  Physical volume "/dev/sdb" successfully created
[root@scj ~]# pvs
  PV         VG       Fmt  Attr PSize  PFree
  /dev/sda2  VolGroup lvm2 a--  19.51g     0
  /dev/sdb            lvm2 a--  10.00g 10.00g
[root@scj ~]# vgcreate drbd /dev/sdb         # create volume group "drbd" and add the PV to it
  Volume group "drbd" successfully created
[root@scj ~]# vgs
  VG       #PV #LV #SN Attr   VSize  VFree
  VolGroup   1   2   0 wz--n- 19.51g     0
  drbd       1   0   0 wz--n- 10.00g 10.00g
[root@scj ~]# lvcreate -n dbm -L 9G drbd     # create the logical volume in volume group "drbd"
  Logical volume "dbm" created
[root@scj ~]# lvs
  LV      VG       Attr      LSize  Pool Origin Data%  Move Log Cpy%Sync Convert
  lv_root VolGroup -wi-ao--- 18.51g
  lv_swap VolGroup -wi-ao---  1.00g
  dbm     drbd     -wi-a----  9.00g
[root@scj ~]# ls /dev/drbd/dbm               # check the new logical volume
/dev/drbd/dbm
Note: lay out the disk on dbm134 in exactly the same way.
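Because the backing devices on the two nodes must be the same size, a byte-level comparison is a cheap sanity check before going further. The sketch below uses blockdev --getsize64 from util-linux; the helper names are hypothetical, and fetching the remote size over ssh is an assumption about how the nodes are reached.

```shell
# Size of a block device in bytes (blockdev is part of util-linux).
device_size_bytes() {
    blockdev --getsize64 "$1"
}

# Compare two sizes given in bytes; succeed only when they are equal.
sizes_match() {
    [ "$1" -eq "$2" ]
}
```

On dbm135 this could be used as sizes_match "$(device_size_bytes /dev/mapper/drbd-dbm)" "$(ssh dbm134 blockdev --getsize64 /dev/mapper/drbd-dbm)" with a 9 GiB volume expected to report 9663676416 bytes on both sides.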
Installing DRBD:
Install DRBD with yum (recommended): (dbm135, dbm134)
[root@scj ~]# cd /usr/local/src/
[root@scj src]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@scj src]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@scj src]# yum -y install drbd83-utils kmod-drbd83     # this may take a while
[root@scj ~]# modprobe drbd      # load the drbd module
FATAL: Module drbd not found.
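Fixing the failed module load:
Running yum -y install drbd83-utils kmod-drbd83 also updated the kernel, and the new kernel only takes effect after a reboot. Until then the running kernel cannot find the drbd module.
Reboot the server.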
Check that DRBD is installed correctly: (dbm135, dbm134)
[root@scj ~]# modprobe drbd      # load the drbd module
[root@scj ~]# modprobe -l | grep drbd
weak-updates/drbd83/drbd.ko
[root@scj ~]# lsmod | grep drbd
drbd                  311209  0
After a successful installation, /sbin contains the drbdadm, drbdmeta and drbdsetup commands, and the init script is at /etc/init.d/drbd.
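The lsmod check above can be wrapped in a small helper so that scripts only load the module when it is missing. This is a minimal sketch; drbd_module_loaded is a made-up name, and it reads lsmod-style text from standard input.

```shell
# Read lsmod-style output on stdin and succeed if a module named
# exactly "drbd" is listed.
drbd_module_loaded() {
    grep -q '^drbd[[:space:]]'
}
```

Typical use would be lsmod | drbd_module_loaded || modprobe drbd on each node.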
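Configuring DRBD:
At run time DRBD reads /etc/drbd.conf, which describes the mapping between DRBD devices and disk partitions along with other DRBD parameters.
Edit the configuration file: (dbm135, dbm134)
[root@scj ~]# cp /usr/share/doc/drbd83-utils-8.3.16/drbd.conf.example /etc/drbd.conf.old     # keep a copy of the shipped example for reference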
[root@scj ~]# vi /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
global {
    usage-count no;                    # take part in the DRBD usage statistics or not; default is yes
}
common {
    syncer {
        rate 100M;                     # upper limit on the synchronization rate between the nodes
    }
}
resource r0 {                          # the resource is named r0
    protocol C;                        # protocol C: a write completes only after the peer has confirmed it
    startup {
        wfc-timeout 120;
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
        fencing resource-only;
    }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        max-buffers 2048;
        max-epoch-size 2048;
        cram-hmac-alg "sha1";
        shared-secret "MYSAL-HA";      # algorithm and shared secret used to authenticate the peer
    }
    on dbm135.51.com {                 # each host section starts with "on" followed by the host name (the output of hostname)
        device /dev/drbd0;             # DRBD device name
        disk /dev/mapper/drbd-dbm;     # backing device used by /dev/drbd0
        address 192.168.186.135:7788;  # address and port DRBD listens on to talk to the peer
        meta-disk internal;            # DRBD metadata is stored on the backing device itself
    }
    on dbm134.51.com {
        device /dev/drbd0;             # DRBD device name
        disk /dev/mapper/drbd-dbm;     # backing device
        address 192.168.186.134:7788;
        meta-disk internal;            # where DRBD keeps its metadata
    }
}
Note: copy the configuration file to the other host. The file must be identical on both hosts.
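DRBD picks the "on" section that matches the local node name, so a mismatch between the host name and the names in drbd.conf is a common reason for a resource failing to come up. The sketch below lists the names used in the file and checks one of them; both function names are hypothetical helpers for this article.

```shell
# List the host names used in "on <host> {" sections of a drbd.conf file.
# Options such as "on-io-error" are skipped because "on" must be followed
# by whitespace.
conf_on_hosts() {
    sed -n -E 's/^[[:space:]]*on[[:space:]]+([^[:space:]{]+)[[:space:]]*\{.*$/\1/p' "$1"
}

# Succeed if the node name ($2) has its own "on" section in the file ($1).
conf_knows_host() {
    conf_on_hosts "$1" | grep -qx -F "$2"
}
```

On each node, conf_knows_host /etc/drbd.conf "$(uname -n)" should succeed; if it does not, the host name set earlier and the configuration disagree.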
Starting DRBD: (dbm135, dbm134)
Create the DRBD device node:
[root@scj ~]# mknod /dev/drbd0 b 147 0
Create the metadata for resource r0:
[root@scj ~]# drbdadm create-md r0
md_offset 9663672320
al_offset 9663639552
bm_offset 9663344640

Found ext3 filesystem
     9437184 kB data area apparently used
     9436860 kB left usable by current configuration

Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
   * use external meta data (recommended)
   * shrink that filesystem first
   * zero out the device (destroy the filesystem)
Operation refused.

Command 'drbdmeta 0 v08 /dev/mapper/drbd-dbm internal create-md' terminated with exit code 40
drbdadm create-md r0: exited with code 40
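The output shows that the operation was refused because an existing ext3 file system was found on the backing device. To fix this, zero out the beginning of the device:
[root@scj ~]# dd if=/dev/zero of=/dev/mapper/drbd-dbm bs=1M count=128
128+0 records in
128+0 records out
134217728 bytes (134 MB) copied, 0.391768 s, 343 MB/s
# dd overwrites the file system information at the start of the device, which destroys the file system. Make sure any data on it has been backed up first (nothing needed backing up here).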
Try again:
[root@scj ~]# drbdadm create-md r0
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@scj ~]# drbdadm create-md r0
You want me to create a v08 style flexible-size internal meta data block.
There appears to be a v08 flexible-size internal meta data block
already in place on /dev/mapper/drbd-dbm at byte offset 9663672320
Do you really want to overwrite the existing v08 meta-data?
[need to type 'yes' to confirm] yes

Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
# Note: "successfully created" means the metadata is in place. The command was run twice here, but the second run only overwrites the metadata written by the first, so one successful run should be enough.
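Because zeroing the start of a device is destructive, it may be worth putting the dd step behind an explicit confirmation. The following is a minimal sketch under that assumption; zero_head and the CONFIRM variable are invented for this example and are not part of DRBD.

```shell
# Overwrite the first N MiB (default 128) of a target with zeros, but only
# when the caller has set CONFIRM=yes. This destroys any file system
# signature at the start of the target.
zero_head() {
    target=$1
    mib=${2:-128}
    if [ "${CONFIRM:-no}" != "yes" ]; then
        echo "refusing to wipe $target: set CONFIRM=yes to proceed" >&2
        return 1
    fi
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=notrunc 2>/dev/null
}
```

With this helper, CONFIRM=yes zero_head /dev/mapper/drbd-dbm performs the same wipe as the dd command above, while a call without CONFIRM=yes does nothing.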
Start DRBD: (dbm135, dbm134)
[root@scj ~]# /etc/init.d/drbd start
Starting DRBD resources: [ d(r0) s(r0) n(r0) ]outdated-wfc-timeout has to be shorter than degr-wfc-timeout
outdated-wfc-timeout implicitly set to degr-wfc-timeout (120s)
# [ d(r0) s(r0) n(r0) ] indicates success

## start DRBD automatically at boot
[root@scj ~]# chkconfig drbd on
Check the status:
[root@scj ~]# /etc/init.d/drbd status
drbd driver loaded OK; device status:
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
m:res  cs         ro                   ds                         p  mounted  fstype
0:r0   Connected  Secondary/Secondary  Inconsistent/Inconsistent  C
[root@scj ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:9436860
# Note: the connection is only established once DRBD has been started on both nodes
# ro: role information. Secondary/Secondary means both hosts are currently in the standby role
# ds: disk state. Inconsistent/Inconsistent means DRBD cannot yet tell which side is the primary and whose data should be treated as authoritative
# dw: disk writes
# dr: disk reads
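The cs, ro and ds fields described above are easy to pull out of /proc/drbd for use in scripts. The helper below is a sketch written for this article (drbd_field is not a DRBD command); it reads /proc/drbd style text on standard input and takes the field name and, optionally, the device minor number.

```shell
# Print the value of one "key:value" field (for example cs, ro or ds)
# from the status line of a DRBD device. $1 is the key, $2 the minor
# number (default 0). Input is /proc/drbd style text on stdin.
drbd_field() {
    awk -v key="$1" -v minor="${2:-0}" '
        $1 == (minor ":") {
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")
                if (n > 0 && substr($i, 1, n - 1) == key) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }'
}
```

For instance, drbd_field ro < /proc/drbd prints Secondary/Secondary in the state shown above, and drbd_field ds < /proc/drbd prints Inconsistent/Inconsistent.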
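Set the primary node: (dbm135)
dbm135 is made the primary node here:
[root@dbm135 ~]# drbdadm primary all
0: State change failed: (-2) Need access to UpToDate data
Command 'drbdsetup 0 primary' terminated with exit code 17
Promotion fails because neither side has UpToDate data yet. To fix this:
[root@dbm135 ~]# drbdadm -- --overwrite-data-of-peer primary all
# makes this node the primary and starts a full synchronization that overwrites the peer's data; intended for the initial sync only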
Check the status now:
[root@dbm135 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:1589248 nr:0 dw:0 dr:1589912 al:0 bm:97 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:7847612
        [==>.................] sync'ed: 16.9% (7660/9212)M
        finish: 0:03:04 speed: 42,532 (41,820) K/sec
# ro:Primary/Secondary ds:UpToDate/Inconsistent: dbm135 is now Primary, the other host is Secondary, and the data is still being synchronized (UpToDate/Inconsistent)
Wait a while and check again:
[root@dbm135 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:9436860 nr:0 dw:0 dr:9437524 al:0 bm:576 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
# ds:UpToDate/UpToDate: the synchronization has finished
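Instead of re-running cat /proc/drbd by hand, the wait for ds:UpToDate/UpToDate can be scripted. This is a minimal polling sketch; wait_for_sync and its default interval and retry count are assumptions made for illustration.

```shell
# Poll a status file until the disk states read UpToDate/UpToDate.
# $1: status file (default /proc/drbd)
# $2: seconds between polls (default 5)
# $3: maximum number of polls (default 120)
# Returns 0 once in sync, 1 if the polls are used up first.
wait_for_sync() {
    status_file=${1:-/proc/drbd}
    interval=${2:-5}
    tries=${3:-120}
    while [ "$tries" -gt 0 ]; do
        if grep -q 'ds:UpToDate/UpToDate' "$status_file"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```

A later step such as formatting the device can then be chained as wait_for_sync && echo "initial sync finished".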
Check the status on dbm134:
[root@dbm134 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:9436860 dw:9436860 dr:0 al:0 bm:576 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
# ro:Secondary/Primary ds:UpToDate/UpToDate: dbm134 is currently Secondary
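Format and mount the DRBD device: (dbm135)
Note: all reads and writes, including formatting and mounting the DRBD device, must be done on the Primary host, which is dbm135. Nothing can be done with the device on the Secondary host: it cannot be read, formatted or mounted there.
Confirm that this host is the Primary:
[root@dbm135 ~]# cat /proc/drbd | grep ro
version: 8.3.16 (api:88/proto:86-97)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
# this host is the Primary
# only run the formatting and mounting steps below when the host is the Primary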
Format the DRBD device:
[root@dbm135 ~]# mkfs.ext4 /dev/drbd0
## the formatting is replicated to the standby node through DRBD
## never format the DRBD device again on the standby node
Mount the DRBD device:
[root@dbm135 ~]# mkdir /data                 # create the data directory
[root@dbm135 ~]# mount /dev/drbd0 /data
[root@dbm135 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   19G  1.1G   17G   7% /
tmpfs                          58M     0   58M   0% /dev/shm
/dev/sda1                     477M   43M  409M  10% /boot
/dev/drbd0                    8.8G   21M  8.3G   1% /data
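Note:
The DRBD device cannot be used on the Secondary node at all, not even for reading, because it cannot be mounted there. All reads and writes happen on the Primary node. The Secondary can take over only after it has been promoted to Primary, which requires that the current Primary has either failed or been demoted first.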
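Since formatting and mounting are only valid on the Primary, scripts can guard those steps with a role check. The helper below is a small sketch for that purpose; local_is_primary is a made-up name, and it relies on the local role being the part before the slash in the ro: field.

```shell
# Succeed only if the local role (the value before the slash in "ro:")
# is Primary. The status file defaults to /proc/drbd.
local_is_primary() {
    grep -q 'ro:Primary/' "${1:-/proc/drbd}"
}
```

A guarded mount then reads local_is_primary && mount /dev/drbd0 /data and simply does nothing on the Secondary.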
Switching DRBD roles:
There are two ways to switch roles (both simulate a failure).
dbm135 is the Primary host. First create some test files on it:
[root@dbm135 ~]# cd /data/
[root@dbm135 data]# ll
total 16
drwx------ 2 root root 16384 Jun 29 08:13 lost+found
[root@dbm135 data]# touch file{1..5}
[root@dbm135 data]# ll
total 16
-rw-r--r-- 1 root root     0 Jun 29 10:25 file1
-rw-r--r-- 1 root root     0 Jun 29 10:25 file2
-rw-r--r-- 1 root root     0 Jun 29 10:25 file3
-rw-r--r-- 1 root root     0 Jun 29 10:25 file4
-rw-r--r-- 1 root root     0 Jun 29 10:25 file5
drwx------ 2 root root 16384 Jun 29 08:13 lost+found
Method 1:
dbm135
[root@dbm135 ~]# umount /dev/drbd0           # unmount /dev/drbd0
[root@dbm135 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   19G  1.1G   17G   7% /
tmpfs                          58M     0   58M   0% /dev/shm
/dev/sda1                     477M   43M  409M  10% /boot
[root@dbm135 ~]# drbdadm secondary r0        # demote dbm135 to Secondary
[root@dbm135 ~]# cat /proc/drbd | grep ro
version: 8.3.16 (api:88/proto:86-97)
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

dbm134
[root@dbm134 ~]# drbdadm primary r0          # promote dbm134 to Primary
[root@dbm134 ~]# cat /proc/drbd | grep ro
version: 8.3.16 (api:88/proto:86-97)
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
[root@dbm134 ~]# mount /dev/drbd0 /data/     # mount /dev/drbd0
### Note: do not format /dev/drbd0 again. Across both hosts the device is formatted exactly once, on the original Primary host
[root@dbm134 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   19G  1.1G   17G   6% /
tmpfs                          58M     0   58M   0% /dev/shm
/dev/sda1                     477M   43M  409M  10% /boot
/dev/drbd0                    8.8G   21M  8.3G   1% /data
[root@dbm134 ~]# cd /data/
[root@dbm134 data]# ll                       # check that the data was replicated
total 16
-rw-r--r-- 1 root root     0 Jun 29 10:25 file1
-rw-r--r-- 1 root root     0 Jun 29 10:25 file2
-rw-r--r-- 1 root root     0 Jun 29 10:25 file3
-rw-r--r-- 1 root root     0 Jun 29 10:25 file4
-rw-r--r-- 1 root root     0 Jun 29 10:25 file5
drwx------ 2 root root 16384 Jun 29 08:13 lost+found
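The order of the steps in method 1 matters: unmount and demote on the old Primary first, then promote and mount on the new one. The sketch below captures that order in two functions with a dry-run switch so the commands can be reviewed before they are executed. The function names and the DRY_RUN variable are inventions of this example.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the node giving up the Primary role.
# $1: resource name, $2: device or mount point to unmount
demote_node() {
    run umount "$2" &&
    run drbdadm secondary "$1"
}

# On the node taking over the Primary role.
# $1: resource name, $2: DRBD device, $3: mount point
promote_node() {
    run drbdadm primary "$1" &&
    run mount "$2" "$3"
}
```

With DRY_RUN=1 set, demote_node r0 /dev/drbd0 on dbm135 and promote_node r0 /dev/drbd0 /data on dbm134 print the same four commands used above; without it they are executed, and each function stops at the first command that fails.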
Method 2:
dbm135
[root@dbm135 ~]# /etc/init.d/drbd stop       # stop the drbd service on dbm135
Stopping all DRBD resources: .
[root@dbm135 ~]# df -h
Filesystem                    Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-lv_root   19G  1.1G   17G   7% /
tmpfs                          58M     0   58M   0% /dev/shm
/dev/sda1                     477M   43M  409M  10% /boot

dbm134
[root@dbm134 ~]# drbdadm -- --overwrite-data-of-peer primary all     # promote dbm134 to Primary; this flag declares dbm134's data authoritative
[root@dbm134 ~]# mount /dev/drbd0 /data/     # mount
[root@dbm134 ~]# cd /data/
[root@dbm134 data]# ll
total 16
-rw-r--r-- 1 root root     0 Jun 29 10:25 file1
-rw-r--r-- 1 root root     0 Jun 29 10:25 file2
-rw-r--r-- 1 root root     0 Jun 29 10:25 file3
-rw-r--r-- 1 root root     0 Jun 29 10:25 file4
-rw-r--r-- 1 root root     0 Jun 29 10:25 file5
drwx------ 2 root root 16384 Jun 29 08:13 lost+found
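Additional notes:
Automatic switching between the DRBD primary and secondary, and with it real high availability, requires Heartbeat.
When the DRBD primary goes down, Heartbeat automatically promotes the secondary to primary and mounts the /data partition.
High availability with Heartbeat will be covered in detail later.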
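Recovering from a DRBD split brain:
Note: at this point dbm135 is the Secondary.
Cause of the split brain:
It is usually caused by someone disturbing the heartbeat link so that the two nodes can no longer talk to each other, while the standby is promoted from Secondary to Primary in the meantime. When the network comes back, both machines have acted as Primary and DRBD cannot tell whose data is authoritative, which is the split brain.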
Symptoms of the split brain:
dbm135
[root@dbm135 data]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
 0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----
    ns:1021 nr:8 dw:8 dr:1021 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4
## cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown: the state of the other host is Unknown and cs is StandAlone

dbm134
[root@dbm134 data]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build32R6, 2014-11-24 14:49:06
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:4 dr:1345 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:4096
## ro:Primary/Unknown ds:UpToDate/DUnknown: the state of the other host is Unknown
Solution:
On the Secondary (dbm135)
[root@dbm135 ~]# drbdadm secondary r0
[root@dbm135 ~]# drbdadm disconnect all
[root@dbm135 ~]# drbdadm -- --discard-my-data connect r0
## this tells DRBD that the data on the Secondary is to be discarded and the Primary's data is authoritative

On the Primary (dbm134)
[root@dbm134 ~]# drbdadm disconnect all
[root@dbm134 ~]# drbdadm connect r0
[root@dbm134 ~]# drbdsetup /dev/drbd0 primary
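Because --discard-my-data throws away changes on the node where it is run, it is easy to do damage by running the recovery commands on the wrong side. The sketch below only prints the commands for each side so they can be reviewed first. The function name and the victim/survivor labels are made up for this example, the resource name is used where the steps above use "all", and the final promotion step with drbdsetup is left to the operator.

```shell
# Print the split-brain recovery commands for one side.
# $1: resource name
# $2: "victim" (its changes are discarded) or "survivor" (its data wins)
split_brain_plan() {
    res=$1
    role=$2
    case "$role" in
        victim)
            echo "drbdadm secondary $res"
            echo "drbdadm disconnect $res"
            echo "drbdadm -- --discard-my-data connect $res"
            ;;
        survivor)
            echo "drbdadm disconnect $res"
            echo "drbdadm connect $res"
            ;;
        *)
            echo "usage: split_brain_plan <resource> victim|survivor" >&2
            return 1
            ;;
    esac
}
```

In the scenario above, split_brain_plan r0 victim would be reviewed on dbm135 and split_brain_plan r0 survivor on dbm134.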