IV 12 MySQL+drbd+heartbeat
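One master with multiple slaves is the most common database architecture. It is simple to deploy and easy to maintain; read/write splitting can be implemented through a proxy or in the application, and several slaves behind LVS or haproxy share the read load, which removes the single point of failure for reads. The lone master, however, is still a single point of failure: if it goes down, writes stop. The simplest remedy is manual intervention: monitor the master, and when it fails an administrator promotes the semi-synchronous slave to master and repoints the other slaves at it. That works, but it is not acceptable where availability requirements are high.
Note:
Under normal conditions MySQL-M-active handles writes and MySQL-M-inactive is invisible to clients, while the MySQL slaves handle reads and can be placed behind a load balancer. Replication between master and slaves uses MySQL's own mechanism and goes through the VIP. The web servers split reads and writes either in application code or with open-source software such as mysql proxy or amoeba.
Note: dual-master hot-standby mode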
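1. Install and configure heartbeat
Environment:
VIP:10.96.20.8
master: eth0 (10.96.20.113), eth1 (172.16.1.113, no gateway or DNS configured), hostname (test-master)
backup: eth0 (10.96.20.114), eth1 (172.16.1.114, no gateway or DNS configured), hostname (test-backup)
Two NICs and two disks per host.
Note: eth0 carries the management IP; eth1 carries the heartbeat link and the drbd replication traffic. If heartbeat and data replication share one NIC in production, rate-limit the replication so that bandwidth is left for the heartbeat.
Note: keep labels consistent in VMware and Xshell. In a production environment every host should have an entry in /etc/hosts, which simplifies distribution, management and maintenance.
test-master (set the hostname in /etc/sysconfig/network so that it matches the output of uname -n, then configure /etc/hosts, mutual SSH trust between the two hosts, time synchronization, iptables and selinux):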
[root@test-master ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
[root@test-master ~]# uname -rm
2.6.32-431.el6.x86_64 x86_64
[root@test-master ~]# uname -n
test-master
[root@test-master ~]# ifconfig | grep eth0 -A 1
eth0 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:AC
inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0
[root@test-master ~]# ifconfig | grep eth1 -A 1
eth1 Link encap:Ethernet HWaddr 00:0C:29:1F:B6:B6
inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0
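[root@test-master ~]# route add -host 172.16.1.114 dev eth1 #(add a host route so that heartbeat traffic leaves through the designated NIC; this command can be appended to /etc/rc.local, or a static route can be configured instead: vim /etc/sysconfig/network-scripts/route-eth1 and add 172.16.1.114/24 via 172.16.1.113)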
[root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''
Generating public/private rsa key pair.
Your identification has been saved in ./.ssh/id_rsa.
Your public key has been saved in ./.ssh/id_rsa.pub.
The key fingerprint is:
29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:12 root@test-master
The key's randomart image is:
+--[ RSA 2048]----+
| E o.. |
| .+ + |
|.+.* . |
|oo* o. . |
|+o.. = S |
|+. o . + |
|o o . |
| . |
| |
+-----------------+
[root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup
The authenticity of host 'test-backup (10.96.20.114)' can't be established.
RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'test-backup' (RSA) to the list of known hosts.
root@test-backup's password:
Now try logging into the machine, with "ssh 'root@test-backup'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@test-master ~]# crontab -l
*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null
[root@test-master ~]# service crond restart
Stopping crond: [ OK ]
Starting crond: [ OK ]
[root@test-master ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@test-master ~]# rpm -ivh epel-release-6-8.noarch.rpm
warning: epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing... ########################################### [100%]
1:epel-release ########################################### [100%]
[root@test-master ~]# yum search heartbeat
……
heartbeat-devel.i686 : Heartbeat development package
heartbeat-devel.x86_64 : Heartbeat development package
heartbeat-libs.i686 : Heartbeat libraries
heartbeat-libs.x86_64 : Heartbeat libraries
heartbeat.x86_64 : Messaging and membership subsystem for High-Availability Linux
[root@test-master ~]# yum -y install heartbeat
[root@test-master ~]# chkconfig heartbeat off
[root@test-master ~]# chkconfig --list heartbeat
heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off
test-backup:
[root@test-backup ~]# uname -n
test-backup
[root@test-backup ~]# ifconfig | grep eth0 -A 1
eth0 Link encap:Ethernet HWaddr 00:0C:29:15:E6:BB
inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0
[root@test-backup ~]# ifconfig | grep eth1 -A 1
eth1 Link encap:Ethernet HWaddr 00:0C:29:15:E6:C5
inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0
[root@test-backup ~]# route add -host 172.16.1.113 dev eth1
[root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''
Generating public/private rsa key pair.
Your identification has been saved in ./.ssh/id_rsa.
Your public key has been saved in ./.ssh/id_rsa.pub.
The key fingerprint is:
08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:b8 root@test-backup
The key's randomart image is:
+--[ RSA 2048]----+
| . |
| = . |
| . = * |
| . . . .. + + |
|. + . ..SE . |
| o = . . |
|. . = . |
| o . . . |
|o .o... |
+-----------------+
[root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master
The authenticity of host 'test-master (10.96.20.113)' can't be established.
RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'test-master' (RSA) to the list of known hosts.
root@test-master's password:
Now try logging into the machine, with "ssh 'root@test-master'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@test-backup ~]# crontab -l
*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null
[root@test-backup ~]# service crond restart
Stopping crond: [ OK ]
Starting crond: [ OK ]
[root@test-backup ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@test-backup ~]# rpm -ivh epel-release-6-8.noarch.rpm
[root@test-backup ~]# yum -y install heartbeat
[root@test-backup ~]# chkconfig heartbeat off
[root@test-backup ~]# chkconfig --list heartbeat
heartbeat 0:off 1:off 2:off 3:off 4:off 5:off 6:off
test-master:
[root@test-master ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/
[root@test-master ~]# cd /etc/ha.d
[root@test-master ha.d]# ls
authkeys ha.cf harc haresources rc.d README.config resource.d shellfuncs
[root@test-master ha.d]# vim authkeys #(generate a random string with dd if=/dev/random count=1 bs=512 | md5sum and put it after sha1)
auth 1
1 sha1 912d6402295ac8d47109e56b177073b9
[root@test-master ha.d]# chmod 600 authkeys #(this file must have mode 600, otherwise the service reports an error on startup)
[root@test-master ha.d]# ll !$
ll authkeys
-rw-------. 1 root root 692 Aug 7 21:51 authkeys
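The authkeys step above can be scripted. The following is a minimal sketch, not the author's method: write_authkeys is a hypothetical helper name, and it reads from /dev/urandom instead of /dev/random (an assumption made so the command never blocks waiting for entropy). The target path is an argument so the function can be tried on a scratch file before touching /etc/ha.d/authkeys.

```shell
#!/bin/bash
# write_authkeys TARGET
# Hypothetical helper: writes a two-line heartbeat authkeys file that uses
# key index 1 with the sha1 method and a random 32-character hex key.
write_authkeys() {
    local target="$1"
    local key
    # Hash 512 random bytes down to 32 hex characters.
    # Assumption: /dev/urandom is used here instead of /dev/random.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{print $1}')
    # Create the file with a restrictive umask, then force mode 600,
    # because heartbeat reports an error for looser permissions.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$target" )
    chmod 600 "$target"
}

# Example (not executed here): write_authkeys /etc/ha.d/authkeys
```

The same file has to exist on both nodes, so it would still be copied to test-backup afterwards rather than generated twice.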
[root@test-master ha.d]# vim ha.cf
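debugfile /var/log/ha-debug #(debug log)
logfile /var/log/ha-log
logfacility local1 #(configure rsyslog to receive these logs through local1)
keepalive 2 #(heartbeat interval: one heartbeat is sent every 2s)
deadtime 30 #(if the backup node receives no heartbeat from the primary node for 30s, it takes over the peer's resources immediately)
warntime 10 #(heartbeat delay threshold of 10s: if the backup node receives no heartbeat from the primary node for 10s, a warning is written to the log, but no failover happens yet)
initdead 120 #(after heartbeat first starts, wait 120s before starting the primary node's resources; this gives the peer's heartbeat service time to come up, and the value should be at least twice deadtime)
udpport 694
#bcast eth0 #(send heartbeats as Ethernet broadcasts on eth0; to send heartbeats over two physical networks use bcast eth0 eth1)
mcast eth0 225.0.0.11 694 1 0 #(multicast settings; the multicast address must be unique within the LAN because several heartbeat clusters may be running; it is a class D address (224.0.0.0 to 239.255.255.255); the format is mcast dev mcast_group port ttl loop)
auto_failback on #(fail back to the primary node once it recovers)
node test-master #(primary node hostname, as printed by uname -n)
node test-backup #(backup node hostname)
crm no #(whether to enable the CRM)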
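The timing values in ha.cf depend on each other, so a small sanity check can catch a bad combination before the service is started. This is a sketch under stated assumptions: check_ha_timing is a hypothetical helper, only the "initdead at least twice deadtime" rule comes from the notes above, and the other two orderings (keepalive below warntime, warntime below deadtime) are assumed from what the parameters mean.

```shell
#!/bin/bash
# check_ha_timing KEEPALIVE WARNTIME DEADTIME INITDEAD   (whole seconds)
# Hypothetical helper: prints one message per violated rule and
# returns non-zero if any rule is violated.
check_ha_timing() {
    local keepalive="$1" warntime="$2" deadtime="$3" initdead="$4"
    local rc=0
    # Assumption: a warning only makes sense after at least one missed heartbeat.
    if [ "$keepalive" -ge "$warntime" ]; then
        echo "keepalive ($keepalive) should be smaller than warntime ($warntime)"
        rc=1
    fi
    # Assumption: the warning should be logged before failover is triggered.
    if [ "$warntime" -ge "$deadtime" ]; then
        echo "warntime ($warntime) should be smaller than deadtime ($deadtime)"
        rc=1
    fi
    # From the ha.cf notes: initdead should be at least twice deadtime.
    if [ "$initdead" -lt $((deadtime * 2)) ]; then
        echo "initdead ($initdead) should be at least twice deadtime ($deadtime)"
        rc=1
    fi
    return "$rc"
}

# Example with the values used in this ha.cf: check_ha_timing 2 10 30 120
```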
[root@test-master ha.d]# vim haresources
test-master IPaddr::10.96.20.8/24/eth0 #(equivalent to running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start; IPaddr is a script under /etc/ha.d/resource.d/)
[root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/
authkeys 100% 692 0.7KB/s 00:00
ha.cf 100% 10KB 10.3KB/s 00:00
haresources 100% 5944 5.8KB/s 00:00
[root@test-master ha.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ha.d]# ssh test-backup 'service heartbeat start'
Starting High-Availability services: 2016/08/07_22:39:00 INFO: Resource is stopped
Done.
[root@test-master ha.d]# ps aux | grep heartbeat
root 63089 0.0 3.1 50124 7164 ? SLs 22:38 0:00 heartbeat: master control process
root 63093 0.0 3.1 50076 7116 ? SL 22:38 0:00 heartbeat: FIFO reader
root 63094 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: write:mcast eth0
root 63095 0.0 3.1 50072 7112 ? SL 22:38 0:00 heartbeat: read:mcast eth0
root 63136 0.0 0.3 103264 836 pts/0 S+ 22:39 0:00 grep heartbeat
[root@test-master ha.d]# ssh test-backup 'ps aux | grep heartbeat'
root 3050 0.0 3.1 50124 7164 ? SLs 22:39 0:00 heartbeat: master control process
root 3054 0.0 3.1 50076 7116 ? SL 22:39 0:00 heartbeat: FIFO reader
root 3055 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: write:mcast eth0
root 3056 0.0 3.1 50072 7112 ? SL 22:39 0:00 heartbeat: read:mcast eth0
root 3094 0.0 0.5 106104 1368 ? Ss 22:39 0:00 bash -c ps aux | grep heartbeat
root 3108 0.0 0.3 103264 832 ? S 22:39 0:00 grep heartbeat
[root@test-master ha.d]# netstat -tnulp |grep heartbeat
udp 0 0 225.0.0.11:694 0.0.0.0:* 63094/heartbeat:wr
udp 0 0 0.0.0.0:50268 0.0.0.0:* 63094/heartbeat:wr
[root@test-master ha.d]# ssh test-backup 'netstat -tnulp | grep heartbeat'
udp 0 0 0.0.0.0:58019 0.0.0.0:* 3055/heartbeat:wri
udp 0 0 225.0.0.11:694 0.0.0.0:* 3055/heartbeat: wri
[root@test-master ha.d]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
[root@test-master ha.d]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test-master ha.d]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ha.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ha.d]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
[root@test-master ~]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test-master ~]# ssh test-backup 'service heartbeat stop'
Stopping High-Availability services: Done.
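The failover test above repeatedly greps `ip addr` output to see which node holds the VIP. A minimal sketch of that check as a reusable function follows; has_vip is a hypothetical name, and it reads `ip addr` output on stdin so it can be fed from a local command or from ssh.

```shell
#!/bin/bash
# has_vip VIP
# Hypothetical helper: reads `ip addr` output on stdin and succeeds
# if VIP is configured on some interface.
has_vip() {
    local vip="$1"
    # Match "inet <vip>/" as a fixed string so that 10.96.20.8 does not
    # also match 10.96.20.80 or similar addresses.
    grep -Fq "inet ${vip}/"
}

# Examples (not executed here):
#   ip addr | has_vip 10.96.20.8 && echo "VIP is on this node"
#   ssh test-backup 'ip addr' | has_vip 10.96.20.8 && echo "VIP is on test-backup"
```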
2. Install and configure drbd
test-master:
[root@test-master ~]# fdisk -l
……
Disk /dev/sdb: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000
[root@test-master ~]# parted /dev/sdb #(parted supports disks larger than 2T; split the new disk into two partitions, one for data and one for the drbd meta data)
GNU Parted 2.1
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) h
align-check TYPE N check partition N for TYPE(min|opt) alignment
check NUMBER do a simple check on the file system
cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER copy file system to another partition
help [COMMAND] print general help, or help on COMMAND
mklabel,mktable LABEL-TYPE create a new disklabel (partition table)
mkfs NUMBER FS-TYPE make a FS-TYPE file system on partition NUMBER
mkpart PART-TYPE [FS-TYPE] START END make a partition
mkpartfs PART-TYPE FS-TYPE START END make a partition with a file system
move NUMBER START END move partition NUMBER
name NUMBER NAME name partition NUMBER as NAME
print [devices|free|list,all|NUMBER] display the partition table, available devices, free space, all found partitions, or a particular partition
quit exit program
rescue START END rescue a lost partition near START and END
resize NUMBER START END resize partition NUMBER and its file system
rm NUMBER delete partition NUMBER
select DEVICE choose the device to edit
set NUMBER FLAG STATE change the FLAG on partition NUMBER
toggle [NUMBER [FLAG]] toggle the state of FLAG on partition NUMBER
unit UNIT set the default unit to UNIT
version display the version number and copyright information of GNU Parted
(parted) mklabel gpt
(parted) mkpart primary 0 1024
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? Ignore
(parted) mkpart primary 1025 2147
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? Ignore
(parted) p
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 2147MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 1024MB 1024MB primary
2 1025MB 2147MB 1122MB primary
[root@test-master ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@test-master ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
warning: elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
Preparing... ########################################### [100%]
1:elrepo-release ########################################### [100%]
[root@test-master ~]# yum -y install drbd kmod-drbd84
[root@test-master ~]# modprobe drbd
FATAL: Module drbd not found.
[root@test-master ~]# yum -y install kernel* #(reboot the system after updating the kernel)
[root@test-master ~]# uname -r
2.6.32-642.3.1.el6.x86_64
[root@test-master ~]# depmod
[root@test-master ~]# lsmod | grep drbd
drbd 372759 0
libcrc32c 1246 1 drbd
[root@test-master ~]# ll /usr/src/kernels/
total 12
drwxr-xr-x. 22 root root 4096 Mar 31 06:46 2.6.32-431.el6.x86_64
drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64
drwxr-xr-x. 22 root root 4096 Aug 8 03:40 2.6.32-642.3.1.el6.x86_64.debug
[root@test-master ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules
[root@test-master ~]# cat !$
cat /etc/sysconfig/modules/drbd.modules
modprobe drbd > /dev/null 2>&1
test-backup:
[root@test-backup ~]# parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary 0 4096
Warning: The resulting partition is not properly aligned for best performance.
Ignore/Cancel? Ignore
(parted) mkpart primary 4097 5368
(parted) p
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sdb: 5369MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Number Start End Size File system Name Flags
1 17.4kB 4096MB 4096MB primary
2 4097MB 5368MB 1271MB primary
[root@test-backup ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@test-backup ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm
[root@test-backup ~]# ll /etc/yum.repos.d/
total 20
-rw-r--r--. 1 root root 1856 Jul 19 00:28 CentOS6-Base-163.repo
-rw-r--r--. 1 root root 2150 Feb 9 2014 elrepo.repo
-rw-r--r--. 1 root root 957 Nov 4 2012 epel.repo
-rw-r--r--. 1 root root 1056 Nov 4 2012 epel-testing.repo
-rw-r--r--. 1 root root 529 Mar 30 23:00 rhel-source.repo.bak
[root@test-backup ~]# yum -y install drbd kmod-drbd84
[root@test-backup ~]# yum -y install kernel*
[root@test-backup ~]# depmod
[root@test-backup ~]# lsmod | grep drbd
drbd 372759 0
libcrc32c 1246 1 drbd
[root@test-backup ~]# chkconfig drbd off
[root@test-backup ~]# chkconfig --list drbd
drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@test-backup ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules
[root@test-backup ~]# cat !$
cat /etc/sysconfig/modules/drbd.modules
modprobe drbd > /dev/null 2>&1
test-master:
[root@test-master ~]# vim /etc/drbd.d/global_common.conf
[root@test-master ~]# egrep -v "#|^$" /etc/drbd.d/global_common.conf
global {
usage-count no;
}
common {
handlers {
}
startup {
}
options {
}
disk {
on-io-error detach;
}
net {
}
syncer {
rate 50M;
verify-alg crc32c;
}
}
[root@test-master ~]# vim /etc/drbd.d/data.res
resource data {
protocol C;
on test-master {
device /dev/drbd0;
disk /dev/sdb1;
address 172.16.1.113:7788;
meta-disk /dev/sdb2[0];
}
on test-backup {
device /dev/drbd0;
disk /dev/sdb1;
address 172.16.1.114:7788;
meta-disk /dev/sdb2[0];
}
}
[root@test-master ~]# cd /etc/drbd.d
[root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/
global_common.conf 100% 2144 2.1KB/s 00:00
data.res 100% 251 0.3KB/s 00:00
[root@test-master drbd.d]# drbdadm --help
USAGE: drbdadm COMMAND [OPTION...] {all|RESOURCE...}
GENERAL OPTIONS:
--stacked, -S
--dry-run, -d
--verbose, -v
--config-file=..., -c ...
--config-to-test=..., -t ...
--drbdsetup=..., -s ...
--drbdmeta=..., -m ...
--drbd-proxy-ctl=..., -p ...
--sh-varname=..., -n ...
--peer=..., -P ...
--version, -V
--setup-option=..., -W ...
--help, -h
COMMANDS:
attach disk-options
detach connect
net-options disconnect
up resource-options
down primary
secondary invalidate
invalidate-remote outdate
resize verify
pause-sync resume-sync
adjust adjust-with-progress
wait-connect wait-con-int
role cstate
dstate dump
dump-xml create-md
show-gi get-gi
dump-md wipe-md
apply-al hidden-commands
[root@test-master drbd.d]# drbdadm create-md data
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
[root@test-master drbd.d]# ssh test-backup 'drbdadm create-md data'
NOT initializing bitmap
initializing activity log
Writing meta data...
New drbd meta data block successfully created.
[root@test-master drbd.d]# drbdadm up data
[root@test-master drbd.d]# ssh test-backup 'drbdadm up data'
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984
[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984
[root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data #(run on the primary only)
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:660016
[=====>..............] sync'ed: 34.3% (660016/999984)K
finish: 0:00:15 speed: 42,496 (42,496) K/sec
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:369200
[===========>........] sync'ed: 63.3% (369200/999984)K
finish: 0:00:09 speed: 39,424 (39,424) K/sec
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:57904
[=================>..] sync'ed: 94.3% (57904/999984)K
finish: 0:00:01 speed: 39,196 (39,252) K/sec
[root@test-master drbd.d]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
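Rather than rerunning cat /proc/drbd by hand until the initial sync finishes, the status line can be parsed. The sketch below is illustrative only: drbd_state is a hypothetical helper, it handles minor 0 only (the single resource used here), and it assumes the 8.4-style status line shown above.

```shell
#!/bin/bash
# drbd_state
# Hypothetical helper: reads /proc/drbd-style text on stdin and prints
# "<connection state> <roles> <disk states>" for minor number 0.
drbd_state() {
    awk '
        $1 ~ /^0:/ {
            cs = ro = ds = ""
            for (i = 1; i <= NF; i++) {
                # The cs: field may be glued to the "0:" prefix, so strip
                # everything up to and including "cs:".
                if ($i ~ /cs:/)  { cs = $i; sub(/.*cs:/, "", cs) }
                if ($i ~ /^ro:/) { ro = substr($i, 4) }
                if ($i ~ /^ds:/) { ds = substr($i, 4) }
            }
            print cs, ro, ds
        }'
}

# Example (not executed here): drbd_state < /proc/drbd
```

A waiting loop could then compare the result against a target such as "Connected Primary/Secondary UpToDate/UpToDate" and sleep between attempts.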
[root@test-master drbd.d]# mkdir /drbd
[root@test-master drbd.d]# ssh test-backup 'mkdir /drbd'
[root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0 #(run on the primary only; do not format the meta-data partition)
Writing superblocks and filesystem accounting information: done
[root@test-master drbd.d]# tune2fs -c -1 /dev/drbd0
tune2fs 1.41.12 (17-May-2010)
Setting maximal mount count to -1
[root@test-master drbd.d]# mount /dev/drbd0 /drbd
[root@test-master drbd.d]# cd /drbd
[root@test-master drbd]# for i in `seq 1 10`; do touch test$i; done
[root@test-master drbd]# ls
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
[root@test-master drbd]# cd
[root@test-master ~]# umount /dev/drbd0
[root@test-master ~]# drbdadm secondary data
[root@test-master ~]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
test-backup:
[root@test-backup ~]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-backup ~]# drbdadm primary data
[root@test-backup ~]# cat /proc/drbd
version: 8.4.7-1 (api:1/proto:86-101)
GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@test-backup ~]# mount /dev/drbd0 /drbd
[root@test-backup ~]# ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
3. Debug heartbeat+drbd
[root@test-master ~]# ssh test-backup 'umount /drbd'
[root@test-master ~]# ssh test-backup 'drbdadm secondary data'
[root@test-master ~]# service drbd stop
Stopping all DRBD resources: .
[root@test-master ~]# ssh test-backup 'service drbd stop'
Stopping all DRBD resources: .
[root@test-master ~]# service heartbeat status
heartbeat is stopped. No process
[root@test-master ~]# ssh test-backup 'service heartbeat status'
heartbeat is stopped. No process
[root@test-master ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}
-rwxr-xr-x. 1 root root 3162 Jan 12 2016 /etc/ha.d/resource.d/drbddisk
-rwxr-xr-x. 1 root root 1903 Dec 2 2013 /etc/ha.d/resource.d/Filesystem
[root@test-master ~]# vim /etc/ha.d/haresources #(each entry on this line is equivalent to running a script with arguments, for example /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop, /etc/ha.d/resource.d/drbddisk data start|stop, /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop; heartbeat controls the resources in the order they are configured, so if heartbeat misbehaves, check the logs and run these commands one at a time to isolate the problem)
test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4
[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/
haresources 100% 5996 5.9KB/s 00:00
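To troubleshoot by running the resource scripts one at a time, as suggested above, it helps to see the calls that a haresources line stands for. The sketch below is a hypothetical helper (expand_haresources), not part of heartbeat. It prints script names without a directory, since the scripts used here live under /etc/ha.d/resource.d/. It also assumes that resources are stopped in the reverse of their start order, which is my understanding of heartbeat's behaviour and is not stated in the notes above.

```shell
#!/bin/bash
# expand_haresources ACTION
# Hypothetical helper: reads one haresources line on stdin
# ("<node> <res1> <res2> ...") and prints one "script args ACTION" line
# per resource. For ACTION=stop the order is reversed (assumption).
expand_haresources() {
    local action="$1" node rest res cmd out=""
    read -r node rest
    for res in $rest; do
        # "name::arg1::arg2" becomes "name arg1 arg2"
        cmd="$(printf '%s' "$res" | sed 's/::/ /g') $action"
        if [ "$action" = "stop" ]; then
            out="$cmd
$out"
        else
            out="$out$cmd
"
        fi
    done
    printf '%s' "$out"
}

# Example (not executed here):
#   grep '^test-master' /etc/ha.d/haresources | expand_haresources start
```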
[root@test-master ~]# service drbd start #(run on the primary node)
Starting DRBD resources: [
create res: data
prepare disk: data
adjust disk: data
adjust net: data
]
..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
- If this node was already a degraded cluster before the
reboot, the timeout is 0 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot, the timeout
is 0 seconds. [wfc-timeout]
(These values are for resource 'data'; 0 sec -> wait forever)
To abort waiting enter 'yes' [ 23]:
[root@test-backup ~]# service drbd start #(run on the backup node)
Starting DRBD resources: [
create res: data
prepare disk: data
adjust disk: data
adjust net: data
]
.
[root@test-master ~]# drbdadm role data
Secondary/Secondary
[root@test-master ~]# ssh test-backup 'drbdadm role data'
Secondary/Secondary
[root@test-master ~]# drbdadm -- --overwrite-data-of-peer primary data
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ~]# ssh test-backup 'service heartbeat start'
Starting High-Availability services: 2016/08/09_03:08:11 INFO: Resource is stopped
Done.
[root@test-master ~]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 6.3G 11G 38% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-master ~]# ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
[root@test-master ~]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test-master ~]# ssh test-backup 'ip addr | grep 10.96.20'
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ~]# ssh test-backup 'df -h'
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 3.9G 13G 24% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-master ~]# ssh test-backup 'ls /drbd'
lost+found
test1
test10
test2
test3
test4
test5
test6
test7
test8
test9
[root@test-master ~]# drbdadm role data
Secondary/Primary
[root@test-master ~]# service heartbeat start #(after the primary node recovers, first make sure drbd is back in a healthy, consistent state, and only then start the heartbeat service)
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 6.3G 11G 38% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 1.3M 896M 1% /drbd
[root@test-master ~]# ls /drbd
lost+found test1 test10 test2 test3 test4 test5 test6 test7 test8 test9
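The advice to get drbd healthy before restarting heartbeat on a recovered primary can be turned into a guard. This is a sketch under assumptions: drbd_ready is a hypothetical helper, "healthy" is taken to mean connection state Connected and disk states UpToDate/UpToDate, and the DRBDADM variable exists only so the function can be exercised without a real drbd. It relies on the cstate and dstate subcommands listed in the drbdadm help output earlier.

```shell
#!/bin/bash
# Command used to query drbd; can be overridden for dry runs (assumption:
# the real command is on PATH as drbdadm).
DRBDADM="${DRBDADM:-drbdadm}"

# drbd_ready RESOURCE
# Hypothetical helper: succeeds only if the resource is connected to its
# peer and both disks are UpToDate.
drbd_ready() {
    local res="$1" cstate dstate
    cstate=$("$DRBDADM" cstate "$res") || return 1
    dstate=$("$DRBDADM" dstate "$res") || return 1
    [ "$cstate" = "Connected" ] && [ "$dstate" = "UpToDate/UpToDate" ]
}

# Example (not executed here):
#   drbd_ready data && service heartbeat start
```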
4. Install and configure MySQL on the two masters and the one slave
MySQL-master-active:
[root@test-master ~]# drbdadm role data
Primary/Secondary
[root@test-master ~]# groupadd -g 3306 mysql
[root@test-master ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql
[root@test-master ~]# id mysql
uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)
[root@test-master ~]# mkdir /drbd/data #(both masters keep the database data directory under the drbd mount point; drbd replicates only the MySQL data, while the program files live under /usr/local/)
[root@test-master ~]# chown -R mysql.mysql /drbd/data
[root@test-master ~]# rz #(upload the MySQL binary package)
[root@test-master ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local
[root@test-master ~]# cd /usr/local
[root@test-master local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql
`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'
[root@test-master local]# cd mysql
[root@test-master mysql]# chown -R root.mysql ./
[root@test-master mysql]# scripts/mysql_install_db --user=mysql --datadir=/drbd/data #(initialize only on the master node that is currently serving clients, i.e. the drbd primary)
Installing MySQL system tables...
160810 19:46:23 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 3908 ...
OK
…….
[root@test-master mysql]# cp support-files/my-large.cnf /etc/my.cnf
[root@test-master mysql]# vim /etc/my.cnf #(add the following entries)
[mysqld]
datadir = /drbd/data
log-bin=mysql-bin
log-bin-index=mysql-bin.index
server-id=1
sync_binlog=1
innodb_file_per_table = 1
binlog_format=mixed
[root@test-master mysql]# egrep -v "#|^$" /etc/my.cnf
[client]
port = 3306
socket = /tmp/mysql.sock
[mysqld]
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 256M
max_allowed_packet = 1M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 16M
thread_concurrency = 8
datadir = /drbd/data
log-bin=mysql-bin
log-bin-index=mysql-bin.index
server-id=1
sync_binlog=1
innodb_file_per_table = 1
binlog_format=mixed
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout
[root@test-master mysql]# scp /etc/my.cnf root@test-backup:/etc/
my.cnf 100% 4787 4.7KB/s 00:00
[root@test-master mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@test-master mysql]# chkconfig --add mysqld
[root@test-master mysql]# chkconfig mysqld off
[root@test-master mysql]# chkconfig --list mysqld
mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@test-master mysql]# service mysqld start
Starting MySQL..... [ OK ]
[root@test-master mysql]# /usr/local/mysql/bin/mysql
……
mysql> GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'redhat';
Query OK, 0 rows affected (0.28 sec)
mysql> GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'%' IDENTIFIED BY 'repluser';
Query OK, 0 rows affected (0.17 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.04 sec)
mysql> select User,Password,Host from mysql.user;
mysql> select User,Host,Password from mysql.user;
+----------+-------------+-------------------------------------------+
| User | Host | Password |
+----------+-------------+-------------------------------------------+
| root | localhost | |
| root | test-master | |
| root | 127.0.0.1 | |
| root | ::1 | |
| | localhost | |
| | test-master | |
| root | % | *84BB5DF4823DA319BBF86C99624479A198E6EEE9 |
| repluser | % | *89A63F9688240669B54B5C2649EEFB795850597E |
+----------+-------------+-------------------------------------------+
8 rows in set (0.23 sec)
mysql> create database webgame;
Query OK, 1 row affected (0.10 sec)
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
| webgame |
+--------------------+
5 rows in set (0.04 sec)
mysql> \q
Bye
[root@test-master mysql]# ip addr | grep 10.96.20
inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
[root@test-master mysql]# df -h | grep drbd0
/dev/drbd0 946M 31M 866M 4% /drbd
[root@test-master ~]# vim /etc/ha.d/haresources
test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld
[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/
MySQL-master-inactive:
[root@test-backup ~]# drbdadm role data
Secondary/Primary
[root@test-backup ~]# groupadd -g 3306 mysql
[root@test-backup ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql
[root@test-backup ~]# id mysql
uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)
[root@test-backup ~]# rz
[root@test-backup ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local
[root@test-backup ~]# cd /usr/local
[root@test-backup local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql
`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'
[root@test-backup local]# cd mysql
[root@test-backup mysql]# chown -R root.mysql ./
[root@test-backup mysql]# vim /etc/my.cnf #(this file was copied from the active master; confirm that it contains the following settings)
[mysqld]
datadir = /drbd/data
log-bin=mysql-bin
log-bin-index=mysql-bin.index
server-id=1
sync_binlog=1
innodb_file_per_table = 1
binlog_format=mixed
[root@test-backup mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@test-backup mysql]# chkconfig --add mysqld
[root@test-backup mysql]# chkconfig mysqld off
[root@test-backup mysql]# chkconfig --list mysqld
mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
mysql-slave:
[root@localhost ~]# mkdir /mydata/data -pv
mkdir: created directory `/mydata'
mkdir: created directory `/mydata/data'
[root@localhost ~]# groupadd -g 3306 mysql
[root@localhost ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql
[root@localhost ~]# id mysql
uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)
[root@localhost ~]# rz
[root@localhost ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local
[root@localhost ~]# cd /usr/local
[root@localhost local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql
`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'
[root@localhost local]# cd mysql
[root@localhost mysql]# chown -R root.mysql ./
[root@localhost mysql]# chown -R mysql.mysql /mydata/data
[root@localhost mysql]# cp support-files/my-large.cnf /etc/my.cnf
cp: overwrite `/etc/my.cnf'? y
[root@localhost mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@localhost mysql]# chkconfig --add mysqld
[root@localhost mysql]# chkconfig --list mysqld
mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@localhost mysql]# vim /etc/my.cnf
[mysqld]
datadir=/mydata/data
innodb_file_per_table=1
relay-log=relay-log
relay-log-index=relay-log.index
server-id=11
read_only=1
skip_slave_start=1
[root@localhost mysql]# egrep -v "#|^$" /etc/my.cnf
[client]
port =3306
socket =/tmp/mysql.sock
[mysqld]
port =3306
socket =/tmp/mysql.sock
skip-external-locking
key_buffer_size = 256M
max_allowed_packet = 1M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 16M
thread_concurrency = 8
datadir=/mydata/data
innodb_file_per_table=1
relay-log=relay-log
relay-log-index=relay-log.index
server-id=11
read_only=1
skip_slave_start=1
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout
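Before initializing the slave it is worth checking the replication-related settings in /etc/my.cnf. A minimal sketch using Python's standard `configparser`; the function names and the choice of checks are assumptions for illustration, not an official tool:

```python
import configparser


def load_mysqld_section(text):
    """Parse my.cnf text and return the [mysqld] options with '-' normalized to '_'."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # bare flags such as skip-external-locking
        strict=False,          # tolerate repeated options
        interpolation=None,    # treat '%' literally
    )
    parser.read_string(text)
    return {key.replace("-", "_"): value for key, value in parser["mysqld"].items()}


def check_slave(opts, master_server_id):
    """Return a list of problems found in a slave's [mysqld] options."""
    problems = []
    if opts.get("server_id") in (None, str(master_server_id)):
        problems.append("server-id must be set and differ from the master's")
    if opts.get("read_only") != "1":
        problems.append("read_only should be 1 on a slave")
    if "relay_log" not in opts:
        problems.append("relay-log is not set")
    return problems


sample = """
[client]
port =3306
[mysqld]
skip-external-locking
datadir=/mydata/data
relay-log=relay-log
relay-log-index=relay-log.index
server-id=11
read_only=1
skip_slave_start=1
"""
opts = load_mysqld_section(sample)
print(check_slave(opts, master_server_id=1))
```

The master pair uses server-id=1 on both nodes (only one of them runs mysqld at a time), so the slave's server-id=11 passes the check.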
[root@localhost mysql]# scripts/mysql_install_db --user=mysql --datadir=/mydata/data
Installing MySQL system tables...
160810 22:18:18 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.
160810 22:18:18 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46873 ...
OK
Filling help tables...
160810 22:18:19 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.
160810 22:18:19 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46880 ...
OK
……
[root@localhost mysql]# service mysqld start
Starting MySQL.. [ OK ]
[root@localhost ~]# mysql
mysql> CHANGE MASTER TO MASTER_USER='repluser',MASTER_PASSWORD='repluser',MASTER_HOST='10.96.20.8',MASTER_LOG_FILE='mysql-bin.000003',MASTER_LOG_POS=330;
Query OK, 0 rows affected (0.04 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
……
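Note that MASTER_HOST in the CHANGE MASTER TO statement is the VIP (10.96.20.8), not the address of either master node. The slave therefore follows whichever node currently holds the VIP, and since the binlogs live in the datadir on the drbd device, the binlog coordinates stay valid after a takeover. A small Python sketch that builds the statement from its parts; the helper name is made up for illustration:

```python
def change_master_sql(host, user, password, log_file, log_pos):
    """Build a CHANGE MASTER TO statement with single-quoted, escaped strings."""
    def quote(value):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST={quote(host)}, "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)}, "
        f"MASTER_LOG_FILE={quote(log_file)}, "
        f"MASTER_LOG_POS={int(log_pos)};"
    )


# Values taken from the session above; the host is the heartbeat-managed VIP
sql = change_master_sql("10.96.20.8", "repluser", "repluser", "mysql-bin.000003", 330)
print(sql)
```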
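Testing is done in two steps:
First, check that the two master nodes can hand over to each other by hand: bring up drbd and start its service, but leave heartbeat stopped; start mysqld manually and create a new database on master-active; then stop mysqld, unmount /drbd and demote drbd on the active node to secondary; promote drbd on the inactive node to primary, mount it, start mysqld there, and check on master-inactive that the new database is present;
Second, check that master-slave replication carries on after the masters switch over; as the output below shows, it does.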
[root@test-backup ~]# tail -f /var/log/ha-log #(simulate a failure of the active node and watch the takeover on the inactive node)
Aug 10 22:40:38 test-backup heartbeat:[7738]: info: Local status now set to: 'up'
Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Link test-master:eth0 up.
Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Status update for node test-master: status active
Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Comm_now_up(): updating status to active
Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Local status now set to: 'active'
harc(default)[7747]: 2016/08/10_22:40:39 info: Running /etc/ha.d//rc.d/status status
Aug 10 22:40:50 test-backup heartbeat:[7738]: info: local resource transition completed.
Aug 10 22:40:50 test-backup heartbeat:[7738]: info: Initial resource acquisition complete (T_RESOURCES(us))
Aug 10 22:40:50 test-backup heartbeat:[7766]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.
Aug 10 22:40:50 test-backup heartbeat:[7738]: info: remote resource transition completed.
Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Received shutdown notice from 'test-master'.
Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Resources being acquired from test-master.
Aug 10 23:10:16 test-backup heartbeat:[7879]: info: acquire local HA resources (standby).
Aug 10 23:10:16 test-backup heartbeat:[7879]: info: local HA resource acquisition completed (standby).
Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Standby resource acquisition done [all].
Aug 10 23:10:16 test-backup heartbeat:[7880]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.
harc(default)[7905]: 2016/08/10_23:10:16 info: Running /etc/ha.d//rc.d/status status
mach_down(default)[7922]: 2016/08/10_23:10:16 info: Taking over resource group IPaddr::10.96.20.8/24/eth0
ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Acquiring resource group: test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[7977]: 2016/08/10_23:10:16 INFO: Resource is stopped
ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start
IPaddr(IPaddr_10.96.20.8)[8102]: 2016/08/10_23:10:16 INFO: Adding inet address 10.96.20.8/24 with broadcast address 10.96.20.255 to device eth0
IPaddr(IPaddr_10.96.20.8)[8102]: 2016/08/10_23:10:16 INFO: Bringing device eth0 up
IPaddr(IPaddr_10.96.20.8)[8102]: 2016/08/10_23:10:16 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.96.20.8 eth0 10.96.20.8 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[8076]: 2016/08/10_23:10:16 INFO: Success
ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/drbddisk data start
/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8231]: 2016/08/10_23:10:17 INFO: Resource is stopped
ResourceManager(default)[7949]: 2016/08/10_23:10:17 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start
Filesystem(Filesystem_/dev/drbd0)[8314]: 2016/08/10_23:10:17 INFO: Running start for /dev/drbd0 on /drbd
/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8306]: 2016/08/10_23:10:17 INFO: Success
ResourceManager(default)[7949]: 2016/08/10_23:10:18 info: Running /etc/init.d/mysqld start
mach_down(default)[7922]: 2016/08/10_23:10:31 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Aug 10 23:10:32 test-backup heartbeat:[7738]: info: mach_down takeover complete.
mach_down(default)[7922]: 2016/08/10_23:10:33 info: mach_down takeover complete for node test-master.
^C
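The takeover log is easier to read once it is reduced to two facts: the order in which resources were started, and how long the takeover took. A rough Python sketch under the assumption that the ha-log lines look like the ones above; the regular expressions and the function name are illustrative only:

```python
import re
from datetime import datetime

FMT = "%Y/%m/%d_%H:%M:%S"
RUNNING = re.compile(r"(\d{4}/\d\d/\d\d_\d\d:\d\d:\d\d) info: Running\s*(\S+)(?: .*)? start$")
DONE = re.compile(r"(\d{4}/\d\d/\d\d_\d\d:\d\d:\d\d) info: mach_down takeover\s*complete")


def summarize_takeover(log_text):
    """Return (resource scripts in start order, seconds from first start to completion)."""
    started, first, done = [], None, None
    for line in log_text.splitlines():
        line = line.strip()
        match = RUNNING.search(line)
        if match:
            first = first or datetime.strptime(match.group(1), FMT)
            started.append(match.group(2).rsplit("/", 1)[-1])  # keep the script name only
            continue
        match = DONE.search(line)
        if match:
            done = datetime.strptime(match.group(1), FMT)
    seconds = (done - first).total_seconds() if first and done else None
    return started, seconds


sample_log = """
ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start
ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/drbddisk data start
ResourceManager(default)[7949]: 2016/08/10_23:10:17 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start
ResourceManager(default)[7949]: 2016/08/10_23:10:18 info: Running /etc/init.d/mysqld start
mach_down(default)[7922]: 2016/08/10_23:10:33 info: mach_down takeover complete for node test-master.
"""
order, seconds = summarize_takeover(sample_log)
print(order, seconds)
```

For this run the start order matches the haresources line, and the takeover took 17 seconds from the first resource start to completion, most of it spent waiting for mysqld to come up.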
[root@test-backup ~]# ip addr
……
2: eth0:
link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff
inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0
inet6 fe80::20c:29ff:fe15:e6bb/64 scope link
valid_lft forever preferred_lft forever
……
[root@test-backup ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 18G 4.7G 12G 29% /
tmpfs 112M 0 112M 0% /dev/shm
/dev/sda1 283M 83M 185M 31% /boot
/dev/sr0 3.6G 3.6G 0 100% /mnt/cdrom
/dev/drbd0 946M 31M 866M 4% /drbd
[root@test-backup ~]# service mysqld status
MySQL running (8772) [ OK ]
[root@localhost ~]# mysql #(on the slave, check that master-slave replication is still working)
Welcome to the MySQL monitor. Commands end with ; or \g.
……
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.96.20.8
Master_User: repluser
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000005
Read_Master_Log_Pos: 198
Relay_Log_File: relay-log.000004
Relay_Log_Pos: 344
Relay_Master_Log_File: mysql-bin.000005
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
……
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
| webgame1 |
| webgame2 |
| webgame3 |
+--------------------+
7 rows in set (0.00 sec)
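The check above boils down to two fields of `show slave status\G`: both Slave_IO_Running and Slave_SQL_Running must be Yes. A minimal Python sketch that parses the \G-style output and applies that test; the function names are made up for illustration:

```python
def parse_slave_status(text):
    """Turn `SHOW SLAVE STATUS\\G` output into a dict of field -> value."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and not key.startswith("*"):   # skip the "1. row" banner
            status[key.strip()] = value.strip()
    return status


def replication_ok(status):
    """Replication is healthy only when both slave threads are running."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes")


sample = """
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.96.20.8
Master_Log_File: mysql-bin.000005
Read_Master_Log_Pos: 198
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
"""
status = parse_slave_status(sample)
print(status["Master_Host"], replication_ok(status))
```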
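Common MySQL master-slave replication architectures:
1. One master, one slave
Note: the HA software (keepalived or heartbeat) only has to handle VIP failover;
this scheme is simple to deploy and easy to maintain;
when the master fails, traffic can switch to the slave automatically;
both reads and writes depend on the master, so it carries a heavy load and is prone to lock contention and deadlocks;
the slave can also serve reads, but that has to be implemented in application code;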
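2. One master, multiple slaves
Note: the HA software (keepalived or heartbeat) can be limited to VIP failover;
when the master fails, traffic can switch to slave1 automatically, but slave2 may then be unable to resume replication from slave1 on its own; the remedy is to use semi-sync replication;
read/write splitting is supported, with the master taking writes and the slaves taking reads, but it has to be implemented in application code;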
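Read/write splitting in application code can be as simple as keeping two connection targets: the master (through the VIP) for writes and a rotating pool of slaves for reads. A minimal Python sketch; the class name and the slave addresses are hypothetical, only the VIP 10.96.20.8 comes from this setup:

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads over the slaves round-robin."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def for_write(self):
        return self.master

    def for_read(self):
        # With no slaves configured, reads fall back to the master
        return next(self._slaves) if self._slaves else self.master


# 10.96.20.8 is the VIP; the two slave addresses are made up for the example
router = ReadWriteRouter("10.96.20.8", ["10.96.20.115", "10.96.20.116"])
reads = [router.for_read() for _ in range(3)]
print(router.for_write(), reads)
```

Replication is asynchronous, so a read sent to a slave right after a write may not see that write yet; code that needs read-your-writes should read from the master.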
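3. Dual master
Note: HA software: keepalived+LVS, or MMM;
once the two masters replicate to each other they can be load-balanced, and losing either one does not affect the business;
dual master has a serious drawback: it raises the probability of data inconsistency;
dual master brings little performance gain; it is a complex scheme without much benefit and is not recommended;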
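4. Dual master, multiple slaves:
Note: HA software: MMM or keepalived;
if one master fails, the business is unaffected;
writing to both masters is possible, but it raises the probability of data inconsistency;
so write to only one master at any given time;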
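5. Cascading replication
Note: the HA software (keepalived or heartbeat) can be limited to VIP failover;
when master fails, traffic switches to master2, which keeps replicating to slave{1,2};
slave{1,2} support read/write splitting, but it has to be implemented in application code;
the slaves replicate through a cascade, so they may lag, and if master2 fails, replication to the slaves stops;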
6. Dual master on drbd
Note: while passive-server is the standby node, it is not visible to clients
7.
-------------------------------------------------------------------------
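Note: the HA software heartbeat handles VIP failover and also manages drbd and the mysqld service;
if master fails, service switches to backup automatically, and slave{1,2} can still replicate from backup;
slave{1,2} support read/write splitting, but it has to be implemented in application code;
this scheme also supports semi-sync replication;
backup is reachable only after it has been promoted to master; in normal operation only one of master and backup serves clients;
8. HA based on SAN storage, commonly used with Oracle and SQL Server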
-----------------------------------------------------------------------
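Note: HA software: Red Hat Cluster Suite;
the business depends on SAN storage;
Backup is reachable only after Master has failed and Backup has taken over successfully;
slave{1,2} support read/write splitting;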
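9.
Note: flexible to deploy, with high resource utilization;
the two masters take writes and the slaves take reads;
the business depends on DNS, which handles long-lived connections poorly;
a master failure affects its slaves;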
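10.
Note: available software: mysql-proxy, amoeba;
read/write splitting is transparent to the front-end business, with health checks on the back end;
the open-source options are currently not stable;
a custom DB proxy has to be developed;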
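For read/write splitting to be transparent, the proxy itself has to decide whether each statement may go to a slave. A deliberately simplified Python sketch of that decision, based only on the leading keyword; this is an assumption about how such a check could look, not how mysql-proxy or amoeba implement it:

```python
READ_KEYWORDS = {"SELECT", "SHOW", "DESCRIBE", "DESC", "EXPLAIN"}
LOCKING_SUFFIXES = ("FOR UPDATE", "LOCK IN SHARE MODE")


def is_read_only(sql):
    """Return True if the statement looks safe to route to a slave."""
    text = " ".join(sql.strip().rstrip(";").upper().split())  # normalize whitespace and case
    if not text:
        return False
    first_keyword = text.split(" ", 1)[0]
    if first_keyword not in READ_KEYWORDS:
        return False
    # Locking reads must run on the master
    return not text.endswith(LOCKING_SUFFIXES)


for statement in ("select * from webgame1.t", "INSERT INTO t VALUES (1)",
                  "select * from t where id = 1 for update;"):
    print(statement, "->", "slave" if is_read_only(statement) else "master")
```

A real proxy also has to track transactions and session state (for example, every statement inside an open transaction should stay on the master), which is part of why the notes above call for a custom DB proxy.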
11. HA for a distributed database cluster
Note: DAL, data access layer;
12.
Note: HA based on Galera;
Galera is a cluster system that implements multi-master, synchronous replication on top of MySQL InnoDB; features: true multi-master; read & write to any node; synchronous replication; no slave lag or integrity issues; no master-slave failover, no VIP; multi-threaded slave; automatic node provisioning;
13. The official MySQL Cluster HA solution
Note:
Criteria for choosing a MySQL HA architecture:
Solution                       | Availability | Safety | Write performance
MySQL replication              | 98%--99.9+%  | No     | Fair
master-master with MMM manager | 99%          | No     | Fair
heartbeat/SAN                  | 99.5%--99.9% | Yes    | Excellent
heartbeat/drbd                 | 99.9%        | Yes    | Good
NDB cluster                    | 99.999%      | Yes    | Excellent
Note: NDB cluster sets a very high bar: specific NDB knowledge, strong MySQL skills and strong sysadmin skills
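The availability percentages are easier to compare once they are converted into allowed downtime per year. A small worked example in Python, assuming a 365-day year:

```python
HOURS_PER_YEAR = 365 * 24  # 8760


def downtime_hours_per_year(availability_percent):
    """Hours of downtime per year implied by an availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for label, percent in [("MySQL replication (low end)", 98.0),
                       ("heartbeat/SAN (low end)", 99.5),
                       ("heartbeat/drbd", 99.9),
                       ("NDB cluster", 99.999)]:
    hours = downtime_hours_per_year(percent)
    print(f"{label}: {percent}% -> {hours:.2f} h/year ({hours * 60:.1f} min)")
```

So 98% allows about 175 hours of downtime a year, 99.9% about 8.76 hours, and 99.999% only a little over 5 minutes.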
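Current problems with MySQL:
single-node performance (QPS for reads and writes, response time, data volume; IOPS is the bottleneck for both read and write operations);
master-slave data consistency (asynchronous replication, semi-sync replication, ordering + completeness);
automated scaling (data migration; scaling out by a fixed rule (hash modulo, range, date, combinations, etc.; horizontal and vertical splitting); capacity estimation and early warning (per-table capacity estimated from business assessment; buffer pool size and hit ratio; disk capacity); automated full + incremental scale-out (promote a slave to the new master; automatic or manual; notify the proxy layer once scaling completes so it stays transparent to the front end));
single point of failure at the master (master/standby strategy (the standby only replicates and serves no online queries); data backfill (pull binlog files from the master to fill in missing data); single-point switchover (when the master goes down, switch to a new master while keeping data as consistent as the business requires); notify the proxy layer to switch to the new master transparently);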
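Two of the splitting rules named above, hash modulo and range, can be sketched in a few lines of Python. The function names and shard boundaries are hypothetical; `zlib.crc32` is used only because it gives the same value on every run, unlike Python's built-in `hash()` for strings:

```python
import bisect
import zlib


def shard_by_hash(key, shard_count):
    """Hash-modulo rule: stable hash of the key, modulo the number of shards."""
    return zlib.crc32(str(key).encode("utf-8")) % shard_count


def shard_by_range(record_id, upper_bounds):
    """Range rule: shard i holds ids below upper_bounds[i] (and at or above the previous bound)."""
    index = bisect.bisect_right(upper_bounds, record_id)
    if index >= len(upper_bounds):
        raise ValueError(f"id {record_id} is beyond the last shard; time to scale out")
    return index


bounds = [1000, 2000, 3000]  # shard 0: id < 1000, shard 1: 1000..1999, shard 2: 2000..2999
print(shard_by_hash("player:42", 4), shard_by_range(1500, bounds))
```

Hash modulo spreads load evenly but forces data to move when the shard count changes, while range sharding scales out by appending a new range at the cost of uneven load; this is why the notes list data migration alongside the splitting rules.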
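Distributed database:
1. Product positioning (keep database features as far as possible while increasing data scale; low-latency online access; support data operations with reasonably complex relationships);
2. Design principles (implement the MySQL client communication protocol; logical data distribution is transparent to applications; automatic discovery / manual decision / automatic handling; support single-node transactions);
3. Design targets (stored data on the scale of hundreds of billions; response time under 10 ms; fully transparent to upper-layer applications);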
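Distributed database proxy layer (implements the MySQL client protocol; rw splitting; LB, e.g. weighted round-robin across slaves; merging of query results; data sharding rules; concurrency control; SQL whitelist management; single-node transaction support (amoeba does not support transactions); server-side model);
Monitoring (liveness; master-slave replication lag; capacity (tables, disk); traffic (requests); hit ratio (buffer pool); collection and reporting of key data);
Web monitoring and alerting (an interface friendly to ops and DBAs; can trigger cluster management operations (manual scale-out, switching to a new master); alerts on abnormal monitoring data (email, SMS; the channel varies with severity));
Metadata service (stores the data sharding rules (configuration center); election service; implements the fast paxos protocol; atomic broadcast protocol for data; data notification service; lock service; application location service);
Single-point switchover service (when the master goes down, promote the standby or a slave to new master (check whether ssh is reachable, fetch binlogs to backfill data), keeping data as consistent as possible; strategy for choosing the new master; once the new master is decided, notify the front-end proxy layer);
Data migration service (scale out based on monitoring data and threshold values; full + incremental; automatic cleanup of redundant data; automatic or manual migration)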
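One plausible strategy for choosing the new master is to promote the slave that has read furthest into the old master's binlog, since it has the least data to backfill. A minimal Python sketch under that assumption; the host names and coordinates are hypothetical, and the inputs correspond to the Master_Log_File and Read_Master_Log_Pos fields of SHOW SLAVE STATUS:

```python
def pick_new_master(candidates):
    """Pick the host with the most advanced (binlog file number, position).

    candidates maps host -> (Master_Log_File, Read_Master_Log_Pos).
    """
    def coordinates(item):
        log_file, position = item[1]
        file_number = int(log_file.rsplit(".", 1)[1])  # "mysql-bin.000005" -> 5
        return (file_number, position)

    return max(candidates.items(), key=coordinates)[0]


candidates = {
    "slave1": ("mysql-bin.000005", 198),
    "slave2": ("mysql-bin.000005", 107),
    "slave3": ("mysql-bin.000004", 900),
}
print(pick_new_master(candidates))
```

This only covers the selection step; the switchover service described above would still need to backfill the missing events from the old master's binlogs (if it is reachable over ssh) before repointing the other slaves and notifying the proxy layer.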