IV 12 MySQL+drbd+heartbeat

 

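One master with multiple slaves is the most common database architecture. It is simple to deploy and easy to maintain. Read/write splitting can be implemented through a proxy or in the application, and several slaves placed behind LVS or haproxy load-balance the read traffic, which removes the single point of failure for reads. The single master, however, is still a single point of failure: if it goes down, writes stop. The simplest remedy is manual intervention backed by monitoring: once the master fails, an administrator promotes the semi-synchronous slave to master and repoints the other slaves at the new master. Manual intervention works, but it is not suitable where availability requirements are strict.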

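Note:

Under normal conditions MySQL-M-active handles writes and MySQL-M-inactive is invisible to clients. The MySQL slaves handle reads and can additionally be load-balanced. Master-slave replication uses MySQL's own mechanism and goes through the VIP. Read/write splitting for the web servers is implemented in the application itself, or with open-source software such as mysql proxy or amoeba.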

[Figure 1: MySQL+drbd+heartbeat]

Note: dual-master hot-standby mode

 

 

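1. Install and configure heartbeat

Environment:

VIP (10.96.20.8)

master: eth0 (10.96.20.113), eth1 (172.16.1.113, no gateway or DNS configured), hostname (test-master)

backup: eth0 (10.96.20.114), eth1 (172.16.1.114, no gateway or DNS configured), hostname (test-backup)

Each host has two NICs and two disks.

Note: eth0 carries the management IP; eth1 is the heartbeat link and the drbd replication channel. If heartbeat and data replication share one NIC in production, rate-limit the replication traffic so that bandwidth is left for heartbeat.

Note: keep the VMware and Xshell tab labels consistent. In a production environment every host should have an entry in /etc/hosts, which simplifies distribution, management, and maintenance.


test-master (configure each of the following: the hostname in /etc/sysconfig/network, which must match the output of uname -n; the /etc/hosts file; mutual ssh trust between the two hosts; time synchronization; iptables; selinux):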

[root@test-master ~]# cat /etc/redhat-release

Red Hat Enterprise Linux Server release 6.5 (Santiago)

[root@test-master ~]# uname -rm

2.6.32-431.el6.x86_64 x86_64

[root@test-master ~]# uname -n

test-master

[root@test-master ~]# ifconfig | grep eth0 -A 1

eth0     Link encap:Ethernet  HWaddr 00:0C:29:1F:B6:AC

         inet addr:10.96.20.113 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-master ~]# ifconfig | grep eth1 -A 1

eth1     Link encap:Ethernet  HWaddr 00:0C:29:1F:B6:B6

         inet addr:172.16.1.113 Bcast:172.16.1.255 Mask:255.255.255.0

[root@test-master ~]# route add -host 172.16.1.114 dev eth1   #(add a host route so that heartbeat traffic leaves through the designated NIC; this command can be appended to /etc/rc.local, or a static route can be configured instead: # vim /etc/sysconfig/network-scripts/route-eth1 and add 172.16.1.114/24 via 172.16.1.113)

[root@test-master ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in ./.ssh/id_rsa.

Your public key has been saved in ./.ssh/id_rsa.pub.

The key fingerprint is:

29:c3:a3:68:81:43:59:2f:0a:ad:8a:54:56:b0:1e:12 root@test-master

The key's randomart image is:

+--[ RSA 2048]----+

| E o..           |

| .+ +            |

|.+.* .           |

|oo* o.  .       |

|+o.. = S        |

|+. o . +         |

|o o .            |

| .               |

|                 |

+-----------------+

[root@test-master ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-backup

The authenticity of host 'test-backup (10.96.20.114)' can't be established.

RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'test-backup' (RSA) to the list of known hosts.

root@test-backup's password:

Now try logging into the machine, with "ssh 'root@test-backup'", and check in:

 

 .ssh/authorized_keys

 

to make sure we haven't added extra keys that you weren't expecting.

[root@test-master ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[root@test-master ~]# service crond restart

Stopping crond:                                           [  OK  ]

Starting crond:                                            [  OK  ]

[root@test-master ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[root@test-master ~]# rpm -ivh epel-release-6-8.noarch.rpm

warning: epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY

Preparing...               ########################################### [100%]

  1:epel-release          ########################################### [100%]

[root@test-master ~]# yum search heartbeat

……

heartbeat-devel.i686 : Heartbeat development package

heartbeat-devel.x86_64 : Heartbeat development package

heartbeat-libs.i686 : Heartbeat libraries

heartbeat-libs.x86_64 : Heartbeat libraries

heartbeat.x86_64 : Messaging and membership subsystem for High-Availability Linux

[root@test-master ~]# yum -y install heartbeat

[root@test-master ~]# chkconfig heartbeat off

[root@test-master ~]# chkconfig --list heartbeat

heartbeat          0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

test-backup

[root@test-backup ~]# uname -n

test-backup

[root@test-backup ~]# ifconfig | grep eth0 -A 1

eth0     Link encap:Ethernet  HWaddr 00:0C:29:15:E6:BB

         inet addr:10.96.20.114 Bcast:10.96.20.255 Mask:255.255.255.0

[root@test-backup ~]# ifconfig | grep eth1 -A 1

eth1     Link encap:Ethernet  HWaddr 00:0C:29:15:E6:C5

         inet addr:172.16.1.114 Bcast:172.16.1.255 Mask:255.255.255.0

[root@test-backup ~]# route add -host 172.16.1.113 dev eth1

[root@test-backup ~]# ssh-keygen -t rsa -f ./.ssh/id_rsa -P ''

Generating public/private rsa key pair.

Your identification has been saved in ./.ssh/id_rsa.

Your public key has been saved in ./.ssh/id_rsa.pub.

The key fingerprint is:

08:ea:6a:44:7f:1a:c9:bf:ff:01:d5:32:e5:39:1b:b8 root@test-backup

The key's randomart image is:

+--[ RSA 2048]----+

|           .    |

|         = .    |

|   .    = *     |

| . . . .. + +    |

|. + . ..SE .     |

| o = . .        |

|. . =   .       |

| o . .   .      |

|o   .o...       |

+-----------------+

[root@test-backup ~]# ssh-copy-id -i ./.ssh/id_rsa root@test-master

The authenticity of host 'test-master (10.96.20.113)' can't be established.

RSA key fingerprint is 63:f5:2e:dc:96:64:54:72:8e:14:7e:ec:ef:b8:a1:0c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'test-master' (RSA) to the list of known hosts.

root@test-master's password:

Now try logging into the machine, with "ssh 'root@test-master'", and check in:

 

 .ssh/authorized_keys

 

to make sure we haven't added extra keys that you weren't expecting.

[root@test-backup ~]# crontab -l

*/5 * * * * /usr/sbin/ntpdate time.windows.com &> /dev/null

[root@test-backup ~]# service crond restart

Stopping crond:                                           [  OK  ]

Starting crond:                                            [  OK  ]

[root@test-backup ~]# wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

[root@test-backup ~]# rpm -ivh epel-release-6-8.noarch.rpm

[root@test-backup ~]# yum -y install heartbeat

[root@test-backup ~]# chkconfig heartbeat off

[root@test-backup ~]# chkconfig --list heartbeat

heartbeat          0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

test-master

[root@test-master ~]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

[root@test-master ~]# cd /etc/ha.d

[root@test-master ha.d]# ls

authkeys ha.cf  harc  haresources rc.d  README.config  resource.d shellfuncs

[root@test-master ha.d]# vim authkeys   #(generate a random string with # dd if=/dev/random count=1 bs=512 | md5sum and put it after sha1)

auth 1

1 sha1 912d6402295ac8d47109e56b177073b9

[root@test-master ha.d]# chmod 600 authkeys   #(this file must have mode 600, otherwise the service reports an error on startup)

[root@test-master ha.d]# ll !$

ll authkeys

-rw-------. 1 root root 692 Aug  7 21:51 authkeys
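
The authkeys steps above can be scripted. The following is only a minimal sketch: write_authkeys is a hypothetical helper name, and it reads /dev/urandom rather than the /dev/random used above, on the assumption that a non-blocking source is acceptable here.

```shell
#!/bin/sh
# write_authkeys FILE: create a heartbeat authkeys file with a random sha1 key.
# Hypothetical helper; /dev/urandom is an assumption (the text uses /dev/random).
write_authkeys() {
    file=$1
    # 512 random bytes hashed with md5sum yield a 32-character hex string.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{print $1}')
    # Create the file under a restrictive umask, then force mode 600 in case
    # the file already existed with looser permissions.
    ( umask 077 && printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}
```

On the hosts above it would be called as write_authkeys /etc/ha.d/authkeys on one node, after which the file is copied to the peer so both sides share the same key.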

[root@test-master ha.d]# vim ha.cf

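debugfile /var/log/ha-debug   #(debug log)

logfile /var/log/ha-log

logfacility     local1  #(configure rsyslog to receive these logs through local1)

keepalive 2   #(heartbeat interval: one heartbeat is sent every 2s)

deadtime 30   #(if the backup node receives no heartbeat from the primary node within 30s, it immediately takes over the peer's resources)

warntime 10   #(heartbeat-late threshold of 10s: if the backup node receives no heartbeat from the primary node within 10s, a warning is written to the log, but no failover happens)

initdead 120   #(after heartbeat first starts, wait 120s before starting the primary node's resources; this gives the peer's heartbeat service time to come up, and the value should be at least twice deadtime)

udpport 694

#bcast eth0   #(send heartbeats as Ethernet broadcasts on eth0; to carry heartbeats over two physical networks use bcast eth0 eth1)

mcast eth0 225.0.0.11 694 1 0   #(multicast parameters; the multicast address must be unique within the LAN because several heartbeat clusters may coexist; multicast uses class D addresses (224.0.0.0 to 239.255.255.255); the format is mcast dev mcast_group port ttl loop)

auto_failback on   #(fail back to the primary node once it recovers)

node test-master   #(primary node hostname, as reported by uname -n)

node test-backup   #(backup node hostname)

crm no   #(whether to enable CRM)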
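
The timing values above depend on each other, so a quick sanity check before starting the service can catch typos. The sketch below uses a hypothetical helper, check_ha_timing, and assumes the values are plain numbers of seconds. The rule that initdead is at least twice deadtime comes from the note above; the rule that warntime stays below deadtime is an assumption based on how the two options are described.

```shell
#!/bin/sh
# check_ha_timing FILE: print "ok" if warntime < deadtime and
# initdead >= 2 * deadtime; otherwise print the violated rule and return 1.
check_ha_timing() {
    awk '
        { sub(/#.*/, "") }                      # drop trailing annotations
        $1 == "deadtime" { deadtime = $2 }
        $1 == "warntime" { warntime = $2 }
        $1 == "initdead" { initdead = $2 }
        END {
            if (warntime + 0 >= deadtime + 0) {
                print "warntime must be below deadtime"; exit 1
            }
            if (initdead + 0 < 2 * deadtime) {
                print "initdead must be at least 2 * deadtime"; exit 1
            }
            print "ok"
        }
    ' "$1"
}
```

For example, check_ha_timing /etc/ha.d/ha.cf && service heartbeat start would refuse to start the service when the values are inconsistent.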

[root@test-master ha.d]# vim haresources

test-master     IPaddr::10.96.20.8/24/eth0   #(equivalent to running # /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 stop|start; IPaddr is a script under /etc/ha.d/resource.d/)

[root@test-master ha.d]# scp authkeys ha.cf haresources root@test-backup:/etc/ha.d/

authkeys                                                                                           100%  692     0.7KB/s  00:00   

ha.cf                                                                                              100%   10KB  10.3KB/s  00:00   

haresources                                                                                        100% 5944     5.8KB/s   00:00   

[root@test-master ha.d]# service heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

 

[root@test-master ha.d]# ssh test-backup 'service heartbeat start'

Starting High-Availability services: 2016/08/07_22:39:00 INFO:  Resource is stopped

Done.

[root@test-master ha.d]# ps aux | grep heartbeat

root     63089  0.0  3.1 50124  7164 ?        SLs 22:38   0:00 heartbeat: master control process

root     63093  0.0  3.1 50076  7116 ?        SL  22:38   0:00 heartbeat: FIFO reader

root     63094  0.0  3.1 50072  7112 ?        SL  22:38   0:00 heartbeat: write: mcast eth0

root     63095  0.0  3.1 50072  7112 ?        SL  22:38   0:00 heartbeat: read: mcast eth0

root     63136  0.0  0.3 103264  836 pts/0    S+   22:39  0:00 grep heartbeat

[root@test-master ha.d]# ssh test-backup 'ps aux | grep heartbeat'

root      3050  0.0 3.1  50124  7164 ?       SLs  22:39   0:00 heartbeat: master control process

root      3054  0.0  3.1 50076  7116 ?        SL  22:39   0:00 heartbeat: FIFO reader

root      3055  0.0  3.1 50072  7112 ?        SL  22:39   0:00 heartbeat: write: mcast eth0

root      3056  0.0  3.1 50072  7112 ?        SL  22:39   0:00 heartbeat: read: mcast eth0

root      3094  0.0  0.5 106104 1368 ?        Ss   22:39  0:00 bash -c ps aux | grep heartbeat

root      3108  0.0  0.3 103264  832 ?        S    22:39  0:00 grep heartbeat

[root@test-master ha.d]# netstat -tnulp |grep heartbeat

udp       0      0 225.0.0.11:694              0.0.0.0:*                               63094/heartbeat: wr

udp       0      0 0.0.0.0:50268               0.0.0.0:*                               63094/heartbeat: wr

[root@test-master ha.d]# ssh test-backup 'netstat -tnulp | grep heartbeat'

udp       0      0 0.0.0.0:58019               0.0.0.0:*                               3055/heartbeat: wri

udp       0      0 225.0.0.11:694              0.0.0.0:*                               3055/heartbeat: wri

[root@test-master ha.d]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# service heartbeat stop

Stopping High-Availability services: Done.

 

[root@test-master ha.d]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# service heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

 

[root@test-master ha.d]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ha.d]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0
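
The failover test above works by reading the ip addr output on each node by eye. A small sketch of the same check as a function follows; has_vip is a hypothetical helper, not part of heartbeat. It matches the address as a fixed string, whereas the grep 10.96.20 used above treats the dots as wildcards.

```shell
#!/bin/sh
# has_vip VIP: read `ip addr` output on stdin and succeed if VIP is configured.
has_vip() {
    # Match "inet VIP/" literally so that 10.96.20.8 does not match 10.96.20.80.
    grep -Fq "inet $1/"
}
```

Typical use would be ip addr | has_vip 10.96.20.8 && echo "this node holds the VIP", run locally or through ssh on the peer.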

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services: Done.

 

[root@test-master ~]# ssh test-backup 'service heartbeat stop'

Stopping High-Availability services: Done.

 

 

 

2. Install and configure drbd

test-master

[root@test-master ~]# fdisk -l

……

Disk /dev/sdb: 2147 MB, 2147483648 bytes

255 heads, 63 sectors/track, 261 cylinders

Units = cylinders of 16065 * 512 = 8225280 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000

[root@test-master ~]# parted /dev/sdb  #(parted supports disks larger than 2T; split the new disk into two partitions, one for data and one for the drbd meta data)

GNU Parted 2.1

Using /dev/sdb

Welcome to GNU Parted! Type 'help' to view a list of commands.

(parted) h                                                               

 align-check TYPE N                       check partition N for TYPE(min|opt) alignment

 check NUMBER                            do a simple check on the file system

  cp [FROM-DEVICE] FROM-NUMBER TO-NUMBER  copy file system to another partition

 help [COMMAND]                           print general help, or help on COMMAND

  mklabel,mktable LABEL-TYPE               create a new disklabel (partition table)

 mkfs NUMBER FS-TYPE                     make a FS-TYPE file system on partition NUMBER

  mkpart PART-TYPE [FS-TYPE] START END     make a partition

 mkpartfs PART-TYPE FS-TYPE START END    make a partition with a file system

 move NUMBER START END                   move partition NUMBER

 name NUMBER NAME                        name partition NUMBER as NAME

  print [devices|free|list,all|NUMBER]     display the partition table, available devices, free space, all found partitions, or a

       particular partition

 quit                                    exit program

 rescue START END                        rescue a lost partition near START and END

 resize NUMBER START END                 resize partition NUMBER and its file system

  rm NUMBER                               delete partition NUMBER

 select DEVICE                           choose the device to edit

  set NUMBER FLAG STATE                   change the FLAG on partition NUMBER

 toggle [NUMBER [FLAG]]                  toggle the state of FLAG on partition NUMBER

 unit UNIT                               set the default unit to UNIT

 version                                 display the version number and copyright information of GNU Parted

(parted) mklabel gpt                                                     

(parted) mkpart primary 0 1024

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) mkpart primary 1025 2147

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore

(parted) p                                                                

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 2147MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

 

Number Start   End     Size   File system  Name     Flags

 1     17.4kB  1024MB  1024MB               primary

 2     1025MB  2147MB  1122MB               primary

[root@test-master ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-master ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

warning: elrepo-release-6-6.el6.elrepo.noarch.rpm: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY

Preparing...               ########################################### [100%]

  1:elrepo-release        ########################################### [100%]

[root@test-master ~]# yum -y install drbd kmod-drbd84

[root@test-master ~]# modprobe drbd

FATAL: Module drbd not found.

[root@test-master ~]# yum -y install kernel*   #(reboot the system after updating the kernel)

[root@test-master ~]# uname -r

2.6.32-642.3.1.el6.x86_64

[root@test-master ~]# depmod

[root@test-master ~]# lsmod | grep drbd

drbd                  372759  0

libcrc32c               1246  1 drbd

[root@test-master ~]# ll /usr/src/kernels/

total 12

drwxr-xr-x. 22 root root 4096 Mar 31 06:46 2.6.32-431.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug  8 03:40 2.6.32-642.3.1.el6.x86_64

drwxr-xr-x. 22 root root 4096 Aug  8 03:40 2.6.32-642.3.1.el6.x86_64.debug

[root@test-master ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[root@test-master ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1
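
One detail worth checking: on RHEL 6, /etc/rc.sysinit only runs files under /etc/sysconfig/modules/ that are executable, and the commands above never set the execute bit. The sketch below writes the same file with a shebang line and mode 755. install_drbd_modules_file is a hypothetical helper, and the directory is a parameter only so that it can be tried in a scratch directory first.

```shell
#!/bin/sh
# install_drbd_modules_file DIR: write DIR/drbd.modules and make it executable.
# DIR would be /etc/sysconfig/modules on the hosts above.
install_drbd_modules_file() {
    dir=$1
    printf '#!/bin/sh\nmodprobe drbd > /dev/null 2>&1\n' > "$dir/drbd.modules"
    # Without the execute bit the boot scripts skip the file.
    chmod 755 "$dir/drbd.modules"
}
```

On a real node this would be install_drbd_modules_file /etc/sysconfig/modules, followed by lsmod | grep drbd after the next reboot to confirm the module was loaded.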

 

test-backup

[root@test-backup ~]# parted /dev/sdb

(parted) mklabel gpt

(parted) mkpart primary 0 4096                                           

Warning: The resulting partition is not properly aligned for best performance.

Ignore/Cancel? Ignore                                                    

(parted) mkpart primary 4097 5368                                        

(parted) p                                                                

Model: VMware, VMware Virtual S (scsi)

Disk /dev/sdb: 5369MB

Sector size (logical/physical): 512B/512B

Partition Table: gpt

 

Number Start   End     Size   File system  Name     Flags

 1     17.4kB  4096MB  4096MB               primary

 2     4097MB  5368MB  1271MB               primary

[root@test-backup ~]# wget http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-backup ~]# rpm -ivh elrepo-release-6-6.el6.elrepo.noarch.rpm

[root@test-backup ~]# ll /etc/yum.repos.d/

total 20

-rw-r--r--. 1 root root 1856 Jul 19 00:28 CentOS6-Base-163.repo

-rw-r--r--. 1 root root 2150 Feb  9  2014 elrepo.repo

-rw-r--r--. 1 root root  957 Nov  4  2012 epel.repo

-rw-r--r--. 1 root root 1056 Nov  4  2012 epel-testing.repo

-rw-r--r--. 1 root root  529 Mar 30 23:00 rhel-source.repo.bak

[root@test-backup ~]# yum -y install drbd kmod-drbd84

[root@test-backup ~]# yum -y install kernel*

[root@test-backup ~]# depmod

[root@test-backup ~]# lsmod | grep drbd

drbd                  372759  0

libcrc32c               1246  1 drbd

[root@test-backup ~]# chkconfig drbd off

[root@test-backup ~]# chkconfig --list drbd

drbd              0:off 1:off 2:off 3:off 4:off 5:off 6:off

[root@test-backup ~]# echo "modprobe drbd > /dev/null 2>&1" > /etc/sysconfig/modules/drbd.modules

[root@test-backup ~]# cat !$

cat /etc/sysconfig/modules/drbd.modules

modprobe drbd > /dev/null 2>&1

 

test-master

[root@test-master ~]# vim /etc/drbd.d/global_common.conf

[root@test-master ~]# egrep -v "#|^$" /etc/drbd.d/global_common.conf

global {

         usage-count no;

}

common {

         handlers {

         }

         startup {

         }

         options {

         }

         disk {

                on-io-error detach;

         }

         net {

         }

         syncer {

                   rate 50M;

                   verify-alg crc32c;

         }

}

[root@test-master ~]# vim /etc/drbd.d/data.res

resource data {

       protocol C;

       on test-master {

                device  /dev/drbd0;

                disk    /dev/sdb1;

                address 172.16.1.113:7788;

                meta-disk       /dev/sdb2[0];

       }

       on test-backup {

                device  /dev/drbd0;

                disk    /dev/sdb1;

                address 172.16.1.114:7788;

                meta-disk       /dev/sdb2[0];

       }

}

[root@test-master ~]# cd /etc/drbd.d

[root@test-master drbd.d]# scp global_common.conf data.res root@test-backup:/etc/drbd.d/

global_common.conf                                                                                     100% 2144     2.1KB/s   00:00   

data.res                                                                                                100%  251    0.3KB/s   00:00   

 

[root@test-master drbd.d]# drbdadm --help

USAGE: drbdadm COMMAND [OPTION...] {all|RESOURCE...}

GENERAL OPTIONS:

 --stacked, -S

 --dry-run, -d

 --verbose, -v

  --config-file=..., -c ...

 --config-to-test=..., -t ...

 --drbdsetup=..., -s ...

 --drbdmeta=..., -m ...

 --drbd-proxy-ctl=..., -p ...

 --sh-varname=..., -n ...

 --peer=..., -P ...

 --version, -V

 --setup-option=..., -W ...

 --help, -h

 

COMMANDS:

 attach                             disk-options                      

 detach                             connect                           

 net-options                        disconnect                        

 up                                 resource-options                  

 down                               primary                           

 secondary                          invalidate                        

 invalidate-remote                  outdate                           

 resize                             verify                            

 pause-sync                         resume-sync                       

 adjust                            adjust-with-progress              

 wait-connect                       wait-con-int                      

 role                               cstate                            

 dstate                             dump                              

 dump-xml                           create-md                          

 show-gi                           get-gi                            

 dump-md                            wipe-md                           

 apply-al                           hidden-commands    

[root@test-master drbd.d]# drbdadm create-md data

initializing activity log

NOT initializing bitmap

Writing meta data...

New drbd meta data block successfully created.

[root@test-master drbd.d]# ssh test-backup 'drbdadm create-md data'

NOT initializing bitmap

initializing activity log

Writing meta data...

New drbd meta data block successfully created.

[root@test-master drbd.d]# drbdadm up data

[root@test-master drbd.d]# ssh test-backup 'drbdadm up data'

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

   ns:0 nr:0 dw:0 dr:0 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----

   ns:0 nr:0 dw:0 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:999984

[root@test-master drbd.d]# drbdadm -- --overwrite-data-of-peer primary data   #(run on the primary only)

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

   ns:339968 nr:0 dw:0 dr:340647 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:660016

         [=====>..............] sync'ed: 34.3% (660016/999984)K

         finish: 0:00:15 speed: 42,496 (42,496) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

   ns:630784 nr:0 dw:0 dr:631463 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:369200

         [===========>........] sync'ed: 63.3% (369200/999984)K

         finish: 0:00:09 speed: 39,424 (39,424) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----

   ns:942080 nr:0 dw:0 dr:942759 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:57904

         [=================>..] sync'ed: 94.3% (57904/999984)K

         finish: 0:00:01 speed: 39,196 (39,252) K/sec

[root@test-master drbd.d]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

    ns:999983 nr:0 dw:0 dr:1000662 al:8 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-master drbd.d]# ssh test-backup 'cat /proc/drbd'

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----

   ns:0 nr:999983 dw:999983 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
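
The three fields worth watching in the /proc/drbd output above are cs (connection state), ro (roles, local/peer), and ds (disk states, local/peer). The initial sync is finished once cs is Connected and ds is UpToDate/UpToDate. A small sketch that pulls these fields out of the status text follows; drbd_fields is a hypothetical helper name, and it only assumes the field layout shown in the listings above.

```shell
#!/bin/sh
# drbd_fields: read /proc/drbd-style text on stdin and print, for each device
# line, "cs=<connection state> ro=<roles> ds=<disk states>".
drbd_fields() {
    awk '
        /cs:/ {
            cs = ro = ds = ""
            for (i = 1; i <= NF; i++) {
                f = $i
                sub(/^[0-9]+:/, "", f)        # strip a leading "0:" minor number
                if (f ~ /^cs:/) cs = substr(f, 4)
                if (f ~ /^ro:/) ro = substr(f, 4)
                if (f ~ /^ds:/) ds = substr(f, 4)
            }
            print "cs=" cs " ro=" ro " ds=" ds
        }
    '
}
```

For example, drbd_fields < /proc/drbd on either node gives a one-line summary per device that is easier to compare across hosts than the raw output.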

[root@test-master drbd.d]# mkdir /drbd

[root@test-master drbd.d]# ssh test-backup 'mkdir /drbd'

[root@test-master drbd.d]# mkfs.ext4 -b 4096 /dev/drbd0   #(run on the primary only; do not format the meta data partition)

Writing superblocks and filesystem accounting information: done

[root@test-master drbd.d]# tune2fs -c -1 /dev/drbd0

tune2fs 1.41.12 (17-May-2010)

Setting maximal mount count to -1

[root@test-master drbd.d]# mount /dev/drbd0 /drbd

[root@test-master drbd.d]# cd /drbd

[root@test-master drbd]# for i in `seq 1 10`; do touch test$i; done

[root@test-master drbd]# ls

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9

[root@test-master drbd]# cd

[root@test-master ~]# umount /dev/drbd0

[root@test-master ~]# drbdadm secondary data

[root@test-master ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

   ns:1032538 nr:0 dw:32554 dr:1001751 al:19 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

 

test-backup

[root@test-backup ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----

   ns:0 nr:1032538 dw:1032538 dr:0 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-backup ~]# drbdadm primary data

[root@test-backup ~]# cat /proc/drbd

version: 8.4.7-1 (api:1/proto:86-101)

GIT-hash: 3a6a769340ef93b1ba2792c6461250790795db49 build by mockbuild@Build64R6, 2016-01-12 13:27:11

 0:cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----

   ns:0 nr:1032538 dw:1032538 dr:679 al:16 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

[root@test-backup ~]# mount /dev/drbd0 /drbd

[root@test-backup ~]# ls /drbd

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9

 

 

3. Test heartbeat+drbd together

[root@test-master ~]# ssh test-backup 'umount /drbd'

[root@test-master ~]# ssh test-backup 'drbdadm secondary data'

[root@test-master ~]# service drbd stop

Stopping all DRBD resources: .

[root@test-master ~]# ssh test-backup 'service drbd stop'

Stopping all DRBD resources: .

[root@test-master ~]# service heartbeat status

heartbeat is stopped. No process

[root@test-master ~]# ssh test-backup 'service heartbeat status'

heartbeat is stopped. No process

[root@test-master ~]# ll /etc/ha.d/resource.d/{Filesystem,drbddisk}

-rwxr-xr-x. 1 root root 3162 Jan 12  2016 /etc/ha.d/resource.d/drbddisk

-rwxr-xr-x. 1 root root 1903 Dec  2  2013 /etc/ha.d/resource.d/Filesystem

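[root@test-master ~]# vim /etc/ha.d/haresources   #(the line below is equivalent to running each script with its arguments, e.g. # /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start|stop, # /etc/ha.d/resource.d/drbddisk data start|stop, # /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start|stop; heartbeat controls the resources in the order they are configured, so if heartbeat misbehaves, check the logs and run these commands one at a time to troubleshoot)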

test-master     IPaddr::10.96.20.8/24/eth0      drbddisk::data  Filesystem::/dev/drbd0::/drbd::ext4

[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/

haresources                                                                                               100% 5996     5.9KB/s   00:00 
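
To troubleshoot by hand as the note above suggests, it helps to have the per-resource commands written out in order. The sketch below only prints them and runs nothing. ha_resource_cmds is a hypothetical helper, the device path /dev/drbd0 follows the note above, and the reverse order on stop reflects how heartbeat releases haresources entries from right to left.

```shell
#!/bin/sh
# ha_resource_cmds start|stop: print the resource.d commands that correspond to
# the haresources line, in the order heartbeat would run them.
ha_resource_cmds() {
    d=/etc/ha.d/resource.d
    case $1 in
        start)
            echo "$d/IPaddr 10.96.20.8/24/eth0 start"
            echo "$d/drbddisk data start"
            echo "$d/Filesystem /dev/drbd0 /drbd ext4 start"
            ;;
        stop)
            # Release in reverse: unmount, demote drbd, then drop the VIP.
            echo "$d/Filesystem /dev/drbd0 /drbd ext4 stop"
            echo "$d/drbddisk data stop"
            echo "$d/IPaddr 10.96.20.8/24/eth0 stop"
            ;;
        *)
            echo "usage: ha_resource_cmds start|stop" >&2
            return 2
            ;;
    esac
}
```

Running ha_resource_cmds start and then executing the printed lines one at a time shows which resource fails, which is usually quicker than reading through ha-log.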

[root@test-master ~]# service drbd start   #(run on the primary node)

Starting DRBD resources: [

    create res: data

  prepare disk: data

   adjust disk: data

    adjust net: data

]

..........

***************************************************************

 DRBD's startup script waits for the peer node(s) to appear.

 - If this node was already a degraded cluster before the

   reboot, the timeout is 0 seconds. [degr-wfc-timeout]

 - If the peer was available before the reboot, the timeout

   is 0 seconds. [wfc-timeout]

   (These values are for resource 'data'; 0 sec -> wait forever)

 To abort waiting enter 'yes' [  23]:

[root@test-backup ~]# service drbd start   #(run on the backup node)

Starting DRBD resources: [

    create res: data

  prepare disk: data

   adjust disk: data

    adjust net: data

]

.

[root@test-master ~]# drbdadm role data

Secondary/Secondary

[root@test-master ~]# ssh test-backup 'drbdadm role data'

Secondary/Secondary

[root@test-master ~]# drbdadm -- --overwrite-data-of-peer primary data

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# service heartbeat start

Starting High-Availability services: INFO:  Resource is stopped

Done.

[root@test-master ~]# ssh test-backup 'service heartbeat start'

Starting High-Availability services: 2016/08/09_03:08:11 INFO:  Resource is stopped

Done.

[root@test-master ~]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# df -h

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 6.3G   11G  38% /

tmpfs           112M     0 112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M 1.3M  896M   1% /drbd

[root@test-master ~]# ls /drbd

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9

 

[root@test-master ~]# service heartbeat stop

Stopping High-Availability services: Done.

[root@test-master ~]# ssh test-backup 'ip addr | grep 10.96.20'

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# ssh test-backup 'df -h'

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 3.9G   13G  24% /

tmpfs           112M    0  112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M 1.3M  896M   1% /drbd

[root@test-master ~]# ssh test-backup 'ls /drbd'

lost+found

test1

test10

test2

test3

test4

test5

test6

test7

test8

test9

 

[root@test-master ~]# drbdadm role data  

Secondary/Primary

[root@test-master ~]# service heartbeat start   #(after a node recovers, first make sure drbd is back in a healthy state, then start the heartbeat service)

Starting High-Availability services: INFO:  Resource is stopped

Done.

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master ~]# df -h

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 6.3G   11G  38% /

tmpfs           112M     0 112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M 1.3M  896M   1% /drbd

[root@test-master ~]# ls /drbd

lost+found test1  test10  test2 test3  test4  test5 test6  test7  test8 test9
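
The advice above, to get drbd healthy before starting heartbeat on a recovered node, can be turned into a small guard. This is a sketch under stated assumptions: drbd_ready_for_heartbeat is a hypothetical helper, and "healthy" is taken to mean a connection state of Connected with both disks UpToDate. The states are passed in as arguments so the function itself does not need drbd installed.

```shell
#!/bin/sh
# drbd_ready_for_heartbeat CSTATE DSTATE: succeed only if drbd is connected
# and both sides are up to date.
#   CSTATE: connection state, e.g. the output of `drbdadm cstate data`
#   DSTATE: disk states,      e.g. the output of `drbdadm dstate data`
drbd_ready_for_heartbeat() {
    [ "$1" = "Connected" ] && [ "$2" = "UpToDate/UpToDate" ]
}
```

On a recovered node this could be used as drbd_ready_for_heartbeat "$(drbdadm cstate data)" "$(drbdadm dstate data)" && service heartbeat start, so that heartbeat is not started while a resync is still running.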

 

 

4. Install and configure MySQL on the two masters and the one slave

MySQL-master-active

[root@test-master ~]# drbdadm role data

Primary/Secondary

[root@test-master ~]# groupadd -g 3306 mysql

[root@test-master ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[root@test-master ~]# id mysql

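uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[root@test-master ~]# mkdir /drbd/data   #(on both masters the DB data directory is created under the drbd mount point; drbd replicates only the MySQL data, while the program files stay under /usr/local/)

[root@test-master ~]# chown -R mysql.mysql /drbd/data

[root@test-master ~]# rz   #(upload the MySQL binary package)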

[root@test-master ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[root@test-master ~]# cd /usr/local

[root@test-master local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[root@test-master local]# cd mysql

[root@test-master mysql]# chown -R root.mysql ./

[root@test-master mysql]# scripts/mysql_install_db --user=mysql --datadir=/drbd/data   #(initialize only on the master node that is currently serving clients, i.e. the drbd primary side)

Installing MySQL system tables...

160810 19:46:23 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 3908 ...

OK

…….

[root@test-master mysql]# cp support-files/my-large.cnf /etc/my.cnf

[root@test-master mysql]# vim /etc/my.cnf   #(add the following settings)

[mysqld]

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[root@test-master mysql]# egrep -v "#|^$" /etc/my.cnf

[client]

port           =3306

socket                =/tmp/mysql.sock

[mysqld]

port           =3306

socket                =/tmp/mysql.sock

skip-external-locking

key_buffer_size = 256M

max_allowed_packet = 1M

table_open_cache = 256

sort_buffer_size = 1M

read_buffer_size = 1M

read_rnd_buffer_size = 4M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size= 16M

thread_concurrency = 8

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[mysqldump]

quick

max_allowed_packet = 16M

[mysql]

no-auto-rehash

[myisamchk]

key_buffer_size = 128M

sort_buffer_size = 128M

read_buffer = 2M

write_buffer = 2M

[mysqlhotcopy]

interactive-timeout

[root@test-master mysql]# scp /etc/my.cnf root@test-backup:/etc/

my.cnf                                                                                                     100% 4787     4.7KB/s   00:00   

[root@test-master mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[root@test-master mysql]# chkconfig --add mysqld

[root@test-master mysql]# chkconfig mysqld off

[root@test-master mysql]# chkconfig --list mysqld

mysqld            0:off 1:off 2:off 3:off 4:off 5:off 6:off

[root@test-master mysql]# service mysqld start

Starting MySQL.....                                        [  OK  ]

[root@test-master mysql]# /usr/local/mysql/bin/mysql

……

mysql> GRANT ALL ON *.* TO 'root'@'%' IDENTIFIED BY 'redhat';

Query OK, 0 rows affected (0.28 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'repluser'@'%' IDENTIFIED BY 'repluser';

Query OK, 0 rows affected (0.17 sec)

mysql> FLUSH PRIVILEGES;

Query OK, 0 rows affected (0.04 sec)

mysql> select User,Host,Password from mysql.user;

+----------+-------------+-------------------------------------------+

| User     | Host        | Password                                  |

+----------+-------------+-------------------------------------------+

| root     | localhost   |                                           |

| root     | test-master |                                           |

| root     | 127.0.0.1   |                                           |

| root     | ::1         |                                           |

|          | localhost   |                                           |

|          | test-master |                                           |

| root     | %           | *84BB5DF4823DA319BBF86C99624479A198E6EEE9 |

| repluser | %           | *89A63F9688240669B54B5C2649EEFB795850597E |

+----------+-------------+-------------------------------------------+

8 rows in set (0.23 sec)

mysql> create database webgame;

Query OK, 1 row affected (0.10 sec)

 

mysql> show databases;

+--------------------+

| Database           |

+--------------------+

| information_schema |

| mysql              |

| performance_schema |

| test               |

| webgame            |

+--------------------+

5 rows in set (0.04 sec)

mysql> \q

Bye

[root@test-master mysql]# ip addr | grep 10.96.20

   inet 10.96.20.113/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

[root@test-master mysql]# df -h | grep drbd0

/dev/drbd0      946M  31M  866M   4% /drbd

[root@test-master ~]# vim /etc/ha.d/haresources

test-master     IPaddr::10.96.20.8/24/eth0      drbddisk::data  Filesystem::/dev/drbd0::/drbd::ext4     mysqld

[root@test-master ~]# scp /etc/ha.d/haresources root@test-backup:/etc/ha.d/
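The order of the entries in the haresources line matters: heartbeat acquires the resources from left to right (VIP, DRBD primary role, filesystem mount, mysqld) and releases them from right to left. The sketch below only splits such a line to make both orders visible. The helper names are hypothetical and nothing here talks to heartbeat.

```shell
#!/bin/sh
# Print the resources of one haresources line in start order (left to right).
# The first field is the preferred node name and is skipped.
ha_start_order() {
    printf '%s\n' "$1" | awk '{ for (i = 2; i <= NF; i++) print $i }'
}

# Print the same resources in stop order (right to left).
ha_stop_order() {
    printf '%s\n' "$1" | awk '{ for (i = NF; i >= 2; i--) print $i }'
}

line='test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld'
echo "start order:"; ha_start_order "$line"
echo "stop order:";  ha_stop_order "$line"
```

mysqld is listed last because it needs /drbd mounted, and the mount in turn needs the DRBD device in the primary role.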

 

 

MySQL-master-inactive

[root@test-backup ~]# drbdadm role data

Secondary/Primary

[root@test-backup ~]# groupadd -g 3306 mysql

[root@test-backup ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[root@test-backup ~]# id mysql

uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[root@test-backup ~]# rz

[root@test-backup ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[root@test-backup ~]# cd /usr/local

[root@test-backup local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[root@test-backup local]# cd mysql

[root@test-backup mysql]# chown -R root.mysql ./

[root@test-backup mysql]# vim /etc/my.cnf   # (this file was copied over from master-active; confirm that it contains the following settings)

[mysqld]

datadir = /drbd/data

log-bin=mysql-bin

log-bin-index=mysql-bin.index

server-id=1

sync_binlog=1

innodb_file_per_table = 1

binlog_format=mixed

[root@test-backup mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[root@test-backup mysql]# chkconfig --add mysqld

[root@test-backup mysql]# chkconfig mysqld off

[root@test-backup mysql]# chkconfig --list mysqld

mysqld            0:off 1:off 2:off 3:off 4:off 5:off 6:off

 

 

mysql-slave

[root@localhost ~]# mkdir /mydata/data -pv

mkdir: created directory `/mydata'

mkdir: created directory `/mydata/data'

[root@localhost ~]# groupadd -g 3306 mysql

[root@localhost ~]# useradd -u 3306 -g 3306 -s /sbin/nologin -M mysql

[root@localhost ~]# id mysql

uid=3306(mysql) gid=3306(mysql) groups=3306(mysql)

[root@localhost ~]# rz

[root@localhost ~]# tar xf mysql-5.5.45-linux2.6-x86_64.tar.gz -C /usr/local

[root@localhost ~]# cd /usr/local

[root@localhost local]# ln -sv mysql-5.5.45-linux2.6-x86_64/ mysql

`mysql' -> `mysql-5.5.45-linux2.6-x86_64/'

[root@localhost local]# cd mysql

[root@localhost mysql]# chown -R root.mysql ./

[root@localhost mysql]# chown -R mysql.mysql /mydata/data

[root@localhost mysql]# cp support-files/my-large.cnf /etc/my.cnf

cp: overwrite `/etc/my.cnf'? y

[root@localhost mysql]# cp support-files/mysql.server /etc/init.d/mysqld

[root@localhost mysql]# chkconfig --add mysqld

[root@localhost mysql]# chkconfig --list mysqld

mysqld            0:off 1:off 2:on 3:on 4:on 5:on 6:off

[root@localhost mysql]# vim /etc/my.cnf

[mysqld]

datadir=/mydata/data

innodb_file_per_table=1

relay-log=relay-log

relay-log-index=relay-log.index

server-id=11   

read_only=1

skip_slave_start=1

[root@localhost mysql]# egrep -v "#|^$" /etc/my.cnf

[client]

port           =3306

socket                =/tmp/mysql.sock

[mysqld]

port           =3306

socket                =/tmp/mysql.sock

skip-external-locking

key_buffer_size = 256M

max_allowed_packet = 1M

table_open_cache = 256

sort_buffer_size = 1M

read_buffer_size = 1M

read_rnd_buffer_size = 4M

myisam_sort_buffer_size = 64M

thread_cache_size = 8

query_cache_size= 16M

thread_concurrency = 8

datadir=/mydata/data

innodb_file_per_table=1

relay-log=relay-log

relay-log-index=relay-log.index

server-id=11

read_only=1

skip_slave_start=1

[mysqldump]

quick

max_allowed_packet = 16M

[mysql]

no-auto-rehash

[myisamchk]

key_buffer_size = 128M

sort_buffer_size = 128M

read_buffer = 2M

write_buffer = 2M

[mysqlhotcopy]

interactive-timeout
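The slave must run with a server-id different from the master's (11 versus 1 here), while the two DRBD masters deliberately share one my.cnf. A quick way to read a single option back out of the [mysqld] section is sketched below. `mysqld_option` is a hypothetical helper, not a MySQL tool, and it handles only simple `key = value` lines.

```shell
#!/bin/sh
# Print the value of option $2 from the [mysqld] section of file $1.
# '-' and '_' in option names are treated as equivalent, as MySQL does.
# If the option is repeated, the last occurrence is printed.
mysqld_option() {
    awk -v key="$2" '
        BEGIN { gsub(/-/, "_", key) }
        /^[ \t]*\[/ { in_sec = ($0 ~ /^[ \t]*\[mysqld\]/); next }
        in_sec && /=/ {
            name = $0
            sub(/=.*/, "", name)
            gsub(/[ \t]/, "", name)
            gsub(/-/, "_", name)
            if (name == key) {
                val = $0
                sub(/^[^=]*=[ \t]*/, "", val)
                sub(/[ \t]+$/, "", val)
                found = val
            }
        }
        END { if (found != "") print found }
    ' "$1"
}

# Example (commented out because it needs the real file):
#   mysqld_option /etc/my.cnf server-id
```

Running it against /etc/my.cnf on each host and comparing the printed server-id values catches a copied config before replication is started.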

[root@localhost mysql]# scripts/mysql_install_db --user=mysql --datadir=/mydata/data

Installing MySQL system tables...

160810 22:18:18 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

160810 22:18:18 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46873 ...

OK

Filling help tables...

160810 22:18:19 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.

160810 22:18:19 [Note] ./bin/mysqld (mysqld 5.5.45) starting as process 46880 ...

OK

……

[root@localhost mysql]# service mysqld start

Starting MySQL..                                           [  OK  ]

[root@localhost ~]# mysql

mysql> CHANGE MASTER TO MASTER_USER='repluser',MASTER_PASSWORD='repluser',MASTER_HOST='10.96.20.8',MASTER_LOG_FILE='mysql-bin.000003',MASTER_LOG_POS=330;

Query OK, 0 rows affected (0.04 sec)

 

mysql> start slave;

Query OK, 0 rows affected (0.00 sec)

mysql> show slave status\G

……
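The binlog file and position in the CHANGE MASTER TO statement above have to match what the master reports. Assuming they are taken from `SHOW MASTER STATUS` on the active master, the statement can be assembled from that output instead of being typed by hand. `build_change_master` is a hypothetical helper, and the sketch assumes the tab-separated output of `mysql -N -e 'SHOW MASTER STATUS'`.

```shell
#!/bin/sh
# Build a CHANGE MASTER TO statement.
# Arguments: $1 master host (the VIP), $2 replication user, $3 password.
# stdin: one line of `mysql -N -e 'SHOW MASTER STATUS'` output
#        (tab-separated: File, Position, Binlog_Do_DB, Binlog_Ignore_DB).
build_change_master() {
    awk -F '\t' -v host="$1" -v user="$2" -v pass="$3" -v q="'" 'NR == 1 {
        printf "CHANGE MASTER TO MASTER_HOST=%s, MASTER_USER=%s, MASTER_PASSWORD=%s, MASTER_LOG_FILE=%s, MASTER_LOG_POS=%d;\n",
            q host q, q user q, q pass q, q $1 q, $2
    }'
}

# Demo with the values used in this article instead of a live master.
printf 'mysql-bin.000003\t330\t\t\n' | build_change_master 10.96.20.8 repluser repluser
```

On a real setup the `printf` would be replaced by the `mysql -N -e 'SHOW MASTER STATUS'` call on the master, and the generated statement piped into `mysql` on the slave. Pointing MASTER_HOST at the VIP rather than at a node address is what lets the slave follow a failover.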

 

 

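Testing is done in two steps:

First, test that the two master nodes work together. Get DRBD healthy and start its service, leave heartbeat stopped for now, and start mysqld by hand. Create a new database on master-active, then stop mysqld and demote DRBD on the active node to secondary; promote DRBD on the inactive node to primary, start mysqld there, and check on master-inactive that the new database is present.

Then test whether master-slave replication continues after the masters switch over. As shown below, it does.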
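The first test step (manual switchover with heartbeat stopped) can be written down as two short command sequences. This is a sketch, not the author's script: the function names and the DRY_RUN switch are made up, and the explicit umount/mount steps are added here on the assumption that /drbd must be unmounted before DRBD can be demoted. With DRY_RUN=1, the default, the commands are only printed.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# On the currently active master: release everything in reverse order.
demote_active() {
    run service mysqld stop
    run umount /drbd
    run drbdadm secondary data
}

# On the inactive master afterwards: acquire in forward order.
promote_inactive() {
    run drbdadm primary data
    run mount /dev/drbd0 /drbd
    run service mysqld start
}

demote_active
promote_inactive
```

This is the same order heartbeat follows for the resources in haresources, minus the VIP, so the manual run doubles as a rehearsal of the automatic takeover.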

[root@test-backup ~]# tail -f /var/log/ha-log   # (simulate a failure of the active node and watch the takeover on the inactive node)

Aug 10 22:40:38 test-backup heartbeat:[7738]: info: Local status now set to: 'up'

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Link test-master:eth0 up.

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Status update for node test-master: status active

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Comm_now_up(): updating status to active

Aug 10 22:40:39 test-backup heartbeat:[7738]: info: Local status now set to: 'active'

harc(default)[7747]:         2016/08/10_22:40:39 info: Running /etc/ha.d//rc.d/status status

Aug 10 22:40:50 test-backup heartbeat:[7738]: info: local resource transition completed.

Aug 10 22:40:50 test-backup heartbeat:[7738]: info: Initial resource acquisition complete (T_RESOURCES(us))

Aug 10 22:40:50 test-backup heartbeat:[7766]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.

Aug 10 22:40:50 test-backup heartbeat:[7738]: info: remote resource transition completed.

Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Received shutdown notice from 'test-master'.

Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Resources being acquired from test-master.

Aug 10 23:10:16 test-backup heartbeat:[7879]: info: acquire local HA resources (standby).

Aug 10 23:10:16 test-backup heartbeat:[7879]: info: local HA resource acquisition completed (standby).

Aug 10 23:10:16 test-backup heartbeat:[7738]: info: Standby resource acquisition done [all].

Aug 10 23:10:16 test-backup heartbeat:[7880]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys test-backup] to acquire.

harc(default)[7905]:         2016/08/10_23:10:16 info: Running /etc/ha.d//rc.d/status status

mach_down(default)[7922]:    2016/08/10_23:10:16 info: Taking over resource group IPaddr::10.96.20.8/24/eth0

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Acquiring resource group: test-master IPaddr::10.96.20.8/24/eth0 drbddisk::data Filesystem::/dev/drbd0::/drbd::ext4 mysqld

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[7977]:          2016/08/10_23:10:16 INFO:  Resource is stopped

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/IPaddr 10.96.20.8/24/eth0 start

IPaddr(IPaddr_10.96.20.8)[8102]:  2016/08/10_23:10:16 INFO: Adding inet address 10.96.20.8/24 with broadcast address 10.96.20.255 to device eth0

IPaddr(IPaddr_10.96.20.8)[8102]:  2016/08/10_23:10:16 INFO: Bringing device eth0 up

IPaddr(IPaddr_10.96.20.8)[8102]:  2016/08/10_23:10:16 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.96.20.8 eth0 10.96.20.8 auto not_used not_used

/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.96.20.8)[8076]:          2016/08/10_23:10:16 INFO:  Success

ResourceManager(default)[7949]: 2016/08/10_23:10:16 info: Running /etc/ha.d/resource.d/drbddisk data start

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8231]:   2016/08/10_23:10:17 INFO:  Resource is stopped

ResourceManager(default)[7949]: 2016/08/10_23:10:17 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /drbd ext4 start

Filesystem(Filesystem_/dev/drbd0)[8314]:     2016/08/10_23:10:17 INFO: Running start for /dev/drbd0 on /drbd

/usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd0)[8306]:   2016/08/10_23:10:17 INFO:  Success

ResourceManager(default)[7949]: 2016/08/10_23:10:18 info: Running /etc/init.d/mysqld  start

mach_down(default)[7922]:    2016/08/10_23:10:31 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired

Aug 10 23:10:32 test-backup heartbeat: [7738]: info: mach_down takeover complete.

mach_down(default)[7922]:    2016/08/10_23:10:33 info: mach_down takeover complete for node test-master.

^C

[root@test-backup ~]# ip addr

……

2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000

   link/ether 00:0c:29:15:e6:bb brd ff:ff:ff:ff:ff:ff

   inet 10.96.20.114/24 brd 10.96.20.255 scope global eth0

   inet 10.96.20.8/24 brd 10.96.20.255 scope global secondary eth0

   inet6 fe80::20c:29ff:fe15:e6bb/64 scope link

      valid_lft forever preferred_lft forever

……

[root@test-backup ~]# df -h

Filesystem      Size Used Avail Use% Mounted on

/dev/sda2        18G 4.7G   12G  29% /

tmpfs           112M     0 112M   0% /dev/shm

/dev/sda1       283M  83M  185M  31% /boot

/dev/sr0        3.6G 3.6G     0 100% /mnt/cdrom

/dev/drbd0      946M  31M  866M   4% /drbd

[root@test-backup ~]# service mysqld status

MySQL running (8772)                                     [  OK  ]

 

[root@localhost ~]# mysql   # (on the slave, check whether replication is still healthy)

Welcome to the MySQL monitor.  Commands end with ; or \g.

……

mysql> show slave status\G

*************************** 1. row ***************************

               Slave_IO_State: Waiting for master to send event

                  Master_Host: 10.96.20.8

                  Master_User: repluser

                  Master_Port: 3306

                Connect_Retry: 60

              Master_Log_File: mysql-bin.000005

          Read_Master_Log_Pos: 198

               Relay_Log_File: relay-log.000004

                Relay_Log_Pos: 344

        Relay_Master_Log_File: mysql-bin.000005

             Slave_IO_Running: Yes

            Slave_SQL_Running: Yes

              Replicate_Do_DB:

          Replicate_Ignore_DB:

           Replicate_Do_Table:

……

mysql> show databases;

+--------------------+

| Database           |

+--------------------+

| information_schema |

| mysql              |

| performance_schema |

| test               |

| webgame1           |

| webgame2           |

| webgame3           |

+--------------------+

7 rows in set (0.00 sec)
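The check done by eye above (Slave_IO_Running and Slave_SQL_Running both Yes after the takeover) is easy to script. A minimal sketch, assuming the `\G` output format shown in this article; `replication_healthy` is a hypothetical name.

```shell
#!/bin/sh
# Read `SHOW SLAVE STATUS\G` output on stdin and succeed only when both
# the I/O thread and the SQL thread report "Yes".
replication_healthy() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Demo on canned output; on the slave this would be fed by
#   mysql -e 'SHOW SLAVE STATUS\G'
if printf '  Slave_IO_Running: Yes\n Slave_SQL_Running: Yes\n' | replication_healthy; then
    echo "replication OK"
else
    echo "replication BROKEN"
fi
```

Empty input, which is what a host that is not configured as a slave returns, counts as unhealthy, so the function is safe to use in a monitoring loop during failover tests.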

 

 

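Common architectures for MySQL master-slave replication:

1. One master, one slave

[Figure 2: IV 12 MySQL+drbd+heartbeat]

Note: the HA software (keepalived or heartbeat) only needs to handle VIP failover;

this scheme is simple to deploy and easy to maintain;

after the master fails, traffic can switch to the slave automatically;

reads and writes both depend on the master, so it is under heavy load and prone to locking, deadlocks and the like;

the slave can also serve reads, but that has to be implemented in application code;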

 

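2. One master, multiple slaves

[Figure 3: IV 12 MySQL+drbd+heartbeat]

Note: the HA software (keepalived or heartbeat) can handle VIP failover only;

after the master fails, traffic can switch to slave1 automatically, but slave2 may then be unable to resynchronize with slave1 on its own; the remedy is to use the semi-sync mechanism;

read/write splitting is supported (the master handles writes, the slaves handle reads), but it has to be implemented in application code;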
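For reference, semi-sync in MySQL 5.5 is a pair of plugins that are loaded and switched on at runtime. The sketch below only prints the SQL for each role so it can be reviewed before being piped into `mysql`. `semisync_sql` is a hypothetical helper; the plugin and variable names are the ones shipped with MySQL 5.5.

```shell
#!/bin/sh
# Print the SQL that enables semi-synchronous replication for one role.
semisync_sql() {
    case "$1" in
        master)
            cat <<'EOF'
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
EOF
            ;;
        slave)
            cat <<'EOF'
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
EOF
            ;;
        *)
            echo "usage: semisync_sql master|slave" >&2
            return 1
            ;;
    esac
}

semisync_sql master
semisync_sql slave
```

The I/O thread restart on the slave is needed because the slave announces its semi-sync capability only when it connects. To survive a restart, the two `*_enabled` settings would also have to go into my.cnf.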

 

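3. Dual master

Note: HA software: keepalived+LVS or MMM;

once the two masters replicate to each other, they can be load-balanced, and the loss of either one does not affect service;

dual master has a serious drawback: it raises the chance of data inconsistency;

it brings little performance gain; it is a complex architecture with few benefits and is not recommended;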

 

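4. Dual master, multiple slaves:

[Figure 4: IV 12 MySQL+drbd+heartbeat]

Note: HA software: MMM or keepalived;

if one master goes down, service is not affected;

writing to both masters is possible but raises the chance of data inconsistency;

write to only one master at any given time;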

 

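5. Cascading replication

[Figure 5: IV 12 MySQL+drbd+heartbeat]

Note: the HA software (keepalived or heartbeat) can handle VIP failover only;

when master fails, service switches to master2, which keeps replicating to slave{1,2};

slave{1,2} support read/write splitting, but it has to be implemented in application code;

the slaves replicate in cascade, so there may be lag; if master2 fails, replication to the slaves stops;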

 

6. Dual master with DRBD

[Figure 6: IV 12 MySQL+drbd+heartbeat]

Note: while it acts as the standby node, passive-server is not visible to clients

 

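7.

[Figure 7: IV 12 MySQL+drbd+heartbeat]

-------------------------------------------------------------------------

[Figure 8: IV 12 MySQL+drbd+heartbeat]

Note: the HA software (heartbeat) handles VIP failover and also manages the DRBD and mysqld services;

when master fails, service switches to backup automatically, and slave{1,2} can still replicate from backup;

slave{1,2} support read/write splitting, but it has to be implemented in application code;

this scheme also supports the semi-sync mechanism;

backup is accessible only once it has been promoted to master; normally only one of master and backup serves traffic;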

 

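8. HA based on SAN storage, commonly used with Oracle and SQL Server

[Figure 9: IV 12 MySQL+drbd+heartbeat]

-----------------------------------------------------------------------

[Figure 10: IV 12 MySQL+drbd+heartbeat]

Note: HA software: Red Hat Cluster Suite;

the service depends on the SAN storage;

Backup is accessible only after Master has failed and Backup has taken over successfully;

slave{1,2} support read/write splitting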

 

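9.

[Figure 11: IV 12 MySQL+drbd+heartbeat]

Note: flexible deployment and high resource utilization;

the master handles writes, the slaves handle reads;

the service depends on DNS, which handles long-lived connections poorly;

a master failure affects the slaves;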

 

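10.

[Figure 12: IV 12 MySQL+drbd+heartbeat]

Note: available software: mysql-proxy or amoeba;

read/write splitting is transparent to the front-end applications, with health checks on the back end;

the open-source options are currently unstable;

a custom-developed DB proxy is needed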
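Whether it is done in application code or in a proxy, the core of read/write splitting is a routing decision per statement. The toy classifier below shows the idea and nothing more: `route_sql` is a made-up name, and a real proxy must also pin transactions, `SELECT ... FOR UPDATE` and reads that need fresh data to the master.

```shell
#!/bin/sh
# Decide where a statement should go: plain reads to a slave,
# everything else to the master. Only the first keyword is inspected.
route_sql() {
    verb=$(printf '%s\n' "$1" | awk '{ print toupper($1); exit }')
    case "$verb" in
        SELECT|SHOW|DESCRIBE|DESC|EXPLAIN) echo slave ;;
        *) echo master ;;
    esac
}

route_sql "SELECT * FROM webgame.players"
route_sql "INSERT INTO webgame.players VALUES (1)"
```

The table name `webgame.players` is invented for the demo; only the `webgame` database appears in this article.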

 

11. HA for a distributed database cluster

[Figure 13: IV 12 MySQL+drbd+heartbeat]

Note: DAL = data access layer

 

12.

[Figure 14: IV 12 MySQL+drbd+heartbeat]

Note: HA based on Galera;

Galera is a cluster system that implements multi-master synchronous replication on top of MySQL InnoDB. Features: true multi-master; read and write to any node; synchronous replication; no slave lag or integrity issues; no master-slave failover, no VIP; multi-threaded slave; automatic node provisioning;

 

13. The official MySQL Cluster HA solution

[Figure 15: IV 12 MySQL+drbd+heartbeat]

 

 

 

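Note:

Criteria for choosing a MySQL HA architecture:

+--------------------------------+--------------+--------+-------------------+
| Solution                       | Availability | Safety | Write performance |
+--------------------------------+--------------+--------+-------------------+
| MySQL replication              | 98%--99.9+%  | No     | Fair              |
| master-master with MMM manager | 99%          | No     | Fair              |
| heartbeat/SAN                  | 99.5%--99.9% | Yes    | Excellent         |
| heartbeat/drbd                 | 99.9%        | Yes    | Good              |
| NDB cluster                    | 99.999%      | Yes    | Excellent         |
+--------------------------------+--------------+--------+-------------------+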

Note: NDB cluster: very high, specific NDB knowledge, strong MySQL skills and strong sysadmin skills
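To make the availability figures concrete, a percentage can be converted into allowed downtime per year. This is plain arithmetic (365 days of 24 hours of 60 minutes), not something taken from the comparison itself, and `downtime_minutes_per_year` is a hypothetical helper.

```shell
#!/bin/sh
# Convert an availability percentage into minutes of downtime per year,
# using a 365-day year (525600 minutes).
downtime_minutes_per_year() {
    awk -v a="$1" 'BEGIN { printf "%.1f\n", (100 - a) / 100 * 365 * 24 * 60 }'
}

for a in 98 99 99.5 99.9 99.999; do
    printf '%s%% -> %s min/year\n' "$a" "$(downtime_minutes_per_year "$a")"
done
```

So the 99.9% quoted for heartbeat/drbd allows roughly 525.6 minutes (under nine hours) of downtime a year, while the 99.999% of NDB cluster allows about 5.3 minutes.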

 

 

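Current problems with MySQL:

Single-node performance (QPS (reads and writes), response time, data volume, IOPS, bottlenecks in read and write operations);

Master-slave data consistency (asynchronous replication, semi-sync replication, ordering + completeness);

Automated scale-out (data migration; scaling out by a fixed rule (hash modulo, range, date, combinations, etc.; horizontal and vertical splitting); capacity estimation and early warning (per-table capacity estimates (business assessment); buffer pool capacity and hit ratio; disk capacity); automated full + incremental scale-out (promote a slave to the new master; automatic or manual; notify the proxy layer when the scale-out is done, so that it is transparent to the front end));

Master as a single point of failure (master/standby strategy (the standby only replicates data and serves no online queries); data backfill (pull binlog files from the master to fill in missing data); failover (when the master goes down, switch to a new master while keeping data as consistent as possible (depends on the business); notify the proxy layer of the new master so that the switch is transparent to applications));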
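The hash-modulo rule mentioned under automated scale-out is a one-liner, and it also shows why scaling out implies data migration: when the shard count changes, most keys map to a different shard. A small sketch with made-up helper names and integer keys:

```shell
#!/bin/sh
# Shard for an integer key under hash-modulo placement.
# $1: non-negative integer key, $2: number of shards.
shard_for() {
    echo $(( $1 % $2 ))
}

# Count how many of the keys 1..$3 change shard when the shard count
# goes from $1 to $2.
count_moved() {
    moved=0
    k=1
    while [ "$k" -le "$3" ]; do
        if [ "$(shard_for "$k" "$1")" -ne "$(shard_for "$k" "$2")" ]; then
            moved=$((moved + 1))
        fi
        k=$((k + 1))
    done
    echo "$moved"
}

echo "key 10 on 4 shards -> shard $(shard_for 10 4)"
echo "keys 1..20 moved when going from 4 to 5 shards: $(count_moved 4 5 20)"
```

Going from 4 to 5 shards moves 16 of the keys 1..20, which is why the list above pairs the splitting rule with a full + incremental migration step.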

 

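Distributed database:

1. Product positioning (preserve database characteristics as far as possible while scaling up data volume; low-latency online access; support data operations with moderately complex relationships);

2. Design principles (implement the MySQL client communication protocol; logical data distribution is transparent to applications; automatic discovery / manual decision / automatic handling; support single-node transactions);

3. Design targets (data storage on the scale of hundreds of billions; response time under 10 ms; fully transparent to upper-layer applications);

[Figure 16: IV 12 MySQL+drbd+heartbeat]

Distributed database proxy layer (implements the MySQL client protocol; read/write splitting; LB, e.g. weighted round-robin across slaves; merging of query results; data sharding rules; concurrency control; SQL whitelist management; single-node transaction support (amoeba does not support transactions); server-side model);

Monitoring (liveness; master-slave lag; capacity (tables, disk); traffic (requests); hit ratio (buffer pool); collection and reporting of key metrics);

Web monitoring and alerting (an interface friendly to ops and DBAs; can trigger cluster management operations (manual scale-out, switching to a new master); alerts on abnormal monitoring data (email, SMS; the channel varies with severity));

Metadata service (stores the data sharding rules (configuration center); election service; implements the Fast Paxos protocol; atomic broadcast protocol for data; data notification service; lock service; application location service);

Failover service (when the master goes down, promote the standby or a slave to be the new master (check whether ssh is reachable, fetch the binlog to backfill data) while keeping data as consistent as possible; strategy for choosing the new master; once the new master is determined, notify the front-end proxy layer);

Data migration service (scale out based on monitoring data and threshold metrics; full + incremental; automatic cleanup of redundant data; automatic or manual migration)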
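As an illustration of the "weighted round-robin across slaves" item in the proxy layer, here is one simple variant: in every cycle each slave gets as many consecutive slots as its weight. The helper name and the `host:weight` notation are invented for this sketch. Real proxies often use a smoother interleaving.

```shell
#!/bin/sh
# Pick the backend for request number $1 (0-based).
# Remaining arguments are "host:weight" pairs with positive integer weights.
wrr_pick() {
    n=$1
    shift
    printf '%s\n' "$@" | awk -F: -v n="$n" '
        { host[NR] = $1; w[NR] = $2; total += $2 }
        END {
            if (total <= 0) exit 1
            slot = n % total
            for (i = 1; i <= NR; i++) {
                if (slot < w[i]) { print host[i]; exit 0 }
                slot -= w[i]
            }
        }'
}

# slave1 has twice the weight of slave2, so it receives two of every
# three read requests.
i=0
while [ "$i" -lt 6 ]; do
    echo "request $i -> $(wrr_pick "$i" slave1:2 slave2:1)"
    i=$((i + 1))
done
```

Combined with a health check that drops a lagging or dead slave from the argument list, this is enough to spread reads across the slaves according to their capacity.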