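Preface
This article covers three technologies: Heartbeat, DRBD, and MySQL.
About Heartbeat
See the official site http://linux-ha.org/wiki/Main_Page or blog.51cto.com/lzhnb
About DRBD
See the official site http://www.drbd.org/ or blog.51cto.com/lzhnb
About MySQL
See the official site http://www.mysql.com/ or blog.51cto.com/lzhnb
Chapter 1 System Environment and Architecture
1.2 System environment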
[root@MySQL-Master01 ~]# cat /etc/redhat-release
CentOS release 6.9 (Final)
[root@MySQL-Master01 ~]# uname -r
2.6.32-696.el6.x86_64
[root@MySQL-Master01 ~]# uname -m
x86_64
[root@MySQL-Master01 ~]# /etc/init.d/iptables stop   ==> stop the firewall
[root@MySQL-Master01 ~]# sed -i "s#SELINUX=enforcing#SELINUX=disabled#g" /etc/selinux/config
[root@MySQL-Master01 ~]# grep "SELINUX=disabled" /etc/selinux/config
[root@MySQL-Master01 ~]# setenforce 0
[root@MySQL-Master01 ~]# getenforce
[root@MySQL-Master01 ~]# echo '#time sync by liuzhonghe at 2018-1-15' >>/var/spool/cron/root   ==> set up time synchronization
[root@MySQL-Master01 ~]# echo '*/5 * * * * /usr/sbin/ntpdate ntp1.aliyun.com >/dev/null 2>&1' >>/var/spool/cron/root
[root@MySQL-Master01 ~]# crontab -l
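SELinux is changed in two places above: setenforce 0 affects the running system, while the edit to /etc/selinux/config only applies after the next reboot. The sketch below is one possible way to read back the persisted mode so it can be compared with getenforce. The function name selinux_configured_mode is hypothetical and not part of any package.

```shell
# Hypothetical helper: print the SELINUX= value stored in a SELinux config file.
# Usage: selinux_configured_mode [path]   (path defaults to /etc/selinux/config)
selinux_configured_mode() (
    conf="${1:-/etc/selinux/config}"
    # Match only uncommented "SELINUX=" lines (not SELINUXTYPE=) and keep the last one.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*$/\1/p' "$conf" | tail -n 1
)
```

On a node prepared as described, running selinux_configured_mode with no argument would be expected to print disabled.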
1.3 Software environment
Heartbeat | heartbeat-3.0.4-2.el6.x86_64
DRBD | drbd83-utils-8.3.16
MySQL | mysql-5.5.49
1.4 Server and directory planning
1.4.1 Server roles, IP addresses, and hostnames
No. | Role | IP | Hostname
1 | MySQL master node 1 | 172.16.1.51/24 (internal); 172.16.4.2/24 (heartbeat); 172.168.4.2/24 (DRBD data transfer) | MySQL-Master01
2 | MySQL master node 2 | 172.16.1.52/24 (internal); 172.16.4.3/24 (heartbeat); 172.168.4.3/24 (DRBD data transfer) | MySQL-Master02
3 | MySQL slave node 1 | 172.16.1.71/24 | MySQL-Slave01
4 | VIP | 172.16.1.53/24 (internal, serves clients) |
Note: the slave replicates from the master through the VIP.
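1.4.2 Directory planning
Directory | Location | Purpose
/server/scripts | all servers | scripts
/application/tools | all servers | software packages
/application | all servers | installation path for compiled software
/data | all servers | database data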
Chapter 2 Installation and Deployment
2.1 Heartbeat deployment
2.1.1 Configure the heartbeat routes between the master nodes
###################################### Primary node ###################################
[root@MySQL-Master01 ~]# route add -host 172.16.4.3 dev eth2   ==> heartbeat route to the peer
[root@MySQL-Master01 ~]# route add -host 172.168.4.3 dev eth3   ==> DRBD data route
###################################### Standby node ###################################
[root@MySQL-Master02 ~]# route add -host 172.16.4.2 dev eth2   ==> heartbeat route to the peer
[root@MySQL-Master02 ~]# route add -host 172.168.4.2 dev eth3   ==> DRBD data route
2.1.2 Install Heartbeat (on both nodes)
[root@MySQL-Master01 ~]# yum install -y heartbeat
[root@MySQL-Master02 ~]# yum install -y heartbeat
2.1.3 Configure Heartbeat (the configuration files are identical on both nodes)
2.1.3.1 /etc/ha.d/ha.cf
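[root@MySQL-Master01 ~]# cat /etc/ha.d/ha.cf
#log configure
debugfile /var/log/ha-debug   ==> file for Heartbeat debug messages
logfile /var/log/ha-log   ==> file for log messages
logfacility local1   ==> syslog facility (local1) that receives the log messages
#options configure
keepalive 2   ==> interval between heartbeats; the default unit is seconds
deadtime 30   ==> declare the peer dead if no heartbeat arrives within this time
warntime 10   ==> log a warning if no heartbeat arrives within this time
initdead 120   ==> time allowed for the network to come up after a reboot or service recovery; at least twice deadtime
mcast eth2 225.0.0.7 694 1 0   ==> multicast heartbeat on eth2: group 225.0.0.7, UDP port 694 (the default port), TTL 1, loop 0
auto_failback on   ==> move the resources back automatically once the primary node recovers
node MySQL-Master01   ==> hostname of the primary node (must match the output of uname -n)
node MySQL-Master02   ==> hostname of the standby node (must match the output of uname -n)
crm no   ==> whether to enable the cluster resource manager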
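The timing directives depend on each other: initdead should be at least twice deadtime, and a warning is only useful if warntime is shorter than deadtime. The following is a minimal sketch of a sanity check for those two rules. It assumes the values are plain integers in seconds (Heartbeat also accepts suffixes such as ms, which this sketch does not handle); ha_value and check_ha_timing are hypothetical helper names.

```shell
# Hypothetical helper: print the first value of a directive in an ha.cf-style file.
ha_value() (
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
)

# Exit 0 if warntime < deadtime and initdead >= 2 * deadtime,
# 1 if a rule is violated, 2 if a directive is missing.
# Assumption: all three values are integers in seconds.
check_ha_timing() (
    conf="${1:-/etc/ha.d/ha.cf}"
    dead=$(ha_value deadtime "$conf")
    warn=$(ha_value warntime "$conf")
    init=$(ha_value initdead "$conf")
    if [ -z "$dead" ] || [ -z "$warn" ] || [ -z "$init" ]; then
        echo "deadtime, warntime and initdead must all be set" >&2
        exit 2
    fi
    status=0
    if [ "$warn" -ge "$dead" ]; then
        echo "warntime ($warn) should be smaller than deadtime ($dead)" >&2
        status=1
    fi
    if [ "$init" -lt $((dead * 2)) ]; then
        echo "initdead ($init) should be at least twice deadtime ($dead)" >&2
        status=1
    fi
    exit "$status"
)
```

Running check_ha_timing /etc/ha.d/ha.cf on both nodes before starting the service is one way to catch a mistyped value early; the values shown in this article (30, 10, 120) pass both rules.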
2.1.3.2 /etc/ha.d/haresources
[root@MySQL-Master01 ~]# cat /etc/ha.d/haresources
MySQL-Master01 IPaddr::172.16.1.53/24/eth1
#MySQL-Master01 IPaddr::172.16.1.53/24/eth1 drbddisk::data Filesystem::/dev/drbd1::/data::ext4 mysqld
Notes:
drbddisk::data <== starts the DRBD resource "data"; equivalent to running /etc/ha.d/resource.d/drbddisk data stop/start
Filesystem::/dev/drbd1::/data::ext4 <== mounts the DRBD device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 stop/start, which in turn corresponds to mount /dev/drbd1 /data
mysqld <== the MySQL service script; equivalent to /etc/init.d/mysqld stop/start
2.1.3.3 /etc/ha.d/authkeys
[root@MySQL-Master01 ~]# cat /etc/ha.d/authkeys
auth 1
1 sha1 liucdlzh
[root@MySQL-Master01 ~]# chmod 600 /etc/ha.d/authkeys
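The authkeys file holds a shared secret, so a short hand-typed key such as the one above is easy to guess. As a sketch only, the function below prints authkeys content in the same two-line format with a random 64-character hex secret read from /dev/urandom. The name gen_authkeys is hypothetical, and the same generated file has to be copied to both nodes.

```shell
# Hypothetical helper: print authkeys content (key index 1, sha1) with a random secret.
gen_authkeys() (
    # 32 random bytes rendered as 64 lowercase hex characters.
    secret=$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')
    printf 'auth 1\n1 sha1 %s\n' "$secret"
)
```

One possible use is ( umask 077; gen_authkeys > /etc/ha.d/authkeys ), which creates the file with mode 600 directly, followed by copying it to the peer node.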
2.1.4 Start heartbeat (on both nodes)
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master01 ~]# chkconfig heartbeat off
[root@MySQL-Master02 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master02 ~]# chkconfig heartbeat off
Note: autostart is disabled on purpose; after a server reboot the service has to be started manually.
2.1.5 Test heartbeat
2.1.5.1 Normal state
[root@MySQL-Master01 ~]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.51/24 brd 172.16.1.255 scope global eth1
    inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Master02 ~]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
2.1.5.2 Simulate a primary node failure
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done
[root@MySQL-Master02 ~]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
    inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
2.1.5.3 Simulate the primary node recovering
[root@MySQL-Master01 ~]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.51/24 brd 172.16.1.255 scope global eth1
    inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Master02 mysql]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
2.2 DRBD deployment (perform the same steps on both nodes)
2.2.1 Add a new disk
[root@MySQL-Master01 ~]# fdisk -l /dev/sdb
Disk /dev/sdb: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xe21fa70d
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         654     5253223+  83  Linux
/dev/sdb2             655        1305     5229157+  83  Linux
Steps:
[root@MySQL-Master01 ~]# fdisk /dev/sdb   ==> create two partitions
[root@MySQL-Master01 ~]# mkfs.ext4 /dev/sdb1   ==> format the data partition
[root@MySQL-Master01 ~]# tune2fs -c -1 /dev/sdb1   ==> set the maximum mount count to -1 (no fsck forced by mount count)
Note: do not format /dev/sdb2; it is used as the DRBD meta-data partition.
2.2.2 Install DRBD
[root@MySQL-Master01 ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-6-8.el6.elrepo.noarch.rpm
[root@MySQL-Master01 ~]# yum install -y kmod-drbd83 drbd83-utils
[root@MySQL-Master01 ~]# modprobe drbd
Note: do not add echo "modprobe drbd" >>/etc/rc.local to load the module at boot. With that setup the drbd service starts before the module is loaded, and DRBD fails to start.
2.2.3 Configure DRBD
[root@MySQL-Master01 ~]# cd /etc/drbd.d/
[root@MySQL-Master01 drbd.d]# cp global_common.conf{,.bak}
[root@MySQL-Master01 drbd.d]# cat global_common.conf
global {
    usage-count no;   ==> do not report usage statistics to LINBIT (default: yes)
}
common {
    protocol C;   ==> replication protocol; C is fully synchronous
    disk {   ==> fine-tune the properties of DRBD's backing storage
        on-io-error detach;   ==> on an I/O error, detach the backing disk
        no-disk-flushes;
        no-md-flushes;
    }
    net {   ==> fine-tune the network properties
        sndbuf-size 512k;   ==> TCP send buffer size; 0 means auto-tune; default 128k, maximum 2M
        max-buffers 8000;   ==> maximum number of buffers (requests) DRBD allocates
        unplug-watermark 1024;
        max-epoch-size 8000;
        cram-hmac-alg "sha1";   ==> peer authentication algorithm
        shared-secret "liucdlzh";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        rate 120M;   ==> synchronization rate
        al-extents 517;
    }
}
2.2.4 Configure the DRBD resource
[root@MySQL-Master01 drbd.d]# cat r0.res
resource data {
    on MySQL-Master01 {   ==> primary node
        device /dev/drbd1;
        disk /dev/sdb1;
        address 172.16.1.51:7788;
        meta-disk /dev/sdb2 [0];
    }
    on MySQL-Master02 {   ==> standby node
        device /dev/drbd1;
        disk /dev/sdb1;
        address 172.16.1.52:7788;
        meta-disk /dev/sdb2 [0];
    }
}
2.2.5 Initialize the device meta-data and start DRBD
[root@MySQL-Master01 ~]# drbdadm create-md data
[root@MySQL-Master01 ~]# /etc/init.d/drbd start
2.2.6 Start the initial synchronization and mount the device
[root@MySQL-Master01 ~]# drbdadm -- --overwrite-data-of-peer primary data   ==> run on the primary node only
[root@MySQL-Master01 ~]# drbdadm primary all
[root@MySQL-Master01 ~]# mount /dev/drbd1 /data/
[root@MySQL-Master01 ~]# cat /proc/drbd   ==> check the state
2.2.7 Test DRBD
2.2.7.1 Normal state
[root@MySQL-Master01 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:2172 nr:2804 dw:4976 dr:35859 al:11 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@MySQL-Master02 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:2804 nr:2196 dw:5000 dr:30544 al:10 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
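In the /proc/drbd output, the three fields worth watching are cs (connection state), ro (local/peer roles), and ds (local/peer disk states). The sketch below extracts them for one minor device so that a script can test for the healthy combination seen above. It assumes the DRBD 8.3 text layout shown in this section; drbd_state and drbd_healthy are hypothetical names.

```shell
# Hypothetical helper: print "cs ro ds" for a DRBD minor from /proc/drbd-style text.
# Usage: drbd_state MINOR [file]   (file defaults to /proc/drbd)
drbd_state() (
    minor="$1"
    file="${2:-/proc/drbd}"
    awk -v m="$minor:" '
        $1 == m {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, ro, ds
            exit
        }' "$file"
)

# Exit 0 only if the minor is Connected and both disks are UpToDate.
drbd_healthy() (
    set -- $(drbd_state "$1" "${2:-/proc/drbd}")
    [ "$1" = "Connected" ] && [ "$3" = "UpToDate/UpToDate" ]
)
```

For the resource in this article the call would be drbd_healthy 1, since the device is /dev/drbd1. A cron job or monitoring check could alert whenever it returns non-zero.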
2.2.7.2 Simulate a DRBD failure
[root@MySQL-Master01 ~]# umount /dev/drbd1
[root@MySQL-Master01 ~]# /etc/init.d/drbd stop
[root@MySQL-Master02 ~]# drbdadm primary all
[root@MySQL-Master02 ~]# mount /dev/drbd1 /data/
[root@MySQL-Master02 ~]# cat /proc/drbd
version: 8.3.16 (api:88/proto:86-97)
GIT-hash: a798fa7e274428a357657fb52f0ecf40192c1985 build by phil@Build64R6, 2014-11-24 14:51:37
 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:2172 nr:2804 dw:4976 dr:35859 al:11 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@MySQL-Master02 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       8.8G  2.8G  5.7G  33% /
tmpfs           491M     0  491M   0% /dev/shm
/dev/sda1       190M   35M  146M  20% /boot
/dev/drbd1      4.9G   40M  4.6G   1% /data
2.3 MySQL deployment
[Prerequisites]
- MySQL must be installed on all three database servers
- MySQL-Master02 does not need to initialize the database
- The mysqld service must not be enabled at boot
2.3.1 Installation
Install MySQL
#### Create the mysql user
[root@MySQL-Master01 ~]# useradd mysql -s /sbin/nologin -M
#### Unpack and install MySQL
[root@MySQL-Master01 ~]# cd /home/oldboy/tools/
[root@MySQL-Master01 ~]# rz
[root@MySQL-Master01 ~]# tar xf mysql-5.5.49-linux2.6-x86_64.tar.gz
[root@MySQL-Master01 ~]# mv mysql-5.5.49-linux2.6-x86_64 /application/mysql-5.5.49/
[root@MySQL-Master01 ~]# ln -s /application/mysql-5.5.49/ /application/mysql
[root@MySQL-Master01 ~]# ll /application/mysql
#### Initialize the database (skip this step on the standby node)
[root@MySQL-Master01 ~]# /application/mysql/scripts/mysql_install_db --basedir=/application/mysql --datadir=/application/mysql/data/ --user=mysql
#### Set ownership and install the configuration files
[root@MySQL-Master01 ~]# chown -R mysql.mysql /application/mysql/
[root@MySQL-Master01 ~]# cp /application/mysql/support-files/my-small.cnf /etc/my.cnf
[root@MySQL-Master01 ~]# cp /application/mysql/support-files/mysql.server /etc/init.d/mysqld
[root@MySQL-Master01 ~]# chmod +x /etc/init.d/mysqld
[root@MySQL-Master01 ~]# sed -i 's#/usr/local/mysql#/application/mysql#g' /application/mysql/bin/mysqld_safe /etc/init.d/mysqld
[root@MySQL-Master01 ~]# /etc/init.d/mysqld start
#### Make the MySQL binaries available on the PATH
[root@MySQL-Master01 ~]# cp -a /application/mysql/bin/* /usr/local/sbin/
#### Set the root password (skip this step on the standby node)
[root@MySQL-Master01 ~]# mysqladmin -uroot password '123456'
2.3.2 Configure the slave to replicate through the VIP
2.3.2.1 Master configuration
1. Enable the binlog and set server-id
[root@MySQL-Master01 ~]# cat /etc/my.cnf   ==> add the following two lines to this file
log-bin = /application/mysql/mysql-bin
server-id = 3
[root@MySQL-Master01 ~]# /etc/init.d/mysqld restart   ==> no restart needed on the standby node
2. Create the replication account and grant it privileges
[root@MySQL-Master01 ~]# mysql -uroot -p
mysql> grant replication slave on *.* to 'rep'@'172.16.1.%' identified by '123456';
2.3.2.2 Slave configuration
1. Set server-id
[root@MySQL-Slave02 ~]# cat /etc/my.cnf
server-id = 4
2. Configure the replication parameters
[root@MySQL-Slave02 ~]# mysql -uroot -p
mysql> change master to
master_host='172.16.1.53',
master_port=3306,
master_user='rep',
master_password='123456',
master_log_file='mysql-bin.000001',   ==> taken from show master status; on the master
master_log_pos=257;   ==> taken from show master status; on the master
mysql> start slave;   ==> start the replication threads
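Copying the binlog file name and position by hand from show master status is error-prone. As a rough sketch, the function below reads the first row of that output (as printed by mysql -N -B -e "show master status", one whitespace-separated row) on stdin and prints the matching CHANGE MASTER TO statement. The name build_change_master and its argument order are assumptions made for this illustration, and the port is fixed at 3306 as in the example above.

```shell
# Hypothetical helper: build a CHANGE MASTER TO statement.
# stdin: first row of `show master status` in batch mode (File, Position, ...).
# Usage: build_change_master HOST USER PASSWORD
build_change_master() (
    host="$1"; user="$2"; pass="$3"
    # File and position are the first two whitespace-separated columns.
    read -r file pos rest || [ -n "$file" ] || exit 1
    if [ -z "$file" ] || [ -z "$pos" ]; then
        echo "no binlog coordinates on stdin (is log-bin enabled?)" >&2
        exit 1
    fi
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=3306, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$host" "$user" "$pass" "$file" "$pos"
)
```

A possible invocation from the slave would pipe the master's status in, for example mysql -h 172.16.1.53 -uroot -p -N -B -e "show master status" | build_change_master 172.16.1.53 rep 123456, and then paste the printed statement into the slave's mysql prompt. This assumes the root account is allowed to connect remotely, which the article does not configure.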
2.3.3 Verify replication
[root@MySQL-Slave02 ~]# mysql -uroot -p
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
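Replication is healthy only when both Slave_IO_Running and Slave_SQL_Running report Yes. A minimal sketch of turning that rule into an exit code follows; it reads the text of show slave status\G on stdin. The function name slave_threads_ok is hypothetical.

```shell
# Hypothetical helper: exit 0 only if both replication threads report "Yes".
# stdin: output of `show slave status\G`.
slave_threads_ok() (
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }              # drop the \G indentation
        $1 == "Slave_IO_Running"  { io = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
)
```

It could be used as mysql -uroot -p123456 -e "show slave status\G" | slave_threads_ok || echo "replication broken", for instance inside a monitoring script.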
2.4 Test high availability
2.4.1 Normal state
[root@MySQL-Master01 ~]# mysql -uroot -p
mysql> create database lzh;
Query OK, 1 row affected (0.02 sec)
[root@MySQL-Slave02 ~]# mysql -uroot -p
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@MySQL-Slave02 ~]# mysql -uroot -e "show databases like 'lzh';" -p123456
+----------------+
| Database (lzh) |
+----------------+
| lzh            |
+----------------+
2.4.2 Simulate a failure of the HA primary node
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@MySQL-Master02 ~]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.52/24 brd 172.16.1.255 scope global eth1
    inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Slave02 mysql]# mysql -uroot -p123456 -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[root@MySQL-Master02 ~]# mysql -uroot -p123456 -e "create database oldboy;"
[root@MySQL-Slave02 mysql]# mysql -uroot -p123456 -e "show databases like 'old%';"
+-----------------+
| Database (old%) |
+-----------------+
| oldboy          |
+-----------------+
2.4.3 Simulate the HA primary node recovering
[root@MySQL-Master01 ~]# /etc/init.d/heartbeat start
[root@MySQL-Master01 ~]# ip addr |grep eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.16.1.51/24 brd 172.16.1.255 scope global eth1
    inet 172.16.1.53/24 brd 172.16.1.255 scope global secondary eth1
[root@MySQL-Slave02 mysql]# mysql -uroot -p123456 -e "show slave status\G"|egrep "Slave_IO_Running|Slave_SQL_Running"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
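Chapter 3 Split-Brain in HA Clusters and How to Prevent It
3.1 Causes of split-brain
1. The heartbeat link between the HA servers fails, so the nodes can no longer see each other's heartbeats
2. A firewall is enabled between the HA servers and blocks the heartbeat traffic
3. A NIC address on an HA server is misconfigured, so heartbeats cannot be sent
4. Software bugs, misconfigured services, and similar problems
3.2 Preventing split-brain
1. Add redundant heartbeat links
2. Set up monitoring and alerting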
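One symptom that monitoring can watch for is the VIP being configured on both master nodes at the same time. The sketch below shows one way to detect that from saved ip addr output. It assumes a monitoring host that has collected one dump file per node (for example over ssh); has_vip and vip_holder_count are hypothetical names, and this check detects a split-brain after the fact rather than preventing it.

```shell
# Hypothetical helper: exit 0 if the address appears as an inet entry
# in `ip addr` output read from stdin.
has_vip() (
    awk -v vip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
)

# Hypothetical helper: count how many `ip addr` dump files contain the VIP.
# Usage: vip_holder_count VIP dump_file...
vip_holder_count() (
    vip="$1"; shift
    n=0
    for dump in "$@"; do
        if has_vip "$vip" < "$dump"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
)
```

With dumps from MySQL-Master01 and MySQL-Master02, a count of 1 for 172.16.1.53 is the normal state, 0 means the service address is down, and 2 suggests a split-brain that should raise an alert.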