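This article is reposted from http://shanhu.blog.51cto.com/1293405/1212605
Introduction to Heartbeat
Official site: http://linux-ha.org/wiki/Main_Page
Heartbeat can quickly move resources (the VIP address and application services) from a failed server to a healthy one, which then continues to provide service. Heartbeat is similar to keepalived: it provides failover, but it cannot perform health checks on backend servers.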
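Introduction to DRBD
Official site: http://www.drbd.org/
DRBD (Distributed Replicated Block Device) is software that synchronizes and mirrors data directly between remote servers at the block-device level. It is a software-based, shared-nothing storage replication solution that mirrors block-device contents between servers. It provides real-time mirroring between two servers on a network, using either synchronous replication (a write completes only after both servers have written it) or asynchronous replication (a write completes once the local server has written it), which makes it comparable to RAID 1 over the network. Because it works on block devices (disks, LVM logical volumes) below the file system, data replication is faster than copying with the cp command.
DRBD is documented in the official MySQL manual as one of the recommended high-availability solutions.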
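Introduction to MySQL
Official site: http://www.mysql.com/
MySQL is a small, open-source relational database management system. It is widely used by small and medium-sized websites on the Internet. Because it is small, fast, and has a low total cost of ownership, and above all because it is open source, many such sites choose MySQL as their database to keep costs down.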
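Use cases for heartbeat and keepalived, and how they differ
Many readers ask why we use heartbeat, which has not been updated for a long time, instead of keepalived. Their use cases and differences are described below.
1. For web, db, and load balancing (lvs, haproxy, nginx), either heartbeat or keepalived can be used.
2. lvs is best combined with keepalived, because keepalived was originally created for lvs. (heartbeat has no built-in health check for real servers (RS); it can only perform health checks through ldirectord.)
3. MySQL dual-master with multiple slaves and NFS/MFS storage all require data synchronization. For these workloads heartbeat is the better choice, because it ships with a drbd script.
Summary: for high availability of applications without data synchronization, choose keepalived; for applications with data synchronization, choose heartbeat.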
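I. Environment
OS: CentOS 6.4 x64, minimal install
ha-node1: 192.168.3.71
ha-node2: 192.168.3.72
vip: 192.168.3.73
mysql: 192.168.3.74 (the replication slave added in section VII)
Requirements:
1. When the master ha-node1 goes down, ha-node2 automatically takes over the VIP and all slaves.
2. The takeover by ha-node2 must not interrupt master-slave replication on the slaves.
Disk        Size   Mount point   Description
/dev/sdb1   8G     /data         Data storage
/dev/sdb2   2G     -             DRBD synchronization state (meta data)
Notes
1. The meta data partition must not be formatted with a file system (sdb2 stores the DRBD synchronization state).
2. Do not mount the partitions after creating them.
3. In production, the DRBD meta data partition is usually 1-2G; give the data partition as much space as the workload needs.
4. In production, the disks on the two nodes should be the same size.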
II. Basic configuration
The steps are the same on node1 and node2.
# Stop iptables and disable SELinux
[root@ha-node1 ~]# service iptables stop
[root@ha-node1 ~]# getenforce
Disabled    # make sure the result is Disabled
[root@ha-node1 ~]# wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
[root@ha-node1 ~]# rpm -ivh epel-release-6-8.noarch.rpm
warning: epel-release-6-8.noarch.rpm: Header V3 RSA/SHA256 Signature, key ID 0608b895: NOKEY
Preparing... ########################################### [100%]
1:epel-release ########################################### [100%]
[root@ha-node1 ~]# rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-6
[root@ha-node1 ~]# rpm -ivh http://elrepo.org/elrepo-release-6-5.el6.elrepo.noarch.rpm
Retrieving http://elrepo.org/elrepo-release-6-5.el6.elrepo.noarch.rpm
warning: /var/tmp/rpm-tmp.zglfTu: Header V4 DSA/SHA1 Signature, key ID baadae52: NOKEY
Preparing... ########################################### [100%]
1:elrepo-release ########################################### [100%]
# Configure local host name resolution
[root@ha-node1 ~]# echo "192.168.3.71 ha-node1" >>/etc/hosts
[root@ha-node1 ~]# echo "192.168.3.72 ha-node2" >>/etc/hosts
[root@ha-node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.3.71 ha-node1
192.168.3.72 ha-node2
# Configure SSH trust between the nodes; only the steps on ha-node1 are shown
[root@ha-node1 ~]# ssh-keygen
[root@ha-node1 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@ha-node2
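The `echo ... >>/etc/hosts` commands above append a duplicate line every time they are re-run. The following is a minimal sketch of an idempotent variant. The function name `add_host_entry` and its optional file argument are hypothetical and were introduced here only for illustration; they are not part of the original procedure.

```shell
#!/bin/bash
# add_host_entry IP NAME [FILE]
# Appends "IP NAME" to FILE (default: /etc/hosts) only if that exact
# line is not already present. Hypothetical helper for illustration.
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! grep -qxF "$ip $name" "$file" 2>/dev/null; then
        echo "$ip $name" >> "$file"
    fi
}

# Example (run as root on both nodes):
#   add_host_entry 192.168.3.71 ha-node1
#   add_host_entry 192.168.3.72 ha-node2
```

Because the check matches the whole line, running the function twice with the same arguments leaves the file unchanged the second time.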
III. Install and configure heartbeat
(1). Install heartbeat
# Run the installation on both ha-node1 and ha-node2
[root@ha-node1 ~]# yum install heartbeat -y
(2). Configure ha.cf
[root@ha-node1 ~]# cd /usr/share/doc/heartbeat-3.0.4/
[root@ha-node1 heartbeat-3.0.4]# cp authkeys ha.cf haresources /etc/ha.d/
[root@ha-node1 ~]# grep -v "^#" /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local1
keepalive 2
deadtime 30
warntime 10
initdead 120
mcast eth0 225.0.10.1 694 1 0 # multicast
auto_failback on
node ha-node1 # primary node
node ha-node2 # standby node
crm no
(3). Configure authkeys
[root@ha-node1 ~]# dd if=/dev/random bs=512 count=1 | openssl md5
0+1 records in
0+1 records out
19 bytes (19 B) copied, 4.6784e-05 s, 406 kB/s
(stdin)= 317fa30c3a96a0e54b8018d3e0f3c04a
[root@ha-node1 ~]# grep -v ^# /etc/ha.d/authkeys
auth 1
1 md5 317fa30c3a96a0e54b8018d3e0f3c04a # use md5 authentication
# Set the permissions of the authentication file to 600
[root@ha-node1 ~]# chmod 600 /etc/ha.d/authkeys
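The transcript above shows `dd` returning only 19 bytes from /dev/random before the result is hashed. As a rough sketch, the same two steps (produce a random key, write a two-line authkeys file with mode 600) could be scripted as below. Reading from /dev/urandom instead of /dev/random is a substitution made here, and the function names `gen_hex_key` and `write_authkeys` are hypothetical.

```shell
#!/bin/bash
# gen_hex_key: print 32 hex characters built from 16 random bytes.
# Assumption: /dev/urandom is used here instead of /dev/random.
gen_hex_key() {
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# write_authkeys FILE KEY
# Writes the two-line authkeys format shown in the article
# ("auth 1" / "1 md5 KEY") and restricts the file to mode 600.
write_authkeys() {
    local file="$1" key="$2"
    # The subshell keeps the restrictive umask from leaking to the caller
    ( umask 077; printf 'auth 1\n1 md5 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}

# Example (run as root, then copy the file to the other node):
#   write_authkeys /etc/ha.d/authkeys "$(gen_hex_key)"
```

Both nodes must end up with the same key, so the file is generated once and copied rather than generated on each node.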
(4). Configure haresources
[root@ha-node1 ~]# grep -v ^# /etc/ha.d/haresources
ha-node1 IPaddr::192.168.3.73/24/eth0 # for now, only one VIP resource is configured
(5). Start heartbeat
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@ha-node1 ~]# chkconfig heartbeat off
Note: autostart at boot is disabled, so after a server reboot heartbeat must be started manually.
# Copy the configuration files from ha-node1 to ha-node2 and start the service there
[root@ha-node1 ha.d]# scp authkeys ha.cf haresources ha-node2:/etc/ha.d/
# Check the result
[root@ha-node1 ~]# ip a |grep 192.168.3.73 # the VIP is on the primary node
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
[root@ha-node2 ~]# ip a |grep 192.168.3.73 # the standby node does not have the VIP
(6). Test heartbeat
Normal state
# IP information on ha-node1
[root@ha-node1 ~]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
# IP information on ha-node2
[root@ha-node2 ~]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.72/24 brd 192.168.3.255 scope global eth0
State after a simulated failure of the primary node
# Stop the heartbeat service on the primary node ha-node1
[root@ha-node1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@ha-node1 ~]# ip a |grep eth0 # after heartbeat stops on the primary node, the VIP resource is taken away
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
# Check the resources on the standby node ha-node2
[root@ha-node2 ~]# ip a |grep eth0 # the standby node has taken over the resource automatically
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.72/24 brd 192.168.3.255 scope global eth0
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
Restore the heartbeat service on the primary node
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
# After heartbeat is restored on the primary node, it takes the resource back
[root@ha-node1 ~]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
# Check the standby node
[root@ha-node2 ~]# ip a |grep eth0 # the VIP resource has been removed
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.72/24 brd 192.168.3.255 scope global eth0
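The checks above read the `ip a` output by eye. A small sketch of the same check as a reusable test is shown below. The function name `has_vip` is hypothetical; it reads `ip a` output on standard input so that it can also be tried against saved output.

```shell
#!/bin/bash
# has_vip VIP
# Reads "ip a" output on stdin and succeeds (exit 0) if an
# "inet VIP/<prefix>" line is present. Hypothetical helper.
has_vip() {
    local vip="$1"
    # -F: treat the dots in the address literally. The trailing slash
    # prevents 192.168.3.7 from matching 192.168.3.73.
    grep -qF "inet ${vip}/"
}

# Example:
#   if ip a | has_vip 192.168.3.73; then
#       echo "this node holds the VIP"
#   fi
```

Running it on both nodes after each stop or start of heartbeat shows which node currently holds 192.168.3.73.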
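IV. Install and deploy DRBD
(1). Partition the disk; the steps are the same on ha-node1 and ha-node2
[root@ha-node1 ~]# fdisk /dev/sdb
# Note: /dev/sdb is split into two partitions, /dev/sdb1 and /dev/sdb2, with /dev/sdb1=8G
[root@ha-node1 ~]# partprobe /dev/sdb
# Format the data partition
[root@ha-node1 ~]# mkfs.ext4 /dev/sdb1
Note: sdb2 is the meta data partition and must not be formatted.
[root@ha-node1 ~]# tune2fs -c -1 /dev/sdb1
Note: setting the maximum mount count to -1 disables the forced file system check based on mount count.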
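(2). Install DRBD
Because our system is CentOS 6.4, we also need to install the kernel packages, and their version must match the output of uname -r. We extracted the packages from the system installation media (steps omitted). The installation is the same on ha-node1 and ha-node2; only ha-node1 is shown here.
# Install the kernel packages
[root@ha-node1 ~]# rpm -ivh kernel-devel-2.6.32-358.el6.x86_64.rpm kernel-headers-2.6.32-358.el6.x86_64.rpm
[root@ha-node1 ~]# yum install drbd84 kmod-drbd84 -y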
(3). Configure DRBD
a. Edit the global configuration file
[root@ha-node1 ~]# egrep -v "^$|^#|^[[:space:]]+#" /etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    protocol C;
    handlers {
    }
    startup {
    }
    options {
    }
    disk {
        on-io-error detach;
        no-disk-flushes;
        no-md-flushes;
        rate 200M;
    }
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        cram-hmac-alg "sha1";
        shared-secret "weyee2014";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
}
b. Add the resource
[root@ha-node1 ~]# cat /etc/drbd.d/mysqldata.res
resource mysqldata {
    on ha-node1 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.3.71:7789;
        meta-disk /dev/sdb2 [0];
    }
    on ha-node2 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.3.72:7789;
        meta-disk /dev/sdb2 [0];
    }
}
c. Copy the configuration files to ha-node2, load the drbd module, and initialize the meta data
[root@ha-node1 drbd.d]# scp global_common.conf mysqldata.res ha-node2:/etc/drbd.d/
[root@ha-node1 ~]# depmod
[root@ha-node1 ~]# modprobe drbd
[root@ha-node1 ~]# lsmod |grep drbd
drbd 365931 0
libcrc32c 1246 1 drbd
# Initialize the meta data on ha-node1
[root@ha-node1 ~]# drbdadm create-md mysqldata
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
# Load the module and initialize the meta data on ha-node2
[root@ha-node2 ~]# depmod
[root@ha-node2 ~]# modprobe drbd
[root@ha-node2 ~]# lsmod |grep drbd
drbd 365931 0
libcrc32c 1246 1 drbd
[root@ha-node2 ~]# drbdadm create-md mysqldata
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
d. Start drbd on ha-node1 and ha-node2
# On ha-node1
[root@ha-node1 drbd.d]# /etc/init.d/drbd start
Starting DRBD resources: [
create res: mysqldata
prepare disk: mysqldata
adjust disk: mysqldata
adjust net: mysqldata
]
# On ha-node2
[root@ha-node2 ~]# /etc/init.d/drbd start
Starting DRBD resources: [
create res: mysqldata
prepare disk: mysqldata
adjust disk: mysqldata
adjust net: mysqldata
]
[root@ha-node1 ~]# drbd-overview
1:mysqldata/0 Connected Secondary/Secondary Inconsistent/Inconsistent
# Make ha-node1 the primary node
[root@ha-node1 ~]# drbdadm -- --overwrite-data-of-peer primary mysqldata
[root@ha-node1 ~]# drbd-overview
1:mysqldata/0 SyncSource Primary/Secondary UpToDate/Inconsistent
[>....................] sync'ed: 4.3% (7848/8196)M
# Mount the DRBD device on /data and write the test file ha-node2.txt
[root@ha-node1 ~]# mount /dev/drbd1 /data/
[root@ha-node1 ~]# touch /data/ha-node2.txt
[root@ha-node1 ~]# ll /data/
total 16
-rw-r--r-- 1 root root 0 Jun 12 16:35 ha-node2.txt
drwx------ 2 root root 16384 Jun 12 15:31 lost+found
# A status of UpToDate/UpToDate means the data on the primary and standby nodes is in sync
[root@ha-node1 ~]# drbd-overview
1:mysqldata/0 Connected Primary/Secondary UpToDate/UpToDate /data ext4 7.8G 19M 7.4G 1%
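The last status line above is what indicates that the initial synchronization has finished. As a sketch, that condition can be tested from a script by parsing `drbd-overview` output. This assumes the column layout shown in this article (resource, connection state, roles, disk states); the function name `drbd_in_sync` is hypothetical.

```shell
#!/bin/bash
# drbd_in_sync RESOURCE
# Reads drbd-overview output on stdin. Succeeds only if RESOURCE is
# listed as Connected with disk states UpToDate/UpToDate.
# Assumption: columns are laid out as in the output shown above.
drbd_in_sync() {
    local res="$1"
    awk -v r="$res" '
        {
            # Field 1 looks like "1:mysqldata/0": strip minor and volume
            name = $1
            sub(/^[0-9]+:/, "", name)
            sub(/\/[0-9]+$/, "", name)
        }
        name == r {
            found = 1
            ok = ($2 == "Connected" && $4 == "UpToDate/UpToDate")
        }
        END { exit !(found && ok) }
    '
}

# Example:
#   drbd-overview | drbd_in_sync mysqldata && echo "mysqldata is in sync"
```

While the resource is still in the SyncSource state with an Inconsistent peer, the function returns a non-zero status.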
e. Test DRBD
Normal state
[root@ha-node1 ~]# drbd-overview
1:mysqldata/0 Connected Primary/Secondary UpToDate/UpToDate /data ext4 7.8G 19M 7.4G 1%
[root@ha-node2 ~]# drbd-overview
1:mysqldata/0 Connected Secondary/Primary UpToDate/UpToDate
# Note: this shows that ha-node1 is the primary node and ha-node2 is the secondary node
State after a simulated failure
[root@ha-node1 ~]# umount /data
# Set ha-node1 to Secondary
[root@ha-node1 ~]# drbdadm secondary mysqldata
[root@ha-node1 ~]# drbd-overview
1:mysqldata/0 Connected Secondary/Secondary UpToDate/UpToDate
# Set ha-node2 to Primary
[root@ha-node2 ~]# drbdadm primary mysqldata
[root@ha-node2 ~]# drbd-overview
1:mysqldata/0 Connected Primary/Secondary UpToDate/UpToDate
[root@ha-node2 ~]# mount /dev/drbd1 /mnt
# List the files; the test result is normal
[root@ha-node2 ~]# ll /mnt
total 16
-rw-r--r-- 1 root root 0 Jun 12 16:35 ha-node1.txt
-rw-r--r-- 1 root root 0 Jun 15 09:54 ha-node2.txt
# Note: after the DRBD primary node goes down, the standby node works normally once it is set to primary, and the data is consistent
# Restore DRBD to its original state
V. Install and deploy mysql (version 5.5.37) on ha-node1 and ha-node2
ha-node1:
# Create the user and group
[root@ha-node1 ~]# groupadd mysql
[root@ha-node1 ~]# useradd -g mysql mysql -s /sbin/nologin
# Create the data directory
[root@ha-node1 ~]# mkdir -p /data
[root@ha-node1 ~]# mount /dev/drbd1 /data/ # mount the DRBD device on /data
[root@ha-node1 ~]# mkdir -p /data/mysql/data
# Unpack and install mysql
[root@ha-node1 ~]# yum -y install make gcc-c++ cmake bison-devel ncurses-devel
[root@ha-node1 ~]# tar xf mysql-5.5.37.tar.gz
[root@ha-node1 ~]# cd mysql-5.5.37
[root@ha-node1 mysql-5.5.37]# cmake \
> -DCMAKE_INSTALL_PREFIX=/usr/local/mysql-5.5.37 \
> -DMYSQL_DATADIR=/data/mysql/data \
> -DSYSCONFDIR=/etc \
> -DWITH_MYISAM_STORAGE_ENGINE=1 \
> -DWITH_INNOBASE_STORAGE_ENGINE=1 \
> -DWITH_MEMORY_STORAGE_ENGINE=1 \
> -DWITH_READLINE=1 \
> -DMYSQL_UNIX_ADDR=/var/lib/mysql/mysql.sock \
> -DMYSQL_TCP_PORT=3306 \
> -DENABLED_LOCAL_INFILE=1 \
> -DWITH_PARTITION_STORAGE_ENGINE=1 \
> -DEXTRA_CHARSETS=all \
> -DDEFAULT_CHARSET=utf8 \
> -DDEFAULT_COLLATION=utf8_general_ci
[root@ha-node1 mysql-5.5.37]# make && make install
# Initialize the data directory
[root@ha-node1 mysql-5.5.37]# cd /usr/local/mysql-5.5.37/
[root@ha-node1 mysql-5.5.37]# scripts/mysql_install_db --datadir=/data/mysql/data/ --user=mysql --basedir=/usr/local/mysql-5.5.37/
Installing MySQL system tables...
OK
Filling help tables...
OK
To start mysqld at boot time you have to copy
support-files/mysql.server to the right place for your system
PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
To do so, start the server, then issue the following commands:
/usr/local/mysql-5.5.37//bin/mysqladmin -u root password 'new-password'
/usr/local/mysql-5.5.37//bin/mysqladmin -u root -h ha-node1 password 'new-password'
Alternatively you can run:
/usr/local/mysql-5.5.37//bin/mysql_secure_installation
which will also give you the option of removing the test
databases and anonymous user created by default. This is
strongly recommended for production servers.
See the manual for more instructions.
You can start the MySQL daemon with:
cd /usr/local/mysql-5.5.37/ ; /usr/local/mysql-5.5.37//bin/mysqld_safe &
You can test the MySQL daemon with mysql-test-run.pl
cd /usr/local/mysql-5.5.37//mysql-test ; perl mysql-test-run.pl
Please report any problems at
# Copy the mysql configuration file
[root@ha-node1 mysql-5.5.37]# cp -rf support-files/my-large.cnf /etc/my.cnf
# Create the init script
[root@ha-node1 mysql-5.5.37]# cp support-files/mysql.server /etc/init.d/mysqld
[root@ha-node1 mysql-5.5.37]# chmod +x /etc/init.d/mysqld
# Create the symbolic links
[root@ha-node1 mysql-5.5.37]# ln -s /usr/local/mysql-5.5.37/ /usr/local/mysql
[root@ha-node1 mysql-5.5.37]# ln -s /usr/local/mysql-5.5.37/bin/* /usr/sbin/
# Start mysql
[root@ha-node1 ~]# /etc/init.d/mysqld start
Starting MySQL.. SUCCESS!
[root@ha-node1 ~]# netstat -anpt |grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 17309/mysqld
# Connect to mysql and create the database dev
[root@ha-node1 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.37-log Source distribution
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database dev;
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| dev                |   # the database just created
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)
# Stop mysql and disable autostart at boot
[root@ha-node1 ~]# /etc/init.d/mysqld stop
Shutting down MySQL. SUCCESS!
[root@ha-node1 ~]# chkconfig --add mysqld
[root@ha-node1 ~]# chkconfig mysqld off
ha-node2:
# Because we use DRBD, in theory we only need to install mysql here; it should then start normally and see the database dev created on ha-node1
# Confirm the DRBD state on ha-node2
[root@ha-node2 ~]# drbd-overview
1:mysqldata/0 Connected Secondary/Primary UpToDate/UpToDate
# Stop DRBD
[root@ha-node2 ~]# drbdadm down mysqldata
# Install the prerequisite packages
[root@ha-node2 ~]# yum -y install make gcc-c++ cmake bison-devel ncurses-devel
# Create the user and group
[root@ha-node2 ~]# groupadd mysql
[root@ha-node2 ~]# useradd -g mysql mysql -s /sbin/nologin
# Create the data directory
[root@ha-node2 ~]# mkdir -p /data
[root@ha-node2 ~]# mount /dev/sdb1 /data # /dev/sdb1 is the backing device of the DRBD resource
# Unpack and install mysql
[root@ha-node2 ~]# tar xf mysql-5.5.37.tar.gz
[root@ha-node2 ~]# cd mysql-5.5.37
[root@ha-node2 mysql-5.5.37]# cmake \
> -DCMAKE_INSTALL_PREFIX=/usr/local/mysql-5.5.37 \
> -DMYSQL_DATADIR=/data/mysql/data \
> -DSYSCONFDIR=/etc \
> -DWITH_MYISAM_STORAGE_ENGINE=1 \
> -DWITH_INNOBASE_STORAGE_ENGINE=1 \
> -DWITH_MEMORY_STORAGE_ENGINE=1 \
> -DWITH_READLINE=1 \
> -DMYSQL_UNIX_ADDR=/var/lib/mysql/mysql.sock \
> -DMYSQL_TCP_PORT=3306 \
> -DENABLED_LOCAL_INFILE=1 \
> -DWITH_PARTITION_STORAGE_ENGINE=1 \
> -DEXTRA_CHARSETS=all \
> -DDEFAULT_CHARSET=utf8 \
> -DDEFAULT_COLLATION=utf8_general_ci
[root@ha-node2 mysql-5.5.37]# make && make install
# Copy the mysql configuration file and init script over from ha-node1
[root@ha-node1 ~]# scp /etc/my.cnf ha-node2:/etc
[root@ha-node1 ~]# scp /etc/init.d/mysqld ha-node2:/etc/init.d/
# Create the symbolic links
[root@ha-node2 ~]# ln -s /usr/local/mysql-5.5.37/ /usr/local/mysql
[root@ha-node2 ~]# ln -s /usr/local/mysql-5.5.37/bin/* /usr/sbin/
# Disable mysql autostart at boot
[root@ha-node2 ~]# chkconfig --add mysqld
[root@ha-node2 ~]# chkconfig mysqld off
# Start mysql
[root@ha-node2 ~]# /etc/init.d/mysqld start
Starting MySQL.. SUCCESS!
[root@ha-node2 ~]# netstat -anpt |grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 16617/mysqld
# Connect to mysql and check whether the dev database created on ha-node1 is present
[root@ha-node2 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.37-log Source distribution
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| dev                |   # the database created on ha-node1 is listed
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.05 sec)
# Stop mysql, unmount /data, and bring the DRBD device back up
[root@ha-node2 ~]# /etc/init.d/mysqld stop
Shutting down MySQL. SUCCESS!
[root@ha-node2 ~]# umount /mnt
umount: /mnt: not mounted
[root@ha-node2 ~]# umount /data/
[root@ha-node2 ~]# drbdadm up mysqldata
[root@ha-node2 ~]# drbd-overview
1:mysqldata/0 Connected Secondary/Primary UpToDate/UpToDate
This completes the DRBD-based mysql installation.
VI. Configure heartbeat to manage DRBD
Make sure the DRBD device is not mounted on ha-node1 or ha-node2, and that ha-node1 is the primary node.
# Mounts on ha-node1
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
# Mounts on ha-node2
[root@ha-node2 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.7G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
[root@ha-node1 ~]# drbd-overview
1:mysqldata/0 Connected Primary/Secondary UpToDate/UpToDate
Earlier we successfully managed the VIP through heartbeat; here we configure it to manage the DRBD device as well.
Edit the configuration file /etc/ha.d/haresources
[root@ha-node1 ~]# grep -v ^# /etc/ha.d/haresources
ha-node1 IPaddr::192.168.3.73/24/eth0 drbddisk::mysqldata Filesystem::/dev/drbd1::/data::ext4
# Explanation
drbddisk::mysqldata <== activates the drbd resource mysqldata; equivalent to running /etc/ha.d/resource.d/drbddisk mysqldata stop/start
Filesystem::/dev/drbd1::/data::ext4 <== mounts the drbd device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 stop/start <== which is the same as running mount /dev/drbd1 /data on the system
# Copy the resource file to ha-node2
[root@ha-node1 ~]# scp /etc/ha.d/haresources ha-node2:/etc/ha.d/
# Mounts on ha-node1 before the DRBD resource is reloaded
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
# Stop heartbeat on both nodes and start it again
[root@ha-node1 ~]# /etc/init.d/heartbeat stop
[root@ha-node2 ~]# /etc/init.d/heartbeat stop
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@ha-node2 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
# Check the result
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data # the DRBD device is mounted on /data on the primary node
# Stop the heartbeat service on ha-node1
[root@ha-node1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
# Check the mounts on ha-node2
[root@ha-node2 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.7G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data # the result is as expected
# Check the takeover log on the standby node
[root@ha-node2 ~]# tail -20 /var/log/messages
Jun 15 11:30:10 ha-node2 ResourceManager(default)[17855]: info: Acquiring resource group: ha-node1 IPaddr::192.168.3.73/24/eth0 drbddisk::mysqldata Filesystem::/dev/drbd1::/data::ext4
Jun 15 11:30:10 ha-node2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.3.73)[17883]: INFO: Resource is stopped
Jun 15 11:30:10 ha-node2 ResourceManager(default)[17855]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.3.73/24/eth0 start
Jun 15 11:30:10 ha-node2 IPaddr(IPaddr_192.168.3.73)[18006]: INFO: Adding inet address 192.168.3.73/24 with broadcast address 192.168.3.255 to device eth0
Jun 15 11:30:10 ha-node2 IPaddr(IPaddr_192.168.3.73)[18006]: INFO: Bringing device eth0 up
Jun 15 11:30:11 ha-node2 IPaddr(IPaddr_192.168.3.73)[18006]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.3.73 eth0 192.168.3.73 auto not_used not_used
Jun 15 11:30:11 ha-node2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.3.73)[17980]: INFO: Success
Jun 15 11:30:11 ha-node2 ResourceManager(default)[17855]: info: Running /etc/ha.d/resource.d/drbddisk mysqldata start
Jun 15 11:30:11 ha-node2 kernel: block drbd1: role( Secondary -> Primary )
Jun 15 11:30:11 ha-node2 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd1)[18138]: INFO: Resource is stopped
Jun 15 11:30:11 ha-node2 ResourceManager(default)[17855]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 start
Jun 15 11:30:11 ha-node2 Filesystem(Filesystem_/dev/drbd1)[18222]: INFO: Running start for /dev/drbd1 on /data
Jun 15 11:30:11 ha-node2 kernel: EXT4-fs (drbd1): mounted filesystem with ordered data mode. Opts:
Jun 15 11:30:11 ha-node2 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd1)[18214]: INFO: Success
Jun 15 11:30:11 ha-node2 mach_down(default)[17828]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Jun 15 11:30:11 ha-node2 mach_down(default)[17828]: info: mach_down takeover complete for node ha-node1.
Jun 15 11:30:11 ha-node2 heartbeat: [17726]: info: mach_down takeover complete.
Jun 15 11:30:41 ha-node2 heartbeat: [17726]: WARN: node ha-node1: is dead
Jun 15 11:30:41 ha-node2 heartbeat: [17726]: info: Dead node ha-node1 gave up resources.
Jun 15 11:30:41 ha-node2 heartbeat: [17726]: info: Link ha-node1:eth0 dead.
# Now start the heartbeat service on ha-node1 again
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
# Mounts on ha-node1
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data
# Mounts on ha-node2
[root@ha-node2 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.7G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
Finally, configure the heartbeat + DRBD + mysql combination.
Edit the configuration file /etc/ha.d/haresources
[root@ha-node1 ~]# grep -v ^# /etc/ha.d/haresources
ha-node1 IPaddr::192.168.3.73/24/eth0 drbddisk::mysqldata Filesystem::/dev/drbd1::/data::ext4 mysqld
# Explanation
drbddisk::mysqldata <== activates the drbd resource mysqldata; equivalent to running /etc/ha.d/resource.d/drbddisk mysqldata stop/start
Filesystem::/dev/drbd1::/data::ext4 <== mounts the drbd device on /data; equivalent to running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 stop/start <== which is the same as running mount /dev/drbd1 /data on the system
mysqld <== runs the mysql init script; equivalent to /etc/init.d/mysqld stop/start
# Copy the configuration file to ha-node2
[root@ha-node1 ~]# scp /etc/ha.d/haresources ha-node2:/etc/ha.d/
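To make the mapping in the explanation above concrete, here is a minimal sketch that expands a haresources line into the script invocations it stands for. It relies on the heartbeat convention that resources are started left to right and stopped right to left, and it prints only the script name and arguments, since a script may live in /etc/ha.d/resource.d/ or /etc/init.d/. The function name `expand_haresources` is hypothetical.

```shell
#!/bin/bash
# expand_haresources ACTION LINE
# ACTION is "start" or "stop"; LINE is one haresources entry.
# Prints one "<script> <args...> <action>" line per resource:
# left to right for start, right to left for stop.
# Hypothetical helper; it only prints and runs nothing.
expand_haresources() {
    local action="$1" line="$2"
    local -a items
    read -r -a items <<< "$line"
    # items[0] is the preferred node name; the rest are resources
    local n=${#items[@]} i
    if [ "$action" = "start" ]; then
        for ((i = 1; i < n; i++)); do
            # "name::arg1::arg2" becomes "name arg1 arg2"
            echo "${items[i]//::/ } start"
        done
    else
        for ((i = n - 1; i >= 1; i--)); do
            echo "${items[i]//::/ } stop"
        done
    fi
}

# Example:
#   expand_haresources start "$(grep -v '^#' /etc/ha.d/haresources)"
```

For the line above, the start order is IPaddr, drbddisk, Filesystem, mysqld, which matches the order of the "Running ... start" entries in the takeover log; the stop order is the reverse, so mysqld is stopped before /data is unmounted.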
Restart the heartbeat service on ha-node1
[root@ha-node1 ~]# /etc/init.d/heartbeat stop
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
# Check the result; normally the VIP, the DRBD device, and the mysql service should all be up on ha-node1
[root@ha-node1 ~]# netstat -anpt |grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 21075/mysqld
[root@ha-node1 ~]# ip a|grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data
# Test that mysql works
[root@ha-node1 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.37-log Source distribution
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database yanfa;
Query OK, 1 row affected (0.02 sec)
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| dev                |
| mysql              |
| performance_schema |
| test               |
| yanfa              |   # the newly created database yanfa (R&D)
+--------------------+
6 rows in set (0.02 sec)
Restart the heartbeat service on ha-node2
# After changing the configuration file we have not yet restarted heartbeat on ha-node2, so restart it now
[root@ha-node2 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@ha-node2 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
Test whether mysql fails over correctly and keeps serving
# The VIP, DRBD, and mysql are all running on ha-node1. After we stop heartbeat on ha-node1, ha-node2 should take over these resources automatically
[root@ha-node1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
[root@ha-node1 ~]# ip a|grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
[root@ha-node1 ~]# netstat -anpt |grep mysql
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
# The output above shows that all resources have been released
# Check the result on node2
[root@ha-node2 ~]# ip a|grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.72/24 brd 192.168.3.255 scope global eth0
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
[root@ha-node2 ~]# netstat -anpt |grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 21149/mysqld
[root@ha-node2 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.7G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data
# ha-node2 has taken over all resources
# Check that the mysql data is intact
[root@ha-node2 ~]# mysql -e "show databases;"
+--------------------+
| Database           |
+--------------------+
| information_schema |
| dev                |
| mysql              |
| performance_schema |
| test               |
| yanfa              |   # the yanfa database created earlier is listed
+--------------------+
# Check the log file on ha-node2
[root@ha-node2 ~]# tail -30 /var/log/messages
Jun 15 11:46:17 ha-node2 kernel: block drbd1: peer( Primary -> Secondary )
Jun 15 11:46:17 ha-node2 heartbeat: [20281]: info: Received shutdown notice from 'ha-node1'.
Jun 15 11:46:17 ha-node2 heartbeat: [20281]: info: Resources being acquired from ha-node1.
Jun 15 11:46:17 ha-node2 heartbeat: [20321]: info: acquire local HA resources (standby).
Jun 15 11:46:17 ha-node2 heartbeat: [20322]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys ha-node2] to acquire.
Jun 15 11:46:17 ha-node2 heartbeat: [20321]: info: local HA resource acquisition completed (standby).
Jun 15 11:46:17 ha-node2 heartbeat: [20281]: info: Standby resource acquisition done [foreign].
Jun 15 11:46:17 ha-node2 harc(default)[20347]: info: Running /etc/ha.d//rc.d/status status
Jun 15 11:46:17 ha-node2 mach_down(default)[20364]: info: Taking over resource group IPaddr::192.168.3.73/24/eth0
Jun 15 11:46:17 ha-node2 ResourceManager(default)[20391]: info: Acquiring resource group: ha-node1 IPaddr::192.168.3.73/24/eth0 drbddisk::mysqldata Filesystem::/dev/drbd1::/data::ext4 mysqld
Jun 15 11:46:17 ha-node2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.3.73)[20419]: INFO: Resource is stopped
Jun 15 11:46:17 ha-node2 ResourceManager(default)[20391]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.3.73/24/eth0 start
Jun 15 11:46:17 ha-node2 IPaddr(IPaddr_192.168.3.73)[20542]: INFO: Adding inet address 192.168.3.73/24 with broadcast address 192.168.3.255 to device eth0
Jun 15 11:46:17 ha-node2 IPaddr(IPaddr_192.168.3.73)[20542]: INFO: Bringing device eth0 up
Jun 15 11:46:17 ha-node2 IPaddr(IPaddr_192.168.3.73)[20542]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.3.73 eth0 192.168.3.73 auto not_used not_used
Jun 15 11:46:17 ha-node2 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.3.73)[20516]: INFO: Success
Jun 15 11:46:17 ha-node2 ResourceManager(default)[20391]: info: Running /etc/ha.d/resource.d/drbddisk mysqldata start
Jun 15 11:46:17 ha-node2 kernel: block drbd1: role( Secondary -> Primary )
Jun 15 11:46:17 ha-node2 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd1)[20670]: INFO: Resource is stopped
Jun 15 11:46:17 ha-node2 ResourceManager(default)[20391]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext4 start
Jun 15 11:46:17 ha-node2 Filesystem(Filesystem_/dev/drbd1)[20752]: INFO: Running start for /dev/drbd1 on /data
Jun 15 11:46:17 ha-node2 kernel: EXT4-fs (drbd1): mounted filesystem with ordered data mode. Opts:
Jun 15 11:46:17 ha-node2 /usr/lib/ocf/resource.d//heartbeat/Filesystem(Filesystem_/dev/drbd1)[20744]: INFO: Success
Jun 15 11:46:17 ha-node2 ResourceManager(default)[20391]: info: Running /etc/init.d/mysqld start
Jun 15 11:46:19 ha-node2 mach_down(default)[20364]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
Jun 15 11:46:19 ha-node2 heartbeat: [20281]: info: mach_down takeover complete.
Jun 15 11:46:19 ha-node2 mach_down(default)[20364]: info: mach_down takeover complete for node ha-node1.
Jun 15 11:46:49 ha-node2 heartbeat: [20281]: WARN: node ha-node1: is dead
Jun 15 11:46:49 ha-node2 heartbeat: [20281]: info: Dead node ha-node1 gave up resources.
Jun 15 11:46:49 ha-node2 heartbeat: [20281]: info: Link ha-node1:eth0 dead.
Start heartbeat on ha-node1 again and check whether the resources fail back
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
# Check the result: the resources have been taken back
[root@ha-node1 ~]# ip a|grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
[root@ha-node1 ~]# netstat -anpt |grep mysql
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 22805/mysqld
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data
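Failover and failback take a few seconds (the takeover log above spans about two seconds), so a scripted test has to poll rather than check once. The following is a generic polling sketch; the function name `wait_until` and the 60-second timeout in the example are arbitrary choices made here, not values from the original setup.

```shell
#!/bin/bash
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds (returns 0) or
# roughly TIMEOUT_SECONDS have passed (returns 1).
# Hypothetical helper for scripted failover tests.
wait_until() {
    local timeout="$1"; shift
    local elapsed=0
    until "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example: after starting heartbeat on ha-node1, wait up to 60 seconds
# for the VIP to come back to this node.
#   wait_until 60 sh -c 'ip a | grep -qF "inet 192.168.3.73/"' \
#       && echo "VIP is back on this node"
```

The same helper can wrap a check for port 3306 or for the /data mount, so that each resource in the group is verified after a switch.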
VII. Configure highly available mysql master-slave replication
On top of the architecture above, we add a mysql slave server to separate reads from writes. The slave is installed with a script.
The script is as follows
#!/bin/bash
DATADIR='/data/mysql/data'
VERSION='mysql-5.5.37'
export LANG=zh_CN.UTF-8

# Source function library.
. /etc/init.d/functions

# cmake install mysql5.5.X
install_mysql(){
    # The root password is entered manually here; the author set it to lyao36843
    read -p "please input a password for root: " PASSWD
    if [ ! -d $DATADIR ];then
        mkdir -p $DATADIR
    fi
    yum install cmake make gcc-c++ bison-devel ncurses-devel -y
    id mysql &>/dev/null
    if [ $? -ne 0 ];then
        useradd mysql -s /sbin/nologin -M
    fi
    #useradd mysql -s /sbin/nologin -M
    # change datadir owner to mysql
    chown -R mysql.mysql $DATADIR
    cd
    #wget http://mirrors.sohu.com/mysql/MySQL-5.5/mysql-5.5.38.tar.gz
    tar xf $VERSION.tar.gz
    cd $VERSION
    cmake . -DCMAKE_INSTALL_PREFIX=/usr/local/$VERSION \
    -DMYSQL_DATADIR=$DATADIR \
    -DMYSQL_UNIX_ADDR=$DATADIR/mysql.sock \
    -DDEFAULT_CHARSET=utf8 \
    -DDEFAULT_COLLATION=utf8_general_ci \
    -DENABLED_LOCAL_INFILE=ON \
    -DWITH_INNOBASE_STORAGE_ENGINE=1 \
    -DWITH_FEDERATED_STORAGE_ENGINE=1 \
    -DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
    -DWITHOUT_EXAMPLE_STORAGE_ENGINE=1 \
    -DWITHOUT_PARTITION_STORAGE_ENGINE=1
    make && make install
    if [ $? -ne 0 ];then
        action "install mysql is failed!" /bin/false
        exit 1
    fi
    sleep 2
    # link
    ln -s /usr/local/$VERSION/ /usr/local/mysql
    ln -s /usr/local/mysql/bin/* /usr/bin/
    # copy config and start file
    /bin/cp /usr/local/mysql/support-files/my-small.cnf /etc/my.cnf
    cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqld
    chmod 700 /etc/init.d/mysqld
    # init mysql
    /usr/local/mysql/scripts/mysql_install_db --basedir=/usr/local/mysql --datadir=$DATADIR --user=mysql
    if [ $? -ne 0 ];then
        action "install mysql is failed!" /bin/false
        exit 1
    fi
    # check mysql
    /etc/init.d/mysqld start
    if [ $? -ne 0 ];then
        action "mysql start is failed!" /bin/false
        exit 1
    fi
    chkconfig --add mysqld
    chkconfig mysqld on
    /usr/local/mysql/bin/mysql -e "update mysql.user set password=password('$PASSWD') where host='localhost' and user='root';"
    /usr/local/mysql/bin/mysql -e "update mysql.user set password=password('$PASSWD') where host='127.0.0.1' and user='root';"
    /usr/local/mysql/bin/mysql -e "delete from mysql.user where password='';"
    /usr/local/mysql/bin/mysql -e "flush privileges;"
    #/usr/local/mysql/bin/mysql -e "select version();" >/dev/null 2>&1
    if [ $? -eq 0 ];then
        echo "+---------------------------+"
        echo "+--mysql install complete---+"
        echo "+---------------------------+"
    fi
    #/etc/init.d/mysqld stop
}

install_mysql
Edit the configuration file /etc/my.cnf so that it reads as follows:
[root@mysql ~]# egrep -v "^#|^$" /etc/my.cnf
[client]
port = 3306
socket = /data/mysql/data/mysql.sock
[mysqld]
port = 3306
socket = /data/mysql/data/mysql.sock
skip-external-locking
key_buffer_size = 16K
max_allowed_packet = 1M
table_open_cache = 4
sort_buffer_size = 64K
read_buffer_size = 256K
read_rnd_buffer_size = 256K
net_buffer_length = 2K
thread_stack = 128K
server-id = 2 # make sure server-id is unique across the whole architecture
log-bin=mysql-bin # it is best to keep the binary log enabled
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 8M
sort_buffer_size = 8M
[mysqlhotcopy]
interactive-timeout
Restart MySQL
[root@mysql ~]# service mysqld restart
Shutting down MySQL. SUCCESS!
Starting MySQL.. SUCCESS!
[root@mysql ~]# netstat -anpt
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address      Foreign Address    State        PID/Program name
tcp        0      0 0.0.0.0:22         0.0.0.0:*          LISTEN       1789/sshd
tcp        0      0 127.0.0.1:25       0.0.0.0:*          LISTEN       1305/master
tcp        0      0 0.0.0.0:3306       0.0.0.0:*          LISTEN       12084/mysqld
tcp        0     52 192.168.3.74:22    192.168.3.2:4138   ESTABLISHED  1416/sshd
tcp        0      0 :::22              :::*               LISTEN       1789/sshd
tcp        0      0 ::1:25             :::*               LISTEN       1305/master
Create the replication user rep in MySQL on ha-node1
mysql> grant replication slave on *.* to 'rep'@'192.168.3.%' identified by 'rep123';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
Check the master status
# run on ha-node1
mysql> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000006 |      333 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
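The file name and position above are copied by hand into the change master statement in the next step. A minimal sketch of automating that copy is shown here; `build_change_master` is a hypothetical helper (not part of the original setup) and it assumes the master status is supplied as tab-separated text, which is what `mysql -N -B -e "show master status"` prints.

```shell
# Hypothetical helper: read one tab-separated SHOW MASTER STATUS row on stdin
# (File <TAB> Position <TAB> ...) and print a matching CHANGE MASTER TO statement.
# $1 = master host (the VIP), $2 = replication user, $3 = replication password
build_change_master() {
    local host="$1" user="$2" password="$3"
    local file pos _rest
    # Split the first input line on tabs only
    IFS="$(printf '\t')" read -r file pos _rest
    if [ -z "$file" ] || [ -z "$pos" ]; then
        echo "could not parse master status" >&2
        return 1
    fi
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=3306, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$host" "$user" "$password" "$file" "$pos"
}
```

On ha-node1 this could be used as `mysql -uroot -p -N -B -e "show master status" | build_change_master 192.168.3.73 rep rep123`, and the printed statement then run on the slave. The master host is the VIP rather than a node address, so the slave keeps following whichever node currently holds it.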
Configure the replication parameters on the slave
mysql> change master to master_host='192.168.3.73',master_port=3306,master_user='rep',master_password='rep123',master_log_file='mysql-bin.000006',master_log_pos=333;
Query OK, 0 rows affected (0.03 sec)
# start the slave
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
# check the status; it is normal
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.3.73
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 333
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 253
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes   # this must be Yes
            Slave_SQL_Running: Yes   # this must be Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 333
              Relay_Log_Space: 409
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)
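The two lines that matter in this output are Slave_IO_Running and Slave_SQL_Running. As a rough sketch, the check can be scripted instead of read by eye; `check_slave_threads` is a hypothetical helper name, and it assumes the `show slave status\G` text is piped in on stdin.

```shell
# Hypothetical helper: read `show slave status\G` output on stdin, print the
# state of both replication threads, and succeed only if both are "Yes".
check_slave_threads() {
    awk -F': *' '
        # Strip trailing whitespace from a field value
        function trim(s) { sub(/[ \t\r]+$/, "", s); return s }
        /^ *Slave_IO_Running:/  { io  = trim($2) }
        /^ *Slave_SQL_Running:/ { sql = trim($2) }
        END {
            printf "IO=%s SQL=%s\n", io, sql
            if (io == "Yes" && sql == "Yes") exit 0
            exit 1
        }'
}
```

For example, `mysql -uroot -p -h 127.0.0.1 -e "show slave status\G" | check_slave_threads` prints `IO=Yes SQL=Yes` and returns 0 when replication is healthy. If no slave is configured, both values are empty and the function returns 1.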
Test 1: MySQL is currently running on ha-node1. Create a new database rep and check whether it replicates correctly.
# create it in MySQL on ha-node1
mysql> create database rep;
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| rep                |   # the newly created database rep
| test               |
+--------------------+
7 rows in set (0.00 sec)
# check the result on the MySQL slave; it is normal
mysql> system hostname
mysql
mysql> system ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.3.74/24 brd 192.168.3.255 scope global eth0
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| rep                |
| test               |
+--------------------+
5 rows in set (0.00 sec)
Test 2: After heartbeat is stopped on ha-node1, ha-node2 takes over all resources. Check whether the MySQL slave still replicates correctly in this situation.
# stop the heartbeat service on ha-node1 so that ha-node2 takes over the resources
[root@ha-node1 ~]# /etc/init.d/heartbeat stop
Stopping High-Availability services: Done.
# confirm the resources on ha-node2
[root@ha-node2 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.7G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data
[root@ha-node2 ~]# ip a|grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.3.72/24 brd 192.168.3.255 scope global eth0
    inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
[root@ha-node2 ~]# netstat -anpt |grep mysql
tcp        0      0 0.0.0.0:3306       0.0.0.0:*          LISTEN       22430/mysqld
# check the slave status on the MySQL slave
[root@mysql ~]# mysql -uroot -plyao36843 -h 127.0.0.1 -e "show slave status\G" |grep Yes
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
# the output shows that replication is normal
# create the database ha on ha-node2
mysql> create database ha;
Query OK, 1 row affected (0.00 sec)
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ha                 |
| mysql              |
| performance_schema |
| rep                |
| test               |
+--------------------+
8 rows in set (0.04 sec)
# check the result on the MySQL slave
[root@mysql ~]# mysql -uroot -plyao36843 -h 127.0.0.1 -e "show databases;"
+--------------------+
| Database           |
+--------------------+
| information_schema |
| ha                 |   # the result is normal
| mysql              |
| performance_schema |
| rep                |
| test               |
+--------------------+
Test 3: After ha-node1 recovers, check whether the MySQL slave is still in a normal state.
# start the heartbeat service on ha-node1 again
[root@ha-node1 ~]# /etc/init.d/heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@ha-node1 ~]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 192.168.3.71/24 brd 192.168.3.255 scope global eth0
    inet 192.168.3.73/24 brd 192.168.3.255 scope global secondary eth0
[root@ha-node1 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        18G  2.8G   14G  17% /
tmpfs           495M     0  495M   0% /dev/shm
/dev/sda1       190M   48M  133M  27% /boot
/dev/drbd1      7.8G   48M  7.4G   1% /data
[root@ha-node1 ~]# netstat -anpt |grep mysql
tcp        0      0 0.0.0.0:3306       0.0.0.0:*          LISTEN       24156/mysqld
# confirm the status of the MySQL slave
[root@mysql ~]# mysql -uroot -plyao36843 -h 127.0.0.1 -e "show slave status\G" |grep Yes
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
# This shows that in the heartbeat + DRBD + MySQL architecture, a heartbeat switchover between the two nodes makes the MySQL slave lose its connection for 60s, after which it reconnects automatically.
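Because the slave drops its connection for a while after each switchover (the 60s observed here is likely related to the Connect_Retry: 60 value in the slave status), a single status check right after a failover can report a false failure. The following is a minimal sketch of polling until both threads are back; `wait_for_slave` is a hypothetical helper, and the command that prints the slave status is passed in as arguments.

```shell
# Hypothetical helper: poll until both replication threads report "Yes".
# $1 = timeout in seconds, $2 = polling interval in seconds,
# remaining arguments = command that prints `show slave status\G`
wait_for_slave() {
    local timeout="$1" interval="$2"
    shift 2
    local waited=0 running
    while [ "$waited" -le "$timeout" ]; do
        # Count how many of the two thread lines currently say Yes
        running=$("$@" 2>/dev/null | grep -cE 'Slave_(IO|SQL)_Running: Yes')
        if [ "$running" -eq 2 ]; then
            echo "slave healthy after ${waited}s"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "slave still not healthy after ${timeout}s" >&2
    return 1
}
```

On the slave host this might be called as `wait_for_slave 120 5 mysql -uroot -p'<password>' -h 127.0.0.1 -e 'show slave status\G'`. The 120s timeout and 5s interval are illustrative values, chosen only to be comfortably longer than the 60s reconnect delay seen in the test.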
At this point the heartbeat + DRBD + MySQL master-slave configuration is complete.
High-availability split-brain: causes and solutions
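Common causes of split-brain:
1. The heartbeat link between the high-availability servers fails, so they cannot check each other's heartbeat
2. A firewall is enabled on the high-availability servers and blocks heartbeat traffic
3. The NIC address or related settings on a high-availability server are misconfigured, so heartbeats fail to be sent
4. Other misconfiguration, such as mismatched heartbeat methods, conflicting heartbeat broadcasts, or software bugs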
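Ways to prevent or handle split-brain:
1. Add redundant heartbeat links
2. When split-brain is detected, forcibly shut down one node (remotely power off the primary node through a power-control fence device)
3. Set up monitoring and alerting for split-brain
4. After an alert, have the standby node wait a relatively long time before taking over, so that operations staff have enough time to intervene manually
5. Enable a disk lock: the node currently providing service locks the disk, so that when split-brain occurs the other side cannot seize the "shared disk resource" at all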
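Problems with the disk lock:
Locking the disk can lead to deadlock. If the node holding the shared disk does not actively "unlock" it, the other node can never obtain the shared disk. If the serving node suddenly hangs or crashes, it cannot run the unlock command, so the standby node cannot take over the resources and services. For this reason some HA designs use a smart lock: the node providing service enables the disk lock only when it finds that all heartbeat links are down, and leaves the disk unlocked the rest of the time.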
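For the monitoring and alerting point in the list above, one simple signal is the VIP being active on both nodes at once. The sketch below counts how many nodes hold the VIP; all three function names are hypothetical, and it assumes the check runs from a third host that has passwordless ssh access to both nodes, which this article does not set up.

```shell
# Succeeds if the `ip addr` text on stdin contains the given IPv4 address.
has_vip() {
    grep -qF "inet $1/"
}

# Turn the number of nodes holding the VIP into a status line.
classify_vip_holders() {
    case "$1" in
        1) echo "OK: exactly one node holds the VIP" ;;
        0) echo "WARNING: no node holds the VIP" ;;
        *) echo "CRITICAL: VIP is up on $1 nodes, possible split-brain" ;;
    esac
}

# $1 = VIP, remaining arguments = node host names.
# Assumption: root ssh access from the monitoring host to every node.
check_split_brain() {
    local vip="$1" holders=0 node
    shift
    for node in "$@"; do
        if ssh -o ConnectTimeout=5 "root@$node" ip -4 addr show 2>/dev/null | has_vip "$vip"; then
            holders=$((holders + 1))
        fi
    done
    classify_vip_holders "$holders"
}
```

With the addresses used in this article the call would be `check_split_brain 192.168.3.73 ha-node1 ha-node2`, run from cron and wired to whatever alerting is already in place. A node that cannot be reached over ssh is counted as not holding the VIP, so this check can miss a split-brain in which the monitoring host itself is cut off from one node.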