MHA概述

日本DeNA公司youshimaton (现就职于Facebook公司) 开发
一套优秀的作为MySQL高可用性环境下故障切换和主从提升的高可用软件

MHA的组成

MHA Manager (管理节点)
MHA Node (数据节点)

MHA特点

自动故障切换过程中,MHA试图从宕机的主服务器上保存二进制日志,最大程度的保证数据的不丢失
使用MySQL 5.5的半同步复制,可以大大降低数据丢失的风险

实验思路

1、MHA架构

(1)数据库安装
(2)一主两从
(3)MHA搭建

2、故障模拟

(1)主库失效
(2)备选主库成为主库
(3)从库2将备选主库指向为主库

部署环境

1、实验环境

服务器角色 IP地址 服务软件包
master 192.168.142.130 mha4mysql-node
slave1 192.168.142.131 mha4mysql-node
slave2 192.168.142.132 mha4mysql-node
manager 192.168.142.133 Textmha4mysql-manager、 mha4mysql-node

2、实验要求

本案例要求通过MHA监控MySQL 数据库在故障时进行自动切换,不影响业务。

3、实验思路

(1)安装MySQL数据库
(2)配置MySQL一主两从
(3)安装MHA软件
(4)配置无密码认证
(5)配置MySQL MHA高可用
(6)模拟master 故障切换

第一步:在三台MySQL节点上分别安装数据库

(MySOL版本请使用5.6.36, cmake版本请使用2.8.6)

1、安装编译依赖的环境

yum install -y install ncurses-devel gcc gcc-c++ perl-Module-Install

2、远程挂载

mkdir /abc
mount.cifs //192.168.1421/mha /abc/

3、安装gmake编译软件

cd /abc/mha/
tar zxvf cmake-2.8.6.tar.gz -C /opt/

cd /opt/cmake-2.8.6/
./configure
gmake && gmake install

4、安装MySQL数据库

cd /abc/mha/
tar zxvf mysql-5.6.36.tar.gz -C /opt/

cd /opt/mysql-5.6.36/
cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DWITH_EXTRA_CHARSETS=all \
-DSYSCONFDIR=/etc

make && make install

cp support-files/my-default.cnf /etc/my.cnf
cp support-files/mysql.server /etc/rc.d/init.d/mysqld

chmod +x /etc/rc.d/init.d/mysqld
chkconfig --add mysqld
echo "PATH=$PATH:/usr/local/mysql/bin" >> /etc/profile
source /etc/profile

useradd -M -s /sbin/nologin mysql
chown -R mysql.mysql /usr/local/mysql

/usr/local/mysql/scripts/mysql_install_db \
--basedir=/usr/local/mysql \
--datadir=/usr/local/mysql/data \
--user=mysql

5、修改master的主配置文件/etc/my.cnf文件,三台服务器的server-id不能一样

vim /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin
log-slave-updates = true

第二步:配置从服务器1

修改 mysql 的主配置文件

#在/etc/my.cnf中修改或者增加下面内容。
[mysqld]
server-id = 2
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

第三步:配置从服务器2

1.修改 mysql 的主配置文件:/etc/my.cnf

vim /etc/my.cnf
[mysql]
server-id = 3
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

2.在master、slave1、slave2上分别做两个软连接

ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

3.master、slave1、slave2上启动mysql,并查看开启状况

#启动mysql
/usr/local/mysql/bin/mysqld_safe --user=mysql &

#查看服务端口状态
netstat -ntap | grep 3306

#关闭防火墙和安全功能
systemctl stop firewalld.service 
setenforce 0

第四步:配置MySQL一主两从


1.mysq主从配置相对比较简单需要注意的是授权,在所有数据库节点上授权两个用户,一个是从库同步使用用户myslave,另一个是manager使用监控用户mha

grant replication slave on *.* to 'myslave'@'192.168.142.%' identified by '123';
grant all privileges on *.* to 'mha'@'192.168.142.%' identified by 'manager';
flush privileges;

2.下面三条授权按理论是不用添加的,但是做案例实验环境时候通过MHA检查mysql主从有报错,报两个从库通过主机名连接不上主库,所以所有数据库加上下面的授权

grant all privileges on *.* to 'mha'@'master' identified by 'manager';
grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';

#刷新数据库
flush privileges;

3.在master主机上查看二进制文件和同步点

mysql> show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 |     1292 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+

4.在slave1、slave2上分别执行同步

change master to master_host='192.168.142.130',master_user='myslave',master_password='123',master_log_file='masterbin.000001',master_log_pos=1292;
start slave;
#开启slave

5.查看IO和SQL线程都是yes代表代表同步正常

show slave status\G;

             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
#必须设置两个从库为只读模式

#设置两个从库为只读模式
set global read_only=1;

#刷新数据库
flush privileges;

第五步:安装MHA软件(所有服务器上都安装MHA依赖的环境,首先安装epel源)

#关闭防火墙和安全功能
systemctl stop firewalld.service 
setenforce 0

#安装MHA依赖的环境
yum install epel-release --nogpgcheck -y
yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

#安装node(在所有服务器上安装node)
tar zxvf /abc/rpm/MHA/mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57/
perl Makefile.PL
make && make install

第六步:在MHA-manager上安装manager组件(注意:先安装node组件才能安装manager组件)

tar zxvf /abc/rpm/MHA/mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57/
perl Makefile.PL 
make
make install

manager安装后会在/usr/local/bin下面生成几个工具:

masterha_conf_host           
#添加或删除配置的server信息

masterha_stop                   
#关闭manager

masterha_manager            
#启动manager脚本

masterha_check_repl         
#检查mysql复制情况

masterha_master_monitor 
#检查master是否宕机

masterha_check_ssh          
#检查MHA的SSH配置状况

masterha_master_switch    
#控制故障转移(自动或者手动)

masterha_check_status      
#检测当前MHA运行状态

node安装后也会在/usr/local/bin下面生成几个脚本(这些工具通常由MHA Manager的脚本出发,无需人为陈操作)

apply_diff_relay_logs             
#识别差异的中继日志事件并将其差异的事件应用与其他的slave

filter_mysqlbinlog             
#去除不必要的ROLLBACK事件(MHA已不再使用这个工具)

purge_relay_logs            
#清除中继日志(不会阻塞SQL线程)

save_binary_logs            
#保存和复制master的二进制日志

第七步:配置无密码验证

(1)在manager上配置到所有数据库节点的无密码认证

#因为是无密码验证,所以一路按回车键
ssh-keygen -t rsa         
ssh-copy-id 192.168.142.130
ssh-copy-id 192.168.142.131
ssh-copy-id 192.168.142.132

(2)在master上配置到数据库节点slave1和slave2的无密码验证

ssh-keygen -t rsa
ssh-copy-id 192.168.142.131
ssh-copy-id 192.168.142.132

(3)在slave1上配置到数据库节点master和slave2的无密码认证

ssh-keygen -t rsa
ssh-copy-id 192.168.142.130
ssh-copy-id 192.168.142.132

(4)在slave2上配置到数据库节点master和slave1的无密码认证

ssh-keygen -t rsa
ssh-copy-id 192.168.142.130
ssh-copy-id 192.168.142.131

第八步:配置MHA

1.在manager节点上复制相关脚本到/usr/local/bin目录

cp -ra /root/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
#拷贝后会有四个执行文件

#查看目录权限
ll /usr/local/bin/scripts/

-rwxr-xr-x. 1 1001 1001  3648 May 31  2015 master_ip_failover            #自动切换时VIP管理的脚本
-rwxr-xr-x. 1 1001 1001  9870 May 31  2015 master_ip_online_change      #在线切换时VIP的管理
-rwxr-xr-x. 1 1001 1001 11867 May 31  2015 power_manager           #故障发生后关闭主机的脚本
-rwxr-xr-x. 1 1001 1001  1360 May 31  2015 send_report             #因故障切换后发送警报的脚本

2.复制上述的自动切换时VIP管理的脚本到/usr/local/bin目录,这里使用脚本管理VIP

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

3.重新编写 master_ip_failover 脚本:(删除原有内容,直接写入下述内容)

vim /usr/local/bin/master_ip_failover 

#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);

#添加内容部分
my $vip = '192.168.142.200';
my $brdc = '192.168.142.255';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
#A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

4.创建MHA软件目录并拷贝配置文件

mkdir /etc/masterha
cp /root/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha
vim /etc/masterha/app1.cnf 

[server default]
#manager配置文件
manager_log=/var/log/masterha/app1/manager.log     

#manager日志
manager_workdir=/var/log/masterha/app1

#master保存binlog的位置,这里的路径要与master里配置的bilog的相同
master_binlog_dir=/usr/local/mysql/data

#设置自动failover时候的切换脚本。也就是上边的那个脚本
master_ip_failover_script=/usr/local/bin/master_ip_failover

#设置手动切换时候的切换脚本
master_ip_online_change_script=/usr/local/bin/master_ip_online_change

#这个密码是前文中创建监控用户的那个密码
password=manager
remote_workdir=/tmp

#设置复制用户密码
repl_password=123

#设置复制用户的用户
repl_user=myslave

#设置发生切换后发生报警的脚本
reporl_script=/usr/local/send_report
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.45.130 -s 192.168.45.134

#设置故障发生关闭故障脚本主机
shutdown_script=""

#设置ssh的登录用户名
ssh_user=root

#设置监控用户
user=mha

[server1]
hostname=192.168.142.
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.142.
port=3306

[server3]
hostname=192.168.142.
port=3306

5.测试ssh无密码认证

masterha_check_ssh -conf=/etc/masterha/app1.cnf
masterha_check_repl -conf=/etc/masterha/app1.cnf

#注意:第一次配置需要去master上手动开启虚拟IP
/sbin/ifconfig ens33:1 192.168.142.200/24

6.启动MHA

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

7.查看MHA状态,可以看到当前的master是mysql1节点

masterha_check_status --conf=/etc/masterha/app1.cnf

8.查看MHA日志,也可以看到当前的master是192.168.142.130

cat /var/log/masterha/app1/manager.log 

第九步:故障模拟

1、启动监控观察日志记录

tailf /var/log/masterha/app1/manager.log

2、查看地址变化

pkill -9 mysql        
#宕掉mysql服务      

VIP地址不会因为manager节点停止MHA服务而消失,VIP地址会转移到slave1上

#从服务器查看vip地址转移
ifconfig

ens33: flags=4163  mtu 1500
        inet 192.168.142.131  netmask 255.255.255.0  broadcast 192.168.142.255
        inet6 fe80::b81a:9df:a960:45ac  prefixlen 64  scopeid 0x20
        ether 00:0c:29:97:8e:66  txqueuelen 1000  (Ethernet)
        RX packets 1687418  bytes 1157627305 (1.0 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1376468  bytes 170996461 (163.0 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33:1: flags=4163  mtu 1500
        inet 192.168.142.200  netmask 255.255.255.0  broadcast 192.168.142.255
        ether 00:0c:29:97:8e:66  txqueuelen 1000  (Ethernet)

3、在mha-manager上开启另外一个新的终端,直接yum安装一个mysql

 yum install mysql -y

#在slave1上赋予权限,要不然mha-manager这边是进不到数据库的:
grant all on *.* to 'root'@'%' identified by 'abc123';

#在mh-manager上进行登录:
 mysql -h 192.168.142.200 -uroot -p
Enter password:             
#输入密码

(1)创建个数据库school,并创建个表info,写一下简单的内容

MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)

MySQL [(none)]> create database school;
Query OK, 1 row affected (0.00 sec)

MySQL [(none)]> use school;
Database changed
MySQL [school]> create table info (id int);
Query OK, 0 rows affected (0.01 sec)

(2)创建好以后再slave1上的数据库中查看,会同步数据

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| school             |
| test               |
+--------------------+

(3)因为slave1和slave2之间是相互同步的,所以在slave2上数据也应该同步

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| school             |
| test               |
+--------------------+