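The previous article covered MySQL master-slave replication and read/write splitting. That architecture keeps the servers in sync, but the master is a single point of failure: once it goes down, no writes can be served. Can a slave switch roles and take over as the new master automatically so that service continues? It can, and this is what the MHA architecture is for.
MHA (Master High Availability) is a well-regarded tool for failover and master promotion in MySQL high-availability environments. When the master fails, MHA can complete the failover automatically within 0 to 30 seconds.
It consists of two parts: MHA Manager (the management node) and MHA Node (the data node).
Features of MHA:
In summary, we need at least four hosts. Three of them form the replication group (one master and two slaves, one of which is the candidate master that becomes the new master if the master goes down), and the fourth is the manager, which monitors and manages the database cluster. The manager watches the MySQL servers and switches over automatically on failure, so the business is not affected.
Overall approach
1) Install MySQL
2) Configure MySQL with one master and two slaves
3) Install MHA
4) Configure passwordless SSH login
5) Configure MySQL MHA high availability
6) Simulate a master failure and failover
Environment
The installation below is demonstrated on Mysql1 only.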
[root@Mysql1 ~]# yum -y install ncurses-devel gcc-c++ perl-Module-Install
[root@Mysql1 ~]# tar zxvf cmake-2.8.6.tar.gz
[root@Mysql1 ~]# cd cmake-2.8.6
[root@Mysql1 cmake-2.8.6]# ./configure
[root@Mysql1 cmake-2.8.6]# gmake && gmake install
[root@Mysql1 ~]# tar -zxvf mysql-5.6.36.tar.gz
[root@Mysql1 ~]# cd mysql-5.6.36
[root@Mysql1 mysql-5.6.36]# cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql -DDEFAULT_CHARSET=utf8 -DDEFAULT_COLLATION=utf8_general_ci -DWITH_EXTRA_CHARSETS=all -DSYSCONFDIR=/etc
[root@Mysql1 mysql-5.6.36]# make && make install
[root@Mysql1 mysql-5.6.36]# cp support-files/my-default.cnf /etc/my.cnf
[root@Mysql1 mysql-5.6.36]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
[root@Mysql1 ~]# chmod +x /etc/rc.d/init.d/mysqld
[root@Mysql1 ~]# chkconfig --add mysqld
[root@Mysql1 ~]# echo "PATH=$PATH:/usr/local/mysql/bin" >> /etc/profile
[root@Mysql1 ~]# source /etc/profile
[root@Mysql1 ~]# groupadd mysql
[root@Mysql1 ~]# useradd -M -s /sbin/nologin mysql -g mysql
[root@Mysql1 ~]# chown -R mysql.mysql /usr/local/mysql
[root@Mysql1 ~]# mkdir -p /data/mysql
[root@Mysql1 ~]# /usr/local/mysql/scripts/mysql_install_db --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --user=mysql
Mysql1:master
vi /etc/my.cnf
[client]
port = 3306
socket = /usr/local/mysql/mysql.sock
[mysql]
port = 3306
socket = /usr/local/mysql/mysql.sock
[mysqld]
user = mysql
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
pid-file = /usr/local/mysql/mysqld.pid
socket = /usr/local/mysql/mysql.sock
server-id = 1
log_bin = master-bin
log-slave-updates = true
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,PIPES_AS_CONCAT,ANSI_QUOTES
Mysql2:slave1
vi /etc/my.cnf
[client]
port = 3306
socket = /usr/local/mysql/mysql.sock
[mysql]
port = 3306
socket = /usr/local/mysql/mysql.sock
[mysqld]
user = mysql
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
pid-file = /usr/local/mysql/mysqld.pid
socket = /usr/local/mysql/mysql.sock
server-id = 2
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,PIPES_AS_CONCAT,ANSI_QUOTES
Mysql3:slave2
vi /etc/my.cnf
[client]
port = 3306
socket = /usr/local/mysql/mysql.sock
[mysql]
port = 3306
socket = /usr/local/mysql/mysql.sock
[mysqld]
user = mysql
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
pid-file = /usr/local/mysql/mysqld.pid
socket = /usr/local/mysql/mysql.sock
server-id = 3
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,PIPES_AS_CONCAT,ANSI_QUOTES
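The three my.cnf files above differ only in their replication lines. As a rough sketch of that difference, the helper below prints the replication settings for one node; `repl_settings` is a hypothetical name introduced here for illustration, and the values are simply the ones used in this article.

```shell
# Print the replication-related [mysqld] lines for one node.
# Usage: repl_settings <master|slave> <server-id>
# (hypothetical helper; values copied from the my.cnf listings above)
repl_settings() {
    role=$1
    id=$2
    echo "server-id = $id"
    echo "log_bin = master-bin"
    case $role in
        master)
            echo "log-slave-updates = true"
            ;;
        slave)
            echo "relay-log = relay-log-bin"
            echo "relay-log-index = slave-relay-bin.index"
            ;;
        *)
            echo "unknown role: $role" >&2
            return 1
            ;;
    esac
}

repl_settings master 1
repl_settings slave 2
```

Each node must have a unique server-id; the rest of the file can be shared.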
[root@Mysql1 ~]# ln -s /usr/local/mysql/bin/mysql /usr/sbin/
[root@Mysql1 ~]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
[root@Mysql1 ~]# /usr/local/mysql/bin/mysqld_safe --user=mysql &
On every database node, grant privileges to two users: one is used by the slaves for replication, the other by the manager.
mysql> grant replication slave on *.* to 'myslave'@'192.168.247.%' identified by '123';
mysql> grant all privileges on *.* to 'mha'@'192.168.247.%' identified by 'manager';
mysql> flush privileges;
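In theory the following three grants are not needed. In this lab environment, however, the MHA replication check reported that the two slaves could not connect to the master by host name, so the grants below were added on all databases. (Mapping the host names in /etc/hosts also solves this.)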
mysql> grant all privileges on *.* to 'mha'@'Mysql1' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'Mysql2' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'Mysql3' identified by 'manager';
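Since the same grants have to be repeated on every database node, they can be generated instead of typed. The following is only a sketch: `mha_grants` is a made-up helper name, and the account and password are the ones used in this article.

```shell
# Print one GRANT statement per host name for the MHA account,
# followed by FLUSH PRIVILEGES.
# (hypothetical helper; account 'mha' / password 'manager' as in the article)
mha_grants() {
    for host in "$@"; do
        printf "grant all privileges on *.* to 'mha'@'%s' identified by 'manager';\n" "$host"
    done
    echo "flush privileges;"
}

mha_grants Mysql1 Mysql2 Mysql3

# To apply on a node, assuming a local root login works:
#   mha_grants Mysql1 Mysql2 Mysql3 | mysql -uroot -p
```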
On Mysql1, check the binary log file name and position:
mysql> show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 | 1491 | | | |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
Next, set up replication on Mysql2 and Mysql3, using the same statement as in the previous article.
mysql> change master to
master_host='192.168.247.130',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=1491;
mysql> start slave;
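The file name and position in CHANGE MASTER TO must match what SHOW MASTER STATUS reported on the master. A minimal sketch of deriving the statement from that table, so the two cannot drift apart; `build_change_master` is a hypothetical helper, and the replication account is the one from this article.

```shell
# Read the table printed by "show master status" on stdin and print the
# matching CHANGE MASTER TO statement for the given master host.
# (hypothetical helper; user 'myslave' / password '123' as in the article)
build_change_master() {
    master_host=$1
    awk -F'|' -v host="$master_host" '
        # Only the data row has a purely numeric Position in field 3.
        $3 ~ /^ *[0-9]+ *$/ {
            gsub(/ /, "", $2); gsub(/ /, "", $3)
            # \047 is the octal escape for a single quote.
            printf "change master to master_host=\047%s\047,master_user=\047myslave\047,master_password=\047123\047,master_log_file=\047%s\047,master_log_pos=%s;\n", host, $2, $3
            exit
        }'
}

build_change_master 192.168.247.130 <<'EOF'
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 |     1491 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
EOF
```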
Check both slaves. Replication is working when Slave_IO_Running and Slave_SQL_Running both show Yes.
mysql> show slave status\G
......
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
......
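The same check can be scripted. The sketch below reads the text of "show slave status\G" and succeeds only when both threads report Yes; `replication_ok` is a name chosen here for illustration, not something MySQL or MHA provides.

```shell
# Succeed only if both replication threads report "Yes" in the
# "show slave status\G" text read from stdin.
# (hypothetical helper)
replication_ok() {
    status=$(cat)
    printf '%s\n' "$status" | grep -Eq '^ *Slave_IO_Running: Yes *$' &&
    printf '%s\n' "$status" | grep -Eq '^ *Slave_SQL_Running: Yes *$'
}

# Example with captured output; on a slave one would instead pipe in
#   mysql -uroot -p -e 'show slave status\G'
if printf 'Slave_IO_Running: Yes\nSlave_SQL_Running: Yes\n' | replication_ok; then
    echo "replication threads are running"
fi
```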
Both slaves must be set to read-only mode:
mysql> set global read_only=1;
Insert two rows on the Mysql1 master to test that they replicate. The replication setup is now complete.
[root@MHA-manager ~]# yum install epel-release --nogpgcheck
[root@MHA-manager ~]# yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN
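The MHA package differs by operating system version; for CentOS 7.3 here, version 0.57 must be used.
The node component must be installed first on all servers (all four hosts), because the manager depends on it.
The node installation below is demonstrated on Mysql1.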
[root@Mysql1 ~]# tar zxvf mha4mysql-node-0.57.tar.gz
[root@Mysql1 ~]# cd mha4mysql-node-0.57
[root@Mysql1 mha4mysql-node-0.57]# perl Makefile.PL
[root@Mysql1 mha4mysql-node-0.57]# make
[root@Mysql1 mha4mysql-node-0.57]# make install
Install the manager component on MHA-manager
[root@MHA-manager ~]# tar zxvf mha4mysql-manager-0.57.tar.gz
[root@MHA-manager ~]# cd mha4mysql-manager-0.57
[root@MHA-manager mha4mysql-manager-0.57]# perl Makefile.PL
*** Module::AutoInstall version 1.06
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.627)
- DBD::mysql ...loaded. (4.023)
- Time::HiRes ...loaded. (1.9725)
- Config::Tiny ...loaded. (2.14)
- Log::Dispatch ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (7.18)
- MHA::NodeConst ...loaded. (0.57)
*** Module::AutoInstall configuration finished.
Checking if your kit is complete...
Looks good
Writing Makefile for mha4mysql::manager
[root@MHA-manager mha4mysql-manager-0.57]# make
[root@MHA-manager mha4mysql-manager-0.57]# make install
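After installation, the manager places several tools under /usr/local/bin, mainly:
masterha_check_ssh //checks the MHA SSH configuration
masterha_check_repl //checks the MySQL replication status
masterha_manager //starts the manager
masterha_check_status //checks the current MHA running state
masterha_master_monitor //checks whether the master is down
masterha_master_switch //controls failover (automatic or manual)
masterha_conf_host //adds or removes server entries in the configuration
masterha_stop //stops the manager
The node installation also places several scripts under /usr/local/bin (they are normally triggered by the MHA Manager scripts and need no manual operation), mainly:
save_binary_logs //saves and copies the master's binary logs
apply_diff_relay_logs //identifies differing relay log events and applies them to the other slaves
filter_mysqlbinlog //removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs //purges relay logs (without blocking the SQL thread)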
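A quick way to confirm that the installation put these tools on the PATH is sketched below. `missing_tools` is a hypothetical helper; the tool names are the ones listed above, and on the database nodes only the node-side names would be expected.

```shell
# Print every name from the argument list that is not found on PATH.
# (hypothetical helper)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Manager-side and node-side tools named in the lists above.
missing=$(missing_tools masterha_check_ssh masterha_check_repl masterha_manager \
    masterha_check_status masterha_stop save_binary_logs apply_diff_relay_logs \
    purge_relay_logs)
if [ -n "$missing" ]; then
    echo "not installed:"
    echo "$missing"
fi
```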
On the manager, configure passwordless authentication to all nodes
[root@MHA-manager ~]# ssh-keygen -t rsa //press Enter at every prompt
[root@MHA-manager ~]# ssh-copy-id 192.168.247.130
[root@MHA-manager ~]# ssh-copy-id 192.168.247.140
[root@MHA-manager ~]# ssh-copy-id 192.168.247.150
#Verify that login works
[root@MHA-manager ~]# ssh root@192.168.247.130
[root@MHA-manager ~]# ssh root@192.168.247.140
[root@MHA-manager ~]# ssh root@192.168.247.150
On Mysql1, configure passwordless authentication to the other database nodes
[root@Mysql1 ~]# ssh-keygen -t rsa
[root@Mysql1 ~]# ssh-copy-id 192.168.247.140
[root@Mysql1 ~]# ssh-copy-id 192.168.247.150
#Verify that login works
[root@Mysql1 ~]# ssh root@192.168.247.140
[root@Mysql1 ~]# ssh root@192.168.247.150
On Mysql2, configure passwordless authentication to the other database nodes
[root@Mysql2 ~]# ssh-keygen -t rsa
[root@Mysql2 ~]# ssh-copy-id 192.168.247.130
[root@Mysql2 ~]# ssh-copy-id 192.168.247.150
#Verify that login works
[root@Mysql2 ~]# ssh root@192.168.247.130
[root@Mysql2 ~]# ssh root@192.168.247.150
On Mysql3, configure passwordless authentication to the other database nodes
[root@Mysql3 ~]# ssh-keygen -t rsa
[root@Mysql3 ~]# ssh-copy-id 192.168.247.130
[root@Mysql3 ~]# ssh-copy-id 192.168.247.140
#Verify that login works
[root@Mysql3 ~]# ssh root@192.168.247.130
[root@Mysql3 ~]# ssh root@192.168.247.140
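The key exchange above has to cover every ordered pair of database nodes, which is easy to get wrong by hand. The sketch below enumerates those pairs; `ssh_pairs` is a hypothetical helper, and the commented loop is one possible way to probe each pair from the manager, assuming the manager itself can already reach every node without a password.

```shell
# Print every ordered "source destination" pair among the given hosts.
# (hypothetical helper)
ssh_pairs() {
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] || echo "$src $dst"
        done
    done
}

ssh_pairs 192.168.247.130 192.168.247.140 192.168.247.150

# Possible probe, run on the manager. BatchMode makes ssh fail instead of
# prompting for a password:
#   ssh_pairs 192.168.247.130 192.168.247.140 192.168.247.150 |
#   while read -r src dst; do
#       ssh -n -o BatchMode=yes root@"$src" \
#           "ssh -o BatchMode=yes -o ConnectTimeout=5 root@$dst true" \
#           || echo "FAILED: $src -> $dst"
#   done
```

With three database nodes this yields six pairs.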
[root@MHA-manager ~]# cp -ra /root/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
[root@MHA-manager ~]# ll /usr/local/bin/scripts/
total 32
-rwxr-xr-x 1 mysql mysql 3648 May 31 2015 master_ip_failover
-rwxr-xr-x 1 mysql mysql 9872 May 25 09:07 master_ip_online_change
-rwxr-xr-x 1 mysql mysql 11867 May 31 2015 power_manager
-rwxr-xr-x 1 mysql mysql 1360 May 31 2015 send_report
master_ip_failover #manages the VIP during automatic failover
master_ip_online_change #manages the VIP during an online switch
power_manager #shuts down the host after a failure
send_report #sends an alert after a failover
[root@MHA-manager ~]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
Edit the script as follows (it is easiest to delete everything and paste in the content below):
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
############################# Added section ######################################
my $vip = '192.168.247.200'; # virtual IP (VIP) that moves to whichever server is the master
my $brdc = '192.168.247.255';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
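The script above switches the VIP with ifconfig and keeps an iproute2 variant commented out. As a small sketch of that variant, the helpers below build the two command strings; `vip_start_cmd` and `vip_stop_cmd` are hypothetical names, and the /24 prefix and ens33 device are the values used in this article.

```shell
# Build the command strings that add or remove the VIP with iproute2.
# (hypothetical helpers; they only print the commands, they do not run them)
vip_start_cmd() {  # args: vip prefix-length device key
    echo "/usr/sbin/ip addr add $1/$2 dev $3 label $3:$4"
}
vip_stop_cmd() {   # args: vip prefix-length device
    echo "/usr/sbin/ip addr del $1/$2 dev $3"
}

vip_start_cmd 192.168.247.200 24 ens33 1
vip_stop_cmd 192.168.247.200 24 ens33
```

Printing the commands first makes it possible to review them before they are placed into $ssh_start_vip and $ssh_stop_vip.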
[root@MHA-manager ~]# mkdir /etc/masterha
[root@MHA-manager ~]# cp /root/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha
Edit the configuration file
[root@MHA-manager ~]# vi /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script= /usr/local/bin/master_ip_failover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change
password=manager
user=mha
ping_interval=1
remote_workdir=/tmp
repl_password=123
repl_user=myslave
secondary_check_script= /usr/local/bin/masterha_secondary_check -s 192.168.247.140 -s 192.168.247.150
# the secondary check goes through the two slaves
shutdown_script=""
ssh_user=root
# master
[server1]
hostname=192.168.247.130
port=3306
# slave and candidate master
[server2]
hostname=192.168.247.140
port=3306
# settings that make this node the candidate master
candidate_master=1
check_repl_delay=0
# slave
[server3]
hostname=192.168.247.150
port=3306
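Before running the MHA checks it can help to confirm which hosts the configuration file actually declares. The sketch below lists the hostname of every [serverN] block; `list_mha_hosts` is a hypothetical helper and only understands the simple key=value layout shown above.

```shell
# Print "section hostname" for every [serverN] block in an MHA config
# read from stdin. (hypothetical helper)
list_mha_hosts() {
    awk '
        /^\[server[0-9]+\]/ { section = substr($0, 2, index($0, "]") - 2); next }
        /^\[/               { section = ""; next }
        section != "" && /^hostname *=/ {
            sub(/^hostname *= */, "")
            print section, $0
        }'
}

# Example on a fragment; on the manager one would run
#   list_mha_hosts < /etc/masterha/app1.cnf
list_mha_hosts <<'EOF'
[server default]
user=mha
[server1]
hostname=192.168.247.130
port=3306
[server2]
hostname=192.168.247.140
port=3306
EOF
```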
Test the passwordless SSH setup. If everything is fine, the output ends with "successfully", as shown below.
[root@MHA-manager ~]# masterha_check_ssh -conf=/etc/masterha/app1.cnf
Thu Oct 29 23:03:22 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Oct 29 23:03:22 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Oct 29 23:03:22 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Oct 29 23:03:22 2020 - [info] Starting SSH connection tests..
Thu Oct 29 23:03:24 2020 - [debug]
Thu Oct 29 23:03:22 2020 - [debug] Connecting via SSH from root@192.168.247.130(192.168.247.130:22) to root@192.168.247.140(192.168.247.140:22)..
Thu Oct 29 23:03:23 2020 - [debug] ok.
Thu Oct 29 23:03:23 2020 - [debug] Connecting via SSH from root@192.168.247.130(192.168.247.130:22) to root@192.168.247.150(192.168.247.150:22)..
Thu Oct 29 23:03:23 2020 - [debug] ok.
Thu Oct 29 23:03:25 2020 - [debug]
Thu Oct 29 23:03:23 2020 - [debug] Connecting via SSH from root@192.168.247.140(192.168.247.140:22) to root@192.168.247.130(192.168.247.130:22)..
Thu Oct 29 23:03:23 2020 - [debug] ok.
Thu Oct 29 23:03:23 2020 - [debug] Connecting via SSH from root@192.168.247.140(192.168.247.140:22) to root@192.168.247.150(192.168.247.150:22)..
Thu Oct 29 23:03:24 2020 - [debug] ok.
Thu Oct 29 23:03:25 2020 - [debug]
Thu Oct 29 23:03:23 2020 - [debug] Connecting via SSH from root@192.168.247.150(192.168.247.150:22) to root@192.168.247.130(192.168.247.130:22)..
Thu Oct 29 23:03:24 2020 - [debug] ok.
Thu Oct 29 23:03:24 2020 - [debug] Connecting via SSH from root@192.168.247.150(192.168.247.150:22) to root@192.168.247.140(192.168.247.140:22)..
Thu Oct 29 23:03:24 2020 - [debug] ok.
Thu Oct 29 23:03:25 2020 - [info] All SSH connection tests passed successfully.
Run the replication health check. The line "MySQL Replication Health is OK." at the end means replication is healthy.
[root@MHA-manager ~]# masterha_check_repl -conf=/etc/masterha/app1.cnf
......
Thu Oct 29 23:04:46 2020 - [info] Slaves settings check done.
Thu Oct 29 23:04:46 2020 - [info]
#the current master is shown here
192.168.247.130(192.168.247.130:3306) (current master)
+--192.168.247.140(192.168.247.140:3306)
+--192.168.247.150(192.168.247.150:3306)
......
#the VIP appears here
IN SCRIPT TEST====/sbin/ifconfig ens33:1 down==/sbin/ifconfig ens33:1 192.168.247.200===
Checking the Status of the script.. OK
Thu Oct 29 23:04:46 2020 - [info] OK.
Thu Oct 29 23:04:46 2020 - [warning] shutdown_script is not defined.
Thu Oct 29 23:04:46 2020 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
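Both checks end with a fixed success line, so a wrapper can refuse to start the manager unless that line appears. A minimal sketch, with `check_passed` as a hypothetical helper that only inspects text it is given:

```shell
# Succeed only if the text on stdin contains the success line that the
# named check prints at the end. (hypothetical helper)
check_passed() {
    case $1 in
        ssh)  grep -qF 'All SSH connection tests passed successfully.' ;;
        repl) grep -qF 'MySQL Replication Health is OK.' ;;
        *)    echo "unknown check: $1" >&2; return 2 ;;
    esac
}

# Example with captured output; on the manager one could run
#   masterha_check_repl -conf=/etc/masterha/app1.cnf 2>&1 | check_passed repl
if echo 'MySQL Replication Health is OK.' | check_passed repl; then
    echo "replication check passed"
fi
```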
[root@MHA-manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
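--remove_dead_master_conf #After a failover, the old master's IP is removed from the configuration file.
--manager_log #Location of the log file.
--ignore_last_failover #By default, if MHA detects consecutive failures less than 8 hours apart, it does not fail over; this limit avoids a ping-pong effect. After a failover, MHA writes an app1.failover.complete file into the log directory configured above. While that file exists, the next failover is refused unless the file is deleted manually after the first one. This option tells MHA to ignore the file left by the last failover; it is set here for convenience.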
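When --ignore_last_failover is not used, it is useful to know whether such a marker file is in the way. The sketch below checks for a marker younger than 8 hours; `recent_failover_marker` is a hypothetical helper, and it assumes the marker sits in manager_workdir as described above.

```shell
# Succeed if <workdir>/<app>.failover.complete exists and was modified
# within the last 8 hours (480 minutes). (hypothetical helper)
recent_failover_marker() {
    workdir=$1
    app=$2
    marker="$workdir/$app.failover.complete"
    [ -f "$marker" ] && [ -n "$(find "$marker" -mmin -480)" ]
}

if recent_failover_marker /var/log/masterha/app1 app1; then
    echo "a recent failover marker exists"
fi
```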
Check the MHA status; the current master is the Mysql1 node.
[root@MHA-manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:16318) is running(0:PING_OK), master:192.168.247.130
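The status line has a stable shape, so the current master can be extracted from it in a script. A sketch based on the output shown above; `current_master` is a hypothetical helper.

```shell
# Print the master address from a line of masterha_check_status output;
# fail if the manager is not reported as running. (hypothetical helper)
current_master() {
    sed -n 's/.*is running(0:PING_OK), master:\(.*\)$/\1/p' | grep .
}

# Example with the line shown above; on the manager one could run
#   masterha_check_status --conf=/etc/masterha/app1.cnf | current_master
echo 'app1 (pid:16318) is running(0:PING_OK), master:192.168.247.130' | current_master
```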
The MHA log also shows that the current master is 192.168.247.130.
[root@MHA-manager ~]# cat /var/log/masterha/app1/manager.log
For the first start, bring up the VIP manually on the master:
[root@Mysql1 ~]# ifconfig ens33:1 192.168.247.200/24
[root@Mysql1 ~]# ifconfig
On the master: take the master down
[root@Mysql1 ~]# pkill mysqld
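Check the state of the slaves
Slave 1 (candidate master):
mysql> show slave status;
Empty set (0.00 sec)
#No slave status is shown, because this server has already been promoted to master automatically
#Check that the VIP 192.168.247.200, which was originally on the master, has moved to the candidate master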
[root@Mysql2 ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.247.140 netmask 255.255.255.0 broadcast 192.168.247.255
inet6 fe80::9f21:19b9:688b:aed5 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:fe:3c:8c txqueuelen 1000 (Ethernet)
RX packets 98489 bytes 116376334 (110.9 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 40408 bytes 5256587 (5.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.247.200 netmask 255.255.255.0 broadcast 192.168.247.255
ether 00:0c:29:fe:3c:8c txqueuelen 1000 (Ethernet)
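Whether the VIP has arrived on a host can also be tested in a script rather than by reading the interface listing. A sketch that accepts either ifconfig or "ip addr" style output; `has_vip` is a hypothetical helper.

```shell
# Succeed if the interface listing on stdin (from ifconfig or "ip addr")
# contains the given IPv4 address. (hypothetical helper)
has_vip() {
    vip=$1
    # ifconfig prints "inet ADDR  netmask ...", ip prints "inet ADDR/PREFIX ..."
    grep -qF -e "inet $vip " -e "inet $vip/"
}

# Example with lines taken from the output above; on Mysql2 one could run
#   ifconfig | has_vip 192.168.247.200
has_vip 192.168.247.200 <<'EOF' && echo "VIP is present"
inet 192.168.247.140  netmask 255.255.255.0  broadcast 192.168.247.255
inet 192.168.247.200  netmask 255.255.255.0  broadcast 192.168.247.255
EOF
```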
Slave 2:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.247.140
Master_User: myslave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 2778
Relay_Log_File: relay-log-bin.000002
Relay_Log_Pos: 284
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.....
#The master address now shows the candidate master's address instead of the failed 192.168.247.130
The process can also be followed in the manager log:
---- Failover Report -----
app1: MySQL Master failover 192.168.247.130(192.168.247.130:3306) to 192.168.247.140(192.168.247.140:3306) succeeded
Master 192.168.247.130(192.168.247.130:3306) is down!
Check MHA Manager logs at maneger:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 192.168.247.130(192.168.247.130:3306)
The latest slave 192.168.247.140(192.168.247.140:3306) has all relay logs for recovery.
Selected 192.168.247.140(192.168.247.140:3306) as a new master.
192.168.247.140(192.168.247.140:3306): OK: Applying all logs succeeded.
192.168.247.140(192.168.247.140:3306): OK: Activated master IP address.
192.168.247.150(192.168.247.150:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.247.150(192.168.247.150:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.247.140(192.168.247.140:3306)
192.168.247.140(192.168.247.140:3306): Resetting slave info succeeded.
Master failover to 192.168.247.140(192.168.247.140:3306) completed successfully.
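Restart the original master database, Mysql1
[root@Mysql1 ~]# systemctl restart mysqld
Synchronize it with the new master (Mysql2) so that it becomes a slave
Check the binary log file name and position on Mysql2 (the new master)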
mysql> show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 | 2778 | | | |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
Set up replication on Mysql1 (the original master). Replication is working when Slave_IO_Running and Slave_SQL_Running both show Yes.
mysql> change master to
-> master_host='192.168.247.140',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=2778;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.247.140
Master_User: myslave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: master-bin.000001
Read_Master_Log_Pos: 2778
Relay_Log_File: mysqld-relay-bin.000002
Relay_Log_Pos: 284
Relay_Master_Log_File: master-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
......
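The new replication topology is now in place: the master is Mysql2, and the slaves are Mysql1 and Mysql3.
After performing one failover, MHA Manager stops automatically
[root@MHA-manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
To keep MHA protection in place, so that a failure of Mysql2 can fail over to Mysql1, edit the configuration file on the manager server and add the Mysql1 entry back (the server1 block was removed automatically when server1 was detected as down). Then start MHA again to resume monitoring.
[root@MHA-manager ~]# vi /etc/masterha/app1.cnf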
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.247.130 -s 192.168.247.150
shutdown_script=""
ssh_user=root
user=mha
[server1]
candidate_master=1
check_repl_delay=0
hostname=192.168.247.130
port=3306
[server2]
hostname=192.168.247.140
port=3306
[server3]
hostname=192.168.247.150
port=3306
[root@MHA-manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 17500
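Putting the removed server block back by hand is a step that is easy to forget or to do twice. The sketch below appends the block only when the host is not already listed; `add_server_block` is a hypothetical helper written for this article's key=value layout. The masterha_conf_host tool listed earlier exists for the same purpose and may be the better choice in practice.

```shell
# Append a [section] block for host to an MHA config file, unless a line
# "hostname=<host>" is already present. (hypothetical helper)
add_server_block() {
    conf=$1
    section=$2
    host=$3
    if grep -qxF "hostname=$host" "$conf"; then
        return 0
    fi
    printf '\n[%s]\ncandidate_master=1\ncheck_repl_delay=0\nhostname=%s\nport=3306\n' \
        "$section" "$host" >> "$conf"
}

# Possible use on the manager before restarting masterha_manager:
#   add_server_block /etc/masterha/app1.cnf server1 192.168.247.130
```

Because the function checks for the hostname first, running it again after a later failover does not create duplicate blocks.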