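MHA node planning
Points to consider
Set up mutual passwordless SSH login between all nodes
Install a GTID-based master and slaves. The server_id values must differ. Slaves are read-only; after a successful failover, comment out read-only in the new master's configuration file
Install MHA
Configure a VIP (virtual IP) on the master that floats to the new master on failover
Set up a binlog server as the data compensation mechanism; it also needs the MySQL client tool mysqlbinlog
The master can fail in two ways: 1. the mysqld process dies but the binlogs can still be fetched; 2. the Linux server itself goes down
Log compensation is therefore needed, so that even when the Linux host is down the slaves and the new master can still obtain the old master's binary logs
Set up email alerts
Kill the master and verify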
1. Set up mutual passwordless SSH login on all nodes
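cat >~/for_public_key.sh <<END
#!/bin/bash
# PASS and HOSTLIST below are placeholders; set them for your environment
PASS='your_root_password'
HOSTLIST="10.0.0.7 10.0.0.17 10.0.0.27 10.0.0.37"
rpm -q sshpass &> /dev/null || yum -y install sshpass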
[ -f ~/.ssh/id_rsa ] || ssh-keygen -P "" -f /root/.ssh/id_rsa
for IP in \$HOSTLIST ;do
{
sshpass -p \${PASS} ssh-copy-id -o StrictHostKeyChecking=no -i /root/.ssh/id_rsa.pub root@\${IP} &>/dev/null
}&
done
wait
END
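After the keys have been distributed, it helps to confirm that each host really accepts a login without a password prompt. The following is a minimal sketch and not part of the original script: the function name check_ssh, the DRY_RUN switch, and the host list are assumptions made for illustration. With DRY_RUN=1 it only prints the commands it would run.

```shell
#!/bin/bash
# Hypothetical helper: verify passwordless SSH from this node to each host.
# HOSTLIST mirrors the four nodes used in this guide; adjust as needed.
HOSTLIST="10.0.0.7 10.0.0.17 10.0.0.27 10.0.0.37"

check_ssh() {
    rc=0
    for host in $HOSTLIST; do
        # BatchMode=yes makes ssh fail instead of prompting for a password
        cmd="ssh -o BatchMode=yes -o ConnectTimeout=3 root@${host} true"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd"
        elif $cmd </dev/null >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done
    return $rc
}
```

Run check_ssh on every node in turn (or DRY_RUN=1 check_ssh to preview); any FAIL line means that pair of hosts still needs its key copied.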
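2. Install GTID-based master-slave replication
2.1 Installation
slave1 has IP 10.0.0.27, so its server_id is 27
slave2 has IP 10.0.0.37, so its server_id is 37
Both slaves enable GTID and binary logging, and both are set read-only
The master has IP 10.0.0.17, so its server_id is 17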
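The server_id convention above (last octet of the IP) and the GTID settings can be sketched as a small generator for the [mysqld] section. This is only an illustration under assumptions: the helper name gen_mycnf and its role argument are hypothetical, the original configuration files are not shown in this guide, and log_slave_updates is included because GTID setups commonly enable it.

```shell
# Hypothetical helper: print a [mysqld] fragment for a node in this guide.
# usage: gen_mycnf <ip> <master|slave>
gen_mycnf() {
    ip=$1
    role=$2
    server_id=${ip##*.}          # last octet of the IP, e.g. 10.0.0.27 -> 27
    cat <<EOF
[mysqld]
server_id=${server_id}
log_bin=mysql-bin
gtid_mode=ON
enforce_gtid_consistency=ON
log_slave_updates=ON
EOF
    # slaves are read-only; the master must stay writable
    if [ "$role" = "slave" ]; then
        echo "read_only=1"
    fi
}

gen_mycnf 10.0.0.27 slave
```

The output would be merged into /etc/my.cnf on each node by hand; the function itself changes nothing on the system.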
2.2 Set up the master-slave relationship
Create the replication user on the master
grant replication slave on *.* to repl@'10.0.0.%' identified by '123';
On slave1 and slave2
Run change master to; the connection settings are written to master.info
change master to
master_host='10.0.0.17',
master_user='repl',
master_password='123' ,
MASTER_AUTO_POSITION=1;
start slave;
show slave status\G
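Replication is healthy only when both Slave_IO_Running and Slave_SQL_Running report Yes in that output. As a hedged sketch, the check can be scripted; the helper name repl_ok is an assumption, and it reads the status text from standard input so it can be tried without a running server.

```shell
# Hypothetical helper: read `show slave status\G` output on stdin and
# succeed only if both replication threads report Yes.
repl_ok() {
    status=$(cat)
    echo "$status" | grep -q 'Slave_IO_Running: Yes' &&
    echo "$status" | grep -q 'Slave_SQL_Running: Yes'
}

# Real use (on a slave):
#   mysql -e 'show slave status\G' | repl_ok && echo healthy
```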
3. Install MHA
3.1 Install MHA dependencies (on all nodes)
10.0.0.17 master: node
10.0.0.27 slave1: node
10.0.0.37 slave2: manager and node
10.0.0.7 binlog_server: node
Place the mha-node and mha-manager packages under 10.0.0.37:/root
# Perl driver for connecting to MySQL
yum install perl-DBD-MySQL -y
# install the node package on all nodes
cd /root
# copy the node package to each node
scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 10.0.0.27:/root
scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 10.0.0.17:/root
scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 10.0.0.7:/root
yum -y install mha4mysql-node-0.58-0.el7.centos.noarch.rpm
3.2 On the master
Create the mha user
mysql -e "grant all privileges on *.* to mha@'10.0.0.%' identified by 'mha';"
mysql -e 'select user,host from mysql.user'
3.3 On the manager
a. Install the dependencies and the manager
yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
cd /root/
yum -y install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
b. Prepare the MHA directories and files
# configuration directory
mkdir -p /etc/mha
# log directory
mkdir -p /var/log/mha/app1
cat > /etc/mha/app1.cnf <
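The contents of app1.cnf are not shown here. As a hedged sketch only, a minimal configuration consistent with the hosts, users, and log paths used elsewhere in this guide might look like the following. The master_binlog_dir path and the ping_interval value are assumptions, and the sketch writes to a temporary file unless CONF is set, so it can be tried without touching /etc/mha.

```shell
# Sketch only: writes to a temp file unless CONF is set (e.g. CONF=/etc/mha/app1.cnf).
CONF=${CONF:-$(mktemp)}
cat > "$CONF" <<EOF
[server default]
manager_log=/var/log/mha/app1/manager.log
manager_workdir=/var/log/mha/app1
# assumed location of the binlogs on the MySQL nodes
master_binlog_dir=/data/mysql
user=mha
password=mha
repl_user=repl
repl_password=123
ssh_user=root
# assumed health-check interval in seconds
ping_interval=2

[server1]
hostname=10.0.0.17
port=3306

[server2]
hostname=10.0.0.27
port=3306

[server3]
hostname=10.0.0.37
port=3306
EOF
echo "wrote $CONF"
```

The user/password pair is the mha account created on the master, and repl_user/repl_password is the replication account from step 2.2.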
c. Pre-start checks
masterha_check_ssh --conf=/etc/mha/app1.cnf # check passwordless SSH
masterha_check_repl --conf=/etc/mha/app1.cnf # check MySQL replication
d. Start MHA
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
e. Check the MHA running status
masterha_check_status --conf=/etc/mha/app1.cnf
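4. VIP + binlog server + mail
a. Prepare the master_ip_failover file (on the manager)
Take the sample VIP script from the official site, modify the places marked with # comments, and copy it to /usr/local/bin as master_ip_failover so that the MHA manager can call it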
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
# my $vip = '10.0.0.55/24';
my $vip = '10.0.0.100/24'; # the VIP address
my $key = '1'; # the VIP label, e.g. the 1 in eth0:1
# my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; # the NIC name must be the same on all three nodes
# my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; # bring the VIP up; the NIC name must be the same on all three nodes
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down"; # bring the VIP down
# my $ssh_Bcast_arp= "/sbin/arping -I ens33 -c 3 -A 10.0.0.55";
my $ssh_Bcast_arp= "/sbin/arping -I eth0 -c 3 -A 10.0.0.100";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
cp master_ip_failover /usr/local/bin/master_ip_failover
chmod +x /usr/local/bin/master_ip_failover
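The script above drives the VIP with ifconfig, which belongs to the net-tools package and may be missing on minimal installs. As a sketch under assumptions, the same start and stop actions can be expressed with iproute2; the helper name vip_cmd is hypothetical, and it only prints the command so that it can be reviewed before being run by hand or over ssh.

```shell
VIP=10.0.0.100/24
DEV=eth0
LABEL=eth0:1

# Hypothetical helper: print the iproute2 command for a VIP action.
vip_cmd() {
    case "$1" in
        start) echo "ip addr add $VIP dev $DEV label $LABEL" ;;
        stop)  echo "ip addr del $VIP dev $DEV" ;;
        *)     echo "usage: vip_cmd start|stop" >&2; return 1 ;;
    esac
}

vip_cmd start
vip_cmd stop
```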
b. Add the VIP manually (on the master)
ip addr add 10.0.0.100/24 dev eth0 label eth0:1
ip a |grep eth0
c. Test whether the VIP is usable
Test from another machine
mysql -umha -pmha -h10.0.0.100 -e 'select @@server_id;'
d. Prepare the mail script
Change the settings below to your own mailbox settings
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.qq.com';
my $mail_from='sender address';
my $mail_user='sender account name';
my $mail_pass='sender authorization code';
#my $mail_to=['recipient1@example.com','recipient2@example.com'];
my $mail_to='recipient address';
GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'subject=s' => \$subject,
'body=s' => \$body,
);
# Do whatever you want here
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
sub mailToContacts {
my ($smtp, $mail_from, $mail_user, $mail_pass, $mail_to, $subject, $msg ) = @_;
open my $DEBUG, ">/tmp/mail.log"
or die "Can't open the debug file:$!\n";
my $sender = new Mail::Sender {
ctype => 'text/plain;charset=utf-8',
encoding => 'utf-8',
smtp => $smtp,
from => $mail_from,
auth => 'LOGIN',
TLS_allowed => '0',
authid => $mail_user,
authpwd => $mail_pass,
to => $mail_to,
subject => $subject,
debug => $DEBUG
};
$sender->MailMsg(
{
msg => $msg,
debug => $DEBUG
}
) or print $Mail::Sender::Error;
return 1;
}
exit 0;
cp -a send_report /usr/local/bin/
chmod +x /usr/local/bin/send_report
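e. Data compensation: set up a binlog server for the master
Use an extra machine with MySQL 5.6 or later that supports GTID and has it enabled; here we use 10.0.0.7
Create the required directories on the binlog server (on binlog_server)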
mkdir -p /data/mysql/binlog
chown -R mysql.mysql /data/*
Pull the master's logs to the binlog server (on binlog_server)
cd /data/mysql/binlog # you must cd into the directory created above
nohup mysqlbinlog -R --host=10.0.0.17 --user=mha --password=mha --raw --stop-never mysql-bin.000001 &>/dev/null &
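# --host is the master's IP; --user and --password are the mha user and password created on the master
# Here all binary logs are pulled starting from mysql-bin.000001; in production, start from the master's latest binary log
# --raw: store the logs in raw binlog format so they can be used directly later
f. Update the MHA configuration file to add the VIP script, the mail script, and the binlog server parameters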
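To start from the master's latest binary log, as suggested for production, the current file name can be read from show master status. The following is a minimal sketch; the helper name current_binlog is an assumption, and it parses the tab-separated output of mysql -N from standard input.

```shell
# Hypothetical helper: read `mysql -N -e 'show master status'` output on
# stdin and print the current binlog file name (first column).
current_binlog() {
    awk 'NR==1 { print $1 }'
}

# Real use on the binlog server (hosts and credentials as in this guide):
#   start=$(mysql -N -h10.0.0.17 -umha -pmha -e 'show master status' | current_binlog)
#   nohup mysqlbinlog -R --host=10.0.0.17 --user=mha --password=mha \
#       --raw --stop-never "$start" >/dev/null 2>&1 &
```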
cat > /etc/mha/app1.cnf <
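The updated file contents are not shown here. As a hedged sketch, the three additions (VIP script, mail script, binlog server) can be applied to an existing app1.cnf as below. The helper name add_ha_options is hypothetical; the script paths and the binlog directory are the ones used in this guide, and no_master=1 is an assumption that keeps the binlog server from being chosen as a new master.

```shell
# Hypothetical helper: add the VIP script, mail script and binlog server
# to an existing app1.cnf. usage: add_ha_options <conf-file>
add_ha_options() {
    conf=$1
    tmp=$(mktemp)
    # put the two script options right under the [server default] header
    awk '{ print }
         /^\[server default\]/ {
             print "master_ip_failover_script=/usr/local/bin/master_ip_failover"
             print "report_script=/usr/local/bin/send_report"
         }' "$conf" > "$tmp"
    # the binlog server gets its own section
    cat >> "$tmp" <<EOF

[binlog1]
no_master=1
hostname=10.0.0.7
master_binlog_dir=/data/mysql/binlog
EOF
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

For example, add_ha_options /etc/mha/app1.cnf would be run once on the manager before restarting MHA.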
g. Restart MHA
masterha_stop --conf=/etc/mha/app1.cnf
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
masterha_check_status --conf=/etc/mha/app1.cnf
5. Test
a. Follow the manager log (on the manager)
tail -f /var/log/mha/app1/manager.log
b. Take down the master
pkill mysqld
c. Check the log (on the manager)
Below is the MHA log of the switch to a new master after the master 10.0.0.17 went down
[root@37 ~]# tail -f /var/log/mha/app1/manager.log -n0
Sat Jan 23 21:06:41 2021 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Sat Jan 23 21:06:41 2021 - [info] Executing SSH check script: exit 0
Sat Jan 23 21:06:41 2021 - [info] HealthCheck: SSH to 10.0.0.17 is reachable.
Sat Jan 23 21:06:43 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.0.0.17' (111))
Sat Jan 23 21:06:43 2021 - [warning] Connection failed 2 time(s)..
Sat Jan 23 21:06:45 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.0.0.17' (111))
Sat Jan 23 21:06:45 2021 - [warning] Connection failed 3 time(s)..
Sat Jan 23 21:06:47 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '10.0.0.17' (111))
Sat Jan 23 21:06:47 2021 - [warning] Connection failed 4 time(s)..
Sat Jan 23 21:06:47 2021 - [warning] Master is not reachable from health checker!
Sat Jan 23 21:06:47 2021 - [warning] Master 10.0.0.17(10.0.0.17:3306) is not reachable!
Sat Jan 23 21:06:47 2021 - [warning] SSH is reachable.
Sat Jan 23 21:06:47 2021 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.cnf again, and trying to connect to all servers to check server status..
Sat Jan 23 21:06:47 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jan 23 21:06:47 2021 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Sat Jan 23 21:06:47 2021 - [info] Reading server configuration from /etc/mha/app1.cnf..
Sat Jan 23 21:06:48 2021 - [info] GTID failover mode = 1
Sat Jan 23 21:06:48 2021 - [info] Dead Servers:
Sat Jan 23 21:06:48 2021 - [info] 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:48 2021 - [info] Alive Servers:
Sat Jan 23 21:06:48 2021 - [info] 10.0.0.27(10.0.0.27:3306)
Sat Jan 23 21:06:48 2021 - [info] 10.0.0.37(10.0.0.37:3306)
Sat Jan 23 21:06:48 2021 - [info] Alive Slaves:
Sat Jan 23 21:06:48 2021 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:48 2021 - [info] GTID ON
Sat Jan 23 21:06:48 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:48 2021 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:48 2021 - [info] GTID ON
Sat Jan 23 21:06:48 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:48 2021 - [info] Checking slave configurations..
Sat Jan 23 21:06:48 2021 - [info] read_only=1 is not set on slave 10.0.0.27(10.0.0.27:3306).
Sat Jan 23 21:06:48 2021 - [info] Checking replication filtering settings..
Sat Jan 23 21:06:48 2021 - [info] Replication filtering check ok.
Sat Jan 23 21:06:48 2021 - [info] Master is down!
Sat Jan 23 21:06:48 2021 - [info] Terminating monitoring script.
Sat Jan 23 21:06:48 2021 - [info] Got exit code 20 (Master dead).
Sat Jan 23 21:06:48 2021 - [info] MHA::MasterFailover version 0.58.
Sat Jan 23 21:06:48 2021 - [info] Starting master failover.
Sat Jan 23 21:06:48 2021 - [info]
Sat Jan 23 21:06:48 2021 - [info] * Phase 1: Configuration Check Phase..
Sat Jan 23 21:06:48 2021 - [info]
Sat Jan 23 21:06:48 2021 - [info] HealthCheck: SSH to 10.0.0.7 is reachable.
Sat Jan 23 21:06:49 2021 - [info] Binlog server 10.0.0.7 is reachable.
Sat Jan 23 21:06:50 2021 - [info] GTID failover mode = 1
Sat Jan 23 21:06:50 2021 - [info] Dead Servers:
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info] Checking master reachability via MySQL(double check)...
Sat Jan 23 21:06:50 2021 - [info] ok.
Sat Jan 23 21:06:50 2021 - [info] Alive Servers:
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.27(10.0.0.27:3306)
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.37(10.0.0.37:3306)
Sat Jan 23 21:06:50 2021 - [info] Alive Slaves:
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:50 2021 - [info] GTID ON
Sat Jan 23 21:06:50 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:50 2021 - [info] GTID ON
Sat Jan 23 21:06:50 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info] Starting GTID based failover.
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] ** Phase 1: Configuration Check Phase completed.
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] * Phase 2: Dead Master Shutdown Phase..
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] Forcing shutdown so that applications never connect to the current master..
Sat Jan 23 21:06:50 2021 - [info] Executing master IP deactivation script:
Sat Jan 23 21:06:50 2021 - [info] /usr/local/bin/master_ip_failover --orig_master_host=10.0.0.17 --orig_master_ip=10.0.0.17 --orig_master_port=3306 --command=stopssh --ssh_user=root
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 10.0.0.100/24===
Disabling the VIP on old master: 10.0.0.17
Sat Jan 23 21:06:50 2021 - [info] done.
Sat Jan 23 21:06:50 2021 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sat Jan 23 21:06:50 2021 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] * Phase 3: Master Recovery Phase..
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] The latest binary log file/position on all slaves is mysql-bin.000002:310
Sat Jan 23 21:06:50 2021 - [info] Retrieved Gtid Set: fdce59f6-5bde-11eb-a39b-000c29b493d9:1
Sat Jan 23 21:06:50 2021 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:50 2021 - [info] GTID ON
Sat Jan 23 21:06:50 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:50 2021 - [info] GTID ON
Sat Jan 23 21:06:50 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info] The oldest binary log file/position on all slaves is mysql-bin.000002:310
Sat Jan 23 21:06:50 2021 - [info] Retrieved Gtid Set: fdce59f6-5bde-11eb-a39b-000c29b493d9:1
Sat Jan 23 21:06:50 2021 - [info] Oldest slaves:
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.27(10.0.0.27:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:50 2021 - [info] GTID ON
Sat Jan 23 21:06:50 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info] 10.0.0.37(10.0.0.37:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Sat Jan 23 21:06:50 2021 - [info] GTID ON
Sat Jan 23 21:06:50 2021 - [info] Replicating from 10.0.0.17(10.0.0.17:3306)
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] * Phase 3.3: Determining New Master Phase..
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] Searching new master from slaves..
Sat Jan 23 21:06:50 2021 - [info] Candidate masters from the configuration file:
Sat Jan 23 21:06:50 2021 - [info] Non-candidate masters:
Sat Jan 23 21:06:50 2021 - [info] New master is 10.0.0.27(10.0.0.27:3306)
Sat Jan 23 21:06:50 2021 - [info] Starting master failover..
Sat Jan 23 21:06:50 2021 - [info]
From:
10.0.0.17(10.0.0.17:3306) (current master)
+--10.0.0.27(10.0.0.27:3306)
+--10.0.0.37(10.0.0.37:3306)
To:
10.0.0.27(10.0.0.27:3306) (new master)
+--10.0.0.37(10.0.0.37:3306)
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] * Phase 3.3: New Master Recovery Phase..
Sat Jan 23 21:06:50 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] Waiting all logs to be applied..
Sat Jan 23 21:06:50 2021 - [info] done.
Sat Jan 23 21:06:50 2021 - [info] -- Saving binlog from host 10.0.0.7 started, pid: 26690
Sat Jan 23 21:06:51 2021 - [info]
Sat Jan 23 21:06:51 2021 - [info] Log messages from 10.0.0.7 ...
Sat Jan 23 21:06:51 2021 - [info]
Sat Jan 23 21:06:50 2021 - [info] Fetching binary logs from binlog server 10.0.0.7..
Sat Jan 23 21:06:50 2021 - [info] Executing binlog save command: save_binary_logs --command=save --start_file=mysql-bin.000002 --start_pos=310 --output_file=/var/tmp/saved_binlog_binlog1_20210123210648.binlog --handle_raw_binlog=0 --skip_filter=1 --disable_log_bin=0 --manager_version=0.58 --oldest_version=5.7.31-log --binlog_dir=/data/mysql/binlog
Creating /var/tmp if not exists.. ok.
Concat binary/relay logs from mysql-bin.000002 pos 310 to mysql-bin.000002 EOF into /var/tmp/saved_binlog_binlog1_20210123210648.binlog ..
No additional binlog events found.
Event not exists.
Sat Jan 23 21:06:50 2021 - [info] Additional events were not found from the binlog server. No need to save.
Sat Jan 23 21:06:51 2021 - [info] End of log messages from 10.0.0.7.
Sat Jan 23 21:06:51 2021 - [info] No binlog events found from 10.0.0.7. Skipping
Sat Jan 23 21:06:51 2021 - [info] Getting new master's binlog name and position..
Sat Jan 23 21:06:51 2021 - [info] mysql-bin.000001:154
Sat Jan 23 21:06:51 2021 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.0.27', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Sat Jan 23 21:06:51 2021 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000001, 154, fdce59f6-5bde-11eb-a39b-000c29b493d9:1
Sat Jan 23 21:06:51 2021 - [info] Executing master IP activate script:
Sat Jan 23 21:06:51 2021 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=10.0.0.17 --orig_master_ip=10.0.0.17 --orig_master_port=3306 --new_master_host=10.0.0.27 --new_master_ip=10.0.0.27 --new_master_port=3306 --new_master_user='mha' --new_master_password=xxx
Unknown option: new_master_user
Unknown option: new_master_password
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 10.0.0.100/24===
Enabling the VIP - 10.0.0.100/24 on the new master - 10.0.0.27
Sat Jan 23 21:06:51 2021 - [info] OK.
Sat Jan 23 21:06:51 2021 - [info] ** Finished master recovery successfully.
Sat Jan 23 21:06:51 2021 - [info] * Phase 3: Master Recovery Phase completed.
Sat Jan 23 21:06:51 2021 - [info]
Sat Jan 23 21:06:51 2021 - [info] * Phase 4: Slaves Recovery Phase..
Sat Jan 23 21:06:51 2021 - [info]
Sat Jan 23 21:06:51 2021 - [info]
Sat Jan 23 21:06:51 2021 - [info] * Phase 4.1: Starting Slaves in parallel..
Sat Jan 23 21:06:51 2021 - [info]
Sat Jan 23 21:06:51 2021 - [info] -- Slave recovery on host 10.0.0.37(10.0.0.37:3306) started, pid: 26698. Check tmp log /var/log/mha/app1/10.0.0.37_3306_20210123210648.log if it takes time..
Sat Jan 23 21:06:53 2021 - [info]
Sat Jan 23 21:06:53 2021 - [info] Log messages from 10.0.0.37 ...
Sat Jan 23 21:06:53 2021 - [info]
Sat Jan 23 21:06:51 2021 - [info] Resetting slave 10.0.0.37(10.0.0.37:3306) and starting replication from the new master 10.0.0.27(10.0.0.27:3306)..
Sat Jan 23 21:06:51 2021 - [info] Executed CHANGE MASTER.
Sat Jan 23 21:06:52 2021 - [info] Slave started.
Sat Jan 23 21:06:52 2021 - [info] gtid_wait(fdce59f6-5bde-11eb-a39b-000c29b493d9:1) completed on 10.0.0.37(10.0.0.37:3306). Executed 0 events.
Sat Jan 23 21:06:53 2021 - [info] End of log messages from 10.0.0.37.
Sat Jan 23 21:06:53 2021 - [info] -- Slave on host 10.0.0.37(10.0.0.37:3306) started.
Sat Jan 23 21:06:53 2021 - [info] All new slave servers recovered successfully.
Sat Jan 23 21:06:53 2021 - [info]
Sat Jan 23 21:06:53 2021 - [info] * Phase 5: New master cleanup phase..
Sat Jan 23 21:06:53 2021 - [info]
Sat Jan 23 21:06:53 2021 - [info] Resetting slave info on the new master..
Sat Jan 23 21:06:53 2021 - [info] 10.0.0.27: Resetting slave info succeeded.
Sat Jan 23 21:06:53 2021 - [info] Master failover to 10.0.0.27(10.0.0.27:3306) completed successfully.
Sat Jan 23 21:06:53 2021 - [info] Deleted server1 entry from /etc/mha/app1.cnf .
Sat Jan 23 21:06:53 2021 - [info]
----- Failover Report -----
app1: MySQL Master failover 10.0.0.17(10.0.0.17:3306) to 10.0.0.27(10.0.0.27:3306) succeeded
Master 10.0.0.17(10.0.0.17:3306) is down!
Check MHA Manager logs at 37:/var/log/mha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 10.0.0.17(10.0.0.17:3306)
Selected 10.0.0.27(10.0.0.27:3306) as a new master.
10.0.0.27(10.0.0.27:3306): OK: Applying all logs succeeded.
10.0.0.27(10.0.0.27:3306): OK: Activated master IP address.
10.0.0.37(10.0.0.37:3306): OK: Slave started, replicating from 10.0.0.27(10.0.0.27:3306)
10.0.0.27(10.0.0.27:3306): Resetting slave info succeeded.
Master failover to 10.0.0.27(10.0.0.27:3306) completed successfully.
Sat Jan 23 21:06:53 2021 - [info] Sending mail..
Unknown option: conf
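A log like the one above is long, and the two lines that matter most are "New master is ..." and "Master failover to ... completed successfully.". As a sketch, they can be extracted with awk; the helper name failover_summary is an assumption, and it reads the log on standard input.

```shell
# Hypothetical helper: summarise a failover from an MHA manager log on stdin.
failover_summary() {
    awk '
        / New master is /                                { new = $NF }
        /Master failover to .* completed successfully\./ { ok = 1 }
        END {
            if (ok && new != "") { print "failover OK, new master: " new }
            else                 { print "no completed failover found"; exit 1 }
        }'
}

# Real use (on the manager):
#   failover_summary < /var/log/mha/app1/manager.log
```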
d. Check whether the master has switched (on the slaves)
10.0.0.27 becomes the new master
mysql -e 'show slave status\G'
# the command above returns no result on the new master
mysql -e 'show slave hosts'
# this shows that slave1 (10.0.0.27) has become the new master and that its slave is 10.0.0.37
e. Check whether the VIP has moved (on the new master)
[root@27 ~]# ifconfig eth0:1
eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.100 netmask 255.255.255.0 broadcast 10.0.0.255
ether 00:0c:29:2a:ed:d8 txqueuelen 1000 (Ethernet)
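f. Check whether the alert email was received
6. Operations on the new master
If the old server is no longer needed: the new master 10.0.0.27 used to be a slave, so read-only in its configuration file must be commented out.
To bring the old master back, redo the setup from the beginning.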
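Commenting out read-only on the new master can be scripted. The following is a minimal sketch; the helper name disable_read_only is an assumption, and it keeps a .bak copy of the file it edits.

```shell
# Hypothetical helper: comment out read_only / read-only lines in a my.cnf.
# usage: disable_read_only <my.cnf path>; a .bak copy of the file is kept.
disable_read_only() {
    cnf=$1
    cp "$cnf" "$cnf.bak"
    # prefix matching lines with '#', leave everything else untouched
    sed 's/^[[:space:]]*read[-_]only/#&/' "$cnf.bak" > "$cnf"
}
```

The file change only takes effect at the next restart; on the running server the setting can be turned off immediately with set global read_only=0;.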