Official sites. Manager: https://github.com/yoshinorim/mha4mysql-manager
Node: https://github.com/yoshinorim/mha4mysql-node
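The latest version at the time of writing is 0.57.
MHA (Master High Availability) is a relatively mature solution for MySQL high availability. It was developed in Perl by Yoshinori Matsunobu while at DeNA in Japan.
Its advantages (https://github.com/yoshinorim/mha4mysql-manager/wiki/Advantages):
Master failover completes quickly, typically within 10-30 seconds.
It minimizes data inconsistency. (When the current master crashes, MHA automatically identifies differential relay log events between slaves, and applies to each slave. So finally all slaves can be in sync, as long as all slave servers are alive. By using together with Semi-Synchronous Replication, (almost) no data loss can also be guaranteed.)
No changes to the existing MySQL architecture are required.
No additional servers are required.
There is no performance penalty.
It works with any storage engine.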
10.211.55.21 master
10.211.55.23 slave
10.211.55.24 slave
a, Download the source code from the official MySQL website
b, Add the mysql user and group
groupadd mysql
useradd -d /home/mysql -g mysql -m mysql
c, Create the standard directory layout
mkdir -p /u01/mysql3306/data
mkdir -p /u01/mysql3306/log/iblog
mkdir -p /u01/mysql3306/log/binlog
mkdir -p /u01/mysql3306/run
mkdir -p /u01/mysql3306/tmp
mkdir -p /u01/mysql3306/etc
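The same layout can be created in a loop, which makes it easier to reuse for a second instance on another port. This is only a minimal sketch: the function name `make_mysql_dirs` and its base-directory argument are illustrative assumptions, not part of the original steps.

```shell
# Hypothetical helper: create the standard directory layout under a base directory.
make_mysql_dirs() {
    base="$1"
    for sub in data log/iblog log/binlog run tmp etc; do
        mkdir -p "$base/$sub" || return 1
    done
}
```

For the instance in this walkthrough the call would be `make_mysql_dirs /u01/mysql3306`.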
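d, Upload the downloaded source archive to /root/ on the server and extract it
tar -zxvf ./mysql-5.7.19.tar.gz
e, Install cmake, the cross-platform build tool MySQL has used since 5.5, together with the other build dependencies
yum install -y cmake gcc gcc-c++ ncurses ncurses-devel bison zlib libxml openssl openssl-devel
f, Compile the source and install
mkdir -p /u01/mysql3306/boost_1_59_0/   # directory where the MySQL build downloads and stores boost_1_59_0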
cd mysql-5.7.19
# CMAKE_INSTALL_PREFIX: install directory; INSTALL_DATADIR: data directory
# DEFAULT_CHARSET / DEFAULT_COLLATION: default character set and collation
# WITH_SSL: SSL (secure sockets) support
# WITH_*_STORAGE_ENGINE: build in support for the named storage engine
# MYSQL_UNIX_ADDR: socket file; SYSCONFDIR: directory holding the my.cnf configuration file
cmake \
-DCMAKE_INSTALL_PREFIX=/u01/mysql3306 \
-DINSTALL_DATADIR=/u01/mysql3306/data \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DEXTRA_CHARSETS=all \
-DWITH_SSL=yes \
-DWITH_EMBEDDED_SERVER=1 \
-DENABLED_LOCAL_INFILE=1 \
-DWITH_MYISAM_STORAGE_ENGINE=1 \
-DWITH_INNOBASE_STORAGE_ENGINE=1 \
-DWITH_ARCHIVE_STORAGE_ENGINE=1 \
-DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
-DWITH_FEDERATED_STORAGE_ENGINE=1 \
-DWITH_PARTITION_STORAGE_ENGINE=1 \
-DMYSQL_UNIX_ADDR=/u01/mysql3306/run/mysql.sock \
-DMYSQL_TCP_PORT=3306 \
-DSYSCONFDIR=/u01/mysql3306/etc \
-DWITH_READLINE=on \
-DDOWNLOAD_BOOST=1 \
-DWITH_BOOST=/u01/mysql3306/boost_1_59_0/
make
make install
g, Give ownership of the MySQL directory to the mysql user
chown -R mysql:mysql /u01/mysql3306/
chmod -R 755 /u01/mysql3306/
h, Create the MySQL configuration file /u01/mysql3306/etc/my.cnf
[client]
port=3306
socket=/u01/mysql3306/run/mysql.sock
[mysql]
#pid_file=/u01/mysql3306/run/mysqld.pid
[mysqld]
autocommit=1
general_log=off
explicit_defaults_for_timestamp=true
# system
relay_log_purge=0
basedir=/u01/mysql3306
datadir=/u01/mysql3306/data
max_allowed_packet=1g
max_connections=3000
max_user_connections=2800
open_files_limit=65535
pid_file=/u01/mysql3306/run/mysqld.pid
port=3306
server_id=101
skip_name_resolve=ON
socket=/u01/mysql3306/run/mysql.sock
tmpdir=/u01/mysql3306/tmp
#binlog
log_bin=/u01/mysql3306/log/binlog/binlog
binlog_cache_size=32768
binlog_format=row
expire_logs_days=7
log_slave_updates=ON
max_binlog_cache_size=2147483648
max_binlog_size=524288000
sync_binlog=100
#logging
log_error=/u01/mysql3306/log/error.log
slow_query_log_file=/u01/mysql3306/log/slow.log
log_queries_not_using_indexes=0
slow_query_log=1
log_slave_updates=ON
log_slow_admin_statements=1
long_query_time=1
#relay
relay_log=/u01/mysql3306/log/relaylog
relay_log_index=/u01/mysql3306/log/relay.index
relay_log_info_file=/u01/mysql3306/log/relay-log.info
#slave
slave_load_tmpdir=/u01/mysql3306/tmp
slave_skip_errors=OFF
#innodb
innodb_data_home_dir=/u01/mysql3306/log/iblog
innodb_log_group_home_dir=/u01/mysql3306/log/iblog
innodb_adaptive_flushing=ON
innodb_adaptive_hash_index=ON
innodb_autoinc_lock_mode=1
innodb_buffer_pool_instances=8
#default
innodb_change_buffering=inserts
innodb_checksums=ON
innodb_buffer_pool_size= 128M
innodb_data_file_path=ibdata1:32M;ibdata2:16M:autoextend
innodb_doublewrite=ON
innodb_file_format=Barracuda
innodb_file_per_table=ON
innodb_flush_log_at_trx_commit=1
innodb_flush_method=O_DIRECT
innodb_io_capacity=1000
innodb_lock_wait_timeout=10
innodb_log_buffer_size=67108864
innodb_log_file_size=1048576000
innodb_log_files_in_group=4
innodb_max_dirty_pages_pct=60
innodb_open_files=60000
innodb_purge_threads=1
innodb_read_io_threads=4
innodb_stats_on_metadata=OFF
innodb_support_xa=ON
innodb_use_native_aio=OFF
innodb_write_io_threads=10
[mysqld_safe]
datadir=/u01/mysql3306/data
i, Initialize the database
/u01/mysql3306/bin/mysqld --defaults-file=/u01/mysql3306/etc/my.cnf --initialize --user=mysql --basedir=/u01/mysql3306/ --datadir=/u01/mysql3306/data
(This generates a temporary root password, which you can find in error.log. With --initialize-insecure the root password is empty instead.)
j, Start MySQL
/u01/mysql3306/bin/mysqld_safe --defaults-file=/u01/mysql3306/etc/my.cnf --user=mysql &
Connect to MySQL with the mysql client and change the root password
ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
(To shut MySQL down: /u01/mysql3306/bin/mysqladmin -uroot -p123456 --socket=/u01/mysql3306/run/mysql.sock shutdown)
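The temporary root password written during initialization can be pulled out of the error log instead of being searched for by hand. A minimal sketch, assuming the MySQL 5.7 log line "A temporary password is generated for root@localhost: ..."; the function name `temp_root_password` is a hypothetical helper.

```shell
# Hypothetical helper: print the temporary root password recorded by --initialize.
# $1 is the error log (log_error in my.cnf); the last match wins if the
# instance was initialized more than once.
temp_root_password() {
    sed -n 's/.*A temporary password is generated for root@localhost: //p' "$1" | tail -n 1
}
```

With the configuration above, the call would be `temp_root_password /u01/mysql3306/log/error.log`.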
a, On 10.211.55.23 and 10.211.55.24, install a fresh MySQL following the steps above.
The one difference is in my.cnf: server_id must be different for every MySQL instance.
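Duplicate server_id values are easy to introduce when copying my.cnf between machines, so it can help to check the planned values before starting replication. A minimal sketch; `server_ids_unique` is a hypothetical helper, and the ids other than 101 are made-up examples.

```shell
# Hypothetical helper: succeed only if every server_id passed as an argument is unique.
server_ids_unique() {
    dup=$(printf '%s\n' "$@" | sort | uniq -d)
    [ -z "$dup" ]
}
```

For example, `server_ids_unique 101 102 103 && echo ok` prints ok, while a list that repeats an id makes the function return non-zero.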
b, Create a replication account on every MySQL instance
grant replication slave, replication client on *.* to 'repl'@'%' identified by '123456';
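c, Start the MySQL instances on 10.211.55.23 and 10.211.55.24. Because all three MySQL servers are freshly installed, there is no need to import the master's data into the slaves. (If the master already holds data, first take a logical or physical backup of the master and note the master_log_file and master_log_pos values recorded in the backup.) Connect to the master with the mysql client and run show master status\G to get the File and Position values, and write them down.
d, (If the master already holds data, first restore the backup on every slave; otherwise skip this.) On all slaves, run change master to master_host='10.211.55.21',master_port=3306,master_user='repl',master_password='123456',master_log_file='binlog.000026',master_log_pos=154;
(The master_log_file and master_log_pos values come from the master's backup file or from show master status\G.)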
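Copying File and Position by hand is error-prone, so the statement can be generated from the master's output. A minimal sketch, assuming the usual vertical `show master status\G` layout; the function name `change_master_sql` and its argument order are illustrative assumptions.

```shell
# Hypothetical helper: read `show master status\G` output on stdin and print
# the matching CHANGE MASTER TO statement for the slaves.
# Arguments: master host, replication user, replication password.
change_master_sql() {
    host="$1"; user="$2"; pass="$3"
    awk -v h="$host" -v u="$user" -v p="$pass" '
        $1 == "File:"     { file = $2 }
        $1 == "Position:" { pos = $2 }
        END {
            # Fail if the input did not contain both fields.
            if (file == "" || pos == "") exit 1
            printf "change master to master_host=\047%s\047,master_port=3306,master_user=\047%s\047,master_password=\047%s\047,master_log_file=\047%s\047,master_log_pos=%s;\n", h, u, p, file, pos
        }'
}
```

A possible use is to pipe the master's output into it, for example `mysql -uroot -p -e 'show master status\G' | change_master_sql 10.211.55.21 repl 123456`, and run the printed statement on each slave.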
e, start slave;
f, On every slave, run show slave status\G; and confirm that Slave_IO_Running and Slave_SQL_Running are both Yes
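The check on the slave status can be scripted so that it returns a plain success or failure. A minimal sketch that only looks at the two thread fields; `slave_threads_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: read `show slave status\G` output on stdin and succeed
# only if both replication threads report Yes.
slave_threads_ok() {
    awk '
        $1 == "Slave_IO_Running:"  { io = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }'
}
```

On a slave this could be used as `mysql -uroot -p -e 'show slave status\G' | slave_threads_ok && echo "replication running"`.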
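10.211.55.23 (install MHA manager and MHA node. In production the MHA manager should ideally run on its own server; here it shares a server with one of the MySQL slaves. Do not put the MHA manager on the same server as the MySQL master.)
10.211.55.21 (install MHA node)
10.211.55.24 (install MHA node)
a, Download the MHA source from the official site and install it. The installation steps are described at
https://github.com/yoshinorim/mha4mysql-manager/wiki/Installation
Because MHA is written in Perl, the installation may fail with an error such as "Can't locate inc/Module/Install.pm in ...", which means the Module::Install Perl module is missing. Install missing Perl modules with the cpan command (http://blog.sina.com.cn/s/blog_48c95a190100h7yq.html)
Because our MHA version is 0.57, after installation it can persistently fail to connect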
b, All servers need passwordless SSH login to one another
On 10.211.55.21, run
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.23
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.24
On 10.211.55.23, run
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.21
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.23
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.24
(One special case: the manager runs on this server, so it also needs passwordless login to itself.)
On 10.211.55.24, run
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.21
ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.211.55.23
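With more hosts, listing the key-copy commands by hand gets tedious. The sketch below only prints the commands for one host so they can be reviewed before running; the function name `print_copy_id_cmds` and its argument order are assumptions, and it assumes the root account and key path used above.

```shell
# Hypothetical helper: print the ssh-copy-id commands to run on one host.
# Arguments: this host, the MHA manager host, then every host in the cluster.
print_copy_id_cmds() {
    self="$1"; manager="$2"; shift 2
    for h in "$@"; do
        # Skip our own address unless this host also runs the MHA manager,
        # which needs passwordless login to itself.
        if [ "$h" = "$self" ] && [ "$self" != "$manager" ]; then
            continue
        fi
        echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$h"
    done
}
```

For example, `print_copy_id_cmds 10.211.55.23 10.211.55.23 10.211.55.21 10.211.55.23 10.211.55.24` prints three commands, including the one for the manager host itself.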
c, Create the MHA configuration file
[server default]
user = root
password = 123456
ssh_user = root
repl_user = repl
repl_password = 123456
ping_interval = 1
ping_type = SELECT
manager_workdir=/u01/mha/app
manager_log=/u01/mha/log/manager.log
remote_workdir=/u01/mha/app
master_binlog_dir="/u01/mysql3306/log/binlog"
master_ip_failover_script="/u01/mha/script/master_ip_failover"
master_ip_online_change_script="/u01/mha/script/master_ip_online_change"
shutdown_script="/u01/mha/script/power_manager"
report_script="/u01/mha/script/send_report"
secondary_check_script = "masterha_secondary_check -s 10.211.55.23 -s 10.211.55.24"
#check_repl_delay=0
[server1]
hostname=10.211.55.21
port=3306
master_binlog_dir="/u01/mysql3306/log/binlog"
candidate_master=1
ignore_fail=1
client_bindir=/u01/mysql3306/bin/
client_libdir=/u01/mysql3306/lib/
[server2]
hostname=10.211.55.23
port=3306
master_binlog_dir="/u01/mysql3306/log/binlog"
candidate_master=1
ignore_fail=1
client_bindir=/u01/mysql3306/bin/
client_libdir=/u01/mysql3306/lib/
[server3]
hostname=10.211.55.24
port=3306
master_binlog_dir="/u01/mysql3306/log/binlog"
candidate_master=1
ignore_fail=1
client_bindir=/u01/mysql3306/bin/
client_libdir=/u01/mysql3306/lib/
For detailed documentation of every parameter in the MHA configuration file, see
https://github.com/yoshinorim/mha4mysql-manager/wiki/Parameters
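The three per-server sections differ only in hostname, so they can be generated from a host list. A minimal sketch that reuses the per-server settings from the configuration above; `mha_server_blocks` is a hypothetical helper name.

```shell
# Hypothetical helper: print one [serverN] block per host given as an argument,
# with the same per-server settings as the configuration above.
mha_server_blocks() {
    n=1
    for h in "$@"; do
        cat <<EOF
[server$n]
hostname=$h
port=3306
master_binlog_dir="/u01/mysql3306/log/binlog"
candidate_master=1
ignore_fail=1
client_bindir=/u01/mysql3306/bin/
client_libdir=/u01/mysql3306/lib/
EOF
        n=$((n + 1))
    done
}
```

Running `mha_server_blocks 10.211.55.21 10.211.55.23 10.211.55.24` prints the [server1] to [server3] sections, which can be appended after the [server default] section.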
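d, Adapt the scripts referenced in the configuration
master_ip_failover_script: the template is in the sample/scripts/ directory of the MHA manager source. Comment out the FIXME_xxx lines (lines 88 and 93). See http://blog.takanabe.tokyo/2016/05/04/2410/
secondary_check_script: this file needs no changes
shutdown_script (power_manager): I wrote this script myself based on the official documentation; adapt it to your own production environment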
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my ( $command, $ssh_user, $host, $ip, $port, $pid_file, $help );
my $ssh_killmysql = "pkill -9 mysql";
my $ssh_poweroff = "shutdown -h now";
my $ssh_countmysqlprocess = "ps -ef |grep -e /u01/mysql3306/bin/mysqld_safe -e /u01/mysql3306/bin/mysqld";
GetOptions(
    'command=s'  => \$command,
    'ssh_user=s' => \$ssh_user,
    'host=s'     => \$host,
    'ip=s'       => \$ip,
    'port=i'     => \$port,
    'pid_file=s' => \$pid_file,
    'help'       => \$help,
);
exit &main();
# If ssh is reachable and mysqld process does not exist, exit with 2 and
# do not power off. If ssh is not reachable, do power off and exit with 0
# if successful. Otherwise exit with 1.
sub main {
    print "\n\nIN SHUTDOWN SCRIPT TEST\n\n";
    print "\n\nCOMMAND IS $command\n\n";
    if ( $command eq "stopssh" || $command eq "stopssh2" ) {
        print "kills all mysqld and mysqld_safe processes with -9 on the master via ssh \n";
        &stop_master_mysql();
        print " $command SHUTDOWN SCRIPT exit 10 \n";
        &count_mysql_process();
        exit 10;
    }
    elsif ( $command eq "stop" ) {
        print " $command SHUTDOWN SCRIPT exit 0 \n";
        # A real deployment needs the appropriate tool here to power off the physical machine
        print " In fact, you should shutdown here. Power off command depends on H/W. For HP(iLO), ipmitool or SSL is common. For Dell(DRAC), dracadm is common. \n";
        exit 0;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the Shutdown script.. OK \n";
        exit 0;
    }
    else {
        exit 0;
    }
}
sub stop_master_mysql() {
    `ssh $ssh_user\@$host \" $ssh_killmysql \"`;
}
sub count_mysql_process() {
    my $count_str=`ssh $ssh_user\@$host \" $ssh_countmysqlprocess \"`;
    print "================ $count_str =============\n";
}
report_script (send_report): this script emails the DBA after a failover. It uses the Email::Simple module, so install the required modules with the cpan command
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Email::Simple;
use Email::Sender::Simple qw(sendmail);
use Email::Sender::Transport::SMTP::TLS;
use Getopt::Long;
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.qq.com';
my $mail_from='[email protected]';
my $mail_user='[email protected]';
# This is the mailbox authorization code, not the account password; look up how to obtain one for your mail provider
my $mail_auth_pass='xxxxxx';
my $mail_to='xxxxxxxxxx';
GetOptions(
    'orig_master_host=s' => \$dead_master_host,
    'new_master_host=s'  => \$new_master_host,
    'new_slave_hosts=s'  => \$new_slave_hosts,
    'subject=s'          => \$subject,
    'body=s'             => \$body,
);
my $transport = Email::Sender::Transport::SMTP::TLS->new(
    host     => $smtp,
    port     => 25,
    username => $mail_user,
    password => $mail_auth_pass,
);
my $message = Email::Simple->create(
    header => [
        From    => $mail_from,
        To      => $mail_to,
        Subject => $subject,
    ],
    body => $body,
);
sendmail( $message, {transport => $transport} );
exit 0;
secondary_check_script (masterha_secondary_check): this script needs no changes
Copy the scripts above into the directory named in the configuration file
/usr/local/bin/masterha_check_ssh --conf=/u01/mha/etc/app.cnf   # check that passwordless SSH works
/usr/local/bin/masterha_check_repl --conf=/u01/mha/etc/app.cnf   # check that replication is healthy
nohup /usr/local/bin/masterha_manager --conf=/u01/mha/etc/mha.cnf --ignore_last_failover --ignore_fail_on_start < /dev/null > /u01/mha/log/app1.log 2>&1 &   # start MHA in the background
/usr/local/bin/masterha_check_status --conf=/u01/mha/etc/app.cnf   # check the MHA status
/usr/local/bin/masterha_stop --conf=/u01/mha/etc/app.cnf   # stop MHA
For the arguments accepted by masterha_check_ssh, masterha_check_repl, masterha_manager, masterha_check_status and masterha_stop, see the official documentation: https://github.com/yoshinorim/mha4mysql-manager/wiki
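The two checks and the manager start can be chained so the manager is only launched when both checks pass. A minimal sketch, not the author's procedure: the function name and the MHA_BIN and MHA_LOG variables are assumptions, with defaults taken from the paths used in the commands above.

```shell
# Hypothetical wrapper: run both checks, then start the manager in the background.
# $1 is the MHA configuration file. MHA_BIN and MHA_LOG are assumed variables
# that default to the locations used in this article.
start_mha_if_healthy() {
    conf="$1"
    bin="${MHA_BIN:-/usr/local/bin}"
    log="${MHA_LOG:-/u01/mha/log/app1.log}"
    "$bin/masterha_check_ssh" --conf="$conf" || { echo "ssh check failed" >&2; return 1; }
    "$bin/masterha_check_repl" --conf="$conf" || { echo "replication check failed" >&2; return 2; }
    nohup "$bin/masterha_manager" --conf="$conf" --ignore_last_failover --ignore_fail_on_start < /dev/null >> "$log" 2>&1 &
}
```

It would be called with the configuration file path, for example `start_mha_if_healthy /u01/mha/etc/app.cnf`. A return value of 1 or 2 tells you which check failed.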
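After the MHA manager completes a failover, the manager process exits and has to be restarted, so we should use a supervisor tool (daemontools) to keep it running
Install daemontools (http://blog.csdn.net/superbfly/article/details/52877954)
After the installation
mkdir -p /opt/svscan/mha
Create a run script in the /opt/svscan/mha directory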
#!/bin/sh
exec masterha_manager --conf=/u01/mha/etc/mha.cnf --ignore_last_failover --ignore_fail_on_start --wait_on_monitor_error=60 >> /u01/mha/log/app1.log 2>&1
chmod -R 755 /opt/svscan/mha
chmod 755 /opt/svscan/mha/run
ln -s /opt/svscan/mha /service/mha
Start daemontools
/command/svscanboot &
Now check the MHA status with masterha_check_status
Point the VIP (10.211.55.100) at 10.211.55.21
ifconfig eth0:0 10.211.55.100 netmask 255.255.255.0 up
eth0 here is the corresponding network interface
Then ping 10.211.55.100 and confirm that it responds
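Besides ping, the address list of the host itself shows whether the VIP is actually bound there. A minimal sketch that inspects `ip -4 addr show` output passed on stdin; `vip_is_bound` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ip -4 addr show` output on stdin and succeed if
# the given address is bound to an interface on this host.
vip_is_bound() {
    # -F matches the address literally, so dots are not treated as wildcards.
    grep -qF "inet $1/"
}
```

On 10.211.55.21 this could be run as `ip -4 addr show | vip_is_bound 10.211.55.100 && echo "VIP is here"`.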
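Next, run pkill -9 mysql on 10.211.55.21 and watch the MHA log, which shows the whole failover process
After the failover, the mailbox configured in the send_report script above receives the corresponding email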
purge_relay_logs removes relay logs without blocking SQL threads. Relay logs need to be purged regularly (i.e. once per day, once per 6 hours, etc), so purge_relay_logs should be regularly invoked on each slave server from job schedulers. For example, you can invoke purge_relay_logs from cron as below.
cat /etc/cron.d/purge_relay_logs
# purge relay logs at 5am
0 5 * * * app /usr/bin/purge_relay_logs --user=root --password=PASSWORD --disable_relay_log_purge >> /var/log/masterha/purge_relay_logs.log 2>&1