Required packages (MHA 0.57 + MaxScale 1.4.3 / 2.0) can be downloaded from:
http://down.51cto.com/data/2258543
http://down.51cto.com/data/2258544
http://down.51cto.com/data/2258545
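Architecture diagram: (image not included)
Key points:
1. Install and configure MHA
2. Set up master-slave replication
3. Install and configure MaxScale
4. Single-node failure
Introduction: MHA (Master High Availability)
The software consists of two parts: MHA Manager (the management node) and MHA Node (the data node).
How MHA works, in summary:
(1) Save the binary log events (binlog events) from the crashed master
(2) Identify the slave that has the most recent updates
(3) Apply the differential relay logs to the other slaves
(4) Apply the binary log events saved from the master
(5) Promote one slave to be the new master
(6) Point the other slaves at the new master for replication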
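Step (2) can be illustrated with a small sketch. It assumes the replication coordinates of each slave (Master_Log_File and Read_Master_Log_Pos from SHOW SLAVE STATUS) have already been collected into lines of the form "host file pos". The helper name pick_latest_slave and that input format are hypothetical; this is not MHA's own code, only an illustration of the comparison.

```shell
#!/bin/sh
# Hypothetical helper, not part of MHA.
# stdin: one line per slave, "host master_log_file read_master_log_pos"
# stdout: the host that has received the most binlog from the master
pick_latest_slave() {
    # Sort by binlog file name (descending), then by position (numeric,
    # descending); the first line is the most advanced slave.
    LC_ALL=C sort -k2,2r -k3,3nr | head -n 1 | awk '{print $1}'
}

# Example with made-up coordinates for the two slaves used in this article
printf '%s\n' \
    '192.168.1.107 mysql-bin.000012 1520' \
    '192.168.1.108 mysql-bin.000012 980' | pick_latest_slave
```

The comparison relies on binlog file names having a fixed-width numeric suffix, which is how MySQL names them by default.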
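The Manager package mainly includes the following tools:
masterha_check_ssh  Checks MHA's SSH configuration
masterha_check_repl  Checks the MySQL replication status
masterha_manager  Starts MHA
masterha_check_status  Checks the current running state of MHA
masterha_master_monitor  Detects whether the master is down
masterha_master_switch  Controls failover (automatic or manual)
masterha_conf_host  Adds or removes configured server entries
The Node package includes:
save_binary_logs  Saves and copies the master's binary logs
apply_diff_relay_logs  Identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog  Removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs  Purges relay logs (without blocking the SQL thread)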
I. MySQL master-slave setup (omitted)
II. MHA installation
1. Install the Perl module required by MHA Node (DBD::mysql) on all nodes
yum install perl-DBD-MySQL -y
yum -y install perl-DBD-MySQL perl-Config-Tiny perl-Params-Validate perl-CPAN perl-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
### Some of these packages may fail to install
2. Install MHA Node on all nodes
tar -xvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make
make install
After installation, the following files are generated under /usr/local/bin
apply_diff_relay_logs
filter_mysqlbinlog
purge_relay_logs
save_binary_logs
3. Install MHA Manager
cd mha4mysql-manager-0.57
perl Makefile.PL
make
make install
### Additional prompts appear during this step; answer Y each time to download and install the dependencies
#yum -y install perl-*   ## run this in the directory that holds the packages
-rw-r--r-- 1 root root 21272 Sep 15 05:52 perl-Config-Tiny-2.12-1.el6.rfx.noarch.rpm
-rw-r--r-- 1 root root 72328 Sep 15 05:52 perl-Log-Dispatch-2.26-1.el6.rf.noarch.rpm
-rw-r--r-- 1 root root 14840 Sep 15 05:52 perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm
-rw-r--r-- 1 root root 76684 Sep 15 05:52 perl-Params-Validate-0.92-3.el6.x86_64.rpm
Copy the related scripts to /usr/local/bin
cd /usr/local/mha4mysql-manager-0.57/samples/scripts
scp * /usr/local/bin
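### Copy the check scripts to /usr/bin/
#scp /root/mha4mysql-manager-0.57/bin/* /usr/bin/
master_ip_failover  # Manages the VIP during automatic failover. Optional: if keepalived is used, you can write your own script to manage the VIP, for example one that monitors MySQL and stops keepalived when MySQL misbehaves, so that the VIP floats automatically
master_ip_online_change  # Manages the VIP during an online switch. Optional; a simple shell script of your own can do the same
power_manager  # Powers off the host after a failure. Optional
send_report  # Sends an alert after a failover. Optional; a simple shell script of your own can do the same.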
4. Configure passwordless SSH login
### Before doing this, configure /etc/hosts on every host
1) Run on every server: ssh-keygen -t rsa
ssh-copy-id -i .ssh/id_rsa.pub "[email protected]"
ssh-copy-id -i .ssh/id_rsa.pub "[email protected]"
ssh-copy-id -i .ssh/id_rsa.pub "[email protected]"
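Since the key has to reach every node, the commands can be generated in a loop. The sketch below only prints them for review. It assumes the three hosts and the root SSH user that appear in app1.cnf later in this article; the function name print_copy_id_cmds is made up for the example.

```shell
#!/bin/sh
# Assumption: hosts and user are the ones listed in /etc/masterha/app1.cnf
HOSTS="192.168.1.114 192.168.1.107 192.168.1.108"
SSH_USER=root

# Print (do not run) one ssh-copy-id command per host
print_copy_id_cmds() {
    for h in $HOSTS; do
        printf 'ssh-copy-id -i ~/.ssh/id_rsa.pub %s@%s\n' "$SSH_USER" "$h"
    done
}

print_copy_id_cmds
```

Once the list looks right, piping it to a shell (print_copy_id_cmds | sh) runs the commands one by one, prompting for each password.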
5. Set read_only on the two slaves
set global read_only=1;
7. Create the monitoring user on the master
grant all privileges on *.* to dlan@'192.168.%' identified by 'root123';
flush privileges;
8. Configure MHA
1) Create the MHA working directory and the related configuration file
mkdir -p /etc/masterha
cd /usr/local/mha4mysql-manager-0.57/samples/conf
cp app1.cnf /etc/masterha/
2) Edit the configuration file
vi /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha
manager_log=/var/log/masterha/app1/manager.log
master_binlog_dir=/data/mysql
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=root123
user=dlan
ping_interval=1
remote_workdir=/tmp
repl_password=root123
repl_user=repl
#report_script=/usr/local/bin/send_report
ssh_user=root
[server1]
hostname=192.168.1.114
candidate_master=1
port=3306
[server2]
hostname=192.168.1.107
port=3306
candidate_master=1
[server3]
hostname=192.168.1.108
port=3306
no_master=1
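To see at a glance which servers the file allows to be promoted, the sections can be scanned with awk. This is a minimal sketch that assumes the key=value layout shown above, with no spaces around "="; list_candidate_masters is a hypothetical helper, not an MHA tool.

```shell
#!/bin/sh
# Hypothetical helper: print the hostname of every section of an MHA
# config file that sets candidate_master=1.
# Assumes "key=value" lines without spaces around "=".
list_candidate_masters() {
    awk -F= '
        # A new [section] starts: flush the previous one, then reset
        /^\[/ { if (host != "" && cand == 1) print host; host = ""; cand = 0 }
        $1 == "hostname"         { host = $2 }
        $1 == "candidate_master" { cand = ($2 == 1) }
        # Flush the last section at end of file
        END   { if (host != "" && cand == 1) print host }
    ' "$1"
}

# Usage: list_candidate_masters /etc/masterha/app1.cnf
```

With the configuration above it would list 192.168.1.114 and 192.168.1.107, while 192.168.1.108 (no_master=1) is left out.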
3) Check the SSH setup
## The replication user may need to be allowed to connect both ways, via socket and via TCP/IP
masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Sep 15 06:18:44 2016 - [info] All SSH connection tests passed successfully.
4) Check master-slave replication
masterha_check_repl --conf=/etc/masterha/app1.cnf
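When the replication check is run from a script, its result can be tested rather than read by eye. The sketch below assumes the tool ends its output with the line "MySQL Replication Health is OK." on success, which is what MHA 0.57 prints in my experience; verify the wording on your version. repl_health_ok is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: reads masterha_check_repl output on stdin and
# succeeds only if the health line reports OK.
# Assumption: success is reported as "MySQL Replication Health is OK."
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK'
}

# Usage:
#   masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | repl_health_ok \
#       && echo "replication healthy"
```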
5) Install keepalived
tar xf keepalived-1.2.12.tar.gz
cd keepalived-1.2.12
./configure --prefix=/usr/local/keepalived
make && make install
cp /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/
mkdir /etc/keepalived
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/
keepalived configuration file on the master
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
}
notification_email_from [email protected]
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
vrrp_instance VI_1 {
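state MASTER ## With MASTER/BACKUP, the VIP floats away when the master goes down and is preempted back once the master is repaired, even when non-preempt mode is set. With BACKUP/BACKUP no preemption occurs.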
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.219 dev eth0 scope global
}
}
keepalived configuration file on the standby master
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
}
notification_email_from [email protected]
smtp_server 192.168.200.1
smtp_connect_timeout 30
router_id LVS_DEVEL
}
vrrp_instance VI_1 {
state BACKUP
interface eth0
virtual_router_id 51
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.1.219 dev eth0 scope global
}
}
7) Managing the MHA manager
Start the manager
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
nohup masterha_manager --conf=/etc/masterha/app1.cnf > /var/log/masterha/manager.log < /dev/null 2>&1 &
Stop the manager
masterha_stop --conf=/etc/masterha/app1.cnf
Check the manager status
masterha_check_status --conf=/etc/masterha/app1.cnf
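The status line can also be parsed to find out which host MHA currently treats as the master. This sketch assumes the running-state output looks like "app1 (pid:5229) is running(0:PING_OK), master:192.168.1.114", which is the format I have seen from MHA 0.57; treat the exact wording as an assumption. current_master is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: extract the master host from masterha_check_status
# output read on stdin. Prints nothing if the manager is not running.
# Assumed format: "<app> (pid:N) is running(0:PING_OK), master:<host>"
current_master() {
    sed -n 's/.*is running(0:PING_OK), master:\([^ ]*\).*/\1/p'
}

# Usage:
#   masterha_check_status --conf=/etc/masterha/app1.cnf | current_master
```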
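Switching
1. Once the manager is running, a failed master triggers an automatic failover
2. Online switch procedure
masterha_stop --conf=/etc/masterha/app1.cnf
masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.1.108 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
Points to watch during testing:
1. The switch automatically turns read_only off
show variables like '%read_only%';
2. After a switch, /masterha/app1/app1.failover.complete must be deleted manually before a second switch can run
3. Once a switch has happened the manager process exits, so no further test is possible until the failed database has been added back into the MHA environment
4. When the original master rejoins MHA it can only be configured as a slave
#### Note: during a switch, the CHANGE MASTER TO information can be taken from the log; it records the binlog position of the latest master
5. There are several ways to take over the IP address. Here MHA itself invokes the IP-alias approach, which has the advantage of keeping the database state and the business IP switch consistent. After the management node starts, the VIP is aliased onto the current master automatically. keepalived can only health-check port 3306; it cannot check things such as the slave SQL and slave IO threads of MySQL replication, so it is prone to misjudging when to switch.
6. Note: a second-tier slave needs log_slave_updates enabled
7. A manual switch requires master_ip_online_change_script to be defined first; otherwise only MySQL is switched and the IP address is not bound. The script can be built from the template
8. Setting no_master=1 prevents a node from ever becoming the new master
9. After a switch, keepalived on the node that was switched away is stopped; restart it if that node needs to provide service again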
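Point 2 above is easy to forget between test runs, so it can be wrapped in a small helper. The sketch assumes the marker is named <app>.failover.complete and lives in the manager working directory. The directory is passed as an argument because the path quoted in point 2 and the manager_workdir in app1.cnf differ. clear_failover_marker is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: remove <workdir>/<app>.failover.complete if present.
# $1 = directory that holds the marker, $2 = application name (e.g. app1)
clear_failover_marker() {
    marker="$1/$2.failover.complete"
    if [ -f "$marker" ]; then
        rm -f "$marker" && echo "removed $marker"
    else
        echo "no marker at $marker"
    fi
}

# Usage: clear_failover_marker /var/log/masterha app1
```

Starting the manager with --ignore_last_failover, as in the first start command above, is the alternative to deleting the file.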
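After a manual switch, masterha_check_repl cannot be run against the original /etc/masterha/app1.cnf; use a new app2.cnf instead, because the master recorded in app1.cnf is the old one and the check reports that replication has failed.
(1) Copy the original app1.cnf to a new app2.cnf
cp /etc/masterha/app1.cnf /etc/masterha/app2.cnf
(2) Edit app2.cnf and swap the IPs of server1 and server2, that is, the IPs of the two servers involved in the switch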
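The copy-and-swap can be done in one pass with sed instead of editing by hand. This is a sketch under the assumption that the addresses appear as hostname=<ip> at the end of a line, as in the app1.cnf shown earlier; swap_ips and the __SWAP__ placeholder are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the config file $3 with the two addresses
# $1 and $2 exchanged wherever they appear as "=<ip>" at end of line.
swap_ips() {
    a=$1; b=$2; file=$3
    # Escape the dots so they match literally in the sed patterns
    a_re=$(printf '%s' "$a" | sed 's/\./\\./g')
    b_re=$(printf '%s' "$b" | sed 's/\./\\./g')
    # Three-step swap through a temporary placeholder
    sed -e "s/=$a_re\$/=__SWAP__/" \
        -e "s/=$b_re\$/=$a/" \
        -e "s/=__SWAP__\$/=$b/" "$file"
}

# Usage:
#   swap_ips 192.168.1.114 192.168.1.107 /etc/masterha/app1.cnf > /etc/masterha/app2.cnf
```

Writing to standard output leaves app1.cnf untouched, so the original file is still there if the result needs to be compared.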
Related scripts
1) master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.1.219';
my $ssh_start_vip = "/etc/init.d/keepalived start";
my $ssh_stop_vip = "/etc/init.d/keepalived stop";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' =>\$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' =>\$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
#`ssh $ssh_user\@cluster1 \"$ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enables the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \"$ssh_start_vip \"`;
}
# A simple system call that disables the VIP on the old master
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \"$ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
2)master_ip_online_change
#!/usr/bin/env perl
# Copyright (C) 2011 DeNA Co.,Ltd.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
## Note: This is a sample script and is not complete. Modify the script based on your environment.
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;
my $ssh_user = "root";
my $vip = '192.168.1.219';
my $ssh_start_vip = "/etc/init.d/keepalived start";
my $ssh_stop_vip = "/etc/init.d/keepalived stop";
my $_tstart;
my $_running_interval = 0.1;
my (
$command, $orig_master_host, $orig_master_ip,
$orig_master_port, $orig_master_user,$orig_master_password,
$new_master_host, $new_master_ip, $new_master_port,
$new_master_user, $new_master_password
);
GetOptions(
'command=s' => \$command,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' =>\$orig_master_password,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
);
exit &main();
sub current_time_us {
my ( $sec, $microsec ) = gettimeofday();
my $curdate = localtime($sec);
return $curdate . " " . sprintf("%06d", $microsec );
}
sub sleep_until {
my $elapsed = tv_interval($_tstart);
if ( $_running_interval > $elapsed ) {
sleep( $_running_interval - $elapsed );
}
}
sub get_threads_util {
my $dbh = shift;
my $my_connection_id = shift;
my $running_time_threshold = shift;
my $type = shift;
$running_time_threshold = 0 unless($running_time_threshold);
$type = 0 unless ($type);
my @threads;
my $sth = $dbh->prepare("SHOW PROCESSLIST");
$sth->execute();
while ( my $ref = $sth->fetchrow_hashref()) {
my $id = $ref->{Id};
my $user = $ref->{User};
my $host = $ref->{Host};
my $command = $ref->{Command};
my $state = $ref->{State};
my $query_time = $ref->{Time};
my $info = $ref->{Info};
$info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
next if ( $my_connection_id == $id );
next if ( defined($query_time) && $query_time < $running_time_threshold );
next if ( defined($command) && $command eq "Binlog Dump" );
next if ( defined($user) && $user eq "system user" );
next
if ( defined($command)
&& $command eq "Sleep"
&& defined($query_time)
&& $query_time >= 1 );
if ( $type >= 1 ) {
next if ( defined($command) &&$command eq "Sleep" );
next if ( defined($command) &&$command eq "Connect" );
}
if ( $type >= 2 ) {
next if ( defined($info) && $info=~ m/^select/i );
next if ( defined($info) && $info=~ m/^show/i );
}
push @threads, $ref;
}
return @threads;
}
sub main {
if ( $command eq "stop" ) {
## Gracefully killing connections on the current master
# 1. Set read_only= 1 on the new master
# 2. DROP USER so that no app user can establish new connections
# 3. Set read_only= 1 on the current master
# 4. Kill current queries
# * Any database access failure will result in script die.
my $exit_code = 1;
eval {
## Setting read_only=1 on the new master (to avoid accident)
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password, raise_error(die_on_error)_or_not
$new_master_handler->connect($new_master_ip, $new_master_port,
$new_master_user, $new_master_password,1 );
print current_time_us() . " Set read_only on the new master..";
$new_master_handler->enable_read_only();
if ($new_master_handler->is_read_only() ) {
print "ok.\n";
}
else {
die "Failed!\n";
}
$new_master_handler->disconnect();
# Connecting to the orig master, die if any database error happens
my $orig_master_handler = new MHA::DBHelper();
$orig_master_handler->connect($orig_master_ip, $orig_master_port,
$orig_master_user, $orig_master_password, 1 );
## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
$orig_master_handler->disable_log_bin_local();
print current_time_us() . " Dropping app user on the orig master..\n";
#FIXME_xxx_drop_app_user($orig_master_handler);
## Waiting for N * 100 milliseconds so that current connections can exit
my $time_until_read_only = 15;
$_tstart = [gettimeofday];
my @threads = get_threads_util($orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
while ( $time_until_read_only > 0 && $#threads >= 0 ) {
if ( $time_until_read_only % 5 == 0 ) {
printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
current_time_us(), $#threads + 1, $time_until_read_only * 100;
if ( $#threads < 5 ) {
print Data::Dumper->new( [$_])->Indent(0)->Terse(1)->Dump . "\n"
foreach (@threads);
}
}
sleep_until();
$_tstart = [gettimeofday];
$time_until_read_only--;
@threads = get_threads_util($orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
}
## Setting read_only=1 on the current master so that nobody (except SUPER) can write
print current_time_us() . " Set read_only=1 on the orig master.. ";
$orig_master_handler->enable_read_only();
if ($orig_master_handler->is_read_only() ) {
print "ok.\n";
}
else {
die "Failed!\n";
}
## Waiting for M * 100 milliseconds so that current update queries can complete
my $time_until_kill_threads = 5;
@threads = get_threads_util($orig_master_handler->{dbh},
$orig_master_handler->{connection_id});
while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
if ( $time_until_kill_threads % 5 == 0 ) {
printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
if ( $#threads < 5 ) {
print Data::Dumper->new( [$_])->Indent(0)->Terse(1)->Dump . "\n"
foreach (@threads);
}
}
sleep_until();
$_tstart = [gettimeofday];
$time_until_kill_threads--;
@threads = get_threads_util($orig_master_handler->{dbh},
$orig_master_handler->{connection_id} );
}
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
## Terminating all threads
print current_time_us() . " Killing all application threads..\n";
$orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
print current_time_us() . " done.\n";
$orig_master_handler->enable_log_bin_local();
$orig_master_handler->disconnect();
## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
## Activating master ip on the new master
# 1. Create app user with write privileges
# 2. Moving backup script if needed
# 3. Register new master's ip to the catalog database
# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
# If exit code is 0 or 10, MHA does not abort
my $exit_code = 10;
eval {
my $new_master_handler = new MHA::DBHelper();
# args: hostname, port, user, password,raise_error_or_not
$new_master_handler->connect($new_master_ip, $new_master_port,
$new_master_user, $new_master_password,1 );
## Set read_only=0 on the new master
$new_master_handler->disable_log_bin_local();
print current_time_us() . " Set read_only=0 on the new master.\n";
$new_master_handler->disable_read_only();
## Creating an app user on the new master
print current_time_us() . " Creating app user on the new master..\n";
## FIXME_xxx_create_app_user is a placeholder that is not defined in this script; calling it would die inside the eval and the VIP would never be enabled, so it stays commented out
#FIXME_xxx_create_app_user($new_master_handler);
$new_master_handler->enable_log_bin_local();
$new_master_handler->disconnect();
## Update master ip on the catalog database, etc
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
# do nothing
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enables the VIP on the new master
sub start_vip() {
`ssh $ssh_user\@$new_master_host \"$ssh_start_vip \"`;
}
# A simple system call that disables the VIP on the old master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \"$ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
die;
}
III. Install MaxScale
1. Install MaxScale; the downloads listed at the top include versions 2.0 and 1.4.3
yum -y install maxscale.......
2. Configuration file
# MaxScale documentation on GitHub:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Documentation-Contents.md
# Global parameters
# Complete list of configuration options:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Getting-Started/Configuration-Guide.md
[maxscale]
threads=1
# Server definitions
# Set the address of the server to the network
# address of a MySQL server.
[server1]
type=server
address=192.168.1.114
port=3306
protocol=MySQLBackend
[server2]
type=server
address=192.168.1.107
port=3306
protocol=MySQLBackend
[server3]
type=server
address=192.168.1.108
port=3306
protocol=MySQLBackend
# Monitor for the servers
# This will keep MaxScale aware of the state of the servers.
# MySQL Monitor documentation:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Monitors/MySQL-Monitor.md
[MySQL Monitor]
type=monitor
module=mysqlmon
servers=server1,server2,server3
user=maxscale_monitor
passwd=root123
monitor_interval=10000
# Service definitions
# Service Definition for a read-only service and
# a read/write splitting service.
# ReadConnRoute documentation:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Routers/ReadConnRoute.md
#[Read-Only Service]
#type=service
#router=readconnroute
#servers=server1
#user=myuser
#passwd=mypwd
#router_options=slave
# ReadWriteSplit documentation:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Routers/ReadWriteSplit.md
[Read-Write Service]
type=service
router=readwritesplit
servers=server1,server2,server3
user=maxscale
passwd=root123
max_slave_connections=100%
# This service enables the use of the MaxAdmin interface
# MaxScale administration guide:
# https://github.com/mariadb-corporation/MaxScale/blob/master/Documentation/Reference/MaxAdmin.md
[MaxAdmin Service]
type=service
router=cli
# Listener definitions for the services
# These listeners represent the ports the
# services will listen on.
#[Read-Only Listener]
#type=listener
#service=Read-Only Service
#protocol=MySQLClient
#port=4008
[Read-Write Listener]
type=listener
service=Read-Write Service
protocol=MySQLClient
port=4006
[MaxAdmin Listener]
type=listener
service=MaxAdmin Service
protocol=maxscaled
port=6603
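The configuration above refers to two database accounts, maxscale (used by the Read-Write Service) and maxscale_monitor (used by the monitor), both with the password root123. The sketch below prints SQL that would create them. The privilege list is what I recall from the MaxScale 1.x/2.0 setup tutorial and should be checked against the documentation for the installed version; the host pattern '192.168.%' is borrowed from the MHA grant earlier in this article, and maxscale_grants_sql is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that creates the two accounts named in
# the MaxScale configuration. $1 = host pattern, $2 = password.
# Assumption: the privilege list follows the MaxScale 1.x/2.0 tutorial.
maxscale_grants_sql() {
    host_pattern=$1
    password=$2
    cat <<EOF
CREATE USER 'maxscale'@'$host_pattern' IDENTIFIED BY '$password';
GRANT SELECT ON mysql.user TO 'maxscale'@'$host_pattern';
GRANT SELECT ON mysql.db TO 'maxscale'@'$host_pattern';
GRANT SELECT ON mysql.tables_priv TO 'maxscale'@'$host_pattern';
GRANT SHOW DATABASES ON *.* TO 'maxscale'@'$host_pattern';
CREATE USER 'maxscale_monitor'@'$host_pattern' IDENTIFIED BY '$password';
GRANT REPLICATION CLIENT ON *.* TO 'maxscale_monitor'@'$host_pattern';
EOF
}

# Usage (run against the master so the accounts replicate to the slaves):
#   maxscale_grants_sql '192.168.%' 'root123' | mysql -uroot -p
```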
3. On the master DB, add the routing user and the monitoring user for MaxScale to use
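4. Install MaxScale on the MHA standby host as well, so that the failure of a single routing node does not take down the whole service
5. Add a check script that monitors whether MaxScale is alive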
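One simple form such a check could take is to test whether anything is listening on the Read-Write Listener port (4006 in the configuration above). The sketch parses the output of ss -ltn, or netstat -ltn, where the local address is the fourth column. The function names are invented, and what to do on failure (restart MaxScale, stop keepalived, send an alert) is left open because the article does not say.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -ltn` or `netstat -ltn` output on stdin
# and succeeds if some local address (4th column) ends in ":<port>".
port_listening() {
    awk -v suffix=":$1" '
        {
            addr = $4
            n = length(suffix)
            if (length(addr) >= n && substr(addr, length(addr) - n + 1) == suffix)
                found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical wrapper for cron or a keepalived check script
check_maxscale() {
    if ss -ltn 2>/dev/null | port_listening 4006; then
        echo "maxscale listener up"
    else
        echo "maxscale listener down"
        return 1
    fi
}
```

A port check only shows that the process accepts connections; running a trivial query through port 4006 would be a stronger test.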
#### At this point the MHA + MaxScale architecture resembles an HAProxy + MySQL cluster setup