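Walkthrough: Building a Complete MHA Cluster Environment
I. Lab Environment
MHA installation steps
The MHA Node package contains three scripts and depends on Perl modules.
save_binary_logs: saves and copies the binary logs of the crashed master
apply_diff_relay_logs: identifies differential relay log events and applies them to the other slave servers
purge_relay_logs: purges relay log files
MHA Node must be installed on every MySQL server, and on the MHA management server as well, because the MHA Manager module depends internally on the MHA Node module. MHA Manager connects to the MySQL servers over SSH to manage them and to run the MHA Node scripts. MHA Node depends on the Perl DBD::mysql module.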
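1.1. Environment overview
1.1.1. VMware virtual machines running a minimal install of CentOS 6.5 x86_64; MySQL version 5.7.21.
1.1.2. The SSH port on every virtual machine is the default, 22.
1.1.3. iptables is disabled on every virtual machine.
1.1.4. SELinux is disabled on every virtual machine.
1.1.5. The clocks of all virtual machines are synchronized: ntpdate 0.asia.pool.ntp.org
1.1.6. The SSH port on the 3 machines is 10280.
1.2. This lab uses 3 machines, deployed as follows: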
Role                      IP address (internal)  Hostname  Services deployed on the node     Purpose
Master                    192.168.2.128          server02  mha4mysql-node-0.56-0.el6         writes
                                                           mha4mysql-manager-0.56-0.el6
                                                           keepalived
Slave (candidate master)  192.168.2.129          server03  mha4mysql-node-0.56-0.el6         reads
                                                           keepalived
Slave + Monitor           192.168.2.130          server04  mha4mysql-node-0.56-0.el6         reads + data backup
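1.3. Notes:
server03 and server04 are slaves of server02; the replication setup is demonstrated briefly later. The master serves writes. The candidate master (actually a slave, hostname server03) serves reads, and the other slave serves reads as well. If the master goes down, the candidate master is promoted to the new master and the remaining slave is repointed to it.
The Monitor (MHA Manager) is deployed on server04. Its main job is to check whether the master of the replication cluster is healthy; once the master fails, MHA Manager automatically completes the switchover between the master and the slaves.
II. MHA deployment in detail
2.1 Setting up master-slave replication
2.1.1 Master configuration file (on 192.168.2.128)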
[root@server02 ~]# cat /etc/my.cnf
[client]
port = 3306
socket = /tmp/mysql.sock
[mysql]
no-auto-rehash
[mysqld]
user = mysql
port = 3306
socket = /tmp/mysql.sock
basedir = /usr/local/mysql
datadir = /data/mysql/data
back_log = 2000
open_files_limit = 1024
max_connections = 800
max_connect_errors = 3000
max_allowed_packet = 33554432
external-locking = FALSE
character_set_server = utf8
#binlog
log-slave-updates = 1
binlog_format = row
log-bin = /data/mysql/logs/bin-log/mysql-bin
expire_logs_days = 5
sync_binlog = 1
binlog_cache_size = 1M
max_binlog_cache_size = 1M
max_binlog_size = 2M
#replicate-ignore-db=mysql
skip-name-resolve
slave-skip-errors = 1032,1062,
##skip_slave_start=1
skip_slave_start=0
##read_only=1
##relay_log_purge=0
###relay log
relay-log = /data/mysql/logs/relay-log/relay-bin
relay-log-info-file = /data/mysql/relay-log.info
###slow_log
slow_query_log = 1
slow-query-log-file = /data/mysql/logs/mysql-slow.log
log-error = /data/mysql/logs/error.log
##GTID
server_id = 1103
##gtid_mode=on
##enforce_gtid_consistency=on
event_scheduler = ON
innodb_autoinc_lock_mode = 1
innodb_buffer_pool_size = 10737418
innodb_data_file_path = ibdata1:10M:autoextend
innodb_data_home_dir = /data/mysql/data
innodb_log_group_home_dir = /data/mysql/data
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_io_capacity = 2000
innodb_log_buffer_size = 8388608
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 50
innodb_open_files = 512
innodb_read_io_threads = 8
innodb_thread_concurrency = 20
innodb_write_io_threads = 8
innodb_lock_wait_timeout = 10
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
key_buffer_size = 3221225472
innodb_log_file_size = 1G
local_infile = 1
log_bin_trust_function_creators = 1
log_output = FILE
long_query_time = 1
myisam_sort_buffer_size = 33554432
join_buffer_size = 8388608
tmp_table_size = 33554432
net_buffer_length = 8192
performance_schema = 1
performance_schema_max_table_instances = 200
query_cache_size = 0
query_cache_type = 0
read_buffer_size = 20971520
read_rnd_buffer_size = 16M
max_heap_table_size = 33554432
bulk_insert_buffer_size = 134217728
secure-file-priv = /data/mysql/tmp
sort_buffer_size = 2097152
table_open_cache = 128
thread_cache_size = 50
tmpdir = /data/mysql/tmp
slave-load-tmpdir = /data/mysql/tmp
wait_timeout = 120
transaction_isolation=read-committed
innodb_flush_log_at_trx_commit=0
lower_case_table_names=1
[mysqldump]
quick
max_allowed_packet = 64M
[mysqld_safe]
log-error = /data/mysql/logs/error.log
pid-file = /data/mysql/mysqld.pid
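2.1.2 Slave configuration files
192.168.2.129: this slave is promoted to the new master when the master fails.
Its my.cnf differs from that of the master 192.168.2.128 as follows:
the following 2 parameters are enabled on 129
read_only=1
relay_log_purge=0
The server-id value must also differ from that of master 128.
192.168.2.130: provides read-only MySQL service, data backup, and the Monitor.
Its my.cnf differs from that of the master 192.168.2.128 as follows:
the following 2 parameters are enabled on 130
read_only=1
relay_log_purge=0
The server-id value must also differ from those of master 128 and slave 129.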
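The uniqueness requirement on server-id can be checked before starting replication. The following is a minimal sketch, not part of the original setup: the function name `server_ids_unique` is hypothetical, and the IDs 1104 and 1105 are assumed values for the two slaves (only 1103 appears in the master's my.cnf).

```shell
#!/usr/bin/env bash
# Succeed only if every server_id passed as an argument is distinct.
# Hypothetical helper; the IDs would be read from each host's my.cnf.
server_ids_unique() {
    local dups
    # uniq -d prints only the values that occur more than once
    dups=$(printf '%s\n' "$@" | sort | uniq -d)
    [ -z "$dups" ]
}

# Example (1104 and 1105 are assumed slave IDs):
# server_ids_unique 1103 1104 1105 && echo "server_id values are unique"
```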
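2.1.3 Parameter details
1) Every slave should set relay_log_purge=0
Otherwise a warning is reported. The parameter can be changed dynamically with mysql -e 'set global relay_log_purge=0', because a slave may be promoted to master at any time.
2) Every slave should set read_only=1
Otherwise a warning is reported. The parameter can be changed dynamically with mysql -e 'set global read_only=1', because a slave may be promoted to master at any time.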
2.1.4 Replication steps
1) Create the replication account on the master
On the master (server02):
mysqldump -uroot -p123456 --master-data=2 --single-transaction -R --triggers -A > /root/all.sql
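Here --master-data=2 records the master's binlog file name and position at the moment of the backup (written into the dump as a commented-out CHANGE MASTER TO statement), --single-transaction takes a consistent snapshot, -R backs up stored procedures and functions, --triggers backs up triggers, and -A backs up all databases.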
grant replication slave on *.* to 'repmha'@'192.168.2.%' identified by '123456';
flush privileges;
scp -rp /root/all.sql [email protected]:/root
scp -rp /root/all.sql [email protected]:/root
On the two slaves (server03 and server04):
mysql -uroot -p123456
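The remaining slave-side steps are not shown above. As a hedged sketch of what usually follows: a dump taken with --master-data=2 carries the binlog coordinates in a commented CHANGE MASTER TO line, which can be extracted and turned into the statement the slave needs. The function names `dump_coordinates` and `change_master_sql` are hypothetical; the host, account, and password are the ones used earlier in this walkthrough.

```shell
#!/usr/bin/env bash
# Print "<binlog file> <position>" from the commented CHANGE MASTER line
# that mysqldump --master-data=2 writes near the top of the dump.
dump_coordinates() {
    local dump="$1"
    sed -n "s/^-- CHANGE MASTER TO MASTER_LOG_FILE='\([^']*\)', MASTER_LOG_POS=\([0-9]*\);.*/\1 \2/p" "$dump" | head -n 1
}

# Build the CHANGE MASTER TO statement for a slave from those coordinates.
change_master_sql() {
    local file="$1" pos="$2"
    printf "CHANGE MASTER TO MASTER_HOST='192.168.2.128', MASTER_USER='repmha', MASTER_PASSWORD='123456', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" "$file" "$pos"
}

# Assumed usage on each slave (not executed here):
#   mysql -uroot -p123456 < /root/all.sql
#   read -r f p < <(dump_coordinates /root/all.sql)
#   mysql -uroot -p123456 -e "$(change_master_sql "$f" "$p") START SLAVE;"
#   mysql -uroot -p123456 -e "SHOW SLAVE STATUS\G"
```

Reading the coordinates from the dump itself avoids copying them by hand from SHOW MASTER STATUS, which may already have advanced by the time the slave is configured.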
The master-slave replication environment is now configured.
2.2 Create the MySQL management user monitor
2.2.1 Remove superfluous users
mysql> drop user root@'localhost';
mysql> select user,host from mysql.user;
+------+--------------+
| user | host |
+------+--------------+
| root | 127.0.0.1 |
| rep | 192.168.10.% |
+------+--------------+
3 rows in set (0.00 sec)
2.2.2 Create the monitor management account
mysql> grant all on *.* to monitor@'192.168.10.%' identified by '123456';
Query OK, 0 rows affected (0.01 sec)
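An earlier MHA replication check failed precisely because of this authorization issue: the manager does not administer the databases through localhost; it connects over the internal network segment to check replication, so the account must be granted for that segment.
2.3. Set read_only on both slave servers
(The slaves serve reads. The setting is not written to the configuration file because a slave may be promoted to master at any time.)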
192.168.2.129 [root ~]$ mysql -uroot -p123456 -e "set global read_only=1"
192.168.2.130 [root ~]$ mysql -uroot -p123456 -e "set global read_only=1"
2.4. Configure passwordless SSH login
(Key-based login is common in production. It is best not to disable password login; doing so may cause problems.)
On server02 192.168.2.128 (Master):
192.168.2.128 [root ~]$ ssh-keygen -t rsa
192.168.2.128 [root ~]$ ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
192.168.2.128 [root ~]$ ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
On server03 192.168.2.129 (slave):
192.168.2.129 [root ~]$ ssh-keygen -t rsa
192.168.2.129 [root ~]$ ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
192.168.2.129 [root ~]$ ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
On server04 192.168.2.130 (slave+Monitor):
192.168.2.130 [root ~]$ ssh-keygen -t rsa
192.168.2.130 [root ~]$ ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
192.168.2.130 [root ~]$ ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
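The per-host commands above follow one pattern: each node copies its public key to the other two. A minimal sketch of that pattern, assuming the root account on every host; the variable `MHA_NODES` and the functions `peers_of` and `distribute_key` are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# The three hosts of this walkthrough.
MHA_NODES="192.168.2.128 192.168.2.129 192.168.2.130"

# Print every cluster node except the one given as $1, space separated.
peers_of() {
    local self="$1" node out=""
    for node in $MHA_NODES; do
        [ "$node" = "$self" ] && continue
        out="${out:+$out }$node"
    done
    printf '%s\n' "$out"
}

# Copy this host's public key to each peer.
# Assumes the root account; ssh-copy-id asks for each peer's password once.
distribute_key() {
    local self="$1" peer
    for peer in $(peers_of "$self"); do
        ssh-copy-id -i /root/.ssh/id_rsa.pub "root@$peer"
    done
}

# Assumed usage on server02, after ssh-keygen -t rsa:
#   distribute_key 192.168.2.128
```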
2.5. Installing MHA
2.5.1 Create a directory for the packages
mkdir -p /data/tools
cd /data/tools/
rz mha4mysql-node-0.56-0.el6.noarch.rpm
rz mha4mysql-manager-0.56-0.el6.noarch.rpm
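Install MHA Node on all nodes (server02 is used as the example below; remember to do the same on server03 and server04). Both MHA Node and MHA Manager must be downloaded from the official site.
Download: https://code.google.com/p/mysql-master-ha/wiki/Downloads?tm=2 (bring your own ×××)
2.5.2 Install MHA Node
Install on all servers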
yum install -y perl-DBD-MySQL
rpm -ihv mha4mysql-node-0.56-0.el6.noarch.rpm
2.5.3. Install MHA Manager
Install mha4mysql-manager and mha4mysql-node on 192.168.2.130
yum -y install perl-Parallel-ForkManager perl-Log-Dispatch perl-Time-HiRes perl-Mail-Sender perl-Mail-Sendmail perl-DBD-MySQL perl-Config-Tiny perl-Config-IniFiles
rpm -ihv mha4mysql-node-0.56-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
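For demonstration, only the installation of mha4mysql-node and mha4mysql-manager on 192.168.2.130 is shown here.
A. Install from the RPM packages
[root@server03 ~]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
After a successful installation the package contains the following files: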
[root@server03 ~]# rpm -ql mha4mysql-node-0.56-0.el6.noarch
/usr/bin/apply_diff_relay_logs
/usr/bin/filter_mysqlbinlog
/usr/bin/purge_relay_logs
/usr/bin/save_binary_logs
/usr/share/man/man1/apply_diff_relay_logs.1.gz
/usr/share/man/man1/filter_mysqlbinlog.1.gz
/usr/share/man/man1/purge_relay_logs.1.gz
/usr/share/man/man1/save_binary_logs.1.gz
/usr/share/perl5/vendor_perl/MHA/BinlogHeaderParser.pm
/usr/share/perl5/vendor_perl/MHA/BinlogManager.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFindManager.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFinder.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFinderElp.pm
/usr/share/perl5/vendor_perl/MHA/BinlogPosFinderXid.pm
/usr/share/perl5/vendor_perl/MHA/NodeConst.pm
/usr/share/perl5/vendor_perl/MHA/NodeUtil.pm
/usr/share/perl5/vendor_perl/MHA/SlaveUtil.pm
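Node script descriptions (these tools are normally triggered by the MHA Manager scripts and need no manual operation):
save_binary_logs //saves and copies the master's binary logs
apply_diff_relay_logs //identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog //strips unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs //purges relay logs (without blocking the SQL thread)
The four commands above are installed by mha4mysql-node-0.56-0.el6.noarch.rpm.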
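Because MHA Manager runs these tools over SSH on every node, it can help to confirm that each one is on PATH after installation. A minimal sketch; the function name `missing_tools` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the name of each listed tool that is not found on PATH.
# Prints nothing when all of them are present.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed usage on each node after installing mha4mysql-node:
#   missing_tools save_binary_logs apply_diff_relay_logs filter_mysqlbinlog purge_relay_logs
```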
[root@server03 ~]# rpm -ql mha4mysql-manager-0.56-0.el6.noarch
package mha4mysql-manager-0.56-0.el6.noarch is not installed
[root@server03 ~]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm
Preparing... ########################################### [100%]
1:mha4mysql-manager ########################################### [100%]
[root@server03 ~]#
After a successful installation the package contains the following files:
[root@server03 ~]# rpm -ql mha4mysql-manager-0.56-0.el6.noarch
/usr/bin/masterha_check_repl
/usr/bin/masterha_check_ssh
/usr/bin/masterha_check_status
/usr/bin/masterha_conf_host
/usr/bin/masterha_manager
/usr/bin/masterha_master_monitor
/usr/bin/masterha_master_switch
/usr/bin/masterha_secondary_check
/usr/bin/masterha_stop
/usr/share/man/man1/masterha_check_repl.1.gz
/usr/share/man/man1/masterha_check_ssh.1.gz
/usr/share/man/man1/masterha_check_status.1.gz
/usr/share/man/man1/masterha_conf_host.1.gz
/usr/share/man/man1/masterha_manager.1.gz
/usr/share/man/man1/masterha_master_monitor.1.gz
/usr/share/man/man1/masterha_master_switch.1.gz
/usr/share/man/man1/masterha_secondary_check.1.gz
/usr/share/man/man1/masterha_stop.1.gz
/usr/share/perl5/vendor_perl/MHA/Config.pm
/usr/share/perl5/vendor_perl/MHA/DBHelper.pm
/usr/share/perl5/vendor_perl/MHA/FileStatus.pm
/usr/share/perl5/vendor_perl/MHA/HealthCheck.pm
/usr/share/perl5/vendor_perl/MHA/ManagerAdmin.pm
/usr/share/perl5/vendor_perl/MHA/ManagerAdminWrapper.pm
/usr/share/perl5/vendor_perl/MHA/ManagerConst.pm
/usr/share/perl5/vendor_perl/MHA/ManagerUtil.pm
/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm
/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm
/usr/share/perl5/vendor_perl/MHA/MasterRotate.pm
/usr/share/perl5/vendor_perl/MHA/SSHCheck.pm
/usr/share/perl5/vendor_perl/MHA/Server.pm
/usr/share/perl5/vendor_perl/MHA/ServerManager.pm
Copy the following scripts to the /usr/local/bin directory:
/usr/bin/masterha_check_repl
/usr/bin/masterha_check_ssh
/usr/bin/masterha_check_status
/usr/bin/masterha_conf_host
/usr/bin/masterha_manager
/usr/bin/masterha_master_monitor
/usr/bin/masterha_master_switch
/usr/bin/masterha_secondary_check
/usr/bin/masterha_stop
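The manager tools listed above read a configuration file passed with --conf. This walkthrough does not show one, so the following is only a sketch under assumptions: the function name `mha_conf`, the file path /etc/masterha/app1.cnf, and the directories under /var/log/masterha are hypothetical, while the accounts and IP addresses are the ones used earlier.

```shell
#!/usr/bin/env bash
# Emit a minimal MHA Manager configuration for the three hosts.
# Paths under /var/log/masterha are assumptions, not taken from this walkthrough.
mha_conf() {
    cat <<'EOF'
[server default]
manager_workdir=/var/log/masterha/app1
manager_log=/var/log/masterha/app1/manager.log
user=monitor
password=123456
ssh_user=root
repl_user=repmha
repl_password=123456

[server1]
hostname=192.168.2.128

[server2]
hostname=192.168.2.129
candidate_master=1

[server3]
hostname=192.168.2.130
no_master=1
EOF
}

# Assumed usage on the manager host (not executed here):
#   mkdir -p /etc/masterha /var/log/masterha/app1
#   mha_conf > /etc/masterha/app1.cnf
#   masterha_check_ssh  --conf=/etc/masterha/app1.cnf
#   masterha_check_repl --conf=/etc/masterha/app1.cnf
```

Marking server03 with candidate_master=1 and server04 with no_master=1 mirrors the roles described in section 1.3, where server03 is the standby master and server04 only serves reads and backups.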
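Copy the related scripts to /usr/local/bin as well. They are available once the source package is unpacked. This step is optional: the scripts are incomplete and must be modified by hand, which the developers left for users to fill in. If the parameter for any of the scripts below is enabled while the script itself has not been modified, MHA throws an error; I was badly bitten by this.
master_ip_failover //manages the VIP during automatic failover; optional. With keepalived we can write our own script to manage the VIP, for example one that monitors MySQL and stops keepalived when MySQL is abnormal, so that the VIP floats to the other host automatically
master_ip_online_change //manages the VIP during an online switch; optional, and a simple shell script of our own can do the same
power_manager //shuts down the host after a failure; optional
send_report //sends an alert after a failover; optional, and a simple shell script of our own can do the same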
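The keepalived approach mentioned for master_ip_failover can be sketched as a small watchdog: probe MySQL a few times and stop keepalived only if every probe fails. This is a minimal sketch, assuming mysqladmin is on PATH and keepalived is controlled through the CentOS 6 `service` command; the function names `mysql_alive`, `release_vip`, and `check_and_release` are hypothetical.

```shell
#!/usr/bin/env bash
# Succeed when the local mysqld answers a ping.
# Credentials mirror the ones used in this walkthrough.
mysql_alive() {
    mysqladmin -uroot -p123456 ping >/dev/null 2>&1
}

# Stop keepalived so that the VIP floats to the peer host.
release_vip() {
    service keepalived stop
}

# Probe mysqld up to $1 times, $2 seconds apart.
# Returns 0 as soon as one probe succeeds; otherwise releases the VIP and returns 1.
check_and_release() {
    local tries="${1:-3}" pause="${2:-1}" i=0
    while [ "$i" -lt "$tries" ]; do
        if mysql_alive; then
            return 0
        fi
        i=$((i + 1))
        sleep "$pause"
    done
    release_vip
    return 1
}

# Assumed usage from cron or a keepalived check script:
#   check_and_release 3 1
```

Retrying before stopping keepalived keeps a single slow response from moving the VIP unnecessarily.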
At this point the whole MHA cluster environment is set up.