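I. Building a highly available MySQL architecture with mysql-mmm
MMM (Master-Master Replication Manager for MySQL) is a flexible suite of scripts for monitoring, failover, and management of MySQL master-master replication setups (only one node is writable at any given time). The suite can also load-balance reads across any number of slaves in a standard master-slave setup, so you can use it to bring up virtual IPs on a group of replicating servers. It also ships scripts for data backups and for resynchronizing nodes.
MySQL itself does not provide a replication failover solution. MMM adds server failover and thereby makes MySQL highly available. The MMM project was originally hosted on Google Code: http://code.google.com/p/mysql-master-master
Official website: http://mysql-mmm.org
The main functionality of mysql-mmm is provided by the following three scripts:
mmm_mond     the monitoring daemon; it does all the monitoring work and decides, among other things, when a node is removed
mmm_agentd   the agent daemon that runs on each MySQL server and exposes a simple set of remote services to the monitor node
mmm_control  a command-line tool for managing the mmm_mond process
The mysql-mmm monitor manages several virtual IPs (VIPs): one writer VIP and several reader VIPs. Under the monitor's control these IPs are bound to available MySQL servers; when a MySQL server goes down, the monitor migrates its VIPs to the other servers.
For this to work, dedicated users must be created in MySQL so that the monitor can manage the servers: an mmm_monitor user and an mmm_agent user, plus an mmm_tools user if you want to use the MMM backup tools.
Now for the deployment. Machine resources are limited, so this lab uses only one slave.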
II. Preparation
1. Environment
VMware virtual machines: 4
OS version: CentOS release 6.6 (Final) 2.6.32-504.el6.x86_64
MySQL version: 5.5.32
mysql-mmm version:
The four virtual machines:
Role          IP              server-id   Hostname
MMM monitor   192.168.0.149   -           test-A
master1       192.168.0.150   1           test-B
master2       192.168.0.160   3           test-D
slave         192.168.0.151   2           test-C
Virtual IPs:
10.0.0.13 write
10.0.0.14 read
10.0.0.15 read
10.0.0.16 read
Overview of the MySQL-MMM architecture:
1. Install MySQL on master1 and master2 and configure master-master replication between them.
2. Install MySQL on the slave and configure it as a slave of master1.
3. Install mysql-mmm on all four machines (master1, master2, slave, and the monitor) and configure mmm_common.conf (all hosts), mmm_agent.conf (database hosts), and mmm_mon.conf (monitor).
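III. Configuring mysql-master-1/2 (master-master replication) and mysql-master-1 with mysql-slave (master-slave replication)
Note: every MySQL instance is freshly installed, so none of them contains any data and all environments are identical.
1.1 Edit my.cnf, then restart the service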
mysql-master-1:
[mysqld]
server-id = 1
log-bin=mysql-bin
log-slave-updates
auto_increment_offset=1
auto_increment_increment=2
mysql-master-2:
[mysqld]
server-id = 3
log-bin=mysql-bin
log-slave-updates
auto_increment_offset=2
auto_increment_increment=2
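The two masters share auto_increment_increment=2 but use different auto_increment_offset values. The following sketch lists the first few AUTO_INCREMENT values each master would hand out under these settings. It only mimics the arithmetic (offset + n * increment) and does not query MySQL; the function name auto_inc_ids is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Illustration only: print the first COUNT auto-increment values for a server
# configured with the given auto_increment_offset and auto_increment_increment.
# auto_inc_ids is a hypothetical helper, not part of MySQL or mysql-mmm.
auto_inc_ids() {
    offset=$1; increment=$2; count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo $((offset + i * increment))
        i=$((i + 1))
    done
}

# master1: offset 1, increment 2; master2: offset 2, increment 2
echo "master1: $(auto_inc_ids 1 2 5 | tr '\n' ' ')"
echo "master2: $(auto_inc_ids 2 2 5 | tr '\n' ' ')"
```

master1 ends up with the odd values and master2 with the even ones, so rows inserted on either master never compete for the same auto-increment key.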
1.2 Configure master-master replication between master1 and master2
Create the replication user on both master1 and master2:
mysql> grant replication slave on *.* to 'rep'@'192.168.0.%' identified by 'test123';
Query OK, 0 rows affected (0.02 sec)
On master1:
mysql> show master status;
+------------------+----------+--------------+---------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+---------------------------------------------+
| mysql-bin.000006 | 107 | | mysql,performance_schema,information_schema |
+------------------+----------+--------------+---------------------------------------------+
1 row in set (0.00 sec)
On master2:
mysql> change master to
    -> master_host='192.168.0.150',
    -> master_port=3306,
    -> master_user='rep',
    -> master_password='test123',
    -> master_log_file='mysql-bin.000006',
    -> master_log_pos=107;
mysql> start slave;
mysql> show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.150
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000006
Read_Master_Log_Pos: 107
Relay_Log_File: test-D-relay-bin.000006
Relay_Log_Pos: 253
Relay_Master_Log_File: mysql-bin.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 107
Relay_Log_Space: 410
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
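ERROR:
No query specified
Note: this error is harmless. \G already terminates the statement, so the trailing ";" is sent as an empty query.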
Still on master2 (the replication user must exist here as well), record the binlog position:
mysql> grant replication slave on *.* to 'rep'@'192.168.0.%' identified by 'test123';
mysql> show master status;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 | 107 | | |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
On master1:
mysql> change master to
    -> master_host='192.168.0.160',
    -> master_port=3306,
    -> master_user='rep',
    -> master_password='test123',
    -> master_log_file='mysql-bin.000003',
    -> master_log_pos=107;
mysql> start slave;
mysql> show slave status\G;
2.1 Edit my.cnf on the slave and restart the service
vi /data/3307/my.cnf
[mysqld]
server-id = 2
[root@test-C ~]# mysqladmin -uroot -p456 shutdown -S /data/3307/mysql.sock
[root@test-C ~]# /application/mysql/bin/mysqld_safe --defaults-file=/data/3307/my.cnf &
Note: the slave runs as one of several MySQL instances on this host (the instance under /data/3307).
2.2 Configure the replication parameters
Record the binlog position of master1:
mysql> flush tables with read lock;   # lock all tables; keep this session open until the dump has finished
mysql> show master status;
+------------------+----------+--------------+---------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+---------------------------------------------+
| mysql-bin.000006 | 107 | | mysql,performance_schema,information_schema |
+------------------+----------+--------------+---------------------------------------------+
1 row in set (0.00 sec)
In a second shell on master1, while the lock is still held:
mysqldump -uroot --events -A -B >/tmp/master1.sql   # back up the master
mysql> unlock tables;   # release the lock
On the slave (after copying /tmp/master1.sql from master1 into the current directory):
mysql -uroot <master1.sql -S /data/3307/mysql.sock
mysql> CHANGE MASTER TO
-> MASTER_HOST='192.168.0.150',
-> MASTER_PORT=3306,
-> MASTER_USER='rep',
-> MASTER_PASSWORD='test123',
-> MASTER_LOG_FILE='mysql-bin.000006',
-> MASTER_LOG_POS=107;
Query OK, 0 rows affected (0.04 sec)
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
mysql> show slave status\G;
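Reading the full status output by eye gets tedious once there are several replication links to check. The sketch below reduces it to the two fields that matter most, Slave_IO_Running and Slave_SQL_Running. The helper name replication_ok is hypothetical; it only parses text on stdin and never connects to MySQL itself.

```shell
#!/bin/sh
# Illustration only: read the output of "show slave status\G" on stdin and
# succeed only if both replication threads report "Yes".
# replication_ok is a hypothetical helper name.
replication_ok() {
    awk -F': ' '
        { sub(/[ \t\r]+$/, "") }                  # strip trailing whitespace
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example usage on the slave from this article (socket path as used above):
#   mysql -uroot -p -S /data/3307/mysql.sock -e 'show slave status\G' \
#       | replication_ok && echo "replication healthy" || echo "replication broken"
```

The exit status makes the check easy to reuse in a cron job or a monitoring script.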
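IV. Configuring mysql-mmm
4.1 Install mysql-mmm
Note: mysql-mmm must be installed on all four servers.
The default CentOS repositories do not include these packages; they come from EPEL, so the epel-release package has to be installed first.
Run on all four machines: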
cd tools
wget http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
rpm -Uvh epel-release-6-8.noarch.rpm
yum install -y mysql-mmm*
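4.2 Grant privileges to the MMM agent and monitor accounts
Replication is now in place, and I did not configure it to ignore the mysql database or its user table, so it is enough to run the following on either master; both accounts then replicate to the other servers. If your setup does filter the mysql database out of replication, run the grants on every server instead.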
mysql> GRANT REPLICATION CLIENT ON *.* TO 'mmm_monitor'@'192.168.0.%' IDENTIFIED BY 'test123';
mysql> GRANT SUPER, REPLICATION CLIENT, PROCESS ON *.* TO 'mmm_agent'@'192.168.0.%' IDENTIFIED BY 'test123';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.06 sec)
Note: the result is the same on master1, master2, and the slave:
mysql> select user,host from mysql.user;
+-------------+-------------+
| user | host |
+-------------+-------------+
| root | 127.0.0.1 |
| mmm_agent | 192.168.0.% |
| mmm_monitor | 192.168.0.% |
| rep | 192.168.0.% |
| root | ::1 |
| | localhost |
| root | localhost |
| | test-B |
| root | test-B |
+-------------+-------------+
8 rows in set (0.00 sec)
4.3 Configure /etc/mysql-mmm/mmm_common.conf on all servers
vi /etc/mysql-mmm/mmm_common.conf
active_master_role writer
<host default>
cluster_interface eth0
pid_path /var/run/mysql-mmm/mmm_agentd.pid
bin_path /usr/libexec/mysql-mmm/
replication_user rep
replication_password test123
agent_user mmm_agent
agent_password test123
</host>
<host db1>
ip 192.168.0.150
mode master
peer db2
</host>
<host db2>
ip 192.168.0.160
mode master
peer db1
</host>
<host db3>
ip 192.168.0.151
mode slave
</host>
<role writer>
hosts db1, db2
ips 10.0.0.13
mode exclusive
</role>
<role reader>
hosts db1, db2, db3
ips 10.0.0.14, 10.0.0.15, 10.0.0.16
mode balanced
</role>
4.4 Configure /etc/mysql-mmm/mmm_agent.conf on the database hosts
hostname   ip              server-id (my.cnf)   db name
master1    192.168.0.150   1                    db1
master2    192.168.0.160   3                    db2
slave1     192.168.0.151   2                    db3
Using the table above, edit /etc/mysql-mmm/mmm_agent.conf on each of the three MySQL servers.
Example:
[root@mysql-mmm-master1 tools]# vi /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
# The 'this' variable refers to this server. Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this db1
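The only line that differs between the three database hosts is the "this" entry, which must match that host's db name in mmm_common.conf. As a minimal sketch, the file can be generated per host as follows; render_agent_conf is a hypothetical helper, and the accepted names are the db1/db2/db3 names from the table above.

```shell
#!/bin/sh
# Illustration only: print the mmm_agent.conf contents for one database host.
# render_agent_conf is a hypothetical helper; db1, db2 and db3 are the db
# names defined in mmm_common.conf in this article.
render_agent_conf() {
    case "$1" in
        db1|db2|db3) ;;
        *) echo "unknown db name: $1" >&2; return 1 ;;
    esac
    printf 'include mmm_common.conf\nthis %s\n' "$1"
}

# Example: on master2 (192.168.0.160) the file must say "this db2":
#   render_agent_conf db2 > /etc/mysql-mmm/mmm_agent.conf
render_agent_conf db2
```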
4.5 Configure /etc/mysql-mmm/mmm_mon.conf on the monitor host
include mmm_common.conf
<monitor>
ip 127.0.0.1
pid_path /var/run/mysql-mmm/mmm_mond.pid
bin_path /usr/libexec/mysql-mmm
status_path /var/lib/mysql-mmm/mmm_mond.status
ping_ips 192.168.0.150, 192.168.0.151, 192.168.0.160
auto_set_online 30
# The kill_host_bin does not exist by default, though the monitor will
# throw a warning about it missing. See the section 5.10 "Kill Host
# Functionality" in the PDF documentation.
#
# kill_host_bin /usr/libexec/mysql-mmm/monitor/kill_host
#
</monitor>
<host default>
monitor_user mmm_monitor
monitor_password test123
</host>
debug 0
4.6 Start mysql-mmm
Start the agent on master-1, master-2, and the slave.
Enable it by editing /etc/default/mysql-mmm-agent:
[root@mysql-mmm-master2 tools]# vi /etc/default/mysql-mmm-agent
# mysql-mmm-agent defaults
ENABLED=1
Start mmm-agent on all database hosts:
/etc/init.d/mysql-mmm-agent start
Start mmm-monitor on the monitor host:
/etc/init.d/mysql-mmm-monitor start
4.7 Check the state of the MySQL servers with mmm_control
[root@mysql-mmm-monitor ~]# mmm_control show
db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14)
db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.16)
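In scripts it is often enough to know which host currently holds the writer role. The sketch below extracts that from output in the format shown above; current_writer is a hypothetical helper name, and it assumes each host line starts with the db name followed by its IP in parentheses.

```shell
#!/bin/sh
# Illustration only: read "mmm_control show" output on stdin and print the
# name of the host that currently holds the writer role.
# current_writer is a hypothetical helper name.
current_writer() {
    awk '/writer\(/ { sub(/\(.*/, "", $1); print $1 }'
}

# Example usage on the monitor host:
#   mmm_control show | current_writer
```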
4.8 Test automatic failover between the two masters
Stop the MySQL service on db1:
[root@test-B ~]# /etc/init.d/mysqld stop
Shutting down MySQL. SUCCESS!
Wait 30 seconds, then check the status on the mysql-mmm monitor server:
[root@test-A ~]# mmm_control show
db1(192.168.0.150) master/HARD_OFFLINE. Roles:
db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14), writer(10.0.0.13)
db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.15), reader(10.0.0.16)
On the slave, check that master_host has switched to the other master:
[root@test-C ~]# mysql -uroot -p -e "show slave status\G" -S /data/3307/mysql.sock
Enter password:
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.160
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 537
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 253
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB: mysql
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 537
Relay_Log_Space: 403
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 3
Bring master-1 (db1) back:
[root@test-B ~]# /etc/init.d/mysqld start
Starting MySQL.. SUCCESS!
Check the recovery on the monitor:
[root@test-A ~]# mmm_control show
db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.16)
db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14), writer(10.0.0.13)
db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.15)
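As you can see, once db1 recovers it comes back with only a reader role and effectively acts as a standby. The writer role stays on db2; db1 takes over writes again only if db2 goes down.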
4.9 The mmm_control commands
Valid commands are:
help - show this message
ping - ping monitor
show - show status
checks [<host>|all [<check>|all]] - show checks status
set_online <host> - set host <host> online
set_offline <host> - set host <host> offline
mode - print current mode.
set_active - switch into active mode.
set_manual - switch into manual mode.
set_passive - switch into passive mode.
move_role [--force] <role> <host> - move exclusive role <role> to host <host>
(Only use --force if you know what you are doing!)
set_ip <ip> <host> - set role with ip <ip> to host <host>
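As an example of combining these commands, the sketch below moves the writer role back to a recovered master, but only once the monitor reports that host as ONLINE. It uses only the show and move_role commands listed above; failback_writer is a hypothetical helper name, and the pattern assumes the "show" output format seen earlier in this article.

```shell
#!/bin/sh
# Illustration only: move the exclusive writer role to HOST, but only if
# "mmm_control show" reports HOST as ONLINE.
# failback_writer is a hypothetical helper name.
failback_writer() {
    host=$1
    if mmm_control show | grep -q "^ *$host(.*/ONLINE\."; then
        mmm_control move_role writer "$host"
    else
        echo "$host is not ONLINE; not moving the writer role" >&2
        return 1
    fi
}

# Example usage on the monitor host:
#   failback_writer db1
```

Keeping the ONLINE check in front of move_role avoids reaching for --force, which the help text above warns against.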
V. Problems I ran into during configuration, and how I solved them
Problem:
At the end of the setup, when I checked the status of all servers, the slave was not online:
[root@test-A ~]# mmm_control show
db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14), reader(10.0.0.16)
db3(192.168.0.151) slave/HARD_OFFLINE. Roles:
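Solution:
The MySQL instance on the slave had been set up as part of a multi-instance test and was listening on port 3307, which is presumably why the MMM checks, which target the default port 3306 unless configured otherwise, marked it HARD_OFFLINE. I stopped replication on the slave, changed the port to 3306 in the configuration file, and restarted the service. After setting up master-slave replication again and restarting the MMM agent, the status on the MMM monitor showed all servers healthy: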
[root@test-A ~]# mmm_control show
db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14)
db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.16)
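When troubleshooting a state like the HARD_OFFLINE one above, it helps to filter the status output down to the hosts that are not ONLINE. A minimal sketch, assuming the "show" output format seen in this article; offline_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Illustration only: read "mmm_control show" output on stdin and print every
# host whose state is anything other than ONLINE, together with that state.
# offline_hosts is a hypothetical helper name.
offline_hosts() {
    awk '$1 ~ /\(/ && $2 !~ /\/ONLINE\.$/ {
        sub(/\(.*/, "", $1)      # keep only the db name, drop "(ip)"
        print $1 ": " $2
    }'
}

# Example usage on the monitor host:
#   mmm_control show | offline_hosts
```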