Reposted from: http://opsmysql.blog.51cto.com/2238445/1059322/
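About MMM:
MMM (Master-Master Replication Manager for MySQL) is a flexible suite of scripts for monitoring, failover, and management of MySQL master-master replication setups, in which only one node is writable at any given time. The suite can also balance reads across any number of slaves in a standard master-slave setup, so you can use it to move virtual IPs around a group of servers that replicate from each other. It also ships scripts for data backups and for resynchronizing nodes.
MySQL itself ships no replication failover solution; MMM provides server failover and thereby makes MySQL highly available.
The MMM project was hosted on Google Code: http://code.google.com/p/mysql-master-master
Official website: http://mysql-mmm.org
MMM's main functionality is provided by three scripts:
mmm_mond: the monitoring daemon; it does all the monitoring work and decides when a node is removed, and so on
mmm_agentd: the agent daemon that runs on each MySQL server and exposes a simple set of remote services to the monitoring node
mmm_control: a command-line tool for managing the mmm_mond process
Pros and cons of this architecture:
Pros: safe, stable, and easy to scale. When the active master goes down, the other master takes over immediately and the remaining slaves switch over automatically, with no manual intervention.
Cons: it needs at least three nodes, so there is a minimum host count, and it requires read/write splitting, which can be hard to implement in the application. It also places high demands on replication lag between master and slave (the two masters), so it is not suitable where data-safety requirements are very strict.
Suitable for: high-traffic, fast-growing workloads that need read/write splitting.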
Environment:
MMM_Monitor: 192.168.8.31-----(MySQL-MON)
MySQL_Master1: 192.168.8.32-----(MySQL-M1)
MySQL_Master2: 192.168.8.33-----(MySQL-M2)
VIP_Write: 192.168.8.30-----(VIP0)
VIP_Read1: 192.168.8.34-----(VIP1)
VIP_Read2: 192.168.8.35-----(VIP2)
Architecture diagram: (image not available)
I. Basic environment setup
1. Configure hosts resolution
Run the following on all three servers:
cat >>/etc/hosts<<EOF
192.168.8.31 MySQL-MON
192.168.8.32 MySQL-M1
192.168.8.33 MySQL-M2
EOF
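Running the `cat >>` command a second time appends the same three lines again. Below is a minimal sketch of an idempotent alternative. The helper name `add_host_entry` is hypothetical (it is not part of MMM), and it takes the target file as an argument so it can be tried on a scratch copy before touching /etc/hosts:

```bash
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless a line with exactly that IP and NAME
# is already present. A missing FILE is treated as empty and gets created.
add_host_entry() {
    file=$1 ip=$2 name=$3
    if ! awk -v ip="$ip" -v name="$name" \
            '$1 == ip && $2 == name { found = 1 } END { exit !found }' \
            "$file" 2>/dev/null
    then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (run as root on each of the three servers):
#   add_host_entry /etc/hosts 192.168.8.31 MySQL-MON
#   add_host_entry /etc/hosts 192.168.8.32 MySQL-M1
#   add_host_entry /etc/hosts 192.168.8.33 MySQL-M2
```

Calling the helper repeatedly leaves the file unchanged after the first run, which makes it safe to keep in a provisioning script.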
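II. Install and configure MySQL
The installation steps are omitted; if you cannot install MySQL yourself, this tutorial is not for you!
The relevant details of the my.cnf configuration file:
MySQL-M1 configuration: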
server-id = 12
#log-slave-updates
#sync_binlog = 1
log-bin = /data/mysql/binlog/mysql-bin
auto-increment-increment = 2
auto-increment-offset = 1
relay-log=mysql-relay
relay-log-index=mysql-relay.index
MySQL-M2 configuration:
server-id = 13
#log-slave-updates
#sync_binlog = 1
log-bin = /data/mysql/binlog/mysql-bin
auto-increment-increment = 2
auto-increment-offset = 2
relay-log=mysql-relay
relay-log-index=mysql-relay.index
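The two auto-increment settings are what keep the masters from handing out the same key: each server generates values of the form offset + N × increment. The sketch below only does that arithmetic and does not talk to MySQL. `auto_inc_ids` is a hypothetical helper, and the example assumes the two masters use different offsets (1 and 2) with an increment of 2:

```bash
# auto_inc_ids OFFSET INCREMENT COUNT
# Print the first COUNT values of OFFSET + N * INCREMENT on one line.
auto_inc_ids() {
    offset=$1 increment=$2 count=$3
    i=0 out=""
    while [ "$i" -lt "$count" ]; do
        # ${out:+ } inserts a separating space only after the first value
        out="$out${out:+ }$((offset + i * increment))"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

auto_inc_ids 1 2 5   # master with offset 1 prints: 1 3 5 7 9
auto_inc_ids 2 2 5   # master with offset 2 prints: 2 4 6 8 10
```

The two sequences never overlap, so a write that lands on either master cannot collide with a key generated on the other. If both servers used the same offset, the sequences would be identical.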
III. Install mysql-mmm
Install on all three servers:
wget http://download.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
rpm -ivh epel-release-5-4.noarch.rpm
yum -y install mysql-mmm*
[root@MySQL-M1 mysql-mmm]# rpm -qa |grep mysql-mmm
mysql-mmm-2.2.1-1.el5
mysql-mmm-agent-2.2.1-1.el5
mysql-mmm-tools-2.2.1-1.el5
mysql-mmm-monitor-2.2.1-1.el5
Note: you can also install from the source tarball:
wget http://mysql-mmm.org/_media/:mmm2:mysql-mmm-2.2.1.tar.gz
mv :mmm2:mysql-mmm-2.2.1.tar.gz mysql-mmm-2.2.1.tar.gz
tar xf mysql-mmm-2.2.1.tar.gz
cd mysql-mmm-2.2.1
make install
IV. Configure master-master replication between MySQL-M1 and MySQL-M2
1. First create three accounts
mysql> grant file, replication slave on *.* to 'repl'@'192.168.8.%' identified by "repl";
mysql> grant process, super, replication client on *.* to 'mmm_agent'@'192.168.8.%' identified by 'mmm_agent';
mysql> grant replication client on *.* to "mmm_monitor"@"192.168.8.%" identified by "mmm_monitor";
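Notes:
The first account, repl (replication account), is used for master-master replication.
The second account, mmm_agent (agent account), is used by the MMM agent to switch a server to read-only mode, sync with the master, and so on.
The third account, mmm_monitor (monitor account), is used by the MMM monitor server to health-check the MySQL servers.
One thing to note: because MySQL-M1 and MySQL-M2 replicate to each other, running the statements on one of them is enough; the last two statements, however, also need to be run on MySQL-MON!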
2. Configure master-master replication
2.1 Make MySQL-M1 the master of MySQL-M2
On MySQL-M1:
mysql> show master status;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 | 107 | | mysql |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
Then on MySQL-M2:
mysql> change master to master_host='192.168.8.32',master_user='repl',master_password='repl',master_log_file='mysql-bin.000003',master_log_pos=107;
Query OK, 0 rows affected (0.07 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.32
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 107
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 253
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.....................................................
Remaining output omitted..
.....................................................
2.2 Make MySQL-M2 the master of MySQL-M1
On MySQL-M2:
mysql> show master status;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 | 605 | | |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
On MySQL-M1:
mysql> change master to master_host='192.168.8.33',master_user='repl',master_password='repl',master_log_file='mysql-bin.000003',master_log_pos=605;
Query OK, 0 rows affected (0.06 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.8.33
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 605
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 253
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.....................................................
Remaining output omitted..
.....................................................
OK, master-master replication is configured! Testing the replication itself is not covered here; continue with the steps below.
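Reading the two "Running" fields by eye is easy to get wrong when the check has to be repeated on both masters. Here is a small sketch that does the same check on the text of `show slave status\G`. `replication_ok` is a hypothetical helper name, and the `mysql` invocation in the comment assumes the client can log in without extra options:

```bash
# replication_ok: read `SHOW SLAVE STATUS\G` output on stdin and succeed
# only if both the IO thread and the SQL thread report "Yes".
replication_ok() {
    awk '
        $1 == "Slave_IO_Running:"  { io  = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example against a live server (login options are an assumption):
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication OK"
```

Empty input, for example on a server where replication was never configured, counts as a failure because neither field is found.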
V. Configure the MMM monitor and agent services
1. Edit mmm_common.conf on all three servers (the file is identical on all three)
The file after editing:
active_master_role writer
<host default>
cluster_interface eth0
pid_path /var/run/mysql-mmm/mmm_agentd.pid
bin_path /usr/libexec/mysql-mmm/
replication_user repl #replication account created earlier
replication_password repl #password of the replication account
agent_user mmm_agent #agent account created earlier
agent_password mmm_agent #password of the agent account
</host>
<host MySQL-M1>
ip 192.168.8.32 #IP of the MySQL-M1 server
mode master
peer MySQL-M2 #host name of the MySQL-M2 server
</host>
<host MySQL-M2>
ip 192.168.8.33
mode master
peer MySQL-M1
</host>
<role writer>
hosts MySQL-M1, MySQL-M2 #servers that may hold the writer role
ips 192.168.8.30 #virtual IP of the writer node; application writes connect to this IP
mode exclusive #exclusive mode
</role>
<role reader>
hosts MySQL-M1, MySQL-M2 #servers that may hold the reader role
ips 192.168.8.34, 192.168.8.35 #virtual IPs of the reader nodes; application reads connect to these IPs
mode balanced #balanced mode
</role>
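About mode exclusive:
A role can use one of two modes:
exclusive: only one host can hold the role at any given time
balanced: several hosts can hold the role at the same time
Usually the writer role is exclusive and the reader role is balanced.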
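To double-check which virtual IPs belong to which role after editing the file, the role blocks can be read back with a few lines of awk. This is only a sketch: `role_ips` is a hypothetical helper, it assumes each role block has the `<role NAME> ... </role>` shape shown above, and the config path in the comment is an assumption about where the package installs the file:

```bash
# role_ips ROLE: read mmm_common.conf text on stdin and print the
# comma-separated VIP list found in the matching <role ROLE> block.
role_ips() {
    awk -v role="$1" '
        $1 == "<role" && $2 == (role ">") { in_role = 1; next }
        $1 == "</role>"                   { in_role = 0 }
        in_role && $1 == "ips" {
            sub(/#.*/, "")               # drop a trailing comment
            sub(/^[ \t]*ips[ \t]*/, "")  # drop the keyword itself
            gsub(/[ \t]/, "")            # drop remaining blanks
            print
        }
    '
}

# Example (path is an assumption):
#   role_ips writer < /etc/mysql-mmm/mmm_common.conf
#   role_ips reader < /etc/mysql-mmm/mmm_common.conf
```

With the configuration from this section, the writer role would list the single address applications write to, and the reader role would list the two addresses used for reads.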
2. Edit mmm_agent.conf on MySQL-M1
The file after editing:
include mmm_common.conf
# The 'this' variable refers to this server. Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this MySQL-M1
3. Edit mmm_agent.conf on MySQL-M2
The file after editing:
include mmm_common.conf
# The 'this' variable refers to this server. Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this MySQL-M2
4. Edit mmm_mon.conf on MySQL-MON
The file after editing:
include mmm_common.conf
<monitor>
ip 127.0.0.1
pid_path /var/run/mysql-mmm/mmm_mond.pid
bin_path /usr/libexec/mysql-mmm
status_path /var/lib/mysql-mmm/mmm_mond.status
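ping_ips 192.168.8.32,192.168.8.33 #real IPs of the agent servers; the monitor pings them to verify its own network connectivity
auto_set_online 10 #a host in AWAITING_RECOVERY is set ONLINE automatically after 10 seconds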
# The kill_host_bin does not exist by default, though the monitor will
# throw a warning about it missing. See the section 5.10 "Kill Host
# Functionality" in the PDF documentation.
#
# kill_host_bin /usr/libexec/mysql-mmm/monitor/kill_host
#
</monitor>
<host default>
monitor_user mmm_monitor #monitor account created earlier
monitor_password mmm_monitor #password of the monitor account
</host>
debug 0
5. Enable the agent (it is enabled by default; shown here only for reference)
[root@MySQL-M1 mysql-mmm]# cat /etc/default/mysql-mmm-agent
# mysql-mmm-agent defaults
ENABLED=1
[root@MySQL-M2 mysql-mmm]# cat /etc/default/mysql-mmm-agent
# mysql-mmm-agent defaults
ENABLED=1
VI. Start the services on each server
On MySQL-M1 and MySQL-M2, start:
/etc/init.d/mysql-mmm-agent start
On MySQL-MON, start:
/etc/init.d/mysql-mmm-monitor start
Check the MMM status on the monitor host MySQL-MON:
[root@MySQL-MON mysql-mmm]# mmm_control show
MySQL-M1(192.168.8.32) master/ONLINE. Roles: reader(192.168.8.35), writer(192.168.8.30)
MySQL-M2(192.168.8.33) master/ONLINE. Roles: reader(192.168.8.34)
[root@MySQL-MON mysql-mmm]# mmm_control checks all
MySQL-M2 ping [last change: 2012/10/15 05:07:35] OK
MySQL-M2 mysql [last change: 2012/10/15 05:07:35] OK
MySQL-M2 rep_threads [last change: 2012/10/15 05:07:35] OK
MySQL-M2 rep_backlog [last change: 2012/10/15 05:07:35] OK: Backlog is null
MySQL-M1 ping [last change: 2012/10/15 05:07:35] OK
MySQL-M1 mysql [last change: 2012/10/15 05:07:35] OK
MySQL-M1 rep_threads [last change: 2012/10/15 05:07:35] OK
MySQL-M1 rep_backlog [last change: 2012/10/15 05:07:35] OK: Backlog is null
[root@MySQL-MON mysql-mmm]# mmm_control mode
ACTIVE
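For scripting, it is often enough to know which host currently holds the writer role. A minimal sketch that extracts it from the text of `mmm_control show`; `writer_host` is a hypothetical helper, and it assumes the output keeps the `HOST(IP) mode/STATE. Roles: ...` layout shown above:

```bash
# writer_host: read `mmm_control show` output on stdin and print the name
# of the host whose role list contains writer(...).
writer_host() {
    awk '/writer\(/ { sub(/\(.*/, "", $1); print $1 }'
}

# Example on the monitor host:
#   mmm_control show | writer_host
```

Because the writer role is exclusive, at most one line should match, so the helper prints either one host name or nothing.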
Now look at the logs on each server:
[root@MySQL-M1 mysql-mmm]# cat /var/log/mysql-mmm/mmm_agentd.log
2012/10/15 05:06:06 INFO We have some new roles added or old rules deleted!
2012/10/15 05:06:06 INFO Added: reader(192.168.8.35), writer(192.168.8.30)
[root@MySQL-M2 mysql-mmm]# cat /var/log/mysql-mmm/mmm_agentd.log
2012/10/16 14:53:51 INFO We have some new roles added or old rules deleted!
2012/10/16 14:53:51 INFO Added: reader(192.168.8.34)
[root@MySQL-MON ~]# cat /var/log/mysql-mmm/mmm_mond.log
2012/10/15 05:07:36 FATAL Couldn't open status file '/var/lib/mysql-mmm/mmm_mond.status': Starting up without status information.
2012/10/15 05:08:38 FATAL State of host 'MySQL-M2' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(10 seconds). It was in state AWAITING_RECOVERY for 10 seconds
2012/10/15 05:08:38 FATAL State of host 'MySQL-M1' changed from AWAITING_RECOVERY to ONLINE because of auto_set_online(10 seconds). It was in state AWAITING_RECOVERY for 10 seconds
Next, the service processes on each server:
[root@MySQL-MON ~]# ps aux |grep mmm
root 19176 0.0 2.8 115784 14320 ? S 05:07 0:00 mmm_mond
root 19178 0.3 14.4 331756 71556 ? Sl 05:07 0:18 mmm_mond
root 19185 0.1 1.8 105824 9376 ? S 05:07 0:06 perl /usr/libexec/mysql-mmm/monitor/checker ping_ip
root 19188 0.0 2.2 137644 10940 ? S 05:07 0:05 perl /usr/libexec/mysql-mmm/monitor/checker mysql
root 19190 0.0 1.8 105824 9384 ? S 05:07 0:02 perl /usr/libexec/mysql-mmm/monitor/checker ping
root 19192 0.1 2.2 137644 10984 ? S 05:07 0:07 perl /usr/libexec/mysql-mmm/monitor/checker rep_backlog
root 19194 0.1 2.2 137644 10984 ? S 05:07 0:07 perl /usr/libexec/mysql-mmm/monitor/checker rep_threads
root 19308 0.0 0.1 61228 720 pts/1 R+ 06:42 0:00 grep mmm
[root@MySQL-M1 mysql-mmm]# ps aux |grep mmm
root 1371 0.0 0.1 61228 724 pts/1 R+ 06:40 0:00 grep mmm
root 24228 0.0 2.2 106096 11068 ? S 04:58 0:00 mmm_agentd
root 24230 0.2 2.6 140148 13204 ? S 04:58 0:16 mmm_agentd
Next, the VIP bindings: (image not available)
VII. Simulated failover test
1. Stop the mysqld service on MySQL-M1
[root@MySQL-M1 mysql-mmm]# service mysqld stop
Shutting down MySQL. [ OK ]
[root@MySQL-M1 mysql-mmm]# ps aux |grep mysqld |grep -v grep
[root@MySQL-MON ~]# mmm_control show
MySQL-M1(192.168.8.32) master/HARD_OFFLINE. Roles:
MySQL-M2(192.168.8.33) master/ONLINE. Roles: reader(192.168.8.34), reader(192.168.8.35), writer(192.168.8.30)
2. Now bring MySQL-M1 back
[root@MySQL-M1 mysql-mmm]# service mysqld start
Starting MySQL..... [ OK ]
[root@MySQL-M1 mysql-mmm]# ps aux |grep mysqld |grep -v grep
root 19328 1.2 0.2 66116 1340 pts/1 S 09:51 0:00 /bin/sh /opt/webserver/mysql/bin/mysqld_safe --datadir=/data/mysql/data/ --pid-file=/data/mysql/data//MySQL-M1.pid
mysql 20139 7.0 10.3 420060 51348 pts/1 Sl 09:51 0:00 /opt/webserver/mysql/bin/mysqld --basedir=/opt/webserver/mysql --datadir=/data/mysql/data/ --plugin-dir=/opt/webserver/mysql/lib/plugin --user=mysql --log-error=/data/mysql/logs/mysql.err --open-files-limit=10240 --pid-file=/data/mysql/data//MySQL-M1.pid --socket=/tmp/mysql.sock --port=3306
Then check the MMM status on MySQL-MON again:
[root@MySQL-MON ~]# mmm_control show
MySQL-M1(192.168.8.32) master/ONLINE. Roles: reader(192.168.8.34)
MySQL-M2(192.168.8.33) master/ONLINE. Roles: reader(192.168.8.35), writer(192.168.8.30)
MySQL-M1 is healthy again after recovery, but it no longer holds the writer role; it now serves reads only!
Next, the VIP bindings:
[root@MySQL-M1 mysql-mmm]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.8.32/24 brd 192.168.8.255 scope global eth0
inet 192.168.8.34/32 scope global eth0
[root@MySQL-M2 mysql-mmm]# ip a |grep eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
inet 192.168.8.33/24 brd 192.168.8.255 scope global eth0
inet 192.168.8.35/32 scope global eth0
inet 192.168.8.30/32 scope global eth0
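The same check can be scripted instead of scanning the `ip a` output by hand. A minimal sketch, assuming the `inet ADDRESS/PREFIX ...` line format shown above; `has_vip` is a hypothetical helper name:

```bash
# has_vip VIP: read `ip a` output on stdin and succeed if VIP is bound
# to any interface listed in that output.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Example on a database host:
#   ip a | has_vip 192.168.8.30 && echo "this host holds the write VIP"
```

Comparing the address part exactly, rather than grepping for a substring, avoids a false match such as 192.168.8.3 matching 192.168.8.30.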
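Summary: mmm_mond monitors the state of each MySQL server.
1. When MySQL-M1 fails while holding both the reader and writer roles, both roles are removed from it and it is marked HARD_OFFLINE. MySQL-M2, which held a reader role, takes over: it is given the writer role and the floating IP, becomes the active master, and serves both reads and writes. When MySQL-M1 recovers, mysql-mmm automatically gives it a reader role and marks it ONLINE; it then shares the read load with MySQL-M2, acting as a slave that serves reads only.
2. When MySQL-M1 fails while holding only a reader role, the reader role is removed from MySQL-M1 and it is marked HARD_OFFLINE. When MySQL-M1 recovers, it is automatically given a reader role again, marked ONLINE, and shares the read load with MySQL-M2.
Only MMM failover is tested here; interested readers can also test what happens to MMM when master-master replication itself runs into problems!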
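When repeating the failover test, it helps to wait for the writer role to move rather than re-running `mmm_control show` by hand. The polling loop below is a rough sketch, not part of MMM: `wait_for_writer` and the `MMM_SHOW_CMD` override variable are hypothetical names, and the loop assumes the `mmm_control show` output format seen earlier.

```bash
# wait_for_writer HOST [TRIES]
# Poll the status command once per second until HOST holds the writer role.
# Returns 0 on success, 1 if HOST never got the role within TRIES attempts.
# MMM_SHOW_CMD (hypothetical) replaces the status command, e.g. for dry runs.
wait_for_writer() {
    want=$1 tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if ${MMM_SHOW_CMD:-mmm_control show} 2>/dev/null |
               grep "^ *$want(" | grep -q 'writer('; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example on the monitor host, after stopping mysqld on MySQL-M1:
#   wait_for_writer MySQL-M2 60 && echo "writer moved to MySQL-M2"
```

The timeout is counted in attempts rather than wall-clock time, which keeps the loop simple; with a one-second sleep the two are close enough for a manual test.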
Appendix: mmm_control command reference
[root@MySQL-MON ~]# /usr/sbin/mmm_control help
Valid commands are:
help - show this message #show the help text
ping - ping monitor #ping the monitor to check that it is running
show - show status #show status information
checks [<host>|all [<check>|all]] - show checks status #show check status (ping, mysql, rep_threads, rep_backlog)
set_online <host> - set host <host> online #set a host to online
set_offline <host> - set host <host> offline #set a host to offline
mode - print current mode. #print the current mode: ACTIVE, MANUAL, or PASSIVE (ACTIVE by default)
set_active - switch into active mode. #switch to active mode
set_manual - switch into manual mode. #switch to manual mode
set_passive - switch into passive mode. #switch to passive mode
move_role [--force] <role> <host> - move exclusive role <role> to host <host> #move an exclusive role to a host, e.g. give the writer role to the MySQL server currently acting as slave
(Only use --force if you know what you are doing!)
set_ip <ip> <host> - set role with ip <ip> to host <host> #assign the role with this IP to a host; only allowed in passive mode!
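Finally, two shortcomings of MMM:
1. MMM cannot cope with both masters going down at the same time
2. MMM is strict about replication lag between the two masters
A later post will cover a failover experiment with the MHA high-availability architecture.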