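I. Implementing a MySQL high-availability architecture with mysql-mmm
     MMM (Master-Master Replication Manager for MySQL) is a scalable suite of scripts for monitoring, failover, and management of MySQL master-master replication setups (only one node is writable at any given time). The suite can also load-balance reads across any number of slaves in a standard master-slave setup, so you can use it to bring up virtual IPs on a group of replicating servers. It also ships scripts for data backup and for resynchronizing nodes.
     MySQL itself does not provide a replication failover solution. MMM adds server failover and thereby makes MySQL highly available. The MMM project is hosted on Google Code: http://code.google.com/p/mysql-master-master
Official website: http://mysql-mmm.org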
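The main functionality of mysql-mmm is provided by three scripts:
mmm_mond      the monitoring daemon; it does all the monitoring work and decides when to remove a node, and so on
mmm_agentd    the agent daemon that runs on each MySQL server and exposes a simple set of remote services to the monitoring node
mmm_control   a command-line tool for managing the mmm_mond process
     The mysql-mmm monitor manages several virtual IPs (VIPs): one writer VIP and several reader VIPs. The monitor binds these IPs to available MySQL servers, and when a MySQL server goes down it moves that server's VIPs to the remaining servers.
     For this to work, the relevant users must be granted privileges in MySQL so that the monitoring host can manage the servers. These are an mmm_monitor user and an mmm_agent user; if you want to use the MMM backup tools, you also need an mmm_tools user.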

Now for the deployment. Machine resources are limited, so this experiment uses a single slave.

II. Preparation
1. Environment
VMware virtual machines: 4
OS version: CentOS release 6.6 (Final)  2.6.32-504.el6.x86_64
MySQL version: 5.5.32
mysql-mmm version:

The four virtual machines:
MMM monitor: 192.168.0.149  Monitor      test-A
master1:     192.168.0.150  server-id=1  test-B
master2:     192.168.0.160  server-id=3  test-D
slave:       192.168.0.151  server-id=2  test-C

Virtual IPs:
10.0.0.13   write
10.0.0.14   read
10.0.0.15   read
10.0.0.16   read

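MySQL-MMM architecture overview:
1. Install MySQL on master1 and master2 and configure them for master-master replication.
2. Install MySQL on slave1 and configure it as a slave of master1.
3. Install mysql-mmm on all four machines (master1, master2, slave1, and Monitor) and configure it: mmm_common.conf on every machine, mmm_agent.conf on the database hosts, and mmm_mon.conf on the monitor.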

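III. Configure mysql-master-1/2 (master-master replication) and mysql-master-1 with mysql-slave (master-slave replication)
Note: every MySQL instance is freshly installed, so none of them holds any data and the environments are identical.
1.1. Edit my.cnf, then restart the service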
mysql-master-1:
[mysqld]
server-id       = 1
log-bin=mysql-bin
log-slave-updates  
auto_increment_offset=1  
auto_increment_increment=2

mysql-master-2:
[mysqld]
server-id       = 3
log-bin=mysql-bin
log-slave-updates  
auto_increment_offset=2  
auto_increment_increment=2
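
The two auto_increment settings above are what keep the masters from handing out the same AUTO_INCREMENT key: with an increment of 2, master1 (offset 1) produces odd values and master2 (offset 2) produces even values. The following is only a rough Python sketch of that arithmetic. The function name is made up for illustration, and it models just the simple case of a fresh, empty table.

```python
from itertools import count, islice

def auto_increment_values(offset, increment, n):
    """Return the first n AUTO_INCREMENT values for a fresh table.

    Simplified model: offset, offset + increment, offset + 2*increment, ...
    """
    return list(islice(count(offset, increment), n))

# Values taken from the two my.cnf fragments above.
master1_ids = auto_increment_values(offset=1, increment=2, n=5)
master2_ids = auto_increment_values(offset=2, increment=2, n=5)

print(master1_ids)  # [1, 3, 5, 7, 9]
print(master2_ids)  # [2, 4, 6, 8, 10]
# The two sequences share no value, so generated keys cannot collide.
print(set(master1_ids) & set(master2_ids))  # set()
```

MMM only lets one master take writes at a time, so in normal operation this is a safety net for the moment of a failover rather than something exercised constantly.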

1.2. Configure master-master replication between master1 and master2
Create the replication user on both master1 and master2:
mysql> grant replication slave on *.* to 'rep'@'192.168.0.%' identified by 'test123';
Query OK, 0 rows affected (0.02 sec)

On master1:
mysql> show master status;
+------------------+----------+--------------+---------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB                            |
+------------------+----------+--------------+---------------------------------------------+
| mysql-bin.000006 |      107 |              | mysql,performance_schema,information_schema |
+------------------+----------+--------------+---------------------------------------------+
1 row in set (0.00 sec)
 
On master2:
mysql>  change master to
     >  master_host='192.168.0.150',
     >  master_port=3306,
     >  master_user='rep',
     >  master_password='test123',
     >  master_log_file='mysql-bin.000006',
     >  master_log_pos=107;
mysql> start slave;
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.150
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 107
               Relay_Log_File: test-D-relay-bin.000006
                Relay_Log_Pos: 253
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 107
              Relay_Log_Space: 410
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)

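ERROR:
No query specified
(This error is harmless: \G already terminates the statement, so the trailing ";" is sent as an empty query.)

Still on master2:
mysql> grant replication slave on *.* to 'rep'@'192.168.0.%' identified by 'test123';
mysql> show master status;
+------------------+----------+--------------+------------------+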
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 |      107 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

On master1:
mysql>  change master to
     >  master_host='192.168.0.160',
     >  master_port=3306,
     >  master_user='rep',
     >  master_password='test123',
     >  master_log_file='mysql-bin.000003',
     >  master_log_pos=107;
mysql> start slave;
mysql> show slave status\G;

2.1. Edit my.cnf on the slave and restart the service
vi /data/3307/my.cnf
[mysqld]
server-id = 2
[root@test-C ~]# mysqladmin -uroot -p456 shutdown -S /data/3307/mysql.sock
[root@test-C ~]# /application/mysql/bin/mysqld_safe --defaults-file=/data/3307/my.cnf &
Note: the slave runs MySQL in a multi-instance setup.

2.2. Configure the replication parameters
Look up the binary log coordinates on master1:
mysql> flush tables with read lock;     # lock the tables
mysql> show master status;
+------------------+----------+--------------+---------------------------------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB                            |
+------------------+----------+--------------+---------------------------------------------+
| mysql-bin.000006 |      107 |              | mysql,performance_schema,information_schema |
+------------------+----------+--------------+---------------------------------------------+
1 row in set (0.00 sec)
mysqldump -uroot --events -A -B >/tmp/master1.sql   # back up the master (run from a second shell so the session holding the lock stays open)
mysql> unlock tables;   # release the lock

On the slave:
mysql> CHANGE MASTER TO
    -> MASTER_HOST='192.168.0.150',
    -> MASTER_PORT=3306,  
    -> MASTER_USER='rep',  
    -> MASTER_PASSWORD='test123',
    -> MASTER_LOG_FILE='mysql-bin.000006',
    -> MASTER_LOG_POS=107;
Query OK, 0 rows affected (0.04 sec)
 
mysql> start slave;
Query OK, 0 rows affected (0.02 sec)
 
mysql> show slave status\G;
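
When reading the `show slave status\G` output, the two fields that matter most are Slave_IO_Running and Slave_SQL_Running; both must be Yes. As a rough sketch of how that check could be scripted, the Python below parses the vertical output into a dictionary. The function names are illustrative, and the sample text is a shortened copy of the output shown earlier.

```python
def parse_vertical_output(text):
    """Parse 'Key: value' lines from vertical-format client output into a dict."""
    status = {}
    for line in text.splitlines():
        # Skip the row banner and any line without a key/value separator.
        if line.lstrip().startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status

def replication_healthy(status):
    """Return True when both replication threads report 'Yes'."""
    return (status.get("Slave_IO_Running") == "Yes"
            and status.get("Slave_SQL_Running") == "Yes")

sample = """\
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.150
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
"""

status = parse_vertical_output(sample)
print(status["Master_Host"], replication_healthy(status))
```

The same check applies to both directions of the master-master pair and to the slave.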

 
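IV. Configure mysql-mmm
4.1. Install mysql-mmm
Note: mysql-mmm must be installed on all four servers.
The default CentOS repositories do not contain these packages; they come from EPEL, so install the epel-release package first.
Run on all four machines: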
cd tools
wget http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
rpm -Uvh epel-release-6-8.noarch.rpm  
yum install -y mysql-mmm*

 
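4.2. Grant privileges to the MMM agent and monitor accounts
Replication is now in place, and I did not configure it to ignore the mysql database or the user table, so running the statements below on either master is enough: the other servers receive both accounts through replication.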
mysql> GRANT REPLICATION CLIENT ON *.* TO 'mmm_monitor'@'192.168.0.%' IDENTIFIED  BY 'test123';
mysql> GRANT SUPER, REPLICATION CLIENT, PROCESS ON *.* TO 'mmm_agent'@'192.168.0.%' IDENTIFIED BY 'test123';       
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.06 sec)
Note: master1, master2, and the slave all show the same result:
mysql> select user,host from mysql.user;
+-------------+-------------+
| user        | host        |
+-------------+-------------+
| root        | 127.0.0.1   |
| mmm_agent   | 192.168.0.% |
| mmm_monitor | 192.168.0.% |
| rep         | 192.168.0.% |
| root        | ::1         |
|             | localhost   |
| root        | localhost   |
|             | test-B      |
| root        | test-B      |
+-------------+-------------+
8 rows in set (0.00 sec)
      
 
4.3. Configure /etc/mysql-mmm/mmm_common.conf on all servers
vi /etc/mysql-mmm/mmm_common.conf
active_master_role      writer

<host default>
    cluster_interface       eth0
    pid_path                /var/run/mysql-mmm/mmm_agentd.pid
    bin_path                /usr/libexec/mysql-mmm/
    replication_user        rep
    replication_password    test123
    agent_user              mmm_agent
    agent_password          test123
</host>

<host db1>
    ip      192.168.0.150
    mode    master
    peer    db2
</host>

<host db2>
    ip      192.168.0.160
    mode    master
    peer    db1
</host>

<host db3>
    ip      192.168.0.151
    mode    slave
</host>

<role writer>
    hosts   db1, db2
    ips     10.0.0.13
    mode    exclusive
</role>

<role reader>
    hosts   db1, db2, db3
    ips     10.0.0.14, 10.0.0.15, 10.0.0.16
    mode    balanced
</role>
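
Because the same mmm_common.conf has to be copied to every server, a typo in a peer or host name is easy to miss. The Python below is a small, hypothetical sanity check over the values from this file. The dictionary layout and the function name are assumptions made for this sketch; it is not MMM's own validation and it does not parse the real file.

```python
# Hypothetical in-memory mirror of the host and role settings above.
HOSTS = {
    "db1": {"ip": "192.168.0.150", "mode": "master", "peer": "db2"},
    "db2": {"ip": "192.168.0.160", "mode": "master", "peer": "db1"},
    "db3": {"ip": "192.168.0.151", "mode": "slave"},
}
ROLES = {
    "writer": {"hosts": ["db1", "db2"], "ips": ["10.0.0.13"]},
    "reader": {"hosts": ["db1", "db2", "db3"],
               "ips": ["10.0.0.14", "10.0.0.15", "10.0.0.16"]},
}

def check_cluster(hosts, roles):
    """Return a list of problems found; an empty list means none."""
    problems = []
    # Every host that names a peer must have that peer defined and pointing back.
    for name, host in hosts.items():
        peer = host.get("peer")
        if peer is None:
            continue
        if peer not in hosts:
            problems.append(f"{name}: peer {peer} is not defined")
        elif hosts[peer].get("peer") != name:
            problems.append(f"{name}: peer {peer} does not point back")
    # Roles may only list defined hosts, and no virtual IP may be reused.
    seen_ips = set()
    for role, spec in roles.items():
        for host_name in spec["hosts"]:
            if host_name not in hosts:
                problems.append(f"role {role}: unknown host {host_name}")
        for ip in spec["ips"]:
            if ip in seen_ips:
                problems.append(f"role {role}: ip {ip} is used twice")
            seen_ips.add(ip)
    return problems

print(check_cluster(HOSTS, ROLES))  # prints [] for the values above
```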


 
4.4. Configure /etc/mysql-mmm/mmm_agent.conf on the database hosts
hostname  ip             server-id (my.cnf)  db name
master1   192.168.0.150  1                   db1
master2   192.168.0.160  3                   db2
slave1    192.168.0.151  2                   db3
Edit /etc/mysql-mmm/mmm_agent.conf on the three MySQL servers according to the table above.
Example:
[root@mysql-mmm-master1 tools]# vi /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
# The 'this' variable refers to this server.  Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this db1
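
The only line that differs between the three agent files is the `this` line. As a rough illustration (the mapping and the function name are made up for this sketch, and the comment lines of the packaged file are left out), the per-host content could be generated like this in Python:

```python
# Host name -> MMM db name, taken from the table above.
DB_NAMES = {"master1": "db1", "master2": "db2", "slave1": "db3"}

def render_agent_conf(db_name):
    """Return a minimal mmm_agent.conf body for one database host."""
    return f"include mmm_common.conf\nthis {db_name}\n"

for host, db_name in DB_NAMES.items():
    print(f"# {host}")
    print(render_agent_conf(db_name))
```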
 
4.5. Configure /etc/mysql-mmm/mmm_mon.conf on the monitor host
include mmm_common.conf

<monitor>
    ip                  127.0.0.1
    pid_path            /var/run/mysql-mmm/mmm_mond.pid
    bin_path            /usr/libexec/mysql-mmm
    status_path         /var/lib/mysql-mmm/mmm_mond.status
    ping_ips            192.168.0.150, 192.168.0.151, 192.168.0.160
    auto_set_online     30

    # The kill_host_bin does not exist by default, though the monitor will
    # throw a warning about it missing.  See the section 5.10 "Kill Host
    # Functionality" in the PDF documentation.
    #
    # kill_host_bin     /usr/libexec/mysql-mmm/monitor/kill_host
    #
</monitor>

<host default>
    monitor_user        mmm_monitor
    monitor_password    test123
</host>

debug 0

4.6. Start mysql-mmm
Start the agent on master-1, master-2, and the slave.
Enable it by editing /etc/default/mysql-mmm-agent:
[root@mysql-mmm-master2 tools]# vi /etc/default/mysql-mmm-agent
# mysql-mmm-agent defaults
ENABLED=1
Start mmm-agent on every database host:
/etc/init.d/mysql-mmm-agent start
Start mmm-monitor on the monitor host:
/etc/init.d/mysql-mmm-monitor start
 
4.7. Check the MySQL server status with mmm_control
[root@mysql-mmm-monitor ~]# mmm_control show
  db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
  db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14)
  db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.16)
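
If you want to watch this status from a script, the `mmm_control show` lines are regular enough to parse. The Python below is one possible sketch; the regular expressions are derived only from the output shown above, so other MMM versions may print a different layout, and the function names are illustrative.

```python
import re

# One line per host: name(ip) mode/STATE. Roles: role(ip), role(ip)
LINE_RE = re.compile(
    r"^\s*(?P<name>[^\s(]+)\((?P<ip>[^)]+)\)\s+"
    r"(?P<mode>\w+)/(?P<state>\w+)\.\s+Roles:\s*(?P<roles>.*)$"
)
ROLE_RE = re.compile(r"(\w+)\(([^)]+)\)")

def parse_show(output):
    """Parse `mmm_control show` output into {host: {...}}."""
    hosts = {}
    for line in output.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # ignore anything that is not a host line
        hosts[m.group("name")] = {
            "ip": m.group("ip"),
            "mode": m.group("mode"),
            "state": m.group("state"),
            "roles": ROLE_RE.findall(m.group("roles")),
        }
    return hosts

def current_writer(hosts):
    """Return the name of the host holding the writer role, or None."""
    for name, info in hosts.items():
        if any(role == "writer" for role, _ip in info["roles"]):
            return name
    return None

sample = """\
  db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
  db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14)
  db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.16)
"""
hosts = parse_show(sample)
print(current_writer(hosts), hosts["db3"]["state"])
```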

 
4.8. Test automatic failover between the two MySQL masters
Stop the MySQL service on db1:
[root@test-B ~]# /etc/init.d/mysqld stop
Shutting down MySQL. SUCCESS!
Wait 30 seconds, then check the status on the mysql-mmm-monitor server:
[root@test-A ~]# mmm_control show
  db1(192.168.0.150) master/HARD_OFFLINE. Roles:
  db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14), writer(10.0.0.13)
  db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.15), reader(10.0.0.16)

On the slave, check whether master_host has switched to the other master:
[root@test-C ~]# mysql -uroot -p -e "show slave status\G" -S /data/3307/mysql.sock                    
Enter password:
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.0.160
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 537
               Relay_Log_File: relay-bin.000002
                Relay_Log_Pos: 253
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB: mysql
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 537
              Relay_Log_Space: 403
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 3

Bring master-1 (db1) back:
[root@test-B ~]# /etc/init.d/mysqld start
Starting MySQL.. SUCCESS!
Check the recovery from the monitor:
[root@test-A ~]# mmm_control show
  db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.16)
  db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14), writer(10.0.0.13)
  db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.15)
As you can see, once db1 recovers it only holds a reader role, so it effectively acts as a slave. It takes over the writer role again only if db2 goes down.
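
To make the test result easier to see at a glance, the Python sketch below compares the three role assignments from sections 4.7 and 4.8: who holds the writer VIP, and whether any VIP was left unassigned. The snapshots are transcribed by hand from the `mmm_control show` outputs; the data layout and function names are assumptions of this sketch.

```python
ALL_VIPS = {"10.0.0.13", "10.0.0.14", "10.0.0.15", "10.0.0.16"}

# host -> {vip: role}, transcribed from the mmm_control show outputs.
BEFORE = {
    "db1": {"10.0.0.15": "reader", "10.0.0.13": "writer"},
    "db2": {"10.0.0.14": "reader"},
    "db3": {"10.0.0.16": "reader"},
}
DB1_DOWN = {
    "db1": {},
    "db2": {"10.0.0.14": "reader", "10.0.0.13": "writer"},
    "db3": {"10.0.0.15": "reader", "10.0.0.16": "reader"},
}
DB1_BACK = {
    "db1": {"10.0.0.16": "reader"},
    "db2": {"10.0.0.14": "reader", "10.0.0.13": "writer"},
    "db3": {"10.0.0.15": "reader"},
}

def writer_host(snapshot):
    """Return the host that holds the writer VIP, or None."""
    for host, vips in snapshot.items():
        if "writer" in vips.values():
            return host
    return None

def unassigned_vips(snapshot):
    """Return the VIPs that no host holds in this snapshot."""
    held = {vip for vips in snapshot.values() for vip in vips}
    return ALL_VIPS - held

for label, snap in (("before", BEFORE), ("db1 down", DB1_DOWN), ("db1 back", DB1_BACK)):
    print(label, writer_host(snap), sorted(unassigned_vips(snap)))
```

In all three snapshots every VIP stays assigned, and the writer VIP moves from db1 to db2 and stays there after db1 returns.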

4.9. mmm_control commands
Valid commands are:
    help                              - show this message
    ping                              - ping monitor
    show                              - show status
    checks [<host>|all [<check>|all]] - show checks status
    set_online <host>                 - set host <host> online
    set_offline <host>                - set host <host> offline
    mode                              - print current mode.
    set_active                        - switch into active mode.
    set_manual                        - switch into manual mode.
    set_passive                       - switch into passive mode.
    move_role [--force] <role> <host> - move exclusive role <role> to host <host>
                                        (Only use --force if you know what you are doing!)
    set_ip <ip> <host>                - set role with ip <ip> to host <host>

V. Problems I ran into during setup, and how I fixed them
Problem:
At the end of the setup, the status of all servers showed the slave as offline:
[root@test-A ~]# mmm_control show
  db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
  db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14), reader(10.0.0.16)
  db3(192.168.0.151) slave/HARD_OFFLINE. Roles:
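Fix:
    The MySQL on the slave was originally set up as a multi-instance test install, with the service listening on port 3307. I stopped replication on the slave, changed the port to 3306 in the configuration file, and restarted the service. After setting up master-slave replication again and restarting the MMM agent, the status on the MMM monitor showed every server as healthy: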
[root@test-A ~]# mmm_control show                    
  db1(192.168.0.150) master/ONLINE. Roles: reader(10.0.0.15), writer(10.0.0.13)
  db2(192.168.0.160) master/ONLINE. Roles: reader(10.0.0.14)
  db3(192.168.0.151) slave/ONLINE. Roles: reader(10.0.0.16)
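
A quick way to catch this kind of port mismatch early is to test, from the monitor host, which port each MySQL server actually accepts connections on. The Python below is a minimal sketch of such a check. It only tests TCP reachability, which is an assumption about what is useful to look at first; it is not how MMM itself performs its checks, and the function name is illustrative.

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False

# Example call for the slave in this setup (not executed here):
#   port_open("192.168.0.151", 3306)
#   port_open("192.168.0.151", 3307)
```

If 3307 answers and 3306 does not, the instance is still running on the multi-instance port described above.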