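MySQL+Keepalived master-slave setup test (mutual master-slave)

I. Purpose:

Remove the single point of failure on the master: the two MySQL servers act as master and standby for each other, with bidirectional replication. When one master goes down, the other node takes over as master and service continues.

Host A: 192.168.1.8

Host B: 192.168.1.9

II. Mutual master-slave configuration

1. Installation

For the detailed installation steps, see: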

http://blog.csdn.net/shiyu1157758655/article/details/70226036

2. Configuration:

On host A, run vi /etc/my.cnf and add the following under the [mysqld] section:


log-bin=mysql-bin
server-id=1
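binlog_do_db=shiyu   # database whose changes are written to the binlog
replicate-do-db=shiyu  # database to replicate

Note: MySQL no longer supports the "master-host" family of options in my.cnf (they were deprecated in the 5.1 series and removed in 5.5), so do not add lines such as the following. Replication is configured with CHANGE MASTER TO instead, as done later in this post:

master-host=192.168.1.9  # the target machine is B's address
master-user=user1
master-password=pass1
master-port=3306
master-connect-retry=5

On machine B, run vi /etc/my.cnf and add:

log-bin=mysql-bin
server-id=2  # the server id must differ from A's
binlog_do_db=shiyu   # database whose changes are written to the binlog
replicate-do-db=shiyu  # database to replicate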

Restart MySQL on machine A so that the binlog settings take effect, then create the replication account:

mysql> create user 'user1'@'%' identified by 'pass1';  
Query OK, 0 rows affected (0.01 sec)  
 
mysql> grant replication slave on *.* to 'user1'@'%';  
Query OK, 0 rows affected (0.01 sec) 

mysql> flush privileges;  
Query OK, 0 rows affected (0.00 sec)  

Restart MySQL on machine B so that the binlog settings take effect, then create the replication account:

mysql> create user 'user2'@'%' identified by 'pass2';
Query OK, 0 rows affected (0.01 sec)

mysql> grant replication slave on *.* to 'user2'@'%';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)


On both machine A and machine B, create the database shiyu and a test table:

mysql> create database shiyu;
Query OK, 1 row affected (0.01 sec)

mysql> use shiyu;
Database changed
mysql> create table test(id int,name varchar(10));
Query OK, 0 rows affected (0.03 sec)

Check the status on machine A:

mysql> show variables like 'ser%';
+----------------+--------------------------------------+
| Variable_name  | Value                                |
+----------------+--------------------------------------+
| server_id      | 1                                    |
| server_id_bits | 32                                   |
| server_uuid    | 5d340b81-d4a3-11e7-ac7c-000c29c84c5f |
+----------------+--------------------------------------+

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 |      947 | shiyu        |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
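The File and Position values shown here are what the other node needs for its CHANGE MASTER TO statement. As a minimal sketch, assuming the status is fetched in the vertical `\G` format (which is easier to parse than the table), the two hypothetical helpers below extract the values and print the statement. The function names are illustrative only and are not part of MySQL.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original setup.

# Read `SHOW MASTER STATUS\G` output on stdin and print "<file> <position>".
parse_master_status() {
    awk -F': ' '
        $1 ~ /^ *File$/     { file = $2 }
        $1 ~ /^ *Position$/ { pos  = $2 }
        END { print file, pos }
    '
}

# Print a CHANGE MASTER TO statement.
# Arguments: host user password log_file log_pos (port 3306 is assumed).
build_change_master() {
    printf "CHANGE MASTER TO master_host='%s', master_port=3306, master_user='%s', master_password='%s', master_log_file='%s', master_log_pos=%s;\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Values taken from the SHOW MASTER STATUS output of machine A.
build_change_master 192.168.1.8 user1 pass1 mysql-bin.000002 947
```

On a live system the input for parse_master_status could come from something like mysql -uroot -p -e 'SHOW MASTER STATUS\G'. The coordinates are only valid as long as no further writes reach the master before the other node starts replicating.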


Start replication on machine B:

mysql> change master to master_host='192.168.1.8',master_port=3306,master_log_file='mysql-bin.000002',master_log_pos=947,master_user='user1',master_password='pass1';
Query OK, 0 rows affected, 2 warnings (0.02 sec)

mysql> start slave;

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.8
                  Master_User: user1
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 1657
               Relay_Log_File: tomcat2-relay-bin.000002
                Relay_Log_Pos: 1030
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: shiyu
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1657
              Relay_Log_Space: 1239
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: 5d340b81-d4a3-11e7-ac7c-000c29c84c5f
             Master_Info_File: /u01/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified
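In this long output the two fields that decide whether replication is working are Slave_IO_Running and Slave_SQL_Running; both must be Yes. A minimal sketch of a scripted check follows. The function name replication_ok is hypothetical, and it only inspects text piped to it, so it makes no assumption about how the status was obtained.

```shell
#!/bin/bash
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and
# succeed (exit status 0) only if both replication threads report "Yes".
replication_ok() {
    awk -F': ' '
        { value = $2; sub(/[ \t\r]+$/, "", value) }   # drop trailing whitespace
        $1 ~ /^ *Slave_IO_Running$/  { io  = value }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = value }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

A possible use is mysql -uroot -p -e 'SHOW SLAVE STATUS\G' | replication_ok && echo "replication healthy", run on each node in turn. The anchored patterns mean that Slave_SQL_Running_State is not mistaken for Slave_SQL_Running.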

Check the status on machine B:


mysql> show variables like 'ser_%';
+----------------+--------------------------------------+
| Variable_name  | Value                                |
+----------------+--------------------------------------+
| server_id      | 2                                    |
| server_id_bits | 32                                   |
| server_uuid    | 02c31977-d3d8-11e7-8197-000c295a16f1 |
+----------------+--------------------------------------+
3 rows in set (0.01 sec)

mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000014 |      316 | shiyu        |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

Start replication on machine A:

mysql> change master to master_host='192.168.1.9',master_port=3306,master_log_file='mysql-bin.000014',master_log_pos=316,master_user='user2',master_password='pass2';
Query OK, 0 rows affected, 2 warnings (0.02 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.9
                  Master_User: user2
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000014
          Read_Master_Log_Pos: 316
               Relay_Log_File: tomcat1-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000014
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: shiyu
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 316
              Relay_Log_Space: 529
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: 02c31977-d3d8-11e7-8197-000c295a16f1
             Master_Info_File: /u01/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified

mysql> 


III. Verification

Insert a row on machine A:

mysql> insert into test values(1,'A');
Query OK, 1 row affected (0.00 sec)

mysql> select * from test;
+------+------+
| id   | name |
+------+------+
|    1 | A    |
+------+------+
1 row in set (0.00 sec)

Check on machine B:


mysql> select * from test;
+------+------+
| id   | name |
+------+------+
|    1 | A    |
+------+------+
1 row in set (0.00 sec)

Insert a row on machine B:

mysql> insert into test values(2,'B');
Query OK, 1 row affected (0.02 sec)

mysql> select * from test;
+------+------+
| id   | name |
+------+------+
|    1 | A    |
|    2 | B    |
+------+------+
2 rows in set (0.00 sec)

Check on machine A:

mysql> select * from test;
+------+------+
| id   | name |
+------+------+
|    1 | A    |
|    2 | B    |
+------+------+
2 rows in set (0.00 sec)


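This shows that A and B are master and slave for each other: data modified on either machine is replicated to the other.


IV. Keepalived

With mutual master-slave replication confirmed working, install Keepalived on both machines: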

[root@tomcat1 ~]# yum install keepalived
[root@tomcat1 ~]# keepalived -v
Keepalived v1.2.13 (03/19,2015)

[root@tomcat2 ~]# yum install keepalived

[root@tomcat2 ~]# keepalived -v
Keepalived v1.2.13 (03/19,2015)


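1. Configuration


On server A, edit the Keepalived configuration file, vi /etc/keepalived/keepalived.

! Configuration File for keepalived

global_defs {  # mainly the notification targets for failures and the machine identifier
   router_id HA_MySQL  # identifier, the same on both masters
}

vrrp_instance VI_1 {  # defines the VIP offered to clients and this instance's identity
    state BACKUP  # note: both nodes are set to BACKUP because nopreempt (non-preemptive mode) is used
    interface eth0
    virtual_router_id 51  # group id, the same on master and backup
    priority 100  # priority; the node with the higher value becomes master first
    advert_int 1
    nopreempt  # do not preempt the VIP (non-preemptive mode)
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.10
    }
}

virtual_server 192.168.1.10 3306 {  # virtual server: the virtual IP address and service port, separated by a space
    delay_loop 2  # health-check interval, in seconds
    lb_algo wrr   # backend scheduling algorithm: rr is round robin, wrr is weighted round robin
    lb_kind DR    # LVS forwarding mechanism; NAT, TUN and DR are available
    persistence_timeout 60  # session persistence time, in seconds. Useful for dynamic web pages, where it gives
                            # a simple answer to session sharing in a cluster: requests from one client keep
                            # going to the same service node until the persistence time expires.
    protocol TCP   # forwarding protocol, TCP or UDP

    real_server 192.168.1.8 3306 {  # service node 1: the real server's IP address and port, separated by a space
        weight 3   # node weight; a larger number means a higher weight, used to distinguish servers of different capacity
        notify_down /usr/local/keepalived/mysql.sh  # script run when the MySQL service on the real server is detected as down
        TCP_CHECK {
            connect_timeout 3  # connection timeout
            nb_get_retry 3     # number of retries
            delay_before_retry 3  # interval between retries
            connect_port 3306  # health-check port, set to this node's own MySQL port
        }
    }
}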

Then edit the configuration file on machine B:

! Configuration File for keepalived

global_defs {  
   router_id HA_MySQL  
}

vrrp_instance VI_1 {  
    state BACKUP  
    interface eth0
    virtual_router_id 51  
    priority 90   # priority, lower than A's
    advert_int 1
    #nopreempt  # nopreempt (non-preemptive mode) is left out here, because it is normally only set on the MySQL node with the higher priority
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.10
    }
}
virtual_server 192.168.1.10 3306 {  
    delay_loop 2 
    lb_algo wrr   
    lb_kind DR   
    persistence_timeout 60                            
    protocol TCP   

    real_server 192.168.1.9 3306 {  
        weight 3   
        notify_down /usr/local/keepalived/mysql.sh 
        TCP_CHECK {
            connect_timeout 3  
            nb_get_retry 3    
            delay_before_retry 3 
            connect_port 3306  

    } 
  }
}

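One point to note: state is set to BACKUP on both nodes because nopreempt (non-preemptive mode) is used, and nopreempt only takes effect when the initial state is BACKUP. For example, if B has started MySQL and Keepalived first and holds the VIP, and A then starts MySQL and Keepalived, the VIP does not move to A, even though A's priority of 100 is higher than B's 90.

Without nopreempt there are two cases:

1. The state is the same on both nodes (both MASTER or both BACKUP): the node with the higher priority takes the VIP, regardless of role.
2. The state differs (MASTER on one node, BACKUP on the other): the node with the higher priority still takes the VIP, regardless of role.

The premises differ but the result is the same: priority decides, and the VIP floats to whichever node has the higher priority.

What the A and B configurations have in common is virtual_ipaddress: both use the same virtual IP. Applications connect to MySQL through this virtual IP, which forwards to the physical machine that currently holds it. The node with the higher priority is reached first and acts as master. MySQL outages, restarts, and master/slave role switches on the physical machines are then handled by Keepalived, transparently to the application layer.

The two machines must be able to reach each other on the same LAN, because the virtual IP is placed on their shared LAN. If they cannot reach each other, each machine brings up the virtual IP on its own.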

Script that kills keepalived (needed on both machine A and machine B)


vi /usr/local/keepalived/mysql.sh

#!/bin/bash
# Kill the keepalived process so that this node releases the VIP, to prevent split-brain.
pkill keepalived

Start the keepalived service (on both machines)

[root@tomcat1 keepalived]# chmod +x /usr/local/keepalived/mysql.sh

[root@tomcat1 keepalived]# /etc/init.d/keepalived start
Starting keepalived:                                       [  OK  ]
[root@tomcat1 keepalived]# /etc/init.d/keepalived status
keepalived (pid 5279) is running...

[root@tomcat2 keepalived]# chmod +x /usr/local/keepalived/mysql.sh

[root@tomcat2 keepalived]# /etc/init.d/keepalived start
Starting keepalived:                                       [  OK  ]
[root@tomcat2 keepalived]# /etc/init.d/keepalived status
keepalived (pid 7426) is running...

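2. Testing

First, a diagram of the current state: the application connects to the VIP, the VIP points to machine A, and replication runs from A to B.

[Figure 1: application connects to the VIP on machine A; replication runs from A to B]

(1) Connect to the virtual IP

First check whether the virtual IP exists. On the machine with the higher priority value, that is, the master (machine A here), run: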


[root@tomcat1 ~]# ip addr
1: lo:  mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:c8:4c:5f brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.8/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.10/32 scope global eth0
    inet6 fe80::20c:29ff:fec8:4c5f/64 scope link 
       valid_lft forever preferred_lft forever

The virtual IP is present and bound to eth0. Running the same command on B shows no virtual IP there, which means machine A has been chosen to hold it. Next, ping it:

[root@tomcat1 ~]# ping 192.168.1.10
PING 192.168.1.10 (192.168.1.10) 56(84) bytes of data.
64 bytes from 192.168.1.10: icmp_seq=1 ttl=64 time=0.025 ms
64 bytes from 192.168.1.10: icmp_seq=2 ttl=64 time=0.034 ms

The ping succeeds. Now connect a client to the shared entry point, the virtual IP:

[root@tomcat1 ~]# mysql -uroot -h192.168.1.10 -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 323
Server version: 5.7.15-log MySQL Community Server (GPL)

Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0.01 sec)

mysql> 


server_id=1 is machine A. At this point the data on both machines is:

mysql> select * from shiyu.test;
+------+------+
| id   | name |
+------+------+
|    1 | A    |
|    2 | B    |
+------+------+
2 rows in set (0.00 sec)


(2) Modify data

mysql> update test set name='shiyu' where id=2;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from test;
+------+-------+
| id   | name  |
+------+-------+
|    1 | A     |
|    2 | shiyu |
+------+-------+
2 rows in set (0.00 sec)

Open separate clients directly against machines A and B: both return the same result, so the change was replicated.

mysql> select * from test;
+------+-------+
| id   | name  |
+------+-------+
|    1 | A     |
|    2 | shiyu |
+------+-------+
2 rows in set (0.00 sec)

(3) Simulate an outage


Kill the mysql processes directly, which is similar to the machine going down:

[root@tomcat1 ~]# ps -ef |grep mysql
root       4399      1  0 11:23 pts/0    00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/u01/data --pid-file=/u01/data/tomcat1.pid
mysql      5401   4399  2 16:16 pts/0    00:00:00 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/u01/data --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/u01/data/tomcat1.err --pid-file=/u01/data/tomcat1.pid --socket=/tmp/mysql.sock --port=3306
root       5459   2401  0 16:16 pts/2    00:00:00 grep mysql
[root@tomcat1 ~]# kill -9 4399
[root@tomcat1 ~]# kill -9 5401

Then query server_id again. The connection is lost briefly and recovers quickly:

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0.01 sec)

mysql> show variables like 'server_id';
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql> show variables like 'server_id';
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    611
Current database: *** NONE ***

+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 2     |
+---------------+-------+
1 row in set (0.06 sec)


The VIP should now have floated to B. Check on B:


[root@tomcat2 ~]# ip addr
1: lo:  mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:5a:16:f1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.9/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.10/32 scope global eth0
    inet6 fe80::20c:29ff:fe5a:16f1/64 scope link 
       valid_lft forever preferred_lft forever


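server_id=2, and machine B now holds the virtual IP. Running ip addr on machine A shows that the virtual IP is gone there, and A's keepalived process is dead as well (killed by the notify_down script). This confirms that the virtual IP has moved to machine B. Bringing machine A back requires a separate restart script of your own. The situation is now: the application connects to the VIP, the VIP points to machine B, and replication runs from B to A.

[Figure 2: application connects to the VIP on machine B; replication runs from B to A]

(4) Bring machine A back


Restart MySQL and keepalived on machine A:

[root@tomcat1 ~]# service mysql start
Starting MySQL.                                            [  OK  ]
[root@tomcat1 ~]# /etc/init.d/keepalived start
Starting keepalived:                                       [  OK  ]
[root@tomcat1 ~]# /etc/init.d/keepalived status
keepalived (pid  5942) is running...
[root@tomcat1 ~]# 

Query server_id again: it is still 2, so the virtual IP did not move back to machine A. This is the non-preemptive behaviour, and it avoids the VIP being taken back and the split-brain style conflicts that could follow.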

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 2     |
+---------------+-------+
1 row in set (0.00 sec)
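Throughout these tests, the node holding the VIP was identified by reading ip addr output by eye. A minimal sketch of the same check as a script follows. The function name has_vip is hypothetical, and it only looks at text piped to it.

```shell
#!/bin/bash
# Hypothetical helper: read `ip addr` output on stdin and succeed
# (exit status 0) if the given address is bound to an interface.
# The trailing "/" keeps 192.168.1.10 from matching 192.168.1.100.
has_vip() {
    grep -qF "inet $1/"
}
```

On either node this could be used as ip addr show eth0 | has_vip 192.168.1.10 && echo "this node holds the VIP". In the setup described here, exactly one of the two nodes should print the message at any time.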

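V. Summary

For small and medium-sized deployments, Keepalived plus MySQL dual-master is generally the least-effort architecture.

When the master node fails, Keepalived's high-availability mechanism switches quickly to the standby node.

A few points need attention with this scheme:

1. When Keepalived is used as the high-availability solution, it is best to set both nodes to BACKUP mode, so that in unexpected situations (such as split-brain) the nodes do not preempt each other and cause conflicts by writing the same data on both nodes;

2. Configure auto_increment_increment (the step) and auto_increment_offset (the start value) so that the two nodes generate disjoint sequences, typically the same step of 2 with a different offset on each node. If the master crashes unexpectedly, part of its binlog may not yet have been replicated to and applied on the slave, so auto-increment values for new rows written on the slave could collide with those already issued on the old master; staggering them from the start avoids this. If a suitable fault-tolerance mechanism already resolves auto-increment ID conflicts between the nodes, this step can be skipped;

3. The slave server should not be under-provisioned, otherwise replication lag becomes more likely. As a hot standby, its hardware should be no weaker than the master's;

4. If replication lag is a serious concern, consider the MariaDB fork, or go straight to the latest MySQL 5.7 release and use multi-threaded replication, which can reduce lag considerably.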
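To make the auto-increment point concrete: a node issues the values offset, offset + increment, offset + 2 × increment, and so on. The sketch below prints the first few values per node. The settings in the comments (increment 2, offsets 1 and 2) are an assumed, commonly used choice rather than something taken from the configuration in this post, and auto_inc_values is a hypothetical helper name.

```shell
#!/bin/bash
# Assumed my.cnf settings (not part of the configuration shown in this post):
#   machine A: auto_increment_increment=2, auto_increment_offset=1
#   machine B: auto_increment_increment=2, auto_increment_offset=2

# Print the first COUNT auto-increment values for a node.
# Arguments: offset increment count
auto_inc_values() {
    awk -v o="$1" -v s="$2" -v n="$3" 'BEGIN {
        for (i = 0; i < n; i++) printf "%s%d", (i ? " " : ""), o + i * s
        print ""
    }'
}

auto_inc_values 1 2 5   # machine A: 1 3 5 7 9
auto_inc_values 2 2 5   # machine B: 2 4 6 8 10
```

With these values A only ever issues odd ids and B only even ones, so rows inserted on B after a failover cannot collide with rows A wrote but had not yet replicated.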





