MySQL 5.7.24 GTID dual-master replication + Atlas + keepalived

1. Environment overview:

OS:

CentOS 7

[root@mgr01 ~]# cat /etc/hosts
10.0.0.6 pxc01
10.0.0.7  pxc02
[root@pxc02 ~]# cat /etc/hosts
10.0.0.6 pxc01
10.0.0.7 pxc02

Disable SELinux:
vim /etc/sysconfig/selinux    # set SELINUX=disabled; takes effect after a reboot
getenforce

Keep the server clocks in sync:

ntpdate ntp1.aliyun.com
# crontab entry to resync every 5 minutes:
*/5 * * * *  ntpdate ntp1.aliyun.com

Stop the firewall:

 systemctl status firewalld
systemctl stop firewalld
 systemctl stop iptables.service
 systemctl status iptables.service

Disable the firewall services at boot:

systemctl disable firewalld.service
systemctl disable iptables.service

Disable MySQL autostart at boot:

[root@mgr01 ~]# systemctl disable mysql.service
mysql.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig mysql off

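2. Deployment overview

10.0.0.6 runs MySQL 5.7.24 (GTID dual-master replication) + Atlas + keepalived
10.0.0.7 runs MySQL 5.7.24 (GTID dual-master replication) + Atlas + keepalived

2.1 MySQL installation:

MySQL version: 5.7.24
On 10.0.0.6 (pxc01):

**MySQL is installed from the binary tarball. The steps are:**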
useradd mysql -s /sbin/nologin -M
mkdir /data/mysql/mysql3306/{data,logs,binlog} -p
chown -R mysql:mysql  /data/mysql/mysql3306
tar xf mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz -C /usr/local/
cd /usr/local/
mv mysql-5.7.24-linux-glibc2.12-x86_64 mysql
# /data/mysql/mysql3306/my3306.cnf must already exist at this point (see 2.2)
/usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf --initialize
/usr/local/mysql/bin/mysqld --defaults-file=/data/mysql/mysql3306/my3306.cnf &

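**Change the MySQL password:**
--initialize writes a temporary root password to the error log; log in with it, then run:
alter user user() identified by '654321';

On 10.0.0.7 (pxc02):
Install MySQL on 10.0.0.7 in the same way.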

2.2 MySQL configuration file:

Main configuration on 10.0.0.6 (pxc01):
The following parameters must be set:

[root@pxc01 mysql3306]# egrep 'server_id|gtid_mode|enforce_gtid_consistency|log_bin|binlog_format|log-slave-updates|skip_slave_start|auto-increment' /data/mysql/mysql3306/my3306.cnf 
server_id                           =63306                        # 0
binlog_format                       =row                          # row
log_bin                             =/data/mysql/mysql3306/binlog/mysql-bin        #    off
gtid_mode                           =on                            #    off
enforce_gtid_consistency            =on                            #    off
skip_slave_start                     =1                              #
auto-increment-increment = 2
auto-increment-offset = 1

Main configuration on 10.0.0.7 (master02, i.e. pxc02):
The following parameters must be set:

[root@pxc02 local]# egrep 'server_id|gtid_mode|enforce_gtid_consistency|log_bin|binlog_format|log-slave-updates|skip_slave_start|auto-increment' /data/mysql/mysql3306/my3306.cnf 
server_id                           =73306                        # 0
binlog_format                       =row                          # row
log_bin                             =/data/mysql/mysql3306/binlog/mysql-bin      #  off
gtid_mode                           =on                            #    off
enforce_gtid_consistency            =on                            #    off
skip_slave_start                     =1                              #
auto-increment-increment = 2
auto-increment-offset = 2
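
With auto-increment-increment = 2 on both nodes and offsets 1 and 2, each master draws its ids from its own arithmetic sequence, so the two sequences never overlap. A minimal sketch of that arithmetic follows. The function name ai_sequence is made up for illustration, and it only models offset + k * increment, not how MySQL behaves after rows with explicit larger ids are inserted.

```shell
# Hypothetical helper: print the first N auto-increment values a node would
# generate for a given auto_increment_offset and auto_increment_increment.
ai_sequence() {
    # $1 = auto_increment_offset, $2 = auto_increment_increment, $3 = how many ids
    awk -v off="$1" -v inc="$2" -v n="$3" 'BEGIN {
        for (i = 0; i < n; i++)
            printf "%s%d", (i ? " " : ""), off + i * inc
        print ""
    }'
}
```

ai_sequence 1 2 3 prints 1 3 5 (the pxc01 settings) and ai_sequence 2 2 3 prints 2 4 6 (the pxc02 settings).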

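2.3 Configure replication

On 10.0.0.6 (pxc01):

Create the replication user and grant it privileges:
grant replication slave on *.* to rep1@'10.0.0.%' identified by 'rep123321';flush privileges;

On master02 (10.0.0.7), run CHANGE MASTER TO so that it replicates from master01: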
 change master to master_host='10.0.0.6',master_user='rep1',master_password='rep123321',master_auto_position=1;start slave ; show slave status\G

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
On 10.0.0.7 (pxc02):

Create the replication user and grant it privileges:
grant replication slave on *.* to rep2@'10.0.0.%' identified by 'rep123321';flush privileges;

On master01 (10.0.0.6), run CHANGE MASTER TO so that it replicates from master02:
 change master to master_host='10.0.0.7',master_user='rep2',master_password='rep123321',master_auto_position=1;start slave ; show slave status\G

Replication is now configured.
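
On each node, show slave status\G should report Yes for both Slave_IO_Running and Slave_SQL_Running. The sketch below checks exactly those two fields from the \G output on stdin. The function name replication_ok is illustrative, not part of any tool, and it deliberately ignores other fields such as Last_Error.

```shell
# Hypothetical helper: succeed only if both replication threads report "Yes".
# Reads the output of `show slave status\G` on stdin.
replication_ok() {
    awk '
        # The trailing colon keeps Slave_SQL_Running_State from matching.
        /^[ \t]*Slave_IO_Running:/  { io  = $NF }
        /^[ \t]*Slave_SQL_Running:/ { sql = $NF }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

It could be used as: mysql -uroot -p -e 'show slave status\G' | replication_ok && echo "replication threads running"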

2.4 Install Atlas:

Install Atlas on 10.0.0.6:

 [root@mgr01 ~]# wget    https://github.com/Qihoo360/Atlas/releases/download/2.2.1/Atlas-2.2.1.el6.x86_64.rpm
[root@mgr01 ~]# yum  -y localinstall Atlas-2.2.1.el6.x86_64.rpm
or:  rpm -ivh Atlas-2.2.1.el6.x86_64.rpm

Log in to MySQL on 10.0.0.6 and create the user; it is replicated to 10.0.0.7 automatically:

 grant all on *.* to atlasuser@'10.0.0.%' identified by '558996';flush privileges;

Encrypt the user's password:

[root@pxc01 ~]# /usr/local/mysql-proxy/bin/encrypt  558996
Otp11P90TDw=
[root@pxc01 ~]# 

Write the encrypted password into the Atlas configuration file:

[root@pxc01 ~]# grep atlasuser /usr/local/mysql-proxy/conf/mgrmul.cnf 
pwds = atlasuser:Otp11P90TDw=
[root@pxc01 ~]# 

Set the client IPs allowed to connect to Atlas:

[root@pxc01 ~]# grep client-ips  /usr/local/mysql-proxy/conf/mgrmul.cnf
client-ips = 127.0.0.1, 192.168.1, 10.0.0.20, 10.0.0.6
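#(lvs-ips option) IP of the physical NIC of the LVS in front of Atlas (not the virtual IP). Required if LVS is used and client-ips is set; otherwise optional.
10.0.0.20 here is the VIP used later in the demo.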

Atlas configuration file on 10.0.0.6:


[root@pxc01 ~]# cat /usr/local/mysql-proxy/conf/mgrmul.cnf 
[mysql-proxy]

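#Items starting with # are optional

#Username for the admin interface
admin-username = zykjwtest

#Password for the admin interface
admin-password = zykjwtest01

#IP and port of the MySQL master(s) behind Atlas; multiple entries are comma-separated
proxy-backend-addresses=10.0.0.6:3306,10.0.0.7:3306
#IP and port of the MySQL replicas behind Atlas. The number after @ is the load-balancing weight (1 if omitted); multiple entries are comma-separated
proxy-read-only-backend-addresses = 10.0.0.6:3306@1,10.0.0.7:3306

#Usernames and their encrypted MySQL passwords. Encrypt passwords with the encrypt program in PREFIX/bin. The user1 and user2 entries shipped on the next line are samples; replace them with your own MySQL username and encrypted password.
pwds = atlasuser:Otp11P90TDw=

#How Atlas runs: true = daemon, false = foreground. Usually false for development and debugging, true in production. There must be no space after true.
daemon = true

#How Atlas runs: when true, Atlas starts two processes, a monitor and a worker, and the monitor restarts the worker if it exits unexpectedly. When false there is only the worker. Usually false for development and debugging, true in production. There must be no space after true.
keepalive = true

#Number of worker threads; has a large impact on Atlas performance, tune as appropriate
event-threads = 4

#Log level: one of message, warning, critical, error, debug
log-level = warning

#Log directory
log-path = /usr/local/mysql-proxy/log

#SQL log switch: OFF, ON or REALTIME. OFF = no SQL log, ON = log SQL, REALTIME = log SQL and flush to disk in real time. Default is OFF
sql-log = REALTIME

#Slow log setting. When set, only statements that take longer than sql-log-slow (in ms) are logged. When unset, all statements are logged.
#sql-log-slow = 10

#Instance name, used to tell apart multiple Atlas instances on one machine
instance = mgrmul

#IP and port of the Atlas working (proxy) interface
proxy-address = 0.0.0.0:52119

#IP and port of the Atlas admin interface
admin-address = 0.0.0.0:52118

#Table sharding: in this example person is the database, mt the table, id the sharding column and 3 the number of sub-tables. Multiple entries are comma-separated. Omit if sharding is not used
#tables = person.mt.id.3

#Default character set; once set, clients no longer need to run SET NAMES
#charset = utf8

#Client IPs allowed to connect to Atlas: exact IPs or IP segments, comma-separated. If unset, all IPs are allowed; otherwise only the listed ones
client-ips = 127.0.0.1, 192.168.1, 10.0.0.20, 10.0.0.6

#IP of the physical NIC of the LVS in front of Atlas (not the virtual IP). Required if LVS is used and client-ips is set; otherwise optional
#lvs-ips = 127.0.0.1, 10.0.0.20, 10.0.0.6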
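
Scripts around this setup often need a single value from mgrmul.cnf, for example the proxy port or the pwds entry. The sketch below reads one key from the key = value format shown above. The function name atlas_conf_get is hypothetical, and it is not a full parser: it skips comment lines, returns the first match, and keeps any = that appears inside the value, which matters for the base64-style encrypted password.

```shell
# Hypothetical helper: print the value of one key from an Atlas config file.
# $1 = config file, $2 = key (e.g. proxy-address)
atlas_conf_get() {
    awk -v key="$2" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            eq = index($0, "=")                  # position of the first "="
            if (eq == 0) next                    # e.g. the [mysql-proxy] header
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)               # keeps any later "=" in the value
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }
    ' "$1"
}
```

For example, atlas_conf_get /usr/local/mysql-proxy/conf/mgrmul.cnf proxy-address would print the listen address configured above.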

Starting and stopping Atlas:

/usr/local/mysql-proxy/bin/mysql-proxyd mgrmul start
/usr/local/mysql-proxy/bin/mysql-proxyd mgrmul stop
/usr/local/mysql-proxy/bin/mysql-proxyd mgrmul status
/usr/local/mysql-proxy/bin/mysql-proxyd mgrmul restart

Started successfully:

[root@pxc01 log]# ss -lntup|grep mysql-proxy
tcp    LISTEN     0      128       *:52118                 *:*                   users:(("mysql-proxy",pid=10543,fd=9))
tcp    LISTEN     0      128       *:52119                 *:*                   users:(("mysql-proxy",pid=10543,fd=10))

Uninstalling Atlas:

[root@mgr02 local]# rpm -qa|grep Atlas 
Atlas-2.2.1-1.x86_64
[root@mgr01 local]# rpm -e --nodeps Atlas-2.2.1-1.x86_64 
[root@mgr01 local]# rpm -qa|grep Atlas 
[root@mgr01 local]#

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
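Install Atlas on 10.0.0.7:

If Atlas runs only on 10.0.0.6 it is a single point of failure: if the Atlas service or the 10.0.0.6 machine itself goes down, the whole service is interrupted. So Atlas is also installed on 10.0.0.7, and keepalived is introduced to provide high availability.

Repeat the 10.0.0.6 installation steps on 10.0.0.7 to install Atlas, then start it.

Configuration notes:

Set the client IPs allowed to connect to Atlas: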
[root@pxc02 ~]# grep client-ips  /usr/local/mysql-proxy/conf/mgrmul.cnf
client-ips = 127.0.0.1, 192.168.1, 10.0.0.20, 10.0.0.7
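#(lvs-ips option) IP of the physical NIC of the LVS in front of Atlas (not the virtual IP). Required if LVS is used and client-ips is set; otherwise optional

The configuration files on 10.0.0.6 and 10.0.0.7 differ only in this IP, which must be set to the local IP. All other parameters are identical.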

2.4 Testing Atlas:

+++++++++++++++++++Test on 10.0.0.6+++++++++++++++++

Log in to the backend MySQL servers through Atlas:

[root@pxc01 log]# mysql -uatlasuser -p'558996' -h10.0.0.6 -P52119
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.0.81-log MySQL Community Server (GPL)

Copyright (c) 2009-2018 Percona LLC and/or its affiliates
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

(atlasuser@'mgr01':52119)[(none)]>show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test01             |
| test02             |
+--------------------+
6 rows in set (0.00 sec)

Write test data:


use test01;
CREATE TABLE `test01` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `titles` char(15) NOT NULL,
  `icon` smallint(6) unsigned DEFAULT '0',
  `integral` int(10) NOT NULL DEFAULT '0',
  `isdefault` tinyint(1) unsigned NOT NULL DEFAULT '0',
`create_time` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `integral` (`integral`)
) ENGINE=Innodb AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
insert into test01.test01 values(1,'列兵',1,0,1,now());

Check the SQL log:

[root@pxc01 log]# tailf /usr/local/mysql-proxy/log/sql_mgrmul.log 
[03/02/2019 18:56:51] C:10.0.0.6:58637 S:10.0.0.7:3306 OK 26.275 "select @@version_comment limit 1"
[03/02/2019 18:56:51] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 21.822 "select USER()"
[03/02/2019 18:57:00] C:10.0.0.6:58637 S:10.0.0.7:3306 OK 6.994 "select @@port"
[03/02/2019 18:57:04] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 2.230 "select @@port"
[03/02/2019 18:57:18] C:10.0.0.6:58637 S:10.0.0.7:3306 OK 3.958 "select @@hostname"
[03/02/2019 18:57:20] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 0.937 "select @@hostname"
[03/02/2019 18:57:22] C:10.0.0.6:58637 S:10.0.0.7:3306 OK 3.855 "select @@hostname"
[03/02/2019 19:02:08] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 2.456 "SELECT DATABASE()"
[03/02/2019 19:02:08] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 0.807 "show databases"
[03/02/2019 19:02:08] C:10.0.0.6:58637 S:10.0.0.7:3306 OK 86.643 "show tables"
[03/02/2019 19:02:10] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 377.687 "CREATE TABLE `test01` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `titles` char(15) NOT NULL,
  `icon` smallint(6) unsigned DEFAULT '0',
  `integral` int(10) NOT NULL DEFAULT '0',
  `isdefault` tinyint(1) unsigned NOT NULL DEFAULT '0',
`create_time` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `integral` (`integral`)
) ENGINE=Innodb AUTO_INCREMENT=0 DEFAULT CHARSET=utf8"
[03/02/2019 19:02:18] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 425.538 "insert into test01.test01 values(1,'列兵',1,0,1,now())"
[03/02/2019 19:02:36] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 2.272 "select * from test01.test01"
[03/02/2019 19:02:39] C:10.0.0.6:58637 S:10.0.0.6:3306 OK 2.337 "select * from test01.test01"
[03/02/2019 19:02:55] C:10.0.0.6:58637 S:10.0.0.7:3306 OK 70.055 "select * from test01.test01"

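Read/write splitting works: the CREATE TABLE and INSERT went to 10.0.0.6, while SELECTs were spread across 10.0.0.6 and 10.0.0.7.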
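
To see the split at a glance rather than by reading the log line by line, the entries can be counted per backend. The sketch below assumes the log line layout shown above ([date time] C:client S:backend OK ms "statement"). The function name sql_log_summary is made up, and as a simplification it counts SELECT and SHOW as reads and every other statement as a write.

```shell
# Hypothetical helper: count statements per backend from an Atlas SQL log
# read on stdin. Continuation lines of multi-line statements are ignored.
sql_log_summary() {
    awk '
        /^\[/ {
            backend = substr($4, 3)                  # drop the "S:" prefix
            stmt = substr($0, index($0, "\"") + 1)   # text after the first quote
            split(stmt, w, " ")
            verb = tolower(w[1])
            kind = (verb == "select" || verb == "show") ? "read" : "write"
            count[backend " " kind]++
        }
        END { for (k in count) print k, count[k] }
    ' | LC_ALL=C sort
}
```

Typical use would be sql_log_summary < /usr/local/mysql-proxy/log/sql_mgrmul.log, which prints one line per backend and kind with its count.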

Log in to the Atlas admin interface:

[root@pxc01 log]# mysql -uzykjwtest -pzykjwtest01 -h127.0.0.1 -P52118

List the Atlas admin commands:


(zykjwtest@'mgr01':52118)[(none)]>SELECT * FROM help;
+----------------------------+---------------------------------------------------------+
| command                    | description                                             |
+----------------------------+---------------------------------------------------------+
| SELECT * FROM help         | shows this help                                         |
| SELECT * FROM backends     | lists the backends and their state                      |
| SET OFFLINE $backend_id    | offline backend server, $backend_id is backend_ndx's id |
| SET ONLINE $backend_id     | online backend server, ...                              |
| ADD MASTER $backend        | example: "add master 127.0.0.1:3306", ...               |
| ADD SLAVE $backend         | example: "add slave 127.0.0.1:3306", ...                |
| REMOVE BACKEND $backend_id | example: "remove backend 1", ...                        |
| SELECT * FROM clients      | lists the clients                                       |
| ADD CLIENT $client         | example: "add client 192.168.1.2", ...                  |
| REMOVE CLIENT $client      | example: "remove client 192.168.1.2", ...               |
| SELECT * FROM pwds         | lists the pwds                                          |
| ADD PWD $pwd               | example: "add pwd user:raw_password", ...               |
| ADD ENPWD $pwd             | example: "add enpwd user:encrypted_password", ...       |
| REMOVE PWD $pwd            | example: "remove pwd user", ...                         |
| SAVE CONFIG                | save the backends to config file                        |
| SELECT VERSION             | display the version of Atlas                            |
+----------------------------+---------------------------------------------------------+
16 rows in set (0.00 sec)

Show each backend's up/down state and read/write type:

(zykjwtest@'mgr01':52118)[(none)]>SELECT * FROM backends;
+-------------+---------------+-------+------+
| backend_ndx | address       | state | type |
+-------------+---------------+-------+------+
|           1 | 10.0.0.6:3306 | up    | rw   |
|           2 | 10.0.0.7:3306 | up    | rw   |
|           3 | 10.0.0.6:3306 | up    | ro   |
|           4 | 10.0.0.7:3306 | up    | ro   |
+-------------+---------------+-------+------+
4 rows in set (0.00 sec)
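
For monitoring, the backends table can be filtered down to the rows that are not up. The sketch below parses the table as printed by the mysql client above. The function name down_backends is hypothetical, and it assumes the four-column layout (backend_ndx, address, state, type) shown here.

```shell
# Hypothetical helper: from the `SELECT * FROM backends` table on stdin,
# print "address type state" for every backend whose state is not "up".
down_backends() {
    awk -F'|' '
        NF >= 5 {                                  # data and header rows only
            addr = $3; state = $4; type = $5
            gsub(/ /, "", addr); gsub(/ /, "", state); gsub(/ /, "", type)
            if (state != "state" && state != "up") # skip header, keep non-up rows
                print addr, type, state
        }
    '
}
```

It could be fed from the admin port, for example: mysql -uzykjwtest -p -h127.0.0.1 -P52118 -t -e 'SELECT * FROM backends' | down_backends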

++++++++++++++++++++++++Test on 10.0.0.7++++++++++++++++++++++++++
Install Atlas on 10.0.0.7 following the same steps as on 10.0.0.6,
then start Atlas:

[root@pxc02 conf]# /usr/local/mysql-proxy/bin/mysql-proxyd mgrmul start
OK: MySQL-Proxy of mgrmul is started
[root@pxc02 conf]# 
[root@pxc02 conf]# 
[root@pxc02 conf]# ps -ef|grep mysql-proxy
root       9761      1  0 19:35 ?        00:00:00 /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/mgrmul.cnf
root       9762   9761 43 19:35 ?        00:00:03 /usr/local/mysql-proxy/bin/mysql-proxy --defaults-file=/usr/local/mysql-proxy/conf/mgrmul.cnf
root       9770   7701  0 19:35 pts/3    00:00:00 grep --color=auto mysql-proxy
[root@pxc02 conf]# ss -lntup|grep mysql-proxy
tcp    LISTEN     0      128       *:52118                 *:*                   users:(("mysql-proxy",pid=9762,fd=9))
tcp    LISTEN     0      128       *:52119                 *:*                   users:(("mysql-proxy",pid=9762,fd=10))
[root@pxc02 conf]# 

Test the Atlas service on 10.0.0.7. Writes and queries both work:


[root@pxc02 conf]# mysql -uatlasuser -p'558996' -h10.0.0.7 -P52119
(atlasuser@'mgr02':52119)[(none)]>insert into test01.test01 values(3,'排长',3,2000,1,now());
Query OK, 1 row affected (0.10 sec)
(atlasuser@'mgr02':52119)[(none)]>select * from test01.test01;
+----+--------+------+----------+-----------+---------------------+
| id | titles | icon | integral | isdefault | create_time         |
+----+--------+------+----------+-----------+---------------------+
|  1 | 列兵   |    1 |        0 |         1 | 2019-03-02 19:02:17 |
|  2 | 班长   |    2 |     1000 |         1 | 2019-03-02 19:26:01 |
|  3 | 排长   |    3 |     2000 |         1 | 2019-03-02 19:42:35 |
+----+--------+------+----------+-----------+---------------------+
3 rows in set (0.02 sec)

However, logging in to the Atlas admin interface on 10.0.0.7 fails with an error:

[root@pxc02 conf]# mysql -uzykjwtest -pzykjwtest01 -h127.0.0.1 -P52118
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1105 (07000): use 'SELECT * FROM help' to see the supported commands

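**The cause has not been found yet.**
However, when Atlas is started on a separate machine, 10.0.0.131, and pointed at the backend databases 10.0.0.6 and 10.0.0.7, logging in to the Atlas admin interface works normally: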

[root@pxc03 conf]# mysql -uzykjwtest -p'zykjwtest01'  -h10.0.0.131 -P52118
([email protected]:52118)[(none)]>select * from test01;
ERROR 1105 (07000): use 'SELECT * FROM help' to see the supported commands
([email protected]:52118)[(none)]>SELECT * FROM backends;
+-------------+---------------+-------+------+
| backend_ndx | address       | state | type |
+-------------+---------------+-------+------+
|           1 | 10.0.0.6:3306 | up    | rw   |
|           2 | 10.0.0.7:3306 | up    | rw   |
|           3 | 10.0.0.6:3306 | up    | ro   |
|           4 | 10.0.0.7:3306 | down  | ro   |
+-------------+---------------+-------+------+
4 rows in set (0.00 sec)

Writes also work normally:

[root@pxc03 conf]# mysql -uatlasuser -p'558996'  -h10.0.0.131 -P52119
([email protected]:52119)[(none)]>use test02;
Database changed
([email protected]:52119)[test02]>show tables;
Empty set (0.00 sec)

([email protected]:52119)[test02]>CREATE TABLE `test01` (
    ->   `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
    ->   `titles` char(15) NOT NULL,
    ->   `icon` smallint(6) unsigned DEFAULT '0',
    ->   `integral` int(10) NOT NULL DEFAULT '0',
    ->   `isdefault` tinyint(1) unsigned NOT NULL DEFAULT '0',
    -> `create_time` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
    ->   PRIMARY KEY (`id`),
    ->   KEY `integral` (`integral`)
    -> ) ENGINE=Innodb AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.51 sec)

([email protected]:52119)[test02]>insert into test02.test01 values(1,'列兵',1,0,1,now());
Query OK, 1 row affected (0.05 sec)

([email protected]:52119)[test02]>select * from test01;
+----+--------+------+----------+-----------+---------------------+
| id | titles | icon | integral | isdefault | create_time         |
+----+--------+------+----------+-----------+---------------------+
|  1 | 列兵   |    1 |        0 |         1 | 2019-03-03 01:30:18 |
+----+--------+------+----------+-----------+---------------------+
1 row in set (0.02 sec)

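The problem above is most likely related to the local environment of the 10.0.0.7 VM, but the exact cause was not found. One possible lead: in the Atlas SQL logs from 10.0.0.7 in section 3, the client sends SET SQL_SAFE_UPDATES=1,SQL_SELECT_LIMIT=1000,MAX_JOIN_SIZE=1000000 on connect, which is what the mysql client does when safe-updates is enabled in its configuration. The admin interface only accepts its own commands, so that statement may be what triggers ERROR 1105.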

2.5 Install keepalived

Installation on 10.0.0.6:

 wget http://www.keepalived.org/software/keepalived-1.4.0.tar.gz
 tar xf keepalived-1.4.0.tar.gz  -C /usr/local/
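 cd /usr/local/keepalived-1.4.0/
 ./configure && make && make install   # build and install; the default prefix is /usr/local, which the paths below rely on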
  mkdir /etc/keepalived
 cp /usr/local/keepalived-1.4.0/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/ 
 find / -name "keepalived"
 cp /usr/local/etc/sysconfig/keepalived  /etc/sysconfig/
 cp /usr/local/keepalived-1.4.0/keepalived/etc/init.d/keepalived /etc/init.d/
 chmod +x /etc/init.d/keepalived 
 cp /usr/local/sbin/keepalived /usr/sbin/
 which keepalived
 cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.ori
 systemctl disable keepalived.service
systemctl status keepalived.service
 /etc/init.d/keepalived status
 /etc/init.d/keepalived start

keepalived.conf on 10.0.0.6:
#### References:
https://blog.51cto.com/13581826/2109449
https://segmentfault.com/a/1190000014484218


[root@mgr01 keepalived]# cat keepalived.conf
global_defs {
   notification_email {
   [email protected]
   }
   notification_email_from [email protected]
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_01
}

vrrp_script chk_atlas {
     script "lsof -i:52118 | grep mysql-pro || exit 1"  ### If Atlas is listening on port 52118 the VIP stays on this node; if not, the VIP moves to 10.0.0.7
     interval 2                                         ### Alternative: lsof -i:52118 | grep mysql-pro || pkill keepalived, which kills the local keepalived when Atlas is down
     fall 1
}
vrrp_instance VI_1 {
    state BACKUP
    interface eno16777736
    virtual_router_id 51
    #mcast_src_ip 10.0.0.6
    priority 120
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
   track_script {
           chk_atlas
    }
    virtual_ipaddress {
    10.0.0.20
    }

}
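
The chk_atlas check above greps lsof output inline. The same test can be written as a small shell function that could live in a standalone check script. This is only a sketch: the function name port_is_listening is made up, and it expects the output of ss -lnt (State in column 1, local address in column 4) on stdin rather than the ss -lntup form used earlier.

```shell
# Hypothetical helper: succeed if some socket is listening on the given TCP port.
# $1 = port; reads `ss -lnt` output on stdin.
port_is_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, a, ":")          # local address, e.g. *:52118 or [::]:52118
            if (a[n] == port) found = 1    # last piece is the port
        }
        END { exit !found }
    '
}
```

A check script would then end with something like: ss -lnt | port_is_listening 52118 || exit 1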

Start keepalived:

[root@pxc01 log]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]
[root@pxc01 log]# 

Confirm that it started and that the VIP 10.0.0.20 is bound:

[root@pxc01 log]# /etc/init.d/keepalived status
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2019-03-02 19:59:32 CST; 16s ago
  Process: 10941 ExecStart=/usr/local/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 10942 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─10942 /usr/local/sbin/keepalived -D
           ├─10943 /usr/local/sbin/keepalived -D
           └─10944 /usr/local/sbin/keepalived -D

Mar 02 19:59:43 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:43 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:43 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:43 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20

[root@pxc01 log]# ip a|grep 10.0.0.20
    inet 10.0.0.20/32 scope global eno16777736
[root@pxc01 log]# 
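
Note that ip a|grep 10.0.0.20 is a loose match: the dots match any character and 10.0.0.200 would match too. A stricter check compares the address field exactly. A sketch, with the hypothetical function name has_vip, reading ip addr output on stdin:

```shell
# Hypothetical helper: succeed if the given IPv4 address is configured locally.
# $1 = VIP; reads `ip addr` (or `ip -o -4 addr show`) output on stdin.
has_vip() {
    awk -v vip="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), a, "/")      # strip the /prefix length
                    if (a[1] == vip) found = 1
                }
        }
        END { exit !found }
    '
}
```

For example: ip addr | has_vip 10.0.0.20 && echo "this node holds the VIP"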

Confirm that the backend MySQL servers can be reached through Atlas via the VIP:

[root@pxc01 log]# mysql -uatlasuser -p'558996' -h10.0.0.20 -P52119
(atlasuser@'mgr01':52119)[(none)]>insert into test01.test01 values(4,'连长',4,3000,1,now());
Query OK, 1 row affected (0.06 sec)

(atlasuser@'mgr01':52119)[(none)]>select * from test01.test01;
+----+--------+------+----------+-----------+---------------------+
| id | titles | icon | integral | isdefault | create_time         |
+----+--------+------+----------+-----------+---------------------+
|  1 | 列兵   |    1 |        0 |         1 | 2019-03-02 19:02:17 |
|  2 | 班长   |    2 |     1000 |         1 | 2019-03-02 19:26:01 |
|  3 | 排长   |    3 |     2000 |         1 | 2019-03-02 19:42:35 |
|  4 | 连长   |    4 |     3000 |         1 | 2019-03-02 20:08:06 |
+----+--------+------+----------+-----------+---------------------+
4 rows in set (0.01 sec)

Check the SQL log to confirm that the session could write to and query MySQL:

[03/02/2019 20:07:16] C:10.0.0.20:52938 S:10.0.0.6:3306 OK 17.113 "select * from test01.test01"
[03/02/2019 20:08:06] C:10.0.0.20:52938 S:10.0.0.6:3306 OK 56.423 "insert into test01.test01 values(4,'连长',4,3000,1,now())"
[03/02/2019 20:08:08] C:10.0.0.20:52938 S:10.0.0.6:3306 OK 1.751 "select * from test01.test01"
[03/02/2019 20:08:09] C:10.0.0.20:52938 S:10.0.0.7:3306 OK 18.499 "select * from test01.test01"
[03/02/2019 20:08:10] C:10.0.0.20:52938 S:10.0.0.6:3306 OK 1.911 "select * from test01.test01"

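Install keepalived on 10.0.0.7:

Deploy and start keepalived on 10.0.0.7 in the same way.
Note: priority in this node's keepalived.conf must be lower than on 10.0.0.6, and both 10.0.0.6 and 10.0.0.7 must use "state BACKUP" together with nopreempt (non-preemptive mode).
The reason:
Both nodes should be in BACKUP mode so that, if network latency exceeds the heartbeat interval and a split-brain occurs, the nodes do not fight over MASTER and cause conflicting writes of the same data.

Start keepalived on 10.0.0.7: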
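
The effect of nopreempt can be illustrated with a deliberately simplified model: a healthy node that already holds the VIP keeps it, and priority only decides the winner when the current holder is gone. The sketch below is not how keepalived is implemented, the function name vip_holder is made up, and the priority 100 used for pxc02 in the example is an assumption (the text only says it must be lower than 120).

```shell
# Simplified, hypothetical model of VIP ownership with nopreempt.
# $1 = node currently holding the VIP ("" if none)
# stdin: one line per node, "name priority up|down"
vip_holder() {
    awk -v cur="$1" '
        $3 == "up" {
            if ($1 == cur) keep = 1                    # holder is healthy: no preemption
            if (best == "" || $2 + 0 > bestprio) {     # track highest-priority healthy node
                best = $1; bestprio = $2 + 0
            }
        }
        END { print (keep ? cur : (best == "" ? "none" : best)) }
    '
}
```

With pxc01 at 120 and pxc02 at 100, an initial election picks pxc01; after pxc01 fails the VIP moves to pxc02; and when pxc01 comes back, pxc02 keeps the VIP because it is still healthy.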


[root@pxc02 conf]# /etc/init.d/keepalived start
Starting keepalived (via systemctl):                       [  OK  ]
You have new mail in /var/spool/mail/root
[root@pxc02 conf]# 
[root@pxc02 conf]# ps -ef|grep keep
root       9977      1  1 20:11 ?        00:00:00 /usr/local/sbin/keepalived -D
root       9978   9977  0 20:11 ?        00:00:00 /usr/local/sbin/keepalived -D
root       9979   9977  1 20:11 ?        00:00:00 /usr/local/sbin/keepalived -D
root       9993   7701  0 20:11 pts/3    00:00:00 grep --color=auto keep
[root@pxc02 conf]# /etc/init.d/keepalived status
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2019-03-02 20:11:19 CST; 15s ago
  Process: 9976 ExecStart=/usr/local/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 9977 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─9977 /usr/local/sbin/keepalived -D
           ├─9978 /usr/local/sbin/keepalived -D
           └─9979 /usr/local/sbin/keepalived -D

Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: WARNING - script `lsof` resolved by path search to `/usr/sbin/lsof`. Please specify full path.
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: VRRP_Instance(VI_1) removing protocol VIPs.
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: Using LinkWatch kernel netlink reflector...
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: VRRP_Instance(VI_1) Entering BACKUP STATE
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: VRRP_Instance(VI_1) Now in FAULT state
Mar 02 20:11:19 pxc02 Keepalived_vrrp[9979]: VRRP_Script(chk_atlas) succeeded
Mar 02 20:11:20 pxc02 Keepalived_vrrp[9979]: VRRP_Instance(VI_1) Entering BACKUP STATE
[root@pxc02 conf]# 

Because keepalived on 10.0.0.6 has a higher priority than on 10.0.0.7, the VIP is bound to the NIC on 10.0.0.6.

3. Failure simulation

Test with the Atlas service on 10.0.0.6 stopped:
Check keepalived on 10.0.0.6. It has already been killed by the pkill keepalived in the check script:

[root@pxc01 keepalived]# /etc/init.d/keepalived status
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 19:59:48 pxc01 Keepalived_vrrp[10944]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:29 pxc01 Keepalived[10942]: Stopping
Mar 02 20:16:29 pxc01 Keepalived_vrrp[10944]: VRRP_Script(chk_atlas) failed (due to signal 15)
Mar 02 20:16:29 pxc01 Keepalived_vrrp[10944]: VRRP_Instance(VI_1) sent 0 priority
Mar 02 20:16:29 pxc01 Keepalived_vrrp[10944]: VRRP_Instance(VI_1) removing protocol VIPs.
Mar 02 20:16:30 pxc01 Keepalived_vrrp[10944]: Stopped
Mar 02 20:16:30 pxc01 Keepalived[10942]: Stopped Keepalived v1.4.0 (12/29,2017)
[root@pxc01 keepalived]# 

Check keepalived on 10.0.0.7. The VIP is now bound there:

[root@pxc02 ~]# /etc/init.d/keepalived status
● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Sat 2019-03-02 20:11:19 CST; 11min ago
  Process: 9976 ExecStart=/usr/local/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 9977 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─9977 /usr/local/sbin/keepalived -D
           ├─9978 /usr/local/sbin/keepalived -D
           └─9979 /usr/local/sbin/keepalived -D
Mar 02 20:16:30 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:30 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:30 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:30 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:35 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:35 pxc02 Keepalived_vrrp[9979]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eno16777736 for 10.0.0.20
Mar 02 20:16:35 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:35 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:35 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
Mar 02 20:16:35 pxc02 Keepalived_vrrp[9979]: Sending gratuitous ARP on eno16777736 for 10.0.0.20
[root@pxc02 ~]# 

On 10.0.0.7, log in to MySQL through Atlas:


[root@pxc02 conf]# mysql -uatlasuser -p'558996' -h10.0.0.20 -P52119
(atlasuser@'mgr02':52119)[(none)]>insert into test01.test01 values(5,'营长',5,4000,1,now());
Query OK, 1 row affected (0.06 sec)

(atlasuser@'mgr02':52119)[(none)]>select * from test01.test01;
+----+--------+------+----------+-----------+---------------------+
| id | titles | icon | integral | isdefault | create_time         |
+----+--------+------+----------+-----------+---------------------+
|  1 | 列兵   |    1 |        0 |         1 | 2019-03-02 19:02:17 |
|  2 | 班长   |    2 |     1000 |         1 | 2019-03-02 19:26:01 |
|  3 | 排长   |    3 |     2000 |         1 | 2019-03-02 19:42:35 |
|  4 | 连长   |    4 |     3000 |         1 | 2019-03-02 20:08:06 |
|  5 | 营长   |    5 |     4000 |         1 | 2019-03-02 20:20:43 |
+----+--------+------+----------+-----------+---------------------+
5 rows in set (0.01 sec)

Check the Atlas SQL log on 10.0.0.7. Read/write splitting is working normally:


[03/02/2019 20:19:31] C:10.0.0.20:35793 S:10.0.0.7:3306 OK 25.475 "SET SQL_SAFE_UPDATES=1,SQL_SELECT_LIMIT=1000,MAX_JOIN_SIZE=1000000"
[03/02/2019 20:19:31] C:10.0.0.20:35793 S:10.0.0.6:3306 OK 34.604 "select @@version_comment limit 1"
[03/02/2019 20:19:31] C:10.0.0.20:35793 S:10.0.0.7:3306 OK 0.245 "select USER()"
[03/02/2019 20:19:36] C:10.0.0.20:35793 S:10.0.0.6:3306 OK 23.035 "select * from test01.test01"
[03/02/2019 20:20:43] C:10.0.0.20:35793 S:10.0.0.6:3306 OK 60.000 "insert into test01.test01 values(5,'营长',5,4000,1,now())"
[03/02/2019 20:20:45] C:10.0.0.20:35793 S:10.0.0.6:3306 OK 3.905 "select * from test01.test01"
[03/02/2019 20:20:47] C:10.0.0.20:35793 S:10.0.0.7:3306 OK 3.395 "select * from test01.test01"
[03/02/2019 20:20:55] C:10.0.0.20:35793 S:10.0.0.6:3306 OK 2.126 "select * from test01.test01"

+++++++++++++++++++++++++++++++++++++++++++++
Test with the MySQL service on 10.0.0.6 stopped:

Installing a MySQL client on CentOS 7:

yum install -y mariadb.x86_64 mariadb-libs.x86_64
[root@pxc02 ~]# rpm -qa|grep mariadb
mariadb-5.5.60-1.el7_5.x86_64
mariadb-libs-5.5.60-1.el7_5.x86_64
[root@pxc02 ~]# mysql -V
mysql  Ver 15.1 Distrib 5.5.60-MariaDB, for Linux (x86_64) using readline 5.1
[root@pxc02 ~]# 
[root@pxc02 ~]# mysql -uatlasuser -p'558996' -h10.0.0.20 -P52119
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 5.0.81-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

(atlasuser@'mgr02':52119)[(none)]>insert into test01.test01 values(6,'团长',6,5000,1,now());
Query OK, 1 row affected (0.15 sec)

(atlasuser@'mgr02':52119)[(none)]>select * from test01.test01;
+----+--------+------+----------+-----------+---------------------+
| id | titles | icon | integral | isdefault | create_time         |
+----+--------+------+----------+-----------+---------------------+
|  1 | 列兵   |    1 |        0 |         1 | 2019-03-02 19:02:17 |
|  2 | 班长   |    2 |     1000 |         1 | 2019-03-02 19:26:01 |
|  3 | 排长   |    3 |     2000 |         1 | 2019-03-02 19:42:35 |
|  4 | 连长   |    4 |     3000 |         1 | 2019-03-02 20:08:06 |
|  5 | 营长   |    5 |     4000 |         1 | 2019-03-02 20:20:43 |
|  6 | 团长   |    6 |     5000 |         1 | 2019-03-02 21:05:40 |
+----+--------+------+----------+-----------+---------------------+
6 rows in set (0.00 sec)

Check the Atlas SQL log on 10.0.0.7:
Reads and writes are both served normally.

[03/02/2019 21:03:14] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 0.464 "SET SQL_SAFE_UPDATES=1,SQL_SELECT_LIMIT=1000,MAX_JOIN_SIZE=1000000"
[03/02/2019 21:03:14] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 0.322 "select @@version_comment limit 1"
[03/02/2019 21:03:14] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 0.245 "select USER()"
[03/02/2019 21:03:51] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 0.572 "select * from test01.test01"
[03/02/2019 21:05:40] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 141.829 "insert into test01.test01 values(6,'团长',6,5000,1,now())"
[03/02/2019 21:05:43] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 0.723 "select * from test01.test01"
[03/02/2019 21:05:44] C:10.0.0.20:36004 S:10.0.0.7:3306 OK 3.542 "select * from test01.test01"

At this point only the 10.0.0.7 node serves reads and writes.

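4. Summary of this database architecture

4.1 What to watch for when configuring dual-master replication?

① keepalived + MySQL dual-master is generally used for small and medium-sized databases. When the master node fails, keepalived's VIP mechanism switches quickly to the standby node.
② In the keepalived configuration, both nodes should preferably be in BACKUP mode, or keepalived should not be started at boot. This avoids a split-brain, when network latency exceeds the heartbeat check time, in which the nodes fight over MASTER and conflicting writes of the same data occur.
③ Use the same auto_increment_increment (step) on both nodes and a different auto_increment_offset (starting point) on each. If the master crashes unexpectedly, some binlog events may not yet have been copied to and applied on the slave, so auto-increment values newly generated on the slave could collide with those already used on the old master. Staggering the offsets keeps the id sequences apart from the start and avoids primary key conflicts. If a suitable conflict-resolution mechanism exists, this setting is optional.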

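4.2 How to deal with replication lag?

① First check Seconds_Behind_Master in show slave status\G to see how far the slave is behind, but do not rely on this value alone; it is not an accurate measure.

② The slave's hardware should not be much weaker than the master's, otherwise replication lag grows significantly.
③ MySQL 5.7 supports multi-threaded (parallel) replication, which can greatly reduce lag caused by the slave applying changes on a single thread; it does not remove lag caused by the network itself. Enable these parameters on the slave. Switching to the MariaDB branch is another possible way to reduce replication lag.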
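    mysql> show global variables like 'slave_paralle%';
    Variable_name   Value
    slave_parallel_type     DATABASE
    slave_parallel_workers  0
    slave_parallel_workers: default 0, i.e. single-threaded apply
    slave_parallel_type: default DATABASE, i.e. one thread per DATABASE
    mysql> stop slave sql_thread; #the SQL thread must be stopped before changing slave_parallel_type
    mysql> set global slave_parallel_workers=4; #use four worker threads
    mysql> set global slave_parallel_type='logical_clock'; #switch to logical-clock (group commit based) parallel replication
    mysql> start slave sql_thread;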
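④ Speed up DDL on the master. In addition, the master takes the writes and needs high data safety, for example sync_binlog=1 and innodb_flush_log_at_trx_commit=1, whereas a pure slave does not need that level of safety: sync_binlog can be set to 0 or binlog turned off, and innodb_flush_log_at_trx_commit can be set to 0 to speed up SQL execution. In this dual-master setup each node is also a master for the other, so binlog must stay enabled on both. Another option is to give the slave better hardware than the master.

4.3 MySQL features to avoid in a dual-master setup
When running dual-master in production, avoid MySQL triggers, foreign keys and similar features; stored procedures are not recommended either. If these features cause problems and there is no dedicated DBA to maintain them, it is better to implement the logic in application code.

How to rejoin the failed 10.0.0.6 to the dual-master pair, and what to watch for when doing so, will be covered in a future test.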

Reposted from: https://blog.51cto.com/wujianwei/2357266
