MySQL 8: Configuring an MGR Single-Primary, Multi-Secondary Cluster

I. Introduction to MGR

1. Official documentation

Chapter 18 Group Replication

2. What is MGR

MySQL Group Replication (MGR for short) is MySQL's official high-availability and high-scalability solution, based on state-machine replication over a Paxos-style protocol. Before MGR appeared, the common MySQL high-availability setups, however the architecture varied, were all essentially master-slave replication. Starting with MySQL 5.7, lossless semi-synchronous replication further improved the consistency of data replication. MGR is currently available only for MySQL 5.7 and MySQL 8.0.

MGR has the following advantages:

  • High consistency: group replication built on native replication and the Paxos protocol.

  • High fault tolerance: automatic failure detection removes a crashed node from the group while the remaining nodes keep serving traffic (similar to a ZooKeeper cluster); resource conflicts between nodes are resolved on a first-committed-wins basis, and split-brain protection is built in.

  • High scalability: nodes can be added and removed online at any time; a new node automatically catches up until it is consistent with the rest of the group, and group membership is maintained automatically.

  • High flexibility: shipped as a plugin (the .so plugin is bundled since 5.7.17) with both single-primary and multi-primary modes. In single-primary mode only the primary accepts writes; the secondaries are put into super_read_only state and can only serve reads, and a new primary is elected automatically on failure.

Infrastructure requirements for MGR (see the official documentation, Group Replication Requirements); the check queries after this list can help verify the first two items:

  • The storage engine must be InnoDB, because transactional support is needed for conflict detection across nodes at commit time.

  • Every table must have a primary key; primary key values are compared during transaction conflict detection.

  • The binary log must be enabled and use the ROW format.

  • GTIDs must be enabled, replication state must be stored in tables (--master-info-repository=TABLE, --relay-log-info-repository=TABLE), and --log-slave-updates must be on.

  • Write-set extraction for conflict detection must be configured with --transaction-write-set-extraction=XXHASH64.
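
Before enabling MGR it can be handy to verify the engine and primary-key requirements with a couple of queries against information_schema. These checks are an addition of this article's editor, not part of the official requirements list; both queries should return no rows:

# tables not using InnoDB
SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE ENGINE <> 'InnoDB'
  AND TABLE_TYPE = 'BASE TABLE'
  AND TABLE_SCHEMA NOT IN ('mysql','information_schema','performance_schema','sys');

# tables without a primary key
SELECT t.TABLE_SCHEMA, t.TABLE_NAME
FROM information_schema.TABLES t
LEFT JOIN information_schema.TABLE_CONSTRAINTS c
       ON c.TABLE_SCHEMA = t.TABLE_SCHEMA
      AND c.TABLE_NAME = t.TABLE_NAME
      AND c.CONSTRAINT_TYPE = 'PRIMARY KEY'
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND t.TABLE_SCHEMA NOT IN ('mysql','information_schema','performance_schema','sys')
  AND c.CONSTRAINT_NAME IS NULL;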

MGR limitations (see the official documentation, 18.9.2 Group Replication Limitations):

  • Group Replication cannot coexist with the regular binlog checksum; set --binlog-checksum=NONE.

  • Gap locks are not supported; the isolation level should be set to READ COMMITTED.

  • Table lock statements (LOCK/UNLOCK TABLES) are not supported and are not propagated to the other nodes, which affects operations that rely on table locks, e.g. a full-table backup and restore with mysqldump.

  • The SERIALIZABLE isolation level is not supported.

  • DDL statements are not atomic within the group and are not conflict-checked; after running DDL you need to verify consistency yourself.

  • Foreign keys are not supported in multi-primary mode; single-primary mode does not have this problem.

  • At most 9 nodes are supported: a 10th server cannot join the group.

3. MGR compared with other replication modes (see the official documentation)

MySQL asynchronous replication
The master commits a transaction without waiting for acknowledgment from any slave; whether a slave has received the master's binlog is of no concern to the master. A slave writes the received binlog into its relay log and then applies the SQL in the relay log asynchronously. Because the commit does not confirm that the relay log was received correctly, the master cannot tell when a slave fails to receive the binlog or fails to apply the relay log. If the master crashes before the binlog reaches a slave and the failover procedure promotes that slave to the new master, the data becomes inconsistent. In addition, under high concurrency a traditional replica can lag far behind the master. The principle is shown in the figure below.
[Figure 1: MySQL asynchronous replication]
MySQL semi-synchronous replication
To address the shortcomings of asynchronous replication, MySQL introduced semi-synchronous replication in version 5.5. It is an improvement over asynchronous replication: before the master's transaction can commit, at least one slave must have received the events into its relay log and acknowledged them to the master. However, the slave still applies the relay log asynchronously. The principle is shown in the figure below.
[Figure 2: MySQL semi-synchronous replication]
MySQL Group Replication (MGR)
Because neither asynchronous nor semi-synchronous replication can guarantee data consistency, MySQL officially released Group Replication (MGR) in version 5.7.17.

A replication group consists of several nodes, and a transaction can only commit after a majority of the group (N / 2 + 1 nodes) has agreed to it. In the figure below, the group has 3 nodes and the Consensus layer is the consensus protocol layer: during commit, group communication takes place, and the transaction is only committed and acknowledged once 2 nodes have certified it.
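
As a worked example of the N / 2 + 1 majority rule: a 3-node group needs 2 nodes to agree and tolerates 1 failure, a 5-node group needs 3 and tolerates 2, a 7-node group needs 4 and tolerates 3, and the maximum 9-node group needs 5 and tolerates 4.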

Group replication was introduced mainly to solve the data inconsistency that asynchronous and semi-synchronous replication can produce. It relies on a distributed consensus protocol (a Paxos variant) to achieve consistency of data across the group and provides a genuine high-availability solution. A replication group is made up of several nodes (database instances), each maintaining its own copy of the data (shared-nothing); atomic and totally ordered message delivery through the consensus protocol keeps the instances' data consistent. The principle is shown in the figure below.
[Figure 3: MySQL Group Replication (MGR)]

II. Building an MGR Cluster (see 18.2 Getting Started)

1. Environment preparation

IP            Hostname  MySQL version  Port  server-id  OS                   Role
192.168.50.5  server-1  mysql-8.0.17   3306  5          Ubuntu 16.04 Server  master (primary)
192.168.50.6  server-2  mysql-8.0.17   3306  6          Ubuntu 16.04 Server  slave (secondary)
192.168.50.7  server-3  mysql-8.0.17   3306  7          Ubuntu 16.04 Server  slave (secondary)

Add the following entries to /etc/hosts on all three servers:

192.168.50.5 server-1
192.168.50.6 server-2
192.168.50.7 server-3

2. Install MySQL

For the installation procedure, see the earlier article "Setting up a Linux project environment (4): Installing MySQL".

3. Install the MGR plugin

Install it on all three servers.

# log in to mysql
mysql -uroot -p

# install the plugin
INSTALL PLUGIN group_replication SONAME 'group_replication.so';

# list plugins and check for the group replication plugin
show plugins;

After installation, the plugin appears as shown in the figure below.
[Figure 4: show plugins output listing the group_replication plugin]
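
As an alternative to scanning the full show plugins output, a filtered query against information_schema (an addition here, not part of the original steps) shows just the MGR plugin and its status:

SELECT PLUGIN_NAME, PLUGIN_STATUS, PLUGIN_VERSION
FROM information_schema.PLUGINS
WHERE PLUGIN_NAME = 'group_replication';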

4. Configure the replication environment (see the official documentation, 18.2.1.2 Configuring an Instance for Group Replication)

4.1 Configure 192.168.50.5 (server-1)

Open /etc/mysql/my.cnf and add the following configuration:

[mysqld]
# Group Replication
server-id=5
slow_query_log=1
log_queries_not_using_indexes=1
slow_query_log_file=/var/log/mysql/slow-query.log
log-bin=/var/log/mysql/mgr-bin
relay-log=/var/log/mysql/mgr-relay
character_set_server=utf8mb4

# MGR uses optimistic locking, so the official documentation recommends the READ COMMITTED isolation level to reduce lock granularity
transaction_isolation=READ-COMMITTED
gtid_mode=on
enforce_gtid_consistency=1 # enforce GTID consistency
binlog_format=row

# During failure recovery the members cross-check binlogs, so each server must log the transactions received from other members; GTIDs identify which transactions have already been executed
log-slave-updates=1

# binlog checksum rule: CRC32 by default since 5.6, NONE on older versions; MGR requires NONE
binlog_checksum=NONE

# For safety, MGR requires the replication metadata to be stored in tables, otherwise it reports an error
master_info_repository=TABLE
relay_log_info_repository=TABLE

# Algorithm used to hash transaction write sets; the official documentation recommends XXHASH64
transaction_write_set_extraction = XXHASH64

# Load the group_replication plugin
plugin_load_add='group_replication.so'

# Name of this group; must be a UUID, which can be generated with SELECT UUID()
group_replication_group_name = '558edd3c-02ec-11ea-9bb3-080027e39bd2'

# Whether group replication starts automatically with the server; OFF is recommended so that failure recovery cannot disturb data consistency in unexpected cases
group_replication_start_on_boot = OFF

# Local MGR address and port (host:port); this is the MGR communication port, not the database port
group_replication_local_address = 'server-1:33061'

# Seed members of this MGR group (host:port); again the MGR port, not the database port
group_replication_group_seeds = 'server-1:33061,server-2:33061,server-3:33061'

# Bootstrap mode, used only when creating or rebuilding the group; it is enabled on exactly one member of the group
group_replication_bootstrap_group = OFF

Restart MySQL with service mysql restart.
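
Optionally, after the restart you can confirm that the key settings from my.cnf took effect (a quick sanity check, not part of the original walkthrough):

SELECT @@server_id, @@gtid_mode, @@enforce_gtid_consistency,
       @@binlog_format, @@binlog_checksum, @@transaction_isolation;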

Create the replication account and start Group Replication:

# log in to mysql
mysql -uroot -p

# temporarily disable binary logging
set SQL_LOG_BIN=0;

# create the replication user (accessible from the 192.168.50 subnet)
CREATE USER 'mgr_repl'@'192.168.50.%' IDENTIFIED WITH sha256_password BY '123456';

# grant replication privileges
GRANT REPLICATION SLAVE ON *.* TO 'mgr_repl'@'192.168.50.%';

# reload privileges
flush privileges;

# re-enable binary logging
set SQL_LOG_BIN=1;

# set the credentials for the group replication recovery channel
change master to master_user='mgr_repl',master_password='123456' for channel 'group_replication_recovery';

# group_replication_bootstrap_group=ON marks this server as the one the group is bootstrapped from; servers that join later do not need to set it
set global group_replication_bootstrap_group=ON;

# start MGR as the first node of the group
start group_replication;

# turn group_replication_bootstrap_group back off
set global group_replication_bootstrap_group=OFF;

# check the MGR status by querying performance_schema.replication_group_members
select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bd2 | server-1    |        3306 | ONLINE       | PRIMARY     | 8.0.17         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+

# view group-related variables
show variables like 'group%';
4.2 Configure 192.168.50.6 (server-2)

Copy server-1's configuration file /etc/mysql/my.cnf and change only server-id and group_replication_local_address, as follows:

[mysqld]
# Group Replication
server-id=6
slow_query_log=1
log_queries_not_using_indexes=1
slow_query_log_file=/var/log/mysql/slow-query.log
log-bin=/var/log/mysql/mgr-bin
relay-log=/var/log/mysql/mgr-relay
character_set_server=utf8mb4

# MGR uses optimistic locking, so the official documentation recommends the READ COMMITTED isolation level to reduce lock granularity
transaction_isolation=READ-COMMITTED
gtid_mode=on
enforce_gtid_consistency=1 # enforce GTID consistency
binlog_format=row

# During failure recovery the members cross-check binlogs, so each server must log the transactions received from other members; GTIDs identify which transactions have already been executed
log-slave-updates=1

# binlog checksum rule: CRC32 by default since 5.6, NONE on older versions; MGR requires NONE
binlog_checksum=NONE

# For safety, MGR requires the replication metadata to be stored in tables, otherwise it reports an error
master_info_repository=TABLE
relay_log_info_repository=TABLE

# Algorithm used to hash transaction write sets; the official documentation recommends XXHASH64
transaction_write_set_extraction = XXHASH64

# Load the group_replication plugin at startup
plugin_load_add='group_replication.so'

# Name of this group; must be a UUID, which can be generated with SELECT UUID()
group_replication_group_name = '558edd3c-02ec-11ea-9bb3-080027e39bd2'

# Whether group replication starts automatically with the server; OFF is recommended so that failure recovery cannot disturb data consistency in unexpected cases
group_replication_start_on_boot = OFF

# Local MGR address and port (host:port); this is the MGR communication port, not the database port
group_replication_local_address = 'server-2:33061'

# Seed members of this MGR group (host:port); again the MGR port, not the database port
group_replication_group_seeds = 'server-1:33061,server-2:33061,server-3:33061'

# Bootstrap mode, used only when creating or rebuilding the group; it is enabled on exactly one member of the group
group_replication_bootstrap_group = OFF

Restart MySQL with service mysql restart.

Create the replication account and start Group Replication:

# log in to mysql
mysql -uroot -p

# temporarily disable binary logging
set SQL_LOG_BIN=0;

# create the replication user (accessible from the 192.168.50 subnet)
CREATE USER 'mgr_repl'@'192.168.50.%' IDENTIFIED WITH sha256_password BY '123456';

# grant replication privileges
GRANT REPLICATION SLAVE ON *.* TO 'mgr_repl'@'192.168.50.%';

# reload privileges
flush privileges;

# re-enable binary logging
set SQL_LOG_BIN=1;

# set the credentials for the group replication recovery channel
change master to master_user='mgr_repl',master_password='123456' for channel 'group_replication_recovery';

# join the MGR group
start group_replication;

# check the MGR status via performance_schema.replication_group_members: the node has joined the group
select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bc3 | server-2    |        3306 | RECOVERING   | SECONDARY   | 8.0.17         |
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bd2 | server-1    |        3306 | ONLINE       | PRIMARY     | 8.0.17         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+

If the node stays in the RECOVERING state for a long time, see Section 6.1 below for the fix.
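
To see why recovery is stuck without digging through the error log, the recovery channel can also be queried directly (a diagnostic addition here; the columns come from performance_schema.replication_connection_status):

SELECT CHANNEL_NAME, SERVICE_STATE, LAST_ERROR_NUMBER, LAST_ERROR_MESSAGE, LAST_ERROR_TIMESTAMP
FROM performance_schema.replication_connection_status
WHERE CHANNEL_NAME = 'group_replication_recovery';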

4.3 Configure 192.168.50.7 (server-3)

Copy server-1's configuration file /etc/mysql/my.cnf and change only server-id and group_replication_local_address, as follows:

[mysqld]
# Group Replication
server-id=7
slow_query_log=1
log_queries_not_using_indexes=1
slow_query_log_file=/var/log/mysql/slow-query.log
log-bin=/var/log/mysql/mgr-bin
relay-log=/var/log/mysql/mgr-relay
character_set_server=utf8mb4

# MGR uses optimistic locking, so the official documentation recommends the READ COMMITTED isolation level to reduce lock granularity
transaction_isolation=READ-COMMITTED
gtid_mode=on
enforce_gtid_consistency=1 # enforce GTID consistency
binlog_format=row

# During failure recovery the members cross-check binlogs, so each server must log the transactions received from other members; GTIDs identify which transactions have already been executed
log-slave-updates=1

# binlog checksum rule: CRC32 by default since 5.6, NONE on older versions; MGR requires NONE
binlog_checksum=NONE

# For safety, MGR requires the replication metadata to be stored in tables, otherwise it reports an error
master_info_repository=TABLE
relay_log_info_repository=TABLE

# Algorithm used to hash transaction write sets; the official documentation recommends XXHASH64
transaction_write_set_extraction = XXHASH64

# Load the group_replication plugin at startup
plugin_load_add='group_replication.so'

# Name of this group; must be a UUID, which can be generated with SELECT UUID()
group_replication_group_name = '558edd3c-02ec-11ea-9bb3-080027e39bd2'

# Whether group replication starts automatically with the server; OFF is recommended so that failure recovery cannot disturb data consistency in unexpected cases
group_replication_start_on_boot = OFF

# Local MGR address and port (host:port); this is the MGR communication port, not the database port
group_replication_local_address = 'server-3:33061'

# Seed members of this MGR group (host:port); again the MGR port, not the database port
group_replication_group_seeds = 'server-1:33061,server-2:33061,server-3:33061'

# Bootstrap mode, used only when creating or rebuilding the group; it is enabled on exactly one member of the group
group_replication_bootstrap_group = OFF

Restart MySQL with service mysql restart.

Create the replication account and start Group Replication:

# log in to mysql
mysql -uroot -p

# temporarily disable binary logging
set SQL_LOG_BIN=0;

# create the replication user (accessible from the 192.168.50 subnet)
CREATE USER 'mgr_repl'@'192.168.50.%' IDENTIFIED WITH sha256_password BY '123456';

# grant replication privileges
GRANT REPLICATION SLAVE ON *.* TO 'mgr_repl'@'192.168.50.%';

# reload privileges
flush privileges;

# re-enable binary logging
set SQL_LOG_BIN=1;

# set the credentials for the group replication recovery channel
change master to master_user='mgr_repl',master_password='123456' for channel 'group_replication_recovery';

# join the MGR group
start group_replication;

# check the MGR status via performance_schema.replication_group_members: the node has joined the group
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bc3 | server-2    |        3306 | ONLINE       | SECONDARY   | 8.0.17         |
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bd2 | server-1    |        3306 | ONLINE       | PRIMARY     | 8.0.17         |
| group_replication_applier | c90e9166-de11-11e9-9ee0-080027d8da86 | server-3    |        3306 | ONLINE       | SECONDARY   | 8.0.17         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+

5. Testing

Run the following SQL on server-1 (192.168.50.5):

# test on server-1
create database repl;
use repl;
create table test (id int primary key, name varchar(20));  # note: a primary key is required
insert into test values (1,'测试');

# view the binlog events
SHOW BINLOG EVENTS;
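
Before checking the results on each server, an extra check (an addition here) is to compare the executed GTID set on all three nodes; the values should converge to the same set named after the group UUID:

# run on each of server-1, server-2 and server-3
SELECT @@global.gtid_executed;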

Results on server-1:

mysql> select * from test;
+----+--------+
| id | name   |
+----+--------+
|  1 | 测试   |
+----+--------+

Results on server-2:

# the repl database has been replicated
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mybatis            |
| mysql              |
| performance_schema |
| repl               |
| sys                |
+--------------------+
mysql> use repl;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

-- the data has been replicated as well
mysql> select * from test;
+----+--------+
| id | name   |
+----+--------+
|  1 | 测试   |
+----+--------+

Results on server-3:

# the repl database has been replicated
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mybatis            |
| mysql              |
| performance_schema |
| repl               |
| sys                |
+--------------------+
mysql> use repl;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

-- the data has been replicated as well
mysql> select * from test;
+----+--------+
| id | name   |
+----+--------+
|  1 | 测试   |
+----+--------+
Because this is single-primary mode, server-1 is the primary (PRIMARY) and supports reads and writes, while server-2 and server-3 are secondaries (SECONDARY) and only support reads; writes against them fail, as shown below.

Running insert into test values (2,'写入测试'); on server-2 or server-3 produces the following error:

mysql> insert into test values (2,'写入测试');
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

Running it on server-1 succeeds:

mysql> insert into test values (2,'写入测试');
Query OK, 1 row affected (0.11 sec)
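
The read-only state of a secondary can also be confirmed directly; for example, on server-2 or server-3 (an extra check, not part of the original test) both values should be 1:

SELECT @@global.read_only, @@global.super_read_only;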

6. Common errors

6.1 Fixing a node stuck in the RECOVERING state

Checking the error log reveals a password authentication plugin problem: Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection. In other words, the replication user's password is using MySQL 8's default authentication plugin.

2019-11-09T14:59:18.932184Z 28 [ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'
2019-11-09T15:00:19.061955Z 28 [System] [MY-010597] [Repl] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='server-1', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='server-1', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2019-11-09T15:00:19.144499Z 37 [Warning] [MY-010897] [Repl] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2019-11-09T15:00:19.154364Z 37 [ERROR] [MY-010584] [Repl] Slave I/O for channel 'group_replication_recovery': error connecting to master 'mgr_repl@server-1:3306' - retry-time: 60 retries: 1 message: Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection. Error_code: MY-002061
2019-11-09T15:00:19.196600Z 28 [ERROR] [MY-011582] [Repl] Plugin group_replication reported: 'There was an error when connecting to the donor server. Please check that group_replication_recovery channel credentials and all MEMBER_HOST column values of performance_schema.replication_group_members table are correct and DNS resolvable.'
2019-11-09T15:00:19.196632Z 28 [ERROR] [MY-011583] [Repl] Plugin group_replication reported: 'For details please check performance_schema.replication_connection_status table and error log messages of Slave I/O for channel group_replication_recovery.'

The fix is to change the authentication plugin of the replication user:

SET SQL_LOG_BIN=0;
ALTER USER 'mgr_repl'@'192.168.50.%' IDENTIFIED WITH sha256_password BY '123456';
GRANT REPLICATION SLAVE ON *.* TO 'mgr_repl'@'192.168.50.%';
SET SQL_LOG_BIN=1;

After starting Group Replication again, the node comes online normally:

select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bc3 | server-2    |        3306 | ONLINE       | SECONDARY   | 8.0.17         |
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bd2 | server-1    |        3306 | ONLINE       | PRIMARY     | 8.0.17         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
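
An alternative fix, instead of switching the account away from caching_sha2_password, is to let the recovery channel fetch the donor's RSA public key (or use TLS). This is a sketch of that option, not what the article above does:

# run on the joining member before START GROUP_REPLICATION
SET GLOBAL group_replication_recovery_get_public_key = ON;
# or, alternatively, encrypt the recovery connection
# SET GLOBAL group_replication_recovery_use_ssl = ON;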

7. Administration and maintenance

7.1 Switching from single-primary to multi-primary
# Switching the MGR mode requires restarting group replication: first stop group replication on all nodes,
# set group_replication_single_primary_mode=OFF and the related parameters, then start group replication again.
# Stop group replication (run on all MGR nodes):
stop group_replication;

# turn off single-primary mode
set global group_replication_single_primary_mode=OFF;

# In single-primary mode this strict check can stay off, because concurrent writes on several primaries cannot happen; in multi-primary mode it must be ON, otherwise data may become inconsistent
set global group_replication_enforce_update_everywhere_checks=ON;

# run this on any one MGR node (here server-1 is chosen):
set global group_replication_recovery_get_public_key=1;
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

# then run this on the other MGR nodes (here server-2 and server-3):
set global group_replication_recovery_get_public_key=1;
START GROUP_REPLICATION;

# view the MGR group membership (can be queried on any node)
SELECT * FROM performance_schema.replication_group_members;
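
Note that on MySQL 8.0.13 and later (which includes the 8.0.17 used here), the group can also be switched to multi-primary online with a built-in function, without stopping group replication; a minimal sketch:

# run on any member while the group is running
SELECT group_replication_switch_to_multi_primary_mode();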
7.2 Switching from multi-primary back to single-primary
# Stop group replication (run on all MGR nodes):
stop group_replication;

# Back in single-primary mode, concurrent writes on several primaries are no longer possible, so this strict check can be turned off; it is only required in multi-primary mode
set global group_replication_enforce_update_everywhere_checks=OFF;

# turn single-primary mode back on
set global group_replication_single_primary_mode=ON;

# pick one node as the primary and run this on it (here server-1 is chosen as the primary)
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;
 
# on the remaining nodes, i.e. the secondaries (here server-2 and server-3), run:
START GROUP_REPLICATION;

# view the MGR group membership (can be queried on any node)
SELECT * FROM performance_schema.replication_group_members;
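
Likewise, on 8.0.13+ the group can be switched back to single-primary online; the optional argument is the server_uuid of the member that should become the primary (here server-1's MEMBER_ID from the earlier output):

SELECT group_replication_switch_to_single_primary_mode('4c668095-dd20-11e9-95ba-080027e39bd2');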
7.3 Commonly used commands and queries
# check the executed GTID set and confirm it uses the group UUID configured earlier (558edd3c-02ec-11ea-9bb3-080027e39bd2)
mysql> show master status;
+----------------+----------+--------------+------------------+------------------------------------------+
| File           | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                        |
+----------------+----------+--------------+------------------+------------------------------------------+
| mgr-bin.000004 |     2208 |              |                  | 558edd3c-02ec-11ea-9bb3-080027e39bd2:1-9 |
+----------------+----------+--------------+------------------+------------------------------------------+

# list all member nodes of the group
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bc3 | server-2    |        3306 | ONLINE       | SECONDARY   | 8.0.17         |
| group_replication_applier | 4c668095-dd20-11e9-95ba-080027e39bd2 | server-1    |        3306 | ONLINE       | PRIMARY     | 8.0.17         |
| group_replication_applier | c90e9166-de11-11e9-9ee0-080027d8da86 | server-3    |        3306 | ONLINE       | SECONDARY   | 8.0.17         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+

# check the synchronization and current replication status of each member in the group
mysql> select * from performance_schema.replication_group_member_stats\G;
*************************** 1. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15733926047947018:3
                                 MEMBER_ID: 4c668095-dd20-11e9-95ba-080027e39bc3
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 4
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 1
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: 558edd3c-02ec-11ea-9bb3-080027e39bd2:1-9
            LAST_CONFLICT_FREE_TRANSACTION: 558edd3c-02ec-11ea-9bb3-080027e39bd2:9
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 5
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 2. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15733926047947018:3
                                 MEMBER_ID: 4c668095-dd20-11e9-95ba-080027e39bd2
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 4
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 1
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: 558edd3c-02ec-11ea-9bb3-080027e39bd2:1-9
            LAST_CONFLICT_FREE_TRANSACTION: 558edd3c-02ec-11ea-9bb3-080027e39bd2:9
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 2
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 4
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
*************************** 3. row ***************************
                              CHANNEL_NAME: group_replication_applier
                                   VIEW_ID: 15733926047947018:3
                                 MEMBER_ID: c90e9166-de11-11e9-9ee0-080027d8da86
               COUNT_TRANSACTIONS_IN_QUEUE: 0
                COUNT_TRANSACTIONS_CHECKED: 4
                  COUNT_CONFLICTS_DETECTED: 0
        COUNT_TRANSACTIONS_ROWS_VALIDATING: 1
        TRANSACTIONS_COMMITTED_ALL_MEMBERS: 558edd3c-02ec-11ea-9bb3-080027e39bd2:1-9
            LAST_CONFLICT_FREE_TRANSACTION: 558edd3c-02ec-11ea-9bb3-080027e39bd2:9
COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
         COUNT_TRANSACTIONS_REMOTE_APPLIED: 4
         COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
         COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0

# status of each replication channel on the current server
mysql> select * from performance_schema.replication_connection_status\G;
*************************** 1. row ***************************
                                      CHANNEL_NAME: group_replication_applier
                                        GROUP_NAME: 558edd3c-02ec-11ea-9bb3-080027e39bd2
                                       SOURCE_UUID: 558edd3c-02ec-11ea-9bb3-080027e39bd2
                                         THREAD_ID: NULL
                                     SERVICE_STATE: ON
                         COUNT_RECEIVED_HEARTBEATS: 0
                          LAST_HEARTBEAT_TIMESTAMP: 0000-00-00 00:00:00.000000
                          RECEIVED_TRANSACTION_SET: 558edd3c-02ec-11ea-9bb3-080027e39bd2:1-9
                                 LAST_ERROR_NUMBER: 0
                                LAST_ERROR_MESSAGE: 
                              LAST_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
                           LAST_QUEUED_TRANSACTION: 558edd3c-02ec-11ea-9bb3-080027e39bd2:5
 LAST_QUEUED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
LAST_QUEUED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
     LAST_QUEUED_TRANSACTION_START_QUEUE_TIMESTAMP: 2019-11-10 21:35:36.917108
       LAST_QUEUED_TRANSACTION_END_QUEUE_TIMESTAMP: 2019-11-10 21:35:36.917146
                              QUEUEING_TRANSACTION: 
    QUEUEING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
   QUEUEING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
        QUEUEING_TRANSACTION_START_QUEUE_TIMESTAMP: 0000-00-00 00:00:00.000000

# whether each channel on the current server is enabled; ON means enabled
mysql> select * from performance_schema.replication_applier_status;
+---------------------------+---------------+-----------------+----------------------------+
| CHANNEL_NAME              | SERVICE_STATE | REMAINING_DELAY | COUNT_TRANSACTIONS_RETRIES |
+---------------------------+---------------+-----------------+----------------------------+
| group_replication_applier | ON            |            NULL |                          0 |
+---------------------------+---------------+-----------------+----------------------------+
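
One more query worth keeping at hand (an addition to the original list): finding out which member currently holds the PRIMARY role.

SELECT MEMBER_HOST, MEMBER_PORT
FROM performance_schema.replication_group_members
WHERE MEMBER_ROLE = 'PRIMARY';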

8. Notes on failure handling

# In single-primary mode, after a failed node has been repaired, group replication must be re-activated on it manually.
# The correct way to re-join a recovered node to the MGR group is:
STOP GROUP_REPLICATION;
START GROUP_REPLICATION;

# If a node goes down, the other nodes keep replicating. Once the failed node is recovered, simply start group replication on it manually; it then rejoins the MGR group and synchronizes from the other members automatically.
START GROUP_REPLICATION;

# If I/O replication fails and you have confirmed the data itself is intact:
# check the GTID state on the primary
mysql> show global variables like '%gtid%';
+----------------------------------------------+------------------------------------------+
| Variable_name                                | Value                                    |
+----------------------------------------------+------------------------------------------+
| binlog_gtid_simple_recovery                  | ON                                       |
| enforce_gtid_consistency                     | ON                                       |
| group_replication_gtid_assignment_block_size | 1000000                                  |
| gtid_executed                                | 558edd3c-02ec-11ea-9bb3-080027e39bd2:1-9 |
| gtid_executed_compression_period             | 1000                                     |
| gtid_mode                                    | ON                                       |
| gtid_owned                                   |                                          |
| gtid_purged                                  |                                          |
| session_track_gtids                          | OFF                                      |
+----------------------------------------------+------------------------------------------+

# run on the faulty secondary
stop GROUP_REPLICATION;
reset master;
set global gtid_purged='58f6e65e-9309-11e9-9d88-525400184a0a:1-946055:1000003';
START GROUP_REPLICATION;

# add subnets to the IP whitelist; important: stop Group Replication first
stop group_replication;
set global group_replication_ip_whitelist="127.0.0.1/32,172.16.60.0/24,172.16.50.0/24,172.16.51.0/24";
start group_replication;
show variables like "group_replication_ip_whitelist";
