低并发情况下(QPS<7000),靠谱的高可用架构。
更推荐MGR
PXC下载路径
注,rpm包有以下分类:
ssl100适用于:早于9的Debian,早于14.04的Ubuntu
ssl101适用于CentOS 6/7
ssl102适用于9以后的Debian,以及14.04以后的Ubuntu
基本规划:
Node1:192.168.47.30:3311
Node2:192.168.47.31:3311
Node3:192.168.47.32:3311
basedir=/usr/local/mysql3311
datadir=/data/mysql3311
默认端口(按需开放)
3306:数据库端口
4444:全量传输端口。SST - State Transfer from donor to joining node
4567:group communication among nodes
4568:IST - Increment State Transfer from donor to joining node
9200:check status of database node from Load Balancer(健康检查脚本所需)
3节点分别如下操作:
关闭SELinux(如果采用rpm包安装)
安装依赖包(perl-IO-Socket-SSL perl-DBD-MySQL perl-Time-HiRes socat nc)、XtraBackup(略)
安装PXC:
# wget -c https://www.percona.com/downloads/Percona-XtraDB-Cluster-LATEST/Percona-XtraDB-Cluster-5.7.25-31.35/binary/tarball/Percona-XtraDB-Cluster-5.7.25-rel28-31.35.1.Linux.x86_64.ssl101.tar.gz
# tar zxvf Percona-XtraDB-Cluster-5.7.25-rel28-31.35.1.Linux.x86_64.ssl101.tar.gz -C /usr/local/
# ldd ./bin/mysqld 确认是否有尚未安装的依赖包
# ln -s Percona-XtraDB-Cluster-5.7.25-rel28-31.35.1.Linux.x86_64.ssl101 mysql3311
# mkdir -p /data/mysql3311
my.cnf文件中添加以下必要信息:
[mysqld]
#PXC
innodb_flush_log_at_trx_commit=2
sync_binlog=10000
server-id=237103306
log_bin=binlog
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
wsrep_on=on
wsrep_provider_options="gcache.size=4G;"
wsrep_provider=/usr/local/mysql3311/lib/libgalera_smm.so
wsrep_cluster_address=gcomm://192.168.237.9,192.168.237.10,192.168.237.11
wsrep_cluster_name=pxc-cluster
wsrep_slave_threads= 16
wsrep_node_name=pxc-cluster-node-1
wsrep_node_address=192.168.237.10
pxc_strict_mode=ENFORCING
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth="sstuser:s3cretPass"
初始化(仅第一节点需要,其他节点自动通过xtrabackup同步):
# /usr/local/mysql3311/bin/mysqld --defaults-file=/usr/local/mysql3311/my.cnf --initialize-insecure
修改权限:
# chown -R root:mysql /usr/local/mysql3311
# chown -R mysql:mysql /data/mysql3311
启动时需带参数--wsrep-new-cluster,仅第一个节点需要:
第一节点:# /usr/local/mysql3311/bin/mysqld --defaults-file=/usr/local/mysql3311/my.cnf --wsrep-new-cluster &
配置SST账号权限,仅第一节点需要:
SST账号权限:PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT
mysql> CREATE USER 'sstuser'@'localhost' IDENTIFIED BY 'xtrabackuppass';
Query OK, 0 rows affected (0.02 sec)
mysql> GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost';
Query OK, 0 rows affected (0.04 sec)
(1)SST方式:
直接拉起全新数据节点,执行节点间全量传输任务
# /usr/local/mysql3311/bin/mysqld --defaults-file=/usr/local/mysql3311/my.cnf &
(2)IST方式:
增量传输方式,前提是新加入节点已有部分集群数据,并且合理配置grastate.dat,正常启动,seqno置为-1
[root@pxc2 mysqldata]# cat grastate.dat
# GALERA saved state
version: 2.1
uuid: 37e6cf22-a96e-11e5-910e-ceb510e80d98
seqno: -1
cert_index:
(1)新节点恢复备份数据要点:
远程传送命令:
innobackupex --defaults-file=/etc/my.cnf --slave-info --galera-info --user=xtrabackup --password=pass123 --socket=/tmp/mysql.sock --no-timestamp --stream=tar ./ | ssh [email protected] "cat - > /archive/backup/backup_`date +%Y%m%d`.tar"
apply-log和copy-back之后,读取数据目录下的 xtrabackup_galera_info,假设该文件内容如下:
37e6cf22-a96e-11e5-910e-ceb510e80d98:63257332# 把UUID和事务号(seqno)63257332复制下来,写到grastate.dat文件中
GALERA saved state
version: 2.1
uuid: 37e6cf22-a96e-11e5-910e-ceb510e80d98
seqno: 63257332
cert_index:
(2)wsrep_local_cached_downto
如果集群中已有节点的wsrep_local_cached_downto值均大于备份集的seqno,那么即使利用备份恢复了部分数据,由于其他节点的gcache过小或者数据库写入量太大,导致gcache无法完整包含备份缺失数据,依然会采用SST的方式。
解决方法:以非集群方式启动(mysqld --user=mysql --wsrep-provider='none'),
cat xtrabackup_binlog_info,获取GTID信息
reset master;
set global gtid_purged='XXXXXX:1-XXX';
change master 到其中一个节点,数据追上后,
stop slave;
show slave status \G
Relay_Master_Log_File: mysql-bin.000251
Exec_Master_Log_Pos: 513201890
mysqlbinlog mysql-bin.000251 |grep -i xid|grep 513201890
mysqlbinlog: [Warning] unknown variable 'loose_default-characterset=utf8'
#180320 17:16:35 server id 3306102 end_log_pos 513201890 CRC32 0xcc618593 Xid = 52684354
把主库上找到的xid号写入galera配置文件的seqno
再以集群方式启动。
先主从再集群的方法需要:集群中所有节点server-id唯一、log_slave_updates=ON
(3)启动排错方法:
如果第二节点拉不起来,需要排查第一节点的error log和innobackup.log,及第二节点上的error log
核心原理图
global trx ID这一把全局锁是导致PXC支撑的并发不高的根本原因。
1、wsrep_sync_wait
该参数设定了一致性的强度,5.6.20-25.7引入。
0: Do not run causality checks for any statements. This is the default.
1: Perform checks for READ statements (including SELECT, SHOW, and BEGIN or START TRANSACTION).
2: Perform checks for UPDATE and DELETE statements.
3: Perform checks for READ, UPDATE, and DELETE statements.
4: Perform checks for INSERT and REPLACE statements.
5: Perform checks for READ, INSERT, and REPLACE statements.
6: Perform checks for UPDATE, DELETE, INSERT, and REPLACE statements.
7: Perform checks for READ, UPDATE, DELETE, INSERT, and REPLACE statements.
集群中最后关闭的节点,在下次启动时,应该第一个启动。
那么如何判断哪个是最后关闭的节点呢?
在数据目录下,例如 /var/lib/mysql/grastate.bat,
如果safe_to_bootstrap: 1,表示该节点是最后一个关闭的节点。
也有可能所有节点该值都为0,此时任意启动一个节点都会报:
2019-12-11T09:53:49.015127Z 0 [ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster and may not contain all the updates. To force cluster bootstrap with this node, edit the grastate.dat file manually and set safe_to_bootstrap to 1 .
2019-12-11T09:53:49.015131Z 0 [ERROR] WSREP: Provider/Node (gcomm://192.168.237.9,192.168.237.10,192.168.237.11) failed to establish connection with cluster (reason: 7)
2019-12-11T09:53:49.015133Z 0 [ERROR] Aborting
2019-12-11T09:53:49.015138Z 0 [Note] Giving 0 client threads a chance to die gracefully
2019-12-11T09:53:49.015142Z 0 [Note] WSREP: Waiting for active wsrep applier to exit
2019-12-11T09:53:49.015146Z 0 [Note] WSREP: Service disconnected.
2019-12-11T09:53:49.015148Z 0 [Note] WSREP: Waiting to close threads......
2019-12-11T09:53:54.015527Z 0 [Note] WSREP: Some threads may fail to exit.
2019-12-11T09:53:54.015646Z 0 [Note] Binlog end
2019-12-11T09:53:54.015930Z 0 [Note] /usr/sbin/mysqld: Shutdown complete
此时,需要编辑 /var/lib/mysql/grastate.bat ,选择一个节点将值设置为1,--wsrep-new-cluster 方式启动,其他节点照常启动即可。
选择方法如下:
带 --wsrep-recover 参数启动mysqld,输出如下
...
2016-11-18 01:42:15 36311 [Note] InnoDB: Database was not shutdown normally!
2016-11-18 01:42:15 36311 [Note] InnoDB: Starting crash recovery.
...
2016-11-18 01:42:16 36311 [Note] WSREP: Recovered position: 37bb872a-ad73-11e6-819f-f3b71d9c5ada:345628
...
2016-11-18 01:42:17 36311 [Note] /home/philips/git/mysql-wsrep-bugs-5.6/sql/mysqld: Shutdown complete
Recovered position:UUID后的数字将作为该节点的最新LSN,选取拥有最大值的节点作为第一节点启动。
参考文档:
Percona XtraDB Cluster 5.7 Documentation: Multi-Master Replication
How to Bootstrap MySQL or MariaDB Galera Cluster - Updated
How to Avoid SST When Adding a New Node to MySQL Galera Cluster
Introducing the “Safe-To-Bootstrap” feature in Galera Cluster
沃趣科技-罗小波的知数堂内部资料
PXC严格模式的翻译