Lab environment:
Node 1: 192.168.2.11
Node 2: 192.168.2.12
Node 3: 192.168.2.13
Official yum repository: http://repo.percona.com/release/centos/latest/RPMS/x86_64/
Alternatively, install the official repo configuration package with:
yum install http://www.percona.com/downloads/percona-release/redhat/0.1-4/percona-release-0.1-4.noarch.rpm
After configuring the yum repository on all three nodes, install PXC on each of them:
yum install Percona-XtraDB-Cluster-57
Problem encountered: the yum installation failed with the following error:
Transaction check error:
file /etc/my.cnf conflicts between attempted installs of Percona-XtraDB-Cluster-server-57-5.7.21-29.26.1.el7.x86_64 and MariaDB-common-10.2.11-1.el7.centos.x86_64
file /usr/lib64/mysql/plugin/dialog.so conflicts between attempted installs of Percona-XtraDB-Cluster-server-57-5.7.21-29.26.1.el7.x86_64 and MariaDB-common-10.2.11-1.el7.centos.x86_64
Two files are in conflict, presumably leftovers from a previous MariaDB installation.
So I removed the files and ran yum again:
[root@node1 ~]#ls /etc/my.cnf
my.cnf my.cnf.d/
[root@node1 ~]#ls /etc/my.cnf.d/mysql-clients.cnf
/etc/my.cnf.d/mysql-clients.cnf
[root@node1 ~]#rm -rf /etc/my.cnf
[root@node1 ~]#rm -rf /etc/my.cnf.d/
But the same error appeared again:
Transaction check error:
file /etc/my.cnf conflicts between attempted installs of Percona-XtraDB-Cluster-server-57-5.7.21-29.26.1.el7.x86_64 and MariaDB-common-10.2.11-1.el7.centos.x86_64
file /usr/lib64/mysql/plugin/dialog.so conflicts between attempted installs of Percona-XtraDB-Cluster-server-57-5.7.21-29.26.1.el7.x86_64 and MariaDB-common-10.2.11-1.el7.centos.x86_64
Suspecting that some previously installed MariaDB-related package was causing the conflict, I ran yum remove "mariadb*",
and sure enough two related packages were present. (I also tried keywords such as MariaDB, but got no useful results.)
[root@node1 ~]#yum remove "mariadb*"
…… ……
Erasing : 2:postfix-2.10.1-6.el7.x86_64 1/2
Erasing : 1:mariadb-libs-5.5.56-2.el7.x86_64 2/2
Verifying : 1:mariadb-libs-5.5.56-2.el7.x86_64 1/2
Verifying : 2:postfix-2.10.1-6.el7.x86_64 2/2
Removed:
mariadb-libs.x86_64 1:5.5.56-2.el7
Dependency Removed:
postfix.x86_64 2:2.10.1-6.el7
Complete!
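Before retrying the installation, it does no harm to confirm that no MariaDB packages remain (an extra check, not part of the original transcript):
rpm -qa | grep -i mariadb    # should print nothing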
After that, installing Percona-XtraDB-Cluster-57 with yum succeeded.
In addition, before installing it is recommended to disable SELinux and the firewall, or at least open the following ports in the firewall: 3306, 4444, 4567 and 4568; otherwise the service may fail to start (see the sketch after this list). These four ports serve the following purposes in a PXC cluster:
3306  the port on which the database serves client connections
4444  State Snapshot Transfer (SST): full data synchronization, used when a new node joins the cluster
4567  group communication between cluster nodes
4568  Incremental State Transfer (IST): incremental synchronization, used when a node rejoins after going offline or restarting
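If you prefer to keep firewalld and SELinux running, a minimal sketch of opening these ports on each node might look like the following (the SELinux change shown is only temporary; to make it persistent, set SELINUX=permissive or disabled in /etc/selinux/config):
firewall-cmd --permanent --add-port=3306/tcp   # client connections
firewall-cmd --permanent --add-port=4444/tcp   # SST
firewall-cmd --permanent --add-port=4567/tcp   # group communication
firewall-cmd --permanent --add-port=4568/tcp   # IST
firewall-cmd --reload
setenforce 0                                   # SELinux permissive for the current boot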
After installation, start the mysqld service on each node. On first startup PXC generates a random default password for root and writes it to the log file /var/log/mysqld.log; you can retrieve it by grepping for the keyword "temporary password".
[root@node1 ~]#systemctl start mysqld
[root@node1 ~]#grep "temporary password" /var/log/mysqld.log
2018-05-27T00:29:52.534443Z 1 [Note] A temporary password is generated for root@localhost: rkMXb7doO>Pu
Log in to mysql with that password and change the root password with ALTER USER:
[root@node1 ~]#mysql -uroot -p'rkMXb7doO>Pu'
mysql> alter user 'root'@'localhost' identified by 'admin';
Query OK, 0 rows affected (0.01 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
mysql> quit
Before editing the configuration file, stop the service with systemctl stop mysqld, then add the following to /etc/my.cnf on node 1:
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so #path to the Galera library
wsrep_cluster_name=pxc-cluster #logical name of the cluster; must be the same on every node
wsrep_cluster_address=gcomm://192.168.2.11,192.168.2.12,192.168.2.13 #IPs of all nodes in the cluster
wsrep_node_name=pxc1 #logical name of this node
wsrep_node_address=192.168.2.11 #IP of this node
wsrep_sst_method=xtrabackup-v2 #method used for full state transfer (SST)
wsrep_sst_auth=sstuser:password #credentials used for SST (this account must be created manually)
pxc_strict_mode=ENFORCING #PXC strict mode, recommended
binlog_format=ROW #PXC only supports ROW binlog format
default_storage_engine=InnoDB #InnoDB is the storage engine best supported by PXC
innodb_autoinc_lock_mode=2 #PXC only supports interleaved (2) lock mode for inserts into tables with an auto_increment column
Nodes 2 and 3 use the same configuration file as node 1, except for two settings on each:
Node 2:
wsrep_node_name=pxc2
wsrep_node_address=192.168.2.12
Node 3:
wsrep_node_name=pxc3
wsrep_node_address=192.168.2.13
Unlike the usual way of starting the mysql service, node 1, as the node that initializes the PXC cluster, has to be started in a special way when bootstrapping the cluster:
systemctl start mysql@bootstrap.service
Unfortunately, starting the service on node 1 failed:
[root@node1 ~]#systemctl start mysql@bootstrap.service
Job for mysql@bootstrap.service failed because the control process exited with error code. See "systemctl status [email protected]" and "journalctl -xe" for details.
The error details are as follows:
[root@node1 ~]#systemctl status [email protected]
● mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap
Loaded: loaded (/usr/lib/systemd/system/mysql@.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Sun 2018-05-27 18:16:48 CST; 11s ago
Process: 4210 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 4168 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 4010 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
Process: 4009 ExecStart=/usr/bin/mysqld_safe --basedir=/usr ${EXTRA_ARGS} (code=exited, status=1/FAILURE)
Process: 3951 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 4009 (code=exited, status=1/FAILURE)
May 27 18:16:48 node1.test.com mysql-systemd[4210]: my_print_defaults: [ERROR] Fatal error in defaults handling. Program aborted!
May 27 18:16:48 node1.test.com mysql-systemd[4210]: my_print_defaults: [ERROR] Found option without preceding group in config file /etc/...ine 13!
May 27 18:16:48 node1.test.com mysql-systemd[4210]: my_print_defaults: [ERROR] Fatal error in defaults handling. Program aborted!
May 27 18:16:48 node1.test.com mysql-systemd[4210]: my_print_defaults: [ERROR] Found option without preceding group in config file /etc/...ine 13!
May 27 18:16:48 node1.test.com mysql-systemd[4210]: my_print_defaults: [ERROR] Fatal error in defaults handling. Program aborted!
May 27 18:16:48 node1.test.com mysql-systemd[4210]: WARNING: mysql pid file /var/lib/mysql/node1.test.com.pid empty or not readable
May 27 18:16:48 node1.test.com mysql-systemd[4210]: WARNING: mysql may be already dead
May 27 18:16:48 node1.test.com systemd[1]: Failed to start Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap.
May 27 18:16:48 node1.test.com systemd[1]: Unit mysql@bootstrap.service entered failed state.
May 27 18:16:48 node1.test.com systemd[1]: mysql@bootstrap.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
After consulting the documentation, it turned out that a [mysqld] group header has to be added as the first line of /etc/my.cnf. The file originally looked like this:
[root@node1 ~]#cat /etc/my.cnf
#
# The Percona XtraDB Cluster 5.7 configuration file.
#
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
# Please make any edits and changes to the appropriate sectional files
# included below.
#
!includedir /etc/my.cnf.d/
!includedir /etc/percona-xtradb-cluster.conf.d/
Indeed, there is no [mysqld] group. Examining the supplementary configuration files that /etc/my.cnf pulls in via !includedir (and which its comments refer to), the [mysqld]-related settings can be found there:
[root@node1 ~]#cat /etc/percona-xtradb-cluster.conf.d/mysqld.cnf
# Template my.cnf for PXC
# Edit to your requirements.
[client]
socket=/var/lib/mysql/mysql.sock
[mysqld]
server-id=1
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
log-bin
log_slave_updates
expire_logs_days=7
Now add [mysqld] as the first line of /etc/my.cnf and restart the service; this time it starts successfully. (Nodes 2 and 3 also need [mysqld] added as the first line of their /etc/my.cnf, just like node 1.)
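One way to prepend the group header on each node is a one-line sed edit (a sketch assuming GNU sed; any equivalent manual edit works just as well):
sed -i '1i [mysqld]' /etc/my.cnf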
[root@node1 ~]#grep "^[^#]" /etc/my.cnf
[mysqld]
!includedir /etc/my.cnf.d/
!includedir /etc/percona-xtradb-cluster.conf.d/
wsrep_provider=/usr/lib64/galera3/libgalera_smm.so
wsrep_cluster_name=pxc-cluster
wsrep_cluster_address=gcomm://192.168.2.11,192.168.2.12,192.168.2.13
wsrep_node_name=pxc1
wsrep_node_address=192.168.2.11
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sstuser:password
pxc_strict_mode=ENFORCING
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
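Before restarting, you can optionally confirm that the option file now parses cleanly, since the earlier failure was reported by my_print_defaults:
my_print_defaults mysqld    # should print the wsrep_* options instead of "Found option without preceding group"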
[root@node1 ~]#systemctl start mysql@bootstrap.service
[root@node1 ~]#systemctl status mysql@bootstrap.service
● mysql@bootstrap.service - Percona XtraDB Cluster with config /etc/sysconfig/mysql.bootstrap
Loaded: loaded (/usr/lib/systemd/system/mysql@.service; disabled; vendor preset: disabled)
Active: active (running) since Sun 2018-05-27 18:32:30 CST; 5s ago
Process: 4210 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 4168 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 4419 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=0/SUCCESS)
Process: 4378 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 4418 (mysqld_safe)
CGroup: /system.slice/system-mysql.slice/mysql@bootstrap.service
├─4418 /bin/sh /usr/bin/mysqld_safe --basedir=/usr --wsrep-new-cluster
└─4982 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsrep-provider=/...
The service is now listening on ports 3306 and 4567:
[root@node1 ~]#ss -ntl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:111 *:*
LISTEN 0 128 *:22 *:*
LISTEN 0 128 *:4567 *:*
LISTEN 0 128 :::111 :::*
LISTEN 0 128 :::22 :::*
LISTEN 0 80 :::3306 :::*
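Before joining the other nodes, it can be worth checking from node 2 or node 3 that node 1's group-communication port is reachable; a bash-only probe (no extra tools assumed) could be:
timeout 1 bash -c '</dev/tcp/192.168.2.11/4567' && echo "4567 reachable" || echo "4567 blocked"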
After bootstrapping mysql on node 1, create the account used for synchronizing data between the nodes.
mysql> GRANT RELOAD,LOCK TABLES,PROCESS,REPLICATION CLIENT ON *.* TO 'sstuser'@'localhost' IDENTIFIED BY 'password';
Query OK, 0 rows affected, 1 warning (0.03 sec)
mysql> FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.02 sec)
Note that this account only needs to be created on node 1, and its host is localhost.
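To double-check that the account matches the wsrep_sst_auth setting in my.cnf, you can list its grants (an extra verification, not part of the original transcript):
mysql -uroot -padmin -e "SHOW GRANTS FOR 'sstuser'@'localhost';"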
The current state of the PXC cluster can be inspected with show status like 'wsrep%':
mysql> show status like 'wsrep%';
+----------------------------------+--------------------------------------+
| Variable_name | Value |
+----------------------------------+--------------------------------------+
| wsrep_local_state_uuid | 1869fabe-6145-11e8-8589-9740e28b11bb |
| wsrep_protocol_version | 8 |
| wsrep_last_applied | 4 |
| wsrep_last_committed | 4 |
| wsrep_replicated | 2 |
| wsrep_replicated_bytes | 504 |
| wsrep_gcomm_uuid | 3d64d29d-6199-11e8-982a-8fce70c54ebc |
| wsrep_cluster_conf_id | 1 |
| wsrep_cluster_size | 1 |
| …… | …… |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| wsrep_local_bf_aborts | 0 |
| wsrep_local_index | 0 |
| wsrep_provider_name | Galera |
| wsrep_provider_vendor | Codership Oy |
| wsrep_provider_version | 3.26(rac090bc) |
| wsrep_ready | ON |
+----------------------------------+--------------------------------------+
68 rows in set (0.00 sec)
The value of wsrep_cluster_size is 1, showing that the cluster currently contains only one node. Keep in mind the special way node 1's service was started to initialize the cluster; the next step is to join the other nodes.
The remaining two nodes are started in the normal way, without node 1's bootstrap method:
systemctl start mysqld
Log in to mysql and check the PXC cluster status:
[root@node2 ~]#mysql -uroot -padmin
mysql> show status like 'wsrep%';
+----------------------------------+-------------------------------------------------------+
| Variable_name | Value |
+----------------------------------+-------------------------------------------------------+
| wsrep_local_state_uuid | 1869fabe-6145-11e8-8589-9740e28b11bb |
| wsrep_protocol_version | 8 |
| wsrep_last_applied | 4 |
| wsrep_last_committed | 4 |
| …… | …… |
| wsrep_local_state_comment | Synced |
| …… | …… |
| wsrep_incoming_addresses | 192.168.2.11:3306,192.168.2.12:3306,192.168.2.13:3306 |
| wsrep_desync_count | 0 |
| wsrep_evs_delayed | |
| wsrep_evs_evict_list | |
| wsrep_evs_repl_latency | 0/0/0/0/0 |
| wsrep_evs_state | OPERATIONAL |
| wsrep_gcomm_uuid | 73c206cd-61c4-11e8-a802-4642ddf2cd8c |
| wsrep_cluster_conf_id | 7 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_state_uuid | 1869fabe-6145-11e8-8589-9740e28b11bb |
| wsrep_cluster_status | Primary |
| wsrep_connected | ON |
| …… | …… |
| wsrep_provider_version | 3.26(rac090bc) |
| wsrep_ready | ON |
+----------------------------------+-------------------------------------------------------+
wsrep_cluster_size now shows 3 nodes in the cluster.
If the mysql service on node 2 or node 3 fails to start at this point, check the following (see the sketch after this list):
(1) errors in the configuration file
(2) the firewall does not open the required ports, e.g. the default 4567
(3) the SST account was never granted
(4) xtrabackup is not installed or is broken
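A few quick checks corresponding to the points above (sketches only; adjust passwords and paths to your environment):
grep "^\[mysqld\]" /etc/my.cnf                                     # (1) the option file must start with a [mysqld] group
firewall-cmd --list-ports                                          # (2) 3306/4444/4567/4568 should be open, or firewalld stopped
mysql -uroot -padmin -e "SHOW GRANTS FOR 'sstuser'@'localhost';"   # (3) run on node 1, the donor
rpm -qa | grep -i xtrabackup                                       # (4) the xtrabackup package must be present for xtrabackup-v2 SST
tail -n 50 /var/log/mysqld.log                                     # the error log usually points at the exact cause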
After being started normally, the two nodes join the PXC cluster automatically according to their configuration files. Once all nodes have joined, shut down node 1's mysql service (the bootstrap instance) and start it again in the normal way:
[root@node1 ~]#systemctl stop mysql@bootstrap
[root@node1 ~]#systemctl start mysql
[root@node1 ~]#systemctl status mysql
● mysql.service - Percona XtraDB Cluster
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2018-05-28 00:03:40 CST; 1min 31s ago
Process: 1873 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 1838 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=0/SUCCESS)
Process: 7155 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=0/SUCCESS)
Process: 7114 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 7154 (mysqld_safe)
CGroup: /system.slice/mysql.service
├─7154 /bin/sh /usr/bin/mysqld_safe --basedir=/usr
└─7701 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsrep-provider=/...
First, create a database on node 1 and write some data:
[root@node1 ~]#mysql -uroot -padmin
mysql> CREATE DATABASE percona;
Query OK, 1 row affected (0.02 sec)
mysql> USE percona;
Database changed
mysql> CREATE TABLE example (node_id INT PRIMARY KEY, node_name VARCHAR(30));
Query OK, 0 rows affected (0.07 sec)
mysql> INSERT INTO percona.example VALUES (1, 'percona1');
Query OK, 1 row affected (0.05 sec)
mysql> SELECT * FROM percona.example;
+---------+-----------+
| node_id | node_name |
+---------+-----------+
| 1 | percona1 |
+---------+-----------+
1 row in set (0.00 sec)
Verify on nodes 2 and 3 that the data has been replicated:
[root@node2 ~]#mysql -uroot -padmin
mysql> SELECT * FROM percona.example;
+---------+-----------+
| node_id | node_name |
+---------+-----------+
| 1 | percona1 |
+---------+-----------+
1 row in set (0.00 sec)
[root@node3 ~]#mysql -uroot -padmin
mysql> SELECT * FROM percona.example;
+---------+-----------+
| node_id | node_name |
+---------+-----------+
| 1 | percona1 |
+---------+-----------+
1 row in set (0.00 sec)
Then write data on node 2 and node 3 respectively.
Write data on node 2:
mysql> INSERT INTO percona.example VALUES (3, 'percona3');
Query OK, 1 row affected (0.02 sec)
Write data on node 3:
mysql> INSERT INTO percona.example VALUES (2, 'percona2');
Query OK, 1 row affected (0.05 sec)
Verify on node 1 that the data has been replicated:
mysql> SELECT * FROM percona.example;
+---------+-----------+
| node_id | node_name |
+---------+-----------+
| 1 | percona1 |
| 2 | percona2 |
| 3 | percona3 |
+---------+-----------+
3 rows in set (0.00 sec)
Every node in the PXC cluster can accept writes, and the changes are replicated synchronously to the other nodes.
Now stop the mysql service on node 1 and write data from another node:
[root@node1 ~]#systemctl stop mysql
[root@node1 ~]#systemctl status mysql
● mysql.service - Percona XtraDB Cluster
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Mon 2018-05-28 00:24:45 CST; 49s ago
Process: 34086 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 34051 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=0/SUCCESS)
Process: 30919 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=0/SUCCESS)
Process: 30918 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
Process: 30878 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 30918 (code=exited, status=0/SUCCESS)
Check the cluster status and write data on node 2:
mysql> show status like 'wsrep_cluster%';
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_conf_id | 10 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_state_uuid | 1869fabe-6145-11e8-8589-9740e28b11bb |
| wsrep_cluster_status | Primary |
+--------------------------+--------------------------------------+
4 rows in set (0.02 sec)
mysql> INSERT INTO percona.example VALUES (4, 'percona4');
Query OK, 1 row affected (0.02 sec)
wsrep_cluster_size now shows 2 nodes in the cluster. Start node 1's mysql service again and check whether the data just written on node 2 has been synchronized:
[root@node1 ~]#systemctl start mysql
[root@node1 ~]#mysql -uroot -padmin
mysql> SELECT * FROM percona.example;
+---------+-----------+
| node_id | node_name |
+---------+-----------+
| 1 | percona1 |
| 2 | percona2 |
| 3 | percona3 |
| 4 | percona4 |
+---------+-----------+
4 rows in set (0.00 sec)
As this shows, when a node in the cluster goes offline the other nodes keep working normally, and newly written data is synchronized to that node once it comes back online, which is how PXC achieves high availability.
For a node that comes back online, PXC has two ways of transferring data to bring it back in sync: State Snapshot Transfer (SST) and Incremental State Transfer (IST).
SST is typically used when a new node joins the PXC cluster and copies the full data set from an existing node. PXC supports three methods for performing SST:
– mysqldump
– rsync
– xtrabackup
The drawback of mysqldump and rsync is that during the transfer the cluster becomes read-only: SST places a read lock on the database (FLUSH TABLES WITH READ LOCK). With xtrabackup no read lock is needed for the bulk of the transfer; it is only required briefly to sync the .frm files, much like a regular backup.
IST, by contrast, copies only the incremental changes from one node to another.
Even though SST with xtrabackup does not need a read lock, SST can still disrupt normal service, whereas IST does not. If a node has been offline only briefly, it fetches just the changes that happened while it was down when it comes back online. IST is implemented with a cache on each node: every node keeps a ring buffer (of configurable size) holding the last N changes, and a node can transfer part of that cache. Obviously IST is only possible when the number of changes to transfer is smaller than N; otherwise the joining node has to perform a full SST.
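The size of that ring buffer is controlled by the Galera provider option gcache.size (128M by default in Galera 3). If nodes in your environment may stay offline for a while, enlarging it in my.cnf makes it more likely that a rejoining node can use IST instead of a full SST; the 1G value below is only illustrative:
wsrep_provider_options="gcache.size=1G"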