gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():766: Will never receive state.

环境信息

安装pxc过程中,启动第二个节点时出现如下错误:
gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():766: Will never receive state.
安装包:
percona-xtrabackup-24-2.4.15-1.el7.x86_64.rpm
Percona-XtraDB-Cluster-5.7.25-rel28-31.35.1.Linux.x86_64.ssl101.tar.gz

日志分析:

2019-12-01T06:49:17.523802-08:00 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-12-01T06:49:17.523926-08:00 0 [Warning] WSREP: Node is running in bootstrap/initialize mode. Disabling pxc_strict_mode checks
 100
 100
 100
 100
2019-12-01T06:49:18.521551-08:00 0 [Warning] InnoDB: New log files created, LSN=45790
2019-12-01T06:49:18.693027-08:00 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2019-12-01T06:49:18.751601-08:00 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: c0c4d9e3-1449-11ea-b40b-000c29c20589.
2019-12-01T06:49:18.755176-08:00 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2019-12-01T06:49:18.948738-08:00 0 [Warning] CA certificate ca.pem is self signed.
2019-12-01T06:49:19.014433-08:00 1 [Note] A temporary password is generated for root@localhost: lPTwmCj!(9ja
2019-12-01T06:49:44.249683-08:00 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-12-01T06:49:44.252680-08:00 0 [Warning] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2019-12-01T06:49:44.252736-08:00 0 [Note] /usr/local/pxcmysql/bin/mysqld (mysqld 5.7.25-28-31.35-log) starting as process 58740 ...
2019-12-01T06:49:44.260104-08:00 0 [Note] WSREP: Setting wsrep_ready to false
2019-12-01T06:49:44.260127-08:00 0 [Note] WSREP: No pre-stored wsrep-start position found. Skipping position initialization.
2019-12-01T06:49:44.260132-08:00 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/local/pxcmysql/lib/libgalera_smm.so'
2019-12-01T06:49:44.266323-08:00 0 [Note] WSREP: wsrep_load(): Galera 3.35(r) by Codership Oy  loaded successfully.
2019-12-01T06:49:44.266424-08:00 0 [Note] WSREP: CRC-32C: using hardware acceleration.
2019-12-01T06:49:44.266757-08:00 0 [Warning] WSREP: Could not open state file for reading: '/data/mysql/mysql3306/data//grastate.dat'
2019-12-01T06:49:44.266765-08:00 0 [Warning] WSREP: No persistent state found. Bootstraping with default state
2019-12-01T06:49:44.266856-08:00 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 1
2019-12-01T06:49:44.270417-08:00 0 [Note] WSREP: Passing config to GCS: base_dir = /data/mysql/mysql3306/data/; base_host = 192.168.31.110; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 10; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 4; evs.view_forget_timeout = PT24H; gcache.dir = /data/mysql/mysql3306/data/; gcache.freeze_purge_at_seqno = -1; gcache.keep_pages_count = 0; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data/mysql/mysql3306/data//galera.cache; gcache.page_size = 128M; gcache.recover = no; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 100; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = PT30S; pc.weight = 1; protonet.backend = asio; protonet.version = 0; repl.causal_read_timeout = PT30S; repl.commit_order = 3; repl.key_format = FLAT8; repl.max_ws_size = 2147483647; repl.proto_max = 9; socket.checksum = 2; socket.recv_buf_size = 212992; 
2019-12-01T06:49:44.294399-08:00 0 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 00000000-0000-0000-0000-000000000000:-1
2019-12-01T06:49:44.294925-08:00 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2019-12-01T06:49:44.294974-08:00 0 [Note] WSREP: Preparing to initiate SST/IST
2019-12-01T06:49:44.294980-08:00 0 [Note] WSREP: Starting replication
2019-12-01T06:49:44.294994-08:00 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2019-12-01T06:49:44.295174-08:00 0 [Note] WSREP: Using CRC-32C for message checksums.
2019-12-01T06:49:44.295248-08:00 0 [Note] WSREP: gcomm thread scheduling priority set to other:0 
2019-12-01T06:49:44.295344-08:00 0 [Warning] WSREP: Fail to access the file (/data/mysql/mysql3306/data//gvwstate.dat) error (No such file or directory). It is possible if node is booting for first time or re-booting after a graceful shutdown
2019-12-01T06:49:44.295351-08:00 0 [Note] WSREP: Restoring primary-component from disk failed. Either node is booting for first time or re-booting after a graceful shutdown
2019-12-01T06:49:44.295883-08:00 0 [Note] WSREP: GMCast version 0
2019-12-01T06:49:44.296395-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2019-12-01T06:49:44.296404-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2019-12-01T06:49:44.296865-08:00 0 [Note] WSREP: EVS version 0
2019-12-01T06:49:44.297040-08:00 0 [Note] WSREP: gcomm: connecting to group 'pcx_mc', peer '192.168.31.100:,192.168.31.110:,192.168.31.120:'
2019-12-01T06:49:44.299857-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') connection established to cffe85ea tcp://192.168.31.110:4567
2019-12-01T06:49:44.299886-08:00 0 [Warning] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') address 'tcp://192.168.31.110:4567' points to own listening address, blacklisting
2019-12-01T06:49:44.300832-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') connection established to 43b66ed9 tcp://192.168.31.100:4567
2019-12-01T06:49:44.300910-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: 
2019-12-01T06:49:44.803236-08:00 0 [Note] WSREP: declaring 43b66ed9 at tcp://192.168.31.100:4567 stable
2019-12-01T06:49:44.804296-08:00 0 [Note] WSREP: Node 43b66ed9 state primary
2019-12-01T06:49:44.825368-08:00 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(PRIM,43b66ed9,4)
memb {
	43b66ed9,0
	cffe85ea,0
	}
joined {
	}
left {
	}
partitioned {
	}
)
2019-12-01T06:49:44.825419-08:00 0 [Note] WSREP: Save the discovered primary-component to disk
2019-12-01T06:49:44.826904-08:00 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.31.120:4567
2019-12-01T06:49:45.301330-08:00 0 [Note] WSREP: gcomm: connected
2019-12-01T06:49:45.301464-08:00 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2019-12-01T06:49:45.301557-08:00 0 [Note] WSREP: Waiting for SST/IST to complete.
2019-12-01T06:49:45.301902-08:00 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2019-12-01T06:49:45.301922-08:00 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
2019-12-01T06:49:45.301952-08:00 0 [Note] WSREP: STATE EXCHANGE: sent state msg: d050c771-1449-11ea-9cd3-16ef5f2842a0
2019-12-01T06:49:45.301959-08:00 0 [Note] WSREP: STATE EXCHANGE: got state msg: d050c771-1449-11ea-9cd3-16ef5f2842a0 from 0 (grace)
2019-12-01T06:49:45.303194-08:00 0 [Note] WSREP: STATE EXCHANGE: got state msg: d050c771-1449-11ea-9cd3-16ef5f2842a0 from 1 (localhost.localdomain)
2019-12-01T06:49:45.303222-08:00 0 [Note] WSREP: Quorum results:
	version    = 4,
	component  = PRIMARY,
	conf_id    = 3,
	members    = 1/2 (primary/total),
	act_id     = 5,
	last_appl. = -1,
	protocols  = 0/9/3 (gcs/repl/appl),
	group UUID = 43b79600-1447-11ea-8ae5-463947bed9e1
2019-12-01T06:49:45.303231-08:00 0 [Note] WSREP: Flow-control interval: [141, 141]
2019-12-01T06:49:45.303235-08:00 0 [Note] WSREP: Trying to continue unpaused monitor
2019-12-01T06:49:45.303240-08:00 0 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 5)
2019-12-01T06:49:45.303721-08:00 2 [Note] WSREP: State transfer required: 
	Group state: 43b79600-1447-11ea-8ae5-463947bed9e1:5
	Local state: 00000000-0000-0000-0000-000000000000:-1
2019-12-01T06:49:45.303748-08:00 2 [Note] WSREP: REPL Protocols: 9 (4, 2)
2019-12-01T06:49:45.303769-08:00 2 [Note] WSREP: New cluster view: global state: 43b79600-1447-11ea-8ae5-463947bed9e1:5, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 3
2019-12-01T06:49:45.303779-08:00 2 [Note] WSREP: Setting wsrep_ready to true
2019-12-01T06:49:45.303785-08:00 2 [Warning] WSREP: Gap in state sequence. Need state transfer.
2019-12-01T06:49:45.303789-08:00 2 [Note] WSREP: Setting wsrep_ready to false
2019-12-01T06:49:45.304312-08:00 0 [Note] WSREP: Initiating SST/IST transfer on JOINER side (wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.31.110' --datadir '/data/mysql/mysql3306/data/' --defaults-file '/data/mysql/mysql3306/my3306.cnf' --defaults-group-suffix '' --parent '58740' --mysqld-version '5.7.25-28-31.35'  --binlog '/data/mysql/mysql3306/logs/mysql-bin' )
2019-12-01T06:49:46.384624-08:00 2 [Note] WSREP: Prepared SST/IST request: xtrabackup-v2|192.168.31.110:4444/xtrabackup_sst//1
2019-12-01T06:49:46.384654-08:00 2 [Note] WSREP: Auto Increment Offset/Increment re-align with cluster membership change (Offset: 1 -> 2) (Increment: 1 -> 2)
2019-12-01T06:49:46.384664-08:00 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2019-12-01T06:49:46.384688-08:00 2 [Note] WSREP: Assign initial position for certification: 5, protocol version: 4
2019-12-01T06:49:46.384754-08:00 0 [Note] WSREP: Service thread queue flushed.
2019-12-01T06:49:46.385003-08:00 2 [Note] WSREP: Check if state gap can be serviced using IST
2019-12-01T06:49:46.385022-08:00 2 [Note] WSREP: Local UUID: 00000000-0000-0000-0000-000000000000 != Group UUID: 43b79600-1447-11ea-8ae5-463947bed9e1
2019-12-01T06:49:46.385058-08:00 2 [Note] WSREP: State gap can't be serviced using IST. Switching to SST
2019-12-01T06:49:46.385064-08:00 2 [Note] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (43b79600-1447-11ea-8ae5-463947bed9e1): 1 (Operation not permitted)
	 at galera/src/replicator_str.cpp:prepare_for_IST():538. IST will be unavailable.
2019-12-01T06:49:46.386031-08:00 0 [Note] WSREP: Member 1.0 (localhost.localdomain) requested state transfer from '*any*'. Selected 0.0 (grace)(SYNCED) as donor.
2019-12-01T06:49:46.386065-08:00 0 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 5)
2019-12-01T06:49:46.386118-08:00 2 [Note] WSREP: Requesting state transfer: success, donor: 0
2019-12-01T06:49:46.386136-08:00 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> 43b79600-1447-11ea-8ae5-463947bed9e1:5
2019-12-01T06:49:46.474925-08:00 0 [Warning] WSREP: 0.0 (grace): State transfer to 1.0 (localhost.localdomain) failed: -2 (No such file or directory)
2019-12-01T06:49:46.474973-08:00 0 [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():766: Will never receive state. Need to abort.
2019-12-01T06:49:46.474998-08:00 0 [Note] WSREP: gcomm: terminating thread
2019-12-01T06:49:46.475007-08:00 0 [Note] WSREP: gcomm: joining thread
2019-12-01T06:49:46.475116-08:00 0 [Note] WSREP: gcomm: closing backend
2019-12-01T06:49:47.302937-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') turning message relay requesting off
2019-12-01T06:49:49.802917-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') connection to peer 43b66ed9 with addr tcp://192.168.31.100:4567 timed out, no messages seen in PT3S (gmcast.peer_timeout)
2019-12-01T06:49:49.803082-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://192.168.31.100:4567 
2019-12-01T06:49:51.302920-08:00 0 [Note] WSREP: (cffe85ea, 'tcp://0.0.0.0:4567') reconnecting to 43b66ed9 (tcp://192.168.31.100:4567), attempt 0
2019-12-01T06:49:51.475434-08:00 0 [Note] WSREP: declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
2019-12-01T06:49:51.475484-08:00 0 [Note] WSREP: evs::proto(cffe85ea, LEAVING, view_id(REG,43b66ed9,4)) suspecting node: 43b66ed9
2019-12-01T06:49:51.475493-08:00 0 [Note] WSREP: evs::proto(cffe85ea, LEAVING, view_id(REG,43b66ed9,4)) suspected node without join message, declaring inactive
2019-12-01T06:49:51.475523-08:00 0 [Note] WSREP: Current view of cluster as seen by this node
view (view_id(NON_PRIM,43b66ed9,4)
memb {
	cffe85ea,0
	}
joined {
	}
left {
	}
partitioned {
	43b66ed9,0
	}
)
2019-12-01T06:49:51.475876-08:00 0 [Note] WSREP: Current view of cluster as seen by this node
view ((empty))
2019-12-01T06:49:51.476195-08:00 0 [Note] WSREP: gcomm: closed
2019-12-01T06:49:51.476229-08:00 0 [Note] WSREP: /usr/local/pxcmysql/bin/mysqld: Terminated.
	2019-12-01T14:51:26.092543Z WSREP_SST: [ERROR] ******************* FATAL ERROR ********************** 
	2019-12-01T14:51:26.094746Z WSREP_SST: [ERROR] Possible timeout in receving first data from donor in gtid/keyring stage
	2019-12-01T14:51:26.097779Z WSREP_SST: [ERROR] ****************************************************** 
	2019-12-01T14:51:26.099984Z WSREP_SST: [ERROR] Cleanup after exit with status:32
Log of wsrep recovery (--wsrep-recover):

在第二个节点的日志中发现接受不到主节点的SST/IST。 检查sst用户,没有问题,检查防火墙,也发现错误。查看主节点日志,发现如下错误,原因是主节点的xtrabackup没有安装,配置文件中指明了使用xtrabackup的方式传送SST/IST(wsrep_sst_method=xtrabackup-v2)。 所以报错。

2019-12-01T09:45:50.666272-05:00 2 [Note] WSREP: Assign initial position for certification: 5, protocol version: 4
2019-12-01T09:45:50.666396-05:00 0 [Note] WSREP: Service thread queue flushed.
2019-12-01T09:45:52.213130-05:00 0 [Note] WSREP: Member 1.0 (localhost.localdomain) requested state transfer from '*any*'. Selected 0.0 (grace)(SYNCED) as donor.
2019-12-01T09:45:52.213153-05:00 0 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 5)
2019-12-01T09:45:52.213247-05:00 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2019-12-01T09:45:52.213499-05:00 0 [Note] WSREP: Initiating SST/IST transfer on DONOR side (wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.31.110:4444/xtrabackup_sst//1' --socket '/tmp/mysql3306.sock' --datadir '/data/mysql/mysql3306/data/' --defaults-file '/data/mysql/mysql3306/my3306.cnf' --defaults-group-suffix '' --mysqld-version '5.7.25-28-31.35'  --binlog '/data/mysql/mysql3306/logs/mysql-bin' --gtid '43b79600-1447-11ea-8ae5-463947bed9e1:5' )
2019-12-01T09:45:52.214154-05:00 2 [Note] WSREP: DONOR thread signaled with 0
which: no xtrabackup in (/usr/sbin:/sbin:/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/mysql/bin:/usr/local/mysql/bin:/root/bin:/usr/local/mysql/bin:/usr/local/pxcmysql/bin:/usr/local/pxcmysql/bin)
	2019-12-01T14:45:52.345536Z WSREP_SST: [ERROR] ******************* FATAL ERROR ********************** 
	2019-12-01T14:45:52.347380Z WSREP_SST: [ERROR] xtrabackup not in path: /usr/sbin:/sbin:/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/mysql/bin:/usr/local/mysql/bin:/root/bin:/usr/local/mysql/bin:/usr/local/pxcmysql/bin:/usr/local/pxcmysql/bin
	2019-12-01T14:45:52.349007Z WSREP_SST: [ERROR] ****************************************************** 
2019-12-01T09:45:52.349599-05:00 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.31.110:4444/xtrabackup_sst//1' --socket '/tmp/mysql3306.sock' --datadir '/data/mysql/mysql3306/data/' --defaults-file '/data/mysql/mysql3306/my3306.cnf' --defaults-group-suffix '' --mysqld-version '5.7.25-28-31.35'  --binlog '/data/mysql/mysql3306/logs/mysql-bin' --gtid '43b79600-1447-11ea-8ae5-463947bed9e1:5' : 2 (No such file or directory)
2019-12-01T09:45:52.349688-05:00 0 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '192.168.31.110:4444/xtrabackup_sst//1' --socket '/tmp/mysql3306.sock' --datadir '/data/mysql/mysql3306/data/' --defaults-file '/data/mysql/mysql3306/my3306.cnf' --defaults-group-suffix '' --mysqld-version '5.7.25-28-31.35'  --binlog '/data/mysql/mysql3306/logs/mysql-bin' --gtid '43b79600-1447-11ea-8ae5-463947bed9e1:5' 

总结

遇到接收不到SST/IST,检查一下几点:

  1. 检查SST用户的用户名和密码是否正确,并且有权限(wsrep_sst_auth)
  2. 检查防火墙(selinux, firewalld)
  3. 检查节点xtrabackup是否安装并且能正常备份
  4. 参数文件中wsrep_node_address是否配置的是本机IP地址

你可能感兴趣的:(Mysql,PXC)