Handling an etcd startup failure

[root@node3 etcd]# systemctl status etcd.service
● etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2023-07-13 20:29:00 CST; 11s ago
  Process: 4156 ExecStart=/usr/bin/etcd (code=exited, status=1/FAILURE)
 Main PID: 4156 (code=exited, status=1/FAILURE)

7月 13 20:29:00 node3 systemd[1]: Failed to start Etcd Server.
7月 13 20:29:01 node3 systemd[1]: etcd.service: Start request repeated too quickly.
7月 13 20:29:01 node3 systemd[1]: etcd.service: Failed with result 'exit-code'.
7月 13 20:29:01 node3 systemd[1]: Failed to start Etcd Server.
7月 13 20:29:02 node3 systemd[1]: etcd.service: Start request repeated too quickly.
7月 13 20:29:02 node3 systemd[1]: etcd.service: Failed with result 'exit-code'.
7月 13 20:29:02 node3 systemd[1]: Failed to start Etcd Server.
7月 13 20:29:03 node3 systemd[1]: etcd.service: Start request repeated too quickly.
7月 13 20:29:03 node3 systemd[1]: etcd.service: Failed with result 'exit-code'.
7月 13 20:29:03 node3 systemd[1]: Failed to start Etcd Server.
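The status output above only shows systemd throttling the restarts ("Start request repeated too quickly"); the reason etcd itself exited is in the journal. A small sketch for pulling out the relevant lines, assuming etcd writes its JSON log format to the journal. `filter_etcd_errors` is a hypothetical helper name, not part of etcd:

```shell
#!/bin/sh
# Hypothetical helper: keep only etcd log lines at error level or worse.
# Assumes etcd's JSON log format, where each line carries a "level" field.
filter_etcd_errors() {
    grep -E '"level":"(error|fatal|panic)"'
}
```

It could be used as `journalctl -u etcd.service --no-pager -o cat | filter_etcd_errors`. Once the underlying cause is fixed, `systemctl reset-failed etcd.service` clears the start-rate limit before the next start attempt.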

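The node3 machine had failed, so I cloned node2 to create a replacement node3 and updated its configuration. Starting etcd on the clone failed with the error above. My first step was to delete the etcd data files on node3: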

[root@node3 etcd]# cd /var/lib/etcd
[root@node3 etcd]# ls -l
总用量 0
drwx------ 3 root root 20  7月 13 20:25 default.etcd
drwx------ 3 root root 20  7月 13 20:29 node3.etcd

[root@node3 etcd]# rm -rf *
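Deleting the data directory outright leaves nothing to fall back on. A more cautious variant, sketched here under the assumption that etcd is stopped and the disk has room, moves the directory aside instead. `move_aside` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: move a data directory aside instead of deleting it.
# Prints the new location so it can be restored or removed later.
move_aside() {
    src=$1
    dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
    mv -- "$src" "$dest" && printf '%s\n' "$dest"
}
```

For example, `move_aside /var/lib/etcd/node3.etcd` would rename the directory with a timestamp suffix and print the new path.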
 

The error persisted, so I checked the cluster state from another node:

[root@node2 etcd]# etcdctl member list
b3620ea8b67f283d, started, node2, http://172.20.1.211:2380, http://172.20.1.211:2379, false
ec9b2cc31235d3a8, started, node3, http://172.20.1.212:2380, http://172.20.1.212:2379, false
f30319c9dda087bb, started, node1, http://172.20.1.210:2380, http://172.20.1.210:2379, false
 

[root@node2 etcd]# etcdctl --endpoints=172.20.1.212:2379,172.20.1.211:2379,172.20.1.210:2379 endpoint status --write-out=table
{"level":"warn","ts":"2023-07-13T20:24:37.367784+0800","logger":"etcd-client","caller":"v3@v3.5.9/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000378a80/172.20.1.212:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: Error while dialing dial tcp 172.20.1.212:2379: connect: connection refused\""}
Failed to get the status of endpoint 172.20.1.212:2379 (context deadline exceeded)
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 172.20.1.211:2379 | b3620ea8b67f283d |   3.5.9 |   20 kB |     false |      false |        22 |    2827984 |            2827984 |        |
| 172.20.1.210:2379 | f30319c9dda087bb |   3.5.9 |   20 kB |      true |      false |        22 |    2827984 |            2827984 |        |
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
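The table shows two of the three members responding. etcd needs a majority of members, floor(n/2) + 1, to keep accepting writes, so this cluster still has quorum with node3 down. A minimal sketch of that arithmetic; `has_quorum` is a name made up for this illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if <healthy> members form a majority of <total>.
# usage: has_quorum <healthy> <total>
has_quorum() {
    healthy=$1
    total=$2
    [ "$healthy" -ge $(( total / 2 + 1 )) ]
}
```

With the numbers above, `has_quorum 2 3` succeeds, while `has_quorum 1 3` would fail.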
 

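Only two of the three members were healthy. I then deleted the files under /var/lib/etcd on node1 and node2 as well and restarted etcd, after which the cluster came up normally (as a new, empty cluster, since this discards the data the old one held).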
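Since node1 and node2 still held quorum, an alternative that keeps the existing data would be to replace only the node3 member through etcd's runtime reconfiguration. This was not tried in the troubleshooting above, so treat it as a sketch. The first step is finding the ID of the stale member; `member_id_by_name` is a hypothetical helper that assumes the default comma-separated `etcdctl member list` output shown earlier:

```shell
#!/bin/sh
# Hypothetical helper: read `etcdctl member list` output on stdin and print
# the ID of the member whose name matches $1.
# Assumes the default output format: ID, status, name, peer URLs, client URLs, learner.
member_id_by_name() {
    awk -F', ' -v name="$1" '$3 == name { print $1 }'
}
```

On a healthy node, `id=$(etcdctl member list | member_id_by_name node3)` followed by `etcdctl member remove "$id"` and `etcdctl member add node3 --peer-urls=http://172.20.1.212:2380` would re-register the member. etcd on node3 would then be started with an empty data directory and `ETCD_INITIAL_CLUSTER_STATE=existing`, so that it joins the running cluster instead of trying to bootstrap a new one.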

Reference: "ETCD-常用命令介绍" (an introduction to common etcd commands), QianLiStudent's blog on CSDN
