Backing Up and Restoring etcd in a k8s Demo Cluster

Previous articles covered CI/CD and monitoring for this k8s demo cluster. For day-to-day operations, backing up the cluster's control-plane data is just as essential, so this article walks through a backup-and-restore drill of the cluster's etcd.

1. Configure etcdctl

Add the following to /etc/profile:
export ETCDCTL_API=3
alias etcdctl='etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key '

Check etcd's status:

[root@VM-12-8-centos ~]# source /etc/profile
# check status
[root@VM-12-8-centos ~]# etcdctl endpoint status -w table
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
|         ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://[127.0.0.1]:2379 | fc2c7cf28a83253f |  3.3.10 |  5.8 MB |      true |         5 |   22421398 |
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
# health check
[root@VM-12-8-centos ~]# etcdctl endpoint health -w table
https://[127.0.0.1]:2379 is healthy: successfully committed proposal: took = 1.073883ms
# etcd cluster member list
[root@VM-12-8-centos ~]# etcdctl member list -w table         
+------------------+---------+----------------+------------------------+------------------------+
|        ID        | STATUS  |      NAME      |       PEER ADDRS       |      CLIENT ADDRS      |
+------------------+---------+----------------+------------------------+------------------------+
| fc2c7cf28a83253f | started | vm-12-8-centos | https://10.0.12.8:2380 | https://10.0.12.8:2379 |
+------------------+---------+----------------+------------------------+------------------------+

Check etcd's static pod manifest to find its data directory:

[root@VM-12-8-centos manifests]# pwd
/etc/kubernetes/manifests
[root@VM-12-8-centos manifests]# cat etcd.yaml 
......
    - --data-dir=/var/lib/etcd
......
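To pull just the data directory out of the manifest without reading the whole file, a grep one-liner works (the output below is the same line shown in the excerpt above):

[root@VM-12-8-centos manifests]# grep -- --data-dir etcd.yaml
    - --data-dir=/var/lib/etcd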
Inspect the etcd data directory:
[root@VM-12-8-centos etcd]# pwd
/var/lib/etcd
[root@VM-12-8-centos etcd]# tree member
member
├── snap
│   ├── 0000000000000005-0000000001555f8e.snap
│   ├── 0000000000000005-000000000155869f.snap
│   ├── 0000000000000005-000000000155adb0.snap
│   ├── 0000000000000005-000000000155d4c1.snap
│   ├── 0000000000000005-000000000155fbd2.snap
│   └── db
└── wal
    ├── 000000000000010e-00000000015074f9.wal
    ├── 000000000000010f-000000000151acf4.wal
    ├── 0000000000000110-000000000152e4f4.wal
    ├── 0000000000000111-0000000001541cef.wal
    ├── 0000000000000112-00000000015554ef.wal
    └── 0.tmp

2. Back up the etcd data

Create a backup directory, /data/etcd_bak, and take a snapshot:

[root@VM-12-8-centos etcd_bak]# etcdctl snapshot  save /data/etcd_bak/etcd-snapshot-`date +%Y%m%d`.db
Snapshot saved at /data/etcd_bak/etcd-snapshot-20230401.db
[root@VM-12-8-centos etcd_bak]# ll
total 5656
-rw-r--r-- 1 root root 5787680 Apr  1 15:35 etcd-snapshot-20230401.db
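
Before relying on a snapshot, its integrity can be checked with etcdctl's built-in status subcommand, which prints the snapshot's hash, latest revision, total key count, and size:

[root@VM-12-8-centos etcd_bak]# etcdctl snapshot status etcd-snapshot-20230401.db -w table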

The backup command above can be wrapped in a shell script and scheduled with crontab to run in the early hours every day, as sketched below.
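
A minimal sketch of such a script plus its crontab entry (the script path, etcdctl binary location, and 7-day retention are assumptions; the TLS flags mirror the alias configured earlier, since aliases from /etc/profile are not available under cron):

#!/bin/bash
# /data/etcd_bak/etcd_backup.sh -- daily etcd snapshot (hypothetical path)
export ETCDCTL_API=3
BACKUP_DIR=/data/etcd_bak

# Take today's snapshot; spell out the TLS flags used interactively,
# because the etcdctl alias is not loaded in cron's shell
/usr/bin/etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save "${BACKUP_DIR}/etcd-snapshot-$(date +%Y%m%d).db"

# Prune snapshots older than 7 days (retention period is an assumption)
find "${BACKUP_DIR}" -name 'etcd-snapshot-*.db' -mtime +7 -delete

# crontab entry (crontab -e): run at 02:00 every day
# 0 2 * * * /bin/bash /data/etcd_bak/etcd_backup.sh >> /data/etcd_bak/backup.log 2>&1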

3. Simulate etcd data loss

Rename /var/lib/etcd/member to simulate losing the etcd data:

mv /var/lib/etcd/member /var/lib/etcd/member.bk

Verify:
[root@VM-12-8-centos etcd]# etcdctl member list -w table      
Error: context deadline exceeded
[root@VM-12-8-centos etcd]#  etcdctl endpoint health -w table
127.0.0.1:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
[root@VM-12-8-centos etcd]# kubectl get po
No resources found.

As you can see, soon after the member directory is renamed, etcdctl reports the endpoint unhealthy and kubectl can no longer fetch any cluster data.

4. Restore etcd from the backup

Note that the restore must follow this order:

Stop kube-apiserver --> stop etcd --> restore the data --> start etcd --> start kube-apiserver

Since this k8s cluster was installed with kubeadm, etcd and kube-apiserver are not system services but static pods, so restarting them means manipulating their manifests: move the apiserver and etcd YAML files out of /etc/kubernetes/manifests and kill the apiserver and etcd processes, restore the etcd data, then move the etcd YAML and the apiserver YAML back in to bring the two processes up again. See the sketch below.
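
A minimal sketch of that sequence, assuming the manifests are parked in /etc/kubernetes itself (any directory outside manifests/ works):

# Move the static pod manifests out; kubelet stops the pods it no longer sees
cd /etc/kubernetes
mv manifests/kube-apiserver.yaml manifests/etcd.yaml ./

# Belt and braces: make sure the processes are really gone
pkill -f kube-apiserver
pkill -x etcd

# ... restore the etcd data here (next step) ...

# Move the manifests back: etcd first, then the apiserver
mv etcd.yaml manifests/
mv kube-apiserver.yaml manifests/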

Restore the etcd snapshot:

[root@VM-12-8-centos etcd_bak]# etcdctl snapshot restore etcd-snapshot-20230401.db --data-dir=/var/lib/etcd
2023-04-01 16:04:03.430332 I | mvcc: restore compact to 19663311
2023-04-01 16:04:03.441067 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
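
Note that when snapshot restore is run without any cluster flags, it seeds the new data directory with etcd's built-in defaults: member name default, peer URL http://localhost:2380, and hence the well-known default member ID 8e9e05c52164694d visible in the log above. That is why the member list in section 5 no longer shows the original node identity. To preserve it, the restore could instead be given the same identity flags the node was started with (values copied from the etcd process line later in this article):

etcdctl snapshot restore etcd-snapshot-20230401.db \
    --data-dir=/var/lib/etcd \
    --name=vm-12-8-centos \
    --initial-cluster=vm-12-8-centos=https://10.0.12.8:2380 \
    --initial-advertise-peer-urls=https://10.0.12.8:2380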

Start etcd and the apiserver by moving their manifests back:

[root@VM-12-8-centos kubernetes]# mv etcd.yaml  manifests/
[root@VM-12-8-centos etcd_bak]# ps -ef | grep etcd
root     19873 19853  1 16:04 ?        00:00:00 etcd --advertise-client-urls=https://10.0.12.8:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://10.0.12.8:2380 --initial-cluster=vm-12-8-centos=https://10.0.12.8:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://10.0.12.8:2379 --listen-peer-urls=https://10.0.12.8:2380 --name=vm-12-8-centos --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt

[root@VM-12-8-centos kubernetes]# mv kube-apiserver.yaml manifests/
[root@VM-12-8-centos etcd_bak]# ps -ef | grep apiserver
root     20878 20858 99 16:05 ?        00:00:03 kube-apiserver --advertise-address=10.0.12.8 --allow-privileged=true --authorization-mode=Node,RBAC --client-ca-file=/etc/kubernetes/pki/ca.crt --enable-admission-plugins=NodeRestriction --enable-bootstrap-token-auth=true --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key --etcd-servers=https://127.0.0.1:2379 --insecure-port=0 --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key --requestheader-allowed-names=front-proxy-client --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-group-headers=X-Remote-Group --requestheader-username-headers=X-Remote-User --secure-port=6443 --service-account-key-file=/etc/kubernetes/pki/sa.pub --service-cluster-ip-range=10.1.0.0/16 --tls-cert-file=/etc/kubernetes/pki/apiserver.crt --tls-private-key-file=/etc/kubernetes/pki/apiserver.key

5. Check cluster status

[root@VM-12-8-centos etcd_bak]# etcdctl member list -w table 
+------------------+---------+----------------+-----------------------+------------------------+
|        ID        | STATUS  |      NAME      |      PEER ADDRS       |      CLIENT ADDRS      |
+------------------+---------+----------------+-----------------------+------------------------+
| 8e9e05c52164694d | started | vm-12-8-centos | http://localhost:2380 | https://10.0.12.8:2379 |
+------------------+---------+----------------+-----------------------+------------------------+
[root@VM-12-8-centos etcd_bak]# etcdctl endpoint health -w table
https://[127.0.0.1]:2379 is healthy: successfully committed proposal: took = 737.506µs
[root@VM-12-8-centos etcd_bak]# etcdctl endpoint status -w table
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
|         ENDPOINT         |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://[127.0.0.1]:2379 | 8e9e05c52164694d |  3.3.10 |  5.8 MB |      true |         2 |        611 |
+--------------------------+------------------+---------+---------+-----------+-----------+------------+
[root@VM-12-8-centos etcd_bak]# kubectl get ns
NAME              STATUS   AGE
default           Active   156d
kube-node-lease   Active   156d
kube-public       Active   156d
kube-system       Active   156d
kube-users        Active   142d
monitoring        Active   138d
[root@VM-12-8-centos etcd_bak]# kubectl get po -n kube-system
NAME                                     READY   STATUS    RESTARTS   AGE
coredns-bccdc95cf-kwgjx                  1/1     Running   0          156d
coredns-bccdc95cf-s52jg                  1/1     Running   0          156d
etcd-vm-12-8-centos                      1/1     Running   0          156d
kube-apiserver-vm-12-8-centos            1/1     Running   0          151d
kube-controller-manager-vm-12-8-centos   1/1     Running   7          137d
