k8s in Practice 11: Backing Up and Restoring etcd Cluster Data

1.
etcd API versions

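flannel talks to etcd through the v2 API, while kubernetes uses the v3 API, so the ETCDCTL_API environment variable has to be set before running the etcdctl commands below. With the etcdctl 3.3.7 used here the variable defaults to 2 (from etcd 3.4 on, the default is 3).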

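The etcd v2 and v3 data models are completely different and incompatible: data written through the v2 API can only be read through the v2 API, and data written through the v3 API can only be read through the v3 API.
As a result, the flannel data stored in etcd has to be read with etcdctl in v2 mode, while the kubernetes data requires ETCDCTL_API=3 to select the v3 API.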
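
Because the two APIs also use different TLS flag names (as the transcripts in this article show), it can help to wrap each mode in a small shell function. The following is only a sketch: the function names etcdctl_v2 and etcdctl_v3 are made up for illustration, and the certificate paths are the ones used on this cluster.

```shell
# Hypothetical helper names; certificate paths are the ones used in this article.
CA=/etc/kubernetes/cert/ca.pem
CERT=/etc/etcd/cert/etcd.pem
KEY=/etc/etcd/cert/etcd-key.pem

# v2 API (flannel data): TLS flags are --ca-file / --cert-file / --key-file.
etcdctl_v2() {
  ETCDCTL_API=2 etcdctl --ca-file="$CA" --cert-file="$CERT" --key-file="$KEY" "$@"
}

# v3 API (kubernetes data): TLS flags are --cacert / --cert / --key.
etcdctl_v3() {
  ETCDCTL_API=3 etcdctl --cacert="$CA" --cert="$CERT" --key="$KEY" "$@"
}

# Usage:
#   etcdctl_v2 ls /kubernetes/network/subnets
#   etcdctl_v3 get /registry/pods --prefix --keys-only
```

Setting ETCDCTL_API per call avoids leaving the wrong API version exported in the shell.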

2.
Basic operations

Operations with the v2 API
The flannel network information for kubernetes is stored through the v2 API.

[root@k8s-master2 snap]#  etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem --version
etcdctl version: 3.3.7
API version: 2
[root@k8s-master2 snap]#  etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem ls /kubernetes/network/subnets
/kubernetes/network/subnets/172.30.5.0-24
/kubernetes/network/subnets/172.30.97.0-24
/kubernetes/network/subnets/172.30.7.0-24
/kubernetes/network/subnets/172.30.41.0-24
[root@k8s-master2 snap]#  etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem get /kubernetes/network/config
{"Network":"172.30.0.0/16",
"SubnetLen": 24, "Backend": {"Type": "vxlan"}}
[root@k8s-master2 snap]#

Operations with the v3 API
Set the API version before running any command:
export ETCDCTL_API=3

[root@k8s-master1 ~]# etcdctl  version -w table
etcdctl version: 3.3.7
API version: 3.3
[root@k8s-master1 ~]#
[root@k8s-master1 ~]# etcdctl  member list -w table
+------------------+---------+-------------+-----------------------------+-----------------------------+
|        ID        | STATUS  |    NAME     |         PEER ADDRS          |        CLIENT ADDRS         |
+------------------+---------+-------------+-----------------------------+-----------------------------+
| 5bac98ba2781a51e | started | k8s-master2 | https://192.168.32.129:2380 | https://192.168.32.129:2379 |
| bd8793282fb7e56f | started | k8s-master3 | https://192.168.32.130:2380 | https://192.168.32.130:2379 |
| bee1cc9618cefbee | started | k8s-master1 | https://192.168.32.128:2380 | https://192.168.32.128:2379 |
+------------------+---------+-------------+-----------------------------+-----------------------------+

All kubernetes resource objects are stored under the /registry path.

[root@k8s-master1 ~]# etcdctl get /registry/pods --prefix --keys-only
/registry/pods/default/mysql-5f4c86bcf9-cz2np

/registry/pods/kube-system/coredns-779ffd89bd-k6r7l

[root@k8s-master1 ~]# etcdctl get /registry/services --prefix --keys-only
/registry/services/endpoints/default/kubernetes

/registry/services/endpoints/default/mysql

/registry/services/endpoints/kube-system/kube-controller-manager

/registry/services/endpoints/kube-system/kube-dns

/registry/services/endpoints/kube-system/kube-scheduler

/registry/services/specs/default/kubernetes

/registry/services/specs/default/mysql

/registry/services/specs/kube-system/kube-dns

[root@k8s-master1 ~]#

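Flags:
--keys-only   Off by default. When given, only the keys are printed; without it, each key is printed together with its value.
--prefix      Off by default. When given, every key that starts with the given string is returned, which is how all entries under a path are listed.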
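
As the transcripts above show, --keys-only prints a blank line after every key, which gets in the way when counting or piping keys. A small filter takes care of that; keys_only is a made-up name, and this is just one way to do it.

```shell
# Hypothetical helper: drop the blank separator lines that --keys-only prints.
keys_only() {
  # grep exits non-zero when nothing matches; treat "no keys" as success.
  grep -v '^$' || true
}

# Usage (with ETCDCTL_API=3):
#   etcdctl get /registry/pods --prefix --keys-only | keys_only | wc -l
```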

Retrieving pod details through the v3 API

[root@k8s-master1 ~]# kubectl get deployment,pod
NAME                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/mysql   1         1         1            1           22h
NAME                         READY     STATUS    RESTARTS   AGE
pod/mysql-5f4c86bcf9-s7pl9   1/1       Running   2          22h
[root@k8s-master1 ~]# 
[root@k8s-master1 ~]# etcdctl get /registry/pods/default/mysql-5f4c86bcf9-s7pl9  --prefix -w json
{"header":{"cluster_id":3029084511413524520,"member_id":6518028208589039923,"revision":152811,"raft_term":16},"kvs":[{"key":"L3JlZ2lzdHJ5L3BvZHMvZGVmYXVsdC9teXNxbC01ZjRjODZiY2Y5LXM3cGw5","create_revision":134651,"mod_revision":151326,"version":8,"value":"azhzAAoJCgJ2MRIDUG9kEqgKCu4BChZteXNxbC01ZjRjODZiY2Y5LXM3cGw5EhFteXNxbC01ZjRjODZiY2Y5LRoHZGVmYXVsdCIAKiQ0Y2JmZjU5ZS02MGVlLTExZTktOGM4Ni0wMDBjMjkxZDcwMjMyADgAQggIldPb5QUQAFoMCgNhcHASBW15c3FsWh8KEXBvZC10ZW1wbGF0ZS1oYXNoEgo1ZjRjODZiY2Y5alEKClJlcGxpY2FTZXQaEG15c3FsLTVmNGM4NmJjZjkiJDRjYjM1OTk4LTYwZWUtMTFlOS04Yzg2LTAwMGMyOTFkNzAyMyoHYXBwcy92MTABOAF6ABKVBAoeCgpteXNxbC1kYXRhEhBSDgoKbXlzcWwtcHZjMRAACjEKE2RlZmF1bHQtdG9rZW4td2RxNjISGjIYChNkZWZhdWx0LXRva2VuLXdkcTYyGKQDEuwBCgVteXNxbBIJbXlzcWw6NS42KgAyEwoFbXlzcWwQABjqGSIDVENQKgA6HwoTTVlTUUxfUk9PVF9QQVNTV09SRBIIcGFzc3dvcmRCAEogCgpteXNxbC1kYXRhEAAaDi92YXIvbGliL215c3FsIgBKSAoTZGVmYXVsdC10b2tlbi13ZHE2MhABGi0vdmFyL3J1bi9zZWNyZXRzL2t1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQiAGoUL2Rldi90ZXJtaW5hdGlvbi1sb2dyDElmTm90UHJlc2VudIABAIgBAJABAKIBBEZpbGUaBkFsd2F5cyAeMgxDbHVzdGVyRmlyc3RCB2RlZmF1bHRKB2RlZmF1bHRSC2s4cy1tYXN0ZXIzWABgAGgAcgCCAQCKAQCaARFkZWZhdWx0LXNjaGVkdWxlcrIBNgocbm9kZS5rdWJlcm5ldGVzLmlvL25vdC1yZWFkeRIGRXhpc3RzGgAiCU5vRXhlY3V0ZSisArIBOAoebm9kZS5rdWJlcm5ldGVzLmlvL3VucmVhY2hhYmxlEgZFeGlzdHMaACIJTm9FeGVjdXRlKKwCwgEAyAEAGpwECgdSdW5uaW5nEiMKC0luaXRpYWxpemVkEgRUcnVlGgAiCAiW09vlBRAAKgAyABIdCgVSZWFkeRIEVHJ1ZRoAIggItMvg5QUQACoAMgASJwoPQ29udGFpbmVyc1JlYWR5EgRUcnVlGgAiCAi0y+DlBRAAKgAyABIkCgxQb2RTY2hlZHVsZWQSBFRydWUaACIICJXT2+UFEAAqADIAGgAiACoOMTkyLjE2OC4zMi4xMzAyCzE3Mi4zMC44OS4yOggIltPb5QUQAELEAgoFbXlzcWwSDBIKCggIs8vg5QUQABpyGnAIABAAGglDb21wbGV0ZWQiACoICN2o3+UFEAAyCAjj3d/lBRAAOklkb2NrZXI6Ly82NGNhYzMzNTg3MjdiZDc0ZjNjMzIyZmJjYzg3ZmUyN2IxZGQ0NzQxYTlhOTEyMWRiZWJjNjlkM2JkODE2OTJiIAEoAjIJbXlzcWw6NS42Ol9kb2NrZXItcHVsbGFibGU6Ly9teXNxbEBzaGEyNTY6ZGUyOTEzYTBlYzUzZDk4Y2VkNmY2YmQ2MDdmNDg3YjdhZDhmZThkMmE4NmUyMTI4MzA4ZWJmNGJlMmY5MjY2N0JJZG9ja2VyOi8vOTA4YTQ4NTU4ZDRiZTgzMGJmZGVjZTFlZTQzODJlMWJiMjgxNjgxZjdjNDE5NjJhY
jNmYjBiYWY1ZDVhODk0Y0oKQmVzdEVmZm9ydFoAGgAiAA=="}],"count":1}
[root@k8s-master1 ~]# etcdctl get /registry/pods/default/mysql-5f4c86bcf9-s7pl9  --prefix -w fields
"ClusterID" : 3029084511413524520
"MemberID" : 6518028208589039923
"Revision" : 152845
"RaftTerm" : 16
"Key" : "/registry/pods/default/mysql-5f4c86bcf9-s7pl9"
"CreateRevision" : 134651
"ModRevision" : 151326
"Version" : 8
"Value" : "k8s\x00\n\t\n\x02v1\x12\x03Pod\x12\xa8\n\n\xee\x01\n\x16mysql-5f4c86bcf9-s7pl9\x12\x11mysql-5f4c86bcf9-\x1a\adefault\"\x00*$4cbff59e-60ee-11e9-8c86-000c291d70232\x008\x00B\b\b\x95\xd3\xdb\xe5\x05\x10\x00Z\f\n\x03app\x12\x05mysqlZ\x1f\n\x11pod-template-hash\x12\n5f4c86bcf9jQ\n\nReplicaSet\x1a\x10mysql-5f4c86bcf9\"$4cb35998-60ee-11e9-8c86-000c291d7023*\aapps/v10\x018\x01z\x00\x12\x95\x04\n\x1e\n\nmysql-data\x12\x10R\x0e\n\nmysql-pvc1\x10\x00\n1\n\x13default-token-wdq62\x12\x1a2\x18\n\x13default-token-wdq62\x18\xa4\x03\x12\xec\x01\n\x05mysql\x12\tmysql:5.6*\x002\x13\n\x05mysql\x10\x00\x18\xea\x19\"\x03TCP*\x00:\x1f\n\x13MYSQL_ROOT_PASSWORD\x12\bpasswordB\x00J \n\nmysql-data\x10\x00\x1a\x0e/var/lib/mysql\"\x00JH\n\x13default-token-wdq62\x10\x01\x1a-/var/run/secrets/kubernetes.io/serviceaccount\"\x00j\x14/dev/termination-logr\fIfNotPresent\x80\x01\x00\x88\x01\x00\x90\x01\x00\xa2\x01\x04File\x1a\x06Always \x1e2\fClusterFirstB\adefaultJ\adefaultR\vk8s-master3X\x00`\x00h\x00r\x00\x82\x01\x00\x8a\x01\x00\x9a\x01\x11default-scheduler\xb2\x016\n\x1cnode.kubernetes.io/not-ready\x12\x06Exists\x1a\x00\"\tNoExecute(\xac\x02\xb2\x018\n\x1enode.kubernetes.io/unreachable\x12\x06Exists\x1a\x00\"\tNoExecute(\xac\x02\xc2\x01\x00\xc8\x01\x00\x1a\x9c\x04\n\aRunning\x12#\n\vInitialized\x12\x04True\x1a\x00\"\b\b\x96\xd3\xdb\xe5\x05\x10\x00*\x002\x00\x12\x1d\n\x05Ready\x12\x04True\x1a\x00\"\b\b\xb4\xcb\xe0\xe5\x05\x10\x00*\x002\x00\x12'\n\x0fContainersReady\x12\x04True\x1a\x00\"\b\b\xb4\xcb\xe0\xe5\x05\x10\x00*\x002\x00\x12$\n\fPodScheduled\x12\x04True\x1a\x00\"\b\b\x95\xd3\xdb\xe5\x05\x10\x00*\x002\x00\x1a\x00\"\x00*\x0e192.168.32.1302\v172.30.89.2:\b\b\x96\xd3\xdb\xe5\x05\x10\x00B\xc4\x02\n\x05mysql\x12\f\x12\n\n\b\b\xb3\xcb\xe0\xe5\x05\x10\x00\x1ar\x1ap\b\x00\x10\x00\x1a\tCompleted\"\x00*\b\bݨ\xdf\xe5\x05\x10\x002\b\b\xe3\xdd\xdf\xe5\x05\x10\x00:Idocker://64cac3358727bd74f3c322fbcc87fe27b1dd4741a9a9121dbebc69d3bd81692b 
\x01(\x022\tmysql:5.6:_docker-pullable://mysql@sha256:de2913a0ec53d98ced6f6bd607f487b7ad8fe8d2a86e2128308ebf4be2f92667BIdocker://908a48558d4be830bfdece1ee4382e1bb281681f7c41962ab3fb0baf5d5a894cJ\nBestEffortZ\x00\x1a\x00\"\x00"
"Lease" : 0
"More" : false
"Count" : 1
[root@k8s-master1 ~]#
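
In the -w json output the key and value fields are base64-encoded, while -w fields prints them as escaped strings. A key can be decoded by hand as sketched below (b64_decode is a made-up helper name). The value decodes to the binary form in which kubernetes stores the object, so only fragments of it are readable.

```shell
# Hypothetical helper: decode one base64 field taken from `-w json` output.
b64_decode() {
  printf '%s' "$1" | base64 -d
}

# The "key" field from the JSON output above:
b64_decode "L3JlZ2lzdHJ5L3BvZHMvZGVmYXVsdC9teXNxbC01ZjRjODZiY2Y5LXM3cGw5"
echo
# prints /registry/pods/default/mysql-5f4c86bcf9-s7pl9
```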

Deleting a deployment and its pod through the v3 API

State before the deletion

[root@k8s-master1 ~]# kubectl get pod,deployment
NAME                        READY    STATUS    RESTARTS  AGE
pod/mysql-5f4c86bcf9-s7pl9  1/1      Running  2          22h

NAME                          DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
deployment.extensions/mysql  1        1        1            1          22h
[root@k8s-master1 ~]#

[root@k8s-master1 ~]# etcdctl get /registry/deployment --keys-only --prefix
/registry/deployments/default/mysql

/registry/deployments/kube-system/coredns

Delete

[root@k8s-master1 ~]# etcdctl del /registry/deployments/default/mysql
1

State after the deletion

[root@k8s-master1 ~]# etcdctl get /registry/pod --keys-only --prefix
/registry/pods/kube-system/coredns-779ffd89bd-k6r7l

[root@k8s-master1 ~]# etcdctl get /registry/deployment --keys-only --prefix
/registry/deployments/kube-system/coredns

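[root@k8s-master1 ~]# kubectl get pod,delpyment
error: the server doesn't have a resource type "delpyment"
[root@k8s-master1 ~]#

Note that the last command misspells deployment, so kubectl rejects it. The two etcdctl queries before it are what show that the mysql deployment and pod keys are gone.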

3.
Backing up etcd API v2 data

[root@k8s-master2 ~]# etcdctl backup --data-dir /var/lib/etcd/  --backup-dir /root/etcd_backup

[root@k8s-master2 member]# pwd
/root/etcd_backup/member
[root@k8s-master2 member]# ll
total 0
drwx------ 2 root root 62 Apr 18 10:33 snap
drwx------ 2 root root 51 Apr 18 10:33 wal
[root@k8s-master2 member]# ll wal/
total 62500
-rw------- 1 root root 64000000 Apr 18 10:33 0000000000000000-0000000000000000.wal
[root@k8s-master2 member]# ll snap/
total 56
-rw-r--r-- 1 root root 31079 Apr 18 10:33 000000000000003c-00000000000493e3.snap
-rw-r--r-- 1 root root 32768 Apr 18 10:33 db
[root@k8s-master2 member]#

Backing up etcd API v3 data

Backup from a single node

[root@k8s-master1 ~]# etcdctl --endpoints 127.0.0.1:2379 snapshot save snashot.db
Snapshot saved at snashot.db
[root@k8s-master1 ~]# ll
-rw-r--r--   1 root root 3756064 Apr 18 10:38 snashot.db
[root@k8s-master1 ~]#

Backup through the cluster endpoints

[root@k8s-master1 ~]# etcdctl --endpoints="https://192.168.32.129:2379,https://192.168.32.130:2379,192.168.32.128:2379" --cacert=/etc/kubernetes/cert/ca.pem --key=/etc/etcd/cert/etcd-key.pem --cert=/etc/etcd/cert/etcd.pem  snapshot save snashot1.db
Snapshot saved at snashot1.db
[root@k8s-master1 ~]#
[root@k8s-master1 ~]# ll
-rw-r--r--   1 root root 3756064 Apr 18 10:53 snashot1.db
-rw-r--r--   1 root root 3756064 Apr 18 10:38 snashot.db
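
For regular backups it is convenient to timestamp the file and check it right after saving. The sketch below assumes a single endpoint per snapshot (as far as I know, newer etcdctl releases expect exactly one endpoint for snapshot save) and uses the certificate paths from this article. The names snapshot_path and backup_v3 and the target directory are made up for illustration.

```shell
# Hypothetical helper: build a timestamped snapshot path under directory $1.
snapshot_path() {
  printf '%s/etcd-snapshot-%s.db\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Hypothetical helper: save a v3 snapshot from one endpoint, then inspect it.
# $1 = endpoint URL, $2 = target directory (an assumption; any writable dir works).
backup_v3() {
  file=$(snapshot_path "$2")
  ETCDCTL_API=3 etcdctl --endpoints="$1" \
    --cacert=/etc/kubernetes/cert/ca.pem \
    --cert=/etc/etcd/cert/etcd.pem \
    --key=/etc/etcd/cert/etcd-key.pem \
    snapshot save "$file" &&
  ETCDCTL_API=3 etcdctl snapshot status "$file" -w table
}

# Usage:
#   backup_v3 https://192.168.32.128:2379 /root/etcd_backup
```

snapshot status prints the hash, revision, key count and size of the file, which is a quick check that the snapshot is usable before it is needed.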

Restoring data
Be careful with the steps below: they can bring the cluster down and lose data. Try them in a test environment first.

Run: systemctl stop etcd
Stop the etcd service on every node.

Run: rm -rf /var/lib/etcd/
Delete the etcd data on every node.
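
A more cautious variant of the delete step is to rename the old data directory instead of removing it, so it can be put back if the restore goes wrong. This is my own suggestion rather than part of the procedure recorded here, and set_aside is a made-up helper name.

```shell
# Hypothetical helper: rename a directory to <dir>.bak-<timestamp> and print the new name.
set_aside() {
  dir=${1%/}                                   # tolerate a trailing slash
  backup="${dir}.bak-$(date +%Y%m%d-%H%M%S)"
  mv "$dir" "$backup" && printf '%s\n' "$backup"
}

# Usage, on every node after `systemctl stop etcd`:
#   set_aside /var/lib/etcd
```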

Restoring the v3 data

[root@k8s-master1 ~]#  etcdctl --name=k8s-master1 --endpoints="https://192.168.32.128:2379" --cacert=/etc/kubernetes/cert/ca.pem --key=/etc/etcd/cert/etcd-key.pem --cert=/etc/etcd/cert/etcd.pem --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=https://192.168.32.128:2380 --initial-cluster=k8s-master1=https://192.168.32.128:2380,k8s-master2=https://192.168.32.129:2380,k8s-master3=https://192.168.32.130:2380 --data-dir=/var/lib/etcd snapshot restore snashot1.db
2019-04-18 13:43:42.570882 I | mvcc: restore compact to 148651
2019-04-18 13:43:42.584194 I | etcdserver/membership: added member 4c99f52323a3e391 [https://192.168.32.129:2380] to cluster 2a0978507970d828
2019-04-18 13:43:42.584224 I | etcdserver/membership: added member 5a74b01f28ece933 [https://192.168.32.128:2380] to cluster 2a0978507970d828
2019-04-18 13:43:42.584234 I | etcdserver/membership: added member b29b94ace458096d [https://192.168.32.130:2380] to cluster 2a0978507970d828
[root@k8s-master2 ~]# etcdctl --name=k8s-master2 --endpoints="https://192.168.32.129:2379" --cacert=/etc/kubernetes/cert/ca.pem --key=/etc/etcd/cert/etcd-key.pem --cert=/etc/etcd/cert/etcd.pem --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=https://192.168.32.129:2380 --initial-cluster=k8s-master1=https://192.168.32.128:2380,k8s-master2=https://192.168.32.129:2380,k8s-master3=https://192.168.32.130:2380 --data-dir=/var/lib/etcd snapshot restore snashot1.db
2019-04-18 13:43:56.313096 I | mvcc: restore compact to 148651
2019-04-18 13:43:56.324779 I | etcdserver/membership: added member 4c99f52323a3e391 [https://192.168.32.129:2380] to cluster 2a0978507970d828
2019-04-18 13:43:56.324806 I | etcdserver/membership: added member 5a74b01f28ece933 [https://192.168.32.128:2380] to cluster 2a0978507970d828
2019-04-18 13:43:56.324819 I | etcdserver/membership: added member b29b94ace458096d [https://192.168.32.130:2380] to cluster 2a0978507970d828
[root@k8s-master3 ~]# etcdctl --name=k8s-master3 --endpoints="https://192.168.32.130:2379" --cacert=/etc/kubernetes/cert/ca.pem --key=/etc/etcd/cert/etcd-key.pem --cert=/etc/etcd/cert/etcd.pem --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=https://192.168.32.130:2380 --initial-cluster=k8s-master1=https://192.168.32.128:2380,k8s-master2=https://192.168.32.129:2380,k8s-master3=https://192.168.32.130:2380 --data-dir=/var/lib/etcd snapshot restore snashot1.db
2019-04-18 13:44:10.643115 I | mvcc: restore compact to 148651
2019-04-18 13:44:10.649920 I | etcdserver/membership: added member 4c99f52323a3e391 [https://192.168.32.129:2380] to cluster 2a0978507970d828
2019-04-18 13:44:10.649957 I | etcdserver/membership: added member 5a74b01f28ece933 [https://192.168.32.128:2380] to cluster 2a0978507970d828
2019-04-18 13:44:10.649973 I | etcdserver/membership: added member b29b94ace458096d [https://192.168.32.130:2380] to cluster 2a0978507970d828
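
The three restore commands differ only in the node name and IP, so they can be generated from one list. The sketch below prints the command for a node instead of running it. It leaves out --endpoints and the certificate flags on the assumption that snapshot restore only reads the local snapshot file and does not contact a server. The helper names initial_cluster and restore_cmd are made up.

```shell
# name=ip pairs taken from the member list in this article.
NODES="k8s-master1=192.168.32.128 k8s-master2=192.168.32.129 k8s-master3=192.168.32.130"

# Build the --initial-cluster value, identical on every node.
initial_cluster() {
  out=""
  for pair in $NODES; do
    name=${pair%%=*}
    ip=${pair#*=}
    out="${out:+$out,}${name}=https://${ip}:2380"
  done
  printf '%s\n' "$out"
}

# Print (not run) the restore command for one node: $1 = name, $2 = ip, $3 = snapshot file.
restore_cmd() {
  printf 'ETCDCTL_API=3 etcdctl snapshot restore %s --name=%s --initial-cluster=%s --initial-cluster-token=etcd-cluster-1 --initial-advertise-peer-urls=https://%s:2380 --data-dir=/var/lib/etcd\n' \
    "$3" "$1" "$(initial_cluster)" "$2"
}

restore_cmd k8s-master1 192.168.32.128 snashot1.db
```

Every node must restore from the same snapshot file with the same --initial-cluster and --initial-cluster-token values, as in the transcript above.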

The service fails to start

[root@k8s-master1 ~]# tail -n 30 /var/log/messages
Apr 18 13:46:41 k8s-master1 systemd: Starting Etcd Server...
Apr 18 13:46:41 k8s-master1 etcd: etcd Version: 3.3.7
Apr 18 13:46:41 k8s-master1 etcd: Git SHA: 56536de55
Apr 18 13:46:41 k8s-master1 etcd: Go Version: go1.9.6
Apr 18 13:46:41 k8s-master1 etcd: Go OS/Arch: linux/amd64
Apr 18 13:46:41 k8s-master1 etcd: setting maximum number of CPUs to 1, total number of available CPUs is 1
Apr 18 13:46:41 k8s-master1 etcd: error listing data dir: /var/lib/etcd
Apr 18 13:46:41 k8s-master1 systemd: etcd.service: main process exited, code=exited, status=1/FAILURE
Apr 18 13:46:41 k8s-master1 systemd: Failed to start Etcd Server.
Apr 18 13:46:41 k8s-master1 systemd: Unit etcd.service entered failed state.
Apr 18 13:46:41 k8s-master1 systemd: etcd.service failed.
Apr 18 13:46:41 k8s-master1 flanneld: timed out
Apr 18 13:46:41 k8s-master1 flanneld: E0418 13:46:41.858283   63943 main.go:349] Couldn't fetch network config: client: etcd cluster is unavailable or misconfigured; error #0: EOF
Apr 18 13:46:41 k8s-master1 flanneld: ; error #1: EOF
Apr 18 13:46:41 k8s-master1 flanneld: ; error #2: EOF
[root@k8s-master1 ~]#

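Fix the ownership of the data directory; after the restore (run as root) it is owned by root:root:
chown -R k8s:k8s /var/lib/etcd
After that etcd starts normally.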
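
To catch the "error listing data dir" failure before starting the service, the ownership can be checked as part of the restore. A minimal sketch, assuming GNU stat and that etcd runs as the k8s user as on this cluster; owner_of and ensure_owner are made-up helper names.

```shell
# Hypothetical helper: print the owning user of a path (GNU stat syntax).
owner_of() {
  stat -c '%U' "$1"
}

# Hypothetical helper: chown -R only when the owner differs from the wanted user.
# $1 = directory, $2 = user, $3 = group
ensure_owner() {
  if [ "$(owner_of "$1")" != "$2" ]; then
    chown -R "$2:$3" "$1"
  fi
}

# Usage, as root on every node, before `systemctl restart etcd`:
#   ensure_owner /var/lib/etcd k8s k8s
```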

[root@k8s-master1 ~]#  systemctl daemon-reload && systemctl restart etcd
[root@k8s-master1 ~]# systemctl status etcd
● etcd.service - Etcd Server
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-04-18 14:44:56 CST; 9s ago
     Docs: https://github.com/coreos
Main PID: 2557 (etcd)
   Memory: 11.6M
   CGroup: /system.slice/etcd.service
           └─2557 /opt/k8s/bin/etcd --data-dir=/var/lib/etcd --name=k8s-master1 --cert-file=/etc/etcd/cer...

Apr 18 14:44:56 k8s-master1 etcd[2557]: established a TCP streaming connection with peer 4c99f52323a3...ter)
Apr 18 14:44:56 k8s-master1 etcd[2557]: established a TCP streaming connection with peer 4c99f52323a3...ter)
Apr 18 14:44:56 k8s-master1 etcd[2557]: published {Name:k8s-master1 ClientURLs:[https://192.168.32.12...d828
Apr 18 14:44:56 k8s-master1 etcd[2557]: ready to serve client requests
Apr 18 14:44:56 k8s-master1 etcd[2557]: serving insecure client requests on 127.0.0.1:2379, this is s...ged!
Apr 18 14:44:56 k8s-master1 etcd[2557]: ready to serve client requests
Apr 18 14:44:56 k8s-master1 systemd[1]: Started Etcd Server.
Apr 18 14:44:56 k8s-master1 etcd[2557]: serving client requests on 192.168.32.128:2379
Apr 18 14:44:56 k8s-master1 etcd[2557]: 5a74b01f28ece933 initialzed peer connection; fast-forwarding ...r(s)
Apr 18 14:45:01 k8s-master1 etcd[2557]: the clock difference against peer 4c99f52323a3e391 is too hig... 1s]
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-master1 ~]# etcdctl member list
4c99f52323a3e391, started, k8s-master2, https://192.168.32.129:2380, https://192.168.32.129:2379
5a74b01f28ece933, started, k8s-master1, https://192.168.32.128:2380, https://192.168.32.128:2379
b29b94ace458096d, started, k8s-master3, https://192.168.32.130:2380, https://192.168.32.130:2379
[root@k8s-master1 ~]#
[root@k8s-master1 ~]# kubectl get all
NAME                         READY     STATUS    RESTARTS   AGE
pod/mysql-5f4c86bcf9-s7pl9   1/1       Running   1          21h

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/kubernetes   ClusterIP   10.254.0.1       <none>        443/TCP    10d
service/mysql        ClusterIP   10.254.113.118   <none>        3306/TCP   21h

NAME                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/mysql   1         1         1            1           21h

NAME                               DESIRED   CURRENT   READY     AGE
replicaset.apps/mysql-5f4c86bcf9   1         1         1         21h
[root@k8s-master1 ~]#

The v3 data has been restored successfully.

The v2 data, however, has not been restored,
and because it is missing, flanneld cannot start either.

[root@k8s-master3 ~]#  etcdctl --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/etcd/cert/etcd.pem --key-file=/etc/etcd/cert/etcd-key.pem ls /kubernetes/network/subnets
Error:  100: Key not found (/kubernetes) [9]
[root@k8s-master3 ~]# systemctl status flanneld
● flanneld.service - Flanneld overlay address etcd agent
   Loaded: loaded (/etc/systemd/system/flanneld.service; enabled; vendor preset: disabled)
   Active: activating (start) since Thu 2019-04-18 14:47:07 CST; 23s ago
Main PID: 2985 (flanneld)
   Memory: 9.3M
   CGroup: /system.slice/flanneld.service
           └─2985 /opt/k8s/bin/flanneld -etcd-cafile=/etc/kubernetes/cert/ca.pem -etcd-certfile=/etc/flan...

Apr 18 14:47:26 k8s-master3 flanneld[2985]: timed out
Apr 18 14:47:26 k8s-master3 flanneld[2985]: E0418 14:47:26.380541    2985 main.go:349] Couldn't fetch... [9]
Apr 18 14:47:27 k8s-master3 flanneld[2985]: timed out
Apr 18 14:47:27 k8s-master3 flanneld[2985]: E0418 14:47:27.384473    2985 main.go:349] Couldn't fetch... [9]
Apr 18 14:47:28 k8s-master3 flanneld[2985]: timed out
Apr 18 14:47:28 k8s-master3 flanneld[2985]: E0418 14:47:28.388892    2985 main.go:349] Couldn't fetch... [9]
Apr 18 14:47:29 k8s-master3 flanneld[2985]: timed out
Apr 18 14:47:29 k8s-master3 flanneld[2985]: E0418 14:47:29.393689    2985 main.go:349] Couldn't fetch... [9]
Apr 18 14:47:30 k8s-master3 flanneld[2985]: timed out
Apr 18 14:47:30 k8s-master3 flanneld[2985]: E0418 14:47:30.397597    2985 main.go:349] Couldn't fetch... [9]
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-master3 ~]#

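The official documentation points out specifically:
If v2 data exists when the backup is taken with v3, the restore is not affected.
If v3 data exists when the backup is taken with v2, the restore fails.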

The v2 store only holds the flannel network information, so it can simply be regenerated.

[root@k8s-master2 ~]# source /opt/k8s/bin/environment.sh
[root@k8s-master2 ~]# echo ${ETCD_ENDPOINTS}
https://192.168.32.128:2379,https://192.168.32.129:2379,https://192.168.32.130:2379
[root@k8s-master2 ~]# etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem set ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'",
> "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
{"Network":"172.30.0.0/16",
"SubnetLen": 24, "Backend": {"Type": "vxlan"}}
[root@k8s-master2 ~]#
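
The quoting in the set command above is easy to get wrong. One option is to build the JSON in a small function first and pass the result to etcdctl. A sketch, with flannel_config as a made-up name; the values are the ones shown in the output above.

```shell
# Hypothetical helper: build the flannel network config for CIDR $1.
flannel_config() {
  printf '{"Network":"%s","SubnetLen":24,"Backend":{"Type":"vxlan"}}\n' "$1"
}

flannel_config 172.30.0.0/16
# prints {"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}

# Usage with the v2 API, with the same --endpoints and certificate flags as the command above:
#   etcdctl ... set /kubernetes/network/config "$(flannel_config 172.30.0.0/16)"
```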

Once the v2 record is regenerated, flanneld recovers,
and the cluster is back to normal.

[root@k8s-master2 ~]# systemctl status flanneld
● flanneld.service - Flanneld overlay address etcd agent
  Loaded: loaded (/etc/systemd/system/flanneld.service; enabled; vendor preset: disabled)
  Active: active (running) since Thu 2019-04-18 15:22:21 CST; 41s ago
  Process: 10032 ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker (code=exited, status=0/SUCCESS)
Main PID: 9930 (flanneld)
  Memory: 6.5M
  CGroup: /system.slice/flanneld.service
          └─9930 /opt/k8s/bin/flanneld -etcd-cafile=/etc/kubernetes/cert/ca.pem -etcd-certfile=/etc/flan...

Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.349931    9930 main.go:300] Wrote subnet f....env
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.349939    9930 main.go:304] Running backend.
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.356819    9930 iptables.go:115] Some iptab...ules
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.356834    9930 iptables.go:137] Deleting i...CEPT
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.359940    9930 iptables.go:137] Deleting i...CEPT
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.360944    9930 iptables.go:125] Adding ipt...CEPT
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.364645    9930 vxlan_network.go:60] watchi...ases
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.371347    9930 iptables.go:125] Adding ipt...CEPT
Apr 18 15:22:21 k8s-master2 flanneld[9930]: I0418 15:22:21.374036    9930 main.go:396] Waiting for 23...ease
Apr 18 15:22:21 k8s-master2 systemd[1]: Started Flanneld overlay address etcd agent.
Hint: Some lines were ellipsized, use -l to show in full.
[root@k8s-master2 ~]#
[root@k8s-master2 ~]# kubectl cluster-info
Kubernetes master is running at https://192.168.32.127:8443
CoreDNS is running at https://192.168.32.127:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@k8s-master2 ~]# kubectl get cs
NAME                STATUS    MESSAGE            ERROR
controller-manager  Healthy  ok                 
scheduler            Healthy  ok                 
etcd-2              Healthy  {"health":"true"} 
etcd-0              Healthy  {"health":"true"} 
etcd-1              Healthy  {"health":"true"} 
[root@k8s-master2 ~]#

4. Errors you may run into

A command may fail with:
Error: context deadline exceeded

The certificates need to be passed to etcdctl.
For example:

etcdctl --endpoints="https://192.168.32.128:2379" --cacert=/etc/kubernetes/cert/ca.pem --key=/etc/etcd/cert/etcd-key.pem  --cert=/etc/etcd/cert/etcd.pem --prefix --keys-only=false get /registry/pods/default/dnsutils-ds-4svcr
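
To avoid repeating the flags on every call, etcdctl in v3 mode also reads them from environment variables named after the flags. A sketch using the endpoint and certificate paths from this article:

```shell
# etcdctl v3 reads each flag from an ETCDCTL_-prefixed environment variable.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://192.168.32.128:2379
export ETCDCTL_CACERT=/etc/kubernetes/cert/ca.pem
export ETCDCTL_CERT=/etc/etcd/cert/etcd.pem
export ETCDCTL_KEY=/etc/etcd/cert/etcd-key.pem

# Usage, in the same shell:
#   etcdctl get /registry/pods --prefix --keys-only
```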

Reposted from: https://blog.51cto.com/goome/2380854
