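Background
Because Kubernetes v1.16+ had a serious vulnerability, we decided to upgrade Kubernetes to v1.19.3, the stable version at the time we planned the work. Going from 1.16 to 1.19 skips two fairly large releases, so we ran into a variety of problems along the way. This series of posts records how we solved the thornier ones.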
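Upgrading Kubernetes
In Kubernetes 1.16.7 and earlier, the main components api-server, scheduler, controller-manager and kubelet were packaged together into a single binary, hyperkube, so the main Kubernetes components only had to be built into one hyperkube image. In 1.19.3 the community split them apart again into the independent binaries kube-apiserver, kube-scheduler and kube-controller-manager, each built into its own image.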
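The thorniest problem in upgrading to Kubernetes v1.19.3 was that the community had removed several old API versions. For example, DaemonSet and Deployment resources move from apiVersion: extensions/v1beta1 to apiVersion: apps/v1. In the upgraded cluster, k8s 1.19.3 no longer recognizes resources declared with the old API versions, so the first step is to update every resource in the cluster to the latest version, apps/v1, which mainly means the helm charts. Once the chart versions were changed everything looked fine, but then the real problem showed up. After the k8s upgrade, etcd holds both old-version and new-version resources. Kubernetes preserves backward compatibility, so the new Kubernetes version in the upgraded cluster recognizes both and nothing breaks there. Helm, however, offers no such backward compatibility for charts: the manifest of a chart stored in a helm release that was installed before the upgrade still uses the old API versions, so operating on those resources with the helm client produces the following error: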
helm status image-manager --tls
Error: [unable to recognize "": no matches for kind "DaemonSet" in version "extensions/v1beta1", unable to recognize "": no matches for kind "StatefulSet" in version "apps/v1beta1"]
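I dug around in the community and found several similar issues, and eventually understood the cause: the new k8s cluster no longer recognizes the metadata stored in the helm release. Fortunately the helm community has a dedicated document on fixing this kind of API incompatibility. According to that document, the manifest data in the helm release has to be updated by hand.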
The official helm documentation at https://helm.sh/docs/topics/kubernetes_apis/ describes the problem and offers workarounds, mainly in the section https://helm.sh/docs/topics/kubernetes_apis/#updating-api-versions-of-a-release-manifest.
It proposes two ways to work around the issue:
- Manually update the release data in the configmap with the supported API versions.
- Use the helm plugin mapkubeapis to replace deprecated or removed APIs in the helm release.
The manual option involves many steps: get the helm release data from the configmap, decode it, update the API versions, encode it again, and apply the change to the configmap. It is somewhat complicated.
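To make the "update the API versions" step of the manual option concrete, here is a minimal sketch. It assumes the release manifest has already been decoded into plain multi-document YAML and that each document lists its top-level apiVersion before its kind. The function name map_api_versions is hypothetical, and the mappings are only the ones mentioned in this post; decoding and re-encoding the configmap data is not covered.

```shell
# map_api_versions: rewrite removed apiVersion values in a YAML manifest read
# from stdin. Assumes each document lists top-level apiVersion before kind.
map_api_versions() {
  awk '
    # Flush an unmatched apiVersion when a document separator is reached.
    /^---/ && pending != "" { print "apiVersion: " pending; pending = "" }
    # Hold the apiVersion line until the kind of the same document is known.
    /^apiVersion:/ {
      if (pending != "") print "apiVersion: " pending
      pending = $2
      next
    }
    /^kind:/ && pending != "" {
      api = pending; kind = $2; pending = ""
      if (api == "extensions/v1beta1" && (kind == "DaemonSet" || kind == "Deployment")) api = "apps/v1"
      else if (api == "apps/v1beta1" && kind == "StatefulSet") api = "apps/v1"
      else if (api == "extensions/v1beta1" && kind == "Ingress") api = "networking.k8s.io/v1beta1"
      print "apiVersion: " api
    }
    { print }
    END { if (pending != "") print "apiVersion: " pending }
  '
}
```

The mapping is keyed on both apiVersion and kind because the same old group, extensions/v1beta1, maps to different targets for DaemonSet and Ingress, so a plain search-and-replace on the apiVersion string would not be enough.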
Here is an example of how I used the helm plugin mapkubeapis to fix the problem we saw in image-manager:
# kubectl -n kube-system get configmap |grep image-manager
image-manager.v1 1 21h
# helm list --tls | grep image-manager
image-manager 1 Wed Nov 4 01:35:29 2020 DEPLOYED image-manager-3.3.2001 kube-system
Install the plugin
# helm plugin install https://github.com/hickeyma/helm-mapkubeapis
Downloading and installing helm-mapkubeapis v0.0.15 ...
https://github.com/hickeyma/helm-mapkubeapis/releases/download/v0.0.15/helm-mapkubeapis_0.0.15_linux_amd64.tar.gz
Installed plugin: mapkubeapis
Update helm release image-manager with supported API version
# helm mapkubeapis --namespace=kube-system --v2 image-manager
2020/11/04 23:17:14 Release 'image-manager' will be checked for deprecated or removed Kubernetes APIs and will be updated if necessary to supported API versions.
2020/11/04 23:17:14 Get release 'image-manager' latest version.
2020/11/04 23:17:14 Check release 'image-manager' for deprecated or removed APIs...
2020/11/04 23:17:14 Found deprecated or removed Kubernetes API:
"apiVersion: apps/v1beta1
kind: StatefulSet"
Supported API equivalent:
"apiVersion: apps/v1
kind: StatefulSet"
2020/11/04 23:17:14 Found deprecated or removed Kubernetes API:
"apiVersion: extensions/v1beta1
kind: DaemonSet"
Supported API equivalent:
"apiVersion: apps/v1
kind: DaemonSet"
2020/11/04 23:17:14 Found deprecated or removed Kubernetes API:
"apiVersion: extensions/v1beta1
kind: Ingress"
Supported API equivalent:
"apiVersion: networking.k8s.io/v1beta1
kind: Ingress"
2020/11/04 23:17:14 Finished checking release 'image-manager' for deprecated or removed APIs.
2020/11/04 23:17:14 Deprecated or removed APIs exist, updating release: image-manager.
2020/11/04 23:17:14 Set status of release version 'image-manager.v1' to 'superseded'.
2020/11/04 23:17:14 Release version 'image-manager.v1' updated successfully.
2020/11/04 23:17:14 Add release version 'image-manager.v2' with updated supported APIs.
2020/11/04 23:17:14 Release version 'image-manager.v2' added successfully.
2020/11/04 23:17:14 Release 'image-manager' with deprecated or removed APIs updated successfully to new version.
2020/11/04 23:17:14 Map of release 'image-manager' deprecated or removed APIs to supported versions, completed successfully.
Now check the helm release image-manager again:
# kubectl -n kube-system get configmap | grep image-mana
image-manager-init-certs-config 1 21h
image-manager.v1 1 21h
image-manager.v2 1 64s
# helm list --tls | grep image-mana
image-manager 2 Wed Nov 4 23:17:14 2020 DEPLOYED image-manager-3.3.2001 kube-system
# helm status image-manager --tls
LAST DEPLOYED: Wed Nov 4 23:17:14 2020
NAMESPACE: kube-system
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
image-manager-init-certs-config 1 21h
registry-config 1 21h
==> v1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
image-manager-init-certs 3 3 3 3 3 21h
==> v1/StatefulSet
NAME DESIRED CURRENT AGE
image-manager 1 1 21h
==> v1beta1/Ingress
NAME AGE
image-manager-token 21h
image-manager 21h
==> v1alpha1/Certificate
NAME AGE
image-manager-token-cert 21h
image-manager-registry-cert 21h
image-manager-cert 21h
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
image-manager-init-certs-7npql 1/1 Running 0 21h
image-manager-init-certs-f9mcq 1/1 Running 0 21h
image-manager-init-certs-vltsv 1/1 Running 0 21h
image-manager-0 2/2 Running 0 21h
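As you can see, the helm plugin mapkubeapis updates the helm release to supported API versions and creates a new release revision. Now we can start the ICP helm upgrade.
In effect, the helm mapkubeapis plugin copies the old API resources stored in the helm release, rewrites the copy to the new API versions, and creates a new release revision from it, so the release gains one more revision after this operation.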
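Since each revision is stored as its own configmap named release.vN, the newest revision can be read off the configmap listing. The sketch below is one way to do that; the function name latest_release_version is hypothetical, and it assumes input in the same column layout as the kubectl get configmap output shown above.

```shell
# latest_release_version: given a release name and `kubectl get configmap`
# style output on stdin, print the highest <release>.v<N> revision number.
latest_release_version() {
  release="$1"
  awk -v prefix="${release}.v" '
    # Only names that start with "<release>.v" are release revisions.
    index($1, prefix) == 1 {
      n = substr($1, length(prefix) + 1)
      if (n ~ /^[0-9]+$/ && n + 0 > max) max = n + 0
    }
    END { if (max > 0) print max }
  '
}
```

It could be fed directly from the cluster, for example with kubectl -n kube-system get configmap piped into latest_release_version image-manager. Matching on the full prefix keeps unrelated configmaps such as image-manager-init-certs-config out of the result.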
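With those two big problems behind us, everything looked fine and the upgrade itself had no major issues. But rolling back the whole cluster brought a new problem. Charts written for the old k8s version were not required to set a selector, whereas k8s 1.19.3 makes the selector mandatory: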
spec:
  selector:
    matchLabels:
      app: icp-management-ingress
      chart: icp-management-ingress
      component: icp-management-ingress
      heritage: Tiller
      k8s-app: icp-management-ingress
      release: icp-management-ingress
We could add the selector to the chart in the upgrade package, but then the rollback failed with this error:
[root@sunny-gf4 ~]# helm rollback --tls --force internal-management-ingress 2
Error: failed to create resource: DaemonSet.apps "internal-management-ingress" is invalid: spec.template.metadata.labels: Invalid value: map[string]string{"app":"internal-management-ingress", "chart":"icp-management-ingress", "component":"internal-management-ingress", "heritage":"Tiller", "k8s-app":"internal-management-ingress", "release":"internal-management-ingress"}: `selector` does not match template `labels`
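Upgrading really is easier than rolling back, and who knows what the next step will bring. The error means that the labels in the chart's selector must stay consistent across the two versions of the metadata, but our old-version resources clearly had no selector at all. It confirmed the saying that whoever upgrades suffers, all the more so when a rollback is also required. So how do we solve it?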
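The rule behind the error message can be checked before attempting a rollback: in apps/v1, every key/value pair in spec.selector.matchLabels has to appear in spec.template.metadata.labels. The following sketch checks that condition on labels passed as comma-separated key=value strings. The function name selector_matches_labels and the string format are assumptions for illustration; it does not parse YAML or talk to the cluster.

```shell
# selector_matches_labels: succeed only if every key=value pair in the
# comma-separated selector also appears in the comma-separated template labels.
selector_matches_labels() {
  selector="$1"
  labels=",$2,"
  old_ifs="$IFS"
  IFS=','
  for pair in $selector; do
    case "$labels" in
      *",$pair,"*) ;;  # this selector entry is present in the template labels
      *)
        IFS="$old_ifs"
        echo "selector $pair not found in template labels" >&2
        return 1
        ;;
    esac
  done
  IFS="$old_ifs"
  return 0
}
```

With the values from the listings above, a selector of app=icp-management-ingress checked against template labels containing app=internal-management-ingress fails, which mirrors the mismatch reported by helm rollback.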
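Our first idea was to edit the metadata in the helm release by hand, much like what the helm mapkubeapis plugin does. But automating that manual step is fairly complex and would need Python to process the metadata, and as programmers we could not put up with such a clumsy approach. After a few days of hard thinking we came up with a workaround: before upgrading k8s, first fix up the problematic charts in a small preliminary upgrade that adds a selector wherever one is needed, then upgrade k8s, then upgrade the charts again to change their API versions. Once more we got out of trouble, and the problem was solved cleanly. To summarize the upgrade path: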
- upgrade chart (add selectors)
- upgrade k8s to 1.19.3
- upgrade chart (change API versions)
- roll back chart
- roll back k8s to 1.16.7
How the chart release revision changes:
1----upgrade------>2------upgrade k8s---->2-----upgrade Charts----->3-----rollback chart(2)----->4-----rollback k8s----->4
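The revision numbers in the line above follow a simple rule, which the sketch below replays: it assumes that every helm upgrade or rollback writes a new release revision, while upgrading or rolling back k8s leaves the release history untouched. The function name revision_trace and the step names are hypothetical.

```shell
# revision_trace: replay a list of steps and print the helm release revision
# after each one, starting from revision 1.
revision_trace() {
  rev=1
  trace="$rev"
  for step in "$@"; do
    case "$step" in
      upgrade-chart|rollback-chart) rev=$((rev + 1)) ;;  # helm writes a new revision
      upgrade-k8s|rollback-k8s) ;;                       # release history is untouched
      *) echo "unknown step: $step" >&2; return 1 ;;
    esac
    trace="$trace $rev"
  done
  echo "$trace"
}
```

Calling revision_trace upgrade-chart upgrade-k8s upgrade-chart rollback-chart rollback-k8s reproduces the sequence 1 2 2 3 4 4. Note that the rollback to revision 2 is itself recorded as revision 4 rather than moving the counter back.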
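Upgrading etcd
etcd is the distributed database of the Kubernetes cloud platform, effectively the heart of Kubernetes. If an etcd upgrade goes wrong the whole cluster goes down, which is a failure customers cannot tolerate, so the etcd upgrade is critical, all the more so because we also had to support rolling etcd back. The etcd community states clearly that etcd does not support upgrades that skip versions, and that such upgrades cannot be done with zero downtime. So before upgrading etcd we had to back up the cluster data by taking a snapshot with etcdctl3 snapshot save. Fortunately both etcd versions use the same etcd3 data format; otherwise we would also have needed a data conversion.
The steps for upgrading etcd are as follows: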
After applying fix pack 3.2.2.2006, the customer may need to roll back to 3.2.1.2003 or 3.2.1.2006 if errors appear.
Back up the etcd data before upgrading to 3.2.2.2006.
- Log on to one of the master nodes as the root user.
- Run the following commands to export the required environment variables. Replace etcd_member_IP with the IP address of one of your etcd members.
export image=mycluster.icp:8500/ibmcom/etcd:3.2.24.2
export endpoint=etcd_member_IP
- Copy the etcdctl binary to /usr/local/bin/ by entering the following commands:
mkdir tmp && chown -R etcd:etcd tmp
docker run --rm -v $(pwd)/tmp:/data $image cp /usr/local/bin/etcdctl /data
mv tmp/etcdctl /usr/local/bin/etcdctl && rm -rf tmp
- Configure the etcdctl command.
alias etcdctl3="ETCDCTL_API=3 etcdctl --endpoints=https://${endpoint}:4001 --cacert=/etc/cfc/conf/etcd/ca.pem --cert=/etc/cfc/conf/etcd/client.pem --key=/etc/cfc/conf/etcd/client-key.pem"
- Validate the etcd cluster status by running the following commands:
etcdctl3 --write-out=table endpoint status
etcdctl3 endpoint health
- Take a snapshot of the etcd data by entering the following command. Create the /data directory if needed.
etcdctl3 snapshot save /data/etcd.db
The etcd backup data is now available at /data/etcd.db on the master node.
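Because the rollback depends entirely on this file, it may be worth confirming that the snapshot was actually written before applying the fix pack. The following is a minimal sketch of such a check. The function name check_snapshot is hypothetical, and it only verifies that the file exists and is non-empty, not that its contents are a valid snapshot.

```shell
# check_snapshot: fail unless the snapshot file exists and is non-empty.
# Defaults to the /data/etcd.db path used in the backup step.
check_snapshot() {
  snapshot="${1:-/data/etcd.db}"
  if [ ! -s "$snapshot" ]; then
    echo "snapshot $snapshot is missing or empty" >&2
    return 1
  fi
  # wc may pad its output with spaces on some platforms, so strip them.
  echo "snapshot $snapshot present ($(wc -c < "$snapshot" | tr -d ' ') bytes)"
}
```

For a check of the contents rather than just the file, etcdctl also has a snapshot status subcommand that can be run against the same path through the etcdctl3 alias.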
Apply fix pack 3.2.2.2006
Please follow normal procedures for applying fix pack 3.2.2.2006.
Roll back fix pack 3.2.2.2006
Before rolling back 3.2.2.2006, back up the secret icp-mongodb-metrics.
kubectl -n kube-system get secret icp-mongodb-metrics -o yaml > icp-mongodb-metrics.yaml.bak
After running the rollback by applying the previous fix pack, for example 3.2.1.2003, you may hit an error like the following:
stderr: 'Error: Could not get information about the resource: no kind "Ingress" is registered for version "networking.k8s.io/v1beta1" in scheme "k8s.io/kubernetes/pkg/api/legacyscheme/scheme.go:29"'
If so, run the following steps to restore etcd.
- If ansible is not available on the customer's boot node, run the following command in the installation directory to get an ansible environment inside a container:
docker run -e LICENSE=accept --net=host --rm -it -v "$(pwd)":/installer/cluster -v /data/:/data ibmcom/icp-inception-amd64:3.2.1.2003-ee /bin/bash
Then run the following command for each master IP address:
ssh -i $CLUSTER_DIR/ssh_key root@
Make sure jq is installed on each master node. For example, on Ubuntu it can be installed with the following command:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m package -a "use=apt name=jq state=present"
- Stop Kubernetes on all master nodes. This stops the etcd pod and prevents Kubernetes from automatically creating new pods for the ones that you are stopping.
a. Create a directory for the backup pod by entering the following command:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -a "mkdir -p /etc/cfc/podbackup"
b. Move the backup pod into the directory:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /etc/cfc/pods/*.json /etc/cfc/podbackup"
c. Wait for etcd to stop on all nodes. You can check the status by entering the following command:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m wait_for -a "port=4001 state=stopped"
d. After etcd has stopped, stop the kubelet by running this command on all master and management nodes:
ansible master,management -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m service -a "name=kubelet state=stopped"
e. After the kubelet has stopped, restart the Docker service by entering the following command, to ensure that all pods that are not managed by kubelet are stopped:
ansible master,management -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m service -a "name=docker state=restarted"
- Purge, copy and restore the etcd data.
a. Purge the current etcd data on all master nodes by running the following commands:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /var/lib/etcd /var/lib/etcd.old"
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /var/lib/etcd-wal /var/lib/etcd-wal.old"
b. Copy the etcd snapshot to all master nodes. Assuming that you have the /data/etcd.db file in your environment, which contains a backup of your etcd, run the following procedure to copy the file to all master nodes:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m copy -a "src=/data/etcd.db dest=/tmp/snapshot.db"
c. Restore the snapshot on all master nodes. Assuming you have cloned the Git repository, and that your current directory is icp-backup/scripts, run the following commands to run the script that restores the snapshot to all of the master nodes:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mkdir -p /var/lib/etcd && chown -R etcd:etcd /var/lib/etcd /tmp/snapshot.db"
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m script -a "./multimaster-etcd-restore.sh"
The data is loaded into the /var/lib/etcd/restored directory on each of your master nodes, with the cluster settings configured.
d. Move the contents to the /var/lib/etcd/ and /var/lib/etcd-wal/ directories by running the following commands:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mkdir -p /var/lib/etcd-wal && chown -R etcd:etcd /var/lib/etcd-wal"
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /var/lib/etcd/restored/member /var/lib/etcd/"
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /var/lib/etcd/member/wal/ /var/lib/etcd-wal/"
e. Run the following script to purge the kubelet pods directory to ensure consistency between the cached kubelet data and the etcd data:
ansible master,management -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m script -a "./purge_kubelet_pods.sh"
f. Re-enable the kubelet pod by entering the following command:
ansible master,management -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m service -a "name=kubelet state=started"
g. Re-enable the etcd pod by entering the following command:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /etc/cfc/podbackup/etcd.json /etc/cfc/pods"
h. Run the following command to monitor the progress of the etcd component status as it starts:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m wait_for -a "port=4001 state=started"
- Validate the etcd cluster health.
a. Run the following commands to configure the etcdctl tool to query the etcd cluster:
export endpoint= :4001, :4001, :4001
alias etcdctl="ETCDCTL_API=3 etcdctl --cacert=/etc/cfc/conf/etcd/ca.pem --cert=/etc/cfc/conf/etcd/client.pem --key=/etc/cfc/conf/etcd/client-key.pem"
Change the value before :4001 to the IP address of the etcd node that you are working with.
b. Query the cluster health by entering the following command:
etcdctl --endpoints=${endpoint} endpoint health
- Start the remaining IBM Cloud Private cluster pods by entering the following command:
ansible master -i $CLUSTER_DIR/hosts -e @$CLUSTER_DIR/config.yaml --private-key=$CLUSTER_DIR/ssh_key -m shell -a "mv /etc/cfc/podbackup/*.json /etc/cfc/pods"
This command enables kubelet to start the remaining core Kubernetes pods, which then start the workloads that are managed by Kubernetes.
It takes several minutes for all pods to be restarted. You can monitor the pods in the kube-system namespace by running the following command:
kubectl get pods --namespace=kube-system
Because the rollback failed, it is expected that some pods fail to start.
- Because the secret icp-mongodb-metrics was also restored when etcd was restored, the pod icp-mongo-db will fail with an error similar to the following:
Warning Unhealthy 3m19s kubelet, 10.11.27.35 Liveness probe failed: time="2020-09-19T04:15:22Z" level=error msg="Cannot connect to server using url mongodb://****:****@localhost:27017: server returned error on SASL authentication step: Authentication failed." source="connection.go:84"
Run the following commands to fix the problem:
kubectl -n kube-system delete secret icp-mongodb-metrics
kubectl -n kube-system create -f icp-mongodb-metrics.yaml.bak
- After restoring etcd and the secret icp-mongodb-metrics successfully, re-run apply-fixpack for the previous fix pack. For example, roll back to 3.2.1.2003 with the following command:
docker run -e LICENSE=accept --net=host --rm -t -v "$(pwd)":/installer/cluster ibmcom/icp-inception-amd64:3.2.1.2003-ee apply-fixpack
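The restore steps above repeat the same ansible inventory, variable file and SSH key options on every command. If you script the procedure, a small wrapper can cut down on copy-and-paste mistakes. The sketch below is one possible shape; the function name run_on is hypothetical, and it assumes CLUSTER_DIR is already set as in the commands above.

```shell
# run_on: hypothetical helper that runs an ansible ad-hoc command against a
# host group with the inventory, variables and SSH key used in the steps above.
# Usage: run_on <group> <ansible module options...>
run_on() {
  group="$1"
  shift
  ansible "$group" -i "$CLUSTER_DIR/hosts" -e "@$CLUSTER_DIR/config.yaml" \
    --private-key="$CLUSTER_DIR/ssh_key" "$@"
}
```

With it, the wait for etcd to stop becomes run_on master -m wait_for -a "port=4001 state=stopped", and stopping the kubelet becomes run_on master,management -m service -a "name=kubelet state=stopped".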