Replacing nodes in a Kubernetes external etcd cluster

  1. Prepare 3 new physical machines: apply the base OS configuration and install kubeadm, kubelet, kubectl, and so on
  2. Because this cluster's etcd runs externally to Kubernetes, kubelet on the etcd nodes needs the following configuration

    Create a new systemd unit drop-in file that takes precedence over the kubelet unit file supplied by kubeadm, overriding the service settings:

    cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
    [Service]
    ExecStart=
    # Replace "systemd" with the cgroup driver of your container runtime. The default value in the kubelet is "cgroupfs".
    ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd
    Restart=always
    EOF
    ##########################
    systemctl daemon-reload
    systemctl restart kubelet
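The 20- prefix matters: systemd applies `kubelet.service.d/*.conf` drop-ins in lexical filename order, so this file is processed after kubeadm's `10-kubeadm.conf`, and the empty `ExecStart=` line clears the previous command before the new one is set. A trivial illustration of the ordering:

```shell
# systemd reads drop-ins in lexical order; the file sorted last is applied
# last, so its settings win over kubeadm's 10-kubeadm.conf.
last=$(printf '%s\n' 10-kubeadm.conf 20-etcd-service-manager.conf | sort | tail -n 1)
echo "applied last: $last"
```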

  3. On each of the 3 new etcd nodes, create a kubeadm-config.yaml file with the following content (shown for the first node; adjust name and IPs per node)

    [root@etcd1-test ~]# cat kubeadm-config.yaml
    apiVersion: "kubeadm.k8s.io/v1beta2"
    kind: ClusterConfiguration
    etcd:
        local:
            serverCertSANs:
            - "10.120.37.100"
            peerCertSANs:
            - "10.120.37.100"
            dataDir: "/ssd1/etcd"
            extraArgs:
                quota-backend-bytes: "8589934592"
                max-snapshots: "5"
                auto-compaction-retention: "1"
                max-wals: "8"
                initial-cluster: etcd1=https://etcd4-test.yidian-inc.com:2380,etcd2=https://etcd5-test.yidian-inc.com:2380,etcd3=https://etcd6-test.yidian-inc.com:2380
                initial-cluster-state: existing
                name: etcd1
                listen-peer-urls: https://10.120.37.100:2380
                listen-client-urls: https://10.120.37.100:2379
                advertise-client-urls: https://10.120.37.100:2379
                initial-advertise-peer-urls: https://10.120.37.100:2380
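The initial-cluster value is a comma-separated list of name=peer-URL pairs, and each name must match the --name that node starts with. A quick, runs-anywhere sanity check of the list above:

```shell
# Sanity-check sketch: split initial-cluster into its name=peer-URL pairs
# and confirm there are exactly three members.
INITIAL_CLUSTER="etcd1=https://etcd4-test.yidian-inc.com:2380,etcd2=https://etcd5-test.yidian-inc.com:2380,etcd3=https://etcd6-test.yidian-inc.com:2380"
echo "$INITIAL_CLUSTER" | tr ',' '\n' | awk -F'=' '{ printf "%-6s -> %s\n", $1, $2 }'
members=$(echo "$INITIAL_CLUSTER" | tr ',' '\n' | wc -l)
echo "member count: $members"
```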

  4. Sync ca.crt and ca.key from the old etcd cluster to the 3 new etcd nodes, then generate the certificates for the new etcd:

    kubeadm init phase certs etcd-server --config=/root/kubeadm-config.yaml
    kubeadm init phase certs etcd-peer --config=/root/kubeadm-config.yaml
    kubeadm init phase certs etcd-healthcheck-client --config=/root/kubeadm-config.yaml
    kubeadm init phase certs apiserver-etcd-client --config=/root/kubeadm-config.yaml

  5. Extend the new etcd certificates to 10 years
    https://github.com/yuyicai/update-kube-cert
    Check the validity period after renewal:

    [root@etcd4-test pki]# openssl x509 -in apiserver-etcd-client.crt -noout -dates
    notBefore=Mar 10 03:03:33 2023 GMT
    notAfter=Mar  7 03:03:33 2033 GMT
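To script that check rather than reading the dates by eye, openssl's -checkend flag exits non-zero once a certificate is within N seconds of expiry. A self-contained sketch using a throwaway certificate (against the real files you would point it at apiserver-etcd-client.crt):

```shell
# Generate a disposable 10-year self-signed cert, then verify it will still
# be valid 9 years (in seconds) from now -- the same test to run against the
# renewed etcd certificates.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" -days 3650 \
  -keyout /tmp/demo.key -out /tmp/demo.crt 2>/dev/null
nine_years=$((9 * 365 * 24 * 3600))
openssl x509 -in /tmp/demo.crt -noout -checkend "$nine_years" \
  && echo "certificate outlives 9 years"
```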

  6. Generate the etcd static Pod manifest
    kubeadm init phase etcd local --config=/root/kubeadm-config.yaml

  7. Add the first new etcd node
    alias ectl="etcdctl --endpoints=10.120.37.100:2379,10.120.37.101:2379,10.120.42.103:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/apiserver-etcd-client.crt --key=/etc/kubernetes/pki/apiserver-etcd-client.key"
    ectl member add etcd1 --peer-urls="https://10.136.45.19:2380"
    ectl member list -w table  (verify the new member appears)
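Members are added one at a time because quorum is recomputed the moment a member is added, before the new member has even started. Majority for a cluster of n members is n/2 + 1 (integer division), so going from 3 to 4 still needs only 3 votes, which the live old nodes can supply; adding several unstarted members at once can push quorum beyond the number of live nodes:

```shell
# Quorum for an etcd cluster of n members is n/2 + 1 (integer division).
# With 3 live old members, quorum stays reachable at n=4 (needs 3) but not
# at n=6 (needs 4) -- hence one member add at a time.
for n in 3 4 5 6; do
  echo "n=$n quorum=$(( n / 2 + 1 ))"
done
```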

  8. Modify the configuration of the first new etcd node (10.136.45.19)
    Note 1: initial-cluster may include only one new node at a time, not all of them; otherwise etcd fails with "member count is unequal"
    Note 2: member add was invoked with --peer-urls=https://10.136.45.19:2380, so initial-cluster must use exactly 10.136.45.19:2380 for this member; otherwise peering fails
    Note 3: if peering still fails, replace every hostname in initial-cluster with its IP address

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.136.45.20:2379
      creationTimestamp: null
      labels:
        component: etcd
        tier: control-plane
      name: etcd
      namespace: kube-system
    spec:
      containers:
      - command:
        - etcd
        - --advertise-client-urls=https://10.136.45.20:2379
        - --auto-compaction-retention=1
        - --cert-file=/etc/kubernetes/pki/etcd/server.crt
        - --client-cert-auth=true
        - --data-dir=/ssd1/etcd
        - --initial-advertise-peer-urls=https://10.136.45.20:2380
        - --initial-cluster=etcd1=https://etcd1-test.yidian-inc.com:2380,etcd2=https://etcd2-test.yidian-inc.com:2380,etcd3=https://etcd3-test.yidian-inc.com:2380,etcd4=https://10.136.45.19:2380
        - --initial-cluster-state=existing
        - --key-file=/etc/kubernetes/pki/etcd/server.key
        - --listen-client-urls=https://10.136.45.20:2379
        - --listen-metrics-urls=http://127.0.0.1:2381
        - --listen-peer-urls=https://10.136.45.20:2380
        - --max-snapshots=5
        - --max-wals=8
        - --name=etcd5
        - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
        - --peer-client-cert-auth=true
        - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
        - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
        - --quota-backend-bytes=8589934592
        - --snapshot-count=10000
        - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
        image: k8s.gcr.io/etcd:3.4.3-0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            host: 127.0.0.1
            path: /health
            port: 2381
            scheme: HTTP
          initialDelaySeconds: 15
          timeoutSeconds: 15
        name: etcd
        resources: {}
        volumeMounts:
        - mountPath: /ssd1/etcd
          name: etcd-data
        - mountPath: /etc/kubernetes/pki/etcd
          name: etcd-certs
      hostNetwork: true
      priorityClassName: system-cluster-critical
      volumes:
      - hostPath:
          path: /etc/kubernetes/pki/etcd
          type: DirectoryOrCreate
        name: etcd-certs
      - hostPath:
          path: /ssd1/etcd
          type: DirectoryOrCreate
        name: etcd-data
    status: {}
  9. In the same way, add the second and the third new etcd nodes to the cluster
  10. On the first two nodes added (etcd1 and etcd2), update initial-cluster to list the same members as etcd3, so that all member lists match
  11. Check the state of the whole cluster: make sure every node is healthy and the data is fully synchronized, i.e. the "RAFT APPLIED INDEX" value is identical on all endpoints
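The "RAFT APPLIED INDEX" column comes from ectl endpoint status -w table; with -w json the same field can be compared mechanically. A sketch against captured sample output (the endpoint names and index values below are made up):

```shell
# Hypothetical capture of `ectl endpoint status -w json`; on a fully synced
# cluster every endpoint reports the same raftAppliedIndex (etcd v3.4+).
STATUS='[{"Endpoint":"10.120.37.100:2379","Status":{"raftAppliedIndex":812345}},{"Endpoint":"10.120.37.101:2379","Status":{"raftAppliedIndex":812345}},{"Endpoint":"10.120.42.103:2379","Status":{"raftAppliedIndex":812345}}]'
distinct=$(echo "$STATUS" | grep -o '"raftAppliedIndex":[0-9]*' | cut -d: -f2 | sort -u | wc -l)
echo "distinct applied indexes: $distinct"   # 1 means all endpoints are in sync
```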
  12. Because our calico configuration connects directly to etcd, calico must now be repointed at the new etcd cluster: first update the secrets, then the etcd endpoints in the calico configmap
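A hedged sketch of the configmap half of that change, assuming the stock calico-on-etcd manifest names (configmap calico-config, data key etcd_endpoints); the IPs are placeholders taken from the ectl alias above, so substitute your new cluster's client URLs:

```shell
# New client endpoints for calico; adjust to the new cluster's real IPs.
NEW_ENDPOINTS="https://10.120.37.100:2379,https://10.120.37.101:2379,https://10.120.42.103:2379"
# The actual update (not run here) would be along the lines of:
#   kubectl -n kube-system patch configmap calico-config --type merge \
#     -p "{\"data\":{\"etcd_endpoints\":\"$NEW_ENDPOINTS\"}}"
echo "$NEW_ENDPOINTS" | tr ',' '\n'
```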
  13. If everything so far checks out, continue with the remaining steps
  14. Remove the LVS entries, renew the etcd certificates on the master nodes, and update the etcd settings in the apiserver configuration
  15. On the new etcd nodes, remove the old nodes from initial-cluster and roll the change out node by node
  16. Begin taking the old etcd nodes offline
  17. Move the etcd leader (master) to one of the new nodes
  18. Take the last old etcd node offline
  19. Verify the migration
    Start a test Pod and check that it runs normally. If it does not start, a batch restart of calico-node may be needed; if it starts fine, nothing further is required.
