k8s cluster deployment and troubleshooting

Installation (Ubuntu example)

(1) sudo apt-get install docker.io
(2) sudo usermod -aG docker $USER
(3) sudo systemctl start docker && sudo systemctl enable docker
(4) curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
(5) sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
(6) sudo apt-get install kubeadm kubelet kubectl -y
(7) sudo swapoff -a 
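(8) Disable swap permanently, for example by commenting out the swap entry in /etc/fstab; swapoff -a only lasts until the next reboot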
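
Step (8) can be scripted. The following is a minimal sketch, assuming swap is configured through an fstab-style file; the function name disable_swap_in_fstab is hypothetical and not part of any tool mentioned above. It comments out active swap entries and keeps a .bak copy of the original file.

```shell
# Comment out every active swap entry in an fstab-style file.
# A backup of the original is kept next to it as FILE.bak.
disable_swap_in_fstab() {
    local fstab="${1:-/etc/fstab}"
    # Match lines that are not already comments and that contain a
    # whitespace-delimited "swap" field, then prefix them with "#".
    sed -E -i.bak '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Editing the real /etc/fstab needs root, so the function would be called from a root shell as disable_swap_in_fstab /etc/fstab. Trying it on a copy of the file first is a cheap way to check the result.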

Starting k8s (the key part)

Run these steps on the master machine

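(1) Be sure to use the address range 10.244.0.0/16; it is the pod network that the flannel manifest used in step (3) expects by default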
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
Note: at this point the cni0 and flannel.1 network interfaces do not exist yet
(2) Follow the prompt printed by kubeadm init and run the following commands:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

(3) Install the network plugin (flannel)
sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Check that all of the k8s cluster's services have started normally. If any fail to start, see "Troubleshooting" below.
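(4) Check the network interfaces (for example with ifconfig); flannel.1, and possibly cni0, should now appear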
(5) Record the kubeadm join xxx command line printed by kubeadm init; the slave machines use it to join the cluster
(6) The master machine is now configured
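
If the kubeadm join line from step (5) is lost, or its token has expired (bootstrap tokens are valid for 24 hours by default), kubeadm can print a new one. A minimal sketch, to be run on the master; the function name save_join_command and the default file name join-command.sh are hypothetical.

```shell
# Create a fresh bootstrap token on the master and save the matching
# join command to a file so it can be copied to the slave machines.
save_join_command() {
    local out="${1:-join-command.sh}"
    # kubeadm prints the complete "kubeadm join ..." line for the new token.
    sudo kubeadm token create --print-join-command > "$out"
    chmod 600 "$out"   # the file contains a token, so keep it private
    echo "join command saved to $out"
}
```

Calling save_join_command with no argument writes join-command.sh in the current directory; the saved line is then run with sudo on each slave.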

Run these steps on each slave machine
(1) sudo apt-get install docker.io
(2) sudo usermod -aG docker $USER
(3) sudo systemctl start docker && sudo systemctl enable docker
(4) curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
(5) sudo apt-add-repository "deb http://apt.kubernetes.io/ kubernetes-xenial main"
(6) sudo apt-get install kubeadm kubelet kubectl -y
(7) sudo swapoff -a 
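(8) Disable swap permanently
(9) sudo kubeadm reset (only needed if k8s was installed on this machine before; it resets the node to a clean state)
(10) Run the `kubeadm join xxx` command you recorded to join the slave to the master's cluster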

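Back on the master machine, run kubectl get node to check whether the slave has joined. If the configuration is correct, the slave appears in the list and its status changes from NotReady to Ready.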
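
The NotReady to Ready check can be automated. The following is a sketch, assuming kubectl is already configured on the master; the function names not_ready_nodes and wait_for_nodes are hypothetical. It treats any STATUS that does not start with "Ready" as pending.

```shell
# Read `kubectl get nodes --no-headers` output on stdin and print the
# name of every node whose STATUS column does not start with "Ready".
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Poll until every node is Ready, or give up after the given number of tries.
wait_for_nodes() {
    local tries="${1:-30}" nodes pending
    while [ "$tries" -gt 0 ]; do
        nodes=$(kubectl get nodes --no-headers 2>/dev/null)
        pending=$(printf '%s\n' "$nodes" | not_ready_nodes)
        # Succeed only if kubectl returned nodes and none are pending.
        if [ -n "$nodes" ] && [ -z "$pending" ]; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            echo "still waiting for: $pending"
            sleep 10
        fi
    done
    return 1
}
```

With the defaults, wait_for_nodes polls every 10 seconds for up to 30 tries and returns a non-zero status if some node never becomes Ready.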

Troubleshooting

(1) Problem: Unable to connect to the server: x509

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

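(2) Problem: coredns and other services fail to start
Cause: kubeadm reset does not clean up the virtual network interfaces; they have to be removed manually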

(1) Run ifconfig. You will see the flannel.1 interface and possibly cni0 as well; both need to be deleted
(2) sudo ip link delete flannel.1 && sudo ip link delete cni0
(3) sudo kubeadm reset to reset the k8s cluster
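
The chained delete in step (2) stops at the first interface that is missing. A slightly more forgiving sketch checks each interface before deleting it; the function name cleanup_cni_interfaces is hypothetical.

```shell
# Delete leftover flannel/CNI interfaces if they exist, so that the next
# kubeadm reset and init start from a clean network state.
cleanup_cni_interfaces() {
    local dev
    for dev in flannel.1 cni0; do
        if ip link show "$dev" >/dev/null 2>&1; then
            sudo ip link delete "$dev" && echo "deleted $dev"
        else
            echo "$dev not present, skipping"
        fi
    done
}
```

Running cleanup_cni_interfaces followed by sudo kubeadm reset covers steps (1) through (3) in one go.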

(3) Problem: "cni0" already has an IP address different from
sudo ip link delete cni0

Addendum: pulling a private image from AWS fails with "no basic auth credentials" and "trying and failing to pull image"

kubectl create secret generic regcred \
    --from-file=.dockerconfigjson=/home/ubuntu/.docker/config.json \
    --type=kubernetes.io/dockerconfigjson

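(1) This command creates a secret that authenticates image pulls from the private registry. Reference it in your deployment file through imagePullSecrets (the demo below also sets imagePullPolicy: Always).
If the secret cannot be created, run kubectl delete secret regcred and then run the command above again.
(2) I later found that the secret stops working every 12 hours. I suggest writing a cron script that deletes the secret every 12 hours and creates it again.
(3) Make sure /etc/docker/daemon.json on every node contains the snippet below, so that images can be pulled from a self-hosted private registry such as Harbor that is not served over port 443. Remember to restart docker after adding it.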

 /etc/docker/daemon.json
 {
 "insecure-registries" : ["172.26.192.107:80"]
}
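
The 12-hour refresh from point (2) above could be scripted roughly as follows. This is a sketch only: the function name refresh_regcred is hypothetical, and it assumes the Docker config file already holds valid credentials when it runs, which means the registry login has to be renewed before the secret is rebuilt.

```shell
# Recreate the regcred pull secret from the local Docker config.
# Intended to be called from cron, since the credentials expire.
refresh_regcred() {
    local config="${1:-/home/ubuntu/.docker/config.json}"
    # Remove the old secret; --ignore-not-found lets the first run succeed too.
    kubectl delete secret regcred --ignore-not-found
    kubectl create secret generic regcred \
        --from-file=.dockerconfigjson="$config" \
        --type=kubernetes.io/dockerconfigjson
}
```

Saved in a script that ends by calling refresh_regcred, this can be scheduled with a crontab entry such as 0 */12 * * * followed by the script path, or somewhat more often to leave a margin before the credentials expire.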

demo:

apiVersion: v1
kind: Service
metadata:
 name: py-main
 labels:
   app: py-main
spec:
 clusterIP: None
 ports:
   - port: 1000
     protocol: TCP
     name: port-1000
 selector:
   app: py-main
---
apiVersion: apps/v1
kind: Deployment
metadata:
 name: py-main
spec:
 selector:
   matchLabels:
     app: py-main
 template:
   metadata:
     labels:
       app: py-main
   spec:
     imagePullSecrets:
       - name: regcred
     containers:
       - name: py-main
         image: <private image address, without the http:// prefix>
         imagePullPolicy: Always
         livenessProbe:
           httpGet:
             port: 1000
           periodSeconds: 30
         ports:
           - containerPort: 1000
             protocol: TCP
         env:
           - name: ME
             value: "eng-server"
         volumeMounts:
           - mountPath: /code
             name: py-main
     volumes:
       - name: py-main
         hostPath:
           path: /home/ubuntu/code

Creating a secret for a private registry (e.g. Harbor)

kubectl create secret docker-registry regsecret --docker-server=192.166.2.74:80 --docker-username=admin --docker-password=Harbor12345

References:

https://tonybai.com/2019/10/21/how-to-deploy-a-kubernetes-cluster-with-ubuntu-server-18-04/
