Overview
In the previous post I set up a non-HA Kubernetes cluster.
A plain k8s cluster tolerates worker node failures, but a master node failure is a disaster: the k8s API server becomes unavailable and the whole cluster is left without a leader.
So this post extends that setup into a highly available k8s cluster.
The idea is to point every node's API requests to the master at a virtual IP (held by keepalived) and have a load balancer (haproxy) behind it forward them to one of the masters; the virtual IP itself is elected among all master nodes by priority.
In the previous post I already built a base VM image with the k8s prerequisites, so this post starts from restoring that image.
Virtual IP  192.168.1.200
m1          192.168.1.201
m2          192.168.1.202
w1          192.168.1.211
m1 is master node 1 and w1 is worker node 1; name yours however you like.
On every master node I need to:
1. Edit the hosts file to add hostname resolution
[root@m1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.200 VIP
192.168.1.201 m1
192.168.1.202 m2
192.168.1.211 w1
2. Set the hostname
[root@m1 ~]# hostnamectl set-hostname m1
You need to log out and back in for this to take effect.
3. Configure a static IP
[root@m1 ~]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=eth0
UUID=ea0a7d62-0d12-4fd6-897d-d7a87e031d6f
DEVICE=eth0
ONBOOT=yes
GATEWAY=192.168.1.1
IPADDR=192.168.1.201
NETMASK=255.255.255.0
DNS1=192.168.1.1
DNS2=8.8.8.8
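After editing the file, the new address still has to be applied; a minimal sketch, assuming CentOS 7 where the legacy network service manages these ifcfg files:

# apply the new static IP (a reboot works as well)
systemctl restart network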
4. Install and configure keepalived
yum install keepalived
[root@m1 ~]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     [email protected]
   }
   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_1
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    lvs_sync_daemon_interface eth0
    virtual_router_id 79
    advert_int 1
    priority 100
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.200
    }
}
On the other master nodes the priority decreases step by step: 90, 80, 70, and so on.
192.168.1.200 is the virtual IP planned above.
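For illustration, this is roughly what the vrrp_instance block looks like on m2 (a sketch, assuming the conventional MASTER/BACKUP split; the global_defs section stays the same as on m1):

vrrp_instance VI_1 {
    state BACKUP                  # m1 is MASTER, the other masters start as BACKUP
    interface eth0
    virtual_router_id 79          # must be identical on all masters
    advert_int 1
    priority 90                   # lower than m1's 100; a third master would use 80, etc.
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.200
    }
}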
[root@m1 ~]# systemctl restart keepalived
[root@m1 ~]# systemctl enable keepalived
Restart the service and enable it on boot.
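To confirm keepalived is working, the virtual IP should appear on eth0 of whichever master currently has the highest priority, for example:

# on the current VRRP master, 192.168.1.200 should be listed as an additional address
ip addr show eth0 | grep 192.168.1.200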
5. Install and configure haproxy
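haproxy is installed the same way as keepalived (assuming it is available from your configured yum repositories):

yum install haproxy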
[root@m1 ~]# cat /etc/haproxy/haproxy.cfg
global
    chroot /var/lib/haproxy
    daemon
    group haproxy
    user haproxy
    log 127.0.0.1:514 local0 warning
    pidfile /var/lib/haproxy.pid
    maxconn 20000
    spread-checks 3
    nbproc 8

defaults
    log global
    mode tcp
    retries 3
    option redispatch

listen https-apiserver
    bind *:8443
    mode tcp
    balance roundrobin
    timeout server 900s
    timeout connect 15s
    server m1 192.168.1.201:6443 check port 6443 inter 5000 fall 5
    server m2 192.168.1.202:6443 check port 6443 inter 5000 fall 5
The configuration is identical on every master node.
[root@m1 ~]# systemctl start haproxy
[root@m1 ~]# systemctl enable haproxy
Start the service and enable it on boot.
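A quick check that haproxy is up is to confirm it is listening on port 8443; note that the health checks will report the 6443 backends as down until the API servers have actually been initialized:

# haproxy should be listening on the apiserver frontend port
ss -lntp | grep 8443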
On the first master node I need to:
1. Create the kubeadm init configuration file
kubeadm-init.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.16.2
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
controlPlaneEndpoint: "192.168.1.200:8443"
networking:
  serviceSubnet: "10.96.0.0/16"
  podSubnet: "192.168.0.0/16"
  dnsDomain: "cluster.local"
192.168.1.200:8443 is our virtual IP endpoint; traffic to this port is handed to the load balancer and forwarded to port 6443 on one of the masters.
1.16.2 is the version of the kube trio (kubeadm/kubelet/kubectl) I am currently running.
192.168.0.0 is the pod subnet I planned.
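Optionally, the control-plane images can be pulled in advance using the same config file; kubeadm pulls them during init anyway, but doing it up front surfaces registry problems early:

kubeadm config images pull --config=kubeadm-init.yaml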
2. Initialize
kubeadm init --config=kubeadm-init.yaml --upload-certs
On success it prints two join commands: the first one is for joining additional master nodes, the second for joining worker nodes.
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.1.200:8443 --token 15wggu.lostbmuhkconfve8 \
    --discovery-token-ca-cert-hash sha256:60bad8d33b8bea0ce9820746453e5ff28f8faa522f97e1018f56824ed29cab89 \
    --control-plane --certificate-key 48b866c774671ede8d42b2188717d8e33e026d4fa7c1e9220a93e38a9bede443

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.1.200:8443 --token 15wggu.lostbmuhkconfve8 \
    --discovery-token-ca-cert-hash sha256:60bad8d33b8bea0ce9820746453e5ff28f8faa522f97e1018f56824ed29cab89
If the certificate key has expired, upload the certs again:
sudo kubeadm init phase upload-certs --upload-certs
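If the bootstrap token itself has also expired, a fresh worker join command can be printed at any time; for a control-plane join, append --control-plane and the certificate key produced by the upload-certs command above:

# prints a new "kubeadm join ... --token ... --discovery-token-ca-cert-hash ..." line
kubeadm token create --print-join-command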
3. Install the network add-on
For the latest instructions see https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
[root@m1 ~]# wget https://kuboard.cn/install-script/calico/calico-3.9.2.yaml   # its default pod subnet is 192.168.0.0; edit the manifest if your pod subnet differs
[root@m1 ~]# kubectl apply -f calico-3.9.2.yaml
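It can take a little while for the calico pods to come up; the node turns Ready once they are running:

kubectl get pods -n kube-system   # calico-node and calico-kube-controllers should reach Running
kubectl get nodes                 # m1 should go from NotReady to Ready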
4. Remove the taint
Check the taint:
[root@m1 ~]# kubectl describe node m1|grep -i taints
Taints: node-role.kubernetes.io/master:NoSchedule
Remove the default taint:
[root@m1 ~]# kubectl taint nodes m1 node-role.kubernetes.io/master-
node/m1 untainted
After all masters have joined the cluster, remove the taint from every master:
kubectl taint nodes --all node-role.kubernetes.io/master-
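To confirm the taints are gone:

# every master should now report "Taints: <none>"
kubectl describe nodes | grep -i taints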
On the other master nodes I need to:
Run
kubeadm join 192.168.1.200:8443 --token 15wggu.lostbmuhkconfve8 \
    --discovery-token-ca-cert-hash sha256:60bad8d33b8bea0ce9820746453e5ff28f8faa522f97e1018f56824ed29cab89 \
    --control-plane --certificate-key 48b866c774671ede8d42b2188717d8e33e026d4fa7c1e9220a93e38a9bede443
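For kubectl to work on this master too, copy the admin kubeconfig into place just like on the first master (kubeadm prints the same hint after a control-plane join):

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config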
Removing the taint is the same as above.
On the other worker nodes I need to:
kubeadm join 192.168.1.200:8443 --token 15wggu.lostbmuhkconfve8 \
    --discovery-token-ca-cert-hash sha256:60bad8d33b8bea0ce9820746453e5ff28f8faa522f97e1018f56824ed29cab89
You can now run master commands such as kubectl get node on any of the master nodes.
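As a quick sanity check that everything is going through the virtual IP:

kubectl get nodes -o wide    # all masters and workers should show Ready
kubectl cluster-info         # the reported endpoint should be the VIP, https://192.168.1.200:8443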
Finally
Since I am running the VMs on a (low-power) laptop, I only deployed 2 master nodes.
At least three master nodes are recommended, and in a real deployment the taint on the master nodes should not be removed.