The guidance in the official documentation has problems: following its steps as written does not produce a working highly available K8S environment, so the steps that do work are recorded here.
This installation is based on CentOS 7.6 + Docker 19.03 + Kubernetes 1.17.3 + HAProxy 1.5.18
Prerequisites:
- Install Docker on the 3 master nodes and the 3 worker nodes
- Install Kubernetes on the 3 master nodes and the 3 worker nodes
- Install HAProxy on the LB node
- Set up mutual SSH trust among the 3 master nodes, so that files can be copied with scp later
- Disable SELinux and firewalld on all nodes, and open the required ports in iptables (a minimal sketch follows this list)
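A minimal sketch of that last prerequisite, assuming the stock CentOS 7.6 tooling; run it on every node (the port list is the usual kubeadm set):
setenforce 0 # takes effect immediately
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config # persists across reboots
systemctl disable --now firewalld
iptables -A INPUT -p tcp --dport 6443 -j ACCEPT # kube-apiserver (masters)
iptables -A INPUT -p tcp --dport 2379:2380 -j ACCEPT # etcd (masters)
iptables -A INPUT -p tcp --dport 10250:10252 -j ACCEPT # kubelet, controller-manager, scheduler
iptables -A INPUT -p tcp --dport 30000:32767 -j ACCEPT # NodePort services (workers)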
Steps:
- On the 3 master nodes, run kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers to download the required images
- Set the hostnames of the 3 master nodes to master-01/02/03, the 3 worker nodes to work-01/02/03, and the LB node to loadblance
hostnamectl set-hostname master-01 # the command is analogous on the other 6 nodes
- Install etcd on the 3 master nodes
yum install -y etcd && systemctl enable etcd
- Configure etcd on the masters
Run the following script on master-01
etcd1=10.128.132.234
etcd2=10.128.132.232
etcd3=10.128.132.231
TOKEN=LNk8sTest
ETCDHOSTS=($etcd1 $etcd2 $etcd3)
NAMES=("infra0" "infra1" "infra2")
# Generate one config file per etcd member
for i in "${!ETCDHOSTS[@]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
cat << EOF > /tmp/$NAME.conf
# [member]
ETCD_NAME=$NAME
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://$HOST:2380"
ETCD_LISTEN_CLIENT_URLS="http://$HOST:2379,http://127.0.0.1:2379"
#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://$HOST:2380"
ETCD_INITIAL_CLUSTER="${NAMES[0]}=http://${ETCDHOSTS[0]}:2380,${NAMES[1]}=http://${ETCDHOSTS[1]}:2380,${NAMES[2]}=http://${ETCDHOSTS[2]}:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="$TOKEN"
ETCD_ADVERTISE_CLIENT_URLS="http://$HOST:2379"
EOF
done
# Push each config to its member over the SSH trust set up earlier
for i in "${!ETCDHOSTS[@]}"; do
HOST=${ETCDHOSTS[$i]}
NAME=${NAMES[$i]}
scp /tmp/$NAME.conf $HOST:
ssh $HOST "\mv -f $NAME.conf /etc/etcd/etcd.conf" # \mv bypasses any mv alias
rm -f /tmp/$NAME.conf
done
- Start etcd on the 3 master nodes
systemctl start etcd
- Check the etcd cluster state by running the following commands on any master node
[root@master-01 ~]# etcdctl member list
30bf939e6a7c2da9: name=infra1 peerURLs=http://10.128.132.232:2380 clientURLs=http://10.128.132.232:2379 isLeader=false
49194e6617aabed9: name=infra2 peerURLs=http://10.128.132.231:2380 clientURLs=http://10.128.132.231:2379 isLeader=false
7564c96b6750649c: name=infra0 peerURLs=http://10.128.132.234:2380 clientURLs=http://10.128.132.234:2379 isLeader=true
[root@master-01 ~]# etcdctl cluster-health
member 30bf939e6a7c2da9 is healthy: got healthy result from http://10.128.132.232:2379
member 49194e6617aabed9 is healthy: got healthy result from http://10.128.132.231:2379
member 7564c96b6750649c is healthy: got healthy result from http://10.128.132.234:2379
cluster is healthy
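The commands above use etcdctl's v2 API, the default in the CentOS 7 etcd package; assuming etcd 3.3 or newer, the same health check via the v3 API would be:
ETCDCTL_API=3 etcdctl --endpoints="http://10.128.132.234:2379,http://10.128.132.232:2379,http://10.128.132.231:2379" endpoint health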
- Configure HAProxy on the LB node
Run the following script on the LB node
[root@loadblance ~]# cat lbconfig.sh
master1=10.128.132.234
master2=10.128.132.232
master3=10.128.132.231
yum install -y haproxy
systemctl enable haproxy
# Append a TCP listener on 8443 that balances across the three apiservers
cat << EOF >> /etc/haproxy/haproxy.cfg
listen k8s-lb
    bind 0.0.0.0:8443
    mode tcp
    balance source
    timeout server 900s
    timeout connect 15s
    server master-01 $master1:6443 check
    server master-02 $master2:6443 check
    server master-03 $master3:6443 check
EOF
systemctl restart haproxy # load the new listener
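A quick sanity check that the listener is up; the backends will stay DOWN until the apiservers are initialized in the next step:
systemctl status haproxy
ss -lnt | grep 8443 # expect a LISTEN socket on 0.0.0.0:8443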
- Initialize the cluster on master-01
Run the following script
[root@master-01 ~]# cat initCluster.sh
proxy=10.128.132.230
etcd1=10.128.132.234
etcd2=10.128.132.232
etcd3=10.128.132.231
cat << EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: stable
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
apiServer:
  certSANs:
  - "$proxy"
controlPlaneEndpoint: "$proxy:8443"
etcd:
  external:
    endpoints:
    - "http://$etcd1:2379"
    - "http://$etcd2:2379"
    - "http://$etcd3:2379"
networking:
  podSubnet: "10.244.0.0/16"
EOF
kubeadm init --config kubeadm-config.yaml --v=5
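Once kubeadm init completes, a quick check, as root on master-01 using the admin kubeconfig it just wrote, that the control-plane pods came up:
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get pods -n kube-system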
- Copy the cluster certificates to the other master nodes
Copy the following certificates
/etc/kubernetes/pki/ca.crt
/etc/kubernetes/pki/ca.key
/etc/kubernetes/pki/sa.key
/etc/kubernetes/pki/sa.pub
/etc/kubernetes/pki/front-proxy-ca.crt
/etc/kubernetes/pki/front-proxy-ca.key
Create the directory /etc/kubernetes/pki on master-02 and master-03 and copy the certificates above into it, for example with the loop below.
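A minimal sketch of the copy, run on master-01; it relies on the SSH trust between the masters from the prerequisites (the IPs are master-02 and master-03):
for host in 10.128.132.232 10.128.132.231; do
ssh $host "mkdir -p /etc/kubernetes/pki"
scp /etc/kubernetes/pki/{ca.crt,ca.key,sa.key,sa.pub,front-proxy-ca.crt,front-proxy-ca.key} $host:/etc/kubernetes/pki/
done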
- Join master-02 and master-03 to the cluster
Using the join command printed after initialization succeeded on master-01, join the remaining masters; for reference:
kubeadm join 10.128.132.230:8443 --token pk72tg.30u2cs41v2i4jk0y \
--discovery-token-ca-cert-hash sha256:24b7ff6c9ca456a9155e8f1d0e72500abc71db122a1728afc9d3e14883779c9b \
--control-plane
- Join work-01/02/03 to the cluster; the command is the same as in the previous step, just without the --control-plane flag, for reference:
kubeadm join 10.128.132.230:8443 --token pk72tg.30u2cs41v2i4jk0y \
--discovery-token-ca-cert-hash sha256:24b7ff6c9ca456a9155e8f1d0e72500abc71db122a1728afc9d3e14883779c9b
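If the bootstrap token has expired by the time a node joins (kubeadm tokens default to a 24-hour TTL), a fresh worker join command can be printed on master-01 with:
kubeadm token create --print-join-command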
- Install the flannel network plugin under a non-root account
Create a regular account apple, grant it sudo rights, then follow the hints printed after cluster initialization; for reference:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Switch to the apple account and install the flannel network plugin
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
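Assuming the manifest still labels its pods with app=flannel (true at the time of writing), this shows whether the plugin reached Running on every node:
kubectl get pods -n kube-system -l app=flannel -o wide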
- Check that all nodes have joined and that the cluster is healthy
kubectl get cs # check etcd cluster status
kubectl get pods -o wide -n kube-system # check system service status
kubectl get nodes # check cluster node status
- Add Roles to the worker nodes
kubectl label nodes work-01 node-role.kubernetes.io/work=
kubectl label nodes work-02 node-role.kubernetes.io/work=
kubectl label nodes work-03 node-role.kubernetes.io/work=
Check the final state
[apple@master-03 root]$ kubectl get nodes
NAME        STATUS   ROLES    AGE   VERSION
master-01   Ready    master   8d    v1.17.3
master-02   Ready    master   12h   v1.17.3
master-03   Ready    master   12h   v1.17.3
work-01     Ready    work     11h   v1.17.3
work-02     Ready    work     11h   v1.17.3
work-03     Ready    work     11h   v1.17.3
Notes
- API versions change between Kubernetes releases; if the installation fails because some kind cannot be found, look up the API version that kind maps to in the Kubernetes release you are using
- If initialization fails, use curl to check whether the LB node and the master nodes can reach each other, as in the example below
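For example, once at least one apiserver is up, this should return ok from any node, since /healthz is readable anonymously under the default RBAC rules:
curl -k https://10.128.132.230:8443/healthz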
References
Deploying a highly available cluster
Adding a Label to a Node in k8s