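1. Environment
A quick note on the servers I used:
All servers run CentOS 7.6. The scripts have not been tested on any other OS version.
Deployment script package
Link: https://pan.baidu.com/s/1S8yXIKTqQpXmF3SnULdCgQ
Extraction code: 16up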
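2. Preparation
1) Modify the following files
/data/config/environment.sh # change the IPs to those of the machines you are deploying to
/data/config/Kcsh/hosts # change the IPs to those of the machines you are deploying to
/data/config/Ketcd/etcd-csr.json # change the IPs to those of the machines you are deploying to
/data/config/Kmaster/Kha/haproxy.cfg # change the IPs to those of the machines you are deploying to
/data/config/Kmaster/Kapi/kubernetes-csr.json # change the IPs to those of the machines you are deploying to
/data/config/Kmaster/Kmanage/kube-controller-manager-csr.json # change the IPs to those of the machines you are deploying to
/data/config/Kmaster/Kscheduler/kube-scheduler-csr.json # change the IPs to those of the machines you are deploying to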
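Editing the same addresses in seven files by hand is easy to get wrong, so here is a minimal sketch of a helper that swaps one IP for another across a list of files. The function name replace_ip and the example addresses are my own illustration and are not part of the script package; it assumes GNU sed (as shipped with CentOS) and edits the files in place, so back them up first.

```shell
#!/bin/bash
# Hypothetical helper: replace old_ip with new_ip in the given files, in place.
# Not part of the deployment package; assumes GNU sed (-i and \b).
replace_ip() {
    local old_ip="$1" new_ip="$2" pattern
    shift 2
    # Escape the dots so they match literally instead of "any character".
    pattern=$(printf '%s' "$old_ip" | sed 's/\./\\./g')
    # \b keeps 10.0.0.7 from matching inside 10.0.0.76.
    sed -i "s/\b${pattern}\b/${new_ip}/g" "$@"
}
```

For example, replace_ip 10.0.0.76 192.168.1.10 /data/config/environment.sh /data/config/Kcsh/hosts would update two of the files listed above. It is still worth opening each file afterwards to confirm the result.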
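2) Basic configuration
On the kube-master host, distribute the SSH public key to all machines in bulk, non-interactively.
Note: follow the steps below exactly; otherwise the deployment script further down may not run to completion.
yum install -y sshpass # install sshpass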
cat /data/ip.txt
10.0.0.76
10.0.0.97
10.0.0.130
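Script that distributes the public key and pushes the hosts file in bulk
#!/bin/bash
echo "################ Bulk public key distribution (non-interactive) ####################"
# Source the init function library; it provides the action helper used below
. /etc/init.d/functions
# create key pair (this removes any existing /root/.ssh/id_rsa key first)
rm -fr /root/.ssh/id_rsa*
ssh-keygen -t rsa -f /root/.ssh/id_rsa -P "" -q
IPtest=$(cat /data/ip.txt)
# Push the public key to every host
for ip in $IPtest
do
    echo "======= pushing key =========="
    # This only works if all hosts share the same root password
    echo "$ip"
    sshpass -p 'BK#6u12G+rARoVoc-+9' ssh-copy-id -i /root/.ssh/id_rsa.pub -o StrictHostKeyChecking=no "root@$ip" &>/dev/null
    if [ $? -eq 0 ]
    then
        action "host $ip [key distributed]" /bin/true
    else
        action "host $ip [key distribution failed]" /bin/false
    fi
done
# Push the hosts file to every host
for ip in $IPtest
do
    echo "$ip"
    echo "======= pushing hosts =========="
    scp /data/magic/config/Kcsh/hosts "root@$ip:/etc/hosts"
    if [ $? -eq 0 ]
    then
        action "host $ip [hosts pushed]" /bin/true
    else
        action "host $ip [hosts push failed]" /bin/false
    fi
done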
Set the hostnames. All-lowercase hostnames are recommended. In the commands below, replace each <...-ip> placeholder with the address of the corresponding machine.
ssh -o StrictHostKeyChecking=no root@<kube-master-ip> "hostname kube-master"
ssh -o StrictHostKeyChecking=no root@<kube-node01-ip> "hostname kube-node01"
ssh -o StrictHostKeyChecking=no root@<kube-node02-ip> "hostname kube-node02"
If a hostname contains uppercase letters, kubelet fails with: tokenGroups: Invalid value: []string{"system:bootstrappers:KUBE-MASTER"}: bootstrap group "system:bootstrappers:KUBE-MASTER" is invalid (must match system:bootstrappers:[a-z0-9:-]{0,255}[a-z0-9])
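The pattern in that error message can be checked before deploying. The following is a small sketch that tests a candidate hostname against it; the regular expression is copied from the error text, while the function name and the assumption that the group is always system:bootstrappers: followed by the hostname come from the message above rather than from the deployment scripts.

```shell
#!/bin/bash
# Returns 0 if "system:bootstrappers:<hostname>" matches the pattern quoted
# in the kubelet error message, non-zero otherwise.
# Hypothetical helper name; the regex is taken verbatim from the error text.
is_valid_bootstrap_hostname() {
    local group="system:bootstrappers:$1"
    # LC_ALL=C makes [a-z] strictly lowercase ASCII regardless of locale.
    printf '%s\n' "$group" |
        LC_ALL=C grep -Eqx 'system:bootstrappers:[a-z0-9:-]{0,255}[a-z0-9]'
}
```

Calling is_valid_bootstrap_hostname kube-master succeeds, while is_valid_bootstrap_hostname KUBE-MASTER fails, which matches the behaviour described above.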
In magic.sh, change the root@<hostname> targets so they match your hostnames:
sed -n '322,324p' magic.sh
scp $base_dir/config/Kmaster/Kha/keepalived-master.conf root@kube-master:/etc/keepalived/keepalived.conf
scp $base_dir/config/Kmaster/Kha/keepalived-backup.conf root@kube-node01:/etc/keepalived/keepalived.conf
scp $base_dir/config/Kmaster/Kha/keepalived-backup.conf root@kube-node02:/etc/keepalived/keepalived.conf
3. Deployment
Deployment is simple: just run the magic.sh script.
[root@kube-master data]# ll
total 508944
drwxr-xr-x 9 root root 4096 Sep 2 17:48 config
-rw-r--r-- 1 root root 26174 Sep 2 17:41 magic.sh # run this script directly, e.g. bash magic.sh
-rw-r--r-- 1 root root 521113600 Sep 2 17:02 magic.tar.gz
drwxr-xr-x 2 root root 4096 Sep 2 17:42 pack
drwxr-xr-x 2 root root 4096 Sep 3 10:36 script
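A few points are worth noting, though:
- 1. Before starting the deployment, carefully check that every configuration file matches your requirements, and adjust anything that does not.
- 2. If the deployment stalls or exits abnormally, troubleshoot the stage it stopped at and then run the deployment script again; it resumes from where it left off.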
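As one example of such a pre-deployment check, here is a sketch that validates an IP list file such as /data/ip.txt, flagging malformed addresses and duplicates. The function names are hypothetical and the check only covers the address list, not the other configuration files.

```shell
#!/bin/bash
# Returns 0 if the argument is a dotted-quad IPv4 address with octets 0..255.
# Helper names in this sketch are illustrative only.
is_ipv4() {
    local ip="$1" octet
    # Shape check: four groups of 1-3 digits separated by dots.
    printf '%s\n' "$ip" | grep -Eqx '[0-9]{1,3}(\.[0-9]{1,3}){3}' || return 1
    # Range check: every octet must be at most 255.
    local IFS=.
    for octet in $ip; do
        [ "$octet" -le 255 ] || return 1
    done
}

# Reads a file with one address per line and reports malformed or duplicated
# entries. Returns non-zero if any problem was found.
check_ip_list() {
    local file="$1" ip rc=0 seen=" "
    while read -r ip || [ -n "$ip" ]; do
        [ -z "$ip" ] && continue          # skip blank lines
        if ! is_ipv4 "$ip"; then
            echo "invalid address: $ip"; rc=1
        else
            case "$seen" in
                *" $ip "*) echo "duplicate address: $ip"; rc=1 ;;
            esac
        fi
        seen="$seen$ip "
    done < "$file"
    return "$rc"
}
```

Running check_ip_list /data/ip.txt before magic.sh prints nothing and returns 0 when the list is clean.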
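4. Basic verification
Once the deployment has finished, the following checks give a first indication of whether the cluster is usable:
1) Check that all services have started normally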
cat > magic01_ckeck_server.sh << "EOF"
#!/bin/bash
# Check that all services have started normally
set -e
source /opt/k8s/bin/environment.sh
##set color##
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
##set color##
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status etcd|grep Active"
ssh root@${node_ip} "systemctl status flanneld|grep Active"
ssh root@${node_ip} "systemctl status haproxy|grep Active"
ssh root@${node_ip} "systemctl status keepalived|grep Active"
ssh root@${node_ip} "systemctl status kube-apiserver |grep 'Active:'"
ssh root@${node_ip} "systemctl status kube-controller-manager|grep Active"
ssh root@${node_ip} "systemctl status kube-scheduler|grep Active"
ssh root@${node_ip} "systemctl status docker|grep Active"
ssh root@${node_ip} "systemctl status kubelet | grep Active"
ssh root@${node_ip} "systemctl status kube-proxy|grep Active"
done
EOF
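The script above prints the raw Active: line for each service, which still has to be read by eye. A possible refinement, sketched here with hypothetical function names, is to reduce each line to its state word and print OK or FAIL per service.

```shell
#!/bin/bash
# Reads `systemctl status <unit>` output on stdin and prints the state word
# from its "Active:" line, e.g. "active" or "inactive".
active_state() {
    awk '$1 == "Active:" { print $2; exit }'
}

# Reads the same output on stdin and prints a one-line verdict for the
# service named in $1. Returns non-zero unless the state is "active".
report_service() {
    local name="$1" state
    state=$(active_state)
    if [ "$state" = "active" ]; then
        echo "OK   $name"
    else
        echo "FAIL $name (${state:-unknown})"
        return 1
    fi
}
```

Inside the loop this could be used as ssh root@${node_ip} "systemctl status etcd" | report_service etcd. Note that with set -e in effect, the first failing service would end the script, so that line may need to be dropped if a full report is wanted.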
2) Check that the services are usable
2.1) Verify the etcd cluster
cat > magic02_verify_etcd.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Verify that every etcd endpoint is healthy
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
--endpoints=https://${node_ip}:2379 \
--cacert=/etc/kubernetes/cert/ca.pem \
--cert=/etc/etcd/cert/etcd.pem \
--key=/etc/etcd/cert/etcd-key.pem endpoint health
done
EOF
2.2) Verify the flannel network
List the Pod subnets that have been allocated:
source /opt/k8s/bin/environment.sh
etcdctl \
--endpoints=${ETCD_ENDPOINTS} \
--ca-file=/etc/kubernetes/cert/ca.pem \
--cert-file=/etc/flanneld/cert/flanneld.pem \
--key-file=/etc/flanneld/cert/flanneld-key.pem \
ls ${FLANNEL_ETCD_PREFIX}/subnets
Output:
/kubernetes/network/subnets/172.30.84.0-24
/kubernetes/network/subnets/172.30.8.0-24
/kubernetes/network/subnets/172.30.29.0-24
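Each key in that listing encodes a subnet as address-prefixlength. A short sketch of how such a key can be turned into CIDR notation, or into the bare network address, using only shell parameter expansion; the function names are made up for illustration.

```shell
#!/bin/bash
# Converts a flannel subnet key into CIDR notation, e.g.
# /kubernetes/network/subnets/172.30.84.0-24 -> 172.30.84.0/24
subnet_key_to_cidr() {
    local key="${1##*/}"            # strip everything up to the last slash
    echo "${key%-*}/${key##*-}"     # address before the dash, prefix after it
}

# Prints only the network address part of the key, e.g. 172.30.84.0
subnet_key_to_addr() {
    local key="${1##*/}"
    echo "${key%-*}"
}
```

The addresses printed by subnet_key_to_addr are the ones to ping when checking connectivity between nodes over the Pod network.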
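2.3) Verify that the nodes can reach each other over the Pod network
Replace the subnet addresses in the script with the ones allocated in your own cluster.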
cat > magic03_ping_IP.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Verify that every node can reach each Pod subnet
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "ping -c 2 172.30.8.0"
ssh ${node_ip} "ping -c 2 172.30.29.0"
ssh ${node_ip} "ping -c 2 172.30.84.0"
done
EOF
2.4) Verify the high-availability components
Find the node that currently holds the VIP and make sure the VIP answers ping:
cat > magic04_verify_module.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Show which node holds the VIP and make sure the VIP answers ping
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "/usr/sbin/ip addr show ${VIP_IF}"
ssh ${node_ip} "ping -c 1 ${MASTER_VIP}"
done
EOF
2.5) High-availability failover test
Check the current leader:
kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kube-master_5b7afd9c-0c81-11ec-b95c-525400b20a8c","leaseDurationSeconds":15,"acquireTime":"2021-09-03T06:37:06Z","renewTime":"2021-09-03T06:46:06Z","leaderTransitions":0}'
creationTimestamp: 2021-09-03T06:37:06Z
name: kube-controller-manager
namespace: kube-system
resourceVersion: "957"
selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
uid: 5b7c2c9c-0c81-11ec-8ec4-5254006e8cb5
As the holderIdentity field shows, the current leader is the kube-master node.
Now stop kube-controller-manager on kube-master.
systemctl stop kube-controller-manager
systemctl status kube-controller-manager |grep Active
Active: inactive (dead) since Fri 2021-09-03 14:47:40 CST; 4s ago
After about a minute, check the current leader again:
kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
annotations:
control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kube-node01_5bf8ccfa-0c81-11ec-9bdb-525400cc4651","leaseDurationSeconds":15,"acquireTime":"2021-09-03T06:47:57Z","renewTime":"2021-09-03T06:48:15Z","leaderTransitions":1}'
creationTimestamp: 2021-09-03T06:37:06Z
name: kube-controller-manager
namespace: kube-system
resourceVersion: "1117"
selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
uid: 5b7c2c9c-0c81-11ec-8ec4-5254006e8cb5
The leader has automatically moved to kube-node01, and leaderTransitions has increased from 0 to 1.
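Reading the leader out of the YAML by eye gets tedious when the test is repeated. The sketch below extracts just the node name from the annotation, assuming holderIdentity always has the form hostname_uuid as in the output above; the function name is hypothetical.

```shell
#!/bin/bash
# Reads `kubectl get endpoints kube-controller-manager -o yaml` output on
# stdin and prints the node-name part of holderIdentity, i.e. the text
# before the first underscore. Assumes the identity looks like <host>_<uuid>.
current_leader() {
    sed -n 's/.*"holderIdentity":"\([^_"]*\)_.*/\1/p' | head -n 1
}
```

Usage would be kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml | current_leader, run once before and once after stopping the service.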
2.6) Check that kube-proxy works
Inspect the ipvs routing rules:
cat > magic05_check_ipvs_rule.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Show the ipvs routing rules on every node
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh root@${node_ip} "/usr/sbin/ipvsadm -ln"
done
EOF
Output:
[root@kube-master script]# bash magic05_check_ipvs_rule.sh
>>> 10.0.0.76
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.254.0.1:443 rr persistent 10800
-> 10.0.0.76:6443 Masq 1 0 0
-> 10.0.0.97:6443 Masq 1 0 0
-> 10.0.0.130:6443 Masq 1 0 0
>>> 10.0.0.97
.......
>>> 10.0.0.130
......
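In the output above, the kubernetes service address 10.254.0.1:443 is balanced across the three apiservers on port 6443. As a rough sketch of how this could be checked automatically, the following counts the real servers listed under a given virtual service; the function name is my own, and the parsing assumes the ipvsadm -ln layout shown above.

```shell
#!/bin/bash
# Reads `ipvsadm -ln` output on stdin and prints how many real servers are
# listed under the virtual service given as $1 (e.g. 10.254.0.1:443).
count_real_servers() {
    awk -v vs="$1" '
        # A TCP/UDP line starts a new virtual service section.
        $1 == "TCP" || $1 == "UDP" { in_vs = ($2 == vs); next }
        # Lines starting with "->" inside the section are real servers.
        in_vs && $1 == "->" && $2 != "RemoteAddress:Port" { n++ }
        END { print n + 0 }
    '
}
```

For a three-node cluster like this one, ssh root@${node_ip} "/usr/sbin/ipvsadm -ln" | count_real_servers 10.254.0.1:443 would be expected to print 3 on every node.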
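2.7) Add kubectl to the PATH environment variable
If the shell cannot find the kubectl command, fix it as follows:
1) Locate kubectl with: find / -name kubectl
In my environment it is in /opt/k8s/bin/
2) Add that directory to the system PATH: open the profile with vim /etc/profile and add the line export PATH="/opt/k8s/bin/:$PATH"
3) Reload the profile: source /etc/profile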
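Sourcing a profile that unconditionally prepends a directory adds another copy of it to PATH every time. A small sketch of an idempotent alternative that could be placed in /etc/profile instead; the function name path_prepend is hypothetical.

```shell
#!/bin/bash
# Prepends a directory to PATH only if it is not already there.
path_prepend() {
    local dir="${1%/}"    # drop a trailing slash so both spellings compare equal
    case ":$PATH:" in
        *":$dir:"*|*":$dir/:"*) ;;        # already present, nothing to do
        *) PATH="$dir:$PATH" ;;
    esac
}

# Example: path_prepend /opt/k8s/bin/ && export PATH
```

Calling it twice with /opt/k8s/bin/ leaves a single /opt/k8s/bin entry at the front of PATH.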
List the cluster nodes:
[root@kube-master data]# kubectl get node
NAME          STATUS    ROLES     AGE       VERSION
kube-master   Ready     <none>    11m       v1.10.4
kube-node01   Ready     <none>    11m       v1.10.4
kube-node02   Ready     <none>    11m       v1.10.4
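Create a test application:
cat > nginx-ds.yml <
Then create the resources from the manifest. Before doing so, you may want to pull the image it references onto each node in advance.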
[root@kube-master script]# kubectl create -f nginx-ds.yml
service "nginx-ds" created
daemonset.extensions "nginx-ds" created
Check Pod IP connectivity from every node
[root@kube-master script]# kubectl get pods -o wide|grep nginx-ds
nginx-ds-kjclg 1/1 Running 0 4m 172.30.26.2 kube-master
nginx-ds-nl2c7 1/1 Running 0 4m 172.30.30.2 kube-node02
nginx-ds-vczsg 1/1 Running 0 4m 172.30.98.2 kube-node01
The nginx-ds Pod IPs are 172.30.26.2, 172.30.30.2 and 172.30.98.2. Ping these three IPs from every node to confirm they are reachable:
cat > magic06_ping_IP.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Ping each Pod IP from every node
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "ping -c 1 172.30.26.2"
ssh ${node_ip} "ping -c 1 172.30.30.2"
ssh ${node_ip} "ping -c 1 172.30.98.2"
done
EOF
Check that the service IP and port are reachable
[root@kube-master script]# kubectl get svc |grep nginx-ds
nginx-ds NodePort 10.254.255.104 <none> 80:8556/TCP 5m
Curl the Service IP from every node:
cat > magic07_All_Node_Service_IP.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Curl the Service IP from every node
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
ssh ${node_ip} "curl 10.254.255.104"
done
EOF
>>> 10.0.0.76
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 612 100 612 0 0 738k 0 --:--:-- --:--:-- --:--:-- 597k
Welcome to nginx!
Welcome to nginx!
If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.
For online documentation and support please refer to
nginx.org.
Commercial support is available at
nginx.com.
Thank you for using nginx.
# Note: seeing "Thank you for using nginx" means the test passed
>>> 10.0.0.97
.....
>>> 10.0.0.130
.....
Check that the service's NodePort is reachable
cat > magic08_ckeck_service_NodePort.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
echoRed() { echo $'\e[0;31m'"$1"$'\e[0m'; }
echoGreen() { echo $'\e[0;32m'"$1"$'\e[0m'; }
echoYellow() { echo $'\e[0;33m'"$1"$'\e[0m'; }
# Check that the service's NodePort is reachable on every node
for node_ip in ${NODE_IPS[@]}
do
echoGreen ">>> ${node_ip}"
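# 8556 is the NodePort reported by kubectl get svc for nginx-ds; use the port from your own cluster
ssh ${node_ip} "curl ${node_ip}:8556"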
done
EOF
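Since the NodePort is assigned per cluster, hard-coding it in the script is fragile. Here is a minimal sketch that reads it from a kubectl get svc line instead; the function name is hypothetical and the parsing assumes the port:nodePort/PROTOCOL layout shown in the service listing earlier.

```shell
#!/bin/bash
# Reads a `kubectl get svc` line on stdin and prints the NodePort, i.e. the
# number between ":" and "/" in the PORT(S) column (80:8556/TCP -> 8556).
node_port_of() {
    sed -n 's#.* [0-9][0-9]*:\([0-9][0-9]*\)/[A-Z][A-Z]*.*#\1#p' | head -n 1
}
```

For instance, port=$(kubectl get svc | grep nginx-ds | node_port_of) stores the port for use in the curl command. If text parsing feels too brittle, kubectl's jsonpath output, for example kubectl get svc nginx-ds -o jsonpath='{.spec.ports[0].nodePort}', should give the same value directly.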