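Deploying a k8s cluster with kubeadm
Server environment (each machine: 2 GB or more RAM, 2 or more CPUs):
Kubernetes Master1 node: 172.20.26.34
Kubernetes Master2 node: 172.20.26.36
Kubernetes Node1 node: 172.20.26.37
Kubernetes Node2 node: 172.20.26.38
Operating system: CentOS 7.9
Install the everyday tools on every server before deploying: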
yum install vim net-tools lrzsz epel-release -y
yum update
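I. K8S node hosts and firewall settings
Apply the following configuration on Master1, Master2, Node1 and Node2:
# Add hosts entries (appended, so the existing localhost lines are kept);
cat >>/etc/hosts <<EOF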
172.20.26.34 master1
172.20.26.36 master2
172.20.26.37 node1
172.20.26.38 node2
EOF
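When the hosts entries are managed by a script that may run more than once, it helps to add each entry only if it is missing. A minimal sketch, assuming a helper named add_host_entry (hypothetical, not part of any tool) that takes the file path as an argument so it can be tried on a scratch copy first:

```shell
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless that exact pair
# is already present. Helper name and argument order are illustrative.
add_host_entry() {
  file=$1 ip=$2 name=$3
  # awk compares whole fields, so 172.20.26.3 never matches 172.20.26.34
  if ! awk -v ip="$ip" -v name="$name" \
        '$1 == ip && $2 == name { found = 1 } END { exit !found }' "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# Try it on a scratch file rather than on /etc/hosts directly.
hosts_copy=$(mktemp)
add_host_entry "$hosts_copy" 172.20.26.34 master1
add_host_entry "$hosts_copy" 172.20.26.36 master2
add_host_entry "$hosts_copy" 172.20.26.34 master1   # second call changes nothing
cat "$hosts_copy"
```

On a real node the first argument would be /etc/hosts.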
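# Disable SELinux and the firewall (setenforce takes effect immediately, the config change after a reboot);
# /etc/sysconfig/selinux is a symlink to /etc/selinux/config; sed -i on the symlink would replace it with a regular file
sed -i '/SELINUX/s/enforcing/disabled/g' /etc/selinux/config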
setenforce 0
systemctl stop firewalld.service
systemctl disable firewalld.service
# Synchronize the node clocks;
yum install ntpdate -y
ntpdate pool.ntp.org
# Set each node's hostname from its /etc/hosts entry;
hostname `cat /etc/hosts|grep $(ifconfig|grep broadcast|awk '{print $2}')|awk '{print $2}'`;su
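The one-liner above relies on grep, which matches substrings (172.20.26.3 would also match 172.20.26.34), and on ifconfig reporting exactly one broadcast address. A small sketch of an exact lookup, assuming a hypothetical helper name_for_ip that reads a hosts-format file:

```shell
# name_for_ip IP [FILE]: print the first host name mapped to IP in a
# hosts-format file (default /etc/hosts). Helper name is illustrative.
name_for_ip() {
  awk -v ip="$1" '$1 == ip { print $2; exit }' "${2:-/etc/hosts}"
}

# Demo against a scratch file holding the entries used in this guide.
demo_hosts=$(mktemp)
printf '%s\n' '172.20.26.34 master1' '172.20.26.36 master2' \
              '172.20.26.37 node1' '172.20.26.38 node2' > "$demo_hosts"
name_for_ip 172.20.26.37 "$demo_hosts"
```

On a real node the result could be passed to hostnamectl set-hostname, which also keeps the name across reboots, unlike the plain hostname command.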
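# Turn off swap (swap is far slower than memory; disabling it protects k8s performance);
swapoff -a # temporary, until the next reboot
sed -ri 's/.*swap.*/#&/' /etc/fstab # then comment out the swap entry to make it permanent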
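The sed expression above prefixes every line that contains the word swap, including lines that are already commented, so running it twice produces ##. A sketch of a stricter filter, assuming a hypothetical helper comment_swap that reads fstab content on stdin and only touches uncommented lines that have swap as a separate field:

```shell
# comment_swap: read fstab-style lines on stdin, comment out active swap
# entries, and leave already-commented lines untouched.
comment_swap() {
  sed -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/'
}

# Illustrative input; the device names are made up for the demo.
printf '%s\n' \
  'UUID=1234 /boot xfs defaults 0 0' \
  '/dev/mapper/centos-swap swap swap defaults 0 0' \
  '#/dev/sdb1 swap swap defaults 0 0' | comment_swap
```

Only the second line gains a leading #. To apply the same expression to the real file, it would be passed to sed -i together with /etc/fstab, ideally after taking a backup.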
II. Linux kernel parameter settings and tuning
Run on Master1, Master2, Node1 and Node2.
Enable IPVS so that k8s can use IP load balancing:
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
systemctl enable --now systemd-modules-load.service ## load the modules
# Confirm that the kernel modules are loaded
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
The output looks like this:
nf_conntrack_ipv4 15053 0
nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 0
ip_vs 145497 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 133095 2 ip_vs,nf_conntrack_ipv4
libcrc32c 12644 3 xfs,ip_vs,nf_conntrack
If nothing is displayed, try rebooting the machine:
init 6 # reboot the system
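The grep above prints whatever is loaded but does not say which module is absent. A minimal sketch that reports the missing ones, assuming a hypothetical helper missing_modules that reads lsmod output on stdin and takes the required module names as arguments:

```shell
# missing_modules MOD...: read `lsmod` output on stdin and print each requested
# module that does not appear in the first column. Helper name is illustrative.
missing_modules() {
  loaded=$(awk '$1 != "Module" { print $1 }')
  for mod in "$@"; do
    printf '%s\n' "$loaded" | grep -qxF "$mod" || printf '%s\n' "$mod"
  done
}

# Shortened sample of lsmod output, used only for the demo.
sample='Module                  Size  Used by
ip_vs_sh               12688  0
ip_vs                 145497  6 ip_vs_rr,ip_vs_sh
nf_conntrack          133095  2 ip_vs'
printf '%s\n' "$sample" | missing_modules ip_vs ip_vs_rr ip_vs_sh nf_conntrack_ipv4
```

With the sample input it prints ip_vs_rr and nf_conntrack_ipv4, because a name that only appears in the "Used by" column does not count as loaded. On a node the call would be lsmod | missing_modules ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4. The nf_conntrack_ipv4 name applies to the 3.10 kernel of CentOS 7.9; kernels from 4.19 on merged it into nf_conntrack.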
# Install ipset and ipvsadm
yum install -y ipset ipvsadm
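# Configure kernel parameters (enable bridge forwarding so that containers can use the layer-2 network);
# the target file name was lost from the original command; /etc/sysctl.d/k8s.conf is an assumed name
cat <<EOF > /etc/sysctl.d/k8s.conf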
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Load all sysctl configuration files
sysctl --system
# Load only /etc/sysctl.conf
sysctl -p
Install Docker, kubeadm and kubelet on all nodes.
III. Install Docker
# Install the dependency packages
yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the Docker repository; the Aliyun yum mirror in China is used here
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install docker-ce; this installs the latest version
yum install -y docker-ce
# Write the Docker configuration file
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
],
"registry-mirrors": ["https://uyah70su.mirror.aliyuncs.com"]
}
EOF
# Note: pulling images from within China is slow, so registry-mirrors is added at the end of the configuration file
mkdir -p /etc/systemd/system/docker.service.d
# Restart the Docker service
systemctl daemon-reload
systemctl enable docker.service
systemctl start docker.service
ps -ef|grep -aiE docker
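The native.cgroupdriver=systemd option in daemon.json matters because kubeadm v1.22 and later default the kubelet to the systemd cgroup driver, and a mismatch between the two drivers is one known cause of the kubelet failing to start. A small sketch that reads the configured driver back from a daemon.json-style file, assuming a hypothetical helper docker_cgroup_driver:

```shell
# docker_cgroup_driver FILE: print the value of native.cgroupdriver found in a
# daemon.json-style file, or nothing if the option is absent.
docker_cgroup_driver() {
  sed -n 's/.*native\.cgroupdriver=\([a-z]*\).*/\1/p' "$1"
}

# Demo on a scratch file with a shortened version of the configuration.
demo_json=$(mktemp)
cat > "$demo_json" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file"
}
EOF
docker_cgroup_driver "$demo_json"
```

On a running host, docker info --format '{{.CgroupDriver}}' reports the driver the daemon is actually using.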
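IV. Add the Kubernetes package repository
Docker, etcd, Kubernetes and the Flannel network are installed on the Kubernetes Master1, Master2, Node1 and Node2 nodes.
Add the kubernetes repository on Master1, Master2, Node1 and Node2 as follows:
cat>>/etc/yum.repos.d/kubernetes.repo<<EOF
[kubernetes]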
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
EOF
V. K8S kubeadm
1) Install the kubeadm tools;
# Install kubeadm on Master1, Master2, Node1 and Node2;
# List the available k8s versions
yum list kubelet --showduplicates
yum install -y kubeadm-1.23.1 kubelet-1.23.1 kubectl-1.23.1
# Enable and start the kubelet service on Master1 and Master2
systemctl enable kubelet.service && systemctl restart kubelet.service
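2) Common kubeadm commands;
kubeadm init      bootstraps a Kubernetes master (control-plane) node
kubeadm join      bootstraps a Kubernetes worker node and joins it to the cluster
kubeadm upgrade   upgrades a Kubernetes cluster to a newer version
kubeadm config    if the cluster was initialized with kubeadm v1.7.x or lower, configures the cluster so that kubeadm upgrade can be used
kubeadm token     manages the tokens used by kubeadm join
kubeadm reset     reverts any changes made to the host by kubeadm init or kubeadm join
kubeadm version   prints the kubeadm version
kubeadm alpha     previews a set of new features made available to gather feedback from the community
VI. Initialize the Master1 node and join Master2 to the cluster
1) On Master1, run kubeadm init to initialize and install the master components (needs at least 2 CPUs and 2 GB of RAM);
echo "1" > /proc/sys/net/ipv4/ip_forward # enable IP forwarding (a value written to /proc does not survive a reboot; repeat it after rebooting)
init 6 # reboot the system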
kubeadm init --control-plane-endpoint=172.20.26.34:6443 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.23.1 --service-cidr=10.10.0.0/16 --pod-network-cidr=10.244.0.0/16 --upload-certs
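The flags of the kubeadm init command above can also be kept in a configuration file, which is easier to review and to re-run. A minimal sketch, assuming the kubeadm.k8s.io/v1beta3 ClusterConfiguration format (supported by kubeadm v1.23) and a scratch file path from mktemp; only the values already given on the command line are carried over:

```shell
# Write a kubeadm configuration equivalent to the command-line flags above.
# The file location is arbitrary; mktemp is used here only for the demo.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.23.1
controlPlaneEndpoint: "172.20.26.34:6443"
imageRepository: registry.aliyuncs.com/google_containers
networking:
  serviceSubnet: 10.10.0.0/16
  podSubnet: 10.244.0.0/16
EOF
echo "wrote $cfg"
# On Master1 the file would then be used as:
#   kubeadm init --config "$cfg" --upload-certs
```

The --upload-certs flag stays on the command line because it is an option of the init run rather than part of the cluster configuration.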
kubeadm init may fail with the following error:
[init] Using Kubernetes version: v1.20.4
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.12. Latest validated version: 19.03
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
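Fix:
swapoff -a # temporary, until the next reboot
sed -ri 's/.*swap.*/#&/' /etc/fstab # then comment out the swap entry to make it permanent
In addition, set the parameter in /etc/sysconfig/kubelet (older versions required swap to be turned off; newer versions add the option of ignoring the swap error through a flag):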
sed -i 's/KUBELET_EXTRA_ARGS=/KUBELET_EXTRA_ARGS="--fail-swap-on=false"/' /etc/sysconfig/kubelet
If a later initialization attempt fails with:
[root@master1 ~]# kubeadm init --control-plane-endpoint=172.20.26.34:6443 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.23.1 --service-cidr=10.10.0.0/16 --pod-network-cidr=10.244.0.0/16 --upload-certs
[init] Using Kubernetes version: v1.23.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR Port-2379]: Port 2379 is in use
[ERROR Port-2380]: Port 2380 is in use
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
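The errors above mean the ports are already in use and files from an earlier init are still present. Run
kubeadm reset # reset kubeadm
Another possible initialization error: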
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp 127.0.0.1:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
2) After the steps above, initialization succeeds:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
--discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb \
--control-plane --certificate-key f26b6b75628f05a5cb4e4c12f3469b8aaced8ccd08674cfded0c36e087ca92e3
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
--discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb
3) As the output above instructs, run the following on the Master to copy the admin configuration file;
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
On Master1, upload kube-flannel.yml to the current directory, then run the following command
[root@master1 ~]# kubectl apply -f kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
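4) Join Master2 to the cluster; the parameters and command used for joining the K8S cluster are shown below (a token is valid for one day);
# If the token has expired, create a new one on master1 (kubeadm token generate only prints a random token without registering it in the cluster)
kubeadm token create
# Following the success message of kubeadm init above, run the command below on Master2 to join it to the cluster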
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
--discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb \
--control-plane --certificate-key f26b6b75628f05a5cb4e4c12f3469b8aaced8ccd08674cfded0c36e087ca92e3
Wait for it to finish. The message shown when Master2 has joined the cluster successfully:
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
VII. Join Node1 to the cluster
# Start the Docker engine service on Node1;
systemctl start docker.service
# Join Node1 to the K8S cluster;
kubeadm join 172.20.26.34:6443 --token 5u8maw.7u73q0zq1b0z623d \
--discovery-token-ca-cert-hash sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb
A message like the following indicates a successful join:
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
# If the join command was not recorded when kubeadm init ran, it can be recreated on the Master with:
kubeadm token create --print-join-command
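A join command copied by hand is easy to damage. The value passed to --discovery-token-ca-cert-hash is a SHA-256 digest, that is sha256: followed by 64 hexadecimal digits. A minimal sketch that checks only the shape of the value (it cannot tell whether the hash matches the cluster CA), assuming a hypothetical helper valid_ca_hash:

```shell
# valid_ca_hash VALUE: succeed only if VALUE is "sha256:" followed by exactly
# 64 lowercase hex digits. Helper name is illustrative.
valid_ca_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

valid_ca_hash "sha256:754abbc6dd80d5da1f56c67f31cd89c04c5cc39b039e8376f21225d8dec4dcbb" \
  && echo "hash looks well-formed"
```

A value with a stray or missing character fails the check, which is a cheap test to run before pasting the command on another node.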
# Log in to the K8S Master node and verify the node information;
On the Master node, run the node query command
kubectl get nodes
[root@master1 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master1 Ready control-plane,master 15m v1.23.1
master2 Ready control-plane,master 4m23s v1.23.1
node1 Ready
node2 NotReady
[root@master1 ~]#
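With more than a few nodes it is handy to list only those that are not ready yet. A small sketch, assuming a hypothetical helper not_ready_nodes that reads kubectl get nodes output on stdin; the sample rows below are made up for the demo:

```shell
# not_ready_nodes: read `kubectl get nodes` output on stdin and print the name
# of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Illustrative input only; ages and roles are invented.
printf '%s\n' \
  'NAME      STATUS     ROLES                  AGE   VERSION' \
  'master1   Ready      control-plane,master   15m   v1.23.1' \
  'node-a    NotReady   worker                 1m    v1.23.1' | not_ready_nodes
```

On the Master the call would be kubectl get nodes | not_ready_nodes. A node typically reports NotReady for a short while after joining, until its network pod is running.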
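VIII. K8s Dashboard UI in practice
The most important job of Kubernetes is the unified management and scheduling of a cluster of Docker containers. The cluster and its nodes are usually operated from the command line, which is inconvenient; a visual UI makes management and maintenance easier. The complete procedure for setting up the kubernetes dashboard follows (on the Master):
1) Download the Dashboard manifest (a ready-made k8s_dashboard.yaml can be uploaded instead):
The download may time out and fail because of network problems.
Look up the IP address at http://ipAddress.com
Enter the domain raw.githubusercontent.com and click the magnifier to search
Then scroll down to find the IP addresses, shown like this: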
Githubusercontent Raw Frequently Asked Questions (FAQ)
What IP addresses does raw.githubusercontent.com resolve to?
raw.githubusercontent.com resolves to 4 IPv4 addresses and 4 IPv6 addresses:
185.199.108.133
185.199.109.133
185.199.110.133
185.199.111.133
Add one of the IP addresses found for raw.githubusercontent.com to /etc/hosts:
185.199.108.133 raw.githubusercontent.com
Download recommended.yaml again
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml
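If wget still cannot download it, add the same hosts entry on a Windows machine, open the following address in a browser, copy the content, and save it as a .yaml file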
https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.1/aio/deploy/recommended.yaml
\cp recommended.yaml recommended.yaml.bak
Rename recommended.yaml to k8s_dashboard.yaml
2) Edit line 39 of k8s_dashboard.yaml. # The service type defaults to ClusterIP; change it to NodePort so it is easy to reach, optionally mapped to a specific port.
[root@master1 ~]# vim k8s_dashboard.yaml
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 31001
  selector:
    k8s-app: kubernetes-dashboard
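The nodePort value has to fall inside the range the API server allows, which is 30000-32767 unless the cluster was configured otherwise. A minimal sketch of that check, assuming the default range and a hypothetical helper valid_nodeport:

```shell
# valid_nodeport PORT: succeed if PORT is a number inside the default
# NodePort range 30000-32767. Helper name is illustrative.
valid_nodeport() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

valid_nodeport 31001 && echo "31001 is inside the default NodePort range"
```

A port such as 443 or 8443 would be rejected by the API server when the manifest is applied, so checking the number first saves a round trip.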
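3) For v2.0.0, edit line 195 of k8s_dashboard.yaml. # The Dashboard is displayed in English by default; it can be set to Chinese.
          env:
            - name: ACCEPT_LANGUAGE
              value: zh
For Dashboard versions around v2.5.1, the language can be switched to Simplified Chinese in the settings of the Dashboard web UI.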
4) Change the default token expiry time of kubernetes-dashboard
By default the login session expires after 900 s (15 minutes) and you have to log in again; change it to 12 hours (43200 seconds) for convenience.
Around line 190, find the "args" field and add - --token-ttl=43200 under it
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.5.1
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          env:
            - name: ACCEPT_LANGUAGE
              value: zh
          args:
            - --auto-generate-certificates
            - --token-ttl=43200 # raise the kubernetes-dashboard default token lifetime to 12 hours
            - --namespace=kubernetes-dashboard
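The --token-ttl value is given in seconds, so 12 hours is 12 × 3600 = 43200. A tiny sketch for deriving the argument for another duration, assuming a hypothetical helper hours_to_seconds:

```shell
# hours_to_seconds HOURS: convert whole hours to seconds.
hours_to_seconds() {
  echo $(( $1 * 3600 ))
}

# Build the args line for a 12-hour session.
printf '%s\n' "- --token-ttl=$(hours_to_seconds 12)"
```

The last line prints the entry in the form used in the args list above.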
5) Create the Dashboard service: once k8s_dashboard.yaml has been edited, run the following in the directory that contains it:
[root@master1 ~]# kubectl apply -f k8s_dashboard.yaml
6) Check that the Dashboard is running;
kubectl get pod -n kubernetes-dashboard
[root@master1 ~]# kubectl get pod -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-5f6ccbd9c4-9qpkg 1/1 Running 0 103s
kubernetes-dashboard-57499967cf-shqxt 1/1 Running 0 103s
kubectl get svc -n kubernetes-dashboard
[root@master1 ~]# kubectl get svc -n kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.10.176.179
kubernetes-dashboard NodePort 10.10.170.121
7) Token-based access: set up and bind the Dashboard permissions with the following commands;
# Create the Dashboard admin account;
kubectl create serviceaccount dashboard-admin -n kube-system
# Bind the new dashboard account to the cluster-admin role;
kubectl create clusterrolebinding dashboard-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
# Get the name of the Token secret that belongs to the new account;
kubectl get secrets -n kube-system | grep dashboard
# Show the Token details;
kubectl describe secrets -n kube-system $(kubectl get secrets -n kube-system | grep dashboard |awk '{print $1}')
[root@master1 ~]# kubectl describe secrets -n kube-system $(kubectl get secrets -n kube-system | grep dashboard |awk '{print $1}')
Name: dashboard-admin-token-9nldp
Namespace: kube-system
Labels:
Annotations: kubernetes.io/service-account.name: dashboard-admin
kubernetes.io/service-account.uid: 52376c4c-8292-4369-8394-4011fce66daf
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1099 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IjVrZ2ZvcDhlZk9Rd0lEU0lGZ0dsw1QTdzUW1jZHJFU2xfaTkwUVVoUktOZU0ifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tOW5sZHAiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291snsnQudWlkIjoiNTIzNzZjNGMtODI5Mi00MzY5LTgzOTQtNDAxMWZjZTY2ZGFmIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.UfRQpE5LpwwwP05WrSZ5zFmtPyR-iLtHaHjHizt9awcEinI1twbDtarEo41x9AfZFO0g-kuOavZhxpiVzGY-JKps8j5lAj2UMkzb4L2Ns5wl2NjoT2vqS5BphdZ3RerHC6Gg3Qq9aixJDKDu_WtoWYad6AkrVwgNXkPv14wVnbyWSjcFpJ3pAhdwhIyv5HkWH8pZq8dj7z4bjICMK6zIm4g6lkwYUywq0N8PBrdd83XrUPiALVm18J-SiwsAmPMGCoFothjhWG9u62T_p4ahStPhn7FFYurbsKm5vtkhVsKrmX_cntYXj1odhXKWTAaUcGHHNIyGyTZinbMk4omVCAQ
[root@master1 ~]#
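Only the token value is needed for the login form, and copying it out of the full describe output by hand is error-prone. A small sketch, assuming a hypothetical helper token_from_describe that reads kubectl describe secret output on stdin; the sample input uses a placeholder instead of a real token:

```shell
# token_from_describe: read `kubectl describe secret` output on stdin and
# print only the value of the "token:" field.
token_from_describe() {
  awk '$1 == "token:" { print $2 }'
}

# Placeholder input for the demo; EXAMPLE.TOKEN.VALUE is not a real token.
printf '%s\n' \
  'ca.crt:     1099 bytes' \
  'namespace:  11 bytes' \
  'token:      EXAMPLE.TOKEN.VALUE' | token_from_describe
```

On the Master, the describe command shown above would simply be piped into token_from_describe. This applies to the v1.23.1 cluster built here; from v1.24 on such token Secrets are no longer created automatically and kubectl create token is used instead.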
8) Open the Dashboard web UI in a browser at https://172.20.26.34:31001/ and log in with the Token.
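Problem 1:
After tomcat was deployed on k8s, the tomcat on the node could not be reached because forwarding was blocked. Check iptables on node 192.168.26.228 (note: here the commands are run on node ...172.20.26.37):
sudo iptables -S # check: the FORWARD policy is DROP
iptables -P FORWARD ACCEPT # change the FORWARD policy from DROP to ACCEPT
# check that the FORWARD policy is now ACCEPT
sleep 60 && /sbin/iptables -P FORWARD ACCEPT # after a reboot the policy reverts to DROP, so this extra command is added to set it back
172.20.26.37:30042 then opens the tomcat default page.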
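The FORWARD policy can also be read by a script instead of by eye. A minimal sketch, assuming a hypothetical helper forward_policy that reads iptables -S output on stdin:

```shell
# forward_policy: read `iptables -S` output on stdin and print the default
# policy of the FORWARD chain (ACCEPT or DROP).
forward_policy() {
  awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Sample of the policy lines printed by iptables -S, used only for the demo.
printf '%s\n' '-P INPUT ACCEPT' '-P FORWARD DROP' '-P OUTPUT ACCEPT' | forward_policy
```

On the node, a guard such as [ "$(iptables -S | forward_policy)" = "DROP" ] && iptables -P FORWARD ACCEPT would change the policy only when it is actually DROP.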