We prepare three hosts: one master node and two worker nodes (node1 and node2).
CPU: 2*2 cores, memory: 2 GB, disk: 20 GB, operating system: CentOS 7.8
K8s cluster production architecture
Single-master cluster
Environment
Role | Host | IP address | Memory | CPU (cores) |
---|---|---|---|---|
master | k8s-master | 192.168.5.11 | 2G | 4 |
node1 | k8s-node1 | 192.168.5.12 | 2G | 4 |
node2 | k8s-node2 | 192.168.5.13 | 2G | 4 |
- All machines in the cluster can reach one another over the network.
- Outbound internet access is available, since images need to be pulled.
- Swap is disabled (kubelet expects swap to be off by default).
- The clocks on all machines in the cluster are synchronized.
For brevity, the commands below are shown on the k8s-master host; the same configuration is applied on the other two nodes as well.
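If you prefer to push the common steps from the master, a simple loop over the node IPs works; this is just a convenience sketch and assumes root SSH access to both nodes:
[root@k8s-master ~]# for n in 192.168.5.12 192.168.5.13; do ssh root@$n 'systemctl stop firewalld; setenforce 0; swapoff -a'; done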
Configure the hostname and the hosts file
[root@k8s-master ~]# hostnamectl set-hostname k8s-master
[root@k8s-master ~]# vim /etc/hosts +
192.168.5.11 k8s-master
192.168.5.12 k8s-node1
192.168.5.13 k8s-node2
Disable the firewall and SELinux
[root@k8s-master ~]# systemctl stop firewalld.service
[root@k8s-master ~]# setenforce 0
[root@k8s-master ~]# vim /etc/selinux/config
# Edit the config file and set:
SELINUX=disabled
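The firewall was only stopped above; to also keep it from starting at boot, and to flip the SELinux setting without opening an editor, the following equivalent commands work as well:
[root@k8s-master ~]# systemctl disable firewalld.service
[root@k8s-master ~]# sed -ri 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config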
Disable the swap partition
# Temporarily
[root@k8s-master ~]# swapoff -a
# Permanently
[root@k8s-master ~]# sed -i '/\/dev\/mapper\/centos-swap/ s/^/#/' /etc/fstab
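A quick way to confirm swap is really off (the Swap line should show 0 total):
[root@k8s-master ~]# free -m | grep -i swap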
Configure time synchronization
[root@k8s-master ~]# date
Wed Nov 11 08:41:42 CST 2020
[root@k8s-node1 ~]# date
Wed Nov 11 08:41:42 CST 2020
[root@k8s-node2 ~]# date
Wed Nov 11 08:41:42 CST 2020
Make sure the clocks on all three machines agree; they can be set manually or kept in sync automatically with a time server.
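One common way to automate this on CentOS 7 is chrony; any NTP setup works as long as the three clocks agree. A minimal sketch:
[root@k8s-master ~]# yum install -y chrony
[root@k8s-master ~]# systemctl enable --now chronyd
[root@k8s-master ~]# chronyc sources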
Install commonly used packages
[root@k8s-master ~]# yum install bash-completion wget tree psmisc net-tools vim lrzsz dos2unix -y
Enable bridge filtering and IP forwarding
[root@k8s-master ~]# cat >> /etc/sysctl.d/k8s.conf << EOF
> net.bridge.bridge-nf-call-ip6tables = 1
> net.bridge.bridge-nf-call-iptables = 1
> net.ipv4.ip_forward = 1
> vm.swappiness = 0
> EOF
# Load the br_netfilter module
[root@k8s-master ~]# modprobe br_netfilter
# Verify that it is loaded
[root@k8s-master ~]# lsmod | grep br_netfilter
br_netfilter 22256 0
bridge 151336 1 br_netfilter
# Apply the bridge filter and forwarding settings
[root@k8s-master ~]# sysctl -p /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
# Install ipset and ipvsadm
[root@k8s-master ~]# yum -y install ipset ipvsadm
# Run the following on all nodes to register the kernel modules that need to be loaded
[root@k8s-master ~]# cat > /etc/sysconfig/modules/ipvs.modules << EOF
> #!/bin/bash
> modprobe -- ip_vs
> modprobe -- ip_vs_rr
> modprobe -- ip_vs_wrr
> modprobe -- ip_vs_sh
> modprobe -- nf_conntrack_ipv4
> EOF
# Make the script executable, run it, and check that the modules are loaded
[root@k8s-master ~]# chmod 755 /etc/sysconfig/modules/ipvs.modules && bash \
> /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e \
> nf_conntrack_ipv4
nf_conntrack_ipv4 15053 0
nf_defrag_ipv4 12729 1 nf_conntrack_ipv4
ip_vs_sh 12688 0
ip_vs_wrr 12697 0
ip_vs_rr 12600 0
ip_vs 145497 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 139264 2 ip_vs,nf_conntrack_ipv4
libcrc32c 12644 3 xfs,ip_vs,nf_conntrack
Install Docker
# Install a few required system tools
[root@k8s-master ~]# yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the Docker CE repository
[root@k8s-master ~]# yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Refresh the cache and install Docker CE
[root@k8s-master ~]# yum makecache fast
[root@k8s-master ~]# yum -y install docker-ce
# Edit the docker-ce service unit so that ExecStart is simply the line below; this lets /etc/docker/daemon.json carry any additional configuration
ExecStart=/usr/bin/dockerd
# Add the following to /etc/docker/daemon.json
[root@k8s-master ~]# mkdir /etc/docker/
[root@k8s-master ~]# vim /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
# Enable Docker at boot and start it now
[root@k8s-master ~]# systemctl enable --now docker.service
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
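As an optional check, confirm that the systemd cgroup driver from daemon.json took effect (the output should report systemd):
[root@k8s-master ~]# docker info | grep -i cgroup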
# Check the Docker version
[root@k8s-master ~]# docker version
Client: Docker Engine - Community
 Version:           19.03.13
 API version:       1.40
 Go version:        go1.13.15
 Git commit:        4484c46d9d
 Built:             Wed Sep 16 17:03:45 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.13
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       4484c46d9d
  Built:            Wed Sep 16 17:02:21 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.7
  GitCommit:        8fba4e9a7d01810a393d5d25a3621dc101981175
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
All Kubernetes cluster nodes need these packages. The default YUM repository is hosted by Google; the Aliyun mirror can be used instead.
Component | kubeadm | kubelet | kubectl | docker-ce |
---|---|---|---|---|
Purpose and version | Initializes and manages the cluster (1.19.3-0) | Receives instructions from the api-server and manages the pod lifecycle (1.19.3-0) | Cluster command-line management tool (1.19.3-0) | 19.03.13 |
Configure the installation repository
[root@k8s-master ~]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
> [kubernetes]
> name=Kubernetes
> baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
> enabled=1
> gpgcheck=1
> repo_gpgcheck=1
> gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
> EOF
Install kubelet, kubeadm, and kubectl
- kubelet: manages the pods and containers on the node and maintains the container lifecycle
- kubeadm: the tool used to bootstrap and manage the cluster
- kubectl: the Kubernetes command-line management tool
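The install command itself, pinning the 1.19.3-0 versions from the table above and using the Aliyun repository just configured (a minimal sketch):
[root@k8s-master ~]# yum install -y kubelet-1.19.3-0 kubeadm-1.19.3-0 kubectl-1.19.3-0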
Software configuration
This step mainly configures kubelet; if it is skipped, the cluster may fail to start.
# Keep the cgroup driver used by kubelet consistent with the one Docker uses by editing the following file
[root@k8s-master ~]# vim /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
Enable and start kubelet (it will not run successfully until kubeadm init has generated its configuration, but it must be enabled so kubeadm can manage it)
[root@k8s-master ~]# systemctl enable --now kubelet.service
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
Initialize the control plane on the master node
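Optionally, the control-plane images can be pre-pulled so the init step itself finishes faster; this sketch uses the same repository and version as the init command below:
[root@k8s-master ~]# kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.19.3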
[root@k8s-master ~]# kubeadm init \
> --apiserver-advertise-address=192.168.5.11 \
> --token-ttl 0 \
> --image-repository registry.aliyuncs.com/google_containers \
> --kubernetes-version v1.19.3 \
> --service-cidr=10.96.0.0/16 \
> --pod-network-cidr=10.244.0.0/16
The initialization finishes with the following output; continue with the commands it suggests:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.5.11:6443 --token e9qc1g.6yw4x930tngz97w3 \
--discovery-token-ca-cert-hash sha256:41a6dbe3c43b8c891033aefce90b8a6386d683070113701f68f61af01cf2ab28
[root@k8s-master ~]# mkdir -p $HOME/.kube
[root@k8s-master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master ~]# chown $(id -u):$(id -g) $HOME/.kube/config
Further checks
[root@k8s-master ~]# docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.19.3 cdef7632a242 3 weeks ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.19.3 a301be0cd44b 3 weeks ago 119MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.19.3 9b60aca1d818 3 weeks ago 111MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.19.3 aaefbfa906bd 3 weeks ago 45.7MB
registry.aliyuncs.com/google_containers/etcd 3.4.13-0 0369cf4303ff 2 months ago 253MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 4 months ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 9 months ago 683kB
[root@k8s-master ~]# netstat -lnutp | grep 6443
tcp6 0 0 :::6443 :::* LISTEN 6016/kube-apiserver
# Check component status
[root@k8s-master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0 Healthy {"health":"true"}
Fix
On kubeadm-installed clusters, the kube-scheduler and kube-controller-manager components can report this Unhealthy status (the workaround was originally documented for Kubernetes v1.18.6 and applies to v1.19.3 as well).
Check whether the kube-scheduler and kube-controller-manager static pod manifests disable the insecure port.
Manifest paths: /etc/kubernetes/manifests/kube-scheduler.yaml and /etc/kubernetes/manifests/kube-controller-manager.yaml
Remove the --port=0 setting, restart kubelet with systemctl restart kubelet, and the check passes again.
[root@k8s-master ~]# vim /etc/kubernetes/manifests/kube-scheduler.yaml
[root@k8s-master ~]# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
# Remove the following line from both files
- --port=0
Restart the service and check again
[root@k8s-master ~]# systemctl restart kubelet
[root@k8s-master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
Check the node status
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady master 66m v1.19.3
The node shows NotReady because the Calico network plugin has not been deployed yet.
List the namespaces
[root@k8s-master ~]# kubectl get ns
NAME STATUS AGE
default Active 77m
kube-node-lease Active 77m
kube-public Active 77m
kube-system Active 77m
Deploy the Calico network plugin
Because of network restrictions, we import Calico images that were downloaded in advance.
[root@k8s-master ~]# ll
total 585720
-rw-------. 1 root root 1496 Jun 1 21:14 anaconda-ks.cfg
-rw-r--r--. 1 root root 163265024 Nov 11 11:33 calico-cni.tar
-rw-r--r--. 1 root root 194709504 Nov 11 11:33 calico-node.tar
-rw-r--r--. 1 root root 21430 Nov 11 11:33 calico.yml
-rwxr-xr-x. 1 root root 118 Jun 12 19:34 hostname.sh
-rw-r--r--. 1 root root 181238643 Jun 7 14:06 jdk-8u60-linux-x64.tar.gz
-rw-r--r--. 1 root root 50465280 Nov 11 11:33 kube-controllers.tar
-rw-r--r--. 1 root root 10056192 Nov 11 11:33 pod2daemon-flexvol.tar
# Load the images
[root@k8s-master ~]# docker load -i calico-cni.tar
1c95c77433e8: Loading layer [==================================================>] 72.47MB/72.47MB
f919277f01fb: Loading layer [==================================================>] 90.76MB/90.76MB
0094c919faf3: Loading layer [==================================================>] 10.24kB/10.24kB
9e1263ee4198: Loading layer [==================================================>] 2.56kB/2.56kB
Loaded image: calico/cni:v3.9.0
[root@k8s-master ~]# docker load -i calico-node.tar
538afb24c98b: Loading layer [==================================================>] 33.76MB/33.76MB
85b8bbfa3535: Loading layer [==================================================>] 3.584kB/3.584kB
7a653a5cb14b: Loading layer [==================================================>] 3.584kB/3.584kB
97cc86557fed: Loading layer [==================================================>] 21.86MB/21.86MB
3abae82a71aa: Loading layer [==================================================>] 11.26kB/11.26kB
7c85b99e7c27: Loading layer [==================================================>] 11.26kB/11.26kB
0e20735d7144: Loading layer [==================================================>] 6.55MB/6.55MB
2e3dede6195a: Loading layer [==================================================>] 2.975MB/2.975MB
f85ff1d9077d: Loading layer [==================================================>] 55.87MB/55.87MB
9d55754fd45b: Loading layer [==================================================>] 1.14MB/1.14MB
Loaded image: calico/node:v3.9.0
[root@k8s-master ~]# docker load -i kube-controllers.tar
fd6ffbcdb09f: Loading layer [==================================================>] 47.35MB/47.35MB
9c4005f3e0bc: Loading layer [==================================================>] 3.104MB/3.104MB
Loaded image: calico/kube-controllers:v3.9.0
[root@k8s-master ~]# docker load -i pod2daemon-flexvol.tar
3fc64803ca2d: Loading layer [==================================================>] 4.463MB/4.463MB
3aff8caf48a7: Loading layer [==================================================>] 5.12kB/5.12kB
89effeea5ce5: Loading layer [==================================================>] 5.572MB/5.572MB
Loaded image: calico/pod2daemon-flexvol:v3.9.0
Modify the Calico manifest
Add the following two lines after line 606 of calico.yml, inside the env list of the calico-node container. Calico's automatic interface detection can pick the wrong NIC, so the physical interface it should use is set explicitly.
[root@k8s-master ~]# vim calico.yml
- name: IP_AUTODETECTION_METHOD
  value: "interface=ens.*"
Apply the Calico manifest
Before applying the Calico manifest, all of the Calico images must also be imported on the worker nodes.
[root@k8s-master ~]# scp calico* pod2daemon-flexvol.tar kube-controllers.tar k8s-node1:/root/
The authenticity of host 'k8s-node1 (192.168.5.12)' can't be established.
ECDSA key fingerprint is SHA256:8KoAXpPVTPc8T4wS2TQoTrAcVmbrZUqiI0UQ4L56zCQ.
ECDSA key fingerprint is MD5:48:a8:5d:58:f3:a7:c6:9b:b8:11:1a:1c:09:a8:55:04.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'k8s-node1,192.168.5.12' (ECDSA) to the list of known hosts.
root@k8s-node1's password:
calico-cni.tar 100% 156MB 101.9MB/s 00:01
calico-node.tar 100% 186MB 87.9MB/s 00:02
calico.yml 100% 21KB 8.4MB/s 00:00
pod2daemon-flexvol.tar 100% 9821KB 53.7MB/s 00:00
kube-controllers.tar 100% 48MB 48.1MB/s 00:01
[root@k8s-master ~]# scp calico* pod2daemon-flexvol.tar kube-controllers.tar k8s-node2:/root/
The authenticity of host 'k8s-node2 (192.168.5.13)' can't be established.
ECDSA key fingerprint is SHA256:8KoAXpPVTPc8T4wS2TQoTrAcVmbrZUqiI0UQ4L56zCQ.
ECDSA key fingerprint is MD5:48:a8:5d:58:f3:a7:c6:9b:b8:11:1a:1c:09:a8:55:04.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'k8s-node2,192.168.5.13' (ECDSA) to the list of known hosts.
root@k8s-node2's password:
calico-cni.tar 100% 156MB 38.9MB/s 00:04
calico-node.tar 100% 186MB 18.2MB/s 00:10
calico.yml 100% 21KB 6.6MB/s 00:00
pod2daemon-flexvol.tar 100% 9821KB 43.2MB/s 00:00
kube-controllers.tar 100% 48MB 25.9MB/s 00:01
Load the images on both worker nodes in the same way (output omitted).
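For reference, the commands are the same as on the master (shown here for k8s-node1; repeat on k8s-node2):
[root@k8s-node1 ~]# docker load -i calico-cni.tar
[root@k8s-node1 ~]# docker load -i calico-node.tar
[root@k8s-node1 ~]# docker load -i kube-controllers.tar
[root@k8s-node1 ~]# docker load -i pod2daemon-flexvol.tar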
Start the cluster network
[root@k8s-master ~]# kubectl apply -f calico.yml
configmap/calico-config created
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
Join the worker nodes to the cluster
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 108m v1.19.3
[root@k8s-node1 ~]# kubeadm join 192.168.5.11:6443 --token e9qc1g.6yw4x930tngz97w3 --discovery-token-ca-cert-hash sha256:41a6dbe3c43b8c891033aefce90b8a6386d683070113701f68f61af01cf2ab28
[root@k8s-node2 ~]# kubeadm join 192.168.5.11:6443 --token e9qc1g.6yw4x930tngz97w3 --discovery-token-ca-cert-hash sha256:41a6dbe3c43b8c891033aefce90b8a6386d683070113701f68f61af01cf2ab28
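Because the cluster was initialized with --token-ttl 0, this token never expires. If the join command is ever lost, a fresh one can be printed on the master:
[root@k8s-master ~]# kubeadm token create --print-join-command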
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 111m v1.19.3
k8s-node1 NotReady <none> 21s v1.19.3
k8s-node2 NotReady <none> 16s v1.19.3
Apply the network manifest on the worker nodes
[root@k8s-node1 ~]# kubectl apply -f calico.yml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@k8s-node2 ~]# kubectl apply -f calico.yml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
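These errors are expected: kubectl on the worker nodes has no kubeconfig, so it falls back to localhost:8080, and the step is not actually required, since the calico-node DaemonSet created from the master already runs a pod on every node. If kubectl access from a worker is wanted anyway, one option (a sketch) is to copy the admin kubeconfig over:
[root@k8s-node1 ~]# mkdir -p /root/.kube
[root@k8s-master ~]# scp /etc/kubernetes/admin.conf k8s-node1:/root/.kube/config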
Check the cluster status again
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 114m v1.19.3
k8s-node1 Ready <none> 3m46s v1.19.3
k8s-node2 Ready <none> 3m41s v1.19.3
Final check of the cluster
# List the namespaces
[root@k8s-master ~]# kubectl get ns
NAME STATUS AGE
default Active 119m
kube-node-lease Active 120m
kube-public Active 120m
kube-system Active 120m
# Check component health
[root@k8s-master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
etcd-0 Healthy {"health":"true"}
scheduler Healthy ok
controller-manager Healthy ok
# List all cluster nodes
[root@k8s-master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 121m v1.19.3
k8s-node1 Ready <none> 9m58s v1.19.3
k8s-node2 Ready <none> 9m53s v1.19.3
# Show cluster info
[root@k8s-master ~]# kubectl cluster-info
Kubernetes master is running at https://192.168.5.11:6443
KubeDNS is running at https://192.168.5.11:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
# Dump more detailed cluster information
[root@k8s-master ~]# kubectl cluster-info dump