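For years, Docker and Kubernetes have been regarded as a developer's two right hands in the cloud-computing era.
However, starting with Kubernetes 1.20, dockershim is deprecated and will be removed in a future release. Kubernetes instead relies on a component called a container runtime, which is responsible for pulling and running container images. Docker is a popular choice for that runtime (other common options include containerd and CRI-O), so as a user you only need to switch the container runtime from Docker to another supported one.
The following deploys a Kubernetes cluster on containerd instead of Docker.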
Operating system: CentOS Linux release 8.2.2004 (Core)
VIP: 192.168.1.179
IP | Hostname | Deployed components |
---|---|---|
192.168.1.100 | www.kevin.com | Ops node: DNS and CA certificate generation |
192.168.1.170 | k8s-170.kevin.com | keepalived |
192.168.1.171 | k8s-171.kevin.com | keepalived |
192.168.1.172 | k8s-172.kevin.com | |
192.168.1.173 | k8s-173.kevin.com | |
Disable the firewall on every node
[root@k8s-170 ~]# systemctl stop firewalld
[root@k8s-170 ~]# systemctl disable firewalld
Install tools on every node
[root@k8s-170 ~]# dnf -y install epel-release vim wget net-tools telnet tree nmap sysstat lrzsz dos2unix bind-utils
Disable SELinux on every node
## Temporarily
[root@k8s-170 ~]# setenforce 0
## Permanently
[root@k8s-170 ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
Set the hostname
## Must be different on each node
[root@k8s-170 ~]# hostnamectl set-hostname k8s-170.kevin.com
Disable swap
[root@k8s-170 ~]# swapoff -a
[root@k8s-170 ~]# sed -ri 's/.*swap.*/#&/' /etc/fstab
Configure time synchronization on every node
[root@k8s-170 ~]# systemctl restart chronyd.service
[root@k8s-170 ~]# systemctl enable chronyd.service
Load the required kernel modules on every node
[root@k8s-170 ~]# cat <
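The command above is cut off in the source, so its original contents are unknown. As a sketch only, assuming the usual containerd prerequisites from the upstream Kubernetes documentation (the overlay and br_netfilter modules), the step could look like the following. The helper name write_modules_conf and the file name containerd.conf are illustrative assumptions, not the author's original text.

```shell
# Hypothetical helper: writes the module list to the file given as $1.
write_modules_conf() {
    cat > "$1" <<'EOF'
overlay
br_netfilter
EOF
}

# Usage on a real node (as root); the target path is an assumption:
#   write_modules_conf /etc/modules-load.d/containerd.conf
#   modprobe overlay && modprobe br_netfilter
```

Files under /etc/modules-load.d are read by systemd at boot, which is why the modules are both written to a file and loaded once by hand.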
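Load the ipvs modules on every node (optional; the default mode is iptables)
kube-proxy supports two proxy modes, iptables and ipvs. To use ipvs mode, load the ipvs modules and install the ipset tool on all nodes before initializing the cluster. Linux kernels 4.19 and later use nf_conntrack in place of nf_conntrack_ipv4.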
[root@k8s-170 ~]# cat > /etc/modules-load.d/ipvs.conf <
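The heredoc above is truncated in the source. The sketch below shows one way the 4.19 rule described above could be applied when building the module list. The function names are hypothetical, the list of ip_vs schedulers is an assumption based on common setups, and distribution kernels sometimes backport changes, so checking with modinfo on the node is more reliable than comparing version numbers.

```shell
# Returns success when version $1 >= version $2 (relies on GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Prints the conntrack module name for a kernel release string such as
# "4.18.0-193.el8.x86_64", following the 4.19 rule stated in the text.
conntrack_module() {
    kver="${1%%-*}"   # drop the distro suffix after the first "-"
    if version_ge "$kver" "4.19"; then
        echo nf_conntrack
    else
        echo nf_conntrack_ipv4
    fi
}

# Prints the assumed module list, one per line, for the given kernel release.
ipvs_modules() {
    printf '%s\n' ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh "$(conntrack_module "$1")"
}

# Usage on a real node (as root):
#   ipvs_modules "$(uname -r)" > /etc/modules-load.d/ipvs.conf
#   dnf -y install ipset ipvsadm
```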
Set sysctl parameters on every node so that iptables can see bridged traffic; these parameters persist across reboots
[root@k8s-170 ~]# cat <
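This command is also truncated in the source. A minimal sketch, assuming the three parameters recommended by the upstream Kubernetes container-runtime documentation; the helper name write_k8s_sysctl and the file name k8s.conf are assumptions.

```shell
# Hypothetical helper: writes the sysctl settings to the file given as $1.
write_k8s_sysctl() {
    cat > "$1" <<'EOF'
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
}

# Usage on a real node (as root); the target path is an assumption:
#   write_k8s_sysctl /etc/sysctl.d/k8s.conf
#   sysctl --system    # apply without rebooting
```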
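Run on both 192.168.1.170 and 192.168.1.171
[root@k8s-170 ~]# dnf -y install keepalived
On both nodes, add a script that checks port 6443, which the api-server listens on
[root@k8s-170 ~]# vim /etc/keepalived/check_port.sh
##################### File contents #####################
#!/bin/bash
# keepalived port-monitoring script
# Usage:
# In the keepalived configuration file:
# vrrp_script check_port {
#     script "/etc/keepalived/check_port.sh 6443"  # port to monitor
#     interval 2                                   # interval between checks, in seconds
# }
CHK_PORT=$1
if [ -n "$CHK_PORT" ]; then
    # Count listening TCP sockets on exactly this port (":6443 " does not match ":16443 ")
    PORT_PROCESS=$(ss -lnt | grep -c ":${CHK_PORT} ")
    if [ "$PORT_PROCESS" -eq 0 ]; then
        echo "Port $CHK_PORT is not used,End."
        exit 1
    fi
else
    # A missing argument is a configuration error, so report failure as well
    echo "Check port can not be empty!"
    exit 1
fi
##################### End of file #####################
## Make the script executable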
[root@k8s-170 ~]# chmod +x /etc/keepalived/check_port.sh
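A port check is easier to trust when the matching logic can be exercised against captured ss output. The sketch below is an alternative formulation, not the author's script: the function name port_is_listening is hypothetical, and it compares the port field exactly, so a listener on 16443 is not mistaken for 6443.

```shell
# Reads `ss -lnt` output on stdin; succeeds only if some local address
# ends in exactly the port given as $1.
port_is_listening() {
    awk -v port="$1" '
        NR > 1 {                       # skip the header line
            n = split($4, parts, ":")  # $4 is "Local Address:Port"
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a real node:
#   ss -lnt | port_is_listening 6443 || exit 1
```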
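On 192.168.1.170, configure keepalived.conf as follows
[root@k8s-170 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
    router_id 192.168.1.170
}
vrrp_script chk_nginx {
    script "/etc/keepalived/check_port.sh 6443"
    interval 2
    weight -20
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 170
    priority 100
    advert_int 1
    mcast_src_ip 192.168.1.170
    # Working mode: nopreempt selects non-preemptive mode (the default is preempt); always use nopreempt in production.
    # The VIP should not keep moving: a VIP move means the high-availability mechanism was triggered, which counts as a major production incident.
    # After the VIP has moved, to bring it back to a particular server, restart the service there during a low-traffic period.
    # Note: the keepalived documentation states that nopreempt only takes effect when the initial state is BACKUP.
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_nginx
    }
    virtual_ipaddress {
        192.168.1.179
    }
}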
On 192.168.1.171, configure keepalived.conf as follows
[root@k8s-171 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id 192.168.1.171
}
vrrp_script chk_nginx {
    script "/etc/keepalived/check_port.sh 6443"
    interval 2
    weight -20
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 170
    priority 90
    advert_int 1
    mcast_src_ip 192.168.1.171
    # Working mode: nopreempt selects non-preemptive mode (the default is preempt); always use nopreempt in production.
    # The VIP should not keep moving: a VIP move means the high-availability mechanism was triggered, which counts as a major production incident.
    # After the VIP has moved, to bring it back to a particular server, restart the service there during a low-traffic period.
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        chk_nginx
    }
    virtual_ipaddress {
        192.168.1.179
    }
}
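The priority and weight values in the two configurations determine when the VIP moves. With a negative weight, keepalived lowers a node's priority by that amount while the tracked script is failing. The sketch below only reproduces that arithmetic; the function name effective_priority is hypothetical.

```shell
# Prints the effective VRRP priority: the base priority when the tracked
# script succeeds ("yes"), or base + weight (weight is negative) when it fails.
effective_priority() {
    base=$1
    weight=$2
    script_ok=$3
    if [ "$script_ok" = "yes" ]; then
        echo "$base"
    else
        echo $((base + weight))
    fi
}

# Node 170 with a failing api-server check, node 171 healthy:
#   effective_priority 100 -20 no
#   effective_priority 90 -20 yes
```

Because 100 - 20 is lower than 90, node 171 takes over the VIP once port 6443 stops listening on node 170.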
Start keepalived on both servers
[root@k8s-170 ~]# systemctl start keepalived.service
[root@k8s-170 ~]# systemctl enable keepalived.service
## On server 170, the VIP 192.168.1.179 is bound to this server
[root@k8s-170 ~]# ip addr show ens33
2: ens33: mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:94:2d:be brd ff:ff:ff:ff:ff:ff
inet 192.168.1.170/24 brd 192.168.1.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.1.179/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 2409:8a55:611:dc40:5942:a3d0:7aef:99a4/64 scope global dynamic noprefixroute
valid_lft 86250sec preferred_lft 86250sec
inet6 fe80::92c3:94e5:46fc:64f2/64 scope link noprefixroute
valid_lft forever preferred_lft forever
## On server 171, the VIP 192.168.1.179 is not present
[root@k8s-171 ~]# ip addr show ens33
2: ens33: mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:9f:63:79 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.171/24 brd 192.168.1.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 2409:8a55:611:dc40:20c:29ff:fe9f:6379/64 scope global dynamic mngtmpaddr
valid_lft 86250sec preferred_lft 86250sec
inet6 fe80::20c:29ff:fe9f:6379/64 scope link
valid_lft forever preferred_lft forever
[root@k8s-171 ~]#
[root@k8s-170 ~]# vim /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
## List the available versions first
[root@k8s-170 ~]# dnf list kubeadm --showduplicates
## Install the latest version (the default)
[root@k8s-170 ~]# dnf install -y kubelet kubeadm kubectl
## To install a specific version instead; this guide installs 1.20.5 with the following command
[root@k8s-170 ~]# dnf install -y kubelet-1.20.5 kubeadm-1.20.5 kubectl-1.20.5
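By default, kubeadm generates the required certificates automatically when it installs the cluster: the root certificate is valid for 10 years and the client certificates for 1 year. Here a custom root certificate is used to issue them instead.
Run on node 192.168.1.100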
[root@k8s-100 ~]# wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl_1.5.0_linux_amd64 -O /usr/bin/cfssl
[root@k8s-100 ~]# wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssljson_1.5.0_linux_amd64 -O /usr/bin/cfssl-json
[root@k8s-100 ~]# wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl-certinfo_1.5.0_linux_amd64 -O /usr/bin/cfssl-certinfo
[root@k8s-100 ~]# chmod +x /usr/bin/cfssl*
[root@k8s-100 ~]# mkdir /opt/certs
[root@k8s-100 ~]# cd /opt/certs
[root@k8s-100 certs]# vim ca-csr.json
################## File contents ########################
{
    "CN": "Kevin",
    "hosts": [],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "beijing",
            "L": "beijing",
            "O": "Kevin",
            "OU": "www"
        }
    ],
    "ca": {
        "expiry": "175200h"
    }
}
##########################################
#CN: Common Name. Browsers use this field to verify that a site is legitimate, so it is usually set to the domain name.
#C: Country
#ST: State or province
#L: Locality (region or city)
#O: Organization Name (organization or company name)
#OU: Organization Unit Name (organizational unit or department)
#expiry: certificate lifetime. This value matters: certificates generated by kubeadm expire after 1 year and must be renewed afterwards.
[root@k8s-100 certs]# cfssl genkey -initca ca-csr.json |cfssl-json -bare ca
[root@www certs]# ll
total 16
-rw-r--r-- 1 root root 1037 Apr 16 21:47 ca.csr
-rw-r--r-- 1 root root 319 Apr 16 21:47 ca-csr.json
-rw------- 1 root root 1679 Apr 16 21:47 ca-key.pem
-rw-r--r-- 1 root root 1294 Apr 16 21:47 ca.pem
[root@www certs]#
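The expiry field in ca-csr.json is given in hours, which is hard to read at a glance. A small sketch of the conversion, using a hypothetical helper hours_to_years and assuming 365-day years:

```shell
# Converts a cfssl-style expiry such as "175200h" to whole years
# (integer division, 365-day years).
hours_to_years() {
    hours="${1%h}"              # strip the trailing "h" if present
    echo $((hours / 24 / 365))
}

# Usage:
#   hours_to_years 175200h
```

175200h therefore corresponds to 20 years, and a 1-year certificate corresponds to 8760h.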
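Copy the generated certificates to node 192.168.1.170
## Run on 170
[root@k8s-170 ~]# mkdir /etc/kubernetes/pki -p
[root@k8s-170 ~]# cd /etc/kubernetes/pki
[root@k8s-170 pki]# rsync -av 192.168.1.100:/opt/certs/{ca,ca-key}.pem .
## Rename the certificates. Both steps are required, because Kubernetes uses these two files as the CA root certificate and key
[root@k8s-170 pki]# mv ca.pem ca.crt
[root@k8s-170 pki]# mv ca-key.pem ca.key
Configure kubelet to use containerd (every node must set cgroup-driver=systemd, otherwise worker nodes cannot pull images and create pods automatically)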
[root@k8s-170 ~]# cat > /etc/sysconfig/kubelet <
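The contents of this heredoc are missing from the source. As an assumption only, the sketch below uses the kubelet flags commonly set for containerd with Kubernetes 1.20 (these flags exist in kubelet 1.20 but some were removed in later releases); the helper name write_kubelet_sysconfig is hypothetical.

```shell
# Hypothetical helper: writes kubelet extra arguments to the file given as $1.
write_kubelet_sysconfig() {
    cat > "$1" <<'EOF'
KUBELET_EXTRA_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock --cgroup-driver=systemd"
EOF
}

# Usage on a real node (as root):
#   write_kubelet_sysconfig /etc/sysconfig/kubelet
```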
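If you prefer not to modify /etc/sysconfig/kubelet, kubeadm init must be given a YAML file that passes the cgroupDriver parameter. The default init configuration can be exported with the following command
[root@k8s-170 ~]# kubeadm config print init-defaults > kubeadm-config.yaml
Then adjust the configuration to your needs, for example the imageRepository value and the kube-proxy mode (ipvs). Note that because containerd is the runtime, cgroupDriver must be set to systemd when the node is initialized. The modified kubeadm-config.yaml is as follows: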
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.170  ## IP address of this node
  bindPort: 6443  ## api-server port
nodeRegistration:
  criSocket: /run/containerd/containerd.sock  ## use containerd
  name: k8s-170.kevin.com  ## hostname of this node
  taints:  ## master nodes are tainted NoSchedule by default; this can also be set to null
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki  ## directory where certificates are generated
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd  ## etcd data directory; an external etcd can be configured instead
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers  ## image repository; the Alibaba Cloud mirror is used here
kind: ClusterConfiguration
kubernetesVersion: v1.20.5
controlPlaneEndpoint: 192.168.1.179:6443  ## address of the VIP
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16  ## must match the network plugin: Calico defaults to 192.168.0.0/16, flannel to 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs  ## use IPVS to schedule traffic
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd  ## use the systemd cgroup driver
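Since podSubnet has to agree with the network plugin, it can help to compare the two values before running kubeadm init. The sketch below is illustrative: the function names are hypothetical, and it assumes the kubeadm file uses a "podSubnet:" key and the flannel manifest contains a "Network" entry in net-conf.json.

```shell
# Prints the podSubnet value from a kubeadm config file ($1).
pod_subnet() {
    awk '$1 == "podSubnet:" { print $2; exit }' "$1"
}

# Prints the "Network" value from a flannel manifest ($1).
flannel_network() {
    sed -n 's/.*"Network": *"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Usage:
#   [ "$(pod_subnet kubeadm-config.yaml)" = "$(flannel_network kube-flannel.yml)" ] \
#       || echo "podSubnet and flannel Network differ"
```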
Run on every node
[root@k8s-170 ~]# systemctl enable --now kubelet
The containerd package is provided by the docker-ce repository
[root@k8s-170 ~]# dnf install -y yum-utils device-mapper-persistent-data lvm2
[root@k8s-170 ~]# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
[root@k8s-170 ~]# dnf search containerd.io --showduplicates
[root@k8s-170 ~]# dnf install -y containerd.io
[root@k8s-170 ~]# mkdir -p /etc/containerd
[root@k8s-170 ~]# containerd config default | sudo tee /etc/containerd/config.toml
## Under plugins."io.containerd.grpc.v1.cri", change sandbox_image to registry.aliyuncs.com/google_containers/pause:3.2
## Under containerd.runtimes.runc.options, add SystemdCgroup = true
## Under registry.mirrors."docker.io", change endpoint to https://registry.cn-hangzhou.aliyuncs.com
[root@k8s-170 ~]# vim /etc/containerd/config.toml
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.2"  ## changed
    [plugins."io.containerd.grpc.v1.cri".containerd]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true  ## add this line
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://registry.cn-hangzhou.aliyuncs.com"]  ## changed
[root@k8s-170 ~]# systemctl daemon-reload
[root@k8s-170 ~]# systemctl enable containerd
[root@k8s-170 ~]# systemctl restart containerd
crictl download page: https://github.com/kubernetes-sigs/cri-tools/releases/
[root@k8s-170 ~]# wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.21.0/crictl-v1.21.0-linux-amd64.tar.gz
[root@k8s-170 ~]# tar zxvf crictl-v1.21.0-linux-amd64.tar.gz -C /usr/local/bin
[root@k8s-170 ~]# cat > /etc/crictl.yaml <
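The body of /etc/crictl.yaml is missing from the source. A minimal sketch, assuming crictl should talk to the same containerd socket used as criSocket elsewhere in this guide; the timeout and debug values are assumptions and the helper name write_crictl_config is hypothetical.

```shell
# Hypothetical helper: writes a crictl configuration to the file given as $1.
write_crictl_config() {
    cat > "$1" <<'EOF'
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
}

# Usage on a real node (as root):
#   write_crictl_config /etc/crictl.yaml
```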
Verify on any one node
[root@k8s-170 ~]# crictl pull nginx
[root@k8s-170 ~]# crictl images
[root@k8s-170 ~]# crictl rmi nginx
[root@k8s-170 ~]# systemctl daemon-reload
[root@k8s-170 ~]# systemctl restart containerd && systemctl restart kubelet
Run on any one node
[root@k8s-170 ~]# containerd --version
containerd containerd.io 1.4.4 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
[root@k8s-170 ~]# kubelet --version
Kubernetes v1.20.5
Run on node 192.168.1.170 only
[root@k8s-170 ~]# kubeadm init --config=kubeadm-config.yaml --upload-certs
####### After a series of messages, the following output indicates that the master node was set up successfully ######################
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
#### Command for other nodes to join the cluster as masters
kubeadm join 192.168.1.179:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346 \
--control-plane --certificate-key aea69a228acb46bffd9faf54a1d29621cb21d8fb1b6dddb8b34417d94dba9132
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
#### Command for other nodes to join the cluster as workers
kubeadm join 192.168.1.179:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346
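### Run the following commands as instructed by the output above
[root@k8s-170 ~]# mkdir -p $HOME/.kube
[root@k8s-170 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-170 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
## Now list the nodes: the current node has joined the cluster but is still NotReady, because a network plugin still has to be installed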
[root@k8s-170 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-170.kevin.com NotReady control-plane,master 2m28s v1.20.5
[root@k8s-170 ~]#
## kube-flannel manifest: https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml
[root@k8s-170 ~]# vim kube-flannel.yml
########### Start of file #################
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.12.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.12.0-arm64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-arm64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - arm
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.12.0-arm
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-arm
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-ppc64le
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - ppc64le
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.12.0-ppc64le
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-ppc64le
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-s390x
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: kubernetes.io/arch
                    operator: In
                    values:
                      - s390x
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.12.0-s390x
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.12.0-s390x
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
########### End of file #################
[root@k8s-170 ~]# kubectl apply -f kube-flannel.yml
## Wait for the network plugin above to finish starting, then list the pods
[root@k8s-170 ~]# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-54d67798b7-968tp 1/1 Running 0 8m7s
kube-system coredns-54d67798b7-s5sh6 1/1 Running 0 8m7s
kube-system etcd-k8s-170.kevin.com 1/1 Running 0 8m22s
kube-system kube-apiserver-k8s-170.kevin.com 1/1 Running 0 8m22s
kube-system kube-controller-manager-k8s-170.kevin.com 1/1 Running 0 8m22s
kube-system kube-flannel-ds-amd64-kmcpz 1/1 Running 0 2m25s
kube-system kube-proxy-r6g59 1/1 Running 0 8m7s
kube-system kube-scheduler-k8s-170.kevin.com 1/1 Running 0 8m22s
[root@k8s-170 ~]#
### Listing the nodes again now shows the current node as Ready
[root@k8s-170 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-170.kevin.com Ready control-plane,master 7m22s v1.20.5
[root@k8s-170 ~]#
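Join node 192.168.1.171 to the cluster as a master
There are two ways to join another node to the cluster as a master; only one of them is needed. The first method is used here.
Method 1: run the join command directly on node 171
## Be patient, because images have to be pulled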
[root@k8s-171 ~]# kubeadm join 192.168.1.179:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346 \
--control-plane --certificate-key aea69a228acb46bffd9faf54a1d29621cb21d8fb1b6dddb8b34417d94dba9132
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
## Once the success message above appears, run the commands it suggests
[root@k8s-171 ~]# mkdir -p $HOME/.kube
[root@k8s-171 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-171 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
## Then check the node status from any master node (170 or 171)
[root@k8s-170 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-170.kevin.com Ready control-plane,master 9m56s v1.20.6
k8s-171.kevin.com Ready control-plane,master 106s v1.20.6
[root@k8s-170 ~]#
## Method 2: on node 170, copy the certificates to node 171
[root@k8s-170 ~]# rsync -av /etc/kubernetes/pki/* 192.168.1.171:/etc/kubernetes/pki/
## Copy kubeadm-config.yaml to node 171
[root@k8s-170 ~]# rsync -av kubeadm-config.yaml 192.168.1.171:/root
On node 171, edit kubeadm-config.yaml and change advertiseAddress and nodeRegistration.name; the resulting file is as follows
## [root@k8s-171 ~]# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.171  ## IP address of this node
  bindPort: 6443  ## api-server port
nodeRegistration:
  criSocket: /run/containerd/containerd.sock  ## use containerd
  name: k8s-171.kevin.com  ## hostname of this node
  taints:  ## master nodes are tainted NoSchedule by default; this can also be set to null
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki  ## directory where certificates are generated
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd  ## etcd data directory; an external etcd can be configured instead
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers  ## image repository; the Alibaba Cloud mirror is used here
kind: ClusterConfiguration
kubernetesVersion: v1.20.5
controlPlaneEndpoint: 192.168.1.179:6443  ## address of the VIP (same as on node 170)
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16  ## must match the network plugin: Calico defaults to 192.168.0.0/16, flannel to 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs  ## use IPVS to schedule traffic
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd  ## use the systemd cgroup driver
### This step pulls images and takes a while, so be patient
[root@k8s-171 ~]# kubeadm init --config=kubeadm-config.yaml --upload-certs
### If an error occurs, reset first, then copy the /etc/kubernetes/pki/ directory from node 170 to this node again; skip this step if there was no error
[root@k8s-171 ~]# kubeadm reset
### Run the following commands as instructed by the output above
[root@k8s-171 ~]# mkdir -p $HOME/.kube
[root@k8s-171 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-171 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
## The cluster node information is as follows
[root@k8s-171 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-170.kevin.com Ready control-plane,master 21m v1.20.5
k8s-171.kevin.com Ready control-plane,master 6m32s v1.20.5
[root@k8s-171 ~]#
Run on both node 172 and node 173
## This step pulls images, so be patient
[root@k8s-172 ~]# kubeadm join 192.168.1.179:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346
[preflight] Running pre-flight checks
[WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
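## The output above means the worker node joined the cluster successfully. Once the images and services on the worker nodes have started, list all cluster nodes again from any master node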
[root@k8s-170 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-170.kevin.com Ready control-plane,master 31m v1.20.5
k8s-171.kevin.com Ready control-plane,master 17m v1.20.5
k8s-172.kevin.com Ready 119s v1.20.5
k8s-173.kevin.com Ready 115s v1.20.5
[root@k8s-170 ~]# kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system calico-kube-controllers-69496d8b75-ksbsv 1/1 Running 0 29m 192.168.72.66 k8s-170.kevin.com
kube-system calico-node-ksc98 1/1 Running 0 3m12s 192.168.1.173 k8s-173.kevin.com
kube-system calico-node-qzxwv 1/1 Running 0 29m 192.168.1.170 k8s-170.kevin.com
kube-system calico-node-sb27w 1/1 Running 0 3m16s 192.168.1.172 k8s-172.kevin.com
kube-system calico-node-zr7x2 1/1 Running 0 18m 192.168.1.171 k8s-171.kevin.com
kube-system coredns-54d67798b7-5v9z4 1/1 Running 0 33m 192.168.72.67 k8s-170.kevin.com
kube-system coredns-54d67798b7-r8dg9 1/1 Running 0 33m 192.168.72.65 k8s-170.kevin.com
kube-system etcd-k8s-170.kevin.com 1/1 Running 0 32m 192.168.1.170 k8s-170.kevin.com
kube-system etcd-k8s-171.kevin.com 1/1 Running 0 18m 192.168.1.171 k8s-171.kevin.com
kube-system kube-apiserver-k8s-170.kevin.com 1/1 Running 0 32m 192.168.1.170 k8s-170.kevin.com
kube-system kube-apiserver-k8s-171.kevin.com 1/1 Running 0 18m 192.168.1.171 k8s-171.kevin.com
kube-system kube-controller-manager-k8s-170.kevin.com 1/1 Running 0 32m 192.168.1.170 k8s-170.kevin.com
kube-system kube-controller-manager-k8s-171.kevin.com 1/1 Running 0 18m 192.168.1.171 k8s-171.kevin.com
kube-system kube-proxy-k84js 1/1 Running 0 3m16s 192.168.1.172 k8s-172.kevin.com
kube-system kube-proxy-m4qm5 1/1 Running 0 18m 192.168.1.171 k8s-171.kevin.com
kube-system kube-proxy-p2xbd 1/1 Running 0 33m 192.168.1.170 k8s-170.kevin.com
kube-system kube-proxy-rgrzj 1/1 Running 0 3m12s 192.168.1.173 k8s-173.kevin.com
kube-system kube-scheduler-k8s-170.kevin.com 1/1 Running 0 32m 192.168.1.170 k8s-170.kevin.com
kube-system kube-scheduler-k8s-171.kevin.com 1/1 Running 0 18m 192.168.1.171 k8s-171.kevin.com
[root@k8s-170 ~]#
If you have lost the command for joining a worker node to the cluster, regenerate it on any master:
[root@k8s-170 ~]# kubeadm token create --print-join-command
kubeadm join 192.168.1.179:6443 --token 4s4dig.go8mpg3oso530u1z --discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346
[root@k8s-170 ~]#
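Because kubeadm-config.yaml sets mode: ipvs, kube-proxy runs in IPVS mode here (without that setting the default is iptables). The log confirms it: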
[root@k8s-170 ~]# kubectl -n kube-system logs kube-proxy-p2xbd | grep ipvs
I0415 12:52:02.610427 1 server_others.go:258] Using ipvs Proxier.
## The IPVS forwarding rules look like this
[root@k8s-170 ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.96.0.1:443 rr
-> 192.168.1.170:6443 Masq 1 6 0
TCP 10.96.0.10:53 rr
-> 192.168.72.65:53 Masq 1 0 0
-> 192.168.72.67:53 Masq 1 0 0
TCP 10.96.0.10:9153 rr
-> 192.168.72.65:9153 Masq 1 0 0
-> 192.168.72.67:9153 Masq 1 0 0
UDP 10.96.0.10:53 rr
-> 192.168.72.65:53 Masq 1 0 0
-> 192.168.72.67:53 Masq 1 0 0
[root@k8s-170 ~]#
read udp 10.244.0.3:58854->192.168.1.100:53: i/o timeout
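At this point the Kubernetes cluster should be fully deployed. However, the coredns logs kept reporting the error "read udp 10.244.0.3:58854->192.168.1.100:53: i/o timeout", and dig could not resolve names either. The cause was that the iptables rules had not taken effect; rebooting all nodes resolved the problem.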
[root@k8s-170 traefik]# kubectl logs -f coredns-54d67798b7-bxs95 -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: read udp 10.244.0.3:58854->192.168.1.100:53: i/o timeout
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: dial udp [fe80::1%ens33]:53: connect: invalid argument
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: read udp 10.244.0.3:50381->192.168.1.100:53: i/o timeout
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: dial udp [fe80::1%ens33]:53: connect: invalid argument
[root@k8s-170 ~]# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 443/TCP 3h6m
kube-system kube-dns ClusterIP 10.96.0.10 53/UDP,53/TCP,9153/TCP 3h6m
## Resolving a domain name with dig fails as well
[root@k8s-170 traefik]# dig -t A www.baidu.com @10.96.0.10 +short
Reference: https://blog.csdn.net/u013984806/article/details/124596484
Reference: https://blog.csdn.net/u013984806/article/details/124622967