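Notes:
1. Many Kubernetes resources are blocked from within mainland China, so prepare everything you need before creating the cluster.
2. Watch out for version compatibility; the details can be checked on the official Kubernetes site.
I. Install Docker on every node
arm64 systems (Armbian):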
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh --mirror Aliyun
Check the Docker version:
docker version
amd64 systems (Ubuntu), apt source entry:
deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable
Configure a registry mirror:
mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://6mfq3rtl.mirror.aliyuncs.com"]
}
EOF
Restart Docker to apply the configuration:
systemctl daemon-reload
systemctl restart docker
Other commands
Check the Docker process:
ps aux|grep docker
systemctl status docker
Uninstall Docker:
mv /var/lib/dpkg/info/docker-ce* /tmp/
dpkg --remove --force-remove-reinstreq docker-ce
apt autoremove
apt autoclean
II. Install the Kubernetes packages
1. Prepare the environment
a. Disable swap: sudo swapoff -a
Then edit the sysctl configuration: vi /etc/sysctl.conf
and set vm.swappiness = 0
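Note that swapoff -a only lasts until the next reboot. A minimal sketch for making it persistent, assuming swap is declared in an fstab-style file: the hypothetical helper comment_out_swap below comments out every active swap entry in the file it is given, so it can be tried on a copy before touching /etc/fstab. It assumes GNU sed (for -i).

```shell
# Comment out every active swap entry in an fstab-style file.
# $1: path of the file to edit in place (try a copy of /etc/fstab first).
comment_out_swap() {
    # Match lines that are not already comments and whose third field
    # (the filesystem type) is "swap", then prefix them with "# ".
    sed -i -E 's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/# \1/' "$1"
}

# Example (as root, after checking the result on a copy):
#   cp /etc/fstab /tmp/fstab.test && comment_out_swap /tmp/fstab.test
#   comment_out_swap /etc/fstab
```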
b. Configure the Kubernetes package source
vi /etc/apt/sources.list
For amd64:
deb [arch=amd64] https://mirrors.ustc.edu.cn/kubernetes/apt kubernetes-xenial main
For arm64:
deb [arch=arm64] https://mirrors.ustc.edu.cn/kubernetes/apt kubernetes-xenial main
apt-get update
If the update fails with "the public key is not available: NO_PUBKEY 6A030B21BA07F4FB",
add the corresponding key with the following commands:
gpg --keyserver keyserver.ubuntu.com --recv-keys 6A030B21BA07F4FB
gpg --export --armor 6A030B21BA07F4FB | sudo apt-key add -
2. Install the packages
apt-get install -y kubeadm kubectl kubelet
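Because the images prepared later are all v1.17.4, it may be safer to pin the packages to the same release instead of installing whatever is newest. A minimal sketch, assuming the repository names its packages as <version>-00 (check with apt-cache madison kubeadm); k8s_packages is a hypothetical helper that only prints the package arguments.

```shell
# Print one "package=version" argument per line for the three Kubernetes packages.
# $1: apt package version, e.g. 1.17.4-00 (the -00 suffix is an assumption).
k8s_packages() {
    for pkg in kubeadm kubectl kubelet; do
        printf '%s=%s\n' "$pkg" "$1"
    done
}

# Example (as root):
#   apt-get install -y $(k8s_packages 1.17.4-00)
#   apt-mark hold kubeadm kubectl kubelet   # keep apt upgrade from moving them
```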
3. Prepare the configuration
a. Create /etc/default/kubelet: vi /etc/default/kubelet
and add: KUBELET_EXTRA_ARGS=--cgroup-driver=systemd --fail-swap-on=false
b. Create daemon.json
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <
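The body of the heredoc above is not shown. As a minimal sketch only, assuming the goal was to make Docker use the same systemd cgroup driver that is passed to the kubelet in step a while keeping the registry mirror from section I, something like the following could be used. write_daemon_json is a hypothetical helper, and the exec-opts entry is my assumption rather than the original content.

```shell
# Write a Docker daemon.json that selects the systemd cgroup driver
# (assumed, to match --cgroup-driver=systemd on the kubelet) and keeps
# the Aliyun registry mirror configured earlier.
# $1: target path, normally /etc/docker/daemon.json
write_daemon_json() {
    cat > "$1" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://6mfq3rtl.mirror.aliyuncs.com"]
}
EOF
}

# Example (as root):
#   mkdir -p /etc/docker && write_daemon_json /etc/docker/daemon.json
```

After restarting Docker, docker info should report "Cgroup Driver: systemd" if the setting took effect.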
4. Apply the new configuration
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl restart kubelet
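III. Preparing images and configuration files before building the cluster
Normally, applying the relevant files pulls the Docker images automatically, but from within China the GFW makes the upstream registries essentially unreachable. There are three workarounds:
1. Pull the images from a domestic mirror registry, then retag them with the upstream names.
2. Edit the relevant YAML files so they reference the domestic mirror images directly.
3. Run your own registry and have every node pull from it.
I use the first approach here.
1. Prepare the images
-- arm64
Master nodes: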
docker pull gcr.azk8s.cn/google_containers/kube-apiserver-arm64:v1.17.4
docker pull gcr.azk8s.cn/google_containers/kube-controller-manager-arm64:v1.17.4
docker pull gcr.azk8s.cn/google_containers/kube-scheduler-arm64:v1.17.4
docker pull gcr.azk8s.cn/google_containers/etcd-arm64:3.3.10
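This deployment uses an external etcd cluster, so the last image can be skipped; it is only needed with the built-in etcd. (With the built-in etcd, Kubernetes deploys a single node by default, without HA; you have to change that yourself.)
Master & slave nodes: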
docker pull gcr.azk8s.cn/google_containers/kube-proxy-arm64:v1.17.4
docker pull gcr.azk8s.cn/google_containers/pause-arm64:3.1
docker pull coredns/coredns:coredns-arm64
Retag the images:
Master nodes:
docker tag gcr.azk8s.cn/google_containers/kube-apiserver-arm64:v1.17.4 k8s.gcr.io/kube-apiserver:v1.17.4
docker tag gcr.azk8s.cn/google_containers/kube-controller-manager-arm64:v1.17.4 k8s.gcr.io/kube-controller-manager:v1.17.4
docker tag gcr.azk8s.cn/google_containers/kube-scheduler-arm64:v1.17.4 k8s.gcr.io/kube-scheduler:v1.17.4
docker tag gcr.azk8s.cn/google_containers/etcd-arm64:3.3.10 k8s.gcr.io/etcd:3.3.10
Master & slave nodes:
docker tag gcr.azk8s.cn/google_containers/kube-proxy-arm64:v1.17.4 k8s.gcr.io/kube-proxy:v1.17.4
docker tag gcr.azk8s.cn/google_containers/pause-arm64:3.1 k8s.gcr.io/pause:3.1
docker tag coredns/coredns:coredns-arm64 k8s.gcr.io/coredns:1.6.5
Remove the tags that are no longer needed:
Master nodes:
docker rmi gcr.azk8s.cn/google_containers/kube-apiserver-arm64:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/kube-controller-manager-arm64:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/kube-scheduler-arm64:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/etcd-arm64:3.3.10
Master & slave nodes:
docker rmi gcr.azk8s.cn/google_containers/kube-proxy-arm64:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/pause-arm64:3.1
docker rmi coredns/coredns:coredns-arm64
-- amd64 images (split between master and slave nodes as above):
docker pull gcr.azk8s.cn/google_containers/kube-apiserver:v1.17.4
docker pull gcr.azk8s.cn/google_containers/kube-controller-manager:v1.17.4
docker pull gcr.azk8s.cn/google_containers/kube-scheduler:v1.17.4
docker pull gcr.azk8s.cn/google_containers/kube-proxy:v1.17.4
docker pull gcr.azk8s.cn/google_containers/pause:3.1
docker pull gcr.azk8s.cn/google_containers/etcd:3.4.3-0
docker pull gcr.azk8s.cn/google_containers/coredns:1.6.5
docker tag gcr.azk8s.cn/google_containers/kube-apiserver:v1.17.4 k8s.gcr.io/kube-apiserver:v1.17.4
docker tag gcr.azk8s.cn/google_containers/kube-controller-manager:v1.17.4 k8s.gcr.io/kube-controller-manager:v1.17.4
docker tag gcr.azk8s.cn/google_containers/kube-scheduler:v1.17.4 k8s.gcr.io/kube-scheduler:v1.17.4
docker tag gcr.azk8s.cn/google_containers/kube-proxy:v1.17.4 k8s.gcr.io/kube-proxy:v1.17.4
docker tag gcr.azk8s.cn/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
docker tag gcr.azk8s.cn/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
docker tag gcr.azk8s.cn/google_containers/coredns:1.6.5 k8s.gcr.io/coredns:1.6.5
docker rmi gcr.azk8s.cn/google_containers/kube-apiserver:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/kube-controller-manager:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/kube-scheduler:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/kube-proxy:v1.17.4
docker rmi gcr.azk8s.cn/google_containers/pause:3.1
docker rmi gcr.azk8s.cn/google_containers/etcd:3.4.3-0
docker rmi gcr.azk8s.cn/google_containers/coredns:1.6.5
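The pull, tag and rmi lines above all follow one pattern, so they can be generated instead of typed. A minimal sketch: retag_commands is a hypothetical helper that only prints the three docker commands for one image, which makes it easy to review the output before piping it to sh. The mirror prefix is a parameter because mirror availability changes over time.

```shell
# Print the pull/tag/rmi commands for one image.
# $1: mirror prefix, e.g. gcr.azk8s.cn/google_containers
# $2: image name,    e.g. kube-proxy
# $3: tag,           e.g. v1.17.4
# $4: optional architecture suffix used by the mirror, e.g. -arm64
retag_commands() {
    src="$1/$2${4:-}:$3"
    dst="k8s.gcr.io/$2:$3"
    printf 'docker pull %s\n' "$src"
    printf 'docker tag %s %s\n' "$src" "$dst"
    printf 'docker rmi %s\n' "$src"
}

# Example: review first, then execute on an arm64 node.
#   retag_commands gcr.azk8s.cn/google_containers kube-proxy v1.17.4 -arm64
#   retag_commands gcr.azk8s.cn/google_containers kube-proxy v1.17.4 -arm64 | sh
```

The coredns image on arm64 comes from a different repository with a different tag (coredns/coredns:coredns-arm64), so it does not fit this pattern and still needs the explicit commands.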
2. Prepare the files for initializing the cluster
kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
etcd:
  external:
    endpoints:
    - https://192.168.2.245:2379
    - https://192.168.2.246:2379
    - https://192.168.2.251:2379
    caFile: /opt/etcd/ssl/ca.pem
    certFile: /opt/etcd/ssl/etcd.pem
    keyFile: /opt/etcd/ssl/etcd-key.pem
kubernetesVersion: v1.17.4
networking:
  podSubnet: "172.17.0.0/16"
clusterName: "arm-cluster"
# In v1beta2 the former apiServerCertSANs is written as apiServer.certSANs
apiServer:
  certSANs:
  - "ubuntu19"
  - "armbian1"
  - "armbian2"
  - "armbian3"
  - "armbian4"
  - "armbian5"
  - "192.168.2.251"
  - "192.168.2.245"
  - "192.168.2.246"
  - "192.168.2.247"
  - "192.168.2.248"
  - "192.168.2.249"
  - "127.0.0.1"
# In v1beta2 the former controllerManagerExtraArgs is written as controllerManager.extraArgs
controllerManager:
  extraArgs:
    node-monitor-grace-period: 10s
    pod-eviction-timeout: 10s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# kubelet specific options here
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
iptables:
  masqueradeAll: true
  masqueradeBit: 14
  minSyncPeriod: 0s
  syncPeriod: 30s
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: ""
  syncPeriod: 30s
mode: "ipvs"
nodePortAddresses: null
oomScoreAdj: -999
portRange: ""
resourceContainer: /kube-proxy
udpIdleTimeout: 250ms
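etcd: change this to the IPs of your own etcd cluster and the location where its certificates are stored.
apiServerCertSANs (apiServer.certSANs in kubeadm.k8s.io/v1beta2): list the IP and hostname of every planned node, both those running now and those that may be added later.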
kube-flannel.yaml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: psp.flannel.unprivileged
annotations:
seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
privileged: false
volumes:
- configMap
- secret
- emptyDir
- hostPath
allowedHostPaths:
- pathPrefix: "/etc/cni/net.d"
- pathPrefix: "/etc/kube-flannel"
- pathPrefix: "/run/flannel"
readOnlyRootFilesystem: false
# Users and groups
runAsUser:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
fsGroup:
rule: RunAsAny
# Privilege Escalation
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
# Capabilities
allowedCapabilities: ['NET_ADMIN']
defaultAddCapabilities: []
requiredDropCapabilities: []
# Host namespaces
hostPID: false
hostIPC: false
hostNetwork: true
hostPorts:
- min: 0
max: 65535
# SELinux
seLinux:
# SELinux is unused in CaaSP
rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups: ['extensions']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.0.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-amd64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: beta.kubernetes.io/os
operator: In
values:
- linux
- key: beta.kubernetes.io/arch
operator: In
values:
- amd64
hostNetwork: true
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-arm64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: beta.kubernetes.io/os
operator: In
values:
- linux
- key: beta.kubernetes.io/arch
operator: In
values:
- arm64
hostNetwork: true
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-arm64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-arm64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-arm
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: beta.kubernetes.io/os
operator: In
values:
- linux
- key: beta.kubernetes.io/arch
operator: In
values:
- arm
hostNetwork: true
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-arm
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-arm
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-ppc64le
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: beta.kubernetes.io/os
operator: In
values:
- linux
- key: beta.kubernetes.io/arch
operator: In
values:
- ppc64le
hostNetwork: true
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-ppc64le
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-ppc64le
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-s390x
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: beta.kubernetes.io/os
operator: In
values:
- linux
- key: beta.kubernetes.io/arch
operator: In
values:
- s390x
hostNetwork: true
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-s390x
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-s390x
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
"Network": "10.0.0.0/16": change this to the internal subnet you have planned; it should match podSubnet in kubeadm-config.yaml.
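Flannel expects its Network to be the same range as the pod subnet given to kubeadm, and in the two files above the values differ (172.17.0.0/16 versus 10.0.0.0/16). A minimal sketch of a check to run before kubeadm init; pod_subnet, flannel_network and subnets_match are hypothetical helpers that use plain sed on the two files rather than a real YAML parser, so they assume each key appears on its own line as it does here.

```shell
# Print the podSubnet value found in a kubeadm config file ($1).
pod_subnet() {
    sed -n 's/^[[:space:]]*podSubnet:[[:space:]]*"\{0,1\}\([^"[:space:]]*\)"\{0,1\}.*$/\1/p' "$1" | head -n 1
}

# Print the "Network" value found in flannel's net-conf.json section ($1).
flannel_network() {
    sed -n 's/^[[:space:]]*"Network":[[:space:]]*"\([^"]*\)".*$/\1/p' "$1" | head -n 1
}

# Succeed only when both files declare the same pod network.
# $1: kubeadm config file, $2: flannel manifest
subnets_match() {
    [ "$(pod_subnet "$1")" = "$(flannel_network "$2")" ]
}

# Example:
#   subnets_match kubeadm-config.yaml kube-flannel.yaml || echo "pod subnet mismatch"
```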
IV. Initialize the cluster
kubeadm init --config kubeadm-config.yaml
On success it prints:
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.2.245:6443 --token slopnt.dpoed6pxm8d5dlwn \
--discovery-token-ca-cert-hash sha256:7c24d128fa0be02bceb4f421399b1d8b9a3e9214e0e816d15b0d9eee2d64d3d0
At this point the cluster still has no CNI configured, so run:
kubectl apply -f kube-flannel.yaml
With that, the cluster is created.
V. Other configuration
If kubectl commands fail with
The connection to the server localhost:8080 was refused - did you specify the right host or port?
run
export KUBECONFIG=/etc/kubernetes/admin.conf
You can add this line to the shell profile of the user you normally log in as:
vi ~/.bashrc
export KUBECONFIG=/etc/kubernetes/admin.conf
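Editing ~/.bashrc by hand on every node gets repetitive. A minimal sketch of an idempotent alternative: ensure_line is a hypothetical helper that appends a line to a file only when that exact line is not already present, so running it twice does not duplicate the export.

```shell
# Append a line to a file unless the exact same line is already there.
# $1: file path, $2: line to ensure
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example:
#   ensure_line "$HOME/.bashrc" 'export KUBECONFIG=/etc/kubernetes/admin.conf'
```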
Let the master also act as a worker node:
kubectl taint nodes --all node-role.kubernetes.io/master-
VI. Common commands for inspecting the cluster
List the nodes: kubectl get nodes
List the system pods: kubectl get pod -n kube-system -o wide
Show the details of a pod: kubectl describe pod podid -n kube-system
Follow the logs of a pod: kubectl logs -f kubernetes-dashboard-9df9d586-pl26v -n kube-system
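After applying flannel it usually takes a little while for every node to become Ready. A minimal sketch for spotting the ones that are not: not_ready_nodes is a hypothetical filter that reads the output of kubectl get nodes --no-headers on stdin and prints the names of nodes whose STATUS column does not start with Ready.

```shell
# Read "kubectl get nodes --no-headers" output on stdin and print the name
# of every node whose STATUS is not Ready (Ready,SchedulingDisabled counts as Ready).
not_ready_nodes() {
    awk '$2 !~ /^Ready(,|$)/ { print $1 }'
}

# Example:
#   kubectl get nodes --no-headers | not_ready_nodes
```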
VII. Deleting and cleaning up the cluster
Delete the cluster:
kubeadm reset
Clean up leftover state:
rm -rf $HOME/.kube/config
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/kubernetes/
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ip link delete cni0
ip link delete flannel.1
Clean up leftover iptables rules:
systemctl stop kubelet
systemctl stop docker
iptables --flush
iptables -t nat --flush
systemctl start kubelet
systemctl start docker