No fluff, no wasted words, no unnecessary theory. Let's get straight to it.
I. Server initialization: run the commands below on every node
IP | Hostname | Role |
192.168.66.110 | (none) | VIP |
192.168.66.111 | k8s-master-111 | k8s-master, etcd, keepalived |
192.168.66.112 | k8s-master-112 | k8s-master, etcd, keepalived |
192.168.66.113 | k8s-master-113 | k8s-master, etcd, keepalived |
192.168.66.128 | k8s-node-128 | k8s-node |
Versions
OS: CentOS 7.6 (64-bit)
etcd: 3.4.13
docker: 20.10.6
kubectl: v1.20.6
kubelet: v1.20.6
kubeadm: v1.20.6
flannel: v0.14.0-rc1
Keepalived: v1.3.5
1. Update the system
yum -y update
2. Stop and disable the services that are not needed
systemctl stop firewalld
systemctl disable firewalld
systemctl stop postfix
systemctl disable postfix
3. Synchronize the time
rm -f /etc/localtime
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
/usr/sbin/ntpdate ntp1.aliyun.com
(echo "*/10 * * * * /usr/sbin/ntpdate asia.pool.ntp.org";crontab -l)|crontab
crontab -l
4. Disable swap
swapoff -a
sed -i '/swap/d' /etc/fstab
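Deleting the swap line outright makes it awkward to restore later. A gentler variant, sketched here under the assumption that swap entries carry swap as a whitespace-separated field, comments the line out instead. The helper name comment_out_swap and the sample file are illustrative only and are not part of the original steps.

```shell
#!/usr/bin/env bash
# Hypothetical helper: comment out active swap entries in an fstab-style file
# instead of deleting them. The result is written back to the same file.
comment_out_swap() {
  local fstab="$1"
  # Prefix "#" to uncommented lines that have "swap" as a field of its own
  sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab" > "$fstab.new" \
    && mv "$fstab.new" "$fstab"
}

# Try it on a throwaway sample before touching /etc/fstab
sample=$(mktemp)
printf '%s\n' \
  '/dev/mapper/centos-root / xfs defaults 0 0' \
  '/dev/mapper/centos-swap swap swap defaults 0 0' > "$sample"
comment_out_swap "$sample"
cat "$sample"
```

On a real node the call would be comment_out_swap /etc/fstab, run after swapoff -a.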
5. Let iptables process bridged traffic
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
6. Disable SELinux
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0
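7. Add hostname resolution and passwordless SSH (passwordless login makes it easier to send files)
Run on k8s-master-111
cat >> /etc/hosts << EOF
192.168.66.111 k8s-master-111
192.168.66.112 k8s-master-112
192.168.66.113 k8s-master-113
192.168.66.128 k8s-node-128
EOF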
ssh-keygen
ssh-copy-id k8s-master-112
ssh-copy-id k8s-master-113
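Appending to /etc/hosts with cat >> adds duplicate lines every time the step is repeated. A minimal sketch of an idempotent alternative follows; the helper name add_host_entry is hypothetical, and the demo writes to a temporary file rather than the real /etc/hosts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME is not there yet
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    echo "${ip} ${name}" >> "$file"
  fi
}

hosts_file=$(mktemp)   # use /etc/hosts on a real node
for n in 111 112 113; do
  add_host_entry "$hosts_file" "192.168.66.$n" "k8s-master-$n"
done
# A second call for an existing name adds nothing
add_host_entry "$hosts_file" 192.168.66.111 k8s-master-111
cat "$hosts_file"
```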
II. Install Docker
Find the version you need at https://download.docker.com/linux/centos/7/x86_64/stable/Packages/ and fetch it with wget
mkdir -p /data/k8scluster_packages/rpm
cd /data/k8scluster_packages/rpm
wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-20.10.6-3.el7.x86_64.rpm
yum install -y docker-ce-20.10.6-3.el7.x86_64.rpm
Configure Docker to use the systemd cgroup driver, then start it
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "insecure-registries": ["registry.xxx.com"],
  "graph": "/data/docker",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "4"
  }
}
EOF
systemctl start docker
systemctl enable docker
III. Download the required resources
1. Install kubectl-1.20.6, kubelet-1.20.6, and kubeadm-1.20.6
Point yum at the Aliyun mirror
cp -a /etc/yum.repos.d/ /data/yum.repos.d.backup
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum clean all
Check that the mirror carries 1.20.6
yum list kubelet kubeadm kubectl --showduplicates
Install kubectl-1.20.6, kubelet-1.20.6, and kubeadm-1.20.6
yum install -y kubectl-1.20.6 kubelet-1.20.6 kubeadm-1.20.6
Enable kubelet at boot
systemctl enable kubelet
While you are at it, download the kubectl-1.20.6, kubelet-1.20.6, and kubeadm-1.20.6 rpm packages so that worker nodes can be added easily later
cd /data/k8scluster_packages/rpm
yumdownloader kubectl-1.20.6 kubelet-1.20.6 kubeadm-1.20.6 cri-tools-1.13.0 kubernetes-cni-0.8.7 --downloaddir=./
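2. Run kubeadm config images list to see which image versions kubeadm needs to bring up the cluster. The default registry (k8s.gcr.io) is not reachable from mainland China, so the images have to be pulled from a mirror first
3. etcd is planned to run outside the cluster, so download it separately. Search for etcd 3.4.13 and get the download URL for that version from its GitHub releases page
mkdir -p /data/k8scluster_packages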
cd /data/k8scluster_packages
ETCD_VER=v3.4.13
curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ./etcd-${ETCD_VER}-linux-amd64.tar.gz
4. Open https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml in a browser and paste the contents into kube-flannel.yml
cat > kube-flannel.yml << EOF
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.14.0-rc1
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.14.0-rc1
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
EOF
5. Download the images: pull them straight from the Aliyun mirror, then retag them
Check which image version flannel needs
grep image kube-flannel.yml
docker pull registry.aliyuncs.com/google_containers/pause:3.2
docker pull registry.aliyuncs.com/google_containers/coredns:1.7.0
docker pull registry.aliyuncs.com/google_containers/kube-proxy:v1.20.6
docker pull registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.6
docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.6
docker pull registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.6
docker pull quay.io/coreos/flannel:v0.14.0-rc1
docker tag registry.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2
docker tag registry.aliyuncs.com/google_containers/coredns:1.7.0 k8s.gcr.io/coredns:1.7.0
docker tag registry.aliyuncs.com/google_containers/kube-proxy:v1.20.6 k8s.gcr.io/kube-proxy:v1.20.6
docker tag registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.6 k8s.gcr.io/kube-controller-manager:v1.20.6
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.6 k8s.gcr.io/kube-apiserver:v1.20.6
docker tag registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.6 k8s.gcr.io/kube-scheduler:v1.20.6
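The pull and tag pairs above all follow one pattern, so they can be generated from a single image list. The sketch below only prints the commands so they can be reviewed first; the function name gen_mirror_cmds and the variable names are illustrative, and flannel is left out because it is pulled from quay.io directly.

```shell
#!/usr/bin/env bash
MIRROR="registry.aliyuncs.com/google_containers"
TARGET="k8s.gcr.io"
IMAGES="pause:3.2 coredns:1.7.0 kube-proxy:v1.20.6 kube-controller-manager:v1.20.6 kube-apiserver:v1.20.6 kube-scheduler:v1.20.6"

# Print (do not run) the pull and tag commands for every image in the list
gen_mirror_cmds() {
  local img
  for img in $IMAGES; do
    echo "docker pull ${MIRROR}/${img}"
    echo "docker tag ${MIRROR}/${img} ${TARGET}/${img}"
  done
}

gen_mirror_cmds
```

Once the printed commands look right, gen_mirror_cmds | sh runs them.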
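6. Export the images (or push them to your own registry), then import the exported images on the other two master nodes.
mkdir images && cd images
Export the image files
for i in kube-proxy:v1.20.6 kube-controller-manager:v1.20.6 kube-scheduler:v1.20.6 kube-apiserver:v1.20.6 coredns:1.7.0 pause:3.2; do
packname=$(echo "$i" | awk -F':' '{print $1"."$2".tgz"}')
docker save "k8s.gcr.io/$i" -o "$packname"
done
# flannel was pulled from quay.io and never retagged under k8s.gcr.io, so save it by its own name
docker save quay.io/coreos/flannel:v0.14.0-rc1 -o flannel.v0.14.0-rc1.tgz
cd ..
Copy the image files to the other two nodes
scp -r images/ k8s-master-112:/root/
scp -r images/ k8s-master-113:/root/
Import the images on the other two nodes
cd images
for i in *tgz ;do
docker load < $i
done
7. On k8s-master-111, download cfssl, cfssljson, and cfssl-certinfo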
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
chmod +x cfssl_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssljson_linux-amd64
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl-certinfo_linux-amd64
mv cfssl-certinfo_linux-amd64 /usr/local/bin/cfssl-certinfo
IV. Install the etcd cluster
1. On k8s-master-111, create the files needed to generate the certificates
mkdir cert
cd cert
cat > ca-csr.json <
2. Give the certificates a long validity period: 87600h = 10 years.
cat > ca-config.json <
cat > etcd-csr.json <
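The bodies of the three JSON files are not shown above. As a rough sketch of what cfssl expects, the following writes plausible versions of them. The CN values, the key size, and the names fields are assumptions; the kubernetes profile name, the 87600h expiry, and the master IPs come from this guide. The hosts list in etcd-csr.json must cover every address etcd is reached on.

```shell
#!/usr/bin/env bash
# CA signing request (CN and names are placeholder assumptions)
cat > ca-csr.json << 'EOF'
{
  "CN": "etcd-ca",
  "key": { "algo": "rsa", "size": 2048 },
  "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "k8s", "OU": "System" } ]
}
EOF

# Signing policy: the "kubernetes" profile is the one passed to -profile later
cat > ca-config.json << 'EOF'
{
  "signing": {
    "default": { "expiry": "87600h" },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "87600h"
      }
    }
  }
}
EOF

# etcd server/peer/client certificate request, valid for all three masters
cat > etcd-csr.json << 'EOF'
{
  "CN": "etcd",
  "hosts": ["127.0.0.1", "192.168.66.111", "192.168.66.112", "192.168.66.113"],
  "key": { "algo": "rsa", "size": 2048 },
  "names": [ { "C": "CN", "ST": "Beijing", "L": "Beijing", "O": "k8s", "OU": "System" } ]
}
EOF
```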
3. Generate the certificates and send them to each master node
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
cfssl gencert -ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes etcd-csr.json | cfssljson -bare etcd
mkdir -p /etc/etcd/ssl
cp etcd.pem etcd-key.pem ca.pem /etc/etcd/ssl/
scp -r /etc/etcd k8s-master-112:/etc/
scp -r /etc/etcd k8s-master-113:/etc/
4. Install etcd and send it to each master node
cd /data/k8scluster_packages
tar xvf etcd-v3.4.13-linux-amd64.tar.gz
cd etcd-v3.4.13-linux-amd64
cp -a etcd etcdctl /usr/local/bin/
scp etcd etcdctl k8s-master-112:/usr/local/bin/
scp etcd etcdctl k8s-master-113:/usr/local/bin/
5. Create the etcd unit file on each node. !!! Remember to change the IPs in the file and the node name after --name=
Run on k8s-master-111
cat > /etc/systemd/system/etcd.service <
Run on k8s-master-112
cat > /etc/systemd/system/etcd.service <
Run on k8s-master-113
cat > /etc/systemd/system/etcd.service <
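The unit file contents are not shown above. Since the three files differ only in the member name and the IP, one way to avoid copy-paste mistakes is to generate them from a template. This is a sketch, not the author's unit file: the flags are standard etcd 3.4 options, the certificate paths and member names follow this guide, and the cluster token and the function name gen_etcd_unit are assumptions.

```shell
#!/usr/bin/env bash
CLUSTER="k8s-master-111=https://192.168.66.111:2380,k8s-master-112=https://192.168.66.112:2380,k8s-master-113=https://192.168.66.113:2380"

# Print an etcd.service unit for one member: gen_etcd_unit NAME IP
gen_etcd_unit() {
  local name="$1" ip="$2"
  # "\\" in an unquoted heredoc emits a literal backslash for line continuation
  cat << EOF
[Unit]
Description=etcd
After=network.target

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \\
  --name=${name} \\
  --data-dir=/var/lib/etcd \\
  --cert-file=/etc/etcd/ssl/etcd.pem \\
  --key-file=/etc/etcd/ssl/etcd-key.pem \\
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \\
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \\
  --trusted-ca-file=/etc/etcd/ssl/ca.pem \\
  --peer-trusted-ca-file=/etc/etcd/ssl/ca.pem \\
  --client-cert-auth \\
  --peer-client-cert-auth \\
  --listen-peer-urls=https://${ip}:2380 \\
  --initial-advertise-peer-urls=https://${ip}:2380 \\
  --listen-client-urls=https://${ip}:2379,https://127.0.0.1:2379 \\
  --advertise-client-urls=https://${ip}:2379 \\
  --initial-cluster-token=etcd-cluster-0 \\
  --initial-cluster=${CLUSTER} \\
  --initial-cluster-state=new
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
}

gen_etcd_unit k8s-master-111 192.168.66.111
```

On each node, redirect the output for that node into /etc/systemd/system/etcd.service, for example gen_etcd_unit k8s-master-112 192.168.66.112 > /etc/systemd/system/etcd.service.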
6. Start the etcd service on every master node
mkdir -p /var/lib/etcd
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
7. Check that every etcd node is healthy
for i in 111 112 113 ;do
ip=192.168.66.$i
echo "---> $ip <---"
etcdctl --endpoints=https://$ip:2379 --cacert=/etc/etcd/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem endpoint health
done
Every endpoint should be reported as healthy.
8. Check the etcd cluster
etcdctl \
-w table --cacert=/etc/etcd/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
--endpoints=https://192.168.66.111:2379,https://192.168.66.112:2379,https://192.168.66.113:2379 endpoint status
The table should list all three endpoints. In the author's run 192.168.66.112 was the leader and the other two were not, which shows that etcd is now running as a cluster.
V. Install keepalived
1. Install keepalived on each of the three master servers
yum -y install keepalived
2. Create the configuration file on each master node
On k8s-master-111 (change the IPs to your own)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id LVS_k8s
}
vrrp_script CheckK8sMaster {
    script "curl -k https://192.168.66.110:6443"   # VIP
    interval 3
    timeout 9
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP                    # nopreempt only takes effect when the initial state is BACKUP
    interface ens192                # local NIC name
    virtual_router_id 61
    priority 120                    # priority, must be unique per node
    advert_int 1
    mcast_src_ip 192.168.66.111     # local IP
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
        192.168.66.112
        192.168.66.113
    }
    virtual_ipaddress {
        192.168.66.110/24           # VIP
    }
    track_script {
        CheckK8sMaster
    }
}
EOF
On k8s-master-112 (change the IPs to your own)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id LVS_k8s
}
vrrp_script CheckK8sMaster {
    script "curl -k https://192.168.66.110:6443"   # VIP
    interval 3
    timeout 9
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens192                # local NIC name
    virtual_router_id 61
    priority 110                    # priority, must be unique per node
    advert_int 1
    mcast_src_ip 192.168.66.112     # local IP
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
        192.168.66.111
        192.168.66.113
    }
    virtual_ipaddress {
        192.168.66.110/24           # VIP
    }
    track_script {
        CheckK8sMaster
    }
}
EOF
On k8s-master-113 (change the IPs to your own)
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
    router_id LVS_k8s
}
vrrp_script CheckK8sMaster {
    script "curl -k https://192.168.66.110:6443"   # VIP
    interval 3
    timeout 9
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens160                # local NIC name
    virtual_router_id 61
    priority 100                    # priority, must be unique per node
    advert_int 1
    mcast_src_ip 192.168.66.113     # local IP
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
        192.168.66.111
        192.168.66.112
    }
    virtual_ipaddress {
        192.168.66.110/24           # VIP
    }
    track_script {
        CheckK8sMaster
    }
}
EOF
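The three keepalived files differ only in the local IP, the priority, the NIC name, and the peer list, so they can be produced from one template. A minimal sketch follows; the function names peers_for and gen_keepalived_conf are hypothetical. It emits state BACKUP for every node, which is an assumption of this sketch: the keepalived documentation states that nopreempt requires the initial state to be BACKUP.

```shell
#!/usr/bin/env bash
MASTERS="192.168.66.111 192.168.66.112 192.168.66.113"
VIP="192.168.66.110"

# Print every master IP except the local one, one per line
peers_for() {
  local self="$1" ip
  for ip in $MASTERS; do
    if [ "$ip" != "$self" ]; then
      echo "$ip"
    fi
  done
}

# Print a keepalived.conf: gen_keepalived_conf LOCAL_IP PRIORITY INTERFACE
gen_keepalived_conf() {
  local self="$1" priority="$2" iface="$3"
  cat << EOF
global_defs {
    router_id LVS_k8s
}
vrrp_script CheckK8sMaster {
    script "curl -k https://${VIP}:6443"
    interval 3
    timeout 9
    fall 2
    rise 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ${iface}
    virtual_router_id 61
    priority ${priority}
    advert_int 1
    mcast_src_ip ${self}
    nopreempt
    authentication {
        auth_type PASS
        auth_pass sqP05dQgMSlzrxHj
    }
    unicast_peer {
$(peers_for "$self" | sed 's/^/        /')
    }
    virtual_ipaddress {
        ${VIP}/24
    }
    track_script {
        CheckK8sMaster
    }
}
EOF
}

gen_keepalived_conf 192.168.66.112 110 ens192
```

On each node, redirect the output into /etc/keepalived/keepalived.conf with that node's own IP, priority, and NIC name.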
3. Start the keepalived service
systemctl enable keepalived
systemctl start keepalived
systemctl status keepalived
4. Check that the VIP is usable
Check on k8s-master-111
ip a
ping 192.168.66.110
If everything is working, 192.168.66.110 appears on the NIC in the ip a output and answers the ping.
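VI. Initialize the k8s cluster
1. Create the kubeadm-conf.yaml and kube-flannel.yml configuration files, adjusting the following settings
★ certSANs: change the IPs and the matching master hostnames
★ etcd endpoints: change to the IPs of your etcd nodes
★ controlPlaneEndpoint: change to the VIP
★ serviceSubnet: the IP range that Services inside k8s will use
★ podSubnet: the IP range that pods inside k8s will use
★ kubernetesVersion: change to the matching version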
mkdir kubeconf
cd kubeconf
cat > kubeadm-conf.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
---
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
  - 192.168.66.110
  - 192.168.66.111
  - 192.168.66.112
  - 192.168.66.113
  - "k8s-master-111"
  - "k8s-master-112"
  - "k8s-master-113"
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  external:
    endpoints:
    - https://192.168.66.111:2379
    - https://192.168.66.112:2379
    - https://192.168.66.113:2379
    caFile: /etc/etcd/ssl/ca.pem
    certFile: /etc/etcd/ssl/etcd.pem
    keyFile: /etc/etcd/ssl/etcd-key.pem
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.20.6
controlPlaneEndpoint: "192.168.66.110:6443"
networking:
  dnsDomain: cluster.local
  serviceSubnet: 172.0.0.0/16
  podSubnet: 10.0.0.0/16
scheduler: {}
EOF
2. Create the k8s cluster with kubeadm
kubeadm init --config kubeadm-conf.yaml --ignore-preflight-errors=swap
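On success, kubeadm prints follow-up instructions, including the kubeadm join commands used in the steps below.
PS: if initialization fails, be sure to clear kubeadm's state and the data in etcd before initializing again, otherwise the next init will fail with other errors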
kubeadm reset
etcdctl \
--endpoints="https://192.168.66.111:2379,https://192.168.66.112:2379,https://192.168.66.113:2379" \
--cacert=/etc/etcd/ssl/ca.pem \
--cert=/etc/etcd/ssl/etcd.pem \
--key=/etc/etcd/ssl/etcd-key.pem \
del /registry --prefix
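In the author's run, the first etcdctl del reported 343 keys deleted, and a second run reported that nothing was left to delete.
3. Run the commands from the init output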
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
4. At this point you can take a look at the cluster
kubectl get nodes
kubectl get po --all-namespaces -o wide
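If coredns has not started, it usually comes up once flannel is running. It is pending because no node is ready for it to be scheduled on, so ignore it for now.
You can run kubectl describe po coredns-74ff55c5b-gfztj -n kube-system (substituting your own pod name) to see why it has not started
5. Start the flannel network for the k8s cluster, then list all pods in k8s to verify that flannel is up
Edit kube-flannel.yml: leave everything else alone and change only the Network value in net-conf.json to the podSubnet configured in kubeadm-conf.yaml above (10.0.0.0/16)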
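The manual edit is easy to forget or mistype, so it can be scripted. The sketch below reads podSubnet from the kubeadm config and writes it into the Network field of the flannel manifest. The function name sync_pod_subnet is hypothetical, and the sketch assumes each file contains exactly one podSubnet line and one "Network" line, as the files in this guide do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy podSubnet from the kubeadm config into flannel's net-conf.json
sync_pod_subnet() {
  local kubeadm_conf="$1" flannel_yml="$2" subnet
  # Field splitting ignores indentation, so this matches "  podSubnet: 10.0.0.0/16"
  subnet=$(awk '$1 == "podSubnet:" {print $2}' "$kubeadm_conf")
  if [ -z "$subnet" ]; then
    echo "podSubnet not found in $kubeadm_conf" >&2
    return 1
  fi
  # "#" is the sed delimiter because the CIDR itself contains "/"
  sed "s#\"Network\": \"[^\"]*\"#\"Network\": \"${subnet}\"#" "$flannel_yml" > "$flannel_yml.new" \
    && mv "$flannel_yml.new" "$flannel_yml"
}

# Usage (adjust the paths to wherever the two files live):
#   sync_pod_subnet kubeadm-conf.yaml kube-flannel.yml
#   grep Network kube-flannel.yml
```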
6. Start flannel
kubectl create -f kube-flannel.yml
kubectl get pod --all-namespaces
7. Join the other two master nodes to the cluster
Copy the kubernetes certificates to the other two masters
scp -r /etc/kubernetes/pki/ k8s-master-112:/etc/kubernetes/
scp -r /etc/kubernetes/pki/ k8s-master-113:/etc/kubernetes/
On the other two nodes, run the join command for masters. Note that it is the one with --control-plane
kubeadm join 192.168.66.110:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:0b73a83b1fa6d33456e84a2bf4cf674decfd4f174ba9469f21815dfebde2423d \
--control-plane
When the join finishes, kubeadm confirms that the node has joined the cluster.
Check the nodes again and wait until all of them are Ready
kubectl get nodes
Check how each pod is starting up
kubectl get po --all-namespaces -o wide
VII. Join worker nodes to the cluster
1. First complete steps 1 through 6 of section I on the new node
2. On a master node, copy the docker, kubelet, kubectl, and kubeadm rpm packages downloaded earlier to k8s-node-128
scp -r rpm k8s-node-128:/root/
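3. Then, from the master node, copy the images the worker needs. Apart from coredns, the other three images are used as soon as the node joins the cluster. coredns is copied as well in case a coredns pod is later rescheduled onto this node; without the image it would fail to start
cd /data/k8scluster_packages/images
ssh k8s-node-128 mkdir -p /root/images
scp flannel.v0.14.0-rc1.tgz kube-proxy.v1.20.6.tgz pause.3.2.tgz coredns.1.7.0.tgz k8s-node-128:/root/images/
4. On the node, install the rpm packages (remember to set up the docker configuration file)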
cd rpm
yum install -y *rpm
5. Import the images
cd /root/images/
for i in *tgz ;do
docker load < $i
done
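6. A worker node only needs these three images: flannel, kube-proxy, and pause
7. Add the node. There is no need to copy pki; run the worker join command directly on the node. Note: worker nodes join with the command that does not have the --control-plane flag.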
kubeadm join 192.168.66.110:6443 --token 7grf8m.wrjr5h53kzk1q7pf --discovery-token-ca-cert-hash sha256:0b73a83b1fa6d33456e84a2bf4cf674decfd4f174ba9469f21815dfebde2423d
After the node has joined, check from a master node
kubectl get nodes -o wide
kubectl get po -n kube-system -o wide
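VIII. Adding worker nodes to the cluster later
Bootstrap tokens expire (the kubeadm-conf.yaml above sets ttl: 24h0m0s), so a node that joins later needs a new token. Generate one on a master node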
kubeadm token create --print-join-command
Then run the new join command on the worker node.
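A join command has two moving parts: the bootstrap token and the sha256 hash of the cluster CA's public key. kubeadm token create --print-join-command prints both, but the hash can also be recomputed with openssl, which is handy for checking a join command by hand. A sketch, assuming an RSA CA key (the kubeadm default) stored at /etc/kubernetes/pki/ca.crt; the function name ca_cert_hash is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the --discovery-token-ca-cert-hash value for a CA certificate
ca_cert_hash() {
  local ca_crt="$1"
  printf 'sha256:'
  # Extract the public key, convert it to DER, hash it, and keep only the hex digest
  openssl x509 -pubkey -noout -in "$ca_crt" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# On a master node:
#   ca_cert_hash /etc/kubernetes/pki/ca.crt
```

The printed value should match the hash in the join commands shown earlier for the same cluster.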
===============----------------------->>>
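To connect the pod network with the working network, see the next document, 《将k8s中pod网络和集群之外网络打通方案》 (connecting the k8s pod network to networks outside the cluster)
To put an nginx proxy in front of k8s, see the document 《nginx代理k8s应用服务的几种方案》 (several ways to proxy k8s application services with nginx)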