Cluster
Hostname | IP address |
---|---|
k8s-master01 | 192.168.200.101 |
k8s-master02 | 192.168.200.102 |
k8s-master03 | 192.168.200.103 |
k8s-node01 | 192.168.200.201 |
k8s-node02 | 192.168.200.202 |
VIP (virtual IP)
192.168.200.80
Harbor
http://192.168.200.50
1. Configure a static IP (a sketch follows below)
2. Enable SSH remote login
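A minimal sketch for these two steps on CentOS 7 using NetworkManager; the connection name ens33, gateway and DNS values are assumptions, and each node uses its own IP from the table above:
nmcli connection modify ens33 ipv4.method manual \
  ipv4.addresses 192.168.200.101/24 \
  ipv4.gateway 192.168.200.2 \
  ipv4.dns 223.5.5.5
nmcli connection up ens33
# SSH remote login: make sure sshd is enabled and running
systemctl enable --now sshd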
yum install epel-release -y
yum makecache fast
yum install -y yum-utils device-mapper-persistent-data lvm2
sed -i 's/enforcing/disabled/' /etc/selinux/config
sed -ri 's/.*swap.*/#&/' /etc/fstab
swapoff -a
vi /etc/hostname
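Set the hostname on each node to match the table above; hostnamectl does the same thing and applies immediately, for example on the first master:
hostnamectl set-hostname k8s-master01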
cat >> /etc/hosts <<EOF
192.168.200.101 k8s-master01
192.168.200.102 k8s-master02
192.168.200.103 k8s-master03
192.168.200.201 k8s-node01
192.168.200.202 k8s-node02
EOF
Install ntp / ntpdate
yum -y install ntp
Sync the time
ntpdate ntp1.aliyun.com
Configure automatic time sync at boot and on a schedule (a sketch follows below)
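One simple way to do this; the 30-minute interval and the rc.local approach are assumptions, adjust as needed:
# Sync every 30 minutes via cron
(crontab -l 2>/dev/null; echo "*/30 * * * * /usr/sbin/ntpdate ntp1.aliyun.com") | crontab -
# Sync once at boot
echo "/usr/sbin/ntpdate ntp1.aliyun.com" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local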
Append the open-file limit to /etc/profile
echo "ulimit -n 65535" >>/etc/profile
Reload the configuration
source /etc/profile
Run on the master01 node only
Set up passwordless SSH from master01 to the other nodes (press Enter at every prompt, no input needed)
ssh-keygen -t rsa
Copy the public key to the other hosts
ssh-copy-id -i .ssh/id_rsa.pub k8s-master01
ssh-copy-id -i .ssh/id_rsa.pub k8s-master02
ssh-copy-id -i .ssh/id_rsa.pub k8s-master03
ssh-copy-id -i .ssh/id_rsa.pub k8s-node01
ssh-copy-id -i .ssh/id_rsa.pub k8s-node02
Type yes and enter the password when prompted
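Afterwards, each of the following should print the remote hostname without asking for a password:
for host in k8s-master02 k8s-master03 k8s-node01 k8s-node02; do ssh $host hostname; done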
The stock 3.10.x kernel shipped with CentOS 7.x has bugs that make Docker and Kubernetes unstable, so upgrade the kernel to 4.18 or newer.
yum install wget jq psmisc vim net-tools telnet yum-utils device-mapper-persistent-data lvm2 git -y
yum update -y --exclude=kernel* && reboot
Download the chosen kernel version
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
Install the kernel
yum localinstall -y kernel-ml*
Change the default boot kernel
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
Check that the default boot kernel is the one we want
grubby --default-kernel
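The output should point at the newly installed kernel, roughly like this (the exact path follows the package version above):
/boot/vmlinuz-4.19.12-1.el7.elrepo.x86_64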
Reboot for the change to take effect, then confirm the running kernel version
reboot
uname -a
cat > /etc/sysctl.d/k8s.conf << EOF
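# Body of /etc/sysctl.d/k8s.conf: commonly used kernel parameters for kubeadm clusters
# (the values here are an assumption; adjust to your environment). The bridge settings
# require the br_netfilter module to be loaded: modprobe br_netfilter
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF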
Load the configuration
sysctl --system
[Install Docker on all nodes]
Add the Docker yum repository
yum -y install yum-utils
yum-config-manager \
--add-repo \
https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sed -i 's/download.docker.com/mirrors.aliyun.com\/docker-ce/g' /etc/yum.repos.d/docker-ce.repo
Refresh the yum package index
yum makecache fast
List all available versions
yum list docker-ce.x86_64 --showduplicates | sort -r
Install a specific version
yum -y install docker-ce-20.10.10-3.el7
Start Docker and enable it at boot
systemctl restart docker && systemctl enable docker
Note: if you use a private registry, add it as an insecure registry; otherwise skip this step
Configuration
To push images, the registry must be listed as insecure (HTTP is used here instead of HTTPS)
vim /etc/docker/daemon.json
192.168.200.50 is the address of the self-hosted Harbor registry
{
"insecure-registries":["192.168.200.50"]
}
Restart Docker
systemctl restart docker
Check the Docker info
docker info
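The registry should appear in the output; a quick check (the grep pattern assumes the English-locale docker info output):
docker info | grep -A 2 "Insecure Registries"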
[Install cri-dockerd on all nodes]
Upload or download the cri-dockerd package
wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.1/cri-dockerd-0.3.1-3.el7.x86_64.rpm
Install cri-dockerd
rpm -ivh cri-dockerd-0.3.1-3.el7.x86_64.rpm
Point the pause image at a domestic mirror, otherwise kubelet cannot pull it and fails to start
vi /usr/lib/systemd/system/cri-docker.service
Using the Aliyun mirror
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
Using the private registry
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=192.168.200.50/google_containers/pause:3.9
Start cri-dockerd
systemctl daemon-reload
systemctl enable cri-docker && systemctl start cri-docker
[Install kubeadm on all nodes]
Configure the Aliyun Kubernetes yum repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Build the yum cache
yum -y makecache fast
List all available versions
yum list kubeadm.x86_64 --showduplicates | sort -r
Install specific versions of kubeadm, kubectl and kubelet
yum install kubeadm-1.26.4-0.x86_64 kubectl-1.26.4-0.x86_64 kubelet-1.26.4-0.x86_64
systemctl restart kubelet && systemctl enable kubelet
[Install keepalived and haproxy on all master nodes]
yum -y install keepalived haproxy
Note: adjust the IPs for your environment
cat > /etc/haproxy/haproxy.cfg << EOF
global
maxconn 2000
ulimit-n 16384
log 127.0.0.1 local0 err
stats timeout 30s
defaults
log global
mode http
option httplog
timeout connect 5000
timeout client 50000
timeout server 50000
timeout http-request 15s
timeout http-keep-alive 15s
frontend k8s-master
bind 0.0.0.0:8443
bind 127.0.0.1:8443
mode tcp
option tcplog
tcp-request inspect-delay 5s
default_backend k8s-master
backend k8s-master
mode tcp
option tcplog
option tcp-check
balance roundrobin
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server k8s-master01 192.168.200.101:6443 check
server k8s-master02 192.168.200.102:6443 check
server k8s-master03 192.168.200.103:6443 check
EOF
Note: the keepalived config differs on each master node
Note: adjust the IP and NIC name! (this one is for master01)
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state MASTER
interface ens33
mcast_src_ip 192.168.200.101
virtual_router_id 51
priority 101
nopreempt
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.200.80
}
track_script {
chk_apiserver
} }
EOF
Note: adjust the IP and NIC name (this one is for master02)
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
mcast_src_ip 192.168.200.102
virtual_router_id 51
priority 90
nopreempt
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.200.80
}
track_script {
chk_apiserver
} }
EOF
Note: adjust the IP and NIC name (this one is for master03)
cat > /etc/keepalived/keepalived.conf << EOF
! Configuration File for keepalived
global_defs {
router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
script "/etc/keepalived/check_apiserver.sh"
interval 5
weight -5
fall 2
rise 1
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
mcast_src_ip 192.168.200.103
virtual_router_id 51
priority 80
nopreempt
advert_int 2
authentication {
auth_type PASS
auth_pass K8SHA_KA_AUTH
}
virtual_ipaddress {
192.168.200.80
}
track_script {
chk_apiserver
} }
EOF
Identical on all three master nodes
cat > /etc/keepalived/check_apiserver.sh << 'EOF'
#!/bin/bash
err=0
for k in $(seq 1 3)
do
check_code=$(pgrep haproxy)
if [[ $check_code == "" ]]; then
err=$(expr $err + 1)
sleep 1
continue
else
err=0
break
fi
done
if [[ $err != "0" ]]; then
echo "systemctl stop keepalived"
/usr/bin/systemctl stop keepalived
exit 1
else
exit 0
fi
EOF
Make the script executable
chmod +x /etc/keepalived/check_apiserver.sh
Start haproxy and keepalived on all master nodes
systemctl enable --now haproxy && systemctl enable --now keepalived
ping 192.168.200.80
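Besides pinging the VIP, you can check which master currently holds it and that haproxy is listening on the frontend port (the NIC name ens33 is an assumption):
ip addr show ens33 | grep 192.168.200.80
ss -lntp | grep 8443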
Only master01 runs the initialization
vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.200.101
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  name: k8s-master01
  taints:
  - effect: PreferNoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  certSANs:
  - 192.168.200.80
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.200.80:8443
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.26.0
networking:
  dnsDomain: cluster.local
  podSubnet: 172.168.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Run on master01 only
If the init YAML uses an older API version, convert it to the new version with the following command
kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
Optionally pre-pull the images (not required)
kubeadm config images pull --config kubeadm-config.yaml
Run on master01 only
kubeadm init --config kubeadm-config.yaml --upload-certs
Example of successful output:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 192.168.200.80:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:839dcb5784d74069bb4d5bc57912125894cbc79006c6cdbe8627d2115de4aa3f \
--control-plane --certificate-key 1651b597b323b1888f0308f761e1bc0716d2cda08b0024070821ad2cb06314d4
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.200.80:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:839dcb5784d74069bb4d5bc57912125894cbc79006c6cdbe8627d2115de4aa3f
[root@k8s-master01 ~]#
Run on master01 only
Get the bootstrap token secret
kubectl get secret -n kube-system
NAME TYPE DATA AGE
bootstrap-token-abcdef bootstrap.kubernetes.io/token 6 15h
Inspect its content
kubectl get secret -n kube-system bootstrap-token-abcdef -oyaml
apiVersion: v1
data:
auth-extra-groups: c3lzdGVtOmJvb3RzdHJhcHBlcnM6a3ViZWFkbTpkZWZhdWx0LW5vZGUtdG9rZW4=
expiration: MjAyMy0wNS0wMVQxNTowMzoyMlo=
token-id: YWJjZGVm
token-secret: MDEyMzQ1Njc4OWFiY2RlZg==
usage-bootstrap-authentication: dHJ1ZQ==
usage-bootstrap-signing: dHJ1ZQ==
kind: Secret
metadata:
creationTimestamp: "2023-04-30T15:03:22Z"
name: bootstrap-token-abcdef
namespace: kube-system
resourceVersion: "210"
uid: cd8dc8d5-2dfd-4d22-9c8b-320767e6c7ad
type: bootstrap.kubernetes.io/token
Check the expiration time (base64-encoded)
echo "MjAyMy0wNS0wMVQxNTowMzoyMlo=" | base64 --decode
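This prints the expiry in plain text:
2023-05-01T15:03:22Z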
If the token has expired, it can be regenerated
Worker nodes:
kubeadm token create --print-join-command
kubeadm join 192.168.200.101:6443 --token fkx9wu.k20ta9krq2r5lc5y --discovery-token-ca-cert-hash sha256:a4f07db80f54e277c8514991c76b040913310dba73397e4447126dcd11b72073
Master nodes:
Note: add --config=<path to the YAML used when the cluster was initialized>
kubeadm init phase upload-certs --upload-certs --config=/root/new.yaml --print-join-command
[root@k8s-master01 ~]# kubeadm init phase upload-certs --upload-certs --config=/root/new.yaml
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
094c1265df2831b7370757f89078215a117604a462438b067bd6909235384f3c
1. Use the kubeadm reset command to restore the node to its initial state. This stops the kubelet service and deletes all Kubernetes-related containers, Pods, configuration files and data. Make sure any important data is backed up, because this operation permanently deletes the cluster data.
sudo kubeadm reset
2. Clean up the kubelet configuration files and certificate directories:
sudo rm -rf /etc/kubernetes
sudo rm -rf /var/lib/kubelet
sudo rm -rf /var/lib/etcd
3. Clean up CNI (Container Network Interface) configuration and data:
sudo rm -rf /etc/cni/net.d
sudo rm -rf /var/lib/cni
4. Delete the .kube directory, which is kubectl's configuration directory:
sudo rm -rf $HOME/.kube
Re-initialize…
sudo kubeadm init --config kubeadm-config.yaml
[Run on master01 only]
Official guide
https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises
Download the YAML
curl https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/calico.yaml -O
Open it for editing
vi calico.yaml
Search for CALICO_IPV4POOL_CIDR
Uncomment it and set the IP below to the pod network CIDR, then save and exit
Change the IP here and keep the YAML indentation aligned, otherwise it will fail
- name: CALICO_IPV4POOL_CIDR
value: "172.168.0.0/16"
Deploy it
kubectl apply -f calico.yaml
If the images cannot be pulled, edit calico.yaml and change the image registry
You can host the images in a private Harbor registry and pull them locally from then on
For example:
docker push 192.168.200.50/calico/cni:v3.25.1
docker push 192.168.200.50/library/calico/cni:v3.25.1
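To populate Harbor, pull, re-tag and push each Calico image first; a sketch for one image (the Harbor project path is an assumption, keep it consistent with what calico.yaml references), then repeat for calico/node and calico/kube-controllers:
docker pull docker.io/calico/cni:v3.25.1
docker tag docker.io/calico/cni:v3.25.1 192.168.200.50/calico/cni:v3.25.1
docker push 192.168.200.50/calico/cni:v3.25.1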
Check the pod status
kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-557ff7c5d4-9cz2k 1/1 Running 0 89s
calico-node-fdj5s 1/1 Running 0 89s
coredns-5bbd96d687-ml854 1/1 Running 0 15h
coredns-5bbd96d687-mw5j7 1/1 Running 0 15h
etcd-k8s-master01 1/1 Running 10 (160m ago) 15h
kube-apiserver-k8s-master01 1/1 Running 10 (160m ago) 15h
kube-controller-manager-k8s-master01 1/1 Running 11 (118s ago) 15h
kube-proxy-6hlvn 1/1 Running 10 (160m ago) 15h
kube-scheduler-k8s-master01 1/1 Running 11 (118s ago) 15h
Check the node status
kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready control-plane 15h v1.26.4
[Run on every master except master01: master02…03…N]
Join the other master nodes
Note: replace the token; regenerate it if it has expired
kubeadm join 192.168.200.80:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:a17b9bd75fe97d3d96736ee7e99b58db7a5ba0ea9757dcc7d332f81ec7130697 \
--control-plane --certificate-key 3f9e1a84015565ebeedf232990b0a3117ecf817a36fb71185ae562fdfe967c0d --cri-socket=/var/run/cri-dockerd.sock
Join the worker nodes
Note: replace the token; regenerate it if it has expired
kubeadm join 192.168.200.80:8443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:a17b9bd75fe97d3d96736ee7e99b58db7a5ba0ea9757dcc7d332f81ec7130697 --cri-socket=/var/run/cri-dockerd.sock
Copy admin.conf from the init node master01 to the other nodes (so kubectl can be used there)
scp /etc/kubernetes/admin.conf root@<node-ip>:/etc/kubernetes/
After copying, run the following on the other nodes so that kubectl uses the admin.conf file:
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
All nodes joined successfully:
[root@k8s-master01 ~]#
[root@k8s-master01 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01 Ready control-plane 22m v1.26.4
k8s-master02 Ready control-plane 15m v1.26.4
k8s-master03 Ready control-plane 15m v1.26.4
k8s-node01 Ready <none> 41s v1.26.4
k8s-node02 Ready <none> 48s v1.26.4
[root@k8s-master01 ~]#
Removing a master node from the Kubernetes cluster
First, on the master node to be removed, reset it with kubeadm:
sudo kubeadm reset
This removes all Kubernetes-related configuration and components.
Next, from another master node (for example master01), remove master02's etcd member. First, list the etcd members (if the member does not exist, skip step 3):
kubectl exec -n kube-system -it etcd-k8s-master01 -- etcdctl member list --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key
Find master02's etcd member ID (it looks like a long hexadecimal number).
Remove master02's etcd member with the following command:
kubectl exec -n kube-system -it etcd-k8s-master01 -- etcdctl member remove <member-id> --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key
Replace <member-id> with the master02 etcd member ID found in step 2.
Update your load balancer or VIP configuration so traffic is no longer forwarded to master02.
As the final step, delete the master02 node object from the cluster:
kubectl delete node master02
master02 has now been removed from the Kubernetes cluster.
Re-tag the images
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.26.0 192.168.200.50/google_containers/kube-apiserver:v1.26.0
docker tag registry.aliyuncs.com/google_containers/kube-controller-manager:v1.26.0 192.168.200.50/google_containers/kube-controller-manager:v1.26.0
docker tag registry.aliyuncs.com/google_containers/kube-proxy:v1.26.0 192.168.200.50/google_containers/kube-proxy:v1.26.0
docker tag registry.aliyuncs.com/google_containers/kube-scheduler:v1.26.0 192.168.200.50/google_containers/kube-scheduler:v1.26.0
docker tag registry.aliyuncs.com/google_containers/etcd:3.5.6-0 192.168.200.50/google_containers/etcd:3.5.6-0
docker tag registry.aliyuncs.com/google_containers/pause:3.9 192.168.200.50/google_containers/pause:3.9
docker tag registry.aliyuncs.com/google_containers/coredns:v1.9.3 192.168.200.50/google_containers/coredns:v1.9.3
Push them
docker push 192.168.200.50/google_containers/kube-apiserver:v1.26.0
docker push 192.168.200.50/google_containers/kube-controller-manager:v1.26.0
docker push 192.168.200.50/google_containers/kube-proxy:v1.26.0
docker push 192.168.200.50/google_containers/kube-scheduler:v1.26.0
docker push 192.168.200.50/google_containers/etcd:3.5.6-0
docker push 192.168.200.50/google_containers/pause:3.9
docker push 192.168.200.50/google_containers/coredns:v1.9.3
kubectl get pods --all-namespaces -o wide
kubectl get events --namespace=kube-system
kubectl logs calico-node-rlvwm -n kube-system
View the YAML of the Calico node pods
kubectl get pods --all-namespaces -l k8s-app=calico-node -o yaml