Links: K8S community | K8S Chinese community | K8S installation docs
Environment requirements:
ext4 file system (on xfs, snapshot creation can fail)
Cluster plan (standard):
OS | Host IP | Hostname | Host spec | Role |
---|---|---|---|---|
CentOS 8.2 | 10.206.0.7 | k8s-node-01 | 4C8G64GB | master |
CentOS 8.2 | 10.206.32.12 | k8s-node-02 | 4C8G64GB | worker |
CentOS 8.2 | 10.206.32.16 | k8s-node-03 | 4C8G64GB | worker |
Cluster plan (high availability):
OS | Host IP | Hostname | Host spec | Role |
---|---|---|---|---|
CentOS 8.2 | 10.206.0.7 | k8s-node-01 | 4C8G64GB | worker |
CentOS 8.2 | 10.206.32.12 | k8s-node-02 | 4C8G64GB | master |
CentOS 8.2 | 10.206.32.16 | k8s-node-03 | 4C8G64GB | master |
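Note 1: view system version information
$ hostnamectl
Note 2: view CPU information
$ lscpu
Note 3: view memory information
$ cat /proc/meminfo
Note 4: check the host's distribution release
$ cat /etc/redhat-release
Note 5: get the hostname, MAC address, and product_uuid
# Get the hostname
$ hostname
# Get the MAC address (ifconfig also works if net-tools is installed)
$ ip link
# Get the product UUID
$ cat /sys/class/dmi/id/product_uuid
Note 6: check whether a port is in use
$ nc -zv <ip> <port>
Note 7: check the file system type of each mount
$ df -T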
Note 8: ports used by K8S
Port range | Protocol | Component |
---|---|---|
6443 | TCP | kube-apiserver |
2379-2380 | TCP | etcd |
10250 | TCP | kubelet |
10259 | TCP | kube-scheduler |
10257 | TCP | kube-controller-manager |
30000-32767 | TCP | NodePort Services (kube-proxy) |
179 | TCP | calico (BGP) |
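Before installing, it can help to confirm that none of these ports already has a listener. The following is only a sketch: busy_k8s_ports is a hypothetical helper name, it assumes input in the format printed by ss -ltnH (local address in the fourth column), and it leaves out the NodePort range to stay short.

```shell
# Ports taken from the table above (NodePort range omitted for brevity).
K8S_PORTS="6443 2379 2380 10250 10259 10257 179"

# Read `ss -ltnH` style output on stdin and print every required port
# that already has a listener, once each, in the order encountered.
busy_k8s_ports() {
  awk -v ports="$K8S_PORTS" '
    BEGIN {
      n = split(ports, want, " ")
      for (i = 1; i <= n; i++) need[want[i]] = 1
    }
    {
      # Column 4 is the local address, e.g. 0.0.0.0:6443 or [::]:6443;
      # the port is whatever follows the last colon.
      m = split($4, parts, ":")
      port = parts[m]
      if ((port in need) && !(port in seen)) {
        seen[port] = 1
        print port
      }
    }'
}
```

On a node this would be used as: ss -ltnH | busy_k8s_ports. Empty output means none of the listed ports is taken.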
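Run on all nodes
1) Disable the firewall
$ systemctl stop firewalld
$ systemctl disable firewalld
2) Disable SELinux
# Temporarily (until reboot)
$ setenforce 0
# Permanently
$ sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
3) Disable swap
# Temporarily (until reboot)
$ swapoff -a
# Permanently: comment out every swap entry in /etc/fstab
$ sed -ri 's/.*swap.*/#&/' /etc/fstab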
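The sed expression for /etc/fstab matches any line containing the word swap, so running it twice adds a second # to lines that are already commented. A slightly more careful variant is sketched here, assuming GNU sed and a conventional fstab layout where swap appears as a whitespace-separated field. comment_out_swap is a hypothetical helper name.

```shell
# Comment out active swap entries in an fstab-style file.
# $1: path to the file (defaults to /etc/fstab).
comment_out_swap() {
  fstab_path=${1:-/etc/fstab}
  # Only lines that do not already start with '#' and that contain
  # "swap" as a whitespace-delimited field are prefixed with '#'.
  sed -E -i 's/^([^#].*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab_path"
}
```

Because already-commented lines are skipped, the function can be re-run safely, for example from a provisioning script.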
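4) Set the hostname
# Set the hostname according to the plan [run on node-01]
$ hostnamectl set-hostname k8s-node-01
# Set the hostname according to the plan [run on node-02]
$ hostnamectl set-hostname k8s-node-02
# Set the hostname according to the plan [run on node-03]
$ hostnamectl set-hostname k8s-node-03
5) Configure hosts
# Add the host entries on all nodes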
$ cat >> /etc/hosts << EOF
10.206.0.7 k8s-node-01
10.206.32.12 k8s-node-02
10.206.32.16 k8s-node-03
EOF
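Appending with cat >> adds the same three lines again every time the step is repeated. A small idempotent alternative is sketched below. add_host_entry is a hypothetical helper, and the check is a simple pattern match on the host name, so it assumes names without regex metacharacters other than the hyphen.

```shell
# Append "ip name" to a hosts file only if the name is not mapped yet.
# $1: hosts file path, $2: IP address, $3: host name.
add_host_entry() {
  hosts_file=$1
  host_ip=$2
  host_name=$3
  # The name must appear as a whole field: preceded by whitespace and
  # followed by whitespace or end of line.
  if ! grep -Eq "[[:space:]]${host_name}([[:space:]]|\$)" "$hosts_file"; then
    printf '%s %s\n' "$host_ip" "$host_name" >> "$hosts_file"
  fi
}
```

For this cluster it would be called once per node, for example add_host_entry /etc/hosts 10.206.0.7 k8s-node-01.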
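6) Synchronize time
$ yum install chrony -y
$ systemctl start chronyd
$ systemctl enable chronyd
$ chronyc sources
7) Set kernel parameters
# Set kernel parameters; watch for conflicting duplicates in /etc/sysctl.conf and similar files
# The net.bridge.* keys only exist once br_netfilter is loaded (step 8), so re-run sysctl --system after loading it
$ cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
vm.swappiness=0
EOF
# Apply
$ sysctl --system
8) Set network parameters
# Load the overlay module
$ modprobe overlay
# Load the br_netfilter module
$ modprobe br_netfilter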
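modprobe loads a module only until the next reboot. On systemd-based systems such as CentOS 8, modules listed in files under /etc/modules-load.d are loaded at boot, so steps 7 and 8 can be made persistent together. The sketch below is one way to do that; the function name and the k8s.conf file name are assumptions, and the optional root-directory argument exists only so the function can be tried out without touching the real /etc.

```shell
# Write the module list and sysctl settings from steps 7 and 8.
# $1: optional root directory prefix (empty means the real file system).
write_k8s_kernel_config() {
  root_dir=${1:-}
  mkdir -p "$root_dir/etc/modules-load.d" "$root_dir/etc/sysctl.d"
  # One module name per line; read by systemd-modules-load at boot.
  printf '%s\n' overlay br_netfilter > "$root_dir/etc/modules-load.d/k8s.conf"
  # Same keys as in step 7.
  cat > "$root_dir/etc/sysctl.d/99-kubernetes-cri.conf" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
}
```

On a real node the call would be followed by modprobe overlay, modprobe br_netfilter, and sysctl --system, in that order, so the bridge keys exist when sysctl applies them.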
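9) Install ipvsadm
# Install ipvsadm and load the ipvs module at boot
$ yum install ipvsadm -y
# The heredoc delimiter is quoted so that $? is written to the file literally instead of being expanded now
$ cat > /etc/sysconfig/modules/ipvs.modules << 'EOF'
/sbin/modinfo -F filename ip_vs > /dev/null 2>&1
if [ $? -eq 0 ];then
    /sbin/modprobe ip_vs
fi
EOF
$ systemctl daemon-reload
$ systemctl enable ipvsadm
$ systemctl restart ipvsadm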
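Introduction to containerd
containerd is an industry-standard container runtime that emphasizes simplicity, robustness, and portability. It manages the complete container lifecycle on a host: image transfer and storage, container execution and supervision, storage, and networking.
docker vs containerd
containerd was split out of docker as a separate project and can serve as a low-level container runtime; it is now the better choice of runtime for Kubernetes. Starting with 1.24, K8S removed dockershim, so docker is no longer supported as a runtime out of the box.
Why K8S dropped docker as its container runtime
With docker as the K8S container runtime, kubelet first has to call docker through dockershim, and docker then calls containerd.
With containerd as the K8S container runtime, kubelet can call containerd directly, because containerd ships with a built-in CRI (Container Runtime Interface) plugin.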
1) Install dependencies
$ yum install yum-utils device-mapper-persistent-data lvm2 iproute-tc -y
2) Add the yum repository
$ yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
3) Install containerd
$ yum install containerd.io-1.6.6-3.1.el8.x86_64 -y
4) Generate the configuration file
$ containerd config default > /etc/containerd/config.toml
5) Edit the configuration file
# Switch the cgroup driver to systemd
$ sed -i 's#SystemdCgroup = false#SystemdCgroup = true#' /etc/containerd/config.toml
# Point the pause image at a reachable mirror
$ sed -i 's#k8s.gcr.io#registry.aliyuncs.com/google_containers#' /etc/containerd/config.toml
# Optionally move container storage to a path with more free space by editing this line in config.toml
root = "/var/lib/containerd"
Note: be sure to change the sandbox_image address; otherwise the firewall blocks the image pull and kubelet cannot load the image at startup
6) Configure crictl
# Config file: /etc/crictl.yaml; set the socket addresses
$ cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
7) Start the service
$ systemctl enable containerd
$ systemctl daemon-reload
$ systemctl restart containerd
8) Check the status
$ systemctl status containerd
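The two sed edits from step 5 can be wrapped into one function that is easy to re-run and to verify. This is a sketch rather than the exact procedure used here: patch_containerd_config is a hypothetical name, and instead of matching the literal k8s.gcr.io it rewrites whatever registry host precedes /pause: on the sandbox_image line, on the assumption that the generated config uses a single-host prefix such as "k8s.gcr.io/pause:3.6".

```shell
# Patch a containerd config.toml for use with kubeadm.
# $1: config path (default /etc/containerd/config.toml)
# $2: image mirror (default registry.aliyuncs.com/google_containers)
patch_containerd_config() {
  cfg_path=${1:-/etc/containerd/config.toml}
  mirror=${2:-registry.aliyuncs.com/google_containers}
  sed -i \
    -e 's#SystemdCgroup = false#SystemdCgroup = true#' \
    -e "s#\(sandbox_image = \"\)[^/\"]*\(/pause:\)#\1${mirror}\2#" \
    "$cfg_path"
}
```

After patching, grep sandbox_image /etc/containerd/config.toml should show the mirror address, and containerd needs a restart for the change to take effect.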
Note 1: manage the container runtime with ctr
ctr is containerd's own command-line client; run ctr --help to list its commands.
Note 2: manage the container runtime with crictl
COMMANDS:
attach Attach to a running container
create Create a new container
exec Run a command in a running container
version Display runtime version information
images, image, img List images
inspect Display the status of one or more containers
inspecti Return the status of one or more images
imagefsinfo Return image filesystem info
inspectp Display the status of one or more pods
logs Fetch the logs of a container
port-forward Forward local port to a pod
ps List containers
pull Pull an image from a registry
run Run a new container inside a sandbox
runp Run a new pod
rm Remove one or more containers
rmi Remove one or more images
rmp Remove one or more pods
pods List pods
start Start one or more created containers
info Display information of the container runtime
stop Stop one or more running containers
stopp Stop one or more running pods
update Update one or more running containers
config Get and set crictl client configuration options
stats List container(s) resource usage statistics
completion Output shell completion code
help, h Shows a list of commands or help for one command
1) Configure the yum repository
$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2) Install
# Install the components
$ yum install -y iproute-tc
$ yum install kubectl-1.24.1-0.x86_64 kubelet-1.24.1-0.x86_64 kubeadm-1.24.1-0.x86_64 -y
# Enable at boot
$ systemctl enable kubelet
$ systemctl daemon-reload
$ systemctl restart kubelet
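Standard cluster: a single Master node; low resource usage, but a single point of failure, so not recommended for production
Run on the Master node
1) Generate and edit the configuration file
# Generate the configuration file
$ mkdir -p /app
$ kubeadm config print init-defaults > /app/kubeadm.yml
Edit the following settings:
Note: be sure to configure an image mirror; otherwise the firewall causes the cluster installation to fail
The configuration file after editing: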
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # IP address of the control plane
  advertiseAddress: 10.206.0.7
  bindPort: 6443
nodeRegistration:
  # Socket address of the container runtime
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  # Host name of the control-plane node, optional
  name: k8s-node-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
# Image registry address
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
# K8S version
kubernetesVersion: 1.24.1
networking:
  dnsDomain: cluster.local
  # Pod subnet; same value as --pod-network-cidr in the command-line form and as the Calico pool
  podSubnet: 10.96.0.0/16
  # Service subnet; same value as --service-cidr in the command-line form, and must not overlap the pod subnet
  serviceSubnet: 10.95.0.0/16
scheduler: {}
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
2) Initialize the K8S cluster
# List the required images
$ kubeadm config images list --config /app/kubeadm.yml
# Pull the images
$ kubeadm config images pull --config /app/kubeadm.yml
# Initialize K8S with kubeadm from the configuration file
$ kubeadm init --config=/app/kubeadm.yml --upload-certs --v=6
Parameter | Purpose | Default | Configured value |
---|---|---|---|
--apiserver-advertise-address | IP address the apiserver advertises | Master host IP | 10.206.0.7 |
--apiserver-bind-port | Port the apiserver listens on | 6443 | 6443 |
--cri-socket | Path of the CRI socket file | | unix:///var/run/containerd/containerd.sock |
--control-plane-endpoint | Control-plane address | | master_vip:6440 (multi-master only) |
--image-repository | Registry to pull images from | k8s.gcr.io | registry.aliyuncs.com/google_containers |
--kubernetes-version | K8S version | stable-1 | 1.24.1 |
--pod-network-cidr | Pod network range | | 10.96.0.0/16 |
--service-cidr | Service IP range | 10.96.0.0/12 | 10.95.0.0/16 |
Note 1: with a single master node, control-plane-endpoint is not needed; with multiple master nodes it is mandatory
Note 2: be sure to configure an image mirror; otherwise the firewall causes the cluster installation to fail
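The table moves the Service range to 10.95.0.0/16 for a reason: the default 10.96.0.0/12 would contain the pod range 10.96.0.0/16, and the two ranges must not overlap. A small check like the following can catch that before running kubeadm init. It is a sketch in plain POSIX shell; ip_to_int and cidr_overlap are hypothetical helper names, and only IPv4 is handled.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  saved_ifs=$IFS
  IFS=.
  # Unquoted on purpose: split the address into four positional parameters.
  set -- $1
  IFS=$saved_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 when two IPv4 CIDR blocks overlap, non-zero otherwise.
cidr_overlap() {
  a_ip=${1%/*}; a_len=${1#*/}
  b_ip=${2%/*}; b_len=${2#*/}
  # Two blocks overlap exactly when they share the same network bits
  # under the shorter (wider) of the two prefixes.
  if [ "$a_len" -lt "$b_len" ]; then len=$a_len; else len=$b_len; fi
  shift_bits=$(( 32 - len ))
  a_net=$(( $(ip_to_int "$a_ip") >> shift_bits ))
  b_net=$(( $(ip_to_int "$b_ip") >> shift_bits ))
  [ "$a_net" -eq "$b_net" ]
}
```

For example, cidr_overlap 10.96.0.0/16 10.96.0.0/12 succeeds (the ranges overlap), while cidr_overlap 10.96.0.0/16 10.95.0.0/16 fails, which is the combination used in this post.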
Run the command
kubeadm init --apiserver-advertise-address=10.206.0.7 --apiserver-bind-port=6443 --cri-socket="unix:///var/run/containerd/containerd.sock" --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=1.24.1 --pod-network-cidr=10.96.0.0/16 --service-cidr=10.95.0.0/16
Output
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.206.32.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:68d48107177ec63c7e36276e2087b00767884c56304fff8ab013efbe498cd61b
Note 1: the join command can be printed again at any time with
$ kubeadm token create --print-join-command
Note 2: if initialization fails, reset with the reset command, fix the problem, and initialize again
$ kubeadm reset
Note 3: view the logs
$ journalctl -f -u kubelet
Note 4: handling an expired token
# Generate a new token
$ kubeadm token create
abcdef.0123456789abcdef
# Get the sha256 hash of the CA certificate
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
b615fccddcd4e80fc6f9c5e477bfc7a053b017660b73fdeccf89c559739664d7
# Join the new node to the k8s cluster (the address is the control-plane endpoint, not the new node's own IP)
$ kubeadm join <control-plane-ip>:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:b615fccddcd4e80fc6f9c5e477bfc7a053b017660b73fdeccf89c559739664d7
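When the token and the CA hash are collected by hand as in Note 4, a typo in either one only shows up as a failed join. The sketch below assembles the command after a basic format check. Both function names are hypothetical; the token pattern (six characters, a dot, sixteen characters, all lowercase letters or digits) is the kubeadm bootstrap token format, and the hash is expected as 64 lowercase hex digits without the sha256: prefix.

```shell
# Return 0 if $1 looks like a kubeadm bootstrap token.
valid_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Print a worker join command.
# $1: control-plane endpoint (host:port), $2: token, $3: CA cert hash (hex).
build_join_cmd() {
  endpoint=$1
  token=$2
  ca_hash=$3
  if ! valid_bootstrap_token "$token"; then
    echo "invalid token format" >&2
    return 1
  fi
  if ! printf '%s\n' "$ca_hash" | grep -Eq '^[0-9a-f]{64}$'; then
    echo "invalid CA hash" >&2
    return 1
  fi
  printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
    "$endpoint" "$token" "$ca_hash"
}
```

In practice kubeadm token create --print-join-command from Note 1 is the simpler route; the helper is mainly useful when the pieces come from different places.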
Run on the Worker nodes
$ kubeadm join 10.206.32.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:68d48107177ec63c7e36276e2087b00767884c56304fff8ab013efbe498cd61b
Output
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Note 1: if the join does not go smoothly, add verbose log output
$ kubeadm join 10.206.32.12:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:68d48107177ec63c7e36276e2087b00767884c56304fff8ab013efbe498cd61b --v=6
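High-availability cluster: multiple Master nodes, which avoids the single point of failure
1) Bind a VIP to all Master nodes (cloud vendors provide a console for configuring VIPs; a hardware F5 is a costlier, better-performing alternative)
2) Install the Keepalived service on all Master nodes
$ yum install keepalived -y
$ systemctl daemon-reload
$ systemctl enable keepalived
Edit the configuration file:
$ vim /etc/keepalived/keepalived.conf
# Replace the contents with the following configuration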
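global_defs {
    router_id GA_K8S
}
vrrp_instance VI_1 {
    # Node role (MASTER or BACKUP; a cluster has exactly one MASTER, all others are BACKUP)
    state MASTER
    # Network interface to bind
    interface eth0
    virtual_router_id 200
    # Priority; the larger the value, the higher the priority
    priority 100
    # Per the keepalived documentation, nopreempt only takes effect when the initial state is BACKUP
    nopreempt
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass jdvm2vm8weiyhlwa
    }
    virtual_ipaddress {
        # VIP address
        10.206.32.18
    }
}
$ systemctl restart keepalived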
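Since the Keepalived configuration differs between nodes only in state, priority, and possibly the interface name, it can be generated instead of edited by hand on each Master. The function below is a sketch: render_keepalived_conf is a hypothetical name, CHANGE_ME is a placeholder password, and nopreempt is left out so the same template works for both MASTER and BACKUP nodes.

```shell
# Print a keepalived.conf for one node.
# $1: state (MASTER or BACKUP), $2: priority, $3: interface, $4: VIP.
render_keepalived_conf() {
  vrrp_state=$1
  vrrp_priority=$2
  vrrp_iface=$3
  vrrp_vip=$4
  cat <<EOF
global_defs {
    router_id GA_K8S
}
vrrp_instance VI_1 {
    state $vrrp_state
    interface $vrrp_iface
    virtual_router_id 200
    priority $vrrp_priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass CHANGE_ME
    }
    virtual_ipaddress {
        $vrrp_vip
    }
}
EOF
}
```

A backup node would then be configured with something like render_keepalived_conf BACKUP 90 eth0 10.206.32.18 > /etc/keepalived/keepalived.conf, where the priority 90 is an example value lower than the MASTER's 100.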
3) Install Haproxy
$ yum install -y haproxy
$ systemctl daemon-reload
$ systemctl enable haproxy
Edit the configuration file:
$ vim /etc/haproxy/haproxy.cfg
# Replace the contents with the following configuration
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
listen stats
mode http
bind *:7000
stats enable
stats uri
#---------------------------------------------------------------------
# k8s-api frontend which proxys to the backends
#---------------------------------------------------------------------
frontend k8s-api
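# Address this HAProxy instance listens on; the IP differs per node.
# If HAProxy shares a host with kube-apiserver, bind a port other than 6443 (for example 6440) to avoid a conflict.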
bind 10.206.32.16:6443
bind 127.0.0.1:6443
mode tcp
option tcplog
default_backend k8s-api
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend k8s-api
mode tcp
option tcplog
option tcp-check
balance roundrobin
# K8S cluster: list of api-server addresses
server k8s-api-1 10.206.0.7:6443 check
server k8s-api-2 10.206.32.12:6443 check
server k8s-api-3 10.206.32.16:6443 check
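$ systemctl restart haproxy
Note 1: check the Keepalived service status
$ systemctl status keepalived
Note 2: check the haproxy service status
$ systemctl status haproxy
No VIP was available for this walkthrough, so a single-node Haproxy stands in for the VIP to demonstrate the high-availability cluster. This is not recommended in production, because it reintroduces a single point of failure at the HA layer.
1) Install Haproxy
$ yum install -y haproxy
$ systemctl enable haproxy
$ systemctl daemon-reload
2) Configure Haproxy
$ vim /etc/haproxy/haproxy.cfg
# Add the following configuration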
frontend k8s-api
    # Address this HAProxy instance listens on; the IP differs per node
    bind 10.206.32.12:6440
    bind 127.0.0.1:6440
    mode tcp
    option tcplog
    default_backend k8s-api
backend k8s-api
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    # K8S cluster: list of api-server addresses
    server k8s-api-1 10.206.0.7:6443 check
    server k8s-api-2 10.206.32.12:6443 check
    server k8s-api-3 10.206.32.16:6443 check
3) Check the status
$ systemctl status haproxy
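The backend section is the part that changes whenever a Master is added or removed, so it is a natural candidate for generation. The following sketch prints a backend block from a list of API server IPs. render_k8s_backend is a hypothetical helper, and it assumes every API server listens on 6443 as in this setup.

```shell
# Print an HAProxy backend block for the given API server IPs.
# Arguments: one IP address per Master node.
render_k8s_backend() {
  index=1
  echo "backend k8s-api"
  echo "    mode tcp"
  echo "    option tcplog"
  echo "    option tcp-check"
  echo "    balance roundrobin"
  for api_ip in "$@"; do
    echo "    server k8s-api-$index $api_ip:6443 check"
    index=$((index + 1))
  done
}
```

After editing the file, haproxy -c -f /etc/haproxy/haproxy.cfg checks the configuration for syntax errors before the service is restarted.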
Run on the first Master node
Run the command
kubeadm init --apiserver-advertise-address=10.206.32.12 --apiserver-bind-port=6443 --cri-socket="unix:///var/run/containerd/containerd.sock" --control-plane-endpoint=10.206.32.12:6440 --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=1.24.1 --pod-network-cidr=10.96.0.0/16 --service-cidr=10.95.0.0/16 --upload-certs
Output
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 10.206.32.12:6440 --token rrm7vb.u5qceu1p8gwzgptu \
--discovery-token-ca-cert-hash sha256:a612e48dd14d5a19eba3cf30531c7134b4c83fe2098cb32c6ca6196cb289de8b \
--control-plane --certificate-key b65ed82bec66c20342e3a5e013408520f23c0f66f413729139e0c6e6f978c88a
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.206.32.12:6440 --token rrm7vb.u5qceu1p8gwzgptu \
--discovery-token-ca-cert-hash sha256:a612e48dd14d5a19eba3cf30531c7134b4c83fe2098cb32c6ca6196cb289de8b
Run the command (on each additional Master node)
$ kubeadm join 10.206.32.12:6440 --token rrm7vb.u5qceu1p8gwzgptu --discovery-token-ca-cert-hash sha256:a612e48dd14d5a19eba3cf30531c7134b4c83fe2098cb32c6ca6196cb289de8b --control-plane --certificate-key b65ed82bec66c20342e3a5e013408520f23c0f66f413729139e0c6e6f978c88a
Output
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Run the command (on each Worker node)
$ kubeadm join 10.206.32.12:6440 --token rrm7vb.u5qceu1p8gwzgptu --discovery-token-ca-cert-hash sha256:a612e48dd14d5a19eba3cf30531c7134b4c83fe2098cb32c6ca6196cb289de8b
Output
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
$ kubectl get po -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-74586cf9b6-cwhmr 0/1 Pending 0 18m
kube-system coredns-74586cf9b6-jdzdw 0/1 Pending 0 18m
kube-system etcd-k8s-node-02 1/1 Running 4 19m 10.206.32.12 k8s-node-02
kube-system etcd-k8s-node-03 1/1 Running 0 10m 10.206.32.16 k8s-node-03
kube-system kube-apiserver-k8s-node-02 1/1 Running 4 19m 10.206.32.12 k8s-node-02
kube-system kube-apiserver-k8s-node-03 1/1 Running 1 (10m ago) 10m 10.206.32.16 k8s-node-03
kube-system kube-controller-manager-k8s-node-02 1/1 Running 5 19m 10.206.32.12 k8s-node-02
kube-system kube-controller-manager-k8s-node-03 1/1 Running 0 9m47s 10.206.32.16 k8s-node-03
kube-system kube-proxy-nwsxm 1/1 Running 0 10m 10.206.32.16 k8s-node-03
kube-system kube-proxy-pw9b5 1/1 Running 0 5m49s 10.206.0.7 k8s-node-01
kube-system kube-proxy-t4jn6 1/1 Running 0 18m 10.206.32.12 k8s-node-02
kube-system kube-scheduler-k8s-node-02 1/1 Running 5 19m 10.206.32.12 k8s-node-02
kube-system kube-scheduler-k8s-node-03 1/1 Running 0 10m 10.206.32.16 k8s-node-03
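The key services now run on both Master nodes, so the cluster has been created successfully. It is not ready yet, though: CoreDNS stays Pending until the network plugin is installed in the next step.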
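With more pods, scanning the listing by eye gets tedious. A small filter such as the one below prints only the pods that are not yet running. It is a sketch: not_running_pods is a hypothetical name, and it assumes the column order of kubectl get po -A --no-headers, where the fourth column is STATUS.

```shell
# Read `kubectl get po -A --no-headers` output on stdin and print
# "namespace/name status" for every pod that is neither Running nor Completed.
not_running_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}
```

Used as kubectl get po -A --no-headers | not_running_pods, it would list only the two Pending coredns pods at this stage and print nothing once the cluster is healthy.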
Configure the master node
$ rm -rf ~/.kube/config
$ mkdir -p ~/.kube
$ cp -i /etc/kubernetes/admin.conf ~/.kube/config
$ kubectl cluster-info
Run on the Master node
1) Download the installation yaml
$ curl -o /app/calico.yaml https://docs.projectcalico.org/manifests/calico.yaml
2) Change the pod CIDR in calico.yaml
$ vim /app/calico.yaml
# Change the pod CIDR in calico.yaml to the one passed to kubeadm init
# Original configuration
# no effect. This should fall within `--cluster-cidr`.
# - name: CALICO_IPV4POOL_CIDR
#   value: "192.168.0.0/16"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
  value: "true"
# Modified configuration
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
  value: "10.96.0.0/16"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
  value: "true"
Note: watch the indentation when editing
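The same edit can be scripted with sed, which sidesteps the indentation risk because the leading spaces are left untouched. This is a sketch that assumes the manifest contains the two commented lines exactly as "# - name: CALICO_IPV4POOL_CIDR" and "#   value: "192.168.0.0/16"" (three spaces after the # on the value line); if a given Calico release formats them differently, the patterns need adjusting. set_calico_pod_cidr is a hypothetical helper name.

```shell
# Uncomment CALICO_IPV4POOL_CIDR in a Calico manifest and set its value.
# $1: manifest path, $2: pod CIDR (for example 10.96.0.0/16).
set_calico_pod_cidr() {
  manifest=$1
  pod_cidr=$2
  # Removing "# " keeps the original leading indentation, so the entries
  # stay aligned with the neighbouring "- name:" items.
  sed -i \
    -e 's|# - name: CALICO_IPV4POOL_CIDR|- name: CALICO_IPV4POOL_CIDR|' \
    -e "s|#   value: \"192.168.0.0/16\"|  value: \"${pod_cidr}\"|" \
    "$manifest"
}
```

Afterwards, grep -A1 CALICO_IPV4POOL_CIDR /app/calico.yaml should show the uncommented pair with the new value.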
3) Install
$ kubectl apply -f /app/calico.yaml
4) Verify
Result (Node Ready)
$ kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-node-01 Ready 9m42s v1.24.1 10.206.0.7 CentOS Linux 8 4.18.0-348.7.1.el8_5.x86_64 containerd://1.6.6
k8s-node-02 Ready control-plane 22m v1.24.1 10.206.32.12 CentOS Linux 8 (Core) 4.18.0-348.7.1.el8_5.x86_64 containerd://1.6.6
k8s-node-03 Ready control-plane 14m v1.24.1 10.206.32.16 CentOS Linux 8 (Core) 4.18.0-348.7.1.el8_5.x86_64 containerd://1.6.6
Result (Pod Running)
$ kubectl get po -A -o wide | grep calico
kube-system calico-kube-controllers-56cdb7c587-prjs5 1/1 Running 0 65s 10.96.154.193 k8s-node-01
kube-system calico-node-mgcjk 1/1 Running 0 65s 10.206.32.12 k8s-node-02
kube-system calico-node-qc6fx 1/1 Running 0 65s 10.206.0.7 k8s-node-01
kube-system calico-node-rpfz4 1/1 Running 0 65s 10.206.32.16 k8s-node-03
# Node labels: query
kubectl describe node k8s-node-02
# Add
kubectl label node k8s-node-01 node-role.kubernetes.io/worker=true
kubectl label node k8s-node-02 node-role.kubernetes.io/master=true
kubectl label node k8s-node-03 node-role.kubernetes.io/master=true
# Remove
kubectl label node k8s-node-03 node-role.kubernetes.io/master-
# Check the result
kubectl get nodes
# Node taints: view
kubectl describe node k8s-node-02
# Add
kubectl taint node k8s-node-02 node-role.kubernetes.io/master=true:NoSchedule
# Remove
kubectl taint node k8s-node-01 node-role.kubernetes.io/master=true:NoSchedule-
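# Deploy an nginx service
kubectl create deployment nginx --image=nginx
# Check the status
kubectl get po -A | grep nginx
# Expose the port
kubectl expose deployment nginx --port=80 --type=NodePort
# Look up the assigned NodePort
kubectl get svc | grep nginx
# Access the service (30493 is the NodePort assigned in this run; yours will differ)
curl http://k8s-node-01:30493/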
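Because the NodePort is assigned at random from 30000-32767, scripts cannot hard-code it. One option is to pull it out of the kubectl get svc listing, as sketched here. node_port_for is a hypothetical helper that assumes the default column layout, where the fifth column is PORT(S) in the form 80:30493/TCP.

```shell
# Read `kubectl get svc` output on stdin and print the NodePort mapped
# to the given service port.
# $1: service port (for example 80).
node_port_for() {
  awk -v want="$1" '
    {
      # PORT(S) may hold several comma-separated mappings.
      n = split($5, maps, ",")
      for (i = 1; i <= n; i++) {
        # "80:30493/TCP" splits into 80, 30493, TCP;
        # "443/TCP" (no NodePort) splits into 443, TCP and is skipped.
        split(maps[i], kv, "[:/]")
        if (kv[1] == want && kv[2] ~ /^[0-9]+$/) print kv[2]
      }
    }'
}
```

For example: kubectl get svc | grep nginx | node_port_for 80. An alternative without text parsing is kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'.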
Common application images lack most network tools, which makes problems hard to diagnose, so a busybox service is deployed to help troubleshoot node network issues
# Create the yaml file
cat <<EOF > /app/busybox.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: busybox-test
  namespace: default
spec:
  selector:
    matchLabels:
      name: busybox-test
  template:
    metadata:
      labels:
        name: busybox-test
    spec:
      containers:
      - name: app
        image: busybox
        command:
        - sleep
        - "3600"
        imagePullPolicy: IfNotPresent
      terminationGracePeriodSeconds: 30
EOF
# Start the service
kubectl apply -f /app/busybox.yaml
# Check the service
kubectl get po -A -o wide | grep busybox
# Enter a pod to troubleshoot the node
kubectl exec -ti <pod-name> -n default -- sh
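Install the web dashboard on a Master node for day-to-day monitoring
1) Download the manifest
curl -o /app/dashboard.yaml https://raw.githubusercontent.com/kubernetes/dashboard/v2.6.0/aio/deploy/recommended.yaml
2) Expose the service
# In dashboard.yaml, set the kubernetes-dashboard Service as follows (type: NodePort plus a fixed nodePort)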
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30443
  selector:
    k8s-app: kubernetes-dashboard
3) Create a user
cat <<EOF > /app/admin-user.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF
kubectl apply -f /app/admin-user.yaml
4) Generate a token
kubectl -n kubernetes-dashboard create token admin-user
5) Configure the token
The docker removal steps below apply only to K8S clusters whose runtime is docker
# Reset the K8S node
sudo kubeadm reset -f
# Stop the services
sudo systemctl stop kubelet
sudo systemctl stop containerd
sudo systemctl stop docker
sudo systemctl disable kubelet
sudo systemctl disable containerd
sudo systemctl disable docker
# Uninstall the packages
sudo yum -y remove kubelet kubeadm kubectl containerd docker-ce docker-ce-cli docker-scan-plugin
# Delete leftover files
sudo rm -rf ~/.kube/
sudo rm -rf /etc/cni
sudo rm -rf /etc/crictl.yaml
sudo rm -rf /etc/docker
sudo rm -rf /etc/kubernetes
sudo rm -rf /etc/containers
sudo rm -rf /etc/containerd
sudo rm -rf /usr/bin/kube*
sudo rm -rf /usr/bin/docker*
sudo rm -rf /usr/bin/container*
sudo rm -rf /usr/bin/crictl*
sudo rm -rf /usr/bin/crio*
sudo rm -rf /usr/local/bin/kube*
sudo rm -rf /usr/local/bin/docker*
sudo rm -rf /usr/local/bin/container*
sudo rm -rf /usr/local/bin/crictl*
sudo rm -rf /usr/local/bin/crio*
sudo rm -rf /opt/cni
sudo rm -rf /opt/containerd
sudo rm -rf /var/etcd
sudo rm -rf /var/lib/etcd
sudo rm -rf /var/lib/calico
sudo rm -rf /var/lib/cni
sudo rm -rf /var/lib/container*
sudo rm -rf /var/lib/kubelet
sudo rm -rf /var/lib/docker*
sudo rm -rf /var/lib/yurttunnel-server
sudo rm -rf /run/docker*
sudo rm -rf /run/calico*
sudo rm -rf /run/crio*
sudo rm -rf /var/run/container*
# Reload systemd
sudo systemctl daemon-reload
# Flush iptables
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
sudo sysctl net.bridge.bridge-nf-call-iptables=1
# Remove the virtual network interfaces
sudo ip link del cni0
sudo ip link del flannel.1
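Symptom: from inside a Pod, external domain names cannot be pinged, while external IPs and in-cluster domain names or IPs all work
Cause: a CoreDNS resolution problem
Solution: delete CoreDNS and recreate it
Steps:
1) Delete the existing CoreDNS deployment:
kubectl delete --namespace=kube-system deployment coredns
2) Reinstall
wget -P /app https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget -P /app https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
chmod +x /app/deploy.sh
/app/deploy.sh | kubectl apply -f -
This post covers building both a standard and a high-availability K8S cluster from scratch. The main pitfall along the way is deployment failure caused by the firewall blocking image pulls, so pay close attention to the image registry settings flagged in the steps above to avoid wasted time.
Upcoming posts:
Feel free to get in touch. WeChat official account: 源圈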