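Contents
1.1 Obtain the images
1.2 Install Docker [cluster]
1.3 Download from the Aliyun registry [cluster]
1.4 Cluster deployment [cluster]
1.5 Cluster environment configuration [cluster]
1.6 Disable system swap [cluster]
1.7 Install the kubeadm packages [cluster]
1.8 Configure and start kubelet [cluster]
1.9 Configure the master node [master]
1.10 Configure the network plugin [master]
1.11 Join the nodes to the cluster [node]
1.12 Follow-up checks [master]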
The master node must have at least 2 CPU cores and at least 2 GB of memory; otherwise Kubernetes will not start.
Hostname | Address | Role | Specs |
---|---|---|---|
kub-k8s-master | 192.168.96.10 | Master node | 2 cores, 4 GB |
kub-k8s-node1 | 192.168.96.20 | Worker node | 1 core, 2 GB |
kub-k8s-node2 | 192.168.96.30 | Worker node | 1 core, 2 GB |
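Google images [these cannot be downloaded from inside mainland China because of network restrictions; the Aliyun mirror is used instead in the steps that follow]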
    docker pull k8s.gcr.io/kube-apiserver:v1.22.0
    docker pull k8s.gcr.io/kube-proxy:v1.22.0
    docker pull k8s.gcr.io/kube-controller-manager:v1.22.0
    docker pull k8s.gcr.io/kube-scheduler:v1.22.0
    docker pull k8s.gcr.io/etcd:3.5.0-0
    docker pull k8s.gcr.io/pause:3.5
    docker pull k8s.gcr.io/coredns/coredns:v1.8.4
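Special notes
Every machine must have the images.
Versions change from one deployment to the next; if initialization fails because of a version mismatch, the error output reports the required versions.
The kubeadm version and the image versions must match.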
To install Docker on every node, follow the procedure in the Docker installation section.
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5

    # After downloading, retag every image pulled from Aliyun to the k8s.gcr.io form,
    # for example k8s.gcr.io/kube-controller-manager:v1.22.0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.22.0 k8s.gcr.io/kube-controller-manager:v1.22.0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.22.0 k8s.gcr.io/kube-proxy:v1.22.0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.22.0 k8s.gcr.io/kube-apiserver:v1.22.0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.22.0 k8s.gcr.io/kube-scheduler:v1.22.0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.0-0 k8s.gcr.io/etcd:3.5.0-0
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5 k8s.gcr.io/pause:3.5

    # The Aliyun image tags can then be removed
    docker rmi -f $(docker images --format '{{.Repository}}:{{.Tag}}' | grep aliyun)
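The pull-and-retag sequence above repeats the same two commands for each image, so it can be generated from a list. The following is a minimal sketch rather than a definitive script: the function names `target_name` and `print_mirror_commands` are hypothetical, and the image list simply mirrors the versions used in this document. It prints the commands instead of running them so that they can be reviewed first.

```shell
#!/bin/sh
# Sketch: generate the pull/tag commands for the Aliyun mirror.
# MIRROR and IMAGES repeat the values used in this document.
MIRROR="registry.cn-hangzhou.aliyuncs.com/google_containers"
IMAGES="kube-apiserver:v1.22.0 kube-controller-manager:v1.22.0 kube-scheduler:v1.22.0 kube-proxy:v1.22.0 etcd:3.5.0-0 pause:3.5 coredns:v1.8.4"

# Map a mirror image name to the k8s.gcr.io name used above.
# CoreDNS is the one special case: it lives under k8s.gcr.io/coredns/.
target_name() {
    case "$1" in
        coredns:*) echo "k8s.gcr.io/coredns/$1" ;;
        *)         echo "k8s.gcr.io/$1" ;;
    esac
}

# Print one "docker pull" and one "docker tag" line per image.
print_mirror_commands() {
    for image in $IMAGES; do
        echo "docker pull $MIRROR/$image"
        echo "docker tag $MIRROR/$image $(target_name "$image")"
    done
}
```

Running `print_mirror_commands | sh` on each machine would execute the generated commands; keeping the version numbers in one variable makes it easier to keep them in step with the kubeadm version.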
cat >> /etc/hosts <
Set up local name resolution and change the hostnames, so that all nodes can resolve one another.
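As an illustration of the name-resolution step, the sketch below appends the three hosts from the table at the top of this document to a hosts file, skipping names that are already present so it can be run more than once. The helper names `add_host_entry` and `add_cluster_hosts` are hypothetical; the addresses and hostnames are the ones from the table.

```shell
#!/bin/sh
# Append "ip name" to a hosts file unless the hostname is already listed.
add_host_entry() {
    hosts_file=$1
    host_ip=$2
    host_name=$3
    if ! grep -qw -- "$host_name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$host_ip" "$host_name" >> "$hosts_file"
    fi
}

# Register all three cluster nodes in the given file.
add_cluster_hosts() {
    add_host_entry "$1" 192.168.96.10 kub-k8s-master
    add_host_entry "$1" 192.168.96.20 kub-k8s-node1
    add_host_entry "$1" 192.168.96.30 kub-k8s-node2
}
```

On a real node this would be called as `add_cluster_hosts /etc/hosts` as root, and each machine's own name set with, for example, `hostnamectl set-hostname kub-k8s-master`.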
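1. Disable the firewall:
    # systemctl disable firewalld --now
2. Turn off SELinux enforcement on the running system:
    # setenforce 0
3. Edit /etc/selinux/config and set SELINUX to disabled so that the change survives a reboot:
    # sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
   The file should then contain:
    SELINUX=disabled
4. Synchronize the time:
    # timedatectl set-timezone Asia/Shanghai
    # yum install -y ntpdate
    # ntpdate ntp.aliyun.com
5. Configure a static IP address.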
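One detail of the SELinux step is worth a sketch. On CentOS/RHEL 7, /etc/sysconfig/selinux is a symlink to /etc/selinux/config, and GNU `sed -i` replaces a symlink with a regular file unless `--follow-symlinks` is given. The hypothetical helper below (the name `disable_selinux_in` is an assumption) rewrites the file contents in place through a temporary copy, so it behaves the same for either path, and it also handles a file that is set to `permissive` rather than `enforcing`.

```shell
#!/bin/sh
# Set the SELINUX= line of the given config file to "disabled".
# The result is written back with cat, so a symlinked path keeps its link.
disable_selinux_in() {
    selinux_file=$1
    selinux_tmp=$(mktemp) || return 1
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$selinux_file" > "$selinux_tmp" &&
        cat "$selinux_tmp" > "$selinux_file"
    rm -f "$selinux_tmp"
}
```

Usage would be `disable_selinux_in /etc/selinux/config` as root; the SELINUXTYPE= line is left untouched because the pattern requires `=` directly after `SELINUX`.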
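Starting with version 1.8, Kubernetes requires system swap to be disabled; if it is left on, kubelet will not start under the default configuration.
Method 1: lift this restriction with the kubelet startup flag --fail-swap-on=false.
Method 2: disable system swap.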
1. Turn off the swap partitions:
    # swapoff -a
   Then edit /etc/fstab to comment out the automatic swap mount, and use free -m to confirm that swap is off.
2. Comment out the swap entry:
    # sed -i 's/.*swap.*/#&/' /etc/fstab
    # free -m
                  total        used        free      shared  buff/cache   available
    Mem:           3935         144        3415           8         375        3518
    Swap:             0           0           0
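The sed command above prefixes `#` to every line that mentions swap, so running it a second time comments the same line twice. A minimal sketch of a repeatable variant follows; the helper name `comment_out_swap` is hypothetical, and it only touches lines that are not already comments and that contain `swap` as a whitespace-delimited field.

```shell
#!/bin/sh
# Comment out active swap entries in an fstab-style file.
# Lines that already start with "#" are left alone, so the function
# can be run repeatedly without stacking "#" characters.
comment_out_swap() {
    fstab_file=$1
    fstab_tmp=$(mktemp) || return 1
    sed '/^[[:space:]]*#/!s/^.*[[:space:]]swap[[:space:]].*$/#&/' "$fstab_file" > "$fstab_tmp" &&
        cat "$fstab_tmp" > "$fstab_file"
    rm -f "$fstab_tmp"
}
```

On a node this would be `comment_out_swap /etc/fstab`, followed by `free -m` to confirm that the Swap row shows zeros.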
Configure the official repository [requires access to Google-hosted packages]:
    # cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=0
    EOF
Configure the Aliyun repository:
    # cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
On all nodes:
1. Install the dependencies and common tools:
    # yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git iproute lrzsz bash-completion tree bridge-utils unzip bind-utils gcc
2. Install the matching versions:
    # yum install -y kubelet-1.22.0-0.x86_64 kubeadm-1.22.0-0.x86_64 kubectl-1.22.0-0.x86_64
3. Load the ipvs-related kernel modules:
    # cat <<EOF > /etc/modules-load.d/ipvs.conf
    ip_vs
    ip_vs_lc
    ip_vs_wlc
    ip_vs_rr
    ip_vs_wrr
    ip_vs_lblc
    ip_vs_lblcr
    ip_vs_dh
    ip_vs_sh
    ip_vs_nq
    ip_vs_sed
    ip_vs_ftp
    nf_conntrack_ipv4
    ip_tables
    ip_set
    xt_set
    ipt_set
    ipt_rpfilter
    ipt_REJECT
    ipip
    EOF
4. Configure the forwarding-related kernel parameters; without them errors may occur:
    # cat <<EOF > /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables=1
    net.bridge.bridge-nf-call-ip6tables=1
    net.ipv4.ip_forward=1
    net.ipv4.tcp_tw_recycle=0
    vm.swappiness=0
    vm.overcommit_memory=1
    vm.panic_on_oom=0
    fs.inotify.max_user_instances=8192
    fs.inotify.max_user_watches=1048576
    fs.file-max=52706963
    fs.nr_open=52706963
    net.ipv6.conf.all.disable_ipv6=1
    net.netfilter.nf_conntrack_max=2310720
    EOF
5. Apply the configuration:
    # sysctl --system
6. If net.bridge.bridge-nf-call-iptables reports an error, load the br_netfilter module:
    # modprobe br_netfilter
    # modprobe ip_conntrack
    # sysctl -p /etc/sysctl.d/k8s.conf
7. Check that the modules were loaded:
    # lsmod | grep ip_vs
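Before moving on, it can help to confirm that the three forwarding-related settings are really present in the sysctl file, since a missing one only surfaces later as a kubeadm preflight or networking problem. The sketch below is one hedged way to do that; the function name `missing_sysctl_keys` and the choice of which keys to treat as required are assumptions made for illustration.

```shell
#!/bin/sh
# Keys that this sketch treats as required to be set to 1.
REQUIRED_KEYS="net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward"

# Print each required key that is not set to 1 in the given sysctl file.
# Prints nothing when all of them are present.
missing_sysctl_keys() {
    for key in $REQUIRED_KEYS; do
        grep -Eq "^[[:space:]]*$key[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$1" || echo "$key"
    done
}
```

For example, `missing_sysctl_keys /etc/sysctl.d/k8s.conf` should print nothing on a node configured as described above. This checks the file only; `sysctl net.ipv4.ip_forward` shows the value the kernel is actually using.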
1. Configure kubelet to use the pause image
Get Docker's cgroup driver:
    # DOCKER_CGROUPS=$(docker info | grep 'Cgroup' | cut -d' ' -f4)
    # echo $DOCKER_CGROUPS
Alternatively, set the variable with awk:
    [root@k8s-master ~]# DOCKER_CGROUPS=$(docker info | grep 'Cgroup' | awk 'NR==1 {print $3}')
    [root@k8s-master ~]# echo $DOCKER_CGROUPS
    cgroupfs
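Both pipelines above depend on the exact layout of the `docker info` text: newer Docker releases indent the line and also print a separate "Cgroup Version" line, which changes what `grep 'Cgroup'` returns. The sketch below matches the "Cgroup Driver:" label explicitly; the function name `parse_cgroup_driver` is hypothetical, and the sample layouts in the comments are assumptions about typical output.

```shell
#!/bin/sh
# Read `docker info` output on stdin and print only the cgroup driver,
# whether the line is indented or not, ignoring "Cgroup Version".
# Example input lines: " Cgroup Driver: systemd", "Cgroup Driver: cgroupfs"
parse_cgroup_driver() {
    sed -n 's/^[[:space:]]*Cgroup Driver:[[:space:]]*//p' | head -n 1
}
```

It would be used as `DOCKER_CGROUPS=$(docker info 2>/dev/null | parse_cgroup_driver)`. Where the Docker version supports it, `docker info --format '{{.CgroupDriver}}'` returns the same value without any text parsing.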
2. Configure kubelet's cgroup driver:
    # cat >/etc/sysconfig/kubelet<
Start kubelet:
    # systemctl daemon-reload
    # systemctl enable kubelet && systemctl restart kubelet
If you run # systemctl status kubelet at this point, you will see error messages:
    10月 11 00:26:43 node1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
    10月 11 00:26:43 node1 systemd[1]: Unit kubelet.service entered failed state.
    10月 11 00:26:43 node1 systemd[1]: kubelet.service failed.
Running # journalctl -xefu kubelet to inspect the systemd log shows that the real error is:
    unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
This error is resolved automatically once kubeadm init has generated the CA certificate, so it can be ignored for now. Put simply, kubelet keeps restarting until kubeadm init has been run.
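For orientation only, the sketch below shows one plausible shape for the contents of /etc/sysconfig/kubelet, assuming the intent is to pass the cgroup driver detected earlier and the pause image pulled earlier to kubelet through KUBELET_EXTRA_ARGS. That intent, the function name `render_kubelet_sysconfig`, and the choice of the `--cgroup-driver` and `--pod-infra-container-image` flags are assumptions, not something this document states.

```shell
#!/bin/sh
# Print a KUBELET_EXTRA_ARGS line for the given cgroup driver ($1)
# and pause image ($2). Assumed content; adjust to the actual setup.
render_kubelet_sysconfig() {
    printf 'KUBELET_EXTRA_ARGS="--cgroup-driver=%s --pod-infra-container-image=%s"\n' "$1" "$2"
}
```

Under those assumptions it would be written with `render_kubelet_sysconfig "$DOCKER_CGROUPS" k8s.gcr.io/pause:3.5 > /etc/sysconfig/kubelet`. Whatever value is used, the kubelet cgroup driver has to match the one Docker reports.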
Run the initialization as follows:
    [root@kub-k8s-master]# kubeadm init --kubernetes-version=v1.22.0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=192.168.96.10
Notes:
--apiserver-advertise-address=192.168.96.10 is the IP address of the master.
--kubernetes-version=v1.22.0 should be changed to match the version actually being installed. If the command fails with a version hint, a newer version has been released.

    [init] Using Kubernetes version: v1.22.0
    [preflight] Running pre-flight checks
        [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.03.0-ce. Latest validated version: 18.09
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Activating the kubelet service
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [kub-k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.96.10]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Generating "etcd/ca" certificate and key
    [certs] Generating "etcd/server" certificate and key
    [certs] etcd/server serving cert is signed for DNS names [kub-k8s-master localhost] and IPs [192.168.96.10 127.0.0.1 ::1]
    [certs] Generating "etcd/peer" certificate and key
    [certs] etcd/peer serving cert is signed for DNS names [kub-k8s-master localhost] and IPs [192.168.96.10 127.0.0.1 ::1]
    [certs] Generating "etcd/healthcheck-client" certificate and key
    [certs] Generating "apiserver-etcd-client" certificate and key
    [certs] Generating "sa" key and public key
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [kubeconfig] Writing "kubelet.conf" kubeconfig file
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [apiclient] All control plane components are healthy after 24.575209 seconds
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
    [upload-certs] Skipping phase. Please see --upload-certs
    [mark-control-plane] Marking the node kub-k8s-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
    [mark-control-plane] Marking the node kub-k8s-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
    [bootstrap-token] Using token: 93erio.hbn2ti6z50he0lqs
    [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy

    Your Kubernetes control-plane has initialized successfully!

    To start using your cluster, you need to run the following as a regular user:

      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config

    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/

    Then you can join any number of worker nodes by running the following on each as root:

    kubeadm join 192.168.96.10:6443 --token 93erio.hbn2ti6z50he0lqs \
        --discovery-token-ca-cert-hash sha256:3bc60f06a19bd09f38f3e05e5cff4299011b7110ca3281796668f4edb29a56d9

Record the kubeadm join command printed at the end; it is needed when the worker nodes join the cluster.
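Because the join command at the end of the output has to be kept, it can be convenient to capture it from a saved copy of the init log rather than copying it by hand. The following is a rough sketch: the function name `extract_join_params` is hypothetical, and it assumes the log contains the join command in the two-line form shown above, with a trailing backslash.

```shell
#!/bin/sh
# Read saved `kubeadm init` output on stdin and print the bootstrap
# token and CA certificate hash from the "kubeadm join" command.
# Backslashes and newlines are folded into single spaces first, so the
# continuation line is joined to the line before it.
extract_join_params() {
    tr -s ' \\\n' ' ' |
        sed -n 's/.*kubeadm join [^ ]* --token \([^ ]*\) --discovery-token-ca-cert-hash \(sha256:[0-9a-f]*\).*/token=\1 hash=\2/p'
}
```

For example, after `kubeadm init ... | tee kubeadm-init.log`, running `extract_join_params < kubeadm-init.log` prints a single `token=... hash=...` line. The log file name here is only an example.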
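The above is the complete initialization output; from it you can see the key steps required to initialize a Kubernetes cluster by hand. The key items are:
[kubelet-start] generates the kubelet configuration file "/var/lib/kubelet/config.yaml"
[certs] generates the various certificates
[kubeconfig] generates the kubeconfig files
[bootstrap-token] generates the token; record it, because it is needed later when adding nodes to the cluster with kubeadm join
Configure kubectl. Perform the following on the master node:
    [root@kub-k8s-master ~]# rm -rf $HOME/.kube
    [root@kub-k8s-master ~]# mkdir -p $HOME/.kube
    [root@kub-k8s-master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    [root@kub-k8s-master ~]# chown $(id -u):$(id -g) $HOME/.kube/config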
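View the nodes. The master reports NotReady at this stage because no network plugin has been installed yet:
    [root@k8s-master ~]# kubectl get nodes
    NAME         STATUS     ROLES    AGE     VERSION
    k8s-master   NotReady   master   2m41s   v1.22.0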
    # Version compatibility: https://projectcalico.docs.tigera.io/archive/v3.22/getting-started/kubernetes/requirements
    #> Deploy the Calico network plugin
    curl -L https://docs.projectcalico.org/v3.22/manifests/calico.yaml -O
    kubectl apply -f calico.yaml

    # kubectl get pod -A
    NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
    kube-system   calico-kube-controllers-6d9cdcd744-8jt5g   1/1     Running   0          6m50s
    kube-system   calico-node-rkz4s                          1/1     Running   0          6m50s
    kube-system   coredns-74ff55c5b-bcfzg                    1/1     Running   0          52m
    kube-system   coredns-74ff55c5b-qxl6z                    1/1     Running   0          52m
    kube-system   etcd-kub-k8s-master                        1/1     Running   0          53m
    kube-system   kube-apiserver-kub-k8s-master              1/1     Running   0          53m
    kube-system   kube-controller-manager-kub-k8s-master     1/1     Running   0          53m
    kube-system   kube-proxy-gfhkf                           1/1     Running   0          52m
    kube-system   kube-scheduler-kub-k8s-master              1/1     Running   0          53m
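How Calico works
BGP-based networking: Calico uses the BGP protocol to establish network connectivity between the nodes of a Kubernetes cluster. Each node runs a Calico agent (deployed as a DaemonSet) that sets up BGP sessions with the other nodes and routes container traffic to the correct destination.
Network policy: Calico provides a flexible network policy model for defining access rules between containers. Containers are identified by labels, and network policies control the traffic between them.
IP address management: Calico automatically manages IP address allocation in the Kubernetes cluster. Where needed, it uses IPIP tunneling to encapsulate container traffic in packets addressed between node IPs, so that containers on different nodes can reach each other.
Network isolation: Calico provides network isolation, allowing different containers and applications to be separated into different networks.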
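How Flannel works
Overlay network: Flannel runs a daemon on every node of the Kubernetes cluster; the daemon uses VXLAN to build an overlay network across the nodes.
Network configuration: the Flannel daemon configures the virtual network and routes container traffic to the correct destination.
IP address allocation: Flannel automatically manages IP address allocation in the Kubernetes cluster. It uses an IPAM (IP Address Management) module to allocate addresses and assign them to containers.
Network policy: Flannel on its own does not enforce Kubernetes network policies. To define access rules between containers and control the traffic between them, it has to be combined with a policy-capable component such as Calico.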
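Join the worker nodes to the cluster.
If an error is reported, enable IP forwarding:
    # sysctl -w net.ipv4.ip_forward=1
Run the following on every worker node; this is the command printed when the master was initialized successfully:
    # kubeadm join 192.168.96.10:6443 --token 93erio.hbn2ti6z50he0lqs \
        --discovery-token-ca-cert-hash sha256:3bc60f06a19bd09f38f3e05e5cff4299011b7110ca3281796668f4edb29a56d9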
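A mistyped token is a common reason for a failed join, so a quick format check before running kubeadm join can save a round trip. Bootstrap tokens have the form of six lowercase alphanumeric characters, a dot, and sixteen more. The helper name `is_bootstrap_token` below is hypothetical, and the check only validates the format, not whether the token still exists on the master.

```shell
#!/bin/sh
# Succeed if $1 looks like a kubeadm bootstrap token:
# 6 characters of [a-z0-9], a dot, then 16 characters of [a-z0-9].
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}
```

For example, `is_bootstrap_token 93erio.hbn2ti6z50he0lqs && echo ok`. Tokens created by kubeadm init expire after 24 hours by default, so a correctly formatted token can still be rejected; `kubeadm token list` on the master shows which tokens are currently valid.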
Checks:
1. View the pods:
    [root@kub-k8s-master ~]# kubectl get pods -n kube-system
    NAME                                     READY   STATUS    RESTARTS   AGE
    coredns-5644d7b6d9-sm8hs                 1/1     Running   0          39m
    coredns-5644d7b6d9-vddll                 1/1     Running   0          39m
    etcd-kub-k8s-master                      1/1     Running   0          37m
    kube-apiserver-kub-k8s-master            1/1     Running   0          38m
    kube-controller-manager-kub-k8s-master   1/1     Running   0          38m
    kube-flannel-ds-amd64-9wgd8              1/1     Running   0          38m
    kube-flannel-ds-amd64-lffc8              1/1     Running   0          2m11s
    kube-flannel-ds-amd64-m8kk2              1/1     Running   0          2m2s
    kube-proxy-dwq9l                         1/1     Running   0          2m2s
    kube-proxy-l77lz                         1/1     Running   0          2m11s
    kube-proxy-sgphs                         1/1     Running   0          39m
    kube-scheduler-kub-k8s-master            1/1     Running   0          37m
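Scanning a pod listing by eye gets tedious as the cluster grows, so the check can be reduced to printing only the pods that need attention. This is a minimal sketch: the function name `unhealthy_pods` is hypothetical, and it assumes the five-column layout shown above (NAME, READY, STATUS, RESTARTS, AGE) with the header line removed.

```shell
#!/bin/sh
# Read `kubectl get pods -n kube-system --no-headers` output on stdin.
# Print the name of every pod whose STATUS is not Running or whose
# READY count (for example 0/1) shows containers that are not ready.
unhealthy_pods() {
    awk '{ split($2, ready, "/"); if ($3 != "Running" || ready[1] != ready[2]) print $1 }'
}
```

It would be used as `kubectl get pods -n kube-system --no-headers | unhealthy_pods`; empty output means every listed pod is Running with all of its containers ready.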
2. View the nodes:
    [root@kub-k8s-master ~]# kubectl get nodes
    NAME             STATUS   ROLES    AGE     VERSION
    kub-k8s-master   Ready    master   43m     v1.22.0
    kub-k8s-node1    Ready    <none>   6m46s   v1.22.0
    kub-k8s-node2    Ready    <none>   6m37s   v1.22.0
The cluster configuration is now complete.
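The same idea applies to the node listing: a script can decide whether the cluster is ready instead of a person reading the STATUS column. The sketch below is one hedged way to do it; the function name `all_nodes_ready` is hypothetical, and it assumes the input has no header line and that STATUS is the second column.

```shell
#!/bin/sh
# Read `kubectl get nodes --no-headers` output on stdin.
# Exit 0 only if at least one node is listed and every node's STATUS
# is exactly "Ready"; exit 1 otherwise (including for empty input).
all_nodes_ready() {
    awk '{ total++ } $2 != "Ready" { bad++ } END { exit (total > 0 && bad == 0) ? 0 : 1 }'
}
```

For example, `kubectl get nodes --no-headers | all_nodes_ready && echo "cluster ready"`. A status such as `Ready,SchedulingDisabled` on a cordoned node counts as not ready in this sketch, which is a deliberate simplification.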
Troubleshooting
#> If cluster initialization fails (run on every node):
    $ kubeadm reset -f; ipvsadm --clear; rm -rf ~/.kube
    $ systemctl restart kubelet
#> If you have forgotten the token:
    $ kubeadm token create --print-join-command
    $ kubeadm init phase upload-certs --upload-certs
Add a label:
    kubectl label nodes node3 name=value
Remove a label:
    kubectl label nodes node3 name-