I. Introduction
II. Deployment Plan
The following components must be installed on every node:
kubeadm: the tool used to bootstrap (initialize) the cluster.
kubelet: the agent that runs on every node in the cluster and is responsible for starting Pods and containers.
kubectl: the Kubernetes command-line tool. With kubectl you can deploy and manage applications, inspect resources, and create, delete, and update the various components.
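As a taste of the kubectl workflow described above, here is a minimal sketch of everyday commands; the deployment name `demo` and the image tags are illustrative placeholders, not part of this deployment:

```bash
# Deploy and manage an application (names and images here are placeholders)
kubectl create deployment demo --image=nginx:1.20   # create a Deployment
kubectl get pods -o wide                            # inspect resources
kubectl describe deployment demo                    # view details and events
kubectl set image deployment/demo nginx=nginx:1.21  # update a component
kubectl delete deployment demo                      # delete it again
```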
III. Host Initialization
1. Operating system information
OS version: # cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
Host and kernel info: # uname -a
Linux bogon 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
2. Set the hostnames on the master and worker nodes and update the hosts file
```bash
hostnamectl set-hostname master01   # run on the master node
hostnamectl set-hostname worker01   # run on the first worker
hostnamectl set-hostname worker02   # run on the second worker
```
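A quick check (a sketch; run on each node) confirms the new hostname took effect before moving on to the hosts file:

```bash
hostnamectl status | grep "Static hostname"   # should print the name set above
```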
Append the cluster entries to /etc/hosts:

```bash
cat >> /etc/hosts <<EOF
192.168.120.10 master01
192.168.120.11 worker01
192.168.120.12 worker02
EOF

[root@bogon ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.120.10 master01
192.168.120.11 worker01
192.168.120.12 worker02
```

3. Install chrony to synchronize time across all servers

```bash
# yum install chrony -y
# systemctl start chronyd
# sed -i -e '/^server/s/^/#/' -e '1a server ntp.aliyun.com iburst' /etc/chrony.conf
# systemctl restart chronyd
# timedatectl set-timezone Asia/Shanghai
# timedatectl
[root@bogon ~]# timedatectl
      Local time: Sun 2021-03-07 20:06:03 CST
  Universal time: Sun 2021-03-07 12:06:03 UTC
        RTC time: Sun 2021-03-07 12:06:02
       Time zone: Asia/Shanghai (CST, +0800)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a
```

4. Disable the firewall and SELinux on the master and worker nodes

```bash
# systemctl stop firewalld && systemctl disable firewalld
# sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
# Check that the file now reads SELINUX=disabled; if it still says SELINUX=enforcing, change it to disabled
# setenforce 0
# getenforce

# Turn off swap
# swapoff -a
# To disable swap permanently, comment out the swap line in the following file:
# vi /etc/fstab
[root@bogon ~]# more /etc/fstab
#
# /etc/fstab
# Created by anaconda on Thu Mar 4 04:12:45 2021
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root                     /       xfs   defaults   0 0
UUID=e078410b-2654-4884-a113-dede3e7d3fc5   /boot   xfs   defaults   0 0
#/dev/mapper/centos-swap                    swap    swap  defaults   0 0
```

5. Configure and tune kernel parameters

Configure the sysctl kernel parameters:

```bash
$ cat > /etc/sysctl.conf <<EOF
vm.max_map_count=262144
net.ipv4.ip_forward = 1
#net.bridge.bridge-nf-call-ip6tables = 1
#net.bridge.bridge-nf-call-iptables = 1
EOF
```

Apply the file:

```bash
$ sysctl -p
[root@bogon ~]# sysctl -p
vm.max_map_count = 262144
net.ipv4.ip_forward = 1
```

Edit the Linux resource limits, raising the maximum number of open files both for ulimit and for services managed by systemd:

```bash
$ echo "* soft nofile 655360" >> /etc/security/limits.conf
$ echo "* hard nofile 655360" >> /etc/security/limits.conf
$ echo "* soft nproc 655360" >> /etc/security/limits.conf
$ echo "* hard nproc 655360" >> /etc/security/limits.conf
$ echo "* soft memlock unlimited" >> /etc/security/limits.conf
$ echo "* hard memlock unlimited" >> /etc/security/limits.conf
$ echo "DefaultLimitNOFILE=1024000" >> /etc/systemd/system.conf
$ echo "DefaultLimitNPROC=1024000" >> /etc/systemd/system.conf
```

6. Set up mutual SSH trust between hosts

With SSH mutual trust configured, the nodes can reach each other without passwords, which makes automated deployment easier later on.

```bash
# Run on every machine; press Enter through every prompt
# ssh-keygen
# Then run the following on every machine:
ssh-copy-id master01
ssh-copy-id worker01
ssh-copy-id worker02
```

IV. Install Docker on Both the Master and Worker Nodes

```bash
# Install the required dependencies
# yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the yum repository for the Docker packages
# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# Turn off the test/edge release lists (show stable releases only)
# yum-config-manager --disable docker-ce-edge
# yum-config-manager --disable docker-ce-test
# Refresh the yum package index
# yum makecache fast

# Install Docker CE directly
# yum install docker-ce
# Or, to install a specific Docker CE version, list the candidates and pick one:
# yum list docker-ce --showduplicates | sort -r
# yum install docker-ce-18.06.0.ce -y

# Start Docker and enable it at boot
# systemctl start docker && systemctl enable docker
```

Configure Docker to use the Aliyun registry mirror:

```bash
# vi /etc/docker/daemon.json
{
  "registry-mirrors": ["https://q2hy3fzi.mirror.aliyuncs.com"]
}
# systemctl daemon-reload && systemctl restart docker
```
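Because passwordless SSH is already in place (step III.6), the per-node checks above can be driven from a single host. A minimal sketch, assuming the hostnames from /etc/hosts above; `prep-check.sh` is a hypothetical helper, not part of the original procedure:

```bash
#!/bin/bash
# prep-check.sh -- hypothetical helper: verify the prep steps on every node over SSH.
for node in master01 worker01 worker02; do
  echo "=== $node ==="
  ssh root@"$node" '
    getenforce                                  # expect: Disabled (or Permissive)
    systemctl is-active firewalld || true       # expect: inactive
    swapon --show | wc -l                       # expect: 0 (no active swap)
    docker info --format "{{.ServerVersion}}"   # confirm Docker is running
  '
done
```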
V. Install the Kubernetes Tools on the Master and Worker Nodes

Switch the yum source to the Aliyun mirror:

```bash
# vi /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
```

Install the Kubernetes tools with yum:

```bash
# yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Or install a specific version
# yum install -y kubelet-1.19.4 kubeadm-1.19.4 kubectl-1.19.4 --disableexcludes=kubernetes
# Enable and start the kubelet service
# systemctl enable kubelet && systemctl start kubelet
# Check the version number
# kubeadm version
```

Configure the iptables bridge settings:

```bash
# vi /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
vm.swappiness=0
# After saving, apply with:
# sysctl --system
[root@worker02 ~]# vi /etc/sysctl.conf
[root@worker02 ~]# sysctl -p
vm.max_map_count = 262144
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
```

VI. Get the Image List on the Master Node and Initialize the Cluster

```bash
[root@master01 ~]# kubeadm config images list
W0307 20:52:52.078151   10223 version.go:102] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get "https://storage.googleapis.com/kubernetes-release/release/stable-1.txt": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
W0307 20:52:52.078791   10223 version.go:103] falling back to the local client version: v1.20.4
k8s.gcr.io/kube-apiserver:v1.20.4
k8s.gcr.io/kube-controller-manager:v1.20.4
k8s.gcr.io/kube-scheduler:v1.20.4
k8s.gcr.io/kube-proxy:v1.20.4
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns:1.7.0
```

Create a script to download the images; it can be set up as follows:

```bash
# vi docker.sh
#!/bin/bash
docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.4
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.4 k8s.gcr.io/kube-apiserver:v1.20.4
docker rmi registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.4
docker pull registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.4
docker tag registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.4 k8s.gcr.io/kube-controller-manager:v1.20.4
docker rmi registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.4
docker pull registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.4
docker tag registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.4 k8s.gcr.io/kube-scheduler:v1.20.4
docker rmi registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.4
docker pull registry.aliyuncs.com/google_containers/kube-proxy:v1.20.4
docker tag registry.aliyuncs.com/google_containers/kube-proxy:v1.20.4 k8s.gcr.io/kube-proxy:v1.20.4
docker rmi registry.aliyuncs.com/google_containers/kube-proxy:v1.20.4
docker pull registry.aliyuncs.com/google_containers/etcd:3.4.13-0
docker tag registry.aliyuncs.com/google_containers/etcd:3.4.13-0 k8s.gcr.io/etcd:3.4.13-0
docker rmi registry.aliyuncs.com/google_containers/etcd:3.4.13-0
docker pull registry.aliyuncs.com/google_containers/pause:3.2
docker tag registry.aliyuncs.com/google_containers/pause:3.2 k8s.gcr.io/pause:3.2
docker rmi registry.aliyuncs.com/google_containers/pause:3.2
docker pull coredns/coredns:1.7.0
docker tag coredns/coredns:1.7.0 k8s.gcr.io/coredns:1.7.0
docker rmi coredns/coredns:1.7.0
```

```bash
[root@master01 ~]# docker images
REPOSITORY                           TAG        IMAGE ID       CREATED         SIZE
k8s.gcr.io/kube-proxy                v1.20.4    c29e6c583067   2 weeks ago     118MB
k8s.gcr.io/kube-controller-manager   v1.20.4    0a41a1414c53   2 weeks ago     116MB
k8s.gcr.io/kube-apiserver            v1.20.4    ae5eb22e4a9d   2 weeks ago     122MB
k8s.gcr.io/kube-scheduler            v1.20.4    5f8cb769bd73   2 weeks ago     47.3MB
k8s.gcr.io/etcd                      3.4.13-0   0369cf4303ff   6 months ago    253MB
k8s.gcr.io/coredns                   1.7.0      bfe3a36ebd25   8 months ago    45.2MB
k8s.gcr.io/pause                     3.2        80d28bedfe5d   12 months ago   683kB
```
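The pull/tag/rmi pattern in docker.sh is identical for every image, so it can also be written as a loop. A minimal sketch under the same assumptions; note that the original script pulls coredns from Docker Hub (coredns/coredns) rather than the Aliyun mirror, so adjust that entry if the mirror does not carry it:

```bash
#!/bin/bash
# Compact variant of docker.sh: pull from the mirror, retag to the
# k8s.gcr.io name that kubeadm expects, then drop the mirror tag.
MIRROR=registry.aliyuncs.com/google_containers
for img in kube-apiserver:v1.20.4 kube-controller-manager:v1.20.4 \
           kube-scheduler:v1.20.4 kube-proxy:v1.20.4 \
           pause:3.2 etcd:3.4.13-0 coredns:1.7.0; do
  docker pull "$MIRROR/$img"
  docker tag "$MIRROR/$img" "k8s.gcr.io/$img"
  docker rmi "$MIRROR/$img"
done
```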
Initialize the master:

```bash
# kubeadm init \
  --apiserver-advertise-address=192.168.120.10 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.20.4 \
  --service-cidr=10.96.0.0/16 \
  --pod-network-cidr=10.244.0.0/16
```

The flags mean the following:

- --apiserver-advertise-address: the address the master (API server) listens on
- --image-repository: the source the images are downloaded from
- --kubernetes-version: the Kubernetes version to install
- --service-cidr: the internal Service network of the cluster
- --pod-network-cidr: the Pod network

Output:

```
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.120.10:6443 --token u1qw31.qacahi507rjw0me4 \
    --discovery-token-ca-cert-hash sha256:54b013bab495f94242270028e2175de3593dc80a8bc763b63aed91924f61a1fa
```

The cluster control plane installed successfully. Remember to save the join command so the individual nodes can join the cluster later:

```
kubeadm join 192.168.120.10:6443 --token netw4z.cwl1q17m2kgbiekv \
    --discovery-token-ca-cert-hash sha256:2e0760511c65fff0a561b79fff019f6c632d8b09df3ef3479cb09e95d68e4486
```

Configure the kubectl credentials:

```bash
# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
# source ~/.bash_profile
```

or:

```bash
# export KUBECONFIG=/etc/kubernetes/admin.conf
# source ~/.bash_profile
```

Alternatively, run the following on the master node:

```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

Now check the cluster pods and confirm that each component is in the Running state. Note that the coredns pods stay Pending for now: they cannot be scheduled until a Pod network add-on is deployed (see section VII).

```bash
[root@master01 ~]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
[root@master01 ~]# source ~/.bash_profile
[root@master01 ~]# kubectl get pod -n kube-system
NAME                               READY   STATUS    RESTARTS   AGE
coredns-74ff55c5b-4n4wm            0/1     Pending   0          6m10s
coredns-74ff55c5b-g6xr4            0/1     Pending   0          6m10s
etcd-master01                      1/1     Running   0          6m25s
kube-apiserver-master01            1/1     Running   0          6m25s
kube-controller-manager-master01   1/1     Running   0          6m25s
kube-proxy-lcv5t                   1/1     Running   0          6m10s
kube-scheduler-master01            1/1     Running   0          6m25s
```

VII. Deploy the flannel Network Add-on to the Cluster

Set up the flannel network:

```bash
# mkdir -p /root/k8s/
# cd /root/k8s
# yum install wget
# wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```

Check which image needs to be downloaded:

```bash
# cat kube-flannel.yml | grep image | uniq
    image: quay.io/coreos/flannel:v0.13.1-rc2
```

Download the image:

```bash
# docker pull quay.io/coreos/flannel:v0.13.1-rc2
```

Deploy the add-on:

```bash
# kubectl apply -f kube-flannel.yml
```

Note: if https://raw.githubusercontent.com is unreachable, do the following:

```bash
# Open https://site.ip138.com/raw.Githubusercontent.com/ and look up the IP address of raw.githubusercontent.com
# Then add it to the hosts file (on Ubuntu, CentOS, and macOS, edit directly in a terminal)
# vi /etc/hosts
185.199.109.133 raw.githubusercontent.com
199.232.68.133 raw.githubusercontent.com
```

If "Network": "10.244.0.0/16" in the yml differs from the --pod-network-cidr passed to kubeadm init, change them to match; otherwise Cluster IPs may be unreachable between nodes. Since the kubeadm init above already used --pod-network-cidr=10.244.0.0/16, this yaml file needs no change.
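After applying the manifest, the following checks (a sketch; the exact pod names will differ) confirm that the network is up and that coredns has left the Pending state:

```bash
# flannel runs as a DaemonSet; one pod per node should reach Running
kubectl get pods -n kube-system -l app=flannel -o wide
# nodes switch from NotReady to Ready once the CNI is in place
kubectl get nodes
# coredns should now be scheduled and become Running
kubectl get pods -n kube-system | grep coredns
```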
VIII. Configure Command Completion for the k8s Cluster

```bash
# (master only)
# yum install -y bash-completion
# source <(kubectl completion bash)
# echo "source <(kubectl completion bash)" >> ~/.bashrc
# source ~/.bashrc
```

IX. Join the Worker Nodes to the Cluster [run on each worker node]

```bash
kubeadm join 192.168.120.10:6443 --token netw4z.cwl1q17m2kgbiekv \
    --discovery-token-ca-cert-hash sha256:2e0760511c65fff0a561b79fff019f6c632d8b09df3ef3479cb09e95d68e4486
```

Check the state of the cluster:

```bash
# kubectl get cs
# kubectl get pod --all-namespaces
# kubectl get node
```

If you hit the error where kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests set the default port to 0, this post walks through the fix: https://llovewxm1314.blog.csdn.net/article/details/108458197

```bash
# cd /etc/kubernetes/manifests
# vi kube-controller-manager.yaml    # comment out line 27: - --port=0
# vi kube-scheduler.yaml             # likewise comment out: - --port=0
```

Restart kubelet on all nodes:

```bash
systemctl restart kubelet.service
```

Commands for troubleshooting:

```bash
kubectl describe pod kube-flannel-ds-xstnz --namespace=kube-system
kubectl --namespace kube-system logs kube-flannel-ds-xstnz
```

To wipe the configuration and reinstall (the host preparation must then be redone):

```bash
kubectl logs kube-flannel-ds-sjs4p -n kube-system -f   # follow the logs when startup fails
kubectl delete -f kube-flannel.yml                     # delete the flannel pods
```

```bash
kubeadm reset -f
modprobe -r ipip
lsmod
rm -rf ~/.kube/
rm -rf /etc/kubernetes/
rm -rf /etc/systemd/system/kubelet.service.d
rm -rf /etc/systemd/system/kubelet.service
rm -rf /usr/bin/kube*
rm -rf /etc/cni
rm -rf /opt/cni
rm -rf /var/lib/etcd
rm -rf /var/etcd
```

X. Verify the Cluster with an nginx Deployment

Create an nginx pod in the Kubernetes cluster, expose a port, and verify that it is reachable:

```bash
# kubectl create deployment nginx --image=nginx
# kubectl expose deployment nginx --port=80 --type=NodePort
# kubectl get pods,svc
```

Access URL: http://NodeIP:Port; in this example that is http://192.168.120.10:32251

XI. Deploy the Dashboard

Download the Dashboard add-on configuration file:

```bash
# Pull the images on the master01 node [only the master needs them]
# docker pull kubernetesui/dashboard:v2.2.0
# docker pull kubernetesui/metrics-scraper:v1.0.6
# Download the configuration
# cd /root/k8s
# wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml
# mv recommended.yaml kubernetes-dashboard.yaml
```

Edit kubernetes-dashboard.yaml and add type: NodePort to the Dashboard Service to expose it (note that in the v2.2.0 manifest this Service lives in the kubernetes-dashboard namespace):

```yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  type: NodePort
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
```

Run the command:

```bash
# kubectl create -f kubernetes-dashboard.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
```

Create a ServiceAccount and bind it to the default cluster-admin cluster role:

```bash
# kubectl create serviceaccount dashboard-admin -n kubernetes-dashboard
# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:dashboard-admin
```

Log in to kubernetes-dashboard:

```bash
# kubectl get secret -n kubernetes-dashboard
# kubectl describe secret dashboard-admin-token-5mjtx -n kubernetes-dashboard
```

Note: command to inspect the kubernetes-dashboard service:

```bash
# kubectl --namespace=kubernetes-dashboard get service kubernetes-dashboard
```
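As an alternative to editing the manifest by hand, the Service can also be switched to NodePort after deployment with kubectl patch. A sketch, where the fixed nodePort 30443 is an arbitrary choice rather than something the original procedure uses:

```bash
# Switch the Dashboard Service to NodePort and pin the port (30443 is arbitrary)
kubectl -n kubernetes-dashboard patch service kubernetes-dashboard \
  -p '{"spec":{"type":"NodePort","ports":[{"port":443,"nodePort":30443}]}}'
# Confirm the assigned NodePort
kubectl -n kubernetes-dashboard get service kubernetes-dashboard
```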
How to fix Google Chrome refusing to open the kubernetes dashboard:

Run:

```bash
# mkdir key && cd key
```

Generate a certificate:

```bash
# openssl genrsa -out dashboard.key 2048
# openssl req -days 36000 -new -out dashboard.csr -key dashboard.key -subj '/CN=dashboard-cert'
# openssl x509 -req -in dashboard.csr -signkey dashboard.key -out dashboard.crt
```

Delete the existing certificate secret:

```bash
# kubectl delete secret kubernetes-dashboard-certs -n kubernetes-dashboard
```

Create the new certificate secret:

```bash
# kubectl create secret generic kubernetes-dashboard-certs --from-file=dashboard.key --from-file=dashboard.crt -n kubernetes-dashboard
```

Check the pods:

```bash
# kubectl get pod -n kubernetes-dashboard
```

Restart the pod (deleting it makes the Deployment recreate it):

```bash
# kubectl delete pod kubernetes-dashboard-9f9799597-ct4sp -n kubernetes-dashboard
```

Check the result:

```bash
kubectl get service -n kubernetes-dashboard -o wide
```

XII. Log in to the Cluster Dashboard

```bash
# Look up the NodePort of the svc
# kubectl get svc -n kubernetes-dashboard
# Get the login token (the dashboard-admin ServiceAccount was created in the kubernetes-dashboard namespace above)
# kubectl describe secrets -n kubernetes-dashboard $(kubectl -n kubernetes-dashboard get secret | awk '/dashboard-admin/{print $1}')
```

XIII. Deploy a Management Tool

Lens is an open-source IDE for managing Kubernetes clusters; it supports macOS, Windows, and Linux. With Lens we can conveniently manage multiple Kubernetes clusters. Official website: https://k8slens.dev/

XIV. References

https://www.cnblogs.com/wenyang321/p/14050893.html
https://cloud.tencent.com/developer/article/1509412
https://kuboard.cn/install/install-k8s.html