Environment preparation
Hostname   OS                  IP address
master     CentOS 7.2 x86_64   10.199.187.176
node1      CentOS 7.2 x86_64   10.199.187.177
node2      CentOS 7.2 x86_64   10.199.187.178
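If the three machines cannot resolve each other's hostnames via DNS, it can help to add the mappings to /etc/hosts on every node; a minimal sketch using the addresses above:
# Optional: map hostnames to IPs on every node (only needed if DNS does not already resolve them)
cat <<EOF >> /etc/hosts
10.199.187.176 master
10.199.187.177 node1
10.199.187.178 node2
EOF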
Disable the firewall service that ships with CentOS 7
systemctl disable firewalld
systemctl stop firewalld
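To confirm the firewall is really off (for example after a reboot), a quick check:
# Should print "disabled" and "inactive" respectively
systemctl is-enabled firewalld
systemctl is-active firewalld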
Set the hostnames
# Run on the 10.199.187.176 node
[root@localhost ~]# hostnamectl set-hostname master
# Run on the 10.199.187.177 node
[root@localhost ~]# hostnamectl set-hostname node1
# Run on the 10.199.187.178 node
[root@localhost ~]# hostnamectl set-hostname node2
Disable SELinux
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
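A quick way to confirm the current SELinux mode:
# Should print "Permissive" (or "Disabled") after the change above
getenforce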
Disable swap
swapoff -a # turn swap off on the running system
sed -ri '/[#]*swap/s@^@#@' /etc/fstab # comment out the swap entry so it is not mounted again at boot
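To verify that swap is really off:
# swapon prints nothing and free shows 0 for swap when it is disabled
swapon --show
free -h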
Add the Kubernetes yum repository
vim /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
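If you prefer not to open an editor, the same repo file can be written non-interactively with a heredoc:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF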
Install Docker and enable it at boot
yum install -y docker
systemctl enable docker
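Docker also needs to be running before kubeadm init, so start it now; it is worth checking which cgroup driver it reports, because kubelet and Docker should use the same driver (the kubeadm-flags.env at the end of this article uses systemd):
systemctl start docker
# Check the cgroup driver Docker is using
docker info | grep -i 'cgroup driver'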
Install kubelet, kubeadm and kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
kubeadm: the tool that bootstraps the cluster.
kubelet: the component that has to be installed on every machine in the cluster; it starts and runs Pods and containers.
kubectl: the official command-line tool (CLI) for talking to the cluster.
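The install command above pulls whatever is newest in the repo. To keep the packages in step with the --kubernetes-version used during init later (v1.18.3 in this article), the versions can be pinned; this assumes the mirror still provides that release:
# Install a specific version instead of the latest (adjust the version string to what the repo offers)
yum install -y kubelet-1.18.3 kubeadm-1.18.3 kubectl-1.18.3 --disableexcludes=kubernetes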
Enable kubelet at boot
systemctl enable --now kubelet
Network-related settings
echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
echo '1' > /proc/sys/net/ipv4/ip_forward
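The two echo commands only change the running kernel and are lost after a reboot. To make the settings persistent they can be written to a sysctl drop-in file; if the bridge key does not exist yet, load the br_netfilter module first:
modprobe br_netfilter
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# Reload all sysctl configuration files
sysctl --system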
Initialize the control plane (on the master node)
kubeadm init --image-repository=registry.aliyuncs.com/google_containers --service-cidr=10.199.187.0/24 --pod-network-cidr=192.168.3.0/16 --kubernetes-version=v1.18.3
or
kubeadm init --image-repository=registry.aliyuncs.com/google_containers --apiserver-advertise-address=10.199.187.176 --pod-network-cidr=192.168.3.0/16 --kubernetes-version=v1.18.3
--service-cidr: the virtual IP range from which Service cluster IPs are allocated; it should not overlap the node or Pod networks
--apiserver-advertise-address: the address the API server advertises for communication with the other nodes
--pod-network-cidr: the subnet for the Pod network; when using the flannel add-on this CIDR must match the one in its manifest
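Optionally, the control-plane images can be pulled before running kubeadm init, so the init itself does not stall on downloads:
# Pre-pull the images from the same mirror used for init
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers --kubernetes-version=v1.18.3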
After initialization succeeds, output like the following is displayed:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 10.199.187.176:6443 --token 5qoq4l.d93m61ae11cbefze \
--discovery-token-ca-cert-hash sha256:e8231966b98f13efd7afbdd6a89e32a3440435c67c56f0f194226edfd435d596
Follow the prompt and run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
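Once the kubeconfig is in place, a quick sanity check that kubectl can reach the API server:
kubectl cluster-info
kubectl get nodes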
Then install a Pod network add-on in the cluster:
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Calico is used here:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Run kubectl get nodes and wait for the master's status to change from NotReady to Ready.
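The network pods can also be watched while they come up:
# Watch the calico and coredns pods until they all reach Running
kubectl get pods -n kube-system -w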
By default the cluster does not schedule Pods on the master node. To allow Pods on the master, run the following command; this keeps the cluster usable even when there is only a single node.
kubectl taint nodes --all node-role.kubernetes.io/master-
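If you later want the default behaviour back, the taint can be re-applied to the master (the node name here matches the hostname set earlier):
kubectl taint nodes master node-role.kubernetes.io/master=:NoSchedule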
Join the worker nodes
A successful kubeadm init prints the corresponding kubeadm join command:
kubeadm join 10.199.187.176:6443 --token 5qoq4l.d93m61ae11cbefze \
--discovery-token-ca-cert-hash sha256:e8231966b98f13efd7afbdd6a89e32a3440435c67c56f0f194226edfd435d596
If the token has expired, a new join command can be generated:
kubeadm token create --print-join-command
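Existing bootstrap tokens and their expiry can be listed on the master with:
kubeadm token list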
Check the status of the Pods in the cluster
[root@master ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-76d4774d89-6kp2x 1/1 Running 3 42h
kube-system calico-node-bnppg 1/1 Running 2 42h
kube-system calico-node-kg5kg 1/1 Running 1 42h
kube-system calico-node-qfkh2 1/1 Running 1 42h
kube-system coredns-7ff77c879f-b8ckk 1/1 Running 2 43h
kube-system coredns-7ff77c879f-lfmdh 1/1 Running 3 43h
kube-system etcd-master 1/1 Running 2 42h
kube-system kube-apiserver-master 1/1 Running 2 42h
kube-system kube-controller-manager-master 1/1 Running 2 42h
kube-system kube-proxy-6qgcw 1/1 Running 1 43h
kube-system kube-proxy-rcdn9 1/1 Running 3 42h
kube-system kube-proxy-thkkj 1/1 Running 1 42h
kube-system kube-scheduler-master 1/1 Running 2 42h
Check the node status
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 43h v1.18.3
node1 Ready <none> 43h v1.18.3
node2 Ready <none> 43h v1.18.3
Troubleshooting
If a Pod shows 0 under READY, inspect it with kubectl describe pod <pod-name> -n <namespace>.
Node-level events can be checked with kubectl describe node <node-name>.
Kubelet logs can be followed with journalctl -f -u kubelet.
To restart kubelet:
systemctl daemon-reload
systemctl restart kubelet
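After restarting, confirm that kubelet stays up:
systemctl status kubelet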
Errors encountered
The connection to the server 10.199.187.176:6443 was refused - did you specify the right host or port?
In my case the cause was that firewalld had not been disabled at boot: after the master node was rebooted, the firewall came back up with it. Disabling the firewall fixed the problem.
Failed to start ContainerManager failed to initialize top level QOS containers: failed to update top level BestEffort QOS cgroup : failed to set supported cgroup subsystems for cgroup [kubepods besteffort]: Failed to set config for supported subsystems : failed to write 4611686018427387904 to hugetlb.1GB.limit_in_bytes: open /sys/fs/cgroup/hugetlb/kubepods.slice/kubepods-besteffort.slice/hugetlb.1GB.limit_in_bytes: no such file or directory
Nov 29 23:32:13 localhost systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
The fix is to stop the kubepods-related systemd slice on every affected node.
Run systemctl stop kubepods.slice, then restart kubelet.
Alternatively, add the flags
--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice to /var/lib/kubelet/kubeadm-flags.env, as shown below:
[root@master ~]# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--cgroup-driver=systemd --runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.2"
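After editing kubeadm-flags.env, restart kubelet so the new flags take effect:
systemctl daemon-reload
systemctl restart kubelet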