I am new to blogging and have tried to write this up in detail and with care; please forgive any omissions, and feel free to follow the WeChat public account cloud_fans to discuss.
Add the following entries to the /etc/hosts file on both the master node and the client node:
192.168.1.10 master master
192.168.1.11 client client
Replace the two IP addresses with the 'host only' network addresses we assigned to the two VMs in the previous post.
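One way to append these entries without opening an editor (a minimal sketch, assuming a bash shell; run it on both nodes, substituting your own addresses):
cat <<'EOF' >> /etc/hosts
192.168.1.10 master master
192.168.1.11 client client
EOF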
After the change:
[root@master ~]# cat /etc/hosts
127.0.0.1 master localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.10 master master
192.168.1.11 client client
[root@client ~]# cat /etc/hosts
127.0.0.1 client localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.10 master master
192.168.1.11 client client
To avoid unnecessary trouble, we turn the firewall off.
Stop and disable firewalld on both the master node and the client node.
[root@master ~]# systemctl stop firewalld.service
[root@master ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@client ~]# systemctl stop firewalld.service
[root@client ~]# systemctl disable firewalld.service
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
If SELinux is currently enabled, temporarily switch it to permissive mode:
[root@master ~]# setenforce 0
[root@client ~]# setenforce 0
In addition, change the SELINUX value in /etc/sysconfig/selinux to disabled so that SELinux stays disabled permanently (a sed sketch for making this edit follows the two file listings below). After the change, the file on each node looks like this:
[root@master ~]# cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
[root@client ~]# cat /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
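One way to make the SELINUX=disabled edit shown above without opening an editor (a sketch using sed; run it on both nodes):
# --follow-symlinks matters because /etc/sysconfig/selinux is typically a symlink to /etc/selinux/config
sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux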
With the preparation done, we can now start deploying the Kubernetes cluster. Kubernetes supports several container runtimes; this article uses the most popular one, Docker.
[root@master ~]# wget https://download.docker.com/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
[root@master ~]# yum install docker-ce -y
[root@client ~]# wget https://download.docker.com/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
[root@client ~]# yum install docker-ce -y
Note:
If wget is not installed on the system, install it with the following command:
yum install wget -y
After installation, we need to change the iptables FORWARD policy to ACCEPT. To do so, edit the file /usr/lib/systemd/system/docker.service and add the following line right after "ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock" (this is needed on both the master node and the client node):
ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
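If you prefer to make this edit from the command line, here is a sketch using sed (assuming the ExecStart line matches the one quoted above; run it on both nodes):
# Insert the ExecStartPost line right after the dockerd ExecStart line in the unit file
sed -i '/^ExecStart=\/usr\/bin\/dockerd/a ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT' /usr/lib/systemd/system/docker.service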
Then we reload systemd, start the Docker service, and enable it to start at boot:
[root@master ~]# systemctl daemon-reload
[root@master ~]# systemctl start docker.service
[root@master ~]# systemctl enable docker.service
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
[root@client ~]# systemctl daemon-reload
[root@client ~]# systemctl start docker.service
[root@client ~]# systemctl enable docker.service
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
First, on both the master node and the client node, add the yum repository used to install kubelet, kubeadm, kubectl, and related components. Create the file /etc/yum.repos.d/kubernetes.repo on each node with the following contents:
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
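One way to create this file in a single step on each node (a sketch using a bash here-document with exactly the contents above):
cat <<'EOF' > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF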
Once the repository is added, install kubelet, kubeadm, and kubectl:
[root@master ~]# yum install kubelet kubeadm kubectl -y
[root@client ~]# yum install kubelet kubeadm kubectl -y
Since version 1.8, Kubernetes requires swap to be disabled on the system and will not start otherwise. Because our VMs have limited memory and we need to keep the swap partition, we change the kubelet parameters to ignore this restriction. On both the master node and the client node, edit the file /etc/sysconfig/kubelet so that it reads:
KUBELET_EXTRA_ARGS="--fail-swap-on=false"
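A quick way to write this setting (a sketch, assuming the file contains no other settings you want to keep; run it on both nodes):
echo 'KUBELET_EXTRA_ARGS="--fail-swap-on=false"' > /etc/sysconfig/kubelet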
After the change, enable the kubelet service to start at boot.
[root@master ~]# systemctl enable kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
[root@client ~]# systemctl enable kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
Normally we would go straight to the cluster initialization in section 2.2.4, but because of the Great Firewall we first need to download the Docker images manually before running that initialization.
First, check on the master node which image versions the cluster initialization needs:
[root@master ~]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.17.4
k8s.gcr.io/kube-controller-manager:v1.17.4
k8s.gcr.io/kube-scheduler:v1.17.4
k8s.gcr.io/kube-proxy:v1.17.4
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.5
Then we replace "k8s.gcr.io" with the domestic Aliyun mirror "registry.cn-hangzhou.aliyuncs.com/google_containers" and run the following commands on the master node to pull the images manually:
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.4
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.17.4
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.17.4
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.17.4
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
[root@master ~]# docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.5
After the downloads finish, retag the images: the retag gives the images pulled from the Aliyun mirror back their original "k8s.gcr.io" names so that kubeadm can use them during initialization. The first argument after docker tag is the image name pulled from the Aliyun mirror, and the second is the image name reported by "kubeadm config images list":
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.17.4 k8s.gcr.io/kube-apiserver:v1.17.4
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.17.4 k8s.gcr.io/kube-controller-manager:v1.17.4
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.17.4 k8s.gcr.io/kube-scheduler:v1.17.4
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.17.4 k8s.gcr.io/kube-proxy:v1.17.4
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.1 k8s.gcr.io/pause:3.1
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0 k8s.gcr.io/etcd:3.4.3-0
[root@master ~]# docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.5 k8s.gcr.io/coredns:1.6.5
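The pull and tag commands above can also be wrapped in a short shell loop (a sketch covering the same image list and versions; the MIRROR variable is just a local shorthand):
MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers
for img in kube-apiserver:v1.17.4 kube-controller-manager:v1.17.4 kube-scheduler:v1.17.4 \
           kube-proxy:v1.17.4 pause:3.1 etcd:3.4.3-0 coredns:1.6.5; do
    docker pull ${MIRROR}/${img}                     # pull from the Aliyun mirror
    docker tag ${MIRROR}/${img} k8s.gcr.io/${img}    # retag to the name kubeadm expects
done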
Run the following command on the master node:
[root@master ~]# kubeadm init --kubernetes-version=v1.17.4 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --apiserver-advertise-address=0.0.0.0 --ignore-preflight-errors=Swap
Note:
If the run fails with "/proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1", run the following on the master node and the client node respectively:
[root@master ~]# echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
[root@client ~]# echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
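Note that the echo above only lasts until the next reboot. To make the setting persistent, one option (a sketch; the file name is arbitrary and the bridge sysctls require the br_netfilter kernel module) is a sysctl drop-in on both nodes:
modprobe br_netfilter
cat <<'EOF' > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system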
If everything goes well, the master node produces the following output once "kubeadm init" completes:
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.10:6443 --token z66onn.645t3tg2rdpa0iib \
--discovery-token-ca-cert-hash sha256:b201b3343a2253b4eafb529a79fccc8d70a4c7a440041167eda045c378284f13
Set up the kubeconfig file:
[root@master ~]# mkdir -p $HOME/.kube
[root@master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
Check the status of the cluster components:
[root@master ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
If every status is Healthy, the cluster initialization in 2.2.4 succeeded; otherwise the initialization needs to be redone.
Check the node status:
[root@master ~]# kubectl get node
NAME     STATUS     ROLES    AGE     VERSION
master   NotReady   master   9m25s   v1.17.4
The node status shows NotReady; don't worry, this is normal because we have not installed a network plugin yet. Once it is installed, the node will become Ready.
Download the flannel manifest from: https://github.com/coreos/flannel/blob/master/Documentation/kube-flannel.yml
Then, on the master node, create a file named flannel_config.yaml and copy the manifest contents into it.
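If the master node can reach GitHub directly, the manifest can also be fetched in one step (a sketch, assuming GitHub's usual raw-file URL layout for the page linked above):
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml -O flannel_config.yaml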
Run the following command on the master node:
[root@master ~]# kubectl apply -f flannel_config.yaml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
Once this completes, check the status of the flannel pod on the master node; when it reaches Running, flannel is installed.
[root@master ~]# kubectl get pods -A
NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
kube-system   coredns-6955765f44-2cc4z         1/1     Running   0          13h
kube-system   coredns-6955765f44-f4b55         1/1     Running   0          13h
kube-system   etcd-master                      1/1     Running   1          13h
kube-system   kube-apiserver-master            1/1     Running   1          13h
kube-system   kube-controller-manager-master   1/1     Running   1          13h
kube-system   kube-flannel-ds-amd64-pbdg9      1/1     Running   0          5m22s
kube-system   kube-proxy-vk8tr                 1/1     Running   1          13h
kube-system   kube-scheduler-master            1/1     Running   1          13h
Check the node status again; it has changed to Ready:
[root@master ~]# kubectl get node
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   13h   v1.17.4
On the client node, enter the kubeadm join command printed by kubeadm init in section 2.2.4 (cluster initialization):
kubeadm join 192.168.1.10:6443 --token z66onn.645t3tg2rdpa0iib --discovery-token-ca-cert-hash sha256:b201b3343a2253b4eafb529a79fccc8d70a4c7a440041167eda045c378284f13
and append "--ignore-preflight-errors=Swap" to it:
[root@client ~]# kubeadm join 192.168.1.10:6443 --token z66onn.645t3tg2rdpa0iib --discovery-token-ca-cert-hash sha256:b201b3343a2253b4eafb529a79fccc8d70a4c7a440041167eda045c378284f13 --ignore-preflight-errors=Swap
After it finishes, checking the nodes on the client node shows:
[root@client ~]# kubectl get nodes
W0323 04:13:48.193623 4259 loader.go:223] Config not found: /etc/kubernetes/admin.conf
The connection to the server localhost:8080 was refused - did you specify the right host or port?
To fix this, run the following on the master node:
[root@master ~]# scp /etc/kubernetes/admin.conf root@client:/etc/kubernetes/
Then run the following on the client node:
[root@client ~]# echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
[root@client ~]# source ~/.bash_profile
Then check the node and pod status on the client node again:
[root@client ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE   VERSION
client   NotReady            34s   v1.17.4
master   Ready      master   18h   v1.17.4
[root@client ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS              RESTARTS   AGE
coredns-6955765f44-2cc4z         1/1     Running             0          18h
coredns-6955765f44-f4b55         1/1     Running             0          18h
etcd-master                      1/1     Running             1          18h
kube-apiserver-master            1/1     Running             1          18h
kube-controller-manager-master   1/1     Running             1          18h
kube-flannel-ds-amd64-pbdg9      1/1     Running             0          5h50m
kube-flannel-ds-amd64-pvhbr      0/1     Init:0/1            0          5m40s
kube-proxy-dgk4b                 0/1     ContainerCreating   0          5m40s
kube-proxy-vk8tr                 1/1     Running             1          18h
kube-scheduler-master            1/1     Running             1          18h
The client node stays NotReady and its pods fail to come up; describing the stuck kube-proxy pod shows:
[root@master ~]# kubectl describe pod kube-proxy-dzz66 -n kube-system
Name: kube-proxy-dzz66
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: client/192.168.1.11
Start Time: Mon, 23 Mar 2020 10:29:08 -0400
Labels: controller-revision-hash=949fbc8
k8s-app=kube-proxy
pod-template-generation=1
Annotations:
Status: Pending
IP: 192.168.1.11
IPs:
IP: 192.168.1.11
Controlled By: DaemonSet/kube-proxy
Containers:
kube-proxy:
Container ID:
Image: k8s.gcr.io/kube-proxy:v1.17.4
Image ID:
Port:
Host Port:
Command:
/usr/local/bin/kube-proxy
--config=/var/lib/kube-proxy/config.conf
--hostname-override=$(NODE_NAME)
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
NODE_NAME: (v1:spec.nodeName)
Mounts:
/lib/modules from lib-modules (ro)
/run/xtables.lock from xtables-lock (rw)
/var/lib/kube-proxy from kube-proxy (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-proxy-token-cnrll (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-proxy:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-proxy
Optional: false
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
kube-proxy-token-cnrll:
Type: Secret (a volume populated by a Secret)
SecretName: kube-proxy-token-cnrll
Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations:
CriticalAddonsOnly
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/network-unavailable:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m40s default-scheduler Successfully assigned kube-system/kube-proxy-dzz66 to client
Warning FailedCreatePodSandBox 47s (x5 over 4m4s) kubelet, client Failed to create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
This shows that pulling the Docker images on the client node fails, so I simply re-ran the docker pull and docker tag commands from the master node on the client node as well; see section 2.2.3 for the steps.
Check the pod and node status on the client node again:
[root@client ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS    RESTARTS   AGE
coredns-6955765f44-kzzdq         1/1     Running   0          25m
coredns-6955765f44-zd76j         1/1     Running   0          25m
etcd-master                      1/1     Running   0          25m
kube-apiserver-master            1/1     Running   0          25m
kube-controller-manager-master   1/1     Running   0          25m
kube-flannel-ds-amd64-49h6h      1/1     Running   0          21m
kube-flannel-ds-amd64-g6tr7      1/1     Running   0          21m
kube-proxy-c42pz                 1/1     Running   0          25m
kube-proxy-dzz66                 1/1     Running   0          24m
kube-scheduler-master            1/1     Running   0          25m
[root@client ~]# kubectl get nodes
NAME     STATUS   ROLES    AGE   VERSION
client   Ready             24m   v1.17.4
master   Ready    master   25m   v1.17.4
At this point, the cluster is up and everything has succeeded.
Note:
If the image download fails while installing the flannel plugin, you can pull from the following repository and retag:
docker pull quay-mirror.qiniu.com/coreos/flannel:v0.12.0-amd64
docker tag quay-mirror.qiniu.com/coreos/flannel:v0.12.0-amd64 quay.io/coreos/flannel:v0.12.0-amd64