Preparation: three clean CentOS 7.2 machines:
dev-learn-77 172.22.21.77
dev-learn-78 172.22.21.78
dev-learn-79 172.22.21.79
Things to watch out for throughout this article:
[root@dev-learn-77 ~]# systemctl disable firewalld
[root@dev-learn-77 ~]# systemctl stop firewalld
Upgrading the packages is optional:
[root@dev-learn-77 ~]# yum update
kubelet resolves nodes by hostname, so the hostnames need to be added to the hosts file:
[root@dev-learn-77 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.22.21.77 dev-learn-77
172.22.21.78 dev-learn-78
172.22.21.79 dev-learn-79
[root@dev-learn-77 ~]#
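The same three entries are needed on every machine. A minimal sketch for appending them without creating duplicates; the function name `add_host_entry` and the `HOSTS_FILE` variable are illustrative and not part of the original setup.

```shell
#!/bin/sh
# Append "IP hostname" to a hosts file only if the hostname is not already present.
# HOSTS_FILE is an illustrative variable; it defaults to /etc/hosts.
HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"

add_host_entry() {
    ip="$1"
    name="$2"
    # Look for the hostname as a whole field on any non-comment line.
    if ! awk -v n="$name" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

# Usage (as root, on every machine):
#   add_host_entry 172.22.21.77 dev-learn-77
#   add_host_entry 172.22.21.78 dev-learn-78
#   add_host_entry 172.22.21.79 dev-learn-79
```

Running it twice leaves the file unchanged, so it is safe to put in a provisioning script.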
If these are not set, both the initialization and joining the cluster will run into problems later.
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
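Writing the file does not change the running kernel by itself. A small sketch for checking whether the two switches are actually active; `check_bridge_sysctl` and the `PROC_SYS` override are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Report whether the two bridge netfilter switches are active.
# PROC_SYS is an illustrative override; it defaults to the real /proc/sys.
check_bridge_sysctl() {
    root="${PROC_SYS:-/proc/sys}"
    status=0
    for key in bridge-nf-call-iptables bridge-nf-call-ip6tables; do
        file="$root/net/bridge/$key"
        if [ -r "$file" ] && [ "$(cat "$file")" = "1" ]; then
            echo "ok: net.bridge.$key = 1"
        else
            echo "missing: net.bridge.$key is not 1"
            status=1
        fi
    done
    return "$status"
}

# Usage (as root), without waiting for a reboot:
#   modprobe br_netfilter   # the net.bridge.* keys only exist once this module is loaded
#   sysctl --system         # re-read every file under /etc/sysctl.d
#   check_bridge_sysctl
```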
[root@dev-learn-77 ~]# swapoff -a
[root@dev-learn-77 ~]# vi /etc/sysconfig/selinux
setenforce 0
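Both `swapoff -a` and `setenforce 0` only affect the running system, so after a reboot swap entries in /etc/fstab are activated again and the SELinux mode is read back from its config file. A hedged sketch of making both persistent; the function names are illustrative, and `permissive` is chosen here only because it is the persistent counterpart of `setenforce 0`.

```shell
#!/bin/sh
# Comment out active swap entries so swap stays off after a reboot.
# Takes the fstab path as an argument so it can be tried on a copy first.
disable_swap_in_fstab() {
    fstab="$1"
    # Prefix '#' to every non-comment line whose filesystem type field is "swap".
    sed -i.bak -E 's/^([^#][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab"
}

# Switch SELinux from enforcing to permissive in its config file.
set_selinux_permissive() {
    conf="$1"
    sed -i.bak 's/^SELINUX=enforcing$/SELINUX=permissive/' "$conf"
}

# Usage (as root):
#   disable_swap_in_fstab /etc/fstab
#   set_selinux_permissive /etc/selinux/config
```

On CentOS 7, /etc/sysconfig/selinux is a symlink to /etc/selinux/config, and `sed -i` on the symlink would replace it with a regular file, which is why the usage line targets the real file.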
Then reboot the machine.
The reboot is mandatory; otherwise docker will fail to start later.
SELinux: initialized (dev overlay, type overlay), uses xattr
yum install -y docker && systemctl enable docker && systemctl start docker
[root@dev-learn-77 ~]# docker version
Client:
Version: 1.13.1
API version: 1.26
Package version: docker-1.13.1-109.gitcccb291.el7.centos.x86_64
Go version: go1.10.3
Git commit: cccb291/1.13.1
Built: Tue Mar 3 17:21:24 2020
OS/Arch: linux/amd64
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-109.gitcccb291.el7.centos.x86_64
Go version: go1.10.3
Git commit: cccb291/1.13.1
Built: Tue Mar 3 17:21:24 2020
OS/Arch: linux/amd64
Experimental: false
[root@dev-learn-77 ~]#
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet kubeadm kubectl
systemctl enable kubelet && systemctl start kubelet
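Configuring the cluster with kubeadm requires images hosted by Google. If you have no way to reach them, see:
Downloading Google images when the network is unreachable
I have put the relevant image packages on this network drive:
Link: https://pan.baidu.com/s/1d9QuB8sm9qme73Rrm551Pw
Extraction code: rozu
This article assumes a proxy is available: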
[root@dev-learn-77 ~]# mkdir -p /etc/systemd/system/docker.service.d/
[root@dev-learn-77 ~]# touch /etc/systemd/system/docker.service.d/http-proxy.conf
Write the following configuration into http-proxy.conf:
[Service]
Environment="HTTP_PROXY=http://proxyhost:port" "NO_PROXY=localhost,127.0.0.1,172.22.0.0/16"
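The drop-in only takes effect if it sits inside /etc/systemd/system/docker.service.d/. A minimal sketch that creates the directory and writes the file in one step; `write_docker_proxy_dropin` is a hypothetical helper, and the extra `HTTPS_PROXY` entry is an assumption added here because image registries are normally reached over HTTPS.

```shell
#!/bin/sh
# Write a systemd drop-in that passes proxy settings to the docker service.
# Arguments: drop-in directory, proxy URL, comma-separated NO_PROXY list.
write_docker_proxy_dropin() {
    dir="$1"
    proxy="$2"
    no_proxy="$3"
    mkdir -p "$dir"
    cat > "$dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy" "HTTPS_PROXY=$proxy" "NO_PROXY=$no_proxy"
EOF
}

# Usage (as root); "proxyhost:port" is the same placeholder as above:
#   write_docker_proxy_dropin /etc/systemd/system/docker.service.d \
#       "http://proxyhost:port" "localhost,127.0.0.1,172.22.0.0/16"
#   systemctl daemon-reload && systemctl restart docker
```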
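Reload the unit configuration, restart docker so the running daemon picks up the change, and check that the proxy is in use:
[root@dev-learn-77 ~]# systemctl daemon-reload
[root@dev-learn-77 ~]# systemctl restart docker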
[root@dev-learn-77 ~]# systemctl show --property=Environment docker
Environment=GOTRACEBACK=crash DOCKER_HTTP_HOST_COMPAT=1 PATH=/usr/libexec/docker:/usr/bin:/usr/sbin HTTP_PROXY=http://xxx:xxx NO_PROXY=localhost,127.0.0.1,172.22.0.0/16
[root@dev-learn-77 ~]#
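To use a CNI network you must pass --pod-network-cidr. The value can be left as shown, but if you change it, the flannel manifest installed later must be changed to the same value. This step takes a long time, most of it spent downloading images.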
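If a different CIDR is used, the matching value in flannel lives in the `"Network"` field of `net-conf.json` inside kube-flannel.yml. A sketch of patching a locally downloaded copy; `set_flannel_network` is an illustrative name, and the 10.100.0.0/16 value in the usage line is only an example.

```shell
#!/bin/sh
# Rewrite the "Network" value inside net-conf.json of a local kube-flannel.yml
# so that it matches the CIDR passed to kubeadm init. A .bak copy is kept.
set_flannel_network() {
    manifest="$1"
    cidr="$2"
    sed -i.bak -E "s#(\"Network\":[[:space:]]*)\"[^\"]+\"#\1\"${cidr}\"#" "$manifest"
}

# Usage:
#   set_flannel_network kube-flannel.yml 10.100.0.0/16
#   kubectl apply -f kube-flannel.yml
```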
You can pull all the images in advance with the following commands:
kubeadm config images list
kubeadm config images pull
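Without a proxy, one common workaround is to pull each image from a mirror and retag it under its original name. The sketch below only prints the docker commands so they can be reviewed first; the mirror repository in `MIRROR` is an assumption, not something this article has verified, and `mirror_pull_commands` is a hypothetical helper.

```shell
#!/bin/sh
# Read image names (one per line, as printed by "kubeadm config images list")
# and print the docker commands that fetch each one from a mirror and
# retag it under its original name.
# MIRROR is an assumed mirror repository; replace it with one you trust.
MIRROR="${MIRROR:-registry.aliyuncs.com/google_containers}"

mirror_pull_commands() {
    while IFS= read -r image; do
        [ -n "$image" ] || continue
        name="${image##*/}"            # e.g. kube-proxy:v1.18.1
        echo "docker pull $MIRROR/$name"
        echo "docker tag $MIRROR/$name $image"
    done
}

# Usage:
#   kubeadm config images list --kubernetes-version=v1.18.1 | mirror_pull_commands
#   kubeadm config images list --kubernetes-version=v1.18.1 | mirror_pull_commands | sh
```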
If init fails, run kubeadm reset and then run init again.
[root@dev-learn-77 ~]# kubeadm init --kubernetes-version=v1.18.1 --pod-network-cidr=10.244.0.0/16
W0410 20:24:57.172766 14564 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.1
[preflight] Running pre-flight checks
[WARNING Hostname]: hostname "dev-learn-79" could not be reached
[WARNING Hostname]: hostname "dev-learn-79": lookup dev-learn-79 on 114.114.114.114:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [dev-learn-79 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.22.21.79]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [dev-learn-79 localhost] and IPs [172.22.21.79 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [dev-learn-79 localhost] and IPs [172.22.21.79 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0410 20:27:37.354321 14564 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0410 20:27:37.356960 14564 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
[apiclient] All control plane components are healthy after 79.502208 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node dev-learn-79 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node dev-learn-79 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: snmg4v.y4snmy1sy6ed4lbe
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.22.21.77:6443 --token nzscgu.udr2losypa26fikw --discovery-token-ca-cert-hash sha256:294895101b396085708cd162351a0b58fc3ec8a0311ea11a166bd95c2cd25796
[root@dev-learn-77 ~]#
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
After running these, wait a moment and the node shows up as healthy:
[root@dev-learn-77 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
dev-learn-77 Ready master 48m v1.18.1
If the network plugin is not installed, kubelet reports errors like these:
4月 10 20:47:31 dev-learn-77 kubelet[16068]: W0410 20:47:31.520049 16068 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
4月 10 20:47:33 dev-learn-77 kubelet[16068]: E0410 20:47:33.265069 16068 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
githubusercontent.com can only be reached through a proxy:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
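Link: https://pan.baidu.com/s/1tC9lgMZR9IjRiH5mZu-BzA
Extraction code: uke7
Pay close attention to versions: v1.18.1 was the latest k8s release at the time of installation, so the yml on master works; for any other k8s version, find the yml that matches it.
However, looking at the pods in kube-system shows that coredns stays in the ContainerCreating state: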
[root@dev-learn-77 ~]# kubectl get pod -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-66bff467f8-lfw7d 0/1 ContainerCreating 0 153m 10.244.0.2 dev-learn-77 <none> <none>
coredns-66bff467f8-rh2jg 0/1 ContainerCreating 0 153m 10.244.0.4 dev-learn-77 <none> <none>
Inspecting the pod shows the following:
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "271b849b2e775b7d6ded57cc4654cc44684f8e9e4f27997673554d3cbeb88c75" network for pod "coredns-66bff467f8-gwvfg": networkPlugin cni failed to set up pod "coredns-66bff467f8-gwvfg_kube-system" network: stat /var/lib/calico/nodename: no such file or directory: check that the calico/node container is running and has mounted /var/lib/calico/
docker ps -a and kubectl get pod --all-namespaces show that there is no calico container or pod.
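One possible explanation for a calico error on a cluster that only has flannel installed, offered as an assumption rather than a diagnosis of this cluster, is a calico config file in /etc/cni/net.d that sorts ahead of flannel's: the kubelet uses the first valid config in lexicographic order. A small sketch for listing the candidates in that order; `list_cni_configs` is an illustrative name.

```shell
#!/bin/sh
# List the CNI config files in the order the kubelet considers them
# (lexicographic by file name); the first valid one is the one it uses.
list_cni_configs() {
    dir="${1:-/etc/cni/net.d}"
    find "$dir" -maxdepth 1 -type f \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) | LC_ALL=C sort
}

# Usage (on the node that shows the error):
#   list_cni_configs
```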
At this point calico needs to be installed.
Link to calico.yaml:
Link: https://pan.baidu.com/s/10Ow4Yk3WI-ATMFggTE-w_Q
Extraction code: smcu
[root@dev-learn-77 ~]# kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
[root@dev-learn-77 ~]# kubectl get pod -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-555fc8cc5c-vwds7 1/1 Running 0 21s 10.244.218.64 dev-learn-77 <none> <none>
calico-node-g2npg 1/1 Running 0 21s 172.22.21.77 dev-learn-77 <none> <none>
coredns-66bff467f8-lfw7d 1/1 Running 0 153m 10.244.0.2 dev-learn-77 <none> <none>
coredns-66bff467f8-rh2jg 1/1 Running 0 153m 10.244.0.4 dev-learn-77 <none> <none>
etcd-dev-learn-77 1/1 Running 1 4h45m 172.22.21.77 dev-learn-77 <none> <none>
kube-apiserver-dev-learn-77 1/1 Running 1 4h45m 172.22.21.77 dev-learn-77 <none> <none>
kube-controller-manager-dev-learn-77 1/1 Running 1 4h45m 172.22.21.77 dev-learn-77 <none> <none>
kube-flannel-ds-amd64-h7tzd 1/1 Running 2 159m 172.22.21.77 dev-learn-77 <none> <none>
kube-proxy-swv6q 1/1 Running 1 4h45m 172.22.21.77 dev-learn-77 <none> <none>
kube-scheduler-dev-learn-77 1/1 Running 2 4h45m 172.22.21.77 dev-learn-77 <none> <none>
Run the following on dev-learn-78 and dev-learn-79 respectively:
[root@dev-learn-78 ~]# kubeadm join 172.22.21.77:6443 --token nzscgu.udr2losypa26fikw --discovery-token-ca-cert-hash sha256:294895101b396085708cd162351a0b58fc3ec8a0311ea11a166bd95c2cd25796
[root@dev-learn-79 ~]# kubeadm join 172.22.21.77:6443 --token nzscgu.udr2losypa26fikw --discovery-token-ca-cert-hash sha256:294895101b396085708cd162351a0b58fc3ec8a0311ea11a166bd95c2cd25796
The values of --token and --discovery-token-ca-cert-hash come from the output of the initialization on dev-learn-77.
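Bootstrap tokens expire (24 hours by default), so the values from the init output may no longer work when a node is added later. A sketch of recomputing the CA hash on the master; `ca_cert_hash` is an illustrative wrapper around the usual openssl pipeline and assumes an RSA CA key, which is what kubeadm generates by default.

```shell
#!/bin/sh
# Print the sha256 hash of the CA public key, i.e. the value that follows
# "sha256:" in --discovery-token-ca-cert-hash.
ca_cert_hash() {
    cert="${1:-/etc/kubernetes/pki/ca.crt}"
    openssl x509 -pubkey -noout -in "$cert" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}

# Usage (on dev-learn-77):
#   ca_cert_hash                                # recompute the hash
#   kubeadm token create --print-join-command   # or let kubeadm print a fresh join command
```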
[root@dev-learn-77 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
dev-learn-77 Ready master 4h48m v1.18.1
dev-learn-78 Ready <none> 19m v1.18.1
dev-learn-79 Ready <none> 37m v1.18.1
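When scripting the join, it can help to check the node list automatically instead of reading it by eye. A minimal sketch; `not_ready_nodes` is a hypothetical helper that filters the table printed above.

```shell
#!/bin/sh
# Read "kubectl get nodes" output on stdin and print the NAME of every node
# whose STATUS column is not exactly "Ready". The header row is skipped.
# Note that a cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage:
#   kubectl get nodes | not_ready_nodes
```

An empty result means every node reports Ready.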
[root@dev-learn-77 ~]# kubectl get pod -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-555fc8cc5c-vwds7 1/1 Running 0 24m 10.244.218.64 dev-learn-79 <none> <none>
calico-node-g2npg 1/1 Running 0 24m 172.22.21.77 dev-learn-77 <none> <none>
calico-node-lwdc2 1/1 Running 0 24m 172.22.21.79 dev-learn-79 <none> <none>
calico-node-qhzcl 1/1 Running 0 24m 172.22.21.78 dev-learn-78 <none> <none>
coredns-66bff467f8-lfw7d 1/1 Running 0 177m 10.244.0.2 dev-learn-77 <none> <none>
coredns-66bff467f8-rh2jg 1/1 Running 0 177m 10.244.0.4 dev-learn-77 <none> <none>
etcd-dev-learn-77 1/1 Running 1 5h9m 172.22.21.77 dev-learn-77 <none> <none>
kube-apiserver-dev-learn-77 1/1 Running 1 5h9m 172.22.21.77 dev-learn-77 <none> <none>
kube-controller-manager-dev-learn-77 1/1 Running 1 5h9m 172.22.21.77 dev-learn-77 <none> <none>
kube-flannel-ds-amd64-66lmw 1/1 Running 0 59m 172.22.21.79 dev-learn-79 <none> <none>
kube-flannel-ds-amd64-h7tzd 1/1 Running 2 3h3m 172.22.21.77 dev-learn-77 <none> <none>
kube-flannel-ds-amd64-sm7lq 1/1 Running 0 40m 172.22.21.78 dev-learn-78 <none> <none>
kube-proxy-8gp76 1/1 Running 0 40m 172.22.21.78 dev-learn-78 <none> <none>
kube-proxy-swv6q 1/1 Running 1 5h9m 172.22.21.77 dev-learn-77 <none> <none>
kube-proxy-vnn2v 1/1 Running 0 59m 172.22.21.79 dev-learn-79 <none> <none>
kube-scheduler-dev-learn-77 1/1 Running 2 5h9m 172.22.21.77 dev-learn-77 <none> <none>
Occasionally one of the three calico-node pods stays at 0/1 Running, and describe pod shows:
Readiness probe failed: calico/node is not ready: BIRD is not ready: BGP not established with 10.244.0.1,10.244.2.1
2020-04-13 06:29:59.582 [INFO][682] health.go 156: Number of node(s) with BGP peering established = 0
Fix:
Edit the calico.yaml downloaded from the official site
and add two lines:
- name: IP_AUTODETECTION_METHOD
value: "interface=eno16780032"
The value must point to the actual NIC name shown by ip a. The result looks like this:
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
- name: IP_AUTODETECTION_METHOD
value: "interface=eno16780032"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "Always"
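The interface name differs from machine to machine, so it is worth confirming which NIC carries the node address before hard-coding it. A small sketch that extracts it from `ip -o -4 addr show`; `iface_for_ip` is an illustrative name.

```shell
#!/bin/sh
# Read "ip -o -4 addr show" output on stdin and print the name of the
# interface that carries the given IPv4 address.
iface_for_ip() {
    awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) print $2 }'
}

# Usage (on each node, with that node's own address):
#   ip -o -4 addr show | iface_for_ip 172.22.21.77
```

If the nodes do not share one interface name, a single fixed `interface=` value will not match on all of them.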
[root@dev-learn-77 ~]# kubectl delete -f https://docs.projectcalico.org/manifests/calico.yaml
[root@dev-learn-77 ~]# kubectl apply -f calico.yaml
After a short wait everything is back to normal.
[root@dev-learn-77 calico]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.18.1 4e68534e24f6 4 days ago 117 MB
k8s.gcr.io/kube-controller-manager v1.18.1 d1ccdd18e6ed 4 days ago 162 MB
k8s.gcr.io/kube-apiserver v1.18.1 a595af0107f9 4 days ago 173 MB
k8s.gcr.io/kube-scheduler v1.18.1 6c9320041a7b 4 days ago 95.3 MB
docker.io/calico/pod2daemon-flexvol v3.13.2 77a3fabf99a5 13 days ago 111 MB
docker.io/calico/cni v3.13.2 a89faaa1676a 13 days ago 224 MB
docker.io/calico/kube-controllers v3.13.2 0c3bf0adad2b 13 days ago 56.6 MB
docker.io/calico/node v3.13.2 6f674c890b23 13 days ago 260 MB
docker.io/calico/cni v3.8.8-1 ca2a236d9210 13 days ago 161 MB
docker.io/calico/pod2daemon-flexvol v3.8.8 cacd6d732f12 3 weeks ago 9.38 MB
quay.io/coreos/flannel v0.12.0-amd64 4e9f801d2217 4 weeks ago 52.8 MB
k8s.gcr.io/pause 3.2 80d28bedfe5d 8 weeks ago 683 kB
k8s.gcr.io/coredns 1.6.7 67da37a9a360 2 months ago 43.8 MB
k8s.gcr.io/etcd 3.4.3-0 303ce5db0e90 5 months ago 288 MB
quay.io/coreos/flannel v0.11.0 ff281650a721 14 months ago 52.6 MB
[root@dev-learn-77 calico]#