OS: CentOS 7.1.1503
Kernel: 3.10.0-229.el7.x86_64
Kubernetes: 1.7.0
Docker: 17.06.0-ce
Etcd: 3.1.9
Calico: 2.3
k8s-master: walker-1.novalocal (172.16.6.47)
k8s-node: walker-2.novalocal (172.16.6.249)
$ mkdir /etc/yum.repos.d/backup && mv /etc/yum.repos.d/*.repo /etc/yum.repos.d/backup
# Use the Aliyun mirrors
$ wget -P /etc/yum.repos.d/ http://mirrors.aliyun.com/repo/Centos-7.repo
$ wget -P /etc/yum.repos.d/ https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
$ yum clean all
$ yum repolist
Run the following on both the master and the node:
$ yum install kubeadm kubelet docker-ce -y
After kubelet is installed, compare the cgroup driver used by docker with the one configured for kubelet:
[root@walker-1 ~]# docker info | grep -i cgroup
Cgroup Driver: cgroupfs
[root@walker-1 ~]# grep -i cgroup /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=cgroupfs"
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_CADVISOR_ARGS $KUBELET_CGROUP_ARGS $KUBELET_EXTRA_ARGS
docker-ce 17.06 defaults to cgroupfs, while the kubelet package defaults to systemd. If the two drivers do not match, kubelet cannot drive docker, so the kubelet setting has to be changed.
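A minimal sketch of switching kubelet to docker's cgroupfs driver (assuming the stock 10-kubeadm.conf shipped with the RPM; the grep output above already reflects the corrected file):
$ sed -i 's/--cgroup-driver=systemd/--cgroup-driver=cgroupfs/' \
    /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
$ systemctl daemon-reload && systemctl restart kubelet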
Because of the GFW, these registries are hard to reach from inside China. The images can instead be located and pulled through an Aliyun mirror registry and retagged locally (see the sketch after the list below); the exact image versions can be found in each component's YAML manifest.
The required images are:
# k8s
gcr.io/google_containers/kube-proxy-amd64:v1.7.0
gcr.io/google_containers/kube-apiserver-amd64:v1.7.0
gcr.io/google_containers/kube-controller-manager-amd64:v1.7.0
gcr.io/google_containers/kube-scheduler-amd64:v1.7.0
gcr.io/google_containers/pause-amd64:3.0
# dns
gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.4
gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.4
gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4
# calico
quay.io/calico/node:v1.3.0
quay.io/calico/cni:v1.9.1
quay.io/calico/kube-policy-controller:v0.6.0
# dashboard
# heapster
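A sketch of pulling one image through a mirror and retagging it to the name kubeadm expects (the mirror namespace below is only a placeholder; substitute a repository that actually hosts these images, and repeat for each image in the list):
# MIRROR is a hypothetical mirror namespace, not a verified repository
$ MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers
$ docker pull ${MIRROR}/kube-proxy-amd64:v1.7.0
$ docker tag ${MIRROR}/kube-proxy-amd64:v1.7.0 gcr.io/google_containers/kube-proxy-amd64:v1.7.0
$ docker rmi ${MIRROR}/kube-proxy-amd64:v1.7.0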
Run the following on the master:
$ test -d /etc/kubernetes/manifests/ || mkdir -p /etc/kubernetes/manifests/
$ cd /etc/kubernetes/manifests
$ wget http://docs.projectcalico.org/v2.3/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml
$ curl -O -L https://github.com/projectcalico/calicoctl/releases/download/v1.3.0/calicoctl
$ chmod +x calicoctl && cp calicoctl /usr/bin -v
$ echo "export ETCD_ENDPOINTS=http://walker-1:2379" >> /etc/profile && source /etc/profile
To use an external etcd instead, delete the etcd DaemonSet and Service from calico.yaml and point the manifest at the external etcd's endpoint.
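A minimal sketch of rewriting the endpoint, assuming the manifest keeps it under an etcd_endpoints key in its calico-config ConfigMap (verify with the grep before editing):
$ grep -n etcd_endpoints calico.yaml
$ sed -i "s|etcd_endpoints:.*|etcd_endpoints: \"${ETCD_ENDPOINTS}\"|" calico.yaml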
Bring up the Calico network proactively during kubeadm initialization. If this step is missing, kube-dns cannot obtain an IP and fails to be created:
"message":"cannot join network of a non running container"
Run the following on the master:
[root@walker-1 k8s_imgs]# kubeadm init --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.7.1
[init] Using Authorization modes: [Node RBAC]
[preflight] Skipping pre-flight checks
[certificates] Using the existing CA certificate and key.
[certificates] Using the existing API Server certificate and key.
[certificates] Using the existing API Server kubelet client certificate and key.
[certificates] Using the existing service account token signing key.
[certificates] Using the existing front-proxy CA certificate and key.
[certificates] Using the existing front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/scheduler.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing up-to-date KubeConfig file: "/etc/kubernetes/controller-manager.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 434.505352 seconds
[token] Using token: f0945d.dbfb07f1d8952edf
[apiconfig] Created RBAC rules
[addons] Applied essential addon: kube-proxy
[addons] Applied essential addon: kube-dns
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token f0945d.dbfb07f1d8952edf 172.16.6.47:6443
For more kubeadm options, see:
https://kubernetes.io/docs/admin/kubeadm/
Check whether the pods have started with kubectl get po --namespace=kube-system:
[root@walker-1 kubernetes]# kubectl get pod --namespace=kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
calico-policy-controller-3271399580-hxp2d 1/1 Running 12 3d 172.16.6.47 walker-1.novalocal
kube-apiserver-walker-1.novalocal 1/1 Running 1 3h 172.16.6.47 walker-1.novalocal
kube-controller-manager-walker-1.novalocal 1/1 Running 0 3h 172.16.6.47 walker-1.novalocal
kube-dns-2425271678-2xwq4 3/3 Running 31 2d 192.168.187.206 walker-1.novalocal
kube-proxy-m7r85 1/1 Running 0 2h 172.16.6.47 walker-1.novalocal
kube-scheduler-walker-1.novalocal 1/1 Running 0 3h 172.16.6.47 walker-1.novalocal
Note: for kubectl to work, point it at the admin kubeconfig via an environment variable:
$ echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile && source /etc/profile
Note: to allow pods to be scheduled on the master, run:
$ kubectl taint nodes --all node-role.kubernetes.io/master-
First copy the kube-proxy, calico, and other images from the master to the node.
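One way to move them is docker save/load over scp (a sketch; adjust the image list to what the node actually needs):
# on the master: export the images the node will run
$ docker save -o /tmp/k8s_imgs.tar \
    gcr.io/google_containers/kube-proxy-amd64:v1.7.0 \
    gcr.io/google_containers/pause-amd64:3.0 \
    quay.io/calico/node:v1.3.0 \
    quay.io/calico/cni:v1.9.1
$ scp /tmp/k8s_imgs.tar walker-2:/tmp/
# on the node: import them
$ docker load -i /tmp/k8s_imgs.tar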
Run the following on the node:
[root@walker-2 k8s_imgs]# kubeadm join --token f0945d.dbfb07f1d8952edf 172.16.6.47:6443
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.06.0-ce. Max validated version: 1.12
[preflight] WARNING: hostname "" could not be reached
[preflight] WARNING: hostname "" lookup : no such host
[preflight] Some fatal errors occurred:
hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
/proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can skip pre-flight checks with `--skip-preflight-checks`
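Rather than skipping the checks, the bridge-nf-call-iptables error can also be fixed directly (a sketch; the empty-hostname warnings are a separate issue, solved by giving the node a proper hostname, e.g. with hostnamectl set-hostname):
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
# persist the setting across reboots
$ cat >> /etc/sysctl.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
EOF
$ sysctl -p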
[root@walker-2 k8s_imgs]# kubeadm join --token f0945d.dbfb07f1d8952edf 172.16.6.47:6443 --skip-preflight-checks
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Skipping pre-flight checks
[discovery] Trying to connect to API Server "172.16.6.47:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://172.16.6.47:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://172.16.6.47:6443"
[discovery] Successfully established connection with API Server "172.16.6.47:6443"
[bootstrap] Detected server version: v1.7.0
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
For kubectl to work on the node, copy /etc/kubernetes/admin.conf over from the master:
$ scp walker-1:/etc/kubernetes/admin.conf /etc/kubernetes
$ echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile && source /etc/profile
Once installation is complete, the pods can be inspected with kubectl get po:
[root@walker-2 walker]# kubectl get nodes
NAME STATUS AGE VERSION
walker-1.novalocal Ready 3d v1.7.0
walker-2.novalocal Ready 41m v1.7.1
[root@walker-2 walker]# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-2059996365-06z60 1/1 Running 2 2d 192.168.187.207 walker-1.novalocal
nginx-deployment-2059996365-lvdcp 1/1 Running 0 3m 192.168.135.64 walker-2.novalocal
To verify network connectivity, access a pod from the node with curl, e.g. curl 192.168.187.207.
After installation, /var/log/messages on the node is continuously flooded with:
eviction manager: no observation found for eviction signal allocatableNodeFs.available
The cause is unclear; there is a related issue on GitHub that is still being watched:
https://github.com/kubernetes/kubernetes/issues/48703
With docker 17.06.0-ce, a pod with a memory limit may fail to start because the kernel is too old. Related issues:
https://stackoverflow.com/questions/45056968/hyperledger-fabric-1-0-on-centos-error-endorsing-chaincode
https://github.com/moby/moby/issues/34046
https://github.com/docker/for-linux/issues/43
kube-dns sets a memory limit by default, and kernel 3.10.0-229.el7.x86_64 hits this problem; upgrading the kernel to 4.4.76-1.el7.elrepo.x86_64 via elrepo resolved it.
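A sketch of the elrepo kernel upgrade (the release RPM URL and the kernel version it installs may have changed since this was written):
$ rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
$ rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
$ yum --enablerepo=elrepo-kernel install kernel-lt -y
# make the newly installed kernel the default boot entry, then reboot
$ grub2-set-default 0
$ grub2-mkconfig -o /boot/grub2/grub.cfg
$ reboot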