Installing a Kubernetes 1.14.1 cluster with kubeadm on Alibaba Cloud CentOS 7 (with workarounds for the Great Firewall and other pitfalls)

(I) Commands to run on all nodes (master and worker nodes)

1. Disable swap; otherwise Kubernetes will not start properly

swapoff -a
Run free -h to verify: swap is off when the Swap row shows 0.
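swapoff -a only lasts until the next reboot. A minimal sketch for making it permanent, assuming swap is configured through an fstab-style file, is to comment out the swap entries. The function name disable_swap_in_fstab is a hypothetical helper, not part of any tool; it takes the file path as a parameter so it can be tried on a copy first.

```shell
# Hypothetical helper: comment out every active swap entry in an fstab-style file.
# A backup of the original file is written next to it with a .bak suffix.
disable_swap_in_fstab() {
    fstab="${1:-/etc/fstab}"
    # Match lines that are not already comments and contain the word
    # "swap" as a whitespace-delimited field, then prefix them with '#'.
    sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

To apply it to the real file, run the function as root: disable_swap_in_fstab /etc/fstab. Lines that are already commented out are left alone, so running it twice is harmless.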

2. Update the system:

sudo yum update -y

3. Install Docker

sudo yum install -y docker

Check the Docker version

sudo docker version

Enable Docker at boot and start it now

sudo systemctl enable docker && sudo systemctl start docker

I installed version 1.13.1; install the same version to stay consistent with this guide.

4. Install the Kubernetes packages

(1) First add the Alibaba Cloud yum repository (run as root)

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

(2) Turn off SELinux enforcement (switch to permissive mode)

sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

(3) Install Kubernetes

sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

(4) Enable and start the kubelet

sudo systemctl enable kubelet && sudo systemctl start kubelet

(II) Commands to run on the master node

(1) Open ports 6443 and 10250 in the firewall

sudo firewall-cmd --permanent --add-port=6443/tcp && sudo firewall-cmd --permanent --add-port=10250/tcp && sudo firewall-cmd --reload

(2) iptables settings

sudo bash -c 'cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF'

Apply the settings:

sudo sysctl --system

Check that the br_netfilter module is loaded:

sudo lsmod | grep br_netfilter

If the command prints a line, the module is loaded.

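(3) Kubernetes configuration

1. List the images kubeadm depends on

sudo kubeadm config images list

The images are hosted on k8s.gcr.io, which is blocked by the Great Firewall, yet the installation cannot finish without them. The usual command for pulling them is

sudo kubeadm config images pull

It fails for the same reason, so we fetch the images another way, using the following three commands.

Download the required images: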

kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#docker.io/mirrorgooglecontainers#g' |sh -x 

Retag the images with their k8s.gcr.io names

docker images |grep mirrorgooglecontainers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#docker.io/mirrorgooglecontainers#k8s.gcr.io#2' |sh -x 

Remove the mirrorgooglecontainers tags

docker images |grep mirrorgooglecontainers |awk '{print "docker rmi ", $1":"$2}' |sh -x 

After this there is no need to run sudo kubeadm config images pull.
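The three pipelines above all apply the same name mapping, so they can be folded into one pass. The sketch below only prints the docker commands rather than running them, which makes it easy to review the list first. The function name mirror_commands and the MIRROR variable are hypothetical; the mirror path is the one used in this article, and whether it still hosts every image is an assumption.

```shell
# Mirror repository used in this article (assumed to host the needed images).
MIRROR="docker.io/mirrorgooglecontainers"

# Hypothetical helper: read k8s.gcr.io image names on stdin and print the
# pull / tag / rmi commands that fetch each one through the mirror.
mirror_commands() {
    while IFS= read -r image; do
        # Skip empty lines.
        [ -n "$image" ] || continue
        # Swap the registry prefix for the mirror prefix.
        mirrored=$(printf '%s\n' "$image" | sed "s#^k8s.gcr.io/#${MIRROR}/#")
        echo "docker pull $mirrored"
        echo "docker tag $mirrored $image"
        echo "docker rmi $mirrored"
    done
}
```

Usage: kubeadm config images list | mirror_commands prints the commands; append | sh -x to execute them. The coredns image is handled separately in the next step, because the pull from this mirror fails for it.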

 

2. Initialize Kubernetes

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --image-repository index.docker.io/mirrorgooglecontainers

The --image-repository flag works around the firewall when kubeadm pulls the remaining components.

With this flag, kubeadm init fails with:

 failed to pull image index.docker.io/mirrorgooglecontainers/coredns:1.3.1: output: Trying to pull repository docker.io/mirrorgooglecontainers/coredns ...

Pull that image manually first, then run kubeadm init again:

docker pull coredns/coredns:1.3.1
docker tag coredns/coredns:1.3.1 index.docker.io/mirrorgooglecontainers/coredns:1.3.1

 

 

If the commands above produce:

[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...

swap is still enabled; turn it off with swapoff -a.

If you see:

[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...

the iptables settings are wrong; repeat the iptables step above.
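Both preflight failures can be checked before running kubeadm init. The following is a minimal sketch with two hypothetical helper functions, swap_is_off and bridge_iptables_enabled. Each reads a file whose default path is the real kernel interface, and accepts another path as an argument so the logic can be tried on sample files.

```shell
# Hypothetical helper: succeed when the SwapTotal line of a meminfo-style
# file reports 0 kB (or when the file has no SwapTotal line at all).
swap_is_off() {
    meminfo="${1:-/proc/meminfo}"
    total=$(awk '/^SwapTotal:/ {print $2}' "$meminfo")
    [ "${total:-0}" -eq 0 ]
}

# Hypothetical helper: succeed when bridge-nf-call-iptables is set to 1.
bridge_iptables_enabled() {
    flag="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"
    [ -r "$flag" ] && [ "$(cat "$flag")" = "1" ]
}
```

For example, swap_is_off && bridge_iptables_enabled && echo "preflight looks fine" prints the message only when both conditions hold on the current machine.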

 

On success, kubeadm prints the following:

[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master-node kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.120]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master-node localhost] and IPs [192.168.0.120 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master-node localhost] and IPs [192.168.0.120 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.501860 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8s-master-node as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master-node as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 3j2pkk.xk7tnltycyz2xh5n
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.120:6443 --token khm95w.mo0wwenu2o9hglls \
    --discovery-token-ca-cert-hash sha256:aeb0ca593b63c8d674719858fd2397825825cebc552e3c165f00edb9671d6e32

Following the instructions in the output, run on the master:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Watch the status of the pods:

 watch kubectl get pods --all-namespaces

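If every pod is in the Running state, the cluster is healthy. During this step I found that coredns stayed in the Pending state, which is expected until a pod network is installed.

In that case run the following commands, which install the Weave Net pod network: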

export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

After a short wait the pods switch to Running.
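Instead of watching the table by eye, the pods that are not yet Running can be filtered out of the kubectl output. This is a small sketch; the function name pods_not_running is hypothetical, and it assumes the default column layout of kubectl get pods --all-namespaces (NAMESPACE, NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: read the table printed by
# "kubectl get pods --all-namespaces" on stdin and print
# "namespace/name status" for every pod whose STATUS is not Running.
pods_not_running() {
    # NR > 1 skips the header row; column 4 is STATUS.
    awk 'NR > 1 && $4 != "Running" {print $1 "/" $2 " " $4}'
}
```

Usage: kubectl get pods --all-namespaces | pods_not_running. Empty output means everything is Running.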

Check the status of the kubelet service:

systemctl status -l kubelet

Two errors show up:

(1) kubelet[29703]: E0429 15:57:26.321596   29703 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"

(2)kubelet[29703]: E0429 15:57:26.321633   29703 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

 

Add the KUBECONFIG path to the shell profile:

cat <<EOF >> ~/.bash_profile
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
source ~/.bash_profile

Edit /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

Add this line:

Environment="KUBELET_MY_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"

Then append the new variable to the end of the ExecStart line:

ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS $KUBELET_MY_ARGS

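This resolves both errors once systemd has reloaded the unit and the kubelet has been restarted (systemctl daemon-reload && systemctl restart kubelet).

3. Install the dashboard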

# Pull the image
docker pull registry.cn-qingdao.aliyuncs.com/wangxiaoke/kubernetes-dashboard-amd64:v1.10.0

# Retag it with the k8s.gcr.io name
docker tag registry.cn-qingdao.aliyuncs.com/wangxiaoke/kubernetes-dashboard-amd64:v1.10.0 k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0

# Remove the mirror tag, which is no longer needed
docker image rm registry.cn-qingdao.aliyuncs.com/wangxiaoke/kubernetes-dashboard-amd64:v1.10.0

# Deploy. Note: this is the v1.10.1 manifest while the image above is tagged v1.10.0;
# the local tag has to match the image named in the manifest.
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

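4. Access the dashboard

The dashboard can be reached in several ways:

1. kubectl proxy: only accepts requests from 127.0.0.1 and localhost, so remote access needs an SSH tunnel. Cumbersome and not recommended.
2. NodePort: easy to configure, but only advisable in development environments. This article uses it.
3. Ingress: exposes the application through an Ingress Controller. Flexible and the most recommended option, but more complex. See http://www.ebanban.com/?p=603
4. API server: the API server is already exposed and reachable from outside, so this is a fairly recommended option. See http://www.ebanban.com/?p=603

Here we use the second method:

(1) Edit the service and change type: ClusterIP to type: NodePort

Run

kubectl edit service  kubernetes-dashboard --namespace=kube-system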


(2) Look up the externally exposed node port

kubectl get service --namespace=kube-system
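The node port appears in the PORT(S) column in the form port:nodePort/protocol, for example 443:30502/TCP. A minimal sketch for pulling it out of the table follows; the function name node_port is hypothetical and it assumes the default column layout of kubectl get service (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE).

```shell
# Hypothetical helper: read the table printed by "kubectl get service" on
# stdin and print the node port of the service named in $1.
# Prints nothing if the service has no node port (e.g. type ClusterIP).
node_port() {
    awk -v svc="$1" '$1 == svc && $5 ~ /:/ {
        # "443:30502/TCP" splits into "443", "30502", "TCP".
        split($5, parts, "[:/]")
        print parts[2]
    }'
}
```

Usage: kubectl get service --namespace=kube-system | node_port kubernetes-dashboard. kubectl can also return the value directly with -o jsonpath='{.spec.ports[0].nodePort}' on the service.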

  

(3) Create a dashboard user

Create a file named admin-token.yaml with the following content:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

Then create the resources:

kubectl create -f admin-token.yaml

(4) Get the token

kubectl describe secret/$(kubectl get secret -nkube-system |grep admin|awk '{print $1}') -nkube-system
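The describe output contains several fields besides the token itself. As a small sketch, the token value alone can be extracted like this; the function name extract_token is hypothetical, and it assumes the token is printed on a line that starts with "token:", as kubectl describe secret does for service account tokens.

```shell
# Hypothetical helper: read "kubectl describe secret" output on stdin and
# print only the value of the "token:" field.
extract_token() {
    awk '$1 == "token:" {print $2}'
}
```

Usage: pipe the describe command above into extract_token and paste the result into the dashboard login form.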


Log in to the dashboard from a browser at https://192.168.80.132:30502/ (that is, external IP:node port). The browser blocks the page by default, so add the site to its trusted list, then choose Token as the sign-in method and paste the token.

(5) Add flannel support; otherwise the kubelet reports CNI initialization errors

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml

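(III) Adding worker nodes

First run everything in part (I) on the worker node, then follow the steps below.

(1) Run the following on the worker node:

Use the join command printed at the end of kubeadm init: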

kubeadm join 192.168.0.120:6443 --token khm95w.mo0wwenu2o9hglls \
    --discovery-token-ca-cert-hash sha256:aeb0ca593b63c8d674719858fd2397825825cebc552e3c165f00edb9671d6e32

The token is only valid for 24 hours. To compute the CA certificate hash again, run on the master:

 

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

 

To create a new token, run on the master:

kubeadm token create
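With a fresh token and the hash from the openssl pipeline, the join command can be assembled by hand. The sketch below does only the string assembly; the function name build_join_command is hypothetical, and its three arguments are the API server endpoint, the token, and the hex hash without the sha256: prefix.

```shell
# Hypothetical helper: print a kubeadm join command line.
#   $1  API server endpoint as host:port, e.g. 192.168.0.120:6443
#   $2  bootstrap token
#   $3  CA certificate hash in hex, without the "sha256:" prefix
build_join_command() {
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
        "$1" "$2" "$3"
}
```

kubeadm can also do this in one step: kubeadm token create --print-join-command creates a token and prints the complete join command.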

If systemctl status kubelet on a worker node reports the following:

cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d May 29 06:30:28 fnode kubelet[4136]: E0529 06:30:28.935309 4136 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

the fix is:

vim /var/lib/kubelet/kubeadm-flags.env

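Remove --network-plugin=cni from the file and restart the kubelet service. Running kubectl get nodes on the master then shows the worker node changing from NotReady to Ready. CNI stands for Container Network Interface, a CNCF specification for configuring network interfaces in containers.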
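The edit can also be scripted instead of done in vim. This is a sketch of the workaround exactly as described, not a recommendation beyond it: it turns off CNI on that node, whereas making sure the pod network add-on (the flannel step above) is running on the node is the more usual fix. The function name drop_cni_flag is hypothetical; the file path is a parameter so it can be tested on a copy.

```shell
# Hypothetical helper: remove the --network-plugin=cni flag (and the
# whitespace in front of it) from a kubeadm-flags.env style file.
# A backup of the original file is kept with a .bak suffix.
drop_cni_flag() {
    env_file="${1:-/var/lib/kubelet/kubeadm-flags.env}"
    sed -i.bak -E 's/[[:space:]]*--network-plugin=cni//' "$env_file"
}
```

After running it as root on the worker node, restart the kubelet with systemctl restart kubelet.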
