Tungsten Fabric (6): Deploying a Newer Version of TF

The previous article described the pitfalls of deploying R1902. The next step is an upgrade, with R2003 as the target, mainly because both https://github.com/Juniper/contrail-controller and the images are available for that version. Many details are the same as before; below are the points worth paying attention to.

Deployment environment

  • System information
[root@master01 ~]# cat /etc/redhat-release 
CentOS Linux release 7.8.2003 (Core)
[root@master01 ~]# 
[root@master01 ~]# uname -a
Linux master01 3.10.0-1127.10.1.el7.x86_64 #1 SMP Wed Jun 3 14:28:03 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[root@master01 ~]# 
  • Node information
192.168.122.79 deploy host, local registry
192.168.122.80 k8s master, contrail controller
192.168.122.81 k8s node01
192.168.122.82 k8s node02

Version of contrail-ansible-deployer used for deployment

This probably does not matter much, but to be safe, use the matching branch:

[root@deploy R2003]# git clone https://github.com/Juniper/contrail-ansible-deployer.git -b R2003

Modifying RedHat.yml

[root@deploy contrail-ansible-deployer]# vim playbooks/roles/k8s/tasks/RedHat.yml
# using this to avoid issue https://github.com/ansible/ansible/issues/20711
- name: make cache to import gpg keys
  #command: "yum -q makecache -y --disablerepo='*' --enablerepo='Kubernetes'"
  command: "yum -q makecache -y --disablerepo='*' --enablerepo='epel'"
  when: k8s_package_version is defined

- name: remove the upstream Kubernetes repo file
  command: "rm -rf /etc/yum.repos.d/Kubernetes.repo"
  when: k8s_package_version is defined

instances.yaml

The key point is CONTRAIL_VERSION: 2003-latest:

[root@deploy contrail-ansible-deployer]# cat config/instances.yaml
provider_config:
  bms:
   ssh_pwd: ""
   ssh_user: root
   ssh_public_key: /root/.ssh/id_rsa.pub
   ssh_private_key: /root/.ssh/id_rsa
   domainsuffix: local
instances:
  bms1:
    provider: bms
    roles:
      config_database:
      config:
      control:
      analytics_database:
      analytics:
      webui:
      k8s_master:
      kubemanager:
    ip: 192.168.122.80
  bms2:
    provider: bms
    roles:
      vrouter:
      k8s_node:
    ip: 192.168.122.81
  bms3:
    provider: bms
    roles:
      vrouter:
      k8s_node:
    ip: 192.168.122.82
global_configuration:
  CONTAINER_REGISTRY: 192.168.122.79
contrail_configuration:
  CONTRAIL_VERSION: 2003-latest
  CONTRAIL_CONTAINER_TAG: 2003-latest
  KUBERNETES_CLUSTER_PROJECT: {}

Copy the deploy host's public key to these three machines so that it can log in over ssh without a password.
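Key distribution can be scripted with ssh-copy-id. A minimal sketch using the node list above; the leading `echo` makes it a dry run, so drop it to actually copy the keys:

```shell
# Node list from this deployment; adjust for your environment.
NODES="192.168.122.80 192.168.122.81 192.168.122.82"

# Dry run: print the commands first; remove `echo` to execute them.
# ssh-copy-id will prompt once for each node's root password.
for node in $NODES; do
    echo ssh-copy-id -i /root/.ssh/id_rsa.pub "root@${node}"
done
```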

Pulling the images

  • The contrail images can be pulled from https://hub.docker.com/u/opencontrailnightly; the kube-related images can be pulled from https://hub.docker.com/u/mirrorgooglecontainers
  • The kube images must be pulled at v1.12.9, as noted in the R2003 release notes
192.168.122.79/kube-proxy                             v1.12.9             295526df163c        12 months ago       95.7MB
192.168.122.79/kube-apiserver                         v1.12.9             8ea704c2d4a7        12 months ago       194MB
192.168.122.79/kube-controller-manager                v1.12.9             f473e8452c8e        12 months ago       164MB
192.168.122.79/kube-scheduler                         v1.12.9             c79506ccc1bc        12 months ago       58.4MB
  • Among the contrail components, R2003 adds contrail-provisioner, so remember to add it to the image-pulling script
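The extra component just means one more entry in the pull/tag/push loop. A dry-run sketch; the image list here is illustrative and far from complete (names are taken from the contrail-status output later in this article):

```shell
#!/bin/sh
# Dry-run sketch of mirroring contrail images into the local registry.
# The image list is illustrative, not complete -- the point is that
# contrail-provisioner is new in R2003 and must be included.
REGISTRY=192.168.122.79
TAG=2003-latest

for img in contrail-provisioner contrail-nodemgr contrail-kubernetes-kube-manager; do
    # Print the commands first; pipe the output to `sh` once it looks right.
    echo "docker pull opencontrailnightly/${img}:${TAG}"
    echo "docker tag  opencontrailnightly/${img}:${TAG} ${REGISTRY}/${img}:${TAG}"
    echo "docker push ${REGISTRY}/${img}:${TAG}"
done
```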

Installing Kubernetes

First, add the Aliyun Kubernetes repo on every deployment node:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

You can let the ansible playbooks install these later, or install them manually on the three nodes beforehand:

# yum install -y kubelet-1.12.9 kubeadm-1.12.9 kubectl-1.12.9

Installing other required software

# yum install -y python-devel
# pip install wheel setuptools docker_compose

The ansible version

Do not install the latest ansible! Otherwise steps that should not affect the deployment at all will fail to pass, and the whole run aborts (learned the hard way):

[DEPRECATION WARNING]: Using tests as filters is deprecated. Instead of using `result|success` use `result is success`. This feature will be removed in version 2.9. Deprecation warnings can be disabled by setting 
deprecation_warnings=False in ansible.cfg.

The version on my deploy host is:

[root@deploy contrail-ansible-deployer]# pip list | grep ansible
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support it
ansible                          2.7.15

The docker-ce version

When deploying R1912 earlier, the installed version was docker-ce-18.03.1.ce-1.el7.centos, and install_k8s complained:

[root@master01 ~]# kubeadm init --token-ttl 0 --kubernetes-version v1.12.9 --apiserver-advertise-address 192.168.122.80 --pod-network-cidr 10.32.0.0/12 &&\n mkdir -p $HOME/.kube &&\n cp -u /etc
[init] using Kubernetes version: v1.12.9
[preflight] running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.03.1-ce. Latest validated version: 18.06

So install the 18.06 release instead:

yum install -y docker-ce-18.06.3.ce-3.el7
systemctl enable docker
systemctl start docker

The local registry port

Previously I just followed the documentation step by step and mapped the registry's port 5000 to port 80 on the deploy host. This time I skipped that mapping, and install_k8s failed as a result:

[root@master01 ~]# kubeadm config images pull
I0616 05:04:31.387655   26257 version.go:93] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://storage.googleapis.com/kubernetes-release/release/stable-1.txt:)
I0616 05:04:31.388398   26257 version.go:94] falling back to the local client version: v1.12.9
failed to pull image "k8s.gcr.io/kube-apiserver:v1.12.9": output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1

So you need to:
(1) map the registry container's port 5000 to port 80 on the deploy host
(2) edit each node's /etc/hosts so that k8s.gcr.io points to the local registry host's IP
(3) edit each node's /etc/docker/daemon.json and add the local registry's IP (masquerading as k8s.gcr.io) to the "insecure-registries" list
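Concretely, the three fixes look roughly like this; the registry container command is an assumption, so adjust it to however your registry was actually started:

```
# (1) on the deploy host: map the registry container's 5000 to host port 80
docker run -d --name registry -p 80:5000 registry:2

# (2) on every node, in /etc/hosts:
192.168.122.79 k8s.gcr.io

# (3) on every node, in /etc/docker/daemon.json (then restart docker):
{
  "insecure-registries": ["192.168.122.79", "k8s.gcr.io"]
}
```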

A few notes

  • After install_k8s.yml finishes, the kubelet service on each node should be up, but since contrail-cni has not been installed yet, the logs keep warning about the missing CNI config:
[root@node01 yum.repos.d]# systemctl status kubelet
* kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           `-10-kubeadm.conf
   Active: active (running) since Wed 2020-06-17 23:03:23 EDT; 3min 35s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 25610 (kubelet)
    Tasks: 22
   Memory: 43.1M
   CGroup: /system.slice/kubelet.service
           `-25610 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni

Jun 17 23:06:36 master01 kubelet[25610]: W0617 23:06:36.164147   25610 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 17 23:06:36 master01 kubelet[25610]: E0617 23:06:36.166776   25610 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready:...fig uninitialized
Jun 17 23:06:41 master01 kubelet[25610]: W0617 23:06:41.183873   25610 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 17 23:06:41 master01 kubelet[25610]: E0617 23:06:41.185117   25610 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready:...fig uninitialized
Jun 17 23:06:46 master01 kubelet[25610]: W0617 23:06:46.190000   25610 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 17 23:06:46 master01 kubelet[25610]: E0617 23:06:46.194590   25610 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready:...fig uninitialized
Jun 17 23:06:51 master01 kubelet[25610]: W0617 23:06:51.207761   25610 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 17 23:06:51 master01 kubelet[25610]: E0617 23:06:51.208174   25610 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready:...fig uninitialized
Jun 17 23:06:56 master01 kubelet[25610]: W0617 23:06:56.215738   25610 cni.go:188] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 17 23:06:56 master01 kubelet[25610]: E0617 23:06:56.227728   25610 kubelet.go:2167] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready:...fig uninitialized
Hint: Some lines were ellipsized, use -l to show in full.
[root@master01 yum.repos.d]# 

Post-deployment checks

  • Check the status of the contrail components
[root@master01 ~]# contrail-status
Pod              Service         Original Name                          Original Version  State    Id            Status            
                 redis           contrail-external-redis                master-105        running  7136ed933b5c  Up 2 hours        
analytics        api             contrail-analytics-api                 master-105        running  0a111a7edfd0  Up 34 minutes     
analytics        collector       contrail-analytics-collector           master-105        running  105507305da8  Up 34 minutes     
analytics        nodemgr         contrail-nodemgr                       master-105        running  f6a1d9c3145d  Up 34 minutes     
analytics        provisioner     contrail-provisioner                   master-105        running  78273234d4ba  Up 34 minutes     
config           api             contrail-controller-config-api         master-105        running  94912927b988  Up 41 minutes     
config           device-manager  contrail-controller-config-devicemgr   master-105        running  5804934ee4ea  Up 41 minutes     
config           dnsmasq         contrail-controller-config-dnsmasq     master-105        running  f0c1799b990e  Up 41 minutes     
config           nodemgr         contrail-nodemgr                       master-105        running  a015edc033b5  Up 41 minutes     
config           provisioner     contrail-provisioner                   master-105        running  77f056eb01da  Up 41 minutes     
config           schema          contrail-controller-config-schema      master-105        running  2cdd6c62ca7a  Up 41 minutes     
config           stats           contrail-controller-config-stats       master-105        running  2e399f2cd91d  Up 40 minutes     
config           svc-monitor     contrail-controller-config-svcmonitor  master-105        running  2e11f2b1c20a  Up 40 minutes     
config-database  cassandra       contrail-external-cassandra            master-105        running  ec6ea9393eb3  Up About an hour  
config-database  nodemgr         contrail-nodemgr                       master-105        running  4a152016f50e  Up About an hour  
config-database  provisioner     contrail-provisioner                   master-105        running  1ed205426bbc  Up 41 minutes     
config-database  rabbitmq        contrail-external-rabbitmq             master-105        running  a82c31f5dd57  Up About an hour  
config-database  zookeeper       contrail-external-zookeeper            master-105        running  4d2e32b0de17  Up About an hour  
control          control         contrail-controller-control-control    master-105        running  4f83acc754ce  Up 36 minutes     
control          dns             contrail-controller-control-dns        master-105        running  c8572afe3b1a  Up 36 minutes     
control          named           contrail-controller-control-named      master-105        running  07771fca3ba5  Up 36 minutes     
control          nodemgr         contrail-nodemgr                       master-105        running  d33bce33655c  Up 36 minutes     
control          provisioner     contrail-provisioner                   master-105        running  d7baa52ebc23  Up 36 minutes     
database         cassandra       contrail-external-cassandra            master-105        running  f0218c363df5  Up 35 minutes     
database         nodemgr         contrail-nodemgr                       master-105        running  8a18c1903e05  Up 35 minutes     
database         provisioner     contrail-provisioner                   master-105        running  f49298d6dbcb  Up 35 minutes     
database         query-engine    contrail-analytics-query-engine        master-105        running  60406338c8a0  Up 35 minutes     
kubernetes       kube-manager    contrail-kubernetes-kube-manager       master-105        running  3b1905176974  Up 32 minutes     
webui            job             contrail-controller-webui-job          master-105        running  4fa0cc25d81e  Up 38 minutes     
webui            web             contrail-controller-webui-web          master-105        running  2f8e79cb9e19  Up 38 minutes     

== Contrail control ==
control: active
nodemgr: active
named: active
dns: active

== Contrail config-database ==
nodemgr: initializing (Disk for DB is too low. )
zookeeper: active
rabbitmq: active
cassandra: active

== Contrail kubernetes ==
kube-manager: active

== Contrail database ==
nodemgr: initializing (Disk for DB is too low. )
query-engine: active
cassandra: active

== Contrail analytics ==
nodemgr: active
api: active
collector: active

== Contrail webui ==
web: active
job: active

== Contrail config ==
svc-monitor: active
nodemgr: active
device-manager: active
api: active
schema: active

[root@master01 ~]#

The "Disk for DB is too low." message can be ignored in a lab environment.
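If the warning bothers you, the nodemgr disk threshold can reportedly be lowered from instances.yaml. The exact variable names below (the `...minimum_diskGB` keys) are an assumption based on the nodemgr DEFAULTS naming convention, so verify them against your deployer version before relying on this:

```
contrail_configuration:
  ...
  CONFIG_DATABASE_NODEMGR__DEFAULTS__minimum_diskGB: "2"
  DATABASE_NODEMGR__DEFAULTS__minimum_diskGB: "2"
```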

  • Check the pod status; every pod should be Running
[root@master01 ~]# kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE    IP               NODE       NOMINATED NODE
kube-system   coredns-85c98899b4-8lk8h                1/1     Running   0          135m   10.47.255.252    node01     <none>
kube-system   coredns-85c98899b4-h9bjb                1/1     Running   0          135m   10.47.255.251    node01     <none>
kube-system   etcd-master01                           1/1     Running   2          35m    192.168.122.80   master01   <none>
kube-system   kube-apiserver-master01                 1/1     Running   2          36m    192.168.122.80   master01   <none>
kube-system   kube-controller-manager-master01        1/1     Running   3          36m    192.168.122.80   master01   <none>
kube-system   kube-proxy-ffxls                        1/1     Running   1          134m   192.168.122.81   node01     <none>
kube-system   kube-proxy-jqbxp                        1/1     Running   1          134m   192.168.122.82   node02     <none>
kube-system   kube-proxy-kc4fh                        1/1     Running   2          135m   192.168.122.80   master01   <none>
kube-system   kube-scheduler-master01                 1/1     Running   2          36m    192.168.122.80   master01   <none>
kube-system   kubernetes-dashboard-76456c6d4b-vhkgp   1/1     Running   0          134m   192.168.122.81   node01     <none>
[root@master01 ~]# 
  • Check the node status
[root@master01 ~]# kubectl get nodes
NAME       STATUS     ROLES    AGE    VERSION
master01   NotReady   master   166m   v1.12.9
node01     Ready      <none>   165m   v1.12.9
node02     Ready      <none>   165m   v1.12.9
[root@master01 ~]#
[root@master01 ~]# mkdir -p /etc/cni/net.d/
[root@master01 ~]# scp root@node02:/etc/cni/net.d/10-contrail.conf /etc/cni/net.d/10-contrail.conf
[root@master01 ~]# systemctl restart kubelet
# wait a minute or so
[root@master01 ~]# kubectl get nodes
NAME       STATUS   ROLES    AGE    VERSION
master01   Ready    master   166m   v1.12.9
node01     Ready    <none>   165m   v1.12.9
node02     Ready    <none>   165m   v1.12.9
[root@master01 ~]#  
