Deploying Highly Available Rancher with RKE

WeChat official account: 运维开发故事; author: 刘大仙

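RKE overview:

Rancher Kubernetes Engine (RKE) is a lightweight Kubernetes installer that supports installing Kubernetes on bare-metal and virtualized servers. RKE addresses a common pain point in the Kubernetes community: installation complexity. The RKE binary runs on multiple platforms, including macOS, Linux, and Windows.

For details, see: https://docs.rancher.cn/rke/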

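Rancher overview:

Rancher is a container management platform built for organizations that run containers. It simplifies working with Kubernetes: developers can run Kubernetes everywhere (Run Kubernetes Everywhere), IT requirements are met, and DevOps teams are empowered.

For details, see: https://rancher2.docs.rancher.cn/docs/overview/_index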

Environment:

OS                          Hostname      IP address      Node                  Role
CentOS 7 1810               nginx-master  192.168.111.21  Nginx primary server  Load balancing
CentOS 7 1810               nginx-backup  192.168.111.22  Nginx backup server   Load balancing
ubuntu-18.04.3-live-server  rke-node1     192.168.111.50  RKE node 1            RKE cluster
ubuntu-18.04.3-live-server  rke-node2     192.168.111.51  RKE node 2            RKE cluster
ubuntu-18.04.3-live-server  rke-node3     192.168.111.52  RKE node 3            RKE cluster

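System preparation before deployment:

Disable the firewall and SELinux

To keep blocked ports from breaking cluster formation, disable the firewall and SELinux up front.

  • CentOS: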

    systemctl stop firewalld
    systemctl disable firewalld
    setenforce 0
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
    
  • Ubuntu:

    sudo ufw disable
    

Configure the hosts file:

192.168.111.21 nginx-master
192.168.111.22 nginx-backup
192.168.111.50 rke-node1
192.168.111.51 rke-node2
192.168.111.52 rke-node3
  • Add these entries to /etc/hosts on every machine, and make sure all machines can reach one another by hostname
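
The hosts entries above have to be added on five machines, so it helps if the step can be repeated safely. The following is a minimal sketch, not the author's method: `add_host_entry` and the `HOSTS_FILE` variable are hypothetical names introduced here. By default it writes to a scratch file so it can be tried without touching /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): append "IP hostname"
# to a hosts file only when the hostname is not already listed.
# HOSTS_FILE defaults to a scratch file; set HOSTS_FILE=/etc/hosts and run as
# root to apply the entries for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    ip="$1"
    name="$2"
    # awk exits 0 if any non-comment line already names this host.
    if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
    fi
}

add_host_entry 192.168.111.21 nginx-master
add_host_entry 192.168.111.22 nginx-backup
add_host_entry 192.168.111.50 rke-node1
add_host_entry 192.168.111.51 rke-node2
add_host_entry 192.168.111.52 rke-node3

cat "$HOSTS_FILE"
```

Running the script a second time against the same file adds nothing, because each hostname is already present.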

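Required tools:

This installation requires the following CLI tools. Make sure they are installed and available in $PATH.

Install the CLI tools on the RKE nodes, and make sure all three nodes have them installed correctly.

  • kubectl - the Kubernetes command-line tool.

  • rke - Rancher Kubernetes Engine, the CLI for building Kubernetes clusters.

  • helm - the package manager for Kubernetes.

    See the Helm version requirements to choose a Helm version for installing Rancher.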
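
Once the tools are installed (the following sections show how), a quick check on each RKE node can confirm that all three are on $PATH. This is a small sketch under that assumption; `missing_tools` is a hypothetical helper name, not part of any of the tools.

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): report which of the
# given commands cannot be found in $PATH. Returns 0 when none are missing.
missing_tools() {
    missing=""
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    [ -z "$missing" ] && return 0
    echo "missing:$missing"
    return 1
}

if missing_tools kubectl rke helm; then
    echo "all required tools are available"
else
    echo "install the tools listed above before continuing"
fi
```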

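Install kubectl:

  • Installation follows the official Kubernetes documentation; for various reasons, snap is used here

    sudo apt-get install snapd
    sudo snap install kubectl --classic # this step is slow; be patient
    # Verify the installation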
    kubectl help
    

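Install RKE:

  • Installation follows the official Rancher documentation. The binary is downloaded from GitHub and is fairly large, so make sure your network can reach it

    wget https://github.com/rancher/rke/releases/download/v1.0.8/rke_linux-amd64
    # Move the binary to /usr/local/bin/, rename it to rke, and make it executable
    sudo mv rke_linux-amd64 /usr/local/bin/rke
    sudo chmod +x /usr/local/bin/rke
    # Verify the installation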
    rke --version
    

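Install Helm:

  • Installation follows the official Helm documentation. Helm is the package manager for Kubernetes; Helm v3 or later is required

    # Download the release archive
    wget https://get.helm.sh/helm-v3.2.1-linux-amd64.tar.gz
    # Extract it
    tar zxvf helm-v3.2.1-linux-amd64.tar.gz
    # Move the binary to /usr/local/bin/
    sudo mv linux-amd64/helm /usr/local/bin/helm
    # Verify the installation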
    helm help
    

Create the Nginx + Keepalived cluster:

Run this on the CentOS nodes.

  • Install Nginx

    # Download the Nginx source package
    wget http://nginx.org/download/nginx-1.17.10.tar.gz
    # Extract the package
    tar zxvf nginx-1.17.10.tar.gz
    # Install the packages required for compilation
    yum install -y gcc gcc-c++ pcre pcre-devel zlib zlib-devel openssl openssl-devel libnl3-devel
    # Enter the nginx directory. HTTPS is required, so build with --with-http_ssl_module;
    # --with-stream enables the layer-4 (stream) proxying configured later
    cd nginx-1.17.10
    mkdir -p /usr/local/nginx
    ./configure --prefix=/usr/local/nginx --with-http_ssl_module --with-stream
    # Build and install nginx
    make && make install
    # Create a symlink for the nginx command
    ln -s /usr/local/nginx/sbin/nginx /usr/local/bin/nginx
    # Verify the installation
    nginx -V
    # Start nginx
    nginx
    
  • Install Keepalived

    # Download the source package
    wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
    # Extract the package
    tar zxvf keepalived-2.0.20.tar.gz
    # Compile and install keepalived
    cd keepalived-2.0.20
    mkdir /usr/local/keepalived
    ./configure --prefix=/usr/local/keepalived/
    make && make install
    # Set up keepalived as a system service
    cp /usr/local/keepalived/sbin/keepalived /usr/sbin/keepalived
    cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
    touch /etc/init.d/keepalived
    chmod +x /etc/init.d/keepalived
    vim /etc/init.d/keepalived # the contents of this file are shown below
    # Configure keepalived
    mkdir /etc/keepalived/
    cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
    vim /etc/keepalived/keepalived.conf # the contents of keepalived.conf are shown below
    # Start keepalived
    systemctl start keepalived
    systemctl enable keepalived
    # Verify
    systemctl status keepalived
    # keepalived should now be running on both nodes, one as master and one as backup. Running ip addr on the master should show the virtual IP address; the backup should not have it
    # Visit https://192.168.111.20 to verify the configuration
    

    Contents of /etc/init.d/keepalived:

    #!/bin/sh
    #
    # Startup script for the Keepalived daemon
    #
    # processname: keepalived
    # pidfile: /var/run/keepalived.pid
    # config: /etc/keepalived/keepalived.conf
    # chkconfig: - 21 79
    # description: Start and stop Keepalived

    # Source function library
    . /etc/rc.d/init.d/functions

    # Source configuration file (we set KEEPALIVED_OPTIONS there)
    . /etc/sysconfig/keepalived

    RETVAL=0

    prog="keepalived"

    start() {
        echo -n $"Starting $prog: "
        daemon keepalived ${KEEPALIVED_OPTIONS}
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
    }

    stop() {
        echo -n $"Stopping $prog: "
        killproc keepalived
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
    }

    reload() {
        echo -n $"Reloading $prog: "
        killproc keepalived -1
        RETVAL=$?
        echo
    }

    # See how we were called.
    case "$1" in
        start)
            start
            ;;
        stop)
            stop
            ;;
        reload)
            reload
            ;;
        restart)
            stop
            start
            ;;
        condrestart)
            if [ -f /var/lock/subsys/$prog ]; then
                stop
                start
            fi
            ;;
        status)
            status keepalived
            RETVAL=$?
            ;;
        *)
            echo "Usage: $0 {start|stop|reload|restart|condrestart|status}"
            RETVAL=1
    esac

    exit $RETVAL

    # Contents of /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived
    
    global_defs {
       router_id 192.168.111.21 # must be unique on the network; never reuse an id
    }
    
    vrrp_script chk_nginx {     # health-check script that monitors the nginx service
        script "/usr/local/keepalived/check_ng.sh"
        interval 3
    }
    
    vrrp_instance VI_1 {
        state MASTER    # this node is the master; set BACKUP on the standby
        interface ens33    # network interface to bind to
        virtual_router_id 51    # VRRP group; must be the same on master and backup
        priority 120    # priority; the node with the higher value becomes master
        advert_int 1    # VRRP advertisement interval, in seconds
        authentication { # authentication
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress { # virtual IP
            192.168.111.20
        }
        track_script {    # scripts to run
            chk_nginx
        }
    }
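
The configuration above is for the master. The comments say the backup must use state BACKUP, a lower priority, and its own unique router_id, while keeping the same virtual_router_id. One way to derive the backup file is sketched below; the values router_id 192.168.111.22 and priority 100 are assumptions chosen for illustration, and `master_to_backup` is a hypothetical helper name.

```shell
#!/bin/sh
# Rewrite the master's keepalived.conf (read from stdin) for the backup node.
# Assumed values, not from the original article:
#   router_id 192.168.111.22 (the backup's own address), priority 100 (< 120).
master_to_backup() {
    sed -e 's/state MASTER/state BACKUP/' \
        -e 's/router_id 192\.168\.111\.21/router_id 192.168.111.22/' \
        -e 's/priority 120/priority 100/'
}

# Demonstration on a trimmed-down copy of the master settings.
master_to_backup <<'EOF'
global_defs {
   router_id 192.168.111.21
}
vrrp_instance VI_1 {
    state MASTER
    virtual_router_id 51
    priority 120
}
EOF
```

On the real master the same function could be used as `master_to_backup < /etc/keepalived/keepalived.conf > keepalived.backup.conf`, and the result copied to nginx-backup.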
    

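    Contents of /usr/local/keepalived/check_ng.sh:

    #!/bin/bash
    # Timestamp for the log entry
    d=$(date --date today +%Y%m%d_%H:%M:%S)
    # Number of running nginx processes
    n=$(ps -C nginx --no-heading | wc -l)
    if [ "$n" -eq 0 ]; then
        # Try to restart nginx (assumes an nginx systemd unit exists; with the
        # source build above, /usr/local/nginx/sbin/nginx starts it directly)
        systemctl start nginx
        n2=$(ps -C nginx --no-heading | wc -l)
        if [ "$n2" -eq 0 ]; then
            # nginx could not be restarted: stop keepalived so the virtual IP fails over
            echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
            systemctl stop keepalived
        fi
    fi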

    
    

Install docker-ce:

Run this on the RKE nodes.

# Remove old Docker versions
sudo apt-get remove docker docker-engine docker.io containerd runc
# Install prerequisite packages
sudo apt-get install -y \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Add the stable apt repository
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
# Install docker-ce
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
# Verify the installation
sudo docker info
# Add the current user to the "docker" group. This account is used later in the installation: the SSH user that accesses a node must be a member of the docker group on that node. Log out and back in for the group change to take effect
sudo usermod -aG docker $USER

Configure the layer-4 load balancer

Run this on the Nginx cluster.

# Update the nginx configuration file
# vim /usr/local/nginx/conf/nginx.conf

#user  nobody;
worker_processes  4;
worker_rlimit_nofile 40000;

events {
    worker_connections  8192;
}

stream {
    upstream rancher_servers_http {
        least_conn;
        server 192.168.111.50:80 max_fails=3 fail_timeout=5s;
        server 192.168.111.51:80 max_fails=3 fail_timeout=5s;
        server 192.168.111.52:80 max_fails=3 fail_timeout=5s;
    }
    server {
        listen     80;
        proxy_pass rancher_servers_http;
    }

    upstream rancher_servers_https {
        least_conn;
        server 192.168.111.50:443 max_fails=3 fail_timeout=5s;
        server 192.168.111.51:443 max_fails=3 fail_timeout=5s;
        server 192.168.111.52:443 max_fails=3 fail_timeout=5s;
    }

    server {
        listen     443;
        proxy_pass rancher_servers_https;
    }
}
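
Because nginx was already started earlier, the new configuration only takes effect after a reload. The sketch below tests the configuration first and reloads only if the test passes; `reload_nginx` and the `NGINX_BIN` variable are hypothetical names introduced here, while `nginx -t` and `nginx -s reload` are standard nginx options.

```shell
#!/bin/sh
# NGINX_BIN is an assumed override hook (not from the original article); it
# defaults to the nginx command symlinked into /usr/local/bin earlier.
NGINX_BIN="${NGINX_BIN:-nginx}"

reload_nginx() {
    # nginx -t parses the configuration without applying it.
    if ! "$NGINX_BIN" -t; then
        echo "configuration test failed; nginx was not reloaded" >&2
        return 1
    fi
    # nginx -s reload asks the running master process to re-read its configuration.
    "$NGINX_BIN" -s reload
}

# Only act when nginx is actually installed on this machine.
if command -v "$NGINX_BIN" >/dev/null 2>&1; then
    reload_nginx || echo "reload failed" >&2
else
    echo "nginx not found in PATH; skipping reload"
fi
```

Run it on both nginx-master and nginx-backup so the two load balancers serve the same configuration.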

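Start the deployment:

Install Kubernetes with RKE

  • Set up passwordless SSH between the RKE nodes

    # Generate an RSA key pair
    ssh-keygen
    # Copy this host's public key to all three nodes to enable passwordless login
    ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.50
    ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.51
    ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.52
    # Note: each node must also register its key with itself; run these commands on all three nodes
    # Verify
    docker@rke-node3:~$ ssh docker@192.168.111.50    # from node3 to node1; SSH should no longer ask for a password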
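
Since the same key distribution has to be repeated on all three nodes, it can be wrapped in a loop. This is a sketch only: `copy_keys`, `NODES`, `SSH_USER`, and the `DRY_RUN` switch are hypothetical names introduced here, and the user `docker` and the node addresses are taken from the environment table and rancher-cluster.yml. With DRY_RUN=1 (the default) the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the original article) around ssh-copy-id.
NODES="192.168.111.50 192.168.111.51 192.168.111.52"
SSH_USER="docker"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to actually copy the key

copy_keys() {
    for node in $NODES; do
        if [ "$DRY_RUN" = "1" ]; then
            # Show what would be run without contacting the node.
            echo "ssh-copy-id -i $HOME/.ssh/id_rsa.pub $SSH_USER@$node"
        else
            ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$SSH_USER@$node"
        fi
    done
}

copy_keys
```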
    
  • Write the rancher-cluster.yml file

    # vim rancher-cluster.yml
    nodes:
      - address: 192.168.111.50    # host IP
        user: docker    # a user that can run docker commands
        role: [controlplane,worker,etcd]    # node roles
      - address: 192.168.111.51
        user: docker
        role: [controlplane,worker,etcd]
      - address: 192.168.111.52
        user: docker
        role: [controlplane,worker,etcd]
    
    services:
      etcd:
        snapshot: true
        creation: 6h
        retention: 24h
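
The nodes section repeats the same three fields for every host, so it can be generated from a list of addresses. The helper below is a sketch with a hypothetical name, `gen_nodes`; it assumes, as in the file above, that every node uses the SSH user docker and carries all three roles.

```shell
#!/bin/sh
# Hypothetical generator (not from the original article): print the nodes
# section of rancher-cluster.yml for the addresses given as arguments.
gen_nodes() {
    echo "nodes:"
    for ip in "$@"; do
        printf '  - address: %s\n' "$ip"
        printf '    user: docker\n'
        printf '    role: [controlplane,worker,etcd]\n'
    done
}

gen_nodes 192.168.111.50 192.168.111.51 192.168.111.52
```

The services section still has to be appended by hand, since it does not depend on the node list.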
    
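  • Run RKE to build the Kubernetes cluster

    rke up --config ./rancher-cluster.yml
    # Verify: the following message indicates success.
    # Finished building Kubernetes cluster successfully.
    
  • Pods should be in the Running or Completed state.

  • Pods with STATUS Running should show all containers ready under READY (for example, 3/3).

  • Pods with STATUS Completed are run-once jobs. For these Pods, READY should be 0/1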

    kubectl get pods --all-namespaces
    
    NAMESPACE       NAME                                      READY     STATUS      RESTARTS   AGE
    ingress-nginx   nginx-ingress-controller-tnsn4            1/1       Running     0          30s
    ingress-nginx   nginx-ingress-controller-tw2ht            1/1       Running     0          30s
    ingress-nginx   nginx-ingress-controller-v874b            1/1       Running     0          30s
    kube-system     canal-jp4hz                               3/3       Running     0          30s
    kube-system     canal-z2hg8                               3/3       Running     0          30s
    kube-system     canal-z6kpw                               3/3       Running     0          30s
    kube-system     kube-dns-7588d5b5f5-sf4vh                 3/3       Running     0          30s
    kube-system     kube-dns-autoscaler-5db9bbb766-jz2k6      1/1       Running     0          30s
    kube-system     metrics-server-97bc649d5-4rl2q            1/1       Running     0          30s
    kube-system     rke-ingress-controller-deploy-job-bhzgm   0/1       Completed   0          30s
    kube-system     rke-kubedns-addon-deploy-job-gl7t4        0/1       Completed   0          30s
    kube-system     rke-metrics-addon-deploy-job-7ljkc        0/1       Completed   0          30s
    kube-system     rke-network-plugin-deploy-job-6pbgj       0/1       Completed   0          30s
    
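  • Keep the following files safe

    rancher-cluster.yml: the RKE cluster configuration file.
    kube_config_rancher-cluster.yml: the kubeconfig file for the cluster; it contains credentials with full access to the cluster.
    rancher-cluster.rkestate: the Kubernetes cluster state file; it contains credentials with full access to the cluster.
    
  • After a successful run, RKE writes kube_config_rancher-cluster.yml to the current directory. Copy this file to .kube/kube_config_rancher-cluster.yml

    # Run in the user's home directory
    mkdir -p .kube
    cp kube_config_rancher-cluster.yml .kube/
    export KUBECONFIG=$(pwd)/.kube/kube_config_rancher-cluster.yml
    # Verify
    kubectl get nodes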
    NAME             STATUS   ROLES                      AGE     VERSION
    192.168.111.50   Ready    controlplane,etcd,worker   5m47s   v1.17.5
    192.168.111.51   Ready    controlplane,etcd,worker   5m46s   v1.17.5
    192.168.111.52   Ready    controlplane,etcd,worker   5m47s   v1.17.5
    
  • Check the health of the cluster Pods

    Confirm that all required Pods and containers are healthy before continuing.
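
The health criteria described earlier (Running Pods fully ready, Completed Pods ignored) can be checked mechanically. The filter below is a sketch, not part of kubectl: `unhealthy_pods` is a hypothetical name, and it assumes the column layout of `kubectl get pods --all-namespaces --no-headers`. The demonstration rows ending in example-pod-a and example-pod-b are made up to show what a failure looks like.

```shell
#!/bin/sh
# Hypothetical filter (not from the original article): read pod rows on stdin
# and print the ones that look unhealthy.
# Columns: NAMESPACE NAME READY STATUS RESTARTS AGE
unhealthy_pods() {
    awk '
        $4 == "Completed" { next }    # finished run-once jobs are fine
        $4 != "Running"   { print $1 "/" $2 ": status " $4; next }
        {
            split($3, r, "/")         # READY has the form ready/total
            if (r[1] != r[2]) print $1 "/" $2 ": READY " $3
        }
    '
}

# Demonstration on sample rows; the last two are invented failures.
unhealthy_pods <<'EOF'
kube-system   canal-jp4hz                           3/3   Running            0   30s
kube-system   rke-network-plugin-deploy-job-6pbgj   0/1   Completed          0   30s
kube-system   example-pod-a                         2/3   Running            0   30s
kube-system   example-pod-b                         0/1   CrashLoopBackOff   4   30s
EOF
```

Against a live cluster the same function would be fed with `kubectl get pods --all-namespaces --no-headers | unhealthy_pods`; empty output means every Pod met the criteria.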

Install Rancher

  • Add the Helm chart repository

    helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
    
  • Create a namespace for Rancher

    kubectl create namespace cattle-system
    
  • Use Rancher-generated self-signed certificates

    # Install the CustomResourceDefinition resources
    
    kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.12/deploy/manifests/00-crds.yaml
    
    # **Important:**
    # If you are running Kubernetes v1.15 or earlier,
    # add the `--validate=false` flag to the kubectl apply command above;
    # otherwise you will get a validation error about the
    # x-kubernetes-preserve-unknown-fields field in cert-manager's CustomResourceDefinition resources.
    # This is a benign error caused by the way kubectl validates resources.
    
    # Create the namespace for cert-manager
    kubectl create namespace cert-manager
    
    # Add the Jetstack Helm repository
    helm repo add jetstack https://charts.jetstack.io
    
    # Update the local Helm chart repository cache
    helm repo update
    
    # Install the cert-manager Helm chart
    helm install \
     cert-manager jetstack/cert-manager \
     --namespace cert-manager \
     --version v0.12.0
    
    # Verify
    kubectl get pods --namespace cert-manager
    
    NAME                                      READY   STATUS    RESTARTS   AGE
    cert-manager-754d9b75d9-6xbk4             1/1     Running   0          94s
    cert-manager-cainjector-85fbdf788-hthfn   1/1     Running   0          94s
    cert-manager-webhook-76f9b64b45-bmt5z     1/1     Running   0          94s
    
  • Deploy Rancher

    helm install rancher rancher-stable/rancher \
     --namespace cattle-system \
     --set hostname=rancher.hzqx.com
    
  • Wait for the Rancher rollout to finish

    kubectl -n cattle-system rollout status deploy/rancher
    Waiting for deployment "rancher" rollout to finish: 0 of 3 updated replicas are available...
    deployment "rancher" successfully rolled out
    
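  • Setup is complete. Visit https://rancher.hzqx.com (the hostname must resolve to the load balancer, for example the virtual IP 192.168.111.20); the default username and password are both admin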
