WeChat official account: 运维开发故事; author: 刘大仙
For details, see: https://docs.rancher.cn/rke/
For details, see: https://rancher2.docs.rancher.cn/docs/overview/_index
OS | Hostname | IP address | Node | Role |
---|---|---|---|---|
CentOS 7 1810 | nginx-master | 192.168.111.21 | Nginx primary server | Load balancing |
CentOS 7 1810 | nginx-backup | 192.168.111.22 | Nginx backup server | Load balancing |
ubuntu-18.04.3-live-server | rke-node1 | 192.168.111.50 | RKE node 1 | RKE cluster |
ubuntu-18.04.3-live-server | rke-node2 | 192.168.111.51 | RKE node 2 | RKE cluster |
ubuntu-18.04.3-live-server | rke-node3 | 192.168.111.52 | RKE node 3 | RKE cluster |
To keep blocked ports from breaking cluster setup, disable the firewall and SELinux up front.
CentOS:
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
Ubuntu:
sudo ufw disable
192.168.111.21 nginx-master
192.168.111.22 nginx-backup
192.168.111.50 rke-node1
192.168.111.51 rke-node2
192.168.111.52 rke-node3
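Assuming the entries above are meant for /etc/hosts on every node, the following is a minimal sketch of adding them without creating duplicates when the step is repeated. The function name add_host_entry and its optional third argument are hypothetical and not part of the original article.

```shell
#!/bin/sh
# add_host_entry IP NAME [FILE]
# Appends "IP NAME" to FILE (default: /etc/hosts) unless a line already maps
# that exact IP to that exact name. Hypothetical helper for illustration.
add_host_entry() {
    ip="$1"
    name="$2"
    file="${3:-/etc/hosts}"
    # awk compares whole fields, so 192.168.111.5 never matches 192.168.111.50.
    if awk -v ip="$ip" -v name="$name" '
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

A call such as `add_host_entry 192.168.111.50 rke-node1` would need to run as root when it targets /etc/hosts; passing a scratch file as the third argument allows a dry run first.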
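This installation requires the following CLI tools. Make sure they are installed and available in $PATH.
Install the CLI tools on the RKE nodes, and make sure all three nodes have them installed correctly.
kubectl - the Kubernetes command-line tool.
rke - Rancher Kubernetes Engine, a CLI for building Kubernetes clusters.
helm - the package manager for Kubernetes.
See the Helm version requirements to choose a Helm version for installing Rancher.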
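Once the tools are installed, a quick way to confirm they are reachable through $PATH is sketched below. The helper name check_tools is hypothetical; it relies only on the standard `command -v` shell builtin.

```shell
#!/bin/sh
# check_tools TOOL...
# Prints "ok: TOOL" for each tool found on $PATH and "missing: TOOL" (to stderr)
# for each one that is not. Returns 1 if anything is missing.
# Hypothetical helper for illustration.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

On each RKE node this could be run as `check_tools kubectl rke helm`.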
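For installation, refer to the official Kubernetes site. For certain reasons, snap is used here.
sudo apt-get install snapd
sudo snap install kubectl --classic # this install is slow, please be patient
# Verify the installation
kubectl help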
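For installation, refer to the official Rancher site. The binary is downloaded from GitHub and is fairly large, so sort out any network issues yourself.
wget https://github.com/rancher/rke/releases/download/v1.0.8/rke_linux-amd64
# Move the binary to /usr/local/bin/, rename it to rke, and make it executable
sudo mv rke_linux-amd64 /usr/local/bin/rke
sudo chmod +x /usr/local/bin/rke
# Verify the installation
rke --version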
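For installation, refer to the official Helm site. Helm is the package manager for Kubernetes; Helm v3 or later is required.
# Download the package
wget https://get.helm.sh/helm-v3.2.1-linux-amd64.tar.gz
# Extract
tar zxvf helm-v3.2.1-linux-amd64.tar.gz
# Move the binary to /usr/local/bin/
sudo mv linux-amd64/helm /usr/local/bin/helm
# Verify the installation
helm help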
Perform this step on the CentOS nodes
Install Nginx
# Download the Nginx package
wget http://nginx.org/download/nginx-1.17.10.tar.gz
# Extract the package
tar zxvf nginx-1.17.10.tar.gz
# Install the packages needed for compilation
yum install -y gcc gcc-c++ pcre pcre-devel zlib zlib-devel openssl openssl-devel libnl3-devel
# Enter the nginx directory. HTTPS is needed here, so build with --with-http_ssl_module; --with-stream enables the stream {} block used in the load-balancer configuration later
cd nginx-1.17.10
mkdir -p /usr/local/nginx
./configure --prefix=/usr/local/nginx --with-http_ssl_module --with-stream
# Install nginx
make && make install
# Create a symlink for the nginx command
ln -s /usr/local/nginx/sbin/nginx /usr/local/bin/nginx
# Verify the installation
nginx -V
# Start nginx
nginx
Install Keepalived
# Download the package
wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
# Extract the package
tar zxvf keepalived-2.0.20.tar.gz
# Build and install keepalived
cd keepalived-2.0.20
mkdir /usr/local/keepalived
./configure --prefix=/usr/local/keepalived/
make && make install
# Set up keepalived as a system service
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/keepalived
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
touch /etc/init.d/keepalived
chmod +x /etc/init.d/keepalived # the contents of this file are shown below
vim /etc/init.d/keepalived
# Configure keepalived
mkdir /etc/keepalived/
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
vim /etc/keepalived/keepalived.conf # the contents of keepalived.conf are shown below
# Start keepalived
systemctl start keepalived
systemctl enable keepalived
# Verify
systemctl status keepalived
# keepalived should now be running on both servers, one as master and one as backup. Running ip addr on the master should show the virtual IP address; the backup should not have it.
# Visit https://192.168.111.20 to verify the configuration
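The master/backup check described above can be scripted. The sketch below reads `ip addr` output on standard input and reports whether a given address is bound; the function name has_vip is hypothetical, and the interface name ens33 in the usage note is taken from the keepalived.conf in this article.

```shell
#!/bin/sh
# has_vip ADDRESS
# Reads `ip addr` output on stdin and succeeds if ADDRESS is bound to an
# interface. Hypothetical helper for illustration.
has_vip() {
    # Lines look like "inet 192.168.111.20/32 scope global ens33"; strip the
    # prefix length and compare the whole address, so .20 never matches .200.
    awk -v vip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
}
```

For example, `ip addr show ens33 | has_vip 192.168.111.20 && echo "this node holds the VIP"` should print the message on the master only.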
#!/bin/sh
# Contents of /etc/init.d/keepalived
. /etc/rc.d/init.d/functions
. /etc/sysconfig/keepalived
RETVAL=0
prog="keepalived"
start() {
    echo -n $"Starting $prog: "
    daemon keepalived ${KEEPALIVED_OPTIONS}
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
}
stop() {
    echo -n $"Stopping $prog: "
    killproc keepalived
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
}
reload() {
    echo -n $"Reloading $prog: "
    killproc keepalived -1
    RETVAL=$?
    echo
}
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    reload)
        reload
        ;;
    restart)
        stop
        start
        ;;
    condrestart)
        if [ -f /var/lock/subsys/$prog ]; then
            stop
            start
        fi
        ;;
    status)
        status keepalived
        RETVAL=$?
        ;;
    *)
        echo "Usage: $0 {start|stop|reload|restart|condrestart|status}"
        RETVAL=1
esac
exit $RETVAL
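# Contents of /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id 192.168.111.21 # this ID must be unique on the network; no two nodes may share it
}
vrrp_script chk_nginx { # a check script, because the nginx service state has to be monitored
script "/usr/local/keepalived/check_ng.sh"
interval 3
}
vrrp_instance VI_1 {
state MASTER # makes this node the master; set BACKUP on the backup server
interface ens33 # network interface to bind to
virtual_router_id 51 # VRRP group; master and backup must use the same value
priority 120 # priority; the node with the higher value becomes master
advert_int 1 # VRRP advertisement interval, in seconds
authentication { # authentication
auth_type PASS
auth_pass 1111
}
virtual_ipaddress { # virtual IP
192.168.111.20
}
track_script { # script to run
chk_nginx
}
}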
#!/bin/bash
# Contents of /usr/local/keepalived/check_ng.sh (must be executable: chmod +x)
d=$(date --date today +%Y%m%d_%H:%M:%S)
n=$(ps -C nginx --no-heading | wc -l)
if [ "$n" -eq 0 ]; then
    # nginx was built from source above and has no systemd unit, so start the binary directly
    /usr/local/nginx/sbin/nginx
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        echo "$d nginx down, keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi
Perform this step on the RKE nodes
# Remove old Docker versions
sudo apt-get remove docker docker-engine docker.io containerd runc
# Install prerequisite packages
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Add the stable apt repository
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
# Install Docker CE
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
# Verify the installation
docker info
# Add the current user to the "docker" group. This account is used later in the installation: the SSH user that RKE uses to reach a node must be a member of the docker group on that node
sudo usermod -aG docker $USER
Perform this step on the Nginx servers
# Update the nginx configuration file
# vim /usr/local/nginx/conf/nginx.conf
#user nobody;
worker_processes 4;
worker_rlimit_nofile 40000;
events {
worker_connections 8192;
}
stream {
upstream rancher_servers_http {
least_conn;
server 192.168.111.50:80 max_fails=3 fail_timeout=5s;
server 192.168.111.51:80 max_fails=3 fail_timeout=5s;
server 192.168.111.52:80 max_fails=3 fail_timeout=5s;
}
server {
listen 80;
proxy_pass rancher_servers_http;
}
upstream rancher_servers_https {
least_conn;
server 192.168.111.50:443 max_fails=3 fail_timeout=5s;
server 192.168.111.51:443 max_fails=3 fail_timeout=5s;
server 192.168.111.52:443 max_fails=3 fail_timeout=5s;
}
server {
listen 443;
proxy_pass rancher_servers_https;
}
}
Set up passwordless SSH login between the RKE nodes
# Generate an RSA key pair
ssh-keygen
# Copy this host's public key to the other two nodes to enable passwordless login
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
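# Note: each node must also register its key with itself; run these commands on all three nodes
# Verify
docker@rke-node3:~$ ssh [email protected] # from node3, connect to node1; ssh should not ask for a password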
Write the rancher-cluster.yml file
# vim rancher-cluster.yml
nodes:
  - address: 192.168.111.50 # host IP
    user: docker # a user that can run docker commands
    role: [controlplane,worker,etcd] # node roles
  - address: 192.168.111.51
    user: docker
    role: [controlplane,worker,etcd]
  - address: 192.168.111.52
    user: docker
    role: [controlplane,worker,etcd]
services:
  etcd:
    snapshot: true
    creation: 6h
    retention: 24
Run RKE to build the Kubernetes cluster
rke up --config ./rancher-cluster.yml
# Verify: the following message means the run succeeded.
# Finished building Kubernetes cluster successfully.
Pods are in the Running or Completed state.
For pods whose STATUS is Running, READY should show all containers running (for example, 3/3).
Pods whose STATUS is Completed are run-once jobs. For these pods, READY should be 0/1.
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx nginx-ingress-controller-tnsn4 1/1 Running 0 30s
ingress-nginx nginx-ingress-controller-tw2ht 1/1 Running 0 30s
ingress-nginx nginx-ingress-controller-v874b 1/1 Running 0 30s
kube-system canal-jp4hz 3/3 Running 0 30s
kube-system canal-z2hg8 3/3 Running 0 30s
kube-system canal-z6kpw 3/3 Running 0 30s
kube-system kube-dns-7588d5b5f5-sf4vh 3/3 Running 0 30s
kube-system kube-dns-autoscaler-5db9bbb766-jz2k6 1/1 Running 0 30s
kube-system metrics-server-97bc649d5-4rl2q 1/1 Running 0 30s
kube-system rke-ingress-controller-deploy-job-bhzgm 0/1 Completed 0 30s
kube-system rke-kubedns-addon-deploy-job-gl7t4 0/1 Completed 0 30s
kube-system rke-metrics-addon-deploy-job-7ljkc 0/1 Completed 0 30s
kube-system rke-network-plugin-deploy-job-6pbgj 0/1 Completed 0 30s
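The health criteria for the pod list (Running with every container ready, or Completed) can be checked mechanically. The following is a minimal sketch that filters `kubectl get pods --all-namespaces` output read from standard input; the function name unhealthy_pods is hypothetical, and it assumes the six-column layout shown above.

```shell
#!/bin/sh
# unhealthy_pods
# Reads `kubectl get pods --all-namespaces` output on stdin and prints every
# pod that is neither Completed nor Running with all containers ready.
# Returns 1 if any such pod is found. Hypothetical helper for illustration.
unhealthy_pods() {
    awk 'NR > 1 {
        split($3, r, "/")    # READY column: ready count / total count
        if ($4 == "Running" && r[1] == r[2]) next
        if ($4 == "Completed") next
        print $1 "/" $2 " (" $3 " " $4 ")"
        bad = 1
    }
    END { exit (bad ? 1 : 0) }'
}
```

A typical call would be `kubectl get pods --all-namespaces | unhealthy_pods && echo "all pods healthy"`.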
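Keep the following files safe
rancher-cluster.yml: the RKE cluster configuration file.
kube_config_rancher-cluster.yml: the cluster's kubeconfig file; it contains credentials for full access to the cluster.
rancher-cluster.rkestate: the Kubernetes cluster state file; it also contains credentials for full access to the cluster.
After a successful run, a file named kube_config_rancher-cluster.yml is generated in the current directory. Copy it to .kube/kube_config_rancher-cluster.yml
# Run these in the user's home directory
mkdir -p .kube
cp kube_config_rancher-cluster.yml .kube/
export KUBECONFIG=$(pwd)/kube_config_rancher-cluster.yml
# Verify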
kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.111.50 Ready controlplane,etcd,worker 5m47s v1.17.5
192.168.111.51 Ready controlplane,etcd,worker 5m46s v1.17.5
192.168.111.52 Ready controlplane,etcd,worker 5m47s v1.17.5
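Check the state of the cluster's pods
Confirm that all required pods and containers are healthy before continuing.
Add the Helm chart repository
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
Create a namespace for Rancher
kubectl create namespace cattle-system
Use the self-signed certificate generated by Rancher
# Install the CustomResourceDefinition resources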
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.12/deploy/manifests/00-crds.yaml
# **Important:**
# If you are running Kubernetes v1.15 or lower,
# add the `--validate=false` flag to the kubectl apply command above,
# otherwise you will get a validation error on cert-manager's CustomResourceDefinition resources
# related to the x-kubernetes-preserve-unknown-fields field.
# This is a benign error caused by the way kubectl performs resource validation.
# Create a namespace for cert-manager
kubectl create namespace cert-manager
# Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io
# Update the local Helm chart repository cache
helm repo update
# Install the cert-manager Helm chart
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--version v0.12.0
# Verify
kubectl get pods --namespace cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-754d9b75d9-6xbk4 1/1 Running 0 94s
cert-manager-cainjector-85fbdf788-hthfn 1/1 Running 0 94s
cert-manager-webhook-76f9b64b45-bmt5z 1/1 Running 0 94s
Deploy the Rancher cluster
helm install rancher rancher-stable/rancher \
--namespace cattle-system \
--set hostname=rancher.hzqx.com
Wait for the Rancher cluster to come up
kubectl -n cattle-system rollout status deploy/rancher
Waiting for deployment "rancher" rollout to finish: 0 of 3 updated replicas are available...
deployment "rancher" successfully rolled out
Setup is complete. Visit https://rancher.hzqx.com; the default username and password are both admin.