Manually building a highly available k8s 1.16.6 cluster: system initialization and global variables

1. Cluster hosts

k8s-01:192.168.0.71
k8s-02:192.168.0.72
k8s-03:192.168.0.73

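The etcd cluster, the master cluster and the worker cluster are all deployed together on these three hosts.

Unless stated otherwise, the initialization steps in this document must be performed on all nodes.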

2. Set the host names

Set a permanent host name on each machine, then log in again.

$ hostnamectl set-hostname k8s-01 # on 192.168.0.71
$ hostnamectl set-hostname k8s-02 # on 192.168.0.72
$ hostnamectl set-hostname k8s-03 # on 192.168.0.73

On every host, add the host name to IP mappings to /etc/hosts:

cat >> /etc/hosts <<EOF
192.168.0.71 k8s-01
192.168.0.72 k8s-02
192.168.0.73 k8s-03
EOF
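
Running the `cat >> /etc/hosts` command a second time appends the same three lines again. Below is a minimal sketch of an idempotent variant. The helper name `add_host_entry` and the `HOSTS_FILE` variable are made up for this illustration, and the file defaults to a scratch file so the snippet can be tried without touching /etc/hosts.

```shell
#!/bin/bash
# Hypothetical helper: append "IP NAME" to a hosts file only if that exact line is missing.
# HOSTS_FILE defaults to a scratch file; point it at /etc/hosts to use it for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    local ip="$1" name="$2"
    # -x matches the whole line, -F treats the pattern as a fixed string
    if ! grep -qxF "${ip} ${name}" "${HOSTS_FILE}" 2>/dev/null; then
        echo "${ip} ${name}" >> "${HOSTS_FILE}"
    fi
}

add_host_entry 192.168.0.71 k8s-01
add_host_entry 192.168.0.72 k8s-02
add_host_entry 192.168.0.73 k8s-03
# Repeating an existing entry adds nothing.
add_host_entry 192.168.0.71 k8s-01

cat "${HOSTS_FILE}"
```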

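3. Set up trust between the nodes

This step is only needed on k8s-01: it lets the root account log in to all nodes without a password. An environment variable script is introduced later; with it, almost the whole cluster can be deployed from node 1.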

$ ssh-keygen -t rsa
$ ssh-copy-id root@k8s-01
$ ssh-copy-id root@k8s-02
$ ssh-copy-id root@k8s-03
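
Before relying on the trust relationship in later scripts, it can help to confirm that every node really accepts key-based login. The sketch below is one way to do that; the function names `check_login` and `check_all` are hypothetical, and the `SSH` variable exists only so the functions can be exercised without real hosts.

```shell
#!/bin/bash
# SSH is overridable only so the functions can be tried without real hosts
# (an assumption of this sketch).
SSH="${SSH:-ssh}"

# Returns 0 if root@<host> accepts key-based login without prompting.
check_login() {
    local host="$1"
    # BatchMode=yes makes ssh fail instead of asking for a password.
    ${SSH} -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true
}

# Prints one status line per host given as an argument.
check_all() {
    local host
    for host in "$@"; do
        if check_login "${host}"; then
            echo "${host}: ok"
        else
            echo "${host}: FAILED"
        fi
    done
}
```

On k8s-01 it would be called as `check_all k8s-01 k8s-02 k8s-03`.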

4. Create the required directories

Create the following directories on all three hosts:

$ mkdir -p /opt/k8s/{bin,work} /etc/{kubernetes,etcd}/cert

5. Update the PATH variable

Add the environment variable on all three hosts:

$ echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc
$ source /root/.bashrc

The /opt/k8s/bin directory holds the programs that are downloaded and installed.

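6. Environment variables

Now introduce an environment variable script to simplify the whole deployment. At first it may not be obvious why this is worth doing, but the further you go, the more convenient this script turns out to be.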

cat > /opt/k8s/bin/environment.sh << "EOF"
#!/usr/bin/bash

# Encryption key needed to generate the EncryptionConfig
# (evaluated each time this file is sourced, so every new shell gets a different value)
export ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)

# Array of the cluster machine IPs
export NODE_IPS=(192.168.0.71 192.168.0.72 192.168.0.73)

# Array of the host names corresponding to the IPs above
export NODE_NAMES=(k8s-01 k8s-02 k8s-03)

# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.0.71:2379,https://192.168.0.72:2379,https://192.168.0.73:2379"

# IPs and ports used for communication between etcd cluster members
export ETCD_NODES="k8s-01=https://192.168.0.71:2380,k8s-02=https://192.168.0.72:2380,k8s-03=https://192.168.0.73:2380"

# Address and port of the kube-apiserver reverse proxy (kube-nginx)
export KUBE_APISERVER="https://127.0.0.1:8443"

# Name of the network interface used for node interconnection
export IFACE="eth0"

# etcd data directory
export ETCD_DATA_DIR="/data/k8s/etcd/data"

# etcd WAL directory; an SSD partition, or a partition on a different disk from ETCD_DATA_DIR, is recommended
export ETCD_WAL_DIR="/data/k8s/etcd/wal"

# Data directory for the k8s components
export K8S_DIR="/data/k8s/k8s"

## Choose one of DOCKER_DIR and CONTAINERD_DIR; docker is used here
# docker data directory
export DOCKER_DIR="/data/k8s/docker"

# containerd data directory
# export CONTAINERD_DIR="/data/k8s/containerd"

## The parameters below usually do not need to be changed

# Token used for TLS Bootstrapping; it can be generated with: head -c 16 /dev/urandom | od -An -t x | tr -d ' '
export BOOTSTRAP_TOKEN="05796b965acf26efc6bf5133bb6a8ef3"

# Preferably use currently unused network ranges for the service CIDR and the Pod CIDR

# Service CIDR: not routable before deployment, routable inside the cluster afterwards (guaranteed by kube-proxy)
export SERVICE_CIDR="10.254.0.0/16"

# Pod CIDR, a /16 range is recommended: not routable before deployment, routable inside the cluster afterwards (guaranteed by flanneld)
export CLUSTER_CIDR="172.30.0.0/16"

# Service port range (NodePort range)
export NODE_PORT_RANGE="30000-32767"

# kubernetes service IP (usually the first IP in SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"

# Cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"

# Cluster DNS domain (without a trailing dot)
export CLUSTER_DNS_DOMAIN="cluster.local"

# Add the binary directory /opt/k8s/bin to PATH
export PATH=/opt/k8s/bin:$PATH
EOF

Adjust the IP addresses to match your environment.
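
When the IPs change, ETCD_ENDPOINTS and ETCD_NODES have to be edited by hand as well, which is easy to get wrong. As a rough sketch (not part of the original script), both strings can be derived from the NODE_IPS and NODE_NAMES arrays and compared with the hand-written values. The variable names `endpoints` and `nodes` are chosen only for this example.

```shell
#!/bin/bash
NODE_IPS=(192.168.0.71 192.168.0.72 192.168.0.73)
NODE_NAMES=(k8s-01 k8s-02 k8s-03)

endpoints=""
nodes=""
# Walk both arrays by index so that each IP is paired with its host name.
for i in "${!NODE_IPS[@]}"; do
    # ${var:+,} expands to a comma only when var is already non-empty.
    endpoints+="${endpoints:+,}https://${NODE_IPS[$i]}:2379"
    nodes+="${nodes:+,}${NODE_NAMES[$i]}=https://${NODE_IPS[$i]}:2380"
done

echo "ETCD_ENDPOINTS=${endpoints}"
echo "ETCD_NODES=${nodes}"
```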

7. Install the base dependency packages

Install the dependency packages on every host:

$ yum install -y epel-release
$ yum install -y screen lrzsz tree openssl openssh-clients openssl-devel openssh-server telnet iftop iotop ntpdate dos2unix lsof net-tools conntrack ipvsadm ipset jq iptables curl sysstat libseccomp wget socat git mtr gcc gcc-c++ cmake zip unzip sudo psmisc

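The above is the traditional approach, which requires running the installation on every host. Alternatively, use the variables from the environment.sh script defined earlier.

Define a script: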

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "yum install -y epel-release screen lrzsz tree openssl openssh-clients openssl-devel openssh-server telnet iftop iotop ntpdate dos2unix lsof net-tools conntrack ipvsadm ipset jq iptables curl sysstat libseccomp wget socat git mtr gcc gcc-c++ cmake zip unzip sudo psmisc"
done
EOF

Note: the first EOF must be quoted so that the variables in the text are not expanded when the file is written.
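
The difference between a quoted and an unquoted here-document delimiter is easy to see in a small experiment. The file names and the value of `node_ip` below are placeholders for this demonstration only.

```shell
#!/bin/bash
node_ip="set-at-write-time"
tmpdir=$(mktemp -d)

# Unquoted delimiter: ${node_ip} is expanded while the file is being written.
cat > "${tmpdir}/unquoted.txt" <<EOF
echo ${node_ip}
EOF

# Quoted delimiter: the text is written literally and is expanded only
# when the generated script runs later.
cat > "${tmpdir}/quoted.txt" <<"EOF"
echo ${node_ip}
EOF

cat "${tmpdir}/unquoted.txt"   # echo set-at-write-time
cat "${tmpdir}/quoted.txt"     # echo ${node_ip}
```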

Run the script (make it executable first with chmod +x deploy.sh):

[root@k8s-01 ~]# ./deploy.sh 
>>> 192.168.0.71
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * epel: hkg.mirror.rackspace.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
Package epel-release-7-12.noarch already installed and latest version
Package yum-3.4.3-163.el7.centos.noarch already installed and latest version
No package install available.
Package screen-4.1.0-0.25.20120314git3c2946.el7.x86_64 already installed and latest version
Package lrzsz-0.12.20-36.el7.x86_64 already installed and latest version
Package tree-1.6.0-10.el7.x86_64 already installed and latest version
Package 1:openssl-1.0.2k-19.el7.x86_64 already installed and latest version
Package openssh-clients-7.4p1-21.el7.x86_64 already installed and latest version
Package 1:openssl-devel-1.0.2k-19.el7.x86_64 already installed and latest version
Package openssh-server-7.4p1-21.el7.x86_64 already installed and latest version
Package 1:telnet-0.17-64.el7.x86_64 already installed and latest version
Package iftop-1.0-0.21.pre4.el7.x86_64 already installed and latest version
Package iotop-0.6-4.el7.noarch already installed and latest version
Package ntpdate-4.2.6p5-29.el7.centos.x86_64 already installed and latest version
Package dos2unix-6.0.3-7.el7.x86_64 already installed and latest version
Package lsof-4.87-6.el7.x86_64 already installed and latest version
Package net-tools-2.0-0.25.20131004git.el7.x86_64 already installed and latest version
Package conntrack-tools-1.4.4-5.el7_7.2.x86_64 already installed and latest version
Package ipvsadm-1.27-7.el7.x86_64 already installed and latest version
Package ipset-7.1-1.el7.x86_64 already installed and latest version
Package jq-1.6-1.el7.x86_64 already installed and latest version
Package iptables-1.4.21-33.el7.x86_64 already installed and latest version
Package curl-7.29.0-54.el7_7.2.x86_64 already installed and latest version
Package sysstat-10.1.5-18.el7_7.1.x86_64 already installed and latest version
Package libseccomp-2.3.1-3.el7.x86_64 already installed and latest version
Package wget-1.14-18.el7_6.1.x86_64 already installed and latest version
Package socat-1.7.3.2-2.el7.x86_64 already installed and latest version
Package git-1.8.3.1-21.el7_7.x86_64 already installed and latest version
Package 2:mtr-0.85-7.el7.x86_64 already installed and latest version
Package gcc-4.8.5-39.el7.x86_64 already installed and latest version
Package gcc-c++-4.8.5-39.el7.x86_64 already installed and latest version
Package cmake-2.8.12.2-2.el7.x86_64 already installed and latest version
Package zip-3.0-11.el7.x86_64 already installed and latest version
Package unzip-6.0-20.el7.x86_64 already installed and latest version
Package sudo-1.8.23-4.el7_7.2.x86_64 already installed and latest version
Package psmisc-22.20-16.el7.x86_64 already installed and latest version
Nothing to do
>>> 192.168.0.72
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
Package epel-release-7-12.noarch already installed and latest version
Package yum-3.4.3-163.el7.centos.noarch already installed and latest version
No package install available.
Package screen-4.1.0-0.25.20120314git3c2946.el7.x86_64 already installed and latest version
Package lrzsz-0.12.20-36.el7.x86_64 already installed and latest version
Package tree-1.6.0-10.el7.x86_64 already installed and latest version
Package 1:openssl-1.0.2k-19.el7.x86_64 already installed and latest version
Package openssh-clients-7.4p1-21.el7.x86_64 already installed and latest version
Package 1:openssl-devel-1.0.2k-19.el7.x86_64 already installed and latest version
Package openssh-server-7.4p1-21.el7.x86_64 already installed and latest version
Package 1:telnet-0.17-64.el7.x86_64 already installed and latest version
Package iftop-1.0-0.21.pre4.el7.x86_64 already installed and latest version
Package iotop-0.6-4.el7.noarch already installed and latest version
Package ntpdate-4.2.6p5-29.el7.centos.x86_64 already installed and latest version
Package dos2unix-6.0.3-7.el7.x86_64 already installed and latest version
Package lsof-4.87-6.el7.x86_64 already installed and latest version
Package net-tools-2.0-0.25.20131004git.el7.x86_64 already installed and latest version
Package conntrack-tools-1.4.4-5.el7_7.2.x86_64 already installed and latest version
Package ipvsadm-1.27-7.el7.x86_64 already installed and latest version
Package ipset-7.1-1.el7.x86_64 already installed and latest version
Package jq-1.6-1.el7.x86_64 already installed and latest version
Package iptables-1.4.21-33.el7.x86_64 already installed and latest version
Package curl-7.29.0-54.el7_7.2.x86_64 already installed and latest version
Package sysstat-10.1.5-18.el7_7.1.x86_64 already installed and latest version
Package libseccomp-2.3.1-3.el7.x86_64 already installed and latest version
Package wget-1.14-18.el7_6.1.x86_64 already installed and latest version
Package socat-1.7.3.2-2.el7.x86_64 already installed and latest version
Package git-1.8.3.1-21.el7_7.x86_64 already installed and latest version
Package 2:mtr-0.85-7.el7.x86_64 already installed and latest version
Package gcc-4.8.5-39.el7.x86_64 already installed and latest version
Package gcc-c++-4.8.5-39.el7.x86_64 already installed and latest version
Package cmake-2.8.12.2-2.el7.x86_64 already installed and latest version
Package zip-3.0-11.el7.x86_64 already installed and latest version
Package unzip-6.0-20.el7.x86_64 already installed and latest version
Package sudo-1.8.23-4.el7_7.2.x86_64 already installed and latest version
Package psmisc-22.20-16.el7.x86_64 already installed and latest version
Nothing to do
>>> 192.168.0.73
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * epel: mirrors.tuna.tsinghua.edu.cn
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
Package epel-release-7-12.noarch already installed and latest version
Package yum-3.4.3-163.el7.centos.noarch already installed and latest version
No package install available.
Package screen-4.1.0-0.25.20120314git3c2946.el7.x86_64 already installed and latest version
Package lrzsz-0.12.20-36.el7.x86_64 already installed and latest version
Package tree-1.6.0-10.el7.x86_64 already installed and latest version
Package 1:openssl-1.0.2k-19.el7.x86_64 already installed and latest version
Package openssh-clients-7.4p1-21.el7.x86_64 already installed and latest version
Package 1:openssl-devel-1.0.2k-19.el7.x86_64 already installed and latest version
Package openssh-server-7.4p1-21.el7.x86_64 already installed and latest version
Package 1:telnet-0.17-64.el7.x86_64 already installed and latest version
Package iftop-1.0-0.21.pre4.el7.x86_64 already installed and latest version
Package iotop-0.6-4.el7.noarch already installed and latest version
Package ntpdate-4.2.6p5-29.el7.centos.x86_64 already installed and latest version
Package dos2unix-6.0.3-7.el7.x86_64 already installed and latest version
Package lsof-4.87-6.el7.x86_64 already installed and latest version
Package net-tools-2.0-0.25.20131004git.el7.x86_64 already installed and latest version
Package conntrack-tools-1.4.4-5.el7_7.2.x86_64 already installed and latest version
Package ipvsadm-1.27-7.el7.x86_64 already installed and latest version
Package ipset-7.1-1.el7.x86_64 already installed and latest version
Package jq-1.6-1.el7.x86_64 already installed and latest version
Package iptables-1.4.21-33.el7.x86_64 already installed and latest version
Package curl-7.29.0-54.el7_7.2.x86_64 already installed and latest version
Package sysstat-10.1.5-18.el7_7.1.x86_64 already installed and latest version
Package libseccomp-2.3.1-3.el7.x86_64 already installed and latest version
Package wget-1.14-18.el7_6.1.x86_64 already installed and latest version
Package socat-1.7.3.2-2.el7.x86_64 already installed and latest version
Package git-1.8.3.1-21.el7_7.x86_64 already installed and latest version
Package 2:mtr-0.85-7.el7.x86_64 already installed and latest version
Package gcc-4.8.5-39.el7.x86_64 already installed and latest version
Package gcc-c++-4.8.5-39.el7.x86_64 already installed and latest version
Package cmake-2.8.12.2-2.el7.x86_64 already installed and latest version
Package zip-3.0-11.el7.x86_64 already installed and latest version
Package unzip-6.0-20.el7.x86_64 already installed and latest version
Package sudo-1.8.23-4.el7_7.2.x86_64 already installed and latest version
Package psmisc-22.20-16.el7.x86_64 already installed and latest version
Nothing to do

Everything was already installed, so yum reports that there is nothing to do.

8. Turn off the firewall

Turn off the firewall, flush the firewall rules and set the default forwarding policy:

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl stop firewalld && systemctl disable firewalld"
    ssh root@${node_ip} "iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT"   
done
EOF

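Write this content to deploy.sh and run it. It again loops over the hosts using the environment variables defined earlier, which is really convenient.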
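
Since every step rewrites deploy.sh with the same loop, the loop could also be wrapped in a small function. This is only a sketch: `run_on_nodes` and the `DRY_RUN` switch are hypothetical names, not something the original scripts define.

```shell
#!/bin/bash
# Hypothetical helper: run one command on every node passed as an argument.
# With DRY_RUN=1 it only prints what it would execute.
run_on_nodes() {
    local cmd="$1"; shift
    local node_ip
    for node_ip in "$@"; do
        echo ">>> ${node_ip}"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh root@${node_ip} ${cmd}"
        else
            # -n keeps ssh from reading stdin
            ssh -n "root@${node_ip}" "${cmd}"
        fi
    done
}

# Dry run: nothing is executed on the hosts.
DRY_RUN=1 run_on_nodes "systemctl stop firewalld" 192.168.0.71 192.168.0.72
```

After `source /opt/k8s/bin/environment.sh`, a real run would look like `run_on_nodes "systemctl stop firewalld && systemctl disable firewalld" "${NODE_IPS[@]}"`.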

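9. Turn off the swap partition

Turn off the swap partition, otherwise kubelet fails to start (alternatively, set the kubelet flag --fail-swap-on to false to skip the swap check). Do this on every host, and comment out the corresponding entries in /etc/fstab so that the swap partition is not mounted again at boot.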

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab"
done
EOF
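
The sed expression in the script prepends `#` to every line containing ` swap `, so running it twice produces `##`. The sketch below tries a guarded variant on a sample file instead of the real /etc/fstab. The sample contents and the function name `comment_swap` are assumptions made for illustration.

```shell
#!/bin/bash
sample=$(mktemp)
cat > "${sample}" <<'EOF'
/dev/mapper/centos-root /    xfs  defaults 0 0
/dev/mapper/centos-swap swap swap defaults 0 0
EOF

# Guarded variant (an assumption of this sketch, not the command from deploy.sh):
# only lines that do not already start with '#' are commented out.
comment_swap() {
    sed '/^[^#].* swap /s/^/#/'
}

once=$(comment_swap < "${sample}")
twice=$(printf '%s\n' "${once}" | comment_swap)
# A second pass leaves the already commented line alone.
printf '%s\n' "${twice}"
```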

10. Turn off SELinux

Turn off SELinux, otherwise kubelet may report Permission denied when mounting directories:

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config"
done
EOF
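
To see what the sed expression changes before touching /etc/selinux/config, it can be tried on a sample file. The sample contents below are an assumption modelled on a typical config file. Note that the `^SELINUX=` anchor leaves the SELINUXTYPE line untouched.

```shell
#!/bin/bash
sample=$(mktemp)
cat > "${sample}" <<'EOF'
# This file controls the state of SELinux on the system.
SELINUX=enforcing
SELINUXTYPE=targeted
EOF

# Same expression as in deploy.sh, printed to stdout instead of edited in place.
sed 's/^SELINUX=.*/SELINUX=disabled/' "${sample}"
```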

11. Load the kernel modules

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "modprobe br_netfilter && modprobe ip_vs"
done
EOF
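
Modules loaded with modprobe are gone after a reboot. On systemd-based systems such as CentOS 7, module names listed in a file under /etc/modules-load.d are loaded at boot. The sketch below writes such a file. The file name kubernetes.conf and the `MODULES_DIR` variable are choices made for this example, and the directory defaults to a scratch location so nothing on the system is changed.

```shell
#!/bin/bash
# MODULES_DIR defaults to a scratch directory; on a real host it would be /etc/modules-load.d
MODULES_DIR="${MODULES_DIR:-$(mktemp -d)}"

# One module name per line; the file name kubernetes.conf is an arbitrary choice.
cat > "${MODULES_DIR}/kubernetes.conf" <<EOF
br_netfilter
ip_vs
EOF

cat "${MODULES_DIR}/kubernetes.conf"
```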

12. Tune the kernel parameters

cat > kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
net.ipv4.tcp_tw_recycle=0
net.ipv4.neigh.default.gc_thresh1=1024
net.ipv4.neigh.default.gc_thresh2=2048
net.ipv4.neigh.default.gc_thresh3=4096
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF

Distribute kubernetes.conf to the three hosts and load it.

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    scp kubernetes.conf root@${node_ip}:/etc/sysctl.d/kubernetes.conf
    ssh root@${node_ip} "sysctl -p /etc/sysctl.d/kubernetes.conf"
done
EOF

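Note: tcp_tw_recycle is turned off because it conflicts with NAT and can make services unreachable. The parameter was removed in Linux 4.12, so on such kernels sysctl reports an error for that line and it can be deleted.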
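
In a sysctl file, a key that appears more than once is applied repeatedly and the last value wins, which can hide a typo. A quick check for repeated keys might look like the sketch below. The function name `duplicate_keys` and the sample file are made up for illustration.

```shell
#!/bin/bash
# Print every key that appears more than once in a sysctl-style file.
duplicate_keys() {
    # Drop comments and blank lines, keep only the text before "=",
    # then list the keys that occur more than once.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" \
        | sed 's/[[:space:]]*=.*//' \
        | sort | uniq -d
}

sample=$(mktemp)
cat > "${sample}" <<'EOF'
net.ipv4.ip_forward=1
vm.swappiness=0
vm.swappiness=10
EOF

duplicate_keys "${sample}"   # prints vm.swappiness
```

Running `duplicate_keys kubernetes.conf` before distributing the file should print nothing.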

13. Set the time zone and synchronize the time

Set the system time zone:

$ timedatectl set-timezone Asia/Shanghai

Set up time synchronization:

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} '/usr/sbin/ntpdate ntp1.aliyun.com &> /dev/null &&  hwclock --systohc &> /dev/null && echo "*/5 * * * * /usr/sbin/ntpdate ntp1.aliyun.com &&  hwclock --systohc" > /var/spool/cron/root'
done
EOF
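
The script writes the cron entry with `>`, which replaces whatever root's crontab already contained. If existing entries should be kept, the entry can be merged in instead. The sketch below shows the idea on plain text; the function name `with_cron_line` is hypothetical.

```shell
#!/bin/bash
CRON_LINE='*/5 * * * * /usr/sbin/ntpdate ntp1.aliyun.com && hwclock --systohc'

# Reads an existing crontab on stdin and prints it with CRON_LINE present exactly once.
with_cron_line() {
    # Drop any exact copy of the line; grep exits non-zero when nothing is left, hence "|| true".
    grep -vxF "${CRON_LINE}" || true
    echo "${CRON_LINE}"
}

# Example with a made-up existing entry:
printf '%s\n' '0 3 * * * /usr/bin/true' | with_cron_line
```

On a node this could be wired up as `crontab -l 2>/dev/null | with_cron_line | crontab -`, assuming the crontab command is used instead of writing to /var/spool/cron/root directly.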

14. Turn off unneeded services

Turn off the postfix service:

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl stop postfix && systemctl disable postfix"
done
EOF

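15. Distribute the environment variable script

All environment variables used later are defined in environment.sh. Distribute this file to the corresponding directory on the three hosts.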

cat > deploy.sh << "EOF"
#!/bin/bash
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
    echo ">>> ${node_ip}"
    scp /opt/k8s/bin/environment.sh root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
done
EOF

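16. Upgrade the kernel

The 3.10.x kernel that ships with CentOS 7.x has several bugs that make Docker and Kubernetes unstable, for example:

1. Newer Docker versions (after 1.13) enable the kernel memory accounting feature, which the 3.10 kernel supports only experimentally and which cannot be turned off. Under heavy node load, such as frequently starting and stopping containers, this causes a cgroup memory leak;

2. A network device reference count leak, which produces errors like: "kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1";

Possible solutions:

1) Upgrade the kernel to 4.4.x or later;

2) Or compile the kernel manually with the CONFIG_MEMCG_KMEM feature disabled;

3) Or install Docker 18.09.1 or later, which fixes the problem. However, kubelet also sets kmem (it vendors runc), so kubelet has to be rebuilt with GOFLAGS="-tags=nokmem":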

git clone --branch v1.14.1 --single-branch --depth 1 https://github.com/kubernetes/kubernetes
cd kubernetes
KUBE_GIT_VERSION=v1.14.1 ./build/run.sh make kubelet GOFLAGS="-tags=nokmem"

Here the kernel upgrade approach is used:

yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
# After installation, check whether the menuentry for the new kernel in /boot/grub2/grub.cfg contains an initrd16 line; if not, install again!
yum --enablerepo=elrepo-kernel install -y kernel-lt
# Boot from the new kernel by default
grub2-set-default 0

Reboot the host:

sync
reboot
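
After the reboot it is worth confirming that the host is really running the new kernel. The sketch below compares `uname -r` with 4.4 using GNU `sort -V`; the function name `version_ge` is made up for this example.

```shell
#!/bin/bash
# Returns 0 when version $1 is greater than or equal to version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

running=$(uname -r)
if version_ge "${running}" "4.4"; then
    echo "kernel ${running} is 4.4 or newer"
else
    echo "kernel ${running} is older than 4.4"
fi
```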

References:

  1. Reference for the kernel parameters: https://docs.openshift.com/enterprise/3.2/admin_guide/overcommit.html
  2. Discussion of the 3.10.x kernel kmem bugs and their fixes:
    1. https://github.com/kubernetes/kubernetes/issues/61937
    2. https://support.mesosphere.com/s/article/Critical-Issue-KMEM-MSPH-2018-0006
    3. https://pingcap.com/blog/try-to-fix-two-linux-kernel-bugs-while-testing-tidb-operator-in-k8s/
