This project aims to provide a tool for quickly deploying highly available k8s clusters, and also strives to serve as a reference for k8s practice and usage. It deploys components from binaries and automates the process with ansible-playbook; you can either run the one-click install script or follow the installation guide and install each component step by step.
Role | Count | Description | Node IP |
---|---|---|---|
Deploy node | 1 | Runs the ansible/ezctl commands; a dedicated host is recommended | 192.168.17.130 |
etcd nodes | 3 | The etcd cluster needs an odd number of members (1, 3, 5, ...); it is often co-located with the master nodes | 192.168.17.140, 192.168.17.141, 192.168.17.142 |
master nodes | 3 | A highly available cluster needs at least 2; three are used here | 192.168.17.130, 192.168.17.131, 192.168.17.132 |
worker nodes | 2 | Nodes that run application workloads; raise machine specs or add nodes as needed | 192.168.17.150, 192.168.17.151 |
haproxy | 2 | Load-balances the api-server; together with keepalived it provides api-server high availability | 192.168.17.160, 192.168.17.161 |
Machine specs:
master nodes: 4 cores / 8GB RAM / 50GB disk
worker nodes: 8 cores / 32GB RAM / 200GB+ disk recommended
The architecture is shown in the diagram below; the IPs are those from the table above.
Note: with the default configuration, containers and the kubelet consume disk space under /var. If your disks are partitioned differently, set the container/kubelet data directories in config.yml:
CONTAINERD_STORAGE_DIR DOCKER_STORAGE_DIR KUBELET_ROOT_DIR
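For example, once the cluster config has been generated (see the ezctl new step below), all three directories could be pointed at a dedicated /data partition along these lines (a minimal sketch; the /data/* paths are illustrative):
# Point container and kubelet storage at a dedicated /data partition
sed -i \
  -e 's#^CONTAINERD_STORAGE_DIR:.*#CONTAINERD_STORAGE_DIR: "/data/containerd"#' \
  -e 's#^DOCKER_STORAGE_DIR:.*#DOCKER_STORAGE_DIR: "/data/docker"#' \
  -e 's#^KUBELET_ROOT_DIR:.*#KUBELET_ROOT_DIR: "/data/kubelet"#' \
  /etc/kubeasz/clusters/k8s-cluster/config.yml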
# Install ansible via pip (the -i flag selects the Aliyun PyPI mirror)
pip3 install ansible -i https://mirrors.aliyun.com/pypi/simple
[root@master01 harbor]# ansible --version
[DEPRECATION WARNING]: Ansible will require Python 3.8 or newer on the controller starting with Ansible 2.12. Current version: 3.6.8 (default, Nov 16 2020, 16:55:22) [GCC
4.8.5 20150623 (Red Hat 4.8.5-44)]. This feature will be removed from ansible-core in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False
in ansible.cfg.
ansible [core 2.11.10]
config file = None
configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.6/site-packages/ansible
ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/local/bin/ansible
python version = 3.6.8 (default, Nov 16 2020, 16:55:22) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]
jinja version = 3.0.3
libyaml = True
# Generate an SSH key pair
ssh-keygen
# Install sshpass, used to push the public key to every k8s server
yum install -y sshpass
# Script that distributes the SSH key and syncs the Harbor certificate
[root@master01 ~]# vim copysshid.sh
#!/bin/bash
IP="
192.168.17.130
192.168.17.131
192.168.17.132
192.168.17.133
192.168.17.140
192.168.17.141
192.168.17.142
192.168.17.150
192.168.17.151
192.168.17.160
192.168.17.161
"
for node in ${IP};do
  sshpass -p 123456 ssh-copy-id -o StrictHostKeyChecking=no ${node}
  if [ $? -eq 0 ];then
    echo "${node} ssh key copied successfully"
    ssh ${node} "mkdir -p /etc/docker/certs.d/harbor.dujie.com"
    scp -r /root/harbor-install/harbor/ssl/harbor-ca.crt ${node}:/etc/docker/certs.d/harbor.dujie.com/harbor-ca.crt
    ssh ${node} "echo '192.168.17.130 harbor.dujie.com' >> /etc/hosts"
    scp -r /root/.docker ${node}:/root
  else
    echo "${node} ssh key copy failed"
  fi
done
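After running the script, a quick spot check against any node (192.168.17.150 here) confirms passwordless login and the certificate sync:
bash copysshid.sh
# Passwordless login should work and the Harbor CA should be in place
ssh 192.168.17.150 "hostname && ls /etc/docker/certs.d/harbor.dujie.com/"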
# Install keepalived and haproxy on both haproxy hosts
yum install -y keepalived
# The keepalived configuration on the haproxy1 host is as follows
[root@haproxy01 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   # vrrp_strict is disabled: strict mode blocks traffic to the VIP and conflicts
   # with the password authentication configured below
   #vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    # master
    state MASTER
    interface ens33
    virtual_router_id 51
    # priority value (the higher one wins the election)
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        # VIP addresses
        192.168.17.188 dev ens33 label ens33:0
        192.168.17.189 dev ens33 label ens33:1
        192.168.17.190 dev ens33 label ens33:2
    }
}
# The keepalived configuration on the haproxy2 host is as follows:
[root@haproxy02 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   # vrrp_strict is disabled here as well, for the same reasons as on haproxy1
   #vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 80
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.17.188 dev ens33 label ens33:0
        192.168.17.189 dev ens33 label ens33:1
        192.168.17.190 dev ens33 label ens33:2
    }
}
# Allow haproxy to bind to the VIP even while it lives on the other node
echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
sysctl -p
# After configuring, start keepalived and check that the VIP sits on haproxy1's NIC. To test failover,
# stop keepalived on haproxy1 and verify that the VIP automatically floats over to the haproxy2 server.
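A minimal test sequence for that failover check (run each command on the host named in the comment):
# On both hosts: start keepalived
systemctl enable --now keepalived
# On haproxy1: the three VIPs should be bound to ens33 (labels ens33:0-2)
ip addr show ens33
# On haproxy1: simulate a failure ...
systemctl stop keepalived
# ... then 'ip addr show ens33' on haproxy2 should show the VIPs within a few seconds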
Install haproxy; note that SELinux must be disabled.
# Install on both nodes, with identical configuration
yum install -y haproxy
vim /etc/haproxy/haproxy.cfg
# Append at the end of the file
listen k8s_api_master_6443
    bind 192.168.17.188:6443
    mode tcp
    server 192.168.17.130 192.168.17.130:6443 check inter 3s fall 3 rise 1
    server 192.168.17.131 192.168.17.131:6443 check inter 3s fall 3 rise 1
    server 192.168.17.132 192.168.17.132:6443 check inter 3s fall 3 rise 1
systemctl restart haproxy
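A quick way to confirm the listener is up (the backends will report DOWN until the masters are deployed in step 04 below):
# haproxy should be listening on the VIP
ss -tnlp | grep 6443
# Once the masters are up, the apiserver answers through the VIP
# (an authentication error from curl still proves end-to-end connectivity)
curl -k https://192.168.17.188:6443/healthz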
export release=3.2.0
wget https://github.com/easzlab/kubeasz/releases/download/${release}/ezdown
chmod +x ./ezdown
./ezdown --help
./ezdown -D
[root@master01 ~]# ll /etc/kubeasz/down/
total 1192924
-rw-------. 1 root root 384354816 Apr 13 00:51 calico_v3.19.3.tar
-rw-------. 1 root root  46967296 Apr 13 00:51 coredns_1.8.6.tar
-rw-------. 1 root root 224458240 Apr 13 00:51 dashboard_v2.4.0.tar
-rw-r--r--. 1 root root  63350495 Oct  5  2021 docker-20.10.9.tgz
-rw-------. 1 root root  70554112 Apr 13 00:51 flannel_v0.15.1.tar
-rw-------. 1 root root 106171392 Apr 13 00:51 k8s-dns-node-cache_1.21.1.tar
-rw-------. 1 root root 179055104 Apr 13 00:52 kubeasz_3.2.0.tar
-rw-------. 1 root root  34463744 Apr 13 00:52 metrics-scraper_v1.0.7.tar
-rw-------. 1 root root  65683968 Apr 13 00:52 metrics-server_v0.5.2.tar
-rw-------. 1 root root  45084160 Apr 13 00:52 nfs-provisioner_v4.0.2.tar
-rw-------. 1 root root    692736 Apr 13 00:52 pause_3.6.tar
-rw-------. 1 root root    692736 Apr 13 00:52 pause.tar
Generate the hosts file
cd /etc/kubeasz
./ezctl new k8s-cluster
# After the cluster is created, two files are generated under /etc/kubeasz/clusters/k8s-cluster
[root@master01 ~]# ll /etc/kubeasz/clusters/k8s-cluster/hosts
-rw-r--r--. 1 root root 1716 Apr 13 01:01 /etc/kubeasz/clusters/k8s-cluster/hosts
[root@master01 ~]# ll /etc/kubeasz/clusters/k8s-cluster/config.yml
-rw-r--r--. 1 root root 6733 Apr 14 17:18 /etc/kubeasz/clusters/k8s-cluster/config.yml
Edit the hosts file:
It specifies the etcd nodes, master nodes, worker nodes, VIP, container runtime, network plugin, Service IP and Pod IP ranges, and other settings.
[root@master01 k8s-cluster]# cat hosts
# 'etcd' cluster should have odd member(s) (1,3,5,...)
[etcd]
192.168.17.140
192.168.17.141
192.168.17.142
# master node(s)
[kube_master]
192.168.17.130
192.168.17.131
192.168.17.132
# work node(s)
[kube_node]
192.168.17.150
192.168.17.151
# [optional] harbor server, a private docker registry
# 'NEW_INSTALL': 'true' to install a harbor server; 'false' to integrate with existed one
[harbor]
#192.168.1.8 NEW_INSTALL=false
# [optional] loadbalance for accessing k8s from outside
[ex_lb]
192.168.17.161 LB_ROLE=backup EX_APISERVER_VIP=192.168.17.188 EX_APISERVER_PORT=6443
192.168.17.160 LB_ROLE=master EX_APISERVER_VIP=192.168.17.188 EX_APISERVER_PORT=6443
# [optional] ntp server for the cluster
[chrony]
#192.168.1.1
[all:vars]
# --------- Main Variables ---------------
# Secure port for apiservers
SECURE_PORT="6443"
# Cluster container-runtime supported: docker, containerd
CONTAINER_RUNTIME="docker"
# Network plugins supported: calico, flannel, kube-router, cilium, kube-ovn
CLUSTER_NETWORK="calico"
# Service proxy mode of kube-proxy: 'iptables' or 'ipvs'
PROXY_MODE="ipvs"
# K8S Service CIDR, not overlap with node(host) networking
SERVICE_CIDR="10.100.0.0/16"
# Cluster CIDR (Pod CIDR), not overlap with node(host) networking
CLUSTER_CIDR="10.200.0.0/16"
# NodePort Range
NODE_PORT_RANGE="30000-65000"
# Cluster DNS Domain
CLUSTER_DNS_DOMAIN="clusterdujie.local"
# -------- Additional Variables (don't change the default value right now) ---
# Binaries Directory
bin_dir="/usr/local/bin"
# Deploy Directory (kubeasz workspace)
base_dir="/etc/kubeasz"
# Directory for a specific cluster
cluster_dir="{{ base_dir }}/clusters/k8s-cluster"
# CA and other components cert/key Directory
ca_dir="/etc/kubernetes/ssl"
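Before moving on, a convenience grep confirms the key choices in the hosts file:
grep -E "^(CONTAINER_RUNTIME|CLUSTER_NETWORK|PROXY_MODE|SERVICE_CIDR|CLUSTER_CIDR)" /etc/kubeasz/clusters/k8s-cluster/hosts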
Edit the config.yml file
[root@master01 k8s-cluster]# cat config.yml
############################
# prepare
############################
# optionally install system packages offline (offline|online)
INSTALL_SOURCE: "online"
# optional OS security hardening, see github.com/dev-sec/ansible-collection-hardening
OS_HARDEN: false
# NTP time sources [important: clocks must stay in sync across the cluster]
ntp_servers:
  - "ntp1.aliyun.com"
  - "time1.cloud.tencent.com"
  - "0.cn.pool.ntp.org"
# network segments allowed to sync time from the cluster, e.g. "10.0.0.0/8"; all allowed by default
local_network: "0.0.0.0/0"
############################
# role:deploy
############################
# default: ca will expire in 100 years
# default: certs issued by the ca will expire in 50 years
CA_EXPIRY: "876000h"
CERT_EXPIRY: "876000h"
# kubeconfig settings
CLUSTER_NAME: "cluster1"
CONTEXT_NAME: "context-{{ CLUSTER_NAME }}"
# k8s version
K8S_VER: "1.23.1"
############################
# role:etcd
############################
# a separate WAL directory (ideally on another disk) avoids disk I/O contention and improves performance
ETCD_DATA_DIR: "/var/lib/etcd"
ETCD_WAL_DIR: ""
############################
# role:runtime [containerd,docker]
############################
# ------------------------------------------- containerd
# [.] enable registry mirrors
ENABLE_MIRROR_REGISTRY: true
# [containerd] sandbox (pause) base image
SANDBOX_IMAGE: "easzlab/pause:3.6"
# [containerd] container persistent storage directory
CONTAINERD_STORAGE_DIR: "/var/lib/containerd"
# ------------------------------------------- docker
# [docker] container storage directory
DOCKER_STORAGE_DIR: "/var/lib/docker"
# [docker] enable the remote REST API
ENABLE_REMOTE_API: false
# [docker] trusted insecure (HTTP) registries
INSECURE_REG: '["127.0.0.1/8","192.168.17.130"]'
############################
# role:kube-master
############################
# extra IPs and domains in the k8s master node certificates; multiple entries allowed (e.g. add a public IP and domain)
MASTER_CERT_HOSTS:
- "192.168.17.188"
- "k8s.test.io"
#- "www.test.com"
# mask length of each node's pod subnet (determines the max pod IPs per node)
# if flannel runs with --kube-subnet-mgr, it reads this value to assign a pod subnet to each node
# https://github.com/coreos/flannel/issues/847
NODE_CIDR_LEN: 24
############################
# role:kube-node
############################
# kubelet root directory
KUBELET_ROOT_DIR: "/var/lib/kubelet"
# maximum pods per node
MAX_PODS: 500
# reserve resources for kube components (kubelet, kube-proxy, dockerd, etc.)
# see templates/kubelet-config.yaml.j2 for the actual values
KUBE_RESERVED_ENABLED: "no"
# upstream k8s advises against enabling system-reserved lightly, unless long-term monitoring
# has shown you the system's real resource usage; the reservation also needs to grow as
# uptime increases, see templates/kubelet-config.yaml.j2 for the values
# the system reservation assumes a 4c/8g VM with a minimal OS install; increase it on high-spec physical machines
# note that apiserver etc. briefly use a lot of resources during cluster install, so reserve at least 1g of memory
SYS_RESERVED_ENABLED: "no"
# haproxy balance mode
BALANCE_ALG: "roundrobin"
############################
# role:network [flannel,calico,cilium,kube-ovn,kube-router]
############################
# ------------------------------------------- flannel
# [flannel] flannel backend: "host-gw", "vxlan", etc.
FLANNEL_BACKEND: "vxlan"
DIRECT_ROUTING: false
# [flannel] flanneld_image: "quay.io/coreos/flannel:v0.10.0-amd64"
flannelVer: "v0.15.1"
flanneld_image: "easzlab/flannel:{{ flannelVer }}"
# [flannel] offline image tarball
flannel_offline: "flannel_{{ flannelVer }}.tar"
# ------------------------------------------- calico
# [calico] setting CALICO_IPV4POOL_IPIP: "off" can improve network performance; see docs/setup/calico.md for the preconditions
CALICO_IPV4POOL_IPIP: "Always"
# [calico] host IP used by calico-node; BGP peering is established over this address; set manually or auto-detect
IP_AUTODETECTION_METHOD: "can-reach={{ groups['kube_master'][0] }}"
# [calico] calico network backend: brird, vxlan, none
CALICO_NETWORKING_BACKEND: "brird"
# [calico] supported calico versions: [v3.3.x] [v3.4.x] [v3.8.x] [v3.15.x]
calico_ver: "v3.19.3"
# [calico] calico major.minor version
calico_ver_main: "{{ calico_ver.split('.')[0] }}.{{ calico_ver.split('.')[1] }}"
# [calico] offline image tarball
calico_offline: "calico_{{ calico_ver }}.tar"
# ------------------------------------------- cilium
# [cilium] number of etcd nodes created by CILIUM_ETCD_OPERATOR: 1,3,5,7...
ETCD_CLUSTER_SIZE: 1
# [cilium] image version
cilium_ver: "v1.4.1"
# [cilium] offline image tarball
cilium_offline: "cilium_{{ cilium_ver }}.tar"
# ------------------------------------------- kube-ovn
# [kube-ovn] node that runs the OVN DB and OVN Control Plane; defaults to the first master node
OVN_DB_NODE: "{{ groups['kube_master'][0] }}"
# [kube-ovn] offline image tarball
kube_ovn_ver: "v1.5.3"
kube_ovn_offline: "kube_ovn_{{ kube_ovn_ver }}.tar"
# ------------------------------------------- kube-router
# [kube-router] public clouds have restrictions and generally need ipinip always on; on your own infrastructure this can be set to "subnet"
OVERLAY_TYPE: "full"
# [kube-router] NetworkPolicy support switch
FIREWALL_ENABLE: "true"
# [kube-router] kube-router image version
kube_router_ver: "v0.3.1"
busybox_ver: "1.28.4"
# [kube-router] kube-router offline image tarballs
kuberouter_offline: "kube-router_{{ kube_router_ver }}.tar"
busybox_offline: "busybox_{{ busybox_ver }}.tar"
############################
# role:cluster-addon
############################
# install coredns automatically
dns_install: "no"
corednsVer: "1.8.6"
ENABLE_LOCAL_DNS_CACHE: false
dnsNodeCacheVer: "1.21.1"
# local dns cache address
LOCAL_DNS_CACHE: "169.254.20.10"
# install metrics-server automatically
metricsserver_install: "no"
metricsVer: "v0.5.2"
# install dashboard automatically
dashboard_install: "no"
dashboardVer: "v2.4.0"
dashboardMetricsScraperVer: "v1.0.7"
# install ingress automatically
ingress_install: "no"
ingress_backend: "traefik"
traefik_chart_ver: "10.3.0"
# install prometheus automatically
prom_install: "no"
prom_namespace: "monitor"
prom_chart_ver: "12.10.6"
# install nfs-provisioner automatically
nfs_provisioner_install: "no"
nfs_provisioner_namespace: "kube-system"
nfs_provisioner_ver: "v4.0.2"
nfs_storage_class: "managed-nfs-storage"
nfs_server: "192.168.1.10"
nfs_path: "/data/nfs"
############################
# role:harbor
############################
# harbor version, full version string
HARBOR_VER: "v2.1.3"
HARBOR_DOMAIN: "harbor.yourdomain.com"
HARBOR_TLS_PORT: 8443
# if set 'false', you need to put certs named harbor.pem and harbor-key.pem in directory 'down'
HARBOR_SELF_SIGNED_CERT: true
# install extra component
HARBOR_WITH_NOTARY: false
HARBOR_WITH_TRIVY: false
HARBOR_WITH_CLAIR: false
HARBOR_WITH_CHARTMUSEUM: true
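Likewise, a quick grep of config.yml confirms the values the rest of this walkthrough relies on:
grep -E "^(K8S_VER|SANDBOX_IMAGE|INSECURE_REG|MAX_PODS)" /etc/kubeasz/clusters/k8s-cluster/config.yml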
Initialize the environment and deploy the highly available k8s cluster with the ansible playbooks
[root@master01 kubeasz]# ./ezctl help setup
Usage: ezctl setup <cluster> <step>
available steps:
    01  prepare            to prepare CA/certs & kubeconfig & other system settings
    02  etcd               to setup the etcd cluster
    03  container-runtime  to setup the container runtime(docker or containerd)
    04  kube-master        to setup the master nodes
    05  kube-node          to setup the worker nodes
    06  network            to setup the network plugin
    07  cluster-addon      to setup other useful plugins
    90  all                to run 01~07 all at once
    10  ex-lb              to install external loadbalance for accessing k8s from outside
    11  harbor             to install a new harbor server or to integrate with an existed one
examples: ./ezctl setup test-k8s 01  (or ./ezctl setup test-k8s prepare)
          ./ezctl setup test-k8s 02  (or ./ezctl setup test-k8s etcd)
          ./ezctl setup test-k8s all
          ./ezctl setup test-k8s 04 -t restart_master
# Prepare the CA/certs and base system settings
[root@master01 kubeasz]# ./ezctl setup k8s-cluster 01
# If ansible complains about the Python version, change the python symlink to point at the python2 binary
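A minimal sketch of that workaround (assuming the CentOS 7 default python2.7 path; adjust to your system):
# Point the unversioned python at python2
ln -sf /usr/bin/python2.7 /usr/bin/python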
Custom settings such as the startup script path and component versions can be adjusted as needed.
./ezctl setup k8s-cluster 02
After the installation completes, check on the etcd servers that the service on port 2379 is up and healthy.
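A health check can be run from any etcd node (a sketch; kubeasz places the etcd certificates under /etc/kubernetes/ssl in this setup, so adjust the paths if yours differ):
export NODE_IPS="192.168.17.140 192.168.17.141 192.168.17.142"
for ip in ${NODE_IPS}; do
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://${ip}:2379 \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/kubernetes/ssl/etcd.pem \
    --key=/etc/kubernetes/ssl/etcd-key.pem \
    endpoint health
done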
Docker needs to be installed on all master and node hosts. I have already installed it manually here, so this step can be skipped; if it has not been deployed yet, run the 03 step.
./ezctl setup k8s-cluster 03
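To confirm the runtime on all masters and nodes in one go, an ad-hoc ansible command against the generated inventory works (a sketch, assuming docker is the configured CONTAINER_RUNTIME):
cd /etc/kubeasz
ansible -i clusters/k8s-cluster/hosts kube_master,kube_node -m shell -a "docker info | grep -i 'server version'"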
./ezctl setup k8s-cluster 04
# When finished, verify that the masters have joined the cluster
[root@master01 kubeasz]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.17.130 Ready,SchedulingDisabled master 28h v1.23.1
192.168.17.131 Ready,SchedulingDisabled master 28h v1.23.1
192.168.17.132 Ready,SchedulingDisabled master 28h v1.23.1
./ezctl setup k8s-cluster 05
# When finished, verify that the nodes have joined the cluster
[root@master01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
192.168.17.130 Ready,SchedulingDisabled master 28h v1.23.1
192.168.17.131 Ready,SchedulingDisabled master 28h v1.23.1
192.168.17.132 Ready,SchedulingDisabled master 28h v1.23.1
192.168.17.150 Ready node 28h v1.23.1
192.168.17.151 Ready node 28h v1.23.1
./ezctl setup k8s-cluster 06
After step 06, a basic k8s cluster is fully deployed; follow-up posts will cover deploying coredns, prometheus, and other add-ons.
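Before continuing, it is worth verifying the calico mesh and that the system pods are healthy (a sketch; kubeasz normally installs calicoctl on the nodes during the network step):
# All BGP peers should show state 'Established'
calicoctl node status
# All kube-system pods should be Running
kubectl get pods -n kube-system -o wide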
If anything above went wrong, you can destroy the cluster and run the deployment again:
[root@master01 kubeasz]# ./ezctl destroy k8s-cluster
[root@master01 kubeasz]# ansible-playbook -i clusters/k8s-cluster/hosts -e @clusters/k8s-cluster/config.yml playbooks/99.clean.yml