Kubespray is an open-source project for deploying production-grade Kubernetes clusters; it uses Ansible as its deployment tool.
It can deploy to AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal),
Oracle Cloud Infrastructure (experimental), or bare-metal servers.
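Highly available clusters
Composable components (for example, a choice of network plugin)
Support for the most popular Linux distributions
Continuous integration tests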
Website: https://kubespray.io
Project repository: https://github.com/kubernetes-sigs/kubespray
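Network conditions in mainland China make Kubespray hard to use: some images must be pulled from gcr.io and some binaries
downloaded from GitHub, so it helps to download them in advance and import the images.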
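Note: a highly available etcd deployment requires 3 nodes, so a highly available cluster needs at least 3 nodes.
Kubespray needs a deployment node, which can be any node of the cluster. Here Kubespray is installed on the first master node (192.168.54.211),
and all subsequent operations run from there.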
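The 3-node requirement follows from etcd's majority quorum: a cluster of N members needs floor(N/2) + 1 of them available. A minimal sketch of that arithmetic; the function names etcd_quorum and etcd_fault_tolerance are illustrative and not part of Kubespray or etcd:

```shell
#!/usr/bin/env bash
# Quorum size for an etcd cluster with N members: floor(N/2) + 1.
etcd_quorum() {
  local members=$1
  echo $(( members / 2 + 1 ))
}

# Number of members that can fail while the cluster still has quorum.
etcd_fault_tolerance() {
  local members=$1
  echo $(( members - (members / 2 + 1) ))
}

# Print the values for small cluster sizes.
for n in 1 2 3 4 5; do
  printf 'members=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(etcd_quorum "$n")" "$(etcd_fault_tolerance "$n")"
done
```

With 1 or 2 members no failure is tolerated, and 3 is the smallest size that survives the loss of one member. An even member count adds no tolerance over the next lower odd count.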
1. Server plan
ip | hostname |
---|---|
192.168.54.211 | master |
192.168.54.212 | slave1 |
192.168.54.213 | slave2 |
2. Set the hostname
# Run the matching command on each of the three hosts
$ hostnamectl set-hostname master
$ hostnamectl set-hostname slave1
$ hostnamectl set-hostname slave2
# Show the current hostname
$ hostname
3. Map IP addresses to hostnames
# Run on all three hosts
$ cat >> /etc/hosts << EOF
192.168.54.211 master
192.168.54.212 slave1
192.168.54.213 slave2
EOF
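Because `cat >> /etc/hosts` appends on every run, repeating this step duplicates the entries. A minimal sketch of an idempotent alternative; add_host_entry is a hypothetical helper, not a standard tool:

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless that exact line is already present.
add_host_entry() {
  local file=$1 ip=$2 name=$3
  grep -qxF "$ip $name" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Usage (as root, on each of the three hosts):
#   add_host_entry /etc/hosts 192.168.54.211 master
#   add_host_entry /etc/hosts 192.168.54.212 slave1
#   add_host_entry /etc/hosts 192.168.54.213 slave2
```

`grep -qxF` matches the whole line as a fixed string, so an entry is only skipped when it is already there verbatim.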
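# Run on the master node
# Download the official release tarball (-O gives it the name that tar expects below)
wget -O kubespray-2.16.0.tar.gz https://github.com/kubernetes-sigs/kubespray/archive/v2.16.0.tar.gz
tar -zxvf kubespray-2.16.0.tar.gz
# Or clone the tag directly (this creates a directory named kubespray instead of kubespray-2.16.0)
git clone https://github.com/kubernetes-sigs/kubespray.git -b v2.16.0 --depth=1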
# Run on the master node
cd kubespray-2.16.0/
yum install -y epel-release python3-pip
pip3 install -r requirements.txt
If the installation fails:
# Error 1
Complete output from command python setup.py egg_info:
=============================DEBUG ASSISTANCE==========================
If you are seeing an error here please try the following to
successfully install cryptography:
Upgrade to the latest pip and try again. This will fix errors for most
users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
=============================DEBUG ASSISTANCE==========================
Traceback (most recent call last):
File "" , line 1, in <module>
File "/tmp/pip-build-3w9d_1bk/cryptography/setup.py", line 17, in <module>
from setuptools_rust import RustExtension
ModuleNotFoundError: No module named 'setuptools_rust'
----------------------------------------
# Fix (cryptography 3.2 predates the Rust build requirement)
pip3 install --upgrade cryptography==3.2
# Error 2
Exception: command 'gcc' failed with exit status 1
# Fix
# python2
yum install gcc libffi-devel python-devel openssl-devel -y
# python3
yum install gcc libffi-devel python3-devel openssl-devel -y
Update the Ansible inventory file. IPS holds the internal IP addresses of the 3 instances:
# Run on the master node
[root@master kubespray-2.16.0]# cp -rfp inventory/sample inventory/mycluster
[root@master kubespray-2.16.0]# declare -a IPS=( 192.168.54.211 192.168.54.212 192.168.54.213)
[root@master kubespray-2.16.0]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube_control_plane
DEBUG: Adding group kube_node
DEBUG: Adding group etcd
DEBUG: Adding group k8s_cluster
DEBUG: Adding group calico_rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube_control_plane
DEBUG: adding host node2 to group kube_control_plane
DEBUG: adding host node1 to group kube_node
DEBUG: adding host node2 to group kube_node
DEBUG: adding host node3 to group kube_node
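Inspect the generated hosts.yaml. Kubespray assigns node roles automatically based on the number of nodes provided. Here it deploys 2 master
nodes, all 3 nodes also act as worker nodes, and all 3 nodes run etcd.
# Run on the master node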
[root@master kubespray-2.16.0]# cat inventory/mycluster/hosts.yaml
all:
  hosts:
    node1:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    node2:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    node3:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        node1:
        node2:
    kube-node:
      hosts:
        node1:
        node2:
        node3:
    etcd:
      hosts:
        node1:
        node2:
        node3:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Edit inventory/mycluster/hosts.yaml so that the host names match the real hostnames:
# Run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    slave1:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    slave2:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
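After editing the inventory by hand, it can be worth confirming which hosts ended up in each group, for instance that etcd still has 3 members. A rough sketch, assuming the file keeps the two-space indentation that the inventory builder writes (group names at 4 spaces, host names at 8); group_hosts is a hypothetical helper and not a real YAML parser:

```shell
#!/usr/bin/env bash
# group_hosts FILE GROUP
# Print the host names listed under children.GROUP.hosts in a Kubespray
# hosts.yaml. Relies on fixed indentation rather than real YAML parsing.
group_hosts() {
  awk -v group="$2" '
    $0 ~ ("^    " group ":$")             { in_group = 1; in_hosts = 0; next }
    in_group && /^      hosts:/           { in_hosts = 1; next }
    in_group && in_hosts && /^        [^ ]/ { sub(/:.*/, "", $1); print $1; next }
    in_group && (/^    [^ ]/ || /^  [^ ]/ || /^[^ ]/) { in_group = 0; in_hosts = 0 }
  ' "$1"
}

# Usage:
#   group_hosts inventory/mycluster/hosts.yaml etcd
#   group_hosts inventory/mycluster/hosts.yaml etcd | wc -l
```

Where Ansible is already installed, `ansible-inventory -i inventory/mycluster/hosts.yaml --graph` shows the same grouping using Ansible's own parser.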
[root@master kubespray-2.16.0]# cat inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
The default version is fairly old, so set the Kubernetes version explicitly:
# Run on the master node
[root@master kubespray-2.16.0]# vim inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.20.7
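For any other settings, edit inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
as needed.
Add-ons such as the Kubernetes dashboard and the ingress controller are configured in the following file:
$ vim inventory/mycluster/group_vars/k8s_cluster/addons.yml
This file is left unchanged here.
Set up passwordless SSH from the Kubespray (Ansible) node to all nodes.
# Run on the master node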
ssh-keygen
ssh-copy-id 192.168.54.211
ssh-copy-id 192.168.54.212
ssh-copy-id 192.168.54.213
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
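The same key distribution can be written as a loop. A minimal sketch, assuming the remote user is root (the commands above rely on the current user) and using a hypothetical helper copy_keys with a DRY_RUN switch that only prints the commands by default:

```shell
#!/usr/bin/env bash
# copy_keys HOST...
# Print (DRY_RUN=1, the default) or run ssh-copy-id for every host given.
copy_keys() {
  local host
  for host in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "ssh-copy-id root@${host}"
    else
      ssh-copy-id "root@${host}"
    fi
  done
}

# Dry run: list the commands for the IPs and hostnames used in this guide.
copy_keys 192.168.54.211 192.168.54.212 192.168.54.213 master slave1 slave2
```

Setting `DRY_RUN=0` before the call runs the commands for real. Once the keys are in place, `ansible -i inventory/mycluster/hosts.yaml all -m ping` is one way to confirm that Ansible can reach every node.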
# Run on the master node: point image repositories and download URLs at mirrors reachable from mainland China
[root@master kubespray-2.16.0]# cat > inventory/mycluster/group_vars/k8s_cluster/vars.yml << EOF
gcr_image_repo: "registry.aliyuncs.com/google_containers"
kube_image_repo: "registry.aliyuncs.com/google_containers"
etcd_download_url: "https://ghproxy.com/https://github.com/coreos/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
cni_download_url: "https://ghproxy.com/https://github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
calicoctl_download_url: "https://ghproxy.com/https://github.com/projectcalico/calicoctl/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
calico_crds_download_url: "https://ghproxy.com/https://github.com/projectcalico/calico/archive/{{ calico_version }}.tar.gz"
crictl_download_url: "https://ghproxy.com/https://github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
nodelocaldns_image_repo: "cncamp/k8s-dns-node-cache"
dnsautoscaler_image_repo: "cncamp/cluster-proportional-autoscaler-amd64"
EOF
Run the Kubespray playbook to install the cluster:
# Run on the master node
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Many binaries and images are downloaded during the installation.
Output like the following indicates success:
PLAY RECAP *************************************************************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
master : ok=584 changed=109 unreachable=0 failed=0 skipped=1160 rescued=0 ignored=1
slave1 : ok=520 changed=97 unreachable=0 failed=0 skipped=1008 rescued=0 ignored=0
slave2 : ok=438 changed=76 unreachable=0 failed=0 skipped=678 rescued=0 ignored=0
Saturday 31 December 2022 20:07:57 +0800 (0:00:00.060) 0:59:12.196 *****
===============================================================================
container-engine/docker : ensure docker packages are installed ----------------------------------------------- 2180.79s
kubernetes/preinstall : Install packages requirements --------------------------------------------------------- 487.24s
download_file | Download item ---------------------------------------------------------------------------------- 58.95s
download_file | Download item ---------------------------------------------------------------------------------- 50.40s
download_container | Download image if required ---------------------------------------------------------------- 44.25s
download_file | Download item ---------------------------------------------------------------------------------- 42.65s
download_container | Download image if required ---------------------------------------------------------------- 38.06s
download_container | Download image if required ---------------------------------------------------------------- 32.38s
kubernetes/kubeadm : Join to cluster --------------------------------------------------------------------------- 32.29s
download_container | Download image if required ---------------------------------------------------------------- 30.67s
download_file | Download item ---------------------------------------------------------------------------------- 25.82s
kubernetes/control-plane : Joining control plane node to the cluster. ------------------------------------------ 25.60s
download_container | Download image if required ---------------------------------------------------------------- 25.34s
download_container | Download image if required ---------------------------------------------------------------- 22.49s
kubernetes/control-plane : kubeadm | Initialize first master --------------------------------------------------- 20.90s
download_container | Download image if required ---------------------------------------------------------------- 20.14s
download_file | Download item ---------------------------------------------------------------------------------- 19.50s
download_container | Download image if required ---------------------------------------------------------------- 17.84s
download_container | Download image if required ---------------------------------------------------------------- 13.96s
download_container | Download image if required ---------------------------------------------------------------- 13.31s
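In a recap like the one above, success means every host shows unreachable=0 and failed=0. A small sketch that checks this from saved playbook output; recap_failed_hosts is a hypothetical helper, and it assumes the standard `key=value` recap layout shown here:

```shell
#!/usr/bin/env bash
# recap_failed_hosts
# Read ansible-playbook output on stdin and print every host whose
# PLAY RECAP line has unreachable>0 or failed>0.
# Exit status: 0 if no such host was found, 1 otherwise.
recap_failed_hosts() {
  awk '
    /^PLAY RECAP/ { in_recap = 1; next }
    in_recap && /unreachable=[0-9]+/ && /failed=[0-9]+/ {
      bad = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^(unreachable|failed)=[0-9]+$/) {
          split($i, kv, "=")
          if (kv[2] + 0 > 0) bad = 1
        }
      }
      if (bad) { print $1; found = 1 }
    }
    END { exit (found ? 1 : 0) }
  '
}

# Usage:
#   ansible-playbook -i inventory/mycluster/hosts.yaml --become cluster.yml | tee cluster.log
#   recap_failed_hosts < cluster.log && echo "all hosts OK"
```

The `ignored` counter is deliberately not treated as a failure, since the successful run above also reports ignored=1 for master.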
# Run on the master node
[root@master ~]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane,master 10m v1.20.7 192.168.54.211 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave1 Ready control-plane,master 9m38s v1.20.7 192.168.54.212 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
slave2 Ready <none> 8m40s v1.20.7 192.168.54.213 <none> CentOS Linux 7 (Core) 3.10.0-1160.el7.x86_64 docker://19.3.15
[root@master ~]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-7c5b64bf96-wtmxn 1/1 Running 0 8m41s
calico-node-c6rr6 1/1 Running 0 9m6s
calico-node-l59fj 1/1 Running 0 9m6s
calico-node-n9tg6 1/1 Running 0 9m6s
coredns-f944c7f7c-n2wzp 1/1 Running 0 8m26s
coredns-f944c7f7c-x2tfl 1/1 Running 0 8m22s
dns-autoscaler-557bfb974d-6cbtk 1/1 Running 0 8m24s
kube-apiserver-master 1/1 Running 0 10m
kube-apiserver-slave1 1/1 Running 0 10m
kube-controller-manager-master 1/1 Running 0 10m
kube-controller-manager-slave1 1/1 Running 0 10m
kube-proxy-czk9s 1/1 Running 0 9m17s
kube-proxy-gwfc8 1/1 Running 0 9m17s
kube-proxy-tkxlf 1/1 Running 0 9m17s
kube-scheduler-master 1/1 Running 0 10m
kube-scheduler-slave1 1/1 Running 0 10m
nginx-proxy-slave2 1/1 Running 0 9m18s
nodelocaldns-4vd75 1/1 Running 0 8m23s
nodelocaldns-cr5gg 1/1 Running 0 8m23s
nodelocaldns-pmgqx 1/1 Running 0 8m23s
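Scanning the pod list by eye gets tedious on larger clusters. A minimal sketch that filters the table for pods that are not fully ready; unhealthy_pods is a hypothetical helper and assumes the default column order (NAME, READY, STATUS) shown above:

```shell
#!/usr/bin/env bash
# unhealthy_pods
# Read `kubectl get pods` table output on stdin and print each pod whose
# STATUS is not Running (Completed pods are skipped) or whose READY
# column (ready/total) is not fully ready.
unhealthy_pods() {
  awk '
    NR == 1 { next }                 # skip the header row
    $3 == "Completed" { next }       # finished jobs are not a problem
    {
      split($2, r, "/")              # READY column, e.g. "1/1"
      if ($3 != "Running" || r[1] != r[2]) print $1, $2, $3
    }
  '
}

# Usage:
#   kubectl -n kube-system get pods | unhealthy_pods
```

No output means every listed pod is Running with all of its containers ready.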
# Run on the master node
[root@master ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# Run on slave1
[root@slave1 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
cncamp/cluster-proportional-autoscaler-amd64 1.8.3 078b6f04135f 2 years ago 40.6MB
registry.aliyuncs.com/google_containers/coredns 1.7.0 bfe3a36ebd25 2 years ago 45.2MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
registry.aliyuncs.com/google_containers/pause 3.2 80d28bedfe5d 2 years ago 683kB
# Run on slave2
[root@slave2 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
registry.aliyuncs.com/google_containers/kube-proxy v1.20.7 ff54c88b8ecf 19 months ago 118MB
registry.aliyuncs.com/google_containers/kube-apiserver v1.20.7 034671b24f0f 19 months ago 122MB
registry.aliyuncs.com/google_containers/kube-controller-manager v1.20.7 22d1a2072ec7 19 months ago 116MB
registry.aliyuncs.com/google_containers/kube-scheduler v1.20.7 38f903b54010 19 months ago 47.3MB
nginx 1.19 f0b8a9a54136 19 months ago 133MB
quay.io/calico/node v3.17.4 4d9399da41dc 20 months ago 165MB
quay.io/calico/cni v3.17.4 f3abd83bc819 20 months ago 128MB
quay.io/calico/kube-controllers v3.17.4 c623a89d3672 20 months ago 52.2MB
cncamp/k8s-dns-node-cache 1.17.1 21fc69048bd5 22 months ago 123MB
quay.io/coreos/etcd v3.4.13 d1985d404385 2 years ago 83.8MB
registry.aliyuncs.com/google_containers/pause 3.3 0184c1613d92 2 years ago 683kB
Export the images for offline use:
# Run on the master node
docker save -o kube-proxy.tar registry.aliyuncs.com/google_containers/kube-proxy:v1.20.7
docker save -o kube-controller-manager.tar registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.7
docker save -o kube-apiserver.tar registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.7
docker save -o kube-scheduler.tar registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.7
docker save -o nginx.tar nginx:1.19
docker save -o node.tar quay.io/calico/node:v3.17.4
docker save -o cni.tar quay.io/calico/cni:v3.17.4
docker save -o kube-controllers.tar quay.io/calico/kube-controllers:v3.17.4
docker save -o k8s-dns-node-cache.tar cncamp/k8s-dns-node-cache:1.17.1
docker save -o etcd.tar quay.io/coreos/etcd:v3.4.13
docker save -o cluster-proportional-autoscaler-amd64.tar cncamp/cluster-proportional-autoscaler-amd64:1.8.3
docker save -o coredns.tar registry.aliyuncs.com/google_containers/coredns:1.7.0
docker save -o pause_3.3.tar registry.aliyuncs.com/google_containers/pause:3.3
docker save -o pause_3.2.tar registry.aliyuncs.com/google_containers/pause:3.2
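The list above can also be generated from whatever images are present locally. A sketch under two assumptions: the helper names image_to_tar and save_all_images are made up for illustration, and the file naming differs slightly from the commands above because the tag is always included:

```shell
#!/usr/bin/env bash
# image_to_tar IMAGE
# Map an image reference to a tar file name: last path component of the
# repository plus the tag, e.g. ".../pause:3.3" -> "pause_3.3.tar".
image_to_tar() {
  local ref=$1
  local name=${ref##*/}                       # drop registry and namespace
  printf '%s.tar\n' "$(printf '%s' "$name" | tr ':' '_')"
}

# save_all_images DIR
# Save every tagged local image into DIR, one tar file per image.
save_all_images() {
  local dir=$1 image
  mkdir -p "$dir"
  docker images --format '{{.Repository}}:{{.Tag}}' | grep -v '<none>' |
    while read -r image; do
      docker save -o "$dir/$(image_to_tar "$image")" "$image"
    done
}

# Usage:
#   save_all_images images
```

On an offline host the archives can then be imported with `for f in images/*.tar; do docker load -i "$f"; done`.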
List the collected files:
# Run on the master node
[root@master ~]# tree Kubespray-2.16.0/
Kubespray-2.16.0/
├── calicoctl
├── cni-plugins-linux-amd64-v0.9.1.tgz
├── images
│ ├── cluster-proportional-autoscaler-amd64.tar
│ ├── cni.tar
│ ├── coredns.tar
│ ├── etcd.tar
│ ├── k8s-dns-node-cache.tar
│ ├── kube-apiserver.tar
│ ├── kube-controller-manager.tar
│ ├── kube-controllers.tar
│ ├── kube-proxy.tar
│ ├── kube-scheduler.tar
│ ├── nginx.tar
│ ├── node.tar
│ ├── pause_3.2.tar
│ └── pause_3.3.tar
├── kubeadm-v1.20.7-amd64
├── kubectl-v1.20.7-amd64
├── kubelet-v1.20.7-amd64
└── rpm
├── docker
│ ├── audit-libs-python-2.8.5-4.el7.x86_64.rpm
│ ├── b001-libsemanage-python-2.5-14.el7.x86_64.rpm
│ ├── b002-setools-libs-3.3.8-4.el7.x86_64.rpm
│ ├── b003-libcgroup-0.41-21.el7.x86_64.rpm
│ ├── b0041-checkpolicy-2.5-8.el7.x86_64.rpm
│ ├── b004-python-IPy-0.75-6.el7.noarch.rpm
│ ├── b005-policycoreutils-python-2.5-34.el7.x86_64.rpm
│ ├── b006-container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm
│ ├── b007-containerd.io-1.3.9-3.1.el7.x86_64.rpm
│ ├── d001-docker-ce-cli-19.03.14-3.el7.x86_64.rpm
│ ├── d002-docker-ce-19.03.14-3.el7.x86_64.rpm
│ └── d003-libseccomp-2.3.1-4.el7.x86_64.rpm
└── preinstall
├── a001-libseccomp-2.3.1-4.el7.x86_64.rpm
├── bash-completion-2.1-8.el7.noarch.rpm
├── chrony-3.4-1.el7.x86_64.rpm
├── e2fsprogs-1.42.9-19.el7.x86_64.rpm
├── ebtables-2.0.10-16.el7.x86_64.rpm
├── ipset-7.1-1.el7.x86_64.rpm
├── ipvsadm-1.27-8.el7.x86_64.rpm
├── rsync-3.1.2-10.el7.x86_64.rpm
├── socat-1.7.3.2-2.el7.x86_64.rpm
├── unzip-6.0-22.el7_9.x86_64.rpm
├── wget-1.14-18.el7_6.1.x86_64.rpm
└── xfsprogs-4.5.0-22.el7.x86_64.rpm
4 directories, 43 files
Uninstall the cluster:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
To add nodes:
1. Add the new node's details to inventory/mycluster/hosts.yaml.
2. Run the following command:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root scale.yml -v -b --private-key=~/.ssh/id_rsa
To remove a node, leave hosts.yaml unchanged and run the following command directly:
[root@master kubespray-2.16.0]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root remove-node.yml -v -b --extra-vars "node=slave1"
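To upgrade the cluster, pass the target version to upgrade-cluster.yml. The target must be a version that the checked-out Kubespray release supports, and Kubernetes upgrades should not skip minor versions. v1.25.6 was released after Kubespray v2.16.0, so the exact command below needs a newer Kubespray checkout and intermediate upgrades:
[root@master kubespray-2.16.0]# ansible-playbook upgrade-cluster.yml -b -i inventory/mycluster/hosts.yaml -e kube_version=v1.25.6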
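Since kubeadm-based upgrades move one minor version at a time, a quick check before running the playbook can catch a target that jumps too far. A minimal sketch, assuming both versions share the same major number and look like v1.20.7; minor_of and upgrade_step_ok are hypothetical helpers:

```shell
#!/usr/bin/env bash
# minor_of VERSION
# Print the minor number of a version string such as v1.20.7.
minor_of() {
  local v=${1#v}       # drop the leading "v":   1.20.7
  v=${v#*.}            # drop the major part:    20.7
  echo "${v%%.*}"      # keep the minor part:    20
}

# upgrade_step_ok CURRENT TARGET
# Succeed when TARGET has the same minor version as CURRENT or is exactly
# one minor version newer. The major version is not compared.
upgrade_step_ok() {
  local diff
  diff=$(( $(minor_of "$2") - $(minor_of "$1") ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}

# Usage:
#   upgrade_step_ok v1.20.7 v1.21.14 && echo "ok to upgrade"
```

Under this check, going from v1.20.7 to v1.25.6 is rejected; it would have to be split into one upgrade per minor version.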
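An online deployment can fail because of network problems, so deploying the k8s cluster offline is an alternative.
You can build an offline installation package yourself.
Below is an example of an offline deployment found online.
The Kubespray GitHub repository is https://github.com/kubernetes-sigs/kubespray
This example uses the release-2.15 branch; the main component and OS versions are: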
kubernetes v1.19.10
docker v19.03
calico v3.16.9
centos 7.9.2009
Download link for the Kubespray offline package:
https://www.mediafire.com/file/nyifoimng9i6zp5/kubespray_offline.tar.gz/file
After the download finishes, extract the package into /opt:
# Run on the master node
$ tar -zxvf /opt/kubespray_offline.tar.gz -C /opt/
List the files:
# Run on the master node
$ ll /opt/kubespray_offline
total 4
drwxr-xr-x. 4 root root 28 Jul 11 2021 ansible_install
drwxr-xr-x. 15 root root 4096 Jul 8 2021 kubespray
drwxr-xr-x. 4 root root 240 Jul 9 2021 kubespray_cache
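Before continuing, it may help to confirm that the archive unpacked into the three directories listed above. A small sketch; check_offline_layout is a hypothetical helper, and the directory names are taken from the listing:

```shell
#!/usr/bin/env bash
# check_offline_layout ROOT
# Verify that ROOT contains the three directories of the offline package.
# Prints each missing directory to stderr and returns 1 if any is absent.
check_offline_layout() {
  local root=$1 dir missing=0
  for dir in ansible_install kubespray kubespray_cache; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_offline_layout /opt/kubespray_offline && echo "layout OK"
```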
The three machines have the IP addresses 192.168.54.211, 192.168.54.212, and 192.168.54.213.
Set up the Ansible server:
# Run on the master node
yum install /opt/kubespray_offline/ansible_install/rpm/*
pip3 install /opt/kubespray_offline/ansible_install/pip/*
Set up passwordless SSH login to the hosts:
# Run on the master node
ssh-keygen
ssh-copy-id 192.168.54.211
ssh-copy-id 192.168.54.212
ssh-copy-id 192.168.54.213
ssh-copy-id master
ssh-copy-id slave1
ssh-copy-id slave2
Configure the Ansible host groups:
# Run on the master node
[root@master ~]# cd /opt/kubespray_offline/kubespray
[root@master kubespray]# declare -a IPS=(192.168.54.211 192.168.54.212 192.168.54.213)
[root@master kubespray]# CONFIG_FILE=inventory/mycluster/hosts.yaml python3.6 contrib/inventory_builder/inventory.py ${IPS[@]}
DEBUG: Adding group all
DEBUG: Adding group kube-master
DEBUG: Adding group kube-node
DEBUG: Adding group etcd
DEBUG: Adding group k8s-cluster
DEBUG: Adding group calico-rr
DEBUG: adding host node1 to group all
DEBUG: adding host node2 to group all
DEBUG: adding host node3 to group all
DEBUG: adding host node1 to group etcd
DEBUG: adding host node2 to group etcd
DEBUG: adding host node3 to group etcd
DEBUG: adding host node1 to group kube-master
DEBUG: adding host node2 to group kube-master
DEBUG: adding host node1 to group kube-node
DEBUG: adding host node2 to group kube-node
DEBUG: adding host node3 to group kube-node
The file inventory/mycluster/hosts.yaml is generated automatically. View its contents:
# Run on the master node
[root@master kubespray]# cat inventory/mycluster/hosts.yaml
all:
  hosts:
    node1:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    node2:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    node3:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        node1:
        node2:
    kube-node:
      hosts:
        node1:
        node2:
        node3:
    etcd:
      hosts:
        node1:
        node2:
        node3:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Edit inventory/mycluster/hosts.yaml so that the host names match the real hostnames:
# Run on the master node
[root@master kubespray]# vim inventory/mycluster/hosts.yaml
all:
  hosts:
    master:
      ansible_host: 192.168.54.211
      ip: 192.168.54.211
      access_ip: 192.168.54.211
    slave1:
      ansible_host: 192.168.54.212
      ip: 192.168.54.212
      access_ip: 192.168.54.212
    slave2:
      ansible_host: 192.168.54.213
      ip: 192.168.54.213
      access_ip: 192.168.54.213
  children:
    kube-master:
      hosts:
        master:
        slave1:
    kube-node:
      hosts:
        master:
        slave1:
        slave2:
    etcd:
      hosts:
        master:
        slave1:
        slave2:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Edit the configuration file so that the offline packages and images are used:
# Run on the master node
[root@master kubespray]# vim inventory/mycluster/group_vars/all/all.yml
---
## Directory where etcd data stored
etcd_data_dir: /var/lib/etcd
## Experimental kubeadm etcd deployment mode. Available only for new deployment
etcd_kubeadm_enabled: false
## Directory where the binaries will be installed
bin_dir: /usr/local/bin
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
# access_ip: 1.1.1.1
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
# loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
## Internal loadbalancers for apiservers
# loadbalancer_apiserver_localhost: true
# valid options are "nginx" or "haproxy"
# loadbalancer_apiserver_type: nginx # valid values "nginx" or "haproxy"
## If the cilium is going to be used in strict mode, we can use the
## localhost connection and not use the external LB. If this parameter is
## not specified, the first node to connect to kubeapi will be used.
# use_localhost_as_kubeapi_loadbalancer: true
## Local loadbalancer should use this port
## And must be set port 6443
loadbalancer_apiserver_port: 6443
## If loadbalancer_apiserver_healthcheck_port variable defined, enables proxy liveness check for nginx.
loadbalancer_apiserver_healthcheck_port: 8081
### OTHER OPTIONAL VARIABLES
## Upstream dns servers
# upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', 'vsphere', 'oci', or 'external'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using openstack-client before starting the playbook.
# cloud_provider:
## When cloud_provider is set to 'external', you can set the cloud controller to deploy
## Supported cloud controllers are: 'openstack' and 'vsphere'
## When openstack or vsphere are used make sure to source in the required fields
# external_cloud_provider:
## Set these proxy values in order to update package manager and docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
#
## Refer to roles/kubespray-defaults/defaults/main.yml before modifying no_proxy
# no_proxy: ""
## Some problems may occur when downloading files over https proxy due to ansible bug
## https://github.com/ansible/ansible/issues/32750. Set this variable to False to disable
## SSL validation of get_url module. Note that kubespray will still be performing checksum validation.
# download_validate_certs: False
## If you need exclude all cluster nodes from proxy and other resources, add other resources here.
# additional_no_proxy: ""
## If you need to disable proxying of os package repositories but are still behind an http_proxy set
## skip_http_proxy_on_os_packages to true
## This will cause kubespray not to set proxy environment in /etc/yum.conf for centos and in /etc/apt/apt.conf for debian/ubuntu
## Special information for debian/ubuntu - you have to set the no_proxy variable, then apt package will install from your source of wish
# skip_http_proxy_on_os_packages: false
## Since workers are included in the no_proxy variable by default, docker engine will be restarted on all nodes (all
## pods will restart) when adding or removing workers. To override this behaviour by only including master nodes in the
## no_proxy variable, set below to true:
no_proxy_exclude_workers: false
## Certificate Management
## This setting determines whether certs are generated via scripts.
## Chose 'none' if you provide your own certificates.
## Option is "script", "none"
## note: vault is removed
# cert_management: script
## Set to true to allow pre-checks to fail and continue deployment
# ignore_assert_errors: false
## The read-only port for the Kubelet to serve on with no authentication/authorization. Uncomment to enable.
# kube_read_only_port: 10255
## Set true to download and cache container
# download_container: true
## Deploy container engine
# Set false if you want to deploy container engine manually.
# deploy_container_engine: true
## Red Hat Enterprise Linux subscription registration
## Add either RHEL subscription Username/Password or Organization ID/Activation Key combination
## Update RHEL subscription purpose usage, role and SLA if necessary
# rh_subscription_username: ""
# rh_subscription_password: ""
# rh_subscription_org_id: ""
# rh_subscription_activation_key: ""
# rh_subscription_usage: "Development"
# rh_subscription_role: "Red Hat Enterprise Server"
# rh_subscription_sla: "Self-Support"
## Check if access_ip responds to ping. Set false if your firewall blocks ICMP.
# ping_access_ip: true
kube_apiserver_node_port_range: "1-65535"
kube_apiserver_node_port_range_sysctl: false
download_run_once: true
download_localhost: true
download_force_cache: true
download_cache_dir: /opt/kubespray_offline/kubespray_cache # modified
preinstall_cache_rpm: true
docker_cache_rpm: true
download_rpm_localhost: "{{ download_cache_dir }}/rpm" # modified
tmp_cache_dir: /tmp/k8s_cache # modified
tmp_preinstall_rpm: "{{ tmp_cache_dir }}/rpm/preinstall" # modified
tmp_docker_rpm: "{{ tmp_cache_dir }}/rpm/docker" # modified
image_is_cached: true
nodelocaldns_dire_coredns: true
Start deploying k8s:
# Run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
The deployment takes roughly half an hour and needs no intervention. Once it finishes, check the cluster and pod status:
# Run on the master node
[root@master kubespray]# kubectl get nodes
[root@master kubespray]# kubectl get pods -n kube-system
Uninstall the cluster:
# Run on the master node
[root@master kubespray]# ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml