Setting up k8s
Use kubeadm to build the cluster; the version installed here is v1.18.2.
First, set the hostname on each node:
hostnamectl set-hostname k8s-master
hostnamectl set-hostname k8s-node1
hostnamectl set-hostname k8s-node2
Add the following entries to /etc/hosts on all three nodes (substitute your own machines' IPs):
172.20.0.15 k8s-master
172.20.0.12 k8s-node1
172.20.0.43 k8s-node2
Run all of the following on all three nodes.
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
setenforce 0 only disables SELinux until the next reboot; to keep it disabled permanently, set this in /etc/selinux/config:
SELINUX=disabled
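One more setting kubeadm's preflight checks enforce: swap must be off. A commonly used pair of commands (the sed line assumes your swap entries in /etc/fstab contain " swap "):
swapoff -a                             # turn swap off immediately
sed -i '/ swap / s/^/#/' /etc/fstab    # comment it out so it stays off after reboot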
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum -y install docker-ce-18.09.9
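If the yum-config-manager command is missing, it ships in yum-utils:
yum -y install yum-utils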
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum install -y kubelet-1.18.2 kubeadm-1.18.2 kubectl-1.18.2 --disableexcludes=kubernetes
kubeadm version
systemctl start docker
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
systemctl enable docker.service
systemctl daemon-reload
systemctl restart docker
Docker resets the iptables FORWARD chain policy to DROP when it starts, which blocks cross-node pod traffic, so insert an ExecStartPost into the docker unit to switch it back:
sed -i '20i ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT' /usr/lib/systemd/system/docker.service
systemctl daemon-reload
systemctl restart docker
After the change, verify that the FORWARD chain policy is now ACCEPT:
iptables -L
Here I'm deliberately leaving some pitfalls by skipping a few required settings. If you'd rather avoid them, scroll down to the pitfall-free installation section.
Run the following on the master:
kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.16.2
k8s.gcr.io/kube-controller-manager:v1.16.2
k8s.gcr.io/kube-scheduler:v1.16.2
k8s.gcr.io/kube-proxy:v1.16.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.15-0
k8s.gcr.io/coredns:1.6.2
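Rather than pulling and retagging by hand as below, kubeadm can also pre-pull everything from a chosen repository in one go; a sketch using the Aliyun mirror referenced later in this post:
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers --kubernetes-version v1.18.2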
Search for each of these on a domestic registry to find a mirror path, then pull the images:
docker pull aiotceo/kube-apiserver:v1.16.2
docker pull aiotceo/kube-scheduler:v1.16.2
docker pull aiotceo/kube-controller-manager:v1.16.2
docker pull aiotceo/kube-proxy:v1.16.2
docker pull tangxu/etcd:3.3.15-0
docker pull aiotceo/coredns:1.6.2
docker pull aiotceo/pause:3.1
docker tag aiotceo/kube-apiserver:v1.16.2 k8s.gcr.io/kube-apiserver:v1.16.2
docker tag aiotceo/kube-scheduler:v1.16.2 k8s.gcr.io/kube-scheduler:v1.16.2
docker tag aiotceo/kube-controller-manager:v1.16.2 k8s.gcr.io/kube-controller-manager:v1.16.2
docker tag aiotceo/kube-proxy:v1.16.2 k8s.gcr.io/kube-proxy:v1.16.2
docker tag tangxu/etcd:3.3.15-0 k8s.gcr.io/etcd:3.3.15-0
docker tag aiotceo/coredns:1.6.2 k8s.gcr.io/coredns:1.6.2
docker tag aiotceo/pause:3.1 k8s.gcr.io/pause:3.1
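The pulls and retags above can also be scripted; a minimal sketch using the same mirror repositories:
#!/bin/bash
# pull each image from its domestic mirror, then retag it to the
# k8s.gcr.io name that kubeadm expects
images=(
  aiotceo/kube-apiserver:v1.16.2
  aiotceo/kube-scheduler:v1.16.2
  aiotceo/kube-controller-manager:v1.16.2
  aiotceo/kube-proxy:v1.16.2
  tangxu/etcd:3.3.15-0
  aiotceo/coredns:1.6.2
  aiotceo/pause:3.1
)
for img in "${images[@]}"; do
  docker pull "$img"
  docker tag "$img" "k8s.gcr.io/${img#*/}"   # strip the mirror namespace
done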
Initialize the cluster at v1.16.2. Pick the version to match your kubeadm version; some older versions are rejected as unsupported. The pod network CIDR and the master's IP are set here too.
There are two ways to initialize the cluster; either way, run it on the master.
First: pass everything as flags (a config file works as well, see the second way):
kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.20.0.0/16 --apiserver-advertise-address=172.20.0.15
Second: generate a default config file and initialize from it:
kubeadm config print init-defaults > xxxx.yaml
kubeadm init --config xxxx.yaml
The auto-generated config file:
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: hdp0kg.ab86i3ms07muvkax  # change this; format: 6 lowercase alphanumerics + "." + 16 lowercase alphanumerics
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.20.0.15  # change to the API server address, i.e. the master IP
  bindPort: 6443                 # change the API port if needed
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master               # picked up automatically
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io  # the overseas default; if you can't reach it and don't want to retag images one by one, switch to the Aliyun mirror: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.2  # the domestic mirror above has no image for this version; if you switch repositories, use v1.18.2 instead
networking:
  dnsDomain: cluster.k8stest  # optional to change
  serviceSubnet: 10.10.0.0/16
  podSubnet: 10.18.0.0/16  # pod subnet; set it to your needs and avoid overlapping the subnets above
scheduler: {}
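To get a token in the correct format rather than composing one by hand, kubeadm can generate it (standard subcommand):
kubeadm token generate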
Now initialize:
kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.10.0.0/16 --apiserver-advertise-address=172.10.0.15
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-10250]: Port 10250 is in use
Port 10250 being in use means a kubelet is already running, usually left over from an earlier attempt; stop it or run kubeadm reset first. The error below is a different one: the kubelet and Docker cgroup drivers don't match. The fix is described next:
failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"
Set Docker's cgroup driver to systemd by editing /etc/docker/daemon.json:
"exec-opts": ["native.cgroupdriver=systemd"]
kubelet's cgroup driver is changed in this file:
/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
Below is the part of my auto-install script (uploaded to my CSDN resources) that changes kubelet's cgroup driver; the variable KUBESTART holds the driver to use:
KUBESTART=systemd
CONF=/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
counts=$(grep -ci "KUBELET_CGROUP_ARGS=--cgroup-driver=" "$CONF")
if [[ "$counts" != "0" ]]; then
  # the argument already exists, so just change its value
  sed -i "s/--cgroup-driver=.*\"/--cgroup-driver=$KUBESTART\"/" "$CONF"
else
  # otherwise insert the Environment line at line 5 of the drop-in
  sed -i "5iEnvironment=\"KUBELET_CGROUP_ARGS=--cgroup-driver=$KUBESTART\"" "$CONF"
fi
To change it by hand instead, edit the file above and add this line:
Environment="KUBELET_CGROUP_ARGS=--cgroup-driver=systemd"
As before, reload and restart after the change:
systemctl enable kubelet
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet
A warning like the following in the kubelet log is harmless:
k8s-master kubelet: W0710 15:05:16.718248 17552 docker_service.go:563] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
- The following type of error means pulling from the overseas registry timed out, which is why pre-pulling the images above matters. If it still appears after pre-pulling, check that the tags and the pulled images are correct:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.16.2: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
OK, try again.
The console output is below. At this point check the kubelet startup log and the system log for the specific cause and fix it accordingly; it's usually a bad kubelet config file, a wrong domain setting, mismatched versions, or ipvs/firewall issues. Once kubelet starts cleanly, the cluster can generally initialize.
[init] Using Kubernetes version: v1.16.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.20.0.15]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master localhost] and IPs [172.20.0.15 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master localhost] and IPs [172.20.0.15 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
Run it once more:
kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.10.0.0/16 --apiserver-advertise-address=172.10.0.15 --control-plane-endpoint="172.10.0.15:6443"
The log below shows this was run before; a reset is needed before re-initializing:
[init] Using Kubernetes version: v1.16.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
kubeadm reset
kubeadm init --kubernetes-version=v1.16.2 --pod-network-cidr=172.10.0.0/16 --apiserver-advertise-address=172.10.0.15 --control-plane-endpoint="172.10.0.15:6443"
Check the kubelet logs:
journalctl -u kubelet
journalctl -xue kubelet
System log: /var/log/messages
Change kubelet's cgroup driver by adding or modifying this line in /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf:
KUBELET_CGROUP_ARGS=--cgroup-driver=systemd
Looking at the logs now, you'll find plenty of errors: these are the pitfalls left earlier, so let's fill them in.
Run the following on all three nodes.
- Enable filtering of bridged traffic through iptables
Turning on kernel ipv4 forwarding for bridges requires the br_netfilter module, so load it.
A module can be loaded temporarily with modprobe, or persistently via a config file that takes effect after reboot; do both to avoid rebooting.
First check whether the module is already loaded:
lsmod | grep br_netfilter
If it shows up, the next command can be skipped:
modprobe br_netfilter
For the ipv4/bridge settings, don't edit /etc/sysctl.conf directly; a separate drop-in file is easier to remove or rebuild later. Create /etc/sysctl.d/k8s.conf so the settings survive reboots, with this content:
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
Apply it:
sysctl -p /etc/sysctl.d/k8s.conf
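You can confirm the values took effect:
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward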
Next is the kube-proxy service implementation; here ipvs (IP Virtual Server) is chosen. The log errors just now came from skipping this step.
Currently only nf_conntrack is loaded. Write a separate file under the path below so these modules also load automatically after a reboot (note: on kernels 4.19 and later, nf_conntrack_ipv4 is merged into nf_conntrack):
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
Adjust the permissions:
chmod 755 /etc/sysconfig/modules/ipvs.modules
Run it once so it takes effect immediately:
bash /etc/sysconfig/modules/ipvs.modules
Then check that the modules are loaded:
lsmod | grep -e ip_vs -e nf_conntrack
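kube-proxy's ipvs mode also depends on the ipset package, and ipvsadm is handy for inspecting the rules; a commonly recommended extra step:
yum install -y ipset ipvsadm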
Run this on the master:
kubeadm config print init-defaults > kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: hdp0kg.ab86i3ms07muvkax
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.20.0.15
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.18.2
networking:
  dnsDomain: cluster.k8stest
  serviceSubnet: 10.10.0.0/12
  podSubnet: 10.18.0.0/16
scheduler: {}
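Loading the ipvs modules alone doesn't switch kube-proxy into ipvs mode; kubeadm has to be told as well. A sketch of the extra document you can append to the same kubeadm.yaml (standard kube-proxy component config):
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs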
kubeadm init --config kubeadm.yaml
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml
This step takes a while; you can watch the detailed progress with:
kubectl get pods -A -o wide
Find the flannel pod's id, then:
kubectl describe pods xxxxxx -n kube-system
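If describe shows the image pulled fine but the pod still isn't coming up, its logs usually say why (same placeholder pod id as above):
kubectl logs xxxxxx -n kube-system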
Common failures here: image pull timeouts; systemd service connection timeouts (check the system error log and resolve what it reports; this is usually an auto-start service failing, and when k8s is the cause it's almost always kubelet, which you can stop by hand before rebooting the system); insufficient memory; insufficient disk space; and so on.
Join the other nodes to the cluster with the following command (the token and hash are generated per cluster, so you can't copy-paste mine):
kubeadm join 172.20.0.15:6443 --token hdp0kg.ab86i3ms07muvkax --discovery-token-ca-cert-hash sha256:c58c4a30124c7c74140ade1bbe1e460552c20ddee07e79e6a197f660e9617111
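After the nodes join, confirm from the master that they registered:
kubectl get nodes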
If you've forgotten the token, run:
kubeadm token list
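If the token has expired (the default TTL is 24h), a new one plus the full join command can be generated in one go (standard kubeadm subcommand):
kubeadm token create --print-join-command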