Contents
I. Cluster Planning
II. Preparation
III. Keepalived Installation and Testing
CentOS and Ubuntu installation
IV. etcd Cluster Installation
V. Installing kubeadm, kubectl, and kubelet
1) CentOS 7 installation
2) Ubuntu installation
VI. Initializing the Master Nodes, Joining the Worker Nodes, and Adding the Network Plugin
VII. Problems Encountered and Their Solutions
This installation uses an external etcd cluster.
Decoupling the control plane from etcd lowers the cluster's risk: losing a single master or a single etcd node has very little impact, and an external etcd is easier to maintain and restore.
I. Cluster Planning
| Host | IP | Role |
| --- | --- | --- |
| vip | 192.168.50.240 | virtual IP (VIP) |
| master01 | 192.168.50.24 | etcd1, master |
| master02 | 192.168.50.25 | etcd2, master |
| node01 | 192.168.50.26 | etcd3, node |
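The rest of the walkthrough refers to these machines as master01, master02, and node01. A minimal /etc/hosts sketch matching the plan (an assumption, not part of the original post; adjust names and IPs to your environment):

```bash
# Run on every node: map the planned hostnames to their IPs
cat >> /etc/hosts <<'EOF'
192.168.50.24 master01
192.168.50.25 master02
192.168.50.26 node01
EOF
```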
II. Preparation
1. Install Docker (on all cluster nodes)
See my earlier blog post for the installation steps.
2. Environment preparation
```bash
# Install prerequisite tools
apt-get update && apt-get install -y apt-transport-https curl
# Disable swap
swapoff -a
# Disable the firewall
sudo ufw disable
# Disable SELinux
apt install selinux-utils
setenforce 0
# Network settings
tee /etc/sysctl.d/k8s.conf <<-'EOF'
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
modprobe br_netfilter
# Check that the IPv4/IPv6 settings took effect
sysctl --system
# Configure iptables
iptables -P FORWARD ACCEPT
# Make it persistent: add the following line to /etc/rc.local
vim /etc/rc.local
/usr/sbin/iptables -P FORWARD ACCEPT
```
III. Keepalived Installation and Testing
CentOS and Ubuntu installation
1. Install keepalived (run on both the primary and backup nodes)
```bash
# CentOS
yum install -y keepalived
# Ubuntu
apt install keepalived
```
2. Configuration
On Ubuntu an extra step is needed here (not required on CentOS):
```bash
cp /usr/share/doc/keepalived/samples/keepalived.conf.sample /etc/keepalived/keepalived.conf
```
master01
vim /etc/keepalived/keepalived.conf
```
! Configuration File for keepalived

vrrp_instance VI_1 {
    state MASTER                  # change to BACKUP on the backup node
    interface eth0                # primary network interface
    virtual_router_id 44          # virtual router ID, must be the same on master and backup
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111            # authentication password, must be the same on master and backup
    }
    virtual_ipaddress {           # virtual IP (VIP)
        192.168.50.240
    }
}

virtual_server 192.168.50.240 6443 {      # externally visible virtual IP
    delay_loop 6                  # health-check interval for real servers, in seconds
    lb_algo rr                    # scheduling algorithm, rr = round robin
    lb_kind DR                    # LVS forwarding mode: DR (direct routing)
    protocol TCP                  # use TCP to check real server status
    real_server 192.168.50.24 6443 {      # first node
        weight 3                  # node weight
        TCP_CHECK {               # health-check method
            connect_timeout 3     # connection timeout
            nb_get_retry 3        # number of retries
            delay_before_retry 3  # delay between retries, in seconds
        }
    }
    real_server 192.168.50.25 6443 {      # second node
        weight 3
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
```
master02
vim /etc/keepalived/keepalived.conf
```
! Configuration File for keepalived

vrrp_instance VI_1 {
    state BACKUP                  # BACKUP on this node
    interface eth0                # primary network interface
    virtual_router_id 44          # virtual router ID, must be the same on master and backup
    priority 80                   # lower priority than the master
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111            # authentication password, must be the same on master and backup
    }
    virtual_ipaddress {           # virtual IP (VIP)
        192.168.50.240
    }
}

virtual_server 192.168.50.240 6443 {      # externally visible virtual IP
    delay_loop 6                  # health-check interval for real servers, in seconds
    lb_algo rr                    # scheduling algorithm, rr = round robin
    lb_kind DR                    # LVS forwarding mode: DR (direct routing)
    protocol TCP                  # use TCP to check real server status
    real_server 192.168.50.24 6443 {      # first node
        weight 3                  # node weight
        TCP_CHECK {               # health-check method
            connect_timeout 3     # connection timeout
            nb_get_retry 3        # number of retries
            delay_before_retry 3  # delay between retries, in seconds
        }
    }
    real_server 192.168.50.25 6443 {      # second node
        weight 3
        TCP_CHECK {
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
```
3. Start the service
```bash
systemctl start keepalived
systemctl enable keepalived
systemctl status keepalived
```
4. Verify
```bash
ip a    # check that 192.168.50.240 is bound on master01
```
5. Failover test
```bash
systemctl stop keepalived    # on master01; the VIP 192.168.50.240 should fail over to master02
```
IV. etcd Cluster Installation
1. Install cfssl on master01
```bash
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
```
2. Install the etcd binaries
```bash
# Create the directory
mkdir -p /data/etcd/bin
# Download
cd /tmp
wget https://storage.googleapis.com/etcd/v3.3.25/etcd-v3.3.25-linux-amd64.tar.gz
tar zxf etcd-v3.3.25-linux-amd64.tar.gz
cd etcd-v3.3.25-linux-amd64
mv etcd etcdctl /data/etcd/bin/
```
3. Create the CA, client, server, and peer certificates
etcd is the server and etcdctl is the client; the two communicate over HTTP(S).
CA certificate: a self-signed root certificate used to sign the other certificates
server certificate: etcd's own certificate
client certificate: used by clients such as etcdctl
peer certificate: used for node-to-node communication
1) Create the directory
```bash
mkdir -p /data/etcd/ssl
cd /data/etcd/ssl
```
2) Create the CA certificate
Create the config: vim ca-config.json
{ "signing" : { "default" : { "expiry" : "438000h" }, "profiles" : { "server" : { "expiry" : "438000h" , "usages" : [ "signing" , "key encipherment" , "server auth" , "client auth" ] }, "client" : { "expiry" : "438000h" , "usages" : [ "signing" , "key encipherment" , "client auth" ] }, "peer" : { "expiry" : "438000h" , "usages" : [ "signing" , "key encipherment" , "server auth" , "client auth" ] } } } } |
server auth means a client can use this CA to verify the certificate presented by the server.
client auth means the server can use this CA to verify the certificate presented by a client.
Create the certificate signing request ca-csr.json:
vim ca-csr.json
{ "CN" : "etcd" , "key" : { "algo" : "rsa" , "size" : 2048 } } |
Generate the CA certificate and private key:
```bash
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# ls ca*
# ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem
```
3) Generate the client certificate
vim client.json
{ "CN" : "client" , "key" : { "algo" : "ecdsa" , "size" : 256 } } |
4) Generate the server and peer certificates
Create the config: vim etcd.json
{ "CN" : "etcd" , "hosts" : [ "192.168.50.24" , "192.168.50.25" , "192.168.50.26" ], "key" : { "algo" : "ecdsa" , "size" : 256 }, "names" : [ { "C" : "CN" , "L" : "BJ" , "ST" : "BJ" } ] } |
Generate:
```bash
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server etcd.json | cfssljson -bare server
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd.json | cfssljson -bare peer
```
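Optionally, before distributing the certificates you can sanity-check that all three node IPs ended up in the SAN list. A quick check with openssl (not part of the original steps):

```bash
# Confirm the server certificate contains the three etcd node IPs as Subject Alternative Names
openssl x509 -in server.pem -noout -text | grep -A1 "Subject Alternative Name"
```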
5) Configure /data/etcd/cfg/etcd.conf (on the other two nodes, change the values marked with # comments)
```bash
ETCD_DATA_DIR="/data/etcd/data"
ETCD_LISTEN_PEER_URLS="https://192.168.50.24:2380"              # local IP
ETCD_LISTEN_CLIENT_URLS="https://192.168.50.24:2379"            # local IP
ETCD_NAME=etc0                                                  # etcd member name
ETCD_SNAPSHOT_COUNT="500000"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.50.24:2380"   # local IP
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.50.24:2379"         # local IP
ETCD_INITIAL_CLUSTER="etc0=https://192.168.50.24:2380,etc1=https://192.168.50.25:2380,etc2=https://192.168.50.26:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_CERT_FILE="/data/etcd/ssl/server.pem"
ETCD_KEY_FILE="/data/etcd/ssl/server-key.pem"
ETCD_CLIENT_CERT_AUTH="True"
ETCD_TRUSTED_CA_FILE="/data/etcd/ssl/ca.pem"
ETCD_AUTO_TLS="True"
ETCD_PEER_CERT_FILE="/data/etcd/ssl/peer.pem"
ETCD_PEER_KEY_FILE="/data/etcd/ssl/peer-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="True"
ETCD_PEER_TRUSTED_CA_FILE="/data/etcd/ssl/ca.pem"
ETCD_PEER_AUTO_TLS="True"
ETCD_LOG_OUTPUT="default"
ETCD_AUTO_COMPACTION_RETENTION="1"
```
6) Sync master01's /data/etcd/ssl directory to master02 and node01 (master02 and node01 also run etcd members), as sketched below.
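A minimal sketch of that copy, assuming root SSH access from master01 to the other two nodes:

```bash
# Copy the certificates generated on master01 to the other etcd members
ssh 192.168.50.25 "mkdir -p /data/etcd/ssl"
ssh 192.168.50.26 "mkdir -p /data/etcd/ssl"
scp -r /data/etcd/ssl/* 192.168.50.25:/data/etcd/ssl/
scp -r /data/etcd/ssl/* 192.168.50.26:/data/etcd/ssl/
```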
4. systemd unit file
vim /usr/lib/systemd/system/etcd.service (on Ubuntu 18.04 the path is /lib/systemd/system)
The etcd.conf configuration differs on each of the three hosts; when you actually use these files, it is best to delete the inline comments.
```ini
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/data/etcd/cfg/etcd.conf
ExecStart=/data/etcd/bin/etcd
Restart=always
RestartSec=15
LimitNOFILE=65536
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
```
5. Start etcd
```bash
systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd
```
6. Verify the cluster
```bash
cd /data/etcd/ssl
# Check cluster health
[root@localhost ssl]# ../bin/etcdctl --ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem --endpoints="https://192.168.50.24:2379" cluster-health
member fa4ab5cabfae676 is healthy: got healthy result from http://192.168.50.25:2379
member 4dc32ddff8db8fb9 is healthy: got healthy result from http://192.168.50.24:2379
member 8f173179ae8613f7 is healthy: got healthy result from http://192.168.50.26:2379
# List the cluster members
[root@localhost ssl]# ../bin/etcdctl --ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem --endpoints="https://192.168.50.24:2379" member list
fa4ab5cabfae676: name=etc1 peerURLs=http://192.168.50.25:2380 clientURLs=http://192.168.50.25:2379 isLeader=false
4dc32ddff8db8fb9: name=etc0 peerURLs=http://192.168.50.24:2380 clientURLs=http://192.168.50.24:2379 isLeader=false
8f173179ae8613f7: name=etc2 peerURLs=http://192.168.50.26:2380 clientURLs=http://192.168.50.26:2379 isLeader=true
```
V. Installing kubeadm, kubectl, and kubelet
Install kubeadm and kubelet on all nodes. kubectl is optional: you can install it on every machine, or only on master01.
1) CentOS 7 installation
A. Add a domestic (Aliyun) yum repository
```bash
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
```
B. Install
```bash
yum install -y kubelet kubeadm kubectl
```
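This installs the latest packages; to stay in step with the 1.22.4 version pinned on Ubuntu below, the version can be pinned here as well (a sketch; the package naming is assumed to follow the usual kubelet-&lt;version&gt; pattern in the Aliyun repo):

```bash
# Optional: pin the same version used elsewhere in this guide
yum install -y kubelet-1.22.4 kubeadm-1.22.4 kubectl-1.22.4
```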
C. On every node where kubelet is installed, enable kubelet to start on boot, for example:
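(Same command as the Ubuntu step below; kubelet will keep restarting until kubeadm init runs, which is expected.)

```bash
systemctl enable kubelet && systemctl start kubelet
```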
2) Ubuntu installation
A. Add the repository key
```bash
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
```
B. Add the apt source
```bash
cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
```
C. List the installable versions
```bash
apt-cache madison kubelet
```
D. Install
```bash
apt-get install -y kubelet=1.22.4-00 kubeadm=1.22.4-00 kubectl=1.22.4-00
```
E. Enable kubelet on boot
```bash
sudo systemctl enable kubelet && sudo systemctl start kubelet
```
VI. Initializing the Master Nodes, Joining the Worker Nodes, and Adding the Network Plugin
A. On master01, copy the CA certificate and client certificate generated when building etcd to the expected locations and rename them, as follows:
```bash
[root@master01]~$ mkdir -p /etc/kubernetes/pki/etcd/
# CA certificate of the etcd cluster
[root@master01]~$ cp /data/etcd/ssl/ca.pem /etc/kubernetes/pki/etcd/
# client certificate of the etcd cluster, used by the apiserver to access etcd
[root@master01]~$ cp /data/etcd/ssl/client.pem /etc/kubernetes/pki/apiserver-etcd-client.pem
# client private key of the etcd cluster
[root@master01]~$ cp /data/etcd/ssl/client-key.pem /etc/kubernetes/pki/apiserver-etcd-client-key.pem
# Verify
[root@master01]~$ tree /etc/kubernetes/pki/
/etc/kubernetes/pki/
├── apiserver-etcd-client-key.pem
├── apiserver-etcd-client.pem
└── etcd
    └── ca.pem

1 directory, 3 files
```
B. Create the init configuration file
Generate the default configuration:
```bash
cd /etc/kubernetes
kubeadm config print init-defaults > kubeadm-init.yaml
```
The final configuration after editing is shown below; the fields changed from the defaults are marked with inline comments.
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.50.24       # local node IP
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  #imagePullPolicy: Never
  name: master01                        # local hostname
  #taints: null
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
#dns:
#  type: CoreDNS
etcd:
  #local:
  #  dataDir: /var/lib/etcd
  external:
    endpoints:
    - https://192.168.50.24:2379
    - https://192.168.50.25:2379
    - https://192.168.50.26:2379
    caFile: /etc/kubernetes/pki/etcd/ca.pem                     # CA cert generated when building the etcd cluster
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.pem     # client cert generated when building the etcd cluster
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client-key.pem  # client key generated when building the etcd cluster
#imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: 1.22.4
controlPlaneEndpoint: 192.168.50.240    # keepalived VIP address
networking:
  dnsDomain: cluster.local
  # add podSubnet
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
```
C. Pull the required images (k8s.gcr.io is not directly reachable)
```bash
# List the required images
[root@master01 kubernetes]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.22.4
k8s.gcr.io/kube-controller-manager:v1.22.4
k8s.gcr.io/kube-scheduler:v1.22.4
k8s.gcr.io/kube-proxy:v1.22.4
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4
# Pull them from registry.cn-hangzhou.aliyuncs.com/google_containers, then retag
```
Here is a script for that:
```bash
#!/bin/bash
images=(
  kube-apiserver:v1.22.4
  kube-controller-manager:v1.22.4
  kube-scheduler:v1.22.4
  kube-proxy:v1.22.4
  pause:3.5
  etcd:3.5.0-0
  # coredns/coredns:v1.8.4
)
for imageName in ${images[@]}; do
  docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
  docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName k8s.gcr.io/$imageName
  docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
docker pull coredns/coredns:1.8.4
docker tag coredns/coredns:1.8.4 k8s.gcr.io/coredns/coredns:v1.8.4
docker rmi coredns/coredns:1.8.4
```
Running the script with sh on Ubuntu reports an error:
```bash
root@node27:~# sh a.sh
a.sh: 2: a.sh: Syntax error: "(" unexpected
```
This happens because sh points to dash, which does not support bash arrays. Run `sudo dpkg-reconfigure dash` and choose No (or simply run the script with `bash a.sh`).
D. Run the initialization
```bash
kubeadm init --config=kubeadm-init.yaml
```
E. Configure kubectl
To manage the cluster with kubectl, configure the following:
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
Test that kubectl works. Note that master01 being NotReady at this point is expected, because the flannel network plugin has not been deployed yet.
```bash
[root@master01 kubernetes]# kubectl get node
NAME       STATUS     ROLES                  AGE   VERSION
master01   NotReady   control-plane,master   12h   v1.22.4
```
F. Initialize master02
1) First scp the cluster-wide CA certificates generated on master01 to the other master machine.
```bash
scp -r /etc/kubernetes/pki/* 192.168.50.25:/etc/kubernetes/pki/
```
2) Copy the init configuration file to master02
```bash
scp kubeadm-init.yaml 192.168.50.25:/etc/kubernetes/
```
3) Initialize master02
Edit kubeadm-init.yaml for master02 (the node-specific fields to change are the ones annotated in the reference file above), then run the initialization.
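A minimal sketch of those node-specific edits, assuming master02's IP and hostname from the cluster plan:

```bash
# On master02: point the local API endpoint and node name at this host
sed -i 's/advertiseAddress: 192.168.50.24/advertiseAddress: 192.168.50.25/' /etc/kubernetes/kubeadm-init.yaml
sed -i 's/name: master01/name: master02/' /etc/kubernetes/kubeadm-init.yaml
```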
```bash
kubeadm init --config=kubeadm-init.yaml
```
G. Join the worker nodes to the cluster
Generate the join command on master01:
```bash
kubeadm token create --print-join-command
```
Run the join command directly on the worker node; 192.168.50.240:6443 is the keepalived VIP address.
```bash
kubeadm join 192.168.50.240:6443 --token 6rqvgm.ne3pygx84k7hq1zl --discovery-token-ca-cert-hash sha256:24894b47c94186bd597494c34665efdbbf6686566948c4c58c50a02cc2d74e53
```
H. Test the cluster
```bash
[root@master01 kubernetes]# kubectl get nodes
NAME       STATUS     ROLES                  AGE   VERSION
master01   NotReady   control-plane,master   13h   v1.22.4
master02   NotReady   control-plane,master   12h   v1.22.4
node01     NotReady   <none>                 12h   v1.22.4
```
Component status:
```bash
[root@master01 kubernetes]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
controller-manager   Healthy   ok
etcd-1               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
```
Service accounts:
```bash
[root@master01 kubernetes]# kubectl get serviceaccount
NAME      SECRETS   AGE
default   1         13h
```
Cluster info. Note that the API server address here is exactly the VIP we set up.
```bash
[root@master01 kubernetes]# kubectl cluster-info
Kubernetes control plane is running at https://192.168.50.240:6443
CoreDNS is running at https://192.168.50.240:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
```
I. Install the network plugin
```bash
[root@master01 kubernetes]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
```
J. Check
```bash
kubectl get pods -n kube-system
```
VII. Problems Encountered and Their Solutions
1. Initialization error:
```
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR Swap]: running with swap on is not supported. Please disable swap
```
Solution:
```bash
# 1. Turn off swap
swapoff -a
# 2. Comment out the swap entry in /etc/fstab
vi /etc/fstab
```
2. After kubeadm init --config=kubeadm-init.yaml, the following error appears:
```
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp [::1]:10248: connect: connection refused.
```
Solution (the usual root cause is a cgroup driver mismatch between Docker and the kubelet; switch Docker to the systemd cgroup driver):
```bash
vim /etc/docker/daemon.json
# Add the following, then save
{
  "registry-mirrors": [
    "https://dockerhub.azk8s.cn",
    "https://docker.mirrors.ustc.edu.cn"
  ],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "exec-opts": ["native.cgroupdriver=systemd"],
  "data-root": "/var/lib/docker"
}

systemctl restart docker

cd /etc/kubernetes
# Remove the files produced by the previous failed initialization
rm -rf admin.conf controller-manager.conf kubelet.conf
rm -rf manifests/*
# Initialize again
kubeadm init --config=kubeadm-init.yaml
```
3. After initialization, the coredns pods shown by kubectl get pods -n kube-system do not start
Solution: delete the etcd data directory /data/etcd/data, run kubeadm reset, and rebuild the node; a sketch of the commands follows.
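A sketch of those recovery steps, assuming the etcd data directory configured earlier in etcd.conf. This wipes etcd's data, so only use it on a cluster you are rebuilding from scratch:

```bash
# WARNING: destructive; only for rebuilding a broken, freshly created cluster
systemctl stop etcd                    # on each etcd node
rm -rf /data/etcd/data                 # wipe the etcd data dir from etcd.conf
systemctl start etcd
kubeadm reset -f                       # on the master being rebuilt
kubeadm init --config=kubeadm-init.yaml
```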
4. In a cluster installed with kubeadm, the kube-scheduler and kube-controller-manager component statuses are unhealthy
```bash
$ kubectl get cs
NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}
```
Solution:
The static pod manifests are at /etc/kubernetes/manifests/kube-scheduler.yaml and /etc/kubernetes/manifests/kube-controller-manager.yaml.
In both manifests, remove the --port=0 setting, then restart the kubelet with sudo systemctl restart kubelet.
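A minimal sketch of that edit (the kubelet watches the manifests directory and recreates the static pods on its own):

```bash
# Remove the --port=0 flag so the health-check ports (10251/10252) are served again
sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-scheduler.yaml
sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-controller-manager.yaml
sudo systemctl restart kubelet
```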
```bash
$ kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
```