Component | Version | Binary package download URL | Notes |
---|---|---|---|
centos | 7.8.2003 | http://mirrors.163.com/centos/7.8.2003/isos/x86_64/CentOS-7-x86_64-DVD-2003.iso | Host operating system |
kubernetes | v1.18.8 | https://github.com/kubernetes/kubernetes/releases/download/v1.18.8/kubernetes.tar.gz | Container orchestration and scheduling |
docker-ce | 19.03.9 | https://download.docker.com/linux/static/stable/x86_64/docker-19.03.9.tgz | Container runtime engine |
cni | v0.8.6 | https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz | Container network interface plugins |
flannel | 0.12 | https://github.com/coreos/flannel/releases/download/v0.12.0/flannel-v0.12.0-linux-amd64.tar.gz https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml | vxlan overlay network (Calico can be used instead) |
etcd | v3.4.9 | https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz | Kubernetes data store |
minio | | https://dl.min.io/server/minio/release/linux-amd64/minio | Object storage (not used here) |
cfssl | v1.2.0 | https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 | Certificate issuing tool |
kubernetes-dashboard | v2.0.3 | https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.3/aio/deploy/recommended.yaml | UI |
Metrics-Server / addon-resizer | v0.3.6 / 1.8.11 | https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/metrics-server/resource-reader.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.8/cluster/addons/metrics-server/auth-delegator.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.8/cluster/addons/metrics-server/metrics-apiservice.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.8/cluster/addons/metrics-server/metrics-server-deployment.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.8/cluster/addons/metrics-server/metrics-server-service.yaml https://raw.githubusercontent.com/kubernetes/kubernetes/v1.18.8/cluster/addons/metrics-server/resource-reader.yaml | Monitoring (heapster was deprecated after Kubernetes 1.12) |
Prometheus | | | |
coredns | v1.7.0 | https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh | DNS (replaces kube-dns) |
Pod
A Pod is a group of one or more closely related containers and is a logical concept: the containers in a Pod run on the same host, share the same network namespace, IP address, and port space, can discover and talk to each other via localhost, and share storage volumes. The Pod is the smallest unit that Kubernetes creates, schedules, and manages. A Pod usually contains one business container plus one infrastructure container that provides the shared network.
Replication Controller
A Replication Controller manages Pod replicas (instances). It ensures that the specified number of Pod replicas is running in the cluster at all times: if there are fewer than the desired number it starts new replicas, and if there are more it kills the surplus to keep the count constant. The Replication Controller is also the core mechanism behind elastic scaling and rolling upgrades.
Service
A Service is an abstraction over real application workloads: it defines a logical set of Pods and a policy for accessing them. The Service exposes the proxied Pods as a single access endpoint, so clients do not need to know how the backend Pods run; this greatly simplifies scaling and maintenance and provides a simple service proxy and discovery mechanism.
Label
A Label is a key/value pair used to distinguish Pods, Services, and Replication Controllers; in fact any Kubernetes API object can be identified by Labels. An object can carry multiple Labels, but each key maps to exactly one value. Labels are the foundation on which Services and Replication Controllers work: both associate with Pods through Labels, a very loose coupling compared with a hard binding model (a short example follows the Node definition below).
Node
Kubernetes is a master/worker distributed cluster architecture. A Kubernetes Node (called a Minion in early versions) runs and manages containers. The Node is the unit of operation: Pods (and therefore containers) are assigned and bound to Nodes, and Pods ultimately run on Nodes, so a Node can be regarded as the host of its Pods.
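As a small illustration of the Label-based coupling mentioned above, the hypothetical manifest below (the name nginx-demo and the image tag are examples only, not part of this deployment) labels a Pod with app: nginx-demo and lets a Service select it purely through that Label:
cat > label-demo.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: nginx-demo
  labels:
    app: nginx-demo          # the Label the Service below selects on
spec:
  containers:
  - name: nginx
    image: nginx:1.19
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-demo
spec:
  selector:
    app: nginx-demo          # any Pod carrying this Label becomes a backend of the Service
  ports:
  - port: 80
    targetPort: 80
EOF
kubectl apply -f label-demo.yaml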
Requirements
There are currently two main ways to deploy a production Kubernetes cluster:
kubeadm
Binary packages
Download the release binaries from GitHub and deploy each component by hand to assemble a Kubernetes cluster.
kubeadm lowers the deployment barrier but hides many details, which makes problems hard to troubleshoot. If you want more control, deploying the cluster from binary packages is recommended: it is more work, but along the way you learn how the components fit together, which also helps with later maintenance.
The servers used for the Kubernetes cluster need to satisfy the conditions below.
Overall server plan:
Role | IP | Components |
---|---|---|
k8s-master1 | 10.10.3.139 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd |
k8s-master2 | 10.10.3.174 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd |
k8s-master3 | 10.10.3.185 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd |
k8s-node1 | 10.10.3.179 | kubelet,kube-proxy,docker |
k8s-node2 | 10.10.3.197 | kubelet,kube-proxy,docker |
Load Balancer (Master) | 10.10.3.100, 10.10.3.101 (VIP) | Nginx L4 / haproxy / keepalived / ipvs |
Load Balancer (Backup) | 10.10.3.200 | Nginx L4 / haproxy / keepalived / ipvs |
Note: this high-availability cluster is built in two stages: first deploy a single-Master architecture, then expand it into the multi-Master architecture planned above.
API Server
High availability can be provided by Nginx + Keepalived, HAProxy + Keepalived, or a similar setup.
Hosts
The hosts run CentOS 7.8.2003. Upgrading the kernel to 5.x is optional (I did not upgrade and stayed on the stock 3.10 kernel). kube-proxy runs in iptables mode (an ipvs mode configuration is left commented out in the kube-proxy config; Calico with IPIP mode would be an alternative to flannel). The cluster DNS suffix svc.cluster.local is changed to svc.superred.com, and 10.10.0.1 is listed as the IP the cluster's kubernetes Service resolves to.
Name | IP range | Notes |
---|---|---|
service-cluster-ip-range (CIDR of the Kubernetes Service network) | 10.0.0.0/24 | 254 usable addresses; the Service network segment (what the microservices are exposed on) |
pods-ip / cluster-cidr (clusterCIDR: 10.244.0.0/16 in kube-proxy-config.yml, CIDR format) | 10.244.0.0/16 | 65534 usable addresses; Pod-to-Pod traffic goes over flannel, which expects 10.244.0.0/16, so this is the Pod network segment |
Cluster DNS (same subnet as the Service IPs) | 10.0.0.2 | Used for in-cluster Service name resolution; an IP from the Service range assigned to coredns |
IP of the default service/kubernetes | 10.0.0.1 | Cluster IP the kubernetes Service resolves to (the first IP of the Service range) |
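The same ranges reappear in the component configuration files created later in this guide. Once those files exist, a quick cross-check such as the sketch below (paths follow this guide's /opt/kubernetes/cfg layout) confirms the plan and the configs agree:
# Service range (10.0.0.0/24) in kube-apiserver and kube-controller-manager
grep -h "service-cluster-ip-range" /opt/kubernetes/cfg/kube-apiserver.conf /opt/kubernetes/cfg/kube-controller-manager.conf
# Pod range (10.244.0.0/16) in kube-controller-manager and kube-proxy
grep -h "cluster-cidr" /opt/kubernetes/cfg/kube-controller-manager.conf
grep -h "clusterCIDR" /opt/kubernetes/cfg/kube-proxy-config.yml
# Cluster DNS address (10.0.0.2) handed to kubelet
grep -A1 "clusterDNS" /opt/kubernetes/cfg/kubelet-config.yml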
Single-Master server plan:
Role | IP | Components |
---|---|---|
k8s-master | 10.10.3.139 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd |
k8s-node1 | 10.10.3.179 | kubelet,kube-proxy,docker,etcd |
k8s-node2 | 10.10.3.197 | kubelet,kube-proxy,docker,etcd |
root@k8s-master:~ # ip link
1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 52:54:00:ab:2d:cc brd ff:ff:ff:ff:ff:ff
3: docker0: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:0f:73:13:e7 brd ff:ff:ff:ff:ff:ff
4: flannel.1: mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 86:7b:76:32:21:f5 brd ff:ff:ff:ff:ff:ff
5: cni0: mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 1a:99:c0:9e:af:82 brd ff:ff:ff:ff:ff:ff
12: veth367dff5f@if3: mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
link/ether 16:0e:e6:ae:71:08 brd ff:ff:ff:ff:ff:ff link-netnsid 0
root@k8s-master:~ # cat /sys/class/dmi/id/product_uuid
BA2F3796-00F0-49CD-876F-708CC1DD674E
The following commands must be run on all three hosts.
# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld
# Disable SELinux
sed -i 's/enforcing/disabled/' /etc/selinux/config
setenforce 0
# Disable swap
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab && swapoff -a
# Add the Docker repository
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
Or use the Aliyun mirror:
sudo yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Set the hostname according to the plan (optional)
hostnamectl set-hostname <name>
# Add hosts entries on the master (optional)
cat >> /etc/hosts << EOF
10.10.3.139 k8s-master
10.10.3.179 k8s-node1
10.10.3.197 k8s-node2
EOF
# Load kernel modules
cat > /etc/sysconfig/modules/ipvs.modules <
# Load kernel modules
# Set kernel parameters
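The heredoc above is truncated here; a minimal sketch of what it typically contains is shown below. The module names and sysctl values are common defaults rather than anything mandated by this guide; on kernels 4.19 and newer, nf_conntrack replaces nf_conntrack_ipv4.
cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules
# Kernel parameters commonly required for Kubernetes networking
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system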
Component | Certificates used |
---|---|
etcd | ca.pem, etcd.pem, etcd-key.pem |
flannel | ca.pem, flannel.pem, flannel-key.pem |
kube-apiserver | ca.pem, apiserver.pem, apiserver-key.pem |
kubelet (issued automatically) | ca.pem, ca-key.pem |
kube-proxy | ca.pem, kube-proxy.pem, kube-proxy-key.pem |
kubectl | ca.pem, admin.pem, admin-key.pem |
Etcd is a distributed key-value store and Kubernetes uses it for all of its data, so an etcd database has to be prepared first. To avoid a single point of failure, etcd should be deployed as a cluster: a three-member cluster tolerates the failure of one machine; a five-member cluster tolerates two.
Node name | IP |
---|---|
etcd-1 (k8s-master) | 10.10.3.139 |
etcd-2 (k8s-node1) | 10.10.3.179 |
etcd-3 (k8s-node2) | 10.10.3.197 |
Note: to save machines, etcd is co-located with the Kubernetes nodes here. It can also be deployed outside the Kubernetes cluster, as long as the apiserver can reach it.
1. Understanding certificates in Kubernetes
1) Ways to obtain a certificate:
Third-party CA certificate: a paid certificate issued by a CA, with authenticity validation. Self-signed certificate: a certificate you generate yourself; it is not trusted by browsers and is meant for internal use only.
2) Tools for generating certificates:
Easy-RSA: a simple shell-based CA utility.
OpenSSL: an open-source SSL/TLS toolkit that provides a general, robust, full-featured suite for implementing the SSL/TLS protocols.
CFSSL: CloudFlare's open-source PKI/TLS toolkit, written in Go. It includes a command-line tool and an HTTP API service for signing, verifying, and bundling TLS certificates.
3) Certificates in a Kubernetes cluster fall into three categories:
Client certificates, server certificates, and peer certificates (mutual TLS).
In detail, the following certificate sets are needed:
1. Etcd certificates
2. kube-apiserver certificates
3. kube-scheduler certificates
4. kube-controller-manager certificates
5. kube-proxy certificates
6. service account certificates
7. admin certificates
8. kubelet certificates
Note: every kubelet certificate is unique; it is issued individually to each kubelet via TLS bootstrapping.
Here we use cfssl to issue self-signed private certificates and get the Kubernetes cluster running first; authentication, authorization, and admission control are covered in detail later.
cfssl is an open-source certificate management tool that generates certificates from JSON files and is easier to use than openssl. The following runs on the Master node.
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl_linux-amd64 cfssljson_linux-amd64 cfssl-certinfo_linux-amd64
mv cfssl_linux-amd64 /usr/local/bin/cfssl
mv cfssljson_linux-amd64 /usr/local/bin/cfssljson
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
# Create the working directories
mkdir -p ~/TLS/{etcd,k8s}
cd ~/TLS/etcd
1) Create the CA configuration file ca-config.json
# Self-signed CA
cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
"ca-config.json":可以定义多个 profiles,分别指定不同的过期时间、使用场景等参数;后续在签名证书时使用某个 profile;
"expiry": "87600h" 证书10年
signing:表示该证书可用于签名其它证书;生成的 ca.pem 证书中 CA=TRUE;
server auth:表示该CA对server提供的证书进行验证;
client auth:表示该CA对client提供的证书进行验证。
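Once ca.pem has been generated in step 3) below, the profile settings (for example the 10-year expiry) can be verified with the cfssl-certinfo tool installed earlier; a quick sketch:
# Dump the CA certificate; the not_after field should be roughly ten years out
cfssl-certinfo -cert ca.pem
# The same command works for any certificate issued later, e.g. cfssl-certinfo -cert etcd.pem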
2) Create the CA signing request file ca-csr.json. CN is the user name (kubernetes) and O is the group (k8s); etcd and Kubernetes share one CA root certificate here.
# Self-signed CA
cat > ca-csr.json << EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
# Self-signed CA (alternative dedicated etcd CA, kept commented out)
#cat > ca-csr.json << EOF
#{
# "CN": "etcd CA",
# "key": {
# "algo": "rsa",
# "size": 2048
# },
# "names": [
# {
# "C": "CN",
# "L": "Beijing",
# "ST": "Beijing"
# }
# ]
#}
#EOF
Field | Description |
---|---|
Common Name | Abbreviation: CN field. For an SSL certificate this is usually the site domain name; for a code-signing certificate, the applicant organization; for a client certificate, the applicant's name. |
Organization Name | Abbreviation: O field. For an SSL certificate this is usually the site domain name; for a code-signing certificate, the applicant organization; for a client organization certificate, the organization the applicant belongs to. |
Locality | Abbreviation: L field (city). |
State/Province | Abbreviation: ST field. |
Country | Abbreviation: C field; two-letter country code only, e.g. CN for China. |
Email | Abbreviation: E field. |
Given Name | Abbreviation: G field. |
Description | Description field. |
Phone | Phone field, format: +country code, city code, number, e.g. +86 732 88888888. |
Street address | STREET field. |
Postal code | PostalCode field. |
Organizational Unit | Abbreviation: OU field. |
3) Generate the CA certificate (ca.pem) and private key (ca-key.pem)
# Generate the certificate
root@k8s-master:~/TLS # cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
2020/08/27 14:25:14 [INFO] generating a new CA key and certificate from CSR
2020/08/27 14:25:14 [INFO] generate received request
2020/08/27 14:25:14 [INFO] received CSR
2020/08/27 14:25:14 [INFO] generating key: rsa-2048
2020/08/27 14:25:14 [INFO] encoded CSR
2020/08/27 14:25:14 [INFO] signed certificate with serial number 725281651325617718681661325863333735086520188695
4) Inspect the results
[root@k8s ssl]# ls
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
[root@k8s ssl]# openssl x509 -in ca.pem -noout -text
2.2.2 Use the self-signed CA to issue the etcd HTTPS certificate: create the etcd TLS certificate signing request (etcd-csr.json)
cat > etcd-csr.json <
"CN": Common Name. etcd extracts this field from the certificate as the request's user name; browsers use it to check whether a site is legitimate.
"O": Organization. etcd extracts this field as the group the requesting user belongs to. These two fields matter later when RBAC is enabled in Kubernetes, because roles such as kubelet and admin need the right identities in their certificates; this is explained in the Kubernetes deployment sections. For etcd itself they carry no special meaning, so just follow the values shown.
You can use openssl to check whether the issued certificates are correct.
If the hosts field is not empty, it must list the IPs or domain names authorized to use the certificate. Since this certificate is used by the etcd cluster, listing the member IPs is enough; for a single-node etcd only that node's IP would be needed.
Inspecting the key and certificates:
openssl rsa -in ca-key.pem -noout -text
openssl x509 -noout -text -in ca.pem
openssl req -noout -text -in ca.csr
Note: the hosts field must contain the internal communication IPs of all etcd nodes, without exception. To make future expansion easier you can also list a few spare IPs.
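The etcd-csr.json heredoc above is truncated here; a sketch of its usual content, with the hosts list filled in from the etcd plan in this guide (add spare IPs as the note suggests), looks like this:
cat > etcd-csr.json << EOF
{
    "CN": "etcd",
    "hosts": [
        "10.10.3.139",
        "10.10.3.179",
        "10.10.3.197"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF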
# Generate the certificate
root@k8s-master:~/TLS # cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd 1 ↵
2020/08/27 14:28:46 [INFO] generate received request
2020/08/27 14:28:46 [INFO] received CSR
2020/08/27 14:28:46 [INFO] generating key: rsa-2048
2020/08/27 14:28:47 [INFO] encoded CSR
2020/08/27 14:28:47 [INFO] signed certificate with serial number 316290753253162786864679339793041040918888860541
2020/08/27 14:28:47 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
Result:
[root@k8s ssl]# ls
ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem etcd.csr etcd-csr.json etcd-key.pem etcd.pem
# Download URL
# As noted upstream, etcd 3.4 and later no longer serve the old v2 API by default
wget https://github.com/etcd-io/etcd/releases/download/v3.4.9/etcd-v3.4.9-linux-amd64.tar.gz
Work on the master node; to simplify things, all files generated on the master can later be copied to the two node machines.
mkdir /opt/etcd/{bin,cfg,ssl} -p
tar zxvf etcd-v3.4.9-linux-amd64.tar.gz
chown -R root:root etcd-v3.4.9-linux-amd64
mv etcd-v3.4.9-linux-amd64/{etcd,etcdctl} /opt/etcd/bin/
Single node
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379,https://0.0.0.0:4001"
ETCD_NAME="master"
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.3.127:2379,https://10.10.3.127:4001"
Cluster
cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.10.3.139:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.3.139:2379"
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.3.139:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.10.3.139:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://10.10.3.139:2380,etcd-2=https://10.10.3.179:2380,etcd-3=https://10.10.3.197:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
flannel currently cannot talk to the etcd v3 API directly. If you deploy the flannel binary for networking, etcd must have the v2 API enabled: add the --enable-v2 flag to the etcd startup arguments and restart etcd.
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
#ExecStart=/opt/etcd/bin/etcd --enable-v2 \\
ExecStart=/opt/etcd/bin/etcd \\
--cert-file=/opt/etcd/ssl/etcd.pem \\
--key-file=/opt/etcd/ssl/etcd-key.pem \\
--peer-cert-file=/opt/etcd/ssl/etcd.pem \\
--peer-key-file=/opt/etcd/ssl/etcd-key.pem \\
--trusted-ca-file=/opt/etcd/ssl/ca.pem \\
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \\
--logger=zap
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
2. To secure communication, etcd must be given its own key pair (cert-file and key-file), the peer key pair and CA certificate for member-to-member traffic (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the CA certificate used to verify clients (trusted-ca-file).
The flags that add the etcd certificates to the configuration are:
--cert-file=/etc/etcd/etcdSSL/etcd.pem \
--key-file=/etc/etcd/etcdSSL/etcd-key.pem \
--trusted-ca-file=/etc/etcd/etcdSSL/ca.pem \
--peer-cert-file=/etc/etcd/etcdSSL/etcd.pem \
--peer-key-file=/etc/etcd/etcdSSL/etcd-key.pem \
--peer-trusted-ca-file=/etc/etcd/etcdSSL/ca.pem \
Copy the certificates generated earlier to the paths referenced in the configuration:
cp ~/TLS/etcd/ca*pem ~/TLS/etcd/etcd*pem /opt/etcd/ssl/
systemctl daemon-reload
systemctl restart etcd
systemctl enable etcd
# If etcd fails to start here, that is expected: it will only start successfully once the other etcd members are configured as well
root@k8s-master:~/TLS # scp -rp /opt/etcd 10.10.3.179:/opt/
root@k8s-master:~/TLS # scp -rp /opt/etcd 10.10.3.197:/opt/
root@k8s-master:~/TLS # scp -rp /usr/lib/systemd/system/etcd.service 10.10.3.179:/usr/lib/systemd/system/
etcd.service
root@k8s-master:~/TLS # scp -rp /usr/lib/systemd/system/etcd.service 10.10.3.197:/usr/lib/systemd/system/
[root@localhost etcd]# cat cfg/etcd.conf
#[Member]
ETCD_NAME="etcd-2" #修改此处,节点2改为etcd-2,节点3改为etcd-3
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.10.3.179:2380" # 修改此处为当前服务器IP
ETCD_LISTEN_CLIENT_URLS="https://10.10.3.179:2379" # 修改此处为当前服务器IP
#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.3.179:2380" # 修改此处为当前服务器IP
ETCD_ADVERTISE_CLIENT_URLS="https://10.10.3.179:2379" # 修改此处为当前服务器IP
ETCD_INITIAL_CLUSTER="etcd-1=https://10.10.3.139:2380,etcd-2=https://10.10.3.179:2380,etcd-3=https://10.10.3.197:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
# Finally, start etcd and enable it at boot
systemctl daemon-reload
systemctl restart etcd
systemctl enable etcd
[root@localhost etcd]# systemctl status -ll etcd
● etcd.service - Etcd Server
Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 14:49:59 CST; 13s ago
Main PID: 17518 (etcd)
Tasks: 20
Memory: 33.8M
CGroup: /system.slice/etcd.service
└─17518 /opt/etcd/bin/etcd --cert-file=/opt/etcd/ssl/etcd.pem --key-file=/opt/etcd/ssl/etcd-key.pem --peer-cert-file=/opt/etcd/ssl/etcd.pem --peer-key-file=/opt/etcd/ssl/etcd-key.pem --trusted-ca-file=/opt/etcd/ssl/ca.pem --peer-trusted-ca-file=/opt/etcd/ssl/ca.pem --logger=zap
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.815+0800","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"627ba52be9b69b8c","rtt":"0s","error":"dial tcp 10.10.3.179:2380: connect: connection refused"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:250","msg":"set message encoder","from":"a03c2a74d8b05a51","to":"a03c2a74d8b05a51","stream-type":"stream Message"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/peer_status.go:51","msg":"peer became active","peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:266","msg":"closed TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:277","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:250","msg":"set message encoder","from":"a03c2a74d8b05a51","to":"a03c2a74d8b05a51","stream-type":"stream MsgApp v2"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:266","msg":"closed TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"warn","ts":"2020-08-27T14:50:04.835+0800","caller":"rafthttp/stream.go:277","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream MsgApp v2","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.838+0800","caller":"rafthttp/stream.go:425","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
Aug 27 14:50:04 localhost.localdomain etcd[17518]: {"level":"info","ts":"2020-08-27T14:50:04.839+0800","caller":"rafthttp/stream.go:425","msg":"established TCP streaming connection with remote peer","stream-reader-type":"stream Message","local-member-id":"a03c2a74d8b05a51","remote-peer-id":"627ba52be9b69b8c"}
[root@localhost etcd]# netstat -lntup|grep etcd
tcp 0 0 10.10.3.197:2379 0.0.0.0:* LISTEN 17518/etcd
tcp 0 0 10.10.3.197:2380 0.0.0.0:* LISTEN 17518/etcd
# On all three nodes, put etcdctl on the PATH
cp /opt/etcd/bin/etcdctl /usr/bin/
root@k8s-master:~/TLS # etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem --endpoints="https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379" endpoint health
https://10.10.3.197:2379 is healthy: successfully committed proposal: took = 22.070915ms
https://10.10.3.139:2379 is healthy: successfully committed proposal: took = 25.085954ms
https://10.10.3.179:2379 is healthy: successfully committed proposal: took = 25.915676ms
If you see output like the above, the etcd cluster has been deployed successfully. If there is a problem, check the logs first: /var/log/messages
or journalctl -u etcd
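Besides endpoint health, listing the members is a quick way to confirm that all three nodes actually joined the cluster (same certificate flags as above); a sketch:
etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/etcd.pem --key=/opt/etcd/ssl/etcd-key.pem \
  --endpoints="https://10.10.3.139:2379" member list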
Docker can be installed here either with yum or from the binary package; both work.
yum installation
# 1. Remove old versions
yum remove docker \
docker-client \
docker-client-latest \
docker-common \
docker-latest \
docker-latest-logrotate \
docker-logrotate \
docker-engine
# 2. Install via the repository
yum install -y yum-utils
# 3. Set the repository mirror (switch to a domestic mirror)
yum-config-manager \
--add-repo \
http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# 4. Refresh the package index
yum makecache fast
# 5. Install the latest Docker packages (docker-ce: community edition; ee: enterprise edition)
yum install docker-ce docker-ce-cli containerd.io -y
# 6. Or install a specific Docker version (list the available versions first)
yum list docker-ce --showduplicates | sort -r
yum install docker-ce-19.03.9 docker-ce-cli-19.03.9 containerd.io
containerd.io: the container runtime, packaged separately from the OS
docker-ce-cli: the client (CLI)
docker-ce: the Docker daemon (main backend program)
# Configure the forwarding firewall rule on all nodes so it takes effect
vim /lib/systemd/system/docker.service
[Service]
#ExecStartPost=/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
ExecStartPost=/sbin/iptables -P FORWARD ACCEPT
# 7. Start docker
systemctl enable docker
systemctl restart docker
# 8. Check the version
[root@k8s-master ~]# docker --version
Docker version 19.03.9
# 9. Configure a Docker registry mirror (accelerator)
## Mirror options: Aliyun accelerator, DaoCloud accelerator, USTC accelerator
## Docker's official China mirror: https://registry.docker-cn.com
mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
  "registry-mirrors": ["https://registry.docker-cn.com"],
  "insecure-registries": ["10.10.3.104", "harbor.superred.com"]
}
EOF
systemctl daemon-reload
systemctl restart docker
Binary installation
Download URL: https://download.docker.com/linux/static/stable/x86_64/docker-19.03.9.tgz
Unpack the binary package
tar zxvf docker-19.03.9.tgz
mv docker/* /usr/bin
Manage docker with systemd
cat > /usr/lib/systemd/system/docker.service << EOF
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/bin/dockerd
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=0
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s
[Install]
WantedBy=multi-user.target
EOF
Create the configuration file
mkdir /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "registry-mirrors": ["https://registry.docker-cn.com"],
  "insecure-registries": ["10.10.3.104", "harbor.superred.com"]
}
EOF
Start docker and enable it at boot
systemctl daemon-reload
systemctl start docker
systemctl enable docker
kube-apiserver,
kube-controller-manager,
kube-scheduler
cd /root/TLS/k8s
cat > ca-config.json << EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"kubernetes": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF
cat > ca-csr.json << EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "Beijing",
"ST": "Beijing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
root@k8s-master:~/TLS/k8s # cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
2020/08/27 15:05:30 [INFO] generating a new CA key and certificate from CSR
2020/08/27 15:05:30 [INFO] generate received request
2020/08/27 15:05:30 [INFO] received CSR
2020/08/27 15:05:30 [INFO] generating key: rsa-2048
2020/08/27 15:05:31 [INFO] encoded CSR
2020/08/27 15:05:31 [INFO] signed certificate with serial number 543074556734204684883324478113935421961362062670
root@k8s-master:~/TLS/k8s # ll *pem
-rw------- 1 root root 1.7K Aug 27 15:05 ca-key.pem
-rw-r--r-- 1 root root 1.4K Aug 27 15:05 ca.pem
cat > apiserver-csr.json << EOF
{
"CN": "kubernetes",
"hosts": [
"10.244.0.0",
"10.244.0.1",
"10.0.0.0",
"10.0.0.1",
"10.0.0.2",
"127.0.0.1",
"10.10.3.100",
"10.10.3.101",
"10.10.3.200",
"10.10.3.139",
"10.10.3.179",
"10.10.3.197",
"10.10.3.174",
"10.10.3.185",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.superred",
"kubernetes.default.svc.superred.com"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
# The hosts field above must contain every Master/LB/VIP IP, without exception. To make future expansion easier you can list a few spare IPs.
# Generate the certificate
root@k8s-master:~/TLS/k8s # ls 1 ↵
apiserver-csr.json ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem
root@k8s-master:~/TLS/k8s # cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes apiserver-csr.json | cfssljson -bare apiserver
2020/08/27 15:12:39 [INFO] generate received request
2020/08/27 15:12:39 [INFO] received CSR
2020/08/27 15:12:39 [INFO] generating key: rsa-2048
2020/08/27 15:12:39 [INFO] encoded CSR
2020/08/27 15:12:39 [INFO] signed certificate with serial number 113880470106589986816895015180975886983725249312
2020/08/27 15:12:39 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
root@k8s-master:~/TLS/k8s # ll apiserver*pem
-rw------- 1 root root 1.7K Aug 27 15:12 apiserver-key.pem
-rw-r--r-- 1 root root 1.7K Aug 27 15:12 apiserver.pem
4.2 Download the binary package and unpack it: https://github.com/kubernetes/kubernetes/releases/download/v1.18.8/kubernetes.tar.gz
mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}
mkdir -p /opt/kubernetes/logs/{kubelet,kube-proxy,kube-scheduler,kube-apiserver,kube-controller-manager}
1. cd kubernetes/cluster
2. ./get-kube-binaries.sh downloads the binaries to kubernetes/server/kubernetes-server-linux-amd64.tar.gz
2.1) If the download is slow, open https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG, pick the matching release, and download kubernetes-server-linux-amd64.tar.gz from the Server Binaries table (verify the sha512 hash).
3. tar zxvf kubernetes-server-linux-amd64.tar.gz
root@k8s-master:~/work/k8s-1.18.8/kubernetes/server/kubernetes/server/bin # ls
apiextensions-apiserver kube-apiserver.docker_tag kube-controller-manager.docker_tag kubelet kube-proxy.tar kube-scheduler.tar
kubeadm kube-apiserver.tar kube-controller-manager.tar kube-proxy kube-scheduler mounter
kube-apiserver kube-controller-manager kubectl kube-proxy.docker_tag kube-scheduler.docker_tag
4. cp kube-apiserver kube-controller-manager kubectl kubelet kube-proxy kube-scheduler /opt/kubernetes/bin
cat > /opt/kubernetes/cfg/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs/kube-apiserver \\
--etcd-servers=https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379 \\
--bind-address=10.10.3.139 \\
--secure-port=6443 \\
--advertise-address=10.10.3.139 \\
--allow-privileged=true \\
--service-cluster-ip-range=10.0.0.0/24 \\
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,NodeRestriction \\
#--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \\
--authorization-mode=RBAC,Node \\
#--kubelet-https=true \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=/opt/kubernetes/cfg/token.csv \\
--service-node-port-range=1024-65535 \\
--kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem \\
--kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem \\
--tls-cert-file=/opt/kubernetes/ssl/apiserver.pem \\
--tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem \\
--client-ca-file=/opt/kubernetes/ssl/ca.pem \\
--service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--etcd-cafile=/opt/etcd/ssl/ca.pem \\
--etcd-certfile=/opt/etcd/ssl/etcd.pem \\
--etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \\
#--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \\
#--requestheader-extra-headers-prefix=X-Remote-Extra- \\
#--requestheader-group-headers=X-Remote-Group \\
#--requestheader-username-headers=X-Remote-User \\
#--runtime-config=api/all=true \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log"
EOF
Note: of the two backslashes above, the first is an escape character and the second is the line-continuation character; the escape is needed so that the heredoc (EOF) preserves the continuation backslash in the generated file. Also remove the commented-out (#) lines from inside KUBE_APISERVER_OPTS before using the file, otherwise systemd passes them to kube-apiserver as literal arguments and it fails to start.
--logtostderr: log to stderr instead of files (false sends logs to --log-dir)
--log-dir: log directory
--v: log verbosity (lower means less output)
--etcd-servers: etcd cluster addresses
--bind-address: listen address
--secure-port: https secure port
--advertise-address: address advertised to the cluster
--allow-privileged: allow privileged containers
--service-cluster-ip-range: Service virtual IP range
--enable-admission-plugins: admission control plugins, enabling advanced Kubernetes features
--authorization-mode: authorization modes; enables RBAC and Node self-management
--enable-bootstrap-token-auth: enable the TLS bootstrap mechanism
--token-auth-file: bootstrap token file
--service-node-port-range: default port range for NodePort Services
--kubelet-https: the apiserver uses https when contacting kubelets (default)
--kubelet-client-xxx: client certificates the apiserver uses to talk to kubelets
--tls-xxx-file: apiserver https certificates
--etcd-xxxfile: certificates for connecting to the etcd cluster
--audit-log-xxx: audit log settings
Enabling the TLS Bootstrapping mechanism
TLS Bootstrapping: once the apiserver has TLS authentication enabled, kubelet and kube-proxy on the Nodes must present valid CA-signed certificates to talk to it. Issuing client certificates by hand becomes a lot of work when there are many Nodes and complicates scaling the cluster. To simplify this, Kubernetes introduced TLS bootstrapping, which issues client certificates automatically: the kubelet connects to the apiserver as a low-privileged user and requests a certificate, which the apiserver signs dynamically. This approach is strongly recommended on Nodes. It is currently used mainly for kubelet; kube-proxy still gets a certificate that we issue centrally.
root@k8s-master:/etc/kubernetes/back/1.18config # cp ~/TLS/k8s/ca*pem ~/TLS/k8s/apiserver*pem /opt/kubernetes/ssl/
root@k8s-master:/etc/kubernetes/back/1.18config # ls /opt/kubernetes/ssl
apiserver-key.pem apiserver.pem ca-key.pem ca.pem
# Command to generate a random 16-byte token
head -c 16 /dev/urandom | od -An -t x | tr -d ' '
6f17a4f29f87488583583ca622b54572
# Create the token file (format: token,user name,UID,user group)
BOOTSTRAP_TOKEN=863b2ebebecffbb3a6493ff15dfc57c6
cat > /opt/kubernetes/cfg/token.csv <
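The token.csv heredoc above is truncated here; based on the format just described (token,user name,UID,user group) and the file contents shown later in this guide, it is typically written like this (a sketch):
cat > /opt/kubernetes/cfg/token.csv << EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:bootstrappers"
EOF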
Allow the kubelet-bootstrap user to create CSRs on first startup, and add the RBAC authorization rules
# Bind kubelet-bootstrap user to system cluster roles.
[root@localhost cfg]# kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
[root@localhost cfg]# kubectl create clusterrolebinding kubelet-nodes --clusterrole=system:node --group=system:nodes
cat > /usr/lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-apiserver.conf
ExecStart=/opt/kubernetes/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl restart kube-apiserver
systemctl enable kube-apiserver
root@localhost:/opt/kubernetes/cfg # systemctl status -ll kube-apiserver 1 ↵
● kube-apiserver.service - Kubernetes API Server
Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; disabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 16:06:39 CST; 25s ago
Docs: https://github.com/kubernetes/kubernetes
Main PID: 3785 (kube-apiserver)
Tasks: 22
Memory: 304.5M
CGroup: /system.slice/kube-apiserver.service
└─3785 /opt/kubernetes/bin/kube-apiserver --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --etcd-servers=https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379 --bind-address=10.10.3.139 --secure-port=6443 --advertise-address=10.10.3.139 --allow-privileged=true --service-cluster-ip-range=10.10.0.0/24 --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction --authorization-mode=RBAC,Node --enable-bootstrap-token-auth=true --token-auth-file=/opt/kubernetes/cfg/token.csv --service-node-port-range=1024-65535 --kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem --kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem --tls-cert-file=/opt/kubernetes/ssl/apiserver.pem --tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem --client-ca-file=/opt/kubernetes/ssl/ca.pem --service-account-key-file=/opt/kubernetes/ssl/ca-key.pem --etcd-cafile=/opt/etcd/ssl/ca.pem --etcd-certfile=/opt/etcd/ssl/etcd.pem --etcd-keyfile=/opt/etcd/ssl/etcd-key.pem --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100 --audit-log-path=/opt/kubernetes/logs/k8s-audit.log
Aug 27 16:06:39 localhost.localdomain systemd[1]: Started Kubernetes API Server
cat > /opt/kubernetes/cfg/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs/kube-controller-manager \\
--leader-elect=true \\
--master=127.0.0.1:8080 \\
--bind-address=0.0.0.0 \\
--cluster-name=kubernetes \\
--allocate-node-cidrs=true \\
--cluster-cidr=10.244.0.0/16 \\
--service-cluster-ip-range=10.0.0.0/24 \\
--cluster-signing-cert-file=/opt/kubernetes/ssl/ca.pem \\
--cluster-signing-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--root-ca-file=/opt/kubernetes/ssl/ca.pem \\
#--feature-gates=RotateKubeletServerCertificate=true \\
#--feature-gates=RotateKubeletClientCertificate=true \\
#--allocate-node-cidrs=true \\
--service-account-private-key-file=/opt/kubernetes/ssl/ca-key.pem \\
--experimental-cluster-signing-duration=87600h0m0s"
EOF
cat > /usr/lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-controller-manager.conf
ExecStart=/opt/kubernetes/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl restart kube-controller-manager
systemctl enable kube-controller-manager
cat > /opt/kubernetes/cfg/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs/kube-scheduler \\
--leader-elect \\
--master=127.0.0.1:8080 \\
--bind-address=0.0.0.0"
EOF
cat > /usr/lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-scheduler.conf
ExecStart=/opt/kubernetes/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
[root@localhost cfg]# systemctl daemon-reload
[root@localhost cfg]# systemctl enable kube-scheduler --now
All components have started; use kubectl to check the current status of the cluster components
root@localhost:/opt/kubernetes/cfg # kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
Output like the above means the Master components are running normally.
(CSR requests from the nodes are approved manually later on.)
If you want the Master to double as a Node, you can also install kubelet and kube-proxy on the Master.
# 1. Create the working directories on all node machines
mkdir -p /opt/kubernetes/{bin,cfg,ssl,logs}
mkdir -p /opt/kubernetes/logs/{kubelet,kube-proxy,kube-scheduler,kube-apiserver,kube-controller-manager}
# 2. Copy the binaries from the kubernetes package unpacked on the master to all node machines
for ip in 179 197;do scp -rp ./kubernetes/server/bin/{kubelet,kube-proxy} 10.10.3.$ip:/opt/kubernetes/bin/;done
Change the Pod infrastructure ("pause") image to a domestically reachable mirror or to your own harbor, e.g.:
harbor.superred.com/kubernetes/pause-amd64:3.0
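If you mirror the pause image into your own harbor, the usual pull/tag/push sequence looks like the sketch below; the source registry here is only an assumption, use whichever mirror is reachable from your network:
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.0
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.0 harbor.superred.com/kubernetes/pause-amd64:3.0
docker push harbor.superred.com/kubernetes/pause-amd64:3.0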
If flannel is deployed from the binary (without CNI), remove the --network-plugin=cni flag, otherwise kubelet reports a CNI error at startup; alternatively install the CNI network plugins as described in section 6.3.2 (installing cni-plugin).
# node1
cat > /opt/kubernetes/cfg/kubelet.conf << EOF
KUBELET_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--hostname-override=10.10.3.179 \\
--network-plugin=cni \\
--cni-conf-dir=/etc/cni/net.d \\
--cni-bin-dir=/opt/cni/bin \\
#--system-reserved=memory=300Mi \\
#--kube-reserved=memory=400Mi \\
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \\
--config=/opt/kubernetes/cfg/kubelet-config.yml \\
--cert-dir=/opt/kubernetes/ssl \\
--pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0"
EOF
# The node1 and node2 configuration is identical
cat > /opt/kubernetes/cfg/kubelet-config.yml << EOF
kind: KubeletConfiguration # object kind
apiVersion: kubelet.config.k8s.io/v1beta1 # API version
address: 0.0.0.0 # listen address
port: 10250 # port of this kubelet
readOnlyPort: 10255 # read-only port exposed by kubelet
cgroupDriver: cgroupfs # cgroup driver; must match the driver reported by docker info
clusterDNS:
- 10.0.0.2
clusterDomain: superred.com # cluster domain
failSwapOn: false # do not fail to start when swap is on (swap was disabled earlier)
# Authentication
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /opt/kubernetes/ssl/ca.pem
# Authorization
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
# Node resource reservation (eviction thresholds)
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
#evictionPressureTransitionPeriod: 5m0s
# Image garbage-collection policy
#imageGCHighThresholdPercent: 85
#imageGCLowThresholdPercent: 80
#imageMinimumGCAge: 2m0s
# Certificate rotation
#rotateCertificates: true # rotate the kubelet client certificate
#featureGates:
#  RotateKubeletServerCertificate: true
#  RotateKubeletClientCertificate: true
maxOpenFiles: 1000000
maxPods: 110
EOF
cgroupDriver: cgroupfs must match the driver shown by docker info.
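A quick way to check which driver docker is actually using before filling in cgroupDriver (a sketch; either command works on Docker 19.03):
docker info | grep -i "cgroup driver"
# or
docker info --format '{{.CgroupDriver}}'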
Copy the CA certificate the node machines need from the k8s-master node
On k8s-master:
root@localhost:/opt/kubernetes/cfg # scp /opt/kubernetes/ssl/ca.pem 10.10.3.179:/opt/kubernetes/ssl 130 ↵
ca.pem 100% 1359 656.7KB/s 00:00
root@localhost:/opt/kubernetes/cfg # scp /opt/kubernetes/ssl/ca.pem 10.10.3.197:/opt/kubernetes/ssl
Check the random token value on k8s-master
[root@k8s-master ~]# cat /opt/kubernetes/cfg/token.csv
6f17a4f29f87488583583ca622b54572,kubelet-bootstrap,10001,"system:bootstrappers"
Generate the bootstrap.kubeconfig file on k8s-master
KUBE_APISERVER="https://10.10.3.139:6443" # apiserver IP:PORT
TOKEN="6f17a4f29f87488583583ca622b54572" # must match the value in token.csv
# Set cluster parameters
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=bootstrap.kubeconfig
# Set client authentication parameters
kubectl config set-credentials "kubelet-bootstrap" \
--token=${TOKEN} \
--kubeconfig=bootstrap.kubeconfig
# Set context parameters
kubectl config set-context default \
--cluster=kubernetes \
--user="kubelet-bootstrap" \
--kubeconfig=bootstrap.kubeconfig
# Set the default context
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
# Save to the configuration directory
cp bootstrap.kubeconfig /opt/kubernetes/cfg/
# Copy to /opt/kubernetes/cfg/ on the node machines
scp -rp /opt/kubernetes/cfg/bootstrap.kubeconfig 10.10.3.179:/opt/kubernetes/cfg
scp -rp /opt/kubernetes/cfg/bootstrap.kubeconfig 10.10.3.197:/opt/kubernetes/cfg
# The node1 and node2 configuration is identical
cat > /usr/lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
After=docker.service
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kubelet.conf
ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
5.2.5 Start kubelet and enable it at boot
systemctl daemon-reload
systemctl restart kubelet
systemctl enable kubelet
[root@localhost cfg]# systemctl status -ll kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 17:02:54 CST; 7s ago
Main PID: 22236 (kubelet)
CGroup: /system.slice/kubelet.service
└─22236 /opt/kubernetes/bin/kubelet --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --hostname-override=10.10.3.179 --network-plugin=cni --kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig --bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig --config=/opt/kubernetes/cfg/kubelet-config.yml --cert-dir=/opt/kubernetes/ssl --pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0
Aug 27 17:02:54 localhost.localdomain systemd[1]: Started Kubernetes Kubelet.
1) Manual approval (run after starting kubelet on the node). After kubelet starts for the first time, if the token is valid and the RBAC binding from section 4.3.3 is in place (kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap), a CSR issued by the kubelet appears in the cluster; Kubernetes only joins the Node to the cluster after this CSR is approved. View the pending CSR requests:
[root@localhost ~]# kubectl get csr
NAME AGE REQUESTOR CONDITION
node-csr--k3G2G1EoM4h9w1FuJRjJjfbIPNxa551A8TZfW9dG-g 2m kubelet-bootstrap Pending
$ kubectl get nodes
No resources found.
Approve the CSR request:
[root@localhost ~]# kubectl certificate approve node-csr--k3G2G1EoM4h9w1FuJRjJjfbIPNxa551A8TZfW9dG-g
certificatesigningrequest "node-csr--k3G2G1EoM4h9w1FuJRjJjfbIPNxa551A8TZfW9dG-g" approved
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
172.20.95.174 Ready 48s v1.10.0
On the node, the kubelet kubeconfig file and key pair are generated automatically:
[root@localhost ~]# ls /opt/kubernetes/ssl/
ca.pem kubelet-client-2021-02-08-10-03-17.pem kubelet-client-current.pem kubelet.crt kubelet.key
Once the certificate is issued, the kubelet on the target node writes it into the directory given by --cert-dir=. With no other settings, four files should appear. The certificate kubelet uses to talk to the apiserver is kubelet-client-current.pem; the remaining kubelet.crt is used to authenticate connections to the kubelet server (port 10250). Note that kubelet.crt is a self-signed certificate from a CA independent of the apiserver CA, and the kubelet regenerates it if it is deleted.
Subject: O = system:nodes, CN = system:node:10.10.3.167
[root@localhost ssl]# openssl x509 -in kubelet-client-2021-02-08-10-03-17.pem -noout -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
f2:96:0a:41:b1:10:97:85:95:68:5f:74:1f:bd:b6:76
Signature Algorithm: sha256WithRSAEncryption
Issuer: C = CN, ST = Beijing, L = Beijing, O = k8s, OU = System, CN = kubernetes
Validity
Not Before: Feb 8 01:58:17 2021 GMT
Not After : Feb 7 01:11:00 2026 GMT
Subject: O = system:nodes, CN = system:node:10.10.3.167
Subject Public Key Info:
Public Key Algorithm: id-ecPublicKey
Public-Key: (256 bit)
pub:
04:a7:77:cb:99:32:56:14:63:a1:e6:4e:f3:3b:21:
96:aa:ef:7e:92:8d:cf:89:8b:92:97:f1:43:72:0b:
72:ae:aa:f0:46:15:ea:de:13:0c:7f:92:24:23:0a:
3d:5b:5d:ce:ac:85:04:45:ea:50:36:1a:bf:45:8e:
b3:da:42:51:92
ASN1 OID: prime256v1
NIST CURVE: P-256
X509v3 extensions:
X509v3 Key Usage: critical
Digital Signature, Key Encipherment
X509v3 Extended Key Usage:
TLS Web Client Authentication
X509v3 Basic Constraints: critical
CA:FALSE
X509v3 Authority Key Identifier:
keyid:5F:C4:6E:67:76:7E:F7:15:E1:2A:50:08:90:77:0C:39:5D:02:AE:C9
Signature Algorithm: sha256WithRSAEncryption
55:c3:a7:3e:40:c5:0f:bd:40:58:2e:ea:8b:35:8b:b4:a3:92:
5d:95:ca:fa:00:e4:71:f8:81:94:8c:bc:70:91:4e:a7:00:15:
56:18:2f:b2:6f:97:02:d4:e3:c2:90:fe:af:58:bc:09:40:ce:
e3:8b:b4:03:23:22:dd:22:da:5d:cf:08:55:80:ed:20:ee:89:
b6:dd:7e:e6:00:82:73:8f:5c:ee:38:7d:32:2e:fa:75:94:33:
d6:09:59:d4:86:c8:66:8e:d9:12:46:50:b8:15:a0:6e:6d:f1:
b6:36:88:13:15:8c:28:1a:10:a6:a7:58:a4:0a:44:4e:37:81:
0c:d9:fa:30:9f:b1:bb:12:04:f2:73:5c:59:17:4b:6b:f2:5c:
c9:e9:82:92:c8:c3:c0:f5:ca:64:ed:ca:03:d4:35:b7:6e:16:
52:25:10:0f:f6:3f:31:1d:f1:66:bc:d9:40:5c:0a:e7:37:52:
3b:ec:dd:d2:38:45:9e:fd:59:d8:56:da:01:19:8e:3f:93:a1:
2d:5f:77:c0:0a:53:52:c7:5b:85:68:71:3b:54:90:91:8a:63:
e1:4e:6d:43:fd:cf:c9:d4:25:0e:5c:cf:7d:f0:27:de:2e:d5:
37:73:71:f4:9b:bb:e1:d2:0f:0c:3e:d5:d9:8c:23:1f:bc:b2:
e0:4b:ab:de
[root@localhost ssl]#
Approval script
vim k8s-csr-approve.sh
#!/bin/bash
CSRS=$(kubectl get csr | awk '{if(NR>1) print $1}')
for csr in $CSRS; do
kubectl certificate approve $csr;
done
1 # View the kubelet certificate requests
root@localhost:/opt/kubernetes/cfg # kubectl get csr 1 ↵
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 72s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending
2 # Approve the request
root@localhost:/opt/kubernetes/cfg # kubectl get csr 1 ↵
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 2m10s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending
root@localhost:/opt/kubernetes/cfg # kubectl certificate approve node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8k
root@localhost:/opt/kubernetes/cfg # kubectl certificate approve node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8
certificatesigningrequest.certificates.k8s.io/node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 approved
root@localhost:/opt/kubernetes/cfg # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-eMnRD_0D_DU8icIQvr8gr54kw2IxxtwJv9IYOD5fMu8 2m33s kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
3 # Check the node status
root@localhost:/opt/kubernetes/cfg # kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.10.3.179 NotReady 35s v1.18.8
# Note: because the network plugin is not deployed yet, the node shows NotReady
# The node1 and node2 configuration is identical
cat > /opt/kubernetes/cfg/kube-proxy.conf << EOF
KUBE_PROXY_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/kubernetes/logs \\
--config=/opt/kubernetes/cfg/kube-proxy-config.yml"
EOF
# node1
cat > /opt/kubernetes/cfg/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0 # listen address
metricsBindAddress: 0.0.0.0:10249 # metrics address; monitoring scrapes its data from here
clientConnection:
  kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig # kubeconfig to read
hostnameOverride: 10.10.3.179 # name registered with Kubernetes; must be unique
clusterCIDR: 10.244.0.0/16
#mode: iptables # iptables mode
# ipvs mode
#mode: ipvs
#ipvs:
#  scheduler: "rr"
#iptables:
#  masqueradeAll: true
EOF
# node2
cat > /opt/kubernetes/cfg/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0 # listen address
metricsBindAddress: 0.0.0.0:10249 # metrics address; monitoring scrapes its data from here
clientConnection:
  kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig # kubeconfig to read
hostnameOverride: 10.10.3.197 # name registered with Kubernetes; must be unique
clusterCIDR: 10.244.0.0/16
#mode: iptables # iptables mode
# ipvs mode
#mode: ipvs
#ipvs:
#  scheduler: "rr"
#iptables:
#  masqueradeAll: true
EOF
The clusterCIDR (--cluster-cidr) in kube-proxy specifies the Pod network segment used by the cluster; it is not the same network as the Service cluster IP range given to the apiserver via --service-cluster-ip-range.
Generate the kube-proxy certificate on the master node (run on k8s-master)
# Change to the certificate directory
cd ~/TLS/k8s/
# Create the certificate signing request file
cat > kube-proxy-csr.json << EOF
{
"CN": "system:kube-proxy",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "k8s",
"OU": "System"
}
]
}
EOF
# Generate the certificate
root@localhost:~/TLS/k8s # ls
apiserver.csr apiserver-csr.json apiserver-key.pem apiserver.pem ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem kube-proxy-csr.json
root@localhost:~/TLS/k8s # cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
2020/08/27 17:10:00 [INFO] generate received request
2020/08/27 17:10:00 [INFO] received CSR
2020/08/27 17:10:00 [INFO] generating key: rsa-2048
2020/08/27 17:10:00 [INFO] encoded CSR
2020/08/27 17:10:00 [INFO] signed certificate with serial number 414616310621739470117285557327250607193567259357
2020/08/27 17:10:00 [WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for
websites. For more information see the Baseline Requirements for the Issuance and Management
of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org);
specifically, section 10.2.3 ("Information Requirements").
root@localhost:~/TLS/k8s # ll kube-proxy*pem
-rw------- 1 root root 1.7K Aug 27 17:10 kube-proxy-key.pem
-rw-r--r-- 1 root root 1.4K Aug 27 17:10 kube-proxy.pem
Generate the kube-proxy.kubeconfig file on the master node
KUBE_APISERVER="https://10.10.3.139:6443"
# Set cluster parameters
kubectl config set-cluster kubernetes \
--certificate-authority=/opt/kubernetes/ssl/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kube-proxy.kubeconfig
# Set client authentication parameters
kubectl config set-credentials kube-proxy \
--client-certificate=./kube-proxy.pem \
--client-key=./kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.kubeconfig
# Set context parameters
kubectl config set-context default \
--cluster=kubernetes \
--user=kube-proxy \
--kubeconfig=kube-proxy.kubeconfig
# Set the default context
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
# Save to the configuration directory
cp kube-proxy.kubeconfig /opt/kubernetes/cfg/
# Copy to /opt/kubernetes/cfg/ on the node machines
scp -rp /opt/kubernetes/cfg/kube-proxy.kubeconfig 10.10.3.179:/opt/kubernetes/cfg
scp -rp /opt/kubernetes/cfg/kube-proxy.kubeconfig 10.10.3.197:/opt/kubernetes/cfg
cat > /usr/lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
EnvironmentFile=/opt/kubernetes/cfg/kube-proxy.conf
ExecStart=/opt/kubernetes/bin/kube-proxy \$KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl restart kube-proxy
systemctl enable kube-proxy
[root@localhost cfg]# systemctl status -ll kube-proxy
● kube-proxy.service - Kubernetes Proxy
Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2020-08-27 17:14:55 CST; 7s ago
Main PID: 25853 (kube-proxy)
CGroup: /system.slice/kube-proxy.service
└─25853 /opt/kubernetes/bin/kube-proxy --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --config=/opt/kubernetes/cfg/kube-proxy-config.yml
Aug 27 17:14:55 localhost.localdomain systemd[1]: Started Kubernetes Proxy.
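Once kube-proxy is running, the mode it settled on can be read back from the metrics port configured above (10249); a quick check, assuming curl is available on the node:
# Should print "iptables" here, or "ipvs" if the ipvs block in kube-proxy-config.yml is enabled
curl -s 127.0.0.1:10249/proxyMode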
Official documentation on Kubernetes cluster networking
CNI documentation on GitHub
Container Network Interface (CNI)
Basic requirements of the Kubernetes network model:
Currently supported implementations
The most commonly used are flannel and calico.
Latest release download: https://github.com/containernetworking/plugins/releases/tag/v0.8.7
wget https://github.com/containernetworking/plugins/releases/download/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tgz
# Unpack the binary package into the default working directory
mkdir -p /opt/cni/bin
tar zxvf cni-plugins-linux-amd64-v0.8.7.tgz -C /opt/cni/bin
# Create the cni directory on the node machines
mkdir -p /opt/cni/bin
# Copy from the master to the cni directory on each node
scp -rp /opt/cni/bin/* k8s-node1:/opt/cni/bin/
scp -rp /opt/cni/bin/* k8s-node2:/opt/cni/bin/
All nodes need flannel, but in practice it only has to be applied from the k8s-master node: once kubelet is running on the other nodes, the flannel DaemonSet automatically starts a flannel Pod on each of them.
# Deploy the CNI network (retry a few times if the network is flaky)
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# The default image registry may be unreachable from inside the firewall; switch it to a Docker Hub mirror or a private registry
# harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
sed -ri "s#quay.io/coreos/flannel:.*-amd64#harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64#g" kube-flannel.yml
#添加"--iface=eth0"一句指定网卡
...
containers:
- name: kube-flannel
image: harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
root@localhost:~/work/flannel # cat kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: psp.flannel.unprivileged
annotations:
seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
privileged: false
volumes:
- configMap
- secret
- emptyDir
- hostPath
allowedHostPaths:
- pathPrefix: "/etc/cni/net.d"
- pathPrefix: "/etc/kube-flannel"
- pathPrefix: "/run/flannel"
readOnlyRootFilesystem: false
# Users and groups
runAsUser:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
fsGroup:
rule: RunAsAny
# Privilege Escalation
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
# Capabilities
allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
defaultAddCapabilities: []
requiredDropCapabilities: []
# Host namespaces
hostPID: false
hostIPC: false
hostNetwork: true
hostPorts:
- min: 0
max: 65535
# SELinux
seLinux:
# SELinux is unused in CaaSP
rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups: ['extensions']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-amd64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- amd64
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: harbor.superred.com/kubernetes/coreos/flannel:v0.12.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
- --iface=eth0
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-arm64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- arm64
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-arm64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-arm64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-arm
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- arm
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-arm
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-arm
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-ppc64le
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- ppc64le
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-ppc64le
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-ppc64le
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-s390x
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
- key: kubernetes.io/arch
operator: In
values:
- s390x
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.12.0-s390x
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.12.0-s390x
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
root@localhost:~/work/flannel # kubectl apply -f kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
root@localhost:~/work/flannel # kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-pnr7b 1/1 Running 0 29s
root@localhost:~/work/flannel # kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.10.3.179 Ready 69s v1.18.8
1) Download the flannel binary package, extract it, and copy it to each node
Download:
[root@master flannel]# wget https://github.com/coreos/flannel/releases/download/v0.13.0/flannel-v0.13.0-linux-amd64.tar.gz
2) Write the pod network configuration into etcd. flannel stores its per-node subnet lease in etcd, so it must be able to reach etcd; write the predefined subnet first:
[root@k8s coredns]# ETCDCTL_API=2 etcdctl --ca-file=/opt/etcd/ssl/ca.pem --cert-file=/opt/etcd/ssl/etcd.pem --key-file=/opt/etcd/ssl/etcd-key.pem --endpoints=https://10.10.3.163:2379 set /coreos.com/network/config '{"Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}'
{"Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}
3) flanneld configuration file
[root@k8s coredns]# cat > /opt/kubernetes/cfg/flanneld << EOF
FLANNEL_OPTIONS="--etcd-endpoints=https://10.10.3.163:2379 \
-etcd-cafile=/opt/etcd/ssl/ca.pem \
-etcd-certfile=/opt/etcd/ssl/etcd.pem \
-etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \
-etcd-prefix=/coreos.com/network"
EOF
4) flanneld.service unit file
cat > /usr/lib/systemd/system/flanneld.service << EOF
[Unit]
Description=Flanneld overlay address etcd agent
After=network-online.target network.target
Before=docker.service
[Service]
Type=notify
EnvironmentFile=/opt/kubernetes/cfg/flanneld
ExecStart=/opt/kubernetes/bin/flanneld --ip-masq \$FLANNEL_OPTIONS
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable flanneld
systemctl restart flanneld
[root@k8s coredns]# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: ens3: mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:85:8c:5a brd ff:ff:ff:ff:ff:ff
inet 10.10.3.163/24 brd 10.10.3.255 scope global noprefixroute ens3
valid_lft forever preferred_lft forever
3: docker0: mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:5c:bc:e2:82 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
6: flannel.1: mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 0a:e7:2c:b6:81:b7 brd ff:ff:ff:ff:ff:ff
inet 10.244.63.0/32 brd 10.244.63.0 scope global flannel.1
valid_lft forever preferred_lft forever
As shown above, a flannel.1 virtual interface is created; if the flannel.1 addresses on the different nodes can ping each other, flannel is installed successfully.
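A quick check on each node is to look at the subnet lease flanneld wrote and then ping another node's flannel.1 address (the peer address below is only an example value, not taken from this cluster):
cat /run/flannel/subnet.env    # shows FLANNEL_NETWORK / FLANNEL_SUBNET / FLANNEL_MTU / FLANNEL_IPMASQ
ping -c 3 10.244.20.0          # flannel.1 address of another node (example value)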
Download the CNI plugins
[root@localhost opt]# wget https://github.com/containernetworking/plugins/releases/download/v0.8.7/cni-plugins-linux-amd64-v0.8.7.tgz
[root@localhost opt]# tar xf cni-plugins-linux-amd64-v0.8.7.tgz
[root@localhost opt]# ls cni/
bin cni-plugins-linux-amd64-v0.8.7.tgz
[root@localhost opt]# ls cni/bin/
bandwidth bridge dhcp firewall flannel host-device host-local ipvlan loopback macvlan portmap ptp sbr static tuning vlan
[root@localhost opt]# chmod -R 755 cni/bin
创建 cni 配置文件
#!/bin/bash
mkdir /etc/cni/net.d/ -pv
cat > /etc/cni/net.d/10-flannel.conflist << EOF
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
EOF
重启 kubelet
systemctl restart kubelet
cat > apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:kube-apiserver-to-kubelet
rules:
- apiGroups:
- ""
resources:
- nodes/proxy
- nodes/stats
- nodes/log
- nodes/spec
- nodes/metrics
- pods/log
verbs:
- "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:kube-apiserver
namespace: ""
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:kube-apiserver-to-kubelet
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: User
name: kubernetes
EOF
kubectl apply -f apiserver-to-kubelet-rbac.yaml
root@localhost:~/work/flannel # kubectl get ClusterRoleBinding | grep "system:kube-apiserve" 1 ↵
system:kube-apiserver ClusterRole/system:kube-apiserver-to-kubelet 97s
root@localhost:~/work/flannel # kubectl get ClusterRole | grep "system:kube-apiserver-to-kubelet"
system:kube-apiserver-to-kubelet 2020-08-27T09:33:30Z
On node 10.10.3.179, copy the files a Worker Node needs to the new node 10.10.3.197 (or any other node to be added):
[root@localhost ~]# ls /opt/kubernetes/bin/
kubelet kube-proxy
[root@localhost ~]# ls /opt/kubernetes/cfg/
bootstrap.kubeconfig kubelet.conf kubelet-config.yml kubelet.kubeconfig kube-proxy.conf kube-proxy-config.yml kube-proxy.kubeconfig
[root@localhost ~]# ls /opt/kubernetes/ssl/
ca.pem kubelet-client-2020-08-27-17-05-24.pem kubelet-client-current.pem kubelet.crt kubelet.key
[root@localhost ~]# ls /opt/cni/bin/
bandwidth bridge dhcp firewall flannel host-device host-local ipvlan loopback macvlan portmap ptp sbr static tuning vlan
scp -r /opt/kubernetes 10.10.3.197:/opt
scp -r /usr/lib/systemd/system/{kubelet,kube-proxy}.service [email protected]:/usr/lib/systemd/system
If flanneld was installed from the binary package:
scp -r /usr/lib/systemd/system/flanneld.service [email protected]:/usr/lib/systemd/system
scp /etc/cni/net.d/10-flannel.conflist 10.10.3.197:/etc/cni/net.d/
scp -r /opt/cni [email protected]:/opt/
scp /opt/kubernetes/ssl/ca.pem [email protected]:/opt/kubernetes/ssl
rm /opt/kubernetes/cfg/kubelet.kubeconfig
rm -f /opt/kubernetes/ssl/kubelet*
Note: these files are generated automatically when the certificate request is approved; they are different on every node, so they must be deleted and regenerated on the new node.
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=10.10.3.197
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: 10.10.3.197
完整
[root@localhost cfg]# cat kubelet.conf
KUBELET_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--hostname-override=10.10.3.197 \
--network-plugin=cni \
--kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \
--bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \
--config=/opt/kubernetes/cfg/kubelet-config.yml \
--cert-dir=/opt/kubernetes/ssl \
--pod-infra-container-image=harbor.superred.com/kubernetes/pause-amd64:3.0"
[root@localhost cfg]# cat kube-proxy-config.yml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
kubeconfig: /opt/kubernetes/cfg/kube-proxy.kubeconfig
hostnameOverride: 10.10.3.197
clusterCIDR: 10.244.0.0/16
systemctl daemon-reload
systemctl start kubelet
systemctl enable kubelet
systemctl start kube-proxy
systemctl enable kube-proxy
[root@localhost cfg]# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-08-28 10:07:02 CST; 9min ago
Main PID: 11212 (kubelet)
Tasks: 16
Memory: 19.1M
CGroup: /system.slice/kubelet.service
└─11212 /opt/kubernetes/bin/kubelet --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --hostname-override=10.10.3.197 --network-plugin=cni --kubeconfig...
Aug 28 10:07:02 localhost.localdomain systemd[1]: Started Kubernetes Kubelet.
[root@localhost cfg]# systemctl status kube-proxy
● kube-proxy.service - Kubernetes Proxy
Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2020-08-28 10:03:58 CST; 12min ago
Main PID: 11030 (kube-proxy)
Tasks: 15
Memory: 14.7M
CGroup: /system.slice/kube-proxy.service
└─11030 /opt/kubernetes/bin/kube-proxy --logtostderr=false --v=2 --log-dir=/opt/kubernetes/logs --config=/opt/kubernetes/cfg/kube-proxy-config.yml
Aug 28 10:03:58 localhost.localdomain systemd[1]: Started Kubernetes Proxy.
Aug 28 10:03:58 localhost.localdomain kube-proxy[11030]: E0828 10:03:58.403733 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:03:59 localhost.localdomain kube-proxy[11030]: E0828 10:03:59.469988 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:01 localhost.localdomain kube-proxy[11030]: E0828 10:04:01.496776 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:06 localhost.localdomain kube-proxy[11030]: E0828 10:04:06.095214 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:14 localhost.localdomain kube-proxy[11030]: E0828 10:04:14.473297 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Aug 28 10:04:31 localhost.localdomain kube-proxy[11030]: E0828 10:04:31.189822 11030 node.go:125] Failed to retrieve node info: nodes "10.10.3.197" not found
Note
Startup may fail with: error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "kubelet-bootstrap" cannot create certificatesigningrequests.certificates.k8s.io at the cluster scope
The reason is that the kubelet-bootstrap user has no permission to create certificate signing requests, so it must be bound to the corresponding role.
Fix it by running on the master: kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
root@localhost:~ # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-rAPlSXBF-YLeLIvxDoLB6VKPaTHJ96s7JS5svj8spQw 10m kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Pending
kubectl certificate approve node-csr-rAPlSXBF-YLeLIvxDoLB6VKPaTHJ96s7JS5svj8spQw
root@localhost:~ # kubectl get node
NAME STATUS ROLES AGE VERSION
10.10.3.179 Ready 16h v1.18.8
10.10.3.197 Ready 12s v1.18.8
root@localhost:~ # kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
node-csr-rAPlSXBF-YLeLIvxDoLB6VKPaTHJ96s7JS5svj8spQw 11m kubernetes.io/kube-apiserver-client-kubelet kubelet-bootstrap Approved,Issued
The main difference has already been mentioned: a ServiceAccount is a resource inside Kubernetes, while a User exists outside of Kubernetes. From this it follows that:
The "user" added here is a user in the usual sense, i.e. someone outside the cluster who uses Kubernetes.
Strictly speaking we are not "adding" the user; the user already exists. What we do is make Kubernetes aware of this user and control the user's permissions inside the cluster.
Kubernetes identifies a user only by name, but obviously a bare name should not be enough to call the Kubernetes API, so the user's identity still has to be authenticated.
Kubernetes supports the following authentication methods:
--client-ca-file=xxx enables client certificate authentication: the API Server uses this CA file to validate the client certificate carried by an API request, and on success takes the CN attribute of the certificate Subject as the user name for the request.
--token-auth-file=SOMEFILE enables bearer token authentication; the referenced file is a CSV file containing token, user name and user ID. A request passes bearer token authentication by carrying the header Authorization: Bearer 31ada4fd-adec-460c-809a-9e56ceb75269.
--basic-auth-file=SOMEFILE enables password authentication; similarly, the referenced file is a CSV file containing password, user name and user ID. The request must set the Authorization header to Basic BASE64ENCODED(USER:PASSWORD).
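As a sketch of what these static credential files look like (the token, password and group below are made-up examples, and the file paths are simply whatever --token-auth-file / --basic-auth-file point at), each line of the token file is token,user,uid[,"group1,group2,..."], and each line of the basic-auth file is password,user,uid[,"group1,..."]:
31ada4fd-adec-460c-809a-9e56ceb75269,wubo,1001,"dev"
MyP@ssw0rd,wubo,1001,"dev"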
Only client certificate authentication is covered here.
Assume we operate with two users: wubo, a normal user (Role + RoleBinding), and admin, an administrator (ClusterRole + ClusterRoleBinding).
The openssl way
Generate private keys and certificate signing requests:
openssl genrsa -out admin.key 2048
openssl genrsa -out wubo.key 2048
openssl req -new -key admin.key -out admin.csr -subj "/CN=admin/O=system"
openssl req -new -key wubo.key -out wubo.csr -subj "/CN=wubo/O=dev"
Locate the cluster CA. On a kubeadm cluster it is under /etc/kubernetes/pki/ (CA certificate ca.crt and CA private key ca.key); for this binary installation the equivalents are ca.pem and ca-key.pem. Sign the CSRs with the cluster CA:
openssl x509 -req -in admin.csr -CA path/to/ca.crt -CAkey path/to/ca.key -CAcreateserial -out admin.crt -days 365
openssl x509 -req -in wubo.csr -CA path/to/ca.crt -CAkey path/to/ca.key -CAcreateserial -out wubo.crt -days 365
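To sanity-check the signed certificates, the subject and validity can be inspected with openssl (a quick verification step, not part of the original procedure):
openssl x509 -noout -subject -dates -in admin.crt
openssl x509 -noout -subject -dates -in wubo.crt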
或者cfssl 方式
[root@localhost k8s]# kubectl describe clusterrolebinding cluster-admin
Name: cluster-admin
Labels: kubernetes.io/bootstrapping=rbac-defaults
Annotations: rbac.authorization.kubernetes.io/autoupdate: true
Role:
Kind: ClusterRole
Name: cluster-admin
Subjects:
Kind Name Namespace
---- ---- ---------
Group system:masters
[root@localhost k8s]# cat admin-csr.json
{
"CN": "admin",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "BeiJing",
"ST": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
root@localhost:~/TLS/k8s # cat wubo-csr.json
{
"CN": "wubo",
"hosts": [],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "Beijing",
"L": "Beijing",
"O": "wubo",
"OU": "dev"
}
]
}
Locate the CA files of the Kubernetes cluster (API Server); their location depends on how the cluster was installed. In this setup they are under /root/TLS/k8s: the CA certificate (ca.pem), the CA private key (ca-key.pem) and the config file ca-config.json.
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes wubo-csr.json | cfssljson -bare wubo
The -ca, -ca-key and -config parameters must point to the cluster CA certificate, CA key and config file.
root@localhost:~/TLS/k8s # ls admin* wubo*
admin.csr admin-csr.json admin-key.pem admin.pem
wubo.csr wubo-csr.json wubo-key.pem wubo.pem
设置集群
[root@localhost k8s]# kubectl config set-cluster kubernetes --server=https://10.10.3.163:6443 --certificate-authority=ca.pem --embed-certs=true --kubeconfig=admin.kubeconfig
创建用户
[root@localhost k8s]# kubectl config set-credentials admin --client-certificate=admin.pem --client-key=admin-key.pem --embed-certs=true --kubeconfig=admin.kubeconfig
Set the context and specify the namespace (the namespace defaults to default)
[root@localhost k8s]# kubectl config set-context default --cluster=kubernetes --user=admin --kubeconfig=admin.kubeconfig
使用
[root@localhost k8s]# kubectl config use-context default --kubeconfig=admin.kubeconfig
Then hand kubectl and admin.kubeconfig to the other person, or copy them to the other machine.
第一次没有~/.kube/config
[root@localhost ~]# kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?
有~/.kube/config之后
[root@localhost ~]# mkdir ~/.kube
[root@localhost ~]# cat admin.kubeconfig > ~/.kube/config
[root@localhost ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
Create the namespace
kubectl create namespace wubo
设置集群
kubectl config set-cluster kubernetes --server=https://10.10.3.139:6443 --certificate-authority=ca.pem --embed-certs=true --kubeconfig=wubo.kubeconfig
创建用户
kubectl config set-credentials wubo --client-certificate=wubo.pem --client-key=wubo-key.pem --embed-certs=true --kubeconfig=wubo.kubeconfig
Set the context and specify the namespace
kubectl config set-context wubo@kubernetes --cluster=kubernetes --user=wubo -n wubo --kubeconfig=wubo.kubeconfig
使用
kubectl config use-context wubo@kubernetes --kubeconfig=wubo.kubeconfig
创建serviceaccount的wubo用户在指定的namespace下面
root@localhost:~/TLS/k8s # kubectl create serviceaccount wubo -n wubo 1 ↵
serviceaccount/wubo created
root@localhost:~/TLS/k8s # kubectl get serviceaccount -n wubo
NAME SECRETS AGE
default 1 34m
wubo 1 23s
root@localhost:~/TLS/k8s # kubectl describe serviceaccount wubo -n wubo
Name: wubo
Namespace: wubo
Labels:
Annotations:
Image pull secrets:
Mountable secrets: wubo-token-4xhm6
Tokens: wubo-token-4xhm6
Events:
root@localhost:~/TLS/k8s # kubectl describe secret wubo -n wubo
Name: wubo-token-4xhm6
Namespace: wubo
Labels:
Annotations: kubernetes.io/service-account.name: wubo
kubernetes.io/service-account.uid: eae41b5e-2362-478f-8b6b-041c54cd5858
Type: kubernetes.io/service-account-token
Data
====
namespace: 4 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6Imtla2RhLS1SZ0NBUkVYTkY3eTZSWkE4UkNJVWJ2d0doS3NrYXpOOXlreWcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJ3dWJvIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6Ind1Ym8tdG9rZW4tNHhobTYiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoid3VibyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImVhZTQxYjVlLTIzNjItNDc4Zi04YjZiLTA0MWM1NGNkNTg1OCIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDp3dWJvOnd1Ym8ifQ.Gz985FdIBlZ-qeOMhdCnr34XtGNoDF_CA8p2Pv4n6re5qaOp9CmudDfvm7PohRRpuWZiTQZU3fgYhkaX9D5lpJ_8WktMGvApC4hXQR2v0_-AGVp_UcVQ2-ZR2-4UQbTRghaO8iwY2czr-2ULqGk4mtuGAnStc6TSRVeADwh1oRKAmL5f27UGEIQ1pFYqdJEiUCPGwcLTbhtZH-jBnVf-g8YCyhpgy1f8KrAmaFUdg8r6sCADn5JYuklBkiuQs2HcaitAHETnGR1d0uVOn48CO5LEjD_aFI7eBwaSGGetxmCq8rlAKQuPVnZTsqvaQNq0iVK4ogX8lOsTgfp94_8GuQ
ca.crt: 1359 bytes
在RBAC中,角色有两种——普通角色(Role)和集群角色(ClusterRole),ClusterRole是特殊的Role,相对于Role来说:
(1)Role --> User -->Rolebinding
首先创造一个角色,普通角色跟namespace挂钩的
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
namespace: wubo
name: wubo-role
rules:
- apiGroups: [""]
resources: ["*"]
verbs: ["*"]
或者
kubectl create role wubo-role --verb="*" --resource="*" -n wubo
root@localhost:~/TLS/k8s # kubectl get role -n wubo 2 ↵
NAME CREATED AT
wubo-role 2020-08-31T03:37:23Z
This creates an administrator-like role wubo-role inside the wubo namespace. wubo-role is only used as an example here; if the goal is just to grant a user admin rights on one namespace, there is no need to create a new role, because Kubernetes already ships a built-in ClusterRole named admin. The admin role is scoped to a single namespace, whereas cluster-admin covers all namespaces and therefore has a wider scope and more power. A User can be bound not only to a normal Role but also to a ClusterRole.
root@localhost:~/TLS/k8s # kubectl get clusterrole | grep admin
admin 2020-08-27T09:29:19Z
cluster-admin 2020-08-27T09:29:19Z
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: wubo-binding
namespace: wubo
subjects:
- kind: User
name: wubo
apiGroup: ""
roleRef:
kind: Role
name: wubo-role
apiGroup: ""
或者
kubectl create rolebinding wubo-binding --role=wubo-role --user=wubo -n wubo
A RoleBinding can also reference a ClusterRole; the binding still only grants access inside the given namespace (this form assumes a ClusterRole with that name exists):
kubectl create rolebinding wubo-binding --clusterrole=wubo-role --user=wubo -n wubo
As shown in the YAML, the RoleBinding resource creates a relationship between a Role and a User: the roleRef field specifies the role this RoleBinding refers to, and the subjects field specifies the subjects of the binding, which can be Users or the ServiceAccounts mentioned earlier; here it contains only the user named wubo.
root@localhost:~/TLS/k8s # kubectl get rolebinding -n wubo
NAME ROLE AGE
wubo-bind Role/wubo-role 6s
As mentioned before, Kubernetes has a built-in ClusterRole named admin, so instead of creating our own admin Role we can simply create a RoleBinding against the default admin ClusterRole:
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: wubo-binding
namespace: wubo
subjects:
- kind: User
name: wubo
apiGroup: ""
roleRef:
kind: ClusterRole
name: admin
apiGroup: ""
kubectl create rolebinding wubo-binding --clusterrole=admin --user=wubo -n wubo
root@localhost:~/TLS/k8s # kubectl get rolebinding -n wubo
NAME ROLE AGE
wubo-admin-binding ClusterRole/admin 3s
Using a RoleBinding to bind a ClusterRole:
Suppose there are 10 namespaces and each needs an administrator with identical permissions. Doing this with Roles would mean creating 10 identical Roles, one per namespace, which is tedious. When a RoleBinding binds a ClusterRole instead, the User only gets the ClusterRole's permissions within the current namespace, not across all namespaces, so a single ClusterRole can replace those 10 Roles.
Note: whether the binding uses --role or --clusterrole, the resulting permissions still belong to a single namespace; referencing a ClusterRole is simply more convenient because the same Role does not have to be re-created in every namespace.
Now wubo/admin has access.
The Role and RoleBinding operations above only take effect in the current namespace.
Although the ClusterRole admin is referenced here, its permissions are restricted to the namespace the RoleBinding admin-binding lives in, i.e. wubo.
To add an administrator for all namespaces, in other words for the whole cluster, use the cluster-admin role with a ClusterRoleBinding.
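To double-check what the user can actually do, kubectl's built-in authorization check is handy (impersonating with --as requires that the caller itself has impersonation rights, e.g. cluster-admin); a quick sketch:
kubectl auth can-i --list --as=wubo -n wubo
kubectl auth can-i create pods --as=wubo -n wubo      # expected: yes
kubectl auth can-i create pods --as=wubo -n default   # expected: no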
At this point wubo/admin is an administrator of the wubo namespace. To operate the cluster as wubo or admin through kubectl, their credentials have to be added to kubectl's configuration, i.e. ~/.kube/config.
Assume the cluster itself is already configured in that config file. Using wubo as the example (the same steps apply to admin):
kubectl config set-credentials wubo --client-certificate=path/to/wubo.pem --client-key=path/to/wubo-key.pem
kubectl config set-context wubo@kubernetes --cluster=kubernetes --namespace=wubo --user=wubo
The context wubo@kubernetes uses the kubernetes cluster, defaults to the wubo namespace, and authenticates as the user wubo. You can either pass --context=wubo@kubernetes to individual kubectl commands, or run kubectl config use-context wubo@kubernetes to make it the active context.
A user added with kubectl config set-credentials references the certificate files by path by default, which shows up in ~/.kube/config as:
users:
- name: wubo|admin
user:
client-certificate: path/to/wubo|admin.crt
client-key: path/to/wubo|admin.key
If carrying the two certificate files around is inconvenient, the certificate contents can be embedded directly in the config file:
cat wubo|admin.pem| base64 --wrap=0
cat wubo|admin-key.pem | base64 --wrap=0
users:
- name: ich
user:
client-certificate-data: ...
client-key-data: ...
After that the certificate and key files are no longer needed at runtime, although it is still a good idea to keep them somewhere safe.
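Alternatively, the embedding can be done in one step when adding the user, since kubectl config set-credentials supports --embed-certs (this mirrors what was done for admin.kubeconfig earlier):
kubectl config set-credentials wubo --client-certificate=wubo.pem --client-key=wubo-key.pem --embed-certs=true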
为Kubernetes集群添加用户 - 知乎
https://github.com/kubernetes/dashboard
wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.3/aio/deploy/recommended.yaml
默认Dashboard只能集群内部访问,修改Service为NodePort类型,暴露到外部
# 修改yaml文件
[root@k8s-master yaml]# vim recommended.yaml (32gg)
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
ports:
- port: 443
targetPort: 8443
nodePort: 30001 #添加类型
type: NodePort
selector:
k8s-app: kubernetes-dashboard
完整
root@localhost:~/work/dashboard # cat recommended.yaml
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: v1
kind: Namespace
metadata:
name: kubernetes-dashboard
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
type: NodePort
ports:
- port: 443
targetPort: 8443
nodePort: 30001
selector:
k8s-app: kubernetes-dashboard
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-certs
namespace: kubernetes-dashboard
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-csrf
namespace: kubernetes-dashboard
type: Opaque
data:
csrf: ""
---
apiVersion: v1
kind: Secret
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-key-holder
namespace: kubernetes-dashboard
type: Opaque
---
kind: ConfigMap
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard-settings
namespace: kubernetes-dashboard
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
rules:
# Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
resources: ["secrets"]
resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
verbs: ["get", "update", "delete"]
# Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["kubernetes-dashboard-settings"]
verbs: ["get", "update"]
# Allow Dashboard to get metrics.
- apiGroups: [""]
resources: ["services"]
resourceNames: ["heapster", "dashboard-metrics-scraper"]
verbs: ["proxy"]
- apiGroups: [""]
resources: ["services/proxy"]
resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
verbs: ["get"]
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
rules:
# Allow Metrics Scraper to get metrics from the Metrics server
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "nodes"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kubernetes-dashboard
---
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kubernetes-dashboard
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
image: harbor.superred.com/kubernetes/kubernetesui/dashboard:v2.0.3
imagePullPolicy: Always
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
- --namespace=kubernetes-dashboard
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 1001
runAsGroup: 2001
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
nodeSelector:
"kubernetes.io/os": linux
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: dashboard-metrics-scraper
name: dashboard-metrics-scraper
namespace: kubernetes-dashboard
spec:
type: NodePort
ports:
- port: 8000
targetPort: 8000
nodePort: 30002
selector:
k8s-app: dashboard-metrics-scraper
---
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: dashboard-metrics-scraper
name: dashboard-metrics-scraper
namespace: kubernetes-dashboard
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: dashboard-metrics-scraper
template:
metadata:
labels:
k8s-app: dashboard-metrics-scraper
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'runtime/default'
spec:
containers:
- name: dashboard-metrics-scraper
image: harbor.superred.com/kubernetes/kubernetesui/metrics-scraper:v1.0.4
ports:
- containerPort: 8000
protocol: TCP
livenessProbe:
httpGet:
scheme: HTTP
path: /
port: 8000
initialDelaySeconds: 30
timeoutSeconds: 30
volumeMounts:
- mountPath: /tmp
name: tmp-volume
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsUser: 1001
runAsGroup: 2001
serviceAccountName: kubernetes-dashboard
nodeSelector:
"kubernetes.io/os": linux
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
volumes:
- name: tmp-volume
emptyDir: {}
# 生成dashboard
kubectl apply -f recommended.yaml
root@localhost:~/work/dashboard # kubectl apply -f recommended.yaml
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
root@localhost:~/work/dashboard # kubectl get pods,svc -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-775b89678b-2pnr9 1/1 Running 0 10s
pod/kubernetes-dashboard-66d54d4cd7-cfzmx 1/1 Running 0 10s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper NodePort 10.0.0.135 8000:30002/TCP 10s
service/kubernetes-dashboard NodePort 10.0.0.89 443:30001/TCP 10s
查看pod日志
kubectl get pods -n kube-system 1 ↵
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-pnr7b 1/1 Running 0 8m45s
root@localhost:~/work/dashboard # kubectl -n kube-system logs -f kube-flannel-ds-amd64-pnr7b
I0827 09:30:18.812198 1 main.go:518] Determining IP address of default interface
I0827 09:30:18.812898 1 main.go:531] Using interface with name eth0 and address 10.10.3.179
I0827 09:30:18.812930 1 main.go:548] Defaulting external address to interface address (10.10.3.179)
W0827 09:30:18.812969 1 client_config.go:517] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0827 09:30:19.008200 1 kube.go:119] Waiting 10m0s for node controller to sync
I0827 09:30:19.008490 1 kube.go:306] Starting kube subnet manager
I0827 09:30:20.009134 1 kube.go:126] Node controller sync successful
I0827 09:30:20.009271 1 main.go:246] Created subnet manager: Kubernetes Subnet Manager - 10.10.3.179
I0827 09:30:20.009279 1 main.go:249] Installing signal handlers
I0827 09:30:20.009465 1 main.go:390] Found network config - Backend type: vxlan
I0827 09:30:20.009613 1 vxlan.go:121] VXLAN config: VNI=1 Port=0 GBP=false Learning=false DirectRouting=false
I0827 09:30:20.107538 1 main.go:355] Current network or subnet (10.244.0.0/16, 10.244.0.0/24) is not equal to previous one (0.0.0.0/0, 0.0.0.0/0), trying to recycle old iptables rules
I0827 09:30:20.115034 1 iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
I0827 09:30:20.209595 1 iptables.go:167] Deleting iptables rule: -s 0.0.0.0/0 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0827 09:30:20.211834 1 iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j RETURN
I0827 09:30:20.213233 1 iptables.go:167] Deleting iptables rule: ! -s 0.0.0.0/0 -d 0.0.0.0/0 -j MASQUERADE --random-fully
I0827 09:30:20.214655 1 main.go:305] Setting up masking rules
I0827 09:30:20.215906 1 main.go:313] Changing default FORWARD chain policy to ACCEPT
I0827 09:30:20.216037 1 main.go:321] Wrote subnet file to /run/flannel/subnet.env
I0827 09:30:20.216045 1 main.go:325] Running backend.
I0827 09:30:20.216061 1 main.go:343] Waiting for all goroutines to exit
I0827 09:30:20.216206 1 vxlan_network.go:60] watching for new subnet leases
I0827 09:30:20.406745 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0827 09:30:20.406798 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0827 09:30:20.409981 1 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I0827 09:30:20.410022 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.410667 1 iptables.go:167] Deleting iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0827 09:30:20.507534 1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/24 -j RETURN
I0827 09:30:20.508204 1 iptables.go:167] Deleting iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.510628 1 iptables.go:167] Deleting iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
I0827 09:30:20.606393 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -d 10.244.0.0/16 -j RETURN
I0827 09:30:20.607101 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.610845 1 iptables.go:155] Adding iptables rule: -d 10.244.0.0/16 -j ACCEPT
I0827 09:30:20.806505 1 iptables.go:155] Adding iptables rule: -s 10.244.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE --random-fully
I0827 09:30:20.908327 1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/24 -j RETURN
I0827 09:30:20.914368 1 iptables.go:155] Adding iptables rule: ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully
Check which node kubernetes-dashboard is running on
# then access that node's IP directly
Access URL: https://NodeIP:30001
root@localhost:/opt/kubernetes/cfg # kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system kube-flannel-ds-amd64-pnr7b 1/1 Running 0 14m 10.10.3.179 10.10.3.179
kubernetes-dashboard dashboard-metrics-scraper-775b89678b-2pnr9 1/1 Running 0 6m6s 10.244.0.3 10.10.3.179
kubernetes-dashboard kubernetes-dashboard-66d54d4cd7-cfzmx 1/1 Running 0 6m6s 10.244.0.2 10.10.3.179
创建service account并绑定默认cluster-admin管理员集群角色:
root@localhost:/opt/kubernetes/cfg # kubectl create serviceaccount dashboard-admin -n kube-system 1 ↵
serviceaccount/dashboard-admin created
root@localhost:/opt/kubernetes/cfg # kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created
root@localhost:/opt/kubernetes/cfg # kubectl describe secrets -n kube-system $(kubectl -n kube-system get secret | awk '/dashboard-admin/{print $1}')
Name: dashboard-admin-token-jxr2g
Namespace: kube-system
Labels:
Annotations: kubernetes.io/service-account.name: dashboard-admin
kubernetes.io/service-account.uid: 6c8e994b-c140-4b7c-8085-8955cb016f12
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1359 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6Imtla2RhLS1SZ0NBUkVYTkY3eTZSWkE4UkNJVWJ2d0doS3NrYXpOOXlreWcifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tanhyMmciLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNmM4ZTk5NGItYzE0MC00YjdjLTgwODUtODk1NWNiMDE2ZjEyIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.DL-hoDCeLhB--edIVhYe021ev27KvD4LMlCrwD3gHdcD7Pc8L3iktZHb6DRE8f6DZ_p2_HshmZKRvNZVljd9lUSJzolw4DHh6_o5sibeeS1eADQqeg3m6fbUga9nuY3LlnYvSd0Q9QAZEHpiUWXUkpY9-S72BW2GLvJIYpd9xJrBUwDsxR5SRRWdv3iDBisSO8kC7usDdlgJvqvy7qqiu9QyLXwK3TgfXk4c-CFBuq5ZsmolqVT5MctF9B66L_9BLuNYeCiJUVd328Y-vgievIE9lN3RfG4fvxbM6KBkmwVgHA63RIi0f8ftxDsZ09MsKjOm0FVivmBDb9qTpfMzEQ
使用输出的token登录Dashboard。
使用输出的kubeconfig登录Dashboard
1.找到dashboard-admin-token-jxr2g
root@localhost:~ # kubectl get secret -n kube-system
NAME TYPE DATA AGE
dashboard-admin-token-jxr2g kubernetes.io/service-account-token 3 28m
default-token-28k9v kubernetes.io/service-account-token 3 41m
flannel-token-ss9zn kubernetes.io/service-account-token 3 40m
2. Get the token
DASH_TOKEN=$(kubectl get secret -n kube-system dashboard-admin-token-jxr2g -o jsonpath={.data.token}|base64 -d)
3. Set the cluster (the API server speaks HTTPS on 6443; the CA is embedded, as in the earlier admin.kubeconfig example, so TLS verification works)
kubectl config set-cluster kubernetes --server=https://10.10.3.139:6443 --certificate-authority=/opt/kubernetes/ssl/ca.pem --embed-certs=true --kubeconfig=/root/dashbord-admin.conf
4. Set the user (dashboard-admin) with the token
kubectl config set-credentials dashboard-admin --token=$DASH_TOKEN --kubeconfig=/root/dashbord-admin.conf
5. Set the context
kubectl config set-context dashboard-admin@kubernetes --cluster=kubernetes --user=dashboard-admin --kubeconfig=/root/dashbord-admin.conf
6. Use the context as the default
kubectl config use-context dashboard-admin@kubernetes --kubeconfig=/root/dashbord-admin.conf
The generated dashbord-admin.conf can then be used to log in to the Dashboard.
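Since the token is bound to cluster-admin, the same kubeconfig also works with kubectl, which makes for a quick sanity check (path as generated above):
kubectl --kubeconfig=/root/dashbord-admin.conf get pods --all-namespaces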
注意:k8s 与 coredns 的版本对应关系
https://github.com/coredns/deployment/blob/master/kubernetes/CoreDNS-k8s_version.md
注意修改 clusterIP 和镜像版本1.6.7
下载文件
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh
Download the configuration from https://github.com/coredns/deployment/tree/master/kubernetes; the relevant files are deploy.sh and coredns.yaml.sed. Since we are not migrating from kube-dns to CoreDNS, the kubectl-related steps in the script are commented out, and the REVERSE_CIDRS, CLUSTER_DOMAIN (DNS_DOMAIN) and CLUSTER_DNS_IP variables are replaced with the actual values. The concrete command is: ./deploy.sh -s -r 10.254.0.0/16 -i 10.254.0.10 -d superred.com > coredns.yaml
CLUSTER_DNS_IP=$(kubectl get service --namespace kube-system kube-dns -o jsonpath="{.spec.clusterIP}")
# 下载coredns
git clone https://github.com/coredns/deployment.git
cd deployment/kubernetes/
# 修改部署脚本
vim deploy.sh
if [[ -z $CLUSTER_DNS_IP ]]; then
# Default IP to kube-dns IP
# CLUSTER_DNS_IP=$(kubectl get service --namespace kube-system kube-dns -o jsonpath="{.spec.clusterIP}")
CLUSTER_DNS_IP=10.0.0.2
# 执行部署
yum -y install epel-release jq
./deploy.sh | kubectl apply -f -
root@localhost:~/work/coredns # ./deploy.sh -s -r 10.0.0.0/16 -i 10.0.0.2 -d superred.com> coredns.yaml
root@localhost:~/work/coredns # diff coredns.yaml.sed coredns.yaml
55c55
< kubernetes CLUSTER_DOMAIN REVERSE_CIDRS {
---
> kubernetes superred.com 10.0.0.0/16 {
59c59
< forward . UPSTREAMNAMESERVER {
---
> forward . /etc/resolv.conf {
66c66
< }STUBDOMAINS
---
> }
181c181
< clusterIP: CLUSTER_DNS_IP
---
> clusterIP: 10.0.0.2
root@localhost:~/work/coredns # kubectl apply -f coredns.yaml
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
[root@localhost coredns]# cat coredns.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes cclinux.com.cn 10.0.0.0/16 {
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . /etc/resolv.conf
cache 30
loop
reload
loadbalance
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/name: "CoreDNS"
spec:
# replicas: not specified here:
# 1. Default is 1.
# 2. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
replicas: 2
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
spec:
priorityClassName: system-cluster-critical
serviceAccountName: coredns
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
nodeSelector:
kubernetes.io/os: linux
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: k8s-app
operator: In
values: ["kube-dns"]
topologyKey: kubernetes.io/hostname
containers:
- name: coredns
image: harbor.superred.com/kubernetes/coredns:1.6.7
imagePullPolicy: IfNotPresent
resources:
limits:
memory: 170Mi
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
annotations:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: 10.0.0.2
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
- name: metrics
port: 9153
protocol: TCP
运行情况
Every 2.0s: kubectl get pods,svc,ns,serviceaccounts -o wide --all-namespaces localhost.localdomain: Mon Feb 8 13:31:22 2021
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/coredns-654979db4b-5dgzj 1/1 Running 0 90s 10.244.20.2 10.10.3.167
kube-system pod/coredns-654979db4b-5kzg4 1/1 Running 0 90s 10.244.20.3 10.10.3.167
root@localhost:~/work/dashboard # kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.10.3.179 Ready 49m v1.18.8
root@localhost:~/work/dashboard # kubectl get rc
No resources found in default namespace.
root@localhost:~/work/dashboard # kubectl get pods
master运行
kubectl run nginx-test --image=nginx --port=80
root@localhost:~/work/dashboard # kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-test 1/1 Running 0 21s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.0.0.1 443/TCP 52m
建nginx对外服务
root@localhost:~/work/dashboard # kubectl expose pod nginx-test --port=80 --target-port=80 --type=NodePort 1 ↵
service/nginx-test exposed
root@localhost:~/work/dashboard # kubectl get pods,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-test 1/1 Running 0 94s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.0.0.1 443/TCP 53m
service/nginx-test NodePort 10.0.0.66 80:27022/TCP 3
或者
root@localhost:~/work/dashboard # kubectl expose pod nginx-test --port=80 --target-port=80 --type=LoadBalancer 1 ↵
service/nginx-test exposed
root@localhost:~/work/dashboard # kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.0.0.1 443/TCP 54m
nginx-test LoadBalancer 10.0.0.130 80:62931/TCP 8s
root@localhost:~/work/dashboard # kubectl describe svc nginx-test
Name: nginx-test
Namespace: default
Labels: run=nginx-test
Annotations:
Selector: run=nginx-test
Type: LoadBalancer
IP: 10.0.0.130
Port: 80/TCP
TargetPort: 80/TCP
NodePort: 62931/TCP
Endpoints: 10.244.0.8:80
Session Affinity: None
External Traffic Policy: Cluster
Events:
The main difference is the --type= value; the supported types are "ClusterIP", "ExternalName", "LoadBalancer" and "NodePort".
If a firewall is running
open the corresponding NodePort on the node:
firewall-cmd --zone=public --add-port=62931/tcp --permanent
firewall-cmd --reload
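With the port reachable, a plain HTTP request against the NodePort confirms the service is answering (node IP and port 62931 taken from the kubectl get svc output above):
curl -I http://10.10.3.179:62931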
kubectl run -it --rm --restart=Never --image=harbor.superred.com/kubernetes/infoblox/dnstools:latest dnstools
If you don't see a command prompt, try pressing enter.
Error attaching, falling back to logs: unable to upgrade connection: Forbidden (user=kubernetes, verb=create, resource=nodes, subresource=proxy)
pod "dnstools" deleted
Error from server (Forbidden): Forbidden (user=kubernetes, verb=get, resource=nodes, subresource=proxy) ( pods/log dnstools)
This means the user the API server presents to the kubelet (kubernetes) has no RBAC permission for the kubelet API.
Look at the details of the system:kubelet-api-admin cluster role:
[root@localhost cfg]# kubectl describe clusterrole system:kubelet-api-admin
Name: system:kubelet-api-admin
Labels: kubernetes.io/bootstrapping=rbac-defaults
Annotations: rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
Resources Non-Resource URLs Resource Names Verbs
--------- ----------------- -------------- -----
nodes/log [] [] [*]
nodes/metrics [] [] [*]
nodes/proxy [] [] [*]
nodes/spec [] [] [*]
nodes/stats [] [] [*]
nodes [] [] [get list watch proxy]
The role needs to be granted to the user kubernetes:
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes
或
[root@localhost cfg]# cat apiserver-to-kubelet-rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
name: system:kube-apiserver-to-kubelet
rules:
- apiGroups:
- ""
resources:
- nodes/proxy
- nodes/stats
- nodes/log
- nodes/spec
- nodes/metrics
- pods/log
verbs:
- "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:kube-apiserver
namespace: ""
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:kube-apiserver-to-kubelet
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: User
name: kubernetes
[root@localhost cfg]# kubectl create -f apiserver-to-kubelet-rbac.yaml
再次操作就可以了
root@localhost:~/work/coredns # kubectl run -it --rm --restart=Never --image=harbor.superred.com/kubernetes/infoblox/dnstools:latest dnstools 1 ↵
If you don't see a command prompt, try pressing enter.
dnstools# nslookup nginx-test
Server: 10.0.0.2
Address: 10.0.0.2#53
Name: nginx-test.default.svc.superred.com
Address: 10.0.0.130
dnstools# nslookup kubernetes
Server: 10.0.0.2
Address: 10.0.0.2#53
Name: kubernetes.default.svc.superred.com
Address: 10.0.0.1
dnstools#
Heapster - container-monitor
k8s学习笔记- 部署prometheus - 屌丝的IT - 博客园
Kubernetes学习之路(二十四)之Prometheus监控 - 烟雨浮华 - 博客园
kubernetes监控-prometheus(十六) - 过眼风 - 博客园
k8s全栈监控之metrics-server和prometheus - 诗码者 - 博客园
监控系统:cAdvisor+Heapster+InfluxDB+Grafana,metrics-server,prometheus,k8s-prometheus-adapter
方案 | 开源 | 特点 | 监控范围 |
---|---|---|---|
cAdvisor+Heapster+InfluxDB+Grafana | Y | Simple | Container monitoring |
cAdvisor/exporter+Prometheus+Grafana | Y | Highly extensible | Full monitoring of containers, applications and hosts |
Prometheus+Grafana是监控告警解决方案里的后起之秀
通过各种exporter采集不同维度的监控指标,并通过Prometheus支持的数据格式暴露出来,Prometheus定期pull数据并用Grafana展示,异常情况使用AlertManager告警。
通过cadvisor采集容器、Pod相关的性能指标数据,并通过暴露的/metrics接口用prometheus抓取
通过prometheus-node-exporter采集主机的性能指标数据,并通过暴露的/metrics接口用prometheus抓取
应用侧自己采集容器中进程主动暴露的指标数据(暴露指标的功能由应用自己实现,并添加平台侧约定的annotation,平台侧负责根据annotation实现通过Prometheus的抓取)
通过kube-state-metrics采集k8s资源对象的状态指标数据,并通过暴露的/metrics接口用prometheus抓取
通过etcd、kubelet、kube-apiserver、kube-controller-manager、kube-scheduler自身暴露的/metrics获取节点上与k8s集群相关的一些特征指标数据。
实现思路
监控指标 | 具体实现 | 举例 |
---|---|---|
Pod performance | cAdvisor | Container CPU and memory utilization |
Node performance | node-exporter | Node CPU and memory utilization |
K8S resource objects | kube-state-metrics | Pod/Deployment/Service |
使用metric-server收集数据给k8s集群内使用,如kubectl,hpa,scheduler等
使用prometheus-operator部署prometheus,存储监控数据
使用kube-state-metrics收集k8s集群内资源对象数据
使用node_exporter收集集群中各节点的数据
使用prometheus收集apiserver,scheduler,controller-manager,kubelet组件数据
使用alertmanager实现监控报警
使用grafana实现数据可视化
Heapster is a project under the Kubernetes umbrella. Heapster is an aggregator, not the component that scrapes the data itself. The collection flow:
1.Heapster首先从apiserver获取集群中所有Node的信息。
2.通过这些Node上的kubelet获取有用数据,而kubelet本身的数据则是从cAdvisor得到。
3.所有获取到的数据都被推到Heapster配置的后端存储中,并还支持数据的可视化。
1.Heapster可以收集Node节点上的cAdvisor数据:CPU、内存、网络和磁盘
2.将每个Node上的cAdvisor的数据进行汇总
3.按照kubernetes的资源类型来集合资源,比如Pod、Namespace
4.默认的metric数据聚合时间间隔是1分钟。还可以把数据导入到第三方工具ElasticSearch、InfluxDB、Kafka、Graphite
5.展示:Grafana或Google Cloud Monitoring
现状
heapster已经被官方废弃(k8s 1.11版本中,HPA已经不再从hepaster获取数据)
CPU内存、HPA指标: 改为metrics-server
基础监控:集成到prometheus中,kubelet将metric信息暴露成prometheus需要的格式,使用Prometheus Operator
事件监控:集成到https://github.com/heptiolabs/eventrouter
从 v1.8 开始,资源使用情况的监控可以通过 Metrics API的形式获取,具体的组件为Metrics Server,用来替换之前的heapster,heapster从1.11开始逐渐被废弃。
Metrics-Server是集群核心监控数据的聚合器,从 Kubernetes1.8 开始,它作为一个 Deployment对象默认部署在由kube-up.sh脚本创建的集群中。
Metrics API
介绍Metrics-Server之前,必须要提一下Metrics API的概念
Metrics API相比于之前的监控采集方式(hepaster)是一种新的思路,官方希望核心指标的监控应该是稳定的,版本可控的,且可以直接被用户访问(例如通过使用 kubectl top 命令),或由集群中的控制器使用(如HPA),和其他的Kubernetes APIs一样。
官方废弃heapster项目,就是为了将核心资源监控作为一等公民对待,即像pod、service那样直接通过api-server或者client直接访问,不再是安装一个hepater来汇聚且由heapster单独管理。
假设每个pod和node我们收集10个指标,从k8s的1.6开始,支持5000节点,每个节点30个pod,假设采集粒度为1分钟一次,则:
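Working through the numbers stated above: 5000 nodes × 30 pods per node = 150,000 pods; adding the 5,000 nodes gives 155,000 objects; at 10 metrics each and one scrape per minute that is 1,550,000 metric samples per minute, or roughly 26,000 samples per second.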
因为k8s的api-server将所有的数据持久化到了etcd中,显然k8s本身不能处理这种频率的采集,而且这种监控数据变化快且都是临时数据,因此需要有一个组件单独处理他们,k8s版本只存放部分在内存中,于是metric-server的概念诞生了。
其实hepaster已经有暴露了api,但是用户和Kubernetes的其他组件必须通过master proxy的方式才能访问到,且heapster的接口不像api-server一样,有完整的鉴权以及client集成。这个api现在还在alpha阶段(18年8月),希望能到GA阶段。类api-server风格的写法:generic apiserver
With the Metrics Server component in place, the data is collected and an API is exposed, but the APIs need to be unified: how are requests sent to the API server under /apis/metrics.k8s.io forwarded to the Metrics Server? The answer is kube-aggregator, which was completed in Kubernetes 1.7; Metrics Server had not been released earlier precisely because it was waiting on kube-aggregator.
kube-aggregator(聚合api)主要提供:
Provide an API for registering API servers.
Summarize discovery information from all the servers.
Proxy client requests to individual servers.
详细设计文档:参考链接
metric api的使用:
Metrics API 只可以查询当前的度量数据,并不保存历史数据
Metrics API URI 为 /apis/metrics.k8s.io/,在 k8s.io/metrics 维护
必须部署 metrics-server 才能使用该 API,metrics-server 通过调用 Kubelet Summary API 获取数据
如:
http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes
http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes/
http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/namespace//pods/
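The URLs above assume kubectl proxy is listening on 127.0.0.1:8001; once metrics-server is deployed the same data can also be fetched directly through kubectl:
kubectl proxy --port=8001 &                             # serves the URLs listed above
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes    # same data without the proxy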
Metrics-Server
Metrics server定时从Kubelet的Summary API(类似/ap1/v1/nodes/nodename/stats/summary)采集指标信息,这些聚合过的数据将存储在内存中,且以metric-api的形式暴露出去。
Metrics server复用了api-server的库来实现自己的功能,比如鉴权、版本等,为了实现将数据存放在内存中,去掉了默认的etcd存储,引入了内存存储(即实现Storage interface)。因为存放在内存中,因此监控数据是没有持久化的,可以通过第三方存储来拓展,这个和heapster是一致的。
Metrics server出现后,新的Kubernetes 监控架构将变成上图的样子
核心流程(黑色部分):这是 Kubernetes正常工作所需要的核心度量,从 Kubelet、cAdvisor 等获取度量数据,再由metrics-server提供给 Dashboard、HPA 控制器等使用。
监控流程(蓝色部分):基于核心度量构建的监控流程,比如 Prometheus 可以从 metrics-server 获取核心度量,从其他数据源(如 Node Exporter 等)获取非核心度量,再基于它们构建监控告警系统。
官方地址:https://github.com/kubernetes-incubator/metrics-server、https://github.com/kubernetes-sigs/metrics-server
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
The downloaded manifest must be adapted to your own cluster, e.g. the image address and the resource limits.
Changes made here: the image address is replaced, resource limits are added, and the container command is set. This variant does not use CA verification towards the kubelets.
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
spec:
template:
spec:
containers:
- name: metrics-server
image: harbor.superred.com/kubernetes/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: 400m
memory: 512Mi
requests:
cpu: 50m
memory: 50Mi
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
Depending on your cluster setup, you may also need to change the flags passed to the Metrics Server container. The most useful flags:
--kubelet-preferred-address-types - the priority of node address types used when connecting to a node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
--kubelet-insecure-tls - do not verify the CA of the serving certificates presented by kubelets; for testing only
--requestheader-client-ca-file - the root certificate bundle used to verify client certificates on incoming requests
Apply the manifest:
[root@localhost metrics-server]# kubectl create -f components.yaml
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
等待一段时间即可查看效果
[root@localhost metrics-server]# watch kubectl get pods,svc -o wide --all-namespaces
[root@localhost metrics-server]# kubectl get pods,svc -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/coredns-654979db4b-hxq9g 1/1 Running 0 41m 10.244.38.2 10.10.3.178
kube-system pod/coredns-654979db4b-vknrv 1/1 Running 0 41m 10.244.45.2 10.10.3.177
kube-system pod/metrics-server-7f644ff97d-gr8lw 1/1 Running 0 29s 10.244.45.3 10.10.3.177
kubernetes-dashboard pod/dashboard-metrics-scraper-775b89678b-77rgh 1/1 Running 0 41m 10.244.40.2 10.10.3.170
kubernetes-dashboard pod/kubernetes-dashboard-66d54d4cd7-6fgn9 1/1 Running 0 41m 10.244.20.9 10.10.3.167
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.0.0.1 443/TCP 157m
kube-system service/kube-dns ClusterIP 10.0.0.2 53/UDP,53/TCP,9153/TCP 41m k8s-app=kube-dns
kube-system service/metrics-server ClusterIP 10.0.0.80 443/TCP 29s k8s-app=metrics-server
kubernetes-dashboard service/dashboard-metrics-scraper NodePort 10.0.0.130 8000:30002/TCP 41m k8s-app=dashboard-metrics-scraper
kubernetes-dashboard service/kubernetes-dashboard NodePort 10.0.0.156 443:30001/TCP 41m k8s-app=kubernetes-dashboard
[root@localhost metrics-server]# kubectl logs -f pod/metrics-server-7f644ff97d-gr8lw -n kube-system
I0209 10:15:32.502938 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
W0209 10:15:34.336160 1 authentication.go:296] Cluster doesn't provide requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
I0209 10:15:34.396591 1 secure_serving.go:116] Serving securely on [::]:4443
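Once the pod is up, the aggregated API registration and the metrics pipeline can be checked from the master (it may take a minute or two before data appears):
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl top nodes
kubectl top pods -n kube-system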
完整文件内容如下
[root@localhost metrics-server]# cat components.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:aggregated-metrics-reader
labels:
rbac.authorization.k8s.io/aggregate-to-view: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "nodes"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: metrics-server:system:auth-delegator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: metrics-server-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
name: v1beta1.metrics.k8s.io
spec:
service:
name: metrics-server
namespace: kube-system
group: metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
labels:
k8s-app: metrics-server
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
containers:
- name: metrics-server
image: harbor.superred.com/kubernetes/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: 400m
memory: 512Mi
requests:
cpu: 50m
memory: 50Mi
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
args:
- --cert-dir=/tmp
- --secure-port=4443
ports:
- name: main-port
containerPort: 4443
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- name: tmp-dir
mountPath: /tmp
nodeSelector:
kubernetes.io/os: linux
kubernetes.io/arch: "amd64"
---
apiVersion: v1
kind: Service
metadata:
name: metrics-server
namespace: kube-system
labels:
kubernetes.io/name: "Metrics-server"
kubernetes.io/cluster-service: "true"
spec:
selector:
k8s-app: metrics-server
ports:
- port: 443
protocol: TCP
targetPort: main-port
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:metrics-server
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- nodes/stats
- namespaces
- configmaps
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:metrics-server
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
Kubernetes API 安全机制详解_molixuebeibi的博客-CSDN博客
As mentioned above, metrics-server is an extension API server that depends on kube-aggregator, so the relevant aggregation parameters must be enabled on kube-apiserver, and this requires CA-based authentication.
The metrics-server certificate
# Note: "CN": "system:metrics-server" must be exactly this value, because it is referenced later when granting permissions; otherwise requests will be rejected as anonymous/forbidden.
$ cat > metrics-server-csr.json <
Generate the metrics-server certificate and private key:
$ cfssl gencert -ca=/opt/kubernetes/ssl/ca.pem -ca-key=/opt/kubernetes/ssl/ca-key.pem -config=/opt/kubernetes/ssl/ca-config.json -profile=kubernetes metrics-server-csr.json | cfssljson -bare metrics-server
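For reference, the elided metrics-server-csr.json typically has the following shape. This is only a sketch: the values in the names block are placeholders that should follow the same conventions as the other certificates in this deployment, while the CN must stay system:metrics-server as noted above. Copying the resulting files next to the other cluster certificates is also an assumption here.
$ cat > metrics-server-csr.json <<EOF
{
  "CN": "system:metrics-server",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "system"
    }
  ]
}
EOF
# after running the cfssl gencert command above:
$ cp metrics-server.pem metrics-server-key.pem /opt/kubernetes/ssl/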
Metrics Server collects cluster-wide resource usage information by scraping the metrics exposed by the kubelet on each node. Using it requires the aggregation layer to be enabled, so the following options must be added to the kube-apiserver startup options:
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
#--requestheader-allowed-names=aggregator \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file=/opt/kubernetes/ssl/metrics-server.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/metrics-server-key.pem \
--enable-aggregator-routing=true"
Parameter explanations:
--requestheader-username-headers (required, case-insensitive): the headers to inspect for the user identity; the first header that contains a value is used as the username.
--requestheader-client-ca-file (required): a PEM-encoded CA certificate used to verify the proxy's client certificate and prevent header spoofing.
--requestheader-allowed-names (optional): a list of certificate Common Names (CN). If set, a valid client certificate whose CN is in the list must be presented; if empty, any client certificate CN is accepted.
--requestheader-group-headers (1.6+, optional, case-insensitive): recommended value "X-Remote-Group"; all values of the specified headers are used as the user's groups.
--requestheader-extra-headers-prefix (1.6+, optional, case-insensitive): recommended value "X-Remote-Extra-"; for any header starting with one of the prefixes, the remainder of the header name becomes the key of an extra user attribute and the header value becomes its value.
--enable-aggregator-routing: must be set to true when kube-proxy is not running on the master nodes.
The complete kube-apiserver.conf is shown further below; the snippet above is only an overview.
1) cAdvisor is Google's open-source container monitoring service and is already integrated into Kubernetes (data collection agent).
By simply running the cAdvisor container on a host, users can view very detailed performance data for that node and its containers (CPU, memory, network, disk, filesystem, and so on) through its web UI or REST API.
By default cAdvisor caches data in memory, so its presentation capabilities are limited; it also supports several persistent storage backends and can save or aggregate monitoring data into Google BigQuery, InfluxDB, or Redis.
In newer Kubernetes versions, cAdvisor has been integrated into the kubelet component.
Note that the cAdvisor web UI only shows containers on the current host; other machines have to be reached through their own URLs. That is manageable with a few nodes but tedious at scale, so the cAdvisor data needs to be aggregated and visualised, typically with the "cAdvisor + InfluxDB + Grafana" combination.
To enable cAdvisor monitoring, it only needs to be enabled and configured on the kubelet command line.
Kubernetes has a well-known monitoring agent: cAdvisor. It runs on every Kubernetes node and collects monitoring data for the host and its containers (CPU, memory, filesystem, network, uptime). In newer versions, Kubernetes has folded the cAdvisor functionality into the kubelet component, and each node can be accessed directly over the web, e.g. http://10.10.3.119:4194/.
2) Heapster is a container-cluster monitoring and performance-analysis tool with native support for Kubernetes and CoreOS, but Heapster has been retired (data collection).
Heapster is a collector: it gathers the cAdvisor data from every node, aggregates it, and can also group resources by Kubernetes resource type such as Pod or Namespace, providing CPU, memory, network, and disk metrics for each. The default aggregation interval is 1 minute. It can also export the data to third-party backends such as InfluxDB.
The monitoring charts in the native Kubernetes dashboard used to come from Heapster. Horizontal Pod Autoscaling also used Heapster as its Resource Metrics API to fetch metrics.
3) InfluxDB is an open-source time-series database (data storage).
4) Grafana is an open-source data visualisation tool (data presentation).
kubernetes/cluster/addons/metrics-server at v1.18.8 · kubernetes/kubernetes · GitHub
业内方案 - container-monitor
08. 附录 - Docker最佳实践 - 《Kubernetes中文指南/实践手册(201801)》 - 书栈网 · BookStack
The Metrics API only serves current measurements; it does not keep historical data.
metrics-server periodically scrapes metrics from the kubelet Summary API (similar to /api/v1/nodes/<nodename>/stats/summary), aggregates the data in memory, and exposes it through the Metrics API.
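Since the data lives only in memory, the simplest way to inspect it is to query the aggregated Metrics API directly through the apiserver; kubectl can do this without any extra certificates:
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
$ kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods"
# pipe through python -m json.tool (or jq) to pretty-print the JSON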
Resource-metrics tool for K8S: metrics-server.
Custom-metrics monitoring tools: Prometheus and k8s-prometheus-adapter.
metrics-server is the aggregator for the cluster's core monitoring data. Early Kubernetes versions used the Heapster resource-monitoring tool, but since Kubernetes 1.8 resource usage metrics such as container CPU and memory are obtained through the Metrics API. These metrics can be accessed directly by users (for example with kubectl top) or consumed by controllers in the cluster (such as the HPA). Because the apiserver persists all of its data into etcd, Kubernetes itself cannot handle collection at this frequency, and monitoring data changes quickly and is ephemeral, so a dedicated component is needed to handle it.
Prometheus: Prometheus can collect metrics across many dimensions, such as CPU utilisation, number of network connections, packet send/receive rates, process creation and reclaim rates, and many more metrics that early Kubernetes did not support. Feeding the metrics Prometheus collects into Kubernetes lets Kubernetes decide, based on them, whether Pods need to be scaled.
Prometheus serves both as a monitoring system and as a provider of certain special resource metrics. These are not standard built-in Kubernetes metrics; they are called custom metrics. For Prometheus to expose the data it collects as such metrics, a plugin called k8s-prometheus-adapter is required. These metrics are the basis for deciding whether a Pod should be scaled, for example by CPU utilisation or memory usage.
With the introduction of Prometheus and k8s-prometheus-adapter, the new-generation Kubernetes monitoring architecture takes shape.
The new-generation K8S architecture
Core metrics pipeline: composed of the kubelet, metrics-server, and the API exposed through the API server; it provides cumulative CPU usage, real-time memory usage, Pod resource usage, and container disk usage.
Monitoring pipeline: collects all kinds of metrics from the system and makes them available to end users, storage systems, and the HPA. It includes the core metrics plus many non-core metrics. Non-core metrics cannot be interpreted by Kubernetes itself, so k8s-prometheus-adapter is needed to convert the data collected by Prometheus into a format Kubernetes understands.
Core metrics monitoring
Heapster used to fill this role, but it was deprecated after 1.12 and replaced by metrics-server. metrics-server is a user-developed API server that serves resource metrics rather than Pods or Deployments. It is not part of Kubernetes itself; it runs as a Pod hosted on Kubernetes. To let users consume the metrics-server API seamlessly, the new architecture places an aggregator in front of the Kubernetes API server and metrics-server, and the metrics are then retrieved through the API group /apis/metrics.k8s.io/v1beta1.
Any additional API servers added later can be integrated into the aggregator in the same way and served through it.
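As a concrete example of the core-metrics pipeline at work, an HPA that scales on CPU utilisation needs nothing beyond metrics-server. The sketch below targets the nginx-test Deployment that appears later in this document and assumes its Pods declare CPU requests:
$ cat > nginx-test-hpa.yaml <<EOF
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-test
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
EOF
$ kubectl apply -f nginx-test-hpa.yaml
$ kubectl get hpa nginx-test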
Prometheus overview
Beyond the resource metrics mentioned earlier (CPU, memory), users and administrators need many more metrics: Kubernetes metrics, container metrics, node resource metrics, application metrics, and so on. The custom metrics API allows arbitrary metrics to be requested, but an implementation of that API must be backed by a concrete monitoring system, and Prometheus was the first monitoring system for which such an adapter was developed. The Kubernetes Custom Metrics Adapter for Prometheus is provided by the k8s-prometheus-adapter project on GitHub. Its architecture diagram is as follows:
Keep in mind that Prometheus is itself a monitoring system with a server side and an agent side: the server pulls data from the monitored hosts, while on the agent side a node_exporter has to be deployed to collect and expose node data. To obtain Pod-level data or data from applications such as MySQL, the corresponding exporters also need to be deployed. The data can be queried with PromQL, but because Prometheus is a third-party solution, native Kubernetes cannot interpret Prometheus' custom metrics; k8s-prometheus-adapter is needed to translate these metric query interfaces into standard Kubernetes custom metrics.
Prometheus is an open-source service-monitoring system and time-series database that provides a generic data model along with fast interfaces for data collection, storage, and querying. Its core component, the Prometheus server, periodically pulls data from statically configured monitoring targets or from targets discovered automatically via service discovery; when newly pulled data exceeds the configured in-memory buffer, it is persisted to storage. The Prometheus component architecture diagram is as follows:
As the architecture diagram shows, each monitored host exposes its monitoring data through a dedicated exporter and waits for the Prometheus server to scrape it periodically. If alerting rules exist, the scraped data is evaluated against them; when an alert condition is met, an alert is generated and sent to Alertmanager for aggregation and routing. When a monitored target needs to push data actively, the Pushgateway component can receive and temporarily store the data until the Prometheus server collects it.
Any monitored target must first be registered with the monitoring system before time-series collection, storage, alerting, and visualisation can happen. Targets can be specified statically through configuration or managed dynamically through Prometheus' service-discovery mechanisms. Some component notes follow:
Prometheus can use the Kubernetes API Server directly as its service-discovery system and thereby dynamically discover and monitor every monitorable object in the cluster. Note in particular that Pod resources must carry the annotations listed below before Prometheus will automatically discover them and scrape their built-in metrics.
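The exact annotation keys depend on the relabeling rules in the Prometheus scrape configuration; the convention commonly used (including, as an assumption, by the giantswarm manifests applied later in this document) is sketched below, with the port value being only an example:
# In the Pod template of the workload:
#   annotations:
#     prometheus.io/scrape: "true"     # opt the Pod in to scraping
#     prometheus.io/port:   "8080"     # port that serves /metrics
#     prometheus.io/path:   "/metrics" # only needed if the path is not /metrics
# Or, for a quick test on an existing Pod:
$ kubectl annotate pod <pod-name> prometheus.io/scrape="true" prometheus.io/port="8080" --overwrite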
Also, if Prometheus is only meant to back custom metrics, deploying just the Prometheus server is enough; it does not even need data persistence. But to build a fully featured monitoring system, administrators also need to deploy node_exporter on every host, other workload-specific exporters as needed, and Alertmanager.
Kubernetes学习之路(二十四)之Prometheus监控 - 烟雨浮华 - 博客园
custom metrics - container-monitor
Looking at the default API versions of Kubernetes, you can see that the metrics.k8s.io group is not present yet.
Once metrics-server is deployed, running kubectl api-versions again shows the metrics.k8s.io group.
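A quick check before and after deploying metrics-server:
$ kubectl api-versions | grep metrics
metrics.k8s.io/v1beta1            # only present after metrics-server is deployed
$ kubectl get apiservice v1beta1.metrics.k8s.io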
Deploying metrics-server
Go to cluster/addons/metrics-server under the kubernetes project (at the matching release tag) and download the manifests.
1. metrics-server-deployment.yaml
In the metrics-server command section, add --kubelet-insecure-tls so the kubelet's serving certificate is not verified, and comment out port 10255; with it commented out, port 10250 is used and communication goes over HTTPS.
In the addon-resizer command section, fill in concrete values for cpu, memory, and extra-memory, and comment out minClusterSize={{ metrics_server_min_cluster_size }}.
vim metrics-server-deployment.yaml
vim resource-reader.yaml
cat auth-delegator.yaml
root@localhost:~/work/metrics # cat auth-delegator.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: metrics-server:system:auth-delegator
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
cat auth-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: metrics-server-auth-reader
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
cat metrics-apiservice.yaml
root@localhost:~/work/metrics # cat metrics-apiservice.yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
name: v1beta1.metrics.k8s.io
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
service:
name: metrics-server
namespace: kube-system
group: metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100
cat metrics-server-deployment.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
name: metrics-server-config
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: EnsureExists
data:
NannyConfiguration: |-
apiVersion: nannyconfig/v1alpha1
kind: NannyConfiguration
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server-v0.3.6
namespace: kube-system
labels:
k8s-app: metrics-server
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
version: v0.3.6
spec:
selector:
matchLabels:
k8s-app: metrics-server
version: v0.3.6
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
version: v0.3.6
annotations:
seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
hostNetwork: true
priorityClassName: system-cluster-critical
serviceAccountName: metrics-server
nodeSelector:
kubernetes.io/os: linux
containers:
- name: metrics-server
image: harbor.superred.com/kubernetes/metrics-server-amd64:v0.3.6
command:
- /metrics-server
- --metric-resolution=30s
- --requestheader-allowed-names=aggregator
# These are needed for GKE, which doesn't support secure communication yet.
# Remove these lines for non-GKE clusters, and when GKE supports token-based auth.
#- --kubelet-port=10255
#- --deprecated-kubelet-completely-insecure=true
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
- --kubelet-insecure-tls
ports:
- containerPort: 443
name: https
protocol: TCP
- name: metrics-server-nanny
image: harbor.superred.com/kubernetes/addon-resizer:1.8.11
resources:
limits:
cpu: 100m
memory: 300Mi
requests:
cpu: 5m
memory: 50Mi
env:
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: MY_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: metrics-server-config-volume
mountPath: /etc/config
command:
- /pod_nanny
- --config-dir=/etc/config
#- --cpu={{ base_metrics_server_cpu }}
- --cpu=80m
- --extra-cpu=0.5m
#- --memory={{ base_metrics_server_memory }}
- --memory=80Mi
#- --extra-memory={{ metrics_server_memory_per_node }}Mi
- --extra-memory=8Mi
- --threshold=5
- --deployment=metrics-server-v0.3.6
- --container=metrics-server
- --poll-period=300000
- --estimator=exponential
# Specifies the smallest cluster (defined in number of nodes)
# resources will be scaled to.
#- --minClusterSize={{ metrics_server_min_cluster_size }}
volumes:
- name: metrics-server-config-volume
configMap:
name: metrics-server-config
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
cat metrics-server-service.yaml
root@localhost:~/work/metrics # cat metrics-server-service.yaml
apiVersion: v1
kind: Service
metadata:
name: metrics-server
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/cluster-service: "true"
kubernetes.io/name: "Metrics-server"
spec:
selector:
k8s-app: metrics-server
ports:
- port: 443
protocol: TCP
targetPort: https
cat resource-reader.yaml
root@localhost:~/work/metrics # cat resource-reader.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:metrics-server
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- namespaces
- nodes/stats
verbs:
- get
- list
- watch
- apiGroups:
- "apps"
resources:
- deployments
verbs:
- get
- list
- update
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:metrics-server
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
kubectl create -f .
Enable the aggregation layer API
1) Modify the kube-apiserver startup configuration file on the master:
root@localhost:~/work/metrics # cat /opt/kubernetes/cfg/kube-apiserver.conf
KUBE_APISERVER_OPTS="--logtostderr=false \
--v=2 \
--log-dir=/opt/kubernetes/logs \
--etcd-servers=https://10.10.3.139:2379,https://10.10.3.197:2379,https://10.10.3.179:2379 \
--bind-address=0.0.0.0 \
--secure-port=6443 \
--advertise-address=10.10.3.139 \
--allow-privileged=true \
--service-cluster-ip-range=10.0.0.0/24 \
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \
--authorization-mode=RBAC,Node \
--enable-bootstrap-token-auth=true \
--token-auth-file=/opt/kubernetes/cfg/token.csv \
--service-node-port-range=1024-65535 \
--kubelet-client-certificate=/opt/kubernetes/ssl/apiserver.pem \
--kubelet-client-key=/opt/kubernetes/ssl/apiserver-key.pem \
--tls-cert-file=/opt/kubernetes/ssl/apiserver.pem \
--tls-private-key-file=/opt/kubernetes/ssl/apiserver-key.pem \
--client-ca-file=/opt/kubernetes/ssl/ca.pem \
--service-account-key-file=/opt/kubernetes/ssl/ca-key.pem \
--etcd-cafile=/opt/etcd/ssl/ca.pem \
--etcd-certfile=/opt/etcd/ssl/etcd.pem \
--etcd-keyfile=/opt/etcd/ssl/etcd-key.pem \
--audit-log-maxage=30 \
--audit-log-maxbackup=3 \
--audit-log-maxsize=100 \
--audit-log-path=/opt/kubernetes/logs/k8s-audit.log \
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
#--requestheader-allowed-names=aggregator \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file=/opt/kubernetes/ssl/metrics-server.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/metrics-server-key.pem \
--enable-aggregator-routing=true"
Append the following configuration.
Note: the masters do not run kube-proxy, so --enable-aggregator-routing=true must be added.
# The value of --requestheader-allowed-names must match the certificate CN
#--runtime-config=api/all=true \
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
#--requestheader-allowed-names=aggregator \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--proxy-client-cert-file=/opt/kubernetes/ssl/metrics-server.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/metrics-server-key.pem \
--enable-aggregator-routing=true"
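After appending the flags, kube-apiserver must be restarted for the aggregation settings to take effect; assuming it is managed as a systemd unit as elsewhere in this document:
$ systemctl daemon-reload
$ systemctl restart kube-apiserver
$ systemctl status kube-apiserver -l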
2) Modify kube-controller-manager.service on the master (I did not change this):
#vi /usr/lib/systemd/system/kube-controller-manager.service
--horizontal-pod-autoscaler-use-rest-clients=true
View node and Pod metrics
If you run into a permissions problem, you can grant full permissions first just to confirm everything works, then narrow the permissions down as appropriate (a narrower grant is sketched after the fix below).
root@localhost:~/work/metrics # kubectl top node 1 ↵
Error from server (Forbidden): nodes.metrics.k8s.io is forbidden: User "system:metrics-server" cannot list resource "nodes" in API group "metrics.k8s.io" at the cluster scope
[root@localhost cfg]# kubectl create clusterrolebinding wubo:metrics-server --clusterrole=cluster-admin --user=system:metrics-server
clusterrolebinding.rbac.authorization.k8s.io/wubo:metrics-server created
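Binding cluster-admin is the quick fix used above; a narrower grant that only covers the Metrics API is enough for kubectl top and could look like the following sketch (the ClusterRole name is arbitrary):
$ cat > metrics-reader-rbac.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metrics-reader
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["nodes", "pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-reader:system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: metrics-reader
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:metrics-server
EOF
$ kubectl apply -f metrics-reader-rbac.yaml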
At this point, the metrics-server deployment is complete.
kubernetes1.18安装kube-prometheus_码农崛起-CSDN博客
wget https://raw.githubusercontent.com/giantswarm/prometheus/master/manifests-all.yaml
root@localhost:~/work/jiankong/prometheus # cat manifests-all.yaml| grep image
image: harbor.superred.com/kubernetes/prometheus/alertmanager:v0.7.1    # alerting
image: harbor.superred.com/kubernetes/grafana/grafana:4.2.0    # dashboards / visualisation
image: harbor.superred.com/kubernetes/giantswarm/tiny-tools
image: harbor.superred.com/kubernetes/prom/prometheus:v1.7.0    # scrapes and stores the metrics; can be paired with InfluxDB for long-term storage
image: harbor.superred.com/kubernetes/google_containers/kube-state-metrics:v0.5.0    # state of Kubernetes objects (deployments, nodes, pods)
image: harbor.superred.com/kubernetes/dockermuenster/caddy:0.9.3
image: harbor.superred.com/kubernetes/prom/node-exporter:v0.14.0    # host/node-level metrics
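With the images mirrored to the private registry paths shown above, the whole stack can be applied in one go, after which the Pods land in the monitoring namespace as in the watch output below (this sketch assumes manifests-all.yaml creates the namespace itself):
$ kubectl apply -f manifests-all.yaml
$ kubectl get pod -n monitoring -o wide -w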
Every 2.0s: kubectl get pod,svc,ingress -o wide --all-namespaces Thu Sep 10 16:46:02 2020
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system pod/coredns-85b4878f78-t4jvl 1/1 Running 0 13d 10.244.1.4 10.10.3.197
kube-system pod/kube-flannel-ds-amd64-4b69t 1/1 Running 0 13d 10.10.3.197 10.10.3.197
kube-system pod/kube-flannel-ds-amd64-pnr7b 1/1 Running 0 13d 10.10.3.179 10.10.3.179
kube-system pod/metrics-server-v0.3.6-7444db4cbb-4zvn5 2/2 Running 0 26h 10.10.3.197 10.10.3.197
kubernetes-dashboard pod/dashboard-metrics-scraper-775b89678b-x4qp6 1/1 Running 0 37h 10.244.1.27 10.10.3.197
kubernetes-dashboard pod/kubernetes-dashboard-66d54d4cd7-6z64c 1/1 Running 0 37h 10.244.1.26 10.10.3.197
monitoring pod/alertmanager-db5fbcfd5-ssq8c 1/1 Running 0 22h 10.244.0.33 10.10.3.179
monitoring pod/grafana-core-d5fcd78b-fdw2h 1/1 Running 0 22h 10.244.0.38 10.10.3.179
monitoring pod/grafana-import-dashboards-klqgp 0/1 Init:0/1 0 22h 10.244.0.34 10.10.3.179
monitoring pod/kube-state-metrics-886f8f84d-l9sxb 1/1 Running 0 22h 10.244.1.37 10.10.3.197
monitoring pod/node-directory-size-metrics-d8c7q 2/2 Running 0 22h 10.244.1.36 10.10.3.197
monitoring pod/node-directory-size-metrics-zt5mq 2/2 Running 0 22h 10.244.0.36 10.10.3.179
monitoring pod/prometheus-core-6b9687f87d-ng9tl 1/1 Running 1 22h 10.244.1.38 10.10.3.197
monitoring pod/prometheus-node-exporter-4cs9h 1/1 Running 0 22h 10.10.3.197 10.10.3.197
monitoring pod/prometheus-node-exporter-fnld5 1/1 Running 0 22h 10.10.3.179 10.10.3.179
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.0.0.1 443/TCP 13d
default service/nginx-test LoadBalancer 10.0.0.130 80:62931/TCP 13d run=nginx-test
kube-system service/kube-dns ClusterIP 10.0.0.2 53/UDP,53/TCP,9153/TCP 13d k8s-app=kube-dns
kube-system service/metrics-server ClusterIP 10.0.0.109 443/TCP 26h k8s-app=metrics-server
kubernetes-dashboard service/dashboard-metrics-scraper NodePort 10.0.0.135 8000:30002/TCP 13d k8s-app=dashboard-metrics-scraper
kubernetes-dashboard service/kubernetes-dashboard NodePort 10.0.0.89 443:30001/TCP 13d k8s-app=kubernetes-dashboard
monitoring service/alertmanager NodePort 10.0.0.57 9093:49093/TCP 22h app=alertmanager
monitoring service/grafana NodePort 10.0.0.113 3000:43000/TCP 22h app=grafana,component=core
monitoring service/kube-state-metrics NodePort 10.0.0.66 8080:48080/TCP 22h app=kube-state-metrics
monitoring service/prometheus NodePort 10.0.0.10 9090:49090/TCP 22h app=prometheus,component=core
monitoring service/prometheus-node-exporter ClusterIP None 9100/TCP 22h app=prometheus,component=node-exporter
After a successful installation, the metrics API can be queried at the following address:
root@localhost:~/work/metrics # curl --cacert /opt/kubernetes/ssl/ca.pem --key /opt/kubernetes/ssl/apiserver-key.pem --cert /opt/kubernetes/ssl/apiserver.pem https://10.10.3.139:6443/apis/metrics.k8s.io/v1beta1
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "metrics.k8s.io/v1beta1",
"resources": [
{
"name": "nodes",
"singularName": "",
"namespaced": false,
"kind": "NodeMetrics",
"verbs": [
"get",
"list"
]
},
{
"name": "pods",
"singularName": "",
"namespaced": true,
"kind": "PodMetrics",
"verbs": [
"get",
"list"
]
}
]
}#
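With the API answering, kubectl top now returns live usage data (names and numbers will of course differ per cluster):
$ kubectl top node
$ kubectl top pod --all-namespaces
$ kubectl top pod -n monitoring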
Metrics Server's resource usage grows as the number of Pods in the cluster grows, so addon-resizer is used to vertically scale this container. addon-resizer scales Metrics Server linearly with the number of nodes in the cluster, so that it always has enough capacity to serve the complete metrics API (see the reference links below for details).
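With the nanny flags used in the deployment above (--cpu=80m, --extra-cpu=0.5m, --memory=80Mi, --extra-memory=8Mi), the requested resources grow roughly linearly with node count; a rough worked example:
# requested cpu    = 80m  + 0.5m * nodes   (10 nodes -> ~85m)
# requested memory = 80Mi + 8Mi  * nodes   (10 nodes -> 160Mi)
# --threshold=5 means the nanny only rewrites the resources when they drift more than 5% from the estimate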
k8s全栈监控之metrics-server和prometheus - 诗码者 - 博客园
部署Kubernetes集群(二进制 v1.18.5版)_linux丶晨星-CSDN博客
【运维】K8S集群部署系列之普通Node节点搭建(八)_Let's Golang的博客-CSDN博客
部署Kubernetes集群+Dashboard可视化页面-1.18.6版本_大鹅的博客-CSDN博客
03 . 二进制部署kubernetes1.18.4 - 常见-youmen - 博客园
部署一套完整的Kubernetes高可用集群(二进制,最新版v1.18)上 - 李振良的技术博客 - 博客园
kubernetes(k8s)coredns1.5.0部署 - 简书
你们威哥的博客
kubernetes(七) 二进制部署k8s(1.18.4版本)【附源码】_alexhuiwang_51CTO博客
K8S 资源指标监控-部署metrics-server_wangmiaoyan的博客-CSDN博客_k8s metrics
【原】二进制部署 k8s 1.18.3 - liyongjian5179 - 博客园
kubernetes认证授权机制【附源码】_yzy121403725_51CTO博客
=========================================================================
kubernetes(七) 二进制部署k8s(1.18.4版本)【附源码】_alexhuiwang_51CTO博客
As a container cluster system, Kubernetes achieves application-level high availability through health checks plus restart policies for Pod self-healing, and through its scheduling it spreads Pods across nodes, keeps the desired replica count, and automatically brings Pods up on other nodes when a node fails.
For a Kubernetes cluster, high availability should also cover two further layers: high availability of the etcd database and high availability of the Kubernetes master components. We have already built a 3-node etcd cluster for high availability, so this section explains and implements high availability for the master nodes.
The master node plays the role of the control centre, maintaining the healthy working state of the whole cluster by continuously communicating with the kubelet and kube-proxy on the worker nodes. If the master node fails, no cluster management is possible through kubectl or the API.
The master runs three main services: kube-apiserver, kube-controller-manager, and kube-scheduler. kube-controller-manager and kube-scheduler already achieve high availability through their own leader-election mechanism, so master high availability mainly concerns kube-apiserver. Since that component serves an HTTP API, making it highly available is similar to a web server: put a load balancer in front of it and scale it horizontally.
Multi-master architecture diagram:
Add a new host, centos7-node7, in the k8s-master2/3 roles from the plan above.
k8s-master2 | 10.10.3.174 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd |
k8s-master3 | 10.10.3.185 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd |
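One common way to load-balance kube-apiserver, as described above, is an L4 proxy in front of the masters. The sketch below uses nginx's stream module with the planned master IPs and an assumed frontend port of 16443; nginx must be built with the stream module, and the node and admin kubeconfigs would then point at the load balancer address instead of a single master:
$ cat > /etc/nginx/nginx.conf <<'EOF'
worker_processes auto;
events {
    worker_connections 1024;
}
stream {
    upstream kube-apiserver {
        server 10.10.3.139:6443;
        server 10.10.3.174:6443;
        server 10.10.3.185:6443;
    }
    server {
        listen 16443;
        proxy_pass kube-apiserver;
    }
}
EOF
$ nginx -t && systemctl restart nginx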
flanneld启动报错Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized - 峰哥ge - 博客园
Kubernetes部署-flannel 报错(Couldn't fetch network config)--已解决_zhuzhuxiazst的专栏-CSDN博客
二进制部署k8s 1.18.14(高可用) - 流年似水zlw - 博客园
《蹲坑学kubernetes》之七:签发CA证书
etcd+https部署 - 于老三 - 博客园