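Metrics Server is the cluster-wide aggregator of core monitoring data; it replaces the retired Heapster.
Container metrics come mainly from cAdvisor, which is built into the kubelet. With Metrics Server in place, users can read these metrics through the standard Kubernetes API.
Metrics Server is not part of kube-apiserver. It is deployed separately and served alongside kube-apiserver through the Aggregator plug-in mechanism, so clients still see a single API endpoint.
kube-aggregator is essentially a proxy that picks the API backend to forward to based on the request URL.
Metrics Server covers core metrics: it exposes the metrics.k8s.io API and reports only CPU and memory usage for Nodes and Pods. Custom metrics are handled by other components such as Prometheus.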
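The aggregation described above can be observed on a running cluster: Metrics Server registers an APIService object for the metrics.k8s.io group, and kube-aggregator forwards matching requests to it. A minimal sketch, assuming the object is named v1beta1.metrics.k8s.io as in the upstream manifest; the function name is made up for illustration:

```shell
# Print the "Available" condition of the metrics.k8s.io APIService.
# The value is "True" only when kube-aggregator can reach Metrics Server.
metrics_api_available() {
  kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}
```

Running metrics_api_available is a quick way to tell whether a failing kubectl top is caused by the Metrics Server backend rather than by the client.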
Downloads:
https://github.com/kubernetes-incubator/metrics-server
https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
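Push the image to the Harbor registry.
In the manifest, change the port in the section shown below to 4443, because binding a port below 1024 requires extra privileges.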
[root@server1 ~]# cat components.yaml
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    image: metrics-server:v0.5.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /livez
        port: https
        scheme: HTTPS
      periodSeconds: 10
    name: metrics-server
    ports:
    - containerPort: 4443
Apply the manifest and check the pods in kube-system. The metrics-server pod is running but not ready:
[root@server1 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
metrics-server-5567648887-gs6qw 0/1 Running 0 5m26s
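After deployment, check the Metrics Server pod logs.
Error 1: dial tcp: lookup server2 on 10.96.0.10:53: no such host
This happens because no internal DNS server knows the node hostnames, so metrics-server cannot resolve them. One fix is to edit the CoreDNS ConfigMap and add each node's hostname in a hosts block inside the Corefile's server block; every Pod can then resolve the node names through CoreDNS.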
kubectl edit configmap coredns -n kube-system
hosts {
    172.25.3.1 server1
    172.25.3.2 server2
    172.25.3.4 server4
    fallthrough
}
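Typing the node list by hand is error-prone once the cluster grows. The sketch below generates the same hosts block from the Node objects, assuming each node reports an InternalIP address; the function name is hypothetical, and the output still has to be pasted into the Corefile manually:

```shell
# Print a CoreDNS "hosts" block with one "IP hostname" line per node.
coredns_hosts_block() {
  echo "hosts {"
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{" "}{.metadata.name}{"\n"}{end}' |
    sed 's/^/    /'              # indent each entry inside the block
  echo "    fallthrough"         # pass unmatched names on to the next plugin
  echo "}"
}
```

The fallthrough line matters: without it, names that are not listed in the block would not be handed to the remaining plugins in the Corefile.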
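Error 2: x509: certificate signed by unknown authority
The kubelet serving certificates are self-signed by default, so Metrics Server cannot verify them against the cluster CA. Metrics Server accepts a --kubelet-insecure-tls flag that skips this verification, but the project documentation states that it is not recommended for production.
Instead, enable TLS bootstrapping so that the kubelet serving certificates are issued by the cluster CA: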
[root@server1 ~]# vim /var/lib/kubelet/config.yaml
[root@server1 ~]# systemctl restart kubelet
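The transcript does not show what was changed in /var/lib/kubelet/config.yaml. For kubelet serving-certificate bootstrapping the relevant KubeletConfiguration field is serverTLSBootstrap: true, so the edit was presumably that one line, made on every node before restarting its kubelet. A minimal sketch under that assumption; the helper name is hypothetical:

```shell
# Append "serverTLSBootstrap: true" to a kubelet config file
# unless the key is already present.
# The default path is the file edited above; pass another path as $1.
enable_server_tls_bootstrap() {
  local cfg="${1:-/var/lib/kubelet/config.yaml}"
  grep -q '^serverTLSBootstrap:' "$cfg" ||
    echo 'serverTLSBootstrap: true' >> "$cfg"
}
```

The helper is idempotent, so running it twice leaves a single entry. It does not change a key that is already set to false.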
[root@server1 ~]# kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-kwhxf 61s kubernetes.io/kubelet-serving system:node:server2 Pending
csr-m9bkn 3s kubernetes.io/kubelet-serving system:node:server4 Pending
[root@server1 ~]# kubectl certificate approve csr-kwhxf
certificatesigningrequest.certificates.k8s.io/csr-kwhxf approved
[root@server1 ~]# kubectl certificate approve csr-m9bkn
certificatesigningrequest.certificates.k8s.io/csr-m9bkn approved
[root@server1 ~]# kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-kwhxf 85s kubernetes.io/kubelet-serving system:node:server2 Approved,Issued
csr-m9bkn 27s kubernetes.io/kubelet-serving system:node:server4 Approved,Issued
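Approving requests one at a time, as above, gets tedious with many nodes. The sketch below lists and approves every pending kubelet-serving request in one pass. It assumes the column layout shown in the transcript (signer name in the third column, condition in the last), and the function names are hypothetical. Approving without checking the requestor is only reasonable on a lab cluster.

```shell
# Print the names of CSRs that are Pending and use the kubelet-serving signer.
pending_serving_csrs() {
  kubectl get csr --no-headers |
    awk '$3 == "kubernetes.io/kubelet-serving" && $NF == "Pending" {print $1}'
}

# Approve each of them in turn.
approve_pending_serving_csrs() {
  local name
  for name in $(pending_serving_csrs); do
    kubectl certificate approve "$name"
  done
}
```

Running pending_serving_csrs on its own first gives a chance to review the list before anything is approved.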
Once the certificates have been issued, the pod becomes ready:
[root@server1 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
metrics-server-5567648887-gs6qw 1/1 Running 0 7m48s
Fetch the raw metrics for server1:
[root@server1 ~]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes/server1"
{"kind":"NodeMetrics","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"name":"server1","creationTimestamp":"2021-08-03T09:58:30Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"server1","kubernetes.io/os":"linux","node-role.kubernetes.io/control-plane":"","node-role.kubernetes.io/master":"","node.kubernetes.io/exclude-from-external-load-balancers":""}},"timestamp":"2021-08-03T09:57:32Z","window":"1m1s","usage":{"cpu":"161350086n","memory":"1415200Ki"}}
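The usage values in this response carry Kubernetes quantity suffixes: n means nanocores (billionths of a CPU core) and Ki means kibibytes. The sketch below converts them to the millicore and MiB units that kubectl top displays. The function names are hypothetical, only these two suffixes are handled, and integer division truncates, so results may differ slightly from kubectl's own rounding:

```shell
# Convert a CPU quantity such as "161350086n" to whole millicores.
nanocores_to_millicores() {
  local n="${1%n}"             # strip the trailing "n"
  echo $(( $n / 1000000 ))     # 1 millicore = 1,000,000 nanocores
}

# Convert a memory quantity such as "1415200Ki" to whole MiB.
kib_to_mib() {
  local k="${1%Ki}"            # strip the trailing "Ki"
  echo $(( $k / 1024 ))        # 1 MiB = 1024 KiB
}

nanocores_to_millicores 161350086n   # prints 161
kib_to_mib 1415200Ki                 # prints 1382
```

Values sampled at different moments will not match exactly, which is why kubectl top run later can report different numbers for the same node.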
Show resource usage per node:
[root@server1 ~]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
server1 161m 8% 1284Mi 67%
server2 79m 3% 482Mi 25%
server4 103m 5% 647Mi 34%
Inspect the metrics-server Service:
[root@server1 ~]# kubectl -n kube-system describe svc metrics-server
Name: metrics-server
Namespace: kube-system
Labels: k8s-app=metrics-server
Annotations: <none>
Selector: k8s-app=metrics-server
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.102.154.231
IPs: 10.102.154.231
Port: https 443/TCP
TargetPort: https/TCP
Endpoints: 10.244.179.78:4443
Session Affinity: None
Events: <none>
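Dashboard gives users a web UI for viewing information about the cluster. With Kubernetes Dashboard, users can deploy containerized applications, monitor application status, troubleshoot problems, and manage Kubernetes resources.
Project page: https://github.com/kubernetes/dashboard
Deployment manifest: https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-rc5/aio/deploy/recommended.yaml
Pull the images and push them to Harbor: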
[root@server1 dash]# docker pull kubernetesui/dashboard:v2.3.1
v2.3.1: Pulling from kubernetesui/dashboard
b82bd84ec244: Pull complete
21c9e94e8195: Pull complete
Digest: sha256:ec27f462cf1946220f5a9ace416a84a57c18f98c777876a8054405d1428cc92e
Status: Downloaded newer image for kubernetesui/dashboard:v2.3.1
docker.io/kubernetesui/dashboard:v2.3.1
[root@server1 dash]# docker pull kubernetesui/metrics-scraper:v1.0.6
v1.0.6: Pulling from kubernetesui/metrics-scraper
47a33a630fb7: Pull complete
62498b3018cb: Pull complete
Digest: sha256:1f977343873ed0e2efd4916a6b2f3075f310ff6fe42ee098f54fc58aa7a28ab7
Status: Downloaded newer image for kubernetesui/metrics-scraper:v1.0.6
docker.io/kubernetesui/metrics-scraper:v1.0.6
[root@server1 dash]# docker tag kubernetesui/metrics-scraper:v1.0.6 reg.westos.org/library/metrics-scraper:v1.0.6
[root@server1 dash]# docker tag kubernetesui/dashboard:v2.3.1 reg.westos.org/library/dashboard:v2.3.1
[root@server1 dash]# docker push reg.westos.org/library/dashboard:v2.3.1
The push refers to repository [reg.westos.org/library/dashboard]
c94f86b1c637: Pushed
8ca79a390046: Pushed
v2.3.1: digest: sha256:e5848489963be532ec39d454ce509f2300ed8d3470bdfb8419be5d3a982bb09a size: 736
[root@server1 dash]# docker push reg.westos.org/library/metrics-scraper:v1.0.6
The push refers to repository [reg.westos.org/library/metrics-scraper]
a652c34ae13a: Pushed
6de384dd3099: Pushed
v1.0.6: digest: sha256:c09adb7f46e1a9b5b0bde058713c5cb47e9e7f647d38a37027cd94ef558f0612 size: 736
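The pull, tag, and push steps above follow the same pattern for each image, so they can be folded into a small helper. A minimal sketch, assuming the reg.westos.org/library project used in the transcript; the function names are hypothetical:

```shell
# Map a Docker Hub reference such as kubernetesui/dashboard:v2.3.1
# to the matching reference in the Harbor project used above.
harbor_ref() {
  echo "reg.westos.org/library/${1##*/}"   # keep only "name:tag"
}

# Pull one image, retag it for Harbor, and push it.
mirror_image() {
  docker pull "$1" &&
    docker tag "$1" "$(harbor_ref "$1")" &&
    docker push "$(harbor_ref "$1")"
}
```

Calling mirror_image kubernetesui/dashboard:v2.3.1 and then mirror_image kubernetesui/metrics-scraper:v1.0.6 repeats the sequence shown in the transcript.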
Deploy:
kubectl apply -f recommended.yaml
Check the deployed Services and the other resources:
[root@server1 dash]# kubectl -n kubernetes-dashboard get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.101.84.145 <none> 8000/TCP 34s
kubernetes-dashboard ClusterIP 10.110.109.97 <none> 443/TCP 34s
[root@server1 dash]# kubectl -n kubernetes-dashboard get all
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-6875fdf695-lmshg 1/1 Running 0 41s
pod/kubernetes-dashboard-55c66865b7-j66xk 1/1 Running 0 41s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.101.84.145 <none> 8000/TCP 41s
service/kubernetes-dashboard ClusterIP 10.110.109.97 <none> 443/TCP 41s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/dashboard-metrics-scraper 1/1 1 1 41s
deployment.apps/kubernetes-dashboard 1/1 1 1 41s
NAME DESIRED CURRENT READY AGE
replicaset.apps/dashboard-metrics-scraper-6875fdf695 1 1 1 41s
replicaset.apps/kubernetes-dashboard-55c66865b7 1 1 1 41s
Change the Service type to LoadBalancer so that the dashboard can be reached from outside the cluster:
[root@server1 dash]# kubectl -n kubernetes-dashboard edit svc kubernetes-dashboard
service/kubernetes-dashboard edited
[root@server1 dash]# kubectl -n kubernetes-dashboard get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.101.84.145 <none> 8000/TCP 8m2s
kubernetes-dashboard LoadBalancer 10.110.109.97 172.25.3.11 443:31222/TCP 8m2s
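The interactive edit changes a single field, spec.type, so the same result can be reached without an editor. A minimal sketch using kubectl patch; the function name is hypothetical. An external IP is only assigned when the cluster has a load balancer implementation (MetalLB, for example); the transcript does not say which one is in use here.

```shell
# Switch the dashboard Service to type LoadBalancer non-interactively.
expose_dashboard_lb() {
  kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard \
    -p '{"spec":{"type":"LoadBalancer"}}'
}
```

A patch is easier to script and to repeat than kubectl edit, which suits a deployment that is rebuilt often.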
Logging in to the dashboard requires authentication. Get the token of the kubernetes-dashboard ServiceAccount:
[root@server1 dash]# kubectl -n kubernetes-dashboard get secrets
NAME TYPE DATA AGE
default-token-d4cdb kubernetes.io/service-account-token 3 114s
kubernetes-dashboard-certs Opaque 0 114s
kubernetes-dashboard-csrf Opaque 1 114s
kubernetes-dashboard-key-holder Opaque 2 114s
kubernetes-dashboard-token-tltxv kubernetes.io/service-account-token 3 114s
[root@server1 dash]# kubectl -n kubernetes-dashboard describe secrets kubernetes-dashboard-token-tltxv
Name: kubernetes-dashboard-token-tltxv
Namespace: kubernetes-dashboard
Labels: <none>
Annotations: kubernetes.io/service-account.name: kubernetes-dashboard
kubernetes.io/service-account.uid: c1c384ba-9ad7-45eb-8c56-b6f843d73665
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1066 bytes
namespace: 20 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6InE4VVNZcU1lWVFJUTZRMDZPcHk4LUlBMF8yN1ltWFZlY3lKcWlyR2MtN1EifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi10bHR4diIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6ImMxYzM4NGJhLTlhZDctNDVlYi04YzU2LWI2Zjg0M2Q3MzY2NSIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.qmKfkESgalW64ZnMEEB8_LMdFCA83s_W3VDwhspn243zGXFq6omj5spIORFqS4vUUgHL85fpW9hCQ5EKg8h0uYIlSRWmrotS9KWvqoCTAYkLOYy1WTPcJlmvCVWVg1kARD78y0yLaC0cxFljUImMGRHGwB1vHClWDVYQqO07vo1UpzKyJ6nulla_KN-jdX__g-b1V1ENUSzY0dYjhuudWyTZxqyCs-iH8yON7MX8VzirFghQcpqIvXd0RN7FoFXWNamGi4RNab7jekSuxrhvUizu2qSWnVkJAKZuOUWjWJP5ftndoac8fbGpoN8o-sR-LO4HSHbOW6Ov2DPtQGVpQg
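The Secret name ends in a random suffix, so copying it from the listing does not script well. The sketch below looks the name up through the ServiceAccount and prints only the decoded token. It assumes a cluster like the one above, where token Secrets are created automatically for each ServiceAccount; the function name is hypothetical:

```shell
# Print the bearer token of the kubernetes-dashboard ServiceAccount.
dashboard_token() {
  local ns=kubernetes-dashboard secret
  # Name of the first Secret attached to the ServiceAccount.
  secret="$(kubectl -n "$ns" get sa kubernetes-dashboard \
    -o jsonpath='{.secrets[0].name}')"
  # The token is stored base64-encoded under .data.token.
  kubectl -n "$ns" get secret "$secret" -o jsonpath='{.data.token}' | base64 -d
}
```

From Kubernetes 1.24 on, these Secrets are no longer created automatically; there, kubectl -n kubernetes-dashboard create token kubernetes-dashboard issues a short-lived token instead.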
By default the dashboard's ServiceAccount has no permission to operate on the cluster, so it must be granted access:
[root@server1 dash]# cat rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
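The transcript shows the manifest but not the step that applies it, which would presumably be kubectl apply -f rbac.yaml. Whether the binding took effect can then be checked by impersonating the ServiceAccount. A minimal sketch; the function name is hypothetical:

```shell
# Ask the API server whether the dashboard's ServiceAccount may perform
# the given action, e.g. "list nodes". kubectl prints "yes" or "no".
dashboard_can_i() {
  kubectl auth can-i "$@" \
    --as=system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard
}
```

With the binding in place, dashboard_can_i list nodes should answer yes. Note that cluster-admin grants full control of the cluster, so a narrower ClusterRole may be preferable outside a lab.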
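Test access. The Service answers on the external IP, although curl rejects the dashboard's certificate because it is not signed by a trusted CA: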
[root@server1 dash]# curl https://172.25.3.11
curl: (60) Issuer certificate is invalid.
More details here: http://curl.haxx.se/docs/sslcerts.html
curl performs SSL certificate verification by default, using a "bundle"
of Certificate Authority (CA) public keys (CA certs). If the default
bundle file isn't adequate, you can specify an alternate file
using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
the bundle, the certificate verification probably failed due to a
problem with the certificate (it might be expired, or the name might
not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
the -k (or --insecure) option.
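As the curl message suggests, the -k option skips certificate verification, which is enough for a reachability check against a self-signed certificate. A minimal sketch that prints only the HTTP status code; the function name is hypothetical and the default URL is the external IP from the transcript:

```shell
# Print the HTTP status code returned by the dashboard,
# without verifying its TLS certificate (-k).
dashboard_http_status() {
  curl -k -s -o /dev/null -w '%{http_code}\n' "${1:-https://172.25.3.11}"
}
```

Any HTTP status code in the output shows that the TLS connection and the Service path work; -k should not be used where the server's identity matters.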
Open https://172.25.3.11 in Firefox.