K8S Notes - Deploying metrics-server on k8s

k8s provides the top command for viewing resource usage. It has two subcommands, node and pod, which display resource usage for nodes and Pod objects respectively.

kubectl top depends on the Metrics API. Kubernetes does not install it by default, so it has to be deployed separately:

[root@k8s-master k8s-install]# kubectl top pod
error: Metrics API not available
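
To see whether the Metrics API is registered with the API aggregation layer at any point, you can query its APIService object directly (a quick check; v1beta1.metrics.k8s.io is the name the manifest below creates):

kubectl get apiservice v1beta1.metrics.k8s.io
kubectl api-versions | grep metrics.k8s.io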

Installation

1. Download the manifest file

Download the metrics-server manifest, saving it as metrics-server-components.yaml:

[root@k8s-master k8s-install]# wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metrics-server-components.yaml
--2022-10-11 00:13:01--  https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Resolving github.com (github.com)... 20.205.243.166
Connecting to github.com (github.com)|20.205.243.166|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github.com/kubernetes-sigs/metrics-server/releases/download/metrics-server-helm-chart-3.8.2/components.yaml [following]
--2022-10-11 00:13:01--  https://github.com/kubernetes-sigs/metrics-server/releases/download/metrics-server-helm-chart-3.8.2/components.yaml
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/92132038/d85e100a-2404-4c5e-b6a9-f3814ad4e6e5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20221010%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221010T161303Z&X-Amz-Expires=300&X-Amz-Signature=efa1ff5dd16b6cd86b6186adb3b4c72afed8197bdf08e2bffcd71b9118137831&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=92132038&response-content-disposition=attachment%3B%20filename%3Dcomponents.yaml&response-content-type=application%2Foctet-stream [following]
--2022-10-11 00:13:02--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/92132038/d85e100a-2404-4c5e-b6a9-f3814ad4e6e5?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20221010%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20221010T161303Z&X-Amz-Expires=300&X-Amz-Signature=efa1ff5dd16b6cd86b6186adb3b4c72afed8197bdf08e2bffcd71b9118137831&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=92132038&response-content-disposition=attachment%3B%20filename%3Dcomponents.yaml&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4181 (4.1K) [application/octet-stream]
Saving to: ‘metrics-server-components.yaml’

100%[============================================================================================================================>] 4,181       --.-K/s   in 0.01s   

2022-10-11 00:13:10 (385 KB/s) - ‘metrics-server-components.yaml’ saved [4181/4181]
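
Note that the latest link resolves to whichever release was published most recently, so the file you download may differ from the one shown here. To follow this walkthrough exactly, you can pin the download to the v0.6.1 release used below:

wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.6.1/components.yaml -O metrics-server-components.yaml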

2. Change the image registry

Change the image address in the manifest to a registry mirror reachable from mainland China. It is at around line 140 of the file.

The original setting:

image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1

The modified setting:

image: registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1

The change can be made with the following command (note the # delimiter, since the strings being replaced contain slashes themselves):

sed -i 's#k8s.gcr.io/metrics-server#registry.cn-hangzhou.aliyuncs.com/google_containers#g' metrics-server-components.yaml
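
To confirm the substitution worked, print the image line; it should now point at the Aliyun mirror:

grep -n 'image:' metrics-server-components.yaml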

3. Deploy metrics-server

[root@k8s-master k8s-install]# kubectl create -f metrics-server-components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Check whether the metrics-server pod is running:

[root@k8s-master k8s-install]# kubectl get pods --all-namespaces | grep metrics
kube-system   metrics-server-6ffc8966f5-84hbb      0/1     Running   0              2m23s

Describing the pod reveals a probe problem: Readiness probe failed: HTTP probe failed with statuscode: 500

[root@k8s-master k8s-install]# kubectl describe pod metrics-server-6ffc8966f5-84hbb -n kube-system
Name:                 metrics-server-6ffc8966f5-84hbb
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 k8s-slave2/192.168.100.22
Start Time:           Tue, 11 Oct 2022 00:27:33 +0800
Labels:               k8s-app=metrics-server
                      pod-template-hash=6ffc8966f5
Annotations:          <none>
Status:               Running
IP:                   10.244.2.9
IPs:
  IP:           10.244.2.9
Controlled By:  ReplicaSet/metrics-server-6ffc8966f5
Containers:
  metrics-server:
    Container ID:  docker://e913a075e0381b98eabfb6e298f308ef69dfbd7c672bdcfb75bb2ff3e4b5a0a4
    Image:         registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1
    Image ID:      docker-pullable://registry.cn-hangzhou.aliyuncs.com/google_containers/[email protected]:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
    Port:          4443/TCP
    Host Port:     0/TCP
    Args:
      --cert-dir=/tmp
      --secure-port=4443
      --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
      --kubelet-use-node-status-port
      --metric-resolution=15s
    State:          Running
      Started:      Tue, 11 Oct 2022 00:27:45 +0800
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:        100m
      memory:     200Mi
    Liveness:     http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:    http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /tmp from tmp-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x2spb (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  tmp-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kube-api-access-x2spb:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                     From               Message
  ----     ------     ----                    ----               -------
  Normal   Scheduled  7m27s                   default-scheduler  Successfully assigned kube-system/metrics-server-6ffc8966f5-84hbb to k8s-slave2
  Normal   Pulling    7m26s                   kubelet            Pulling image "registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1"
  Normal   Pulled     7m15s                   kubelet            Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1" in 10.976606194s
  Normal   Created    7m15s                   kubelet            Created container metrics-server
  Normal   Started    7m15s                   kubelet            Started container metrics-server
  Warning  Unhealthy  2m17s (x31 over 6m47s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500

Next, check the pod's logs:

[root@k8s-master k8s-install]# kubectl logs metrics-server-6ffc8966f5-84hbb -n kube-system
I1010 16:27:46.228594       1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I1010 16:27:46.633494       1 secure_serving.go:266] Serving securely on [::]:4443
I1010 16:27:46.633585       1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I1010 16:27:46.633616       1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I1010 16:27:46.633653       1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I1010 16:27:46.634221       1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1010 16:27:46.634296       1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I1010 16:27:46.634365       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I1010 16:27:46.634370       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I1010 16:27:46.634409       1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I1010 16:27:46.634415       1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
E1010 16:27:46.641663       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.22:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.22 because it doesn't contain any IP SANs" node="k8s-slave2"
E1010 16:27:46.645389       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.20:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.20 because it doesn't contain any IP SANs" node="k8s-master"
E1010 16:27:46.652261       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.21:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.21 because it doesn't contain any IP SANs" node="k8s-slave1"
I1010 16:27:46.733747       1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController 
I1010 16:27:46.735167       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file 
I1010 16:27:46.735194       1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file 
E1010 16:28:01.643646       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.22:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.22 because it doesn't contain any IP SANs" node="k8s-slave2"
E1010 16:28:01.643805       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.21:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.21 because it doesn't contain any IP SANs" node="k8s-slave1"
E1010 16:28:01.646721       1 scraper.go:140] "Failed to scrape node" err="Get \"https://192.168.100.20:10250/metrics/resource\": x509: cannot validate certificate for 192.168.100.20 because it doesn't contain any IP SANs" node="k8s-master"
I1010 16:28:13.397373       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"

This basically confirms why the pod is unhealthy: the readiness probe's HTTP GET keeps failing because metrics-server cannot scrape the kubelets, and the root cause is the TLS error: cannot validate certificate for 192.168.100.22 because it doesn't contain any IP SANs.
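
The diagnosis can be verified by pulling a kubelet's serving certificate and checking its SANs (a sketch; substitute any node IP, here k8s-slave2's):

echo | openssl s_client -connect 192.168.100.22:10250 2>/dev/null | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'

For a self-signed kubelet certificate, the SAN list typically contains only the node's hostname, not its IP address, which is exactly what the scraper is complaining about.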

The metrics-server documentation (https://github.com/kubernetes-sigs/metrics-server) contains the following note:

Kubelet certificate needs to be signed by cluster Certificate Authority (or disable certificate validation by passing --kubelet-insecure-tls to Metrics Server)

In other words, the kubelet certificates need to be signed by the cluster Certificate Authority, or certificate validation must be disabled by passing --kubelet-insecure-tls to Metrics Server.
Since this is a test environment, we take the second option and disable certificate validation. This is not recommended for production!
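
For reference, the production-grade alternative is to have each kubelet request a serving certificate signed by the cluster CA. On a kubeadm cluster this roughly means enabling serverTLSBootstrap: true in /var/lib/kubelet/config.yaml on every node, restarting the kubelet, and approving the resulting CSRs (a sketch; the CSR name is a placeholder):

systemctl restart kubelet                 # on each node, after enabling serverTLSBootstrap
kubectl get csr                           # on the control plane: pending kubelet-serving CSRs appear here
kubectl certificate approve <csr-name>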

Append the flag to the container args, at around line 139 of the manifest:

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls
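
Alternatively, instead of editing the file and re-applying it, the same flag can be appended to the live Deployment with a JSON patch (a sketch operating on the Deployment created in step 3):

kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'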

Apply the manifest:

[root@k8s-master k8s-install]# kubectl apply -f metrics-server-components.yaml
Warning: resource serviceaccounts/metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
serviceaccount/metrics-server configured
Warning: resource clusterroles/system:aggregated-metrics-reader is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader configured
Warning: resource clusterroles/system:metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrole.rbac.authorization.k8s.io/system:metrics-server configured
Warning: resource rolebindings/metrics-server-auth-reader is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader configured
Warning: resource clusterrolebindings/metrics-server:system:auth-delegator is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator configured
Warning: resource clusterrolebindings/system:metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server configured
Warning: resource services/metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
service/metrics-server configured
Warning: resource deployments/metrics-server is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
deployment.apps/metrics-server configured
Warning: resource apiservices/v1beta1.metrics.k8s.io is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured

The metrics-server pod is now running normally:

[root@k8s-master k8s-install]# kubectl get pod -A | grep metrics
kube-system   metrics-server-fd9598766-8zphn       1/1     Running   0              89s
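
The aggregated API should now be reported as available as well (before the fix, its AVAILABLE column typically shows False with a FailedDiscoveryCheck or MissingEndpoints message):

kubectl get apiservice v1beta1.metrics.k8s.io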

Running kubectl top again now succeeds:

[root@k8s-master k8s-install]# kubectl top pod
NAME                            CPU(cores)   MEMORY(bytes)   
front-end-59bc6df748-699vb      0m           3Mi             
front-end-59bc6df748-r7pkr      0m           3Mi             
kucc4                           1m           2Mi             
legacy-app                      1m           1Mi             
my-demo-nginx-998bbf8f5-9t9pw   0m           0Mi             
my-demo-nginx-998bbf8f5-lfgvw   0m           0Mi             
my-demo-nginx-998bbf8f5-nfn7r   1m           0Mi             
nginx-kusc00401                 0m           3Mi
[root@k8s-master k8s-install]# 
[root@k8s-master k8s-install]# kubectl top node
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
k8s-master   232m         5%     1708Mi          46%       
k8s-slave1   29m          1%     594Mi           34%       
k8s-slave2   25m          1%     556Mi           32%
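
kubectl top is just a client of the Metrics API; the same data can be fetched directly from the aggregated endpoint, which is handy for debugging:

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/default/pods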
