Metrics collection commands do not work: kubectl top nodes is unavailable

Problem description

After the host rebooted, every metrics-related operation stopped working. The logs were as follows:

# metrics-server logs
I1113 06:30:38.255110 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)

I1113 06:30:41.082525 1 secure_serving.go:116] Serving securely on [::]:4443

E1113 06:30:55.501456 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:30:55.506335 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:31:09.079549 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:31:09.083549 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:31:25.568509 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:31:25.593969 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:31:38.873501 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:31:38.877544 1 reststorage.go:160] unable to fetch pod metrics for pod istio-system/jaeger-collector-76bf54b467-z8smv: no metrics known for pod

E1113 06:39:41.179891 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:node1: unable to get a valid timestamp for metric point for container "wait-mysql" in pod kubesphere-alerting-system/notification-deployment-67cd9b7985-xzll8 on node "61.155.5.52", discarding data: no non-zero timestamp on either CPU or memory

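Problem analysis

After going through the various logs and the related services, I found that the API calls were simply not getting through. That made me suspect the network, and that turned out to be right: in my case the cause was calico.
The actual problem: for reasons I still do not know, the host had gained an extra network interface. During initialization, calico picked up this virtual interface instead of the host's real one, which is why it could not communicate with pods on other hosts.
A nasty pitfall!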
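
To confirm this kind of mismatch, it helps to map the address calico reports back to an interface name. The snippet below is only a sketch: `iface_for_ip` is a hypothetical helper, the sample address is made up, and the `kube-system` namespace and `k8s-app=calico-node` label are assumptions based on the standard Calico manifest.

```shell
#!/usr/bin/env bash
# Sketch: print the interface that owns a given IPv4 address.
# Reads `ip -o -4 addr show` output on stdin. The function name is hypothetical.
iface_for_ip() {
  local want="$1"
  # In `ip -o` output: field 2 is the interface, field 3 is "inet", field 4 is ADDR/PREFIX.
  awk -v want="$want" '$3 == "inet" { split($4, a, "/"); if (a[1] == want) print $2 }'
}

# Possible usage on a node (needs cluster access, so shown as comments only):
#   kubectl get nodes -o wide    # the INTERNAL-IP column shows each node's address
#   kubectl -n kube-system logs -l k8s-app=calico-node -c calico-node --tail=-1 | grep -i autodetect
#   ip -o -4 addr show | iface_for_ip 10.0.0.5    # 10.0.0.5 is a placeholder address
```

If the interface printed for the address calico chose is a virtual one rather than the physical NIC, the node is likely in the situation described above.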

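Solution

Once the cluster network was working again, the bug was gone.
For the record, my network was down because calico had not selected my actual network interface.
The fix was to specify the interface calico should use: em1.

# Specify the interface in the configuration file; em1 is used here
# Add this under env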
- name: IP_AUTODETECTION_METHOD
  value: "interface=em1"
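
The same setting can probably also be applied from the command line instead of editing the manifest by hand. This is a minimal sketch, assuming the DaemonSet is named `calico-node` and runs in `kube-system` (true for the standard manifest, but check your install). `pin_calico_iface` and the `KUBECTL` override are hypothetical helpers, not part of Calico or kubectl.

```shell
#!/usr/bin/env bash
# Sketch: pin calico's IPv4 autodetection to one interface, then wait for the rollout.
# Assumption: DaemonSet calico-node in namespace kube-system.
KUBECTL="${KUBECTL:-kubectl}"   # overridable, e.g. KUBECTL=echo for a dry run

pin_calico_iface() {
  local iface="$1"
  # Sets the same env var as the manifest fragment above on every container in the DaemonSet.
  "$KUBECTL" set env -n kube-system daemonset/calico-node \
    "IP_AUTODETECTION_METHOD=interface=${iface}" || return 1
  # Block until the calico-node pods have been restarted with the new setting.
  "$KUBECTL" rollout status -n kube-system daemonset/calico-node
}

# Possible usage (needs cluster access):
#   pin_calico_iface em1 && kubectl top nodes
```

Running it with `KUBECTL=echo` only prints the commands, which is a cheap way to review them before touching the cluster.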

That's all!
