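Background: I have been following Ma Ge's (马哥) k8s tutorial. In the chapter "Container Resource Requests, Resource Limits, and HeapSter", the tutorial itself never got kubectl top or the Grafana graphs to show any data. Heapster is going to be retired in later releases, so there is no need to dwell on it; I was simply curious. Below are the problems I hit and how I solved them. My k8s version is v1.13.3.
Check the version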
[ryuser@cdh-master metrics]$ kubectl get nodes
NAME                     STATUS   ROLES    AGE   VERSION
cdh-master.rongyi.com    Ready    master   41d   v1.13.3
cdh-slave.rongyi.com     Ready    <none>   41d   v1.13.3
cdh-slave2.rongyi.com    Ready    <none>   39d   v1.13.3
1. After creating heapster, its log kept showing the errors below
[ryuser@cdh-master metrics]$ kubectl logs heapster-f64999bc-25tvv -n kube-system
I0326 06:23:03.317063 1 heapster.go:78] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
I0326 06:23:03.317170 1 heapster.go:79] Heapster version v1.5.4
I0326 06:23:03.317421 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0326 06:23:03.317437 1 configs.go:62] Using kubelet port 10255
I0326 06:23:03.341940 1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc:8086 user:root db:k8s
I0326 06:23:03.341976 1 heapster.go:202] Starting with InfluxDB Sink
I0326 06:23:03.341985 1 heapster.go:202] Starting with Metric Sink
I0326 06:23:03.364225 1 heapster.go:112] Starting heapster on port 8082
E0326 06:24:05.006245 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10255: failed to get all container stats from Kubelet URL "http://192.168.10.73:10255/stats/container/": Post http://192.168.10.73:10255/stats/container/: dial tcp 192.168.10.73:10255: getsockopt: connection refused
E0326 06:24:05.006326 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10255: failed to get all container stats from Kubelet URL "http://192.168.10.77:10255/stats/container/": Post http://192.168.10.77:10255/stats/container/: dial tcp 192.168.10.77:10255: getsockopt: connection refused
E0326 06:24:05.006827 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10255: failed to get all container stats from Kubelet URL "http://192.168.10.74:10255/stats/container/": Post http://192.168.10.74:10255/stats/container/: dial tcp 192.168.10.74:10255: getsockopt: connection refused
W0326 06:24:25.002576 1 manager.go:152] Failed to get all responses in time (got 0/3)
I0326 06:24:25.033246 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc:8086"
E0326 06:25:05.009902 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10255: failed to get all container stats from Kubelet URL "http://192.168.10.77:10255/stats/container/": Post http://192.168.10.77:10255/stats/container/: dial tcp 192.168.10.77:10255: getsockopt: connection refused
E0326 06:25:05.010317 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10255: failed to get all container stats from Kubelet URL "http://192.168.10.73:10255/stats/container/": Post http://192.168.10.73:10255/stats/container/: dial tcp 192.168.10.73:10255: getsockopt: connection refused
E0326 06:25:05.024937 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10255: failed to get all container stats from Kubelet URL "http://192.168.10.74:10255/stats/container/": Post http://192.168.10.74:10255/stats/container/: dial tcp 192.168.10.74:10255: getsockopt: connection refused
W0326 06:25:25.002198 1 manager.go:152] Failed to get all responses in time (got 0/3)
E0326 06:26:05.011184 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10255: failed to get all container stats from Kubelet URL "http://192.168.10.77:10255/stats/container/": Post http://192.168.10.77:10255/stats/container/: dial tcp 192.168.10.77:10255: getsockopt: connection refused
E0326 06:26:05.014660 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10255: failed to get all container stats from Kubelet URL "http://192.168.10.73:10255/stats/container/": Post http://192.168.10.73:10255/stats/container/: dial tcp 192.168.10.73:10255: getsockopt: connection refused
E0326 06:26:05.021066 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10255: failed to get all container stats from Kubelet URL "http://192.168.10.74:10255/stats/container/": Post http://192.168.10.74:10255/stats/container/: dial tcp 192.168.10.74:10255: getsockopt: connection refused
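The log above repeats the same failure for every node once a minute. To see at a glance which kubelet endpoints are failing, the lines can be condensed with a small filter. This is only a sketch: summarize_scrape_errors is a helper name made up for this note (it is not part of heapster or kubectl), and it assumes a grep that supports -o, as GNU and BSD grep do.

```shell
# Hypothetical helper (not part of heapster or kubectl).
# Reads heapster log text on stdin and prints "<count> <ip:port>" for each
# kubelet endpoint named in an "Error in scraping containers" line.
summarize_scrape_errors() {
  grep -o 'Error in scraping containers from kubelet:[0-9.]*:[0-9]*' |
    sed 's/.*kubelet://' |
    sort | uniq -c | awk '{print $1, $2}'
}

# Usage against the pod from the log above:
#   kubectl logs heapster-f64999bc-25tvv -n kube-system | summarize_scrape_errors
```

If every node shows up with port 10255, the problem is the port itself rather than one unhealthy node.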
2. The kubectl top command did not return the expected results either
[ryuser@cdh-master metrics]$ kubectl top pod
W0326 15:13:19.303263 20846 top_pod.go:259] Metrics not available for pod default/client, age: 980h4m21.303224766s
error: Metrics not available for pod default/client, age: 980h4m21.303224766s
[ryuser@cdh-master metrics]$ kubectl top node
error: metrics not available yet
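Solution:
The log shows heapster scraping the kubelet over plain HTTP on port 10255 ("Using kubelet port 10255"), and every node refuses that connection. Point heapster at the kubelet HTTPS port 10250 instead.
# Make the following changes in the heapster.yaml manifest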
- --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
- --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
Then delete heapster and recreate it
kubectl delete -f heapster.yaml
kubectl apply -f heapster.yaml
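What changes heapster's behaviour here is the set of query parameters appended to the --source URL. The sketch below only splits that value so each option is visible on its own line. source_options is a hypothetical helper name, and the comments on what each option does reflect my reading of heapster's source configuration, so treat them as assumptions and check the heapster docs for your version.

```shell
# Hypothetical helper: print each query option of a heapster --source value
# on its own line. Prints nothing when the value has no '?' part.
source_options() {
  case "$1" in
    *[?]*) printf '%s\n' "${1#*[?]}" | tr '&' '\n' ;;
  esac
}

# Assumed meaning of the options used in this post:
#   kubeletHttps=true   connect to the kubelet over HTTPS
#   kubeletPort=10250   use port 10250 instead of the default 10255
#   insecure=true       skip verification of the server certificate
#
# Usage (quote the value, otherwise the shell treats '&' as a job separator):
#   source_options 'kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true'
```

Inside the YAML manifest the value needs no quoting; the quoting only matters when the string is passed through a shell.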
3. Next, a 403 error appeared
The key part of each error line is:
"403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
[ryuser@cdh-master metrics]$ kubectl logs -f heapster-5fcf457b-zq99c -n kube-system
I0326 07:36:23.229287 1 heapster.go:78] /heapster --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
I0326 07:36:23.229348 1 heapster.go:79] Heapster version v1.5.4
I0326 07:36:23.229602 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0326 07:36:23.229618 1 configs.go:62] Using kubelet port 10250
I0326 07:36:23.334904 1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc.cluster.local:8086 user:root db:k8s
I0326 07:36:23.334946 1 heapster.go:202] Starting with InfluxDB Sink
I0326 07:36:23.334955 1 heapster.go:202] Starting with Metric Sink
I0326 07:36:23.347573 1 heapster.go:112] Starting heapster on port 8082
E0326 07:37:05.028341 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:37:05.096629 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:37:05.157683 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:37:25.003226 1 manager.go:152] Failed to get all responses in time (got 0/3)
I0326 07:37:25.037245 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc.cluster.local:8086"
E0326 07:38:05.013221 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:38:05.019540 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:38:05.022849 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:38:25.003081 1 manager.go:152] Failed to get all responses in time (got 0/3)
E0326 07:39:05.010246 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:39:05.019238 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:39:05.024794 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:39:25.004236 1 manager.go:152] Failed to get all responses in time (got 0/3)
E0326 07:40:05.016757 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.77:10250: failed to get all container stats from Kubelet URL "https://192.168.10.77:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:40:05.020030 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.74:10250: failed to get all container stats from Kubelet URL "https://192.168.10.74:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
E0326 07:40:05.020763 1 manager.go:101] Error in scraping containers from kubelet:192.168.10.73:10250: failed to get all container stats from Kubelet URL "https://192.168.10.73:10250/stats/container/": request failed - "403 Forbidden", response: "Forbidden (user=system:serviceaccount:kube-system:heapster, verb=create, resource=nodes, subresource=stats)"
W0326 07:40:25.002318 1 manager.go:152] Failed to get all responses in time (got 0/3)
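Each 403 line already names the permission that is missing. As far as I understand the kubelet's authorization rules, an HTTP POST is checked as the verb create and the /stats/ paths are checked as the subresource nodes/stats, which would explain why the denial mentions create rather than get. The sketch below pulls those fields out of the log; denied_requests is a hypothetical helper name, not a kubectl or heapster feature.

```shell
# Hypothetical helper (not part of heapster or kubectl).
# Reads heapster log text on stdin and prints each distinct denied request as
# "<user> <verb> <resource>/<subresource>".
denied_requests() {
  sed -n 's|.*Forbidden (user=\([^,]*\), verb=\([^,]*\), resource=\([^,]*\), subresource=\([^)]*\)).*|\1 \2 \3/\4|p' |
    sort -u
}

# Usage against the pod from the log above:
#   kubectl logs heapster-5fcf457b-zq99c -n kube-system | denied_requests
```

The printed verb and resource are what the RBAC rule for the service account has to allow.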
Solution:
Inspect the permissions of the ClusterRole system:heapster. It indeed has no create permission on the resource nodes/stats.
[ryuser@cdh-master metrics]$ kubectl describe clusterrole system:heapster
Name:         system:heapster
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"rbac.authorization.kubernetes.io/autoupdate"...
              rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources               Non-Resource URLs  Resource Names  Verbs
  ---------               -----------------  --------------  -----
  events                  []                 []              [get list watch]
  namespaces              []                 []              [get list watch]
  nodes/stats             []                 []              [get list watch]
  nodes                   []                 []              [get list watch]
  pods                    []                 []              [get list watch]
  deployments.extensions  []                 []              [get list watch]
Modify the permissions of the ClusterRole system:heapster
Export the manifest
kubectl get clusterrole system:heapster -o yaml > heapster_modify.yaml
Edit the file: add create under verbs and add nodes/stats under resources
vim heapster_modify.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{"rbac.authorization.kubernetes.io/autoupdate":"true"},"creationTimestamp":"2019-02-12T10:41:33Z","labels":{"kubernetes.io/bootstrapping":"rbac-defaults"},"name":"system:heapster","resourceVersion":"70","selfLink":"/apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Aheapster","uid":"c3bd303a-2eb2-11e9-9c98-005056be639a"},"rules":[{"apiGroups":[""],"resources":["events","namespaces","nodes","pods"],"verbs":["create","get","list","watch"]},{"apiGroups":["extensions"],"resources":["deployments"],"verbs":["get","list","watch"]}]}
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: "2019-02-12T10:41:33Z"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:heapster
  resourceVersion: "4109335"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Aheapster
  uid: c3bd303a-2eb2-11e9-9c98-005056be639a
rules:
- apiGroups:
  - ""
  resources:
  - events
  - namespaces
  - nodes
  - pods
  - nodes/stats # added
  verbs:
  - create # added
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - deployments
  verbs:
  - get
  - list
  - watch
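Note that adding create to the existing rule, as in the manifest above, grants create on events, namespaces, nodes and pods as well, not only on nodes/stats. A narrower variant, which is not what the original notes did, would leave system:heapster untouched and bind a separate ClusterRole that carries only the missing permission. The sketch below assumes kubectl create clusterrole accepts the resource/subresource form for nodes/stats; the name heapster-kubelet-stats and the function name are invented for this example.

```shell
# Hypothetical alternative to editing system:heapster.
# "heapster-kubelet-stats" is an invented name; the service account
# kube-system:heapster is taken from the 403 message in the log.
grant_kubelet_stats() {
  kubectl create clusterrole heapster-kubelet-stats \
    --verb=create --resource=nodes/stats &&
  kubectl create clusterrolebinding heapster-kubelet-stats \
    --clusterrole=heapster-kubelet-stats \
    --serviceaccount=kube-system:heapster
}

# Usage (requires permission to manage RBAC objects):
#   grant_kubelet_stats
```

Either approach should end with the same effective permission for the heapster service account; the separate role just keeps the extra verb scoped to one subresource.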
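Apply the modified ClusterRole (the exact command is not recorded in these notes; kubectl apply -f heapster_modify.yaml is the likely one), then delete heapster and redeploy it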
kubectl delete -f heapster.yaml
kubectl apply -f heapster.yaml
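Before waiting a minute for the next scrape, the permission can be checked directly. A minimal sketch, assuming the current kubectl user is allowed to impersonate service accounts (a cluster admin normally is); heapster_can_scrape is a hypothetical wrapper name around kubectl auth can-i.

```shell
# Hypothetical wrapper: ask the API server whether the heapster service
# account may "create" the nodes/stats subresource. kubectl prints "yes"
# or "no".
heapster_can_scrape() {
  kubectl auth can-i create nodes --subresource=stats \
    --as=system:serviceaccount:kube-system:heapster
}

# Usage:
#   heapster_can_scrape
```

A "no" here would suggest the ClusterRole change was not applied or is bound to a different subject.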
Finally, no more errors.
[ryuser@cdh-master metrics]$ kubectl logs -f heapster-5fcf457b-vhrxf -n kube-system
I0326 07:47:00.574138 1 heapster.go:78] /heapster --source=kubernetes:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true --sink=influxdb:http://monitoring-influxdb.kube-system.svc.cluster.local:8086
I0326 07:47:00.574204 1 heapster.go:79] Heapster version v1.5.4
I0326 07:47:00.574470 1 configs.go:61] Using Kubernetes client with master "https://kubernetes.default" and version v1
I0326 07:47:00.574487 1 configs.go:62] Using kubelet port 10250
I0326 07:47:00.639292 1 influxdb.go:312] created influxdb sink with options: host:monitoring-influxdb.kube-system.svc.cluster.local:8086 user:root db:k8s
I0326 07:47:00.639338 1 heapster.go:202] Starting with InfluxDB Sink
I0326 07:47:00.639354 1 heapster.go:202] Starting with Metric Sink
I0326 07:47:00.670576 1 heapster.go:112] Starting heapster on port 8082
I0326 07:48:05.366442 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc.cluster.local:8086"
kubectl top
[ryuser@cdh-master metrics]$ kubectl top nodes
NAME                    CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
cdh-master.rongyi.com   158m         3%     2550Mi          69%
cdh-slave.rongyi.com    79m          1%     2386Mi          64%
cdh-slave2.rongyi.com   820m         41%    3136Mi          84%
[ryuser@cdh-master metrics]$ kubectl top pods
NAME                         CPU(cores)   MEMORY(bytes)
curl-66959f6557-bvn9r        0m           0Mi
dep-httpd-5b774f45df-vtv59   0m           21Mi
dep-httpd-5b774f45df-wd5kf   0m           15Mi
myapp-0                      0m           1Mi
myapp-1                      0m           3Mi
myapp-2                      0m           1Mi
myapp-3                      0m           1Mi
myapp-4                      0m           1Mi
pod-demo                     499m         138Mi
One more issue: the Grafana dashboards had not been showing any data. After the fixes above, they now display data.
Appendix: dashboard download links:
"Kubernetes Node Statistics" dashboard: https://grafana.com/dashboards/3646
"Kubernetes Pod Statistics" dashboard: https://grafana.com/dashboards/3649