heapster已经deprecated了:https://github.com/kubernetes/heapster
Heapster是容器集群监控和性能分析工具,天然的支持Kubernetes和CoreOS。
Kubernetes有个出名的监控agent—cAdvisor。在每个kubernetes Node上都会运行cAdvisor,它会收集本机以及容器的监控数据(cpu,memory,filesystem,network,uptime)。在较新的版本中,K8S已经将cAdvisor功能集成到kubelet组件中。每个Node节点可以直接进行web访问。
Heapster是一个收集者,Heapster可以收集Node节点上的cAdvisor数据,将每个Node上的cAdvisor的数据进行汇总,还可以按照kubernetes的资源类型来集合资源,比如Pod、Namespace,可以分别获取它们的CPU、内存、网络和磁盘的metric。默认的metric数据聚合时间间隔是1分钟。还可以把数据导入到第三方工具(如InfluxDB)。
Kubernetes原生dashboard的监控图表信息来自heapster。在Horizontal Pod Autoscaling中也用到了Heapster,HPA将Heapster作为Resource Metrics API,向其获取metric。
Heapster首先从apiserver获取集群中所有Node的信息,然后通过这些Node上的kubelet获取有用数据,而kubelet本身的数据则是从cAdvisor得到。所有获取到的数据都被推到Heapster配置的后端存储中,并还支持数据的可视化。现在后端存储 + 可视化的方法,如InfluxDB + grafana。
说明:本实验采用influxdb作为heapster后端
influxdb和grafana的介绍请见这篇文章
到 heapster release 页面下载最新版本的 heapster:
wget https://github.com/kubernetes/heapster/archive/v1.5.2.zip
unzip v1.5.2.zip
cd heapster-1.5.2/deploy/kube-config/influxdb
我修改好的文件已上传到github。
修改heapster.yml配置文件:
apiVersion: v1
kind: ServiceAccount
metadata:
name: heapster
namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: heapster
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
task: monitoring
k8s-app: heapster
spec:
serviceAccountName: heapster
containers:
- name: heapster
image: daocloud.io/liukuan73/heapster-amd64:v1.5.2
imagePullPolicy: IfNotPresent
command:
- /heapster
- --source=kubernetes:https://kubernetes.default
- --sink=influxdb:http://monitoring-influxdb.kube-system.svc:8086
nodeSelector:
node-role.kubernetes.io/master: true
tolerations:
- key: "node-role.kubernetes.io/master"
effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
labels:
task: monitoring
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: Heapster
name: heapster
namespace: kube-system
spec:
type: NodePort
ports:
- port: 80
targetPort: 8082
nodePort: 30003
selector:
k8s-app: heapster
部署好后可以打开kubernetes-dashboard看效果,看是否显示各 Nodes、Pods 的 CPU、内存、负载等利用率曲线图;
修改influxdb.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: monitoring-influxdb
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
task: monitoring
k8s-app: influxdb
spec:
containers:
- name: influxdb
image: daocloud.io/liukuan73/heapster-influxdb-amd64:v1.3.3
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /data
name: influxdb-storage
volumes:
- name: influxdb-storage
emptyDir: {}
nodeSelector:
node-role.kubernetes.io/master: true
tolerations:
- key: "node-role.kubernetes.io/master"
effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
labels:
task: monitoring
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
#kubernetes.io/cluster-service: 'true'
kubernetes.io/name: monitoring-influxdb
name: monitoring-influxdb
namespace: kube-system
spec:
type: NodePort
ports:
- port: 8086
targetPort: 8086
nodePort: 30004
selector:
k8s-app: influxdb
修改grafana.yaml配置文件:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: monitoring-grafana
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
task: monitoring
k8s-app: grafana
spec:
containers:
- name: grafana
image: daocloud.io/liukuan73/heapster-grafana:v4.4.3
imagePullPolicy: IfNotPresent
ports:
- containerPort: 3000
protocol: TCP
volumeMounts:
- mountPath: /etc/ssl/certs
name: ca-certificates
readOnly: true
- mountPath: /var
name: grafana-storage
env:
- name: INFLUXDB_HOST
value: monitoring-influxdb
- name: GF_SERVER_HTTP_PORT
value: "3000"
# The following env variables are required to make Grafana accessible via
# the kubernetes api-server proxy. On production clusters, we recommend
# removing these env variables, setup auth for grafana, and expose the grafana
# service using a LoadBalancer or a public IP.
- name: GF_AUTH_BASIC_ENABLED
value: "false"
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
- name: GF_SERVER_ROOT_URL
# If you're only using the API Server proxy, set this value instead:
# value: /api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
value: /
volumes:
- name: ca-certificates
hostPath:
path: /etc/ssl/certs
- name: grafana-storage
emptyDir: {}
nodeSelector:
node-role.kubernetes.io/master: true
tolerations:
- key: "node-role.kubernetes.io/master"
effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
labels:
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: monitoring-grafana
name: monitoring-grafana
namespace: kube-system
spec:
type: NodePort
# In a production setup, we recommend accessing Grafana through an external Loadbalancer
# or through a public IP.
# type: LoadBalancer
# You could also use NodePort to expose the service at a randomly-generated port
# type: NodePort
ports:
- port: 80
targetPort: 3000
nodePort: 30002
selector:
k8s-app: grafana
注意
在安装好 Grafana 之后我们使用的是默认的 template 配置,页面上的 namespace 选择里只有 default 和 kube-system,并不是说其他的 namespace 里的指标没有得到监控,只是我们没有在 Grafana 中开启他它们的显示而已。见 Cannot see other namespaces except, kube-system and default #1279。
修改grafana模板
将 Templating 中的 namespace 的 Data source 设置为 influxdb-datasource,Refresh 设置为 on Dashboard Load 保存设置,刷新浏览器,即可看到其他 namespace 选项。
配置influxdb数据源
点击“Add data source”配置数据源:
配置dashboard
1.导入“Kubernetes Node Statistics”dashabord
import这个模板的序号3646就可以了:
效果:
2.导入“Kubernetes Pod Statistics”dashabord
import这个模板的序号3649就可以了:
效果:
这里主要介绍Heapster的API使用,及可获取的Metrics。
The Heapster Model is a structured representation of metrics for Kubernetes clusters, which is exposed through a set of REST API endpoints. It allows the extraction of up to 15 minutes of historical data for any Container, Pod, Node or Namespace in the cluster, as well as the cluster itself (depending on the metric).
API文档及可用Metrics在官方文档中都介绍的比较齐全。下面用几条测试命令来解释API使用:
curl -L http://10.142.21.21:30003/api/v1/model/metrics
[
"cpu/limit",
"memory/request",
"memory/limit",
"cpu/usage_rate",
"memory/usage",
"cpu/request"
]
curl -L http://10.142.21.21:30003/api/v1/model/nodes/k8smaster01/metrics
curl -L http://10.142.21.21:30003/api/v1/model/namespaces//pods//metrics/cpu-usag
API文档: https://github.com/kubernetes/heapster/blob/v1.5.0-beta.2/docs/model.md
注意,1.5.0版本开始,Heapster Model API已被deprecated了,使用这个:https://github.com/kubernetes/metrics
heapster导出的所有Metrics: https://github.com/kubernetes/heapster/blob/master/docs/storage-schema.md
1.https://jimmysong.io/kubernetes-handbook/practice/heapster-addon-installation.html
2.https://www.kubernetes.org.cn/932.html