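kube-state-metrics listens to the API server and generates metrics about the state of resource objects. It only exposes the metrics and does not store them, so we use Prometheus to scrape and store the data. The focus is on workload-related metadata such as the state of Deployments and Pod replicas: how many replicas have been scheduled, how many Pods are running/stopped/terminated, how many times a Pod has restarted, and how many Jobs are currently running.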
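To make the Pod-state question concrete, here is a minimal shell sketch that counts Pods per phase from the text that kube-state-metrics exposes. It assumes the `kube_pod_status_phase` metric with a `phase` label, as exposed by kube-state-metrics v2; the function name `count_pods_by_phase` is hypothetical, and the metrics text is read from stdin.

```shell
# Count Pods per phase from kube-state-metrics text on stdin.
# Assumes lines such as:
#   kube_pod_status_phase{namespace="default",pod="a",phase="Running"} 1
count_pods_by_phase() {
  awk '
    # Only samples of this metric whose value is 1 describe the current phase.
    index($0, "kube_pod_status_phase{") == 1 && $NF == 1 {
      if (match($0, /phase="[^"]*"/)) {
        # Strip the leading phase=" (7 chars) and the trailing quote.
        phase = substr($0, RSTART + 7, RLENGTH - 8)
        count[phase]++
      }
    }
    END { for (p in count) print p, count[p] }
  ' | sort
}
```

A possible invocation is `curl -s http://<node-ip>:<port>/metrics | count_pods_by_phase`, where the address is whatever endpoint exposes kube-state-metrics in your cluster.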
kube-state-metrics | Kubernetes 1.20 | Kubernetes 1.21 | Kubernetes 1.22 | Kubernetes 1.23 | Kubernetes 1.24 |
---|---|---|---|---|---|
v2.3.0 | ✓ | ✓ | ✓ | ✓ | - |
v2.4.2 | -/✓ | -/✓ | ✓ | ✓ | - |
v2.5.0 | -/✓ | -/✓ | ✓ | ✓ | ✓ |
v2.6.0 | -/✓ | -/✓ | ✓ | ✓ | ✓ |
master | -/✓ | -/✓ | ✓ | ✓ | ✓ |
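Pull the image from Docker Hub. This walkthrough uses kube-state-metrics:2.4.2.
https://hub.docker.com/r/bitnami/kube-state-metrics
To make the image easier to reuse later, tag it after pulling and push it to the Harbor registry.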
docker pull bitnami/kube-state-metrics:2.4.2
docker tag bitnami/kube-state-metrics:2.4.2 harbor.intra.com/prometheus/kube-state-metrics:2.4.2
docker push harbor.intra.com/prometheus/kube-state-metrics:2.4.2
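If more images need to be mirrored the same way, the three commands above could be wrapped in a small helper. This is only a sketch: the function name `mirror_image` and the `DRY_RUN` variable are hypothetical, and by default the helper only prints the commands it would run.

```shell
# mirror_image SRC REGISTRY PROJECT
# Prints the docker commands by default; set DRY_RUN=0 to execute them.
mirror_image() {
  src=$1
  registry=$2
  project=$3
  # Keep only the last path component, e.g. "kube-state-metrics:2.4.2".
  name=${src##*/}
  dst="$registry/$project/$name"
  for cmd in "docker pull $src" "docker tag $src $dst" "docker push $dst"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$cmd"
    else
      # Word splitting is intentional: image references contain no spaces.
      $cmd || return 1
    fi
  done
}
```

For example, `mirror_image bitnami/kube-state-metrics:2.4.2 harbor.intra.com prometheus` prints the same pull, tag and push commands used in this walkthrough.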
# kube-state-metrics.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: harbor.intra.com/prometheus/kube-state-metrics:2.4.2
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
  verbs: ["list", "watch"]
# Kubernetes 1.16+ serves daemonsets, deployments and replicasets from the
# apps group; the extensions group no longer serves them.
- apiGroups: ["apps"]
  resources: ["daemonsets", "deployments", "replicasets", "statefulsets"]
  verbs: ["list", "watch"]
- apiGroups: ["batch"]
  resources: ["cronjobs", "jobs"]
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  type: NodePort
  ports:
  - name: kube-state-metrics
    port: 8080
    targetPort: 8080
    nodePort: 31666
    protocol: TCP
  selector:
    app: kube-state-metrics
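Once the manifest is applied, the RBAC grants can be checked with `kubectl auth can-i` by impersonating the ServiceAccount. The sketch below is one way to do that; the function name `check_ksm_rbac` and the selection of resources are my own choices, and Deployments, DaemonSets, ReplicaSets and StatefulSets are checked in the `apps` group, which is where current Kubernetes versions serve them.

```shell
# Check that the kube-state-metrics ServiceAccount may list and watch
# a sample of the resources it needs. Prints one line per check and
# returns non-zero if any check is not answered with "yes".
check_ksm_rbac() {
  sa="system:serviceaccount:kube-system:kube-state-metrics"
  rc=0
  for resource in nodes pods services deployments.apps daemonsets.apps \
                  replicasets.apps statefulsets.apps jobs.batch cronjobs.batch \
                  horizontalpodautoscalers.autoscaling; do
    for verb in list watch; do
      # kubectl prints "yes" or "no"; it exits non-zero on "no".
      answer=$(kubectl auth can-i "$verb" "$resource" --as="$sa" 2>/dev/null) || true
      echo "$verb $resource: ${answer:-error}"
      [ "$answer" = "yes" ] || rc=1
    done
  done
  return $rc
}
```

Running `check_ksm_rbac` on a host with cluster-admin access shows at a glance which resources the collector would be denied.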
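After the manifest is applied, a kube-state-metrics Deployment is created in the kube-system namespace with a single Pod, kube-state-metrics-8444bbc459-q2wqg. The Service is named kube-state-metrics and exposes the metrics externally as a NodePort on port 31666.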
kubectl apply -f kube-state-metrics.yaml
root@k8s-master-01:/opt/k8s-data/yaml/prometheus-files/case1/prometheus-files/case# kubectl get pods -n kube-system |grep kube-state-metrics
kube-state-metrics-8444bbc459-q2wqg 1/1 Running 0 12m
root@k8s-master-01:/opt/k8s-data/yaml/prometheus-files/case1/prometheus-files/case# kubectl get svc -n kube-system |grep kube-state-metrics
kube-state-metrics NodePort 10.200.98.90 <none> 8080:31666/TCP 12m
Once an HA entry has been created in front of the NodePort, the metrics can be accessed directly through the HA address.
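A quick way to confirm that the endpoint answers with real data is to list the metric families it advertises. The helper below is a sketch: `list_metric_families` is a hypothetical name, and it relies only on the `# TYPE` comment lines of the Prometheus text format.

```shell
# List the metric families advertised in Prometheus text format on stdin.
# Each family is announced by a line such as:
#   # TYPE kube_deployment_status_replicas gauge
list_metric_families() {
  awk '$1 == "#" && $2 == "TYPE" { print $3 }' | sort -u
}
```

For example, `curl -s http://192.168.31.188:31666/metrics | list_metric_families | head`, assuming 192.168.31.188:31666 (the scrape target used in this walkthrough) is the address that fronts the NodePort.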
Static configuration
  - job_name: "kube-state-metrics"
    metrics_path: /metrics
    static_configs:
      - targets: ["192.168.31.188:31666"]
Restart Prometheus
root@prometheus-2:/apps/prometheus# systemctl restart prometheus.service
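A typo in the scrape configuration will keep Prometheus from starting, so it can be worth validating the file before restarting. The sketch below uses `promtool check config`, which ships with Prometheus; the function name is hypothetical, the config path is an assumption based on the `/apps/prometheus` directory in the prompt above, and `promtool` is assumed to be on the PATH.

```shell
# Restart Prometheus only if promtool accepts the configuration.
# CONFIG is an assumed path; adjust it to where prometheus.yml lives.
CONFIG="${CONFIG:-/apps/prometheus/prometheus.yml}"

restart_prometheus_if_valid() {
  if promtool check config "$CONFIG"; then
    systemctl restart prometheus.service
  else
    echo "configuration check failed, not restarting" >&2
    return 1
  fi
}
```

Calling `restart_prometheus_if_valid` then replaces the bare `systemctl restart` and leaves the running instance untouched when the check fails.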
The data is now being collected by Prometheus as expected.
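Besides looking at the Targets page, the scrape can be checked from the command line through the Prometheus HTTP API by querying `up{job="kube-state-metrics"}`. The helper below is a rough sketch that avoids a JSON parser: `query_reports_up` is a hypothetical name, and it simply looks for a sample whose value is `"1"` in the compact JSON that the instant-query endpoint returns.

```shell
# Succeeds if an instant-query JSON response on stdin contains at least
# one sample with value "1", e.g. ..."value":[1700000000.1,"1"]...
query_reports_up() {
  grep -q '"value":\[[^]]*,"1"]'
}
```

For example, assuming Prometheus listens on its default port 9090 on the local host: `curl -sG http://localhost:9090/api/v1/query --data-urlencode 'query=up{job="kube-state-metrics"}' | query_reports_up && echo "target is up"`.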