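The key to building an Operator is the design of its CRDs (CustomResourceDefinitions).
An Operator is built on two Kubernetes concepts:
Resource: the declared state of an object
Controller: observes, analyzes, and acts to reconcile the resource toward its declared state
The core of Prometheus-Operator is the Operator controller, which creates five CRDs: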
alertmanagers.monitoring.coreos.com
podmonitors.monitoring.coreos.com
prometheuses.monitoring.coreos.com
prometheusrules.monitoring.coreos.com
servicemonitors.monitoring.coreos.com
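prometheuses represents the Prometheus server.
servicemonitors can be thought of as an abstraction over the various exporters that expose a metrics endpoint.
Each servicemonitor is backed by one or more Services.
Installing the operator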
$ git clone https://github.com/coreos/kube-prometheus.git
$ cd kube-prometheus/manifests/
$ vim prometheus-serviceMonitorKubelet.yaml
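Change both occurrences of https-metrics to http-metrics and remove the TLS certificate settings.
By default this ServiceMonitor scrapes node data from the kubelet on port 10250, the authenticated HTTPS port. This setup instead scrapes the read-only port 10255, which serves metrics over plain HTTP without authentication. Note that the read-only port is disabled on many clusters and must be enabled in the kubelet configuration before this works.
If the file is left unchanged, Prometheus scrapes the kubelet on port 10250.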
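The port rename can also be scripted instead of done by hand in vim. The following is only a sketch: the helper name switch_to_http_metrics and the sample file are hypothetical, and the demo runs on a throwaway copy rather than on the real manifest. It only renames the port; any scheme and tlsConfig entries still have to be removed manually.

```shell
# Hypothetical helper: rename the port in a ServiceMonitor manifest.
switch_to_http_metrics() {
  manifest="$1"
  # Write through a temp file so this works with both GNU and BSD sed.
  sed 's/https-metrics/http-metrics/g' "$manifest" > "$manifest.tmp" \
    && mv "$manifest.tmp" "$manifest"
}

# Demo on a throwaway sample (assumed layout, not the real manifest).
demo_dir="$(mktemp -d)"
cat > "$demo_dir/sample.yaml" <<'EOF'
endpoints:
- port: https-metrics
  interval: 30s
- port: https-metrics
  path: /metrics/cadvisor
EOF

switch_to_http_metrics "$demo_dir/sample.yaml"
# Count the endpoints that now use the renamed port.
grep -c 'port: http-metrics' "$demo_dir/sample.yaml"
```

To use it on the real file, call switch_to_http_metrics prometheus-serviceMonitorKubelet.yaml from the manifests directory and review the result before applying it.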
$ kubectl apply -f .
$ kubectl get crd
$ kubectl get pods -n monitoring
alertmanager-main-0 2/2 Running 0 4d3h
alertmanager-main-1 2/2 Running 0 4d3h
alertmanager-main-2 2/2 Running 0 4d3h
grafana-57bfdd47f8-88nkx 1/1 Running 0 4d3h
kube-state-metrics-65d5b4b99d-qdmw7 4/4 Running 0 4h17m
node-exporter-bmwp2 2/2 Running 0 4d3h
node-exporter-cnnw8 2/2 Running 0 4d3h
node-exporter-s6jv4 2/2 Running 0 4d3h
prometheus-adapter-668748ddbd-l2zzv 1/1 Running 0 4d3h
prometheus-k8s-0 3/3 Running 1 4h56m
prometheus-k8s-1 3/3 Running 1 4h56m
prometheus-operator-55b978b89-68wkg 1/1 Running 0 4d3h
prometheus-k8s is the Prometheus server Pod, managed by a StatefulSet.
prometheus-operator is the Operator controller Pod.
$ kubectl get svc -n monitoring
Change the grafana and prometheus-k8s Services to type NodePort, or create an Ingress for them, so that they can be reached from outside the cluster.
$ kubectl edit svc grafana -n monitoring
$ kubectl edit svc prometheus-k8s -n monitoring
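As a non-interactive alternative to kubectl edit, the Service type can be changed with kubectl patch. The sketch below only prints the commands, so it is safe to run without a cluster. It assumes the Service names and namespace shown above, and the node ports are left for Kubernetes to assign.

```shell
# Strategic-merge patch that switches a Service to type NodePort.
patch='{"spec":{"type":"NodePort"}}'

for svc in grafana prometheus-k8s; do
  # Dry run: print a copy-pasteable command instead of executing it.
  printf "kubectl patch svc %s -n monitoring -p '%s'\n" "$svc" "$patch"
done
```

Paste the printed lines into a shell that has access to the cluster to apply the change.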
$ kubectl get svc -n monitoring
alertmanager-main ClusterIP 10.99.104.91 9093/TCP
alertmanager-operated ClusterIP None 9093/TCP,9094/TCP,9094/UDP
grafana NodePort 10.98.238.122 3000:31633/TCP
kube-state-metrics ClusterIP None 8443/TCP,9443/TCP
node-exporter ClusterIP None 9100/TCP
prometheus-adapter ClusterIP 10.102.67.217 443/TCP
prometheus-k8s NodePort 10.111.85.217 9090:31405/TCP
prometheus-operated ClusterIP None 9090/TCP
prometheus-operator ClusterIP None 8080/TCP
http://192.168.1.243:31405/targets
Targets that are scraped successfully include:
alertmanager(9093)
kube-apiserver(6443)
kube-state-metrics(8443 and 9443)
kubelet(10255 or 10250)
node-exporter(9100)
prometheus-operator(8080)
prometheus(9090)
coredns(9153)
Targets that are not scraped successfully include:
kube-controller-manager(10252)
kube-scheduler(10251)
Create a Service to fix the kube-scheduler target
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
Here the label k8s-app: kube-scheduler matches the selector defined in the ServiceMonitor,
and the selector component: kube-scheduler matches the labels on the Pod.
$ cat prometheus-serviceMonitorKubeScheduler.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s        # scrape every 30s
    port: http-metrics   # name of the Service port
  jobLabel: k8s-app
  namespaceSelector:     # namespaces in which to look for Services; use any: true to match all namespaces
    matchNames:
    - kube-system
  selector:              # labels of the Services to match; every matchLabels entry and every matchExpressions requirement must hold (they are ANDed)
    matchLabels:
      k8s-app: kube-scheduler
$ cat /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
......
$ vim /etc/kubernetes/manifests/kube-scheduler.yaml
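Change the --address flag to 0.0.0.0
containers:
- command:
  - kube-scheduler
  - --leader-elect=true
  - --kubeconfig=/etc/kubernetes/scheduler.conf
  - --address=0.0.0.0
Rename the file, wait a moment, and then rename it back so that the kubelet picks up the change. Never update this file with kubectl apply.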
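The rename-and-restore step can be wrapped in a small function. This is a sketch under stated assumptions: the name bounce_static_pod is hypothetical, the manifest is moved to a temporary directory rather than renamed in place, and the 20-second default wait is an assumed value that should be at least as long as the kubelet's manifest re-scan interval. The demo runs on a dummy file with no wait, so it does not touch a real control plane.

```shell
# Hypothetical helper: restart a static Pod by moving its manifest out and back.
bounce_static_pod() {
  manifest="$1"
  wait_seconds="${2:-20}"   # assumed default; tune for your kubelet
  backup="$(mktemp -d)/$(basename "$manifest")"
  mv "$manifest" "$backup"  # kubelet sees the manifest vanish and stops the Pod
  sleep "$wait_seconds"
  mv "$backup" "$manifest"  # kubelet sees it return and recreates the Pod
}

# Demo on a dummy file in a throwaway directory.
demo_dir="$(mktemp -d)"
printf 'kind: Pod\n' > "$demo_dir/kube-scheduler.yaml"
bounce_static_pod "$demo_dir/kube-scheduler.yaml" 0
ls "$demo_dir"
```

On a control-plane node the call would be bounce_static_pod /etc/kubernetes/manifests/kube-scheduler.yaml, run as root.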
Create a Service to fix the kube-controller-manager target
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
$ cat prometheus-serviceMonitorKubeControllerManager.yaml
$ cat /etc/kubernetes/manifests/kube-controller-manager.yaml
$ vim /etc/kubernetes/manifests/kube-controller-manager.yaml