Installing Prometheus Operator in a Kubernetes cluster

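The key to building an Operator is the design of its CRDs (CustomResourceDefinitions).
An Operator is built on two Kubernetes concepts:
Resource: the declared state of an object
Controller: observes, analyzes, and acts to bring the actual state of resources in line with the declared state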

At the core of Prometheus Operator is the Operator controller, which works with the following 5 CRDs:
alertmanagers.monitoring.coreos.com
podmonitors.monitoring.coreos.com
prometheuses.monitoring.coreos.com
prometheusrules.monitoring.coreos.com
servicemonitors.monitoring.coreos.com
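prometheuses represents the Prometheus server.
servicemonitors can be thought of as an abstraction over the various exporters that expose a metrics endpoint.
The backends of a servicemonitor are Services, which it selects by label.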

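Install the Operator
$ git clone https://github.com/coreos/kube-prometheus.git
$ cd kube-prometheus/manifests/
$ vim prometheus-serviceMonitorKubelet.yaml
Change both occurrences of https-metrics to http-metrics, and delete the TLS certificate settings.
By default this ServiceMonitor scrapes node data from the kubelet's HTTPS port 10250. In this setup the kubelet metrics are read from the read-only port 10255 instead, which serves plain HTTP, so the port name changes and the TLS settings are no longer needed.
If you leave the file unchanged, Prometheus keeps scraping the kubelet on port 10250.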
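
The port rename can also be scripted rather than done by hand in vim. The following is a minimal sketch on a cut-down sample file, not the real manifest: the sample's contents and the /tmp paths are assumptions made for illustration, and the TLS settings would still have to be removed separately.

```shell
# Self-contained sample; the real prometheus-serviceMonitorKubelet.yaml
# contains more fields than the few shown here (assumed layout).
cat > /tmp/sm-kubelet-sample.yaml <<'EOF'
spec:
  endpoints:
  - interval: 30s
    port: https-metrics
  - interval: 30s
    path: /metrics/cadvisor
    port: https-metrics
EOF

# Rename both port references, writing to a new file instead of editing in place.
sed 's/port: https-metrics/port: http-metrics/' \
  /tmp/sm-kubelet-sample.yaml > /tmp/sm-kubelet-edited.yaml

# Count the renamed port lines (prints 2 for the sample above).
grep -c 'port: http-metrics' /tmp/sm-kubelet-edited.yaml
```

Writing to a separate output file keeps the original around for a diff before anything is applied.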

$ kubectl apply -f .
$ kubectl get crd
$ kubectl get pods -n monitoring
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                   2/2     Running   0          4d3h
alertmanager-main-1                   2/2     Running   0          4d3h
alertmanager-main-2                   2/2     Running   0          4d3h
grafana-57bfdd47f8-88nkx              1/1     Running   0          4d3h
kube-state-metrics-65d5b4b99d-qdmw7   4/4     Running   0          4h17m
node-exporter-bmwp2                   2/2     Running   0          4d3h
node-exporter-cnnw8                   2/2     Running   0          4d3h
node-exporter-s6jv4                   2/2     Running   0          4d3h
prometheus-adapter-668748ddbd-l2zzv   1/1     Running   0          4d3h
prometheus-k8s-0                      3/3     Running   1          4h56m
prometheus-k8s-1                      3/3     Running   1          4h56m
prometheus-operator-55b978b89-68wkg   1/1     Running   0          4d3h
prometheus-k8s is the Prometheus server Pod, managed by a StatefulSet.
prometheus-operator is the Operator controller Pod.

$ kubectl get svc -n monitoring
Change the grafana and prometheus-k8s Services to type NodePort, or create an Ingress for them:
$ kubectl edit svc grafana -n monitoring
$ kubectl edit svc prometheus-k8s -n monitoring
$ kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      PORT(S)
alertmanager-main       ClusterIP   10.99.104.91    9093/TCP
alertmanager-operated   ClusterIP   None            9093/TCP,9094/TCP,9094/UDP
grafana                 NodePort    10.98.238.122   3000:31633/TCP
kube-state-metrics      ClusterIP   None            8443/TCP,9443/TCP
node-exporter           ClusterIP   None            9100/TCP
prometheus-adapter      ClusterIP   10.102.67.217   443/TCP
prometheus-k8s          NodePort    10.111.85.217   9090:31405/TCP
prometheus-operated     ClusterIP   None            9090/TCP
prometheus-operator     ClusterIP   None            8080/TCP
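
As an alternative to editing the Services interactively, the type can be switched with kubectl patch. This is a sketch, assuming the monitoring namespace used above; the function name expose_nodeport and the KUBECTL override (useful for a dry run with echo) are illustrative and not part of kube-prometheus.

```shell
# Switch a Service in the monitoring namespace to type NodePort.
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
expose_nodeport() {
  "${KUBECTL:-kubectl}" patch svc "$1" -n monitoring \
    --type merge -p '{"spec":{"type":"NodePort"}}'
}

# Usage against a live cluster:
#   expose_nodeport grafana
#   expose_nodeport prometheus-k8s
# Dry run that only prints the arguments:
#   KUBECTL=echo expose_nodeport grafana
```

With a merge patch only the type field is sent; Kubernetes then allocates the node port itself, as it does after the manual edit.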

http://192.168.1.243:31405/targets
Targets that are scraped successfully:
alertmanager(9093)
kube-apiserver(6443)
kube-state-metrics(8443和9443)
kubelet(10255 or 10250)
node-exporter(9100)
prometheus-operator(8080)
prometheus(9090)
coredns(9153)
Targets that are not scraped successfully:
kube-controller-manager(10252)
kube-scheduler(10251)

Fix kube-scheduler by creating a matching Service

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  selector:
    component: kube-scheduler
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP

Here the label k8s-app: kube-scheduler matches the selector defined in the corresponding ServiceMonitor,
and the selector component: kube-scheduler matches the labels on the scheduler Pod.
$ cat prometheus-serviceMonitorKubeScheduler.yaml

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s # scrape every 30s
    port: http-metrics  # name of the port on the Service
  jobLabel: k8s-app
  namespaceSelector: # namespaces to search for Services; use any: true to match in all namespaces
    matchNames:
    - kube-system
  selector:  # label selector for Services; every matchLabels entry and every matchExpressions requirement must hold (they are ANDed); matchExpressions adds operators such as In, NotIn and Exists
    matchLabels:
      k8s-app: kube-scheduler

$ cat /etc/kubernetes/manifests/kube-scheduler.yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
......
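
The label wiring described above can be checked against the cluster before looking at the Prometheus targets page. A minimal sketch: the function name check_scheduler_wiring and the KUBECTL override are illustrative assumptions; the kubectl subcommands themselves are standard.

```shell
# Print the objects on each side of the label wiring for kube-scheduler.
# KUBECTL is a hypothetical override; it defaults to the real kubectl binary.
check_scheduler_wiring() {
  k="${KUBECTL:-kubectl}"
  # Pods that the Service selector (component: kube-scheduler) should pick up
  "$k" get pods -n kube-system -l component=kube-scheduler
  # Services that the ServiceMonitor selector (k8s-app: kube-scheduler) should pick up
  "$k" get svc -n kube-system -l k8s-app=kube-scheduler
  # Endpoints behind the Service
  "$k" get endpoints kube-scheduler -n kube-system
}

# Usage against a live cluster:
#   check_scheduler_wiring
```

If the endpoints list comes back empty, the Service selector most likely does not match the Pod labels, or the Pod is not ready.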

$ vim /etc/kubernetes/manifests/kube-scheduler.yaml
Change the --address flag to 0.0.0.0:

containers:
- command:
  - kube-scheduler
  - --leader-elect=true
  - --kubeconfig=/etc/kubernetes/scheduler.conf
  - --address=0.0.0.0

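To make the change take effect, rename the file, wait a moment, and then rename it back so that the kubelet recreates the static Pod. Never update it with kubectl apply.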
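
The rename-and-restore step can be sketched as a small shell function. The function name restart_static_pod, the 30-second default wait, and the temporary holding directory are assumptions for illustration; adjust the wait to your cluster.

```shell
# Move a static Pod manifest out of the kubelet's manifest directory,
# wait, then move it back so the kubelet recreates the Pod.
restart_static_pod() {
  manifest=$1
  wait_seconds=${2:-30}               # assumed default; tune as needed
  holding_dir=$(mktemp -d) || return 1
  mv "$manifest" "$holding_dir/" || return 1
  sleep "$wait_seconds"
  mv "$holding_dir/$(basename "$manifest")" "$manifest" && rmdir "$holding_dir"
}

# Usage on a control-plane node (as root):
#   restart_static_pod /etc/kubernetes/manifests/kube-scheduler.yaml
```

Moving the file out of the directory entirely, rather than renaming it in place, ensures the kubelet does not see a second copy of the manifest under another name.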

Fix kube-controller-manager by creating a matching Service

apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  selector:
    component: kube-controller-manager
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP

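The remaining steps mirror those for kube-scheduler: check the ServiceMonitor selector and the Pod labels, then change --address to 0.0.0.0 in the manifest.
$ cat prometheus-serviceMonitorKubeControllerManager.yaml
$ cat /etc/kubernetes/manifests/kube-controller-manager.yaml
$ vim /etc/kubernetes/manifests/kube-controller-manager.yaml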
