Reference blog: https://blog.51cto.com/14143894/2458468
1. Instrument the code
Reference: https://prometheus.io/docs/instrumenting/clientlibs/
Python client library: https://github.com/prometheus/client_python
from flask import Flask, Response
import prometheus_client
from prometheus_client import Counter

app = Flask(__name__)
request_total = Counter("http_requests_total", "Total request count of the service")

@app.route('/metrics')
def requests_count():
    return Response(prometheus_client.generate_latest(request_total), mimetype='text/plain')

@app.before_request
def request_stat():
    request_total.inc()
2. If you have a local environment, test it:
C:\Users\luogu>curl http://192.168.11.5:8081/metrics
http_requests_total 18225.0
http_requests_created 1.6097487772094784e+09
The client library automatically appends the _total suffix to counter metrics.
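As a quick illustration of that suffix behavior, here is a minimal sketch using prometheus_client outside the Flask app (the metric name and value are made up; a private registry is used so the demo does not touch the global one):

```python
from prometheus_client import Counter, generate_latest
from prometheus_client.core import CollectorRegistry

# private registry so the demo does not pollute the default one
registry = CollectorRegistry()
demo = Counter("demo_requests", "Demo counter", registry=registry)
demo.inc(3)

text = generate_latest(registry).decode()
print(text)  # contains demo_requests_total 3.0 and a demo_requests_created sample
```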
3. Deploy the application:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-admin
  namespace: ms-test
  labels:
    app: flask-admin
spec:
  minReadySeconds: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # maximum number of extra pods allowed during an update
      maxSurge: 1
      # maximum number of unavailable pods allowed during an update
      maxUnavailable: 0
  replicas: 1
  selector:
    matchLabels:
      app: flask-admin
  template:
    metadata:
      labels:
        app: flask-admin
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8081"
        prometheus.io/path: "/metrics"
      # container spec omitted in the original
Add these annotations to the pod template:
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8081"
  prometheus.io/path: "/metrics"
so that Prometheus can automatically discover the pod's metrics.
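These annotations only take effect if the scrape configuration looks for them. For a plain (non-operator) Prometheus install, that means a kubernetes_sd_configs pod job along these lines in prometheus.yml (a sketch; the job name and layout are illustrative, the relabel rules are the standard annotation-driven ones):

```yaml
scrape_configs:
- job_name: kubernetes-pods
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # only scrape pods annotated prometheus.io/scrape: "true"
  - action: keep
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    regex: "true"
  # honor prometheus.io/path
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    regex: (.+)
    target_label: __metrics_path__
  # honor prometheus.io/port
  - action: replace
    source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
```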
4. Then use the ClusterIP to check that metric scraping works:
[root@node1 ~]# kubectl -n ms-test get svc|grep flask-admin
flask-admin NodePort 10.68.192.130 8081:34622/TCP 17d
[root@node1 ~]# curl http://10.68.192.130:8081/metrics
http_requests_total 3.0
http_requests_created 1.6097489733610158e+09
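The exposition format is plain text with one sample per line, so a value is easy to pull out by hand; a minimal sketch (not a full parser) using the output above:

```python
# Extract the http_requests_total sample from Prometheus text exposition
# output like the curl response above (a minimal sketch, not a full parser).
body = """http_requests_total 3.0
http_requests_created 1.6097489733610158e+09
"""

value = None
for line in body.splitlines():
    if line.startswith("http_requests_total "):
        value = float(line.split()[-1])
print(value)
```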
5. View the metric in Prometheus
With an operator-installed Prometheus the metric is not found by default, because there is no ServiceMonitor for the pods. Here I took Istio's yaml and adapted it:
cat serviceMonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubernetes-pods-custom-metrics
  namespace: monitoring
spec:
  endpoints:
  - relabelings:
    - action: keep
      regex: "true"
      sourceLabels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape
    - action: replace
      regex: (.+)
      sourceLabels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
      targetLabel: __metrics_path__
    - action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      sourceLabels:
      - __address__
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      targetLabel: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - action: replace
      sourceLabels:
      - __meta_kubernetes_namespace
      targetLabel: namespace
    - action: replace
      sourceLabels:
      - __meta_kubernetes_pod_name
      targetLabel: pod_name
  jobLabel: kubernetes-pods-custom-metrics
  namespaceSelector:
    any: true
  selector:
    matchExpressions:
    - operator: DoesNotExist  # the label key was omitted in the original; use a label your targets do not carry
    #matchLabels:
    #  app: flask-admin
kubectl apply -f serviceMonitor.yaml
Now the http_requests_total metric shows up when you query it in Prometheus.
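To see what the address-rewriting rule in the ServiceMonitor does, here is the same regex applied by hand in Python (the pod IP and port are made-up values). Prometheus concatenates the sourceLabels with ";" before matching:

```python
import re

# Prometheus joins the two sourceLabels (__address__ and the port annotation)
# with ";" before applying the regex; $1:$2 becomes the new scrape address.
pattern = r"([^:]+)(?::\d+)?;(\d+)"

joined = "10.244.1.5:8081;8081"  # hypothetical pod address + prometheus.io/port
address = re.sub(pattern, r"\1:\2", joined)
print(address)  # 10.244.1.5:8081

# it also works when the discovered address carries no port
print(re.sub(pattern, r"\1:\2", "10.244.1.5;8081"))  # 10.244.1.5:8081
```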
6. Deploy the Custom Metrics Adapter
The metrics Prometheus collects cannot be consumed by Kubernetes directly, because the two data formats are incompatible. Another component (k8s-prometheus-adapter) converts Prometheus metrics into a format the Kubernetes API can understand. Since this is a custom API, it also has to be registered with the main APIServer through the Kubernetes aggregator so it can be accessed directly under /apis/.
In short, the adapter does two things: it registers itself with the api-server, and it converts the data into a form the API can serve.
https://github.com/DirectXMan12/k8s-prometheus-adapter
PrometheusAdapter has a stable Helm chart, which we use directly.
First prepare the helm environment:
[root@k8s-master1 helm]# wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
[root@k8s-master1 helm]# tar xf helm-v3.0.0-linux-amd64.tar.gz
[root@k8s-master1 helm]# mv linux-amd64/helm /usr/bin
With helm installed, add a chart repository that hosts the adapter chart; the Azure China mirror of the stable repo is recommended:
[root@k8s-master1 helm]# helm repo add stable http://mirror.azure.cn/kubernetes/charts
"stable" has been added to your repositories
[root@k8s-master1 helm]# helm repo ls
NAME URL
stable http://mirror.azure.cn/kubernetes/charts
Now we can install the adapter with helm install.
The adapter chart needs the Prometheus URL pointed at your own environment, like so:
[root@k8s-master1 helm]# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus-k8s.monitoring,prometheus.port=9090
NAME: prometheus-adapter
LAST DEPLOYED: Fri Dec 13 15:22:42 2019
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
In practice helm warns that this repository is deprecated; it still works for now.
[root@k8s-master1 helm]# helm list -n kube-system
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
prometheus-adapter kube-system 1 2019-12-13 15:22:42.043441232 +0800 CST deployed prometheus-adapter-1.4.0 v0.5.0
The pod is deployed successfully:
[root@k8s-master1 helm]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
…
prometheus-adapter-77b7b4dd8b-9rv26 1/1 Running 0 2m36s
Check that the pod works and is registered with the aggregation layer:
[root@k8s-master1 helm]# kubectl get apiservice
v1beta1.custom.metrics.k8s.io kube-system/prometheus-adapter True 13m
Now test the raw URL to see whether the interface works:
[root@k8s-master1 helm]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" |jq
Create the HPA policy
[root@node1 prometheus-adapter]# cat flask-admin-hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: flask-admin-hpa
  namespace: ms-test
spec:
  minReplicas: 1
  maxReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: flask-admin
  metrics:
  # metrics body reconstructed from the surrounding text: scale on the
  # custom pods metric, targeting an average of 40 requests/second per pod
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 40
Note that the unit is m (milli); divide by 1000 to get requests/second. The policy above means: when the average request rate exceeds 40/s per pod, scale out.
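The m suffix is the standard Kubernetes "milli" quantity notation. A tiny helper showing the conversion (it handles only the plain and milli forms kubectl prints, not the full quantity grammar):

```python
def parse_quantity(q: str) -> float:
    """Convert a kubectl-style quantity to a float; only '' and 'm' suffixes."""
    if q.endswith("m"):
        return float(q[:-1]) / 1000.0
    return float(q)

print(parse_quantity("66m"))  # 0.066 -> roughly 0.066 requests/second observed
print(parse_quantity("40"))   # 40.0  -> the 40 requests/second target
```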
Check: no value is reported yet
[root@node1 prometheus-adapter]# kubectl -n ms-test get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
flask-admin-hpa Deployment/flask-admin /20 1 3 1 49s
That is because the adapter does not yet know which metric we want (http_requests_per_second), so the HPA cannot obtain it from the pods.
Edit the prometheus-adapter ConfigMap in its namespace and add a seriesQuery at the top of the rules section to compute the QPS value we want:
[root@k8s-master1 hpa]# kubectl edit cm prometheus-adapter -n kube-system
    rules:
    - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
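The name rule is a plain regex rewrite; what it does to our series name, reproduced in Python:

```python
import re

# The adapter's naming rule: matches "^(.*)_total", as "${1}_per_second"
series = "http_requests_total"
metric = re.sub(r"^(.*)_total", r"\1_per_second", series)
print(metric)  # http_requests_per_second
```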
The rate() part computes an average: the mean HTTP request rate over a 2-minute window:
rate(http_requests_total{namespace!="",pod!=""}[2m])
Because there are multiple pods, we sum the rates to expose a single figure:
sum(rate(http_requests_total{namespace!="",pod!=""}[2m]))
and add by with a label name so the result can be queried per label:
sum(rate(http_requests_total{namespace!="",pod!=""}[2m])) by (pod)
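What the metricsQuery computes, sketched with made-up counter samples (the pod names and values are hypothetical):

```python
# rate(x[2m]) is roughly (last - first) / window over the 2-minute range;
# sum(...) by (pod) then keeps one aggregated series per pod label value.
window = 120  # seconds, the [2m] range
samples = {  # hypothetical counter values at t0 and t0+120s, per pod
    "flask-admin-a": (18225.0, 18233.0),
    "flask-admin-b": (9100.0, 9104.0),
}
per_pod_rate = {pod: (v1 - v0) / window for pod, (v0, v1) in samples.items()}
print(per_pod_rate)  # requests/second per pod

# without "by", sum() would collapse everything into a single total
print(sum(per_pod_rate.values()))
```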
Test the API:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second" |grep "http_requests_per_second"
The output is large; I redirected it to a file and searched for http_requests_per_second.
The value is now being reported:
[root@node1 ~]# kubectl -n ms-test get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
flask-admin-hpa Deployment/flask-admin 66m/40 1 3 1 4h41m
Load test: ab -c 3 -n 10000 -H 'token: eyJhbGciOiJIUzI1NiIsImlhdCI6MTYwOTU4NzE2MiwiZXhwIjoxNjEyMTc5MTYyfQ.eyJpZCI6M30.spvwRMBdf5Cz5AxOa-d31ar2x5hfKkRBL-2AxH5XI3I' http://test-gateway.kkkk.com/worksheet/version_list (replace with your own URL; mine requires a token header)
Watch the scaling status.
After roughly 5 minutes of low load the replicas scale back down.
Summary of the flow: the code exposes metrics => Prometheus discovers and scrapes them (an operator install needs a ServiceMonitor; a plain install needs a scrape-config change, examples are easy to find online) => install prometheus-adapter, which registers its API with the apiserver so the HPA can query it when scaling (analogous to metrics-server) => write the HPA => load test to verify.