Installing the Kubernetes Metrics Server

1. Download the metrics-server source

$ git clone https://github.com/kubernetes-incubator/metrics-server.git

2. Check which image the deployment depends on

$ cd metrics-server/deploy/1.8+
$ grep 'image:' *
metrics-server-deployment.yaml:        image: k8s.gcr.io/metrics-server-amd64:v0.3.3

If the gcr.io image cannot be reached, replace the image in metrics-server-deployment.yaml with: registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3

$ sed -i "s/image: .*/image: registry.cn-hangzhou.aliyuncs.com\/kubernets-imags\/metrics-server-amd64:v0.3.3/g" metrics-server-deployment.yaml
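
The same substitution can be wrapped in a small function so that it is easy to reuse with another mirror. This is a minimal sketch, assuming GNU sed (for `-i` without a suffix); `set_image` is a hypothetical helper name, not part of metrics-server. It uses `|` as the sed delimiter so the slashes in the image path need no escaping.

```shell
# set_image FILE IMAGE: point every "image:" line in FILE at IMAGE.
# Assumes GNU sed; BSD/macOS sed requires a suffix argument after -i.
# Assumes IMAGE contains no "|" or "&" characters (image names normally do not).
set_image() {
    file=$1
    image=$2
    # Only the text from "image: " to the end of the line is replaced,
    # so the YAML indentation in front of it is preserved.
    sed -i "s|image: .*|image: ${image}|" "$file"
}

# Example (not run here):
# set_image metrics-server-deployment.yaml \
#     registry.cn-hangzhou.aliyuncs.com/kubernets-imags/metrics-server-amd64:v0.3.3
```

Running `grep 'image:' metrics-server-deployment.yaml` afterwards shows whether the line was rewritten as intended.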

3. Install metrics-server

$ cd ../..    # back to the repository root (metrics-server) from deploy/1.8+
$ kubectl create -f deploy/1.8+/

After a short while the metrics-server pod should be running:

$ kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-54957b58f4-dnntx   1/1     Running   0          21s

4. Verify the installation

$ kubectl top node
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

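The output shows that the metrics API (nodes.metrics.k8s.io) is not being served, even though the pod is in the Running state. Check the metrics-server logs: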

$ kubectl logs metrics-server-54957b58f4-dnntx  -n kube-system
E1005 11:58:15.654250       1 manager.go:111] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:mesos-test2: unable to fetch metrics from Kubelet mesos-test2 (mesos-test2): Get https://mesos-test2:10250/stats/summary/: dial tcp: lookup mesos-test2 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:k8s-slave20: unable to fetch metrics from Kubelet k8s-slave20 (k8s-slave20): Get https://k8s-slave20:10250/stats/summary/: dial tcp: lookup k8s-slave20 on 10.96.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:mesos-test1: unable to fetch metrics from Kubelet mesos-test1 (mesos-test1): Get https://mesos-test1:10250/stats/summary/: dial tcp: lookup mesos-test1 on 10.96.0.10:53: no such host]

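The log shows that metrics-server uses each node's hostname when it fetches stats from the kubelet on port 10250. This is a standalone Kubernetes demo environment: the node names (mesos-test1, mesos-test2 and k8s-slave20 in the log above) are defined only in each node's /etc/hosts file, and there is no internal DNS server. The lookups therefore go to the DNS server the pod uses (10.96.0.10 in the log) and fail, so metrics-server cannot resolve the node names.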
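
To see at a glance which node names fail to resolve, the log can be filtered for the "lookup <name> on" pattern. This is a minimal sketch, assuming the log format shown above; `unresolved_hosts` is a hypothetical helper name.

```shell
# unresolved_hosts: read metrics-server log text on stdin and print each
# hostname that appears in a "lookup <name> on <dns>" message, once each.
unresolved_hosts() {
    grep -o 'lookup [^ ]* on ' | awk '{ print $2 }' | LC_ALL=C sort -u
}

# Example (not run here):
# kubectl logs metrics-server-54957b58f4-dnntx -n kube-system | unresolved_hosts
```

Each name printed is one that the pod's DNS server could not resolve.
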
Solution:

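  1. Remove metrics-server. Deleting only the pod is not enough, because the Deployment recreates it with the same configuration, so delete the resources created from the manifests (run from the repository root):
kubectl delete -f deploy/1.8+/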
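  2. Edit metrics-server-deployment.yaml and redeploy metrics-server. --kubelet-preferred-address-types=InternalIP makes metrics-server connect to each kubelet by the node's internal IP instead of its hostname, and --kubelet-insecure-tls skips verification of the kubelet's serving certificate (acceptable in a demo environment, not recommended in production). Add the following command settings: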
        imagePullPolicy: Always
        command:
            - /metrics-server
            - --kubelet-preferred-address-types=InternalIP
            - --kubelet-insecure-tls
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
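
After redeploying, `kubectl top node` may keep returning an error for a while until metrics-server has completed its first scrape. One way to handle this is to poll until the command succeeds. This is a minimal sketch; `retry` is a hypothetical helper, and the attempt count and delay in the example are arbitrary assumptions.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between tries. Returns 1 if every try fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        # Sleep only if another attempt is still to come.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (not run here): poll for up to about two minutes.
# retry 12 10 kubectl top node
```

If the command still fails after the last attempt, the metrics-server logs are the next place to look.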
