Once a Kubernetes cluster is up, pods are spread across different nodes, which makes troubleshooting from logs more complicated. With only a handful of pods you can get by with kubectl's built-in logs command, but as the pod count grows, querying logs this way quickly becomes tedious and pinpointing problems gets very difficult.
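For instance, with kubectl alone this means querying pods one at a time (a minimal illustration; the pod and deployment names below are placeholders):
$ kubectl logs <pod-name> -n <namespace>                    # dump the logs of a single pod
$ kubectl logs -f deployment/<deployment-name> --tail=100   # follows only one pod picked from the deployment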
What is needed now is a cluster-wide log collection system. The two mainstream stacks are:
ELK: Filebeat (collection), Logstash (filtering), Kafka (buffering), Elasticsearch (storage), Kibana (visualization)
EFK: Fluentd (collection), Elasticsearch (storage), Kibana (visualization)
EFK is also the officially recommended option. This article summarizes an EFK deployment and some of the pitfalls encountered along the way.
$ kubectl get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-elasticsearch Ready <none> 86m v1.21.0 172.16.66.169 <none> CentOS Linux 8 4.18.0-305.19.1.el8_4.x86_64 docker://20.10.9
k8s-master Ready control-plane,master 86m v1.21.0 172.16.66.167 <none> CentOS Linux 8 4.18.0-305.19.1.el8_4.x86_64 docker://20.10.9
k8s-node1 Ready <none> 86m v1.21.0 172.16.66.168 <none> CentOS Linux 8 4.18.0-305.19.1.el8_4.x86_64 docker://20.10.9
k8s-node2 Ready <none> 86m v1.21.0 172.16.66.170 <none> CentOS Linux 8 4.18.0-305.19.1.el8_4.x86_64 docker://20.10.9
# Two Node.js (Express) web applications are deployed across k8s-node1 and k8s-node2
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
websvr1-deployment-67fd6cf9d4-9fcfv 1/1 Running 0 62m 10.244.36.65 k8s-node1 <none> <none>
websvr1-deployment-67fd6cf9d4-bdhn8 1/1 Running 0 62m 10.244.169.129 k8s-node2 <none> <none>
websvr1-deployment-67fd6cf9d4-n6xt2 1/1 Running 0 62m 10.244.169.130 k8s-node2 <none> <none>
websvr2-deployment-67dfc4f674-79wrd 1/1 Running 0 62m 10.244.36.68 k8s-node1 <none> <none>
websvr2-deployment-67dfc4f674-bwdwx 1/1 Running 0 62m 10.244.36.67 k8s-node1 <none> <none>
websvr2-deployment-67dfc4f674-ktfml 1/1 Running 0 62m 10.244.36.66 k8s-node1 <none> <none>
Because an Elasticsearch cluster uses a lot of memory, it should be isolated from the business containers to avoid competing with them for resources. (Elasticsearch can even be deployed outside the cluster, for example on the company intranet, as long as fluentd can reach it over the network.) In production, plan for at least three physical machines for the Elasticsearch cluster, each with more than 2 GB of memory. For this test, Elasticsearch is deployed on its own node, k8s-elasticsearch, which has 8 GB of memory.
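For reference, the nodeSelector approach mentioned in the manifests below could look roughly like this (the label key es=true is an arbitrary example, not something used later in this article):
$ kubectl label node k8s-elasticsearch es=true
# and in the pod template spec, instead of nodeName:
#   nodeSelector:
#     es: "true"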
To keep it separate from the business workloads, create a dedicated namespace for Elasticsearch:
$ kubectl create ns kube-log
namespace/kube-log created
$ kubectl get ns
NAME STATUS AGE
default Active 3h37m
ingress-nginx Active 3h5m
kube-log Active 39s
kube-node-lease Active 3h37m
kube-public Active 3h37m
kube-system Active 3h37m
Consider a cluster with these components: pod-a, svc-b, pod-b1, pod-b2. When pod-a wants to reach the application in pod-b, the request first hits svc-b, which then forwards it to pod-b1 or pod-b2 at random.
Now suppose pod-a needs to connect to pod-b1 and pod-b2 at the same time; routing through svc-b clearly no longer meets the requirement. How, then, does pod-a obtain the IP addresses of pod-b1 and pod-b2? A headless Service solves exactly this.
$ vim headlessSvc.yaml
kind: Service
apiVersion: v1
metadata:
name: elasticsearch
namespace: kube-log
labels:
app: elasticsearch
spec:
selector:
app: elasticsearch
clusterIP: None
ports:
- port: 9200
name: rest
- port: 9300
name: inter-node
$ kubectl apply -f headlessSvc.yaml
$ kubectl get svc -n kube-log
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 76s
# CLUSTER-IP is None, i.e. this is a headless Service
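To see the difference, you can resolve the Service name from inside the cluster; a headless Service returns the individual pod IPs instead of a single virtual IP. A quick check (busybox:1.28 is used because its nslookup behaves well; run this after the Elasticsearch StatefulSet below is up, since there are no endpoints yet):
$ kubectl run dns-test -it --rm --restart=Never --image=busybox:1.28 -n kube-log -- nslookup elasticsearch
# With the StatefulSet running, this lists one A record per es-cluster-* pod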
Install NFS on the node that will host Elasticsearch, here k8s-elasticsearch:
$ yum install -y nfs-utils
$ systemctl start nfs-server    # on older nfs-utils versions: systemctl start nfs
$ systemctl enable nfs-server   # on older versions: systemctl enable nfs (or chkconfig nfs on)
# Create the NFS shared directory
$ mkdir /data/eslog -p
$ vim /etc/exports
> /data/eslog *(rw,no_root_squash) # the host field restricts which client IPs may mount the export; * allows all
$ exportfs -arv
# Make the export configuration take effect
$ systemctl restart nfs-server   # on older versions: systemctl restart nfs
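A quick way to confirm the export is active (showmount ships with nfs-utils):
$ showmount -e 172.16.66.169
# The export list should include /data/eslog *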
$ vim serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: nfs-provisioner
$ kubectl apply -f serviceaccount.yaml
$ vim rbac.yaml
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-provisioner-runner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["create", "update", "patch"]
- apiGroups: [""]
resources: ["services", "endpoints"]
verbs: ["get"]
- apiGroups: ["extensions"]
resources: ["podsecuritypolicies"]
resourceNames: ["nfs-provisioner"]
verbs: ["use"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: run-nfs-provisioner
subjects:
- kind: ServiceAccount
name: nfs-provisioner
namespace: default
roleRef:
kind: ClusterRole
name: nfs-provisioner-runner
apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-nfs-provisioner
rules:
- apiGroups: [""]
resources: ["endpoints"]
verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: leader-locking-nfs-provisioner
subjects:
- kind: ServiceAccount
name: nfs-provisioner
namespace: default
roleRef:
kind: Role
name: leader-locking-nfs-provisioner
apiGroup: rbac.authorization.k8s.io
$ kubectl apply -f rbac.yaml
$ vim npv.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
name: nfs-provisioner
spec:
selector:
matchLabels:
app: nfs-provisioner
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
app: nfs-provisioner
spec:
      nodeName: k8s-elasticsearch # pinned to the k8s-elasticsearch node; if the es cluster spans multiple machines, use nodeSelector with node labels instead
serviceAccount: nfs-provisioner
containers:
- name: nfs-provisioner
image: registry.cn-hangzhou.aliyuncs.com/open-ali/nfs-client-provisioner:latest
imagePullPolicy: IfNotPresent
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME
          value: eslog/nfs # PROVISIONER_NAME is eslog/nfs; it must match the provisioner field of the StorageClass created later
- name: NFS_SERVER
          value: 172.16.66.169 # IP of the NFS server, here the k8s-elasticsearch node
- name: NFS_PATH
          value: /data/eslog # the shared directory
volumes:
- name: nfs-client-root
nfs:
          server: 172.16.66.169 # IP of the NFS server; replace with your own NFS address
path: /data/eslog
$ kubectl apply -f npv.yaml
$ vim class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: es-block-storage
provisioner: eslog/nfs
$ kubectl apply -f class.yaml
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nfs-provisioner-75cf88b6c9-wg6b6 0/1 Running 0 6m41s <none> k8s-elasticsearch <none> <none>
websvr1-deployment-67fd6cf9d4-9fcfv 1/1 Running 0 5h20m 10.244.36.65 k8s-node1 <none> <none>
websvr1-deployment-67fd6cf9d4-bdhn8 1/1 Running 0 5h20m 10.244.169.129 k8s-node2 <none> <none>
websvr1-deployment-67fd6cf9d4-n6xt2 1/1 Running 0 5h20m 10.244.169.130 k8s-node2 <none> <none>
websvr2-deployment-67dfc4f674-79wrd 1/1 Running 0 5h19m 10.244.36.68 k8s-node1 <none> <none>
websvr2-deployment-67dfc4f674-bwdwx 1/1 Running 0 5h19m 10.244.36.67 k8s-node1 <none> <none>
websvr2-deployment-67dfc4f674-ktfml 1/1 Running 0 5h19m 10.244.36.66 k8s-node1 <none> <none>
$ kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
es-block-storage eslog/nfs Delete Immediate false 55m
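Optionally, dynamic provisioning can be smoke-tested with a throwaway PVC before deploying Elasticsearch (test-claim is a hypothetical name; note that on Kubernetes 1.20+ the claim may stay Pending until the RemoveSelfLink workaround described further down is applied):
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: es-block-storage
  resources:
    requests:
      storage: 1Mi
EOF
$ kubectl get pvc test-claim   # should become Bound, with a matching directory appearing under /data/eslog
$ kubectl delete pvc test-claim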
Deploy Elasticsearch with a StatefulSet, which gives the pods stable, ordered identities:
$ vim es.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: es-cluster
namespace: kube-log
spec:
serviceName: elasticsearch
replicas: 3
selector:
matchLabels:
app: elasticsearch
template:
metadata:
labels:
app: elasticsearch
spec:
      nodeName: k8s-elasticsearch # pinned to the k8s-elasticsearch node; if the es cluster spans multiple machines, use nodeSelector with node labels instead
containers:
- name: elasticsearch
image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
imagePullPolicy: IfNotPresent
resources:
limits:
            cpu: 1000m # each container may use at most 1 CPU
requests:
            cpu: 100m # each container is guaranteed at least 0.1 CPU
ports:
- containerPort: 9200
          name: rest # matches the port name in the headless Service
protocol: TCP
- containerPort: 9300
name: inter-node
protocol: TCP
volumeMounts:
- name: data
mountPath: /usr/share/elasticsearch/data
env:
        - name: cluster.name # Elasticsearch cluster name
value: k8s-logs
        - name: node.name # node name, taken from metadata.name
valueFrom:
fieldRef:
fieldPath: metadata.name
        - name: discovery.seed_hosts # how the Elasticsearch nodes discover each other; since they share a namespace this can be shortened to es-cluster-[0,1,2].elasticsearch
value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
- name: cluster.initial_master_nodes
value: "es-cluster-0,es-cluster-1,es-cluster-2"
- name: ES_JAVA_OPTS
value: "-Xms512m -Xmx512m" #告诉JVM使用512MB的最小和最大堆
      initContainers: # init containers that run before the main container, in the order defined; the main container starts only after all of them complete
- name: fix-permissions
image: busybox
imagePullPolicy: IfNotPresent
command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
securityContext:
privileged: true
volumeMounts:
- name: data
mountPath: /usr/share/elasticsearch/data
      # The first init container, fix-permissions, runs chown to change the owner and group of the Elasticsearch data directory to 1000:1000 (the elasticsearch user's UID/GID),
      # because Kubernetes mounts the data volume as root by default, which would leave the directory inaccessible to Elasticsearch.
- name: increase-vm-max-map
image: busybox
imagePullPolicy: IfNotPresent
command: ["sysctl", "-w", "vm.max_map_count=262144"]
securityContext:
privileged: true
      # The second init container, increase-vm-max-map, raises the OS limit on mmap counts; the default is often too low for Elasticsearch and leads to out-of-memory errors.
- name: increase-fd-ulimit
image: busybox
imagePullPolicy: IfNotPresent
command: ["sh", "-c", "ulimit -n 65536"]
securityContext:
privileged: true
      # The last init container runs ulimit to raise the maximum number of open file descriptors.
      # The Elasticsearch notes for production use also recommend disabling swap for performance reasons; a Kubernetes cluster should have swap disabled anyway.
volumeClaimTemplates:
- metadata:
name: data
labels:
app: elasticsearch
spec:
      accessModes: [ "ReadWriteOnce" ] # the volume can be mounted read-write by a single node only
      storageClassName: es-block-storage # must already exist; we use NFS as the storage backend, hence the provisioner installed above
resources:
requests:
          storage: 10Gi # each PV is sized at 10Gi
$ kubectl apply -f es.yaml
$ kubectl get pod -owide -n kube-log
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
es-cluster-0 0/1 Init:0/3 0 10m <none> k8s-elasticsearch <none> <none>
# The pod stays stuck in Init because the elasticsearch:7.2.0 image could not be pulled; pull it manually on the node where es is scheduled:
$ docker pull elasticsearch:7.2.0
# Re-tag it to the image name used in the yaml:
$ docker tag 0efa6a3de177 docker.elastic.co/elasticsearch/elasticsearch:7.2.0
Checking the status again showed the pods still stuck initializing. After a lot of digging, the cause turned out to be that Kubernetes 1.20+ no longer populates the deprecated selfLink field that this older nfs-client-provisioner depends on; it can be re-enabled with a feature gate by editing the kube-apiserver static pod manifest on the master node:
$ vim /etc/kubernetes/manifests/kube-apiserver.yaml
# Append to the end of spec.containers.command:
- --feature-gates=RemoveSelfLink=false
# Restart kubelet so it recreates the kube-apiserver static pod with the new flag
$ systemctl restart kubelet
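One way to confirm the flag took effect (assuming the default static pod naming of kube-apiserver-<node-name>):
$ kubectl -n kube-system get pod kube-apiserver-k8s-master -o yaml | grep RemoveSelfLink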
# Check the es pods again:
$ kubectl get pod -owide -n kube-log
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
es-cluster-0 1/1 Running 0 21m 10.244.117.10 k8s-elasticsearch <none> <none>
es-cluster-1 1/1 Running 0 2m11s 10.244.117.11 k8s-elasticsearch <none> <none>
es-cluster-2 1/1 Running 0 115s 10.244.117.12 k8s-elasticsearch <none> <none>
$ kubectl get svc -n kube-log
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 3h48m
Only at this point is the Elasticsearch cluster fully deployed.
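Before moving on, it is worth checking that the three nodes actually formed one cluster; a minimal check, assuming curl is available inside the Elasticsearch image:
$ kubectl exec -n kube-log es-cluster-0 -- curl -s http://localhost:9200/_cluster/health?pretty
# "number_of_nodes" : 3 and "status" : "green" indicate a healthy three-node cluster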
$ vim kibana.yaml
apiVersion: v1
kind: Service
metadata:
name: kibana
namespace: kube-log
labels:
app: kibana
spec:
  type: NodePort # NodePort type for easy access while testing
ports:
- port: 5601
selector:
app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kibana
namespace: kube-log
labels:
app: kibana
spec:
replicas: 1
selector:
matchLabels:
app: kibana
template:
metadata:
labels:
app: kibana
spec:
      nodeName: k8s-elasticsearch # pinned to the k8s-elasticsearch node; use nodeSelector with node labels if the es cluster spans multiple machines
containers:
- name: kibana
        image: docker.elastic.co/kibana/kibana:7.2.0 # the Kibana version must match the Elasticsearch version
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: 1000m
requests:
cpu: 100m
env:
- name: ELASTICSEARCH_URL
          value: http://elasticsearch:9200 # the DNS name of the headless Service
ports:
- containerPort: 5601
$ kubectl apply -f kibana.yaml
# If the kibana image also fails to pull, pull it manually from Docker Hub and re-tag it, as was done for the es image above
$ kubectl get pod -o wide -n kube-log
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
es-cluster-0 1/1 Running 0 33m 10.244.117.10 k8s-elasticsearch <none> <none>
es-cluster-1 1/1 Running 0 13m 10.244.117.11 k8s-elasticsearch <none> <none>
es-cluster-2 1/1 Running 0 13m 10.244.117.12 k8s-elasticsearch <none> <none>
kibana-5dd9f479dc-gbprl 1/1 Running 0 4m59s 10.244.117.13 k8s-elasticsearch <none> <none>
$ kubectl get svc -n kube-log -owide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
elasticsearch ClusterIP None <none> 9200/TCP,9300/TCP 3h57m app=elasticsearch
kibana NodePort 10.102.222.139 <none> 5601:32591/TCP 5m11s app=kibana
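A quick reachability check against the NodePort (32591 here; any node address works, including the node's public IP):
$ curl -s -o /dev/null -w "%{http_code}\n" http://172.16.66.169:32591/api/status
# 200 indicates Kibana is up and able to reach Elasticsearch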
Kibana is now reachable on port 32591 via the k8s-elasticsearch server's public address, and the log management UI loads normally. The last step is to deploy fluentd so that every pod's logs are shipped to the Elasticsearch service.
Fluentd is deployed with a DaemonSet controller, which guarantees that every node in the cluster runs an identical fluentd pod, so logs from every node get collected. In Kubernetes, the stdout/stderr of application containers is redirected to JSON files on the node; fluentd tails those files, filters and reformats the records, and sends them to the Elasticsearch cluster. Besides container logs, fluentd can also collect kubelet, kube-proxy and docker logs.
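On any node you can see the files fluentd will tail; with the Docker runtime, the kubelet exposes container stdout/stderr as symlinks that ultimately point at JSON log files under /var/lib/docker/containers, hence the two hostPath mounts in the DaemonSet below:
$ ls /var/log/containers/                      # one symlink per container, linking into /var/log/pods
$ ls /var/lib/docker/containers/*/*-json.log   # the actual JSON log files written by Docker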
$ vim fluentd.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: kube-log
labels:
app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: fluentd
labels:
app: fluentd
rules:
- apiGroups:
- ""
resources:
- pods
- namespaces
verbs:
- get
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: fluentd
roleRef:
kind: ClusterRole
name: fluentd
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: fluentd
namespace: kube-log
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd
namespace: kube-log
labels:
app: fluentd
spec:
selector:
matchLabels:
app: fluentd
template:
metadata:
labels:
app: fluentd
spec:
serviceAccount: fluentd
serviceAccountName: fluentd
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
imagePullPolicy: IfNotPresent
env:
- name: FLUENT_ELASTICSEARCH_HOST
value: "elasticsearch.kube-log.svc.cluster.local"
- name: FLUENT_ELASTICSEARCH_PORT
value: "9200"
- name: FLUENT_ELASTICSEARCH_SCHEME
value: "http"
- name: FLUENTD_SYSTEMD_CONF
value: disable
resources:
limits:
memory: 512Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
$ kubectl apply -f fluentd.yaml
$ kubectl get pod -owide -n kube-log
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
es-cluster-0 1/1 Running 0 20h 10.244.117.10 k8s-elasticsearch <none> <none>
es-cluster-1 1/1 Running 0 19h 10.244.117.11 k8s-elasticsearch <none> <none>
es-cluster-2 1/1 Running 0 19h 10.244.117.12 k8s-elasticsearch <none> <none>
fluentd-65ngd 1/1 Running 0 141m 10.244.36.69 k8s-node1 <none> <none>
fluentd-h8j2z 1/1 Running 0 141m 10.244.117.14 k8s-elasticsearch <none> <none>
fluentd-prsgv 1/1 Running 0 141m 10.244.169.131 k8s-node2 <none> <none>
fluentd-wtsf9 1/1 Running 0 141m 10.244.235.193 k8s-master <none> <none>
kibana-5f64ccf544-4wjwv 1/1 Running 0 66m 10.244.117.15 k8s-elasticsearch <none> <none>
At this point the log collection stack is fully deployed.
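As a final check, the fluentd-kubernetes-daemonset image has logstash_format enabled by default, so daily logstash-* indices should appear in Elasticsearch shortly after the fluentd pods start (again assuming curl inside the es image):
$ kubectl exec -n kube-log es-cluster-0 -- curl -s 'http://localhost:9200/_cat/indices?v'
# Look for indices named logstash-YYYY.MM.DD, then create a matching index pattern (e.g. logstash-*) in Kibana to browse the logs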