The logging pipeline includes:
Log path standardization: /data/logs/app/app_error.log and app_access.log
Log format standardization: logformat
Log collection: fluentd or filebeat into Kafka
Log storage: Kafka into ES
Log query: ES into Kibana
Log monitoring: Storm consuming from Kafka
We collect the logs of every node by running fluentd as a DaemonSet, so that one fluentd pod runs on each node. Fluentd mounts the Docker log directory /var/lib/docker/containers and the /var/log directory into its pod. On every node Kubernetes creates one directory per pod under /var/log/pods, which keeps the log output of different containers apart; each log file in those directories is a symlink to the container's log output under /var/lib/docker/containers.
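This layout can be inspected directly on a node. A minimal sketch (exact file names depend on the pod and the container runtime; the paths below assume the Docker json-file log driver):
# Per-container log files that fluentd tails (symlinks into /var/log/pods/...)
ls -l /var/log/containers/
# Per-pod directories created by the kubelet; each file links back to
# /var/lib/docker/containers/<container-id>/<container-id>-json.log
ls -l /var/log/pods/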
Official documentation: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch
1) Download the YAML manifests
# mkdir efk && cd efk
wget \
https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/es-statefulset.yaml
wget \
https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/es-service.yaml
wget \
https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml
wget \
https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/fluentd-es-ds.yaml
wget \
https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/kibana-service.yaml
wget \
https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/fluentd-elasticsearch/kibana-deployment.yaml
2) Pull the images
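The manifests reference images on registries that may be slow or unreachable from some networks, so it helps to know exactly which images are needed and to pre-pull them on every node. A minimal sketch, run inside the efk directory:
# List every image referenced by the downloaded manifests
grep -h 'image:' *.yaml | sort -u
# Then pull each of them on every node ahead of time, e.g.
# docker pull <image printed above>   (or pull from a reachable mirror and re-tag)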
3) Label the nodes
Fluentd runs in the Kubernetes cluster as a DaemonSet, which guarantees that one Fluentd pod is started on every node: we create the Fluentd resources from the master node and they end up running on all nodes. Check the resources defined by fluentd-es-ds.yaml:
[centos@k8s-master efk]$ kubectl get -f fluentd-es-ds.yaml
NAME                        SECRETS   AGE
serviceaccount/fluentd-es   1         70m
NAME                                               AGE
clusterrole.rbac.authorization.k8s.io/fluentd-es   70m
NAME                                                      AGE
clusterrolebinding.rbac.authorization.k8s.io/fluentd-es   70m
NAME                                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                              AGE
daemonset.apps/fluentd-es-v2.2.1       0         0         0       0            0           beta.kubernetes.io/fluentd-ds-ready=true   70m
The output shows that NODE SELECTOR is set to beta.kubernetes.io/fluentd-ds-ready=true, which means fluentd is only scheduled to nodes that carry the label beta.kubernetes.io/fluentd-ds-ready=true; on nodes without that label no fluentd pod will come up.
Check whether the nodes in the cluster already have this label:
[root@master] ~/efk$ kubectl describe nodes master.hanli.com | grep -A 5 Labels    # print the matching line and the 5 lines after it
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=master.hanli.com
                    node-role.kubernetes.io/master=
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"aa:f0:02:e1:ec:13"}
                    flannel.alpha.coreos.com/backend-type: vxlan
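The same check can be done for every node at once; kubectl's -L flag prints the value of the given label key as an extra column:
kubectl get nodes -L beta.kubernetes.io/fluentd-ds-ready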
None of the nodes has the label yet, so add it to every node whose logs should be collected:
[root@master] ~/efk$ kubectl label nodes master.hanli.com beta.kubernetes.io/fluentd-ds-ready=true
node/master.hanli.com labeled
[root@master] ~/efk$ kubectl label nodes slave1.hanli.com beta.kubernetes.io/fluentd-ds-ready=true
node/slave1.hanli.com labeled
[root@master] ~/efk$ kubectl label nodes slave2.hanli.com beta.kubernetes.io/fluentd-ds-ready=true
node/slave2.hanli.com labeled
[root@master] ~/efk$ kubectl label nodes slave3.hanli.com beta.kubernetes.io/fluentd-ds-ready=true
node/slave3.hanli.com labeled
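If logs should be collected from every node in the cluster, the four commands above can be replaced with a single one:
kubectl label nodes --all beta.kubernetes.io/fluentd-ds-ready=true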
4) Deploy
[root@master] ~/efk$ kubectl create -f .
service/elasticsearch-logging created
serviceaccount/elasticsearch-logging created
clusterrole.rbac.authorization.k8s.io/elasticsearch-logging created
clusterrolebinding.rbac.authorization.k8s.io/elasticsearch-logging created
statefulset.apps/elasticsearch-logging created
configmap/fluentd-es-config-v0.2.0 created
serviceaccount/fluentd-es created
clusterrole.rbac.authorization.k8s.io/fluentd-es created
clusterrolebinding.rbac.authorization.k8s.io/fluentd-es created
daemonset.apps/fluentd-es-v2.4.0 created
deployment.apps/kibana-logging created
service/kibana-logging created
All of the resources are created in the kube-system namespace. Pulling the images may take a while; if a pull fails, you can also log in to the node in question and pull the image manually.
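If a pod stays in ImagePullBackOff, its events show exactly which image it is waiting for, and that image can then be pulled by hand on the node it was scheduled to. A sketch (the pod name is just an example taken from the listing below):
# See which image the pod is stuck on (check the Events section)
kubectl -n kube-system describe pod elasticsearch-logging-0 | grep -iE 'image|pull'
# Then, on that node:  docker pull <the image shown above>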
Check the pod status:
[root@master] ~/efk$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                        READY   STATUS              RESTARTS   AGE     IP                NODE               NOMINATED NODE   READINESS GATES
default       curl-66959f6557-r4crd                       1/1     Running             1          40h     10.244.2.7        slave2.hanli.com
default       nginx-58db6fdb58-5wt7p                      1/1     Running             0          4d22h   10.244.1.4        slave1.hanli.com
default       nginx-58db6fdb58-bhmcv                      1/1     Running             0          40h     10.244.2.8        slave2.hanli.com
kube-system   coredns-86c58d9df4-8s9ss                    1/1     Running             0          5d      10.244.1.2        slave1.hanli.com
kube-system   coredns-86c58d9df4-z6hw5                    1/1     Running             0          5d      10.244.1.3        slave1.hanli.com
kube-system   elasticsearch-logging-0                     0/1     ImagePullBackOff    0          6m29s   10.244.1.13       slave1.hanli.com
kube-system   etcd-master.hanli.com                       1/1     Running             0          5d      192.168.255.130   master.hanli.com
kube-system   fluentd-es-v2.4.0-4j72x                     0/1     ImagePullBackOff    0          6m30s   10.244.3.8        slave3.hanli.com
kube-system   fluentd-es-v2.4.0-hkdlf                     1/1     Running             0          6m30s   10.244.0.9        master.hanli.com
kube-system   fluentd-es-v2.4.0-nk5wf                     1/1     Running             0          6m30s   10.244.1.12       slave1.hanli.com
kube-system   fluentd-es-v2.4.0-q94ht                     0/1     CrashLoopBackOff    3          6m30s   10.244.2.14       slave2.hanli.com
kube-system   kibana-logging-764d446c7d-kcc76             0/1     ImagePullBackOff    0          6m30s   10.244.3.9        slave3.hanli.com
kube-system   kube-apiserver-master.hanli.com             1/1     Running             0          38h     192.168.255.130   master.hanli.com
kube-system   kube-controller-manager-master.hanli.com    1/1     Running             0          38h     192.168.255.130   master.hanli.com
kube-system   kube-flannel-ds-amd64-b4xqf                 1/1     Running             1          5d      192.168.255.121   slave1.hanli.com
kube-system   kube-flannel-ds-amd64-jk579                 1/1     Running             0          5d      192.168.255.130   master.hanli.com
kube-system   kube-flannel-ds-amd64-pkfcv                 1/1     Running             0          37h     192.168.255.123   slave3.hanli.com
kube-system   kube-flannel-ds-amd64-wx24x                 1/1     Running             0          5d      192.168.255.122   slave2.hanli.com
kube-system   kube-proxy-47dh9                            1/1     Running             0          37h     192.168.255.121   slave1.hanli.com
kube-system   kube-proxy-64qnx                            1/1     Running             0          37h     192.168.255.123   slave3.hanli.com
kube-system   kube-proxy-cbm26                            1/1     Running             0          37h     192.168.255.122   slave2.hanli.com
kube-system   kube-proxy-xnpnn                            1/1     Running             0          37h     192.168.255.130   master.hanli.com
kube-system   kube-scheduler-master.hanli.com             1/1     Running             0          38h     192.168.255.130   master.hanli.com
kube-system   kubernetes-dashboard-57df4db6b-wlwl4        1/1     Running             0          4d22h   10.244.2.2        slave2.hanli.com
kube-system   metrics-server-867cb8c5f4-p4nj5             1/1     Running             0          23h     10.244.1.9        slave1.hanli.com
monitoring    alertmanager-main-0                         2/2     Running             0          5h2m    10.244.1.10       slave1.hanli.com
monitoring    alertmanager-main-1                         0/2     ContainerCreating   0          6m29s                     slave3.hanli.com
monitoring    alertmanager-main-2                         2/2     Running             0          5h      10.244.0.8        master.hanli.com
monitoring    grafana-777cf74b98-v9czp                    1/1     Running             0          5h11m   10.244.3.6        slave3.hanli.com
monitoring    kube-state-metrics-66c5b5b6d4-twtsg         4/4     Running             0          136m    10.244.2.13       slave2.hanli.com
monitoring    node-exporter-klgfj                         2/2     Running             0          5h11m   192.168.255.130   master.hanli.com
monitoring    node-exporter-tgh4f                         2/2     Running             0          5h11m   192.168.255.123   slave3.hanli.com
monitoring    node-exporter-z24dz                         2/2     Running             0          5h11m   192.168.255.121   slave1.hanli.com
monitoring    node-exporter-z9pb8                         2/2     Running             0          5h11m   192.168.255.122   slave2.hanli.com
monitoring    prometheus-adapter-66fc7797fd-hhwms         1/1     Running             0          5h11m   10.244.2.11       slave2.hanli.com
monitoring    prometheus-k8s-0                            0/3     Pending             0          6m17s
monitoring    prometheus-k8s-1                            3/3     Running             0          5h1m    10.244.0.7        master.hanli.com
monitoring    prometheus-operator-7df4c46d5b-826gp        1/1     Running             0          5h11m   10.244.3.5        slave3.hanli.com
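Several pods in the listing above are not healthy yet. A sketch of the usual next steps (pod names taken from the output above):
# ImagePullBackOff: the Events section names the image / registry that is failing
kubectl -n kube-system describe pod kibana-logging-764d446c7d-kcc76
# CrashLoopBackOff: read the log of the previous, crashed container
kubectl -n kube-system logs fluentd-es-v2.4.0-q94ht --previous
# Overview of the fluentd DaemonSet roll-out
kubectl -n kube-system get ds fluentd-es-v2.4.0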
5) First expose the Kibana service through kubectl proxy:
kubectl proxy --address='192.168.255.130' --port=8086 --accept-hosts='^*$'
Then access Kibana at http://192.168.255.130:8086/api/v1/namespaces/kube-system/services/kibana-logging/proxy
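If kubectl proxy is inconvenient, port-forward works as well; a sketch, assuming the addon defaults of port 5601 on the kibana-logging Service and 9200 on elasticsearch-logging:
# Forward Kibana to the local machine, then open http://localhost:5601/
kubectl -n kube-system port-forward svc/kibana-logging 5601:5601 &
# Quick sanity check that Elasticsearch is up and already has log indices
kubectl -n kube-system port-forward svc/elasticsearch-logging 9200:9200 &
curl -s http://localhost:9200/_cat/indices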