A First Look at ELK: Filebeat Usage Notes
2016/9/22
Note: since I keep learning new things as I put this into practice, this article will be updated from time to time. If you run into a problem, refresh the page to check whether there is an update.
I. Installation

1. Download

There are two ways to get the package; caching the rpm in a local yum repository is the recommended one.

1) Fetch the rpm directly:

[root@vm49 ~]# curl -L -O https://download.elastic.co/beats/filebeat/filebeat-1.3.1-x86_64.rpm

2) Use the yum repository:

[root@vm49 ~]# rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
[root@vm49 ~]# vim /etc/yum.repos.d/beats.repo
[beats]
name=Elastic Beats Repository
baseurl=https://packages.elastic.co/beats/yum/el/$basearch
enabled=1
gpgkey=https://packages.elastic.co/GPG-KEY-elasticsearch
gpgcheck=1

[root@vm49 ~]# yum install filebeat
[root@vm49 ~]# chkconfig filebeat on

2. Configuration

[The default configuration]

[root@vm49 ~]# cat /etc/filebeat/filebeat.yml |grep -Ev '^(#|  #|    #|      #|        #|$)'
filebeat:
  prospectors:
    -
      paths:
        - /var/log/*.log
      input_type: log
  registry_file: /var/lib/filebeat/registry
output:
  elasticsearch:
    hosts: ["localhost:9200"]
shipper:
logging:
  files:
    rotateeverybytes: 10485760 # = 10MB

II. Usage

1. Test environment (services already deployed)

Client: 10.50.200.49  nginx (www.test.com, www.work.com)
Server: 10.50.200.220  logstash, elasticsearch, kibana

----------------------------------------------------
Note: the earlier tests in this series deployed logstash, redis and friends, so they have to be cleaned up first. In a fresh environment, skip this step.

[Client]
# mkdir /etc/logstash/bak
# mv /etc/logstash/conf.d/* /etc/logstash/bak/
# service logstash stop
# chkconfig logstash off

[Server]
# service redis stop
# chkconfig redis off
# mkdir /etc/logstash/bak
# mv /etc/logstash/conf.d/* /etc/logstash/bak/
# service logstash stop
----------------------------------------------------

2. Scenario 1: a single domain / N domains matched by a wildcard

Goal: collect the matching access logs and display them together.

[Client]
Input: filebeat
Output: logstash

[root@vm49 ~]# cat /etc/filebeat/filebeat.yml |grep -Ev '^(#|  #|    #|      #|        #|$)'
filebeat:
  prospectors:
    -
      paths:
        - /var/log/nginx/access_*.log
      input_type: log
      document_type: NginxAccess
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
    hosts: ["10.50.200.220:5044"]
shipper:
logging:
  to_files: true
  files:
    path: /var/log/filebeat
    name: filebeat
    rotateeverybytes: 10485760 # = 10MB

[root@vm49 ~]# service filebeat restart

[Server]
Input: logstash
Output: elasticsearch

Configure the custom pattern:

[root@vm220 ~]# mkdir -p /etc/logstash/patterns.d
[root@vm220 ~]# vim /etc/logstash/patterns.d/extra_patterns
NGINXACCESS %{IPORHOST:clientip} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{URIPATHPARAM:request} HTTP/%{NUMBER:httpversion}" (?:%{QS:content_type}|-) (?:%{QS:request_body}|-) (?:"(?:%{URI:referrer}|-)"|%{QS:referrer}) %{NUMBER:response} %{BASE16FLOAT:request_time} (?:%{NUMBER:bytes}|-)

Adjust the logstash configuration to enable the beats input:

[root@vm220 ~]# cat /etc/logstash/conf.d/filebeat.conf
input {
    beats {
        port => "5044"
    }
}
filter {
    if [type] =~ "NginxAccess" {
        grok {
            patterns_dir => ["/etc/logstash/patterns.d"]
            match => { "message" => "%{NGINXACCESS}" }
        }
        date {
            match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
            remove_field => [ "timestamp" ]
        }
    }
}
output {
    if [type] =~ "NginxAccess" {
        elasticsearch {
            hosts => ["10.50.200.218:9200", "10.50.200.219:9200", "10.50.200.220:9200"]
            manage_template => false
            index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
            document_type => "%{[@metadata][type]}"
        }
    }
}

[root@vm220 ~]# service logstash restart

Back in the kibana UI, fetch the data with the index name: filebeat-*

Result: as expected.
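A note on the pattern: NGINXACCESS above expects a customized nginx log format (it has content_type and request_body fields, and no user agent), and the post does not show the nginx side. A log_format along the following lines would produce entries the pattern can parse; the format name and exact variables are inferred from the grok fields, so treat this as an assumption rather than the actual config on vm49:

# hypothetical nginx log format, reconstructed from the NGINXACCESS grok fields
log_format beats '$remote_addr [$time_local] "$request" '
                 '"$content_type" "$request_body" "$http_referer" '
                 '$status $request_time $body_bytes_sent';
# e.g.: access_log /var/log/nginx/access_www.test.com_80.log beats;

A line written this way, such as

10.0.0.1 [22/Sep/2016:10:10:10 +0800] "GET /index.html HTTP/1.1" "text/html" "-" "-" 200 0.012 612

grok-parses into clientip, timestamp, verb/request/httpversion, content_type, request_body, referrer, response, request_time and bytes.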
3. Scenario 2: collecting N domains separately

Goal: collect the access logs of www.test.com and www.work.com and display them separately.

[Client]
Input: filebeat
Output: logstash

[root@vm49 ~]# cat /etc/filebeat/filebeat.yml |grep -Ev '^(#|  #|    #|      #|        #|$)'
filebeat:
  prospectors:
    -
      paths:
        - /var/log/nginx/access_www.test.com*.log
      input_type: log
      document_type: NginxAccess-www.test.com
    -
      paths:
        - /var/log/nginx/access_www.work.com*.log
      input_type: log
      document_type: NginxAccess-www.work.com
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
    hosts: ["10.50.200.220:5044"]
shipper:
logging:
  to_files: true
  files:
    path: /var/log/filebeat
    name: filebeat
    rotateeverybytes: 10485760 # = 10MB

[root@vm49 ~]# service filebeat restart

[Server]
Input: logstash
Output: elasticsearch

The custom pattern is the same as in scenario 1 (/etc/logstash/patterns.d/extra_patterns).

Adjust the logstash configuration: the filter now matches on the type prefix, and each type gets its own elasticsearch index:

[root@vm220 ~]# cat /etc/logstash/conf.d/filebeat.conf
input {
    beats {
        port => "5044"
    }
}
filter {
    if [type] =~ "NginxAccess-" {
        grok {
            patterns_dir => ["/etc/logstash/patterns.d"]
            match => { "message" => "%{NGINXACCESS}" }
        }
        date {
            match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
            remove_field => [ "timestamp" ]
        }
    }
}
output {
    if [type] == "NginxAccess-www.test.com" {
        elasticsearch {
            hosts => ["10.50.200.218:9200", "10.50.200.219:9200", "10.50.200.220:9200"]
            manage_template => false
            index => "%{[@metadata][beat]}-nginxaccess-www.test.com-%{+YYYY.MM.dd}"
            document_type => "%{[@metadata][type]}"
        }
    }
    if [type] == "NginxAccess-www.work.com" {
        elasticsearch {
            hosts => ["10.50.200.218:9200", "10.50.200.219:9200", "10.50.200.220:9200"]
            manage_template => false
            index => "%{[@metadata][beat]}-nginxaccess-www.work.com-%{+YYYY.MM.dd}"
            document_type => "%{[@metadata][type]}"
        }
    }
}

[root@vm220 ~]# service logstash restart

Back in the kibana UI, fetch the data with the index names:
filebeat-nginxaccess-www.test.com-*
filebeat-nginxaccess-www.work.com-*

Result: as expected.

III. Summary and FAQ

1. Data flow

-------------------------------------------------------------------------
|---------client--------|----------server-------------------------------|
                                      / elasticsearch(vm218)
log_files -> filebeat --> logstash -> - elasticsearch(vm219) -> kibana
                                      \ elasticsearch(vm220)
-------------------------------------------------------------------------

Two ways of scaling out are anticipated:

1) Adjust the output section of filebeat to point at several server-side logstash nodes:

---------------------------------------------------------------------------
|---------client--------|----------server---------------------------------|
                          / logstash    / elasticsearch(vm218)
log_files -> filebeat --> - logstash -> - elasticsearch(vm219) -> kibana
                          \ logstash    \ elasticsearch(vm220)
---------------------------------------------------------------------------

2) Adjust the output section of filebeat to go through a local logstash node that pushes the data into a message queue (MQ, e.g. redis); the server-side logstash nodes then pull from redis, run the filter stage, and store the result in the ES cluster (see the configuration sketch after the diagram):

-------------------------------------------------------------------------------------------------
|---------client---------------------|----------server------------------------------------------|
                                               / logstash    / elasticsearch(vm218)
log_files -> filebeat -> logstash --> redis -> - logstash -> - elasticsearch(vm219) -> kibana
                                               \ logstash    \ elasticsearch(vm220)
-------------------------------------------------------------------------------------------------
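A minimal sketch of what option 2 could look like, assuming redis runs on the server at 10.50.200.220:6379 with a list key named filebeat (host, port and key are illustrative, not taken from the deployment above). The client-side logstash ships beats events into redis, and the server-side logstash consumes them:

# client side, e.g. /etc/logstash/conf.d/shipper.conf (hypothetical)
input {
    beats {
        port => "5044"
    }
}
output {
    redis {
        host => "10.50.200.220"
        port => 6379
        data_type => "list"
        key => "filebeat"
    }
}

# server side, e.g. /etc/logstash/conf.d/indexer.conf (hypothetical)
input {
    redis {
        host => "10.50.200.220"
        port => 6379
        data_type => "list"
        key => "filebeat"
    }
}
# ... followed by the same filter and elasticsearch output blocks as in the scenarios above ...

The design point is that the redis list absorbs bursts, so the server-side indexers can be scaled out (or taken down briefly) independently of the clients.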
2. How does filebeat record the offset?

[root@vm49 ~]# cat /var/lib/filebeat/registry |python -mjson.tool
{
    "/var/log/nginx/access_www.test.com_80.log": {
        "FileStateOS": {
            "device": 64515,
            "inode": 918691
        },
        "offset": 6236304,
        "source": "/var/log/nginx/access_www.test.com_80.log"
    },
    "/var/log/nginx/access_www.work.com_80.log": {
        "FileStateOS": {
            "device": 64515,
            "inode": 918737
        },
        "offset": 6793272,
        "source": "/var/log/nginx/access_www.work.com_80.log"
    }
}

Usage example: while testing, if you want to re-collect the logs, simply delete this file and filebeat will read the logs again from the beginning.

[Client]
[root@vm49 ~]# service filebeat stop
[root@vm49 ~]# rm /var/lib/filebeat/registry

[Server]
[root@vm220 ~]# service logstash stop
[root@vm220 ~]# curl -XDELETE 'http://localhost:9200/filebeat-*?pretty'
{
  "acknowledged" : true
}

[Client]
[root@vm49 ~]# service filebeat start

[Server]
[root@vm220 ~]# service logstash start

Result: as expected.

3. What is the point of manage_template => false in logstash?

Is it so that a custom template can be used? To be verified. (My understanding so far: it tells the elasticsearch output not to install or overwrite an index template itself, which leaves template management to you; see the sketch after the references.)

ZYXW. References

1. Official documentation
https://www.elastic.co/guide/en/beats/filebeat/current/config-filebeat-logstash.html
https://www.elastic.co/guide/en/beats/libbeat/1.3/logstash-installation.html#logstash-setup
https://www.elastic.co/guide/en/beats/libbeat/1.3/setup-repositories.html
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-template.html#load-template-shell
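Following up on FAQ 3: the last reference above covers loading filebeat's index template into elasticsearch by hand. A minimal sketch, assuming the template file shipped with the filebeat 1.x rpm and pointing at one of the ES nodes from this setup (not verified here):

[root@vm49 ~]# curl -XPUT 'http://10.50.200.220:9200/_template/filebeat' \
    -d@/etc/filebeat/filebeat.template.json

With manage_template => false set in the logstash outputs above, whatever template is loaded this way (filebeat's own, or a customized copy) is what controls the mappings of the filebeat-* indices.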