Merging multiline logs with Filebeat

Original article: https://www.elastic.co/guide/en/beats/filebeat/current/_examples_of_multiline_configuration.html

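I. Examples of multiline configuration

1. Combining a Java stack trace into a single event
2. Combining C-style line continuations into a single event
3. Combining several lines from time-stamped events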

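II. Java stack traces

1. Java example 1
A Java stack trace consists of multiple lines, and every line after the first begins with whitespace, as in this example: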

Exception in thread "main" java.lang.NullPointerException
        at com.example.myproject.Book.getTitle(Book.java:16)
        at com.example.myproject.Author.getBookTitles(Author.java:25)
        at com.example.myproject.Bootstrap.main(Bootstrap.java:14)

To consolidate these lines into a single event in Filebeat, use the following multiline configuration:

multiline.pattern: '^[[:space:]]'
multiline.negate: false
multiline.match: after

This configuration appends every line that begins with whitespace to the previous line.
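
The behaviour of negate: false with match: after can be approximated in a few lines of Python. This is only a sketch of the merging rule, not Filebeat's implementation: merge_after is a hypothetical helper, the final "next log line" entry is made up for illustration, and Python's \s stands in for [[:space:]] because Python's re module has no POSIX character classes.

```python
import re

def merge_after(lines, pattern):
    """Append every line that matches `pattern` to the event before it."""
    regex = re.compile(pattern)
    events = []
    for line in lines:
        if events and regex.search(line):
            events[-1] += "\n" + line   # continuation line
        else:
            events.append(line)         # first line of a new event
    return events

log = [
    'Exception in thread "main" java.lang.NullPointerException',
    "        at com.example.myproject.Book.getTitle(Book.java:16)",
    "        at com.example.myproject.Author.getBookTitles(Author.java:25)",
    "        at com.example.myproject.Bootstrap.main(Bootstrap.java:14)",
    "next log line",                    # hypothetical line that starts a new event
]

events = merge_after(log, r"^\s")
print(len(events))                      # 2
```

The four stack-trace lines end up in the first event, and the unindented line starts a second one.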

2. Java example 2
Here is a slightly more complicated Java stack trace:

Exception in thread "main" java.lang.IllegalStateException: A book has a null property
       at com.example.myproject.Author.getBookIds(Author.java:38)
       at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
Caused by: java.lang.NullPointerException
       at com.example.myproject.Book.getId(Book.java:22)
       at com.example.myproject.Author.getBookIds(Author.java:35)
       ... 1 more

To consolidate these lines into a single event in Filebeat, use the following multiline configuration:

multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
multiline.negate: false
multiline.match: after

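This configuration does the following:

Appends every line that begins with whitespace followed by at or ... to the previous line
Also appends lines that begin with Caused by: to the previous line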
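
To see which lines such a pattern treats as continuations, each line of the trace can be tested on its own. The sketch below uses a Python approximation of the pattern (\s in place of [[:space:]]); the names CONTINUATION and WITHOUT_SPACE are introduced here for illustration. It requires whitespace between the at or ... token and the word boundary, because \b does not match between a dot and a space, so a variant without that whitespace would leave "... 1 more" out of the event.

```python
import re

# Python's re has no POSIX classes, so \s stands in for [[:space:]].
CONTINUATION = re.compile(r"^\s+(at|\.{3})\s+\b|^Caused by:")
# Variant with \b directly after the token, kept only for comparison.
WITHOUT_SPACE = re.compile(r"^\s+(at|\.{3})\b|^Caused by:")

trace = [
    'Exception in thread "main" java.lang.IllegalStateException: A book has a null property',
    "       at com.example.myproject.Author.getBookIds(Author.java:38)",
    "       at com.example.myproject.Bootstrap.main(Bootstrap.java:14)",
    "Caused by: java.lang.NullPointerException",
    "       at com.example.myproject.Book.getId(Book.java:22)",
    "       at com.example.myproject.Author.getBookIds(Author.java:35)",
    "       ... 1 more",
]

# True means "continuation line, append it to the previous line".
flags = [bool(CONTINUATION.search(line)) for line in trace]
print(flags)

# The comparison variant fails on the last line of the trace.
print(bool(WITHOUT_SPACE.search(trace[-1])))
```

Only the first line is classified as the start of an event; everything else, including the Caused by: line, is attached to it.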

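III. C-style logs

Several programming languages use a backslash (\) character at the end of a line to indicate that the line continues, as in this example: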

printf ("%10.10ld  \t %10.10ld \t %s\
  %f", w, x, y, z );

To consolidate these lines into a single event in Filebeat, use the following multiline configuration:

multiline.pattern: '\\$'
multiline.negate: false
multiline.match: before

This configuration merges any line that ends with the \ character with the line that follows it.
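
With match: before, a matching line is held back and joined to the line that comes after it. A minimal Python sketch of that rule follows; merge_before is a hypothetical helper rather than anything Filebeat exposes, and the trailing "return 0;" line is made up to show where the next event begins.

```python
import re

def merge_before(lines, pattern):
    """Join every line that matches `pattern` with the line after it."""
    regex = re.compile(pattern)
    events, pending = [], []
    for line in lines:
        pending.append(line)
        if not regex.search(line):      # no trailing backslash: event is complete
            events.append("\n".join(pending))
            pending = []
    if pending:                         # input ended on a continued line
        events.append("\n".join(pending))
    return events

source = [
    'printf ("%10.10ld  \\t %10.10ld \\t %s\\',
    '  %f", w, x, y, z );',
    "return 0;",                        # hypothetical following line
]

events = merge_before(source, r"\\$")
print(len(events))                      # 2
```

The two halves of the printf call form one event, and the following line is emitted separately.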

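IV. Timestamps

Activity logs from services such as Elasticsearch typically begin with a timestamp, followed by information about the specific activity, as in this example: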

[2015-08-24 11:49:14,389][INFO ][env                      ] [Letha] using [1] data paths, mounts [[/
(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]

To consolidate these lines into a single event in Filebeat, use the following multiline configuration:

multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
multiline.negate: true
multiline.match: after

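This configuration uses the negate: true and match: after settings to specify that any line that does not match the pattern belongs to the previous line.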
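
The effect of negate can be modelled by inverting the match result before deciding whether a line continues the previous event. The sketch below assumes match: after throughout; merge_multiline is a hypothetical helper, and the third log line is invented to show where the next event starts.

```python
import re

def merge_multiline(lines, pattern, negate=False):
    """Model of match: after. A line continues the previous event
    when (regex matches) differs from `negate`."""
    regex = re.compile(pattern)
    events = []
    for line in lines:
        continues = bool(regex.search(line)) != negate
        if events and continues:
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events

log = [
    "[2015-08-24 11:49:14,389][INFO ][env                      ] [Letha] using [1] data paths, mounts [[/",
    "(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]",
    "[2015-08-24 11:49:15,001][INFO ][node                     ] [Letha] started",  # hypothetical
]

events = merge_multiline(log, r"^\[[0-9]{4}-[0-9]{2}-[0-9]{2}", negate=True)
print(len(events))                      # 2
```

Lines that start with a timestamp open a new event, and the line without one is folded into the event before it.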

V. Application events

Sometimes your application logs contain events that begin and end with custom markers, as in the following example:

[2015-08-24 11:49:14,389] Start new event
[2015-08-24 11:49:14,395] Content of processing something
[2015-08-24 11:49:14,399] End event

To consolidate this into a single event in Filebeat, use the following multiline configuration:

multiline.pattern: 'Start new event'
multiline.negate: true
multiline.match: after
multiline.flush_pattern: 'End event'

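This configuration merges the lines from one that matches the pattern (Start new event) through one that matches the flush_pattern (End event) into a single event. The pattern marks the first line of an event, and the flush_pattern marks its last line.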
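
The interplay between the start pattern and the flush pattern can be sketched as follows. This is a simplified model under the assumption that an event is emitted as soon as a line matches the flush pattern; merge_with_flush is a hypothetical helper, and the last two log lines are invented to show a second event.

```python
import re

def merge_with_flush(lines, start, flush):
    """Group lines from a start marker up to and including a flush marker."""
    start_re, flush_re = re.compile(start), re.compile(flush)
    events, current = [], []
    for line in lines:
        if start_re.search(line) and current:
            events.append("\n".join(current))   # a new start closes any open event
            current = []
        current.append(line)
        if flush_re.search(line):
            events.append("\n".join(current))   # end marker: emit immediately
            current = []
    if current:                                 # leftover lines without an end marker
        events.append("\n".join(current))
    return events

log = [
    "[2015-08-24 11:49:14,389] Start new event",
    "[2015-08-24 11:49:14,395] Content of processing something",
    "[2015-08-24 11:49:14,399] End event",
    "[2015-08-24 11:49:15,000] Start new event",   # hypothetical second event
    "[2015-08-24 11:49:15,010] End event",
]

events = merge_with_flush(log, "Start new event", "End event")
print(len(events))                              # 2
```

Without the flush pattern, the first event would only be known to be complete once the next Start new event line (or a timeout) arrived.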

Notes:
1. Reference for a detailed description of the fields
2. after in multiline.match is equivalent to previous in Logstash, and before is equivalent to next
3. Logstash multiline matching example

input {
    file {
        path => "/var/log/message"
        stat_interval => "10"
        start_position => "beginning"
        codec => multiline {
            pattern => "^\[\d{2}-"
            negate => true
            what => "previous"
        }
    }
}

what determines whether a merged line belongs to the previous event or the next event. Its value can be next or previous.

VI. Sample configuration files used in production

1. Filebeat configuration file for collecting logs from each application module

filebeat.inputs:
- type: log
  paths:
    - /data/logs/company/logs/*.log
  exclude_files: ['\.gz$','INFO']
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
  tags: ["company"]

- type: log
  paths:
    - /data/logs/store/logs/*.log
  exclude_files: ['\.gz$','INFO']
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
  tags: ["store"]

- type: log
  paths:
    - /data/logs/pos/logs/*.log
  exclude_files: ['\.gz$','INFO']
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
  tags: ["pos"]

output.logstash:
  hosts: ["192.168.0.144:5044"]
  enabled: true
  worker: 2
  compression_level: 3

2. Logstash receives the logs from Filebeat and writes them to Redis

input {
  beats {
    port => "5044"
  }
}
output {
  if "company" in [tags] {
    redis {
      host => "192.168.0.112"
      port => "6379"
      db => "3"
      key => "company"
      data_type => "list"
      password => "123456"
    }
  }
  if "store" in [tags] {
    redis {
      host => "192.168.0.112"
      port => "6379"
      db => "3"
      key => "store"
      data_type => "list"
      password => "123456"
    }
  }
  if "pos" in [tags] {
    redis {
      host => "192.168.0.112"
      port => "6379"
      db => "3"
      key => "pos"
      data_type => "list"
      password => "123456"
    }
  }
}

3. Logstash reads the logs from Redis and writes them to ES

input {
  redis {
    host => "192.168.0.112"
    port => "6379"
    db => "3"
    key => "company"
    data_type => "list"
    password => "123456"
    type => "company"
  }
  redis {
    host => "192.168.0.112"
    port => "6379"
    db => "3"
    key => "store"
    data_type => "list"
    password => "123456"
    type => "store"
  }
  redis {
    host => "192.168.0.112"
    port => "6379"
    db => "3"
    key => "pos"
    data_type => "list"
    password => "123456"
    type => "pos"
  }
}
output {
  if [type] == "company" {
    elasticsearch {
      hosts => ["192.168.0.117:9200","192.168.0.118:9200","192.168.0.119:9200"]
      index => "logstash-company-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "store" {
    elasticsearch {
      hosts => ["192.168.0.117:9200","192.168.0.118:9200","192.168.0.119:9200"]
      index => "logstash-store-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "pos" {
    elasticsearch {
      hosts => ["192.168.0.117:9200","192.168.0.118:9200","192.168.0.119:9200"]
      index => "logstash-pos-%{+YYYY.MM.dd}"
    }
  }
}
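
The index setting above produces one index per day. The %{+YYYY.MM.dd} part is filled in from the event's @timestamp, which Logstash keeps in UTC. A small Python sketch of the resulting name follows; daily_index is a hypothetical helper, and the sample timestamp is borrowed from the log examples earlier in this article.

```python
from datetime import datetime, timezone

def daily_index(prefix, timestamp):
    """Build an index name shaped like <prefix>-%{+YYYY.MM.dd}, using UTC."""
    day = timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")
    return f"{prefix}-{day}"

event_time = datetime(2015, 8, 24, 11, 49, 14, tzinfo=timezone.utc)
name = daily_index("logstash-company", event_time)
print(name)                             # logstash-company-2015.08.24
```

Because the date is taken in UTC, events logged shortly before or after local midnight may land in the neighbouring day's index.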

VII. Supplement (collecting JSON-formatted logs with Filebeat)

cat > json-log.yml << END
#filebeat.prospectors:
filebeat.inputs:
- type: log
  enabled: true
  json.keys_under_root: true      # place the decoded JSON keys at the top level of the event
  json.overwrite_keys: true       # decoded JSON values overwrite conflicting fields added by Filebeat
  paths:
    - /var/log/nginx/access.log   # path of the log file to collect
  exclude_lines: ['^DBG',"^$",".gz$"]
  fields:
    log_topics: nginx-access-log  # custom field that labels this log

output.logstash:
  hosts: ["192.168.0.117:5044"]  # address and port of the Logstash service
END

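Notes:

JSON logs can also be decoded with the decode_json_fields processor under processors, which plays a role similar to a filter in Logstash. The format is as follows: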

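processors:
- decode_json_fields:
    fields: ['message']    # fields to decode
    target: ""             # field that receives the decoded JSON; if empty (""), it is written to the top level of the event
    overwrite_keys: false  # whether a decoded key overwrites a field of the same name already present in the original event (one record passing through Filebeat is one event); default: false
    process_array: false   # whether arrays are decoded; default: false
    max_depth: 1           # maximum decoding depth; default: 1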
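
How target and overwrite_keys interact is easier to see on a concrete event. The function below is a rough Python model of the behaviour described in the comments above, not Filebeat's code: decode_json_field and the sample event are hypothetical, and process_array and max_depth are not modelled.

```python
import json

def decode_json_field(event, field="message", target="", overwrite_keys=False):
    """Rough model of decoding a JSON string field into the event."""
    try:
        decoded = json.loads(event[field])
    except (KeyError, TypeError, ValueError):
        return event                      # leave the event untouched on failure
    if not isinstance(decoded, dict):
        return event
    result = dict(event)
    if target:                            # non-empty target: nest under that key
        result[target] = decoded
        return result
    for key, value in decoded.items():    # empty target: merge at the top level
        if overwrite_keys or key not in result:
            result[key] = value
    return result

event = {"message": '{"status": 200, "host": "web-1"}', "host": "filebeat-node"}

kept = decode_json_field(event)                           # host stays "filebeat-node"
replaced = decode_json_field(event, overwrite_keys=True)  # host becomes "web-1"
nested = decode_json_field(event, target="nginx")         # decoded keys under "nginx"
print(kept["host"], replaced["host"], nested["nginx"]["status"])
```

With the defaults, new keys such as status are added while an existing host field is preserved. Setting overwrite_keys to true lets the decoded value win instead.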
