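Original article: https://www.elastic.co/guide/en/beats/filebeat/current/_examples_of_multiline_configuration.html
1. Combining Java stack traces into a single event
2. Combining C-style line continuations into a single event
3. Handling multiline events that begin with a timestamp
1. Java example 1
A Java stack trace consists of multiple lines, and every line after the first begins with whitespace, as in this example: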
Exception in thread "main" java.lang.NullPointerException
    at com.example.myproject.Book.getTitle(Book.java:16)
    at com.example.myproject.Author.getBookTitles(Author.java:25)
    at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
To consolidate these lines into a single event in Filebeat, use the following multiline configuration:
multiline.pattern: '^[[:space:]]'
multiline.negate: false
multiline.match: after
This configuration merges every line that begins with whitespace into the previous line.
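The effect of negate: false combined with match: after can be illustrated with a short Python sketch. It is not Filebeat's implementation: the helper merge_after and the trailing sample line are hypothetical, and Python's \s stands in for the POSIX class [[:space:]], which Python's re module does not support.

```python
import re

def merge_after(lines, pattern, negate=False):
    """Append every continuation line to the event that precedes it.

    A line counts as a continuation when it matches the pattern
    (or, with negate=True, when it does not match).
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        is_continuation = bool(regex.search(line)) != negate
        if is_continuation and events:
            events[-1].append(line)   # belongs to the previous event
        else:
            events.append([line])     # starts a new event
    return ["\n".join(event) for event in events]

lines = [
    'Exception in thread "main" java.lang.NullPointerException',
    '    at com.example.myproject.Book.getTitle(Book.java:16)',
    '    at com.example.myproject.Author.getBookTitles(Author.java:25)',
    '    at com.example.myproject.Bootstrap.main(Bootstrap.java:14)',
    'Next log line (hypothetical)',
]

events = merge_after(lines, r'^\s')
for event in events:
    print(repr(event))
```

In this run the exception line and its three indented frames end up in one event, and the unrelated last line becomes a second event.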
2. Java example 2
The following Java stack trace is a slightly more complicated example:
Exception in thread "main" java.lang.IllegalStateException: A book has a null property
    at com.example.myproject.Author.getBookIds(Author.java:38)
    at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
Caused by: java.lang.NullPointerException
    at com.example.myproject.Book.getId(Book.java:22)
    at com.example.myproject.Author.getBookIds(Author.java:35)
    ... 1 more
To consolidate these lines into a single event in Filebeat, use the following multiline configuration:
multiline.pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
multiline.negate: false
multiline.match: after
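This configuration works as follows:
Lines that begin with whitespace followed by the word at or by ... are appended to the previous line.
Lines that begin with Caused by: are appended to the previous line as well.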
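Which lines count as continuations can be checked directly against the regular expression. The Python sketch below compares two spellings of the pattern: one with \b placed directly after (at|\.{3}), and one with whitespace required before \b. The names LOOSE and STRICT are hypothetical, Python's \s stands in for [[:space:]], and the results are those of Python's re module, so treat them as an approximation of Filebeat's regex engine.

```python
import re

# \b needs a word character on one side, so "..." followed by a space
# does not satisfy it; requiring whitespace first moves \b in front of "1".
LOOSE = re.compile(r'^\s+(at|\.{3})\b|^Caused by:')
STRICT = re.compile(r'^\s+(at|\.{3})\s+\b|^Caused by:')

lines = [
    'Exception in thread "main" java.lang.IllegalStateException: A book has a null property',
    '    at com.example.myproject.Author.getBookIds(Author.java:38)',
    'Caused by: java.lang.NullPointerException',
    '    ... 1 more',
]

def continuation_flags(regex, lines):
    """Return True for each line that would be appended to the previous one."""
    return [bool(regex.search(line)) for line in lines]

loose_flags = continuation_flags(LOOSE, lines)
strict_flags = continuation_flags(STRICT, lines)
print(loose_flags)   # [False, True, True, False]
print(strict_flags)  # [False, True, True, True]
```

With the first spelling the "... 1 more" line is not recognized as a continuation, which is why the whitespace before \b matters.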
Some programming languages use a backslash (\) at the end of a line to indicate that the line continues, as in this example:
printf ("%10.10ld \t %10.10ld \t %s\
%f", w, x, y, z );
To consolidate these lines into a single event in Filebeat, use the following multiline configuration:
multiline.pattern: '\\$'
multiline.negate: false
multiline.match: before
This configuration merges any line that ends with the \ character with the line that follows it.
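With match: before the direction is reversed: a matching line is held back and joined to the line that follows it. A minimal Python sketch of that behavior, assuming a hypothetical helper merge_before and a hypothetical third sample line, could look like this:

```python
import re

def merge_before(lines, pattern, negate=False):
    """Hold back every line that signals a continuation and join it
    with the lines that follow, up to the first line that does not."""
    regex = re.compile(pattern)
    events, pending = [], []
    for line in lines:
        pending.append(line)
        continues = bool(regex.search(line)) != negate
        if not continues:
            events.append("\n".join(pending))  # event is complete
            pending = []
    if pending:
        events.append("\n".join(pending))      # flush a trailing continuation
    return events

lines = [
    'printf ("%10.10ld \\t %10.10ld \\t %s\\',  # ends with a single backslash
    '%f", w, x, y, z );',
    'return 0;',                                 # hypothetical unrelated line
]

events = merge_before(lines, r'\\$')
for event in events:
    print(repr(event))
```

The two printf lines are emitted as one event, and the unrelated line stays on its own.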
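Activity logs from services such as Elasticsearch typically begin with a timestamp, followed by information about the specific activity, as in this example: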
[2015-08-24 11:49:14,389][INFO ][env ] [Letha] using [1] data paths, mounts [[/
(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]
To consolidate these lines into a single event in Filebeat, use the following multiline configuration:
multiline.pattern: '^\[[0-9]{4}-[0-9]{2}-[0-9]{2}'
multiline.negate: true
multiline.match: after
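This configuration uses the negate: true and match: after settings to specify that any line that does not match the pattern belongs to the previous line.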
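Put differently, with negate: true the pattern marks the first line of each event. The Python sketch below illustrates this for the timestamp pattern; the function group_by_timestamp and the third sample line are hypothetical, and the code only approximates what Filebeat does.

```python
import re

TIMESTAMP = re.compile(r'^\[[0-9]{4}-[0-9]{2}-[0-9]{2}')

def group_by_timestamp(lines):
    """Start a new event at every line that begins with a timestamp and
    append every other line to the event before it."""
    events = []
    for line in lines:
        if TIMESTAMP.search(line) or not events:
            events.append(line)          # matching line starts a new event
        else:
            events[-1] += "\n" + line    # non-matching line joins the previous one
    return events

lines = [
    '[2015-08-24 11:49:14,389][INFO ][env ] [Letha] using [1] data paths, mounts [[/',
    '(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]',
    '[2015-08-24 11:49:15,000][INFO ][node ] [Letha] hypothetical next entry',
]

events = group_by_timestamp(lines)
print(len(events))
```

The wrapped second line is attached to the first entry, so the three input lines yield two events.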
Sometimes your application logs contain events that begin and end with custom markers, as in the following example:
[2015-08-24 11:49:14,389] Start new event
[2015-08-24 11:49:14,395] Content of processing something
[2015-08-24 11:49:14,399] End event
To consolidate this into a single event in Filebeat, use the following multiline configuration:
multiline.pattern: 'Start new event'
multiline.negate: true
multiline.match: after
multiline.flush_pattern: 'End event'
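This configuration starts an event at a line containing Start new event and, through flush_pattern, ends it at a line containing End event, so all the lines in between are merged into one event.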
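The role of flush_pattern is easiest to see when another line follows the end marker. The Python sketch below is a simplified model under that assumption, not Filebeat's code; group_with_flush and the fourth sample line are hypothetical.

```python
import re

def group_with_flush(lines, start=r'Start new event', flush=r'End event'):
    """Open an event at a line matching `start`, keep appending the
    following lines, and close the event at a line matching `flush`."""
    start_re, flush_re = re.compile(start), re.compile(flush)
    events, current = [], []
    for line in lines:
        if start_re.search(line) and current:
            events.append("\n".join(current))  # a new start closes the old event
            current = []
        current.append(line)
        if flush_re.search(line):
            events.append("\n".join(current))  # flush marker closes the event
            current = []
    if current:
        events.append("\n".join(current))
    return events

lines = [
    '[2015-08-24 11:49:14,389] Start new event',
    '[2015-08-24 11:49:14,395] Content of processing something',
    '[2015-08-24 11:49:14,399] End event',
    '[2015-08-24 11:49:15,001] Unrelated line (hypothetical)',
]

events = group_with_flush(lines)
print(len(events))
```

Without the flush marker, the unrelated fourth line would not match the start pattern and would therefore be appended to the first event; with it, the event is closed right after End event.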
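Notes:
1. For a detailed description of each field, see the reference.
2. In multiline.match, after has the same meaning as previous in Logstash, and before has the same meaning as next.
3. Logstash multiline example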
input {
  file {
    path => "/var/log/message"
    stat_interval => "10"
    start_position => "beginning"
    codec => multiline {
      pattern => "^\[\d{2}-"
      negate => true
      what => "previous"
    }
  }
}
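what determines whether a merged line belongs to the previous event or the next event; its value can be previous or next.
1. Filebeat configuration file for collecting logs from each module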
filebeat.inputs:
- type: log
  paths:
    - /data/logs/company/logs/*.log
  exclude_files: ['.gz$','INFO']
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
  tags: ["company"]
- type: log
  paths:
    - /data/logs/store/logs/*.log
  exclude_files: ['.gz$','INFO']
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
  tags: ["store"]
- type: log
  paths:
    - /data/logs/pos/logs/*.log
  exclude_files: ['.gz$','INFO']
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after
  tags: ["pos"]
output.logstash:
  hosts: ["192.168.0.144:5044"]
  enabled: true
  worker: 2
  compression_level: 3
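The exclude_files entries are regular expressions that are matched against file paths. As a rough check of what ['.gz$','INFO'] would exclude, the Python sketch below tests a few hypothetical paths; the file names and the helper is_excluded are made up for illustration, and Python's re module is only an approximation of Filebeat's regex engine.

```python
import re

EXCLUDE_FILES = [re.compile(p) for p in ['.gz$', 'INFO']]

def is_excluded(path):
    """Return True when any exclude pattern matches somewhere in the path."""
    return any(regex.search(path) for regex in EXCLUDE_FILES)

# Hypothetical file names under one of the configured directories.
paths = [
    '/data/logs/company/logs/error.log',
    '/data/logs/company/logs/error.log.gz',
    '/data/logs/company/logs/INFO-2015-08-24.log',
]
excluded = [is_excluded(p) for p in paths]
print(excluded)

# The unescaped dot matches any character, so '.gz$' also matches a name
# ending in "_gz"; escaping the dot restricts it to a literal ".gz".
loose_hit = bool(re.search('.gz$', 'archive_gz'))
strict_hit = bool(re.search(r'\.gz$', 'archive_gz'))
print(loose_hit, strict_hit)
```

If only files with a literal .gz extension should be skipped, writing the pattern as '\.gz$' is the stricter choice.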
2. Logstash receives the logs from Filebeat and writes them to Redis
input {
  beats {
    port => "5044"
  }
}
output {
  if "company" in [tags] {
    redis {
      host => "192.168.0.112"
      port => "6379"
      db => "3"
      key => "company"
      data_type => "list"
      password => "123456"
    }
  }
  if "store" in [tags] {
    redis {
      host => "192.168.0.112"
      port => "6379"
      db => "3"
      key => "store"
      data_type => "list"
      password => "123456"
    }
  }
  if "pos" in [tags] {
    redis {
      host => "192.168.0.112"
      port => "6379"
      db => "3"
      key => "pos"
      data_type => "list"
      password => "123456"
    }
  }
}
3. Logstash reads the logs from Redis and writes them to Elasticsearch
input {
  redis {
    host => "192.168.0.112"
    port => "6379"
    db => "3"
    key => "company"
    data_type => "list"
    password => "123456"
    type => "company"
  }
  redis {
    host => "192.168.0.112"
    port => "6379"
    db => "3"
    key => "store"
    data_type => "list"
    password => "123456"
    type => "store"
  }
  redis {
    host => "192.168.0.112"
    port => "6379"
    db => "3"
    key => "pos"
    data_type => "list"
    password => "123456"
    type => "pos"
  }
}
output {
  if [type] == "company" {
    elasticsearch {
      hosts => ["192.168.0.117:9200","192.168.0.118:9200","192.168.0.119:9200"]
      index => "logstash-company-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "store" {
    elasticsearch {
      hosts => ["192.168.0.117:9200","192.168.0.118:9200","192.168.0.119:9200"]
      index => "logstash-store-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "pos" {
    elasticsearch {
      hosts => ["192.168.0.117:9200","192.168.0.118:9200","192.168.0.119:9200"]
      index => "logstash-pos-%{+YYYY.MM.dd}"
    }
  }
}
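The %{+YYYY.MM.dd} part of each index name is filled in from the event's @timestamp, which Logstash keeps in UTC, so every type gets one index per UTC day. The Python sketch below mimics that naming under this assumption; the function daily_index and the sample timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def daily_index(event_type, timestamp):
    """Build an index name like logstash-<type>-YYYY.MM.dd from the
    UTC date of the event timestamp."""
    utc = timestamp.astimezone(timezone.utc)
    return "logstash-{}-{}".format(event_type, utc.strftime("%Y.%m.%d"))

# 23:30 on 2015-08-24 at UTC-5 is already 2015-08-25 in UTC.
local_time = datetime(2015, 8, 24, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
index_name = daily_index("company", local_time)
print(index_name)  # logstash-company-2015.08.25
```

An event logged late in the evening in a zone behind UTC therefore lands in the next day's index.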
cat > json-log.yml << END
#filebeat.prospectors:
filebeat.inputs:
- type: log
  enabled: true
  json.keys_under_root: true #place the decoded JSON keys at the top level of the event
  json.overwrite_keys: true #let decoded JSON keys overwrite fields added by Filebeat
  paths:
    - /var/log/nginx/access.log #path of the log file to collect
  exclude_lines: ['^DBG',"^$",".gz$"]
  fields:
    log_topics: nginx-access-log #custom field used to label the log
output.logstash:
  hosts: ["192.168.0.117:5044"] #address and port of the Logstash service to send to
END
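The json.keys_under_root setting decides where the decoded keys end up in the event. The Python sketch below shows the difference on a hypothetical JSON access-log line; the function place_json and the sample fields are assumptions for illustration, not Filebeat code.

```python
import json

def place_json(line, base_fields, keys_under_root):
    """Decode one JSON log line and attach it to the event either at the
    top level or under a separate "json" key."""
    decoded = json.loads(line)
    event = dict(base_fields)
    if keys_under_root:
        event.update(decoded)     # decoded keys sit next to the other fields
    else:
        event["json"] = decoded   # decoded keys stay grouped under "json"
    return event

line = '{"status": 200, "request": "GET /index.html"}'   # hypothetical nginx JSON line
base = {"fields": {"log_topics": "nginx-access-log"}}

at_root = place_json(line, base, keys_under_root=True)
nested = place_json(line, base, keys_under_root=False)
print(at_root)
print(nested)
```

With keys_under_root: true, downstream filters can refer to status directly instead of going through a json prefix.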
Notes:
The JSON can also be parsed with the decode_json_fields processor under processors, which plays a role similar to a filter in Logstash. Its format is as follows:
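processors:
  - decode_json_fields:
      fields: ['message'] #fields to parse
      target: "" #field to place the decoded JSON under; if empty (""), it is placed at the top level of the event
      overwrite_keys: false #whether a decoded key overwrites a field of the same name that already exists in the original event (one record transferred by Filebeat is one event); default: false
      process_array: false #whether arrays are decoded; default: false
      max_depth: 1 #maximum decoding depth; default: 1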
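How target and overwrite_keys interact can be sketched in a few lines of Python. This is a simplified model that ignores process_array and max_depth; the function decode_json_fields below is a hypothetical stand-in for the processor, and the sample event is made up.

```python
import json

def decode_json_fields(event, fields, target="", overwrite_keys=False):
    """Decode JSON strings found in `fields` and merge the result into
    the event root (target == "") or store it under `target`."""
    result = dict(event)
    for field in fields:
        try:
            decoded = json.loads(result[field])
        except (KeyError, TypeError, ValueError):
            continue                      # leave the event untouched on bad input
        if not isinstance(decoded, dict):
            continue
        if target == "":
            for key, value in decoded.items():
                if key not in result or overwrite_keys:
                    result[key] = value   # existing keys survive unless overwriting
        else:
            result[target] = decoded
    return result

event = {"message": '{"status": 200, "host": "from-json"}', "host": "web-1"}

kept = decode_json_fields(event, ["message"])
overwritten = decode_json_fields(event, ["message"], overwrite_keys=True)
nested = decode_json_fields(event, ["message"], target="parsed")
print(kept["host"], overwritten["host"], nested["parsed"]["host"])
```

With the default overwrite_keys: false, the host value already present in the event is kept, while status is added because it did not exist before.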