When collecting logs with ELK, a Logstash layer is usually put in front to parse the raw logs into the format we want, which makes later statistics and analysis much easier. Patterns can be tested directly in the Kibana Dev Tools page (see the _simulate sketch after the first example below).
# This is the most basic way to parse: the pattern has to account for every field and every character, with nothing overlapping and nothing left out, or the match fails. --[T ] stands for a single space here; you can write a literal space instead, or use [T ] (a regex character class that matches 'T' or a space).
# Sample Data
2019-04-29 17:40:01.131 INFO [query,,,] 9623 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration
# Grok Pattern
%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel}[T ]\[%{GREEDYDATA:servername},,,\][T ]%{INT:pid}[T ]---[T ]\[%{GREEDYDATA:loop}\][T ]%{GREEDYDATA:log}
# Custom patterns (none needed here; every pattern referenced above is built in)
# Structured Data
{
"log": "c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration",
"loop": "trap-executor-0",
"loglevel": "INFO",
"servername": "query",
"pid": "9623",
"timestamp": "2019-04-29 17:40:01.131"
}
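If you'd rather test from Dev Tools itself instead of the Grok Debugger UI, the same pattern can be run through the ingest pipeline _simulate API. A minimal sketch (the only change is that backslashes have to be doubled inside the JSON string):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel}[T ]\\[%{GREEDYDATA:servername},,,\\][T ]%{INT:pid}[T ]---[T ]\\[%{GREEDYDATA:loop}\\][T ]%{GREEDYDATA:log}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "2019-04-29 17:40:01.131 INFO [query,,,] 9623 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration"
      }
    }
  ]
}

The _source of the returned doc should carry the same fields as the Structured Data block above.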
# Sample Data
2019-04-29 17:40:01.131 INFO [query,,,] 9623 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration
# Grok Pattern
%{TIME:timestamp} %{LOG2:log}
# Custom patterns
SSS [0-9]{3}
PID [0-9]+
TIME %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}.%{SSS}
LOG1 %{LOGLEVEL:loglevel}[T ]\[%{GREEDYDATA:servername},,,\][T ]%{PID:pid}[T ]---[T ]\[%{GREEDYDATA:loop}\][T ]%{GREEDYDATA:log}
LOG2 %{LOGLEVEL:loglevel}[T ]\[%{GREEDYDATA:servername},,,\][T ]%{PID:pid}[T ]---[T ]%{GREEDYDATA:message}
# Structured Data
{
"log": "INFO [query,,,] 9623 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration",
"loglevel": "INFO",
"servername": "query",
"pid": "9623",
"message": "[trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration",
"timestamp": "2019-04-29 17:40:01.131"
}
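The same test works for the custom-pattern version: the grok ingest processor takes the definitions through its pattern_definitions parameter, and a name that collides with a built-in (TIME here) overrides it for this processor. A minimal sketch:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{TIME:timestamp} %{LOG2:log}"],
          "pattern_definitions": {
            "SSS": "[0-9]{3}",
            "PID": "[0-9]+",
            "TIME": "%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}.%{SSS}",
            "LOG2": "%{LOGLEVEL:loglevel}[T ]\\[%{GREEDYDATA:servername},,,\\][T ]%{PID:pid}[T ]---[T ]%{GREEDYDATA:message}"
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "2019-04-29 17:40:01.131 INFO [query,,,] 9623 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration"
      }
    }
  ]
}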
# Suggestion: in the Logstash config, drop the log field and overwrite the message field.
# For example, the log line I want stored in ES should look like this:
# 2019-04-30 09:17:03.131 9623 --- [trap-executor-0] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration
# cat patterns
SSS [0-9]{3}
PID [0-9]+
TIME %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:%{MINUTE}:%{SECOND}.%{SSS}
LOG1 %{LOGLEVEL:loglevel}[T ]\[%{GREEDYDATA:servername},,,\][T ]%{PID:pid}[T ]---[T ]\[%{GREEDYDATA:loop}\][T ]%{GREEDYDATA:log}
LOG2 %{PID:pid}[T ]---[T ]%{GREEDYDATA:message}
# cat logstash.conf
...
grok {
  # Load the custom patterns (PID, LOG2, ...) defined in the local patterns file
  patterns_dir => ["./patterns"]
  # Parse timestamp, loglevel and servername, then let LOG2 take over;
  # %{LOG2:logtest} also stores the whole text matched by LOG2 into logtest
  match => ["message", "%{TIMESTAMP_ISO8601:logdate} %{LOGLEVEL:loglevel}[T ]\[%{GREEDYDATA:servername},,,\][T ]%{LOG2:logtest}"]
  # Rebuild the line in exactly the format we want to keep in ES
  add_field => {"logtext" => "%{logdate} %{logtest}"}
}
mutate {
  # Drop the intermediate fields; logtext now holds the final line
  remove_field => ["logdate","log","message"]
}
...
To recap: when parsing a log line with grok you cannot leave anything out. Every field and every character (including the ones that carry no useful data) has to be matched, and fields at the same level must not overlap. But when you want to extract a field whose content contains another field's content, a custom pattern captured under its own name does exactly that, as the sketch below shows.
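Here is a minimal sketch of that nesting trick, again via _simulate (the KV pattern and the pair/key/value field names are invented for this example): capturing a custom pattern under its own name keeps the whole matched text in one field while the captures inside it still produce their own fields.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{KV:pair}"],
          "pattern_definitions": {
            "KV": "%{WORD:key}=%{WORD:value}"
          }
        }
      }
    ]
  },
  "docs": [
    { "_source": { "message": "user=alice" } }
  ]
}

The result contains pair ("user=alice") as well as key ("user") and value ("alice"); the pair field's content fully contains the other two, which plain non-overlapping top-level captures cannot do.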