1. Task scenario and goal
The goal is to validate an ingest pipeline definition file.
The log format is:
line number|timestamp|process ID|thread ID|log level|message content
Example:
2|2018-11-28,10:50:06.792978|6719|140737353873600|WARN|***DKDD
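For the sample line above, the pipe-separated fields break down as follows:
lineno: 2
timestamp: 2018-11-28,10:50:06.792978
pid: 6719
tid: 140737353873600
log level: WARN
message: ***DKDD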
2. Steps
The steps are as follows:
Save the following content to a file, e.g. /home/liujg/dev/crush-backend-cpp/crush/gateway/bin/Debug/pipeline.json
{
"description": "test-pipeline",
"processors": [{
"grok": {
"field": "message",
"patterns": ["%{NUMBER:lineno}\\|%{MY_TIMESTAMP:my_timestamp}\\|%{PID:pid}\\|%{TID:tid}\\|%{LOGLEVEL:log_level}\\|%{GREEDYDATA:message}"],
"pattern_definitions": {
"DATE_ZH": "%{YEAR}-%{MONTHNUM2}-%{MONTHDAY}",
"TIME_MS": "%{TIME}.\\d{6}",
"MY_TIMESTAMP": "%{DATE_ZH},%{TIME_MS}",
"PID": "%{NUMBER}",
"TID":"%{NUMBER}"
}
}
}]
}
patterns: the grok pattern used to match each log line.
pattern_definitions: custom patterns. DATE_ZH is a date in "yyyy-MM-dd" format; TIME_MS is the time-of-day format including the microsecond fraction.
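As a quick check, the custom patterns expand as follows (YEAR, MONTHNUM2, MONTHDAY and TIME are standard grok patterns), so MY_TIMESTAMP matches the timestamp in the sample line:
MY_TIMESTAMP = %{DATE_ZH},%{TIME_MS}
             = %{YEAR}-%{MONTHNUM2}-%{MONTHDAY},%{TIME}.\d{6}
which matches "2018-11-28,10:50:06.792978"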
curl -H'Content-Type: application/json' -XPUT 'http://localhost:9200/_ingest/pipeline/test-pipeline' -d@/home/liujg/dev/crush-backend-cpp/crush/gateway/bin/Debug/pipeline.json
http://localhost:9200 is the Elasticsearch host and port.
test-pipeline is the name of the pipeline being created.
The path after -d@ is the pipeline file.
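Optionally, the stored pipeline can be fetched back as a sanity check; this GET request is a standard Elasticsearch ingest API and is not part of the required steps:
curl -XGET 'http://localhost:9200/_ingest/pipeline/test-pipeline?pretty'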
Next, build a test document containing the log line above and run it through the pipeline's _simulate endpoint.
curl -H'Content-Type: application/json' -XPOST 'http://localhost:9200/_ingest/pipeline/test-pipeline/_simulate' -d'
{
"docs": [{
"_index": "my-test-log",
"_type": "log",
"_id": "AVpsUYR_du9kwoEnKsSA",
"_score": 1,
"_source": {
"@timestamp": "2017-03-31T18:22:25.981Z",
"beat": {
"hostname": "my think",
"name": "RestReviews",
"version": "5.1.1"
},
"input_type": "log",
"message": "2|2018-11-28,10:50:06.792978|6719|140737353873600|WARN|***DKDD",
"offset": 3,
"source": "/home/liujg/dev/crush-backend-cpp/crush/gateway/bin/Debug/1.log",
"tags": [
"debug",
"reviews"
],
"type": "log"
}
}]
}'
The target index name is my-test-log.
The message field corresponds to the field setting in the pipeline and holds the raw log line.
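If the grok pattern fails to match and you need per-processor results, the simulate API also accepts a verbose flag (a standard Elasticsearch query parameter); the request is the same as above with ?verbose=true appended to the URL:
curl -H'Content-Type: application/json' -XPOST 'http://localhost:9200/_ingest/pipeline/test-pipeline/_simulate?verbose=true' -d'...same docs body as above...'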
The response to the simulate request is as follows:
{
"docs": [{
"doc": {
"_index": "my-test-log",
"_type": "log",
"_id": "AVpsUYR_du9kwoEnKsSA",
"_source": {
"offset": 3,
"my_timestamp": "2018-11-28,10:50:06.792978",
"input_type": "log",
"log_level": "WARN",
"pid": "6719",
"source": "/home/liujg/dev/crush-backend-cpp/crush/gateway/bin/Debug/1.log",
"message": "***DKDD",
"type": "log",
"tid": "140737353873600",
"tags": ["debug", "reviews"],
"@timestamp": "2017-03-31T18:22:25.981Z",
"lineno": "2",
"beat": {
"name": "RestReviews",
"version": "5.1.1",
"hostname": "my think"
}
},
"_ingest": {
"timestamp": "2018-12-04T09:24:27.236Z"
}
}
}]
}
The structured fields extracted from the log line are: lineno, my_timestamp, pid, tid, log_level, and message.
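To apply the pipeline when actually indexing log lines rather than just simulating, an index request can name it through the standard pipeline query parameter. A minimal sketch, assuming the same host and the log document type shown above (on Elasticsearch 7+ replace log with _doc):
curl -H'Content-Type: application/json' -XPOST 'http://localhost:9200/my-test-log/log?pipeline=test-pipeline' -d'
{
"message": "2|2018-11-28,10:50:06.792978|6719|140737353873600|WARN|***DKDD"
}'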
Parsing csv files with Filebeat and Elasticsearch Ingest Pipelines
https://www.objectrocket.com/blog/how-to/elasticsearch-ingest-csv/
Grok patterns
https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns