1.Dockerfile
FROM docker.elastic.co/logstash/logstash:5.5.2
# Custom input/output pipeline definitions
RUN rm -f /usr/share/logstash/pipeline/logstash.conf
ADD config/pipeline/ /usr/share/logstash/pipeline/
# For now (testing), do not replace the settings file
#ADD config/setting/ /usr/share/logstash/config/
2.build_image.sh
#!/bin/bash
VER="5.5.2"
docker build -t "dev.docker.mcc.cn:5000/logstash:${VER}" .
docker push dev.docker.mcc.cn:5000/logstash:${VER}
3. Pipeline configuration file: stdout.conf
Note: reads events from a file and prints them to the console. A quick way to test the pipeline follows the config.
input {
  file {
    path => "/var/log/glog/*"
    type => "file_2_console"
    start_position => "beginning"
  }
}
output {
  if [type] == "file_2_console" {
    stdout {
      codec => rubydebug
    }
  }
}
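Before baking the pipeline into a custom image, it can be sanity-checked by mounting it into the official image and watching the console output (the bind-mount paths below are assumptions based on the directory layout used in this post):
docker run --rm -it \
  -v "$(pwd)/config/pipeline/:/usr/share/logstash/pipeline/" \
  -v /var/log/glog/:/var/log/glog/ \
  docker.elastic.co/logstash/logstash:5.5.2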
1. Create the private image registry first
Note: otherwise the push fails with a "Repo not exist" error. Why the push succeeds once the image is tagged with the registry prefix still needs to be figured out. (A minimal registry sketch follows the commands below.)
docker login xxx
docker pull docker.elastic.co/logstash/logstash:5.5.2
docker tag docker.elastic.co/logstash/logstash:5.5.2 dev.docker.mcc.cn:5000/logstash:5.5.2
docker push dev.docker.mcc.cn:5000/logstash:5.5.2
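If no private registry is running yet, a minimal sketch using the stock registry image (assuming dev.docker.mcc.cn resolves to this host and port 5000 is free; a plain-HTTP registry also has to be listed under insecure-registries in the Docker daemon configuration):
docker run -d -p 5000:5000 --restart=always --name registry registry:2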
2. Build the image
root@node36:~/mcc/logstash# ./build.sh
3. Run the startup script start.sh
Note: mount the host's /var/log/glog directory into the container so that Logstash inside the container can read the host's log files. (Other services that log to files should have their logs written to, or mounted under, this directory.)
#!/bin/bash
docker run -d -v /var/log/glog/:/var/log/glog/ dev.docker.mcc.cn:5000/logstash:5.5.2
At this point Logstash is up and running.
root@node36:/var/log/glog# vi test.log    (create the file, save and exit)
Append some data to the file:
root@node36:/var/log/glog# echo "11111" >> test.log
docker logs -f xxxx
Test complete!
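For convenience, the container can be given a fixed name so the logs can be tailed without first looking up the container id (the --name value below is just an example, not part of the original start.sh):
docker run -d --name logstash-glog -v /var/log/glog/:/var/log/glog/ dev.docker.mcc.cn:5000/logstash:5.5.2
docker logs -f logstash-glog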
After that, every node just needs to pull the image and start the container.
config/setting/logstash.yml
Note: leaving the image's original settings file in place does no real harm, but Logstash keeps printing WARN logs, because X-Pack monitoring performs security authentication against Elasticsearch (Elasticsearch can be set up with a username/password).
https://www.elastic.co/guide/en/x-pack/current/monitoring-logstash.html#CO33-1
http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline
xpack.monitoring.elasticsearch.url: "http://10.6.1.38:9200"
xpack.monitoring.elasticsearch.username: "no"
xpack.monitoring.elasticsearch.password: "no"
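If the monitoring data is not needed at all, another option is to turn X-Pack monitoring off instead of pointing it at dummy credentials (assuming the X-Pack monitoring settings are available in this image):
xpack.monitoring.enabled: false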
file2es.conf
input {
  file {
    path => "/var/log/glog/*"
    type => "glog_file"
    start_position => "beginning"
  }
}
output {
  if [type] == "glog_file" {
    stdout {
      codec => rubydebug
    }
    elasticsearch {
      hosts => ["10.6.1.38:9200"]
      index => "mcc-test-log"
      manage_template => true
    }
  }
}
Note: adding workers => 1 to this output is reported as unsupported by the plugin, which causes the Docker container to fail to start.
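To confirm that events from file2es.conf actually reach Elasticsearch, the index can be queried directly (host and index name are taken from the config above; the query itself is just an example):
curl 'http://10.6.1.38:9200/mcc-test-log/_search?q=*:*&size=1&pretty'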
kafka2es.conf
Note: drop any event whose message is not JSON, and promote the parsed JSON fields to the top level of the event (the source data itself is JSON).
input {
  file {
    tags => "glog_file"
    path => "/var/log/glog/*"
  }
  kafka {
    tags => "kafka"
    bootstrap_servers => "kafka.gy.mcc.com:9092"
    topics => "backstage_logs"
    group_id => "logstash"
    auto_offset_reset => "earliest"
  }
}
# filters are applied in order
filter {
  if ([message] !~ "^{") {
    drop {}
  } else {
    mutate {
      remove_field => "@version"
      remove_field => "path"
      remove_field => "host"
      remove_field => "type"
    }
    json {
      source => "message"
      remove_field => "message"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
  if [tags] !~ "glog_file" {
    elasticsearch {
      hosts => ["10.6.1.38:9200"]
      index => "mcc-test-log"
      manage_template => true
      document_type => "%{tags}"
    }
  }
}
Elastic's latest version at the time of writing: Logstash 5.5.2
1. Errors when configuring the Kafka input in Logstash
[2017-09-07T07:05:31,064][ERROR][logstash.inputs.kafka ] Unknown setting 'zk_connect' for kafka
[2017-09-07T07:05:31,064][ERROR][logstash.inputs.kafka ] Unknown setting 'topic_id' for kafka
[2017-09-07T07:05:31,064][ERROR][logstash.inputs.kafka ] Unknown setting 'reset_beginning' for kafka
Solution:
In Logstash 5.0, zk_connect (the ZooKeeper host) was replaced by bootstrap_servers (the Kafka broker address), and topic_id by topics.
Reference:
https://stackoverflow.com/questions/40546649/logstash-kafka-input-not-working
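For reference, a sketch of the old option names next to their 5.x replacements (the old values are placeholders; the new ones are taken from the configs in this post; reset_beginning roughly corresponds to auto_offset_reset => "earliest"):
# Logstash 2.x style, no longer accepted by the 5.x kafka input:
#   zk_connect => "zookeeper.host:2181"
#   topic_id => "backstage_logs"
#   reset_beginning => true
# Logstash 5.x equivalent:
kafka {
  bootstrap_servers => "kafka.gy.mcc.com:9092"
  topics => "backstage_logs"
  group_id => "logstash"
  auto_offset_reset => "earliest"
}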
2. Valid JSON fails to be written to ES, with no error reported
Configuration:
input {
  kafka {
    type => "kafka"
    bootstrap_servers => "kafka.gy.mcc.com:9092"
    topics => "backstage_logs"
    group_id => "logstash"
    auto_offset_reset => "earliest"
  }
}
# filters are applied in order
filter {
  if ([message] !~ "^{") {
    drop {}
  } else {
    json {
      source => "message"
      #target => "jsoncontent"
    }
    mutate {
      remove_field => "message"
      remove_field => "@version"
    }
  }
}
output {
  if [type] == "kafka" {
    elasticsearch {
      hosts => ["10.6.1.38:9200"]
      index => "mcc-test-log"
      manage_template => true
    }
  }
}
Symptoms:
echo '{"test":{"b":"b"},"b":"b"}' >> /var/log/glog/test.log        # indexed successfully
echo '{"class":"test.class","line":"44","type":0}' >> /var/log/glog/test.log   # not indexed
Suspected cause: a duplicate type field. The input and output define type == "kafka", and my JSON body also contains a type field; it did not overwrite the existing one, so the event appeared to be dropped.
echo '{"class":"test.class","line":"44","ytype":0}' >> /var/log/glog/test.log  # indexed successfully
This confirms the duplicate field really is the cause; moreover the event is not actually dropped, it ends up in a different index/type.
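One way to see where such events end up is to inspect the index directly (host and index name come from the config above; these curl calls are only a diagnostic example): _mapping lists every _type that has been created under the index, and _search can locate the document wherever it was indexed.
curl 'http://10.6.1.38:9200/mcc-test-log/_mapping?pretty'
curl 'http://10.6.1.38:9200/mcc-test-log/_search?q=class:test.class&pretty'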
Solution:
input {
  file {
    tags => "glog_file"
    path => "/var/log/glog/*"
  }
  kafka {
    tags => "kafka"
    bootstrap_servers => "kafka.gy.mcc.com:9092"
    topics => "backstage_logs"
    group_id => "logstash"
    auto_offset_reset => "earliest"
  }
}
# filters are applied in order
filter {
  if ([message] !~ "^{") {
    drop {}
  } else {
    mutate {
      remove_field => "@version"
      remove_field => "path"
      remove_field => "host"
      remove_field => "type"
    }
    json {
      source => "message"
      remove_field => "message"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
  if [tags] !~ "glog_file" {
    elasticsearch {
      hosts => ["10.6.1.38:9200"]
      index => "mcc-test-log"
      manage_template => true
      document_type => "%{tags}"
    }
  }
}
1. Remove the automatically added type field and other useless fields (keeping only @timestamp is enough).
2. Set document_type => "xxx" explicitly (this is what ends up as the Elasticsearch _type).
Note:
Generally speaking, Beats and other sources carry a type field. If you want to preserve it, rename it to a different field first and then reference that field in document_type; in most cases document_type can simply be set to a concrete value. A minimal rename sketch follows.
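A minimal sketch of keeping the original type under another name and reusing it as document_type (the field name kept_type is only an example, not taken from the configs above):
filter {
  mutate {
    # keep the original type so it survives a later remove_field => "type"
    rename => { "type" => "kept_type" }
  }
}
output {
  elasticsearch {
    hosts => ["10.6.1.38:9200"]
    index => "mcc-test-log"
    document_type => "%{kept_type}"
  }
}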
Installation:
Running Logstash on Docker (official Elastic documentation)
Configuration reference:
ELKstack 中文指南 (the Chinese ELK Stack guide)
Logstash configuration syntax
Recommended:
https://doc.yonyoucloud.com/doc/logstash-best-practice-cn/index.html
ELK + Kafka setup and the problems encountered
Logstash explained with hands-on examples