Logstash Notes

Logstash


Main configuration file
input {
    beats {
        port => 5044
        type => "log_5044"
    }
   # file {
   #     path => "/usr/share/logstash/pipeline/logs/0527test.txt"
   #     type => "log_5044"
   #     start_position => "beginning"
   # }
}
filter {
    if [type] == "log_5044"{
        # Drop unneeded fields, then split a field to extract the project name
        mutate{
            add_field => { "temp" => "%{[attrs][tag]}" }
            # Remove Beats-specific fields
            remove_field => ["@version","tags","time","stream","attrs"]
        }
        mutate{
            split => { "temp" => "." }
            add_field => { "prj_name" => "%{[temp][0]}" }
            remove_field => ["temp"]
        }
        # Add prj_status so output can filter data sources; a special flag that separates business logs from system logs
        ruby{
            code => "
                prj_name = event.get('prj_name')
                all_prj  = ['citydew-admin']
                # Mark whitelisted projects so the output section can filter on prj_status
                event.set('prj_status', all_prj.include?(prj_name) ? 'PASS' : 'NO_PASS')
                # Flag business logs by their 'This is Logs:' marker
                log = event.get('log')
                event.set('log_type', log.to_s.include?('This is Logs:'))
            "
        }
        if [log_type] {
            grok { # Use grok to parse the whole log line into key-value fields
                patterns_dir => ["/usr/share/logstash/config/my_patterns"]
                match => {
                    "log" => "%{TIMESTAMP_ISO8601:logdate}%{SPACE}%{LOGLEVEL:loglevel}%{SPACE}%{RM_SELF_DEF:logdel}%{SPACE}This is Logs:%{JAVALOGMESSAGE:businesslog}"
                }
            }
            # Parse the JSON payload
            json {
                source => "businesslog"
                target => "logmsg"
            }
            # Replace the Kibana query time with the log's own print time; otherwise
            # Kibana keeps its own ingest time, which makes later queries inconvenient
            date {
                match => [ "logdate", "MMM dd yyyy HH:mm:ss", "MMM  d yyyy HH:mm:ss", "ISO8601" ]
                target => "time1"
                locale => "cn"
                timezone => "Asia/Shanghai"
            }
            # Clean up the output
            mutate {
                remove_field => ["logdel","businesslog"]
                gsub => ["[logmsg][sys_type]","Dew",""]
            }
        }
    }

}

output {
   if [prj_status] == "PASS" {
    elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "%{prj_name}-%{+YYYY.MM.dd}"
    }
    stdout { codec => rubydebug }
   }
}
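The ruby filter's classification logic can be exercised outside Logstash before deploying the pipeline. A minimal sketch: the Logstash Event API is stubbed with a plain Hash, and the sample field values are hypothetical:

```ruby
# Stand-in for the Logstash event: a plain Hash (hypothetical sample values).
event = { 'prj_name' => 'citydew-admin',
          'log'      => '2021-05-27 10:00:00 INFO 95: This is Logs:{"msg":"ok"}' }

# Whitelist check driving the output section's [prj_status] == "PASS" condition.
all_prj = ['citydew-admin']
event['prj_status'] = all_prj.include?(event['prj_name']) ? 'PASS' : 'NO_PASS'

# Business-log flag: true when the line carries the 'This is Logs:' marker.
event['log_type'] = event['log'].to_s.include?('This is Logs:')

puts event['prj_status']  # PASS
puts event['log_type']    # true
```

Using `Array#include?` instead of the original for-loop avoids the bug where a later non-matching project name overwrites an earlier 'PASS'; `to_s` guards against events that carry no log field.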

Contents of my_patterns:

RM_SELF_DEF [0-9][^:]:
SELF_SPACE \s+
GET_BUSINESSLOG (Logs begin){1}.(Logs end){1}
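Custom grok patterns like RM_SELF_DEF are plain regular expressions, so they can be sanity-checked with Ruby's Regexp before being wired into grok. The sample strings below are hypothetical:

```ruby
# RM_SELF_DEF from my_patterns: one digit, one non-colon character, then a colon.
RM_SELF_DEF = /[0-9][^:]:/

puts RM_SELF_DEF.match?('95:')   # true  - digit, non-colon char, colon
puts RM_SELF_DEF.match?('9::')   # false - second character is a colon
```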


Details
  1. Regular expressions: grok is a key Logstash filter plugin. Its built-in patterns can parse common Java, Tomcat, Nginx, and similar logs, and when the official patterns are not enough you can define your own (see the config above). A Grok Debugger makes testing patterns fast. The official debugger at http://grokdebug.herokuapp.com/ may not be reachable without a proxy; for a local Grok Debugger setup see https://www.jianshu.com/p/11dc3ada022e. When unsure which pattern to use, the Grok Debugger's Discover feature can suggest matches automatically.

  2. The ruby plugin: when regular expressions cannot express what you need, you can write Ruby code directly to parse and process logs, either inline in the ruby block or by loading a prepared .rb file. See the official documentation for details.

  3. mutate has many built-in operations: updating field values, trimming leading/trailing whitespace, case conversion, character substitution, and more. Keep them in mind when you need them.
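mutate's string operations map directly onto Ruby's String methods, which makes them easy to reason about. For example, the `gsub => ["[logmsg][sys_type]","Dew",""]` and the split on "." in the config above behave like the following (the field values are hypothetical):

```ruby
# gsub in the config strips the "Dew" prefix from [logmsg][sys_type]:
sys_type = 'DewAdmin'.gsub('Dew', '')
puts sys_type    # Admin

# split + add_field extract the project name from a dotted tag value:
temp = 'citydew-admin.prod'.split('.')
puts temp[0]     # citydew-admin
```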


References

[logstash mutate]: https://www.elastic.co/guide/en/logstash/6.4/plugins-filters-mutate.html#plugins-filters-mutate-replace "Official mutate filter API docs"
[grok debugger]: https://www.jianshu.com/p/11dc3ada022e "Local Grok Debugger installation"
[grok patterns]: https://github.com/logstash-plugins/logstash-patterns-core/tree/master/patterns "grok core pattern sources"
[logstash intro]: https://www.cntofu.com/book/52/input/README.md "A simple Logstash primer"
