ELK Learning Notes


Related resources:

Installation packages:

    elasticsearch download: https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.2.0/elasticsearch-2.2.0.tar.gz

    logstash download: https://download.elastic.co/logstash/logstash/logstash-2.2.1.tar.gz

    kibana download: https://download.elastic.co/kibana/kibana/kibana-4.4.1-linux-x64.tar.gz
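
Each tarball installs the same way; a minimal sketch for elasticsearch (the /usr/local prefix is an assumption, chosen to match the paths used later in this post):

wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.2.0/elasticsearch-2.2.0.tar.gz
tar -zxf elasticsearch-2.2.0.tar.gz -C /usr/local/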

Useful links:

    grok debugger, for validating grok expressions: https://grokdebug.herokuapp.com/

    Elasticsearch docs (Chinese): http://wiki.jikexueyuan.com/project/elasticsearch-definitive-guide-cn/

    ELK docs (Chinese): http://kibana.logstash.es/

    kibana search syntax: https://segmentfault.com/a/1190000002972420

    Elasticsearch configuration: https://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html


Officially recommended Elasticsearch settings:

Check command:

curl localhost:9200/_nodes/stats/process?pretty
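
The node info API also reports whether memory locking succeeded; a quick check, assuming a node listening locally on 9200:

curl localhost:9200/_nodes/process?pretty | grep mlockall
# expect "mlockall" : true once bootstrap.mlockall has taken effect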

System configuration (see the combined sketch after this list):

    1. Modify JAVA_OPTS to set the heap size.

    2. Raise ulimit -n to 65535 or above.

    3. sysctl -w vm.max_map_count=262144

    4. Disable swap, or set swappiness to 0.

    5. Set bootstrap.mlockall: true, and point JNA at a dedicated tmp directory at startup: ./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir
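
Taken together, these settings might be applied like this before starting the node (a sketch for Linux; the 4g heap via ES_HEAP_SIZE is an example value and is the env-var equivalent of setting it in JAVA_OPTS):

export ES_HEAP_SIZE=4g               # heap size, picked up by bin/elasticsearch in ES 2.x
ulimit -n 65536                      # file descriptor limit for this shell
sysctl -w vm.max_map_count=262144    # mmap count for the kernel
swapoff -a                           # or: sysctl -w vm.swappiness=0
./bin/elasticsearch -Djna.tmpdir=/path/to/new/dir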


filebeat configuration:

filebeat:
  registry_file: "/usr/local/filebeat/.filebeat"
  prospectors:
    -
      paths:
        - /usr/local/nginx/logs/access.log
      document_type: type1
    -
      paths:
        - /usr/local/nginx/logs/error.log
      document_type: type2
    -
      paths:
        - /usr/local/nginx/logs/other.log
      document_type: type3
output:
  logstash:
    hosts: ["192.168.241.130:5044"]
    worker: 1
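
With the config saved, filebeat can be run in the foreground to watch it ship events (-e logs to stderr, -c points at the config; paths assume the install directory used in the registry_file above):

/usr/local/filebeat/filebeat -e -c /usr/local/filebeat/filebeat.yml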


logstash configuration:

input {
    beats {
        port => 5044
        # ssl => false
        # ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
        # ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
    }
}

filter {
    if [type] == "type1" {
       grok {
           patterns_dir => "./patterns"
           match => { "message" => "%{NGINXLOG}" }
       }
       geoip {
            source => "remote_addr"
            target => "geoip"
            database =>"/usr/local/logstash/geoip/GeoLiteCity.dat"
            add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
            add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
       }
       # The date filter below uses the log's own time as the event timestamp; by default the import time is used.
       date {
           locale => "en"
           match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
       }
       mutate {
           remove_field => "timestamp"
       }
    }
    else if [type] == "type2" {
        # To parse the JSON payload, first split the non-JSON prefix out of the log line.
        grok {
            patterns_dir => "./patterns/"
            match => {"message" => "%{DATETIME:timestamp} \[%{WORD:level}\] %{GREEDYDATA:jsonmessage}"}
        }
        json {
            source => "jsonmessage"
            remove_field => ["jsonmessage"]
        }
    }
    else if [type] == "type3" {
        # Handle multi-line log entries
        multiline {
            pattern => "^\d{4}\/\d{2}\/\d{2} \d{2}\:\d{2}\:\d{2}"
            negate => true
            what => previous
        }
        grok {
            patterns_dir => "./patterns/"
            match => {"message" => "%{LUALOG}"}
        }
        # Events that fail the match above fall through to here
        if "_grokparsefailure" in [tags] {
            grok {
                patterns_dir => "./patterns"
                match => { "message" => "%{NGINX_ERROR_LOG}"}
            }
            date {
                locale => "en"
                match => [ "timestamp", "yyyy/MM/dd HH:mm:ss", "yyyy/MMM/dd HH:mm:ss" ] # match several possible formats
            }
            mutate {
                remove_field => "timestamp"
            }
            geoip {
                 source => "client"
                 target => "geoip"
                 database =>"/usr/local/logstash/geoip/GeoLiteCity.dat"
                 add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
                 add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}"  ]
            }
        }
    }
}

output {
    elasticsearch {
        hosts => ["192.168.241.130"]
        index => "logstash-%{type}-%{+YYYY.MM.dd}"
        document_type => "%{type}"
        workers => 2
        flush_size => 20000
        idle_flush_time => 10
    }
}

  1. filebeat tags each prospector with a type; logstash branches on it to tell the logs apart.

  2. If the log format is complex, layered parsing such as the grok + json combination above can be used in the filter.

  3. if blocks can be nested; use if "_grokparsefailure" in [tags] to catch events that the previous grok failed to match.

  4. Use the log's own time as the event timestamp; by default the import time is used.
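
Before deploying, the pipeline can be syntax-checked and then run in the foreground (logstash 2.x flags; the config file name is an example):

./bin/logstash -f logstash.conf --configtest
./bin/logstash -f logstash.conf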

logstash patterns configuration:

# COMMON
DATETIME \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}

# FOR NGINX
NGINX_ERROR_DATE_TIME %{YEAR}\/%{MONTHNUM}\/%{MONTHDAY} %{TIME}
NGINXLOG \[%{HTTPDATE:timestamp}\] %{NUMBER:request_time} %{IPORHOST:remote_addr} %{INT:status} %{INT:body_bytes_sent} "%{WORD:method} %{URIPATH:path}(?:%{URIPARAM:param})? HTTP/%{NUMBER:httpversion}" %{QS:http_referer} %{QS:http_user_agent} %{QS:cookie}
NGINX_ERROR_LOG (?<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[%{DATA:err_severity}\] (%{NUMBER:pid:int}#%{NUMBER}: \*%{NUMBER}|\*%{NUMBER}) %{DATA:err_message}(?:, client: (?<client>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:server})(?:, request: %{QS:request})?(?:, host: %{QS:client_ip})?(?:, referrer: \"%{URI:referrer}\")?

# FOR JAVA
JAVACLASS (?:[a-zA-Z0-9-]+\.)+[A-Za-z0-9$]+
JAVALOGMESSAGE (.*)
JAVALOGMESSAGE_NONGREEDY (.*?)
CATALINA_EXEC %{WORD}-%{WORD}-%{WORD}
TOMCAT_DATESTAMP %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND})
TOMCAT_ERROR %{TOMCAT_DATESTAMP:timestamp} \[%{CATALINA_EXEC:catalina_exec}\] %{LOGLEVEL:level} %{JAVACLASS:class} - %{JAVALOGMESSAGE_NONGREEDY:logmessage}\n%{JAVALOGMESSAGE_NONGREEDY:logmessage}\n\t%{GREEDYDATA:fulllogmessage}

# FOR openresty lua logs (capture names such as lua_file are placeholders)
LUALOG (?<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[%{LOGLEVEL:severity}\] %{POSINT:pid}#%{NUMBER}: %{GREEDYDATA:errormessage}\n(?<lua_file>\S+ \S+):\n(?<lua_line>\S+ \d):\s+(?<lua_c>\[C]: in function 'require')?\s+?(?<lua_trace>.*>), (client: (?<client>%{IP}|%{HOSTNAME}))(?:, server: %{IPORHOST:server})(?:, request: %{QS:request})(?:, host: %{QS:client_ip})

The error log format is not fixed, which makes it awkward to handle; the approach above only copes with some of the error logs.
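
On the access log side, here is a hypothetical nginx log_format that would produce lines matching the NGINXLOG pattern above (the format name elk is made up; the field order must match the pattern exactly):

log_format elk '[$time_local] $request_time $remote_addr $status $body_bytes_sent '
               '"$request" "$http_referer" "$http_user_agent" "$http_cookie"';
access_log /usr/local/nginx/logs/access.log elk;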

Elasticsearch configuration:

cluster.name: my-application
bootstrap.mlockall: true
network.host: 192.168.241.130
http.port: 9200
security.manager.enabled: false # must be false when connecting to Hadoop
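
With the config in place, start the node in the background (the -d flag daemonizes in ES 2.x):

./bin/elasticsearch -d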

kibana configuration:

elasticsearch.url: "http://192.168.241.130:9200"
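
kibana 4.x runs as its own node process; one way to start it and keep it running after logout (the log path is an example):

nohup ./bin/kibana > /var/log/kibana.log 2>&1 &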


Common commands:

Delete expired elasticsearch data:

curl -XDELETE 'http://192.168.241.130:9200/logstash-2016.02.24'
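
For routine cleanup, a minimal cron-able sketch that drops daily indices past a 7-day retention (assumes GNU date; the host, retention, and 30-day scan window are example values, and the index wildcard matches the logstash-%{type}-%{+YYYY.MM.dd} naming used above):

#!/bin/sh
for i in $(seq 7 30); do
    day=$(date -d "-${i} days" +%Y.%m.%d)
    curl -XDELETE "http://192.168.241.130:9200/logstash-*-${day}"
done

The elasticsearch-curator tool covers the same job more robustly.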


Pitfalls:

    1. Don't edit a watched file with vim: saving rewrites the file under a new inode, so content that already existed gets re-sent to logstash.

    2. filebeat's SSL verification currently appears to be broken.

Reposted from: https://my.oschina.net/MaTech/blog/615751
