1. Use Case

I have recently been researching log platform solutions and ultimately chose the popular ELK stack, i.e. the combination of three open-source tools: Elasticsearch, Logstash, and Kibana. Elasticsearch handles log search, Logstash handles log collection, filtering, and processing, and Kibana provides the web interface for viewing logs. The most important first step is to understand how Logstash works.


2. Introduction to Logstash

Logstash is a tool for receiving, processing, and forwarding logs. It can handle all kinds of logs, including system logs, web server logs such as Apache, Nginx, and Tomcat logs, and various application logs.


3. Basic Usage of Logstash

Logstash is written in Ruby and runs on JRuby, so the only prerequisite for running it is a Java runtime.

Install Java on CentOS:

yum -y install java-1.7.0-openjdk*

$ java -version

java version "1.7.0_75"

OpenJDK Runtime Environment (rhel-2.5.4.0.el6_6-x86_64 u75-b13)

OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)


wget https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz

tar zxvf logstash-1.4.2.tar.gz

cd logstash-1.4.2


Run bin/logstash agent --help to see the available options.

-e  takes the configuration directly on the command line instead of pointing to a configuration file with -f, which is handy for quick tests.


Run this on the command line:

$ bin/logstash -e 'input {stdin {} } output {stdout {} }'


Then type some text:

$ bin/logstash -e 'input {stdin {} } output {stdout {} }'

hello world

2015-01-31T12:02:20.438+0000 xxxxx hello world


Here Logstash reads input from stdin and writes output to stdout: after you type hello world, Logstash prints the processed event to the screen.


$ bin/logstash -e 'input {stdin {} } output {stdout { codec => rubydebug  } }'
goodnight moon
{
      "message" => "goodnight moon",
      "@version" => "1",
        "@timestamp" => "2015-01-31T12:09:38.564Z",
      "host" => "xxxx-elk-log"
}


Storing logs in Elasticsearch

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.2.zip

unzip elasticsearch-1.4.2.zip

cd elasticsearch-1.4.2

./bin/elasticsearch


Note that the Logstash and Elasticsearch versions must match.

$ bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost } }'
you know,for logs

Here Logstash takes input from the terminal and sends the result to Elasticsearch. Now verify that Elasticsearch received the data from Logstash:

$ curl 'http://localhost:9200/_search?pretty'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "W6HMXGx2Tw25sTX7OwZPug",
      "_score" : 1.0,
      "_source":{"message":"you know,for logs","@version":"1","@timestamp":"2015-01-31T12:43:53.630Z","host":"jidong-elk-log"}
    } ]
  }
}


You can also browse the data Logstash has written using the elasticsearch-kopf plugin.

Install it as follows:

bin/plugin -install lmenezes/elasticsearch-kopf

Then open http://localhost:9200/_plugin/kopf in a browser.


Using multiple outputs

$ bin/logstash -e 'input { stdin {} } output { elasticsearch { host => localhost } stdout {} }'
multiple outputs
2015-01-31T13:03:43.426+0000 jidong-elk-log multiple outputs


Here the keyboard input is sent both to Elasticsearch and to the screen.


$ curl 'http://localhost:9200/_search?pretty'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "W6HMXGx2Tw25sTX7OwZPug",
      "_score" : 1.0,
      "_source":{"message":"you know,for logs","@version":"1","@timestamp":"2015-01-31T12:43:53.630Z","host":"jidong-elk-log"}
    }, {
      "_index" : "logstash-2015.01.31",
      "_type" : "logs",
      "_id" : "kMXKoQglQNCDYOEyOmAnhg",
      "_score" : 1.0,
      "_source":{"message":"multiple outputs","@version":"1","@timestamp":"2015-01-31T13:03:43.426Z","host":"jidong-elk-log"}
    } ]
  }
}


By default, Logstash creates one Elasticsearch index per day, named after the date, e.g. logstash-2015.01.31.
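The index name comes from the elasticsearch output's index setting, whose default is the daily date pattern; it can be overridden. A minimal sketch (the weekly pattern here is only an illustration):

```
output {
  elasticsearch {
    host => localhost
    # default is "logstash-%{+YYYY.MM.dd}"; this sketch switches to one index per week
    index => "logstash-%{+xxxx.ww}"
  }
}
```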


The life of a Logstash event

Inputs, filters, codecs, and outputs are the core of a Logstash configuration.


Inputs feed log data into Logstash. The most commonly used plugins are:


file: reads log data from a file

syslog: listens on port 514 by default, receives syslog messages, and parses them according to the RFC3164 format

redis: reads log data from Redis; in a centralized Logstash deployment, Redis typically acts as a broker that buffers logs sent by Logstash agents or other shippers

lumberjack: receives logs sent via the lumberjack protocol, now implemented by logstash-forwarder
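Several inputs can be declared side by side in one input block; a minimal sketch (the file path is hypothetical):

```
input {
  file {
    path => "/var/log/messages"   # hypothetical path: tail a local log file
  }
  syslog {
    port => 514                   # receive RFC3164 syslog messages
  }
}
```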


Filters process Logstash events according to matching conditions. The most commonly used plugins are:

grok: parses arbitrary text and gives it structure

mutate: modifies events: add, remove, rename, replace, or change fields

drop: discards matching events entirely

clone: makes a copy of an event

geoip: adds geographical location information for an IP address
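Filters run in the order they appear, so they can be chained. A hedged sketch combining three of the plugins above (the conditional test is illustrative):

```
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }   # structure the raw line first
  }
  geoip {
    source => "clientip"   # look up the location of the IP that grok extracted
  }
  if [request] == "/favicon.ico" {
    drop {}                # silently discard uninteresting events
  }
}
```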


Outputs are the final stage of the Logstash pipeline. A single event can go to multiple outputs. Commonly used plugins include:

elasticsearch: writes event data to Elasticsearch

file: writes event data to a file on disk
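Since an event can have multiple outputs, both plugins can be used together; a sketch (the archive path is illustrative):

```
output {
  elasticsearch { host => localhost }
  file {
    path => "/tmp/archive-%{+YYYY.MM.dd}.log"   # illustrative: one plain-text archive per day
  }
}
```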


Codecs are stream filters that can be attached to an input or an output; common codecs include plain and json.
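For example, attaching the json codec to an input tells Logstash to decode each line as JSON rather than treating it as plain text; a minimal sketch:

```
input {
  stdin { codec => json }   # parse each incoming line as a JSON object
}
output {
  stdout { codec => json }  # emit events as JSON instead of the default line format
}
```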



Using a configuration file

conf/logstash-simple.conf


input {
  stdin {}
}
output {
  elasticsearch {
    host => localhost
  }
  stdout {
    codec => rubydebug
  }
}


$ sudo bin/logstash -f conf/logstash-simple.conf 
config file
{
       "message" => "config file",
      "@version" => "1",
    "@timestamp" => "2015-02-01T02:38:15.347Z",
          "host" => "xxxxxx"
}


curl 'http://localhost:9200/_search?pretty'

  "_index" : "logstash-2015.02.01",
      "_type" : "logs",
      "_id" : "NW2e8LdWSwuNE-aJZNtd-w",
      "_score" : 1.0,
      "_source":{"message":"config file","@version":"1","@timestamp":"2015-02-01T02:38:15.347Z","host":"xxxxxx"}
    } ]
  }



Testing a filter

$ cat conf/logstash-filter.conf 
input {
  stdin {}
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    host => localhost
  }
  stdout {
    codec => rubydebug
  }
}


Enter a line of Apache log at the prompt:

$ sudo bin/logstash -f conf/logstash-filter.conf 
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
{
        "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
       "@version" => "1",
     "@timestamp" => "2013-12-11T08:01:45.000Z",
           "host" => "xxxxxxx",
       "clientip" => "127.0.0.1",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "11/Dec/2013:00:01:45 -0800",
           "verb" => "GET",
        "request" => "/xampp/status.php",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "3891",
       "referrer" => "\"http://cadenza/xampp/navi.php\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""
}



      "_index" : "logstash-2013.12.11",
      "_type" : "logs",
      "_id" : "QusW5lY5T8a9wqgCcottnA",
      "_score" : 1.0,
      "_source":{"message":"127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\
" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gec
ko/20100101 Firefox/25.0\"","@version":"1","@timestamp":"2013-12-11T08:01:45.000Z","host":"jidong-elk-lo
g","clientip":"127.0.0.1","ident":"-","auth":"-","timestamp":"11/Dec/2013:00:01:45 -0800","verb":"GET","
request":"/xampp/status.php","httpversion":"1.1","response":"200","bytes":"3891","referrer":"\"http://ca
denza/xampp/navi.php\"","agent":"\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 
Firefox/25.0\""}



Case 1: Processing Apache logs with Logstash

$ cat /tmp/access_log 
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"


$ cat conf/logstash-apache.conf 
input {
  file {
    path => "/tmp/access_log"
    start_position => beginning
  }
}

filter {
  if [path] =~ "access" {
    mutate {
      replace => { "type" => "apache_access" }
    }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}


After Logstash starts, you can see that it processes the log data in /tmp/access_log:

$ sudo bin/logstash  -f conf/logstash-apache.conf 
Using milestone 2 input plugin 'file'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.4.2/plugin-milestones {:level=>:warn}
{
        "message" => "71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] \"GET /admin HTTP/1.1\" 301 566 \"-\" \"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3\"",
       "@version" => "1",
     "@timestamp" => "2011-05-18T08:48:10.000Z",
           "host" => "jidong-elk-log",
           "path" => "/tmp/access_log",
           "type" => "apache_access",
       "clientip" => "71.141.244.242",
          "ident" => "-",
           "auth" => "kurt",
      "timestamp" => "18/May/2011:01:48:10 -0700",
           "verb" => "GET",
        "request" => "/admin",
    "httpversion" => "1.1",
       "response" => "301",
          "bytes" => "566",
       "referrer" => "\"-\"",
          "agent" => "\"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3\""
}
{
        "message" => "134.39.72.245 - - [18/May/2011:12:40:18 -0700] \"GET /favicon.ico HTTP/1.1\" 200 1189 \"-\" \"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\"",
       "@version" => "1",
     "@timestamp" => "2011-05-18T19:40:18.000Z",
           "host" => "jidong-elk-log",
           "path" => "/tmp/access_log",
           "type" => "apache_access",
       "clientip" => "134.39.72.245",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "18/May/2011:12:40:18 -0700",
           "verb" => "GET",
        "request" => "/favicon.ico",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "1189",
       "referrer" => "\"-\"",
          "agent" => "\"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)\""
}
{
        "message" => "98.83.179.51 - - [18/May/2011:19:35:08 -0700] \"GET /css/main.css HTTP/1.1\" 200 1837 \"http://www.safesand.com/information.htm\" \"Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\"",
       "@version" => "1",
     "@timestamp" => "2011-05-19T02:35:08.000Z",
           "host" => "jidong-elk-log",
           "path" => "/tmp/access_log",
           "type" => "apache_access",
       "clientip" => "98.83.179.51",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "18/May/2011:19:35:08 -0700",
           "verb" => "GET",
        "request" => "/css/main.css",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "1837",
       "referrer" => "\"http://www.safesand.com/information.htm\"",
          "agent" => "\"Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1\""
}


Check Elasticsearch:

curl 'http://localhost:9200/_search?pretty'


Case 2: Processing syslog messages with Logstash

$ cat conf/logstash-syslog.conf 
input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "receieved_from", "%{host}" ]
    }
    syslog_pri {}
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}


Start Logstash:

$ sudo bin/logstash  -f conf/logstash-syslog.conf


Connect to port 5000 with telnet and send some log messages to Logstash:

$ telnet localhost 5000
Trying ::1...
Connected to localhost.
Escape character is '^]'.

Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 22 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.


Check Logstash's on-screen output:

$ sudo bin/logstash  -f conf/logstash-syslog.conf 
{
                 "message" => "Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]\r",
                "@version" => "1",
              "@timestamp" => "2015-12-23T04:11:43.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 23 12:11:43",
         "syslog_hostname" => "louis",
          "syslog_program" => "postfix/smtpd",
              "syslog_pid" => "31499",
          "syslog_message" => "connect from unknown[95.75.93.154]\r",
             "received_at" => "2015-02-01 05:01:48 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
{
                 "message" => "Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied\r",
                "@version" => "1",
              "@timestamp" => "2015-12-23T06:42:56.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 23 14:42:56",
         "syslog_hostname" => "louis",
          "syslog_program" => "named",
              "syslog_pid" => "16000",
          "syslog_message" => "client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied\r",
             "received_at" => "2015-02-01 05:01:48 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
{
                 "message" => "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)\r",
                "@version" => "1",
              "@timestamp" => "2015-12-23T06:30:01.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 23 14:30:01",
         "syslog_hostname" => "louis",
          "syslog_program" => "CRON",
              "syslog_pid" => "619",
          "syslog_message" => "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)\r",
             "received_at" => "2015-02-01 05:01:48 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
{
                 "message" => "Dec 22 18:28:06 louis rsyslogd: [origin software=\"rsyslogd\" swVersion=\"4.2.0\" x-pid=\"2253\" x-info=\"http://www.rsyslog.com\"] rsyslogd was HUPed, type 'lightweight'.\r",
                "@version" => "1",
              "@timestamp" => "2015-12-22T10:28:06.000Z",
                    "host" => "0:0:0:0:0:0:0:1:34337",
                    "type" => "syslog",
        "syslog_timestamp" => "Dec 22 18:28:06",
         "syslog_hostname" => "louis",
          "syslog_program" => "rsyslogd",
          "syslog_message" => "[origin software=\"rsyslogd\" swVersion=\"4.2.0\" x-pid=\"2253\" x-info=\"http://www.rsyslog.com\"] rsyslogd was HUPed, type 'lightweight'.\r",
             "received_at" => "2015-02-01 05:01:53 UTC",
          "receieved_from" => "0:0:0:0:0:0:0:1:34337",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}






References

http://logstash.net/docs/1.4.2/

http://logstash.net/docs/1.4.2/tutorials/getting-started-with-logstash