es+kibana+logstash+filebeat setup

Architecture:

10.6.14.77  es, kibana, logstash  (all three default to localhost; since they run on the same server the defaults need no changes)

10.6.13.116  filebeat 

10.6.13.210 filebeat

filebeat on each of the two servers collects that server's logs

Versions:

es 6.2.3

kibana 6.2.3

logstash 6.2.3

filebeat 6.3.2

 

Prerequisites:

Java 8 must be installed
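
A quick sanity check of the JVM before installing anything:

java -version    # should report a 1.8.x version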

elastic stack official site: https://www.elastic.co/products/

elasticsearch:

1.unzip elasticsearch-6.2.3.zip

2. Start: ./bin/elasticsearch

   Start in the background: ./bin/elasticsearch -d

   Check that it started: curl localhost:9200
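
If es is up, the curl returns a small JSON document; the sketch below is trimmed, and the name, cluster_uuid and build fields will differ on your machine:

{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "version" : { "number" : "6.2.3", "lucene_version" : "7.2.1" },
  "tagline" : "You Know, for Search"
}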

3. If it fails with: can not run elasticsearch as root

   Cause: this is a deliberate security restriction. Elasticsearch can accept and execute user-supplied scripts, so it refuses to run as root; create a dedicated user to run it instead.

   Fix: create an elsearch group and an elsearch user

   groupadd elsearch

   useradd elsearch -g elsearch -p elasticsearch    # note: -p expects an already-encrypted password; run passwd elsearch afterwards to set one interactively

   chown -R elsearch:elsearch elasticsearch    # give the new user ownership of the elasticsearch install directory

4. If it fails with: max virtual memory areas vm.max_map_count [65530] is too low

   sudo sysctl -w vm.max_map_count=262144

   Check the startup info locally: curl localhost:9200
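
The sysctl -w change above does not survive a reboot; a common way to make it permanent (assuming the distro reads /etc/sysctl.conf) is:

echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p    # reload and verify the value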

5. Allow access from other hosts:

config/elasticsearch.yml

set: network.host: 0.0.0.0
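
Binding to a non-loopback address also makes es 6.x enforce its bootstrap checks (which is why the vm.max_map_count fix above matters). After restarting es, access can be verified from one of the filebeat hosts:

curl 10.6.14.77:9200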

    

kibana:

1.tar zxvf kibana-6.2.3-linux-x86_64.tar.gz

2. Start: ./bin/kibana

   Start in the background: nohup ./bin/kibana &

3. Allow access from other hosts:

config/kibana.yml

set: server.host: 0.0.0.0
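
Kibana listens on port 5601 by default, so after this change the UI is reachable at http://10.6.14.77:5601; a quick check from another host:

curl -I http://10.6.14.77:5601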

 

logstash:

1.unzip logstash-6.2.3.zip

2. Smoke-test startup:

./bin/logstash -e "input {stdin{}} output {stdout{}}"

whatever you type on the command line is echoed back as an event on the command line
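
For example, with the rubydebug codec added to stdout, typing hello world after the pipeline starts prints the event back as a hash (host and timestamp below are illustrative):

./bin/logstash -e 'input { stdin {} } output { stdout { codec => rubydebug } }'
hello world
{
       "message" => "hello world",
          "host" => "localhost",
      "@version" => "1",
    "@timestamp" => 2019-03-26T06:50:01.000Z
}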

3. Start from a config file:

-e : pass the logstash config inline; useful for quick tests
-f : point to a logstash config file; use this for production

Sample config file:

# read from stdin
# write to stdout
input { stdin { } }

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  stdout { codec => rubydebug }
}
./bin/logstash -f test.conf
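
Logstash can also just validate a config file without starting the pipeline, which is useful once the configs below grow:

./bin/logstash -f test.conf --config.test_and_exit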

4. Configure and start

input {
    beats {
        port => "5044"
    }
} # the input source is port 5044; filebeat ships to 5044 by default

To take logs from two different filebeat sources and tell them apart, listen on two ports and tag each source with log_type:

input {
	beats {
		add_field => {"log_type" => "pisces"}
		port => 5044
	}
	beats {
		add_field => {"log_type" => "aries"}
		port => 5043
	}
}
filter { # apply different processing to each log source
	if [log_type] == "pisces" {
		grok {
			match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
			} # split the whole log line into fields
		}
		mutate {
			split => ["request", "?"]
			add_field => {
				"uri" => "%{[request][0]}"
			}
			add_field => {
				"param" => "%{[request][1]}"
			} # post-process some of the split fields
		}

		kv {
			source => "param"
			field_split => "&?"
			include_keys => ["subject", "paper_id", "session_key"]
			target => "kv"
		} # extract key-value pairs from the query string
		date {
			match => ["timestamp", "dd-MM-YY HH:mm:ss"]
			timezone => "+08:00" # the source timestamps are UTC+8; declaring it avoids an 8-hour offset in the stored time
		} # redefine @timestamp, the field kibana uses for time-based statistics
		if ([tags][0] == "_grokparsefailure" or [tags][1] == "_grokparsefailure") {
			drop {}
		} # drop logs that do not match the grok pattern
		if ([uri] =~ "^\/klx") {
			drop {}
		} # drop logs whose uri starts with the given prefix
	}
	if [log_type] == "aries" {
		grok {
			match => { "message" => "\[%{WORD:info_level} %{DATESTAMP:timestamp} %{WORD:temp}:%{NUMBER:temp}\] %{NUMBER:status} %{WORD:method} %{URIPATHPARAM:request} \(%{IP:ip}\) %{GREEDYDATA:C}"
			}
		}
		mutate {
			split => ["request", "?"]
			add_field => {
				"uri" => "%{[request][0]}"
			}
			add_field => {
				"param" => "%{[request][1]}"
			}
		}

		kv {
			source => "param"
			field_split => "&?"
			include_keys => ["subject", "paper_id", "username", "gourp_id", "role"]
			target => "kv"
		}
		date {
			match => ["timestamp", "dd-MM-YY HH:mm:ss"]
			timezone => "+08:00"
		}
		if ([tags][0] == "_grokparsefailure" or [tags][1] == "_grokparsefailure") {
			drop {}
		}
		if ([uri] =~ "^\/klx") {
			drop {}
		}
	}


}

output { # store each log source in its own index
	if [log_type] == "pisces" {
		elasticsearch {
			hosts => ["localhost:9200"]
			index => "pisces.log"
		}
	}
	if [log_type] == "aries" {
		elasticsearch {
			hosts => ["localhost:9200"]
			index => "aries.log"
		}
	}
	stdout {}
}
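
Save the config above as, for example, pisces.conf (the file name used in the supervisor section below) and start logstash with it:

./bin/logstash -f pisces.conf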

grok pattern reference: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns

grok debugging tool: https://grokdebug.herokuapp.com/

The grok rules above split a log line into the following structure before storing it in es:

log:[I 26-03-2019 14:50:01 web:1971] 200 GET /api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380 (113.91.43.156) 40.97ms

{
             "C" => "40.97ms",
          "host" => "M7-10-6-12-27-14-77",
       "message" => "[I 26-03-2019 14:50:01 web:1971] 200 GET /api/student/situation_analysis?session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380 (113.91.43.156) 40.97ms",
          "temp" => [
        [0] "web",
        [1] "1971"
    ],
        "method" => "GET",
       "request" => [
        [0] "/api/student/situation_analysis",
        [1] "session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380"
    ],
            "ip" => "113.91.43.156",
        "status" => "200",
           "uri" => "/api/student/situation_analysis",
    "info_level" => "I",
     "timestamp" => "26-03-2019 14:50:01",
            "kv" => {
        "session_key" => "88d4faeacb0e97d24982405cf2e788b7",
            "subject" => "math",
           "paper_id" => "5c9820af3eaeefc905030afa"
    },
      "@version" => "1",
    "@timestamp" => 2019-03-26T06:50:01.000Z,
         "param" => "session_key=88d4faeacb0e97d24982405cf2e788b7&source=paper&subject=math&paper_id=5c9820af3eaeefc905030afa&t=1553583001380"
}

filebeat:

1. Edit filebeat.yml

filebeat.inputs:                    # input sources
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  paths:
    - /data/log/*                   # log files to read
  tail_files: true                  # start reading from the end of each file

output.logstash:                    # ship events to logstash
  hosts: ["10.6.14.77:5044"]

# comment out the default output.elasticsearch section

2. Start:

./filebeat -e -c filebeat.yml
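
Before starting, or when events are not arriving, filebeat can check its own config file and the connection to logstash:

./filebeat test config -c filebeat.yml
./filebeat test output -c filebeat.yml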

 

HINT

Collect several kinds of logs from the same server, tell them apart, and have logstash store them in different es indices

1. Edit filebeat.yml

filebeat.inputs:
# define two path sources; log files from different sources get different tags
- type: log
  paths:
    - /data/log/pisces/*
  tags: ["pisces"]

- type: log
  paths:
    - /data/log/aries/*
  tags: ["aries"]

2. Edit logstash.conf

# data carrying different tags is stored in different indices; the filter block can likewise treat each tagged source differently (see the sketch after this config)
output {
	if "pisces" in [tags] {
		elasticsearch {
			hosts => ["10.6.14.77:9200"]
			index => "pisces.log"
		}
	}
	if "aries" in [tags] {
		elasticsearch {
			hosts => ["10.6.14.77:9200"]
			index => "aries.log"
		}
	}
}
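
A minimal sketch of the per-tag filter branching that the comment above refers to; the grok patterns here are placeholders and would be replaced with the real pisces/aries patterns:

filter {
	if "pisces" in [tags] {
		grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } # pisces-specific parsing
	}
	if "aries" in [tags] {
		grok { match => { "message" => "%{COMBINEDAPACHELOG}" } } # aries-specific parsing
	}
}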

supervisor setup

Goal: start all of the services through supervisor and have it manage their logs

Configs:

Elasticsearch

[program:Elasticsearch]
command =/data/elasticsearch-6.2.3/bin/elasticsearch
process_name=%(process_num)d
stopsignal=KILL
user=elsearch
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/Elasticsearch/Elasticsearch.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s

kibana

[program:kibana]
command = /data/kibana-6.2.3-linux-x86_64/bin/kibana
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/kibana/kibana.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s

logstash

[program:logstash]
command =/data/logstash-6.2.3/bin/logstash -f /data/logstash-6.2.3/pisces.conf
directory=/data/logstash-6.2.3
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/logstash/logstash.log
environment=PATH=/data/jdk1.8.0_201/bin:/data/logstash-6.2.3/bin:%(ENV_PATH)s

filebeat

[program:filebeat]
command = /data/filebeat-6.3.2-linux-x86_64/filebeat -e -c /data/filebeat-6.3.2-linux-x86_64/filebeat.yml
process_name=%(process_num)d
stopsignal=KILL
user=root
redirect_stderr=true
stdout_logfile_maxbytes=5MB
stdout_logfile_backups=20
stdout_logfile=/data/log/filebeat/filebeat.log
environment=PATH=/data/jdk1.8.0_201/bin:%(ENV_PATH)s
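
After dropping these program sections into supervisor's include directory (the exact path depends on how supervisor was installed, e.g. /etc/supervisor/conf.d/), reload and check:

supervisorctl reread
supervisorctl update
supervisorctl status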

 
