Elasticsearch
: a real-time full-text search and analytics engine. It provides three core capabilities -- collecting, analyzing, and storing data -- and features a distributed architecture, zero configuration, automatic discovery, automatic index sharding, index replicas, a RESTful interface, multiple data sources, and automatic search load balancing.
Logstash
: supports almost any type of log, including system logs, error logs, and custom application logs. It can receive logs from many sources, such as syslog, messaging systems (e.g. RabbitMQ), and JMX, and can output data in several ways, including email, websockets, and Elasticsearch.
Kibana
: a web-based graphical interface for searching, analyzing, and visualizing log data stored in Elasticsearch indices. It uses Elasticsearch's REST interface to retrieve data, and lets users build custom dashboard views of their own data as well as query and filter it in ad-hoc ways. Kibana provides a friendly log-analysis web interface for Logstash and Elasticsearch, helping you aggregate, analyze, and search important log data.
Filebeat
Overview
Filebeat collects all kinds of logs and ships them to a specified target system, but only one output can be configured at a time. Filebeat harvests the log content it has been configured for; on the first run it reads each file from the beginning up to the current last line, and the collected lines are shipped on as JSON-formatted data. The component responsible for this in Filebeat is officially called the Harvester.

Role | IP address | Operating system | Specs |
---|---|---|---|
ELK-01 | 192.168.93.20 | CentOS Linux release 7.5.1804 | 1 CPU, 4 GB RAM |
ELK-02 | 192.168.93.21 | CentOS Linux release 7.5.1804 | 1 CPU, 3 GB RAM |
ELK-03 | 192.168.93.22 | CentOS Linux release 7.5.1804 | 1 CPU, 3 GB RAM |
Filebeat
Installation
# Download Filebeat from the official site
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.13.2-linux-x86_64.tar.gz
# Extract to /usr/local
tar xf filebeat-7.13.2-linux-x86_64.tar.gz -C /usr/local/
mv /usr/local/filebeat-7.13.2-linux-x86_64/ /usr/local/filebeat
# Managing the Filebeat process
# 1. Foreground: run Filebeat in the foreground to see the collected log output directly
# 2. Background: start Filebeat in the background with nohup; the output can then be checked in nohup.out
# When started in the background via systemd, the collected output cannot be viewed on the terminal -- avoid this while testing
# Create the systemd unit file for Filebeat
vim /usr/lib/systemd/system/filebeat.service
[Unit]
Description=Filebeat sends log files to Logstash or directly to Elasticsearch.
Wants=network-online.target
After=network-online.target
[Service]
ExecStart=/usr/local/filebeat/filebeat -c /usr/local/filebeat/filebeat.yml
Restart=always
[Install]
WantedBy=multi-user.target
# Register the service with systemd, then enable and start it
systemctl daemon-reload && systemctl enable filebeat --now
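As a quick reference, the three startup modes described above look like this (paths match the installation directory used in this guide):
# 1. Foreground: collected events are printed directly to the terminal
/usr/local/filebeat/filebeat -c /usr/local/filebeat/filebeat.yml
# 2. Background with nohup: output goes to nohup.out in the current directory
nohup /usr/local/filebeat/filebeat -c /usr/local/filebeat/filebeat.yml &
# 3. systemd: check the service state after enabling it
systemctl status filebeat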
Filebeat
Basic Usage
# Prepare test data
vim /tmp/access.log
112.195.209.90 - - [20/Feb/2018:12:12:14 +0800] "GET / HTTP/1.1" 200 190 "-" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36" "-"
# Back up the configuration file
cp /usr/local/filebeat/filebeat.yml /usr/local/filebeat/filebeat.yml.bak
# Configure Filebeat's input and output
vim /usr/local/filebeat/filebeat.yml
filebeat.inputs:                        # Inputs: what to collect
- type: log                             # Input type: log
  enabled: true                         # Enable this input (changed from the default)
  paths:                                # Log paths
    - /tmp/*.log                        # Paths to collect from; wildcards are supported and multiple entries can be listed
#- type: filestream
#  enabled: false
#  paths:
#    - /var/log/*.log
filebeat.config.modules:                # Where the built-in log-collection module configuration files live
  path: ${path.config}/modules.d/*.yml  # Under the install path; modules.d contains more rules
  reload.enabled: false                 # If true, Filebeat restarts itself whenever a module config file changes, which interrupts collection; usually enable this only after configuration is finished
setup.template.settings:
  index.number_of_shards: 1             # Number of primary shards for indices created from the template
output.console:                         # Added: output to the terminal
  pretty: true                          # Pretty-print the JSON
#setup.kibana:
#output.elasticsearch:
#  hosts: ["localhost:9200"]
processors:                             # Processors
  - add_host_metadata:                  # Add this host's metadata (IP, MAC, OS, etc.) to each event
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
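Before running Filebeat, the edited configuration can optionally be checked for syntax errors with the built-in test subcommand:
/usr/local/filebeat/filebeat test config -c /usr/local/filebeat/filebeat.yml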
# Testing Filebeat
# If the following error appears at startup:
/usr/local/filebeat/filebeat -c /usr/local/filebeat/filebeat.yml
Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).
# another Filebeat instance is already running on this machine and holds the lock on the data path; stop it and run again
systemctl stop filebeat
/usr/local/filebeat/filebeat -c /usr/local/filebeat/filebeat.yml
{
"@timestamp": "2021-07-17T05:33:45.381Z",
"@metadata": {
"beat": "filebeat",
"type": "_doc",
"version": "7.13.2"
},
"input": {
"type": "log"
},
"host": {
"id": "ad8a55213faa46babc18170804417b90",
"containerized": false,
"name": "pakho",
"ip": [
"192.168.100.200",
"fe80::ec53:d68d:60ea:b5e0"
],
"mac": [
"00:0c:29:ae:a5:a7"
],
"hostname": "filebeat",
"architecture": "x86_64",
"os": {
"type": "linux",
"platform": "centos",
"version": "7 (Core)",
"family": "redhat",
"name": "CentOS Linux",
"kernel": "3.10.0-862.el7.x86_64",
"codename": "Core"
}
},
"agent": {
"id": "33541cdc-c78e-4cf1-9181-e03db1ebdc36",
"name": "filebeat",
"type": "filebeat",
"version": "7.13.2",
"hostname": "filebeat",
"ephemeral_id": "4f5cb4e0-47b3-4398-8574-8e36905aea10"
},
"ecs": {
"version": "1.8.0"
},
"message": "112.195.209.90 - - [20/Feb/2018:12:12:14 +0800] \"GET / HTTP/1.1\" 200 190 \"-\" \"Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36\" \"-\"",
"log": {
"offset": 0, #从日志文件什么地方开始取的,从第一行
"file": {
"path": "/tmp/access.log"
}
}
}
-c specifies the configuration file location:
./filebeat -c /usr/local/filebeat/filebeat.yml
By default Filebeat writes its own logs to /usr/local/filebeat/logs/filebeat. The log location can be changed in filebeat.yml:
#==================================Logging================================
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug
path.logs: /var/log/    # add this line
The logs are then written under /var/log with the file name filebeat; each start or restart of the program creates a new filebeat log file, the previous one is renamed filebeat.1, and so on.

Dedicated log-collection modules
Module file location
[root@pakho ~]# ls /usr/local/filebeat/modules.d
Disable a module
/usr/local/filebeat/filebeat modules disable <module-name>
Enable a module
/usr/local/filebeat/filebeat modules enable <module-name>
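To see which modules are currently enabled or disabled, the modules list subcommand can be used:
/usr/local/filebeat/filebeat modules list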
Nginx module
[root@pakho ~]# vim /var/log/access.log
123.127.39.50 - - [04/Mar/2021:10:50:28 +0800] "GET /logo.jpg HTTP/1.1" 200 14137 "http://81.68.233.173/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36" "-"
[root@pakho ~]# vim /var/log/error.log
2021/03/04 10:50:28 [error] 11396#0: *5 open() "/farm/bg.jpg" failed (2: No such file or directory), client: 123.127.39.50, server: localhost, request: "GET /bg.jpg HTTP/1.1", host:"81.68.233.173", referrer: "http://81.68.233.173/"
[root@pakho ~]# /usr/local/filebeat/filebeat -c /usr/local/filebeat/filebeat.yml modules enable nginx
Enabled nginx
[root@filebeat ~]# ls /usr/local/filebeat/modules.d
nginx.yml...
The contents of modules.d/nginx.yml are as follows:
- module: nginx
  access:
    enabled: true
  error:
    enabled: true
By default the module collects /var/log/nginx/access.log* and /var/log/nginx/error.log*. If the logs live somewhere else, the paths can be overridden with the var.paths attribute.
[root@filebeat ~]# vim /usr/local/filebeat/modules.d/nginx.yml
- module: nginx
  access:
    enabled: true
  error:
    enabled: true
    var.paths: ["/var/log/access.log","/var/log/error.log"]
Alternatively, the paths can be written as a YAML list:
- module: nginx
  access:
    enabled: true
  error:
    enabled: true
    var.paths:
      - "/var/log/access.log*"
      - "/var/log/error.log*"
The paths specified in var.paths are appended to and merged with the module's default paths. That is, if the module's default path already contains the log file /var/log/nginx/access.log and var.paths also configures the path /var/log/access.log, then the paths Filebeat ends up collecting from will be:
/var/log/nginx/access.log
/var/log/access.log
Run Filebeat in the foreground with ./filebeat -e:
[root@pakho filebeat]# pwd
/usr/local/filebeat
# -c specifies the configuration file; -e logs to stderr (the terminal) instead of the configured log output
[root@pakho filebeat]# ./filebeat -c /usr/local/filebeat/filebeat.yml -e
Configuring output
Filebeat collects logs and then pushes them to some receiving system; in Filebeat these systems or destinations are called outputs:
console -- the terminal screen
elasticsearch -- stores the logs and makes them searchable
logstash -- further processes the log data
kafka -- a message queue
Only one of these outputs may be configured at a time while Filebeat is running.
To print the full JSON event data to the console:
output.console:
  pretty: true
In the Filebeat installation directory, run ./filebeat in the foreground to see the result.
To output only certain fields of the JSON data instead of the full event:
output.console:
  codec.format:
    string: '%{[@timestamp]} %{[message]}'
Other output targets
Elasticsearch:
output.elasticsearch:
  hosts: ['http://es01:9200','http://es02:9200']
Logstash:
output.logstash:
  hosts: ["127.0.0.1:5044"]
If the following error appears, another Filebeat instance is already holding the data path:
Exiting: data path already locked by another beat. Please make sure that multiple beats are not sharing the same data path (path.data).
Find and stop the running Filebeat process:
[root@pakho filebeat]# ps -ef | grep 'filebea[t]'
root 2322 2019 0 17:10 pts/2 00:00:00 ./filebeat -c /usr/local/filebeat/filebeat.yml -e
Dropping certain lines from the logs
For example, drop events whose message starts with DBG::
processors:
  - drop_event:            # drop the event
      when:                # when
        regexp:            # a regular-expression match; the value below is treated as a regex
          message: "^DBG:" # match against the message field
Adding custom fields to the output data
processors:
  - add_fields:
      target: project      # name of the key under which the custom fields are added
      fields:
        name: myproject
        id: '574734885120952459'
Removing certain fields from the output data
processors:
  - drop_fields:
      fields: ["field1","field2",...]
      ignore_missing: false
This removes field1 and field2. When ignore_missing is false, an error is returned if a named field does not exist; when it is true, no error is returned. The @timestamp and type fields cannot be removed.
For example, to remove the top-level input field and the version field inside the top-level ecs field:
  - drop_fields:
      fields: ['input',"ecs.version"]
Logstash
Installation
# Download Logstash
[root@pakho ~]# curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.13.2-linux-x86_64.tar.gz
# Extract to /usr/local
[root@pakho ~]# tar xf logstash-7.13.2-linux-x86_64.tar.gz -C /usr/local/
[root@pakho ~]# mv /usr/local/logstash-7.13.2/ /usr/local/logstash
Test run
Run the most basic Logstash pipeline to verify the Logstash installation.
A Logstash pipeline has two required elements, input and output, and one optional element, filter. Input plugins consume data from a source, filter plugins modify the data as you specify, and output plugins write the data to a destination.
bin/logstash -e ''
The -e option sets the input and output that Logstash uses to process data. An empty -e '' is equivalent to:
-e 'input { stdin { type => stdin } } output { stdout { codec => rubydebug } }'
input { stdin { type => stdin } } means the data Logstash processes comes from the standard input device; output { stdout { codec => rubydebug } } means Logstash writes the processed data to the standard output device.
[root@pakho bin]# pwd
/usr/local/logstash/bin
# input: read from the keyboard (stdin, with type stdin); output: standard output with the rubydebug codec
[root@pakho bin]# ./logstash -e 'input { stdin { type => stdin } } output { stdout { codec=> rubydebug } }'
...
hello
{
"@timestamp" => 2021-07-17T13:36:08.457Z,
"host" => "pakho",
"message" => "hello",
"@version" => "1",
"type" => "stdin"
}
The value of the message field is the complete line of data that Logstash received. @version is version information and can be used when building indices. @timestamp is the time at which Logstash received the event. type is the value set in the input section earlier; it can be set to anything you like and is typically used to distinguish data when building indices and in conditional logic. host indicates which host the data came from.
An example setting the value of type to nginx (mainly useful for distinguishing indices; changing it here has no real effect):
./bin/logstash -e "input { stdin { type => nginx}} output { stdout { codec => rubydebug } }"
A real Logstash pipeline is usually more complex: it typically has one or more input, filter, and output plugins. Next, create a Logstash pipeline that takes Apache web logs from standard input, parses them to create specific named fields, and writes the parsed data to standard output (the screen). This time the pipeline is not defined on the command line but in a configuration file.
Create the Logstash pipeline configuration file:
[root@pakho ~]# vim /usr/local/logstash/config/first-pipeline.conf
input {
stdin{ }
}
output {
stdout{ }
}
[root@pakho logstash]# pwd
/usr/local/logstash
[root@filebeat logstash]# bin/logstash -f config/first-pipeline.conf --config.test_and_exit
...
Configuration OK
-f specifies the pipeline configuration file. Now start Logstash:
[root@pakho logstash]# bin/logstash -f config/first-pipeline.conf
The stdin plugin is now waiting for input:
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama2013/" "Mozilla/5.0 (Macintosh;IntelMac 0s X 10_9_1) ApplewebKit/537.36 (KHTML,like Gecko) Chrome/32.0.1700.77 Safari/537.36"
{
"host" => "pakho",
"@version" => "1",
"@timestamp" => 2021-07-17T14:00:44.109Z,
"message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama2013/\" \"Mozilla/5.0 (Macintosh;IntelMac 0s X 10_9_1) ApplewebKit/537.36 (KHTML,like Gecko) Chrome/32.0.1700.77 Safari/537.36\""
}
The grok filter plugin
The grok filter plugin can parse unstructured log data into structured, queryable content. grok assigns field names to the pieces of content you are interested in and binds those pieces to the corresponding field names. How does grok know which content you are interested in? It recognizes fields of interest through its predefined patterns, and this is controlled by configuring it with different patterns. Here the %{COMBINEDAPACHELOG} pattern is used.
%{COMBINEDAPACHELOG} structures an Apache log line into the following fields:

Original information | New field name |
---|---|
IP address | clientip |
User ID | ident |
User auth info | auth |
Timestamp | timestamp |
HTTP request method | verb |
Requested URL | request |
HTTP version | httpversion |
Response code | response |
Response body size | bytes |
Referrer | referrer |
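For reference, in the legacy grok pattern definitions COMBINEDAPACHELOG is built from the common Apache pattern plus quoted referrer and agent fields. A simplified sketch of how the field names in the table come from the pattern (the exact definition shipped with Logstash is slightly more involved):
COMMONAPACHELOG   %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{NOTSPACE:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response} %{NUMBER:bytes}
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}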
Change the input from stdin to file, and create a sample log file:
[root@pakho ~]# vim /var/log/httpd.log
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36"
Make sure there is no cached data left in Logstash's data directory:
[root@pakho data]# pwd
/usr/local/logstash/data
[root@pakho data]# ls
dead_letter_queue queue uuid
The updated pipeline configuration file looks like this:
[root@pakho ~]# vim /usr/local/logstash/config/first-pipeline.conf
input {
  file {
    path => ["/var/log/httpd.log"]
    start_position => "beginning"   # start collecting from the beginning of the file
  }
}
filter {
  grok {   # filter the web logs and output structured data
    # look for a COMBINEDAPACHELOG match in the value of the message field
    match => { "message" => "%{COMBINEDAPACHELOG}"}
  }
}
output {
  stdout{ }
}
match => {"message" => "%{COMBINEDAPACHELOG}"}
的意思是:
"message"
字段时,用户模式"COMBINEDAPACHELOG"
进行字段映射[root@pakho logstash]# bin/logstash -f config/first-pipeline.conf
{
"host" => "pakho",
"auth" => "-",
"timestamp" => "04/Jan/2015:05:13:42 +0000",
"ident" => "-",
"verb" => "GET",
"request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
"bytes" => "203023",
"referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
"@version" => "1",
"agent" => "\"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"path" => "/var/log/httpd.log",
"@timestamp" => 2021-07-17T14:31:04.282Z,
"response" => "200",
"message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"httpversion" => "1.1",
"clientip" => "83.149.9.216"
}
The message field is still present in the output. If it is not needed, it can be removed with remove_field, one of the common options available to filter plugins such as grok; remove_field accepts an array, so any fields can be removed. The mutate filter's rename option renames a field.
[root@pakho ~]# vim /usr/local/logstash/config/first-pipeline.conf
input {
  file {
    path => ["/var/log/httpd.log"]
    start_position => "beginning"   # start collecting from the beginning of the file
  }
}
filter {
  grok {   # filter the web logs and output structured data
    # look for a COMBINEDAPACHELOG match in the value of the message field
    match => { "message" => "%{COMBINEDAPACHELOG}"} }
  mutate {
    # rename a field
    rename => {
      "clientip" => "cip"
    }
  }
  mutate {
    # drop unneeded fields
    remove_field => ["message","input_type","@version","fields"]
  }
}
output {
  stdout{ }
}
[root@pakho logstash]# bin/logstash -f config/first-pipeline.conf
...
[2021-07-17T22:48:25,567][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
The message field is gone, and clientip has been renamed to cip:
{
"@timestamp" => 2021-07-17T14:49:42.501Z,
"auth" => "-",
"host" => "pakho",
"timestamp" => "04/Jan/2015:05:13:42 +0000",
"request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
"httpversion" => "1.1",
"bytes" => "203023",
"response" => "200",
"agent" => "\"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"cip" => "83.149.9.217",
"referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
"path" => "/var/log/httpd.log",
"ident" => "-",
"verb" => "GET"
}
The geoip plugin looks up an IP address, finds the corresponding geographic location in its bundled database, and adds that location information to the log event. The geoip plugin configuration requires you to specify the name of the source field containing the IP address to look up; in this example, the clientip field contains the IP address:
  geoip {
    source => "clientip"
  }
The geoip section is placed after the grok section of the configuration file, and both the grok and geoip sections are nested inside the filter section.
[root@pakho ~]# vim /usr/local/logstash/config/first-pipeline.conf
input {
  file {
    path => ["/var/log/httpd.log"]
    start_position => "beginning"   # start collecting from the beginning of the file
  }
}
filter {
  grok {   # filter the web logs and output structured data
    # look for a COMBINEDAPACHELOG match in the value of the message field
    match => { "message" => "%{COMBINEDAPACHELOG}"}
  }
  geoip { source => "clientip" }
}
output {
  stdout{ }
}
[root@pakho logstash]# bin/logstash -f config/first-pipeline.conf
{
"ident" => "-",
"auth" => "-",
"@version" => "1",
"request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
"httpversion" => "1.1",
"@timestamp" => 2021-07-17T15:00:34.438Z,
"host" => "pakho",
"path" => "/var/log/httpd.log",
"timestamp" => "04/Jan/2015:05:13:42 +0000",
"verb" => "GET",
"bytes" => "203023",
"message" => "83.149.9.217 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"agent" => "\"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
"geoip" => {
"continent_code" => "EU",
"country_code2" => "RU",
"region_code" => "MOW",
"timezone" => "Europe/Moscow",
"country_code3" => "RU",
"location" => {
"lat" => 55.7527,
"lon" => 37.6172
},
"country_name" => "Russia",
"longitude" => 37.6172,
"city_name" => "Moscow",
"latitude" => 55.7527,
"ip" => "83.149.9.217",
"postal_code" => "129223",
"region_name" => "Moscow"
},
"clientip" => "83.149.9.217",
"response" => "200"
}
{
"ident" => "-",
"auth" => "-",
"@version" => "1",
"request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
"httpversion" => "1.1",
"@timestamp" => 2021-07-17T15:00:34.439Z,
"host" => "pakho",
"path" => "/var/log/httpd.log",
"timestamp" => "04/Jan/2015:05:13:42 +0000",
"verb" => "GET",
"bytes" => "203023",
"message" => "182.149.163.223 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"agent" => "\"Mozilla/5.0 (Macintosh; IntelMac OS X 10_9_1) AppleWebKit/537.36 (KHTAL,like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
"referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
"geoip" => {
"continent_code" => "AS",
"country_code2" => "CN",
"region_code" => "SC",
"timezone" => "Asia/Shanghai",
"country_code3" => "CN",
"location" => {
"lat" => 30.6667,
"lon" => 104.0667
},
"country_name" => "China",
"longitude" => 104.0667,
"city_name" => "Chengdu",
"latitude" => 30.6667,
"ip" => "182.149.163.223",
"region_name" => "Sichuan"
},
"clientip" => "182.149.163.223",
"response" => "200"
}
Configure Logstash to receive events from Filebeat:
[root@pakho ~]# vim /usr/local/logstash/config/first-pipeline.conf
# listen on port 5044 and receive Filebeat's input
input {
  beats {
    port => 5044
  }
}
filter {
  grok {   # filter the web logs and output structured data
    # look for a COMBINEDAPACHELOG match in the value of the message field
    match => { "message" => "%{COMBINEDAPACHELOG}"} }
  #geoip { source => "clientip" }
}
output {
  stdout{ }
}
Modify the Filebeat configuration file:
[root@pakho ~]# vim /usr/local/filebeat/filebeat.yml
...
# comment out the console output
#output.console:
#  codec.format:
#    string: '%{[@timestamp]} %{[message]}'
#  pretty: true
# enable the logstash output
output.logstash:
  # The Logstash hosts
  hosts: ["192.168.100.200:5044"]
On the Filebeat machine, clear the cache directory:
[root@pakho ~]# rm -rf /usr/local/filebeat/data/
Start Filebeat:
[root@pakho filebeat]# pwd
/usr/local/filebeat
[root@pakho filebeat]# ./filebeat
Start Logstash:
[root@pakho logstash]# pwd
/usr/local/logstash
[root@pakho logstash]# bin/logstash -f config/first-pipeline.conf
Append an access-log entry to the /tmp/access.log file:
182.149.163.223 - - [20/Feb/2018:12:12:14 +0800] "GET / HTTP/1.1" 200 190 "-" "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36" "-"
The Filebeat message received by Logstash looks like this:
...
"@version" => "1",
"message" => "182.149.163.223 - - [20/Feb/2018:12:12:14 +0800] \"GET / HTTP/1.1\" 200 190 \"-\" \"Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Mobile Safari/537.36\" \"-\"",
"ecs" => {
"version" => "1.8.0"
},
"auth" => "-",
"request" => "/",
"verb" => "GET",
"response" => "200",
"agent" => {
"type" => "filebeat",
"id" => "57d4937a-5d65-498a-bc50-f8779a11ebcf",
"hostname" => "pakho",
"ephemeral_id" => "2b08ef40-c15b-4f94-98df-024fabda68d8",
"version" => "7.13.2",
"name" => "pakho"
},
"bytes" => "190"
}
...
Elasticsearch
: storage, search, and analytics
Elasticsearch is the distributed search and analytics engine at the core of the Elastic Stack. Logstash and Beats help collect, aggregate, and enrich your data and store it in Elasticsearch. With Kibana you can interactively explore, visualize, and share insights into the data, and manage and monitor the stack. Elasticsearch is where the indexing, searching, and analysis of data happens, and it provides near-real-time search and analytics for all types of data.
Elasticsearch is document-oriented; a document is the smallest unit of searchable data.
Documents are stored in Elasticsearch in JSON format; a JSON object is made up of fields.
Each document has a Unique ID, which Elasticsearch can generate automatically.
JSON documents have a flexible format and do not require a predefined schema; Elasticsearch infers field types automatically.
Before 7.0 an Index could have multiple Types; from 7.0 onward an index can have only one Type: _doc.
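As a quick illustration of these concepts, a document can be indexed with a single REST call; the index name blogs and the fields here are made up for illustration, and Elasticsearch generates the Unique ID because none is supplied:
curl -X POST "localhost:9200/blogs/_doc?pretty" -H 'Content-Type: application/json' -d '{"title": "hello elk", "views": 1}'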
Install Elasticsearch on 192.168.100.200:
# Import the GPG key for package verification
[root@pakho ~]# rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
# Create the YUM repository
[root@pakho ~]# vim /etc/yum.repos.d/elasticsearch.repo
[elasticsearch]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
# Install elasticsearch
[root@pakho ~]# yum -y install elasticsearch
[root@pakho ~]# systemctl daemon-reload
[root@pakho ~]# systemctl enable elasticsearch.service
[root@pakho ~]# systemctl start elasticsearch.service
[root@pakho ~]# ss -lnt
...
:9200
:9300
# Stop elasticsearch for now; if it keeps running while the cluster is being set up later, it will assume the cluster consists of only this node
[root@pakho ~]# systemctl stop elasticsearch.service
An Elasticsearch cluster is a highly available, scalable distributed system made up of multiple nodes.
Benefits of Elasticsearch's distributed architecture
Elasticsearch's distributed architecture
The cluster a node belongs to can be set with -E cluster.name=geektime.
Master-eligible Node and Master Node
Every node is master-eligible by default; setting node.master: false prevents a node from becoming Master. Only the Master node may modify the cluster state (Cluster State), which maintains the information a cluster needs, such as Mapping and Setting information.
Data Node and Coordinating Node
A Data Node is responsible for holding shard data and plays a crucial role in scaling data out. A Coordinating Node is responsible for receiving client requests, routing them to the right nodes, and assembling the final result.
Each shard is an instance of Lucene; changing the number of primary shards afterwards requires a Reindex.
(Example: the shard distribution of a blogs index across the cluster.)
Shard settings
The number of primary shards is set when the index is first created and cannot be changed later!
Setting it equal to the number of nodes is usually fine!
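Because the number of primary shards is fixed at index-creation time, it is typically set explicitly when the index is created. A sketch (the index name blogs and the numbers are only an example):
curl -X PUT "localhost:9200/blogs?pretty" -H 'Content-Type: application/json' -d '{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'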
Host | IP address | Specs |
---|---|---|
Master | 192.168.100.200 | 4 GB RAM |
es_node1 | 192.168.100.201 | 4 GB RAM |
es_node2 | 192.168.100.202 | 4 GB RAM |
[root@master ~]# vim /etc/hosts
192.168.100.200 master
192.168.100.201 es_node1
192.168.100.202 es_node2
[root@master ~]# scp -r /etc/hosts es_node1:/etc/hosts
[root@master ~]# scp -r /etc/hosts es_node2:/etc/hosts
Master node configuration:
[root@master ~]# vim /etc/elasticsearch/elasticsearch.yml
cluster.name: elk
node.name: master
node.data: true                # added -- is this also a data node? yes
network.host: 0.0.0.0
http.port: 9200                # port 9200 for external access
# The official documentation specifies the following form; it is the same on all three machines
discovery.seed_hosts:
  - master                     # hostname of node 1
  - 192.168.100.201:9300       # IP and port of node 2
  - 192.168.100.202            # IP of node 3
# Which nodes may become the initial master? One entry would be enough; three are listed so that if one host dies the others can take over. Mind the spaces between entries.
cluster.initial_master_nodes: ["master", "es_node1", "es_node2"]
# Copy to the other nodes
[root@master ~]# scp -r /etc/elasticsearch/elasticsearch.yml es_node1:/etc/elasticsearch/
[root@master ~]# scp -r /etc/elasticsearch/elasticsearch.yml es_node2:/etc/elasticsearch/
es_node1 node configuration:
[root@es_node1 ~]# vim /etc/elasticsearch/elasticsearch.yml
cluster.name: elk
node.name: es_node1
node.data: true
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts:
- master
- 192.168.100.201:9300
- 192.168.100.202
cluster.initial_master_nodes: ["master", "es_node1", "es_node2"]
es_node2 node configuration (only the node name needs to change):
[root@es_node2 ~]# vim /etc/elasticsearch/elasticsearch.yml
node.name: es_node2
cluster.name -- the cluster name; configure the same name on every node
node.name -- the node name; configure a different name on each node
node.data -- whether the node is a data node; data nodes hold and manage a portion of the index
network.host -- the IP address the node binds to
http.port -- the listening port
path.data -- the data storage directory
path.logs -- the log storage directory
discovery.seed_hosts -- the cluster members used for discovery; every member, including the node itself, must be listed, and every node should carry the same list
cluster.initial_master_nodes -- the nodes eligible to become master
http.cors.enabled -- allows the head plugin to access ES
http.cors.allow-origin -- the allowed origin addresses
Once network.host is configured, Elasticsearch assumes you are moving from development to production mode and upgrades many system startup checks from warnings to exceptions. The node names listed in cluster.initial_master_nodes must match the node.name values.
Starting the cluster
Start the elasticsearch process on each node. With a YUM installation simply start the service with systemctl start elasticsearch; the commands below show the binary-installation way, run as the ela user. With the YUM installation, mind the startup order: Master -> es_node1 -> es_node2.
su - ela
cd /usr/local/elasticsearch-7.10.0
./bin/elasticsearch -d -p /tmp/elasticsearch.pid
-d runs the process in the background; -p specifies a file in which to store the pid.
Port 9200 is the listening port for external access, e.g. checking cluster status, sending data to the cluster, and querying it. Port 9300 is used for communication between cluster nodes, e.g. master election and announcement of cluster node information. Check the listening ports with ss -ntal.
For a binary installation the logs can be found in the $ES_HOME/logs/ directory, e.g. ls logs/elk.log. For a YUM installation: cat /var/log/elasticsearch/elasticsearch.log
Check the cluster health status
curl -X GET "localhost:9200/_cat/health?v"
[root@es_node2 ~]# curl -X GET "localhost:9200/_cat/health?v"
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1626544475 17:54:35 elk green 3 3 0 0 0 0 0 0 - 100.0%
If only a single Elasticsearch instance is running, the cluster status will stay yellow. A single-node cluster is fully functional, but data cannot be replicated to another node to provide resilience.
View cluster node information
curl -X GET "localhost:9200/_cat/nodes?v"
[root@master ~]# curl -X GET "localhost:9200/_cat/nodes?v"
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.100.200 36 72 3 0.02 0.08 0.10 cdfhilmrstw * master
192.168.100.201 26 76 3 0.05 0.09 0.10 cdfhilmrstw - es_node1
192.168.100.202 14 76 3 0.00 0.10 0.11 cdfhilmrstw - es_node2
Troubleshooting cluster errors (binary installation)
# Find the process
[ela@ela1 elasticsearch-7.10.0]$ jdk/bin/jps
8244 Jps
7526 Elasticsearch
# Kill the process
[ela@ela1 elasticsearch-7.10.0]$ kill -9 7526
# Check the error details in logs/elk.log, then clean up and start again
# Delete everything in the data directory
[ela@ela1 elasticsearch-7.10.0]$ rm -rf data/*
# Delete the logs
[ela@ela1 elasticsearch-7.10.0]$ rm -rf logs/*
# Delete the keystore file
[ela@ela1 elasticsearch-7.10.0]$ rm -rf config/elasticsearch.keystore
# Restart the process
[ela@ela1 elasticsearch-7.10.0]$ bin/elasticsearch -d -p /tmp/elk.pid
To stop the Elasticsearch process with a binary installation:
# pkill -F /tmp/elasticsearch.pid
ES cluster test
Filebeat collects the Nginx logs and sends them to Logstash; after processing the data, Logstash outputs it to Elasticsearch.
With the Elasticsearch cluster up, adjust the Logstash configuration:
[root@master ~]# vim /usr/local/logstash/config/first-pipeline.conf
# listen on port 5044 and receive Filebeat's input
input {
  beats {
    port => 5044
  }
}
filter {
  grok {   # filter the web logs and output structured data
    # look for a COMBINEDAPACHELOG match in the value of the message field
    match => { "message" => "%{COMBINEDAPACHELOG}"} }
  #geoip { source => "clientip" }
}
output {
  stdout{
    codec => rubydebug
  }
  elasticsearch {
    # output to the elasticsearch cluster
    hosts => ["192.168.100.200:9200","192.168.100.201:9200","192.168.100.202:9200"]
  }
}
With the Logstash configuration file ready, start Filebeat:
[root@master filebeat]# pwd
/usr/local/filebeat
# start with the specified configuration file
[root@master filebeat]# ./filebeat -c /usr/local/filebeat/filebeat.yml
Start Logstash:
[root@master logstash]# pwd
/usr/local/logstash
[root@master logstash]# ./bin/logstash -f config/first-pipeline.conf
Verification
[root@master ~]# vim /var/log/access.log
...
The Logstash terminal shows:
...
{
"project" => {
"id" => "574734885120952459",
"name" => "myproject"
},
"log" => {
"offset" => 3290,
"file" => {
"path" => "/var/log/access.log"
}
},
"fileset" => {
"name" => "access"
},
"ecs" => {
"version" => "1.9.0"
},
"@version" => "1",
"tags" => [
[0] "beats_input_codec_plain_applied",
[1] "_grokparsefailure"
],
"agent" => {
"hostname" => "master",
"type" => "filebeat",
"id" => "57d4937a-5d65-498a-bc50-f8779a11ebcf",
"ephemeral_id" => "977cb9be-c500-41b2-b9b3-64ce579ee7b0",
"version" => "7.13.2",
"name" => "master"
},
"event" => {
"module" => "nginx",
"dataset" => "nginx.access",
"timezone" => "+08:00"
},
"message" => "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
"host" => {
"os" => {
"name" => "CentOS Linux",
"codename" => "Core",
"family" => "redhat",
"kernel" => "3.10.0-862.el7.x86_64",
"type" => "linux",
"version" => "7 (Core)",
"platform" => "centos"
},
"ip" => [
[0] "192.168.100.200",
[1] "fe80::ec53:d68d:60ea:b5e0"
],
"hostname" => "master",
"architecture" => "x86_64",
"id" => "ad8a55213faa46babc18170804417b90",
"mac" => [
[0] "00:0c:29:ae:a5:a7"
],
"containerized" => false,
"name" => "master"
},
"input" => {
"type" => "log"
},
"@timestamp" => 2021-07-17T18:24:44.580Z,
"service" => {
"type" => "nginx"
}
}
...
Check whether Elasticsearch created an index:
[root@master ~]# curl -X GET "192.168.100.200:9200/_cat/indices"
green open logstash-2021.07.17-000001 HqWptzX_RkeGjm91LfF22w 1 1 31 0 80.5kb 40.2kb
logstash-2021.07.17-000001 is an index created automatically by Elasticsearch.
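The documents shipped through the pipeline can also be pulled back out of the index for a quick check; a sketch using the index name shown above:
[root@master ~]# curl -X GET "192.168.100.200:9200/logstash-2021.07.17-000001/_search?size=1&pretty"
Next, download and install Kibana: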
[root@master ~]# curl -L -O https://artifacts.elastic.co/downloads/kibana/kibana-7.13.2-linux-x86_64.tar.gz
[root@master ~]# tar xf kibana-7.13.2-linux-x86_64.tar.gz -C /usr/local/
[root@master ~]# mv /usr/local/kibana-7.13.2-linux-x86_64/ /usr/local/kibana
Edit /usr/local/kibana/config/kibana.yml:
[root@master ~]# vim /usr/local/kibana/config/kibana.yml
server.port: 5601
server.host: "0.0.0.0"
# address and port used to connect to the ES cluster
elasticsearch.hosts: ["http://192.168.100.200:9200"]
# log file path
#logging.dest: stdout
logging.dest: /var/log/kibana/kibana.log
# set the web UI language to Chinese
i18n.locale: "zh-CN"
Create an unprivileged user to run Kibana (it should not run as root):
[root@master ~]# useradd ela
[root@master ~]# mkdir /run/kibana /var/log/kibana
[root@master ~]# chown ela.ela /run/kibana/ /var/log/kibana/ /usr/local/kibana/ -R
[root@master ~]# su - ela
# Run in the foreground:
[ela@master ~]$ /usr/local/kibana/bin/kibana
# Or run in the background:
[ela@master ~]$ nohup /usr/local/kibana/bin/kibana &
# --allow-root is only needed when running as the root user:
[ela@master ~]$ nohup /usr/local/kibana/bin/kibana --allow-root &
Kibana is then reachable in a browser at http://ip:5601
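To keep Kibana running across reboots, a systemd unit along the lines of the Filebeat one above can be added; a minimal sketch assuming the install path and the ela user from this guide:
vim /usr/lib/systemd/system/kibana.service
[Unit]
Description=Kibana web UI for Elasticsearch.
Wants=network-online.target
After=network-online.target
[Service]
User=ela
Group=ela
ExecStart=/usr/local/kibana/bin/kibana
Restart=always
[Install]
WantedBy=multi-user.target
systemctl daemon-reload && systemctl enable kibana --now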