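Review of yesterday's content:
- Logstash multi-instance:
    To run several Logstash processes at the same time you start several JVMs, and each instance needs its own data path (path.data).
- Logstash pipelines:
    Only one Logstash process, and therefore one JVM, is started. If there are several independent processing flows, define a separate pipeline for each.
    If the -e or -f option is used, the pipelines.yml configuration file is ignored.
- Common Logstash filter plugins:
    - grok:
        Matches arbitrary text using regular expressions.
        Key option: match
    - geoip:
        Looks up the geographic location and related information for the client IP address.
        Required option: source
    - date:
        Parses time-related fields.
        Key option: match
    - useragent:
        Analyzes the client's device type and related information.
        Required option: source
    - mutate:
        Transforms data: type conversion, splitting, renaming, stripping whitespace, changing letter case, and so on.
        Common options: split, strip, rename, convert, ...
- Common options shared by Logstash filter plugins:
    - add_field:
        Adds fields.
    - remove_field:
        Removes fields.
    - add_tag:
        Adds tags.
    - remove_tag:
        Removes tags.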
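Preview of today's content:
- Analyze web logs and compute PV (page views) and bandwidth
- Analyze product logs and compute statistics for each product group
- Kibana visualization library, dashboards, RBAC
- ZooKeeper distributed cluster deployment
Requirement: analyze web logs and compute PV and bandwidth
Requirement analysis:
- Source data:
    nginx
- Tools used for collection and analysis:
    filebeat
    logstash
    elasticsearch
    kibana
1) access.log source data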
{"@oldboyedu-timestamp":"2022-08-24T09:21:07+08:00","host":"10.0.0.101","clientip":"221.218.210.53","SendBytes":4833,"responsetime":0.000,"upstreamtime":"-","upstreamhost":"-","http_host":"10.0.0.101","uri":"/index.html","domain":"10.0.0.101","xff":"-","referer":"-","tcp_xff":"-","http_user_agent":"curl/7.29.0","status":"200"}
{"@oldboyedu-timestamp":"2022-08-20T09:21:07+08:00","host":"10.0.0.101","clientip":"221.218.210.51","SendBytes":14833,"responsetime":0.000,"upstreamtime":"-","upstreamhost":"-","http_host":"10.0.0.101","uri":"/oldboyedu.html","domain":"10.0.0.101","xff":"-","referer":"-","tcp_xff":"-","http_user_agent":"curl/7.29.0","status":"404"}
{"@oldboyedu-timestamp":"2022-08-20T19:21:07+08:00","host":"10.0.0.101","clientip":"221.218.210.61","SendBytes":24833,"responsetime":0.000,"upstreamtime":"-","upstreamhost":"-","http_host":"10.0.0.101","uri":"/oldboyedu.html","domain":"10.0.0.101","xff":"-","referer":"-","tcp_xff":"-","http_user_agent":"curl/7.29.0","status":"302"}
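As a rough cross-check for what the dashboard should show later, the three sample lines above can be aggregated by hand. The sketch below is not part of the ELK pipeline. It assumes that PV is simply the number of log lines and that bandwidth is the sum of SendBytes; the name summarize and the trimmed-down sample lines are made up for illustration.

```python
import json

# The three sample access.log lines from above, trimmed to the fields used here.
SAMPLE_LINES = [
    '{"clientip":"221.218.210.53","SendBytes":4833,"uri":"/index.html","status":"200"}',
    '{"clientip":"221.218.210.51","SendBytes":14833,"uri":"/oldboyedu.html","status":"404"}',
    '{"clientip":"221.218.210.61","SendBytes":24833,"uri":"/oldboyedu.html","status":"302"}',
]


def summarize(lines):
    """Return (pv, total_bytes, pv_per_uri) for JSON-formatted access log lines."""
    pv = 0
    total_bytes = 0
    pv_per_uri = {}
    for line in lines:
        event = json.loads(line)
        pv += 1  # assumption: one log line counts as one page view
        total_bytes += int(event["SendBytes"])
        pv_per_uri[event["uri"]] = pv_per_uri.get(event["uri"], 0) + 1
    return pv, total_bytes, pv_per_uri


pv, total_bytes, pv_per_uri = summarize(SAMPLE_LINES)
print(pv, total_bytes, pv_per_uri)
```

In Kibana the same numbers would normally come from a count metric and a sum aggregation on SendBytes.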
2) filebeat configuration file
cat > config/23-nginx-to-logstash.yaml <<EOF
filebeat.inputs:
- type: log
  paths:
    - /var/log/nginx/access.log*
  json.keys_under_root: true
output.logstash:
  hosts: ["10.0.0.103:9999"]
EOF
3) logstash configuration file
cat > config/14-nginx-project-es.conf <<EOF
input {
  beats {
    port => 9999
  }
}
filter {
  geoip {
    source => "clientip"
    remove_field => [
      "tags",
      "input",
      "ecs",
      "@version",
      "agent"
    ]
  }
  date {
    match => [
      "@oldboyedu-timestamp",
      "yyyy-MM-dd'T'HH:mm:ssZ"
    ]
  }
  useragent {
    source => "http_user_agent"
    target => "oldboyedu-linux82-useragent"
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => ["10.0.0.101:9200","10.0.0.102:9200","10.0.0.103:9200"]
    index => "oldboyedu-linux82-project-nginx"
  }
}
EOF
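The date filter above parses @oldboyedu-timestamp and, by default, writes the result to @timestamp, which is stored in UTC. The sketch below only illustrates that time zone normalization in plain Python; it is not how Logstash implements it. The helper name to_utc_timestamp is hypothetical, and Python 3.7 or later is assumed so that %z accepts an offset written with a colon.

```python
from datetime import datetime, timezone


def to_utc_timestamp(raw):
    """Parse an ISO 8601 string with a UTC offset and normalize it to UTC."""
    local = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")
    return local.astimezone(timezone.utc)


# First sample line from access.log: 09:21:07 at +08:00.
ts = to_utc_timestamp("2022-08-24T09:21:07+08:00")
print(ts.isoformat())
```

This is why an event logged at 09:21 Beijing time appears as 01:21 in the raw @timestamp value, while Kibana converts it back to the browser's time zone for display.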
Additional notes on filebeat processors:
1) Local (per-input) processors
cat > config/16-input_common_options-to-console.yaml <<EOF
filebeat.inputs:
- type: tcp
  host: 0.0.0.0:8888
  enabled: false
  tags: ["oldboyedu-linux82-tcp","oldboyedu-elk","oldboyedu-linux"]
  fields:
    school: oldboyedu
    class: linux82
  fields_under_root: true
- type: log
  enabled: true
  paths:
    - /tmp/oldboyedu-linux82/*
  processors:
    - drop_event:
        when:
          contains:
            message: "info:"
output.console:
  pretty: true
EOF
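The drop_event processor with a contains condition discards every event whose message field contains the given substring. A minimal Python sketch of that behavior, with a made-up helper name and made-up sample events, might look like this:

```python
def drop_event_when_contains(events, field, needle):
    """Keep only the events whose `field` does not contain `needle`."""
    return [e for e in events if needle not in str(e.get(field, ""))]


# Hypothetical events as they might be read from /tmp/oldboyedu-linux82/*.
events = [
    {"message": "info: service started"},
    {"message": "error: disk full"},
]
kept = drop_event_when_contains(events, "message", "info:")
print(kept)
```

Because the processor is declared under the log input, it only affects events from that input; events from other inputs would pass through untouched.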
2) Global processors
cat > config/24-filebeat-to-es.yaml <<EOF
filebeat.inputs:
- type: log
  paths:
    - /tmp/apps.log
# Custom global processors; reference:
# https://www.elastic.co/guide/en/beats/filebeat/7.17/defining-processors.html
processors:
  # Drop fields
  - drop_fields:
      fields: ["log", "input", "ecs","agent","host"]
output.console:
  pretty: true
EOF
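To make the effect of drop_fields concrete, here is a small Python sketch that removes the same top-level fields from a dictionary. The event shown is hypothetical and only loosely resembles what filebeat emits; drop_fields here is a plain function written for illustration, not a filebeat API.

```python
def drop_fields(event, fields):
    """Return a copy of the event without the listed top-level fields."""
    return {key: value for key, value in event.items() if key not in fields}


# Hypothetical event with the metadata fields that the config above removes.
event = {
    "@timestamp": "2022-08-24T01:21:07.000Z",
    "message": "hello",
    "log": {"file": {"path": "/tmp/apps.log"}},
    "input": {"type": "log"},
    "ecs": {"version": "1.12.0"},
    "agent": {"type": "filebeat"},
    "host": {"name": "elk103"},
}
slim = drop_fields(event, ["log", "input", "ecs", "agent", "host"])
print(slim)
```

Dropping these fields on the filebeat side means less data travels over the network and the downstream output has less to clean up.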
Geo-location coordinates example:
1) Create the index mapping
PUT http://10.0.0.103:9200/oldboyedu-map
{
"mappings": {
"properties": {
"geoip": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
2) Write a geo location (lat is the latitude, lon is the longitude)
{
"location": {
"lat": 39.914,
"lon": 116.386
}
}
3) Write geo locations in bulk
{ "create" : { "_index" : "oldboyedu-map" } }
{ "location": { "lat": 24,"lon": 121 }}
{ "create" : { "_index" : "oldboyedu-map" } }
{ "location": { "lat": 36.61,"lon": 114.488 }}
{ "create" : { "_index" : "oldboyedu-map" } }
{ "location": { "lat": 39.914,"lon": 116.386 }}
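Fault drill: fixing latitude/longitude parsing for the nginx logs:
1) Create the nginx index with a geo_point mapping for geoip.location (the type of an existing field cannot be changed in place, so delete the index first if it already exists)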
PUT http://10.0.0.103:9200/oldboyedu-linux82-project-nginx
{
"mappings": {
"properties": {
"geoip": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
}
}
2) Bulk-create test geo-location data
POST http://10.0.0.103:9200/_bulk
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 25,"lon": 121 }}
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 35.61,"lon": 114.488 }}
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 35.914,"lon": 116.386 }}
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 45.914,"lon": 118.386 }}
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 55.914,"lon": 126.386 }}
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 75.914,"lon": 26.386 }}
{ "create" : { "_index" : "oldboyedu-linux82-project-nginx" } }
{ "geoip.location": { "lat": 85.914,"lon": 16.386 }}
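The _bulk request body is newline-delimited JSON: one action line followed by one document line per record, and the body must end with a newline. Writing these pairs by hand gets tedious, so a small Python sketch that generates them from a list of coordinates could look like the following. The function name build_bulk_body is hypothetical, and sending the body (with Content-Type: application/x-ndjson) is left out.

```python
import json


def build_bulk_body(index, points):
    """Build an NDJSON _bulk body with one create action per (lat, lon) pair."""
    lines = []
    for lat, lon in points:
        lines.append(json.dumps({"create": {"_index": index}}))
        lines.append(json.dumps({"geoip.location": {"lat": lat, "lon": lon}}))
    # The bulk API requires the final line to be terminated by a newline too.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "oldboyedu-linux82-project-nginx",
    [(25, 121), (35.61, 114.488)],
)
print(body, end="")
```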
cat 26-filebeat-pipline.yaml
filebeat.inputs:
- type: log
  paths:
    - /tmp/apps.log
  fields:
    type: apps
  fields_under_root: true
- type: log
  paths:
    - /var/log/nginx/access.log*
  json.keys_under_root: true
  fields:
    type: nginx
  fields_under_root: true
processors:
  - drop_fields:
      fields: ["log", "input", "ecs","agent","host"]
output.logstash:
  hosts: ["10.0.0.103:9999"]
------------------------------------------------
cat config/16-logstash-pipline.conf
input {
  beats {
    port => 9999
  }
}
filter {
  mutate {
    remove_field => ["tags","@version"]
  }
  if [type] == "nginx" {
    geoip {
      source => "clientip"
    }
    date {
      match => [
        "@oldboyedu-timestamp",
        "yyyy-MM-dd'T'HH:mm:ssZ"
      ]
    }
    useragent {
      source => "http_user_agent"
      target => "oldboyedu-linux82-useragent"
    }
  } else {
    mutate {
      split => { "message" => "|" }
      add_field => {
        user_id => "%{[message][1]}"
        verb => "%{[message][2]}"
        svip => "%{[message][3]}"
        price => "%{[message][4]}"
      }
    }
    mutate {
      rename => { "verb" => "action" }
      strip => ["svip"]
      remove_field => ["message"]
    }
    mutate {
      convert => {
        "user_id" => "integer"
        "svip" => "boolean"
        "price" => "float"
      }
    }
  }
}
output {
  stdout {}
  if [type] == "nginx" {
    elasticsearch {
      hosts => ["10.0.0.101:9200","10.0.0.102:9200","10.0.0.103:9200"]
      index => "oldboyedu-linux82-project-nginx"
    }
  } else {
    elasticsearch {
      hosts => ["10.0.0.101:9200","10.0.0.102:9200","10.0.0.103:9200"]
      index => "oldboyedu-linux82-project-apps"
    }
  }
}
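The else branch above chains three mutate filters: split the message on "|", copy elements 1 to 4 into named fields, strip whitespace from svip, rename verb to action, and convert the types. The Python sketch below mirrors those steps for one line. The sample line and its field layout are assumptions, since the notes do not show the format of /tmp/apps.log, and the boolean conversion is simplified to the literal "true".

```python
def parse_app_line(message):
    """Mimic the split / add_field / rename / strip / convert chain for one line."""
    parts = message.split("|")
    return {
        "user_id": int(parts[1]),                     # convert => integer
        "action": parts[2],                           # verb renamed to action
        "svip": parts[3].strip().lower() == "true",   # strip, then convert => boolean
        "price": float(parts[4]),                     # convert => float
    }


# Hypothetical log line; element 0 is assumed to be a timestamp that is not extracted.
event = parse_app_line("2022-08-24 10:00:00|1001|add_cart| true |19.9")
print(event)
```

Without the strip step the padded value " true " would not match a plain "true" comparison, which is why the config strips svip before converting it.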
cat pipelines.yml
- pipeline.id: oldboyedu-linux
  config.string: |
    input {
      generator {
        lines => [
          "line 1 ---> 老男孩",
          "line 2 ---> 苍老师"
        ]
        # Emit all lines 3 times.
        count => 3
      }
    }
    filter {
      sleep { time => 3 }
    }
    output {
      pipeline {
        send_to => ["oldboyedu-linux82"]
      }
    }
- pipeline.id: oldboyedu-python
  config.string: |
    input {
      pipeline {
        address => "oldboyedu-linux82"
      }
    }
    filter {
      sleep {
        time => 2
      }
    }
    output {
      stdout {}
    }
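The two pipelines above are connected through a virtual address: the first sends events to "oldboyedu-linux82" and the second reads from it. As a loose analogy only, the Python sketch below uses an in-memory queue in place of that address. The sleeps are left out, the None sentinel is a device of the sketch rather than anything Logstash does, and the emit order (all lines, repeated count times) is an assumption about the generator input.

```python
from queue import Queue

LINES = ["line 1", "line 2"]
COUNT = 3  # mirrors count => 3 in the generator input


def upstream(bus):
    """Emit every line COUNT times and send the events to the shared address."""
    for sequence in range(COUNT):
        for line in LINES:
            bus.put({"message": line, "sequence": sequence})
    bus.put(None)  # sentinel so the sketch can terminate


def downstream(bus):
    """Read events from the shared address until the sentinel arrives."""
    received = []
    while True:
        event = bus.get()
        if event is None:
            break
        received.append(event)
    return received


bus = Queue()  # stands in for the virtual address "oldboyedu-linux82"
upstream(bus)
received = downstream(bus)
print(len(received))
```

With two lines and a count of 3, the downstream side sees six events in total.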
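Drawbacks of using nginx as a reverse proxy in front of Kibana:
(1) If Kibana is not listening on a loopback address (127.0.0.0/8), users can reach Kibana directly by its IP and bypass the nginx authentication;
(2) nginx is awkward for multi-tenant management, and its access control is limited;
(3) nginx does not support role-based management, that is, RBAC (role-based access control);
RBAC
Review of today's content:
- Basic use of Kibana's visualization library, maps, and dashboards;
- Kibana RBAC
- ES encryption
- Hands-on project with filebeat + logstash + elasticsearch + kibana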