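graylog is an open-source tool for log aggregation, analysis, auditing, visualization, and alerting. It is similar to ELK but simpler. This article covers how to deploy and use graylog, and gives a brief walkthrough of how it works.
This is a fairly long article. It uses three machines, which host a kafka cluster (2.3), an es cluster (7.11.2), a MongoDB replica set (4.2), and a graylog cluster (4.0.2). The logs collected are k8s logs, shipped to kafka by filebeat (7.11.2) running as a DaemonSet. Starting from deployment, the article walks step by step through how graylog is deployed and its basic usage.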
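Introduction to graylog
Components
As the architecture diagram shows, graylog consists of three parts:
- mongodb stores the configuration made in the graylog console, the graylog cluster state, and some metadata
- es stores the log data and serves searches
- graylog acts as the intermediary between them
The roles of mongodb and es are clear enough, so the focus here is on graylog's own components and what they do.
- Inputs: log data sources. Logs can be collected through graylog's Sidecar, or pushed by beats, syslog, and similar senders
- Extractors: log format conversion, mainly json parsing, kv parsing, timestamp parsing, and regex parsing
- Streams: log classification. Rules decide which index each log message is sent to
- Indices: persistent storage. Configure the index name, retention policy, number of shards, number of replicas, flush interval, and so on
- Outputs: log forwarding. A parsed Stream can be sent on to other graylog clusters
- Pipelines: log filtering. Define data-cleaning rules, add or remove fields, filter by condition, and apply custom functions
- Sidecar: a lightweight log collector
- LookupTables: lookups, such as IP-based Whois queries and threat intelligence on source IPs
- Geolocation: geographic visualization, based on the source IP
Workflow
Graylog collects logs through Inputs. Here, for example, logs arrive through kafka, redis, or directly from filebeat. Extractors are then configured on the Input to extract and convert fields in the log. Several Extractors can be configured, and they run in order. After that, the system routes each message into a Stream according to the matching rules defined on that Stream. A Stream can specify its index location, and the message is then stored in that es index. Once this is done, the logs can be viewed in the console by selecting the Stream name.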
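The Input, Extractor, Stream, index flow described above can be pictured with a small toy model. This is only a sketch for intuition and has nothing to do with graylog's real implementation: the function names, the `STREAMS` table, and the `service` and `level` fields are all hypothetical.

```python
import json
import re


def json_extractor(message):
    """Extractor 1: parse the raw `message` field as JSON and merge its keys."""
    try:
        parsed = json.loads(message["message"])
    except (KeyError, TypeError, ValueError):
        return message  # not JSON: leave the message untouched
    if isinstance(parsed, dict):
        message = {**message, **parsed}
    return message


def level_extractor(message):
    """Extractor 2: pull a log level out of free text with a regular expression."""
    match = re.search(r"\b(DEBUG|INFO|WARN|ERROR)\b", str(message.get("message", "")))
    if match and "level" not in message:
        message = {**message, "level": match.group(1)}
    return message


# Hypothetical stream definitions: (stream name, index prefix, matching rule).
STREAMS = [
    ("k8s-errors", "k8s_errors", lambda m: m.get("level") == "ERROR"),
    ("k8s-all", "k8s_all", lambda m: m.get("service") == "k8s-log"),
]


def route(raw, extractors=(json_extractor, level_extractor)):
    """Run the extractors in order, then list every (stream, index) that matches."""
    message = dict(raw)
    for extractor in extractors:
        message = extractor(message)
    targets = [(name, index) for name, index, rule in STREAMS if rule(message)]
    return message, targets
```

A message can match more than one stream, in which case it is listed for each of them, which mirrors how a single log line can show up under several Streams in the console.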
Install mongodb
Following the official documentation, version 4.2.x is installed
Time synchronization
Install ntpdate
yum install ntpdate -y
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
Add it to the scheduled tasks
# crontab -e
5 * * * * ntpdate -u ntp.ntsc.ac.cn
Configure the repository and install
vim /etc/yum.repos.d/mongodb-org.repo
[mongodb-org-4.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.2/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.2.asc
Then install
yum makecache
yum -y install mongodb-org
Then start the service
systemctl daemon-reload
systemctl enable mongod.service
systemctl start mongod.service
systemctl --type=service --state=active | grep mongod
Edit the configuration file to set up the replica set
# vim /etc/mongod.conf
# mongod.conf
# for documentation of all options, see:
# http://docs.mongodb.org/manual/reference/configuration-options/
# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:
# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.
#security:
#operationProfiling:
replication:
  replSetName: graylog-rs  # name of the replica set
#sharding:
## Enterprise-Only Options
#auditLog:
#snmp:
Initialize the replica set
> use admin;
switched to db admin
> rs.initiate( {
... _id : "graylog-rs",
... members: [
... { _id: 0, host: "10.0.105.74:27017"},
... { _id: 1, host: "10.0.105.76:27017"},
... { _id: 2, host: "10.0.105.96:27017"}
... ]
... })
{
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1615885669, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1615885669, 1)
}
Check the replica set status
If all goes well, the cluster will have two roles, Primary and Secondary. You can check with:
rs.status()
It returns a lot of information, for example:
"members" : [
{
"_id" : 0,
"name" : "10.0.105.74:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 623,
"optime" : {
"ts" : Timestamp(1615885829, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2021-03-16T09:10:29Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1615885679, 1),
"electionDate" : ISODate("2021-03-16T09:07:59Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
},
{
"_id" : 1,
"name" : "10.0.105.76:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 162,
"optime" : {
"ts" : Timestamp(1615885829, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1615885829, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2021-03-16T09:10:29Z"),
"optimeDurableDate" : ISODate("2021-03-16T09:10:29Z"),
"lastHeartbeat" : ISODate("2021-03-16T09:10:31.690Z"),
"lastHeartbeatRecv" : ISODate("2021-03-16T09:10:30.288Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "10.0.105.74:27017",
"syncSourceHost" : "10.0.105.74:27017",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "10.0.105.96:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 162,
"optime" : {
"ts" : Timestamp(1615885829, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1615885829, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2021-03-16T09:10:29Z"),
"optimeDurableDate" : ISODate("2021-03-16T09:10:29Z"),
"lastHeartbeat" : ISODate("2021-03-16T09:10:31.690Z"),
"lastHeartbeatRecv" : ISODate("2021-03-16T09:10:30.286Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "10.0.105.74:27017",
"syncSourceHost" : "10.0.105.74:27017",
"syncSourceId" : 0,
"infoMessage" : "",
"configVersion" : 1
}
]
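Reading that output by eye gets tedious, so the check can be reduced to a few lines. The following is a minimal sketch that assumes the `rs.status()` result is already available as a Python dictionary (for example, exported as JSON); `summarize_replica_set` is a hypothetical helper, not part of any MongoDB tooling.

```python
def summarize_replica_set(status):
    """Return (healthy, members grouped by stateStr) for an rs.status()-style dict.

    "Healthy" here means exactly one PRIMARY, every other member SECONDARY,
    and every member reporting health == 1.
    """
    members = status.get("members", [])
    by_state = {}
    for member in members:
        state = member.get("stateStr", "UNKNOWN")
        by_state.setdefault(state, []).append(member.get("name"))
    unhealthy = [m.get("name") for m in members if m.get("health") != 1]
    healthy = (
        len(by_state.get("PRIMARY", [])) == 1
        and len(by_state.get("SECONDARY", [])) == len(members) - 1
        and not unhealthy
    )
    return healthy, by_state
```

For the output shown above, this would report one PRIMARY (10.0.105.74:27017) and two SECONDARY members.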
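Create users
Run these on one node only, the PRIMARY (secondaries reject writes)
use admin
db.createUser({user: "admin", pwd: "Root_1234", roles: ["root"]})
db.auth("admin","Root_1234")
Stay in the shell and create a second user for graylog to connect with
use graylog
db.createUser({
  "user" : "graylog",
  "pwd" : "<password>",  // placeholder: the password graylog will use in mongodb_uri
  "roles" : [{
    "role" : "dbOwner",
    "db" : "graylog"
  }, {
    "role" : "readWrite",
    "db" : "graylog"
  }]
})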
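Generate the keyFile
openssl rand -base64 756 > /var/lib/mongo/access.key
Fix the permissions
chown -R mongod:mongod /var/lib/mongo/access.key
chmod 600 /var/lib/mongo/access.key
Once the key is generated, copy it to the other two machines and set the same ownership and permissions there
scp -r /var/lib/mongo/access.key 10.0.105.76:/var/lib/mongo/
scp -r /var/lib/mongo/access.key 10.0.105.96:/var/lib/mongo/
After copying, edit the configuration file
# vim /etc/mongod.conf
# add the following settings
security:
  keyFile: /var/lib/mongo/access.key
  authorization: enabled
All three machines need this configuration. Then restart the service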
systemctl restart mongod
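Then log in and verify two things
- authentication succeeds
- the replica set status is healthy
If both check out, the mongodb 4.2 replica set installed via yum is ready. Next, deploy the es cluster
Deploy the es cluster
The es version is the latest at the time of writing: 7.11.x
System tuning
- Kernel parameter tuning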
# vim /etc/sysctl.conf
fs.file-max=655360
vm.max_map_count=655360
vm.swappiness = 0
- Adjust limits
# vim /etc/security/limits.conf
* soft nproc 655350
* hard nproc 655350
* soft nofile 655350
* hard nofile 655350
* hard memlock unlimited
* soft memlock unlimited
- Add an unprivileged user
es refuses to start as root, so it needs a regular user
groupadd es
useradd -g es es
echo 123456 | passwd es --stdin
- Install the JDK
yum install -y java-1.8.0-openjdk-devel.x86_64
Set the environment variables
# vim /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-1.el7_9.x86_64/
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
Upload the archive
es download URL: https://artifacts.elastic.co/...
Extract
tar zxvf elasticsearch-7.11.2-linux-x86_64.tar.gz -C /usr/local/
Fix ownership
chown -R es:es /usr/local/elasticsearch-7.11.2
Edit the es configuration
Cluster settings
# vim /usr/local/elasticsearch-7.11.2/config/elasticsearch.yml
cluster.name: graylog-cluster
node.name: node03
path.data: /data/elasticsearch/data
path.logs: /data/elasticsearch/logs
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["10.0.105.74","10.0.105.76","10.0.105.96"]
cluster.initial_master_nodes: ["10.0.105.74","10.0.105.76"]
http.cors.enabled: true
http.cors.allow-origin: "*"
Adjust the JVM heap size in config/jvm.options, setting it to half of the host's memory
-Xms16g
-Xmx16g
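The half-of-host-memory rule can be written down as a tiny helper. This is a sketch, not Elasticsearch tooling: `jvm_heap_options` is a hypothetical name, and the 31 GB cap is an assumption added here, reflecting the common advice to keep the heap below roughly 32 GB so the JVM can keep using compressed object pointers.

```python
def recommended_heap_gb(host_memory_gb, cap_gb=31):
    """Half of host memory, at least 1 GB, and never above cap_gb (assumed cap)."""
    return max(1, min(host_memory_gb // 2, cap_gb))


def jvm_heap_options(host_memory_gb):
    """Return the two jvm.options lines, with Xms and Xmx set to the same value."""
    heap = recommended_heap_gb(host_memory_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]
```

For a 32 GB host this yields the same two lines used in this article.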
Manage the service with systemd
# vim /usr/lib/systemd/system/elasticsearch.service
[Unit]
Description=elasticsearch server daemon
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=es
Group=es
LimitMEMLOCK=infinity
LimitNOFILE=655350
LimitNPROC=655350
ExecStart=/usr/local/elasticsearch-7.11.2/bin/elasticsearch
Restart=always
[Install]
WantedBy=multi-user.target
Start the service and enable it at boot
systemctl daemon-reload
systemctl enable elasticsearch
systemctl start elasticsearch
A quick check
# curl -XGET http://127.0.0.1:9200/_cluster/health?pretty
{
"cluster_name" : "graylog-cluster",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 1,
"active_shards" : 2,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
At this point es is installed
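The health check above can also be scripted. The sketch below assumes the JSON body returned by `_cluster/health` has already been fetched (for example with curl) and only inspects it; `check_cluster_health` and its `expected_nodes` parameter are hypothetical names introduced for illustration.

```python
import json


def check_cluster_health(body, expected_nodes=3):
    """Return a list of problems found in a _cluster/health JSON response."""
    health = json.loads(body)
    problems = []
    if health.get("status") != "green":
        problems.append(f"status is {health.get('status')}, expected green")
    if health.get("number_of_nodes") != expected_nodes:
        problems.append(
            f"{health.get('number_of_nodes')} nodes joined, expected {expected_nodes}"
        )
    if health.get("unassigned_shards", 0) > 0:
        problems.append(f"{health['unassigned_shards']} unassigned shards")
    return problems
```

An empty list means the three-node cluster looks the way the sample output does: green, three nodes, no unassigned shards.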
Deploy the kafka cluster
My machines are reused and already have a java environment installed, so that step is not repeated here
Download the packages
kafka: https://www.dogfei.cn/pkgs/kafka_2.12-2.3.0.tgz
zookeeper: https://www.dogfei.cn/pkgs/apache-zookeeper-3.6.0-bin.tar.gz
Extract
tar zxvf kafka_2.12-2.3.0.tgz -C /usr/local/
tar zxvf apache-zookeeper-3.6.0-bin.tar.gz -C /usr/local/
Edit the configuration files
kafka
# grep -v -E "^#|^$" /usr/local/kafka_2.12-2.3.0/config/server.properties
broker.id=1
listeners=PLAINTEXT://10.0.105.74:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/data/kafka/data
num.partitions=8
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
message.max.bytes=20971520
log.retention.hours=1
log.retention.bytes=1073741824
log.segment.bytes=536870912
log.retention.check.interval.ms=300000
zookeeper.connect=10.0.105.74:2181,10.0.105.76:2181,10.0.105.96:2181
zookeeper.connection.timeout.ms=1000000
zookeeper.sync.time.ms=2000
group.initial.rebalance.delay.ms=0
log.cleaner.enable=true
delete.topic.enable=true
zookeeper
# grep -v -E "^#|^$" /usr/local/apache-zookeeper-3.6.0-bin/conf/zoo.cfg
tickTime=10000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper/data
clientPort=2181
admin.serverPort=8888
server.1=10.0.105.74:22888:33888
server.2=10.0.105.76:22888:33888
server.3=10.0.105.96:22888:33888
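Don't forget to create the directories referenced above, and on each zookeeper node write that node's id (1, 2, or 3, matching its server.N line) into /data/zookeeper/data/myid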
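Only a few values differ from machine to machine: the `server.N` lines and the matching `myid` for zookeeper, and `broker.id` plus `listeners` for kafka. The sketch below derives them from a host list. It is illustrative only: the function names are hypothetical, and numbering the kafka brokers 1..N in host order is an assumption (the article only shows `broker.id=1` for 10.0.105.74; kafka just requires the ids to be unique).

```python
def zookeeper_layout(hosts, peer_port=22888, election_port=33888):
    """Return (zoo.cfg server lines, {host: contents of its myid file})."""
    server_lines = []
    myids = {}
    for server_id, host in enumerate(hosts, start=1):
        server_lines.append(f"server.{server_id}={host}:{peer_port}:{election_port}")
        myids[host] = str(server_id)
    return server_lines, myids


def kafka_overrides(hosts, port=9092):
    """Return the per-broker settings that must differ on each host."""
    return {
        host: {"broker.id": str(broker_id), "listeners": f"PLAINTEXT://{host}:{port}"}
        for broker_id, host in enumerate(hosts, start=1)
    }
```

Calling both helpers with the three addresses used in this article reproduces the `server.1` to `server.3` lines from zoo.cfg and the `broker.id` and `listeners` values shown for the first broker.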
Register the services with systemd
kafka
# cat /usr/lib/systemd/system/kafka.service
[Unit]
Description=Kafka
After=zookeeper.service
[Service]
Type=simple
Environment=LOG_DIR=/data/kafka/logs
WorkingDirectory=/usr/local/kafka_2.12-2.3.0
ExecStart=/usr/local/kafka_2.12-2.3.0/bin/kafka-server-start.sh /usr/local/kafka_2.12-2.3.0/config/server.properties
ExecStop=/usr/local/kafka_2.12-2.3.0/bin/kafka-server-stop.sh
Restart=always
[Install]
WantedBy=multi-user.target
zookeeper
# cat /usr/lib/systemd/system/zookeeper.service
[Unit]
Description=zookeeper.service
After=network.target
[Service]
Type=forking
Environment=ZOO_LOG_DIR=/data/zookeeper/logs
ExecStart=/usr/local/apache-zookeeper-3.6.0-bin/bin/zkServer.sh start
ExecStop=/usr/local/apache-zookeeper-3.6.0-bin/bin/zkServer.sh stop
Restart=always
[Install]
WantedBy=multi-user.target
Start the services
systemctl daemon-reload
systemctl start zookeeper
systemctl start kafka
systemctl enable zookeeper
systemctl enable kafka
Deploy filebeat
Since the logs being collected are k8s logs, filebeat is deployed as a DaemonSet. An example follows:
DaemonSet reference
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: filebeat
  name: filebeat
  namespace: default
spec:
  selector:
    matchLabels:
      app: filebeat
  template:
    metadata:
      labels:
        app: filebeat
      name: filebeat
    spec:
      affinity: {}
      containers:
      - args:
        - -e
        - -E
        - http.enabled=true
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: docker.elastic.co/beats/filebeat:7.11.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - |
              #!/usr/bin/env bash -e
              curl --fail 127.0.0.1:5066
          failureThreshold: 3
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: filebeat
        resources:
          limits:
            cpu: "1"
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        securityContext:
          privileged: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/share/filebeat/filebeat.yml
          name: filebeat-config
          readOnly: true
          subPath: filebeat.yml
        - mountPath: /usr/share/filebeat/data
          name: data
        - mountPath: /opt/docker/containers/
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /var/log
          name: varlog
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: filebeat
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      tolerations:
      - operator: Exists
      volumes:
      - configMap:
          defaultMode: 384
          name: filebeat-daemonset-config
        name: filebeat-config
      - hostPath:
          path: /opt/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
        name: data
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
ConfigMap reference
apiVersion: v1
data:
  filebeat.yml: |
    filebeat.inputs:
    - type: container
      paths:
      - /var/log/containers/*.log
      # merge multi-line entries into one event
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
      multiline.timeout: 30
      fields:
        # custom field that lets downstream consumers (e.g. logstash) identify logs coming from k8s
        service: k8s-log
      # do not collect the host.xxxx fields
      #publisher_pipeline.disable_host: true
    processors:
    - add_kubernetes_metadata:
        # add the k8s metadata fields
        default_indexers.enabled: true
        default_matchers.enabled: true
        host: ${NODE_NAME}
        matchers:
        - logs_path:
            logs_path: "/var/log/containers/"
    - drop_fields:
        # redundant fields to drop
        fields: ["host", "tags", "ecs", "log", "prospector", "agent", "input", "beat", "offset"]
        ignore_missing: true
    output.kafka:
      hosts: ["10.0.105.74:9092","10.0.105.76:9092","10.0.105.96:9092"]
      topic: "dev-k8s-log"
      compression: gzip
      max_message_bytes: 1000000
kind: ConfigMap
metadata:
  labels:
    app: filebeat
  name: filebeat-daemonset-config
  namespace: default
Then apply both manifests and wait for the pods to start
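The three multiline settings in the ConfigMap are easy to misread, so here is a small illustration of what `negate: true` combined with `match: after` does: any line that does not start with a date is appended to the previous line that does. This is a plain-Python approximation for intuition, not filebeat's code, and `group_multiline` is a hypothetical name.

```python
import re


def group_multiline(lines, pattern=r"^[0-9]{4}-[0-9]{2}-[0-9]{2}"):
    """Merge continuation lines into the preceding event.

    A line matching `pattern` starts a new event; a non-matching line is
    appended to the event before it (negate=true, match=after).
    """
    starts_event = re.compile(pattern)
    events = []
    for line in lines:
        if starts_event.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events
```

With this grouping, a Java stack trace that follows a dated log line stays attached to that line instead of arriving in kafka as many separate messages.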
Deploy the graylog cluster
Import the repository rpm
rpm -Uvh https://packages.graylog2.org/repo/packages/graylog-4.0-repository_latest.rpm
Install
yum install graylog-server -y
Start the service and enable it at boot
systemctl enable graylog-server
systemctl start graylog-server
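Generate the secrets
Generate two secrets for the configuration file: root_password_sha2 (the SHA-256 hash of the admin password) and password_secret (a random string; pwgen is one way to produce it)
# echo -n "Enter Password: " && head -1 </dev/stdin | tr -d '\n' | sha256sum | cut -d" " -f1
# pwgen -N 1 -s 96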
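The same two values can be produced without shell pipelines. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the 96-character length simply mirrors a typical random-secret length rather than anything graylog mandates.

```python
import hashlib
import secrets
import string


def root_password_sha2(password):
    """Return the hex SHA-256 digest that goes into root_password_sha2."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def generate_password_secret(length=96):
    """Return a random alphanumeric string suitable for password_secret."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

As a sanity check, `root_password_sha2("123456")` gives the hash that appears in the sample server.conf in this article, so a real deployment should pick a stronger admin password.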
Edit the configuration file
# vim /etc/graylog/server/server.conf
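# whether this node is the master: set to true on the master node; a cluster has exactly one master
is_master = false
node_id_file = /etc/graylog/server/node-id
# the password_secret generated above; it must be identical on every graylog node
password_secret = iMh21uM57Pt2nMHDicInjPvnE8o894AIs7rJj9SW
# the SHA-256 hash of the admin password generated above
root_password_sha2 = 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92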
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
http_publish_uri = http://10.0.105.96:9000/
web_enable = true
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 2
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 5000
output_flush_interval = 120
output_fault_count_threshold = 8
output_fault_penalty_seconds = 120
processbuffer_processors = 20
outputbuffer_processors = 40
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://graylog:<password>@10.0.105.74:27017,10.0.105.76:27017,10.0.105.96:27017/graylog?replicaSet=graylog-rs
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
content_packs_dir = /usr/share/graylog-server/contentpacks
content_packs_auto_load = grok-patterns.json
proxied_requests_thread_pool_size = 32
elasticsearch_hosts = http://10.0.105.74:9200,http://10.0.105.76:9200,http://10.0.105.96:9200
elasticsearch_discovery_enabled = true
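Pay attention to how mongodb and es are addressed here. Both are deployed as clusters in this setup, so the cluster connection forms are used (<password> stands for the graylog user's MongoDB password)
mongodb_uri = mongodb://graylog:<password>@10.0.105.74:27017,10.0.105.76:27017,10.0.105.96:27017/graylog?replicaSet=graylog-rs
elasticsearch_hosts = http://10.0.105.74:9200,http://10.0.105.76:9200,http://10.0.105.96:9200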
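A replica-set URI like the one above breaks quietly when the password contains characters such as `@` or `/`, because they have to be percent-encoded. The sketch below assembles the URI with the standard library; `build_mongodb_uri` and the example password are hypothetical and only illustrate the encoding step.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(user, password, hosts, database="graylog",
                      replica_set="graylog-rs", port=27017):
    """Build a replica-set connection string with percent-encoded credentials."""
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    host_list = ",".join(f"{host}:{port}" for host in hosts)
    return f"mongodb://{credentials}@{host_list}/{database}?replicaSet={replica_set}"
```

The resulting string can be pasted as the value of `mongodb_uri` in server.conf.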
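That completes the deployment. The remaining configuration happens in the graylog console, but first graylog has to be exposed through a proxy. nginx can do this; a reference nginx configuration: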
user nginx;
worker_processes 2;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events {
worker_connections 65535;
}
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
include /etc/nginx/mime.types;
default_type application/octet-stream;
include /etc/nginx/conf.d/*.conf;
upstream graylog_servers {
server 10.0.105.74:9000;
server 10.0.105.76:9000;
server 10.0.105.96:9000;
}
server {
listen 80 default_server;
server_name graylog.example.org; # placeholder: set your own domain name here
location / {
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-Server $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Graylog-Server-URL http://$server_name/;
proxy_pass http://graylog_servers;
}
}
}
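When that is done, restart nginx and open the site in a browser. The username is admin, and the password is the one whose SHA-256 hash was set in root_password_sha2
Ingest logs into graylog
Configure the input
System --> Inputs
Raw/Plaintext Kafka ---> Launch new input
Set the kafka and zookeeper addresses and the topic name, then save
All of them should be in the running state
Create an index set
System/Indices
Set the index details: index name, number of replicas, number of shards, retention policy, and rotation strategy
Create Streams
Save, and that's it. Then go to the home page to see the logs
Summary
That completes the full deployment. This article first covered how graylog is deployed and then how to use it. Later articles will explore its other features, such as extracting fields from logs. Stay tuned.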