03-Elastic Log System: filebeat-kafka-logstash-elasticsearch-kibana 6.8.0 Setup Guide

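  • 1. Introduction
  • 2. Preparation
      • 2.1 Software versions
      • 2.2 Log flow
  • 3. Configuring the zookeeper cluster
  • 4. Configuring the kafka cluster
  • 5. Configuring the filebeat output
  • 6. Configuring the logstash input
  • 7. Problems you may run into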

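1. Introduction

An earlier article used redis as the intermediate buffer and message queue to smooth out peaks in the log flow. That worked without problems while the log volume was small. As the volume grew and the redis queue kept backing up, trouble started: redis is an in-memory database, so when logs are not consumed promptly its memory usage keeps rising until it hits OOM and the system crashes. The rate at which logstash consumes logs is part of the problem too. Even so, I decided to replace the single-node redis with a three-node kafka cluster, and I also changed the elasticsearch startup parameters. This article covers only the configuration for switching to kafka and the problems encountered along the way; for everything else, see the earlier articles.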

2. Preparation

Nodes:
192.168.72.56
192.168.72.57
192.168.72.58

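2.1 Software versions

All Elastic components are version 6.8.0, installed from rpm packages
zookeeper: 3.4.14 (downloaded in section 3)
kafka: 2.11-2.4.0, i.e. kafka 2.4.0 built for Scala 2.11 (downloaded in section 4)

OS version: CentOS Linux release 7.7.1908 (Core)

2.2 Log flow

filebeat --> kafka cluster --> logstash --> elasticsearch cluster --> kibana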

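3. Configuring the zookeeper cluster

We use a zookeeper cluster that is external to kafka, although the kafka package also ships with its own bundled zookeeper component. Reference: https://www.cnblogs.com/longBlogs/p/10340251.html

The setup is described below; the steps are run on each of the three nodes.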

wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz
tar -xvf zookeeper-3.4.14.tar.gz -C /usr/local
cd /usr/local
ln -sv zookeeper-3.4.14 zookeeper
cd zookeeper/conf
cp zoo_sample.cfg zoo.cfg
mkdir -pv /usr/local/zookeeper/{data,logs}

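On node 1, edit the configuration file zoo.cfg; the other two nodes use the same content

# Data directory and transaction log directory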
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs

clientPort=2181

server.1=192.168.72.56:2888:3888
server.2=192.168.72.57:2888:3888
server.3=192.168.72.58:2888:3888
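# The first port (default 2888) is used by followers to communicate with the leader; the second port (default 3888) is used for leader election, both when the cluster first starts and when a new leader is elected after the current one fails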

Set the node id

echo "1" > /usr/local/zookeeper/data/myid   # run on server1 only; must match the number in server.1 above
echo "2" > /usr/local/zookeeper/data/myid   # run on server2 only; must match the number in server.2 above
echo "3" > /usr/local/zookeeper/data/myid   # run on server3 only; must match the number in server.3 above
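The server.N lines and the myid files have to agree on every node, which is easy to get wrong by hand. A minimal sketch that derives both from one node list; the variable and function names are hypothetical, and the IPs are the three nodes listed in section 2:

```shell
#!/bin/sh
# Assumption: the three node IPs from this article, in server.N order.
NODES="192.168.72.56 192.168.72.57 192.168.72.58"

# Print the server.N lines for zoo.cfg, one per node.
zk_server_lines() {
    n=1
    for ip in $NODES; do
        echo "server.$n=$ip:2888:3888"
        n=$((n + 1))
    done
}

# Print the myid that belongs to the given IP; fail if the IP is not in NODES.
zk_myid_for() {
    n=1
    for ip in $NODES; do
        if [ "$ip" = "$1" ]; then
            echo "$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

zk_server_lines
# On each node, for example on 192.168.72.57:
#   zk_myid_for 192.168.72.57 > /usr/local/zookeeper/data/myid
```

Keeping the node list in one place means the id written to myid cannot drift from the number used in the matching server.N line.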

Starting and stopping zookeeper

# Start
/usr/local/zookeeper/bin/zkServer.sh start
# Stop
/usr/local/zookeeper/bin/zkServer.sh stop
# Check status
/usr/local/zookeeper/bin/zkServer.sh status
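Once all three nodes are started, the status command reports each node's role on a "Mode:" line; a healthy three-node ensemble has one leader and two followers. A small sketch for pulling out just the role (the function name is hypothetical, and the sample input below is illustrative rather than captured output):

```shell
#!/bin/sh
# Read `zkServer.sh status` output on stdin and print only the role
# (leader, follower or standalone) taken from the "Mode:" line.
zk_mode() {
    sed -n 's/^Mode: *//p'
}

# Illustrative input; on a real node use:
#   /usr/local/zookeeper/bin/zkServer.sh status 2>&1 | zk_mode
printf 'Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg\nMode: follower\n' | zk_mode
```

If a node prints nothing here, it is usually not part of a quorum yet, so check that the other nodes are running and that the myid files match zoo.cfg.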

Configuring the zookeeper service

cd /usr/lib/systemd/system
# vim zookeeper.service

=========================================
[Unit]
Description=zookeeper server daemon
After=network.target

[Service]
Type=forking
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start
ExecReload=/usr/local/zookeeper/bin/zkServer.sh restart
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop
Restart=always

[Install]
WantedBy=multi-user.target
=======================================================
# systemctl daemon-reload
# systemctl start  zookeeper
# systemctl enable zookeeper

4. Configuring the kafka cluster

Download and install

wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.4.0/kafka_2.11-2.4.0.tgz
tar -xvf kafka_2.11-2.4.0.tgz  -C /usr/local
cd /usr/local
ln -sv kafka_2.11-2.4.0 kafka
cd kafka/config

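Edit the configuration

# vim server.properties
# Comments sit on their own lines: in a .properties file, text after a value is read as part of the value
# Unique id of this broker within the cluster; a non-negative integer, different on each of the three nodes
broker.id=1
# Added entry: IP of this node
host.name=192.168.72.56
# Number of threads handling network requests
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
# Directory where kafka stores its log (message data) files
log.dirs=/var/log/kafka
# Default number of partitions per topic; more partitions allow more parallelism
num.partitions=3
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
# Maximum time (hours) a segment file is kept before it is deleted, i.e. data older than 7 days is cleaned up
log.retention.hours=168
# Size of each log segment in bytes; the default is 1 GB
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
# Enable the log cleaner
log.cleaner.enable=true
# Address of the zookeeper cluster; multiple hosts may be listed
zookeeper.connect=192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0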
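Kafka loads server.properties as a Java properties file, where `#` starts a comment only at the beginning of a line; a trailing `# ...` after a value becomes part of the value. If the annotations in the listing are kept in the real file, they need to sit on their own lines. A minimal sketch for spotting trailing comments before starting the broker (the function name is hypothetical, and the demo file is made up for illustration):

```shell
#!/bin/sh
# Print "line_number:line" for every key=value line in file $1 that has a
# trailing " #..." after the value. Full-line comments are ignored.
find_inline_comments() {
    grep -n -v '^[[:space:]]*[#!]' "$1" | grep '=.*[[:space:]]#'
}

# Demo on a throwaway file; for the real check use:
#   find_inline_comments /usr/local/kafka/config/server.properties
demo=$(mktemp)
cat > "$demo" <<'EOF'
# full-line comment, fine
broker.id=1 # trailing comment, becomes part of the value
num.io.threads=8
EOF
find_inline_comments "$demo"
rm -f "$demo"
```

No output means no trailing comments were found. The pattern is deliberately simple and would also flag a legitimate value that contains " #".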

By default a kafka broker starts with a 1 GB heap. To change it, edit kafka-server-start.sh
and find the KAFKA_HEAP_OPTS setting, for example:
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"

Start kafka

cd /usr/local/kafka
./bin/kafka-server-start.sh -daemon ./config/server.properties

Start on boot

# cd /usr/lib/systemd/system
# vim kafka.service
=========================================
[Unit]
Description=kafka server daemon
After=network.target zookeeper.service

[Service]
Type=forking
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
ExecReload=/bin/sh -c '/usr/local/kafka/bin/kafka-server-stop.sh && sleep 2 && /usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties'
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
Restart=always

[Install]
WantedBy=multi-user.target
=======================================================

# systemctl daemon-reload
# systemctl start kafka
# systemctl enable kafka

Create a topic
with 3 partitions and 3 replicas

cd /usr/local/kafka
./bin/kafka-topics.sh --create --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --replication-factor 3 --partitions 3 --topic java

Common commands

1)  Stop kafka
./bin/kafka-server-stop.sh

2)  Create a topic
./bin/kafka-topics.sh --create --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --replication-factor 1 --partitions 1 --topic topic_name

Increase the number of partitions
./bin/kafka-topics.sh --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --alter --topic java --partitions 40

3)  List topics
./bin/kafka-topics.sh --list --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181

4)  Describe a topic
./bin/kafka-topics.sh --describe --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --topic topic_name

5)  Send messages with the console producer
./bin/kafka-console-producer.sh --broker-list 192.168.72.56:9092 --topic topic_name

6)  Read messages with the console consumer
./bin/kafka-console-consumer.sh --bootstrap-server 192.168.72.56:9092,192.168.72.57:9092,192.168.72.58:9092 --topic topic_name

7)  Delete a topic
./bin/kafka-topics.sh --delete --topic topic_name --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181

8)  Describe each partition of __consumer_offsets (the brokers that hold the consumer offsets)
./bin/kafka-topics.sh --describe --zookeeper 192.168.72.56:2181,192.168.72.57:2181,192.168.72.58:2181 --topic __consumer_offsets
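Every command above repeats the same zookeeper or broker address list, and a single mistyped IP is enough to make a command hang or fail. A small sketch that builds both lists once from the node IPs; the variable and function names are hypothetical:

```shell
#!/bin/sh
# Assumption: the three node IPs from section 2.
NODES="192.168.72.56 192.168.72.57 192.168.72.58"

# Join every node with the given port into a comma-separated host:port list.
join_hosts() {
    port=$1
    out=""
    for ip in $NODES; do
        out="${out:+$out,}$ip:$port"
    done
    echo "$out"
}

ZK=$(join_hosts 2181)
BROKERS=$(join_hosts 9092)
echo "zookeeper: $ZK"
echo "brokers:   $BROKERS"
# Example use:
#   ./bin/kafka-topics.sh --list --zookeeper "$ZK"
#   ./bin/kafka-console-consumer.sh --bootstrap-server "$BROKERS" --topic topic_name
```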

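5. Configuring the filebeat output

For details see https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html

output.kafka:
  enabled: true
  hosts: ["192.168.72.56:9092","192.168.72.57:9092","192.168.72.58:9092"]
  topic: java
  required_acks: 1
  compression: gzip
  max_message_bytes: 10000000  # maximum size in bytes of a single message; larger events are dropped (10 MB, the value from section 7; keep it no larger than the broker's message.max.bytes)

Restart filebeat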

systemctl restart filebeat

6. Configuring the logstash input

First install the kafka input plugin

/usr/share/logstash/bin/logstash-plugin install logstash-input-kafka

Add a configuration file:

vim /etc/logstash/conf.d/kafka.conf
======================
input {
    kafka {
        bootstrap_servers => "192.168.72.56:9092"
        group_id => "java"
        auto_offset_reset => "latest"
        consumer_threads => 5
        decorate_events => false
        topics => ["java"]
        codec => json
    }
}

output {
    elasticsearch {
        hosts => ["192.168.72.56:9200","192.168.72.57:9200","192.168.72.58:9200"]
        user => "elastic"
        password => "changme"
        index => "logs-other-%{+YYYY.MM.dd}"
        http_compression => true
    }
}

After adding it, test the configuration file first

/usr/share/logstash/bin/logstash -t -f  /etc/logstash/conf.d/kafka.conf

If the test passes, restart logstash
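The redis setup broke down because the queue backed up faster than logstash consumed it. With kafka the equivalent signal is consumer lag, which `kafka-consumer-groups.sh --describe` reports per partition. A minimal sketch that totals the LAG column for the `java` group configured above; the function name is hypothetical and the sample rows are made-up numbers, not real output:

```shell
#!/bin/sh
# Sum the LAG column of `kafka-consumer-groups.sh --describe` output read
# from stdin. The column is located by its header name, not by position.
sum_lag() {
    awk '
        # Until the header is seen, look for the field named LAG.
        col == 0 { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
        # Data lines: add numeric lag values ("-" means no committed offset).
        $col ~ /^[0-9]+$/ { total += $col }
        END { print total + 0 }
    '
}

# Made-up sample; on a broker use:
#   /usr/local/kafka/bin/kafka-consumer-groups.sh \
#       --bootstrap-server 192.168.72.56:9092 --describe --group java | sum_lag
sample='GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
java java 0 120 150 30
java java 1 200 200 0
java java 2 90 100 10'
printf '%s\n' "$sample" | sum_lag
```

A total that keeps growing between runs suggests logstash is not keeping up with the incoming log volume.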

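7. Problems you may run into

filebeat errors

  1. *WARN producer/broker/0 maximum request accumulated, waiting for space
    Reference: https://linux.xiao5tech.com/bigdata/elk/elk_2.2.1_error_filebeat_kafka_waiting_for_space.html
    Cause: the max_message_bytes buffer value is set too small

  2. dropping too large message of size
    Reference: https://www.cnblogs.com/zhaosc-haha/p/12133699.html
    Cause: the message exceeds the size limit. Adjust the log scan frequency, or check whether the log output is abnormal and contains unnecessary output. Oversized log messages seriously hurt kafka performance.
    Value to set: 10000000 (10MB)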
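To check whether the log output itself is the cause, it can help to look for unusually long lines in the source log. A rough sketch using awk; the function name and the log path are hypothetical, and the event that filebeat sends is somewhat larger than the raw line because metadata fields are added:

```shell
#!/bin/sh
# Print "line_number bytes" for every line in file $1 longer than $2 bytes.
# LC_ALL=C makes awk count bytes rather than multibyte characters.
oversized_lines() {
    LC_ALL=C awk -v max="$2" 'length($0) > max + 0 { print NR, length($0) }' "$1"
}

# Example against a hypothetical application log, using the 10 MB limit:
#   oversized_lines /var/log/app/app.log 10000000
# Small demo on a throwaway file with a 10-byte limit:
demo=$(mktemp)
printf 'short\n0123456789ABCDEF\nok\n' > "$demo"
oversized_lines "$demo" 10
rm -f "$demo"
```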
