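These past few days I've had a painful time with rsyslog log monitoring. Importing the contents of a file into Kafka sounds simple, but the process turned out to be quite complicated and I took plenty of wrong turns, so I'm writing down what I went through for anyone who needs it.
I won't go into configuring the Kafka + ZooKeeper cluster. Once it is up, create a topic yourself, start a producer, type in some data, and check whether it arrives at the consumer.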
The Kafka commands you'll need:
Create a topic: ./bin/kafka-topics.sh --create --topic test --replication-factor 1 --partitions 32 --zookeeper localhost:2181
Delete a topic: ./bin/kafka-topics.sh --delete --topic test --zookeeper localhost:2181
List topics: ./bin/kafka-topics.sh --list --zookeeper localhost:2181
Describe a topic: ./bin/kafka-topics.sh --describe --topic test --zookeeper localhost:2181
Watch a topic being consumed: ./bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
Check consumption for a given consumer group: /usr/local/kafka/kafka_2.11-0.8.2.1/bin/kafka-consumer-offset-checker.sh --zookeeper localhost:2181 --group test
View the data: /usr/local/kafka/kafka_2.11-0.8.2.1/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic test (once this command is running, anything rsyslog writes to the topic is displayed automatically)
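Note: what follows --zookeeper must be hostname:port (with the hostname-to-IP mapping present in /etc/hosts). For a ZooKeeper cluster, separate the entries with commas.
For example:
./kafka-topics --zookeeper LetvHM2PXD2:2181,LetvHM1VXD2:2181,LetvHM1PXD2:2181 --alter --partitions 10 --topic AppAd (command to change the number of partitions)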
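With more than a few ZooKeeper nodes, typing the comma-separated list by hand invites typos. Below is a minimal sketch of building it in the shell, assuming every node listens on the same port. join_hosts is a hypothetical helper written for this example, not something Kafka ships:

```shell
#!/usr/bin/env bash
# join_hosts PORT HOST...  (hypothetical helper, not part of Kafka)
# Prints "host1:PORT,host2:PORT,..." for use as a --zookeeper argument.
join_hosts() {
    local port="$1"
    shift
    local out="" h
    for h in "$@"; do
        # Add a comma only when something has already been appended.
        out="${out:+$out,}${h}:${port}"
    done
    printf '%s\n' "$out"
}

ZK_CONNECT="$(join_hosts 2181 LetvHM2PXD2 LetvHM1VXD2 LetvHM1PXD2)"
echo "$ZK_CONNECT"
# prints LetvHM2PXD2:2181,LetvHM1VXD2:2181,LetvHM1PXD2:2181
```

The resulting string can then be passed as --zookeeper "$ZK_CONNECT" to any of the commands listed above.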
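From here on we only cover the rsyslog configuration:
I spent these few days reading the official rsyslog documentation carefully: http://www.rsyslog.com/doc/v8-stable/configuration/index.html (official site). There really is a lot of it, and it touches on syslog concepts, so it is heavy going without the background. We'll set that aside for now; you can study the syntax in detail at your own pace. Our basic task is to get a minimal configuration running that imports data into Kafka. Enough preamble, let's get to it:
Installing rsyslog:
I'm on CentOS 6.6 and installed via yum, using this yum repo: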
[rsyslog_v8]
name=Adiscon CentOS-$releasever - local packages for $basearch
baseurl=http://rpms.adiscon.com/v8-stable/epel-$releasever/$basearch
enabled=1
gpgcheck=0
gpgkey=http://rpms.adiscon.com/RPM-GPG-KEY-Adiscon
protect=1
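After adding the repo, run the following commands to install rsyslog and rsyslog-kafka:
[Command] yum install -y rsyslog rsyslog-kafka
[Command] rpm -ql rsyslog-kafka
/lib64/rsyslog/omkafka.so # the rsyslog plugin for connecting to Kafka
[Command] rsyslogd -N1 # checks whether the rsyslog configuration is valid (important)
Remember: every time you change the rsyslog configuration you must restart rsyslog, e.g. service rsyslog restart (important)
My configuration lives in /etc/rsyslog.d/kafka.conf, with the following contents: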
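Because a broken config file can leave rsyslog unable to start, it helps to chain the check and the restart so the restart only happens when the check passes. Here is a minimal sketch, assuming the CentOS 6 service command and assuming rsyslogd -N1 exits non-zero on a configuration error in your version. reload_rsyslog is a name made up for this example:

```shell
#!/usr/bin/env bash
# reload_rsyslog (hypothetical helper): validate the config first,
# and restart the daemon only if validation succeeds.
reload_rsyslog() {
    if rsyslogd -N1; then
        service rsyslog restart
    else
        echo "rsyslog config check failed; not restarting" >&2
        return 1
    fi
}
```

Run reload_rsyslog as root after each edit to /etc/rsyslog.conf or anything under /etc/rsyslog.d/.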
################### egrep -v "^$|^#" /etc/rsyslog.d/kafka.conf
module(load="imfile" PollingInterval="1")
module(load="omkafka")
input(type="imfile"
      File="/data/ark/tracking"
      Tag="tracking"
      PersistStateInterval="1"
      Facility="*"
      Severity="*"
      ruleset="tracking"
)
$template tracking_template, "%msg%\n"
ruleset(name="tracking") {
    action(type="omkafka"
           topic="rsyslog"
           broker="HMHRXD2:9092,HM9LXD2:9092,HMJSXD2:9092,HMMQXD2:9092,HMKLXD2:9092,HMPPXD2:9092,HMBVXD2:9092,HMKPXD2:9092,HMNNXD2:9092"
           template="tracking_template"
           partitions.auto="on"
           confParam=["socket.keepalive.enable=true"]
    )
}
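I'm monitoring the tracking file under /data/ark: whenever content is appended to tracking, the new lines are written into Kafka.
Replace the hostnames with those of the hosts in your own Kafka cluster.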
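A quick way to confirm the whole path works is to append a uniquely identifiable line to the monitored file and then look for it in the console consumer. Below is a minimal sketch; append_marker and the marker format are invented for this example, and the file path is the one from the configuration above:

```shell
#!/usr/bin/env bash
# append_marker FILE (hypothetical helper): append one unique line to FILE
# and print that line so it can be searched for on the Kafka side.
append_marker() {
    local file="$1"
    local marker
    marker="rsyslog-kafka-test-$(date +%s)-$$"
    printf '%s\n' "$marker" >> "$file"
    printf '%s\n' "$marker"
}
```

For instance, run append_marker /data/ark/tracking in one terminal while kafka-console-consumer.sh is running with --topic rsyslog in another; if the configuration is working, the printed marker should appear in the consumer output shortly afterwards.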
Remember:
The entries in broker must be written as hostname:port. This is exactly the step where I came to grief: I used IP:port directly and it did not work.
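One way to catch a naming problem before restarting rsyslog is to confirm that every host name in the broker list resolves on the machine running rsyslog. The following is a minimal sketch, assuming getent is available (it is on CentOS). check_brokers is a hypothetical helper, and it only checks name resolution, not whether the broker is actually reachable:

```shell
#!/usr/bin/env bash
# check_brokers "host1:port,host2:port,..." (hypothetical helper)
# Reports whether each host name resolves; returns 1 if any does not.
check_brokers() {
    local brokers="$1" entry host rc=0
    local IFS=','
    for entry in $brokers; do
        host="${entry%%:*}"          # strip the ":port" suffix
        if getent hosts "$host" >/dev/null; then
            echo "ok: $host"
        else
            echo "unresolved: $host" >&2
            rc=1
        fi
    done
    return $rc
}
```

Calling check_brokers with the same string used in the broker= parameter lists each host; any line starting with "unresolved:" points at a name missing from /etc/hosts or DNS.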
Also attached, the contents of /etc/rsyslog.conf:
# rsyslog configuration file
# note that most of this config file uses old-style format,
# because it is well-known AND quite suitable for simple cases
# like we have with the default config. For more advanced
# things, RainerScript configuration is suggested.
# For more information see /usr/share/doc/rsyslog-*/rsyslog_conf.html
# If you experience problems, see http://www.rsyslog.com/doc/troubleshoot.html
#### MODULES ####
#module(load="imuxsock") # provides support for local system logging (e.g. via logger command)
#module(load="imklog") # provides kernel logging support (previously done by rklogd)
#module(load="immark") # provides --MARK-- message capability
# Provides UDP syslog reception
# for parameters see http://www.rsyslog.com/doc/imudp.html
#module(load="imudp") # needs to be done just once
#input(type="imudp" port="514")
#module(load="imudp") # needs to be done just once
#input(type="imudp" port="1514")
#module(load="imfile")
#module(load="omhiredis") # lets you send to Redis
#module(load="omkafka") # lets you send to Kafka
#action(type="omkafka" topic="test_nginx" broker="localhost:9092")
#module(load="imtcp") # needs to be done just once
#input(type="imtcp" port="514")
# Provides TCP syslog reception
# for parameters see http://www.rsyslog.com/doc/imtcp.html
#module(load="imtcp") # needs to be done just once
#input(type="imtcp" port="514")
#action(type="omkafka" topic="test_nginx" broker="localhost:9092")
#### GLOBAL DIRECTIVES ####
# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
# File syncing capability is disabled by default. This feature is usually not required,
# not useful and an extreme performance hit
#$ActionFileEnableSync on
# Include all config files in /etc/rsyslog.d/
$IncludeConfig /etc/rsyslog.d/*.conf
#### RULES ####
# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none /var/log/messages
# The authpriv file has restricted access.
authpriv.* /var/log/secure
# Log all the mail messages in one place.
mail.* /var/log/maillog
# Log cron stuff
cron.* /var/log/cron
# Everybody gets emergency messages
*.emerg :omusrmsg:*
# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler
# Save boot messages also to boot.log
local7.* /var/log/boot.log
# ### begin forwarding rule ###
# The statement between the begin ... end define a SINGLE forwarding
# rule. They belong together, do NOT split them. If you create multiple
# forwarding rules, duplicate the whole block!
# Remote Logging (we use TCP for reliable delivery)
#
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
#$WorkDirectory /var/lib/rsyslog # where to place spool files
#$ActionQueueFileName fwdRule1 # unique name prefix for spool files
#$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
#$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
#$ActionQueueType LinkedList # run asynchronously
#$ActionResumeRetryCount -1 # infinite retries if host is down
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
#*.* @@remote-host:514
# ### end of the forwarding rule ###
If you're new to this, you can copy and paste the above directly; I've tested it myself and it works.