Flume installation:
Download from the official archive:
http://archive.apache.org/dist/flume/
1. Go to the download directory:
cd /tools
wget http://archive.apache.org/dist/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz
tar -zxvf apache-flume-1.8.0-bin.tar.gz -C /usr/local/
chown -R hadoop:hadoop /usr/local/apache-flume-1.8.0-bin/
vi /etc/profile
# flume config
export FLUME_HOME=/usr/local/apache-flume-1.8.0-bin
export PATH=$PATH:$FLUME_HOME/bin
source /etc/profile
Run the flume-ng command from any directory; if it prints the information below, the environment variables are configured correctly:
commands:
help display this help text
agent run a Flume agent
avro-client run an avro Flume client
version show Flume version info
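As a further check, flume-ng version should report the release you installed (exact output varies by build):
flume-ng version    # expect a line like: Flume 1.8.0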
vi $FLUME_HOME/conf/flume-conf.properties
# Example single-node Flume configuration
# Name the agent's three components (source, channel, sink); a1 is the agent's name
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Source properties: a netcat source listening on localhost:44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Bind the source and sink to the channel
a1.sinks.k1.channel = c1
a1.sources.r1.channels = c1
# Sink properties
a1.sinks.k1.type = logger
# Channel properties
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
Scenario 1: netcat source to logger sink
flume-ng agent --conf $FLUME_HOME/conf --conf-file /usr/local/apache-flume-1.8.0-bin/conf/flume-conf.properties --name a1 -Dflume.root.logger=INFO,console
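The flags are standard flume-ng agent options:
# --conf       directory holding flume-env.sh and log4j.properties
# --conf-file  the agent configuration file to load
# --name       which agent defined in that file to run (a1 here)
# -Dflume.root.logger=INFO,console  overrides log4j so events print to the console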
Open a new SSH session to node-1:
telnet localhost 44444
Once connected, type anything; the SSH window running the agent prints each line it receives.
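If telnet is not available, nc can send test lines the same way (a minimal sketch; depending on your nc variant you may need to keep the connection open). The logger sink should print each event with a hex-encoded body, something like Event: { headers:{} body: ... }:
echo 'hello flume' | nc localhost 44444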
Scenario 2: log monitoring (1)
vi $FLUME_HOME/conf/flume-conf-01.properties
# Example single-node Flume configuration
# Name the agent's three components (source, channel, sink); a1 is the agent's name
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Source properties: an exec source tailing the data file
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/data.log
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
# Sink properties
a1.sinks.k1.type = logger
# Channel properties
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
flume-ng agent --name a1 --conf $FLUME_HOME/conf --conf-file /usr/local/apache-flume-1.8.0-bin/conf/flume-conf-01.properties -Dflume.root.logger=INFO,console
In another SSH window, append a line to the monitored file:
echo 'laoggaogaoo4156465413' >> /home/hadoop/data.log
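To feed the exec source a steady stream instead of single lines, a throwaway generator loop works (same data.log path as above; stop it with Ctrl+C):
while true; do echo "test line $(date +%s)" >> /home/hadoop/data.log; sleep 1; done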
Scenario 3: log monitoring (2)
The configuration file is as follows (the sink type is hdfs):
vi $FLUME_HOME/conf/flume-conf-hdfs.properties
# Name the agent's three components (source, channel, sink); a1 is the agent's name
a1.sources = r1
a1.channels = c1
a1.sinks = k1
# Source properties: an exec source tailing the Elasticsearch GC log
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /usr/local/elasticsearch-6.5.4/logs/gc.log.0.current
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
# Sink properties: the sink type is hdfs
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://node-1:9000/flume/%Y-%m
a1.sinks.k1.hdfs.filePrefix = %Y-%m-%d-%H
a1.sinks.k1.hdfs.fileSuffix = .log
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.minBlockReplicas = 1
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.rollInterval = 86400
a1.sinks.k1.hdfs.rollSize = 1000000
a1.sinks.k1.hdfs.rollCount = 10000
# Channel properties
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
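The three roll settings race each other: the sink rolls the current file after 86400 seconds (one day), after roughly 1 MB of data (rollSize is in bytes), or after 10000 events, whichever comes first; setting one of them to 0 disables that trigger. Before starting the agent, it can help to pre-create the target directory so the sink does not fail on permissions (assuming the HDFS client is available on this node):
hdfs dfs -mkdir -p /flume
hdfs dfs -chown hadoop:hadoop /flume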
Run the following command:
flume-ng agent --name a1 --conf $FLUME_HOME/conf --conf-file /usr/local/apache-flume-1.8.0-bin/conf/flume-conf-hdfs.properties -Dflume.root.logger=INFO,console
If the command fails with the error below, the NameNode is in safe mode:
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create file/flume/2019-11/2019-11-10-21.1573393841845.log.tmp. Name node is in safe mode.
Resources are low on NN. Please add or free up more resources then turn off safe mode manually. NOTE: If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off. NamenodeHostName:node-1
Run this command to leave safe mode:
hdfs dfsadmin -safemode leave
Safe mode is OFF
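The NameNode's state can be queried directly before and after:
hdfs dfsadmin -safemode get    # prints "Safe mode is ON" or "Safe mode is OFF"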
Open the Hadoop web UI at http://node-1:50070/ to view the generated log files.
hadoop fs -ls -R /flume/    # run on any node to list the files Flume created
hadoop fs -cat /flume/2019-11/*    # view the contents of the generated logs
.......