Building a 3-Node Hadoop Cluster (Fully Distributed Deployment), Part 5: Installing Flume and Loading Data into HDFS

Download and install Flume; the installation directory used here is /opt/flume.
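A minimal sketch of the download and unpack steps, assuming Flume 1.8.0 (any 1.x build compatible with your Hadoop release works; adjust the version and mirror URL accordingly):

# version and URL are assumptions -- substitute the release you actually use
wget http://archive.apache.org/dist/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz
tar -xzf apache-flume-1.8.0-bin.tar.gz -C /opt
mv /opt/apache-flume-1.8.0-bin /opt/flume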

In the conf directory, rename the configuration templates:

cd /opt/flume/conf
mv flume-conf.properties.template flume-conf.properties
mv flume-env.sh.template flume-env.sh

Edit flume-env.sh and add the following environment settings:

export JAVA_HOME=/opt/jdk1.8.0_121

FLUME_CLASSPATH="/opt/hadoop-2.7.5/share/hadoop/hdfs/*"
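FLUME_CLASSPATH puts the Hadoop HDFS jars on the agent's classpath, which the HDFS sink needs at runtime. On a node where the hadoop command is on the PATH, the flume-ng launcher can also pick up the Hadoop client jars automatically, so this line mainly makes the dependency explicit.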


Create a new file conf/hdfs.properties with the following content:

#agent components: declare the sources, sinks and channels up front,
#otherwise the agent starts with nothing to run
LogAgent.sources = apache
LogAgent.sinks = HDFS
LogAgent.channels = fileChannel

#spooldir watches the configured directory for new files; each new file is
#parsed, and once its events are written to the channel the file is renamed
#with the suffix .COMPLETED
LogAgent.sources.apache.type = spooldir
LogAgent.sources.apache.spoolDir = /tmp/logs
LogAgent.sources.apache.channels = fileChannel
LogAgent.sources.apache.fileHeader = false

#sinks config
LogAgent.sinks.HDFS.channel = fileChannel
LogAgent.sinks.HDFS.type = hdfs
LogAgent.sinks.HDFS.hdfs.path = hdfs://elephant:9000/data/logs/%Y-%m-%d/%H
LogAgent.sinks.HDFS.hdfs.fileType = DataStream
LogAgent.sinks.HDFS.hdfs.writeFormat = Text
LogAgent.sinks.HDFS.hdfs.filePrefix = flumeHdfs
LogAgent.sinks.HDFS.hdfs.batchSize = 1000
LogAgent.sinks.HDFS.hdfs.rollSize = 10240
LogAgent.sinks.HDFS.hdfs.rollCount = 0
LogAgent.sinks.HDFS.hdfs.rollInterval = 1
LogAgent.sinks.HDFS.hdfs.useLocalTimeStamp = true

#channels config: despite its name, fileChannel is a memory channel, so
#buffered events are lost if the agent process dies
LogAgent.channels.fileChannel.type = memory
LogAgent.channels.fileChannel.capacity = 10000
#transactionCapacity must be >= the sink's batchSize (1000 above)
LogAgent.channels.fileChannel.transactionCapacity = 1000
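
With these roll settings the sink closes the current HDFS file every second (rollInterval is in seconds) or every 10 KB (rollSize is in bytes), whichever comes first; rollCount = 0 disables rolling by event count. Such aggressive rolling is fine for a demo, but in production you would raise these values to avoid producing many tiny files.
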
Create the directory to be monitored, /tmp/logs:

mkdir -p /tmp/logs

Run the following from the Flume installation directory (/opt/flume):

bin/flume-ng agent --conf conf/ --conf-file conf/hdfs.properties --name LogAgent -Dflume.root.logger=DEBUG,console
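
Here --conf points at the directory containing flume-env.sh and the log4j configuration, --conf-file at the agent definition created above, and --name must match the property prefix used in that file (LogAgent). The -Dflume.root.logger option sends DEBUG output to the console so delivery can be watched live.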

Verify:

   a. Open a second terminal;

   b. In the monitored directory /tmp/logs, create a file test.log and give it a line of content:

vim test.log

test log hello hi
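
Equivalently, without the interactive editor:

echo "test log hello hi" > /tmp/logs/test.log

Once the source has consumed the file, it is renamed to test.log.COMPLETED inside /tmp/logs. Note that the spooling directory source expects files to be complete and immutable when they appear; modifying a file after it has been consumed causes the source to raise an error.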

Check the corresponding directory in HDFS; the file has been uploaded successfully.
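
For example, with paths following the hdfs.path pattern configured above (the actual date/hour subdirectories depend on when the agent was running):

hdfs dfs -ls /data/logs/
hdfs dfs -cat /data/logs/*/*/flumeHdfs*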


