Installing Flume on CentOS 8

Install the Flume data collection software

1. Upload apache-flume-1.10.1-bin.tar.gz to /bigdata/soft
2. Extract it to the target directory
tar -zxvf apache-flume-1.10.1-bin.tar.gz -C /bigdata/server/
3. Create a symbolic link
cd /bigdata/server
ln -s apache-flume-1.10.1-bin/ flume
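The symlink keeps every later path (the profile variables, job configs, scripts) version-independent: an upgrade only needs the link repointed. What the `ln -s` above produces can be sketched in a scratch directory:

```shell
# reproduce the link layout in a temporary directory
dir=$(mktemp -d)
mkdir "$dir/apache-flume-1.10.1-bin"
ln -s apache-flume-1.10.1-bin "$dir/flume"
readlink "$dir/flume"    # prints apache-flume-1.10.1-bin
rm -rf "$dir"
```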

Append the environment variables to /etc/profile.d/my_env.sh

#FLUME_HOME
export FLUME_HOME=/bigdata/server/flume
export PATH=$PATH:$FLUME_HOME/bin
# After adding them, apply the change with
source /etc/profile

Create the configuration for collecting data into HDFS, saved as jobs/log_file_to_hdfs.conf

# Name the agent's components
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Describe the source
a1.sources.r1.type = TAILDIR
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /bigdata/soft/app/log/behavior/.*
a1.sources.r1.positionFile = /bigdata/server/flume/position/behavior/taildir_position.json
a1.sources.r1.interceptors = i1 i2
a1.sources.r1.interceptors.i1.type = cn.wolfcode.flume.interceptor.ETLInterceptor$Builder
a1.sources.r1.interceptors.i2.type = cn.wolfcode.flume.interceptor.TimeStampInterceptor$Builder

## channel1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

## sink1
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /behavior/origin/log/%Y-%m-%d
a1.sinks.k1.hdfs.filePrefix = log-
a1.sinks.k1.hdfs.round = false
a1.sinks.k1.hdfs.rollInterval = 10
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0

## Write the output as plain files (DataStream, not SequenceFile)
a1.sinks.k1.hdfs.fileType = DataStream

## Wire the components together
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
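Two details of the sink configuration are worth unpacking. rollSize is given in bytes, and 134217728 is exactly 128 MiB, the default HDFS block size, while rollCount = 0 disables count-based rolling entirely. And the %Y-%m-%d escape in hdfs.path is filled from each event's timestamp header, which is what the TimeStampInterceptor supplies (without it, hdfs.useLocalTimeStamp = true would be needed instead). A quick shell check of both:

```shell
# rollSize is in bytes: 128 MiB, one default HDFS block
echo $((128 * 1024 * 1024))    # prints 134217728

# the %Y-%m-%d escape in hdfs.path expands like the same strftime pattern:
date +%Y-%m-%d
```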

The configuration options are described in the official documentation:

Flume official documentation

Run the data collection command

# Change into the Flume directory
cd /bigdata/server/flume
bin/flume-ng agent --conf conf/ --name a1 --conf-file jobs/log_file_to_hdfs.conf -Dflume.root.logger=INFO,console

# Start it in the background
nohup bin/flume-ng agent --conf conf/ --name a1 --conf-file jobs/log_file_to_hdfs.conf -Dflume.root.logger=INFO,console >/bigdata/server/flume/logs/log_file_to_hdfs.log 2>&1 &
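The tail of the background command is standard shell plumbing: > sends stdout to the log file, 2>&1 folds stderr into the same file, and the trailing & detaches the job (nohup keeps it alive after logout). Note that the logs/ directory must already exist, or the redirection fails. The redirections can be seen in isolation with a stand-in command:

```shell
# same redirections as the nohup line, on a harmless command
log=$(mktemp)
( echo to-stdout; echo to-stderr 1>&2 ) > "$log" 2>&1
cat "$log"    # both lines land in the log: to-stdout, then to-stderr
rm -f "$log"
```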
