Configuring Flume on a Hadoop Cluster

1. First, download the Flume tarball from the official site. We use the latest Apache Flume binary (tar.gz), version 1.8.0.

Address: http://www.apache.org/dyn/closer.lua/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz

2. Then extract it on the Hadoop node:

tar -zxvf apache-flume-1.8.0-bin.tar.gz

Optionally, rename the extracted directory (not the tarball) to make it easier to work with; this step can be skipped:

mv apache-flume-1.8.0-bin flume

3. Next, add the environment variables to /etc/profile (optional, but it makes the commands shorter to type):

export ZOOKEEPER_HOME=/home/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin:/home/flume/bin
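
The change above can be applied and checked with a short sketch; the /home/flume path is taken from step 2 and is an assumption about your install location:

```shell
# One-off equivalent of re-sourcing /etc/profile (a re-login also works):
export PATH=$PATH:/home/flume/bin
# Sanity check: with the PATH in place, the launcher should resolve
# and `flume-ng version` should print "Flume 1.8.0".
if command -v flume-ng >/dev/null 2>&1; then
  flume-ng version
fi
```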
4. Next, create a configuration file. (Its name must match the --conf-file argument used when launching the agent; step 5 below uses example.conf.)

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = node1
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Note: change a1.sources.r1.bind = localhost to your own host address (node1 here).

5. Run the agent:

$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
Note: the source type is netcat, which listens on a TCP port, so input must be sent over a TCP connection.

You can run telnet 192.168.56.101 44444 (node1's address) and type lines to verify that events are generated.

Finally, since the sink is configured as

a1.sinks.k1.type = logger

the events are handed to the logger sink, so they are printed in the agent's console output on node1.
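
Beyond an interactive telnet session, the send can be scripted. This is a hedged sketch: the host and port are assumptions taken from the config above, and it uses bash's /dev/tcp redirection so it does not depend on which netcat/telnet variant is installed:

```shell
#!/bin/bash
# send_events HOST PORT: push two newline-terminated test events into a
# listening netcat source over a raw TCP connection (bash /dev/tcp).
send_events() {
  exec 3<>"/dev/tcp/$1/$2"
  printf 'hello flume\r\nsecond event\r\n' >&3
  exec 3<&-
}

# Against the running agent from step 5 (host/port from the config):
# send_events node1 44444
```

Each line sent this way should appear as an INFO event in the agent's console output.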
6. The netcat source above is passive: it only logs what is pushed to it. We can also have Flume actively collect log files:

# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /opt/flume

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://192.168.56.101:8020/flume/%Y-%m-%d/%H%M
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 10240
a1.sinks.k1.hdfs.idleTimeout = 3
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 5
a1.sinks.k1.hdfs.roundUnit = minute


# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

7. The spooldir source actively watches a directory for new files.
Drop a file into /opt/flume and you will see its events written to HDFS on the cluster automatically.
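
For example, a minimal hand test (the file name is just an illustration; /opt/flume must match spoolDir and be writable by the Flume user):

```shell
# Create the spool directory and drop in a plain-text, UTF-8 file.
mkdir -p /opt/flume
printf 'line one\nline two\n' > /tmp/app.log
mv /tmp/app.log /opt/flume/
# Once ingested, Flume renames the file to app.log.COMPLETED and the
# events land in HDFS under the path pattern from the sink config:
# hdfs dfs -ls /flume
```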


If you run into the following error:

2017-11-22 21:22:26,047 (pool-3-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:280)] FATAL: Spool Directory source r1: { spoolDir: /opt/flume }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:283)
    at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:132)
    at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:70)
    at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:89)
    at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readDeserializerEvents(ReliableSpoolingFileEventReader.java:343)
    at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:318)
    at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:250)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

it means the file you dropped in contains bytes that are not valid in the deserializer's character set (UTF-8 by default). Clean up or re-encode the file and the error goes away. That completes the single-node Flume setup.
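
One concrete, hedged fix is to re-encode files to clean UTF-8 before they reach the spool directory, since the default line deserializer expects UTF-8. (Alternatively, the spooling source's inputCharset and decodeErrorPolicy properties, e.g. decodeErrorPolicy = IGNORE, can be set in the agent config.) The demo file below is fabricated for illustration:

```shell
# Demo input: one good line plus bytes that are invalid in UTF-8.
printf 'ok line\n' > dirty.log
printf '\377\376 broken line\n' >> dirty.log
# Re-encode before handing the file to Flume; -c makes iconv drop any
# byte sequence it cannot convert instead of aborting.
iconv -f UTF-8 -t UTF-8 -c dirty.log > clean.log
# mv clean.log /opt/flume/    # then let the spooldir source pick it up
```

If your producer writes in a known legacy charset, convert from that instead (e.g. -f GBK -t UTF-8).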
