Handling the Small-Files Problem When Flume Writes to HDFS

Original link: https://my.oschina.net/dreamness/blog/3093956

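The HDFS sink can produce a serious small-files problem: with its default settings (rollInterval = 30 seconds, rollSize = 1024 bytes, rollCount = 10 events) it closes the current file and starts a new one almost immediately.
Tuning the three parameters rollInterval, rollSize, and rollCount mitigates the problem. A file is rolled as soon as any enabled threshold is reached, and setting a parameter to 0 disables that trigger.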

a1.sinks.hdfssink.type                   = hdfs
a1.sinks.hdfssink.hdfs.path              = hdfs://c1:8020/flume/alertlog/%y%m%d%H%M/origin
a1.sinks.hdfssink.hdfs.filePrefix        = alert-
a1.sinks.hdfssink.hdfs.useLocalTimeStamp = true
# Roll the current file after 60 seconds (0 disables time-based rolling)
a1.sinks.hdfssink.hdfs.rollInterval      = 60
# Roll once 10485760 bytes (10 MiB) have been written (0 disables size-based rolling)
a1.sinks.hdfssink.hdfs.rollSize          = 10485760
# 0 disables rolling by event count
a1.sinks.hdfssink.hdfs.rollCount         = 0
a1.sinks.hdfssink.hdfs.codeC             = snappy
a1.sinks.hdfssink.hdfs.fileType          = CompressedStream
a1.sinks.hdfssink.hdfs.writeFormat       = Text
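
To get a feel for what these roll settings imply, the following Python sketch estimates which threshold closes a file first and roughly how many files one sink would produce per day. It is a back-of-the-envelope model, not Flume code: the function names and the input rates (bytes and events per second) are hypothetical, a steady input rate is assumed, and other settings that also close files (for example hdfs.idleTimeout or a time-bucketed hdfs.path) are ignored.

```python
import math

def seconds_until_roll(bytes_per_sec, events_per_sec,
                       roll_interval=60, roll_size=10485760, roll_count=0):
    """Return (seconds, trigger) for the enabled threshold that is reached first.

    A value of 0 disables the corresponding threshold, mirroring the
    semantics of hdfs.rollInterval, hdfs.rollSize and hdfs.rollCount.
    """
    candidates = []
    if roll_interval > 0:
        candidates.append((float(roll_interval), "rollInterval"))
    if roll_size > 0 and bytes_per_sec > 0:
        candidates.append((roll_size / bytes_per_sec, "rollSize"))
    if roll_count > 0 and events_per_sec > 0:
        candidates.append((roll_count / events_per_sec, "rollCount"))
    if not candidates:
        return math.inf, "none"  # nothing enabled: the file is never rolled
    return min(candidates)       # the smallest time wins

def files_per_day(bytes_per_sec, events_per_sec, **roll_settings):
    """Estimate the number of files written per day at a steady input rate."""
    seconds, _ = seconds_until_roll(bytes_per_sec, events_per_sec, **roll_settings)
    return math.ceil(86400 / seconds)

if __name__ == "__main__":
    # Hypothetical load: 50 KiB/s and 100 events/s with the settings shown above.
    print(seconds_until_roll(51200, 100))
    print(files_per_day(51200, 100))
```

Under this assumed load the 60-second timer fires long before 10 MiB accumulates, so each file holds only about 3 MiB of input and the sink produces 1440 files per day. Note also that the path above contains %M, so events land in a new directory every minute; raising rollInterval alone would therefore not reduce the file count below one per minute.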

