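As shown in Figure 1, a flume-ng agent consists of three main parts: source, channel, and sink. All three run inside a JVM, and the JVM usually runs on a Linux operating system, so each of these layers can affect the final performance. In short, flume-ng performance tuning and architecture design cover these same layers.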
1, Component parameter design

agent.channels.memory_chan_1.type = memory
agent.channels.memory_chan_1.keep-alive = 30
agent.channels.memory_chan_1.transactionCapacity = 20000
agent.channels.memory_chan_1.byteCapacityBufferPercentage = 40
agent.channels.memory_chan_1.byteCapacity = 50000000
agent.channels.memory_chan_1.capacity = 500000

Notes on the relevant parameters
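The interaction between byteCapacity, byteCapacityBufferPercentage, and capacity is easy to misjudge, so a small worked calculation may help. The sketch below assumes the behaviour described in the Flume user guide: byteCapacity counts event bodies only, and byteCapacityBufferPercentage reserves part of that budget for headers. The function name `usable_body_bytes` is made up for this illustration and is not part of Flume.

```python
# Values copied from the memory channel configuration above.
byte_capacity = 50_000_000      # byteCapacity
buffer_percentage = 40          # byteCapacityBufferPercentage
capacity = 500_000              # capacity (maximum number of events)


def usable_body_bytes(byte_capacity: int, buffer_percentage: int) -> int:
    """Bytes left for event bodies after reserving the header buffer.

    Hypothetical helper: it mirrors the documented meaning of the two
    settings, not Flume's internal bookkeeping.
    """
    return byte_capacity * (100 - buffer_percentage) // 100


usable = usable_body_bytes(byte_capacity, buffer_percentage)

# Average body size at which the byte limit and the event-count limit
# are reached at the same time.
break_even_body_size = usable / capacity

print(usable)                # 30000000
print(break_even_body_size)  # 60.0
```

Under these assumptions, events whose bodies average more than about 60 bytes would fill the byte budget before the channel ever holds 500000 events, so for typical log lines byteCapacity, not capacity, is likely the effective limit.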
# Each sink's type must be defined
agent.sinks.hdfsSink_1.type = hdfs
agent.sinks.hdfsSink_1.channel = memory_chan_1
agent.sinks.hdfsSink_1.hdfs.path = /logdata/%Y%m%d/%{hostname}/%{filename}%{CRMLOG}
agent.sinks.hdfsSink_1.hdfs.filePrefix = %{filename}%{CRMLOG}
agent.sinks.hdfsSink_1.hdfs.rollInterval = 3600
agent.sinks.hdfsSink_1.hdfs.rollSize = 40000000
agent.sinks.hdfsSink_1.hdfs.rollCount = 0
agent.sinks.hdfsSink_1.hdfs.writeFormat = Writable
agent.sinks.hdfsSink_1.hdfs.fileType = CompressedStream
agent.sinks.hdfsSink_1.hdfs.batchSize = 10000
agent.sinks.hdfsSink_1.hdfs.serializer = avro_event
agent.sinks.hdfsSink_1.hdfs.threadsPoolSize = 100
agent.sinks.hdfsSink_1.hdfs.codeC = gzip
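The hdfs.path value in the sink configuration mixes date escapes (%Y%m%d) with header escapes (%{hostname}, %{filename}, %{CRMLOG}), so the directory an event lands in depends on its headers. The following is a simplified sketch of that substitution, not Flume's implementation. It assumes the event carries a `timestamp` header in epoch milliseconds (which Flume needs for date escapes unless hdfs.useLocalTimeStamp is set), uses UTC for reproducibility where Flume defaults to local time, and treats a missing header as an empty string. The function `resolve_path` and the header values are hypothetical.

```python
import re
from datetime import datetime, timezone

PATH_TEMPLATE = "/logdata/%Y%m%d/%{hostname}/%{filename}%{CRMLOG}"


def resolve_path(template: str, headers: dict) -> str:
    """Simplified stand-in for the sink's escape-sequence substitution."""
    # The "timestamp" header holds milliseconds since the epoch.
    ts = datetime.fromtimestamp(int(headers["timestamp"]) / 1000, tz=timezone.utc)

    def substitute(match: re.Match) -> str:
        header_name, date_code = match.group(1), match.group(2)
        if header_name is not None:
            # %{name}: header value; a missing header becomes "" (assumption).
            return headers.get(header_name, "")
        # %Y, %m, %d: only the date escapes used in this configuration.
        return ts.strftime("%" + date_code)

    return re.sub(r"%(?:\{([^}]+)\}|([Ymd]))", substitute, template)


# Hypothetical event headers; CRMLOG is deliberately absent.
example_headers = {
    "timestamp": "1400000000000",
    "hostname": "web01",
    "filename": "access.log",
}
resolved = resolve_path(PATH_TEMPLATE, example_headers)
print(resolved)  # /logdata/20140513/web01/access.log
```

Because every distinct combination of date, hostname, and filename yields a separate path, the number of files the sink keeps open at once grows with the number of distinct header values.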
cat /etc/sysctl.conf

kernel.shmall = 33554432
kernel.shmmax = 137438953472
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576

User-level parameter settings
vi /etc/security/limits.conf

# End of file
hadoop soft nproc 32047
hadoop hard nproc 36384
hadoop soft nofile 31024
hadoop hard nofile 65536

4, Network configuration
agent.sinks = flowSink-3-1 flowSink-3-2 flowSink-3-3 flowSink-3-4 flowSink-3-5
agent.sinkgroups = g1
agent.sinkgroups.g1.sinks = flowSink-3-1 flowSink-3-2 flowSink-3-3 flowSink-3-4 flowSink-3-5
agent.sinkgroups.g1.processor.type = load_balance
agent.sinkgroups.g1.processor.selector = round_robin
agent.sinkgroups.g1.processor.backoff = true
...
agent.sinks.flowSink-3-1.type = avro
agent.sinks.flowSink-3-1.channel = memory_chan_1
agent.sinks.flowSink-3-1.hostname = 127.0.0.1
agent.sinks.flowSink-3-1.port = 41451
agent.sinks.flowSink-3-1.batch-size = 1000

A highly available, scalable architecture is illustrated in the diagram below.
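As a companion to the sink group configuration above, here is a minimal sketch of what a load_balance processor with the round_robin selector and backoff = true does conceptually: it rotates the starting sink on each attempt and temporarily skips sinks that failed recently. This is an illustration, not Flume's code. The class `RoundRobinSelector`, its method names, and the backoff durations (1 second doubling up to 30 seconds) are assumptions chosen for the example.

```python
class RoundRobinSelector:
    """Round-robin ordering over sinks with exponential backoff on failure."""

    def __init__(self, sinks, base_backoff=1.0, max_backoff=30.0):
        self.sinks = list(sinks)
        self.base_backoff = base_backoff   # hypothetical value, in seconds
        self.max_backoff = max_backoff     # hypothetical value, in seconds
        self.next_index = 0
        self.failures = {s: 0 for s in self.sinks}
        self.blocked_until = {s: 0.0 for s in self.sinks}

    def candidates(self, now):
        """Sinks to try, in rotated order, skipping those still backing off."""
        n = len(self.sinks)
        order = [self.sinks[(self.next_index + i) % n] for i in range(n)]
        self.next_index = (self.next_index + 1) % n
        return [s for s in order if self.blocked_until[s] <= now]

    def report_failure(self, sink, now):
        """Block the sink for a period that doubles with each failure."""
        self.failures[sink] += 1
        delay = min(self.max_backoff,
                    self.base_backoff * 2 ** (self.failures[sink] - 1))
        self.blocked_until[sink] = now + delay

    def report_success(self, sink):
        """Clear the failure history once the sink delivers again."""
        self.failures[sink] = 0
        self.blocked_until[sink] = 0.0


selector = RoundRobinSelector(
    ["flowSink-3-1", "flowSink-3-2", "flowSink-3-3",
     "flowSink-3-4", "flowSink-3-5"]
)
first = selector.candidates(now=0.0)           # rotation starts at flowSink-3-1
selector.report_failure("flowSink-3-2", now=0.0)
second = selector.candidates(now=0.5)          # flowSink-3-2 is skipped
third = selector.candidates(now=2.0)           # flowSink-3-2 is eligible again
```

With backoff disabled, a failed sink would be retried on every rotation, which is why enabling it tends to matter when one downstream collector is slow or unreachable.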