Flume Series: Writing Flume Data to COS


  • 1. Reference Articles
  • 2. Install the COS JAR Packages
  • 3. Add the hadoop-cos Configuration
  • 4. Add the Hadoop Classpath to the Flume Environment
  • 5. Use a COS Path
  • 6. Start/Restart Flume

1. Reference Articles

  • Storing Kafka Data to HDFS or COS via Flume
  • Flume to COS Usage Guide

2. Install the COS JAR Packages

Place the hadoop-cos JAR packages that match your Hadoop version (hadoop-cos-{hadoop.version}-{cosn.version}.jar and cos_api-bundle-5.x.x.jar) into the ${HADOOP_HOME}/share/hadoop/tools/lib directory, then append the following to the end of ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh:

# Append every JAR under tools/lib (including the COS jars) to HADOOP_CLASSPATH
for f in $HADOOP_HOME/share/hadoop/tools/lib/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done
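
To confirm the jars were picked up, you can list the directory and inspect the resulting classpath; a minimal check, assuming the jar names described above:

# The two COS jars should appear in tools/lib
ls ${HADOOP_HOME}/share/hadoop/tools/lib/ | grep -E 'hadoop-cos|cos_api-bundle'
# And the tools/lib entries should show up on the Hadoop classpath
hadoop classpath | tr ':' '\n' | grep 'tools/lib'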

3. Add the hadoop-cos Configuration

Edit Hadoop's ${HADOOP_HOME}/etc/hadoop/core-site.xml and add the hadoop-cos configuration:

<configuration>
    <property>
        <name>fs.cosn.impl</name>
        <value>org.apache.hadoop.fs.CosFileSystem</value>
    </property>

    <property>
        <name>fs.AbstractFileSystem.cosn.impl</name>
        <value>org.apache.hadoop.fs.CosN</value>
    </property>

    <property>
        <name>fs.cosn.userinfo.secretId</name>
        <value>xxxxxxxxxxxxxxxxxxxxxxxxx</value>
    </property>

    <property>
        <name>fs.cosn.userinfo.secretKey</name>
        <value>xxxxxxxxxxxxxxxxxxxxxxxx</value>
    </property>

    <property>
        <name>fs.cosn.bucket.region</name>
        <value>ap-xxx</value>
    </property>
</configuration>
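
Once core-site.xml is in place, Hadoop itself should be able to reach the bucket. A quick sanity check (the bucket name below is a placeholder; substitute your own bucket with its APPID suffix):

# List the bucket root through the cosn:// scheme
# (examplebucket-125xxxxxxx is a placeholder bucket-APPID)
hadoop fs -ls cosn://examplebucket-125xxxxxxx/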

4. Add the Hadoop Classpath to the Flume Environment

Edit the ${FLUME_HOME}/conf/flume-env.sh file and add the Hadoop classpath to FLUME_CLASSPATH:

export FLUME_CLASSPATH=$HADOOP_HOME/etc/hadoop:$HADOOP_HOME/share/hadoop/common:$HADOOP_HOME/share/hadoop/hdfs:$HADOOP_HOME/share/hadoop/tools/lib/*
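
To check that the variable resolves as expected, you can source the file and print it; a minimal sketch:

# Source flume-env.sh and print the classpath entries one per line
source ${FLUME_HOME}/conf/flume-env.sh
echo "$FLUME_CLASSPATH" | tr ':' '\n'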

5. Use a COS Path

Simply change the HDFS path in flume-conf.properties to a COS path:

vim kafka.properties

agent.sources = kafka_source
agent.channels = mem_channel
agent.sinks = hdfs_sink
# source configuration
agent.sources.kafka_source.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.kafka_source.channels = mem_channel
agent.sources.kafka_source.batchSize = 5000
agent.sources.kafka_source.kafka.bootstrap.servers = $kafkaIP:9092
agent.sources.kafka_source.kafka.topics = kafka_test
# sink configuration
agent.sinks.hdfs_sink.type = hdfs
agent.sinks.hdfs_sink.channel = mem_channel
# HDFS path; for COS, use a cosn:// path instead, e.g. cosn://bucket/xxx
agent.sinks.hdfs_sink.hdfs.path = /data/flume/kafka/%Y%m%d
# rollSize/rollCount = 0 disables size- and count-based rolling,
# so files roll only on the hourly rollInterval
agent.sinks.hdfs_sink.hdfs.rollSize = 0
agent.sinks.hdfs_sink.hdfs.rollCount = 0
agent.sinks.hdfs_sink.hdfs.rollInterval = 3600
agent.sinks.hdfs_sink.hdfs.threadsPoolSize = 30
agent.sinks.hdfs_sink.hdfs.fileType = DataStream
agent.sinks.hdfs_sink.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs_sink.hdfs.writeFormat = Text
# channel configuration
agent.channels.mem_channel.type = memory
agent.channels.mem_channel.capacity = 100000
agent.channels.mem_channel.transactionCapacity = 10000
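
For the COS case, only the sink path line changes; it would look like the following (the bucket name is a placeholder, and %Y%m%d still produces one directory per day):

agent.sinks.hdfs_sink.hdfs.path = cosn://examplebucket-125xxxxxxx/data/flume/kafka/%Y%m%d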

6. Start/Restart Flume

Start or restart Flume (the -n argument must match the agent name used in the properties file, which is "agent" in the example above):

./bin/flume-ng agent -c conf -f conf/flume-conf.properties -n $agentName
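
For an unattended run, the agent is often launched in the background with its output captured to a log file; a sketch, assuming the sample kafka.properties above and a placeholder COS path:

# Run the agent in the background; -n "agent" matches the name prefix
# used throughout kafka.properties
nohup ./bin/flume-ng agent -c conf -f conf/kafka.properties -n agent > flume.log 2>&1 &

# Once events flow, output files should appear under the configured path
hadoop fs -ls cosn://examplebucket-125xxxxxxx/data/flume/kafka/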
