After some trial and error, I finally got Flume on Windows to extract data from txt files into SQL Server. The development process is shared below:
(1) Install jre-8u171-windows-x64.exe
(2) Unpack apache-flume-1.7.0-bin
(3) Write the Flume configuration file client.properties
a1.channels = c1
a1.sources = r1
a1.sinks = k1
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
# Directory watched by Flume
a1.sources.r1.spoolDir = D:/flumetest
a1.sources.r1.fileHeader = true
a1.sources.r1.fileHeaderKey = file
a1.sources.r1.deletePolicy = immediate
a1.sources.r1.recursiveDirectorySearch = true
a1.sources.r1.inputCharset = UTF-8
a1.sources.r1.batchSize = 100
a1.sources.r1.decodeErrorPolicy = IGNORE
a1.sources.r1.deserializer = LINE
a1.channels.c1.type = file
# Checkpoint directory; must not be inside the directory Flume watches
a1.channels.c1.checkpointDir = D:/flumedata/checkpoint
# Data directory; must not be inside the directory Flume watches
a1.channels.c1.dataDirs = D:/flumedata/data
a1.channels.c1.keep-alive = 1
# transactionCapacity must be <= capacity
a1.channels.c1.transactionCapacity = 500
a1.channels.c1.capacity = 500
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = 10.96.183.54
a1.sinks.k1.port = 12345
a1.sinks.k1.channel = c1
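Flume agent configuration uses the Java properties file format, where `#` starts a comment only at the beginning of a line; text after a value on the same line becomes part of that value. The following is a minimal sketch, not part of Flume, that loads a configuration with `java.util.Properties` and flags three things mentioned in the comments above: values that contain `#`, a transactionCapacity larger than capacity, and channel directories placed inside the watched directory. The class name `FlumeConfigCheck` and its methods are hypothetical, and the key names assume the a1/r1/c1 naming used in client.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not part of the Flume distribution.
class FlumeConfigCheck {

    // Key names follow the a1/r1/c1 naming used in client.properties.
    static final String SPOOL_DIR = "a1.sources.r1.spoolDir";
    static final String CHECKPOINT_DIR = "a1.channels.c1.checkpointDir";
    static final String DATA_DIRS = "a1.channels.c1.dataDirs";
    static final String TX_CAPACITY = "a1.channels.c1.transactionCapacity";
    static final String CAPACITY = "a1.channels.c1.capacity";

    /** Parses properties text and returns a list of problems (empty if none were found). */
    static List<String> check(String text) throws IOException {
        Properties p = new Properties();
        p.load(new StringReader(text));
        List<String> problems = new ArrayList<>();

        // "key = value #note" is loaded as the value "value #note".
        for (String key : p.stringPropertyNames()) {
            if (p.getProperty(key).contains("#")) {
                problems.add(key + ": value contains '#', possibly an inline comment");
            }
        }

        // transactionCapacity must not exceed capacity.
        Integer tx = intOrNull(p.getProperty(TX_CAPACITY));
        Integer cap = intOrNull(p.getProperty(CAPACITY));
        if (tx != null && cap != null && tx > cap) {
            problems.add(TX_CAPACITY + " (" + tx + ") exceeds " + CAPACITY + " (" + cap + ")");
        }

        // Channel directories must not live inside the watched directory.
        String spool = p.getProperty(SPOOL_DIR);
        if (spool != null) {
            Path watched = Paths.get(spool.trim()).normalize();
            for (String key : new String[] {CHECKPOINT_DIR, DATA_DIRS}) {
                for (String dir : p.getProperty(key, "").split(",")) {
                    String d = dir.trim();
                    if (!d.isEmpty() && Paths.get(d).normalize().startsWith(watched)) {
                        problems.add(key + ": " + d + " is inside the watched directory");
                    }
                }
            }
        }
        return problems;
    }

    /** Returns the parsed integer, or null when the value is missing or not a number. */
    static Integer intOrNull(String s) {
        try {
            return s == null ? null : Integer.valueOf(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        String sample = String.join("\n",
                "a1.sources.r1.spoolDir = D:/flumetest #watched directory",
                "a1.channels.c1.transactionCapacity = 500",
                "a1.channels.c1.capacity = 500");
        for (String problem : check(sample)) {
            System.out.println(problem);
        }
    }
}
```

To check a real file, its contents can be read into a string and passed to `check`; an empty result only means none of these three issues was found, not that the configuration is otherwise valid.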
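(4) Flume's log4j.properties
This file configures the log file that Flume writes at runtime. Start Flume from cmd with: flume-ng.cmd agent -c ../conf -f ../conf/client.properties -n a1. When this log4j configuration is used, there is no need to add -property flume.root.logger=ERROR,console; the logs are written to the directory set by flume.log.dir.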
#flume.root.logger=DEBUG,console
flume.root.logger=ERROR,LOGFILE
flume.log.dir=./logs
flume.log.file=flume.log
log4j.logger.org.apache.flume.lifecycle = INFO
log4j.logger.org.jboss = WARN
log4j.logger.org.mortbay = INFO
log4j.logger.org.apache.avro.ipc.NettyTransceiver = WARN
log4j.logger.org.apache.hadoop = INFO
log4j.logger.org.apache.hadoop.hive = ERROR
# Define the root logger to the system property "flume.root.logger".
log4j.rootLogger=${flume.root.logger}
# Stock log4j rolling file appender
# Default log rotation configuration
log4j.appender.LOGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.LOGFILE.MaxFileSize=100MB
log4j.appender.LOGFILE.MaxBackupIndex=10
log4j.appender.LOGFILE.File=${flume.log.dir}/${flume.log.file}
log4j.appender.LOGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.LOGFILE.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n
log4j.appender.DAILY=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DAILY.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.DAILY.rollingPolicy.ActiveFileName=${flume.log.dir}/${flume.log.file}
log4j.appender.DAILY.rollingPolicy.FileNamePattern=${flume.log.dir}/${flume.log.file}.%d{yyyy-MM-dd}
log4j.appender.DAILY.layout=org.apache.log4j.PatternLayout
log4j.appender.DAILY.layout.ConversionPattern=%d{dd MMM yyyy HH:mm:ss,SSS} %-5p [%t] (%C.%M:%L) %x - %m%n
# console
# Add "console" to flume.root.logger above if you want to use this
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d (%t) [%p - %l] %m%n
(5) Run Flume in the background
1. Write startFlume.bat
@echo off
cd /d D:\APP\apache-flume-1.7.0-bin\bin
flume-ng.cmd agent -c ../conf -f ../conf/client.properties -n a1
2. Write the start.vbe file
CreateObject("WScript.Shell").Run "cmd /c D:\APP\apache-flume-1.7.0-bin\bin\startFlume.bat /start",0
3. Run start.vbe. The agent then runs in the background, and a java.exe process is visible in Task Manager.
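Once the agent is running, one way to exercise the spooldir source is to drop a test file into D:/flumetest. The spooling directory source expects a file to be complete and unchanged once it appears in the watched directory, so the sketch below writes the file in a separate staging directory and then moves it into place in one step. The class `SpoolDrop`, the staging directory, and the file name are hypothetical; the staging directory is assumed to be on the same drive as the watched directory and outside it, since recursiveDirectorySearch = true would also scan subfolders.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Flume distribution.
class SpoolDrop {

    /**
     * Writes the lines to a temporary file in stagingDir, then moves the finished
     * file into spoolDir under fileName with a single atomic move.
     */
    static Path drop(Path stagingDir, Path spoolDir, String fileName, List<String> lines)
            throws IOException {
        Files.createDirectories(stagingDir);
        Path tmp = Files.createTempFile(stagingDir, "staging-", ".tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8); // matches inputCharset = UTF-8

        Path target = spoolDir.resolve(fileName);
        if (Files.exists(target)) {
            // Refuse to overwrite a file the source may already be reading.
            Files.delete(tmp);
            throw new FileAlreadyExistsException(target.toString());
        }
        // ATOMIC_MOVE fails instead of falling back to copy-then-delete
        // when the two directories are on different volumes.
        return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java SpoolDrop <stagingDir> <spoolDir>");
            return;
        }
        // A timestamp in the name keeps file names unique across runs.
        String name = "sample-" + System.currentTimeMillis() + ".txt";
        Path moved = drop(Paths.get(args[0]), Paths.get(args[1]), name,
                Arrays.asList("1,first row", "2,second row"));
        System.out.println("dropped " + moved);
    }
}
```

For example, `java SpoolDrop D:/flumestaging D:/flumetest` would stage the file in the hypothetical D:/flumestaging folder and then move it into the watched directory.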
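(6) Set start.vbe to run at startup
Place start.vbe in the user's Startup folder:
C:\Users\ICC17K761\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup
or in C:\ProgramData\Microsoft\Windows\Start Menu\Programs\Startup (applies to all accounts)
Notes:
1. JAVA_HOME must be set in the environment variables.
2. When configuring the Windows PATH environment variable, put the Java path at the very front of the PATH value, so that the system does not pick up a java.exe from somewhere else and cause Flume to fail.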
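The two notes can be checked with a small program run through whatever `java` the PATH resolves to. This is only a sketch under the assumption that a matching setup means the running JVM's `java.home` is JAVA_HOME itself or a folder inside it (for example a `jre` subfolder); the class name `JavaEnvCheck` is hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the Flume distribution.
class JavaEnvCheck {

    /** True when runtimeHome is the JAVA_HOME directory itself or lies inside it. */
    static boolean runsFromJavaHome(String javaHomeEnv, String runtimeHome) {
        if (javaHomeEnv == null || javaHomeEnv.trim().isEmpty()) {
            return false; // JAVA_HOME is not set
        }
        Path env = Paths.get(javaHomeEnv.trim()).toAbsolutePath().normalize();
        Path run = Paths.get(runtimeHome).toAbsolutePath().normalize();
        return run.startsWith(env);
    }

    public static void main(String[] args) {
        String env = System.getenv("JAVA_HOME");
        String run = System.getProperty("java.home");
        System.out.println("JAVA_HOME = " + env);
        System.out.println("java.home = " + run);
        System.out.println(runsFromJavaHome(env, run)
                ? "The java on PATH belongs to JAVA_HOME."
                : "JAVA_HOME is unset or the java on PATH comes from another install.");
    }
}
```

If the last line reports a mismatch, another java.exe probably appears earlier in PATH than the intended one.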
The next post covers source-code development of a custom Flume component for SQL Server.