Summary of problems encountered collecting logs into Hive with Flume

The exception is as follows:

19/01/18 02:12:51 WARN hive.HiveSink: k1 : Failed connecting to EndPoint {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:99)
	at org.apache.flume.sink.hive.HiveSink.getOrCreateWriter(HiveSink.java:343)
	at org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:295)
	at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:253)
	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flume.sink.hive.HiveWriter$TxnBatchException: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.flume.sink.hive.HiveWriter.nextTxnBatch(HiveWriter.java:400)
	at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:90)
	... 6 more
Caused by: org.apache.hive.hcatalog.streaming.TransactionBatchUnAvailable: Unable to acquire transaction batch on end point: {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:514)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:464)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:351)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:331)
	at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:395)
	at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:392)
	at org.apache.flume.sink.hive.HiveWriter$11.call(HiveWriter.java:428)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	... 1 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing open_txns
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_open_txns(ThriftHiveMetastore.java:4195)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.open_txns(ThriftHiveMetastore.java:4182)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openTxns(HiveMetaStoreClient.java:1988)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.openTxnImpl(HiveEndPoint.java:523)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:507)
	... 10 more
19/01/18 02:12:51 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:267)
	at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
	at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flume.sink.hive.HiveWriter$ConnectException: Failed connecting to EndPoint {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:99)
	at org.apache.flume.sink.hive.HiveSink.getOrCreateWriter(HiveSink.java:343)
	at org.apache.flume.sink.hive.HiveSink.drainOneBatch(HiveSink.java:295)
	at org.apache.flume.sink.hive.HiveSink.process(HiveSink.java:253)
	... 3 more
Caused by: org.apache.flume.sink.hive.HiveWriter$TxnBatchException: Failed acquiring Transaction Batch from EndPoint: {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.flume.sink.hive.HiveWriter.nextTxnBatch(HiveWriter.java:400)
	at org.apache.flume.sink.hive.HiveWriter.<init>(HiveWriter.java:90)
	... 6 more
Caused by: org.apache.hive.hcatalog.streaming.TransactionBatchUnAvailable: Unable to acquire transaction batch on end point: {metaStoreUri='thrift://wangfutai:9083', database='hive', table='flume2', partitionVals=[] }
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:514)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:464)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatchImpl(HiveEndPoint.java:351)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$ConnectionImpl.fetchTransactionBatch(HiveEndPoint.java:331)
	at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:395)
	at org.apache.flume.sink.hive.HiveWriter$9.call(HiveWriter.java:392)
	at org.apache.flume.sink.hive.HiveWriter$11.call(HiveWriter.java:428)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	... 1 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing open_txns
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_open_txns(ThriftHiveMetastore.java:4195)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.open_txns(ThriftHiveMetastore.java:4182)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.openTxns(HiveMetaStoreClient.java:1988)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.openTxnImpl(HiveEndPoint.java:523)
	at org.apache.hive.hcatalog.streaming.HiveEndPoint$TransactionBatchImpl.<init>(HiveEndPoint.java:507)
	... 10 more

The root cause, "Internal error processing open_txns", means the metastore failed to open a transaction batch. That usually points to missing Hive ACID (transaction) configuration or missing HCatalog jars on Flume's classpath; the checks below cover both.

1. Check .bash_profile

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
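After editing, reload the profile so the running shell picks up the change. A minimal sketch, assuming HIVE_HOME points at your Hive install (the path below is an assumption):

# Assumed install path -- adjust to your environment
export HIVE_HOME=/opt/hive-1.1.0-cdh5.15.0
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*

source ~/.bash_profile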

2. Check the jars

hive-hcatalog-core-1.1.0-cdh5.15.0.jar
hive-hcatalog-pig-adapter-1.1.0-cdh5.15.0.jar
hive-hcatalog-server-extensions-1.1.0-cdh5.15.0.jar
hive-hcatalog-streaming-1.1.0-cdh5.15.0.jar
Copy these four jars from hive-1.1.0-cdh5.15.0/hcatalog/share/hcatalog into the apache-flume-1.6.0-cdh5.15.0-bin/lib directory; see the sketch below.
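A sketch of the copy, assuming both Hive and Flume are installed under /opt (the prefix is an assumption):

cd /opt/hive-1.1.0-cdh5.15.0/hcatalog/share/hcatalog
cp hive-hcatalog-core-1.1.0-cdh5.15.0.jar \
   hive-hcatalog-pig-adapter-1.1.0-cdh5.15.0.jar \
   hive-hcatalog-server-extensions-1.1.0-cdh5.15.0.jar \
   hive-hcatalog-streaming-1.1.0-cdh5.15.0.jar \
   /opt/apache-flume-1.6.0-cdh5.15.0-bin/lib/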

3. Check hive-site.xml

The following settings are recommended (note: not every one of them is required to fix this exception; compare against your own hive-site.xml to see which are missing):


<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
  <description>Whether to print the names of the columns in query output.</description>
</property>
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
  <description>Whether to include the current database in the Hive prompt.</description>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://xxx:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://xxx:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/xxx/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.parallel</name>
  <value>true</value>
  <description>Whether to execute jobs in parallel</description>
</property>
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
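After changing hive-site.xml, restart the metastore so the transaction settings take effect; one common way, assuming the hive command is on your PATH:

# stop the old metastore process first, then:
hive --service metastore &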

4. Copy hive-site.xml and hive-env.sh into apache-flume-1.6.0-cdh5.15.0-bin/conf, as sketched below.
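A sketch, again assuming the /opt install prefix from above:

cp $HIVE_HOME/conf/hive-site.xml $HIVE_HOME/conf/hive-env.sh \
   /opt/apache-flume-1.6.0-cdh5.15.0-bin/conf/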

5. The table must be bucketed, have transactions enabled, and be stored as ORC:

create table hive.flume2 (
  id   int,
  name string,
  age  int
)
clustered by (id) into 2 buckets
stored as orc
tblproperties ('transactional' = 'true');
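For completeness, a minimal Flume agent configuration targeting this table. The agent/source/channel names and the netcat test source are assumptions; the sink properties follow the Flume 1.6 Hive sink documentation, and the endpoint values match the exception above:

a1.sources = r1
a1.channels = c1
a1.sinks = k1

# test source -- replace with your real log source (exec, spooldir, ...)
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sinks.k1.type = hive
a1.sinks.k1.channel = c1
a1.sinks.k1.hive.metastore = thrift://wangfutai:9083
a1.sinks.k1.hive.database = hive
a1.sinks.k1.hive.table = flume2
a1.sinks.k1.serializer = DELIMITED
a1.sinks.k1.serializer.delimiter = ","
a1.sinks.k1.serializer.fieldnames = id,name,age

Start the agent with something like (the config file name is an assumption):

bin/flume-ng agent --conf conf --conf-file conf/flume-hive.conf --name a1 -Dflume.root.logger=INFO,console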
