https://www.confluent.io/connector/kafka-connect-hbase-sink/
README: https://github.com/nishutayal/kafka-connect-hbase/blob/master/README.md
name | data type | required | description |
---|---|---|---|
zookeeper.quorum | string | yes | ZooKeeper quorum of the HBase cluster |
event.parser.class | string | yes | Parser for the record values; either AvroEventParser or JsonEventParser |
topics | string | yes | List of Kafka topics to sink |
`hbase.<table>.rowkey.columns` | string | yes | Columns that form the row key (comma-separated) |
`hbase.<table>.family` | string | yes | Column families, one or more (comma-separated) |
`hbase.<table>.<columnfamily>.columns` | string | no | Column-to-family mapping; required only when more than one column family is configured (comma-separated) |

Sample configuration from the README:
name=kafka-cdc-hbase
connector.class=io.svectors.hbase.sink.HBaseSinkConnector
tasks.max=1
topics=test
zookeeper.quorum=localhost:2181
event.parser.class=io.svectors.hbase.parser.AvroEventParser
hbase.test.rowkey.columns=id
hbase.test.rowkey.delimiter=|
hbase.test.family=c,d
hbase.test.c.columns=c1,c2
hbase.test.d.columns=d1,d2
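Note that the connector writes into an existing HBase table rather than creating one, so for the sample above the table test with families c and d would have to exist first; a minimal HBase shell sketch (table and family names taken from the sample config):

hbase shell
create 'test', 'c', 'd'

The walkthrough below does the same for the kafka_test table.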
Configuration actually used:
name=kafka-cdc-hbase
connector.class=io.svectors.hbase.sink.HBaseSinkConnector
tasks.max=1
topics=kafka_test
zookeeper.quorum=bigdata1:2181
event.parser.class=io.svectors.hbase.parser.AvroEventParser
# properties for hbase table 'kafka_test'
hbase.kafka_test.rowkey.columns=id
hbase.kafka_test.rowkey.delimiter=|
hbase.kafka_test.family=cf
# in case of more than one column family, define the column mapping
#hbase.kafka_test.family=c,d
#hbase.kafka_test.c.columns=c1,c2
#hbase.kafka_test.d.columns=d1,d2
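Side note: the properties file above is for a standalone worker. If the Connect cluster runs in distributed mode instead, the same settings can be submitted as JSON through the Connect REST API; a sketch, assuming the default REST listener on port 8083:

curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "kafka-cdc-hbase",
  "config": {
    "connector.class": "io.svectors.hbase.sink.HBaseSinkConnector",
    "tasks.max": "1",
    "topics": "kafka_test",
    "zookeeper.quorum": "bigdata1:2181",
    "event.parser.class": "io.svectors.hbase.parser.AvroEventParser",
    "hbase.kafka_test.rowkey.columns": "id",
    "hbase.kafka_test.rowkey.delimiter": "|",
    "hbase.kafka_test.family": "cf"
  }
}'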
Upload the downloaded zip package nishutayal-kafka-connect-hbase-1.0.0.zip to the server.

Steps:
cd /home/kafka/confluent-5.1.2/share/java/
mkdir kafka-connect-hbase
unzip nishutayal-kafka-connect-hbase-1.0.0.zip
cp ./nishutayal-kafka-connect-hbase-1.0.0/lib/* kafka-connect-hbase
# add the cluster's hbase-site.xml to the root of the connector jar
jar -uvf kafka-connect-hbase-1.0.0.jar hbase-site.xml
What was actually run:
jar -uvf kafka-connect-hbase-1.0.0.jar hbase-site.xml
Note: this jar command failed under OpenJDK; run the update on a machine with the Oracle JDK and copy the jar back, or, on Windows, use an archive tool to add hbase-site.xml to the root of the jar.
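Either way, it is worth confirming that hbase-site.xml really ended up at the root of the archive before deploying; listing the jar contents is enough:

jar -tvf kafka-connect-hbase-1.0.0.jar | grep hbase-site.xml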
hbase shell
create 'kafka_test','cf'
list
(output omitted)
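Optionally, confirm in the same shell that the table layout matches the connector config (a single family cf):

describe 'kafka_test'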
export CLASSPATH=$CONFLUENT_HOME/share/java/kafka-connect-hbase/hbase-sink.jar
$CONFLUENT_HOME/bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-hbase/hbase-sink.properties
What was actually run:
export CLASSPATH=$CLASSPATH:/home/kafka/confluent-5.1.2/share/java/kafka-connect-hbase
bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-hbase/hbase-sink-connector.properties
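Once the worker is up, the connector state can be checked through the Connect REST API; assuming the default REST listener on port 8083:

curl http://localhost:8083/connectors/kafka-cdc-hbase/status

With the task RUNNING, produce a couple of Avro test records: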
bin/kafka-avro-console-producer --broker-list appserver5:9092 --topic kafka_test --property value.schema='{"type":"record","name":"record","fields":[{"name":"id","type":"int"}, {"name":"name", "type": "string"}]}'
# insert at prompt
{"id": 1, "name": "foo"}
{"id": 2, "name": "bar"}
hbase shell
list
scan 'kafka_test'
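With rowkey.columns=id, the scan should return the two test records keyed by id, roughly like this (illustrative; the exact cells depend on the connector's column mapping and write timestamps):

ROW                  COLUMN+CELL
 1                   column=cf:name, timestamp=..., value=foo
 2                   column=cf:name, timestamp=..., value=bar

For reference, the Connect worker log from startup, showing the sink task initializing, joining the consumer group, and connecting to the HBase ZooKeeper ensemble: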
[2019-03-25 12:40:23,070] INFO WorkerSinkTask{id=kafka-cdc-hbase-0} Sink task finished initialization and start (org.apache.kafka.connect.runtime.WorkerSinkTask:303)
[2019-03-25 12:40:23,080] INFO Cluster ID: zP7zgprFQhm9lMJ8NDfwXg (org.apache.kafka.clients.Metadata:285)
[2019-03-25 12:40:23,081] INFO [Consumer clientId=consumer-1, groupId=connect-kafka-cdc-hbase] Discovered group coordinator appserver5:9092 (id: 2147483597 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:654)
[2019-03-25 12:40:23,083] INFO [Consumer clientId=consumer-1, groupId=connect-kafka-cdc-hbase] Revoking previously assigned partitions [] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:458)
[2019-03-25 12:40:23,084] INFO [Consumer clientId=consumer-1, groupId=connect-kafka-cdc-hbase] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:486)
[2019-03-25 12:40:23,095] INFO [Consumer clientId=consumer-1, groupId=connect-kafka-cdc-hbase] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:450)
[2019-03-25 12:40:23,097] INFO [Consumer clientId=consumer-1, groupId=connect-kafka-cdc-hbase] Setting newly assigned partitions [kafka_test-0] (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:289)
[2019-03-25 12:40:23,105] INFO [Consumer clientId=consumer-1, groupId=connect-kafka-cdc-hbase] Resetting offset for partition kafka_test-0 to offset 0. (org.apache.kafka.clients.consumer.internals.Fetcher:583)
[2019-03-25 12:40:23,329] WARN Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (org.apache.hadoop.util.NativeCodeLoader:62)
[2019-03-25 12:40:23,524] WARN hbase.regionserver.global.memstore.upperLimit is deprecated by hbase.regionserver.global.memstore.size (org.apache.hadoop.hbase.io.util.HeapMemorySizeUtil:76)
[2019-03-25 12:40:23,558] INFO Process identifier=hconnection-0x64df6e5a connecting to ZooKeeper ensemble=bigdata1:2181 (org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:122)
[2019-03-25 12:40:23,596] WARN Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties (org.apache.hadoop.metrics2.impl.MetricsConfig:125)
[2019-03-25 12:40:23,611] INFO Scheduled snapshot period at 10 second(s). (org.apache.hadoop.metrics2.impl.MetricsSystemImpl:376)
[2019-03-25 12:40:23,612] INFO HBase metrics system started (org.apache.hadoop.metrics2.impl.MetricsSystemImpl:192)
[2019-03-25 12:40:23,624] INFO Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl (org.apache.hadoop.hbase.metrics.MetricRegistries:65)
[2019-03-25 12:40:23,667] INFO ClusterId read in ZooKeeper is null (org.apache.hadoop.hbase.client.ZooKeeperRegistry:107)