Kafka Connect: Common Problems

ERROR WorkerSourceTask{id=mysql-source-binlog-336-jobId-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)

org.apache.kafka.connect.errors.ConnectException: Unrecoverable exception from producer send callback

        at org.apache.kafka.connect.runtime.WorkerSourceTask.maybeThrowProducerSendException(WorkerSourceTask.java:252)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:221)

        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)

        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 3900896 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.

Check the value of max.request.size in the INFO ProducerConfig values section of the Kafka Connect log.

The max.request.size of the database-history producer (mysql-source-binlog-336-jobId-dbhistory) had been changed, but the max.request.size of the data producer (connector-producer-mysql-source-binlog-336-jobId-0) had not, so the error persisted.

Reference:

https://github.com/confluentinc/cp-docker-images/issues/445

Adding the parameters

"database.history.producer.max.request.size": "157286400",

"max.request.size": "157286400"

to the source connector configuration resolved the issue.
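A minimal sketch of applying the fix through the Connect REST API, assuming a worker listening on localhost:8083 and jq installed; the connector name is taken from the task id in the log above.

# Fetch the current source config and merge in the two size settings.
curl -s http://localhost:8083/connectors/mysql-source-binlog-336-jobId/config \
  | jq '. + {"database.history.producer.max.request.size": "157286400",
             "max.request.size": "157286400"}' > /tmp/mysql-source-336.json
# PUT it back; the connector's tasks restart with the new configuration.
curl -s -X PUT -H "Content-Type: application/json" \
  --data @/tmp/mysql-source-336.json \
  http://localhost:8083/connectors/mysql-source-binlog-336-jobId/config

Note that on some Connect versions the data producer only picks up a worker-level producer.max.request.size (or a per-connector producer.override.max.request.size when client overrides are enabled), so if the error persists, set it there as well.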

---------------------------------------------------------------------------------------------------------------------

{"code":500,"data":null,"msg":"java.lang.IllegalArgumentException: URI is not absolute"}

Wait for the next occurrence and analyze it through the logs at http://elk-ops.iqdnet.cn/; logs are only retained for 4 days, so the earlier occurrence can no longer be inspected.

---------------------------------------------------------------------------------------------------------------------

[2019-11-14 14:10:10,064] ERROR WorkerSourceTask{id=mysql-source-binlog-336-jobId-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)

org.apache.kafka.connect.errors.ConnectException: Tolerance exceeded in error handler

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:178)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.convertTransformedRecord(WorkerSourceTask.java:281)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:309)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:234)

        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)

        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.kafka.connect.errors.DataException: Failed to serialize Avro data from topic mysql-source-binlog-336-jobId.slow_log.slow_log :

        at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:83)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.lambda$convertTransformedRecord$1(WorkerSourceTask.java:281)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)

        ... 11 more

Caused by: org.apache.kafka.common.errors.SerializationException: Error registering Avro schema: {"type":"record","name":"Key","namespace":"mysql_source_binlog_336_jobId.slow_log.slow_log","fields":[{"name":"id","type":"string"}],"connect.name":"mysql_source_binlog_336_jobId.slow_log.slow_log.Key"}

Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Error while forwarding register schema request to the master; error code: 50003

        at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:230)

        at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:256)

        at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:356)

        at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:348)

        at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:334)

        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:168)

        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:222)

        at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:198)

        at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:70)

        at io.confluent.connect.avro.AvroConverter$Serializer.serialize(AvroConverter.java:131)

        at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:80)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.lambda$convertTransformedRecord$1(WorkerSourceTask.java:281)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execute(RetryWithToleranceOperator.java:104)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.convertTransformedRecord(WorkerSourceTask.java:281)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:309)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:234)

        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)

        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Checked the Schema Registry logs and found no obvious error, but noticed that the hostname was unreachable.

---------------------------------------------------------------------------------------------------------------------

[2019-11-21 16:32:16,681] ERROR Error during binlog processing. Last offset stored = {ts_sec=1574325000, file=mysqld-bin.008559, pos=419190460, row=1, server_id=463306, event=3}, binlog reader near position = mysqld-bin.008559/430041542 (io.debezium.connector.mysql.BinlogReader:1054)

[2019-11-21 16:32:16,681] ERROR Failed due to error: Error processing binlog event (io.debezium.connector.mysql.BinlogReader:209)

org.apache.kafka.connect.errors.ConnectException: Error recording the DDL statement(s) in the database history Kafka topic dbhistory.mysql-source-binlog-333-jobId:0 using brokers at null: truncate table mysql.slow_log

        at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)

        at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:208)

        at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:508)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1095)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:943)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)

        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)

        at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.kafka.connect.errors.ConnectException: Error recording the DDL statement(s) in the database history Kafka topic dbhistory.mysql-source-binlog-333-jobId:0 using brokers at null: truncate table mysql.slow_log

        at io.debezium.connector.mysql.MySqlSchema.applyDdl(MySqlSchema.java:361)

        at io.debezium.connector.mysql.BinlogReader.handleQueryEvent(BinlogReader.java:694)

        at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:492)

        ... 5 more

Caused by: io.debezium.relational.history.DatabaseHistoryException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for dbhistory.mysql-source-binlog-333-jobId-0:120001 ms has passed since batch creation

        at io.debezium.relational.history.KafkaDatabaseHistory.storeRecord(KafkaDatabaseHistory.java:198)

        at io.debezium.relational.history.AbstractDatabaseHistory.record(AbstractDatabaseHistory.java:66)

        at io.debezium.relational.history.AbstractDatabaseHistory.record(AbstractDatabaseHistory.java:60)

        at io.debezium.connector.mysql.MySqlSchema.applyDdl(MySqlSchema.java:356)

        ... 7 more

Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for dbhistory.mysql-source-binlog-333-jobId-0:120001 ms has passed since batch creation

        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:98)

        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:67)

        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)

        at io.debezium.relational.history.KafkaDatabaseHistory.storeRecord(KafkaDatabaseHistory.java:188)

        ... 10 more

Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 1 record(s) for dbhistory.mysql-source-binlog-333-jobId-0:120001 ms has passed since batch creation             

---------------------------------------------------------------------------------------------------------------------

ERROR WorkerSourceTask{id=mysql-source-binlog-358-jobId-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)

org.apache.kafka.connect.errors.ConnectException: log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'mysqld-bin.000060' at 910723425, the last event read from '/data/binlog/qdpp/mysqld-bin.000060' at 123, the last byte read from '/data/binlog/qdpp/mysqld-bin.000060' at 910723444. Error code: 1236; SQLSTATE: HY000.

        at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)

        at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:197)

        at io.debezium.connector.mysql.BinlogReader$ReaderThreadLifecycleListener.onCommunicationFailure(BinlogReader.java:1041)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:950)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)

        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)

        at java.lang.Thread.run(Thread.java:748)

Caused by: com.github.shyiko.mysql.binlog.network.ServerException: log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'mysqld-bin.000060' at 910723425, the last event read from '/data/binlog/qdpp/mysqld-bin.000060' at 123, the last byte read from '/data/binlog/qdpp/mysqld-bin.000060' at 910723444.

        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:914)

        ... 3 more 

A single transaction's binlog event exceeded the size limit configured by max_allowed_packet.

In addition, error code 1236 indicates that the requested binlog no longer exists.
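A sketch of raising the limit on the MySQL master, assuming administrator access; 1073741824 (1 GiB) is only an example value, and the change should also be persisted in my.cnf so it survives a restart.

# Check the current limit, then raise it at runtime.
mysql -h <master-host> -u root -p -e "SHOW VARIABLES LIKE 'max_allowed_packet';"
mysql -h <master-host> -u root -p -e "SET GLOBAL max_allowed_packet = 1073741824;"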

---------------------------------------------------------------------------------------------------------------------

ERROR WorkerSinkTask{id=elasticsearch-sink-binlog-364-taskId-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted. (org.apache.kafka.connect.runtime.WorkerSinkTask:558)

org.apache.kafka.connect.errors.ConnectException: java.net.SocketTimeoutException: Read timed out

        at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.indexExists(JestElasticsearchClient.java:284)

        at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.createIndices(JestElasticsearchClient.java:290)

        at io.confluent.connect.elasticsearch.ElasticsearchWriter.write(ElasticsearchWriter.java:255)

        at io.confluent.connect.elasticsearch.ElasticsearchSinkTask.put(ElasticsearchSinkTask.java:169)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:538)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:321)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)

        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)

        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

Probably caused by an Elasticsearch restart; the task has to be restarted manually.

Reference (describes the connection being dropped after long idle periods):

https://github.com/confluentinc/kafka-connect-elasticsearch/pull/349

Add the parameter

max.connection.idle.time.ms
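A sketch of restarting the killed task and applying the parameter via the Connect REST API, assuming a worker on localhost:8083 and jq; 600000 ms is only an example value.

# Restart the sink task that was killed.
curl -s -X POST http://localhost:8083/connectors/elasticsearch-sink-binlog-364-taskId/tasks/0/restart
# Merge the idle-connection setting into the sink configuration.
curl -s http://localhost:8083/connectors/elasticsearch-sink-binlog-364-taskId/config \
  | jq '. + {"max.connection.idle.time.ms": "600000"}' > /tmp/es-sink-364.json
curl -s -X PUT -H "Content-Type: application/json" \
  --data @/tmp/es-sink-364.json \
  http://localhost:8083/connectors/elasticsearch-sink-binlog-364-taskId/config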

---------------------------------------------------------------------------------------------------------------------

[2019-11-25 20:27:55,399] ERROR WorkerSourceTask{id=mysql-source-binlog-333-jobId-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)

org.apache.kafka.connect.errors.ConnectException: Error recording the DDL statement(s) in the database history Kafka topic dbhistory.mysql-source-binlog-333-jobId:0 using brokers at null: SAVEPOINT `SAVEPOINT_1`

        at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)

        at io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:208)

        at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:508)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1095)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:943)

        at com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)

        at com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)

        at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.kafka.connect.errors.ConnectException: Error recording the DDL statement(s) in the database history Kafka topic dbhistory.mysql-source-binlog-333-jobId:0 using brokers at null: SAVEPOINT `SAVEPOINT_1`

        at io.debezium.connector.mysql.MySqlSchema.applyDdl(MySqlSchema.java:361)

        at io.debezium.connector.mysql.BinlogReader.handleQueryEvent(BinlogReader.java:694)

        at io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:492)

        ... 5 more

Caused by: io.debezium.relational.history.DatabaseHistoryException: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

        at io.debezium.relational.history.KafkaDatabaseHistory.storeRecord(KafkaDatabaseHistory.java:198)

        at io.debezium.relational.history.AbstractDatabaseHistory.record(AbstractDatabaseHistory.java:66)

        at io.debezium.relational.history.AbstractDatabaseHistory.record(AbstractDatabaseHistory.java:60)

        at io.debezium.connector.mysql.MySqlSchema.applyDdl(MySqlSchema.java:356)

        ... 7 more

Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.valueOrError(FutureRecordMetadata.java:98)

        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:67)

        at org.apache.kafka.clients.producer.internals.FutureRecordMetadata.get(FutureRecordMetadata.java:30)

        at io.debezium.relational.history.KafkaDatabaseHistory.storeRecord(KafkaDatabaseHistory.java:188)

        ... 10 more

Caused by: org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition.

Caused by a leader switch when Kafka was restarted to apply a configuration change.

---------------------------------------------------------------------------------------------------------------------

Check the partition layout:

./bin/kafka-topics  --bootstrap-server  10.37.251.101:9092 --topic mysql-source-binlog-324-jobId.longfor_mdm.cp_role_user_relation --describe

./bin/kafka-topics  --bootstrap-server  kafka-platform-01.qiandingyun.com:9092 --topic mysql-source-binlog-337-jobId.databus.ads_fenxiao_hehuoren_detail --describe

This showed that the topic has 5 partitions.

Check the offsets:

./bin/kafka-consumer-groups --bootstrap-server 10.37.251.101:9092 --describe --group mysql-source-binlog-324-jobId.longfor_mdm.cp_role_user_relation

---------------------------------------------------------------------------------------------------------------------

Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema being registered is incompatible with an earlier schema; error code: 409; error code: 409   

An earlier schema version registered under the subject conflicts with the new one; delete the old version directly:

http://10.50.6.54:8081/subjects/mysql-source-binlog-336-jobId.slow_log.slow_log-value/versions/1
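A sketch of the corresponding Schema Registry REST calls; listing the versions first makes it clear which one to delete.

curl -s http://10.50.6.54:8081/subjects/mysql-source-binlog-336-jobId.slow_log.slow_log-value/versions
curl -s -X DELETE http://10.50.6.54:8081/subjects/mysql-source-binlog-336-jobId.slow_log.slow_log-value/versions/1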

---------------------------------------------------------------------------------------------------------------------

[2019-12-04 17:05:03,719] INFO [ReplicaFetcher replicaId=1001, leaderId=0, fetcherId=0] Retrying leaderEpoch request for partition _schemas-0 as the leader reported an error: UNKNOWN_SERVER_ERROR (kafka.server.ReplicaFetcherThread)

[2019-12-04 17:05:04,721] WARN [ReplicaFetcher replicaId=1001, leaderId=0, fetcherId=0] Error when sending leader epoch request for Map(_schemas-0 -> (currentLeaderEpoch=Optional[1], leaderEpoch=0)) (kafka.server.ReplicaFetcherThread)

java.io.IOException: Connection to 0 was disconnected before the response was read

        at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)

        at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:107)

        at kafka.server.ReplicaFetcherThread.fetchEpochEndOffsets(ReplicaFetcherThread.scala:310)

        at kafka.server.AbstractFetcherThread.truncateToEpochEndOffsets(AbstractFetcherThread.scala:208)

        at kafka.server.AbstractFetcherThread.maybeTruncate(AbstractFetcherThread.scala:173)

        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:113)

        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)

[2019-12-04 17:05:04,722] INFO [ReplicaFetcher replicaId=1001, leaderId=0, fetcherId=0] Retrying leaderEpoch request for partition _schemas-0 as the leader reported an error: UNKNOWN_SERVER_ERROR (kafka.server.ReplicaFetcherThread)

---------------------------------------------------------------------------------------------------------------------

When the JDBC MySQL source connector is run without a timezone parameter, TIME values earlier than 08:00 possibly went negative after the -8 hour shift, triggering the error below:

Caused by: org.apache.kafka.connect.errors.DataException: Kafka Connect Time type should not have any date fields set to non-zero values.

        at org.apache.kafka.connect.data.Time.fromLogical(Time.java:64)

        at io.confluent.connect.avro.AvroData$7.convert(AvroData.java:287)

        at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:420)

        at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:607)

        at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:366)

        at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:80)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.lambda$convertTransformedRecord$2(WorkerSourceTask.java:284)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)

        ... 11 more 

Add timezone parameters to the source configuration:

?serverTimezone=Asia/Shanghai

"db.timezone":"Asia/Shanghai"

Afterwards, DATE values apparently got shifted by -8 hours as well, leaving non-zero time fields and triggering this error:

Caused by: org.apache.kafka.connect.errors.DataException: Kafka Connect Date type should not have any time fields set to non-zero values.

        at org.apache.kafka.connect.data.Date.fromLogical(Date.java:64)

        at io.confluent.connect.avro.AvroData$6.convert(AvroData.java:276)

        at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:420)

        at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:607)

        at io.confluent.connect.avro.AvroData.fromConnectData(AvroData.java:366)

        at io.confluent.connect.avro.AvroConverter.fromConnectData(AvroConverter.java:80)

        at org.apache.kafka.connect.runtime.WorkerSourceTask.lambda$convertTransformedRecord$2(WorkerSourceTask.java:284)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)

        ... 11 more   

---------------------------------------------------------------------------------------------------------------------

1.

After a Kafka restart, server.log kept logging the following WARN messages:

INFO [ReplicaFetcher replicaId=1001, leaderId=0, fetcherId=0] Retrying leaderEpoch request for partition _schemas2-0 as the leader reported an error: UNKNOWN_SERVER_ERROR (kafka.server.ReplicaFetcherThread)

[2019-12-23 16:10:08,438] INFO [ReplicaFetcher replicaId=1001, leaderId=0, fetcherId=0] Error sending fetch request (sessionId=INVALID, epoch=INITIAL) to node 0: java.io.IOException: Connection to 0 was disconnected before the response was read. (org.apache.kafka.clients.FetchSessionHandler)

WARN [ReplicaFetcher replicaId=1001, leaderId=0, fetcherId=0] Error when sending leader epoch request for Map(_schemas2-0 -> (currentLeaderEpoch=Optional[1], leaderEpoch=0)) (kafka.server.ReplicaFetcherThread)

java.io.IOException: Connection to 0 was disconnected before the response was read

        at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)

        at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:107)

        at kafka.server.ReplicaFetcherThread.fetchEpochEndOffsets(ReplicaFetcherThread.scala:310)

        at kafka.server.AbstractFetcherThread.truncateToEpochEndOffsets(AbstractFetcherThread.scala:208)

        at kafka.server.AbstractFetcherThread.maybeTruncate(AbstractFetcherThread.scala:173)

        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:113)

        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89) 

Running the command

./bin/kafka-topics --describe --zookeeper 10.37.251.222:2181 --topic _schemas2

Topic:_schemas2 PartitionCount:1        ReplicationFactor:2    Configs:cleanup.policy=compact

        Topic: _schemas2        Partition: 0    Leader: 0      Replicas: 1001,0        Isr: 0

Trigger a preferred-replica election so the preferred replica becomes the leader.

Specify the partitions whose leaders should be re-elected:

{
  "partitions": [
    {"topic": "_schemas2", "partition": 0}
  ]
}

./bin/kafka-preferred-replica-election.sh --bootstrap-server 10.37.251.101:9092 --path-to-json-file /tmp/kafka-preferred-replica-election.json

To control the leader at the replica level, reassign the partition's replica set.

Execute:

{
  "version": 1,
  "partitions": [
    {"topic": "_schemas2", "partition": 0, "replicas": [1001]}
  ]
}

./bin/kafka-reassign-partitions.sh --zookeeper 10.37.251.222:2181  --reassignment-json-file /tmp/reassign-plan.json --execute

Verify:

./bin/kafka-reassign-partitions.sh --zookeeper 10.37.251.222:2181 --reassignment-json-file /tmp/reassign-plan.json --verify

Check the topic again:

./bin/kafka-topics.sh --zookeeper 10.37.251.222:2181 --describe --topic _schemas2

Inspect the ZooKeeper metadata:

./bin/zkCli.sh -server 10.37.251.222:2181

Changed [1001,0] under /brokers/ids to [1001].

After this change the leader of some topics became -1, so the leaders were set manually (see the zkCli set commands below).

Broker 0 appeared because broker.id had at some point been set to 0 in the configuration; although the configuration was reverted and the broker restarted, this left the illusion that an extra broker existed. In hindsight, simply running a node with id 0 alongside node 1 should also have worked.

After restarting Kafka the WARN messages stopped, but Kafka data was lost and the sync tasks' destinations ended up missing records.

set /brokers/topics/_schemas2/partitions/0/state {"controller_epoch":2,"leader":1001,"version":1,"leader_epoch":6,"isr":[1001]}

set /brokers/topics/mysql-source-binlog-318-jobId.qdp_rosetta.rst_task_problem/partitions/1/state {"controller_epoch":2,"leader":1001,"version":1,"leader_epoch":1,"isr":[1001]}

Even when multiple replicas are in the ISR, only the leader can be read from.

Comparing against the _schemas topic in the QA environment showed that the dev topic only has a replication factor of 2:

./bin/kafka-topics --describe --zookeeper 10.37.253.31:2181 --topic _schemas

Topic:_schemas  PartitionCount:1        ReplicationFactor:3    Configs:cleanup.policy=compact

        Topic: _schemas Partition: 0    Leader: 0      Replicas: 0,1,2 Isr: 0,1 

_schemas2 lost data and its leader became unavailable.

2.

Follow-up problems caused by the above

After the Kafka restart, tasks started failing. The subject mysql-source-binlog-318-jobId.qdp_rosetta.rst_task_item-value does exist in _schemas, but with ID 421, so looking the schema up by ID returned nothing. Why the lookup used the wrong ID, 403, is unclear.

Caused by: org.apache.kafka.connect.errors.DataException: Failed to deserialize data for topic mysql-source-binlog-318-jobId.qdp_rosetta.rst_task to Avro:

        at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:110)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.lambda$convertAndTransformRecord$0(WorkerSinkTask.java:484)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndRetry(RetryWithToleranceOperator.java:128)

        at org.apache.kafka.connect.runtime.errors.RetryWithToleranceOperator.execAndHandleError(RetryWithToleranceOperator.java:162)

        ... 13 more                     

Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 403

Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403

        at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:230)

        at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:256)   

---------------------------------------------------------------------------------------------------------------------

poll() returned without producing any records, and the Kafka Connect log reported the following WARN:

[2019-12-20 09:54:20,153] WARN [Producer clientId=connector-producer-mysql-source-jdbc-333-jobId-0] Error while fetching metadata with correlation id 3 : {mysql-source-jdbc-333-jobId.devds.=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient:1051)

---------------------------------------------------------------------------------------------------------------------

JDBC sink auto-create table issues

The sink does not honor the length of MySQL column types and always creates columns with default lengths. In addition, a YEAR value such as 2018 ends up mapped to a DATE (2018-01-01), which does not meet requirements.

---------------------------------------------------------------------------------------------------------------------

org.apache.kafka.connect.errors.ConnectException: query may not be combined with whole-table copying settings.

When the SQL in a JDBC source contains a JOIN, the "table.whitelist" property must not be used; because "table.whitelist" was set, the schema for the corresponding topic was never created, so no data reached Kafka and no error was reported.

The topic.prefix property must then specify the complete topic name, e.g. mysql-source-jdbc-443-jobId.devds.user.
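A sketch of a query-based JDBC source under these constraints; the host, credentials, JOIN statement, and incrementing column are hypothetical placeholders, while the topic name follows the note above.

curl -s -X PUT -H "Content-Type: application/json" \
  http://localhost:8083/connectors/mysql-source-jdbc-443-jobId/config -d '{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://<host>:3306/devds?serverTimezone=Asia/Shanghai",
    "connection.user": "<user>",
    "connection.password": "<password>",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "query": "SELECT u.*, r.role FROM user u JOIN user_role r ON u.id = r.user_id",
    "topic.prefix": "mysql-source-jdbc-443-jobId.devds.user"
  }'

Note that "table.whitelist" is absent: "query" and whole-table copying settings are mutually exclusive, which is exactly what the ConnectException above says, and with "query" the topic.prefix is used as the full topic name.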

---------------------------------------------------------------------------------------------------------------------

Synchronization is missing records and stops without reporting an error:

[2020-01-08 15:43:11,124] ERROR WorkerSourceTask{id=mysql-source-binlog-353-jobId-0} Failed to flush, timed out while waiting for producer to flush outstanding 3 messages (org.apache.kafka.connect.runtime.WorkerSourceTask:431)

[2020-01-08 15:43:11,124] ERROR WorkerSourceTask{id=mysql-source-binlog-353-jobId-0} Failed to commit offsets (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:114)

---------------------------------------------------------------------------------------------------------------------

[2020-01-09 14:11:12,933] ERROR WorkerSourceTask{id=mysql-source-binlog-360-jobId-0} Failed to flush, timed out while waiting for producer to flush outstanding 3050 messages (org.apache.kafka.connect.runtime.WorkerSourceTask:431)

[2020-01-09 14:11:12,949] ERROR WorkerSourceTask{id=mysql-source-binlog-360-jobId-0} Failed to commit offsets (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:114)

---------------------------------------------------------------------------------------------------------------------

[2020-01-10 16:12:17,935] ERROR WorkerSinkTask{id=mysql-sink-binlog-485-taskId-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:179)

org.apache.kafka.common.errors.WakeupException

        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.maybeTriggerWakeup(ConsumerNetworkClient.java:490)

        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:275)

        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:233)

        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:212)

        at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:693)

        at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1454)

        at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1412)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.doCommitSync(WorkerSinkTask.java:332)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.doCommit(WorkerSinkTask.java:360)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.commitOffsets(WorkerSinkTask.java:431)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.closePartitions(WorkerSinkTask.java:590)

        at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)

        at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)

        at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)

        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

        at java.util.concurrent.FutureTask.run(FutureTask.java:266)

        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

        at java.lang.Thread.run(Thread.java:748)

---------------------------------------------------------------------------------------------------------------------

org.apache.kafka.connect.errors.ConnectException: PK mode for table 'users' is RECORD_KEY, but record key schema is missing

Configuration error:

Change "pk.mode": "record_key" to "pk.mode": "record_value".

---------------------------------------------------------------------------------------------------------------------

Failed due to error: Aborting snapshot due to error when last running 'SELECT * FROM `qding_brick`.`bd_person_addr`': Can''t call rollback when autocommit=true (io.debezium.connector.mysql.SnapshotReader:209)

---------------------------------------------------------------------------------------------------------------------

[2020-01-14 01:56:33,803] ERROR WorkerSinkTask{id=mysql-sink-binlog-501-taskId-0} Commit of offsets threw an unexpected exception for sequence number 4899: {mysql-source-binlog-371-jobId.qdp_rosetta.rst_task_item-0=OffsetAndMetadata{offset=17870259, leaderEpoch=null, metadata=''}} (org.apache.kafka.connect.runtime.WorkerSinkTask:259)

org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset commit failed with a retriable exception. You should retry committing the latest consumed offsets.

Caused by: org.apache.kafka.common.errors.TimeoutException: The request timed out.

[2020-01-14 01:56:34,053] INFO [Consumer clientId=connector-consumer-mysql-sink-binlog-501-taskId-0, groupId=connect-mysql-sink-binlog-501-taskId] Discovered group coordinator 10.50.6.53:9092 (id: 2147483644 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:728)

---------------------------------------------------------------------------------------------------------------------

During the Debezium snapshot, the JDBC query returns tinyint(1) values greater than 0 as 1.

Setting this parameter fixes it:

"database.tinyInt1isBit":"false"

---------------------------------------------------------------------------------------------------------------------

A slave with the same server_uuid/server_id as this slave has connected to the master

Cause unknown; restarting the task resolved it. Based on Google searches, it is suspected to be related to the MySQL service itself.

---------------------------------------------------------------------------------------------------------------------

Caused by: io.debezium.text.ParsingException: no viable alternative at input

Simply discard SQL that cannot be parsed: "database.history.skip.unparseable.ddl": "true"

According to search results, upgrading to Debezium 1.1.0.Beta1 fixes this bug.

---------------------------------------------------------------------------------------------------------------------

[2020-03-12 12:42:28,414] ERROR WorkerSinkTask{id=mysql-sink-binlog-536-taskId-0} Commit of offsets threw an unexpected exception for sequence number 56: {mysql-source-binlog-378-jobId.qdp_rosetta.rst_task_item-0=OffsetAndMetadata{offset=19974922, leaderEpoch=null, metadata=''}} (org.apache.kafka.connect.runtime.WorkerSinkTask:259)

Parameters worth tuning:

max.poll.records

max.poll.interval.ms

offset.flush.timeout.ms=10000
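A sketch of where these knobs live: offset.flush.timeout.ms is a worker property (connect-distributed.properties), while the consumer settings can be overridden per sink connector on Connect 2.3+ when the worker permits client overrides; the values below are examples only.

# Worker side (connect-distributed.properties), then restart the worker:
#   offset.flush.timeout.ms=10000
# Per-connector consumer overrides for the sink:
curl -s http://localhost:8083/connectors/mysql-sink-binlog-536-taskId/config \
  | jq '. + {"consumer.override.max.poll.records": "500",
             "consumer.override.max.poll.interval.ms": "600000"}' > /tmp/sink-536.json
curl -s -X PUT -H "Content-Type: application/json" \
  --data @/tmp/sink-536.json \
  http://localhost:8083/connectors/mysql-sink-binlog-536-taskId/config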

---------------------------------------------------------------------------------------------------------------------

No producer is available. Ensure that 'start()' is called before storing database history records.

---------------------------------------------------------------------------------------------------------------------

Failed to flush, timed out while waiting for producer to flush outstanding 14581 messages

---------------------------------------------------------------------------------------------------------------------

  ERROR WorkerSinkTask{id=mysql-sink-binlog-498-taskId-0} Commit of offsets threw an unexpected exception for sequence number 604: {mysql-source-binlog-366-jobId.qdp_rosetta.rst_task-0=OffsetAndMetadata{offset=264911, leaderEpoch=null, metadata=''}} (org.apache.kafka.connect.runtime.WorkerSinkTask:259)

org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured max.poll.interval.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing max.poll.interval.ms or by reducing the maximum size of batches returned in poll() with max.poll.records.
