Kafka broker stopped after network errors and "Too many open files" (org.apache.zookeeper.ClientCnxn) — asking for help! --- pitfall notes

Kafka had been running fine, but today I found that one of the brokers had stopped. The logs show the broker first lost connectivity to ZooKeeper (网络不可达, "network unreachable"), then began failing with 打开的文件过多 ("too many open files", i.e. EMFILE), and finally shut itself down. Any advice from those who have seen this would be appreciated!
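Since the first symptom in the log is the broker failing to reach its ZooKeeper ensemble, a quick reachability check from the broker host is a reasonable first step. A sketch, using the two addresses from the log below; `ruok` is ZooKeeper's standard four-letter health command (on ZooKeeper 3.5+ it may need to be enabled via `4lw.commands.whitelist`):

```shell
#!/bin/sh
# Probe the ZooKeeper nodes named in the log from the failing broker's host.
# A healthy node answers "imok"; note that some nc builds exit 0 even on
# failure, so treat missing "imok" output as suspicious too.
for zk in 192.168.5.12 192.168.5.101; do
    printf 'ruok' | nc -w 2 "$zk" 2181 || echo "$zk unreachable"
done
```

If the nodes answer here but the broker still logged 网络不可达, the outage was likely transient; the lasting damage in this incident came from the file-descriptor exhaustion that follows.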

[2019-05-01 17:59:08,224] INFO Socket error occurred: 192.168.5.101/192.168.5.101:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:10,271] INFO Opening socket connection to server 192.168.5.12/192.168.5.12:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:10,271] ERROR Unable to open socket to 192.168.5.12/192.168.5.12:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2019-05-01 17:59:10,271] INFO Socket error occurred: 192.168.5.12/192.168.5.12:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:10,805] INFO Opening socket connection to server 192.168.5.101/192.168.5.101:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:10,805] ERROR Unable to open socket to 192.168.5.101/192.168.5.101:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2019-05-01 17:59:10,805] INFO Socket error occurred: 192.168.5.101/192.168.5.101:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:12,394] INFO Opening socket connection to server 192.168.5.12/192.168.5.12:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:12,394] ERROR Unable to open socket to 192.168.5.12/192.168.5.12:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2019-05-01 17:59:12,394] INFO Socket error occurred: 192.168.5.12/192.168.5.12:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:12,507] INFO Opening socket connection to server 192.168.5.101/192.168.5.101:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:12,507] ERROR Unable to open socket to 192.168.5.101/192.168.5.101:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2019-05-01 17:59:12,507] INFO Socket error occurred: 192.168.5.101/192.168.5.101:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:13,858] INFO Opening socket connection to server 192.168.5.12/192.168.5.12:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:13,858] ERROR Unable to open socket to 192.168.5.12/192.168.5.12:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2019-05-01 17:59:13,858] INFO Socket error occurred: 192.168.5.12/192.168.5.12:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:14,804] INFO Opening socket connection to server 192.168.5.101/192.168.5.101:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:14,804] ERROR Unable to open socket to 192.168.5.101/192.168.5.101:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2019-05-01 17:59:14,804] INFO Socket error occurred: 192.168.5.101/192.168.5.101:2181: 网络不可达 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:16,024] INFO Opening socket connection to server 192.168.5.12/192.168.5.12:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:16,025] INFO Socket error occurred: 192.168.5.12/192.168.5.12:2181: 打开的文件过多 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:16,383] INFO Opening socket connection to server 192.168.5.101/192.168.5.101:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:16,383] INFO Socket error occurred: 192.168.5.101/192.168.5.101:2181: 打开的文件过多 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:18,420] ERROR Error while writing to checkpoint file /usr/local/kafka/kafkalogs/replication-offset-checkpoint (kafka.server.LogDirFailureChannel)
java.io.FileNotFoundException: /usr/local/kafka/kafkalogs/replication-offset-checkpoint.tmp (打开的文件过多)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
	at kafka.server.checkpoints.CheckpointFile.liftedTree1$1(CheckpointFile.scala:52)
	at kafka.server.checkpoints.CheckpointFile.write(CheckpointFile.scala:50)
	at kafka.server.checkpoints.OffsetCheckpointFile.write(OffsetCheckpointFile.scala:59)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2$$anonfun$apply$49.apply(ReplicaManager.scala:1392)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2$$anonfun$apply$49.apply(ReplicaManager.scala:1392)
	at scala.Option.foreach(Option.scala:257)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:1392)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:1389)
	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
	at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
	at kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:1389)
	at kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:248)
	at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:114)
	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:63)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2019-05-01 17:59:18,421] INFO [ReplicaManager broker=1] Stopping serving replicas in dir /usr/local/kafka/kafkalogs (kafka.server.ReplicaManager)
[2019-05-01 17:59:18,422] ERROR [ReplicaManager broker=1] Error while writing to highwatermark file in directory /usr/local/kafka/kafkalogs (kafka.server.ReplicaManager)
org.apache.kafka.common.errors.KafkaStorageException: Error while writing to checkpoint file /usr/local/kafka/kafkalogs/replication-offset-checkpoint
Caused by: java.io.FileNotFoundException: /usr/local/kafka/kafkalogs/replication-offset-checkpoint.tmp (打开的文件过多)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
	at kafka.server.checkpoints.CheckpointFile.liftedTree1$1(CheckpointFile.scala:52)
	at kafka.server.checkpoints.CheckpointFile.write(CheckpointFile.scala:50)
	at kafka.server.checkpoints.OffsetCheckpointFile.write(OffsetCheckpointFile.scala:59)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2$$anonfun$apply$49.apply(ReplicaManager.scala:1392)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2$$anonfun$apply$49.apply(ReplicaManager.scala:1392)
	at scala.Option.foreach(Option.scala:257)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:1392)
	at kafka.server.ReplicaManager$$anonfun$checkpointHighWatermarks$2.apply(ReplicaManager.scala:1389)
	at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
	at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
	at kafka.server.ReplicaManager.checkpointHighWatermarks(ReplicaManager.scala:1389)
	at kafka.server.ReplicaManager$$anonfun$1.apply$mcV$sp(ReplicaManager.scala:248)
	at kafka.utils.KafkaScheduler$$anonfun$1.apply$mcV$sp(KafkaScheduler.scala:114)
	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:63)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
[2019-05-01 17:59:18,423] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions Set(__consumer_offsets-21, cascading_wordOrder_dataReport-0, __consumer_offsets-27, __consumer_offsets-7, __consumer_offsets-9, __consumer_offsets-25, __consumer_offsets-35, __consumer_offsets-41, __consumer_offsets-33, __consumer_offsets-23, __consumer_offsets-49, __consumer_offsets-47, __consumer_offsets-31, topicssss-0, __consumer_offsets-3, msgTopic-0, __consumer_offsets-37, __consumer_offsets-15, HASHLExAF_KAFKA_TOPIC-0, __consumer_offsets-17, msgTopic-1, __consumer_offsets-19, __consumer_offsets-11, __consumer_offsets-13, __consumer_offsets-43, EXP_IMAGE_TOPIC-0, __consumer_offsets-39, fafa-0, __consumer_offsets-45, __consumer_offsets-1, __consumer_offsets-5, __consumer_offsets-29, cascading_alarm_dataReport-0) (kafka.server.ReplicaFetcherManager)
[2019-05-01 17:59:18,424] INFO [ReplicaAlterLogDirsManager on broker 1] Removed fetcher for partitions Set(__consumer_offsets-21, cascading_wordOrder_dataReport-0, __consumer_offsets-27, __consumer_offsets-7, __consumer_offsets-9, __consumer_offsets-25, __consumer_offsets-35, __consumer_offsets-41, __consumer_offsets-33, __consumer_offsets-23, __consumer_offsets-49, __consumer_offsets-47, __consumer_offsets-31, topicssss-0, __consumer_offsets-3, msgTopic-0, __consumer_offsets-37, __consumer_offsets-15, HASHLExAF_KAFKA_TOPIC-0, __consumer_offsets-17, msgTopic-1, __consumer_offsets-19, __consumer_offsets-11, __consumer_offsets-13, __consumer_offsets-43, EXP_IMAGE_TOPIC-0, __consumer_offsets-39, fafa-0, __consumer_offsets-45, __consumer_offsets-1, __consumer_offsets-5, __consumer_offsets-29, cascading_alarm_dataReport-0) (kafka.server.ReplicaAlterLogDirsManager)
[2019-05-01 17:59:18,439] INFO Opening socket connection to server 192.168.5.12/192.168.5.12:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:18,439] INFO Socket error occurred: 192.168.5.12/192.168.5.12:2181: 打开的文件过多 (org.apache.zookeeper.ClientCnxn)
[2019-05-01 17:59:18,444] INFO [ReplicaManager broker=1] Broker 1 stopped fetcher for partitions __consumer_offsets-21,cascading_wordOrder_dataReport-0,__consumer_offsets-27,__consumer_offsets-7,__consumer_offsets-9,__consumer_offsets-25,__consumer_offsets-35,__consumer_offsets-41,__consumer_offsets-33,__consumer_offsets-23,__consumer_offsets-49,__consumer_offsets-47,__consumer_offsets-31,topicssss-0,__consumer_offsets-3,msgTopic-0,__consumer_offsets-37,__consumer_offsets-15,HASHLExAF_KAFKA_TOPIC-0,__consumer_offsets-17,msgTopic-1,__consumer_offsets-19,__consumer_offsets-11,__consumer_offsets-13,__consumer_offsets-43,EXP_IMAGE_TOPIC-0,__consumer_offsets-39,fafa-0,__consumer_offsets-45,__consumer_offsets-1,__consumer_offsets-5,__consumer_offsets-29,cascading_alarm_dataReport-0 and stopped moving logs for partitions  because they are in the failed log directory /usr/local/kafka/kafkalogs. (kafka.server.ReplicaManager)
[2019-05-01 17:59:18,444] INFO Stopping serving logs in dir /usr/local/kafka/kafkalogs (kafka.log.LogManager)
[2019-05-01 17:59:18,447] ERROR Shutdown broker because all log dirs in /usr/local/kafka/kafkalogs have failed (kafka.log.LogManager)

How should this be fixed? Recording it here for now!
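For the record: 打开的文件过多 is the localized EMFILE message, meaning the broker process exhausted its open-file-descriptor limit — each partition log segment, index file, and network socket holds one, so a broker with many partitions can easily blow past the common default of 1024. A diagnostic sketch, assuming a Linux host and that the broker runs as a `kafka.Kafka` java process (the pattern and the example limit values are assumptions; adjust to your environment):

```shell
#!/bin/sh
# Diagnose "Too many open files" (EMFILE) on a Kafka broker host.

# 1. Soft and hard fd limits for the current shell (often 1024 by default)
ulimit -Sn
ulimit -Hn

# 2. Limits and actual fd usage of the running broker, if one is found.
#    Reading /proc/<pid>/fd requires the same user or root.
KAFKA_PID=$(pgrep -f kafka.Kafka | head -n 1)
if [ -n "$KAFKA_PID" ]; then
    grep 'Max open files' "/proc/$KAFKA_PID/limits"
    ls "/proc/$KAFKA_PID/fd" | wc -l   # fds currently in use
fi

# 3. Raise the limit persistently (requires root). Example values only:
#    /etc/security/limits.conf:
#      kafka  soft  nofile  100000
#      kafka  hard  nofile  100000
#    Or, if the broker runs under systemd, in the unit file:
#      [Service]
#      LimitNOFILE=100000
# Then restart the broker so the new limit takes effect.
```

Raising `nofile` for the user that runs Kafka and restarting the broker is the usual remedy; the right value depends on partition count, segment size, and connection volume, so it is worth re-checking fd usage under load afterwards.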

