RocketMQ broker: the "insufficient disk space" problem

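1. Origin of the problem:

The load-test environment has two nameservers and two broker masters: 10.255.255.142 (broker-b) and 10.255.255.151 (broker-a). Monitoring shows that 151 has not received any messages since 2015-3-13.

Across the two masters in the test environment, total TPS is about 4000 and the message size is 2 KB.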


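2. Locating the problem:

1. Connecting to the load-test environment from Eclipse showed that messages were sent only to broker-b and never to broker-a.

2. Suspected that the producer had not connected to broker-a. Checking broker-a's connections with netstat showed that the producer was in fact connected to broker-a.

3. Suspected that the producer had not obtained broker-a's message queues from the nameserver. Using a MessageQueueSelector showed that the nameserver did return broker-a's message queues.

4. Sending messages only to broker-a's message queues produced the following error: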

com.alibaba.rocketmq.client.exception.MQBrokerException: CODE: 14  DESC: service not available now, maybe disk full, CL:  0.87 CQ:  0.87 INDEX:  0.87, maybe your broker machine memory too small.
For more information, please visit the url, https://github.com/alibaba/RocketMQ/issues/64
	at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.processSendResponse(MQClientAPIImpl.java:492)
	at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.sendMessageSync(MQClientAPIImpl.java:398)
	at com.alibaba.rocketmq.client.impl.MQClientAPIImpl.sendMessage(MQClientAPIImpl.java:379)
	at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.sendKernelImpl(DefaultMQProducerImpl.java:698)
	at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.sendSelectImpl(DefaultMQProducerImpl.java:877)
	at com.alibaba.rocketmq.client.impl.producer.DefaultMQProducerImpl.send(DefaultMQProducerImpl.java:851)
	at com.alibaba.rocketmq.client.producer.DefaultMQProducer.send(DefaultMQProducer.java:163)
	at com.ruishenh.rocketmq.example.Producer.main(Producer.java:78)
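
The restriction in step 4 (send only to queues hosted on broker-a) can be sketched as below. This is a minimal, self-contained illustration: `Queue` is a hypothetical stand-in for the client's `MessageQueue`, and `BrokerQueueFilter` is an invented name. In the real client the same filter would presumably sit inside a `MessageQueueSelector` callback and compare `MessageQueue.getBrokerName()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BrokerQueueFilter {
    /** Hypothetical stand-in for the client's MessageQueue (topic omitted). */
    static final class Queue {
        public final String brokerName;
        public final int queueId;

        public Queue(String brokerName, int queueId) {
            this.brokerName = brokerName;
            this.queueId = queueId;
        }
    }

    /** Keep only the queues hosted on the given broker. */
    public static List<Queue> queuesOn(String brokerName, List<Queue> all) {
        List<Queue> result = new ArrayList<>();
        for (Queue q : all) {
            if (q.brokerName.equals(brokerName)) {
                result.add(q);
            }
        }
        return result;
    }

    /** Pick one queue on the broker, rotating with a caller-supplied counter. */
    public static Queue select(String brokerName, List<Queue> all, int counter) {
        List<Queue> candidates = queuesOn(brokerName, all);
        if (candidates.isEmpty()) {
            throw new IllegalStateException("no queue on " + brokerName);
        }
        return candidates.get(Math.floorMod(counter, candidates.size()));
    }

    public static void main(String[] args) {
        List<Queue> all = Arrays.asList(
                new Queue("broker-a", 0), new Queue("broker-b", 0),
                new Queue("broker-a", 1), new Queue("broker-b", 1));
        Queue chosen = select("broker-a", all, 0);
        System.out.println(chosen.brokerName + " queue " + chosen.queueId);
    }
}
```

Pinning every send to one broker this way removes the client-side load balancing, which is what makes a broker-specific failure visible instead of being masked by the healthy broker.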

5. The error points to insufficient disk space, but checking the disk on broker-a showed that it still had free space.
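
"Still has free space" and the 0.87 figures in the error message are two views of the same disk. Assuming that CL, CQ and INDEX are used-space fractions of the partitions holding the commit log, consume queue and index files (an assumption; the post does not say), the same number can be computed with the standard `java.io.File` API. `DiskUsage` and its method names are invented for this sketch.

```java
import java.io.File;

final class DiskUsage {
    /** Used fraction given total and free bytes, or -1 if total is not positive. */
    public static double usedRatio(long totalBytes, long freeBytes) {
        if (totalBytes <= 0) {
            return -1.0;
        }
        return (totalBytes - freeBytes) / (double) totalBytes;
    }

    /** Used fraction of the partition that contains the given path. */
    public static double usedRatio(String path) {
        File file = new File(path);
        // getTotalSpace() returns 0 when the path does not name a partition.
        return usedRatio(file.getTotalSpace(), file.getFreeSpace());
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : ".";
        System.out.printf("%s used ratio: %.2f%n", path, usedRatio(path));
    }
}
```

A few gigabytes of free space can look comfortable in absolute terms while the used fraction is already high, so it helps to check the ratio as well as the free bytes.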


6. Reading the RocketMQ source shows where the error comes from:

In DefaultMessageStore, the method public PutMessageResult putMessage(MessageExtBrokerInner msg):

        if (!this.runningFlags.isWriteable()) {
            long value = this.printTimes.getAndIncrement();
            if ((value % 50000) == 0) {
                log.warn("message store is not writeable, so putMessage is forbidden "
                        + this.runningFlags.getFlagBits());
            }

            return new PutMessageResult(PutMessageStatus.SERVICE_NOT_AVAILABLE, null);
        }
        else {
            this.printTimes.set(0);
        }

The corresponding method in the RunningFlags class:

    public boolean isWriteable() {
        if ((this.flagBits & (NotWriteableBit | WriteLogicsQueueErrorBit | DiskFullBit | WriteIndexFileErrorBit)) == 0) {
            return true;
        }

        return false;
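    }

7. The working conclusion is that the disk was short on space. We asked the testers to free up some disk space; once free space rose above 4 GB, broker-a worked normally again. When the problem occurred, free space was 2.5 GB.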
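
The `isWriteable()` check quoted in step 6 is a plain bit-mask test. The sketch below reproduces the idea in isolation; the class name, the helper methods and the concrete bit values are illustrative assumptions, not copied from the RocketMQ source.

```java
final class RunningFlagsSketch {
    // Bit positions are assumptions chosen for illustration.
    public static final int NOT_WRITEABLE_BIT = 1 << 1;
    public static final int WRITE_LOGICS_QUEUE_ERROR_BIT = 1 << 2;
    public static final int WRITE_INDEX_FILE_ERROR_BIT = 1 << 3;
    public static final int DISK_FULL_BIT = 1 << 4;

    // Any one of these bits being set blocks writes.
    private static final int WRITE_BLOCKING_MASK = NOT_WRITEABLE_BIT
            | WRITE_LOGICS_QUEUE_ERROR_BIT
            | WRITE_INDEX_FILE_ERROR_BIT
            | DISK_FULL_BIT;

    private int flagBits = 0;

    public synchronized boolean isWriteable() {
        return (flagBits & WRITE_BLOCKING_MASK) == 0;
    }

    public synchronized void markDiskFull() {
        flagBits |= DISK_FULL_BIT;
    }

    public synchronized void markDiskOk() {
        flagBits &= ~DISK_FULL_BIT;
    }

    public synchronized int getFlagBits() {
        return flagBits;
    }

    public static void main(String[] args) {
        RunningFlagsSketch flags = new RunningFlagsSketch();
        flags.markDiskFull();
        System.out.println("writeable after disk full: " + flags.isWriteable());
        flags.markDiskOk();
        System.out.println("writeable after disk ok: " + flags.isWriteable());
    }
}
```

With this layout a single set bit is enough to make every `putMessage` call return `SERVICE_NOT_AVAILABLE`, and the flag bits printed in the broker's warning log show which condition is responsible.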



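Open questions:

1. Why could the broker not work normally when the disk still had 2.5 GB free?

2. How should the broker's policy for deleting files such as store/commitlog be configured?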
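
One pattern that would produce the behaviour in question 1 is a disk-full flag with hysteresis: it is set when the used ratio rises above a warning level and cleared only after usage falls below a lower level. Whether this RocketMQ version works this way is an assumption that has not been verified here, and the thresholds 0.90 and 0.85 in the sketch are assumed values. Under those assumptions, a broker sitting at 0.87 would stay blocked if it had crossed 0.90 earlier. If 2.5 GB free corresponds to 0.87 used, then 4 GB free is roughly 0.79 used, which is below the assumed clear level.

```java
final class DiskFullHysteresis {
    public static final double WARNING_RATIO = 0.90; // assumed: above this, mark disk full
    public static final double CLEAR_RATIO = 0.85;   // assumed: at or below this, clear the mark

    private boolean diskFull = false;

    /**
     * Feed one used-ratio observation.
     * Returns true if writes are allowed after this observation.
     */
    public boolean observe(double usedRatio) {
        if (usedRatio > WARNING_RATIO) {
            diskFull = true;
        } else if (usedRatio <= CLEAR_RATIO) {
            diskFull = false;
        }
        // Between the two thresholds the previous state is kept.
        return !diskFull;
    }

    public static void main(String[] args) {
        DiskFullHysteresis h = new DiskFullHysteresis();
        double[] samples = {0.80, 0.91, 0.87, 0.79};
        for (double s : samples) {
            System.out.println(s + " -> writeable: " + h.observe(s));
        }
    }
}
```

The gap between the two levels keeps the broker from flapping between writable and blocked when usage hovers around a single threshold, at the cost of staying blocked for a while after space has been partly reclaimed.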

