By default, the consumer uses DefaultMQPushConsumerImpl, which pulls messages via long polling, giving the same real-time delivery as a push model. You can also refer to the example project and use DefaultMQPullConsumerImpl instead, letting the business code control how messages are pulled, how consumption progress is updated, and so on.
2. Cluster Mode vs. Broadcast Mode
Cluster mode:
- Consumption progress (consumerOffset.json) is stored on the broker.
- All consumers in the group share the topic's messages evenly.
- When consumption fails, the consumer sends the message back to the broker, which assigns a delayLevel based on the failure count and redelivers it.
- Subscribing to the same topic with different consumerGroups yields a pseudo-broadcast mode in which every consumer receives the messages.
Broadcast mode:
- Consumption progress is stored on the consumer's own machine.
- Every consumer receives all messages under the topic.
- Messages that fail consumption are dropped directly; they are not sent back to the broker for redelivery.
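The mode is chosen per consumer instance. Below is a minimal configuration sketch (the group and topic names are placeholders, and the package names assume the Apache RocketMQ client); it will not run without a reachable broker:

```java
import org.apache.rocketmq.client.consumer.DefaultMQPushConsumer;
import org.apache.rocketmq.common.protocol.heartbeat.MessageModel;

public class ModeExample {
    public static void main(String[] args) throws Exception {
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("example_group");
        consumer.subscribe("example_topic", "*");
        // CLUSTERING (the default): instances in the group share the queues.
        // BROADCASTING: every instance receives every message.
        consumer.setMessageModel(MessageModel.BROADCASTING);
        consumer.start();
    }
}
```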
3. ConsumeFromWhere
The consumer can set the starting point for consumption; MQ provides three options:
- CONSUME_FROM_LAST_OFFSET
- CONSUME_FROM_FIRST_OFFSET
- CONSUME_FROM_TIMESTAMP
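As a configuration sketch (assuming a DefaultMQPushConsumer instance named consumer), the starting point is set before start(); note it only takes effect when the group has no stored offset yet:

```java
// Only consulted on the very first start of a consumer group (no saved offset).
consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_TIMESTAMP);
// Format yyyyMMddHHmmss; ignored unless CONSUME_FROM_TIMESTAMP is chosen.
consumer.setConsumeTimestamp("20180101000000");
```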
Let's understand these three starting points from the source code. DefaultMQPushConsumerImpl holds an inner RebalancePushImpl, which first computes the queues the client needs to pull from and then fetches the consumption progress from the broker. The offset is obtained in the computePullFromWhere method:
```java
public long computePullFromWhere(MessageQueue mq) {
    long result = -1;
    final ConsumeFromWhere consumeFromWhere =
        this.defaultMQPushConsumerImpl.getDefaultMQPushConsumer().getConsumeFromWhere();
    final OffsetStore offsetStore = this.defaultMQPushConsumerImpl.getOffsetStore();
    switch (consumeFromWhere) {
        case CONSUME_FROM_LAST_OFFSET_AND_FROM_MIN_WHEN_BOOT_FIRST:
        case CONSUME_FROM_MIN_OFFSET:
        case CONSUME_FROM_MAX_OFFSET:
        case CONSUME_FROM_LAST_OFFSET: {
            long lastOffset = offsetStore.readOffset(mq, ReadOffsetType.READ_FROM_STORE);
            if (lastOffset >= 0) {
                result = lastOffset;
            }
            // First start, no offset
            else if (-1 == lastOffset) {
                if (mq.getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                    result = 0L;
                } else {
                    try {
                        result = this.mQClientFactory.getMQAdminImpl().maxOffset(mq);
                    } catch (MQClientException e) {
                        result = -1;
                    }
                }
            } else {
                result = -1;
            }
            break;
        }
        case CONSUME_FROM_FIRST_OFFSET: {
            long lastOffset = offsetStore.readOffset(mq, ReadOffsetType.READ_FROM_STORE);
            if (lastOffset >= 0) {
                result = lastOffset;
            } else if (-1 == lastOffset) {
                result = 0L;
            } else {
                result = -1;
            }
            break;
        }
        case CONSUME_FROM_TIMESTAMP: {
            long lastOffset = offsetStore.readOffset(mq, ReadOffsetType.READ_FROM_STORE);
            if (lastOffset >= 0) {
                result = lastOffset;
            } else if (-1 == lastOffset) {
                if (mq.getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                    try {
                        result = this.mQClientFactory.getMQAdminImpl().maxOffset(mq);
                    } catch (MQClientException e) {
                        result = -1;
                    }
                } else {
                    try {
                        long timestamp = UtilAll.parseDate(
                            this.defaultMQPushConsumerImpl.getDefaultMQPushConsumer().getConsumeTimestamp(),
                            UtilAll.yyyyMMddHHmmss).getTime();
                        result = this.mQClientFactory.getMQAdminImpl().searchOffset(mq, timestamp);
                    } catch (MQClientException e) {
                        result = -1;
                    }
                }
            } else {
                result = -1;
            }
            break;
        }
        default:
            break;
    }
    return result;
}
```
Look at CONSUME_FROM_LAST_OFFSET first: lastOffset >= 0 means the broker already holds consumption progress, i.e., this group has started before and consumed some messages, so consumption resumes from the returned offset. When lastOffset == -1, a retry queue is consumed from the beginning, while an ordinary queue is consumed from its max offset.
With CONSUME_FROM_TIMESTAMP, on first start (lastOffset == -1) an ordinary queue is consumed from the configured timestamp; if no timestamp is set, the default is half an hour ago:
```java
private String consumeTimestamp =
    UtilAll.timeMillisToHumanString3(System.currentTimeMillis() - (1000 * 60 * 30));
```
4. Pulling Messages
The consumer's startup source code is as follows:
```java
public void start() throws MQClientException {
    switch (this.serviceState) {
        case CREATE_JUST:
            log.info("the consumer [{}] start beginning. messageModel={}, isUnitMode={}",
                this.defaultMQPushConsumer.getConsumerGroup(),
                this.defaultMQPushConsumer.getMessageModel(),
                this.defaultMQPushConsumer.isUnitMode());
            this.serviceState = ServiceState.START_FAILED;
            this.checkConfig();
            this.copySubscription();
            if (this.defaultMQPushConsumer.getMessageModel() == MessageModel.CLUSTERING) {
                this.defaultMQPushConsumer.changeInstanceNameToPID();
            }
            this.mQClientFactory = MQClientManager.getInstance()
                .getAndCreateMQClientInstance(this.defaultMQPushConsumer, this.rpcHook);
            this.rebalanceImpl.setConsumerGroup(this.defaultMQPushConsumer.getConsumerGroup());
            this.rebalanceImpl.setMessageModel(this.defaultMQPushConsumer.getMessageModel());
            this.rebalanceImpl.setAllocateMessageQueueStrategy(
                this.defaultMQPushConsumer.getAllocateMessageQueueStrategy());
            this.rebalanceImpl.setmQClientFactory(this.mQClientFactory);
            this.pullAPIWrapper = new PullAPIWrapper(
                mQClientFactory,
                this.defaultMQPushConsumer.getConsumerGroup(), isUnitMode());
            this.pullAPIWrapper.registerFilterMessageHook(filterMessageHookList);
            if (this.defaultMQPushConsumer.getOffsetStore() != null) {
                this.offsetStore = this.defaultMQPushConsumer.getOffsetStore();
            } else {
                switch (this.defaultMQPushConsumer.getMessageModel()) {
                    case BROADCASTING:
                        this.offsetStore = new LocalFileOffsetStore(this.mQClientFactory,
                            this.defaultMQPushConsumer.getConsumerGroup());
                        break;
                    case CLUSTERING:
                        this.offsetStore = new RemoteBrokerOffsetStore(this.mQClientFactory,
                            this.defaultMQPushConsumer.getConsumerGroup());
                        break;
                    default:
                        break;
                }
            }
            this.offsetStore.load();
            if (this.getMessageListenerInner() instanceof MessageListenerOrderly) {
                this.consumeOrderly = true;
                this.consumeMessageService = new ConsumeMessageOrderlyService(this,
                    (MessageListenerOrderly) this.getMessageListenerInner());
            } else if (this.getMessageListenerInner() instanceof MessageListenerConcurrently) {
                this.consumeOrderly = false;
                this.consumeMessageService = new ConsumeMessageConcurrentlyService(this,
                    (MessageListenerConcurrently) this.getMessageListenerInner());
            }
            this.consumeMessageService.start();
            boolean registerOK = mQClientFactory.registerConsumer(
                this.defaultMQPushConsumer.getConsumerGroup(), this);
            if (!registerOK) {
                this.serviceState = ServiceState.CREATE_JUST;
                this.consumeMessageService.shutdown();
                throw new MQClientException("The consumer group["
                    + this.defaultMQPushConsumer.getConsumerGroup()
                    + "] has been created before, specify another name please."
                    + FAQUrl.suggestTodo(FAQUrl.GROUP_NAME_DUPLICATE_URL), null);
            }
            mQClientFactory.start();
            log.info("the consumer [{}] start OK.", this.defaultMQPushConsumer.getConsumerGroup());
            this.serviceState = ServiceState.RUNNING;
            break;
        case RUNNING:
        case START_FAILED:
        case SHUTDOWN_ALREADY:
            throw new MQClientException("The PushConsumer service state not OK, maybe started once, "
                + this.serviceState
                + FAQUrl.suggestTodo(FAQUrl.CLIENT_SERVICE_NOT_OK), null);
        default:
            break;
    }
    this.updateTopicSubscribeInfoWhenSubscriptionChanged();
    this.mQClientFactory.sendHeartbeatToAllBrokerWithLock();
    this.mQClientFactory.rebalanceImmediately();
}
```
```java
this.offsetStore.load();
```
The first step is loading the consumption progress: BROADCASTING mode instantiates LocalFileOffsetStore, while CLUSTERING instantiates RemoteBrokerOffsetStore. The progress is loaded into memory, and all subsequent reads and writes operate directly on the in-memory data, which a scheduled task flushes to storage every 5 seconds.
```java
this.consumeMessageService.start();
```
This is the message-consuming service; the default implementation is ConsumeMessageConcurrentlyService, i.e., concurrent consumption.
```java
mQClientFactory.start();
```
Starting mQClientFactory establishes channels to the brokers and launches the scheduled tasks, the pull-message service, and the rebalance service. Since pull requests are initiated by rebalancing, let's look at the rebalance service first.
Rebalancing is performed by the RebalanceService thread every 20 seconds; following the code eventually leads to RebalanceImpl#rebalanceByTopic:
```java
case BROADCASTING: {
    Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
    if (mqSet != null) {
        boolean changed = this.updateProcessQueueTableInRebalance(topic, mqSet, isOrder);
        if (changed) {
            this.messageQueueChanged(topic, mqSet, mqSet);
            log.info("messageQueueChanged {} {} {} {}",
                consumerGroup,
                topic,
                mqSet,
                mqSet);
        }
    } else {
        log.warn("doRebalance, {}, but the topic[{}] not exist.", consumerGroup, topic);
    }
    break;
}
```
In broadcast mode every consumer must receive the messages, so no allocation strategy is involved.
```java
List<String> cidAll = this.mQClientFactory.findConsumerIdList(topic, consumerGroup);
```
In cluster mode, the consumer list is first fetched by topic and consumerGroup, and then the message queues to pull from are allocated; the default is the averaging strategy AllocateMessageQueueAveragely.
By mocking the data and stepping through AllocateMessageQueueAveragely's allocate method, you can verify this behavior directly.
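The averaging arithmetic can be reproduced in isolation. The sketch below (AverageAllocator is an illustrative name, not a RocketMQ class) mirrors the math of the default strategy: queues are split into contiguous ranges, with consumers at the front of the sorted cid list absorbing the remainder:

```java
import java.util.ArrayList;
import java.util.List;

public class AverageAllocator {
    // index: this consumer's position in the sorted consumer-id list.
    // Returns the queue indices this consumer should pull from.
    public static List<Integer> allocate(int queueCount, int consumerCount, int index) {
        List<Integer> result = new ArrayList<>();
        int mod = queueCount % consumerCount;
        // Consumers before the remainder boundary get one extra queue.
        int averageSize = queueCount <= consumerCount ? 1
            : (mod > 0 && index < mod ? queueCount / consumerCount + 1
                                      : queueCount / consumerCount);
        int startIndex = (mod > 0 && index < mod) ? index * averageSize
            : index * averageSize + mod;
        int range = Math.min(averageSize, queueCount - startIndex);
        for (int i = 0; i < range; i++) {
            result.add(startIndex + i);
        }
        return result;
    }
}
```

For 8 queues and 3 consumers this yields ranges of sizes 3, 3, and 2; when there are more consumers than queues, the surplus consumers receive nothing.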
Once the current consumer knows which message queues to pull from, RebalanceImpl#updateProcessQueueTableInRebalance constructs the PullRequest objects, DefaultMQPushConsumerImpl#pullMessage assembles the pre-pull logic, and the data is finally pulled asynchronously from the broker through mQClientFactory's internal mQClientAPIImpl channel.
Back in DefaultMQPushConsumerImpl#pullMessage, let's look at the internal logic:
```java
long size = processQueue.getMsgCount().get();
if (size > this.defaultMQPushConsumer.getPullThresholdForQueue()) {
    this.executePullRequestLater(pullRequest, PullTimeDelayMillsWhenFlowControl);
    if ((flowControlTimes1++ % 1000) == 0) {
        log.warn(
            "the consumer message buffer is full, so do flow control, minOffset={}, maxOffset={}, size={}, pullRequest={}, flowControlTimes={}",
            processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(),
            size, pullRequest, flowControlTimes1);
    }
    return;
}
```
Pulled messages are placed into the local ProcessQueue for processing; when the local queue holds more than 1000 messages, the next pull is delayed by 50 ms.
```java
if (!this.consumeOrderly) {
    if (processQueue.getMaxSpan() > this.defaultMQPushConsumer.getConsumeConcurrentlyMaxSpan()) {
        this.executePullRequestLater(pullRequest, PullTimeDelayMillsWhenFlowControl);
        if ((flowControlTimes2++ % 1000) == 0) {
            log.warn(
                "the queue's messages, span too long, so do flow control, minOffset={}, maxOffset={}, maxSpan={}, pullRequest={}, flowControlTimes={}",
                processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(),
                processQueue.getMaxSpan(), pullRequest, flowControlTimes2);
        }
        return;
    }
}
```
Because messages are keyed by offset in the ProcessQueue's TreeMap, there is also a span check: when the span (i.e., this.msgTreeMap.lastKey() - this.msgTreeMap.firstKey()) exceeds 2000, pulling is delayed. Since business processing speed cannot be guaranteed, messages with larger offsets may finish first, leaving the local queue congested with smaller-offset messages, so the span can keep growing.
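Both flow-control checks can be sketched with a plain TreeMap keyed by offset, as ProcessQueue uses internally (FlowControl is an illustrative name; the thresholds mirror the defaults quoted above):

```java
import java.util.TreeMap;

public class FlowControl {
    static final int PULL_THRESHOLD = 1000; // default pullThresholdForQueue
    static final int MAX_SPAN = 2000;       // default consumeConcurrentlyMaxSpan

    // Returns true when the next pull should be delayed.
    public static boolean shouldDelayPull(TreeMap<Long, String> msgTreeMap) {
        if (msgTreeMap.isEmpty()) {
            return false;
        }
        if (msgTreeMap.size() > PULL_THRESHOLD) {
            return true; // too many messages buffered locally
        }
        long span = msgTreeMap.lastKey() - msgTreeMap.firstKey();
        return span > MAX_SPAN; // in-flight offsets spread too far apart
    }
}
```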
Because messages are pulled asynchronously, a PullCallback object must be constructed. Inside its onSuccess method:
```java
if (pullResult != null) {
    pullResult = DefaultMQPushConsumerImpl.this.pullAPIWrapper.processPullResult(
        pullRequest.getMessageQueue(), pullResult, subscriptionData);
    // ...
```
The pull result is processed: if messages were found, they are deserialized and their tags are compared once more for filtering. As mentioned in an earlier chapter, the broker's ConsumeQueue stores only the hashcode of the tag, so this consumer-side filtering step is what guarantees accurate tag matching.
```java
boolean dispathToConsume = processQueue.putMessage(pullResult.getMsgFoundList());
DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(
    pullResult.getMsgFoundList(),
    processQueue,
    pullRequest.getMessageQueue(),
    dispathToConsume);
```
The messages are then placed into the local queue, and submitConsumeRequest constructs ConsumeRequest tasks. Since consumeBatchSize is 1, the tasks are submitted to the consumeExecutor thread pool (20 threads), one message per task processed concurrently. Each ConsumeRequest invokes MessageListener#consumeMessage via consumeMessageService to run the business logic.
```java
ConsumeMessageConcurrentlyService.this.processConsumeResult(status, context, this);
```
After the business logic in ConsumeRequest#run completes, the processConsumeResult method is executed:
```java
switch (this.defaultMQPushConsumer.getMessageModel()) {
    case BROADCASTING:
        for (int i = ackIndex + 1; i < consumeRequest.getMsgs().size(); i++) {
            MessageExt msg = consumeRequest.getMsgs().get(i);
            log.warn("BROADCASTING, the message consume failed, drop it, {}", msg.toString());
        }
        break;
    case CLUSTERING:
        List<MessageExt> msgBackFailed = new ArrayList<MessageExt>(consumeRequest.getMsgs().size());
        for (int i = ackIndex + 1; i < consumeRequest.getMsgs().size(); i++) {
            MessageExt msg = consumeRequest.getMsgs().get(i);
            boolean result = this.sendMessageBack(msg, context);
            if (!result) {
                msg.setReconsumeTimes(msg.getReconsumeTimes() + 1);
                msgBackFailed.add(msg);
            }
        }
        if (!msgBackFailed.isEmpty()) {
            consumeRequest.getMsgs().removeAll(msgBackFailed);
            this.submitConsumeRequestLater(msgBackFailed, consumeRequest.getProcessQueue(),
                consumeRequest.getMessageQueue());
        }
        break;
    default:
        break;
}
```
When the business logic returns RECONSUME_LATER, ackIndex is -1; when it returns CONSUME_SUCCESS, ackIndex is 0. In broadcast mode, failed messages are simply logged. In cluster mode, the message is sent back to the broker via this.sendMessageBack(msg, context) and the consumer will receive it again later; if sending back fails, the consumer retries consuming it locally 5 seconds later.
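The ackIndex convention in the loops above is simple: every message at a position greater than ackIndex counts as failed. A minimal sketch (AckExample is a hypothetical name) makes the two cases concrete:

```java
import java.util.ArrayList;
import java.util.List;

public class AckExample {
    // Messages at indices > ackIndex are the ones that failed consumption.
    // With consumeBatchSize = 1: CONSUME_SUCCESS -> ackIndex 0 (none failed),
    // RECONSUME_LATER -> ackIndex -1 (the whole batch failed).
    public static <T> List<T> failedMessages(List<T> batch, int ackIndex) {
        List<T> failed = new ArrayList<>();
        for (int i = ackIndex + 1; i < batch.size(); i++) {
            failed.add(batch.get(i));
        }
        return failed;
    }
}
```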
Failed messages sent back to the broker are assigned different delayLevels based on the failure count. SendMessageProcessor#consumerSendMsgBack:
```java
if (msgExt.getReconsumeTimes() >= maxReconsumeTimes
    || delayLevel < 0) {
    newTopic = MixAll.getDLQTopic(requestHeader.getGroup());
    queueIdInt = Math.abs(this.random.nextInt() % 99999999) % DLQ_NUMS_PER_GROUP;
    topicConfig = this.brokerController.getTopicConfigManager().createTopicInSendMessageBackMethod(
        newTopic,
        DLQ_NUMS_PER_GROUP,
        PermName.PERM_WRITE, 0);
    if (null == topicConfig) {
        response.setCode(ResponseCode.SYSTEM_ERROR);
        response.setRemark("topic[" + newTopic + "] not exist");
        return response;
    }
} else {
    if (0 == delayLevel) {
        delayLevel = 3 + msgExt.getReconsumeTimes();
    }
    msgExt.setDelayTimeLevel(delayLevel);
}
```
On the first failure, msgExt.getReconsumeTimes() is 0. Since the default messageDelayLevel is "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h", delayLevel 3 corresponds to 10s; in other words, a message that fails its first consumption attempt is redelivered after 10s, and so on for subsequent failures. After the first failure, the message moves from its original topic into the %RETRY% queue, and once the failure count exceeds maxReconsumeTimes (16), it enters the DLQ (dead-letter queue).
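The mapping from retry count to delay can be written out directly. This sketch (RetryDelay is an illustrative name, not broker code) applies the delayLevel = 3 + reconsumeTimes rule against the default table, where level N is the N-th (1-based) entry:

```java
public class RetryDelay {
    static final String[] DELAY_LEVELS =
        "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h".split(" ");

    public static String delayForRetry(int reconsumeTimes) {
        int delayLevel = 3 + reconsumeTimes; // first failure: level 3
        return DELAY_LEVELS[delayLevel - 1]; // levels are 1-based
    }
}
```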
```java
long offset = consumeRequest.getProcessQueue().removeMessage(consumeRequest.getMsgs());
if (offset >= 0 && !consumeRequest.getProcessQueue().isDropped()) {
    this.defaultMQPushConsumerImpl.getOffsetStore().updateOffset(
        consumeRequest.getMessageQueue(), offset, true);
}
```
The data is then removed from the local ProcessQueue. Note that removeMessage internally always returns msgTreeMap.firstKey(), so the offset persisted is always the smallest one still in flight. This is exactly why the official MQ documentation says duplicate messages must be filtered out by the business side or handled idempotently.
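This firstKey behavior can be sketched with a plain TreeMap (OffsetTracker is a hypothetical name, and the empty-map case is simplified relative to the real removeMessage): consuming a large offset before a smaller one does not advance the committed offset, so a crash at that point replays the already-consumed message.

```java
import java.util.TreeMap;

public class OffsetTracker {
    private final TreeMap<Long, String> msgTreeMap = new TreeMap<>();

    public void put(long offset, String msg) {
        msgTreeMap.put(offset, msg);
    }

    // Returns the offset that would be persisted after consuming `offset`:
    // always the smallest offset still in flight, as removeMessage does.
    public long remove(long offset) {
        msgTreeMap.remove(offset);
        return msgTreeMap.isEmpty() ? offset + 1 : msgTreeMap.firstKey();
    }
}
```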
5. Long Polling
As mentioned earlier, the consumer pulls messages with long polling. When the consumer pulls and the broker has no new messages, the broker holds the request via the PullRequestHoldService:
```java
if (brokerAllowSuspend && hasSuspendFlag) {
    long pollingTimeMills = suspendTimeoutMillisLong;
    if (!this.brokerController.getBrokerConfig().isLongPollingEnable()) {
        pollingTimeMills = this.brokerController.getBrokerConfig().getShortPollingTimeMills();
    }
    String topic = requestHeader.getTopic();
    long offset = requestHeader.getQueueOffset();
    int queueId = requestHeader.getQueueId();
    PullRequest pullRequest = new PullRequest(request, channel, pollingTimeMills,
        this.brokerController.getMessageStore().now(), offset, subscriptionData);
    this.brokerController.getPullRequestHoldService().suspendPullRequest(topic, queueId, pullRequest);
    response = null;
    break;
}
```
The broker builds the ConsumeQueue asynchronously via ReputMessageService and, through the registered MessageArrivingListener, calls PullRequestHoldService#notifyMessageArriving, so that when a message arrives it is pushed to the consumer immediately. ReputMessageService#doReput:
```java
if (BrokerRole.SLAVE != DefaultMessageStore.this.getMessageStoreConfig().getBrokerRole()
    && DefaultMessageStore.this.brokerConfig.isLongPollingEnable()) {
    DefaultMessageStore.this.messageArrivingListener.arriving(dispatchRequest.getTopic(),
        dispatchRequest.getQueueId(), dispatchRequest.getConsumeQueueOffset() + 1,
        dispatchRequest.getTagsCode());
}
```
```java
public void arriving(String topic, int queueId, long logicOffset, long tagsCode) {
    this.pullRequestHoldService.notifyMessageArriving(topic, queueId, logicOffset, tagsCode);
}
```
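The hold/notify pattern behind this can be sketched without any broker machinery (HoldService and HeldRequest are hypothetical names): a pull with no new data parks under a topic@queueId key, and when a new message is dispatched, every parked request whose pull offset is now below the max offset is woken.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class HoldService {
    static class HeldRequest {
        final long pullFromOffset;
        boolean woken;
        HeldRequest(long pullFromOffset) { this.pullFromOffset = pullFromOffset; }
    }

    private final Map<String, List<HeldRequest>> held = new ConcurrentHashMap<>();

    // Park a pull request that found no new messages.
    public void suspend(String topic, int queueId, HeldRequest req) {
        held.computeIfAbsent(topic + "@" + queueId, k -> new CopyOnWriteArrayList<>()).add(req);
    }

    // Called when a new message is dispatched; wakes requests that now have data.
    public int notifyMessageArriving(String topic, int queueId, long maxOffset) {
        int woken = 0;
        List<HeldRequest> list = held.getOrDefault(topic + "@" + queueId, Collections.emptyList());
        for (HeldRequest req : list) {
            if (!req.woken && maxOffset > req.pullFromOffset) {
                req.woken = true; // in the broker this re-triggers the message lookup
                woken++;
            }
        }
        return woken;
    }
}
```

In the real broker, PullRequestHoldService also wakes requests on a timeout so a hold never outlives the client's suspend deadline.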