RocketMQ: A Detailed Walkthrough of Consumer Startup, Message Consumption, and MessageQueue Rebalancing

The Consumer is the endpoint where RocketMQ messages are consumed, so understanding how it works helps a great deal when using RocketMQ.

1. Consumer startup

1.1 Starting a Consumer

        // Consume messages
        DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("OrderTopicGroup");
        consumer.setConsumeFromWhere(ConsumeFromWhere.CONSUME_FROM_FIRST_OFFSET);
        consumer.setMessageModel(MessageModel.CLUSTERING);
        consumer.setNamesrvAddr(namesrvAddr);
        consumer.subscribe(topic, "*");
        consumer.setConsumeThreadMin(1);
        consumer.setConsumeThreadMax(1);
        // Consume messages in order with a MessageListenerOrderly
        consumer.registerMessageListener(new MessageListenerOrderly() {

            @Override
            public ConsumeOrderlyStatus consumeMessage(List<MessageExt> msgs, ConsumeOrderlyContext context) {
                System.out.printf("%s Receive New Messages: %s %n", Thread.currentThread().getName(), msgs);
                return ConsumeOrderlyStatus.SUCCESS;
            }

        });
        consumer.start();

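The code above configures and starts a Consumer. RocketMQ hides the complex logic in its lower layers, so very little code is needed at the usage level.

Let's first look at what consumer.start() actually does.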

1.2 DefaultMQPushConsumer#DefaultMQPushConsumer

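This is the most basic DefaultMQPushConsumer constructor. It initializes consumerGroup, namespace, allocateMessageQueueStrategy (the message queue allocation strategy, covered later with rebalancing), and defaultMQPushConsumerImpl (the core class of push-style consumption).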

public DefaultMQPushConsumer(final String namespace, final String consumerGroup, RPCHook rpcHook,
        AllocateMessageQueueStrategy allocateMessageQueueStrategy) {
        this.consumerGroup = consumerGroup;
        this.namespace = namespace;
        this.allocateMessageQueueStrategy = allocateMessageQueueStrategy;
        defaultMQPushConsumerImpl = new DefaultMQPushConsumerImpl(this, rpcHook);
    }

defaultMQPushConsumerImpl is an instance of the DefaultMQPushConsumerImpl class.

1.3 DefaultMQPushConsumer#start

@Override
    public void start() throws MQClientException {
        // Set the consumer group from namespace and consumerGroup
        setConsumerGroup(NamespaceUtil.wrapNamespace(this.getNamespace(), this.consumerGroup));
        // Start the default consumer implementation
        this.defaultMQPushConsumerImpl.start();
        // Message trace dispatcher, null by default
        if (null != traceDispatcher) {
            try {
                traceDispatcher.start(this.getNamesrvAddr(), this.getAccessChannel());
            } catch (MQClientException e) {
                log.warn("trace dispatcher start failed ", e);
            }
        }
    }

The core logic is carried out by defaultMQPushConsumerImpl.start().

1.4 DefaultMQPushConsumerImpl#start

/**
     * Starts the default consumer implementation
     * @throws MQClientException
     */
    public synchronized void start() throws MQClientException {
        // Branch on the current service state
        switch (this.serviceState) {
            // The service has only been created and not yet started, so start it
            case CREATE_JUST:
                log.info("the consumer [{}] start beginning. messageModel={}, isUnitMode={}", this.defaultMQPushConsumer.getConsumerGroup(),
                    this.defaultMQPushConsumer.getMessageModel(), this.defaultMQPushConsumer.isUnitMode());
                // Mark the state as START_FAILED first; it is changed to RUNNING only if startup completes successfully
                this.serviceState = ServiceState.START_FAILED;

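                /*
                 * 1 Check the consumer configuration
                 *
                 * Validation fails if consumerGroup is empty, is longer than 255 characters, contains illegal characters
                 * (the allowed pattern is ^[%|a-zA-Z0-9_-]+$), or equals the default group name DEFAULT_CONSUMER,
                 * or if messageModel, consumeFromWhere, consumeTimestamp, allocateMessageQueueStrategy, and so on are null.
                 * If any of these conditions holds, an exception is thrown.
                 */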
                this.checkConfig();

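                /*
                 * 2 Copy the subscriptions
                 *
                 * For a clustering-mode consumer, build its retry topic retryTopic = %RETRY% + consumerGroup
                 * and subscribe the consumer to that retry topic automatically, which is how consume retry is implemented.
                 */
                this.copySubscription();
                // In clustering mode, if instanceName is still the default "DEFAULT", change it to UtilAll.getPid() + "#" + System.nanoTime()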
                if (this.defaultMQPushConsumer.getMessageModel() == MessageModel.CLUSTERING) {
                    this.defaultMQPushConsumer.changeInstanceNameToPID();
                }

                /*
                 * 3 Get the MQClientManager singleton, then get or create the MQClientInstance for this clientId and assign it to mQClientFactory
                 *
                 * MQClientInstance wraps RocketMQ's low-level networking API. Both Producer and Consumer use it as the channel for talking to the NameServer and the Brokers.
                 * One MQClientInstance per clientId is therefore enough: multiple producers and consumers in the same application can share a single instance.
                 */
                this.mQClientFactory = MQClientManager.getInstance().getOrCreateMQClientInstance(this.defaultMQPushConsumer, this.rpcHook);
                /*
                 * 4 Set the properties of the rebalance (load balancing) service
                 */
                this.rebalanceImpl.setConsumerGroup(this.defaultMQPushConsumer.getConsumerGroup());
                this.rebalanceImpl.setMessageModel(this.defaultMQPushConsumer.getMessageModel());
                this.rebalanceImpl.setAllocateMessageQueueStrategy(this.defaultMQPushConsumer.getAllocateMessageQueueStrategy());
                this.rebalanceImpl.setmQClientFactory(this.mQClientFactory);
                /*
                 * 5 Create PullAPIWrapper, the core message-pulling object, which wraps the pull request and result-parsing API
                 */
                this.pullAPIWrapper = new PullAPIWrapper(
                    mQClientFactory,
                    this.defaultMQPushConsumer.getConsumerGroup(), isUnitMode());
                // Register the message-filter hooks on the PullAPIWrapper
                this.pullAPIWrapper.registerFilterMessageHook(filterMessageHookList);

                /*
                 * 6 Choose the OffsetStore by message model; it manages the consumer's consume offsets
                 */
                if (this.defaultMQPushConsumer.getOffsetStore() != null) {
                    this.offsetStore = this.defaultMQPushConsumer.getOffsetStore();
                } else {
                    // Choose the OffsetStore implementation according to the message model
                    switch (this.defaultMQPushConsumer.getMessageModel()) {
                        case BROADCASTING:
                            // Broadcasting mode uses LocalFileOffsetStore: the consume progress (offset) is stored on the local disk.
                            this.offsetStore = new LocalFileOffsetStore(this.mQClientFactory, this.defaultMQPushConsumer.getConsumerGroup());
                            break;
                        case CLUSTERING:
                            // Clustering mode uses RemoteBrokerOffsetStore: the consume progress (offset) is stored on the remote Broker.
                            this.offsetStore = new RemoteBrokerOffsetStore(this.mQClientFactory, this.defaultMQPushConsumer.getConsumerGroup());
                            break;
                        default:
                            break;
                    }
                    this.defaultMQPushConsumer.setOffsetStore(this.offsetStore);
                }
                /*
                 * 7 Load the consume offsets
                 * LocalFileOffsetStore loads the data from the local disk;
                 * RemoteBrokerOffsetStore has an empty implementation.
                 */
                this.offsetStore.load();

                /*
                 * 8 Create the consume service that matches the message listener type
                 */
                if (this.getMessageListenerInner() instanceof MessageListenerOrderly) {
                    // MessageListenerOrderly means ordered consumption, so create a ConsumeMessageOrderlyService
                    this.consumeOrderly = true;
                    this.consumeMessageService =
                        new ConsumeMessageOrderlyService(this, (MessageListenerOrderly) this.getMessageListenerInner());
                } else if (this.getMessageListenerInner() instanceof MessageListenerConcurrently) {
                    // MessageListenerConcurrently means concurrent consumption, so create a ConsumeMessageConcurrentlyService
                    this.consumeOrderly = false;
                    this.consumeMessageService =
                        new ConsumeMessageConcurrentlyService(this, (MessageListenerConcurrently) this.getMessageListenerInner());
                }

                // Start the consume service
                this.consumeMessageService.start();

                /*
                 * 9 Register the consumer group and this consumer in the consumerTable of MQClientInstance
                 */
                boolean registerOK = mQClientFactory.registerConsumer(this.defaultMQPushConsumer.getConsumerGroup(), this);
                if (!registerOK) {
                    // Registration failed, most likely because the same process already holds another consumer with the same group name
                    this.serviceState = ServiceState.CREATE_JUST;
                    this.consumeMessageService.shutdown(defaultMQPushConsumer.getAwaitTerminationMillisWhenShutdown());
                    throw new MQClientException("The consumer group[" + this.defaultMQPushConsumer.getConsumerGroup()
                        + "] has been created before, specify another name please." + FAQUrl.suggestTodo(FAQUrl.GROUP_NAME_DUPLICATE_URL),
                        null);
                }

                /*
                 * 10 Start the MQClientInstance, the client communication instance:
                 * the Netty client, the scheduled tasks, the pull message service, and the rebalanceService
                 */
                mQClientFactory.start();
                log.info("the consumer [{}] start OK.", this.defaultMQPushConsumer.getConsumerGroup());
                this.serviceState = ServiceState.RUNNING;
                break;
            // Any other state throws an exception, which means start() can only be called once
            case RUNNING:
            case START_FAILED:
            case SHUTDOWN_ALREADY:
                throw new MQClientException("The PushConsumer service state not OK, maybe started once, "
                    + this.serviceState
                    + FAQUrl.suggestTodo(FAQUrl.CLIENT_SERVICE_NOT_OK),
                    null);
            default:
                break;
        }

        // 11 Follow-up work
        // Fetch from the NameServer and update the route information of the topics this consumer subscribes to
        this.updateTopicSubscribeInfoWhenSubscriptionChanged();
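        // Pick a Broker at random and send a request that checks the client's filter configuration,
        // mainly whether the Broker supports SQL92 filtering and whether the SQL92 expression is syntactically valid
        this.mQClientFactory.checkClientInBroker();
        // Send a heartbeat to all Brokers
        this.mQClientFactory.sendHeartbeatToAllBrokerWithLock();
        // Wake up the rebalanceService to trigger a rebalance right away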
        this.mQClientFactory.rebalanceImmediately();
    }
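
The state handling in start() follows a "mark as failed first, promote to RUNNING last" pattern. The sketch below isolates that pattern in plain Java. StartOnceService, initSteps, and state() are hypothetical names used only for illustration, and IllegalStateException stands in for MQClientException; this is not RocketMQ code.

```java
/** A minimal sketch of the start-once state machine; all names here are hypothetical. */
final class StartOnceService {
    enum ServiceState { CREATE_JUST, RUNNING, START_FAILED, SHUTDOWN_ALREADY }

    private ServiceState serviceState = ServiceState.CREATE_JUST;

    /** Stands in for checkConfig(), copySubscription(), and the other startup steps. */
    private final Runnable initSteps;

    StartOnceService(Runnable initSteps) {
        this.initSteps = initSteps;
    }

    synchronized void start() {
        switch (serviceState) {
            case CREATE_JUST:
                // Assume failure first; only a fully completed startup flips the state to RUNNING.
                serviceState = ServiceState.START_FAILED;
                initSteps.run(); // an exception here leaves the state at START_FAILED
                serviceState = ServiceState.RUNNING;
                break;
            case RUNNING:
            case START_FAILED:
            case SHUTDOWN_ALREADY:
                throw new IllegalStateException("service state not OK, maybe started once: " + serviceState);
            default:
                break;
        }
    }

    synchronized ServiceState state() {
        return serviceState;
    }

    public static void main(String[] args) {
        StartOnceService service = new StartOnceService(() -> { });
        service.start();
        System.out.println(service.state());
    }
}
```

A consequence of this design is that a consumer whose startup threw an exception cannot simply be started again: it stays in START_FAILED and a second start() call is rejected.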

DefaultMQPushConsumerImpl#copySubscription

/**
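     * Copies the subscriptions.
     * The entries of the subscription map in defaultMQPushConsumer are copied into subscriptionInner of RebalanceImpl.
     *
     * There is one more important step: for a clustering-mode consumer, build its retry topic retryTopic = %RETRY% + consumerGroup
     * and subscribe the consumer to that retry topic automatically, which is how consume retry is implemented.
     *
     * In broadcasting mode the retry topic is not subscribed. So from the moment the Consumer starts, it is already decided that
     * a broadcasting consumer drops messages that fail to be consumed and cannot retry them.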
     * @throws MQClientException
     */
    private void copySubscription() throws MQClientException {
        try {
            // Copy the subscriptions into subscriptionInner of RebalanceImpl
            Map<String, String> sub = this.defaultMQPushConsumer.getSubscription();
            if (sub != null) {
                for (final Map.Entry<String, String> entry : sub.entrySet()) {
                    final String topic = entry.getKey();
                    final String subString = entry.getValue();
                    SubscriptionData subscriptionData = FilterAPI.buildSubscriptionData(topic, subString);
                    this.rebalanceImpl.getSubscriptionInner().put(topic, subscriptionData);
                }
            }

            // If messageListenerInner is null, assign the messageListener of defaultMQPushConsumer to it.
            // Normally it has already been assigned in defaultMQPushConsumer's registerMessageListener method
            if (null == this.messageListenerInner) {
                this.messageListenerInner = this.defaultMQPushConsumer.getMessageListener();
            }

            // Message model
            switch (this.defaultMQPushConsumer.getMessageModel()) {
                // Broadcasting mode: messages that fail to be consumed are dropped
                case BROADCASTING:
                    break;
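                // Clustering mode (the default): failed messages can be retried,
                // so the consumer automatically subscribes to the retry topic of its group
                case CLUSTERING:
                    // Build the retry topic of this consumer: retryTopic = %RETRY% + consumerGroup
                    final String retryTopic = MixAll.getRetryTopic(this.defaultMQPushConsumer.getConsumerGroup());
                    // Subscribe the consumer to the retry topic of its group, which is how consume retry is implemented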
                    SubscriptionData subscriptionData = FilterAPI.buildSubscriptionData(retryTopic, SubscriptionData.SUB_ALL);
                    this.rebalanceImpl.getSubscriptionInner().put(retryTopic, subscriptionData);
                    break;
                default:
                    break;
            }
        } catch (Exception e) {
            throw new MQClientException("subscription exception", e);
        }
    }
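
To make the effect of copySubscription concrete, here is a minimal sketch of the resulting subscription table, assuming a simplified model where a subscription is just a topic-to-expression map. SubscriptionCopySketch and its methods are hypothetical; only the %RETRY% prefix and the "*" (subscribe all) expression come from the code above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified model of the subscription copy step; not RocketMQ code. */
final class SubscriptionCopySketch {
    enum MessageModel { BROADCASTING, CLUSTERING }

    static final String RETRY_GROUP_TOPIC_PREFIX = "%RETRY%";
    static final String SUB_ALL = "*";

    static String retryTopic(String consumerGroup) {
        return RETRY_GROUP_TOPIC_PREFIX + consumerGroup;
    }

    /** Returns the inner subscription table: the user's topics plus the retry topic in clustering mode. */
    static Map<String, String> copySubscription(Map<String, String> userSubscription,
                                                String consumerGroup,
                                                MessageModel model) {
        Map<String, String> inner = new LinkedHashMap<>(userSubscription);
        if (model == MessageModel.CLUSTERING) {
            // Only clustering consumers subscribe to the retry topic; broadcasting consumers never retry.
            inner.put(retryTopic(consumerGroup), SUB_ALL);
        }
        return inner;
    }

    public static void main(String[] args) {
        Map<String, String> user = new LinkedHashMap<>();
        user.put("OrderTopic", "*");
        System.out.println(copySubscription(user, "OrderTopicGroup", MessageModel.CLUSTERING).keySet());
        System.out.println(copySubscription(user, "OrderTopicGroup", MessageModel.BROADCASTING).keySet());
    }
}
```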

1.5 MQClientManager#getOrCreateMQClientInstance

/**
     * A method of MQClientManager
     *
     * @param clientConfig client configuration (here, the DefaultMQPushConsumer itself)
     * @param rpcHook      RPC hook
     * @return MQClientInstance
     */
    public MQClientInstance getOrCreateMQClientInstance(final ClientConfig clientConfig, RPCHook rpcHook) {
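        // Build the clientId, in the form clientIP@instanceName, with @unitName appended when a unitName is set
        String clientId = clientConfig.buildMQClientId();
        // Look up the MQClientInstance for this clientId in the local cache factoryTable
        MQClientInstance instance = this.factoryTable.get(clientId);
        // If it does not exist, create it and put it into factoryTable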
        if (null == instance) {
            instance =
                new MQClientInstance(clientConfig.cloneClientConfig(),
                    this.factoryIndexGenerator.getAndIncrement(), clientId, rpcHook);
            MQClientInstance prev = this.factoryTable.putIfAbsent(clientId, instance);
            if (prev != null) {
                instance = prev;
                log.warn("Returned Previous MQClientInstance for clientId:[{}]", clientId);
            } else {
                log.info("Created new MQClientInstance for clientId:[{}]", clientId);
            }
        }

        return instance;
    }
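
The get-or-create logic relies on ConcurrentMap#putIfAbsent so that two threads racing on the same clientId end up sharing one instance. A minimal sketch of that idiom follows; ClientInstanceCache and ClientInstance are hypothetical stand-ins for MQClientManager and MQClientInstance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of a get-or-create cache keyed by clientId; not RocketMQ code. */
final class ClientInstanceCache {
    static final class ClientInstance {
        final String clientId;

        ClientInstance(String clientId) {
            this.clientId = clientId;
        }
    }

    private final ConcurrentMap<String, ClientInstance> factoryTable = new ConcurrentHashMap<>();

    ClientInstance getOrCreate(String clientId) {
        ClientInstance instance = factoryTable.get(clientId);
        if (instance == null) {
            instance = new ClientInstance(clientId);
            // If another thread registered an instance first, putIfAbsent returns it and ours is discarded.
            ClientInstance prev = factoryTable.putIfAbsent(clientId, instance);
            if (prev != null) {
                instance = prev;
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        ClientInstanceCache cache = new ClientInstanceCache();
        ClientInstance a = cache.getOrCreate("192.168.0.1@12345");
        ClientInstance b = cache.getOrCreate("192.168.0.1@12345");
        System.out.println(a == b);
    }
}
```

On Java 8 and later, computeIfAbsent would express the same idea more compactly; the explicit form is kept here because it mirrors the structure of the method shown above.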

1.6 ConsumeMessageConcurrentlyService#start

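Startup of the concurrent consume service:

public void start() {
        // The cleanExpireMsgExecutors scheduled task cleans up expired messages.
        // It first runs 15 minutes after startup and then every 15 minutes. 15 minutes is RocketMQ's default consume timeout and can be changed through the DefaultMQPushConsumer#consumeTimeout property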
        this.cleanExpireMsgExecutors.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    // Clean up expired messages
                    cleanExpireMsg();
                } catch (Throwable e) {
                    log.error("scheduleAtFixedRate cleanExpireMsg exception", e);
                }
            }

        }, this.defaultMQPushConsumer.getConsumeTimeout(), this.defaultMQPushConsumer.getConsumeTimeout(), TimeUnit.MINUTES);
    }

1.7 ConsumeMessageOrderlyService#start

Startup of the ordered consume service:
it periodically asks the Broker to lock the MessageQueues through RequestCode.LOCK_BATCH_MQ requests.

public void start() {
        if (MessageModel.CLUSTERING.equals(ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.messageModel())) {
            // First runs 1 s after startup and then every 20 s. 20 s is RocketMQ's default lock interval and can be changed with the -Drocketmq.client.rebalance.lockInterval system property
            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try {
                        ConsumeMessageOrderlyService.this.lockMQPeriodically();
                    } catch (Throwable e) {
                        log.error("scheduleAtFixedRate lockMQPeriodically exception", e);
                    }
                }
            }, 1000 * 1, ProcessQueue.REBALANCE_LOCK_INTERVAL, TimeUnit.MILLISECONDS);
        }
    }

1.8 MQClientInstance#start

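This method starts the Netty client, the scheduled tasks, PullMessageService (the message pulling service), RebalanceService (the rebalance service), and the client's internal default message producer.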

public void start() throws MQClientException {

        synchronized (this) {
            switch (this.serviceState) {
                case CREATE_JUST:
                    this.serviceState = ServiceState.START_FAILED;
                    // If not specified,looking address from name server
                    if (null == this.clientConfig.getNamesrvAddr()) {
                        this.mQClientAPIImpl.fetchNameServerAddr();
                    }
                    // Start request-response channel
                    this.mQClientAPIImpl.start();
                    // Start various schedule tasks
                    this.startScheduledTask();
                    // Start pull service
                    this.pullMessageService.start();
                    // Start rebalance service
                    this.rebalanceService.start();
                    // Start push service
                    this.defaultMQProducer.getDefaultMQProducerImpl().start(false);
                    log.info("the client factory [{}] start OK", this.clientId);
                    this.serviceState = ServiceState.RUNNING;
                    break;
                case START_FAILED:
                    throw new MQClientException("The Factory object[" + this.getClientId() + "] has been created before, and failed.", null);
                default:
                    break;
            }
        }
    }

1.8.1 MQClientAPIImpl#start

public void start() {
	// Delegates to NettyRemotingClient#start
	this.remotingClient.start();
}
	/**
	 * NettyRemotingClient#start
     * Creates and configures the Netty client; no connection is actually opened yet
     */
    @Override
    public void start() {
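        // Create the default event executor group: 4 threads by default, with thread names prefixed by NettyClientWorkerThread_.
        // It runs the work needed before the real business logic: SSL verification, encoding/decoding, idle checks, connection management, and so on.
        // In the processing chain it sits after the IO thread group and before the processor thread pools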
        this.defaultEventExecutorGroup = new DefaultEventExecutorGroup(
            nettyClientConfig.getClientWorkerThreads(),
            new ThreadFactory() {

                private AtomicInteger threadIndex = new AtomicInteger(0);

                @Override
                public Thread newThread(Runnable r) {
                    return new Thread(r, "NettyClientWorkerThread_" + this.threadIndex.incrementAndGet());
                }
            });

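        // Initialize the Netty client
        // eventLoopGroupWorker is the IO thread group, one thread by default
        Bootstrap handler = this.bootstrap.group(this.eventLoopGroupWorker).channel(NioSocketChannel.class)
            // Maps to the TCP_NODELAY socket option; true disables Nagle's algorithm so small packets are sent immediately
            .option(ChannelOption.TCP_NODELAY, true)
            // Maps to the SO_KEEPALIVE socket option; when enabled, TCP probes idle connections to check that the peer is alive (disabled here)
            .option(ChannelOption.SO_KEEPALIVE, false)
            // Connect timeout in milliseconds, 3000 by default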
            .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, nettyClientConfig.getConnectTimeoutMillis())
            .handler(new ChannelInitializer<SocketChannel>() {
                @Override
                public void initChannel(SocketChannel ch) throws Exception {
                    ChannelPipeline pipeline = ch.pipeline();
                    if (nettyClientConfig.isUseTLS()) {
                        if (null != sslContext) {
                            pipeline.addFirst(defaultEventExecutorGroup, "sslHandler", sslContext.newHandler(ch.alloc()));
                            log.info("Prepend SSL handler");
                        } else {
                            log.warn("Connections are insecure as SSLContext is null!");
                        }
                    }
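                    // Add the handlers, all of which run on defaultEventExecutorGroup
                    pipeline.addLast(
                        defaultEventExecutorGroup,
                        // RocketMQ's custom encoder for outgoing commands
                        new NettyEncoder(),
                        // RocketMQ's custom decoder for incoming commands
                        new NettyDecoder(),
                        // Netty's built-in idle-state handler, mainly used to detect whether the remote end is still alive:
                        // an all-idle event fires when nothing has been read or written for clientChannelMaxIdleTimeSeconds (120 seconds by default)
                        new IdleStateHandler(0, 0, nettyClientConfig.getClientChannelMaxIdleTimeSeconds()),
                        // Connection manager: handles connect, disconnect, idle-timeout, and exception events
                        new NettyConnectManageHandler(),
                        // Command processor for RemotingCommand messages. This is the key handler.
                        // It handles both the responses to this client's own requests and the requests a Broker sends to the client,
                        // and dispatches the latter by request code to the matching processor and its thread pool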
                        new NettyClientHandler());
                }
            });
        if (nettyClientConfig.getClientSocketSndBufSize() > 0) {
            log.info("client set SO_SNDBUF to {}", nettyClientConfig.getClientSocketSndBufSize());
            handler.option(ChannelOption.SO_SNDBUF, nettyClientConfig.getClientSocketSndBufSize());
        }
        if (nettyClientConfig.getClientSocketRcvBufSize() > 0) {
            log.info("client set SO_RCVBUF to {}", nettyClientConfig.getClientSocketRcvBufSize());
            handler.option(ChannelOption.SO_RCVBUF, nettyClientConfig.getClientSocketRcvBufSize());
        }
        if (nettyClientConfig.getWriteBufferLowWaterMark() > 0 && nettyClientConfig.getWriteBufferHighWaterMark() > 0) {
            log.info("client set netty WRITE_BUFFER_WATER_MARK to {},{}",
                    nettyClientConfig.getWriteBufferLowWaterMark(), nettyClientConfig.getWriteBufferHighWaterMark());
            handler.option(ChannelOption.WRITE_BUFFER_WATER_MARK, new WriteBufferWaterMark(
                    nettyClientConfig.getWriteBufferLowWaterMark(), nettyClientConfig.getWriteBufferHighWaterMark()));
        }
        /*
         * Start a timer task that first runs 3 seconds after startup and then once every second.
         * It scans responseTable, removes the ResponseFutures that have timed out, and executes their callbacks
         */
        this.timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                try {
                    NettyRemotingClient.this.scanResponseTable();
                } catch (Throwable e) {
                    log.error("scanResponseTable exception", e);
                }
            }
        }, 1000 * 3, 1000);

        /*
         * Start the Netty event executor, which dispatches connection events to the listener
         */
        if (this.channelEventListener != null) {
            this.nettyEventExecutor.start();
        }
    }

1.8.2 MQClientInstance#startScheduledTask

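Starting the scheduled tasks:

private void startScheduledTask() {
        /*
         * 1 If namesrvAddr was not set explicitly, fetch the latest NameServer addresses from the address server every 2 minutes and update them.
         * Dynamic NameServer address updates require the URL of an address server to be configured
         */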
        if (null == this.clientConfig.getNamesrvAddr()) {
            this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

                @Override
                public void run() {
                    try {
                        MQClientInstance.this.mQClientAPIImpl.fetchNameServerAddr();
                    } catch (Exception e) {
                        log.error("ScheduledTask fetchNameServerAddr exception", e);
                    }
                }
            }, 1000 * 10, 1000 * 60 * 2, TimeUnit.MILLISECONDS);
        }

        /*
         * 2 Update the topic route information from the NameServer every 30 s (pollNameServerInterval)
         */
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.updateTopicRouteInfoFromNameServer();
                } catch (Exception e) {
                    log.error("ScheduledTask updateTopicRouteInfoFromNameServer exception", e);
                }
            }
        }, 10, this.clientConfig.getPollNameServerInterval(), TimeUnit.MILLISECONDS);
        /*
         * 3 Every 30 s (heartbeatBrokerInterval), remove offline Brokers and send a heartbeat to all Brokers
         */
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.cleanOfflineBroker();
                    MQClientInstance.this.sendHeartbeatToAllBrokerWithLock();
                } catch (Exception e) {
                    log.error("ScheduledTask sendHeartbeatToAllBroker exception", e);
                }
            }
        }, 1000, this.clientConfig.getHeartbeatBrokerInterval(), TimeUnit.MILLISECONDS);
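        /*
         * 4 Every 5 s (persistConsumerOffsetInterval), persist the consumer offsets, i.e. the consume progress.
         * Broadcasting mode persists to local disk; clustering mode pushes the offsets to the Broker
         */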
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.persistAllConsumerOffset();
                } catch (Exception e) {
                    log.error("ScheduledTask persistAllConsumerOffset exception", e);
                }
            }
        }, 1000 * 10, this.clientConfig.getPersistConsumerOffsetInterval(), TimeUnit.MILLISECONDS);
        /*
         * 5 Every minute, try to adjust the size of the push consumers' consume thread pools; by default this is currently not implemented
         */
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {

            @Override
            public void run() {
                try {
                    MQClientInstance.this.adjustThreadPool();
                } catch (Exception e) {
                    log.error("ScheduledTask adjustThreadPool exception", e);
                }
            }
        }, 1, 1, TimeUnit.MINUTES);
    }
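
Every scheduled task above wraps its body in try/catch. With ScheduledExecutorService#scheduleAtFixedRate this matters: if one execution throws, all subsequent executions are suppressed. The sketch below shows the guard in isolation. GuardedScheduleSketch, guarded, and runsWithin are hypothetical helper names, and the task fails on purpose on every run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: a periodic task survives its own failures because every run is guarded. */
final class GuardedScheduleSketch {
    /** Wraps a task so that no exception escapes to the scheduler. */
    static Runnable guarded(String name, Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (Throwable e) {
                System.err.println("ScheduledTask " + name + " exception: " + e);
            }
        };
    }

    /** Schedules an always-failing task and returns how many times it ran once wantedRuns was reached (or after 5 s). */
    static int runsWithin(long initialDelayMillis, long periodMillis, int wantedRuns) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(wantedRuns);
        AtomicInteger runs = new AtomicInteger();
        scheduler.scheduleAtFixedRate(guarded("demo", () -> {
            runs.incrementAndGet();
            latch.countDown();
            throw new IllegalStateException("simulated failure");
        }), initialDelayMillis, periodMillis, TimeUnit.MILLISECONDS);
        try {
            latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println(runsWithin(10, 10, 3) >= 3);
    }
}
```

Without the guarded wrapper, the same task would run exactly once and then silently stop being scheduled.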

1.8.3 PullMessageService#start

PullMessageService extends the ServiceThread class; starting it creates a thread and starts that thread.

1.8.4 RebalanceService#start

RebalanceService extends the ServiceThread class; starting it creates a thread and starts that thread.

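Summary

The comments in the code above already describe the startup of the default consumer implementation in detail. In short, the startup flow is: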

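  1. DefaultMQPushConsumerImpl#checkConfig checks the consumer configuration, for example whether consumerGroup is valid, whether MessageModel, consumeFromWhere, and allocateMessageQueueStrategy are non-null, and whether the flow control settings are valid.
  2. DefaultMQPushConsumerImpl#copySubscription copies the subscriptions. When a message fails to be consumed it is retried through the topic %RETRY% + consumerGroup, so this retry topic is subscribed while the subscriptions are copied. Note that broadcasting mode does not subscribe to the retry topic.
  3. Get the MQClientManager singleton, then get or create the MQClientInstance for the clientId and assign it to mQClientFactory.
  4. Set the properties of the rebalance service.
  5. Create PullAPIWrapper, the core message-pulling object (it wraps the pull request and result-parsing API).
  6. Choose the OffsetStore by message model (it manages the consumer's consume offsets): MessageModel=BROADCASTING uses LocalFileOffsetStore (offsets are stored on the local disk); MessageModel=CLUSTERING uses RemoteBrokerOffsetStore (offsets are stored on the remote Broker).
  7. Load the consume offsets: LocalFileOffsetStore loads the data from the local disk, while RemoteBrokerOffsetStore has an empty implementation.
  8. Create the consume service that matches the listener type (ConsumeMessageOrderlyService for ordered consumption, ConsumeMessageConcurrentlyService for concurrent consumption) and start it.
  9. Register the consumer group and the consumer in the consumerTable of MQClientInstance.
  10. Start the MQClientInstance, the client communication instance.
  11. Follow-up work: fetch and update the route information of the subscribed topics from the NameServer, send the Broker a request that checks the client's filter configuration, send a heartbeat to all Brokers, and wake up rebalanceService to rebalance.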

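2. Message consumption by the Consumer

Message consumption involves a few concepts:

  1. Message model (MessageModel): BROADCASTING or CLUSTERING. In broadcasting mode every Consumer consumes the messages of all MessageQueues under the Topic; in clustering mode the Consumers of a group each consume a share of the Topic's MessageQueues.
  2. Where to start consuming (ConsumeFromWhere):
  • CONSUME_FROM_LAST_OFFSET: the first time the consumer group starts, it consumes from the latest position; on later starts it resumes from the last consumed offset.
    CONSUME_FROM_LAST_OFFSET_AND_FROM_MIN_WHEN_BOOT_FIRST, CONSUME_FROM_MIN_OFFSET, and CONSUME_FROM_MAX_OFFSET are deprecated and are treated as the default CONSUME_FROM_LAST_OFFSET.
  • CONSUME_FROM_FIRST_OFFSET: the first time the consumer group starts, it consumes from the earliest position; on later starts it resumes from the last consumed offset.
  • CONSUME_FROM_TIMESTAMP: the first time the consumer group starts, it consumes the messages produced after the given timestamp; on later starts it resumes from the last consumed offset.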
2.1 How the Consumer pulls messages

2.1.1 The PullMessageService class

PullMessageService is the entry point for message pulling on the Consumer side. Its fields are:

public class PullMessageService extends ServiceThread {
    private final InternalLogger log = ClientLogger.getLog();
    /** Queue of pull requests */
    private final LinkedBlockingQueue<PullRequest> pullRequestQueue = new LinkedBlockingQueue<PullRequest>();
    /** The RocketMQ client instance */
    private final MQClientInstance mQClientFactory;
    /** Scheduled thread pool used to execute pull requests after a delay */
    private final ScheduledExecutorService scheduledExecutorService = Executors
        .newSingleThreadScheduledExecutor(new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                return new Thread(r, "PullMessageServiceScheduledThread");
            }
        });
}

2.1.2 PullMessageService#run

The entry point of message pulling:

	@Override
    public void run() {
        log.info(this.getServiceName() + " service started");
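        /*
         * Main loop:
         * as long as the service has not been stopped, keep pulling messages in an endless loop
         */
        while (!this.isStopped()) {
            try {
                // Block until a pull request is available, then remove it from the head of the queue
                PullRequest pullRequest = this.pullRequestQueue.take();
                // Pull messages from the Broker according to this request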
                this.pullMessage(pullRequest);
            } catch (InterruptedException ignored) {
            } catch (Exception e) {
                log.error("Pull Message Service Run Method exception", e);
            }
        }

        log.info(this.getServiceName() + " service end");
    }
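
The loop above is a classic single-consumer work queue: requests are put back into the queue, immediately or after a delay, and one thread takes them one by one. Below is a minimal sketch of that mechanism using plain strings as requests. PullLoopSketch, demo, and shutdown are hypothetical; the method names executePullRequestImmediately and executePullRequestLater are borrowed from the article's code only as labels.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch of a take-loop with immediate and delayed re-enqueueing; not RocketMQ code. */
final class PullLoopSketch {
    private final LinkedBlockingQueue<String> pullRequestQueue = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private volatile boolean stopped = false;

    void executePullRequestImmediately(String request) {
        pullRequestQueue.offer(request);
    }

    void executePullRequestLater(String request, long delayMillis) {
        // The request is not pulled later by a timer; it is merely enqueued later.
        scheduler.schedule(() -> executePullRequestImmediately(request), delayMillis, TimeUnit.MILLISECONDS);
    }

    void run(Consumer<String> handler) {
        while (!stopped) {
            try {
                handler.accept(pullRequestQueue.take()); // blocks while the queue is empty
            } catch (InterruptedException ignored) {
                // woken up by shutdown(); the loop condition decides whether to exit
            } catch (RuntimeException e) {
                System.err.println("pull failed: " + e);
            }
        }
    }

    void shutdown(Thread worker) {
        stopped = true;
        scheduler.shutdownNow();
        worker.interrupt();
    }

    /** Enqueues one immediate and one delayed request; returns true if both were handled within 5 s. */
    static boolean demo() {
        PullLoopSketch service = new PullLoopSketch();
        CountDownLatch handled = new CountDownLatch(2);
        Thread worker = new Thread(() -> service.run(request -> handled.countDown()), "PullLoopSketch");
        worker.start();
        service.executePullRequestImmediately("queue-0");
        service.executePullRequestLater("queue-1", 50);
        try {
            return handled.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            service.shutdown(worker);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```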

2.1.3 PullMessageService#pullMessage

The pull logic itself is carried out by the concrete DefaultMQPushConsumerImpl:

    private void pullMessage(final PullRequest pullRequest) {
        // Look up in consumerTable the consumer instance of the consumer group stored in the pullRequest
        final MQConsumerInner consumer = this.mQClientFactory.selectConsumer(pullRequest.getConsumerGroup());
        if (consumer != null) {
            DefaultMQPushConsumerImpl impl = (DefaultMQPushConsumerImpl) consumer;
            // Pull messages
            impl.pullMessage(pullRequest);
        } else {
            log.warn("No matched consumer for the PullRequest {}, drop it", pullRequest);
        }
    }

2.1.4 DefaultMQPushConsumerImpl#pullMessage

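The message pulling flow is as follows:

  1. Check whether the ProcessQueue has been dropped; if so, stop pulling.
  2. State check: verify that the Consumer is in a normal state (RUNNING).
  3. Flow control check: verify whether the number of cached messages (threshold DefaultMQPushConsumer#pullThresholdForQueue, default 1000) or their total size (threshold DefaultMQPushConsumer#pullThresholdSizeForQueue, default 100 MiB) exceeds its threshold. If so, the pull is retried later (in practice, the pull request is put back into pullRequestQueue after a delay). The counters are updated in ProcessQueue#putMessage, which is called after messages have been pulled.
  4. Checks specific to ordered and concurrent consumption:
  • Concurrent consumption: if the maximum offset span of the messages held in memory exceeds the threshold (DefaultMQPushConsumer#consumeConcurrentlyMaxSpan, default 2000), the pull is retried later.
  • Ordered consumption: on the first pull after the queue is locked, check against the Broker whether the offset is ahead.
  5. Create the pull callback object that processes the result once the pull request returns:
  • onSuccess: decode and filter the messages, set further properties, and handle the result according to its PullStatus.
  • onException: retry the pull later.
  6. Check whether the consume offset should be reported. This happens only when MessageModel=CLUSTERING: the offset held in memory by RemoteBrokerOffsetStore is read, and if it is greater than 0 the sysFlag of the pull request is marked to commit the offset (PullSysFlag#FLAG_COMMIT_OFFSET) and the current offset is sent to the Broker along with the request.
  7. PullAPIWrapper performs the actual pull.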
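
The flow control checks in steps 3 and 4 boil down to a pure decision: pull now, or put the request back after a short delay. The sketch below encodes that decision with the default thresholds mentioned above and a 50 ms delay. PullFlowControlSketch and delayBeforePull are hypothetical names, and in RocketMQ the thresholds are configurable rather than constants.

```java
/** Hypothetical sketch of the flow control decision made before each pull; not RocketMQ code. */
final class PullFlowControlSketch {
    static final long PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL = 50;
    static final int PULL_THRESHOLD_FOR_QUEUE = 1000;          // cached message count
    static final int PULL_THRESHOLD_SIZE_FOR_QUEUE_MIB = 100;  // cached message size in MiB
    static final int CONSUME_CONCURRENTLY_MAX_SPAN = 2000;     // max offset span, concurrent mode only

    /** Returns the delay in milliseconds before the next pull, or 0 if the pull may proceed now. */
    static long delayBeforePull(long cachedCount, long cachedBytes, long maxSpan, boolean consumeOrderly) {
        long cachedMiB = cachedBytes / (1024 * 1024);
        if (cachedCount > PULL_THRESHOLD_FOR_QUEUE) {
            return PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL;
        }
        if (cachedMiB > PULL_THRESHOLD_SIZE_FOR_QUEUE_MIB) {
            return PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL;
        }
        // The span check applies to concurrent consumption only.
        if (!consumeOrderly && maxSpan > CONSUME_CONCURRENTLY_MAX_SPAN) {
            return PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(delayBeforePull(1001, 0, 0, false));
        System.out.println(delayBeforePull(10, 0, 2001, true));
    }
}
```

Note that all comparisons are strict, so a queue holding exactly 1000 cached messages is still allowed to pull.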
/**
     * Pulls messages
     *
     * @param pullRequest the pull request
     */
    public void pullMessage(final PullRequest pullRequest) {
        // Get the corresponding process queue
        final ProcessQueue processQueue = pullRequest.getProcessQueue();
        // If the process queue has been dropped, return immediately
        if (processQueue.isDropped()) {
            log.info("the pull request[{}] is dropped.", pullRequest.toString());
            return;
        }

        // Record the timestamp of the last pull
        pullRequest.getProcessQueue().setLastPullTimestamp(System.currentTimeMillis());

        // 1 State check
        try {
            // Make sure this consumer's service state is OK; if it is not RUNNING, an exception is thrown
            this.makeSureStateOK();
        } catch (MQClientException e) {
            log.warn("pullMessage exception, consumer state not ok", e);
            this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
            return;
        }

        // If the consumer is paused, re-submit the pull request after a 1 s delay
        if (this.isPause()) {
            log.warn("consumer was paused, execute pull request later. instanceName={}, group={}", this.defaultMQPushConsumer.getInstanceName(), this.defaultMQPushConsumer.getConsumerGroup());
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_SUSPEND);
            return;
        }

        // 2 Flow control check
        // Number of messages currently cached in the processQueue
        long cachedMessageCount = processQueue.getMsgCount().get();
        // Total size of the messages currently cached in the processQueue, in MiB
        long cachedMessageSizeInMiB = processQueue.getMsgSize().get() / (1024 * 1024);
        // If the number of cached messages exceeds the threshold (default 1000)
        if (cachedMessageCount > this.defaultMQPushConsumer.getPullThresholdForQueue()) {
            // Re-submit the pull request after a 50 ms delay
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);
            if ((queueFlowControlTimes++ % 1000) == 0) {
                log.warn(
                    "the cached message count exceeds the threshold {}, so do flow control, minOffset={}, maxOffset={}, count={}, size={} MiB, pullRequest={}, flowControlTimes={}",
                    this.defaultMQPushConsumer.getPullThresholdForQueue(), processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), cachedMessageCount, cachedMessageSizeInMiB, pullRequest, queueFlowControlTimes);
            }
            return;
        }
        // If the total size of the cached messages exceeds the threshold (default 100 MiB)
        if (cachedMessageSizeInMiB > this.defaultMQPushConsumer.getPullThresholdSizeForQueue()) {
            // Re-submit the pull request after a 50 ms delay
            this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);
            if ((queueFlowControlTimes++ % 1000) == 0) {
                log.warn(
                    "the cached message size exceeds the threshold {} MiB, so do flow control, minOffset={}, maxOffset={}, count={}, size={} MiB, pullRequest={}, flowControlTimes={}",
                    this.defaultMQPushConsumer.getPullThresholdSizeForQueue(), processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), cachedMessageCount, cachedMessageSizeInMiB, pullRequest, queueFlowControlTimes);
            }
            return;
        }

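        // 3 Checks specific to orderly and concurrent consumption
        // Not orderly, i.e. concurrent consumption
        if (!this.consumeOrderly) {
            // If the offset span of the messages cached in memory exceeds the threshold (default 2000)
            if (processQueue.getMaxSpan() > this.defaultMQPushConsumer.getConsumeConcurrentlyMaxSpan()) {
                // Re-submit the pull request after a 50ms delay
                this.executePullRequestLater(pullRequest, PULL_TIME_DELAY_MILLS_WHEN_FLOW_CONTROL);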
                if ((queueMaxSpanFlowControlTimes++ % 1000) == 0) {
                    log.warn(
                        "the queue's messages, span too long, so do flow control, minOffset={}, maxOffset={}, maxSpan={}, pullRequest={}, flowControlTimes={}",
                        processQueue.getMsgTreeMap().firstKey(), processQueue.getMsgTreeMap().lastKey(), processQueue.getMaxSpan(),
                        pullRequest, queueMaxSpanFlowControlTimes);
                }
                return;
            }
        } else {
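            // Orderly consumption: the processQueue must already be locked
            if (processQueue.isLocked()) {
                // If it was not locked before, the pull offset has to be initialized
                if (!pullRequest.isPreviouslyLocked()) {
                    long offset = -1L;
                    try {
                        // Compute the offset from which this MessageQueue should be pulled next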
                        offset = this.rebalanceImpl.computePullFromWhereWithException(pullRequest.getMessageQueue());
                    } catch (Exception e) {
                        // Re-submit the pull request after a 3s delay
                        this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
                        log.error("Failed to compute pull offset, pullResult: {}", pullRequest, e);
                        return;
                    }
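                    // If the request's next offset is ahead of the computed offset, flag it as brokerBusy; the offset is reset below either way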
                    boolean brokerBusy = offset < pullRequest.getNextOffset();
                    log.info("the first time to pull message, so fix offset from broker. pullRequest: {} NewOffset: {} brokerBusy: {}",
                        pullRequest, offset, brokerBusy);
                    if (brokerBusy) {
                        log.info("[NOTIFYME]the first time to pull message, but pull request offset larger than broker consume offset. pullRequest: {} NewOffset: {}",
                            pullRequest, offset);
                    }
                    // Mark the request as previouslyLocked
                    pullRequest.setPreviouslyLocked(true);
                    // Reset the pull offset
                    pullRequest.setNextOffset(offset);
                }
            } else {
                // Not locked yet: re-submit the pull request after a 3s delay
                this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
                log.info("pull message later because not locked in broker, {}", pullRequest);
                return;
            }
        }

        // Look up the SubscriptionData for this topic
        final SubscriptionData subscriptionData = this.rebalanceImpl.getSubscriptionInner().get(pullRequest.getMessageQueue().getTopic());
        // No subscription data found
        if (null == subscriptionData) {
            // Re-submit the pull request after a 3s delay
            this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
            log.warn("find the consumer's subscription failed, {}", pullRequest);
            return;
        }

        // Start time
        final long beginTimestamp = System.currentTimeMillis();

        // 4 Create the pull callback; it is invoked once the pull request returns
        PullCallback pullCallback = new PullCallback() {
            @Override
            public void onSuccess(PullResult pullResult) {
                if (pullResult != null) {
                    // Process the pullResult: decode the messages, filter them and set additional properties
                    pullResult = DefaultMQPushConsumerImpl.this.pullAPIWrapper.processPullResult(pullRequest.getMessageQueue(), pullResult,
                        subscriptionData);

                    switch (pullResult.getPullStatus()) {
                        case FOUND:
                            // Offset at which this pull started
                            long prevRequestOffset = pullRequest.getNextOffset();
                            // Store the start offset of the next pull in the PullRequest
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());
                            // Record the pull round-trip time
                            long pullRT = System.currentTimeMillis() - beginTimestamp;
                            DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullRT(pullRequest.getConsumerGroup(),
                                pullRequest.getMessageQueue().getTopic(), pullRT);

                            long firstMsgOffset = Long.MAX_VALUE;
                            // No messages were returned
                            if (pullResult.getMsgFoundList() == null || pullResult.getMsgFoundList().isEmpty()) {
                                /*
                                 * Immediately put the pull request back into PullMessageService's pullRequestQueue.
                                 * PullMessageService is a service thread that keeps taking pullRequests from
                                 * pullRequestQueue and sending new pull requests to the broker,
                                 * which triggers the next pull.
                                 */
                                DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                            } else {
                                // Queue offset of the first message
                                firstMsgOffset = pullResult.getMsgFoundList().get(0).getQueueOffset();

                                // Record the pull TPS
                                DefaultMQPushConsumerImpl.this.getConsumerStatsManager().incPullTPS(pullRequest.getConsumerGroup(),
                                    pullRequest.getMessageQueue().getTopic(), pullResult.getMsgFoundList().size());

                                // Store all pulled messages in the msgTreeMap of the corresponding processQueue
                                boolean dispatchToConsume = processQueue.putMessage(pullResult.getMsgFoundList());
                                // consumeMessageService wraps the pulled messages in a ConsumeRequest and consumes them on its internal consumeExecutor thread pool
                                // It has two implementations: ConsumeMessageConcurrentlyService (concurrent) and ConsumeMessageOrderlyService (orderly)
                                DefaultMQPushConsumerImpl.this.consumeMessageService.submitConsumeRequest(
                                    pullResult.getMsgFoundList(),
                                    processQueue,
                                    pullRequest.getMessageQueue(),
                                    dispatchToConsume);

                                // Read the configured pull interval (default 0). If it is greater than 0, wait for that interval
                                // before putting the pull request back into pullRequestQueue; otherwise put it back immediately
                                if (DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval() > 0) {
                                    /*
                                     * Schedule executePullRequestImmediately on PullMessageService's scheduledExecutorService,
                                     * so that it runs once the given delay has elapsed
                                     */
                                    DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest,
                                        DefaultMQPushConsumerImpl.this.defaultMQPushConsumer.getPullInterval());
                                } else {
                                    // Immediately put the pull request back into PullMessageService's pullRequestQueue for the next pull
                                    DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                                }
                            }

                            if (pullResult.getNextBeginOffset() < prevRequestOffset
                                || firstMsgOffset < prevRequestOffset) {
                                log.warn(
                                    "[BUG] pull message result maybe data wrong, nextBeginOffset: {} firstMsgOffset: {} prevRequestOffset: {}",
                                    pullResult.getNextBeginOffset(),
                                    firstMsgOffset,
                                    prevRequestOffset);
                            }

                            break;
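                        // No new messages; falls through to the NO_MATCHED_MSG handling
                        case NO_NEW_MSG:
                        // No messages matched the filter
                        case NO_MATCHED_MSG:
                            // Update the offset for the next pull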
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());

                            DefaultMQPushConsumerImpl.this.correctTagsOffset(pullRequest);

                            // Immediately put the pull request back into PullMessageService's pullRequestQueue for the next pull
                            DefaultMQPushConsumerImpl.this.executePullRequestImmediately(pullRequest);
                            break;

                        // The requested offset is illegal: too large or too small
                        case OFFSET_ILLEGAL:
                            log.warn("the pull request offset illegal, {} {}",
                                pullRequest.toString(), pullResult.toString());
                            // Update the offset for the next pull
                            pullRequest.setNextOffset(pullResult.getNextBeginOffset());

                            // Mark the ProcessQueue as dropped so that this pull request is discarded
                            pullRequest.getProcessQueue().setDropped(true);
                            DefaultMQPushConsumerImpl.this.executeTaskLater(new Runnable() {

                                @Override
                                public void run() {
                                    try {
                                        // Write the corrected offset into the offsetStore
                                        DefaultMQPushConsumerImpl.this.offsetStore.updateOffset(pullRequest.getMessageQueue(),
                                            pullRequest.getNextOffset(), false);

                                        // Persist the offset
                                        DefaultMQPushConsumerImpl.this.offsetStore.persist(pullRequest.getMessageQueue());

                                        // Remove the ProcessQueue of this MessageQueue from the rebalance service
                                        DefaultMQPushConsumerImpl.this.rebalanceImpl.removeProcessQueue(pullRequest.getMessageQueue());

                                        log.warn("fix the pull request offset, {}", pullRequest);
                                    } catch (Throwable e) {
                                        log.error("executeTaskLater Exception", e);
                                    }
                                }
                            }, 10000);
                            break;
                        default:
                            break;
                    }
                }
            }

            @Override
            public void onException(Throwable e) {
                if (!pullRequest.getMessageQueue().getTopic().startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                    log.warn("execute the pull request exception", e);
                }

                /*
                 * On exception, put the pull request back into PullMessageService's pullRequestQueue after a 3s delay
                 */
                DefaultMQPushConsumerImpl.this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
            }
        };

        // 5 Decide whether the consume offset may be reported to the broker
        boolean commitOffsetEnable = false;
        long commitOffsetValue = 0L;
        // Clustering consumption mode
        if (MessageModel.CLUSTERING == this.defaultMQPushConsumer.getMessageModel()) {
            // Read commitOffsetValue from the in-memory offsetTable
            commitOffsetValue = this.offsetStore.readOffset(pullRequest.getMessageQueue(), ReadOffsetType.READ_FROM_MEMORY);
            if (commitOffsetValue > 0) {
                // An offset for this mq exists in memory, so allow reporting the consume offset to the broker
                commitOffsetEnable = true;
            }
        }

        String subExpression = null;
        boolean classFilter = false;
        // Resolve the subscription expression and the classFilter mode
        SubscriptionData sd = this.rebalanceImpl.getSubscriptionInner().get(pullRequest.getMessageQueue().getTopic());
        if (sd != null) {
            if (this.defaultMQPushConsumer.isPostSubscriptionWhenPull() && !sd.isClassFilterMode()) {
                subExpression = sd.getSubString();
            }

            classFilter = sd.isClassFilterMode();
        }

        // System flag
        int sysFlag = PullSysFlag.buildSysFlag(
            commitOffsetEnable, // commitOffset
            true, // suspend
            subExpression != null, // subscription
            classFilter // class filter
        );

        // 6 Actually send the pull request
        try {
            this.pullAPIWrapper.pullKernelImpl(
                pullRequest.getMessageQueue(),
                subExpression,
                subscriptionData.getExpressionType(),
                subscriptionData.getSubVersion(),
                pullRequest.getNextOffset(),
                this.defaultMQPushConsumer.getPullBatchSize(),
                sysFlag,
                commitOffsetValue,
                BROKER_SUSPEND_MAX_TIME_MILLIS,
                CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND,
                CommunicationMode.ASYNC,
                pullCallback
            );
        } catch (Exception e) {
            log.error("pullKernelImpl exception", e);
            // The pull failed: re-submit the pull request after a 3s delay
            this.executePullRequestLater(pullRequest, pullTimeDelayMillsWhenException);
        }
    }
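
The flow-control checks in step 2 and step 3 can be modelled in isolation. The following is a minimal sketch, not RocketMQ code: the class `PullFlowControlSketch`, its `Decision` enum and the `TreeMap` standing in for `msgTreeMap` are hypothetical, and the thresholds are simply the defaults quoted in the comments above (1000 messages, 100 MiB, span 2000). It assumes the span is the distance between the smallest and largest cached queue offsets.

```java
import java.util.TreeMap;

/** Hypothetical stand-alone model of the flow-control checks; not RocketMQ code. */
final class PullFlowControlSketch {
    // Defaults quoted in the walkthrough above.
    static final int PULL_THRESHOLD_FOR_QUEUE = 1000;          // cached message count
    static final int PULL_THRESHOLD_SIZE_FOR_QUEUE_MIB = 100;  // cached size in MiB
    static final int CONSUME_CONCURRENTLY_MAX_SPAN = 2000;     // offset span

    enum Decision { PULL_NOW, DELAY_COUNT, DELAY_SIZE, DELAY_SPAN }

    /**
     * @param cached         queue offset -> message size in bytes, for every cached message
     * @param consumeOrderly true for orderly consumption (the span check is skipped)
     */
    static Decision check(TreeMap<Long, Integer> cached, boolean consumeOrderly) {
        long count = cached.size();
        long sizeBytes = 0;
        for (int size : cached.values()) {
            sizeBytes += size;
        }
        long sizeMiB = sizeBytes / (1024 * 1024);

        if (count > PULL_THRESHOLD_FOR_QUEUE) {
            return Decision.DELAY_COUNT;
        }
        if (sizeMiB > PULL_THRESHOLD_SIZE_FOR_QUEUE_MIB) {
            return Decision.DELAY_SIZE;
        }
        // Only concurrent consumption limits the offset span (assumed: last key minus first key).
        if (!consumeOrderly && !cached.isEmpty()) {
            long span = cached.lastKey() - cached.firstKey();
            if (span > CONSUME_CONCURRENTLY_MAX_SPAN) {
                return Decision.DELAY_SPAN;
            }
        }
        return Decision.PULL_NOW;
    }

    public static void main(String[] args) {
        TreeMap<Long, Integer> cached = new TreeMap<>();
        cached.put(0L, 10);
        cached.put(2001L, 10); // one slow message at offset 0 keeps the span wide
        System.out.println(check(cached, false));
        System.out.println(check(cached, true));
    }
}
```

The span case shows why a single unacknowledged message can stall pulling under concurrent consumption even when the cache is nearly empty: the count and size checks pass, but the distance between the oldest and newest cached offsets still exceeds the limit.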

2.1.5 PullAPIWrapper#pullKernelImpl

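Asynchronously requests messages from the broker.

  1. maxNums: the maximum number of messages per pull, 32 by default; configured through DefaultMQPushConsumer#pullBatchSize.
  2. brokerSuspendMaxTimeMillis: the longest time the broker may suspend the request, 15s by default; defined by DefaultMQPushConsumerImpl#BROKER_SUSPEND_MAX_TIME_MILLIS and not configurable.
  3. timeoutMillis: the consumer-side pull timeout, 30s by default; defined by DefaultMQPushConsumerImpl#CONSUMER_TIMEOUT_MILLIS_WHEN_SUSPEND and not configurable.

/**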
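     * Pulls messages
     *
     * @param mq                         the message queue
     * @param subExpression              the subscription expression; only OR is supported, e.g. "tag1 || tag2 || tag3"; null or * subscribes to everything
     * @param expressionType             the subscription expression type used for filtering, TAG or SQL92
     * @param subVersion                 the subscription version
     * @param offset                     the next offset to pull from
     * @param maxNums                    the maximum number of messages per batch pull, 32 by default
     * @param sysFlag                    the system flag
     * @param commitOffset               the consume offset to commit
     * @param brokerSuspendMaxTimeMillis the longest time the broker may suspend the request, 15s by default
     * @param timeoutMillis              the consumer-side pull timeout, 30s by default
     * @param communicationMode          the pull mode, asynchronous by default
     * @param pullCallback               the callback invoked once messages have been pulled
     * @return the pull result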
     */
    public PullResult pullKernelImpl(
        final MessageQueue mq,
        final String subExpression,
        final String expressionType,
        final long subVersion,
        final long offset,
        final int maxNums,
        final int sysFlag,
        final long commitOffset,
        final long brokerSuspendMaxTimeMillis,
        final long timeoutMillis,
        final CommunicationMode communicationMode,
        final PullCallback pullCallback
    ) throws MQClientException, RemotingException, MQBrokerException, InterruptedException {
        // Resolve the broker address for the given brokerName: the master by default, or the suggested broker if a pull suggestion exists
        FindBrokerResult findBrokerResult =
            this.mQClientFactory.findBrokerAddressInSubscribe(mq.getBrokerName(),
                this.recalculatePullFromWhichNode(mq), false);
        // Broker address not found: refresh the topic route info and look it up again
        if (null == findBrokerResult) {
            this.mQClientFactory.updateTopicRouteInfoFromNameServer(mq.getTopic());
            findBrokerResult =
                this.mQClientFactory.findBrokerAddressInSubscribe(mq.getBrokerName(),
                    this.recalculatePullFromWhichNode(mq), false);
        }

        // Broker found
        if (findBrokerResult != null) {
            {
                // check version: filtering by an expression type other than TAG requires a broker of at least V4_1_0_SNAPSHOT
                if (!ExpressionType.isTagType(expressionType)
                    && findBrokerResult.getBrokerVersion() < MQVersion.Version.V4_1_0_SNAPSHOT.ordinal()) {
                    throw new MQClientException("The broker[" + mq.getBrokerName() + ", "
                        + findBrokerResult.getBrokerVersion() + "] does not upgrade to support for filter message by " + expressionType, null);
                }
            }
            int sysFlagInner = sysFlag;

            if (findBrokerResult.isSlave()) {
                sysFlagInner = PullSysFlag.clearCommitOffsetFlag(sysFlagInner);
            }

            // Build the PullMessageRequestHeader
            PullMessageRequestHeader requestHeader = new PullMessageRequestHeader();
            // Consumer group
            requestHeader.setConsumerGroup(this.consumerGroup);
            // Topic
            requestHeader.setTopic(mq.getTopic());
            // Queue id
            requestHeader.setQueueId(mq.getQueueId());
            // Offset to pull from
            requestHeader.setQueueOffset(offset);
            // Maximum number of messages to pull
            requestHeader.setMaxMsgNums(maxNums);
            // System flag
            requestHeader.setSysFlag(sysFlagInner);
            // Consume offset to commit
            requestHeader.setCommitOffset(commitOffset);
            // Longest time the broker may suspend the request, 15s by default
            requestHeader.setSuspendTimeoutMillis(brokerSuspendMaxTimeMillis);
            // Subscription expression; only OR is supported, e.g. "tag1 || tag2 || tag3"; null or * subscribes to everything
            requestHeader.setSubscription(subExpression);
            // Subscription version
            requestHeader.setSubVersion(subVersion);
            // Expression type: TAG or SQL92
            requestHeader.setExpressionType(expressionType);

            String brokerAddr = findBrokerResult.getBrokerAddr();
            if (PullSysFlag.hasClassFilterFlag(sysFlagInner)) {
                brokerAddr = computePullFromWhichFilterServer(mq.getTopic(), brokerAddr);
            }

            // Send the request through MQClientAPIImpl#pullMessage to pull the messages
            PullResult pullResult = this.mQClientFactory.getMQClientAPIImpl().pullMessage(
                brokerAddr,
                requestHeader,
                timeoutMillis,
                communicationMode,
                pullCallback);

            return pullResult;
        }

        throw new MQClientException("The broker[" + mq.getBrokerName() + "] not exist", null);
    }
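
The sysFlag passed into pullKernelImpl is a small bit field, and the slave branch above clears one bit of it. The sketch below illustrates that idea under stated assumptions: the class name `PullSysFlagSketch` is hypothetical, and the bit positions (commit offset in bit 0, suspend in bit 1, subscription in bit 2, class filter in bit 3) are an assumption about how PullSysFlag lays out its flags, not something the code above shows.

```java
/** Hypothetical re-creation of a pull system flag; bit positions are assumed, not taken from the source. */
final class PullSysFlagSketch {
    static final int FLAG_COMMIT_OFFSET = 0x1;      // assumed bit 0
    static final int FLAG_SUSPEND = 0x1 << 1;       // assumed bit 1
    static final int FLAG_SUBSCRIPTION = 0x1 << 2;  // assumed bit 2
    static final int FLAG_CLASS_FILTER = 0x1 << 3;  // assumed bit 3

    /** Packs the four booleans into one int, in the same argument order as the call in pullMessage. */
    static int buildSysFlag(boolean commitOffset, boolean suspend, boolean subscription, boolean classFilter) {
        int flag = 0;
        if (commitOffset) {
            flag |= FLAG_COMMIT_OFFSET;
        }
        if (suspend) {
            flag |= FLAG_SUSPEND;
        }
        if (subscription) {
            flag |= FLAG_SUBSCRIPTION;
        }
        if (classFilter) {
            flag |= FLAG_CLASS_FILTER;
        }
        return flag;
    }

    /** Clears only the commit-offset bit, as done when the request goes to a slave. */
    static int clearCommitOffsetFlag(int sysFlag) {
        return sysFlag & ~FLAG_COMMIT_OFFSET;
    }

    static boolean hasCommitOffsetFlag(int sysFlag) {
        return (sysFlag & FLAG_COMMIT_OFFSET) == FLAG_COMMIT_OFFSET;
    }

    static boolean hasClassFilterFlag(int sysFlag) {
        return (sysFlag & FLAG_CLASS_FILTER) == FLAG_CLASS_FILTER;
    }

    public static void main(String[] args) {
        int flag = buildSysFlag(true, true, false, false);
        int onSlave = clearCommitOffsetFlag(flag);
        System.out.println(hasCommitOffsetFlag(flag) + " -> " + hasCommitOffsetFlag(onSlave));
    }
}
```

Clearing the commit-offset bit leaves the other flags untouched, so a request redirected to a slave still carries the suspend and subscription information while no longer asking that broker to record the consume offset.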

On the broker side, the messages are read by DefaultMessageStore#getMessage:

/**
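     * Starting from the given offset, queries at most maxMsgNums messages of the topic in queueId.
     * The messages found are filtered further with the supplied messageFilter.
     *
     * @param group         the consumer group
     * @param topic         the topic to query
     * @param queueId       the queueId to query
     * @param offset        the logical start offset
     * @param maxMsgNums    the maximum number of messages to query, 32 by default
     * @param messageFilter the filter used to select the wanted messages
     * @return the matching messages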
     */
    public GetMessageResult getMessage(final String group, final String topic, final int queueId, final long offset,
        final int maxMsgNums,
        final MessageFilter messageFilter) {
        /*
         * 1 Precondition checks
         */
        if (this.shutdown) {
            log.warn("message store has shutdown, so getMessage is forbidden");
            return null;
        }

        if (!this.runningFlags.isReadable()) {
            log.warn("message store is not readable, so getMessage is forbidden " + this.runningFlags.getFlagBits());
            return null;
        }

        if (MixAll.isLmq(topic) && this.isLmqConsumeQueueNumExceeded()) {
            log.warn("message store is not available, broker config enableLmq and enableMultiDispatch, lmq consumeQueue num exceed maxLmqConsumeQueueNum config num");
            return null;
        }

        // Start time
        long beginTime = this.getSystemClock().now();

        // Status of the pull; defaults to "no message in queue"
        GetMessageStatus status = GetMessageStatus.NO_MESSAGE_IN_QUEUE;
        // Logical start offset in the consumeQueue for the next pull
        long nextBeginOffset = offset;
        long minOffset = 0;
        long maxOffset = 0;

        // lazy init when find msg.
        GetMessageResult getResult = null;

        // Maximum physical offset of the commitLog
        final long maxOffsetPy = this.commitLog.getMaxOffset();

        // Locate the ConsumeQueue to read from by topic and queue id
        ConsumeQueue consumeQueue = findConsumeQueue(topic, queueId);
        if (consumeQueue != null) {
            /*
             * 2 Offset validation
             */
            // Minimum and maximum logical offsets of the consumeQueue
            minOffset = consumeQueue.getMinOffsetInQueue();
            maxOffset = consumeQueue.getMaxOffsetInQueue();

            if (maxOffset == 0) {
                // A maximum logical offset of 0 means the queue holds no messages: NO_MESSAGE_IN_QUEUE
                status = GetMessageStatus.NO_MESSAGE_IN_QUEUE;
                // Correct the next pull offset: 0 if the broker is not a SLAVE, or is a SLAVE with offset checking enabled
                nextBeginOffset = nextOffsetCorrection(offset, 0);
            } else if (offset < minOffset) {
                // The consumer's offset is below the minimum offset, i.e. too small: OFFSET_TOO_SMALL
                status = GetMessageStatus.OFFSET_TOO_SMALL;
                // Correct the next pull offset: minOffset if the broker is not a SLAVE, or is a SLAVE with offset checking enabled
                nextBeginOffset = nextOffsetCorrection(offset, minOffset);
            } else if (offset == maxOffset) {
                // The consumer's offset equals the maximum offset, i.e. it overflows by one: OFFSET_OVERFLOW_ONE
                status = GetMessageStatus.OFFSET_OVERFLOW_ONE;
                // Correct the next pull offset; it stays at offset
                nextBeginOffset = nextOffsetCorrection(offset, offset);
            } else if (offset > maxOffset) {
                // The consumer's offset is above the maximum offset, i.e. it overflows badly: OFFSET_OVERFLOW_BADLY
                status = GetMessageStatus.OFFSET_OVERFLOW_BADLY;
                // If the minimum offset is 0
                if (0 == minOffset) {
                    // Correct the next pull offset: minOffset if the broker is not a SLAVE, or is a SLAVE with offset checking enabled
                    nextBeginOffset = nextOffsetCorrection(offset, minOffset);
                } else {
                    // Correct the next pull offset: maxOffset if the broker is not a SLAVE, or is a SLAVE with offset checking enabled
                    nextBeginOffset = nextOffsetCorrection(offset, maxOffset);
                }
            }
            // The consumer's offset is >= minOffset and < maxOffset, i.e. within the valid range
            else {
                /*
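                 * 3 Map the logical offset to a position in the consumeQueue file and slice the Buffer from there on. It holds the index entry of the message to pull plus every index entry after it in that consumeQueue file.
                 * A consumeQueue index entry has a fixed length of 20B, so the sliced Buffer may hold many entries, but it always contains the next entry to pull.
                 */
                SelectMappedBufferResult bufferConsumeQueue = consumeQueue.getIndexBuffer(offset);
                // If a buffer was obtained, walk its index entries and look up the messages in the commitLog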
                if (bufferConsumeQueue != null) {
                    try {
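                        // Start with NO_MATCHED_MESSAGE
                        status = GetMessageStatus.NO_MATCHED_MESSAGE;

                        // Start physical offset of the next commitLog file, initially Long.MIN_VALUE
                        // Records whether the previous iteration rolled over to the next commitLog file
                        long nextPhyFileStartOffset = Long.MIN_VALUE;
                        // Largest physical offset touched by this pull
                        long maxPhyOffsetPulling = 0;

                        int i = 0;
                        // Upper bound, in bytes, on the index data scanned per request; usually 16000 bytes, i.e. 16000/20 = 800 entries
                        final int maxFilterMessageCount = Math.max(16000, maxMsgNums * ConsumeQueue.CQ_STORE_UNIT_SIZE);
                        // Whether to record how many bytes of the commitLog are still left to pull, true by default
                        final boolean diskFallRecorded = this.messageStoreConfig.isDiskFallRecorded();

                        // Create the pull result object
                        getResult = new GetMessageResult(maxMsgNums);

                        // Extension storage unit
                        ConsumeQueueExt.CqExtUnit cqExtUnit = new ConsumeQueueExt.CqExtUnit();
                        /*
                         * 4 Iterate over the sliced buffer and handle each ConsumeQueue index entry. Each entry has a fixed length of 20 bytes, so the loop advances 20B at a time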
                         */
                        for (; i < bufferConsumeQueue.getSize() && i < maxFilterMessageCount; i += ConsumeQueue.CQ_STORE_UNIT_SIZE) {
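                            // Physical offset of the message in the CommitLog
                            long offsetPy = bufferConsumeQueue.getByteBuffer().getLong();
                            // Message size
                            int sizePy = bufferConsumeQueue.getByteBuffer().getInt();
                            // For delayed messages this is the delivery time; for other messages it is the hashCode of the tags set by the producer
                            // During consumption it is compared against the consumer's filter to match messages
                            long tagsCode = bufferConsumeQueue.getByteBuffer().getLong();

                            // Update maxPhyOffsetPulling to this message's physical offset in the CommitLog
                            maxPhyOffsetPulling = offsetPy;

                            // If nextPhyFileStartOffset is not Long.MIN_VALUE and offsetPy is smaller than nextPhyFileStartOffset,
                            // the scan has rolled over to the next commitLog file and this entry lies before that file's start offset, so skip it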
                            if (nextPhyFileStartOffset != Long.MIN_VALUE) {
                                if (offsetPy < nextPhyFileStartOffset)
                                    continue;
                            }

                            /*
                             * 4.1 Check whether the message to pull is on disk
                             */
                            boolean isInDisk = checkInDiskByCommitOffset(offsetPy, maxOffsetPy);

                            /*
                             * 4.2 Check whether the batch is full; if so, break out of the loop and stop pulling
                             */
                            if (this.isTheBatchFull(sizePy, maxMsgNums, getResult.getBufferTotalSize(), getResult.getMessageCount(),
                                isInDisk)) {
                                break;
                            }

                            // Check for extension data; usually there is none
                            boolean extRet = false, isTagsCodeLegal = true;
                            if (consumeQueue.isExtAddr(tagsCode)) {
                                extRet = consumeQueue.getExt(tagsCode, cqExtUnit);
                                if (extRet) {
                                    tagsCode = cqExtUnit.getTagsCode();
                                } else {
                                    // can't find ext content.Client will filter messages by tag also.
                                    log.error("[BUG] can't find consume queue extend file content!addr={}, offsetPy={}, sizePy={}, topic={}, group={}",
                                        tagsCode, offsetPy, sizePy, topic, group);
                                    isTagsCodeLegal = false;
                                }
                            }

                            /*
                             * 4.3 Filter by tagsCode through messageFilter#isMatchedByConsumeQueue
                             * The tagsCode is stored in the ConsumeQueue, so broker-side TAG filtering can run on the ConsumeQueue entry alone
                             */
                            if (messageFilter != null
                                && !messageFilter.isMatchedByConsumeQueue(isTagsCodeLegal ? tagsCode : null, extRet ? cqExtUnit : null)) {
                                // The filter rejected the entry and nothing has been pulled yet: set NO_MATCHED_MESSAGE
                                if (getResult.getBufferTotalSize() == 0) {
                                    status = GetMessageStatus.NO_MATCHED_MESSAGE;
                                }

                                // Skip this index entry and move on to the next one
                                continue;
                            }

                            /*
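                             * 4.4 The TAG check passed: call commitLog#getMessage with the physical offset and size to fetch the actual message for this index entry
                             * The index entry holds the message's physical offset and size, which locates the message's memory in the commitLog; the message format is fixed, so its fields can be parsed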
                             */
                            SelectMappedBufferResult selectResult = this.commitLog.getMessage(offsetPy, sizePy);
                            // Message not found: the offset may have reached the end of the file, with the message stored in the next commitLog file
                            if (null == selectResult) {
                                if (getResult.getBufferTotalSize() == 0) {
                                    status = GetMessageStatus.MESSAGE_WAS_REMOVING;
                                }

                                // Set nextPhyFileStartOffset to the start physical offset of the next commitLog file and skip this entry
                                nextPhyFileStartOffset = this.commitLog.rollNextFile(offsetPy);
                                continue;
                            }

                            /*
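                             * 4.5 Message found: apply SQL92 filtering through messageFilter#isMatchedByCommitLog
                             * SQL92 filtering depends on message properties, which live with the message in the commitLog, so the message must be fetched before it can be filtered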
                             */
                            if (messageFilter != null
                                && !messageFilter.isMatchedByCommitLog(selectResult.getByteBuffer().slice(), null)) {
                                if (getResult.getBufferTotalSize() == 0) {
                                    status = GetMessageStatus.NO_MATCHED_MESSAGE;
                                }
                                // release...
                                // The filter rejected the message: release this buffer and skip the entry
                                selectResult.release();
                                continue;
                            }

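                            // Increment the count of transferred messages by 1, for display in the console
                            this.storeStatsService.getGetMessageTransferedMsgCount().add(1);
                            /*
                             * 4.6 Both the TAG and SQL92 checks passed: add the message to getResult. Note that what is stored is a buffer segment (the SelectMappedBufferResult)
                             */
                            getResult.addMessage(selectResult);
                            // Update the status
                            status = GetMessageStatus.FOUND;
                            // Reset nextPhyFileStartOffset to Long.MIN_VALUE and continue with the next iteration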
                            nextPhyFileStartOffset = Long.MIN_VALUE;
                        }

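                        // If the number of commitLog bytes still left to pull should be recorded (true by default)
                        if (diskFallRecorded) {
                            // Maximum physical offset on disk - largest physical offset of this pull = commitLog bytes still left to pull
                            long fallBehind = maxOffsetPy - maxPhyOffsetPulling;
                            // Record the number of commitLog bytes still left to pull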
                            brokerStatsManager.recordDiskFallBehindSize(group, topic, queueId, fallBehind);
                        }

                        /*
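                         * 5 Compute the ConsumeQueue start offset for the next read and decide whether to suggest pulling from a SLAVE broker next time
                         */
                        // ConsumeQueue start offset for the next read
                        nextBeginOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);

                        // Maximum physical offset on disk - largest physical offset of this pull = commitLog bytes still left to pull
                        long diff = maxOffsetPy - maxPhyOffsetPulling;
                        // Memory the broker may use for messages: total physical memory * accessMessageInMemoryMaxRatio percent
                        long memory = (long) (StoreUtil.TOTAL_PHYSICAL_MEMORY_SIZE
                            * (this.messageStoreConfig.getAccessMessageInMemoryMaxRatio() / 100.0));
                        // If the bytes left to pull exceed that memory budget, suggest pulling from a SLAVE broker next time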
                        getResult.setSuggestPullingFromSlave(diff > memory);
                    } finally {
                        // Release the sliced buffer
                        bufferConsumeQueue.release();
                    }
                } else {
                    // No buffer obtained; the end of the current consumeQueue file may have been reached
                    status = GetMessageStatus.OFFSET_FOUND_NULL;
                    // Set nextBeginOffset to the start offset of the next consumeQueue file
                    nextBeginOffset = nextOffsetCorrection(offset, consumeQueue.rollNextFile(offset));
                    log.warn("consumer request topic: " + topic + "offset: " + offset + " minOffset: " + minOffset + " maxOffset: "
                        + maxOffset + ", but access logic queue failed.");
                }
            }
        } else {
            // No consumeQueue found
            status = GetMessageStatus.NO_MATCHED_LOGIC_QUEUE;
            // Set nextBeginOffset to 0
            nextBeginOffset = nextOffsetCorrection(offset, 0);
        }

        if (GetMessageStatus.FOUND == status) {
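            // Messages found: increment the hit counter getMessageTimesTotalFound by 1, for display in the console
            // The broker's TPS counts pull requests, not messages; by default a pushConsumer pulls 32 messages per request
            this.storeStatsService.getGetMessageTimesTotalFound().add(1);
        } else {
            // No messages found: increment the miss counter getMessageTimesTotalMiss by 1, for display in the console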
            this.storeStatsService.getGetMessageTimesTotalMiss().add(1);
        }
        //Compute the time spent on this pull
        long elapsedTime = this.getSystemClock().now() - beginTime;
        //Compare with the longest pull time getMessageEntireTimeMax and update it if this pull took longer
        this.storeStatsService.setGetMessageEntireTimeMax(elapsedTime);

        // lazy init no data found.
        if (getResult == null) {
            //No messages were found: lazily create an empty GetMessageResult
            getResult = new GetMessageResult(0);
        }

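        /*
         * 6 Set the properties of getResult and return it
         */
        //Set the pull status
        getResult.setStatus(status);
        //Set the logical consumeQueue offset at which the next pull starts
        getResult.setNextBeginOffset(nextBeginOffset);
        //Set the minimum and maximum logical offsets of the consumeQueue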
        getResult.setMaxOffset(maxOffset);
        getResult.setMinOffset(minOffset);
        return getResult;
    }
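
The two calculations in step 5 can be isolated into a small sketch. The class name, the method names, and the numbers used in `main` are hypothetical. The 20-byte entry size for `CQ_STORE_UNIT_SIZE` and the ratio of 40 are assumptions based on common RocketMQ 4.x defaults, so verify them against the version you run.

```java
// Simplified sketch of the step-5 arithmetic; this is not the broker's actual class.
class PullOffsetSketch {
    // Assumed size in bytes of one ConsumeQueue entry (ConsumeQueue.CQ_STORE_UNIT_SIZE).
    static final int CQ_STORE_UNIT_SIZE = 20;

    // bytesScanned plays the role of the loop variable i: the ConsumeQueue bytes traversed in this pull.
    static long nextBeginOffset(long offset, int bytesScanned) {
        return offset + (bytesScanned / CQ_STORE_UNIT_SIZE);
    }

    // Returns true when the unread commitLog bytes exceed the in-memory budget.
    static boolean suggestPullingFromSlave(long maxOffsetPy, long maxPhyOffsetPulling,
                                           long totalPhysicalMemory, int accessMessageInMemoryMaxRatio) {
        long diff = maxOffsetPy - maxPhyOffsetPulling;
        long memory = (long) (totalPhysicalMemory * (accessMessageInMemoryMaxRatio / 100.0));
        return diff > memory;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        // 32 entries of 20 bytes were scanned, starting at logical offset 100.
        System.out.println(nextBeginOffset(100, 32 * CQ_STORE_UNIT_SIZE));
        // 9 GiB unread against a budget of 40% of 16 GiB.
        System.out.println(suggestPullingFromSlave(10 * gib, 1 * gib, 16 * gib, 40));
        // 2 GiB unread against the same budget.
        System.out.println(suggestPullingFromSlave(10 * gib, 8 * gib, 16 * gib, 40));
    }
}
```

With these inputs the sketch prints 132, true, and false: a consumer that lags far behind the tail of the commitLog is likely to read cold data from disk, so the broker suggests moving that read load to a SLAVE.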

2.1.6 Broker handling of pull-message requests

BrokerController#registerProcessor registers the processor:

public void registerProcessor() {
	//other registrations omitted
	/**
	 * PullMessageProcessor
	 */
	this.remotingServer.registerProcessor(RequestCode.PULL_MESSAGE, this.pullMessageProcessor, this.pullMessageExecutor);
	this.pullMessageProcessor.registerConsumeMessageHook(consumeMessageHookList);
	//other registrations omitted
}
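
Conceptually, registering a processor binds a request code to a handler and to the thread pool that runs it. The following is a minimal sketch of that idea only. `ProcessorRegistrySketch`, its `handle` method, the `String` request and response types, and the value 11 for `PULL_MESSAGE` are all hypothetical stand-ins; the real remoting server works asynchronously with `RemotingCommand` objects and does not block for the result.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical, simplified registry: request code -> (processor, executor).
class ProcessorRegistrySketch {
    // Illustrative request code; the real constant lives in RequestCode.
    static final int PULL_MESSAGE = 11;

    private static final class Entry {
        final Function<String, String> processor;
        final ExecutorService executor;

        Entry(Function<String, String> processor, ExecutorService executor) {
            this.processor = processor;
            this.executor = executor;
        }
    }

    private final Map<Integer, Entry> processorTable = new ConcurrentHashMap<>();

    void registerProcessor(int requestCode, Function<String, String> processor, ExecutorService executor) {
        processorTable.put(requestCode, new Entry(processor, executor));
    }

    // Runs the processor registered for the code on its own executor and waits for the result.
    String handle(int requestCode, String request) {
        Entry entry = processorTable.get(requestCode);
        if (entry == null) {
            throw new IllegalArgumentException("request code " + requestCode + " not supported");
        }
        Callable<String> task = () -> entry.processor.apply(request);
        try {
            return entry.executor.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService pullMessageExecutor = Executors.newFixedThreadPool(2);
        ProcessorRegistrySketch server = new ProcessorRegistrySketch();
        server.registerProcessor(PULL_MESSAGE, req -> "pulled:" + req, pullMessageExecutor);
        System.out.println(server.handle(PULL_MESSAGE, "OrderTopic/0"));
        pullMessageExecutor.shutdown();
    }
}
```

Giving pull requests a dedicated executor keeps slow pulls from starving other request types, which is presumably why the broker passes `pullMessageExecutor` alongside the processor.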

PullMessageProcessor#processRequest handles the pull-message request:

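/**
     * Handles a pull-message request.
     * The steps are: build the filter information, pull the messages, process the pull result (respond directly or suspend the request), and commit the consume offset.
     * @param channel            the network channel
     * @param request            the request
     * @param brokerAllowSuspend whether the broker may suspend the request
     * @return the response, or null when the request is suspended or the response has already been written to the channel
     * @throws RemotingCommandException
     */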
    private RemotingCommand processRequest(final Channel channel, RemotingCommand request, boolean brokerAllowSuspend)
        throws RemotingCommandException {
        //Start time
        final long beginTimeMills = this.brokerController.getMessageStore().now();
        //Create the response command
        RemotingCommand response = RemotingCommand.createResponseCommand(PullMessageResponseHeader.class);
        //Get the response header
        final PullMessageResponseHeader responseHeader = (PullMessageResponseHeader) response.readCustomHeader();
        //Decode the request header
        final PullMessageRequestHeader requestHeader =
            (PullMessageRequestHeader) request.decodeCommandCustomHeader(PullMessageRequestHeader.class);

        //Copy the request id (opaque) so that the client can match the response to its request
        response.setOpaque(request.getOpaque());

        log.debug("receive PullMessage request command, {}", request);

        //Return immediately if the broker is not readable
        if (!PermName.isReadable(this.brokerController.getBrokerConfig().getBrokerPermission())) {
            response.setCode(ResponseCode.NO_PERMISSION);
            response.setRemark(String.format("the broker[%s] pulling message is forbidden", this.brokerController.getBrokerConfig().getBrokerIP1()));
            return response;
        }

        //Get the subscription group config for this consumerGroup
        SubscriptionGroupConfig subscriptionGroupConfig =
            this.brokerController.getSubscriptionGroupManager().findSubscriptionGroupConfig(requestHeader.getConsumerGroup());
        if (null == subscriptionGroupConfig) {
            response.setCode(ResponseCode.SUBSCRIPTION_GROUP_NOT_EXIST);
            response.setRemark(String.format("subscription group [%s] does not exist, %s", requestHeader.getConsumerGroup(), FAQUrl.suggestTodo(FAQUrl.SUBSCRIPTION_GROUP_NOT_EXIST)));
            return response;
        }

        //Return immediately if consumption is disabled for the group
        if (!subscriptionGroupConfig.isConsumeEnable()) {
            response.setCode(ResponseCode.NO_PERMISSION);
            response.setRemark("subscription group no permission, " + requestHeader.getConsumerGroup());
            return response;
        }

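        //Whether the client allows this request to be suspended
        final boolean hasSuspendFlag = PullSysFlag.hasSuspendFlag(requestHeader.getSysFlag());
        //Whether the request carries a consume offset to commit
        final boolean hasCommitOffsetFlag = PullSysFlag.hasCommitOffsetFlag(requestHeader.getSysFlag());
        //Whether the request carries subscription data, that is a TAG or SQL92 expression used to filter messages
        final boolean hasSubscriptionFlag = PullSysFlag.hasSubscriptionFlag(requestHeader.getSysFlag());

        //Longest time the broker may suspend the request; the consumer passes this value, 15s by default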
        final long suspendTimeoutMillisLong = hasSuspendFlag ? requestHeader.getSuspendTimeoutMillis() : 0;

        //Get the topic config
        TopicConfig topicConfig = this.brokerController.getTopicConfigManager().selectTopicConfig(requestHeader.getTopic());
        if (null == topicConfig) {
            log.error("the topic {} not exist, consumer: {}", requestHeader.getTopic(), RemotingHelper.parseChannelRemoteAddr(channel));
            response.setCode(ResponseCode.TOPIC_NOT_EXIST);
            response.setRemark(String.format("topic[%s] not exist, apply first please! %s", requestHeader.getTopic(), FAQUrl.suggestTodo(FAQUrl.APPLY_TOPIC_URL)));
            return response;
        }

        //Return immediately if the topic is not readable
        if (!PermName.isReadable(topicConfig.getPerm())) {
            response.setCode(ResponseCode.NO_PERMISSION);
            response.setRemark("the topic[" + requestHeader.getTopic() + "] pulling message is forbidden");
            return response;
        }

        //Validate the queue id in the request: return immediately if it is negative or not smaller than the topic's read queue count
        if (requestHeader.getQueueId() < 0 || requestHeader.getQueueId() >= topicConfig.getReadQueueNums()) {
            String errorInfo = String.format("queueId[%d] is illegal, topic:[%s] topicConfig.readQueueNums:[%d] consumer:[%s]",
                requestHeader.getQueueId(), requestHeader.getTopic(), topicConfig.getReadQueueNums(), channel.remoteAddress());
            log.warn(errorInfo);
            response.setCode(ResponseCode.SYSTEM_ERROR);
            response.setRemark(errorInfo);
            return response;
        }

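        /*
         * 1 Build the filter information
         * The actual filtering happens later, and both the broker and the consumer filter messages
         */
        SubscriptionData subscriptionData = null;
        ConsumerFilterData consumerFilterData = null;
        //With the subscription flag set, subscriptionData and consumerFilterData are rebuilt on every pull instead of being read from the broker's cache; the flag is usually false
        //It is only true when the consumer sets postSubscriptionWhenPull=true and the subscription is not in classFilter mode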
        if (hasSubscriptionFlag) {
            try {
                subscriptionData = FilterAPI.build(
                    requestHeader.getTopic(), requestHeader.getSubscription(), requestHeader.getExpressionType()
                );
                if (!ExpressionType.isTagType(subscriptionData.getExpressionType())) {
                    consumerFilterData = ConsumerFilterManager.build(
                        requestHeader.getTopic(), requestHeader.getConsumerGroup(), requestHeader.getSubscription(),
                        requestHeader.getExpressionType(), requestHeader.getSubVersion()
                    );
                    assert consumerFilterData != null;
                }
            } catch (Exception e) {
                log.warn("Parse the consumer's subscription[{}] failed, group: {}", requestHeader.getSubscription(),
                    requestHeader.getConsumerGroup());
                response.setCode(ResponseCode.SUBSCRIPTION_PARSE_FAILED);
                response.setRemark("parse the consumer's subscription failed");
                return response;
            }
        } else {
            //Get the consumer group info
            ConsumerGroupInfo consumerGroupInfo =
                this.brokerController.getConsumerManager().getConsumerGroupInfo(requestHeader.getConsumerGroup());
            if (null == consumerGroupInfo) {
                log.warn("the consumer's group info not exist, group: {}", requestHeader.getConsumerGroup());
                response.setCode(ResponseCode.SUBSCRIPTION_NOT_EXIST);
                response.setRemark("the consumer's group info not exist" + FAQUrl.suggestTodo(FAQUrl.SAME_GROUP_DIFFERENT_TOPIC));
                return response;
            }

            //Return immediately if the group is not allowed to consume in broadcast mode but the consumer uses broadcast mode
            if (!subscriptionGroupConfig.isConsumeBroadcastEnable()
                && consumerGroupInfo.getMessageModel() == MessageModel.BROADCASTING) {
                response.setCode(ResponseCode.NO_PERMISSION);
                response.setRemark("the consumer group[" + requestHeader.getConsumerGroup() + "] can not consume by broadcast way");
                return response;
            }

            //Get the subscription for this topic from the consumerGroupInfo cached on the broker
            subscriptionData = consumerGroupInfo.findSubscriptionData(requestHeader.getTopic());
            if (null == subscriptionData) {
                log.warn("the consumer's subscription not exist, group: {}, topic:{}", requestHeader.getConsumerGroup(), requestHeader.getTopic());
                response.setCode(ResponseCode.SUBSCRIPTION_NOT_EXIST);
                response.setRemark("the consumer's subscription not exist" + FAQUrl.suggestTodo(FAQUrl.SAME_GROUP_DIFFERENT_TOPIC));
                return response;
            }

            //Compare subscription versions: the broker's cached version must not be older than the one in the request
            if (subscriptionData.getSubVersion() < requestHeader.getSubVersion()) {
                log.warn("The broker's subscription is not latest, group: {} {}", requestHeader.getConsumerGroup(),
                    subscriptionData.getSubString());
                response.setCode(ResponseCode.SUBSCRIPTION_NOT_LATEST);
                response.setRemark("the consumer's subscription not latest");
                return response;
            }
            //If the subscription expression is not of TAG type, look up the cached consumerFilterData
            if (!ExpressionType.isTagType(subscriptionData.getExpressionType())) {
                consumerFilterData = this.brokerController.getConsumerFilterManager().get(requestHeader.getTopic(),
                    requestHeader.getConsumerGroup());
                if (consumerFilterData == null) {
                    response.setCode(ResponseCode.FILTER_DATA_NOT_EXIST);
                    response.setRemark("The broker's consumer filter data is not exist!Your expression may be wrong!");
                    return response;
                }
                if (consumerFilterData.getClientVersion() < requestHeader.getSubVersion()) {
                    log.warn("The broker's consumer filter data is not latest, group: {}, topic: {}, serverV: {}, clientV: {}",
                        requestHeader.getConsumerGroup(), requestHeader.getTopic(), consumerFilterData.getClientVersion(), requestHeader.getSubVersion());
                    response.setCode(ResponseCode.FILTER_DATA_NOT_LATEST);
                    response.setRemark("the consumer's consumer filter data not latest");
                    return response;
                }
            }
        }

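        //If the subscription expression is not of TAG type and enablePropertyFilter is off, SQL92 is not supported, so respond with SYSTEM_ERROR
        //In other words, a consumer that subscribes with SQL92 requires enablePropertyFilter=true to be set on the broker first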
        if (!ExpressionType.isTagType(subscriptionData.getExpressionType())
            && !this.brokerController.getBrokerConfig().isEnablePropertyFilter()) {
            response.setCode(ResponseCode.SYSTEM_ERROR);
            response.setRemark("The broker does not support consumer to filter message by " + subscriptionData.getExpressionType());
            return response;
        }

        MessageFilter messageFilter;
        //Whether the retry topic supports filtering; false by default, meaning retry topics are not filtered
        if (this.brokerController.getBrokerConfig().isFilterSupportRetry()) {
            messageFilter = new ExpressionForRetryMessageFilter(subscriptionData, consumerFilterData,
                this.brokerController.getConsumerFilterManager());
        } else {
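            //Create a plain ExpressionMessageFilter, which holds the subscription that the consumer reported through heartbeats
            //For a typical tag-based subscription consumerFilterData is null, and filtering relies on subscriptionData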
            messageFilter = new ExpressionMessageFilter(subscriptionData, consumerFilterData,
                this.brokerController.getConsumerFilterManager());
        }

        /*
         * 2 Pull a batch of messages through DefaultMessageStore#getMessage, which also applies the filter
         */
        final GetMessageResult getMessageResult =
            this.brokerController.getMessageStore().getMessage(requestHeader.getConsumerGroup(), requestHeader.getTopic(),
                requestHeader.getQueueId(), requestHeader.getQueueOffset(), requestHeader.getMaxMsgNums(), messageFilter);
        /*
         * 3 Process the pull result GetMessageResult
         */
        if (getMessageResult != null) {
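            //Set the name of the pull status enum as the remark
            response.setRemark(getMessageResult.getStatus().name());
            //Set the logical consumeQueue offset at which the next pull starts
            responseHeader.setNextBeginOffset(getMessageResult.getNextBeginOffset());
            //Set the minimum and maximum logical offsets of the consumeQueue, minOffset and maxOffset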
            responseHeader.setMinOffset(getMessageResult.getMinOffset());
            responseHeader.setMaxOffset(getMessageResult.getMaxOffset());

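            /*
             * 3.1 Decide whether the broker suggested for the next pull is the MASTER or a SLAVE
             */
            //Whether pulling from a SLAVE is suggested
            if (getMessageResult.isSuggestPullingFromSlave()) {
                //Suggest the brokerId configured by whichBrokerWhenConsumeSlowly, 1 by default, which is a SLAVE
                responseHeader.setSuggestWhichBrokerId(subscriptionGroupConfig.getWhichBrokerWhenConsumeSlowly());
            } else {
                //Otherwise suggest the MASTER, whose brokerId is 0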
                responseHeader.setSuggestWhichBrokerId(MixAll.MASTER_ID);
            }

            //Check the broker role
            switch (this.brokerController.getMessageStoreConfig().getBrokerRole()) {
                case ASYNC_MASTER:
                case SYNC_MASTER:
                    break;
                case SLAVE:
                    //This broker is a SLAVE and reading from slaves is disabled
                    if (!this.brokerController.getBrokerConfig().isSlaveReadEnable()) {
                        //Respond with PULL_RETRY_IMMEDIATELY so that the consumer retries the pull from the MASTER right away
                        response.setCode(ResponseCode.PULL_RETRY_IMMEDIATELY);
                        //Suggest the MASTER, whose brokerId is 0
                        responseHeader.setSuggestWhichBrokerId(MixAll.MASTER_ID);
                    }
                    break;
            }

            //If reading from slaves is enabled
            if (this.brokerController.getBrokerConfig().isSlaveReadEnable()) {
                // consume too slow ,redirect to another machine
                // If consumption is too slow, redirect the next pull to another broker; its id comes from whichBrokerWhenConsumeSlowly in subscriptionGroupConfig, 1 by default, which is a SLAVE
                if (getMessageResult.isSuggestPullingFromSlave()) {
                    responseHeader.setSuggestWhichBrokerId(subscriptionGroupConfig.getWhichBrokerWhenConsumeSlowly());
                }
                // consume ok
                else {
                    //The id comes from brokerId in subscriptionGroupConfig, 0 by default, which is the MASTER
                    responseHeader.setSuggestWhichBrokerId(subscriptionGroupConfig.getBrokerId());
                }
            } else {
                //If reading from slaves is disabled, suggest the MASTER, whose brokerId is 0
                responseHeader.setSuggestWhichBrokerId(MixAll.MASTER_ID);
            }

            /*
             * 3.2 Map the pull status to the corresponding response code
             */
            switch (getMessageResult.getStatus()) {
                case FOUND:
                    //Messages were found
                    response.setCode(ResponseCode.SUCCESS);
                    break;
                case MESSAGE_WAS_REMOVING:
                    //The message was not found in the commitLog (it is being removed)
                    response.setCode(ResponseCode.PULL_RETRY_IMMEDIATELY);
                    break;
                case NO_MATCHED_LOGIC_QUEUE:
                case NO_MESSAGE_IN_QUEUE:
                    //No consumeQueue was found, or the consumeQueue holds no messages
                    if (0 != requestHeader.getQueueOffset()) {
                        response.setCode(ResponseCode.PULL_OFFSET_MOVED);

                        // XXX: warn and notify me
                        log.info("the broker store no queue data, fix the request offset {} to {}, Topic: {} QueueId: {} Consumer Group: {}",
                            requestHeader.getQueueOffset(),
                            getMessageResult.getNextBeginOffset(),
                            requestHeader.getTopic(),
                            requestHeader.getQueueId(),
                            requestHeader.getConsumerGroup()
                        );
                    } else {
                        response.setCode(ResponseCode.PULL_NOT_FOUND);
                    }
                    break;
                case NO_MATCHED_MESSAGE:
                    //No message matched the filter
                    response.setCode(ResponseCode.PULL_RETRY_IMMEDIATELY);
                    break;
                case OFFSET_FOUND_NULL:
                    response.setCode(ResponseCode.PULL_NOT_FOUND);
                    break;
                case OFFSET_OVERFLOW_BADLY:
                    response.setCode(ResponseCode.PULL_OFFSET_MOVED);
                    // XXX: warn and notify me
                    log.info("the request offset: {} over flow badly, broker max offset: {}, consumer: {}",
                        requestHeader.getQueueOffset(), getMessageResult.getMaxOffset(), channel.remoteAddress());
                    break;
                case OFFSET_OVERFLOW_ONE:
                    response.setCode(ResponseCode.PULL_NOT_FOUND);
                    break;
                case OFFSET_TOO_SMALL:
                    response.setCode(ResponseCode.PULL_OFFSET_MOVED);
                    log.info("the request offset too small. group={}, topic={}, requestOffset={}, brokerMinOffset={}, clientIp={}",
                        requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueOffset(),
                        getMessageResult.getMinOffset(), channel.remoteAddress());
                    break;
                default:
                    assert false;
                    break;
            }

            /*
             * 3.3 If consume-message hooks are registered, run their consumeMessageBefore method
             */
            if (this.hasConsumeMessageHook()) {
                //Build the context
                ConsumeMessageContext context = new ConsumeMessageContext();
                context.setConsumerGroup(requestHeader.getConsumerGroup());
                context.setTopic(requestHeader.getTopic());
                context.setQueueId(requestHeader.getQueueId());

                String owner = request.getExtFields().get(BrokerStatsManager.COMMERCIAL_OWNER);

                switch (response.getCode()) {
                    case ResponseCode.SUCCESS:
                        int commercialBaseCount = brokerController.getBrokerConfig().getCommercialBaseCount();
                        int incValue = getMessageResult.getMsgCount4Commercial() * commercialBaseCount;

                        context.setCommercialRcvStats(BrokerStatsManager.StatsType.RCV_SUCCESS);
                        context.setCommercialRcvTimes(incValue);
                        context.setCommercialRcvSize(getMessageResult.getBufferTotalSize());
                        context.setCommercialOwner(owner);

                        break;
                    case ResponseCode.PULL_NOT_FOUND:
                        if (!brokerAllowSuspend) {

                            context.setCommercialRcvStats(BrokerStatsManager.StatsType.RCV_EPOLLS);
                            context.setCommercialRcvTimes(1);
                            context.setCommercialOwner(owner);

                        }
                        break;
                    case ResponseCode.PULL_RETRY_IMMEDIATELY:
                    case ResponseCode.PULL_OFFSET_MOVED:
                        context.setCommercialRcvStats(BrokerStatsManager.StatsType.RCV_EPOLLS);
                        context.setCommercialRcvTimes(1);
                        context.setCommercialOwner(owner);
                        break;
                    default:
                        assert false;
                        break;
                }

                /*
                 * Run the "before" hooks
                 */
                this.executeConsumeMessageHookBefore(context);
            }

            /*
             * 3.4 Depending on the response code, return the data directly or hold the request with short or long polling
             */
            switch (response.getCode()) {
                //The pull succeeded
                case ResponseCode.SUCCESS:

                    //Update some statistics
                    this.brokerController.getBrokerStatsManager().incGroupGetNums(requestHeader.getConsumerGroup(), requestHeader.getTopic(),
                        getMessageResult.getMessageCount());

                    this.brokerController.getBrokerStatsManager().incGroupGetSize(requestHeader.getConsumerGroup(), requestHeader.getTopic(),
                        getMessageResult.getBufferTotalSize());

                    this.brokerController.getBrokerStatsManager().incBrokerGetNums(getMessageResult.getMessageCount());
                    //Whether to copy the messages into heap memory first; true by default
                    if (this.brokerController.getBrokerConfig().isTransferMsgByHeap()) {
                        //Read the messages from the buffers into a byte array
                        final byte[] r = this.readGetMessageResult(getMessageResult, requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueId());
                        this.brokerController.getBrokerStatsManager().incGroupGetLatency(requestHeader.getConsumerGroup(),
                            requestHeader.getTopic(), requestHeader.getQueueId(),
                            (int) (this.brokerController.getMessageStore().now() - beginTimeMills));
                        //Set it as the response body
                        response.setBody(r);
                    } else {
                        try {
                            //Transfer the buffers directly through netty, without copying them to the heap
                            FileRegion fileRegion =
                                new ManyMessageTransfer(response.encodeHeader(getMessageResult.getBufferTotalSize()), getMessageResult);
                            channel.writeAndFlush(fileRegion).addListener(new ChannelFutureListener() {
                                @Override
                                public void operationComplete(ChannelFuture future) throws Exception {
                                    getMessageResult.release();
                                    if (!future.isSuccess()) {
                                        log.error("transfer many message by pagecache failed, {}", channel.remoteAddress(), future.cause());
                                    }
                                }
                            });
                        } catch (Throwable e) {
                            log.error("transfer many message by pagecache exception", e);
                            getMessageResult.release();
                        }

                        response = null;
                    }
                    break;
                //No messages were read
                case ResponseCode.PULL_NOT_FOUND:

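                    //If the broker allows suspension and the client supports it, the broker suspends the request for a while; if messages arrive in the meantime, the request is woken up to pull them and respond
                    if (brokerAllowSuspend && hasSuspendFlag) {
                        //Longest time the broker may suspend the request; the consumer passes this value, 15s by default
                        long pollingTimeMills = suspendTimeoutMillisLong;
                        //If long polling is disabled on the broker (it is enabled by default)
                        if (!this.brokerController.getBrokerConfig().isLongPollingEnable()) {
                            //Fall back to short polling: the suspension lasts shortPollingTimeMills, 1s by default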
                            pollingTimeMills = this.brokerController.getBrokerConfig().getShortPollingTimeMills();
                        }

                        String topic = requestHeader.getTopic();
                        long offset = requestHeader.getQueueOffset();
                        int queueId = requestHeader.getQueueId();
                        //Create a new pull request
                        PullRequest pullRequest = new PullRequest(request, channel, pollingTimeMills,
                            this.brokerController.getMessageStore().now(), offset, subscriptionData, messageFilter);
                        //Submit the PullRequest through pullRequestHoldService#suspendPullRequest; it is suspended and processed asynchronously
                        this.brokerController.getPullRequestHoldService().suspendPullRequest(topic, queueId, pullRequest);
                        response = null;
                        break;
                    }

                case ResponseCode.PULL_RETRY_IMMEDIATELY:
                    break;
                //The requested offset is invalid: too large or too small
                case ResponseCode.PULL_OFFSET_MOVED:
                    //If the broker is not a SLAVE, or it is a SLAVE that allows offset checking
                    if (this.brokerController.getMessageStoreConfig().getBrokerRole() != BrokerRole.SLAVE
                        || this.brokerController.getMessageStoreConfig().isOffsetCheckInSlave()) {
                        //Publish an offset-moved event
                        MessageQueue mq = new MessageQueue();
                        mq.setTopic(requestHeader.getTopic());
                        mq.setQueueId(requestHeader.getQueueId());
                        mq.setBrokerName(this.brokerController.getBrokerConfig().getBrokerName());

                        OffsetMovedEvent event = new OffsetMovedEvent();
                        event.setConsumerGroup(requestHeader.getConsumerGroup());
                        event.setMessageQueue(mq);
                        event.setOffsetRequest(requestHeader.getQueueOffset());
                        event.setOffsetNew(getMessageResult.getNextBeginOffset());
                        this.generateOffsetMovedEvent(event);
                        log.warn(
                            "PULL_OFFSET_MOVED:correction offset. topic={}, groupId={}, requestOffset={}, newOffset={}, suggestBrokerId={}",
                            requestHeader.getTopic(), requestHeader.getConsumerGroup(), event.getOffsetRequest(), event.getOffsetNew(),
                            responseHeader.getSuggestWhichBrokerId());
                    } else {
                        responseHeader.setSuggestWhichBrokerId(subscriptionGroupConfig.getBrokerId());
                        response.setCode(ResponseCode.PULL_RETRY_IMMEDIATELY);
                        log.warn("PULL_OFFSET_MOVED:none correction. topic={}, groupId={}, requestOffset={}, suggestBrokerId={}",
                            requestHeader.getTopic(), requestHeader.getConsumerGroup(), requestHeader.getQueueOffset(),
                            responseHeader.getSuggestWhichBrokerId());
                    }

                    break;
                default:
                    assert false;
            }
        } else {
            response.setCode(ResponseCode.SYSTEM_ERROR);
            response.setRemark("store getMessage return null");
        }

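        /*
         * 4 After the pull, whether or not messages were found, the broker commits the consume offset carried in the request, provided that the broker allows suspension, the request has the commit-offset flag, and the broker is not a SLAVE
         * The consumer client also reports its consume offsets on a timer, every 5s
         */
        //brokerAllowSuspend must be true: it is true for a new pull request and false for a request that was suspended and is being re-processed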
        boolean storeOffsetEnable = brokerAllowSuspend;
        storeOffsetEnable = storeOffsetEnable && hasCommitOffsetFlag;
        storeOffsetEnable = storeOffsetEnable
            && this.brokerController.getMessageStoreConfig().getBrokerRole() != BrokerRole.SLAVE;
        //If the offset may be stored
        if (storeOffsetEnable) {
            //Commit the offset
            this.brokerController.getConsumerOffsetManager().commitOffset(RemotingHelper.parseChannelRemoteAddr(channel),
                requestHeader.getConsumerGroup(), requestHeader.getTopic(), requestHeader.getQueueId(), requestHeader.getCommitOffset());
        }
        return response;
    }
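
The mapping in step 3.2 and the condition in step 4 are easier to review when pulled out of the long method. The sketch below mirrors them with hypothetical stand-in enums (`Status`, `Code`) and method names; it is a reading aid, not the broker's implementation, and it leaves out the later adjustments such as the SLAVE-specific handling of PULL_OFFSET_MOVED.

```java
// Hypothetical, trimmed-down mirror of step 3.2 (status -> response code) and step 4 (offset commit condition).
class PullResponseSketch {
    enum Status {
        FOUND, NO_MATCHED_MESSAGE, MESSAGE_WAS_REMOVING, OFFSET_FOUND_NULL, OFFSET_OVERFLOW_BADLY,
        OFFSET_OVERFLOW_ONE, OFFSET_TOO_SMALL, NO_MATCHED_LOGIC_QUEUE, NO_MESSAGE_IN_QUEUE
    }

    enum Code { SUCCESS, PULL_NOT_FOUND, PULL_RETRY_IMMEDIATELY, PULL_OFFSET_MOVED }

    static Code toResponseCode(Status status, long requestOffset) {
        switch (status) {
            case FOUND:
                return Code.SUCCESS;
            case MESSAGE_WAS_REMOVING:
            case NO_MATCHED_MESSAGE:
                return Code.PULL_RETRY_IMMEDIATELY;
            case NO_MATCHED_LOGIC_QUEUE:
            case NO_MESSAGE_IN_QUEUE:
                // A non-zero request offset on an empty or missing queue means the offset has to be corrected.
                return requestOffset != 0 ? Code.PULL_OFFSET_MOVED : Code.PULL_NOT_FOUND;
            case OFFSET_FOUND_NULL:
            case OFFSET_OVERFLOW_ONE:
                return Code.PULL_NOT_FOUND;
            case OFFSET_OVERFLOW_BADLY:
            case OFFSET_TOO_SMALL:
                return Code.PULL_OFFSET_MOVED;
            default:
                throw new IllegalArgumentException("unexpected status " + status);
        }
    }

    // The broker stores the offset carried in the request only when all three conditions hold.
    static boolean storeOffsetEnable(boolean brokerAllowSuspend, boolean hasCommitOffsetFlag, boolean brokerIsSlave) {
        return brokerAllowSuspend && hasCommitOffsetFlag && !brokerIsSlave;
    }

    public static void main(String[] args) {
        System.out.println(toResponseCode(Status.NO_MESSAGE_IN_QUEUE, 0));
        System.out.println(toResponseCode(Status.NO_MESSAGE_IN_QUEUE, 42));
        System.out.println(storeOffsetEnable(true, true, false));
    }
}
```

Only PULL_NOT_FOUND can lead to a suspended request, so OFFSET_FOUND_NULL and OFFSET_OVERFLOW_ONE are the statuses that typically end up in long polling.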

2.1.7 PullAPIWrapper#processPullResult

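processPullResult processes the pullResult: it decodes the messages, filters them, and sets additional properties.

The main steps are:

  1. Update the brokerId suggested for the next pull; the next pull reads it directly from pullFromWhichNodeTable.
  2. If messages were pulled:
    2.1 MessageDecoder#decodes parses the byte array into MessageExt objects;
    2.2 if the subscription has tags, filter the messages by tag;
    2.3 if message filter hooks are registered, run them; custom filtering logic can be plugged in here;
    2.4 set the transaction id on transactional messages, and set the minimum and maximum offset properties.
  3. Set the message byte array to null so that the memory can be reclaimed.

/**
     * Processes the pullResult: decodes the messages, filters them, and sets additional properties
     *
     * @param mq               the message queue
     * @param pullResult       the pull result
     * @param subscriptionData the SubscriptionData of the topic
     * @return the processed PullResult
     */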
    public PullResult processPullResult(final MessageQueue mq, final PullResult pullResult,
        final SubscriptionData subscriptionData) {
        PullResultExt pullResultExt = (PullResultExt) pullResult;

        //1 Update the brokerId suggested for the next pull; the next pull reads it directly from pullFromWhichNodeTable
        this.updatePullFromWhichNode(mq, pullResultExt.getSuggestWhichBrokerId());
        if (PullStatus.FOUND == pullResult.getPullStatus()) {
            //2 Decode the binary byte array into a Java List of messages
            ByteBuffer byteBuffer = ByteBuffer.wrap(pullResultExt.getMessageBinary());
            List<MessageExt> msgList = MessageDecoder.decodes(byteBuffer);

            List<MessageExt> msgListFilterAgain = msgList;
            //3 If the subscription has tags and is not in classFilterMode, filter the messages by tag; this is the client-side message filter
            if (!subscriptionData.getTagsSet().isEmpty() && !subscriptionData.isClassFilterMode()) {
                msgListFilterAgain = new ArrayList<MessageExt>(msgList.size());
                for (MessageExt msg : msgList) {
                    if (msg.getTags() != null) {
                        //This filter compares the tag strings with String#equals, whereas the broker compares the tag hash, that is the hashCode
                        if (subscriptionData.getTagsSet().contains(msg.getTags())) {
                            msgListFilterAgain.add(msg);
                        }
                    }
                }
            }

            //4 If message filter hooks are registered, run them; custom filtering logic can be plugged in here
            if (this.hasHook()) {
                FilterMessageContext filterMessageContext = new FilterMessageContext();
                filterMessageContext.setUnitMode(unitMode);
                filterMessageContext.setMsgList(msgListFilterAgain);
                this.executeHook(filterMessageContext);
            }

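            //5 Iterate over the messages that passed the filters and set their properties
            for (MessageExt msg : msgListFilterAgain) {
                //Transactional message flag
                String traFlag = msg.getProperty(MessageConst.PROPERTY_TRANSACTION_PREPARED);
                //For a transactional message, set the transaction id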
                if (Boolean.parseBoolean(traFlag)) {
                    msg.setTransactionId(msg.getProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX));
                }
                //Store the minimum and maximum offsets from the response in the msg properties
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_MIN_OFFSET,
                    Long.toString(pullResult.getMinOffset()));
                MessageAccessor.putProperty(msg, MessageConst.PROPERTY_MAX_OFFSET,
                    Long.toString(pullResult.getMaxOffset()));
                //Set the brokerName on the msg
                msg.setBrokerName(mq.getBrokerName());
            }

            //Store the filtered messages in msgFoundList
            pullResultExt.setMsgFoundList(msgListFilterAgain);
        }

        //6 The messages have been decoded, so set the byte array to null to release the memory
        pullResultExt.setMessageBinary(null);

        return pullResult;
    }
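
The reason for filtering a second time on the client is that a hash comparison can produce false positives. The sketch below illustrates this with two tags that share a Java `hashCode`. It is a simplified model under stated assumptions: messages are represented only by their tag strings, and `brokerSideMatch` and `clientSideFilter` are hypothetical names that imitate, but do not reproduce, the broker's hash check and the client's `tagsSet.contains` check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of two-stage tag filtering; messages are reduced to their tag strings.
class ClientTagFilterSketch {
    // Broker-style check (simplified): only the hash codes of the tags are compared.
    static boolean brokerSideMatch(Set<Integer> subscribedTagHashes, String msgTag) {
        return msgTag != null && subscribedTagHashes.contains(msgTag.hashCode());
    }

    // Client-style check: exact tag comparison; messages without a tag are dropped when tags are subscribed.
    static List<String> clientSideFilter(List<String> msgTags, Set<String> tagsSet) {
        if (tagsSet.isEmpty()) {
            return msgTags;
        }
        List<String> kept = new ArrayList<>(msgTags.size());
        for (String tag : msgTags) {
            if (tag != null && tagsSet.contains(tag)) {
                kept.add(tag);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> tagsSet = new HashSet<>(Arrays.asList("Aa"));
        Set<Integer> tagHashes = new HashSet<>(Arrays.asList("Aa".hashCode()));
        // "Aa" and "BB" have the same hashCode, so the hash check lets "BB" through.
        System.out.println(brokerSideMatch(tagHashes, "BB"));
        // The exact comparison on the client removes it again.
        System.out.println(clientSideFilter(Arrays.asList("Aa", "BB", null), tagsSet));
    }
}
```

The sketch prints true and then [Aa]: the hash check is cheap enough to run against ConsumeQueue entries on the broker, and the exact check on the client removes the rare collisions.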

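2.2 Consumer message consumption

The pull section above mentioned that PullCallback#onSuccess processes the pull result. RocketMQ handles concurrent consumption and orderly consumption separately.

2.2.1 Concurrent consumption

ConsumeMessageConcurrentlyService#submitConsumeRequest submits the messages to the consume thread pool: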

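    /**
     * Submits a concurrent consume request
     *
     * @param msgs              the pulled messages
     * @param processQueue      the process queue
     * @param messageQueue      the message queue
     * @param dispatchToConsume whether to dispatch for consumption; it has no effect on concurrent consumption
     */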
    @Override
    public void submitConsumeRequest(
        final List<MessageExt> msgs,
        final ProcessQueue processQueue,
        final MessageQueue messageQueue,
        final boolean dispatchToConsume) {
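        //Number of messages consumed per batch, 1 by default
        final int consumeBatchSize = this.defaultMQPushConsumer.getConsumeMessageBatchMaxSize();
        //If the number of messages <= the batch size, consume them all in one request
        if (msgs.size() <= consumeBatchSize) {
            //Build a consume request that holds all the messages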
            ConsumeRequest consumeRequest = new ConsumeRequest(msgs, processQueue, messageQueue);
            try {
                //Submit the request to the consumeExecutor thread pool for consumption
                this.consumeExecutor.submit(consumeRequest);
            } catch (RejectedExecutionException e) {
                //The thread pool rejected the task: resubmit it after a 5s delay instead of dropping it
                this.submitConsumeRequestLater(consumeRequest);
            }
        }
        //If the number of messages > the batch size, split them and submit them batch by batch
        else {
            for (int total = 0; total < msgs.size(); ) {
                //One batch of messages, holding at most consumeBatchSize messages, 1 by default
                List<MessageExt> msgThis = new ArrayList<MessageExt>(consumeBatchSize);
                for (int i = 0; i < consumeBatchSize; i++, total++) {
                    if (total < msgs.size()) {
                        msgThis.add(msgs.get(total));
                    } else {
                        break;
                    }
                }

                //build a ConsumeRequest for this batch
                ConsumeRequest consumeRequest = new ConsumeRequest(msgThis, processQueue, messageQueue);
                try {
                    //submit the request to the consumeExecutor thread pool
                    this.consumeExecutor.submit(consumeRequest);
                } catch (RejectedExecutionException e) {
                    //on rejection, fold all remaining messages into this batch so none of them is lost
                    for (; total < msgs.size(); total++) {
                        msgThis.add(msgs.get(total));
                    }

                    //the thread pool rejected the task: resubmit it after a 5s delay instead of dropping it
                    this.submitConsumeRequestLater(consumeRequest);
                }
            }
        }
    }
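
The batching rule above can be isolated from the thread pool. The following is a minimal sketch, not RocketMQ code: the class name BatchSplitSketch and the method split are hypothetical, and batchSize is assumed to be greater than 0 (it plays the role of consumeMessageBatchMaxSize).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified illustration of the batching rule only; not the RocketMQ class itself.
class BatchSplitSketch {

    // Splits msgs into consecutive batches of at most batchSize elements.
    // Assumes batchSize > 0.
    static <T> List<List<T>> split(List<T> msgs, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < msgs.size(); from += batchSize) {
            int to = Math.min(from + batchSize, msgs.size());
            // copy the view so every batch is an independent list
            batches.add(new ArrayList<>(msgs.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> pulled = Arrays.asList(1, 2, 3, 4, 5);
        // batch size 2: three consume requests would be submitted
        System.out.println(split(pulled, 2)); // prints [[1, 2], [3, 4], [5]]
        // batch size 1 (the default): one consume request per message
        System.out.println(split(pulled, 1).size()); // prints 5
    }
}
```

With the default batch size of 1, a pull that returns 32 messages turns into 32 consume requests, which is why the listener normally sees a single-element list.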

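ConsumeMessageConcurrentlyService.ConsumeRequest#run executes the consume hooks, invokes the real listener to consume the messages, and processes the consume result:

/**
         * Run concurrent consumption
         */
        @Override
        public void run() {
            //if the process queue has been dropped (for example, rebalancing assigned this queue to a newly started consumer), return without consuming, to reduce duplicate consumption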
            if (this.processQueue.isDropped()) {
                log.info("the message queue not be able to consume, because it's dropped. group={} {}", ConsumeMessageConcurrentlyService.this.consumerGroup, this.messageQueue);
                return;
            }

            /*
             * 1 Get the concurrent message listener. In push mode this is the class we implement and register through registerMessageListener; it contains the business logic to run
             */
            MessageListenerConcurrently listener = ConsumeMessageConcurrentlyService.this.messageListener;
            ConsumeConcurrentlyContext context = new ConsumeConcurrentlyContext(messageQueue);
            ConsumeConcurrentlyStatus status = null;
            //reset the retry topic
            defaultMQPushConsumerImpl.resetRetryAndNamespace(msgs, defaultMQPushConsumer.getConsumerGroup());

            /*
             * 2 If consume hooks are registered, run their pre-consume method consumeMessageBefore
             * We can register a ConsumeMessageHook that is called before and after messages are consumed
             */
            ConsumeMessageContext consumeMessageContext = null;
            if (ConsumeMessageConcurrentlyService.this.defaultMQPushConsumerImpl.hasHook()) {
                consumeMessageContext = new ConsumeMessageContext();
                consumeMessageContext.setNamespace(defaultMQPushConsumer.getNamespace());
                consumeMessageContext.setConsumerGroup(defaultMQPushConsumer.getConsumerGroup());
                consumeMessageContext.setProps(new HashMap<String, String>());
                consumeMessageContext.setMq(messageQueue);
                consumeMessageContext.setMsgList(msgs);
                consumeMessageContext.setSuccess(false);
                ConsumeMessageConcurrentlyService.this.defaultMQPushConsumerImpl.executeHookBefore(consumeMessageContext);
            }

            //start timestamp
            long beginTimestamp = System.currentTimeMillis();
            boolean hasException = false;
            //consume return type, initialized to SUCCESS
            ConsumeReturnType returnType = ConsumeReturnType.SUCCESS;
            try {
                if (msgs != null && !msgs.isEmpty()) {
                    //set the consume start time on every message
                    for (MessageExt msg : msgs) {
                        MessageAccessor.setConsumeStartTimeStamp(msg, String.valueOf(System.currentTimeMillis()));
                    }
                }
                /*
                 * 3 Call listener#consumeMessage to consume the messages, i.e. run the actual business logic, and get the resulting status
                 * There are two statuses: ConsumeConcurrentlyStatus.CONSUME_SUCCESS and ConsumeConcurrentlyStatus.RECONSUME_LATER
                 */
                status = listener.consumeMessage(Collections.unmodifiableList(msgs), context);
            } catch (Throwable e) {
                log.warn(String.format("consumeMessage exception: %s Group: %s Msgs: %s MQ: %s",
                    RemotingHelper.exceptionSimpleDesc(e),
                    ConsumeMessageConcurrentlyService.this.consumerGroup,
                    msgs,
                    messageQueue), e);
                //an exception was thrown, so set the exception flag
                hasException = true;
            }
            /*
             * 4 Evaluate the returned status
             */
            //compute the time spent consuming
            long consumeRT = System.currentTimeMillis() - beginTimestamp;
            //status is null
            if (null == status) {
                //the business logic threw an exception
                if (hasException) {
                    //set returnType to EXCEPTION
                    returnType = ConsumeReturnType.EXCEPTION;
                } else {
                    //set returnType to RETURNNULL
                    returnType = ConsumeReturnType.RETURNNULL;
                }
            }
            //consumeRT is greater than or equal to consumeTimeout (default 15 min)
            else if (consumeRT >= defaultMQPushConsumer.getConsumeTimeout() * 60 * 1000) {
                //set returnType to TIME_OUT
                returnType = ConsumeReturnType.TIME_OUT;
            }
            //status is RECONSUME_LATER, i.e. consumption failed
            else if (ConsumeConcurrentlyStatus.RECONSUME_LATER == status) {
                //set returnType to FAILED
                returnType = ConsumeReturnType.FAILED;
            }
            //status is CONSUME_SUCCESS, i.e. consumption succeeded
            else if (ConsumeConcurrentlyStatus.CONSUME_SUCCESS == status) {
                //set returnType to SUCCESS
                returnType = ConsumeReturnType.SUCCESS;
            }

            //if hooks are registered, store returnType in the hook context
            if (ConsumeMessageConcurrentlyService.this.defaultMQPushConsumerImpl.hasHook()) {
                consumeMessageContext.getProps().put(MixAll.CONSUME_CONTEXT_TYPE, returnType.name());
            }

            //if status is null
            if (null == status) {
                log.warn("consumeMessage return null, Group: {} Msgs: {} MQ: {}",
                    ConsumeMessageConcurrentlyService.this.consumerGroup,
                    msgs,
                    messageQueue);
                //set status to RECONSUME_LATER, i.e. treat it as a failed consumption
                status = ConsumeConcurrentlyStatus.RECONSUME_LATER;
            }

            /*
             * 5 If consume hooks are registered, run their post-consume method consumeMessageAfter
             * We can register a ConsumeMessageHook that is called before and after messages are consumed
             */
            if (ConsumeMessageConcurrentlyService.this.defaultMQPushConsumerImpl.hasHook()) {
                consumeMessageContext.setStatus(status.toString());
                consumeMessageContext.setSuccess(ConsumeConcurrentlyStatus.CONSUME_SUCCESS == status);
                ConsumeMessageConcurrentlyService.this.defaultMQPushConsumerImpl.executeHookAfter(consumeMessageContext);
            }

            //record the consume time in the statistics
            ConsumeMessageConcurrentlyService.this.getConsumerStatsManager()
                .incConsumeRT(ConsumeMessageConcurrentlyService.this.consumerGroup, messageQueue.getTopic(), consumeRT);

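            /*
             * 6 If the process queue has not been dropped, call ConsumeMessageConcurrentlyService#processConsumeResult to process the consume result, including the retry logic.
             *
             * Note: if the queue is dropped after listener#consumeMessage has run (the business logic has executed) but before the result is processed,
             * for example because rebalancing assigned the queue to a newly started consumer, then dropped=true and the final result processing is skipped.
             * The offset is not advanced, so the messages will be consumed again. The business logic must therefore be idempotent!
             */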
            if (!processQueue.isDropped()) {
                ConsumeMessageConcurrentlyService.this.processConsumeResult(status, context, this);
            } else {
                log.warn("processQueue is dropped without process consume result. messageQueue={}, msgs={}", messageQueue, msgs);
            }
        }
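
The ordering of the checks in step 4 matters: the timeout check runs before the status checks, so a listener that returns CONSUME_SUCCESS after the timeout is reported to hooks as TIME_OUT while its status remains CONSUME_SUCCESS. A minimal standalone sketch of that classification follows; the enums and the class ReturnTypeSketch are local stand-ins defined here for illustration, not the RocketMQ types.

```java
// Standalone illustration of the returnType classification; the enums below
// are local stand-ins and not the RocketMQ classes.
class ReturnTypeSketch {

    enum Status { CONSUME_SUCCESS, RECONSUME_LATER }

    enum ReturnType { SUCCESS, TIME_OUT, EXCEPTION, RETURNNULL, FAILED }

    // Mirrors the order of the checks: null status first, then timeout, then the status itself.
    static ReturnType classify(Status status, boolean hasException,
                               long consumeRtMillis, long timeoutMinutes) {
        if (status == null) {
            return hasException ? ReturnType.EXCEPTION : ReturnType.RETURNNULL;
        }
        if (consumeRtMillis >= timeoutMinutes * 60 * 1000) {
            return ReturnType.TIME_OUT;
        }
        if (status == Status.RECONSUME_LATER) {
            return ReturnType.FAILED;
        }
        return ReturnType.SUCCESS;
    }

    public static void main(String[] args) {
        // listener threw: status stays null and the exception flag is set
        System.out.println(classify(null, true, 10, 15));
        // listener returned null without throwing
        System.out.println(classify(null, false, 10, 15));
        // listener succeeded, but only after 16 minutes
        System.out.println(classify(Status.CONSUME_SUCCESS, false, 16L * 60 * 1000, 15));
    }
}
```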

ConsumeMessageConcurrentlyService#processConsumeResult processes the consume result:

/**
     * Process the consume result
     *
     * @param status         the consume status
     * @param context        the consume context
     * @param consumeRequest the consume request
     */
    public void processConsumeResult(
        final ConsumeConcurrentlyStatus status,
        final ConsumeConcurrentlyContext context,
        final ConsumeRequest consumeRequest
    ) {
        //ackIndex defaults to Integer.MAX_VALUE; it is the index in the message list of the last successfully consumed message
        int ackIndex = context.getAckIndex();

        //return immediately if there are no messages
        if (consumeRequest.getMsgs().isEmpty())
            return;

        /*
         * 1 Set ackIndex according to the consume status
         * Success: ackIndex = message count - 1
         * Failure: ackIndex = -1
         */
        switch (status) {
            //consumption succeeded
            case CONSUME_SUCCESS:
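                //if ackIndex is greater than or equal to the message count, clamp it to message count - 1
                //the initial value is Integer.MAX_VALUE, so it normally ends up as message count - 1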
                if (ackIndex >= consumeRequest.getMsgs().size()) {
                    ackIndex = consumeRequest.getMsgs().size() - 1;
                }
                //number of successfully consumed messages, normally the full message count
                int ok = ackIndex + 1;
                //number of failed messages, normally 0
                int failed = consumeRequest.getMsgs().size() - ok;
                //statistics
                this.getConsumerStatsManager().incConsumeOKTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), ok);
                this.getConsumerStatsManager().incConsumeFailedTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), failed);
                break;
            //consumption failed
            case RECONSUME_LATER:
                //set ackIndex to -1 so that every message counts as failed
                ackIndex = -1;
                //statistics
                this.getConsumerStatsManager().incConsumeFailedTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(),
                    consumeRequest.getMsgs().size());
                break;
            default:
                break;
        }

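        /*
         * 2 Handle failed messages according to the message model
         * Broadcasting: only log them
         * Clustering: send the message back to the broker as a delayed message and wait for a retry
         */
        switch (this.defaultMQPushConsumer.getMessageModel()) {
            //broadcasting mode
            case BROADCASTING:
                //starting right after ackIndex, only log the failed messages; they are not retried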
                for (int i = ackIndex + 1; i < consumeRequest.getMsgs().size(); i++) {
                    MessageExt msg = consumeRequest.getMsgs().get(i);
                    log.warn("BROADCASTING, the message consume failed, drop it, {}", msg.toString());
                }
                break;
            //clustering mode
            case CLUSTERING:
                List<MessageExt> msgBackFailed = new ArrayList<MessageExt>(consumeRequest.getMsgs().size());
                //iterate over the messages starting right after ackIndex
                for (int i = ackIndex + 1; i < consumeRequest.getMsgs().size(); i++) {
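                    //get the message at this index
                    MessageExt msg = consumeRequest.getMsgs().get(i);
                    /*
                     * 2.1 After a failed consumption, send the message back to the retry queue for delayed consumption
                     */
                    boolean result = this.sendMessageBack(msg, context);
                    //if sending it back failed
                    if (!result) {
                        //increment the reconsume count
                        msg.setReconsumeTimes(msg.getReconsumeTimes() + 1);
                        //add it to the list of messages that could not be sent back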
                        msgBackFailed.add(msg);
                    }
                }

                if (!msgBackFailed.isEmpty()) {
                    //remove from consumeRequest the messages that failed and could not be sent back to the broker
                    consumeRequest.getMsgs().removeAll(msgBackFailed);

                    /*
                     * 2.2 Call submitConsumeRequestLater to resubmit, after a 5s delay, the messages whose sendMessageBack failed to consumeExecutor
                     */
                    this.submitConsumeRequestLater(msgBackFailed, consumeRequest.getProcessQueue(), consumeRequest.getMessageQueue());
                }
                break;
            default:
                break;
        }

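        /*
         * 3 Remove from the process queue's msgTreeMap the messages that were consumed successfully, plus those that failed but were sent back to the broker, then return the smallest offset left in msgTreeMap
         */
        long offset = consumeRequest.getProcessQueue().removeMessage(consumeRequest.getMsgs());
        if (offset >= 0 && !consumeRequest.getProcessQueue().isDropped()) {
            //try to update the latest offset in the in-memory offsetTable; the third argument (increaseOnly) is true, so the offset only moves forward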
            this.defaultMQPushConsumerImpl.getOffsetStore().updateOffset(consumeRequest.getMessageQueue(), offset, true);
        }
    }
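
Step 3 is the reason one slow message can hold back the committed offset: the offset that gets stored is the smallest offset still pending, not the largest one consumed. The sketch below models only that rule with a TreeMap. The class OffsetAdvanceSketch and its methods are hypothetical, and the "largest offset seen + 1" result for an emptied map is an assumption about how the real ProcessQueue#removeMessage behaves.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Simplified model of "remove consumed messages, return the next offset to commit".
// Not the RocketMQ ProcessQueue; names and the empty-map rule are assumptions.
class OffsetAdvanceSketch {

    // pending messages keyed by queue offset, kept sorted like msgTreeMap
    private final TreeMap<Long, String> pending = new TreeMap<>();
    private long maxOffsetSeen = -1L;

    void put(long offset, String body) {
        pending.put(offset, body);
        maxOffsetSeen = Math.max(maxOffsetSeen, offset);
    }

    // Removes the consumed offsets and returns the offset that is safe to commit,
    // or -1 when nothing was pending at all.
    long remove(List<Long> consumed) {
        long result = -1L;
        if (!pending.isEmpty()) {
            // if everything gets removed, the next offset is one past the largest seen
            result = maxOffsetSeen + 1;
            for (Long offset : consumed) {
                pending.remove(offset);
            }
            // otherwise the smallest pending offset bounds the commit point
            if (!pending.isEmpty()) {
                result = pending.firstKey();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        OffsetAdvanceSketch queue = new OffsetAdvanceSketch();
        queue.put(100, "a");
        queue.put(101, "b");
        queue.put(102, "c");
        // 101 and 102 finish first, but 100 is still being consumed
        System.out.println(queue.remove(Arrays.asList(101L, 102L))); // prints 100
        // once 100 finishes, the commit point jumps past everything
        System.out.println(queue.remove(Arrays.asList(100L))); // prints 103
    }
}
```

If the consumer restarts while offset 100 is still pending, consumption resumes from 100, so 101 and 102 are delivered again. This is another reason concurrent consumers need idempotent business logic.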

2.2.2 Orderly consumption

ConsumeMessageOrderlyService#submitConsumeRequest submits the consume request:

	@Override
    public void submitConsumeRequest(
        final List<MessageExt> msgs,
        final ProcessQueue processQueue,
        final MessageQueue messageQueue,
        final boolean dispathToConsume) {
        if (dispathToConsume) {
            ConsumeRequest consumeRequest = new ConsumeRequest(processQueue, messageQueue);
            this.consumeExecutor.submit(consumeRequest);
        }
    }

ConsumeMessageOrderlyService.ConsumeRequest#run executes the consume hooks, invokes the real listener to consume the messages, and processes the consume result:

@Override
        public void run() {
            if (this.processQueue.isDropped()) {
                log.warn("run, the message queue not be able to consume, because it's dropped. {}", this.messageQueue);
                return;
            }

            final Object objLock = messageQueueLock.fetchLockObject(this.messageQueue);
            synchronized (objLock) {
                //broadcasting mode, or the queue lock is held and has not expired
                if (MessageModel.BROADCASTING.equals(ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.messageModel())
                    || (this.processQueue.isLocked() && !this.processQueue.isLockExpired())) {
                    final long beginTime = System.currentTimeMillis();
                    for (boolean continueConsume = true; continueConsume; ) {
                        //stop if the process queue has been dropped
                        if (this.processQueue.isDropped()) {
                            log.warn("the message queue not be able to consume, because it's dropped. {}", this.messageQueue);
                            break;
                        }
                        //in clustering mode, if the queue is not locked, try to lock it later and consume then
                        if (MessageModel.CLUSTERING.equals(ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.messageModel())
                            && !this.processQueue.isLocked()) {
                            log.warn("the message queue not locked, so consume later, {}", this.messageQueue);
                            ConsumeMessageOrderlyService.this.tryLockLaterAndReconsume(this.messageQueue, this.processQueue, 10);
                            break;
                        }
                        //in clustering mode, if the lock has expired, try to lock it later and consume then
                        if (MessageModel.CLUSTERING.equals(ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.messageModel())
                            && this.processQueue.isLockExpired()) {
                            log.warn("the message queue lock expired, so consume later, {}", this.messageQueue);
                            ConsumeMessageOrderlyService.this.tryLockLaterAndReconsume(this.messageQueue, this.processQueue, 10);
                            break;
                        }

                        long interval = System.currentTimeMillis() - beginTime;
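                        //if this run has been consuming continuously for too long, stop and resubmit the request later
                        //default 60s, configured via rocketmq.client.maxTimeConsumeContinuously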
                        if (interval > MAX_TIME_CONSUME_CONTINUOUSLY) {
                            ConsumeMessageOrderlyService.this.submitConsumeRequestLater(processQueue, messageQueue, 10);
                            break;
                        }

                        //number of messages consumed per batch, configured via DefaultMQPushConsumer.consumeMessageBatchMaxSize
                        final int consumeBatchSize =
                            ConsumeMessageOrderlyService.this.defaultMQPushConsumer.getConsumeMessageBatchMaxSize();
                        //take messages from the process queue
                        List<MessageExt> msgs = this.processQueue.takeMessages(consumeBatchSize);
                        //restore the original topic on retry messages
                        defaultMQPushConsumerImpl.resetRetryAndNamespace(msgs, defaultMQPushConsumer.getConsumerGroup());
                        if (!msgs.isEmpty()) {
                            final ConsumeOrderlyContext context = new ConsumeOrderlyContext(this.messageQueue);

                            ConsumeOrderlyStatus status = null;

                            ConsumeMessageContext consumeMessageContext = null;
                            //run the pre-consume hooks
                            if (ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.hasHook()) {
                                consumeMessageContext = new ConsumeMessageContext();
                                consumeMessageContext
                                    .setConsumerGroup(ConsumeMessageOrderlyService.this.defaultMQPushConsumer.getConsumerGroup());
                                consumeMessageContext.setNamespace(defaultMQPushConsumer.getNamespace());
                                consumeMessageContext.setMq(messageQueue);
                                consumeMessageContext.setMsgList(msgs);
                                consumeMessageContext.setSuccess(false);
                                // init the consume context type
                                consumeMessageContext.setProps(new HashMap<String, String>());
                                ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.executeHookBefore(consumeMessageContext);
                            }

                            long beginTimestamp = System.currentTimeMillis();
                            ConsumeReturnType returnType = ConsumeReturnType.SUCCESS;
                            boolean hasException = false;
                            try {
                                //acquire the consume lock
                                this.processQueue.getConsumeLock().lock();
                                //stop if the process queue has been dropped
                                if (this.processQueue.isDropped()) {
                                    log.warn("consumeMessage, the message queue not be able to consume, because it's dropped. {}",
                                        this.messageQueue);
                                    break;
                                }
                                //actually consume the messages
                                status = messageListener.consumeMessage(Collections.unmodifiableList(msgs), context);
                            } catch (Throwable e) {
                                log.warn(String.format("consumeMessage exception: %s Group: %s Msgs: %s MQ: %s",
                                    RemotingHelper.exceptionSimpleDesc(e),
                                    ConsumeMessageOrderlyService.this.consumerGroup,
                                    msgs,
                                    messageQueue), e);
                                hasException = true;
                            } finally {
                                //release the consume lock
                                this.processQueue.getConsumeLock().unlock();
                            }

                            //status is null, ROLLBACK, or SUSPEND_CURRENT_QUEUE_A_MOMENT
                            if (null == status
                                || ConsumeOrderlyStatus.ROLLBACK == status
                                || ConsumeOrderlyStatus.SUSPEND_CURRENT_QUEUE_A_MOMENT == status) {
                                log.warn("consumeMessage Orderly return not OK, Group: {} Msgs: {} MQ: {}",
                                    ConsumeMessageOrderlyService.this.consumerGroup,
                                    msgs,
                                    messageQueue);
                            }

                            long consumeRT = System.currentTimeMillis() - beginTimestamp;
                            if (null == status) {
                                if (hasException) {
                                    returnType = ConsumeReturnType.EXCEPTION;
                                } else {
                                    returnType = ConsumeReturnType.RETURNNULL;
                                }
                            } else if (consumeRT >= defaultMQPushConsumer.getConsumeTimeout() * 60 * 1000) {
                                returnType = ConsumeReturnType.TIME_OUT;
                            } else if (ConsumeOrderlyStatus.SUSPEND_CURRENT_QUEUE_A_MOMENT == status) {
                                returnType = ConsumeReturnType.FAILED;
                            } else if (ConsumeOrderlyStatus.SUCCESS == status) {
                                returnType = ConsumeReturnType.SUCCESS;
                            }

                            if (ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.hasHook()) {
                                consumeMessageContext.getProps().put(MixAll.CONSUME_CONTEXT_TYPE, returnType.name());
                            }

                            //a null status is treated as SUSPEND_CURRENT_QUEUE_A_MOMENT
                            if (null == status) {
                                status = ConsumeOrderlyStatus.SUSPEND_CURRENT_QUEUE_A_MOMENT;
                            }

                            //run the post-consume hooks
                            if (ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.hasHook()) {
                                consumeMessageContext.setStatus(status.toString());
                                consumeMessageContext
                                    .setSuccess(ConsumeOrderlyStatus.SUCCESS == status || ConsumeOrderlyStatus.COMMIT == status);
                                ConsumeMessageOrderlyService.this.defaultMQPushConsumerImpl.executeHookAfter(consumeMessageContext);
                            }

                            //record the consume time in the statistics
                            ConsumeMessageOrderlyService.this.getConsumerStatsManager()
                                .incConsumeRT(ConsumeMessageOrderlyService.this.consumerGroup, messageQueue.getTopic(), consumeRT);

                            //process the consume result
                            continueConsume = ConsumeMessageOrderlyService.this.processConsumeResult(msgs, status, context, this);
                        } else {
                            continueConsume = false;
                        }
                    }
                } else {
                    if (this.processQueue.isDropped()) {
                        log.warn("the message queue not be able to consume, because it's dropped. {}", this.messageQueue);
                        return;
                    }
                    //try to lock the queue later and consume again
                    ConsumeMessageOrderlyService.this.tryLockLaterAndReconsume(this.messageQueue, this.processQueue, 100);
                }
            }
        }
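
The synchronized block at the top of run relies on one lock object per message queue, so that only one thread of this consumer processes a given queue at a time while different queues still run in parallel. A minimal sketch of such a lock table follows. It assumes a string key per queue, and QueueLockSketch is a hypothetical stand-in rather than the real MessageQueueLock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical per-queue lock table; illustrates the idea behind fetchLockObject.
class QueueLockSketch {

    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();

    // Returns the same lock object for the same queue key, creating it on first use.
    Object fetchLockObject(String queueKey) {
        return locks.computeIfAbsent(queueKey, key -> new Object());
    }

    public static void main(String[] args) {
        QueueLockSketch table = new QueueLockSketch();
        Object first = table.fetchLockObject("OrderTopic@broker-a@0");
        Object again = table.fetchLockObject("OrderTopic@broker-a@0");
        Object other = table.fetchLockObject("OrderTopic@broker-a@1");
        System.out.println(first == again); // prints true
        System.out.println(first == other); // prints false

        // consuming threads serialize on the lock of their own queue only
        synchronized (first) {
            System.out.println("consuming queue 0 in order");
        }
    }
}
```

This local lock only orders threads inside one consumer process. Ordering across consumers in clustering mode additionally depends on the broker-side queue lock that the isLocked and isLockExpired checks refer to.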

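ConsumeMessageOrderlyService#processConsumeResult processes the consume result:

 /**
     * Process the consume result
     *
     * @param msgs           the messages
     * @param status         the consume status
     * @param context        the orderly consume context
     * @param consumeRequest the consume request
     * @return whether to continue consuming this queue
     */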
    public boolean processConsumeResult(
        final List<MessageExt> msgs,
        final ConsumeOrderlyStatus status,
        final ConsumeOrderlyContext context,
        final ConsumeRequest consumeRequest
    ) {
        //whether to continue consuming
        boolean continueConsume = true;
        //offset to commit
        long commitOffset = -1L;
        //autoCommit defaults to true
        if (context.isAutoCommit()) {
            switch (status) {
                case COMMIT:
                case ROLLBACK:
                    log.warn("the message queue consume result is illegal, we think you want to ack these message {}",
                        consumeRequest.getMessageQueue());
                case SUCCESS:
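                    //commit the messages being consumed (this also updates the message count and size counters) and get the offset to store
                    commitOffset = consumeRequest.getProcessQueue().commit();
                    //update statistics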
                    this.getConsumerStatsManager().incConsumeOKTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), msgs.size());
                    break;
                //consumption failed
                case SUSPEND_CURRENT_QUEUE_A_MOMENT:
                    //statistics
                    this.getConsumerStatsManager().incConsumeFailedTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), msgs.size());
                    //check whether the maximum retry count has been reached; configurable via DefaultMQPushConsumer#maxReconsumeTimes, unlimited by default (Integer.MAX_VALUE)
                    if (checkReconsumeTimes(msgs)) {
                        //the maximum retry count has not been reached
                        //mark the messages to be consumed again
                        consumeRequest.getProcessQueue().makeMessageToConsumeAgain(msgs);
                        this.submitConsumeRequestLater(
                            consumeRequest.getProcessQueue(),
                            consumeRequest.getMessageQueue(),
                            context.getSuspendCurrentQueueTimeMillis());
                        //end this consume request here; it does not continue consuming
                        continueConsume = false;
                    } else {
                        //the maximum retry count has been reached, so commit the messages and treat them as consumed
                        commitOffset = consumeRequest.getProcessQueue().commit();
                    }
                    break;
                default:
                    break;
            }
        } else {
            switch (status) {
                case SUCCESS:
                    this.getConsumerStatsManager().incConsumeOKTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), msgs.size());
                    break;
                case COMMIT:
                    commitOffset = consumeRequest.getProcessQueue().commit();
                    break;
                case ROLLBACK:
                    consumeRequest.getProcessQueue().rollback();
                    this.submitConsumeRequestLater(
                        consumeRequest.getProcessQueue(),
                        consumeRequest.getMessageQueue(),
                        context.getSuspendCurrentQueueTimeMillis());
                    continueConsume = false;
                    break;
                case SUSPEND_CURRENT_QUEUE_A_MOMENT:
                    this.getConsumerStatsManager().incConsumeFailedTPS(consumerGroup, consumeRequest.getMessageQueue().getTopic(), msgs.size());
                    if (checkReconsumeTimes(msgs)) {
                        consumeRequest.getProcessQueue().makeMessageToConsumeAgain(msgs);
                        this.submitConsumeRequestLater(
                            consumeRequest.getProcessQueue(),
                            consumeRequest.getMessageQueue(),
                            context.getSuspendCurrentQueueTimeMillis());
                        continueConsume = false;
                    }
                    break;
                default:
                    break;
            }
        }

        //if the process queue has not been dropped, update the consume offset
        if (commitOffset >= 0 && !consumeRequest.getProcessQueue().isDropped()) {
            this.defaultMQPushConsumerImpl.getOffsetStore().updateOffset(consumeRequest.getMessageQueue(), commitOffset, false);
        }

        return continueConsume;
    }
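
The calls to takeMessages, commit and makeMessageToConsumeAgain above suggest a process queue that keeps two sorted maps: one for waiting messages and one for the batch currently being consumed. The sketch below models that idea under those assumptions. OrderlyQueueSketch is a hypothetical class, and the detail that commit returns the last consumed offset plus one is an assumption about the real ProcessQueue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical two-map model of orderly consumption; not the RocketMQ ProcessQueue.
class OrderlyQueueSketch {

    private final TreeMap<Long, String> waiting = new TreeMap<>();
    private final TreeMap<Long, String> consuming = new TreeMap<>();

    void put(long offset, String body) {
        waiting.put(offset, body);
    }

    // Moves up to batchSize messages, lowest offsets first, into the consuming map.
    List<Long> take(int batchSize) {
        List<Long> taken = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            Map.Entry<Long, String> entry = waiting.pollFirstEntry();
            if (entry == null) {
                break;
            }
            consuming.put(entry.getKey(), entry.getValue());
            taken.add(entry.getKey());
        }
        return taken;
    }

    // Success path: forget the consumed batch and return the next offset to store.
    long commit() {
        if (consuming.isEmpty()) {
            return -1L;
        }
        long next = consuming.lastKey() + 1;
        consuming.clear();
        return next;
    }

    // Failure path: put the batch back so the same messages are taken again, in order.
    void makeMessageToConsumeAgain(List<Long> offsets) {
        for (Long offset : offsets) {
            String body = consuming.remove(offset);
            if (body != null) {
                waiting.put(offset, body);
            }
        }
    }

    public static void main(String[] args) {
        OrderlyQueueSketch queue = new OrderlyQueueSketch();
        queue.put(10, "a");
        queue.put(11, "b");
        queue.put(12, "c");

        List<Long> batch = queue.take(2);
        System.out.println(batch); // prints [10, 11]
        queue.makeMessageToConsumeAgain(batch); // listener asked to suspend
        System.out.println(queue.take(2)); // prints [10, 11]
        System.out.println(queue.commit()); // prints 12
    }
}
```

Because a failed batch goes back to the front of the queue, an orderly consumer never skips ahead: later messages wait until the failed ones succeed or exhaust their retries.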

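3. Consumer rebalancing

RocketMQ uses a dedicated service, RebalanceService, to manage the mapping between message queues and consumers, and it ships several allocation strategies that decide how queues are assigned to consumers.

When a consumer exits normally, has its channel closed abnormally, or newly joins the group, RebalanceService is likewise responsible for rebalancing the queue assignment.

3.1 Rebalancing logic

3.1.1 RebalanceService

RebalanceService is a service thread class.
RebalanceService#run

/**
     * RebalanceService#run is the task executed by the rebalance service. It waits at most 20s between two rebalance runs.
     * The main logic is implemented in mqClientFactory#doRebalance.
     */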
    @Override
    public void run() {
        log.info(this.getServiceName() + " service started");

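        /*
         * Runtime logic
         * As long as the service is not stopped, keep rebalancing in a loop
         */
        while (!this.isStopped()) {
            //wait before running: at most 20s by default, and the wait can be interrupted by a wakeup
            this.waitForRunning(waitInterval);
            //run the rebalance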
            this.mqClientFactory.doRebalance();
        }

        log.info(this.getServiceName() + " service end");
    }
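
The loop above is a periodic task that can also be triggered early, for example when the client learns that the consumer list has changed. A minimal sketch of such a wakeable wait is shown below, built on a Semaphore. WakeableWaitSketch is a hypothetical class and only illustrates the pattern; it is not how RocketMQ's ServiceThread is implemented.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical wakeable timer: wait up to a timeout, or return early on wakeup().
class WakeableWaitSketch {

    private final Semaphore signal = new Semaphore(0);

    // Called from another thread to cut the current (or next) wait short.
    void wakeup() {
        signal.release();
    }

    // Returns true if woken early, false if the timeout elapsed or the thread was interrupted.
    boolean waitForRunning(long timeoutMillis) {
        try {
            boolean woken = signal.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            // collapse any extra wakeups into a single early run
            signal.drainPermits();
            return woken;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        WakeableWaitSketch timer = new WakeableWaitSketch();
        timer.wakeup();
        // a pending wakeup makes the 20s wait return immediately
        System.out.println(timer.waitForRunning(20_000)); // prints true
        // without a wakeup the wait runs to its timeout
        System.out.println(timer.waitForRunning(50)); // prints false
    }
}
```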

MQClientInstance#doRebalance

/**
     * Run rebalancing
     */
    public void doRebalance() {
        //iterate over consumerTable, take each MQConsumerInner (DefaultMQPushConsumerImpl or another implementation), and let the consumer itself perform the rebalance
        for (Map.Entry<String, MQConsumerInner> entry : this.consumerTable.entrySet()) {
            //get one consumer, i.e. a DefaultMQPushConsumerImpl or another implementation
            //MQConsumerInner has three implementations:
            //DefaultLitePullConsumerImpl, DefaultMQPullConsumerImpl and DefaultMQPushConsumerImpl.
            //The first two are rarely used, and their doRebalance is simple: each just calls its own rebalanceImpl#doRebalance(false).
            MQConsumerInner impl = entry.getValue();
            if (impl != null) {
                try {
                    //let the consumer itself perform the rebalance
                    impl.doRebalance();
                } catch (Throwable e) {
                    log.error("doRebalance exception", e);
                }
            }
        }
    }

3.1.2 DefaultMQPushConsumerImpl#doRebalance

The rebalance itself is handled by RebalanceImpl.
DefaultMQPushConsumerImpl#doRebalance

/**
     * Run rebalancing
     */
    @Override
    public void doRebalance() {
        //if the consumer is not paused, delegate the rebalance to rebalanceImpl
        if (!this.pause) {
            //isConsumeOrderly tells whether this is orderly consumption
            this.rebalanceImpl.doRebalance(this.isConsumeOrderly());
        }
    }

3.1.3 RebalanceImpl#doRebalance

/**
     * Gets the current consumer's subscription table, iterates over it, and calls rebalanceByTopic for every subscribed topic.
     * @param isOrder whether this is orderly consumption
     */
    public void doRebalance(final boolean isOrder) {
        //get the current consumer's subscription table
        Map<String, SubscriptionData> subTable = this.getSubscriptionInner();
        if (subTable != null) {
            //iterate over the subscriptions
            for (final Map.Entry<String, SubscriptionData> entry : subTable.entrySet()) {
                //get the topic
                final String topic = entry.getKey();
                try {
                    //rebalance this topic
                    this.rebalanceByTopic(topic, isOrder);
                } catch (Throwable e) {
                    if (!topic.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                        log.warn("rebalanceByTopic Exception", e);
                    }
                }
            }
        }
        //drop the ProcessQueue snapshots of queues whose topic this consumer no longer subscribes to
        this.truncateMessageQueueNotMyTopic();
    }

RebalanceImpl#rebalanceByTopic

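  1. Broadcasting mode: there is no load balancing to speak of, because every consumer consumes all messages from all queues. The method only updates the current consumer's processQueueTable.
  2. Clustering mode: first determine, based on the allocation strategy, which MessageQueues belong to the current consumer, then update the current consumer's processQueueTable.
    2.1 Get the set of all message queues of the topic (mqSet), then ask a broker that hosts the topic for the clientId list of the current consumerGroup (cidAll). One clientId represents one consumer.
    2.2 Sort the topic's message queues and the clientId list. Sorting guarantees that every consumer client sees mqAll and cidAll in the same order when it runs the allocation.
    2.3 Get the AllocateMessageQueueStrategy implementation, i.e. the load-balancing strategy class, and call its allocate method to assign message queues to the current clientId, that is, the current consumer. This step is the actual load-balancing (rebalancing) algorithm.
    2.4 Call updateProcessQueueTableInRebalance to update processQueueTable for the newly assigned queues, and create the initial pullRequest for each new queue and dispatch it to PullMessageService.
    2.5 If processQueueTable changed, call messageQueueChanged: set a new local subscription version, reset the flow-control parameters, and immediately send a heartbeat to all brokers so that they update the subscription.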
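
Steps 2.2 and 2.3 work without any coordinator: every consumer sorts the same two lists and applies the same deterministic function, so the individual results fit together without overlap. The sketch below shows a contiguous average split in that spirit. AverageAllocateSketch is a hypothetical class, queues are represented by plain integers, and the code is a simplified illustration rather than the source of any built-in RocketMQ strategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical contiguous average allocation; each consumer runs it locally.
class AverageAllocateSketch {

    // Returns the queues assigned to currentCid; mqAll and cidAll must already be sorted.
    static List<Integer> allocate(String currentCid, List<Integer> mqAll, List<String> cidAll) {
        List<Integer> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCid);
        if (index < 0 || mqAll.isEmpty()) {
            return result;
        }
        int base = mqAll.size() / cidAll.size();
        int mod = mqAll.size() % cidAll.size();
        // the first `mod` consumers take one extra queue
        int size = base + (index < mod ? 1 : 0);
        int start = index * base + Math.min(index, mod);
        for (int i = 0; i < size; i++) {
            result.add(mqAll.get(start + i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> mqAll = new ArrayList<>(Arrays.asList(7, 3, 0, 5, 1, 6, 2, 4));
        List<String> cidAll = new ArrayList<>(Arrays.asList("c3", "c1", "c2"));
        // sorting first gives every consumer the same view of both lists
        Collections.sort(mqAll);
        Collections.sort(cidAll);

        System.out.println(allocate("c1", mqAll, cidAll)); // prints [0, 1, 2]
        System.out.println(allocate("c2", mqAll, cidAll)); // prints [3, 4, 5]
        System.out.println(allocate("c3", mqAll, cidAll)); // prints [6, 7]
    }
}
```

When there are more consumers than queues, the consumers at the end of the sorted list receive an empty assignment and stay idle, so adding consumers beyond the queue count does not increase throughput.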
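/**
     * Rebalance by topic; the handling depends on the message model.
     * @param topic   the topic to rebalance
     * @param isOrder whether this is orderly consumption
     */
    private void rebalanceByTopic(final String topic, final boolean isOrder) {
        //choose the handling according to the message model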
        switch (messageModel) {
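            /*
             * Broadcasting mode
             * There is no load balancing to speak of: every consumer consumes all messages from all queues, so only the current consumer's processQueueTable is updated
             */
            case BROADCASTING: {
                //get the topic's message queues
                Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
                if (mqSet != null) {
                    //update processQueueTable for all message queues, create the initial pullRequest for each and dispatch it to PullMessageService
                    boolean changed = this.updateProcessQueueTableInRebalance(topic, mqSet, isOrder);
                    //if processQueueTable changed
                    if (changed) {
                        //set a new local subscription version, reset the flow-control parameters, and immediately send a heartbeat to all brokers so that they update the subscription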
                        this.messageQueueChanged(topic, mqSet, mqSet);
                        log.info("messageQueueChanged {} {} {} {}",
                            consumerGroup,
                            topic,
                            mqSet,
                            mqSet);
                    }
                } else {
                    log.warn("doRebalance, {}, but the topic[{}] not exist.", consumerGroup, topic);
                }
                break;
            }
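            /*
             * Clustering mode
             * Determine, based on the allocation strategy, which MessageQueues are assigned to the current consumer, then update the current consumer's processQueueTable
             */
            case CLUSTERING: {
                //get the topic's message queues
                Set<MessageQueue> mqSet = this.topicSubscribeInfoTable.get(topic);
                /*
                 * Ask a broker that hosts the topic for the clientId list of the current consumerGroup
                 * One clientId represents one consumer
                 */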
                List<String> cidAll = this.mQClientFactory.findConsumerIdList(topic, consumerGroup);
                if (null == mqSet) {
                    if (!topic.startsWith(MixAll.RETRY_GROUP_TOPIC_PREFIX)) {
                        log.warn("doRebalance, {}, but the topic[{}] not exist.", consumerGroup, topic);
                    }
                }

                if (null == cidAll) {
                    log.warn("doRebalance, {} {}, get consumer id list failed", consumerGroup, topic);
                }

                if (mqSet != null && cidAll != null) {
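                    // Copy the topic's message queues into a list
                    List<MessageQueue> mqAll = new ArrayList<MessageQueue>();
                    mqAll.addAll(mqSet);

                    /*
                     * Sort the message queues and the clientId list separately
                     * Sorting guarantees that different consumer clients see mqAll and cidAll in the same order when they rebalance
                     */
                    Collections.sort(mqAll);
                    Collections.sort(cidAll);

                    // Get the queue allocation strategy, i.e. the load-balancing strategy class
                    AllocateMessageQueueStrategy strategy = this.allocateMessageQueueStrategy;

                    List<MessageQueue> allocateResult = null;
                    try {
                        /*
                         * Assign message queues to the current clientId, i.e. the current consumer
                         * This is where the load-balancing (rebalance) algorithm runs
                         */
                        allocateResult = strategy.allocate(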
                            this.consumerGroup,
                            this.mQClientFactory.getClientId(),
                            mqAll,
                            cidAll);
                    } catch (Throwable e) {
                        log.error("AllocateMessageQueueStrategy.allocate Exception. allocateMessageQueueStrategyName={}", strategy.getName(),
                            e);
                        return;
                    }

                    // De-duplicate the assigned message queues
                    Set<MessageQueue> allocateResultSet = new HashSet<MessageQueue>();
                    if (allocateResult != null) {
                        allocateResultSet.addAll(allocateResult);
                    }

                    // Update processQueueTable for the newly assigned queues, create the initial pullRequests and dispatch them to PullMessageService
                    boolean changed = this.updateProcessQueueTableInRebalance(topic, allocateResultSet, isOrder);
                    // If processQueueTable changed
                    if (changed) {
                        log.info(
                            "rebalanced result changed. allocateMessageQueueStrategyName={}, group={}, topic={}, clientId={}, mqAllSize={}, cidAllSize={}, rebalanceResultSize={}, rebalanceResultSet={}",
                            strategy.getName(), consumerGroup, topic, this.mQClientFactory.getClientId(), mqSet.size(), cidAll.size(),
                            allocateResultSet.size(), allocateResultSet);
                        // Set a new local subscription version, reset flow-control parameters, and heartbeat all brokers immediately so they update the subscription
                        this.messageQueueChanged(topic, mqSet, allocateResultSet);
                    }
                }
                break;
            }
            default:
                break;
        }
    }
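
Why the sorting step matters can be shown with a small self-contained sketch. It is not RocketMQ code: the class `SortedViewSketch`, its string-based queues, and the simple contiguous-slice `allocate` are hypothetical stand-ins for MessageQueue and AllocateMessageQueueStrategy. Each client sorts its own copy of both lists before allocating for itself, so two clients that received the same data in different orders still end up with disjoint, complete assignments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedViewSketch {
    // Hypothetical stand-in for a strategy: contiguous slices, extras go to the first consumers.
    public static List<String> allocate(String currentCid, List<String> mqAll, List<String> cidAll) {
        List<String> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCid);
        if (index < 0) {
            return result;
        }
        int base = mqAll.size() / cidAll.size();
        int extra = mqAll.size() % cidAll.size();
        int start = index * base + Math.min(index, extra);
        int count = base + (index < extra ? 1 : 0);
        for (int i = 0; i < count; i++) {
            result.add(mqAll.get(start + i));
        }
        return result;
    }

    // What every client does locally: copy, sort both lists, then allocate for itself.
    public static List<String> rebalanceFor(String cid, List<String> queues, List<String> clients) {
        List<String> mqAll = new ArrayList<>(queues);
        List<String> cidAll = new ArrayList<>(clients);
        Collections.sort(mqAll);
        Collections.sort(cidAll);
        return allocate(cid, mqAll, cidAll);
    }

    public static void main(String[] args) {
        // The two clients see the same queues and client ids, but in different orders.
        List<String> c1 = rebalanceFor("c1", Arrays.asList("q3", "q0", "q2", "q1"), Arrays.asList("c2", "c1"));
        List<String> c2 = rebalanceFor("c2", Arrays.asList("q1", "q2", "q0", "q3"), Arrays.asList("c1", "c2"));
        System.out.println(c1 + " " + c2); // [q0, q1] [q2, q3]
    }
}
```

Without the two sort calls, each client would slice a differently ordered list and some queues could be claimed twice while others are left unconsumed.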

RebalanceImpl#updateProcessQueueTableInRebalance

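/**
     * Updates processQueueTable for the newly assigned message queues, creates the initial pullRequests and dispatches them to PullMessageService.
     * Rough steps:
     * 1. Iterate over processQueueTable, the process queues currently held by this consumer. On the first call after the consumer starts, processQueueTable is empty.
     *  For each entry whose message queue belongs to the given topic:
     *  1.1 If the newly assigned set does not contain the queue, the queue has been taken away from this consumer.
     *    1.1.1 Set dropped = true on its process queue; messages in that queue will no longer be consumed.
     *    1.1.2 Call removeUnnecessaryMessageQueue. On success, remove the entry from processQueueTable and set changed to true.
     *  1.2 Otherwise, if the last pull on the process queue happened more than 120s ago, the pull is treated as expired, possibly because there are no new messages or the network failed.
     *    1.2.1 In push mode, set dropped = true on the process queue; messages in that queue will no longer be consumed.
     *     Call removeUnnecessaryMessageQueue. On success, remove the entry from processQueueTable and set changed to true.
     * 2. Create pullRequestList to hold the new PullRequests. Iterate over the newly assigned queues; if processQueueTable does not contain a queue,
     *  the queue is newly assigned and is handled as follows:
     *  2.1 For orderly consumption, if the lock method fails to lock the queue on the broker (the distributed lock was not acquired), adding the queue fails. Another consumer may still be consuming it, so this rebalance skips the queue and moves on to the next one.
     *  2.2 If consumption is not orderly, or the lock succeeded, call removeDirtyOffset to remove the queue's consume offset record from offsetTable.
     *  2.3 Create a ProcessQueue for the message queue.
     *  2.4 Call computePullFromWhereWithException to get nextOffset, the offset of the next message to consume. Pull mode returns 0; push mode computes it from consumeFromWhere.
     *  2.5 If nextOffset is greater than or equal to 0, the offset was obtained successfully. Store the MessageQueue-to-ProcessQueue mapping in processQueueTable.
     *  2.6 Create a PullRequest with the offset, consumerGroup, mq and pq, and add it to pullRequestList. This is where pull requests originate. Set changed to true.
     * 3. Call dispatchPullRequest to dispatch the PullRequests created in this round.
     *  3.1 Pull mode pulls messages manually, so these requests are discarded and the method is an empty implementation.
     *  3.2 Push mode pulls automatically. Each PullRequest here is the first pull request of its message queue; PullMessageService processes them in turn, and subsequent pulls follow automatically.
     *   This is where pull requests originate in push mode.
     * @param topic the subscribed topic
     * @param mqSet the newly assigned message queues
     * @param isOrder whether consumption is orderly
     * @return whether anything changed
     */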
    private boolean updateProcessQueueTableInRebalance(final String topic, final Set<MessageQueue> mqSet,
        final boolean isOrder) {
        boolean changed = false;

        // 1 Iterate over all process queues of this consumer; on the first call after startup, processQueueTable is empty
        Iterator<Entry<MessageQueue, ProcessQueue>> it = this.processQueueTable.entrySet().iterator();
        while (it.hasNext()) {
            Entry<MessageQueue, ProcessQueue> next = it.next();
            // The key is the message queue
            MessageQueue mq = next.getKey();
            // The value is the corresponding process queue
            ProcessQueue pq = next.getValue();

            // Only handle queues of the given topic
            if (mq.getTopic().equals(topic)) {
                // 1.1 The newly assigned set does not contain this queue, so it has been taken away
                if (!mqSet.contains(mq)) {
                    // Mark the process queue as dropped; its messages will no longer be consumed
                    pq.setDropped(true);
                    // Remove the unnecessary message queue
                    if (this.removeUnnecessaryMessageQueue(mq, pq)) {
                        // On success, remove the entry and set changed to true
                        it.remove();
                        changed = true;
                        log.info("doRebalance, {}, remove unnecessary mq, {}", consumerGroup, mq);
                    }

                // 1.2 The last pull was more than 120s ago, so the pull is treated as expired (no new messages or a network failure)
                } else if (pq.isPullExpired()) {
                    switch (this.consumeType()) {
                        case CONSUME_ACTIVELY:
                            break;
                        // Push mode
                        case CONSUME_PASSIVELY:
                            // Mark the process queue as dropped; its messages will no longer be consumed
                            pq.setDropped(true);
                            // Remove the unnecessary message queue
                            if (this.removeUnnecessaryMessageQueue(mq, pq)) {
                                // On success, remove the entry and set changed to true
                                it.remove();
                                changed = true;
                                log.error("[BUG]doRebalance, {}, remove unnecessary mq, {}, because pull is pause, so try to fixed it",
                                    consumerGroup, mq);
                            }
                            break;
                        default:
                            break;
                    }
                }
            }
        }

        List<PullRequest> pullRequestList = new ArrayList<PullRequest>();
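        // 2 Iterate over the newly assigned message queues
        for (MessageQueue mq : mqSet) {
            // If processQueueTable does not contain the queue, it is newly assigned
            if (!this.processQueueTable.containsKey(mq)) {
                // For orderly consumption, locking the queue on the broker failed, i.e. the distributed lock was not acquired.
                // Adding the queue fails; another consumer may still be consuming it, so this rebalance skips it
                if (isOrder && !this.lock(mq)) {
                    log.warn("doRebalance, {}, add a new mq failed, {}, because lock failed", consumerGroup, mq);
                    continue;
                }

                // Remove the queue's consume offset record from offsetTable
                this.removeDirtyOffset(mq);
                // Create a process queue for the message queue
                ProcessQueue pq = new ProcessQueue();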

                long nextOffset = -1L;
                try {
                    /*
                     * Get the offset of the next message to consume from this MessageQueue
                     * Pull mode returns 0; push mode computes it from consumeFromWhere
                     */
                    nextOffset = this.computePullFromWhereWithException(mq);
                } catch (Exception e) {
                    log.info("doRebalance, {}, compute offset failed, {}", consumerGroup, mq);
                    continue;
                }

                // If nextOffset is greater than or equal to 0, the consume offset was obtained successfully
                if (nextOffset >= 0) {
                    // Store the MessageQueue-to-ProcessQueue mapping
                    ProcessQueue pre = this.processQueueTable.putIfAbsent(mq, pq);
                    if (pre != null) {
                        log.info("doRebalance, {}, mq already exists, {}", consumerGroup, mq);
                    } else {
                        log.info("doRebalance, {}, add a new mq, {}", consumerGroup, mq);
                        /*
                         * Create a PullRequest with the offset, consumerGroup, mq and pq, and add it to pullRequestList
                         * This is where pull requests originate
                         */
                        PullRequest pullRequest = new PullRequest();
                        pullRequest.setConsumerGroup(consumerGroup);
                        pullRequest.setNextOffset(nextOffset);
                        pullRequest.setMessageQueue(mq);
                        pullRequest.setProcessQueue(pq);
                        pullRequestList.add(pullRequest);
                        // Set changed to true
                        changed = true;
                    }
                } else {
                    log.warn("doRebalance, {}, add new mq failed, {}", consumerGroup, mq);
                }
            }
        }

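        /*
         * 3 Dispatch the PullRequests created in this round.
         * Pull mode pulls messages manually, so these requests are discarded and the method is an empty implementation
         * Push mode pulls automatically; each PullRequest here is the first pull request of its queue, PullMessageService processes them in turn, and subsequent pulls follow automatically
         */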
        this.dispatchPullRequest(pullRequestList);

        return changed;
    }
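
Stripped of locking, offsets and topic filtering, the method above is a diff between the queues currently held and the queues newly assigned. The following is a minimal sketch of that idea only, not the RocketMQ implementation; `ProcessQueueDiffSketch`, the string queue names and the `Object` placeholders for ProcessQueue are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ProcessQueueDiffSketch {
    /**
     * Applies a new assignment to the table and returns the queues that need an initial pull request.
     * The table maps a queue name to a placeholder for its process queue.
     */
    public static List<String> applyAssignment(Map<String, Object> processQueueTable, Set<String> assigned) {
        // Step 1: drop queues that are no longer assigned to this consumer.
        processQueueTable.keySet().removeIf(mq -> !assigned.contains(mq));
        // Step 2: register newly assigned queues and remember them for the first pull.
        List<String> pullRequests = new ArrayList<>();
        for (String mq : new TreeSet<>(assigned)) {
            if (processQueueTable.putIfAbsent(mq, new Object()) == null) {
                pullRequests.add(mq);
            }
        }
        return pullRequests;
    }

    public static void main(String[] args) {
        Map<String, Object> table = new TreeMap<>();
        table.put("q0", new Object());
        table.put("q1", new Object());
        Set<String> assigned = new TreeSet<>(Arrays.asList("q1", "q2"));
        // q0 is removed, q1 is kept, q2 is new and gets the first pull request.
        System.out.println(applyAssignment(table, assigned) + " " + table.keySet()); // [q2] [q1, q2]
    }
}
```

A queue that stays assigned across rebalances is left untouched, which is why an ongoing pull loop for it is not interrupted.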

RebalanceImpl#truncateMessageQueueNotMyTopic

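// Removes the process queues whose topic is no longer in this consumer's subscription table and marks them as dropped.
private void truncateMessageQueueNotMyTopic() {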
        Map<String, SubscriptionData> subTable = this.getSubscriptionInner();

        for (MessageQueue mq : this.processQueueTable.keySet()) {
            if (!subTable.containsKey(mq.getTopic())) {

                ProcessQueue pq = this.processQueueTable.remove(mq);
                if (pq != null) {
                    pq.setDropped(true);
                    log.info("doRebalance, {}, truncateMessageQueueNotMyTopic remove unnecessary mq, {}", consumerGroup, mq);
                }
            }
        }
    }

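3.2 AllocateMessageQueueStrategy: the load-balancing strategy classes

Implementations of AllocateMessageQueueStrategy:

  1. AllocateMessageQueueByMachineRoom: machine-room average allocation.
  2. AllocateMessageQueueConsistentHash: allocation based on a consistent hashing algorithm.
  3. AllocateMessageQueueByConfig: allocation from a user-configured list of message queues.
  4. AllocateMessageQueueAveragelyByCircle: round-robin (circular) average allocation.
  5. AllocateMessageQueueAveragely: average allocation; this is the default strategy.
  6. AllocateMachineRoomNearby: same-machine-room-first (nearby) allocation.

3.2.1 AllocateMessageQueueByMachineRoom: machine-room average allocation

A consumer only consumes from brokers in the machine rooms it is bound to, and load balancing is done over the MessageQueues in those machine rooms.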

public class AllocateMessageQueueByMachineRoom implements AllocateMessageQueueStrategy {
    // Names of the machine rooms (IDCs) this consumer is allowed to consume from
    private Set<String> consumeridcs;

    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        // Validate the arguments
        if (StringUtils.isBlank(currentCID)) {
            throw new IllegalArgumentException("currentCID is empty");
        }
        if (CollectionUtils.isEmpty(mqAll)) {
            throw new IllegalArgumentException("mqAll is null or mqAll empty");
        }
        if (CollectionUtils.isEmpty(cidAll)) {
            throw new IllegalArgumentException("cidAll is null or cidAll empty");
        }
        List<MessageQueue> result = new ArrayList<MessageQueue>();
        // Index of the current consumer in the consumer id list
        int currentIndex = cidAll.indexOf(currentCID);
        if (currentIndex < 0) {
            return result;
        }
        List<MessageQueue> premqAll = new ArrayList<MessageQueue>();
        for (MessageQueue mq : mqAll) {
            String[] temp = mq.getBrokerName().split("@");
            // Keep the queue if brokerName has the form "machineRoom@brokerName" and consumeridcs contains that machine room
            if (temp.length == 2 && consumeridcs.contains(temp[0])) {
                premqAll.add(mq);
            }
        }

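        // Number of queues every consumer gets (the quotient)
        int mod = premqAll.size() / cidAll.size();
        // Number of queues left over (the remainder)
        int rem = premqAll.size() % cidAll.size();
        // Assign the contiguous slice that belongs to this consumer
        int startIndex = mod * currentIndex;
        int endIndex = startIndex + mod;
        for (int i = startIndex; i < endIndex; i++) {
            result.add(premqAll.get(i));
        }
        // Consumers whose index is below the remainder get one extra queue from the tail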
        if (rem > currentIndex) {
            result.add(premqAll.get(currentIndex + mod * cidAll.size()));
        }
        return result;
    }

    @Override
    public String getName() {
        return "MACHINE_ROOM";
    }

    public Set<String> getConsumeridcs() {
        return consumeridcs;
    }

    public void setConsumeridcs(Set<String> consumeridcs) {
        this.consumeridcs = consumeridcs;
    }
}
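
The slicing arithmetic above is easy to misread because the variable named mod holds the quotient, and the leftover queues are taken from the tail of the list. The sketch below replays only that arithmetic on plain strings; the class `MachineRoomSliceSketch` and the queue names are hypothetical, and the machine-room filtering step is assumed to have already produced `premqAll`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MachineRoomSliceSketch {
    // Same arithmetic as the listing: a contiguous slice plus, for early consumers, one queue from the tail.
    public static List<String> slice(List<String> premqAll, int consumerCount, int currentIndex) {
        List<String> result = new ArrayList<>();
        int mod = premqAll.size() / consumerCount; // quotient: queues every consumer gets
        int rem = premqAll.size() % consumerCount; // remainder: leftover queues
        int startIndex = mod * currentIndex;
        for (int i = startIndex; i < startIndex + mod; i++) {
            result.add(premqAll.get(i));
        }
        if (rem > currentIndex) {
            // Leftover queues sit after the first mod * consumerCount entries.
            result.add(premqAll.get(currentIndex + mod * consumerCount));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queues = Arrays.asList("q0", "q1", "q2", "q3", "q4");
        System.out.println(slice(queues, 2, 0)); // [q0, q1, q4]
        System.out.println(slice(queues, 2, 1)); // [q2, q3]
    }
}
```

Note the difference from the default average strategy: with five queues and two consumers, the first consumer here receives the last queue as its extra one rather than a third contiguous queue.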

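3.2.2 AllocateMessageQueueConsistentHash: consistent hash allocation

/**
 * Consistent Hashing queue algorithm
 * Allocates queues with a consistent hashing algorithm.
 * Rough steps:
 * 1. Instantiate a ConsistentHashRouter, which generates the virtual nodes and builds the hash ring. If no hash function is given, MD5Hash is used.
 * 2. Iterate over the message queues, hash each messageQueue, and walk clockwise to the nearest consumer node. If that node is the current consumer, add the queue to the result.
 */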
public class AllocateMessageQueueConsistentHash implements AllocateMessageQueueStrategy {
    private final InternalLogger log = ClientLogger.getLog();

    /**
     * Number of virtual nodes per physical node; must not be negative, defaults to 10
     */
    private final int virtualNodeCnt;
    /**
     * Custom hash function; MD5Hash is used when it is null
     */
    private final HashFunction customHashFunction;

    public AllocateMessageQueueConsistentHash() {
        this(10);
    }

    public AllocateMessageQueueConsistentHash(int virtualNodeCnt) {
        this(virtualNodeCnt, null);
    }

    public AllocateMessageQueueConsistentHash(int virtualNodeCnt, HashFunction customHashFunction) {
        if (virtualNodeCnt < 0) {
            throw new IllegalArgumentException("illegal virtualNodeCnt :" + virtualNodeCnt);
        }
        this.virtualNodeCnt = virtualNodeCnt;
        this.customHashFunction = customHashFunction;
    }

    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        // Validate the arguments
        if (currentCID == null || currentCID.length() < 1) {
            throw new IllegalArgumentException("currentCID is empty");
        }
        if (mqAll == null || mqAll.isEmpty()) {
            throw new IllegalArgumentException("mqAll is null or mqAll empty");
        }
        if (cidAll == null || cidAll.isEmpty()) {
            throw new IllegalArgumentException("cidAll is null or cidAll empty");
        }

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!cidAll.contains(currentCID)) {
            log.info("[BUG] ConsumerGroup: {} The consumerId: {} not in cidAll: {}",
                consumerGroup,
                currentCID,
                cidAll);
            return result;
        }

        // Wrap each client id in a ClientNode
        Collection<ClientNode> cidNodes = new ArrayList<ClientNode>();
        for (String cid : cidAll) {
            cidNodes.add(new ClientNode(cid));
        }
        // Instantiate the ConsistentHashRouter, which generates the virtual nodes and builds the hash ring
        // If no hash function is given, MD5Hash is used
        final ConsistentHashRouter<ClientNode> router; //for building hash ring
        if (customHashFunction != null) {
            router = new ConsistentHashRouter<ClientNode>(cidNodes, virtualNodeCnt, customHashFunction);
        } else {
            router = new ConsistentHashRouter<ClientNode>(cidNodes, virtualNodeCnt);
        }

        List<MessageQueue> results = new ArrayList<MessageQueue>();
        // Iterate over the message queues
        for (MessageQueue mq : mqAll) {
            // Hash the messageQueue and walk clockwise to the nearest consumer node
            ClientNode clientNode = router.routeNode(mq.toString());
            // If that node is the current consumer, add the queue to the result
            if (clientNode != null && currentCID.equals(clientNode.getKey())) {
                results.add(mq);
            }
        }

        return results;

    }

    @Override
    public String getName() {
        return "CONSISTENT_HASH";
    }

    private static class ClientNode implements Node {
        private final String clientID;

        public ClientNode(String clientID) {
            this.clientID = clientID;
        }

        @Override
        public String getKey() {
            return clientID;
        }
    }

}
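
ConsistentHashRouter itself is not shown in the listing. The sketch below illustrates the general technique it relies on, a hash ring with virtual nodes held in a TreeMap; it is not the RocketMQ class. `HashRingSketch`, the "cid-i" virtual node naming and the use of the first four MD5 bytes as the ring position are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HashRingSketch {
    // Ring position -> consumer id. Several virtual nodes map to the same consumer.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    public HashRingSketch(List<String> consumers, int virtualNodes) {
        for (String cid : consumers) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(cid + "-" + i), cid); // hypothetical virtual node naming
            }
        }
    }

    // Ring position: the first four bytes of the MD5 digest as a non-negative number.
    static long hash(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 4; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Walk clockwise: the first node at or after the key's position, wrapping to the start of the ring.
    public String route(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static List<String> allocate(String currentCid, List<String> mqAll, List<String> cidAll, int virtualNodes) {
        HashRingSketch router = new HashRingSketch(cidAll, virtualNodes);
        List<String> result = new ArrayList<>();
        for (String mq : mqAll) {
            if (currentCid.equals(router.route(mq))) {
                result.add(mq);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queues = Arrays.asList("q0", "q1", "q2", "q3", "q4", "q5", "q6", "q7");
        List<String> consumers = Arrays.asList("c1", "c2", "c3");
        for (String cid : consumers) {
            System.out.println(cid + " -> " + allocate(cid, queues, consumers, 10));
        }
    }
}
```

Every queue routes to exactly one consumer, but the split is not guaranteed to be even; what the technique buys is that adding or removing a consumer moves only the queues adjacent to its virtual nodes.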

3.2.3 AllocateMessageQueueByConfig: allocation from user configuration

/**
 * Allocates from the user-configured message queues: it simply returns the configured list.
 */
public class AllocateMessageQueueByConfig implements AllocateMessageQueueStrategy {
    private List<MessageQueue> messageQueueList;

    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        return this.messageQueueList;
    }

    @Override
    public String getName() {
        return "CONFIG";
    }

    public List<MessageQueue> getMessageQueueList() {
        return messageQueueList;
    }

    public void setMessageQueueList(List<MessageQueue> messageQueueList) {
        this.messageQueueList = messageQueueList;
    }
}

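3.2.4 AllocateMessageQueueAveragelyByCircle: round-robin average allocation

/**
 * Cycle average Hashing queue algorithm
 * Round-robin (circular) average allocation.
 * Spreads the message queues as evenly as possible over all consumers; leftover queues go to the consumers at the front.
 * It is similar to the average strategy, except that queues are handed out round after round in consumer order until all queues are assigned.
 */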
public class AllocateMessageQueueAveragelyByCircle implements AllocateMessageQueueStrategy {
    private final InternalLogger log = ClientLogger.getLog();

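    /**
     * Hands out queues round after round in consumer order until all message queues are assigned.
     * For example, with consumers A and B and five message queues 1, 2, 3, 4, 5:
     * round 1: A gets 1, B gets 2;
     * round 2: A gets 3, B gets 4;
     * round 3: A gets 5.
     * So A ends up with 1, 3, 5 and B with 2, 4.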
     * @param consumerGroup current consumer group
     * @param currentCID current consumer id
     * @param mqAll message queue set in current topic
     * @param cidAll consumer set in current consumer group
     * @return
     */
    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        // Validate the arguments
        if (currentCID == null || currentCID.length() < 1) {
            throw new IllegalArgumentException("currentCID is empty");
        }
        if (mqAll == null || mqAll.isEmpty()) {
            throw new IllegalArgumentException("mqAll is null or mqAll empty");
        }
        if (cidAll == null || cidAll.isEmpty()) {
            throw new IllegalArgumentException("cidAll is null or cidAll empty");
        }

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!cidAll.contains(currentCID)) {
            log.info("[BUG] ConsumerGroup: {} The consumerId: {} not in cidAll: {}",
                consumerGroup,
                currentCID,
                cidAll);
            return result;
        }

        // Index of the current consumer in the consumer id list
        int index = cidAll.indexOf(currentCID);
        // Pick, in every round, the queue that belongs to this consumer
        for (int i = index; i < mqAll.size(); i++) {
            if (i % cidAll.size() == index) {
                result.add(mqAll.get(i));
            }
        }
        return result;
    }

    @Override
    public String getName() {
        return "AVG_BY_CIRCLE";
    }
}
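
The A/B example from the Javadoc can be replayed with a short self-contained sketch. `CircleAllocationSketch` is a hypothetical name and strings stand in for MessageQueue; the loop steps by the number of consumers, which selects the same indices as the modulo check in the listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CircleAllocationSketch {
    public static List<String> allocate(String currentCid, List<String> mqAll, List<String> cidAll) {
        List<String> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCid);
        if (index < 0) {
            return result;
        }
        // Every cidAll.size()-th queue, starting at this consumer's own index.
        for (int i = index; i < mqAll.size(); i += cidAll.size()) {
            result.add(mqAll.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queues = Arrays.asList("1", "2", "3", "4", "5");
        List<String> consumers = Arrays.asList("A", "B");
        System.out.println(allocate("A", queues, consumers)); // [1, 3, 5]
        System.out.println(allocate("B", queues, consumers)); // [2, 4]
    }
}
```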

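3.2.5 AllocateMessageQueueAveragely: average allocation (the default strategy)

/**
 * Average Hashing queue algorithm
 * Average allocation; this is the default strategy.
 * Spreads the message queues as evenly as possible over all consumers; leftover queues go to the consumers at the front.
 * Queues are assigned in contiguous blocks: one consumer's block is complete before the next consumer's block starts.
 */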
public class AllocateMessageQueueAveragely implements AllocateMessageQueueStrategy {
    private final InternalLogger log = ClientLogger.getLog();

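    /**
     * Divides the number of message queues by the number of consumers. The quotient is the number of queues every consumer gets; the remainder is handed out, one queue each, to the consumers at the front.
     * For example, with consumers A and B and five message queues 1, 2, 3, 4, 5:
     * A gets 1, 2, 3 and B gets 4, 5.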
     * @param consumerGroup current consumer group
     * @param currentCID current consumer id
     * @param mqAll message queue set in current topic
     * @param cidAll consumer set in current consumer group
     * @return
     */
    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        // Validate the arguments
        if (currentCID == null || currentCID.length() < 1) {
            throw new IllegalArgumentException("currentCID is empty");
        }
        if (mqAll == null || mqAll.isEmpty()) {
            throw new IllegalArgumentException("mqAll is null or mqAll empty");
        }
        if (cidAll == null || cidAll.isEmpty()) {
            throw new IllegalArgumentException("cidAll is null or cidAll empty");
        }

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!cidAll.contains(currentCID)) {
            log.info("[BUG] ConsumerGroup: {} The consumerId: {} not in cidAll: {}",
                consumerGroup,
                currentCID,
                cidAll);
            return result;
        }

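        // Index of currentCID in the consumer id list
        int index = cidAll.indexOf(currentCID);
        // Remainder of the even split; if it is greater than 0, some consumers get one queue more than others
        int mod = mqAll.size() % cidAll.size();
        // Number of queues for the current consumer
        // 1. If there are no more queues than consumers, each consumer gets at most one queue, so use 1 (range below clamps it further); otherwise every consumer gets at least one queue and the size is computed
        // 2. If mod > 0 and the consumer's index is less than mod, the consumer gets the average count + 1, otherwise the average count; consumers whose index falls within the remainder get one extra queue
        int averageSize =
            mqAll.size() <= cidAll.size() ? 1 : (mod > 0 && index < mod ? mqAll.size() / cidAll.size()
                + 1 : mqAll.size() / cidAll.size());
        // If mod > 0 and the consumer's index is less than mod, the start index is index * averageSize, otherwise index * averageSize + mod
        int startIndex = (mod > 0 && index < mod) ? index * averageSize : index * averageSize + mod;
        // Final number of queues to assign. The minimum is taken because some consumers get fewer queues, or none at all
        int range = Math.min(averageSize, mqAll.size() - startIndex);
        // Assign the queues in order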
        for (int i = 0; i < range; i++) {
            result.add(mqAll.get((startIndex + i) % mqAll.size()));
        }
        return result;
    }

    @Override
    public String getName() {
        return "AVG";
    }
}
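
The averageSize, startIndex and range formulas are dense, so here is a worked sketch that applies the same arithmetic to plain strings. `AverageAllocationSketch` is a hypothetical name; the formulas are copied from the listing, with the nested ternary unfolded into named steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AverageAllocationSketch {
    public static List<String> allocate(String currentCid, List<String> mqAll, List<String> cidAll) {
        List<String> result = new ArrayList<>();
        int index = cidAll.indexOf(currentCid);
        if (index < 0) {
            return result;
        }
        int mod = mqAll.size() % cidAll.size();
        boolean getsExtra = mod > 0 && index < mod;
        int quotient = mqAll.size() / cidAll.size();
        // At most one queue each when queues <= consumers, otherwise quotient (+1 inside the remainder).
        int averageSize = mqAll.size() <= cidAll.size() ? 1 : (getsExtra ? quotient + 1 : quotient);
        int startIndex = getsExtra ? index * averageSize : index * averageSize + mod;
        // A negative range means this consumer is past the end and gets nothing.
        int range = Math.min(averageSize, mqAll.size() - startIndex);
        for (int i = 0; i < range; i++) {
            result.add(mqAll.get((startIndex + i) % mqAll.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> five = Arrays.asList("1", "2", "3", "4", "5");
        List<String> ab = Arrays.asList("A", "B");
        System.out.println(allocate("A", five, ab)); // [1, 2, 3]
        System.out.println(allocate("B", five, ab)); // [4, 5]

        // Fewer queues than consumers: the last consumer gets nothing.
        List<String> two = Arrays.asList("1", "2");
        List<String> abc = Arrays.asList("A", "B", "C");
        System.out.println(allocate("C", two, abc)); // []
    }
}
```

For consumer C in the second case, startIndex is 2 * 1 + 2 = 4 and range is min(1, 2 - 4) = -2, so the loop never runs; this is the situation the Math.min call guards against.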

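3.2.6 AllocateMachineRoomNearby: same-machine-room-first allocation

/**
 * Nearby machine-room allocation. Consumers load-balance over the MessageQueues of their own machine room.
 * In addition, for a machine room that has message queues but no consumers, its queues are shared among all consumers.
 * The actual allocation is delegated to another AllocateMessageQueueStrategy implementation passed in at construction.
 * Rough logic:
 * 1. Group the message queues by machine room, and group the consumers by machine room.
 * 2. Allocate the mqs deployed in the same machine room as the current consumer: queues and consumers in the same machine room are matched with each other, using the allocateMessageQueueStrategy passed in.
 * 3. If a machine room that has message queues has no consumer, its queues are allocated among all current consumers, again using the allocateMessageQueueStrategy passed in.
 */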
public class AllocateMachineRoomNearby implements AllocateMessageQueueStrategy {
    private final InternalLogger log = ClientLogger.getLog();

    /**
     * The strategy that performs the actual queue allocation.
     */
    private final AllocateMessageQueueStrategy allocateMessageQueueStrategy;//actual allocate strategy
    /**
     * Machine room resolver; derives the machine room name from a clientID or a brokerName.
     */
    private final MachineRoomResolver machineRoomResolver;

    public AllocateMachineRoomNearby(AllocateMessageQueueStrategy allocateMessageQueueStrategy,
        MachineRoomResolver machineRoomResolver) throws NullPointerException {
        if (allocateMessageQueueStrategy == null) {
            throw new NullPointerException("allocateMessageQueueStrategy is null");
        }

        if (machineRoomResolver == null) {
            throw new NullPointerException("machineRoomResolver is null");
        }

        this.allocateMessageQueueStrategy = allocateMessageQueueStrategy;
        this.machineRoomResolver = machineRoomResolver;
    }

    @Override
    public List<MessageQueue> allocate(String consumerGroup, String currentCID, List<MessageQueue> mqAll,
        List<String> cidAll) {
        // Validate the arguments
        if (currentCID == null || currentCID.length() < 1) {
            throw new IllegalArgumentException("currentCID is empty");
        }
        if (mqAll == null || mqAll.isEmpty()) {
            throw new IllegalArgumentException("mqAll is null or mqAll empty");
        }
        if (cidAll == null || cidAll.isEmpty()) {
            throw new IllegalArgumentException("cidAll is null or cidAll empty");
        }

        List<MessageQueue> result = new ArrayList<MessageQueue>();
        if (!cidAll.contains(currentCID)) {
            log.info("[BUG] ConsumerGroup: {} The consumerId: {} not in cidAll: {}",
                consumerGroup,
                currentCID,
                cidAll);
            return result;
        }

        //group mq by machine room
        Map<String/*machine room */, List<MessageQueue>> mr2Mq = new TreeMap<String, List<MessageQueue>>();
        for (MessageQueue mq : mqAll) {
            // Resolve the machine room the broker is deployed in
            String brokerMachineRoom = machineRoomResolver.brokerDeployIn(mq);
            if (StringUtils.isNoneEmpty(brokerMachineRoom)) {
                if (mr2Mq.get(brokerMachineRoom) == null) {
                    // First queue of this machine room: create its list
                    mr2Mq.put(brokerMachineRoom, new ArrayList<MessageQueue>());
                }
                // Add the message queue
                mr2Mq.get(brokerMachineRoom).add(mq);
            } else {
                throw new IllegalArgumentException("Machine room is null for mq " + mq);
            }
        }

        //group consumer by machine room
        Map<String/*machine room */, List<String/*clientId*/>> mr2c = new TreeMap<String, List<String>>();
        for (String cid : cidAll) {
            // Resolve the machine room the consumer is deployed in
            String consumerMachineRoom = machineRoomResolver.consumerDeployIn(cid);
            if (StringUtils.isNoneEmpty(consumerMachineRoom)) {
                if (mr2c.get(consumerMachineRoom) == null) {
                    // First consumer of this machine room: create its list
                    mr2c.put(consumerMachineRoom, new ArrayList<String>());
                }
                // Add the consumer
                mr2c.get(consumerMachineRoom).add(cid);
            } else {
                throw new IllegalArgumentException("Machine room is null for consumer id " + cid);
            }
        }

        List<MessageQueue> allocateResults = new ArrayList<MessageQueue>();

        //1.allocate the mq that deploy in the same machine room with the current consumer
        // Resolve the machine room of the current consumer
        String currentMachineRoom = machineRoomResolver.consumerDeployIn(currentCID);
        // Remove and get the queues of the current consumer's machine room
        List<MessageQueue> mqInThisMachineRoom = mr2Mq.remove(currentMachineRoom);
        // Get the consumers of the current consumer's machine room
        List<String> consumerInThisMachineRoom = mr2c.get(currentMachineRoom);
        if (mqInThisMachineRoom != null && !mqInThisMachineRoom.isEmpty()) {
            allocateResults.addAll(allocateMessageQueueStrategy.allocate(consumerGroup, currentCID, mqInThisMachineRoom, consumerInThisMachineRoom));
        }

        //2.allocate the rest mq to each machine room if there are no consumer alive in that machine room
        for (Entry<String, List<MessageQueue>> machineRoomEntry : mr2Mq.entrySet()) {
            // If a machine room that has message queues has no consumer, its queues are allocated among all current consumers
            if (!mr2c.containsKey(machineRoomEntry.getKey())) { // no alive consumer in the corresponding machine room, so all consumers share these queues
                allocateResults.addAll(allocateMessageQueueStrategy.allocate(consumerGroup, currentCID, machineRoomEntry.getValue(), cidAll));
            }
        }

        return allocateResults;
    }

    @Override
    public String getName() {
        return "MACHINE_ROOM_NEARBY" + "-" + allocateMessageQueueStrategy.getName();
    }

    /**
     * A resolver object to determine which machine room do the message queues or clients are deployed in.
     *
     * AllocateMachineRoomNearby will use the results to group the message queues and clients by machine room.
     *
     * The result returned from the implemented method CANNOT be null.
     */
    public interface MachineRoomResolver {
        String brokerDeployIn(MessageQueue messageQueue);

        String consumerDeployIn(String clientID);
    }
}
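
The two-phase logic (same-room queues first, then orphaned rooms shared by everyone) can be tried out with the self-contained sketch below. It is not the RocketMQ class: `NearbySketch` is a hypothetical name, the "room@name" naming convention replaces a real MachineRoomResolver, and a simple round-robin function stands in for the delegated AllocateMessageQueueStrategy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NearbySketch {
    // Assumption for this sketch: names look like "room@name", and every name contains '@'.
    static String roomOf(String name) {
        return name.substring(0, name.indexOf('@'));
    }

    // Stand-in for the delegated strategy: plain round-robin.
    static List<String> roundRobin(String currentCid, List<String> mqs, List<String> cids) {
        List<String> result = new ArrayList<>();
        int index = cids.indexOf(currentCid);
        for (int i = index; index >= 0 && i < mqs.size(); i += cids.size()) {
            result.add(mqs.get(i));
        }
        return result;
    }

    public static List<String> allocate(String currentCid, List<String> mqAll, List<String> cidAll) {
        Map<String, List<String>> roomToQueues = new TreeMap<>();
        for (String mq : mqAll) {
            roomToQueues.computeIfAbsent(roomOf(mq), k -> new ArrayList<>()).add(mq);
        }
        Map<String, List<String>> roomToConsumers = new TreeMap<>();
        for (String cid : cidAll) {
            roomToConsumers.computeIfAbsent(roomOf(cid), k -> new ArrayList<>()).add(cid);
        }

        List<String> result = new ArrayList<>();
        // Phase 1: queues in my own room are shared only among consumers of that room.
        String myRoom = roomOf(currentCid);
        List<String> localQueues = roomToQueues.remove(myRoom);
        if (localQueues != null && !localQueues.isEmpty()) {
            result.addAll(roundRobin(currentCid, localQueues, roomToConsumers.get(myRoom)));
        }
        // Phase 2: queues in rooms without any consumer are shared among all consumers.
        for (Map.Entry<String, List<String>> entry : roomToQueues.entrySet()) {
            if (!roomToConsumers.containsKey(entry.getKey())) {
                result.addAll(roundRobin(currentCid, entry.getValue(), cidAll));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> queues = Arrays.asList("A@q0", "A@q1", "B@q0", "B@q1", "C@q0", "C@q1");
        List<String> consumers = Arrays.asList("A@c1", "B@c1");
        System.out.println(allocate("A@c1", queues, consumers)); // [A@q0, A@q1, C@q0]
        System.out.println(allocate("B@c1", queues, consumers)); // [B@q0, B@q1, C@q1]
    }
}
```

Room C has queues but no consumer, so its two queues are split between the consumers of rooms A and B, while each consumer keeps all queues of its own room.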

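3.3 What triggers a rebalance

Three situations trigger a Consumer to load-balance, or rebalance:

  1. RebalanceService is a thread task started by MQClientInstance; it runs a rebalance automatically every 20s.
  2. Rebalances triggered by the Broker:
    2.1 After receiving a heartbeat, if the Broker finds a new consumer connection, or finds that a consumer has subscribed to a new topic or removed a topic subscription, it sends a request with code NOTIFY_CONSUMER_IDS_CHANGED to all Consumers in that group, asking them to rebalance.
    2.2 If a client connection raises an exception event (EXCEPTION), a close event (CLOSE) or an idle event (IDLE), the Broker likewise sends a rebalance request to all consumers in the consumer group. The handling is entered through ClientHousekeepingService, which calls doChannelCloseEvent.
  3. When a new Consumer starts, it calls rebalanceImmediately to wake up RebalanceService and rebalance right away.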
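
All three triggers end in the same place: the rebalance thread either times out after its interval or is woken up early. A minimal sketch of that wait-or-wake pattern follows. It is not the RocketMQ RebalanceService; `RebalanceTriggerSketch`, its Semaphore-based implementation and its method names are assumptions chosen to keep the example short and deterministic.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class RebalanceTriggerSketch {
    // Each permit represents a pending "rebalance now" request.
    private final Semaphore wakeups = new Semaphore(0);

    // Called on consumer startup, or when the broker reports that the consumer ids changed.
    public void wakeup() {
        wakeups.release();
    }

    /**
     * Blocks until the next rebalance round is due.
     * Returns true if the round was triggered by wakeup(), false if the periodic interval elapsed.
     */
    public boolean awaitNextRound(long intervalMillis) {
        try {
            boolean woken = wakeups.tryAcquire(intervalMillis, TimeUnit.MILLISECONDS);
            // Several wakeups that arrived in the meantime collapse into one round.
            wakeups.drainPermits();
            return woken;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        RebalanceTriggerSketch trigger = new RebalanceTriggerSketch();
        trigger.wakeup();
        trigger.wakeup();
        System.out.println(trigger.awaitNextRound(20_000)); // true: returns at once, woken early
        System.out.println(trigger.awaitNextRound(50));     // false: nothing pending, interval elapsed
    }
}
```

A service loop would call awaitNextRound with the 20s interval and run one rebalance after every return, regardless of which trigger fired.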
