RocketMQ Source Code Analysis: Transactional Messages

Reference: the official design document "Transactional Message Design"

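Contents

    • Transactional Message Flow Diagram
    • Step-by-Step Flow
    • Key Design Points
    • Detailed Steps
      • Step 1: Sending the Transactional Message
      • Step 2: The Broker Writes the Half Message
      • Step 3: Executing the Local Transaction
      • Step 4: The Broker Commits or Rolls Back the Half Message
      • Step 5: Periodic Broker Check
      • Step 6: The Producer Checks the Local Transaction
      • Step 7: The Broker Commits or Rolls Back the Half Message Again
    • Summary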

Transactional Message Flow Diagram

[Figure: transactional message flow diagram]

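Step-by-Step Flow

  1. The Producer sends a half message to the Broker.
  2. The Broker responds with the result of writing the message.
  3. The Producer executes the local transaction based on the send result (if the write failed, the half message is not visible to the business side and the local logic is not executed).
  4. The Producer issues a Commit or Rollback according to the local transaction state (a Commit generates the message index, which makes the message visible to consumers).
  5. For transactional messages that received neither Commit nor Rollback (messages in the pending state), the Broker initiates a "check".
  6. The Producer receives the check request and looks up the state of the corresponding local transaction.
  7. Based on the local transaction state, the Producer issues Commit or Rollback again.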

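Key Design Points

  1. Half messages are stored in the RMQ_SYS_TRANS_HALF_TOPIC topic and are invisible to consumers.
  2. A scheduled task consumes these half messages and asks the Producer for the transaction outcome. Each time it does so, it writes a new copy of the message to the HALF topic, and the consume offset of the HALF topic's queue advances.
  3. Once a half message has been resolved as Commit or Rollback, a result message (whose body is the half message's offset in the HALF consume queue) is written to the RMQ_SYS_TRANS_OP_HALF_TOPIC topic.
  4. Half messages are never deleted. The presence of a result message for a half message in the OP_HALF topic means that the half message has been processed.
  5. A Commit additionally restores the original message and sends it to its real topic, so that consumers can consume it.
  6. If the state is still undetermined after more than 15 checks, or the CommitLog file has expired (after 72h), the message is rolled back (in practice, discarded).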
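
The bookkeeping described in these points can be modelled in a few lines. The following is a minimal in-memory sketch, not RocketMQ code: `HalfOpLedger` and all of its members are hypothetical names, and plain lists stand in for the two internal topics. It only illustrates that a half message is never removed, and that an Op record carrying its queue offset is what marks it as resolved.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the Half queue and the Op queue.
class HalfOpLedger {
    private final List<String> halfQueue = new ArrayList<>(); // append-only, like a consume queue
    private final List<Long> opQueue = new ArrayList<>();     // each entry is a half queue offset

    // Appends a half message and returns its queue offset.
    long putHalf(String body) {
        halfQueue.add(body);
        return halfQueue.size() - 1;
    }

    // Records that the half message at this offset was committed or rolled back.
    void putOp(long halfOffset) {
        opQueue.add(halfOffset);
    }

    // Offsets of half messages without a matching Op record, i.e. candidates for a check.
    List<Long> pendingOffsets() {
        Set<Long> resolved = new HashSet<>(opQueue);
        List<Long> pending = new ArrayList<>();
        for (long offset = 0; offset < halfQueue.size(); offset++) {
            if (!resolved.contains(offset)) {
                pending.add(offset);
            }
        }
        return pending;
    }
}
```

In this model, "deleting" a half message means nothing more than calling `putOp` with its offset; the half queue itself only ever grows.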

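Next, we walk through the source code step by step, following this flow.

Detailed Steps

Step 1: Sending the Transactional Message

org.apache.rocketmq.client.impl.producer.DefaultMQProducerImpl#sendMessageInTransaction
Before sending, two properties are set on the message: one marks it as a transactional message, the other stores the producer group.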

MessageAccessor.putProperty(msg, MessageConst.PROPERTY_TRANSACTION_PREPARED, "true");
MessageAccessor.putProperty(msg, MessageConst.PROPERTY_PRODUCER_GROUP, this.defaultMQProducer.getProducerGroup());

The message is sent synchronously:

this.sendDefaultImpl(msg, CommunicationMode.SYNC, null, timeout);

The transaction bit is then set in the message's system flag:

final String tranMsg = msg.getProperty(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (tranMsg != null && Boolean.parseBoolean(tranMsg)) {
     
    sysFlag |= MessageSysFlag.TRANSACTION_PREPARED_TYPE;
}
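
The transaction type occupies two bits of the system flag, which is why it can be set with `|=` and later reset without disturbing other bits. The sketch below is standalone and hypothetical (`TxFlagSketch` is not a RocketMQ class). The constant values are an assumption about how `MessageSysFlag` lays out these bits and should be verified against the RocketMQ version in use.

```java
// Standalone sketch; the constant values are assumed to mirror MessageSysFlag.
class TxFlagSketch {
    static final int TRANSACTION_NOT_TYPE = 0;
    static final int TRANSACTION_PREPARED_TYPE = 0x1 << 2;
    static final int TRANSACTION_COMMIT_TYPE = 0x2 << 2;
    static final int TRANSACTION_ROLLBACK_TYPE = 0x3 << 2; // also serves as the two-bit mask

    // Extracts the transaction type bits from a system flag.
    static int getTransactionValue(int sysFlag) {
        return sysFlag & TRANSACTION_ROLLBACK_TYPE;
    }

    // Clears the transaction bits, then sets the given type; all other bits are left untouched.
    static int resetTransactionValue(int sysFlag, int type) {
        return (sysFlag & ~TRANSACTION_ROLLBACK_TYPE) | type;
    }
}
```

Under this layout, resetting to `TRANSACTION_NOT_TYPE` simply clears both bits, which is what the Broker does before storing the half message.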

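Step 2: The Broker Writes the Half Message

org.apache.rocketmq.broker.processor.SendMessageProcessor
The processor receives the send request and checks whether it carries a transactional message.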

String traFlag = oriProps.get(MessageConst.PROPERTY_TRANSACTION_PREPARED);
if (traFlag != null && Boolean.parseBoolean(traFlag)) {
    if (this.brokerController.getBrokerConfig().isRejectTransactionMessage()) {
        // The broker is configured to reject transactional messages: return immediately
        return response;
    }
    // Transactional messages take a dedicated path
    putMessageResult = this.brokerController.getTransactionalMessageService().prepareMessage(msgInner);
} else {
    putMessageResult = this.brokerController.getMessageStore().putMessage(msgInner);
}

org.apache.rocketmq.broker.transaction.queue.TransactionalMessageBridge#putHalfMessage
The message is converted into a Half message and stored.

public PutMessageResult putHalfMessage(MessageExtBrokerInner messageInner) {
     
    return store.putMessage(parseHalfMessageInner(messageInner));
}

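The conversion moves the real topic and queue ID into the message properties, resets the transaction bits of the system flag, sets the topic to RMQ_SYS_TRANS_HALF_TOPIC and sets the queue ID to 0.
The message is then stored in the CommitLog, where consumers cannot consume it.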

private MessageExtBrokerInner parseHalfMessageInner(MessageExtBrokerInner msgInner) {
     
    MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_REAL_TOPIC, msgInner.getTopic());
    MessageAccessor.putProperty(msgInner, MessageConst.PROPERTY_REAL_QUEUE_ID,
        String.valueOf(msgInner.getQueueId()));
    msgInner.setSysFlag(
        MessageSysFlag.resetTransactionValue(msgInner.getSysFlag(), MessageSysFlag.TRANSACTION_NOT_TYPE));
    msgInner.setTopic(TransactionalMessageUtil.buildHalfTopic());
    msgInner.setQueueId(0);
    msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgInner.getProperties()));
    return msgInner;
}
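
To make the topic swap concrete, here is a minimal standalone sketch of parking the real destination in the properties and of reversing the swap later. `SimpleMessage` and `HalfTopicSketch` are hypothetical types, and the two property keys are placeholders chosen for this illustration rather than values read from `MessageConst`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplified message: only the fields needed for the illustration.
class SimpleMessage {
    String topic;
    int queueId;
    final Map<String, String> properties = new HashMap<>();

    SimpleMessage(String topic, int queueId) {
        this.topic = topic;
        this.queueId = queueId;
    }
}

class HalfTopicSketch {
    // Placeholder property keys for this sketch.
    static final String REAL_TOPIC = "REAL_TOPIC";
    static final String REAL_QUEUE_ID = "REAL_QID";
    static final String HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC";

    // Parks the real destination in the properties and redirects the message to the half topic.
    static SimpleMessage toHalf(SimpleMessage msg) {
        msg.properties.put(REAL_TOPIC, msg.topic);
        msg.properties.put(REAL_QUEUE_ID, String.valueOf(msg.queueId));
        msg.topic = HALF_TOPIC;
        msg.queueId = 0;
        return msg;
    }

    // Rebuilds a message addressed to the original destination and drops the parked keys.
    static SimpleMessage restore(SimpleMessage half) {
        SimpleMessage original = new SimpleMessage(
            half.properties.get(REAL_TOPIC),
            Integer.parseInt(half.properties.get(REAL_QUEUE_ID)));
        original.properties.putAll(half.properties);
        original.properties.remove(REAL_TOPIC);
        original.properties.remove(REAL_QUEUE_ID);
        return original;
    }
}
```

Because consumers subscribe to the real topic, a message sitting under the half topic is never dispatched to a queue they read from.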

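Step 3: Executing the Local Transaction

The Producer calls the send API synchronously and waits for the Broker's result. If an exception occurs, it is rethrown and the local transaction is not executed.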

SendResult sendResult = null;
try {
     
    sendResult = this.send(msg);
} catch (Exception e) {
     
    throw new MQClientException("send message Exception", e);
}

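If the send status is SEND_OK, the local transaction is executed. If it is a failure status (flush-to-disk timeout, slave sync timeout, or slave not available), the local transaction is not executed and the local transaction state is set to rollback.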

LocalTransactionState localTransactionState = LocalTransactionState.UNKNOW;
switch (sendResult.getSendStatus()) {
     
    case SEND_OK: {
     
        try {
     
            ...
            localTransactionState = transactionListener.executeLocalTransaction(msg, arg);
            // The local transaction returned no state: default to UNKNOW
            if (null == localTransactionState) {
                localTransactionState = LocalTransactionState.UNKNOW;
            }
            ...
    }
    break;
    case FLUSH_DISK_TIMEOUT:
    case FLUSH_SLAVE_TIMEOUT:
    case SLAVE_NOT_AVAILABLE:
        localTransactionState = LocalTransactionState.ROLLBACK_MESSAGE;
        break;
    default:
        break;
}
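
The mapping from send status to local transaction state can be isolated into a small function. This is a simplified standalone sketch: `LocalTxDecisionSketch` is a hypothetical class with its own copies of the two enums, and the local transaction is passed in as a `Supplier`. How an exception thrown by the local transaction is handled is elided in the excerpt above; the sketch assumes the state stays UNKNOW in that case.

```java
import java.util.function.Supplier;

class LocalTxDecisionSketch {
    enum SendStatus { SEND_OK, FLUSH_DISK_TIMEOUT, FLUSH_SLAVE_TIMEOUT, SLAVE_NOT_AVAILABLE }
    enum LocalTransactionState { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

    // Runs the local transaction only when the half message was stored cleanly.
    static LocalTransactionState decide(SendStatus status, Supplier<LocalTransactionState> localTx) {
        switch (status) {
            case SEND_OK:
                try {
                    LocalTransactionState state = localTx.get();
                    return state != null ? state : LocalTransactionState.UNKNOW;
                } catch (RuntimeException e) {
                    // Assumption: the outcome is unclear, so it is left to a later broker check.
                    return LocalTransactionState.UNKNOW;
                }
            case FLUSH_DISK_TIMEOUT:
            case FLUSH_SLAVE_TIMEOUT:
            case SLAVE_NOT_AVAILABLE:
                // The half message may not be durable: do not touch business data.
                return LocalTransactionState.ROLLBACK_MESSAGE;
            default:
                return LocalTransactionState.UNKNOW;
        }
    }
}
```

The point of the ordering is that business data is only modified after the half message is known to be stored, so a rollback never has to undo local work.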

Depending on the outcome of the local transaction, a Commit or Rollback request is sent to the Broker.

try {
     
    this.endTransaction(sendResult, localTransactionState, localException);
} catch (Exception e) {
     
}

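The RemotingCommand request code is END_TRANSACTION, and the request is sent one-way. A failed send does no harm, because the Broker will later check the local transaction state.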

EndTransactionRequestHeader requestHeader = new EndTransactionRequestHeader();
requestHeader.setTransactionId(transactionId);
requestHeader.setCommitLogOffset(id.getOffset());
switch (localTransactionState) {
     
    case COMMIT_MESSAGE:
        requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_COMMIT_TYPE);
        break;
    case ROLLBACK_MESSAGE:
        requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_ROLLBACK_TYPE);
        break;
    case UNKNOW:
        requestHeader.setCommitOrRollback(MessageSysFlag.TRANSACTION_NOT_TYPE);
        break;
    default:
        break;
}
this.mQClientFactory.getMQClientAPIImpl().endTransactionOneway(brokerAddr, requestHeader, remark,
            this.defaultMQProducer.getSendMsgTimeout());

Step 4: The Broker Commits or Rolls Back the Half Message

The Broker receives the END_TRANSACTION command:
org.apache.rocketmq.broker.processor.EndTransactionProcessor#processRequest
Requests that are neither Commit nor Rollback are ignored.

switch (requestHeader.getCommitOrRollback()) {
     
    case MessageSysFlag.TRANSACTION_NOT_TYPE: {
     
        return null;
    }
    case MessageSysFlag.TRANSACTION_COMMIT_TYPE: {
     
        break;
    }
    case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE: {
     
        break;
    }
    default:
        return null;
}

Handling Commit or Rollback:

OperationResult result = new OperationResult();
if (MessageSysFlag.TRANSACTION_COMMIT_TYPE == requestHeader.getCommitOrRollback()) {
    // Fetch the Half message
    result = this.brokerController.getTransactionalMessageService().commitMessage(requestHeader);
    // Was it found?
    if (result.getResponseCode() == ResponseCode.SUCCESS) {
        // Verify that the Half message matches the request
        RemotingCommand res = checkPrepareMessage(result.getPrepareMessage(), requestHeader);
        if (res.getCode() == ResponseCode.SUCCESS) {
            // Rebuild the original message from the Half message
            MessageExtBrokerInner msgInner = endMessageTransaction(result.getPrepareMessage());
            msgInner.setSysFlag(MessageSysFlag.resetTransactionValue(msgInner.getSysFlag(), requestHeader.getCommitOrRollback()));
            msgInner.setQueueOffset(requestHeader.getTranStateTableOffset());
            msgInner.setPreparedTransactionOffset(requestHeader.getCommitLogOffset());
            msgInner.setStoreTimestamp(result.getPrepareMessage().getStoreTimestamp());
            // Write the original message to the CommitLog again so that consumers can consume it normally
            RemotingCommand sendResult = sendFinalMessage(msgInner);
            if (sendResult.getCode() == ResponseCode.SUCCESS) {
                // Stored successfully: delete the Half message
                this.brokerController.getTransactionalMessageService().deletePrepareMessage(result.getPrepareMessage());
            }
            // On failure, wait for a later check
            return sendResult;
        }
        return res;
    }
} else if (MessageSysFlag.TRANSACTION_ROLLBACK_TYPE == requestHeader.getCommitOrRollback()) {
    result = this.brokerController.getTransactionalMessageService().rollbackMessage(requestHeader);
    if (result.getResponseCode() == ResponseCode.SUCCESS) {
        RemotingCommand res = checkPrepareMessage(result.getPrepareMessage(), requestHeader);
        if (res.getCode() == ResponseCode.SUCCESS) {
            // Delete the Half message
            this.brokerController.getTransactionalMessageService().deletePrepareMessage(result.getPrepareMessage());
        }
        return res;
    }
}

Fetching the Half message:

public OperationResult commitMessage(EndTransactionRequestHeader requestHeader) {
     
    return getHalfMessageByOffset(requestHeader.getCommitLogOffset());
}

public OperationResult rollbackMessage(EndTransactionRequestHeader requestHeader) {
     
    return getHalfMessageByOffset(requestHeader.getCommitLogOffset());
}

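The fetched Half message is then validated: its producer group, transaction state table offset (the queue offset) and CommitLog offset must match the values carried in the request.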

private RemotingCommand checkPrepareMessage(MessageExt msgExt, EndTransactionRequestHeader requestHeader) {
     
    final RemotingCommand response = RemotingCommand.createResponseCommand(null);
    if (msgExt != null) {
     
        final String pgroupRead = msgExt.getProperty(MessageConst.PROPERTY_PRODUCER_GROUP);
        if (!pgroupRead.equals(requestHeader.getProducerGroup())) {
     
            ...
        }

        if (msgExt.getQueueOffset() != requestHeader.getTranStateTableOffset()) {
     
            ...
        }

        if (msgExt.getCommitLogOffset() != requestHeader.getCommitLogOffset()) {
     
            ...
        }
    } else {
     
        response.setCode(ResponseCode.SYSTEM_ERROR);
        response.setRemark("Find prepared transaction message failed");
        return response;
    }
    response.setCode(ResponseCode.SUCCESS);
    return response;
}

To commit a message, the original message must first be restored:

private MessageExtBrokerInner endMessageTransaction(MessageExt msgExt) {
     
    MessageExtBrokerInner msgInner = new MessageExtBrokerInner();
    // Restore the topic and queue ID saved in the properties
    msgInner.setTopic(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_TOPIC));
    msgInner.setQueueId(Integer.parseInt(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_QUEUE_ID)));
    msgInner.setBody(msgExt.getBody());
    msgInner.setFlag(msgExt.getFlag());
    msgInner.setBornTimestamp(msgExt.getBornTimestamp());
    msgInner.setBornHost(msgExt.getBornHost());
    msgInner.setStoreHost(msgExt.getStoreHost());
    msgInner.setReconsumeTimes(msgExt.getReconsumeTimes());
    msgInner.setWaitStoreMsgOK(false);
    msgInner.setTransactionId(msgExt.getUserProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX));
    msgInner.setSysFlag(msgExt.getSysFlag());
    TopicFilterType topicFilterType =
        (msgInner.getSysFlag() & MessageSysFlag.MULTI_TAGS_FLAG) == MessageSysFlag.MULTI_TAGS_FLAG ? TopicFilterType.MULTI_TAG
            : TopicFilterType.SINGLE_TAG;
    long tagsCodeValue = MessageExtBrokerInner.tagsString2tagsCode(topicFilterType, msgInner.getTags());
    msgInner.setTagsCode(tagsCodeValue);
    MessageAccessor.setProperties(msgInner, msgExt.getProperties());
    msgInner.setPropertiesString(MessageDecoder.messageProperties2String(msgExt.getProperties()));
    // Remove the saved topic and queue ID from the properties
    MessageAccessor.clearProperty(msgInner, MessageConst.PROPERTY_REAL_TOPIC);
    MessageAccessor.clearProperty(msgInner, MessageConst.PROPERTY_REAL_QUEUE_ID);
    return msgInner;
}

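On Commit, the original message is stored in the CommitLog again and the store result is returned. Flush timeouts and slave-related statuses are still reported as success.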

private RemotingCommand sendFinalMessage(MessageExtBrokerInner msgInner) {
     
    final RemotingCommand response = RemotingCommand.createResponseCommand(null);
    final PutMessageResult putMessageResult = this.brokerController.getMessageStore().putMessage(msgInner);
    if (putMessageResult != null) {
     
        switch (putMessageResult.getPutMessageStatus()) {
     
            // Success
            case PUT_OK:
            case FLUSH_DISK_TIMEOUT:
            case FLUSH_SLAVE_TIMEOUT:
            case SLAVE_NOT_AVAILABLE:
                response.setCode(ResponseCode.SUCCESS);
                response.setRemark(null);
                break;
            // Failed
            case CREATE_MAPEDFILE_FAILED:
            case MESSAGE_ILLEGAL:
            case PROPERTIES_SIZE_EXCEEDED:
            case SERVICE_NOT_AVAILABLE:
            case OS_PAGECACHE_BUSY:
            case UNKNOWN_ERROR:
            default:
                response.setCode(ResponseCode.SYSTEM_ERROR);
                response.setRemark("UNKNOWN_ERROR DEFAULT");
                break;
        }
        return response;
    } else {
     
        response.setCode(ResponseCode.SYSTEM_ERROR);
        response.setRemark("store putMessage return null");
    }
    return response;
}

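Once the original message has been stored successfully, the Half message is deleted (logically, by writing an Op message) and the transactional message is complete.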

public boolean deletePrepareMessage(MessageExt msgExt) {
     
    if (this.transactionalMessageBridge.putOpMessage(msgExt, TransactionalMessageUtil.REMOVETAG)) {
     
        return true;
    } else {
     
        return false;
    }
}
public boolean putOpMessage(MessageExt messageExt, String opType) {
     
    MessageQueue messageQueue = new MessageQueue(messageExt.getTopic(),
        this.brokerController.getBrokerConfig().getBrokerName(), messageExt.getQueueId());
    if (TransactionalMessageUtil.REMOVETAG.equals(opType)) {
     
        return addRemoveTagInTransactionOp(messageExt, messageQueue);
    }
    return true;
}

The body of the Op message is the Half message's consume queue offset.

private boolean addRemoveTagInTransactionOp(MessageExt messageExt, MessageQueue messageQueue) {
     
    Message message = new Message(TransactionalMessageUtil.buildOpTopic(), TransactionalMessageUtil.REMOVETAG,
        String.valueOf(messageExt.getQueueOffset()).getBytes(TransactionalMessageUtil.charset));
    writeOp(message, messageQueue);
    return true;
}

Step 5: Periodic Broker Check

After storing a Half message, the Broker periodically runs a check if it receives neither a Commit nor a Rollback command.

public class TransactionalMessageCheckService extends ServiceThread {
    public void run() {
        // Check interval, 60s by default
        long checkInterval = brokerController.getBrokerConfig().getTransactionCheckInterval();
        while (!this.isStopped()) {
            this.waitForRunning(checkInterval);
        }
    }

    @Override
    protected void onWaitEnd() {
        // By default a message is only checked once it has been stored for more than 6s
        long timeout = brokerController.getBrokerConfig().getTransactionTimeOut();
        // Maximum number of checks per message; after 15 (the default) the Half message is discarded
        int checkMax = brokerController.getBrokerConfig().getTransactionCheckMax();
        this.brokerController.getTransactionalMessageService().check(timeout, checkMax, this.brokerController.getTransactionalMessageCheckListener());
    }
}

The check logic itself:

public void check(long transactionTimeout, int transactionCheckMax,
    AbstractTransactionalMessageCheckListener listener) {
    try {
        String topic = MixAll.RMQ_SYS_TRANS_HALF_TOPIC;
        // Fetch all Half consume queues
        Set<MessageQueue> msgQueues = transactionalMessageBridge.fetchMessageQueues(topic);
        if (msgQueues == null || msgQueues.size() == 0) {
            log.warn("The queue of topic is empty :" + topic);
            return;
        }
        log.debug("Check topic={}, queues={}", topic, msgQueues);
        for (MessageQueue messageQueue : msgQueues) {
            long startTime = System.currentTimeMillis();
            // Derive the Op queue from the Half queue: same broker name and queue ID, different topic
            MessageQueue opQueue = getOpQueue(messageQueue);
            // Consume offset of the Half queue
            long halfOffset = transactionalMessageBridge.fetchConsumeOffset(messageQueue);
            // Consume offset of the Op queue
            long opOffset = transactionalMessageBridge.fetchConsumeOffset(opQueue);
            log.info("Before check, the queue={} msgOffset={} opOffset={}", messageQueue, halfOffset, opOffset);
            if (halfOffset < 0 || opOffset < 0) {
                log.error("MessageQueue: {} illegal offset read: {}, op offset: {},skip this queue", messageQueue,
                    halfOffset, opOffset);
                continue;
            }
            // Op messages that have already been processed
            List<Long> doneOpOffset = new ArrayList<>();
            // Half messages that already have an Op message (half offset -> op offset)
            HashMap<Long, Long> removeMap = new HashMap<>();
            // Compare the two queues; by default 32 messages are pulled from the Op topic
            PullResult pullResult = fillOpRemoveMap(removeMap, opQueue, opOffset, halfOffset, doneOpOffset);
            if (null == pullResult) {
                log.error("The queue={} check msgOffset={} with opOffset={} failed, pullResult is null",
                    messageQueue, halfOffset, opOffset);
                continue;
            }
            // single thread
            int getMessageNullCount = 1; // number of times a null message was fetched
            long newOffset = halfOffset; // latest progress on the Half queue
            long i = halfOffset;
            while (true) {
                // Each queue is processed for at most 60s per run; the next run picks up the rest
                if (System.currentTimeMillis() - startTime > MAX_PROCESS_TIME_LIMIT) {
                    log.info("Queue={} process time reach max={}", messageQueue, MAX_PROCESS_TIME_LIMIT);
                    break;
                }
                // The message has already been resolved: skip it
                if (removeMap.containsKey(i)) {
                    log.info("Half offset {} has been committed/rolled back", i);
                    removeMap.remove(i);
                } else {
                    // Fetch the Half message
                    GetResult getResult = getHalfMsg(messageQueue, i);
                    MessageExt msgExt = getResult.getMsg();
                    if (msgExt == null) {
                        // Nothing fetched: allow one retry
                        if (getMessageNullCount++ > MAX_RETRY_COUNT_WHEN_HALF_NULL) {
                            break;
                        }
                        if (getResult.getPullResult().getPullStatus() == PullStatus.NO_NEW_MSG) {
                            log.debug("No new msg, the miss offset={} in={}, continue check={}, pull result={}", i,
                                messageQueue, getMessageNullCount, getResult.getPullResult());
                            // No more messages in this queue: stop checking it
                            break;
                        } else {
                            log.info("Illegal offset, the miss offset={} in={}, continue check={}, pull result={}",
                                i, messageQueue, getMessageNullCount, getResult.getPullResult());
                            i = getResult.getPullResult().getNextBeginOffset();
                            newOffset = i;
                            // Pull again
                            continue;
                        }
                    }

                    // Discard after more than 15 checks; skip if the message file has expired (72h)
                    if (needDiscard(msgExt, transactionCheckMax) || needSkip(msgExt)) {
                        listener.resolveDiscardMsg(msgExt);
                        // Advance the progress by one
                        newOffset = i + 1;
                        i++;
                        continue;
                    }
                    // The message was stored after this run started: stop here
                    if (msgExt.getStoreTimestamp() >= startTime) {
                        log.debug("Fresh stored. the miss offset={}, check it later, store={}", i,
                            new Date(msgExt.getStoreTimestamp()));
                        break;
                    }

                    // Time elapsed since the message was born
                    long valueOfCurrentMinusBorn = System.currentTimeMillis() - msgExt.getBornTimestamp();
                    // Only messages older than 6s (by default) are checked
                    long checkImmunityTime = transactionTimeout;
                    String checkImmunityTimeStr = msgExt.getUserProperty(MessageConst.PROPERTY_CHECK_IMMUNITY_TIME_IN_SECONDS);
                    if (null != checkImmunityTimeStr) {
                        checkImmunityTime = getImmunityTime(checkImmunityTimeStr, transactionTimeout);
                        if (valueOfCurrentMinusBorn < checkImmunityTime) {
                            if (checkPrepareQueueOffset(removeMap, doneOpOffset, msgExt)) {
                                newOffset = i + 1;
                                i++;
                                continue;
                            }
                        }
                    } else {
                        if ((0 <= valueOfCurrentMinusBorn) && (valueOfCurrentMinusBorn < checkImmunityTime)) {
                            log.debug("New arrived, the miss offset={}, check it later checkImmunity={}, born={}", i,
                                checkImmunityTime, new Date(msgExt.getBornTimestamp()));
                            break;
                        }
                    }
                    List<MessageExt> opMsg = pullResult.getMsgFoundList();
                    // A check is needed if no Op messages were pulled and the message has outlived the immunity time,
                    // or if the last pulled Op message was born more than transactionTimeout after this run started,
                    // or if the message's born timestamp lies in the future
                    boolean isNeedCheck = (opMsg == null && valueOfCurrentMinusBorn > checkImmunityTime)
                        || (opMsg != null && (opMsg.get(opMsg.size() - 1).getBornTimestamp() - startTime > transactionTimeout))
                        || (valueOfCurrentMinusBorn <= -1);
                    if (isNeedCheck) {
                        // Write the Half message to the CommitLog again: the progress moves forward and the check count is stored
                        if (!putBackHalfMsgQueue(msgExt, i)) {
                            continue;
                        }
                        // Send a check request to the producer
                        listener.resolveHalfMsg(msgExt);
                    } else {
                        // Cannot decide yet: load more Op messages
                        pullResult = fillOpRemoveMap(removeMap, opQueue, pullResult.getNextBeginOffset(), halfOffset, doneOpOffset);
                        log.info("The miss offset:{} in messageQueue:{} need to get more opMsg, result is:{}", i,
                            messageQueue, pullResult);
                        continue;
                    }
                }
                newOffset = i + 1;
                i++;
            }
            // Save the Half queue progress
            if (newOffset != halfOffset) {
                transactionalMessageBridge.updateConsumeOffset(messageQueue, newOffset);
            }
            long newOpOffset = calculateOpOffset(doneOpOffset, opOffset);
            // Save the Op queue progress
            if (newOpOffset != opOffset) {
                transactionalMessageBridge.updateConsumeOffset(opQueue, newOpOffset);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
        log.error("Check error", e);
    }

}
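
Stripped of the queue handling, the per-message decision in this loop comes down to three outcomes. The sketch below is a deliberately simplified standalone model with hypothetical names (`CheckDecisionSketch`, `Action`). It ignores the Op-message comparison and the file-expiry test, and keeps only the check-count limit and the immunity window.

```java
class CheckDecisionSketch {
    enum Action { DISCARD, WAIT, CHECK }

    // Decides what to do with one pending half message.
    // checkTimes: how often the message has already been checked; checkMax: the limit (15 by default).
    // immunityMillis: minimum age before a check is sent (6s by default).
    static Action decide(long nowMillis, long bornMillis, long immunityMillis, int checkTimes, int checkMax) {
        if (checkTimes >= checkMax) {
            return Action.DISCARD;
        }
        long age = nowMillis - bornMillis;
        // A negative age (born timestamp in the future) falls through to CHECK.
        if (age >= 0 && age < immunityMillis) {
            return Action.WAIT;
        }
        return Action.CHECK;
    }
}
```

A message that is too young is left alone so that a slow but healthy local transaction is not interrupted by a premature check.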

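To send a check request, putBackHalfMsgQueue is first called to store the Half message in the CommitLog again and advance the progress by one. Because RocketMQ writes sequentially, a message cannot truly be deleted.
A thread then performs the check callback without waiting for the result, because the Producer will send the outcome back as a separate request.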

public void resolveHalfMsg(final MessageExt msgExt) {
     
    executorService.execute(new Runnable() {
     
        @Override
        public void run() {
     
            try {
     
                sendCheckMessage(msgExt);
            } catch (Exception e) {
     
                LOGGER.error("Send check message error!", e);
            }
        }
    });
}

The request is built, and a producer is picked round-robin from the message's producer group to receive the check request.

public void sendCheckMessage(MessageExt msgExt) throws Exception {
     
    CheckTransactionStateRequestHeader checkTransactionStateRequestHeader = new CheckTransactionStateRequestHeader();
    checkTransactionStateRequestHeader.setCommitLogOffset(msgExt.getCommitLogOffset());
    checkTransactionStateRequestHeader.setOffsetMsgId(msgExt.getMsgId());
    checkTransactionStateRequestHeader.setMsgId(msgExt.getUserProperty(MessageConst.PROPERTY_UNIQ_CLIENT_MESSAGE_ID_KEYIDX));
    checkTransactionStateRequestHeader.setTransactionId(checkTransactionStateRequestHeader.getMsgId());
    checkTransactionStateRequestHeader.setTranStateTableOffset(msgExt.getQueueOffset());
    msgExt.setTopic(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_TOPIC));
    msgExt.setQueueId(Integer.parseInt(msgExt.getUserProperty(MessageConst.PROPERTY_REAL_QUEUE_ID)));
    msgExt.setStoreSize(0);
    String groupId = msgExt.getProperty(MessageConst.PROPERTY_PRODUCER_GROUP);
    Channel channel = brokerController.getProducerManager().getAvaliableChannel(groupId);
    if (channel != null) {
     
        brokerController.getBroker2Client().checkProducerTransactionState(groupId, channel, checkTransactionStateRequestHeader, msgExt);
    } else {
     
        LOGGER.warn("Check transaction failed, channel is null. groupId={}", groupId);
    }
}

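Producers, like consumers, send heartbeats to every Broker, so the Broker holds a Channel for each of them. The method returns the first usable client Channel it finds.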

public Channel getAvaliableChannel(String groupId) {
     
    HashMap<Channel, ClientChannelInfo> channelClientChannelInfoHashMap = groupChannelTable.get(groupId);
    List<Channel> channelList = new ArrayList<Channel>();
    if (channelClientChannelInfoHashMap != null) {
     
        for (Channel channel : channelClientChannelInfoHashMap.keySet()) {
     
            channelList.add(channel);
        }
        int size = channelList.size();
        if (0 == size) {
            return null;
        }
        int index = positiveAtomicCounter.incrementAndGet() % size;
        Channel channel = channelList.get(index);
        int count = 0;
        boolean isOk = channel.isActive() && channel.isWritable();
        while (count++ < GET_AVALIABLE_CHANNEL_RETRY_COUNT) {
     
            if (isOk) {
     
                return channel;
            }
            index = (++index) % size;
            channel = channelList.get(index);
            isOk = channel.isActive() && channel.isWritable();
        }
    } else {
     
        log.warn("Check transaction failed, channel table is empty. groupId={}", groupId);
        return null;
    }
    return null;
}
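
The selection strategy is a round-robin start index followed by a bounded linear probe. Here is a generic standalone sketch of that idea. `RoundRobinPicker` is a hypothetical class, a `Predicate` stands in for the `isActive() && isWritable()` test, and the use of `Math.floorMod` to keep the index non-negative is a choice made for this sketch rather than something taken from the Broker code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class RoundRobinPicker<T> {
    private final AtomicInteger counter = new AtomicInteger();
    private final int maxRetries;

    RoundRobinPicker(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    // Starts at a rotating index and probes up to maxRetries candidates; returns null if none is usable.
    T pick(List<T> candidates, Predicate<T> usable) {
        int size = candidates.size();
        if (size == 0) {
            return null;
        }
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), size);
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            T candidate = candidates.get(index);
            if (usable.test(candidate)) {
                return candidate;
            }
            index = (index + 1) % size;
        }
        return null;
    }
}
```

Rotating the start index spreads check requests across the producers of a group, while the bounded probe avoids spinning when every channel is down.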

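The RemotingCommand request code used here is CHECK_TRANSACTION_STATE.

Step 6: The Producer Checks the Local Transaction

The producer receives the Broker's check request and resolves the Broker address:
org.apache.rocketmq.client.impl.ClientRemotingProcessor#checkTransactionState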

final String addr = RemotingHelper.parseChannelRemoteAddr(ctx.channel());
// Check the local transaction
producer.checkTransactionState(addr, messageExt, requestHeader);

The producer runs a thread pool to handle the Broker's check requests.

public void checkTransactionState(final String addr, final MessageExt msg,
    final CheckTransactionStateRequestHeader header) {
     
    Runnable request = new Runnable() {
     
        private final String brokerAddr = addr;
        private final MessageExt message = msg;
        private final CheckTransactionStateRequestHeader checkRequestHeader = header;
        private final String group = DefaultMQProducerImpl.this.defaultMQProducer.getProducerGroup();

        @Override
        public void run() {
     
            // Look up the local transaction listeners
            TransactionCheckListener transactionCheckListener = DefaultMQProducerImpl.this.checkListener();
            TransactionListener transactionListener = getCheckListener();
            if (transactionCheckListener != null || transactionListener != null) {
     
                LocalTransactionState localTransactionState = LocalTransactionState.UNKNOW;
                Throwable exception = null;
                try {
     
                    // Query the local transaction outcome; this is the part the producer must implement
                    if (transactionCheckListener != null) {
                        // Old vs. new API: the old transaction interface is deprecated
                        localTransactionState = transactionCheckListener.checkLocalTransactionState(message);
                    } else if (transactionListener != null) {
     
                        log.debug("Used new check API in transaction message");
                        localTransactionState = transactionListener.checkLocalTransaction(message);
                    } else {
     
                        log.warn("CheckTransactionState, pick transactionListener by group[{}] failed", group);
                    }
                } catch (Throwable e) {
     
                    log.error("Broker call checkTransactionState, but checkLocalTransactionState exception", e);
                    exception = e;
                }
                // Resolve the transaction on the Broker according to the local result
                this.processTransactionState(
                    localTransactionState,
                    group,
                    exception);
            } else {
     
                log.warn("CheckTransactionState, pick transactionCheckListener by group[{}] failed", group);
            }
        }

        // Similar to the endTransaction method
        private void processTransactionState(
            final LocalTransactionState localTransactionState,
            final String producerGroup,
            final Throwable exception) {
     
            ...
            try {
     
                DefaultMQProducerImpl.this.mQClientFactory.getMQClientAPIImpl().endTransactionOneway(brokerAddr, thisHeader, remark,
                    3000);
            } catch (Exception e) {
     
            }
        }
    };

    this.checkExecutor.submit(request);
}
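
Since the check callback is the part the producer has to implement, it needs some record of how each local transaction ended. One possible shape is sketched below. `LocalTxStateStore` is a hypothetical class with its own `State` enum, and an in-memory map is used only to keep the example self-contained; a real service would more likely keep this record in the same database as the business data so that it survives restarts.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class LocalTxStateStore {
    enum State { COMMIT_MESSAGE, ROLLBACK_MESSAGE, UNKNOW }

    private final Map<String, State> states = new ConcurrentHashMap<>();

    // Called by the business code once the local transaction has finished.
    void record(String transactionId, boolean committed) {
        states.put(transactionId, committed ? State.COMMIT_MESSAGE : State.ROLLBACK_MESSAGE);
    }

    // What a check callback could return: UNKNOW while no outcome has been recorded.
    State check(String transactionId) {
        return states.getOrDefault(transactionId, State.UNKNOW);
    }
}
```

Returning UNKNOW for an unrecorded transaction keeps the half message pending, so the Broker asks again on a later run instead of guessing.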

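Step 7: The Broker Commits or Rolls Back the Half Message Again

org.apache.rocketmq.broker.processor.EndTransactionProcessor#processRequest is invoked again, with the same logic as in step 4.

[Figure: directories of the transactional message topics]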

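Summary

A transactional message is first written as a Half message: its topic, queue and related attributes are replaced, and the original topic and queue are saved in the message properties. Because the topic has been replaced, the message is not dispatched to the consume queue of its original topic, so consumers are unaware of it and cannot consume it.

After phase one has written a message that is invisible to users, phase two depends on the outcome. On Commit, the message must become visible: the original message is restored and stored in the CommitLog again, and the phase-one message is deleted. On Rollback, the phase-one message must be revoked.

RocketMQ cannot truly delete a message, because its files are written sequentially. Instead, it uses Op messages to mark transactional messages whose state (Commit or Rollback) has been determined. Op messages live in an internal topic (like the Half topic) and are never consumed by users.

The body of an Op message is the stored offset of the corresponding Half message, so an Op message can be traced back to its Half message for subsequent check processing. A transactional message without a matching Op message is one whose state is still undetermined (phase two may have failed). The check compares Half messages against Op messages and advances the processing progress.

For undetermined messages, the Broker initiates a check by sending the message to a Producer in the same group. The Producer inspects the state of the local transaction for that message and then issues a Commit or Rollback.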
