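Branch transaction rollback failures when using Seata for distributed transactions

1. Scenario

  • Seata manages database inserts in two microservices; webSocketFeignService is the Feign client for the other service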
    @Autowired
    private WebSocketFeignService webSocketFeignService;

    @Transactional
    @GlobalTransactional
    @Override
    public void dt() {
        // Local insert in this service
        TestParam testParam1 = new TestParam();
        testParam1.setTestContent("11111");
        testParam1.setTestName("22222");
        this.add(testParam1);

        // Insert in the second service through the Feign client
        MessageVO messageVO = new MessageVO();
        List<String> ids = new ArrayList<>();
        ids.add("1");
        messageVO.setUserIds(ids);
        messageVO.setMessageTime(new Date());
        messageVO.setText("测试分布式事务"); // "distributed transaction test"
        Result push = webSocketFeignService.push(messageVO);

        // Thrown so that the global transaction has to roll back
        throw new CoreException();
    }
  • The insert method in the second service
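    /**
     * Saves the message and one message-user record per recipient to the database.
     *
     * @param message the message to persist, including the recipient user ids
     * @return one DTO per recipient
     */
    @Transactional
    public List<MessageDTO> saveMessageAndMessageUser(MessageVO message) {
        List<MessageDTO> messages = new ArrayList<>();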

        MessagePO messagePo = new MessagePO();

        BeanUtils.copyProperties(message, messagePo);
        messageMapper.insertMessage(messagePo);
        List<String> userIds = message.getUserIds();

        for (int i = 0; i < userIds.size(); i++) {
            MessageUserPO userPo = new MessageUserPO();
            userPo.setUserId(userIds.get(i));
            userPo.setMessageId(messagePo.getMessageId());
            messageMapper.insertMessageUser(userPo);
            // Build the DTO for the message to be sent
            MessageDTO messageDTO = new MessageDTO();
            BeanUtils.copyProperties(messagePo, messageDTO);
            BeanUtils.copyProperties(userPo, messageDTO);

            messages.add(messageDTO);
        }

        return messages;
    }

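2. The problem

  • Judging by the endlessly repeating log output, the branch transaction never rolls back successfully, so Seata keeps retrying and the retry log keeps printing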
2020-03-23 00:31:35.019  INFO 7526 --- [atch_RMROLE_3_8] i.s.core.rpc.netty.RmMessageListener     : onMessage:xid=192.168.31.129:8888:2038619250,branchId=2038619251,branchType=AT,resourceId=jdbc:mysql://106.15.72.153:3306/chipro,applicationData=null
2020-03-23 00:31:35.028  INFO 7526 --- [atch_RMROLE_3_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacking: 192.168.31.129:8888:2038619250 2038619251 jdbc:mysql://106.15.72.153:3306/chipro
2020-03-23 00:31:35.207  INFO 7526 --- [atch_RMROLE_3_8] i.seata.rm.datasource.DataSourceManager  : branchRollback failed reason [2038619251/192.168.31.129:8888:2038619250]
2020-03-23 00:31:35.207  INFO 7526 --- [atch_RMROLE_3_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
2020-03-23 00:32:05.018  INFO 7526 --- [atch_RMROLE_4_8] i.s.core.rpc.netty.RmMessageListener     : onMessage:xid=192.168.31.129:8888:2038619250,branchId=2038619251,branchType=AT,resourceId=jdbc:mysql://106.15.72.153:3306/chipro,applicationData=null
2020-03-23 00:32:05.019  INFO 7526 --- [atch_RMROLE_4_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacking: 192.168.31.129:8888:2038619250 2038619251 jdbc:mysql://106.15.72.153:3306/chipro
2020-03-23 00:32:05.188  INFO 7526 --- [atch_RMROLE_4_8] i.seata.rm.datasource.DataSourceManager  : branchRollback failed reason [2038619251/192.168.31.129:8888:2038619250]
2020-03-23 00:32:05.189  INFO 7526 --- [atch_RMROLE_4_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
2020-03-23 00:32:34.996  INFO 7526 --- [atch_RMROLE_5_8] i.s.core.rpc.netty.RmMessageListener     : onMessage:xid=192.168.31.129:8888:2038619250,branchId=2038619251,branchType=AT,resourceId=jdbc:mysql://106.15.72.153:3306/chipro,applicationData=null
2020-03-23 00:32:35.001  INFO 7526 --- [atch_RMROLE_5_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacking: 192.168.31.129:8888:2038619250 2038619251 jdbc:mysql://106.15.72.153:3306/chipro
2020-03-23 00:32:35.220  INFO 7526 --- [atch_RMROLE_5_8] i.seata.rm.datasource.DataSourceManager  : branchRollback failed reason [2038619251/192.168.31.129:8888:2038619250]
2020-03-23 00:32:35.220  INFO 7526 --- [atch_RMROLE_5_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
2020-03-23 00:33:04.971  INFO 7526 --- [atch_RMROLE_6_8] i.s.core.rpc.netty.RmMessageListener     : onMessage:xid=192.168.31.129:8888:2038619250,branchId=2038619251,branchType=AT,resourceId=jdbc:mysql://106.15.72.153:3306/chipro,applicationData=null
2020-03-23 00:33:04.976  INFO 7526 --- [atch_RMROLE_6_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacking: 192.168.31.129:8888:2038619250 2038619251 jdbc:mysql://106.15.72.153:3306/chipro
2020-03-23 00:33:05.171  INFO 7526 --- [atch_RMROLE_6_8] i.seata.rm.datasource.DataSourceManager  : branchRollback failed reason [2038619251/192.168.31.129:8888:2038619250]
2020-03-23 00:33:05.172  INFO 7526 --- [atch_RMROLE_6_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable
2020-03-23 00:33:34.949  INFO 7526 --- [atch_RMROLE_7_8] i.s.core.rpc.netty.RmMessageListener     : onMessage:xid=192.168.31.129:8888:2038619250,branchId=2038619251,branchType=AT,resourceId=jdbc:mysql://106.15.72.153:3306/chipro,applicationData=null
2020-03-23 00:33:34.950  INFO 7526 --- [atch_RMROLE_7_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacking: 192.168.31.129:8888:2038619250 2038619251 jdbc:mysql://106.15.72.153:3306/chipro
2020-03-23 00:33:35.157  INFO 7526 --- [atch_RMROLE_7_8] i.seata.rm.datasource.DataSourceManager  : branchRollback failed reason [2038619251/192.168.31.129:8888:2038619250]
2020-03-23 00:33:35.158  INFO 7526 --- [atch_RMROLE_7_8] io.seata.rm.AbstractRMHandler            : Branch Rollbacked result: PhaseTwo_RollbackFailed_Retryable

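3. Analysis

The entry point is the RmMessageListener class. After stepping through layer after layer with breakpoints, the problem turned out to lie in the comparison between the queried data and the current data.

What are these two pieces of data?

  • Queried data: the data stored in undo_log as it was after the change (the after image, not the before image)
  • Current data: what the database holds right now

Why compare them?

  • To catch the case where someone else has modified the row before the rollback is applied. It is effectively an optimistic lock: compare first, then update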
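
The check described above can be sketched roughly as follows. This is a simplified model, not Seata's code: the class, the enum and the Map-based row representation are all made up for illustration, and the "skip" branch (current data already equal to the before image) is an assumption about how such a check is usually completed.

```java
import java.util.Map;

class UndoValidationSketch {

    // Hypothetical outcomes of the pre-rollback check
    enum Decision { APPLY_UNDO, SKIP_UNDO, DIRTY_DATA }

    // Rows are modelled as column-name -> value maps with String values
    static Decision validate(Map<String, String> beforeImage,
                             Map<String, String> afterImage,
                             Map<String, String> current) {
        if (afterImage.equals(current)) {
            // Nobody touched the row since the branch wrote it: safe to undo
            return Decision.APPLY_UNDO;
        }
        if (beforeImage.equals(current)) {
            // Row already looks the way it did before the branch ran
            return Decision.SKIP_UNDO;
        }
        // Row was changed by someone else: undoing would overwrite their change
        return Decision.DIRTY_DATA;
    }

    public static void main(String[] args) {
        Map<String, String> before = Map.of();               // the row did not exist before the insert
        Map<String, String> after = Map.of("text", "hello"); // the row as the branch wrote it

        System.out.println(validate(before, after, Map.of("text", "hello")));   // APPLY_UNDO
        System.out.println(validate(before, after, Map.of("text", "changed"))); // DIRTY_DATA
        System.out.println(validate(before, after, Map.of()));                  // SKIP_UNDO
    }
}
```

Everything here hinges on the equality test between the stored image and the current row, which is where the problem in this post shows up.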

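What follows from this?

  • If someone does modify the data while the transaction is being rolled back, could the rollback keep failing forever? (Not verified.)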
    /**
     * Is field equals.
     *
     * @param f0 the f0
     * @param f1  the f1
     * @return the boolean
     */
    public static boolean isFieldEquals(Field f0, Field f1) {
        if (f0 == null) {
            return f1 == null;
        } else {
            if (f1 == null) {
                return false;
            } else {
                if (StringUtils.equalsIgnoreCase(f0.getName(), f1.getName())
                        && f0.getType() == f1.getType()) {
                    if (f0.getValue() == null) {
                        return f1.getValue() == null;
                    } else {
                        if (f1.getValue() == null) {
                            return false;
                        } else {
                            String currentSerializer = UndoLogManager.getCurrentSerializer();
                            if (StringUtils.equals(currentSerializer, FastjsonUndoLogParser.NAME)) {
                                convertType(f0, f1);
                            }
                            return f0.getValue().equals(f1.getValue());
                        }
                    }
                } else {
                    return false;
                }
            }
        }
    }
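  • As the code above shows, field values are compared with equals. The debugger shows that the comparison that fails is between two byte[] values. Java arrays do not override Object.equals, so calling equals on them compares object references rather than contents, and two separately created arrays are never equal.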

 


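  • When I expanded the two byte[] values in the debugger, their contents were identical. So the root cause comes down to the field type: had the field been a string, there would have been no problem.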
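
The behaviour is easy to reproduce outside Seata. The small demo below is only an illustration; the class and variable names are invented, and the two arrays merely stand in for the value stored in undo_log and the value read back from the table.

```java
import java.util.Arrays;

class ByteArrayEqualsDemo {

    // What Object.equals does for arrays: a reference comparison
    static boolean sameReference(byte[] a, byte[] b) {
        return a.equals(b);
    }

    // Element-by-element comparison of the contents
    static boolean sameContent(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        byte[] afterImage = {1, 2, 3}; // stand-in for the value stored in undo_log
        byte[] current = {1, 2, 3};    // stand-in for the value currently in the table

        System.out.println(sameReference(afterImage, current)); // false
        System.out.println(sameContent(afterImage, current));   // true
    }
}
```

Two String values with the same characters, by contrast, are equal under equals, which is why a string column does not hit this problem.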


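4. The fix

Based on the analysis above, the tentative diagnosis was that the field type was the cause. Checking the column type in the database showed that it was blob, and a blob, as we all know, is a byte array. So all that was needed was to change the column from blob to a string type, after which the problem went away.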
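
Changing the column type is the fix that was applied here. Purely to illustrate why blob values trip up the comparison while strings do not, here is a sketch of a content-aware value comparison. valueEquals is a hypothetical helper, not part of Seata, and patching the framework this way was not tried in this post.

```java
import java.util.Objects;

class FieldValueComparison {

    // Hypothetical helper: compares arrays by content and everything else with equals
    static boolean valueEquals(Object a, Object b) {
        // Objects.deepEquals handles nulls and primitive arrays such as byte[]
        return Objects.deepEquals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(valueEquals(new byte[] {1, 2}, new byte[] {1, 2})); // true
        System.out.println(valueEquals(new byte[] {1, 2}, new byte[] {1, 3})); // false
        System.out.println(valueEquals("text", "text"));                       // true
        System.out.println(valueEquals(null, null));                           // true
    }
}
```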

  • Before the change

[Screenshot: column type before the change]

  • After the change

[Screenshot: column type after the change]

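5. Summary

Reading through the source made me a little more familiar with the Seata framework. Overall it follows an event-listener pattern. Judging by the logs above, a rollback request for the branch arrives about every 30 seconds; each time, the undo_log is checked for records that have not been rolled back yet and, if there are any, the rollback is attempted immediately. That is presumably what gives Seata its habit of retrying until the rollback succeeds.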
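
As a toy model of that retry behaviour (not Seata's implementation), the loop below keeps re-attempting a rollback while the attempt reports a retryable failure. The enum constants reuse the status names seen in the logs; the class, the method and the maxAttempts cap are assumptions added so that the sketch terminates.

```java
import java.util.function.Supplier;

class RollbackRetrySketch {

    // Status names borrowed from the log output; the enum itself is local to this sketch
    enum BranchStatus { PhaseTwo_Rollbacked, PhaseTwo_RollbackFailed_Retryable }

    // Re-attempts the rollback while it is retryable; returns the number of attempts made
    static int retryUntilRolledBack(Supplier<BranchStatus> attempt, int maxAttempts) {
        int attempts = 0;
        BranchStatus status;
        do {
            status = attempt.get();
            attempts++;
        } while (status == BranchStatus.PhaseTwo_RollbackFailed_Retryable && attempts < maxAttempts);
        return attempts;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds on the third call
        Supplier<BranchStatus> failsTwice = () -> ++calls[0] < 3
                ? BranchStatus.PhaseTwo_RollbackFailed_Retryable
                : BranchStatus.PhaseTwo_Rollbacked;
        System.out.println(retryUntilRolledBack(failsTwice, 10)); // 3

        // Never succeeds, like the blob comparison in this post: stops only at the cap
        System.out.println(retryUntilRolledBack(
                () -> BranchStatus.PhaseTwo_RollbackFailed_Retryable, 5)); // 5
    }
}
```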
