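In the earlier post "Distributed Transactions--Fescar" we looked at Fescar's architecture. The next few posts walk through how Fescar's TM, RM, and TC interact.
Interaction flow diagram for TM, RM, and TC:
The roles, in short:
TC: the Fescar-server application
TM: the Dubbo service consumer
RM: the Dubbo service provider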
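1. Introduction
The previous post, "Fescar Source Code Study--Transaction Manager TM (Service Consumer)", covered what the TM role does. This post uses an example and the source code to examine the responsibilities of the RM role in the distributed transaction architecture.
From startup through handling an RPC call from the TM, the RM does the following:
(1) Starts up and registers the RM service with the TC (Fescar Server).
(2) Receives the Dubbo RPC call from the TM.
(3) Proxies the database connection to generate an undo log, recording the original data in the undo_log table.
(4) Returns the result of the RPC call.
(5) Based on the result returned by the RM, the TM sends a commit or rollback to the TC. The TC uses the global transaction XID to notify the corresponding RMs, and each RM receives the commit or rollback instruction from the TC (Fescar Server). A commit asynchronously cleans up the rows in the undo_log table; a rollback restores data from the rows in the undo_log table (the isolation level is read committed, so dirty data may be produced).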
2. Example
Dubbo service provider:
public class StorageServiceImpl implements StorageService {
    private static final Logger LOGGER = LoggerFactory.getLogger(StorageService.class);
    private JdbcTemplate jdbcTemplate;

    public void setJdbcTemplate(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public void deduct(String commodityCode, int count) {
        LOGGER.info("Storage Service Begin ... xid: " + RootContext.getXID());
        jdbcTemplate.update("update storage_tbl set count = count - ? where commodity_code = ?", new Object[] {count, commodityCode});
        LOGGER.info("Storage Service End ... ");
    }

    public static void main(String[] args) throws Throwable {
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext(new String[]{"dubbo-storage-service.xml"});
        context.getBean("service");
        JdbcTemplate jdbcTemplate = (JdbcTemplate) context.getBean("jdbcTemplate");
        jdbcTemplate.update("delete from storage_tbl where commodity_code = 'C00321'");
        jdbcTemplate.update("insert into storage_tbl(commodity_code, count) values ('C00321', 100)");
        new ApplicationKeeper(context).keep();
    }
}
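XML configuration:
1. Initialization
When the application starts, the GlobalTransactionScanner class is initialized, and its initClient method performs the RM-related initialization.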
private void initClient() {
    // The RM service establishes its connection to the TC
    TMClient.init(applicationId, txServiceGroup);
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info(
            "Transaction Manager Client is initialized. applicationId[" + applicationId + "] txServiceGroup["
                + txServiceGroup + "]");
    }
    if ((AT_MODE & mode) > 0) {
        // Initialize the connection to the TC and the AsyncWorker thread
        RMClientAT.init(applicationId, txServiceGroup);
        if (LOGGER.isInfoEnabled()) {
            LOGGER.info(
                "Resource Manager for AT Client is initialized. applicationId[" + applicationId
                    + "] txServiceGroup["
                    + txServiceGroup + "]");
        }
    }
}
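2. RPC call
The TM (service consumer) calls the RM (service provider) through Dubbo's RPC mechanism. The global transaction xid is passed to the RM service along with the call, and the RM reads it under the TX_XID key.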
@Activate(group = { Constants.PROVIDER, Constants.CONSUMER }, order = 100)
public class TransactionPropagationFilter implements Filter {
    private static final Logger LOGGER = LoggerFactory.getLogger(TransactionPropagationFilter.class);

    @Override
    public Result invoke(Invoker<?> invoker, Invocation invocation) throws RpcException {
        String xid = RootContext.getXID();
        String rpcXid = RpcContext.getContext().getAttachment(RootContext.KEY_XID);
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("xid in RootContext[" + xid + "] xid in RpcContext[" + rpcXid + "]");
        }
        boolean bind = false;
        if (xid != null) {
            RpcContext.getContext().setAttachment(RootContext.KEY_XID, xid);
        } else {
            if (rpcXid != null) {
                RootContext.bind(rpcXid);
                bind = true;
                if (LOGGER.isDebugEnabled()) {
                    LOGGER.debug("bind[" + rpcXid + "] to RootContext");
                }
            }
        }
        try {
            return invoker.invoke(invocation);
        } finally {
            if (bind) {
                String unbindXid = RootContext.unbind();
                if (LOGGER.isDebugEnabled()) {
                    LOGGER.debug("unbind[" + unbindXid + "] from RootContext");
                }
                if (!rpcXid.equalsIgnoreCase(unbindXid)) {
                    LOGGER.warn("xid in change during RPC from " + rpcXid + " to " + unbindXid);
                    if (unbindXid != null) {
                        RootContext.bind(unbindXid);
                        LOGGER.warn("bind [" + unbindXid + "] back to RootContext");
                    }
                }
            }
        }
    }
}
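The bind-then-unbind pattern in the filter can be tried without Dubbo or Fescar on the classpath. The following is a minimal sketch under stated assumptions: XidPropagationSketch, CURRENT_XID, and the plain Map standing in for RPC attachments are hypothetical names made up for illustration, and the xid value is invented. Only the TX_XID key comes from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the pattern in TransactionPropagationFilter; not Fescar or Dubbo API.
class XidPropagationSketch {
    // Plays the role of RootContext: holds the xid for the current thread.
    static final ThreadLocal<String> CURRENT_XID = new ThreadLocal<>();
    static final String KEY_XID = "TX_XID";

    // Consumer side: copy the thread-local xid into the outgoing attachments.
    static Map<String, String> outgoingAttachments() {
        Map<String, String> attachments = new HashMap<>();
        String xid = CURRENT_XID.get();
        if (xid != null) {
            attachments.put(KEY_XID, xid);
        }
        return attachments;
    }

    // Provider side: bind the incoming xid, run the business logic, then always unbind.
    static String invokeOnProvider(Map<String, String> attachments) {
        String rpcXid = attachments.get(KEY_XID);
        boolean bind = false;
        if (CURRENT_XID.get() == null && rpcXid != null) {
            CURRENT_XID.set(rpcXid);
            bind = true;
        }
        try {
            // Business code sees the xid through the thread-local.
            return "deduct executed with xid=" + CURRENT_XID.get();
        } finally {
            if (bind) {
                CURRENT_XID.remove();
            }
        }
    }

    public static void main(String[] args) {
        CURRENT_XID.set("192.168.0.1:8091:2001"); // made-up xid for illustration
        Map<String, String> attachments = outgoingAttachments();
        CURRENT_XID.remove(); // simulate crossing over to the provider's thread
        System.out.println(invokeOnProvider(attachments));
        System.out.println("xid after call: " + CURRENT_XID.get());
    }
}
```

Unbinding in the finally block matters because provider threads are usually pooled: a leftover xid would leak into the next, unrelated request handled by the same thread.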
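3. Method invocation
After the Dubbo Filter in step 2 intercepts the call and obtains TX_XID, it binds the value to a thread-local variable. The local method then runs.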
@Override
public void deduct(String commodityCode, int count) {
    LOGGER.info("Storage Service Begin ... xid: " + RootContext.getXID());
    jdbcTemplate.update("update storage_tbl set count = count - ? where commodity_code = ?", new Object[] {count, commodityCode});
    LOGGER.info("Storage Service End ... ");
}
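When jdbcTemplate executes the SQL, the work is ultimately handled by the commit method of ConnectionProxy, which does some processing before the transaction is committed to the database.
4. Branch transaction commit
The commit method of ConnectionProxy does the following:
(1) Registers the branch transaction with the TC (transaction coordinator, Fescar Server) and obtains a branch transaction id.
(2) Writes the recovery log to the undo_log table under that branch transaction id.
(3) Actually commits the database transaction.
(4) Reports failure to the TC if the commit fails.
(5) Reports success to the TC if the commit succeeds.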
@Override
public void commit() throws SQLException {
    if (context.inGlobalTransaction()) {
        try {
            // Register with the TC and obtain the branch transaction id
            register();
        } catch (TransactionException e) {
            recognizeLockKeyConflictException(e);
        }
        try {
            // Write the recovery log to the undo_log table
            if (context.hasUndoLog()) {
                UndoLogManager.flushUndoLogs(this);
            }
            // Actually commit the database transaction
            targetConnection.commit();
        } catch (Throwable ex) {
            // Report to the TC that the branch failed
            report(false);
            if (ex instanceof SQLException) {
                throw (SQLException) ex;
            } else {
                throw new SQLException(ex);
            }
        }
        // Report to the TC that the branch succeeded
        report(true);
        context.reset();
    } else {
        targetConnection.commit();
    }
}
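After commit finishes, the TC holds a record of whether the branch succeeded or failed, and the original data in the database has been saved to undo_log as a recovery log, so that it can be restored if the global transaction rolls back.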
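To make the ordering of these steps easy to check, here is a minimal sketch that only records the sequence of steps on the success and failure paths. BranchCommitFlowSketch and its string labels are hypothetical; nothing here talks to a real TC or database, and unlike the real commit method the sketch returns instead of rethrowing on failure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trace of the phase-one commit order; each string stands in for a real call.
class BranchCommitFlowSketch {
    static List<String> commit(boolean localCommitSucceeds) {
        List<String> steps = new ArrayList<>();
        steps.add("register branch with TC");
        try {
            steps.add("flush undo log");
            if (!localCommitSucceeds) {
                throw new IllegalStateException("simulated local commit failure");
            }
            steps.add("commit local transaction");
        } catch (IllegalStateException ex) {
            // Failure path: the TC is told the branch failed.
            steps.add("report failure to TC");
            return steps;
        }
        // Success path: the TC is told the branch succeeded.
        steps.add("report success to TC");
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(commit(true));
        System.out.println(commit(false));
    }
}
```

In the commit method shown earlier, the undo log is flushed through the same proxied connection before targetConnection.commit() is called, which suggests that the business change and its undo record are committed by the same local transaction.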
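5. Transaction commit or rollback
As covered in the previous post, "Fescar Source Code Study--Transaction Manager TM (Service Consumer)", the TM (service consumer) sends the instruction to commit or roll back the global transaction to the TC, and the TC uses the global transaction XID to notify every RM to commit or roll back its branch transaction.
What the RM does for a branch transaction:
(1) commit: the RM only needs to delete its local undo_log records.
(2) rollback: the RM restores data from the records in its local undo_log table. This may produce dirty data, because undo_log holds the original data as of a point in time, which may no longer match the current data.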
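The dirty-data risk in (2) can be shown with plain arithmetic. This is a minimal sketch, assuming a second writer that changes the same row after the branch has committed locally and that is not coordinated with the global transaction. StaleBeforeImageSketch is a hypothetical name and the in-memory map stands in for storage_tbl.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of restoring a stale before-image; not Fescar code.
class StaleBeforeImageSketch {
    static int rollbackAfterConcurrentUpdate() {
        Map<String, Integer> storage = new HashMap<>();
        storage.put("C00321", 100);

        // Branch transaction: remember the before-image, then deduct 10.
        int beforeImage = storage.get("C00321");
        storage.put("C00321", beforeImage - 10);

        // Assumed uncoordinated writer: deducts 5 after the branch committed locally.
        storage.put("C00321", storage.get("C00321") - 5);

        // Global rollback: write the before-image back, discarding the other writer's change.
        storage.put("C00321", beforeImage);
        return storage.get("C00321");
    }

    public static void main(String[] args) {
        System.out.println("count after rollback: " + rollbackAfterConcurrentUpdate());
    }
}
```

The row ends at 100, although undoing only the branch's own deduction would leave 95. The second writer's deduction of 5 is silently lost.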
The RM provides RmMessageListener to receive the commit or rollback instructions sent by the TC.
@Override
public void onMessage(long msgId, String serverAddress, Object msg, ClientMessageSender sender) {
    if (LOGGER.isInfoEnabled()) {
        LOGGER.info("onMessage:" + msg);
    }
    if (msg instanceof BranchCommitRequest) {
        // commit
        handleBranchCommit(msgId, serverAddress, (BranchCommitRequest)msg, sender);
    } else if (msg instanceof BranchRollbackRequest) {
        // rollback
        handleBranchRollback(msgId, serverAddress, (BranchRollbackRequest)msg, sender);
    }
}
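6. Commit operation
Handling a commit is simple for the RM: it only has to make sure the related records in the undo_log table get deleted, and nothing more. The commit request is therefore handed to AsyncWorker, and a thread deletes the records asynchronously on a timer.
(1) The xid is ultimately recorded in a buffer (ASYNC_COMMIT_BUFFER).
(2) When AsyncWorker is initialized, it schedules a task that runs doBranchCommits once per second to delete records from the undo_log table.
(3) doBranchCommits calls UndoLogManager.deleteUndoLog with the global transaction xid and the branch transaction branchId to delete the record.
DataSourceManager hands the commit to AsyncWorker for asynchronous processing: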
@Override
public BranchStatus branchCommit(String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
    return asyncWorker.branchCommit(xid, branchId, resourceId, applicationData);
}
AsyncWorker deletes the records asynchronously:
@Override
public BranchStatus branchCommit(String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
    // Just record the branch context in the buffer
    if (ASYNC_COMMIT_BUFFER.size() < ASYNC_COMMIT_BUFFER_LIMIT) {
        ASYNC_COMMIT_BUFFER.add(new Phase2Context(xid, branchId, resourceId, applicationData));
    } else {
        LOGGER.warn("Async commit buffer is FULL. Rejected branch [" + branchId + "/" + xid + "] will be handled by housekeeping later.");
    }
    return BranchStatus.PhaseTwo_Committed;
}

public synchronized void init() {
    LOGGER.info("Async Commit Buffer Limit: " + ASYNC_COMMIT_BUFFER_LIMIT);
    timerExecutor = new ScheduledThreadPoolExecutor(1,
        new NamedThreadFactory("AsyncWorker", 1, true));
    // Scheduled task that processes commits periodically
    timerExecutor.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            try {
                doBranchCommits();
            } catch (Throwable e) {
                LOGGER.info("Failed at async committing ... " + e.getMessage());
            }
        }
    }, 10, 1000 * 1, TimeUnit.MILLISECONDS);
}

// Calls UndoLogManager.deleteUndoLog to delete the records
private void doBranchCommits() {
    if (ASYNC_COMMIT_BUFFER.size() == 0) {
        return;
    }
    Map<String, List<Phase2Context>> mappedContexts = new HashMap<>();
    Iterator<Phase2Context> iterator = ASYNC_COMMIT_BUFFER.iterator();
    while (iterator.hasNext()) {
        Phase2Context commitContext = iterator.next();
        List<Phase2Context> contextsGroupedByResourceId = mappedContexts.get(commitContext.resourceId);
        if (contextsGroupedByResourceId == null) {
            contextsGroupedByResourceId = new ArrayList<>();
            mappedContexts.put(commitContext.resourceId, contextsGroupedByResourceId);
        }
        contextsGroupedByResourceId.add(commitContext);
        iterator.remove();
    }
    for (String resourceId : mappedContexts.keySet()) {
        Connection conn = null;
        try {
            try {
                DataSourceProxy dataSourceProxy = DataSourceManager.get().get(resourceId);
                conn = dataSourceProxy.getPlainConnection();
            } catch (SQLException sqle) {
                LOGGER.warn("Failed to get connection for async committing on " + resourceId, sqle);
                continue;
            }
            List<Phase2Context> contextsGroupedByResourceId = mappedContexts.get(resourceId);
            for (Phase2Context commitContext : contextsGroupedByResourceId) {
                try {
                    UndoLogManager.deleteUndoLog(commitContext.xid, commitContext.branchId, conn);
                } catch (Exception ex) {
                    LOGGER.warn("Failed to delete undo log [" + commitContext.branchId + "/" + commitContext.xid + "]", ex);
                }
            }
        } finally {
            if (conn != null) {
                try {
                    conn.close();
                } catch (SQLException closeEx) {
                    LOGGER.warn("Failed to close JDBC resource while deleting undo_log ", closeEx);
                }
            }
        }
    }
}
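The buffer-then-drain idea can be isolated from the JDBC and scheduling details. The following is a minimal sketch, assuming a tiny limit of 2 so the rejection path is easy to see. AsyncCommitBufferSketch, Ctx, offer, and drainGroupedByResource are hypothetical names, and the drain is invoked by hand here instead of by a scheduled task.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bounded buffer that groups pending branch commits by resource; not Fescar code.
class AsyncCommitBufferSketch {
    static final int LIMIT = 2; // deliberately tiny for the example

    static final class Ctx {
        final String xid;
        final long branchId;
        final String resourceId;

        Ctx(String xid, long branchId, String resourceId) {
            this.xid = xid;
            this.branchId = branchId;
            this.resourceId = resourceId;
        }
    }

    private final List<Ctx> buffer = new ArrayList<>();

    // Accepts the context unless the buffer is full; the caller decides what to do on rejection.
    synchronized boolean offer(Ctx ctx) {
        if (buffer.size() < LIMIT) {
            buffer.add(ctx);
            return true;
        }
        return false;
    }

    // Empties the buffer and groups its entries so each resource needs only one connection.
    synchronized Map<String, List<Ctx>> drainGroupedByResource() {
        Map<String, List<Ctx>> grouped = new LinkedHashMap<>();
        Iterator<Ctx> it = buffer.iterator();
        while (it.hasNext()) {
            Ctx ctx = it.next();
            grouped.computeIfAbsent(ctx.resourceId, key -> new ArrayList<>()).add(ctx);
            it.remove();
        }
        return grouped;
    }

    public static void main(String[] args) {
        AsyncCommitBufferSketch worker = new AsyncCommitBufferSketch();
        System.out.println(worker.offer(new Ctx("xid-1", 1L, "jdbc:db1")));
        System.out.println(worker.offer(new Ctx("xid-1", 2L, "jdbc:db2")));
        System.out.println(worker.offer(new Ctx("xid-2", 3L, "jdbc:db1")));
        System.out.println(worker.drainGroupedByResource().keySet());
    }
}
```

Grouping by resource before deleting mirrors doBranchCommits, which opens one plain connection per resourceId and reuses it for every undo_log row of that resource.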
7. Rollback operation
The transaction rollback is performed in DataSourceManager:
@Override
public BranchStatus branchRollback(String xid, long branchId, String resourceId, String applicationData) throws TransactionException {
    DataSourceProxy dataSourceProxy = get(resourceId);
    if (dataSourceProxy == null) {
        throw new ShouldNeverHappenException();
    }
    try {
        // Roll the data back using the snapshot stored in the undo_log table
        UndoLogManager.undo(dataSourceProxy, xid, branchId);
    } catch (TransactionException te) {
        if (te.getCode() == TransactionExceptionCode.BranchRollbackFailed_Unretriable) {
            return BranchStatus.PhaseTwo_RollbackFailed_Unretriable;
        } else {
            return BranchStatus.PhaseTwo_RollbackFailed_Retriable;
        }
    }
    // Return the rollback result
    return BranchStatus.PhaseTwo_Rollbacked;
}
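UndoLogManager first looks up the undo_log record by the global transaction XID and the branch transaction branchId; the rollback_info column stores the snapshot in binary form.
Branch transaction rollback: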
public static void undo(DataSourceProxy dataSourceProxy, String xid, long branchId) throws TransactionException {
    assertDbSupport(dataSourceProxy.getTargetDataSource().getDbType());
    Connection conn = null;
    ResultSet rs = null;
    PreparedStatement selectPST = null;
    try {
        conn = dataSourceProxy.getPlainConnection();
        // The entire undo process should run in a local transaction.
        conn.setAutoCommit(false);
        // Find UNDO LOG
        // Look up the record by xid and branchId
        selectPST = conn.prepareStatement(SELECT_UNDO_LOG_SQL);
        selectPST.setLong(1, branchId);
        selectPST.setString(2, xid);
        rs = selectPST.executeQuery();
        while (rs.next()) {
            Blob b = rs.getBlob("rollback_info");
            // Read the rollback_info column
            String rollbackInfo = StringUtils.blob2string(b);
            // Decode the rollback info used to build the rollback SQL
            BranchUndoLog branchUndoLog = UndoLogParserFactory.getInstance().decode(rollbackInfo);
            // Execute the rollback
            for (SQLUndoLog sqlUndoLog : branchUndoLog.getSqlUndoLogs()) {
                TableMeta tableMeta = TableMetaCache.getTableMeta(dataSourceProxy, sqlUndoLog.getTableName());
                sqlUndoLog.setTableMeta(tableMeta);
                AbstractUndoExecutor undoExecutor = UndoExecutorFactory.getUndoExecutor(dataSourceProxy.getDbType(), sqlUndoLog);
                undoExecutor.executeOn(conn);
            }
        }
        // Delete the undo_log record
        deleteUndoLog(xid, branchId, conn);
        // Commit the local transaction
        conn.commit();
    } catch (Throwable e) {
        if (conn != null) {
            try {
                conn.rollback();
            } catch (SQLException rollbackEx) {
                LOGGER.warn("Failed to close JDBC resource while undo ... ", rollbackEx);
            }
        }
        throw new TransactionException(BranchRollbackFailed_Retriable, String.format("%s/%s", branchId, xid), e);
    } finally {
        try {
            if (rs != null) {
                rs.close();
            }
            if (selectPST != null) {
                selectPST.close();
            }
            if (conn != null) {
                conn.close();
            }
        } catch (SQLException closeEx) {
            LOGGER.warn("Failed to close JDBC resource while undo ... ", closeEx);
        }
    }
}
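How an undo executor turns a snapshot into SQL is not shown in this post. As a rough illustration only, the sketch below builds a compensating UPDATE statement from a before-image row. UndoSqlSketch and buildUndoUpdate are hypothetical, the `id` primary-key column on storage_tbl is an assumption, and the real executors and rollback_info format may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical builder for a compensating UPDATE; not Fescar's AbstractUndoExecutor.
class UndoSqlSketch {
    // Sets every non-key column back to its before-image value, addressing the row by primary key.
    static String buildUndoUpdate(String table, String pkColumn, Map<String, Object> beforeImage) {
        StringJoiner setClause = new StringJoiner(", ");
        for (String column : beforeImage.keySet()) {
            if (!column.equals(pkColumn)) {
                setClause.add(column + " = ?");
            }
        }
        return "UPDATE " + table + " SET " + setClause + " WHERE " + pkColumn + " = ?";
    }

    public static void main(String[] args) {
        // Assumed before-image of one row; the `id` column is made up for this example.
        Map<String, Object> beforeImage = new LinkedHashMap<>();
        beforeImage.put("id", 1);
        beforeImage.put("count", 100);
        System.out.println(buildUndoUpdate("storage_tbl", "id", beforeImage));
    }
}
```

Using placeholders instead of inlined values lets the statement be run through a PreparedStatement with the before-image values bound in column order, followed by the key.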
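Summary:
(1) Fescar passes the global transaction id along with Dubbo remote calls.
(2) When performing its local database operation, the RM first registers a branch transaction with the TC.
(3) It writes a rollback log to undo_log under the branch transaction id.
(4) It performs the local database operation and reports the outcome to the TC, whether success or failure.
(5) The TM commits or rolls back the transaction by submitting the operation to the TC, which forwards it to the RMs.
(6) Each RM performs the commit or rollback when notified by the TC.
(7) For a commit, the AsyncWorker thread deletes the local undo_log records asynchronously.
(8) For a rollback, the RM finds the record by the global transaction xid and the branch transaction branchId, then rolls the data back using the snapshot in the rollback_info column.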