现场报有一个功能走不下去,后台日志报错:java.sql.SQLException: ORA-01591: 锁被未决分布式事务处理 657.7.39336 持有。
解决方案:
rollback force '657.7.39336';--执行可能会比较慢
执行完成后,查询DBA_2PC_PENDING,
select * from DBA_2PC_PENDING s where s.local_tran_id='657.7.39336';
657.7.39336 SP4GD.a6dfea73.657.7.39336forced rollback no 2015-6-17 5:28:05 2015-6-17 10:44:33 2015-6-17 5:28:05 oracle UNKNOWN SCDB02 LCA_ZC 14456764049772
或者
delete from sys.pending_trans$ where local_tran_id = '657.7.39336';
delete from sys.pending_sessions$ where local_tran_id = '657.7.39336';
delete from sys.pending_sub_sessions$ where local_tran_id ='657.7.39336';
commit;
Commit force '657.7.39336'
exec dbms_transaction.purge_lost_db_entry('657.7.39336');
DBA_2PC_PENDING
describes distributed transactions awaiting recovery.描述等待恢复的分布式事务。
LOCAL_TRAN_ID String of form: n.n.n; n is a number
GLOBAL_TRAN_ID Globally unique transaction ID
STATE Collecting, prepared, committed, forced commit, or forced rollback
MIXED YES indicates part of the transaction committed and part rolled back
ADVICE C for commit, R for rollback, else NULL
TRAN_COMMENT Text for commit work comment text
FAIL_TIME Value of SYSDATE when the row was inserted (transaction or system recovery)
FORCE_TIME Time of manual force decision (null if not forced locally)
RETRY_TIME Time automatic recovery (RECO) last tried to recover the transaction
OS_USER Operating system-specific name for the end-user
OS_TERMINAL Operating system-specific name for the end-user terminal
HOST Name of the host machine for the end-user
DB_USER Oracle user name of the end-user at the topmost database
COMMIT# Global commit number for committed transactions
这个错误是什么意思呢?
[oracle@standby ~]$ oerr ora 01591
01591, 00000, "lock held by in-doubt distributed transaction %s"
// *Cause: Trying to access resource that is locked by a dead two-phase commit
// transaction that is in prepared state.
// *Action: DBA should query the pending_trans$ and related tables, and attempt
// to repair network connection(s) to coordinator and commit point.
// If timely repair is not possible, DBA should contact DBA at commit
// point if known or end user for correct outcome, or use heuristic
// default if given to issue a heuristic commit or abort command to
// finalize the local portion of the distributed transaction.
下面简单介绍一下分布式事务。
分布式事务,简单来说,是指一个事务在本地和远程执行,本地需要等待确认远程的事务结束后,进行下一步本地的操作。如通过dblink update远程数据库的一行记录,如果在执行过程中网络异常,或者其他事件导致本地数据库无法得知远程数据库的执行情况,此时就会发生in doublt的报错。此时需要dba介入,且需要分多种情况进行处理。
分布式事务的Two-Phase Commit机制,会经历3个阶段:
1.PREPARE PHASE:
1.1 决定哪个数据库为commit point site。(注,参数文件中commit_point_strength值高的那个数据库为commit point site)
1.2 全局协调者(Global Coordinator)要求所有的点(除commit point site外)做好commit或者rollback的准备。此时,对分布式事务的表加锁。
1.3 所有分布式事务的节点将它的scn告知全局协调者。
1.4 全局协调者取各个点的最大的scn作为分布式事务的scn。
至此,所有的点都完成了准备工作,我们开始进入COMMIT PHASE阶段,此时除commit point site点外所有点的事务均为in doubt状态,直到COMMIT PHASE阶段结束。
2.COMMIT PHASE:
2.1 Global Coordinator将最大scn传到commit point site,要求其commit。
2.2 commit point尝试commit或者rollback。分布式事务锁释放。
2.3 commit point通知Global Coordinator已经commit。
2.4 Global Coordinator通知分布式事务的所有点进行commit。
3.FORGET PHASE:
3.1 参与的点通知commit point site他们已经完成commit,commit point site就能忘记(forget)这个事务。
3.2 commit point site在远程数据库上清除分布式事务信息。
3.3 commit point site通知Global Coordinator可以清除本地的分布式事务信息。
3.4 Global Coordinator清除分布式事务信息