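Mnesia, Erlang's distributed database, is a core component of the Erlang distributed stack. It is distributed by design: it can keep replicas of a table on multiple nodes and supports data persistence. Used well, Mnesia greatly reduces the difficulty of designing a distributed architecture.
This is the first post in a series analyzing Mnesia, and it covers Mnesia's most common and most important interface: the write path of a transaction. The Mnesia documentation (http://www.erlang.org/doc/apps/mnesia/Mnesia_chap4.html#id73876) describes five access contexts. The term has no formal definition; as I understand it, an access context determines how far Mnesia operations such as read and write preserve the ACID properties of transactions and how far they are distributed. The five access contexts are: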
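transaction: an ordinary transaction. Updates satisfy the ACID properties and are propagated synchronously to every Mnesia node that holds a replica of the data, but the commit does not require the log on each replica node to have been written to disk;
sync_transaction: a fully synchronous transaction. The procedure is basically the same as for transaction, but by the time the transaction commits, the log on every replica node has been written to disk;
async_dirty: the access context of the dirty_* operations. Updates do not satisfy the ACID properties, and updated data is replicated asynchronously to the other replica nodes;
sync_dirty: similar to async_dirty, but a write does not return until the data has been fully replicated to the other replicas;
ets: no transactional or distributed properties at all. Mnesia is implemented on top of ets and dets, and its data is stored in those tables; the ets context tells Mnesia to update only the local ets table.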
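The five contexts can be tried out through mnesia:activity/2, which takes the access context as its first argument. The following is only a minimal sketch and not part of the original analysis: the module name ctx_demo and the table ctx_demo_tab are hypothetical, and the table uses ram_copies so that no on-disk schema has to be created first.

```erlang
-module(ctx_demo).
-export([run/0]).

%% Hypothetical demo: run the same write under each access context.
run() ->
    ok = mnesia:start(),
    %% Assumption: a RAM-only table, so mnesia:create_schema/1 is not needed.
    {atomic, ok} = mnesia:create_table(ctx_demo_tab,
                                       [{attributes, [key, val]},
                                        {ram_copies, [node()]}]),
    Contexts = [transaction, sync_transaction, async_dirty, sync_dirty, ets],
    lists:foreach(
      fun(Ctx) ->
              %% mnesia:activity/2 returns the fun's own result (ok here)
              %% and exits if a transaction is aborted.
              ok = mnesia:activity(Ctx,
                                   fun() ->
                                           mnesia:write({ctx_demo_tab, Ctx, written})
                                   end)
      end, Contexts),
    %% One key per context should now be present in the table.
    Keys = lists:sort(mnesia:dirty_all_keys(ctx_demo_tab)),
    stopped = mnesia:stop(),
    Keys.
```

On a single node all five writes land in the same local table, so the differences between the contexts (locking, logging, replication) are not visible here; the sketch only shows how a context is selected.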
Define the Mnesia table user:
mnesia:create_table(user, [{record_name, user},{attributes, [name, id]},{disc_copies, [node()]}]).
This post analyzes one ordinary Mnesia statement, mnesia:transaction(fun() -> mnesia:write({user, me, 12345}) end)., against Mnesia version 4.5.1, to reveal how Mnesia transactions work.
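For readers who want to run the statement under analysis, here is a minimal sketch. The module name user_demo is hypothetical, and it assumes ram_copies rather than the disc_copies used above, so that it works on a node without an on-disk schema (disc_copies would first need mnesia:create_schema/1).

```erlang
-module(user_demo).
-export([run/0]).

-record(user, {name, id}).

%% Hypothetical demo: create the user table, write one row in a
%% transaction, and read it back in a second transaction.
run() ->
    ok = mnesia:start(),
    %% Assumption: ram_copies instead of disc_copies (no schema on disk).
    {atomic, ok} = mnesia:create_table(user,
                                       [{record_name, user},
                                        {attributes, record_info(fields, user)},
                                        {ram_copies, [node()]}]),
    %% The statement analyzed in this post.
    {atomic, ok} = mnesia:transaction(fun() -> mnesia:write({user, me, 12345}) end),
    {atomic, Rows} = mnesia:transaction(fun() -> mnesia:read(user, me) end),
    stopped = mnesia:stop(),
    Rows.
```

The write returns {atomic, ok} because mnesia:write itself returns ok and the transaction wraps the fun's result in {atomic, ...}.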
mnesia.erl
transaction(Fun) ->
    transaction(get(mnesia_activity_state), Fun, [], infinity, ?DEFAULT_ACCESS, async).
transaction(State, Fun, Args, Retries, Mod, Kind)
  when is_function(Fun), is_list(Args), Retries == infinity, is_atom(Mod) ->
    mnesia_tm:transaction(State, Fun, Args, Retries, Mod, Kind);
mnesia:transaction is implemented by mnesia_tm:transaction, with the Mod argument set to mnesia and the Kind argument set to async.
mnesia_tm.erl
transaction(OldTidTs, Fun, Args, Retries, Mod, Type) ->
    Factor = 1,
    case OldTidTs of
        undefined -> % Outer
            execute_outer(Mod, Fun, Args, Factor, Retries, Type);
        {_, _, non_transaction} -> % Transaction inside ?sync_dirty
            Res = execute_outer(Mod, Fun, Args, Factor, Retries, Type),
            put(mnesia_activity_state, OldTidTs),
            Res;
        {OldMod, Tid, Ts} -> % Nested
            execute_inner(Mod, Tid, OldMod, Ts, Fun, Args, Factor, Retries, Type);
        _ -> % Bad nesting
            {aborted, nested_transaction}
    end.
execute_outer(Mod, Fun, Args, Factor, Retries, Type) ->
    case req(start_outer) of
        {error, Reason} ->
            {aborted, Reason};
        {new_tid, Tid, Store} ->
            Ts = #tidstore{store = Store},
            NewTidTs = {Mod, Tid, Ts},
            put(mnesia_activity_state, NewTidTs),
            execute_transaction(Fun, Args, Factor, Retries, Type)
    end.
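Nested transactions are not analyzed here. As the code shows, to execute a transaction mnesia_tm must first send a request to Mnesia's transaction manager to start the transaction; that request is handled in mnesia_tm:doit_loop.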
doit_loop(#state{coordinators=Coordinators,participants=Participants,supervisor=Sup}=State) ->
    receive
        {From, start_outer} -> %% Create and associate ets_tab with Tid
            case catch ?ets_new_table(mnesia_trans_store, [bag, public]) of
                {'EXIT', Reason} -> %% system limit
                    Msg = "Cannot create an ets table for the "
                        "local transaction store",
                    reply(From, {error, {system_limit, Msg, Reason}}, State);
                Etab ->
                    tmlink(From),
                    C = mnesia_recover:incr_trans_tid_serial(),
                    ?ets_insert(Etab, {nodes, node()}),
                    Tid = #tid{pid = tmpid(From), counter = C},
                    A2 = gb_trees:insert(Tid,[Etab],Coordinators),
                    S2 = State#state{coordinators = A2},
                    reply(From, {new_tid, Tid, Etab}, S2)
            end;
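On receiving the request, the transaction manager creates a new temporary ets table, builds a tid for the transaction, and returns the tid and the table to the requester. From then on, every operation in the transaction is recorded in this temporary table, including data updates, locks, and the nodes participating in the transaction.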
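The start_outer handshake can be modelled with a toy process. This is only an illustrative sketch under stated assumptions, not Mnesia's code: the module toy_tm, its functions, and the table name toy_trans_store are hypothetical, a plain integer stands in for mnesia_recover:incr_trans_tid_serial(), a map stands in for the gb_trees coordinator store, and the linking done by tmlink is left out.

```erlang
-module(toy_tm).
-export([start/0, start_outer/1, stop/1]).

%% Toy stand-in for Mnesia's #tid record (hypothetical, for illustration).
-record(tid, {counter, pid}).

%% Spawn the toy transaction manager with serial 0 and no coordinators.
start() ->
    spawn(fun() -> loop(0, #{}) end).

%% Client side: ask the manager for a new tid and a transaction store.
start_outer(Tm) ->
    Tm ! {self(), start_outer},
    receive
        {Tm, Reply} -> Reply
    after 5000 ->
        {error, timeout}
    end.

stop(Tm) ->
    Tm ! stop,
    ok.

%% Server side: one public bag table per transaction, keyed by its tid.
loop(Serial, Coordinators) ->
    receive
        {From, start_outer} ->
            Etab = ets:new(toy_trans_store, [bag, public]),
            true = ets:insert(Etab, {nodes, node()}),
            Tid = #tid{counter = Serial + 1, pid = From},
            From ! {self(), {new_tid, Tid, Etab}},
            loop(Serial + 1, Coordinators#{Tid => [Etab]});
        stop ->
            ok
    end.
```

The table is public because the process that started the transaction, not the manager that owns the table, writes its updates and locks into it.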
Returning to execute_outer, look at what it does next:
execute_outer(Mod, Fun, Args, Factor, Retries, Type) ->
    case req(start_outer) of
        {error, Reason} ->
            {aborted, Reason};
        {new_tid, Tid, Store} ->
            Ts = #tidstore{store = Store},
            NewTidTs = {Mod, Tid, Ts},
            put(mnesia_activity_state, NewTidTs),
            execute_transaction(Fun, Args, Factor, Retries, Type)
    end.
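Another detail worth noting: Mnesia stores {Mod, Tid, Ts} (Ts wraps the temporary table just created) in the process dictionary of the process that started the transaction, where the subsequent mnesia:write will look it up and check it.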
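The process-dictionary entry can be observed from the calling process. The sketch below is only an experiment under stated assumptions: the module name state_demo is hypothetical, and the mnesia_activity_state key is an internal detail taken from the source quoted in this post, so it is not a supported API and may differ in other Mnesia versions. mnesia:is_transaction/0 is the documented way to ask the same question.

```erlang
-module(state_demo).
-export([run/0]).

%% Hypothetical demo: look at the process dictionary before, inside,
%% and after a transaction run by the current process.
run() ->
    ok = mnesia:start(),
    Before = get(mnesia_activity_state),
    {atomic, Inside} =
        mnesia:transaction(
          fun() ->
                  %% Internal key; inspected here for illustration only.
                  {mnesia:is_transaction(), get(mnesia_activity_state)}
          end),
    After = get(mnesia_activity_state),
    stopped = mnesia:stop(),
    {Before, Inside, After}.
```

If the quoted code still applies, the entry is absent before the call, holds a {mnesia, Tid, Ts} tuple inside the fun, and is erased again once the outer transaction has committed.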
execute_transaction(Fun, Args, Factor, Retries, Type) ->
    case catch apply_fun(Fun, Args, Type) of
        {'EXIT', Reason} ->
            check_exit(Fun, Args, Factor, Retries, Reason, Type);
        {atomic, Value} ->
            mnesia_lib:incr_counter(trans_commits),
            erase(mnesia_activity_state),
            flush_downs(),
            catch unlink(whereis(?MODULE)),
            {atomic, Value};
        {nested_atomic, Value} ->
            mnesia_lib:incr_counter(trans_commits),
            {atomic, Value};
        Value -> %% User called throw
            Reason = {aborted, {throw, Value}},
            return_abort(Fun, Args, Reason)
    end.
apply_fun(Fun, Args, Type) ->
    Result = apply(Fun, Args),
    case t_commit(Type) of
        do_commit ->
            {atomic, Result};
        do_commit_nested ->
            {nested_atomic, Result};
        {do_abort, {aborted, Reason}} ->
            {'EXIT', {aborted, Reason}};
        {do_abort, Reason} ->
            {'EXIT', {aborted, Reason}}
    end.
execute_transaction calls apply_fun, which uses apply to run the function passed to mnesia:transaction and then starts the commit through t_commit.
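From the caller's side, the branches of execute_transaction show up as three kinds of return value. Here is a minimal sketch, assuming a node where Mnesia can be started with its default RAM-only schema; the module name result_demo and the atoms done, my_reason, and my_value are made up for illustration.

```erlang
-module(result_demo).
-export([run/0]).

%% Hypothetical demo: a committed fun, an explicit abort, and a user throw.
run() ->
    ok = mnesia:start(),
    %% Normal return: the fun's value is wrapped in {atomic, Value}.
    Committed = mnesia:transaction(fun() -> done end),
    %% mnesia:abort/1 exits the fun; the caller sees {aborted, Reason}.
    Aborted = mnesia:transaction(fun() -> mnesia:abort(my_reason) end),
    %% A throw from the fun hits the "User called throw" branch.
    Thrown = mnesia:transaction(fun() -> throw(my_value) end),
    stopped = mnesia:stop(),
    {Committed, Aborted, Thrown}.
```

Following the quoted code, run/0 should return {{atomic, done}, {aborted, my_reason}, {aborted, {throw, my_value}}}.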
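At this point the preparation phase of the transaction is complete: the local node's Mnesia transaction manager has allocated a tid and a temporary table to the transaction initiator and recorded the allocation, and the initiator has recorded its transaction access context in its process dictionary.
To be continued...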