Update (2009-03-18):
With R13A released, I took another look at this problem. First, compiling this code produces the following message:
Quote:
Warning: NOT OPTIMIZED: different control paths use different positions in the binary
litaocheng pointed out that the code only needs to be changed to:
dame_shit(Bin) ->
    dame_shit(Bin, <<"\r\n">>, not_found).
dame_shit(<<Tag:2/bytes, T/binary>>, Tag, not_found) ->
    dame_shit(T, Tag, found);
dame_shit(<<B, T/binary>>, Tag, not_found) ->
    dame_shit(T, Tag, not_found);
dame_shit(<<>>, _, _) ->
    not_found;
dame_shit(Bin, _, found) ->
    {found, Bin}.
After this change the compiler output is as expected:
Quote:
Warning: OPTIMIZED: creation of sub binary delayed
Running it on R12B-5 and R13A then gives the correct result:
Quote:
Find '\r\n' result1: {found,<<"something maybe wrong!">>}
Find '\r\n' result2: {found,<<"something maybe wrong!">>}
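As a cross-check of the expected result, here is a minimal sketch that avoids hand-written binary matching altogether. It assumes an OTP release that ships the binary module, which is later than the R12B/R13A releases discussed in this post. The module name find_crlf and the function after_crlf/1 are made up for illustration.

```erlang
-module(find_crlf).
-export([after_crlf/1]).

%% Hypothetical helper, not part of the original post.
%% binary:split/2 splits at the first occurrence of the pattern and
%% returns a one-element list when the pattern does not occur.
after_crlf(Bin) ->
    case binary:split(Bin, <<"\r\n">>) of
        [_Before, After] -> {found, After};
        [_Whole] -> not_found
    end.
```

One difference worth keeping in mind: for input that ends in "\r\n" this sketch returns {found, <<>>}, whereas the reordered dame_shit/3 above returns not_found, because its <<>> clause is tried before the found clause.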
The original post follows.
===========================
While using sub binaries I found what looks like a bug in the sub binary optimization. Code first:
-module(dame_shit).
-export([main/0, dame_shit/1, dame_shit2/1]).

dame_shit(Bin) ->
    dame_shit(Bin, <<"\r\n">>, not_found).
dame_shit(Bin, _, found) ->
    {found, Bin};
dame_shit(<<Tag:2/bytes, T/binary>>, Tag, not_found) ->
    dame_shit(T, Tag, found);
dame_shit(<<B, T/binary>>, Tag, not_found) ->
    dame_shit(T, Tag, not_found);
dame_shit(<<>>, _, _) ->
    not_found.

dame_shit2(Bin) ->
    dame_shit2(Bin, <<"\r\n">>, not_found).
dame_shit2(Bin, _, found) ->
    {found, Bin};
dame_shit2(<<Tag:2/bytes, T/binary>>=UnusedBinary, Tag, not_found) ->
    dame_shit2(T, Tag, found);
dame_shit2(<<B, T/binary>>, Tag, not_found) ->
    dame_shit2(T, Tag, not_found);
dame_shit2(<<>>, _, _) ->
    not_found.

main() ->
    io:format("Find '\\r\\n' result1: ~p~n", [dame_shit(<<"oops\r\nsomething maybe wrong!">>)]),
    io:format("Find '\\r\\n' result2: ~p~n", [dame_shit2(<<"oops\r\nsomething maybe wrong!">>)]),
    ok.
The only difference between dame_shit and dame_shit2 is the pattern
<<Tag:2/bytes, T/binary>>=UnusedBinary
which disables the binary optimization. The compiler output confirms this:
Quote:
$ erlc +bin_opt_info dame_shit.erl
./dame_shit.erl:8: Warning: OPTIMIZED: creation of sub binary delayed
./dame_shit.erl:10: Warning: OPTIMIZED: creation of sub binary delayed
./dame_shit.erl:10: Warning: variable 'B' is unused
./dame_shit.erl:19: Warning: INFO: the '=' operator will prevent delayed sub binary optimization
./dame_shit.erl:19: Warning: NOT OPTIMIZED: called function dame_shit2/3 does not begin with a suitable binary matching instruction
./dame_shit.erl:19: Warning: variable 'UnusedBinary' is unused
./dame_shit.erl:21: Warning: NOT OPTIMIZED: called function dame_shit2/3 does not begin with a suitable binary matching instruction
./dame_shit.erl:21: Warning: variable 'B' is unused
At run time the two functions really do return different results:
Quote:
$ erl -noshell -s dame_shit main -s init stop
Find '\r\n' result1: not_found
Find '\r\n' result2: {found,<<"something maybe wrong!">>}
The result is the same on both of these setups:
Windows XP + Erlang OTP R12B3
Linux x86_64 + Erlang OTP R12B3
Why does this happen? Add a log line to the body of the dame_shit(<<B, T/binary>>, Tag, not_found) -> clause:
io:format("B=~c~n", [B]),
Running it again prints:
Quote:
B=p
B=
B=m
B=h
B=g
B=a
B=e
B=r
B=g
B=!
Find '\r\n' result1: not_found
As the output shows, every time this clause is tried the match position has skipped 2 characters:
oo
ps\r
\nso
met
hin
g m
ayb
e w
ron
g!
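The 2-character skip is most likely caused by the 2/bytes segment in the preceding clause
dame_shit(<<Tag:2/bytes, T/binary>>, Tag, not_found) ->
The match position appears to stay advanced by those 2 bytes even when the comparison with Tag fails, so the next clause binds B from the wrong offset.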
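To make the suspected traversal concrete, the following sketch models it explicitly: each step discards 2 bytes and then binds the next byte to B. It is only a model of the observed behaviour under that assumption, not the code the compiler actually generates. The module name skip_trace and the function trace/1 are hypothetical.

```erlang
-module(skip_trace).
-export([trace/1]).

%% Hypothetical model of the faulty traversal described above:
%% two bytes are skipped (the failed Tag:2/bytes match), then one
%% byte is bound to B, and the walk continues with the rest.
trace(<<_Skipped:2/bytes, B, Rest/binary>>) ->
    [B | trace(Rest)];
trace(<<B>>) ->
    %% Only one byte left: nothing can be skipped, so it is bound to B.
    [B];
trace(_) ->
    [].
```

Calling skip_trace:trace(<<"oops\r\nsomething maybe wrong!">>) returns "p\nmhgaerg!", which is the same sequence of B values as in the log output above.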