A quick test: comparing the performance of two ways to build strings

First, two articles worth reading:
http://www.wagerlabs.com/blog/2008/02/parsing-text-an.html
http://ppolv.wordpress.com/2008/02/25/parsing-csv-in-erlang/
(Both need a proxy to reach from here, thanks to the damned "GFW".)

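When parsing a text protocol in Erlang, using a binary for the input is clearly the efficient choice. But I noticed that when these articles assemble the individual bytes of the binary into a string, they always accumulate them in a list (prepend each char, then reverse):
NewList = lists:reverse([Char|OldList])
rather than appending to a binary and converting it:
NewList = binary_to_list(<<OldBin/binary, Char>>)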

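I then ran a test which showed that for building large numbers of short strings, for example parsing <<"GET /index.html HTTP/1.1">> into ["GET","/index.html","HTTP/1.1"], the list approach does somewhat better.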
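To make the two approaches concrete, here is a minimal sketch that splits the request line above on spaces, once by accumulating each field in a list and once by appending to a binary. The module name split_demo and its function names are made up for illustration and do not come from the linked articles; the sketch also assumes a single space between fields.

```erlang
%% Hypothetical demo module: names are assumptions, not from the articles.
-module(split_demo).
-export([split_by_list/1, split_by_binary/1]).

%% List approach: prepend each char to the current field,
%% and reverse the field once when a space (or the end) is reached.
split_by_list(Bin) -> by_list(Bin, [], []).

by_list(<<>>, Field, Acc) ->
    lists:reverse([lists:reverse(Field) | Acc]);
by_list(<<$\s, Rest/binary>>, Field, Acc) ->
    by_list(Rest, [], [lists:reverse(Field) | Acc]);
by_list(<<Char, Rest/binary>>, Field, Acc) ->
    by_list(Rest, [Char | Field], Acc).

%% Binary approach: append each char to a binary,
%% and convert the finished field with binary_to_list/1.
split_by_binary(Bin) -> by_binary(Bin, <<>>, []).

by_binary(<<>>, Field, Acc) ->
    lists:reverse([binary_to_list(Field) | Acc]);
by_binary(<<$\s, Rest/binary>>, Field, Acc) ->
    by_binary(Rest, <<>>, [binary_to_list(Field) | Acc]);
by_binary(<<Char, Rest/binary>>, Field, Acc) ->
    by_binary(Rest, <<Field/binary, Char>>, Acc).
```

Both split_demo:split_by_list(<<"GET /index.html HTTP/1.1">>) and split_demo:split_by_binary(<<"GET /index.html HTTP/1.1">>) return ["GET","/index.html","HTTP/1.1"]; the only difference is how each field is accumulated.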

I wrote a simple looping test:
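%% Module header added so the file compiles; the name append_test is an assumption.
-module(append_test).
-export([test_append/0]).

test_append() ->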
    test_char_append(100),
    test_char_append(1000),
    test_char_append(10000),
    test_char_append(100000),
    test_char_append(1000000),
    test_char_append(10000000),
    test_field_append(10000),
    test_field_append(100000),
    test_field_append(200000),
    test_field_append(300000).

test_char_append(Loop) ->
    erlang:statistics(wall_clock),
    test_char_append_by_list(Loop, []),
    {_,T1} = erlang:statistics(wall_clock),
    test_char_append_by_binary(Loop, <<>>),
    {_,T2} = erlang:statistics(wall_clock),
    io:format("~p loops, test_char_append_by_list using time: ~pms~n", [Loop,T1]),
    io:format("~p loops, test_char_append_by_binary using time: ~pms~n~n", [Loop,T2]),
    ok.

test_field_append(Loop) ->
    erlang:statistics(wall_clock),
    test_field_append_by_list(Loop, []),
    {_,T1} = erlang:statistics(wall_clock),
    test_field_append_by_binary(Loop, []),
    {_,T2} = erlang:statistics(wall_clock),
    io:format("~p loops, test_field_append_by_list using time: ~pms~n", [Loop,T1]),
    io:format("~p loops, test_field_append_by_binary using time: ~pms~n~n", [Loop,T2]),
    ok.
    
test_char_append_by_list(0, List) -> lists:reverse(List);
test_char_append_by_list(N, List) -> test_char_append_by_list(N-1, [$!|List]).

test_char_append_by_binary(0, Bin) -> binary_to_list(Bin);
test_char_append_by_binary(N, Bin) -> test_char_append_by_binary(N-1, <<Bin/binary, $!>>).

test_field_append_by_list(0, List) -> lists:reverse(List);
test_field_append_by_list(N, List) -> 
    Field = test_char_append_by_list(100, []),
    test_field_append_by_list(N-1, [Field|List]).

test_field_append_by_binary(0, List) -> lists:reverse(List);
test_field_append_by_binary(N, List) -> 
    Field = test_char_append_by_binary(100, <<>>),
    test_field_append_by_binary(N-1, [Field|List]).


The output looks roughly like this:

100 loops, test_char_append_by_list using time: 0ms
100 loops, test_char_append_by_binary using time: 0ms

1000 loops, test_char_append_by_list using time: 0ms
1000 loops, test_char_append_by_binary using time: 0ms

10000 loops, test_char_append_by_list using time: 0ms
10000 loops, test_char_append_by_binary using time: 0ms

100000 loops, test_char_append_by_list using time: 16ms
100000 loops, test_char_append_by_binary using time: 16ms

1000000 loops, test_char_append_by_list using time: 203ms
1000000 loops, test_char_append_by_binary using time: 156ms

10000000 loops, test_char_append_by_list using time: 2922ms
10000000 loops, test_char_append_by_binary using time: 1594ms

10000 loops, test_field_append_by_list using time: 62ms
10000 loops, test_field_append_by_binary using time: 172ms

100000 loops, test_field_append_by_list using time: 1109ms
100000 loops, test_field_append_by_binary using time: 1860ms

200000 loops, test_field_append_by_list using time: 2672ms
200000 loops, test_field_append_by_binary using time: 4937ms

300000 loops, test_field_append_by_list using time: 3438ms
300000 loops, test_field_append_by_binary using time: 7062ms


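So for short strings the list is faster than the binary: in the field test, which builds 100-character strings, the list version was roughly 1.7 to 2.8 times faster. Only once a single string grows past about 100,000 characters (and who builds a list that long anyway?) does the binary gain a modest edge. When constructing large numbers of short strings, just stick with accumulating into a list and reversing it.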
