简介
profiling工具是语言生态的重要组成部分。这篇blog对erlang的vm层profiling做一个总结。
profiling tools overview
可参考
https://www.erlang.org/doc/ef...
名字 | 功能 | 剩余性能) |
---|---|---|
cprof | 统计模块的调用次数,可找出调用次数热点 | 0.67 |
eprof | 统计模块的调用次数和时间,可找出调用时间、次数热点 | 0.17 |
fprof | 统计模块调用次数和关系,可生成call-graph | 0.008 |
eflame | 通过erlang trace采集调用栈,进程状态 | 0.003 |
looking_glass | 也是基于erlang trace | 0.004 |
详细数据
压测代码如下:
defp do_sth(:cpu_intensive) do
start_time = :erlang.system_time(:microsecond)
end_time = start_time + 10_000_000
do_until_time(0, start_time, end_time)
end
def do_until_time(count, start_time, end_time) do
now = :erlang.system_time(:microsecond)
if now >= end_time do
Logger.warn(
"\ntotal_time: #{div(end_time - start_time, 1000_000)} second\ncount: #{count}, avg_μs: #{div(end_time - start_time, count)}"
)
else
for v <- 1..100 do
a = %{v => Base.encode64(:crypto.strong_rand_bytes(20))}
b = Jason.encode!(a)
Jason.decode!(b)
end
do_until_time(count + 1, start_time, end_time)
end
end
无压测
性能基准
total_time: 10 second
count: 24287, avg_μs: 411
cpu: %100
cprof
示例代码
:cprof.start()
do_sth(:cpu_intensive)
:cprof.pause()
data = :cprof.analyse()
Logger.warn("cprof_result:\n#{inspect(data, pretty: true)}")
输出结果
仅能获得模块的调用次数。
{243741614,
[
{Jason.Encode, 75685344,
[
{{Jason.Encode, :escape_json_chunk, 5}, 49314144},
{{Jason.Encode, :value, 3}, 3296400},
{{Jason.Encode, :escape_json, 4}, 3296400},
{{Jason.Encode, :escape_json, 3}, 3296400},
{{Jason.Encode, :"-fun.escape_json/3-", 3}, 3296400},
{{Jason.Encode, :map_naive_loop, 3}, 1648200},
{{Jason.Encode, :map_naive, 3}, 1648200},
{{Jason.Encode, :key, 2}, 1648200},
{{Jason.Encode, :escape_function, 1}, 1648200},
{{Jason.Encode, :encode_string, 2}, 1648200},
{{Jason.Encode, :encode_map_function, 1}, 1648200},
{{Jason.Encode, :encode, 2}, 1648200},
{{Jason.Encode, :"-fun.map_naive/3-", 3}, 1648200}
]},
{Jason.Decoder, 75685344,
[
{{Jason.Decoder, :string, 6}, 52610544},
{{Jason.Decoder, :value, 5}, 3296400},
{{Jason.Decoder, :"-string_decode_function/1-fun-0-", 1}, 3296400},
{{Jason.Decoder, :terminate, 6}, 1648200},
{{Jason.Decoder, :string_decode_function, 1}, 1648200},
{{Jason.Decoder, :parse, 2}, 1648200},
{{Jason.Decoder, :object_decode_function, 1}, 1648200},
{{Jason.Decoder, :object, 6}, 1648200},
{{Jason.Decoder, :key_decode_function, 1}, 1648200},
{{Jason.Decoder, :key, 6}, 1648200},
{{Jason.Decoder, :key, 5}, 1648200},
{{Jason.Decoder, :float_decode_function, 1}, 1648200},
{{Jason.Decoder, :"-key_decode_function/1-fun-0-", 1}, 1648200}
]}
性能
total_time: 10 second
count: 16482, avg_μs: 606
cpu: %100
eprof
示例代码
:eprof.start()
:eprof.start_profiling(:erlang.processes())
do_sth(:cpu_intensive)
:eprof.stop_profiling()
data = :eprof.analyze(:total, sort: :time)
Logger.warn("eprof_result:\n#{inspect(data, pretty: true)}")
输出结果
统计了调用时间。按时间占比排序是个较实用的选项。
FUNCTION CALLS % TIME [uS / CALLS]
-------- ----- ------- ---- [----------]
gen:call/4 1 0.00 0 [ 0.00]
gen:do_for_proc/2 1 0.00 0 [ 0.00]
...
maps:from_list/1 1249500 2.04 203667 [ 0.16]
'Elixir.Enum':into_map/2 833000 2.38 237244 [ 0.28]
'Elixir.ProfilingTest':'-do_until_time/3-fun-0-'/2 416500 2.39 238687 [ 0.57]
'Elixir.Jason.Decoder':parse/2 416500 2.41 240746 [ 0.58]
'Elixir.Base':do_encode64/2 416500 2.65 264394 [ 0.63]
'Elixir.Base':'-do_encode64/2-lbc$^0/2-0-'/2 1666000 7.30 728723 [ 0.44]
'Elixir.Base':enc64_pair/1 5831000 7.39 737817 [ 0.13]
crypto:strong_rand_bytes_nif/1 416500 7.56 754244 [ 1.81]
'Elixir.Jason.Encode':escape_json_chunk/5 12461680 12.19 1216586 [ 0.10]
'Elixir.Jason.Decoder':string/6 13294680 13.93 1390338 [ 0.10]
------------------------------------------------------------ -------- ------- ------- [----------]
Total: 61597835 100.00% 9981650 [ 0.16]
性能
total_time: 10 second
count: 4165, avg_μs: 2400
cpu: %100
fprof
示例代码
:fprof.trace([:start, {:procs, :erlang.processes()}])
do_sth(:cpu_intensive)
# 默认写入fprof.trace文件
:fprof.trace(:stop)
# 读取文件并分析
:fprof.profile()
# 输出结果,按函数自身调用时间排序,不包含调用其他函数的时间
:fprof.analyse(sort: :own)
# 按累计时间排序,包括调用其他函数的时间
:fprof.analyse(sort: :acc)
输出结果
kcallgrind
call graph生成见:https://segmentfault.com/a/11...
sort own
Processing data...
Creating output...
%% Analysis results:
{ analysis_options,
[{callers, true},
{sort, own},
{totals, false},
{details, true}]}.
% CNT ACC OWN
[{ totals, 5040222,10053.041, 9982.294}]. %%%
% CNT ACC OWN
[{ "<0.212.0>", 5034485,undefined, 9948.017}]. %%
{[{{'Elixir.Jason.Decoder',string,6}, 1011296, 0.000, 1539.664},
{{'Elixir.Jason.Decoder',key,5}, 33800, 2135.166, 47.418},
{{'Elixir.Jason.Decoder',value,5}, 33800, 0.000, 46.437}],
{ {'Elixir.Jason.Decoder',string,6}, 1078896, 2135.166, 1633.519}, %
[{{'Elixir.Jason.Decoder',string,6}, 1011296, 0.000, 1539.664},
{{'Elixir.Jason.Decoder',object,6}, 33800, 303.359, 140.051},
{{'Elixir.Jason.Decoder','-string_decode_function/1-fun-0-',1},67600, 85.381, 85.381},
{{'Elixir.Jason.Decoder',key,6}, 33800, 1899.845, 46.730},
{garbage_collect, 4858, 18.554, 18.554}]}.
{[{{'Elixir.Jason.Encode',escape_json_chunk,5},943696, 0.000, 1351.146},
{{'Elixir.Jason.Encode',escape_json,4}, 67600, 1487.645, 94.579}],
{ {'Elixir.Jason.Encode',escape_json_chunk,5},1011296, 1487.645, 1445.725}, %
[{{'Elixir.Jason.Encode',escape_json_chunk,5},943696, 0.000, 1351.146},
{garbage_collect, 8224, 41.920, 41.920}]}.
sort acc
Processing data...
Creating output...
%% Analysis results:
{ analysis_options,
[{callers, true},
{sort, acc},
{totals, false},
{details, true}]}.
% CNT ACC OWN
[{ totals, 5040222,10053.041, 9982.294}]. %%%
% CNT ACC OWN
[{ "<0.212.0>", 5034485,undefined, 9948.017}]. %%
{[{{'Elixir.ProfilingTest','test fprof',1}, 0,10051.624, 0.002},
{{fprof,call,1}, 1, 1.377, 0.004},
{undefined, 0, 0.033, 0.028}],
{ {fprof,just_call,2}, 1,10053.034, 0.034}, %
[{{'Elixir.ProfilingTest',do_sth,1}, 1,10051.622, 0.003},
{suspend, 1, 1.368, 0.000},
{{erlang,monitor,2}, 1, 0.005, 0.005},
{{erlang,demonitor,1}, 1, 0.005, 0.005}]}.
{[{undefined, 0,10053.008, 0.001}],
{ {'Elixir.ProfilingTest','test fprof',1}, 0,10053.008, 0.001}, %
[{{fprof,just_call,2}, 0,10051.624, 0.002},
{{fprof,trace,1}, 1, 1.383, 0.002}]}.
{[{{fprof,just_call,2}, 1,10051.622, 0.003}],
{ {'Elixir.ProfilingTest',do_sth,1}, 1,10051.622, 0.003}, %
[{{'Elixir.ProfilingTest',do_until_time,3}, 1,10051.617, 0.028},
{{erlang,system_time,1}, 1, 0.002, 0.002}]}.
性能
11:47:51.914 [warn]
total_time: 10 second
count: 341, avg_μs: 29325
cpu: %160
eflame
示例代码
spawn(fn -> do_sth(:cpu_intensive) end)
:eflame2.write_trace(:global_and_local_calls, './eflame.test', :all, 15 * 1000)
# 下面的语句需要等待一段时间
:eflame2.format_trace('./eflame.test', './eflame.test.out')
完成输出:
Hello, world, I'm <0.821.0> and I'm running....
<0.230.0> <0.228.0> iex(2)>
nil
<0.50.0> <0.10.0> <0.194.0> <0.64.0>
Writing to ./eflame.test.out for 8 processes... finished!
输出
生成火焰图:
~/git/eflame/flamegraph.riak-color.pl eflame.test.out > eflame.svg
性能
total_time: 10 second
count: 91, avg_μs: 109890
cpu: %120
looking_grass
代码
# 不限制采集范围会直接爆内存
:lg.trace(
[{:scope, [self()]}],
:lg_file_tracer,
'traces.lz4',
%{process_dump: true}
)
do_sth(:cpu_intensive)
:lg.stop()
:lg_flame.profile_many('traces.lz4.*', 'output')
输出
显然丢失了一些调用栈。暂未查明原因。
~/git/FlameGraph/flamegraph.pl output > looking_grass.svg
性能
total_time: 10 second
count: 127, avg_μs: 78740
总结
- erlang 产品中可用的profiling工具非常少,仅 cprof 消耗较少,eprof也需要谨慎开启。通过 trace pattern 减少对性能的影响.
- erlang 多进程,火焰图不适合作为分析 cpu 消耗的工具,可用于分析单个进程场景。
- 没有堪用的火焰图工具,实现太粗糙(没有采样,没有对中间结果做处理),内存,CPU消耗都过大。
参考
https://www.erlang.org/doc/ef...
https://interscity.org/assets...
https://github.com/slfritchie...
https://github.com/rabbitmq/l...