1) cProfile
cProfile can be embedded directly in Python code, for example:
import cProfile
cProfile.run('foo()', 'foo.out')
Viewing the results requires the pstats module, for example:
import pstats
p = pstats.Stats('foo.out')
p.print_stats()
pstats can also sort the results and print only the top entries. For example:
p.sort_stats('cumulative').print_stats(10)
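The snippets above assume that a function foo already exists and that the stats go through a file on disk. A minimal self-contained sketch of the same flow, kept entirely in memory, might look like the following. The names busy_work and profile_to_text are hypothetical and only serve the illustration.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical workload: sum of squares computed in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_to_text(func, *args, top=10):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()

    # pstats.Stats accepts a Profile object as well as a stats file name.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats('cumulative').print_stats(top)
    return result, buffer.getvalue()


if __name__ == '__main__':
    value, report = profile_to_text(busy_work, 100_000)
    print(value)
    print(report)
```

Passing stream= to pstats.Stats is what redirects the report away from stdout, which is handy when the profile is collected inside a test or a service.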
cProfile can also be run standalone from the command line, for example:
python -m cProfile -o profile.log my_test.py
Load the result with pstats and write the report to stats.out:
>>> import pstats
>>> stream = open('stats.out','w')
>>> p=pstats.Stats('profile.log',stream=stream)
>>> p.sort_stats('cumulative').print_stats()
>>> stream.close()
Contents of stats.out:
Tue Mar  5 15:41:17 2019    profile.log

         349093 function calls (345925 primitive calls) in 0.989 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.005    0.005    1.028    1.028 my_test.py:1(<module>)
        1    0.001    0.001    0.609    0.609 build/bdist.linux-x86_64/egg/my_algo/utils.py:11(<module>)
        1    0.004    0.004    0.480    0.480 /usr/lib64/python2.7/site-packages/scipy/stats/__init__.py:364(<module>)
        1    0.002    0.002    0.454    0.454 /usr/lib64/python2.7/site-packages/scipy/stats/stats.py:158(<module>)
        1    0.009    0.009    0.346    0.346 /usr/lib64/python2.7/site-packages/scipy/stats/distributions.py:8(<module>)
        1    0.000    0.000    0.305    0.305 preprocess.py:4(<module>)
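Meaning of each column:
Column | Meaning |
---|---|
ncalls | number of calls; when two numbers are shown (for example 3/1), the function recursed: the first is the total number of calls, the second the number of primitive (non-recursive) calls |
tottime | total time spent in the function (code block) itself, excluding time spent in the sub-functions it calls |
percall | time per call, equal to tottime divided by ncalls |
cumtime | cumulative time spent in the function (code block), including all sub-calls |
percall | cumtime divided by the number of primitive calls |
filename:lineno(function) | location of the function (code block) |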
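To make the ncalls and percall columns concrete, here is a small sketch that profiles a recursive function and reads the columns back programmatically. The function factorial is a hypothetical example, and Stats.get_stats_profile() is assumed to be available, which requires Python 3.9 or later.

```python
import cProfile
import pstats


def factorial(n):
    """Hypothetical recursive function, used only to trigger recursive calls."""
    return 1 if n <= 1 else n * factorial(n - 1)


def profile_factorial(n):
    """Profile factorial(n) and return its row of the stats table."""
    profiler = cProfile.Profile()
    profiler.runcall(factorial, n)
    # get_stats_profile() (Python 3.9+) exposes the table as dataclasses.
    stats_profile = pstats.Stats(profiler).get_stats_profile()
    return stats_profile.func_profiles['factorial']


if __name__ == '__main__':
    row = profile_factorial(5)
    # ncalls is a string: "total/primitive" when the function recursed.
    print('ncalls :', row.ncalls)
    print('tottime:', row.tottime, 'percall:', row.percall_tottime)
    print('cumtime:', row.cumtime, 'percall:', row.percall_cumtime)
```

For factorial(5) there are five calls in total but only one of them is primitive, so ncalls is reported as 5/1.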
Sort keys accepted by pstats sort_stats:
Valid Arg | Meaning |
---|---|
'calls' | call count |
'cumulative' | cumulative time |
'cumtime' | cumulative time |
'file' | file name |
'filename' | file name |
'module' | file name |
'ncalls' | call count |
'pcalls' | primitive call count |
'line' | line number |
'name' | function name |
'nfl' | name/file/line |
'stdname' | standard name |
'time' | internal time |
'tottime' | internal time |
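The effect of the sort key is easiest to see on a workload where call count and time disagree. The sketch below is an illustration under assumed names (cheap_step, costly_step, workload and report_sorted_by are all hypothetical): one function is called often but is cheap, the other is called rarely but does more work.

```python
import cProfile
import io
import pstats


def cheap_step():
    """Hypothetical function: called many times, does almost nothing."""
    return 1


def costly_step():
    """Hypothetical function: called rarely, loops for a while."""
    total = 0
    for i in range(20_000):
        total += i * i
    return total


def workload():
    for _ in range(50):
        cheap_step()
    for _ in range(3):
        costly_step()


def report_sorted_by(*keys):
    """Profile workload() and return the report sorted by the given keys."""
    profiler = cProfile.Profile()
    profiler.runcall(workload)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # A string restriction is a regular expression matched against each
    # "filename:lineno(function)" entry; here it keeps the two *_step rows.
    stats.sort_stats(*keys).print_stats('_step')
    return buffer.getvalue()


if __name__ == '__main__':
    print(report_sorted_by('calls'))
    print(report_sorted_by('tottime', 'calls'))
```

sort_stats accepts several keys; later keys are only used to break ties between entries that compare equal on the earlier ones.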
pstats can print the callers of each function with Stats.print_callers(*restrictions) and the callees with Stats.print_callees(*restrictions).
For example:
>>> p.sort_stats('cumtime').print_callers(20)
   Ordered by: cumulative time
   List reduced from 3506 to 20 due to restriction <20>

Function                                                                     was called by...
                                                                                 ncalls  tottime  cumtime
my_test.py:1(<module>)                                                       <-
build/bdist.linux-x86_64/egg/my_algo/utils.py:11(<module>)                   <-       1    0.001    0.609  my_test.py:1(<module>)
/usr/lib64/python2.7/site-packages/scipy/stats/__init__.py:364(<module>)     <-       1    0.004    0.480  build/bdist.linux-x86_64/egg/my_algo/utils.py:11(<module>)
/usr/lib64/python2.7/site-packages/scipy/stats/stats.py:158(<module>)        <-       1    0.002    0.454  /usr/lib64/python2.7/site-packages/scipy/stats/__init__.py:364(<module>)
/usr/lib64/python2.7/site-packages/scipy/stats/distributions.py:8(<module>)  <-       1    0.009    0.346  /usr/lib64/python2.7/site-packages/scipy/stats/stats.py:158(<module>)
pm_preprocess.py:4(<module>)                                                 <-       1    0.000    0.305  my_test.py:1(<module>)
/usr/lib64/python2.7/site-packages/pandas/__init__.py:5(<module>)            <-       1    0.004    0.302  preprocess.py:4(<module>)
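A small sketch of print_callers and print_callees on a toy call graph may help when reading the output above. All function names here (helper, path_a, path_b, entry_point, call_graph_reports) are hypothetical; the restriction is passed as a regular expression so that only the row for one function is printed.

```python
import cProfile
import io
import pstats


def helper():
    """Hypothetical shared function reached through two different callers."""
    return sum(range(1000))


def path_a():
    return helper()


def path_b():
    return helper() + helper()


def entry_point():
    return path_a() + path_b()


def call_graph_reports():
    """Return (callers of helper, callees of entry_point) as report text."""
    profiler = cProfile.Profile()
    profiler.runcall(entry_point)

    callers_buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=callers_buf)
    # Who called helper()? The regex matches "filename:lineno(helper)".
    stats.sort_stats('cumtime').print_callers(r'\(helper\)')

    callees_buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=callees_buf)
    # What did entry_point() call?
    stats.sort_stats('cumtime').print_callees(r'\(entry_point\)')
    return callers_buf.getvalue(), callees_buf.getvalue()


if __name__ == '__main__':
    callers, callees = call_graph_reports()
    print(callers)
    print(callees)
```

In the callers report, the row for helper lists path_a with one call and path_b with two, which is the same layout as the ncalls/tottime/cumtime triple shown above.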
2) line_profiler
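cProfile narrows a hotspot down to a function; line_profiler narrows it down to individual lines inside that function. Mark the functions to analyze with the @profile decorator and run the script through kernprof, for example: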
@profile
def slow_function(a, b, c):
    ...
Then run: $ kernprof -l script_to_profile.py
The results are saved to script_to_profile.py.lprof and can be viewed with:
$ python -m line_profiler script_to_profile.py.lprof
Alternatively, pass -v to kernprof to print the results to the terminal right after the run:
$ kernprof -l -v script_to_profile.py
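One practical wrinkle: the @profile name is only defined when the script runs under kernprof -l, which inserts it into builtins, so running the same script with plain python raises NameError. A common workaround, sketched here as an assumption rather than as part of line_profiler itself, is to fall back to a no-op decorator. The body of slow_function below is hypothetical, since the original elides it.

```python
import builtins

# `kernprof -l` injects a `profile` decorator into builtins before running
# the script. Outside kernprof the name is missing, so use a no-op instead.
profile = getattr(builtins, 'profile', lambda func: func)


@profile
def slow_function(a, b, c):
    """Hypothetical body; the original example leaves it as `...`."""
    total = 0
    for i in range(a):
        total += (i * b) % c
    return total


if __name__ == '__main__':
    print(slow_function(1000, 7, 13))
```

With this guard the script runs unchanged both as `python script_to_profile.py` and as `kernprof -l -v script_to_profile.py`.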
Summary: when analyzing performance, first use cProfile to find the hot functions, then use line_profiler to find the hot lines inside them.
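3) Visualizing the results
Visualization can be done with tools such as gprof2dot, pyprof2calltree, or pyinstrument. Take the following Python program (test.py in the commands further down) as an example: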
num = 1000
def calculateSum(n):
    sum = 0
    counter = 1
    while counter <= n:
        sum = sum + counter
        counter += 1
    return sum
sumN = calculateSum(num)
print("The sum from 1 to %d is: %d" % (num,sumN))
Visualizing the result with gprof2dot:
python -m cProfile -o output.pstats test.py
gprof2dot output.pstats -f pstats | dot -Tpng -o output.png
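The output.pstats file does not have to come from the cProfile command line. As a sketch, the same kind of file can be written from inside Python with Profile.dump_stats and then passed to gprof2dot as above. The helper name write_pstats is hypothetical, and the accumulator is renamed to total here to avoid shadowing the built-in sum.

```python
import cProfile
import os
import pstats
import tempfile


def calculateSum(n):
    """Same loop as the example program above."""
    total = 0
    counter = 1
    while counter <= n:
        total = total + counter
        counter += 1
    return total


def write_pstats(path, n=1000):
    """Profile calculateSum(n) and save the stats in pstats format."""
    profiler = cProfile.Profile()
    result = profiler.runcall(calculateSum, n)
    profiler.dump_stats(path)
    return result


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        stats_path = os.path.join(tmp, 'output.pstats')
        print(write_pstats(stats_path))
        # The file can be loaded by pstats (and therefore by gprof2dot).
        pstats.Stats(stats_path).sort_stats('cumulative').print_stats(5)
```

This is useful when only one code path should appear in the graph, because module import time and other start-up work are left out of the profile.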
Output format:
A node in the output graph represents a function and has the following layout:
+------------------------------+
|        function name         |
| total time % ( self time % ) |
|         total calls          |
+------------------------------+
where:
total time % is the percentage of the running time spent in this function and all its children;
self time % is the percentage of the running time spent in this function alone;
total calls is the total number of times this function was called (including recursive calls).
An edge represents the calls between two functions and has the following layout:
           total time %
              calls
parent --------------------> children
Where:
total time % is the percentage of the running time transferred from the children to this parent (if available);
calls is the number of calls the parent function called the children.
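To relate the node labels back to the pstats columns, the sketch below computes rough equivalents of "total time %", "self time %" and "total calls" from a profile. This is only an approximation under stated assumptions: it divides cumtime and tottime by the overall profiled time, which is not necessarily how gprof2dot derives its numbers. The names inner, outer and node_labels are hypothetical, and get_stats_profile() requires Python 3.9 or later.

```python
import cProfile
import pstats


def inner(n):
    """Hypothetical leaf function doing the actual work."""
    total = 0
    for i in range(n):
        total += i
    return total


def outer(n):
    """Hypothetical parent that spends nearly all its time in inner()."""
    return inner(n) + inner(n)


def node_labels(func, *args):
    """Return {function name: (total time %, self time %, ncalls)}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    profile = pstats.Stats(profiler).get_stats_profile()  # Python 3.9+
    labels = {}
    for name, row in profile.func_profiles.items():
        if profile.total_tt > 0:
            total_pct = 100.0 * row.cumtime / profile.total_tt
            self_pct = 100.0 * row.tottime / profile.total_tt
        else:
            # Workload too short to measure at the reported precision.
            total_pct = self_pct = 0.0
        labels[name] = (total_pct, self_pct, row.ncalls)
    return labels


if __name__ == '__main__':
    for name, (total_pct, self_pct, calls) in node_labels(outer, 200_000).items():
        print('%-10s %6.2f%% (%6.2f%%) %s' % (name, total_pct, self_pct, calls))
```

A function like outer shows a high total time % but a low self time %, which is the pattern to look for in the graph when deciding whether to descend into its children.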
References:
https://docs.python.org/3.2/library/profile.html
https://github.com/rkern/line_profiler
https://github.com/jrfonseca/gprof2dot
https://thirld.com/blog/2014/11/30/visualizing-the-results-of-profiling-python-code/