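perf was only merged into the kernel tree in kernel 2.6.31. Older distribution releases ship older kernels and do not include perf, so to use it there you have to build it yourself. The build described here is static: the resulting executable does not depend on shared libraries at run time and can be copied to other machines and run directly.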
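Whether a given machine is affected comes down to comparing its kernel version against 2.6.31. A minimal sketch of that comparison in POSIX shell follows. The helper name `kernel_at_least` is made up for illustration, and it assumes the version starts with plain dotted integers (anything after the first `-` is ignored).

```shell
#!/bin/sh
# kernel_at_least VERSION MINIMUM (hypothetical helper)
# Succeeds when VERSION >= MINIMUM, comparing the first three dotted
# fields numerically. Assumes those fields are plain integers.
kernel_at_least() {
    have=${1%%-*}.0.0    # strip the "-release" suffix, pad missing fields with 0
    want=${2%%-*}.0.0
    for field in 1 2 3; do
        h=$(printf '%s\n' "$have" | cut -d. -f"$field")
        w=$(printf '%s\n' "$want" | cut -d. -f"$field")
        if [ "$h" -gt "$w" ]; then return 0; fi
        if [ "$h" -lt "$w" ]; then return 1; fi
    done
    return 0
}

# Example with fixed inputs: the kernel used in this article is new enough.
if kernel_at_least 2.6.32 2.6.31; then
    echo "perf is in the kernel tree"
fi
```

On a live system the first argument would be `"$(uname -r)"`.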
1. Build environment
The build environment:
a. Kernel version: 2.6.32
b. Distribution: Redhat Enterprise 4.3
c. perf source: the tools/perf directory of the linux2-6-32 source tree
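2. Installing dependencies
Before building, install the elfutils-libelf-devel rpm package. Its version must match the elfutils-libelf package already installed on the system. If it is missing, the build fails with the following errors: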
[root@RedHat_4_3 perf]# make LDFLAGS=-static
Makefile:402: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
Makefile:419: *** No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel. Stop.
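Along the way you may hit further complaints that installed rpm packages are too old. These can all be worked around by setting a few make macros.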
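The version match required above can be checked before starting the build. The sketch below assumes an rpm-based system and uses `rpm -q --qf` to print each package's version and release. The function name `libelf_devel_matches` is hypothetical.

```shell
#!/bin/sh
# libelf_devel_matches (hypothetical helper)
# Prints the version-release of elfutils-libelf and elfutils-libelf-devel
# and succeeds only when the two strings are identical. Returns 2 when
# either package cannot be queried (for example, when it is not installed).
libelf_devel_matches() {
    fmt='%{VERSION}-%{RELEASE}\n'
    lib=$(rpm -q --qf "$fmt" elfutils-libelf) || return 2
    dev=$(rpm -q --qf "$fmt" elfutils-libelf-devel) || return 2
    echo "elfutils-libelf:       $lib"
    echo "elfutils-libelf-devel: $dev"
    [ "$lib" = "$dev" ]
}

# Only run the check where rpm is available.
if command -v rpm >/dev/null 2>&1; then
    libelf_devel_matches || echo "elfutils-libelf-devel is missing or does not match"
fi
```

On multilib systems rpm may print one line per architecture, in which case the comparison is over the whole multi-line string.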
3. Static build
The Makefile in the perf directory contains a comment block that describes the macros you can set at build time and what each one does:
# Define V to have a more verbose compile.
#
# Define PYTHON to point to the python binary if the default
# `python' is not correct; for example: PYTHON=python2
#
# Define PYTHON_CONFIG to point to the python-config binary if
# the default `$(PYTHON)-config' is not correct.
#
# Define ASCIIDOC8 if you want to format documentation with AsciiDoc 8
#
# Define DOCBOOK_XSL_172 if you want to format man pages with DocBook XSL v1.72.
#
# Define LDFLAGS=-static to build a static binary.
#
# Define EXTRA_CFLAGS=-m64 or EXTRA_CFLAGS=-m32 as appropriate for cross-builds.
#
# Define NO_DWARF if you do not want debug-info analysis feature at all.
#
# Define WERROR=0 to disable treating any warnings as errors.
According to these comments, a static build only requires defining the LDFLAGS macro with the value -static. Running the static build gives:
[root@RedHat_4_3 perf]# make LDFLAGS=-static
Makefile:402: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
Makefile:443: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
Makefile:493: The path 'python' is not executable.
Makefile:493: *** Please set 'python' appropriately. Stop.
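Take the first message first. It is caused by the installed elfutils being too old, and the fix is either to install a newer version or to disable DWARF support (DWARF is a debugging data format). Installing a newer version would pull in a series of packages and change the system considerably, so we choose to disable DWARF. According to the comments above, defining the NO_DWARF macro does this. Continuing the build: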
[root@RedHat_4_3 perf]# make LDFLAGS=-static NO_DWARF=true
Makefile:443: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
Makefile:493: The path 'python' is not executable.
Makefile:493: *** Please set 'python' appropriately. Stop.
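Again, start with the first message. It says newt-devel is not installed, yet both the newt and newt-devel packages are already installed on this system, so the only option is to disable newt support. The comments quoted earlier list no macro for that, so we have to look in the Makefile itself, which contains the following (partly elided):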
ifdef NO_NEWT
    BASIC_CFLAGS += -DNO_NEWT_SUPPORT
else
    FLAGS_NEWT=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS) -lnewt
    ifneq ($(call try-cc,$(SOURCE_NEWT),$(FLAGS_NEWT)),y)
        msg := $(warning newt not found, disables TUI support. Please install newt-devel or libnewt-dev);
        BASIC_CFLAGS += -DNO_NEWT_SUPPORT
    else
        # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h
        BASIC_CFLAGS += -I/usr/include/slang
        ......
    endif
endif
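From these statements we can infer that newt support is controlled by the NO_NEWT macro, so defining NO_NEWT disables it. Reading upward from this part of the Makefile, you will find that NO_DWARF is handled the same way. Adding NO_NEWT and building again: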
[root@RedHat_4_3 perf]# make LDFLAGS=-static NO_DWARF=true NO_NEWT=true
Makefile:493: The path 'python' is not executable.
Makefile:493: *** Please set 'python' appropriately. Stop.
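The way `ifdef NO_NEWT` reacts to `NO_NEWT=true` on the command line can be reproduced in isolation. The sketch below assumes GNU make is installed and writes a throwaway Makefile (a stand-in, not perf's) to a temporary directory. `newt_state` is a hypothetical helper name.

```shell
#!/bin/sh
# Write a tiny stand-in Makefile that mirrors the ifdef/else pattern.
demo_dir=$(mktemp -d)
printf '%s\n' \
    'ifdef NO_NEWT' \
    '$(info newt support disabled)' \
    'else' \
    '$(info newt support enabled)' \
    'endif' \
    'all: ; @:' > "$demo_dir/Makefile"

# newt_state [VAR=value ...] (hypothetical helper)
# Runs the demo Makefile with optional command-line variable definitions.
newt_state() {
    make -s -f "$demo_dir/Makefile" "$@"
}

if command -v make >/dev/null 2>&1; then
    newt_state                  # takes the else branch
    newt_state NO_NEWT=true     # takes the ifdef branch
fi
# The temporary directory in $demo_dir can be removed once you are done.
```

`ifdef` only tests whether the variable has a non-empty value, so `NO_NEWT=1` behaves the same as `NO_NEWT=true`.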
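This Python error is different from the earlier ones. At first it looked as though make could not find the Python installation, so I passed the absolute path of the python executable through the PYTHON macro, but the same error appeared. According to the comments quoted earlier, the build may also use the python-config command, which I could not find on Redhat 4.3, presumably because its Python is too old. So the question becomes how to disable this annoying Python support. Going by the Python-related definitions in the Makefile (around line 497 before modification), undefining the PYTHON macro should be enough, but adding an undef statement still produced the same error. Reviewing the Makefile again, only two lines actually matter when Python support is disabled. One is in the definition of disable-python (line 489):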
487 disable-python = $(eval $(disable-python_code))
488 define disable-python_code
489   BASIC_CFLAGS += -DNO_LIBPYTHON
490   $(if $(1),$(warning No $(1) was found))
491   $(warning Python support won't be built)
492 endef
The other is where the PYTHON macro is checked (line 499):
497 ifndef PYTHON
498   $(call disable-python,python interpreter)
499   python-clean :=
500 else
So I commented out every Python-related statement (lines 487 to 560 before modification) and re-added the two lines mentioned above, as follows (only part is shown):
487 BASIC_CFLAGS += -DNO_LIBPYTHON
488 python-clean :=
489 #disable-python = $(eval $(disable-python_code))
490 #define disable-python_code
491 # BASIC_CFLAGS += -DNO_LIBPYTHON
492 # $(if $(1),$(warning No $(1) was found))
493 # $(warning Python support won't be built)
494 #endef
......
559 # endif
560 # endif
561 # endif
562 #endif
Building again with the same command, "make LDFLAGS=-static NO_DWARF=true NO_NEWT=true", everything went smoothly until builtin-test.o, which failed with:
builtin-test.c: In function `sched__get_first_possible_cpu':
builtin-test.c:986: warning: implicit declaration of function `CPU_ALLOC'
builtin-test.c:986: warning: nested extern declaration of `CPU_ALLOC'
builtin-test.c:986: warning: assignment makes pointer from integer without a cast
builtin-test.c:987: warning: implicit declaration of function `CPU_ALLOC_SIZE'
builtin-test.c:987: warning: nested extern declaration of `CPU_ALLOC_SIZE'
builtin-test.c:988: warning: implicit declaration of function `CPU_ZERO_S'
builtin-test.c:988: warning: nested extern declaration of `CPU_ZERO_S'
builtin-test.c:991: warning: implicit declaration of function `CPU_FREE'
builtin-test.c:991: warning: nested extern declaration of `CPU_FREE'
builtin-test.c:1001: warning: implicit declaration of function `CPU_ISSET_S'
builtin-test.c:1001: warning: nested extern declaration of `CPU_ISSET_S'
builtin-test.c:1007: warning: implicit declaration of function `CPU_CLR_S'
builtin-test.c:1007: warning: nested extern declaration of `CPU_CLR_S'
builtin-test.c:1012: warning: nested extern declaration of `CPU_FREE'
builtin-test.c:991: warning: redundant redeclaration of 'CPU_FREE'
builtin-test.c:991: warning: previous implicit declaration of 'CPU_FREE' was here
builtin-test.c: In function `test__PERF_RECORD':
builtin-test.c:1301: warning: nested extern declaration of `CPU_FREE'
builtin-test.c:991: warning: redundant redeclaration of 'CPU_FREE'
builtin-test.c:991: warning: previous implicit declaration of 'CPU_FREE' was here
make: *** [builtin-test.o] Error 1
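Warnings like `implicit declaration of function 'CPU_ALLOC'` generally mean the name is not declared in any header the file includes. One way to confirm this is to search the headers directly. The sketch below uses a hypothetical helper, `header_defines`, and assumes the scheduling headers live at `/usr/include/sched.h` and `/usr/include/bits/sched.h`, which varies between distributions.

```shell
#!/bin/sh
# header_defines NAME FILE... (hypothetical helper)
# Succeeds when NAME appears as a whole word in at least one readable FILE.
header_defines() {
    name=$1
    shift
    for f in "$@"; do
        if [ -r "$f" ] && grep -qw "$name" "$f"; then
            return 0
        fi
    done
    return 1
}

# Report on each macro named in the compiler warnings.
for macro in CPU_ALLOC CPU_ALLOC_SIZE CPU_ZERO_S CPU_FREE CPU_ISSET_S CPU_CLR_S; do
    if header_defines "$macro" /usr/include/sched.h /usr/include/bits/sched.h; then
        echo "$macro: found"
    else
        echo "$macro: not found"
    fi
done
```

A whole-word match is used so that a hit on `CPU_ALLOC_SIZE` is not mistaken for `CPU_ALLOC`.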
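Macros such as CPU_ALLOC_SIZE and CPU_ZERO_S are not defined anywhere in this system's sched.h, probably because the system is too old (the CPU_ALLOC family comes from the glibc headers and was only added in glibc 2.7). We locate the entry for builtin-test.o in the Makefile and remove it so that this module is not built. The entry is: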
BUILTIN_OBJS += $(OUTPUT)builtin-test.o
Comment this line out and build again. The link step now fails:
perf.o(.data+0x1f0): undefined reference to `cmd_test'
collect2: ld returned 1 exit status
make: *** [perf] Error 1
This is also easy to fix. cmd_test is provided by the builtin-test module, and a search of the source shows it is used in only one place:
static void handle_internal_command(int argc, const char **argv)
{
    const char *cmd = argv[0];
    static struct cmd_struct commands[] = {
        { "buildid-cache", cmd_buildid_cache, 0 },
        { "buildid-list", cmd_buildid_list, 0 },
        { "diff", cmd_diff, 0 },
        { "evlist", cmd_evlist, 0 },
        { "help", cmd_help, 0 },
        { "list", cmd_list, 0 },
        { "record", cmd_record, 0 },
        { "report", cmd_report, 0 },
        { "bench", cmd_bench, 0 },
        { "stat", cmd_stat, 0 },
        { "timechart", cmd_timechart, 0 },
        { "top", cmd_top, 0 },
        { "annotate", cmd_annotate, 0 },
        { "version", cmd_version, 0 },
        { "script", cmd_script, 0 },
        { "sched", cmd_sched, 0 },
        { "kmem", cmd_kmem, 0 },
        { "lock", cmd_lock, 0 },
        { "kvm", cmd_kvm, 0 },
        { "test", cmd_test, 0 },
        { "inject", cmd_inject, 0 },
    };
    ......
}
In this table cmd_test is evidently the handler for the test subcommand. Commenting out that entry removes support for the test command. Building once more:
[root@RedHat_4_3 perf]# make LDFLAGS=-static NO_DWARF=true NO_NEWT=true
CC perf.o
LINK perf
This time there are no errors, and the perf executable is generated in the current directory. The build finally succeeded!
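The two manual edits made above (commenting out the builtin-test.o entry in the Makefile and the cmd_test entry in the command table) could be scripted roughly as follows. This is a sketch under assumptions: it presumes the command table lives in perf.c, that each pattern matches only the line quoted in this article, and that sed supports `-i` with a backup suffix. `disable_perf_test` is a hypothetical name.

```shell
#!/bin/sh
# disable_perf_test DIR (hypothetical helper)
# Comments out the builtin-test.o object in DIR/Makefile and the "test"
# entry in DIR/perf.c, keeping .orig backups of both files.
disable_perf_test() {
    dir=$1
    sed -i.orig '/^BUILTIN_OBJS += \$(OUTPUT)builtin-test\.o/s/^/#/' "$dir/Makefile" &&
    sed -i.orig '/"test",[[:space:]]*cmd_test,/s|^|//|' "$dir/perf.c"
}
```

It would be run as `disable_perf_test .` from the perf directory. Comparing each file with its `.orig` copy afterwards shows exactly which lines were touched.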
4. Verification
A successful build is only the first step. We still have to verify that the binary works.
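Since the goal was a static binary, it is worth confirming that before copying it to another machine. The sketch below relies on the `file` utility, which typically reports `statically linked` for such executables. `is_static` is a hypothetical helper.

```shell
#!/bin/sh
# is_static PATH (hypothetical helper)
# Succeeds when file(1) describes PATH as statically linked.
is_static() {
    file -b "$1" | grep -q 'statically linked'
}

# Only check when the freshly built binary is present.
if [ -x ./perf ]; then
    if is_static ./perf; then
        echo "perf is statically linked"
    else
        echo "perf still depends on shared libraries"
    fi
fi
```

`ldd ./perf` is another option. For a static binary it usually prints `not a dynamic executable`.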
First, the perf list command. Its output is:
[root@RedHat_4_3 perf]# ./perf list
List of pre-defined events (to be used in -e):
  cpu-cycles OR cycles                               [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cpu-clock                                          [Software event]
  task-clock                                         [Software event]
  page-faults OR faults                              [Software event]
  minor-faults                                       [Software event]
  major-faults                                       [Software event]
  context-switches OR cs                             [Software event]
  cpu-migrations OR migrations                       [Software event]
  alignment-faults                                   [Software event]
  emulation-faults                                   [Software event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-dcache-store-misses                             [Hardware cache event]
  L1-dcache-prefetches                               [Hardware cache event]
  L1-dcache-prefetch-misses                          [Hardware cache event]
Finally, the perf stat command:
[root@RedHat_4_3 perf]# ./perf stat ls >/dev/null
 Performance counter stats for 'ls':
       0.812494 task-clock                #    0.858 CPUs utilized
              0 context-switches          #    0.000 M/sec
              0 CPU-migrations            #    0.000 M/sec
            223 page-faults               #    0.274 M/sec
      1,938,102 cycles                    #    2.385 GHz
      1,199,394 stalled-cycles-frontend   #   61.88% frontend cycles idle
        631,632 stalled-cycles-backend    #   32.59% backend cycles idle
      1,367,408 instructions              #    0.71  insns per cycle
                                          #    0.88  stalled cycles per insn
        268,711 branches                  #  330.724 M/sec
         12,927 branch-misses             #    4.81% of all branches
    0.000946553 seconds time elapsed
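The figures after `#` in this output are derived from the raw counters, and re-deriving a few of them is a simple sanity check. The sketch below uses awk for the division. `counter_ratio` is a hypothetical helper, and the inputs are the counter values printed above.

```shell
#!/bin/sh
# counter_ratio NUMERATOR DENOMINATOR [SCALE] (hypothetical helper)
# Prints NUMERATOR / DENOMINATOR * SCALE with two decimals. Thousands
# separators (commas) in the inputs are removed first. SCALE defaults to 1.
counter_ratio() {
    num=$(printf '%s' "$1" | tr -d ,)
    den=$(printf '%s' "$2" | tr -d ,)
    awk -v n="$num" -v d="$den" -v s="${3:-1}" 'BEGIN { printf "%.2f\n", n / d * s }'
}

echo "insns per cycle:         $(counter_ratio 1,367,408 1,938,102)"
echo "stalled cycles per insn: $(counter_ratio 1,199,394 1,367,408)"
echo "frontend cycles idle %:  $(counter_ratio 1,199,394 1,938,102 100)"
echo "branch misses %:         $(counter_ratio 12,927 268,711 100)"
```

The four results match the 0.71, 0.88, 61.88% and 4.81% reported by perf stat.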
OK, we can now consider our build of perf usable!