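Perf -- a system performance tuning tool for Linux
On Ubuntu 11.04 the perf commands are only available after installing the two packages below. I installed them from the Software Center; apt-get install should work just as well.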
linux-tools
linux-tools-common
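This is a handy tool. It can collect many hardware-level statistics, such as CPU cache hits and misses, branch prediction, and instruction and cycle counts. It can also attach to a specific process and report per-function sample counts.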
widebright@:~/桌面$ perf list
List of pre-defined events (to be used in -e):
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
cache-references [Hardware event]
cache-misses [Hardware event]
branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cpu-clock [Software event]
task-clock [Software event]
page-faults OR faults [Software event]
minor-faults [Software event]
major-faults [Software event]
context-switches OR cs [Software event]
cpu-migrations OR migrations [Software event]
alignment-faults [Software event]
emulation-faults [Software event]
L1-dcache-loads [Hardware cache event]
L1-dcache-load-misses [Hardware cache event]
L1-dcache-stores [Hardware cache event]
L1-dcache-store-misses [Hardware cache event]
L1-dcache-prefetches [Hardware cache event]
L1-dcache-prefetch-misses [Hardware cache event]
L1-icache-loads [Hardware cache event]
L1-icache-load-misses [Hardware cache event]
L1-icache-prefetches [Hardware cache event]
L1-icache-prefetch-misses [Hardware cache event]
LLC-loads [Hardware cache event]
LLC-load-misses [Hardware cache event]
LLC-stores [Hardware cache event]
LLC-store-misses [Hardware cache event]
LLC-prefetches [Hardware cache event]
LLC-prefetch-misses [Hardware cache event]
dTLB-loads [Hardware cache event]
dTLB-load-misses [Hardware cache event]
dTLB-stores [Hardware cache event]
dTLB-store-misses [Hardware cache event]
dTLB-prefetches [Hardware cache event]
dTLB-prefetch-misses [Hardware cache event]
iTLB-loads [Hardware cache event]
iTLB-load-misses [Hardware cache event]
branch-loads [Hardware cache event]
branch-load-misses [Hardware cache event]
rNNN (see 'perf list --help' on how to encode it) [Raw hardware event descript
mem:<addr>[:access] [Hardware breakpoint]
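The rNNN entry in the list above takes a hexadecimal descriptor built from an event-select code and a unit mask. Below is a minimal sketch of that encoding. It assumes the common x86 layout (event select in the low byte, unit mask in the next byte); the helper name raw_event is made up for illustration, and the codes for a given CPU have to come from the vendor's manual.

```python
def raw_event(eventsel: int, umask: int) -> str:
    """Build an rNNN raw event name from an event-select code and a unit mask.

    Assumption: x86-style layout, umask in bits 8-15 and eventsel in bits 0-7.
    Other architectures encode raw events differently.
    """
    if not 0 <= eventsel <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("eventsel and umask must each fit in one byte")
    return f"r{(umask << 8) | eventsel:x}"


if __name__ == "__main__":
    # Illustrative values only: event select 0xA8 with unit mask 0x01.
    print(raw_event(0xA8, 0x01))  # prints r1a8
```

The resulting string can be passed wherever a symbolic event name is accepted, for example perf stat -e r1a8 ./a.out.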
perf stat ./a.out
^C
Performance counter stats for './a.out':
9,044 cache-misses # 0.003 M/sec (scaled from 66.87%)
523,191 cache-references # 0.172 M/sec (scaled from 66.96%)
21,838,315 branch-misses # 6.678 % (scaled from 33.13%)
327,014,993 branches # 107.285 M/sec (scaled from 33.04%)
2,355,587,681 instructions # 0.349 IPC (scaled from 49.41%)
6,740,540,287 cycles # 2211.403 M/sec (scaled from 67.03%)
100 page-faults # 0.000 M/sec
30 CPU-migrations # 0.000 M/sec
482 context-switches # 0.000 M/sec
3048.082970 task-clock-msecs # 0.596 CPUs
5.118230246 seconds time elapsed
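The values after the # in the output above are derived from the raw counts. The sketch below recomputes a few of them from the numbers in this run, which is a quick way to check how to read the report. The variable names are my own. Note that the counts marked "scaled from" are estimates: the counters were multiplexed, and perf scaled them up to cover the whole run.

```python
# Counter values copied from the `perf stat ./a.out` run above.
counts = {
    "cache-misses": 9_044,
    "cache-references": 523_191,
    "branch-misses": 21_838_315,
    "branches": 327_014_993,
    "instructions": 2_355_587_681,
    "cycles": 6_740_540_287,
}
task_clock_msecs = 3048.082970
elapsed_seconds = 5.118230246

# Instructions retired per CPU cycle.
ipc = counts["instructions"] / counts["cycles"]
# Share of branches that were mispredicted, in percent.
branch_miss_pct = 100.0 * counts["branch-misses"] / counts["branches"]
# Share of cache references that missed, in percent (perf did not print this one).
cache_miss_pct = 100.0 * counts["cache-misses"] / counts["cache-references"]
# CPU time used divided by wall-clock time.
cpus_used = task_clock_msecs / (elapsed_seconds * 1000.0)

print(f"IPC:              {ipc:.3f}")              # 0.349, matches the report
print(f"branch miss rate: {branch_miss_pct:.3f}%")  # 6.678%, matches the report
print(f"cache miss rate:  {cache_miss_pct:.3f}%")
print(f"CPUs used:        {cpus_used:.3f}")        # 0.596, matches the report
```

An IPC this far below 1, together with a branch miss rate near 7%, suggests the program spends much of its time stalled rather than retiring instructions.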
widebright@:~/桌面$ ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 08:21 ? 00:00:00 /sbin/init
root 2 0 0 08:21 ? 00:00:00 [kthreadd]
root 3 2 0 08:21 ? 00:00:00 [ksoftirqd/0]
root 5 2 0 08:21 ? 00:00:00 [kworker/u:0]
root 6 2 0 08:21 ? 00:00:00 [migration/0]
root 7 2 0 08:21 ? 00:00:00 [migration/1]
root 9 2 0 08:21 ? 00:00:00 [ksoftirqd/1]
root 10 2 0 08:21 ? 00:00:01 [kworker/0:1]
widebright@:~/桌面$ perf top -c 1000 -p 5
Fatal: Permission error - are you root?
Consider tweaking /proc/sys/kernel/perf_event_paranoid.
widebright@:~/桌面$ sudo perf top -c 1000 -p 5
[sudo] password for widebright:
-------------------------------------------------------------------------------
PerfTop: 302 irqs/sec kernel:100.0% exact: 0.0% [1000 cycles], (target_pid: 5)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ ________________________________ ________
603.00 19.7% i915_gem_retire_requests_ring [i915]
352.00 11.5% kref_put [kernel]
250.00 8.2% __ticket_spin_lock [kernel]
184.00 6.0% i915_gem_object_move_to_inactive [i915]
143.00 4.7% kfree [kernel]
113.00 3.7% __ticket_spin_unlock [kernel]
102.00 3.3% find_busiest_group [kernel]
69.00 2.3% i915_gem_retire_work_handler [i915]
65.00 2.1% __slab_free [kernel]
64.00 2.1% mod_timer [kernel]
61.00 2.0% i915_gem_object_move_to_active [i915]
53.00 1.7% update_cfs_load [kernel]
52.00 1.7% process_one_work [kernel]
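The permission error in the session above points at /proc/sys/kernel/perf_event_paranoid. That file holds a single integer, and lower values are more permissive. Before reaching for sudo, it can help to see what the current setting is. The following is a small sketch that only reads the value; the function name is hypothetical, and it returns None on systems where the file is missing or unreadable.

```python
from pathlib import Path
from typing import Optional

# Path named in the perf error message.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid_level(path: Path = PARANOID_PATH) -> Optional[int]:
    """Return the integer stored in the sysctl file, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    level = read_paranoid_level()
    if level is None:
        print("perf_event_paranoid is not readable on this system")
    else:
        print(f"perf_event_paranoid = {level}")
```

Changing the value requires root, so for a one-off measurement running perf under sudo, as in the session above, is the simpler route.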
=========================
PERF-STAT(1) perf Manual PERF-STAT(1)
NAME
perf-stat - Run a command and gather performance counter statistics
SYNOPSIS
perf stat [-e <EVENT> | --event=EVENT] [-a] <command>
perf stat [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]
DESCRIPTION
This command runs a command and gathers performance counter statistics
from it.
OPTIONS
<command>...
Any command you can specify in a shell.
-e, --event=
Select the PMU event. Selection can be a symbolic event name (use
perf list to list all events) or a raw PMU event (eventsel+umask)
in the form of rNNN where NNN is a hexadecimal event descriptor.
-i, --no-inherit
child tasks do not inherit counters
-p, --pid=<pid>
stat events on existing process id
-t, --tid=<tid>
stat events on existing thread id
-a, --all-cpus
system-wide collection from all CPUs
-c, --scale
scale/normalize counter values
-r, --repeat=<n>
repeat command and print average + stddev (max: 100)
-B, --big-num
print large numbers with thousands' separators according to locale
-C, --cpu=
Count only on the list of CPUs provided. Multiple CPUs can be
provided as a comma-separated list with no space: 0,1. Ranges of
CPUs are specified with -: 0-2. In per-thread mode, this option is
ignored. The -a option is still necessary to activate system-wide
monitoring. Default is to count on all CPUs.
-A, --no-aggr
Do not aggregate counts across all monitored CPUs in system-wide
mode (-a). This option is only valid in system-wide mode.
-n, --null
null run - don’t start any counters
-v, --verbose
be more verbose (show counter open errors, etc)
-x SEP, --field-separator SEP
print counts using a CSV-style output to make it easy to import
directly into spreadsheets. Columns are separated by the string
specified in SEP.
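The -e and -x options combine well when the counts are meant for a script rather than for reading. The sketch below builds such a command line and parses the result. Two things here are assumptions and not taken from the manual: the helper names are made up, and the parser expects a simple two-field "count<SEP>event" layout per line. Other perf versions may print extra columns, so check the real output before relying on this. The sample text is hypothetical; it reuses counts from the a.out run earlier in this post.

```python
from typing import Dict, List, Sequence


def build_perf_stat_argv(command: Sequence[str], events: Sequence[str],
                         sep: str = ",") -> List[str]:
    """Build an argv list for `perf stat -e <events> -x <sep> -- <command>`."""
    return ["perf", "stat", "-e", ",".join(events), "-x", sep, "--", *command]


def parse_counts(text: str, sep: str = ",") -> Dict[str, int]:
    """Parse lines of the ASSUMED form `count<sep>event` into {event: count}."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split(sep)]
        # Skip blank lines and rows whose first field is not a plain integer.
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        counts[fields[1]] = int(fields[0])
    return counts


# Hypothetical sample in the assumed layout, not captured from a real -x run.
SAMPLE = "2355587681,instructions\n6740540287,cycles\n"

if __name__ == "__main__":
    print(build_perf_stat_argv(["./a.out"], ["instructions", "cycles"]))
    parsed = parse_counts(SAMPLE)
    print(parsed["instructions"] / parsed["cycles"])
```

The argv list can be handed to subprocess.run on a machine where perf is installed; the explicit -- keeps options of the measured command from being read as perf options.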
EXAMPLES
$ perf stat -- make -j
Performance counter stats for 'make -j':
8117.370256 task clock ticks # 11.281 CPU utilization factor
678 context switches # 0.000 M/sec
133 CPU migrations # 0.000 M/sec
235724 pagefaults # 0.029 M/sec
24821162526 CPU cycles # 3057.784 M/sec
18687303457 instructions # 2302.138 M/sec
172158895 cache references # 21.209 M/sec
27075259 cache misses # 3.335 M/sec
Wall-clock time elapsed: 719.554352 msecs
SEE ALSO
perf-top(1), perf-list(1)