Performance Testing Tools 1: perf

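1. Introduction

perf is a performance analysis tool for Linux. The Linux performance counters are a kernel-based subsystem that provides a framework for all kinds of performance analysis. It covers hardware-level features (CPU/PMU, the Performance Monitoring Unit) and software features (software counters, tracepoints).

With perf, an application can use the PMU, tracepoints, and in-kernel counters to collect performance statistics. perf can analyze the performance of a specified application, of the kernel, or of both at once, giving a complete picture of where an application's bottlenecks are. It can analyze hardware events that occur while a program runs, such as instructions retired and processor clock cycles, as well as software events such as page faults and process switches.

perf is a comprehensive analysis tool: it scales from system-wide performance down to individual processes and threads, and even to the function and assembly level.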

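2. Building perf

The perf source code lives in the Linux kernel source tree under tools/perf; running make there builds the executable. Because perf is closely tied to the kernel, it should be kept in sync with the kernel version.

When running perf on an ARM platform, you may hit an error like the following: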

The sys_perf_event_open() syscall returned with 38 (Function not implemented) for event (cycles).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?

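This error means the relevant kernel option must be enabled, in this case CONFIG_PERF_EVENTS=y. For other errors, follow the hint printed by the command to enable the corresponding kernel options.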
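Before rebuilding a kernel, it can help to confirm whether the option is actually missing. The following is a minimal sketch, assuming the kernel configuration text has already been read from somewhere such as /boot/config-<version> or a decompressed /proc/config.gz (whether either exists depends on the distribution and kernel build). The function name and the sample fragments are hypothetical.

```python
import re


def kernel_option_enabled(config_text: str, option: str) -> bool:
    """Return True if `option` is set to y or m in kernel .config text."""
    # A disabled option appears as "# CONFIG_X is not set", which this
    # anchored pattern does not match.
    pattern = re.compile(rf"^{re.escape(option)}=(y|m)\s*$", re.MULTILINE)
    return pattern.search(config_text) is not None


# Illustrative .config fragments, made up for this sketch.
sample_enabled = "CONFIG_PERF_EVENTS=y\n# CONFIG_FTRACE is not set\n"
sample_disabled = "# CONFIG_PERF_EVENTS is not set\n"

print(kernel_option_enabled(sample_enabled, "CONFIG_PERF_EVENTS"))
print(kernel_option_enabled(sample_disabled, "CONFIG_PERF_EVENTS"))
```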

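3. How perf works

The Linux performance counters are a kernel-based subsystem that provides a performance analysis framework, covering hardware features (CPU, PMU) and software features (software counters, tracepoints).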

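3.1 tracepoints

Tracepoints are hooks scattered through the kernel source that fire when specific code is executed, a property that various trace/debug tools build on. perf records the events generated by tracepoints; by analyzing the resulting reports, a tuner can see in detail what the kernel does while a program runs and diagnose performance symptoms accurately. The filesystem nodes for these tracepoints are under /sys/kernel/debug/tracing/events.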
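The directory layout described above can be walked directly. The following is a rough sketch, assuming the usual layout in which each subsystem is a directory and each tracepoint is a subdirectory of it; the function names are hypothetical, and reading the real directory normally requires root privileges.

```python
import os
from typing import List


def _subdirs(path: str) -> List[str]:
    """Return sorted names of the subdirectories of `path`, or [] if unreadable."""
    try:
        with os.scandir(path) as entries:
            return sorted(e.name for e in entries if e.is_dir())
    except OSError:
        # Missing directory or insufficient permissions.
        return []


def list_tracepoints(events_dir: str) -> List[str]:
    """List tracepoints as "subsystem:event", the naming perf list uses."""
    names = []
    for subsystem in _subdirs(events_dir):
        for event in _subdirs(os.path.join(events_dir, subsystem)):
            names.append(f"{subsystem}:{event}")
    return names


# Example call (typically needs root):
# list_tracepoints("/sys/kernel/debug/tracing/events")
```

Control files such as enable and filter are plain files rather than directories, so they are skipped.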

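4. Main areas of focus

Performance analysis can guide algorithm optimization (trading off space and time complexity) and code optimization (faster execution, lower memory use).

It can assess how a program uses hardware resources, for example the number of accesses and misses at each cache level, pipeline stall cycles, and front-side bus accesses.

It can assess how a program uses operating-system resources: the number of system calls, context switches, and task migrations.

Events fall into three kinds:

  1. Hardware events are generated by the PMU, which detects under specific conditions whether a performance event occurred and how many times, for example cache hits;
  2. Software events are generated by the kernel and are spread across its functional modules; they count OS-related performance events such as process switches and ticks;
  3. Tracepoint events are triggered by static tracepoints in the kernel; they reveal details of kernel behavior while a program runs, such as the number of allocations made by the slab allocator.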

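5. Using perf

Running perf --help lists perf's subcommands:

No.  Command        Description
1    annotate       Reads the perf.data file produced by perf record and displays annotated code
2    archive        Packs every sampled ELF file into an archive, based on the build-ids recorded in the data file; with this archive, the samples in the data file can be analyzed on any machine
3    bench          Built-in benchmarks; currently includes two suites, for the scheduler and the memory-management subsystem
4    buildid-cache  Manages perf's build-id cache; every ELF file has a unique build-id, which perf uses to associate performance data with ELF files
5    buildid-list   Lists all build-ids recorded in the data file
6    data           Data-file related processing
7    diff           Compares two data files and reports per-symbol (function) differences in the hot-spot analysis
8    evlist         Lists all performance events in the perf.data data file
9    inject         Reads the event stream recorded by perf record and redirects it to standard output; additional events can be injected into the stream at any point in the analyzed code
10   kmem           Traces and measures the kernel memory (slab) subsystem
11   kvm            Traces and measures guest OSes running in KVM virtual machines
12   list           Lists all performance events supported on the current system, including hardware events, software events, and tracepoints
13   lock           Analyzes kernel lock information, including contention and wait latency
14   mem            Profiles memory accesses
15   record         Collects samples and records them in a data file, which other tools can then analyze
16   report         Reads the data file created by perf record and presents the hot-spot analysis
17   sched          Analyzes the scheduler subsystem
18   script         Runs extension scripts written in Perl or Python, generates script skeletons, and reads the data recorded in the data file
19   stat           Runs a command and collects a performance summary for that process, including CPI and cache miss rate
20   test           Runs sanity tests on the current hardware and software platform, to check whether it supports all perf features
21   timechart      Visualizes system behavior during the measurement period
22   top            Shows the hottest functions on the running system in real time, similar to the Linux top command
23   probe          Defines dynamic probe points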

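5.1 Commands for a global overview

Command      Description
perf list    Lists the performance events the current system supports; these are the event names x,y,z passed to perf stat -e x,y,z
perf bench   Runs baseline benchmarks on the system
perf test    Runs sanity tests on the system
perf stat    Collects global performance statistics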
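Metrics such as CPI are simple ratios of the counters that perf stat reports. The following sketch assumes the counters were collected with perf stat -x, -e cycles,instructions, and that each CSV line starts with the counter value, the unit, and the event name; that field order is taken from the perf-stat manual page and may differ between perf versions. The function names and the sample numbers are made up for illustration.

```python
from typing import Dict


def parse_perf_stat_csv(text: str) -> Dict[str, int]:
    """Extract {event_name: count} from perf stat -x, style output."""
    counts = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue
        fields = line.strip().split(",")
        if len(fields) < 3:
            continue
        value, event = fields[0], fields[2]
        if not value.isdigit():
            # Skips placeholders such as "<not counted>" or "<not supported>".
            continue
        counts[event] = int(value)
    return counts


def cycles_per_instruction(counts: Dict[str, int]) -> float:
    """CPI = cycles / instructions."""
    if counts.get("instructions", 0) <= 0:
        raise ValueError("instruction count missing or zero")
    return counts["cycles"] / counts["instructions"]


# Hypothetical sample output; these numbers are not measurements.
sample = (
    "3000000,,cycles,1000000,100.00,,\n"
    "2000000,,instructions,1000000,100.00,,\n"
)
print(cycles_per_instruction(parse_perf_stat_csv(sample)))
```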

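5.2 Commands for global detail

Command      Description
perf top     Shows in real time how much CPU each process and function on the system is using
perf probe   Defines custom dynamic events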

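5.3 Commands for analyzing specific subsystems

Command      Description
perf kmem    Analyzes slab subsystem performance
perf kvm     Analyzes KVM virtualization
perf lock    Analyzes lock performance
perf mem     Analyzes memory access performance
perf sched   Analyzes kernel scheduler performance
perf trace   Records system call traces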

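5.4 Common commands

perf record can work system-wide, on a specific process, or on a single event within a process, so it covers both the macro and the micro view.

Command        Description
perf record    Records data to perf.data
perf report    Generates a report
perf diff      Diffs two recordings
perf evlist    Lists the recorded performance events
perf annotate  Displays annotated code for the functions in perf.data
perf archive   Packs the relevant symbols so the data can be analyzed on another machine
perf script    Dumps perf.data as human-readable text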
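The three scopes of perf record described above map onto a handful of options: -a for system-wide, -p for an existing process, -e to select an event, and -o to choose the output file. The following sketch only builds the argument list and does not run perf; the helper name and its precedence rule (pid over command) are choices made for this example.

```python
from typing import List, Optional, Sequence


def perf_record_argv(
    pid: Optional[int] = None,
    event: Optional[str] = None,
    output: str = "perf.data",
    command: Optional[Sequence[str]] = None,
) -> List[str]:
    """Build a perf record command line for one of three scopes.

    pid given     -> attach to that process (-p)
    command given -> launch and profile that command
    neither       -> system-wide collection (-a)
    """
    argv = ["perf", "record", "-o", output]
    if event is not None:
        argv += ["-e", event]  # restrict recording to a single event
    if pid is not None:
        argv += ["-p", str(pid)]
    elif command:
        argv += ["--", *command]
    else:
        argv.append("-a")
    return argv


print(perf_record_argv())
print(perf_record_argv(pid=1234, event="cache-misses"))
print(perf_record_argv(command=["ls", "-l"]))
```

The resulting list can be passed to subprocess.run; the recording is then inspected with perf report.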

5.5 Visualization tools

Command                 Description
perf timechart record   Records events
perf timechart          Generates the output.svg file

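6. Overhead introduced by perf

Measuring with perf inevitably adds overhead. perf collects data in three ways, each with a different cost:

  • counting: the kernel keeps summary counts, mostly for hardware events, software events, and PMU counters; the related command is perf stat;
  • sampling: perf buffers event data and asynchronously writes it to the perf.data file, which is then analyzed offline with tools such as perf report;
  • bpf: available since kernel 4.4; allows more effective filtering and summarized output.

Of these, counting adds the least overhead, sampling can add very high overhead in some cases, and bpf can reduce the overhead substantially.

For sampling, the overhead from file I/O can be reduced by mounting a RAM-backed filesystem and writing the data there: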

mkdir /tmpfs
mount -t tmpfs tmpfs /tmpfs
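To confirm that a chosen output path such as /tmpfs/perf.data really lands on the RAM-backed mount, the mount table can be checked. The following is a small sketch, assuming the text comes from /proc/mounts, where the second and third whitespace-separated fields are the mount point and the filesystem type. The function name is hypothetical, the sample table is made up, and mount points containing escaped spaces are not handled.

```python
import os
from typing import Optional


def filesystem_type(mounts_text: str, path: str) -> Optional[str]:
    """Return the filesystem type of the mount that contains `path`."""
    path = os.path.normpath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Longest matching mount point wins; on ties, the later entry
        # wins because a later mount shadows an earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_point):
            best_point, best_type = mount_point, fstype
    return best_type


# Made-up mount table for illustration.
sample_mounts = "/dev/sda1 / ext4 rw 0 0\ntmpfs /tmpfs tmpfs rw 0 0\n"
print(filesystem_type(sample_mounts, "/tmpfs/perf.data"))
print(filesystem_type(sample_mounts, "/home/user/perf.data"))
```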

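7. perf list: listing all events

perf list groups events into the following types: hw, cache, and pmu events are all hardware-related; tracepoint events are based on kernel tracepoints; sw events are in effect kernel counters.

  • hw/hardware lists the supported hardware events, for example: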
ccion@ccion:~$ sudo perf list hardware

List of pre-defined events (to be used in -e):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  ref-cycles                                         [Hardware event]
  • sw/software lists the supported software events:
ccion@ccion:~$ sudo perf list sw

List of pre-defined events (to be used in -e):

  alignment-faults                                   [Software event]
  bpf-output                                         [Software event]
  context-switches OR cs                             [Software event]
  cpu-clock                                          [Software event]
  cpu-migrations OR migrations                       [Software event]
  dummy                                              [Software event]
  emulation-faults                                   [Software event]
  major-faults                                       [Software event]
  minor-faults                                       [Software event]
  page-faults OR faults                              [Software event]
  task-clock                                         [Software event]
  • cache/hwcache lists the hardware cache events:
ccion@ccion:~$ sudo perf list cache

List of pre-defined events (to be used in -e):

  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]
  node-store-misses                                  [Hardware cache event]
  node-stores                                        [Hardware cache event]
  • pmu lists the supported PMU events:
ccion@ccion:~$ sudo perf list pmu

List of pre-defined events (to be used in -e):

  branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
  branch-misses OR cpu/branch-misses/                [Kernel PMU event]
  bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
  cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
  cache-references OR cpu/cache-references/          [Kernel PMU event]
  cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
  cstate_core/c3-residency/                          [Kernel PMU event]
  cstate_core/c6-residency/                          [Kernel PMU event]
  cstate_core/c7-residency/                          [Kernel PMU event]
  cstate_pkg/c10-residency/                          [Kernel PMU event]
  cstate_pkg/c2-residency/                           [Kernel PMU event]
  cstate_pkg/c3-residency/                           [Kernel PMU event]
  cstate_pkg/c6-residency/                           [Kernel PMU event]
  cstate_pkg/c7-residency/                           [Kernel PMU event]
  cstate_pkg/c8-residency/                           [Kernel PMU event]
  cstate_pkg/c9-residency/                           [Kernel PMU event]
  cycles-ct OR cpu/cycles-ct/                        [Kernel PMU event]
  cycles-t OR cpu/cycles-t/                          [Kernel PMU event]
  el-abort OR cpu/el-abort/                          [Kernel PMU event]
  el-capacity OR cpu/el-capacity/                    [Kernel PMU event]
  el-commit OR cpu/el-commit/                        [Kernel PMU event]
  el-conflict OR cpu/el-conflict/                    [Kernel PMU event]
  el-start OR cpu/el-start/                          [Kernel PMU event]
  instructions OR cpu/instructions/                  [Kernel PMU event]
  intel_pt//                                         [Kernel PMU event]
  mem-loads OR cpu/mem-loads/                        [Kernel PMU event]
  mem-stores OR cpu/mem-stores/                      [Kernel PMU event]
  msr/aperf/                                         [Kernel PMU event]
  msr/mperf/                                         [Kernel PMU event]
  msr/pperf/                                         [Kernel PMU event]
  msr/smi/                                           [Kernel PMU event]
  msr/tsc/                                           [Kernel PMU event]
  power/energy-cores/                                [Kernel PMU event]
  power/energy-gpu/                                  [Kernel PMU event]
  power/energy-pkg/                                  [Kernel PMU event]
  power/energy-psys/                                 [Kernel PMU event]
  power/energy-ram/                                  [Kernel PMU event]
  ref-cycles OR cpu/ref-cycles/                      [Kernel PMU event]
  topdown-fetch-bubbles OR cpu/topdown-fetch-bubbles/ [Kernel PMU event]
  topdown-recovery-bubbles OR cpu/topdown-recovery-bubbles/ [Kernel PMU event]
  topdown-slots-issued OR cpu/topdown-slots-issued/  [Kernel PMU event]
  topdown-slots-retired OR cpu/topdown-slots-retired/ [Kernel PMU event]
  topdown-total-slots OR cpu/topdown-total-slots/    [Kernel PMU event]
  tx-abort OR cpu/tx-abort/                          [Kernel PMU event]
  tx-capacity OR cpu/tx-capacity/                    [Kernel PMU event]
  tx-commit OR cpu/tx-commit/                        [Kernel PMU event]
  tx-conflict OR cpu/tx-conflict/                    [Kernel PMU event]
  tx-start OR cpu/tx-start/                          [Kernel PMU event]
  uncore_cbox_0/clockticks/                          [Kernel PMU event]
  uncore_cbox_1/clockticks/                          [Kernel PMU event]
  uncore_cbox_2/clockticks/                          [Kernel PMU event]
  uncore_cbox_3/clockticks/                          [Kernel PMU event]
  uncore_cbox_4/clockticks/                          [Kernel PMU event]
  uncore_imc/data_reads/                             [Kernel PMU event]
  uncore_imc/data_writes/                            [Kernel PMU event]
  • tracepoint lists all supported tracepoints:
ccion@ccion:~$ sudo perf list tracepoint

List of pre-defined events (to be used in -e):

  alarmtimer:alarmtimer_cancel                       [Tracepoint event]
  alarmtimer:alarmtimer_fired                        [Tracepoint event]
  alarmtimer:alarmtimer_start                        [Tracepoint event]
  alarmtimer:alarmtimer_suspend                      [Tracepoint event]
  block:block_bio_backmerge                          [Tracepoint event]
  block:block_bio_bounce                             [Tracepoint event]
  block:block_bio_complete                           [Tracepoint event]
  block:block_bio_frontmerge                         [Tracepoint event]
  block:block_bio_queue                              [Tracepoint event]
  block:block_bio_remap                              [Tracepoint event]
  block:block_dirty_buffer                           [Tracepoint event]
  block:block_getrq                                  [Tracepoint event]
  block:block_plug                                   [Tracepoint event]
  block:block_rq_complete                            [Tracepoint event]
  block:block_rq_insert                              [Tracepoint event]
  block:block_rq_issue                               [Tracepoint event]
  block:block_rq_remap                               [Tracepoint event]
  block:block_rq_requeue                             [Tracepoint event]
  block:block_sleeprq                                [Tracepoint event]
  block:block_split                                  [Tracepoint event]
  block:block_touch_buffer                           [Tracepoint event]
  block:block_unplug                                 [Tracepoint event]
  bpf:bpf_map_create                                 [Tracepoint event]
  bpf:bpf_map_delete_elem                            [Tracepoint event]
  bpf:bpf_map_lookup_elem                            [Tracepoint event]
  bpf:bpf_map_next_key                               [Tracepoint event]
  bpf:bpf_map_update_elem                            [Tracepoint event]
  bpf:bpf_obj_get_map                                [Tracepoint event]
  bpf:bpf_obj_get_prog                               [Tracepoint event]
  bpf:bpf_obj_pin_map                                [Tracepoint event]
  bpf:bpf_obj_pin_prog                               [Tracepoint event]
  bpf:bpf_prog_get_type                              [Tracepoint event]
  bpf:bpf_prog_load                                  [Tracepoint event]
  bpf:bpf_prog_put_rcu                               [Tracepoint event]
  bridge:br_fdb_add                                  [Tracepoint event]
  bridge:br_fdb_external_learn_add                   [Tracepoint event]
  bridge:br_fdb_update                               [Tracepoint event]
  bridge:fdb_delete                                  [Tracepoint event]
  cgroup:cgroup_attach_task                          [Tracepoint event]
  cgroup:cgroup_destroy_root                         [Tracepoint event]
  cgroup:cgroup_mkdir                                [Tracepoint event]
  cgroup:cgroup_release                              [Tracepoint event]
  cgroup:cgroup_remount                              [Tracepoint event]
  cgroup:cgroup_rename                               [Tracepoint event]
  cgroup:cgroup_rmdir                                [Tracepoint event]
  cgroup:cgroup_setup_root                           [Tracepoint event]
  cgroup:cgroup_transfer_tasks                       [Tracepoint event]
  clk:clk_disable                                    [Tracepoint event]
  clk:clk_disable_complete                           [Tracepoint event]
  clk:clk_enable                                     [Tracepoint event]
  clk:clk_enable_complete                            [Tracepoint event]
  clk:clk_prepare                                    [Tracepoint event]
  clk:clk_prepare_complete                           [Tracepoint event]
  clk:clk_set_parent                                 [Tracepoint event]
  clk:clk_set_parent_complete                        [Tracepoint event]
  clk:clk_set_phase                                  [Tracepoint event]
  clk:clk_set_phase_complete                         [Tracepoint event]
  clk:clk_set_rate                                   [Tracepoint event]
  clk:clk_set_rate_complete                          [Tracepoint event]
  clk:clk_unprepare                                  [Tracepoint event]
  clk:clk_unprepare_complete                         [Tracepoint event]
  cma:cma_alloc                                      [Tracepoint event]
  cma:cma_release                                    [Tracepoint event]
  compaction:mm_compaction_begin                     [Tracepoint event]
  compaction:mm_compaction_defer_compaction          [Tracepoint event]
  compaction:mm_compaction_defer_reset               [Tracepoint event]
  compaction:mm_compaction_deferred                  [Tracepoint event]
  compaction:mm_compaction_end                       [Tracepoint event]
  compaction:mm_compaction_finished                  [Tracepoint event]
  compaction:mm_compaction_isolate_freepages         [Tracepoint event]
  compaction:mm_compaction_isolate_migratepages      [Tracepoint event]
  compaction:mm_compaction_kcompactd_sleep           [Tracepoint event]
  compaction:mm_compaction_kcompactd_wake            [Tracepoint event]
  compaction:mm_compaction_migratepages              [Tracepoint event]
  compaction:mm_compaction_suitable                  [Tracepoint event]
  compaction:mm_compaction_try_to_compact_pages      [Tracepoint event]
  compaction:mm_compaction_wakeup_kcompactd          [Tracepoint event]
  cpuhp:cpuhp_enter                                  [Tracepoint event]
  cpuhp:cpuhp_exit                                   [Tracepoint event]
  cpuhp:cpuhp_multi_enter                            [Tracepoint event]
  dma_fence:dma_fence_destroy                        [Tracepoint event]
  dma_fence:dma_fence_emit                           [Tracepoint event]
  dma_fence:dma_fence_enable_signal                  [Tracepoint event]
  dma_fence:dma_fence_init                           [Tracepoint event]
  dma_fence:dma_fence_signaled                       [Tracepoint event]
  dma_fence:dma_fence_wait_end                       [Tracepoint event]
  dma_fence:dma_fence_wait_start                     [Tracepoint event]
  drm:drm_vblank_event                               [Tracepoint event]
  drm:drm_vblank_event_delivered                     [Tracepoint event]
  drm:drm_vblank_event_queued                        [Tracepoint event]
  exceptions:page_fault_kernel                       [Tracepoint event]
  exceptions:page_fault_user                         [Tracepoint event]
  ext4:ext4_alloc_da_blocks                          [Tracepoint event]
  ext4:ext4_allocate_blocks                          [Tracepoint event]
  ext4:ext4_allocate_inode                           [Tracepoint event]
  ext4:ext4_begin_ordered_truncate                   [Tracepoint event]
  ext4:ext4_collapse_range                           [Tracepoint event]
  ext4:ext4_da_release_space                         [Tracepoint event]
  ext4:ext4_da_reserve_space                         [Tracepoint event]
  ext4:ext4_da_update_reserve_space                  [Tracepoint event]
  ext4:ext4_da_write_begin                           [Tracepoint event]
  ext4:ext4_da_write_end                             [Tracepoint event]
  ext4:ext4_da_write_pages                           [Tracepoint event]
  ext4:ext4_da_write_pages_extent                    [Tracepoint event]
  ext4:ext4_direct_IO_enter                          [Tracepoint event]
  ext4:ext4_direct_IO_exit                           [Tracepoint event]
  ext4:ext4_discard_blocks                           [Tracepoint event]
  ext4:ext4_discard_preallocations                   [Tracepoint event]
.......
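Listings like the ones above are long, so it can be convenient to group event names by the type shown in square brackets. The following sketch assumes the line format seen in the output above: an indented name, optional aliases separated by " OR ", and a bracketed type. The function name is hypothetical, and other perf versions may format the listing differently.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Indented event name(s), then the event type in square brackets.
_EVENT_LINE = re.compile(r"^\s+(?P<names>\S.*?)\s+\[(?P<kind>[^\]]+)\]\s*$")


def group_events(perf_list_output: str) -> Dict[str, List[str]]:
    """Group event names (including aliases) by their bracketed type."""
    groups = defaultdict(list)
    for line in perf_list_output.splitlines():
        match = _EVENT_LINE.match(line)
        if match is None:
            continue  # headers, blank lines, elisions
        for name in match.group("names").split(" OR "):
            groups[match.group("kind")].append(name.strip())
    return dict(groups)


# A few lines copied from the listings above.
sample = """List of pre-defined events (to be used in -e):

  branch-instructions OR branches                    [Hardware event]
  context-switches OR cs                             [Software event]
  block:block_rq_issue                               [Tracepoint event]
"""
for kind, names in group_events(sample).items():
    print(kind, names)
```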
