A Brief Guide to Using ftrace and Tracepoints

ftrace

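ftrace helps developers understand the runtime behavior of the Linux kernel, whether for debugging failures or analyzing performance.
ftrace began as a function tracer that could only record the kernel's function call flow. It has since become a framework that uses plugins to let developers add more kinds of tracers.
ftrace is maintained by Steven Rostedt, who created it while working at Red Hat.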

1. Kernel build (enable ftrace)

CONFIG_KPROBES_ON_FTRACE=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_FTRACE_MCOUNT_RECORD=y
CONFIG_FTRACE_MCOUNT_USE_CC=y
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y
CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y
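
Before rebuilding, it can be worth checking whether the running kernel already has these options. The following is a minimal sketch: check_kconfig is a hypothetical helper, not part of any kernel tooling, and the /boot/config-<release> path in the usage comment is an assumption that holds on many distributions but not all.

```shell
# Report which of the given options are set to "y" in a kernel config file.
# Usage: check_kconfig CONFIG_FILE OPTION...
check_kconfig() {
    cfg="$1"; shift
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$cfg"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}

# Example (assumes a distro-style config file is installed under /boot):
# check_kconfig "/boot/config-$(uname -r)" CONFIG_FTRACE CONFIG_DYNAMIC_FTRACE CONFIG_FTRACE_SYSCALLS
```

The CONFIG_HAVE_* entries are selected by the architecture rather than set by hand, so the options without the HAVE_ prefix are the ones to check first.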

2. Mount debugfs

mount -t debugfs debugfs /sys/kernel/debug/
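
On many systems debugfs is already mounted, and newer kernels also expose the same files through tracefs at /sys/kernel/tracing. The sketch below locates whichever tracing directory is usable; find_tracing_dir is a hypothetical helper name, and the candidate paths in the usage comment are the two common locations, not an exhaustive list.

```shell
# Print the first candidate directory that contains a "current_tracer" file.
# Usage: find_tracing_dir DIR...
find_tracing_dir() {
    for d in "$@"; do
        if [ -e "$d/current_tracer" ]; then
            echo "$d"
            return 0
        fi
    done
    return 1    # none of the candidates looks like a tracing directory
}

# Example:
# find_tracing_dir /sys/kernel/tracing /sys/kernel/debug/tracing
```

If the function prints nothing and returns non-zero, the mount command above (run as root) is still needed.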

3. View a function's call stack (using vfs_read() as an example)

Run all of the following commands in the /sys/kernel/debug/tracing/ directory:

echo 1 > options/func_stack_trace
echo vfs_read > set_ftrace_filter
echo 1 > tracing_on
echo function > current_tracer
echo 0 > tracing_on
cat trace | head -n 20

The output looks like this:

/sys/kernel/debug/tracing # cat trace | head -n 20
# tracer: function
#
# entries-in-buffer/entries-written: 418/418   #P:1
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
              sh-78      [000] .....   332.939292: vfs_read <-ksys_read
              sh-78      [000] .....   332.940093: <stack trace>
 => 0xffffffffc0333083
 => vfs_read
 => ksys_read
 => do_syscall_64
 => entry_SYSCALL_64_after_hwframe
              sh-78      [000] .....   332.940622: vfs_read <-ksys_read
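
The steps in this section can be wrapped into one function so that the tracer is always stopped and reset afterwards. This is a sketch under assumptions: trace_stack is a hypothetical helper, it must run as root against a real tracing directory, and it sets the filter before turning on func_stack_trace so that stack traces are not recorded for every kernel function.

```shell
# Record stack traces for calls to FUNC while COMMAND runs.
# Usage: trace_stack TRACING_DIR FUNC COMMAND [ARGS...]
trace_stack() {
    dir="$1"; func="$2"; shift 2
    echo 0 > "$dir/tracing_on"                 # start from a stopped state
    echo "$func" > "$dir/set_ftrace_filter"    # trace only this function
    echo 1 > "$dir/options/func_stack_trace"   # attach a stack trace to each hit
    echo function > "$dir/current_tracer"
    echo 1 > "$dir/tracing_on"
    "$@"                                       # workload to observe
    echo 0 > "$dir/tracing_on"
    head -n 20 "$dir/trace"
    # Restore defaults so later tracing sessions are not affected.
    echo 0 > "$dir/options/func_stack_trace"
    echo nop > "$dir/current_tracer"
    echo > "$dir/set_ftrace_filter"
}

# Example:
# trace_stack /sys/kernel/debug/tracing vfs_read cat /etc/hostname
```

The trace is printed before current_tracer is switched back to nop, because changing the tracer resets the buffer.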

4. View the functions called by a function (using vfs_read() as an example)

Run all of the following commands in the /sys/kernel/debug/tracing/ directory:

echo function_graph > current_tracer
echo vfs_read > set_graph_function
echo 5 > max_graph_depth
echo 1 > tracing_on
echo 0 > tracing_on
cat trace | head -n 40

The output looks like this:

/sys/kernel/debug/tracing # cat trace | head -n 40
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 0) + 26.397 us   |  mutex_unlock();
 0)               |  vfs_read() {
 0)               |    rw_verify_area() {
 0)               |      security_file_permission() {
 0)               |        selinux_file_permission() {
 0)   2.639 us    |          __inode_security_revalidate();
 0)   1.278 us    |          avc_policy_seqno();
 0) + 11.926 us   |        }
 0) + 15.646 us   |      }
 0) + 19.372 us   |    }
 0)               |    new_sync_read() {
 0)               |      tty_read() {
 0)   1.007 us    |        tty_paranoia_check();
 0)               |        tty_ldisc_ref_wait() {
 0)   1.485 us    |          ldsem_down_read();
 0)   3.461 us    |        }
 0)               |        n_tty_read() {
 0)   1.854 us    |          mutex_lock_interruptible();
 0)   1.532 us    |          down_read();
 0)   3.663 us    |          add_wait_queue();
 0)   2.981 us    |          copy_from_read_buf();
 0)   5.688 us    |          n_tty_check_unthrottle();
 0)   1.014 us    |          n_tty_kick_worker();
 0)   1.039 us    |          up_read();
 0)   2.701 us    |          remove_wait_queue();
 0)   0.978 us    |          mutex_unlock();
 0) + 36.193 us   |        }
 0)               |        tty_ldisc_deref() {
 0)   1.004 us    |          ldsem_up_read();
 0)   2.937 us    |        }
 0)   0.978 us    |        ktime_get_real_seconds();
 0) + 56.611 us   |      }
 0) + 60.801 us   |    }
 0) + 93.459 us   |  }
 0)               |  vfs_read() {
 0)               |    rw_verify_area() {
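
When the graph is long, it helps to filter it for the calls that took the most time. The sketch below is one way to do that with awk; slow_calls is a hypothetical helper, and it only looks at leaf calls (lines ending in ";"), since closing-brace lines do not name the function they belong to in this output format.

```shell
# Print leaf calls from function_graph output that took at least MIN_US microseconds.
# Usage: slow_calls MIN_US < trace_output
slow_calls() {
    awk -F'|' -v min="$1" '
        /^#/ { next }                        # skip header lines
        {
            n = split($1, f, " ")            # e.g. "0)" "+" "26.397" "us"
            name = $2
            sub(/^[ \t]+/, "", name)         # drop the call-depth indentation
            if (n >= 3 && f[n] == "us" && name ~ /;$/ && f[n-1] + 0 >= min + 0)
                printf "%s us  %s\n", f[n-1], name
        }'
}

# Example:
# slow_calls 5 < /sys/kernel/debug/tracing/trace
```

Lines that open a nested call, such as "vfs_read() {", carry no duration and are ignored.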

tracepoint

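A tracepoint is a hook placed in advance at a fixed point inside a kernel function. When execution reaches that point, the hook runs and calls the probe functions registered on it. A tracepoint can have one or more probes, and each probe can do whatever it is written to do, which makes tracepoints a way to observe what happens inside a function.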
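
Because tracepoints are compiled into the kernel at fixed points, the set that is available depends on the running kernel. The tracing directory lists them in its available_events file, one "subsystem:event" name per line. The sketch below searches that list; list_events is a hypothetical helper name.

```shell
# List trace events whose "subsystem:event" name matches PATTERN.
# Usage: list_events TRACING_DIR PATTERN
list_events() {
    grep -- "$2" "$1/available_events"
}

# Example:
# list_events /sys/kernel/debug/tracing mkdir
```

Each matching name corresponds to a directory under events/, for example events/syscalls/sys_enter_mkdir/.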

1. Kernel build (enable tracepoints)

CONFIG_TRACEPOINTS=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y

2. Mount debugfs

mount -t debugfs debugfs /sys/kernel/debug/

3. Use a tracepoint (using the mkdir system call as an example)

Run all of the following commands in the /sys/kernel/debug/tracing/ directory:

echo 1 > events/syscalls/sys_enter_mkdir/enable
echo 1 > tracing_on
cd ~
mkdir haha
cd /sys/kernel/debug/tracing/
cat trace

The output looks like this:

/sys/kernel/debug/tracing # cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 2/2   #P:1
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
           mkdir-84      [000] ...1.   697.463523: sys_mkdir(pathname: 7ffe07d21f4d, mode: 1ff)
           mkdir-85      [000] ...1.   707.071031: sys_mkdir(pathname: 7fff72d3af6c, mode: 1ff)
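
The same sequence can be scripted so that the event is switched off again once the workload finishes. This is a sketch under assumptions: trace_event is a hypothetical helper, it must run as root, and it clears the trace buffer first so that only entries from this run are shown.

```shell
# Enable one trace event, run COMMAND, then print the recorded entries.
# Usage: trace_event TRACING_DIR SUBSYSTEM/EVENT COMMAND [ARGS...]
trace_event() {
    dir="$1"; enable="$1/events/$2/enable"; shift 2
    echo > "$dir/trace"          # clear entries left over from earlier runs
    echo 1 > "$enable"
    echo 1 > "$dir/tracing_on"
    "$@"                         # workload to observe
    echo 0 > "$enable"           # stop recording this event
    cat "$dir/trace"
}

# Example:
# trace_event /sys/kernel/debug/tracing syscalls/sys_enter_mkdir mkdir /tmp/haha
```

Passing the workload as arguments avoids changing directories between enabling the event and reading the trace.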
