SystemTap usage notes

Installation

Before SystemTap can be used, some kernel debugging packages have to be installed.
Install SystemTap itself first:

sudo yum install systemtap systemtap-runtime  

The following command checks which packages are required for the running kernel and tries to install them:

stap-prep

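The manual installation flow, assuming none of the related packages are installed yet:
First check your system environment, because the packages must match the kernel version exactly.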

$uname -r
3.10.0-327.el7.x86_64

Three packages are needed:

  • kernel-debuginfo
  • kernel-debuginfo-common
  • kernel-devel

Append the version of your own kernel to each name, for example kernel-debuginfo-3.10.0-327.el7.x86_64. The packages can be found at http://rpm.pbone.net.
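As a small aid for the naming rule above, here is a sketch of a shell function that prints the package names for a given kernel release. The function name kernel_debug_pkgs is made up for this note. It also assumes that, on EL7-style repositories, the common package carries the architecture in its name (kernel-debuginfo-common-x86_64-...); check what your repository actually offers before relying on it.

```shell
#!/bin/sh
# Print the package names matching a kernel release.
# Usage: kernel_debug_pkgs [release] [arch]; both default to uname output.
# kernel_debug_pkgs is a hypothetical helper, not part of SystemTap.
kernel_debug_pkgs() {
    release="${1:-$(uname -r)}"
    arch="${2:-$(uname -m)}"
    echo "kernel-debuginfo-${release}"
    # Assumption: the common package embeds the architecture in its name.
    echo "kernel-debuginfo-common-${arch}-${release}"
    echo "kernel-devel-${release}"
}

kernel_debug_pkgs 3.10.0-327.el7.x86_64 x86_64
```

Called without arguments, it uses the release and architecture of the running kernel.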

Once everything is installed, run the following command to verify that the system supports SystemTap:

$sudo stap -ve 'probe begin { log("hello world") exit() }'
Pass 1: parsed user script and 116 library script(s) using 227408virt/38928res/3192shr/36024data kb, in 190usr/10sys/210real ms.
Pass 2: analyzed script: 1 probe(s), 2 function(s), 0 embed(s), 0 global(s) using 228068virt/39984res/3456shr/36684data kb, in 10usr/0sys/5real ms.
Pass 3: translated to C into "/tmp/stapIFvNry/stap_ce820c0cec98f129396f1b5044b5f319_1112_src.c" using 228192virt/40188res/3640shr/36808data kb, in 0usr/0sys/0real ms.
Pass 4: compiled C into "stap_ce820c0cec98f129396f1b5044b5f319_1112.ko" in 1500usr/450sys/1859real ms.
Pass 5: starting run.
hello world
Pass 5: run completed in 0usr/30sys/343real ms.

If the output looks like the above, the installation is complete and SystemTap is ready to use.

Usage

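stap -l lists the probe points that match a pattern, which shows the file and line where a function is defined. It works for both kernel code and user-space code.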

$stap -l 'process("/usr/lib64/libjemalloc.so.2").function("malloc")'
process("/usr/lib64/libjemalloc.so.2").function("malloc@/usr/include/stdlib.h:465")

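stap -L additionally lists the variables visible at each probe point. These variables can be referenced directly when writing a script later.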

$stap -L 'process("/usr/lib64/libjemalloc.so.2").function("malloc")'
process("/usr/lib64/libjemalloc.so.2").function("malloc@/usr/include/stdlib.h:465") $size:size_t $usize:size_t
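To illustrate how a variable reported by stap -L can be used, the sketch below writes a tiny script that prints $size on every malloc call of one process. The file name malloc_size.stp and the 10-second timer are arbitrary choices for this example; the library path is the one used in the listing above.

```shell
#!/bin/sh
# Write a minimal SystemTap script that prints the $size argument of malloc.
# The quoted 'EOF' keeps the shell from expanding $size inside the heredoc.
cat > malloc_size.stp <<'EOF'
probe process("/usr/lib64/libjemalloc.so.2").function("malloc") {
    # only report calls made by the process selected with -x
    if (pid() == target()) printf("malloc(%d)\n", $size)
}
# stop after 10 seconds (arbitrary duration for this example)
probe timer.s(10) { exit() }
EOF

# Run it against a process with: sudo stap -x <pid> malloc_size.stp
echo "wrote malloc_size.stp"
```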

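More involved probing and statistics are better written as a stap script. The syntax resembles C and is fairly simple. Here is a script I used for memory analysis.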

probe begin {
    printf("=============begin============\n")
}

//size of each tracked allocation, keyed by the returned pointer
global g_mem_ref_tbl
//user-space backtrace of each tracked allocation
global g_mem_bt_tbl
global cnt
global m_cnt
global str
probe process("/usr/lib64/libjemalloc.so.2").function("malloc").return {
    if (target() == pid()) {
      cnt++
      if ($size > 1024*1024) { // 1M
        g_mem_ref_tbl[$return] = $size
        g_mem_bt_tbl[$return] = sprint_ubacktrace()
        //g_mem_bt_tbl[$return] = sprint_ubacktrace_brief()
        //str = sprint_ubacktrace()
        //g_mem_bt_tbl[$return] = sprint_ustack(ubacktrace)
        m_cnt++
      }
    }
}

probe process("/usr/lib64/libjemalloc.so.2").function("free").call {
    if (target() == pid()) {
      //on free, drop the size and backtrace recorded for this pointer
      delete g_mem_ref_tbl[$ptr]
      delete g_mem_bt_tbl[$ptr]
    }
}

probe end {
    //finally, print the allocation backtraces of memory that was never freed
    total_size=0
    printf("=============end============\n")
    printf("total alloc count %d, mcnt=%d\n", cnt, m_cnt)
    foreach(mem in g_mem_bt_tbl) {
      if (g_mem_ref_tbl[mem] > 1024 * 1024) {
        total_size+=g_mem_ref_tbl[mem]
        printf("---------%d--------\n", g_mem_ref_tbl[mem])
        printf("%s\n", g_mem_bt_tbl[mem])
        printf("--------end--------\n")
      } else {
        printf("error %d, %s\n", g_mem_ref_tbl[mem], g_mem_bt_tbl[mem])
      }
    }
    printf("total size = %d\n", total_size)
}

//timer: exit after 60s
probe timer.ms(60000)
{
    exit()
}

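Variables need no type declaration; the type is inferred automatically. A global variable is declared with the global keyword in front.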
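The script above compares pid() with target(), so it needs to be told which process to watch. A minimal sketch of how it might be launched follows. The helper mem_stap_cmd and the application path are hypothetical; mem.stap is the script name that appears in the error messages later in these notes. The command is only printed so it can be reviewed before running it by hand.

```shell
#!/bin/sh
# Print (rather than execute) a stap command line for the memory script.
# Arguments: target PID, path of the application binary, optional script path.
# mem_stap_cmd is a hypothetical helper written for this note.
mem_stap_cmd() {
    pid="$1"
    app="$2"
    script="${3:-mem.stap}"
    # -x PID   sets the value returned by target()
    # -d PATH  adds symbol/unwind data for the application binary
    # --ldd    adds the same data for shared libraries found via ldd
    # The last two are there so sprint_ubacktrace() can resolve symbols.
    echo "sudo stap -x ${pid} -d ${app} --ldd ${script}"
}

mem_stap_cmd 12345 /usr/bin/myapp
```

Without symbol data for the application, the recorded backtraces may contain only raw addresses.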

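The official tapset library provides many functions; look them up as needed. If SystemTap is installed, the tapset sources in the directory below are also worth reading as examples.
https://sourceware.org/systemtap/tapsets/
/usr/share/systemtap/tapset (examples)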

Troubleshooting errors

parser error

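In my experience this error usually means the debuginfo packages are not installed correctly. If you are sure they are installed, try removing and reinstalling them.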

semantic error

This error comes in many variants. Here are the ones I have run into.

  1. In the example below, process was mistyped as processy.
semantic error: while resolving probe point: identifier 'processy' at mem.stap:12:7
        source: probe processy("/usr/lib64/libjemalloc.so.2").function("malloc").return {
  2. The probed function name is misspelled (malloc written as mmalloc). The output ends with the hint "similar functions: malloc, calloc, mallocx, valloc, realloc", which helps spot the typo.
semantic error: while resolving probe point: identifier 'process' at mem.stap:12:7
        source: probe process("/usr/lib64/libjemalloc.so.2").function("mmalloc").return {
                      ^

semantic error: no match (similar functions: malloc, calloc, mallocx, valloc, realloc)

Overflow and other runtime limits

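These errors occur at run time, mainly because SystemTap enforces a number of resource limits. The limits can be adjusted with command-line options.
Below are three such errors, each followed by the option to add to the stap command line.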

ERROR: probe overhead exceeded threshold

-DSTP_NO_OVERLOAD

ERROR: Skipped too many probes, check MAXSKIPPED or try again with stap -t for more details.

-DMAXSKIPPED=102400

ERROR: Array overflow, check MAXMAPENTRIES near identifier....

-DMAXMAPENTRIES=1024000
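The three options can be combined on a single command line. The sketch below prints such a command for review; stap_relaxed_cmd is a hypothetical helper name, the values are the ones listed above, and mem.stap is assumed to be the script being run.

```shell
#!/bin/sh
# Print a stap command line that relaxes the three limits listed above.
# stap_relaxed_cmd is a hypothetical helper; the script defaults to mem.stap.
stap_relaxed_cmd() {
    script="${1:-mem.stap}"
    # -DSTP_NO_OVERLOAD        disables the probe overhead check
    # -DMAXSKIPPED=102400      tolerates more skipped probes
    # -DMAXMAPENTRIES=1024000  allows more rows per global array
    echo "sudo stap -DSTP_NO_OVERLOAD -DMAXSKIPPED=102400 -DMAXMAPENTRIES=1024000 ${script}"
}

stap_relaxed_cmd
```

Raising these limits removes safety checks, so it is probably wise to start with the smallest values that make the error go away.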

The RESOURCE LIMITS section of man stap lists many more of these tunable parameters:

MAXNESTING
Maximum number of nested function calls. Default determined by script analysis, with a bonus 10 slots added for recursive scripts.

MAXSTRINGLEN
Maximum length of strings, default 128.

MAXTRYLOCK
Maximum number of iterations to wait for locks on global variables before declaring possible deadlock and skipping the probe, default 1000.

MAXACTION
Maximum number of statements to execute during any single probe hit (with interrupts disabled), default 1000.

MAXACTION_INTERRUPTIBLE
Maximum number of statements to execute during any single probe hit which is executed with interrupts enabled (such as begin/end probes),
default (MAXACTION * 10).

MAXMAPENTRIES
Maximum number of rows in any single global array, default 2048.

MAXERRORS
Maximum number of soft errors before an exit is triggered, default 0, which means that the first error will exit the script.

MAXSKIPPED
Maximum number of skipped probes before an exit is triggered, default 100. Running systemtap with -t (timing) mode gives more details about
skipped probes. With the default -DINTERRUPTIBLE=1 setting, probes skipped due to reentrancy are not accumulated against this limit.

MINSTACKSPACE
Minimum number of free kernel stack bytes required in order to run a probe handler, default 1024. This number should be large enough for
the probe handler’s own needs, plus a safety margin.

MAXUPROBES
Maximum number of concurrently armed user-space probes (uprobes), default somewhat larger than the number of user-space probe points named
in the script. This pool needs to be potentially large because individual uprobe objects (about 64 bytes each) are allocated for each
process for each matching script-level probe.

STP_MAXMEMORY
Maximum amount of memory (in kilobytes) that the systemtap module should use, default unlimited. The memory size includes the size of the
module itself, plus any additional allocations. This only tracks direct allocations by the systemtap runtime. This does not track indirect
allocations (as done by kprobes/uprobes/etc. internals).

TASK_FINDER_VMA_ENTRY_ITEMS
Maximum number of VMA pages that will be tracked at runtime. This might get exhausted for system wide probes inspecting shared library
variables and/or user backtraces. Defaults to 1536.

STP_PROCFS_BUFSIZE
Size of procfs probe read buffers (in bytes). Defaults to MAXSTRINGLEN. This value can be overridden on a per-procfs file basis using the
procfs read probe .maxsize(MAXSIZE) parameter.

That is all for now. I will add more as I pick up other useful techniques.
