Using the strace command to troubleshoot library dependency problems on a Linux server

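Brief overview:
Another use of strace is troubleshooting problems related to shared libraries. Running ldd on an executable tells you which shared libraries the program uses and where each one was found. However, if you are using an older glibc (2.2 or earlier), your ldd may be buggy: it can report that a library was found in one directory while the dynamic linker (/lib/ld-linux.so.2) looks for it in a different directory when the program actually runs. This usually happens because /etc/ld.so.conf and /etc/ld.so.cache are out of sync, or because /etc/ld.so.cache is corrupted. The problem does not appear with glibc 2.3.2, so this bug in ld-linux has probably been fixed.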

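Even so, ldd cannot list every shared library a program depends on. The dlopen library function can load a shared library on demand at run time, and libraries loaded this way may not appear in ldd output. The NSS (Name Service Switch) libraries that ship with glibc are a typical example. One job of NSS is to tell applications where to find the system account database. Applications do not link against the NSS libraries directly; glibc loads them through dlopen as needed. If such a library goes missing, nothing warns you about a library dependency problem, but the program can no longer translate between user names and user IDs.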
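Since ldd will not show these libraries, one way to guess which NSS modules glibc may try to load is to read /etc/nsswitch.conf, where each service named on a database line corresponds to a library called libnss_<service>.so.2. The following is only a rough sketch: the function name nss_modules is made up for illustration, and the parsing assumes the usual "database: service [action] service" layout.

```shell
# Hypothetical helper: print the libnss_*.so.2 names implied by one
# database line of an nsswitch.conf-style file.
# Usage: nss_modules DATABASE [CONFIG_FILE]
nss_modules() {
    db="$1"
    conf="${2:-/etc/nsswitch.conf}"
    awk -v db="$db" '
        { sub(/#.*/, "") }                      # drop comments
        $0 ~ ("^[ \t]*" db ":") {
            line = $0
            sub(/^[^:]*:/, "", line)            # drop the "database:" prefix
            n = split(line, tok, /[ \t]+/)
            skip = 0
            for (i = 1; i <= n; i++) {
                t = tok[i]
                if (t == "") continue
                if (substr(t, 1, 1) == "[") skip = 1      # start of an action such as [NOTFOUND=return]
                if (!skip) print "libnss_" t ".so.2"
                if (substr(t, length(t), 1) == "]") skip = 0
            }
        }
    ' "$conf"
}
```

Calling nss_modules passwd prints one candidate library name per line. It only reports what the configuration asks for; whether those files actually exist on disk is a separate question.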
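Let's look at an example.
The whoami program prints your own user name, which is very useful in scripts that need to know which user is really running them. Sample output from whoami: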
[root@mgr04 opt]# whoami
root

Note: the demonstration system environment is as follows:

[root@mgr04 opt]# cat /etc/redhat-release 
CentOS Linux release 7.2.1511 (Core) 

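On this CentOS system, running ldd /usr/bin/whoami shows only libc.so.6 and the dynamic linker (plus the kernel-provided linux-vdso.so.1), yet whoami actually depends on more library files than these at run time.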

[root@mgr04 opt]# ldd /usr/bin/whoami
    linux-vdso.so.1 =>  (0x00007ffd1c111000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fe374d4d000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe375116000)

We can run strace -e trace=open whoami to see which files, including library files, whoami actually opens when it runs:
[root@mgr04 opt]# strace -e trace=open whoami
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
open("/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
root
+++ exited with 0 +++
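To pull just the shared objects out of a trace like the one above, the output can be filtered. This is a minimal sketch, not part of the original walkthrough: opened_libs is a hypothetical function name, and the pattern also accepts openat lines because newer glibc versions open files through openat rather than open, in which case the trace would need -e trace=open,openat.

```shell
# Hypothetical helper: read strace output on stdin and print every
# shared object (*.so or *.so.N) that was opened successfully.
opened_libs() {
    sed -n -E 's/^([0-9]+ +)?open(at)?\(([A-Z_]+, )?"([^"]*\.so(\.[0-9]+)*)", .*\) = [0-9]+$/\4/p' |
        LC_ALL=C sort -u
}
```

strace writes its trace to standard error, so a typical call would look like strace -e trace=open,openat whoami 2>&1 | opened_libs. Files such as /etc/ld.so.cache are deliberately not matched, because the name does not end in .so or .so.N.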

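Fault simulation:
Suppose that, for some reason, the NSS library responsible for translating between user names and user IDs was lost during a glibc upgrade. We can simulate this situation by renaming the NSS library: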

[root@mgr04 opt]# mv /lib64/libnss_files.so.2 /lib64/libnss_files.so.2.bak
[root@mgr04 opt]# 
[root@mgr04 opt]# whoami
whoami: cannot find name for user ID 0
[root@mgr04 opt]# 

As you can see, whoami now fails. Running ldd at this point does not provide any useful help:

[root@mgr04 opt]# ldd /usr/bin/whoami
    linux-vdso.so.1 =>  (0x00007ffd62bd4000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fdcf861f000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fdcf89e8000)
[root@mgr04 opt]# 

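All you see is that whoami depends on libc.so.6 and ld-linux-x86-64.so.2; ldd does not list the other library that whoami needs in order to run. Here is the output from tracing whoami with strace: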

[root@mgr04 opt]# strace -o whoami-strace.txt whoami
whoami: cannot find name for user ID 0
[root@mgr04 opt]# cat whoami-strace.txt 
.....................
.....................
socket(AF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
socket(AF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_LOCAL, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1717, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f306e2ae000
read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1717
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f306e2ae000, 4096)            = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=28446, ...}) = 0
mmap(NULL, 28446, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f306e2a8000
close(3)                                = 0
open("/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/tls/x86_64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/lib64/tls/x86_64", 0x7fff92c1f600) = -1 ENOENT (No such file or directory)
open("/lib64/tls/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/lib64/tls", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
open("/lib64/x86_64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/lib64/x86_64", 0x7fff92c1f600)   = -1 ENOENT (No such file or directory)
open("/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/lib64", {st_mode=S_IFDIR|0555, st_size=32768, ...}) = 0
open("/usr/lib64/tls/x86_64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls/x86_64", 0x7fff92c1f600) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/tls", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0
open("/usr/lib64/x86_64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/x86_64", 0x7fff92c1f600) = -1 ENOENT (No such file or directory)
open("/usr/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/usr/lib64", {st_mode=S_IFDIR|0555, st_size=32768, ...}) = 0
munmap(0x7f306e2a8000, 28446)           = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=28446, ...}) = 0
mmap(NULL, 28446, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f306e2a8000
close(3)                                = 0
open("/lib64/tls/libnss_sss.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/lib64/libnss_sss.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/tls/libnss_sss.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/usr/lib64/libnss_sss.so.2", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
munmap(0x7f306e2a8000, 28446)           = 0
open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2502, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f306e2ae000
read(3, "# Locale name alias data base.\n#"..., 4096) = 2502
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f306e2ae000, 4096)            = 0
open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, "whoami: cannot find name for use"..., 39) = 39
close(1)                                = 0
close(2)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++
[root@mgr04 opt]# 

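You can see the attempts to find libnss_files.so.2 in several different directories, all of which fail. Without a tool like strace, it would be hard to discover that this error is caused by a missing shared library. Now all that is needed is to find libnss_files.so.2 and put it back in the right place (in this demonstration, by renaming /lib64/libnss_files.so.2.bak back to /lib64/libnss_files.so.2).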
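Scanning a long trace by eye for the failed lookups gets tedious, so the search can be scripted. The sketch below is only one possible approach: missing_libs is a hypothetical name, and it reports a library only if every attempt to open a file with that name failed, since failed probes in some search directories are normal when a later directory succeeds.

```shell
# Hypothetical helper: read a strace log on stdin and print each shared
# library name for which no open/openat attempt succeeded.
missing_libs() {
    awk '
        /^([0-9]+ +)?open(at)?\(/ && /\.so[.0-9]*"/ {
            split($0, q, "\"")                  # q[2] is the quoted path
            n = split(q[2], parts, "/")
            name = parts[n]                     # basename, e.g. libnss_files.so.2
            seen[name] = 1
            if ($0 !~ /\) = -1 /) found[name] = 1
        }
        END { for (name in seen) if (!(name in found)) print name }
    ' | LC_ALL=C sort
}
```

With the log written earlier, it would be run as missing_libs < whoami-strace.txt. A library that shows up here is not always a fault: an optional NSS module that is simply not installed is also listed, so the result still needs to be read against /etc/nsswitch.conf.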

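If you already know what you are looking for, you can have strace trace only certain system calls. For example, to see which programs a configure script executes, the system call to watch is execve. The following command makes strace record only execve calls: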

strace -f -o configure-strace.txt -e execve ./configure
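The resulting file can still be long, because a configure script launches many helper programs. As a small sketch (executed_programs is a hypothetical name, and it assumes the "PID execve(...)" line layout that strace writes when -f and -o are used together), the distinct programs that were started successfully can be extracted like this:

```shell
# Hypothetical helper: read "strace -f -e execve" output on stdin and
# print each program path whose execve call returned 0, without duplicates.
executed_programs() {
    sed -n -E 's/^([0-9]+ +)?execve\("([^"]+)".* = 0$/\2/p' | LC_ALL=C sort -u
}
```

It would be run as executed_programs < configure-strace.txt. Failed execve attempts, which are common while the shell walks through the directories in PATH, are skipped.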

Original article:
https://www.linuxidc.com/Linux/2012-12/75671p3.htm