Application Debugging: Analyzing a Program Startup Failure

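Before an application can run, its dependencies have to be resolved.

The ldd command lists the shared libraries an application depends on.

For example, running it on a sample program prints the following: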

    ywgong@ubuntu:/home/nfsshare/program_test$ ldd a.out 
          linux-vdso.so.1 =>  (0x00007fff18ffe000)
          libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f56d57c7000)
          libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f56d55af000)
          libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56d51ef000)
          libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f56d4ef3000)
          /lib64/ld-linux-x86-64.so.2 (0x00007f56d5b30000)
    ywgong@ubuntu:/home/nfsshare/program_test$ 

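The output above lists the shared libraries that a.out depends on.

The first entry, linux-vdso.so.1, is not a real library on disk. It is a virtual shared object that the kernel maps into every process so that certain system calls can be made faster (shown here on x86_64).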
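When many binaries have to be checked, it can help to turn ldd output into data. The following is a minimal sketch, not a complete parser: the sample lines are copied from the output above, `libfoo.so` is a made-up entry added only to show how an unresolved library appears, and the function names `parse_ldd` and `missing` are chosen for this example.

```python
import re

# Lines copied from the ldd output above; "libfoo.so" is hypothetical.
SAMPLE = """\
        linux-vdso.so.1 =>  (0x00007fff18ffe000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f56d57c7000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f56d51ef000)
        libfoo.so => not found
        /lib64/ld-linux-x86-64.so.2 (0x00007f56d5b30000)
"""

def parse_ldd(text):
    """Return a list of (name, resolved_path) tuples, one per ldd line."""
    deps = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the trailing load address, e.g. "(0x00007f56d57c7000)".
        line = re.sub(r"\s*\(0x[0-9a-fA-F]+\)$", "", line)
        if "=>" in line:
            name, _, path = line.partition("=>")
            name = name.strip()
            path = path.strip() or None   # the vDSO has no path at all
        else:
            name = path = line            # absolute path, e.g. the loader
        deps.append((name, path))
    return deps

def missing(deps):
    """Names that ldd could not resolve."""
    return [name for name, path in deps if path == "not found"]

if __name__ == "__main__":
    for name, path in parse_ldd(SAMPLE):
        print(f"{name:32} {path}")
    print("missing:", missing(parse_ldd(SAMPLE)))
```

The vDSO entry comes back with no path, which matches the point above that it is not a file on disk.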

The strace command shows the system calls a program makes while it runs:

    ywgong@ubuntu:/home/nfsshare/program_test$ strace ./a.out
    execve("./a.out", ["./a.out"], [/* 25 vars */]) = 0
    brk(0)                                  = 0x129d000
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8858e4000
    access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=110610, ...}) = 0
    mmap(NULL, 110610, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7ff8858c8000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/usr/lib/x86_64-linux-gnu/libstdc++.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\361\6\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0644, st_size=1344600, ...}) = 0
    mmap(NULL, 3452160, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff885379000
    mprotect(0x7ff8854b7000, 2097152, PROT_NONE) = 0
    mmap(0x7ff8856b7000, 40960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13e000) = 0x7ff8856b7000
    mmap(0x7ff8856c1000, 11520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ff8856c1000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/x86_64-linux-gnu/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`+\0\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0644, st_size=96552, ...}) = 0
    mmap(NULL, 2192432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff885161000
    mprotect(0x7ff885178000, 2093056, PROT_NONE) = 0
    mmap(0x7ff885377000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7ff885377000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\200\30\2\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0755, st_size=1811128, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8858c7000
    mmap(NULL, 3925176, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff884da2000
    mprotect(0x7ff884f57000, 2093056, PROT_NONE) = 0
    mmap(0x7ff885156000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b4000) = 0x7ff885156000
    mmap(0x7ff88515c000, 17592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7ff88515c000
    close(3)                                = 0
    access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
    open("/lib/x86_64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0pU\0\0\0\0\0\0"..., 832) = 832
    fstat(3, {st_mode=S_IFREG|0644, st_size=1030512, ...}) = 0
    mmap(NULL, 3125544, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7ff884aa6000
    mprotect(0x7ff884ba1000, 2093056, PROT_NONE) = 0
    mmap(0x7ff884da0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xfa000) = 0x7ff884da0000
    close(3)                                = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8858c6000
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8858c4000
    arch_prctl(ARCH_SET_FS, 0x7ff8858c4740) = 0
    mprotect(0x7ff885156000, 16384, PROT_READ) = 0
    mprotect(0x7ff884da0000, 4096, PROT_READ) = 0
    mprotect(0x7ff885377000, 4096, PROT_READ) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8858c3000
    mprotect(0x7ff8856b7000, 32768, PROT_READ) = 0
    mprotect(0x601000, 4096, PROT_READ)     = 0
    mprotect(0x7ff8858e6000, 4096, PROT_READ) = 0
    munmap(0x7ff8858c8000, 110610)          = 0
    brk(0)                                  = 0x129d000
    brk(0x12cf000)                          = 0x12cf000
    fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 3), ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ff8858e3000
    write(1, " key : key1key2\n", 16 key : key1key2
    )       = 16
    write(1, "long size is 8  unsigned long si"..., 91long size is 8  unsigned long size is 8  long long size is 8 unsigned long long size is 8 
    ) = 91
    write(1, "Test -lL void ptr is 0xfffffffff"..., 41Test -lL void ptr is 0xffffffffffffffff 
    ) = 41
    exit_group(0)                           = ?

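The trace contains no open() for the first entry, the vDSO, because it is not a file.

It contains no open() for the last entry either. That entry, /lib64/ld-linux-x86-64.so.2, is the dynamic loader, whose job is to load the other libraries. The kernel maps the loader during execve, before any code in the new process runs, so strace never sees it being opened. The same loader is also mapped in process 1, the ancestor of all user processes:

    ywgong@ubuntu:/home/nfsshare/program_test$ sudo cat /proc/1/maps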
    [sudo] password for ywgong: 
    7f73b119b000-7f73b11a7000 r-xp 00000000 08:01 2363116                    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f73b11a7000-7f73b13a6000 ---p 0000c000 08:01 2363116                    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f73b13a6000-7f73b13a7000 r--p 0000b000 08:01 2363116                    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f73b13a7000-7f73b13a8000 rw-p 0000c000 08:01 2363116                    /lib/x86_64-linux-gnu/libnss_files-2.15.so
    7f73b13a8000-7f73b13b2000 r-xp 00000000 08:01 2363171                    /lib/x86_64-linux-gnu/libnss_nis-2.15.so
    7f73b13b2000-7f73b15b2000 ---p 0000a000 08:01 2363171                    /lib/x86_64-linux-gnu/libnss_nis-2.15.so
    7f73b15b2000-7f73b15b3000 r--p 0000a000 08:01 2363171                    /lib/x86_64-linux-gnu/libnss_nis-2.15.so
    7f73b15b3000-7f73b15b4000 rw-p 0000b000 08:01 2363171                    /lib/x86_64-linux-gnu/libnss_nis-2.15.so
    7f73b15b4000-7f73b15cb000 r-xp 00000000 08:01 2363114                    /lib/x86_64-linux-gnu/libnsl-2.15.so
    7f73b15cb000-7f73b17ca000 ---p 00017000 08:01 2363114                    /lib/x86_64-linux-gnu/libnsl-2.15.so
    7f73b17ca000-7f73b17cb000 r--p 00016000 08:01 2363114                    /lib/x86_64-linux-gnu/libnsl-2.15.so
    7f73b17cb000-7f73b17cc000 rw-p 00017000 08:01 2363114                    /lib/x86_64-linux-gnu/libnsl-2.15.so
    7f73b17cc000-7f73b17ce000 rw-p 00000000 00:00 0 
    7f73b17ce000-7f73b17d6000 r-xp 00000000 08:01 2363146                    /lib/x86_64-linux-gnu/libnss_compat-2.15.so
    7f73b17d6000-7f73b19d5000 ---p 00008000 08:01 2363146                    /lib/x86_64-linux-gnu/libnss_compat-2.15.so
    7f73b19d5000-7f73b19d6000 r--p 00007000 08:01 2363146                    /lib/x86_64-linux-gnu/libnss_compat-2.15.so
    7f73b19d6000-7f73b19d7000 rw-p 00008000 08:01 2363146                    /lib/x86_64-linux-gnu/libnss_compat-2.15.so
    7f73b19d7000-7f73b1b8c000 r-xp 00000000 08:01 2363123                    /lib/x86_64-linux-gnu/libc-2.15.so
    7f73b1b8c000-7f73b1d8b000 ---p 001b5000 08:01 2363123                    /lib/x86_64-linux-gnu/libc-2.15.so
    7f73b1d8b000-7f73b1d8f000 r--p 001b4000 08:01 2363123                    /lib/x86_64-linux-gnu/libc-2.15.so
    7f73b1d8f000-7f73b1d91000 rw-p 001b8000 08:01 2363123                    /lib/x86_64-linux-gnu/libc-2.15.so
    7f73b1d91000-7f73b1d96000 rw-p 00000000 00:00 0 
    7f73b1d96000-7f73b1d9d000 r-xp 00000000 08:01 2363103                    /lib/x86_64-linux-gnu/librt-2.15.so
    7f73b1d9d000-7f73b1f9c000 ---p 00007000 08:01 2363103                    /lib/x86_64-linux-gnu/librt-2.15.so
    7f73b1f9c000-7f73b1f9d000 r--p 00006000 08:01 2363103                    /lib/x86_64-linux-gnu/librt-2.15.so
    7f73b1f9d000-7f73b1f9e000 rw-p 00007000 08:01 2363103                    /lib/x86_64-linux-gnu/librt-2.15.so
    7f73b1f9e000-7f73b1fb6000 r-xp 00000000 08:01 2363099                    /lib/x86_64-linux-gnu/libpthread-2.15.so
    7f73b1fb6000-7f73b21b5000 ---p 00018000 08:01 2363099                    /lib/x86_64-linux-gnu/libpthread-2.15.so
    7f73b21b5000-7f73b21b6000 r--p 00017000 08:01 2363099                    /lib/x86_64-linux-gnu/libpthread-2.15.so
    7f73b21b6000-7f73b21b7000 rw-p 00018000 08:01 2363099                    /lib/x86_64-linux-gnu/libpthread-2.15.so
    7f73b21b7000-7f73b21bb000 rw-p 00000000 00:00 0 
    7f73b21bb000-7f73b21ff000 r-xp 00000000 08:01 5254720                    /usr/local/lib/libdbus-1.so.3.8.3
    7f73b21ff000-7f73b23fe000 ---p 00044000 08:01 5254720                    /usr/local/lib/libdbus-1.so.3.8.3
    7f73b23fe000-7f73b23ff000 r--p 00043000 08:01 5254720                    /usr/local/lib/libdbus-1.so.3.8.3
    7f73b23ff000-7f73b2400000 rw-p 00044000 08:01 5254720                    /usr/local/lib/libdbus-1.so.3.8.3
    7f73b2400000-7f73b2408000 r-xp 00000000 08:01 2363133                    /lib/x86_64-linux-gnu/libnih-dbus.so.1.0.0
    7f73b2408000-7f73b2608000 ---p 00008000 08:01 2363133                    /lib/x86_64-linux-gnu/libnih-dbus.so.1.0.0
    7f73b2608000-7f73b2609000 r--p 00008000 08:01 2363133                    /lib/x86_64-linux-gnu/libnih-dbus.so.1.0.0
    7f73b2609000-7f73b260a000 rw-p 00009000 08:01 2363133                    /lib/x86_64-linux-gnu/libnih-dbus.so.1.0.0
    7f73b260a000-7f73b2620000 r-xp 00000000 08:01 2363135                    /lib/x86_64-linux-gnu/libnih.so.1.0.0
    7f73b2620000-7f73b2820000 ---p 00016000 08:01 2363135                    /lib/x86_64-linux-gnu/libnih.so.1.0.0
    7f73b2820000-7f73b2821000 r--p 00016000 08:01 2363135                    /lib/x86_64-linux-gnu/libnih.so.1.0.0
    7f73b2821000-7f73b2822000 rw-p 00017000 08:01 2363135                    /lib/x86_64-linux-gnu/libnih.so.1.0.0
    7f73b2822000-7f73b2844000 r-xp 00000000 08:01 2363091                    /lib/x86_64-linux-gnu/ld-2.15.so
    7f73b2a21000-7f73b2a26000 rw-p 00000000 00:00 0 
    7f73b2a42000-7f73b2a44000 rw-p 00000000 00:00 0 
    7f73b2a44000-7f73b2a45000 r--p 00022000 08:01 2363091                    /lib/x86_64-linux-gnu/ld-2.15.so
    7f73b2a45000-7f73b2a47000 rw-p 00023000 08:01 2363091                    /lib/x86_64-linux-gnu/ld-2.15.so
    7f73b2a47000-7f73b2a6d000 r-xp 00000000 08:01 6291517                    /sbin/init
    7f73b2c6d000-7f73b2c6f000 r--p 00026000 08:01 6291517                    /sbin/init
    7f73b2c6f000-7f73b2c70000 rw-p 00028000 08:01 6291517                    /sbin/init
    7f73b32ab000-7f73b33b2000 rw-p 00000000 00:00 0                          [heap]
    7fff7044e000-7fff7046f000 rw-p 00000000 00:00 0                          [stack]
    7fff70573000-7fff70575000 r-xp 00000000 00:00 0                          [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

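This does not mean that later processes inherit init's mapping. Each execve builds a fresh address space, and the kernel maps the loader again for every dynamically linked program; only the underlying file pages are shared, through the page cache.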
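The kernel knows which loader to map because the executable names it in its PT_INTERP program header. The sketch below reads that header directly from an ELF file. It is a minimal illustration that assumes a well-formed file and does no bounds checking; `read_interp` is a name chosen for this example.

```python
import sys

PT_INTERP = 3  # program header type that holds the loader path

def _uint(data, off, size, order):
    """Read an unsigned integer of `size` bytes at `off`."""
    return int.from_bytes(data[off:off + size], order)

def read_interp(data):
    """Return the PT_INTERP path of an ELF image, or None if it has none."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    order = "little" if data[5] == 1 else "big"  # EI_DATA
    if is64:
        e_phoff = _uint(data, 0x20, 8, order)
        e_phentsize = _uint(data, 0x36, 2, order)
        e_phnum = _uint(data, 0x38, 2, order)
    else:
        e_phoff = _uint(data, 0x1C, 4, order)
        e_phentsize = _uint(data, 0x2A, 2, order)
        e_phnum = _uint(data, 0x2C, 2, order)
    for i in range(e_phnum):
        ph = e_phoff + i * e_phentsize
        if _uint(data, ph, 4, order) != PT_INTERP:
            continue
        if is64:
            p_offset = _uint(data, ph + 8, 8, order)
            p_filesz = _uint(data, ph + 32, 8, order)
        else:
            p_offset = _uint(data, ph + 4, 4, order)
            p_filesz = _uint(data, ph + 16, 4, order)
        raw = data[p_offset:p_offset + p_filesz]
        return raw.rstrip(b"\0").decode("ascii")
    return None  # statically linked, or a shared library without PT_INTERP

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(read_interp(f.read()))
```

Run against a dynamically linked executable, it should print the same path that ldd shows as the last entry. `readelf -l` reports the same information in its "Requesting program interpreter" line.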

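How does the loader get into init in the first place? Presumably the same way: the kernel maps it when it executes /sbin/init. Let's look at which libraries init itself needs:

    ywgong@ubuntu:/home/nfsshare/program_test$ ldd /sbin/init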
        linux-vdso.so.1 =>  (0x00007fffec0f6000)
        libnih.so.1 => /lib/x86_64-linux-gnu/libnih.so.1 (0x00007f1280d11000)
        libnih-dbus.so.1 => /lib/x86_64-linux-gnu/libnih-dbus.so.1 (0x00007f1280b07000)
        libdbus-1.so.3 => /usr/local/lib/libdbus-1.so.3 (0x00007f12808c1000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f12806a4000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f128049c000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f12800dc000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f1281170000)

These entries line up closely with the mappings seen in /proc/1/maps.
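To make that comparison less of an eyeball exercise, the file-backed entries can be pulled out of a maps listing. This is a small sketch; the sample lines are copied from the /proc/1/maps output above, and `mapped_files` is a name chosen for this example.

```python
# A few lines copied from the /proc/1/maps listing above.
MAPS = """\
7f73b19d7000-7f73b1b8c000 r-xp 00000000 08:01 2363123                    /lib/x86_64-linux-gnu/libc-2.15.so
7f73b1d8f000-7f73b1d91000 rw-p 001b8000 08:01 2363123                    /lib/x86_64-linux-gnu/libc-2.15.so
7f73b1d91000-7f73b1d96000 rw-p 00000000 00:00 0 
7f73b2822000-7f73b2844000 r-xp 00000000 08:01 2363091                    /lib/x86_64-linux-gnu/ld-2.15.so
7f73b2a47000-7f73b2a6d000 r-xp 00000000 08:01 6291517                    /sbin/init
7fff7044e000-7fff7046f000 rw-p 00000000 00:00 0                          [stack]
"""

def mapped_files(maps_text):
    """Return the distinct file paths mapped by a process, in first-seen order."""
    files = []
    for line in maps_text.splitlines():
        # Fields: address range, perms, offset, device, inode, optional path.
        fields = line.split(None, 5)
        if len(fields) != 6:
            continue                      # anonymous mapping, no path
        path = fields[5].strip()
        if path.startswith("/") and path not in files:
            files.append(path)            # skips [heap], [stack], [vdso], ...
    return files

if __name__ == "__main__":
    for path in mapped_files(MAPS):
        print(path)
```

Note that maps shows the real file names (libc-2.15.so, ld-2.15.so), while ldd shows the symlink names the program asked for (libc.so.6, ld-linux-x86-64.so.2), so the two lists match by library rather than character for character.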

Similar information is available on the embedded device:

 /mnt/usr/lib/ # cat /proc/1/maps 
  00008000-000da000 r-xp 00000000 1f:02 42         /bin/busybox
  000e2000-000e3000 rw-p 000d2000 1f:02 42         /bin/busybox
  000e3000-000e5000 rw-p 00000000 00:00 0 
  005fd000-0061e000 rw-p 00000000 00:00 0          [heap]
  b6d6f000-b6e93000 r-xp 00000000 1f:02 248        /lib/libc-2.16.so
  b6e93000-b6e9b000 ---p 00124000 1f:02 248        /lib/libc-2.16.so
  b6e9b000-b6e9d000 r--p 00124000 1f:02 248        /lib/libc-2.16.so
  b6e9d000-b6e9e000 rw-p 00126000 1f:02 248        /lib/libc-2.16.so
  b6e9e000-b6ea1000 rw-p 00000000 00:00 0 
  b6ea1000-b6f09000 r-xp 00000000 1f:02 258        /lib/libm-2.16.so
  b6f09000-b6f10000 ---p 00068000 1f:02 258        /lib/libm-2.16.so
  b6f10000-b6f11000 r--p 00067000 1f:02 258        /lib/libm-2.16.so
  b6f11000-b6f12000 rw-p 00068000 1f:02 258        /lib/libm-2.16.so
  b6f12000-b6f31000 r-xp 00000000 1f:02 241        /lib/ld-2.16.so
  b6f35000-b6f38000 rw-p 00000000 00:00 0 
  b6f38000-b6f39000 r--p 0001e000 1f:02 241        /lib/ld-2.16.so
  b6f39000-b6f3a000 rw-p 0001f000 1f:02 241        /lib/ld-2.16.so
  be9f1000-bea12000 rw-p 00000000 00:00 0          [stack]
  ffff0000-ffff1000 r-xp 00000000 00:00 0          [vectors]

To see exactly which libraries get loaded, run ld-linux.so directly, since the device has no ldd command:

/lib # ./ld-linux.so.3 --list /bin/busybox 
        libm.so.6 => /lib/a7_softfp_neon-vfpv4/libm.so.6 (0xb6e91000)
        libc.so.6 => /lib/a7_softfp_neon-vfpv4/libc.so.6 (0xb6d5f000)
        /lib/ld-linux.so.3 => ./ld-linux.so.3 (0xb6f04000)

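All the other libraries have names starting with lib, whereas ld-linux.so can be executed directly. So what happens if it is asked to inspect itself?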

/lib # ./ld-linux.so.3 --list ./ld-linux.so.3

loader cannot load itself

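In other words, the loader refuses to load itself.

As an aside, reading the maps file of a kernel thread returns nothing, because kernel threads have no user-space address space.

As the loader, ld locates the shared libraries, loads them, and determines the memory addresses they end up at.

Now for the load error seen on the device: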

/mnt/usr/bin/xxx: error while loading shared libraries: ld-linux.so: cannot open shared object file: No such file or directory

Tracing the program's system calls with strace shows:

mmap2(0xb69d9000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x124) = 0xb69d9000
  mmap2(0xb69dc000, 9632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb69dc000
  close(3)                                = 0
  open("/mnt/usr/lib/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  open("/usr/lib/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  open("/lib/a7_softfp_neon-vfpv4/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/neon/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/neon/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/neon/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/neon", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/v7l", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/neon/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/neon/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/neon/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/neon", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/tls/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/tls", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/v7l/neon/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/v7l/neon/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/v7l/neon/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/v7l/neon", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/v7l/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/v7l/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/v7l/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/v7l", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/neon/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/neon/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/neon/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/neon", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/vfp/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4/vfp", 0xbea762f8) = -1 ENOENT (No such file or directory)
  open("/usr/lib/a7_softfp_neon-vfpv4/ld-linux.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
  stat64("/usr/lib/a7_softfp_neon-vfpv4", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
  writev(2, [{"/mnt/usr/bin/xxx", 23}, {": ", 2}, {"error while loading shared libra"..., 36}, {": ", 2}, {"ld-linux.so", 11}, {": ", 2}, {"cannot open shared object file", 30}, {": ", 2}, {"No such file or directory", 25}, {"\n", 1}], 10/mnt/usr/bin/xxx: error while loading shared libraries: ld-linux.so: cannot open shared object file: No such file or directory
  ) = 134
  exit_group(127)                         = ?

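The last thing the failing process tries to load is a shared library named ld-linux.so. In a normal run, by comparison, no such library is requested: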

 /lib # ./ld-linux.so.3 --list /mnt/usr/bin/xxx 
        librt.so.1 => /lib/a7_softfp_neon-vfpv4/librt.so.1 (0xb6f03000)
        libz.so.1 => /mnt/usr/lib/libz.so.1 (0xb6ee3000)
        libuuid.so.1 => /mnt/usr/lib/libuuid.so.1 (0xb6ed7000)
        libzmq.so.5 => /mnt/usr/lib/libzmq.so.5 (0xb6e67000)
        libprotobuf.so.7 => /mnt/usr/lib/libprotobuf.so.7 (0xb6da0000)
        libdiagnosis.so => /mnt/usr/lib/libdiagnosis.so (0xb6d96000)
        libsqlite3.so => /mnt/usr/lib/libsqlite3.so (0xb6c62000)
        libdl.so.2 => /lib/a7_softfp_neon-vfpv4/libdl.so.2 (0xb6c57000)
        libpthread.so.0 => /lib/a7_softfp_neon-vfpv4/libpthread.so.0 (0xb6c37000)
        libasound.so.2 => /mnt/usr/lib/libasound.so.2 (0xb6b17000)
        libstdc++.so.6 => /lib/a7_softfp_neon-vfpv4/libstdc++.so.6 (0xb6a4c000)
        libm.so.6 => /lib/a7_softfp_neon-vfpv4/libm.so.6 (0xb69db000)
        libgcc_s.so.1 => /lib/a7_softfp_neon-vfpv4/libgcc_s.so.1 (0xb69b6000)
        libc.so.6 => /lib/a7_softfp_neon-vfpv4/libc.so.6 (0xb6883000)
        /lib/ld-linux.so.3 => ./ld-linux.so.3 (0xb6f14000)

The strace output of a normal run contains no ld-linux.so either.
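The long run of failed open() calls in the failing trace is not random. Under /usr/lib/a7_softfp_neon-vfpv4 the loader tries every combination of the subdirectories tls, v7l, neon and vfp, from the most specific to the least specific. The sketch below reproduces that order. It mirrors only the pattern visible in the trace and is not taken from the glibc sources; `hwcap_subdirs` and `candidates` are names invented for this example.

```python
import posixpath

def hwcap_subdirs(tags):
    """All combinations of `tags`, most specific first, keeping tag order."""
    n = len(tags)
    subdirs = []
    # Count a bitmask down from "all tags present" to "none".
    for mask in range(2 ** n - 1, -1, -1):
        parts = [t for i, t in enumerate(tags) if mask & (1 << (n - 1 - i))]
        subdirs.append("/".join(parts))
    return subdirs

def candidates(base, tags, name):
    """Paths tried for `name` under one search directory."""
    return [posixpath.join(base, sub, name) for sub in hwcap_subdirs(tags)]

if __name__ == "__main__":
    for path in candidates("/usr/lib/a7_softfp_neon-vfpv4",
                           ["tls", "v7l", "neon", "vfp"],
                           "ld-linux.so"):
        print(path)
```

Four tags give sixteen candidates per search directory, which is why a single missing library produces so many ENOENT lines before the loader gives up.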

The error above occurs along these process chains:

    init--www--xxx
        --telnetd--xx

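Testing shows that every program that needs dynamic loading hits this problem, while scripts run fine. Scripts are also run as child processes.

So the problem still appears to be related to how ld loads shared libraries.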

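Some further notes:

How control (the PC) moves during startup: the shell launches the program -- system call into the kernel -- the operating system loads the program -- the dynamic loader loads the shared libraries and runs initialization code, notably the constructors of global C++ objects -- control passes to main.

PIC (position-independent code), the GOT (global offset table), and address relocation together complete the loading.

Shared library code is shared between processes, while data is private to each process. Each kind of segment is handled accordingly.

Checking init's memory map again confirms that the ld loader is mapped there.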

/lib # ./ld-linux.so.3 
Usage: ld.so [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
You have invoked `ld.so', the helper program for shared library executables.
This program usually lives in the file `/lib/ld.so', and special directives
in executable files using ELF shared libraries tell the system's program
loader to load the helper program from this file.  This helper program loads
the shared libraries needed by the program executable, prepares the program
to run, and runs it.  You may invoke this helper program directly from the
command line to load and run an ELF executable file; this is like executing
that file itself, but always uses this helper program from the file you
specified, instead of the helper program file specified in the executable
file you run.  This is mostly of use for maintainers to test new versions
of this helper program; chances are you did not intend to run this program.

  --list                list all dependencies and how they are resolved
  --verify              verify that given object really is a dynamically linked
                        object we can handle
  --inhibit-cache       Do not use /etc/ld.so.cache
  --library-path PATH   use given PATH instead of content of the environment
                        variable LD_LIBRARY_PATH
  --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object names
                        in LIST
  --audit LIST          use objects named in LIST as auditors

Running ld-linux.so.3 directly works, which suggests that the loader itself is fine.

/lib # ./ld-linux.so.3 --list /mnt/usr/bin/xxx

/mnt/usr/bin/xxx: error while loading shared libraries: ld-linux.so: cannot open shared object file: No such file or directory

/lib #

The problem reproduces. Can gdb tell us more?

/ywg/gdbref/usr/bin # ./gdb  /mnt/usr/bin/xxx 
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-none-linux-gnueabi"...
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /mnt/usr/bin/xxx]
(gdb) r
Starting program: /mnt/usr/bin/xxx 
Error while reading shared library symbols:
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /lib/ld-linux.so.3]
Error while reading shared library symbols:
Dwarf Error: wrong version in compilation unit header (is 4, should be 2) [in module /lib/ld-linux.so.3]
/mnt/usr/bin/xxx: error while loading shared libraries: ld-linux.so: cannot open shared object file: No such file or directory

Program exited with code 0177.
(gdb) 

The run fails, and gdb offers nothing particularly helpful. Next, test with the simplest possible program.

/lib # ./ld-linux.so.3 --list /ywg/program_test/a.out 
        libstdc++.so.6 => /lib/a7_softfp_neon-vfpv4/libstdc++.so.6 (0xb6ead000)
        libm.so.6 => /lib/a7_softfp_neon-vfpv4/libm.so.6 (0xb6e3c000)
        libgcc_s.so.1 => /lib/a7_softfp_neon-vfpv4/libgcc_s.so.1 (0xb6e16000)
        libc.so.6 => /lib/a7_softfp_neon-vfpv4/libc.so.6 (0xb6ce4000)
        /lib/ld-linux.so.3 => ./ld-linux.so.3 (0xb6f7a000)
/lib # 

So some programs do load. Could the failure depend on which libraries a program needs?

/lib # ./ld-linux.so.3 --list /ywg/program_test/gpsinit

/ywg/program_test/gpsinit: error while loading shared libraries: ld-linux.so: cannot open shared object file: No such file or directory

Which shared libraries do the failing programs depend on? Pick the simplest program that fails and analyze it:

root@ubuntu:/home/nfsshare/program_test# arm-hisiv400-linux-readelf -d gpsinit 
Dynamic section at offset 0x3101c contains 33 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [librt.so.1]
 0x00000001 (NEEDED)                     Shared library: [libz.so.1]
 0x00000001 (NEEDED)                     Shared library: [libuuid.so.1]
 0x00000001 (NEEDED)                     Shared library: [libdl.so.2]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [libasound.so.2]
 0x00000001 (NEEDED)                     Shared library: [libstdc++.so.6]
 0x00000001 (NEEDED)                     Shared library: [libm.so.6]
 0x00000001 (NEEDED)                     Shared library: [libgcc_s.so.1]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000c (INIT)                       0xa7b0
 0x0000000d (FINI)                       0x33ff4
 0x00000019 (INIT_ARRAY)                 0x41000
 0x0000001b (INIT_ARRAYSZ)               20 (bytes)
 0x0000001a (FINI_ARRAY)                 0x41014
 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
 0x00000004 (HASH)                       0x8168
 0x00000005 (STRTAB)                     0x9088
 0x00000006 (SYMTAB)                     0x8618
 0x0000000a (STRSZ)                      4169 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000015 (DEBUG)                      0x0
 0x00000003 (PLTGOT)                     0x4114c
 0x00000002 (PLTRELSZ)                   1088 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0xa370
 0x00000011 (REL)                        0xa300
 0x00000012 (RELSZ)                      112 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffe (VERNEED)                    0xa220
 0x6fffffff (VERNEEDNUM)                 5
 0x6ffffff0 (VERSYM)                     0xa0d2

Start from the shared library dependencies and compare several programs.

So far the suspects are librt and libz.
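The comparison can be sketched as a set difference: take the NEEDED entries of a failing program and remove every library that some working program is known to load. The data below is copied from the readelf output for gpsinit and from the two successful `--list` runs shown earlier (a.out and busybox); the function names are chosen for this example.

```python
import re

NEEDED = re.compile(r"\(NEEDED\)\s+Shared library: \[([^\]]+)\]")

# NEEDED lines copied from the readelf -d output for gpsinit above.
GPSINIT = """\
 0x00000001 (NEEDED)                     Shared library: [librt.so.1]
 0x00000001 (NEEDED)                     Shared library: [libz.so.1]
 0x00000001 (NEEDED)                     Shared library: [libuuid.so.1]
 0x00000001 (NEEDED)                     Shared library: [libdl.so.2]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [libasound.so.2]
 0x00000001 (NEEDED)                     Shared library: [libstdc++.so.6]
 0x00000001 (NEEDED)                     Shared library: [libm.so.6]
 0x00000001 (NEEDED)                     Shared library: [libgcc_s.so.1]
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000c (INIT)                       0xa7b0
"""

# Libraries resolved by programs that start correctly (a.out and busybox).
KNOWN_GOOD = {"libstdc++.so.6", "libm.so.6", "libgcc_s.so.1", "libc.so.6"}

def needed(readelf_text):
    """Library names from the NEEDED entries of `readelf -d` output."""
    return NEEDED.findall(readelf_text)

def suspects(failing_needed, known_good):
    """Libraries the failing program needs that no working program loads."""
    return [lib for lib in failing_needed if lib not in known_good]

if __name__ == "__main__":
    print(suspects(needed(GPSINIT), KNOWN_GOOD))
```

With only two working programs the list is still long. Every additional working program that loads one of these libraries removes it from the list, which is how the candidates narrow down to a few names.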

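From here the work splits into the following steps:

Step 1: the build of the new serialcmd program needs adjusting. Build a separate test version and make sure it runs.

Testing showed that this command does not actually need librt; rebuilt without it, the program runs. This confirms that the problem lies with librt and that libz is fine.

Fix 1: change the build configuration for this program, rebuild it, and ship the updated version.

Step 2: find out why librt fails to load.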

/lib # ./ld-linux.so.3 --list /lib/librt-2.16.so

/lib/librt-2.16.so: error while loading shared libraries: ld-linux.so: cannot open shared object file: No such file or directory

librt looks like the source of the problem.

root@ubuntu:/home/nfsshare/program_test# arm-hisiv400-linux-readelf -d /home/nfsshare/librt-2.16.so 
Dynamic section at offset 0x5ee8 contains 31 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [ld-linux.so]
 0x0000000e (SONAME)                     Library soname: [librt.so.1]
 0x0000000c (INIT)                       0x1438
 0x0000000d (FINI)                       0x52a4
 0x00000019 (INIT_ARRAY)                 0xded4
 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
 0x0000001a (FINI_ARRAY)                 0xded8
 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
 0x00000004 (HASH)                       0x56e8
 0x6ffffef5 (GNU_HASH)                   0x174
 0x00000005 (STRTAB)                     0xaa0
 0x00000006 (SYMTAB)                     0x390
 0x0000000a (STRSZ)                      1424 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000003 (PLTGOT)                     0xe000
 0x00000002 (PLTRELSZ)                   512 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x1238
 0x00000011 (REL)                        0x11d0
 0x00000012 (RELSZ)                      104 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffc (VERDEF)                     0x1114
 0x6ffffffd (VERDEFNUM)                  3
 0x6ffffffb (FLAGS_1)                    Flags: NODELETE
 0x6ffffffe (VERNEED)                    0x1170
 0x6fffffff (VERNEEDNUM)                 2
 0x6ffffff0 (VERSYM)                     0x1030
 0x6ffffffa (RELCOUNT)                   6
 0x00000000 (NULL)                       0x0

librt's own NEEDED entries include ld-linux.so, so the failure happens while resolving librt's dependencies.
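This explains why the failing programs never mention ld-linux.so themselves: the name is pulled in one level down. A breadth-first walk over the NEEDED graph makes such chains visible. The sketch below uses a hand-built graph copied from the two readelf outputs above; the NEEDED lists of the other libraries are not shown in this article and are simply left out, and the set of available names is an assumption based on the successful `--list` runs.

```python
from collections import deque

# NEEDED entries copied from the readelf output above (other libraries omitted).
NEEDED_GRAPH = {
    "gpsinit": ["librt.so.1", "libz.so.1", "libuuid.so.1", "libdl.so.2",
                "libpthread.so.0", "libasound.so.2", "libstdc++.so.6",
                "libm.so.6", "libgcc_s.so.1", "libc.so.6"],
    "librt.so.1": ["libc.so.6", "libpthread.so.0", "ld-linux.so"],
}

# Names assumed to exist on the device; note ld-linux.so.3, not ld-linux.so.
AVAILABLE = set(NEEDED_GRAPH["gpsinit"]) | {"ld-linux.so.3"}

def missing_chains(root, graph, available):
    """Return every dependency chain from `root` that ends in a missing name."""
    chains = []
    seen = {root}
    queue = deque([[root]])
    while queue:
        chain = queue.popleft()
        for dep in graph.get(chain[-1], []):
            if dep not in available:
                chains.append(chain + [dep])
            elif dep not in seen:
                seen.add(dep)
                queue.append(chain + [dep])
    return chains

if __name__ == "__main__":
    for chain in missing_chains("gpsinit", NEEDED_GRAPH, AVAILABLE):
        print(" -> ".join(chain))
```

For this data the only chain found is gpsinit, librt.so.1, ld-linux.so, which matches what the loader reported.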

0x00000001 (NEEDED)                     Shared library: [libdl.so.2]

Is a symbolic link missing? Create one:

/lib # ln -s ld-2.16.so ld-linux.so

Run it again:

/lib # 
/lib # ./ld-linux.so.3 --list /lib/librt-2.16.so 
        libc.so.6 => /lib/a7_softfp_neon-vfpv4/libc.so.6 (0xb6de3000)
        libpthread.so.0 => /lib/a7_softfp_neon-vfpv4/libpthread.so.0 (0xb6dc4000)
        ld-linux.so => /lib/a7_softfp_neon-vfpv4/ld-linux.so (0xb6d9b000)
        /lib/a7_softfp_neon-vfpv4/ld-linux.so.3 => ./ld-linux.so.3 (0xb6f26000)

The dependencies now resolve.

/lib # /mnt/usr/bin/xxx
I[2021-07-07 17:36:24][ApplicationHisi][Application::Application(),73]Config upload-width: 352, upload-height: 288, upload-bps: 256kb, upload-fps: 15
I[2021-07-07 17:36:24][ApplicationHisi][Application::Application(),79]Config display-width: 1280, display-height: 720, display-bps: 1024kb, display-fps: 25
I[2021-07-07 17:36:24][ApplicationHisi][virtual bool Application::Initialize(),106]Initialize
I[2021-07-07 17:36:24][ApplicationHisi][virtual bool Application::Initialize(),112]Version: 0.0.0.23397
I[2021-07-07 17:36:24][FFMpeg][static bool FFMpeg:
..........

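The program now starts.

So why could the library not be found, and how was it found earlier, when loading worked?

Look again at how the cross toolchain's readelf parses librt: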

root@ubuntu:/home/nfsshare/program_test# arm-hisiv400-linux-readelf -d /home/nfsshare/librt-2.16.so 

Dynamic section at offset 0x5ee8 contains 31 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [ld-linux.so]
 0x0000000e (SONAME)                     Library soname: [librt.so.1]
 0x0000000c (INIT)                       0x1438
 0x0000000d (FINI)                       0x52a4
 0x00000019 (INIT_ARRAY)                 0xded4
 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
 0x0000001a (FINI_ARRAY)                 0xded8
 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
 0x00000004 (HASH)                       0x56e8
 0x6ffffef5 (GNU_HASH)                   0x174
 0x00000005 (STRTAB)                     0xaa0
 0x00000006 (SYMTAB)                     0x390
 0x0000000a (STRSZ)                      1424 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000003 (PLTGOT)                     0xe000
 0x00000002 (PLTRELSZ)                   512 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x1238
 0x00000011 (REL)                        0x11d0
 0x00000012 (RELSZ)                      104 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffc (VERDEF)                     0x1114
 0x6ffffffd (VERDEFNUM)                  3
 0x6ffffffb (FLAGS_1)                    Flags: NODELETE
 0x6ffffffe (VERNEED)                    0x1170
 0x6fffffff (VERNEEDNUM)                 2
 0x6ffffff0 (VERSYM)                     0x1030
 0x6ffffffa (RELCOUNT)                   6
 0x00000000 (NULL)                       0x0

Compare this with the dependencies of a known-good librt:

root@ubuntu:/home/nfsshare/program_test# arm-hisiv400-linux-readelf -d /home/nfsshare/librt-2.16.so-okdev 
Dynamic section at offset 0x5ee8 contains 31 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x00000001 (NEEDED)                     Shared library: [libpthread.so.0]
 0x00000001 (NEEDED)                     Shared library: [ld-linux.so.3]
 0x0000000e (SONAME)                     Library soname: [librt.so.1]
 0x0000000c (INIT)                       0x1438
 0x0000000d (FINI)                       0x52a4
 0x00000019 (INIT_ARRAY)                 0xded4
 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
 0x0000001a (FINI_ARRAY)                 0xded8
 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
 0x00000004 (HASH)                       0x56e8
 0x6ffffef5 (GNU_HASH)                   0x174
 0x00000005 (STRTAB)                     0xaa0
 0x00000006 (SYMTAB)                     0x390
 0x0000000a (STRSZ)                      1424 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000003 (PLTGOT)                     0xe000
 0x00000002 (PLTRELSZ)                   512 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x1238
 0x00000011 (REL)                        0x11d0
 0x00000012 (RELSZ)                      104 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffc (VERDEF)                     0x1114
 0x6ffffffd (VERDEFNUM)                  3
 0x6ffffffb (FLAGS_1)                    Flags: NODELETE
 0x6ffffffe (VERNEED)                    0x1170
 0x6fffffff (VERNEEDNUM)                 2
 0x6ffffff0 (VERSYM)                     0x1030
 0x6ffffffa (RELCOUNT)                   6
 0x00000000 (NULL)                       0x0
root@ubuntu:/home/nfsshare/program_test# 

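The difference is clear: on the working device librt needs ld-linux.so.3, while on the failing device it needs ld-linux.so.

Where does this difference come from?

A binary comparison shows that the two files differ, although they are the same size.

Looking closer with Beyond Compare shows that exactly one byte has changed.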
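Without a graphical diff tool, the same check takes a few lines of code. The sketch below compares two equally sized byte strings and reports each differing offset together with the number of bits that changed. The sample data is a made-up stand-in for the relevant part of the string table, not the real file contents, and `diff_bytes` is a name chosen for this example.

```python
def diff_bytes(good, bad):
    """Return (offset, good_byte, bad_byte, flipped_bits) for each difference."""
    if len(good) != len(bad):
        raise ValueError("files differ in size")
    diffs = []
    for offset, (g, b) in enumerate(zip(good, bad)):
        if g != b:
            flipped = bin(g ^ b).count("1")   # number of bits that changed
            diffs.append((offset, g, b, flipped))
    return diffs

if __name__ == "__main__":
    # Stand-in data: the good name, and the same bytes with '.' zeroed.
    good = b"ld-linux.so.3\0librt.so.1\0"
    bad = good[:11] + b"\0" + good[12:]
    for offset, g, b, flipped in diff_bytes(good, bad):
        print(f"offset {offset:#x}: {g:#04x} -> {b:#04x} ({flipped} bits)")
```

For real files, read both with `open(path, "rb").read()` and pass the results in. Counting the changed bits helps separate a single-bit flip from a whole byte being overwritten.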

Next, use hexdump (for example hexdump -n 5096 ../librt-2.16.so-okdev) on each file to look at the hexadecimal data near the differing address:

0001000 3300 6c00 6269 7472 732e 2e6f 0031 4c47

0001000 332e 6c00 6269 7472 732e 2e6f 0031 4c47

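The byte 2e in the good file (second line) has become 00 in the faulty file (first line). hexdump prints 16-bit little-endian words by default, so 332e stands for the bytes 2e 33, that is, ".3". In other words, the final '.' of ld-linux.so.3 has turned into a NUL byte, which ends the string early and changes the library name to ld-linux.so.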
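The byte order is easy to get wrong when reading such dumps by eye, so here is a short sketch that converts the two hexdump lines back into bytes and reads the name the way the loader would, up to the first NUL. It assumes hexdump's default output of 16-bit little-endian words, and it assumes that the 11 bytes just before offset 0x1000 hold "ld-linux.so", which follows from the 13-character name ending in ".3" at 0x1000 but is not shown in the dump.

```python
def words_to_bytes(line):
    """Convert one default-format hexdump line to (offset, raw bytes)."""
    offset, *words = line.split()
    raw = b"".join(int(w, 16).to_bytes(2, "little") for w in words)
    return int(offset, 16), raw

def c_string(buf, start=0):
    """Read a NUL-terminated ASCII string starting at `start`."""
    end = buf.index(b"\0", start)
    return buf[start:end].decode("ascii")

BAD_LINE = "0001000 3300 6c00 6269 7472 732e 2e6f 0031 4c47"
GOOD_LINE = "0001000 332e 6c00 6269 7472 732e 2e6f 0031 4c47"

# Assumed: the bytes immediately before 0x1000, not visible in the dump.
PREFIX = b"ld-linux.so"

if __name__ == "__main__":
    for label, line in (("good", GOOD_LINE), ("bad", BAD_LINE)):
        offset, raw = words_to_bytes(line)
        print(label, hex(offset), raw, "->", c_string(PREFIX + raw))
```

The good bytes decode to ".3", a NUL, "librt.so.1", a NUL, and the start of "GL...", so the name reads as ld-linux.so.3. In the bad bytes the leading '.' is a NUL, so the name stops at ld-linux.so and the "3" is left orphaned.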

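What caused this?

Perhaps yaffs, or perhaps memory? It feels more like the storage, because the data is already wrong when it is read back.

After the next power cycle, export the file again. That should show whether the problem is related to yaffs.

If the file then reads correctly, the copy on flash is presumably intact; otherwise it would not read correctly after a reboot.

And if it does read correctly, would that mean something went wrong when the file system was mounted this time?

If this were a storage or memory error, note that it has now happened twice, at the same location, on different devices. For a random fault that seems like too much of a coincidence.

Also, only a single byte is affected, which is hard to explain convincingly as either a hardware or a software fault.

The whole device is flashed with the same yaffs image, so could that explain a fault at a fixed location?

It may still be coincidence. Most likely the flash was disturbed, in which case more bad bits will appear later and cause further startup failures, not necessarily at the location of the ld library name. The library itself is never written, so the corruption should not come from the operating system's own write logic; interference, the flash read/write interface, or the device's internal characteristics are more likely causes.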
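If further bad bits can appear anywhere, comparing every exported library against a known-good digest is a cheap way to notice them. The sketch below is one possible approach, assuming the files are exported to a host that has Python (for example over the NFS share used above) and that a manifest of good SHA-256 digests was recorded from a healthy device; `sha256_of` and `changed_files` are names chosen for this example.

```python
import hashlib
import sys

def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()

def changed_files(manifest):
    """Paths whose current digest differs from the recorded one.

    `manifest` maps a file path to its known-good hex digest.
    """
    return [path for path, good in manifest.items() if sha256_of(path) != good]

if __name__ == "__main__" and len(sys.argv) > 1:
    # Print digests for the given files, e.g. to build a manifest.
    for path in sys.argv[1:]:
        print(sha256_of(path), path)
```

Running the same check after each power cycle would also help answer the earlier question of whether a file reads back differently from one boot to the next.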
