解读系统(Kernel)奔溃(Oops)

我们通常在遇到系统的奔溃时会做什么？不去管具体的原因而直接重启系统？还是尝试着去发现些什么？可能作为一个普通的用户来说，确实不太需要关心太多，重启或者就可以了。不过一些企业中，或者这种情况发生在你的一个提供基础服务的服务器上时，如果不去关心这些错误原因，可能会给以后埋下一颗定时炸弹，因为你不知道什么时候它会再次发生奔溃。
有时我们是需要一些Geek精神的，例如我们看到我们系统的串口里有这样的一段输出代码时该怎么办。

[  257.264914] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001
[  257.266396] IP: crasher_write+0x77/0xb0 [crasher]
[  257.267382] PGD 0 
[  257.267383] 
[  257.267771] Oops: 0002 [#1] SMP
[  257.268071] Modules linked in: crasher(OE) binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel tpm_tis joydev tpm_tis_core ppdev virtio_
balloon qxl ttm drm_kms_helper snd_hda_codec_generic drm tpm snd_hda_intel snd_hda_codec snd_hda_core snd_pcsp snd_hwdep snd_pcm snd_timer parport_pc
 snd parport i2c_piix4 soundcore virtio_net virtio_blk virtio_console virtio_rng crc32c_intel serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg
[  257.275238] CPU: 0 PID: 1933 Comm: bash Tainted: G           OE   4.11.9-300.fc26.x86_64 #1
[  257.276360] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
[  257.277783] task: ffff9ced39202580 task.stack: ffffaa02c0634000
[  257.278787] RIP: 0010:crasher_write+0x77/0xb0 [crasher]
[  257.279601] RSP: 0018:ffffaa02c0637e08 EFLAGS: 00010246
[  257.280457] RAX: 0000000000000032 RBX: 0000000000000002 RCX: 0000000000000000
[  257.281647] RDX: 0000000000000001 RSI: 000055fb4a6edda1 RDI: ffffaa02c0637e0c
[  257.282873] RBP: ffffaa02c0637e20 R08: ffffaa02c0638000 R09: ffff9ced38c754e4
[  257.284253] R10: 0000000000000001 R11: ffff9ced39202580 R12: fffffffffffffffb
[  257.285370] R13: ffffaa02c0637f18 R14: 000055fb4a6edda0 R15: ffff9cecf5753200
[  257.286329] FS:  00007eff6cda5b40(0000) GS:ffff9ced3fc00000(0000) knlGS:0000000000000000
[  257.287397] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  257.288163] CR2: 0000000000000001 CR3: 000000007be80000 CR4: 00000000001406f0
[  257.289419] Call Trace:
[  257.289812]  proc_reg_write+0x42/0x70
[  257.290400]  __vfs_write+0x37/0x160
[  257.291121]  ? selinux_file_permission+0xfb/0x120
[  257.292028]  ? security_file_permission+0x3b/0xc0
[  257.292817]  vfs_write+0xb1/0x1a0
[  257.293373]  SyS_write+0x55/0xc0
[  257.293943]  entry_SYSCALL_64_fastpath+0x1a/0xa9
[  257.294822] RIP: 0033:0x7eff6c4871f0
[  257.295511] RSP: 002b:00007ffdfc380208 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  257.296914] RAX: ffffffffffffffda RBX: 000000000000001b RCX: 00007eff6c4871f0
[  257.298474] RDX: 0000000000000002 RSI: 000055fb4a6edda0 RDI: 0000000000000001
[  257.299815] RBP: 000055fb48e68c10 R08: 000000000000000a R09: 00007eff6cda5b40
[  257.301592] R10: 000055fb4a7167c0 R11: 0000000000000246 R12: 0000000000000065
[  257.303198] R13: 000055fb48e68700 R14: 0000000000000024 R15: 0000000000000013
[  257.304313] Code: 48 8d 7d ec e8 db 8c 39 e0 48 8d 7d ec e8 d2 8c 39 e0 48 89 d8 48 8b 4d f0 65 48 33 0c 25 28 00 00 00 75 3a 48 83 c4 10 5b 5d c3  04 25 01 00 00 00 41 eb dd 48 c7 c0 f2 ff ff ff eb d7 3c 30 
[  257.307443] RIP: crasher_write+0x77/0xb0 [crasher] RSP: ffffaa02c0637e08
[  257.308717] CR2: 0000000000000001

第一眼看到后确实会头大，不知所措，然后等着系统重启，如果系统Hang住了，就只有冷启动了。不过，让我们尝试着去读懂下这个输出，这样或许在下次遇到这样的问题或许会有解决办法，又或者可以避免这样的事情再次发生。

[  257.264914] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001

表明发生panic的原因是因为空指针操作发生在地址 0000000000000001。

[  257.266396] IP: crasher_write+0x77/0xb0 [crasher]

IP表示指令的指针，表示发生错误的地址。

[  257.267771] Oops: 0002 [#1] SMP

这是个用16进制表示的错误代码，它会有不同的代码来表示不同的错误：

bit 0 == 0 means no page found, 1 means a protection fault
bit 1 == 0 means read, 1 means write
bit 2 == 0 means kernel, 1 means user-mode
[#1] — this value is the number of times the Oops occurred. Multiple Oops can be triggered as a cascading effect of the first one.

[  257.268071] Modules linked in: crasher(OE) binfmt_misc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel tpm_tis joydev tpm_tis_core ppdev virtio_
balloon qxl ttm drm_kms_helper snd_hda_codec_generic drm tpm snd_hda_intel snd_hda_codec snd_hda_core snd_pcsp snd_hwdep snd_pcm snd_timer parport_pc
 snd parport i2c_piix4 soundcore virtio_net virtio_blk virtio_console virtio_rng crc32c_intel serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg

系统所加载的模块。

[  257.275238] CPU: 0 PID: 1933 Comm: bash Tainted: G           OE   4.11.9-300.fc26.x86_64 #1
[  257.276360] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014

发生panic的CPU，process，kernel version，hardware等信息。在内核文件/kernel/panic.c中有对Bash Tainted的一些描述：

static const struct tnt tnts[] = {
    { TAINT_PROPRIETARY_MODULE, 'P', 'G' },
    { TAINT_FORCED_MODULE,      'F', ' ' },
    { TAINT_UNSAFE_SMP,     'S', ' ' },
    { TAINT_FORCED_RMMOD,       'R', ' ' },
    { TAINT_MACHINE_CHECK,      'M', ' ' },
    { TAINT_BAD_PAGE,       'B', ' ' },
    { TAINT_USER,           'U', ' ' },
    { TAINT_DIE,            'D', ' ' },
    { TAINT_OVERRIDDEN_ACPI_TABLE,  'A', ' ' },
    { TAINT_WARN,           'W', ' ' },
    { TAINT_CRAP,           'C', ' ' },
    { TAINT_FIRMWARE_WORKAROUND,    'I', ' ' },
    { TAINT_OOT_MODULE,     'O', ' ' },
};

详细的信息可以参考这个内核文件，在这里 P,G表示专有模块已经被加载。

[  257.277783] task: ffff9ced39202580 task.stack: ffffaa02c0634000

发生panic时当前进程task的信息。

[  257.278787] RIP: 0010:crasher_write+0x77/0xb0 [crasher]

RIP：被执行的指令（代码），在这里被翻译成了函数名字+偏移值。

[  257.279601] RSP: 0018:ffffaa02c0637e08 EFLAGS: 00010246
[  257.280457] RAX: 0000000000000032 RBX: 0000000000000002 RCX: 0000000000000000
[  257.281647] RDX: 0000000000000001 RSI: 000055fb4a6edda1 RDI: ffffaa02c0637e0c
[  257.282873] RBP: ffffaa02c0637e20 R08: ffffaa02c0638000 R09: ffff9ced38c754e4
[  257.284253] R10: 0000000000000001 R11: ffff9ced39202580 R12: fffffffffffffffb
[  257.285370] R13: ffffaa02c0637f18 R14: 000055fb4a6edda0 R15: ffff9cecf5753200
[  257.286329] FS:  00007eff6cda5b40(0000) GS:ffff9ced3fc00000(0000) knlGS:0000000000000000
[  257.287397] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  257.288163] CR2: 0000000000000001 CR3: 000000007be80000 CR4: 00000000001406f0

转储CPU寄存器的内容，这些寄存器的地址可以作为错误分析时的入口地址。

[  257.289419] Call Trace:
[  257.289812]  proc_reg_write+0x42/0x70
[  257.290400]  __vfs_write+0x37/0x160
[  257.291121]  ? selinux_file_permission+0xfb/0x120
[  257.292028]  ? security_file_permission+0x3b/0xc0
[  257.292817]  vfs_write+0xb1/0x1a0
[  257.293373]  SyS_write+0x55/0xc0
[  257.293943]  entry_SYSCALL_64_fastpath+0x1a/0xa9

Call Trace，系统发生panic之前的函数调用。可以从大致的分析出系统发生错误的原因。“？”表示一个位于堆栈上的指针，但不适合'frame'，是一个会消失的指令。

[  257.304313] Code: 48 8d 7d ec e8 db 8c 39 e0 48 8d 7d ec e8 d2 8c 39 e0 48 89 d8 48 8b 4d f0 65 48 33 0c 25 28 00 00 00 75 3a 48 83 c4 10 5b 5d c3  04 25 01 00 00 00 41 eb dd 48 c7 c0 f2 ff ff ff eb d7 3c 30

Code：发生panic时正在运行的机器的代码部分十六进制转储。可以用‘./scripts/decodecode’去反汇编它。

# ./decodecode < code.code 
[ 257.304313] Code: 48 8d 7d ec e8 db 8c 39 e0 48 8d 7d ec e8 d2 8c 39 e0 48 89 d8 48 8b 4d f0 65 48 33 0c 25 28 00 00 00 75 3a 48 83 c4 10 5b 5d c3  04 25 01 00 00 00 41 eb dd 48 c7 c0 f2 ff ff ff eb d7 3c 30
All code
========
   0:   48 8d 7d ec             lea    -0x14(%rbp),%rdi
   4:   e8 db 8c 39 e0          callq  0xffffffffe0398ce4
   9:   48 8d 7d ec             lea    -0x14(%rbp),%rdi
   d:   e8 d2 8c 39 e0          callq  0xffffffffe0398ce4
  12:   48 89 d8                mov    %rbx,%rax
  15:   48 8b 4d f0             mov    -0x10(%rbp),%rcx
  19:   65 48 33 0c 25 28 00    xor    %gs:0x28,%rcx
  20:   00 00 
  22:   75 3a                   jne    0x5e
  24:   48 83 c4 10             add    $0x10,%rsp
  28:   5b                      pop    %rbx
  29:   5d                      pop    %rbp
  2a:   c3                      retq   
  2b:*  c6 04 25 01 00 00 00    movb   $0x41,0x1     <-- trapping instruction
  32:   41 
  33:   eb dd                   jmp    0x12
  35:   48 c7 c0 f2 ff ff ff    mov    $0xfffffffffffffff2,%rax
  3c:   eb d7                   jmp    0x15
  3e:   3c 30                   cmp    $0x30,%al

Code starting with the faulting instruction
===========================================
   0:   c6 04 25 01 00 00 00    movb   $0x41,0x1
   7:   41 
   8:   eb dd                   jmp    0xffffffffffffffe7
   a:   48 c7 c0 f2 ff ff ff    mov    $0xfffffffffffffff2,%rax
  11:   eb d7                   jmp    0xffffffffffffffea
  13:   3c 30                   cmp    $0x30,%al

[  257.307443] RIP: crasher_write+0x77/0xb0 [crasher] RSP: ffffaa02c0637e08
[  257.308717] CR2: 0000000000000001

再一次的打印出错误发生的代码。

OK，到此一个简单的Call Trace代码的大概意思就明白了，系统发生panic是由于使用了未知的空指针代码，发生在自注册的模块代码中。

其实，大多时候我们自己的电脑上只能看到黑屏在系统发生了崩溃之后，这样我们要如何去分析错误了。这样就需要用到一个专门的技术"KDUMP"，它可以把发生崩溃时系统的内存保存起来，这样就可以在系统重启后使用"Crash"（类似于gdb）这个工具去分析。我们在之后的文章中再做介绍。

解读系统(Kernel)奔溃(Oops)

你可能感兴趣的:(解读系统(Kernel)奔溃(Oops))