5: System-config-kdump
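[root@Derek-Laptop derek]# yum install -y system-config-kdump

Once it is installed, you can find it under System -> Administration -> Kdump. Other distributions should have a corresponding GUI front-end as well, so I won't say more about them here.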
6: Kdump odds and ends
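do_blacklist()
{
    local modName=$1

    if echo "$modName" | grep -q "\/" ; then
        local dirName="/lib/modules/$kernel/$modName"
        find $dirName -xtype f -exec basename {} \; | sed "s/^\(.*\).ko/blacklist \1/g" >> $MNTIMAGE/etc/blacklist-kdump.conf
    else
        echo "blacklist $modName" >> $MNTIMAGE/etc/blacklist-kdump.conf
    fi
}

This function is what implements the blacklist feature. If the argument contains a "/", it is treated as a directory under /lib/modules/$kernel and every module file found there is blacklisted; otherwise the argument is taken as a single module name. The rough purpose of the feature is this: if you do not want certain kernel modules to be loaded in the crash kernel, add a line like the following to /etc/kdump.conf:

blacklist iwl3945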
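The sed expression inside do_blacklist is what turns module file names into blacklist lines. Here is a minimal sketch of just that step. The helper name to_blacklist_lines and the second module name are made up for illustration, and unlike the original expression this one escapes the dot and anchors the suffix, which is my own tightening rather than what the script does.

```shell
# to_blacklist_lines (hypothetical helper): read module file names on
# stdin, one per line, and print a "blacklist <name>" line for each one,
# dropping the trailing .ko suffix.
to_blacklist_lines() {
    sed 's/^\(.*\)\.ko$/blacklist \1/'
}

# Example: two module files as "find ... -exec basename" would list them.
# example_mod.ko is a placeholder name.
printf '%s\n' iwl3945.ko example_mod.ko | to_blacklist_lines
```

The output has the same form as the line you would otherwise write into /etc/kdump.conf by hand.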
default)
    DEFAULT_ACTION=$config_val
    case $DEFAULT_ACTION in
        reboot|shell)
            FINAL_ACTION="reboot -f"
            ;;
        halt)
            FINAL_ACTION="halt -f"
            ;;
        poweroff)
            FINAL_ACTION="poweroff -f"
            ;;
    esac
    ;;
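To see at a glance what this branch does with each value of the default option, the mapping can be pulled into a small function. This is only a sketch: final_action is a hypothetical name, and the error branch for unknown values is my own addition, not something the fragment above contains.

```shell
# final_action (hypothetical helper): print the FINAL_ACTION string that
# the case branch above would pick for a given "default" value.
final_action() {
    case $1 in
        reboot|shell) echo "reboot -f" ;;
        halt)         echo "halt -f" ;;
        poweroff)     echo "poweroff -f" ;;
        *)            echo "unknown default action: $1" >&2; return 1 ;;
    esac
}

# Print the mapping for the four values the fragment handles.
for action in reboot shell halt poweroff; do
    printf '%-8s -> %s\n' "$action" "$(final_action "$action")"
done
```

Note that in the fragment, shell and reboot end up with the same FINAL_ACTION.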
7: Groundwork for Crash
echo 1 > /proc/sys/kernel/panic_on_oops
echo 1 > /proc/sys/kernel/unknown_nmi_panic

Note that if you enable this feature, you cannot enable NMI_WATCHDOG at the same time. Otherwise the system will panic!
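Before flipping these switches it can be useful to see what they are currently set to. The sketch below only reads the values and changes nothing. show_knob is a hypothetical helper, and its optional second argument (a directory to read from instead of /proc/sys/kernel) exists only so the function can be tried out against ordinary files.

```shell
# show_knob NAME [DIR] (hypothetical helper): print NAME=value, or
# NAME=unavailable when the file does not exist or cannot be read.
show_knob() {
    knob_dir=${2:-/proc/sys/kernel}
    if [ -r "$knob_dir/$1" ]; then
        printf '%s=%s\n' "$1" "$(cat "$knob_dir/$1")"
    else
        printf '%s=unavailable\n' "$1"
    fi
}

# Read-only look at the two settings used above.
show_knob panic_on_oops
show_knob unknown_nmi_panic
```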
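Be warned: once you do this, your system is dead! So if you have already saved your data, are not worried about a possible SELinux relabel while the machine comes back to life, and have a real spirit of adventure and learning, then go ahead and press Enter!

[root@Derek-Laptop derek]# echo c > /proc/sysrq-trigger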
First, /boot/grub/grub.conf, whose kernel line carries crashkernel=256M:

[root@Derek-Laptop derek]# cat /boot/grub/grub.conf
default=0
timeout=3
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora (2.6.34.6-54.fc13.i686.PAE)
        root (hd0,0)
        kernel /vmlinuz-2.6.34.6-54.fc13.i686.PAE ro root=/dev/mapper/vg_dereklaptop-lv_root rd_LVM_LV=vg_dereklaptop/lv_root rd_LVM_LV=vg_dereklaptop/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rhgb quiet rdblacklist=nouveau crashkernel=256M
        initrd /initramfs-2.6.34.6-54.fc13.i686.PAE.img

Then /etc/kdump.conf:

[root@Derek-Laptop derek]# tail /etc/kdump.conf
#kdump_post /var/crash/scripts/kdump-post.sh
#extra_bins /usr/bin/lftp
#disk_timeout 30
#extra_modules gfs2
#default shell

ext4 UUID=2c560b75-fc2b-4346-a669-6403e954498a
path /var/kdump
core_collector makedumpfile -c --message-level 1 -d 31
default shell

Before manually forcing a Kdump, check the Kdump service:

[root@Derek-Laptop derek]# service kdump status
Kdump is operational

If yours is not operational, I suggest you do not press Enter. Otherwise your Fedora really will be dead!
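One part of this pre-flight check is easy to script: confirming that the running kernel was actually booted with a crashkernel= parameter. The sketch below tests a command-line string for it. has_crashkernel is a hypothetical helper, and reading /proc/cmdline for the running kernel's parameters is my choice here, not something the listings above do.

```shell
# has_crashkernel CMDLINE (hypothetical helper): succeed if the given
# kernel command line contains a crashkernel= parameter.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *)                 return 1 ;;
    esac
}

# Check the kernel that is running right now (empty string if
# /proc/cmdline is not available on this system).
if has_crashkernel "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "crashkernel= is set; memory should be reserved for the crash kernel"
else
    echo "no crashkernel= on the command line; do not trigger a crash yet"
fi
```

This complements, rather than replaces, the service kdump status check.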
[root@Derek-Laptop derek]# ll /var/kdump/127.0.0.1-2010-09-18-00\:07\:09/vmcore
-rw------- 1 root root 21494214 9月 18 08:07 /var/kdump/127.0.0.1-2010-09-18-00:07:09/vmcore

The vmcore was saved successfully. Keep this file safe: you will be counting on it later to work out what caused the crash!
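The dump directory name packs a host address and a timestamp together. Judging only from the directory names in this post, the layout looks like host-YYYY-MM-DD-HH:MM:SS; that layout is an assumption on my part. Under it, the pieces can be split apart with plain parameter expansion. split_dump_dir is a hypothetical helper.

```shell
# split_dump_dir NAME (hypothetical helper): split a dump directory name
# of the assumed form host-YYYY-MM-DD-HH:MM:SS into its parts, peeling
# fields off the right so that a host containing "-" still works.
split_dump_dir() {
    rest=$1
    dump_time=${rest##*-};  rest=${rest%-*}
    dump_day=${rest##*-};   rest=${rest%-*}
    dump_month=${rest##*-}; rest=${rest%-*}
    dump_year=${rest##*-};  dump_host=${rest%-*}
    printf 'host=%s date=%s-%s-%s time=%s\n' \
        "$dump_host" "$dump_year" "$dump_month" "$dump_day" "$dump_time"
}

split_dump_dir "127.0.0.1-2010-09-18-00:07:09"
# prints: host=127.0.0.1 date=2010-09-18 time=00:07:09
```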
8: First steps with Crash
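[root@Derek-Laptop derek]# du -sh /var/kdump/127.0.0.1-2010-10-04-16\:14\:09/vmcore
45M     /var/kdump/127.0.0.1-2010-10-04-16:14:09/vmcore

This is my vmcore: only 45M in total, taken at dump level 31. My machine has 2GB of RAM, so the size after compression is pretty impressive! Still, adjust the level to suit your own needs.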
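The level is the number passed to makedumpfile with -d in the core_collector line. As I read the makedumpfile man page, it is a bitmask in which each bit excludes one kind of page from the dump, so 31 excludes all five kinds. The sketch below decodes a level into the page types it leaves out. decode_dump_level is a hypothetical helper, and the labels are my paraphrase of the man page wording.

```shell
# decode_dump_level LEVEL (hypothetical helper): list the page types that
# a makedumpfile dump level excludes, one per line.
# Assumed bit meanings: 1 zero, 2 cache, 4 cache private, 8 user data, 16 free.
decode_dump_level() {
    level=$1
    for entry in "1:zero pages" "2:cache pages" "4:cache private pages" \
                 "8:user data pages" "16:free pages"; do
        bit=${entry%%:*}
        if [ $((level & bit)) -ne 0 ]; then
            echo "${entry#*:}"
        fi
    done
}

echo "level 31 excludes:"
decode_dump_level 31
```

A lower level keeps more pages in the vmcore, at the cost of a larger file.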
Crash can be run in two ways. One is on the live system, for example:

$ crash
$ crash /usr/tmp/vmlinux
$ crash /boot/System.map vmlinux.dbg
$ crash -S vmlinux.dbg
$ crash vmlinux vmlinux.dbg

The other is to run it on a dumpfile, for example:

$ crash vmlinux vmcore
$ crash /boot/System.map vmlinux.dbg vmcore
$ crash -S vmlinux.dbg vmcore
$ crash vmlinux vmlinux.dbg vmcore

Here -S means "use /boot/System.map as the mapfile". The full usage is:

Usage:
  crash [-h [opt]][-v][-s][-i file][-d num] [-S] [mapfile] [namelist] [dumpfile]

We are running on a dumpfile here, so one package needs to be installed:

[root@Derek-Laptop derek]# yum install kernel-debuginfo

Note that a vmcore produced by Kdump can only be analyzed with Crash if the corresponding kernel-debuginfo is present, so be sure to install the debuginfo that matches the kernel!
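On this Fedora system the debuginfo vmlinux lives under /usr/lib/debug/lib/modules/<release>/vmlinux, so the namelist argument for crash can be built from a kernel release string. A small sketch follows; debug_vmlinux_path is a hypothetical helper. Keep in mind that the release you need is that of the kernel that crashed, which is not necessarily the one you are running when you do the analysis.

```shell
# debug_vmlinux_path RELEASE (hypothetical helper): print where the
# kernel-debuginfo vmlinux for that kernel release is expected to be.
debug_vmlinux_path() {
    printf '/usr/lib/debug/lib/modules/%s/vmlinux\n' "$1"
}

# Path for the currently running kernel, and whether it is installed.
vmlinux=$(debug_vmlinux_path "$(uname -r)")
if [ -f "$vmlinux" ]; then
    echo "found $vmlinux"
else
    echo "missing $vmlinux (is the matching kernel-debuginfo installed?)"
fi
```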
Crash uses the matching vmlinux to analyze the saved vmcore, and prints some information as it starts ^_^

[root@Derek-Laptop derek]# crash /var/kdump/127.0.0.1-2010-10-04-16\:14\:09/vmcore /usr/lib/debug/lib/modules/2.6.34.7-56.fc13.i686.PAE/vmlinux
crash 5.0.7
Copyright (C) 2002-2010 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.0
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
      KERNEL: /usr/lib/debug/lib/modules/2.6.34.7-56.fc13.i686.PAE/vmlinux
    DUMPFILE: /var/kdump/127.0.0.1-2010-10-04-16:14:09/vmcore [PARTIAL DUMP]
        CPUS: 2
        DATE: Mon Oct 4 16:13:36 2010
      UPTIME: 00:32:28
LOAD AVERAGE: 0.94, 0.65, 0.41
       TASKS: 328
    NODENAME: Derek-Laptop
     RELEASE: 2.6.34.7-56.fc13.i686.PAE
     VERSION: #1 SMP Wed Sep 15 03:27:15 UTC 2010
     MACHINE: i686 (1662 Mhz)
      MEMORY: 2 GB
       PANIC: "Oops: 0002 [#1] SMP " (check log for details)
         PID: 3347
     COMMAND: "bash"
        TASK: efcb4c80 [THREAD_INFO: efcca000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)
crash>
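If you save this startup summary to a text file, individual fields such as RELEASE or PANIC are easy to pull out with awk. The sketch below assumes the "NAME: value" layout shown in the summary. crash_field is a hypothetical helper, and the sample input is copied from the output in this post.

```shell
# crash_field NAME (hypothetical helper): read a crash startup summary on
# stdin and print the value of the first field called NAME.
crash_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop the alignment padding
            if (index(line, key ": ") == 1) {   # the line must start with "NAME: "
                print substr(line, length(key) + 3)
                exit
            }
        }'
}

# A few summary lines, as they appear in the output from this post.
summary='        CPUS: 2
     RELEASE: 2.6.34.7-56.fc13.i686.PAE
         PID: 3347
         CPU: 0'

printf '%s\n' "$summary" | crash_field RELEASE
```

Matching on the whole "NAME: " prefix keeps CPU from accidentally matching the CPUS line.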
Besides that, you can also look at the call stack at the moment of the crash, which is pretty low-level stuff ^_^

crash> bt
PID: 3347   TASK: efcb4c80   CPU: 0   COMMAND: "bash"
 #0 [efccbe38] crash_kexec at c046f511
 #1 [efccbe90] bad_area at c04280b1
 #2 [efccbea8] do_page_fault at c07a5e61
 #3 [efccbed4] error_code (via page_fault) at c07a3a65
    EAX: 00000063 EBX: 00000063 ECX: c0aa4e08 EDX: 00000000 EBP: efccbf14
    DS: 007b ESI: c09c171c ES: 007b EDI: 00000003 GS: 00e0
    CS: 0060 EIP: c0637138 ERR: ffffffff EFLAGS: 00210046
 #4 [efccbf08] sysrq_handle_crash at c0637138
 #5 [efccbf18] __handle_sysrq at c06374f7
 #6 [efccbf40] write_sysrq_trigger at c06375af
 #7 [efccbf50] proc_reg_write at c05117a1
 #8 [efccbf74] vfs_write at c04da415
 #9 [efccbf90] sys_write at c04da50f
#10 [efccbfb0] ia32_sysenter_target at c0408c98
    EAX: 00000004 EBX: 00000001 ECX: b7835000 EDX: 00000002
    DS: 007b ESI: 00000002 ES: 007b EDI: b7835000
    SS: 007b ESP: bfc9b1bc EBP: bfc9b1f4 GS: 0033
    CS: 0073 EIP: 00898424 ERR: 00000004 EFLAGS: 00200246

The trace runs from sys_write in bash down to sysrq_handle_crash, which matches the command that was used to force the crash:

[root@Derek-Laptop derek]# echo c > /proc/sysrq-trigger
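When a backtrace is long, it helps to reduce it to just the chain of function names. The sketch below does that for saved bt output, assuming the "#N [address] function at address" frame layout seen in this post. bt_functions is a hypothetical helper, and the sample lines are taken from the trace above.

```shell
# bt_functions (hypothetical helper): read crash "bt" output on stdin and
# print the function name of every frame, innermost first. Register dump
# lines and the PID header are skipped because they do not start with "#N".
bt_functions() {
    awk '$1 ~ /^#[0-9]+$/ { print $3 }'
}

# A few lines copied from the backtrace in this post.
sample='PID: 3347 TASK: efcb4c80 CPU: 0 COMMAND: "bash"
 #3 [efccbed4] error_code (via page_fault) at c07a3a65
    EAX: 00000063 EBX: 00000063 ECX: c0aa4e08 EDX: 00000000 EBP: efccbf14
 #4 [efccbf08] sysrq_handle_crash at c0637138
#10 [efccbfb0] ia32_sysenter_target at c0408c98'

printf '%s\n' "$sample" | bt_functions
```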