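The operating system handles interrupts and exceptions internally. One class of exception is the undefined instruction exception: when the CPU meets an instruction it does not recognize, it enters the undefined-instruction exception handler. Because the CPU cannot decode the instruction, it cannot continue executing that code, and this usually ends in a kernel panic. Before going down, the operating system prints some debug information for engineers to analyze.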
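In the log below, the kernel invokes the user-space modprobe program during boot, and modprobe hits an illegal instruction. The kernel finally panics when init (PID 1) dies on the same instruction.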
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 4.9.84-g8d35179-dirty ([email protected]) (gcc version 6.2.0 20161103 ZTE Embsys-TSP Vtest (GCC) ) #115 SMP Mon Jul 30 09:16:32 CST 2018
[ 0.000000] Boot CPU: AArch64 Processor [410fd034]
[ 0.000000] doing early options, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 0.000000] doing early options: console='ttyAMA1,115200'
[ 0.000000] earlycon: earlycon buf=ttyAMA1,115200
[ 0.000000] doing early options: earlycon='pl011,0xA0804000'
[ 0.000000] earlycon: earlycon buf=pl011,0xA0804000
[ 0.000000] earlycon: earlycon->name=pl011
[ 0.000000] [yuchen]: register_earlycon 135:
[ 0.000000] [yuchen]: register_earlycon 147: mapbase=00000000a0804000,membase=ffffffbefe7ff000
[ 0.000000] [yuchen]: register_earlycon 151: mapbase=00000000a0804000,membase=ffffffbefe7ff000
[ 0.000000] earlycon: pl11 at MMIO 0x00000000a0804000 (options '')
[ 0.000000] [yuchen]: register_earlycon 154:
[ 0.000000] [yuchen]: pl011_early_console_setup in: device->port.membase=ffffffbefe7ff000!
[ 0.000000] [yuchen]: pl011_early_console_setup out!
[ 0.000000] [yuchen]: register_earlycon 158:
[ 0.000000] bootconsole [pl11] enabled
[ 0.000000] doing early options: root='/dev/ram0'
[ 0.000000] doing early options: rw='(null)'
[ 0.000000] doing early options: verbose='(null)'
[ 0.000000] doing early options: debug='(null)'
[ 0.000000] doing early options: swiotlb='1024'
[ 0.000000] On node 0 totalpages: 65536
[ 0.000000] DMA zone: 896 pages used for memmap
[ 0.000000] DMA zone: 0 pages reserved
[ 0.000000] DMA zone: 65536 pages, LIFO batch:15
[ 0.000000] percpu: Embedded 20 pages/cpu @ffffffc00ffb5000 s42136 r8192 d31592 u81920
[ 0.000000] pcpu-alloc: s42136 r8192 d31592 u81920 alloc=20*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2
[ 0.000000] Detected VIPT I-cache on CPU0
[ 0.000000] CPU features: enabling workaround for ARM erratum 845719
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 64640
[ 0.000000] Kernel command line: console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024
[ 0.000000] doing Booting kernel, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 0.000000] doing Booting kernel: console='ttyAMA1,115200'
[ 0.000000] doing Booting kernel: earlycon='pl011,0xA0804000'
[ 0.000000] doing Booting kernel: root='/dev/ram0'
[ 0.000000] doing Booting kernel: rw='(null)'
[ 0.000000] doing Booting kernel: verbose='(null)'
[ 0.000000] doing Booting kernel: debug='(null)'
[ 0.000000] doing Booting kernel: swiotlb='1024'
[ 0.000000] PID hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.000000] Dentry cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.000000] Inode-cache hash table entries: 16384 (order: 5, 131072 bytes)
[ 0.000000] Memory: 238968K/262144K available (3902K kernel code, 218K rwdata, 916K rodata, 320K init, 245K bss, 23176K reserved, 0K cma-reserved)
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] modules : 0xffffff8000000000 - 0xffffff8008000000 ( 128 MB)
[ 0.000000] vmalloc : 0xffffff8008000000 - 0xffffffbebfff0000 ( 250 GB)
[ 0.000000] .text : 0xffffff8008080000 - 0xffffff8008450000 ( 3904 KB)
[ 0.000000] .rodata : 0xffffff8008450000 - 0xffffff8008540000 ( 960 KB)
[ 0.000000] .init : 0xffffff8008540000 - 0xffffff8008590000 ( 320 KB)
[ 0.000000] .data : 0xffffff8008590000 - 0xffffff80085c6808 ( 219 KB)
[ 0.000000] .bss : 0xffffff80085c6808 - 0xffffff8008603c64 ( 246 KB)
[ 0.000000] fixed : 0xffffffbefe7fd000 - 0xffffffbefec00000 ( 4108 KB)
[ 0.000000] PCI I/O : 0xffffffbefee00000 - 0xffffffbeffe00000 ( 16 MB)
[ 0.000000] vmemmap : 0xffffffbf00000000 - 0xffffffc000000000 ( 4 GB maximum)
[ 0.000000] 0xffffffbf00000000 - 0xffffffbf00380000 ( 3 MB actual)
[ 0.000000] memory : 0xffffffc000000000 - 0xffffffc010000000 ( 256 MB)
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] Build-time adjustment of leaf fanout to 64.
[ 0.000000] RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=3.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=3
[ 0.000000] NR_IRQS:64 nr_irqs:64 0
[ 0.000000] GICv3: CPU0: found redistributor 0 region 0:0x00000000b3200000
[ 0.000000] arm_arch_timer: Architected cp15 timer(s) running at 50.00MHz (virt).
[ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
[ 0.000002] sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
[ 0.008159] Calibrating delay loop (skipped), value calculated using timer frequency.. 100.00 BogoMIPS (lpj=200000)
[ 0.018668] pid_max: default: 32768 minimum: 301
[ 0.023365] Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
[ 0.029935] Mountpoint-cache hash table entries: 512 (order: 0, 4096 bytes)
[ 0.037277] ASID allocator initialised with 65536 entries
[ 1.075047] CPU1: failed to come online
[ 1.078902] CPU1: failed in unknown state : 0x0
[ 2.103784] CPU2: failed to come online
[ 2.107638] CPU2: failed in unknown state : 0x0
[ 2.112208] Brought up 1 CPUs
[ 2.115187] SMP: Total of 1 processors activated.
[ 2.119920] CPU features: detected feature: GIC system register CPU interface
[ 2.127103] CPU features: detected feature: 32-bit EL0 Support
[ 2.132972] CPU: All CPU(s) started at EL1
[ 2.137095] alternatives: patching kernel code
[ 2.141950] doing early, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 2.154130] doing early: console='ttyAMA1,115200'
[ 2.158864] doing early: earlycon='pl011,0xA0804000'
[ 2.163858] doing early: root='/dev/ram0'
[ 2.167890] doing early: rw='(null)'
[ 2.171483] doing early: verbose='(null)'
[ 2.175515] doing early: debug='(null)'
[ 2.179369] doing early: swiotlb='1024'
[ 2.183234] doing core, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 2.195320] doing core: console='ttyAMA1,115200'
[ 2.199965] doing core: earlycon='pl011,0xA0804000'
[ 2.204873] doing core: root='/dev/ram0'
[ 2.208816] doing core: rw='(null)'
[ 2.212322] doing core: verbose='(null)'
[ 2.216266] doing core: debug='(null)'
[ 2.220037] doing core: swiotlb='1024'
[ 2.223851] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 2.233663] futex hash table entries: 1024 (order: 5, 131072 bytes)
[ 2.240116] NET: Registered protocol family 16
[ 2.244600] doing postcore, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 2.257036] doing postcore: console='ttyAMA1,115200'
[ 2.262031] doing postcore: earlycon='pl011,0xA0804000'
[ 2.267289] doing postcore: root='/dev/ram0'
[ 2.271584] doing postcore: rw='(null)'
[ 2.275439] doing postcore: verbose='(null)'
[ 2.279735] doing postcore: debug='(null)'
[ 2.283854] doing postcore: swiotlb='1024'
[ 2.288155] doing arch, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 2.300248] doing arch: console='ttyAMA1,115200'
[ 2.304894] doing arch: earlycon='pl011,0xA0804000'
[ 2.309802] doing arch: root='/dev/ram0'
[ 2.313744] doing arch: rw='(null)'
[ 2.317253] doing arch: verbose='(null)'
[ 2.321195] doing arch: debug='(null)'
[ 2.324966] doing arch: swiotlb='1024'
[ 2.328733] vdso: 2 pages (1 code @ ffffff8008457000, 1 data @ ffffff8008594000)
[ 2.336185] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[ 2.343229] DMA: preallocated 256 KiB pool for atomic allocations
[ 2.349364] Serial: AMBA PL011 UART driver
[ 2.353649] OF: amba_device_add() failed (-19) for /uart@A0803000
[ 2.359816] OF: amba_device_add() failed (-19) for /uart@A0804000
[ 2.365981] OF: amba_device_add() failed (-19) for /uart@A0805000
[ 2.372228] doing subsys, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 2.384491] doing subsys: console='ttyAMA1,115200'
[ 2.389312] doing subsys: earlycon='pl011,0xA0804000'
[ 2.394395] doing subsys: root='/dev/ram0'
[ 2.398514] doing subsys: rw='(null)'
[ 2.402195] doing subsys: verbose='(null)'
[ 2.406315] doing subsys: debug='(null)'
[ 2.410257] doing subsys: swiotlb='1024'
[ 2.416431] doing fs, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 2.428353] doing fs: console='ttyAMA1,115200'
[ 2.432823] doing fs: earlycon='pl011,0xA0804000'
[ 2.437556] doing fs: root='/dev/ram0'
[ 2.441322] doing fs: rw='(null)'
[ 2.444654] doing fs: verbose='(null)'
[ 2.448421] doing fs: debug='(null)'
[ 2.452016] doing fs: swiotlb='1024'
[ 2.455799] clocksource: Switched to clocksource arch_sys_counter
[ 2.462200] NET: Registered protocol family 2
[ 2.466753] TCP established hash table entries: 2048 (order: 2, 16384 bytes)
[ 2.473858] TCP bind hash table entries: 2048 (order: 3, 32768 bytes)
[ 2.480362] TCP: Hash tables configured (established 2048 bind 2048)
[ 2.486787] UDP hash table entries: 256 (order: 1, 8192 bytes)
[ 2.492663] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[ 2.499024] NET: Registered protocol family 1
[ 2.503746] Trying to unpack rootfs image as initramfs...
[ 3.277299] Freeing initrd memory: 2472K
[ 3.281262] doing device, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 3.293534] doing device: console='ttyAMA1,115200'
[ 3.298356] doing device: earlycon='pl011,0xA0804000'
[ 3.303438] doing device: root='/dev/ram0'
[ 3.307559] doing device: rw='(null)'
[ 3.311241] doing device: verbose='(null)'
[ 3.315361] doing device: debug='(null)'
[ 3.319306] doing device: swiotlb='1024'
[ 3.323345] hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available
[ 3.331920] workingset: timestamp_bits=62 max_order=16 bucket_order=0
[ 3.339498] io scheduler noop registered
[ 3.343446] io scheduler deadline registered (default)
[ 3.358750] Unable to detect cache hierarchy for CPU 0
[ 3.377145] brd: module loaded
[ 3.380337] libphy: Fixed MDIO Bus: probed
[ 3.384528] Initializing XFRM netlink socket
[ 3.388860] NET: Registered protocol family 10
[ 3.394404] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 3.400567] NET: Registered protocol family 17
[ 3.405043] NET: Registered protocol family 15
[ 3.409514] doing late, parsing ARGS: 'console=ttyAMA1,115200 earlycon=pl011,0xA0804000 root=/dev/ram0 rw verbose debug swiotlb=1024'
[ 3.421607] doing late: console='ttyAMA1,115200'
[ 3.426253] doing late: earlycon='pl011,0xA0804000'
[ 3.431161] doing late: root='/dev/ram0'
[ 3.435107] doing late: rw='(null)'
[ 3.438616] doing late: verbose='(null)'
[ 3.442562] doing late: debug='(null)'
[ 3.446332] doing late: swiotlb='1024'
[ 3.450101] Floating-point is not implemented
[ 3.454481] Advanced SIMD is not implemented
[ 3.459066] [yuchen]: module_name=crypto-hmac(sha256)
[ 3.464629] modprobe[747]: undefined instruction: pc=00000000000c0dbc
[ 3.471120] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)
[ 3.477340] [yuchen]: module_name=crypto-hmac(sha256)-all
[ 3.482950] modprobe[748]: undefined instruction: pc=00000000000c0dbc
[ 3.489439] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)
[ 3.495693] [yuchen]: module_name=crypto-cbc(aes)
[ 3.500602] modprobe[751]: undefined instruction: pc=00000000000c0dbc
[ 3.507091] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)
[ 3.513268] [yuchen]: module_name=crypto-cbc(aes)-all
[ 3.518528] modprobe[752]: undefined instruction: pc=00000000000c0dbc
[ 3.525016] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)
[ 3.531232] Key type encrypted registered
[ 3.535347] Warning: unable to open an initial console.
[ 3.540721] Freeing unused kernel memory: 320K
[ 3.545326] init[1]: undefined instruction: pc=00000000000c0dbc
[ 3.551290] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)
[ 3.557457] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 3.557457]
[ 3.566655] Kernel Offset: disabled
[ 3.570157] Memory Limit: none
[ 3.573224] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 3.573224]
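The exitcode field on the panic line can be decoded to confirm that init was killed by an illegal instruction. The following is a minimal sketch, assuming the value follows the usual wait-status layout in which the low 7 bits hold the terminating signal number. The function name decode_init_exitcode is made up for this illustration.

```python
import re
import signal


def decode_init_exitcode(panic_line):
    """Return the name of the signal encoded in an 'Attempted to kill init!' line."""
    match = re.search(r"exitcode=0x([0-9a-fA-F]+)", panic_line)
    if match is None:
        raise ValueError("no exitcode field found")
    status = int(match.group(1), 16)
    # Assumption: wait-status layout, low 7 bits = terminating signal number.
    signum = status & 0x7F
    if signum == 0:
        return "no signal"
    return signal.Signals(signum).name


line = "[ 3.557457] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004"
print(decode_init_exitcode(line))
```

On Linux, signal 4 is SIGILL, which matches the "undefined instruction" message printed for init just before the panic.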
How to analyze it:
Method 1: use the PC value to find the offending instruction.
[ 3.518528] modprobe[752]: undefined instruction: pc=00000000000c0dbc
Disassemble the modprobe program:
~/tools/crossTools/aarch64be_eabi_gcc6.2.0_glibc2.24.0_fp_be8/bin/aarch64_be-linux-gnu-objdump -D modprobe > modprobe.txt
Open modprobe.txt in vim and search for the PC address to see the corresponding instruction.
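The lookup in method 1 can also be scripted instead of searching by hand in vim. This is only a sketch: it assumes objdump's usual listing layout, where each instruction line starts with the address in lowercase hex without leading zeros, followed by a colon. The helper names pc_from_log and find_in_disassembly are hypothetical. The sample listing is the a.out output shown later in this article, not the real modprobe disassembly.

```python
import re


def pc_from_log(line):
    """Extract the faulting PC from an 'undefined instruction' kernel log line."""
    match = re.search(r"undefined instruction: pc=([0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError("no pc field found")
    return int(match.group(1), 16)


def find_in_disassembly(disassembly, pc):
    """Return the objdump lines whose address label equals pc."""
    # Assumed objdump layout: optional spaces, hex address, colon, whitespace.
    label = re.compile(r"^\s*{:x}:\s".format(pc))
    return [l for l in disassembly.splitlines() if label.match(l)]


log_line = "[ 3.518528] modprobe[752]: undefined instruction: pc=00000000000c0dbc"
print(hex(pc_from_log(log_line)))

# Sample listing taken from the a.out disassembly in this article (address 0).
sample = "0000000000000000 <.text>:\n   0:\t4e010c20 \tdup\tv0.16b, w1\n"
print(find_in_disassembly(sample, 0))
```

For the real case, pass the contents of modprobe.txt and the value returned by pc_from_log to find_in_disassembly.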
Method 2: turn the binary code from the Code line back into an instruction.
[ 3.525016] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)
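In the log, (ba0f07ee) is the binary instruction word at the address the PC points to. Reverse its bytes to get the instruction word 0xee070fba and save it in a file named a.s:
.inst 0xee070fba
Assemble a.s with the arm64 assembler to get a.out: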
~/tools/crossTools/aarch64be_eabi_gcc6.2.0_glibc2.24.0_fp_be8/bin/aarch64_be-linux-gnu-as a.s
Disassemble it with objdump to get the assembly instruction:
~/tools/crossTools/aarch64be_eabi_gcc6.2.0_glibc2.24.0_fp_be8/bin/aarch64_be-linux-gnu-objdump -D a.out
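The byte reversal in method 2 is easy to get wrong by hand, so here is a small sketch that produces the line for a.s directly from a Code line. It assumes, as in both logs in this article, that the bracketed word is printed byte-reversed relative to the instruction word (ba0f07ee becomes 0xee070fba, 200c014e becomes 0x4e010c20). The function name code_word_to_inst is made up for this example.

```python
import re


def code_word_to_inst(code_line):
    """Build a '.inst' line from the bracketed word of a kernel 'Code:' line."""
    match = re.search(r"\(([0-9a-fA-F]{8})\)", code_line)
    if match is None:
        raise ValueError("no bracketed code word found")
    raw = bytes.fromhex(match.group(1))
    # Assumption: the log prints the four bytes in reverse order.
    word = int.from_bytes(raw, "little")
    return ".inst 0x{:08x}".format(word)


code_line = "[ 3.525016] Code: 000093e5 5c3405e5 bcc29fe5 042080e5 (ba0f07ee)"
print(code_word_to_inst(code_line))
```

The printed line can be written to a.s and assembled as described above.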
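Method 3:
Read the arm64 manual directly and translate the code word into its assembly form according to the instruction encoding format.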
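As a first step for method 3, the top-level encoding group of an instruction word can be read from bits [28:25]. The sketch below follows the A64 top-level encoding table as I understand it from the Armv8-A manual, and it assumes the word really is an A64 instruction. Treat it as a rough guide and check the result against the manual. The function name a64_encoding_group is hypothetical.

```python
def a64_encoding_group(word):
    """Classify a 32-bit A64 instruction word by bits [28:25] (op0)."""
    op0 = (word >> 25) & 0xF
    if op0 & 0b0111 == 0b0111:   # x111
        return "data processing (scalar floating-point and Advanced SIMD)"
    if op0 & 0b0111 == 0b0101:   # x101
        return "data processing (register)"
    if op0 & 0b0101 == 0b0100:   # x1x0
        return "loads and stores"
    if op0 & 0b1110 == 0b1000:   # 100x
        return "data processing (immediate)"
    if op0 & 0b1110 == 0b1010:   # 101x
        return "branches, exception generating and system instructions"
    return "reserved, unallocated or extension encoding"


# 0x4e010c20 is the word that objdump shows as 'dup v0.16b, w1' in this article.
print(a64_encoding_group(0x4E010C20))
```

Knowing the group tells you which chapter of the encoding tables to open next.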
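Causes of undefined instruction errors:
Cause 1: a mismatch between the compiler and the processor (CPU) hardware. The compiler generates instructions the CPU does not recognize, so the CPU cannot execute them. For example, if a processor has no floating-point unit (FP) but your compiler emits floating-point instructions, the CPU will not recognize them and raises an illegal instruction exception. (I ran into exactly this: one of our company's platforms used an in-house CPU with the floating-point module removed, while the toolchain emitted floating-point instructions by default. Switching to a different toolchain fixed it.)
Cause 2: instruction words in the program's code segment are modified at run time. The code segment is normally read-only, so this is generally not expected to happen. (I did see it once in a real product, though: after running for a long time, the software would occasionally hit an illegal instruction. This happened on a flash storage device, and the root cause turned out to be a piece of code that wrote to flash extremely frequently, which made some bits in the flash device drift and flip, producing illegal instructions. This is of course very rare.)
Cause 3: the executable was damaged for some reason (a special format conversion, an added header, and so on), so the code segment of the final program is corrupted and the program already contains illegal instructions, while the ELF header is still intact. The program therefore fails during execution.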
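For cause 1, the boot log itself can hint at a mismatch: both logs in this article contain lines such as "Floating-point is not implemented" and "Advanced SIMD is not implemented". A minimal sketch that scans a saved log for these two messages follows. The function name missing_fp_features and the short labels "fp" and "asimd" are my own choices for illustration.

```python
def missing_fp_features(boot_log):
    """Return labels for FP/SIMD units the boot log reports as not implemented."""
    markers = {
        "Floating-point is not implemented": "fp",
        "Advanced SIMD is not implemented": "asimd",
    }
    return [label for text, label in markers.items() if text in boot_log]


boot_log = (
    "[ 3.450101] Floating-point is not implemented\n"
    "[ 3.454481] Advanced SIMD is not implemented\n"
)
print(missing_fp_features(boot_log))
```

A non-empty result on a system whose root filesystem was built with a floating-point toolchain is a strong hint to check the toolchain first.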
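Below is the log from the CPU/compiler mismatch problem, where an arm64 big-endian compiler (with floating-point instructions) was used. The instruction word (200c014e) disassembles to dup v0.16b, w1, an instruction that belongs to the floating-point/Advanced SIMD unit, so the CPU reported an illegal instruction when it tried to execute it.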
-bash-4.1$ ~/tools/crossTools/aarch64be_eabi_gcc6.2.0_glibc2.24.0_fp_be8/bin/aarch64_be-linux-gnu-objdump -D a.out
a.out: file format elf64-bigaarch64
Disassembly of section .text:
0000000000000000 <.text>:
0: 4e010c20 dup v0.16b, w1
-bash-4.1$
## Root filesystem built with the aarch64be_eabi_gcc6.2.0_glibc2.24.0_fp_be8/bin/aarch64_be-linux-gnu- compiler
[ 2.398016] Advanced SIMD is not implemented
[ 2.403281] [yuchen]: module_name=crypto-hmac(sha256)
[ 2.408573] [yuchen]: module_name=crypto-hmac(sha256)-all
[ 2.414147] [yuchen]: module_name=crypto-cbc(aes)
[ 2.418957] [yuchen]: module_name=crypto-cbc(aes)-all
[ 2.424160] Key type encrypted registered
[ 2.428362] RAMDISK: gzip image found at block 0
[ 2.733404] EXT4-fs (ram0): couldn't mount as ext3 due to feature incompatibilities
[ 2.741533] EXT4-fs (ram0): mounted filesystem with ordered data mode. Opts: (null)
[ 2.749204] VFS: Mounted root (ext4 filesystem) on device 1:0.
[ 2.755107] Freeing unused kernel memory: 320K
[ 2.760930] init[1]: undefined instruction: pc=0000000000400d00
[ 2.766850] Code: 00000000 00000000 00000000 00000000 (200c014e)
[ 2.773036] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 2.773036]
[ 2.782157] Kernel Offset: disabled
[ 2.785632] Memory Limit: none
[ 2.788675] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[ 2.788675]