Analyzing and Resolving a Kernel Error Log

Problem

While investigating why our Jira server had gone down today, I checked /var/log/messages on that server and found the following errors:

Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6: AER: Corrected error received: id=00e6
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e6(Receiver ID)
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6:   device [8086:a116] error status/mask=00000001/00002000
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6:    [ 0] Receiver Error        
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6: AER: Corrected error received: id=00e6
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e6(Receiver ID)
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6:   device [8086:a116] error status/mask=00000001/00002000
Dec  4 13:50:26 localhost kernel: pcieport 0000:00:1c.6:    [ 0] Receiver Error        
Dec  4 13:50:27 localhost kernel: pcieport 0000:00:1c.6: AER: Corrected error received: id=00e6
Dec  4 13:50:27 localhost kernel: pcieport 0000:00:1c.6: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e6(Receiver ID)
Dec  4 13:50:27 localhost kernel: pcieport 0000:00:1c.6:   device [8086:a116] error status/mask=00000001/00002000
Dec  4 13:50:27 localhost kernel: pcieport 0000:00:1c.6:    [ 0] Receiver Error
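
To get a feel for how often these reports occur and which port they come from, the log can be summarized per device. The following is a minimal sketch that assumes the syslog line format shown above; count_aer_corrected is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# Hypothetical helper: read syslog-style text on stdin and print
# "<pci-address> <count>" for every port that reported corrected AER errors.
count_aer_corrected() {
    # Keep only the "AER: Corrected error received" lines and extract the
    # PCI address that follows "pcieport".
    sed -n 's/.*pcieport \([0-9a-f:.]*\): AER: Corrected error received.*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{print $2, $1}'
}

# Usage on the affected server (log path taken from this article):
#   count_aer_corrected < /var/log/messages
```

If a single address such as 0000:00:1c.6 accounts for all of the reports, the problem is most likely confined to that one root port and the device behind it.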

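These errors may be caused by Active State Power Management (ASPM) transitioning the PCIe link to a lower power state, which can lead the device to trigger them. Note that the log reports severity=Corrected, meaning the hardware already recovered from each error on its own.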
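
Before changing boot parameters, it may help to check whether ASPM is actually in use. The sketch below assumes a kernel that exposes /sys/module/pcie_aspm/parameters/policy, where the active policy is marked with square brackets; active_aspm_policy is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read the ASPM policy line on stdin, for example
# "[default] performance powersave powersupersave", and print only the
# policy enclosed in square brackets (the active one).
active_aspm_policy() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Usage on the affected server:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
#   lspci -vv -s 00:1c.6 | grep -i aspm   # per-link ASPM capability and state (run as root)
```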

 

Solution

# 0. Back up and edit the GRUB defaults file /etc/default/grub
[root@localhost ~]# cp /etc/default/grub /etc/default/grub.bak
[root@localhost ~]# vim /etc/default/grub

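# 1. Change the following setting (append pci=nomsi to the kernel command line)
# Note: pci=nomsi disables MSI interrupts system-wide; if ASPM really is the cause,
# pcie_aspm=off is the parameter that disables ASPM directly.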
# GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nomsi"

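# 2. Save and close the file, regenerate the GRUB configuration, and reboot
# Note: update-grub is the Debian/Ubuntu command; on RHEL/CentOS the usual equivalent is
# grub2-mkconfig -o /boot/grub2/grub.cfg (the output path differs on UEFI systems).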
[root@localhost ~]# update-grub
[root@localhost ~]# reboot
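
After the reboot, it is worth confirming that the parameter actually reached the running kernel. A minimal sketch, assuming /proc/cmdline is available; has_kernel_param is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if parameter $1 appears as a whole word in the
# kernel command line. $2 is optional and defaults to the running kernel's
# command line from /proc/cmdline.
has_kernel_param() {
    cmdline=${2:-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage after the reboot:
#   has_kernel_param pci=nomsi && echo "pci=nomsi is active"
#   tail -f /var/log/messages    # confirm the AER messages no longer appear
```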
