VM中有一块盘,进入mount点后,ls显示错误:
[root@es10 data2]# ls
ls: cannot open directory .: Input/output error
[root@es10 ~]# lsblk | grep vdc
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vdc 253:32 0 2T 0 disk
└─vdc1 253:33 0 2T 0 part /data2
[root@es10 ~]# mount | grep vdc
/dev/vdc1 on /data2 type xfs (rw,relatime,attr2,inode64,noquota)
[root@es10 data2]# df -h | grep vdc
Filesystem Size Used Avail Use% Mounted on
/dev/vdc1 2.0T 281G 1.7T 15% /data2
[root@es10 ~]# free -h
total used free shared buff/cache available
Mem: 31G 322M 18G 16M 12G 30G
Swap: 0B 0B 0B
[root@es10 ~]# tail /var/log/messages
Dec 4 10:02:37 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
Dec 4 10:02:39 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
Dec 4 10:02:39 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
[root@es10 ~]# lsof | grep "/data2"
lsof: WARNING: can't stat() xfs file system /data2
Output information may be incomplete.
[root@es10 ~]# lsof | grep "/data"
lsof: WARNING: can't stat() xfs file system /data2
Output information may be incomplete.
bash 32147 root cwd DIR 253,17 26 128 /data1
在网上搜索了一下,有人也遇到过,重新mount可以解决:
[root@es10 ~]# umount /data2
[root@es10 ~]# mount | grep vdc
[root@es10 ~]# mount /dev/vdc1 /data2
[root@es10 ~]# tail /var/log/messages
Dec 4 10:02:37 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
Dec 4 10:02:39 es10 kernel: XFS (vdc1): Unmounting Filesystem
Dec 4 10:02:57 es10 kernel: XFS (vdc1): Mounting V4 Filesystem
Dec 4 10:02:57 es10 kernel: XFS (vdc1): Starting recovery (logdev: internal)
Dec 4 10:02:58 es10 kernel: XFS (vdc1): Ending recovery (logdev: internal)
Dec 4 10:05:38 es10 systemd: Started Session 592 of user root.
Dec 4 10:05:38 es10 systemd-logind: New session 592 of user root.
Dec 4 10:05:38 es10 systemd: Starting Session 592 of user root.
Linux默认会使用可用内存的40%来作为文件系统cache,当这个阈值达到时,文件系统会下刷这些cache到磁盘。而这个flush的过程,会有一个时间限制,操作系统默认给了120s的限制,如果底层磁盘不是很快,部分cache依然占用,没有flush到disk,系统内存会慢慢被消耗完。
Dec 2 05:27:32 es10 kernel: INFO: task java:10432 blocked for more than 120 seconds.
Dec 2 05:27:32 es10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 2 05:27:32 es10 kernel: java D ffff8808115a4500 0 10432 1 0x00000180
Dec 2 05:27:32 es10 kernel: ffff8803fcf9fe40 0000000000000082 ffff8808115a4500 ffff8803fcf9ffd8
Dec 2 05:27:32 es10 kernel: ffff8803fcf9ffd8 ffff8803fcf9ffd8 ffff8808115a4500 000000720007bf45
Dec 2 05:27:32 es10 kernel: ffff88042944c528 ffff8803fcf9fef4 ffff88080fc2e918 ffff8808115a4500
Dec 2 05:27:32 es10 kernel: Call Trace:
Dec 2 05:27:32 es10 kernel: [] schedule+0x29/0x70
Dec 2 05:27:32 es10 kernel: [] _xfs_log_force_lsn+0x184/0x2f0 [xfs]
Dec 2 05:27:32 es10 kernel: [] ? wake_up_state+0x20/0x20
Dec 2 05:27:32 es10 kernel: [] xfs_file_fsync+0x1b0/0x200 [xfs]
Dec 2 05:27:32 es10 kernel: [] do_fsync+0x65/0xa0
Dec 2 05:27:32 es10 kernel: [] SyS_fdatasync+0x13/0x20
Dec 2 05:27:32 es10 kernel: [] tracesys+0xdd/0xe2
Dec 2 05:27:32 es10 kernel: INFO: task java:10433 blocked for more than 120 seconds.
Dec 2 05:27:32 es10 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 2 05:27:32 es10 kernel: java D ffff88081176e780 0 10433 1 0x00000180
Dec 2 05:27:32 es10 kernel: ffff88049fb03e40 0000000000000082 ffff88081176e780 ffff88049fb03fd8
Dec 2 05:27:32 es10 kernel: ffff88049fb03fd8 ffff88049fb03fd8 ffff88081176e780 000000720007be05
Dec 2 05:27:32 es10 kernel: ffff88042944c528 ffff88049fb03ef4 ffff88080fc2e818 ffff88081176e780
Dec 2 05:27:32 es10 kernel: Call Trace:
Dec 2 05:27:32 es10 kernel: [] schedule+0x29/0x70
Dec 2 05:27:32 es10 kernel: [] _xfs_log_force_lsn+0x184/0x2f0 [xfs]
Dec 2 05:27:32 es10 kernel: [] ? wake_up_state+0x20/0x20
Dec 2 05:27:32 es10 kernel: [] xfs_file_fsync+0x1b0/0x200 [xfs]
Dec 2 05:27:32 es10 kernel: [] do_fsync+0x65/0xa0
Dec 2 05:27:32 es10 kernel: [] SyS_fdatasync+0x13/0x20
Dec 2 05:27:32 es10 kernel: [] tracesys+0xdd/0xe2
Dec 2 05:33:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660325
Dec 2 05:33:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07b9a5 ("xlog_iodone") error 5 numblks 64
Dec 2 05:33:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:33:17 es10 kernel: XFS (vdc1): Log I/O Error Detected. Shutting down filesystem
Dec 2 05:33:17 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
Dec 2 05:33:17 es10 kernel: XFS (vdc1): Please umount the filesystem and rectify the problem(s)
Dec 2 05:33:47 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115248608
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115249616
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115250624
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115251632
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115252640
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115253648
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115254656
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115255664
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115256672
Dec 2 05:33:54 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2115256800
Dec 2 05:34:17 es10 kernel: XFS (vdc1): xfs_log_force: error -5 returned.
Dec 2 05:38:17 es10 kernel: blk_update_request: 8 callbacks suppressed
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660389
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660453
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660517
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660581
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660645
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660709
Dec 2 05:38:17 es10 kernel: blk_update_request: I/O error, dev vdc, sector 2097660773
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07b9e5 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07ba25 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07ba65 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07baa5 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07bae5 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07bb25 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
Dec 2 05:38:17 es10 kernel: XFS (vdc1): metadata I/O error: block 0x7d07bb65 ("xlog_iodone") error 5 numblks 64
Dec 2 05:38:17 es10 kernel: XFS (vdc1): xfs_do_force_shutdown(0x2) called from line 1180 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa01c3b9e
从上面的日志可以看到:
主要问题还是 kernel: INFO: task java:10432 blocked for more than 120 seconds.这种问题,解决方法就是降低vm.dirty_ratio、vm.dirty_background_ratio
这两个参数。
echo
"vm.dirty_ratio = 10"
>> /etc/sysctl.conf
echo
"vm.dirty_background_ratio = 1"
>> /etc/sysctl.conf'
不重启生效:
sysctl -p
http://blog.51cto.com/7938217/1679885
https://www.blackmoreops.com/2014/09/22/linux-kernel-panic-issue-fix-hung_task_timeout_secs-blocked-120-seconds-problem/
https://bugzilla.redhat.com/show_bug.cgi?id=1211084