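Note: this analysis is based on kernel version 3.10.0-693.el7, i.e. CentOS 7.4.
In the previous article, "vm kernel parameters: dirty pages, dirty_writeback_centisecs and dirty_expire_centisecs", we saw that when handling dirty pages, __mark_inode_dirty checks whether block_dump is enabled. Today we look at what block_dump actually does, and whether any other code path records related I/O information.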
void __mark_inode_dirty(struct inode *inode, int flags)
{
    ...
    if (unlikely(block_dump))
        block_dump___mark_inode_dirty(inode);
    ...
}
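This function logs information about the dirtied inode: the process name, the PID, the inode number, the file name and the device name.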
static noinline void block_dump___mark_inode_dirty(struct inode *inode)
{
    if (inode->i_ino || strcmp(inode->i_sb->s_id, "bdev")) {
        struct dentry *dentry;
        const char *name = "?";
        dentry = d_find_alias(inode);
        // If the inode has a dentry alias, take the file name from it
        // (otherwise the name stays "?")
        if (dentry) {
            spin_lock(&dentry->d_lock);
            name = (const char *) dentry->d_name.name;
        }
        // Log the dirty inode: process name, PID, inode number,
        // file name, and device name (s_id)
        printk(KERN_DEBUG
               "%s(%d): dirtied inode %lu (%s) on %s\n",
               current->comm, task_pid_nr(current), inode->i_ino,
               name, inode->i_sb->s_id);
        if (dentry) {
            spin_unlock(&dentry->d_lock);
            dput(dentry);
        }
    }
}
Besides logging when an inode is marked dirty, block_dump also logs reads and writes on the I/O submission path:
void submit_bio(int rw, struct bio *bio)
{
    bio->bi_rw |= rw;
    /*
     * If it's a regular read/write or a barrier with data attached,
     * go through the normal accounting stuff before submission.
     */
    if (bio_has_data(bio)) {
        unsigned int count;
        ...
        if (unlikely(block_dump)) {
            char b[BDEVNAME_SIZE];
            printk(KERN_DEBUG "%s(%d): %s block %Lu on %s (%u sectors)\n",
                   current->comm, task_pid_nr(current),
                   (rw & WRITE) ? "WRITE" : "READ",
                   (unsigned long long)bio->bi_sector,
                   bdevname(bio->bi_bdev, b),
                   count);
        }
    }
    generic_make_request(bio);
}
Let's turn the option on. Writing any non-zero value enables it:
echo 1 > /proc/sys/vm/block_dump
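Because block_dump logs every dirtied inode and every submitted bio, it is usually only wanted for a short window. The following is a minimal sketch, not part of the original workflow, of a Python context manager that sets the value to 1 and restores the previous value afterwards. The helper name `block_dump_enabled` is hypothetical, and writing to the real /proc file requires root.

```python
import contextlib
from pathlib import Path

# Path taken from the article; the helper below is a hypothetical convenience.
BLOCK_DUMP = Path("/proc/sys/vm/block_dump")


@contextlib.contextmanager
def block_dump_enabled(path=BLOCK_DUMP):
    """Set block_dump to 1 for the duration of the block, then restore it."""
    path = Path(path)
    previous = path.read_text().strip()   # remember the current setting
    path.write_text("1\n")                # any non-zero value enables logging
    try:
        yield
    finally:
        path.write_text(previous + "\n")  # restore whatever was there before


# Intended use (as root), e.g. to capture 10 seconds of I/O activity:
#
#     import time
#     with block_dump_enabled():
#         time.sleep(10)
```

The `path` parameter exists only so the helper can be tried against an ordinary file before pointing it at /proc.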
The debug messages are then written to the kernel log and can be viewed with the dmesg command. The output looks like this:
[3483051.407583] etcd(1088): WRITE block 62705008 on vda1 (8 sectors)
[3483051.438507] etcd(1089): WRITE block 62705008 on vda1 (8 sectors)
[3483051.439129] in:imjournal(554): dirtied inode 393563 (imjournal.state.tmp) on vda1
[3483051.439135] in:imjournal(554): dirtied inode 393563 (imjournal.state.tmp) on vda1
[3483051.439193] in:imjournal(554): WRITE block 12859400 on vda1 (8 sectors)
[3483051.466076] auditd(511): WRITE block 76034768 on vda1 (24 sectors)
[3483051.466397] jbd2/vda1-8(293): WRITE block 8715456 on vda1 (8 sectors)
[3483051.466407] jbd2/vda1-8(293): WRITE block 8715464 on vda1 (8 sectors)
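Both record types follow the printk format strings shown earlier, so they are easy to parse. Here is a rough sketch of a parser for dmesg lines like the ones above; the function name `parse_line` and the returned field names are my own choices, and the regular expressions assume the default dmesg timestamp prefix.

```python
import re

# Common prefix: "[timestamp] comm(pid): "
PREFIX = r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<comm>.+)\((?P<pid>\d+)\): "

# "%s(%d): %s block %Lu on %s (%u sectors)" from submit_bio
IO_RE = re.compile(
    PREFIX + r"(?P<op>READ|WRITE) block (?P<sector>\d+) on (?P<dev>\S+) "
    r"\((?P<count>\d+) sectors\)$"
)
# "%s(%d): dirtied inode %lu (%s) on %s" from block_dump___mark_inode_dirty
DIRTY_RE = re.compile(
    PREFIX + r"dirtied inode (?P<ino>\d+) \((?P<name>.*)\) on (?P<dev>\S+)$"
)


def parse_line(line):
    """Return a dict describing a block_dump record, or None for other lines."""
    line = line.strip()
    m = IO_RE.match(line)
    if m:
        return {"kind": "io", "comm": m.group("comm"), "pid": int(m.group("pid")),
                "op": m.group("op"), "sector": int(m.group("sector")),
                "dev": m.group("dev"), "count": int(m.group("count"))}
    m = DIRTY_RE.match(line)
    if m:
        return {"kind": "dirty", "comm": m.group("comm"), "pid": int(m.group("pid")),
                "ino": int(m.group("ino")), "name": m.group("name"),
                "dev": m.group("dev")}
    return None


rec = parse_line("[3483051.466397] jbd2/vda1-8(293): WRITE block 8715456 on vda1 (8 sectors)")
print(rec["comm"], rec["op"], rec["sector"], rec["count"])
```

The process name is matched greedily because names such as `jbd2/vda1-8` and `in:imjournal` contain punctuation; the PID in parentheses directly before the colon is what anchors the match.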
For dirty inodes, the log tells us directly which process touched which file. For example, after editing a test file with vi and saving it, dmesg shows the following:
[root@localhost vm]# dmesg | grep test
[3485027.567718] vi(14276): dirtied inode 543071 (mytest.txt) on vda1
[3485027.567721] vi(14276): dirtied inode 543071 (mytest.txt) on vda1
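The same inode is often logged several times in a row, as in the two vi lines above. As an illustration, the sketch below collapses "dirtied inode" records into one set of files per process. The function name `dirtied_files` is hypothetical and the regular expression assumes the log format shown in this article.

```python
import re
from collections import defaultdict

# Matches "... comm(pid): dirtied inode INO (NAME) on DEV"
DIRTY_RE = re.compile(
    r"\]\s+(?P<comm>.+?)\((?P<pid>\d+)\): "
    r"dirtied inode (?P<ino>\d+) \((?P<name>.*)\) on (?P<dev>\S+)\s*$"
)


def dirtied_files(lines):
    """Map (comm, pid) -> set of (inode, file name, device), dropping repeats."""
    result = defaultdict(set)
    for line in lines:
        m = DIRTY_RE.search(line)
        if m:  # lines that are not "dirtied inode" records are ignored
            key = (m.group("comm"), int(m.group("pid")))
            result[key].add((int(m.group("ino")), m.group("name"), m.group("dev")))
    return dict(result)


log = [
    "[3485027.567718] vi(14276): dirtied inode 543071 (mytest.txt) on vda1",
    "[3485027.567721] vi(14276): dirtied inode 543071 (mytest.txt) on vda1",
]
for (comm, pid), files in dirtied_files(log).items():
    print(comm, pid, sorted(files))
```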
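If you need the absolute path of the file, you can search for it by inode number with find (for example with -inum), or use the debugfs tool.
READ/WRITE records are harder to interpret: block_dump only reports a block number there, so resolving them also requires debugfs.
First, taking inode 543071 above as an example: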
[root@localhost vm]# debugfs /dev/vda1 -R 'ncheck 543071'
debugfs 1.42.9 (28-Dec-2013)
Inode Pathname
543071 /root/mytest.txt
This gives us the file's absolute path unambiguously.
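When many inodes need resolving, the lookup can be scripted. Below is a minimal sketch that runs `debugfs -R 'ncheck <inode>'` and parses the "Inode Pathname" table shown above. The function names are hypothetical, the parser assumes the output layout seen in this article (debugfs 1.42.9), and running debugfs against a device requires root.

```python
import subprocess


def parse_ncheck(output):
    """Parse `ncheck` output into {inode: [paths]}, skipping banner/header lines."""
    paths = {}
    for line in output.splitlines():
        parts = line.split(None, 1)          # inode number, then the path
        if len(parts) == 2 and parts[0].isdigit():
            paths.setdefault(int(parts[0]), []).append(parts[1].strip())
    return paths


def ncheck(device, inode):
    """Resolve an inode number to path(s) on an ext2/3/4 device via debugfs."""
    proc = subprocess.run(
        ["debugfs", "-R", f"ncheck {inode}", device],
        capture_output=True, text=True, check=True,
    )
    return parse_ncheck(proc.stdout)


# Intended use (as root):
#     ncheck("/dev/vda1", 543071)
```

A list of paths is kept per inode because a hard-linked file can appear under more than one name.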
Now take the following two READ records as an example:
[3498645.980401] bash(5875): READ block 655600 on vda1 (32 sectors)
[3498645.980667] grep(5876): READ block 629872 on vda1 (280 sectors)
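Use icheck in debugfs to find the inode that owns the block, then ncheck to find the corresponding file. Note that the number logged by submit_bio is bio->bi_sector, which counts 512-byte sectors from the start of the partition, whereas icheck expects a filesystem block number; on a filesystem with 4 KiB blocks the logged value would need to be divided by 8 first. The session below passes the logged values to icheck directly: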
debugfs: icheck 655600
Block Inode number
655600 531108
debugfs: ncheck 531108
Inode Pathname
531108 /home/elk/elasticsearch-7.6.0-x86_64.rpm
debugfs: icheck 629872
Block Inode number
629872 529729
debugfs: ncheck 529729
Inode Pathname
529729 /home/elk/jdk-8u231-linux-x64.rpm
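One caveat when feeding logged values to icheck: the "block" printed by submit_bio is bio->bi_sector, in 512-byte sectors, while icheck works in filesystem blocks. The sketch below shows the conversion under the assumption of a 4096-byte filesystem block size (the real value can be read with `dumpe2fs -h` or `tune2fs -l`); the function names are hypothetical.

```python
SECTOR_SIZE = 512  # bio->bi_sector is counted in 512-byte sectors


def fs_blocks_for_io(sector, count, fs_block_size=4096):
    """Return the filesystem block numbers covered by an I/O of `count` sectors.

    fs_block_size=4096 is an assumption; check the actual filesystem.
    """
    first_byte = sector * SECTOR_SIZE
    last_byte = (sector + count) * SECTOR_SIZE - 1
    first_block = first_byte // fs_block_size
    last_block = last_byte // fs_block_size
    return list(range(first_block, last_block + 1))


def icheck_request(sector, count, fs_block_size=4096):
    """Build the debugfs request string for every block touched by the I/O."""
    blocks = fs_blocks_for_io(sector, count, fs_block_size)
    return "icheck " + " ".join(str(b) for b in blocks)


# "bash(5875): READ block 655600 on vda1 (32 sectors)"
print(icheck_request(655600, 32))  # icheck 81950 81951 81952 81953
```

Under that assumption, the 32-sector read at sector 655600 covers four filesystem blocks, and any of them can be passed to icheck to identify the owning inode.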