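After reading the article on analyzing disk IO with blktrace and debugfs, I tried it out myself. It took a bit over an hour, and I did manage to locate the file that was being written. A few points are worth calling out:
1. With debugfs, both icheck and ncheck are very slow and generate very heavy disk reads. I waited nearly 10 minutes, and iostat showed the following while it ran: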
Device:        rrqm/s  wrqm/s     r/s   w/s     rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
cciss/c0d1   21880.00    0.00  432.00  0.00  178752.00    0.00    413.78      0.95   2.21   2.18  94.20
cciss/c0d1p1 21880.00    0.00  432.00  0.00  178752.00    0.00    413.78      0.95   2.21   2.18  94.20
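2. To really diagnose hot-file disk IO in practice, the blkparse output needs post-processing: requests on the same device that hit the same or neighboring sector numbers should be aggregated (assuming sequential reads and writes).
3. arrowpig found that running debugfs against ext4 with journaling enabled is still problematic: ext4 always writes to the journal area first, so debugfs keeps reporting the same inode number (inode=8, the inode reserved for the journal). Does anyone know how to work around this?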
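The aggregation described in point 2 could be sketched roughly as follows. This is only a minimal illustration, not a finished tool: the function name `aggregate_hot_regions`, the bucket size of 2048 sectors, and the choice of which action type to count are all assumptions of mine, and the regular expression assumes the default blkparse line layout seen in the trace in this post (device, CPU, sequence, timestamp, PID, action, RWBS, sector, `+`, sector count).

```python
import re
from collections import Counter

# Assumed default blkparse layout:
#   maj,min cpu seq timestamp pid action rwbs sector + nsectors ...
LINE_RE = re.compile(
    r"^\s*(?P<dev>\d+,\d+)\s+\d+\s+\d+\s+[\d.]+\s+\d+\s+"
    r"(?P<action>[A-Za-z]+)\s+(?P<rwbs>\S+)\s+(?P<sector>\d+)\s+\+\s+(?P<count>\d+)"
)

def aggregate_hot_regions(lines, action="Q", bucket_sectors=2048):
    """Sum transferred sectors per (device, rwbs, bucket start sector).

    Requests whose start sectors fall into the same fixed-size bucket are
    counted together, which approximates "same or neighboring sectors".
    Lines that do not match the assumed layout are skipped.
    """
    totals = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None or m.group("action") != action:
            continue
        sector = int(m.group("sector"))
        bucket_start = sector - sector % bucket_sectors
        key = (m.group("dev"), m.group("rwbs"), bucket_start)
        totals[key] += int(m.group("count"))
    return totals

# Example using the remap ('A') line from the trace in this post, repeated 5 times.
sample = ["104,16 6 1 0.000000000 2569 A W 3711129946 + 8 <- (104,17) 3711129912"] * 5
for key, sectors in aggregate_hot_regions(sample, action="A").most_common(10):
    print(key, sectors)
```

On a real trace one would pass the open `1.log` file instead of `sample`, then feed only the top few buckets to debugfs, which should keep the number of slow icheck runs small.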
[op1@SVR2084HP360 ~]$ sudo blktrace /dev/cciss/c0d1p1
BLKTRACESTOP: Invalid argument
Device: /dev/cciss/c0d1p1
CPU 0: 0 events, 1 KiB data
CPU 1: 0 events, 0 KiB data
CPU 2: 0 events, 0 KiB data
CPU 3: 0 events, 0 KiB data
CPU 4: 0 events, 0 KiB data
CPU 5: 0 events, 5 KiB data
CPU 6: 0 events, 0 KiB data
CPU 7: 0 events, 0 KiB data
Total: 0 events (dropped 0), 5 KiB data
[op1@SVR2084HP360 ~]$ blkparse cciss_c0d1p1.blktrace.* > 1.log
[op1@SVR2084HP360 ~]$ grep 'A' 1.log|head -n 5
104,16 6 1 0.000000000 2569 A W 3711129946 + 8 <- (104,17) 3711129912
104,16 6 1 0.000000000 2569 A W 3711129946 + 8 <- (104,17) 3711129912
104,16 6 1 0.000000000 2569 A W 3711129946 + 8 <- (104,17) 3711129912
104,16 6 1 0.000000000 2569 A W 3711129946 + 8 <- (104,17) 3711129912
104,16 6 1 0.000000000 2569 A W 3711129946 + 8 <- (104,17) 3711129912
[op1@SVR2084HP360 ~]$ sudo debugfs -R 'icheck 463891239' /dev/cciss/c0d1p1
debugfs 1.39 (29-May-2006)
icheck 463891239 Block Inode number 463891239
[op1@SVR2084HP360 ~]$ sudo debugfs -R 'ncheck 228573462' /dev/cciss/c0d1p1
debugfs 1.39 (29-May-2006)
Inode Pathname
228573462 /current/BP-1593184750-192.168.82.55-1350991561852/current/finalized/subdir27/blk_878458967791221266
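
One step the transcript does not spell out is how the sector number from blkparse turns into the block number given to icheck. Judging from the numbers above, the partition-relative sector after the `<-` (3711129912) was divided by 8, which is what you get with 512-byte sectors and a 4096-byte filesystem block. Both sizes are assumptions here; the real block size can be read from the "Block size" line of `dumpe2fs -h`. The helper name `sector_to_fs_block` is made up for this sketch.

```python
def sector_to_fs_block(sector, sector_size=512, block_size=4096):
    """Convert a partition-relative sector number to a filesystem block number.

    Assumes the sector is counted from the start of the partition that holds
    the filesystem, not from the start of the whole disk.
    """
    sectors_per_block = block_size // sector_size
    return sector // sectors_per_block

# Partition-relative sector taken from the 'A' (remap) lines in the trace.
print(sector_to_fs_block(3711129912))  # 463891239, the block passed to icheck
```

Note that the whole-disk sector in the same trace line (3711129946) would give a different block number, so the partition-relative value is the one to use when debugfs is pointed at the partition device.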