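Recently we have been hitting frequent failures where disk I/O spikes, and we could never track down the cause. I wanted to use the iotop tool, but its official site (http://guichaz.free.fr/iotop/) says it only supports kernels 2.6.20 and later: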
Iotop is a Python program with a top like UI used to show of behalf of which process is the I/O going on. It requires Python ≥ 2.5 (or Python ≥ 2.4 with the ctypes module) and a Linux kernel ≥ 2.6.20 with the TASK_DELAY_ACCT CONFIG_TASKSTATS, TASK_IO_ACCOUNTING and CONFIG_VM_EVENT_COUNTERS options on.
Our servers, however, have mostly been running CentOS 5.5 with a 2.6.18 kernel:
[slide@test ~]$ uname -a
Linux eric 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
[slide@test ~]$ cat /etc/redhat-release
CentOS release 5.5 (Final)
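The comparison between the running kernel and the documented minimum can be sketched in a few lines. This is only an illustration written for Python 3 (not the Python 2.4 that ships with CentOS 5); the function names are made up for this post, and a plain version comparison says nothing about features a vendor kernel may have backported.

```python
import platform
import re

# Minimum kernel version quoted from the iotop site.
REQUIRED = (2, 6, 20)


def parse_kernel_release(release):
    """Return the leading numeric part of a release string as (major, minor, patch)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    # Missing components (for example "5.14") are treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def meets_requirement(release, required=REQUIRED):
    """True if the release is at least the required version."""
    return parse_kernel_release(release) >= required


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    verdict = "meets" if meets_requirement(release) else "is below"
    print(release, verdict, "the documented minimum")
```

For the CentOS 5.5 kernel shown above, `meets_requirement("2.6.18-194.32.1.el5")` is false, which matches the requirement as stated on the iotop site.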
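Fortunately, this kernel offers another way in: the file /proc/<pid>/io of a given process shows how much that process has read from and written to disk. For example: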
[root@test ~]# cat /proc/28483/io
rchar: 1860123170
wchar: 7677744972
syscr: 1983620
syscw: 11010065
read_bytes: 62033920
write_bytes: 12085121024
cancelled_write_bytes: 0
What each field means:
rchar (I/O counter: chars read)
    The number of bytes which this task has caused to be read from storage. This is simply the sum of bytes which this process passed to read() and pread(). It includes things like tty IO and it is unaffected by whether or not actual physical disk IO was required (the read might have been satisfied from pagecache).

wchar (I/O counter: chars written)
    The number of bytes which this task has caused, or shall cause, to be written to disk. Similar caveats apply here as with rchar.

syscr (I/O counter: read syscalls)
    Attempt to count the number of read I/O operations, i.e. syscalls like read() and pread().

syscw (I/O counter: write syscalls)
    Attempt to count the number of write I/O operations, i.e. syscalls like write() and pwrite().

read_bytes (I/O counter: bytes read)
    Attempt to count the number of bytes which this process really did cause to be fetched from the storage layer. Done at the submit_bio() level, so it is accurate for block-backed filesystems.

write_bytes (I/O counter: bytes written)
    Attempt to count the number of bytes which this process caused to be sent to the storage layer. This is done at page-dirtying time.

cancelled_write_bytes
    The big inaccuracy here is truncate. If a process writes 1MB to a file and then deletes the file, it will in fact perform no writeout. But it will have been accounted as having caused 1MB of write. In other words: the number of bytes which this process caused to not happen, by truncating pagecache. A task can cause "negative" IO too. If this task truncates some dirty pagecache, some IO which another task has been accounted for (in its write_bytes) will not be happening. We _could_ just subtract that from the truncating task's write_bytes, but there is information loss in doing that.
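Because these counters are cumulative, a single reading says little about what a process is doing right now; the useful number is the difference between two readings. Below is a minimal sketch of that idea in Python 3. The function names are made up for this post, it assumes the file has the `name: value` layout shown above, and reading another user's /proc/<pid>/io normally requires root.

```python
import time


def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def read_proc_io(pid):
    """Read and parse /proc/<pid>/io for a live process."""
    with open("/proc/%d/io" % pid) as handle:
        return parse_proc_io(handle.read())


def io_rates(before, after, seconds):
    """Bytes per second that reached the storage layer between two samples."""
    return {
        key: (after[key] - before[key]) / seconds
        for key in ("read_bytes", "write_bytes")
    }


def sample(pid, interval=1.0):
    """Take two readings `interval` seconds apart and return the rates."""
    before = read_proc_io(pid)
    time.sleep(interval)
    after = read_proc_io(pid)
    return io_rates(before, after, interval)
```

Calling `sample(28483)` in a loop gives a crude per-process view of disk throughput. It uses read_bytes and write_bytes rather than rchar and wchar, since the latter also count reads served from the page cache and tty I/O.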
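By chance I came across an article on vpsee, "How to view process I/O read/write activity" (http://www.vpsee.com/2009/08/monitor-process-io-activity/).
It includes a Python script that turns on block_dump, which makes the kernel dump block READ/WRITE activity to its log, so that the dmesg command can show which processes are reading from and writing to disk.
The script has a problem, though: it clears the dmesg kernel log every second, so the per-process figures it collects are inaccurate. Almost everything it reports belongs to the kjournald and pdflush processes, and it is hard to see what any specific process is doing to the disk.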
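One way around the sampling problem is to keep running totals across samples instead of judging each one-second snapshot by itself. The sketch below is not the vpsee script; it is a hypothetical Python 3 illustration that assumes block_dump log lines look like `kjournald(1234): WRITE block 12345 on sda1`, possibly with a log-level or timestamp prefix. The function names and the sample process names in the usage note are invented.

```python
import re
from collections import Counter

# Assumed line shape: <comm>(<pid>): READ|WRITE block <n> on <device>
BLOCK_LINE = re.compile(
    r"([^\s<>\[\]()]+)\((\d+)\): (READ|WRITE) block (\d+) on (\S+)"
)


def count_block_ops(log_text, totals=None):
    """Add one count per READ/WRITE line to totals, keyed by (comm, pid, op)."""
    totals = Counter() if totals is None else totals
    for line in log_text.splitlines():
        match = BLOCK_LINE.search(line)
        if match:
            comm, pid, op, _block, _device = match.groups()
            totals[(comm, int(pid), op)] += 1
    return totals


def top_processes(totals, limit=10, ignore=()):
    """Return the busiest (key, count) pairs, skipping the process names in ignore."""
    ranked = [
        (key, count)
        for key, count in totals.most_common()
        if key[0] not in ignore
    ]
    return ranked[:limit]
```

Passing the same `totals` object to `count_block_ops` on every sample accumulates the history even if the kernel buffer is cleared in between, and `top_processes(totals, ignore=("kjournald", "pdflush"))` hides the two kernel threads that otherwise dominate the output. Whether hiding them is appropriate depends on what is being investigated, since their writes are usually triggered by some user process.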
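kjournald is the journaling process of the ext3 file system.
pdflush synchronizes in-memory data with the file system: for example, when a file is modified in memory, pdflush is responsible for writing it back to disk.
Whenever dirty pages exceed 10% of memory, pdflush starts writing them back to disk. This ratio is tunable
through the vm.dirty_background_ratio entry in /etc/sysctl.conf (the default is 10), and the current value can be checked with
cat /proc/sys/vm/dirty_background_ratio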
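To check whether the value in effect matches what is configured, the two places can be read side by side. This is a small Python 3 sketch with made-up function names; it assumes /etc/sysctl.conf uses `key = value` lines with `#` or `;` comments, and it returns None when the key is not set there (in which case the kernel default applies).

```python
def sysctl_conf_value(conf_text, key):
    """Return the value assigned to key in sysctl.conf text, or None if unset."""
    value = None
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        name, sep, raw = line.partition("=")
        if sep and name.strip() == key:
            value = raw.strip()  # a later entry overrides an earlier one
    return value


def current_dirty_background_ratio(path="/proc/sys/vm/dirty_background_ratio"):
    """Read the value currently in effect from procfs."""
    with open(path) as handle:
        return int(handle.read().strip())


def configured_dirty_background_ratio(path="/etc/sysctl.conf"):
    """Read the configured value, or None if the file does not set it."""
    with open(path) as handle:
        value = sysctl_conf_value(handle.read(), "vm.dirty_background_ratio")
    return None if value is None else int(value)
```

If the two functions disagree, the file was probably edited without running `sysctl -p`, or the value was changed at runtime.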
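Checking things this way was not a real solution. Today I tried installing iotop on a CentOS 5.6 system directly with the system yum (yum install iotop), and to my surprise it installed, pulling in python-ctypes along the way: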
(1/2): iotop-0.4.3-4.el5.noarch.rpm
(2/2): python-ctypes-1.0.2-3.el5.x86_64.rpm
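python-ctypes is required because the system Python is version 2.4.3, and on Python 2.4 iotop needs the ctypes module (see the note from the official site quoted earlier).
After the installation, running iotop failed with an error: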
[root@KW-00227-SER02 ~]# iotop
No module named iotop.ui
To run an uninstalled copy of iotop, launch iotop.py in the top directory
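The cause is that we had separately installed Python 2.6.5, while iotop was installed through yum, and yum uses the system Python 2.4. As a result, the iotop command only works when it is run with Python 2.4.
The fix is to change the Python interpreter that the iotop script uses: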
vim /usr/bin/iotop
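Change the first line from #!/usr/bin/python to #!/usr/bin/python2.4. (This only works if /usr/bin/python2.4 actually exists. CentOS depends heavily on its system Python, so when installing another Python version, do not remove the system 2.4.)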
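The same edit can be scripted when several machines need it. The following Python 3 sketch uses made-up function names and, like the manual step, refuses to proceed if the target interpreter is missing; it rewrites the file in place, so keep a backup of /usr/bin/iotop first.

```python
import os


def rewrite_shebang(script_text, interpreter):
    """Return script_text with its first line replaced by a shebang for interpreter."""
    first, newline, rest = script_text.partition("\n")
    if not first.startswith("#!"):
        raise ValueError("script does not start with a shebang line")
    return "#!" + interpreter + newline + rest


def pin_interpreter(script_path, interpreter="/usr/bin/python2.4"):
    """Point the script at the given interpreter; return True if the file changed."""
    if not os.path.exists(interpreter):
        raise FileNotFoundError(interpreter)
    with open(script_path) as handle:
        original = handle.read()
    updated = rewrite_shebang(original, interpreter)
    if updated == original:
        return False
    with open(script_path, "w") as handle:
        handle.write(updated)
    return True
```

Running `pin_interpreter("/usr/bin/iotop")` as root has the same effect as the vim edit above. A later yum update of the iotop package may restore the original first line, in which case the step has to be repeated.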
Run iotop again:
DISK READ: 107.74 K/s | Total DISK WRITE: 4.49 M/s
  PID  PRIO  USER  DISK READ   DISK WRITE  SWAPIN   IO>      COMMAND
 3170  be/3  root  0.00 B/s    13.90 K/s   0.00 %   14.33 %  [jbd2/dm-1-8]
  351  be/3  root  0.00 B/s    0.00 B/s    0.00 %   0.61 %   [kblockd/2]
So the 2.6.18-238.el5 kernel shipped with CentOS 5.6 does support running iotop:
[root@king ~]# uname -a
Linux king 2.6.18-238.el5 #1 SMP Thu Jan 13 15:51:15 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@king ~]# cat /etc/redhat-release
CentOS release 5.6 (Final)
Original article: http://tech.foolpig.com/2012/06/16/linux-io/