First, take a look at the help output of the nmon command:
[root@linux nmon]# ./nmon.sh -h

Hint: nmon.sh [-h] [-s <seconds>] [-c <count>] [-f -d <disks> -t -r <name>] [-x]

  -h            show the full help; there are two modes:
                a) interactive mode (h)
                b) data-collection mode (-f)
  -f            spreadsheet output format [note: default -s300 -c288] optional
                (300 seconds * 288 snapshots = 86400 seconds = 60*60*24 = 1 day)
  -s <seconds>  seconds between screen refreshes [default 2]
  -c <number>   number of screen refreshes [default 1000000]
  -d <disks>    to increase the number of disks [default 256]
  -t            spreadsheet includes top processes
  -x            capacity planning (every 15 minutes for 1 day = -fdt -s 900 -c 96)

Version - nmon 14g

For interactive mode
  -s <seconds>  seconds between screen refreshes [default 2]
  -c <number>   number of screen refreshes [default 1000000]
  -g <filename> User Defined Disk Groups [hit g to show them]
                - file = on each line: group_name <disks list> space separated
                - like: database sdb sdc sdd sde
                - upto 64 disk groups, 512 disks per line
                - disks can appear more than once and in many groups
  -b            black and white display in interactive mode [default is colour]
  example: nmon.sh -s 1 -c 100
           (interactive mode, refreshing once per second, 100 snapshots in total)

For data-collection mode = spreadsheet format (comma separated values)
  Note: use only one of f,F,z,x or X and make it the first argument
  -f            spreadsheet output format [note: default -s300 -c288]
                output file is <hostname>_YYYYMMDD_HHMM.nmon
  -F <filename> same as -f but with a user supplied file name
  -r <runname>  used in the spreadsheet file [default hostname]
  -t            include top processes in the output
  -T            as -t plus saves command line arguments in UARG section
  -s <seconds>  seconds between snapshots
  -c <number>   number of snapshots
  -d <disks>    to increase the number of disks [default 256]
  -l <dpl>      disks/line default 150 to avoid spreadsheet issues. EMC=64.
  -g <filename> User Defined Disk Groups (see above) - see BBBG & DG lines
  -N            include NFS Network File System
  -I <percent>  Include process & disks busy threshold (default 0.1)
                don't save or show proc/disk using less than this percent
  -m <directory> directory in which the generated data file is written
  example: collect top procs at 30 second intervals for 1 hour
           nmon.sh -f -t -r Test1 -s30 -c120

To load into a spreadsheet:
  sort -A *nmon >stats.csv
  transfer the stats.csv file to your PC
  Start spreadsheet & then Open type=comma-separated-value ASCII file
  The nmon analyser or consolidator does not need the file sorted.
Capacity planning mode - use cron to run each day
  -x   sensible spreadsheet output for CP = one day
       every 15 minutes for 1 day ( i.e. -ft -s 900 -c 96)
  -X   sensible spreadsheet output for CP = busy hour
       every 30 seconds for 1 hour ( i.e. -ft -s 30 -c 120)

Interactive mode commands
  key --- Toggles to control what is displayed ---
  h = online help information
  r = machine type, machine name, cache details and OS version + LPAR
  c = CPU by processor stats with bar graphs
  l = long term CPU (over 75 snapshots) with bar graphs
  m = memory stats
  L = huge memory page stats
  V = virtual memory and swap stats
  k = kernel internal stats
  n = network stats and errors
  N = NFS Network File System
  d = disk I/O graphs
  D = disk I/O stats
  o = disk I/O map (one character per disk showing how busy it is)
  j = file systems
  t = top process stats, use 1,3,4,5 to select the data and order
  u = top process full command details
  v = verbose simple checks - OK/Warn/Danger
  b = black and white mode (or use the -b option)
  . = minimum mode, i.e. only busy disks and processes

  key --- Other Controls ---
  + = double the screen refresh time
  - = halve the screen refresh time
  q = quit (also x, e or control-C)
  0 = reset peak counts to zero (peak = ">")
  space = refresh the screen now

Startup Control
  If you find you always type the same toggles every time you start
  then place them in the NMON shell variable. For example:
    export NMON=cmdrvtan

Others:
  a) To you want to stop nmon - kill -USR2 <nmon-pid>
  b) Use -p and nmon outputs the background process pid
  c) To limit the processes nmon lists (online and to a file)
     Either set NMONCMD0 to NMONCMD63 to the program names
     or use -C cmd:cmd:cmd etc. example: -C ksh:vi:syncd
  d) If you want to pipe nmon output to other commands use a FIFO:
       mkfifo /tmp/mypipe
       nmon -F /tmp/mypipe &
       grep /tmp/mypipe
  e) If nmon fails please report it with:
     1) nmon version like: 14g
     2) the output of cat /proc/cpuinfo
     3) some clue of what you were doing
     4) I may ask you to run the debug version

Developer Nigel Griffiths
Feedback welcome - on the current release only and state exactly the problem
No warranty given or implied.
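The interval (-s) and count (-c) options multiply out to the total capture time, which is how the help text arrives at one day for the -f default and for -x, and one hour for -X. A small sketch of that arithmetic in shell follows; the function names nmon_duration and nmon_duration_hms are made up for this illustration and are not part of nmon.

```shell
#!/bin/sh
# nmon_duration SECONDS COUNT
# Prints the total capture length in seconds (interval * number of snapshots).
nmon_duration() {
    echo $(( $1 * $2 ))
}

# nmon_duration_hms SECONDS COUNT
# Prints the same value formatted as HH:MM:SS.
nmon_duration_hms() {
    total=$(( $1 * $2 ))
    printf '%02d:%02d:%02d\n' \
        $(( total / 3600 )) $(( total % 3600 / 60 )) $(( total % 60 ))
}

nmon_duration 300 288       # -f default: prints 86400
nmon_duration_hms 900 96    # -x:         prints 24:00:00
nmon_duration_hms 30 120    # -X:         prints 01:00:00
```

Running the numbers before starting a long capture is a quick way to confirm that a chosen -s/-c pair really covers the intended window.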
[root@linux nmon]# ./nmon.sh

+nmon-14g------[H for help]---Hostname=linux--------Refresh= 2secs ---04:22.50----+

  ------------------------------       For help type H or ...
  #    #  #    #   ####   #    #        nmon -?  - hint
  ##   #  ##  ##  #    #  ##   #        nmon -h  - full
  # #  #  # ## #  #    #  # #  #
  #  # #  #    #  #    #  #  # #       To start the same way every time
  #   ##  #    #  #    #  #   ##        set the NMON ksh variable
  #    #  #    #   ####   #    #
  ------------------------------

  Use these keys to toggle statistics on/off:
    c = CPU        l = CPU Long-term   - = Faster screen updates
    m = Memory     j = Filesystems     + = Slower screen updates
    d = Disks      n = Network         V = Virtual Memory
    r = Resource   N = NFS             v = Verbose hints
    k = kernel     t = Top-processes   . = only busy disks/procs
    h = more options                   q = Quit

The start screen summarizes the available keys. Refresh= 2secs means the display refreshes every 2 seconds; pass -s to nmon to choose a different interval. Press h for a more detailed list of keys:
+nmon-14g------[H for help]---Hostname=linux--------Refresh= 2secs ---04:27.49----+
HELP ------------------------------------------------------------------------------
  key --- statistics which toggle on/off ---
  h = This help information
  r = RS6000/pSeries CPU/cache/OS/kernel/hostname details + LPAR
  t = Top Process Stats 1=basic 3=CPU
  u = shows command arguments (hit twice to refresh)
  c = CPU by processor             l = longer term CPU averages
  m = Memory & Swap stats L=Huge   j = JFS Usage Stats
  n = Network stats                N = NFS
  d = Disk I/O Graphs D=Stats      o = Disks %Busy Map
  k = Kernel stats & loadavg       V = Virtual Memory
  g = User Defined Disk Groups [start nmon with -g <filename>]
  v = Verbose Simple Checks - OK/Warnings/Danger
  b = black & white mode
  --- controls ---
  + and - = double or half the screen refresh time
  q = quit                         space = refresh screen now
  . = Minimum Mode =display only busy disks and processes
  0 = reset peak counts to zero (peak = ">")
  Developer Nigel Griffiths see http://nmon.sourceforge.net

This lists every key and what it displays. Keys are toggles: pressing h once shows the help panel and pressing h again hides it, and the other keys work the same way. Now press r (machine type, machine name, cache details and OS version + LPAR):
Linux and Processor Details -------------------------------------------------------
  Linux: Linux version 2.6.18-164.el5 ([email protected])
  Build: (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))
  Release  : 2.6.18-164.el5
  Version  : #1 SMP Tue Aug 18 15:51:54 EDT 2009
  cpuinfo: model name : Intel(R) Core(TM) i3-2310M CPU @ 2.10GHz
  cpuinfo: vendor_id : GenuineIntel
  cpuinfo: cpu MHz : 2093.260
  cpuinfo: bogomips : 4186.52
  # of CPUs: 1                                   -- 1 CPU
  Machine  : i686
  Nodename : linux                               -- hostname
  /etc/*ease[1]: Red Hat Enterprise Linux Server release 5.4 (Tikanga)   -- OS version
  /etc/*ease[2]: (null)
  /etc/*ease[3]: (null)
  /etc/*ease[4]: (null)
  lsb_release: Distributor ID: RedHatEnterpriseServer
  lsb_release: Description: Red Hat Enterprise Linux Server release 5.4 (Tikanga)
  lsb_release: Release: 5.4
  lsb_release: Codename: Tikanga
+---------Warning: Some Statistics may not shown------------------------------------+

This panel shows host and operating system details. Press r again to hide it, then press t (top process stats; use 1, 3, 4, 5 to select the data and order), followed by the digit 5:
Top Processes Procs=85 mode=5 (1=Basic, 3=Perf 4=Size 5=I/O)------------------------
  PID  %CPU   Size    Res   Res    Res  Res  Shared  Faults    Command
       Used     KB    Set  Text   Data  Lib      KB  Min  Maj
 4050   0.5  12748  10548   108  10896    0     832   84    0  nmon.sh
    1   0.0   2072    624    32    280    0     532    0    0  init
    2   0.0      0      0     0      0    0       0    0    0  migration/0
    3   0.0      0      0     0      0    0       0    0    0  ksoftirqd/0
    4   0.0      0      0     0      0    0       0    0    0  watchdog/0
    5   0.0      0      0     0      0    0       0    0    0  events/0
    6   0.0      0      0     0      0    0       0    0    0  khelper
    7   0.0      0      0     0      0    0       0    0    0  kthread
   10   0.0      0      0     0      0    0       0    0    0  kblockd/0
   11   0.0      0      0     0      0    0       0    0    0  kacpid
   67   0.0      0      0     0      0    0       0    0    0  cqueue/0
   70   0.0      0      0     0      0    0       0    0    0  khubd
   72   0.0      0      0     0      0    0       0    0    0  kseriod
  136   0.0      0      0     0      0    0       0    0    0  pdflush
  137   0.0      0      0     0      0    0       0    0    0  pdflush
  138   0.0      0      0     0      0    0       0    0    0  kswapd0
  139   0.0      0      0     0      0    0       0    0    0  aio/0
+---------Warning: Some Statistics may not shown------------------------------------+

Note mode=5 in the title: the list is now ordered by I/O. The other modes (1, 3, 4) select different data and ordering; 3, for example, sorts by CPU usage in descending order. Procs=85 shows that the system has 85 processes. Next, press u (full command details of the top processes):
Top Processes Procs=85 mode=5 (1=Basic, 3=Perf 4=Size 5=I/O)------------------------
  PID  %CPU  ResSize  Command
       Used       KB
 4050   1.0    10660  ./nmon.sh
    1   0.0      624  init [3]
    2   0.0        0  [migration/0]
    3   0.0        0  [ksoftirqd/0]
    4   0.0        0  [watchdog/0]
    5   0.0        0  [events/0]
    6   0.0        0  [khelper]
    7   0.0        0  [kthread]
   10   0.0        0  [kblockd/0]
   11   0.0        0  [kacpid]
   67   0.0        0  [cqueue/0]
   70   0.0        0  [khubd]
   72   0.0        0  [kseriod]
  136   0.0        0  [pdflush]
  137   0.0        0  [pdflush]
  138   0.0        0  [kswapd0]
  139   0.0        0  [aio/0]
+---------Warning: Some Statistics may not shown------------------------------------+

This output is self-explanatory, so I will not go through it. Next, c (CPU by processor with bar graphs):
CPU Utilisation --------------------------------------------------------------------
|---------------------------+-------------------------------------------------+
|CPU User%  Sys% Wait%  Idle|0          |25         |50          |75       100|
|  1   0.0   0.0   0.0 100.0| >                                               |
|---------------------------+-------------------------------------------------+

The system is completely idle (Idle = 100%). The ">" marks the peak CPU usage seen so far; pressing 0 resets the peak to zero. Next, l (long-term CPU bar graph):
CPU  +-------------------------------------------------------------------------+
100%-|                                                                         |
 95%-|                                                                         |
 90%-|                                                                         |
 85%-|                                                                         |
 80%-|                                                                         |
 75%-|                                                                         |
 70%-|                                                                         |
 65%-|                                                                         |
 60%-|                                                                         |
 55%-|                                                                         |
 50%-|                                                                         |
 45%-|                                                                         |
 40%-|                                                                         |
 35%-|                                                                         |
 30%-|                                                                         |
 25%-|                                                                         |
 20%-|                                                                         |
 15%-|                                                                         |
 10%-|                                                                         |
+---------Warning: Some Statistics may not shown------------------------------------+

This is another view of CPU usage; the "|" here works on the same principle as the ">" in the previous panel. Next, m (memory stats):
Memory Stats -----------------------------------------------------------------------
                  RAM     High      Low     Swap    Page Size=4 KB
  Total MB      503.3      0.0    503.3   1027.6
  Free  MB      192.3      0.0    192.3   1027.6
  Free Percent  38.2%     0.0%    38.2%   100.0%
                   MB                  MB                   MB
                       Cached=      200.6   Active=      107.4
  Buffers=       46.4  Swapcached=    0.0   Inactive =   169.7
  Dirty  =        0.2  Writeback =    0.0   Mapped   =     9.5
  Slab   =       26.8  Commit_AS =  130.0   PageTables=    1.3

Swap is 100% free (completely unused), and 38.2% of physical RAM is free. Next, L (huge page stats):
Large (Huge) Page Stats ------------------------------------------------------------
  There are no Huge Pages
  - see /proc/meminfo

I do not fully understand this panel myself; on this host it simply reports that there are no huge pages and points to /proc/meminfo. Next, j (filesystems):
Filesystems ------------------------------------------------------------------------
Filesystem                 SizeMB  FreeMB  Use%  Type      MountPoint
/dev/sda3                   48502   36378   21%  ext3      /
/proc                           -       -     -  proc      not a real filesystem
/sys                            -       -     -  sysfs     not a real filesystem
/dev/pts                        -       -     -  devpts    not a real filesystem
/dev/sda1                      99      82   12%  ext3      /boot
/dev/shm                        -       -     -  tmpfs     not a real filesystem
/proc/sys/fs/binfmt_misc        -       -     -  binfmt_m  not a real filesystem
/var/lib/nfs/rpc_pipefs   rpc_pipe size=zero blocks!

This shows disk usage per filesystem and is easy to read. Next, n (network stats and errors):
Network I/O ------------------------------------------------------------------------
I/F Name  Recv=KB/s  Trans=KB/s  packin  packout  insize  outsize  Peak->Recv  Trans
      lo        0.0         0.0     0.0      0.0     0.0      0.0         0.2    0.2
    eth0        0.0         0.1     0.5      0.5    60.0    218.0        52.4   99.0
    sit0        0.0         0.0     0.0      0.0     0.0      0.0         0.0    0.0
Network Error Counters -------------------------------------------------------------
I/F Name  iErrors  iDrop  iOverrun  iFrame  oErrors  oDrop  oOverrun  oCarrier  oColls
      lo        0      0         0       0        0      0         0         0       0
    eth0        0      0         0       0        0      0         0         0       0
    sit0        0      0         0       0        0      0         0         0       0

Next, N (NFS Network File System):
Network Filesystem (NFS) I/O Operations per second ---------------------------------
  Version 2   Client  Server    Version 3     Client  Server
  null           0.0     0.0    null             0.0     0.0
  getattr        0.0     0.0    getattr          0.0     0.0
  setattr        0.0     0.0    setattr          0.0     0.0
  root           0.0     0.0    lookup           0.0     0.0
  lookup         0.0     0.0    access           0.0     0.0
  readlink       0.0     0.0    readlink         0.0     0.0
  read           0.0     0.0    read             0.0     0.0
  wrcache        0.0     0.0    write            0.0     0.0
  write          0.0     0.0    create           0.0     0.0
  create         0.0     0.0    mkdir            0.0     0.0
  remove         0.0     0.0    symlink          0.0     0.0
  rename         0.0     0.0    mknod            0.0     0.0
  link           0.0     0.0    remove           0.0     0.0
  symlink        0.0     0.0    rmdir            0.0     0.0
  mkdir          0.0     0.0    rename           0.0     0.0
  rmdir          0.0     0.0    link             0.0     0.0
  readdir        0.0     0.0    readdir          0.0     0.0
  fsstat         0.0     0.0    readdirplus      0.0     0.0
+---------Warning: Some Statistics may not shown------------------------------------+

Next, d (disk I/O graphs):
Disk I/O --/proc/diskstats----mostly in KB/s-----Warning:contains duplicates--------
|DiskName Busy  Read WriteKB|0          |25         |50          |75       100|
|sda        0%   0.0     0.0|>                                                |
|sda1       0%   0.0     0.0|>                                                |
|sda2       0%   0.0     0.0|>                                                |
|sda3       0%   0.0     0.0|>                                                |
|Totals Read-MB/s=0.0 Writes-MB/s=0.0 Transfers/sec=0.0

Next, D (disk I/O stats):
Disk I/O --/proc/diskstats----mostly in KB/s-----Warning:contains duplicates--------
DiskName  Busy  Read  Write      Xfers  Size    Peak%  Peak-RW   InFlight
sda         0%   0.0  0.0KB/s      0.0  0.0KB      0%  8.0KB/s          0
sda1        0%   0.0  0.0KB/s      0.0  0.0KB      0%  0.0KB/s          0
sda2        0%   0.0  0.0KB/s      0.0  0.0KB      0%  0.0KB/s          0
sda3        0%   0.0  0.0KB/s      0.0  0.0KB      0%  8.0KB/s          0
Totals Read-MB/s=0.0 Writes-MB/s=0.0 Transfers/sec=0.0

Next, o (disk I/O map):
Disk %Busy Map --Key: @=90 #=80 X=70 8=60 O=50 0=40 o=30 +=20 -=10 .=5 _=0%---------
Disk No.               1         2         3         4         5         6
Disks=4      0123456789012345678901234567890123456789012345678901234567890123
disk 0 to 63 ____

Next, k (kernel internal stats):
Kernel Stats -----------------------------------------------------------------------
  RunQueue              1   Load Average    CPU use since boot time
  ContextSwitch   30228.5    1 mins  0.00   Uptime Days=  0 Hours= 2 Mins=29
  Forks              25.9    5 mins  0.00   Idle   Days=  0 Hours= 2 Mins=28
  Interrupts     738415.7   15 mins  0.00   Average CPU use=  0.98%

Next, V (virtual memory and swap stats):
Virtual-Memory ---------------------------------------------------------------------
nr_dirty    =    0   pgpgin      =  0                      High  Normal  DMA
nr_writeback=    0   pgpgout     =  0   alloc                 0       0    0
nr_unstable =    0   pgpswpin    =  0   refill                0       0    0
nr_table_pgs=  330   pgpswpout   =  0   steal                 0       0    0
nr_mapped   = 2438   pgfree      =  0   scan_kswapd           0       0    0
nr_slab     = 6852   pgactivate  =  0   scan_direct           0       0    0
                     pgdeactivate=  0
allocstall   =   0   pgfault     =  7   kswapd_steal     =    0
pageoutrun   =   0   pgmajfault  =  0   kswapd_inodesteal=    0
slabs_scanned=   0   pgrotated   =  0   pginodesteal     =    0

Next, v (verbose simple checks):
Verbose Mode -----------------------------------------------------------------------
  Code    Resource         Stats    Now    Warn   Danger
  OK ->   CPU %busy                 0.0%   >80%   >90%
  OK ->   Top Disk %busy            0.0%   >40%   >60%

HELP -------------------------------------------------------------------------------

This panel gives a quick health check for CPU and disk, with Warn and Danger thresholds. CPU busy above 80% is a warning and above 90% is danger; likewise, the busiest disk above 40% is a warning and above 60% is danger.
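The Warn/Danger columns on this screen and the key line of the %Busy map (the o screen) both bucket a busy percentage against fixed thresholds. The sketch below mirrors the numbers printed on those two screens. The function names check_busy and busy_char are hypothetical, and how nmon itself handles the exact boundaries is an assumption: strict greater-than is used for Warn/Danger because the screen prints ">80%", and inclusive lower bounds are assumed for the map characters.

```shell
#!/bin/sh
# check_busy KIND PCT
# KIND is "cpu" or "disk"; thresholds are copied from the Verbose Mode screen.
check_busy() {
    case $1 in
        cpu)  warn=80; danger=90 ;;
        disk) warn=40; danger=60 ;;
        *)    echo "unknown resource: $1" >&2; return 2 ;;
    esac
    # awk is used so that fractional values such as 0.0 or 85.5 compare correctly.
    awk -v v="$2" -v w="$warn" -v d="$danger" \
        'BEGIN { if (v > d) print "Danger"; else if (v > w) print "Warn"; else print "OK" }'
}

# busy_char PCT
# Maps an integer busy percentage to the character used in the %Busy map key:
# @=90 #=80 X=70 8=60 O=50 0=40 o=30 +=20 -=10 .=5 _=0
# Assumption: each value in the key is an inclusive lower bound.
busy_char() {
    if   [ "$1" -ge 90 ]; then echo '@'
    elif [ "$1" -ge 80 ]; then echo '#'
    elif [ "$1" -ge 70 ]; then echo 'X'
    elif [ "$1" -ge 60 ]; then echo '8'
    elif [ "$1" -ge 50 ]; then echo 'O'
    elif [ "$1" -ge 40 ]; then echo '0'
    elif [ "$1" -ge 30 ]; then echo 'o'
    elif [ "$1" -ge 20 ]; then echo '+'
    elif [ "$1" -ge 10 ]; then echo '-'
    elif [ "$1" -ge 5  ]; then echo '.'
    else                       echo '_'
    fi
}

check_busy cpu 0.0    # the idle host shown above
check_busy disk 45
busy_char 0           # matches the "____" row for the four idle disks
```

The same two functions could be applied to values pulled from a recorded .nmon file, but the parsing of that file is outside the scope of this sketch.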
nmon can collect very detailed system information, but in practice it is used more often to record samples and generate reports from them.
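Since reports are built from recorded samples, the usual starting point is data-collection mode (-f). Here is a minimal sketch that uses only options listed in the help output above (-f, -t, -s, -c, -m) and prints the command instead of running it, so it can be reviewed first. The function names, the /tmp directory and the example timestamp are hypothetical; the file name pattern <hostname>_YYYYMMDD_HHMM.nmon is the one stated in the help text.

```shell
#!/bin/sh
# nmon_collect_cmd INTERVAL COUNT DIR
# Prints a data-collection command line. -f comes first because the help text
# says the mode flag must be the first argument.
nmon_collect_cmd() {
    echo "./nmon.sh -f -t -s $1 -c $2 -m $3"
}

# nmon_outfile HOST [YYYYMMDD_HHMM]
# Prints the file name -f is documented to produce; the timestamp defaults
# to the current time.
nmon_outfile() {
    stamp=${2:-$(date +%Y%m%d_%H%M)}
    echo "${1}_${stamp}.nmon"
}

nmon_collect_cmd 30 120 /tmp        # 30 s interval, 120 snapshots = 1 hour
nmon_outfile linux 20240101_0930    # hypothetical start time
```

The resulting .nmon file is comma-separated text; as the help output notes, it can be sorted and opened in a spreadsheet, or passed unsorted to the nmon analyser.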