
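Need to monitor the health of a Linux server? Try these built-in commands and a few add-on tools. Most Linux distributions ship with plenty of monitoring tools that provide quantifiable metrics about system activity. You can use them to find the possible causes of a performance problem. The commands discussed below are some of the most basic ones for system analysis and server debugging, for example:

  1. Finding bottlenecks.
  2. Disk (storage) bottlenecks.
  3. CPU and memory bottlenecks.
  4. Network bottlenecks.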

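#1: top - Process Activity Command

The top command provides a dynamic, real-time view of a running system, i.e. the actual process activity. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds.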

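Hot Key  Usage
t        Toggles the summary information display.
m        Toggles the memory information display.
A        Sorts the display by top consumers of various system resources. Useful for quickly identifying performance-hungry tasks.
f        Enters the interactive configuration screen for top. Useful for setting up top for a specific task.
o        Lets you interactively select the ordering within top.
r        Issues the renice command.
k        Issues the kill command.
z        Toggles color/mono display.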

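#2: vmstat - System Activity, Hardware and System Information

The vmstat command reports information about processes, memory, paging, block IO, traps, and CPU activity.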
# vmstat 3

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0     0     0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0     0     6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0     0   536 1189  932  1  0 98  0  0
 0  0      0 2538444 522188 5130588    0    0     0     0 1187 1417  4  1 96  0  0
 0  0      0 2490060 522188 5130640    0    0     0    18 1253 1123  5  1 94  0  0
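
When vmstat runs with an interval, it is often handy to reduce the samples to a single number. The following is a minimal sketch that averages the "id" (idle) column. It assumes the 17-column layout shown above, where "id" is the 15th field, and the function name avg_idle is made up for illustration. Note that the first report vmstat prints contains averages since the last reboot, so it is included in the mean here.

```shell
# avg_idle: read vmstat output on stdin and print the mean of the "id" column.
# Assumes the 17-column layout shown above ("id" is field 15).
avg_idle() {
    # Header lines start with "procs" or "r", so only numeric rows are counted.
    awk '$1 ~ /^[0-9]+$/ { sum += $15; n++ }
         END { if (n) printf "%.1f\n", sum / n }'
}

# Typical use on a live system (5 samples, 3 seconds apart):
#   vmstat 3 5 | avg_idle
```

A consistently low idle percentage combined with a high "r" (run queue) value would point toward a CPU bottleneck.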


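Display slab memory utilization (slabinfo)

# vmstat -m

Get information about active / inactive memory pages

# vmstat -a

#3: w - Find Out Who Is Logged On and What They Are Doing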

The w command displays information about the users currently logged on to the machine and their processes.
# w username
# w vivek


 17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0       14:55    5.00s  0.04s  0.02s vim /etc/resolv.conf
root     pts/1       17:43    0.00s  0.03s  0.00s w

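#4: uptime - Tell How Long the System Has Been Running

The uptime command shows how long the server has been running: the current time, how long the system has been up, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.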
# uptime

 18:02:41 up 41 days, 23:42,  1 user,  load average: 0.00, 0.00, 0.00

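A load of 1 can be considered the optimal value for a single CPU. Acceptable load differs from system to system. As a rough rule of thumb, a load of 1 - 3 on a single-CPU system and 6 - 10 on an SMP system might also be acceptable.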
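
Because the acceptable load depends on the number of CPUs, it can help to divide the load average by the CPU count. The following is a minimal sketch, assuming a Linux system that exposes /proc/loadavg and has the coreutils nproc command; the helper name load_per_cpu is made up for illustration.

```shell
# load_per_cpu LOAD NCPU: print LOAD divided by NCPU, to two decimals.
# (hypothetical helper, not part of any standard tool)
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# Use the 1-minute load average and the number of available CPUs, if both exist.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    echo "1-minute load per CPU: $(load_per_cpu "$one" "$(nproc)")"
fi
```

A result around 1.00 or below suggests the CPUs are keeping up, while sustained values well above it suggest that processes are queuing for CPU time.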

#5: ps - Display the Processes

The ps command reports a snapshot of the current processes. To select all processes, use the -A or -e option:
# ps -A

  PID TTY          TIME CMD
    1 ?        00:00:02 init
    2 ?        00:00:02 migration/0
    3 ?        00:00:01 ksoftirqd/0
    4 ?        00:00:00 watchdog/0
    5 ?        00:00:00 migration/1
    6 ?        00:00:15 ksoftirqd/1
 4881 ?        00:53:28 java
 4885 tty1     00:00:00 mingetty
 4886 tty2     00:00:00 mingetty
 4887 tty3     00:00:00 mingetty
 4888 tty4     00:00:00 mingetty
 4891 tty5     00:00:00 mingetty
 4892 tty6     00:00:00 mingetty
 4893 ttyS1    00:00:00 agetty
12853 ?        00:00:00 cifsoplockd
12854 ?        00:00:00 cifsdnotifyd
14231 ?        00:10:34 lighttpd
14232 ?        00:00:00 php-cgi
54981 pts/0    00:00:00 vim
55465 ?        00:00:00 php-cgi
55546 ?        00:00:00 bind9-snmp-stat
55704 pts/1    00:00:00 ps

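ps is similar to top but provides more information.

Show long format output

# ps -Al

Turn on extra full mode (shows the command-line arguments passed to each process)

# ps -AlF

View threads (LWP and NLWP)

# ps -eLf

Show threads after processes

# ps -AlLm

Print all processes on the server

# ps ax
# ps axu

Print a process tree

# ps -ejH
# ps axjf
# pstree

Print security information

# ps -eo euser,ruser,suser,fuser,f,comm,label
# ps axZ
# ps -eM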

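View all processes running as user vivek

# ps -U vivek -u vivek u

Set output in a user-defined format

# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
# ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
# ps -eopid,tt,user,fname,tmout,f,wchan

Display only the process IDs of lighttpd

# ps -C lighttpd -o pid=

Or:

# pgrep lighttpd

List the process IDs of php-cgi processes owned by user vivek

# pgrep -u vivek php-cgi

Display the name of the process with PID 55977

# ps -p 55977 -o comm=

Find the top 10 memory-consuming processes

# ps auxf | sort -nr -k 4 | head -10

Find the top 10 CPU-consuming processes

# ps auxf | sort -nr -k 3 | head -10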

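#6: free - Memory Usage

The free command displays the total amount of free and used physical memory and swap space in the system, as well as the buffers used by the kernel.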
# free

            total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
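
The "-/+ buffers/cache" row is derived from the "Mem:" row: buffers and cache are subtracted from "used" and added to "free", since the kernel can reclaim that memory when applications need it. The arithmetic for the sample output above can be checked with a small sketch (the function names are illustrative only, and values are in kilobytes as printed by free):

```shell
# Memory actually used by applications: used - buffers - cached
app_used() {
    echo $(( $1 - $2 - $3 ))
}

# Memory effectively available to applications: free + buffers + cached
app_free() {
    echo $(( $1 + $2 + $3 ))
}

app_used 9739664 523124 5154740   # prints 4061800
app_free 2563232 523124 5154740   # prints 8241096
```

Both results match the "-/+ buffers/cache" row in the sample, which is usually the more meaningful figure when judging memory pressure.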


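#7: iostat - Average CPU Load, Disk Activity

The iostat command reports central processing unit (CPU) statistics and input/output statistics for devices, partitions, and network filesystems (NFS).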
# iostat

Linux 2.6.18-128.1.14.el5 ( 	06/26/2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.50    0.09    0.51    0.03    0.00   95.86

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22.04        31.88       512.03   16193351  260102868
sda1              0.00         0.00         0.00       2166        180
sda2             22.04        31.87       512.03   16189010  260102688
sda3              0.00         0.00         0.00       1615          0
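
The Blk_read/s and Blk_wrtn/s columns are expressed in blocks per second. Assuming 512-byte blocks (sectors), which is what the iostat documentation describes for kernels 2.4 and later, a rate can be converted to kilobytes per second with a small sketch like this (blocks_to_kb is a hypothetical helper name):

```shell
# blocks_to_kb BLOCKS: convert a count of 512-byte blocks to kilobytes.
# The 512-byte block size is an assumption; check your iostat documentation.
blocks_to_kb() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 512 / 1024 }'
}

# Write rate of sda from the sample above, in KB/s (roughly half the block rate).
blocks_to_kb 512.03
```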

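#8: sar - Collect and Report System Activity

The sar command collects, reports, and saves system activity information. To see network counters, enter:
# sar -n DEV | more
To display the network counters from the 24th:
# sar -n DEV -f /var/log/sa/sa24 | more
You can also display real-time usage with sar. The following takes 5 samples at 4-second intervals:
# sar 4 5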

Linux 2.6.18-128.1.14.el5 ( 		06/26/2009

06:45:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
06:45:16 PM       all      2.00      0.00      0.22      0.00      0.00     97.78
06:45:20 PM       all      2.07      0.00      0.38      0.03      0.00     97.52
06:45:24 PM       all      0.94      0.00      0.28      0.00      0.00     98.78
06:45:28 PM       all      1.56      0.00      0.22      0.00      0.00     98.22
06:45:32 PM       all      3.53      0.00      0.25      0.03      0.00     96.19
Average:          all      2.02      0.00      0.27      0.01      0.00     97.70

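#9: mpstat - Multiprocessor Usage

The mpstat command displays activity for each available processor, processor 0 being the first one. mpstat -P ALL displays the utilization of each CPU individually: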
# mpstat -P ALL

Linux 2.6.18-128.1.14.el5 (	 	06/26/2009

06:48:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00   95.86   1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00   96.04   1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00   96.28     34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00   95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00   95.46     44.80
06:48:11 PM    4    2.96    0.07    0.29    0.04    0.02    0.10    0.00   96.52     25.91
06:48:11 PM    5    3.26    0.08    0.28    0.03    0.01    0.10    0.00   96.23     14.98
06:48:11 PM    6    4.00    0.10    0.34    0.01    0.00    0.13    0.00   95.42      3.75
06:48:11 PM    7    3.30    0.11    0.39    0.03    0.01    0.46    0.00   95.69     76.89

#10: pmap - Process Memory Usage

The pmap command reports the memory map of a process. Use it to track down the causes of memory bottlenecks.
# pmap -d PID
To display process memory information for PID 47394, enter:
# pmap -d 47394
Sample output:

47394:   /usr/bin/php-cgi
Address           Kbytes Mode  Offset           Device    Mapping
0000000000400000    2584 r-x-- 0000000000000000 008:00002 php-cgi
0000000000886000     140 rw--- 0000000000286000 008:00002 php-cgi
00000000008a9000      52 rw--- 00000000008a9000 000:00000   [ anon ]
0000000000aa8000      76 rw--- 00000000002a8000 008:00002 php-cgi
000000000f678000    1980 rw--- 000000000f678000 000:00000   [ anon ]
000000314a600000     112 r-x-- 0000000000000000 008:00002
000000314a81b000       4 r---- 000000000001b000 008:00002
000000314a81c000       4 rw--- 000000000001c000 008:00002
000000314aa00000    1328 r-x-- 0000000000000000 008:00002
000000314ab4c000    2048 ----- 000000000014c000 008:00002
00002af8d48fd000       4 rw--- 0000000000006000 008:00002
00002af8d490c000      40 r-x-- 0000000000000000 008:00002
00002af8d4916000    2044 ----- 000000000000a000 008:00002
00002af8d4b15000       4 r---- 0000000000009000 008:00002
00002af8d4b16000       4 rw--- 000000000000a000 008:00002
00002af8d4b17000  768000 rw-s- 0000000000000000 000:00009 zero (deleted)
00007fffc95fe000      84 rw--- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 933712K    writeable/private: 4304K    shared: 768000K

The last line is very important:

  mapped: 933712K - total amount of memory mapped to files
  writeable/private: 4304K - the amount of private address space
  shared: 768000K - the amount of address space this process is sharing with others
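
When comparing several processes, it can be convenient to pull just the writeable/private figure out of that last line. A minimal sketch, assuming the summary line has the layout shown in the sample output above (the function name private_kb is made up for illustration):

```shell
# private_kb: read `pmap -d` output on stdin and print the writeable/private
# size in KB. Assumes the summary line layout shown in the sample above.
private_kb() {
    awk '/^mapped:/ { sub(/K$/, "", $4); print $4 }'
}

# Typical use on a live system:
#   pmap -d 47394 | private_kb
echo 'mapped: 933712K    writeable/private: 4304K    shared: 768000K' | private_kb
```

For the sample line this prints 4304, the private address space of the php-cgi process in KB.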

#11 and #12: netstat and ss - Network Statistics

The netstat command displays network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. The ss command dumps socket statistics and can show information similar to netstat.

#13: iptraf - Real-time Network Statistics

The iptraf command is an interactive, colorful IP LAN monitor. It is an ncurses-based tool that generates various network statistics, including TCP info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, and others. It can provide the following information in an easy-to-read format:

  Network traffic statistics by TCP connection
  IP traffic statistics by network interface
  Network traffic statistics by protocol
  Network traffic statistics by TCP/UDP port and by packet size
  Network traffic statistics by Layer2 address

#14: tcpdump - Detailed Network Traffic Analysis

The tcpdump command is a simple tool that dumps traffic on a network. However, you need a good understanding of the TCP/IP protocols to use it effectively. For example, to display traffic information about DNS, enter:
# tcpdump -i eth1 'udp port 53'
To display all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets, enter:
# tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
To display all FTP session to, enter:
# tcpdump -i eth1 'dst and (port 21 or 20)'
To display all HTTP session to
# tcpdump -ni eth0 'dst and tcp and port http'
To capture packets to a file that can later be opened in Wireshark for detailed analysis, enter:
# tcpdump -n -i eth1 -s 0 -w output.txt src or dst port 80

#15: strace - System Calls

The strace command traces system calls and signals. This is useful for debugging web server and other server problems, since it lets you trace a running process and see what it is doing.

#16: /proc file system - Various Kernel Statistics

The /proc file system provides detailed information about various hardware devices and other Linux kernel information. See the Linux kernel /proc documentation for further details. Common /proc examples:
# cat /proc/cpuinfo
# cat /proc/meminfo
# cat /proc/zoneinfo
# cat /proc/mounts
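
Since these files are plain text, individual values are easy to extract in scripts. The following is a minimal sketch that reads one field from /proc/meminfo. The helper name meminfo_kb and its optional file argument are made up for illustration, and it assumes the usual "Name:  value kB" line format.

```shell
# meminfo_kb FIELD [FILE]: print the value (in kB) of FIELD from a
# meminfo-style file. FILE defaults to /proc/meminfo.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is actually available.
if [ -r /proc/meminfo ]; then
    echo "MemTotal: $(meminfo_kb MemTotal) kB"
    echo "MemFree:  $(meminfo_kb MemFree) kB"
fi
```

The same pattern works for other "Name: value" files under /proc, although field names differ from file to file.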

#17: Nagios - Server And Network Monitoring

Nagios is a popular open source computer system and network monitoring application. You can easily monitor all your hosts, network equipment, and services. It can send an alert when things go wrong and again when they get better. FAN, short for "Fully Automated Nagios", aims to provide a Nagios installation that includes most of the tools provided by the Nagios community. FAN provides a CD-ROM image in the standard ISO format, which makes it easy to install a Nagios server. In addition, a wide range of tools is included in the distribution to improve the user experience around Nagios.

#18: Cacti - Web-based Monitoring Tool

Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy-to-use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. It can provide data about network, CPU, memory, logged-in users, Apache, DNS servers, and much more.

#19: KDE System Guard - Real-time Systems Reporting and Graphing

KSysguard is a network-enabled task and system monitor application for the KDE desktop. This tool can be run over an ssh session. It provides lots of features, such as a client/server architecture that enables monitoring of local and remote hosts. The graphical front end uses so-called sensors to retrieve the information it displays. A sensor can return simple values or more complex information such as tables. For each type of information, one or more displays are provided. Displays are organized in worksheets that can be saved and loaded independently of each other. So KSysguard is not only a simple task manager but also a very powerful tool for controlling large server farms.

See the KSysguard handbook for detailed usage.

#20: Gnome System Monitor - Real-time Systems Reporting and Graphing

The System Monitor application enables you to display basic system information and monitor system processes, usage of system resources, and file systems. You can also use System Monitor to modify the behavior of your system. Although not as powerful as the KDE System Guard, it provides the basic information which may be useful for new users:

  Displays various basic information about the computer's hardware and software.
  Linux kernel version
  GNOME version
  Hardware
  Installed memory
  Processors and speeds
  System status
  Currently available disk space
  Processes
  Memory and swap space
  Network usage
  File systems: lists all mounted filesystems along with basic information about each.

