Functions of the monitoring tools:
top Process activity
vmstat System activity, Hardware and system information
uptime, w Average system load
ps, pstree Displays the processes
free Memory usage
iostat Average CPU load, disk activity
sar Collect and report system activity
mpstat Multiprocessor usage
numastat NUMA-related statistics
pmap Process memory usage
netstat Network statistics
iptraf Real-time network statistics
tcpdump, ethereal Detailed network traffic analysis
nmon Collect and report system activity
strace System calls
Proc file system Various kernel statistics
KDE system guard Real-time systems reporting and graphing
Gnome System Monitor Real-time systems reporting and graphing
Functions of the benchmark tools:
lmbench Microbenchmark for operating system functions
iozone File system benchmark
netperf Network performance benchmark
=============================================================================
vmstat
vmstat provides information about processes, memory, paging, block I/O, traps, and CPU activity. The vmstat command displays either average data (since boot) or actual samples. Sampling mode is enabled by providing vmstat with a sampling interval in seconds and a sample count (for example, vmstat 5 12).
Process (procs)
r: The number of processes waiting for runtime
b: The number of processes in uninterruptible sleep
Memory
swpd: The amount of virtual memory used (KB)
free: The amount of idle memory (KB)
buff: The amount of memory used as buffers (KB)
cache: The amount of memory used as cache (KB)
Swap
si: Amount of memory swapped in from disk (KBps)
so: Amount of memory swapped out to disk (KBps)
IO
bi: Blocks received from a block device (blocks/s)
bo: Blocks sent to a block device (blocks/s)
System
in: The number of interrupts per second, including the clock
cs: The number of context switches per second
CPU (% of total CPU time)
us: Time spent running non-kernel code (user time, including nice time).
sy: Time spent running kernel code (system time).
id: Time spent idle. Prior to Linux 2.5.41, this included I/O-wait time.
wa: Time spent waiting for IO. Prior to Linux 2.5.41, this appeared as zero.
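The field list above can be put to work with a small filter. The following is a minimal sketch, not part of vmstat itself: the function name vmstat_flags and the two rules of thumb (run queue larger than the CPU count, any non-zero si or so) are assumptions chosen for illustration. It reads vmstat output on standard input and relies on the standard column order, in which r is field 1, si is field 7, and so is field 8.

```shell
# Hypothetical helper: flag vmstat samples that hint at CPU saturation
# or swapping. The CPU count is passed by the caller (default 1).
vmstat_flags() {
    ncpu=${1:-1}
    awk -v ncpu="$ncpu" '
        # Data lines start with a number; the two header lines are skipped.
        $1 ~ /^[0-9]+$/ {
            n++
            r = $1; si = $7; so = $8
            if (r + 0 > ncpu + 0)
                printf "sample %d: run queue %d exceeds %d CPUs\n", n, r, ncpu
            if (si + 0 > 0 || so + 0 > 0)
                printf "sample %d: swapping (si=%d so=%d)\n", n, si, so
        }'
}
```

A typical invocation would be vmstat 5 12 | vmstat_flags 4. Keep in mind that the first line vmstat prints holds averages since boot, so it is counted here as sample 1.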
=============================================================================
uptime
The uptime command shows how long the server has been running and how many users are logged on, and gives a quick overview of the average load of the server.
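A load average is easier to judge relative to the number of CPUs. The sketch below is an illustration under assumptions: the function name load_per_cpu is made up, and the parsing expects the usual "load average: a, b, c" suffix as printed in the C locale. It divides the 1-minute load average by a CPU count supplied by the caller.

```shell
# Hypothetical helper: print the 1-minute load average per CPU.
# Expects `uptime` output on stdin; the CPU count is the first argument.
load_per_cpu() {
    ncpu=${1:-1}
    awk -v ncpu="$ncpu" -F 'load average: ' '
        NF > 1 {
            split($2, la, ", ")     # la[1..3]: 1-, 5-, 15-minute averages
            printf "%.2f\n", la[1] / ncpu
        }'
}
```

For example: LC_ALL=C uptime | load_per_cpu "$(getconf _NPROCESSORS_ONLN)". A value that stays well above 1 suggests more runnable work than the CPUs can handle.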
=============================================================================
ps and pstree
The pstree command can come in handy because it displays the running processes in a tree structure and consolidates spawned subprocesses (for example, Java threads). This helps identify originating processes. A related tool, pgrep, looks up processes by name or other attributes.
=============================================================================
free
The command /bin/free displays information about the total amount of free and used memory (including swap) on the system. It also includes information about the buffers and cache used by the kernel.
Using the -l option, you can see how much memory is used in each memory zone.
You can also determine how many chunks of memory are available in each zone by reading the /proc/buddyinfo file. Each column of numbers gives the number of free chunks of that order, where a chunk of order n consists of 2^n contiguous pages.
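To turn the per-order counts into something readable, the counts can be weighted by chunk size. This is a sketch under assumptions: the function name buddy_free_kb is hypothetical, the page size defaults to 4 KB (check with getconf PAGESIZE), and the input is expected in the usual "Node N, zone NAME count0 count1 ..." layout.

```shell
# Hypothetical helper: sum the free memory per zone from /proc/buddyinfo.
# Optional first argument: page size in KB (assumed 4 if omitted).
buddy_free_kb() {
    page_kb=${1:-4}
    awk -v page_kb="$page_kb" '
        $1 == "Node" {
            kb = 0
            # Fields 5..NF are the free-chunk counts for order 0, 1, 2, ...
            for (i = 5; i <= NF; i++)
                kb += $i * (2 ^ (i - 5)) * page_kb
            sub(/,$/, "", $2)       # strip the comma after the node number
            printf "node %s zone %s: %d KB free\n", $2, $4, kb
        }'
}
```

Usage: buddy_free_kb 4 < /proc/buddyinfo. A zone with plenty of free KB but only low-order chunks is fragmented, which matters for large contiguous allocations.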
=============================================================================
iostat
The iostat command shows average CPU times since the system was started (similar to uptime). It also reports the activity of the server's disk subsystem. The report has two parts: CPU utilization and device (disk) utilization.
The device utilization report has these sections:
Device The name of the block device
tps The number of transfers per second (I/O requests per second) to the device. Multiple single I/O requests can be combined into one transfer request, so transfer requests can have different sizes.
Blk_read/s, Blk_wrtn/s The number of blocks read from and written to the device per second. Blocks can have different sizes. Typical sizes are 1024, 2048, and 4096 bytes, depending on the partition size.
Blk_read, Blk_wrtn The total number of blocks read and written since boot.
The iostat command accepts many options. From a performance perspective, the most useful one is the -x option, which displays extended statistics.
Use iostat with the -d and -x flags to display only information about the disk subsystem of interest.
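Extended statistics include a %util column, which is a convenient first filter. The sketch below rests on assumptions: the function name busy_disks and the default threshold of 80 are arbitrary, and because column layouts differ between sysstat versions, the column is located by name in the header line rather than by a fixed position.

```shell
# Hypothetical helper: list devices whose %util is at or above a limit.
# Expects `iostat -d -x` output on stdin.
busy_disks() {
    limit=${1:-80}
    awk -v limit="$limit" '
        BEGIN { off = -1 }
        # A blank line ends a device table.
        NF == 0 { off = -1; next }
        # Header line: remember how far %util is from the end of the line.
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") off = NF - i
            next
        }
        off >= 0 && $(NF - off) + 0 >= limit + 0 { print $1, $(NF - off) }'
}
```

Usage: iostat -d -x 5 3 | busy_disks 80.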
=============================================================================
sar
The sar command is used to collect, report, and save system activity information. The sar command consists of three applications: sar, which displays the data, and sa1 and sa2, which are used for collecting and storing the data. The sar tool features a wide range of options so be sure to check the man page for it. The sar utility is part of the sysstat package.
With sa1 and sa2, the system can be configured to get information and log it for later analysis.
To accomplish this, add the lines to /etc/crontab (Example 2-14). Keep in mind that a default cron job running sar daily is set up automatically after installing sar on your system
Tip: We suggest that you have sar running on most if not all of your systems. In case of a performance problem, you will have very detailed information on hand at very small overhead and no additional cost.
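As a rough illustration of what such /etc/crontab entries can look like, the fragment below collects one sample every 10 minutes and writes a daily report shortly before midnight. The paths are an assumption: many distributions install sa1 and sa2 under /usr/lib/sa or /usr/lib64/sa, so check where the sysstat package placed them on your system. The schedule is an example, not a recommendation from the original text.

```
# Assumption: sa1 and sa2 live in /usr/lib/sa (may be /usr/lib64/sa).
# Collect one activity sample every 10 minutes.
*/10 * * * * root /usr/lib/sa/sa1 1 1
# Generate the daily summary report at 23:53.
53 23 * * * root /usr/lib/sa/sa2 -A
```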
=============================================================================
mpstat
The mpstat command is used to report the activities of each of the available CPUs on a multiprocessor server. Global average activities among all CPUs are also reported. The mpstat utility is part of the sysstat package.
The mpstat utility enables you to display overall CPU statistics per system or per processor. Like vmstat, mpstat can also run in sampling mode when given a sampling frequency and a sampling count. Example 2-17 shows a sample output created with mpstat -P ALL to display average CPU utilization per processor.
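Per-processor output makes it easy to spot an uneven load. The following sketch is an illustration under assumptions: the function name cpu_busy is invented, and because the timestamp width and the set of columns vary between sysstat versions, the CPU and %idle columns are located by name in the header line and addressed by their distance from the end of the line.

```shell
# Hypothetical helper: print busy percentage (100 - %idle) per CPU.
# Expects `mpstat -P ALL` output on stdin.
cpu_busy() {
    awk '
        BEGIN { idle = -1 }
        # Until the header is found, look for the CPU and %idle columns.
        idle < 0 {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   cpu  = NF - i
                if ($i == "%idle") idle = NF - i
            }
            next
        }
        # Data lines: skip averages, repeated headers, and the "all" row.
        $1 != "Average:" && NF > cpu && $(NF - cpu) ~ /^[0-9]+$/ {
            printf "cpu%s %.1f\n", $(NF - cpu), 100 - $(NF - idle)
        }'
}
```

Usage: mpstat -P ALL | cpu_busy. One CPU near 100 while the others are idle often points to a single-threaded bottleneck or interrupt affinity.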
=============================================================================
numastat
The numastat command provides information about the ratio of local versus remote memory usage and the overall memory configuration of all nodes. Failed allocations of local memory, as displayed in the numa_miss column, and allocations of remote (slower) memory, as displayed in the numa_foreign column, should be investigated. Excessive allocation of remote memory increases system latency and is likely to decrease overall performance. Binding processes to a node with the memory map in the local RAM will most likely improve performance.
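One way to make the numa_miss counter comparable across nodes is to express it as a share of all allocations counted on that node. This is a sketch under assumptions: the function name numa_miss_pct and the ratio numa_miss / (numa_hit + numa_miss) are illustrative choices, and the input is expected in the default numastat layout with one column per node.

```shell
# Hypothetical helper: print numa_miss as a percentage per node.
# Expects default `numastat` output on stdin.
numa_miss_pct() {
    awk '
        # Header line with the node names.
        $1 ~ /^node[0-9]+$/ { n = NF; for (i = 1; i <= n; i++) node[i] = $i }
        $1 == "numa_hit"    { for (i = 1; i <= n; i++) hit[i]  = $(i + 1) }
        $1 == "numa_miss"   { for (i = 1; i <= n; i++) miss[i] = $(i + 1) }
        END {
            for (i = 1; i <= n; i++) {
                total = hit[i] + miss[i]
                pct = (total > 0) ? 100 * miss[i] / total : 0
                printf "%s %.1f%%\n", node[i], pct
            }
        }'
}
```

Usage: numastat | numa_miss_pct. The counters are cumulative since boot, so comparing two readings taken some time apart says more than a single reading.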
=============================================================================
pmap
The pmap command reports the amount of memory that one or more processes are using. You can use this tool to determine which processes on the server are being allocated memory and whether this amount of memory is a cause of memory bottlenecks. For detailed information, use the pmap -d option.
You can also look at the address spaces where the information is stored. You can find an interesting difference when you issue the pmap command on 32-bit and 64-bit systems. For the complete syntax of the pmap command, issue: pmap -?
=============================================================================
netstat
netstat displays a wide range of network-related information such as socket usage, routing, interface, protocol, and network statistics. Here are some of the basic options:
-a Show all socket information
-r Show routing information
-i Show network interface statistics
-s Show network protocol statistics
Proto The protocol (tcp, udp, raw) used by the socket.
Recv-Q The count of bytes not copied by the user program connected to this socket.
Send-Q The count of bytes not acknowledged by the remote host.
Local Address Address and port number of the local end of the socket. Unless the --numeric (-n) option is specified, the socket address is resolved to its canonical host name (FQDN), and the port number is translated into the corresponding service name.
Foreign Address Address and port number of the remote end of the socket.
State The state of the socket. Since there are no states in raw mode and usually no states used in UDP, this column may be left blank. For possible states, see Figure 1-28 on page 32 and the man page.
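The State column is often most useful in aggregate. The sketch below is illustrative only: the function name tcp_states is made up, and it assumes the usual six-column socket listing in which State is the sixth field. The -t flag (TCP sockets only) is a standard netstat option that is not listed above.

```shell
# Hypothetical helper: count TCP sockets per state, most frequent first.
# Expects `netstat -ant` output on stdin.
tcp_states() {
    awk '$1 ~ /^tcp/ { count[$6]++ }
         END { for (s in count) print count[s], s }' |
    sort -k1,1nr -k2
}
```

Usage: netstat -ant | tcp_states. A large number of sockets in TIME_WAIT or CLOSE_WAIT is a common starting point for further investigation.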
=============================================================================
iptraf
iptraf monitors TCP/IP traffic in real time and generates real-time reports. It shows TCP/IP traffic statistics by session, by interface, and by protocol. The iptraf utility is provided by the iptraf package.
=============================================================================
tcpdump / ethereal
tcpdump and ethereal are used to capture and analyze network traffic. Both tools use the libpcap library to capture packets. They put the network adapter into promiscuous mode and capture all the frames the adapter receives. To capture all packets, run these commands with superuser privileges so that the interface can be placed in promiscuous mode.
tcpdump
tcpdump is a simple but robust utility. Its basic protocol analysis capability gives you a rough picture of what is happening on the network. tcpdump supports many options and flexible expressions for filtering the frames to be captured (capture filters). Common options and example filters follow.
Options:
-i <interface> Network interface
-e Print the link-level header
-s <snaplen> Capture <snaplen> bytes from each packet
-n Avoid DNS lookup
-w <file> Write to file
-r <file> Read from file
-v, -vv, -vvv Verbose output
DNS query packets
tcpdump -i eth0 'udp port 53'
FTP control and FTP data session to 192.168.1.10
tcpdump -i eth0 'dst 192.168.1.10 and (port ftp or ftp-data)'
HTTP session to 192.168.2.253
tcpdump -ni eth0 'dst 192.168.2.253 and tcp and port 80'
SSH session (TCP port 22) to subnet 192.168.2.0/24
tcpdump -ni eth0 'dst net 192.168.2.0/24 and tcp and port 22'
Packets with the TCP SYN or TCP FIN flag set (TCP connection establishment or termination), excluding packets whose source and destination are both in subnet 192.168.1.0/24
tcpdump 'tcp[tcpflags] & (tcp-syn|tcp-fin) != 0 and not src and dst net 192.168.1.0/24'
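The options and filters above combine naturally into a capture-then-analyze workflow. The following is a minimal sketch: the function names capture_host_port and read_capture, the snapshot length of 128 bytes, and the example file path are assumptions for illustration. Capturing usually requires superuser privileges.

```shell
# Hypothetical helper: capture TCP traffic to or from one host and port
# into a file for later analysis (-w), without DNS lookups (-n).
capture_host_port() {
    iface=$1 host=$2 port=$3 file=$4
    tcpdump -n -i "$iface" -s 128 -w "$file" "host $host and tcp port $port"
}

# Hypothetical helper: read a saved capture back (-r) and print the
# link-level header of each frame (-e).
read_capture() {
    tcpdump -n -e -r "$1"
}
```

Usage: capture_host_port eth0 192.168.1.10 80 /tmp/web.pcap, stop the capture with Ctrl-C, then run read_capture /tmp/web.pcap. Writing to a file keeps the capture overhead low and lets the same data be filtered repeatedly.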
=============================================================================
nmon
nmon, short for Nigel's Monitor, is a popular tool for monitoring Linux system performance, developed by Nigel Griffiths. Because nmon incorporates performance information for several subsystems, it can be used as a single source for performance monitoring. nmon can report processor utilization, memory utilization, run queue information, disk I/O statistics, network I/O statistics, paging activity, and process metrics.
To run nmon, start the tool and select the subsystems of interest by typing their one-key commands. For example, to get CPU, memory, and disk statistics, start nmon and type c m d.
A very useful feature of nmon is the ability to save performance statistics for later analysis in a comma-separated values (CSV) file. The CSV output of nmon can be imported into a spreadsheet application to produce graphical reports. To do so, start nmon with the -f flag.
For more information on nmon we suggest you visit
http://www-941.haw.ibm.com/collaboration/wiki/display/WikiPtype/nmon
In order to download nmon, visit
http://www.ibm.com/collaboration/wiki/display/WikiPtype/nmonanalyser
=============================================================================
strace
To trace a process, specify the process ID (PID) to be monitored:
strace -p <pid>
Here’s another interesting use. This command reports how much time has been consumed in the kernel by each system call to execute a command.
strace -c <command>
For the complete syntax of the strace command, issue:
strace -?
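The summary table printed by strace -c can be filtered to show only the dominant system calls. This sketch makes a few assumptions: the function name top_syscalls and the default threshold of 10 percent are arbitrary, and the parsing relies on the usual summary layout in which the first column is "% time" and the system call name is the last column.

```shell
# Hypothetical helper: list system calls at or above a share of total time.
# Expects the `strace -c` summary on stdin; first argument is the minimum
# percentage (default 10).
top_syscalls() {
    min=${1:-10}
    awk -v min="$min" '
        # Summary rows start with a numeric "% time" value. The syscall
        # name is the last field because the errors column may be empty.
        $1 ~ /^[0-9.]+$/ && $NF != "total" && $1 + 0 >= min + 0 {
            printf "%s %s%%\n", $NF, $1
        }'
}
```

Because strace writes its report to standard error, it is convenient to save it with the -o option first, for example strace -c -o /tmp/ls.strace ls, and then run top_syscalls 10 < /tmp/ls.strace.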
=============================================================================
Proc file system
acpi
ACPI refers to the Advanced Configuration and Power Interface supported by most modern desktop and notebook systems. Because ACPI is mainly a PC technology, it is often disabled on server systems. For more information about ACPI, refer to:
http://www.apci.info
tty
The tty subdirectory contains information about the virtual terminals of the system and the physical devices to which they are attached.
=============================================================================
Benchmark tools
=============================================================================
LMbench
LMbench is a suite of microbenchmarks that can be used to analyze different operating system settings, such as an SELinux-enabled system versus a system without SELinux. The benchmarks included in LMbench measure various operating system routines such as context switching, local communications, memory bandwidth, and file operations. Using LMbench is straightforward, as there are only three important commands to know:
make results: The first time LMbench is run it will prompt for some details of the system configuration and what tests it should perform.
make rerun: After the initial configuration and a first benchmark run, using the make rerun command simply repeats the benchmark using the configuration supplied during the make results run.
make see: Finally after a minimum of three runs the results can be viewed using the make see command. The results will be displayed and can be copied to a spreadsheet application for further analysis or graphical representation of the data
The LMbench benchmark can be found at http://sourceforge.net/projects/lmbench/
=============================================================================
IOzone
IOzone is a file system benchmark that can simulate a wide variety of disk access patterns. Because IOzone is highly configurable, a targeted workload profile can be simulated precisely. IOzone writes one or multiple files of variable size using variable block sizes.
While IOzone offers a very comfortable automatic benchmarking mode, it is usually more efficient to define the workload characteristics such as file size, I/O size, and access pattern. If a file system has to be evaluated for a database workload, it would be logical to have IOzone create a random access pattern to a large file at large block sizes instead of streaming a large file with a small block size. Some of the most important options for IOzone are:
-b <output.xls> Tells IOzone to store the results in a Microsoft® Excel® compatible spreadsheet.
-C Displays output for each child process (can be used to check if all children really run simultaneously).
-f <filename> Can be used to tell IOzone where to write the data.
-i <number of test> This option is used to specify what tests are to be run. You will always have to specify -i 0 in order to write the test file for the first time. Useful tests are -i 1 for streaming reads, -i 2 for random read and random write access, and -i 8 for a workload with mixed random access.
-h Displays the onscreen help.
-r Tells IOzone what record or I/O size to use for the tests. The record size should be as close as possible to the record size that will be used by the targeted workload.
-k <number of async I/Os> Uses the async I/O feature of kernel 2.6 that is often used by databases such as IBM DB2®.
-m If the targeted application uses multiple internal buffers then this behavior can be simulated using the -m flag.
-s <size in KB> Specifies the file size for the benchmark. For asynchronous file systems (the default mounting option for most file systems), IOzone should be used with a file size of at least twice the system’s memory in order to really measure disk performance. The size can also be specified in MB or GB by appending m or g, respectively, directly after the file size.
-+u An experimental switch that can be used to measure the processor utilization during the test.
Note: Any benchmark using files that fit into the system’s memory and that are stored on asynchronous file systems will measure the memory throughput rather than the disk subsystem performance. So, you should either mount the file system of interest with the sync option or use a file size roughly twice the size of the system’s memory
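The "twice the system's memory" rule from the note can be derived from /proc/meminfo instead of being worked out by hand. This is a sketch under assumptions: the function name iozone_file_kb is invented, and it relies on the MemTotal line of /proc/meminfo being reported in KB, which matches the default unit of the -s option.

```shell
# Hypothetical helper: print a benchmark file size in KB equal to twice
# the installed memory. Expects /proc/meminfo-formatted text on stdin.
iozone_file_kb() {
    awk '$1 == "MemTotal:" { printf "%d\n", $2 * 2 }'
}
```

A possible invocation, with the test file path and the 8 KB record size as placeholders: size=$(iozone_file_kb < /proc/meminfo), followed by iozone -i 0 -i 2 -r 8 -s "$size" -f /mnt/test/iozone.tmp. Make sure the target file system has enough free space for a file of that size.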
If IOzone is used with file sizes that either fit into the system’s memory or cache it can also be used to gain some data about cache and memory throughput. It should be noted that due to the file system overheads IOzone will report only 70-80% of a system’s bandwidth. The IOzone benchmark can be found at http://www.iozone.org/
=============================================================================
netperf
netperf is a performance benchmark tool that focuses on TCP/IP networking performance. It also supports UNIX domain socket and SCTP benchmarking.
netperf uses a client-server model. netserver runs on the target system and netperf runs on the client. netperf controls netserver and passes configuration data to it, generates network traffic, and gets the result from netserver through a control connection that is separate from the actual benchmark traffic connection. During the benchmark, no communication occurs on the control connection, so it does not affect the result. The netperf benchmark tool also has a reporting capability, including a CPU utilization report.
netperf can generate several types of traffic. Basically these fall into two categories: bulk data transfer traffic and request/response type traffic. You should note that netperf uses only one socket at a time. The next version of netperf (netperf4) will fully support benchmarking for concurrent sessions. At this time, we can perform multiple session benchmarking as described below
Bulk data transfer
Bulk data transfer is the most commonly measured factor in network benchmarking. The bulk data transfer is measured by the amount of data transferred in one second. It simulates large file transfer such as multimedia streaming and FTP data transfer
Request/response type
This simulates request/response type traffic, which is measured by the number of transactions exchanged in one second. Request/response traffic is typical for online transaction applications such as web servers, database servers, mail servers, file servers (which serve small or medium files), and directory servers. In a real environment, session establishment and termination are performed as well as data exchange. To simulate this, the TCP_CRR type was introduced.
Global options:
-A Change send and receive buffer alignment on remote system
-b Burst of packets in stream test
-H <remotehost> Remote host
-t <testname> Test traffic type
TCP_STREAM Bulk data transfer benchmark
TCP_MAERTS Similar to TCP_STREAM except direction of stream is opposite.
TCP_SENDFILE Similar to TCP_STREAM except using sendfile() instead of send(). It causes a zero-copy operation.
UDP_STREAM Same as TCP_STREAM except UDP is used.
TCP_RR Request/response type traffic benchmark
TCP_CC TCP connect/close benchmark. No request or response packets are exchanged.
TCP_CRR Performs a connect/request/response/close operation. It is very similar to an HTTP 1.0/1.1 session with HTTP keepalive disabled.
UDP_RR Same as TCP_RR except UDP is used.
-l <testlen> Test length. If a positive value is set, netperf runs the benchmark for testlen seconds. If the value is negative, netperf runs until that many bytes of data have been exchanged (bulk data transfer) or that many transactions have completed (request/response).
-c Local CPU utilization report
-C Remote CPU utilization report
Note: The CPU utilization report might not be accurate on some platforms. Make sure it is accurate before you perform benchmarking.
For more details, refer to http://www.netperf.org/
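The global options above can be combined into a short test series. The sketch below is an illustration only: the function name netperf_suite, the choice of three test types, and the default length of 60 seconds are assumptions. It expects netserver to be running already on the remote host.

```shell
# Hypothetical helper: run a bulk transfer test and two request/response
# tests against one host, reporting local (-c) and remote (-C) CPU use.
netperf_suite() {
    host=$1 secs=${2:-60}
    for t in TCP_STREAM TCP_RR TCP_CRR; do
        netperf -H "$host" -t "$t" -l "$secs" -c -C
    done
}
```

Usage: netperf_suite 192.168.1.10 30. Running the tests one after another, rather than in parallel, keeps each result free of interference from the others.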
=============================================================================
Other useful tools
bonnie Disk I/O and file system benchmark
http://www.textuality.com/bonnie/
bonnie++ Disk I/O and file system benchmark
http://www.coker.com.au/bonnie++/
NetBench File server benchmark. It runs on Windows.
dbench File system benchmark. Commonly used for file server benchmark.
http://freshmeat.net/projects/dbench/
iometer Disk I/O and network benchmark
http://www.iometer.org/
ttcp Simple network benchmark
nttcp Simple network benchmark
iperf Network benchmark
http://dast.nlanr.net/projects/Iperf/
ab (Apache Bench) Simple web server benchmark. It comes with Apache HTTP server.
http://httpd.apache.org/
WebStone Web server benchmark
http://www.mindcraft.com/webstone/
Apache JMeter Used mainly for web server performance benchmarking. It also supports other protocols such as SMTP, LDAP, and JDBC™, and it has good reporting capability.
http://jakarta.apache.org/jmeter/
fsstone, smtpstone Mail server benchmark. They come with Postfix.
http://www.postfix.org/
nhfsstone Network File System benchmark. Comes with nfs-utils package.
DirectoryMark LDAP benchmark
http://www.mindcraft.com/directorymark/
=============================================================================