Finding Performance Bottlenecks in Linux
There isn't a computer professional who, at some point, hasn't wondered whether their system(s) are slow due to legitimate load, or inefficiency. The beauty is there's no real reason to sit and wonder. In the case of Linux (and many other operating systems), all of the information you need is at your fingertips. You just have to know how to find it.
Computing bottlenecks occur in four basic areas: CPU, RAM, network, and disk I/O. Linux offers a huge collection of tools for collecting and viewing information about each. Let's take a look at some useful techniques, and at some of the easier fixes for each area if you find problems.
CPU Performance Inspection
Most new computers today come with multiple CPUs, or some approximation thereof. Some tools allow you to view the individual performance of each of these. However, since the goal here is to measure overall performance, this article focuses on working with a single CPU value. See the man pages for each command for whether it offers flags to go further.
One excellent tool for monitoring CPU performance is sar. This program may not be installed by default on your system; if it isn't, look for the sysstat utilities package for your distribution. Typing sar without any arguments gives you something similar to what you'll see in Figure 1.
Figure 1: An example of default sar output.
From left to right, sar gives you the time the measurement was taken, which CPU it's reporting on (or in our case, all as a collective whole), and then the percentage of CPU time spent during that interval on:
%user - User space (non-kernel programs)
%nice - Programs whose priority has been altered with the nice or renice commands
%system - Kernel space (the kernel itself plus modules)
%iowait - Sitting idle while waiting for a disk I/O request to complete
%steal - Forced to wait for the hypervisor to finish servicing another virtual CPU, in the case of virtual machines
%idle - Waiting for new instructions
While all of these columns are interesting, the one that quickly lets you determine if you're CPU-bound is %idle. In the case of Figure 1, this CPU (or bank of CPUs) is practically at the beach on vacation. If %idle were instead consistently low (that is, if the other columns were significantly higher), you would need to consider upgrading the CPU, stopping unnecessary processes, or moving some of the services off of this computer and onto another to reduce the CPU load.
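If sar isn't available, much the same percentages can be approximated from the counters the kernel exposes in /proc/stat. The following is a minimal Python sketch, not how sar itself is implemented: the function names are made up for illustration, and it assumes the usual field order of the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal). It reports irq and softirq separately, where sar's default report counts them under %system.

```python
import time

# Assumed order of the first eight counters on the aggregate "cpu" line.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the tick counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Very old kernels report fewer counters; treat the rest as zero.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert two counter snapshots into percentages for the interval."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample(interval=1.0):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        after = parse_cpu_line(f.read())
    return cpu_percentages(before, after)
```

Calling sample() on a Linux machine returns a dictionary keyed by the names above. An idle value that stays near zero over repeated samples is the sign of a CPU-bound machine.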
RAM Performance Inspection
The nice thing about sar is that you can also use it to look at your memory. When invoked as sar -r, you see something similar to Figure 2.
Figure 2: An example of sar memory output, invoked with sar -r.
From left to right, this output tells us the time the sample was taken, and then:
kbmemfree - Unused memory in KB
kbmemused - Amount of memory utilized by user space applications in KB
%memused - The percentage of your RAM currently in use
kbbuffers - Amount of memory in KB that your kernel is using to buffer data
kbcached - Amount of memory in KB that the kernel is using to cache data
kbswpfree - Unused swap space in KB
kbswpused - Used swap space in KB
%swpused - The percentage of your swap space currently in use
kbswpcad - Amount of cached swap in KB
Again, while all of these columns are useful, two give you a quick picture of whether your problem is with memory: %memused and %swpused. While Figure 1 showed a CPU that was sunning itself in Aruba, %memused shows that this computer is consistently operating at the edge of its RAM capacity. The %swpused column tells us that, on the other hand, the machine isn't being pushed so hard that it's having to move code from RAM into swap space on the hard drive. For the timespan shown in the measurements, then, you aren't experiencing poor performance.
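The same two percentages can be worked out from /proc/meminfo. Here is a rough Python sketch under a few assumptions: the function names are invented for this example, and the percentages are computed in the simplest way from MemTotal, MemFree, SwapTotal, and SwapFree, which may not match sar's output exactly on every sysstat version. It also reports a second figure that leaves out buffers and cache, since the kernel can generally hand that memory back when applications need it.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its numeric value (KB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Return simple RAM and swap usage percentages from parsed meminfo."""
    total = info["MemTotal"]
    free = info["MemFree"]
    # Buffers and cache count as "used", but are usually reclaimable.
    reclaimable = info.get("Buffers", 0) + info.get("Cached", 0)
    swap_total = info.get("SwapTotal", 0)
    swap_used = swap_total - info.get("SwapFree", 0)
    return {
        "memused_pct": 100.0 * (total - free) / total,
        "memused_excl_cache_pct": 100.0 * (total - free - reclaimable) / total,
        "swpused_pct": 100.0 * swap_used / swap_total if swap_total else 0.0,
    }


def read_memory_summary(path="/proc/meminfo"):
    """Convenience wrapper for a live Linux system."""
    with open(path) as f:
        return memory_summary(parse_meminfo(f.read()))
```

A machine where memused_pct is high but memused_excl_cache_pct is modest and swpused_pct stays near zero is probably using its spare RAM for caching rather than running out of it.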
However, don't be alarmed by the fact that this machine looks like it's one step from having to push things into swap. The kernel's memory manager tries to keep the memory of the most active applications in physical RAM and moves the memory of idle applications into swap first, so just the raw percentages of how much RAM and swap you're using don't show the whole picture. In ps's STAT column or top's S column you'll see R for running processes and S for sleeping ones. The sleepers are the likeliest candidates for swap, although an S by itself doesn't mean a process has been swapped out. Typing ps aux will let you see how many processes at a particular time are sleeping, and what percentage of memory (and CPU) each is using. Knowing how much RAM, how much swap, and how many processes are sleeping, along with how much RAM these processes are using, will help you better understand if you're having RAM bottlenecks. Factors such as shared memory can also make it look like you're using more RAM than you really are.
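Counting states by eye gets tedious on a busy machine, so here is a small Python sketch that tallies ps aux output by process state. It assumes the usual eleven-column ps aux layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND), and the function names are hypothetical.

```python
import subprocess
from collections import defaultdict


def summarize_ps_aux(text):
    """Group `ps aux` rows by the first letter of STAT (R, S, D, ...)."""
    summary = defaultdict(lambda: {"count": 0, "cpu_pct": 0.0, "mem_pct": 0.0})
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        cols = line.split(None, 10)  # COMMAND may itself contain spaces
        if len(cols) < 11:
            continue
        state = cols[7][0]  # e.g. "Ss" and "S+" both count as sleeping
        summary[state]["count"] += 1
        summary[state]["cpu_pct"] += float(cols[2])
        summary[state]["mem_pct"] += float(cols[3])
    return dict(summary)


def live_summary():
    """Run `ps aux` on this machine and summarize it."""
    result = subprocess.run(["ps", "aux"], capture_output=True,
                            text=True, check=True)
    return summarize_ps_aux(result.stdout)
```

Keep in mind that the %MEM figures count shared memory once per process, so the totals can overstate real usage, as noted above.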
The solutions for improving RAM performance are similar to those for CPU: add more RAM, stop unnecessary programs, or move some of your services off onto another machine. It's also possible that you're suffering memory leaks or that something you're running is very RAM-inefficient. These topics bear further discussion in another article.
Disk I/O Performance Inspection
Yet another reason to use sar is that this Swiss Army knife of performance information tools can also tell you how your drives are doing. Type sar -dp and you'll see something like what's shown in Figure 3.
Figure 3: The beginning of sar I/O output, invoked with sar -dp.
This combination of flags shows you information per device, as seeing just the summary information (sar -b) doesn't give you any real reference points at a glance. From left to right, this output gives you the time the measurement was taken, as well as:
DEV - The physical device in question
rd_sec/s - Number of sectors (1 sector = 512 bytes) read per second
wr_sec/s - Number of sectors written per second
avgrq-sz - Average size, in sectors, of the requests issued to the device
avgqu-sz - Average queue length of requests issued to the device
await - Average number of milliseconds I/O requests for this device had to wait before being handled, including how long it took to handle them
svctm - Average number of milliseconds it took to service I/O requests for this device, not counting time spent waiting in the queue
%util - Percentage of CPU time during which I/O requests were being issued to the device
Notice in this case that the percentage is not the most interesting value here. The two most useful values for determining if you have an I/O-bound machine are avgqu-sz and svctm. The longer the queue, the more requests are piling up before they're being serviced. The longer each request takes to be serviced, the slower everything gets.
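To see where numbers like these come from, the Python sketch below derives similar values from two snapshots of /proc/diskstats. Treat it as an approximation rather than sar's actual code: the function names are invented, it assumes the standard /proc/diskstats column order, and svctm is estimated the simple way, as device busy time divided by completed requests.

```python
def parse_diskstats(text):
    """Map device name -> the cumulative counters needed below."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue
        stats[f[2]] = {
            "ios": int(f[3]) + int(f[7]),     # reads + writes completed
            "io_ms": int(f[6]) + int(f[10]),  # ms spent reading + writing
            "busy_ms": int(f[12]),            # ms with any I/O in flight
            "weighted_ms": int(f[13]),        # busy ms weighted by queue depth
        }
    return stats


def disk_metrics(before, after, elapsed_ms):
    """Per-device figures for the interval between two snapshots."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    out = {}
    for dev, now in after.items():
        then = before.get(dev)
        if then is None:
            continue  # device appeared between snapshots
        ios = now["ios"] - then["ios"]
        busy = now["busy_ms"] - then["busy_ms"]
        out[dev] = {
            "await_ms": (now["io_ms"] - then["io_ms"]) / ios if ios else 0.0,
            "svctm_ms": busy / ios if ios else 0.0,
            "avgqu_sz": (now["weighted_ms"] - then["weighted_ms"]) / elapsed_ms,
            "util_pct": 100.0 * busy / elapsed_ms,
        }
    return out
```

To use it, read /proc/diskstats twice a known number of milliseconds apart and pass both parsed snapshots to disk_metrics(). An avgqu_sz that keeps growing from one interval to the next suggests requests are arriving faster than the device can handle them.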
On an I/O-bound machine, solutions include faster drives (including RAID arrays and other remote storage), organizing your partitions so that I/O-heavy programs aren't all trying to write to the same physical drive, and of course splitting off services onto other machines to spread the load. Very high disk I/O values could in fact mean that you're using a lot of swap.
Network Performance Inspection
While sar (as sar -n ALL) can also show you network performance data, in this case it's a bit of overkill. A quick ifconfig (you may need to include the path) can give you some basic information for a quick visual inspection, as shown in Figure 4.
Figure 4: Network information displayed with /sbin/ifconfig.
The key to understanding this output for performance monitoring purposes is to know that TX stands for Transmit and RX stands for Receive. If you see values greater than zero for errors, dropped, overruns, or collisions, then you may very well have a network bottleneck problem. The first thing to do is check all of your connections, and equipment such as switches and hubs. Also, check at a few different times and see if the problem is persistent. If it continues, it bears further investigation.
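The counters ifconfig prints are also available in /proc/net/dev, which is easier to check from a script. The Python sketch below flags any interface with non-zero problem counters. It assumes the standard sixteen-column layout of that file (eight receive counters followed by eight transmit counters), and the function names are made up for illustration.

```python
RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")
COLUMN_NAMES = (["rx_" + f for f in RX_FIELDS] +
                ["tx_" + f for f in TX_FIELDS])

# The fifo counters are what ifconfig reports as overruns.
PROBLEM_COUNTERS = ("rx_errs", "rx_drop", "rx_fifo",
                    "tx_errs", "tx_drop", "tx_fifo", "tx_colls")


def parse_net_dev(text):
    """Map interface name -> dict of its /proc/net/dev counters."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        name, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or len(values) < len(COLUMN_NAMES):
            continue
        numbers = [int(v) for v in values[:len(COLUMN_NAMES)]]
        counters[name.strip()] = dict(zip(COLUMN_NAMES, numbers))
    return counters


def problem_counters(counters):
    """Return only the interfaces and counters that are above zero."""
    report = {}
    for iface, c in counters.items():
        bad = {key: c[key] for key in PROBLEM_COUNTERS if c[key] > 0}
        if bad:
            report[iface] = bad
    return report
```

These counters are cumulative since the interface came up, so a non-zero value may reflect an old, resolved problem. Comparing two readings taken some time apart shows whether the numbers are still climbing.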
In the case of all four of these issues, this article just skims the surface of both investigation techniques and solutions. In general, you'll want to take these measurements multiple times to see if the problems are persistent or come and go. You might even want to set up cron jobs to take these measurements on an automatic basis.
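Once you have a series of measurements, whether gathered by hand or by a cron job, deciding whether a problem is persistent can be as simple as asking how often a value crossed a limit you care about. The Python sketch below does just that. The function name, the threshold, and the 80 percent cutoff are arbitrary choices for illustration, not recommended values.

```python
def is_persistent(samples, threshold, min_fraction=0.8):
    """True if at least `min_fraction` of the samples exceed `threshold`."""
    if not samples:
        raise ValueError("need at least one sample")
    over = sum(1 for value in samples if value > threshold)
    return over / len(samples) >= min_fraction
```

For example, is_persistent([95, 97, 99, 96, 98], 90) is true for a series of %memused readings that never drops below 90, while a single spike among otherwise low readings would not qualify.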
Further installments will address the larger issues of monitoring performance over time, making tweaks that don't involve having to upgrade hardware, and things developers can do to address performance issues with their own software.
Dee-Ann LeBlanc is a freelance writer, editor, trainer, course developer, and journalist essentially specializing in helping people better understand Linux and open source.