Using valgrind to detect memory errors

Original work. Please credit the source when reposting.

1 Overview

This document investigates resource leaks and how to detect them using valgrind.

1.1    Document Organization

Section 2 introduces the concept of a resource leak and describes the common kinds of resource leak. Section 3 introduces valgrind and shows how to use it to debug memory errors. Section 4 compares top with valgrind, which can serve as a reference for developers and test engineers. Section 5 summarizes the recommendations.

2    What is a resource leak?

In computer science, a resource leak is a particular type of resource consumption by a computer program in which the program does not release resources it has acquired. This condition is normally the result of a bug in the program. Typical resource leaks include memory leaks and handle leaks.

2.1    Memory Leak

A memory leak occurs when a computer program incorrectly manages memory allocations, so that memory that is no longer needed is never released. In object-oriented programming, a memory leak may happen when an object is stored in memory but cannot be accessed by the running code.
A memory leak has symptoms similar to those of a number of other problems and can generally only be diagnosed by a programmer with access to the program's source code.
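A minimal C++ sketch of this pattern follows. The function names leaky_length and clean_length are hypothetical and exist only for illustration. The first function loses the only pointer to a heap block; the second releases the block before the pointer goes out of scope.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical example: copies text into a heap buffer and returns its length.
// The buffer is never freed, so every call leaks strlen(text) + 1 bytes.
std::size_t leaky_length(const char* text) {
    char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (copy == nullptr) {
        return 0;  // allocation failed, nothing to leak
    }
    std::strcpy(copy, text);
    return std::strlen(copy);  // 'copy' goes out of scope here: the block is lost
}

// Same logic, but the buffer is released before the pointer is lost.
std::size_t clean_length(const char* text) {
    char* copy = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (copy == nullptr) {
        return 0;
    }
    std::strcpy(copy, text);
    const std::size_t length = std::strlen(copy);
    std::free(copy);
    return length;
}
```

Both functions return the same value, which is why a leak like this is invisible to ordinary functional tests and needs a memory checker to find.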

2.2    Handle Leak

A handle leak is a type of software bug that occurs when a computer program asks for a handle to a resource but does not free the handle when it is no longer used. If this occurs frequently or repeatedly over an extended period of time, a large number of handles may be marked in-use and thus unavailable, causing performance problems or a crash.
Examples of handle resources that the operating system provides in limited numbers include Internet sockets and file descriptors.
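The same pattern can be sketched for file descriptors, assuming a POSIX system. The function names leaky_is_readable and clean_is_readable are hypothetical. The first function never closes the descriptor it opens; the second closes it before returning.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical example: opens a file to check that it is readable.
// The descriptor is never closed, so each successful call leaks one handle.
bool leaky_is_readable(const char* path) {
    const int fd = open(path, O_RDONLY);
    return fd >= 0;  // 'fd' is lost here while still open
}

// Same check, but the descriptor is closed before returning.
bool clean_is_readable(const char* path) {
    const int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return false;
    }
    close(fd);
    return true;
}
```

Running a program under valgrind with the --track-fds=yes option lists the file descriptors that are still open when the program exits, which is one way to spot the leaky variant.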

3    Using valgrind to detect resource leaks

3.1    Introduction to valgrind

Valgrind is an open-source tool for finding memory-management problems in Linux-x86 executables. It detects memory leaks and memory corruption in the program being run.

3.2    Memory Leaks Revisited


When a program is executed, it is given a portion of memory to be used for its stack and heap.
If the program is unable to allocate memory, the allocation fails (malloc returns NULL in C, and new throws std::bad_alloc in C++), and a program that does not handle this is likely to crash.
Stack memory is “freed” when a function returns and the current stack frame is popped off the stack. Therefore, memory leaks can only occur with memory on the heap. Dynamically allocated memory is not freed until delete or free is called on it.
A program that leaks memory may run for days, weeks, or even longer before it crashes.
valgrind can be used to check a program for a variety of common errors, including memory leaks and handle leaks.

3.3    Getting valgrind from the Internet

Valgrind may be obtained from the following location:
http://valgrind.org/

3.4    Installing valgrind on NVR

Uncompress, compile and install it:
# bzip2 -d valgrind-3.9.0.tar.bz2
# tar xvf valgrind-3.9.0.tar
# cd valgrind-3.9.0
# ./configure --prefix=/usr/local
# make
# make install
Add the install path to your PATH variable:
# source /etc/opt/americandynamics/venvr/paths.sh
Now valgrind is ready to catch bugs.

3.5    Why Valgrind?

Memory management is prone to errors that are hard to detect. Common errors include:
1. Use of uninitialized memory
2. Reading/writing memory after it has been freed
3. Reading/writing off the end of malloc'd blocks
4. Reading/writing inappropriate areas on the stack
5. Memory leaks, where pointers to malloc'd blocks are lost forever
6. Mismatched use of malloc/new/new[] vs free/delete/delete[]
7. Some misuses of the POSIX pthreads API
These errors usually lead to crashes.
This is the situation where we need Valgrind. Valgrind works directly with executables, with no need to recompile, relink or modify the program to be checked. Valgrind reports whether the program leaks memory and points out where the leaked blocks were allocated.
Valgrind simulates every single instruction your program executes. For this reason, Valgrind finds errors not only in your application but also in all supporting dynamically-linked (.so-format) libraries, including the GNU C library, the X client libraries, Qt if you work with KDE, and so on.
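Several of the errors listed above can be illustrated with a short C++ sketch. The code below is the correct version; each comment marks the one-token change that would introduce the corresponding error from the list. The function names sum_values and last_even are hypothetical.

```cpp
#include <cstddef>

// Adds up 'count' integers.
int sum_values(const int* values, std::size_t count) {
    int total = 0;  // Error 1 if "= 0" is omitted: total is read before it is set.
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Fills a heap array with even numbers and returns the last one (-1 if empty).
int last_even(std::size_t count) {
    if (count == 0) {
        return -1;
    }
    int* values = new int[count];
    // Error 3 if "<" becomes "<=": the loop writes one element off the end.
    for (std::size_t i = 0; i < count; ++i) {
        values[i] = static_cast<int>(i) * 2;
    }
    // Error 2 if this read is moved below the delete[]: memory is read after free.
    const int last = values[count - 1];
    // Error 6 if written as "delete values" or "free(values)": mismatched release.
    // Error 5 if the line is removed altogether: the block is leaked.
    delete[] values;
    return last;
}
```

Introducing any one of the marked changes and running the program under Memcheck is a quick way to see how each kind of error is reported.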

3.6    Before using valgrind

Be sure that your executable was built from files that were compiled with the -g and -O0 compiler flags, so that valgrind can report accurate file names and line numbers.

3.7    Running valgrind

Useful flags:
    --leak-check=<no | summary | yes | full>
        Defaults to summary. With yes or full, valgrind gives details for each individual leak, including a stack trace of where the block was allocated.
    --show-reachable=<no | yes>
        Defaults to no. If enabled, valgrind also reports “still reachable” blocks, which are usually not considered serious.
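To see what these flags report, it helps to have a small program that produces more than one kind of leak. The following C++ sketch is an assumption-laden illustration, not code from the project discussed here: the names Node, g_cache, make_still_reachable and make_lost_list are hypothetical. The first function keeps a global pointer to its block until exit, which Memcheck is expected to classify as “still reachable”. The second drops the only pointer to the head of a two-node list, so the head is expected to be reported as “definitely lost” and the second node as “indirectly lost”.

```cpp
#include <cstdlib>

struct Node {
    Node* next;
    int value;
};

// Global pointer that stays valid until the program exits.
static Node* g_cache = nullptr;

// Allocates one node and keeps it reachable through g_cache.
// Returns true if the allocation succeeded. The node is never freed.
bool make_still_reachable() {
    g_cache = static_cast<Node*>(std::malloc(sizeof(Node)));
    if (g_cache == nullptr) {
        return false;
    }
    g_cache->next = nullptr;
    g_cache->value = 1;
    return true;
}

// Builds a two-node list and returns without freeing it.
// Returns the number of nodes that were allocated and then lost.
int make_lost_list() {
    Node* head = static_cast<Node*>(std::malloc(sizeof(Node)));
    if (head == nullptr) {
        return 0;
    }
    head->value = 1;
    head->next = static_cast<Node*>(std::malloc(sizeof(Node)));
    if (head->next == nullptr) {
        return 1;  // only the head was allocated (and is now lost)
    }
    head->next->value = 2;
    head->next->next = nullptr;
    return 2;  // 'head' goes out of scope: nothing points to the list any more
}
```

If both functions are called from main and the program is built with -g and -O0, running it with --leak-check=full should list the lost list with a stack trace, while the block held by g_cache only appears when reachable blocks are shown as well.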

3.8    Example on how to use memcheck of valgrind to perform memory unit test

(1)    Create a unit test program whose executable file is named “GenaEngine”.
(2)    Build the “GenaEngine” executable.
(3)    Run valgrind to perform the memory unit test:
# valgrind --tool=memcheck --show-leak-kinds=all --leak-check=full ./GenaEngine 172.16.9.60 >&memCheckGena.log
(4)    The results of the memory unit test are now saved in the log file memCheckGena.log.

3.9    Analyze the results of memory unit test

(1)    Start the analysis from the LEAK SUMMARY section, which contains the following information:
==14848== LEAK SUMMARY:
==14848==    definitely lost: 135 bytes in 7 blocks
==14848==    indirectly lost: 876 bytes in 42 blocks
==14848==      possibly lost: 3,427 bytes in 45 blocks
==14848==    still reachable: 83,914 bytes in 2,692 blocks
==14848==         suppressed: 0 bytes in 0 blocks
The details are in the Memcheck section of the user manual. In short:
"definitely lost" means your program is leaking memory -- fix those leaks!
"indirectly lost" means your program is leaking memory in a pointer-based structure. (E.g. if the root node of a binary tree is "definitely lost", all the children will be "indirectly lost".) If you fix the "definitely lost" leaks, the "indirectly lost" leaks should go away.
"possibly lost" means your program is leaking memory, unless you're doing unusual things with pointers that could cause them to point into the middle of an allocated block; see the user manual for some possible causes. Use --show-possibly-lost=no if you don't want to see these reports.
"still reachable" means your program is probably ok -- it didn't free some memory it could have. This is quite common and often reasonable. Don't use --show-reachable=yes if you don't want to see these reports.
(2)    Find the “definitely lost” records in the heap summary:
==14848== 486 (48 direct, 438 indirect) bytes in 3 blocks are definitely lost in loss record 566 of 603
==14848==    at 0x4C27F9E: malloc (vg_replace_malloc.c:291)
==14848==    by 0x55CA6A7: curl_slist_append (in /usr/lib64/libcurl.so.4.1.1)
==14848==    by 0x4E4E62A: CGenaEngine::SubscribeEvents(std::string, std::string, std::string, std::string&) (CGenaEngine.cpp:817)
==14848==    by 0x4E4FABE: CGenaEngine::StartSubscription() (CGenaEngine.cpp:584)
==14848==    by 0x4E4FDE2: CGenaEngine::ConnectToGena(std::string, boost::function<void (CameraEvent)>, int) (CGenaEngine.cpp:528)
==14848==    by 0x401C97: main (TestGenaEngine.cpp:121)
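(3)    Then we locate the code. The stack trace points to the curl_slist_append call at CGenaEngine.cpp:817, inside CGenaEngine::SubscribeEvents.
(4)    Conclusion:
We append the header string to the list, but we never free the list. This is a memory leak!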
(5)    Retest the program after fixing the issue:
# valgrind --tool=memcheck --show-leak-kinds=all --leak-check=full ./PelcoServiceEngine 172.16.9.60 >&memCheckGena.log
(6)    Recheck the leak summary. It no longer reports any “definitely lost” blocks, so the leak is fixed.
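The original source of SubscribeEvents is not shown here, so the following is only a sketch of the kind of fix described in steps (4) and (5). To keep it self-contained it does not use libcurl. StringList, list_append and list_free_all are hypothetical stand-ins that mimic the roles of libcurl's curl_slist, curl_slist_append and curl_slist_free_all, and the header strings are placeholders.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for libcurl's curl_slist: a linked list of heap strings.
struct StringList {
    char* data;
    StringList* next;
};

// Appends a copy of 'text'. Returns the head of the list, or nullptr on
// failure (in which case the existing list is left untouched).
StringList* list_append(StringList* head, const char* text) {
    StringList* node = static_cast<StringList*>(std::malloc(sizeof(StringList)));
    if (node == nullptr) {
        return nullptr;
    }
    node->data = static_cast<char*>(std::malloc(std::strlen(text) + 1));
    if (node->data == nullptr) {
        std::free(node);
        return nullptr;
    }
    std::strcpy(node->data, text);
    node->next = nullptr;
    if (head == nullptr) {
        return node;
    }
    StringList* last = head;
    while (last->next != nullptr) {
        last = last->next;
    }
    last->next = node;
    return head;
}

// Frees every node and its string. Returns the number of nodes released.
int list_free_all(StringList* head) {
    int freed = 0;
    while (head != nullptr) {
        StringList* next = head->next;
        std::free(head->data);
        std::free(head);
        head = next;
        ++freed;
    }
    return freed;
}

// Sketch of the fixed flow: build the header list, use it, then free it.
int subscribe_with_headers() {
    const char* lines[] = {"NT: upnp:event", "TIMEOUT: Second-300"};  // placeholders
    StringList* headers = nullptr;
    for (const char* line : lines) {
        StringList* grown = list_append(headers, line);
        if (grown == nullptr) {
            break;  // keep the old head so it can still be freed below
        }
        headers = grown;
    }
    // ... the request using 'headers' would be sent here ...
    return list_free_all(headers);  // the step that was missing in the leaky code
}
```

With the real library, the equivalent of the last step is a call to curl_slist_free_all on the list once the request no longer needs it.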


4    TOP vs Valgrind

4.1    Using top to detect user-space memory leaks

It is relatively easy to detect user-space memory hogs. User space is basically non-kernel space: it refers to memory used by running programs and processes, not memory used by the system kernel. A quick method is to use your system monitor program or, if you prefer the command line, run:
>top -p PID1,PID2
and press “M” to sort by memory usage. This quickly tells you which programs are using the most memory on the system. If a program uses more and more memory over time, you have a good lead on the source of a memory leak. Another way to get much the same information is:
>ps -e o pid,command,pmem,rsz,vsz k +rsz
Usually, if the virtual memory figure (VIRT) reported by top keeps increasing over time, there may be a memory leak in the system. The RES, SHR and DATA columns should also be analyzed to identify a real memory leak.
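The idea of watching memory over time can also be sketched in code. The example below assumes Linux, where /proc/<pid>/status contains a VmRSS line giving the resident set size in kB (the figure that top shows as RES). The function names parse_vm_rss_kb and grows_steadily are hypothetical, and the "every sample larger than the last" rule is a deliberately crude assumption, not a reliable leak detector.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Extracts the VmRSS value (in kB) from text in /proc/<pid>/status format.
// Returns -1 if no usable VmRSS line is found.
long parse_vm_rss_kb(const std::string& status_text) {
    std::istringstream lines(status_text);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.rfind("VmRSS:", 0) == 0) {  // line starts with "VmRSS:"
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return fields ? kb : -1;
        }
    }
    return -1;
}

// Returns true when there are at least two samples and each one is larger
// than the one before it.
bool grows_steadily(const std::vector<long>& samples) {
    if (samples.size() < 2) {
        return false;
    }
    for (std::size_t i = 1; i < samples.size(); ++i) {
        if (samples[i] <= samples[i - 1]) {
            return false;
        }
    }
    return true;
}
```

A monitoring loop would read the status file of the target process at regular intervals, pass its contents to parse_vm_rss_kb, and collect the results. Steady growth over a long period is only a hint that a leak exists; it does not say where the leak is.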

4.2    Comparison between TOP and Valgrind

top can tell us that there may be memory leaks, but we cannot be sure until we have monitored the system over a long time. A single reading tells us nothing: low or high memory usage at one moment does not show whether memory is being lost. In short, top can serve as a memory indicator, but it cannot help us fix a memory leak.
For a developer debugging a memory leak on Linux, valgrind is the better choice. A programmer can use it to perform a memory unit test as shown in the previous section, then analyze the result and narrow down the root cause of the leak.
QA can also use valgrind to get the leak summary and decide whether there may be memory leaks.
Indeed, valgrind helps us debug not only memory leaks but also other kinds of memory errors and resource leaks. Valgrind is a multi-purpose Linux x86 profiling tool. Its main tools are listed below:
    Memcheck is a memory debugger
        detects memory-management problems
    Cachegrind is a cache profiler
        performs detailed simulation of the I1, D1 and L2 caches in your CPU
    Massif is a heap profiler
        performs detailed heap profiling by taking regular snapshots of a program's heap
    Helgrind is a thread debugger
        finds data races in multithreaded programs

4.3    How can QA benefit from using valgrind?

The previous section showed how a developer can use the memcheck tool of valgrind to debug memory errors. A software test engineer who wants to know more about the state of memory leaks in the system can simply run valgrind this way:
# valgrind --tool=memcheck --show-leak-kinds=all --leak-check=summary device_manager {any other specific process parameters}  >&MemCheck.log
Note that device_manager should be replaced with the actual name of the program you want to check. You usually also need to pass the same parameters the process is normally started with, especially when several processes share the same name.
You can redirect the result into a log file as above:  >&MemCheck.log
The disadvantage is that the log may contain a lot of information that is not useful. The advantage is that you can check the “LEAK SUMMARY” section to see whether there are memory leaks. To some extent, this is more accurate than the result of top.

5    Summary

Valgrind can help us debug all kinds of memory errors and resource leaks. A memory leak is one kind of system resource leak.
In the early stages of development, we suggest using valgrind to perform memory unit tests in order to reduce memory errors.
In the later stages of a project, i.e. once software testing has started, we suggest the following steps:
(1)    Use top to check whether there may be memory leaks.
(2)    If there are possible memory leaks, the developer should use valgrind to debug them.
For more details on how to use valgrind, please go to the official valgrind site and download the latest user manual.

6    References

1. http://valgrind.org/
2. The Valgrind Quick Start Guide
3. Valgrind User Manual


