Valgrind HOWTO
24 August 2002
| Revision History | ||
|---|---|---|
| Revision 1.1 | 2002-09-15 | Revised by: tab |
| Converted to XML 4.1.2, added gfdl, reviewed, author revisions | ||
| Revision 1.0 | 2002-08-24 | Revised by: SS |
| Initial release | ||
- Table of Contents
- 1. Background
- 2. Introduction
-
- 2.1. Purpose
- 2.2. Acknowledgments
- 2.3. Copyright and Distribution Policy
- 2.4. Feedback and Corrections
- 3. Getting it Installed
-
- 3.1. Getting Valgrind
- 3.2. Installing
- 4. A Closer View
-
- 4.1. Why Valgrind?
- 4.2. Usage
- 4.3. Limitations and Dependencies of Valgrind.
- 5. Let's Go Deeper
- 6. Concluding Remarks
- 7. References
- A. GNU Free Documentation License
-
- A.1. PREAMBLE
- A.2. APPLICABILITY AND DEFINITIONS
- A.3. VERBATIM COPYING
- A.4. COPYING IN QUANTITY
- A.5. MODIFICATIONS
- A.6. COMBINING DOCUMENTS
- A.7. COLLECTIONS OF DOCUMENTS
- A.8. AGGREGATION WITH INDEPENDENT WORKS
- A.9. TRANSLATION
- A.10. TERMINATION
- A.11. FUTURE REVISIONS OF THIS LICENSE
- A.12. How to use this License for your documents
2. Introduction
2.1. Purpose
Valgrind is an open-source tool for finding memory-management problems in Linux-x86 executables. It detects memory leaks/corruption in the program being run. It is being developed by Julian Seward.
2.3. Copyright and Distribution Policy
Copyright (C)2002 Deepak P, Sandeep S.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in Appendix A entitled "GNU Free Documentation License".
2.4. Feedback and Corrections
Kindly forward feedback and criticism to Deepak.P or/and Sandeep.S. We shall be indebted to anybody who points out errors and inaccuracies in this document; we will rectify them as soon as we are informed.
3. Getting it Installed
3.2. Installing
Uncompress, compile and install it:
# tar xvfz valgrind-1.0.0.tar.gz
# cd valgrind-1.0.0
# ./configure
# make
# make install
Add the directory containing the installed valgrind executable to your PATH variable. Valgrind is now ready to catch bugs.
4. A Closer View
4.1. Why Valgrind?
-
Use of uninitialized memory
-
Reading/writing memory after it has been freed
-
Reading/writing off the end of malloc'd blocks
-
Reading/writing inappropriate areas on the stack
-
Memory leaks -- where pointers to malloc'd blocks are lost forever
-
Mismatched use of malloc/new/new[] vs free/delete/delete[]
-
Some misuses of the POSIX pthreads API
These errors usually lead to crashes.
This is where Valgrind helps. Valgrind works directly on executables, with no need to recompile, relink or modify the program to be checked. It reports whether the program leaks memory and points out where each leak originates, so that you can decide what needs to be fixed.
Valgrind simulates every single instruction your program executes. For this reason, Valgrind finds errors not only in your application but also in all supporting dynamically-linked (.so-format) libraries, including the GNU C library, the X client libraries, Qt if you work with KDE, and so on. That often includes libraries, for example the GNU C library, which may contain memory access violations.
4.2. Usage
4.2.1. Invoking Valgrind
# valgrind ps -ax
Valgrind provides many options. We deliberately leave them out here to keep this article short.
The output contains the usual output of ps -ax along with a detailed report from Valgrind. Any memory-related error is pointed out in the error report.
4.2.2. How to Identify the Error from the Error Report
Consider the output of Valgrind for some test program:
==1353== Invalid read of size 4
==1353==    at 0x80484F6: print (valg_eg.c:7)
==1353==    by 0x8048561: main (valg_eg.c:16)
==1353==    by 0x4026D177: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==1353==    by 0x80483F1: free@@GLIBC_2.0 (in /home/deepu/valg/a.out)
==1353==    Address 0x40C9104C is 0 bytes after a block of size 40 alloc'd
==1353==    at 0x40046824: malloc (vg_clientfuncs.c:100)
==1353==    by 0x8048524: main (valg_eg.c:12)
==1353==    by 0x4026D177: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==1353==    by 0x80483F1: free@@GLIBC_2.0 (in /home/deepu/valg/a.out)
Here, 1353 is the process ID. This part of the error report says that an invalid 4-byte read occurred at line 7 of valg_eg.c, in the function print. The function print was called by main at line 16 of the same file. The function main was called by __libc_start_main at line 129 of ../sysdeps/generic/libc-start.c, which in turn is shown as called by free@@GLIBC_2.0 in the file /home/deepu/valg/a.out. The second half of the report gives the same kind of call chain for the malloc call, at line 12 of valg_eg.c, that allocated the 40-byte block; the address that was read lies 0 bytes after the end of that block.
4.2.3. Types of Errors with Examples
4.2.3.1. Use of uninitialized memory
Sources of uninitialized data are:
-
local variables that have not been initialized.
-
The contents of malloc'd blocks, before writing something there.
This is not a problem with calloc, since it initializes every allocated byte to 0. The new operator in C++ behaves like malloc in this respect: fields of the created object are uninitialized unless a constructor sets them.
Sample program:
#include <stdlib.h>
int main()
{
    int p, t;
    if (p == 5)    /* Error occurs here */
        t = p+1;
    return 0;
}
Here the value of p is uninitialized, so p may contain a random (garbage) value, and an error is reported at the condition check. An uninitialized variable causes an error report in two situations:
-
When it is used to determine the outcome of a conditional branch, e.g. 'if (p == 5)' in the above program.
-
When it is used to generate a memory address. E.g., if the above program declared an integer array a[10] and executed 'a[p] = 1', an error would be reported.
4.2.3.2. Illegal read/write
Sample program:
#include <stdlib.h>
int main()
{
    int *p, i, a;
    p = malloc(10*sizeof(int));
    p[11] = 1;    /* invalid write error */
    a = p[11];    /* invalid read error */
    free(p);
    return 0;
}
Here you are trying to write to and read from p[11], that is, the address (char *)p + 11*sizeof(int), which lies beyond the 10 integers allocated to the program.
4.2.3.3. Invalid free
Sample program:
#include <stdlib.h>
int main()
{
    int *p, i;
    p = malloc(10*sizeof(int));
    for (i = 0; i < 10; i++)
        p[i] = i;
    free(p);
    free(p);    /* Error: p has already been freed */
    return 0;
}
Valgrind checks the address, which is given as argument to free. If it is an address that has already been freed you will be told that the free is invalid.
4.2.3.4. Mismatched Use of Functions
-
If allocated with malloc, calloc, realloc, valloc or memalign, you must deallocate with free.
-
If allocated with new[], you must deallocate with delete[].
-
If allocated with new, you must deallocate with delete.
Sample program (C++, since it uses delete):
#include <stdlib.h>
int main()
{
    int *p, i;
    p = ( int* ) malloc(10*sizeof(int));
    for (i = 0; i < 10; i++)
        p[i] = i;
    delete(p);    /* Error: function mismatch */
    return 0;
}
Output by valgrind is:
==1066== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==1066== malloc/free: in use at exit: 0 bytes in 0 blocks.
==1066== malloc/free: 1 allocs, 1 frees, 40 bytes allocated.
==1066== For a detailed leak analysis, rerun with: --leak-check=yes
==1066== For counts of detected errors, rerun with: -v
From the above "ERROR SUMMARY" it is clear that 0 bytes in 0 blocks are in use at exit, which means that the malloc'd block was freed by delete. Therefore this does not cause a problem on Linux, but the program may crash on some other platform.
4.2.3.5. Errors Occur Due to Invalid System Call Parameter
Valgrind checks all parameters to system calls.
Sample program:
#include <stdlib.h>
#include <unistd.h>
int main()
{
    int *p;
    p = malloc(10);
    read(0, p, 100);    /* Error: unaddressable bytes */
    free(p);
    return 0;
}
==1045== Syscall param read(buf) contains unaddressable byte(s)
==1045==    at 0x4032AF44: __libc_read (in /lib/i686/libc-2.2.2.so)
==1045==    by 0x4026D177: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==1045==    by 0x80483E1: read@@GLIBC_2.0 (in /home/deepu/valg/a.out)
Here, buf = p contains the address of a 10-byte block. The read system call tries to read 100 bytes from standard input and place them at p, but the bytes after the first 10 are unaddressable.
4.2.3.6. Memory Leak Detection
Consider the following program:
#include <stdlib.h>
int main()
{
    int *p, i;
    p = malloc(5*sizeof(int));
    for (i = 0; i < 5; i++)
        p[i] = i;
    return 0;
}
==1048== LEAK SUMMARY:
==1048== definitely lost: 20 bytes in 1 blocks.
==1048== possibly lost: 0 bytes in 0 blocks.
==1048== still reachable: 0 bytes in 0 blocks.
In the above program p contains the address of a 20-byte block, but the block is not freed anywhere in the program. When main returns, the only pointer to this 20-byte block is lost forever. This is known as a memory leak. We can get the leak summary by using the Valgrind option --leak-check=yes.
4.2.4. How to Suppress Errors
Valgrind can be told to ignore particular errors listed in a suppression file, which you may edit. This is useful if part of your project contains errors you can't or don't want to fix, yet you don't want to be reminded of them continuously. The format of the file is as follows.
{
    Error name
    Type
    fun:function name, which contains the error to suppress
    fun:function name, which calls the function specified above
}
Error name can be any name.
Type = ValueN, if the error is an uninitialized value error.
     = AddrN, if it is an address error (N = sizeof(data type)).
     = Free, if it is a free error (e.g. mismatched free).
     = Cond, if the error is due to an uninitialized CPU condition code.
     = Param, if it is an invalid system call parameter error.
You can then run the program with:
valgrind --suppressions=path/to/the/supp_file.supp testprog
4.3. Limitations and Dependencies of Valgrind.
-
Program runs 25 to 50 times slower.
-
Increased memory consumption.
-
Highly optimized code (compiled with the -O1 or -O2 options) may sometimes mislead Valgrind.
-
Valgrind relies on dynamic linking mechanism.
Valgrind is closely tied to details of the CPU and operating system and, to a lesser extent, the compiler and basic C libraries. At present Valgrind works only on the Linux platform (kernels 2.2.X or 2.4.X) on x86 processors. Glibc 2.1.X or 2.2.X is also required.
5. Let's Go Deeper
The dynamic linker calls Valgrind's initialization function, and the synthetic CPU then takes control from the real CPU. Other .so files may also be loaded in memory; the dynamic linker calls the initialization function of each of them. The dynamic linker then calls the main of the loaded program. When main returns, the synthetic CPU calls the finalization function of valgrind.so. During the execution of the finalization function, a summary of all detected errors is printed and memory leaks are checked. The finalization function then exits, handing control back from the synthetic CPU to the real one.
5.1. How Valgrind Tracks Validity of Each Byte
-
when data is used for address generation,
-
when control flow decision is to be made.
In either of these two situations, if the data is found to be undefined, an error report is generated. No error report is generated merely for copying or adding undefined data.
The case of floating-point data is different, however. During a floating-point read instruction the 'V' bits corresponding to the data are checked, so merely copying an uninitialized value produces an error in the case of floating-point numbers.
#include <stdlib.h>
int main()
{
    int *p, *a;
    p = malloc(10*sizeof(int));
    a = malloc(10*sizeof(int));
    a[3] = p[3];
    free(a);
    free(p);
    return 0;
}
/* produces no errors */

#include <stdlib.h>
int main()
{
    float *p, *a;
    p = malloc(10*sizeof(float));
    a = malloc(10*sizeof(float));
    a[3] = p[3];
    free(a);
    free(p);
    return 0;
}
/* produces error */
All bytes that are in memory but not in the CPU have an associated valid-address (A) bit, which indicates whether the corresponding memory location is accessible by the program. When a program starts, the 'A' bits corresponding to each global variable are set. When malloc, new or any other memory-allocating function is called, the 'A' bits corresponding to the allocated bytes are set. Upon freeing the allocated block using free/delete/delete[], the corresponding 'A' bits are cleared. During a system call the 'A' bits are changed appropriately.
When values are loaded from memory, Valgrind checks the 'A' bit of each byte. If the 'A' bit of a byte is set, its 'V' bits are checked. If the 'V' bits are not set, an error is generated and the 'V' bits are then set to indicate validity, which avoids a long chain of follow-on errors. If the 'A' bit of a loaded byte is 0, its 'V' bits are forced to set, even though the value is invalid.
Have a look at the following program and run it under Valgrind.
#include <stdlib.h>
int main()
{
    int *p, i, j;
    p = malloc(5*sizeof(int));
    j = p[5];
    if (p[5] == 1)
        i = p[5]+1;
    free(p);
    return 0;
}
Here two errors occur. Both are due to accessing p[5], at address (char *)p + 5*sizeof(int), which is not allocated to the program. During the execution of j = p[5], since that address is invalid, the 'V' bits of the 4 bytes starting there are forced to set. Therefore no uninitialized-value error is reported, either during the execution of j = p[5] or during the execution of if (p[5] == 1).
5.2. Cache Profiling
The configuration of a cache means its size, associativity and number of lines. If the data requested by the processor appears in the upper level it is called a hit. If the data is not found in the upper level, the request is called a miss. The lower level in the hierarchy is then accessed to retrieve the block containing requested data. In modern machines L1 is first searched for data/instruction requested by the processor. If it is a hit then that data/instruction is copied to some register in the processor. Otherwise L2 is searched. If it is a hit then data/instruction is copied to L1 and from there it is copied to a register. If the request to L2 also is a miss then main memory has to be accessed.
Valgrind can simulate the cache, meaning it can display the things that occur in the cache when a program is running. For this, first compile your program with -g option as usual. Then use the shell script cachegrind instead of valgrind.
Sample output:
==7436== I1 refs: 12,841
==7436== I1 misses: 238
==7436== L2i misses: 237
==7436== I1 miss rate: 1.85%
==7436== L2i miss rate: 1.84%
==7436==
==7436== D refs: 5,914 (4,626 rd + 1,288 wr)
==7436== D1 misses: 357 ( 324 rd + 33 wr)
==7436== L2d misses: 352 ( 319 rd + 33 wr)
==7436== D1 miss rate: 6.0% ( 7.0% + 2.5% )
==7436== L2d miss rate: 5.9% ( 6.8% + 2.5% )
==7436==
==7436== L2 refs: 595 ( 562 rd + 33 wr)
==7436== L2 misses: 589 ( 556 rd + 33 wr)
==7436== L2 miss rate: 3.1% ( 3.1% + 2.5% )
L2i misses means the number of instruction misses that occur in the L2 cache.
L2d misses means the number of data misses that occur in the L2 cache.
Total number of data references = Number of reads + Number of writes.
Miss rate means the fraction of references that are not found in the given cache level.
The shell script cachegrind also produces a file, cachegrind.out, containing line-by-line cache profiling information that is not human-readable. The program vg_annotate can interpret this information: when run without any arguments, it reads the file cachegrind.out and produces output that is human-readable.
When C, C++ or assembly source files are passed as input to vg_annotate, it displays the number of cache reads, writes, misses etc. for each source line.
I1 cache: 16384 B, 32 B, 4-way associative
D1 cache: 16384 B, 32 B, 4-way associative
L2 cache: 262144 B, 32 B, 8-way associative
Command: ./a.out
Events recorded: Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
Events shown: Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
Event sort order: Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
Thresholds: 99 0 0 0 0 0 0 0 0
Include dirs:
User annotated: valg_flo.c
Auto-annotation: off
User-annotated source: valg_flo.c:
Ir I1mr I2mr Dr D1mr D2mr Dw D1mw D2mw
 .    .    .  .    .    .  .    .    .  #include<stdlib.h>
 .    .    .  .    .    .  .    .    .  int main()
 3    1    1  .    .    .  1    0    0  {
 .    .    .  .    .    .  .    .    .      float *p, *a;
 6    1    1  .    .    .  3    0    0      p = malloc(10*sizeof(float));
 6    0    0  .    .    .  3    0    0      a = malloc(10*sizeof(float));
 6    1    1  3    1    1  1    1    1      a[3] = p[3];
 4    0    0  1    0    0  1    0    0      free(a);
 4    0    0  1    0    0  1    0    0      free(p);
 2    0    0  2    0    0  .    .    .  }
-
Ir = Total instruction cache reads.
-
I1mr = I1 cache read misses.
-
I2mr = L2 cache instruction read misses.
6. Concluding Remarks
If you have found any glaring typos, or outdated info in this document, please let us know.
A. GNU Free Documentation License
Copyright (C) 2000 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
A.5. MODIFICATIONS
-
Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.
-
List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has less than five).
-
State on the Title page the name of the publisher of the Modified Version, as the publisher.
-
Preserve all the copyright notices of the Document.
-
Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
-
Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
-
Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice.
-
Include an unaltered copy of this License.
-
Preserve the section entitled "History", and its title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
-
Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.
-
In any section entitled "Acknowledgements" or "Dedications", preserve the section's title, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.
-
Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
-
Delete any section entitled "Endorsements". Such a section may not be included in the Modified Version.
-
Do not retitle any existing section as "Endorsements" or to conflict in title with any Invariant Section.
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.
You may add a section entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
A.11. FUTURE REVISIONS OF THIS LICENSE
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
A.12. How to use this License for your documents
Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. A copy of the license is included in the section entitled "GNU Free Documentation License".
If you have no Invariant Sections, write "with no Invariant Sections" instead of saying which ones are invariant. If you have no Front-Cover Texts, write "no Front-Cover Texts" instead of "Front-Cover Texts being LIST"; likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.




