ONLamp.com

Modern Memory Management, Part 2

Luckily, several excellent programs exist to help detect and correct memory leaks and other memory problems. The best, easiest-to-use, freely available one is Valgrind. At the time of writing it is available only for x86 Linux, and it can save hours or even days of debugging. Some of the problems it finds might never be found without it. A similar program available for a wider range of operating systems is Rational Purify. Purify is not free, but it runs on most flavors of UNIX as well as Windows, and can analyze code written in Java, C, C++, Visual Basic, C#, and VB.NET. Many other programs exist as well, but these two probably work the best.



How do these programs work? They intercept every call that allocates or releases memory and keep their own record of each allocation and free operation. Why doesn't the standard system malloc do this? Because the bookkeeping is expensive: any program runs far more slowly under Valgrind or Purify. That is a small price to pay for discovering all of the memory leaks in your code, however. When the program terminates, these tools match every call to malloc and its relatives (such as calloc and realloc) against the calls to free, and construct a report listing all of the unfreed blocks of memory. They also distinguish between blocks that are still reachable through a pointer at exit and true leaks, which no pointer refers to any longer. For each leak, they indicate the line of code that allocated the memory, sorted by size, so that you can address the largest leaks first. It is still up to the programmer to correct the mistakes, of course, and to decide which ones are worth fixing. Other types of detectable problems include:

  • Trying to free memory that has not been allocated or is already freed.
  • Exceeding the bounds of an array.
  • Writing or reading past the end of a string or buffer.
  • Dereferencing an invalid or null pointer.
  • Use of uninitialized variables.
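
The allocation bookkeeping described before the list can be illustrated with a small sketch. It is only an illustration of the idea, not how Valgrind or Purify are actually implemented; the names tracked_malloc, tracked_free, report_leaks, and TRACKED_MALLOC are hypothetical, and the fixed-size table is an assumption made to keep the example short.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout: this sketches the bookkeeping idea only. */
#define MAX_TRACKED 1024

struct alloc_record {
    void *ptr;          /* address returned by malloc; NULL if slot unused */
    size_t size;        /* number of bytes requested */
    const char *file;   /* source file of the allocation */
    int line;           /* source line of the allocation */
};

static struct alloc_record records[MAX_TRACKED];

/* Allocate memory and remember where the request came from. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (records[i].ptr == NULL) {
            records[i].ptr = p;
            records[i].size = size;
            records[i].file = file;
            records[i].line = line;
            return p;
        }
    }
    return p;   /* table full: the block is usable but not tracked */
}

/* Free a tracked block. Returns 0 on success, or -1 if the pointer
   is not in the table (never allocated here, or already freed). */
int tracked_free(void *p)
{
    if (p == NULL)
        return 0;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (records[i].ptr == p) {
            records[i].ptr = NULL;
            free(p);
            return 0;
        }
    }
    return -1;
}

/* Print every block that is still outstanding; return how many there are. */
int report_leaks(void)
{
    int leaks = 0;
    for (int i = 0; i < MAX_TRACKED; i++) {
        if (records[i].ptr != NULL) {
            fprintf(stderr, "leak: %zu bytes allocated at %s:%d\n",
                    records[i].size, records[i].file, records[i].line);
            leaks++;
        }
    }
    return leaks;
}

/* Record the call site automatically. */
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
```

A program would allocate with TRACKED_MALLOC, release with tracked_free, and call report_leaks just before exiting. The linear search on every call hints at why this kind of tracking slows a program down.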

Resource Leaks

A lesser-known and often poorly understood variation on the memory leak also exists: the resource leak. A resource, or handle, is a pointer or identifier that a program uses to refer to some operating system object or device. Examples include file handles, pipes, network connection handles, database connection handles, and window handles. These use up memory, like any other object, and each has a corresponding close or free function that must be called when the program is done with the handle. Some, such as file handles, also have fixed limits (set with ulimit in this case) on how many may be in use at any one time. Any attempt to create a new handle beyond this limit will fail. If the code does not check for and deal with running out of file handles, unpredictable behavior will occur.
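
As a rough sketch of how a program can find out its own file handle limit, the following assumes a POSIX system that provides getrlimit and RLIMIT_NOFILE, which is the limit that ulimit -n reports. The function name get_open_file_limit is hypothetical.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft and hard limits on the number of
   file descriptors this process may have open. Returns 0 on success,
   -1 if the limits could not be read. */
int get_open_file_limit(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;   /* the limit currently enforced */
    *hard = rl.rlim_max;   /* the ceiling the soft limit may be raised to */
    return 0;
}

/* Print the limits; the values vary from system to system. */
void print_open_file_limit(void)
{
    rlim_t soft, hard;

    if (get_open_file_limit(&soft, &hard) == 0)
        printf("open files: soft limit %llu, hard limit %llu\n",
               (unsigned long long)soft, (unsigned long long)hard);
    else
        perror("getrlimit");
}
```

The soft limit is the one a leaking program runs into first.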

Because handles use up memory, a resource leak may manifest itself in much the same way as a memory leak. Alternatively, instead of the program running out of memory, new handle requests will simply start failing. Here is a trivial buggy program that checks a status file on disk, presumably one being written by another process:

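#include <stdio.h>
#include <string.h>

#define BUFSZ 128

int main(void)
{
    FILE *f;
    char buf[BUFSZ];

    while (1) {
        if ((f = fopen("status", "r")) != NULL) {
            if (fgets(buf, BUFSZ, f)) {
                buf[strcspn(buf, "\n")] = '\0';  /* fgets keeps the newline */
                if (!strcmp(buf, "complete"))
                    break;
            }
            /* BUG: f is never closed, so every pass leaks a file handle */
        }
        else {
            printf("unable to open status file\n");
            break;
        }
    }
    return 0;
}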

This quickly outputs unable to open status file and terminates, because each pass through the loop opens the file again without ever closing it. Had it not checked the return value of fopen for NULL, the program would crash instead when it tried to fgets from a NULL pointer. Leaving out the fclose(f); statement after the fgets statement results in the program quickly using up all of the file handles available to it.
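
One possible shape for the corrected check is sketched below. The function name status_is_complete and its return convention are assumptions made for illustration; the point is only that fclose runs on every path that follows a successful fopen.

```c
#include <stdio.h>
#include <string.h>

#define BUFSZ 128

/* Hypothetical helper: returns 1 if the first line of the file is
   "complete", 0 if it is anything else, and -1 if the file cannot
   be opened. The handle is always closed before returning. */
int status_is_complete(const char *path)
{
    char buf[BUFSZ];
    int complete = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, BUFSZ, f) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';   /* drop the newline fgets keeps */
        complete = (strcmp(buf, "complete") == 0);
    }
    fclose(f);   /* release the handle whether or not the read succeeded */
    return complete;
}
```

A polling loop can call this function as often as it likes without consuming more handles, since each call releases the one it opened.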

Resource leaks can be difficult to detect because the symptoms are often unusual or unexpected. For instance, in the above example, when fopen begins to fail, you might initially think it is some sort of file-locking issue. However, checking errno quickly reveals EMFILE: the process has too many open files. (The related ENFILE means that the system-wide file table is full.) Most languages provide some mechanism to determine why a certain system call fails, so make use of it when debugging to save time. When nothing else seems to make sense, ask if a resource leak could be to blame.
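
The following sketch shows one way to inspect errno when fopen fails. It assumes a POSIX system; the helper names lower_fd_limit and leak_until_failure are hypothetical, and the limit is lowered on purpose so that the failure appears after only a few opens.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

#define MAX_HANDLES 64

/* Demo only: lower this process's soft limit on open descriptors so the
   leak below fails quickly. Returns 0 on success, -1 on failure. */
int lower_fd_limit(rlim_t limit)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    if (limit < rl.rlim_cur)
        rl.rlim_cur = limit;
    return setrlimit(RLIMIT_NOFILE, &rl);
}

/* Deliberately open the same file repeatedly without closing it.
   Returns the errno value of the first failed fopen, or 0 if all
   MAX_HANDLES opens succeeded. Everything is closed before returning. */
int leak_until_failure(const char *path)
{
    FILE *handles[MAX_HANDLES];
    int count = 0;
    int saved_errno = 0;

    while (count < MAX_HANDLES) {
        FILE *f = fopen(path, "r");
        if (f == NULL) {
            saved_errno = errno;   /* save it before another call changes it */
            break;
        }
        handles[count++] = f;
    }
    if (saved_errno != 0)
        fprintf(stderr, "fopen failed after %d opens: %s\n",
                count, strerror(saved_errno));
    while (count > 0)
        fclose(handles[--count]);
    return saved_errno;
}
```

Calling lower_fd_limit(32) and then leak_until_failure("/dev/null") should report the per-process limit error rather than anything related to file locking. Saving errno immediately matters because later library calls may overwrite it.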

Summary

Memory is a precious resource and you should not waste it. Nevertheless, there are times when using large blocks of memory can be useful. A program that runs through a database doing calculations may run many times faster if you load the entire database into memory before processing it, rather than processing it on disk or remotely through a connection, one row at a time. In the end, the programmer must decide how best to make use of the resources available on a given machine. Armed with this knowledge, you should be ready to face the trials and tribulations of memory management on anything from a palm computer to a multi-CPU behemoth. So long, and thanks for the memories!

Howard Feldman is a research scientist at the Chemical Computing Group in Montreal, Quebec.

