 Published on Linux DevCenter (http://www.linuxdevcenter.com/)


C++ Memory Management: From Fear to Triumph, Part 2

by George Belotsky
06/19/2003

Dealing with Errors Effectively

The previous article described the memory management errors that are common in C++. Now that you are aware of these errors, the next question is how to keep them out of your code. The answer, of course, is software engineering. If you are expecting a moralistic tirade about how software developers are a bunch of young, reckless, ignorant kids who turn out terrible code because they do not follow proper procedures, you are about to be disappointed. While it is possible to create effective procedures for engineering, engineering itself is never about procedures.

Bureaucracy, in and of itself, is not necessarily harmful, as long as it is confined to a supporting role. The problem with any kind of rules and procedures is that they tend to assume a primary importance. Soon, following the rules is the only thing that matters, no matter how horrible the result. For an eloquent description of the consequences, read Richard P. Feynman's outstanding "What Do You Care What Other People Think?" [Fey88].

If engineering is not the act of following a procedure, then what is it? First and foremost, engineering is design, and design is ultimately about dealing with errors. The fundamental need to deal with errors is recognized in every engineering field except software.

Actually, the critics of software development bear a great deal of responsibility for the current state of affairs. While they decry the generally poor quality of computer programs, these critics nevertheless promote the view that software should be flawless. In their view, it is only the impure programmer that ruins the ideal of the perfect program. This sort of archaic idealism makes engineering literally impossible by its failure to recognize that software exists in the real world just like anything else that people build.

Unfortunately, it is beyond the scope of this article to discuss the overall merits of various philosophies, the important lessons that things like Gödel's Incompleteness Theorem can teach us about computer programs, or even the reason why software errors are inevitable (it has to do with trying to sustain exponential growth with resources that can only increase linearly). Nevertheless, one very important, practical point must be made. You need to accept that software errors will happen (yes, even in your code), then design to minimize their severity and impact, just like real engineers do.

Observations on the C++ Memory Management Mechanism

Understanding the C++ memory mechanism will allow you to move beyond identifying specific errors in your code (covered in the previous article) and begin to engineer such errors out of your system. This knowledge will also help you combine various memory handling techniques (presented in the next article) into a coherent design.

Memory Management is About Consistency of Ownership

Since C++ does not provide an automated memory management system, every program is responsible for both allocating and releasing any memory that it needs. This idea is commonly expressed as a relationship of ownership. Each chunk of allocated memory belongs to a clearly identifiable part of the program (typically an object) which is responsible for freeing that memory when appropriate. The concept of memory ownership is used to create a comprehensible, ordered design for what would otherwise be a chaotic, error-prone program.

Memory ownership focuses on managing the deallocation of memory, preventing leaks and especially dangling references. Memory allocated in one part of the program might subsequently become the responsibility of another part. This is called transfer of ownership.
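One way to make a transfer of ownership visible in the code is sketched below. It assumes a C++11 or later compiler, since std::unique_ptr postdates this article, and the names BufferStore and transfer_demo are hypothetical, invented only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical consumer that becomes the owner of every buffer handed to it.
class BufferStore {
public:
  // Taking the unique_ptr by value makes the transfer explicit at the call
  // site: the caller has to write std::move.
  void adopt(std::unique_ptr<char[]> buffer) {
    assert(buffer != nullptr);
    buffers_.push_back(std::move(buffer));
  }

  std::size_t size() const { return buffers_.size(); }

private:
  // The buffers are freed when the store itself is destroyed.
  std::vector<std::unique_ptr<char[]>> buffers_;
};

// Allocates in one part of the program, then hands ownership to the store.
// Returns true if the local pointer is empty after the transfer.
inline bool transfer_demo(BufferStore& store) {
  std::unique_ptr<char[]> buffer(new char[64]);
  store.adopt(std::move(buffer));
  return buffer == nullptr;  // a moved-from unique_ptr owns nothing
}
```

After the call to adopt, only the store is responsible for freeing the buffer, so there is exactly one owner at every point in the program.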

Ownership is a very useful idea in thinking about your design. When you allocate memory, you must decide who owns it (i.e., which part of your program will ultimately be responsible for freeing that memory). Good judgment here will go a long way toward preventing memory leaks and dangling references.

By far the most important aspect of memory ownership is that it must be consistent. If you have an object that is handling some memory, every method of that object's class must be based on the same assumptions about the ownership of that memory. If one method is written as if the object itself owns the memory while another assumes that the memory is owned by something else, disaster will follow.

The classic dangling reference scenario illustrated in the previous article turns out to be a case of inconsistent memory ownership. In that example, the destructor frees memory that is used by SimpleString, a clear indication that SimpleString objects own the memory. Unfortunately, no copy constructor or assignment operator consistent with this view of ownership is provided. Instead, the C++ compiler is allowed to generate the default versions of these methods. The defaults just copy the pointer member of SimpleString as if some other part of the code were responsible for the memory. The SimpleString destructor, of course, assumes otherwise; a dangling reference is the result.
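As a minimal sketch of what consistent ownership looks like, consider the hypothetical class below. It is called OwnedString here and is not the SimpleString from the previous article, whose full definition is not repeated in this text. Every method is written on the single assumption that the object owns its buffer.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class for illustration. Every method assumes that the
// object owns the buffer pointed to by cstr_p_.
class OwnedString {
public:
  explicit OwnedString(const char* s) : cstr_p_(duplicate(s)) {}

  // Copying allocates a separate buffer, so each object frees only
  // its own memory.
  OwnedString(const OwnedString& other)
      : cstr_p_(duplicate(other.cstr_p_)) {}

  OwnedString& operator=(const OwnedString& other) {
    if (this != &other) {
      // Allocate first: if new throws, *this is left unchanged.
      char* fresh = duplicate(other.cstr_p_);
      delete [] cstr_p_;
      cstr_p_ = fresh;
    }
    return *this;
  }

  // Consistent with the methods above: the object frees what it owns.
  ~OwnedString() { delete [] cstr_p_; }

  const char* c_str() const { return cstr_p_; }

private:
  // Returns a newly allocated copy of the C string s.
  static char* duplicate(const char* s) {
    assert(s != 0);
    char* p = new char[std::strlen(s) + 1];
    std::strcpy(p, s);
    return p;
  }

  char* cstr_p_;
};
```

Because the copy constructor and the assignment operator both make deep copies, no two objects ever share a buffer, and the destructor can safely delete it.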

The next article in this series covers a number of techniques for memory management. As you read about these methods, however, keep in mind that successful memory management with C++ demands consistency of ownership for everything that you allocate.

Memory Allocation is a Side Effect

Consider this innocent-looking piece of code.

Example 1 — a surprising result: the Code

//---
//Needs <exception>, <iostream>, <typeinfo>, and the Surprise class.
try {
  Surprise surprise;

}
//Catch all standard exceptions.
catch (const std::exception& e) {
  std::cout << "caught an exception of type "
            << typeid(e).name() << std::endl;
}
//---

Here is the output that this code produces on the author's system.

Example 2 — a surprising result: the output

caught an exception of type St9bad_alloc

Not so innocent, after all, but why does it throw a bad_alloc exception? Such an exception can only result when an attempted memory allocation fails, but there is no call to new in the code sample given.

The answer, of course, lies in the class Surprise.

Example 3 — the Surprise class

//---
class Surprise {

public:
  Surprise();
  ~Surprise();

private:
  char* huge_cstr_p_;

  //See the Training Wheels Class in article three of this
  //series for an explanation of these declarations.
  Surprise(const Surprise&);
  Surprise& operator=(const Surprise&);
};

Surprise::Surprise() : huge_cstr_p_(0) {
  //Try to allocate an absolutely gigantic buffer.
  huge_cstr_p_ = new char[2000000000];
}

Surprise::~Surprise() {
  delete [] huge_cstr_p_;
}
//---

Surprise tries to allocate a very large amount of memory, which results in an exception when new fails. Clearly, memory allocation is a side effect.

In general, programmers are taught to avoid side effects for the very reason just illustrated: side effects surprise the users of your code. While dynamic memory allocation cannot always be avoided, it is important to be aware that hidden allocation inside classes is a side effect. When you allocate memory, the user suffers a loss of control. In this example, only a local object of class Surprise was created. The user of your class might specifically want to avoid allocating from the heap at this point in the program, but you force her to do so.

Before allocating memory, consider alternatives that might be available. The following list provides several suggestions.

Allocators are commonly used in the C++ standard library as well as other libraries. An architecture that supports allocators allows a great deal of flexibility in memory management. The strategy used to allocate memory can be changed by replacing allocators, leaving the rest of the program intact. In particular, if the users of your class are allowed to supply an allocator of their choice (e.g., through one of your class's constructors) they can implement their own memory management strategy to work together with your code.

Using member objects is a much more basic technique which can improve your code in several ways. With a member object, there is no longer a need to allocate memory in the constructor or free it in the destructor. If an exception leaves your constructor, the member object will be properly destroyed. As an added bonus, the default copy constructor and assignment operator work correctly (unless the member object's class contains bugs related to copying or assignment). On the other hand, a member object becomes a fixed part of your own object; you cannot delete it separately and then allocate another copy.

If you need to use an object temporarily inside a method call, declare it with an automatic (i.e., local) variable. Automatic variables are, as the name implies, automatically created and destroyed with no need for any intervention by you. Using such variables for transient objects is much less prone to error than explicit allocation and deallocation. In addition, automatic variables are created on the stack, which is typically a much more efficient process than getting memory dynamically from the heap. Most importantly, these stack-allocated variables do not surprise your users. After all, you are expected to create local variables for loop counters, array indices, etc. A local object, unless it is very big, is really the same sort of thing.
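The member object and automatic variable suggestions can be sketched together as follows. The class Employee and the function greeting_length are hypothetical names chosen for this illustration. Note that std::string may itself allocate from the heap; the point is only that your class no longer calls new or delete on its own.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class: the name is a member object, so there is no new in
// the constructor and no destructor to write. The compiler-generated copy
// constructor and assignment operator copy the string correctly.
class Employee {
public:
  explicit Employee(const std::string& name) : name_(name) {}

  const std::string& name() const { return name_; }
  void rename(const std::string& name) { name_ = name; }

private:
  std::string name_;
};

// The transient string is an automatic variable. It is destroyed when the
// function returns, even if an exception is thrown along the way.
inline std::size_t greeting_length(const Employee& e) {
  std::string greeting = "Hello, " + e.name();
  return greeting.size();
}
```

Copying an Employee produces an independent object, with no extra code in the class, because the member object takes care of its own copying.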

One important general consideration is to avoid requesting a lot of memory where it is not expected. This applies both to dynamic allocation and to the alternatives described here. For example, while objects of a class named BigBuffer may be ten megabytes in size, such large Character objects are rarely a good design. A ten megabyte Character, even if it does not perform dynamic memory allocation itself, still requires at least ten megabytes of storage. A user would be very confused if creating a thousand Characters caused an out-of-memory condition.

Keep in mind that memory allocation is a side effect, and try to avoid surprising your users. In some cases, it is even worthwhile to make users allocate their own memory, either directly by requiring a buffer or indirectly by mandating an allocator. At other times, it is better to provide default behavior (e.g., a default allocator) while allowing other options to be specified. Often, using member objects to avoid additional allocation (although the member objects themselves might allocate) is the best solution due to its inherent simplicity. Transient objects that are local to a method call are best declared as automatic variables. If you really must allocate behind the scenes, then do so, but consider how it will affect your users and whether at least allowing an alternative would be of benefit.

With C++, You Design Memory Management Just Like Anything Else

When working with C++, memory management is just another part of the system you are building. All features of C++ are available to implement your memory management scheme. Allocator objects, mentioned previously, are an example of delegating memory allocation to a specific component of the overall program. Smart pointers are another useful tool; they are really objects that act like pointers (smart pointers are covered in the next article).

The key concept is that memory allocation, although critical, is not special. You don't need to call new directly every time you need memory. It is far better to create a set of classes that will be responsible for supplying memory to the rest of your program. Thus, memory management becomes just another subsystem that you build. Not only does this allow you to easily change the way you use memory without affecting the rest of your code, but it is also much easier to debug. If there is a memory leak, or you suspect a dangling reference, there is one clearly defined, specific place for you to search for errors.
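A very small version of such a subsystem might look like the sketch below. BufferSource is a hypothetical name, and a real design would likely do much more (pooling, statistics, per-thread arenas). The sketch only shows the idea of routing every allocation through one class, so that a leak can be looked for in one place.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical subsystem: the rest of the program asks BufferSource for
// memory instead of calling new directly.
class BufferSource {
public:
  BufferSource() : live_(0) {}

  // Hands out a buffer of n bytes and records that it is outstanding.
  char* acquire(std::size_t n) {
    char* p = new char[n];
    ++live_;
    return p;
  }

  // Takes back a buffer that was returned by acquire.
  void release(char* p) {
    if (p != 0) {
      assert(live_ > 0);
      delete [] p;
      --live_;
    }
  }

  // Number of buffers handed out and not yet released. A nonzero value at
  // shutdown points to a leak.
  std::size_t live() const { return live_; }

private:
  // Not copyable: there should be one clearly defined place to look.
  BufferSource(const BufferSource&);
  BufferSource& operator=(const BufferSource&);

  std::size_t live_;
};
```

Changing the allocation strategy later means changing acquire and release, not every place in the program that needs a buffer.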

Both this article and the next present a number of techniques concerning memory management. Don't look at these techniques in isolation, as little code snippets to spread throughout your program. Rather, consider them as building blocks for your own memory management subsystem. With this attitude, you are well on your way toward turning C++ memory management from unwelcome drudgery into a powerful tool to improve your code.

Generalization

The methods and principles of C++ memory management are readily applicable to other types of resources. Common examples include open files, network connections, and locks on shared data in a multithreaded program. These resources can also leak, such as when a program creates sockets but forgets to close them. Likewise, errors similar to dangling references can happen with other resources as well (e.g., corrupting a shared data structure because the lock on it has inadvertently been released).

These generalizations extend beyond C++. Memory-managed languages (such as Java) will mostly take care of the memory, but you must typically handle other resources (such as open files) by yourself. Thus, consistency of ownership, the side effects of resource allocation, and the need to specifically address resource management in your design are all relevant, even in memory-managed languages.

With regard to C++ itself, its tools for resource management are exceptionally powerful. After all, C++ is a complex, highly expressive language which nevertheless makes you responsible for managing your own memory. It needs a powerful toolset to allow you to deal effectively with such a fundamental resource.

For example, memory-managed languages typically lack a reliable destructor. Since the language takes care of the memory, you rarely need to worry about what happens to unused objects. They are simply cleaned up by the system eventually. Unfortunately, if you want the unused object to close a file, the most common recourse is to write an explicit close method, and then remember to call it. This solution is hardly elegant and is also highly prone to errors. On the other hand, the destructor of a carefully written C++ class will take care of the open file automatically.
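The following is a minimal sketch of such a class, built on the standard C file functions. The names ScopedFile and write_line are hypothetical, and a production version would probably also report errors from closing the file.

```cpp
#include <cstdio>
#include <stdexcept>

// Hypothetical wrapper: the object owns the FILE*, and the destructor
// closes it.
class ScopedFile {
public:
  ScopedFile(const char* path, const char* mode)
      : file_(std::fopen(path, mode)) {
    // If the constructor throws, no object exists and nothing is open.
    if (file_ == 0) throw std::runtime_error("cannot open file");
  }

  ~ScopedFile() { std::fclose(file_); }

  std::FILE* get() const { return file_; }

private:
  // Not copyable: exactly one object owns the open file.
  ScopedFile(const ScopedFile&);
  ScopedFile& operator=(const ScopedFile&);

  std::FILE* file_;
};

// The file is closed on every path out of this function, including an
// exception. There is no close call to forget.
inline bool write_line(const char* path, const char* text) {
  ScopedFile out(path, "w");
  return std::fputs(text, out.get()) >= 0;
}
```

The copy constructor and assignment operator are declared private and left undefined, in the same way as in the Surprise class above, so that ownership of the open file stays consistent.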

Powerful resource handling primitives are one reason why C++ is such a great choice for user-space servers. These servers are often very complicated, having to deal with multiple network connections, open files, and a number of subsystems shared by several threads. Given an environment where correct operation requires precise, efficient control over so many different resources, C++ has a distinct advantage over the other mainstream languages.

The next article continues this discussion by presenting several key resource management techniques that you can incorporate into your own designs. While the presentation focuses on memory management, the techniques are, of course, applicable to many other resource types.

Further Reading

This information appears in the previous article. It is also included here for your convenience.

A number of very useful resources are available regarding C++. Notes on these resources are provided here (the Bibliography itself follows).

First, you need a book with broad coverage, which can serve as an introduction, a reference, and for review. Ira Pohl's C++ by Dissection [Poh02] (for which the author of this article was a reviewer) is an example of such a book. It features a particularly gentle ramp-up into working with the language.

In addition to a book with broad coverage, you will need books that focus specifically on the most difficult aspects of the language, and present techniques to deal with them. Three titles that you should find very valuable are Effective C++ [Mey98], More Effective C++ [Mey96] (both by Scott Meyers) and C++ FAQs [Cli95] (by Marshall P. Cline and Greg A. Lomow). There is also an online version of the last title.

The key to reading all three books is not to panic. They contain many difficult technical details, and are broken up into a large number of very specific topics. Unless you are merely reviewing material with which you are already familiar, reading any of these books from cover to cover is unlikely to be useful.

A good strategy is to allocate a little time (even as short as 15 minutes) each day to work with either of the Meyers books or with C++ FAQs. Begin your session by looking over the entire table of contents, which, in all three books, has a very detailed listing of all the items covered. Don't ignore this important step; it will take you progressively less time as you become familiar with each particular book.

Next, try to read the items that are most relevant to the current problem that you are trying to solve, ones where you feel that you are weak, or even those that seem most interesting to you. An item that looks completely unfamiliar is also a good candidate: it is likely an important aspect of C++ that you are not yet aware of.

Finally, when you want insights into bureaucracy, tips on what to do with your icewater during NASA meetings (answer: dip booster rocket O-ring material into it), or just a good laugh when you are frustrated with C++, try Richard P. Feynman's "What Do You Care What Other People Think?" [Fey88].

Bibliography

This bibliography also appears in the previous article of this series. Notes on the bibliography are given in Further Reading.

[Cli95] Marshall P. Cline and Greg A. Lomow. C++ FAQs: Frequently Asked Questions. Addison-Wesley Publishing Co., Inc. Copyright © 1995. 0-201-58958-3.

[Fey88] Richard Feynman and Ralph Leighton. "What Do You Care What Other People Think?": Further Adventures of a Curious Character. W.W. Norton & Company, Inc. Copyright © 1988 Gweneth Feynman and Ralph Leighton. 0-393-02659-0.

[Mey96] Scott Meyers. More Effective C++: 35 New Ways to Improve Your Programs and Designs. Addison-Wesley Longman, Inc. Copyright © 1996. 0-201-63371-X.

[Mey98] Scott Meyers. Effective C++: 50 Specific Ways to Improve Your Programs and Designs. Second edition. Addison-Wesley. Copyright © 1998. 0-201-92488-9.

[Poh02] Ira Pohl. C++ by Dissection: The Essentials of C++ Programming. Addison-Wesley. Copyright © 2002. 0-201-74396-5.

George Belotsky is a software architect who has done extensive work on high-performance internet servers, as well as hard real-time and embedded systems.


Copyright © 2009 O'Reilly Media, Inc.