ONJava.com -- The Independent Source for Enterprise Java
oreilly.comSafari Books Online.Conferences.

advertisement

AddThis Social Bookmark Button

Memoization in Java Using Dynamic Proxy Classes

by Tom White
08/20/2003

Memoization is a technique borrowed from functional programming languages like Lisp, Python, and Perl for giving functions a memory of previously computed values. Memoizing a function adds a transparent caching wrapper to the function, so that function values that have already been calculated are returned from a cache rather than being recomputed each time. Memoization can provide significant performance gains for computing-intensive calls. It is also a reusable solution to adding caching to arbitrary routines.

This article looks at memoization in Java and presents a general Memoizer class that can be used like this:
Foo foo = (Foo) Memoizer.memoize(new FooImpl());

Here, Foo is the interface whose method calls are to be memoized, and FooImpl is the concrete implementation of Foo. The foo variable is an instance of the Foo interface; it behaves just like FooImpl, except that return values are cached. The beauty of the single Memoizer class is that adding memoization to any class is trivial: define an interface that comprises the methods to be memoized, then just call memoize with an implementation of the interface.

To understand how Memoizer is implemented, it is useful to build up to it in steps. First, I explore how caching can be built into the class that needs it. Then I examine how to add a caching wrapper to a specific class. Finally, I explain how a caching wrapper can be generalized to work for any class.

Adding Caching to a Computationally Intensive Routine

As an example of a computationally intensive routine, consider a class called PiBinaryDigitsCalculator that calculates the binary digits of pi. The only public method, calculateBinaryDigit, takes an integer that describes the binary digit required -- for example, -1000000 will return the one millionth digit after the decimal point -- and returns a byte value that is either 0 or 1. (For details of the novel algorithm used, consult the paper by the trio that discovered it.)

public class PiBinaryDigitsCalculator {
  /**
   * Returns the coefficient of 2^n in the binary
   * expansion of pi.
   * @param n the binary digit of pi to calculate.
   * @throws ValidityCheckFailedException if the validity
   * check fails, this means the implementation is buggy
   * or n is too large for sufficient precision to be
   * retained.
   */
  public byte calculateBinaryDigit(final int n) {
      return runBBPAlgorithm(n);
  }

  private byte runBBPAlgorithm(final int n) {
      // Lengthy routine goes here ...
  }

}

The simplest and most direct way to cache return values is to modify the class to have a Map instance variable to store previously computed values:

import java.util.HashMap;

public class PiBinaryDigitsCalculator {

  private HashMap cache = new HashMap();

  public synchronized byte calculateBinaryDigit(
  final int n) {

      final Integer N = new Integer(n);
      Byte B = (Byte) cache.get(N);
      if (B == null) {
          byte b = runBBPAlgorithm(n);
          cache.put(N, new Byte(b));
          return b;
      } else {
          return B.byteValue();
      }
  }

  private byte runBBPAlgorithm(final int n) {
      // Lengthy routine goes here ...
  }

}

Related Reading

Java Enterprise Best Practices
By The O'Reilly Java Authors

The calculateBinaryDigit method first checks a HashMap cache for a key, based on the input value (n), and simply returns the value if found. Otherwise, the lengthy calculation is performed and the result stored in the cache. There is also a little extra work in translating between primitive values and object wrappers to allow primitives to be placed in the map.

While this approach works, it has several drawbacks. First, there is no separation of concerns between caching code and algorithm code. One class is responsible for performing the calculation and maintaining the cached values. As a result, it is hard to test the code cleanly. For example, you cannot write a test that checks that the algorithm consistently returns the same value each time it is called, since from the second call onwards, the value is retrieved from the cache.

Second, it is hard to remove the caching code if it is not needed, since it is tightly intertwined with the code that performs the underlying calculation. Indeed, it is difficult to know if the caching is yielding significant performance improvements, as you cannot test the class with caching turned off. If you improve the algorithm, the caching might not be helping performance any longer -- but you would not know.

Third, the caching code cannot be reused. Although the code conforms to a common pattern, it is dedicated to the PiBinaryDigitsCalculator class.

The first two objections can be dealt with by introducing a caching wrapper.

A Caching Wrapper

By using the Decorator pattern, it is simple to separate the code that performs the calculation from the code that provides caching. First, define an interface that abstracts out the essential method call:

public interface BinaryDigitsCalculator {

  public byte calculateBinaryDigit(final int n);
}

Then create two implementations, one for each separate responsibility:

public class PiBinaryDigitsCalculator
  implements BinaryDigitsCalculator {

  public byte calculateBinaryDigit(final int n) {
      return runBBPAlgorithm(n);
  }

  private byte runBBPAlgorithm(final int n) {
      // Lengthy routine goes here ...
  }

}

import java.util.HashMap;

public class CachingBinaryDigitsCalculator implements
BinaryDigitsCalculator {

  private BinaryDigitsCalculator binaryDigitsCalculator;
  private HashMap cache = new HashMap();

  public CachingBinaryDigitsCalculator(
  BinaryDigitsCalculator calculator) {
      this.binaryDigitsCalculator = calculator;
  }

  public synchronized byte calculateBinaryDigit(int n) {
      final Integer N = new Integer(n);
      Byte B = (Byte) cache.get(N);
      if (B == null) {
          byte b =
            binaryDigitsCalculator.calculateBinaryDigit(n);
          cache.put(N, new Byte(b));
          return b;
      } else {
          return B.byteValue();
      }
  }
}

This listing is simply a refactored version of the previous implementation of PiBinaryDigitsCalculator. CachingBinaryDigitsCalculator wraps any instance of BinaryDigitsCalculator and adds caching for calls to the calculateBinaryDigit method. For a few more lines of code, the classes have improved in readability and maintainability. Since clients can now use the algorithm through the BinaryDigitsCalculator interface, it is easy to switch a caching implementation in or out as required.

In addition, proper tests for the caching code can be written. Imagine a mock implementation of BinaryDigitsCalculator that returns a different value each time the calculateBinaryDigit method is called with the same argument. By using this mock object, we can test that the caching wrapper is working, since it will return the same value each time. Such a test was impossible to write with the previous implementation that mixed caching and calculation.

Pages: 1, 2

Next Pagearrow