ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


Seamlessly Caching Stubs for Improved Performance

by William Grosso, author of Java RMI
10/31/2001

This is the second of a three-part series on the Remote Method Invocation framework, or RMI -- a powerful framework for building distributed applications. RMI, which comes with Java 2, Standard Edition, lets two different applications communicate using method calls that look and feel remarkably like ordinary in-process method calls. Two problems with RMI, however, are that it's easy to repeat the same code in lots of different places, and to write an inefficient application that makes far too many remote calls. In this series, we discuss one way to organize the RMI-specific code of a client application to solve these two problems, by introducing command objects to encapsulate remote method calls. If you're not familiar with RMI, take a look at the author's book, Java RMI. To learn about command objects, consult Design Patterns: Elements of Reusable Object-Oriented Software.

Abstract

In the first article in this series, Command Objects and RMI, I introduced a distributed translation service and showed how the use of a simple command object framework to encapsulate retry logic simultaneously simplified the client application and made it more robust.

In this article, I discuss a common structural problem for client applications: they wind up either making an excessive number of remote method calls to a naming service or implementing some form of local cache for the naming service. After outlining the problem, I'll show you how to extend the command object framework, introduced in the first article, to provide seamless caching of stubs. Using these new extensions will make your RMI code simpler and more robust.

Once again, I'd like to state that these articles require a fair amount of RMI knowledge and experience. If you're not familiar with RMI, my book Java RMI is a pretty good place to start.

The source code for this article can be downloaded by clicking here. This code has been tested using the beta of JDK 1.4 on Windows NT systems, but it should compile and run against JDK 1.3.

Distributed Applications Often Waste Bandwidth

Let's start by recalling the client application -- it allowed the user to ask a remote service to translate a word. The GUI part of the client application was very simple as well. It provided ways for the user to enter a word, (manually) choose a translation service, and actually get a word translated. Here's what the GUI looked like:

Screen shot.

To use the program, the user fills out all the text fields and then presses the Translate Word Now button. The program then executes the following lines of code (which are contained in a listener attached to the Translate Word Now button):

String resultText = "";
Translator translator = _translatorPanel.getTranslator();
Word word = _wordPanel.getWord();
Language targetLanguage = _translatorPanel.getTargetLanguage();
TranslateWord translateMethod = new TranslateWord(translator, word, targetLanguage);
try {
  Word result = (Word) translateMethod.makeCall();
  resultText = result.toString();
}
catch (CouldNotTranslateException CNTE) {
  resultText = COULD_NOT_TRANSLATE_STRING;
}
catch (Exception e) {
  resultText = e.toString();
}
finally {
  _resultsPanel.setText(resultText);
}

This code fetches a stub from the appropriate RMI registry (that's what the call to getTranslator does), builds a command object around the stub, and then executes the remote call.

What happens if the user enters a second word and then clicks the Translate Word Now button? In that case, the code fetches a stub from the appropriate RMI registry, builds a command object around the stub, and then executes the remote call.

The same lines of code are executed every time the button is pressed. This means that each time the user clicks the button, two remote method calls are made: one to fetch the stub for the translator from the remote registry and one to make the actual translate call to the server performing the translation. This happens even if the same server is used for every translation.

This is horribly inefficient. Remote method calls are slow: most tests indicate that a simple remote method call is at least 1,000 times slower than an ordinary, in-process method call (and this will only get worse -- processor speed is increasing at a faster rate than network speed). And the situation is even worse than it appears because it's not just one application that is being affected. Every remote method call decreases the amount of bandwidth available on the network for all applications. Doubling the number of remote calls made by a client-server application is, hands down, one of the worst design decisions imaginable.

It's also a decision that's made every day.

Fortunately, there's an obvious solution. Instead of fetching the stub each time, simply fetch the stub the first time and store it in a local stub cache (e.g., in a hash table in memory). The second time the translator is used, the stub is already available locally and, hence, only one remote method invocation is necessary.

In this article, I'll show you how to build a local stub cache in a principled way. There are two main goals:

Implementing a Local Cache to Hold Stubs

In this article, we'll first build a stub cache, and then integrate the stub cache into the command object framework. The first implementation of a stub cache is fairly simple: it takes instances of ServerDescription and returns instances of RemoteStub. That is, the public interface for RemoteStubCache contains the following two static methods:

  public static RemoteStub getStubToRemoteObject(ServerDescription serverDescription) throws ServerUnavailable
  public static void removeStubFromCache(ServerDescription serverDescription)

Objects that need to get a stub call getStubToRemoteObject; objects that have bad stubs (for example, stubs to a server which has crashed) inform the cache by calling removeStubFromCache. Both of these methods take a single argument, an instance of ServerDescription.

ServerDescription is a simple object: it contains enough information to find the registry a server is bound into, along with the name of the server. RemoteStubCache consists of some static methods to access the cached stubs, based on an instance of ServerDescription. Here's the code for ServerDescription:

import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.RemoteStub;

public class ServerDescription
{
  private String _serverMachine;
  private String _serverName;
  private int _registryPort;
  private String _hashString;

  public ServerDescription(String serverMachine, String serverName, int registryPort) {
    _serverMachine = serverMachine;
    _serverName = serverName;
    _registryPort = registryPort;
    // Combines every field; the delimiters keep distinct descriptions
    // from concatenating to the same string.
    _hashString = _serverMachine + "/" + _serverName + ":" + String.valueOf(_registryPort);
  }

  // Fetches the stub from the RMI registry; returns null if the lookup fails.
  protected RemoteStub getStub() {
    try {
      Registry registry = LocateRegistry.getRegistry(_serverMachine, _registryPort);
      return (RemoteStub) registry.lookup(_serverName);
    }
    catch (Exception ignored) {
      // Any failure means "no stub"; the caller decides what to do about it.
      return null;
    }
  }

  public int hashCode() {
    return _hashString.hashCode();
  }

  public boolean equals(Object object) {
    if (! (object instanceof ServerDescription)) {
      return false;
    }
    ServerDescription otherServerDescription = (ServerDescription) object;
    return _hashString.equals(otherServerDescription.getHashstring());
  }

  private String getHashstring() {
    return _hashString;
  }
}

Note that we were very careful to implement equals and hashCode in a way that takes advantage of all of the data contained in an instance of ServerDescription. Doing so makes the implementation of RemoteStubCache much easier, because two descriptions of the same server can then be used interchangeably as hash table keys.
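To make the point concrete, here is a minimal sketch (not part of the article's source code) of why value-based equals and hashCode matter for a cache key. DescriptionKey is a hypothetical stand-in for ServerDescription with the same three fields and no RMI dependencies. It compares the fields directly rather than through a concatenated string, and it assumes Java 7 or later for Objects.hash and the diamond operator.

```java
import java.util.Hashtable;
import java.util.Objects;

// Hypothetical stand-in for ServerDescription: same three fields, no RMI.
final class DescriptionKey {
    private final String machine;
    private final String name;
    private final int port;

    DescriptionKey(String machine, String name, int port) {
        this.machine = machine;
        this.name = name;
        this.port = port;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof DescriptionKey)) {
            return false;
        }
        DescriptionKey that = (DescriptionKey) other;
        return machine.equals(that.machine) && name.equals(that.name) && port == that.port;
    }

    @Override
    public int hashCode() {
        // Must be derived from the same fields that equals compares.
        return Objects.hash(machine, name, port);
    }
}

final class KeyLookupDemo {
    // Stores a placeholder under one key instance and looks it up with another.
    static Object storeThenLookup(DescriptionKey storeKey, DescriptionKey lookupKey) {
        Hashtable<DescriptionKey, Object> table = new Hashtable<>();
        table.put(storeKey, "cached stub placeholder");
        return table.get(lookupKey);
    }
}
```

Two separately constructed keys with the same machine, name, and port find the same entry; change any one field and the lookup misses. Without the overrides, every new key instance would miss, and the cache would never return a hit to a caller that builds its own description.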

ServerDescription (rather than RemoteStubCache) is also responsible for fetching the stubs. This might seem a little idiosyncratic, but it gives us a lot of flexibility to extend the framework. In particular, putting the stub retrieval logic into ServerDescription leaves open the possibility of making ServerDescription into an abstract base class and having more than one concrete subclass. For example, you might want different implementations of ServerDescription if the stubs are stored using different types of naming services. Extending our framework to include stubs stored in an LDAP server involves creating a new subclass of ServerDescription but doesn't actually alter existing code.
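One way that extension might look is sketched below. None of these class names come from the article: AbstractServerDescription, RegistryServerDescription, and JndiServerDescription are hypothetical, the JNDI variant assumes a naming provider (for example, an LDAP one) has been configured separately, and the return type is widened to Remote so that the sketch does not depend on how the stub class was generated. The equals and hashCode methods are omitted for brevity.

```java
import java.rmi.Remote;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import javax.naming.InitialContext;

// Hypothetical refactoring: each subclass knows how to find its own stub.
abstract class AbstractServerDescription {
    // Returns the remote reference, or null if it cannot be found.
    protected abstract Remote getStub();
}

// Looks the server up in an RMI registry.
final class RegistryServerDescription extends AbstractServerDescription {
    private final String machine;
    private final String name;
    private final int port;

    RegistryServerDescription(String machine, String name, int port) {
        this.machine = machine;
        this.name = name;
        this.port = port;
    }

    @Override
    protected Remote getStub() {
        try {
            Registry registry = LocateRegistry.getRegistry(machine, port);
            return registry.lookup(name);
        } catch (Exception lookupFailed) {
            return null;
        }
    }
}

// Looks the server up through JNDI, which can front an LDAP directory.
final class JndiServerDescription extends AbstractServerDescription {
    private final String jndiName;

    JndiServerDescription(String jndiName) {
        this.jndiName = jndiName;
    }

    @Override
    protected Remote getStub() {
        try {
            Object found = new InitialContext().lookup(jndiName);
            return (found instanceof Remote) ? (Remote) found : null;
        } catch (Exception lookupFailed) {
            return null;
        }
    }
}
```

Because the cache only ever calls getStub, it would not need to change when a new subclass is added.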

Once ServerDescription is in place, implementing RemoteStubCache is fairly simple. All RemoteStubCache needs to do is maintain a hash table of stubs using instances of ServerDescription as keys. Here's the code for RemoteStubCache:

import java.rmi.server.RemoteStub;
import java.util.Hashtable;

public class RemoteStubCache
{
  private static Hashtable _serverDescriptionsToStubs = new Hashtable();

  public static RemoteStub getStubToRemoteObject(ServerDescription serverDescription) throws ServerUnavailable
  {
    RemoteStub returnValue = (RemoteStub) _serverDescriptionsToStubs.get(serverDescription);
    if (null == returnValue) {
      // Cache miss: this is the only place a registry lookup happens.
      returnValue = serverDescription.getStub();
      if (null != returnValue) {
        _serverDescriptionsToStubs.put(serverDescription, returnValue);
      }
      else {
        throw new ServerUnavailable();
      }
    }
    return returnValue;
  }

  public static void removeStubFromCache(ServerDescription serverDescription) {
    _serverDescriptionsToStubs.remove(serverDescription);
  }
}
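In RemoteStubCache, the get and the put are two separate steps, so two threads that miss at the same moment can both perform the remote lookup. That is harmless here (one stub simply overwrites the other), but if you want at most one lookup per key, a variant built on ConcurrentHashMap.computeIfAbsent is one option. The sketch below is an alternative technique, not the article's implementation: AtomicStubCache is a hypothetical name, the key and stub types are generic placeholders, the lookup is passed in as a function, and Java 8 or later is assumed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical variant of RemoteStubCache. K plays the role of
// ServerDescription and S the role of the stub type.
final class AtomicStubCache<K, S> {
    private final ConcurrentMap<K, S> stubs = new ConcurrentHashMap<>();
    private final Function<K, S> lookup;

    AtomicStubCache(Function<K, S> lookup) {
        this.lookup = lookup;
    }

    // Returns the cached stub, running the lookup only when no entry exists.
    // A null lookup result is not cached, so a later call tries again.
    S get(K key) {
        S stub = stubs.computeIfAbsent(key, lookup);
        if (stub == null) {
            throw new IllegalStateException("server unavailable: " + key);
        }
        return stub;
    }

    // Discards a stub that has gone bad.
    void remove(K key) {
        stubs.remove(key);
    }
}
```

The trade-off is that computeIfAbsent may block other updates to the map while the lookup runs, and a registry lookup is slow. For a cache with a handful of entries that is usually acceptable; for a large one, the simpler check-then-put approach may be the better choice.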

Extending the Command Object Framework to Use RemoteStubCache

We are now in a position to extend the framework for command objects to use a RemoteStubCache. This involves making a slight change to AbstractRemoteMethodCall by adding a new method, remoteExceptionOccurred, to be called when a remote method call fails for unknown reasons (e.g., when the RMI runtime throws an instance of RemoteException).

After I've changed AbstractRemoteMethodCall, I'll introduce a new abstract command object, ServerDescriptionBasedRemoteMethodCall, which extends AbstractRemoteMethodCall and uses instances of ServerDescription to retrieve stubs from RemoteStubCache.

The changes to AbstractRemoteMethodCall are slight. We need to add the remoteExceptionOccurred method and call it from within the retry logic whenever a remote exception is thrown. Here is the new implementation of makeCall. The new lines are the call to remoteExceptionOccurred inside the catch block and the empty definition of the method that follows makeCall.

  public Object makeCall() throws ServerUnavailable, Exception {
    RetryStrategy strategy = getRetryStrategy();
    while (strategy.shouldRetry()) {
      Remote remoteObject = getRemoteObject();
      if (null == remoteObject) {
        throw new ServerUnavailable();
      }
      try {
        return performRemoteCall(remoteObject);
      }
      catch (RemoteException remoteException) {
        try {
          remoteExceptionOccurred(remoteException);  // new: notify subclasses
          strategy.remoteExceptionOccurred();
        }
        catch (RetryException retryException) {
          handleRetryException(remoteObject);
        }
      }
    }
    return null;
  }

  // New: hook invoked whenever a remote call throws RemoteException.
  protected void remoteExceptionOccurred(RemoteException remoteException) {
    /* ignored in base class. */
  }

This version of AbstractRemoteMethodCall will actually work with the framework from the first article; the only difference is that when an instance of RemoteException is thrown, the empty method remoteExceptionOccurred might be called (in practice, HotSpot will eventually get rid of the method call).

But remoteExceptionOccurred turns out to be a very useful method when you're caching stubs. The reason is simple: the goal of a stub cache is to reuse the same stub to a remote server unless the stub isn't valid any longer. In general, it's very hard to tell if a stub isn't valid, but we do know the following facts:

For these reasons, the new subclass of AbstractRemoteMethodCall, which has the singularly unlovely name ServerDescriptionBasedRemoteMethodCall, will flush the stub from the cache if an instance of RemoteException is thrown (by calling the RemoteStubCache's static removeStubFromCache method).
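Before looking at the real class, it may help to see the flush-and-retry interaction in isolation. The following is a simplified, hypothetical model rather than the framework itself: stubs are plain strings, the "registry lookup" just counts how often it is called, and a fixed attempt limit stands in for RetryStrategy.

```java
import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of flushing a cached stub on failure.
final class FlushOnFailureDemo {
    interface Call {
        String invoke(String stub) throws RemoteException;
    }

    private final Map<String, String> cache = new HashMap<>();
    int lookups = 0;  // counts simulated registry lookups

    private String getStub(String serverName) {
        String stub = cache.get(serverName);
        if (stub == null) {
            lookups++;
            stub = serverName + "#" + lookups;  // pretend each lookup yields a new stub
            cache.put(serverName, stub);
        }
        return stub;
    }

    // Tries the call up to maxAttempts times. After every RemoteException the
    // cached stub is discarded, so the next attempt fetches a fresh one.
    String makeCall(String serverName, Call call, int maxAttempts) throws RemoteException {
        RemoteException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String stub = getStub(serverName);
            try {
                return call.invoke(stub);
            } catch (RemoteException e) {
                cache.remove(serverName);
                last = e;
            }
        }
        throw last != null ? last : new RemoteException("no attempts made");
    }
}
```

If the first stub fails and the second succeeds, the caller sees a single successful makeCall, and later calls reuse the second stub without any further lookups.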

Here's the code for ServerDescriptionBasedRemoteMethodCall:

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.RemoteStub;

public abstract class ServerDescriptionBasedRemoteMethodCall extends AbstractRemoteMethodCall {
  protected ServerDescription _serverDescription;

  public ServerDescriptionBasedRemoteMethodCall(ServerDescription serverDescription) {
    _serverDescription = serverDescription;
  }

  protected Remote getRemoteObject() throws ServerUnavailable {
    try {
      RemoteStub stub = RemoteStubCache.getStubToRemoteObject(_serverDescription);
      return stub;
    }
    catch (ServerUnavailable serverUnavailable) {
      System.out.println("Can't find stub for server " + _serverDescription);
      throw serverUnavailable;
    }
  }

  // Must match the name in AbstractRemoteMethodCall exactly, or it will not override the hook.
  protected void remoteExceptionOccurred(RemoteException remoteException) {
    RemoteStubCache.removeStubFromCache(_serverDescription);
  }
}

This simply implements getRemoteObject as a call on a static method of RemoteStubCache. If RemoteStubCache already has the stub, it simply retrieves the stub from the hash table and returns it (without any remote method calls being made). Otherwise, RemoteStubCache fetches a stub from the remote registry. But whenever an instance of RemoteException is thrown, RemoteStubCache will be told to discard the stub. This means that if a stub has gone bad for any reason, it will be thrown away (and a replacement stub will be pulled from the remote registry) the next time any command object attempts to use it.

This will all happen without the programmer of the client code needing to do anything (or even knowing that a local stub cache is being used). It all just fits seamlessly into our framework.

Integrating the Local Stub Cache Into Our Application

Now that we've built both RemoteStubCache and ServerDescriptionBasedRemoteMethodCall, we need to integrate them into the rest of our application. And here, we're in for a pleasant surprise -- RemoteStubCache and ServerDescriptionBasedRemoteMethodCall aren't just a nice addition to our framework that cuts the number of remote method calls in half; they also actually simplify the rest of our code. Here, for example, are the old and new versions of TranslateWord.

The version from the first article:

public class TranslateWord extends AbstractRemoteMethodCall {

  private Translator _translator;
  private Word _sourceWord;
  private Language _targetLanguage;

  public TranslateWord(Translator translator, Word sourceWord, Language targetLanguage) {
    _translator = translator;
    _sourceWord = sourceWord;
    _targetLanguage = targetLanguage;
  }

  protected Remote getRemoteObject() throws ServerUnavailable {
    return (Remote) _translator;
  }

  protected Object performRemoteCall(Remote remoteObject) throws RemoteException, Exception {
    Translator translator = (Translator) remoteObject;
    return translator.translate(_sourceWord, _targetLanguage);
  }
}

The new version:

public class TranslateWord extends ServerDescriptionBasedRemoteMethodCall  {
  private Word _sourceWord;
  private Language _targetLanguage;
  public TranslateWord(ServerDescription serverDescription, Word sourceWord, Language targetLanguage) {
    super(serverDescription);
    _sourceWord = sourceWord;
    _targetLanguage = targetLanguage;
  }

  protected Object performRemoteCall(Remote remoteObject) throws RemoteException, Exception {
    Translator translator = (Translator) remoteObject;
    return translator.translate(_sourceWord, _targetLanguage);
  }
}

The second object is actually a little shorter and cleaner. It's not a huge difference -- these two classes are recognizably the same object -- but it's still a pleasant surprise.

We extended the framework by adding three new classes and now, without any additional effort, all the classes that hook into the framework will simultaneously get some extra (and fairly sophisticated) functionality and become simpler. Now that's a good framework!

Complications That Arise from Using a Local Stub Cache

There are two basic types of servers in the world: servers that are intended to run for very long periods of time without any downtime, and servers that are intended to have a more transient lifetime. I call the second class of servers temporary servers.

Caching stubs to temporary servers is often a bad idea. The reason is that RMI implements a very nice distributed garbage collector. The way it works is simple: any stub which can be accessed by live code maintains a lease on the server. The lease is simply a promise by the RMI runtime that for a certain amount of time (the duration of the lease), the server will not be subject to garbage collection -- unless the server application takes extraordinary measures. When the lease is about to expire, the stub renews it.

The net effect is that if there are active stubs to a server, the server will not be picked up in garbage collection. The problem with our stub caching system is this: if an application ever obtains a stub to the server, it will cache the stub inside RemoteStubCache forever, which means that using RemoteStubCache can prevent temporary servers from being collected as garbage.

If you're not familiar with the distributed garbage collector, and would like to learn more, it's fully covered in my book Java RMI.

You might think this isn't actually a problem for our implementation of ServerDescription. Our implementation of ServerDescription assumes that the server is bound into a running instance of the RMI registry. Stubs in the RMI registry maintain leases on the servers they reference. That means that the local stub cache is adding extra leases (one for each cached stub), but is not actually pinning the server in memory.
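On the server side, the standard way to observe this behavior is the java.rmi.server.Unreferenced interface: the RMI runtime calls unreferenced() some time after the last remote reference (including the registry's) has gone away. The sketch below is illustrative only. TemporaryServer is a hypothetical class with no remote methods of its own, and leaseMillis simply reads the java.rmi.dgc.leaseValue system property, falling back to the documented default of ten minutes.

```java
import java.rmi.Remote;
import java.rmi.server.Unreferenced;

// Hypothetical temporary server that wants to know when no client
// and no registry holds a live stub to it any longer.
final class TemporaryServer implements Remote, Unreferenced {
    private volatile boolean noMoreReferences = false;

    // Invoked by the RMI runtime once the list of remote references is empty.
    @Override
    public void unreferenced() {
        noMoreReferences = true;
        // A real server might release resources or unexport itself here.
    }

    boolean isUnreferenced() {
        return noMoreReferences;
    }

    // Lease duration this JVM grants to clients, in milliseconds.
    static long leaseMillis() {
        return Long.getLong("java.rmi.dgc.leaseValue", 600_000L);
    }
}
```

A stub sitting in a client's cache keeps renewing its lease, so for a server like this one, unreferenced() would not fire while any client still caches the stub.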

The problem comes when the server is removed from an RMI registry. For example, you might imagine that, at the end of the day, we use the following command object to shut down our server:

import java.rmi.*;
import java.rmi.server.*;

public class ShutdownServer extends ServerDescriptionBasedRemoteMethodCall {
  private String _reason;
  public ShutdownServer(ServerDescription serverDescription, String reason) {
    super(serverDescription);
    _reason = reason;
  }

  protected Object performRemoteCall(Remote remoteObject) throws RemoteException, Exception {
    ServerWhichCanBeShutDown server = (ServerWhichCanBeShutDown) remoteObject;
    server.shutdown(_reason);
    return null;
  }
}

You might think that, as long as the server removed itself from the RMI registry (which the client cannot do; for security reasons, code running on one computer cannot alter the contents of an instance of the RMI registry running on another computer), this would be sufficient.

But you would be wrong. The problem is that local stub caches have stubs that point to the server. Even though the server has been told to shut down, and even though its stub has been removed from the RMI registry, the local stub caches have already-fetched stubs. This is true even for the client that called shutdown -- after performRemoteCall is finished executing, the local stub cache still has a stub to the server.
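For reference, here is roughly what the server side of such a shutdown could involve, as a sketch under stated assumptions. ServerShutdownHelper is a hypothetical helper and not part of the article's code; Registry.unbind and UnicastRemoteObject.unexportObject are the standard RMI calls. Once the object has been unexported, a call through an already-cached stub fails with a RemoteException (typically java.rmi.NoSuchObjectException), which is exactly the signal the command object framework uses to flush the stub.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical server-side helper for the shutdown path.
final class ServerShutdownHelper {
    // Removes the name from the registry so no new client can fetch a stub.
    static void unbind(Registry registry, String name) throws RemoteException {
        try {
            registry.unbind(name);
        } catch (NotBoundException alreadyGone) {
            // Nothing to remove; treat as success.
        }
    }

    // Stops the object from accepting remote calls. Passing true forces the
    // unexport even if calls are still in progress. Returns false if the
    // object was not exported in the first place.
    static boolean unexport(Remote server) {
        try {
            return UnicastRemoteObject.unexportObject(server, true);
        } catch (NoSuchObjectException notExported) {
            return false;
        }
    }
}
```

Neither step reaches into the clients: stubs that were fetched earlier stay in each client's local cache until a call through them fails or the cache removes them.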

A Slightly More Complicated Local Stub Cache

Of course, this problem also has an obvious solution: client programs need to occasionally remove unused stubs from RemoteStubCache. The only real difficulty is figuring out what strategy should be used to expire stubs which aren't being used. One simple, yet often effective, strategy is to implement the following two precepts:

Implementing this is actually a little harder than it looks. The difficult part is contained in the first bullet point: how do you tell if a stub is inactive? The answer (as you probably guessed) is that command objects make this easy -- we simply need to change the code in ServerDescriptionBasedRemoteMethodCall so that it tells the instance of RemoteStubCache when the stub is no longer being used.

In order to do this, we're simply going to add a method, noLongerUsingStubToRemoteObject, to RemoteStubCache and then override makeCall in ServerDescriptionBasedRemoteMethodCall as follows:

  public Object makeCall() throws ServerUnavailable, Exception {
    Object returnValue = null;
    try {
      returnValue = super.makeCall();
    }
    finally {
      RemoteStubCache.noLongerUsingStubToRemoteObject(_serverDescription);
    }
    return returnValue;
  }

All this does is invoke the superclass method (containing the retry logic) and then, no matter how the superclass method turns out (whether it throws an exception or not), tell RemoteStubCache that the stub is no longer being used.

After we've done this, we simply need to change RemoteStubCache to track how long a stub has been inactive, and then expire the old stubs. We're not going to do that in this series (it's out of the scope of these articles), but it's not a difficult thing to do.
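For readers who want a starting point, the following sketch shows one possible shape for that bookkeeping. It is an assumption-laden illustration, not the article's design: ExpiringStubCache is a hypothetical class, the policy (drop a stub once nobody is using it and it has been idle for a given time) is only one reasonable reading of "expire the old stubs", and timestamps are passed in by the caller so the logic can be exercised without waiting. Java 7 or later is assumed.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical expiring cache. K stands in for ServerDescription and
// S for the stub type. All times are in milliseconds.
final class ExpiringStubCache<K, S> {
    private static final class Entry<T> {
        final T stub;
        int activeUsers;  // command objects currently using the stub
        long idleSince;   // when the stub was cached or last released

        Entry(T stub, long now) {
            this.stub = stub;
            this.idleSince = now;
        }
    }

    private final Map<K, Entry<S>> entries = new HashMap<>();

    synchronized void put(K key, S stub, long now) {
        entries.put(key, new Entry<>(stub, now));
    }

    // Returns the cached stub (or null) and marks it as in use.
    synchronized S acquire(K key) {
        Entry<S> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        entry.activeUsers++;
        return entry.stub;
    }

    // Counterpart of noLongerUsingStubToRemoteObject: records when use ended.
    synchronized void release(K key, long now) {
        Entry<S> entry = entries.get(key);
        if (entry != null && entry.activeUsers > 0) {
            entry.activeUsers--;
            entry.idleSince = now;
        }
    }

    // Drops every stub that nobody is using and that has been idle for at
    // least maxIdleMillis. Returns the number of stubs removed.
    synchronized int expire(long now, long maxIdleMillis) {
        int removed = 0;
        for (Iterator<Entry<S>> it = entries.values().iterator(); it.hasNext();) {
            Entry<S> entry = it.next();
            if (entry.activeUsers == 0 && now - entry.idleSince >= maxIdleMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

In a real client, expire would be called periodically (from a timer, say) with the current time. Once the cache lets go of a stub and nothing else references it, the stub can be collected locally and stops renewing its lease.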

The beauty of this new functionality is that the stub expiration logic is completely hidden from your code -- it was entirely implemented inside of the command object framework.

Summary and Roadmap

In this article, I extended the command object framework to seamlessly implement a local stub cache behind the scenes. In doing so, I also removed the lookup code from various places in the client, instead putting most of it into a new abstract command object, ServerDescriptionBasedRemoteMethodCall. If we revisit the previous article's cost-benefit analysis now, we'd see the following:

Pro

Con

On the other hand, we haven't really addressed some of the problems mentioned in the first article in this series. In particular, we still have the following two problems:

From looking at these lists, it should be clear that command objects make life better. They simplify client code, make the client more robust, make it possible to seamlessly implement things like local stub caches, and allow you to say things like "I optimized performance of our distributed application by implementing a local stub cache using the command pattern" around the water cooler.

In the next, and final, article in this series, I'll address the problem of type safety. After discussing the newly-defined generics extension to the Java programming language, I'll show how it partially addresses this problem by allowing us to return strongly typed values (instead of instances of Object) and slightly narrow our use of Exception.

William Grosso is a coauthor of Java Enterprise Best Practices.


Copyright © 2009 O'Reilly Media, Inc.