ONJava.com -- The Independent Source for Enterprise Java

Expiring Data with Hashbelts

Using a Hashbelt with RemoteStubCache

Our second example of using hashbelts involves rebuilding the RemoteStubCache class from the second article from my earlier series on command objects.

RemoteStubCache solves a fairly common problem in client-server applications. The problem is that a naive use of naming services doubles the number of remote calls being made. This happens because programmers want to use a naming service to store the stubs to servers; doing so leads to a nice clean architecture and allows, for example, servers to be dynamically redeployed without changing client code. The obvious way to use a naming service is to write code using the following schema:



  RemoteServer server = (RemoteServer) Naming.lookup(logical_name);
  server.methodCall(arguments);

The problem is that each of these lines involves a remote call. The first line fetches a stub from the naming service and the second line actually calls the server.

Fortunately, there's an obvious solution: transform the first line into a local call by using a RemoteStubCache. Instead of fetching the stub each time, fetch it the first time and store it locally. The second time the server is needed, the stub is already available locally and, hence, only one remote method invocation is necessary in most cases.

Here's the implementation of RemoteStubCache that I used for that series of articles.

  import java.rmi.*;
  import java.rmi.server.*;
  import java.util.*;

  /*
    Holds stubs to remote RMI servers.
  */

  public class RemoteStubCache {
    private static Hashtable<ServerDescription, RemoteStub> _serverDescriptionsToStubs = new Hashtable<ServerDescription, RemoteStub>();

    public static RemoteStub getStubToRemoteObject(ServerDescription serverDescription) throws ServerUnavailable {
      RemoteStub returnValue = _serverDescriptionsToStubs.get(serverDescription);
      if (null == returnValue) {
        returnValue = serverDescription.getStub() ;
        if (null!= returnValue) {
          _serverDescriptionsToStubs.put(serverDescription, returnValue);
        }
        else {
          throw new ServerUnavailable();
        }
      }
      return returnValue;
    }
  
    public static void removeStubFromCache(ServerDescription serverDescription) {
      _serverDescriptionsToStubs.remove(serverDescription);
    }
  }

This is, by and large, a fine stub cache for simple applications. However, it can cause serious problems if you use advanced RMI features such as Activation, or actively take advantage of the distributed garbage collector, because stubs keep references to servers. An example scenario will illustrate the problem. Suppose the people building the servers for a system make the following three design decisions (all of which are quite reasonable):

  • They decide to take advantage of RMI's Activation framework to conserve server-side resources by delaying starting (or restarting) a server until a client actually sends the server a message.
  • They decide to store data out to a relational database when no clients have connections to the server (using the Unreferenced interface).
  • They decide to shut down the server if no clients have references to it, and no clients have had references to it for the past thirty minutes.
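
The second decision depends on the java.rmi.server.Unreferenced callback, which the RMI runtime invokes on an exported remote object once no client holds a live reference to it. A minimal sketch of such a server-side object follows. The class name SessionServerState and its persist method are hypothetical, and the database write is replaced by a flag so that the example stays self-contained.

```java
import java.rmi.server.Unreferenced;

// Hypothetical server-side state holder (not from the article's code).
// In a real server this class would also implement a Remote interface and
// be exported; the RMI runtime calls unreferenced() only on exported objects.
class SessionServerState implements Unreferenced {
    private boolean persisted = false;

    // Invoked by the RMI runtime when the set of client references becomes empty.
    @Override
    public synchronized void unreferenced() {
        persist();
    }

    // Stand-in for writing the server's state to a relational database.
    private void persist() {
        persisted = true;
    }

    synchronized boolean isPersisted() {
        return persisted;
    }
}
```

As long as some client keeps a stub alive, the runtime has no reason to make this call, so the persistence step never runs.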

What happens to these architectural decisions when the client decides to cache stubs? They all break. Badly. The servers are launched correctly. But, since the client is keeping stubs around indefinitely, the client maintains connections to the servers, and the distributed garbage collector never thinks the server is unreferenced. The server never stores its data out to a relational database, nor does it ever shut down.

This is bad.

Fortunately, the solution is simple: the client shouldn't simply store stubs in a global cache forever; it should expire stubs that it hasn't used recently. Here's the code for an implementation of RemoteStubCache that uses a hashbelt to get rid of old stubs:

  package grosso.rmi;
  import grosso.expiration.*;
  import grosso.expiration.hashbelt.*;
  import grosso.expiration.hashbelt.handlers.*;
  import grosso.expiration.hashbelt.containers.*;

  import java.rmi.*;
  import java.rmi.server.*;
  import java.util.*;

  /*
    Holds stubs to remote RMI servers.
  */

  public class RemoteStubCache {
    /*
      In previous versions of this, we used a Hashtable to store our stubs.
      Now we automatically expire stubs using a cache.
      We use an UpdatingHashbeltExpirationSystem (so stubs are reinserted into the
      front of the conveyor belt every time we use them).
      We use a NullExpirationHandler, because the only expiration logic we need is
      done by the garbage collector.

      And we expire our stubs if they've been inactive for more than 30 minutes
    */

    private static final long FIVE_MINUTES = 5 * 60 * 1000;
    private static final long ROTATION_TIME = FIVE_MINUTES;
    private static final int NUMBER_OF_CONTAINERS = 6;
    private static final RemoteStubCacheExpirationSystem _expirationSystem = new RemoteStubCacheExpirationSystem();

    public static RemoteStub getStubToRemoteObject(ServerDescription serverDescription) throws ServerUnavailable {
      RemoteStub returnValue = _expirationSystem.get(serverDescription);
      if (null == returnValue) {
        returnValue = serverDescription.getStub() ;
        if (null!= returnValue) {
          _expirationSystem.put(serverDescription, returnValue);
        }
        else {
          throw new ServerUnavailable();
        }
      }
      return returnValue;
    }

    public static void removeStubFromCache(ServerDescription serverDescription) {
      _expirationSystem.remove(serverDescription);
    }

    private static class RemoteStubCacheExpirationSystem extends UpdatingHashbeltExpirationSystem<ServerDescription, RemoteStub> {
      public RemoteStubCacheExpirationSystem() {
        super(NUMBER_OF_CONTAINERS, ROTATION_TIME);
      }

      protected HashbeltContainer<ServerDescription, RemoteStub> getNewContainer() {
        return new StandardHashbeltContainer<ServerDescription, RemoteStub>();
      }
    }
  }

Notice that this is entirely a change to the implementation of RemoteStubCache; we simply used a nested class that extends UpdatingHashbeltExpirationSystem, instead of a Hashtable, to implement the cache. None of the code that used RemoteStubCache needed to change at all.

Note: This change to RemoteStubCache is a change that's often worth making in your code. Changing a global hashtable to an expiration system is easy to do and allows resources (whether distributed or local) to be released.
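
The library classes used in the listing are not reproduced in this article. To make the mechanics concrete, here is a minimal sketch of an updating hashbelt, written from the description in the code comments: entries move back to the front container each time they are used, and the oldest container is dropped on each rotation. MiniUpdatingHashbelt and its methods are hypothetical names, not the API of the grosso.expiration library, and rotation is triggered by hand here rather than by a background thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the article's UpdatingHashbeltExpirationSystem.
class MiniUpdatingHashbelt<K, V> {
    // Front of the deque is the newest container, back is the oldest.
    private final Deque<Map<K, V>> belt = new ArrayDeque<>();

    MiniUpdatingHashbelt(int numberOfContainers) {
        if (numberOfContainers < 1) {
            throw new IllegalArgumentException("need at least one container");
        }
        for (int i = 0; i < numberOfContainers; i++) {
            belt.addLast(new HashMap<>());
        }
    }

    // New entries always go into the front (newest) container.
    synchronized void put(K key, V value) {
        remove(key); // avoid leaving a stale copy in an older container
        belt.peekFirst().put(key, value);
    }

    // A hit moves the entry back to the front container, resetting its age.
    synchronized V get(K key) {
        for (Map<K, V> container : belt) {
            if (container.containsKey(key)) {
                V value = container.remove(key);
                belt.peekFirst().put(key, value);
                return value;
            }
        }
        return null;
    }

    synchronized V remove(K key) {
        for (Map<K, V> container : belt) {
            if (container.containsKey(key)) {
                return container.remove(key);
            }
        }
        return null;
    }

    // Drops the oldest container and adds an empty one at the front.
    // The returned map holds the expired entries, which a handler could process.
    synchronized Map<K, V> rotate() {
        Map<K, V> expired = belt.removeLast();
        belt.addFirst(new HashMap<>());
        return expired;
    }
}
```

In this sketch, with six containers and a five-minute rotation, an entry that is never touched would be dropped between 25 and 30 minutes after it was inserted, depending on where in the rotation interval the insert happened.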

Summary And Conclusion

We've now spent three articles, and about 50 pages in total (if you printed these articles out), discussing data expiration. In the first article, we discussed various scenarios where data expiration is necessary, and then covered some common solutions to the problem. The second article introduced the hashbelt algorithm and code library, and this article has focused on examples drawn from real-world applications.

In both the first and second articles, we stated a list of requirements for any data expiration system. Let's wrap things up by revisiting those requirements and making sure that hashbelts meet them.

  • The expiration mechanism should use a small, and bounded, number of background threads. Hashbelts use a single background thread, easily meeting this requirement.
  • The expiration mechanism must be a global cache. This is easily done, as in the RemoteStubCache example above.
  • The expiration mechanism cannot make other code harder to write. Hashbelts look like instances of java.util.Map to client code. Nothing could be easier to use.
  • The caching should be consistent. Because hashbelts rely on a very quick rotation, and because the expired objects are inaccessible after the rotation occurs, the hashbelts meet this requirement as well.
  • Everything must be fast. For the most part, retrieval from a hashbelt-based system is a single hash-based lookup. And, by choosing the appropriate container and expiration handler, programmers can optimize the expiration step as well.
  • The indexing scheme should make sense. This is perhaps the biggest flaw in hashbelts, as presented. The key that is used is a logical key. But there is only one key: if you wanted to access weather reports either by location or by buoy number (each weather buoy has a unique identifier), then you'd need to do some fairly substantial subclassing (you'd need a new implementation of HashbeltContainer and a new subclass of AbstractHashbeltExpirationSystem to support the new indexing methods).
  • The expiration mechanism should be general. It should be clear that the hashbelt framework presented here is extremely general.
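
The first requirement, a small and bounded number of background threads, can be illustrated with a sketch in which one scheduled daemon thread drives rotation. SingleThreadRotator and register are hypothetical names, and the rotation itself is passed in as a Runnable so that the example does not depend on the hashbelt library. This shows one plausible arrangement, not how the library actually schedules its thread.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: one daemon thread runs every registered rotation.
class SingleThreadRotator implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "hashbelt-rotator");
            thread.setDaemon(true); // do not keep the JVM alive just to rotate
            return thread;
        });

    // Runs the given rotation repeatedly, starting one period from now.
    void register(Runnable rotation, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                rotation.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs of this
                // task, so swallow it here; a real system would log it.
            }
        }, period, period, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Because every registered rotation shares the same thread, adding more caches does not add more threads; the trade-off is that one slow rotation delays the others.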

It's official: hashbelts are a new data structure that expires data in a way that meets all our requirements! At this point you should be a true believer in hashbelts. If you're not, or if you have questions, email me and we can continue the conversation. Otherwise, feel free to use and extend the source code for this article (and send me email if you come up with any interesting ideas for extending the library; I'm planning on maintaining this code and keeping it up to date for the foreseeable future).

William Grosso is a coauthor of Java Enterprise Best Practices.

