ONJava.com -- The Independent Source for Enterprise Java

Managing Component Dependencies Using ClassLoaders

by Don Schwarz
04/13/2005

Java's class loading mechanism is incredibly powerful. It allows you to leverage external third-party components without the need for header files or static linking. You simply drop the JAR files for the components into a directory and arrange for them to be added to your classpath. Run-time references are all resolved dynamically. But what happens when these third-party components have their own dependencies? Generally, it is left up to each developer to determine the full set of required components, acquire the correct version of each, and ensure that they are all added to the classpath properly.

JAR Manifest Files

But it doesn't have to be like this; Java's class loading mechanism allows for more elegant solutions to this problem. One such solution is for each component's authors to specify the dependencies of their component in its JAR manifest. A manifest is a text file (META-INF/MANIFEST.MF) that can be included in a JAR to specify metadata about the file. The most popular attribute, Main-Class, names the class whose main method java -jar should invoke. However, there is a related, but much less well-known, attribute called Class-Path that lets a JAR declare its dependencies on other JARs. Java's default ClassLoader knows to check for these attributes and to automatically append the specified dependencies to its internal classpath.
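To make the two attributes concrete, here is a minimal sketch that reads Main-Class and Class-Path from manifest text using the standard java.util.jar.Manifest class. The class name ManifestInfo, its helper methods, and the sample values (com.example.Main, a.jar, b.jar) are assumptions made for illustration; they are not part of the article's example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper; the name ManifestInfo is an assumption for this sketch.
class ManifestInfo {

    // Parses manifest text, such as the contents of META-INF/MANIFEST.MF.
    static Manifest parse(String text) throws IOException {
        return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the Main-Class attribute, or null if the manifest has none.
    static String mainClass(Manifest mf) {
        return mf.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    // Returns the whitespace-separated Class-Path entries (empty list if absent).
    static List<String> classPath(Manifest mf) {
        List<String> entries = new ArrayList<>();
        String value = mf.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (value != null) {
            for (String entry : value.trim().split("\\s+")) {
                if (!entry.isEmpty()) {
                    entries.add(entry);
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Sample manifest text; the values are illustrative only.
        Manifest mf = parse("Manifest-Version: 1.0\n"
                + "Main-Class: com.example.Main\n"
                + "Class-Path: a.jar b.jar\n");
        System.out.println(mainClass(mf));
        System.out.println(classPath(mf));
    }
}
```

To inspect a real JAR, java.util.jar.JarFile.getManifest() returns the same Manifest type, so the two accessor methods would apply unchanged.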

Let's look at an example. Consider a Java application that implements a traffic simulation. This application is composed of three individual JARs:

  • simulator-ui.jar: A Swing-based view to display the progress of the simulation.
  • simulator.jar: Data objects used to represent the state of the simulation and a controller class that implements the simulation.
  • rule-engine.jar: A generic third-party rule engine that is used to model the rules of the simulation.

simulator-ui.jar depends upon simulator.jar, which in turn depends upon rule-engine.jar.

The naive way to execute this application is:

$ java -classpath
   simulator-ui.jar:simulator.jar:rule-engine.jar
   com.oreilly.simulator.ui.Main

Editor's note: the above command should be entered on one line; it has been wrapped to fit the constraints of our web layout.

But we could also specify this information in JAR manifest files. simulator-ui's MANIFEST.MF file looks like this:

Main-Class: com.oreilly.simulator.ui.Main
Class-Path: simulator.jar

While simulator's MANIFEST.MF simply contains:

Class-Path: rule-engine.jar

rule-engine either does not have a manifest, or its manifest is empty.

Now we can just do:

$ java -jar simulator-ui.jar

Java will automatically parse the manifest entries to extract the main class and modify the classpath accordingly. It will even determine the path of simulator-ui.jar and interpret all Class-Path attributes relative to this path, so we could just as easily have done one of the following:

$ java -jar ../simulator-ui.jar
$ java -jar /home/don/build/simulator-ui.jar
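The relative interpretation of Class-Path entries can be sketched with plain URI resolution. This is only an illustration of the idea, not the JVM's actual implementation; the class ClassPathResolver and its resolve method are hypothetical names.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; illustrates the idea, not the JVM's internal code.
class ClassPathResolver {

    // Resolves each whitespace-separated Class-Path entry against the
    // location of the JAR whose manifest declared it.
    static List<URI> resolve(URI jarLocation, String classPathValue) {
        List<URI> resolved = new ArrayList<>();
        for (String entry : classPathValue.trim().split("\\s+")) {
            if (!entry.isEmpty()) {
                resolved.add(jarLocation.resolve(entry));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        URI jar = URI.create("file:/home/don/build/simulator-ui.jar");
        // Prints [file:/home/don/build/simulator.jar]
        System.out.println(resolve(jar, "simulator.jar"));
    }
}
```

Because each entry is resolved against the declaring JAR rather than the current working directory, the result is the same no matter where the java command is run from.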

Dependency Conflicts

Java's implementation of the Class-Path attribute presents a big improvement over specifying the entire classpath manually. However, both approaches have some important limitations. One of the biggest limitations, which may not have even crossed your mind, is that you can only load one version of each component. This may seem obvious because most programming environments have this limitation. However, it is not uncommon for large multi-JAR projects with many third-party dependencies to encounter conflicts in those dependencies.

For example, let's say that you're developing a meta-search engine that queries multiple search engines and collates the results. Google and Amazon's Alexa both support web services APIs that use SOAP as a communication mechanism, and both provide Java libraries that can be used to conveniently access these APIs. This is a bit contrived, but for the sake of argument, let's assume that your JAR, metasearch.jar, depends upon google.jar and amazon.jar, each of which depends upon a common soap.jar.

This is fine for now, but what happens in the future when the SOAP protocol or API changes in some way? It's quite likely that these two search engines will not choose to upgrade at exactly the same time. There may come a day when accessing Amazon requires SOAP v1.x and accessing Google requires SOAP v2.x, and the two versions of SOAP were not designed to co-exist in the same process. In this case, we might have the following JAR dependencies specified:

$ cat metasearch/META-INF/MANIFEST.MF
Main-Class: com.onjava.metasearch.Main
Class-Path: google.jar amazon.jar

$ cat amazon/META-INF/MANIFEST.MF
Class-Path: soap-v1.jar

$ cat google/META-INF/MANIFEST.MF
Class-Path: soap-v2.jar

This captures the dependencies correctly, but there's no magic here--this won't do what we want. If soap-v1.jar and soap-v2.jar define many of the same classes, we're almost certainly going to have problems.

$ java -jar metasearch.jar
SOAP v1: remotely invoking searchAmazon
SOAP v1: remotely invoking searchGoogle

As you can see, soap-v1.jar was added to the classpath first, so it is used in both cases. Just as in the previous example, this is equivalent to:

$ java -classpath
   metasearch.jar:amazon.jar:google.jar:soap-v1.jar:soap-v2.jar
   com.onjava.metasearch.Main
   # WRONG!

Editor's note: the above command should be entered on one line; it has been wrapped to fit the constraints of our web layout.
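The "first entry on the path wins" behavior can be observed without building any JARs. The sketch below is an assumption-laden stand-in for the SOAP scenario: it creates two temporary directories that each contain a resource with the same name, then searches them with a java.net.URLClassLoader. The class FirstMatchWins, its methods, and the resource name soap-version.txt are all hypothetical.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; directories stand in for soap-v1.jar and soap-v2.jar.
class FirstMatchWins {

    // Creates a temp directory holding one resource file with the given content.
    // (The temp files are not cleaned up in this sketch.)
    static URL dirWithResource(String name, String content) throws Exception {
        Path dir = Files.createTempDirectory("cp-entry");
        Files.write(dir.resolve(name), content.getBytes(StandardCharsets.UTF_8));
        return dir.toUri().toURL();
    }

    // Returns the content of the copy that the loader actually picks.
    static String winner(URL[] searchPath, String name) throws Exception {
        try (URLClassLoader loader = new URLClassLoader(searchPath, null)) {
            URL first = loader.getResource(name);
            return new String(Files.readAllBytes(Paths.get(first.toURI())),
                    StandardCharsets.UTF_8);
        }
    }

    // Returns every copy of the resource on the search path, in search order.
    static List<URL> allCopies(URL[] searchPath, String name) throws Exception {
        try (URLClassLoader loader = new URLClassLoader(searchPath, null)) {
            return Collections.list(loader.getResources(name));
        }
    }

    public static void main(String[] args) throws Exception {
        URL v1 = dirWithResource("soap-version.txt", "v1");
        URL v2 = dirWithResource("soap-version.txt", "v2");
        URL[] path = { v1, v2 };
        System.out.println("picked: " + winner(path, "soap-version.txt"));
        System.out.println("copies: " + allCopies(path, "soap-version.txt").size());
    }
}
```

Class lookup in a URLClassLoader follows the same search order as resource lookup, which is why the second copy of a duplicated class is never seen. Calling getResources on a class file name is also a handy way to detect duplicates on a real classpath.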

It's interesting to note that Yahoo has also released a web services API, and they do not seem to have introduced a dependency on an existing SOAP/XML-RPC library.

On smaller projects, conflicting component dependencies are often cited as a reason not to use a full-scale component (such as a collections library) when you can get by with a small, hand-rolled solution or by including just the one or two classes needed. Hand-rolled solutions have their place, but it is almost always better to use a real component if one is available. And copying other components' classes into your own codebase is never a good idea; in effect, you've just forked the development of that component, and no one is ever going to merge in bug fixes or security updates.

Many larger projects, primarily commercial components, have even adopted the disturbing practice of absorbing entire components and bundling them inside their own JAR. To do this, they mangle the package name to make it unique (e.g., com/acme/foobar/org/freeware/utility) and include the classes directly in their JAR. This prevents clashes between multiple versions of these component JARs, but at considerable cost. It completely hides the third-party dependencies from developers. If the practice became widespread, it would be extremely inefficient, both in the size of JAR files and in the overhead of loading multiple copies of each component into one process. The underlying problem is that when two components depend on the same version of a third component (or can be made to do so), there is no central mediator that can detect this and ensure that the shared component is loaded only once. This is something that we'll be investigating in the next section. Beyond these inefficiencies, your ability to legally bundle third-party software with your own project may well be restricted by the license under which that software is released.

Another approach to this problem is for each component's developers to encode a version number explicitly in their package names. Sun's javac code takes this approach--there is a com.sun.tools.javac.Main class that simply forwards calls on to com.sun.tools.javac.v8.Main. Each time a new Java version is released, the package of this code changes. This allows multiple releases of a component to live in a single class loader and it makes the choice of version explicit; however, this is not a very good solution overall. Either clients need to know exactly what version they plan to use and must change their code to switch to a new version, or they must rely on wrapper classes that forward method calls to the latest version (in which case, these wrapper classes suffer from the same problems that we highlighted above).
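The forwarding arrangement can be sketched as follows. Since a single source file cannot span several packages, nested classes stand in for versioned packages here; every name (VersionedForwarding, V1, V2, Latest, run) is hypothetical, and this is not javac's actual code.

```java
// Hypothetical sketch; nested classes V1 and V2 stand in for versioned
// packages such as com.acme.parser.v1 and com.acme.parser.v2.
class VersionedForwarding {

    // First release of the component.
    static class V1 {
        static String run(String input) {
            return "v1:" + input;
        }
    }

    // Second release, living alongside the first under a different name.
    static class V2 {
        static String run(String input) {
            return "v2:" + input;
        }
    }

    // Unversioned wrapper that forwards to whichever release is newest.
    // Clients of Latest are silently moved to V2 when the wrapper is updated.
    static class Latest {
        static String run(String input) {
            return V2.run(input);
        }
    }

    public static void main(String[] args) {
        System.out.println(V1.run("query"));     // a client pinned to an explicit version
        System.out.println(Latest.run("query")); // a client using the wrapper
    }
}
```

Both versions coexist because their names differ, but only one class called Latest can exist in a class loader, so the wrapper reintroduces the original single-version limitation.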
