ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)
 See this if you're having trouble printing code examples


O'Reilly Book Excerpts: Java Generics and Collections

Java Generics and Collections: Evolution, Not Revolution, Part 1

by Maurice Naftalin and Philip Wadler

Editor's Note: In their new book Java Generics and Collections, authors Maurice Naftalin and Philip Wadler offer the thorough introduction to the syntax and semantics of Java 5.0 generics that you'd expect from an in-depth book on the topic. But they go a step further by considering the real-world, practical concerns of using generics in your work. Unless you're starting a project from scratch in Java 5.0, odds are you have a legacy code-base that does not currently use generics. Is bulk-converting it to use generics in one release a realistic option? Assuming it's not, you need to consider your options for a gradual introduction. Fortunately, the implementation of generics makes this eminently practical, as they describe in Chapter 5, "Evolution, Not Revolution," which we are excerpting over the course of the next two weeks on ONJava.

One motto underpinning the design of generics for Java is evolution, not revolution. It must be possible to migrate a large, existing body of code to use generics gradually (evolution) without requiring a radical, all-at-once change (revolution). The generics design ensures that old code compiles against the new Java libraries, avoiding the unfortunate situation in which half of your code needs old libraries and half of your code needs new libraries.

The requirements for evolution are much stronger than the usual backward compatibility. With simple backward compatibility, one would supply both legacy and generic versions for each application; this is exactly what happens in C#, for example. If you are building on top of code supplied by multiple suppliers, some of whom use legacy collections and some of whom use generic collections, this might rapidly lead to a versioning nightmare.

What we require is that the same client code works with both the legacy and generic versions of a library. This means that the supplier and clients of a library can make completely independent choices about when to move from legacy to generic code. This is a much stronger requirement than backward compatibility; it is called migration compatibility or platform compatibility.

Java implements generics via erasure, which ensures that legacy and generic versions usually generate identical class files, save for some auxiliary information about types. It is possible to replace a legacy class file by a generic class file without changing, or even recompiling, any client code; this is called binary compatibility.

We summarize this with the motto binary compatibility ensures migration compatibility—or, more concisely, erasure eases evolution.

This section shows how to add generics to existing code; it considers a small example, a library for stacks that extends the Collections Framework, together with an associated client. We begin with the legacy stack library and client (written for Java before generics), and then present the corresponding generic library and client (written for Java with generics). Our example code is small, so it is easy to update to generics all in one go, but in practice the library and client will be much larger, and we may want to evolve them separately. This is aided by raw types, which are the legacy counterpart of parameterized types.

The parts of the program may evolve in either order. You may have a generic library with a legacy client; this is the common case for anyone that uses the Collections Framework in Java 5 with legacy code. Or you may have a legacy library with a generic client; this is the case where you want to provide generic signatures for the library without the need to rewrite the entire library. We consider three ways to do this: minimal changes to the source, stub files, and wrappers. The first is useful when you have access to the source and the second when you do not; we recommend against the third.

In practice, the library and client may involve many interfaces and classes, and there may not even be a clear distinction between library and client. But the same principles discussed here still apply, and may be used to evolve any part of a program independently of any other part.

Legacy Library with Legacy Client

We begin with a simple library of stacks and an associated client, as presented in Example 5.1. This is legacy code, written for Java 1.4 and its version of the Collections Framework. Like the Collections Framework, we structure the library as an interface Stack (analogous to List), an implementation class ArrayStack (analogous to ArrayList), and a utility class Stacks (analogous to Collections). The interface Stack provides just three methods: empty, push, and pop. The implementation class ArrayStack provides a single constructor with no arguments, and implements the methods empty, push, and pop using methods size, add, and remove on lists. The body of pop could be shorter—instead of assigning the value to the variable, it could be returned directly—but it will be interesting to see how the type of the variable changes as the code evolves. The utility class provides just one method, reverse, which repeatedly pops from one stack and pushes onto another.

Example 5-1. Legacy library with legacy client

l/Stack.java:
  interface Stack {
    public boolean empty();
    public void push(Object elt);
    public Object pop();
  }

l/ArrayStack.java:
  import java.util.*;
  class ArrayStack implements Stack {
    private List list;
    public ArrayStack() { list = new ArrayList(); }
    public boolean empty() { return list.size() == 0; }
    public void push(Object elt) { list.add(elt); }
    public Object pop() {
      Object elt = list.remove(list.size()-1);
      return elt;
    }
    public String toString() { return "stack"+list.toString(); }
  }

l/Stacks.java:
  class Stacks {
    public static Stack reverse(Stack in) {
      Stack out = new ArrayStack();
      while (!in.empty()) {
        Object elt = in.pop();
        out.push(elt);
      }
      return out;
    }
  }

l/Client.java:
  class Client {
    public static void main(String[]  args) {
      Stack stack = new ArrayStack();
      for (int i = 0; i<4; i++) stack.push(new Integer(i));
      assert stack.toString().equals("stack[0, 1, 2, 3]");
      int top = ((Integer)stack.pop()).intValue();
      assert top == 3 && stack.toString().equals("stack[0, 1, 2]");
      Stack reverse = Stacks.reverse(stack);
      assert stack.empty();
      assert reverse.toString().equals("stack[2, 1, 0]");
    }
  }

The client allocates a stack, pushes a few integers onto it, pops an integer off, and then reverses the remainder into a fresh stack. Since this is Java 1.4, integers must be explicitly boxed when passed to push, and explicitly unboxed when returned by pop.

Generic Library with Generic Client

Next, we update the library and client to use generics, as presented in Example 5.2. This is generic code, written for Java 5 and its version of the Collections Framework. The interface now takes a type parameter, becoming Stack<E> (analogous to List<E>), and so does the implementing class, becoming ArrayStack<E> (analogous to ArrayList<E>), but no type parameter is added to the utility class Stacks (analogous to Collections). The type Object in the signatures and bodies of push and pop is replaced by the type parameter E. Note that the constructor in ArrayStack does not require a type parameter. In the utility class, the reverse method becomes a generic method with argument and result of type Stack<T>. Appropriate type parameters are added to the client, and boxing and unboxing are now implicit.

Example 5-2. Generic library with generic client

g/Stack.java:
  interface Stack<E> {
    public boolean empty();
    public void push(E elt);
    public E pop();
  }

g/ArrayStack.java:
  import java.util.*;
  class ArrayStack<E> implements Stack<E> {
    private List<E> list;
    public ArrayStack() { list = new ArrayList<E>(); }
    public boolean empty() { return list.size() == 0; }
    public void push(E elt) { list.add(elt); }
    public E pop() {
      E elt = list.remove(list.size()-1);
      return elt;
    }
    public String toString() { return "stack"+list.toString(); }
  }

g/Stacks.java:
  class Stacks {
    public static <T> Stack<T> reverse(Stack<T> in) {
      Stack<T> out = new ArrayStack<T>();
      while (!in.empty()) {
        T elt = in.pop();
        out.push(elt);
      }
      return out;
    }
  }

g/Client.java:
  class Client {
    public static void main(String[] args) {
      Stack<Integer> stack = new ArrayStack<Integer>();
      for (int i = 0; i<4; i++) stack.push(i);
      assert stack.toString().equals("stack[0, 1, 2, 3]");
      int top = stack.pop();
      assert top == 3 && stack.toString().equals("stack[0, 1, 2]");
      Stack<Integer> reverse = Stacks.reverse(stack);
      assert stack.empty();
      assert reverse.toString().equals("stack[2, 1, 0]");
    }
  }

In short, the conversion process is straightforward: just add a few type parameters and replace occurrences of Object by the appropriate type variable. All differences between the legacy and generic versions can be spotted by comparing the highlighted portions of the two examples. The implementation of generics is designed so that the two versions generate essentially equivalent class files. Some auxiliary information about the types may differ, but the actual bytecodes to be executed will be identical. Hence, executing the legacy and generic versions yields the same results. The fact that legacy and generic sources yield identical class files eases the process of evolution, as we discuss next.

Generic Library with Legacy Client

Now let's consider the case where the library is updated to generics while the client remains in its legacy version. This may occur because there is not enough time to convert everything all at once, or because the library and client are controlled by different organizations. This corresponds to the most important case of backward compatibility, where the generic Collections Framework of Java 5 must still work with legacy clients written against the Collections Framework in Java 1.4.

In order to support evolution, whenever a parameterized type is defined, Java also recognizes the corresponding unparameterized version of the type, called a raw type. For instance, the parameterized type Stack<E> corresponds to the raw type Stack, and the parameterized type ArrayStack<E> corresponds to the raw type ArrayStack.

Every parameterized type is a subtype of the corresponding raw type, so a value of the parameterized type can be passed where a raw type is expected. Usually, it is an error to pass a value of a supertype where a value of its subtype is expected, but Java does permit a value of a raw type to be passed where a parameterized type is expected—however, it flags this circumstance by generating an unchecked conversion warning. For instance, you can assign a value of type Stack<E> to a variable of type Stack, since the former is a subtype of the latter. You can also assign a value of type Stack to a variable of type Stack<E>, but this will generate an unchecked conversion warning.

To be specific, consider compiling the generic source for Stack<E>, ArrayStack<E>, and Stacks from Example 5.2 (say, in directory g) with the legacy source for Client from Example 5.1 (say, in directory l). Sun's Java 5 compiler yields the following message:

% javac g/Stack.java g/ArrayStack.java g/Stacks.java l/Client.java
Note: Client.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

The unchecked warning indicates that the compiler cannot offer the same safety guarantees that are possible when generics are used uniformly throughout. However, when the generic code is generated by updating legacy code, we know that equivalent class files are produced from both, and hence (despite the unchecked warning) running a legacy client with the generic library will yield the same result as running the legacy client with the legacy library. Here we assume that the only change in updating the library was to introduce generics, and that no change to the behavior was introduced, either on purpose or by mistake.

If we follow the suggestion above and rerun the compiler with the appropriate switch enabled, we get more details:

% javac -Xlint:unchecked g/Stack.java g/ArrayStack.java \
%    g/Stacks.java l/Client.java
l/Client.java:4: warning: [unchecked] unchecked call
to push(E) as a member of the raw type 
 
 
 Stack
      for (int i = 0; i<4; i++) stack.push(new Integer(i));
                                          ^
l/Client.java:8: warning: [unchecked] unchecked conversion
found   : Stack
required: Stack<E>
     Stack reverse = Stacks.reverse(stack);
                                     ^
l/Client.java:8: warning: [unchecked] unchecked method invocation:
<E>reverse(Stack<E>) in Stacks is applied to (Stack)
      Stack reverse = Stacks.reverse(stack);
                                    ^
3 warnings

Not every use of a raw type gives rise to a warning. Because every parameterized type is a subtype of the corresponding raw type, but not conversely, passing a parameterized type where a raw type is expected is safe (hence, no warning for getting the result from reverse), but passing a raw type where a parameterized type is expected issues a warning (hence, the warning when passing an argument to reverse); this is an instance of the Substitution Principle. When we invoke a method on a receiver of a raw type, the method is treated as if the type parameter is a wildcard, so getting a value from a raw type is safe (hence, no warning for the invocation of pop), but putting a value into a raw type issues a warning (hence, the warning for the invocation of push); this is an instance of the Get and Put Principle.

Even if you have not written any generic code, you may still have an evolution problem because others have generified their code. This will affect everyone with legacy code that uses the Collections Framework, which has been generified by Sun. So the most important case of using generic libraries with legacy clients is that of using the Java 5 Collections Framework with legacy code written for the Java 1.4 Collections Framework.

In particular, applying the Java 5 compiler to the legacy code in Example 5.1 also issues unchecked warnings, because of the uses of the generified class ArrayList from the legacy class ArrayStack. Here is what happens when we compile legacy versions of all the files with the Java 5 compiler and libraries:

% javac -Xlint:unchecked l/Stack.java l/ArrayStack.java \
%    l/Stacks.java l/Client.java
l/ArrayStack.java:6: warning: [unchecked] unchecked call to add(E)
as a member of the raw type java.util.List
    public void push(Object elt)  list.add(elt);
                                           ^
1 warning

Here the warning for the use of the generic method add in the legacy method push is issued for reasons similar to those for issuing the previous warning for use of the generic method push from the legacy client.

It is poor practice to configure the compiler to repeatedly issue warnings that you intend to ignore. It is distracting and, worse, it may lead you to ignore warnings that require attention—just as in the fable of the little boy who cried wolf. In the case of pure legacy code, such warnings can be turned off by using the -source 1.4 switch:

% javac -source 1.4 l/Stack.java l/ArrayStack.java \
%    l/Stacks.java l/Client.java

This compiles the legacy code and issues no warnings or errors. This method of turning off warnings is only applicable to true legacy code, with none of the features introduced in Java 5, generic or otherwise. One can also turn off unchecked warnings by using annotations, as described in the next section, and this works even with features introduced in Java 5.

Maurice Naftalin is Director of Software Development at Morningside Light Ltd., a software consultancy in the United Kingdom.

Philip Wadler is a professor of theoretical computer science at the University of Edinburgh, Scotland, where his research focuses on functional and logic programming.


View catalog information for Java Generics and Collections

Return to ONJava.com.

Copyright © 2009 O'Reilly Media, Inc.