ONJava.com -- The Independent Source for Enterprise Java

Generics and Method Objects

A Brief Introduction to Generic Classes



The problem with ClassCastException is nothing new, of course. Programmers have been running into this sort of problem, and generally hoping that their casts are correct, ever since the first JDK shipped. What is new is that there's a solution on the horizon: the Java Community Process, through JSR-014, has produced a specification for adding generics to the Java language.

How to Obtain the Generics Package

For most people, reading a description is a poor substitute for actually making mistakes. Even though I'm going to tell you how to use generics in some detail, it makes sense for you to download and install the generics package first. That way, as I'm explaining things, you can run tests and try things out.

The generics package actually consists of two separate downloads: a specification and an early-access implementation of the specification. But, before I tell you how to download either of these things, I want to make a few things clear:

  • Using Sun's implementation of generics requires JDK 1.4. If you don't have JDK 1.4 yet, you need to download it first.

  • JDK 1.4 is in beta right now and won't be finalized for a few more months.

  • The generics package is not part of JDK 1.4. The generics specification is finished, but it wasn't finished in time for generics to be included in JDK 1.4. Depending on how things go, generics might be included in JDK 1.5 (scheduled for release in 2003).

Put another way, using this stuff is going to force you to migrate to a new, still-in-beta version of the JDK. And even if you manage to get your friends to migrate to JDK 1.4, they still aren't going to be able to compile your code unless they install the generics compiler as well (they will be able to use your class files; the output from the generics compiler is 100% compatible with the JDK 1.4 JVM).

The generics specification is available from the JSR-014 web page (you'll have to accept a click-through license). To get the early-access implementation of the generics specification, you'll have to go to Javasoft's early access site.

The implementation that you can download from the early access site consists of two major components: a new compiler (that understands the generics syntax) and a genericized implementation of the Collections library.

The Idea Behind Generics

The idea behind generics is simple, and will look familiar to programmers who have used C++ templates. As part of class, interface, and method definitions, you can specify parametrized types by using a comma-separated list of type variables. The type variables stand in for classes that will be specified later, either as part of a subclass definition or during a call to a constructor.

The code that creates instances of a generic class is responsible for specifying all the parametrized types; the compiler is responsible for tracking the types and making sure that all the type declarations match throughout the code (you don't even have to include the casts in your code; the compiler will handle everything for you).

Here, for example, is the generic definition of the List interface and its add method (the generic version of List is part of the early-access implementation).

    public interface List<E> extends Collection<E> {
        // ...
        boolean add(E o);
        // ...
    }

This declares that List is an interface with a single parametrized type, denoted by the type variable E. Arguments to the add method must have the type bound to E. Before an instance of a List can be created, E must be specified. And, because E is fixed at the point where the list is declared and constructed, the compiler can check that the arguments to add are of the correct type and warn the programmer (thus converting run-time instances of ClassCastException into compile-time warnings and errors).

List is a simple example, but it illustrates the basic idea: as part of the declaration, programmers include a set of type variables, using the <> syntax. Once the type variables have been declared, they can be used to restrict the arguments, return values, or exceptions in method declarations.
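
To make that concrete, here is a minimal sketch of a generic class whose type variable restricts both an argument and a return value. Box and BoxDemo are hypothetical names invented for this illustration; they are not part of the early-access package, and the code assumes a current Java compiler rather than the prototype.

```java
// Hypothetical example class; Box is not part of the generics package.
class Box<E> {
    private E contents;

    // The argument must have the type bound to E.
    void put(E item) {
        contents = item;
    }

    // The return value has the type bound to E, so callers need no cast.
    E get() {
        return contents;
    }
}

class BoxDemo {
    public static void main(String[] args) {
        Box<String> box = new Box<String>();
        box.put("thirteen");
        String s = box.get();   // no cast needed
        System.out.println(s);  // prints: thirteen
    }
}
```

Calling box.put(new Object()) on a Box<String> would be rejected at compile time, which is exactly the kind of check the List declaration enables.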

Here are some more class declarations, to give you the general idea:

    public class SimpleClass<T> { ... }
    public class MoreComplicatedClass<W, T> { ... }

The first of these defines a class named SimpleClass that has a single type variable (T). The second defines a class, MoreComplicatedClass, with two type variables (W and T). In the case of SimpleClass, methods can only use one type variable; in the case of MoreComplicatedClass, there are two type variables involved.
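
As a rough sketch of what a class with two type variables might look like once it has a body, consider the hypothetical Pair class below. The class, its methods, and PairDemo are all invented for illustration, and the code assumes a current Java compiler.

```java
// Hypothetical two-variable class, sketched for illustration only.
class Pair<W, T> {
    private final W first;
    private final T second;

    Pair(W first, T second) {
        this.first = first;
        this.second = second;
    }

    // Each method can use either type variable independently.
    W getFirst() { return first; }
    T getSecond() { return second; }
}

class PairDemo {
    public static void main(String[] args) {
        Pair<String, Integer> p =
            new Pair<String, Integer>("thirteen", Integer.valueOf(13));
        String name = p.getFirst();     // W is bound to String
        Integer value = p.getSecond();  // T is bound to Integer
        System.out.println(name + " = " + value);  // prints: thirteen = 13
    }
}
```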

Note: I'm deliberately not giving the full grammar for generics or covering all the possibilities. This article only covers the basics of generics. For the full scoop, see JSR-014.

One unexpected feature is that any type variable can be constrained by using the extends keyword. For example, the following declaration says that the class Athlete is a parametrized class with two type variables, T and W, and that T is restricted to subclasses of Sport.

    public class Athlete<T extends Sport, W> extends Person { ... }
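
The practical payoff of a constraint is that the class body can call the bound's methods on values of the type variable. The sketch below illustrates this; Sport, Tennis, Person, the name and describe methods, and the sponsor field are all hypothetical stand-ins, since the article does not define them, and the code assumes a current Java compiler.

```java
// Sport, Tennis, and Person are hypothetical stand-ins for illustration.
class Sport {
    String name() { return "sport"; }
}

class Tennis extends Sport {
    String name() { return "tennis"; }
}

class Person { }

class Athlete<T extends Sport, W> extends Person {
    private final T sport;
    private final W sponsor;

    Athlete(T sport, W sponsor) {
        this.sport = sport;
        this.sponsor = sponsor;
    }

    // Because T extends Sport, Sport's methods can be called on a T.
    String describe() {
        return "plays " + sport.name() + ", sponsored by " + sponsor;
    }
}

class AthleteDemo {
    public static void main(String[] args) {
        Athlete<Tennis, String> a =
            new Athlete<Tennis, String>(new Tennis(), "Acme");
        System.out.println(a.describe());
        // Athlete<String, String> would be rejected by the compiler:
        // String is not a subclass of Sport.
    }
}
```

W, by contrast, is unconstrained, so the class can only treat a W as an Object (here, by converting it to a string).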

Instantiating and Using a Generic Class

Once you've defined a generic class (or found one to use), the next thing you need to do is use it in your code. The best way to explain this is to start with an example. Let's look at the declaration of the generic versions of the List interface, the AbstractList class, and the Vector class. Here's the relevant code:

    public interface List<E> extends Collection<E> {
        boolean containsAll(java.util.Collection<E> c);
        boolean removeAll(java.util.Collection<E> c);
        boolean retainAll(java.util.Collection<E> c);
        E[] toArray(E[] a);
        boolean add(E o);
        //    ...
    }

    public class AbstractList<E> implements List<E>{
        //    ...
    }

    public class Vector<E> extends AbstractList<E>{
        //    ...
    }

The only thing necessary in order to use Vector<E> is to supply a concrete type for the type variable, both when you declare the variable and when you call the constructor. After that, you can use the generic class just like a normal class. Here's an example program that creates two instances of Vector that contain strongly typed entries.

    import java.util.*;
    public class SimpleVectorProgram {
        public static void main(String[] args) {
            playWithIntegerVector();
            playWithStringVector();
        }

        private static void playWithIntegerVector() {
            Vector<Integer> integerVector = new Vector<Integer>();
            integerVector.add(new Integer(13));
            examine(integerVector);
            // no cast necessary in next line
            Integer ourInteger = integerVector.get(0);
        }

        private static void playWithStringVector() {
            Vector<String> stringVector = new Vector<String>();
            stringVector.add("thirteen");
            examine(stringVector);
            // no cast necessary in next line
            String ourString = stringVector.get(0);
        }

        private static void examine(Object o) {
            System.out.println("It's an instance of " + o.getClass().getName());
        }
    }

The two different playWith... methods create two different instances of Vector -- one that can only contain instances of Integer and one that can only contain instances of String. The important point to notice is that once the vectors are created, objects are inserted and retrieved just as they were in JDK 1.3 or 1.2. The only difference is that the values don't need to be cast to the correct type because the compiler is performing all the necessary type checks.

The casting is still happening, of course. It's just that the casts are being inserted by the compiler, after the compiler validates that the cast will succeed. Because the compiler is checking types more thoroughly, it can reject incorrect code. For example, the following version of playWithIntegerVector won't compile:

        private static void playWithIntegerVector() {
            Vector<Integer> integerVector = new Vector<Integer>();
            // The next line causes a compiler error; it's adding an instance of
            // the wrong type. It wouldn't fail if the declaration were
            // Vector integerVector = new Vector();
            integerVector.add("thirteen");
            examine(integerVector);
            Integer ourInteger = integerVector.get(0);
        }
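
For contrast, here is a sketch of what happens with the non-generic declaration mentioned in the comment. The compiler accepts the bad add call, and the mistake only surfaces when the element is cast on the way out. RawVectorDemo and failsAtRunTime are hypothetical names, and the @SuppressWarnings annotation assumes a current Java compiler rather than the early-access prototype.

```java
import java.util.Vector;

class RawVectorDemo {
    // Returns true if retrieving the element fails with a ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsAtRunTime() {
        Vector integerVector = new Vector();   // raw type: no element type recorded
        integerVector.add("thirteen");         // accepted by the compiler
        try {
            // The hand-written cast is where the mistake finally shows up.
            Integer ourInteger = (Integer) integerVector.get(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ClassCastException at run time: " + failsAtRunTime());
    }
}
```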

Generics in Gory Detail

In this section, we're going to dive into the details of the implementation of generics. This is a good time to pause and take a breather (or, on a first reading, to skip ahead to the section entitled Adding Generic Arguments and ReturnValues to the Command Object Framework).

One thing that's a little surprising is the output of SimpleVectorProgram above. It prints out the following:

    It's an instance of java.util.Vector
    It's an instance of java.util.Vector

In particular, the run-time types of the vectors are Vector, and not Vector<Integer> or Vector<String>. You might suspect that this is just the result of a badly written toString method in the Vector class, but it's not; the name comes from getClass().getName(), which reports the object's actual run-time class. The types Vector<String> and Vector<Integer> simply do not exist at run-time. In fact, they don't really even exist during compile-time; the compiler rewrites the source code to eliminate the type variables (the JSR-014 specification calls this rewriting process erasure).
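
You can check this directly by comparing the run-time classes of two differently parametrized vectors. The following is a minimal sketch, assuming a current Java compiler; ErasureDemo and sameRunTimeClass are hypothetical names.

```java
import java.util.Vector;

class ErasureDemo {
    // True if both vectors share a single run-time class.
    static boolean sameRunTimeClass() {
        Vector<Integer> integerVector = new Vector<Integer>();
        Vector<String> stringVector = new Vector<String>();
        // Both objects report the very same Class object.
        return integerVector.getClass() == stringVector.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRunTimeClass());                          // prints: true
        System.out.println(new Vector<String>().getClass().getName());   // prints: java.util.Vector
    }
}
```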

In fact, if you decompile the byte code file for SimpleVectorProgram, you'll find the following:

    public class SimpleVectorProgram {
        public static void main(String args[]) {
            playWithIntegerVector();
            playWithStringVector();
        }

        private static void playWithIntegerVector() {
            Vector vector = new Vector();
            vector.add(new Integer(13));
            examine(vector);
            Integer integer = (Integer)vector.get(0);
        }

        private static void playWithStringVector() {
            Vector vector = new Vector();
            vector.add("thirteen");
            examine(vector);
            String s = (String)vector.get(0);
        }

        private static void examine(Object obj) {
            System.out.println("It's an instance of " + obj.getClass().getName());
        }
    }

This example shows that the generics compiler doesn't leave any traces of the extended type information (the values of the type variables) in the class files. Instead, it does two things:

  • It inserts all the casts the developer would have otherwise had to write.

  • It does some fairly sophisticated analysis to make sure that all the casts "match up" (that is, that everything it casts to a String really is a String).

This way of doing things has three main advantages:

  • The "code bloat" problem associated with C++ templates simply doesn't exist. In C++, template classes really do become many classes (one for each set of template variable bindings). This isn't so bad when you're doing an install from CD, but can become a nightmare for mobile code.

  • It avoids potential security policy issues revolving around permissions granted to packages. If X is a randomly chosen class, it's not obvious what package Vector<X> should be in, or what permissions it should be granted. While this probably isn't an issue for container classes like Vector, it's a very significant issue for other classes.

  • It doesn't change the JVM specification. Most specifications for adding generics to the Java language eventually wind up adding new bytecodes or changing the definition of a class.

On the other hand, this approach can also cause some headaches. The most annoying "gotcha" in the generics specification is this: you can't use parametrized types effectively in method dispatch -- they simply don't work the way ordinary types do. For example, the following class won't compile:

    public class MoreComplexVectorProgram<W,T> {
        private void examine(Vector<T> vector) {
        }
        private void examine(Vector<W> vector) {
        }
    }

This is a perfectly natural thing to want to do. Mentally, Vector<T> and Vector<W> are distinct types and so overloading the examine method is intuitively correct (if somewhat strange). But look at the decompiled code again. Once the generics compiler is finished erasing the type variable information, both examine methods will have the same signature (they both take a single argument of type Vector). Since you can't have two methods in the same class with identical signatures, this code won't compile.
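
One possible workaround, which is my suggestion rather than anything in the specification, is simply to give the two methods different names. The sketch below does that and then uses reflection to confirm that both methods take a plain Vector once erasure has run. DistinctNamesProgram, examineFirst, examineSecond, and erasedParameterType are hypothetical names, and the code assumes a current Java compiler.

```java
import java.lang.reflect.Method;
import java.util.Vector;

class DistinctNamesProgram<W, T> {
    // Distinct names avoid the clash; after erasure both take a plain Vector.
    void examineFirst(Vector<T> vector) { }
    void examineSecond(Vector<W> vector) { }

    // Looks up a method by name and erased parameter type, then reports
    // the class name of its single parameter.
    static String erasedParameterType(String methodName) {
        try {
            Method m = DistinctNamesProgram.class
                .getDeclaredMethod(methodName, Vector.class);
            return m.getParameterTypes()[0].getName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedParameterType("examineFirst"));   // prints: java.util.Vector
        System.out.println(erasedParameterType("examineSecond"));  // prints: java.util.Vector
    }
}
```

Both lookups succeed with Vector.class as the parameter type, which is the same collision the compiler detects when the methods share a name.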
