ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


O'Reilly Book Excerpts: Java Servlet Programming, 2nd Edition

Enterprise Servlets and J2EE

by Jason Hunter with William Crawford

This excerpt is Chapter 12 from Java Servlet Programming, 2nd Edition, published in April 2001 by O'Reilly.

This chapter discusses enterprise servlets. The term enterprise is used all the time with Java these days, but what does it mean? According to my trusty and beat-up copy of The American Heritage Dictionary (so old it's priced at $1.95), the word enterprise has three definitions:

  1. An undertaking, esp. one of some scope and risk

  2. A business

  3. Readiness to venture; initiative

These definitions are surprisingly close to what people mean when they say enterprise Java and enterprise servlets. We can merge the traditional definitions to create a modern definition:

  1. Readiness to support a business undertaking of large scope

In other words, enterprise servlets are servlets designed to support business-oriented large-scale web sites -- high-traffic, high-reliability sites that have extra demands for scalability, load balancing, failover support, and integration with other Java 2, Enterprise Edition (J2EE) technologies.

As servlets have become increasingly popular and robust, and as servlet containers have become more solid and featureful, a growing number of enterprise sites are being built using servlets. Writing servlets for these sites differs from writing servlets for traditional sites, and in this chapter we'll discuss the special requirements and abilities of these enterprise servlets.

Distributing Load

How to Be Distributable

Many Styles of Distribution

Integrating with J2EE

J2EE Division of Labor

Environment Entries

References to EJB Components

References to External Resource Factories

Servlet Distribution in a J2EE Environment

Distributing Load

For high-traffic and/or high-reliability sites, it's often desirable to distribute the site's content and processing duties across multiple backend servers. This distribution allows multiple servers to share the load, increasing the number of simultaneous requests that can be handled and providing failover so the site can remain up even when one particular component crashes.

Distribution isn't appropriate for every site. Creating and maintaining a distributed site can be significantly more complicated than doing the same for a standalone site and can be more costly as well in terms of load-balancing hardware and/or software requirements. Distribution also doesn't tend to provide a significant performance benefit until the server is under extreme load. When presented with a performance problem, it's often easiest to "throw hardware at the problem" by installing a single higher-end machine rather than trying to share the load between two underperforming machines.

Still, there are many sites that need to scale beyond the capabilities of a single machine and that need a level of reliability no single machine can offer. These are the sites that need to be distributed.

How to Be Distributable

The programming requirements for a distributable servlet are much stricter than the requirements for a nondistributable servlet. A distributable servlet must be written following certain rules so that different instances of the servlet can execute on multiple backend machines. Any programmer assumptions that there's only one servlet instance, one servlet context, one JVM, or one filesystem have the potential to cause serious problems.
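The single-instance assumption can be illustrated with a small standalone sketch. It is not servlet code: the Backend class stands in for one servlet instance on one machine, and SHARED_STORE is a hypothetical stand-in for an external resource such as a database (a real cluster could not share an in-memory Map between machines).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class HitCounterDemo {

  // Stand-in for an external store such as a database. This is an assumption
  // made for the sketch; machines in a real cluster do not share a Map.
  public static final Map<String, AtomicInteger> SHARED_STORE =
      new ConcurrentHashMap<>();

  // Mimics one servlet instance running on one backend machine.
  public static class Backend {
    private int localHits; // instance state, invisible to every other backend

    public int countLocally() {
      return ++localHits;
    }

    public int countInSharedStore() {
      return SHARED_STORE
          .computeIfAbsent("hits", key -> new AtomicInteger())
          .incrementAndGet();
    }
  }

  public static void main(String[] args) {
    Backend a = new Backend();
    Backend b = new Backend();

    // Three requests spread across two backends.
    a.countLocally();
    b.countLocally();
    int local = a.countLocally();

    a.countInSharedStore();
    b.countInSharedStore();
    int shared = a.countInSharedStore();

    System.out.println("Instance variable on backend a: " + local);  // 2
    System.out.println("Shared store: " + shared);                   // 3
  }
}
```

After three requests the instance variable on backend a has seen only two of them, while the shared store has seen all three.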

For more information on Enterprise JavaBeans see http://java.sun.com/products/ejb and Enterprise JavaBeans by Richard Monson-Haefel (O'Reilly).

To learn how servlets can be distributed, look at Enterprise JavaBeans (EJB) technology, a server-side component model for implementing distributed business objects and the technology that's at the heart of J2EE. EJBs are designed from the ground up to be distributable objects. An EJB implements business logic and lets the container (essentially the server) in which it runs manage services such as transactions, persistence, concurrency, and security. An EJB may be distributed across a number of backend machines and may be moved between machines at the container's discretion. To enable this distribution model, EJBs must follow a strict specification-defined ruleset for what they can and cannot do. (See sidebar)

Servlets have no such specification-defined ruleset. This stems from their heritage as frontend server-side components, used to communicate with the client and call on the distributed EJB and not be distributed themselves. However, for high-traffic sites or sites that need high reliability, servlets too need to be distributed. We expect upcoming Servlet API versions to include a tighter definition for the implementation of distributed servlet containers.

The following are our own rules of thumb for writing servlets to be deployed in a distributed environment:

A web application whose components follow these rules can be marked distributable, and that marking allows the server to deploy the application across multiple backend machines. The distributable mark is placed within the web.xml deployment descriptor as an empty <distributable/> tag located between the application's description and its context parameters:

<web-app>
 <description>
  All servlets and JSPs are ready for distributed deployment
 </description>

 <distributable/>

 <context-param>
  <!-- ... -->
 </context-param>
</web-app>

Applications are nondistributable by default, to allow the casual servlet programmer to author servlets without worrying about the extra rules for distributed deployment. Marking an application distributable does not necessarily mean the application will be split across different machines. It only indicates the capability of the application to be split. Think of it as a programmer-provided certification.

Servers do not enforce most of the preceding rules given for a distributed application. For example, a servlet is not barred from using instance and static variables nor barred from storing objects in its ServletContext, and a servlet may still directly access files using the java.io package. It's up to the programmer to ensure these abilities aren't abused. The only enforcement that the server may perform is throwing an IllegalArgumentException if an object bound to the HttpSession does not implement java.io.Serializable (and even that's optional because, as we'll see later, a J2EE-compliant server must allow additional types of objects to be stored in the session).
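Since the server may not check session objects for you, it can help to check them yourself. The following is a minimal sketch, using only java.io, of a round trip through Java serialization, which is roughly what a container might do when moving a session between JVMs (a container is free to use another mechanism). The Cart class and the roundTrip( ) helper are hypothetical names invented for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SessionAttributeCheck {

  // Hypothetical session attribute; every field is itself Serializable.
  public static class Cart implements Serializable {
    private static final long serialVersionUID = 1L;
    private final List<String> items = new ArrayList<>();

    public void add(String item) { items.add(item); }
    public List<String> getItems() { return items; }
  }

  // Serializes and deserializes a value. Throws IllegalArgumentException
  // when the value (or something it references) cannot be transferred.
  public static Object roundTrip(Object value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalArgumentException("Not transferable: " + e, e);
    }
  }

  public static void main(String[] args) {
    Cart cart = new Cart();
    cart.add("cabin-101");
    Cart copy = (Cart) roundTrip(cart);
    System.out.println(copy.getItems()); // [cabin-101]

    try {
      roundTrip(new Object()); // java.lang.Object is not Serializable
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

A check like this fits naturally in a unit test for each class you intend to bind to the HttpSession.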

Many Styles of Distribution

Servlet distribution (often called clustering) is an optional feature of a servlet container, and servlet containers that do support clustering are free to do so in several different ways. There are four standard architectures, listed here from simplest to most advanced.

  1. No clustering. All servlets execute within a single JVM, and the <distributable/> marker is essentially ignored. This design is simple, and works fine for a standard site. The standalone Tomcat server works this way.

  2. Clustering support, no session migration, and no session failover. Servlets in a web application marked <distributable/> may execute across multiple machines. Nonsession requests are randomly distributed (modulo some weighting perhaps). Session requests are "sticky" and tied to the particular backend server on which they first start. Session data does not move between machines, and this has the advantage that sessions may hold nontransferable (non-Serializable) data and the disadvantage that sessions may not migrate to underutilized servers and a server crash may result in broken sessions. This is the architecture used by Apache/JServ and Apache/Tomcat. Sessions are tied to a particular host through a mechanism where the mod_jserv/mod_jk connector in Apache uses a portion of the session ID to indicate which backend JServ or Tomcat owns the session. Multiple instances of Apache may be used as well, with the support of load-balancing hardware or software.

  3. Clustering support, with session migration, no session failover. This architecture works the same as the former, except a session may migrate from one server to another to improve the load balance. To avoid concurrency issues, any session migration is guaranteed to occur between user requests. The Servlet Specification makes this guarantee: "Within an application that is marked as distributable, all requests that are part of a session can only be handled on a single VM at any one time." All objects placed into a session that may be migrated must implement java.io.Serializable or be transferable in some other way.

  4. Clustering support, with session migration and with session failover. A server implementing this architecture has the additional ability to duplicate the contents of a session so the crash of any individual component does not necessarily break a user's session. The challenge with this architecture is coordinating efficient and effective information flow. Most high-end servers follow this architecture.

The details on how to implement clustering vary by server and are a point on which server vendors actively compete. Look to your server's documentation for details on what level of clustering it supports. Another useful feature to watch for is session persistence, the background saving of session information to disk or database, which allows the information to survive server restarts and crashes.
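The "sticky" routing of the second architecture can be sketched in a few lines. This is an illustration only, not the mod_jserv or mod_jk implementation: the StickyRouter class is hypothetical, the session ID layout (an opaque value, a dot, then the owning backend's name) is an assumption, and round-robin selection is used in place of random distribution so the behavior is predictable.

```java
import java.util.Arrays;
import java.util.List;

public class StickyRouter {
  private final List<String> backends;
  private int next; // round-robin cursor for requests without a session

  public StickyRouter(List<String> backends) {
    this.backends = backends;
  }

  // Picks a backend for a request. A session ID of the assumed form
  // "<opaque>.<backend>" is always sent to the backend it names.
  public synchronized String choose(String sessionId) {
    if (sessionId != null) {
      int dot = sessionId.lastIndexOf('.');
      if (dot >= 0) {
        String owner = sessionId.substring(dot + 1);
        if (backends.contains(owner)) {
          return owner; // sticky: the session stays where it started
        }
      }
    }
    // No session, or no recognizable owner: spread the load.
    String chosen = backends.get(next);
    next = (next + 1) % backends.size();
    return chosen;
  }

  public static void main(String[] args) {
    StickyRouter router =
        new StickyRouter(Arrays.asList("tomcat1", "tomcat2"));
    System.out.println(router.choose(null));             // tomcat1
    System.out.println(router.choose(null));             // tomcat2
    System.out.println(router.choose("A1B2C3.tomcat2")); // tomcat2
    System.out.println(router.choose("A1B2C3.tomcat2")); // tomcat2
  }
}
```

Because the owning backend is encoded in the ID itself, the router needs no shared table of sessions, which is also why a crashed backend takes its sessions with it.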

Integrating with J2EE

Throughout the rest of this book, servlets have been used as a standalone technology built upon the standard Java base. Servlets have another life, however, where they act as an integral piece of what's known as Java 2, Enterprise Edition, or J2EE for short.

Most pronounce J2EE as J-2-E-E but those who know it best at Sun just say "jah-too-ee."

J2EE 1.2 collects together several server-side APIs including Servlet API 2.2, JSP 1.1, EJB, JavaMail, the Java Message Service (JMS), the Java Transaction API (JTA), CORBA, JDBC, the Java API for XML Parsing (JAXP), and the Java Naming and Directory Interface (JNDI). J2EE makes the whole greater than the sum of its parts by defining how these technologies can interoperate and make use of one another, and providing certification that certain application servers are J2EE compliant, meaning they provide all the required services as well as the extra connection glue.

J2EE Division of Labor

J2EE breaks enterprise application development into six distinct roles. Of course, an individual may participate in more than one role and multiple individuals may work together in a given role.

J2EE product provider
The operating system vendor, database system vendor, application server vendor, and/or web server vendor. The product provider provides an implementation of the J2EE APIs and tools for application deployment and management.
Application component provider
The author of the application's servlets, EJB, and other code as well as general content such as HTML. (In other words, you.)
Application assembler
Takes the application's components and (using tools from the product provider) places them in a form appropriate for deployment. As part of this the assembler describes the external dependencies of the application that may change from deployment to deployment, like database or user login information.
Deployer
Takes the output of the assembler and (using tools from the product provider) installs, configures, and executes the application. The configuration task requires satisfying the external dependencies outlined by the assembler.
System administrator
Configures and administers the network infrastructure to keep the application alive.
Tool provider
Creates tools to support J2EE development, beyond those provided by the product provider.

The division of labor between component provider, assembler, and deployer has an impact on how we (as servlet programmers in the application component provider role) behave. Specifically, we should design our code to make external dependencies clear for the assembler, and furthermore we should use mechanisms that allow the deployer to satisfy these dependencies without modifying the files received from the assembler. That means no deployer edits to the web.xml file! Why not? Because J2EE applications are assembled into Enterprise Archive (.ear) files of which a contained web application's web.xml file is but one uneditable part.

This sounds more difficult than it actually is. J2EE provides a standard mechanism to achieve this abstraction using JNDI and a few special tags in the web.xml deployment descriptor. JNDI is an object lookup mechanism, a way to bind objects under certain paths and locate them later using that path. You can think of it like an RMI registry, except it's more general with support for accessing a range of services including LDAP and NIS (and even, in fact, the RMI registry!). An assembler declares external dependencies within the web.xml using special tags, a deployer satisfies these dependencies using server-specific tools, and at runtime our Java code uses the JNDI API to access the external resources -- kindly placed there by the J2EE-compliant server. All goals are satisfied: our Java code remains portable between J2EE-compliant servers, and the deployer can satisfy the code's external dependencies without modifying the files received from the assembler. There's even enough flexibility left over for server vendors to compete on implementations of the standard.

Environment Entries

Context init parameters serve a useful purpose with servlets, but there's a problem with context init parameters in the J2EE model: any change to a parameter value requires a modification to the web.xml file. For parameter values that may need to change during deployment, it's better to use environment entries instead, as indicated by the <env-entry> tag. The <env-entry> tag may contain a <description>, <env-entry-name>, <env-entry-value>, and <env-entry-type>. The following <env-entry> specifies whether the application should enable sending of PIN codes by mail:

<env-entry>
 <description>Send pincode by mail</description>
 <env-entry-name>mailPincode</env-entry-name>
 <env-entry-value>false</env-entry-value>
 <env-entry-type>java.lang.Boolean</env-entry-type> <!-- FQCN -->
</env-entry>

The <description> explains to the deployer the purpose of this entry. It's optional but a good idea to provide. The <env-entry-name> is used by Java code as part of the JNDI lookup. The <env-entry-value> defines the default value to be presented to the deployer. It's optional, but not specifying a value requires the deployer to provide one. The <env-entry-type> represents the fully qualified class name (FQCN) of the entry. The type may be a String, Byte, Short, Integer, Long, Boolean, Double, or Float (all with their full java.lang qualification). The type helps the deployer know what's expected. If you're familiar with the EJB deployment descriptor, these tags may look familiar; they have the same names and semantics in EJB as well.
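To make the role of <env-entry-type> concrete, here is a rough sketch of how the text of an <env-entry-value> could be turned into an object of the declared type. It is not taken from any server: the EnvEntryValues class and its convert( ) method are hypothetical, and a real container may perform this step quite differently.

```java
public class EnvEntryValues {

  // Converts the text of an <env-entry-value> into the class named by
  // the <env-entry-type>. Only the eight types listed above are accepted.
  public static Object convert(String type, String value) {
    switch (type) {
      case "java.lang.String":  return value;
      case "java.lang.Boolean": return Boolean.valueOf(value);
      case "java.lang.Byte":    return Byte.valueOf(value);
      case "java.lang.Short":   return Short.valueOf(value);
      case "java.lang.Integer": return Integer.valueOf(value);
      case "java.lang.Long":    return Long.valueOf(value);
      case "java.lang.Double":  return Double.valueOf(value);
      case "java.lang.Float":   return Float.valueOf(value);
      default:
        throw new IllegalArgumentException(
            "Unsupported env-entry-type: " + type);
    }
  }

  public static void main(String[] args) {
    Object mailPincode = convert("java.lang.Boolean", "false");
    // Prints: java.lang.Boolean = false
    System.out.println(mailPincode.getClass().getName() + " = " + mailPincode);
  }
}
```

The practical consequence for servlet code is that a lookup returns an object of exactly the declared type, so the cast applied to the result must match the <env-entry-type>.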

Java code can retrieve the <env-entry> values using JNDI:

Context initCtx = new InitialContext( );
Boolean mailPincode = (Boolean) initCtx.lookup("java:comp/env/mailPincode");

All entries are placed by the server into the java:comp/env context. If you're new to JNDI, you can think of this as a URL base or filesystem directory. The java:comp/env context is read-only and unique per web application, so if two different web applications define the same environment entry, the entries do not collide. The context abbreviations, by the way, stand for component environment.

Example 12-1 shows a servlet that displays all its environment entries, using the JNDI API to browse the java:comp/env context.

Example 12-1: Snooping the java:comp/env Context

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

import javax.naming.*;

public class EnvEntrySnoop extends HttpServlet {

 public void doGet(HttpServletRequest req, HttpServletResponse res)
     throws ServletException, IOException {
  res.setContentType("text/plain");
  PrintWriter out = res.getWriter( );

  try {
   Context initCtx = new InitialContext( );
   // Named "bindings" because "enum" is a reserved word as of Java 5
   NamingEnumeration bindings = initCtx.listBindings("java:comp/env");

   // We're using JDK 1.2 methods; that's OK since J2EE requires JDK 1.2
   while (bindings.hasMore( )) {
    Binding binding = (Binding) bindings.next( );
    out.println("Name: " + binding.getName( ));
    out.println("Type: " + binding.getClassName( ));
    out.println("Value: " + binding.getObject( ));
    out.println( );
   }
  }
  catch (NamingException e) {
   e.printStackTrace(out);
  }
 }
}

Assuming the previous web.xml entry, the servlet would generate:

Name: mailPincode
Type: java.lang.Boolean
Value: false

Remember, a server that does not support J2EE is not required to support these tags or any of the tags we talk about in this section.
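Code that may also run on a server without J2EE support can guard its lookups. The sketch below is one possible approach, not a prescribed pattern: the EnvEntries class and lookupBoolean( ) helper are hypothetical names, and the choice to fall back silently to a default is an assumption you may not want in production.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class EnvEntries {

  // Looks up a Boolean environment entry under java:comp/env, returning
  // the fallback when no naming environment exists or the entry is unbound.
  public static boolean lookupBoolean(String name, boolean fallback) {
    try {
      Context initCtx = new InitialContext();
      Object value = initCtx.lookup("java:comp/env/" + name);
      return (value instanceof Boolean) ? (Boolean) value : fallback;
    } catch (NamingException e) {
      // No J2EE naming environment, or the name is not bound.
      return fallback;
    }
  }

  public static void main(String[] args) {
    // Run outside a J2EE server, the lookup fails and the fallback is used.
    System.out.println(lookupBoolean("mailPincode", false));
  }
}
```

Inside a J2EE-compliant server with the earlier <env-entry> deployed, the same call would return whatever value the deployer assigned to mailPincode.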

References to EJB Components

When the environment entry object is an EJB component, there's a special <ejb-ref> tag that must be used. It provides a way for servlets to get a handle to an EJB using an abstract name. The deployer ensures the availability of an appropriate bean at runtime based on the constraints given by the <ejb-ref> tag. The tag may contain a <description>, <ejb-ref-name>, <ejb-ref-type>, <home>, <remote>, and <ejb-link>. Here's a typical <ejb-ref>:

<ejb-ref>
  <description>Cruise ship cabin</description>
  <ejb-ref-name>ejb/CabinHome</ejb-ref-name>
  <ejb-ref-type>Entity</ejb-ref-type>
  <home>com.titan.cabin.CabinHome</home>
  <remote>com.titan.cabin.Cabin</remote>
</ejb-ref>

The Servlet API 2.2 Specification states, "The ejb-ref-type element contains the expected Java class type of the referenced EJB." This is a confirmed mistake. The actual purpose is as stated here.

These tags also have similar counterparts in EJB, and in fact this example is borrowed from the book Enterprise JavaBeans by Richard Monson-Haefel (O'Reilly). The <description> supports the deployer and is optional but recommended. The <ejb-ref-name> dictates the JNDI lookup name. It's recommended (but not required) that the name be placed within the ejb/ subcontext, making the full path to the bean java:comp/env/ejb/CabinHome. The <ejb-ref-type> must have a value of either Entity or Session, the two types of EJB components (see sidebar).

Finally, the <home> element specifies the fully qualified class name of the EJB's home interface, while the <remote> element specifies the FQCN of the EJB's remote interface.

A servlet would obtain a reference to the Cabin bean with the following code:

InitialContext initCtx = new InitialContext( );
Object ref = initCtx.lookup("java:comp/env/ejb/CabinHome");
CabinHome home = 
  (CabinHome) PortableRemoteObject.narrow(ref, CabinHome.class);

If the assembler writing the web.xml file has a specific EJB component in mind for an EJB reference, that information can be conveyed to the deployer with the addition of the optional <ejb-link> element. The <ejb-link> element should refer to the <ejb-name> of an EJB component registered in an EJB deployment descriptor within the same J2EE application. The deployer has the option to use the suggestion or override it. Here's an updated web.xml entry:

<ejb-ref>
  <description>Cruise ship cabin</description>
  <ejb-ref-name>ejb/CabinHome</ejb-ref-name>
  <ejb-ref-type>Entity</ejb-ref-type>
  <home>com.titan.cabin.CabinHome</home>
  <remote>com.titan.cabin.Cabin</remote>
  <ejb-link>CabinBean</ejb-link>
</ejb-ref>

References to External Resource Factories

Finally, for those times when the environment entry is a resource factory, there's a <resource-ref> tag to use. A factory is an object that creates other objects on demand. A resource factory creates resource objects, such as database connections or message queues.

The <resource-ref> tag may contain a <description>, <res-ref-name>, <res-type>, and <res-auth>. Here's a typical <resource-ref>:

<resource-ref>
  <description>Primary database</description>
  <res-ref-name>jdbc/primaryDB</res-ref-name>
  <res-type>javax.sql.DataSource</res-type>
  <res-auth>CONTAINER</res-auth>
</resource-ref>

The <description> again supports the deployer and is optional but recommended. The <res-ref-name> dictates the JNDI lookup name. It's recommended but not required to place the resource factories under a subcontext that describes the resource type:

The <res-type> element specifies the FQCN of the resource factory (not the created resource). The factory types in the preceding list are the standard types. A server has the option to support additional types; user factories cannot be used. The upcoming J2EE 1.3 specification proposes a "connector" mechanism to extend this model for user-defined factories.

The <res-auth> tells the server who is responsible for authentication. It can have two values: CONTAINER or SERVLET. If CONTAINER is specified, the servlet container (the J2EE server) handles authentication before binding the factory to JNDI, using credentials provided by the deployer. If SERVLET is specified, the servlet must handle authentication duties programmatically. To demonstrate:

InitialContext initCtx = new InitialContext( );
DataSource source =
  (DataSource) initCtx.lookup("java:comp/env/jdbc/primaryDB");

// If "CONTAINER"
Connection con1 = source.getConnection( );

// If "SERVLET"
Connection con2 = source.getConnection("user", "password");

These tags too have similar counterparts in the EJB deployment descriptor. The only difference is that in EJB the two possible values for <res-auth> are Container and Application (note the inexplicable case difference).

Servlet Distribution in a J2EE Environment

The final difference between servlets in a standalone environment and servlets in a J2EE environment involves a subtle change to the rules for session distribution. While a standard web server is required to support only java.io.Serializable objects in the session for a distributable application, a J2EE-compliant server that supports a distributed servlet container must also support several additional types of objects:

All these are interfaces that do not implement Serializable. For transferring the objects the container may use its own custom mechanism, perhaps based on serialization or perhaps not. Additional class types may be supported at the server's discretion, but these are the only guaranteed types.



Copyright © 2009 O'Reilly Media, Inc.