Published on ONJava.com (http://www.onjava.com/)
 See this if you're having trouble printing code examples

Session Replication in Tomcat 5 Clusters, Part 1

by Srini Penchikala

The Tomcat 5 server provides built-in support for clustering and session replication. This first article in this series will provide an overview of session persistence and the inner works of session replication in Tomcat clusters. I will discuss how the session replication process works in Tomcat 5 and the replication mechanisms available for session persistence across the cluster nodes. In part two, I will discuss the details of a sample Tomcat cluster setup with session replication enabled, and compare different replication scenarios.


A traditional standalone (non-clustered) server does not provide any failover or load balancing capabilities. When the server goes down, the entire web site is unavailable until the server is brought back up. Any sessions that were stored in the server's memory are lost and the users have to log in again and enter all of the data lost due to the server crash.

On the other hand, a server that's part of a cluster provides both scalability and failover capabilities. A cluster is a group of multiple server instances running simultaneously and working together to provide high availability, reliability, and scalability. The server cluster appears to clients as a single server instance. Clustered server instances are pretty much the same as a standalone server from the client point of view, but they offer uninterrupted service and session data persistence by providing failover and session replication.

Server Communication in a Cluster

The application servers in a cluster use network technologies such as IP multicast and sockets to share information about the availability of the servers.

One-to-Many Communication Using IP Multicast

Tomcat server uses IP multicast for all one-to-many communications among server instances in the cluster. IP multicast is the broadcast technology that enables multiple servers to "subscribe" to a given IP address and port number and listen for messages. (Multicast IP addresses range from to Each server instance in the cluster uses multicast to broadcast regular heartbeat messages with its availability. By monitoring these heartbeat messages, server instances in a cluster determine when a server instance has failed. One limitation of using IP multicast for server communication is that it does not guarantee that the messages are actually received. For example, if an application's local multicast buffer is full, new multicast messages cannot be written to the buffer, and the application is not notified when messages are dropped.

Server Communication Using IP Sockets

IP sockets also provide the mechanism for sending messages and data among the servers in a cluster. The server instances use IP sockets for replicating HTTP session state among the cluster nodes. Proper socket configuration is crucial to the performance of a cluster. The efficiency of socket-based communication is dependent on the type of socket implementation (i.e. whether the system uses native or pure-Java socket reader implementation) and whether the server instance is configured to use enough socket reader threads, if the server uses pure-Java socket readers.

For best socket performance, configure the system to use the native sockets rather than the pure-Java implementation. This is because native sockets have less overhead compared to the Java-based socket implementation. Although the Java implementation of socket reader threads is a reliable and portable method of peer-to-peer communication, it does not provide the best performance for heavy-duty socket usage in the cluster. Native socket readers use more efficient techniques to determine if there is any data to read on a socket. With a native socket reader implementation, reader threads do not need to poll inactive sockets: they service only active sockets, and they are immediately notified when a given socket becomes active. With pure-Java socket readers, threads must actively poll all opened sockets to determine if they contain any data to read. In other words, socket reader threads are always "busy" polling sockets, even if the sockets have no data to read. This unnecessary overhead can reduce performance.

Related Reading

Tomcat: The Definitive Guide
By Jason Brittain, Ian F. Darwin

Clustering in Tomcat 5

Even though the clustering feature was available in the earlier versions of Tomcat 5, it has become more modular in later versions (5.0.19 or later). The Cluster element in server.xml was refactored so that we can replace different parts of the cluster without affecting the other elements. For example, currently the configuration sets the membership service to be multicast discovery. This can easily be changed to a membership service that uses TCP or Unicast instead, without changing the rest of the clustering logic.

Other cluster elements, such as the session manager, replication sender, and replication receiver, can also be replaced with custom implementations without affecting the rest of the cluster configuration. Also, any server component in the Tomcat cluster can now use the clustering API to send messages to any or all members of the cluster.

Session Replication

There are two kinds of sessions handled by a server cluster: sticky sessions and replicated sessions. Sticky sessions are those sessions that stay on the single server that received the web request. The other cluster members don't have any knowledge of the session state in the first server. If the server that has the session goes down, the user has to again log in to the web site and re-enter any data stored in the session.

In the second session type, the session state in one server is copied to all of the other servers in the cluster. The session data is copied whenever the session is modified. This is a replicated session. Both sticky and replicated sessions have their advantages and disadvantages. Sticky sessions are simple and easy to handle, since we don't need to replicate any session data to other servers. This results in less overhead and better performance. But if the server goes down, so does all of the session data stored in its memory. Since the session data is not copied to other servers, the session is completely lost. This can cause problems if we are in the middle of processing a transaction type of query and lose all of the data that has been entered.

To support automatic failover for servlet and JSP HTTP session states, Tomcat server replicates the session state in memory. This is done by copying the data stored in the session (attributes) on one server to the other members in the cluster to prevent any data loss and allow for failover.

State Management of Objects

There are four categories of objects that are distinguished by how they maintain state on the server:

(Source: "Using WebLogic Server Clusters")

Design Considerations for Session Replication

Network Considerations

It is important to isolate the cluster multicast address with other applications. We don't want the cluster configuration or the network topology to interfere with multicast server communications. Sharing the cluster multicast address with other applications forces clustered server instances to process unnecessary messages, introducing overhead. Sharing a multicast address may also overload the IP multicast buffer and delay transmission of the server heartbeat messages. Such delays can result in a server instance being marked as dead, simply because its heartbeat messages were not received in a timely manner.

Programming Considerations

In addition to the network-related factors mentioned above, there are some design considerations related to the way we write J2EE web applications that also affect session replication. Following is a list of some of these programming considerations:

Session Replication in Tomcat 5

Prior to version 5, Tomcat server only supported sticky sessions (using the mod_jk module for load balancing purposes). If we needed session replication, we had to rely on third-party software such as JavaGroups to implement it. Tomcat 5 server comes with session replication capabilities. Similar to the clustering feature, session replication is enabled just by modifying the server.xml configuration file.

Martin Fowler talks about three session-state persistence patterns in his book Enterprise Patterns. These patterns are:

  1. Client Session State: Stores session state on the client.
  2. Server Session State: Keeps the session state on a server system in a serialized form.
  3. Database Session State: Stores session data as committed data in the database.

Tomcat supports these three session persistence types:

  1. In-memory replication: Session state is replicated in the JVM's memory, using the SimpleTcpCluster and SimpleTcpClusterManager classes that ship with the Tomcat 5 installation. These classes are in the package org.apache.catalina.cluster and are part of server/lib/catalina-cluster.jar.
  2. Database persistence: In this type, the session state is stored in a relational database and the server retrieves session information from the database, using the JDBCManager class. This class is in the org.apache.catalina.session.JDBCStore package, and is part of the file catalina.jar.
  3. File-based persistence: Here, the session state is saved to a file system, using the PersistenceManager class. This class is in the org.apache.catalina.session.FileStore package and is part of catalina.jar.

Elements of Tomcat Cluster and Session Replication

This section briefly explains the elements comprising Tomcat cluster and session replication.


This is the main element in the cluster. The class SimpleTcpCluster represents the cluster element. It creates the ClusterManager for all of the distributable web contexts using the specified manager class name in server.xml.

Cluster Manager

This class takes care of replicating the session data across all of the nodes in the cluster. The session replication happens for all those web applications that have the distributable tag specified in the web.xml file. The cluster manager is specified in server.xml with the managerClassName attribute of the Cluster element. The cluster manager code is designed to be a separate element in the cluster; all we have to do is write a session manager class that implements the ClusterManager interface. This gives us the flexibility of using a custom cluster manager without affecting other elements in the cluster.

There are two replication algorithms. SimpleTcpReplicationManager replicates the entire session each time, while DeltaManager only replicates session deltas.

The simple replication manager copies the entire session on each HTTP request. This is more useful when the sessions are small in size, and if we have code like:

HashMap map = session.getAttribute("map");

Here, we don't need to specifically call the session.setAttribute() or removeAttribute methods to replicate the session changes. For each HTTP request, all of the attributes in the session are replicated. There is an attribute called useDirtyFlag that can be used to optimize the number of times a session is replicated. If this flag is set to true, we have to call setAttribute() method to get the session changes replicated. If it's set to false, the session is replicated after each request. SimpleTcpReplicationManager creates a ReplicatedSession to perform the session replication.

The delta manager is provided for pure performance reasons. It does one replication per request. It also invokes listeners, so if we call session.setAttribute(), the listeners on the other servers will be invoked. DeltaManager creates a DeltaSession to do the session replication.


The membership is established by all Tomcat instances sending broadcast messages on the same multicast IP and port. The broadcast message contains the IP address and TCP listen port of the server (the default IP address value is If an instance has not received the message within a given time frame (specified by the mcastDropTime parameter in the cluster configuration), the member is considered dead. The element is represented by the McastService class.

The attributes starting with mcastXXX are for the membership multicast ping. The following table lists the attributes used for IP multicast server communication.

Attribute Description
mcastAddr Multicast address (this has to be the same for all of the nodes)
mcastPort Multicast port number (this also has to be the same for all of the nodes)
mcastBindAddr IP address to bind the multicast socket to a specific address
mcastTTL Multicast Time To Live (TTL) to limit the broadcast
mcastSoTimeout Multicast read timeout (in milliseconds)
mcastFrequency Time in between sending "I'm alive" heartbeats (in milliseconds)
mcastDropTime Time before a node is considered dead (in milliseconds)


This element is represented by the ReplicationTransmitter class. When a multicast broadcast message is received, the member is added to the cluster. Upon the next replication request, the sending instance will use the host and port information and establish a TCP socket. Using this socket, it sends over the serialized data. There are three different ways to handle session replication in Tomcat 5. These are asynchronous, synchronous, and pooled replication modes. The following section explains how these modes work and the scenarios where each should be used.


This cluster element is represented by the ReplicationListener class. The attributes in the cluster configuration that start with tcpXXX are for the actual TCP session replication. The following table shows the attributes used to configure socket-based server communication for server replication.

Attribute Description
tcpThreadCount Number of threads to handle incoming replication requests
tcpListenAddress IP address for TCP cluster requests. (If this is set to auto, the address becomes the value of InetAddress.getLocalHost().getHostAddress().)
tcpListenPort The port number where the session replication is received from other cluster members.
tcpSelectorTimeout Timeout (in milliseconds)

Replication Valve

The replication valve is used to determine which HTTP requests need to be replicated. Since we don't usually replicate the static content (such as HTML and JavaScript, stylesheets and image files), we can filter the static content using the replication valve element. The valve is used to find out when the request is completed and initiate the replication.


The deployer element can be used to deploy apps cluster-wide. Currently, the deployment only deploys/undeploys to working members in the cluster so no WARs are copied upon startup of a broken node. The deployer watches a directory (watchDir) for WAR files when watchEnabled="true". When a new WAR file is added, the WAR gets deployed to the local instance, and is then deployed to the other instances in the cluster. When a WAR file is deleted from the watchDir the WAR is undeployed locally and cluster-wide.

All of the elements in the Tomcat cluster architecture, and their hierarchy, are shown in Figure 1.

Figure 1
Figure 1. Tomcat cluster hierarchy diagram. Click image for full-size screen shot.

How Session Replication Works in Tomcat

The following section briefly explains how the cluster nodes share the session information when a Tomcat server is started up or shut down. For more detailed explanation, refer to the Tomcat 5 Clustering documentation.

TC-01: First node in the cluster
TC-02: Second node in the cluster


In this article, I talked about session replication in a clustered environment, and some design considerations when creating J2EE applications with an in-memory session replication requirement. I also discussed the clustering elements in Tomcat 5 container that are specific to session replication. In part two of this series, we'll look at how to configure session replication in a Tomcat cluster using different session managers and replication modes.

Srini Penchikala is an information systems subject matter expert at Flagstar Bank.

Return to ONJava.com.

Copyright © 2009 O'Reilly Media, Inc.