Clustering and Load Balancing in Tomcat 5, Part 1
by Srini Penchikala
The latest version of the Tomcat servlet container provides clustering and load balancing capabilities that are essential for deploying scalable and robust web applications. The first part of this article provides an overview of installation, configuration, usage, and extension of clustering and load balancing features. The second will introduce a sample web application to demonstrate the steps involved in configuring Tomcat server instances to enable clustering, and will study session persistence using in-memory replication in the cluster environment.
The Tomcat 5 server comes with a rules-based load balancer application. Two simple custom load balancing rules (extending the rules API) were written based on round-robin and random algorithms to redirect incoming web requests. Performance benchmarks for the sample web application running in the cluster environment are presented. The load testing tool JMeter was used to simulate multiple web users to study the load-balancing mechanism.
Since this article concentrates mainly on demonstrating the clustering capabilities in the Tomcat servlet container, J2EE application clustering to replicate EJB, JNDI, and JMS objects is not discussed here. Refer to the articles "J2EE Clustering" and "J2EE Clustering with JBoss" for EJB and JMS clustering.
Large-Scale System Design
Enterprise web portal applications must provide scalability and high availability (HA) in order to serve the thousands of users hitting a corporate web site. Scalability is the system's ability to support a growing number of users by adding servers to the cluster. High availability essentially means redundancy: if a cluster member fails for some reason, another member in the cluster can transparently take over its web requests. Deploying a web portal application in a cluster environment provides the scalability, reliability, and high availability the application requires. The main goal of clustering is to prevent web site outages caused by a Single Point of Failure (SPoF) in the system.
Large-scale system design provides mission-critical services with minimal downtime and maximum scalability in an enterprise application environment. Rather than run a single server, you run multiple cooperating servers. To scale, you add machines to the cluster; to minimize downtime, you make sure every component of the cluster is redundant. The main ingredient of a large-scale system is clustering, which includes load balancing, fault tolerance, and session state persistence. For web applications, a hardware- or software-based load balancer usually sits in front of the application servers in the cluster. It distributes load among the cluster nodes by redirecting web traffic to an appropriate cluster member, while also detecting server failures.
A cluster is defined as a group of application servers that transparently run a J2EE application as if it were a single entity. There are two methods of clustering: vertical scaling and horizontal scaling. Vertical scaling is achieved by increasing the number of servers running on a single machine, whereas horizontal scaling is done by increasing the number of machines in the cluster. Horizontal scaling is more reliable than vertical scaling, since there are multiple machines involved in the cluster environment, as compared to only one machine. With vertical scaling, the machine's processing power, CPU usage, and JVM heap memory configurations are the main factors in deciding how many server instances should be run on one machine (also known as the server-to-CPU ratio).
The servers in a J2EE cluster are usually configured using one of three approaches. In the independent approach, each application server has its own file system with its own copy of the application files. In the shared file system approach, the cluster uses a single storage device from which all application servers obtain the application files. In the managed approach, an administrative server controls access to application content and is responsible for "pushing" appropriate content to the managed servers. The admin server ensures that all servers in the cluster have the application available, updates all servers when an application is deployed, and removes the application from all servers when it is undeployed.
Clustering can be done at various tiers of a J2EE application, including the database tier. Some database vendors offer clustered databases that replicate data between multiple database servers while providing client transparency: the client (usually a servlet container or application server) doesn't have to know which database server it's connecting to in order to get the data. Examples of database clustering are Oracle9i's Real Application Clusters (RAC) and Clustered JDBC (C-JDBC). RAC supports failover of database connections, transparently rerouting JDBC connections and database requests from a failed node to a surviving one. C-JDBC is an open source database cluster that allows a web application to transparently access a cluster of databases through a JDBC driver. It not only load balances JDBC connections among the database nodes in the cluster, but also fails over to a secondary database server.
Clustering in Tomcat
Clustering was available in the previous Tomcat version (version 4.1) as a third-party JAR file; it wasn't very easy to install or configure to make multiple Tomcat instances run in a cluster. JavaGroups is a popular choice for adding clustering capabilities in open source servlet containers (Tomcat) and application servers (JBoss). But in the latest version of Tomcat server, clustering comes as part of the main installation package. This minimizes all of the extra effort that goes into integrating third-party clustering implementations into the Tomcat server.
In a typical cluster environment, the servers in the cluster need to communicate with each other in order to cooperate and replicate state. This group communication is achieved either by point-to-point RMI (TCP/IP) or via IP multicast. Most J2EE application servers (such as JBoss, Oracle, WebLogic, and Borland) use IP multicast to send state, updates, and heartbeats to one another in the cluster. In Tomcat, all of the cluster members talk to each other using multicast ping messages. Each Tomcat instance broadcasts a message containing its IP address and the TCP port on which it listens for session replication. If the other members do not receive an instance's message within a given time frame, that instance is considered down.
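Concretely, this membership mechanism is configured through the Cluster element in Tomcat's server.xml. The sketch below is modeled on the commented-out example in Tomcat 5's default configuration; the multicast address, ports, and timing values are illustrative and should be tuned for your environment.

```xml
<!-- Illustrative Cluster element, nested inside <Engine> or <Host> in
     server.xml; values are examples, not requirements -->
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true">

  <!-- Multicast membership: each node announces itself on this group -->
  <Membership className="org.apache.catalina.cluster.mcast.McastService"
              mcastAddr="228.0.0.4"
              mcastPort="45564"
              mcastFrequency="500"
              mcastDropTime="3000"/>

  <!-- TCP listener for incoming session replication messages -->
  <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
            tcpListenAddress="auto"
            tcpListenPort="4001"
            tcpSelectorTimeout="100"
            tcpThreadCount="6"/>

  <!-- Sender pushes session changes to the other members -->
  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
          replicationMode="pooled"/>

  <!-- Skip replication for requests that cannot modify the session -->
  <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
         filter=".*\.gif;.*\.js;.*\.css;"/>
</Cluster>
```

The mcastFrequency and mcastDropTime attributes correspond directly to the heartbeat behavior described above: each member pings every 500 ms, and a member that stays silent for 3 seconds is dropped from the cluster view.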
Another popular concept in clustering, called farming, provides cluster-wide hot deployment of web applications. In a server farm, a web application is deployed by copying an application's WAR file to only one node in the cluster; farming will take care of deploying the web application across the entire cluster. Similarly, removing the WAR file from a single cluster node will result in undeploying the web application from all the nodes in the cluster. The Tomcat clustering documentation mentions that a future Tomcat version will support farming capability.
Load balancing is a mechanism by which incoming load is distributed among the nodes of a server cluster according to a load-balancing policy. Rather than execute an application on a single server, the system executes application code on a dynamically selected server. When a client requests a service, one (or more) of the cooperating servers is chosen to execute the request. Load balancers act as the single point of entry into the cluster and as traffic directors for the individual web or application servers.
Two popular methods of load balancing in a cluster are DNS round robin and hardware load balancing. DNS round robin provides a single logical name for the cluster and returns the IP address of any one of its nodes. This option is inexpensive, simple, and easy to set up, but it provides no server affinity or high availability. Hardware load balancing overcomes these limitations through virtual IP addressing: the load balancer exposes a single IP address for the cluster that maps to the addresses of the individual machines, receives each request, and rewrites its headers to direct it to a machine in the cluster. If a machine is removed from the cluster, the change takes effect immediately. The advantages of hardware load balancing are server affinity and high availability; the disadvantages are that it is expensive and complex to set up.
There are many different algorithms for defining the load-distribution policy, ranging from a simple round robin algorithm to more sophisticated schemes. Some of the commonly used algorithms are:
- Round robin
- Random
- Weight-based
- Minimum load
- Last access time
- Programmatic parameter-based (where the load balancer can choose a server based upon method input arguments)
Load-balancing algorithms differ in statistical variance, speed, and simplicity. For example, the weight-based algorithm takes longer to compute than the other algorithms. For a more detailed explanation of load balancing, refer to the ONJava article "Load Balancing Web Applications."
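The round-robin and random policies mentioned above are simple enough to sketch in a few lines of plain Java. The class and method names below are invented for this illustration and are not part of the balancer application's API; they merely show the selection logic behind the two algorithms.

```java
import java.util.List;
import java.util.Random;

// Illustrative sketch of two simple load-distribution policies.
// This is NOT the Tomcat balancer API; names are invented for the example.
class SimpleBalancer {
    private final List<String> members;   // cluster member names/URLs
    private int next = 0;                 // round-robin cursor
    private final Random random = new Random();

    SimpleBalancer(List<String> members) {
        this.members = members;
    }

    // Round-robin: each call returns the next member in order, wrapping around.
    synchronized String roundRobin() {
        String member = members.get(next);
        next = (next + 1) % members.size();
        return member;
    }

    // Random: each call returns an arbitrary member.
    String randomMember() {
        return members.get(random.nextInt(members.size()));
    }
}
```

Round robin guarantees an even distribution of requests but ignores how busy each node actually is, while the random policy is even simpler and converges to an even distribution only over many requests; both are stateless with respect to server load, which is what makes them easy to implement as custom rules.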
Load Balancing in Tomcat
Previous Tomcat versions did not provide load balancing out of the box. A popular choice has been to put the Apache web server in front of the Tomcat servlet container to handle web requests and balance load. In such an Apache-Tomcat setup, each Tomcat instance is configured as a worker, and a special load-balancing worker in the connector module distributes the requests among them.
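For context, that classic Apache-fronted setup is configured in the mod_jk connector's workers.properties file. The sketch below is illustrative only; the worker names, hosts, and ports are placeholders for a two-node setup.

```properties
# Illustrative workers.properties for mod_jk (names, hosts, ports are placeholders)
worker.list=loadbalancer

# One AJP13 worker per Tomcat instance
worker.tomcat1.type=ajp13
worker.tomcat1.host=localhost
worker.tomcat1.port=8009
worker.tomcat1.lbfactor=1

worker.tomcat2.type=ajp13
worker.tomcat2.host=localhost
worker.tomcat2.port=9009
worker.tomcat2.lbfactor=1

# The lb worker distributes requests among the balanced workers
worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=tomcat1,tomcat2
```

The lbfactor values weight the distribution: giving one worker a higher factor sends it proportionally more requests.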
Tomcat 5 provides load balancing in three different ways: using the JK native connector, using Apache 2 with mod_rewrite, or using the balancer web application. In this article, we concentrate on the third option, using the balancer web application to redirect web requests to different nodes in the cluster. The load balancer application is a rules-based application that uses the servlet filter mechanism to redirect incoming web requests to the next available member in the cluster. Servlet filters were introduced in the Servlet 2.3 specification and are used for a variety of tasks in a web application, such as JAAS authentication, encryption, logging and auditing, data compression, and XSLT transformation of XML content. As mentioned on the Tomcat balancer web site, the balancer application is not designed as a replacement for other, more robust load-balancing mechanisms. Rather, it's a simple and extensible way to direct traffic among multiple servers. Check out the sample Java classes provided with the balancer application to understand how load balancing is achieved in different ways using different rules criteria.
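To give a sense of the wiring, the balancer application registers its filter in its deployment descriptor (web.xml). The snippet below is modeled on the descriptor shipped with the Tomcat 5 balancer app; verify the filter class and URL pattern against the copy bundled with your Tomcat distribution.

```xml
<!-- Balancer filter declaration (modeled on the bundled web.xml;
     check the filter class and URL pattern in your Tomcat 5 copy) -->
<filter>
  <filter-name>BalancerFilter</filter-name>
  <filter-class>org.apache.webapp.balancer.BalancerFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>BalancerFilter</filter-name>
  <url-pattern>/Balancer</url-pattern>
</filter-mapping>
```

Because it is an ordinary servlet filter, the balancer intercepts matching requests before any servlet runs, which is what lets it redirect traffic without the application being aware of it.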
Load balancing is enabled by creating a rules configuration file (called rules.xml) that contains the rules and redirection URLs. The balancer filter consults the RuleChain to determine where to redirect a request, checking the rules in the same order as they are defined in the rules.xml file. As soon as a Rule matches the criteria, the filter stops the evaluation and redirects the request to the URL specified for the matching rule.
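A rules.xml file along these lines ships as a sample with the balancer application. The sketch below is modeled on that sample; the rule classes live in the org.apache.webapp.balancer.rules package, and the redirect URLs here are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative rules.xml, modeled on the balancer app's sample;
     redirect URLs are placeholders. Rules are evaluated top to bottom. -->
<rules>
  <!-- Redirect requests whose URL contains the target string -->
  <rule className="org.apache.webapp.balancer.rules.URLStringMatchRule"
        targetString="News"
        redirectUrl="http://server1.example.com/news"/>

  <!-- Redirect requests carrying a particular request parameter value -->
  <rule className="org.apache.webapp.balancer.rules.RequestParameterRule"
        paramName="site"
        paramValue="mirror"
        redirectUrl="http://server2.example.com/"/>

  <!-- Catch-all rule: matches anything the earlier rules did not -->
  <rule className="org.apache.webapp.balancer.rules.AcceptEverythingRule"
        redirectUrl="http://server3.example.com/"/>
</rules>
```

Because evaluation stops at the first match, an AcceptEverythingRule-style catch-all belongs last; the custom round-robin and random rules described earlier in this article plug into the same chain by supplying their own rule class names.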