ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)


JBoss Optimizations 101

by Sacha Labourey and Juha Lindfors
05/28/2003

This article introduces a fictitious application to show some basic configuration mistakes, frequently made by new J2EE developers, that lead to bad application performance. While the application is fictitious, all surrounding issues and optimizations we cover originate from real-life consulting experiences in the J2EE and JBoss fields.

Presentation of the CMS101

CMS101 is a basic content management server (CMS) implemented as a J2EE application. Its purpose is to serve HTML pages that are dynamically built from a set of basic elements: a header, a footer, a left side, a right side, and the page content.

Usually, different contents will share the same header, footer, and left and right side.

Figure 1. Structure of a web page for CMS101

Obviously, the flexibility offered by this CMS is not overwhelming and will only satisfy limited scenarios, so don't try this at work. However, its pedagogic interest remains intact.

At the beginning of the project, after several passionate meetings, CMS101's architecture team ended up with the EJB design excerpted here:

Figure 2. UML design of CMS101

A web page is identified by a name and has links to a header, a footer, and left and right sides. The page content is versioned: the current production version number is stored in the WebPage EJB and different versions are stored in the PageContent EJB. All of these EJBs are Container Managed Persistence (CMP) entity beans. An EJB Page Renderer stateless session bean handles web page creation. It starts the transaction and sequentially asks each component composing the web page for its HTML output:

Figure 3. Page-rendering activity chart


Note: for the sake of simplicity, only access to the Header EJB is shown in the picture above. However, access to the footer and left and right sides is similar.
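The rendering sequence described above can be sketched in plain Java. This is only an illustration, not the CMS101 source: the `HtmlComponent` interface, the `render` method, and the order in which the parts are concatenated are assumptions made for the example, and ordinary objects stand in for the entity beans.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the PageRenderer session bean (not the real CMS101 code). */
final class PageRenderingSketch {

    /** Anything that can contribute an HTML fragment: header, footer, sides, content. */
    interface HtmlComponent {
        String getHtml();
    }

    /** Asks each component in turn for its HTML output and concatenates the results. */
    static String render(HtmlComponent header, HtmlComponent left,
                         HtmlComponent content, HtmlComponent right,
                         HtmlComponent footer) {
        StringBuilder html = new StringBuilder("<html><body>");
        // The order below is an assumption; the article does not prescribe one.
        for (HtmlComponent part : Arrays.asList(header, left, content, right, footer)) {
            html.append(part.getHtml());
        }
        return html.append("</body></html>").toString();
    }

    public static void main(String[] args) {
        String page = render(() -> "<h1>CMS101</h1>", () -> "<div>left</div>",
                () -> "<p>Hello</p>", () -> "<div>right</div>", () -> "<hr/>");
        System.out.println(page);
    }
}
```

In the real application, each `getHtml()` call would be an entity bean invocation inside the transaction started by the session bean, which is what makes the locking behavior discussed next so important.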

A few days later, the development team comes up with a first version of CMS101 and deploys it under JBoss 3.2. After a successful deployment, the team defines a set of web pages for testing and starts playing with CMS101. The result is satisfying. However, as real professionals, they decide to put a scalability test suite into place!

First Headaches: Heavy Locking

While simple testing produced satisfactory results, the development team quickly detects that their application doesn't scale when they run the scalability test suite. For example, some of their test web pages contain code that must get salary information from a remote ERP system. This operation, while not CPU-intensive for the application server, takes a few seconds (network latency, ERP processing time, etc.). They observe that, while such a page is being processed, no other page can be rendered by CMS101! What is the reason for that problem?

By default, JBoss uses pessimistic locking. An entity bean that is enrolled in a transaction cannot be used by another transaction until the one that uses the bean has committed or rolled back. Consequently, as the transaction is started by the PageRenderer EJB, all entity beans used by PageRenderer are only "released" once the whole work is finished.

As all web pages of the CMS101 test suite use the same header and footer, heavy contention takes place:

Figure 4. Activity chart showing pessimistic locking

When you face entity bean access contention problems that lead to bad performance of your J2EE application, there is no (magic) general solution that can be used; each scenario must be individually analyzed.

In the current scenario, the entity beans have no real need to be enrolled in the transaction: each bean is accessed only once during the transaction, and none of its fields is updated. JBoss handles this situation by letting you mark either an entire EJB or just a subset of its methods as "read-only." When a read-only method (or EJB) is invoked, JBoss still prevents concurrent access to the same bean instance, but the bean is not enrolled in the transaction and is not locked for the whole transaction lifetime. Consequently, other transactions can use it immediately for their own work.
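The difference between the two behaviors can be illustrated with a toy model. The sketch below is not JBoss code: `LockTable`, `invoke`, and `end` are hypothetical names, a `false` return value stands in for "the caller would have to wait," and the short per-invocation lock that JBoss still takes on a bean instance is deliberately ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of transaction-long entity locks; all names here are hypothetical. */
final class EntityLockingSketch {

    /** Remembers which transaction currently holds each bean. */
    static final class LockTable {
        private final Map<String, String> ownerByBean = new HashMap<>();

        /**
         * Simulates invoking a method on a bean on behalf of a transaction.
         * Returns false where a real container would make the caller wait.
         */
        boolean invoke(String tx, String bean, boolean readOnlyMethod) {
            if (readOnlyMethod) {
                return true; // not enrolled in the transaction, so no long-lived lock
            }
            String owner = ownerByBean.putIfAbsent(bean, tx);
            return owner == null || owner.equals(tx);
        }

        /** Commit or rollback: releases every bean the transaction had enrolled. */
        void end(String tx) {
            ownerByBean.values().removeIf(tx::equals);
        }
    }

    public static void main(String[] args) {
        LockTable locks = new LockTable();
        System.out.println(locks.invoke("tx1", "Header", false)); // tx1 enrolls Header
        System.out.println(locks.invoke("tx2", "Header", false)); // tx2 is blocked
        System.out.println(locks.invoke("tx2", "Header", true));  // read-only call gets through
        locks.end("tx1");
        System.out.println(locks.invoke("tx2", "Header", false)); // Header is free again
    }
}
```

With the default behavior, a slow page holding `Header` keeps every other page waiting until its transaction ends; with read-only access, nothing is held between calls.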

Our brilliant development team decides that all calls to methods prefixed by get are read-only, so they modify the jboss.xml deployment descriptor accordingly:

<jboss>
  <enterprise-beans>

    <entity>
      <ejb-name>WebPage</ejb-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
    </entity>

    <entity>
      <ejb-name>PageContent</ejb-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
    </entity>

    <entity>
      <ejb-name>Header</ejb-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
    </entity>

    <!-- and so on for Footer, LeftSide and RightSide -->

    <session>
      <ejb-name>PageRenderer</ejb-name>
      <jndi-name>PageRenderer</jndi-name>
    </session>

  </enterprise-beans>
</jboss>
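A name pattern such as get* selects methods purely by name, so it is worth checking which methods it would catch. The following sketch assumes the simplest possible semantics (a trailing `*` matches any suffix, anything else must match exactly); it is not taken from the JBoss source, and the method names are made up for the example.

```java
/** Hypothetical matcher illustrating name-based selection of read-only methods. */
final class MethodPatternSketch {

    /** Assumed semantics: a trailing '*' matches any suffix; otherwise exact match. */
    static boolean matches(String pattern, String methodName) {
        if (pattern.endsWith("*")) {
            String prefix = pattern.substring(0, pattern.length() - 1);
            return methodName.startsWith(prefix);
        }
        return pattern.equals(methodName);
    }

    public static void main(String[] args) {
        // Made-up method names, used only to exercise the matcher.
        String[] methods = {"getHtml", "setHtml", "getAndIncrementHits"};
        for (String method : methods) {
            System.out.println(method + " read-only? " + matches("get*", method));
        }
    }
}
```

Under this assumption, a method such as the made-up `getAndIncrementHits` would be treated as read-only even though it modifies state, so the convention only works if every get-prefixed method really is free of side effects.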

They then decide to re-run the test suite to admire the new results. However, they still have mixed feelings. While the concurrency and overall performance of their application are much better, their database suffers from heavy access and quickly becomes the bottleneck of their solution.

When Loading Data Once May Be Enough ...

The reason for this heavy database usage comes from the cache, or, more accurately, from the absence of cache. For entity beans, the EJB specification defines three commit options, which can be split into two main categories:

  1. I own the database (AKA Commit Option A): If any data must be modified, it will be done through the container. The container is the only point of write access to the database. As such, the container can cache data across transactions without the risk of having an unsynchronized cache.
  2. I don't own the database (AKA Commit Options B or C): Data may be modified by systems other than the EJB container. Consequently, the EJB container cannot keep data in cache across transactions, as it may have been modified externally. It must reload the required data from the database for each transaction.

By default, JBoss' entity bean containers are configured not to use the cache (i.e., Commit Option B). As the CMS101 development team hasn't changed the default configuration, the database becomes the bottleneck. For each page request, the page description, content, header, footer, and left and right sides are reloaded!
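A small simulation makes the cost difference concrete. It is a simplified model rather than container code: `EntityContainer` and `loadsFor` are hypothetical names, the database is replaced by a counter, and Options B and C are both reduced to "state is discarded when the transaction ends."

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model comparing database loads with and without a cross-transaction cache. */
final class CommitOptionSketch {

    static final class EntityContainer {
        private final boolean cacheAcrossTransactions; // true is roughly Commit Option A
        private final Map<String, String> cache = new HashMap<>();
        int databaseLoads = 0;

        EntityContainer(boolean cacheAcrossTransactions) {
            this.cacheAcrossTransactions = cacheAcrossTransactions;
        }

        /** Returns the bean state, loading it from the simulated database if needed. */
        String read(String primaryKey) {
            String state = cache.get(primaryKey);
            if (state == null) {
                databaseLoads++; // stands in for a SELECT
                state = "state-of-" + primaryKey;
                cache.put(primaryKey, state);
            }
            return state;
        }

        /** Without Option A, cached state cannot be trusted past the transaction. */
        void endTransaction() {
            if (!cacheAcrossTransactions) {
                cache.clear();
            }
        }
    }

    /** Renders the same page repeatedly, one transaction per request. */
    static int loadsFor(boolean optionA, int pageRequests) {
        EntityContainer container = new EntityContainer(optionA);
        String[] beans = {"WebPage", "PageContent", "Header", "Footer", "LeftSide", "RightSide"};
        for (int i = 0; i < pageRequests; i++) {
            for (String bean : beans) {
                container.read(bean);
            }
            container.endTransaction();
        }
        return container.databaseLoads;
    }

    public static void main(String[] args) {
        System.out.println("Option B/C loads: " + loadsFor(false, 100));
        System.out.println("Option A loads:   " + loadsFor(true, 100));
    }
}
```

In this model, 100 requests for the same page cost 600 loads without the cache and 6 with it, which is the effect the team observes on its database.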

Note: I have even seen a real-life situation where CMP 1.1 was used and all data that was read during the transaction was written back at the end of the transaction, even though no fields had been changed.

As the database used by CMS101 is only used by their application, they decide to activate caching and switch to Commit Option A:

<jboss>
  <enterprise-beans>

    <entity>
      <ejb-name>WebPage</ejb-name>
      <configuration-name>CMP 2.x and Cache</configuration-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
    </entity>

    <entity>
      <ejb-name>PageContent</ejb-name>
      <configuration-name>CMP 2.x and Cache</configuration-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
    </entity>

    <!-- and so on for Header, Footer,
    LeftSide and RightSide -->

    <session>
      <ejb-name>PageRenderer</ejb-name>
      <jndi-name>PageRenderer</jndi-name>
    </session>

  </enterprise-beans>

  <!--
  We define a new configuration that simply overrides
  the default CMP 2.x configuration defined in
  conf/standardjboss.xml by changing its commit
  option. In the jboss.xml DTD, container-configurations
  is a child of jboss and follows enterprise-beans.
  -->
  <container-configurations>
    <container-configuration extends=
      "Standard CMP 2.x EntityBean">
      <container-name>CMP 2.x and Cache</container-name>
      <commit-option>A</commit-option>
    </container-configuration>
  </container-configurations>

</jboss>

The development team runs the test suite again and sees that both the scalability and level of database usage are excellent! After a little more testing, they will be ready to go into production and sell their (highly value-added) CMS101!

Cluster, You Said Cluster? Oops, Forgot that Detail ...

After months of prospecting, the CMS101 commercial team finds its first customer. There is, however, a small discrepancy between what the sales force has sold and what the development team has implemented (which is quite an unusual situation): the customer expects a high number of requests on its web site and thus wants CMS101 to run in a cluster to balance the load.

Clustering CMS101 is not a problem in itself, as JBoss supports clustering features. The problem is that by doing so, they will lose the performance optimizations they just implemented through Commit Option A. By running a cluster of JBoss instances, more than one JBoss node will access the same database. Furthermore, they will not only read data, but may also update web page content, for example. Consequently, we now have as many points of write access to the database as we have JBoss instances in the cluster. If a user modifies a web page on a specific JBoss node, the database and the local cache will be updated. However, the other JBoss instances will never reload fresh data from the database, instead using their own caches, now containing stale data.

Figure 5. Unsynchronized cache data

Once again, let's analyze the specific requirements of this application. In the clustered case, our problem is that the data is never refreshed in the other nodes' caches. Consequently, we need a way to force other nodes' caches to reload a specific bean from the database when it is modified on another node. The node that modifies data must send some kind of invalidation message to the other nodes' caches. Luckily, the cache invalidation message doesn't need to be sent transactionally to the other caches: we're dealing with web pages, not bank accounts.

For these scenarios, JBoss incorporates a handy tool: the cache invalidation framework. It provides automatic invalidation of cache entries in a single node or across a cluster of JBoss instances. As soon as an entity bean is modified on a node, an invalidation message is automatically sent to all related containers in the cluster and the related entry is removed from the cache. The next time the data is required by a node, it will not be found in cache, and will be reloaded from the database:

Figure 6. Cache invalidation framework
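The stale-cache problem and its fix can be reproduced with a few lines of plain Java. This is a sketch of the idea only: `Cluster`, `Node`, and `readAfterRemoteUpdate` are hypothetical names, the invalidation is delivered synchronously for simplicity, and nothing here reflects the actual JBoss implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of per-node caches in front of a shared database. */
final class CacheInvalidationSketch {

    static final class Cluster {
        final Map<String, String> database = new HashMap<>();
        final List<Node> nodes = new ArrayList<>();
        final boolean invalidationEnabled;

        Cluster(boolean invalidationEnabled) {
            this.invalidationEnabled = invalidationEnabled;
        }

        Node addNode() {
            Node node = new Node(this);
            nodes.add(node);
            return node;
        }
    }

    static final class Node {
        private final Cluster cluster;
        private final Map<String, String> cache = new HashMap<>();

        Node(Cluster cluster) {
            this.cluster = cluster;
        }

        /** Serves from the local cache, loading from the database on a miss. */
        String read(String key) {
            return cache.computeIfAbsent(key, cluster.database::get);
        }

        /** Updates the database and local cache, then tells the other nodes to evict. */
        void write(String key, String value) {
            cluster.database.put(key, value);
            cache.put(key, value);
            if (cluster.invalidationEnabled) {
                for (Node other : cluster.nodes) {
                    if (other != this) {
                        other.cache.remove(key); // the "invalidation message"
                    }
                }
            }
        }
    }

    /** Returns what node B sees after node A updates a header that B had cached. */
    static String readAfterRemoteUpdate(boolean invalidationEnabled) {
        Cluster cluster = new Cluster(invalidationEnabled);
        Node a = cluster.addNode();
        Node b = cluster.addNode();
        a.write("Header", "v1");
        b.read("Header");        // B now caches v1
        a.write("Header", "v2"); // update happens on A
        return b.read("Header");
    }

    public static void main(String[] args) {
        System.out.println("Without invalidation: " + readAfterRemoteUpdate(false));
        System.out.println("With invalidation:    " + readAfterRemoteUpdate(true));
    }
}
```

Without invalidation, node B keeps serving the old header; with it, the entry is evicted and the next read goes back to the database.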

To activate this behavior in JBoss, the development team has to run JBoss clustered and modify the jboss.xml deployment descriptor:

<jboss>
  <enterprise-beans>

    <entity>
      <ejb-name>WebPage</ejb-name>
      <configuration-name>Standard CMP 2.x with cache invalidation</configuration-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
      <cache-invalidation>True</cache-invalidation>
    </entity>
      
    <entity>
      <ejb-name>PageContent</ejb-name>
      <configuration-name>Standard CMP 2.x with cache invalidation</configuration-name>
      <method-attributes>
        <method>
          <method-name>get*</method-name>
          <read-only>true</read-only>
        </method>
      </method-attributes>
      <cache-invalidation>True</cache-invalidation>
    </entity>
      
      <!-- and so on for Header, Footer, 
        LeftSide and RightSide -->
      
    <session>
      <ejb-name>PageRenderer</ejb-name>
      <jndi-name>PageRenderer</jndi-name>
    </session>

  </enterprise-beans>
</jboss>

Note that we have removed our customized container configuration, instead using the one named "Standard CMP 2.x with cache invalidation," pre-defined in conf/standardjboss.xml. No additional configuration is required to get this behavior. Many other fancy designs can be built using this framework.

Note: JBoss 4.0 will not only contain the distributed invalidation framework, but will also include a full-fledged transactional distributed cache.

This way, the development team can keep all of the advantages from previous optimizations and get even better throughput, thanks to the cluster, all the while satisfying the customer requirements.

Conclusion

The basic optimizations described in this article do not just apply to CMS101, but to any kind of J2EE application with similar data taxonomy. Remember that this analysis can be done on a per-EJB basis, not just for your entire application as a monolithic whole.

Long life to CMS101 and see you on the JBoss.org forums!

Sacha Labourey is one of the core developers of JBoss Clustering and the General Manager of JBoss Group Europe.

Juha Lindfors is a computer scientist at the University of Helsinki.



Copyright © 2009 O'Reilly Media, Inc.