ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)
 See this if you're having trouble printing code examples


Java Design and Performance Optimization

Tuning JDBC: Measuring JDBC performance

by Jack Shirazi, author of Java Performance Tuning
12/05/2001

Java Database Connectivity (JDBC) is used extensively by many Java applications. In this article, the first of a series on performance-tuning JDBC, I look at how to measure the performance of JDBC and how to identify which parts of your JDBC subsystem need optimizing.

The core java.sql interfaces

Our goal is to improve the performance of our application. Normally, your first step should be to profile the application to find the bottlenecks, and then address those bottlenecks. However, effectively profiling distributed applications can be difficult. I/O can show up as significant in profiling, simply because of the nature of a distributed application, which normally has threads spending time waiting on I/O. It can be unclear whether threads blocking on reads and writes are part of a significant bottleneck or simply a side issue. When profiling, it is usually worthwhile to have separate measurements available for the communication subsystems. So if we want to measure the performance of the JDBC subsystem, what do we measure?

If you look in the java.sql package, there are three interfaces that form the core of JDBC: Connection, Statement, and ResultSet. Normal interaction with a database consists of:

The following example method illustrates standard database interaction by accessing all of the columns from every row of a specified database table and storing the data from each row into a String [] array, putting all the rows in a vector:

  public static Vector getATable(String tablename, Connection Connection)
    throws SQLException
  {
    String sqlQuery = "SELECT * FROM " + tablename;
    Statement statement = Connection.createStatement();
    ResultSet resultSet = statement.executeQuery(sqlQuery);
    int numColumns = resultSet.getMetaData().getColumnCount();
    String[] aRow;
    Vector allRows = new Vector();
    while(resultSet.next())
    {
      aRow = new String[numColumns];
      for (int i = 0; i < numColumns; i++)
        //ResultSet access is 1-based, arrays are 0-based
        aRow[i] = resultSet.getString(i+1);
      allRows.addElement(aRow);
    }
    return allRows;
  }

There are no concrete implementations of Connection, Statement, or ResultSet in java.sql, or elsewhere in the public SDK. Each implementation of these and other JDBC interfaces is created by the producer of the database driver, and delivered as part of the database driver package. If you printed out the classname of the Connection object or other objects that you are using, you would probably see something like XXXConnection, XXXStatement, or XXXConnectionImpl, XXXStatementImpl, etc., where XXX would be the name of the database you are using, for example, Oracle.

If we wanted to measure the JDBC performance of the example getATable() method just presented, we could simply put calls to System.currentTimeMillis() at the beginning and end of the getATable() method, and print the time difference to find out how long it took. That technique would work where the database interaction is isolated, as in the getATable() method. But usually a Java application spreads its database interaction among many methods in many classes, and it is often difficult to isolate the database interaction. How can we measure database interactions in this more widespread situation?

One ideal way would be if all the JDBC classes had measurement capabilities built into them. Then we could simply turn on measurements whenever we wanted to see the performance of the database interactions. JDBC classes do not normally provide this feature, but perhaps we can replace them with classes that do. Our target is for the replacement classes to provide exactly the

Proxy objects

Wrapping objects of a particular interface using dedicated wrapper objects of that interface is an established technique with many uses. The synchronized wrappers of the collection classes are probably the best known example, but there are many others. The SDK even has a special class which will generate wrapper objects at runtime: the java.lang.reflect.Proxy class. Wrapper objects are also known as proxy objects. I could have used generated proxy objects in this article, but that would have made the explanation of wrapping JDBC objects more complicated than I wanted, so I stuck with explicitly coded wrapper classes.

functionality previously available, to have the additional capability of measuring database interactions, and also for the replacement to require very little change to and be transparent to the rest of the application. That's a tall bill.

Fortunately, when a framework is defined almost entirely in terms of interfaces, as JDBC is, it becomes very simple to replace any class with another implementation. That is, after all, the whole point of interfaces. In particular, you can always replace any implementation of an interface with a wrapper class that simply wraps the original class and forwards (or delegates, in OO speak) all of the method calls to that original class. Here, we can replace the JDBC classes we use with wrappers which wrap the original classes. We can embed our measuring capabilities in the wrapper classes and have those measurements execute throughout the application.

Wrapping the Connection class

We'll start by wrapping the Connection class. The following ConnectionWrapper class implements Connection. The class has one instance variable of Connection type, and a constructor that initializes that instance variable with the constructor parameter. Most of the Connection methods are simply defined to delegate the call to the instance variable:

package tuning.jdbc;

import java.sql.*;
import java.util.Map;

public class ConnectionWrapper implements Connection
{
  protected Connection realConnection;

  public Connection realConnection () {
    return realConnection;
  }

  public ConnectionWrapper (Connection Connection) {
    realConnection = Connection;
  }

  public void clearWarnings() throws SQLException {
    realConnection.clearWarnings();
  }

  public void close() throws SQLException {
    realConnection.close();
  }

  public boolean isClosed() throws SQLException {
    return realConnection.isClosed();
  }

   public void commit() throws SQLException {
    realConnection.commit();
  }
  
  ...

Related Reading

Java Performance TuningJava Performance Tuning
By Jack Shirazi
Table of Contents
Index
Sample Chapter
Full Description
Read Online -- Safari

I have left out most of the methods, but they follow the template of the ones shown here. Where previously you would have used a Connection object obtained from the database driver, you instead simply wrap that Connection object with the ConnectionWrapper, and use the ConnectionWrapper object. Wherever you obtain your Connection object, you simply need to add one extra line like the one shown here:

Connection dbConnection = getConnectionFromDriver();
  dbConnection = new ConnectionWrapper(dbConnection);

Obtaining connections is really the only part of the application that needs changing. This will require all the calls that obtain a Connection to be found and edited; however, most applications use a central factory class to provide their Connection objects, and in these cases it is very simple to add the ConnectionWrapper. The factory class frequently fronts a pool of connections, and there may be a little extra work to release the Connection back into the pool, since it will first need to be unwrapped, for example:

public static void releaseConnection(Connection conn)
{
  if (conn instanceof ConnectionWrapper)
    conn = ( (ConnectionWrapper) conn).realConnection();
  //carry on with original release code
  ...
}

We havent actually finished with the ConnectionWrapper class yet. There are some methods of the ConnectionWrapper class that are not simple delegations. These are the methods that provide various types of Statement objects:

public Statement createStatement() throws SQLException {
    return new StatementWrapper(realConnection.createStatement(), this);
  }

  public Statement createStatement(int resultSetType,
              int resultSetConcurrency) throws SQLException {
    return new StatementWrapper(
          realConnection.createStatement(resultSetType,
             resultSetConcurrency), this);
  }

  public CallableStatement prepareCall(String sql) throws SQLException {
    return new CallableStatementWrapper(
          realConnection.prepareCall(sql), this, sql);
  }

  public CallableStatement prepareCall(String sql, int resultSetType,
             int resultSetConcurrency) throws SQLException {
    return new CallableStatementWrapper(
          realConnection.prepareCall(sql, resultSetType,
                resultSetConcurrency), this, sql);
  }

  public PreparedStatement prepareStatement(String sql)
        throws SQLException {
    return new PreparedStatementWrapper(
          realConnection.prepareStatement(sql), this, sql);
  }

  public PreparedStatement prepareStatement(String sql, int resultSetType,
        int resultSetConcurrency) throws SQLException {
    return new PreparedStatementWrapper(
          realConnection.prepareStatement(sql, resultSetType,
               resultSetConcurrency), this, sql);
  }

As you can see, we have three types of Statement wrapper classes we need to define. In addition, there is one other wrapper class we need, for the DatabaseMetaData. This wrapper class is required for completeness, because DatabaseMetaData can return the Connection object used to create the DatabaseMetaData, so we need to make sure that the Connection object is our wrapped one, not the original unwrapped Connection.

  public DatabaseMetaData getMetaData() throws SQLException {
    return new DatabaseMetaDataWrapper(
          realConnection.getMetaData(), this);
  }

Wrapping the statement classes

The three statement classes, Statement, PreparedStatement, and CallableStatement, have similar simple wrappers that forward all of the calls:

public class StatementWrapper implements Statement
{
  protected Statement realStatement;
  protected ConnectionWrapper connectionParent;

  public StatementWrapper(Statement statement, ConnectionWrapper parent)
  {
    realStatement = statement;
    connectionParent = parent;
  }

  public void cancel() throws SQLException {
    realStatement.cancel();
  }

  ...

I've chosen to implement the PreparedStatementWrapper as a subclass of StatementWrapper, but that isn't a requrement. You could have PreparedStatement as a subclass of Object and implement all of the required methods rather than inherit the Statement methods:

public class PreparedStatementWrapper extends StatementWrapper implements PreparedStatement
{
  PreparedStatement realPreparedStatement;
  String sql;
  public PreparedStatementWrapper(PreparedStatement statement, ConnectionWrapper parent, String sql)
  {
    super(statement, parent);
    realPreparedStatement = statement;
    this.sql = sql;
  }

  public void addBatch() throws SQLException {
    realPreparedStatement.addBatch();
  }

Similarly, I've chosen to implement the CallableStatementWrapper as a subclass of PreparedStatementWrapper:

public class CallableStatementWrapper extends PreparedStatementWrapper implements CallableStatement
{
  CallableStatement realCallableStatement;
  public CallableStatementWrapper(CallableStatement statement, ConnectionWrapper parent, String sql)
  {
    super(statement, parent, sql);
    realCallableStatement = statement;
  }

  public Array getArray(int i) throws SQLException {
    return new SQLArrayWrapper(realCallableStatement.getArray(i), this, sql);
  }

Once again, we haven't quite finished. There are several kinds of methods in these Statement wrapper classes which should not be simple delegations. First, there is a method that returns the Connection object. We want to return the ConnectionWrapper instead, but that is easy enough. Here is the method from StatementWrapper:

  public Connection getConnection() throws SQLException {
    return connectionParent;
  }

Second, we have methods that return ResultSets. These methods need to return ResultSet wrappers. In order to keep the ResultSetWrapper consistent, I've added a lastSqlString instance variable to StatementWrapper, which is passed to the ResultSetWrapper constructor. This instance variable will become useful when we come to assigning measurements to particular SQL statements. The methods that return ResultsSets are:

//StatementWrapper method
  public ResultSet getResultSet() throws SQLException {
    return new ResultSetWrapper(realStatement.getResultSet(), this, lastSql);
  }

  public ResultSet executeQuery(String sql) throws SQLException {
    return new ResultSetWrapper(realStatement.executeQuery(sql), this, sql);
  }

//PreparedStatementWrapper method
  public ResultSet executeQuery() throws SQLException {
    return new ResultSetWrapper(realPreparedStatement.executeQuery(), this, sql);
  }

Third, some methods use java.sql.Array objects. Because these Array objects can return a ResultSet, we again need to provide an Array wrapper so that ResultSetWrapper objects will be returned rather than plain ResultSets. We also need to handle the case where an Array object is passed in to the setArray() method: if it is an Array wrapper, the object needs to be unwrapped before being passed to the underlying PreparedStatement:

public void setArray(int i, Array x) throws SQLException {
    if (x instanceof SQLArrayWrapper)
      realPreparedStatement.setArray(i, ((SQLArrayWrapper) x).realArray);
    else
      realPreparedStatement.setArray(i, x);
  }

public Array getArray(int i) throws SQLException {
    return new SQLArrayWrapper(realCallableStatement.getArray(i), this, sql);
  }

Finally, the reason we are creating all these wrapper classes is to enable measurements to be taken. The methods that execute the SQL statements are, reasonably enough, called execute-something. We need to add logging into these methods. Note that I delegate responsibility for logging to a JDBCLogger class in the following methods. Essentially, each method has a call to the real execute method wrapped with a logging call. I pass the sql string and the current thread to the logging call because these are both very useful parameters for any type of logging, especially measuring the time taken for the procedure to run. Note also that I've redefined the executeQuery() methods that return ResultSets (which were first defined a couple of code fragments back) so that they now have logging inserted:

//StatementWrapper methods
  public void addBatch(String sql) throws SQLException {
    realStatement.addBatch(sql);
    lastSql = sql;
  }

  public boolean execute(String sql) throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, sql);
    boolean b = realStatement.execute(sql);
    JDBCLogger.endLogSqlQuery(t, sql);
    lastSql = sql;
    return b;
  }

  public int[] executeBatch() throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, "batch");
    int[] i = realStatement.executeBatch();
    JDBCLogger.endLogSqlQuery(t, "batch");
    return i;
  }

  public ResultSet executeQuery(String sql) throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, sql);
    ResultSet r = realStatement.executeQuery(sql);
    JDBCLogger.endLogSqlQuery(t, sql);
    lastSql = sql;
    return new ResultSetWrapper(r, this, sql);
  }

  public int executeUpdate(String sql) throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, sql);
    int i = realStatement.executeUpdate(sql);
    JDBCLogger.endLogSqlQuery(t, sql);
    lastSql = sql;
    return i;
  }


//PreparedStatementWrapper methods
  public boolean execute() throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, sql);
    boolean b = realPreparedStatement.execute();
    JDBCLogger.endLogSqlQuery(t, sql);
    return b;
  }

  public ResultSet executeQuery() throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, sql);
    ResultSet r = realPreparedStatement.executeQuery();
    JDBCLogger.endLogSqlQuery(t, sql);
    return new ResultSetWrapper(r, this, sql);
  }

  public int executeUpdate() throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlQuery(t, sql);
    int i = realPreparedStatement.executeUpdate();
    JDBCLogger.endLogSqlQuery(t, sql);
    return i;
  }

Wrapping the ResultSet class

The ResultSetWrapper class once again consists mainly of delegated methods:

public class ResultSetWrapper implements ResultSet
{
  ResultSet realResultSet;
  StatementWrapper parentStatement;
  String sql;

  public ResultSetWrapper(ResultSet resultSet, StatementWrapper statement, String sql) {
    realResultSet = resultSet;
    parentStatement = statement;
    this.sql = sql;
  }

  public boolean absolute(int row) throws SQLException {
    return realResultSet.absolute(row);
  }

  ...

Again, there are some methods which are not plain delegations. The getStatement() method returns the statement that generated this ResultSet. We need to return the StatementWrapper:

  public Statement getStatement() throws SQLException {
    return parentStatement;
  }

The getArray() methods need to return a wrapped Array object:

  public Array getArray(int i) throws SQLException {
    return new SQLArrayWrapper(realResultSet.getArray(i), parentStatement, sql);
  }

  public Array getArray(String colName) throws SQLException {
    return new SQLArrayWrapper(realResultSet.getArray(colName), parentStatement, sql);
  }

And finally, we need to add logging. Many developers erroneously believe that the various Statement.execute*() methods incur the main cost of the database interaction. This is true for database updates, and for database reads that involve only a few rows. But if more than a few rows of a query are being read, then the ResultSet.next() method can spend a significant amount of time fetching rows from the database. If many rows are to be read, ResultSet.next() calls can cumulatively take much more time than the execution of the SQL statement. So it is natural to log the ResultSet.next() calls.

  public boolean next() throws SQLException {
    Thread t = Thread.currentThread();
    JDBCLogger.startLogSqlNext(t, sql);
    boolean b = realResultSet.next();
    JDBCLogger.endLogSqlNext(t, sql);
    return b;
  }

There are other ResultSet calls that you can log if you need, such as previous(), or insertRow(), etc. But most applications will only need next() logged, and that is how Ill leave the class.

The JDBC wrapper framework

That pretty much covers the classes that need wrapping. I haven't explicitly shown the Array wrapper or the DatabaseMetaData wrapper, but they are straightforward, needing only delegation and ResultSetWrappers and ConnectionWrappers returned instead of ResultSets and Connections. All of the classes in full for JDBC 2 can be obtained from the further resources section. JDBC 1, JDBC 2 and the upcoming JDBC 3 are all amenable to this technique of using wrapper objects to measure database interaction. They differ in their interface definitions (and so, require different wrapper classes), but all of the wrapper classes for the different versions can be created in the same way, following the procedure covered in this article.

The class I haven't yet shown you is the JDBCLogger class. A simple implementation of that class would be with null calls for the logging methods. This would provide no logging:

package tuning.jdbc;

public class JDBCLogger
{
  public static void startLogSqlQuery(Thread t, String sql) {}
  public static void endLogSqlQuery(Thread t, String sql) {}
  public static void startLogSqlNext(Thread t, String sql) {}
  public static void endLogSqlNext(Thread t, String sql) {}

}

A more useful definition would be to time the queries. The following methods work by holding the start time of the query and finding the elapsed time for the query when it finishes. The methods are kept simple by assuming that SQL queries cannot recurse within the same thread, an assumption which is generally true:

  private static Hashtable QueryTime = new Hashtable();

  public static void startLogSqlQuery(Thread t, String sql)
  {
     if (QueryTime.get(t) != null)
       System.out.println("WARNING: overwriting sql query log time for " + sql);
     QueryTime.put(t, new Long(System.currentTimeMillis()));
  }

  public static void endLogSqlQuery(Thread t, String sql)
  {
     long time = System.currentTimeMillis();
     time -= ((Long) QueryTime.get(t)).longValue();
     System.out.println("Time: " + time + " millis for SQL query " + sql);
     QueryTime.remove(t);
  }

Using these methods in the JDBCLogger class would provide output lines similar to:

Time: 53 millis for SQL query SELECT * FROM JACKTABL

for every SQL query execution. This gives you precise measurements for each of your SQL queries. You can also sum the queries in the JDBCLogger class, retain the maximum half-dozen measurements or so, and then print out a summary statment. The summary measurements I usually like to have are minimum, maximum, average, standard deviation, and 90th percentile value. These summary values tend to be more useful for a scaled-up system, whereas printing individual lines for each query is more useful when profiling the system during development.

Using the JDBC wrapper framework

I have found the techniques presented here to be very useful in determining the performance of JDBC calls, both in development and in deployed applications. Because the wrappers are simple and robust, and require very little alteration to the application using them (i.e, are low maintenance), they are suitable to be retained within a deployed application. Creating a configurable JDBCLogger class will allow you to turn logging on and off at will.

During development, these classes have enabled me to identify both individually expensive database interactions and repeated database interactions which are expensive because of their cumulative cost. Identifying these expensive database interactions is the necessary first step to improving your application's performance. In deployed environments, these wrapper classes have identified discrepancies between the expected performance and those actually obtained, and have helped to identify why those discrepancies exist.

Note that there is also a free tool available from Provision6, called the P6Spy driver, which uses the techniques illustrated in this article to log JDBC calls.

After you have used these classes to pinpoint where JDBC is causing a performance problem, you will need to tune the database interactions. The performance tuning techniques to tune JDBC will be presented in future Tuning JDBC articles.

Further Resources

Jack Shirazi is the author of Java Performance Tuning. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance.


Read more Java Design and Performance Optimization columns.

Return to ONJava.com.

Copyright © 2009 O'Reilly Media, Inc.