 Published on ONJava.com (http://www.onjava.com/)


Domain Searching Using Visitors

by Paul Mukherjee
06/01/2005

Modern applications typically require domain searching functionality: the ability to search for data within the context of the application domain. For instance, an application for tracking financial transactions might need to be able to search for all transactions over a certain cash value; a supply-chain management application might need to be able to search for all requisitions for a particular supplier. This capability could be for ad hoc searching or for the generation of reports. Since the underlying information is typically stored in a relational database, at some point the search needs to be converted from the language of the domain to the query language supported by the database (typically SQL). Though this is a variation on the well-known problem of object-relational (O-R) mapping, this particular aspect of the problem has generated less interest than the more fundamental problem of defining the relationship between domain objects and database tables.

As a consequence, I have seen many applications where the benefits of a carefully designed O-R mapping layer have been negated by ill-conceived search functionality that couples the domain objects tightly with the database. The objective of this article is to show how careful design can provide a flexible solution that is easy to maintain and adapt to different O-R mappings or native SQL. The solution I present has two main ingredients: a collection of classes that capture the information to be used for searching, and an implementation of the Visitor pattern that expresses the search in the language of the underlying persistence service.

In this article, I use a running example to illustrate the ideas I am presenting. The example is a simple database that is used as part of a backup application to record which files are located in a specific backup volume.

This article is organized as follows. The next section describes some of the forces that influence the design. An overview of the design follows, and then a description of the framework it uses. The two sections after that show two different implementations of the design, and the final section considers its strengths and weaknesses.

The source code accompanying this article is available in the Resources section below. Note that for brevity I have inlined some variables and methods in this article, compared to the actual source code.

Forces

When considering the problem of domain searching, a number of different forces apply and should be considered as parameters constraining potential solutions.

The first constraint is that the solution should support loose coupling between domain objects and the database. A couple of years ago this would have required no further justification, but many of the exponents of lightweight approaches to application design have argued eloquently that loose coupling is a symptom of over-engineering, so some explanation is necessary.

The objective of loose coupling in this context is to ensure clean separation between the problem domain layer and the data management layer. This separation is critical if each layer is to have well-defined responsibilities. In fact, the emerging trend towards transparent persistence makes this separation all the more important.

Any solution should also be maintainable; modifying or extending domain search functionality should not lead to wholesale changes in the application design. Loose coupling is normally necessary for maintainability, but not sufficient.

Other forces could also apply depending on the specific application, such as performance, substitutability (should a number of different databases or persistence services be supported?), and so on. However, I consider loose coupling and maintainability to be common to all solutions except perhaps the worst kind of hack thrown together on short notice (the kind we all have worked on but never admit to!).

Design Overview

The domain searching design I am going to present assumes a standard layered approach with a presentation layer, a problem domain layer, and a data management layer. The example also uses a client layer consisting of JavaServer Pages (JSPs) served to a user in a browser. To keep the example simple, I have not used a web application framework such as Struts, though there would be no problem using this design within a Struts application.

The example therefore structures the design as shown in Figure 1. The important part of the design is the third step. This is also the only step that is dependent on the underlying persistence service.

Figure 1: Solution structure

In order to help understand the example, the structure of the database used is shown in Figure 2. A volume contains a number of files, and a number of keywords may also be associated with a volume to allow for keyword-based searching. The information persisted has deliberately been kept as simple as possible to ensure that the clarity of the design is preserved.

Figure 2: Database tables

In order to demonstrate the flexibility of the design, I will present two implementations, one using SQL and one using Hibernate. However, before these implementations can be described, the overall solution framework needs to be outlined.

Design Framework

When designing domain searching, it is important to have a clear specification of what is to be searched, and what is to be presented as the result of the search. This might seem self-evident, but in my experience it can lead to significant discussion, especially if the users of the application fall into different constituencies.

Assuming that such a specification is in place, a number of search criterion classes should be created, which will contain the user-provided search data. In the backup database example, users can search using volume names, file names, a file's last modification date, and keywords. The result will show the volume name, file name, file size, and last modification date for each file matching the submitted search criteria. It is permitted for none of the described search criteria to be used, in which case the entire contents of the database will be returned, subject to whatever constraint is specified by the presentation tier on page size.

This leads to the class diagram shown in Figure 3.

Figure 3: Search criterion classes

ISearchCriterion is an interface implemented by all search criterion classes. The methods defined by this interface are described in the table below. For this article, the most important method is the accept method, which provides the entry point for visitors. This is a standard implementation of the Visitor pattern.

Method | Description
String getName() | The name of the criterion. This is used to identify the criterion uniquely within the program.
String getDisplay() | The display name of the criterion. This is the name used by clients to present the criterion to users.
void setValue(String value) throws ParseException | This method is used to populate the criterion. The string value is that submitted by the user; it is parsed by this method and the object stores the result of the parse.
void accept(ISearchVisitor visitor) throws SearchException | The entry point for visitor objects, for this criterion.
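
As a rough illustration of the setValue contract, here is a minimal sketch of a criterion for a file's last modification date. The class name SearchLastModified, the getValue accessor, and the yyyy-MM-dd input format are assumptions made for this example and are not taken from the accompanying source; only the parsing side is shown, and accept is omitted.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical criterion for a file's last modification date.
// Only the parsing side of the contract is shown; accept() is omitted.
class SearchLastModified {
    // Assumed input format; the article does not say which one is used.
    private static final String PATTERN = "yyyy-MM-dd";

    private Date value;

    public String getName() { return "lastModified"; }

    public String getDisplay() { return "Last modified"; }

    // Parses the user-submitted string and stores the result of the parse.
    public void setValue(String text) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat(PATTERN);
        format.setLenient(false); // reject impossible dates such as 2005-13-45
        this.value = format.parse(text);
    }

    // Hypothetical accessor that a visitor could use to read the parsed value.
    public Date getValue() { return value; }
}
```

Parsing in setValue means that malformed input is reported as a ParseException when the criterion is populated, before any visitor or query is involved.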

For domain searching, the Visitor pattern provides a natural means to traverse the search criterion classes. The interface ISearchVisitor defines methods for visiting each kind of search criterion. This allows the submitted search criteria to be traversed and a query built; the search can then be executed and a result delivered. The visitor hierarchy for this example is shown in Figure 4.

Figure 4: Visitor classes
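
The interplay between accept and the visit methods can be sketched as follows. This is a reduced illustration rather than the accompanying source: the interfaces are cut down to two kinds of criterion, SearchVolumeName and DescribingVisitor are hypothetical names, and the minimal SearchException is an assumption. The visitor here only records what it is shown, standing in for a visitor that would build a query.

```java
import java.util.ArrayList;
import java.util.List;

// Checked exception named in the article; this minimal version is an assumption.
class SearchException extends Exception {
    SearchException(String message) { super(message); }
}

// One visit method per kind of criterion (only two kinds are sketched).
interface ISearchVisitor {
    void visitVolumeName(SearchVolumeName volumeName) throws SearchException;
    void visitKeyword(SearchKeyword keyword) throws SearchException;
}

// Reduced to accept(); getName, getDisplay and setValue are left out.
interface ISearchCriterion {
    void accept(ISearchVisitor visitor) throws SearchException;
}

// Hypothetical criterion class for the volume name.
class SearchVolumeName implements ISearchCriterion {
    private final String volumeName;
    SearchVolumeName(String volumeName) { this.volumeName = volumeName; }
    String getVolumeName() { return volumeName; }
    // Double dispatch: the criterion selects the visit method for its own type.
    public void accept(ISearchVisitor visitor) throws SearchException {
        visitor.visitVolumeName(this);
    }
}

class SearchKeyword implements ISearchCriterion {
    private final String keyword;
    SearchKeyword(String keyword) { this.keyword = keyword; }
    String getKeyword() { return keyword; }
    public void accept(ISearchVisitor visitor) throws SearchException {
        visitor.visitKeyword(this);
    }
}

// Hypothetical visitor that records what it visits instead of building a query.
class DescribingVisitor implements ISearchVisitor {
    final List<String> seen = new ArrayList<String>();
    public void visitVolumeName(SearchVolumeName v) {
        seen.add("volume name contains " + v.getVolumeName());
    }
    public void visitKeyword(SearchKeyword k) {
        seen.add("keyword matches " + k.getKeyword());
    }
}

class VisitorDemo {
    // Sends one visitor to every criterion and returns what it recorded.
    static List<String> describe(List<ISearchCriterion> criteria) throws SearchException {
        DescribingVisitor visitor = new DescribingVisitor();
        for (ISearchCriterion criterion : criteria) {
            criterion.accept(visitor);
        }
        return visitor.seen;
    }

    public static void main(String[] args) throws SearchException {
        List<ISearchCriterion> criteria = new ArrayList<ISearchCriterion>();
        criteria.add(new SearchVolumeName("backup"));
        criteria.add(new SearchKeyword("photos"));
        for (String line : describe(criteria)) {
            System.out.println(line);
        }
    }
}
```

The code that drives the traversal never inspects the concrete type of a criterion; each criterion routes the visitor to the matching method, which is what lets a different visitor be substituted without touching the criterion classes.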

The following sections describe the SQL and Hibernate visitors in more detail.

SQL Visitor

The SQL visitor traverses the collection of search criteria objects and constructs a prepared statement. The various visit methods accumulate the tables to include in the search and the constraints on it. The doSearch method visits all of the search criteria, combines the information yielded by the visitor methods into a prepared statement, and executes it; the result set is then traversed and a search result created. The following section shows how the visitor methods accumulate information from the search criteria. After that, the actual query execution and result handling are explained.

Visitor Methods

The visitor methods accumulate information in four instance variables; the prepared statement to be built will have the form:

SELECT <selects> FROM <tables> WHERE <constraints> 

The SQL visitor uses collections to accumulate the values used to populate this statement. The collection used for <selects> is a list, since the order is significant; the order provides the basis for interpretation of the result set. Order is not significant for <tables>, so here a set is used. In the case of constraints, two lists are used so that actual values can be matched to the placeholders in the prepared statement. The first contains the constraint strings, placeholders included, that are conjoined in the WHERE clause; the second contains the values used to populate those placeholders.

This leads to the following instance variable declarations:

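import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SQLSearchVisitor implements ISearchVisitor {
  private List<String> selects = new ArrayList<String>();    // columns, in result order
  private Set<String> tables = new HashSet<String>();        // tables; order is irrelevant
  private List<String> criteria = new ArrayList<String>();   // constraints with ? placeholders
  private List<Object> parameters = new ArrayList<Object>(); // values for the placeholders
  ...
}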

Because what the query should yield has already been defined, the values needed by every search, such as the names of the tables and columns, can be added to the collections up front. This is done by the method addFixedInformation:

private void addFixedInformation() {
  selects.add("VOLUME_TB.NAME");
  selects.add("FILE_TB.NAME");
  selects.add("FILE_TB.SIZE");
  selects.add("FILE_TB.LASTMODIFIED");
  tables.add("VOLUME_TB");
  tables.add("FILE_TB");
  criteria.add("FILE_TB.VOLUMEFK = VOLUME_TB.ID");
}

Each visitor method then adds information to the instance variables. For instance, consider visitVolumeName from the class SQLSearchVisitor. This visitor method adds the criterion that any matched volumes must include the submitted volume name as a substring. visitKeyword provides a more interesting example, because in this case extra tables have to be added:

public void visitKeyword(SearchKeyword keyword) {
  tables.add("KEYWORD_TB");
  tables.add("KEYWORD_VOLUME_REL");
  addCriterion("KEYWORD_TB.KEYWORD LIKE ?",
      keyword.getKeyword());
}

In this case, it is also necessary to constrain the tables so that the relationship between the keyword and volume represented by KEYWORD_VOLUME_REL is used. This is consigned to a separate method to avoid repetition in the criterion list:

private void addJoins() {
  if (tables.contains("KEYWORD_VOLUME_REL")) {
    // Must also contain KEYWORD_TB and VOLUME_TB
    criteria.add("KEYWORD_VOLUME_REL.KEYWORDFK = KEYWORD_TB.ID");
    criteria.add("KEYWORD_VOLUME_REL.VOLUMEFK = VOLUME_TB.ID");
  }
}
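
The addCriterion helper called by visitKeyword is not listed in this article. A minimal sketch of how such a helper might keep the two constraint lists in step is given below; the class CriteriaAccumulator and the methods addJoin and placeholderCount are hypothetical and exist only to make the idea checkable in isolation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the constraint-related state of the SQL visitor.
class CriteriaAccumulator {
    private final List<String> criteria = new ArrayList<String>();   // constraint text with ? placeholders
    private final List<Object> parameters = new ArrayList<Object>(); // values for those placeholders

    // Assumed behavior of addCriterion: the constraint text and its value are
    // recorded together, so the nth placeholder matches the nth parameter.
    void addCriterion(String constraint, Object value) {
        criteria.add(constraint);
        parameters.add(value);
    }

    // Constraints without a placeholder, such as join conditions, add no parameter.
    void addJoin(String constraint) {
        criteria.add(constraint);
    }

    List<String> getCriteria() { return criteria; }

    List<Object> getParameters() { return parameters; }

    // Counts the ? placeholders across all accumulated constraints.
    int placeholderCount() {
        int count = 0;
        for (String constraint : criteria) {
            for (int i = 0; i < constraint.length(); i++) {
                if (constraint.charAt(i) == '?') {
                    count++;
                }
            }
        }
        return count;
    }
}
```

With this arrangement the number of placeholders always equals the number of stored parameters, which is the property the prepared statement relies on when the values are bound.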

The remaining visitor methods can be seen in the source code. These methods are glued together by the method buildQuery, from SQLSearchVisitor, which ensures that the instance variables are populated. Having populated these instance variables, it is possible to create and execute a prepared statement.

Query Construction and Execution

Query construction falls into two phases: creating the SQL text, and then inserting the parameters into the prepared statement at the positions marked by the placeholders in the text. This is captured by getPreparedStatement, from SQLSearchVisitor, which uses the instance variable conn, an instance of java.sql.Connection.

Creating the SQL statement is just a question of iterating over the collections and concatenating the values to form a string. A helper method, addItems from SQLSearchVisitor, is used to exploit the similarity in the iterations. For example, if a search is submitted for files matching the pattern *.jpg, this would result in the following SQL statement:

SELECT VOLUME_TB.NAME, FILE_TB.NAME, FILE_TB.SIZE,
FILE_TB.LASTMODIFIED FROM FILE_TB, VOLUME_TB WHERE
FILE_TB.VOLUMEFK = VOLUME_TB.ID AND FILE_TB.NAME LIKE ?

And the parameters list would contain the single value "%.jpg%".
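
The first phase, assembling the SQL text, can be sketched as below. The signature given to addItems here is an assumption, since the real helper is only available in the accompanying source, and SqlTextBuilder and buildSql are hypothetical names. A TreeSet is used for the tables instead of the HashSet shown earlier, purely so that the output order is predictable in this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SqlTextBuilder {
    // Assumed shape of the helper: appends the items, separated by the given text.
    static void addItems(StringBuilder sql, Collection<String> items, String separator) {
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            sql.append(it.next());
            if (it.hasNext()) {
                sql.append(separator);
            }
        }
    }

    // Builds SELECT <selects> FROM <tables> WHERE <constraints>.
    static String buildSql(List<String> selects, Set<String> tables, List<String> criteria) {
        StringBuilder sql = new StringBuilder("SELECT ");
        addItems(sql, selects, ", ");
        sql.append(" FROM ");
        addItems(sql, tables, ", ");
        if (!criteria.isEmpty()) {
            sql.append(" WHERE ");
            addItems(sql, criteria, " AND ");
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        List<String> selects = new ArrayList<String>();
        selects.add("VOLUME_TB.NAME");
        selects.add("FILE_TB.NAME");
        selects.add("FILE_TB.SIZE");
        selects.add("FILE_TB.LASTMODIFIED");

        Set<String> tables = new TreeSet<String>(); // sorted only for a stable example
        tables.add("VOLUME_TB");
        tables.add("FILE_TB");

        List<String> criteria = new ArrayList<String>();
        criteria.add("FILE_TB.VOLUMEFK = VOLUME_TB.ID");
        criteria.add("FILE_TB.NAME LIKE ?");

        System.out.println(buildSql(selects, tables, criteria));
    }
}
```

Run with the collections from the file name example, this produces the same statement as shown above, on a single line.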

Once the prepared statement has been created, it can be executed. This yields a ResultSet object that can be traversed and a SearchResult constructed. This SearchResult object is returned as the result of the search to the presentation tier for rendering in a suitable form. Details can be found in the source code provided.
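
The second phase of query construction, filling in the placeholders, might look like the following sketch. StatementBinder, bind, and prepare are hypothetical names, and the use of setObject for every value is an assumption; the actual getPreparedStatement in the accompanying source may bind values differently.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class StatementBinder {
    // Copies the accumulated values into the statement's placeholders.
    // JDBC numbers placeholders from 1, whereas list indexes start at 0.
    static void bind(PreparedStatement statement, List<?> parameters) throws SQLException {
        for (int i = 0; i < parameters.size(); i++) {
            statement.setObject(i + 1, parameters.get(i));
        }
    }

    // Prepares the SQL text on the given connection and binds the values.
    // The caller is expected to execute and close the returned statement.
    static PreparedStatement prepare(Connection conn, String sql, List<?> parameters)
            throws SQLException {
        PreparedStatement statement = conn.prepareStatement(sql);
        bind(statement, parameters);
        return statement;
    }
}
```

Because the values are kept in the same order as the placeholders, binding reduces to a single loop, and the values never have to be spliced into the SQL text itself.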

Hibernate Visitor

Hibernate is an object-relational persistence and query service for Java. It supports transparent persistence by allowing persistent data to be defined as plain old Java objects (POJOs). Runtime configuration is then used to map these objects to database tables, giving a clean separation between the domain object model and persistence management. Extensive literature about Hibernate is available elsewhere (see the Resources section), so I don't propose to provide a thorough introduction here.

Domain Objects and Persistence

The domain object model for the example is straightforward and is shown in Figure 5. I will briefly describe the mapping of one domain object to the database in order to give a flavor of how this is done.

Figure 5: Application domain object model

The domain class File represents the information stored about a backed-up file by the application. It contains a number of instance variables (listed in the table below) with associated getter and setter methods.

Name | Description
id : Integer | A unique identifier for the file.
lastModified : Date | The date of the last modification of the file prior to backup.
name : String | The name of the file.
size : Integer | The size of the file in bytes.
volume : Volume | The volume used to back up the file. If this File object is returned as the result of a search, this will be null.
volumeName : String | The name of the volume. This is only used if the volume is null.

Instances of this class correspond to rows in the FILE_TB table shown in Figure 2; the VOLUMEFK column in this table is used to provide the volume and volumeName instance variables as needed.

The mapping between the File class and the FILE_TB table is provided by File.hbm.xml. Similar mapping files for the other domain classes can be defined. These can be seen in the accompanying source code.

Searching Using Hibernate

Having established the object-relational mapping, Hibernate offers a number of different methods of querying. It is possible to perform a query using the underlying JDBC connection, in a similar manner to the SQL search visitor described previously. However, this approach bypasses the persistence service and is intended only for cases where the persistence service does not support the desired behavior, which is not the case here. A second method provided by Hibernate is Hibernate Query Language, which allows SQL-style queries to be phrased in the language of domain objects. This is a very powerful approach, and for complex queries is the recommended strategy. The third approach is to use Hibernate criteria, which are objects that constrain properties of the domain object model. For simple searches, such as those needed for the example, this approach is ideal and is therefore the one I have used.

The query to be executed by Hibernate is specified using a Criteria object, which the visitor builds up as it traverses the supplied search criteria. The visitor therefore holds this object in an instance variable, initialized in the constructor from a supplied Hibernate Session object:

public class HibernateSearchVisitor implements ISearchVisitor {
  private Criteria criteria;
  public HibernateSearchVisitor(Session session)
    throws SearchException {
    this.criteria = session.createCriteria(File.class);
  }
  ...
}

The use of File.class here tells Hibernate that instances of File are to be yielded by the search. Each visitor method then adds to the criteria object. Dependent domain objects are tied in by creating child criteria. For example, the method visitVolumeName from HibernateSearchVisitor constrains the name of the dependent Volume object by creating a child Criteria object for the volume property of File and constraining it.

The other visitor methods constrain the properties of File in a similar manner. The overall search is performed by the doSearch method from the class HibernateSearchVisitor, which builds the query using the buildQuery method. The actual query is performed by the list method of the criteria object, and the result is then used to populate a SearchResult object. Compared to the SQL visitor, the Hibernate visitor is noticeably simpler, which illustrates the elegance of Hibernate well.

Closing Remarks

In the preceding sections, I have described a design for domain searching that is based around defining domain objects to represent the search, and then using the Visitor pattern to build the actual search to be performed. I have shown two specific implementations of the visitor to illustrate the approach.

The main strengths of the design are:

Note with this last point that this isn't just a question of supporting multiple databases (in my experience, switching databases during a project is rare). It might be that a product is being developed and the flexibility to support other persistence approaches in the future needs to be built in now. Using the visitor design provides this flexibility at very little cost. Moreover, the application might in the future need to search an external information system, say, using web services. This design makes support for such functionality straightforward to incorporate.

The main weakness of the design is the one inherited from the Visitor pattern: if the structure of the domain objects changes frequently, then all of the visitor implementations need to be updated to reflect these changes. However, in my experience domain objects change infrequently compared to the pace of change of other parts of an application.

Whatever your approach to persistence, next time you have to implement domain searching, I hope this article has added some weapons to your armory.

Resources

Paul Mukherjee works as a technical architect for Systematic Software Engineering Limited in Britain, and is a Sun Certified Enterprise Architect.



Copyright © 2009 O'Reilly Media, Inc.