ONJava.com    
 Published on ONJava.com (http://www.onjava.com/)
 See this if you're having trouble printing code examples


Agile Object to Relational Database Replication with db4o

by Jim Paterson
04/12/2006

Databases can be many things to many people. For some, a database is something that handles enterprise data on a cluster of back-end servers. Others work at the other end of the spectrum and need to gather and store data on small mobile devices such as PDAs or more specialized vertical market devices. Sometimes these devices have a network connection that allows them to talk directly to a back-end server. Often, however, they are partially connected, and need their own local data store.

This is where embedded databases come into the picture. These are databases that can be embedded into applications, rather than running as separate processes. It's not surprising that they have become increasingly popular in the past few years. Most embedded databases are lightweight variants of the traditional SQL database. However, this is an area where db4o, an open source native pure object database for Java and .NET, excels. db4o, which I described in a previous article, has some important assets in this type of environment: small footprint, zero administration, and, as a native object database, the ability to store objects as they are, with no need to define a database schema or map objects to tables. The domain class model itself is the database schema. The simplicity of development with an object database can lead to significant savings in time and effort.

Why Replicate?

There are some potential drawbacks with an object database like db4o. Since data is stored in the form of Java or .NET objects, you can only make sense of it within the application that created it, or with an application that can access the same model classes. You can't just fire an SQL query at the database.

The new db4o Replication System (dRS) now allows you to join together the different worlds of object and relational databases. Based on Hibernate, it provides the capability to replicate data between db4o and relational databases such as Oracle and MySQL. In practice, this can mean you can synchronize data between the local db4o database on a partially connected device and an enterprise RDBMS. It also means that your db4o data becomes available on an SQL-friendly platform for ad hoc access.

Related Reading

Hibernate: A Developer's Notebook
By James Elliott

dRS also allows such distributed systems to benefit from db4o's schema evolution support. In an agile environment, object models may be refactored frequently. db4o automatically adapts to a changed object model, and those changes can be replicated into the RDBMS.

db4o Replication System (dRS)

db4objects Inc., the company behind the db4o database, has recently released a development version of its new replication solution. dRS can be used in a distributed db4o-based system in a number of ways. Here are some possible scenarios:

dRS provides an object-oriented API that lets you write simple applications to perform these tasks, or to embed the capabilities within your applications. Although db4o supports both Java and .NET, the initial release of dRS is Java-only. However, it can also be used to replicate databases containing .NET objects.

The ability to replicate between native databases is an important feature for db4o in distributed applications in which components are not necessarily able to communicate over permanent or highly available network connections. A db4o database consists of a single, easily transportable file, the contents of which can be made available to applications through a lightweight network server, through a server embedded within the application, or by simply opening the file in a single-user mode. This gives a great deal of flexibility: as long as a replication system can get access to a database file, it can synchronize it with another file or files.

There are other benefits from an object-oriented approach to replication. For example, a replication process often needs to handle conflicts; for example, where objects have been modified since the last replication in two databases that are to be synchronized. Because dRS can work with the actual business objects, rather than a representation of their field values mapped onto tables, it is possible to build the precedence criteria into the business logic within the objects. For example, in a system with customers and salespersons, a customer object might be able to store information to indicate whether that object was changed by a salesperson with "ownership" of that customer. The replication process could then access that information encapsulated in the objects themselves to help resolve conflicts.

From Objects to Tables

Replication between native object databases is interesting, but what is likely to make many people sit up and take notice of db4o is the ability to replicate objects from a native object database to MySQL, or to Oracle, or to whatever RDBMS you might prefer to use. You can develop your application with db4o, but you can access the data pretty much any way you want. If your company has a preference to have all of its data deposited within an Oracle system, you can do it, even if it was originally created by a db4o-based application.

In this article, I'll concentrate on the object-to-relational capability of dRS.This is intended to give you a flavor of what db4o and dRS can do; it is not meant to be a comprehensive guide to all the features of dRS. Along the way, having a close look at the same data in different databases will highlight some of the ways in which a native object database differs from a relational one.

dRS and Hibernate

dRS makes use of the capability that Hibernate, the popular open source object-relational mapping framework, provides to work directly with Java objects that are persisted by a relational database. Hibernate works in the background to generate and execute the SQL statements that the database understands, and to connect the results of executing the SQL with live Java objects. dRS builds on this to compare objects in a db4o database with those stored in the SQL database, and to update those that have changed since the last replication. Hibernate does the work of making persisted data available as live objects, while dRS does the work of comparing and deciding which objects need to be updated. Because Hibernate supports a wide range of RDBMSes (there is a list here), dRS supports them too.

The only extra bits of work that need to be done compared with db4o-to-db4o replication are to create a Hibernate configuration file (which specifies, among other things, what kind of database is to be used) and specify where to find it, and to create mapping files for each persistent class. It's worth noting here that by doing this you are to some extent compromising on the benefits offered by db4o itself. With a pure object database, you never have to create mappings, and you can easily work with object models that would be very difficult to map to relational tables.

So what's the motivation to use db4o at all? Why not use a relational database in the first place? There are a number of reasons why you might choose db4o. The fact that the application and database use exactly the same model means that you don't have to write a lot of code to build and execute SQL statements that the compiler can't check and that can be difficult to refactor. You also don't need to write more code to assemble objects from the results. You could use Hibernate directly in the application, which would let you work only with objects in your code. However, the footprint and performance of the resultant application would probably make it unsuitable for devices with limited resources. Finally, in my opinion, db4o is just really nice to use.

Example: Replication from db4o to MySQL

Let's look at dRS replication in action. The dRS distribution includes some samples based on an in-memory HSQLDB database. To test it out in a more "enterprise" scenario, let's replicate to a MySQL 5.0 database.

You can download the source code for this example in the Resources section. To compile and run this code you'll need to download dRS and add all of the included libraries to your project. The dRS distribution includes the required db4o and Hibernate libraries. You'll also need access to a MySQL server, on which you need to create a schema called drsexample, and you should add the MySQL Connector/J driver to your project--you can download it from the MySQL Connectors Downloads page.

First we need some objects. These will be instances of the two classes listed below, Customer and Address. There is a one-to-one relationship here: each Customer object has a field that is an Address object. The ability to deal with objects with relationships is important, as objects stored in an object database will often have fields that are themselves objects. Replication should maintain those relationships. dRS supports fields that are either single objects like this, or instances of common collection types.

public class Customer {
    private Long id;
    private String title;
    private String firstName;
    private String lastName;
    private Address address;
    private String email;
    private String phoneNumber;
    private String faxNumber;
    private int creditLine;
    private String contactCode;
    
    // getters and setters omitted...
        
    public String toString(){
        return "[" + firstName + " " + lastName + "]";
    }   
}

public class Address {
    private Long id;
    private String street1;
    private String street2;
    private String city;
    private String zip;
    private String country;
    
    // getters and setters omitted... 

    public String toString(){
        return "[" + street1 + ", " + city + "]";
    }
}

Now we need to create some instances, and store these in a db4o database. Actually, we'll store them in a number of separate databases, as you would with db4o databases on separate disconnected mobile devices. These separate databases can then be merged into the MySQL database. Two databases will be enough to demonstrate the point.

In db4o, you can create a database very easily within your application code. Adding the single db4o.jar file as a library to your Java project gives you access to the API classes. The database itself is contained in a single file, usually given the extension .yap. The following line of code opens an existing database file, or creates a new one if it doesn't exist:

ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap");

ObjectContainer is a db4o API class that provides a handle to allow you to access the database. Calls to store, update, or retrieve objects from the database are made through the ObjectContainer reference.

The following code creates two Customer objects and two Address objects. Each Customer is associated with one of the Address objects. The Customer objects are stored simply by calling the method ObjectContainer#set. The database doesn't have to be told that it is going to store this type of object--it can store any kind of object with no need to define the schema in advance. Objects are stored as Java objects.

What about storing the Address objects? Well, because each one is a field of a Customer, storing the Customer also stores the Address, so we don't need to store them explicitly.

ExtDb4o.configure().generateVersionNumbers(Integer.MAX_VALUE);
ExtDb4o.configure().generateUUIDs(Integer.MAX_VALUE);
ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap");

Address a1 = new Address();
a1.setStreet1("1 First Street");
a1.setStreet2("Downtown");
a1.setCity("Pittsburgh");
a1.setZip("PA 15121");
a1.setCountry("USA");

Address a2 = new Address();
a2.setStreet1("2 Second Street");
a2.setStreet2("Uptown");
a2.setCity("Seattle");
a2.setZip("WA 98200");
a2.setCountry("USA");

Customer c1 = new Customer();
c1.setTitle("Mr");
c1.setFirstName("Ben");
c1.setLastName("Roethlisberger");
c1.setEmail("ben@steelers.com");
c1.setPhoneNumber("1-412-123-4567");
c1.setFaxNumber("1-412-987-6543");
c1.setCreditLine(25000);
c1.setContactCode("Joe");

Customer c2 = new Customer();c2.setTitle("Mr");
c2.setFirstName("Matt");
c2.setLastName("Hasselbeck");
c2.setEmail("matt@seahawks.com");
c2.setPhoneNumber("1-206-123-4567");
c2.setFaxNumber("1-206-987-6543");
c2.setCreditLine(16000);
c2.setContactCode("Joe");

c1.setAddress(a1);
c2.setAddress(a2);

objectContainer.set(c1);
objectContainer.set(c2);
objectContainer.commit();
objectContainer.close();

There are a couple of db4o configuration calls made before the database file is opened.

Let's take a look at what's been stored. We can do this by running a query within a Java application, or by using db4o's graphical database browser, the ObjectManager (this needs to be downloaded separately from the main db4o distribution). We'll do the latter here, and look at an example of running a query later on. Figure 1 shows the ObjectManager interface.

Figure 1
Figure 1. The db4o ObjectManager

Browsing an object database is very different from browsing a relational one. There are no tables. Instead, the database contents are shown as an object tree, similar to the way local variables are often displayed by debuggers in IDEs. There is a separate root node for each class that is stored. The set of objects of a particular class is called the extent of the class.

Figure 2 shows the Customer node expanded to show the fields of each object. Each object is identified by an object ID (OID) value (e.g. 1225 for the first Customer). The first Customer is associated with its Address by holding a reference in its address field to the Address object with the OID 3108. The relationship is stored in almost exactly the same way it was represented in the objects in memory before they were stored. In contrast, in MySQL there will need to be matching values in the id fields in order to maintain associations.

Figure 2
Figure 2. The Customer object tree in the db4o database

Configuring for Replication to MySQL

As described above, dRS uses Hibernate to map objects to tables, so we need some configuration files to tell it what classes to map and how to map them. The main configuration file, hibernate.cfg.xml, specifies the database driver and URL (on localhost in this example), database-specific dialect, and the classes mapping file names. The database, drsexample, on the MySQL server is specified in the URL.

<!-- hibernate.cfg.xml -->
<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-configuration PUBLIC
                "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
                "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">

<hibernate-configuration>
    <session-factory>
        <!-- Database connection settings -->
        <property name="hibernate.connection.driver_class">
        com.mysql.jdbc.Driver</property>
        <property name="hibernate.connection.url">jdbc:mysql://
            localhost:3306/drsexample</property>
        <property name="hibernate.connection.username"></property>
        <property name="hibernate.connection.password"></property>

        <!-- JDBC connection pool (use the built-in) -->
        <property name="hibernate.connection.pool_size">1</property>

        <!-- SQL dialect -->
        <property name="hibernate.dialect">
        org.hibernate.dialect.MySQLDialect</property>

        <!-- Echo all executed SQL to stdout -->
        <property name="hibernate.show_sql">true</property>

        <!-- Update the database schema if out of date -->
        <property name="hibernate.hbm2ddl.auto">update</property>

        <!-- Specify all your data entity classes here -->
        <mapping resource="com/onjava/db4o/drs/Address.hbm.xml"/>  
        <mapping resource="com/onjava/db4o/drs/Customer.hbm.xml"/>  
    </session-factory>
</hibernate-configuration>

The two class mapping files, Customer.hbm.xml and Address.hbm.xml, are shown below. These map each class to a table: the tables are called CUSTOMERS and ADDRESSES. Note that Customer.hbm.xml specifies the address field as a participant in a one-to-one relationship.

<!-- Customer.hbm.xml -->
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
                "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
                "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping default-access="field" default-lazy="false"
    default-cascade="all">
    <class name="com.onjava.db4o.drs.Customer" table="CUSTOMERS">
        <id column="id" type="java.lang.Long">
            <generator class="foreign">
                <param name="property">address</param>
            </generator>
        </id>
        <property name="title" type="java.lang.String"/>
        <property name="firstName" type="java.lang.String"/>
        <property name="lastName" type="java.lang.String"/>
        <property name="email" type="java.lang.String"/>
        <property name="phoneNumber" type="java.lang.String"/>
        <property name="faxNumber" type="java.lang.String"/>
        <property name="creditLine" type="java.lang.Integer" length="5"/>
        <property name="contactCode" type="java.lang.String"/>
        <one-to-one name="address"/>   
    </class>
</hibernate-mapping>

<!-- Address.hbm.xml -->
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
                "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
                "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping default-access="field" default-lazy="false"
    default-cascade="all">
    <class name="com.onjava.db4o.drs.Address" table="ADDRESSES">
        <id column="id" type="java.lang.Long">
            <generator class="native"/>
        </id>
        <property name="street1" type="java.lang.String"/>
        <property name="street2" type="java.lang.String"/>    
        <property name="city" type="java.lang.String"/>
        <property name="zip" type="java.lang.String"/>
        <property name="country" type="java.lang.String"/>
    </class>
</hibernate-mapping>

Running the Replication

To replicate from the db4o database file to a running MySQL server with an empty database named drsexample, we need to add all of the dRS distribution libraries to a project; import the required Hibernate, db4o, and dRS classes; and run the following code:

ExtDb4o.configure().generateUUIDs(Integer.MAX_VALUE);
ExtDb4o.configure().generateVersionNumbers(Integer.MAX_VALUE); 
ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap"); 
Configuration config = new Configuration().configure("hibernate.cfg.xml"); 
ReplicationSession replication = Replication.begin(container, config);
ObjectSet changed = replication.providerA().objectsChangedSinceLastReplication();
while (changed.hasNext()) 
    replication.replicate(changed.next()); 
replication.commit();
replication.close();
objectContainer.close();

After running this, we can open a MySQL command-line client on the drsexample database, and take a look at what's in the database. There are a number of tables, the most obviously relevant being CUSTOMERS and ADDRESSES . We can query to find out what's in them (for brevity, not all fields are shown here):

mysql> select id,firstname,lastname, email, creditLine from customers;
+----+-----------+----------------+-------------------+------------+
| id | firstname | lastname       | email             | creditLine |
+----+-----------+----------------+-------------------+------------+
| 1  | Matt      | Hasselbeck     | matt@seahawks.com |      16000 |
| 2  | Ben       | Roethlisberger | ben@steelers.com  |      25000 |
+----+-----------+----------------+-------------------+------------+
2 rows in set (0.00 sec)

mysql>  select id,street1,street2, city, zip, country from addresses;
+----+-----------------+----------+------------+----------+---------+
| id | street1         | street2  | city       | zip      | country |
+----+-----------------+----------+------------+----------+---------+
| 1  | 2 Second Street | Uptown   | Seattle    | WA 98200 | USA     |
| 2  | 1 First Street  | Downtown | Pittsburgh | PA 15121 | USA     |
+----+-----------------+----------+------------+----------+---------+
2 rows in set (0.00 sec)

With the few lines of code above, dRS has extracted all the objects from the db4o database, and with the help of Hibernate, has created the appropriate tables in the MySQL database and inserted the field values in the MySQL tables. All of the objects are copied, as this is the first replication of the database. Note the id field values--these did not come from the objects, and are not related to the db4o OIDs. The values have been generated by MySQL as directed by the Hibernate configuration files, and are needed so that the right ADDRESS row can be associated with each CUSTOMER row (remember this was taken care of in db4o by storing object references directly). For example, to get the ADDRESS for Ben Roethlisberger, we could write SQL like this:

select addresses.id, street1,street2, city, zip, country 
    from customers, addresses 
    where customer.id = address.id;

There is no need for a foreign key, as the simple one-to-one relationship means we can just match the ids in each table. For objects with one-to-many relationships, the configuration files can be written so that the required foreign keys are generated.

This is only part of the story, however. Replication requires that the database keeps track of changes to objects so that it only copies those that have been changed. We also need to be able to associate each table row with the correct object in a db4o database. dRS creates some additional fields to help with these requirements. These are the same for every class. The output from the MySQL describe command below shows all of the fields in the Customer table, while the following queries show some of the values stored:

mysql> describe customers;
+--------------------+--------------+------+-----+---------+-------+
| Field              | Type         | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+-------+
| id                 | bigint(20)   | NO   | PRI |         |       |
| title              | varchar(255) | YES  |     | NULL    |       |
| firstName          | varchar(255) | YES  |     | NULL    |       |
| lastName           | varchar(255) | YES  |     | NULL    |       |
| email              | varchar(255) | YES  |     | NULL    |       |
| phoneNumber        | varchar(255) | YES  |     | NULL    |       |
| faxNumber          | varchar(255) | YES  |     | NULL    |       |
| creditLine         | int(11)      | YES  |     | NULL    |       |
| contactCode        | varchar(255) | YES  |     | NULL    |       |
| drs_uuid_long_part | bigint(20)   | YES  |     | NULL    |       |
| drs_version        | bigint(20)   | YES  |     | NULL    |       |
| drs_provider_id    | bigint(20)   | YES  | MUL | NULL    |       |
+--------------------+--------------+------+-----+---------+-------+
12 rows in set (0.01 sec)

mysql>  select lastname, drs_version, drs_provider_id from customers;
+----------------+-------------+-----------------+
| lastname       | drs_version | drs_provider_id |
+----------------+-------------+-----------------+
| Hasselbeck     |           1 |               2 |
| Roethlisberger |           1 |               2 |
+----------------+-------------+-----------------+
2 rows in set (0.00 sec)

mysql> select street1, drs_version, drs_provider_id from addresses;
+-----------------+-------------+-----------------+
| street1         | drs_version | drs_provider_id |
+-----------------+-------------+-----------------+
| 2 Second Street |         100 |               2 |
| 1 First Street  |         100 |               2 |
+-----------------+-------------+-----------------+
2 rows in set (0.00 sec)

drs_version is the version number, which is compared to the db4o object version number when replicating to determine which objects have changed. drs_provider_id identifies the provider, which means the db4o database that provided this object. Both rows here have the drs_provider_id, as they all came from the same source. There is also a field, drs_uuid_long_part (not shown here), which is used by dRS to match a table row to a specific object in the database based on its UUID.

In addition to creating additional fields, dRS creates a set of tables to assist the replication process. These include the following:

Merging Databases

Let's bring another db4o database into the picture. The second database, stored in a file called mobile2.yap, also has two Customer and two Address objects. The code to create the database is identical to the first example, apart from the database filename and the actual field values, so it is not shown here. The replication code is also identical, and the same configuration files are used. The contents of the MySQL database now look like this:

mysql>  select id, firstname, lastname, email, creditline from customers;
+----+-----------+----------------+-------------------+------------+
| id | firstname | lastname       | email             | creditline |
+----+-----------+----------------+-------------------+------------+
| 1  | Matt      | Hasselbeck     | matt@seahawks.com |      16000 |
| 2  | Ben       | Roethlisberger | ben@steelers.com  |      25000 |
| 3  | Jake      | Delhomme       | jake@panthers.com |      45000 |
| 4  | Jake      | Plummer        | jake@broncos.com  |      12000 |
+----+-----------+----------------+-------------------+------------+
4 rows in set (0.00 sec)

mysql>  select lastname, drs_version, drs_provider_id from customers;
+----------------+-------------+-----------------+
| lastname       | drs_version | drs_provider_id |
+----------------+-------------+-----------------+
| Hasselbeck     |           1 |               2 |
| Roethlisberger |           1 |               2 |
| Delhomme       |           1 |               3 |
| Plummer        |           1 |               3 |
+----------------+-------------+-----------------+
4 rows in set (0.00 sec)

mysql>  select id, street1, street2, city, zip, country from addresses;
+----+-----------------+----------+------------+----------+---------+
| id | street1         | street2  | city       | zip      | country |
+----+-----------------+----------+------------+----------+---------+
| 1  | 2 Second Street | Uptown   | Seattle    | WA 98200 | USA     |
| 2  | 1 First Street  | Downtown | Pittsburgh | PA 15121 | USA     |
| 3  | 4 Fourth Street | Downtown | Charlotte  | NC 28300 | USA     |
| 4  | 3 Third Street  | NULL     | Denver     | CO 80300 | USA     |
+----+-----------------+----------+------------+----------+---------+
4 rows in set (0.00 sec)

mysql>  select street1, drs_version, drs_provider_id from addresses;
+-----------------+-------------+-----------------+
| street1         | drs_version | drs_provider_id |
+-----------------+-------------+-----------------+
| 2 Second Street |         100 |               2 |
| 1 First Street  |         100 |               2 |
| 4 Fourth Street |         100 |               3 |
| 3 Third Street  |         100 |               3 |
+-----------------+-------------+-----------------+
4 rows in set (0.00 sec)

The new table rows have a different drs_provider_id value, indicating that they were provided by a different database. The street2 field of one of the Address objects was null in the db4o database, which is reflected in a NULL value in MySQL.

Updating Objects

Let's see what happens when an object is updated in a db4o database. The following Java code updates a Customer object in each database: the creditLine field for Ben Roethlisberger is changed in one, while Jake Plummer's email address is changed in the other. In db4o, you first query the database to get an in-memory reference to the object you want to update. Then, you make the required change to the object through that reference, and finally, you call ObjectContainer#set to store the changes in the object in the database.

Note that db4o provides some alternative ways to express your queries. Here we want to find a specific Customer by lastName, so we can use the simplest query-by-example (QBE) method. We create a template Customer object, with only the field(s) we want to match populated (the others are not assigned and so are either null or zero). Then we call ObjectContainer#get using the template object--in effect, we are saying "get all objects in the database that match this template." The null or zero fields are ignored when matching the stored objects with the template. The result of calling get is an iterable collection, actually a db4o-type ObjectSet. In the example code we just retrieve the first result from the ObjectSet.

ExtDb4o.configure().generateUUIDs(Integer.MAX_VALUE);
ExtDb4o.configure().generateVersionNumbers(Integer.MAX_VALUE);
ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap");
        
Customer template1 = new Customer();
template1.setLastName("Roethlisberger");
Customer c1 = (Customer) objectContainer.get(template1).next();
c1.setCreditLine(72000);
        
objectContainer.set(c1);
objectContainer.commit();
objectContainer.close();

ObjectContainer objectContainer = Db4o.openFile("c:/mobiletwo.yap");
Customer template2 = new Customer();
template2.setLastName("Plummer");
Customer c2 = (Customer) objectContainer.get(template2).next();
System.out.println("Updating customer: " + c2.getLastName());
c2.setEmail("Jake@myIsp.com");
        
objectContainer.set(c2);
objectContainer.commit();
objectContainer.close();

QBE is a very limited query method. For example, it can only match specific values; it can't do "greater than" or "less than" queries. It was OK for the above example, but for more complex queries the recommended method is native queries, supported in db4o since version 5.0. Native queries are expressed in Java itself (or in C# or VB, in the .NET version), rather than in an external language like SQL or Hibernate Query Language (HQL), and so they can be checked at compile time and can benefit from IDE code hints and refactoring support. An example of a simple native query is given in the next part of this article.

The replication is repeated in exactly the same way as before after updating the db4o databases, and the CUSTOMERS table now looks like this:

mysql>  select id, firstname,lastname, email, creditline from customers;
+----+-----------+----------------+---------------------+------------+
| id | firstname | lastname       | email               | creditline |
+----+-----------+----------------+---------------------+------------+
| 1  | Matt      | Hasselbeck     | matt@seahawks.com   |      16000 |
| 2  | Ben       | Roethlisberger | ben@steelers.com    |      72000 |
| 3  | Jake      | Delhomme       | jake@panthers.com   |      45000 |
| 4  | Jake      | Plummer        | Jake@myIsp.com      |      12000 |
+----+-----------+----------------+---------------------+------------+
4 rows in set (0.00 sec)

mysql>  select lastname, drs_version, drs_provider_id from customers;
+----------------+-------------+-----------------+
| lastname       | drs_version | drs_provider_id |
+----------------+-------------+-----------------+
| Hasselbeck     |           1 |               2 |
| Roethlisberger |         100 |               2 |
| Delhomme       |           1 |               3 |
| Plummer        |         100 |               3 |
+----------------+-------------+-----------------+
4 rows in set (0.00 sec)

The updated values have been replicated into MySQL, and the drs_version field has been updated to show that these two objects only have been newly replicated.

Going the Other Way

Replication is bidirectional, so it is possible to update values in the MySQL database and replicate these back to the original db4o databases. In the following example, we'll update a couple of rows of the CUSTOMERS table. We'll increase the creditline value for customers whose contactcode is Joe. If you look at the original data stored in the db4o databases, you can see that this means the two Customers originally from the mobileone.yap database. We have to be careful, though. If we just run a simple update query like this:

update customers
set creditLine = creditLine + 20000,
where customers.contactCode = 'Joe';

then dRS will not know about the changes, and the new values will not be replicated. We can make sure they are replicated by also updating the drs_version value for each affected row. If this is set to be greater than the highest version value in the REPLICATIONRECORD table, then dRS will recognize that these rows have changed since the last replication round. The following version of the update query does this:

update customers
set
creditLine = creditLine + 20000,
drs_version = (select max(version) from replicationrecord) + 1
where customers.contactCode = 'Joe';

After executing the update query, we can look at the new data and see that the specified rows have been updated:

mysql> select id, firstname,lastname, creditline, contactcode from customers;
+----+-----------+----------------+------------+-------------+
| id | firstname | lastname       | creditline | contactcode |
+----+-----------+----------------+------------+-------------+
|  1 | Matt      | Hasselbeck     |      36000 | Joe         |
|  2 | Ben       | Roethlisberger |      92000 | Joe         |
|  3 | Jake      | Delhomme       |      45000 | Sue         |
|  4 | Jake      | Plummer        |      12000 | Sue         |
+----+-----------+----------------+------------+-------------+
4 rows in set (0.00 sec)

The code to replicate from Hibernate to a db4o database is very similar to the forward-replication code. We just specify that we want to replicate objects changed in provider B rather than provider A.

ExtDb4o.configure().generateUUIDs(Integer.MAX_VALUE);
ExtDb4o.configure().generateVersionNumbers(Integer.MAX_VALUE);
ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap");
Configuration config = new Configuration().configure("hibernate.cfg.xml");
ReplicationSession replication = Replication.begin(container, config);
ObjectSet changed = replication.providerB().objectsChangedSinceLastReplication();
while (changed.hasNext())
    replication.replicate(changed.next());
replication.commit();
replication.close();
objectContainer.close();

After doing this, the relevant objects in mobileone.yap should have updated creditLine values. We could use the ObjectManager to browse the db4o database, but instead let's see how it's done using a db4o native query. The code to run a query that finds all objects with a contactCode that equals Joe is shown below. The query is expressed as an anonymous Predicate class that implements a method match. The match method contains the code that defines the query. All Customer objects in the database are candidates for matching, and the query returns a List of those Customer objects for which the match method returns true. Some clever optimization in the background means that this query actually executes without having to instantiate every candidate object in order to match it, which means that native queries can be fast. The code is written in Java, using the Predicate interface from the db4o API; there are no embedded query strings in SQL or any other query language. The only string value is the actual search value.

ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap");
List<Customer> customers = objectContainer.query(new Predicate<Customer>() {
    public boolean match(Customer customer) {
        return customer.getContactCode().equals("Joe");
    }
});
for (Customer customer : customers){
    System.out.println("Name: " + customer.getLastName() + 
       ", Credit Line: " + customer.getCreditLine());
}
objectContainer.close();

The output shows that the values updated in MySQL have been successfully replicated to the db4o database:

Querying Mobile1...
Name: Roethlisberger, Credit Line: 92000
Name: Hasselbeck, Credit Line: 36000

This example demonstrated the replication of MySQL updates made directly using SQL. If the MySQL database is instead updated by a Java application using Hibernate, then the dRS ReplicationConfigurator class can be used within that application to install an event listener. This listener detects updates or inserts done through Hibernate, and automatically updates the drs_version field in the same way as we did manually above. It can also generate version numbers and UUIDs for data inserted with Hibernate. The dRS listener transparently ensures that Hibernate updates can be correctly replicated back to db4o. The listing below shows installation of a listener followed by updating with Hibernate. The objects to be updated are retrieved using HQL. The result after replication is the same as that of the SQL update described above.

Configuration cfg = new Configuration().configure("hibernate.cfg.xml");
ReplicationConfigurator.configure(cfg);
SessionFactory sessionFactory = cfg.buildSessionFactory();
Session session = sessionFactory.openSession();
ReplicationConfigurator.install(session, cfg);
Transaction tx = session.beginTransaction();

List result = session.createQuery(
    "from Customer customer where customer.contactCode = 'Joe'").list();
Iterator it = result.iterator();
while(it.hasNext()){
    Customer cust = (Customer)it.next();
    cust.setCreditLine(cust.getCreditLine()+20000);
    session.save(cust);
}  
tx.commit();
session.close();

Agile Development: Schema Evolution

One of the characteristics of agile development is that designs are often refactored. As result, the data model may evolve during the lifetime of a project. What happens if the data model of the db4o databases is changed? First, let's see a simple example of how db4o can handle schema changes to a live native database. Let's say it has been decided that the two separate street1 and street2 fields in the Address class are no longer appropriate, and that we would rather have a single street field. However, there is data in the databases, so we want to make sure of the following:

The obvious thing to do is to modify the Address class definition to include the new field. We can also include the following method, onActivate. db4o will call a method with this name, if it exists, for any object when it is activated--in other words, when it is retrieved with a query and a reference to it is created in memory.

public void objectOnActivate(ObjectContainer container) {
    if((street == null) && (street1 !=null)) {
        street = street1;
        if (street2 != null)
            street = street + ", " + street2;
        }
    }
}

Next, we run a query for each db4o database that activates and re-stores all of the objects in the database:

ObjectContainer objectContainer = Db4o.openFile("c:/mobileone.yap");
ObjectSet data = objectContainer.get(Address.class);
while(data.hasNext()){
    objectContainer.set(data.next());          
}       
objectContainer.close();    
        
objectContainer = Db4o.openFile("c:/mobiletwo.yap");
data = objectContainer.get(Address.class);
while(data.hasNext()){
    objectContainer.set(data.next());          
}       
objectContainer.close();

Now the schema has been changed. The values stored in the original fields remain in the database, but all objects will contain the correct value in the new street field, and all new objects should use the refactored field. Figure 3 shows the Address object tree node in one db4o database viewed in the ObjectManager.

Figure 3
Figure 3. The Address objects in the refactored db4o database

We could now change the class definition for Address again by removing the original street1 and street2 fields, leaving only the refactored field. This would ensure that new objects only have values in the new street field, and that only the new field can be accessed by queries. This would still not remove the old data from the database.

Now that the db4o database schema has been changed, the MySQL schema will also need to be changed, and the data updated to match the changes in the db4 databases. This can be done easily by repeating the replication process. The only intervention that is required before doing this is to alter the Hibernate configuration file for the Address class, Address.hbm.xml, so that only the new street field is included in the mapping.

<!-- Address.hbm.xml -->
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
                "-//Hibernate/Hibernate Mapping DTD 3.0//EN"
                "http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">

<hibernate-mapping default-access="field" default-lazy="false"
    default-cascade="all">
    <class name="com.onjava.db4o.drs.Address" table="ADDRESSES">
        <id column="id" type="java.lang.Long">
            <generator class="native"/>
        </id>
        <property name="street" type="java.lang.String"/>  
        <property name="city" type="java.lang.String"/>
        <property name="zip" type="java.lang.String"/>
        <property name="country" type="java.lang.String"/>
    </class>
</hibernate-mapping>

The replication process is repeated, again simply by executing the code listed earlier, and the ADDRESSES table in the MySQL database now contains the data shown in the query below. The street column has been added to the table and is populated with the correct data. The original fields are still in the table (though not shown in this query), but new objects will not map onto them.

mysql>  select id, street, city, zip, country from addresses;
+----+---------------------------+------------+----------+---------+
| id | street                    | city       | zip      | country |
+----+---------------------------+------------+----------+---------+
| 1  | 2 Second Street, Uptown   | Seattle    | WA 98200 | USA     |
| 2  | 1 First Street, Downtown  | Pittsburgh | PA 15121 | USA     |
| 3  | 3 Third Street            | Denver     | CO 80300 | USA     |
| 4  | 4 Fourth Street, Downtown | Charlotte  | NC 28300 | USA     |
+----+---------------------------+------------+----------+---------+
4 rows in set (0.00 sec)

Conclusion

db4o is a compact native object database that is particularly well suited for use embedded within distributed applications involving partially connected devices (although it can also be useful in many other types of system). It offers significant advantages over relational databases in this role: it offers a small footprint and zero administration (you don't have to create tables, as the application's classes directly define the database schema), and can lead to a reduction in development time. Since it basically uses the application's object model directly, it can easily handle models that are difficult to map to relational tables.

On the other hand, it stores data in a way that is unfamiliar to most database developers, with data closely tied to the application code. These points will make some companies nervous: they prefer to have data in an enterprise RDBMS, for example, for access by other applications. db4o's new dRS replication service rewrites some of the rules here: it uses Hibernate to allow easy replication of native object data to any RDBMS supported by Hibernate. db4o is well suited to agile development, as refactoring is well supported, and replication carries this support forward into the RDBMS. For those who are interested in the idea of object databases, but have good reasons for using relational databases, too, db4o and dRS are worth a look.

Resources

Jim Paterson is a Lecturer at Glasgow Caledonian University in the UK , specializing in web development and object-oriented software development.


Return to ONJava.com.

Copyright © 2009 O'Reilly Media, Inc.