ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)
 See this if you're having trouble printing code examples


Simplify Business Logic with PHP DataObjects

by Darryl Patterson
08/05/2004

Simple Data Objects with PHP 5

Working with a database is an everyday exercise for web developers these days. From simple form processing to large-scale web applications, we almost always need a database. After working on a few projects, it doesn't take very long to realize that the same four simple database tasks repeat many times in virtually all projects. These four tasks are:

  1. Finding records (SELECT).
  2. Updating existing records (UPDATE).
  3. Adding new records (INSERT).
  4. Removing records (DELETE).

You find yourself re-typing or copying and pasting your queries throughout your code, slightly modifying them as necessary in various places. Many folks resort to a data abstraction layer, such as Pear::DB or DBX. Using a data abstraction layer is a good thing, but its main goal is to make the RDBMS transparent, allowing you to change database vendors fairly easily. A database abstraction layer doesn't offer a way to abstract the table structure of your database, since it can't standardize the way your functional code accesses data in your database. This is where having a separate data access layer can help out. You can implement data access layers in more than one way, but here, we're interested in DataObjects.

The concept of DataObjects comes from some very well-documented design patterns, namely the Data Access Object and Transfer Object patterns. These patterns can be quite complex, but with a little imagination, we can use their fundamental ideas and purpose to make our own data layer in PHP 5 much easier to work with. Let's dig in.

What is a DataObject?

As just mentioned, the idea of a DataObject comes from a design pattern. If you've ever looked into design patterns at all, you'll know that they often depend on object orientation. That said, we will make extensive use of PHP 5's new object model. We'll also use MySQL for the examples (although it's also easy to use something like Pear::DB).

Essentially, a DataObject is a coded class that directly represents a table in your database -- you will code a class for every table. The class will have member variables that exactly match the fields in your table, as well as a set of methods or functions that perform at least the four basic tasks mentioned above. Suppose we have a simple table for holding user info:

TABLE: User
userId  INT
firstName VARCHAR(30)
lastName  VARCHAR(40)
email   VARCHAR(100)

Now that we have a table, we'll code a corresponding class whose member variables have the same field names found in the table. I tend to prepend my DataObject class names with DO_ to avoid confusing them with other classes of similar names; this is a common pattern in PHP to simulate namespaces. Here's the code:

class DO_User {
  public $userId;
  public $firstName;
  public $lastName;
  public $email;
}

This simple little wrapper represents a single row from our table. An instance of DO_User can hold only one row of data at a time. How do we fetch data from the database into this object? Let's add a new method called get() to our class to query the database for one specific user. We'll provide the userId (the primary key) of the user we want to fetch as a parameter.

File contents of: class-DO_User.php
<?php
class DO_User {
  public $userId;
  public $firstName;
  public $lastName;
  public $email;

  // This function will perform a select on the table looking for
	// a specific userId.
  public function get($userId)
  {
    $sql = 'SELECT * FROM User WHERE userId='
			 . mysql_escape_string($userId);
    $rs  = mysql_query($sql);
    $row = mysql_fetch_array($rs);

    $this->userId  = $row['userId'];
    $this->firstName = $row['firstName'];
    $this->lastName  = $row['lastName'];
    $this->email   = $row['email']
  }
}
?>

With this very simple data object, we can now interact with our database using only PHP code. We need no SQL to use our database. Here is a script that uses this DataObject to fetch and display the user's data to a web browser:

<?php
include_once('class-DO_User.php');
$user = new DO_User();

// We'll use a literal integer here, 
// but this could come from anywhere,
// such as $_POST or $_GET
$user->get(5);

?>
<html>
  <head>
    <title>User Info</title>
  </head>
  <body>
    <p>Here is the user info:</p>
    <table border="1">
      <tr>
        <td>User ID</td>
        <td><?=$user->userId?></td>
      </tr>
      <tr>
        <td>First Name</td>
        <td><?=$user->firstName?></td>
      </tr>
      <tr>
        <td>Last Name</td>
        <td><?=$user->lastName?></td>
      </tr>
      <tr>
        <td>Email</td>
        <td><?=$user->email?></td>
      </tr>
    </table>
  </body>
</html>

As you can see, using the DataObject is quite simple and very clean. The get() method we added performs a simple query on the user table, searching for the specific primary key (userId) for which we're looking. Remember, whenever you query a table looking for equality on a primary key, you'll only receive one record back. This works well for our simple DataObject, but we'll discuss retrieving multiple rows a little later.

Notice that when the DataObject has found the record, it copies the record data into the DataObject's member variables. This is why the names of the member variables in the DataObject must match the column names in the table exactly.

Data Access Versus Database Abstraction

Let's take a moment to discuss data access layer vs. database abstraction layer. As the introduction explained, these are two different things. You can use one or the other, or even both for the best overall flexibility.

The database abstraction layer hides the RDBMS running in the background. If you're smart and careful about how you write your SQL, you can easily switch from one database server to another, without changing any function calls or SQL. Pear::DB does this quite nicely.

A data access layer, on the other hand, hides the underlying table structure. By writing a data access layer to represent your tables, you can interact with the data in your tables without using any SQL in the business layer (where all the functional code lives) of your application. DataObjects are a good choice for the data access layer, as they have a direct relationship to tables, so you can reuse them across your whole application without modifying them.

Before DataObjects, my data layer consisted of classes that encapsulated and grouped queries based on functionality; for example, all queries related to registering and managing a user's login. The problem with this kind of layer is that different bits of unrelated functionality had similar queries. To log someone into the application, I need a user record. I also need that record to display an "edit user info screen." I had to to copy and paste the same query into two separate classes, due to the architecture of the code. With DataObjects, if I need a user's record, I use the DataObject that represents that table, regardless of where I am in the code.

You can use database abstraction and data access layers together. Our DataObjects simply use Pear::DB instead of PHP's native database functions. Most of the time, I forego using Pear::DB, as I know the chances of changing the underlying RDBMS is next to nil. When I know there is a good chance of the RRDBMS changing within the next five years, I'll use Pear::DB just to be safe. Otherwise, using PHP's native DB functions are faster to code, and execute faster. Pear::DB adds another layer of abstraction that results in a slight slowdown in code execution.

Inserting Rows

Retrieving records is a good start, but we need more. We need to add new records to the table, so let's add another method to our DataObject that will perform an INSERT [file source]:

public function insert()
{
  $sql = "INSERT INTO
        User
      SET
        firstName='" . mysql_escape_string($this->firstName) . "',
        lastName='"  . mysql_escape_string($this->lastName)  . "',
        email='"   . mysql_escape_string($this->email)   . "'";
  
  mysql_query($sql);
  $this->userId = mysql_insert_id();
}

Isn't this clever? You may wonder where the data to insert comes from. That data needs to be in the data object before we call this, as we don't pass on the data we want to insert as parameters of the insert() method. The code retrieves the last insert_id and stores it within the DataObject so the object is now usable for related inserts on other tables. Here's how to use this new method:

<?php
include_once('class-DO_User.php');

$user      = new DO_User();
$user->firstName = 'Jane';
$user->lastName  = 'Doe';
$user->email   = 'jane.doe@example.com';

$user->insert();
?>
<html>
  <head>
    <title>INSERT Example</title>
  </head>
  <body>
    <p>The user was added to the User table. 
    The userId is: <?=$user->userId?></p>
  </body>
</html>

This example inserts a new record into the User table. We create the DataObject, set the field values, and then call insert(). Nice and easy. We can also use both the get() and insert() methods together to copy a record. It doesn't end there; we could also make a small change to the record before we do a copy. Want to change the email address? Here's how:

<?php
include_once('class-DO_User.php');
$user = new DO_User();

// Again just using a literal int for now.
$user->get(5);

// Change the email address in the DataObject
// NOTE: This doesn't affect the DB at all, 
// just the value in the DataObject
$user->email = 'jdoe@example.com';

// Next, we call insert() to create a NEW record in the User table
$user->insert();

?>
<html>
  <head>
    <title>Copy Row INSERT Example</title>
  </head>
  <body>
    <p>The user was copied to the User table. 
    The userId is: <?=$user->userId?></p>
  </body>
</html>

I'm excited just writing this! Hopefully you're starting to see how easy it is to interact with your database when using DataObjects.

Updating Rows

Retrieving a record, changing a few values, and inserting it as a new record can be valuable. More often, you'll want to update that record with the changes. Let's add an update() method to our DataObject [file source]:

public function update()
{
  $sql = "UPDATE
        User
      SET
        firstName='" . mysql_escape_string($this->firstName) . "',
        lastName='"  . mysql_escape_string($this->lastName)  . "',
        email='"   . mysql_escape_string($this->email)   . "'
      WHERE
        userId="   . mysql_escape_string($this->userId);
  
  mysql_query($sql);
}

This is just as clever as the insert(). Once again, the data we're working with needs to be in the DataObject before this point. This method simply uses that existing data to perform the update. Here's how to use it:

<?php
include_once('class-DO_User.php');
$user = new DO_User();

// Again just using a literal int to get a row.
$user->get(5);

// Change the email address
$user->email = 'janedoe@example.com';

// Perform the update
$user->update();

?>
<html>
  <head>
    <title>UPDATE Example</title>
  </head>
  <body>
    <p>The user updated. The userId is: <?=$user->userId?></p>
  </body>
</html>

When I first looked into DataObjects, I didn't see the value in them. It seemed like a lot of setup in coding of all the DataObjects, with little return. It wasn't until I started using them that their value truly started to show. Can updating a user's email address in the DB be any easier than that last example? Not only is the code here compact, but it's highly readable and clean -- two things I like to see in code.

Deleting Rows

That covers three of the four fundamental database tasks. The advantages of DataObjects should be obvious at this point. Let's consider the DELETE method by adding delete() [file source]:

public function delete()
{
  $sql = "DELETE FROM
        User
      WHERE
        userId=" . mysql_escape_string($this->userId);
  
  mysql_query($sql);
}

Use delete() in the same way as you use update(). Find the row you want to delete ($user->get(5)), then call delete ($user->delete()). Once again, the primary key of the record must already exist in the DataObject.

Result Sets

Now we can find a row and perform an insert, delete, or update on our DO_User DataObject. Remember how I mentioned that our DataObject is a simple wrapper that represents a single row from our table? What if we need to perform a query that returns more than one row? Our DataObject can't handle this, nor should it. We need another wrapper for the result set.

The simplest approach to solving this problem is to create an array of DataObjects that contains all of the found rows. That may seem logical at first, but what if your query returns 10,000, 100,000, or more rows? That'd make a mighty big array of objects, turning your application into a RAM hog and convincing your web host to shut down your site.

We need a proper wrapper for the result set that holds a reference to the native (or Pear::DB) result set. This wrapper won't return an array of each row. Instead, it returns instances of our DO_User DataObject. Pretty cool.

We'll have to write a new class for this. I always call it ReadOnlyResultSet, as the result set itself is read-only (we can't update or delete rows in the result set directly), and it happens to be the name used in the J2EE design patterns (let's hear it for consistency). Before we write this class, we'll add one more method to DO_User to find a bunch of rows using data currently in the DataObject. When it finds the rows, this new function will return a new instance of our new ReadOnlyResultSet class. This is easier than it sounds. We'll start by adding the find() method [file source]:

public function find()
{
  $sql = "SELECT * FROM User";
  
  // This array will hold the where clause
  $where = array();
  
  // Using PHP 5's handy new reflection API
  $class = new ReflectionClass('DO_User');

  // Get all of DO_User's variable (or property) names
  $properties = $class->getProperties();
  
  // Loop through the properties
  for ($i = 0; $i < count($properties); $i++) {
    $name = $properties[$i]->getName();
    if ($this->$name != '') {
      // Add this to the where clause
      $where[] = "`" . $name . "`='" 
      . mysql_escape_string($this->$name) . "'";
    }
  }
  
  // If we have a where clause, build it
  if (count($where) > 0){
    $sql .= " WHERE " . implode(' AND ', $where);
  }
    
  $rs = mysql_query($sql);
  include_once('class-ReadOnlyResultSet.php');
  return new ReadOnlyResultSet($rs);
}

This one is a doozy. PHP 5 has a built-in Reflection API that we can use to retrieve any and all information about a class and/or object. Consider this built-in reverse engineering. We use reflection here to pull the property names (member variable names) from the DataObject. Take a look at this new API some time; it's quite powerful.

Before calling find(), we have to to put data into those properties we want to search. If we want to find people with a certain first name, we set only that property and then call find. The find() method builds the query and returns a ReadOnlyResultSet. Before we explore how to use this method, we need to code the ReadOnlyResultSet class.

Coding ReadOnlyResultSet

The ReadOnlyResultSet class is a wrapper for a result set that returns DataObject instances. I use a generic ReadOnlyResultSet for all projects. It works so well that I never have to modify it. This new wrapper has three methods: getNext(), rowCount(), and reset(). Here's the code for the class [file source]:

File contents of: class-ReadOnlyResultSet.php
<?php 
class ReadOnlyResultSet {
	// This member variable will hold the native result set
  private $rs;

	// Assign the native result set to an instance variable
  function __construct($rs)
  {
    $this->rs = $rs;
  }

	// Receives an instance of the DataObject we're working on
  function getNext($dataobject)
  {
    $row    = mysql_fetch_array($this->rs);

    // Use reflection to fetch the DO's field names
    $class    = new ReflectionObject($dataobject);
    $properties = $class->getProperties();

    // Loop through the properties to set them from the current row
    for ($i = 0; $i < count($properties); $i++) {
      $prop_name        = $properties[$i]->getName();
      $dataobject->$prop_name = $row[$prop_name];
    }
    
    return $dataobject;
  }

  // Move the pointer back to the beginning of the result set
  function reset()
  {
    mysql_data_seek($this->rs, 0);
  }

  // Return the number of rows in the result set
  function rowCount()
  {
    return mysql_num_rows($this->rs);
  }
}
?>

As a note for those used to using & to pass by variables reference, PHP 5 always passes objects by reference now (like Java). That's why I avoid the & in my examples.

Because this is a wrapper for a result set, the constructor for this class requires that we pass a result set. For this example, it's a native MySQL result set, for the reasons previously stated. The reset() method moves the pointer of the native result set back to 0, the beginning. rowCount() returns the number of rows in the native result set. The getNext() method is more curious. Let's break it down a little.

First, because this class needs to work for all DataObjects, we have to have a way to tell this method which DataObject to work with. That's why we have to pass the DataObject we want to fill with the next row's data. This will usually be an empty DataObject. The first line in this method fetches the next row from the native result set with mysql_fetch_array(). Next, we need to loop through all of the passed-in object's properties using the Reflection API. Notice the usage of the class ReflectionObject instead of ReflectionClass. This is because we're trying to reverse engineer an object, not a class.

As the loop executes, it sets the value of each property in our DataObject to the value of each field in our native row. When we're all done, we return the now-filled DataObject. This class works for all DataObjects and you should never need to modify it.

It's time to see ReadOnlyResultSet in action. Here's how to look for all users who have the first name jane:

<?php
include_once('class-DO_User.php');
$user      = new DO_User();

// We'll use a literal string here, 
// but this could come from anywhere,
// such as $_POST or $_GET
$user->firstName = 'jane';

// Call find(), which returns an instance of ReadOnlyResultSet
$rs        = $user->find();
?>

<html>
  <head>
    <title>Found Users</title>
  </head>
  <body>
    <p>Here the found users:</p>
    <table border="1">
      <tr>
        <td>User ID</td>
        <td>First Name</td>
        <td>Last Name</td>
        <td>Email</td>
      </tr>
?>
// Loop through the result set
for ($i=0; $i < $rs-rowCount(); $i++) {

  // Pass on a new instance of DO_User, receiving it back filled in
  $userRow = $rs->getNext(new DO_User());
?>

	// Display the current row in an HTML table
      <tr>
        <td><?=$userRow->userId?></td>
        <td><?=$userRow->firstName?></td>
        <td><?=$userRow->lastName?></td>
        <td><?=$userRow->email?></td>
      </tr>
<?php
}
<?php
    </table>
  </body>
</html>

Notice that when we call $rs->getNext() we needed to pass a new DO_User instance. This is how the ReadOnlyResultSet knows which kind of DataObject it's working with. The ReadOnlyResultSet populates this new DataObject with the next row's data, returning the now-populated DataObject. Remember, PHP 5 always passes objects by reference, so the new DataObject passed on to getNext() is the exact same object that returned from getNext(). Once again, compact, readable and clean code.

Further Ideas

Beyond these basic five methods, you can add more functionality as needed. I often add a method to authenticate a user upon login. This method expects the username and password as parameters and returns true or false. If it returns true, I fill the DataObject with the found record. If you need very specific or more complex finding, add a new method to accomplish that task. The fundamental methods shown in this article are just the starting point for your own DataObjects.

It's actually quite possible to generalize these five methods and drop them into a super class. The Reflection API makes this pretty easy to do.

Also note that there is one limitation to this simple DataObject implementation -- you can't easily do joins. When I need a join, I create view DataObjects that only return a ReadOnlyResultSet, since you generally can't do joined INSERTs, DELETEs, and UPDATEs. There are other ways to accomplish joins, though.

Finally, you may have noticed that there is a Pear::DB_DataObject package available. This is a solid package, but it can be a bit of a challenge to figure out if you're not already familiar with the concepts of DataObjects. This article makes a good introduction to Pear::DB_DataObject, a package I hope to write about in the future.

This brings us to the end of our first look at DataObjects. What we've seen here is a simple implementation of DataObjects using PHP 5. With some small modifications, you can adapt this implementation to work with PHP 4, as well as with any database abstraction layer you wish (such as Pear::DB or DBX). Many folks also write up a script that automatically generates the initial DataObjects by querying the database's table structure and creating the necessary PHP files. This can save a lot of time.

In future articles, I hope to cover Pear::DB_DataObject and PHP 5's new Reflection API.

Resources

Darryl Patterson is a senior instructor at Centennial College in Toronto, where he has developed and taught many programming courses including PHP, SQL, Java/J2EE, HTML, JavaScript and CGI/Perl.


Return to the PHP DevCenter

Copyright © 2009 O'Reilly Media, Inc.