PHP DevCenter
oreilly.comSafari Books Online.Conferences.


Pitfalls of Transactions with PHP

by Kimberlee Jensen

Why Transactions?

Transactions -- a database management system (DBMS) feature that enables data recovery on internal errors -- raise significant controversy among developers. Some view transactions as a crutch, while others view them as a necessity.

Using transactions can save time and money, because transactions enable databases to restore themselves. But inappropriate use of transactions means wasted overhead and impeded performance. And transactions do not replace programming; sadly, some developers have foregone error trapping and data validation, thinking that transactions will catch all errors.

Transaction support in a DBMS or its table types is defined using the ACID test:

  • Atomicity is an all-or-none proposition.
  • Consistency guarantees that a transaction never leaves your database in a half-finished state.
  • Isolation keeps transactions separated until they're finished.
  • Durability guarantees that the database will keep track of pending changes in such a way that the server can recover from an abnormal termination.

Initially, the MySQL DBMS did not support transactions. As of version 3.23-max, MySQL supports transactions with two additional table types, InnoDB and BDB. InnoDB is recommended for its CPU efficiency and support of row-level and multiple-concurrency locking. Now that MySQL has full transaction support, PHP developers need to incorporate them effectively. Although PHP 4.x lacks native transaction functions, transaction statements can be used the same way as standard SQL queries.

Related Reading

Web Database Applications with PHP, and MySQL
By Hugh E. Williams, David Lane

When to Use Transactions

To maximize the use of transactions, developers should follow these simple guidelines:

  • Use transactions for mission-critical actions that must all complete. Completing an online purchase is mission-critical. Browsing a list of top sellers is not.

  • Use transactions for data subject to race conditions. Even though the database server and operating system are multi-threaded and can process multiple queries concurrently, the specific group of queries must be processed in order and cannot be processed independently out of order.

  • Use transactions in conjunction with, not as a replacement for data validation and programmatic error trapping.

By supporting transactions, the DBMS provides an extreme level of reliability and drastically reduces the need for manual recovery and data integrity validation. However, any transaction, by definition, adds overhead because at some level, the DBMS must lock data and take a snapshot of the data at the outset. So transactions must be used appropriately.

Transactions Versus Errors

Developers can sometimes misunderstand the definition of a transaction failure. The transaction will fail when the database engine itself has an error that causes a query or queries not to complete. The most common misconception is that 0 rows returned or affected are considered a failure. For example, if a query of DELETE FROM tablex WHERE fieldy = 10 returns 0 affected rows, that is considered a successful query according to the transaction definition because the query completed and returned its affected rows. Whether the number of affected rows matches the expected result is a matter for programming, not transactions.

Some developers believe transactions mean the elimination of programming, even using transactions in lieu of programmatic data validation. This is wrong: transactions are intended to recover errors made internally by the database engine or externally by extreme measures such as a power outage. Transactions were never intended to account for user errors.

Web developers must trap errors because transactions do not prevent data from being lost between the database and the web application. Theoretically, a transaction could complete successfully, but the results of the transaction could fail to reach the web application. If the web application connects to the database server through a local host connection (using the UNIX socket), this is generally not an issue unless the socket becomes corrupted or disabled. However, if the web application connects to the database via the TCP/IP socket, a packet can be dropped in transmission. From the database standpoint, the transaction has succeeded, but if the web application has lost data, it has failed.

For example, the database could execute a SELECT query successfully and return a result set, but if those data rows are lost in the transfer to the web application, the application would show 0 as the number of rows. This is why before looping through a result set, a developer always needs to verify that mysql_num_rows(result identifier) is non-zero to make sure the web application has received the data successfully.

Assuming transactions cover all possible failures is a mistake. Programmers still need to program!

As well, database developers still need to plan. Before the programmers begin to code the web application, the database developers must make informed decisions about when to use a transaction-safe table. MyISAM table types typically execute SELECT queries much faster than InnoDB types. According to recent PHP Oracle/MySQL benchmarks, InnoDB table types handle more inserts per second than BDB table types.

For this example we will use InnoDB for our transaction-essential tables and MyISAM for tables not requiring transactions.

An Illustrated Example

Let's model a fictional e-commerce site that sells MP3 files. The database will need a table to store customer contact information and another table to represent each individual MP3 file the customer is purchasing. We will also assume the use of temporary data to hold the user's shopping cart information. We will then transfer the information to permanent tables upon checkout. Any shopping cart information that is not transferred to the permanent tables can be purged. We will use temporary tables instead of sessions to hold the shopping cart data because we want to keep statistics on user behavior, such as placing items in a shopping cart but not completing the transaction. (Once our fictional web site is as successful as Amazon, we may switch to sessions that hold shopping cart data).

Example Tables

The tables involved with the completion of an order will include:

  • customers
  • orders
  • order_detail
  • temp_orders
  • temp_order_detail

Here are two of the proposed tables in the model:

The order_detail table:

CREATE TABLE order_detail (
    order_ID  INT UNSIGNED,
    mp3_ID    INT UNSIGNED)

The order table:

    customer_ID   INT UNSIGNED,
    order_date    TIMESTAMP(8),
    total         FLOAT (6, 2),
    temp_order_ID INT UNSIGNED)

Transactions in the Example

The pseudo-code for what transpires during checkout reads as such:

  1. Insert customer data into the customer table.
  2. Transfer data from the temporary order tables to the permanent order tables.
  3. Update the permanent order tables to reflect their relationships to the customer table.
  4. Delete data from the temporary order tables (if necessary).

This process exemplifies a transaction, as all stages contribute to a single goal — the completion of a sale. If the database crashes during a single step of this process, we immediately want to recover and restore the information. For example, if the customer information fails to load, we want to revert the database state back to the previous state, preserving the shopping cart data in the temporary order tables.

These queries must execute in the specified order. If the deletion of data from the temporary tables executes before the data is transferred to the permanent tables, the operation will fail. Using transaction-safe tables will prevent that problem. To ensure that these tables use the InnoDB handler, the developer can use the ALTER statement to change the table type after creation, or tack the table type on at the end of the CREATE statement. The latter is preferred.

    customer_ID   INT UNSIGNED,
    order_date    TIMESTAMP(8),
    total         FLOAT (6, 2),
    temp_order_ID INT UNSIGNED) Type=InnoDB

One database can use numerous table types. Assuming we are selling MP3 files legitimately for download, we can assume a table to hold the name, description, and price of each song. We will also assume separate linking tables to link songs to categories (assuming a many-to-many relationship), songs to artists, songs to CDs, and songs to record labels. These tables will be searched frequently. The process of adding to the inventory is done in batch format when traffic is slow. Also, since these are MP3 files, the purchasing of a song does not affect inventory; adding a song to a shopping cart does not impact the song table whatsoever. Therefore, these tables should be MyISAM, which processes SELECT queries the fastest. This the default type in MySQL as of version 3.23.

The programming logic can catch any SQL queries that fail or return no results. To make these SELECT queries that search the inventory part of a transaction would be a waste of overhead.

The best strategy for working with transactions is to apply the try/catch programming strategy from Java. Let's examine a model of troubleshooting with PHP.

Web application coding model
Figure 1. Web application coding model.

Pages: 1, 2

Next Pagearrow

Valuable Online Certification Training

Online Certification for Your Career
Earn a Certificate for Professional Development from the University of Illinois Office of Continuing Education upon completion of each online certificate program.

PHP/SQL Programming Certificate — The PHP/SQL Programming Certificate series is comprised of four courses covering beginning to advanced PHP programming, beginning to advanced database programming using the SQL language, database theory, and integrated Web 2.0 programming using PHP and SQL on the Unix/Linux mySQL platform.

Enroll today!

Sponsored by: