Transactions -- a database management system (DBMS) feature that enables data recovery on internal errors -- raise significant controversy among developers. Some view transactions as a crutch, while others view them as a necessity.
Using transactions can save time and money, because transactions enable databases to restore themselves. But inappropriate use of transactions means wasted overhead and impeded performance. And transactions do not replace programming; sadly, some developers have foregone error trapping and data validation, thinking that transactions will catch all errors.
Transaction support in a DBMS or its table types is defined using the ACID test:
Initially, the MySQL DBMS did not support transactions. As of version 3.23-max, MySQL supports transactions with two additional table types, InnoDB and BDB. InnoDB is recommended for its CPU efficiency and support of row-level and multiple-concurrency locking. Now that MySQL has full transaction support, PHP developers need to incorporate them effectively. Although PHP 4.x lacks native transaction functions, transaction statements can be used the same way as standard SQL queries.
|
Related Reading
Web Database Applications with PHP, and MySQL |
To maximize the use of transactions, developers should follow these simple guidelines:
Use transactions for mission-critical actions that must all complete. Completing an online purchase is mission-critical. Browsing a list of top sellers is not.
Use transactions for data subject to race conditions. Even though the database server and operating system are multi-threaded and can process multiple queries concurrently, the specific group of queries must be processed in order and cannot be processed independently out of order.
Use transactions in conjunction with, not as a replacement for data validation and programmatic error trapping.
By supporting transactions, the DBMS provides an extreme level of reliability and drastically reduces the need for manual recovery and data integrity validation. However, any transaction, by definition, adds overhead because at some level, the DBMS must lock data and take a snapshot of the data at the outset. So transactions must be used appropriately.
Developers can sometimes misunderstand the definition of a transaction
failure. The transaction will fail when the database engine itself has an error
that causes a query or queries not to complete. The most common misconception
is that 0 rows returned or affected are considered a failure. For example, if a
query of DELETE FROM tablex WHERE fieldy = 10 returns 0 affected
rows, that is considered a successful query according to the transaction
definition because the query completed and returned its affected rows. Whether
the number of affected rows matches the expected result is a matter for
programming, not transactions.
Some developers believe transactions mean the elimination of programming, even using transactions in lieu of programmatic data validation. This is wrong: transactions are intended to recover errors made internally by the database engine or externally by extreme measures such as a power outage. Transactions were never intended to account for user errors.
Web developers must trap errors because transactions do not prevent data from being lost between the database and the web application. Theoretically, a transaction could complete successfully, but the results of the transaction could fail to reach the web application. If the web application connects to the database server through a local host connection (using the UNIX socket), this is generally not an issue unless the socket becomes corrupted or disabled. However, if the web application connects to the database via the TCP/IP socket, a packet can be dropped in transmission. From the database standpoint, the transaction has succeeded, but if the web application has lost data, it has failed.
For example, the database could execute a SELECT query
successfully and return a result set, but if those data rows are lost in the
transfer to the web application, the application would show 0 as the number
of rows. This is why before looping through a result set, a developer always
needs to verify that mysql_num_rows(result identifier) is non-zero
to make sure the web application has received the data successfully.
Assuming transactions cover all possible failures is a mistake. Programmers still need to program!
As well, database developers still need to plan. Before the programmers
begin to code the web application, the database developers must make informed
decisions about when to use a transaction-safe table. MyISAM table types
typically execute SELECT queries much faster than InnoDB types.
According to recent PHP Oracle/MySQL benchmarks, InnoDB table types handle more inserts per second than BDB table types.
For this example we will use InnoDB for our transaction-essential tables and MyISAM for tables not requiring transactions.
Let's model a fictional e-commerce site that sells MP3 files. The database will need a table to store customer contact information and another table to represent each individual MP3 file the customer is purchasing. We will also assume the use of temporary data to hold the user's shopping cart information. We will then transfer the information to permanent tables upon checkout. Any shopping cart information that is not transferred to the permanent tables can be purged. We will use temporary tables instead of sessions to hold the shopping cart data because we want to keep statistics on user behavior, such as placing items in a shopping cart but not completing the transaction. (Once our fictional web site is as successful as Amazon, we may switch to sessions that hold shopping cart data).
The tables involved with the completion of an order will include:
customersordersorder_detailtemp_orderstemp_order_detailHere are two of the proposed tables in the model:
The order_detail table:
CREATE TABLE order_detail (
detail_ID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
order_ID INT UNSIGNED,
mp3_ID INT UNSIGNED)
The order table:
CREATE TABLE orders (
order_ID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
customer_ID INT UNSIGNED,
order_date TIMESTAMP(8),
total FLOAT (6, 2),
temp_order_ID INT UNSIGNED)
The pseudo-code for what transpires during checkout reads as such:
customer table.customer table.This process exemplifies a transaction, as all stages contribute to a single goal — the completion of a sale. If the database crashes during a single step of this process, we immediately want to recover and restore the information. For example, if the customer information fails to load, we want to revert the database state back to the previous state, preserving the shopping cart data in the temporary order tables.
These queries must execute in the specified order. If the deletion of data
from the temporary tables executes before the data is transferred to the
permanent tables, the operation will fail. Using transaction-safe tables will
prevent that problem. To ensure that these tables use the InnoDB handler, the
developer can use the ALTER statement to change the table type
after creation, or tack the table type on at the end of the CREATE
statement. The latter is preferred.
CREATE TABLE orders (
order_ID INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
customer_ID INT UNSIGNED,
order_date TIMESTAMP(8),
total FLOAT (6, 2),
temp_order_ID INT UNSIGNED) Type=InnoDB
One database can use numerous table types. Assuming we are selling MP3 files
legitimately for download, we can assume a table to hold the name, description,
and price of each song. We will also assume separate linking tables to link
songs to categories (assuming a many-to-many relationship), songs to artists,
songs to CDs, and songs to record labels. These tables will be searched
frequently. The process of adding to the inventory is done in batch format
when traffic is slow. Also, since these are MP3 files, the purchasing of a song
does not affect inventory; adding a song to a shopping cart does not impact the
song table whatsoever. Therefore, these tables should be MyISAM, which
processes SELECT queries the fastest. This the default type in
MySQL as of version 3.23.
The programming logic can catch any SQL queries that fail or return no
results. To make these SELECT queries that search the inventory
part of a transaction would be a waste of overhead.
The best strategy for working with transactions is to apply the try/catch programming strategy from Java. Let's examine a model of troubleshooting with PHP.

Figure 1. Web application coding model.
|
Now, for a code example:
Let's assume the customer is performing a keyword search and this code represents the results page:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Mega MP3 Store Search Results</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFCC" text="#000000">
<?php
// check to see that the form has been submtited and that the keyword field is
// empty
if (($_GET["Submit"] == "Search it!") &&
(!empty($_GET["keywords"])) &&
(strlen($_GET["keywords"]) > 1))
{
// If the user did not bother to type in any keywords, or typed in one
// letter why bother with the rest of the code. Note, a regular expression
// is also useful to validate the quality of the keywords escape any
// characters that will affect the database
$keywords = addslashes($_GET["keywords"]);
// the @ symbol is warning suppression so a warning will not be thrown back
// to the user, be careful not to over-rely on warning suppression, every
// warning suppression should be modified with an if else to catch the
// warning
if ($connect = @mysql_connect("localhost", "specialwebuser",
"good4pasSwor9d"))
{
mysql_select_db("music_for_sale");
// we will keep the query simple and ignore the record labels and
// genres for now
$query = "SELECT songs.name, songs.description, songs.price, songs.ID,
artists.artist_group_name FROM songs
INNER JOIN artists ON songs.ID = artists.songID WHERE
((songs.name LIKE '%".$keywords."%') OR
(songs.description LIKE '%".$keywords."%'))";
// test for errors in the query, again, if this generates an SQL error,
// do not display the error to the user, but rather trap it
if ($result_query = @mysql_query($query, $connect))
{
// test if there are results returned by the query, if not,
if (mysql_num_rows($result_query) > 0)
{
print ("<p> We found: " . mysql_num_rows($result_query) .
"match(es) to your search for <b>$keywords</b>lt;/p>
while($row = mysql_fetch_array ($result_query))
{
// obviously you can use more sophisticated HTML than merely
// paragraph formatting
print ("<p>" . $row["title"] . " " . $row["description"] .
"</p>");
} // end while - looping through the result set
} // end if there is a result set
else
{
print ("<p>We failed to find a match for <b>$keywords</b>lt;/p>
} // end else - there were no results
} // end if no error in the query
else
{
// here you may also want to email details of the error to the
// Webmaster, notice we take ownership of the blame, do not blame
// the customer for a query error
print ("<p>We are having technical difficulties, please
contact our customer service at 1-800-555-5555.</p>");
} // end else - query had an error
} // end if connection was successful
else
{
// here you may also want to email details of the error to the
// Webmaster, noting the connection failed
print ("<p>We are having technical difficulties, please
contact our customer service at 1-800-555-5555.</p>");
}
} // end if keywords are not empty
else
{
// here you can blame the customer for not entering a keyword
print ("<p>You did not enter any keywords, please enter
something.</p>");
} // end else keywords are in fact empty
?>
<form action="results.php" method="get" name="search" id="search">
<input name="keywords" type="text" id="keywords">
<input type="submit" name="Submit" value="Search it!">
</form>
</body>
</html>
Some developers execute the try/catch concept by using exit or
or die statements. This is a Perl and Unix-scripting strategy that
is programmatically sound, but HTML faulty. For example, let's examine this
code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Mega MP3 Store Search Results</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body bgcolor="#FFFFCC" text="#000000">
<?php
$connect = @mysql_connect ("localhost", "specialwebuser", "good4pasSwor9d")
or die ("<p>We are having technical difficulties, please contact our
customer service at 1-800-555-5555.</p>");
$query = "SELECT songs.name, songs.description, songs.price, songs.ID,
artists.artist_group_name FROM songs
INNER JOIN artists ON songs.ID = artists.songID WHERE
((songs.name LIKE '%".$keywords."%')
OR (songs.description LIKE '%".$keywords."%'))";
// we will keep the query simple and ignore the record labels and genres for
// now
$result_query = @mysql_query($query, $connect)
or die ("<p>We are having technical difficulties, please contact our
customer service at 1-800-555-5555.</p>");
// more code to follow
?>
</body>
</html>
By using the exit statement here, the server never sends the
</body> and </html> tags to the client
browser if there is an error in the database connection or in the query. This
breaks the rules of XHTML compliance and also hurts compliance with assistive
technologies. As the industry moves further into client-side XML, developers
should expect to make their code comply with these standards.
To come full circle with our strategy, let's examine code for performing our sale completion and incorporate transactions with PHP. Unlike languages such as Perl and Java, PHP does not provide a native API for transactions, so developers must use existing functions in the MySQL API. We will include some pseudo-code and eliminate the surrounding HTML for brevity.
<?php
// we will assume that we have client-side validation with JavaScript, but that
// our site may be accessed with non-JavaScript browsers such as Lynx, we will
// also assume our customers table has three required fields: first_name,
// last_name, and email
$form_error="";
if (empty($_POST["first_name"])
{
$form_error .= "first name";
}
if (empty($_POST["last_name"])
{
$form_error .= "last name";
}
if (!preg_match("/([\w]{1,}[@]{1}[\w]{1,}[\.]{1}[\w]{2,})/", $_POST["email"]))
{
// we'll perform a very basic email validation
$form_error .= "email";
}
if ($form_error == "")
{
// if the user completed the form to our minimal standards
if ($connect = @mysql_connect("localhost", "specialwebuser",
"good4pasSwor9d"))
{
mysql_select_db("music_for_sale");
// make sure we flag the database to not commit after each executed
// query
$query = "SET AUTOCOMMIT=0";
$result_query = @mysql_query($query, $connect);
$query = "BEGIN";
$result_query = @mysql_query($query, $connect);
// generally the BEGIN statement will not cause an SQL error, so
// trapping of this query is not essential
// initialize a flag to true, and if a single query fails to return the
// correct affected rows, or if it generates an error, we will set the
// flag to false
$success = true;
// note that the insert into customers could be outside of the
// transaction, we may want to capture that data regardless of whether
// or not the sale is complete, so this query could be executed
// outside of the transaction
$query = "INSERT INTO customers (first_name, last_name, email)
VALUES ('$first_name,' '$last_name', '$email'))";
// test for errors in the query, again, if this generates an SQL error,
// do not display the error to the user, but rather trap it
if ($result_query = @mysql_query($query, $connect))
{
// verify the INSERT statement took effect
if (mysql_affected_rows($connect) == 1)
{
/* Next we will use pseudo-code for the following operations.
We could choose to nest all the queries, but instead we will
take advantage of transactions and error trapping. We do want
to trap an error and point out to the user that the transaction
was not complete. We want to tell the user ourselves, rather
than have the database generate a warning or PHP generate a
syntax error.
* Use SELECT LAST_INSERT_ID() to capture the customer ID number
generated by virtual of being an AUTO_INCREMENT field
* Assign the customer ID to a scalar variable
* SELECT the values from the temp_orders table matching the
temp order ID to the cookie
* INSERT that information into the orders table, which
represents the permanent orders
* Use SELECT LAST_INSERT_ID() to capture the order ID, assign
it to a scalar variable, so that you can email it to the
customer and print it out as a tracking number for the customer
* SELECT the values from the temp_order_details table matching
the temp order ID to the cookie
* INSERT that information into the order_details table
* DELETE from the temp_order_details table
* DELETE from the temp_orders table
To execute each query we will use this syntax, which will
enable us to set a flag if there is an error. This is essential
because a query that completes execution but returns 0 rows
affected is not considered a failure in terms of transactions.
$result_query = @mysql_query($query, $connect);
if (($result_query == false) &&
(mysql_affected_rows($connect) == 0))
{
// verify the query executed completely and verify that it
// had impact on the table
$success = false;
// here also, the developer could choose to add a ROLLBACK
// statement
}
*/
} // end if affected rows is 1 for the INSERT statement
} // end if no error in the insert query query
else
{
print ("<p>We are having technical difficulties, please
contact our customer service at 1-800-555-5555.</p>");
}// end else - problems with the query
if ($success == true)
{
$query = "COMMIT";
$result_query = @mysql_query($query, $connect)
} // end if success is true
else
{
print ("<p>We are having technical difficulties, please
contact our customer service at 1-800-555-5555.</p>");
} // end else - the success flag is false
} // end if connection is successful
else
{
print ("<p>We are having technical difficulties, please contact
our customer service at 1-800-555-5555.</p>");
// here you may also want to email details of the error to the Webmaster
}
} // end if form validation passed
else
{
// give the user some instructions on how to return to the form, also pass
// along back to the form data via the query string to be used in
// generating an error message on the form page
print ("<p>Your forgot to provide us with some information, there
was a problem with the following field(s): $form_error. Please
either hit the back button on your browser or <a href =
\"form.php?error=".urlencode($form_error)."\">click
here</a> to return to the form .</p>");
}
?>
Using the "if/else" structures to trap errors may seem like extraneous code, but a good developer will use include files and functions to break the code into reusable modules. To further modularize this code, a good developer will create functions and include files. For example, one function could be the error message shown to the user and an email alert to the site administrator:
function show_error($error_code)
{
print ("<p>We are having technical difficulties, please contact our
customer service at 1-800-555-5555.</p>");
mail("websiteguru@mysite.com ", "error with database", $error_code,
"From: site@mysite.com");
} // end function
By calling this function, the else blocks can simply read:
else
{
show_error("connection failed");
}
Include files could also be used to hold the connection parameters so that
when the user names and passwords change every six months, the information is
updated in one spot. Note that any include file should be named
.php not .inc so that the code is never seen as plain
text by a browser (or you must change your web server configuration).
Additionally, any
include file that holds the user name and password should include a check to
see how it is being loaded. If the REQUEST_URI is equal to the
same name as the include file, then redirect to the home page so that the user
cannot see the file.
There is a significant risk of data loss when sending data to and from a socket (UNIX or TCP/IP). Transactions again will not recover that. Transactions can only recover when the DBMS fails, but once the queries have completed, the transaction completes.
Whether the data reaches the application successfully is
a matter of programming. With error trapping, the developer can force a
rollback if a single piece of the information fails to reach the application. Also, the developer should always write custom error messages. Printing out
the errors generated from the DBMS using mysql_error() is not only
completely useless to the site visitor, but also a potential security risk to
the database.
Transactions offer wonderful reliability for mission-critical operations where not only is the data at risk of being corrupted or lost, but also the loss of data would severely jeopardize the company's business. Transactions are best used in conjunction with data validation and error trapping on behalf of the programmer. Because MySQL offers the flexibility among table types, developers should choose the transaction-safe and non-transaction safe tables appropriately.
Kimberlee Jensen is a freelance Web developer and an IT instructor at Seattle Central Community College.
Return to PHP DevCenter.
Copyright © 2009 O'Reilly Media, Inc.