DELETE Tips and Tricks


Also in aboutSQL:

What's the Big Deal about SQL?

Getting Started with SQL

Introducing SELECT

Filtering SELECTed Data with WHERE

INSERTing Data

UPDATE that Data!

Where to Use WHERE

Using Functions

Aggregate Functions

Controlling Data Display with ORDER BY

Group By

The "Key" to Good SQL

Good Relations

Types of Relationships

JOINed at the Hip

More on JOINS

Types of JOINs

The Outer Limits of SQL JOINs

Self-Inflicted SQL

SQL Subqueries

Introducing SQL Sets

Data Definition Language

Working With Tables

SQL Data Types

First off, a quick thank you for your feedback concerning the aboutSQL columns -- I'm taking your suggestions, criticisms, and requests into account as I continue to develop this column.

This week we're going to look at the last of the core SQL statements for changing data -- the DELETE statement. We'll also focus on some practical tips for implementing this most dangerous SQL statement in real-world applications.

The DELETE statement

It's probably too easy to use the DELETE statement, thanks to the (fairly) consistent syntax of the SQL standard, but I think it's important to provide you with as much information as possible -- so here goes:

DELETE FROM table_name WHERE criteria

Now that you've seen it, I want to warn you in loud and unambiguous language to be EXTREMELY careful when you use it. There is no undo. And if you forget the WHERE clause or make a mistake in it, you run the risk of removing the entire table from the database!

Just like the UPDATE command, I heartily recommend trying a SELECT statement using the WHERE clause before you do a DELETE, just to make sure it works like you expect it to. And you should be exceptionally careful how you let users interface with the DELETE command because:

SQL Wisdom #2) If users CAN delete data, they WILL delete it ... accidentally and at the worst possible time.

We'll take a look at the DELETE command in action in two different ways -- one for DELETEing an entire table, the other for the more focused DELETE. If you remember from the last column, aboutSQL UPDATEs, I showed you how to improperly use the UPDATE command without a WHERE clause to change the entire contents of a column of data to a single value.

ID Title Artist Year
1 Pet Sounds Bruce Hornsby and the Range 1966
2 Security Bruce Hornsby and the Range 1990
3 The Way it Is Bruce Hornsby and the Range 1986
4 Joshua Judges Ruth Bruce Hornsby and the Range 1992
5 Supernatural Bruce Hornsby and the Range 1999

Ironically, the DELETE statement without a WHERE clause fixes our problem, or at least gets rid of it. The SQL statement

DELETE FROM MusicCollection;

will remove the entire table from the database and give us the chance to start over (or you could use UPDATE statements to fix each row if you've got the time). For a more surgical removal, the syntax for removing the Pet Sounds row of data could be any of the following:

DELETE FROM MusicCollection WHERE ID=1;
DELETE FROM MusicCollection WHERE Title='Pet Sounds';
DELETE FROM MusicCollection WHERE Title='Pet Sounds' AND Year=1966;

Remember that DELETE and UPDATE work in similar ways, affecting entire rows of data at a time. Both statements can use the WHERE clause to filter the database for a collection of rows to modify, but the default behavior is to operate on the entire table. There's no UNDO statement!

Safe DELETE practices

At this point, you may be thinking, "So why waste a column on a SQL statement that you're so unlikely to use? What's the point?"

Well, for one thing I want to make sure that I cover the basics of SQL thoroughly. But I'd also like to make sure that you know when, how, and why to use SQL to its (and your) best advantage. In most cases, giving users the tools to delete data is not in anyone's best interest! The key is to take a moment and consider why you are actually deleting the data. Go ahead -- I'll wait. I submit that there are at least two primary reasons to delete data:

Unless you're a database administrator, you probably didn't think about the second reason to delete data, and I don't blame you a bit. The normal concern with removing data from a database is to ensure that reports use accurate information -- most of us could care less about disk space and the like.

So why would you actually need to really delete the data, especially when you're almost guaranteed to need it right after you press the enter key on the SQL statement? An alternative is to add a field to the database (probably a Boolean) that stores a true/false value indicating whether that record has been deleted.

This lets you write applications to "delete" the data, but gives you or other administrators the ability to easily undelete it as well. The downside is that every query you run now has to include something along the lines of "WHERE Deleted=FALSE" to ensure that all decisions are made on the "undeleted" data.

The DBAs do have a point however -- databases do need to be cleaned of old data to ensure that performance and reliability are optimal. Different databases have different characteristics, but in any case, as the number of records increases, the overhead for most queries also increases.

If the majority of the data is "deleted" data, then the performance hit will eventually become untenable. One alternative to the "Deleted field" is to move deleted records to a table of deleted records, which provides for an "undo" for deleted records without cluttering the real data table. This makes DBAs happier, but often requires some additional management overhead (when a table structure is changed, the related deleted table needs to be changed as well), which makes them less excited.

In any case, it should be clear that DELETEs should be rarely used, and even then only in the hands of an experienced DBA or SQL developer. And in any case, RUN AN EQUIVALENT SELECT STATEMENT FIRST!!! You should never practice deletions on important data without a test to see if your WHERE clause does what you think.

Next steps

Now that we've finished the set of SQL data-change statements (INSERT, UPDATE, and DELETE) and covered the core SQL data statement (SELECT), we'll dig deeper into the WHERE clause for the next installment of aboutSQL. As you may recall, we have touched on it in the past few columns to see what it can really do. Next, we'll move into more sophisticated data filtering. In the meantime, feel free to drop me a line with questions, comments, or hints aboutSQL.

John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.

Read more aboutSQL columns.

Discuss this article in the O'Reilly Network Linux Forum.

Return to the Linux DevCenter.


Copyright © 2017 O'Reilly Media, Inc.