Linux DevCenter    
 Published on Linux DevCenter (http://www.linuxdevcenter.com/)
 See this if you're having trouble printing code examples


aboutSQL

Controlling Data Display with ORDER BY

02/13/2001

Also in aboutSQL:

SQL Data Types

Working With Tables

Data Definition Language

Introducing SQL Sets

SQL Subqueries

We've spent a lot of time talking about the SELECT statement, and that's no accident -- finding and manipulating existing data from a database is one of the key goals of most SQL development. A more recent sub-theme I've mentioned, in case you've missed it, is introducing SQL tools that provide a lot of functionality inside of the database.

Many web application developers (PHP, ColdFusion, ASP, etc.) don't know much about SQL when they start, and they end up spending a lot of time reinventing the wheel when SQL already provides them with an idling Ferrari! The functions and aggregate functions we've talked about the past two weeks certainly fit into that category, as does this week's topic -- ordering data for display using SQL.

Basics of ordering

It's not unusual to be asked to sort data by one or more criteria -- maybe employees by their annual sales, students by their grades, or albums by their release date. You've already seen in an earlier article that the ORDER BY clause of the SELECT statement can be used to do this type of sorting. As a refresher, the syntax is:

SELECT ... ORDER BY column_name

So to sort all CDs in a music collection by their original release date, you could use the SQL statement:

SELECT Title, Artist, Year FROM MusicCollection ORDER BY Year

and the result would look something like:

Artist Title Year
Pet Sounds The Beach Boys 1966
The Way It Is Bruce Hornsby 1986
Security Peter Gabriel 1990
Joshua Judges Ruth Lyle Lovett 1992

As you can see, implementing a sort using the ORDER BY clause is simple. Changing the default ascending sort order into descending is as simple as adding the DESC keyword, and multiple sort columns can easily be specified simply by appending them to the list:

ORDER BY column_name_1, column_name_2 DESC

which would sort the data in ascending order based on the first column name and further sort duplicates using the second column name in descending order. This process can be extending to as many columns of sorting as your database can handle.

Order by 'gotchas'

"Alright," you may be saying, "why this review of the ORDER BY clause now, John Paul?" We may have touched on the ORDER BY clause previously, but we missed the one crucial item that is necessary to truly understand how to properly use ORDER BY -- and that is precisely what rules are used to determine the "order" to sort field values?

The answer to that question ... well ... it's ... complex. And (here we go again) database-dependent. Oh yes, and language-dependent as well! It's amazing that something that can be implemented so simply hides such a complicated inner structure.

Each database has its own rules about sorting. Some have a single strategy. Some require a certain strategy to be selected when the database is created or when the database server is installed. Changing the behavior of any one of them is non-trivial. Understanding the various options, however, is the first step. There are two broad approaches:

It sounds simple so far, right? But let's think about dictionary order for a moment. Clearly all A-Z sorts are not created equal; in fact, here are a few complicating factors:

And of course there are the obligatory combinations of all of these factors. If you work purely in the A-Z, 0-9 world, the only real issue is case sensitivity. But for any international applications, make sure you know how your database will really sort the contents of the data fields.

And there's one final complication to note -- what about NULLs? We haven't talked much about NULL yet, but will spend considerable time on it as we move forward. NULLs are the content of any database field that contains no data.

Where in the sort order do they rank? Fortunately this is a pretty easy answer -- at the beginning. NULLs have no value, so A-Z and 0-9 are certainly greater than NULL, so as we sort through the data, NULLs will come out first.

Next steps

This week we took a bit of a side-step to revisit the ORDER BY clause of the SELECT statement. Next week, we'll look at its companion clause GROUP BY. In the meantime, keep the questions and comments coming.

John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.


Read more aboutSQL columns.

Copyright © 2009 O'Reilly Media, Inc.