Previously in aboutSQL:
We've spent the past few weeks taking a look at the fundamental statements of SQL for data manipulation (SELECT, UPDATE, INSERT, and DELETE) and at the all-important WHERE clause, all of which provide a foundation for building useful SQL queries. The SQL we've talked about so far is part of the standard that all SQL implementations, from Microsoft Access to MySQL to Oracle, implement in standard ways. This week, we veer off into one of the most useful, but most frustrating areas of SQL -- built-in SQL functions.
SQL, like most computer languages, includes the capability to perform functions on data. Unlike most computer languages, however, each database vendor has implemented their own set of functions. Not only does each vendor choose which functions they will support, different vendors implement the same functionality using different syntax! This basically means that as soon as you open the door to functions, your database code stops being portable. For traditional database applications, that is less of an issue, but for web-based applications using tools like PHP or ColdFusion, it means that the SQL functions you embed in your web application will break when you move from MySQL to Oracle or Access to SQL-Server. To make matters worse, SQL functions often provide significant performance gains for all sorts of database manipulation, particularly in the case of web applications.
So what's a conscientious developer to do? My advice is to use functions anywhere you can produce a measurable performance gain and document thoroughly what database the SQL was intended to work with. As we'll talk about in a future column, moving any function-oriented code to a database stored procedure and essentially "black-boxing" it is an excellent approach -- this lets the DBA rewrite the stored procedures for optimal performance -- but some databases do not support stored procedures, so this is not a perfect solution. There are a number of functions that are available in a wide range of databases, though (even if the syntax changes), so you can usually get the functionality you need as long as the functions are not too exotic.
There are four basic types of functions that you'll typically use with data:
How do you know what functions your database supports? Unfortunately, the only reliable way is to check the documentation. As a general rule of thumb, the more a database costs (or the more widely it is used for open source databases), the more functions it supports. Microsoft Access has a small set of functions, MySQL has a large set, and Oracle has a huge set (need your date converted to Roman numerals?).
Functions in SQL are typically used in WHERE clauses, though they generally can be used anywhere you'd use a field name or value. One of the easiest ways to learn functions is to see a few in action, so I've come up with a couple of simple examples with commentary on database function differences to give you a flavor for what functions are like. We'll start with a SELECT statement to find names longer than 30 characters:
SELECT Firstname, Lastname,SSN FROM Customers WHERE LEN(Lastname)>30;
This SELECT will retrieve the Firstname, Lastname, and SSN (Social Security number) from the Customers data table if the length of the Lastname field is greater than 30 characters. This syntax will work in Access, Sybase, MySQL, and other databases. Oracle has a similar syntax with a different function name:
SELECT Firstname, Lastname,SSN FROM Customers WHERE LENGTH(Lastname)>30;
LEN and LENGTH -- not too big a difference, but it is one you'd have to remember as you moved back and forth between database platforms. As you'll learn as you use a lot of SQL functions, Oracle is frequently the odd man out for function names and syntax. It's not all Oracle's fault, though -- Microsoft SQL Server is based on Sybase, and Access is supposed to provide a smooth transition to SQL Server, so it should be no surprise that they are all similar. MySQL often offers more than one syntax for common functions in what I assume is an attempt to smooth over the differences between various SQL implementations. PostgreSQL, DB2, Informix, and the dozens of other databases all have their own ways of doing things as well. This brings us to another instance of SQL Wisdom:
SQL Wisdom #3) Always, always, always check the manual for your database to see what functions are (and aren't) available.
The previous example wasn't too bad, but let's take a look at something as simple as inserting the current time in the database. Here the possibilities get a little more complicated.
INSERT INTO Orders(OrderNum,Customer,Timestamp) VALUES(44571,'John Paul Ashenfelter',Now());
INSERT INTO Orders(OrderNum,Customer,Timestamp) VALUES(44571,'John Paul Ashenfelter',GetDate());
INSERT INTO Orders(OrderNum,Customer,Timestamp) VALUES(44571,'John Paul Ashenfelter',CURRENT_TIMESTAMP;
|Oracle||INSERT INTO Orders(OrderNum,Customer,Timestamp) VALUES(44571,'John Paul Ashenfelter',SYSDATE);|
Not only is there a different way in each database, some databases have multiple ways! And let's not even get into how to format a date and the differences between internal date/time formats (we'll save that for another column). But, in any case, the function provides a straightforward way to operate on (or in this case, generate) data in the database.
Functions open up a huge range of additional functionality for SQL developers, but at the cost of portability between database platforms. Used properly, however, they provide improved functionality and performance for data-intensive applications. Next week, we'll look at using aggregate functions that operate on entire columns of data as well as SQL clauses for grouping data using these functions.
John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.
Read more aboutSQL columns.
Discuss this article in the O'Reilly Network Linux Forum.
Return to the Linux DevCenter.
Copyright © 2009 O'Reilly Media, Inc.