ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)
 See this if you're having trouble printing code examples


aboutSQL

Self-Inflicted SQL

07/12/2001

Ahh ... the joys of SQL joins. We've talked about a number of different kinds of joins over the past few columns, and now we'll finally wind up with the mysterious "self join."

This will lead naturally into the next column's topic -- SQL subqueries. After that, we'll head back to set theory and the UNION statement to wrap up our toolbox of ways to create record sets from the tables in the database. Further up the road, we'll move into the SQL statements for creating and managing the database, and tables themselves which should occupy us for quite a while.

So what's a self-join?

A self-join is one of those tools that is only useful when you're stuck in a particular type of situation -- specifically when you have a normalized database table that needs to be "flattened." The typical situation is when a table of data contains a link to data values that are stored in the same table.

A standard example you'll see in SQL books is an "Employees" table that contains a field for the "Manager ID", which is a link to one of the records in the Employees table, as seen in the following example.


Employees
EmployeeID EmployeeName ManagerID
61 Sue Smith (null)
62 David Jones 61
63 Troy Parker 61
64 Claire Smith-Jones 63
65 Grover Rivers 63

Comment on this articleNow that we've finished working with joins for a while, what are your questions and comments about this aspect of MySQL?
Post your comments

Also in aboutSQL:

SQL Data Types

Working With Tables

Data Definition Language

How would you create a query that will return the names of the manager of each employee? One way would be to nest queries together in a loop -- a common approach for web developers (PHP, ASP, ColdFusion, JSP, etc.) faced with this problem. That's an awful solution from a performance perspective. A SQL joinis much faster than doing a lot of combinations of separate queries using server-side technology. We'll use a self-join instead.

Using self-joins

A self-join is simply a normal SQL join that joins one table to itself. This is accomplished by using table name aliases (which were discussed several columns ago) to give each "instance" of the table a separate name. For example:

SELECT E1.EmployeeName AS Employee, E2.EmployeeName AS Manager FROM Employees AS E1 INNER JOIN Employees AS E2 ON E1.ManagerID = E2.EmployeeID

What we're doing with the SQL code above is effectively creating two identical tables:


E1 (Employees)
EmployeeID EmployeeName ManagerID
61 Sue Smith (null)
62 David Jones 61
63 Troy Parker 61
64 Claire Smith-Jones 63
65 Grover Rivers 63

and

E2 (Employees)
EmployeeID EmployeeName ManagerID
61 Sue Smith (null)
62 David Jones 61
63 Troy Parker 61
64 Claire Smith-Jones 63
65 Grover Rivers 63

These tables are joined on the highlighted columns. Note that you are not truly creating another copy of the table -- you are joining the table to itself, but the effect is easier to understand if you think about it as two tables (at least it is for me!).

Next steps

Whew! That's it for SQL joins, at least for now. There are some more esoteric ways to join tables together but we've got a lot more common topics to cover first -- particularly the idea of subqueries which we discussed here. Until then, feel free to contact me with your comments and questions.

John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.

Read more aboutSQL columns.


Return to ONLamp.com.

Copyright © 2009 O'Reilly Media, Inc.