

JOINed at the Hip
05/24/2001Over the past few articles, I've digressed from SQL into the world of database design -- now that digression pays off as we begin a multipart series on using the SQL JOIN
clause. The JOIN
clause is probably one of the most used and most confusing facets of SQL, but I'll do my best to demystify it in the coming columns.
Rehashing SELECT statements
Before this column veered off into database table design, I had devoted a few articles to the SQL SELECT
statement. These statements had the general form:
SELECT column_name(s) [AS alias] FROM table_name
[ORDER BY column_name(s) (ASC | DESC) ] WHERE (search_criteria);
which I used quite extensively to perform searches on a database of CDs. But for my examples, I've only looked at selecting records from a single database table. Now that I've spent some time discussing table relationships, I can take a look at using a SELECT
statement to pull records from a set of related database tables.
Let's take a look at our database records concerning Peter Gabriel's CDs. The following two tables contain the artist and CD information and have a one-to-many (one artist to many CDS) link on the ArtistID field which is highlighted.
ArtistID | ArtistName |
22 | Peter Gabriel |
CDID | ArtistID | Title | Year | Quality |
15 | 22 | So | 1984 | 10 |
16 | 22 | Us | 1992 | 10 |
I can find all the artist information for Peter Gabriel using a SQL statement like this:
SELECT * FROM Artists WHERE ArtistID=22;
and we could find all of the CD information using the SQL statement
SELECT * FROM CDs WHERE ArtistID=22;
which would give me two sets of data about the musician and his albums. Wouldn't it be great to get all of that information in a single place? I mean, I actually broke all this information out into two tables to be more efficient! The whole point of having related tables is that I can take the two separate tables and "put them back together" into the equivalent of one huge monolithic table consisting of artist and CD information, something like the following table.
Artist | Title | Year | Quality |
Peter Gabriel | So | 1984 | 10 |
Peter Gabriel | Us | 1992 | 10 |
What I need to do is join or link the two tables together on their common key field, in this case the ArtistID field. I can take the two SQL statements from above and put them together to create a virtual table that consists of all the information from each individual table.
SELECT * FROM Artist,CD WHERE (CD.ArtistID=Artist.ArtistID);
which generates a table that would look something like
CDID | ArtistID | Artist | Title | Year | Quality |
If I further limit the SQL statement to the ArtistID for Peter Gabriel
SELECT * FROM Artist,CD WHERE (CD.ArtistID=Artist.ArtistID) AND Artist.ArtistID=22;
I'd get the following result
CDID | ArtistID | Artist | Title | Year | Quality |
15 | 22 | Peter Gabriel | So | 1984 | 10 |
16 | 22 | Peter Gabriel | Us | 1992 | 10 |
So I can link any two tables on a common (key) field using the equal sign (=
) operator as part of the WHERE
clause in a SQL SELECT
statement.
Also in aboutSQL: |
Using the JOIN keyword
In addition to using the equal sign (=
) operator to join tables, SQL includes a JOIN
operator that can be used to make more explicit kinds of joins between two or more tables. Starting with the following SQL
SELECT * FROM Artist,CD WHERE (CD.ArtistID=Artist.ArtistID) AND Artist.ArtistID=22;
we can explicitly create the joined table
SELECT * FROM (Artist INNER JOIN CD ON CD.ArtistID=Artist.ArtistID) WHERE Artist.ArtistID=22;
As you can see from the two equivalent statements, the second one makes it a little clearer that we're creating a virtual table by joining two related tables together and then filtering the resulting tables for the correct ArtistID. For the time being, we're going to ignore why we use INNER JOIN
instead of just JOIN
-- that's a topic worthy of an article all its own!
Next steps
In the next few articles we'll be focusing on the JOIN
clause in all its many permutations -- INNER
, OUTER
, LEFT
, RIGHT
, and all the other ways to put tables together. Until then, feel free to contact me with comments and questions.
Believe it or not, I read all of the e-mail you send me and answer the vast majority of it. I'm contemplating using this column to address real-world SQL problems -- be it a thorny SQL challenge, a poorly written SQL query you inherited from another developer, or newbie questions. If you'd like to see that sort of analysis, let me know -- and if you need that sort of analysis, drop me a line. I won't promise to solve your problem, but I'd like to see what sort of questions are out there. Questions that would interest general audience of developers are more likely to be written up.
John Paul Ashenfelter is president and CTO of TransitionPoint.com, a technology development and analysis group focusing on web database applications.
Read more aboutSQL columns.
Return to the ONLamp.com.
