An Interview with Joshua Drake: Coauthor of Practical PostgreSQLby Jonathan Gennick, O'Reilly editor
Recently, Jonathan Gennick sat down with the coauthor of Practical PostgreSQL to talk about the future of this open source database. In this in-depth interview, Joshua Drake, who is also the cofounder of Command Prompt, and a project leader for Mammoth PostgreSQL, touches on everything from a Mac OS X distribution of PostgreSQL to LXP, the XML application server that runs as an Apache module to XML support in PostgreSQL.
Gennick: How long have you been working with PostgreSQL, and what got you started with it?
Drake: This is an interesting question. In the early 1990s I actually worked with Ingres, with which PostgreSQL has some deep-rooted history. However, as far as what I would consider current PostgreSQL, I started in 1995 with Postgres95. I had been contracted to write a shopping cart for an ISP and needed a database. Incidentally, that is also the time that I was first introduced to PHP (PHP/FI back then).
Gennick: I hear you are working on a Mac OS X distribution of PostgreSQL. Is that true? What are your plans for that? Are others also working on the same thing?
Drake: Well, Mac OS X is only one of our four supported distributions. (By the way, we refer to our distributions as Mammoth PostgreSQL.) We support Red Hat Linux 6.x/7.x, Solaris 8, Mac OS X, and Win32. Our Win32 version is in Beta and will be released shortly. Our hope is to give the Win32 users incentives to test deployments before moving to one of our Unix-supported platforms.
The Mac OS X version has been well received and is actually our most demanded version at this time. The marketing that Apple is doing right now with the "Just Works" slogan is the same as our thought process with our Mac OS X version. We are providing a database with enterprise class features that the Apple heads won't have to worry about. They just install it, and run it. If they need help they can call us.
There are other PostgreSQL distributions for Mac OS X out there, but I only know of two that are actively supported, our's (Command Prompt) and DbExpert's.
Gennick: You have distributions for more than just the Mac; why?
Drake: Market. We could just support Mac OS X or Linux, but it really wouldn't do us or our customers a lot of good. By providing a well-rounded set of support and programming services, plus the multiplatform support for our products, customers are never caught locked into their operating system platform.
Gennick: In Chapter 13 of Practical PostgreSQL, you cover something called LXP. Just what is that? It's not part of the standard PostgreSQL distribution, is it?
Drake: LXP is a XML application server that runs as an Apache module. The two main benefits of LXP are true database persistence and rapid development time. Originally LXP was developed internally for our own use; we wanted something that was versatile but not as programmatic as something like PHP or Perl. In other words, we wanted a simple solution to a number of complex problems. OK, we were lazy and did not want to continue coding in languages that required a lot of extra syntax.
LXP currently supports a wide array of features, including:
As a simple example, here is some code to perform a query to a PostgreSQL database and display the results in tabular form (for example, in rows and columns):
<lxp> <!-- select all my employees and their phone numbers from our employee table --> <include sql="SELECT employee_firstname, employee_lastname, phone_number FROM employees"> <div align="bold_content"> First Name: </div> <field name="employee_firstname"> <div align="bold_content"> Last Name: </div> <field name="employee_lastname"> <div align="bold_content"> Phone Number: </div> <field name="phone_number"> </include> </lxp>
As you can see, you can incorporate HTML directly into the LXP code and the parser understands how to deal with it. It is very simple, but extremely powerful. The equivalent code in PHP or Perl would need to be quite a bit more lengthy. An additional feature, which customers have found useful, is the Universal Broker. The Universal Broker was created to allow people to use tools as they are appropriate. For example, PHP. PHP is a great language and we use it in a lot of our development. There are times, when you want LXP to do something that is just better served in PHP. If this is the case, all you have to do is the following:
<lxp> <include method="uri" src="my_form.php" /> </lxp>
This will process the PHP file directly through Apache and allow you to use PHP (or Perl, Python, Java, and so on) directly from LXP.
The key to LXP is that it doesn't look like a programming language, but it does give you all the benefits of one. Many of our customers are extremely bright and self motivated, but they are not programmers. If you give them a bunch of PHP code they say, "Great... now I have to learn Latin?", but when they see LXP they say, "Hey looks like HTML... no problem". That is a powerful argument for the small business looking to develop an application.
LXP is not currently distributed as part of the community PostgreSQL. However, it is part of Mammoth PostgreSQL.
Gennick: One advantage that PostgreSQL has long held over MySQL--the other open source database--is that PostgreSQL supports transactions and concurrency and related things that people have come to expect in a database. Now that MySQL has transactional support, do you expect interest in PostgreSQL to decline? If not, why not? After all, MySQL does seem to be more widely used.
Drake: We don't see interest in PostgreSQL declining. In fact, we see it growing. We get customers every week calling us about migrating from MySQL to PostgreSQL. To be fair to MySQL, PostgreSQL and MySQL are different products. You really can't compare them as it is almost like apples and oranges. Yes, they are both databases, but only PostgreSQL is an Object Relational Database. MySQL has increased its capabilities of late but there are still significant features it lacks. Here is a brief, albeit incomplete list of the features MySQL does not have that PostgreSQL does:
Some of these have been added with the 4.0 release, which is still in development. I don't want to sound negative about MySQL. It has a strong following for a reason: it is reasonably good at what it does and until about 18 to 24 months ago it really was a better solution (for simple database stuff) than PostgreSQL. However, over the last two years, PostgreSQL has really come into its own and from our experience provides a much better overall solution than MySQL for most database application development.
Gennick: Do you have any idea who has the largest PostgreSQL database and how big it is?
Drake: We are aware of an American Chemical Society database that is over a terabyte. We have several customers that are in the multigigabyte (10GB to 50GB) range and just recently BASF joined the PostgreSQL bandwagon as well.
Gennick: You mention BASF. Are they one of your clients? Why do you say they "joined the bandwagon"? Are they doing something special or unusual with PostgreSQL?
No, they are not one of our customers, but I know that they recently signed a five-year PostgreSQL support contract with a PostgreSQL support company.
Gennick: Do you recall which support company that was?
Drake: Web Commerce Group.
Gennick: What big improvements can we expect to see in the near future with respect to PostgreSQL? When is the next major release scheduled? When will we see PostgreSQL 8.0?
The next major release will be PostgreSQL 7.3, and it's planned for September 2002. However, it is important to realize that the PostgreSQL team follows the, "We release when it's done." motto quite religiously. There are currently no plans for an 8.0. Not that development will stop, just that we will probably be in the 7 series for some time.
In reference to features, we will see the standard items such as:
But there has also been reference to:
Gennick: Schema support, what's that?
Drake: I believe it is the equivalent of namespaces in Oracle. To be honest, I am not yet sure of the benefit.
For reliability reasons we tend to stick with current production-quality source only. That is one of the reasons we waited for 7.2.1 before we released Mammoth. At 7.2 we began testing, and released at 7.2.1. We won't worry about 7.3 until it comes out, then when 7.3.1 hits you will see a new version of Mammoth.
Gennick: It's well-known that the Red Hat Database is a version of PostgreSQL. How different is the Red Hat Database from plain old PostgreSQL.
Drake: This is a better question for Red Hat, however I can point out some obvious traits. The Red Hat Database was originally a 7.1.2 code base. It now lists as 7.1.3, which puts the code at just about 12 months old. The 7.2 series is quite a bit more advanced, stable, and manageable than the 7.1 series. In fact, we delayed our release of Mammoth until the community released 7.2.1 for these reasons.
Gennick: Has Red Hat been successful in its offering of PostgreSQL as the Red Hat Database? And has Red Hat's offering spurred any additional interest in PostgreSQL in general, outside of the Red Hat sphere of influence?
Drake: Again, this is probably more a question for Red Hat. However, I can say we get a lot of people calling us saying, "I can't convince myself to spend $2,300 on PostgreSQL, can Command Prompt help me?". Outside of people noticing us when they review Red Hat DB we don't run into the Red Hat DB often. The marketing team over at Red Hat does not seem focused on delivering mind share for its database product.
Gennick: If you were giving advice to someone just beginning to use PostgreSQL, are there any technology intersections, such as PostgreSQL and PHP for example, that you think deserve special attention?
Drake: Well outside of LXP and reading Practical PostgreSQL from O'Reilly--blatant plug--I believe that PHP and Java both deserve mention with PostgreSQL. It is also important to remember what PostgreSQL is. It is easiest to think of PostgreSQL as baby Oracle. They have a lot of the same features, but PostgreSQL is a lot easier to manage. I would also suggest the following Web sites:
Gennick: Oracle and Microsoft have been hard at work building XML support into their databases. Oracle9i, for example, allows you to easily return query results in XML format. When can we expect to see XML support in PostgreSQL?
Drake: You can process XML results with LXP (or any other Web language) with PostgreSQL and we (Command Prompt) are looking into developing XQuery support for PostgreSQL. There are advantages and disadvantages to exporting your data as XML. Exporting XML directly from the database adds verbosity to the query results, which can make processing them easier, but also adds a lot of unrequired data, thus creating a possible strain on bandwidth (not just Net bandwidth, but i/o bandwidth as well). We find that it is better to export raw tabular data and format to XML for processing through XSLT than to export XML directly from the database.
With LXP 1.0 you will be able to process a query, render the tabular results as XML, and then process them through XSLT to generate whatever output you desire. You could even transform the XML to PDF on the fly and store it back to PostgreSQL as a large object, or bytea.
Gennick: A "bytea"? What is that?.
Drake: It is a data type for binary data. However unlike a large object it is stored within the table row just like the rest of the data.
Gennick: What does Command Prompt do besides the obvious PostgreSQL support?
Drake: Command Prompt--warning marketing lingo imminent--is a managed services firm with a focus on PostgreSQL and Linux custom development. We offer a full enterprise-class product and services line including; A VAR/OEM program, managed hosting (Virtual and Dedicated), custom application programming, tier 3 support, consulting services, and training. We also offer a complete commercial line of products called Mammoth. These include Mammoth PostgreSQL, Mammoth LXP, Mammoth S/ODBC, and Mammoth Browser.
Gennick: What features would you like to see the PostgreSQL community working on?
Drake: I know this will sound like blasphemy but I would like to see stronger support for Win32. There is a large need in the Win32 community for a database that is as capable as PostgreSQL without costing $10,000 to $40,000 dollars.
Gennick: What are some of the problems with the current level of Windows support?
Drake: The largest problem is a limitation on the number of connections that can be made. PostgreSQL on Win32 uses Cygwin, which is a POSIX compatibility layer for Win32. The Cygwin fork() implementation, which PostgreSQL uses, is limited as to how many times it can branch. You end up hitting a connection limit at 63 connections, I believe.
We have done some optimizing and testing with Mammoth PostgreSQL on Win32 and have found it to be very stable, with decent performance as long as we keep the connections at 50 or below.
Gennick: What specific features is Command Prompt working on for PostgreSQL?
Drake: Well, we mentioned previously that we are working on XQuery support. We also have a procedural language called XPL, which is basically LXP but is meant to be called from the database (Like pl/PgSQL), our S/ODBC driver and our cross-platform management tool, Mammoth Browser. We are also working on a replication and load-balancing engine for PostgreSQL.
Gennick: Josh, thanks very much for your time, and for a very interesting interview. I learned a lot from our little chat.
Drake: Jonathan, you're very welcome. It's been my pleasure.
O'Reilly & Associates recently released (January 2002) Practical PostgreSQL.
Sample Chapter 2, "Installing PostgreSQL", is available free online.
For more information, or to order the book, click here.
Jonathan Gennick is an O'Reilly Media, Inc. editor specializing in database and programming titles.
Return to the O'Reilly Network.
Copyright © 2009 O'Reilly Media, Inc.