A Day in the Life of #Apache
by Rich Bowen, coauthor of
Tips for Making Your Server Run Faster
Editor's note: After a brief summer hiatus that included a trip to Portland, Oregon for OSCON 2004, Rich Bowen is back this month with his latest column based on his conversations on the IRC channel #apache. Want to know how to make your web site faster? Rich has some tips to enhance your server's performance.
#apache is an IRC channel that runs on the irc.freenode.net IRC network. To join this channel, you need to install an IRC client (XChat, mIRC, and BitchX are popular clients) and enter the following commands:
/server irc.freenode.net
/join #apache
First, a note. I'm writing this at OSCON 2004. It is Friday morning, and this article was due on Tuesday evening. So, many thanks to my editor for her understanding, and to paraphrase Douglas Adams, here are some words of wisdom from fajita:
<DrBacchus> fajita: deadlines
<fajita> The great thing about deadlines is the whooshing sound they make as they fly past.
Today we're talking about the rather common question that comes up a couple of times every week:
<Quixote> How do I make my server run faster?
As you might imagine, the answers to this vary greatly, and primarily depend on the type of content you have on your web site, and in what ways you have already reconfigured your server. So we'll approach this question from a variety of different angles. Also, before we get started, you might be interested to know that there's a document on the Apache site that addresses this question, somewhat. That document may be found at httpd.apache.org/docs-2.0/misc/perf-tuning.html.
We'll start with another question.
<Quixote> How do I measure performance?
<DrBacchus> fajita: benchmarking tools
<fajita> Some available benchmarking tools include ab, flood, jmeter, daiquiri, siege
Measuring performance is a tricky business, because you're seldom measuring exactly the thing that you want to be. You typically want to know how a web site will perform in real-world scenarios. Most measuring tools try to simulate these scenarios to some degree, but it's never quite the same thing. And as long as you understand that it's not the same thing, these tools can be valuable in making your web site faster.
The tools that fajita recommends are all freely available, and have various degrees of functionality. I will not spend this article telling you about all of them, although we will look at ab a little, because it comes with Apache, and is a convenient starting place for performance testing.
Now that that's out of the way, we'll talk briefly about using ab.
ab is a very simple-minded benchmarking tool. Don't be fooled into thinking that it simulates reality in any meaningful way. However, it's still valuable to test to see if your performance changes have actually made a difference.
ab requests a URL multiple times, and then reports a few statistics about those transactions:
ab -n 1000 -c 10 http://localhost/index.html
Here's some partial output from that command:
Concurrency Level:      10
Time taken for tests:   2.303 seconds
Complete requests:      1000
Failed requests:        0
Broken pipe errors:     0
Total transferred:      253260 bytes
HTML transferred:       12060 bytes
Requests per second:    434.22 [#/sec] (mean)
Time per request:       23.03 [ms] (mean)
Time per request:       2.30 [ms] (mean, across all concurrent requests)
Transfer rate:          109.97 [Kbytes/sec] received
The information provided in this output allows for some basic measurement of the speed of a particular resource, so that you can observe any changes that occur when you modify your server configuration. Now that you have a way to verify that my recommendations are actually doing something, let's get on with the tips.
Avoid DNS whenever possible. It is slow, and it is out of your control. DNS lookups take as long as they take. So you want to avoid forcing Apache to do them.
There are two particular places where this might come up: access control and logging.
If you are using allow from or deny from lines in your configuration to do access control based on the address of the client, try to use the IP address, rather than the hostname, of the client you want to permit or deny. If you use a hostname, Apache will have to do a DNS lookup in order to determine if the client in question is from that hostname.
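As a rough sketch of what that looks like in practice, here is an access-control block that uses addresses only. The directory path and the network are made up for illustration, and the syntax is the Order/Allow/Deny style used by the 1.3 and 2.0 servers this column covers.

```apache
# Hypothetical directory; substitute your own path.
<Directory "/usr/local/apache/htdocs/private">
    Order deny,allow
    Deny from all

    # Matching on an address or network needs no DNS lookup.
    # 192.168.1.0/24 is an example network, not a recommendation.
    Allow from 192.168.1.0/24

    # A hostname here would force Apache to resolve each client:
    # Allow from example.com
</Directory>
```

The commented-out line is the form to avoid: it reads nicely, but every request to that directory pays for name resolution.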
One option in your logfile configuration is the directive HostnameLookups. If it's set to Off, which is the default, Apache will log the IP address of the client. If it's set to On, it will instead log the hostname.
Don't do that.
Causing Apache to do a DNS lookup for every client access will slow down performance significantly, and will also cause the number of Apache child processes to grow, as various processes are using their time to do DNS lookups.
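A minimal sketch of the logging side, assuming a stock httpd.conf with the usual "common" log format defined; the log file name is an assumption. If you do want hostnames in your reports, the logresolve utility that ships with Apache can resolve them from the log afterward, outside the request path.

```apache
# Log client IP addresses; do not resolve them at request time.
# Off is already the default, so this line just makes it explicit.
HostnameLookups Off

# Assumed log location, relative to ServerRoot.
CustomLog logs/access_log common

# Hostnames can be filled in later, offline, for example:
#   logresolve < access_log > access_log.resolved
```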
I've discussed .htaccess files before, and briefly touched on the performance aspects of using them. The short form here is that you should avoid using .htaccess files whenever possible. They are a huge performance drain.
The reason for this is two-fold.
First, there's the fact that Apache has to look in the .htaccess file every single time a resource is requested from the directory in question. .htaccess files are not cached, and changes to them take effect immediately. So Apache has to check for that file every time, which means opening that .htaccess file, reading it in, and parsing the contents with every single request.
But, wait, there's more! Because .htaccess files apply to subdirectories, Apache will have to check the directory above, and perhaps the one above that, and so on, until it reaches a directory where .htaccess files are not permitted. This means that every resource requested from that directory generates two or three or four (etc.) file system accesses, even if there aren't any .htaccess files in those directories -- Apache still has to look.
The moral here is to set AllowOverride None wherever possible, and for places that you really need .htaccess files, turn the feature on only for that directory. Or, better yet, put directives in httpd.conf, where they belong. (Yes, there are times when .htaccess files are useful. I just think it's less often than some folks seem to think.)
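One way this might look in httpd.conf is sketched below. The second directory path and the choice of AuthConfig are assumptions for illustration; pick whatever override class your .htaccess files actually need.

```apache
# Turn .htaccess processing off everywhere by default, so Apache
# never has to look for the file on the way down the tree.
<Directory />
    AllowOverride None
</Directory>

# Hypothetical directory where .htaccess files are really needed.
# Only this subtree pays the lookup cost, and only authentication
# directives are honored there.
<Directory "/usr/local/apache/htdocs/users">
    AllowOverride AuthConfig
</Directory>
```

If you control the server, the same directives can usually go straight into that second Directory block instead, and AllowOverride can stay at None there too.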
Content negotiation is a feature that uses the user's browser preferences to determine what variant of a resource (e.g., which of several languages) is served up. While this is a wonderful feature, it comes with a pretty large performance price. Don't use it unless you need it. In practical terms, this means removing MultiViews from your Options lines in your config files if you're not using it.
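For example, assuming a document root like the one below (the path is made up), either of these approaches leaves MultiViews off. The first simply omits it from the list; the second, shown commented out, removes it from whatever options were inherited.

```apache
# Hypothetical document root.
<Directory "/usr/local/apache/htdocs">
    # List only the options you need; MultiViews is not among them.
    Options Indexes FollowSymLinks

    # Alternative: keep the inherited options and drop just this one.
    # Options -MultiViews
</Directory>
```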
This list is not complete by any means, and there's more thorough documentation on the Apache web site. (See 1.3 and 2.0.) But these are the places where the largest number of mistakes are made, so they are a pretty good place to start.
And, if you're doing Perl CGI programs, make sure you take a look at
And be sure to drop by #apache with any further questions.
Rich Bowen is a member of the Apache Software Foundation, working primarily on the documentation for the Apache Web Server. DrBacchus, Rich's handle on IRC, can be found on the web at www.drbacchus.com/journal.
In November 2003, O'Reilly Media, Inc. released Apache Cookbook.
Sample Chapter 9, "Error Handling," is available free online.