Apache DevCenter

A Day in the Life of #Apache
Tips for Making Your Server Run Faster

by Rich Bowen, coauthor of Apache Cookbook

Editor's note: After a brief summer hiatus that included a trip to Portland, Oregon for OSCON 2004, Rich Bowen is back this month with his latest column based on his conversations on the IRC channel #apache. Want to know how to make your web site faster? Rich has some tips to enhance your server's performance.

#apache is an IRC channel on the irc.freenode.net IRC network. To join this channel, you need to install an IRC client (XChat, mIRC, and BitchX are popular clients) and enter the following commands:

/server irc.freenode.net
/join #apache

Day Eight


First, a note. I'm writing this at OSCON 2004. It is Friday morning, and this article was due on Tuesday evening. So, many thanks to my editor for her understanding, and to paraphrase Douglas Adams, here are some words of wisdom from fajita:

<DrBacchus> fajita: deadlines
<fajita> The great thing about deadlines is the whooshing sound they make as they fly past.

Today we're talking about the rather common question that comes up a couple of times every week:

<Quixote> How do I make my server run faster?

As you might imagine, the answers to this vary greatly, and primarily depend on the type of content you have on your web site, and in what ways you have already reconfigured your server. So we'll approach this question from a variety of different angles. Also, before we get started, you might be interested to know that there's a document on the Apache site that addresses this question, somewhat. That document may be found at httpd.apache.org/docs-2.0/misc/perf-tuning.html.

We'll start with another question.

<Quixote> How do I measure performance?
<DrBacchus> fajita: benchmarking tools
<fajita> Some available benchmarking tools include ab, flood, jmeter, daiquiri, siege

Measuring performance is a tricky business, because you're seldom measuring exactly the thing that you want to measure. You typically want to know how a web site will perform in real-world scenarios. Most measuring tools try to simulate these scenarios to some degree, but it's never quite the same thing. As long as you understand that it's not the same thing, these tools can be valuable in making your web site faster.

The tools that fajita recommends are all freely available, and have various degrees of functionality. I will not spend this article telling you about all of them, although we will look at ab a little, because it comes with Apache, and is a convenient starting place for performance testing.

Now that that's out of the way, we'll talk briefly about using ab.

ab is a very simple-minded benchmarking tool. Don't be fooled into thinking that it simulates reality in any meaningful way. However, it's still valuable for checking whether your performance changes have actually made a difference. ab requests a URL multiple times (in the example here, 1,000 requests in total, 10 at a time), and then reports a few statistics about those transactions:

ab -n 1000 -c 10 http://localhost/index.html

Here's some partial output from that command:

Concurrency Level:      10
Time taken for tests:   2.303 seconds
Complete requests:      1000
Failed requests:        0
Broken pipe errors:     0
Total transferred:      253260 bytes
HTML transferred:       12060 bytes
Requests per second:    434.22 [#/sec] (mean)
Time per request:       23.03 [ms] (mean)
Time per request:       2.30 [ms] (mean, across all concurrent requests)
Transfer rate:          109.97 [Kbytes/sec] received

The information provided in this output allows for some basic measurement of the speed of a particular resource, so that you can observe any changes that occur when you modify your server configuration. Now that you have a way to verify that my recommendations are actually doing something, let's get on with the tips.

DNS

Avoid DNS whenever possible. It is slow, and it is out of your control. DNS lookups take as long as they take. So you want to avoid forcing Apache to do them.

There are two particular places where this might come up: access control and logging.

If you are using Allow from or Deny from lines in your configuration to do access control based on the address of the client, try to use the IP address, rather than the hostname, of the client you want to permit or deny. If you use a hostname, Apache will have to do a reverse DNS lookup on the client's address, followed by a forward lookup to verify the result, in order to determine if the client in question is from that hostname.
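
As a rough sketch of the difference, the block below restricts a directory by address rather than by name. The directory path, the network range, and the hostname in the comment are made-up values for illustration, not anything from a real configuration.

```apache
# Hypothetical directory; substitute your own path.
<Directory "/usr/local/apache2/htdocs/private">
    Order deny,allow
    Deny from all
    # Compared directly against the client's IP address, so no DNS is needed.
    Allow from 192.168.1.0/24
    # A hostname here would force DNS lookups for requests to this directory:
    # Allow from intranet.example.com
</Directory>
```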

One option in your logfile configuration is the directive HostnameLookups. If it's set to Off, which is the default, Apache will log the IP address of the client. If it's set to On, it will instead look up and log the hostname.

Don't do that.

Causing Apache to do a DNS lookup for every client access will slow down performance significantly, and will also cause the number of Apache child processes to grow, because processes that are busy waiting on DNS lookups are not available to serve other requests.
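
In httpd.conf, the safe setting looks something like the following. Off is already the default, so stating it explicitly mostly guards against someone having turned it on elsewhere in the file.

```apache
# Log client IP addresses as they arrive; do not resolve them per request.
HostnameLookups Off
```

If you want hostnames in your reports, one option is to resolve them after the fact, for example by running the logresolve program that ships with Apache over a copy of the log file, so that the lookups happen outside the server.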

.htaccess Files

I've discussed .htaccess files before, and briefly touched on the performance aspects of using them. The short form here is that you should avoid using .htaccess files whenever possible. They are a huge performance drain.

The reason for this is two-fold.

First, there's the fact that Apache has to look in the .htaccess file every single time a resource is requested from the directory in question. .htaccess files are not cached, and changes to them take effect immediately. So Apache has to check for that file every time. Meaning that you're opening that .htaccess file, reading it in, and parsing the contents, with every single request.

But, wait, there's more! Because .htaccess files apply to subdirectories, Apache will have to check the directory above, and perhaps the one above that, and so on, until it reaches a directory where .htaccess files are not permitted. This means that every resource requested from that directory generates two or three or four (etc.) file system accesses, even if there aren't any .htaccess files in those directories -- Apache still has to look.

The moral here is to set AllowOverride None wherever possible, and for places that you really need .htaccess files, turn the feature on only for that directory. Or, better yet, put directives in httpd.conf, where they belong. (Yes, there are times when .htaccess files are useful. I just think it's less often than some folks seem to think.)
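
A minimal sketch of that arrangement might look like this. The second directory path and the choice of AuthConfig are assumptions for the sake of the example; use whatever override category the directory actually needs.

```apache
# Turn off .htaccess processing everywhere by default.
<Directory />
    AllowOverride None
</Directory>

# Hypothetical directory that really does need per-directory overrides.
# Only here, and in directories below it, will Apache look for .htaccess files.
<Directory "/usr/local/apache2/htdocs/members">
    AllowOverride AuthConfig
</Directory>
```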

Content Negotiation

Content negotiation is a feature that uses the user's browser preferences to determine what variant of a resource (e.g., which of several languages) is served up. While this is a wonderful feature, it comes with a pretty large performance price. Don't use it unless you need it. In practical terms, this means removing MultiViews from the Options lines in your config files if you're not using it.
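
One way to do that, sketched here with an assumed document root path, is to switch the option off explicitly and leave any other inherited options alone:

```apache
# Hypothetical document root; substitute your own path.
<Directory "/usr/local/apache2/htdocs">
    # Remove MultiViews from whatever options are inherited for this directory.
    Options -MultiViews
</Directory>
```

If your existing Options line lists options without a + or - prefix, simply deleting the word MultiViews from that line has the same effect.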

Other Resources

This list is not complete by any means, and there's more thorough documentation on the Apache web site (see the performance documentation for both 1.3 and 2.0). But these are the places where the largest number of mistakes are made, and so they are a pretty good place to start.

Other things that you might look at are mod_deflate (mod_gzip if you're on 1.3) and mod_file_cache (mod_mmap_static if you're on 1.3).
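
To give a flavor of what these look like on 2.0, here is a small sketch. It assumes mod_deflate and mod_file_cache are already loaded, and the file path is a made-up example. Treat it as a starting point to read the module documentation against, not a tuned configuration.

```apache
# mod_deflate: compress common text responses before sending them.
AddOutputFilterByType DEFLATE text/html text/plain text/xml

# mod_file_cache: map a frequently requested static file into memory at startup.
# Hypothetical path; the server must be restarted if the file changes.
MMapFile /usr/local/apache2/htdocs/index.html
```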

And, if you're doing Perl CGI programs, make sure you take a look at mod_perl.

And be sure to drop by #apache with any further questions.

Rich Bowen is a member of the Apache Software Foundation, working primarily on the documentation for the Apache Web Server. DrBacchus, Rich's handle on IRC, can be found on the web at www.drbacchus.com/journal.


In November 2003, O'Reilly Media, Inc. released Apache Cookbook.

