What to Log
The directives listed below are most compatible with Apache 2.0 with the
mod_logio module. Later in the article, I will discuss cross-compatibility with Apache 1.3 and other server environments.
Source IP, Time, and Request Line
These directives are already used in the common log file format. They are the three most obvious request metrics to track.
When logging the remote host, it is important to log the client IP address,
not the hostname. To do this, use the
%a directive instead of
%h. Even if
HostnameLookups are turned on, using
%a will only record the IP. For the purposes of the Blackbox
format, reverse DNS should not be trusted.
The %t directive records the time that the request started. It
could be modified using a
strftime format, but it would be better
to keep it as is. That makes it easier to correlate lines between the Blackbox
log file and the common log file.
The %r directive records the first line of text sent by the web
client, which includes the request method, the full URL, and the HTTP protocol.
It is possible to break up this data using individual directives. For example,
you could log a URL without a query string. Again, it's better to keep the
request line intact for comparison.
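As a rough illustration of why keeping these three fields intact is convenient, here is a minimal Python sketch that pulls them back out of a log line. It assumes the fields appear first on the line, as a format fragment like `%a %t \"%r\"` would write them; that fragment and the names `LINE_RE` and `parse_prefix` are made up for this example, not part of the Blackbox format itself.

```python
import re
from datetime import datetime

# Assumed layout: client IP, [timestamp], "request line" at the start of the line.
LINE_RE = re.compile(r'^(?P<ip>\S+) \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)"')

def parse_prefix(line):
    """Return (ip, datetime, method, url, protocol), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Default %t output looks like 10/Oct/2000:13:55:36 -0700
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    parts = m.group("request").split(" ")
    if len(parts) != 3:
        # Unusual request line: keep it whole rather than guess at its pieces.
        return m.group("ip"), when, None, m.group("request"), None
    method, url, protocol = parts
    return m.group("ip"), when, method, url, protocol
```

Because the request line is logged whole, the method, URL, and protocol can still be separated at analysis time. Request lines containing escaped quotes would need a more careful pattern than this one.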
Process id and Thread id
When the Apache server starts, it spawns off child processes to handle incoming requests. As it runs, it shuts down older processes and adds new ones. Apache can add additional child processes if it needs to keep up with a high demand. By recording the process id and thread id (if applicable), you will have a record of which child process handled an incoming client.
You can also track the number of Apache processes for a given time and
determine when a child process shut down. If you are running an application
inside the server process (for example, with mod_python), recording the PID
will make it easier to find out what hits a child process was handling when
debugging an application error.
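One way to use the recorded process id is to group hits by PID and note when each child was first and last seen. The sketch below assumes the log has already been parsed into `(timestamp, pid)` pairs; the function name `pid_activity` and that record shape are hypothetical.

```python
def pid_activity(records):
    """Map each PID to (first_seen, last_seen, hits).

    `records` is an iterable of (timestamp, pid) pairs taken from the log.
    """
    seen = {}
    for ts, pid in records:
        if pid not in seen:
            seen[pid] = [ts, ts, 1]
        else:
            entry = seen[pid]
            entry[0] = min(entry[0], ts)  # earliest hit handled by this child
            entry[1] = max(entry[1], ts)  # latest hit handled by this child
            entry[2] += 1
    return {pid: tuple(v) for pid, v in seen.items()}
```

The last-seen time only approximates when a child shut down, since an idle child leaves no log lines and the operating system may eventually reuse a PID.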
Connection Status
The connection status directive tells us detailed information about the
client connection. It returns one of three flags:
X if the client
aborted the connection before completion,
+ if the client has
indicated that it will use keep-alives (and request additional URLs), or
- if the connection will be closed after the request.
Keep-Alive is an HTTP 1.1 directive that informs a web server
that a client can request multiple files during the same connection. This way a
client doesn't need to go through the overhead of re-establishing a TCP
connection to retrieve a new file.
For Apache 1.3, use the
%c directive in place of the %X directive used by Apache 2.0.
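To show how the three flags might be used in analysis, the following Python sketch tallies them and reports the share of aborted requests. It assumes the flags have already been extracted from each log line into a list; `FLAGS` and `connection_summary` are names invented for this example.

```python
from collections import Counter

# The three connection-status flags and a short label for each.
FLAGS = {"X": "aborted", "+": "keep-alive", "-": "closed"}

def connection_summary(flags):
    """Count connection-status flags and return (counts, share of aborted requests)."""
    counts = Counter(FLAGS.get(f, "unknown") for f in flags)
    total = sum(counts.values())
    aborted_share = counts["aborted"] / total if total else 0.0
    return dict(counts), aborted_share
```

A rising share of aborted connections can be a hint that responses are taking too long for clients to wait.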
Status Codes
There's nothing really new about the status code directive, since it's already used in
the common log file format. The CLF records the status code, after any
redirections take place, with
%>s. For the Blackbox
format, we will want to record the status code before and after the redirection, using %s and %>s.
Time to Serve Request
The common log file format cannot accurately determine the amount of time it takes to serve a file. Some parsing programs will try to make estimates based on the timestamps of hits from the same source, but this is very unreliable, especially if the hits are being made in parallel.
These two directives will give you the exact metrics you need. The
%T directive will report the time in seconds it took to handle the
request while the
%D directive will report the same time in microseconds.
Apache 1.3 does not support the %D directive.
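With exact timings in the log, finding slow requests becomes a sorting problem rather than guesswork. The sketch below assumes both values are written as a single field of the form `%T/%D` (for example `2/2150000`); that field layout and the names `parse_serve_time` and `slowest` are assumptions made for illustration.

```python
def parse_serve_time(field):
    """Split an assumed '%T/%D' field into (seconds, microseconds) integers."""
    secs_text, micros_text = field.split("/")
    return int(secs_text), int(micros_text)

def slowest(requests, n=3):
    """Return the n requests with the largest microsecond timings.

    `requests` is an iterable of (request_line, microseconds) pairs.
    """
    return sorted(requests, key=lambda r: r[1], reverse=True)[:n]
```

The microsecond value is the more useful of the two for ranking, since most requests complete in well under a second and would all log as zero seconds.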
Bytes Sent and Received
Apache 2.0 includes the optional
mod_logio module which can
report on how many bytes of traffic a client sent and how many bytes the
server returned. The %b directive does a good job, but it only reports the bytes
returned in the requested object, excluding the bytes from the HTTP headers.
The header traffic is usually small, but you may want to record it to get a
better idea of what outgoing TCP traffic for a given interface can be like.
Recording the incoming bytes is helpful when your users are uploading files
with the PUT or POST methods.
Use %I to record incoming bytes,
%O to record
outgoing bytes, and
%B to record outgoing content bytes. In cases
where no content is returned, the
%B directive returns a zero, whereas
%b returns a dash. Since we're dealing with integer
values, it's better to use %B.
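The difference between the two outgoing counters gives a rough estimate of header overhead, and the dash problem is easy to see in code. This Python sketch assumes the three counters are logged as one field of the form `%I/%O/%B` (for example `512/1843/1536`); that layout and the names `parse_bytes` and `clf_bytes` are hypothetical.

```python
def parse_bytes(field):
    """Split an assumed '%I/%O/%B' field and estimate header overhead."""
    incoming, outgoing, content = (int(x) for x in field.split("/"))
    return {
        "in": incoming,
        "out": outgoing,
        "content": content,
        # Outgoing bytes minus content bytes: roughly the response headers.
        "header_overhead": outgoing - content,
    }

def clf_bytes(value):
    """Convert a %b value to an integer; %b logs '-' when no content is sent."""
    return 0 if value == "-" else int(value)
```

With `%B` the extra conversion step in `clf_bytes` is unnecessary, because the field is always an integer. The overhead figure is only an approximation if other modules change the size of the response on its way out.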