The Access Log
Let's see what's lurking inside that log.
For this walkthrough, I'm assuming your Apache server has been
configured to use the Common Log Format (CLF), the default in a
fresh Apache installation. Your httpd.conf file should contain
the following configuration directive:
CustomLog logs/access_log common
Now open your access log. Its location depends on your layout
preferences and installation method.
The Apache 1.3.9 RPM installation under Red Hat 6.1 places logs in the
/etc/httpd/logs directory. The source and binary installs
typically use /usr/local/apache/logs/access_log. The default
filename under Windows is access.log.
Let's zoom in on one fairly representative line in a log:
123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] "GET /mypage.html HTTP/1.1" 200 10369
123.45.678.90 | The visitor's IP address. If you particularly need the visitor's host name, read the Apache documentation on the HostnameLookups directive.
- - | The first of the two dashes is a placeholder for something called ident, a less trustworthy form of client identification. That's about all I'll say on this; for further information, see Apache's IdentityCheck directive. The second dash is a placeholder for the user name supplied by a visitor if required to log in to gain access to a password-protected section of the web site. Say, for example, I restricted access to a
[07/Mar/2000:14:27:12 -0800] | The date, time, and time zone.
GET /mypage.html | The visitor's request, in this case the You'll often see requests consisting only of a slash,
HTTP/1.1 | The browser's request protocol, in this case HTTP, version 1.1. An older, yet still very common, protocol is HTTP 1.0.
200 | An HTTP status code is returned as part of the response to the visitor's browser.
10369 | The number of bytes returned to the visitor, excluding headers (status codes and the like). In the case of a 304 Not Modified status (see above), this value is the usual
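To make the field-by-field breakdown above concrete, here is a minimal Python sketch that splits one Common Log Format line into the same pieces. The names CLF_PATTERN and parse_clf_line are my own inventions for illustration, not part of Apache. The sketch assumes an English locale for the month abbreviation and treats a dash in the byte count as zero, which may or may not suit your own reporting.

```python
import re
from datetime import datetime

# One CLF line: host ident user [time] "request" status bytes
# (hypothetical helper names; this is a sketch, not an Apache API)
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one CLF line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # A dash is a placeholder meaning "no value was logged".
    for key in ("ident", "user"):
        if fields[key] == "-":
            fields[key] = None
    fields["status"] = int(fields["status"])
    # Assumption: count a dash in the byte field as zero bytes.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    # %b (month name) is locale-dependent; an English locale is assumed here.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    # A well-formed request has three parts: method, path, protocol.
    parts = fields["request"].split()
    if len(parts) == 3:
        fields["method"], fields["path"], fields["protocol"] = parts
    return fields


sample = ('123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] '
          '"GET /mypage.html HTTP/1.1" 200 10369')
entry = parse_clf_line(sample)
print(entry["host"], entry["method"], entry["path"], entry["status"], entry["bytes"])
```

Anything the pattern does not recognize comes back as None, so a script looping over a real log can simply skip those lines rather than crash on them.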
Logging in Apache (version 1.2 and later) is handled by the
mod_log_config module,
which enables you to customize how your logs look and work. Your
httpd.conf file contains some popular log formats to get you started:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
Each log format starts with the LogFormat directive, followed by a string of tokens that describe how each line of the log file should look, and ends with a nickname given to the format. The mod_log_config documentation has a comprehensive list of tokens and their meanings. How you want your logs displayed and how many files you want them split into is up to you. Some site authors separate log files into referrer and agent logs. I prefer to use the "combined" log format and keep everything in one place.
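One way to get a feel for how the tokens map onto a log line is to turn a format string into a regular expression. The Python sketch below does that for just the handful of tokens used in the formats above. The names SIMPLE_TOKENS and format_to_regex are hypothetical, the format strings are written with plain quotes rather than the \" escapes httpd.conf needs, and any token outside this small subset raises a KeyError instead of being guessed at.

```python
import re

# Regex fragments for the plain tokens used in the formats above (subset only).
SIMPLE_TOKENS = {
    "%h": r"(?P<host>\S+)",            # visitor's host or IP address
    "%l": r"(?P<ident>\S+)",           # ident placeholder
    "%u": r"(?P<user>\S+)",            # authenticated user name
    "%t": r"\[(?P<time>[^\]]+)\]",     # [date:time zone]
    "%r": r'(?P<request>[^"]*)',       # first line of the request
    "%>s": r"(?P<status>\d{3})",       # final HTTP status code
    "%b": r"(?P<bytes>\d+|-)",         # bytes sent, excluding headers
    "%U": r"(?P<url_path>\S+)",        # requested URL path
}

# Matches %{Header-Name}i tokens, or simple tokens such as %h and %>s.
TOKEN_RE = re.compile(r"%\{([^}]+)\}i|%>?[a-zA-Z]")


def format_to_regex(log_format):
    """Build a compiled regex from a LogFormat-style string (sketch only)."""
    pieces = []
    pos = 0
    for token in TOKEN_RE.finditer(log_format):
        # Literal text between tokens (spaces, quotes, "->") must match exactly.
        pieces.append(re.escape(log_format[pos:token.start()]))
        if token.group(1) is not None:
            # Request header token: name the group after the header.
            name = token.group(1).lower().replace("-", "_")
            pieces.append('(?P<%s>[^"]*)' % name)
        else:
            pieces.append(SIMPLE_TOKENS[token.group(0)])
        pos = token.end()
    pieces.append(re.escape(log_format[pos:]))
    return re.compile("".join(pieces) + "$")


common = format_to_regex('%h %l %u %t "%r" %>s %b')
hit = common.match('123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] '
                   '"GET /mypage.html HTTP/1.1" 200 10369')
print(hit.group("request"), hit.group("status"))
```

The same function handles the "referer" format ('%{Referer}i -> %U') because everything that is not a token is treated as literal text.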
Let's say I wish to use "common" log format, but also want to keep track
of who is linking to my site. I could just use "combined" format, but
I don't really care what type of browser (agent) my visitor is using.
Instead, I'll create a new LogFormat directive like so:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\"" commonish
Now that I've defined my preferred log format, I need to tell Apache to use it. A CustomLog directive that names my "commonish" format does the job:
CustomLog logs/commonish_log commonish
where logs/commonish_log is the path to my log file relative
to my
ServerRoot.
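Since the whole point of "commonish" is to see who is linking to my site, here is a rough Python sketch that tallies the referrers in a log written in that format. The sample lines and the example.com and example.org addresses are made up for illustration, and COMMONISH and count_referers are hypothetical names of my own. The sketch skips entries whose referrer is empty or a dash, on the assumption that those represent visitors who arrived without following a link.

```python
import re
from collections import Counter

# "common" fields followed by a quoted Referer, as in the commonish format.
COMMONISH = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?:\d+|-) "(?P<referer>[^"]*)"'
)


def count_referers(lines):
    """Count how often each referrer appears; unmatched lines are ignored."""
    counts = Counter()
    for line in lines:
        match = COMMONISH.match(line)
        if match is None:
            continue
        referer = match.group("referer")
        # Assumption: "" or "-" means no Referer header was sent.
        if referer not in ("", "-"):
            counts[referer] += 1
    return counts


# Made-up sample lines, for illustration only.
sample_lines = [
    '123.45.678.90 - - [07/Mar/2000:14:27:12 -0800] "GET /mypage.html HTTP/1.1" '
    '200 10369 "http://www.example.com/links.html"',
    '123.45.678.91 - - [07/Mar/2000:14:30:02 -0800] "GET /mypage.html HTTP/1.0" '
    '200 10369 "-"',
    '123.45.678.92 - - [07/Mar/2000:14:31:45 -0800] "GET / HTTP/1.1" '
    '200 2048 "http://www.example.com/links.html"',
    '123.45.678.93 - - [07/Mar/2000:14:33:10 -0800] "GET /mypage.html HTTP/1.1" '
    '304 - "http://www.example.org/"',
]

referer_counts = count_referers(sample_lines)
for referer, hits in referer_counts.most_common():
    print(hits, referer)
```

To run it against a real log, pass an open file instead of the sample list, for example count_referers(open("logs/commonish_log")), adjusting the path to wherever your ServerRoot puts it.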
You can actually skip the LogFormat directive and include
your preferred log format string in place of the nickname in your
CustomLog directive -- it's up to you.
We've only just scratched the surface of log customization. For much more, be sure to read the detailed mod_log_config documentation.