To put the generated reports in context, begin by looking at the raw log data and, from there, define some basic web analytics terminology.
Anatomy of a web server log file
With the log format configured earlier, each web log consists of many lines of text, one per request, each containing nine fields of data. To understand the work AWStats has to perform, consider what a single record looks like:
|1||Host (user) IP||d81-211-134-62.cust.tele2.it||There has been a DNS lookup in this case. The web server can do it, but you can also do it later, if you do it at all. Judging from the user's host, there is a reasonable probability that the request came from Italy. (However, if the host were something like proxy.alitalia.it, the user might have been working for Alitalia in Boston!)|
|2||RFC 1413 identity (username) of the client determined by identd||-||Rarely used. PC clients do not usually run identd.|
|3||Authenticated User (login name)||-||The login name for a web server-required login. This is not usually present; most web sites use application server logins, not web server logins.|
|4||The date and time that the server finished processing the request||||The time includes the UTC (Coordinated Universal Time) offset.|
|5||The user request||||In this case, the client requested the top-level default document, / (index.html).|
|6||Response Status sent to client||
|7||Bytes sent, excluding HTTP headers||
|8||Referer (sic) URL, if any||http://www.antezeta.com/about.html||The URL from which the client made the request. This field is blank if the user directly types a URL, chooses a bookmark, or uses privacy software that blocks the information from being sent.|
|9||User-Agent identification as reported by the user agent. This usually includes operating system and browser names and versions.||||This is a Firefox 1.0.4 browser on a Fedora Linux system. Note: some browsers, such as Opera, let the user choose which identification to send. A user can claim to use Microsoft Internet Explorer 6 even while using Opera. This impostor functionality is a response to all the poorly designed "Optimized for browser x" sites that refuse to work with other, legitimate standards-compliant browsers.|
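To make the nine fields in the table above concrete, here is a minimal Python sketch that splits one record into named parts. It assumes the configured format is the Apache combined log format, which the nine fields match. The sample line is hypothetical: the host and referer come from the table, while the timestamp, request method and protocol, byte count, and user-agent string are invented for illustration. The names `LOG_PATTERN`, `parse_record`, and `parse_time` are likewise made up for this example and are not part of AWStats.

```python
import re
from datetime import datetime

# One named group per field, in the order used by the table above.
# Assumption: Apache "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) '            # 1. host name or IP address
    r'(?P<ident>\S+) '           # 2. RFC 1413 identity, usually "-"
    r'(?P<user>\S+) '            # 3. authenticated user, usually "-"
    r'\[(?P<time>[^\]]+)\] '     # 4. date, time, and UTC offset
    r'"(?P<request>[^"]*)" '     # 5. the user request
    r'(?P<status>\d{3}) '        # 6. response status
    r'(?P<bytes>\d+|-) '         # 7. bytes sent, "-" if none
    r'"(?P<referer>[^"]*)" '     # 8. referer URL, may be "-" or empty
    r'"(?P<agent>[^"]*)"'        # 9. user-agent string
)


def parse_record(line):
    """Return a dict of the nine fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


def parse_time(value):
    """Convert a timestamp such as '10/Jun/2005:14:05:12 +0200' to a datetime."""
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")


# Hypothetical record: only the host and the referer are taken from the table.
SAMPLE = (
    'd81-211-134-62.cust.tele2.it - - [10/Jun/2005:14:05:12 +0200] '
    '"GET / HTTP/1.1" 200 5120 "http://www.antezeta.com/about.html" '
    '"ExampleBrowser/1.0 (illustrative user agent)"'
)

record = parse_record(SAMPLE)
for name, value in record.items():
    print(f"{name:8} {value}")
print("UTC offset:", parse_time(record["time"]).utcoffset())
```

A regular expression is used here rather than a simple split on spaces because the timestamp, request, and user-agent fields contain spaces themselves.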
|1||HTML text file; for example, index.html|
|1||CSS formatting instructions file|
|6||GIF, ICO, and PNG image files|
Probably the most common web metric bandied about, "hits" is also the most meaningless.
- A hit is a successful request for an object from a web server. A successful request usually returns a status code of 200 or, for objects identical to the copy already in the user's cache, 304.
Along with bandwidth consumption, hits can be useful as an input for server sizing and capacity planning. While people make much of hits to tout the success of a site, hits have no intrinsic business value. Claims to the contrary usually indicate a misunderstanding of how little hits say about business performance.
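The definition of a hit given above can be sketched in a few lines of Python. This is only an illustration, not how AWStats itself computes the figure: the `(status, bytes_sent)` pairs are invented sample data, and `HIT_STATUSES` and `summarize` are hypothetical names chosen for this example.

```python
# Status codes counted as hits, following the definition above.
HIT_STATUSES = {200, 304}


def summarize(records):
    """Count hits and total bytes sent for (status, bytes_sent) pairs."""
    hits = 0
    bandwidth = 0
    for status, bytes_sent in records:
        # Every response consumes bandwidth, whether or not it is a hit.
        bandwidth += bytes_sent
        if status in HIT_STATUSES:
            hits += 1
    return hits, bandwidth


# Hypothetical responses: three fresh objects, one cached, one not found.
SAMPLE_RECORDS = [
    (200, 5120),
    (200, 830),
    (304, 0),
    (404, 310),
    (200, 1200),
]

hits, bandwidth = summarize(SAMPLE_RECORDS)
print(f"hits: {hits}, bytes sent: {bandwidth}")
```

Both totals describe load on the server, which is why they suit capacity planning better than they suit measuring business success.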