ONLamp.com

Analyzing Web Logs with AWStats

Building and Updating the AWStats Statistics Database

AWStats uses intermediary files to produce its reports--one for each month of each year for each configuration file you have created. These files represent a compact, optimized version of raw web server log file data, based on preference settings in the AWStats configuration file. Run the command appropriate for your operating system to generate a statistics file for the web log saved earlier in the temporary directory (replace antezeta with your domain name):



$ perl /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -config=antezeta \
    -update -LogFile=/tmp/access.log

> perl "C:\Program Files\AWStats\wwwroot\cgi-bin\awstats.pl" -config=antezeta ^
    -update -LogFile=C:\temp\access.log

You should see output similar to this Windows example:

Update for config "C:\Program Files\AWStats\wwwroot\cgi-bin/
    awstats.antezeta.conf"
With data in log file "C:\temp\access.log"...
Phase 1 : First bypass old records, searching new record...
Searching new records from beginning of log file...
Phase 2 : Now process new records (Flush history on disk after 20000 hosts)...
Jumped lines in file: 0
Parsed lines in file: 539
 Found 1 dropped records,
 Found 4 corrupted records,
 Found 0 old records,
 Found 534 new qualified records.

This will generate a statistics file awstatsMMYYYY.antezeta.txt in the same directory as awstats.pl (unless you gave a different value to DirData in awstats.antezeta.conf):

Directory of C:\Program Files\AWStats\wwwroot\cgi-bin
06/23/2005 03:51 PM 6,633 awstats062005.antezeta.txt

where MM is the month and YYYY the year of the web server log data. Should the input data bridge two months, the statistics database will consist of two statistics files.
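As a rough illustration of the naming scheme, the following Python sketch lists the statistics files you would expect for a set of log dates. The function name stats_file_names is hypothetical and not part of AWStats; it only mirrors the awstatsMMYYYY.config.txt pattern described above.

```python
from datetime import date

def stats_file_names(log_dates, config):
    """Return the monthly statistics file names covering the given log dates."""
    # Collapse the dates to distinct (year, month) pairs, in chronological order.
    months = sorted({(d.year, d.month) for d in log_dates})
    # Naming pattern taken from the article: awstatsMMYYYY.<config>.txt
    return [f"awstats{month:02d}{year:04d}.{config}.txt" for year, month in months]

# A log running from 30 June to 1 July 2005 bridges two months,
# so it should map to two statistics files.
names = stats_file_names([date(2005, 6, 30), date(2005, 7, 1)], "antezeta")
print(names)
```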

Now rerun the previous command against the same log file. This time, instead of 534 new records, AWStats reports 534 old ones:

Update for config "C:\Program Files\AWStats\wwwroot\cgi-bin/
    awstats.antezeta.conf"
With data in log file "C:\temp\access.log"...
Phase 1 : First bypass old records, searching new record...
Searching new records from beginning of log file...
Jumped lines in file: 0
Parsed lines in file: 539
 Found 1 dropped records,
 Found 4 corrupted records,
 Found 534 old records,
 Found 0 new qualified records.

AWStats notices that it has already seen these records and correctly ignores them. It is less flexible about log files processed out of order: it must process them chronologically. If you skip a day's log and then try to process it after later days have been processed, AWStats will ignore it. The solution is to delete that month's statistics file and reprocess the log data for the entire month to date. Similarly, some AWStats configuration file changes affect how the statistics files are generated. If your log files are not large and you are in doubt, delete the statistics file(s) and reprocess your logs.
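If you drive the update from a script, it can help to check the record counts rather than read them by eye. The Python sketch below pulls the counts out of the summary text; it assumes the output looks like the sample shown above, which may differ between AWStats versions, and parse_update_summary is a hypothetical helper name.

```python
import re

def parse_update_summary(output):
    """Extract the record counts from the text printed by an -update run."""
    counts = {}
    # Lines look like " Found 534 old records," in the sample output.
    for number, label in re.findall(r"Found (\d+) (dropped|corrupted|old|new)", output):
        counts[label] = int(number)
    parsed = re.search(r"Parsed lines in file: (\d+)", output)
    if parsed:
        counts["parsed"] = int(parsed.group(1))
    return counts

# Summary lines copied from the second run shown in the article.
sample = """Parsed lines in file: 539
 Found 1 dropped records,
 Found 4 corrupted records,
 Found 534 old records,
 Found 0 new qualified records."""

counts = parse_update_summary(sample)
if counts["new"] == 0 and counts["old"] > 0:
    print("Nothing new was added; the log was already processed or is out of order.")
```

A run that reports only old records is a hint that the month's statistics file may need to be deleted and rebuilt.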

"Corrupted" record tips

  • Some web servers, such as Microsoft IIS, insert comment lines in the web log file every time the server starts. These lines will appear in the "corrupted" count above.
  • If your percentage of corrupted records is high and you have many records, you probably have a mismatch between the fields and their order in your log file, and what you have declared with the AWStats LogFormat parameter. Should this be the case, visually compare the first few lines of your log file with the LogFormat option, adjusting your configuration if necessary.
  • Some ISPs sort web logs by host after performing a reverse DNS lookup. Web log analysis programs expect logs to be in chronological order. If the log is out of order, sort it on the date and time field before passing it to AWStats.
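The sorting step in the last tip could be sketched in Python as follows. This is a minimal example that assumes Apache-style Common Log Format timestamps in square brackets; a log that keeps the date and time in separate fields would need a different key function. The sample lines and the helper names are made up for illustration.

```python
from datetime import datetime

def timestamp(line):
    """Parse the bracketed date field of a Common Log Format line."""
    # Example field: [23/Jun/2005:15:51:02 +0200]
    field = line[line.index("[") + 1 : line.index("]")]
    return datetime.strptime(field, "%d/%b/%Y:%H:%M:%S %z")

def sort_log_lines(lines):
    """Drop blank and '#' comment lines and return the rest in chronological order."""
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    # sorted() is stable, so lines with equal timestamps keep their original order.
    return sorted(records, key=timestamp)

# Hypothetical sample data: two requests out of order plus one comment line.
lines = [
    '10.0.0.2 - - [23/Jun/2005:15:51:02 +0200] "GET /b.html HTTP/1.0" 200 512',
    "#Software: hypothetical server banner",
    '10.0.0.1 - - [23/Jun/2005:09:12:40 +0200] "GET /a.html HTTP/1.0" 200 1024',
]
for line in sort_log_lines(lines):
    print(line)
```

A line without a bracketed timestamp raises ValueError in this sketch, which is one way to spot genuinely malformed records before handing the file to AWStats.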

Log retention tip

Storing the original log files for extended periods is a good practice, unless legal or company policy dictates otherwise. Access to historical logs lets you regenerate your reports if you subsequently make a configuration file change or decide to migrate to another web log analysis tool.

Producing the First Reports

After you have created a statistics database, it's possible to run reports. While AWStats supports a very nice on-demand web CGI interface, it's easy to create static HTML reports to avoid having to reconfigure your web server. The following commands will generate the reports in the /tmp or C:\temp directory:

$ perl "/usr/local/awstats/tools/awstats_buildstaticpages.pl" \
    -config=antezeta -lang=en \
    -awstatsprog="/usr/local/awstats/wwwroot/cgi-bin/awstats.pl" \
    -dir="/tmp" \
    -diricons="/usr/local/awstats/wwwroot/icon"

> perl "C:\Program Files\AWStats\tools\awstats_buildstaticpages.pl" ^
    -config=antezeta ^
    -lang=en -awstatsprog="C:\Program Files\AWStats\wwwroot\cgi-bin\awstats.pl" ^
    -dir="C:\temp" -diricons="../Program%20Files/AWStats/wwwroot/icon"

AWStats creates the HTML reports in the temp directory specified by -dir; the main index file is awstats.config.html (for this example, awstats.antezeta.html). Open it in a web browser.

Report graph tip

Should the report graphs appear blank rather than colored, check the directory given with the -diricons parameter. This value is hardcoded in the generated HTML files. In the Windows example above, the space in the directory name had to be encoded as %20, and the path uses URL-style forward slashes rather than Windows backslashes.
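To avoid encoding the path by hand, a small Python sketch like the one below can produce the -diricons value from a relative Windows path. The helper name diricons_url is hypothetical; it relies only on the standard library's urllib.parse.quote.

```python
from urllib.parse import quote

def diricons_url(path):
    """Turn a relative Windows path into a URL-style value for -diricons."""
    # Browsers expect forward slashes in URLs, not Windows backslashes.
    forward = path.replace("\\", "/")
    # quote() percent-encodes spaces as %20 and leaves "/" and "." untouched.
    return quote(forward)

print(diricons_url(r"..\Program Files\AWStats\wwwroot\icon"))
# prints ../Program%20Files/AWStats/wwwroot/icon
```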
