Published on ONLamp.com (http://www.onlamp.com/)


Contents

• Introduction: What is Cricket?

• Installing Cricket

• Configuring Cricket

Monitoring Apache Page-Load Times With Cricket
03/17/2000

Cricket is an easy-to-set-up application for recording page-load times, and it has a nice web-based grapher that will generate charts to display the data in several formats.

Everyone wants something different from our web server. Marketeers want to know the geographical areas that hits are coming from. Editors want to know what content readers find most interesting. Advertisers demand to know how many times their banner ads are delivered, and from what pages.

And of course, as system administrator, I need to know that pages are being delivered promptly. I can't get that information from conventional web server log files. Instead, I need a system that can periodically load-test pages from each of my servers, and log how long it takes. By graphing that data, I can get a feel for how well my servers are running.

For the last year, I have been using a free software package, Cricket, that's perfect for this application.

What is Cricket?

Cricket can be easily set up to record page-load times, and it has a nice web-based grapher that will generate charts to display the data in several formats. Cricket is based on RRDtool, whose ancestor is MRTG (short for "Multi-Router Traffic Grapher"). RRDtool (Round Robin Database tool) is a package that collects data in "round robin" databases. Each database table is sized when it is created and does not grow larger over time, so running Cricket does not slowly fill up your disks. As the data ages, it's averaged. This works fine for the graphing we are doing here.

Each RRDtool table "wraps around": the detailed "hourly" table has readings at 5-minute intervals for the last 24 hours, the "daily" table has averages of the load times for the last seven days, and so on. It's pretty intuitive when you look at the graphs, but be aware that if you use RRDtool as your logging system, you won't be able to get back exact 5-minute readings for something that took place a month ago; only the interpolated averages are stored. This is good enough for most purposes, however.
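To make the round-robin idea concrete, here is a minimal shell sketch of the arithmetic involved. The function names rows_needed and consolidate are made up for illustration and are not part of RRDtool or Cricket, which do this bookkeeping internally. The 5-minute step and 24-hour span are the ones described above; the sample load times are invented.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not RRDtool or Cricket code.

# rows_needed SPAN_SECONDS STEP_SECONDS
# Number of fixed slots a table needs to cover SPAN at one reading per STEP.
rows_needed() {
    echo $(( $1 / $2 ))
}

# consolidate N
# Read one reading per line on stdin and print the average of each
# complete group of N readings. A trailing incomplete group is dropped.
consolidate() {
    awk -v n="$1" '
        { sum += $1; count++ }
        count == n { printf "%.2f\n", sum / n; sum = 0; count = 0 }
    '
}

# 24 hours of 5-minute readings.
rows_needed 86400 300

# Six invented 5-minute load times (seconds) reduced to two averages.
printf '%s\n' 1.0 2.0 3.0 4.0 5.0 6.0 | consolidate 3
```

Because the slot count is fixed up front, new readings overwrite the oldest ones, which is why the data files never grow.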

Tracking page-load times is just one small facet of what Cricket can do for you. I use it to track how much traffic travels across our Etherswitch so that we can track bandwidth utilization of our subnet and each individual host. Cricket uses SNMP (Simple Network Management Protocol) to log data from all sorts of networked gear.

In addition, you can even write your own "collector" scripts that log data on just about anything, and then use Cricket to graph the data. I tested this by logging subscriber counts for a mailing list.
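As a rough idea of what such a script might look like, here is a minimal sketch of a collector that counts mailing-list subscribers. It assumes the list is a plain text file with one address per line, which may not match your list software; the function name and the sample addresses are hypothetical. How Cricket invokes a script and reads its output is set in the configuration, so check the Cricket documentation before wiring this in.

```shell
#!/bin/sh
# count_subscribers FILE
# Print the number of subscriber addresses in FILE, assuming one address
# per line, ignoring blank lines and lines that start with "#".
count_subscribers() {
    # grep -c exits non-zero when the count is 0, so do not treat that as fatal.
    count=$(grep -c -v -e '^[[:space:]]*$' -e '^#' "$1") || true
    echo "$count"
}

# Example run against a throwaway list (the addresses are placeholders).
list=$(mktemp)
cat > "$list" <<'EOF'
# announce list
alice@example.com

bob@example.com
EOF
count_subscribers "$list"
rm -f "$list"
```

The script prints a single number and nothing else, which keeps it easy to feed into whatever is logging the value.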

In this article, we're going to set up Cricket to track web server performance. Cricket comes with good documentation, and sample setups for network gear. Once you've got it running, refer to these documents to take on more ambitious projects. There is also a Cricket mailing list that you can sign up for to get more help. (See the Cricket web site for info on how to sign up.)

When Cricket is installed to log page-load times, I can show folks the before and after pictures when I perform system tuning operations such as adding memory or a new hard drive. This helps to justify the expense by showing concrete evidence that the change has improved load times.

If you make a major software change, you can track its effect on load times. Keep in mind that you have to select the right URLs to track. If you turn on server-side "includes" but point Cricket only at a page with no includes in it, you won't get as sharp a picture of the change resulting from the new software.

Next, we'll walk through the Cricket installation process step by step.


Installing Cricket

For our basic Cricket setup, let's assume you're going to install it on a Linux server that already has Apache running on it. You'll also need the basic development tools installed: make, the GCC compiler, and the Linux kernel headers.

I found these instructions in the beginner.txt file in the cricket/doc directory. My instructions are more explicitly geared for a Linux system; if you are running something else, you'll probably want to refer to the beginner.txt file as well. So, let's get started.

You will need a recent version of Perl -- version 5.004 or newer. To check the version number, use the perl -v command.
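If you would rather have the shell do the comparison than read the perl -v banner, a small sketch like the following may help. The function name perl_at_least is made up for this example; it relies only on Perl's built-in $] variable, which holds the interpreter's version as a number.

```shell
#!/bin/sh
# perl_at_least VERSION
# Succeeds (exit status 0) when the perl found on the PATH reports a
# version number, via the built-in $] variable, of at least VERSION.
perl_at_least() {
    perl -e 'exit($] >= $ARGV[0] ? 0 : 1)' "$1"
}

if perl_at_least 5.004; then
    echo "Perl is new enough for Cricket"
else
    echo "Perl is missing or older than 5.004"
fi
```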

You will also need these packages from CPAN (Comprehensive Perl Archive Network); you may have some of them already.

Package name    Where to get it
MD5             CPAN by-authors/id/GAAS/Digest-MD5-*.tar.gz
LWP             CPAN by-authors/id/GAAS/libwww-perl-*.tar.gz
DB_File         CPAN by-authors/id/PMQS/DB_File-*.tar.gz
Date::Parse     CPAN by-authors/id/GBARR/Timedate-*.tar.gz
Time::HiRes     CPAN by-authors/id/DEWEG/Time-HiRes-*.tar.gz

If you have the CPAN module installed and configured, you can issue the following commands while running as "root." If you have the CPAN module but have not configured it, it will ask some questions the first time you run it. Go ahead and give it a shot; it's not that hard. Otherwise you can go to the CPAN.org web site, find the modules, download and unpack them, and build and install each one by following its ReadMe file. This is known as doing it the hard way.

Each of these CPAN module commands will install the latest version of each package. It is safe to run the command; if the latest version is already installed, it will just tell you that and stop.

 perl -MCPAN -eshell  
 cpan> install MD5  
 cpan> install LWP  
 cpan> install DB_File  
 cpan> install Date::Parse  
 cpan> install Time::HiRes  
 cpan> quit
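Once the installs finish, a quick way to confirm that Perl can actually load everything is to try each module in turn. This is a minimal sketch; the function name missing_modules is invented here. The module names are the ones each archive provides (Digest::MD5 from the Digest-MD5 archive, LWP from libwww-perl, and so on).

```shell
#!/bin/sh
# missing_modules MODULE...
# Print the name of every listed Perl module that fails to load.
missing_modules() {
    for module in "$@"; do
        # -M loads the module; "-e 1" is a do-nothing program.
        perl -M"$module" -e 1 2>/dev/null || echo "$module"
    done
}

missing=$(missing_modules Digest::MD5 LWP DB_File Date::Parse Time::HiRes)
if [ -z "$missing" ]; then
    echo "All modules are installed"
else
    echo "Still missing:"
    echo "$missing"
fi
```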

There are two more packages that are not in the CPAN archives, so you have to fetch and install them separately. Use your web browser to find the latest version of each and download it to a spot on your system where you can unpack and build it. Then use your rootly powers to install it.

First the SNMP_Session package: For this HTTP tracking project we actually don't need to use any SNMP services, but Cricket requires the package so you have to install it anyway.

The SNMP_Session web site is:
http://www.switch.ch/misc/leinen/snmp/perl
You can download the latest version from:
ftp://ftp.switch.ch/software/sources/network/snmp/perl

Here are the abbreviated instructions on building version 0.76:

    % tar xzf SNMP_Session-0.76.tar.gz
    % cd SNMP_Session-0.76
    % perl Makefile.PL
    % make
    % su root
    Password:
    # make install
    # exit
    % cd ..

The RRD package is the heart of Cricket. The main RRD site is:
http://ee-staff.ethz.ch/~oetiker/webtools/rrdtool/.
You can download the package from:
http://ee-staff.ethz.ch/~oetiker/webtools/rrdtool/pub/.

Here are the abbreviated instructions on building version 1.0.11:

    % tar xzf rrdtool-1.0.11.tar.gz
    % cd rrdtool-1.0.11

I had to do this to get configure to work on one of my systems:

    % unset noclobber

    % ./configure
    % make
    % su root
    Password:

This next line will install the RRD Perl modules in your system's standard site-perl directory tree instead of putting them in a separate location (which is what make install does). This is necessary for the Cricket scripts to find the RRD modules.

    # make site-perl-install

Now, as root, create a user account that will run Cricket.

These commands work on a Linux system; use your own preference on your system to create a Cricket user. You don't strictly need a separate Cricket account, but I find it is a lot easier this way.


    # groupadd cricket
    # useradd -g cricket -c 'Cricket Traffic Grapher' cricket
    # passwd cricket
    # chmod 755 ~cricket

Set an alias to receive Cricket's mail.

    # echo "cricket: root" >> /etc/aliases
    # newaliases
    # exit

Download and install Cricket.

    % su - cricket
    Password:

Now that you're running as the cricket user, use a browser to download the Cricket source archive from the Cricket web site.

Here are the abbreviated instructions on installing version 0.72:

    % tar xzf cricket-0.72.tar.gz

Using this symbolic link will allow you to upgrade easily later:

    % ln -s cricket-0.72 cricket
    % cd cricket

Running "configure" will fix up the first line of each Cricket script so they can find Perl on your system.

    % ./configure

Now we get into the configuration; this is the most complicated step. Copy the sample configuration tree to the ~cricket/cricket-config directory, which is where the installed Cricket will look for it.

    % cd ..
    % cp -r cricket/sample-config cricket-config

There are lots of setup files in there that won't be used. You may want them later, and they won't hurt anything in the meantime. For now, we are only interested in cricket-config/http-performance. Edit the urls file.

    % cd cricket-config/http-performance
    % ls
    Defaults  urls
    % emacs urls

The sample file looks like this. Remove those entries and put in entries for whatever you would like to monitor.

    target  cricket-home
        short-desc = "The Cricket Homepage"
        url = "http://www.munitions.com/~jra/test-file.txt"

    target  www.cnn.com
        url = "http://www.cnn.com"

You can have as many targets as you like; each one will cause a database table of approximately 60 kilobytes to be created. If it takes more than five minutes to collect a set of data, you will receive warning messages telling you that the collection subtree is locked. You can ignore these messages or change the collection interval.
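If you have more than a handful of pages to watch, you may prefer to generate the entries rather than type them. The following is a minimal sketch that prints stanzas in the same layout as the sample file above. The function name emit_target and the example.com URLs are placeholders, not anything Cricket provides.

```shell
#!/bin/sh
# emit_target NAME URL [DESCRIPTION]
# Print one target stanza in the same layout as the sample urls file.
emit_target() {
    printf 'target  %s\n' "$1"
    if [ -n "${3:-}" ]; then
        printf '    short-desc = "%s"\n' "$3"
    fi
    printf '    url = "%s"\n\n' "$2"
}

# Hypothetical targets; substitute your own servers.
emit_target homepage http://www.example.com/ "Company home page"
emit_target search http://www.example.com/search.cgi
```

Redirect the output into the urls file, then look it over by hand before compiling.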

The Defaults file in this directory contains settings to control how the data is displayed. For a while, one of my servers was consistently delivering pages in times greater than five seconds. I had to change the setting for y-max so that I could see more data on the graphs. I won't tell you how slow the server was; it's too embarrassing. Every page on that site was generated by Perl scripts.

Each time you make changes to the cricket-config files, you have to recompile them. There will be error messages if you entered anything incorrectly.

   % cd ~
   % cricket/compile

To test the data collector, run it now manually.

   % cricket/collector /http-performance

If it works, you'll see a lot of messages like this indicating Cricket is testing each target in your configuration file:

   [09-Feb-2000 22:26:55 ] Retrieving data 
   (EXEC: /home/cricket/cricket-config/../cricket/util/test-url 
   http://www.xml.com) for XML

The collector will also create the data tables in the ~cricket/cricket-data/http-performance directory the first time you run it.

The collector is run from a script called collect-subtrees. You can set up Cron to run different collection sets at different intervals. The file cricket/subtree-sets defines what is in each set. For our example, you will have to edit that file to change the lines

    set normal:
        /routers
        /router-interfaces

to

    set normal:
        /http-performance
Now, to test collection of the set, run the wrapper script:

   % cricket/collect-subtrees normal
   % exit

You won't see any output from this script, but the wrapper will create another directory, cricket-logs, and log its output to a file there named normal.0.

Making a Cron Entry

Now you are ready to set up a Cron job to run the collection script. Make a Cron entry to run Cricket once every five minutes. I run Cricket from the /etc/cron.d directory. You could run it directly under the Cricket account (using the crontab -e command to edit the file), but I find it easier to keep track of what administrative Cron jobs are installed by putting them all in the /etc/cron.d directory.

   % su root

   # echo "*/5 * * * * cricket /home/cricket/cricket/collect-subtrees normal" > /etc/cron.d/cricket
   
   # exit
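If you take the crontab -e route instead, keep in mind that the two formats differ: a line in a file under /etc/cron.d names the user to run as, while a line in a per-user crontab does not. The sketch below just prints both forms side by side; the function names are made up for the example.

```shell
#!/bin/sh
job="/home/cricket/cricket/collect-subtrees normal"

# system_cron_line USER COMMAND
# Format for a file in /etc/cron.d, which includes the user field.
system_cron_line() {
    printf '*/5 * * * * %s %s\n' "$1" "$2"
}

# user_cron_line COMMAND
# Format for a per-user crontab, which has no user field.
user_cron_line() {
    printf '*/5 * * * * %s\n' "$1"
}

system_cron_line cricket "$job"
user_cron_line "$job"
```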

Now wait until the next 5-minute increment rolls around and watch to see if the data collection happens. Once Cricket has been running for a while, you will see a series of files from normal.0 to normal.20; each time collect-subtrees runs, it renumbers the files so the newest one is always normal.0.
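The renumbering scheme is a common one, and a rough sketch of it fits in a few lines of shell. This is not Cricket's own code, only an illustration of the idea; the function name rotate_logs is invented for the example.

```shell
#!/bin/sh
# rotate_logs DIR NAME KEEP
# Shift DIR/NAME.0 .. DIR/NAME.(KEEP-1) up by one. An existing
# DIR/NAME.KEEP is overwritten, which discards the oldest log and
# leaves room for the next run to write a fresh DIR/NAME.0.
rotate_logs() {
    dir=$1 name=$2 keep=$3
    i=$keep
    while [ "$i" -gt 0 ]; do
        prev=$(( i - 1 ))
        if [ -f "$dir/$name.$prev" ]; then
            mv -f "$dir/$name.$prev" "$dir/$name.$i"
        fi
        i=$prev
    done
}

# Demonstration in a scratch directory.
logs=$(mktemp -d)
echo "first run" > "$logs/normal.0"
rotate_logs "$logs" normal 20
echo "second run" > "$logs/normal.0"
ls "$logs"
rm -rf "$logs"
```

Working from the highest number down means no file is overwritten before it has been moved out of the way.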

Cricket is now logging data. If you modify the files in cricket-config, remember to re-run compile to update the configuration.

Next, we'll cover how to set things up for web browsing.


Web Browsing

Basically everything is already installed, but you have to make symbolic links from the Cricket account's public_html area into the Cricket install. (Using symlinks instead of copying the files makes upgrading very easy, so I highly recommend it.)

    % su - cricket
    Password:

    % mkdir public_html
    % cd public_html
    % ln -s ../cricket/doc doc
    % mkdir cricket
    % cd cricket
    % ln -s ../../cricket/VERSION
    % ln -s ../../cricket/grapher.cgi
    % ln -s ../../cricket/images
    % ln -s ../../cricket/lib
    % ln -s ../../cricket/mini-graph.cgi
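A mistyped path in any of these commands leaves a dangling symlink, which the web server cannot serve. The following minimal sketch lists any links in a directory whose targets do not exist. The function name dangling_links is made up, and the path in the last line assumes you run it as the cricket user.

```shell
#!/bin/sh
# dangling_links DIR
# Print every symbolic link directly inside DIR whose target does not exist.
dangling_links() {
    for entry in "$1"/*; do
        # -L is true for a symlink; -e follows the link to its target.
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            echo "$entry"
        fi
    done
}

# Prints nothing when every link resolves.
dangling_links "$HOME/public_html/cricket"
```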

Configure your Web Server

Your web server must be configured to allow viewing of user directories, and it must allow symlinks and CGIs in the Cricket subdirectory. The following would be appropriate to add to your Apache httpd.conf file. (This part is pretty generic; if you have installed a recent Apache server, it's probably already there.)

UserDir public_html

<Directory /home/*/public_html>
    AllowOverride FileInfo AuthConfig Limit
    Options MultiViews Indexes SymLinksIfOwnerMatch
    <Limit GET POST OPTIONS PROPFIND>
        Order allow,deny
        Allow from all
    </Limit>
    <Limit PUT DELETE PATCH PROPPATCH MKCOL COPY MOVE LOCK UNLOCK>
        Order deny,allow
        Deny from all
    </Limit>
</Directory>

# This is for the Cricket Traffic Grapher

<Directory /home/cricket/public_html/cricket>
    Options SymLinksIfOwnerMatch ExecCGI
</Directory>

Of course, if you have to add this to your httpd.conf file, you will have to tell httpd that your configuration has changed. If you compiled and installed Apache from sources, you can use the apachectl command to signal httpd. This command should work for you.

     apachectl restart

If you used the Red Hat RPM version of Apache, you can use this instead:

     /etc/rc.d/init.d/httpd restart

Now you should be able to run the grapher.cgi program using this URL, if you insert the name of your web server where it says yoursystem:

    http://yoursystem/~cricket/cricket/grapher.cgi

If you get an "internal server" error message in the browser, the first place to look is in your Apache error_log file. (The exact location of this file depends on your system.)

When things are running normally, you will get a menu back from grapher.cgi. You should be able to click on the link, http-performance. This will return a list of the targets you set up to be monitored. Click on one of them and you should get the graph page. Alas, the first time you view a page there will be no data to view! Patience. Come back in a few hours and more interesting graphs will start to show up.

One last tip

If it's a possibility for you, install Cricket on several different networks to get a more accurate picture of how the Internet affects page-load times. I run a copy on my personal ISP account across town.


Brian Wilson is the system administrator for O'Reilly Network. He also calls himself the 'online services coordinator' (whatever that is) for the International Human Powered Vehicle Association.

 

Copyright © 2009 O'Reilly Media, Inc.