10. Free Disk Space
Disk space is another finite resource consumed by Squid. When you run Squid on a dedicated system, controlling the disk usage is relatively easy. If you have other applications using the same partitions as Squid, you need to be a little more careful. We need to worry about disk space for two reasons: the disk cache and Squid's log files.
If Squid gets a "no space left on device" error while writing to the disk cache, it resets the cache size and keeps going. In other words, this is a non-fatal error. The new cache size is set to what Squid believes is the current size. This also causes Squid to start removing existing objects to make room for new ones. Running out of space when writing a logfile, however, is a fatal error. The Squid process exits, rather than continue operating without the ability to log important information.
Free disk space information is only available through the cache
manager. Furthermore, Squid only tells you about the cache
directories. It won't tell you about the status of the partition
where you store your log files (unless that partition is also a
cache directory). Thus, you may want to develop your own simple
script to monitor free space on your logging partition.
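A monitoring script of that kind can be quite small. The following is only a minimal sketch in Python, not anything that ships with Squid: the function name `free_space_report`, the 10 percent threshold, and the `LOG_DIR` placeholder are all assumptions made for illustration.

```python
import shutil

# Placeholder: set this to the directory that holds your Squid log files.
LOG_DIR = "."


def free_space_report(path, min_free_percent=10.0):
    """Return (free_percent, ok) for the filesystem that contains path.

    ok is False when free space has dropped below min_free_percent.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report a zero size; treat that as "no space".
        return 0.0, False
    free_percent = 100.0 * usage.free / usage.total
    return free_percent, free_percent >= min_free_percent


if __name__ == "__main__":
    percent, ok = free_space_report(LOG_DIR)
    status = "OK" if ok else "WARNING: low disk space"
    print(f"{LOG_DIR}: {percent:.1f}% free ({status})")
```

Run from cron, a script like this could send mail or page you when the check fails, well before Squid hits a fatal logging error.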
The storedir cache manager page has a section like this for
each cache directory:
    Store Directory #0 (diskd): /cache0/Cache
    FS Block Size 1024 Bytes
    First level subdirectories: 16
    Second level subdirectories: 64
    Maximum Size: 15360000 KB
    Current Size: 13823540 KB
    Percent Used: 90.00%
    Filemap bits in use: 774113 of 2097152 (37%)
    Filesystem Space in use: 14019955/17370434 KB (81%)
    Filesystem Inodes in use: 774981/4340990 (18%)
    Flags:
    Pending operations: 0
    Removal policy: lru
    LRU reference age: 22.46 days
We are particularly interested in two lines: the "Percent Used" and "Filesystem Space in use" lines.
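If you want to pull those two lines out programmatically, a pair of regular expressions is enough. This is a rough sketch that assumes the storedir text looks like the sample output in this section; the function name `parse_storedir` and the regular expressions are my own illustration, and other Squid versions may format the page differently.

```python
import re

# Abbreviated sample, copied from the storedir output shown in this section.
SAMPLE = """\
Store Directory #0 (diskd): /cache0/Cache
Maximum Size: 15360000 KB
Current Size: 13823540 KB
Percent Used: 90.00%
Filesystem Space in use: 14019955/17370434 KB (81%)
Filesystem Inodes in use: 774981/4340990 (18%)
"""

PERCENT_USED_RE = re.compile(r"^[ \t]*Percent Used:\s*([\d.]+)%", re.MULTILINE)
FS_SPACE_RE = re.compile(
    r"^[ \t]*Filesystem Space in use:\s*(\d+)/(\d+) KB \((\d+)%\)",
    re.MULTILINE,
)


def parse_storedir(text):
    """Return one (percent_used, (used_kb, total_kb, fs_percent)) per cache dir."""
    percent_used = [float(p) for p in PERCENT_USED_RE.findall(text)]
    fs_space = [(int(u), int(t), int(p)) for u, t, p in FS_SPACE_RE.findall(text)]
    # Both lines appear once per cache directory, so pair them up in order.
    return list(zip(percent_used, fs_space))


if __name__ == "__main__":
    for percent, (used_kb, total_kb, fs_percent) in parse_storedir(SAMPLE):
        print(f"cache_dir {percent:.2f}% used; filesystem {fs_percent}% full")
```

In practice you would feed the function the text returned by a storedir cache manager request instead of the embedded sample.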
The "Percent Used" line shows how much space Squid has used, compared
to the size you specified on the
cache_dir line. This will
normally be equal to, or less than, the value for
The "Filesystem Space in use" line shows how much space is actually
used on this partition. Squid gets the information from a
system call. It should match what you would see by running a disk-usage
command from your shell. This is the important value to watch. If the
percentage hits 100 percent, Squid will receive "no space left on device" errors.
11. Hit Ratio
Cache hit ratio is another metric that can vary a lot from time to
time. Its high variability means that it is not always a good
indicator of a problem. A sudden drop in hit ratio might mean that one of
the cache clients is a crawler or something that adds
to its requests. Perhaps the best reason to monitor it is
simply to understand how many requests are served directly
from the cache (in case the boss asks you to justify Squid's
You can get the hit ratio, calculated over the last five minutes, by requesting this SNMP OID:
The same information is available on the cache manager "info" page:
    # squidclient mgr:info | grep 'Request Hit Ratios'
    Request Hit Ratios: 5min: 29.8%, 60min: 44.1%
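If you would rather have the numbers than the text, the line is easy to parse. Here is a small Python sketch; it assumes the "Request Hit Ratios" line has exactly the format shown above, and the function name `parse_hit_ratios` is hypothetical.

```python
import re

# Matches the "Request Hit Ratios" line as it appears in the output above.
HIT_RATIO_RE = re.compile(
    r"Request Hit Ratios:\s*5min:\s*(-?[\d.]+)%,\s*60min:\s*(-?[\d.]+)%"
)


def parse_hit_ratios(info_text):
    """Return {'5min': float, '60min': float}, or None if the line is absent."""
    match = HIT_RATIO_RE.search(info_text)
    if match is None:
        return None
    return {"5min": float(match.group(1)), "60min": float(match.group(2))}


if __name__ == "__main__":
    line = "Request Hit Ratios: 5min: 29.8%, 60min: 44.1%"
    print(parse_hit_ratios(line))
```

The input can be the whole "info" page; the regular expression simply searches for the one line it cares about.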
My Squid-rrd Monitoring Utility
For better or worse, the cache manager currently provides more useful
information than Squid's SNMP implementation. However, the cache
manager output was designed to be human-readable. It would be
awkward for you to write a bunch of software to
grep for all of the
relevant information and extract the values. Especially since I
have already done it for you.
I have a Perl script, recently enhanced by Dan Kogai, to issue cache manager requests and store the values into an RRD database. If you don't know about RRDtool yet, you should. It is Tobi Oetiker's successor to MRTG. It's very cool.
My Perl script runs periodically from
cron. It makes cache manager
requests and uses regular expressions to parse the output for certain
metrics. The extracted values are stored in various RRD files.
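To give a feel for the approach, here is a rough Python sketch of the same idea. It is not my Perl script, and it only builds the `rrdtool update` command line instead of running it. The metric names, the data-source names, and the `hits.rrd` filename are all assumptions made for illustration; the only cache manager line it relies on is the "Request Hit Ratios" line shown earlier.

```python
import re

# Map each (hypothetical) RRD data-source name to a regex with one capture group.
METRIC_PATTERNS = {
    "hit_5min": re.compile(r"Request Hit Ratios:\s*5min:\s*(-?[\d.]+)%"),
    "hit_60min": re.compile(r"Request Hit Ratios:.*?60min:\s*(-?[\d.]+)%"),
}


def extract_metrics(info_text):
    """Return {name: float or None} for every pattern in METRIC_PATTERNS."""
    metrics = {}
    for name, pattern in METRIC_PATTERNS.items():
        match = pattern.search(info_text)
        metrics[name] = float(match.group(1)) if match else None
    return metrics


def build_rrd_update(rrd_path, metrics):
    """Build an 'rrdtool update' argument list; missing values become 'U'."""
    names = list(metrics)
    values = ["U" if metrics[n] is None else str(metrics[n]) for n in names]
    return [
        "rrdtool", "update", rrd_path,
        "--template", ":".join(names),
        "N:" + ":".join(values),  # N means "now" to rrdtool
    ]


if __name__ == "__main__":
    sample = "Request Hit Ratios: 5min: 29.8%, 60min: 44.1%"
    print(" ".join(build_rrd_update("hits.rrd", extract_metrics(sample))))
```

A real version would fetch the cache manager page on each cron run, pass the resulting list to something like `subprocess.run`, and cover many more metrics across several RRD files.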
I also provide a template CGI script that displays the RRD data.
You can find my code and documentation at www.squid-cache.org/~wessels/squid-rrd. I've included some of the graphs below. You can view more graphs (and look at the full-size versions of the ones below) by visiting my stats page for the IRCache proxies at www.ircache.net/Cache/Statistics/Vitals/rrd/cgi.
These two graphs show memory usage and page-fault rate for a one-month period. You can clearly see when Squid was restarted because the memory usage goes down. It slowly climbs back up as Squid runs. You can also see that the page-fault rate increases as the memory consumption increases.
These five graphs show various metrics for a 24-hour period. You can see that an increase in load causes corresponding increases in CPU usage, file descriptor usage, and, to some extent, response times. The file descriptor graph shows a brief spike during the late evening hours.
Duane Wessels discovered Unix and the Internet as an undergraduate student studying physics at Washington State University.
O'Reilly & Associates published Squid: The Definitive Guide in January 2004.
Chapter 8, "Advanced Disk Cache Topics," is available free online.