5. Denied Requests (HTTP and ICP)
You should normally expect to see a small number of denied requests as Squid operates. However, a high rate or percentage of denied requests indicates either 1) a mistake in your access control rules, 2) a misconfiguration on the cache client, or 3) someone attempting to attack your server.
If you use very specific address-based access controls, you'll need to carefully track IP address changes on your cache clients. For example, you may have a list of neighbor cache IP addresses. If one of those neighbors gets a new IP address, and doesn't tell you, all of its requests will be refused.
Unfortunately, there is no easy way to get a running total of denied requests from either SNMP or the cache manager. If you want to track this metric, you'll have to write a little bit of code to extract it from either the cache manager client_list page or from Squid's access.log file.
The client_list page has counters for each client's ICP and HTTP request history. It looks like this:
Address: xxx.xxx.xxx.xxx
Name: xxx.xxx.xxx.xxx
Currently established connections: 0
ICP Requests 776
        UDP_HIT                9   1%
        UDP_MISS             615  79%
        UDP_MISS_NOFETCH     152  20%
HTTP Requests 448
        TCP_HIT                1   0%
        TCP_MISS             201  45%
        TCP_REFRESH_HIT        2   0%
        TCP_IMS_HIT            1   0%
        TCP_DENIED           243  54%
With a little bit of Perl, you can develop a script that prints the IP addresses of clients having more than a certain number of TCP_DENIED or UDP_DENIED requests. The primary problem with using this information is that Squid never resets the counters, so the values are not sensitive to short-term variations. If Squid has been running for days or weeks, it may take a while until the denied counters exceed your threshold. To get more immediate feedback, you may want to search your access.log file for denied requests and count them instead.
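As a rough sketch of the access.log approach (in awk rather than Perl), assuming Squid's default native log format, where the third field is the client address and the fourth field is the result code, you could count denied requests per client like this; the threshold of 10 is just an illustrative value:

```shell
# Count TCP_DENIED entries per client address in access.log and
# print any client with more than 10 of them (example threshold).
# Assumes the native Squid log format: field 3 = client address,
# field 4 = result code (e.g. TCP_DENIED/403).
awk '$4 ~ /^TCP_DENIED/ { denied[$3]++ }
     END { for (ip in denied) if (denied[ip] > 10) print ip, denied[ip] }' access.log
```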
6. HTTP Service Time
The HTTP service time represents how long it usually takes to complete a single HTTP transaction. In other words, it is the amount of time elapsed between reading the client's request and writing the last chunk of the response. Response times generally have heavy-tailed distributions, so we use the median as a good indicator of the average.
In most situations, the median service time should be between 100 and 500 milliseconds. The value that you actually see may depend on the speed of your Internet connection and other factors. Of course, the value varies throughout the day, as well. You'll need to collect this metric for a while to understand what is normal for your installation. A service time that seems too high may indicate that 1) your upstream ISP is congested, or 2) your own Squid cache is overloaded or suffering from resource exhaustion (memory, file descriptors, CPU, etc.). If you suspect the latter, simply look at the other metrics described here for confirmation.
To get the five-minute median service time for all HTTP requests, use this SNMP OID:
By browsing the MIB, you can find separate measurements for cache hits, cache misses, and 304 (Not Modified) replies. To get the median HTTP service time from the cache manager, do this:
# squidclient mgr:5min | grep client_http.all_median_svc_time
client_http.all_median_svc_time = 0.127833 seconds
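For automated monitoring, you might extract just the numeric value and compare it to a threshold of your choosing. A minimal sketch, assuming the cache manager output format shown above; the 0.5-second limit is an arbitrary example:

```shell
# Pull the median HTTP service time out of the mgr:5min output and
# warn if it exceeds 0.5 seconds (an arbitrary example threshold).
squidclient mgr:5min | awk '
    /client_http\.all_median_svc_time/ {
        if ($3 > 0.5)
            print "WARNING: median HTTP service time is " $3 " seconds"
    }'
```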
You can also use the high_response_time_warning directive in squid.conf to warn you if the response time exceeds a pre-defined threshold. For example:
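A minimal squid.conf sketch; the directive's threshold is given in milliseconds, and the 2000 below is just an illustrative value:

```
# squid.conf: log a warning if the median HTTP response time
# exceeds 2000 milliseconds (example value)
high_response_time_warning 2000
```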
7. DNS Service Time
The DNS service time is a similar metric, although it measures only the amount of time necessary to resolve DNS cache misses. The HTTP service time measurements actually include the DNS resolution time. However, since Squid's DNS cache usually has a high hit ratio, most HTTP requests do not require a time-consuming DNS resolution.
A high DNS service time usually indicates a problem with Squid's primary DNS server. Thus, if you see a large median DNS response time, you should look for problems on the DNS server, rather than Squid. If you cannot fix the problem, you may want to select a different primary DNS resolver for Squid, or perhaps run a dedicated resolver on the same host as Squid.
To get the five-minute median DNS service time from SNMP, request this OID:
And from the cache manager:
# squidclient mgr:5min | grep dns.median_svc_time
dns.median_svc_time = 0.058152 seconds
8. Open File Descriptors
File descriptors are one of the finite resources used by Squid. If you don't know how critical file descriptor limits are to Squid's performance, read the first section of Six Things First-Time Squid Administrators Should Know and/or Chapter 3 of Squid: The Definitive Guide.
When you monitor Squid's file descriptor usage, you'll probably find that it is intricately linked to the HTTP connection rate and HTTP service time. An increase in service time or connection rate also results in an increase in file descriptor usage. Nonetheless, it is a good idea to keep track of this metric, as well. For example, if you graph file descriptor usage over time and see a plateau, your file descriptor limit is probably not high enough.
Squid's SNMP MIB doesn't have an OID for the number of currently open file descriptors. However, it can report the number of unused (closed) descriptors. You can either subtract that value from the known limit, or simply monitor the unused number. The OID is:
To get the number of used (open) file descriptors from the cache manager, search for this line in the "info" page:
# squidclient mgr:info | grep 'Number of file desc currently in use'
        Number of file desc currently in use:   88
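To turn this into an alert, you could compare the in-use count against the limit reported on the same page. A sketch, assuming your Squid's mgr:info output also contains a "Maximum number of file descriptors" line; the 80 percent threshold is an arbitrary example:

```shell
# Warn when open file descriptors exceed 80% of the limit
# (an arbitrary example threshold). Parses the mgr:info page,
# splitting each line on the colon.
squidclient mgr:info | awk -F': *' '
    /Maximum number of file descriptors/   { limit = $2 }
    /Number of file desc currently in use/ { used  = $2 }
    END {
        if (limit > 0 && used / limit > 0.8)
            print "WARNING: " used " of " limit " file descriptors in use"
    }'
```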
9. CPU Usage
Squid's CPU usage depends on a wide variety of factors including your hardware, features that you have enabled, your cache size, HTTP and ICP query rates, and others. Furthermore, high CPU usage is not necessarily a bad thing. All other things being equal, it is better to have high CPU usage and a high request rate than low CPU usage and a low request rate. In other words, after removing a disk I/O bottleneck, you may notice that Squid's CPU usage goes up, rather than down. This is good, because it means Squid is handling more requests in the same amount of time.
There are two things to watch for in the CPU usage data. First, any periods of 100 percent CPU usage indicate some kind of problem, perhaps a software bug. Henrik Nordstrom recently uncovered an incompatibility on Linux 2.2 kernels when the half_closed_clients directive is enabled. This Linux kernel bug can cause periods of 100 percent CPU utilization. As a workaround, you can disable the half_closed_clients directive.
The second reason to watch the CPU usage is simply to make sure that your CPU does not become a bottleneck. This might happen if you utilize CPU-intensive features such as Cache-Digests, CARP, or large regular expression-based access control lists. If you see Squid approaching 75 percent CPU utilization, you might want to consider a hardware upgrade.
Squid's SNMP MIB provides a CPU usage value with this OID:
Unfortunately, it is simply the ratio of CPU time to actual time since the process started. This means that it won't show short-term changes in CPU usage. To get more accurate measurements, you should use the cache manager:
# squidclient mgr:5min | grep cpu_usage
cpu_usage = 1.711396%