 Published on ONLamp.com (http://www.onlamp.com/)


Big Scary Daemons

Long-Term Monitoring with SNMP

09/21/2000

We've seen how SNMP can be used to gather just about any information from a host. You can interpret this data through a wide array of programs. The most popular are cricket and mrtg. Both are included in the FreeBSD ports collection, and install cleanly on any BSD. Here, we'll discuss mrtg.

The mrtg program uses SNMP data to automatically generate reports on a web page with cleanly labeled graphs. You can give supervisors, managers, and coworkers convenient access to performance data without giving them server access. It keeps records over a whole year, so you can get a good idea of real-life trends. It's also quite useful for justifying hardware and software expenditures; you can point out exactly how much CPU time a machine is using, and how that has changed as you've added software.

You can run mrtg as a daemon, but it is traditionally a cron job run every five minutes.

Every mrtg call requires a config file. You can use the included cfgmaker tool to generate a default configuration for measuring network throughput on interfaces. The cfgmaker tool is easy to use:

cfgmaker communityname@machine > mrtg.cfg

For example, if I wanted to run mrtg on my local machine, I could run:

cfgmaker private@localhost > localhost.cfg

The cfgmaker tool will make SNMP queries of the device and generate a configuration file. It includes a lot of information, including some unnecessary HTML. If you look through the created file, you'll see that cfgmaker has thrown up a configuration for every single interface on the machine. The loopback interface, and any down interfaces, are commented out. The remaining, uncommented parts will look like this:

Target[localhost.3]: 3:private@localhost
MaxBytes[localhost.3]: 1250000
Title[localhost.3]: moneysink.exceptionet.com (No hostname defined for IP address): ep0
PageTop[localhost.3]: <H1>Traffic Analysis for ep0</H1>
 <TABLE>
   <TR><TD>System:</TD><TD>turtledawn.blackhelicopters.org in Right here, right now.</TD></TR>
   <TR><TD>Maintainer:</TD><TD>Me <me@somewhere.org></TD></TR>
   <TR><TD>Interface:</TD><TD>ep0 (3)</TD></TR>
   <TR><TD>IP:</TD><TD>No hostname defined for IP address (192.168.1.100)</TD></TR>
   <TR><TD>Max Speed:</TD>
       <TD>1250.0 kBytes/s (ethernetCsmacd)</TD></TR>
 </TABLE>
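
Since cfgmaker comments out the loopback and any down interfaces, it can help to see at a glance which targets are still live. Here is a minimal sketch, assuming each target starts a line as "Target[label]:" the way the sample above does; the function name and the file name are my own, not part of mrtg.

```shell
# List the labels of the Target lines that cfgmaker left uncommented.
# Commented-out entries begin with "#", so the pattern skips them.
active_targets() {
    sed -n 's/^Target\[\([^]]*\)]:.*/\1/p' "$1"
}

# Example (hypothetical file name):
#   active_targets localhost.cfg
```

Each label it prints corresponds to one set of graphs mrtg will build.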

Before you can use this configuration, you need to add a WorkDir directive to the config file. WorkDir tells mrtg where to store its logs, graphics, and HTML. I generally put the WorkDir somewhere under my web server root, such as:

WorkDir: /usr/local/share/apache/htdocs/mrtg

You'll probably want to password-protect this directory if the web server is on the public Internet or otherwise exposed to the world at large.

Customizing mrtg

The "Target" keyword tells mrtg which machine to query, and which interface on that machine this configuration is for. The string inside the brackets ([]) is an arbitrary label. All files generated by mrtg will be named with this label as a prefix. The actual target appears after the colon. If you change the community name or IP address of your system, you can edit it directly here.
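
To make the pieces of an interface target concrete, here is a small sketch that pulls one apart with plain shell parameter expansion. The function name is hypothetical, and it assumes the simple interface:community@host form used in this article.

```shell
# Split an interface target such as "3:private@localhost" into its parts.
describe_target() (
    target=$1
    interface=${target%%:*}      # text before the first ":"
    rest=${target#*:}            # text after the first ":"
    community=${rest%%@*}        # text before the "@"
    host=${rest#*@}              # text after the "@"
    echo "interface=$interface community=$community host=$host"
)

describe_target "3:private@localhost"
# prints: interface=3 community=private host=localhost
```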

MaxBytes is the maximum throughput allowed through the interface. In this case, we have a 10baseT card. The mrtg program has enough brains to figure out the values for most common network types. You should never have to change this value if you're measuring throughput.
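
If you ever do need to work out the figure by hand, the arithmetic is just the link speed in bits per second divided by eight. A quick sketch, with a function name of my own choosing:

```shell
# Convert a link speed in megabits per second to the bytes-per-second
# figure that MaxBytes expects (1 Mbit = 1,000,000 bits, 8 bits per byte).
maxbytes_from_mbit() {
    echo $(( $1 * 1000000 / 8 ))
}

maxbytes_from_mbit 10     # 10 Mbit/s Ethernet: prints 1250000
maxbytes_from_mbit 100    # 100 Mbit/s Ethernet: prints 12500000
```

The first result matches the 1250000 that cfgmaker wrote for the 10baseT card.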

Title and PageTop are arbitrary text. You can put almost any HTML in these spaces, and it will be displayed.

Once I finish editing the mrtg config to my taste, it generally looks like this:

WorkDir: /usr/local/share/apache/htdocs/mrtg
Target[localhost.3]: 3:private@localhost
MaxBytes[localhost.3]: 1250000
Title[localhost.3]: Ethernet Interface
PageTop[localhost.3]: <H1>Traffic Analysis for Ethernet Interface</H1>
 <P>Call the Helpdesk if you have any questions

I know perfectly well where the system is, after all, and who to talk to about it. If these pages are intended for management, I might add a couple lines of HTML after PageTop describing what the machine does, or how to interpret the data.

You can list any number of machines and/or interfaces in one configuration file. Just be sure each target has a unique label.
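
When a file grows to several targets, generating the stanzas keeps the labels consistent. The sketch below prints a stripped-down traffic stanza for each call; the function name, the second host, and its title are hypothetical, and you would append the output to a config file that already has a WorkDir line.

```shell
# Print a minimal traffic stanza for one interface.
# Arguments: label, interface number, community, host, MaxBytes, title
interface_stanza() {
    printf 'Target[%s]: %s:%s@%s\n' "$1" "$2" "$3" "$4"
    printf 'MaxBytes[%s]: %s\n' "$1" "$5"
    printf 'Title[%s]: %s\n' "$1" "$6"
    printf 'PageTop[%s]: <H1>Traffic Analysis for %s</H1>\n\n' "$1" "$6"
}

# Two targets with distinct labels; the second host is made up.
interface_stanza localhost.3 3 private localhost 1250000 "Ethernet Interface"
interface_stanza gateway.1 1 private gateway.example.com 1250000 "Gateway Uplink"
```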

By default, mrtg measures network traffic. You can use it to measure any information given via SNMP MIBs, however.

First, you have to identify the MIBs available on your system. In an earlier article we discussed snmpwalk. Running it on the local system like so:

snmpwalk localhost private .1

should spill the entire MIB tree.

If you're using the ucd-snmp described in previous articles, you're probably more interested in the ucd-snmp MIBs. You can pull those from the system by doing:

snmpwalk localhost private .1.3.6.1.4.1.2021

The string at the end is the branch of the MIB tree that is reserved for ucd-snmp values. This generates a lot of output; you'll probably want to dump the results in a file.

Once you have a full list of MIBs, pick the values you want to monitor. The ucd-snmp MIBs list includes:

enterprises.ucdavis.memory.memIndex.0 = 0
enterprises.ucdavis.memory.memErrorName.0 = swap
enterprises.ucdavis.memory.memTotalSwap.0 = 204672
enterprises.ucdavis.memory.memAvailSwap.0 = 204648
enterprises.ucdavis.memory.memTotalReal.0 = 137096
enterprises.ucdavis.memory.memAvailReal.0 = 19180
enterprises.ucdavis.memory.memTotalFree.0 = 27032
enterprises.ucdavis.memory.memMinimumSwap.0 = 16000

Long-term monitoring of a system's memory and swap is definitely useful.
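
One way to work with that much output is to save the walk once and search the file afterward. This is only a sketch: the function names and file name are mine, the snmpwalk arguments simply repeat the command shown earlier, and the first function is defined but not run because it needs a live SNMP agent.

```shell
# Save the ucd-snmp branch to a file. Defined but not called here,
# because it needs a running SNMP agent on localhost.
save_ucd_walk() {
    snmpwalk localhost private .1.3.6.1.4.1.2021 > "$1"
}

# Show only the memory entries from a saved walk.
memory_values() {
    grep 'ucdavis\.memory\.' "$1"
}

# Example (hypothetical file name):
#   save_ucd_walk ucd-walk.txt
#   memory_values ucd-walk.txt
```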

You'll want to confirm that the MIBs mean what you think they mean, and convert them to numerical form. You can do both with the snmptranslate command.

Using snmpwalk only gives you the last section of the MIB. You have to know that the "enterprises" tree is always prefaced with .1.3.6.1.4. (This is common knowledge in the SNMP world.) You give this full MIB, and the -Td switch, to the snmptranslate command:

snmptranslate -Td .1.3.6.1.4.enterprises.ucdavis.memory.memAvailSwap.0
.1.3.6.1.4.1.2021.4.4.0
memAvailSwap OBJECT-TYPE
-- FROM UCD-SNMP-MIB
SYNTAX INTEGER
MAX-ACCESS read-only
STATUS current
DESCRIPTION "Available Swap Space on the host."
::= { iso(1) org(3) dod(6) internet(1) private(4) enterprises(1) ucdavis(2021) memory(4) memAvailSwap(4) 0 }

This gives you a heap of useful information about the MIB, including its numerical equivalent and its definition. Take note of the numerical MIB; we'll need it soon.
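
The numerical form is nothing more than the numbers from the last line of that output, joined with dots. The sketch below shows the relationship with a bit of awk. The function name is hypothetical, and it assumes the "::= { name(number) ... }" layout shown above, which may differ in other versions of snmptranslate.

```shell
# Read a "::= { name(number) ... }" line on stdin and print the dotted OID.
oid_from_definition() {
    awk '{
        oid = ""
        for (i = 1; i <= NF; i++) {
            f = $i
            # Keep only the number in parentheses: "ucdavis(2021)" -> "2021"
            if (f ~ /\([0-9]+\)$/) { sub(/^.*\(/, "", f); sub(/\)$/, "", f) }
            # Bare numbers such as the trailing "0" are used as they are.
            if (f ~ /^[0-9]+$/) oid = oid "." f
        }
        print oid
    }'
}

echo '::= { iso(1) org(3) dod(6) internet(1) private(4) enterprises(1) ucdavis(2021) memory(4) memAvailSwap(4) 0 }' | oid_from_definition
# prints: .1.3.6.1.4.1.2021.4.4.0
```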

The mrtg program charts MIBs in pairs, so you'll want to pick values to monitor accordingly. Sensible choices are things like "available swap and total swap," or "system memory and user memory." (Measuring available swap versus the percentage of disk available would give you difficult-to-understand charts.) We'll use user CPU time versus system CPU time as an example.

Digging through the snmpwalk output and running snmptranslate on the user and system CPU times (enterprises.ucdavis.systemStats.ssCpuUser.0 and enterprises.ucdavis.systemStats.ssCpuSystem.0, respectively), we find that they map to .1.3.6.1.4.1.2021.11.9.0 and .1.3.6.1.4.1.2021.11.10.0.

To make mrtg monitor these MIBs instead, you add them to the "Target" entry like so:

Target[localhost.cpu]:.1.3.6.1.4.1.2021.11.9.0&.1.3.6.1.4.1.2021.11.10.0:private@localhost
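
If you build these lines from a script, remember that the ampersand between the two MIBs is special to the shell and has to stay inside quotes. A minimal sketch, with a function name of my own:

```shell
# Build a Target line that charts two numerical MIBs against each other.
# Arguments: label, first OID, second OID, community, host
oid_target() {
    # The "&" sits inside the quoted format string, so the shell
    # does not treat it as "run in the background".
    printf 'Target[%s]: %s&%s:%s@%s\n' "$1" "$2" "$3" "$4" "$5"
}

oid_target localhost.cpu .1.3.6.1.4.1.2021.11.9.0 .1.3.6.1.4.1.2021.11.10.0 private localhost
```

The output is the CPU target from this article, with a space after the colon to match the earlier stanzas.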

Be sure to pick a separate label for the target, and for all configuration statements for that target. If you don't, mrtg will either complain or overwrite the log files from other targets.

Test your configuration file by running mrtg on the command line a few times:

mrtg localhost.cfg

The first two times, mrtg will warn that it can't find log files; after that, it should run silently. If you get an error that mrtg cannot reach a target, the Target entry is misconfigured: the community name, host name, or numerical MIB is wrong.
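
If you'd rather not retype the command, a tiny loop does the job. This sketch uses a hypothetical helper that runs any command a set number of times and reports each exit status; it assumes nothing about mrtg beyond the command line shown above.

```shell
# Run a command a fixed number of times, reporting each exit status.
# Usage: run_times COUNT COMMAND [ARGS...]
run_times() (
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        "$@"
        echo "run $i of $count: exit status $?" >&2
        i=$((i + 1))
    done
)

# Example:
#   run_times 3 mrtg localhost.cfg
```

Three runs should be enough to get past the log file warnings and see whether the third is quiet.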

When mrtg runs silently, add it to cron to run every five minutes. If you followed the example above, when you look at http://localhost/mrtg/localhost.cpu.html, you'll see a pretty graph of your CPU usage over the last year.
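
For the cron entry itself, here is a sketch that prints a suitable crontab line. Both paths are assumptions about where the port installed mrtg and where you keep the config, the function name is mine, and the */5 step syntax assumes a cron that understands it, as the stock BSD cron does.

```shell
# Print a crontab entry that runs mrtg every five minutes.
# Arguments: path to the mrtg program, path to the config file.
mrtg_cron_line() {
    printf '*/5 * * * * %s %s\n' "$1" "$2"
}

# Both paths are assumptions; adjust them to your installation.
mrtg_cron_line /usr/local/bin/mrtg /usr/local/etc/mrtg/localhost.cfg
```

Paste the printed line into the file that crontab -e opens for the account that should own the mrtg logs.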

You can use mrtg to monitor any system that uses SNMP. You can even install SNMP on your NT systems; this makes them easy to monitor, at a fraction of the cost of commercial systems. The only difference between mrtg and a commercial system is that you have to know what you're doing to use mrtg.

The first two articles in this series sparked dozens of e-mails from people wanting to use SNMP to monitor other platforms, such as Novell and NT. I highly recommend SNMP for the Public Community for SNMP on other x86 platforms. Be warned: To call NT's implementation of SNMP "skeletal" would leave you without an adequate description of its error messages.

SNMP is a rich and complex topic. We're finished with it for now, but in the next article we'll look at further customizing mrtg for your installation.

Michael W. Lucas



Copyright © 2009 O'Reilly Media, Inc.