ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


Big Scary Daemons

System Panics, Part 1: Preparing for the Worst

03/21/2002

This is Part 1 in a two-part series on system panics. In this column, Michael Lucas talks about how to prepare a FreeBSD system in case of a panic. In the next column, he'll talk about what to do when the worst happens.

I've built my reputation on reliability, a process made infinitely easier by FreeBSD. That's why I felt so shocked when a client called and said, "My server just went down for the second time in a day."

This client runs an ISP and relies heavily on FreeBSD for his mail and Web services. His 2.2.8-stable boxes have uptimes approaching a year--they'd be longer, but we had to rearrange the power cables in the server room one night. This system ran 4-stable and had been in production for several months.

Instead of a login prompt, the console displayed a message much like this one:


Fatal trap 12: page fault while in kernel mode
fault virtual address   =       0x80c0153a
fault code              = supervisor write, page not present
instruction pointer = 0x8:0xc015aa84
stack pointer =         0x10:0xc7377e7c
frame pointer =         0x10:0xc7377e80
code segment =          base 0x0, limit 0xfffff, type 0x1b
             =          DPL 0, pres 1, def32 1, gran 1
processor eflags =      interrupt enabled, resume, IOPL=0
current process =       5 (syncer)
interrupt mask =        bio
trap number =           12
panic: page fault

If you're an inexperienced sysadmin, this can turn your blood cold. Unix, and FreeBSD in particular, usually gives friendly messages that describe what's wrong and give you a place to start looking or, in the worst case, a term to type into your favorite search engine. The only word that looks even vaguely familiar in this message is "syncer". Most people don't know what the syncer is. Most of those who recognize it know better than to try to fix it. The "mysterious panic" is among the worst situations you can have in FreeBSD.

The first time this happened to me, several years ago, I scrambled for a piece of paper and a pen. Eventually I found an old envelope and a broken stub of pencil, and crawled between the server rack and the rough brick wall. In one hand, I balanced the 6-inch, black-and-white monitor I dragged back there with me. With the other hand, I held the old envelope up against the wall. Apparently I had a third hand to copy the panic message to the envelope, because it somehow got there. Finally, scraped and cramped, I slithered back out of the rack and victoriously typed the whole mess into an email. Surely the crack FreeBSD developers would be able to look at this garbage and tell me exactly what had happened.

After this struggle, the immediate response was quite frustrating. "Can you send a backtrace?"

I've seen many, many messages to a FreeBSD mailing list reporting problems like this. They always get the same response I got. Most of these people are never heard from again, and I understand exactly how they feel. When you've been dealing with a server that crashes, or (worse) keeps crashing, the last thing you want to do is reconfigure it.

There's a simple way around this problem, however: Set up your server to handle a panic before the panic happens. Set it up when you install the server. That way, you'll automatically get a backtrace if it ever crashes. This might seem like a novel idea, and it certainly isn't emphasized in the FreeBSD documentation, but it makes sense. Be ready for disaster. If it never happens, well, you don't have anything to complain about. If you get a panic, you're ready. You can present the FreeBSD folks with a decent, full debugging dump.

The problem with the panic message on my envelope is that it only gives a tiny scrap of the story. It's like describing your stolen car as "red, with a scratch on the fender." If you don't give the car's make, model, and VIN or license plate, you cannot expect the police to make much headway. Similarly, without more information from your crashing kernel, the FreeBSD developers can't catch the criminal code.

The standard FreeBSD kernel install removes all the debugging information from the kernel before installing it. This debugging information includes "symbols," which provide a map between the machine code and the source code, and a complete list of source code line numbers, so the developer can learn exactly where a problem occurred. Such a map can be larger than the actual program, and nobody wants to run a kernel that's three times larger than it has to be! Without this information, however, the developer is stuck trying to map a kernel core to the source code by hand. It's somewhat like trying to assemble a million-piece puzzle without a box, a picture, or even knowing that you have all the pieces. This is an ugly job. It's even uglier when you consider that the developer who needs to do the work is a volunteer.

To prepare for a kernel panic, you need the system source code installed. You also need at least one swap partition that is at least 1MB larger than your physical memory, and preferably twice as large as your RAM. If you have 512MB of RAM, for example, you need a swap partition that is 513MB or larger, with 1024MB being preferable. (On a server, you should certainly have multiple swap partitions on multiple drives!) If you don't have that, you have to either add another hard drive with an adequate swap partition or reinstall. While having a /var partition with at least that much disk space free is helpful, it isn't necessary.
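The sizing rule above is simple arithmetic, and a small shell function can make it concrete. This is only a sketch: the function name swap_verdict and its three labels are invented for illustration, and both sizes are passed in by hand as whole megabytes rather than read from the system.

```shell
#!/bin/sh
# swap_verdict RAM_MB SWAP_MB   (hypothetical helper, not a FreeBSD tool)
# Applies the rule from the text: swap must be at least 1MB larger than
# RAM, and twice the RAM is preferable.
swap_verdict() {
    ram_mb=$1
    swap_mb=$2
    if [ "$swap_mb" -lt $((ram_mb + 1)) ]; then
        echo "too small"
    elif [ "$swap_mb" -lt $((ram_mb * 2)) ]; then
        echo "adequate"
    else
        echo "preferred"
    fi
}

swap_verdict 512 512     # prints "too small"
swap_verdict 512 513     # prints "adequate"
swap_verdict 512 1024    # prints "preferred"
```

On a real FreeBSD system you would probably feed it numbers taken from sysctl hw.physmem (which reports bytes) and from swapinfo, converting both to megabytes first.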

The kernel crash-capturing process works somewhat like this. If a properly configured system crashes, it will save a core dump of the system memory. You can't save it to a file, because the crashed kernel doesn't know about files; it only knows about partitions. The simplest place to write this dump is the swap partition. The dump is placed as close to the end of the swap partition as possible. Once the crashing system saves the core to swap, it reboots the computer.

During the reboot, /etc/rc enables the swap partition. It then (probably) runs fsck on the crashed disks. It has to enable swapping before running fsck, because fsck might need to use swap space. Let's hope you have enough swap space that fsck can get everything it needs without overwriting the dump lurking in your swap partition. Once the filesystems are checked and mounted, the system has a place where it can save a core dump, so it checks the swap partition for one. Upon finding a core, savecore copies it from swap to the proper file, clears the dump from swap, and lets the reboot proceed. You now have a kernel core file and can use that to get a backtrace.

The examples given here are for FreeBSD 4-stable. If you're running 3-stable, paths will be slightly different. If you're running -current, you should have done all of this long ago.

Your first step to make this work is to build a debugging kernel. I'm assuming that you know how to build a custom kernel. If you don't, please see the FreeBSD Handbook for details. All you need to do is add these lines to your kernel configuration.

options		DDB
makeoptions	DEBUG=-g

The DDB option installs the DDB kernel debugger. (This isn't strictly necessary, but it can be helpful and doesn't take that much room.) The makeoptions line passes -g to the compiler, which tells the system to build a debugging kernel.

When you're configuring your system, you need to decide how you want the system to behave after a panic. Do you want the computer to reboot, or do you want it to stay at the panic screen until you manually trigger a reboot? If the system is at a remote location, you almost certainly want the computer to reboot automatically. If you're at the console, debugging kernel changes, or if you've discovered a filesystem bug, you almost certainly want the system to wait for you to tell it to reboot.

If you want the computer to reboot automatically, include the kernel option DDB_UNATTENDED. Otherwise, the system will wait for you to tell it to reboot. (Here's a little-known BSD trick for you: You can specify more than one option on a line.)

options DDB, DDB_UNATTENDED

Once you have the kernel set up the way you want, do the usual dance to configure and install it. When this finishes, you'll find a file in the kernel compile directory called kernel.debug. This is your kernel with symbols. Save it somewhere. When this process fails, one of the frequent causes is losing the debugging kernel and then trying to debug a crashed kernel with a different kernel.debug. This won't work. I generally copy kernel.debug to /var/crash/kernel.debug.date, so I can tell when a particular debug kernel was built. This lets me date-match the current kernel to a debugging kernel, and it also tells me when a kernel.debug is old enough that I can delete it.
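The date-stamped copy described above is easy to script so it never gets forgotten. The following is a minimal sketch, not a standard tool: the function name save_debug_kernel is made up, and the source path in the usage comment (including the kernel name MYKERNEL) is an assumption that depends on how and where you build your kernel.

```shell
#!/bin/sh
# save_debug_kernel SRC DESTDIR   (hypothetical helper)
# Copies SRC to DESTDIR/kernel.debug.YYYYMMDD, preserving timestamps,
# and prints the path of the copy.
save_debug_kernel() {
    src=$1
    destdir=$2
    dest="$destdir/kernel.debug.$(date +%Y%m%d)"
    mkdir -p "$destdir" && cp -p "$src" "$dest" && echo "$dest"
}

# Example call (paths are assumptions; adjust to your build):
#   save_debug_kernel /usr/src/sys/compile/MYKERNEL/kernel.debug /var/crash
```

Using a sortable YYYYMMDD suffix means a plain ls lists the debug kernels oldest first, which makes it obvious which ones are old enough to delete.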

Now set the proper options in /etc/rc.conf. First, tell the system where to write the core dump. This is called the dumpdev. FreeBSD uses the swap partition as the dump device; that's why it has to be slightly larger than your physical memory. (You can use a UFS partition, but after the crash, it won't be a usable UFS partition anymore!) You can get the device name from /etc/fstab. Look for a line with an FSType entry of "swap"; the first entry in that line is the physical device name. On my laptop, the swap line in /etc/fstab looks like this:

/dev/ad0s4b       none       swap   sw         0       0

My swap partition is /dev/ad0s4b, so I specify this as the dump device in /etc/rc.conf.

dumpdev="/dev/ad0s4b"
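If you would rather not pick the swap line out of /etc/fstab by eye, awk can do the lookup described above. This is a sketch under the assumption that the file uses the usual whitespace-separated columns with the filesystem type in the third field; the function name list_swap_devices is invented for illustration.

```shell
#!/bin/sh
# list_swap_devices [FSTAB]   (hypothetical helper)
# Prints the device name (first field) of every non-comment line whose
# FSType (third field) is "swap". Reads /etc/fstab when no file is given.
list_swap_devices() {
    awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}

# Example call:
#   list_swap_devices              # reads /etc/fstab
#   list_swap_devices /tmp/fstab   # reads a copy instead
```

With more than one swap partition this prints one device per line, and you still have to choose which one is large enough to serve as the dump device.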

The next step is to tell your system where to save the dump after the reboot. The default is /var/crash, but you can change this with rc.conf's dumpdir setting.

As you become more experienced in saving panics, you might find that you need to adjust the core-saving behavior. Read savecore(8), and set any appropriate options in savecore_flags in /etc/rc.conf. One popular flag is -z, which compresses the core file and can save some disk space. savecore(8) is now smart enough to automatically eliminate unused memory from the dump, which can save a lot of room.
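Pulled together, the settings discussed so far might look like the fragment below. Treat it as an illustration rather than a recommended configuration: the device name is the laptop example from above, dumpdir is shown only to make the default visible, and the savecore_flags line is optional.

```shell
# /etc/rc.conf (fragment); values are examples, adjust for your system
dumpdev="/dev/ad0s4b"    # swap partition to dump to, taken from /etc/fstab
dumpdir="/var/crash"     # where the core is saved after reboot (the default)
savecore_flags="-z"      # optional: compress the saved core file
```

Because rc.conf is read as shell variable assignments, there must be no spaces around the equals signs.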

If you're in front of your computer the next time it crashes, you'll see the panic message. If the system is set to reboot automatically, numbers will start to flow by, counting the megabytes of memory being dumped to disk. Finally, the computer will reboot. During the boot, fsck runs, and you can watch savecore copy the memory dump to disk.

If your system doesn't reboot automatically, you'll need to enter two commands after the panic, at the debugger prompt. Typing panic will sync the disks, and continue will start the reboot process.

You should now have a core dump file in /var/crash. Next time, we'll discuss what to do with this.

Michael W. Lucas



Copyright © 2009 O'Reilly Media, Inc.