System Panics, Part 1: Preparing for the Worst

The examples given here are for FreeBSD 4-stable. If you're running 3-stable, paths will be slightly different. If you're running -current, you should have done all of this long ago.



Your first step to make this work is to build a debugging kernel. I'm assuming that you know how to build a custom kernel. If you don't, please see the FreeBSD Handbook for details. All you need to do is add these lines to your kernel configuration.

options		DDB
makeoptions	DEBUG=-g

The DDB option installs the DDB kernel debugger. (This isn't strictly necessary, but it can be helpful and doesn't take that much room.) The makeoptions line tells the system to build a debugging kernel, that is, one compiled with symbols.

When you're configuring your system, you need to decide how you want the system to behave after a panic. Do you want the computer to reboot, or do you want it to stay at the panic screen until you manually trigger a reboot? If the system is at a remote location, you almost certainly want the computer to reboot automatically. If you're at the console, debugging kernel changes, or if you've discovered a filesystem bug, you almost certainly want the system to wait for you to tell it to reboot.

If you want the computer to reboot automatically, include the kernel option DDB_UNATTENDED. Otherwise, the system will wait for you to tell it to reboot. (Here's a little-known BSD trick for you: You can specify more than one option on a line.)

options DDB, DDB_UNATTENDED

Once you have the kernel set up the way you want, do the usual dance to configure and install it. When this finishes, you'll find a file in the kernel compile directory called kernel.debug. This is your kernel with symbols. Save it somewhere. One of the most frequent reasons this whole process fails is that people lose the debugging kernel and then try to debug a crashed kernel with a different kernel.debug. This won't work; the symbols have to come from the same build as the kernel that panicked. I generally copy kernel.debug to /var/crash/kernel.debug.date, so I can tell when a particular debug kernel was built. This lets me date-match the current kernel to a debugging kernel, and it also tells me when a kernel.debug is old enough that I can delete it.
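The date-stamped copy described above can be sketched as a small shell function. This is only an illustration: the function name save_debug_kernel is made up for this example, and the compile directory shown in the usage line (with the kernel name MYKERNEL) is an assumption you should replace with your own.

```sh
#!/bin/sh
# Copy a debugging kernel into the crash directory with a date suffix.
# save_debug_kernel is a hypothetical helper, not a FreeBSD tool.
# Usage: save_debug_kernel /path/to/kernel.debug [destination-directory]
save_debug_kernel() {
    src=$1
    dest_dir=${2:-/var/crash}      # default matches the article's location
    stamp=$(date +%Y%m%d)          # e.g. 20020321
    # cp -p keeps the original modification time on the copy
    cp -p "$src" "$dest_dir/kernel.debug.$stamp" &&
        echo "$dest_dir/kernel.debug.$stamp"
}

# Example (MYKERNEL is a placeholder for your kernel config name):
#   save_debug_kernel /usr/src/sys/compile/MYKERNEL/kernel.debug
```

The function prints the path it wrote, which makes it easy to note which kernel.debug belongs to the kernel you just installed.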

Now set the proper options in /etc/rc.conf. First, tell the system where to write the core dump. This is called the dumpdev. FreeBSD uses the swap partition as the dump device; that's why it has to be slightly larger than your physical memory. (You can use a UFS partition, but after the crash, it won't be a usable UFS partition anymore!) You can get the device name from /etc/fstab. Look for a line with a FSType entry of "swap"; the first entry in that line is the physical device name. On my laptop, my swap field in /etc/fstab looks like this:

/dev/ad0s4b       none       swap   sw         0       0

My swap partition is /dev/ad0s4b, so I specify this as the dump device in /etc/rc.conf.

dumpdev="/dev/ad0s4b"
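If you'd rather not read /etc/fstab by eye, a one-line awk filter can pull out the swap device for you. This is a minimal sketch; the function name swap_devices is invented for the example, and it assumes the usual whitespace-separated fstab layout with the filesystem type in the third column.

```sh
#!/bin/sh
# Print the device name of every swap entry in an fstab-style file.
# swap_devices is a hypothetical helper name.
# Usage: swap_devices [file]   (defaults to /etc/fstab)
swap_devices() {
    # Skip comment lines, then match rows whose third field is "swap"
    awk '$1 !~ /^#/ && $3 == "swap" { print $1 }' "${1:-/etc/fstab}"
}
```

On a machine with more than one swap partition this prints one device per line, and you still have to choose which one is large enough to hold the dump.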

The next step is to tell your system where to save the dump after the reboot. The default is /var/crash, but you can change this with rc.conf's dumpdir setting.

As you become more experienced in saving panics, you might find that you need to adjust the core-saving behavior. Read savecore(8), and set any appropriate options in savecore_flags in /etc/rc.conf. One popular flag is -z, which compresses the core file and can save some disk space. savecore(8) is now smart enough to automatically eliminate unused memory from the dump, which can save a lot of room.
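Put together, the settings discussed so far might look like the following fragment. Treat it as a sketch rather than a recommended configuration: the device name is the one from my laptop, the dumpdir line simply restates the default, and it assumes savecore_flags is read from /etc/rc.conf alongside the other two variables.

```sh
# /etc/rc.conf fragment (illustrative values; adjust for your system)
dumpdev="/dev/ad0s4b"     # swap partition that receives the dump
dumpdir="/var/crash"      # where savecore(8) copies the dump after reboot
savecore_flags="-z"       # compress the saved core file
```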

If you're in front of your computer the next time it crashes, you'll see the panic message. If the system is set to reboot automatically, numbers will start to flow by, counting the megabytes of memory being dumped to disk. Finally, the computer will reboot. During boot, fsck runs, and you can watch savecore copy the memory dump to disk.

If your system doesn't reboot automatically, you'll need to enter two commands after the panic, at the debugger prompt. Typing panic will sync the disks, and continue will start the reboot process.

You should now have a core dump file in /var/crash. Next time, we'll discuss what to do with this.
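A quick way to confirm that a dump really landed is to look for the newest core file in the dump directory. The sketch below assumes savecore numbers its output as vmcore.0, vmcore.1, and so on (possibly with a compression suffix when -z is in use); the function name latest_vmcore is made up for this example.

```sh
#!/bin/sh
# Print the highest-numbered vmcore file in a crash directory.
# latest_vmcore is a hypothetical helper name.
# Usage: latest_vmcore [directory]   (defaults to /var/crash)
latest_vmcore() {
    dir=${1:-/var/crash}
    # Keep only vmcore.N entries, sort numerically on N, take the last one
    ls "$dir" 2>/dev/null | grep '^vmcore\.[0-9]' | sort -t. -k2,2n | tail -n 1
}
```

If this prints nothing, no dump was saved, and the dumpdev and dumpdir settings are the first things to check.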

Michael W. Lucas

