Published on ONLamp.com (http://www.onlamp.com/)


Big Scary Daemons

System Panics, Part 2: Recovering and Debugging

04/04/2002

This is Part 2 in a two-part series on system panics. In Part 1, Michael Lucas discussed how to prepare a FreeBSD system for a panic. In this installment, he covers what to do when the worst happens.

Preparing for a crash immediately after you install a system is an excellent way to reduce stress. When your computer panics, would you rather have all the crash information at your fingertips, or would you prefer frantically reading the documentation and trying to set up the debugger? Last time, we discussed building a debugging kernel and setting up your system to save a crash dump after a panic. Let's hope you'll never need any of this. If you do suffer a crash, however, here's how to get some useful information out of it.

Let's assume that you've followed all the advice in the previous column. savecore(8) should have copied a dump of your crashed kernel to /var/crash.

If you take a look in /var/crash, you'll see the files kernel.0 and vmcore.0. (Each subsequent crash dump will get a consecutively higher number, e.g., kernel.1 and vmcore.1.) The vmcore.0 file is the actual memory dump. The kernel file is a copy of the crashed kernel. You want to be sure to use the debugging kernel instead of this one, however. If you look in your kernel compile directory (/sys/compile/MACHINENAME), you'll see a file called kernel.debug. This kernel file contains the symbols we discussed in the previous article. To make your life slightly easier, you might copy this file to /var/crash/kernel.debug.0. This will help you keep track of your debug kernels and the crashes they are associated with.
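
For example, assuming your kernel configuration is named MACHINENAME and this is the first dump saved in /var/crash:

# cp /sys/compile/MACHINENAME/kernel.debug /var/crash/kernel.debug.0

The .0 suffix simply matches the number savecore(8) gave the corresponding dump.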

This process is an excellent opportunity to use script(1). This program copies everything that appears on your screen and makes it simple to keep a record of your debugging session (or, indeed, anything else you do). After you start the script, start the gdb debugger. gdb takes three arguments: a -k to configure the debugger appropriately for kernel work; the name of a file containing the kernel with symbols; and the name of the memory dump.
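
A session might begin something like this; the transcript name is only a suggestion, and typing exit when you're finished stops the recording:

# script /var/crash/gdb-session.0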

# gdb -k kernel.debug.0 vmcore.0

Once you do that, gdb will spit out its copyright information, the panic message, and a copy of the memory dumping process. We've seen an example of a panic earlier, so I won't repeat it now. What is new is the debugger prompt you get back at the end of all this:

(kgdb)

You've now gotten further than any number of people who have system panics. Pat yourself on the head. To find out exactly where the panic happened, type where and hit enter.

(kgdb) where
#0  dumpsys () at ../../../kern/kern_shutdown.c:505
#1  0xc0143119 in db_fncall (dummy1=0, dummy2=0, dummy3=0,
    dummy4=0xe0b749a4 " \0048\200%") at ../../../ddb/db_command.c:551
#2  0xc0142f33 in db_command (last_cmdp=0xc0313724, cmd_table=0xc0313544,
    aux_cmd_tablep=0xc030df2c, aux_cmd_tablep_end=0xc030df30)
    at ../../../ddb/db_command.c:348
#3  0xc0142fff in db_command_loop () at ../../../ddb/db_command.c:474
#4  0xc0145393 in db_trap (type=12, code=0) at ../../../ddb/db_trap.c:72
#5  0xc02ad0f6 in kdb_trap (type=12, code=0, regs=0xe0b74af4)
    at ../../../i386/i386/db_interface.c:161
#6  0xc02ba004 in trap_fatal (frame=0xe0b74af4, eva=40)
    at ../../../i386/i386/trap.c:846
#7  0xc02b9d71 in trap_pfault (frame=0xe0b74af4, usermode=0, eva=40)
    at ../../../i386/i386/trap.c:765
#8  0xc02b9907 in trap (frame={tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0,
      tf_esi = 0, tf_ebp = -524858548, tf_isp = -524858592,
      tf_ebx = -525288192, tf_edx = 0, tf_ecx = 1000000000, tf_eax = 0,
      tf_trapno = 12, tf_err = 0, tf_eip = -1071645917, tf_cs = 8,
      tf_eflags = 66182, tf_esp = -1070136512, tf_ss = 0})
    at ../../../i386/i386/trap.c:433
#9  0xc01ffb23 in vcount (vp=0xe0b0bd00) at ../../../kern/vfs_subr.c:2301
#10 0xc01a5e58 in spec_close (ap=0xe0b74b94)
    at ../../../fs/specfs/spec_vnops.c:591
#11 0xc01a55f1 in spec_vnoperate (ap=0xe0b74b94)
    at ../../../fs/specfs/spec_vnops.c:121
#12 0xc0207454 in vn_close (vp=0xe0b0bd00, flags=3, cred=0xc32cce00,
    td=0xe0a8d360) at vnode_if.h:183
#13 0xc0207fab in vn_closefile (fp=0xc3369080, td=0xe0a8d360)
    at ../../../kern/vfs_vnops.c:757
#14 0xc01b1d50 in fdrop_locked (fp=0xc3369080, td=0xe0a8d360)
    at ../../../sys/file.h:230
#15 0xc01b155a in fdrop (fp=0xc3369080, td=0xe0a8d360)
    at ../../../kern/kern_descrip.c:1538
#16 0xc01b152d in closef (fp=0xc3369080, td=0xe0a8d360)
    at ../../../kern/kern_descrip.c:1524
#17 0xc01b114e in fdfree (td=0xe0a8d360) at ../../../kern/kern_descrip.c:1345
#18 0xc01b5173 in exit1 (td=0xe0a8d360, rv=256)
    at ../../../kern/kern_exit.c:199
#19 0xc01b4ec2 in sys_exit (td=0xe0a8d360, uap=0xe0b74d20)
    at ../../../kern/kern_exit.c:109
#20 0xc02ba2b7 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = 135227560, tf_esi = 0, tf_ebp = -1077941020,
      tf_isp = -524857996, tf_ebx = -1, tf_edx = 135044144,
      tf_ecx = -1077942116, tf_eax = 1, tf_trapno = 12, tf_err = 2,
      tf_eip = 134865696, tf_cs = 31, tf_eflags = 663, tf_esp = -1077941064,
      tf_ss = 47}) at ../../../i386/i386/trap.c:1049
#21 0xc02ae06d in syscall_with_err_pushed ()
#22 0x80503a5 in ?? ()
#23 0x807024a in ?? ()
#24 0xbfbfffb4 in ?? ()
#25 0x807daaf in ?? ()
#26 0x807d6eb in ?? ()
#27 0x80630c1 in ?? ()
#28 0x8062fed in ?? ()
#29 0x805ea4c in ?? ()
#30 0x8065949 in ?? ()
#31 0x806544d in ?? ()
#32 0x806dc17 in ?? ()
#33 0x80616b7 in ?? ()
#34 0x80613f0 in ?? ()
#35 0x8048135 in ?? ()
(kgdb)

Whoa! This is definitely scary looking stuff. If you copied this and the output of uname -a into an email and sent it to hackers@FreeBSD.org, various developers would take note and help you out. They'd probably write you back and tell you other things to type at the kgdb prompt, but you'd definitely get developer attention. You'd be well on your way to getting the problem solved, and helping the FreeBSD folks squash a bug.
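
When you're done in the debugger, one painless way to assemble that email is to drop the uname output and the script(1) transcript into a single file; the file names here are just the ones suggested earlier:

# uname -a > /tmp/panic-report.txt
# cat /var/crash/gdb-session.0 >> /tmp/panic-report.txt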

If you're not familiar with programming, nobody would blame you if you stopped here. You're better than that, though, and smarter. I know you are. So, without further ado, let's see what we can learn from the debug message and try to figure out some things to include in that first email. Without being intimate with the kernel, you can't solve the problem yourself, but you might be able to help narrow things down a little.

The first thing to realize is that the debugger backtrace lists the function calls the kernel made, in reverse chronological order. Frame #0, at the top, is the last thing the kernel did. When someone says "before" or "after," they're almost certainly talking about chronological order and not the order things appear in the debugger.

When does a system panic? Well, panicking is a choice that the kernel makes. If the system reaches a condition that it doesn't know how to handle, or fails one of its own internal consistency checks, it panics. When that happens, the kernel winds up in a function called trap or, if the problem was caught by a consistency check (such as those the INVARIANTS option enables), panic. You'll see variants on these, such as db_trap, but you want the plain old, unadorned trap or panic. Look through your gdb output for either of these. In the example above, there's a trap at frame #8. Frames #4 through #7 hold other kinds of trap, but no plain, straightforward trap. Those other traps are "helper" functions, called by trap to figure out exactly what happened and what to do about it.
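
Since you're (hopefully) still recording under script(1), you can also search the transcript from a second terminal instead of eyeballing the backtrace. The file name here is the one suggested earlier, and the pattern deliberately skips helpers such as trap_fatal:

# grep -nE ' in (trap|panic) \(' /var/crash/gdb-session.0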

Whatever the kernel was doing immediately before frame #8 is what triggered the panic. In frame #9, we see:

#9 0xc01ffb23 in vcount (vp=0xe0b0bd00) at ../../../kern/vfs_subr.c:2301

The hex numbers don't mean much by themselves, but the panic happened in vcount. If you try man vcount, you'll see that vcount(9) is a standard kernel function. The panic occurred while executing code compiled from the file /usr/src/sys/kern/vfs_subr.c, at line 2301. (All paths in these dumps should be under /usr/src/sys.) This gives a developer a very good idea of where to look for the problem.
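
If your installed source tree matches the running kernel, you can peek at that spot straight from a shell prompt; we'll look at the same lines from inside the debugger in a moment. The line range below is arbitrary:

# sed -n '2296,2305p' /usr/src/sys/kern/vfs_subr.c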

Let's go up and take a closer look at frame #9. Use the up command with the number of frames you want to move.

(kgdb) up 9
#9  0xc01ffb23 in vcount (vp=0xe0b0bd00) at ../../../kern/vfs_subr.c:2301
2301            SLIST_FOREACH(vq, &vp->v_rdev->si_hlist, v_specnext)
(kgdb)

Here we see the actual line of vfs_subr.c that was compiled into the panicking code. You don't need to know what SLIST_FOREACH is. (It's a macro, by the way.) Getting this far is pretty good, but there's still a little more information you can squeeze out of this dump without knowing exactly how the kernel works.

If you have some minor programming experience, you'd probably suspect that the terms in the parentheses after SLIST_FOREACH are variables. You'd be right. Each of those variables has a range of acceptable values. Someone familiar with the code would recognize the legitimate values; by printing out what each variable contains, we can jump-start the debugging process. You can tell gdb to print a variable's contents with the p command, giving the variable name as an argument. Let's start with vp, the vnode at the heart of the middle argument.

(kgdb) p vp
$2 = (struct vnode *) 0xe0b0bd00
(kgdb)

The (struct vnode *) bit tells us that this is a pointer to a data structure. You can show its contents by putting an asterisk in front of the variable name, like so:

(kgdb) p *vp
$3 = {v_flag = 8, v_usecount = 2, v_writecount = 1, v_holdcnt = 0,
  v_id = 6985, v_mount = 0x0, v_op = 0xc2d52a00, v_freelist = {tqe_next = 0x0,
    tqe_prev = 0xe083de1c}, v_nmntvnodes = {tqe_next = 0xe0b0b700,
    tqe_prev = 0xe0b0c024}, v_cleanblkhd = {tqh_first = 0x0,
    tqh_last = 0xe0b0bd2c}, v_dirtyblkhd = {tqh_first = 0x0,
    tqh_last = 0xe0b0bd34}, v_synclist = {le_next = 0x0, le_prev = 0x0},
  v_numoutput = 0, v_type = VBAD, v_un = {vu_mountedhere = 0x0,
    vu_socket = 0x0, vu_spec = {vu_specinfo = 0x0, vu_specnext = {
        sle_next = 0x0}}, vu_fifoinfo = 0x0}, v_lastw = 0, v_cstart = 0,
  v_lasta = 0, v_clen = 0, v_object = 0x0, v_interlock = {mtx_object = {
      lo_class = 0xc0335c60, lo_name = 0xc02ef5c1 "vnode interlock",
      lo_flags = 196608, lo_list = {stqe_next = 0x0}, lo_witness = 0x0},
    mtx_lock = 4, mtx_recurse = 0, mtx_blocked = {tqh_first = 0x0,
      tqh_last = 0xe0b0bd84}, mtx_contested = {le_next = 0x0, le_prev = 0x0},
    tsp = {tv_sec = 3584, tv_nsec = 101067509},
    file = 0xc02ef50a "../../../kern/vfs_subr.c", line = 1726,
    has_trace_time = 0}, v_lock = {lk_interlock = 0xc036e320,
    lk_flags = 16777216, lk_sharecount = 0, lk_waitcount = 0,
    lk_exclusivecount = 0, lk_prio = 80, lk_wmesg = 0xc02ef5d1 "vnlock",
    lk_timo = 6, lk_lockholder = -1}, v_vnlock = 0x0, v_tag = VT_NON,
  v_data = 0x0, v_cache_src = {lh_first = 0x0}, v_cache_dst = {
    tqh_first = 0x0, tqh_last = 0xe0b0bdd8}, v_dd = 0xe0b0bd00, v_ddid = 0,
  v_pollinfo = 0x0, v_vxproc = 0x0}
(kgdb)

For those of you who are learning C, this is an excellent example of how it's easier to hand around a pointer than the object it references. An interested developer can dig through this and see what's going on. Let's take a look at the first variable, vq, and try to get similar information from it.

(kgdb) p vq
$4 = (struct vnode *) 0x0
(kgdb)

This isn't exactly a problem, but we're stuck. A pointer equal to 0x0 is a null pointer. There are many legitimate reasons for having a null pointer, but there isn't anything in it for us to view. Feel free to try, however; you really can't hurt anything through gdb.

(kgdb) p *vq
Cannot access memory at address 0x0.
(kgdb)

You've probably heard the words "null pointer" in close proximity to the word "panic." Without digging into the kernel code, though, you can't assume that this null pointer is what caused the panic. (In this particular panic, the null pointer is perfectly legitimate; the kernel panicked while trying to decide what value to assign to this pointer.)

You could try digging a little further into the data to see what's going on. The second argument in our panicking line (vp->v_rdev->si_hlist) reaches through several structures; let's take a look a little deeper into it.

(kgdb) p vp->v_rdev
There is no member named v_rdev.
(kgdb)

Normally, this would work, and if you've used gdb before, you might think gdb has it wrong here. It doesn't. v_rdev is a convenience macro, and only people who have read the kernel source code would know that. It actually expands to v_un.vu_spec.vu_specinfo. You couldn't be expected to know that, but don't be surprised if a developer asks you to type something different from what actually appears in the trace. To actually view vp->v_rdev, you would type:

(kgdb) p vp->v_un.vu_spec.vu_specinfo
$5 = (struct specinfo *) 0x0
(kgdb)

You should be able to recognize the null pointer here, but that's about it.
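
If you're curious where a convenience macro like this is defined, grepping the kernel headers will usually turn it up. The header named here is an educated guess and may differ between releases:

# grep -n v_rdev /usr/src/sys/sys/vnode.h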

In this particular case, your extra digging would produce the answer for a developer very quickly. This tidbit in the contents of the vp structure identifies the problem almost immediately.

v_type = VBAD

This is a vnode that isn't currently used, and it shouldn't even be in this part of the system. A developer would jump directly on that and try to learn why the system is trying to set a new vnode to a bogus value.

By gathering what information you could before sending out an email, you would short-circuit a round or two of email. Anyone who's used email support in a crisis knows just how valuable this is! Without being a kernel hacker, you can't know which tidbit of knowledge is most important, so you need to include everything you can learn.

I got this particular kernel dump from Dag-Erling Smorgrav. His comment on this was that he "could fix vcount() to return 0 for invalid vnodes--it wouldn't, strictly speaking, be incorrect--but the real bug is somewhere else, and 'fixing' vcount() would just hide it." This is the correct attitude to have on this sort of problem; BSD users expect bugs to be found, not painted over. This means, however, that you can expect your developer to come back to you with requests for further information, and probably more things to type into gdb. He might even ask you to send the kernel.debug and core file.

You should be aware that the vmcore file contains everything in your system's memory at the time of the panic. This can include all sorts of security-impacting information. Someone could conceivably use this information to break into your system. A developer might write you and ask for a copy of the files. There are all sorts of legitimate reasons for this; it makes debugging easier and can save countless rounds of email. Carefully consider the potential consequences of someone having this information. If you don't recognize the person who asks, or if you don't trust him, don't send the files!

If the panic is reproducible, however, you can cold-boot the system to single-user mode and trigger the panic immediately. If the system never starts any programs that contain confidential information, and nobody types any passwords into the system, the dump cannot contain that information. Reproducing a panic in single-user mode therefore generates a "clean" core file. Just enter boot -s at the loader prompt to bring the system to a command prompt, then do the minimal setup necessary to prepare a dump and panic the system.

# dumpon /dev/ad0s4b
# mount -art ufs
# /usr/local/bin/command_that_panics_the_system

The first line tells the system where to put its dump; substitute your own swap partition name. The second line mounts the filesystems read-only, so you won't have to fsck after your panic. (Since you know the crash is coming, why make yourself fsck?) Finally, you run the command that triggers the panic. You might need some additional commands, depending on your local setup, but this should get most people up and running.
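
Exactly what those additional commands are depends on what the panicking program needs. If it needs the network, for instance, something along these lines might be required; the interface name and addresses are placeholders, not part of the original example:

# ifconfig xl0 inet 192.0.2.10 netmask 255.255.255.0
# route add default 192.0.2.1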

As a final treat, here's a debugging session from the same panic, and the same kernel, but without debugging symbols. Compare it to the initial where output above.

(kgdb) where
#0  0xc01c5982 in dumpsys ()
#1  0xc0143119 in db_fncall ()
#2  0xc0142f33 in db_command ()
#3  0xc0142fff in db_command_loop ()
#4  0xc0145393 in db_trap ()
#5  0xc02ad0f6 in kdb_trap ()
#6  0xc02ba004 in trap_fatal ()
#7  0xc02b9d71 in trap_pfault ()
#8  0xc02b9907 in trap ()
#9  0xc01ffb23 in vcount ()
#10 0xc01a5e58 in spec_close ()
#11 0xc01a55f1 in spec_vnoperate ()
#12 0xc0207454 in vn_close ()
#13 0xc0207fab in vn_closefile ()
#14 0xc01b1d50 in fdrop_locked ()
#15 0xc01b155a in fdrop ()
#16 0xc01b152d in closef ()
#17 0xc01b114e in fdfree ()
#18 0xc01b5173 in exit1 ()
#19 0xc01b4ec2 in sys_exit ()
#20 0xc02ba2b7 in syscall ()
#21 0xc02ae06d in syscall_with_err_pushed ()
#22 0x80503a5 in ?? ()
#23 0x807024a in ?? ()
#24 0xbfbfffb4 in ?? ()
#25 0x807daaf in ?? ()
#26 0x807d6eb in ?? ()
#27 0x80630c1 in ?? ()
#28 0x8062fed in ?? ()
#29 0x805ea4c in ?? ()
#30 0x8065949 in ?? ()
#31 0x806544d in ?? ()
#32 0x806dc17 in ?? ()
#33 0x80616b7 in ?? ()
#34 0x80613f0 in ?? ()
#35 0x8048135 in ?? ()

That's it. There are no hints here about where the panic happened, just the names of the functions that were called. An extraordinarily experienced hacker might happen to recognize a place in the kernel where these exact function calls happen in exactly this order. If the kernel developer is really, really interested in the problem, he could squeeze some information out of it like this.

(kgdb) p vcount
$1 = {<text variable, no debug info>} 0xc01ffb00 <vcount>
(kgdb) up 9
#9  0xc01ffb23 in vcount ()
(kgdb) p/x 0xc01ffb23 - 0xc01ffb00
$2 = 0x23
(kgdb)

The p/x command means "print in hexadecimal." Here, we've learned roughly how far into vcount() the problem happened. If the developer has a similar kernel built with similar source code, they can do this:

(kgdb) l *(vcount + 0x23)
0xc01fb913 is in vcount (../../../kern/vfs_subr.c:2301).
2296            struct vnode *vq;
2297            int count;
2298
2299            count = 0;
2300            mtx_lock(&spechash_mtx);
2301            SLIST_FOREACH(vq, &vp->v_rdev->si_hlist, v_specnext)
2302                    count += vq->v_usecount;
2303            mtx_unlock(&spechash_mtx);
2304            return (count);
2305    }
(kgdb)

That's it. There's no way to get the bad vnode information out. The developer is left on his own, poking through the code to see if he can figure out the problem through sheer, dogged determination. And in any event, it's very unlikely that any developer capable of working on a problem will have the exact setup that you have on a panicking system. While many of them would be happy to set up such a system in exchange for lavish amounts of hard currency, it's a bit much to expect for free.

Congratulations! You're now as prepared for a crash as any non-kernel developer can be. Proper preparation can make your life easier, and preparing for the worst is one of the best ways to sleep uninterrupted at night.

(The author wishes to gratefully acknowledge the assistance of Dag-Erling Smorgrav in preparing this article.)

Michael W. Lucas

