Getting Familiar with GCC Parameters
Pages: 1, 2, 3, 4

Options Related To Debugging

Everybody needs to debug code sometimes. When that time comes, you usually fire up gdb, set breakpoints here and there, analyze backtraces, and so on, to pinpoint the exact location of the offending code. And what exactly do you get? Assuming you haven't used any debugging option, you likely get just an address, such as the value held in the EIP register.

The problem is, you don't really want an address. You want gdb or another debugger to simply show the related lines. But gdb can't do that without some kind of hint. This hint, specifically called Debugging With Attributed Record Formats (DWARF), helps you do source-level debugging.

How do you do it? Use -g when you compile into object code, e.g.:

  gcc -g -o test test.c

What does gcc actually add so that the debugger can correlate an address with source code? You need dwarfdump [7] to find out. This tool is packaged inside the "libdwarf" tarball or RPM, so you won't find it as a standalone package. Compile it on your own, or simply install from your distro's online repository; both should work. In this section, I use the 20060614 RPM version.

Using readelf, you notice that there are 28 sections inside the non-debug version of Listing 1:

 $ readelf -S ./non-optimized

But the debug version has 36 sections. The new sections added are:

  • .debug_aranges
  • .debug_pubnames
  • .debug_info
  • .debug_abbrev
  • .debug_line
  • .debug_frame
  • .debug_str
  • .debug_loc

You don't need to dig into all of the above sections; taking a look into .debug_line is enough for quick observation. The command you need is:

$ /path/to/dwarfdump -l <object file>

Here is an example of what you'll get:

 .debug_line: line number info for a single cu
   Source lines (from CU-DIE at .debug_info offset 11):
 <source>        [row,column]    <pc>    //<new statement or basic block
 /code/./non-optimized.c:  [  3,-1]        0x8048384       // new statement
 /code/./non-optimized.c:  [  5,-1]        0x8048395       // new statement

The interpretation of the above messages is quite straightforward. Take the first entry (below the <source> string) as an example:

  line number 3 in file non-optimized.c is located at address 0x8048384.
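
As a rough illustration of how each row breaks down, the following C sketch pulls the file, line, column, and address out of one dwarfdump row with sscanf. The struct line_row and the function parse_line_row are hypothetical names, not part of dwarfdump or libdwarf, and the format string assumes the row layout shown above (it would not cope with a file path that contains a colon).

```c
#include <stdio.h>

/* Hypothetical holder for one row of `dwarfdump -l` output. */
struct line_row {
    char file[256];      /* source path as recorded in the debug info */
    int line;            /* source line number */
    int column;          /* source column; -1 in the sample rows above */
    unsigned long pc;    /* address associated with that source line */
};

/* Parse a row such as
 *   /code/./non-optimized.c:  [  3,-1]        0x8048384       // new statement
 * Returns 1 on success and 0 if the row does not match the assumed layout. */
int parse_line_row(const char *row, struct line_row *out)
{
    /* %255[^:] reads the path up to the first colon;
     * %lx accepts the leading 0x of the address. */
    int fields = sscanf(row, " %255[^:]: [ %d ,%d ] %lx",
                        out->file, &out->line, &out->column, &out->pc);
    return fields == 4;
}
```

Fed the first sample row, this sketch should fill in line 3, column -1, and address 0x8048384, while the <source> header row is rejected because it contains no colon.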

gdb itself gives the same information:

 $ gdb non-optimized-debugging
   (gdb) l *0x8048384
   0x8048384 is in main (./non-optimized.c:3).

readelf also provides similar information, by using --debug-dump=line:

 $ readelf --debug-dump=line <object file>
   Line Number Statements:
   Extended opcode 2: set Address to 0x8048384
   Special opcode 7: advance Address by 0 to 0x8048384 and Line by 2 to 3
   Advance PC by constant 17 to 0x8048395
   Special opcode 7: advance Address by 0 to 0x8048395 and Line by 2 to 5
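
The "Special opcode" lines can be checked by hand. In a DWARF line-number program, a special opcode packs an address advance and a line advance into a single byte, and the C sketch below applies that arithmetic. It rests on several assumptions that the output above does not show: that readelf prints the opcode after subtracting opcode_base, and that the line-program header uses line_base = -5, line_range = 14, opcode_base = 10, and a minimum instruction length of 1. These values were chosen because they reproduce the numbers above; the real ones appear in the header that readelf prints at the top of its dump. All names in the sketch are hypothetical.

```c
/* Assumed line-program header values (not shown in the readelf output above). */
enum {
    LINE_BASE = -5,        /* smallest line advance a special opcode can encode */
    LINE_RANGE = 14,       /* number of distinct line advances */
    OPCODE_BASE = 10,      /* number of the first special opcode */
    MIN_INST_LENGTH = 1    /* x86 instructions are byte-granular */
};

/* Result of decoding one special opcode. */
struct line_advance {
    unsigned address;   /* bytes to add to the address register */
    int line;           /* lines to add to the line register */
};

/* Decode an adjusted special opcode, i.e. the raw opcode minus OPCODE_BASE,
 * which is assumed to be the number readelf prints after "Special opcode". */
struct line_advance decode_special_opcode(unsigned adjusted)
{
    struct line_advance adv;
    adv.address = (adjusted / LINE_RANGE) * MIN_INST_LENGTH;
    adv.line = LINE_BASE + (int)(adjusted % LINE_RANGE);
    return adv;
}

/* "Advance PC by constant" (DW_LNS_const_add_pc) adds the address advance
 * that special opcode 255 would produce, and leaves the line untouched. */
unsigned const_add_pc_advance(void)
{
    return decode_special_opcode(255 - OPCODE_BASE).address;
}
```

Under these assumptions, decode_special_opcode(7) yields an address advance of 0 and a line advance of 2, and const_add_pc_advance() yields 17, which agrees with the readelf lines above (0x8048384 + 17 = 0x8048395).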

Both readelf and dwarfdump can analyze debug information, so you're free to choose.

What you should realize is that the source code itself isn't embedded into the object file. In fact, the debugger must check a separate source code file. The entry in the <source> column helps determine where to load the source file. Notice that it contains a full path--meaning if the file is moved somewhere or renamed, gdb can't locate it.

gcc itself can produce debugging information in several formats. Besides DWARF, there are:

  • Stabs: -gstabs produces the native stabs format, while -gstabs+ includes
    GNU-specific extensions.
  • Common Object File Format (COFF): Created with -gcoff.
  • XCOFF: Created with -gxcoff. If you prefer to include GNU extensions,
    use -gxcoff+.
  • Virtual Memory System (VMS): Produced with -gvms.

Each of the above formats is described in the footnotes ([8], [9], and [10]), but on x86-compatible architectures you would almost certainly use the DWARF format. The latest DWARF specification is DWARF version 3, and gcc can produce it via -gdwarf-2. This could be misleading for first-time users, because you might think it creates DWARF 2-based information. In fact, it is DWARF 2 coupled with some DWARF 3 features. Not every debugger supports version 3, so use it with caution.

However, things do not always go smoothly. When you combine -O and -g, the line information still has to be related to the actual code at each address, and after optimization the two may no longer match. An example can clarify this--take a file (I use Listing 1 again) and compile it:

 $ gcc -O2 -ggdb -o debug-optimized listing-one.c
   $ readelf --debug-dump=line debug-optimized
   Special opcode 107: advance Address by 7 to 0x80483aa and Line by 4 to 11

But what does gdb say?

 $ gdb debug-optimized
   (gdb) l *0x80483aa
   0x80483aa is in main (./non-optimized.c:11).
   11              printf("acc = %lu\n",acc);
   (gdb) disassemble main
   0x080483aa <main+26>:   add    $0x6,%edx
   0x080483ad <main+29>:   cmp    $0x1388,%eax

There you see a complete disagreement. By inspecting the debug information alone, you would expect the address to contain something like a CALL instruction for printf. In reality, you get ADD and CMP instructions, which look more like part of a loop. This is a side effect of optimization--instruction reordering, in this case. So as a rule of thumb, don't mix -g (or its variants) with -O.
