ONLamp.com    
 Published on ONLamp.com (http://www.onlamp.com/)


Testing C with Libtap

by Stig Brautaset
01/19/2006

Libtap is a library for testing C code. It implements the Test Anything Protocol, which is emerging from Perl's established test framework.

Design for Today, Code for Tomorrow

One of the ideas behind Extreme Programming (XP) is to "design for today, code for tomorrow." Rather than making your design cover all eventualities, you should write code that is simple to change should it become necessary.

Having a good regression test suite is a key part of this strategy. It lets you make modifications that change large parts of the internals with the confidence that you have not broken your API. A good test suite can also be a way to document how you intend people to use your software.

Having worked where people thought that writing tests was a waste of time, I can't tell you how much time I wasted fixing bugs that emerged when other bugs were fixed or new features were added. If we'd had a proper regression test suite, we could have found those immediately, and I would have had lots of extra time to write new features. Taking the time to produce good tests (and actually running them) ends up saving a lot of time, not wasting it.

Introducing the Test Anything Protocol

Perl distributions normally ship with a test suite written using Test::Simple, Test::More, or the older (and now best avoided) Test module. These modules contain functions to produce plain-text output according to the Test Anything Protocol (TAP) based on the success or failure of the tests. The output from a TAP test program might look something like this:

1..4
ok 1 - the WHAM is overheating
ok 2 - the overheating is detected 
not ok 3 - the WHAM is cooled 
not ok 4 - Eddie is saved by the skin of his teeth

The 1..4 line indicates that the file expects to run four tests. This can help you detect a situation where your test script dies before it has run all the intended tests. The remaining lines consist of a test success flag, ok or not ok, and a test number, followed by the test's "name" or short description. The second and third lines of the output (tests 1 and 2) indicate successful tests, while the last two indicate test failures.

Perl modules usually invoke the tests either by running the prove program or by invoking make test or ./Build test (depending on whether you're using ExtUtils::MakeMaker or Module::Build). All three approaches use the Test::Harness module to analyze the output from TAP tests. If all else fails, you can also run the tests directly and inspect the output manually.

If Test::Harness is given a list of test programs to run, it will run each one individually and summarize the result. Tests can run in quiet and verbose modes. In quiet mode, the harness prints only the name of the test script (or scripts) and a result summary. Verbose mode prints the test "name" for each individual test.
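To make the summarizing step concrete, here is a minimal sketch in C of how a harness might tally TAP output. It is not how Test::Harness is implemented; the tap_summary type and the tap_summarize() function are names I made up for illustration, and the sketch ignores directives and diagnostics.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical summary type for this sketch; not part of any TAP library. */
struct tap_summary {
    int planned;  /* N from the "1..N" plan line, or -1 if no plan was seen */
    int passed;   /* result lines starting with "ok" */
    int failed;   /* result lines starting with "not ok" */
};

/* Scan TAP text line by line and tally the plan and the test results. */
static struct tap_summary tap_summarize(const char *text)
{
    struct tap_summary s = { -1, 0, 0 };
    const char *line = text;

    while (*line != '\0') {
        int n;
        const char *nl;

        if (sscanf(line, "1..%d", &n) == 1)
            s.planned = n;
        else if (strncmp(line, "not ok", 6) == 0)
            s.failed++;
        else if (strncmp(line, "ok", 2) == 0)
            s.passed++;
        /* Anything else (such as "#" diagnostics) is ignored here. */

        nl = strchr(line, '\n');
        if (nl == NULL)
            break;
        line = nl + 1;
    }
    return s;
}
```

Fed the four-test output shown earlier, this reports a plan of four with two passes and two failures. A harness can then flag the run as failed either because failed is nonzero or because passed plus failed differs from planned.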

Besides Perl, helper libraries for producing TAP output are available for many languages including C, JavaScript, and PHP (see the Links & Resources section).

Suppose that you want to write tests for the module Foo, which provides the mul(), mul_str(), and answer() functions. The first two perform multiplication of numbers and strings, while the third provides the answer to life, the universe, and everything. Here is an extremely simple Perl test script for this module:

use Test::More tests => 3; 
use Foo;
  
ok(mul(2,3) == 6, '2 x 3 == 6'); 
is(mul_str('two', 'three'), 'six', 'expected: six'); 
ok(answer() == 42, 'got the answer to everything');

The tests => 3 part tells Test::More how many tests the script intends to run (referred to as planning). Doing this allows the framework to detect whether you exit the test script without actually running all the tests. It is possible to write test scripts without planning, but many people consider this a bad habit.

On to the C Testing

Hey! Isn't this article supposed to be about testing C? It is. Libtap is a C implementation of the Test Anything Protocol. It is to C what Test::More is to Perl, though using it doesn't tie you into using Perl. However, for convenience you probably want to use the prove program to interpret the output of your tests.

Libtap implements a convenient way for your C and C++ programs to speak the TAP protocol. This allows you to easily declare how many tests you intend to run, skip tests (some apply only on specific operating systems, for example), and mark tests for unimplemented features as TODO. It also provides the convenient exit_status() function for indicating whether any of the tests failed through the program's return code.

How would you write the test for the Foo module in C, using libtap? The #include <foo.h> line is analogous to the use Foo; of the Perl version. However, as this is C, you also need to link with the libfoo library (assuming this implements the functions declared in foo.h).

For this test, I will show the full source of the test program, including any #include lines; I will show only shorter fragments below. Notice again that the number passed to the plan_tests() function matches the number of tests that actually run:

#include <tap.h>
#include <string.h>
#include <foo.h>
  
int main(void) {
  plan_tests(3); 
  ok1(mul(2, 3) == 6); 
  ok(!strcmp(mul_str("two", "three"), "six"), "expected: 6"); 
  ok(answer() == 42, "got the answer to everything"); 
  return exit_status();
}

The exit_status() function returns 0 if the correct number of tests ran and if they all succeeded; it returns nonzero otherwise. In the Perl version the test framework makes magic happen behind the scenes so that you don't have to twiddle the exit status by hand.

One notable difference between the Perl version and the C version is the ok1() macro, a wrapper around the ok() call. Instead of having to call ok() with a test condition as the first parameter and diagnostic as the second (and any subsequent) parameter, this macro stringifies its argument and uses that for the diagnostic message. This can be very convenient for simple tests.
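The stringifying trick is plain C preprocessor behavior and can be sketched without libtap. The sketch_ok() helper and SKETCH_OK1 macro below are hypothetical names for illustration only; libtap's real ok1() prints to standard output and keeps its own test counter, neither of which I reproduce here.

```c
#include <stdio.h>

/* Hypothetical helper for this sketch: format one TAP result line into buf. */
static int sketch_ok(char *buf, size_t len, int number, int passed,
                     const char *name)
{
    return snprintf(buf, len, "%s %d - %s",
                    passed ? "ok" : "not ok", number, name);
}

/* #expr turns the macro argument into a string literal at compile time,
   so the test expression doubles as its own description. */
#define SKETCH_OK1(buf, len, number, expr) \
    sketch_ok((buf), (len), (number), (expr) ? 1 : 0, #expr)
```

Calling SKETCH_OK1(buf, sizeof buf, 1, 2 * 3 == 6) fills buf with a result line whose description is the literal text of the expression, which is why ok1() saves typing for simple comparisons.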

Both the Perl and C tests above, when run, print something along the lines of:

1..3 
not ok 1 - mul(2, 3) == 6 
#     Failed test (basic.c:main() at line 12) 
ok 2 - expected: 6 
ok 3 - got the answer to everything

The line starting with # is a diagnostic message; libtap prints these occasionally to help you find which test is failing. In this case, it identifies the line in the test file that contained the failing test.

Skipping Tests

Sometimes it is necessary to skip tests. For example, you might be testing an operation that only the root user can perform, or something that applies only to a particular platform. Using Test::More, you can create a block marked with the special label SKIP:

SKIP: { 
  skip "because only root can foo()", 2 
    unless is_root(); 
  ok(foo(0), "root can foo(0)"); 
  ok(foo(1), "root can foo(1)"); 
}

With libtap you don't have this nice block structure, but the skip() function works similarly. It takes as its arguments the number of tests to skip and the string describing the reason to skip the tests in question. Here's the C version of the previous Perl example:

if (is_root()) { 
  ok(foo(0), "root can foo(0)"); 
  ok(foo(1), "root can foo(1)"); 
} 
else { 
  skip(2, "because only root can foo()"); 
}

Notice that you have to make sure to update the first argument passed to the skip() function in the !is_root() branch if the number of tests in the is_root() branch changes. This is easy to spot for simple cases like the above, but is harder for larger portions of code.

Libtap furnishes the skip_start() and skip_end macros, which provide a more Perl-ish way of skipping tests. If the first argument passed to skip_start() is true, libtap will skip all tests between it and the corresponding skip_end; that is, the code will compile but will not execute. You still have to make sure that the number passed as the second argument to skip_start() corresponds to the actual number of tests between it and skip_end, but at least you don't have to worry about two different branches of an if clause.

  skip_start(!is_root(), 2, "because only root can foo()"); 
  ok(foo(0), "root can foo(0)"); 
  ok(foo(1), "root can foo(1)"); 
  skip_end; /* it's a macro: no parentheses */

Regardless of the method you use, when running as root, the output should look something like this:

1..2 
not ok 1 - root can foo(0) 
#     Failed test (skip.c:main() at line 12) 
ok 2 - root can foo(1) 
# Looks like you failed 1 tests of 2.

and something like this when running as a normal user:

1..2 
ok 1 # skip because only root can foo() 
ok 2 # skip because only root can foo()
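Skipped tests still appear in the output as passing results, so the numbering continues to agree with the plan. A rough sketch of that bookkeeping follows; sketch_skip() is a hypothetical function that writes into a buffer, whereas libtap's skip() prints directly and tracks the test number internally.

```c
#include <stdio.h>

/* Append n skipped results to buf, numbering from *next_test, and advance
   the counter so later tests keep the numbering the plan expects.
   Returns the number of characters the output needed. */
static size_t sketch_skip(char *buf, size_t len, int *next_test, int n,
                          const char *reason)
{
    size_t used = 0;
    int i;

    for (i = 0; i < n && used < len; i++) {
        int w = snprintf(buf + used, len - used, "ok %d # skip %s\n",
                         *next_test, reason);
        if (w < 0)
            break;              /* formatting error: stop early */
        used += (size_t)w;
        (*next_test)++;
    }
    return used;
}
```

This also shows why the count matters: pass a number that is too small and the program runs fewer tests than it planned, which the harness reports as a failure.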

TODO Tests

Occasionally you may want to incorporate a test that you expect to fail into your test suite. Libtap supports TODO tests, which work well for this. They can be handy when you have a planned feature that you just don't have time to implement right now, but know how to test.

To write TODO tests in Perl, set the $TODO package variable to a true value. Test::More uses this value as the reason the tests are listed as not yet done.

ok(run(), "yay, it claims to start running!"); 
ok(is_running(), "it is running"); 
  
{ 
  local $TODO = "not sussed this part yet..."; 
  ok(stop(), "it appears to stop"); 
  ok(!is_running(), "it is not running"); 
}

In C it's almost the same, though you have to use a couple of function calls instead of a TODO block:

ok(run(), "yay, it claims to start running!"); 
ok(is_running(), "it is running"); 
  
todo_start("not sussed this part yet..."); 
ok(stop(), "it appears to stop"); 
ok(!is_running(), "it is not running"); 
todo_end();

Assume that the code you're testing reports that the program stopped when told to, but fails to actually stop. The output should look something like:

1..4 
ok 1 - yay, it claims to start running! 
ok 2 - it is running 
ok 3 - it appears to stop # TODO not sussed this part yet... 
not ok 4 - it is not running # TODO not sussed this part yet... 
#     Failed (TODO) test (todo.c:main() at line 17)
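The important point is that a harness does not treat a failing TODO test as a real failure. The following sketch shows one way to make that decision from a single line of output. counts_as_failure() is a name I invented, and the check is deliberately naive: it would be fooled by a test description that itself contains "# TODO", and it matches the directive only in upper case.

```c
#include <string.h>

/* Return 1 if a TAP result line should count as a real failure.
   A "not ok" line carrying a "# TODO" directive is an expected failure. */
static int counts_as_failure(const char *line)
{
    if (strncmp(line, "not ok", 6) != 0)
        return 0;   /* a passing test, or not a result line at all */
    return strstr(line, "# TODO") == NULL;
}
```

With this rule, test 4 in the output above does not break the build, while the same line without the directive would.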

Proper Planning Prevents Poor Performance

What happens when you get the number of tests to run wrong? Try it (why yes, this is a contrived test):

plan_tests(3);
ok(1, "true");
ok(!0, "!false == true");
return exit_status();

When you compile and run this program, you should get the following output. (I've shown the command line I typed to run the test as well, and I am checking to see if the program is exiting cleanly.)

% ./plan-too-many && echo success || echo failure
1..3
ok 1 - true
ok 2 - !false == true
# Looks like you planned 3 tests but only ran 2.
failure

Libtap issued a diagnostic to say that you planned more tests than you ran. In this case the diagnostic was correct, but you could also get this message if some spectacular failure makes the program exit cleanly before all the tests have run. In addition, the shell printed failure, meaning that the test program returned nonzero. This is the handiwork of the exit_status() function, and it makes it possible to simply run the test and ignore the output to get an ok or not ok verdict (although this would not catch the spectacular failing case here).
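The decision behind that return code can be sketched in a few lines. sketch_exit_status() below is a hypothetical stand-in, not libtap's implementation: it returns 0 or 1, while the specific nonzero values libtap uses are not reproduced, and the wording of the "extra" message is my own assumption.

```c
#include <stdio.h>

/* Return 0 only when every planned test ran and none failed.
   On a plan mismatch, a diagnostic is written to msg; otherwise msg is
   left as an empty string. */
static int sketch_exit_status(int planned, int run, int failed,
                              char *msg, size_t len)
{
    if (len > 0)
        msg[0] = '\0';

    if (run < planned) {
        snprintf(msg, len,
                 "# Looks like you planned %d tests but only ran %d.",
                 planned, run);
        return 1;
    }
    if (run > planned) {
        /* Message wording assumed for this sketch. */
        snprintf(msg, len,
                 "# Looks like you planned %d tests but ran %d extra.",
                 planned, run - planned);
        return 1;
    }
    return failed > 0 ? 1 : 0;
}
```

For the contrived test above, the inputs would be a plan of 3, 2 tests run, and 0 failures, which yields a nonzero result even though every test that ran passed.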

Responsibility of Tests

Unit tests have several responsibilities. Above all, they should detect when something has gone wrong. In order to help you quickly and easily find the problem, they should also attempt to give good clues as to what has gone wrong and where the problem might be.

I once needed to test an AI for an Othello (aka Reversi) game. To make sure that the AI picked the expected move at each point, I compared each state in a run with the states of a known good run. Simply doing square-by-square comparisons over the entire grid at each state would have detected errors, but it would have made it hard to see in which situation the errors occurred. Actually creating (not to mention updating) the tests would have been a nightmare as well.

Instead I chose to code the test in two parts. The first was a very simple C program to search for and apply moves and print out the current state at each step. This part resembled:

state = init_othello_state(); 
do { 
   print_othello_state(state); 
   putchar('\n'); 
} while ((state = ai_move(state))); /* assignment intended: stop when ai_move() returns a false value */

I then wrote a simple Perl program to read the output of the first and compare each state against a list of canned states from a known good run. That part was also very simple:

local $/ = ''; 
my @states = qx( ./reversi ) 
  or die 'running reversi failed'; 
  
is shift @states, $_ while <DATA>; 
  
__DATA__ 
...... 
...... 
..ox.. 
..xo.. 
...... 
...... - x to move 

best branch: 6 (ply 3 search; 48 states visited) 
...... 
...... 
..ox.. 
..xx.. 
...x.. 
...... - o to move

best branch: 1 (ply 3 search; 45 states visited)
......
......
..ox..
..ox..
..ox..
...... - x to move

[etc]

Notice that this last test did not actually use libtap at all. Instead, it had a C component and a Perl component. Using Inline::C, you could improve on that and put both the C and Perl components in the same file. Note that using two separate parts has the benefit that if Perl is not available the C component can still run, although you must manually compare its results with the canned run in the Perl component.
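If Perl is not available, the comparison itself is simple enough to do in C. The sketch below is not the code I used; paragraph_len() and first_mismatch() are hypothetical helpers that assume both the program output and the canned run are already in memory as strings, with states separated by blank lines as in the Perl version.

```c
#include <string.h>

/* Length of the first paragraph in s: the text up to the next blank line
   or the end of the string, with trailing newlines trimmed. */
static size_t paragraph_len(const char *s)
{
    const char *end = strstr(s, "\n\n");
    size_t len = end ? (size_t)(end - s) : strlen(s);

    while (len > 0 && s[len - 1] == '\n')
        len--;
    return len;
}

/* Compare two blank-line-separated state dumps paragraph by paragraph.
   Returns -1 if they match, else the zero-based index of the first
   paragraph that differs (including one side running out early). */
static int first_mismatch(const char *actual, const char *expected)
{
    int index = 0;

    while (*actual != '\0' || *expected != '\0') {
        size_t alen = paragraph_len(actual);
        size_t elen = paragraph_len(expected);

        if (alen != elen || memcmp(actual, expected, alen) != 0)
            return index;

        actual += alen;
        expected += elen;
        /* Step over the separating newlines before the next paragraph. */
        while (*actual == '\n')
            actual++;
        while (*expected == '\n')
            expected++;
        index++;
    }
    return -1;
}
```

Reporting the index of the first differing state keeps the main benefit of the Perl approach: you can see at which move the AI diverged, rather than only that it did.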

Availability

Both Test::More and Test::Harness are part of the core in recent Perl distributions. Libtap is unfortunately not (yet) a commonly installed library. However, it has a liberal license and consists of only two source files, so if you want to make sure that your users can run your test suite, you can bundle libtap with your software.

Conclusion

This article explores various ways of using Perl to help test your C code. First, it briefly examines the Test Anything Protocol, the heart of Perl's well-established test framework. It then shows how libtap can help you produce TAP output from C programs. It rounds off by using Perl to test programs by comparing their output with known good reference output.

Hopefully, this should give you some ideas for improving your own testing.

Links & Resources

Stig Brautaset works at Fotango, where he gets to do interesting things in Perl and eat a lot of fruit.



Copyright © 2009 O'Reilly Media, Inc.