Unit Testing Your Documentation

by Leonard Richardson

When O'Reilly editor Mike Loukides contacted me about co-writing the Ruby Cookbook, I was apprehensive. I wasn't worried about the size of the project; I was concerned about quality. How could I work to the level of quality I expected from O'Reilly books--especially the Python and Perl Cookbooks, against which I knew people would measure this book? I'd heard horror stories of books that didn't meet O'Reilly's usual standard, books rushed through second editions because the code was full of bugs. I didn't want that to happen to my book. (I still hope it doesn't!)

At first, I thought it would be especially difficult to ensure the quality of a cookbook. Instead of a few application-sized examples illustrating a few coherent topics (say, database access in Java), we had to test 350 separate pieces of code on 350 wide-ranging topics. As with a software project, we had deadlines to meet.

Worse, due to the structure of our contract and the scarcity of proofreader time, this book was essentially a Waterfall project. Up to the very end, our focus was on getting everything written, with some time allocated afterward to edit the text and code. This isn't quite as crazy as it sounds, because bad text is easier to whip into shape than bad code, but it meant I didn't have a lot of time to spend on building test infrastructure.

Fortunately, despite my early misgivings, the Cookbook format actually made the individual recipes easy to test. I was able to turn pre-existing features of the recipes (the worked examples) into unit tests. I ran the tests in an instrumented irb session and generated a report that flagged any failures.

Thanks to the test framework, on a good day I could proofread, debug, and verify the correctness of 30 recipes. I worked faster and with greater confidence than I could doing everything by hand. I was also able to incorporate the test results into the general "confidence score" calculated for each recipe on my unofficial Ruby Cookbook homepage: a visible, though somewhat vague, metric of quality.

In this article, I present a simplified, cleaned-up version of my testing script. It parses recipe text into a set of code chunks and assertions. It then runs the code chunks in an instrumented irb session, and compares the assertions to reality. It works in a way similar to Python's doctest library.

Defining the Problem

Most of the recipes in the Ruby Cookbook contain a series of code samples depicting a single irb session. I've annotated important Ruby expressions in these samples with comments depicting their value or output. Here's a code sample from Recipe 1.15, "Word-Wrapping Lines of Text":

def wrap(s, width=78)
  s.gsub(/(.{1,#{width}})(\s+|\Z)/, "\\1\n")
end

wrap("This text is too short to be wrapped.")
# => "This text is too short to be wrapped.\n"

puts wrap("This text is not too short to be wrapped.", 20)
# This text is not too
# short to be wrapped.

Here, an ASCII arrow indicates the value of the first call to wrap, just like in irb. The second call is part of a puts statement, so instead of the value of that statement (a boring nil), the example shows the string printed to standard output.
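To make the distinction between a value and an output concrete, here is a minimal sketch of how a script might observe both for a single expression. It is not the testing script from this article: `value_and_output` is a hypothetical helper name, and it assumes the code under test prints through `$stdout` (as `puts` does) and can be evaluated against the top-level binding.

```ruby
require 'stringio'

# Hypothetical helper (not from the article's script): evaluates a string
# of Ruby code and returns a two-element array holding the value of the
# code and whatever it printed to standard output.
def value_and_output(code, context = TOPLEVEL_BINDING)
  original_stdout = $stdout
  captured = StringIO.new
  $stdout = captured               # puts and print now write into the buffer
  value = eval(code, context)
  [value, captured.string]
ensure
  $stdout = original_stdout        # always restore the real standard output
end

def wrap(s, width = 78)
  s.gsub(/(.{1,#{width}})(\s+|\Z)/, "\\1\n")
end

value_and_output('wrap("This text is too short to be wrapped.")')
# => ["This text is too short to be wrapped.\n", ""]

value_and_output('puts wrap("This text is not too short to be wrapped.", 20)')
# => [nil, "This text is not too\nshort to be wrapped.\n"]
```

The first call has an interesting value and no output; the second has a nil value and two lines of output, which is exactly the difference the two comment styles record.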

Both the value and the output are hidden in comments, so that the reader can copy and paste the sample code directly into irb. By following along with the recipe, the reader can try out techniques used in the recipe solutions. At every important step, the reader can compare his results against what the book says to see if he understands the code correctly. After reaching the end of a recipe, the reader has libraries loaded and objects set up for further experimentation.

The flip side--what the book says had better be right. Running all that code and cross-checking the results against the comments would take a long time. However, it wouldn't require a lot of brainpower, so why not do it automatically?

We couldn't stick Test::Unit calls in the sample code: it would distract from the main point of the recipes. Yet those annotated lines of code are, effectively, unit tests: assertions about what happens when you use the previously defined code. They serve a pedagogical purpose, but they can also help verify quality.

The Recipe Format

The first step is to parse out the code from the English text of the recipe. Fortunately, we wrote the Ruby Cookbook in a wiki markup format similar to RedCloth. Lines containing three backticks delineate chunks of code:

 This is the English text of the book
 ```
 puts "This is Ruby code."
 # "This is Ruby code."
 ```
 This is more English text.
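The chunk-splitting rule is simple enough to sketch in a few lines. The following is an illustration rather than the script presented in this article: `extract_code_chunks` is a hypothetical name, and it assumes a delimiter line holds nothing but the three backticks (plus optional whitespace). The delimiter is built with `'`' * 3` only so that the example can itself sit inside a code block.

```ruby
# Hypothetical helper: splits recipe text into code chunks. Each chunk is
# an array of the lines found between a pair of delimiter lines.
def extract_code_chunks(text, fence = '`' * 3)
  chunks = []
  current = nil                    # nil while we are reading English text
  text.each_line do |line|
    if line.strip == fence
      if current                   # closing delimiter: the chunk is complete
        chunks << current
        current = nil
      else                         # opening delimiter: start a new chunk
        current = []
      end
    elsif current
      current << line.chomp        # a line of Ruby code inside a chunk
    end
  end
  chunks
end

fence = '`' * 3
recipe = [
  'This is the English text of the book',
  fence,
  'puts "This is Ruby code."',
  '# "This is Ruby code."',
  fence,
  'This is more English text.'
].join("\n")

extract_code_chunks(recipe)
# => [["puts \"This is Ruby code.\"", "# \"This is Ruby code.\""]]
```

In this sketch a chunk that is opened but never closed is silently dropped; a real script would probably want to report that as a markup error.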

What about the format of the Ruby code? If a line ends with a comment containing an arrow, that's an assertion about the value of the expression on that line.

'pen' + 'icillin'                # => "penicillin"
['pen', 'icill'] << 'in'         # => ["pen", "icill", "in"]

If a line begins with a comment containing an arrow, that's an assertion about the value of the expression on the previous line.

'banana' * 10
# => "bananabananabananabananabananabananabananabananabananabanana"

If a line begins with a comment with no arrow, that's an assertion about the previous expression's output. An expression can yield multiple lines of output.

puts 'My life is a lie.'
# My life is a lie.

puts ['When', 'in', 'Rome'].join("\n")
# When
# in
# Rome

Any other line in a code chunk is a normal line of Ruby code with no associated assertion.
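Taken together, these four rules amount to a small line classifier. Here is one way it might look; this is a sketch under my own assumptions, not the parser from the testing script. The name `classify_line` and the symbols it returns are hypothetical, and the regular expressions are deliberately naive: a string literal that happens to contain `# =>` would fool them.

```ruby
# Hypothetical classifier for one line of a code chunk.
# Returns [kind, code, expected], where kind is one of:
#   :value_inline       - code followed by "# => value" on the same line
#   :value_of_previous  - "# => value" on a line of its own
#   :output_of_previous - "# text" on a line of its own
#   :code               - anything else
def classify_line(line)
  line = line.chomp
  case line
  when /\A\s*#\s*=>\s?(.*)\z/
    [:value_of_previous, nil, $1]
  when /\A\s*#\s?(.*)\z/
    [:output_of_previous, nil, $1]
  when /\A(.*\S)\s*#\s*=>\s?(.*)\z/
    [:value_inline, $1, $2]
  else
    [:code, line, nil]
  end
end

classify_line(%q{'pen' + 'icillin'                # => "penicillin"})
# => [:value_inline, "'pen' + 'icillin'", "\"penicillin\""]

classify_line('# => 42')
# => [:value_of_previous, nil, "42"]

classify_line('# My life is a lie.')
# => [:output_of_previous, nil, "My life is a lie."]

classify_line("puts 'My life is a lie.'")
# => [:code, "puts 'My life is a lie.'", nil]
```

The order of the `when` clauses matters: the arrow check has to come before the plain-comment check, or every value assertion on its own line would be mistaken for an output assertion.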
