Python DevCenter


Python Documentation Tips and Tricks

by Cameron Laird

Traditional thinking judges the merit of a computing language on objective, rather mathematical dimensions: how object-oriented is it? Are its constructs orthogonal? Does the syntax admit ambiguities? While that's appropriate for abstract analysis, when we work or play with a computing language in our daily life, we live an intimacy best judged along a more human dimension.

What's important and best about Python is not its use of white space, or capacity for metaprogramming, or any other single feature in isolation. Python's use is exploding because it fits well. It's a tool engineered with good balance, the right combination of rigor and flexibility to match our human needs. Yet one of the best things about Python is its documentation -- especially the kinds of documentation reference books and tutorials rarely explain.

How to look at the documentation

Python programmers like its libraries. The answer to most of the common "How do you ...?" questions is, "Use the ... module." Part of the reason for this rich tradition of reuse is that Python modules are generally well-documented, and it's easy to make new modules with good documentation.

Consider os as an example. The os module provides "a more portable way of using operating system (OS) dependent functionality than importing an OS dependent built-in module." That's what the online version of the Global Module Index tells us. Hyperlinks on the Index page lead to a variety of formats suitable for online browsing and printing. Nearby on the same site is an explanation by Fred L. Drake of how to document Python, informing document authors of the LaTeX markup practices typically used with Python modules.

The culture of the Python community reinforces high standards in documentation. Since the existing documentation is good, code authors expect that they will need to do as well for new contributions. Programmers' efforts to document their code carefully boost Python's strength in documentation even further and attract more practitioners for whom documentation is important. The cycle continues.

You can download standard documentation and use it from local storage. A standard installation of Python on Windows puts a "Python Documentation" shortcut on the desktop, leading to local forms of the same browsable documents just mentioned.

__doc__ magic

More distinctive is the help available from within the Python interpreter. Many Python programmers work within a live Python processing session, composing code "on the fly". In this context, you can interrogate Python objects (including modules, methods, and functions) by asking them about their __doc__ value. In the middle of such a session, you might, for example, wonder about file permissions:

>>> import os
>>> os.access.__doc__
'access(path, mode) -> 1 if granted, 0 otherwise\nTest for access to a file.'

When you first meet __doc__, you may wonder what the point is. It apparently only abbreviates the standard manual entry in a slightly clumsier form. Experienced Pythoneers use __doc__ all the time, though. A simple __doc__ printout often gives just the right information -- the order of parameters, for example -- that otherwise would take many annoying seconds to find in the full documentation.
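The quick lookup described above can be sketched as a small helper. The name first_doc_line is hypothetical, invented here for illustration; inspect.getdoc is the standard-library call that returns an object's docstring with its indentation cleaned up.

```python
import inspect
import os


def first_doc_line(obj):
    """Return the first line of obj's docstring, or '' if it has none."""
    doc = inspect.getdoc(obj)  # like obj.__doc__, with indentation cleaned up
    if not doc:
        return ""
    return doc.splitlines()[0]


if __name__ == "__main__":
    # Summarize a few os functions without leaving the interpreter.
    for name in ("getcwd", "listdir", "access"):
        print(name, "->", first_doc_line(getattr(os, name)))
```

The exact wording printed for each os function depends on the Python version installed, which is one more reason to ask the running interpreter rather than a manual on the shelf.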

__doc__ operates in its own virtuous circle. Because the best Python programmers use it so often, library authors know high-quality __doc__ descriptions will be appreciated. There's also a simple technical trick behind __doc__ maintenance encouraging its proliferation. Most attributes are assigned or, more properly, bound with assignment statements:

thing.attribute = 'This is a description.' 

For the special case of __doc__, though, the docstring convention allows abbreviation: when the first statement of a definition is a string, that string is taken as the __doc__ (or docstring) of the defined object. An example might look like

def phase_of_the_moon():
    """This function returns a slightly randomized
    integer that shuffles data around in a way
    convenient for the XYZ class."""
    # Working code here.
    value = 0  # placeholder for the computed result
    return value

The "This function ..." explanation is phase_of_the_moon.__doc__, the function's docstring.
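To see that the docstring convention really is just an abbreviation for an ordinary attribute binding, consider this minimal sketch. Both function names are hypothetical, chosen only for the comparison.

```python
def with_docstring():
    """Return the answer."""
    return 42


def with_assignment():
    return 42


# Explicit binding, exactly as with any other attribute.
with_assignment.__doc__ = "Return the answer."

# Both routes leave the same string in __doc__.
print(with_docstring.__doc__)
print(with_docstring.__doc__ == with_assignment.__doc__)
```

The first form keeps the description next to the code it describes, which is a large part of why it caught on.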

Building on the __doc__ base

Because most Python programmers write at least minimal docstrings, several useful tools leverage this information. Pytext transforms docstrings into DOM trees -- an XML-related data structure, handy for preparation of indexes, among other uses. Docutils collects more general-purpose machinery to transform a variety of documentation formats. Last month, this site focused attention on one particular module, PyDoc, the purpose of which is to render documentation, and especially docstrings, in various convenient forms.

Among all these tools, the most beautiful in its simplicity and power is doctest. Doctest is a convention and process for embedding simple self-testing examples in docstrings. As doctest's own documentation explains, "The doctest module searches a module's docstrings for text that looks like an interactive Python session, then executes all such sessions to verify they still work exactly as shown." The documentation goes on to present an example definition of a factorial function, whose docstring embeds such interactive exercises as

>>> [factorial(n) for n in range(6)]
[1, 1, 2, 6, 24, 120]
>>> factorial(30)
265252859812191058636308480000000

The definition of factorial is designed for re-use by other Python code, of course. At the same time, by conforming to the doctest format, the definition can be executed in isolation, as a validation of its content. This kind of introspective play makes good language interpreters enormously productive.
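Here is a minimal sketch of the idea, not the factorial definition from doctest's own documentation. The loop-based body and the helper check_examples are assumptions made for illustration; DocTestFinder and DocTestRunner are the standard doctest classes that locate and execute the embedded examples.

```python
import doctest


def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000
    """
    if n < 0:
        raise ValueError("n must be >= 0")
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result


def check_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    runner = doctest.DocTestRunner(verbose=False)
    finder = doctest.DocTestFinder()
    # The examples need the function itself in their namespace.
    for test in finder.find(func, globs={func.__name__: func}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


if __name__ == "__main__":
    print(check_examples(factorial))
```

If someone later edits factorial so that an example no longer matches, the failed count becomes nonzero and the runner reports which example broke.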

Doctest builds on another social convention of the Python community. One of the continuing challenges with languages such as C is how to design and implement unit tests; this is generally regarded as a thorny problem, best left to more senior programmers. Python's proud tradition is that each definition should embed its own unit tests. Python source files often end in the idiom

if __name__ == '__main__':
    # Self-testing code goes here.
    # This is executed only when the source
    #    is run directly as a script, not when
    #    it has been imported as a module.
    pass

This idiom means that the source is available either for re-use as a module by other applications or for immediate self-testing execution. In the case of the doctest example, the self-testing code imports doctest and applies it to the source file itself. This makes for the ultimate in succinctness:

def _test():
    import doctest, example
    return doctest.testmod(example)

if __name__ == "__main__":
    _test()

By building on strong traditions of Python use, these few lines achieve a powerful result in self-validation.

More jewels in documentation management

These themes of automation and strong social expectations about documentation pervade Python programming. Other languages have comparable and even superior technical answers for questions about documentation. Python is essentially unique, though, in the extent to which its programmers fulfill the promise of the tools. Pythondoc is a tool like Javadoc: it generates documentation from source-code comments. This, in turn, depends on observance of the Python style guide.

Python documentation keeps getting better. All the tools and references mentioned here remain in active development. One area now blossoming is "structured text" (ST), a cover term for conventions of formatting docstrings and other textual material. Advances in processing ST aim to make it even easier for novice programmers or nonprogrammers to supply content that benefits from Python's power. Why ST? Briefly, "civilians" can read and write ST. It's less rigorous than XML, for example, and it's far more readable for most humans. For example, ST recognizes paragraphs implicitly:

def something():
    """This is a first paragraph.

    This is a second paragraph.  The intervening
    blank line means this must be a new paragraph."""
    #  ...

Contrast this with HTML, which has an explicit tag, <P>, for paragraphs.
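A rough sketch of how a program might recover those implicit paragraphs and emit the explicit HTML form follows. It is not the algorithm of any actual ST processor; the helpers paragraphs and to_html are hypothetical names, and the only rule assumed is that one or more blank lines separate paragraphs.

```python
import html
import inspect
import re


def something():
    """This is a first paragraph.

    This is a second paragraph.  The intervening
    blank line means this must be a new paragraph.
    """


def paragraphs(text):
    """Split text into paragraphs wherever blank lines occur."""
    cleaned = inspect.cleandoc(text)  # strip docstring indentation
    blocks = re.split(r"\n\s*\n", cleaned)
    # Collapse line breaks and runs of spaces inside each paragraph.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def to_html(text):
    """Wrap each implicit paragraph in an explicit <p> element."""
    return "\n".join("<p>%s</p>" % html.escape(p) for p in paragraphs(text))


if __name__ == "__main__":
    print(to_html(something.__doc__))
```

The author of the docstring types nothing but prose and a blank line; the markup is the program's job.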


There's more to Python than its syntax and reference manuals. To get the most from Python, learn to document "the Python way". Look for the docstrings that other programmers have written; they're likely to hold a lot of meat. Write your own source comments in good style and, especially, include good docstrings in your work. However well you code "raw" source, good supplementary documentation, in traditional Python style, helps others to quickly understand your work.

Cameron Laird is the vice president of Phaseit, Inc. and frequently writes for the O'Reilly Network and other publications.
