

Rethinking Community Documentation
Pages: 1, 2, 3

Problems with Community Documentation

Community documentation is indispensable, and it makes the difference for many people between abandoning a system in frustration and engaging in productive work. Unfortunately, community documentation isn't everything computer users need, or even everything it could be. I divide my critique into three areas: failures of interactive help, failures of writing, and failures of organization.

I follow up with examples of community documentation I like, to show that my goals are achievable. Parts of the following sections build on an earlier article of mine, "Splitting Books Open: Trends in Traditional and Online Technical Documentation."

Failures of Interactive Help

Some interactions on mailing lists are wonderfully effective. A struggling user can learn of a bug fix to download, a document for background reading, or a subtle typo in her input. Everybody goes away happy and gets back to business.

These interactions can also turn into a crutch. Modern computer users need to develop mental models to handle new situations that come along, and quick fixes can prevent that from happening. In fact, interactive help can retard learning.

System administration is particularly at risk. Imagine someone trying to connect a Linux system to a Windows server that has files the Linux user wants to access. The system administrator has configured the Linux system to grant access by treating the remote folder as a Linux filesystem, but she gets only an error message saying the filesystem type is incorrect.

The documentation indicates that the requested type is indeed correct, so now the system administrator has to take the error message as grist for further study. If she has the background knowledge to step through some tests, she will soon realize that the Linux system does not support the filesystem type as installed, and needs a special module to be loaded. For various reasons (such as the desire to conserve space and some legal uncertainty about Microsoft's tolerance for this kind of Windows-compatible software), this module is not present by default.

A better error message might have been "No support for filesystem type" or "No module found for this filesystem," but software itself is limited in diagnostic capabilities and knowledge of the environment. It's really up to a system administrator to keep hold of the big picture.

By stepping through a diagnostic inquiry, the system administrator can learn something about how Linux is put together, something about how filesystems work, and something about licensing controversies.
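
The first step of such an inquiry can be sketched in a few lines. This is only an illustration, and it rests on assumptions that are not part of the scenario above: it relies on Linux listing the filesystem types the kernel currently has registered in /proc/filesystems, and it uses "cifs" as a stand-in for the unnamed Windows-compatible filesystem type. The function names are invented for the example.

```python
from pathlib import Path


def registered_filesystems(proc_text: str) -> set[str]:
    """Parse the contents of /proc/filesystems into a set of type names.

    Each line holds an optional "nodev" flag followed by the type name,
    so the name is always the last whitespace-separated field.
    """
    names = set()
    for line in proc_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[-1])
    return names


def diagnose(fs_type: str, proc_text: str) -> str:
    """Return a more helpful message than "filesystem type is incorrect"."""
    if fs_type in registered_filesystems(proc_text):
        return f"Kernel supports '{fs_type}'; look elsewhere for the problem."
    return (f"No support for filesystem type '{fs_type}'; "
            "a kernel module may need to be loaded.")


if __name__ == "__main__":
    proc = Path("/proc/filesystems")
    # "cifs" is an assumed example; substitute the type from your mount command.
    if proc.exists():
        print(diagnose("cifs", proc.read_text()))
    else:
        print("/proc/filesystems not found; this sketch applies to Linux only.")
```

A check like this answers only one question, whether the kernel knows the filesystem type right now. Working out why the module is missing, and whether it is safe to load, still takes the background knowledge described above.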

Now suppose she had just reported her error message to a mailing list? A helpful user would probably have told her to load the right module, with or without further background.

This is "give a man a fish" behavior. The recipient of the information will be better off the next time she faces the exact same problem, but has lost an opportunity to practice skills she needs as a system administrator.

Where do computer users most often miss out? Some types of knowledge may be amenable to learning in dribs and drabs. Certain other subjects are deep and require a holistic approach. These include:

Security
Security doesn't consist of installing all the patches from the vendor. It consists of an integrated approach to policies, risk assessment, and the disciplined monitoring of systems.
Performance tuning
Few optimizations can be made in isolation. The term "tuning" is quite appropriate here, because performance tuning is like tuning a keyboard instrument. On the keyboard, it's easy to tune two strings to a perfect fifth. Yet if you go around the strings tuning them all to perfect fifths, you'll never be in tune. Tuning takes a sophisticated, nuanced approach--and different tunings are appropriate for different time periods and pieces.
Troubleshooting
As my example of Linux/Windows filesharing suggested earlier, you need a broad understanding of many different levels of a system to do problem-solving.
Robust programming
In any programming language, bad habits are easy to fall into, and they come back to torment you later.
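
The arithmetic behind the keyboard analogy in the performance tuning item can be checked directly. As a rough sketch, assuming the conventional frequency ratios of 3:2 for a pure perfect fifth and 2:1 for an octave (figures not given above): stacking twelve pure fifths ought to land on the same note as seven octaves, but it overshoots slightly.

```python
from fractions import Fraction
import math

FIFTH = Fraction(3, 2)   # frequency ratio of a pure perfect fifth (assumed)
OCTAVE = Fraction(2, 1)  # frequency ratio of an octave (assumed)

# Twelve fifths pass through all twelve keys and should close the circle
# at seven octaves above the starting note.
twelve_fifths = FIFTH ** 12
seven_octaves = OCTAVE ** 7

# The leftover ratio is the mismatch that some tuning scheme must absorb.
mismatch = twelve_fifths / seven_octaves
mismatch_cents = 1200 * math.log2(mismatch)  # 1200 cents make up one octave

print(mismatch)                  # exact ratio of the overshoot
print(round(mismatch_cents, 2))  # size of the gap in cents
```

Because the mismatch is not exactly 1, twelve pure fifths cannot agree with pure octaves. Narrowing one interval to compensate moves the error somewhere else, much as an optimization in one part of a system tends to shift load to another part.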

With this longer view in mind, it's worrisome that a lot of advice given on mailing lists is unsystematic. Not, "This is why your system is failing, and here's how to fix it," but "Gee, I had a similar problem a year ago, and when I did the following it worked." The generous donor may, unfortunately, be setting up the recipient of the advice for future failure.

Interactive help can also be highly inefficient. Often I have seen someone out of his depth returning to a mailing list over and over for advice doled out in inadequate quantities by list members. Pointers to more complete documentation are hard to offer, because they tend to sound like a dismissive and insulting "Read the eff-ing manual." And even when a pointer is offered, the community documentation might not meet the user's needs.

Failures of Writing

My 13 years as an editor have shown innumerable times how difficult it is to write effective explanations of technical topics. Most would-be authors need intensive mentoring to write at a level the reader can understand, and people doing community documentation usually lack the time to even try.

Everyone suffers from a bad turn of phrase now and then, or forgets to define a term before using it. These problems--along with grammar and spelling errors--are fixable by copy editors, who are readily available on a freelance basis. Many readers can also help resolve these lapses. I consider this an easy problem, and won't discuss it any further.

What interests me are the conceptual problems that copy editors cannot find or fix. These lie with the intended use of the document, not its readability. Such problems lead to the common complaints "This was too abstract," or "What am I supposed to learn from this?" Regardless of the endless manifestations of these sins, I think that most fall into two categories: approaching an explanation at the wrong level and drowning the reader in details.

Approaching an Explanation at the Wrong Level

Many authors have learned the system at one level--particularly if they are developers--and cannot move mentally to the level of the reader. Take a trivial case as an example: configuring an automatic login.

Suppose an author documents the purpose of automatic logins with:

If you configure your system for an automatic login, your initial screen will come up as soon as the computer starts, without any request for user name and password.

That text is perfectly understandable to most computer users, and would likely pass muster during a copy edit. It might even look useful to someone evaluating the document.

However, the passage is useless, because it is tautological. The term "automatic login" already conveys everything the passage says. The text adds nothing to the reader's comprehension.

A useful discussion of automatic logins would deal with the fundamental purpose of the user name and password prompts: security. The discussion must focus on one key point: by configuring automatic login, you allow anyone with physical access to your system to log in as you. A secondary point is that the feature is predicated on the assumption you are the only user who will want to use the system.

In some situations, this approach is reasonable. For a laptop, for instance, you may be the sole user, and if it is stolen (as we know all too well from recently publicized incidents) the data is readily available even to someone who can't log in. The recommended security approach for a laptop is an encrypted filesystem, and requiring a user name and password is just an annoyance that makes extra work for the legitimate user.

A lot of authors have a hard time investigating which aspects of technology interest its users. I consider the problem to lie with differing levels.

Useful documentation normally starts at a very high level, with goals of the readers, and then descends into the system operations that meet those goals. Because the hardware and software reflect the lower levels more directly, and because a knowledgeable author is comfortable at those levels, the higher levels are the hardest parts of the document to write. Yet these are the introductory paragraphs that the reader should see first! It's amazing that readers persist as often as they do to read the documents, despite incomprehensible or missing introductions.

Drowning the Reader in Details

Related to the previous problem is the proliferation of guides that jump into details too quickly. Open most technical magazines or books for computer users, and you find lists of tasks: "load a disk ... build some software on it ... enter a command to make the operating system recognize it ... give it a name ...."

Supposedly, following the directions to a T gives you a functioning system. Usually, your environment differs in some subtle way from the author's environment, so your attempt to follow the directions fails. Even if you do get the system working today, it may fail tomorrow when you log back in.

Will background documents help? Finding them may be hard. Just because a background document describes a system doesn't mean you can make use of that explanation. There may be a crucial link that you can't make between the background you're reading and the particular task you're trying to solve.

One software project, a new game development platform called Volity that I described in a recent blog post, has extensive, well-written documentation. Like most projects, Volity depends heavily on software developed elsewhere. Its documentation properly points to background documentation for each of those systems. The background for those systems does not, obviously, explain the relation between Volity and those other systems. The Volity website fills that gap with several web pages. All software projects call for that level of investment in documentation.

Failures of Organization

One nice thing about books is their linear arrangement. However, many people say they don't read technical books in order. Our reading technology implicitly embeds this behavior; if we always read things in order, we'd have scrolls instead of books.

When I'm in the middle of getting something to work on my system, I notice an interesting pattern in how I read a book. I think other people do something similar.

I usually skim the book all the way through to get a sense of how things fit together, but when I want to accomplish some task, I just flip to an example or step-by-step procedure (yes, the kind I ridiculed in the previous section) and try it out, with any changes that seem appropriate.

If the procedure works, I'll congratulate myself as a clever fellow, without feeling guilty that I haven't invested in a deep investigation of the system. If the procedure doesn't work, I'll look back at nearby descriptions and try to find out what was missing in my rendition of the procedure. If that doesn't work, I may return to more basic material, perhaps in a different chapter. Essentially, I read the book backward.

Backward reading is an ineffective way to process material the first time you see it, but it may be very effective when you're applying the material to concrete tasks. In fact, such a learning style may be mandatory. Skim some material, try to apply it, struggle a bit, and read the relevant sections again.

Perhaps the step-by-step documentation could work after all, if it had background links. People could write step-by-step documentation and background material at different times. To tie it all together, the Web provides hypertext. Such a system is a flexible way of organizing documentation that allows authors to write in their spare time, and the resulting chunks are closer in size to what readers like to read at a single sitting.
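
One way to picture that arrangement is a small data model in which each step of a procedure carries links to background documents, so a reader whose step fails can work backward from it. This is a hypothetical sketch: the class names, step text, and document titles are invented for illustration and don't describe any existing documentation system.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    instruction: str
    # Titles of background documents that explain why this step works.
    background: list[str] = field(default_factory=list)


@dataclass
class Procedure:
    title: str
    steps: list[Step]

    def background_for_failure(self, failed_index: int) -> list[str]:
        """List background reading for a failed step, nearest material first.

        Mirrors "backward reading": start with the failed step's own links,
        then walk back through earlier steps, skipping duplicates.
        """
        seen: list[str] = []
        for step in reversed(self.steps[: failed_index + 1]):
            for doc in step.background:
                if doc not in seen:
                    seen.append(doc)
        return seen


# Hypothetical procedure and document titles.
mount_share = Procedure(
    title="Mount a remote folder",
    steps=[
        Step("Load the filesystem module", ["How kernel modules work"]),
        Step("Create a mount point", ["How filesystems work"]),
        Step("Run the mount command",
             ["Filesystem types", "How filesystems work"]),
    ],
)

print(mount_share.background_for_failure(2))
```

The steps and the background documents remain separate pieces that could be written at different times by different people. Only the links tie them together, which is also why those links need upkeep as documents accumulate.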

The thread through the various links can become pretty tangled, though. Over time, more and more documents on each topic build up. Because the price of disk space has fallen so much, nobody feels an urgency to remove outdated documents. It's hard to tell what you really need, and hard to know whether a document will help you.

Nature magazine sparked intense debate in December 2005 with its study claiming that Wikipedia is almost as accurate as the iconic Encyclopedia Britannica. (Another way to summarize the claim is that the hyper-professional Encyclopedia Britannica contains almost as high a rate of errors as the freely donated work in Wikipedia.) Expert opinion differs on the accuracy of the study, but one claim in it has largely escaped discussion: Nature magazine found that Encyclopedia Britannica articles were better organized and easier to find information in.

This suggests that, beyond the question of accuracy, the value of documentation increases with a concentrated, consistent authorial voice--and some editing wouldn't hurt either. Technical documents on the Web tend to be the product of a single author, more so than Wikipedia articles, but collections of multiple documents, taken as a whole, tend to suffer from problems of organization, chosen audience, and tone.

Community Documentation I Like

What I'm asking for is not impossible. Occasionally I come across a superb example of documentation produced for the community. In addition to the Volity site mentioned previously, two examples are the NFS How-To and the Linux Sound How-To, both from the Linux Documentation Project.

NFS (Network File System) is one of the earliest ways to share files between computer systems in the manner I mentioned earlier. The Linux NFS How-To has had no revision in four years, but this is no drawback because NFS is a stable system (some people would call it a legacy system). The authors did it right the first time. The how-to compares favorably to commercial documentation on the topic.

The language is quite professional throughout, while remaining conversational and easy to scan for the information you want. Sections have logical organization, and titles bring you to the information appropriate for your setup.

The document's introductory sections lay out the goals of the paper, the software needed to get NFS working, and the knowledge the reader is expected to bring to the project; there is also a reference to another how-to.

Numerous warnings about special cases show that the authors (probably, I'm guessing, with input from readers) have an intimate grasp of real-life use. Security, performance, and troubleshooting--the areas of system administration I indicate as requiring a holistic and deep understanding--have their own in-depth sections, with hints scattered throughout the document.

The Linux Sound How-To is even older than the NFS How-To, but remains a model for how to deliver complicated information in an easily digestible form. The author, Jeff Tranter, has provided me with material on related subjects for O'Reilly books, and he brings his top-notch professionalism and care to this free document as well.

Several aspects of computer sound systems make documentation difficult. First, they require the cooperation of software at many different levels, from the kernel and device drivers up to the particular utilities invoked by users. Second, some Linux installations do more for the user than others. On one installation, everything may be in place; on another, the user has to carry out a lot of manual installation.

A wide range of activities are related to sound (from playing CDs and files in various formats to doing professional sound editing); in this how-to Tranter just explains how to get audio working. Starting with the sound card and moving up through the layers of the system, he helps the reader find out when the Linux system has done the setup work and how to repair the situation when it hasn't. There is a long trouble-shooting section in the form of a Frequently Asked Questions list.
