ONLamp.com
oreilly.comSafari Books Online.Conferences.

advertisement


Myths Open Source Developers Tell Ourselves
Pages: 1, 2

It's Better to Start from Scratch

Myth: Bad or unappealing code or projects should be thrown away completely.



Reality: Solving the same simple problems again and again wastes time that could be applied to solving new, larger problems.

Writing maintainable code is important. Perhaps it's the most important practice of software development. It's secondary, though, to solving a problem. While you should strive for clean, well-tested, and well-designed code that's reasonably easy to modify as you add features, it's even more important that your code actually works.

Throwing away working code is usually a mistake. This applies to functions and libraries as well as entire programs. Sometimes it seems as if most of the effort in writing open source software goes to creating simple text editors, weblogs, and IRC clients that will never attract more than a handful of users.

Many codebases are hard to read. It's hard to justify throwing away the things the code does well, though. Software isn't physical — it's relatively easy to change, even at the design level. It's not a building, where deciding to build four stories instead of two means digging up the entire foundation and starting over. Chances are, you've already solved several problems that you'd need to rediscover, reconsider, re-code, and re-debug if you threw everything away.

Every new line of code you write has potential bugs. You will spend time debugging them. Though discipline (such as test-driven development, continual code review, and working in small steps) mitigates the effects, they don't compare in effectiveness to working on already-debugged, already-tested, and already-reviewed code.

Too much effort is spent rewriting the simple things and not enough effort is spent reusing existing code. That doesn't mean you have to put up with bad (or simply different) ideas in the existing code. Clean them up as you go along. It's usually faster to refine code into something great than to wait for it to spring fully formed and perfect from your mind.

This isn't a myth because rewriting bad code is wrong. It's a myth because it can be much easier to reuse and to refactor code than to replace it wholesale.

Programs Suck; Frameworks Rule!

Myth: It's better to provide a framework for lots of people to solve lots of problems than to solve only one problem well.

Reality: It's really hard to write a good framework unless you're already using it to solve at least one real problem.

Which is better, writing a library for one specific project or writing a library that lots of projects can use?

Software developers have long pursued abstraction and reuse. These twin goals have driven the adoption of structured programming, object orientation, and modern aspects and traits, though not exactly to roaring successes. Whether proprietary code, patent encumbrances, or not-invented-here stubbornness, there may be more people producing "reusable" code than actually reusing code.

Part of the problem is that it's more glamorous (in the delusive sense of the word) to solve a huge problem. Compare "Wouldn't it be nice if people had a fast, cross-platform engine that could handle any kind of 3D game, from shooter to multiplayer RPG to adventure?" to "Wouldn't it be nice to have a simple but fun open source shooter?"

Big ambitions, while laudable, have at least two drawbacks. First, big goals make for big projects — projects that need more resources than you may have. Can you draw in enough people to spend dozens of man-years on a project, especially as that project only makes it possible to spend more time making the actual game? Can you keep the whole project in your head?

Second, it's exceedingly difficult to know what is useful and good in a framework unless you're actually using it. Is one particular function call awkward? Does it take more setup work than you need? Have you optimized for the wrong ideas?

Curiously, some of the most portable and flexible open source projects today started out deliberately small. The Linux kernel originally ran only on x86 processors. It's now impressively portable, from embedded processors to mainframes and super-computer clusters. The architecture-dependent portions of the code tend to be small. Code reuse in the kernel grew out of refining the design over time.

Solve your real problem first. Generalize after you have working code. Repeat. This kind of reuse is opportunistic.

This isn't a myth because frameworks are bad. This is a myth because it's amazingly difficult to know what every project of a type will need until you have at least one working project of that type.

I'll Do it Right *This* Time

Myth: Even though your previous code was buggy, undocumented, hard to maintain, or slow, your next attempt will be perfect.

Reality: If you weren't disciplined then, why would you be disciplined now?

Widespread Internet connectivity and adoption of Free and Open programming languages and tools make it easy to distribute code. On one hand, this lowers the barriers for people to contribute to open source software. On the other hand, the ease of distribution makes finding errors less crucial. This article has been copyedited, but not to the same degree as a print book; it's very easy to make corrections on the Web.

It's very easy to put out code that works, though it's buggy, undocumented, slow, or hard to maintain. Of course, imperfect code that solves a problem is much better than perfect code that doesn't exist. It's OK (and even commendable) to release code with limitations, as long as you're honest about its limitations — though you should remove the ones that don't make sense.

The problem is putting out bad code knowingly, expecting that you'll fix it later. You probably won't. Don't keep bad code around. Fix it or throw it away.

This may seem to contradict the idea of not rewriting code from scratch. In conjunction, though, both ideas summarize to the rule of "Know what's worth keeping." It's OK to write quick and dirty code to figure out a problem. Just don't distribute it. Clean it up first.

Develop good coding habits. Training yourself to write clean, sensible, and well-tested code takes time. Practice on all code you write. Getting out of the habit is, unfortunately, very easy.

If you find yourself needing to rewrite code before you publish it, take notes on what you improve. If a maintainer rejects a patch over cleanliness issues, ask the project for suggestions to improve your next attempt. (If you're the maintainer, set some guidelines and spend some time coaching people along as an investment. If it doesn't immediately pay off to your project, it may help along other projects.) The opportunity for code review is a prime benefit of participating in open source development. Take advantage of it.

This isn't a myth because it's impossible to improve your coding habits. This is a myth because too few developers actually have concrete, sensible plans to improve.

Warnings Are OK

Myth: Warnings are just warnings. They're not errors and no one really cares about them.

Reality: Warnings can hide real problems, especially if you get used to them.

It's difficult to design a truly generic language, compiler, or library partially because it's impossible to imagine all of its potential uses. The same rule applies to reporting warnings. While you can detect some dangerous or nonsensical conditions, it's possible that users who really know what they are doing should be able to bypass those warnings. In effect, it's sometimes very useful to be able to say, "I realize this is a nasty hack, but I'm willing to put up with the consequences in this one situation."

Other times, what you consider a warnable or exceptional condition may not be worth mentioning in another context. Of course, the developer using the tool could just ignore the warnings, especially if they're nonfatal and are easily shunted off elsewhere (even if it is /dev/null). This is a problem.

When the "low oil pressure" or "low battery" light comes on in a car, the proper response is to make sure that everything is running well. It's possible that the light or a sensor is malfunctioning, but ignoring the real problem — whether bad light or oil leak — may exacerbate further problems. If you assume that the light has malfunctioned but never replace it, how will you know if you're really out of oil?

Similarly, an error log filled with trivial, fixable warnings may hide serious problems. Any well-designed tool generates warnings for a good reason: you're doing something suspicious.

When possible, purge all warnings from your code. If you expect a warning to occur — and if you have a good reason for it — disable it in the narrowest possible scope. If it's generated by something the user does and if the user is privy to the warning, make it clear how to avoid that condition.

Running a program that spews crazy font configuration questions and null widget access messages to the console is noisy and incredibly useless to anyone who'd rather run your software than fix your mess. Besides that, it's much easier to dig through error logs that only track real bugs and failures. Anything that makes it easier to find and fix bugs is nice.

This isn't a myth because people really ignore warnings. It's a myth because too few people take the effort to clean them up.

End Users Love Tracking CVS

Myth: Users don't mind upgrading to the latest version from CVS for a bugfix or a long-awaited feature.

Reality: If it's difficult for you to provide important bugfixes for previous releases, your CVS tree probably isn't very stable.

It's tricky to stay abreast of a project's latest development sources. Not only do you have to keep track of the latest check-ins, you may have to guess when things are likely to spend more time working than crashing and build binaries right then. You can waste a lot of time watching compiles fail. That's not much fun for a developer. It's even less exciting for someone who just wants to use the software.

Building software from CVS also likely means bypassing your distribution's usual package manager. That can get tangled very quickly. Try to keep required libraries up-to-date for only two applications you compiled on your own for awhile. You'll gain a new appreciation for people who make and test packages.

There are two main solutions to this trouble.

First, keep your main development sources stable and releasable. It should be possible for a dedicated user (or, at least, a package maintainer for a distribution) to check out the current development sources and build a working program with reasonable effort. This is also in your best interests as a developer: the easier the build and the fewer compile, build, and installation errors you allow to persist, the easier it is for existing developers to continue their work and for new developers to start their work.

Second, release your code regularly. Backport fixes if you have to fix really important bugs between releases; that's why tags and branches exist in CVS. This is much easier if you keep your code stable and releasable. Though there's no guarantee users will update every release, working on a couple of features per release tends to be easier anyway.

This isn't a myth because developers believe that development moves too fast for snapshots. It's a myth because developers aren't putting out smaller, more stable, more frequent releases.

Common Sense Conclusions

Again, these aren't myths because they're never true. There are good reasons to have a feature freeze. There are good reasons to invite new developers to get to know a project by looking through small or simple bug reports. Sometimes, it does make sense to write a framework. They're just not always true.

It's always worth examining why you do what you do. What prevents you from releasing a new stable version every month or two? Can you solve that problem? Solve it. Would building up a good test suite help you cut your bug rates? Build it. Can you refactor a scary piece of code into something saner in a series of small steps? Refactor it.

Making your source code available to the world doesn't make all of the problems of software development go away. You still need discipline, intelligence, and sometimes, creative solutions to weird problems. Fortunately, open source developers have more options. Not only can we work with smart people from all over the world, we have the opportunity to watch other projects solve problems well (and, occasionally, poorly).

Learn from their examples, not just their code.

chromatic manages Onyx Neon Press, an independent publisher.


Return to ONLamp.com.



Sponsored by: