When we last discussed bugs, we talked about essential no-frills tactics like triage, prioritization, and naked co-ed blindfolded palm reading (a lie to get you to look at part 1). But here in part 2 we're moving up. Now the stakes are higher: more investment, but greater returns.
How do you know when you're done working on something? Some things, like chocolate chip cookies and lasagna, show on the surface that they're done. But more complex things, like software, aren't as transparent (or as tasty). You need to plan out, in advance, how you'll know when you're done. If you don't do this, you'll spend hours arguing over whether the code is done enough or not. If you're smart and take the time early to define exit criteria, you'll set up your team to spend those hours coding instead of arguing.
An exit criterion is anything that must be true before the software can be called complete. The simplest exit criterion is time: you state an exact date and time when work will be finished, regardless of how many bugs or how much unfinished work remains. This criterion is easy to follow, but says nothing about quality. You can meet the goal of shipping in a week whether you have 10 percent of the features working or 90 percent. Unless that date is built from high-quality, historically informed estimates, it's just an arbitrary unit of time. Another obvious, and also bad, exit criterion is opinion. Your VP can be a living exit criterion, only allowing the software out the door when she feels it's ready. Even with a brilliant VP, this dependence is bad: if she gets hit by a bus, your project's fate is tied to hers. You need something written down to accelerate communication; if the criteria stay in someone's head, there will be a long line of people outside their door.
Real exit criteria focus on ways to measure quality. How many defects, of what type, for what parts of the software, are acceptable? If you've been following along from part 1, you know about bug priority and severity, two easy ways to categorize defects. But what's still undefined is which areas of the project should be prioritized over others. Do all priority 1 bugs need to be fixed? Or only all priority 1 bugs for certain important areas? If you don't know, your team doesn't know (and you'll be arguing over each bug). It's an act of leadership to define exit criteria. It requires thought about the project as a whole and when it's done right it has a catalyzing effect on the entire organization.
The key question is this: How many defects, of which type, for which areas, are acceptable? Here are some guiding questions to help:
What did you do last time? Have a baseline of numbers from the previous release. It's arbitrary, yes, but it does give you a point of reference. You can decide to raise or lower quality comparatively and have everyone know exactly what you are talking about. If you've never measured this before, make up the goals (yes, that's right). Keep track of final decisions so you're set up for version 2.0. Sometimes you can look at the quality bar set by a competitor and calibrate.
Which areas are most important? Stack rank the areas/features of the current project. Some matter more to customers, and higher quality in those areas has more value. You can set one level of exit criteria for features A, B, and C, and a second, lower set for D, E, and F. Resources are zero sum: your best use of them is on the features used most often.
What test cases are you using? Exit criteria do not have to focus on bug counts. If you are using test cases for each feature you can define exit criteria on the percentage of test cases passed. If you don't have any test cases (http://en.wikipedia.org/wiki/Test_case), now is the time to make them. They encapsulate many different quality criteria into a single, often automatable, test. If you create them early, they will not only help with exit criteria, but will focus the development work as it's done. Each check-in will be done against those test cases, warning you early when there are big problems.
What performance metrics must be met? Are there attributes of performance that you care about: load time, save time, download time? Focus your metrics on user-observable kinds of performance; that way you'll be sure to fix the things that impact customers. Performance bugs are often the most frustrating for customers, and tend to require complex work to fix (hint: you want to identify them early). If you have no idea what your performance numbers look like, reread the first point: get a baseline so next time you'll be able to make intelligent decisions.
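To make these questions concrete, exit criteria can be written down as checkable data rather than prose. The sketch below is a minimal illustration, with made-up area names and thresholds (in practice the open-bug counts and test pass rates would come from your bug tracker and test runs). The point is that "done" becomes a mechanical check instead of an argument.

```python
# Hypothetical open bug counts by area and priority (normally pulled
# from your bug tracking system).
open_bugs = {
    "printing": {1: 0, 2: 3, 3: 12},
    "file-import": {1: 1, 2: 5, 3: 20},
}

# Exit criteria per area: maximum allowed open bugs at each priority,
# plus a minimum test pass rate. Priorities without a limit are ignored.
criteria = {
    "printing": {"max_open": {1: 0, 2: 5}, "min_test_pass_rate": 0.95},
    "file-import": {"max_open": {1: 0, 2: 10}, "min_test_pass_rate": 0.90},
}

# Hypothetical current test pass rates per area.
test_pass_rates = {"printing": 0.97, "file-import": 0.92}

def unmet_criteria(open_bugs, criteria, test_pass_rates):
    """Return a list of human-readable reasons the project is not yet done."""
    reasons = []
    for area, rules in criteria.items():
        for priority, limit in rules["max_open"].items():
            count = open_bugs.get(area, {}).get(priority, 0)
            if count > limit:
                reasons.append(f"{area}: {count} open P{priority} bugs (limit {limit})")
        rate = test_pass_rates.get(area, 0.0)
        if rate < rules["min_test_pass_rate"]:
            reasons.append(f"{area}: test pass rate {rate:.0%} below {rules['min_test_pass_rate']:.0%}")
    return reasons

print(unmet_criteria(open_bugs, criteria, test_pass_rates))
```

In this example the only blocker reported is the single open priority 1 bug in file-import; everything else is within its limits. An empty list means the criteria are met and the argument is over before it starts.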
Any exit criteria you create should be part of any work specification. Each feature or design should include a section explaining how the team can verify that the work is finished. Write down what test cases must be passed, what performance numbers must be reached, what usability metrics must be achieved, or what kinds of bugs must be fixed. If you can't write it down, you haven't thought hard enough about the feature you're building, or the customer you're making it for.
To keep exit criteria simple, you might want to have two sets: one baseline set that applies to the entire project, and additional criteria that are feature-specific. With this approach, the only features that get customized exit criteria are ones that have higher or lower quality than the rest of the project. (Slacker hint: this is less work, and the burden for the first set can be given to one senior person).
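The two-set approach maps naturally onto a baseline plus per-feature overrides. This is a sketch with hypothetical feature names and numbers: only features whose bar is higher or lower than the project's need an entry, and everything else inherits the baseline.

```python
# Project-wide baseline exit criteria (hypothetical numbers).
baseline = {"max_open_p1": 0, "max_open_p2": 10, "min_test_pass_rate": 0.90}

# Feature-specific overrides: only features that differ from the
# baseline appear here.
overrides = {
    "payments": {"max_open_p2": 0, "min_test_pass_rate": 0.99},  # raise the bar
    "easter-egg": {"min_test_pass_rate": 0.50},                  # lower the bar
}

def criteria_for(feature):
    """Baseline criteria with any feature-specific overrides applied on top."""
    return {**baseline, **overrides.get(feature, {})}
```

Asking for `criteria_for("payments")` yields the stricter merged set, while a feature with no entry, say `criteria_for("search")`, simply gets the baseline. The senior person who owns the baseline touches one dictionary; feature owners only argue about their own overrides.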
Remember: you can always adjust exit criteria during a project. You're not going to get it all right out of the gate. But by putting them in place you both clarify for the team that quality is important and give them tools for guiding their work towards a quality outcome. It's fine if the criteria are periodically improved (not simply changed) based on discussions with customers and the team. Even if arguments arise when these changes are discussed, people will be arguing about the project's (and customer's) view of quality, which is a more effective level to debate at than individual bugs.
Now you are ready to turn the corner on bugs. If you have the first 3 levels going it's time to step out of tactics and move into strategy. To get above mediocrity in anything you have to find ways to spend your time on advanced problems, not the basics. Mastering the low-level tactics buys you just enough time to play lookout for your project and plan for problems before they become serious. So if any of the above has helped your team (or yourself) to become more efficient, carve out some of that earned time for thinking about the next few days, weeks, or months. You want to look for more fundamental things you can do to improve your team's ability to deal with bugs.
The simplest kind of early planning is to look back at your last project (or the last week of the current project). Ask yourself, and others, what didn't go well that is still hurting. Are there bad habits, poor tools, or common miscommunications? Compile a list of areas for improvement and put them in a rough order of urgency or value.
Remedies for these problems may involve buying better bug tracking tools, clarifying who has what triage authority, training your team in better engineering techniques, or improving the way bugs are tracked and managed. It might make sense to have a person dedicated to communicating with customers about bugs, or for you to define a checklist to help customers report issues, raising the quality of bugs as they're entered into the system. A common way to deliver on many of these ideas is to dedicate a full-time person to the role of quality assurance, or if you already have one, to provide them with more resources to do their job (http://www.macdevcenter.com/pub/a/mac/2005/07/08/dev_team.html). Another approach is to focus on the early definition of projects. If you can hire a designer or usability engineer to help organize the project from the outset, prioritizing the things that matter most, you can minimize the unneeded features and development work from the start.
If you do develop a list of areas to improve, stack rank it. Put it in the order of the greatest value to you and your project, and then commit to the first one--not the first five, not the first ten, just the first one. Until you've proven to yourself and your team that everyone is capable of making positive changes, you want everyone focused on making that one change work. Help everyone to rally around improving that one thing, including involving them in suggesting improvements and in carrying them out. Make sure to give yourself an exit criteria for the change. How will you know that the change has had the impact you wanted?
There are some special cases where you can skip the advice mentioned in this two-part essay. If you've been waiting to pounce on me for some horrific oversight, hopefully I've beaten you to it.
Low-hanging fruit: Sometimes a programmer can knock off low-priority bugs while they're in the right part of the code for a high-priority bug. In this situation it's okay to fix bugs out of priority order. Let them use their discretion. Opportunistic bug fixing just makes sense.
Morale: fixing hated or visible bugs, regardless of priority, can build pride and raise morale. It's okay to do this if you know that's why you're doing it. If you do this often, change your criteria for what priority 1, 2, and 3 mean to reflect visibility. If you keep bending a rule, change the rule.
On the clock / off the clock: Make sure programmers are clear on use of their own time vs. clocked time. It may be fine for them to fix low priority bugs or work that interests them if they're doing it off the clock. And make sure everyone is clear on whether or not this kind of work counts towards bonuses.
Bug fix quality: Bug fixing is like any other kind of work: the more time you spend on it, the higher the quality of the fix. As a rule of thumb, the fix should match the quality of the code it's being checked into. You wouldn't put gold hinges on a plastic door unless you had a really good reason.
Who should be involved in triage (Level 1)? Whoever yells loudest. Failing that, it's the programmer, tester, and project manager. If one person can lead triage and filter out duplicates or garbage bugs, that makes sense: don't waste three people's time doing what one person can do alone. If you have a representative from the customer on site, ask them to participate as well (but be willing to translate the bugs into their terms). All specialists (documentation, usability, marketing) should be invited, but it's often the best use of everyone's time to just have them available by phone if their input is needed. Let them know which bugs are going to be covered so they can show up if they care about particular ones.
What if I can't get my boss to agree to setting exit criteria (Level 3)? Study hypnosis. Okay, do it for one part of the project as a pilot. Get the individuals who work on that area to work with you and contribute (or at least to understand what you're doing). Do it alone if you have to. Then when the project is over, review with the boss whether your use of exit criteria was helpful. If it was, he'll likely want to use it elsewhere. If it wasn't helpful, why would you expect him to agree? Repeat until you find something, exit criteria or not, that helps.
What is your worst experience with defect management? Well, there is the story of the QA manager who finished defining the exit criteria two days before we shipped. He seemed happy about his progress, but looking at what he wrote, I wasn't surprised to find that he'd documented the current build. You can guess what the quality of that release was like (it wasn't his fault: the VP pretended to delegate exit criteria, but really just kept it in his head).
Why isn't defect management taught as part of most computer science degree programs? Perhaps their curriculum planning is defective? To be fair, there's only so much time even in four years. I suspect that, from an academic perspective, defects and bug fixing are more about software production and the craft of programming than about understanding the core concepts of computer science. At a trade school, where the focus is heavily skewed towards employment, more coverage is given to production. And some four year schools do offer electives in quality assurance and debugging techniques.
Software testing: This is a big subject and I've given you a quick, dirty, and skewed introduction. Start here: http://en.wikipedia.org/wiki/Software_testing.
Painless bug tracking: An excellent essay from Joel on simple bug tracking.
Examples of bug triage/prioritization: Here's a sampling of how different organizations handle triage. Diversity in these links is intentional.
Software quality assurance FAQ.
In April 2005, O'Reilly Media, Inc., released The Art of Project Management.
Chapter 3: How to figure out what to do (PDF) is available free online.
For more information, or to order the book, click here.
Scott Berkun is the best-selling author of Confessions of a Public Speaker, The Myths of Innovation, and Making Things Happen. His work as a writer and public speaker has appeared in The Washington Post, The New York Times, Wired Magazine, Fast Company, Forbes Magazine, and other media. His many popular essays and entertaining lectures can be found for free on his blog.
Copyright © 2009 O'Reilly Media, Inc.