Bugs part 2: Towards perfection

There is an expectation placed on developers – and pretty much everyone else – that they must strive towards perfection. Failure is not an option. There’s a general attitude that anyone who makes mistakes is not good enough. When mistakes are made, heads should roll.

There’s a nasty cycle where mistakes are ignored or at least forgotten about. First, we all assume that we are good at what we do. When mistakes (inevitably) happen, we therefore assume that they’re someone else’s fault. If the system does not behave the way the customer expects it to, we assume it was poorly specified. If a bug makes it to production, we assume that the tester failed. Even if it’s decided that the bug is our fault and our fault alone, we assume it was a one-off mistake – a bad code day or whatever. Or else we assume that it’s the sort of mistake that we’d have made a year or two ago but we’re too experienced to let that sort of thing happen now. Either way, we assume it won’t happen again. Because we’re good at what we do.

The big problem is that we assume that we’re perfect. We’re not. There’s a simple test: Have you ever made a mistake? If the answer’s yes, you’re not perfect. If the answer’s no, you’re a liar. And not perfect.

Let’s also ask: if we don’t consider ourselves to be perfect, how can we expect others to be? We all make mistakes. The spec writer will miss something important. The tester will miss critical bugs. And the developer will put those bugs in in the first place. We need to do our job – whatever it is – assuming that we will make mistakes and that others will too. The clever bit is not simply wishing that we didn’t. The clever bit is doing something about it.

I’ve found that most developers can be categorised according to how they react to discovering their own mistakes. While the error rate of a developer is a useful indicator of ability, a better one is how they react when they make a mistake.

The easiest response is to simply ignore it. This is obviously bad. Unfortunately, I’ve seen this. Developers discover problems with their own or others’ work and ignore them. It’s understandable though – I went into the reasons for this in my previous post. I would say that this is unacceptable, even for a relatively inexperienced developer. If you find that a developer never adds entries to the bug tracking software, there could be a problem. Unless you have a fundamentally lazy developer though, this should be a simple training issue. Inexperienced developers simply may not know that this is expected of them.

A better response is to acknowledge your own mistakes. Don’t assume that the tester will spot it. Don’t assume that it’s not important. If there’s a bug or a potential bug in your code or someone else’s, get it noted. I’ve found that most developers do this. It’s a start but I wouldn’t say it’s ideal. The bug is noted in the bug tracker and it gets fixed. But there’s no attempt to sort out the underlying issues. This is a response that often deals with symptoms rather than causes. It will keep the customer happy for now – their immediate problem is sorted – but it doesn’t improve the quality of the software overall, nor does it make the developer better at their job. The bug is fixed but there’s no attempt to prevent it from happening again.

The best response on finding a mistake is to fix it in the short term and ask why it happened. If the answer is “I (or someone else) made a mistake”, ask why again. It’s not good enough to say that development made a mistake or that the tester made a mistake. We need to figure out how intelligent, competent people (I hope that this covers you, if not your colleagues) can do something that appears stupid and incompetent. When you get a good answer to this question, you can figure out how to stop this happening again. This is the 5 Whys method, which you may have heard of if you’re into management-speak – I’m not especially, but I like this one.

As an example, I once discovered that one feature of my product completely failed when we deployed it with Oracle rather than our standard SQL Server. It wasn’t just broken in particular circumstances. It completely failed. This software made it to a late beta stage, deployed on customer sites. A problem this severe should have been spotted well before that. Why wasn’t it?

  1. Because the developer (me) introduced a bug which wasn’t spotted by testing,
  2. because the system was not adequately tested against Oracle,
  3. because the test script did not include a full regression test against Oracle,
  4. because the spec did not explicitly state that the system was to run against Oracle,
  5. because the spec writers are non-technical.

At this point, we can start to make things better. Using a single bug, we’ve discovered a flaw in our code and (more importantly perhaps) a flaw in our processes. The product spec is created entirely by non-technical client managers and product analysts. They had no particular knowledge of specific database technologies, nor should they. Their job is to specify what the system does. The developer’s job is to handle details such as database vendors.

Our solution was to get development far more involved in creation of the product spec. Broadly, the client managers will create a functional spec and the developers will add any non-functional requirements such as database vendors, compatibilities, response times, security features and so on. Now that this is included in the product spec, the test script must include specific tests for these issues and they’re far more likely to be caught during testing. For that matter, they’re more likely to be caught by development as we had a hand in creating the requirement.
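To make that link between spec and test script concrete, here is a minimal sketch of a check that flags any database the spec requires but the regression plan never runs against. Everything in it (ProductSpec, RegressionPlan, untested_databases and the field names) is hypothetical and invented for illustration; it is not the tooling we actually used, just one way the idea could be expressed.

```python
from dataclasses import dataclass, field


# All names below are hypothetical, for illustration only.
@dataclass
class ProductSpec:
    # Non-functional requirement added by development:
    # the database vendors the product must run against.
    supported_databases: set[str] = field(default_factory=set)


@dataclass
class RegressionPlan:
    # Database vendors the full regression test is actually run against.
    regression_databases: set[str] = field(default_factory=set)


def untested_databases(spec: ProductSpec, plan: RegressionPlan) -> set[str]:
    """Return the databases the spec requires but the plan never exercises."""
    return spec.supported_databases - plan.regression_databases


# The situation from the example above: Oracle is required but never tested.
spec = ProductSpec(supported_databases={"SQL Server", "Oracle"})
plan = RegressionPlan(regression_databases={"SQL Server"})
gaps = untested_databases(spec, plan)  # {"Oracle"}
```

A check like this only works once the database vendors are written into the spec, which is the point of the process change: the gap becomes something you can see rather than something you have to remember.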

Note specifically that none of the answers to the why questions are “because x is incompetent”. I’ve identified at least three people who could be said to be to blame for the error. However it does not benefit us to apportion the blame to any of them. None of us could fix the underlying problem by ourselves. The tester needs development technicians to help client managers to write better specs. While the idea of blaming others (or ourselves) can be satisfying, it’s ultimately unhelpful. First of all, it makes no attempt to improve a situation – no-one ever asks why. Second, if we know we are going to be blamed when things go wrong, it doesn’t make us less likely to make mistakes. It only makes us less likely to report them. And you can’t fix what you don’t know about.
