Wednesday, July 30, 2008

Robustness features

Since we are living on a somewhat smaller budget at the moment, my wife and I don't spend much money on new things. But we both like to see any expense as a long-term investment, so we ended up buying a Miele washing machine after the old one gave up. Admittedly this choice was based solely on anecdotal evidence (such as my mum still using her 30-year-old machine), but two recent incidents made me believe we made the right choice.

The first was that we accidentally used a washing powder not meant for front loaders. The problem seems to be that these powders foam too much and can harm the machine that way. The interesting bit is that our machine actually noticed the problem. After the cycle had ended, the display started alternating between a "Check detergent" message and the normal "Finished", which is what made me notice the mistake.

The second incident was a power failure: while the machine was running, it lost power for a few minutes. This didn't seem to worry it at all. It just continued from where it had been when the power went, as if nothing had happened.

Both of these features fall into a category which I like to call "robustness features". Admittedly it sounds a bit stupid, and I was considering the catchier but inaccurate "quality features", but let's keep it for correctness' sake until someone comes up with something smarter.

A robustness feature is something that has been added to a product with the sole purpose of increasing its robustness, i.e. the chances that it will behave well under some erroneous condition. In the examples above, Miele spent some time working on (a) adding some sensor and logic to detect the use of unsuitable washing powder and (b) adding some type of non-volatile memory that allows the machine to remember its state across a power failure. Neither of these features seems trivial; they probably add significantly to both the development and the production costs.

What makes these features interesting for me is that they show a certain commitment to producing high-quality products. These are features that are not easy to use in marketing. People tend to think "I wouldn't use the wrong detergent" or "power rarely fails", so these features are often ignored when comparing products. Additionally, it is easy for a vendor to pass the blame if someone complains: the owner should simply not have used the wrong detergent, and it is certainly not the manufacturer's problem if the power fails. Taken together, these two effects mean that many companies do not bother putting such features into their products. That in turn makes me believe that our choice of washing machine was a good one, since Miele seems to be one of the few companies around that still care.

Note that this also applies to software products. If you have ever written code that deals with external input, you know that a lot of time can go into avoiding, detecting and treating errors. I once wrote an input filter for a reasonably small XML format that ended up having more than one hundred different error messages. It would have been a lot easier to use fewer error messages and group multiple errors together, but that means that when some error occurs the user has to guess what is wrong. Since I have been in the role of the guessing user much too often, I tend to write my code with very detailed error messages. In that case the company I was working for was willing to make that investment, not only in my time but also in maintenance and in the additional cost of localization.
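
To make the difference concrete, here is a minimal sketch in Python of a validator that reports each problem separately instead of a single "invalid input". The `<order>`/`<item>` format, its attribute names and the function `validate_order` are all made up for this illustration; they are not the XML format mentioned above.

```python
import xml.etree.ElementTree as ET


def validate_order(xml_text):
    """Return a list of specific error messages; an empty list means valid."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The parser's own message already names the line and column.
        return [f"XML is not well-formed: {exc}"]

    if root.tag != "order":
        return [f"Root element is <{root.tag}>, expected <order>."]

    errors = []
    if "id" not in root.attrib:
        errors.append("<order> is missing the required 'id' attribute.")

    items = root.findall("item")
    if not items:
        errors.append("<order> contains no <item> elements.")

    for index, item in enumerate(items, start=1):
        if not item.get("sku"):
            errors.append(f"Item {index}: missing or empty 'sku' attribute.")
        qty = item.get("qty")
        if qty is None:
            errors.append(f"Item {index}: missing 'qty' attribute.")
            continue
        try:
            value = int(qty)
        except ValueError:
            errors.append(f"Item {index}: 'qty' must be a whole number, got '{qty}'.")
            continue
        if value <= 0:
            errors.append(f"Item {index}: 'qty' must be greater than zero, got {value}.")
    return errors


if __name__ == "__main__":
    for message in validate_order('<order><item qty="x"/></order>'):
        print(message)
```

Collecting all problems in a list, rather than stopping at the first one, is a design choice: the user can fix everything in one pass instead of resubmitting the file once per mistake.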

I believe it is worth putting this extra effort into writing robust code that avoids failure by not letting bad things happen in the first place, detects errors early and accurately, and treats them with detailed error messages and ideally some decent recovery mechanism. This applies not only to parsing input formats, but also to user interfaces, library design and even general application code, where resources might run out and similar problems can occur.
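
As one possible shape for such a recovery mechanism, here is a small Python sketch in the spirit of the washing machine that remembers where it was. It assumes progress can be described by a single step index stored in a JSON file; the names `save_checkpoint`, `load_checkpoint` and `run_cycle`, and the `fail_at` parameter used to simulate a power failure, are invented for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # Renaming within one directory replaces the old file in a single step.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh start if it is missing or unreadable."""
    try:
        with open(path) as handle:
            state = json.load(handle)
        return {"next_step": int(state["next_step"])}
    except (OSError, ValueError, KeyError, TypeError):
        return {"next_step": 0}


def run_cycle(steps, path, fail_at=None):
    """Run the remaining steps, recording progress after each one."""
    state = load_checkpoint(path)
    done = []
    for index in range(state["next_step"], len(steps)):
        if index == fail_at:
            raise RuntimeError("simulated power failure")
        done.append(steps[index])
        save_checkpoint(path, {"next_step": index + 1})
    return done
```

Calling `run_cycle(["fill", "wash", "rinse", "spin"], path, fail_at=2)` stops after the first two steps, and a second call without `fail_at` picks up at "rinse". A real program would also remove the checkpoint once the cycle is complete, so that the next run starts from the beginning.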

This effort might not be that visible to your potential clients, but the ones you have will learn to appreciate it sooner or later. After all, things do go wrong every now and then. They might never know exactly what you did or how much time you spent thinking about what errors can occur and what to do about them, but they will feel good about your product. And that is what counts for me.
