Graceful Degradation

The ability of a (software) system to degrade by only the minimum amount necessary when full operation cannot be maintained.

An enterprise system using several databases should not die completely if only one of them fails, for instance.
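A minimal sketch of that idea in Python. Every name here (`fetch_dashboard`, `orders_db`, `inventory_db`, the section names) is hypothetical and invented for illustration. Each data source is queried independently, so a failure in one makes only that section unavailable rather than taking the whole request down.

```python
from typing import Callable, Dict, Optional


def fetch_dashboard(sources: Dict[str, Callable[[], str]]) -> Dict[str, Optional[str]]:
    """Query every source; a failing source yields None instead of an exception."""
    results: Dict[str, Optional[str]] = {}
    for name, query in sources.items():
        try:
            results[name] = query()
        except Exception:
            # Degrade: this section is unavailable, the others are still served.
            results[name] = None
    return results


def orders_db() -> str:
    # Hypothetical healthy database.
    return "42 open orders"


def inventory_db() -> str:
    # Hypothetical failed database.
    raise ConnectionError("inventory database is down")


if __name__ == "__main__":
    page = fetch_dashboard({"orders": orders_db, "inventory": inventory_db})
    for section, data in page.items():
        print(section, "->", data if data is not None else "(temporarily unavailable)")
```

Catching every `Exception` is deliberately blunt for a sketch; a real system would probably catch only the failures it knows how to survive and log the rest.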


A system degrades gracefully when, rather than crashing or dying outright, it keeps working with reduced functionality. For example, if an HTML browser does not support style sheets, it usually still shows the text, just not in the intended style. This is in contrast to aborting with a "'style' element not found" message or the like. HTML browsers generally ignore tags and elements they don't recognize.
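The "ignore what you don't recognize" behaviour can be sketched with Python's standard `html.parser` module. The `TextOnlyRenderer` class, the `render` function, and the `<fancy>` tag are made up for this example, and this is a toy, not how any real browser is implemented. The renderer understands only `<b>` (shown here as upper case) and silently skips every other tag while still emitting its text.

```python
from html.parser import HTMLParser


class TextOnlyRenderer(HTMLParser):
    """Toy renderer that understands <b> and silently skips every other tag."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._bold = False

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._bold = True
        # Any other tag is unknown to this renderer: ignore it, do not abort.

    def handle_endtag(self, tag):
        if tag == "b":
            self._bold = False

    def handle_data(self, data):
        # Text is always kept, whether or not its enclosing tag was understood.
        self.parts.append(data.upper() if self._bold else data)


def render(html: str) -> str:
    renderer = TextOnlyRenderer()
    renderer.feed(html)
    renderer.close()
    return "".join(renderer.parts)


if __name__ == "__main__":
    print(render('<fancy color="red">Hello</fancy> <b>world</b>'))
```

The unknown `<fancy>` element loses its intended styling, but its text survives, which is the degraded-but-usable outcome described above.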

Of course, not every situation benefits from this. If a bank transaction could complete only halfway, for example, you would not want the other party to leave with your money without handing over the counter-payment or some indication of receipt.

Contrast: ProgressiveEnhancement.


The wisest man learns from the mistakes of others ...

A department store once installed a brand spanking new electronic till system. This system employed the latest methods of stock control, registering each sale with a central database, which could then determine when new stock needed to be ordered.

Each till would batch its sales and send them in a burst of data, making the most efficient use of the carefully designed and specified network. When the internal buffer of the till was half full it would request a slot, and then continue working until the slot was granted, take a second or two to send the data, then carry on.

The system worked perfectly, and the store prospered.

Until one Christmas when sales were so heavy that the network had insufficient capacity. Each till would request a slot, and then carry on working. The problem was that the slot wasn't granted before the till buffer was full.

So it would stop. And wait.

The result was that, at times, up to half the tills in the store would simply stop working for 10, 15 or 20 minutes at a time.

Since the hardware couldn't be upgraded in time, a software solution had to be found, and quickly. The solution chosen was this: as the buffer filled up, the till would slow down. The total throughput of sales was still reduced, but now gracefully, rather than in the previous catastrophic manner. Customers didn't notice, all tills still worked, and the staff could actually work less frantically because the till wouldn't let them go any faster.
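The story does not say how the slowdown was calculated, so the following is only one plausible sketch. It assumes a linear ramp that starts at the half-full mark (where the till requests its slot) and reaches a hypothetical maximum pause when the buffer is full. The function name `sale_delay` and the `max_delay` value are invented for illustration.

```python
def sale_delay(buffer_used: int, buffer_size: int, max_delay: float = 2.0) -> float:
    """Return the pause (in seconds) to impose before accepting the next sale."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    # Clamp the fill ratio to the range 0.0 .. 1.0.
    fill = min(max(buffer_used / buffer_size, 0.0), 1.0)
    if fill <= 0.5:
        return 0.0  # at or below half full: no slowdown
    # Assumed curve: linear from 0 at half full to max_delay at completely full.
    return (fill - 0.5) * 2.0 * max_delay


if __name__ == "__main__":
    for used in (0, 50, 75, 90, 100):
        print(f"buffer {used:3d}% full -> wait {sale_delay(used, 100):.2f}s")
```

A till loop would call something like `time.sleep(sale_delay(used, size))` before each sale. The fuller the buffer, the fewer new records arrive per minute, which gives the network time to grant the slot before the buffer overflows.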

Eventually, the network was upgraded to cope with the full capacity and sales were not slowed. However, the software engineers involved had learned an important lesson about graceful degradation.


CategoryException


Last edited July 9, 2008.