Fix The Problem

The FixTheProblem philosophy says that when you know there is a bug in shared code called by applications, you identify and fix the bug. What you DO NOT do is to write patches to the applications that might trigger the bug, thereby potentially crippling applications and leaving the bug for exploitation by newer applications that aren't aware of its existence. Specifically, this may mean to not DoTheEasiestThingThatCouldPossiblyWork, if the patches are easier than actually fixing the source of the problem.


What if you can't fix it ? E.g. don't have the source code ? I got bitten by a nasty bug today - more of a ClusterFuck in fact - we got a report that some functionality we implemented with HTTP cookies didn't work for them, when our own testing had reported no problem. It turns out that the JRun implementation of the Servlet API encodes Set-Cookie headers, and more specifically the expires= portion for version 0 cookies, using the system locale.

This matters not a whit in most cases, but we're a French shop - and thus fall prey to a very insidious AmericanCulturalAssumption; namely, that the system locale is fine for encoding Date object as Strings under all circumstances. This is of course not the case - French dates aren't "officially" a proper format to use in cookies. However : some versions of IE5 will accept cookies with a bogus date in "expires", but all IE4 versions, and some IE5, silently discard the cookie in such cases. By sheer bad luck, the entire tech team was using the IE5 versions that coped with bogus dates.

In the end I chose to fall back on the old reliable - creating the set-cookie header "manually" instead of response.addCookie(). (Writing this now, I realize there is another fallback which may be the SimplestThing, but is riskier - use cookie.setVersion(1) to get Max-Age fields instead of Expires. Wonder what browsers support this format though - I haven't ever seen a server that used it.)


It would be so nice if we could just go and fix the problem. Sometimes you discover the problem already too late: when there are lots and lots of programs using the bug for good, or doing an expensive workaround. yes, developers sometimes discover bugs in the libraries they are using, and sometimes the bugs are not so harmful as the floating point bug, so you can write workarounds for them and still use the library. Now that most software is open source, or at least has an open source alternative, most developers think it was always possible to tell the supplier: "Fix it, here is the patch, or II will do it myself". Sometimes you didn't even have an alternative supplier and had to stick with very buggy software for which there was no maintainer.

For example when ncurses, the open source curses was developed, it was discovered that some applications that originally used curses and were migrated to ncurses would fail. Why? There was a bug in curses in which some applications relied. There were 2 alternatives:

  1. Fix all the code that relied in the curses bug. This requires access to all the code that used curses and therefore that's not possible in a lifetime.
  2. Make ncurses bug-compatible with curses. This would run all code that relied on the curses bug, but would break all the code that relied on ncurses not having that bug.
  3. Create a compatibility mode in which ncurses would execute as it had the bug.

Needless to say option 3 was the preferred option.

-- GuillermoSchwarz


EditText of this page (last edited May 7, 2004) or FindPage with title or text search