Fixed Quantity Overflow Bug

Any defect where a fixed-size numerical type is selected to represent some parameter, and the bounds of the parameter grow to exceed the range of the type, is a FixedQuantityOverflowBug. (Perhaps a catchier, idiomatic name would be better. This page used to be called BillGatesBug in honor of the BillGatesSixFortyKbytesQuote, but as he apparently never said the phrase often attributed to him, it is hereby renamed. Bill gets enough abuse, though he deserves much of it...)

Without fail, the type selected is initially adequate for the purpose but becomes inadequate as time passes. In many cases the pending inadequacy could have been foreseen; in many others, the system was used long after its initial designers expected it to be replaced.

Notorious examples of this class of bug:

 * The YearTwoThousandProblem: two-digit year fields that overflowed at the century boundary.
 * The year-2038 rollover of Unix time_t on systems that count seconds since the epoch in a signed 32-bit integer.
 * IPv4 address exhaustion: 32 bits of address space turned out to be too small for the Internet.

Strategies to avoid these bugs:

For quantities that could (potentially - always consider MooresLaw and extrapolate it for at least a generation or two; you never know how long your system will last) grow without bound: use an unbounded, arbitrary-precision type (a BigNum), or at least plan for TypeMigration to one when the fixed type fills up (see the sketch below).

For quantities for which you can establish an upper bound (either absolute, such as the number of days in an Earth calendar year, or reasonable, such as the number of legs on an animal), feel free to use a fixed-size type if that's more convenient. However, except in memory-limited situations, or for datatypes you will have millions of, you should probably avoid types smaller than the standard integer type (which, on most of today's systems, is a 32-bit integer).
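
(To make the two strategies concrete, here is a minimal Java sketch - the class and method names are purely illustrative: the bounded quantity gets the standard fixed-size int, the unbounded count gets an arbitrary-precision BigInteger.)

 import java.math.BigInteger;

 class TypeChoiceSketch {
     // Bounded quantity: days in a calendar year never exceed 366,
     // so the standard 32-bit int is comfortably adequate.
     static int daysInYear(boolean leapYear) { return leapYear ? 366 : 365; }

     // Unbounded quantity: a running event count has no natural ceiling,
     // so an arbitrary-precision type sidesteps overflow entirely.
     static BigInteger totalEvents = BigInteger.ZERO;

     static void recordEvent() { totalEvents = totalEvents.add(BigInteger.ONE); }
 }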

I believe that in most cases a FixedQuantityOverflowBug can be avoided by choosing a sufficiently good ApproximationOfInfinity, a value limited only by hardware resources and empirical performance. Sadly, the choice between a word-sized integer and an arbitrary-precision integer involves a trade-off: an Infinity that bottoms out far below the hardware-resource limit but is hardly ever reached in the common case, versus an Infinity that eats away at the empirical-performance limit even when it isn't needed. I hear some Smalltalk VMs reserve a tag bit in the word to differentiate small integers from pointers (the language being dynamically typed, those pointers aren't necessarily to actual big integers), but that's not feasible without hardware or OS support.
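
(To see the trade-off in action, a small, hedged Java illustration - the class name is invented for the example: the word-sized integer wraps silently at its limit, while the arbitrary-precision one stays correct but pays for its generality on every operation.)

 import java.math.BigInteger;

 class InfinityTradeOff {
     public static void main(String[] args) {
         // Word-sized "infinity": cheap, but it bottoms out and wraps silently.
         int word = Integer.MAX_VALUE;
         System.out.println(word + 1);                 // prints -2147483648

         // Arbitrary-precision "infinity": always correct, but every operation
         // pays for the generality, even when the values are small.
         BigInteger big = BigInteger.valueOf(Integer.MAX_VALUE);
         System.out.println(big.add(BigInteger.ONE));  // prints 2147483648
     }
 }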


In many martial arts, such as Aikido and Bagua, the key teaching is to use your opponent's own destructive energy to your advantage.

This is the common approach in large integer numerical algorithms, where the fixed word size (e.g. 32 bits) of the machine is used as an advantage rather than a bug, by operating in a finite cyclic group that just happens (what a coincidence) to be of order 2^32. Modular arithmetic, a part of NumberTheory, allows for a vast number of powerful, interesting, and fast algorithms, and is often the preferred approach over e.g. BigNums - and indeed, BigNums are always based on modular arithmetic, at least in a trivial sense, and often in a deep sense (when the ChineseRemainderTheorem is used to reconstruct final answers). -- DougMerritt
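
(As a hedged illustration of that point - not any particular BigNum library's code - here is multi-precision addition in Java with 32-bit limbs, where the machine's natural wraparound mod 2^32 does the per-limb work and only the carry is handled explicitly.)

 import java.util.Arrays;

 class LimbAddition {
     // Multi-precision addition with 32-bit limbs, least-significant limb first.
     static int[] add(int[] a, int[] b) {      // assumes a.length == b.length
         int[] sum = new int[a.length + 1];
         long carry = 0;
         for (int i = 0; i < a.length; i++) {
             long t = Integer.toUnsignedLong(a[i]) + Integer.toUnsignedLong(b[i]) + carry;
             sum[i] = (int) t;                 // keep the low 32 bits (the value mod 2^32)
             carry = t >>> 32;                 // propagate the bit that "overflowed"
         }
         sum[a.length] = (int) carry;
         return sum;
     }

     public static void main(String[] args) {
         // 0xFFFFFFFF + 1 = 0x100000000, i.e. limbs {0, 1}
         System.out.println(Arrays.toString(add(new int[]{-1}, new int[]{1})));
     }
 }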


Discussion moved here from CounterInManyProgrammingLanguages:

Both the C++ and Java variants have a potential overflow problem. Doing a general-purpose counter in either of these languages is significantly more painful. I don't know about all the other languages on this page (some don't exhibit this problem).

Which is why the templated C++ version was provided - if you need greater capacity than that provided by an integer, you can have it by instantiating a counter<X> for some arbitrary-precision integral type X. None is provided by the language, of course, but many are readily available.
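
(For comparison, a rough Java analog of that idea - a sketch only, with invented names, not the actual code on CounterInManyProgrammingLanguages: a counter parameterized over the number type it counts with, so capacity is chosen at instantiation.)

 import java.math.BigInteger;
 import java.util.function.UnaryOperator;

 // A counter parameterized over the number type it counts with.
 final class Counter<T> {
     private T value;
     private final UnaryOperator<T> successor;

     Counter(T initial, UnaryOperator<T> successor) {
         this.value = initial;
         this.successor = successor;
     }

     T increment() { return value = successor.apply(value); }
     T current()   { return value; }
 }

 // Usage - an arbitrary-precision counter that can never overflow:
 //   Counter<BigInteger> c = new Counter<>(BigInteger.ZERO, v -> v.add(BigInteger.ONE));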

Right. As I said, significantly more painful (you have to provide a facility for it). What is the Java solution?

Throw an Exception? That isn't a solution (for that problem), it's an error recovery strategy.

Uh, why isn't throwing an exception a solution? What would you like to happen if you hit the maximum value the chosen data type can handle? Wrap? Seems to me the only proper way to handle it is to throw an exception and let the caller (or catcher) decide how to deal with the problem. Doing anything but throw an exception seems to be a distinctly non-general solution.
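
(A minimal sketch of that approach, assuming Java 8 or later, where Math.addExact throws ArithmeticException on int overflow instead of wrapping.)

 // Math.addExact (Java 8+) throws ArithmeticException on int overflow,
 // so the counter reports the problem instead of silently wrapping.
 class CheckedCounter {
     private int count = 0;

     int increment() {
         count = Math.addExact(count, 1);   // throws rather than wraps
         return count;
     }
 }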

Well, what a whole bunch of languages do is just give you the right answer. ;) Grab any Smalltalk implementation and evaluate "200 factorial". It works just fine...

Not a good answer; try again. All languages have numerical boundaries at some point, even if that point is in the trillions. Are you saying that even if you want a counter for a 32-bit integer, you never want to know if it is about to wrap? Try to be more specific with your answer.

[His answer was quite specific. It helps to have a feel for how big factorial is. 200 factorial is approximately 10 to the 375 power. A trillion is only 10 to the 12 power. He's saying that Smalltalk will give over 1200 bits of precision for 200 factorial; it is limited only by the amount of RAM available. Clearly the universe will die before the counter wraps; in fact, that will happen a very, very, very large number of times before the counter wraps. The same is true for the Scheme/Lisp solutions mentioned several places on this page. So the complaint is that throwing an exception is not a whole lot better than just wrapping; neither gives the correct answer, yet other language implementations do.]
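
(Java can produce the same exact answer, but only if you opt into arbitrary precision explicitly; a quick BigInteger sketch.)

 import java.math.BigInteger;

 class TwoHundredFactorial {
     public static void main(String[] args) {
         BigInteger f = BigInteger.ONE;
         for (int i = 2; i <= 200; i++) {
             f = f.multiply(BigInteger.valueOf(i));
         }
         System.out.println(f);                          // the exact 375-digit answer
         System.out.println(f.bitLength() + " bits");    // well over 1200 bits
     }
 }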

Sorry, I think you are oversimplifying the situation. Smalltalk has SmallInteger, LargeInteger, and various Fraction and Float types that can be treated as numbers. A SmallInteger typically only has a range of -2^30 to 2^30-1. What if you need a general counter for a float or a small integer? What would you like to happen when you need a small-integer counter and it is about to overflow? A general-purpose counter usually has to be used for something, and often that something may require something other than a LargeInteger.

[You could certainly implement an Int32 class in Smalltalk if you wanted, and give it the same semantics as Java's ints. (And similarly, Java has those BigInteger things, which I imagine are similar to Smalltalk's LargeIntegers.) There's no reason why you can't have both abstractions available in a language; the question is just which one you want to be the default.]

Yes, exactly. Java has the same issue as Smalltalk - a Java int is like a Smalltalk SmallInteger and a Java BigInteger is like a Smalltalk LargeInteger. Often you will want a small-integer counter.

[When this page was still young and small, someone pointed out that TypeMigration is the most desirable solution, and that's what Scheme and some Lisps offer: the counter starts as a small int, and when it overflows, it is automatically converted to a Bignum, so that you get the best of all possible worlds.]
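
(A hedged Java sketch of that TypeMigration idea - the language won't migrate for you, but a counter object can: count in a cheap long until it would overflow, then promote to BigInteger and keep counting.)

 import java.math.BigInteger;

 class MigratingCounter {
     private long small = 0;
     private BigInteger big = null;          // non-null once we have migrated

     void increment() {
         if (big != null) {
             big = big.add(BigInteger.ONE);
         } else if (small == Long.MAX_VALUE) {
             big = BigInteger.valueOf(small).add(BigInteger.ONE);   // migrate
         } else {
             small++;
         }
     }

     BigInteger value() {
         return big != null ? big : BigInteger.valueOf(small);
     }
 }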

[It is not typically desirable to just let a counter overflow, except in C when performance is more important than correctness. :-) It is not clear to me that Smalltalk and Java advocate such an approach, so it's not really clear where this conversation has been going - maybe it's just a misunderstanding of the Scheme/Lisp shining example?]

Smalltalk is doing the same thing as Scheme and Lisp (and Ruby and Python too, I believe) - using the efficient SmallIntegers when it can, and returning LargeIntegers when it can't, and you never ever need to worry about it or be aware of it. Java and C are doing something completely different. The problem isn't that Java is incapable of doing the same thing the others are; it's that it's a pain in the ass, because when you type "42" you get one of these kinda-like-an-integer-but-it-wraps things, instead of an object that behaves like an integer.

That's cool until you need the number to be fixed precision, say, to store in a database field. If I need a 32- or 64-bit integer, I'd rather just start with that and be told on the off chance that it overflows. I'd prefer that to it just morphing into some byte array with no bounds. Most things I program have bounds, so it is rarely a problem in real-life programming. Scientific programming of course is another thing altogether.
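
(If the constraint really is a 32-bit database column, Java can at least make that boundary explicit; a small sketch assuming Java 8+, where intValueExact throws if the value doesn't fit.)

 import java.math.BigInteger;

 class StoreCount {
     // Convert exactly at the storage boundary or be told - never truncate.
     // intValueExact (Java 8+) throws ArithmeticException if the value
     // does not fit in 32 bits.
     static int toDatabaseField(BigInteger count) {
         return count.intValueExact();
     }
 }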

I think maybe you've never had really big persistent counters. Making arbitrarily large counters isn't that hard in a decent language; why should we now be limited because the database isn't up to the task? I had to solve this problem by writing a fast-to-parse representation (via a platform-independent serializer I wrote for the occasion) to a file and putting the filename in the database. We really needed huge (HUGE - we were counting molecules impacting a sensor) persistent counters.

Likewise, one can use tools that are similarly open to unlimited-precision numbers - serialize your BigNums into strings, use a typeless database like SqLite, etc.
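
(In Java, for instance, that serialization is a one-liner each way; a trivial sketch.)

 import java.math.BigInteger;

 class BigNumSerialization {
     // BigInteger round-trips losslessly through its decimal string form,
     // so a plain text column can hold an unbounded counter.
     static String serialize(BigInteger n)   { return n.toString(); }
     static BigInteger deserialize(String s) { return new BigInteger(s); }
 }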


See also ZeroOneInfinityRule.


CategoryBug

