Expressive Loop Condition

Problem

Given a loop that is iterating over a range:

  for (int i = 0; i != limit; i++) {...}
what happens if i ever becomes greater than limit? You will end up executing the body of the loop with i >= limit, which is sure to break the code. Given, it was already broken if i can become > limit, but DefensiveProgramming says to mitigate the damage.

Therefore

In a loop over a range, use <, <=, >, and >= in the condition instead of == and !=

  for (int i = 0, i < limit; i++) {...}
Resulting Context

If the loop variable goes out of range, the loop will terminate.

The code's intent is now clearer. This is a more compelling reason to use this pattern than is DefensiveProgramming. See below for a discussion about this.

However

Some feel that this confuses the reader by implying that the impossible is actually expected, or because it gives a false sense of security by hiding the error without actually correcting it.

Caveat: If you use STL, even if a given implementation defines < between two iterators (such as if they are native pointers), you still should only use !=. This is because the StandardTemplateLibrary facilitates the Refactor "Replace Container Type With Container Type", and the new type might not support < .

Caveat: If you are using objects rather than integers, you should use "!=" since you cannot know whether or not the objects define the "<" operator. Even if you are using objects that define an operator<, for any type that is not random access, this operation can have a hidden expense (such as with bidirectional iterators or linked-list nodes). Instead, you should program defensively by not modifying the expression variable outside the parens of the for(;;) statement.

 for ( type at = start; at != end; ++at )
 {
     do not modify at in here!!
 }
If you modify the counter or use a non-const reference anywhere other than in the for(;;) statement, you are not programming defensively.

Also

If you are using floating point numbers, the odds are very good that you must use this idiom. For example:For example:

  for (double i = 0; i != 10.0; i += 1)
is safe but:
  for (double i = 0; i != 1.0; i += 0.1)
will never terminate, because 0.1 cannot be exactly represented by IEEE floating point. You must code it like this:
  for (double i = 0; i < 1.0; i += 0.1)


Never do this. Since 0.1 cannot be represented accurately in binary floating-point (only DyadicRationals can), the round-off error from adding 0.1 to itself accumulates. In fact, never use a floating type as the index of a loop. Write the loop instead as:

  for (int i = 0; i < 10; ++i)
  {
    const double x = i / 10.0; // or final
    // Process x
  }
The round-off error from the double version will be minor, but this is a good habit to maintain. It's like putting your seatbelt on for every car trip.

Of course, a functional programmer could create a list 0..9, and map it appropriately.


This code reads more like "for each i: 0 <= i < 10":

  for (int i = 0, i < limit; i++) {...}
than does this code:
  for (int i = 0; i != limit; i++) {...}
Not true if the variable goes out of range on one side it will terminate. This error shows the weakness of being sorta-rigorous about things.

I think that this approach is flawed. If you really want the above statement to be true you can do something like

for(int i=0; i >= 0 && i < limit;i++) {...}

but with less simple looping constructs this can be very difficult to read.

Luckily, there is a proper way to do this. Unfortunately, many programmers seem to be allergic/unaware of it. What is wrong with this?:

for(int i=0; i < limit; ++i){
Assert(i>=0 && i< limit);
...
}
Here the intent is clear, the loops statement is clear. The use of assertions, (invariants, pre/post conditions) is a necessary condition for defensive programming, IMO. It should be standard practice, but isn't for unfathomable reasons.

Actually, it depends on the code. If it is impossible for i to underflow, there shouldn't be a check for it. If it is possible for it to happen during execution, then there shouldn't just be an assert (check and fail) but exception handling (check and recover). Most programmers are surprised how many checks they've added for impossible situations while leaving possible situations totally unhandled.

What is possible for i to underflow is not clear. I was simply addressing the claim that ...if is is out of bounds. Checks for things that can't happen are very useful, at least in development code, for catching errors.

I agree with this. It's often very useful during development. I'm just always amazed when I do test-case analysis, how many of my cases cannot even be executed.


I find it is better practice to use iterators rather than indices in loops.

With iterators, you can drive you loop directly from the collection. Your start iterator is the start of the collection and the termination condition can be either terminal data (NULL at the end of a string, a NULL object, etc.) or a terminal pointer.

This avoids the most common mistakes in loops: do I start with 0 or 1 and do I use < or <= for the continuation/termination condition. These are much more likely problems than run-away indices or iterators. And iterators are more efficient, too! -- WayneMack

If you get to that level, I prefer to use algorithms such as the STL algorithms in the Standard Library or InternalIterators ala SmallTalk enumerators in Java. I really try to avoid using for or while constructs when working with collections. This is much more defensive since the operations are bounded. -- RobertDiFalco


In the spirit of ReductoAdAbsurdum?, and because iterators are not universally available as an alternative to indices in some languages, I would advocate that if one wishes to use DefensiveProgramming practices the best approach would be one or a combination of the following:

1. No ControlOfFlow? variable should ever be a non-integer. This is fairly common sense, and to a degree this issue becomes a FalseDilemma? under that principle.

2. Where incremental, non-integer variables have to influence the control of a loop, a separate count index may be used with the floating point incrementation inside the loop. This isn't good OptimumProgramming? practice however and can lead to SkewedIndices? when coded clumsily (which in itself would preclude use in DefensiveProgramming).

3. Process-related conditions should rarely be used in the iteration of the given process, silly as this may sound. If we really are being defensive, one should always MinimizeVulnerability? by ReducingOutsideInfluences? and thus only operate conditional loops on directly controlled variables.

There were some more things but they've floated out of my brain; also, my prose above doesn't really say what I wanted it to say, but I hope you might get the gist anyway. -- NicholasTurnbull


CategoryDefensiveProgramming CategoryLoops


EditText of this page (last edited November 3, 2014) or FindPage with title or text search