Single Function Exit Point Please

SingleFunctionExitPointPlease may not be the best title for such a page, but it is intended to be a counterpoint to the SingleFunctionExitPoint page. I read through the whole page and a surprising number of comments agree that multiple points of exit are fine. Some believe they are preferable.

There is a rule of thumb in programming that goes back a long time that states something like this:

A function should have a single point of exit.

That is not quite how I learned it. The way I learned it was like this:

A function should have one point of entry and one point of exit.

Rather than the above, I believe a single code block should have one point of entry and one point of exit. A function is a code block, so this holds true for functions. However it also and perhaps more importantly holds true for any other situation where it might be an issue. In languages like BASIC, Fortran, C and Assembly language you can create blocks of code that have multiple points of both entry and exit. Code where this happens is often called 'spaghetti code'. The more points of entry and exit the more difficult it is to debug the code. Eventually, it becomes impossible for any practical purpose.

Here is an example of valid code that breaks the rule:

int multifun(int n) {

    int x = n, y=-1;

x = n;
assignz:
    if( y < 9 ) { y=y+1; }
    goto start;
assigny:
    if(y==9) {
        y = y-1;
odd1:
        if((y/2)*2==y) {
            if(y>x) {goto assignz;}
            return(0+x);
        }
    }
    if(x!=y) {
        if(y<x) {goto ok;}
        printf("y:%i\n", y);
        return(5);
ok:
        if((x+y+3)==4) {goto odd1;}
        return(2+x+y);
    }
start:
    if(y!=x) {goto assigny;}
fin:
    if(y==x) {goto assignz;}
    goto odd1;
    return(y);
}

The above code compiles with an ANSI-C compiler. What value does that return for multifun(2)? For that matter, does it even return?

If you need to translate the above code to another coding language that does not have goto statements, how will you go about it? When the function gets too long, how do you delegate tasks to functions to make it easier to follow? If I want to print a statement on entry and exit to the function, how do I do it? If I need to set a debugger breakpoint just before it exits to see what state it is in, how do I do it? All of these are possible, but all are way more work than they would be if the rule was followed.

The above code may look too tortured to actually exist, but significantly more tortured code than that has been written in the real world. I have worked on it. Code like that, when it exists in volumes of tens of thousands of lines, will defeat even the best programmer. The easiest way to fix code like that ends up being to go all the way back to the specs and writing it from scratch.

Code in real working systems does not stay static. It gets changed. People don't generally set out to write code like that but real code accumulates hacks. Not everybody altering every line of working code in the world is sufficiently clever, has enough short-term memory, enough time, enough energy and the tenacity to ensure that they don't slip up somewhere. In real working production systems, code accumulates errors, ossifies and becomes fragile and is prey to the weakest programmer that has worked on it.

In the working world it is quite possible for a working professional programmer to produce things like code in C on one device and corresponding code for mainframe assembly language. I have done so with a stream cipher for a bank, for instance.

The above code, as tortured as it is, pales in comparison to real systems.

Even the very best programmer will eventually come to grief by disrespecting the single point rule.

The example that people seem to be most strongly supporting is where a function goes through a series of checks and returns as soon as one fails. A simple example might be this:

int mrwfile(char *inname, char *outname) {

    FILE *inf, *outf;
    char buf[BUFSIZE];
    int status = 0;

inf = fopen(inname, "rt"); if(!inf){ doerr(inname); return(0); } outf = fopen(outname, "wt"); if(!outf){ doerr(outname); fclose(inf); return(0); } while(fgets(buf,BUFSIZE,inf)){ fputs(buf,outf); } if(errno!=0) { perror("doofilestuff:fail\n"); } else { status = 1; } fclose(outf); fclose(inf);

return(status);
}

Note that the second time there is an attempt to open a file we have to remember to close the first before we return. If we forget, there will be a resource leak. As the number of these preemptive exits accumulate the probability that there will be a resource leak or similar error approaches one. This is not a question of 'if' such a bug will appear in code like this. It is a question of 'when'.

Imagine if, in the example above, the number of preconditions grows until we have no choice but to refactor the code to move code to a helper function. How is this done? In the simple example above, we need, if we want to move the file processing out of this function, to pick through the code to determine which things require movement and which things require duplication. Mistakes cause bugs and eventually as this burden increases either the movement won't be done and the code will suffer or the movement will be done and the programmer will suffer. Part of that suffering, by virtue of introducing bugs, will be deferred or shifted to another programmer or to the user.

int rwfile1(char *inname, char *outname) {

    FILE *inf;
    int status = 0;

inf = fopen(inname, "rt"); if(!inf){ doerr(inname); } else { FILE *outf; outf = fopen(outname, "wt"); if(!outf){ doerr(outname); } else { //int doofilestuff(FILE *inf, FILE *outf) { // << Block start char buf[BUFSIZE]; while(fgets(buf,BUFSIZE,inf)){ fputs(buf,outf); } if(errno!=0) { perror("doofilestuff:fail\n"); } else { status = 1; } //} // << Block end fclose(outf); } fclose(inf); } return(status);
}

The code above follows the single point rule. It is clear what belongs together and clear how to move it in or out. I have put comments in so that you can see the relationship between the original single point code above and the new refactored code below that breaks out the work done on the files to another function.

int rwfile(char *inname, char *outname) {

    FILE *inf;
    int status = 0;

inf = fopen(inname, "rt"); if(!inf){ doerr(inname); } else { FILE *outf; outf = fopen(outname, "wt"); if(!outf){ doerr(outname); } else { if(doofilestuff(inf, outf)){ status = 1; } fclose(outf); } fclose(inf); } return(status);
}

Provided that variables are declared where they should be (in the scope where they are used as above for variable 'buf'), the entire section of code moves in or out as one item. This happens without alteration to the surrounding code. It also happens largely without respect to the language or programming idiom.

Compiler and manual optimizations can be done much more effectively when the entire flow of control is clear. Tracing code can be easily placed to bracket blocks and functions without worrying that the facility will be missing in another environment.

Functions coded as suggested by the single point rule, worst case, are 'belt and suspenders' when the language you currently happen to be working in does not need it or the code is simple as in the examples above. Functions coded with multiple entry/exit points are significantly more fragile in real world code where the code can be in place for literally decades through many changes of maintainers, languages, operating systems, libraries, different flavor of the month languages and programming paradigms.

This is but one of a wide variety of simple and reasonable rules of thumb that working programmers use to build robust software that works when it was built and continues to work years later.

More modern languages such as C++ offer facilities that insulate programmers from *some of* the consequences of breaking the single point rule. They do not cause it to stop being a sound rule. They do not turn poor practice into good practice. If you can think of a way that it can go wrong, it *will* go wrong in the fullness of time. The bizarre rationale offered for dispensing with this rule is born of limited experience and hubris.

The number one issue with software is defects -- bugs. A single defect can tear down not only a given piece of software but an entire dependent ecosystem. There is never an instance when the single point rule is harmful in any day to day situation. If it cannot do harm and it *could* protect you, then it is a rule you should follow. If you are unsure, you should err on the side of caution.

By their nature, defects happen precisely in the places where they were unexpected. Rules like this are for the instances when code breaks down, not when you get your choice of equipment, toolchain and personnel, not when everything goes as expected in the 99 percent of cases you tested and you have the time and resources to be able to absorb the consequences of reckless programming and your worst case is benign failure. They are for when you are working with defective hardware, older languages, multiple incommensurate languages, different operating systems and you are dangerously close to deadline and running out of budget. Where reckless mistakes can ruin you. In cases where I have supplied code, failure means a bank is robbed, the phone system goes down, millions of consumer appliances break down or millions of dollars of petrochemicals are destroyed. Rules like this are for the 1 percent of cases where things don't go as expected and especially the killer .0001 percent of cases that simply destroy an entire working system and seem impossible to find and correct.

If there is a rule that will save you from creating defects in your software, you are well advised to learn and take advantage of it while you have the luxury to do so. No matter what you do, you will have bugs and some will be frighteningly difficult to find and correct. You need all the help you can to make sure you are able to cope with those bugs. It is foolish to take pointless risks when you can avoid it as easily as following a simple rule like this. You need to steer clear of the problems you *can* avoid in order to have the resources to deal with the problems you can't avoid.

This won't help the people with five or ten years under their belts in a narrow range of languages and environments who are convinced that bugs won't affect them. I am aiming this at the ones like me who already know why rules like these are ignored at your peril and find the other discussion strange. More than that, I am aiming this at the earnest beginners and journeymen who love their craft, want to build solid working software that they can be proud of and are willing to sweat the details to get there.

There are exceptions to every rule. However, rules like this one would require a truly exceptional situation. I did not see a single convincing argument against this rule on the SingleFunctionExitPoint page and I can't, off the top of my head, think of one.

If you are an expert in your programming area, have a couple of decades of programming under your belt and are programming for an extreme edge case that requires unusual code it might be OK to break a vanilla rule like this one or two times a year. If you find you are breaking a well established rule of thumb like this every day, then you are almost certainly making a mistake. -- BobTrower </rant>


What value does that return for multifun(2)? For that matter, does it even return?

Oh goody! A challenge.

Let n0 be the initial value of n when multifun() is called. After the initial section is executed, we have n = n0, x = n0, and y = -1. After the assignz section is executed, we have n = n0, x = n0, and y = 0. (-1 is always less than 9 so we always add 1 to it). In start, we split into two cases.

From this, I would (assuming that we don't care what it does for the cases that were undefined behavior in the original), rewrite multifun as follows.

int multifun_11(int n)
{
    int x = 5;

if (n < 0) { puts("y:0"); } else if (n == 0) { puts("y:1"); } else if (n == 1) { x = 1; } else { x = n + 2; }

return x; }

From this we can clearly see that it returns in all cases (except possibly the undefined behavior ones), and multifun(2) return 4.


Well, yes, it really is that simple. :) Thanks for a yeoman effort. What you have done here illustrates the point nicely, I think. Your functionally equivalent code can actually be understood without resorting to a compiler. It is now sound, significantly easier to understand, and much less likely to attract bugs when refactored, which I do now (assuming the informational printing was not a required side effect, which it was not):

int multifun_12(int n)
{
    int x = 5;

if(n>0) { x = n == 1 ? n : n + 2 }

return x; }

and finally:

#define multifun(n) (n > 0 ? n == 1 ? n : n + 2 : 5)

That last refactoring could put a compiler in a significantly better position to optimize code because it eliminates a function call and gives the compiler access so it can tell if surrounding code can eliminate that bit of code entirely.

This also illustrates one of the beauties of pair programming. Whether or not you could have gone this last bit, or I could have done your part, neither of us bothered to go the route alone. Together we have produced a beautiful illustration of the utility of the single point rule.

Cheers! --BobTrower

Your welcome. However, your last transformation is one I would strongly recommend against. For one thing, you've changed the functionality. For example, the macro version of multifun(x++) (where x is 7 going in) returns 11 and sets x to 10. The function version returns 9 and sets x to 8. Another example, in the function case multifun(7.5) evaluates to 9. In the macro case, it evaluates to 9.5. Leave it as the function multifun_12, or, if you have access to c99 or later.

inline int multifun_13(int n)
{
    return n > 0 ? (n == 1 ? n : n + 2) : 5;
}


We come to another teachable moment. I tested the macro with the code below and it worked fine. I did not test it using the construct you use because I consider it poor practice generally to do what you are doing. I also generally discourage abusing macros like I did here.

What we see here, is that:

I broke a rule by creating a macro with a possible pathway to failure. Macros like this are prone to certain failure modes and in fact my first crack at that macro failed because of missing brackets. I caught that in testing and corrected that particular defect. I did not exercise all use cases and got caught.

I did not catch an error even though I unit tested it, because I did not expect it would be used as it was. Rules of thumb are one of the ways we ensure against failed expectations.

I also broke a rule by pointlessly over-optimizing something that was working well enough, simply to be clever. As we can see, clever code is fragile and the fragility can be difficult to see. Code clever enough to break is distinctly less clever than it might seem. The bread and butter code may take a little more time to execute, but it is a small price to pay to ensure a correct answer.

I cannot argue that my practice was a good one because you exposed a perfectly plausible mode of failure. It is precisely this unexpected use case where the code broke and breakage is enemy number one. I messed up and the proof is plain to see. Being able to explain breakage does not excuse breakage. I have no doubt that in the fullness of time, if that macro had been in production, it would have failed.

I might try to defend the above by saying that this is only example code, but that does not make it better. It makes it worse. Failing example code, unless we do a post-mortem like this, sets a poor example. It is still an example of what *not* to do with production code. I would never actually do that with production code in my day job, but I did it here and breakage is breakage.

It is generally poor practice to do what I did with that macro for reasons that you render apparent.

It is also generally poor practice to do what you did, also for reasons that you render apparent.

Had either of us practiced 100% good hygiene, the breakage, while possible with poor practice when we both transgress, would not occur.

This exercise points up the value of testing and the limits of testing. Automated Unit test is necessary, but it is not sufficient. The programmer doing development is not the best person to do the testing. It is often not possible to have the luxury of a second pair of eyes. That is why pair programming is advantageous. In the best case, the testing team is a separate entity.

I came to the page that this is responding to because a programmer that reports to me indulged in the bad habit of exiting in the middle of a function. He made a mistake. I came here looking for a tidy explanation of a vanilla rule of thumb. I was surprise to find a lively discussion where there were people in favor of gleefully exercising this bad habit. Worse, it actually seems that they were winning the debate.

Our exercise here is largely academic since this is not production code. However, in the real world case of the programmer in question, he ignored this rule and in fact got bitten. The line that I spotted was exiting after resources had been allocated from the heap. It is the type of thing that is unlikely to be caught in ordinary testing. By itself, it would not cause too much harm, but if all the code in the system is making errors like this, the system would suffer badly.

The programmer is a good one with a few years behind him. He is just finishing a degree in engineering at Waterloo. He got into this bind because, lacking the experience to know otherwise, he borrowed constructs from other programmers. He did not know this rule of thumb, followed the example of someone who did not know or ignored the rule and his first notice that there was an issue was that his manager pointed out a bug in his code. This is not his failing. It is a failing of the wider community of programmers that surround him. Not knowing better, he trusted people who, like the ones on that other page, actually were recommending code that broke the rule. When I pointed the rule out to him and showed the bug that breaking it gave rise to, he understood right away what the issue was. He is an unusually bright young man with a 4.0 grade average and he really sweats the details. If this gives him trouble, you can bet it gives the average programmer trouble.

Below is the snippet of code that ran the unit test on the macro to ensure it behaved as it should. It did in this test, but this test ignores a plausible use case and that untested case caused the code to fail.

void tmulti(void) {

    int i;

for(i=0;i<10;i++) { printf( "%s unit test -- f1:%2.2i, f2:%2.2i. multifun_11(%i) %s multifun(%i)\n", multifun_11(i) == multifun_13(i) ? "pass" : "fail", multifun_11(i), multifun_13(i), i, multifun_11(i) == multifun_13(i) ? "==" : "!=", i ); }
}

Result:

    pass unit test -- f1:05, f2:05. multifun_11(0) == multifun(0)
    pass unit test -- f1:01, f2:01. multifun_11(1) == multifun(1)
    pass unit test -- f1:04, f2:04. multifun_11(2) == multifun(2)
    pass unit test -- f1:05, f2:05. multifun_11(3) == multifun(3)
    pass unit test -- f1:06, f2:06. multifun_11(4) == multifun(4)
    pass unit test -- f1:07, f2:07. multifun_11(5) == multifun(5)
    pass unit test -- f1:08, f2:08. multifun_11(6) == multifun(6)
    pass unit test -- f1:09, f2:09. multifun_11(7) == multifun(7)
    pass unit test -- f1:10, f2:10. multifun_11(8) == multifun(8)
    pass unit test -- f1:11, f2:11. multifun_11(9) == multifun(9)

I could not have hoped for a better unfolding of this page. My mistake was a bona fide mistake and illustrates just how easy it is to fail if you break the rules. Even an expert cannot be certain that a risky move will succeed.

I am not sure who my correspondent is on this page, but my thanks goes out to them for making a good page into an excellent one. I expect they must be a pleasure to work with IRL. --- BobTrower


EditText of this page (last edited July 17, 2014) or FindPage with title or text search