Intentional Redundancy Does Not Violate Once And Only Once

OnceAndOnlyOnce is a well-established principle: redundancy (in particular, inconsistent redundancy) is a common source of program errors. Eliminating redundancy also speeds up ReFactoring, as you only have to change one thing rather than n things.

However, redundancy is frequently introduced into systems intentionally, for several reasons, including increased performance and (ironically, it appears) increased reliability.

Increased reliability

Examples of this are FunctionPrototype?s in C/C++, various disk-mirroring technologies, DeoxyriboNucleicAcid in nature, etc. How does this work? Two reasons: the redundant copies can be compared with one another, so corruption or inconsistency is detected rather than silently acted upon; and, if enough copies survive, a damaged copy can often be reconstructed from the others.

In typical violations of OnceAndOnlyOnce, OTOH, the inconsistency is never detected. Some parts of the system use one definition, other parts use the other.

However, see AirplaneRule: the mechanisms necessary to harness redundancy for increased reliability greatly complicate things; it's not a free lunch. That said, properly designed redundant systems can indeed be more reliable than non-redundant ones, especially when given inputs the system wasn't designed for (such as a bird through the turbines or a bullet through one of the disks).

Performance

Denormalization of a RelationalDatabase is one example; caching and LazyEvaluation are others. Here, a redundant copy is made (and, if the data is mutable, maintained in parallel) in order to improve performance, due to increased locality of reference and the like. Managing this redundancy can be tricky and isn't a task for the faint of heart. However, as in the case above, the redundancy is managed.
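A minimal sketch of what "managed" means here, in C++ (the Orders class and its members are invented for illustration): the cached total is a redundant, derived copy of data already held in the item list, and every mutator is responsible for keeping it honest.

  #include <numeric>
  #include <optional>
  #include <vector>

  // Hypothetical example: cachedTotal duplicates information already present
  // in items, purely for speed -- and every write path invalidates it.
  class Orders {
  public:
      void add(double amount) {
          items.push_back(amount);
          cachedTotal.reset();                   // keep the redundant copy consistent
      }
      double total() const {
          if (!cachedTotal)                      // recompute only when stale
              cachedTotal = std::accumulate(items.begin(), items.end(), 0.0);
          return *cachedTotal;
      }
  private:
      std::vector<double> items;                 // the authoritative data
      mutable std::optional<double> cachedTotal; // the managed, redundant copy
  };

The point is not the caching itself but the discipline around it: the redundant value is never trusted unless the code owning the authoritative data has kept it up to date.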

DocumentationOfIntent?

Data is provided in redundant fashion to allow some system (or a human) to perform consistency checks; a failure may then show up as an inconsistency. Unlike the IncreasedReliability? case above (where data is automatically replicated so that subsequent corruption or erasure cannot occur undetected), here the data is supplied to the system in redundant form, often by a human who must specify something more than once. Oftentimes only one "copy" of the data is a full set (so error correction isn't possible); the redundancy is used only to verify the full set of data. Examples of this are widespread in programming; FunctionPrototype?s in C/C++ (discussed below) are perhaps the most familiar.

In many cases, the redundancy (in addition to being machine-verifiable) also serves as documentation. Documentation itself can be considered a type of redundancy that isn't machine-verifiable; comments and design docs aren't consumed by the machine but by the human programmer. Alarm bells should ring if the code does something different from what the comments describe. (Unfortunately, it is easy for comments and design notes to get out of sync with the code, as the verification step for these--humans inspecting the code--is tedious and error-prone.)
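For instance, here is a small C++ sketch of the "specify it twice so the machine can cross-check it" idea (the table and the count are invented for illustration): the initializer is the real data, and the explicitly stated count is redundant information supplied only so the compiler can verify it.

  #include <cstddef>

  // The table's contents are the single source of truth; kExpectedStates is
  // deliberately redundant, existing only so the compiler can cross-check it.
  constexpr const char* kStateNames[] = { "idle", "running", "stopped" };
  constexpr std::size_t kExpectedStates = 3;

  static_assert(sizeof(kStateNames) / sizeof(kStateNames[0]) == kExpectedStates,
                "state table and expected count have drifted apart");

If someone adds a state name without updating the count (or vice versa), the build fails, which is exactly the "failure indicated by an inconsistency" described above.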


In the above discussion, I believe C/C++ headers should be moved under the "Performance" category rather than the "Increased Reliability" category.

In an ideal world, yes. But given that the design of C/C++ is unlikely to change in this regard, header files are still needed for reliability.
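A hedged illustration of that reliability argument, with hypothetical file and function names: the prototype in the header is redundant with the definition, but because both the defining file and every caller include the same header, the compiler can reject a mismatch instead of letting incompatible code be linked silently.

  /* math_utils.h -- the redundant declaration */
  double average(const double* values, int count);

  /* math_utils.cpp -- including the header lets the compiler check that the
     declaration and the definition still agree */
  #include "math_utils.h"
  double average(const double* values, int count) {
      double sum = 0.0;
      for (int i = 0; i < count; ++i) sum += values[i];
      return count ? sum / count : 0.0;
  }

  /* caller.cpp -- a call that disagrees with the prototype (wrong types or
     wrong arity) is now a compile-time error, not silent corruption at run time */
  #include "math_utils.h"
  double sampleMean(const double* samples, int n) {
      return average(samples, n);
  }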

The reason hand-written prototypes in header files were adopted when we went from K&R C to ANSI C was to keep the include files small and to avoid writing an additional preprocessor. There is no technical reason why function prototypes could not be pulled from the source code (other languages do this). At the time, however, I assume it was felt that it was better to have the developer extract function prototypes by hand than to have a preprocessor or compiler do it automatically.

Please note that I am a C/C++ die-hard, but I recognize that, with the technology and knowledge we have today, it should no longer be necessary for developers to create separate header files just so the compiler and linker can perform consistency checks. It would also be nice if someone stepped forward and defined a way for separately compiled libraries to carry their prototype information at the binary level rather than at the source-code level. Having a header file for libraries is an okay solution for C/C++, but it does not help in exposing C/C++ libraries to other languages.

I'm currently refactoring a hunk of C++ code. I don't mind C++ too much, but it is tedious to have to say (or change) the same thing in at least two places. I support the above statement for that reason.

On the other hand, separate include files allow for creating DynamicLibraries? that have circular references to each other (provided that the linker and loader for the OS in question are smart enough to handle that case; most are).
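A sketch of how that works, with made-up library and function names: each library's source compiles against only the other library's header, so both can be built before either is linked, and the loader resolves the mutual references at run time.

  /* a.h -- shipped with (hypothetical) library A */
  void a_handle_event(int code);

  /* b.h -- shipped with (hypothetical) library B */
  void b_report_error(int code);

  /* a.cpp -- part of library A; needs only B's header, not B's object code */
  #include "a.h"
  #include "b.h"
  void a_handle_event(int code) {
      if (code < 0) b_report_error(code);
  }

  /* b.cpp -- part of library B; calls back into A, completing the cycle */
  #include "a.h"
  #include "b.h"
  void b_report_error(int code) {
      if (code == -1) a_handle_event(0);
  }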


The most important reason I've ever had to be intentionally redundant is clarity. A few times in UnitTests, I've refactored away common code, only to revisit the tests later and discover that their workings were less immediately obvious than they were with the redundant code left in. Sometimes I find I can solve the problem with better names, but in many such cases, I've re-refactored them to re-introduce the redundant code. -- PhilGroce


Introducing redundancy can be a healthy refactoring if it reduces coupling. Consider two modules that you control, X and Y, which use local functions fx and fy respectively. fx and fy are discovered to do the same thing (perhaps with some trivial differences that can easily be ReFactored away). Neither fx nor fy has any dependency on the rest of X or Y (or such dependencies are easily factored out). Should fx and fy be moved to a common library and renamed f?
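To make the situation concrete (module and function names below are invented for illustration), here is the duplicated "before" picture and the shared alternative:

  #include <string>

  // x.cpp -- module X keeps a private helper
  namespace {
      bool fx(const std::string& s) { return !s.empty() && s.front() == '#'; }
  }

  // y.cpp -- module Y, written independently, duplicates the same logic
  namespace {
      bool fy(const std::string& s) { return s.size() > 0 && s[0] == '#'; }
  }

  // common.h -- the refactored alternative: one shared f(), but now both X
  // and Y must depend on this new header for the sake of a one-line helper
  inline bool f(const std::string& s) { return !s.empty() && s.front() == '#'; }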

A quick test; count one point for each "yes":

If your total score is nonzero (and certainly if it is two or greater), ReFactoring is probably an improvement. If the total score is zero, you might be better off leaving well enough alone: refactoring would introduce a new dependency into both X and Y just to hold a minor function that isn't generally useful, and that is likely to increase the complexity of your project.

This seems similar to ThreeStrikesAndYouRefactor.

