Avoid Duplication

Don't

ThinkingOutLoud.DonaldNoyes.20090224.m05

I have heard this phrase over and over again as an argument for efficiency, typified by such statements as "OnceAndOnlyOnce". The idea seems to be that if something is to be done or recorded, only one process, organization, or organism should do it.

This is unnatural, being against the nature of the way things are. Why take but one picture of a family member?

We as human beings are unique, separate, and autonomous. Nature provides for duplication and multiplication as a defense against destruction and extinction.

In the computing and communicating mediums we often hear from another camp: all "x" should be handled by "y", we don't need "a" through "w" anymore, they are either "fat", "thin", "hot", or "cold", and they are not "just right", and will run in "z".

In this medium we are forever abandoning things found UsefulUsableUsed for other things perceived to be "better". Often they are. Who would argue that an 8 inch floppy disk is better for storage than a FlashDrive? Who would rather have a machine with a 20 Meg Hard Drive over one with a 500 Meg Hard Drive? This is progress, this is improvement, and this is change for the better. But here we are talking about NewGenerations of the same basic thing, "storage".

But to only have "one machine", "one hard drive", or "one user" is an obviously unseemly scenario.

I say that duplication, multiplication, increases, are healthy and natural. That people misuse and mishandle the things we have is no argument for their extinction. (the user, or the used)

The internet is a perfect example of multiplicity and an argument for duplication. While some may say "it is too much", I think that having several million blades of grass in a lawn is far better than a clump or two.

What's your take on this?

I'd rather have a lawn with a clump or two of grass, because I wouldn't have to mow it.

Suggestions to AvoidDuplication and employ OnceAndOnlyOnce refer to developer-authored mechanisms. Essentially, they state that an object (in the broad sense, which could include classes, functions, templates, table definitions, etc.) should be defined in one place. Defining it in more than one place is redundant, and likely to result in anomalous behaviour when (say) definition A gets updated but definition A' does not. It requires additional, otherwise-unnecessary developer effort to maintain redundant definitions. However, this does not apply to data-preservation and validation mechanisms such as redundant storage, multiple data-entry, backups, etc., which rely on duplication to ensure reliability and/or longevity.
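
A minimal sketch of the A versus A' anomaly described above, in Python. The username rule, the function names, and the length limits are all hypothetical, chosen only to illustrate what happens when one copy of a definition is updated and the other is not.

```python
# Hypothetical example: a username rule defined twice, then once.

# Duplicated: definition A was updated to allow 32 characters,
# but definition A' still enforces the old limit of 16.
def valid_username_at_signup(name):
    return name.isalnum() and 1 <= len(name) <= 32  # definition A (updated)


def valid_username_at_login(name):
    return name.isalnum() and 1 <= len(name) <= 16  # definition A' (stale)


# OnceAndOnlyOnce: a single definition shared by every call site.
MAX_USERNAME_LENGTH = 32


def valid_username(name):
    return name.isalnum() and 1 <= len(name) <= MAX_USERNAME_LENGTH


name = "a" * 20
print(valid_username_at_signup(name))  # True
print(valid_username_at_login(name))   # False: this user can sign up but never log in
print(valid_username(name))            # True everywhere, because there is one rule
```

None of this says anything about backups or redundant storage: the copies being avoided here are copies of the rule, not copies of the data.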

Wow, Donald! That straw man is burned!

Seriously, I've never once heard people suggest OnceAndOnlyOnce applies to family photos, or that DuplicatesAreBad applies to hard drives. If anything, language or library support for those would probably fall under the ZeroOneInfinity rule.

My take on this is that you're confusing software (or any design, for that matter) with instances of a design.

Software must remain simple. Multiplicity inside software is a horrific thing, a punishment not even the folks at Gitmo considered for its inhumanity. You may think it enjoyable, perhaps even employment preserving, to maintain software rife with duplication and unnecessary lack of abstraction. I do not. I have far better things to do with my time, like actually solve customer problems.

However, instances of a design may multiply up to the limits of its ecosystem. You can have any number of threads executing the same binary image as you want, and it might even be helpful, as long as you don't run into the server's memory bottleneck. Even if you never do, you can only support so many threads per process before the kernel runs out of threads for that process.
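
The design-versus-instance distinction can be sketched in Python, assuming a hypothetical `handle_request` function standing in for "the binary image". The work is defined once; how many threads run it is a separate, runtime decision.

```python
import threading


def handle_request(request_id, results, lock):
    # One definition of the work; every thread executes this same code.
    with lock:
        results.append(request_id * 2)


def run_instances(count):
    # Start `count` threads that all run the single definition above.
    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=handle_request, args=(i, results, lock))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


print(run_instances(4))  # [0, 2, 4, 6]
```

Raising `count` multiplies instances without adding a single line of duplicated source; the practical ceiling is whatever memory and thread limits the host imposes.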

Can you imagine how big the Linux kernel would be if we didn't have the subroutine abstraction? You think Windows was big?

Of course, ideally, you want multiple species to balance each other out. Humans, left to their own devices, are an unbelievably destructive species. But, we're not alone. Invasive species in the swamps of Florida are showing an incredible ability to decimate local fish stocks. Conservationists are powerless to stop them. Why? Because their natural predators don't exist to keep them from wildly affecting the local ecosystem. Once new predators exist, a new balance will be reached, allowing for more equitable resource sharing amongst the species.

Unfortunately, nobody in the software community has yet learned how to make programs compete for resources in a manner like plants and animals do in the real world. I'm not entirely sure I'd want that either -- think about it -- you're happily browsing your web pages when all of a sudden portions of your browser window start to wig out. Curious, you look into what's happening, and you see that your e-mail client and the desktop itself are fighting the browser for RAM. Eventually, the "pack" of software defeats the browser, and it just dies a slow, miserable death. The desktop and e-mail client consume the image formerly held by the browser, thus assimilating its bits into themselves for their own use. Sound preposterous? I sure hope so.

While people in the software community don't have much interest in writing programs that eat pieces of one program then later relieve themselves upon another, the software community has, in fact, come up with a number of both competitive and cooperative approaches to resource management. I do suggest, however, that we want the software world to be considerably less scary and more intellectually tractable than our own. Safety, security, transactions, verifiable trust, understandable economy or market-based principles for resource management, etc. are much preferable to viruses, worms, sniffers, zombies, etc.
- Computers presently manage their resources far more efficiently than any market-based solution ever could. This is because computers run via dictatorship, and are generally unbiased; the kernel knows all, and therefore can make superior decisions. The whole reason dictatorships do not work in the RealWorld is because humans don't have all the information they need to make (good) decisions, and are easily biased to support one group over another (and often already are when they come to power).
  - Dictatorships don't work in computer systems, either, at least not any more generally or efficiently than they do in human societies. First, kernels do not "know all" - critically, they (a) do not at implementation time know exactly what programs will be running atop them, and (b) cannot at runtime generally confirm or deny any interesting properties with which to make better decisions (RicesTheorem). Second, while it may be possible to effectively apply imperial resource management when dealing with very well defined systems (e.g. programming on embedded systems without any desire for runtime upgrade or ad-hoc runtime observation), competition and such get involved the moment independently developed or controlled processes with differing goals start demanding access to a common set of resources (resources including sensors, actuators, energy, CPU, memory, bandwidth, priority scheduling, etc.). We would do well to recognize this competition and tame it (e.g. with market-driven resource policies, contracts, credit, trust, accounting, auditing, etc). I suspect doing so will allow far more efficient and effective resource management - both competitive and cooperative - than is currently achieved by modern imperative approaches.
RE: 'kernels do not "know all"': But the kernel knows what resources are available, and when to dispense them to requesting applications. [Irrelevant. A kernel that merely services resources upon request is not 'making decisions' at all, much less intelligent decisions based on having 'all' the information.]
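
For readers wondering what a "market-driven resource policy" might even look like, here is a toy sketch in Python. It is not how any real kernel allocates memory; the function `allocate_by_bids`, the credit balances, and the bid format are all assumptions made up for illustration. Each process bids a price per unit, the highest bidders are served first, and nobody can spend credit they do not have.

```python
def allocate_by_bids(capacity, bids, credits):
    """Toy allocator (hypothetical).

    bids:    {process: (units_requested, price_per_unit)}
    credits: {process: balance}, debited in place.
    Returns  {process: units_granted}.
    """
    grants = {}
    # Highest price per unit first; ties broken by name for determinism.
    ordered = sorted(bids.items(), key=lambda kv: (-kv[1][1], kv[0]))
    for name, (units, price) in ordered:
        affordable = credits[name] // price if price > 0 else 0
        granted = min(units, capacity, affordable)
        grants[name] = granted
        credits[name] -= granted * price  # accounting: pay for what you got
        capacity -= granted
    return grants


credits = {"browser": 100, "email": 30, "desktop": 50}
bids = {"browser": (40, 2), "email": (20, 3), "desktop": (30, 1)}
print(allocate_by_bids(64, bids, credits))
# {'email': 10, 'browser': 40, 'desktop': 14}
```

Here the e-mail client bids highest but runs out of credit after 10 units, the browser gets everything it asked for, and the desktop gets the leftovers. Whether a scheme like this would beat a conventional scheduler is exactly what is being argued above; the sketch only shows that the competition can be made explicit and auditable.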

Isn't that how the original AmigaOperatingSystem worked? (Sorry, joke... I'll get my coat.)

Not really; it was closer to how Native American tribal life existed. As a rule, all applications respected the Great Spirit (exec.library), and all was peaceful. It usually wasn't until some newly loaded application failed to obey the Great Spirit, and failed to respect that the Earth doesn't belong to it, but in fact the opposite, that revenge was brought upon the peoples of the RAM in the form of a thunder bird (GURU Meditation Alert). OK, analogy and joke taken way too far now. Wait up for me!

This sort of thing might make for a great game of CoreWars, but it has no place when productivity is necessary.

"you're happily browsing your web pages when all of a sudden portions of your browser window start to wig out. Curious, you look into what's happening, and you see that your e-mail client and the desktop itself are fighting the browser for RAM. Eventually, the "pack" of software defeats the browser, and it just dies a slow, miserable death. [...] Sound preposterous? I sure hope so."

Nope, sounds pretty normal to me; that happens at least a few times a day. And sometimes my email client has a violent fling with the word processor that insists upon integrating with it and ultimately they both limp away and go into torpor to recover from their injuries. But then, I have to use Windows products and have to have antivirus etc. installed on my work PC.

Did I seem to say "do a single thing in many places in different ways", or "do many things in a single place"? I meant to say that duplication is not a BadThing. It is how we work. For example, browsing is a duplicated event. Millions of people right now are browsing some place on the web. But they are not all using the same kind of browser, nor are they looking at the same pages. Some prefer InternetExplorer, others Firefox, etc. To have more than three applications, or processes, or organizations do or accomplish the same task is not evil or, for that matter, inefficient. It is the differences in them that matter. The user "feels better" when using favorites. To AvoidDuplication and mandate the use of "OneForAll" and "AllForOne" is to crush SoftwareDiversity. This can be applied to all HumanActivity. Some spectators like football and call it Football, while others like soccer and call it Football. To mandate but one sport for all spectators is terribly wrong. To go further, I might even say "AvoidBuzzWords" as statements of the what, where, when and how you do anything. Buzzwords lose meaning each time they are invoked. Instead, look at the GoodThings in life with an admiring but critical eye. If you are building, sculpting, painting, programming, photographing, working at your occupation, cooking, or whatever, do it well, not just according to someone else's idea of "right". Go ahead, take picture number 49,000,001 of the Eiffel Tower. It is really not duplicating, it is originating. You will then be capturing more than a view of a structure; you will also be capturing a "moment", an experience, and an impression. See the following to see what I mean:

http://images.search.yahoo.com/search/images?_adv_prop=image&fr=slv8-yma4&va=eiffel+tower&sz=

-- Still ThinkingOutLoud.DonaldNoyes

I don't think anyone's opposing you on that view. I don't recall anyone advocating AvoidDuplication, DuplicatesAreBad, or OnceAndOnlyOnce as universal principles that apply to everything.

SoftwareEngineeringIsArtOfCompromise. All else being equal, AvoidDuplication. However, there are other factors and "design principles" to consider that may bump AvoidDuplication down a notch. Nature "loves" duplication of information; it's part of evolution, and so cannot always be "bad".