There Aint No Such Thing As Premature Generalization

I don't define "generalization" as actually implementing the more general idea, but as thinking about it before you get stuck into coding - and that is always good! By my definition ThereAintNoSuchThingAsPrematureGeneralization.

Please elaborate. What is your definition of PrematureGeneralization and why do you say it doesn't exist?

How much grief would we have been saved if Microsoft's MFC C++ class libraries and wizard-generated code had been done properly? No, seriously. You've never seen so much cut-and-paste code as that generated by using MFC CDialog classes....

Every time you are stuck maintaining two versions of code in VSS that originated from the same source base, ask yourself whether, had someone thought about generality a little earlier, you would still be stuck doing twice the work you should have to do.

Oh, let's get another thing straight. Generalization is NOT adding lots of bells and whistles so that your code can do anything that might be asked of it in the future. It IS reducing the size of your code so that less code does more. Code should build in functionality exponentially; each extra unit of code should multiply the functionality by a fixed factor (dammit, that's what "well factored" amounts to). Most code actually grows in functionality worse than linearly with the addition of more units of code. Clearly something is wrong with the generality of the plan of attack.
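
For what it's worth, here is a tiny hypothetical C++ sketch of "less code does more" (the field names and length limits are invented for illustration): three near-duplicate checks collapse into one routine, so each new field costs a call, not another function.

  #include <cstddef>
  #include <string>

  // Before: one function per field, each a near-duplicate of the others.
  bool validName(const std::string& s)  { return !s.empty() && s.size() <= 40; }
  bool validCity(const std::string& s)  { return !s.empty() && s.size() <= 60; }
  bool validState(const std::string& s) { return !s.empty() && s.size() <= 2;  }

  // After: one routine carries the shared logic; each field is now just data.
  bool validField(const std::string& s, std::size_t maxLen) {
      return !s.empty() && s.size() <= maxLen;
  }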

If Microsoft practiced refactoring, they could fix the MFC CDialog classes. Refactoring is the antidote to the problems that YouArentGonnaNeedIt poses for traditional software engineering. And OnceAndOnlyOnce is the rule that makes you reduce the size of your code so that less code does more.


Business

Sometimes, a business model requires you to give up refactoring at the time you release the API. This is no doubt a major reason for this sort of problem (which is exhibited not only by Microsoft's code, but also Apple's and X Windows). -- DanielKnapp

This may be a little off topic for this page, but why would releasing the API set have any effect on refactoring? The intent of refactoring is to change the structure of the code without changing its functionality or interfaces.

Perhaps what is meant is that the 'design' stops being refactored, where refactoring of a design is a process of changing the specifications (e.g. the API) without changing its intent or goals.

Precisely. Whether interfaces change depends on where one draws them. Suppose that you find a snippet of code which is used in almost every Windows application, and move it into the library. This would be a refactoring of the system formed by taking Windows and its applications together. Yes, it's a very loose usage which misses the point in a number of ways. -- DanielKnapp


Not sure what point is being made, but anyway I love this page because it made me think and it made me release a rule that was too rigid. When we say PrematureGeneralization, we usually mean, more precisely, premature commitment to a generalization that has not been tested enough. That one deserves the negative connotation. But I think I agree with the original author that thinking ahead and anticipating generalities is a good thing. There is no practical advice I can shake out of this insight, however, because I observe from the writings on this Wiki that even extremos do this generalization look-ahead, although they deny it on a stack of bibles. "Code that's poised for change" anyone? -- WaldenMathews

Extremos do not deny anticipating generalization. We just do it when it is simpler, or we need it. -- JeffBay

Thinking ahead about generalizations is fine. Committing a thought-ahead generalization to code is risky. Better to validate the generalization through actual use. --DaveSmith (Original Author)


Premature Generalization Examples, Anyone? Are They Really Generalizations?

By the way, does anyone have an example of so-called "PrematureGeneralization" that isn't actually an example of overspecialization? All these "new base classes/interfaces for an existing class" [without existing code that already needs just these interfaces] don't qualify because there is no generalization at all, just a premature specialization of the usage of the existing class. Any time you create a relation between two classes, you specialize, not generalize. -- NikitaBelenki

Some examples I have seen (and even done myself): Methods overloaded to handle virtually any parameter set, even though only one or perhaps two of the overloaded methods are actually called. Specific requirements being extended because "it is easy to add these extra capabilities while we are at it." User interfaces where the same operation (often with different results) can be done in a myriad of places.

But where is the generalization in these examples? Not every addition of functionality is generalization.

I would call creating unnecessary method overloads to solve some undefined general problem, as opposed to creating only the methods with the minimum required signatures, premature generalization of the methods. Extending features to solve as yet unreported problems is also a premature generalization of the feature. Putting access to features willy-nilly throughout the interface, as opposed to putting access where the user work flow dictates, is premature generalization. In all of these cases, a more general solution than the problem requires is being put into place.
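
A hypothetical C++ sketch of the overload case (the log function and all of its signatures are invented here): only the first form has any callers; the others exist to solve an undefined general problem.

  #include <iostream>
  #include <string>

  // The one signature the current callers actually need:
  void log(const std::string& msg) { std::cout << msg << '\n'; }

  // Speculative overloads for callers that don't exist yet -- each is code
  // to write, test, and maintain against a problem nobody has reported.
  void log(const char* msg) { log(std::string(msg)); }
  void log(int code)        { log(std::to_string(code)); }
  void log(const std::string& msg, const std::string& tag) { log(tag + ": " + msg); }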

What I cannot see there is a more general solution. I see just a more specialized solution for an even more specialized (although broader) problem. -- nb


Definition Of Generalization:

According to Merriam-Webster:

  Main Entry: 1gen·er·al
  Pronunciation: 'jen-r&l, 'je-n&-
  Function: adjective
  Etymology: Middle English, from Middle French, from Latin generalis, from gener-, genus kind, class -- more at KIN
  Date: 14th century
  1. involving, applicable to, or affecting the whole
  2. involving, relating to, or applicable to every member of a class, kind, or group <the general equation of a straight line>
  3. not confined by specialization or careful limitation
  4. belonging to the common nature of a group of like individuals : GENERIC
  5. a : applicable to or characteristic of the majority of individuals involved : PREVALENT; b : concerned or dealing with universal rather than particular aspects
  6. relating to, determined by, or concerned with main elements rather than limited details <bearing a general resemblance to the original>
  7. holding superior rank or taking precedence over others similarly titled <the general manager>

So why do you call the solution "more general" if it doesn't move away from more specialized code to a less detailed one, but instead extends the same specialized code with additional, equally specialized code? -- nb

Suppose we have two examples of added functionality, expressed in the terms of documentation, interface specifications, source code and/or unit tests as:

  1. A ==> A || B;
  2. A && !B ==> A.

The added functionality is the same. But there is no generalization in the first example. And in the second example there is generalization in the documentation, interface specifications, source code and/or unit tests accordingly (perhaps at the price of added specialization in some other parts of the project, perhaps not).

So when the documentation says "you now can call method A with your specific parameter set, for which it will work just as method B", there is no generalization. But there is generalization when the documentation says "your parameter set is no longer specific".

If we have the function

  void swap(int* a, int* b) { int c = *a; *a = *b; *b = c; }
and add
  void swap(double* a, double* b) { double c = *a; *a = *b; *b = c; }
there is [almost] no generalization. There is just addition of some code, which is premature if we don't work with doubles right now. If we replace both these functions with
  template<class T> void swap(T* a, T* b) { T c = *a; *a = *b; *b = c; }
there is generalization, which is premature if we work only with integers in our program at this moment. But in this case the PrematureGeneralization is not necessarily evil.

The problem with PrematureGeneralization is that with our not-so-perfect tools the generalization in one place often increases the specialization in other places. For example, the void swap(int* a, int* b) { int c = *a; *a = *b; *b = c; } was valid code in both C and C++, but the template code is valid only in C++.

So it is not the PrematureGeneralization itself that is evil, but some additional [premature] specialization which we may need to add in some other places when we want to make our code more general "just in case". -- nb


OK Here's an Example:

When I wrote a HHLAPI 3270 screen scraping framework, I put methods in for all 3270 functions. Like, SendF1, SendF2, SendF3, etc. Yes, I knew that we were not currently using all the function keys in the current application.

But we weren't using XP, and I knew the application programmers: They wouldn't have the good sense to put methods in the right classes and refactor.

Where's the generalization? Looks like special casing to me!

(Yes training is a good thing and I was doing it, but they were still struggling with the idea of a "class".)

I wouldn't call this PrematureGeneralization. I would say that this is more of a case of making a (perfectly valid) assumption that you ARE going to need it. Now on the other hand, had you generalized your framework to scrape ASCII datastreams from arbitrary places (like say RS-232) and also thrown in a general SNA communications framework capable of handling LU 6.2 in addition to LU 2.0, I'd say you had gone quite a bit too far...
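
For concreteness, a hypothetical sketch of the two shapes being contrasted here (the class and method names are invented, not taken from any real HLLAPI binding):

  class Screen3270 {
  public:
      // Special-cased: one method per key, 24 near-identical bodies.
      void sendF1() { sendAid(1); }
      void sendF2() { sendAid(2); }
      void sendF3() { sendAid(3); }
      // ... and so on through sendF24()

      // Generalized: one method; the key number becomes a parameter.
      void sendFunctionKey(int n) { sendAid(n); }

  private:
      void sendAid(int n) { (void)n; /* emit the 3270 AID code for key n */ }
  };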


Story so far:

Where do we stand? Is what's being argued here reaching consensus, or are we emerging with multiple definitions of Premature and Generalization?


Using Templates To Generalize:

The template swap is, in a way, more complex, more indirect, harder to understand and harder to change. Why pay these costs earlier than we have to?

I don't see why you would think it is more complex, more indirect, or harder to understand. In their practical form (references instead of pointers; that kills the compatibility with C, though) I find the template form even more understandable than the int one, because the former doesn't contain unnecessary information about some specific types.
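
For reference, a minimal sketch of that "practical form" (references instead of pointers; essentially the classic std::swap, and C++ only):

  template<class T>
  void swap(T& a, T& b) { T c = a; a = b; b = c; }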

(By harder to change I mean we could rewrite the int-specific version to use int-specific features, such as xor or subtraction, if we wanted. We can't do that with the more general one. By over-generalizing we have added unnecessary constraints to the implementation.)

Fortunately, in C++ you can always specialize any template as you wish for your particular type, if you ever need to do that. Which probably shows that with different development tools the costs of PrematureGeneralization are different. And I would prefer tools that further reduce the costs of PrematureGeneralization, because it means that in my projects I could reuse more code written by others. -- nb
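
A minimal sketch of that escape hatch, folding in the int-specific xor trick mentioned a few paragraphs up (shown only to illustrate the mechanism, not as a recommendation):

  template<class T>
  void swap(T* a, T* b) { T c = *a; *a = *b; *b = c; }

  // Explicit specialization for int, using a trick the general template
  // cannot express.
  template<>
  void swap<int>(int* a, int* b) {
      if (a == b) return;  // xor-swapping a value with itself would zero it
      *a ^= *b; *b ^= *a; *a ^= *b;
  }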

This is actually exposing the issue that if the compiler doesn't support templates, it may already be too late to generalize... I mean the designer of the language and compiler should have considered such generalization. If a template-like mechanism isn't already in the language you use (or a hook in place to make it easy for you to add such) when you are considering such generalization, the generalization in question hasn't been done sufficiently prematurely.


What about the use of template templates as policies in standard C++ code, as in for example Modern C++ Design by Alexandrescu? How does this fit into the PrematureGeneralizationIsEvil vs ThereAintNoSuchThingAsPrematureGeneralization debate? -- BillWeston

If you have to generalize, template templates are a useful tool. -- DaveHarris
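
A toy sketch in the spirit of Alexandrescu's policy classes (this is not his actual code; Stack and HeapStorage are invented names): the host class takes its storage policy as a template template parameter, so it can instantiate the policy with its own element type.

  #include <vector>

  // One possible storage policy; a fixed-capacity array policy, say,
  // could be swapped in without touching the host class.
  template<class T>
  struct HeapStorage {
      std::vector<T> items;
  };

  // The host class, parameterized by a template template parameter.
  template<class T, template<class> class Storage = HeapStorage>
  class Stack {
      Storage<T> s;
  public:
      void push(const T& v) { s.items.push_back(v); }
      void pop()            { s.items.pop_back(); }
      const T& top() const  { return s.items.back(); }
      bool empty() const    { return s.items.empty(); }
  };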


Is YouArentGonnaNeedIt Too Extreme?

The concept of judgement based on experience seems to be lost. Trying to include the transitive closure of all implied generalizations, as in BigDesignUpFront, has clearly failed. The YouArentGonnaNeedIt camp strikes me as naive and simplistically restrictive. There's a place for using one's experience to know where generalization will provide beneficial leverage in the future. And if you are wrong, so what? You can always adapt. Any decision you make is by definition wrong looking backwards.

Harder to modify incorrect Generalization

It is usually more difficult to modify something that is incorrect than to add the correct version from scratch. Also, any time you spend working on something not needed now is time taken away from working on something that is needed now. I believe these are the two major arguments in favor of YouArentGonnaNeedIt.


I don't define premature generalization as actually implementing the more general idea, only thinking about it before you get stuck in to coding and that is always good!

I had something very specific in mind when I coined the phrase "PrematureGeneralization". Unfortunately, as with many coined phrases, people hear the phrase and come away with differing impressions. In retrospect, "premature commitment to a generalization" may have better captured my intent, which was to describe a behavior that many of us have observed in others (and in ourselves). That behavior is to code for a general case before we've coded for a specific case. It includes pre-factoring abstract superclasses before there's a demonstrated need, and adding abstract methods before there's a demonstrated need (in the as-yet ungrounded belief that someone will eventually need to override the methods).
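
A hypothetical C++ sketch of that behavior (all names invented): an abstract superclass pre-factored before a second subclass exists, full of hooks nobody overrides.

  class Report {                       // abstract "just in case"
  public:
      virtual ~Report() {}
      virtual void render() = 0;
      virtual void exportPdf() {}      // hook nobody overrides
      virtual void exportXml() {}      // hook nobody overrides
  };

  class SalesReport : public Report {  // ...and the only subclass ever written
  public:
      void render() override { /* the one case actually needed */ }
  };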

One result of premature commitment to a generalization is that you're supporting code that isn't used. I've joined projects and found that 1/3rd of the code base wasn't reachable, all because one or more programmers had tried to write class libraries that would solve all possible future needs. It is really, really tempting to give in to the misplaced belief that "... as long as I'm here, I might as well add the other methods that people might need at some point."

And worse, premature generalization locks in untested intellectual structures that will tend to shape or constrain people's future thinking. If you've ever been involved in a project where a team member got ahead of the team, building an elaborate class library before the team has finished designing the code that's going to need that library, you know how much this situation can suck. Naive managers don't want to see the prematurely-generated class library reworked or thrown away ("we don't have time!") and you find yourself handcuffed, and possibly forced into writing even more code to solve impedance mismatches.

It's far better to solve specific cases first, and then let the generalizations emerge through refactoring. Don't waste time writing code you don't need, and don't base your belief about "need" on forward thinking alone. The future is uncertain, and there are limits to our horizons. -- DaveSmith


I find Dave's advice helpful, especially the part about preconceived ideas constraining future thinking. And it's especially difficult to generalize without forming a commitment to these generalities because, after all, they are invented to save us work. And once I'm sold that my workload has been lightened, I don't easily go back. Yet, ...

Observe yourself carefully as you craft some code. Oh, did you know it was a "for" loop before you enumerated a bunch of cases? Chances are, you leap right to that kind of generalization quite routinely, and you are correct most of the time. I believe my tendency to generalize stems mainly from the success I enjoy from small generalizations. And bigger ones that pan out are nice, too. There are times, in fact, when I cannot will myself to think through and code each case in detail before I jump to generality. It's a faith thing. With experience, you just know when you know.

So, this is one of those organic "balancing act" things again, the kind of stuff we like to polarize about and slam back and forth like a tennis ball until the sun sets in the west. But check it out. If you're not generalizing prematurely, you're not learning to generalize earlier and better, and you're not getting experience from your time on the job. We talked before about commitment. The trick is not "don't commit"; the trick is not "don't generalize early and often". The trick is to let it go when it sucks. -- WaldenMathews


It is usually more difficult to modify something that is incorrect than to add the correct version from scratch.

My assumption has been that generalizations are generated from real use cases. If that is the case, then they shouldn't be wrong. I would never advocate adding something just to add it. Nor would I advocate waiting until the exact moment it is needed. You only need the brakes after the car starts rolling. It wouldn't be smart to invent brakes after the car is going 100 MPH. This is where design comes in. This is where experience comes in. This is where judgement comes in. In case you think the car analogy is crap: in my first downhill vehicle I did not think about brakes (duh) and paid the price. You can bet that in the next version I put in some brakes; I didn't wait. There are lots of situations like this. Your use cases come from both immediate demands and from experience. Denying experience is, in my experience, dumb.

Until you know the potential speed and mass of the car, you cannot design the brakes, even though you may know you need them. Doing so beforehand would be premature.

Also, be sure you do not confuse developer experience with user experience. They are two vastly different things.


I'm wondering if what some regard as PrematureGeneralization isn't in fact simply RequirementsGoldPlating? done by an implementor instead of an analyst. By the way, I regard it as one of the most hopeful aspects of XP that it removes the artificial distinction between the two, since ItsYourCode? (could someone please change this to the right WikiWord?). -- SteveHolden


See also: BargainFutureProofing, BeginWithTheEndInMind (contrast with)

