Polymorphism Limits

Things that can go wrong with or hamper PolyMorphism

New operations (methods) may be more common than new "subtypes". This may require visiting more code spots to change than other approaches (such as case/switch statements). Using the classic "shapes" example, we have to visit potentially every shape class to add new methods such as "findCenter" and "fill".
The "options" or sub-types are no longer mutually-exclusive. In other words, an "is-a" relationship changes into a "has-a" relationship. Polymorphism results in a high change-cost (DiscontinuitySpike) going from "is-a" to "has-a". Related: SetsAndPolymorphism, and explored further below.
Multiple orthogonal interweaving factors come into play such that there is no simple taxonomy anymore. MultipleDispatching may be needed. Some feel that databases, or at least PredicateDispatching, better facilitate multiple dispatching.
GranularityOfVariation-related changes are costlier.
Doesn't separate selection (request) of features from implementation of the features and variations. In other words, I want to store and ask for features in a different place than where I implement them, mostly to make the "feature accounting" more compact: separate What from How. I've yet to see polymorphism do this well. Generally more complex OOP patterns are needed in such attempts, taking us into issues beyond just polymorphism (this topic). -t
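
To make the MultipleDispatching item above concrete, here is a minimal sketch of table-driven dispatch on two argument types. The Circle and Rect classes, the handler functions, and the overlap() helper are all hypothetical names invented for illustration; the strings stand in for real geometry code. The point is only that once behavior depends on a combination of types, a lookup keyed on the combination replaces a single-receiver method.

```python
class Circle:
    pass


class Rect:
    pass


# Hypothetical handlers, one per combination of argument types.
def circle_circle(a, b):
    return "circle/circle test"


def circle_rect(a, b):
    return "circle/rect test"


def rect_rect(a, b):
    return "rect/rect test"


# The dispatch table: behavior is selected by the pair of types, not by one receiver.
OVERLAP = {
    (Circle, Circle): circle_circle,
    (Circle, Rect): circle_rect,
    (Rect, Rect): rect_rect,
}


def overlap(a, b):
    handler = OVERLAP.get((type(a), type(b)))
    if handler is not None:
        return handler(a, b)
    # Try the swapped order so (Rect, Circle) reuses the (Circle, Rect) entry.
    handler = OVERLAP.get((type(b), type(a)))
    if handler is None:
        raise TypeError(f"no overlap test for {type(a).__name__}/{type(b).__name__}")
    return handler(b, a)


print(overlap(Rect(), Circle()))  # circle/rect test
```

Because the table is plain data, it could just as well live in a database, which is roughly the argument made by those who prefer databases or PredicateDispatching for this job.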

Using the classic "shapes" example, we have to visit potentially every shape class to add new methods such as "findCenter" and "fill".

But that's EssentialDifficulty rather than AccidentalDifficulty: the algorithms to fill or find the center of a shape have to go somewhere, and are quite different for each shape. Although the code to fill shapes would be more localized if a case/switch were used, its overall complexity is the same. Anyway, a good IDE should be able to show you all of the instances of a given method in a class hierarchy, however they are distributed in physical source files.
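
A small sketch may help both sides of this exchange. It assumes two hypothetical shape classes, Circle and Rect, and uses "area" and "perimeter" as stand-in operations (not "findCenter" or "fill" from the text). The area operation is laid out polymorphically, one method per class; the perimeter operation is laid out case/switch style, one function with a branch per shape.

```python
import math
from dataclasses import dataclass


# Polymorphic layout: each shape class carries its own version of the operation.
@dataclass
class Circle:
    radius: float

    def area(self) -> float:
        return math.pi * self.radius ** 2


@dataclass
class Rect:
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height


# Case/switch layout: one function per operation, one branch per shape.
def perimeter(shape) -> float:
    if isinstance(shape, Circle):
        return 2 * math.pi * shape.radius
    if isinstance(shape, Rect):
        return 2 * (shape.width + shape.height)
    raise TypeError(f"unknown shape: {shape!r}")


print(Rect(2, 3).area())      # 6
print(perimeter(Rect(2, 3)))  # 10
```

Adding a third operation in the switch layout means writing one new function. Adding it in the polymorphic layout means editing every class. Adding a third shape reverses the costs. Either way the same per-shape formulas have to be written, which is the EssentialDifficulty point.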

Polymorphism often involves splitting things into "buckets" (subtypes) or a list of mutually-exclusive options. The problem is that the division factor ideal for one situation may not be ideal for another. A typical example is dividing Employees into Manager and nonManager. However, you may have other operations that don't care about that division. Instead they may want to divide by fullTime and partTime. Another way to say this is that there is no such thing as a universal taxonomy, yet polymorphism tends to depend on universality. (A mild form of EverythingIsRelative philosophy.)

This is only a problem in a statically type checked language. Even there it's possible to conceive of a language that uses TypeInference to figure out all the types a procedure supports rather than requiring that they have an explicitly declared common supertype (see NominativeAndStructuralTyping). This is really the more general problem of InheritanceLimits?.

Another issue comes up when two sub-types need to become a combination (non-mutually-exclusive). You might divide up bank accounts into Checking and Savings, but later have a "type" that can be both. If you keep doing this, you run into LimitsOfHierarchies and/or have to shuffle around method positions on the inheritance tree. An "is-a" becomes a "has-a". Has-a is better future-proofing in my opinion. It is too hard to convert is-a into has-a in many cases.
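
As a rough illustration of the is-a versus has-a difference, here is a sketch with made-up names and made-up fee amounts (FEATURE_FEES is a hypothetical fee schedule, not taken from any real example). In the first half the account kind is fixed by the class; in the second half an account holds a set of features.

```python
from dataclasses import dataclass, field


# "Is-a" version: the kind is fixed by the class, so kinds are mutually exclusive.
class Account:
    def monthly_fee(self) -> float:
        raise NotImplementedError


class Checking(Account):
    def monthly_fee(self) -> float:
        return 5.0


class Savings(Account):
    def monthly_fee(self) -> float:
        return 2.0


# "Has-a" version: an account holds a set of features, so it can have both at once.
FEATURE_FEES = {"checking": 5.0, "savings": 2.0}  # hypothetical fee schedule


@dataclass
class FeatureAccount:
    features: set[str] = field(default_factory=set)

    def monthly_fee(self) -> float:
        # Sum the fee of every feature the account currently has.
        return sum(FEATURE_FEES[f] for f in self.features)


combo = FeatureAccount({"checking", "savings"})
print(combo.monthly_fee())  # 7.0
```

In the first version a combined account needs a new class (or multiple inheritance) and a decision about how the two fee methods combine. In the second, the combination is just another value of the features set. Whether the second version scales to richer behavior than a fee lookup is exactly what the rest of this page argues about.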

Why must one run into LimitsOfHierarchies? Can't your objects have sets of interfaces that they implement? (I believe this is how HaskellLanguage works, sort of.) That's not a hierarchy, but it can express the divisions described in the above paragraph.

Example?

I suppose you are confusing polymorphism with inheritance. (See ConfusionAboutInheritance)

Where does that impression come from? Note it does not say that all polymorphism suffers these things.

Because you are talking about "buckets" and inheritance trees. These are inheritance issues only remotely related to polymorphism. Polymorphism is not about dividing bank accounts into Checkings and Savings.

"Poly" involves multiple. Thus, you must divide things into multiple somethings to be a "poly". Under PolyMorphism we have the "utility" division, which perhaps may not apply, but the other one does.

In a dynamically typed language inheritance isn't needed at all for polymorphism. In Smalltalk I can define two classes that support #foo without requiring that they have a common superclass. The method #foo is polymorphic even though inheritance is not used. Problems only occur in a language with static type checking.
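
The same point can be sketched in Python, used here only as a stand-in for Smalltalk. The class names Duck and Robot and the helper call_foo are hypothetical; the two classes share no declared superclass beyond the implicit root object.

```python
class Duck:
    def foo(self) -> str:
        return "duck foo"


class Robot:
    def foo(self) -> str:
        return "robot foo"


def call_foo(things):
    # No common superclass is declared; any object with a foo() method works.
    return [t.foo() for t in things]


print(call_foo([Duck(), Robot()]))  # ['duck foo', 'robot foo']
```

The call t.foo() is resolved per object at run time, so foo is polymorphic without any inheritance relationship between the two classes.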

Inheritance is not the issue. The fact remains that you still have two or more things. Plus, what you speak of is possibly "utility polymorphism", and not inheritance polymorphism (see PolyMorphism for definitions).

But dynamically typed languages don't require the subtypes or buckets mentioned in the opening paragraph. Rather, every combination of methods creates its own bucket. You can emulate this in statically typed languages by limiting interfaces to one method each. The problem described on this page seems to be that groupings of methods (subtypes) from different perspectives sometimes collide. This is true, well understood by many, and viewed as an acceptable trade off for the power of grouping behavior into subtypes.

I would note that it says "often involves", not "always involves". As for the "power", most demonstrations are rather artificial if you ask me. Real world messiness often ruins real attempts. Besides, I subscribe to PutClassificationsIntoMetaData.

Classifications are meta-data. Anyone who classifies things is generating meta-data.

You lost me. This sounds like an EverythingIsEverything? argument.

You said you subscribe to PutClassificationsIntoMetaData as if there were some other alternative. All classifications are meta-data, so we all subscribe to that.

There appears to be a definitional dispute over "meta data".

Most of my types have little to do with the real world (the world of particles) and much to do with the imaginary machine I'm designing (the world of bits). I'm free to group behavior however I want.

And what happens when a later maintenance programmer is more comfortable with a completely different mental model than what you used?
[A good project manager will generally be cautious about assigning (for example) COBOL programmers to maintain LISP code, and vice versa. Also, the comfort level of a programmer is of considerably less importance than his or her ability to do the job. My comfort level with maintaining old VB & Access applications is nil, and they ram against my preferred mental models with the sound of an unending train-wreck, but I'm good at it and it needs doing so I do it. Only very poor programmers are so tied to a mental model that they cannot (grudgingly, perhaps) adapt to another one.]
I'm just saying that optimizing it for your head is not necessarily the same as optimizing the design universally. I find that managing "feature sets" via sets instead of classification hierarchies better fits my head because it creates a separation between implementation of the features and assignment of them to instances, but I cannot say it's universal among all humans. (And it makes maintenance easier IMO, but that's another topic: FeatureBuffetModel.) If the purpose of polymorphism is to fit *your* head, then its benefits may not scale to other people.
[Have you ever written an object-oriented program that employed inheritance and polymorphism?]
Only for relatively simple things where the difference/impact between polymorphism and non-polymorphic techniques is relatively small (and thus was not impressed). For mass variation management, I have always used a database. And even many OO proponents suggest switching to something "more powerful" than polymorphism, such as composition, when things scale in complexity. The "pattern movement" is partly a result of limitations of basic inheritance-based polymorphism.
Discussion moved to ComputationalAbstractionTechniques.
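
For readers unsure what "feature sets via sets" might look like, here is one possible sketch. All names (FEATURES, EMPLOYEE_FEATURES, describe, and the employees) are hypothetical, and the dict of sets merely stands in for a database table. It is not taken from FeatureBuffetModel; it only illustrates keeping the implementation of features apart from their assignment to instances.

```python
# Feature implementations live in one place...
def health_plan(name: str) -> str:
    return f"{name}: enrolled in health plan"


def approves_timesheets(name: str) -> str:
    return f"{name}: may approve timesheets"


FEATURES = {
    "full_time": health_plan,
    "manager": approves_timesheets,
}

# ...and the assignment of features to instances lives in another
# (a dict of sets here, standing in for a database table).
EMPLOYEE_FEATURES = {
    "Ann": {"manager", "full_time"},
    "Bob": {"full_time"},
    "Cy": {"manager"},  # a part-time manager: no subclass needed for the combination
}


def describe(name: str) -> list[str]:
    # sorted() keeps the output order stable, since sets are unordered.
    return [FEATURES[f](name) for f in sorted(EMPLOYEE_FEATURES[name])]


print(describe("Ann"))
# ['Ann: enrolled in health plan', 'Ann: may approve timesheets']
```

Note that the lookup FEATURES[f](name) is itself a form of dynamic dispatch, so this is arguably a different packaging of variation rather than an absence of polymorphism.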

I just find that behavior does not map well to the "sub-type" model. It is plagued by InterweavingOrthogonalFactors?. Regarding "grouping behavior", see OoGroupsBetterClaim?. [Some highly pleasant individual deleted these topics]. Anyhow, I think it's time to explore an example.

Mutually-Exclusive Status Change Example

(Moved from ObjectsHaveFailed)

Can you show an example of procedural code being superior to object-oriented code, when both have been written by a skilled developer? Can you provide evidence for the notion that "OOP seems fairly good at making the WalledGarden that a vendor wants to present." Also, could you explain what that means?

Keep in mind that I believe there is little "objective" in our industry to compare. I lobby for the idea of considering individual mind-fit (PsychologyMatters) rather than over-focusing on finding the One Right Way. That being said, here's a little toy example showing that case/if statements can require fewer code changes than polymorphism when a mutually-exclusive "choice" becomes non-mutually-exclusive: http://www.geocities.com/tablizer/bank.htm#fee

How often do you think changing a mutually-exclusive 'choice' to non-mutually-exclusive occurs? From experience, I can state that it occurs rarely and is not difficult to manage when it does occur. It is far more common to add new derived classes, especially if you're (as you should be) focussing on implementing computational mechanisms rather than DomainModelling. Rather than a contrived example, I suggest you compare case/if statements with polymorphism by converting PayrollExampleTwo to procedural code, then compare adding a new province in the existing ObjectOriented code with adding a new province in the procedural code. Hint: The latter is complex and error-prone. The former is trivial.

In domain modeling, I find things do often change to non-mutually-exclusive. I stop and ask myself: what's the probability of this changing to non-mutually-exclusive? It's either fairly high, or the domain is so set in stone that new CASE/IF alternatives are most likely infrequent, in which case the simplicity, mutually-exclusive-status-change flexibility, and ease-of-adding-new-operations of IF/CASE slightly outweigh objects. As far as "computational mechanisms" go, perhaps the trade-offs indeed tilt in favor of objects. I don't necessarily dispute that. Sometimes report columns are best managed as object instances, for example.

In a TableOrientedProgramming-friendly system, I'd prefer to see column attributes in table form rather than tons of set/gets. I find them hard to visually compare in set/get form, making maintenance and trouble-spotting harder. The change-pattern profiles that favor OOP are very narrow when one has a decent TOP tool-set. But we are comparing procedural to OOP here, not TOP to OOP. In case you don't know what I'm talking about, the reporting column table view may resemble:

Sequence..Title...................Bold..Shaded..Format
-------------------------------------------------------
10........Last Year North Sales...N.....N.......Std-Num
20........This Year North Sales...N.....N.......Std-Num
30........Change in North Sales...N.....Y.......Pct-Chg
40........Last Year South Sales...N.....N.......Std-Num
50........This Year South Sales...N.....N.......Std-Num
60........Change in South Sales...N.....Y.......Pct-Chg
70........Last Year Total Sales...Y.....N.......Std-Num
80........This Year Total Sales...Y.....N.......Std-Num
90........Change in Total Sales...Y.....Y.......Pct-Chg

It's a lot easier to read and notice things out of place in tabular format than OOP set/get format. Maybe you have transforming eyes/brains (FastEyes), but I don't. When one is dealing with many variations of things (object instances), then a tabular format can make development a lot smoother. There are patterns to the settings above that are easy to spot. I'd have to stare roughly four times longer with set/get format to spot the same patterns. Note that report columns somewhat resemble a local DataDictionary. -t
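
One way to read the claim about spotting patterns is that tabular attributes can also be checked mechanically. The sketch below copies the rows of the table above into a Python list; the tuple layout and the out_of_place() helper are assumptions made for illustration, as is the specific rule being checked (every "Change in ..." column is shaded and uses Pct-Chg, and no other column does).

```python
# Column attributes held as rows of data rather than as per-object set/get calls.
# Each row: (sequence, title, bold, shaded, format)
COLUMNS = [
    (10, "Last Year North Sales", False, False, "Std-Num"),
    (20, "This Year North Sales", False, False, "Std-Num"),
    (30, "Change in North Sales", False, True, "Pct-Chg"),
    (40, "Last Year South Sales", False, False, "Std-Num"),
    (50, "This Year South Sales", False, False, "Std-Num"),
    (60, "Change in South Sales", False, True, "Pct-Chg"),
    (70, "Last Year Total Sales", True, False, "Std-Num"),
    (80, "This Year Total Sales", True, False, "Std-Num"),
    (90, "Change in Total Sales", True, True, "Pct-Chg"),
]


def out_of_place(columns):
    """Return sequence numbers of rows that break the 'Change' pattern."""
    return [
        seq
        for seq, title, bold, shaded, fmt in columns
        if title.startswith("Change") != (shaded and fmt == "Pct-Chg")
    ]


print(out_of_place(COLUMNS))  # []
```

The same rows could of course be used to construct column objects, so the sketch says nothing about which run-time representation is better, only that the tabular form is easy to scan and to validate.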

"In domain modeling" ...

That's where it's going wrong.

I guess we have a vocab disconnect here. In a typical business application, what would be an example of a common/typical computation-space (non-domain) abstraction that we can dissect further from a polymorphic perspective? Column attributes could be considered either, depending on one's classification approach. (Readers, see OopNotForDomainModeling for some background on the domain/computation-space distinction.) -t

Typical examples: forms and form widgets; reports and report widgets; menus and menu items; classes representing database connections, queries, tables, views, stored procedures and query ResultSets, etc.

The above are instances of domain-specific "report widgets", are they not?

No. By 'reports and report widgets' I am referring to generic constructs like bands, page headers, page footers, report headers, report footers, headings, (report) columns, and so forth. These are generally instantiated as needed, and occasionally subclassed to implement one-off functionality.

Do you mean the classes supplied by the language vendor and/or commonly-used frameworks? Those are not classes that an application developer normally makes, at least not where they should be spending the majority of their time.

True. They're generally subclassed and/or instantiated to implement application functionality. Note that the 'domain modelling', as such, is primarily represented by the database schema. The application provides a user-friendly view of the database, which obviously may look nothing like the underlying database if it's implementing workflow, etc.

The classes the application developer creates are generally subclasses of the generic ones, e.g., a generic 'Form' class is subclassed to create a 'CustomerForm' form. However, what is notable is that although the database may contain a Customer table, there is unlikely to be any notion of a corresponding Customer class in the application. It is not needed. Creating it is indicative of falling into the DomainModelling black hole, or at least getting sucked into some dire pit of ObjectRelationalMapping. The application exists to collect, present, and sometimes manipulate facts; it does not exist to collect, present, and manipulate entities. Indeed, expressed this way, it can be seen that collecting, manipulating and presenting facts makes logical sense. Collecting, presenting, and manipulating entities does not make sense, unless the 'entities' in question are stamps, coins, cars, or some other tangible object. In short, business application development is about creating fact processing machines, whereas DomainModelling is about creating entity simulations.

And using a "handle" for those things often gives you pretty much the same thing, at least in a dynamic or type-lite language.

A 'handle for a thing' is an object reference, though it may be manifest in an environment without inheritance or polymorphism.
So is an employee number if you want to cast a wide definition of "object".
I'm not sure of the relevance here. I'm thinking of object references along the lines of 'that report' or 'this query', rather than a domain key.

That's what we did in the old days. OO improves name-space management slightly, but that was rarely a problem for non-large handle-based apps. And the more visual-intensive ones should be mostly markup-language-driven if you ask me. The engine is more sharable across myriad languages that way. If most of what is going on is attributes, then markup is the better option.

Nothing precludes creating a markup-driven system, of course. In the late 1980s, my company successfully competed against the local ExBase developers -- in fact pushing them almost entirely out of the marketplace -- using an in-house C-based DBMS and application development tool that employed an extensible markup for defining forms, reports, and database schemata.

In the early 1990s, we replaced it with a C++ based library of classes. It was no less productive than its predecessor (and probably a bit more productive, due to additional functionality), mainly because the class libraries were rich enough to permit a style of programming that for many applications consisted of little more than instantiating objects and setting attributes. It effectively expressed the same semantics of the markup language but using object oriented C++ syntax. What we lost in terms of development efficiency due to the terseness and specificity of the markup language, we gained in flexibility and rapid extensibility from working in C++. In other words, we didn't need to extend a markup language to add new functionality.

Without source code and specifics to examine, it's hard to verify your claim. I've seen ExBase shops that did more with one developer than a group of four C++ developers trying to re-create the same system in a migration plan. It was a Pakistani company called NovaQuest? if I remember correctly (it later merged and sold the name). C/C++ doesn't make a very good custom-business-app language in my opinion. It's too low-level: it lacks native strings, has anal memory management, and has key-words and type names that bury the actual var name in the middle, among other things. It's maybe good for embedded and systems software, but not biz apps. Sure, some C/C++ whizzes can probably make it buzz along, but on average it's the wrong niche for it. But I don't want to get stuck in an anecdote fight here. Let's get back to comparing specific polymorphism scenarios. -t

We used C/C++ to produce a custom DBMS and associated application development environment and/or libraries, and then used that product to build end-user applications. In other words, we effectively used C/C++ to build systems software for building applications.

In the case of C, we used it to create a DBMS, markup processor, and libraries to support business application development. C was used as the scripting language.

In the case of C++, we used it to create a DBMS and associated libraries to support business application development. C++ was used as the scripting language.

Our libraries were the crucial factor that made this approach feasible. They included functions (C) and classes (C++) for handling all aspects of the DBMS, user interfaces, report generation, and utilities such as (vastly) improved string processing. You can think of our C/C++ tools as having fulfilled the same application development role as, say, MicrosoftAccess, but without the UI-based click-n-drag screen/report/table painters and employing C or C++ instead of VisualBasicForApplications for the scripting language.

Using raw C or C++ to directly build business applications would have been (and would still be) onerous, time-consuming, failure-prone and mental-health-threateningly unpleasant.

Well, whatever happened to this great, wonderful, competitive, and profitable CRUD tool?

It targeted extended-DOS and SCO UNIX, so as Windows grew in popularity it became obsolete. In the mid-90s, we moved to using MicrosoftAccess to develop business applications for Windows, and we immersed ourselves in Web development. By 2000, the revenue from applications written using it, plus these other directions, allowed me to semi-retire to England and go into academia.

What does "type names that bury the actual var name in the middle" mean?

It's easier to spot or find the variable name if it's at the beginning of a line than in the middle.

Ah. I never would have guessed that was what you meant from a phrase like "type names that bury the actual var name in the middle", but there you go. I use 'search' or 'find' for finding variable names.

See Also: SwitchStatementsSmell (case-statements versus polymorphism), ThereAreNoTypes, LimitsOfHierarchies, DeltaIsolation, GranularityOfVariation

AugustZeroEight

CategoryPolymorphism