Generics Vs Subtyping

In the classic book ObjectOrientedSoftwareConstruction (which all ObjectWeenies should buy and read), BertrandMeyer devotes an appendix to discussion of generics (templates) vs subtyping (commonly conflated with inheritance; indeed Meyer believes that other ways of constructing subtypes aren't necessary).

It should be noted that this applies only to languages with StaticTyping - this discussion is moot for DynamicTyping.

In the chapter, he presents a fairly convincing argument that:

Adding generics but not subtyping to a language (which some FunctionalProgrammingLanguages do), is useful.
Adding subtyping but not generics to a language (which some OO languages, such as Java pre JDK1.5 do) is more useful.
Adding both to a language (such as CeePlusPlus, Meyer's EiffelLanguage, and the recently released Java 1.5) is more useful still. What is the meaning of Generic here? Generic function, as in CLOS, or Generic as in Java, which is really just template class? As in Java. See (*) below.

The crux of the argument focuses on container classes (though other issues certainly come into play). In a statically-typed language with generics but no subtyping; one can create homogeneous containers easily (such as a list of Integers or a list of Strings); but one cannot create heterogeneous containers without subverting the type system.

In a language with subtyping but no generics (i.e. Java), one can create heterogeneous containers easily (excluding Java arrays, which have their own problems, Java collections are all collections of Objects); creating homogeneous containers is possible, but a royal pain in the butt (and requires use of dynamic constructs like instance_of or downcasts). Java programmers with any significant experience are likely quite familiar with this.

In a language with both; one can easily create both heterogeneous containers and homogeneous containers. If one wants a list of ints in C++, one types list<int>. If one wants a list of Objects (assuming one has defined a class Object, which is the supertype of a bunch of classes of interest), one does list <Object *>. (CeePlusPlus has other problems complicating the issue, such as no universal supertype and the need to use explicit pointer/reference declarations; list<Object> is a BadThing if Object isn't a leaf class). Similar examples can be found in Eiffel and Java 1.5

In a recent edit of VbDotNetVsCsharp, it was said both DotNet languages will get generics. Are there conceptual differences in generics implemented in the two main competing platforms (Java and DotNet)?

If you compare Java and C# generics at the language level they don't seem too different but the actual existing implementations are different. Java uses TypeErasure which basically means the compiler does the work of inserting casts to and from the basic Object type wherever necessary. DotNet implements generics in a different way, trying to gain performance and space advantages by implementing actual parametrized types at the .NET level and not the language level. The paper at http://research.microsoft.com/projects/clrgen/generics.pdf explains it much better than I can. The basic tradeoff, as far as I can tell, is that Java generics can run on an unchanged virtual machine whereas .NET needs a new CLR for generics. --AndrewQueisser

And what is the big deal about generics? Is it a new solution to solve some problems and introduce others?

I look at it this way: for statically typed languages (read C++) they open up a new world (see "ModernCeePlusPlusDesign" or AndreiAlexandrescu's C++ papers) for dynamically typed languages they don't really add anything (and generally don't exist(?)). Java and C# are kind of in-between so there's much more controversy over whether Generics add anything. It probably depends on which side you originally came from. --AndrewQueisser

I wouldn't consider Java (know far less about C#) to be less statically-typed than C++. Both languages require that any feature projection on an object (method call, attribute read/write) be provably "correct" (i.e. the compiler must prove the feature exists and is of appropriate type)--this is a hallmark of StaticTyping. Neither C++ nor Java have anything resembling DoesNotUnderstand. --ScottJohnson
Agree, I didn't know how to express what I consider the in-betweenness of Java and C# so I'll leave what I wrote above. Since those languages have dynamic type checking casting to and, more importantly, from Object is standard practice. If all you want out of Generics are containers then I can see how you would frown upon adding Generics at all. --AndrewQueisser
Of course, C++ has downcasting (DynamicCast) as well; however with templates one needs use it far less often than in Java. Most statically-typed languages do have downcasting (a dynamic feature, to be sure). -- ScottJohnson

In dynamically typed languages such as Lisp, there is indeed a very important equivalent: ParametricPolymorphism (possibly in conjunction with macros for e.g. speed). This is exactly the same thing as Generic Programming, but not based on templates as it is in C++.

It has been quite widely discussed that this facility is a superset in power over OO, since one need not pick out one of the nouns as the special one to send the message containing the verbs and the rest of the nouns. (*) Wrong. It is true that generic function-style OO systems (such as CommonLisp has) are more expressive than single-dispatch OO systems (like Java has), but 1) both are OO, and 2) generic functions are not the topic of this page, rather generic collections.

Any language lacking this kind of facility, whether staticly or dynamically typed, is crippled in significant ways, and that's precisely why they finally decided it was important to add it to Java.

(Unfortunately it has disadvantages in C++, since the template names expand like old-school bad macros into a nightmare, but still, having generic programming in C++ literally transformed the entire language into something far, far more expressive than it had been previously with only OO facilities.)

So yes, it is indeed a big deal. It depends on the language details whether it introduces new problems as well. It did that in C++. I hear rumors that it might in Java, but I don't know the details.

The crux of the argument focuses on container classes

In my experience with C++ STL containers (vector, list, queue, etc.), I have been very disappointed with what is provided. The containers will not take derived objects, only pointers to the objects; leaving it up to the programmer to handle the actual object creation and deletion. The containers also do not provide any functionality beyond container access; the programmer still has to write lots of for loops to do an operation across the collection. I find C++ STL collection templates to be only slightly more useful than the C array ("[]") operation and find that for smaller collections (say less than 10 objects), it is often easier to create a class that explicitly contains the individual objects needed. There is still a lot of room for improvement for containers in C++ than what is currently found in STL or templates in general and it should be a language level primitive that does not require repetitive typing of loops for every desired container operation.

std::for_each and std::transform let you apply a function over the contents of a container. They take iterators, so you have a fair bit of flexibility in how you slice the container. Is this not adequate for your purposes? It should be. There's rarely a need to write loops by hand when using the STL, the paragraph above is from someone who doesn't get it and is just misleading - does it add anything useful or valid to this page?

Actually, this is a good point, and throws an interesting light on the nature of containers. Discuss at CppHeterogeneousContainers.

...the containers will not take derived objects, only pointers to the objects.

This is because of the semantics of C++. You actually can store a derived object in a container of base objects if you define the assignment operator, but the stored object will be "sliced", and any fields defined in the derived class will be discarded. The same applies to arrays and scalar variables.

[Can somebody defines: HomogeneousContainer and HeterogeneousContainer , thanks ]

A homogeneous container is a container which can only contain objects of the same type. Lists in HaskellLanguage, for example, are homogeneous--you can have lists of integers, lists of booleans, lists of characters, etc.... but you cannot have a list whose first element is an integer, whose second element is a character, etc.

A heterogeneous container is one whose elements can be of arbitrary type.

Of course, it should be pointed out that these (fully homogeneous and fully heterogeneous) aren't the only possible choices. In general, it is possible for a container to be parameterized on an element type; which restricts that container to containing elements of that type or subtypes. If you have a container whose element type is a leaf type (no subtypes, according to the semantics of the language's type system), that's a homoogeneous container. As HaskellLanguage has no concept of subtyping, that's all you get. On the other side of the coin, if you have a container whose element type is TopType (or type "Object" in many languages), you can stick anything you want in there. That's a heterogeneous container. SmalltalkLanguage's containers are all heterogeneous, as were the collection classes (excluding Array) in Java before the introduction of generics. Proper support of homogeneous containers requires GenericPolymorphism? in the language.

Homogeneous containers can be simulated with heterogeneous containers by use of external tests/conventions (to trap attempts to insert objects of the wrong type); however, the reverse is not true. However, doing so is a pain in the butt, which was the crux of Meyer's argument.

CategoryComparisons