What Are Types

See TopOnTypes for a summary list of allegedly fallacious definitions.

From Wikipedia:

[...] A data type is a classification identifying one of various types of data [...], that determines the possible values for that type; the operations that can be done on values of that type;[...]
Data types are used within type systems, which offer various ways of defining, implementing and using them. Source: http://multimedia.dictionary.reference.com/browse/data+type
Different type systems ensure varying degrees of type safety.
Formally, a type can be defined as "any property of a program we can determine without executing the program." (Source: Programming Languages: Application and Interpretation, Shriram Krishnamurthi, Brown University)
A type system associates a type with each computed value. Source: http://en.wikipedia.org/wiki/Type_system
In [...] computer science, type theory is any of several formal systems that can serve as alternatives to naive set theory. Source: http://en.wikipedia.org/wiki/Type_theory

Type structure is a syntactic discipline for enforcing levels of abstraction. - JohnReynolds.

A type system is a syntactic method for automatically checking the absence of certain erroneous behaviors by classifying program phrases according to the kinds of values they compute. - BenjaminPierce, TypesAndProgrammingLanguages

"Kinds" is a synonym of "types" more or less, so this definition is circular.

Types are syntactic tools for managing abstraction: creating abstractions, using them, and checking their usage. That's what types are. Types are classifiers of valid expressions within the language. A type system classifies the legal phrases, constructed according to the grammatical rules of the language, into the type corresponding to each phrase or expression. Type judgements take an expression and derive its corresponding type; if the expression is ill-typed (i.e. it corresponds to no legal type), it is considered illegal.
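A minimal sketch of a type judgement in Java, assuming a hypothetical toy language with integer and boolean literals, addition, and if-expressions. None of these names come from the discussion above. The point is only that typeOf() derives a type from the shape of the phrase, without evaluating it, and rejects phrases that correspond to no legal type.

```java
// Hypothetical toy language; every name here is an illustrative assumption.
enum Type { INT, BOOL }

abstract class Expr {
    // Derives the type from syntax alone; throws if the phrase is ill-typed.
    abstract Type typeOf();
}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    Type typeOf() { return Type.INT; }
}

class BoolLit extends Expr {
    final boolean value;
    BoolLit(boolean value) { this.value = value; }
    Type typeOf() { return Type.BOOL; }
}

class Add extends Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    Type typeOf() {
        // Rule: e1 + e2 has type INT only if both operands have type INT.
        if (left.typeOf() != Type.INT || right.typeOf() != Type.INT) {
            throw new IllegalArgumentException("ill-typed: + needs two INT operands");
        }
        return Type.INT;
    }
}

class If extends Expr {
    final Expr cond, thenBranch, elseBranch;
    If(Expr cond, Expr thenBranch, Expr elseBranch) {
        this.cond = cond;
        this.thenBranch = thenBranch;
        this.elseBranch = elseBranch;
    }
    Type typeOf() {
        // Rule: the condition must be BOOL and both branches must agree.
        if (cond.typeOf() != Type.BOOL) {
            throw new IllegalArgumentException("ill-typed: condition must be BOOL");
        }
        Type t = thenBranch.typeOf();
        if (t != elseBranch.typeOf()) {
            throw new IllegalArgumentException("ill-typed: branches disagree");
        }
        return t;
    }
}
```

Note that the literal values are never inspected: the judgement depends only on how the phrase is built.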

Types typically form an algebra of types, starting with a core of constant types (aka primitive types) like boolean, int, char, etc., and applying composition operators to derive record types, functional types, object types (for OO languages), and so on, so forth.
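As a rough illustration of such an algebra, here is a Java sketch in which type expressions are built from a few primitive constants by two composition operators (record and function). The class names TypeTerm, Prim, RecordType and FunctionType are made up for this sketch and are not taken from any particular language definition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical representation of type expressions; all names are assumptions.
abstract class TypeTerm { }

// The core of constant (primitive) types.
final class Prim extends TypeTerm {
    static final Prim BOOLEAN = new Prim("boolean");
    static final Prim INT = new Prim("int");
    static final Prim CHAR = new Prim("char");
    private final String name;
    private Prim(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

// Composition operator: a record type built from named field types.
final class RecordType extends TypeTerm {
    private final Map<String, TypeTerm> fields = new LinkedHashMap<>();
    RecordType field(String name, TypeTerm type) {
        fields.put(name, type);
        return this;
    }
    @Override public String toString() {
        StringBuilder sb = new StringBuilder("{");
        String sep = "";
        for (Map.Entry<String, TypeTerm> e : fields.entrySet()) {
            sb.append(sep).append(e.getKey()).append(": ").append(e.getValue());
            sep = ", ";
        }
        return sb.append("}").toString();
    }
}

// Composition operator: a function type built from argument and result types.
final class FunctionType extends TypeTerm {
    final TypeTerm from, to;
    FunctionType(TypeTerm from, TypeTerm to) { this.from = from; this.to = to; }
    @Override public String toString() { return "(" + from + " -> " + to + ")"; }
}
```

For example, new FunctionType(new RecordType().field("x", Prim.INT).field("y", Prim.INT), Prim.BOOLEAN) denotes the type of a predicate over points.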

Thus a type system is a property of the (typed) formal language, but not of its interpretation. The typical mistake that leads people in wild directions (ThereAreNoTypes, etc.) is to start with the interpretation (typically the operational semantics) of some language and to look for types in there.

Run-time types certainly are an essential part of the OperationalSemantics of many languages. A clear example is EeLanguage, in which the semantics of types are defined in terms of coercions.

There are formal languages (such as Lisp, Smalltalk, etc.) that are untyped or, to say the same thing, whose type structure consists of only one type. For Lisp that type would be (uninterpreted) S-Expressions, while for Smalltalk that type would be Object. There's also the famous UntypedLambdaCalculus.

This definition is not precise. It does not define "classifiers", "level of abstraction", "legal phrases", etc. Nor is it worded in a practical sense.

It is made precise by the definition of each typed language. Read the definition of Standard ML and you'll have all you need. It cannot be more precise, because it is generic, encompassing a large variety of formal models that perform different tradeoffs with regards to the means and tools to construct type abstraction.

See JohnReynoldsFableOnTypes for the intuitive notion behind the definition.

Re: Type structure is a syntactic discipline for enforcing levels of abstraction.

Are you suggesting that ANY syntactic discipline that "enforces levels of abstraction" is typing? The definition needs to narrow down what is "type" related discipline and what is not. An SQL example:

  select * from foo A, (select x,y,z from bar) B where A.foo = 2 * B.foo

Here we define a virtual table "B" that is a stand-in for a longer expression. Thus, it is a syntactical abstraction and based on "rules". But, few would consider this an instance of "types".

PageAnchor: treedef

An example of a good definition is the definition of "tree". It is well-defined enough that one can apply a relatively short set of tests and/or rules to determine without conflict whether a given graph is a tree or not. ("A connected graph without cycles") It either passes or does not. It does not require "community feelings" and the like. And, it does not take an entire book to (allegedly) explain. If it does, then it is a poor definition, or at least a very hard-to-use one. Costin's proposed definition is not anywhere near as clean. I strongly suspect this is the case because behind all the math surrounding it, it ultimately is inherently coupled heavily to psychological interpretation. I would bet money on it if I had it. -- top

It seems to me that all classes have a type. The type is the sum of its behavior under all conditions. Moreover, this is true in early as well as late bound languages. You have to use DoubleDispatch in Smalltalk too, right? Am I really misunderstanding this? I don't know Smalltalk.

The type of a class may change every time that you modify the class. Classes are what we want our implementation constructs to be, but types are what the constructs are. It is interesting that we do UnitTests repetitively when we develop. I think that the reason that construction engineers do not do UnitTests over and over when they are making a building is because the types of their materials are fixed. We invasively change ours.

I don't get offended when anyone digs into C++. I hope that I don't offend anyone when I dig into class. I'm doing more of this creepy stuff in BlackBoxComponentry. -- MichaelFeathers

A type is a collection of domain values (AnswerMe: What is a DomainValue and how is it different from regular values? - see below for one answer). An object may exhibit value in more than one way through more than one representation, depending on context. Moreover, a given domain of values may be carved up into subdomains, in other words subtypes, in many different ways. Thus, types are simply categories for treating values as groups, and the choice of grouping and subgrouping is ideally selected entirely for its convenience in a given context.

For example, the set of integers may constitute a type. But in the context of a modulo 9 congruence, there are only 9 unique codes, so every integer can be reduced to a symbol 0 through 8; two integers that are congruent modulo 9 represent essentially the same domain value. Thus in an altered context, an integer value reduces to a nine-membered type. Regarding subtyping, the set of integers can be (exhaustively) partitioned in different ways. For example, negative and non-negative integers can be treated as subtypes. Or even versus odd, prime versus composite, and so on. In one context primeness may be regarded as a determiner of type, in another, divisibility by two.
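The modulo 9 case can be sketched in Java as follows. The class name Mod9 and its methods are invented for illustration. Two integers that are congruent modulo 9 produce equal Mod9 values, so in this context the integers collapse to a nine-membered type.

```java
// Hypothetical value class: an integer viewed only up to congruence modulo 9.
final class Mod9 {
    final int code; // always in the range 0..8

    private Mod9(int code) { this.code = code; }

    // floorMod keeps the result in 0..8 even for negative inputs.
    static Mod9 of(int n) { return new Mod9(Math.floorMod(n, 9)); }

    @Override public boolean equals(Object o) {
        return o instanceof Mod9 && ((Mod9) o).code == code;
    }

    @Override public int hashCode() { return code; }

    // Two of the many ways to partition the same integers into subtypes.
    static boolean isNegative(int n) { return n < 0; }
    static boolean isEven(int n) { return n % 2 == 0; }
}
```

Here Mod9.of(4) and Mod9.of(13) are the same domain value, while as plain integers 4 and 13 fall on opposite sides of the even/odd partition.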

Elsewhere on this page there are Circle versus Ellipse discussions. This problem is simply the result of tension between a static type system, which imposes a single interpretation of value and type on a mutable object over its lifetime, and a more flexible type system that the programmer actually reasons about based on categorizing the object's domain values. When the object's value is mutated, the dynamic interpretation of its type becomes at odds with the static type imposed by the inflexible language.

There are two solutions. One is to keep static types, but banish mutation. An object starts out as a circle or ellipse, and to keep that type, its value is prohibited from changing. If you want to transform one to the other, use functional programming: make a new object based on the properties of the old one, and assign it the appropriate type at creation time; if the new properties indicate a circle, then make it that. In a language like C++, this means choosing the right constructor in conjunction with operator new, in some factory-like function, for instance. Problem solved.
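A sketch of this first solution, in Java rather than C++, assuming hypothetical names (Shapes.make, withAxes). Values are immutable, and the factory picks the class that fits the properties at creation time. Callers are expected to go through the factory rather than the constructors.

```java
// Immutable ellipse value; a "change" produces a new object.
class Ellipse {
    final double semiMajor, semiMinor;

    Ellipse(double semiMajor, double semiMinor) {
        this.semiMajor = semiMajor;
        this.semiMinor = semiMinor;
    }

    // Returns a new shape; the factory decides whether it is a Circle.
    Ellipse withAxes(double a, double b) { return Shapes.make(a, b); }
}

// A circle is an ellipse whose axes are equal, and it stays that way
// because nothing can mutate it.
class Circle extends Ellipse {
    Circle(double radius) { super(radius, radius); }
    double radius() { return semiMajor; }
}

// Hypothetical factory-like function mentioned in the text.
class Shapes {
    static Ellipse make(double a, double b) {
        return a == b ? new Circle(a) : new Ellipse(a, b);
    }
}
```

With this arrangement Shapes.make(2, 2).withAxes(3, 1) yields a plain Ellipse, and the original Circle is untouched.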

Another solution is to throw away the static type system and use a dynamic type system - what is more, one that allows an object's type to change in response to mutations of value. When a circle's properties are mutated so that it becomes an ellipse, then adjust the type; for example using the change-class method in the CommonLispObjectSystem. The type change means that the object is now eligible for method parameters that are specialized to that type; a call that would previously have selected a circle specialization now selects the ellipse code instead. Problem solved. Note that changing the type from circle to ellipse or vice versa may involve a drastic representation change. After the type change, some operations won't work. If we mutate a circle to an ellipse, it may now respond to a get-radius method by signaling an error like "no such method".

-- KazKylheku

Mmm. Types.

The C++ class/type stuff that Alister was mentioning deals with how the compiler sees types. It is by no means the only way to do it, even in statically typed languages. SatherLanguage, and I believe EiffelLanguage before it, completely divorced inheritance and subtyping, the confusion of which is what makes C++'s model touchy (IMHO).

In a dynamically typed language like SmalltalkLanguage, CommonLisp, and PythonLanguage, subtyping is quite simple: does X do all the stuff Y does? This is more what you were looking at as far as typing goes.

What do people think of CommonLisp? (Well, those who use it on a daily basis think it's the most flexible, malleable, eloquent and productive language they've ever used. And not only do we not think it's enormous, we think it's too small, and things like multi-threading should be added to the standard.) On the one side, it is enormous and nobody bothers to gainsay this; of course, it's unlikely to get any larger. However, it's one of the only languages (at least, that I can think of) that evolved into its current form, which makes it very very usable in practice. -- GrahamHughes

WhatAreTypes? I don't know. If you say what types are to you, then I can guess what they are to you, and take the conversation from there. Michael says, "The type is the sum of its behavior under all conditions." Could be. In that case his logic moves in the direction his paragraph takes. Michael's does not strike me as the definition Stroustrup was using across the years he was shaping C++. It is particularly not clear to me that Stroustrup's definition today would match his definition in the period 1978-85. Regarding the term "subtyping", I believe the word is simply a misnomer, still looking for a valid meaning.

-- AlistairCockburn

Think of it this way. JamesOdell once wrote that concepts have three aspects: a name, an intension, and an extension. The intension is a set of statements which define the properties that something would need in order to be an instance of the concept. The extension is a set of instances of the concept, which can act as examples. From this, we have two different ways of defining a concept with a given name. What I was getting at above is that, say, your notion of what an Account is and mine may be different. Further, I could have my own account class and it could be different from what it was two years ago: different methods with different semantics. For me, this is the difference between class and type: class would correspond to the name aspect above and type would correspond to the intension.

Subtyping is just saying "here is a thing that has all the expected externally discernible properties that this other thing has, plus these other properties." Subclassing is pretty much the only way to substitute in a subtype in C++ because of the type system. In Smalltalk, you can have another class which responds to the same protocol and behaves the same way as another class, and it would be a subtype of the other class even if it was unrelated by inheritance. If the class wasn't a subtype, the program could break... responsibility failure.

To me, two things are of the same type if they match the same specification for externally discernible aspects. But then, the words class, type, and concept are so overloaded that I have to apologize for going off like I did above into a particular definition without an explanation. I think that CatalysisMethodology defines type in roughly the same way I do.

Alistair, is it that you have seen contradictory definitions of subtype? Or, you don't see the utility of the idea?

-- MichaelFeathers

I have seen different definitions of Type over the years. The definition has evolved (in my understanding of it, but then, that has evolved, too, so I really don't know what the people actually said back then). There was a period in which Type obviously equals Class. That seemed to work for C++ and Object Pascal. Smalltalkers ignored Type for a long time. Then someone came out with this "sum of all behaviors" line. Someone came out with "external protocol". I don't know that there is yet consensus in the industry, if you ask the people who do research in type theory, and people like you and me, who mostly program and produce out-loud opinions on the matter.

For subtyping, most people go back to the LiskovSubstitutionPrinciple (a page full of this sort of discussion). However, substitutability does not really work, as far as I can tell, and when you force it to work, the results are counterintuitive, and hence wrong along a different axis. I saw a beautiful append to some group by a mathematician, explaining from the perspective of pure mathematics how creating a subset group of a larger group necessarily meant the subset elements were not substitutable for the larger group elements.

: I wish I had copied and saved that text.

In my experience to date, a sub-type / class / group is made precisely because it has some properties different from those of the super-whatever. It is that set of differences that means the sub-items will not, universally speaking, be perfectly substitutable for the super-.

Take the old ellipse, circle, point problem. From the substitutability perspective, we would start out by saying that the circle is a subtype of ellipse, because if we need an ellipse at some moment, we could work with a circle instead. But wait, an ellipse in software can change its aspect ratio. So the circle is not perfectly substitutable for the ellipse, because it might be told to change its aspect ratio. Same argument for point as subtype of the other two.
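The substitutability failure can be made concrete with a small Java sketch. The classes MutableEllipse, MutableCircle and Client are hypothetical, and the client's expectation (that widening an ellipse leaves its height alone) is an assumption chosen for illustration.

```java
// A mutable ellipse: width and height can be changed independently.
class MutableEllipse {
    protected double width, height;

    MutableEllipse(double width, double height) {
        this.width = width;
        this.height = height;
    }

    void setWidth(double w) { width = w; }
    void setHeight(double h) { height = h; }
    double getWidth() { return width; }
    double getHeight() { return height; }
}

// A circle forced into the hierarchy: it must keep width == height,
// so it changes both whenever either setter is called.
class MutableCircle extends MutableEllipse {
    MutableCircle(double diameter) { super(diameter, diameter); }

    @Override void setWidth(double w) { width = w; height = w; }
    @Override void setHeight(double h) { width = h; height = h; }
}

class Client {
    // Written against MutableEllipse: doubles the width and reports
    // whether the height stayed the same, as the client expects.
    static boolean stretchKeepsHeight(MutableEllipse e) {
        double before = e.getHeight();
        e.setWidth(e.getWidth() * 2);
        return e.getHeight() == before;
    }
}
```

Client.stretchKeepsHeight holds for a MutableEllipse but not for a MutableCircle, which is the sense in which the circle is not perfectly substitutable.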

So I go to your definition of subtype, "here is a thing that has all the expected externally discernible properties that this other thing has, plus these other properties." Then neither is a subtype of either, because each has expected externally discernible properties that the other does not. The ellipse can change its aspect ratio, the circle has a constant, unit aspect ratio.

The argument typically at this point jumps to, well, ellipses are supposed to be immutable. I just checked, it did do that on the LiskovSubstitutionPrinciple page. That takes us back to, well, what do you mean by a type? Is the behavior of the thing part of its type? If yes, then mutability / immutability is a design decision, and what is really the type of the thing? If no, then we are talking about a tiny subset of the interesting world, and I no longer care.

Anyway, that's the argument as far as I have ever been able to run it. Perhaps someone can correct something or take it farther. Perhaps I missed some significant point on the LiskovSubstitutionPrinciple page. -- AlistairCockburn

Seems to me that types as implemented in your favorite language are interesting insofar as they help/hinder your work. Types as theory have borne little or no useful fruit. I'm wanting to see a page named What Are Types Good For. -- RonJeffries

What Are Types Good For = ThereAreNoTypes. -- MariusAmadoAlves

Here is a real-life example. A C++ system has to abstract out memory allocation and deallocation for a variety of reasons. For another variety of reasons, overloading operator new is not a good idea. So make a simple class named Memory and give it two operations: allocate and free. The system depends on the interface of Memory. Part of system configuration is creating a new Memory object or an object of a subtype and passing it into the system. At some later time, we may be interested in the number of times we allocate memory over runs of the system. We subclass Memory and name the new class Counted_Memory. We give the new class a counter and an accessor. After system runs, we can query for the number of allocations. The system does not know whether it is using Memory or Counted_Memory and it doesn't care. In addition to being a subclass, Counted_Memory is a subtype. Note that we could have just added the variable and the accessor to the Memory class also. In that case, the new Memory class would be a subtype of the old Memory class. So, by this definition of type, type and class are different.
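The shape of this example can be sketched as below. It is transposed to Java to match the other code on this page, so it only mirrors the structure of the C++ design; the byte-array "allocation", the class name MemorySystem, and the accessor getAllocations are assumptions made for the sketch.

```java
// The interface the system depends on: two operations.
class Memory {
    byte[] allocate(int size) { return new byte[size]; }

    void free(byte[] block) {
        // Nothing to do in this sketch; the garbage collector reclaims it.
    }
}

// Subclass and subtype: same behaviour as Memory, plus a counter.
class CountedMemory extends Memory {
    private int allocations;

    @Override byte[] allocate(int size) {
        allocations++;
        return super.allocate(size);
    }

    int getAllocations() { return allocations; }
}

// Stand-in for "the system": it is configured with a Memory and
// neither knows nor cares whether it got a CountedMemory.
class MemorySystem {
    private final Memory memory;

    MemorySystem(Memory memory) { this.memory = memory; }

    void run() {
        byte[] a = memory.allocate(16);
        byte[] b = memory.allocate(32);
        memory.free(a);
        memory.free(b);
    }
}
```

After new MemorySystem(counted).run(), the caller that created counted can query counted.getAllocations(); the system itself never touches the counter.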

Consider the Smalltalk case: you want to parameterize memory management in Smalltalk (god forbid). Here you have more options. Your Counted_Memory class does not have to be a subclass of Memory to be used in place of it. It just has to have the same protocol and behave in the same way as the Memory class. Because of Smalltalk's type system, you can substitute in the subtype even though it is not a subclass.

-- MichaelFeathers

Normally I stay away from discussions of types, because they are just too circular for my taste. We make a definition to suit a case we want and the thing to be illustrated from the case is the part of the definition we can't describe.

In 1991 there was a term making the rounds called AlloMorphism. If anyone recalls that and can find a reference, please post it. AlloMorphism meant exactly that a subclass was guaranteed to be usable in place of a superclass, or more generally, Class A is allomorphic to Class B if an A can be used in place of a B. AlloMorphism was coined primarily to deal with versioning and upgrades - the new version could be said to be allomorphic to the old. Whoever came up with this term invented it solely with this definition in mind. (Are you sure? Perhaps they felt that the concept they wanted was similar to the linguistic concept of AlloMorphism.) They weren't drawing on our intuition about the word "allomorphism." I don't think the same is true with the words type and subtype.

In your example, you give the example of Counted_Memory and Memory, that a function expecting a Memory could be given a Counted_Memory without knowing the difference. You are counting on the fact that a client object does not know it has a Counted_Memory, and so will not invoke any of its unique functions.

Using this line of reasoning, I create a type Circle, with attribute Radius and functions Area and Draw. I create a subtype Oogle, with attribute Other_Radius and functions Other_Area and Aspect_Ratio. So Oogle is a subtype, because as long as the client does not call Other_Area and Aspect_Ratio, the client is none the smarter.

Up to here your argument is actually ok, even if I could create a counterintuitive example with some stretching. The circularity is when you say that in Smalltalk the type and subtype don't have to be in the same hierarchy because their protocols overlap in a certain way. You carefully orchestrated the definition of type so that you can create an example of subtyping that uses that particular definition of type.

Oh, I see what is bothering me... I am working on the assumption that the word "subtype" has any relation to the word "type" just because it uses the prefix 'sub' on the word 'type'. If you used the word "allomorphic", I wouldn't be having these complaints! Then you can define "type" any way you want, and "allomorphic" any way you want, and then you can say that allomorphic classes don't have to be in the same inheritance hierarchy. And I wouldn't be surprised or upset! Fascinating. I'll have to think on this. -- AlistairCockburn

CatalysisMethodology defines type the same way [http://www.iconcomp.com/papers/Types-and-Classes/Col5.frm.html]. Unfortunately, this definition, with its roots in the quasi-formalist camp, is pretty much second nature to me now. It is easy for me to forget that people mean different things by the term. I also tend to go ambiguous and use type to mean "completely the same," rather than "the same in interesting aspects." -- MichaelFeathers (not a very good formalist)

Ellipse, Circle: surely these are both subtypes of some RoundishFigure type? Both have common features, as Alistair points out, but neither is really a subtype of the other.

I view a type as a contract to provide a given service. A subtype is a contract to provide more of a service. A class is an implementation of one or more types, and subclassing is a mechanism to assist with code reuse. Of course, every class, by default, has a type that, curiously, promises exactly what it implements.
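In Java terms, this view might look like the following sketch, where interfaces play the role of types (contracts) and a class implements more than one of them. The names IntStack, PeekableIntStack, Sized and DequeStack are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A type: a contract to provide a given service.
interface IntStack {
    void push(int x);
    int pop();
    boolean isEmpty();
}

// A subtype: a contract to provide more of a service.
interface PeekableIntStack extends IntStack {
    int peek();
}

// An unrelated type.
interface Sized {
    int size();
}

// A class: one implementation of several types at once.
class DequeStack implements PeekableIntStack, Sized {
    private final Deque<Integer> items = new ArrayDeque<>();

    public void push(int x) { items.push(x); }
    public int pop() { return items.pop(); }          // throws if empty
    public boolean isEmpty() { return items.isEmpty(); }
    public int peek() { return items.getFirst(); }    // throws if empty
    public int size() { return items.size(); }
}
```

A client holding only an IntStack reference sees just the narrower contract, while DequeStack's own implicit type promises exactly what it implements.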

My other view is of typing as checked documentation. For example, I've always been bothered by the Smalltalk idiom of naming parameters by their type, i.e.

#method: aThingyWidget

because you go to the trouble of adding all this type information (which then has to be kept in sync with the code) without gaining any automatic support from it. If I change the type of a parameter, a statically typed compiler will find all the relevant code for me, but I have to search in Smalltalk. Of course, it helps to have a decent type system, such as CLU or ModulaThree, which answers Ron's question above about language theorists. Their role is to avoid technical kludges like C++ (;-), but there are very few situations where theory and practice get to collaborate. -- SteveFreeman

I don't think aThingyWidget is named for its type. It's named for its role in the method. It's just that types are also named for roles and they often coincide. -- DaveHarris

This Ellipse and Circle thing has got me thinking. AlistairCockburn uses an example of circle is a subtype of ellipse. In a geometrical mind frame, this makes perfect sense. However, from an OO design standpoint, I'd make the opposite assertion: ellipse is a subtype of circle. An ellipse can perform identically to a circle if the semi-major axis equals the semi-minor axis. I'd imagine Java code something like this:

 public interface Circle {
   public void setRadius(double r);
   public double getRadius();
   // presumably there would be more methods
 }

 public class Ellipse implements Circle {
   private double semimajorAxis;
   private double semiminorAxis;

   public void setRadius(double r) {
     semimajorAxis = r;
     semiminorAxis = r;
   }
   public double getRadius() {  // this is questionable
     return (semimajorAxis + semiminorAxis) / 2.0;
   }
   // other methods to round out circle's interface ;-)

   // below here are the additional capabilities of Ellipse not in Circle
   public void setSemimajorAxis(double r) { semimajorAxis = r; }
   public void setSemiminorAxis(double r) { semiminorAxis = r; }
   public double getSemimajorAxis() { return semimajorAxis; }
   public double getSemiminorAxis() { return semiminorAxis; }
 }

With this I think it's safe to say that Ellipse is a subtype of Circle. It fulfills all of Circle's behavior, and adds some more of its own. Code that expects to operate on the type of Circle could be handed an Ellipse and be none the wiser. -- GregVaughn

Add an area method to Circle to see that Ellipse is not a subtype of circle.

No problem: given a = semimajor axis and b = semiminor axis, then area of ellipse is (pi)ab. When a=b=radius, area = (pi)(radius)^2, the area of a circle.
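One way to probe this exchange, as a sketch only: assume a client of Circle relies on the area being pi times the square of getRadius(). That expectation is not stated anywhere above; it is an assumption made here. The code repeats the earlier interface with an added area() method and keeps getRadius() as the mean of the two axes. The class CircleClient is hypothetical.

```java
interface Circle {
    void setRadius(double r);
    double getRadius();
    double area();
}

class Ellipse implements Circle {
    private double semimajorAxis;
    private double semiminorAxis;

    public void setRadius(double r) {
        semimajorAxis = r;
        semiminorAxis = r;
    }

    // Same questionable definition as in the code above: the mean axis.
    public double getRadius() {
        return (semimajorAxis + semiminorAxis) / 2.0;
    }

    // Correct ellipse area: pi * a * b.
    public double area() {
        return Math.PI * semimajorAxis * semiminorAxis;
    }

    public void setSemimajorAxis(double r) { semimajorAxis = r; }
    public void setSemiminorAxis(double r) { semiminorAxis = r; }
}

class CircleClient {
    // Assumed client expectation: area == pi * radius^2.
    static boolean areaMatchesRadius(Circle c) {
        double expected = Math.PI * c.getRadius() * c.getRadius();
        return Math.abs(c.area() - expected) < 1e-9;
    }
}
```

While only setRadius is used, the expectation holds. Once the axes are set to 3 and 1, getRadius() is 2 but the area is 3 pi rather than 4 pi, so whether Ellipse is a subtype depends on whether that relation is counted as part of Circle's behavior.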

An Ellipse IsA Circle? Nah. It's always possible to use inheritance (er ... that is, subtyping) for generalization, but it's generally a bad idea, and the problems start cropping up soon. It's nearly always better to subtype for specialization. (Did I return your copy of Budd?) -- GlennVanderburg, fresh from OntologicalThinking

I might be convinced that this is a bad idea, but it's going to take more than that to do it. If Circle is a subtype of Ellipse, how do I setSemiminorAxis() on it? Hmmm... maybe both set{Semiminor|Semimajor}Axis set the radius of the Circle. I'll have to ponder on this some more. It's not a simple thing to separate subtyping and inheritance in my mind (you even made a quick slip). I'll agree that subclassing for generalization is a bad idea, but I'm not so sure that subtyping for generalization is. I thought a subtype was supposed to expand the capabilities of the supertype. -- GregVaughn

Should we move this over to CircleAndEllipseProblem?

I wasn't even aware of that page, but after reading it, I don't really think we need to move. CircleAndEllipseProblem gives a good practical concrete solution to using Circles and Ellipses in a drawing program via PredicateClasses, while here, through my rambling, I think I've come up with a more abstract fundamental question: Does the rule that one shouldn't subclass for generalization apply to subtyping? -- GregVaughn

My comment above points to my answer, although I'll state up-front that it's a personal answer. I do think about subtyping as an ontological exercise (that's part of my view into the WholeSortOfGeneralMishMash) and I believe that IsA is the core value of subtyping. So my ProgrammingValueSystem would not currently permit me to define Ellipse as a subtype of Circle, however handy that might be in a given situation. -- GlennVanderburg

I think part of the confusion over Circle and Ellipse has to do with extension versus specialization. You can think of Ellipse as being an extension of Circle, because an Ellipse can do everything that a circle can do and more. But you could also think of Circle as being a specialization of Ellipse (a Circle is an Ellipse with an aspect ratio fixed at 1:1). Which interpretation is more useful depends on what you want to do with your Circles and Ellipses - ask yourself if Circle is an extension of the functionality of Ellipse, or vice versa. This point was apparently raised on CircleAndEllipseProblem also:

Mathematicians say that a Circle ISA Ellipse, because a Circle has all the constraints of an Ellipse, plus more.

Computer Scientists say that an Ellipse ISA Circle, because an Ellipse has all the functionality of a Circle, plus more.

Add an area method to Circle to see that Ellipse is not a subtype of circle.

There is no such thing as an official True Type of ellipse. It all depends on what you want to do with the Ellipse. The question "Is Ellipse a subtype of Circle" isn't well-posed unless we are more precise about the definitions of circle, ellipse and subtype. This is true even if you take the sum of all behaviours point of view. People have noticed similarities between ADTs and abstract mathematical structures. The branch of mathematics called category theory (of which I don't know very much, I must add) gives a unified method of describing such structures. There are a couple of chapters in the HandbookOfLogicInComputerScience that describe a theory of abstract datatypes based on category theory.

But even if we define type as defined there, there are many different definitions of ellipse and circle. If all you care about is radius and center, that gives you a definition of Ellipse, with a corresponding notion of subtype. If you also care about area, you get different results.

I find all this very interesting, but I must agree with Ron that it is probably utterly useless.

-- MartijnMeijering

I think I finally managed to articulate how 'subtyping' is fundamentally a misnomer, and wrote it up in "Constructive Deconstruction of Subtyping" [http://alistair.cockburn.us/crystal/articles/cdos/constructivedesconstructionofsubtyping.htm]. The question, "Is a circle a subtype of an ellipse?" does not have an answer, because it is underspecified. Even, "Is this circle, with this behavior, a subtype of this ellipse, having this behavior", is not yet fully formed. You also have to include the computing power of the execution environment in your question: "Is this circle a subtype of this ellipse in this environment?" is the fully formed question, and that can be answered, for some definition (any of the many definitions) of "subtype". That paper has my best, current, considered thoughts. See also ContextSensitiveSubtyping.

p.s. the paper is really about "subtyping", not "typing". DougLea offered this great summary of the agenda of typing: "Safety: My program, once type-checked, still will not issue any message that is not accepted by its recipient or misinterpret any set of bits, even if I substitute subtypes... or ...The main goal in safety preservation is ensuring that all objects in a system maintain consistent states." Note that in this school of typing, it is considered preferable to reject some programs that actually work, based on plausible fear that they might not work. An 'improved' type system might reject fewer working programs while keeping the same safety. -- AlistairCockburn

Date and Darwen in TheThirdManifesto (esp. the second edition) criticize the object-oriented community's treatment of typing and subtyping and discuss the topic at length, trying to give it more rigor. I'm still digesting much of the material in this book, but find it intriguing. Let me try to summarize it briefly.

One of their central premises is that there is a confusion between the meanings of variable, value, and type in much of the OO community. From my perspective as a developer and designer in (primarily) C++ environments, many of their criticisms seem to have a ring of truth. They start from the premise that a given value is an abstract, immutable concept. The number 2, for instance, represents the same concept no matter its form, and two instances of the number are for all intents and purposes indistinguishable; in fact, it's inaccurate to talk about "instances" of a value. A type is nothing more or less than a set of values. And a variable is described as a triple of (name, declared type, value). The value of a variable can change, but not its declared type or name. Operations (functions, procedures) are only indirectly associated with a type, which is quite a bit different from OO-think, but when you get into their notion of subtyping and operator selection, the operators are essentially MultiMethods. This is actually much more flexible and in some ways more appealing than the object-oriented notion of tying member functions to a single specific class type. (Unfortunately, although Date & Darwen seem fairly well-read on current mainstream OO thinking, they seem oblivious to prior work on MultiMethods.)

Subtypes are exactly subsets of a parent type. They use constraints to distinguish subtypes from parent types. Every value has a single, unambiguous Most Specific Type (MST), which is the smallest value subset that it belongs to. A consequence is that if a value satisfies the constraints of two types, there is a (possibly implicit) subtype that is the intersection of the parent types. Although a variable's declared type (DT) cannot change, the MST of its current value can be any subtype of its declared type.

Every value has a single, unambiguous Most Specific Type (MST), which is the smallest value subset that it belongs to. ... Indeed this is true, Dan, but is it useful? After all, the smallest value subset containing 2 is {2}. And the smallest value subset containing a circle with radius 3m is the set containing only and exactly that circle. This notion of Most Specific Type is an utterly useless one. There may, however, be the Most-Specific-Type-That-Matters-To-Me-At-The-Moment. Behavior dispatch is on arbitrary patterns, arbitrary properties, arbitrary 'most-specific-types'. E.g. "if there is only one, I attack; if there are 2-5, I pray THEN attack; more than 5, run for my life."

They use the ever-popular ellipse/circle example in their book a lot, too. A variable of declared type Ellipse could of course hold a Circle value. Since the variable is of type Ellipse, you can change the length of one axis. After doing so, the value would (in the general case) no longer be a circle, but rather an Ellipse - so essentially, the MST changes dynamically based on constraint checks at run-time. A real implementation wouldn't necessarily bother to compute the MST of a value until it needed to look up an operation on the value.
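A small Python sketch of that behaviour, assuming immutable ellipse values and a variable whose declared type is a fixed constraint. Variable, is_ellipse, is_circle and mst are hypothetical names for illustration; the MST is only computed when asked for, as suggested above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EllipseValue:
    a: float  # semi-major axis
    b: float  # semi-minor axis

def is_ellipse(v):
    return isinstance(v, EllipseValue) and v.a >= v.b > 0

def is_circle(v):
    # Circle is the subset of Ellipse whose axes are equal.
    return is_ellipse(v) and v.a == v.b

class Variable:
    """A (name, declared type, value) triple; only the value may change."""

    def __init__(self, name, declared_type, value):
        self.name = name
        self.declared_type = declared_type  # a constraint, fixed for life
        self.assign(value)

    def assign(self, value):
        if not self.declared_type(value):
            raise TypeError(f"value does not belong to the declared type of {self.name}")
        self.value = value

    def mst(self):
        # Computed on demand from the current value, never stored.
        return "Circle" if is_circle(self.value) else "Ellipse"

e = Variable("e", is_ellipse, EllipseValue(2.0, 2.0))
print(e.mst())                   # Circle
e.assign(EllipseValue(3.0, 2.0))
print(e.mst())                   # Ellipse
```

A variable declared with is_circle as its type would reject the second assignment, since the new value falls outside its declared type.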

In their approach, most of the knotty issues of subtyping and inheritance (like the LiskovSubstitutionPrinciple dispute) shift to the definition of operators and their selection at run time based on the MST of argument values and the DT of argument variables (modifiable arguments). This is a bit more tractable and flexible because you're looking at a single operation at a time rather than a whole interface. I think this system could give you programs that are more elegantly extensible.
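To make the operator-selection idea concrete, here is a minimal multimethod sketch in Python. It uses ordinary Python classes as a stand-in for most specific types, which is a simplifying assumption; MultiMethod, combine and the shape classes are hypothetical names.

```python
class MultiMethod:
    """Selects an implementation from the run-time types of ALL arguments."""

    def __init__(self, name):
        self.name = name
        self.impls = {}  # tuple of classes -> function

    def register(self, *types):
        def decorator(fn):
            self.impls[types] = fn
            return fn
        return decorator

    def __call__(self, *args):
        # Signatures whose every parameter type accepts the matching argument.
        applicable = [
            sig for sig in self.impls
            if len(sig) == len(args)
            and all(isinstance(a, t) for a, t in zip(args, sig))
        ]
        # Pick the signature that is at least as specific as all the others.
        for sig in applicable:
            if all(all(issubclass(s, o) for s, o in zip(sig, other))
                   for other in applicable):
                return self.impls[sig](*args)
        raise TypeError(f"no unique best match for {self.name}")

class Ellipse:
    pass

class Circle(Ellipse):
    pass

combine = MultiMethod("combine")

@combine.register(Ellipse, Ellipse)
def _combine_ellipses(x, y):
    return "ellipse-ellipse"

@combine.register(Circle, Circle)
def _combine_circles(x, y):
    return "circle-circle"

print(combine(Circle(), Circle()))   # circle-circle
print(combine(Circle(), Ellipse()))  # ellipse-ellipse
```

The operation is defined outside both classes, and adding a new case means registering one more function rather than touching a whole interface.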

I find this whole treatment to be very appealing, but there are practical issues with designing an efficient and usable language around it which I am currently exploring. I also worry that they have lost the very practical "packaging" aspects of traditional classes and objects, and I'm trying to figure out how to re-introduce this without losing the rigor of their typing and operation dispatching system. I think that a language might end up looking a little more like Modula-2 with MultiMethods.

-- DanMuller

Behavior and computation dispatch based on most-specific-patterns (or predicate, or 'type') is certainly an interesting problem... especially when there is no clear notion of what 'most specific type' might be in any given situation, and even more so when heuristics for determining the MST begin to falter (e.g. prime numbers vs. integers... primes may be "more specific", but these sets are the same "size" and "cardinality"). I don't believe it will ever be solved entirely; we'll develop 'intelligent' machines first, at which point behavior dispatch will be determined with the aid of a fuzzy-bayesian-epistemic inference engine... and will be based on whatever-properties-the-machine-bothered-looking-at.

Anyhow, the 'set' concept for types isn't the only valid one. One could type a value of a signal by the behaviors it provokes in others. One can type others by the behaviors that you observe in them. Many of these do not relate well to the Date and Darwen concept... it's remarkably incomplete for an answer to "WhatAreTypes".

Since wiki lost its links [2], the paper's URL got dropped. [See FixingLinks for how to recover lost links.] See http://members.aol.com/humansandt/papers/subtyping.htm for the paper that debunks LSP and the subtyping arguments, including Date's. I find it a shame that the type-research community doesn't front up that it's all a hoax and drop it. (The paper was rejected from OOPSLA 99 for the reason that it's known news among the type theorist cognoscenti, although it's radical blasphemy among the non-cognoscenti. Which means they think what I wrote is true.) -- Alistair

I think the confusion is arising from omitting the fact that subtyping is not an OO concept. The sentence "class A is a subtype of class B" is no more meaningful than "type A is a subclass of type B". That's why "class Circle is a subtype of class Ellipse" doesn't work.

AlistairCockburn did mention that "subtyping" is actually "ContextSensitiveSubtyping". What he missed is that "subtyping" in that sense can be described as a simple binary relation between two sets of relevant contexts. I mean that if we define type as a set of relevant contexts then we can say that type A is a subtype of type B if set B is a subset of set A.
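A tiny Python sketch of that relation, assuming a "context" can be reduced to a label for an operation the value must support. READABLE and READ_WRITABLE are invented examples.

```python
# A type is modelled as the set of contexts in which its values may be used.
READABLE = frozenset({"read"})
READ_WRITABLE = frozenset({"read", "write"})

def is_subtype(a, b):
    """A is a subtype of B if B's contexts are a subset of A's contexts."""
    return b <= a

print(is_subtype(READ_WRITABLE, READABLE))  # True
print(is_subtype(READABLE, READ_WRITABLE))  # False
```

Note the direction: the subtype is the one with the larger set of contexts, because it can be used everywhere the supertype can, and possibly elsewhere too.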

Let's consider this example (courtesy of AlexanderStepanov):

  template <class StrictWeakOrdered>
  inline StrictWeakOrdered& max(StrictWeakOrdered& x, StrictWeakOrdered& y) {
    return x < y ? y : x;
  }

  (the definition with const StrictWeakOrdered?& is skipped for simplicity)

It has a context comparing x and y, so types of x and y must be subtypes of StrictPartialOrdered?, which is the type that represents the arguments of (generic) operator<.

Any StrictWeakOrdered? can be an argument of some operator<, but there can be some StrictPartialOrdered? that cannot be arguments of max(x,y). So StrictWeakOrdered? is a subtype of StrictPartialOrdered?, but StrictPartialOrdered? is not a subtype of StrictWeakOrdered?.

The example of StrictPartialOrdered? which is not StrictWeakOrdered? is a class, with (strict) subclassing as operator<.
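Python's built-in frozenset gives a similar example, since its < operator means "proper subset": a strict partial order that is not a strict weak order. The sketch below assumes the same comparison logic as the max template above; max_of and incomparable are hypothetical helper names.

```python
def max_of(x, y):
    # Same logic as the C++ template: x < y ? y : x
    return y if x < y else x

def incomparable(x, y):
    return not (x < y) and not (y < x)

a = frozenset({1})
b = frozenset({2})
c = frozenset({1, 3})

# For ints (strictly weakly ordered) the result does not depend on argument order.
print(max_of(3, 5), max_of(5, 3))          # 5 5

# For sets ordered by proper subset, incomparability is not transitive ...
print(incomparable(a, b), incomparable(b, c), a < c)  # True True True

# ... and the "maximum" of two incomparable sets depends on argument order.
print(max_of(a, b) == a, max_of(b, a) == b)           # True True
```

So frozenset can appear as an argument of <, but it does not give max(x,y) a sensible meaning, which is the asymmetry described above.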

Most types in most programming languages are not abstract enough to have meaningful subtypes in most programs. But they easily can be subtypes; type int, for example, is a subtype of an abstract type StrictWeakOrdered?.

Another example of abstract types that (sometimes; subject to good design) can be subtyped is interfaces. The problem is that subtyping does not depend on "which interface is created first", but "reverse inheritance" means refactoring.

Where I think this sort of subtyping is (or can be) a helpful concept:

GenericProgramming (as in STL);
TypeInference;
ExtremeProgramming. No kidding. That's why DoSimpleThings often produce good generic code. That's what determines the boundaries of each MercilessRefactoring.

-- NikitaBelenki

I'm surprised nobody's pointed something out about Circle and Ellipse [or whatever]:

Whether a thing is mutable has very important effects on its type.

Confine yourself to immutable values: then, as far as I can tell, CircleValue really is a subtype of EllipseValue: you can get its center, its foci (both the same point), its semimajor and semiminor axes (again, both the same length), and even draw it correctly. You can use CircleValue anywhere you can use EllipseValue. You also have some methods that are only defined on CircleValue, such as getRadius(), so it's a proper subtype.

The problem comes when you introduce mutable objects: CircleContainer can contain different CircleValue instances, and EllipseContainer can contain different EllipseValue instances. When you call setAspectRatio() on an EllipseContainer, the EllipseValue it contains changes to a different EllipseValue.

There's some subtyping involved: EllipseContainer can contain any EllipseValue instance, including a CircleValue instance. But when you call setAspectRatio(2), the EllipseContainer instance will end up containing something that's definitely not a CircleValue.

It's pretty clear to me that, in general, EllipseContainer is no more a subtype of CircleContainer than vice-versa: CircleContainer might make some guarantees about the axes being equal, which of course EllipseContainer can't make good on.
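A Python sketch of the distinction, assuming frozen dataclasses for the values and plain mutable classes for the containers. What set_aspect_ratio does to the axes is an arbitrary choice made here for illustration (keep the semi-minor axis, rescale the semi-major one); the method names follow the discussion loosely.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EllipseValue:
    semi_major: float
    semi_minor: float

    def aspect_ratio(self):
        return self.semi_major / self.semi_minor

@dataclass(frozen=True)
class CircleValue(EllipseValue):
    def __post_init__(self):
        if self.semi_major != self.semi_minor:
            raise ValueError("a circle needs equal axes")

    def radius(self):
        return self.semi_major

class EllipseContainer:
    """Mutable holder: may hold any EllipseValue, including a CircleValue."""

    def __init__(self, value):
        self.value = value

    def set_aspect_ratio(self, ratio):
        # Assumption: keep the semi-minor axis and rescale the semi-major one.
        minor = self.value.semi_minor
        self.value = EllipseValue(minor * ratio, minor)

class CircleContainer:
    """Mutable holder that promises its value always has equal axes."""

    def __init__(self, value):
        self.value = value

    def set_radius(self, radius):
        self.value = CircleValue(radius, radius)

box = EllipseContainer(CircleValue(2.0, 2.0))
print(isinstance(box.value, CircleValue))  # True
box.set_aspect_ratio(2)
print(isinstance(box.value, CircleValue))  # False
```

The value classes substitute cleanly (a CircleValue answers every EllipseValue question), while neither container can stand in for the other without breaking a promise.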

So the LiskovSubstitutionPrinciple becomes messy and less than useful when mutability is introduced. So what? Just about everything becomes messy and less than useful when mutability is introduced. That's why God created FunctionalProgrammingLanguages (or maybe it was Meyer).

Anyway, to sum up: mutability isn't as transparent as we'd like it to be. The LSP and types in general are useful for values, less so for containers. -- GeorgePaci

This point has been discussed in CircleAndEllipseProblem. No, immutability doesn't help if you have

 return ( argument.getClass().getName().equals ("readOnlyEllipse") );

sort of code somewhere.

The actual problem here is not CircleAndEllipseProblem itself, but what it IsGoodFor. Imagine you have the solution for this problem. What are the real problems it could help you to solve? -- NikitaBelenki

A) That'll teach me to follow up all the references before commenting.

B) Nobody over on CircleAndEllipseProblem outright said "mutability screws up a perfectly good subtype relationship", much less "watch out for mutability: a mutable foo and a foo are very different types".

C) Substitute whatever you want for the value type and container type: (String, StringBuffer); (int, Integer) (in Java). Then substitute whatever you want for the value subtype: (String, AllUppercaseString?); (int, evenInt); etc. I'm just using Circle and Ellipse as paradigmatic examples.

I agree with Alistair that you can't just look at some syntactic stuff about the types and figure out (Liskov) subtypes. You need to know something about the semantics of the types: either (as he suggests) by saying something about all the contexts you'll run in, or (as PhilGoodwin suggests) adding preconditions and postconditions to the definition of each type. I haven't seen anyone disagree, and the Theoretical Guardians of All Things Type seem to think he's not only right, but so obviously right as to be unpublishable (one wonders how a textbook about type theory would ever get written).

And nobody seemed to object the three times people pointed out the distinction between type the mathematical thing and class the language feature which often tries to capture it, so I'll assert it a fourth time.

Combining the previous two points, I'd answer the reflection difficulty this way: the whole point of getting the exact name of the particular class is to break substitutability. So if you want substitutability, exclude getName() from your type: either prevent it with a contract, or rule out contexts in which it'll be called, or override it to always return "foo".

Another way to put this would be: If you want classes and subclasses to express the notion of types and subtypes, don't use reflection to grab the actual name of any classes. Some languages (e.g. Java) have instanceof, which is much more in the polymorphic spirit of things and doesn't break substitutability.
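The same contrast in a short Python sketch, where type(x).__name__ plays the role of getClass().getName() and isinstance plays the role of instanceof. The class names, including the later subclass, are hypothetical.

```python
class Ellipse:
    pass

class ReadOnlyEllipse(Ellipse):
    pass

class CachedReadOnlyEllipse(ReadOnlyEllipse):
    """A later subclass that ought to be usable wherever its parent is."""

def is_read_only_by_name(shape):
    # Matches one exact class only, so subclasses are silently rejected.
    return type(shape).__name__ == "ReadOnlyEllipse"

def is_read_only_by_instance(shape):
    # Accepts the class and anything substitutable for it by inheritance.
    return isinstance(shape, ReadOnlyEllipse)

shape = CachedReadOnlyEllipse()
print(is_read_only_by_name(shape))      # False
print(is_read_only_by_instance(shape))  # True
```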

So let's move on to Alistair's unanswered question: "Subtype: what is it good for?"

The only thing I can think of is that it imposes a partial order on substitutability (and, if you're lucky, a hierarchy). So when you're reasoning about whether A is substitutable for Z, you can make very effective use of the fact that B is substitutable for C, which is substitutable for D, ... all the way up to Z.

Substitutability, in my view, is the practical upshot of all this. And a good theory of subtypes should tell you what kinds of changes preserve subtyping (hence substitutability) and what kinds don't.

Note that substitutability isn't just important for subclasses, or alternative implementations of interfaces: you also usually want version 32 of your class Foo (in which you add the bar() method) to be substitutable for version 31 of Foo. So if bar() is the first method to make Foo mutable, your theory of types tells you that mutability has thus-and-such effect on subtyping (hence substitutability), so you should watch out.

Now for a question of my own: if you confine yourself to value types, is subtype the same as subset? The only subtypes I can think of are subsets of the original type's set of values: even ints, all-upper-case-strings, circles, etc. Is there actually anything to a value type above and beyond the set of things that are in it? -- GeorgePaci

Actually all the problems with subtyping come from our inconsistent and overly general "definition" of types. See your own examples:

String (which is presumably a supertype of AllUppercaseString?). Look at all implementations of String in all languages (or even just in CeePlusPlus alone). Are they substitutable? No. So can we say that type String is not a subtype of itself?
Ellipse with setAspectRatio() method. What does this method do? Sets the aspect ratio of the ellipse, right? But what does it exactly do with the area of the ellipse? That depends on the implementation, right? So can we say that type Ellipse is not a subtype of itself, because its different implementations are not substitutable?

The problem here is that what we are speaking about (String, Ellipse) are not types. They are either concepts or typenames. The types themselves appear only when we need them in our design or code. So if we know what AllUppercaseString? IsGoodFor in our code, and when we know what String IsGoodFor, then we can look at our requirements and say whether one is a subtype of another.

[Sub]type is the particular task your object IsGoodFor. -- NikitaBelenki

String and Ellipse are not even concepts, they are just names. To make their referents into concepts, they have to be defined, not just by a signature, but also by suitable laws. A mutable ellipse and a mutable circle will always break each other's laws, so they are not subtype- or subconcept-related.

"So can we say that type T is not a subtype of itself?"

(T being in {String,Ellipse})

I would think that those different implementations called T might not all have the same (external) behaviour. (E.g. Ellipse with setAspectRatio() method.) If so, one could hardly think of them all as the same type.

Also, if we are talking about types T in several languages (roughly) corresponding to the same concept, then those languages probably haven't got compatible TypeSystems, so then (unfortunately) the types are not substitutable.

So I think I agree with: "The problem here is that what we are speaking about (String, Ellipse) are not types. They are either concepts or typenames.". I.e. what you call String or Ellipse here is some informal concept that (apparently) can be formalized (in some particular language, possibly with a particular TypeSystem) in several different ways. (Where some of these ways arguably may be incorrect / bad for coding properties like MaintainAbility, modularity, and what-not ...)

The (concrete in-language) types appear when we try to formalize our informal concept in our language.

s/IsGoodFor/ReallyMeans?/ :) (Roughly the same, I suppose.)

In (StaticallyTyped) OO Systems using (DefinitionalEquivalence?/)NameEquivalence?, (such as CeePlusPlus and JavaLanguage,) one usually seems to have to declare, upfront, if a newly defined type should be a (SuperType?/) SubType? of some other type(s). (I think this doesn't apply to NiceLanguage. What about EiffelLanguage?) I.e. one inherits some SuperClass?es or implements an interface using InterfaceInheritance when defining the new class / interface.

While in (StaticallyTyped) OO Systems using StructuralEquivalence?, (such as ObjectiveCaml), one doesn't declare one ObjectType? to be a SubType? of another one. Though one still declares *SuperClass?es* when *inheriting* (defining a new class from some old one(s)). Inheritance is decoupled from SubTyping?.

(And in DynamicallyTyped OO Systems one doesn't use any (explicit, at least) TypeEquivalence?!)
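Python happens to offer both styles, which makes for a compact illustration: an abstract base class must be named as a parent up front (name equivalence), while a typing.Protocol is satisfied by anything with the right members (structural equivalence). The class names below are hypothetical. One caveat: the run-time isinstance check on a Protocol only looks for the presence of the method, not its signature.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable

class NominalShape(ABC):
    """Nominal: a class is a subtype only if it declares this parent."""

    @abstractmethod
    def area(self) -> float: ...

@runtime_checkable
class HasArea(Protocol):
    """Structural: any object with an area() method conforms."""

    def area(self) -> float: ...

class Square(NominalShape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

class Field:
    """Declares no relationship to either type above."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

print(isinstance(Field(2, 3), NominalShape))  # False
print(isinstance(Field(2, 3), HasArea))       # True
print(isinstance(Square(2), HasArea))         # True
```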

(Correct me, if I got some details WorngTradeMark?.) -- StefanLjungstrand

Typing is a human survival strategy. After seeing a few trees my brain created a type for them. From then on, whenever I see a tree my brain reflexively types it.

Types inside human brains are continuous. The tree type isn't discrete from the bush type. Human brains (most of them at least) don't freak out when they encounter a plant that falls someplace along the tree-bush type axis. Typing is a natural and powerful human tool, but when we try to use type in a statically typed language we smash our mental knuckles.

We need tools that fit our natural abilities. We need a programming language that makes it easy to model a plant that is 75% tree and 60% bush.

-- EricHodges

Be careful generalizing your personal preferences to other people. -anon

Eric Hodges, your concept of 'continuous' types is relevant to ProtoTypes?... i.e. type by analogy (this is like that). Prototypes model observations and are inductive in nature. Humans are also capable of deductive types... e.g. "sphere". Something is either a sphere or not. If it is not a sphere, it might be 'spherical' (sphere-like). Spherical is a prototype-concept, with the prototype being the abstract concept a sphere of any size. Your "tree" and "bush" don't have a 'real' prototype to which you can point, but they have the same sort of properties.

Anon, Eric Hodges does not speak of 'personal preferences'. With, perhaps, a very few exceptions, every human brain uses patterns to understand perception and make predictions. When it comes to types produced by induction (like 'blue'), these human brains allow for fuzzy ranges. No preference is involved.

While perhaps true, this process could be called "classification" rather than, or in addition to, "typing". Describing what the brain does and what are types may or may not be related. -- top

Top, if you look up the word type in the dictionary, you will see that it says that a type is a classification. The very first thing to do when learning about a subject that people are yapping about in Computer (computing) Science is to look the subject up in a dictionary as a general reference. Very often one can over-analyze the hell out of a subject with academic implementation and system theories (and personal opinions).. but at some point we have to stop and think: what is the actual definition of a type in simple terms? What about common sense? They ARE related. A classification is a synonym to a type.

'may or may not'... that's quite some waffling. I'll state it: They ARE related. The English words 'class' and 'type' are synonyms in their usage here. If you can give me a meaningful difference between the two that isn't based upon some arbitrary naming decision in a popular programming language, I'll be surprised.

Perhaps your concern is 'formal' (i.e. 'mathematical') type-systems concerned with type-safety (i.e. static or dynamic 'prediction' of sensible behavior) and behavioral dispatch. Or, perhaps, you are concerned that what is being typed can vary (i.e. behaviors, communications, interfaces, objects, values, usage, emergent properties, etc.) But those are only different uses of typing; they do not change 'What Types Are'. Further, even these varying uses are quite related to what our human brains do. We classify and type things in order to both make predictions and 'dispatch' intelligent behaviors (react or interact in certain manners). I.e. if we classify another's behavior as 'aggressive', we'll react to it differently than if it is 'friendly'. The primary difference is that we use a fuzzy-bayesian-epistemic-inductive inference-engine for typing rather than formal-boolean-deductive-mathematical inference-engine to determine these types.

The engine used to compute types cannot change 'what types are'.

I wonder what Costin would say about this. It does seem to lead back to MostHolyWarsTiedToPsychology. -- top

[Most holy wars are tied to what tools you are using, top. I bet that you are protecting your current toolset because you are too afraid to admit that maybe your current tools or your current line of thinking isn't necessarily the best. I've talked to many Perl programmers who immediately bash me when I bring up types.. because they don't believe it is important to worry about types and they think humans should get on with real work instead. I find that real work involves integrity, structure, and classification. Some people are so convenience oriented that when it comes to types, they think they won't help.. hence my discussion with many Perl programmers. Then later down the road when you have a TurdFanCollision since your data is a steaming pile of corrupted poo, you realize.. oh.. we should have automated the process with integrity in the design. Let's make that a strict type, and only escape it when absolutely necessary.. if at all. So, if you think this is a process of some sort (whatever a process is... any action, I suppose) then let's stop beating around the bush and clearly decide upon the term. Let's not make more synonyms for a type system please. Some quotes below.]

"RM PRESCRIPTIONS l. (yes the FIRST prescription) A domain is a named set of values. Such values, which shall be of arbitrary complexity, shall be manipulable solely by means of the operators defined for the domain(s) in question (see RM Prescription 3 and OO Proscription 3)---i.e., domain values shall be encapsulated except as noted under RM Prescription 4). For each domain, a notation shall be available for the explicit specification (or "construction") if an arbitrary value from that domain. ".... "We treat the terms domain and data type (type for short) as synonymous and interchangeable. The term object class is also sometimes used with the same meaning, but we do not use this latter term." -- TheThirdManifesto

Re: Human brains (most of them at least) don't freak out when they encounter a plant that falls someplace along the tree-bush type axis.

But computers will generally freak out. Most programming languages don't have "fuzzy types" such that something can be 60% integer and 40% string, for example. I think the human mind when seeing something that is half bush and half tree will send some of the signal to the "bush" section of the brain and some to the "tree" section. If one or the other gets more of a signal, then the person will act as if the thing is one or the other if in a hurry. However, sometimes the conflict will raise a warning flag and the person will stop to investigate further and form a new brain section devoted to that plant.

However, we generally don't do this with computer applications because we let humans do the fuzzy thinking and want computers to have discrete rules so that we can audit their results and get predictable behavior from them. We rely on their literalness because we lack it ourselves. Thus, we treat computer applications differently than we treat human processing of stuff such as "types".
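For what it's worth, graded membership is easy enough to sketch in Python; the hard part is deciding what to do with it. The membership ramps, the height thresholds and the 0.2 margin below are all invented for illustration, and the two degrees deliberately need not add up to 1.

```python
def tree_degree(height_m):
    # Assumed ramp: 0 at or below 2 m, rising to 1 at 6 m and above.
    return min(1.0, max(0.0, (height_m - 2.0) / 4.0))

def bush_degree(height_m):
    # Assumed ramp: 1 at or below 1 m, falling to 0 at 5 m and above.
    return min(1.0, max(0.0, (5.0 - height_m) / 4.0))

def classify(height_m, margin=0.2):
    """Collapse the graded memberships into one discrete, auditable answer."""
    tree, bush = tree_degree(height_m), bush_degree(height_m)
    if abs(tree - bush) < margin:
        return "investigate"  # too close to call: raise the warning flag
    return "tree" if tree > bush else "bush"

print(classify(8.0))  # tree
print(classify(1.0))  # bush
print(classify(3.5))  # investigate
```

The classify step is where the literalness comes back in: the program still ends up with a discrete rule, just one whose inputs are degrees rather than yes/no facts.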

A type is a collection of domain values.

AnswerMe: What is a DomainValue and how is it different from regular values?

If I understand the author's intent correctly, 'domain values' are all possible values in a given domain, which in turn is the set of the valid values for a type. For example, for type Cardinal, the domain is the set of all non-negative integers - or more practically speaking, the set of all integers in the range [0...maxCardinal] for a given implementation. Thus, the domain values would be 0, 1, 2, 3... maxCardinal. By this definition, a 'type' would appear to be a name for a given domain, which would make TypeTheory a branch of SetTheory (actually, as pointed out elsewhere, it is both the set of values and the set of primitive operations upon them, but the basic point stands). Since SetTheory is, IIRC, a TuringEquivalent system of computation, the logical extension is that conceivably, one could devise a TuringComplete programming language consisting of nothing but type manipulations (HaHaOnlySerious). CommentAndCorrectionsWelcome?. -- JayOsako

That is essentially what I argue near the bottom of ThereAreNoTypes. Most definitions pick either sets or trees at their root, and the two camps fight.

Top, this is a page for defining types; please keep the definition where it is, and discuss your wild and uninformed objections further. Thanks, Costin

Why does your pet definition deserve to be moved to the top of the page, ahead of even the summary list of candidates? It is as if you believe it so important as to go even before the table of contents. This came across to me as rude behavior. -- top
- It is not my pet definition, it is the definition of John C. Reynolds, and is the definition for types within the part of ComputerScience community that deals with these issues: i.e. programming language theory and type theory. For reference please consult the recent book by Benjamin Pierce "Types and Programming Languages", which is an authoritative book on TypeTheory. Also I can refer you to TheoryOfObjects, the early papers of LucaCardelli (OnUnderstandingTypes), and so on, so forth. With regards to rudeness, this wiki tries to create valuable content and not be a reflection of everybody's 2c on the subject. So if you want to talk about rudeness, you may consider that your behavior with regards to polluting wiki with repeated rantings on the same subjects over and over again, well, that's not exactly the holy grail of good behavior. On this subject, you may want to actually learn something about it. -- Costin
- Just because YOU think Benjy is THE authority is still no reason for barging to the top. And, the reason I tend to repeat my opinion is because people complain when I factor them to a single spot by creating new topics. They seem to complain less about duplication than new topic creation. Further, (alleged) rude behavior on my part does not give you a green light to perform your own brand of rude behavior. {Arrangement resulting in dispute may no longer be an issue, for I have rearranged it toward a compromise.}

Re: Abstractions are often not based on "levels" in the real world. Sometimes "levels" is a UsefulLie, sometimes it is not. See MeasuringAbstraction. One must be careful when using the word "abstraction" in a definition.

Nobody claimed that types are based on "levels in the real world". But this objection is irrelevant. Types are tools just like other language elements are tools. You can use tools wisely or you can use them foolishly. In the same way, there are problems to which types as tools do not apply.
If you want to contend that RealWorldHierarchies (especially in insurance, financial markets, accounting) do not correspond to hierarchy of types as used in typical OO environments, I will not even contend that. No tool can address all problems in the world. Types are tools within a particular formal language that are designed to address a certain set of typical problems faced by programmers in that language. The fact that we call them types does not automagically promote them to addressing all kinds of modeling problems. But the fact that they cannot be the OneSizeFitsAll and solve all problems in the world, does not make you right to claim ThereAreNoTypes.

RE: Looking for "types" in a given language is a way to test definitions. Ideas and definitions must be tested against concrete things to know if they work. Unless we find a consensus Boolean test for types in a language, a candidate definition probably needs work.

It is very simple to verify the definition. All typed languages define their type universe. Go read the definition of Java, Haskell, ML, etc. All define very precisely what types are in their respective language. There are also untyped languages that do not define their types hence types in those languages do not exist.
One can define flobbles in terms of snorks, and call them "types". However, that does not by itself create a clear definition. There are probably multiple equivalent ways to define languages. Plus, you appear to dismiss DuckTyping as "types".
In typed languages that support DuckTyping, there are your examples of duck types. One example is ObjectiveCaml. Go read about it.

To the objection that type definition is circular: it is not. It bootstraps from primitive types and further types are obtained by composition operators such as RECORD, ARRAY, SUM , "->", TUPLE, etc. Languages that support subtyping typically reflect that by a fixed point operator.

You still have to define "primitive types".

Ah, bon. Do you pretend not to know what primitive types are? Stuff like int, char, bool. They're just that: primitive types that are considered a given for a type system together with a set of operators, that are typically bootstrapped by translating them into a lower level operational environment.

For example, int and the operators on int in CeeLanguage are not definable within the language itself, but you get away with not defining int because the processor itself supports it, so that int x=1; int y=2; int z=x+y; gets translated into machine code for 32-bit arithmetic that is defined by Intel or by Sun or by whomever.

However using int you can define

 typedef struct { int x; int y; } Point;

Which is a record with two components each a primitive type, and by construction, the type Point is not primitive as it is derived using the "struct" type operator in CeeLanguage. Languages other than CeeLanguage may provide a different set of type constructors, adequate for the domain of application and the design of that particular language.
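The bootstrapping can be sketched as a tiny type algebra in Python: a handful of primitive types taken as given, plus constructors that build new types from existing ones. The constructor names (Prim, Record, Array, Arrow) and the show function are made up for illustration and do not correspond to any particular language's definition.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Prim:
    """A primitive type: taken as given, not defined inside the algebra."""
    name: str

@dataclass(frozen=True)
class Record:
    fields: Tuple[Tuple[str, object], ...]  # (field name, field type) pairs

@dataclass(frozen=True)
class Array:
    element: object
    length: int

@dataclass(frozen=True)
class Arrow:
    argument: object
    result: object

def show(t):
    """Render a type expression as text."""
    if isinstance(t, Prim):
        return t.name
    if isinstance(t, Record):
        return "{" + ", ".join(f"{n}: {show(ft)}" for n, ft in t.fields) + "}"
    if isinstance(t, Array):
        return f"{show(t.element)}[{t.length}]"
    if isinstance(t, Arrow):
        return f"{show(t.argument)} -> {show(t.result)}"
    raise TypeError("not a type expression")

INT = Prim("int")
BOOL = Prim("bool")
POINT = Record((("x", INT), ("y", INT)))  # like the struct above
POLYLINE = Array(POINT, 8)
IS_ORIGIN = Arrow(POINT, BOOL)

print(show(POINT))      # {x: int, y: int}
print(show(POLYLINE))   # {x: int, y: int}[8]
print(show(IS_ORIGIN))  # {x: int, y: int} -> bool
```

Nothing here is circular: every derived type bottoms out in a Prim after finitely many constructor applications.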

But this can also be seen as a form of "validation". Something without an "int" tag inside it is flagged.

It is interesting to see arguments about types or classes. I have some experience in one untyped language, and it was by far the most productive language I have used for business style systems. I wonder how many of the participants have used typeless languages long enough to gain intuitive insights into this other way of thinking. My conclusion, having lived in both worlds would be that types are integral to the development of functional systems because they define the behaviour of each attribute. On the other hand, I am convinced that the imposition of types within the source code of the language on every variable is a bit of overkill, and may be responsible for the current objections to type. And behaviour extends past program attribute behaviour - behaviour includes things like conversion to user-readable form and back, preferred display formats etc. In other words, I believe the types should be accessible to, and alterable from the database. And optional in source code - by provision of a generic type which allows the management of collections and atoms. -- PeterLynch

Perhaps discussions about the benefits of types or typelessness should be in another existing topic such as BenefitsOfDynamicTyping, but I am also a "typeless" fan. It results in simpler, easier-to-read code. Regarding putting "types" in a database, do you mean independent of the database's built-in types? Databases are generally supposed to be application-neutral because DatabaseIsRepresenterOfFacts, and putting app- or language-specific types in it is counter to this philosophy. As far as where to put discussions about databases and types, I don't know. Also, there may be a minor distinction between dynamic typing and typeless. TypelessVsDynamic. -- top

So typelessness improves readability when you have to check each and every value in your code for mistakes? Using assertions, filtering, and validation tactics? And it is somehow a good thing that these custom validation tactics of your own may be extremely buggy if your filtering code is not right, since it is hard for a non-expert to get a type system right in his own code? And it is a good thing that you have to reinvent the type system for each application over and over again, versus having it in a central place.. the database definition, the compiler/environment? An expert programmer who made the built in type system and who is smarter than you and I, is not better than reinventing your own type system with filtering and regex functions? All these are rhetorical and critical questions, TopMind. And a regex type system reinvented by you, is more readable than a built in more automated type system written by experts? And rolling your own type system each time for data integrity and embedding this into application logic each time is a good thing, and improves readability? I think not. Sloppy programming in a typeless language with your own type checking (reinvented using filtering and validation functions) may be very readable to start off from one man's perspective. However, it can become a reinvented TurdFanCollision-capable (and non-scalable) mess too. Do you really value data integrity or do you value saving some initial time by declaring everything as a blob of poo? And once you've saved this time, you do realize that you waste more time later, by reinventing your own type system anyway, by checking the values for errors yourself? Go ahead and write some unit tests too, and waste more time reinventing a Unit Test for Type Checks Since I'm Too Stubborn To Admit the Compiler/Type System/Automation/Database Definition could be doing almost all of the work for me.

Typelessness improves readability, but most of all it improves maintainability, not just because the code is more readable, but because the code is smaller - there are not variations of processes required for the same process for different types.

What a bunch of nonsense. You have to check each variable for errors, and this does not improve maintainability. You embed the type checking in your application with your own validation functions, and this becomes a worse mess than built-in type checking. In fact, programmers are too lazy to make their own error checks, so they don't bother, and leave the code wide open for TurdFanCollision and corruption. As for the code being so much smaller for different types: consider ParametricPolymorphism and IncludeFileParametricPolymorphism and other tactics if you really want to save a few keystrokes. In some cases, types are so different from each other that you cannot write the same algorithms for every type anyway, and you have to make exceptions to the algorithms and special rules for each type, causing bugs in your software if you program it generically and forget some little factor. But look at the links I give, such as IncludeFileParametricPolymorphism, which shows an AnyRec or an AnyType, which some languages even support natively now. More inventions can be made to make statically typed and dynamically typed languages (the safest is using both, not just one) easier to write quick algorithms in.

For example, consider a process to get a reply from the user, which would be many different processes in a typed language.

Oh really? And are you sure that your filtering tactics for verifying the posted information are truly secure and not open to attack, and are you sure you wrote the custom validation properly yourself? Better to protect yourself and have a type system save your ass when someone injects a string into your number column and corrupts your database.

"Get a reply from the user"

Reply = Reply.Message(Message, Default, Attribute.Name, Message.Name)

Message is the question - e.g. - "What is your birthdate?"
Attribute.Name is the name of the attribute which is being requested - e.g. - "Birth.Date"
Message.Name is an identifier for this message, relative to the application - e.g. - "Program.Name, Birth.Date"

This gets the reply from the user in whatever manner is appropriate for the current environment - maybe a Message.Box, maybe a command line interaction. It ensures that the reply is appropriate by looking up the definition of Birth.Date, which tells the process that the requested item is a date (at least). It allows consistency across all messages - a message whose Attribute.Name indicates a foreign key would provide a list of choices, a drilldown to the owner (foreign) entity, and a search button in the message box. The message name allows the behaviour of the message to be altered without touching the source code. The default behaviour I use is to 'remember' the answer so that the next time this message appears, the default shown will be the previous answer. (This is inappropriate for a birthday, but this is only an example.)

The resultant Reply is returned in internal format - in the case of a date, this is a "days since" number.

Top - this was the beginning of the answer to your question above. Here is the rest of it. (Dunno why this happened.) The Attribute.Name in the argument list of Reply.Message is a name for the variable being requested - for the variable to be returned to the caller. In the database, a table called Attributes contains an entry for the Attribute.Name specified. This entry provides the Reply.Message process with everything it needs to know about the attribute: the display length, the entry length, the Type (like Date, Number, Phone.Number), Plurality, Owner.Entity (for a foreign key), etc. That is what I mean about the DataBase containing the type. Maybe I should say the MetaDataBase (though I prefer to call it the database, because one item's metadata is another item's data). -- PeterLynch
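To make the idea above concrete, here is a minimal sketch of one Reply.Message-style process driven by an Attributes table. Everything named here is an assumption for illustration only: the ATTRIBUTES dictionary stands in for the database table, the epoch for the "days since" format is picked arbitrarily (the description above does not name one), the Message.Name argument is left out, and the ask parameter is a hypothetical hook so the environment (message box, command line) can be swapped.

```python
from datetime import date

# Hypothetical epoch for the "days since" internal format; not given in the thread.
EPOCH = date(1970, 1, 1)

# Hypothetical stand-in for the Attributes table (only two of its columns).
ATTRIBUTES = {
    "Birth.Date": {"type": "Date", "entry_length": 10},
    "Quantity": {"type": "Number", "entry_length": 6},
}

def to_internal(attribute_name, raw):
    """Check raw text against the attribute's metadata and convert it."""
    entry = ATTRIBUTES[attribute_name]
    if len(raw) > entry["entry_length"]:
        raise ValueError(f"{attribute_name}: reply too long")
    if entry["type"] == "Date":
        # Dates come back as a "days since" number.
        return (date.fromisoformat(raw) - EPOCH).days
    if entry["type"] == "Number":
        return int(raw)
    raise ValueError(f"unknown type {entry['type']!r}")

def reply_message(message, default, attribute_name, ask=input):
    """One process for every attribute: ask, fall back to the default, convert."""
    raw = ask(f"{message} [{default}] ").strip() or default
    return to_internal(attribute_name, raw)
```

Note that the caller's code contains no type-specific variation; whether that lookup counts as "a type system in the database" or as "validation" is exactly what the thread is arguing about.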

I don't see how it is different from validation. Validation can contain a lot of orthogonal checks, BTW, such that it may not be converted into a type tree.

Specifications for validation are part of what may be called the Types entity in the metadata. Each Attribute.Name has a Type. I use the term Attribute.Class for Type in this http://peterl.homelinux.org/ENTITIES/ATTRIBUTES.HTM. And the Attribute.Class is defined here http://peterl.homelinux.org/ENTITIES/ATTRIBUTE_CLASSES.HTM - this will either clarify what I am trying to describe as types, or confuse you altogether. (Apologies that that link has not worked for a while. The machine died. I am fixing it - should be ready soon . . . . (25 July 2005)).

For example, some strings perhaps must be all capitals. Others may not allow quotes, while others (such as file names) might only allow alphanumeric and an underline. These are not necessarily mutually exclusive. Some might want only capitals and alphanumeric, some only lower-case and only letters (no numbers), etc. To get "types" for this we may have to make a CartesianJoin of every possible combination. Ugly. It is much more logical to create set-based validation where the combinations are not hard-wired to each other. (Sometimes some may be mutually-exclusive, but we must deal with such on a case-by-case basis.) -- top

What you are describing, TopMind, is a rich type system that is exactly what is described on dbdebunk.com on this page: http://www.dbdebunk.com/page/page/1812707.htm. What you are doing is avoiding the term "type" because for some reason you just don't like it. What you have to do is admit that a type system is exactly what you are describing and stop beating around the bush with your own vague definitions, on which you have no clear consensus. The reason it isn't clear to you is that you just can't admit it and just can't give in to the evil type system which you seem to hate, for some reason. A type system is about classification and integrity, and the dictionary even defines a type system as classification. I see a pattern here: what I declare as a type system is what the experts declare it as, while you are off making your own buzzwords for the type system. Let's stop creating synonyms and stop adding confusion to the entire type system definition (by using phrases like "it's a process" and similar). A word such as type or domain helps us talk in English without going on and on about a million other things that could be just like a type system. Please. NotInventedHere syndrome? Don't want to admit that a type system is useful, so you have to reinvent your own terminology?

Can you give an example or something to illustrate what you mean? Maybe use your TopsQueryLanguage as PseudoCode.

Rule A - Character set Aa to Zz (all letters)
Rule B - Character set 0 to 9 (digits)
Rule C - Char set rule A or rule B or underline "_"
Rule D - All capitals
Rule E - All lowercase
Rule F - All characters except quotes (single or double)
Field 1 - All capitals only: rule D
Field 2 - No quotes allowed: rule F
Field 3 - A file name: rule C
Field 4 - A mainframe file name: rule C and D
Field 5 - All lower-case chars and no quotes: rule E and F
Field 6 - All upper-case chars and no quotes: rule D and F
Field 7 - Only upper-case letters: rule A and D
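As a rough sketch of the set-based validation listed above, each rule can be a small predicate and each field a list of the rules it must satisfy, so no combination is hard-wired. The function and field names are made up for illustration. One assumption is made loudly: rules D and E are read as "contains no lower-case letters" and "contains no upper-case letters", because Field 4 (rule C and D) only makes sense if digits and the underline pass rule D. Only ASCII is considered.

```python
import string

LETTERS = set(string.ascii_letters)
DIGITS = set(string.digits)
LOWER = set(string.ascii_lowercase)
UPPER = set(string.ascii_uppercase)

def rule_a(s):  # letters only
    return all(c in LETTERS for c in s)

def rule_b(s):  # digits only
    return all(c in DIGITS for c in s)

def rule_c(s):  # each character is a letter, a digit, or "_"
    return all(c in LETTERS or c in DIGITS or c == "_" for c in s)

def rule_d(s):  # all capitals (assumed: no lower-case letters anywhere)
    return not any(c in LOWER for c in s)

def rule_e(s):  # all lowercase (assumed: no upper-case letters anywhere)
    return not any(c in UPPER for c in s)

def rule_f(s):  # no single or double quotes
    return "'" not in s and '"' not in s

# Each field names the rules it needs; rules are combined with "and".
FIELDS = {
    "Field 1": [rule_d],
    "Field 2": [rule_f],
    "Field 3": [rule_c],
    "Field 4": [rule_c, rule_d],
    "Field 5": [rule_e, rule_f],
    "Field 6": [rule_d, rule_f],
    "Field 7": [rule_a, rule_d],
}

def valid(field, s):
    """True when s satisfies every rule attached to the field."""
    return all(rule(s) for rule in FIELDS[field])
```

For instance, valid("Field 4", "PAYROLL_01") holds while valid("Field 4", "payroll_01") does not. Whether each entry in FIELDS deserves to be called a "type" is the open question; the sketch only shows that the combinations need not be enumerated up front.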

I would give every one of these instances of type a name appropriate within the context of the application, which I cannot determine from here, and implement the functionality for that type if necessary. And apply that name to the desired data elements. The fact that there is commonality in the definitions is irrelevant - in time, any one of these types may evolve in another direction. -- PeterLynch

Sounds like using types to implement set theory to me. Possible, but are types just sets then? Questions about the difference between sets and types have come up before, but with no consensus answer. One limitation of static types here is that it is difficult to express Boolean combinations of validation rules with them, which probably also complicates forming a clear definition of "types".

Huh? A set belongs to a domain (which is a type). Please see the Third Manifesto quote as follows: "A domain is a named set of values. Such values, which shall be of arbitrary complexity, shall be manipulable solely by means of the operators defined for the domain(s) in question (see RM Prescription 3 and OO Proscription 3)---i.e., domain values shall be encapsulated (except as noted under RM Prescription 4). For each domain, a notation shall be available for the explicit specification (or "construction") of an arbitrary value from that domain. Comments: We treat the terms domain and data type (type for short) as synonymous and interchangeable. The term object class is also sometimes used with the same meaning, but we do not use this latter term." --TheThirdManifesto

http://wiki.rubygarden.org/Ruby/page/show/TypesInRuby is a better description of what I was trying to say about types. -- PeterLynch

Peter, Ruby isn't the best place to cite when discussing types, nor is the PHP manual or similar product pages. Ruby is mostly a product that is hyped by its developers, as is something like PHP. We need references backed by experts on the database and relational model when discussing these topics. However, if a product ever gets it right (or closer to right), then I think we can reference the product pages more, so hopefully in the future products will be good places to cite. Given that Ruby's home page declares everything to be an object (in purist form, whatever an object may be; in vague form, hence no real purism at all, ironically), I can't say Ruby is usually a good place to cite, although they may have some things right about certain topics. It's a risky bet, though.

One consequence is that type checking at run time with the instance_of? method does not help find errors, and also artificially constrains clients of your code resulting in code that is "brittle" and hard to maintain. -- That Ruby Page

Their definition of how type systems help us keep integrity, is brittle, at best.

Therefore, in Ruby, unit tests take the place of static and runtime type checking. Learn how to use TestUnit, it is your friend! You end up with better checks than a compiler for statically typed language can give you, and code that is flexible and easy to modify. -- That Ruby Page

Again: Their definition of how type systems help us keep integrity, is brittle, at best. Data that is well structured is not always something that we want to easily modify - just as corruption is something that is easy and good, no? i.e. I will repeat.. their definition of types is vague and brittle at best. Better we use references such as what TheThirdManifesto calls a named set or domain (which they shorten to type, for clarity.. thankfully).

[Top - Do you actually implement validation this way? I can't see any benefit from modelling validation rules as sets of constraints, when you can implement a per type (or per field) regexp instead. Sure, you can conceptualize it as a set of constraints, but what's the advantage? Do you have a need for ad-hoc queries against the validation rules? I find that RDBMSs and relational concepts in theory are great for creating and describing relationships between data but are in general useless for (not incapable of, note) modelling behavior, which I consider validation to be.]

I have created DataDictionary's that tend to have such aspects and I enjoyed being able to review the info from different visual/query perspectives, but this topic is really about definitions, not practicalities, so I would rather not get into that here. And I have also used character templating tools that are similar in concept to reg-ex's and did indeed find them useful. But the above was mostly meant as a thought experiment to show that one does not need "traditional" concepts of types (integer, float, date, etc.) to ensure clean data. Thus, a definition that resembles, "types are what you use to ensure clean data" is not sufficient unless types == validation. -- top

So writing your own regexes, Top, is not error-prone compared to having an expert write a type system for you (automation, using OtherPeoplesCode that has been tested by hundreds more people than your custom type system will ever have seen, i.e. avoiding rolling your own)?

Integers, bytes, and such things are not the only types. They have just blinded us, because our current languages lack richer constrained type systems. The problem is that types take time to implement to perfection, and there may never be a perfect type system, but we can at least agree on what a type is and aim for perfection with time. Not everything is done yet in computing, and we are improving, but this doesn't mean we should avoid types and start vaguely calling them processes or whatever we want. See the dbdebunk website at the following link for what types are about (integrity, classification, named sets): http://www.dbdebunk.com/page/page/1812707.htm Your regex solution is a hack. Yes, regexes could be plugged in as part of a type system, but ultimately what you would want is something less error-prone and sloppy than regexes. True parsers and true validation on a char-by-char basis (plugin functions), as described on the ConstraintType page, would ultimately be a better type validation for user-defined run-time types in a database definition.

Types are hard to get right since the world is very complex (with lots of things that we need to store in databases). But avoiding the term type and reinventing your own vague terms for types won't help us get anywhere. The type systems in current languages like C, Pascal, etc. are not as constrained as they could be, and there needs to be work done so that types are more specific to the tasks we are doing. Hence the page I link to above on dbdebunk, which is very similar to what I talk about in ConstraintType. So it is not just about integers, bytes, and strings, but about constrained types, some of which may be abstracted from integers, bytes, and strings. Our current programming languages don't support rich enough types because it is very hard to have a type system that supports every human law, such as Weight not being able to be added to a Quantity type, yet Weight being able to be multiplied by Quantity. Regardless, don't deny the type definition and call it something else; that just causes confusion. Whether our current type systems are rich enough or not does not change what a domain and a type are.
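The Weight and Quantity rule mentioned above can be sketched as two small user-defined types whose operators encode which combinations are legal. This is only an illustration: the class layout and the kilogram unit are assumptions, and in Python the illegal addition is rejected at run time rather than by a compiler.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    count: int

@dataclass(frozen=True)
class Weight:
    kilograms: float  # unit chosen arbitrarily for the sketch

    def __add__(self, other):
        # Weight + Weight is legal; Weight + anything else is not.
        if isinstance(other, Weight):
            return Weight(self.kilograms + other.kilograms)
        return NotImplemented

    def __mul__(self, other):
        # Weight * Quantity is legal and yields a Weight.
        if isinstance(other, Quantity):
            return Weight(self.kilograms * other.count)
        return NotImplemented
```

With these definitions Weight(2.5) * Quantity(4) gives Weight(10.0), while Weight(1.0) + Quantity(3) raises TypeError, because returning NotImplemented from both candidate operators makes Python reject the expression.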

A type is a restriction on the structure of software.

This is based on the observation that types can be considered as descriptions of interfaces. Interfaces are what give software its structure, since they enforce that the mechanism by which two pieces of software interact is stable. Types describe restrictions on the structure.

All these words and still no definition. What's wrong with this:

A type is a set of values such that calculation with types approximates calculation with values.

That is, functions invoked with values belonging to their argument types result in values belonging to their result type. If the types don't fit somewhere, the values most likely don't fit either. In other words, the algebra of values is homomorphic to the algebra of types.
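One way to picture the claim above is a toy expression language with two interpreters: one computes values, the other computes types without running anything. The language, its nested-tuple encoding, and the function names are all invented for this sketch. For every expression the type checker accepts, the type of the computed value matches the computed type.

```python
def evaluate(expr):
    """Value algebra: compute the value of a nested-tuple expression."""
    op = expr[0]
    if op == "lit":
        return expr[1]
    if op == "add":
        return evaluate(expr[1]) + evaluate(expr[2])
    if op == "less":
        return evaluate(expr[1]) < evaluate(expr[2])
    if op == "if":
        return evaluate(expr[2]) if evaluate(expr[1]) else evaluate(expr[3])
    raise ValueError(f"unknown operator: {op!r}")

def type_of(expr):
    """Type algebra: compute the type of an expression without evaluating it."""
    op = expr[0]
    if op == "lit":
        return type(expr[1])
    if op == "add" and type_of(expr[1]) is int and type_of(expr[2]) is int:
        return int
    if op == "less" and type_of(expr[1]) is int and type_of(expr[2]) is int:
        return bool
    if op == "if" and type_of(expr[1]) is bool and type_of(expr[2]) is type_of(expr[3]):
        return type_of(expr[2])
    raise TypeError(f"ill-typed expression: {expr!r}")

example = ("if", ("less", ("lit", 1), ("lit", 2)),
           ("add", ("lit", 3), ("lit", 4)),
           ("lit", 0))
```

Here type_of(example) is int and evaluate(example) is 7, an int. The approximation is conservative: ("add", ("lit", 1), ("lit", True)) is rejected by type_of even though Python would happily evaluate it.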

Note that nothing is said about subtyping or whether types have to be disjoint. A subtype is simply a subset of another type.

Any language has at least one type algebra: the trivial algebra with only one type. This applies to so-called dynamically typed languages and can be embedded in statically typed languages. To be useful, type algebras should be more expressive than that, but still be easier to compute in than the corresponding value algebra. Otherwise, approximating the value algebra in the first place would make no sense. Usually type algebras are decidable, while value algebras are not.

Dynamic type checking is actually a misnomer. When types cannot be checked before running the program, they are not really types. In dynamically typed languages there is one true type, which is a tagged union with each arm corresponding to a primitive type or a type constructor. Consequently the dynamic "types" should probably be called tags instead. (Smalltalk doesn't even have tags.) Confusing them with real types only leads to strange arguments and strange definitions like this one:

A type is the sum of behaviours under all conditions.

Well, wouldn't that just define a value? If not, what's the difference? The difference is that ThereAreNoTypes at runtime, only tagged values.
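The "one true type, which is a tagged union" view can be sketched as follows. The Tagged class and the add function are hypothetical names; the point is only that every operation has the same signature (Tagged in, Tagged out) and that what looks like a type check is an inspection of tags while the program runs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagged:
    tag: str         # one arm of the union, e.g. "int" or "str"
    payload: object  # the underlying value

def add(a: Tagged, b: Tagged) -> Tagged:
    """Statically this is always Tagged -> Tagged; the tags decide at run time."""
    if a.tag == "int" and b.tag == "int":
        return Tagged("int", a.payload + b.payload)
    if a.tag == "str" and b.tag == "str":
        return Tagged("str", a.payload + b.payload)
    raise TypeError(f"cannot add {a.tag} and {b.tag}")
```

Calling add(Tagged("int", 1), Tagged("str", "x")) is perfectly well-typed in the one-type algebra; it only fails when the tags are compared during execution.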

Types are a way for the compiler to say NO! after the parser says YES!

There are a few other mechanisms for the same purpose, such as whitebox analysis. Types differ from these by being 'compositional' (and hence 'modular'). You know that two subprograms compose if their types are compatible. (Well, at least they compose according to the type system and grammar. If they use mutexes, threads, shared state, etc. they might still deadlock or behave badly.)

Similar concepts whose relationship to and difference from "types" is not entirely clear:

Classification
Validation
OOP Class
Sets

Think in semantic, not pragmatic programmatic, terms:

A type is a classification over objects (or entities) which exhibit a set of one or more common natures, behaviors, attributes, and/or qualities. Types are typically used in a descriptive (implicative) sense, but programming languages use them in a prescriptive (assertive) sense. Types usually convey a notion of classifying objects, as opposed to describing them, although there usually is considered to be an association by the type with commonalities exhibited by all objects of a type.

What are types? They are a device put here by our saucer-driving zoo-keepers to drive WikiZens insane by trying to define them and model them to increase profits for the zoo because insane animals draw more visitors.

I don't believe your insanity can be blamed on types.

There are different types of insanity.

Start here: A type is one of {int, float, char} or a compound of such denoted with typedef. See ConfusedComputerScience and find the OneTruePath, otherwise all this ThreadMess will confuse even the best of PhilosophaeDoctorae. --MarkJanssen

Apparently there's only one programming language, too -- C. Or TinyC, actually, because apparently long and double aren't types. Is "unsigned int" a type? Is a named struct a type?

unsigned int is an int; double or float is a type depending on which floating-point operations the machine implements natively. Otherwise your compiler is going to have to fake a native type.
What does that have to do with what is, or isn't, a type? Whether a type is implemented in hardware or not has no bearing on whether it's a type or not.
{Hardware, Software, WetWare, BSware, LaynesLawWare, oh the fun!}
Drinking and driving on the Information Superhighway again, are we, Top?

By the way, beware of attributing confusion to others that belongs to you. Humans have an odd habit of claiming "this will confuse others" when they mean "this confuses me."

Generally, C is just "lucky". There aren't too many languages that can compile themselves into machine code, particularly cross-platform. You probably don't understand the monumental effort that represents.

As it happens, I know precisely what effort is required to make cross-platform languages that can compile themselves into machine code. What I don't know is what that has to do with anything else in this section.

The type of a datum defines the form, fit and function of its values.

A datum of type integer has the form of a whole number, the fit of plus or minus infinity, the function of addition, subtraction etc.

The type may be implicit via inspection or context, or explicitly defined.

See also that even the purists have arguments about types and such subjects:

http://www.dbdebunk.citymax.com/page/page/3317138.htm (HughDarwen, FabianPascal, etc.)

Read DefinitionsOfTypes -- ChanningWalton

See: ThereAreNoTypes, AreTypesTiedToSyntax, QuestionsForTypeDefinitions

CategoryLanguageTyping, CategoryPolymorphism, CategoryTypingDebate, CategoryClassification