Alternative Type Definitions

Branch of WhatAreTypes

According to TopMind, there's a wide variety of "types", or definitions for types, in programming language definitions. Of course, TypeTheory and programming language theory have a very consistent view on the subject, but nevertheless here they are:

Candidate definitions allegedly collected by Top:

(Please don't embed disputes in the list itself. It serves as a table of contents.)


Of course, the claim that Top has collected these is of no use if he cannot provide some proper context (at least a proper citation for the source, the importance and usefulness of the source, etc.). Otherwise, we could go on a spree collecting everybody's opinion of how we should define "atom", but that doesn't mean that all the opinions collected warrant representation on a wiki page. As WardCunningham puts it, wiki does not offer a guarantee that everybody's 2c will be preserved for eternity like in a mausoleum, regardless of merit and value to wiki.

What does it mean to respond to "WhatAreTypes?" with "Substitutability"? It means nothing. It's purely and simply crap. It's top's in-your-face laziness: he's trying to articulate that in some type systems substitutability is an important aspect related to types, and that there are different approaches to how type systems should handle substitutability (ManifestTyping with names and explicit declarations, versus DuckTyping, etc.). But he cannot in all honesty claim that "substitutability" is a definition of types or an explanation of what types are.

I may be providing a UsefulLie whereas JohnReynolds may be providing a UselessTruth (at best). This approach may be more usable for newbies. Further, Reynolds' work is not really a definition, but rather a model under which to analyze types. The implication is that something "is a" type if it can be reworked to fit Reynolds' model, but not if it can't. For one, this is not how "normal" definitions work. It's mistaking models for definitions. Second, it could be that one's lack of creativity is preventing them from fitting (re-mapping) something "into" Reynolds' model, not any inherent lack of type-ness. We want to determine "type-ness", not human ingenuity. It's a disturbing problem that my detractors do not appear to wish to resolve. Even if you are crazy enough to be satisfied with "it's a type if it can be made to fit Rennie's model", that still makes it a bigass 100+ page definition, which would make normal researchers strive for a shorter variant. Excessively long definitions are a smell. -- top


And, of course, Top does not realize he is arguing not for another type definition, but for a design methodology that, within a given language or programming environment, helps the developer identify the types. Therefore he protests the OO methodology of trying to identify a hierarchical taxonomy in the real world that translates into OO "types".

I agree that this is a valid criticism. The criticism, however, cannot be applied against TypeTheory, because the theory offers you just the tools, and the tools are valid and adequate when put to good use. The criticism is against the OO method for using the tools. So he barks up the wrong tree.

The people who set out to make say SmalltalkLanguage were not interested in implementing any math-based type theory.

Yes, Smalltalk chose not to use the tools given by TypeTheory. It's a trade-off that Smalltalkers consider worth making. As a matter of fact, David N. Smith, a noted authority on Smalltalk, is on record in his Smalltalk FAQ saying that Smalltalk has classes, but that it has no types in the proper meaning of the word. Does any Smalltalker around have a copy of David Smith's Smalltalk FAQ?

But some Smalltalk fans do claim that Smalltalk has "types", just not in the sense used in other languages. I invite them to explain in more detail.

It might be more accurate to state:

However:


Re: (at least proper citation for the source, the importance and usefulness of the source, etc)

Reliance on ArgumentFromAuthority is problematic.

Also, I have not seen any clearly superior definitions that fit within, say, 2000 words. If a definition takes an entire book, it probably has very limited practical use.

Last I checked, a reasonable and thorough explanation of the RelationalModel requires more than 2000 words (ChrisDate's book is over 1000 pages). Can we assume the RelationalModel has limited practical use then? The notion that ScienceShouldBeEasy is a well-known fallacy.

Date's book is mostly about explaining techniques and issues, not defining. Relational can be reduced to a small set of primitive operations, similarly (though not quite as small) to how Boolean Algebra can be defined compactly. -- top
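
A quick illustrative sketch of that reduction (an added example, not Date's; Python stands in for any language, and the relation-as-list-of-dicts representation is just a convenience): a handful of primitives from which other operators, such as a natural join, are derivable.

  # Relational core as a few primitive operations over lists of dicts.
  def select(rel, pred):                  # restriction
      return [row for row in rel if pred(row)]

  def project(rel, attrs):                # projection (removes duplicates)
      out = []
      for row in rel:
          r = {a: row[a] for a in attrs}
          if r not in out:
              out.append(r)
      return out

  def product(r1, r2):                    # Cartesian product
      return [{**a, **b} for a in r1 for b in r2]

  emp  = [{"name": "Ann", "dept": 1}, {"name": "Bob", "dept": 2}]
  dept = [{"dno": 1, "dname": "HR"}, {"dno": 2, "dname": "IT"}]
  # A natural join is just product, then select, then project:
  joined = project(select(product(emp, dept),
                          lambda r: r["dept"] == r["dno"]),
                   ["name", "dname"])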

And that's just the basics. How about:

There's a lot of material there. While it might be possible to cover the above (or at least a reasonable subset) in 2000 words or less, it would be a very cursory treatment. Methinks it would be easier to provide a gentle introduction to TypeTheory (again, a cursory treatment) in 2000 words or less.

Especially if I'm allowed to elide the complicated parts on the grounds that they're an ImplementationIssue?. TypeTheory and the RelationalModel both have SetTheory as their basis.

I like to compare relational with BooleanAlgebra: you have well-defined operands and well-defined operators. Relational operators are perhaps not as "final" as Boolean because new ones have not been ruled out, but the concept is similar. Related: CriteriaForGoodMathOrCompactNotation -- top


(Moved from BadIdeasShouldBeKept?)

Re: "Just because one can make a lot of math around an opinion does not change the the fact that it is just an opinion."

See also: LaynesLaw.

A curious form of the EveryoneHasHisOpinion fallacy. While ArgumentFromAuthority is also frequently a fallacy, in a scientific context ArgumentFromAuthority is far less fallacious when quoting a scientist in his/her area of expertise, whose writings are the result of solid, peer-reviewed research. To suggest that scientists cannot base work on the discoveries of other scientists would completely undermine science. It's anti-intellectual PoMo nonsense.

On the specific questions of WhatAreTypes: TopMind could do far better than resorting to the immature, demagogic technique of dismissing the authorities on the matter out of hand (and using diminutive nicknames like "benjy" for BenjaminPierce) - by employing this technique, Top is wearing the robes of a pseudo-scientific snake-oil salesman.

Top could point out, for instance, that there is no universal set of primitives - there are an infinite number of ways of constructing the natural numbers (ChurchNumerals?, PeanoArithmetic, etc.) - any practical type system must either ignore this (and assume the existence of numbers) or state how they are constructed. Drilling down a layer, many such systems are based on SetTheory - one might ask: which set theory? ZF? Or something else?
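
For the curious, here is a tiny sketch of one such construction (an added illustration, assuming nothing but Python closures): ChurchNumerals? encode each natural number as a higher-order function, so no built-in numbers exist until the final conversion back.

  # Church numerals: the number n is "apply f to x, n times".
  zero = lambda f: lambda x: x
  succ = lambda n: lambda f: lambda x: f(n(f)(x))

  def to_int(n):                # escape hatch back to built-in integers
      return n(lambda k: k + 1)(0)

  three = succ(succ(succ(zero)))
  assert to_int(three) == 3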

Of course, all of this is abstract minutiae, and stuff Top probably has no clue about.

And, natural numbers don't even have to be part of a programming language. They can be farmed off to add-on libraries. Let's leave that as an exercise for the FunctionalWeenies :)

In some ways, TypeTheory is a meta-theory, as one can construct numerous different type systems. Speaking of a "type" without an associated type system is meaningless. In the case of programming languages - they all define their type system, so speaking of a type in the context of a given programming language is quite useful and meaningful.

Are you suggesting that a clear language-neutral definition is not possible?


I think a fundamental part of the problem of the current debate is that TopMind and CostinCozianu are talking past each other here. Two issues are being confused:

In the case of the first one, TypeTheory is a mathematical discipline subject to axiomatic proof, and as such is beyond reproach from the likes of TopMind. The work that JohnReynolds, RobinMilner, LucaCardelli, BenjaminPierce, and others do in this field is applied math; if you wish to argue against it you should have some proofs of your own ready. Were someone to successfully disprove TypeTheory, it would be a significant result indeed.

In the case of the second issue, there is much debate. The correctness of TypeTheory says nothing about whether or not one is more productive with StaticTyping or DynamicTyping - questions of productivity are economic and human-centric in nature - part of the reason why ComputerScienceIsaSoftScience. The debate uses TypeTheory (a hard discipline) as a foundation, but involves many other factors.

In short: TypeTheory is a tool. Whether or not it's a useful tool has little to do with its correctness (assuming it is correct, of course - if TypeTheory itself were a load of codswallop, it probably wouldn't be useful to anybody).

My opinion on the matter: TypeTheory is indeed a useful tool in many contexts. In some, it probably should be a required tool, where the increased ability to reason about a program's behavior that type theory affords might make the difference between life and death. If you're doing a CrudScreen, on the other hand, I'm for leaving it up to the programmer(s) and customer(s) to figure out what practices are most likely to produce the desired result in the most cost-effective fashion.


PageAnchor: typesAsFlags

In a practical sense, "types" in strongly-typed languages are basically just little hidden or not-readily-seen flags in each variable (compile-time or run-time) that indicate what operations can be done with that variable. Variables that may be derived from compound or multiple types must use various algorithms to determine the final type and/or which operations are valid. Some languages require a final single "type", while others use hierarchies and other means to allow a "type path search" to be done on a derived variable to know which operations are valid.

In some languages, the operations and their types must be "registered" together ahead of time somehow, while others allow each user of a type to determine how it is going to handle a given type on its own.

Sometimes, more than one flag may be needed. For example, if we define type "foo" in terms of type "int" (integer), then anything of type "foo" is both of type "foo" and of type "int". Type systems are basically the management of the trees and/or graphs that determine what type something is and what operations are allowed on it.
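
A toy sketch of the multiple-flag idea (illustrative only; the names are made up): a typedef-like declaration leaves each value carrying both its declared type and its base type, and an operation merely checks for membership.

  # "foo" is declared in terms of "int", so a foo value carries both flags.
  DECLARED = {"foo": ["foo", "int"]}

  x = {"value": 7, "typeFlags": DECLARED["foo"]}

  def accepts(val, required):   # is an operation on 'required' valid here?
      return required in val["typeFlags"]

  assert accepts(x, "foo") and accepts(x, "int")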

More examples are given below.


For the information of Top, quite a large part of the body of knowledge called TypeTheory has found its way into large-scale industrial languages that dominate the current programming landscape. I'm referring here to Java 5 and CeeSharp (the version available in VisualStudio 2005). So let's rephrase some of the above claims to adjust them a little bit to reality. Shall we?

What is a practical example of how it has improved them? Also note that I did not claim that no popular language was influenced by the work.

Some more specific examples of how TypeTheory has informed/improved programming language design:
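
One concrete instance (an added sketch, written with Python type hints rather than Java or C# syntax): parametric polymorphism - the "generics" that arrived in Java 5 and the VisualStudio 2005 version of CeeSharp - came out of TypeTheory, and lets a container be checked per element type before the program runs.

  # Generics a la Java 5 / C# 2.0, sketched with Python type hints.
  from typing import Generic, List, TypeVar

  T = TypeVar("T")

  class Stack(Generic[T]):
      def __init__(self) -> None:
          self._items: List[T] = []
      def push(self, item: T) -> None:
          self._items.append(item)
      def pop(self) -> T:
          return self._items.pop()

  s: Stack[int] = Stack()
  s.push(3)            # fine
  # s.push("three")    # a static checker (e.g. mypy) rejects this line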


Moved from MostHolyWarsTiedToPsychology.

Nonsense. It's mostly TopMind using "psychology" as a trick to gain some kind of status from which to pretend a reasonable disagreement with mainstream ComputingScience, for example TopOnTypes, RunTimeEngineSchema, etc. Most people know when different approaches amount to a difference in style and culture, when a criticism is valid, or when somebody wants an argument ad perpetuum.

If it is a cut-and-dried situation, let's see the evidence. I don't deny that there is a lot of math surrounding "type theory", but if it is tied only to syntax, it is possibly flawed. Besides, types aside, there are still many other topics that depend on psychology so far. PsychologyVsCutAndDryPoll.

Again, Costin and TopMind are arguing past each other somewhat. Costin refers to TypeTheory, a well-grounded mathematical theory of some import in ComputerScience. Top seems to think that it has questionable relevance to programming - a claim which can be made without discrediting TypeTheory itself. However, Top's argument against the application of TypeTheory to programming and programming language design is that "some people don't like it". While it's undoubtedly true that some people don't like it, other people find it extremely useful in modelling the user domain on a computer system. Proponents of TypeTheory as a useful programming tool do not carry the burden of convincing 100% of practicing programmers, DBAs, etc. of its merits.

Perhaps we should move this somewhere else, but my complaint is not so much about usefulness in this case (that is another story), but about it being tied to syntax. One can have identical "typing" features using non-syntax-based "languages". Just because one does it in a structure instead of text should not change the nature of "typeness". That would be silly. ("Typeness" as defined in the strong typing language sense.) -- top

TypeTheory applies even to (well-designed) languages that don't have StaticTyping (manifest or otherwise). As mentioned on another page, SmalltalkLanguage has consistent type semantics, even though the syntax of the language makes little mention of "types". The syntax of Smalltalk does provide several explicit operations which have impact on types. When an object is created, one must send the "new" message to its class; usually the class is explicitly named (though "aClass new" is perfectly legal in Smalltalk - the equivalent is not legal in C++ or Java). Inheritance is another (inheritance is always made explicit), and defining a method within a class is another (if a class understands a given message, then it's a subtype of the implicit type of all classes which understand that message). It sounds like the argument reduces to how much type annotation should be provided by the programmer, which is a far cry from suggesting that TypeTheory is an interesting mathematical diversion with no practical use.
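
A quick sketch of that implicit, message-based notion of type (an added illustration; Python behaves like Smalltalk here): any object that understands the message belongs to the implicit type, with nothing declared anywhere.

  class Duck:
      def speak(self): return "quack"

  class Robot:
      def speak(self): return "beep"

  def converse(thing):
      # "thing" belongs to the implicit type "understands speak"
      # simply by defining the method; no declaration is involved.
      return thing.speak()

  print(converse(Duck()), converse(Robot()))   # quack beep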

The above is not about the practicality of types or of dynamic types. The RunTimeEngineSchema is based on "static typing" because it checks the tables before running the stored instructions.

Hahahahaha. Top, I hate to be rude...

I doubt that, based on the frequency.

but that remark betrays a vast amount of ignorance on the subject. StaticTyping means types are checked - and the program proven typesafe - before the program is run at all.

This is what I proposed.

Checking during execution, but before a given function runs (and then rechecking the next time the function is called, etc.) is dynamic typing.

This is not what I proposed.

[If I misunderstood what you meant, I apologize. Your use of "runtime" in this context threw me; a language runtime normally doesn't get involved until a program written in the language is running.]

Granted, this sort of dynamic typing (checking that a function's actual parameters match a type specification) is different from DuckTyping (where typechecks are deferred until the last possible instant - which is what Smalltalk does), but it still is DynamicTyping. StaticTyping, by definition, involves no such thing as a "runtime engine schema", because it involves no runtime at all. Static typing means "an expression is proven to have a particular type (or a subtype) in all possible execution paths of the program and for all possible inputs".
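
To make the distinction concrete, here is a sketch (an added illustration, not anyone's proposal above) of the per-call parameter check just described; the spec is consulted every time the function runs, which is exactly what makes it DynamicTyping rather than StaticTyping.

  def typed(*spec):
      """Check actual parameters against a type spec on every call."""
      def wrap(fn):
          def checked(*args):
              for arg, t in zip(args, spec):
                  if not isinstance(arg, t):
                      raise TypeError("expected " + t.__name__)
              return fn(*args)
          return checked
      return wrap

  @typed(int, int)
  def add(a, b):
      return a + b

  add(1, 2)          # checked, and passes, at call time
  # add(1, "two")    # fails at call time -- no compile-time proof involved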

It is confined to an expression? Where does this come from? Most compilers convert expressions into a data structure anyhow (such as a tree) before performing such checks. They do not perform it on raw text.

[An "expression" refers to lots of things - any term in the source code, attributes of objects, formal parameters of functions, etc.].

This is why static typing is fast - it has zero runtime overhead. This is also why it is frequently not decidable; it's far too easy to construct programs where the type of an expression depends immediately on a program's inputs. The main objection to StaticTyping comes from the fact that making statically-typed languages fully decidable requires limiting what you can do with objects. You won't find "aClass new" in a statically-typed language, for instance.

I am not debating the merits of dynamic versus static here, but about the merits of definitions.

Further, I don't give a flying pluck what "mainstream computer science" says. The mainstream said COBOL and MS-Windows were grand.

Bullshit. Mainstream computer science never claimed that. Just go and read EwDijkstra's manuscripts. READ, for crying out loud; that would be a welcome change for you, rather than pretending you can reinvent everything, Top style. -- Costin

Actually, most academic computer scientists of the day thought Cobol was terrible. In my view, it was a usable, serviceable language for its time, though certainly not one that was state of the art. I wouldn't dream of starting a project in Cobol today. But your point is? Just because scientists thought one thing, then modified their thinking in the face of newer evidence, isn't a weakness of science. It's a strength. -- ScottJohnson

Academic != mainstream. CobolCausesBrainDamage describes how the "academics" tended to get caught up in their theoretical underwear such that business-savvy practitioners were the only ones able to produce something usable before the given deadline. In my opinion, the best tools come from the merging of brainy ideas and results-oriented practitioners who make the ideas workable and marketable. TwoHeadsAreBetterThanOne?.


Typing is a way to ensure that polymorphic dispatching finds one and only one method before methods are actually called, or at least a way to reduce polymorphic ambiguity.

More generally (and more correctly), typing is a way to a) ensure that program operations are well-defined and well-formed, and b) resolve ambiguity. {A little too general.}

Smalltalk is not "typed" under this definition because a message may find zero or one method. PredicateDispatching is not typed because it may return zero, one, or many methods (or their equivalent).

The "at least a way to reduce polymorphic ambiguity" was added to the definition because under strong dynamic typing, types are still used to help decide how to dispatch. For example:

a = b + c

In some languages "+" may also mean string concatenation, depending on the "types" of the two operands. It is usually implemented as a "type flag" that comes with each variable (see above). (Generally I prefer that a language have a different operator for concatenation, such as "&" or ".". But, this topic is about definitions, not ideal language design.)

The biggest problem with this definition is that it depends on the definition of "polymorphism", which itself may be subject to interpretation. Also, sometimes polymorphism may not be involved. For example, the above may have an implementation or internal equivalent that resembles:

  a = b.plus(c)
Or

  a = plus(b, c)
If a language does not support such polymorphic dispatch (overloading), it may have to use IF or case statements, something like this:

  method plus(operand2) {
    // dispatch on the operands' hidden type flags
    if (this.operand1.typeFlag == 'number' &&
        operand2.typeFlag == 'number') {
      return mathAdd(this.operand1, operand2);
    } else {
      return stringConcat(this.operand1, operand2);
    }
  }
Or

  function plus(operand1, operand2) {
    // same dispatch on type flags, procedural style
    if (operand1.typeFlag == 'number' &&
        operand2.typeFlag == 'number') {
      return mathAdd(operand1, operand2);
    } else {
      return stringConcat(operand1, operand2);
    }
  }
This is not polymorphism in the strict sense. One could also emulate the same thing by creating an array in which one element is called a type-flag of sorts and the other is the value. Some languages just have built-in "arrays" for each variable, which contain the "hidden" type flag and the variable value to simplify the syntax and provide a consistent convention. Such languages essentially are performing the following shortcut:

  foo = 7;  // shortcut in dynamic strong typing

  // the long way
  foo['value'] = 7;
  foo['typeFlag'] = typeCodeOfConstant(7);
I see nothing that really distinguishes the two other than being built-in versus hand-rolled. Thus, the definition either needs work, or "types" is inherently a vague concept.

Related: SetsAndPolymorphism


I think the underlying reason for the differences in opinion is that I view code as simply a view on an internal (or semi-internal) data structure, and things like polymorphism and inheritance as simply code lookup (dispatching) algorithms or rules, usually based on tree-search algorithms. Strongly typed languages simply do tree searches to validate "types" and "subtypes". Some do it at compile time, and some during run-time, but it is all just a tree search behind the scenes. This view is not necessarily "wrong", but is perhaps a minority viewpoint. But it can model anything syntax can for most languages. Most of the formal or "accepted" type theory seems to view syntax as syntax, not as a data structure(s), and thus a data-ist finds the definition foreign. Whether the formal theory can be translated to a data-ist viewpoint, I don't know. Does TypeTheory suffer from LeftHandedSyndrome? Is it my fault for not converting my head to proverbial right-handedness?

As an example of its use, I did not "get" recursion until I started viewing code as a data structure. It clicked when I was finally able to mentally picture the variable stack being filled up and view how the code was separate from the stack. The code is a static structure during the recursion process and each "instance" of the subroutine call was an empty template of the variables copied onto the stack to be filled in. It works. It may be a UsefulLie, but a very, very useful one to me. If I tried to explain recursion to somebody else, drawing the stack and copying the instance template onto the stack for every new call is the only direct way I know how to convey the concept. I have read other explanations, but they seem very indirect and obscure, like a politician beating around the bush to avoid being specific. // Oddly, I would never have grokked recursion with that visualization. I always thought of it (in imperative languages) as just "do the same thing". E.g. to walk to the store, "take one step toward the store. Now walk the rest of the way to the store". Or for trees: "Put this together by putting the pieces together. Put the pieces together by putting their pieces together". We think recursively all the time, without the benefit of diagrams of boxes and stacks.
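
For what it's worth, that stack picture can be written down directly (an added sketch): the recursive version and a version with an explicit, visible stack of "instance templates" compute the same thing.

  def factorial_recursive(n):
      if n == 0:
          return 1
      return n * factorial_recursive(n - 1)

  def factorial_explicit_stack(n):
      stack = []            # the call stack, made visible
      while n > 0:          # push one "frame" per pending call
          stack.append(n)
          n -= 1
      result = 1            # the base case
      while stack:          # unwind, filling in each frame
          result *= stack.pop()
      return result

  assert factorial_recursive(5) == factorial_explicit_stack(5) == 120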

I find it tough to explain things in terms of syntax, but data structures are just a matter of look-ups and filling in the right cell at the right time. And, you can draw them with grids, boxes, arrows, and numbered steps.

In this view, all the fuss about types seems to boil down to guaranteeing that one and only one behavioral "cell" is found per dispatch. The one-and-only-one rule seems to be the common thread separating "typeness" from things not normally considered types; and the primary way of doing this is the "hidden flag" (above). The hidden flag is in both static and dynamic systems. Static or compiled systems just toss it at an earlier stage. Maybe Reynolds' approach is just another way to say essentially the same thing, but without reference to data structures. Even multiple inheritance (MI) with its complex lookup process still produces one and only one answer in the end. If it did not, I doubt it would be called "types". (MI's "flags" are sets or lists instead of a single value.) Tying Reynolds' definition to the data-structure viewpoint may make it more useful and informative because it provides another perspective. The best teaching methods use or try multiple perspectives.

-- top

The fuss about types is about specification of the domains of data - where it belongs, where it doesn't. It's about program correctness. Whether it boils down to early ("one behavioral cell" in top-speak) or late binding is not material to type discipline (one need only look at C++ to see that static typing != early binding).

Are you suggesting that the fewer problems that can be checked/prevented at compile time, the less "typish" it is? C++ sacrifices certain kinds of checks or error complaints for speed and flexibility. You keep seeming to suggest that types are ALL about compile-time prevention of problems. Does this mean that ANY problems detected at compile time are about "types"? If typing is just one kind of checking, what distinguishes it from other kinds of compile-time checking? Take the SQL example under WhatAreTypes. What if the compiler (or whatever it's called in the DB) detects a bad table alias? It prevented a problem before running. So does that alone make it part of "typing"? If not, what is the distinguishing factor? Your definition is not providing clear answers to these kinds of questions. -- top

I'm not even sure what the question is. Is it "define 'type'"? Try a dictionary. Programming languages don't agree on a lot of things, let alone something as abstract as the theory of types. Is it "demonstrate the utility of type disciplines X, Y, and/or Z"? Okay, plenty of opportunity for a PissingMatch there; most of those have their own FooVsBar? pages (e.g. StaticVsDynamicTyping) for that purpose.

CostinCozianu is suggesting that there is a single "proper" definition of types, and created this page to separate the proper one from the "unofficial" ones.


Here is my classification of most typing systems based on the "flag" definition (see PageAnchor typesAsFlags above). It is not meant to replace other classification systems, but rather to present only the "flag-centric viewpoint".

Of course, these leave out all kinds of language-specific details. Note that the "flag" may also be a list, especially in cases of multiple inheritance. Of course, this makes a pedigree check more resource-intensive. Non-trees also risk circular definitions, which the compiler/interpreter must somehow intercept, or else risk an infinite loop.

The "flag model" offers proper type substitutability using tree-search/traversal (or graph-search if MI) algorithms commonly described in the body of published data structure knowledge. For example, if type A is defined in terms of type B, and type B is defined in terms of type C, then the tree search will eventually find the definitions in C as it looks up through the type hierarchy. For MI there may be circular definitions. The algorithms for detecting cycles in graphs are also well-established. Signature-based type matching/validation such as Java interfaces or parameter overloading are similarly merely data structure traversals to find a matching signature (or report a "no-find"). Thus, the flag model does not lack published theory, but merely borrows from it (data structure theory). Or, perhaps one could say it piggy-backs on it. Yes, it is kind of a mechanical definition, but that also makes it less vague than some of its competitors.

Nor is the flag model tied to "syntax", unlike competitors. Syntax can be viewed as merely a human-friendly wrapper around a data structure(s). Most compilers and interpreters convert syntax into internal data structures of one kind or another anyhow in order to better process the information. It is possible to program by putting info directly into data structures (such as an AbstractSyntaxTree), bypassing syntax altogether. And the usual type checking can still be performed despite the lack of a syntax coat.

-- top

There's also ProofCarryingCode which includes the compiled code and its types. These are particularly useful for typesafe dynamic loading, runtime typesafe upgrading, useful stuff like that. -- ShaeErisson


PageAnchor tag_dispute_02

(Re: "side tag" or "side flag")

A type could be seen as a hidden or visible sticker that tells you what type something is (e.g., an apple can have a sticker that labels the type of apple it is). Note that this is a StrawMan definition of types since it isn't even a definition! Just like side flags, it doesn't define what types are; it only tells you that a sticker can label something. The sticker tells you the type of apple it is; the definition of type remains undefined. It's a tautology of a definition that goes around in circles to DodgeTheIssue, at best describing a way to label what type something is at a particular point in time (put a sticker on the apple, put a flag/tag on a value or variable).

It's not perfect, but neither are the other candidates. Those that say what types are, are not clear on what they are not. A definition that doesn't offer a way to falsify its application (what is not an X) is faulty.

{Top, the definition I provided, "a type is a set of values and 0 or more associated operations", clearly offers ways to falsify its application. You just don't like it because it isn't "type tags".}

"Operators" and "values" are not clear-cut. I've shown that "types" can be used as values and vice verse. And whether "types" need closely-associated operators, or even how to measure such association tightness was never firmly established. "I know it when I see it and that's good enough" is not good enough for times when rigor is needed. Do we really need to reinvent those debates here? See TopsTypeDeterminatorChallenge and DedicatedOperatorMeaningless.

{You've never shown that types can be used as values. Your example only showed that a string value associated with a type -- a type name -- can be compared to other strings. Furthermore, using types as values (which can be done) doesn't damage or alter the definition at all. It still holds true. There's nothing that precludes the "set of values" in "a type is a set of values and 0 or more associated operations" from being a set of type values. Furthermore, there is no need to measure closely-associated vs loosely-associated. Associated is associated!}

It's fairly trivial to conjure up a version that doesn't depend on strings. String was just one of many implementation or language-specific approaches among many possibilities. And, are you sure you want to allow ANY association? Still, no clear-cut rules have been established for assigning with 100% reliability the "parts of speech" of languages (values, types, operators, etc.). We all have notions and pattern recognition in our heads that have a rough commonality, but that's not good enough for tight debates, and it leaves a lot of fuzzy areas where two different people will label them differently. One can define a language with "stem objects" that can act like common types, values, operators, etc., and switch between them willy-nilly. Pattern recognition may be "good enough" to label C/Algol-like languages, but that may not extrapolate to other languages. Thus, pattern recognition (it's an X in Z because it "looks like" the X in Y) is not sufficient.

Simple question: do different types of fruit exist? If so, you know what types are and should be able to grasp types as a LayMan. Types in computer programming are a bit more specific because the digital world is different from the physical world, but anyone should be able to grasp types by simply looking at different types of fruit.

We could create pages like TableDeterminatorChallenge? to debate whether or not tables really exist and how to prove whether ThereAreNoTables. Why waste time on such silly debates?

I wouldn't. EXCEPT that some here INSIST their def is the OneTrue? definition, are rude to others for not agreeing with them, and spray their certainty all over the wiki like self-elected Truth Nazis. Contrast these two:


Here is something I am writing about types. fwiw. http://sites.google.com/site/argnosis/types

-- PeterLynch


Early definitions:


See also: LaynesLaw, AreTypesTiedToTrees


CategoryLanguageTyping

