After the controversy, diatribes, insults and general guff on the TypesAreSideFlags page, I'm reluctant to enter the mess. However, by way of trying to find something positive from the unremittingly negative impression given by that page, I'm going to give a purely personal view on types.
As an experiment, may I suggest that people append their comments, questions and thoughts, rather than interspersing them through the initial document. Then I'll address the issues raised by editing the original, enhancing it where necessary, and the responder can remove or replace their contribution based on my refinements.
I'm sure it won't last - people here are rampant individualists and won't stick to any structure - but we can try.
--AnonymousDonor
Types are difficult.
TypeTheory has made enormous strides, allowing fantastic facilities such as those provided by HaskellLanguage with its polymorphism, first-class functions, type inference and complete type safety (provided you treat it properly). But types are controversial. People don't know how classes and types interact, and there's still the question of whether a square is a rectangle, a rectangle is a square, each is the other, or never the twain shall meet.
Caught in all this is the poor student, who has to learn to use types as embodied in C, C++, Java, Python, Ruby, PHP, and other systems, with a total hotch-potch of "types", classes, coercion, inheritance and lack of true type-safety.
So what is a "type"?
One Way of Thinking about Types
In complexity theory a "problem" is a collection of "questions". Each "question" is said to be an "instance" of the problem. A particular graph is a single instance of the TravelingSalesmanProblem, a particular integer is an instance of the Factorisation Problem, and a particular program is an instance of the HaltingProblem.
Consider, then, that a "type" is a collection of "values". Each "value" is said to be an "instance" of the type. The number five is an instance of the Integer type, The number 3.14 is an instance of the FiniteDecimal type, and 3/4 is an instance of the FractionalNumber type.
It's important to realise early, and throughout, that a "value" is an abstract thing, separate from (although obviously related to) its representation in the program text, and equally separate from its representation in computer memory during run-time. Keeping these different concepts separate is crucial.
We can think of each type as being a set of values, but each value is permitted to belong to several types. We can create equivalences, and define conversions - note, these are different - but each "type" is a set of values. Further, the values concerned are not limited to numbers, strings, or characters; they could be lists, maps, dictionaries, functions, or even entire programs.
We can now define functions that take pairs (or more) of values, and specify what the result is. In the case of integers we can easily define "add", "subtract" and "multiply", which always result in an element of the Integer type (ignoring overflow and underflow), and "divide", which gives an element of the FractionalNumber type. We can define functions to carry a value of one type to a value of another type, sometimes exactly (as in Integer -> FiniteDecimal) but more often approximately.
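By way of illustration, here is a minimal sketch in Python of the "type as a set of values" view. The membership tests and the type names (Integer, FractionalNumber) are hypothetical labels chosen for this sketch, not anything built into the language; the conversions show the exact/approximate distinction mentioned above.

 from fractions import Fraction

 def is_integer_value(v):
     # Membership test: is v in the set of Integer values?
     return isinstance(v, int) and not isinstance(v, bool)

 def is_fractional_value(v):
     # Membership test: is v in the set of FractionalNumber values?
     return isinstance(v, (int, Fraction)) and not isinstance(v, bool)

 five = 5
 # The value five belongs to both sets: every Integer is also a FractionalNumber.
 assert is_integer_value(five) and is_fractional_value(five)

 # An exact conversion: Integer -> FractionalNumber loses nothing.
 assert Fraction(five) == five

 # An approximate conversion: FractionalNumber -> Float may lose information.
 third = Fraction(1, 3)
 assert float(third) != third

 # "divide" carries two Integers to a FractionalNumber.
 assert Fraction(3) / Fraction(4) == Fraction(3, 4)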
Implementation
When we implement a language, we have to represent values as collections of bytes, and whenever we act on a collection of bytes, we need to know what it represents, and therefore what to do with it. Note also that a particular string of symbols in the program could represent a value in any one of several types. The symbol "1" could be any of Integer, FiniteDecimal, FractionalNumber or Float. The way that value gets converted to a collection of bytes in memory will probably depend on which type it belongs to. The permitted operations, and the results thereof, will also depend on that.
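As a hedged illustration of that last point, again in Python and using the standard struct module: the symbol "1" denotes a single abstract value, but the bytes chosen to represent it depend on which type it is taken to belong to.

 import struct

 as_int64 = struct.pack('<q', 1)      # 1 represented as a signed 64-bit integer
 as_float64 = struct.pack('<d', 1.0)  # 1 represented as an IEEE-754 double

 print(as_int64.hex())     # 0100000000000000
 print(as_float64.hex())   # 000000000000f03f

 # Same abstract value, different byte strings - so code acting on the bytes
 # must somehow know which representation it is looking at.
 assert as_int64 != as_float64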
In statically typed compiled languages, the collection of bytes representing a value needs no extra associated explicitly stored information regarding the type(s) of the value represented. The binary code being executed knows what to do, because that part of the code already "knows" what "type" is the one to use; the associated type(s) is/are implicitly encoded by virtue of the code being statically type-correct. The type containing the value represented by the string of symbols in the program has been definitively deduced, and that type information has been completely applied to the generation of the data-manipulating code. If that were not the case, then the generated code would not be statically TypeSafe.
- To comment on a side point away from the focus of this page, but motivated by the use of the word "collection" above: the entire issue of types in regard to collections, as opposed to the more fundamental issue of types in regards to scalars, is a significant complication. See for instance CollectionsArentOo (which is not a very coherent page, but does raise the related issues directly). Critics of the entire notion of "type" should perhaps skip that side issue.
In dynamically typed languages, it is always necessary (in at least some, but not all, subprograms which occur both in theory and in practice) to have extra explicitly-stored type information, beyond the simple bare representation of the value. Such a "tag" ensures that when the collection of bytes is accessed, the program uses it in an appropriate manner. Without the tag, the program has no way of knowing whether a particular collection of bits is, for example, a raw 64-bit Integer or a Float.
In Python (dynamically but strongly typed), for example, the collection of bytes holding a value additionally has a tag to say what type the collection of bytes (or, more correctly, the value represented by the collection of bytes) belongs to.
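A small, non-authoritative illustration of that tag at work in Python: type() reports the tag carried by each object, and the interpreter consults the tags before deciding what "+" should do.

 x = 5
 y = 5.0

 print(type(x))   # <class 'int'>
 print(type(y))   # <class 'float'>

 # The same source-level operator "+" does different things depending on the
 # tags of its operands: exact integer addition versus float addition.
 print(x + x)     # 10   (an int)
 print(x + y)     # 10.0 (a float; the int operand is converted first)

 # Without the tags, the interpreter would have no way to choose between
 # these behaviours when handed a bare collection of bytes.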
- Some languages (for example OcamlLanguage) use different symbols for the same operation when performed on different types. "+." for example means "addition of floats", as opposed to "+", which means "addition of integers". Many programmers find this difficult to get used to, having become accustomed to the ad hoc overloading of simple symbols.
- This is a classic example of SyntacticSalt, and why "SyntacticSugar" can be a nontrivial issue in the usability of languages. It arises as a serious issue (rather than a minor issue of taste) most often when programming with the endless variety of minor variations on numeric types. Since many programmers never work much with any of those miscellaneous numeric types, they may regard this area of SyntacticSalt as trivial. For them, specifically, it might be, but for those who must work extensively with variant numeric types it is a very significant matter.
Learning about Types
All this is totally indigestible for the poor student who has no experience and only misleading intuition. The student needs to be given something more concrete to help get them started. Talking about one means of implementation - for example using "side-flags" - can ease them into the idea that a value might belong to any one of several types, and that the representation of values in bytes, and the actions performed on them, vary according to that type. Examples, and many, many of them, need to be given.
The problem is that focussing on an example implementation technique can result in a very blinkered understanding, one inappropriate for, and perhaps even incapable of, extension and enhancement. The ability to step beyond the concrete implementation and into the abstract is not given to everyone, but perhaps some students are being unnecessarily hamstrung, damaged by overly concrete presentation.
On the other hand, if a student can't, when the time comes, transcend their concrete examples and move to the abstract, then perhaps they actually would never understand anyway.
You might think you understand all this, and you might be right. But what happens when you allow functions as first-class objects, and take non-termination into account? Suddenly there is a whole slew of problems about what types things are and can be, and it's not enough to assume that nothing will go wrong. Experience shows that things do go wrong, and spectacularly so.
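To make the first-class-functions point slightly more concrete, here is a small Python sketch (the names are mine, and the typing hints are only checked by an external tool such as mypy, not at run time): once functions are values, they need types of their own, and those types must line up when functions are combined. It does not touch the non-termination problem, which is harder still.

 from typing import Callable

 def compose(f: Callable[[int], str], g: Callable[[str], float]) -> Callable[[int], float]:
     # Well-typed only because the result type of f matches the argument type of g.
     return lambda x: g(f(x))

 to_text: Callable[[int], str] = str
 to_number: Callable[[str], float] = float

 pipeline = compose(to_text, to_number)
 assert pipeline(42) == 42.0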
Simplifying the Problem
Types are hard, and simple ad hoc schemes have all so far proven to be either overly-limited, inadequate, inconsistent, incomplete, or some combination of them all. To present a simple scheme as an introduction, with plenty of examples, may be good pedagogy, but it should carry a health warning, making it clear that types are much more complex than they appear.
A simple understanding is fine for most people, and for most day-to-day programming, but don't underestimate the problems. If you haven't programmed in OcamlLanguage, or written a compiler that does type deduction, or implemented a language with functions as first-class objects, then the literature suggests that there are things you haven't thought of.
Conclusion
Types are subtle, and quick to snare the insufficiently wary.
Brilliantly worded, author!
I have seldom seen such a well-done summary of such a hopeless ThreadMess. And a summary it is, though summary is much too weak a word really. And the title is - well - particular. It is a good title, no question. Just not what I would have expected. But then again it's somewhat misleading, as is the introduction. Who can expect to add a personal view of types to this? But maybe this will keep the quality of this page unexpectedly high.
- I, the author, was not involved in the original discussion. I just watched it with increasing despair and thought that something had to be done.
In response to the opening statement:
someone added:
- No, they are not. They may be difficult to implement, but the idea of a type is extremely easy to understand.
- In a rough notion-y kind of way, perhaps.
There is sufficient evidence in the form of confused arguments to show that the "idea of a type" is understood differently by many people. Coming to a common understanding is difficult, so the field/area/topic of types is difficult. Hence the opening.
After the heading "Learning about Types" someone also added:
- The first place to learn about types is to check the standard dictionary definition. It is sad, but true, that this is the first place one should look. The dictionary gives a very clear description of phrases such as Type, Vary, Number, Apple, Orange, etc. For one to not consider the standard dictionary definition of type as a guideline for what a type is, is ludicrous. For one to consider the dictionary irrelevant in our world, is also ludicrous.
It is looking in a standard dictionary for clear and precise technical terminology that is ludicrous. It's exactly the woolly-minded, imprecise, intuitive "definitions" that cause trouble when thinking about technically complex areas. Beware the person who advises you thus.
Dictionaries give a general notion and examples, but they don't answer tough questions such as how types differ from set theory, "category", "validation", etc. (More questions are at QuestionsForTypeDefinitions) -t
In the above it is stated that:
- ... a "type" is a collection of "values". Each "value" is said to be an "instance" of the type. "Five" is an instance of the Integer type, "3.14" is an instance of the FiniteDecimal type, and 3/4 is an instance of the FractionalNumber type.
Someone replied:
- An instance of a type is a variable or constant which must match a value that the type allows. I wouldn't say that a value is an "instance" of a type, just as the color "red" is not an "instance" of a color (the variable may be set to a value of red, and the variable is the instance). The color red is a value that a type allows, which a variable can be assigned to. The variable instance may be carColor. If carColor is red, the carColor variable is valued at red. The instance is carColor, not red.
This is simply wrong. An instance of a [value-]type is not a variable, or a constant. It is a value. The concept of a "variable" is not referenced in the above discussion. In a program we create variables which can then hold values. A variable may hold the value "red", or a constant may be set to (or represent) the value "red". That does not mean that the variable is of type color.
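A minimal sketch of the distinction being drawn here, using a hypothetical Color type in Python (the names Color and car_color are invented for illustration only): the value is the instance of the type; the variable merely holds it.

 from enum import Enum

 class Color(Enum):
     RED = "red"
     GREEN = "green"
     BLUE = "blue"

 car_color = Color.RED   # the variable car_color currently holds the value Color.RED

 # Color.RED, a value, is the thing that belongs to the type Color;
 # car_color is just a name that refers to one of Color's values.
 assert isinstance(Color.RED, Color)
 assert car_color is Color.RED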
- Quote: "Along with programmer-defined data types came the clear distinction between type definition and variable declaration, variables being instances of a type." -- http://pascal.retro8bits.com/wirth.html
- Type theory has moved on since the 1970s
- Quote: "make a type declaration, and declare the formal parameter to be an instance of that type" -- Kernighan
- More precisely, declare the formal parameter to contain/hold/represent a value which is an instance of that type.
- Quote: "To create an instance of an adt"..."that is, the adt is created from a tuple that specifies all of its members in order. " -- Kernighan http://www.vitanuova.com/inferno/papers/descent.html
- The above quotes are from highly notable and respectable computer scientists who use terminology such as "instance of a type" in several of their papers. They see the type as a "model which we define" and then we "create an instance of that", which means that "instance of a type" can be an "instance of our blueprint model". If we have a "color sheet" model that allows us to select 5 colors (an enumeration), this "color sheet" model is a slip of paper on which we can change the "current color". The color sheet is not just "a value of red" but rather a sheet with 5 color choices. This color sheet with the arrow on it is an instance itself - the color sheet is not "just a value". It is a live model which can be set to different values, hence it is an instance of a type or a model. Similarly, an array variable which holds 10 fixed values is an instance of a model defined in a type declaration such as array [1..10]. By seeing types as "blueprint models" and "using instances of these blueprint models" one can more easily understand why variables are "instances of these types". For instance, here is MyAirplane? that I built based on the TWrightBrothersModel type.
- The term instance is also overloaded to imply "member of", so I can see how one can visualize red as an instance of a color if we are talking about an instance being a "member of color" - but why not just use a clear term such as "member" instead of "instance", since instance can imply more? "Member" may be the clearer term, since a lot of programmers already use "instance" when "instantiating".
- Also, you are using confusing terminology by stating "[value]-type". This dashed phrase somehow tries to merge value and type, and they don't need to be merged; they need to be kept separate. For example, an integer type is not a "value type", it is a type of integer. The integer variable can be set to a value, but the type is a type, not a "value type". I.e. if all types define what values one can hold, calling something a "value type" is either redundant, confusing, or not necessary.
It means the variable is of type "variable that holds values of type color".
- A variable is "of type" which means it is an instance of the type (model, definition) we declared. When "something is of something" it is a case of it. When something is a "case of something" this means it is an instance of it. Instance can also be overloaded to mean "instantiated" in the life of the program (which is why I suggest "red is a member of a color type" rather than "instance of".
- If one defines an airplane model type which has wing flaps that move, several seat positions and colors (an enumeration), he can create one or more instances of this airplane type (model) and fly it. This airplane variable (the instance of the defined model or type) can change its seat position and can even change its color if it is painted during a pit stop. Airplane.Color set to white means the airplane model (an instance of the type TAirplane) has its value changed.
The author of this comment appears to have ignored completely the paragraph that came next, which said:
- It's important to realise early, and throughout, that a "value" is an abstract thing, separate from (although obviously related to) its representation in the program text, and equally separate from its representation in computer memory during run-time. Keeping these different concepts separate is crucial.
Someone commented on the comment, saying:
- This makes no sense to me. The color identified by the word 'red' is, indeed, an instance of a color. The variable containing the value 'red', however, is not an instance of a color; it is an instance of 'variable'. If the variable is annotated to indicate that it should carry only values of type 'color', you might call it a type-constrained variable. 'color' itself is an instance of 'type-descriptor' (a nominative one - not really a full 'type' without both a definition and semantics interpreter).
This seems exactly right, including the idea that a given variable is an instance of the type "Variables". This, however, creates a potential for deep confusion of levels.
- Clearly, the ones who are adding deep confusion levels have already admitted it. Back to basics: take an array variable, which is not an instance of the type "variables", nor is an array variable following the type specification of "variables". An array variable is following the definition of the "array type" declared, and is created as an "array"; therefore, when in use, an "array type" is instantiated as an "array". In the program, one treats the variable "as the array", which is "of the array type", meaning that the array in use is "an array of the type array", which is "an instance of an array". Since "array is a type", and since the variable is "an array", the array is "an instance of the array type we defined".
- By the way, saying confusing statements such as "Variable" being redundantly "an instance of the Variable type" is similar to saying "the number 1" must be "of the number 1 type". I.e. this causes some recursive, redundant, meaningless laughter from my end. "1" is of type "1" and "2" is of type "2", since "var is of type var".
The original commenter then wrote:
- Perhaps we need to bring in the term literal. Red is a value that an instance of a type can be. If there is a literal in the program called red, then a variable (or constant) instance can be set to this literal called red. A variable is not an instance of a variable, by the way. That makes no sense. A variable is an instance of a type. If a variable was an instance of a variable, you'd have just variables of variables of variables.
This continues to confuse the representation in the program with the abstract concept of types and values. Talking about a literal in the program, and referring to what it is called, is exactly the confusion of levels that makes types a difficult subject.
What makes types a difficult subject is when people here have not done as much research as they could have, from respected scientists such as Kernighan, Wirth, and others (i.e. backing up claims with quotes and references). I'd refer people to some more web pages for further understanding of instance, literals, values, and types, but I've spent enough time here and provided many quotes and references above. Please read the quotes from the respected computing scientists, and consider using the term "member of a type" instead of "instance".
A type is a classification system, as clearly defined in the dictionary. See also TypesAreNotTypesTheyAreTypes. What is math? Who can prove math is really math? What is a type? What is an apple? The first place to look is the dictionary. It is a group, classification, domain.
- So, you're not accustomed to the idea that a term used in a technical context is not the same as when used in a standard dictionary. If you want to know what a category is in CategoryTheory, don't look in the dictionary, if you want to know what a graph is in GraphTheory, don't look in the dictionary, if you want to know what a string is in StringTheory, don't look in the dictionary, if you want to know what a proof is in ProofTheory, don't look in the dictionary, and if you want to know what a type is in TypeTheory, don't look in the dictionary. Yes, the standard dictionary can tell you the real world concept that inspired the use of that word, but if you rely on the dictionary to get your technical meanings, you won't successfully communicate with people who work in the subject. "Type" as you find it in a "normal" dictionary is basically irrelevant.
The dictionary is irrelevant? A classification system is exactly what a type system is, which apparently 99 percent of the people don't understand. Instead people ramble on and on about how Type Systems are actually Pylons or Flags or Pieces of Mint-flavored Gum that people carry in their pockets. Type systems exist to classify (domains) and constrain to a certain group (again, domains). This is exactly what the experts say too. It is very sad that today computer scientists (as we call them) will not take a look at their dictionary in cases where they are unsure of the definition of a simple phrase such as "type". It is very, very sad.
- And a "computer program" is a string of symbols from an alphabet that is used to give instructions to a computer.
- Yes, it is. If one is considering studying how to program, he must look further. If one is looking to understand what a computer program is, or what a car is, or what an apple is, he had better first understand that an apple is not "a flag hanging near a leaf" but "a round-shaped fruit with a stem, core, seeds, usually red or green in color". Similarly, if one wishes to know what a type is, he had better first understand that a type is not a pylon, flag, or a numbering system. One could implement a type system using flags possibly (even that is ludicrous), but one should not claim that types are apples, or types are flags, or types are pylons. It is also worth noting that type theory does not describe a type - it describes a theory, a study, even informal or formal ramblings about ways to implement types. If this page is about What Is Type Theory then please rename it to TypeTheoryIsTypeTheory?, not TypesAreTypes.
{The dictionary should provide a layman's comprehension of types and computer programs. The expert's definition of the word ought to at least bear some resemblance to, and not utterly contradict 'in spirit', the dictionary definitions, lest the expert create artificial confusion in choosing that particular word for use in his or her domain. But expert definitions need to be formal; use of informal words won't help one expert communicate with another. It just so happens that 'Type' isn't defined even for experts. 'TypeSystem' is defined, and 'TypeTheory' (the formal study of TypeSystems) is defined, but 'Type' is defined on a per-TypeSystem basis. Experts can, however, agree on the essential properties of 'Type' - those things that are true for every possible consistent and applicable TypeSystem - and they can do so for types in TypeSystems applied as broadly as observations over the 'real world' (or even imaginary worlds, like mathematical models), and add additional properties when they know additional constraints (such as types as applied strictly over and within programming languages). Necessarily, these will be high-level and broadly applicable properties, lest some expert could come along and create a TypeSystem just to contradict them. These essential properties will have a basis strictly in information theory (i.e. a TypeSystem is not 'applicable' if one cannot acquire the information necessary to apply it) and logic (to ensure consistency).}
{The author here focuses on types as applied over values, which is a bit distinct from types over mutable state, types over behavior, and sub-structural types over program components as a whole (e.g. ensuring that there is only one copy of any value in use in the program at a given time). The domain of TypeTheory is vast and quite deep and it is easy for someone to get lost in it, especially if they have no desire to perform or study the math necessary to keep up. As a consequence, it should be noted that most computer scientists - even those with master's or doctorate degrees - do NOT qualify as 'experts' in TypeTheory. (Or CategoryTheory, or Mathematics in general...). TypeTheory is deep enough to be a Master's degree and provide plenty of Ph.D. material all on its lonesome.}
{Fortunately, programmers and those in computer science don't need to become experts, as that would add at least two years to their degree. Programmers can, for the most part, remain laymen when it comes to use of 'types' ... and the dictionary's definition can, for the most part, properly satisfy their need for a definition. However, programmers should at least know enough to be humbled by the field, just as the majority of computer scientists are invariably humbled by the advanced components of post-calculus math and non-monotonic logics.}
{This humility is desirable because a programmer who seems like an expert to younger programmers and other laymen may get it into his or her head to design a language or a new TypeTheory (e.g. TypesAreSideFlags) without really understanding why it would fail. Such a programmer can lead others astray and simply add to the general confusion surrounding 'types'. Lacking skill is not generally a problem, except for those who are UnskilledAndUnawareOfIt and choose to preach their views, or who are a SnakeOilSalesmanAttemptingToMakeAQuickBuckSellingBooksForIdiotsByIdiots. Most of the big Type debates on this WikiWiki are a consequence of one or both of the latter (it's really difficult to distinguish the benign fools from the malign charlatans).}
{The only programmers that benefit a great deal from advanced study of TypeTheory are those involved in (a) language design, (b) compiler implementation, and (c) correctness proofs over programs. The rest only need to learn enough about types to use their programming languages of choice and perhaps take advantage of certain language features that derive from the chosen TypeSystem (e.g. TypefulProgramming).}
Demand a Simple Explanation of Type
Perhaps "type" should be separated from "type theory", "type systems", and "type implementation". A lot of people here are not looking for a "type system definition" or a "type theory page". People are asking "what is a type?" and saying "types are this?". Precisely if one wishes to grasp what a "type" is, he should grasp the "concept" of a type (domain, grouping, model) rather than studying full fledged "type theory", "type systems", and "how to implement types".
Consider an integer or byte type. When one declares a byte or integer, he can think of it as a "limited number type". How hard is this to grasp? It is not hard to grasp, despite what some claim. An integer type is a way to constrain and group a number. A byte is a very limited model that we can work with, since it can only hold values from 0..255. A byte is not just some pylon or side flag; it is a model to follow. A type is a simple concept for even a layman to understand, and it should be.
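Here is one possible Python sketch of that "limited number type" reading of byte (the Byte class is purely illustrative, not a library type): the type names a set of values, 0..255, and everything outside that set is rejected.

 class Byte:
     def __init__(self, value):
         if not (isinstance(value, int) and 0 <= value <= 255):
             raise ValueError("a Byte value must be an integer in 0..255")
         self.value = value

     def __add__(self, other):
         # Wrap around, as fixed-width machine bytes do on overflow.
         return Byte((self.value + other.value) % 256)

 b = Byte(200) + Byte(100)
 assert b.value == 44       # 300 wraps around modulo 256

 try:
     Byte(300)              # 300 is not in the set of Byte values
 except ValueError as error:
     print(error)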
To make types easier to understand, it is also important not to introduce confusing terms such as "instance" when discussing "values", since "instance" is also a confusing term commonly used in OOP. Rather, "member" should be used when discussing "possible values of a type", since the word "member" is less overloaded in computing science.
Re: "simple ad hoc schemes have all so far proven to be either overly-limited, inadequate, inconsistent, incomplete, or some combination of them all."
Can you provide an example?
- Sure. TypesAreSideFlags is one example. A system that said: "there are exactly two types: integers and strings" would be another example. These are both overly-limited, inadequate, inconsistent, and incomplete, though they may be of some use within some extremely narrow domains.
Re: "making it clear that types are much more complex than they appears."
"Are complex" and "can be made complex" may be two different things. It is probably possible to take ANY concept and create a complex and/or convoluted version or model of it. But that act does not necessarily make the concept itself complex. For example, proponents of Smalltalk are often appalled at what languages like C++ and Java have done to OOP.
The basic idea of 'types' is very simple, just like the concept of 'object'. But talking 'type theory' is more like discussing the entire field of ObjectOriented PatternLanguage, not merely the simple concept of 'object'. In the same sense as used above, 'objects' are also hard (or "complex") once you have to start putting them together to create bigger objects or solve real problems, or start creating first-class objects in a distributed system, or deal with the issues surrounding the creation and destruction and naming of objects. As with a simplistic notion of objects (e.g. in the same vein as "TypesAreSideFlags", consider "ObjectsAreIdentifiers"), any simplistic or "ad hoc" notion of types will not make predictions or provide any help at all when it comes to dealing with complex or combinatorial problems involving types, such as those involved with type inference and first-class functions.
Ultimately, the statement above refers more to the domain of 'types' - including its theory, the teaching of 'types' to students, what it would mean to define the use of types in a type-sensitive programming language, etc. All these same problems apply to 'object': the theory of objects and pattern languages, the teaching of 'objects' and their relevance to students, what it would mean to define object in an object-oriented programming language. The gestalt context of the statement ("making it clear [...]") in the above discussion should have made it clear that the word 'types' was referring to more than merely the 'concept' of type.
TypesAreNotTypesTheyAreTypes
MayZeroEight
CategoryTypingDebate