Types Are An Abstraction Of Reality

Lots of people like to ask the question WhatAreTypes. Many expect a "concrete" answer, and are disappointed that a single, universally-accepted answer does not exist. Some argue this is a failure of TypeTheory.

Which is nonsense. On the contrary, the lack of a single concrete answer points to one of TypeTheory's most profound results.

Types are abstractions of reality--abstractions which are mathematically sound, and subject to formal reasoning--a result which has enabled the construction of software that excludes a whole class of nasty bugs.

What is meant by "abstraction"? Simple.

Consider the numbers. Nay, consider the "natural numbers"--the set of positive integers and zero. Something that virtually all educated people are familiar with, and often take for granted. One can enumerate the first ten (plus zero) on one's fingers; counting in binary, one can enumerate 2^10 = 1024 distinct values on the same set of digits.

Are the natural numbers a "given"--something that exists in nature without the leave of mathematicians? One could argue that this is true--people were counting long before higher mathematics was developed, and it's only in the past century or so that mathematicians have pondered this question.

But one could also argue that it's false. There are infinitely many ways to construct the natural numbers from arguably more fundamental concepts, such as sets. One such mapping, using only sets, looks like this:

 0  :  {}                          // the empty set
 1  :  { {} }                      // the set containing 0
 2  :  { {}, {{}} }                // the set containing 0 and 1
 3  :  { {}, {{}}, { {}, {{}} } }  // the set containing 0, 1, and 2

This is a construction satisfying the PeanoAxioms that was devised by JohnVonNeumann. Other mappings exist: the ChurchNumerals are built from the LambdaCalculus, and the PeanoAxioms themselves serve as a specification that any such construction must satisfy.
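Both constructions are easy to sketch in code. Here is a minimal illustration in Haskell (my choice of language--the page names none; the names below are invented): a Peano-style natural-number type and a Church numeral, two different representations of the same abstract concept.

 -- Peano-style naturals: zero, and the successor of any natural.
 data Nat = Zero | Succ Nat

 -- Addition, defined by recursion on the first argument.
 add :: Nat -> Nat -> Nat
 add Zero     n = n
 add (Succ m) n = Succ (add m n)

 -- Church numerals: the number n is the function that applies f n times.
 type Church a = (a -> a) -> a -> a

 churchThree :: Church a
 churchThree f x = f (f (f x))

 -- Convert a Church numeral to an ordinary Int by counting applications.
 toInt :: Church Int -> Int
 toInt n = n (+ 1) 0    -- toInt churchThree evaluates to 3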

Many practicing programmers may reply--"so what?" After all, for those of us trying to get a job done, this is mathematical minutiae that's of minimal practical interest. We don't worry about the construction of the natural numbers too much in programming tasks (unless we have to deal with very large ones; then these details become more important--see below). We just know that if we ask the computer to tell us what 4 + 3 is, we'll be told 7.

And that's the point. The concept of a "natural number" is an abstraction. While there is arguably a platonic concept of the natural numbers, there isn't any platonic way of expressing them, whether in spoken language, on paper, or in the memory of a computer system. The physical representations we choose--whether it be a string of the glyphs 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 emblazoned on a printed page, a spoken word uttered in any of the many thousands of NaturalLanguages known to man, a sequence of bits in a computer memory (fixed or otherwise), or a construction in PeanoArithmetic--are mere representations of the concept. And unfortunately, that's what we're stuck with.

However, the beauty of the abstractions--and the beauty of TypeTheory--is that in most cases it just doesn't matter. Programmers usually don't need to know whether an integer is stored in signed-magnitude or two's-complement form. Programmers don't usually need to know whether their objects are stored as a simple linear sequence of data indexed by a VeeTable, or as a DictionaryDataStructure. Programmers don't care how the tables in a relational database are laid out on the database server's disks. (But there is a catch.)
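As a rough illustration of this representation-independence (a generic sketch in Haskell; the Stack interface and all names below are invented for illustration, not taken from this page), client code written against an interface never sees which representation it runs on:

 -- A stack interface, separate from any particular representation.
 class Stack s where
   empty :: s a
   push  :: a -> s a -> s a
   pop   :: s a -> Maybe (a, s a)

 -- One possible representation: a plain list.
 newtype ListStack a = ListStack [a]

 instance Stack ListStack where
   empty                    = ListStack []
   push x (ListStack xs)    = ListStack (x : xs)
   pop (ListStack [])       = Nothing
   pop (ListStack (x : xs)) = Just (x, ListStack xs)

 -- Client code sees only the interface; a second instance backed by
 -- some DictionaryDataStructure could be swapped in without touching it.
 sumTwo :: (Stack s, Num a) => s a -> Maybe a
 sumTwo s = do
   (x, s') <- pop s
   (y, _)  <- pop s'
   return (x + y)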

I say "unfortunately" because may different computer systems have chosen different abstractions; and migrating data from one such system to another requires a mapping between the abstractions. In the case of the naturals; this is usually straightforward once you know the details--it's easy to devise a one-to-one mapping. However, when you get away from simple concepts like numbers or booleans and deal with complex things like forms and employees and business processes; this gets much trickier.

Above, it was mentioned that in many cases, the user need not care about the details of the abstraction--but in some cases they do need to worry about this. The most common example is found in the integral types of our most popular industrial programming languages--C/C++, Java, etc. As any novice programmer knows, an "int" in these languages is not an accurate abstraction of the integers, being limited (due to a fixed-size representation) to a finite subset of the integers, typically those in the range [-2^31, 2^31). If you need larger numbers, the fact that this is a LeakyAbstraction becomes apparent. Other abstractions for the integers exist, of course, but these can impose performance penalties (and thus be inappropriate for some applications).

Fortunately, programmers who do care about these concerns can a) choose abstractions that suit their needs; and/or b) inspect the details of the abstraction when necessary.
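To make the leak--and remedy (a)--concrete, here is a small sketch (again in Haskell, my choice; the names are mine): a fixed-width type wraps around silently, while an arbitrary-precision type models the mathematical integers faithfully, at some runtime cost.

 import Data.Int (Int32)

 -- Fixed-width arithmetic silently wraps around: the abstraction leaks.
 leaky :: Int32
 leaky = maxBound + 1    -- wraps to -2147483648, not 2^31

 -- An arbitrary-precision integer type is a tighter abstraction of the
 -- mathematical integers, bought at the price of performance.
 exact :: Integer
 exact = 2 ^ 31          -- really is 2147483648

 main :: IO ()
 main = do
   print leaky           -- prints -2147483648
   print exact           -- prints 2147483648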


What in software engineering is NOT an "abstraction of reality"? The title is not useful information even if true. There are probably a lot of ways to represent/define integers, etc. in an abstract way.

And the fact that there are "lots of ways to represent integers" does not mean there are "lots of ways to define integers". Integers are already defined in mathematics. You can, however, decide to define a different concept (such as "all integers in the range [-2^31, 2^31)") and call it "int".

... And such a decision is also not an abstraction of reality.

I consider most of those to be specific implementations or implementation details, and not really "software engineering" concepts. Thus, the issue still stands. I was thinking more in terms of sets, database modeling, paradigms, etc. I am also not sure that integers are a good example to use, because in many domains the "rules" are domain-specific, poorly documented, and changing. Companies can make their own reality to a large extent.

An issue can stand as long as someone is propping it up. Fortunately for the rest of the world, statements like "I consider (insert something here) therefore (insert something else here)" are ultimately ignorable unless held by someone in power.

Anyhow, if you want to discuss trivially real things in Software Engineering, every single process of engineering a piece of software is real, not an abstraction. Same with every single product - passed unit tests, design documents, partial source code, prototypes, interviews with users, etc. OTOH, abstractions are often represented within such documents (especially the design and specification documents).

I'd consider the title to be somewhat off in any case. TypesAreAnAbstractionOfReality is true, but more interesting is the fact that individual types are each abstractions, and they needn't even strictly be abstractions of reality. They can be abstractions of abstractions of reality. For example, numbers are already abstractions of reality, and types can be abstractions of numbers.
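To illustrate (a hypothetical sketch in Haskell; the type and names are invented), a type can abstract the already-abstract integers down to just the operations meaningful in some domain:

 -- Numbers abstract reality; the 'Age' type in turn abstracts the numbers,
 -- exposing only operations that make sense for ages.
 newtype Age = Age Integer deriving Show

 mkAge :: Integer -> Maybe Age
 mkAge n
   | n >= 0    = Just (Age n)    -- an age cannot be negative
   | otherwise = Nothing

 birthday :: Age -> Age
 birthday (Age n) = Age (n + 1)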


See: WhatAreTypes, ThereAreNoTypes

