Compile Time Typing Problem

The compile time typing problem is the problem of assigning types to terms at compile-time (or "read time" - any time after the program is read but before its "main" function is executed). A term is essentially any expression within a program, including variables, parameters, data members of aggregates, and intermediate results from calculations or function calls. Each term in a language must be(*) assigned a specific type.

One immediate question is "what is a type"? There are many different useful definitions (including ThereAreNoTypes); here we treat types as sets; a type is a set of values and an object is an instance of the type if it is a member of the set containing all such instances. This definition assumes immutable objects; though it does extend to objects with mutable state as well.

Note that this definition implies that an object can be an instance of a (usually) infinite number of types. Most programming languages limit the number of distinct types allowed in a program; languages with nominal typing only recognize types that are explicitly declared by the programmer. Languages with structural typing face more difficulty; especially if types can be mutated at runtime. (See NominativeAndStructuralTyping). We will assume for the purposes of this discussion that the set of types in a program is a finite set. A type is maximal with respect to some property if a) the property holds for the type; and b) there are no subtypes of the type (in the finite set which we consider) that also have the property.

A term is well-typed if all possible values of the term (including all states of any mutable objects which might be bound to the term) are instances of the associated type. A program is well-typed if all terms in the program are well-typed; a program which is not well-typed may have typing errors. It's possible that such a program may be well-behaved, but many languages reject any program that is not well-typed. Others (including both CeeLanguage and CeePlusPlus) don't, or allow ill-typed programs to be written with explicit typecasts. In remaining languages, normally if a program is well-typed certain run-time errors are guaranteed not to happen, i.e., the type system is sound (see StaticTypeSafety). Note that Java is not fully sound.

In languages with a universal supertype (a TopType, called "Object" in most such languages), a trivial well-typing for any program is to assign the type Object to all terms in the program; as all possible values/objects are instances of the universal supertype, it is easy to see that all terms are well-typed. Languages with DynamicTyping do exactly this, however being well-typed gives no guarantee of safety about runtime behavior. One can also say (as usually done) that typechecking is deferred to runtime in such languages; the RunTimeTypingProblem? is the subject of another page. (And much easier to implement).

A term is "ideally typed" if a) it is well-typed; and b) the associated type is a maximal type for the term. Ideally typed terms are useful in that they enable numerous optimizations - data member lookup can be optimized; DynamicDispatch can be turned into StaticDispatch, functions can be inlined, or the term can be subject to TypeErasure. On the other hand, these optimizations can cause UndefinedBehavior if the typing assumptions ever become invalid. In programs written as separate modules (and especially those deployed as separate modules), this can happen quite often. One instance of this problem is known as the FragileBinaryInterfaceProblem.

A program is "ideally typed" if all terms are ideally typed - in other words, there is no type subsumption. Some FunctionalProgrammingLanguages with StaticTyping (HaskellLanguage, CamlLanguage?, but not ObjectiveCaml) produce programs with ideal typings. ObjectOrientedProgrammingLanguages, as a rule, do not - subsumption is a key component of ObjectOrientedProgramming.

How are the types of terms determined? Generally, by one of three ways:

Note that some languages may allow different methods to be used - ML uses TypeInference by default; but the user can attach type declarations if they want.

Very nicely done. Whoever created this page should sign it with pride, it's better than most.

Thanks. -- ScottJohnson


CategoryLanguageTyping


EditText of this page (last edited July 29, 2010) or FindPage with title or text search