Type Safe

Any variable with a declared type will always reference an object of either that type or a subtype of that type.

A more general definition is that no operation will be applied to a variable of a wrong type. There are additionally two flavors of type safety: static and dynamic. If you say that a program is type safe, then you are commenting on static type safety. That is, the program will not have type errors when it runs. You can also say that a language or language implementation is type safe, which is a comment on dynamic type safety. Such a language or implementation will halt before attempting any invalid operation.

Some languages make certain guarantees on type safety. If the language interpreter must make some type-assumptions that can be violated by a programmer, the safety guarantee is called 'weak'... e.g. C++ is weakly typed because it assumes that C-style cast and reinterpret_cast operations are correct (the prototypical example being casting from a 'void*'). Conversely, if the type system can make inviolable guarantees, it is called 'strong'. See also TypingQuadrant and StronglyTypedWithoutLoopholes for more on the subject.

TypeSafe is different from TypeChecking, which is a process (not a property). TypeChecking, however, is often a necessary step in providing any guarantee of type safety.

Note that actual identification of types (e.g. via TypeInference) is not necessary to provide a guarantee of TypeSafety. This is relevant because some type systems lack a notion of 'most specific type' or are otherwise undecidable. Guaranteeing TypeSafety requires proving that certain relationships hold between expressions and operations, which is an easier problem than deciding on actual type.


If the typing system is tree- or DAG-shaped, then degrees of freedom are limited and therefore it is easier to reason about "types". However, a graph-based type system may lead one down messy paths and un-ending/uncertain loops. It then becomes "swampy" instead of "mathy". Does your definition of TypeSafe exclude graph-based type relationships? It may hinge on whether "sub-type" can be any graph path, or is limited to trees/DAGs. The power (MentalIndexability) of "clean" types seems to come from tree/DAGs. (I suppose there is also the potential problem of infinite tree/DAGs, but so far they don't exist in practice, unless they are merely an "unfolded" projection of loopy graphs with node duplication.) --top

The above "swampy" conception of types isn't correct on any of its attempted points.

First, incorrect regarding the "'swampy' instead of 'mathy'" assertion regarding 'loops' in the subtype relation. Type-systems formalize (in a very "mathy" manner) quite easily with 'EQUIVALENT(T1,T2) :- SUBTYPE(T1,T2) AND SUBTYPE(T2,T1)' along with 'SUBTYPE(T1,T3) :- SUBTYPE(T1,T2) AND SUBTYPE(T2,T3)' (transitivity). In any type-system with this pair of axioms, any subtype-relation 'loop' in a graph of types means that all types in said loop are provably equivalent. These axioms, of course, necessarily leaves one with a DAG of final (evaluated) types, since all perceived 'loops' effectively collapse to one point in the DAG. Further, this is a VERY common pair of axioms in type systems - being, in fact, the most common definition of type equivalence for structural typing systems (which just happen to be strictly more general than nominative typing systems). Loops can be very "mathy". Though I suppose you could argue that I'm discussing non-loops, since they do collapse, but if the question is whether one can handle loops formally (in a "mathy" manner), the answer is 'yes'.

Second, incorrect regarding "degrees of freedom [being] limited" assertion. Trees ARE limited in the hierarchies you can create (given that you can only have one supertype for each type). But DAGs (directed acyclic graphs) happen to include any lattice, disconnected graphs, forests, etc. 'Lattice' by itself is at least as general as both the 'subset' and 'superset' relationships by virtue of including both: 'proper subset' and 'proper superset' each form a lattice on the class of all sets. Predicate-types also form a lattice (using, for example, SUBTYPE(T1,T2) :- FORALL(X).IMPLIES(T1(X),T2(X)) - which has \x->TRUE and \x->FALSE at the top and bottom of the lattice respectively). As a consequence, type-systems that can be described on a DAG don't necessarily have "limited degrees of freedom" - they potentially have the full expressiveness of sets; they potentially have even more!

Indeed, most of the type-systems I prefer to utilize include inductive types (like tagged unions, which (loosely) relate 'subtype' to 'subset' - e.g. a:T1|b:T2 <: a:T1|b:T2|c:T3), coinductive types (like records, which (loosely) relate 'subtype' to 'superset' - e.g. {a:T1,b:T2,c:T3} <: {a:T1,b:T2)), and predicate types (which relate subtype to implication, as described above) all in one integrated package - allowing for multi-dimensional lattices far more general (and with more "degrees of freedom") than working with oft naive 'TypesAreSets' approaches, all the while possessing a semantic richness and missed by 'TypesAreSideFlags' and allowing both static/compile-time inference over types and FirstClassTypes (these both being very useful properties of a type-system that aren't part of TypeSafety).

Third, incorrect on the assertion that there are no infinite DAGs of types. It is true that any particular program (being a finite entity in its own right) will represent interest in some finite set of types and subtype/instance-of relationships. However, that still allows for infinite variation on, for example, the REP-loop. A casual example is a function that takes a record possessing type {Name1:T1,Name2:T2}. Where the traditional record subtype relationship applies, you can pass to this function any value of the form {Name1:S1, Name2:S2, Name3:T3, ..., NameN:TN} where S1<:T1 and S2<:T2 - an infinite collection of subtypes (without 'node duplication'), thus forming an infinite DAG. And this doesn't even touch on FirstClassTypes, wherein types can be constructed and utilized at runtime (a useful property for mobile code), or type systems that lack the concept of 'MostSpecificType?' for values (i.e. predicate type systems).

Fourth, the definition of TypeSafe doesn't require any particular features of a TypeSystem or any particular definition of 'subtype'. A TypeSystem is called 'unsound' (though it may still be of practical value) if it does not provide TypeSafety, but there is no dependence between the definitions of 'TypeSafe' and 'TypeSystem'. A generic definition of TypeSafe is simply that "no operation will be applied to a variable of a wrong type" (and even more generic: "no invalid operation will occur" - allowing for operations over arbitrary expressions). This requires only that there be some means to distinguish 'wrong type' or 'invalid operation' from the good/valid ones. 'Invalid operation' generally includes any undefined operation in addition to semantically invalid operations (such as adding a distance to a volume). If you do utilize the more specific definition above: "any variable with a declared type will always reference an object of either that type or a subtype of that type", TypeSafety only requires a guarantee (from the joint combination of the language, the program, and the type-system) that either operations on a variable of a particular type also happen to be valid when applied to any subtype of that type or that there are no operations on a subtype - this constrains the definition of 'subtype', but does not set it. It is generally the axioms and rules of the type system that ultimately circumscribe, finite, and constrain the concept of 'subtype'.

That said, it is probably difficult to formulate a definition of 'subtype' relative to single expressions that simultaneously provides type-safety and doesn't form a DAG. It is likely that the "soundness" requirement for TypeSystems - that valid operations on a type also be valid operations on a subtype - constrains 'subtype' to at least form a lattice (involving sets of all possible operations), though doesn't prevent 'subtype' from being even further constrained. That leaves only two possibilities that I immediately recognize for 'subtype' relations that aren't DAGs. One is rejecting type substitution in the language for something non-transitive (e.g. single-step nominative-type coercion, as per the C++ coercion 'operator TYPE()(void) const'). The other is attempting to handle subtypes across more than one expression; for example, dependent types may or may not form a DAG (I haven't explored dependent-types well enough - i.e. with enough mathematical vigor - to provide any confident answers).

OTOH, as discussed above, DAGs are hardly a problem. They don't in practice limit your "degrees of freedom".

Top, I'm trying desperately not to be rude, but would you please cease attempting to make sweeping statements about the properties of type systems, type safety, the nature of types, etc. until such a time as you possess the proper motivation and domain knowledge to support your conclusions and sweeping statements with formal proofs? Formal subjects deserve formal argument - certainly no more of this "swampy" and "mathy" tripe, but even your assertions about DAGs, which were made without explanations, demonstrations, or proofs of any sort (and were wrong), are poor form.


CategoryLanguageTyping


EditText of this page (last edited July 9, 2010) or FindPage with title or text search