Arguments Against Type Indicators

Many dynamic languages have explicit "type indicators" for scalar[1] variables or values. Typically in indicator-centric languages, variables or values (depending on modelling assumptions) can be viewed as being bifurcated into two parts (using local working terms only) called the 1) "value representation" and the 2) "type indicator".

Various other terms have been used to describe these two parts on this wiki, including 1) "value" and 2) the "type tag". I'm using "value representation" (VR) and "type indicator" (TI) here only because they seem to produce the least contention, not because I agree with them. I'll leave those vocabulary battles for other topics (see TagFreeTypingRoadMap). Since I find them verbose, I'll tend to use the abbreviations (VR and TI).

Anyhow, I've found languages that don't rely on (visible) TI's to be much cleaner. I like to call it "WYSIWYG typing" because you don't have to worry about a semi-hidden type indicator pulling surprises on you: what you see (print) is what you get. Not having the indicator makes one less thing to worry about.

Some argue that TI's improve performance because they reduce the need for variable/value/object-using routines or operations to have to parse to determine the "type" of a variable/object. However, a hidden type indicator can still be used as a "processing clue" only to speed up processing under normal/typical variable usage. For example, the assignment "a=123;" would produce a TI of "Integer" under the hood while "a='123';" would produce a "String" TI under the hood. The efficiency assumption here is that assigning a variable using a quoted literal probably means the programmer will mostly be using numeric operations on that variable and thus if we "pre-numerify" it, it will be ready for numeric operations (no conversion or byte re-packaging needed). However, the programmer would have no way to know the difference and it shouldn't matter for processing in terms of input and output (behavior of the app). In theory, in WYSIWYG typing the hidden TI could be removed from the interpreter and any app program in a WYSIWYG-typed language would still function as it did before, albeit a bit slower. It's conceptually comparable to "branch prediction" in some CPU's: the apps don't have to care if branch prediction is used by a CPU chip; it's transparent to them (if done right). TI's would eseentially be "type prediction" instead of "branch prediction", but the interpreter's visible results don't reveal such performance tricks and they could be switched off without "breaking" our existing apps just as branch prediction may be switched off or removed in CPU's (or replaced with a chip lacking branch prediction). Contrast this with dynamic languages which make type-overloaded polymorphic decisions based on the TI. (WYSIWYG-typed languages should avoid overloaded operators, such as not overloading "+" for both addition and concatenation.)

Thus, the machine performance argument is mostly moot.

Another alleged argument for them is that TI's are needed to represent Nulls and other similar doo-dads such as NAN's. First, I'm not sure a special value can't be used instead; and second, these kinds of values could be represented with strings such as "[null]" or "[NAN]". (I question that "NAN" is a good idea to begin with, but that's perhaps another debate.)[2]

In summary, explicit/visible type indicators cause more problems than they solve. I prefer the behavior of languages without them, such as ColdFusion's and Perl's typing system.

Summary of points:

--top


Discussion

I assume what you mean by "visible type indicators" is some form of ManifestTyping, yes? I suspect you prefer languages with good TypeInference and ImplicitTyping (or even SoftTyping).

Moved rest of discussion to TypeDefinitionsSmellBadly because this topic is about merit, not definitions.

I generally informally define a TI dynamic language as one that "acts like" each variable has/references one primary explicit type indicator/tag/code (such as "string", "integer", "date", etc.) and that the built-in libraries and app programmers can definitively see what this indicator is and use it if they want.

TI-free languages may have operators such "isNumeric" that can tell you if a variable/object/thing can be "interpreted as" or "processed as" a number, but they either don't produce singular answers (such as isNumber(x) and isBoolean(x) both returning True at the same time), or appear to be directly tied to what the inspect-able value or "content" has in it. (I have attempted a more formal definition, but failed miserably at it. In general, TI dynamic language variables/objects are "best modeled with" two chambers (type and value) versus one chamber (value only) for TI-free languages.)


Foot Notes

[1] I've kicked around the idea of ridding TI's for non-scalars also, and my gut feeling as that they are not needed/helpful for non-scalars either, but I am not prepared to "officially" endorse them at this point. -t

[2] String-indicator-based null's may take up more memory, but the flip side is that less memory is needed for the TI byte(s) in general for all variables, so it may be a wash.


See also: TagFreeTypingRoadMap


EditText of this page (last edited May 27, 2014) or FindPage with title or text search