Hidden Type Tags

It's possible to use "internal" type tags to make processing more efficient, but we don't have to make their existence visible to the programmer such that a clone interpreter without them would produce the same results (IoProfile), but perhaps be slower. It's roughly comparable to branch prediction in compilers where guesses of usage or future paths are used to speed up the more likely scenario, but programmers typically don't have to think about such efficiency shortcuts (outside of performance tuning).

For example, in the assignment "a=7;", we can track the fact that the literal was a number inside the variable using a tag and perhaps a binary value instead of a string representation of a number. If in the future an operation has to determine the variable's suitability as a number (such as an isNumberic() function), it wouldn't have to parse the variable's value because the internal tag is already set to a number.

However, none of this needs to be detectable to the programmer, unlike most tag-based languages, which unfortunately "expose" their tag-ness. It shouldn't expose it anymore than branch prediction compiler tricks affect the results (IoProfile) outside of performance characteristics.

In this way we can get the best of both worlds: the efficiency "clues" of tags, but not the complexity of dealing with the existence of tags as a programmer, which I like to call "WYSIWYG typing". Programmers shouldn't have to think about their existence. -t


We may need a way to "coerce" internal tags if we want a way to improve the performance for variables who get get their data from external systems or sources. I propose a function called "ensureType" that serves two purposes: to validate a variable (check it's value), and to set the internal type tag, if supported. Example:

  if (! ensureType(myNum, "number")) {
    errorStop("value cannot be interpreted as a number");
  }
If an interpreter version does not support internal type tags, then this function only serves as validation.

One potential side-effect is that it could result in a loss of precision, depending on the number engine used by the interpreter. Example:

 a = 12.3456789012345;
 print(a);  // result: 12.3456789012345
 ensureType(a, "number");
 print(a);  // result: 12.3456789

This risks braking the "invisible to programmer" rule, but may be an inherent trade-off between relying on hardware architecture versus the more flexible "string-based" numbers typical of direct non-tagged values/variables. Although, it should be pointed about that there are different numeric engines that are more flexible than the typical approach used by interpreter floating point number engines (C/Fortran-influenced). However, they are likely not as efficient as the traditional approach because they require more software-side processing rather than matching typical CPU chip architecture.

A trick to gain consistency is that the "ensure" function would always match the hardware-centric view even if an interpreter does not support hidden tags (or they are switched off). Internally, it may convert values with a decimal into "double precision", and then convert it back to a string-based number. Thus, the floating point "truncation" would match that of tagged variables (with internal "binary" values).


ToDo: Find repeated instances of this concept and attempt to refactor them to this topic.


See also: TagFreeTypingRoadMap


EditText of this page (last edited June 16, 2014) or FindPage with title or text search