Tops Dynamic Types Done Right

TopMind's description of an "ideal" dynamic typing (or un-typing) system that is or emulates "type-tag free". Lack of a tag is allegedly a LiberatingConstraint.

--top


Re: "WYSIWYG Typing"


Footnotes

[1] I'm assuming zero-length strings are not considered equivalent to Null. In some systems they are.

[2] I actually prefer periods be reserved for objects and/or maps. "->" is a bit too verbose. Related: MergingMapsAndObjects. -t


What does "no hidden type tags" mean in practice? Can you give an example of behaviour with "hidden type tags" and without "hidden type tags"?

See DefinitionOfTypeTag.

Yes, I've read that. I'm afraid none of what you wrote made much sense (but some of the queries and criticisms about what you wrote do make sense) and I write compilers and interpreters, so I do know something about the subject. There seems to be a consistent conflation of implementation and model, and a consistent confusion between type and "tag" (presumably), that I find confusing.

And it all seems to hinge on strings. By "no hidden type tags" do you mean all values are represented as strings, but strings are cast -- by operators -- as needed to other types?

I agree that it's a difficult topic to formalize because terms such as "value", "type", "canonical representation" (output), and "string" are not rigorously defined. I don't know what to say other than English is not serving us very well. The main goal is to make it so that the developer does not have to "think about" a type-tag. We want to simplify his/her life by removing that concern/factor. -t

Is there some problem with the English phrase, "all values are represented as strings, but strings are cast -- by operators -- as needed to other types"? That seems to be exactly what you're trying to describe.

It can be used to explain why in one language 2 + 2 = 4 but "2" + 2 = "22", yet in another language "2" + 2 = 4. In the former, values are associated with types, such that 2 and "2" are distinct. In the latter, "2" is the same string as 2 because all values are of type string, so the '+' operator must internally cast strings to numeric values in order to perform numeric addition.
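To make the contrast concrete, here is a minimal Python sketch of the two kinds of '+' described above. The function names add_tagged and add_tag_free are made up for illustration and do not come from any particular language; the second one assumes every value arrives as a string and is returned as a string.

```python
def add_tagged(a, b):
    """'+' in a language where values carry types, so 2 and "2" are distinct."""
    if isinstance(a, str) or isinstance(b, str):
        # Any string operand turns '+' into concatenation.
        return str(a) + str(b)
    return a + b


def add_tag_free(a: str, b: str) -> str:
    """'+' in a language where every value is a string (assumption: the
    operator itself casts to a number and casts the result back)."""
    total = float(a) + float(b)
    # Drop a trailing ".0" so whole-number results look like "4", not "4.0".
    return str(int(total)) if total.is_integer() else str(total)


print(add_tagged(2, 2))        # numeric addition
print(add_tagged("2", 2))      # concatenation
print(add_tag_free("2", "2"))  # numeric addition on strings
```

In the second function the operator, not the value, decides which interpretation applies, which is the sense of "cast by operators" used above.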

If all variables "are" strings, then there is only one type, and therefore no type. There is no need to track that something "is" a string in the interpreter if everything is automatically and always a string. It becomes meaningless in the realm of that language to talk about "strings". It's roughly analogous to the fact that our laws assume human beings, and not intelligent aliens from other planets. We don't have to encode such into our laws because the scope is assumed to be about Earth humans. It simplifies our laws and reasoning about our laws to not make such distinction. (Now, "corporate personhood" is another messy matter.)

What do you mean by 'variables "are" strings'? Do you mean "variables are of type 'string'" or "values are strings"? I'll assume the latter. If all values are strings, then all values are of string type. In some circles, that is called "un-typed" or "typeless", but that's a terminological convenience rather than a conceptual notion. No one thinks it really means "there are no types"; it's merely shorthand for "all values are of string type."

Without a rigorous definition of "string", this is hard to test either way. Just because one can extract a particular character from some object does not by itself mean that object "is" a string. It only means that we have available to us techniques to extract a particular character from it. (This gets into the philosophical debate over whether "types" are defined by the operations that can be done on objects of that "type", or some other characteristic, such as implementation.)

A rigorous definition of "string" isn't required, especially if every value is of type string, but even if values are of differing types. In the latter case, "string" is whatever the language designer decides it will be.

If every language can define it any way it wants, then it's difficult to talk about "strings" in any consistent way. We probably need a working definition to take this further. Let me ask you this: From the app programmer's perspective, how does one know (test) that "every value is of type string"?

In any general-purpose language, and most domain-specific ones, we can presume "string" will be used in the usual canonical sense.

In a language where every value is of type string, this is typically stated by the language designer -- in some cases as a feature. There is no test that will definitively prove that all values are of string type.

So the language designer could call them "Foobnixia". This reinforces my suggestion that "type-free" and "only has strings" are not really different. Could you propose a tag-free or type-free sample language that doesn't "use only strings" as a comparison here? (That supports "typical" functionality.) -t

This seems to be going in circles. I already noted that (quoting myself) 'In some circles, that is called "un-typed" or "typeless", but that's a terminological convenience rather than a conceptual notion. No one thinks it really means "there are no types"; it's merely shorthand for "all values are of string type."' I'm not sure what your "Foobnixia" is intended to illustrate, and there is absolutely no question that strings are a type -- this is fundamental to both programming theory and implementation.

I still would be hesitant to call them "string types". A super-set of the "string type" perhaps because in such languages math operations can also be done on them just as easily as string operations. The "native" operators tend to be a combo of those for strings, numbers, and sometimes date/time. I don't see string-centrism in that arrangement. If we cannot open up the interpreter, we cannot tell what data structure or byte arrangement it's using under the cover, and it shouldn't matter because that should be swappable for a different implementation in theory.

{Types are often defined in terms of operations rather than representations. String operations (concatenate, substring, etc.) are, by definition, performed on values of string type. Math operations are, by definition, performed on values of number type. These are true independent of how the values are internally represented. How values may be cast from one type to another is an aspect of the TypeSystem in use. Even AssemblyLanguage and Forth have types, but they don't have TypeChecking.}

I don't know how to objectively verify "has types" unless we come up with a rigorous definition of "types", which has proven elusive on this wiki. ("I know types when I see them" isn't "rigorous".) Anyhow if we look at the native operations available, then we have a stringnumberdate because all three "kinds" of operations are possible on any given scalar variable/object in such languages. This is another case where IS-A fails because we don't have to stick with a strict hierarchy of operations. -t
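As a rough illustration of the "stringnumberdate" idea, here is a Python sketch of a scalar that offers string, number, and date operations side by side and stores no type tag. The class Scalar and its method names are hypothetical, and holding the value as text is only an assumption of this sketch, not a claim about how any real interpreter represents it.

```python
from datetime import date, timedelta


class Scalar:
    """Hypothetical tag-free scalar: it holds text and nothing else."""

    def __init__(self, text):
        self.text = str(text)

    def concat(self, other):
        # String-style operation: always possible.
        return Scalar(self.text + other.text)

    def add(self, other):
        # Number-style operation: fails only if the text does not parse.
        total = float(self.text) + float(other.text)
        return Scalar(int(total) if total.is_integer() else total)

    def add_days(self, days):
        # Date-style operation: expects ISO text such as "2012-09-27".
        start = date.fromisoformat(self.text)
        return Scalar((start + timedelta(days=int(days.text))).isoformat())


print(Scalar("12").concat(Scalar("3")).text)
print(Scalar("12").add(Scalar("3")).text)
print(Scalar("2012-09-27").add_days(Scalar("4")).text)
```

Whether an operation succeeds depends only on whether the content can be read the way that operation needs, which is the buffet-of-operations view argued for above.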

{Objective verification of types, whatever that is, doesn't enter into it. I am using familiar terminology for generally-recognised concepts. Your personal rejection of these is irrelevant, and I won't waste further time on it. ComputerScience isn't going to change to suit just you, so if you wish to engage in these discussions, you'll need to change to suit ComputerScience.}

Fuzzy is never the ideal, and it's not my fault they are fuzzy. The existing terminology is "good enough" for roughly 90% of the uses out there (largely shaped by historical tradition), but we are at the 10% here. Traditional ADT-style classification of "types" fails when one views operations as a buffet of choices rather than "belongs to" one type. Shit overlaps. That's life. Hierarchical taxonomies clash with a SetTheory view of things. Sure these kinds of objects have behaviors of strings, but they also have behaviors of other "types". Thus, I reject the notion that they are primarily "strings". They are (potentially) numbers also if we classify them by operations. Please don't be blinded by history. -t

Now I will grant that one can do string-like operations on ANY value we may encounter (if there is a way to "serialize" it), but that's true regardless of language or implementation. Strings are just more flexible than numbers by their nature, independent of technology. -t


I assume from your "no tags" transparency that structures also cannot be distinguished from strings. This means that you have some canonical representation of them in mind (and probably also for objects and functions). Let's assume JSON. This means that I could validly do something like

  "{ a: 1, b: [ 3, 4 ]}".b[0]
Correct? Doesn't this open a whole can of bugs?

I'm not sure what "b" here is supposed to be. For this topic, I'm only considering "scalars", but it's an interesting issue and may give us something like tcl meets lisp meets strings. There is a discussion around this wiki somewhere on that.
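To make the question about ".b[0]" concrete, here is a small Python sketch of what "a string that can be indexed as a structure" might look like. The helper get_path is invented for this illustration, and it assumes strict JSON (quoted keys), unlike the looser notation in the snippet above.

```python
import json


def get_path(value: str, *path) -> str:
    """Parse a string as JSON on demand, follow the path, and hand back
    a string again (hypothetical helper, not from any real language)."""
    node = json.loads(value)
    for key in path:
        node = node[key]
    return json.dumps(node)


print(get_path('{"a": 1, "b": [3, 4]}', "b", 0))

# The "can of bugs": any text that happens to parse is silently treated
# as a structure, and text that does not parse fails only at run time.
print(get_path("[10, 20]", 1))
try:
    get_path("hello", 0)
except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
    print("not a structure:", err)
```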


CategoryTypingDebate, CategoryRant, ColdFusionLanguageTypeSystem, DefinitionOfTypeTag


Last edited September 27, 2012