Definition Of Type Tag

This is an attempt to define Top's "side tag" typing model, which is subject to many heated debates between Top and others on this wiki.

An informal definition of Top's "side tag" is a language system or engine that "acts like" it has a side tag, separate from the value itself, to indicate types. "Acts like" is used to avoid prematurely tying it to an actual implementation. Further, it should be considered a "working definition" for the scope of only this wiki rather than universal, although universal attempts are welcome here. Just please state this assumption.

So far there are 3 candidate definitions, and there is discussion about whether "tags" should be considered a model for prediction such that having a definition is of secondary importance.

Attempt #1:

A language uses the "type tag model" if it's possible to get different results for two different variables even if their string representation is identical. Scope differences excluded (local versus public, etc.)

Well, now we have a definition of "a language uses the 'type tag model'". What about "type tag"?

{Furthermore, what does "it's possible to get different results for two different variables ..." mean? When and where do these "different results" manifest themselves, and in what manner? What do you mean by "results"? What do you intend by the sentence fragment, "Scope differences excluded ..."?}

Would you suggest a topic name change to DefinitionOfTypeTagModel? I'm not sure what your point is.

PageAnchor: test_53

As far as "different results":

  print(toString(a) == toString(b));  // result: True
  print(myPureFunction(a) == myPureFunction(b)); // result: False

Alternative:

  if ((toString(a)==toString(b)) And Not (myPureFunction(a)==myPureFunction(b))) {
    print("Possible type tag detected.")
  }

Consequence of this definition: Whether a language uses the "type tag" model depends on its toString function. If a language has a toString method which completely represents its internal structure, then that language doesn't use the tag model.
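
The dependence on the string conversion can be sketched in Python. This is only an illustration: the helper name possible_type_tag, the function double, and the use of Python's str and repr as stand-ins for two different "toString" choices are all assumptions, not part of the definition under discussion.

```python
def possible_type_tag(a, b, to_string, pure_function):
    """Return True when a and b read the same as text yet behave differently."""
    same_text = to_string(a) == to_string(b)
    same_result = pure_function(a) == pure_function(b)
    return same_text and not same_result


def double(x):
    # Hypothetical pure function: "10" * 2 repeats the text, 10 * 2 multiplies.
    return x * 2


a = "10"
b = 10

# str() renders both as 10, so the test reports a possible tag.
detected_with_str = possible_type_tag(a, b, str, double)

# repr() renders them as '10' and 10, so the pair is never compared as "identical".
detected_with_repr = possible_type_tag(a, b, repr, double)

print(detected_with_str, detected_with_repr)
```

Under these assumptions the same language and the same two variables give opposite answers, depending only on which conversion is agreed to be "the" string representation.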

You didn't do what you set out to do. You said you wanted to define "type tag". That's all well and good, but instead you define "a language uses the 'type tag model'".

Attempt #2:

Given two variables, a and b. If it's possible to get different results using identical operations or transformations upon these two variables even if their textual representation is identical, then the set of properties for a and b that cause a and b to give different results is called a "type tag". If it's not possible to get different results, then a given variable has no type tag. Scope differences excluded (local versus public, etc.)

Questions:

Do you mean the textual representation of:
- the declaration of the variables
- the expression(s) used to assign a value to the variables
- the value literal(s) occurring in the assignment
- (causing recursion) variables occurring in otherwise equal expressions in the assignment above

Textual representation of the "value" of the variable. I realize that some languages have a "dump" value that may be different than a "print" value. The "dump" value is generally used for debugging or serialization across servers etc., while the "print" value is designed more for end-users. People may have to first agree on which best represents the "value" of a variable that "prints" out in a production "output" sense. How to work that into the definition is still an open issue.

By identical operation do you mean:
- the operations have the same name
- ... the same definition
- ... the same source text

Same source text, such as same expression.

Just to exclude the obvious: Consider the identity function id:

        a = 1
        b = 2
        id(a) -> 1
        id(b) -> 2

This returns different results and according to the definition I'm bound to call the difference a type-tag, even though it is just a different value.

Thus something about the variables must be equal, otherwise values would be type tags.

Please state the difference.

I don't understand what the above example is intending to show. What is "id", and why are the variables given different assignments? That would give them different "printable" values, and thus disqualify them. Is "id" similar to the RAM address of the variable? I suppose that's one of those weird exceptions, such as scope differences that may have to be excluded. It may be difficult to make a definition without a messy list of exceptions. But first let's work out the "gist".

Hm. If everything is the same (value, function, expression) there will never be a difference and hence no type tag. Something has to differ. I have to assume that you mean the 'printed' value at some intermediate place be the same like this:

  a = "2"
  b = 2
  toString(a) -> "2"
  toString(b) -> "2"
  toString(dup(a)) -> "22"
  toString(dup(b)) -> "4"

Thus it looks as if you define type tag relative to some toString method. So it appears you have to specify the toString method in more detail.
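
The two cases discussed so far (the identity example and the dup example) can be put side by side in a short Python sketch. The names dup and classify, and the use of str as the agreed "toString", are assumptions made for illustration only.

```python
def dup(x):
    # Hypothetical "dup": repeats a string, doubles a number.
    return x + x


def classify(a, b, operation, to_string=str):
    """Classify a pair of values under the wording of Attempt #2."""
    if to_string(a) != to_string(b):
        return "different values"      # disqualified: the text already differs
    if to_string(operation(a)) != to_string(operation(b)):
        return "possible type tag"     # same text, different behaviour
    return "indistinguishable"


identity_case = classify(1, 2, lambda x: x)   # the id(a), id(b) example
dup_case = classify("2", 2, dup)              # the dup example
same_case = classify(2, 2, dup)               # everything equal

print(identity_case)
print(dup_case)
print(same_case)
```

Only the middle case satisfies the definition, and it does so only because str was chosen as the comparison; every outcome here is relative to that choice.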

Please note that there are some languages where the default toString method renders a string "2" as ""2"" and a numeric 2 as "2", so these differ in those languages and thus there'd be no type tag differing between "2" and 2.

I'm not sure what you mean here. It's false that "their textual representation is identical" if one has quotes and the other doesn't when "printed", and thus they are not covered by the definition. (What kind of goofy language would do that anyhow?)
- Actually I like languages where the default toString method returns the notation to recreate the value. Scheme does this. It simplifies reading debug output because it avoids extra type annotations (at least in the base cases).
  - User-side output and debug output generally have different needs. The user doesn't want to see all kinds of internal gobbledygook.
  - [I think you're misunderstanding something. Scheme (and other languages; the RelProject does this too) avoids showing "internal gobbledygook" that other languages might display. This not only simplifies reading debug output, but all output. Java's (and C#'s equivalent) toString(), for example, is often misused to produce output full of "internal gobbledygook". However, the RelProject (for example) always outputs values in a format that re-creates the value if used as input.]
  - How does including quotes by default help readability for non-techies? Think of it this way, what would/should HelloWorld output look like if the text is in a variable?
  - [It clearly distinguishes alphanumeric values from numeric values. "HelloWorld".]
  - The end-user doesn't care about that. Managers will chew you out if you include gobbledygook characters they didn't ask for.
  - [Programmers (in the RelProject, at least) are unlikely to use the default string representation except during development and debugging. Hence, application end users are highly unlikely to see so-called "gobbledygook characters".]
  - How about we put non-output-oriented languages aside for now, and focus on those that do have end-user-oriented output options. Let's not get stuck on exceptions and oddballs just yet.
  - [Rel is not a "non-output-oriented" language, whatever that is. It has end-user-oriented output options.]
  - Then why your focus on the debug-oriented ones?
  - [They are more likely to involve some canonical string representation of values. End-users are likely to see formatted values, that may or may not be the same as their canonical string representations. I presume you intend to base your "definition of type tag" on canonical string representations, rather than values formatted for end-users, yes?]
  - If it's possible to provide a non-ambiguous definition of "canonical string".
  - [It doesn't need one. It's generally recognised and understood.]
  - That's not good enough for definitions, and obviously there are grey areas as witnessed by this sub-topic.
  - [Actually, that's perfectly good enough for working definitions, and formal definitions have -- thus far -- not been evident here.]
  - So you say. If you define it well enough, then there is no ambiguity and I'd only get it "wrong" if I mis-run your definition algorithm or formula.
  - [What's a "definition algorithm or formula"? Do you mean a formal definition?]
  - Anyhow, what's the "canonical string" of a Rel variable? The one with the quotes?
  - [Variables don't have canonical string representations, values do. The canonical string representation of a value is that which is emitted by the OUTPUT operator for that value.]
- Do you at least agree that your definition rests on the definition of toString?
  - Yes, that could be the case. Parties may have to first agree on what's the official user-oriented "output" representation before the definition is applied. That's not necessarily a show-stopper even if it does complicate things.
  - [Don't you find it awkward to employ a definition that requires adherence to non-existent official standards in order to be of any use?]
  - There are very few "official standards" that apply to most or all programming languages. EverythingIsRelative, making ComputerScience "awkward". I'm just the messenger. At best we sometimes can mutually agree on the givens as "working assumptions" for further purposes. Thus, if both parties agree that output X is the "reference output string", then they can move on from there.
  - [Huh?]
  - What's an example of a comparable "official standard" that's useful for language-related definitions?
  - [Any recognised introductory ComputerScience text would be sufficient.]
  - Pick one example that's as comparable as possible.
  - [I recommend http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=3305.]
  - I'll repeat the request with some context-free clarifications: "Pick one example [definition from CS] that's as comparable as possible [to what we are attempting to do here]." - t
  - [Striving for the level of clarity and generality as found in, say, the accepted mathematical definition of "function" might be a reasonable goal.]
  - "Standard" math is a notation. We interface with each other via this notation. It would be comparable to a specific programming language. We can say "this pattern is called a 'function' in language X" with little risk of ambiguity, but that may not be fully applicable to a different language. Further, what many languages call a "function" is not strictly a function in the math sense.
  - [I use "function" as an example of an accepted definition, well-recognised, that can be formalised. Its correspondence to programming language "functions" is, in this case, irrelevant.]
  - Not sure how to word it better at this point without knowing what's happening in your head.
  - [I suggest you look at the formal definition for "function" as an example of the level of rigour you might want to achieve. See, e.g., http://en.wikipedia.org/wiki/Function_(mathematics)#Formal_definition]
    - Quote: Given sets X and Y, a function from X to Y is a set of ordered pairs F of members of these sets such that for every x in X there is a unique y in Y for which the pair (x, y) is in F.
  - That's all fine and well, but first one has to decide/define what's in the sets. In our case, we could define set or list P as the printable output characters of a variable, etc., but per above, different people may put different things in the list. It's a classification contest.
  - [Actually, no, you don't have to decide/define what's in the sets at all, because the definition of "function" describes an abstraction. That means the sets can be anything, as long as they're sets. Can you create a similarly abstract, yet rigorous, definition for your "type tag"?]
  - It's not really rigorous because just about everything can be defined as set(s). I can define a program as one big function, for example: output = program(input). Tell me, what's the difference between a function and map if we exclude "typical" implementation techniques? - t
    - {In mathematics, the term map is overloaded, so the answer would depend on which definition of map you were using. Of the common definitions, we'd have one of the following:}
      - {When map is just another name for function, there is obviously no difference.}
      - {When a map is a function that has certain special properties (which would depend on context), then all maps would be functions, but not every function would be a map.}
      - {When map is a name for a function-like object, then all functions would be maps, but not every map would be a function.}
      - What's an example of a map that's not a function?
      - {Successor map for ordinals. It can't be a function since neither the domain nor the range is a set.}
  - [The fact that "just about everything can be defined as set(s)" is a good thing. It means "function" as an abstraction is appropriately applicable in innumerable places. Your example of defining a program as one big function is, in fact, the conceptual essence of FunctionalProgramming. In mathematical terms, "map" is often a synonym for "function" or function-like things, but its specific meaning varies depending on the particular mathematical field in question.]
  - One has to be able to falsify the application of a definition or it's useless in practice. One has to be able to say what's not a function, for example. If one cannot think of a way to view a given thing as set(s), that doesn't mean the definition doesn't apply, because it may just be that those involved simply haven't found a way to view things as sets YET. We want to test objective things, not human creativity. It's not practical to say, "This thing is not an X because nobody has found a way to make it fit the prerequisites of X yet." Somebody once said that "String Theory" can't be falsified because it's really a framework that allows one to build a model which fits just about any 3D universe one can conceive or observe (paraphrased). Functions and sets are kind of like that. - t
  - {It's perfectly ok to have definitions that have no referents or for which everything is a referent. For example, there's the ideal voting system as defined on ArrowsTheorem. Another example is a Halt TM. Neither of those can exist, but there's nothing wrong with the definitions. The Halt TM definition even sees use, since the non-existence of such a thing has some rather wide-reaching consequences.}
  - {But that's all just a RedHerring, since it's easy to give examples of things that are/aren't sets/function.}
  - Name one.
  - {The class of ordinals isn't a set. The successor map for ordinals isn't a function.}
  - Is there a formal proof that says it's impossible to define it as a set?
  - {Yes. See Burali-Forti paradox.}
  - Is this an issue that can affect "typical" programming?
  - [No, but I think this has veered away from the original intent: That you create a formal definition of "type tag". The definition of "function" was not intended to spawn another debate, but merely to provide an example of the appropriate level of detail and rigour.]
  - I agree one may have to make a choice about targeting all potential programming languages versus "common" or "typical" languages.
  - [That wasn't my point.]
  - Further, I wanted an example from programming languages, not math. Is there such a dearth of "great definitions" for programming languages such that you had to wander into the domain of math instead?
  - {Why should the example be restricted to programming languages? I can't see how that would make the point any clearer.}
  - Because this is a programming and computer wiki, not a math wiki.
  - {And that's relevant because?}
  - In part because math has a standardized notation, whereas programming doesn't. Programming definitions should not be tied to any specific notation.
  - {Math definitions aren't tied to any specific notation either, so that clearly can't be it.}
  - Will math tell us what a "canonical string" is?
  - {Beyond the fact that it's a function from all the values in a programming language to strings (and anything that can be derived from that), no. Math will not tell you which function to use (at least not without further constraints).}
  - [Indeed. I don't draw a hard line between ComputerScience, programming languages, and math. As such, I chose a definition that is simple, intuitive, universally recognisable -- and appropriately rigorous -- to use as an example. I could equally have chosen examples from grammar theory, type theory, denotational semantics, and so on, but these involve more complex and difficult definitions -- most of which are dependent on essential foundations like the definition of "function", so it makes an ideal illustration.]
  - I suspect there are multiple mathematical ways to model the same things found in "typical" languages. But on the more practical side, most developers are not going to have such an extensive background and need more practical, approachable models.
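
For finite sets, the set-of-ordered-pairs definition of "function" quoted above can be applied mechanically, which also shows how it says what is not a function. The following Python sketch is an illustration only; the name is_function and the sample sets are assumptions, and it says nothing about infinite sets or proper classes such as the ordinals.

```python
def is_function(pairs, domain, codomain):
    """Check the quoted definition of "function" for finite sets."""
    pairs = set(pairs)
    # Every pair must draw its members from the two given sets.
    if any(x not in domain or y not in codomain for x, y in pairs):
        return False
    # Every x in the domain needs exactly one partner y.
    return all(sum(1 for px, _ in pairs if px == x) == 1 for x in domain)


X = {1, 2, 3}
Y = {"a", "b"}

total_and_unique = is_function({(1, "a"), (2, "a"), (3, "b")}, X, Y)
two_outputs = is_function({(1, "a"), (1, "b"), (2, "a"), (3, "b")}, X, Y)
missing_input = is_function({(1, "a"), (2, "b")}, X, Y)

print(total_and_unique, two_outputs, missing_input)
```

The second and third sets of pairs fail the definition (one input with two outputs, one input with none), so within this finite setting the definition does exclude things.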

Open Points:

Top, could you provide a few examples where we can see what is equal and what is different regarding your definition attempt 2.
- Please clarify. I don't understand the context of "equal" here.
Do you require that the definition of toString (should) depend on official definitions (of types?).
- Most languages have at least one operation for turning the contents (value) of a variable into a list of ASCII or Unicode characters. I won't label this as a "type" at this point.
Possibly an example of what the toString method (which seems to be central to your definition) could look like.
- You mean an example implementation?

{The gist appears to be "polymorphic, and has the same canonical string representation".}

That's an interesting way to put it. The "tag" would then be that property of the variable that allows it to polymorph. However, the definition of "polymorphic" can get messy also.

Re: "Don't you find it awkward to employ a definition that requires adherence to non-existent official standards..."

Hell, I doubt "variable" has an "official definition" (unambiguous). - t

There's no such thing as an "official" definition. However, the concept of "variable" is well-understood.

We've had a similar discussion in TopsTypeDeterminatorChallenge over "types" and "cats" (feline). The state of the art is still not beyond the "I know it when I see it" stage. Historical-pattern recognition is not sufficient for a definition. - t

Of course it is. What do you think definitions are?

Definitions are often influenced by historical patterns, but don't actually use historical patterns as PART of the definition itself. If I look up "bonnet" in the dictionary, it usually describes the characteristics of a bonnet; it doesn't define it as "looking very similar to the following pictures....". It may have pictures as side info, but does not use the pictures as the definition itself.

Typically, a definition for a term <x> is simply an agreed set of terms that accurately reflects a common understanding of <x>. Rarely, formal or working definitions used in (say) academic papers may deliberately and explicitly deviate from common usage in order to add rigour within the narrowly-defined scope of a single paper, specialised field of study, or particular discussion.

"Common understanding" is not sufficient for settling controversy and differences in viewing/modeling things. "Good enough" for general or "most popular" notions, perhaps, but that obviously should not be considered the pinnacle of communication.

Indeed. That is why we create and use formal definitions.

Which so far seem poorly suited for software issues.

Why do you say that? Formal definitions are used extensively in SoftwareEngineering and ComputerScience. They are fundamental to programming language theory. They're also used extensively in mathematics, physics, chemistry, logic, and increasingly in philosophy and the humanities.

If they were truly formal and thorough, then a machine could be constructed that would be able to take a programming language grammar and compiler/interpreter as input, and unambiguously label the parts of source code. The problem is that you are too ready to accept "I know an X when I see an X".

Why do you think that's possible?

If it's unambiguous, it should be possible to produce an algorithm to perform the task instead of a human.

Why do you think that's possible? There's a vast gulf between making a definition unambiguous and making artificial intelligence. Perhaps you're confusing formal definitions with automated theorem proving, and then confusing that with high-order pattern recognition and identification &/or formation of concepts?

What do you mean by "label the parts of source code"?

I'm afraid stating "similar to" doesn't reveal what parts you think are alike, or to what degree, and what parts you think are not.

The point is to be able to "apply" allegedly formal definitions such that given sufficient info about the language grammar and behavior, a machine (algorithm) could determine if a definition applies to parts of a given (and potentially new) programming language without human intervention. This would remove the "I just know it when I see it" technique that humans over-rely on to classify things and/or apply definitions. "I know it when I see it" is too close to ThenaMiracleOccurs.

{It's not going to be possible, any language with types and an eval equivalent will run into the HaltingProblem.}

I'm not talking about artificial intelligence. I'm talking about describing your mental steps clear enough to "automate" them. That's what us coders do: take "notions" and iteratively or fractally refine them into concrete steps. If you can't do that with something you are personally sure is "clear cut", then something is wrong. Sometimes things people think are clear cut are not really clear cut. And what's something that a computer would have a HaltingProblem with but not a human? For example, a human can read source and make an educated guess if a program will complete or not, but it's only a guess. A program could be made to guess also. - t

First you wrote, "if [formal definitions] were truly formal and thorough, then a machine could be constructed that would be able to take a programming language grammar and compiler/interpreter as input, and unambiguously label the parts of source code."

We pointed out that such a thing is, in general, not computable.

Now you write, "I'm talking about describing your mental steps clear enough to 'automate' them."

You're essentially asking for the same thing: an algorithm for mathematical intuition and understanding. That algorithm is unnecessary (assuming it's computable, which it isn't), because the problem here is one of communication, not automation. We can do mathematics, and we can achieve understanding of mathematics, and (most importantly) we can agree on our understanding of mathematics, without automating it. Furthermore, focusing on automation pointlessly deviates from a reasonable request on our part: that you provide a formal definition of "type tag". I doubt a formal definition of "type tag" would lead to automatic identification of "type tags", but it would certainly result in a more informed and fruitful discussion because we should finally get a clear idea of what you have in mind.

In general, a formal definition is a tool to facilitate human-to-human communication, not enable machine automation.

If it can do both, then it may be even better than one or the other. If it's "runnable" we can subject it to experiments and science.

That may be feasible, given current technology, for a limited set of definitions in pure mathematics. It's infeasibly intractable for everything else.

To the individual who attempted a refactoring and summarisation of the debate: I appreciate your efforts, but please, please, please strive for accuracy. Creating a third thread that differs from the first two will only further fragment what should be a trivially-resolvable debate. I have, however, removed all invective from the original threads.

{I accept that you want to preserve the ThreadMode. Possibly it is too early for DocumentMode. I tried to resolve this. -- .gz}

I think it is far too early for DocumentMode, and I'm still not convinced your summaries would do anything at this stage but fragment the debate. Sorry, but I've removed them. When this page has been quiet for a while, then it will be time for DocumentMode.

I'd suggest creating a parallel document of the improved version rather than rework the original until it's finished or settles down with regard to new material.

Top -- since you are the one who invented the term "type tag", do you have any examples of languages that use the "type tag model" or don't use the "type tag model"? Do you have any example programs that illustrate the model, i.e., a program that would behave differently if the language did not use the "type tag model"?

A similar term has been used in the compiler building world, thus the source of the term may be more complex.

ColdFusionLanguageTypeSystem and TypelessVsDynamic discuss the issue. The tricky part is defining and/or characterizing it in terms of externally-observable behavior versus as an implementation versus a model. The boundary between all of these can be blurry because a sufficiently-explicit model usually *is* an implementation. The models of the type tag I prefer resemble this:

Diagram var_01:

   Variable: [ [name] [tag] [value] ]  // tag-based model

   Variable: [ [name] [value] ]  // tag-free model

But turning it into explicit language/rules/definitions that people agree on, both practitioners and the academic type, can get sticky. The above resembles an implementation model, but we can't assume that one can open up the hood and see the implementation because the same language can be implemented many different ways. A language is not defined by the compiler/interpreter implementation techniques it uses. But we can present a model that resembles an implementation that predicts behavior (input-to-output), and this can be a "tag model".
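
As a rough sketch of how Diagram var_01 could serve as a predictive model rather than a claim about implementation, the two variable layouts can be written out in Python. Everything here is assumed for illustration: the Variable class, the tag names "number" and "string", and the hypothetical plus operator are not taken from any particular language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variable:
    name: str
    value: str                 # the value is held as text in both models
    tag: Optional[str] = None  # None stands for the tag-free model


def plus(left, right):
    """A hypothetical "+" operator whose behaviour the model predicts."""
    if left.tag is not None and right.tag is not None:
        # Tag-based model: the operator consults the tags, not the text.
        if left.tag == "number" and right.tag == "number":
            return str(int(left.value) + int(right.value))
        return left.value + right.value
    # Tag-free model: the operator can only inspect the value itself.
    if left.value.isdigit() and right.value.isdigit():
        return str(int(left.value) + int(right.value))
    return left.value + right.value


tagged = plus(Variable("a", "2", "string"), Variable("b", "2", "string"))
untagged = plus(Variable("a", "2"), Variable("b", "2"))

print(tagged, untagged)
```

The same printed value "2" yields "22" under the tag-based layout and "4" under the tag-free one, which is the kind of input-to-output prediction such a model is meant to make.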

I want to move past "I know X when I see X" with regard to both types and type-tags, but it's an elusive goal for everybody. We can make models that mirror our personal notions (or Cardelli's), but saying it's the "right" model is another level. --top

If the goal is a model rather than a description of implementation, wouldn't the following be sufficient?

   Variable: [ [name] [type] [value] ]       // variables are typed

   Variable: [ [name] [value] ]              // variables are not typed

This permits exactly the same explanations of behaviour as "tag-based model" vs "tag-free model", but without the added complication of having to define "tag".

From a model point of view (as opposed to implementation), it appears that "tag" is an unnecessary synonym for "type".

Hmmm, so it's really the case that TypesAreSideFlags (AKA tags)?
Why would you conclude that? Using the word "spaghetti" as your own personal synonym for "money" doesn't mean spaghetti is money, does it?
They are your words: "it appears that "tag" is an unnecessary synonym for "type" ". Thus, Types == Tags.
Types are much more than just a tag. A type is (simplistically) a set of values and operations on those values. A "tag" -- in the usual meaning of the term -- is an integer or string value, typically used solely as an indicator to distinguish one item making use of a tag from another. Above, you appear to be using "tag" as a synonym for "reference to a type". That's not the same thing as "tag" is-a "type", any more than "pointer to integer" is the same thing as "integer". Sometimes we treat them as the same in discussion -- e.g., for the sake of convenience we say "integer p" when we mean "the integer pointed to by p" -- but that doesn't mean they are the same.
Yes, "sometimes". We need a stronger model than "sometimes"; stronger than the fuzzy language that's often used in practice.

I can only answer that if given a reasonably-clear definition of "type". Sure, "tag" may not be well-defined either, but it doesn't have the baggage of an overloaded history. - t

Perhaps a "tag" can be seen as merely a second value. The built-in operations of the language tend to favor the transparency of one over the other, but that is a "soft" rule. I've shown in TopsTypeDeterminatorChallenge that they are potentially interchangeable: we can use the "type tag" to "hold" information we normally associate with a "value", for example. We could call it the "dual value model" and it would have pretty much the same properties as the other candidates. - t

If "tag" doesn't mean "type", then I assume a variable can have a tag of, say, "3" or "Dave"? Obviously, that is ludicrous. "Tag" only makes sense as a synonym for "type", thus it is redundant.

What is "done with" the tag is up to the language. I observe patterns in their typical actual usage, but that's not rigorous. One can use a wrench to drive in nails, and it works. It's just not as "effective".

Sorry, not following you. Do you know of any programming language where a "type tag" is anything but a type?

How does one know that? How is "is a type" being measured?

How is "is a tag" being measured? "Type" is a well-understood concept in ComputerScience and SoftwareEngineering. "Type tag" (outside of some references to compiler/interpreter implementation) is not.

If it was "well understood", then it wouldn't fail TopsTypeDeterminatorChallenge. You can't automate it because you mistake your feelings for external universal truth.

Once again, you are conflating "understanding" with "computability" (they are unrelated!), and mistakenly believing your personal lack of understanding about types represents some general (and non-existent!) lack of understanding about types among the ComputerScience and SoftwareEngineering communities.

I'm not convinced there's a hard distinction. See ChineseRoomArgument. Further, "understanding" may be inadvertently defined in your head as "uses my favorite model also". Further, TopsTypeDeterminatorChallenge doesn't ask the machine to "understand" something, only label "types" properly and expose the rules/algorithm for all to see. "Understanding" is generally the ability to apply a model to some external activity or object. But that doesn't necessarily mean the model is the "proper" model, unless prediction against some external event is taking place, but you haven't defined what that external thing is. Testing that my head model matches your head model is of little practical use unless it can be codified and externally examined. Arguing about what can't be measured is often futile.

When you said that the tag "resembles a type", your head did some kind of computation to come to that conclusion. Why can't you codify the rules/formulas/algorithms your head used to make that determination when you made that statement? "It feels like a type" is not very useful here. If you don't know why your head did what it did, then we are going to have a very difficult time communicating. - t

{The issue is that you keep equating "well-defined" and "decidable". This has nothing to do with AI, and therefore, the ChineseRoomArgument has nothing to do with this.}
How come it's "decidable" when you do it (classify) in your head, but not with the computer?
{If it's decidable, it can be done with a computer (given enough resources). I.e. that there is an algorithm that will determine whether or not any given thing meets that definition. Well-defined means that it's either true or false that any given thing meets that definition. You keep insisting that well-defined implies decidable and it simply doesn't.}
Are you claiming it's not decidable? If so, then how is your head doing it? What supernatural computational powers or deity does it have access to that computers don't? If it's not decidable, then perhaps it's useless or very limited as a practical definition also because humans can only make a guess since they have finite processing power, just like computers. A decidable model/definition is generally preferred over one that's not because it's usually much easier to test and/or apply in practice.
{As stated, it's underspecified. But, any of the obvious ways of fully specifying the challenge would result in it being undecidable. (Especially since you want it to work on all languages.) As for how my head is doing it? It can't. There are situations where it will fail. That doesn't mean that the question doesn't have an answer, it just means that I can't always provide it.}
Let's skip "all languages" then, and focus on those that resemble the more common ones: Java, C family, JavaScript, Php, VB, Pascal, Python, etc.
{What do you mean by "resemble"? If we drop that word, the answer is clearly that the challenge can be met (if we also allow the challenged party to pick which way to finish specifying the challenge). Since there are only a finite number of languages, we just switch on the language, and solve for each language. Since none of the common languages require knowledge of the runtime state to determine which symbols in the source designates a type, every compiler for that language includes a solution. On the other hand, if a language like Perl but with types is considered to "resemble" a common language, then the answer will be that the challenge can't be met, since that would require solving the halting problem.}

I didn't write 'tag "resembles a type"'. Please read what I wrote. I wrote that '"tag" only makes sense as a synonym for "type", thus it is redundant.' In other words, your model makes "tag" into an alternative term for "type", but "type" is well-known whereas your use of "tag" is not. Therefore, your use of the term "tag" in place of "type" is unnecessary. If you can rigorously show that "tag" is different from "type" but explains all the same behaviours, then you may be onto something.

Yes, "type" is well-known, but also overloaded. Commonality is hardly the only concern. Again, I don't wish to bring something with historical baggage and other distracting complexities into the model. I can define "tag" as a "secondary value" and it's simple and fits into the model. When you tie a simple definition to a complex definition, it's no longer simple because it inherits the complexity of the added complex part. I don't want to foul up the model with drama-laden parts.

What do you mean by "type is overloaded"? What's a "secondary value"? What do you mean by "drama-laden parts"? What drama?

Unless you intend "tag" to mean something other than "type" -- and I'd like to see an example of a language where that's the case -- surely you'll inevitably be talking about types anyway, because that's what tags refer to, no?

As far as I can tell, you are using the term "tag" in precisely the places where a programmer, software engineer, or computer scientist would normally use the term "type" in a familiar, recognisable, and generally unambiguous way. Even beginners to programming quickly understand types via simple examples, like "integer", "string" and "float". What would these be in your model? Integer tags and string tags? How does that make them easier to understand?

Yes, but it's a vague notion, a feeling; the classic "I know it when I see it" due to historical pattern matching, but not rigorously defined in a codified way such that its existence on/in an instance of something can be falsified.
I rather doubt any programmer, software engineer, or computer scientist would regard a type reference as "a vague notion, a feeling". Types are rigorously defined in theoretical terms, but they are informally recognisable in programming languages by even a programming beginner. Types are sufficiently well-defined that a programming language implementer can build parsers and code generators that precisely decide what types shall mean, how they shall be referenced and defined, and where they may be used.
PageAnchor: function_01
It fails "rigorous" for reasons given above. And if each compiler/interpreter decides using its OWN rules (or the maker's rules), that's a local/working definition of types, not a universal one. For example, many programming languages use something called "function" even though they don't fit the strict definition because they allow side-effects. They are labelled in a vague "notiony" sense, not with rigor. - t
I don't know of any language where its type definitions, variable declarations and assignments, etc., are vague and "notiony" and we can reason about them and discuss their behaviour without reference to broader definitions. Just as we can discuss what happens in a programming language's implementation of what it calls a function without concern for a strict definition of "function", we can discuss what happens vis-a-vis a programming language's implementation of types without concern for a strict definition of "type".

I don't see how using the term "tag" makes understanding types and their relationship to language behaviour any clearer. I do see how using the term "tag" makes it less clear: it (at least) complicates a simple notion by adding an unrecognised term and an apparent additional level of indirection in which 'value --> type' becomes 'value --> tag --> type'.

Again, familiarity is not the same as thoroughness. I am not sure what you mean by additional level of indirection. Where are you getting your two example mappings from?

By additional level of indirection, I mean this: You say a value has a tag, but apparently a tag always refers to a type. So: "Value has a tag" can also be written as "value --> tag". Therefore, "value has a tag, and tag always refers to a type" can also be written as "value --> tag --> type". But what does "value --> tag --> type" mean? Apparently, it means "value --> type". Hence, there is an additional level of indirection between "value" and "type" in stating "value --> tag --> type", because "value --> type" means exactly the same thing.

In other words, "type tag" simply seems to be your personal terminology for "type reference".

Without a rigorous definition of "type", one cannot tell. I'll choose simpler atoms over complex and defective ones to build a definition/model around if given a choice. And I never said a value "has a" tag. I consider them distinct in the model. See Diagram var_01. Although, there are different ways to model constants. I choose to model them as anonymous variables for now.

But there seems to be an obvious relationship between "tag" and "type", and it's trivial to identify the type references and type definitions in any programming language, so surely you can define the relationship between "tag" and "type" without having to define "type", no?

If no value "has a" tag, how do you explain why the following fails in C#?

  int x;
  x = "123";

You say ColdFusion discusses the issue. I'm asking can you show two variables in cold fusion that fit one of your above definitions?

ColdFusion behaves as if it has no type-tag (at least for scalars). (An exception may be the new "null" support.) It will never trigger a result in PageAnchor test_53.

Is that written in a specific language, or is it pseudocode? Are there any examples of values for a and b that trigger the type tag detected message?

If ColdFusion doesn't behave like it has type tags, is there any other language that does? I.e., are there two variables and a series of operations that fit one of the above definitions? I've tried for about 90 minutes to do this with C, assembly, Common Lisp and Python and have been unsuccessful.

I'm pretty sure I've encountered such in JavaScript [see below] and Php, I just don't remember the code itself. There was a case where I had to use Php's triple equal sign (===) because a double equal didn't work. Below is an example from ASP.Net -t

Example "hungry":

    ' ASP.Net
    '--------------------------------
    Sub testType()
        Dim a As String
        Dim b As Integer
        '
        a = "123"
        b = 123
        '
        If a.toString = b.toString And Not myFunction(a) = myFunction(b) Then
            Response.Write("Possible type tag detected")
        Else
            Response.Write("Get me a sandwich, I'm hungry.")
        End If
        '
    End Sub
    '----------------------------------
    Function myFunction(byVal thing As Object) As Object
        Return (thing + thing)
    End Function

The result I get is "Possible type tag detected". For type "String", the plus sign appends, but adds arithmetically if it's type Integer. (I've put comment markers on blank lines because of a spacing bug on this wiki.)

Doing this also gives a result:

        Dim a As Object
        Dim b As Object

Thus, the quotes seem to be used to "set" the internal tag.
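
For comparison, roughly the same experiment can be sketched in JavaScript, where variables carry no declared type at all. This is only an illustration, assuming JavaScript's "+" is a fair analogue of the VB "+" used above; the name "detect" is made up for the sketch.

```javascript
// Sketch: the "hungry" test with untyped variables.
// myFunction mirrors the VB version: it applies "+" to its argument.
function myFunction(thing) {
  return thing + thing; // a string is concatenated, a number is added
}

// True when the string forms match but myFunction's results differ.
function detect(a, b) {
  return String(a) === String(b) && myFunction(a) !== myFunction(b);
}

var a = "123"; // quoted literal
var b = 123;   // unquoted literal

console.log(myFunction(a)); // prints 123123 (a string)
console.log(myFunction(b)); // prints 246 (a number)
console.log(detect(a, b));  // prints true
```

With no declarations involved, the only thing that differs between a and b in this sketch is whether the literal was quoted.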

There are no overloaded operators in ColdFusion. There is also no equivalent of Php's "getType()" function. I believe one can trigger a result in Php by replacing the equivalent "myFunction" in Php with:

  $a = "123";
  $b = 123;  // no quotes
  ...
  function myFunction($thing) {    // Php
     return(gettype($thing));
  }

ColdFusion has functions such as "isNumber", but it appears to parse the value, not a type tag. It can't be used to write a "myFunction" that gives a different answer for a and b. Thus, it has no detectable "hidden tag". Placing quotes around the initialization doesn't change any known result.
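
A rough sketch of that claim, written in JavaScript rather than ColdFusion: assume a toy engine in which every scalar is stored as nothing but its character string. The helper names ("store", "isNumber", "add") are hypothetical and are not ColdFusion functions; the point is only that nothing built from such operations can tell a from b.

```javascript
// Toy "tag-free" engine: a stored value is only its character string.
function store(literal) {
  return String(literal); // quotes or no quotes, only the text survives
}

// Parse-based test: it inspects the text, since there is no tag to look at.
function isNumber(stored) {
  return stored.trim() !== "" && Number.isFinite(Number(stored));
}

// Arithmetic parses its operands on every call and stores text again.
function add(x, y) {
  return store(Number(x) + Number(y));
}

var a = store("123"); // initialized with quotes
var b = store(123);   // initialized without quotes

// Both variables now hold identical text, so every operation agrees.
console.log(a === b);                   // prints true
console.log(isNumber(a), isNumber(b));  // prints true true
console.log(add(a, a) === add(b, b));   // prints true
```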

You've shown that ColdFusionLanguage internally represents all values as strings. Its numeric operators must, upon invocation, internally coerce string values to numeric values. PHP associates types with values but not with variables. ASP.Net/VB associates types with values and with variables. I don't see anything here -- in terms of a model -- that is explained by "type tags" in general or a "hidden tag" in particular.

What do you mean by "associates types with variables"? My model can explain (mirror) these behaviors. You have not presented a straightforward alternative that can also.

How does your model provide more information than that conveyed by "associates types with variables"? Are you sure you're not making statements based on your understanding of your model, rather than what you've explicitly stated about your model?

The problem with "types" is described further above. If I've left out some detail, please point it out and I will attempt to correct it. - t

You appear to believe there is some general problem using the term "type", that it is controversial or unclear. I submit that the only controversy or lack of clarity over "type" -- at least in the contexts where you wish to use the term "tag" -- is sustained by you, and you alone.

I'm not convinced that one individual's difficulty with the term "type", in certain contexts, is justification to promote "tag" in its place -- especially as it seems to be, upon examination, nothing more than an alternative term for "type reference".

I haven't seen a decent survey from you. Until you address why you fail TopsTypeDeterminatorChallenge (per above), I do not consider "type" a rigorous and useful term.

You've offered nothing to suggest why your personal difficulty with the definition of "type" should have any bearing on the world outside of you.

The concept is nebulous, or at least has fuzzy boundaries. That's not my fault; I'm just the messenger. "It's a set(s)" is far too open-ended to form a clear-cut model around because almost anything can be modeled as sets. I want a "mechanical" visual model that can be drawn and explained in a discrete way on a whiteboard with boxes and arrows and step numbers etc., not hand-wavy stuff about category theory or human "intent". Your approach is too abstract for regular-Joe programmers (even if it was rigorous, which I doubt because you fail TopsTypeDeterminatorChallenge and try to use non-determinable as an excuse even though your brain allegedly does it in non-infinite time, per above.)

(Or a virtual clerk in a room with pieces of paper representing bytes with labelled wooden bins in which the clerk follows explicit step-by-step instructions when "running" samples involving "type"-related stuff. Those 60's classroom films that attempted to explain "how computers work" with cartoons of munchkins moving around boxes inside a CPU sort of had the right idea.)

Example: JS_05 - JavaScript version of The Test

 testType();
 function testType() {
  var a = "123";
  var b = 123;
  if (String(a)==String(b) && ! (myFunction(a)==myFunction(b))) {
    document.write("Possible type tag detected.");
  } else {
    document.write("Get me a sandwich, I'm hungry.");
  }
 }
 function myFunction(thing) {  
  //alert("test 1:" + typeof(thing));
  return(typeof(thing));
 }

Result: Possible type tag detected.
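
The double-equals versus triple-equals difference mentioned earlier for Php has a close JavaScript counterpart, and it fits the same test pattern. The sketch below is an illustration only; "looseEqual" and "strictEqual" are names invented here.

```javascript
function looseEqual(x, y) { return x == y; }   // converts operands before comparing
function strictEqual(x, y) { return x === y; } // no conversion; operand types must match

var a = "123";
var b = 123;

console.log(looseEqual(a, b));   // prints true
console.log(strictEqual(a, b));  // prints false
console.log(typeof a, typeof b); // prints string number
```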

Types are defined in a type system. Words are defined in a language. Top's insistence that a definition of type be "universal" to be rigorous is so irrational and unreasonable that it is NotEvenWrong.

This isn't saying anything useable or concrete.

Nothing about Top's theory of typetags is useable or concrete. Noting that types are defined by their typesystem IS usable, and concrete. It says: "go find yourself a typesystem before you try to define types, you wanker."

So if I define your digestive system as a "type system", then whatever comes out is "types"?

Sure, if you defined "types" to be "the output of the type system". Definitions are the meaning we give words. That is a simple concept. But you would need to be careful to avoid equivocation in arguments using the words "type system" with another meaning. And your choice to call the digestive system a "type system" would certainly be evidence of sophistry - use of words to confuse and mislead others - if not sheer idiocy.

Testing the validity of definitions shouldn't be dependent on determining human motivation.
The idea of testing "validity" of definitions is fallacious. For definitions, we only test utility (does it make useful distinctions in context?) and acceptability (will people laugh at you and ignore it?).
{Since the only way a definition can be wrong is by not matching the intent of the users, you can't check it any other way. (Although it doesn't have to be human motivation we check against.)}
If and when a clear model appears, THEN we can run a vote on its conceptual match.
[Do you mean clear to you? Or clear to us?]
Clear enough to computerize. If your head can do it in finite time, then so can a computer, so don't pull the "determinable" shit again.
[Can you recognise cats? Can you write a program that can recognise cats? Can you recognise business software? Can you write a program that can recognise business software? Can you recognise loops in programs? Can you write a program that can recognise loops in programs?]
Yes, but it would probably mostly be based on historical pattern matching, which most will likely discard as a canonical definition. It may be possible to produce a geometric formula for cats, but I'd have to give that one more thought. We already beat the dead cat examples to death. I'll see if I can provide a link.
[Historical pattern matching sounds like an excellent starting place for a canonical definition, and please don't devote all your attention to the cat example -- what about business software and loops?]
Probabilistic definitions? Typitivity level?
[Sorry, not following you. "Typitivity"???]
What you describe are generally probabilistic models, not Boolean result models (unless a threshold is used).
[What do you mean by that?]
Similar to what neural nets do: learn based on "training sets". One could, for example, take 100 programming languages, label the "types" or "type indicators" of each language, and "train" neural nets or some statistical process to be able to take a new programming language as input and spit out the parts labelled in such a way that they'd probably closely match what a person would do. But this is generally not considered a good way to build a definition or a model, similar to using circular regression versus a gravity model in AddingEpicycles. Regression can "match" a process but it is usually not modelling it.
[How do you feel that would work with recognising loops, or recognising business software?]
I don't really know what steps my mind makes to ascertain such, at least not enough to claim or produce an examinable formula/algorithm for them. A lot of it may indeed be historical pattern matching. The human mind generally uses a combination of techniques to make judgements. Note that neural nets are lossy in that they don't necessarily remember each instance of the training set. Plus, it may be continuous, some things are more "businessy" or "loopy" than others. For example, smart languages/compilers can run contents of multiple loops in parallel such that they may not really be "loops". Functional/declarative languages also have things that resemble loops but may not really "be" loops. "Loop-like" is perhaps all one can safely say. It's a continuous/probabilistic way that humans generally classify things in their mind. It's only when it becomes an academic problem that we try to find clear, rigorous, and Boolean (yes/no) definitions and then try to turn wishy washy notions into something more systematic and rigorous. Maybe it's a lost cause and the fuzz is "good enough" or the best we can do at the moment.

So a "type system" is whatever the hell a language maker wants it to be?

Yes. But TypeSafety is defined independently of any specific type system. So it's wise to design a "type system" that will be compatible with generic terms like "type safety" (and otherwise support useful comparison of their type system with those of similar languages).

See the "function" analogy above (PageAnchor function_01).

The idea that a definition of types should be "universal" is flawed. TopsTypeDeterminatorChallenge is invalid.

If that's the case, then I have good reason not to directly tie the tag model to it because I want to make it as universal as possible (at least for "common" languages). QED. Thank you for solidifying my argument against tying the model to "type".

It was a mistake for me to focus on "definition". I should have focused on defining a model in order to explain/predict certain behavior associated with "types" in common programming languages. -- top

That sounds reasonable. Life is short -- why waste it worrying about definitions for novices, when there are so many interesting, rewarding, and potentially important discoveries, inventions and innovations to be made in programming, SoftwareEngineering and ComputerScience?

Having and using visual models or visual metaphors has served me relatively well over the years in school and IT. It was always possible to "explain" stacks and B-trees using visual metaphors/cartoons/machines etc. for example. These are useful for communication, "explaining", and predicting the behavior of software. I often form such models in my head and then test them against reality. If they hold up, I keep them; if not, I tune or toss them. They may not even reflect the underlying actual mechanism, but if they predict it well anyhow, it may not matter. I've yet to see something useful for explaining how "types" work in typical programming languages and the kind of oddities and differences demonstrated on this topic and other related topics, other than the tag/flag model. If you can show something better, be my guest... - t

Let me make an EverythingIsRelative statement here. Maybe there is a "proper master model" (PMM) of types that every developer "should" know and use. However, most developers probably don't use this PMM in practice (since its existence is sketchy and cryptic and/or vague). So what are they using instead? Probably some kind of UsefulLie model of their own making (likely a conglomeration of experience-based pattern recognition and "computed" models).

These individual self-rolled models may not be perfect, and this lack of perfection may cause bugs or confusion at times, but the problems are fixed when encountered and the programmer continues on their merry way. Thus, even if the type-tag model isn't "perfect", it can still be a UsefulLie just as good as or better than most other typical UsefulLie typing models floating around out there in existing programmer heads. If you are the Purity-or-Nothing type, this may bother the hell out of you; but so be it. Accept botherdom like a man. -- top

Other than obfuscating the simplicity and obviousness of the notion by adding a level of indirection -- you add "variable -> tag -> type" to what can simply be "variable -> type" -- your "model" is functionally indistinguishable from "a variable has / doesn't have a type".

No, because "type" can be a "conceptual type" without a computer-kept representation. Example:

  foo = 347;

The programmer (reader) may conceptually model this as "having" type "number". But in a tag-free system, there is NOTHING inside the computer that tracks number-ness. It's purely in the reader's head and has no machine-represented counterpart (other than being parse-able as a number). But in a tag-based system, there is a computer-side representation: the tag. The distinction may become important when foo is used along with other variables or operations such that a conversion, implicit or explicit, may be applied. It affects machine behavior (output).
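
One way to picture the distinction, as a sketch only: two hypothetical toy representations of the same assignment, one that keeps a tag beside the value and one that keeps only the text. Neither is claimed to be how any real interpreter works, and the "plus" rule given to each is an assumption picked to make the difference visible.

```javascript
// Tag-based toy: each value carries a side tag next to its text.
function tagged(literal) {
  return { tag: typeof literal, text: String(literal) };
}
// Assumed rule: "+" consults the tags, not the text.
function plusTagged(x, y) {
  if (x.tag === "number" && y.tag === "number") {
    return tagged(Number(x.text) + Number(y.text));
  }
  return tagged(x.text + y.text);
}

// Tag-free toy: only the text is kept.
function untagged(literal) {
  return String(literal);
}
// Assumed rule: "+" parses; it adds if both texts parse as numbers,
// and concatenates otherwise.
function plusUntagged(x, y) {
  var nx = Number(x), ny = Number(y);
  var bothNumeric = x.trim() !== "" && y.trim() !== "" && !isNaN(nx) && !isNaN(ny);
  return bothNumeric ? String(nx + ny) : x + y;
}

console.log(plusTagged(tagged(347), tagged(347)).text);      // prints 694
console.log(plusTagged(tagged("347"), tagged("347")).text);  // prints 347347
console.log(plusUntagged(untagged(347), untagged(347)));     // prints 694
console.log(plusUntagged(untagged("347"), untagged("347"))); // prints 694
```

In the tagged toy the quotes change the output; in the untagged toy they cannot, because nothing records them.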

It is usually a UsefulLie for programmers to "track" intended type-ness in their head when reading and writing programs, but it does not necessarily have to have a computer-model-side counterpart to get results that are reasonably close to what the programmer intended based on "running" an explicit type model in their head.

You appear to be saying, in your example, that "a variable doesn't have a type." The alternative -- which you describe as having a "tag" -- is that "a variable has a type."

In most programmers' minds, "can be usable as type X" is nearly synonymous with "has type X". Thus, one could say that foo "has the properties of a number type" or "satisfies the requirements for being a number". This is largely because the practical, observable behavior of tag and non-tag languages is identical in most cases. The observable difference is usually subtle and the situational "fixes" cover up the differences. Thus, your suggestion would be confusing to a good many programmers.

In other words, "has the properties of a number" and "is a number" is pretty much the same thing in most programmers' heads, and your approach doesn't help clean up the difference between the two. "Has the properties of X" can be seen as or mistaken for the same thing as "is X". In fact, tag-free languages have functions similar to "isNumber()" to test for number-ness. But there is no computer-side representation of "number". (It may pass other isX() tests also at same time, unlike most tag-based systems.)
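
The parenthetical about passing several isX() tests at once can be sketched as follows. These predicates are hypothetical parse-based checks written for illustration; they are not the built-in functions of any particular language.

```javascript
// Parse-based predicates over plain text: each asks "could this be read as X?"
function isNumber(text) {
  return text.trim() !== "" && Number.isFinite(Number(text));
}
function isBoolean(text) {
  return ["true", "false", "yes", "no", "1", "0"].includes(text.toLowerCase());
}
function isInteger(text) {
  return /^[+-]?\d+$/.test(text);
}

// Returns the names of all tests the text passes.
function passes(text) {
  var checks = { isNumber: isNumber, isBoolean: isBoolean, isInteger: isInteger };
  return Object.keys(checks).filter(function (name) {
    return checks[name](text);
  });
}

console.log(passes("1"));   // prints [ 'isNumber', 'isBoolean', 'isInteger' ]
console.log(passes("3.5")); // prints [ 'isNumber' ]
console.log(passes("yes")); // prints [ 'isBoolean' ]

// A tag-based value, by contrast, reports exactly one typeof.
console.log(typeof "1");    // prints string
```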

(Earlier in my career I didn't really care or ponder much about the difference either; I just coded, got the program working, fixed any discovered bugs, and continued on. Over time I grew more curious and did more experiments to isolate the differences between the two. It's kind of like when you get a new car: you are happy just to figure out where the basics are so that you can drive to work on time. Over time you start to explore the radio short-cuts and "funny knobs" some more. I wanted a concrete head model that always predicted the output properly; and got the tag model as the product of this.)

Why do you think my "suggestion would be confusing to a good many programmers", given it's what the majority of programmers learn (very early!) in university?

No, they don't. They usually encounter the concept of a type-tag in, say, a compiler class, but other than its possible existence don't give it any further thought. Maybe it's covered in an "interpreter" chapter or class, but most seem to forget about such if it is. A lot of students think, "I want to be an application programmer, not a SystemsSoftware programmer", and so will just get a passing grade in such a class. If you look at the grades, most students get a "C" if the school has balanced grading. My degree did NOT formally cover interpreters, although I didn't choose the SystemsSoftware sub-specialty. They had about 5 sub-specialties IIRC such that only about 20% of CS grads would have covered interpreters thoroughly.

And even if they did, it's still useful to make a distinction between "having a type" and "having a tag" because one may not know if one is talking about their head model or the interpreter's guts. Often when talking to other programmers or working out designs, one tends to think at a higher abstraction level than perhaps what the interpreter can provide such that it's best to be clear.

You appear to be conflating understanding of language behaviour -- which is what the majority of programmers learn (very early!) in university -- with language implementation. When learning about the former, budding developers typically learn that "a variable has a type" or "a variable does not have a type" when they encounter their first dynamically-typed language after having learned a statically-typed language, or vice versa. Language implementation is covered in compiler and interpreter construction courses, where implementation details -- like type tags -- are explored in technical detail without the need for trivial models, and typically when students already have a good grounding in types and type systems in various languages and paradigms.

Re: "You appear to be conflating understanding of language behaviour -- which is what the majority of programmers learn (very early!) in university -- with language implementation."

No, you are, and that's the problem. And sometimes it takes experience before one realizes the impact of such distinctions. Most students have a lot going on in their head at the same time, trying to absorb so much in such short periods. Subtlety often escapes them.

What makes you think I'm conflating them? And what is the intended audience of your "tag model"?

Those who care about the subtly of dynamic types.

To the extent that there is "subtly [sic -- I assume you mean "subtlety"] of dynamic types", it is entirely (and quite trivially) explained by "a variable has a type" or "a variable does not have a type", along with a few simple examples. There's no need to complicate it by inserting a "tag" between "variable" and "type".

Gee, I failed to recognize the subtlety between subtly and subtlety there.

And again, what makes you think that most students had an interpreter course, let alone one that covers AND contrasts tag and non-tag implementations? I have no reason to think my university experience is unique.

Where did I say I thought most students had an interpreter course? I wrote that budding developers typically learn that "a variable has a type" or "a variable does not have a type" when they encounter their first dynamically-typed language after having learned a statically-typed language, or vice versa. They learn about type tags if they do a compiler/interpreter course; such students are well beyond needing simplistic models of, or analogies for, types.

I already explained why "has a type" is ambiguous. And I estimated above that the majority of students don't have enough exposure to interpreter implementation to readily consider the difference in the real world. My explanations look perfectly good to me, I don't know why you are rejecting them. If a sub-assumption is wrong, then please point it out specifically at the very spot/word/phrase of fault itself rather than a summarized rejection. I cannot process an overly-summarized rejection.

In the eleven uses of "ambiguous" on this page, none appear to explain why "has a type" is ambiguous. Students require no exposure to interpreter implementation; they need only see the difference in behaviour between one statically typed language like C# and one dynamically typed language like PHP or Python, and for that a notion of "has a type" in relation to variables is sufficient. Introducing "has a tag" as a substitute for "has a type" adds nothing but complexity and irrelevancy.

Nope, because tag-free dynamic languages like Perl and CF behave differently from tag-based dynamic languages like Php and JavaScript. There are different kinds of dynamic typing. Haven't we been over this already? (I'm not sure I'd call C# "static". It's kind of a hybrid because it allows "object" types that can morph into or be treated as the other base types during run-time.)

You have no decent model that can explain the difference.

The different kinds of dynamic typing are trivially explained by values and variables having or not having types. Again, introducing "has a tag" as a substitute for "has a type" adds nothing but complexity and irrelevancy -- at least until the student is prepared to understand compiler/interpreter implementation, at which point analogies are unnecessary because students are ready to appreciate technical reality. (And, yes, C# supports 'object' as an alias for Object, which is the base type for all other types. It's still statically typed, and there is no "morph" (whatever that might be) involved.)

"Having" is confusing, per above. We are going around in circles again. It seems we are using different assumptions about the minds of programmers, and these assumptions are not readily empirically testable here, so we are stuck at an impasse. (I disagree with your characterization of C#, but will save that debate for another day.)

Beware of conflating your personal "having" difficulties with a general confusion. (C# supports class inheritance, and in languages with class inheritance it is typical to be able to assign values of type T' to variables of type T when T' is inherited from T, because T' is-a T. In this case, T is type Object and T' is every other type. The same is true of Java for all but the primitive built-in types. Both are statically-typed languages.)

Perhaps you are doing the opposite conflating. All we both have are anecdotes and counter-anecdotes. You or somebody similar have suggested elsewhere that you are not very interested in the psychology of "ordinary" programmers because it hinders your goal of "learning from the best". Well, I am interested in the psychology of "ordinary" programmers and do pay attention to it. Perhaps my observations have some flaws, but that's probably true of anybody who doesn't have a secret shortcut to observation.

What does "the opposite conflating" mean here?

Note that somebody may be able to go a good long time in CF and Perl without ever realizing they have no detectable type tag (depending on programming habits). (Note that in C# one does not have to use the static types.)

If you can go a long time in CF and Perl without realising they have "no detectable type tag" -- by which I presume you mean they're dynamically typed, variables do not have types and values are always represented as character strings -- then this is a bit of a non-issue, isn't it?

No, because it's different than how say Php or JS would do it. It is an issue, but again often subtle in terms of results. If you mean it's a "non-issue" because it is subtle, then I will partly agree by calling it a "relatively small issue". But, it is still useful knowledge to know the difference when debugging, trouble-shooting, and preventing flaws.
That's because PHP and Javascript are dynamically typed, variables do not have types and values have various types.
Please elaborate. "Have type" is ambiguous for reasons already explained. Regardless of how you or I label/classify stuff, the two classes of dynamic languages act different. Call the difference floo, zog, or flurddle, it's still a concrete difference.
"Have type" appears to be ambiguous to you, but not in general. Some dynamic languages represent values using various types such as string, integer, float, etc. Others represent all values as strings of characters. This choice of representation affects how the language behaves.
You've done no rigorous study of ambiguity in the population. Again again, it's anecdote versus anecdote. There's no need to repeat that.
There's no need to do a rigorous study in order to determine that "has a type" is generally understood, because it's a common phrase in informal descriptions of type systems. Unless, of course, your claim is that no one understands informal descriptions of type systems, but that can't be true or there wouldn't be any programmers.
- Yes, "informal", yes yes. And in some places that lack of formality creates confusion. A general fuzzy notion is good enough most of the time, but not ALL the time.
- If you wish to provide a formal definition of "type tag", that might be worthwhile. Prior attempts to get you to do that have met with resistance.
- I spent a fat topic trying to define it. That's not "resistance"; your bias is tainting your memory. I admit I failed to provide a good formal definition after my try, but it's moot here, it only has to be BETTER than yours in order to be better than yours (in terms of communication to reg developers.)
- You didn't try to define it formally, and our attempts to get you to define it formally -- I think I gave an example of the formal definition of "function" as an illustrative starting point -- resulted in quibbling. You attempted to define it verbally. If lack of formality creates confusion...
- It's a model to help predict program behavior. One doesn't need a formal definition for that. And yes, lack of formality can contribute to or cause confusion, but so can lots of other things, and those other things may be a bigger problem. And note that formality is not necessarily a requirement for "understanding" a tool well enough to use it effectively. Formality is nice to have if it's available, but is not a necessary ingredient.
- As a model, it's redundant. Your model consists entirely of injecting an extraneous "tag" into the relationship between variable and type, i.e., "variable -> tag -> type" where "variable -> type" is sufficient. There is nothing that "tag" adds to "type".
- You don't explain what this "-> type" symbol really means or how it affects processing. Typing slashes and arrows on the screen doesn't fix the problem: they just float there by themselves and don't do anything or react to anything.
- It's a notation for "has" or "associated with" or "relates to". That should be obvious from the context, and it avoids quibbles over terminology.
- "Relates to" and "associated with"? That just drips with clarity. Note that in non-tag langs, the property of "numberness" is "associated with" a variable/value if it can be used AS a number. It's just not "explicitly" associated with it, and I use "explicitly" in a notiony sense.
- It's an abstraction. Further specification adds no clarity and potentially introduces inaccuracy.
- In non-tag languages, numberness can be "associated with" (your words) variables (or their content). It's just that it's "done" differently than tag langs.
- The usual way to describe "numberness" is "numeric type". The usual way to describe a variable's content is a value. So, what you mean is that types are associated with values and can be associated with variables. I'm not sure what you mean by "'done' differently than tag langs", though whatever it is can be described in terms of variables, values and types and the relationships between them.
- I thought you said that tag-free languages don't "have" types. You complained that tag was synonymous with "type" and therefore redundant. But this appears to contradict that claim because number-ness can be determined withOUT a tag. (Let's just consider the scalar level right now.)
- I don't recall saying "tag-free languages don't 'have' types", and it seems unlikely that I would. That sounds like a misinterpretation or misunderstanding of something I wrote. Can you show me where I wrote that?
- You are thus saying that CF "has" types (where CF uses only parse-based-typing), yet you say "tag" in my model is synonymous with "type". But the lack of a tag in my model "state" for CF modeling is what explains the difference in output between the two "kinds" of dynamic languages. But if I follow what you said above, then there would be no difference in the state of the models under parse-based and tag-based langs, meaning it won't explain the output diff. Identical model states will produce identical results, which contradicts the reality of output of the thing being modeled. We want our models to match reality, no?
- The "lack of a tag" in your model is equivalently explained, in precisely the same case and exactly the same results, by the phrase "lack of association with a type."
  - Wrong! There is the potential association "can be interpreted/used as type X", which doesn't distinguish problem cases. You agreed that that association may be called a "type"; but permitting it breaks your model's forecasting ability.
  - Sorry, not following you here. Are you claiming there are cases where "tag" is not equivalent to "type", or at least not equivalent to "reference to type"? If so, can you give an example?
- Note that it may be possible to model parse-based typing via a tag(s), but there are at least three problems. First, it's possible that it may "have" more than one "type" at the same time, such as "number" AND "string", because parsing successfully as an X does not preclude ALSO parsing successfully as a Y, meaning your model would need to handle multiple tags per variable. Second, it's superfluous in that it's not necessary to model the behavior of such langs (but is necessary for tag-based languages). Third, it doesn't emphasize the difference between both language kinds. The difference has to be stuffed into ugly little specific rules of the model. Occam's razor says toss it. (At least toss it if explaining/modeling type-related behavior is your primary goal of using the model. I agree that different models may illustrate different things better. One model to explain/illustrate/predict everything well may be unrealistic.)
- The "model" that predicts everything is to describe the behaviour of a language, vis-a-vis types, in terms of relationships between types, values and variables. No further elements are needed.
  - You haven't demonstrated that it unambiguously predicts behavior in the field. Vocabulary is secondary, I want an unambiguous model that is presentable in the field. You have fuzz. I don't care what you call the parts, just make the fsckers WORK! Get it working first, and THEN we'll tune the naming.
  - It unambiguously predicts behaviour in the field by definition, because they're precisely the elements used to define behaviour in languages.
  - You have not illustrated this with sufficient detail.
  - Perhaps that's because I haven't illustrated how the relationships between types, values and variables determine aspects of language behaviour. Behaviour varies on a language-by-language basis, though we can generally categorise behavioural characteristics into "dynamically-typed" vs "statically-typed", and we can provide further language-specific descriptions from there. The important point is that types, values and variables -- and the relationships between them -- are sufficient to describe language behaviour when we want to do so; we do not need additional mechanisms to describe the behaviour of any language.
  - I'll believe it when I see it in concrete form of a clear model that uses them, not fuzzy talk.
- You have not explained how your model works in a clear way. The relationships, separation of "parts", and points in time of impact are foggy. It's a big cotton-candy & fog orgy. With mine the difference is clear-cut: with a tag in place, you get one answer, and with the tag not in place, you get a different answer because there is simply NO tag around to provide the info available in the tagged version. It's simply there or not there; there's no half there or connected with fuzzy wires or have semi-transparent walls that let some unspecified things in but not other unspecified things in. You have no clear visual nor step-by-step counter-part, at least without fuzzy membranes between parts. Maybe it's clear in your own head, but you've failed to turn your stuff into something clear external to your head.
- How it works depends on the language, but the relationships, separation of "parts" etc., are nothing more than familiar language semantics learned by every programmer.
- As already described, "types" is ambiguous in practice such that it makes a poor model building block as-is. I am trying to present a model that provides unambiguous results. I am prioritizing accuracy of forecasting programming behavior above that of vocabulary reuse in this particular model.
- But you're inevitably forced to explain some relationship between "type" -- which is familiar to every programmer -- and "tag" -- which is new. What will your explanation be?
- Familiarity with vague concepts is not sufficient to differentiate, at least in this case. I'm familiar with "country music", but that's not sufficient to have clear and unambiguous classification rules for each and every song.
- You still haven't answered the question. You're inevitably forced to explain some relationship between "type" -- which is familiar to every programmer -- and "tag" -- which is new. What will your explanation be?
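The remark in the list above, that parsing successfully as an X does not preclude also parsing successfully as a Y, can be sketched in JavaScript. This is only an illustration under assumptions: `parseCategories` and its patterns are hypothetical and are not taken from CF, Perl, or any real interpreter.

```javascript
// Hypothetical parse-based classifier: it reports every category whose
// syntactic profile the text fits. Names and patterns are illustrative only.
function parseCategories(text) {
  const categories = ["string"]; // any character sequence qualifies as a string
  if (/^[+-]?\d+(\.\d+)?$/.test(text)) categories.push("number");
  if (/^[+-]?\d+$/.test(text)) categories.push("integer");
  if (/^(true|false)$/i.test(text)) categories.push("boolean");
  return categories;
}

console.log(parseCategories("123").join(","));  // string,number,integer
console.log(parseCategories("1.5").join(","));  // string,number
console.log(parseCategories("abc").join(","));  // string
```

Under this sketch, "123" fits three profiles at once, so a single tag per variable could not record the outcome; whether that counts for or against either side of the debate is left open.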
CF and Perl could in theory "compress" numbers under the hood to something similar to a Double and nobody would be able to tell the difference in terms of output. We don't even know that Php or JS don't store numbers as strings (or vice versa) under the hood (when possible). We couldn't tell without dissecting the run-time memory image or interpreter source code. Thus "choice of representation" is NOT a defining difference because the rep could be swapped and programs would run identically (other than speed/RAM issues). What I offer is a model that has prediction ability, but does not intend/guarantee to accurately model actual implementation. But it would work in the swapped implementation scenario because of this. - t
Note that in the compression scenario, Php or JS would forgo number compression if "a=001;" was given instead of "a=1;" because the output wouldn't be faithful to the original. (Actually JS interprets stated values starting with "0" as Octal, I think, but let's ignore that quirk for the example.)
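A minimal sketch of the faithfulness test implied by the compression scenario, assuming the rule is "store as a number only if converting back yields the original characters". The function name `canStoreAsNumber` is hypothetical. `Number()` applied to a string reads "001" as decimal 1, so the octal quirk of source literals does not arise here.

```javascript
// Hypothetical rule: a literal may be kept as a number internally only if
// converting it back to text reproduces the original characters exactly.
function canStoreAsNumber(literalText) {
  const asNumber = Number(literalText);
  return Number.isFinite(asNumber) && String(asNumber) === literalText;
}

console.log(canStoreAsNumber("1"));    // true
console.log(canStoreAsNumber("001"));  // false: would print back as "1"
console.log(canStoreAsNumber("abc"));  // false: not a number at all
```

If such a check held for every stored value, the output would be the same whichever representation was used, which is the sense in which the scenario treats representation as unobservable.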
Internal representations are irrelevant. I'm referring to how the language user perceives the language.
"Perceives", yes yes, WetWare, yes yes!
Don't read too much into it. Perceiving a value declared as 'int' as an integer as opposed to (say) a string is what I'm talking about. This is a straightforward technical matter, not a psychological one.
No, it's not straightforward in relevant cases, as already explained. Remember the talk about "can be used as type X" versus "is a" type X?
It's straightforward in all cases. Lack of clarity appears only when there isn't a good understanding of the distinction between types, values, and variables, and the "tag model" certainly doesn't help clarify it.
They are ambiguous concepts, at least to non-PhD's. Somebody else even agreed that each individual language gets to define "types" how it sees fit.
They are not ambiguous, even to beginning programmers. Individual languages do get to define "types" how they see fit -- under the definition that "types" are attributes of a program that can be checked without running the program -- but that doesn't mean they're difficult to distinguish from variables and values. We've been over this before -- you were unable to provide an example of a language where values, variables and types are indistinct. That's probably because there isn't one.
Why do I have to prove they are indistinct? What does "distinct" mean anyhow in cyberland? They are not physical objects but abstract notions in WetWare. My approach focuses on behavior (output), not so much vocabulary. Something that prints "7" will still print "7" regardless of what the hell we call it. You are trying to turn this into a vocabulary battle when vocabulary is mostly moot, other than general "feely" notions. And please clarify "checked without running the program". That's static types, not all types.
It's all types, though admittedly I could have been clearer. See http://en.wikipedia.org/wiki/Type_systems for a better description. In short, type checking is about verifying the consistency or correctness of interconnected parts of a program before execution. This can occur at run-time, such that it occurs immediately before execution rather than in a separate compilation phase.
CF has optional "type checked" function parameters. The attribute even uses the word "type". They are just checked a bit differently than tag-based dynamic langs. Thus your notion that CF does not "have" types appears to contradict your statement.
"Type checked" function parameters are variables associated with types. See above where I wrote "I don't know about CF ..."
"Associated with" doesn't tell me anything other than there's some unnamed relationship. I don't see your stated CF reference above. Essentially what CF does (or acts like) is parse-validate the byte string to see if it fits the syntactic profile of "number". Either way, "numberness" is a determinable attribute of the contents of the variable (passed "value"). Tag-based languages will also do this, but only if the "tag" says it's something other than a number.
Exactly -- there's some unnamed relationship. That is precisely correct, and what you call "numberness" is a numeric type. If you search for "I don't know about CF" you'll find my reference, which is actually below. Sorry for the inadvertent misdirection.
Which then doesn't differentiate as already explained multiple repeated times. I give up; you are either dense or have the articulation skills of a sick yak. Arrrrrrrgggggggg *(*%*@#%(@(*&#

C# is considered statically typed because variable types are known at compile-time. If you define a variable of type Object it is known to be of type Object at compile-time. At run-time, it can only be assigned values of type Object or values inherited from Object -- which are, by definition, all of type Object -- but this is checked at compile-time.

If you only use "Object" types and every variable is the dynamic type, then the fact that it statically checks that everything is dynamic is mostly a UselessTruth. It can behave as a dynamic language.

A variable of type Object is not a dynamic type. It's a static type Object.

A UselessTruth at best. It's the same fricken behavior as a dynamic language. CF and Perl could be said to have a static type: "flex-string".

No, a variable of type Object -- in a statically typed language -- is always of type Object. It does not change, and is known at compile-time. I don't know about CF, but Perl variables can have one of three possible types -- scalar, array and hash -- and these are only known at run-time and a variable can, as I recall, change type.

Type Object in C# could change/be all those also during run-time.

No, the value the variable contains could be any subclass of Object, but the type of the variable is strictly Object.

Sure, but it's as open-ended as a dynamic language at that point. Outside of word games, it's the same behavior.

That's hardly "open-ended", and it doesn't change the fact that C# is a statically typed language.

No, it's a language that has static types and has dynamic types: I.E. a hybrid.

That's not in accordance with the usual definitions. All subtypes T' of a given type T may be assigned to variables of type T, but the hierarchy and the variables' types are always static in C#. They are not dynamic; they cannot be changed at runtime. In C#, at runtime you cannot redefine a variable V to be of type T now and T' later. The declared type of V is checked at compile time and remains throughout runtime. It is, therefore, considered statically typed.

How does one empirically verify the "type" cannot be changed during run-time?

Write a C# program where you attempt to redefine the type of a variable. E.g.:

 int myvar;
 string myvar;

The compiler won't let you do it, so you obviously can't do it at run-time if it won't even compile.

That's because it's duplicate declarations; it's not a "type" problem.

There is no mechanism to change the type of a variable, so it can't occur at runtime.

 Object foo;  // declare foo as type Object
 foo = 7;
 foo = "blah";

Is this "changing type"?

No. It's assigning various values of type Object to foo. The variable foo is always type Object.

Not.

 Object foo;
 foo = 7;
 System.Console.WriteLine("T1: " + foo.GetType()); //T1: System.Int32
 foo = "blah";
 System.Console.WriteLine("T2: " + foo.GetType()); //T2: System.String

C-sharp has two kinds of types, compile-time types (seen with "typeof") and run-time types. We can model this with two tags: one tag that's locked at run-time and one that isn't.

No, you're simply printing the type of the value stored in foo, not the type of foo. The type of variable foo and the types of values 7 and "blah" are strictly static, known at compile time, and immutable. The only run-time change is the value stored in foo.

You are inventing vocabulary here or using self-rolled head models. When is a type stored "with" a value versus "with" a variable and how does one know the difference?

{What he said is correct using the usual semantics of the words he chose. As for your question, that's up to the language and the implementation. In the example above, the implementation could compile that fragment to the equivalent of

       System.Console.WriteLine("T1: System.Int32\nT2: System.String");

this would, obviously, store nothing with either the values or the variable as all three of them have been optimized away. In C#, the type of a variable is a property of that variable, and every variable has a type, whether or not that type is stored with that variable. Similarly, the type of a value is a property of that value, and every value has a type, whether or not that type is stored with that value.}

I'm simply reiterating familiar C# semantics. However, it's trivial to demonstrate:

Note that 'Object foo' declares a variable foo which may be assigned values of type Object. By the C# inheritance model, values of type Object include values of types inherited from Object. In general, a variable of type T may only be assigned values of type T or values of type T', where T' is a subclass of T. E.g.:

 // Base class T
 class T {
   ...
 };

 // T_prime is a subclass of T.
 class T_prime : T {
 };

 // foo is of type 'T'
 T foo;

For the sake of argument, let's assume an assignment to a variable V changes the type of V. E.g.:

 // Hypothetically, foo should be of type T_prime
 foo = new T_prime();

It should then still be true -- by virtue of well-understood C# semantics -- that a V of type T may only be assigned values of type T or values of type T', where T' is a subclass of T.

So, if we assigned V a value of type T' and it changed the type of V, we should not subsequently be able to re-assign V a value of type T, because we can only assign values to a variable that are of its class or its subclasses, and T is not a subclass of T'. In other words, the following should fail because T is not T_prime and T is not a subtype of T_prime:

 foo = new T();

Of course, we are not so restricted; the above is permitted. We can assign V any value of class T or any subclass of T, regardless of prior assignments. Therefore, the type of V must consistently be T.

Couldn't one declare that all variables in a dynamically-typed language are subclasses of some flexible "god type", and therefore statically typed? That's essentially what C# can be, except that there are other optional types besides the God Type (Object) to choose from.

In a typical statically-typed language, the declared type of a variable remains unchanged -- and cannot be changed -- for the lifetime of the program. In a typical dynamically-typed language, the declared type of a variable may be changed at runtime.

No, the God Type in the dynamic language is the one and only type from birth to death. Anything else that resembles a type is merely something else or the type of the value, not the variable just like your rejection of the "getType" method above.

Sorry, you've lost me here. "God Type"??? "Anything else that resembles a type is merely something else ..."??? Wat?

Type Object in C# is essentially a God Type: it can be integer, string, Dictionary, etc.

You mean Object is a base class? I've never heard that described as a "God Type" before. In C#, integer, string, Dictionary, etc., are all subclasses of Object and can therefore be assigned to variables of type Object. An integer type is an Object type, but an Object type is not an integer type. You can assign an integer value to a variable of type Object, but you can't assign an Object value to a variable of type integer.

That's pretty much how tag-based dynamic languages work, regardless of what label you glue on it. You can call it subclassing or plagnorffing, but it's pretty much the same result.

C# is a statically-typed language, regardless.

If it can be made to resemble a dynamic language almost perfectly, then perhaps the existing classification needs a fix.

No. Given a variable V, in a dynamically-typed language, we can unconditionally assign values of any type at any time. In a statically-typed language like C#, we can only assign values to V that are of its declared type and its subtypes. For example, a typical dynamically-typed language allows:

 var v = "blah";
 v = 123;

C# does not allow the following, failing to compile on the second line:

 var v = "blah";
 v = 123;

This is because we can only assign values to v that are of its declared type and its subtypes. var v = "blah" implicitly declares that v is of type string. 123, an integer, is not a string or a subtype of string.

A more comparable example would be:

 Object v = "blah";
 v = 123;

Which is allowed.
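For the dynamically-typed half of the comparison, here is a small JavaScript sketch of the two-line example. It assumes nothing beyond the standard `typeof` operator; the names `typeBefore` and `typeAfter` are introduced only for illustration.

```javascript
// No type is declared for v; typeof reports on whatever value v holds now.
let v = "blah";
const typeBefore = typeof v;  // "string"
v = 123;                      // accepted with no complaint
const typeAfter = typeof v;   // "number"

console.log(typeBefore, typeAfter);
```

Whether this is described as the variable changing type or as the variable holding values of different types is the very point under dispute; the sketch only shows what is observable.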

Candidate #3

   A type tag is an observable trait of a variable or constant that cannot be 
   determined by examining the canonical string representation of the content
   of said variable or constant (usually via a Print- or Write-like statement).

Discussion

To use our specimen languages, there are no tags in Perl and CF (at least when used as scalars) because there are no (known) observable traits that cannot be determined by examining the output (string representation). However, in Php and JavaScript, there are such traits (as illustrated by prior examples). - t
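As a concrete instance for one of the specimen languages, the test_53 probe can be written in JavaScript roughly as follows. The body chosen for `myPureFunction` is an assumption made for illustration; any pure function that is sensitive to the difference would serve.

```javascript
const a = 123;
const b = "123";

// Hypothetical stand-in for myPureFunction from test_53.
function myPureFunction(x) {
  return x + 1; // numeric addition for a, string concatenation for b
}

const sameText = String(a) === String(b);                    // true
const sameResult = myPureFunction(a) === myPureFunction(b);  // false: 124 vs "1231"

if (sameText && !sameResult) {
  console.log("Possible type tag detected.");
}
```

The two values print identically, yet the function tells them apart, which is the observable trait that Candidate #3 is trying to name.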

That appears to be a definition by incomplete exclusion, which is a bit awkward -- like saying "a frog is a creature that can be identified by its legs but not by counting them." You've said a "type tag is an observable trait ... that cannot be determined by examining the canonical string representation". How, then, do you observe that trait? Are you saying that a "type tag" exists where two values have the same canonical string representation but behave differently? Isn't that more trivially stated that the variables (do you mean values? variables don't have string representations) or constants have different types but the same string representation? It appears, from the above, that Perl and CF represent scalar values as a string type whereas PHP and Javascript represent scalar values using various types.

I already gave coded examples of how such traits can be isolated. I have since added "content" to clarify the "variable" issue. I'm not going to use the word "type" because it's either ambiguous or not determinable by practical means (as our long debates over it have shown). Your use of "represent" is also questionable. Representation or its impact would have to be observable in some way to be objective or stable enough for a definition. I agree the exclusion approach is awkward, but tags can potentially store or do a lot of stuff.

But "type" is familiar, even to beginning programmers. "Tag" is not. How would you explain the relationship between "tag" and "type"?

But it's too ambiguous for reasons already described above. I'm frustrated that I have to keep pointing this out.

You keep saying it, but you don't provide an argument to defend it or evidence to support it, and it appears to be untrue because "type" is familiar even to beginning programmers and apparently well understood, given the number of programs successfully written that use it. Once again, how would you explain the relationship between "tag" and "type"?

I explained all that already. Jeeeez. I fucking give the hell up! If somebody else wants to try to explain in clear terms the position of either side, I welcome such attempts.

{I can give it a try. His position is "Your definition isn't clear. In particular, what do you mean by 'observable traits of a variable' and how does this relate to the (commonly used and well-understood) term 'type'." Your position appears to be, "I understand what I said." Hope that helps.}

I can't speak for Top, but that nicely summarises my side.

Test_53 is an example of "observing traits of variables". Printing "typeOf()" is another. We probe and test it like a scientist does with a new species of animal. How does Species A act in box 7 compared to how Species B acts? And I've already explained difficulties with existing general notions of "types". I wish to avoid using "type" for now because language is poisoning this whole debate and I just wish to only describe observable traits from now on rather than "probe head notions" further.

{What's needed is a definition, not more examples.}

You cannot produce a decent definition of "types" (usable by regular devs), so why would you expect it's reasonable to produce a decent definition of "tags"? A model may be a more obtainable goal.

{The definition of "type", as used by regular developers, has been given to you repeatedly. So why do you think we can't produce it? (And I thought you didn't want to talk about "types" right now.)}

I poked holes in it, but you pretend like the holes don't exist.

{In order to poke holes in it, you have to do something other than say it's vague. To date, that's all you've done.}

One cannot poke holes in a cloud.

{So you agree, you haven't poked holes in it. I'm glad that's cleared up.}

The Fligmook Experiment

I believe language is screwing things up here. How about this experiment: explain the differences in the two "kinds" of dynamic languages withOUT using existing language terms such as type, value, variable, etc. Call them Flig, Mook, Zog, etc. or whatever. Just make sure that whatever rules and terms your model uses are clear and self-standing. None of this, "programmers already know what X's are" stuff. After you successfully demonstrate your model can explain the differences on its own, THEN you can go back and assign Flig, Mook, etc. to common IT terms. Are you up to it? - t

In some dynamically-typed languages, every flook is of splork fleem. In other dynamically-typed languages, a flook can be of any splork. In the former, certain fizzles turn fleems into forgles or noofs as appropriate. In the latter, the appropriate fizzle is chosen based on the splork. Does that help?

How do we objectively measure/observe "is of"? (Don't forget the problem of "can be transformed to and/or viewed as" versus IS-A hierarchies. The first does not require any hierarchy.)

It's essentially synonymous with "associated with" or "has a property of".

"Associated with" or "has a property of" is still vague. It only means that there is SOME relationship, but doesn't say if that relationship matters to anything we care about here (output). And how is "can be" verified? Under what conditions is be-ness true or false? "As appropriate" and "based on" need metrics also.

"Associated with" or "has a property of" is as precise as the "has a" in "has a tag", and is a sufficient basis for describing program behaviour in terms of a variable or value having or being associated with a type.

No. For one, it fails the "parse issue", below.

  <!--- ColdFusion Example CF002 --->
  <cfset a=123>
  <cfset foo(a)>
  <cfset b="123">
  <cfset foo(b)>  
  ...
  <cffunction name="foo">
     <cfargument name="p" type="numeric">
     ...
  </cffunction>

Equivalent C-ish style would be:

  a=123;
  foo(a);
  b="123";
  foo(b);
  ...
  function foo(number p) {
     ...
  }

Both function calls run successfully, meaning they both "pass" the type="numeric" parameter test ("abc" would produce an error). Remember, our working assumption is that CF uses parse-based verification at call time under the hood. You don't need to know CF to agree with that working assumption. If it's wrong in reality, it does not matter for the scope of this example. I'm dictating "how it works" for this example.
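The working assumption for CF002, parse-based verification at call time, can be sketched in JavaScript as below. This shows only the assumed behaviour, not how ColdFusion is actually implemented; `looksNumeric` and its pattern are hypothetical.

```javascript
// Hypothetical gatekeeper: it inspects the text form of the argument and
// leaves no marker on the value. The pattern is illustrative only.
function looksNumeric(value) {
  return /^\s*[+-]?(\d+\.?\d*|\.\d+)\s*$/.test(String(value));
}

function foo(p) {
  if (!looksNumeric(p)) {
    throw new TypeError("p is not numeric: " + p);
  }
  return "accepted " + p;
}

console.log(foo(123));    // accepted 123
console.log(foo("123"));  // accepted 123
try {
  foo("abc");
} catch (e) {
  console.log(e.message); // p is not numeric: abc
}
```

Both calls pass for the same reason: the check looks only at the characters, so the quotes around 123 in the source make no difference to the outcome.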

What is this intended to illustrate?

Let's try again:

1. Tag-free languages (TFL) don't have a "tag"
2. If tag==type, then TFL's don't have "types" (there are "no types" in TFL's)
3. You said parse-based type checking (above example) means there ARE "types" (don't need tags for PBTC)
4. #3 contradicts #2.

"Tag-free" appears to be your label for some languages that have variables without type references and values that are always of type "string". Languages with "parse-based type checking" associate more specific (at least, more specific than "string") type references with values as needed. Where is the contradiction?

All your 4 steps appear to illustrate is that if "tag-free" is intended to mean "no type references anywhere", then perhaps "tag-free" isn't the most accurate name for languages with "parse-based type checking".

But you just said "as needed"? The contradiction is back. Your "as needed" contradicts "no type references ANYWHERE". Nowhere-ness and as-needed-ness cannot exist at the same time in a language, at least not without adding caveat rules to the model, complicating it.

By "as needed", I was referring to examples like <cfargument name="p" type="numeric"> from above, where "parse-based type checking" is -- according to you -- "needed". Hence, "as needed".

You seem to be tripped up by the word "type". It could just as well be called "validation" or "groppnipping". Yes, the "validation" is needed, but not a tag. There's no need to introduce a tag or any damned thing LIKE a tag here. Thus, we don't have to cumbersomificate our model.
{No. You don't seem to recognize that the 'type="numeric"' is like a tag. (And "Cumbersomificate"? Really?)}
How so?
{Because the primary purpose of your tag appears to be associating a type with a variable or value. That's exactly what that snippet does.}
Sorry, I don't see it. It's more of a GateKeeper operation: the keeper checks the wagon for whatever criteria and either lets the wagon pass (into function) or not. The keeper does not need to mark (tag) the wagon. That would be introducing unnecessary state into the model; and plus, nothing known uses the candidate tag such that keeping/tracking it does not add predictive value to the model. - t
Regardless, this threadlet started because your assertion appeared to be that there is a difference between "tag" and "type" or, more accurately, that "tag" and "type reference" are different. In every case where you use the term "tag", it appears it can be replaced with "type reference"; in every case where you use the phrase "no tag", it can be replaced with "no type reference". Isn't that true?
No, it's not equivalent. Sure, you can say you did just for the sake of saying you did to satisfy a word itch, but it fucks up the utility and simplicity of the model. Admit it, you backed your model into a corner; words won't save you this time, Sam. It ruins the model, period.
Sorry, I find your response a bit baffling, and I don't see how I've "backed [my] model" (what model?) "into a corner" or what it is that I "can say [I] did just for the sake of saying [I] did". In what way is "tag" not equivalent to "type reference", and in what way is "no tag" not equivalent to "no type reference"? What are you claiming "ruins the model"? If you claim that "tag" and "type reference" are different, how do you explain the relationship between "tag" -- which programmers will not be familiar with -- and "type" -- which programmers are familiar with?
- Re: I don't see how I've "backed [my] model" (what model?) "into a corner" -- So you are admitting you DON'T HAVE a model? Hmmmm. That makes it far easier to win the model race then. - t
- There are programming languages, and they have variables, values and types. We can explain the relevant behaviour of programming languages in terms of variables, values and types without any further abstraction or indirection. They are already an effective abstraction of the underlying machine, so the "model" is simply variables, values and types and their relationships, such as whether variables have types or not. If we need a further abstraction for logical reasoning about programming languages, we have DenotationalSemantics. However, for day-to-day pragmatic programming, variables, values and types are simple and sufficient. Even if you find "types" too nebulous in general, the notion of a reference to a type is easily understood and useful for explaining TypeChecking in a programming language.
- Sorry, I disagree for reasons already stated. For example, there is no clear distinction between type checking and validation (such as parsing).
- Validation (such as parsing) is often used as a mechanism for implementing aspects of TypeChecking.
- "Often" is not good enough. Often the existing feel-ish notions of "types" work just fine; to that I agree. But I don't want just "often" here, I want an "always" model, or at least one very close to 100%. Or, at least, one that explains a pattern I see that creates two "kinds" of languages (those that act like they have a tag and those that don't). Vague type-ish notions don't explain this split pattern, while the tag model does.
- I think you've misinterpreted my use of "often". I simply meant that of the various mechanisms used in TypeChecking, parsing is often one of them. The pattern you see -- based on what you've described -- is a result of whether or not types are associated with variables and what types are associated with values. Most dynamically-typed languages do not associate types with variables. Some dynamically-typed languages associate various types with values. Others only associate a "string" type with (scalar) values. These relationships account for the behaviour you've seen in various languages. If you'd like to present specific examples of behaviour in various languages, I'm happy to show how they can always be simply explained in these terms.
- I'm sure you can, but your model would be more complicated than the tag model, having to consider more specific situations case-by-case.
- Bring up some specific examples. You explain them in terms of tags, I'll explain them in terms of types, values and variables. Let's see, or LetTheReaderDecide, which is more complicated or difficult to understand.
"Type" is vague; that's the bottom line. I'm not going to build a model around a vague idea if it makes it too complex. You are force-fitting. Again, if the tag/type is "always there", then it cannot be used to explain the diff between tagged and non-tagged languages. And we don't need it for example CF002: it explains nothing and provides no prediction benefit. Why add extra parts to the model if they don't affect it? The model is just plain simpler if we simply say language set X has a tag while language set Y does not have a tag. If that's not obvious, I don't know what else to do at this point. Perhaps we should just end this and LetTheReaderDecide. You haven't shown how your model explains the difference other than by saying, on a case-by-case basis, that something is ignoring your fuzzy "type reference". You complicate the model by trying to apply "type reference" on a case-by-case basis because then you need lots of specialized rules rather than simply saying "all languages of set X have no tags".
If types are "too vague", how do you avoid making reference to types when you explain tags, given that programmers are familiar with types and not familiar with tags?
The language calls it "type", not me. If given a choice between model simplicity and fitting programming street terminology, in this case I'll go with model simplicity. The primary purpose of the tag model is to improve results prediction in a way that's relatively easy to explain. The notion-y fuzz was the state of affairs BEFORE the tag model. Further, calling it a "tag" reduces confusion when language words like "type" come up. Using your suggestion that tag=type increases confusion. I don't have to use the word "type" to explain anything because my model needs no parts called "types". - t
Granted, languages may intermix type-related concepts such as validation and the IS-A hierarchical style of "types", but that's going to happen regardless because there are different ways to do type-ish things and diff languages will straddle these in various ways to various degrees and make up their own terms or usages of existing terms. I'm just trying to clarify it all by making a mechanical/visual model instead of a verbal model. Your verbal model is too complex and vague; and I'm not even sure it can be made to work right (by adding more rules). - t
If your "model" needs no parts called "types", what do you call the parts that other programmers call "types" when they differ from "tags" (you did say they were different, didn't you?) and how will you explain the differences (whatever they are) between "tags" and "types" to programmers who are already familiar with types?
I don't know how to clearly explain "types". Type is a vague concept. I can explain them in a fuzzy notion-y sense, but that's not always helpful when it comes to actual details of production languages. That's the very reason why I evolved the tag model: because I wanted a consistent, clear model. Like I said elsewhere, it may not matter if you call the parts, "zizzles", "friddles", and "flackabobs" as long as they clearly do their job in the model. Matching the external vocabulary is a nice bonus if it can be done smoothly, but so far nobody has done it. You are welcome to try. That being said, one can use the tag model to explain specific situations in such languages, such as parse-based "typing" ("typing" used in a loose sense) versus tag-based typing (or combo-typing). Some languages and/or operations use the tag, some only use the value. One can do experiments to see if a thing is ignoring the tag and then they have a better idea of how it reacts to dynamic data.
But it's clear that your "tags" and the "types" that programmers are familiar with are closely related. Programmers know the latter well -- at least in terms of their programming use, application and effects -- though perhaps not in formal terms. Surely, therefore, you have to draw some clear connection between "tags" and "types"? How will you explain it? It certainly isn't clear to me from your descriptions -- they suggest to me that "tag" == "type reference", but you say they're not the same -- and I use and implement programming languages including type systems. How do you expect it to be clear to programmers without that level of experience? I expect only inexperienced programmers have difficulty understanding the pragmatic role of types and might seek clarity in tags.
"Know [types] well" is debatable. As a general notion-y feeling, yes, but that doesn't answer some sticky areas of dynamic languages where most programmers just use trial and error or "defensive" techniques to work around them. I wanted something less ambiguous and that's why I worked toward a more mechanical/visual model that "explains" (predicts) certain language differences. I suspect you are a linguistic thinker such that MV models perhaps confuse you more than help. Perhaps different models fit better to different minds. I generally don't like linguistic models.
Ok, avoiding "types" and adopting a somewhat visual notation, is the following true?
- variable ---> tag means variables have tags (typical of statically-typed programming languages)
- variable -/-> tag means variables do not have tags (typical of dynamically-typed programming languages)
- value ---> tag means values have tags
- value -/-> tag means values have no tag; values are always represented as a string
- So, ColdFusion is variable -/-> tag and value -/-> tag, PHP is variable -/-> tag and value ---> tag, C# is variable ---> tag and value ---> tag.
  - PHP has tags (has==acts like has).
  - Yes, "variable -/-> tag and value ---> tag".
  - Please clarify, but perhaps on a different Php topic. I don't know why you are introducing 3 "things" here.
  - What 3 "things" do you mean?
If so, how does it differ from the following?
- variable ---> type means variables have types (typical of statically-typed programming languages)
- variable -/-> type means variables do not have types (typical of dynamically-typed programming languages)
- value ---> type means values have types, like integer, string, float, etc.
- value -/-> type means values are always represented as a string
- So, ColdFusion is variable -/-> type and value -/-> type, PHP is variable -/-> type and value ---> type, C# is variable ---> type and value ---> type.
- With Example CF002 you claimed it's showing it "has types", yet above you say there are no types in CF. Right back to the same issue.
- Not a contradiction, but an omission -- I didn't include <cfargument ...> in the above. A revised ColdFusion is "variable -/-> tag, value -/-> tag, <cfargument> ---> tag" or "variable -/-> type, value -/-> type, <cfargument> ---> type". How's that?
  - 1. You are complicating the model by making statement-specific rules and exceptions. Tossing the tag altogether for the given lang gives us the same results without adding a messy granularity of rules. 2. There is no way to verify (probe) this "type" part in/next-to CFargument. It's easier to say "the value is parsed" for fit such that we don't have to model a temporary tag-like thingy. You are adding unnecessary state and parts to the model.
  - How am I complicating the model? In my "type" example, I'm using terminology from the language statement itself and its apparent semantics. However, if you wish to avoid having to "model a temporary tag-like thingy" (?) that's fine; ignore it and there's still no contradiction with "variable -/-> type, value -/-> type" because we're ignoring it.
  - Like I already explained, "type" is ambiguous, which is the main reason why I go with "tag" instead. If one just "ignores it", then it's not clear what role it's playing (or not playing) to explain/model why the two kinds of dynamic languages behave different. It makes things far more obvious to lop it off altogether for Type II languages. Why have distracting clutter floating around the model? And the missing tag highlights the fact that it's not being used. Non-use is no longer a rule one has to "just remember", it's in-your-face because it's gawwwnnn! You don't have to remember to ignore something if it's not there to ignore.
  - Compare to this: "Apes are a lot like monkeys, but apes have no tails. However, in our model our apes will still have tails, but ignore them; pretend like the tail isn't there."
    - {Why not. We were talking about primates (types or tags associated with values or variables). You complained that some primates have tails (type or tag associated with cfargument). We said fine, monkeys are primates with tails, apes are primates without tails (there's a type or tag associated with cfargument). You complained about complicating the model. We said fine, ignore it, and we're back to talking about primates (types or tags associated with values or variables).}
      - What? You monkeyed up my monkey example.
    - {You brought up the issue of cfargument, and if you want to discuss its behavior, your model is going to have to take it into account. If you don't, you might be able to ignore it. This is true regardless of whether or not we are using types, tags (whatever that happens to be), or something else entirely.}
    - We have a choice: explain it as parse-based testing, or introduce a tag or type-like thingy or whatever it is you call them into that spot of the model. I see no good reason to do the second since it cannot be examined. If it actually did such under the hood, it's garbage-collected away before it affects anything else. And risks confusion with other language models.
    - So you're taking the "ignore it" option, yes?
    - No. It's automatically explained as only looking at the value since that's all there is. Magic! But because the question of "how can it be done without a tag?" might come up, we offer parse-based validation as a possible processing-based explanation.
    - {It's not all there is. You also have the part that decides whether or not the value is valid. The current question is "are we going to concern ourselves with it?"}
    - Since the tag model is to model "type-related or type-like behavior", and the language itself uses the word "type" at that spot, it should probably be addressed as a practical matter. My key point is that we don't need a tag or something like a tag called a "type" to explain observed behavior. The language "acts like" it uses parse-based validation of the value and only the value. There's no evidence it needs to look at the tag or a tag-like thing called a fizzle, dooggle, or "type".
    - In fact, if we go with your suggested naming, we have more reason to address it because your naming has "type" and the language uses "type" in its syntax. I get away from that more by using "tag".
    - {The point is, that there is something type-like involved. Something has to determine which values are valid and which aren't. Now, we don't care if you wish to include it or not. Types can explain the behavior of cfargument just as easily as they do with variables and values. Apparently tags can't, so to get an equivalent model, you need to include something besides tags if you wish to explain the behavior of cfargument.}
    - It's not "doing" anything specific or testable. It has no concrete model behavior or results, other than unspecified magic. I want a concrete model, not a fuzzy notion-a-tron. I want to see something like: step 1, it checks box A, step 2 it copies box B to box X, etc. Concrete. Mechanical. Visual. Hard Walls. Metal, not cotton. Granted, I don't explain "parse-based testing" but that's either understood by most or explainable as a side topic.
    - {What do you mean it's not doing anything specific or testable? Cold-fusion defines what it does. How is that not specific? You can try passing numeric and non-numeric data to the function. How is it not testable?}
    - What is "it"? The cfArgument statement does the checking. Not a "type", at least not as concrete thing of the model. We can conjure up a "type ghost" that helps the cfArgument statement, but that's just making up parts of the model for undefined reasons or to force adherence to nebulous type feelings out of.....habit?
    - If it looks like a duck and quacks like a duck... It behaves like TypeChecking. Notably, the ColdFusion people even used a "type" keyword. Why do you think they did that?
    - You are ducking a real answer.
    - {You're not asking real questions. "What is 'it'?"? "It" is referring to the same thing as the "It" in "It's not doing anything specific or testable." If you didn't know what "it" was referring to then, on what do you claim that it's not doing anything specific or testable? If you did, why don't you know what "it" refers to now? Your second "question" isn't really a question.}
    - I'm referring to your ghost "associated with". Which may or may not be the same "it" in "If it looks like a duck" because I don't know what the hell the "associated with" thing is and is not your model.
- (Note that "represented as a string" may not necessarily be true, but we can say we model them as strings, which appears to produce accurate predictions so far. It could "compress" stuff under the hood if it wanted to by converting a string of digits into Int or the like.)
- I'm assuming only observable characteristics in the above, including "represented as a string". Internal compression mechanisms, etc., are not addressed.
- There is no experiment I know of that directly verifies "represented as a string". I agree it's a UsefulLie for modeling purposes, though.
- By "represented as a string" I mean it's "a string of characters", which is self-evident.
- It comes out that way, but that doesn't necessarily mean it's stored that way. I'm assuming we can only see the output, not X-ray RAM.
- I thought that's (essentially) what I wrote.
- The output of both kinds of langs is bytes. It's not a differentiator.
- A differentiator with what? A "string" is certainly distinct from an integer type or a floating point type.
- Not necessarily. Go about testing that assumption and you run into the kind of results and differences that inspired the tag model.
- The results and differences are easily explained in terms of the familiar concepts of types, values and variables without the need for novel and obscure constructs. "Obscure" is meant in a literal sense -- whilst types, values and variables have obvious syntax associated with them and are familiar programming language concepts, "tags" are unknown and inherently obscure, because the existence of tags can only be inferred rather than observed directly. In a given language's source code (or language reference document) I can point at declarations of and references to types, values and variables and explain how they interact. Where can I find, in a typical programming language (or its documentation), reference to or declaration of a tag?
- You keep claiming that, but cannot come up with simple visual representation/model of the "mechanics" of type handling (or non-handling). You use nebulous words and confusing, round-about language. I try to turn your words into a visual model, but get dotted lines, semi-permeable membranes, and clouds with amorphous borders.
- The "visual model" is trivial, without any "semi-permeable membranes, and clouds with amorphous borders". What is vague or lacking in a "simple visual representation/model of the 'mechanics' of type handling" about "variable ---> type"? That's as simple and clear as it gets, and entirely accurate without artifice or construction.
- You might think so, but either you are wrong or explained it poorly. It appears you are adjusting it on a case-by-case basis to bend it to a specific situation. While such may technically "work", it fails to provide a forest-level explanation (such as family A has tags and family B does not). It's like a fuzzy "associated with" ghost that hangs over each operation and morphs or does magic as needed to fit a specific scenario.
Re: "(they suggest to me that "tag" == "type reference")" -- I thought I have explained many times that they are not the same and that one lacks prediction power in terms of comparing language differences. Lack of the (presumed) tag for some languages is the very thing that makes them behave different than those with the tag. But apparently I am failing to communicate it. I am at a loss for a better way to communicate it so far. It looks fine to me; I don't see holes apparent to me, but apparently I'm leaving something unstated that is naturally "there" in my mind but not to yours when you look at it.
You appear to have claimed many times that "tag" != "type reference", but I don't recall seeing an explanation of how they differ. Could you point to a PageAnchor where the difference is explained? Perhaps I missed it.
"type reference" is vague such that I cannot supply the precision you asked for. The results of test_53 are objective and they reveal a pattern that the Tag Model explains/mirrors well.
test_53 is trivially explained as a and b being variables containing values of different types with the same string representation.
I meant the pattern of differences between the two "kinds" of dynamic languages.
In one kind of language, a and b are variables that contain values of different types with the same string representation. In another kind of language, a and b are variables that contain values of the same type with the same string representation.
How does one test/observe "same type"? And if they are ALWAYS the same, then "type" is superfluous, and can be tossed to avoid distraction.
An obvious way is to examine variable declarations and/or assignments. "Same type" is still a type of some sort, with associated operations and a set of possible values, unless you're dealing with arbitrary sections of memory of undefined length.
But in dynamic languages, the boundary tends to be blurry. For example you can usually do string operations on "numbers" (or things that appear to be numbers) and numerical operations on strings, assuming they resemble numbers to a sufficient degree. "Still a type of some sort" is nebulous. Things can be multiple "types" at the same time and different operations can interpret the same variable and same content as a different "type". Using traditional hierarchical "type" classifications is a poor fit, and the alternative is messier than the tag model.
No, it's simple. The "same type of some sort" is invariably a string. Strings can be tested to see if they're numeric or some other type like a date, and appropriate operations performed on them. This is trivial, first-year ComputerScience stuff. I don't see the difficulty.
Again, what validation is "type checking" and which is not? If it's too open ended it will be difficult to make a clear and consistent model around. Parse-checking is very different from tag checking, and to lump them both together under "types" is asking for confusion. You are bowing down to pet language, even if it's the wrong tool for the job. Divorce your ugly nagging types wife and free your mind.
If you're performing validation to determine whether or not a value belongs to an identifiable set of values, then you're doing TypeChecking, and the identifiable set of values is a type.
That's too open-ended, see below.
For the purposes of your model, that may be true. Experienced programmers can recognise when (say) validation of some string is actually a form of ad-hoc TypeChecking, but that's about a given program rather than the programming language.
Bzzzzt. Wrong. The pattern is at a language-level. Those langs in the non-tag category ALWAYS act like they have no tags, while those in the other camp mix and match tag-based inspection and parsing. Perhaps YOU need the tag model to recognize it for yourself rather than think of it on a case-by-case basis. Your approach seems to be blinding you to larger-level patterns.

Please elaborate on this "associate...with". That's too nebulous. And no association is necessary for explaining/modeling program behavior, so why introduce such fuzzy ghost terms? Specifically the cfArgument process examines the parameter passing through and it either passes the examination or fails and the program stops. No creation of a reference is necessary; we don't have to glue anything extra on to the package; and it doesn't explain anything taking place. I will agree we could perhaps rework the model to include such "association", but it complicates the model. Lack of a tag is conceptually simpler.

"Associate with" is the intuitive and expected meaning. For example, in a language that specifies a declaration like "int x", we can say that "x is associated with 'int'" and vice versa. "Lack of a tag" is precisely the same as "not associated with a tag".

But if the association doesn't do anything or does nothing clear-cut, then it's not helping our goal, and may only serve as a distraction.
What do you mean by "doesn't do anything"? In a declaration like "int x", the association of "int" with "x" serves a fundamental purpose: It means "x" can only be assigned integer values. If "int" wasn't associated with "x", we (and the compiler) wouldn't know what types of values "x" can or can't be assigned.
We are talking about Example CF002, no? Not "int x". If cfArgument by chance makes a temporary tag-like thing, nothing testable uses it. It helps nothing in a forecasting model, existence or not. I suppose the parsing step (model) could create an extra tag with the result of the test, but that's extra steps because you create a (temp) type code first, and then have another comparison to see if the validation is pass or fail based on the set of pass-able types. It's less steps to return pass or fail based on a match-parse because the result only has to be a Boolean value in the model regardless of the "type" of the input variable.
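The two routes compared in that last point can be written out side by side. This is a rough sketch in Python, not a description of how ColdFusion works internally: every function name is hypothetical, and using float() as the "numeric" criterion is an assumption made purely for illustration.

```python
def parses_as_number(text):
    """Match-parse: True if the text reads as a number (assumed criterion)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def classify(text):
    """Route 1, step 1: derive a temporary type code from the value."""
    return "numeric" if parses_as_number(text) else "string"


def check_via_temp_code(text, allowed=frozenset({"numeric"})):
    """Route 1, step 2: compare the temporary code with the passable set."""
    return classify(text) in allowed


def check_via_parse(text):
    """Route 2: return pass/fail straight from the match-parse."""
    return parses_as_number(text)
```

Both routes give the same pass/fail answer for any input, so no experiment on the output can tell them apart. The second simply has one step fewer and no temporary code to discard.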

Note that tagged dynamic languages may also do parse-checking in some cases, but it's used less often because the tag can be examined first, often eliminating the need for parsing.

Maybe we should try models that are semi-machine language. Such can risk adding low-level detail that we may not need, but machine-language is something that most graduates should have experienced and can be defined in concrete ways where all the bits are X-ray-able and observable with clear ordering and rules. No more "notion processing". But hopefully there is a way to abstract away or postpone the portions that are not in dispute. -t

Undergrads are often taught the inner-workings of the machine via machine-language, typically in 1st or 2nd year. However, this illustrates actual mechanics -- I thought your goal was an abstraction or a model?

Yes, that was the goal, but it's not working for either side. Something more explicit may be needed. Abstraction is in the head, and everyone's head is different. We may have to go to a lower level of abstraction to have a prediction model that is clear to both sides. Keep in mind that a virtual machine is not necessarily an (intended) implementation, but rather a model with prediction capabilities.

That's not unreasonable. Some educational institutions favour giving students a heavy dose of computer architecture and assembly language, often in 1st year, for this reason.

Like I said somewhere, it depends. My university had about 5 "minors" (sub-specialties) of CS, and only one covered such heavily.

But even if this was the case, the actual tag existence or non-existence would probably be reflected. Dynamic langs like Perl and CF probably only represent scalars as strings without any extra tags for number versus string versus date, etc.; and likewise Php and JS probably have an explicit tag separate from the value. Thus, if type=tag as you assert, then parse-based testing/comparing is not "types". QED. (It would also make CF's use of it in cfArgument a misnomer. But I get away from that problem by using "tag" instead of "type". I've thus solved 2 problems. Now give me my fucking Nobel so that my ego gets even more obese, like the Gods of Logic intended.)

Note that the example above perhaps may run something like this under the hood:

  ...
  <!--- Example CF003 --->
  <cfArgument name="p" regexVerif="[-+]?([0-9]*\.[0-9]+|[0-9]+)" failMsg="Not a number">

Yet have the same result.
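For readers who want to try the pattern, here is a small Python sketch of the same gatekeeper idea. The function gatekeeper and its parameters are hypothetical names mirroring the made-up regexVerif and failMsg attributes above, and requiring the whole string to match (fullmatch) is an assumption, since CF003 does not say how the pattern is anchored.

```python
import re

# The pattern from Example CF003, unchanged.
NUMBER_PATTERN = re.compile(r"[-+]?([0-9]*\.[0-9]+|[0-9]+)")


def gatekeeper(value, pattern=NUMBER_PATTERN, fail_msg="Not a number"):
    """Pass the value through untouched if it matches, otherwise stop."""
    if pattern.fullmatch(value) is None:
        raise ValueError(fail_msg)
    # Nothing is attached to the value; it leaves exactly as it arrived.
    return value
```

Calling gatekeeper("-3.14") returns the string unchanged, while gatekeeper("12abc") raises ValueError with the failure message.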

Yup, and it means exactly the same thing -- it's performing TypeChecking.

That's an awful wide definition of "type checking". Swiss Army Types? What regex's are "type checking" and which are not?

If you're performing validation to determine whether or not a value belongs to an identifiable set of values, then you're doing TypeChecking, and the identifiable set of values is a type. TypefulProgramming recognises that types are pervasive, and endeavours to make types explicit wherever possible.

What? So if we have range checking, then it is the "types" of all things in that range? That's ludicrous! Waaay too open-ended. Every WHERE clause is a type picker?

Recall that a popular definition of "type" is that a type defines a set of values and a set of operations on those values. Does a WHERE clause define a set of values and a set of operations on those values?

Such string parsing is blind to operations down the road. It doesn't think/plan about the future and the "meaning" of numbers; it just follows orders. The programmer or language designer may have had intentions or future usage patterns in mind, but that's exploring human heads, not program results. I want to model program output, not YOUR output.

Even so, which set of all possible regex's qualify it as "type checking" and which don't? Do we have to know programmer intent to answer that? If so, are we back to the old WhatIsIntent fights?

As I pointed out above, experienced programmers can recognise when (say) validation of some string is actually a form of ad-hoc TypeChecking, but that's about a given program rather than the programming language. However, a language feature like <cfargument ...> with a keyword "type" can and probably should be included in your model, but it probably doesn't hurt (much) to treat it as spurious and exclude it, either.

If one is modeling behavior, then modeling a difference based around the existence of the word "type" is against that goal. Behavior-wise, it's no different than any other regex expression or filter rules such that bifurcating a model to have split paths based on the existence of the word "type" is a violation of OnceAndOnlyOnce, since Path A and Path B could be explained the same but are not to cater to a word-centric or historical-habit-centric model.

Let's just agree that both approaches can "work" and having both model choices can help one view the thing from different angles, one behavior-centric, the other vocabulary-centric or history-centric. Let's agree to let both live.

I agree, as long as you're ok with the fact that I will oppose the "tag model" every time I see mention of it. I think it's confusing, redundant, and possibly misleading.

The feeling's mutual.

So, what does that mean, exactly? I prefer to emphasise understanding language behaviour in terms of language semantics and syntax and well-known elements like types, values and variables, so if the "feeling is mutual" does that mean you're going to oppose actual language semantics and syntax whenever you see them? Are you going to oppose explanations of language behaviour in terms of syntax and semantics? Or do you mean you think existing language semantics and syntax are confusing, redundant, and possibly misleading?

But your model is either more complicated because of your language-centric tilt, or if simplified outright doesn't work to explain differences in languages. One of the main problems is that "types" has been overloaded with explicit typing and parse-based typing (examining the value only). The tag model cleans that up and models both in a clearly different way, not your hazy "associated with" vagueness. If one bifurcates these two concepts into distinct modelling actions/features, then it reduces confusion and highlights why the two language families act different. --top

Surely the "$" in the Perl variable is a type tag, while @ and % are other examples in the same language. The var named $fred may contain several types of value but they must all be suitable for use in a $-variable. This is enforced by the language and is not just a convention.

Whether this is what Top meant or not I don't know but several popular languages have had these sorts of type tags, including the classic implementations of BASIC that many of us started out on (a$, b$, i%, j% etc.).

It's not what Top meant, though it might be the inspiration for his "tag model". Single character prefix or suffix "tags" are syntax for specifying a variable's type. BASIC's 'a$' is semantically equivalent to C#'s 'string a', but obviously syntactically different.

Top, would you consider C to be a language that uses tags?

During compilation, yes; during run-time, no.

I'm surprised. From the attempted definitions and examples you've given, I see nothing that would change between compile-time and run-time. I also would have expected the answer to be yes during run-time since the following program outputs "Same string", "Different foo".

 #include <stdio.h>
 #include <string.h>

 static double foo(double x)
 {
     return x;
 }

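 int main(void)
 {
     /* Two doubles that differ only past the sixth decimal place. */
     double x = 1.0 / 3.0;
     double y = x + 0.0000001;

     /* "%f" prints six digits after the decimal point by default. */
     char xstr[100];
     snprintf(xstr, sizeof xstr, "%f", x);

     char ystr[100];
     snprintf(ystr, sizeof ystr, "%f", y);

     /* The strings match even though the values do not. */
     printf("%s\n", !strcmp(xstr, ystr) ? "Same string" : "Different string");
     printf("%s\n", (foo(x) == foo(y) ? "Same foo" : "Different foo"));

     return 0;
 }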

Curious. If I'm following this correctly, then "%f" is rounding or truncating. In this case the "canonical string representation" (CSR) is not showing us the "full" value. I'd perhaps argue that's a flaw with the language's CSR. The CSR shouldn't chop off precision: default to sufficient precision to show all influence of the 8 bytes, but have formatting operations to round if desired. Maybe "%f" is not the CSR anyhow because we can use different formats in printf for any variable regardless of declared type. C may not have a CSR. (My C is rusty.) I wonder if any dynamic languages do this?

C doesn't really have a CSR. Only the arithmetic types, pointers, and arrays of characters have anything that's even a reasonable candidate. ("%f" or "%g" are the best candidates for both float and double. Both suffer from rounding.) But the question is now, how can we determine if a "Same string, different foo" is because of a "flaw in the CSR" or a "tag"?

Is there any direct way to extract the full value faithfully? (Indirectly we may be able to subtract the value from slightly rounded versions of itself to slice the parts.)

You can explicitly specify the precision. Regardless of which precision you choose, C allows for an implementation that exceeds it. C does give you the ability to look at any object as a sequence of bytes. I don't see how either helps in distinguishing between "flaws in the CSR" and "tags".

If C has no CSR, then we cannot apply candidate definition #3. But the existence of tags during compiling but none at run-time can help explain/model why we can apply different formats to different (compile) types willy nilly. Also note that I mostly limit using the tag model for dynamic languages.

My question still stands. How can we differentiate between "flaws in the CSR" and "tags"?

How does one find flaws in something that doesn't exist? I will agree that hypothetically, if a CSR doesn't show us the full "value", then def #3 would be picking up the hidden effects of this under the name "tag", which is probably not what we want in the def. There will probably always be odd caveats that break any def or "rule".

Forget C for the moment. Let's say we have a language that defines a toString method that works for every value, but that method truncates floating point values. That toString method would be the obvious choice for the CSR. How do we tell if that toString method is flawed as a CSR or if the language has tags? (BTW, there are plenty of rules and definitions that don't have any exceptions. As examples, the standard definition of integer, Von Neumann ordinals, modus ponens, etc.)

Because math doesn't have to worry about practical issues, unlike programming.

How about which values belong to a given implementation of C's int type?

We could add 0.0000001 to one group of numbers, say monetary amounts, and not add it to quantities, to create a kind of "tag" that differentiates between the two kinds of numbers in an app or shop, and build that into a library (including removal of the 0.0000001 end-marker for processing). Thus, it can act as a typical "type tag". Whether it's convenient or not is another matter. The difference comes down to intent and usage, and we've already done the "intent dance". What I really want is a model that explains behavior of the language, not a model of human heads.

Yes, we can take our own ad-hoc type system and stick it on top of the language's type system. I'm not interested in that at the moment. I want to know how to differentiate between CSR flaws and tags.

Ummm, it's called typedef. --MarkJanssen

A typedef is a C-specific alias for a type name or a means to associate a name with a type definition. How does it relate to the above?

Try actually making a compiler to transform language text into machine code, instead of debating abstractions, and then you'll know. Find more at ComputerScienceVersionTwo.

I have done, and I still do. I wrote my first compiler in 1982, and I've been writing them ever since. I've written a number of compilers, and I am the principal author of the RelProject which incorporates a compiler for a VM. What "more" do you expect me to find at ComputerScienceVersionTwo?

Again, how do C's typedefs relate to the above?

Rather than endlessly (and circularly -- it's getting lengthy and repetitive with no resolution in sight) debate the merits of the "tag model" vs understanding language behaviour in terms of language semantics and syntax and well-known elements like types, values and variables, I have created TypeSystemCategoriesInImperativeLanguages. Top, I encourage you to create a similar page for the "tag model", and then LetTheReaderDecide.

Stumbled across this interesting ComputerScience about tags and typing:

Quote: "a tag section that describes the type of the data: how it is to be interpreted, and, if it is a reference, the type of the object that it points to."

CategoryTypingDebate

JuneTwelve AugustThirteen