Typeless Vs Dynamic

There is a distinction between "typeless" variables and "dynamically typed" ones. Under DynamicTyping, values generally carry a "hidden" type indicator with them. This type indicator tells the interpreter how to interpret the value when there are potentially multiple interpretations. Typeless depends purely on context and never on any hidden or unseen type "flag". Some feel this is better than flag-based dynamic typing because it is allegedly more WYSIWYG: one considers the contents only, and not the contents in conjunction with the flag. It is thus "less to think about". It may require a different mindset than flag-based typing, however, and this confuses some.
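A rough Python sketch of the contrast, for illustration only. The names flagged_plus, typeless_add and typeless_concat are made up here, the (tag, payload) pair stands in for whatever hidden flag a real interpreter keeps, and the typeless version handles only integer text to stay short.

```python
# Flag-based: every value is a (tag, payload) pair, and "+" consults the tag.
def flagged_plus(a, b):
    tag_a, val_a = a
    tag_b, val_b = b
    if tag_a == "num" and tag_b == "num":
        return ("num", val_a + val_b)
    return ("str", str(val_a) + str(val_b))

# Typeless: every value is just text; the operator alone picks the reading.
def typeless_add(a, b):
    return str(int(a) + int(b))

def typeless_concat(a, b):
    return a + b

print(flagged_plus(("num", 1), ("num", 2)))      # the tag says "number"
print(flagged_plus(("str", "1"), ("str", "2")))  # same "+", tag says "string"
print(typeless_add("1", "2"))                    # the operator says "number"
print(typeless_concat("1", "2"))                 # the operator says "string"
```

In the flag-based half, what "+" does cannot be read off the visible contents alone; in the typeless half, the visible contents plus the operator are the whole story.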

I have never seen a purely typeless language, but have kicked around the idea while planning a pet language. Nearly typeless languages, such as ColdFusionLanguage or PerlLanguage (IIRC) are typeless for simple scalars (numbers and strings), but still distinguish between collections and scalars. In a purely typeless language, one could do something like:

  a = "foo"
  a["bar"] = "glob"
  a = 7

Here the variable originally held a scalar, was then suddenly used as an array, and then went back to being a scalar. I don't know of any languages that do this both ways. If anyone has seen such, please let us know. It's even possible for a variable to be both an array and a scalar. If you use brackets, then the array elements are accessed, otherwise the single scalar value is returned (or an empty string if it has none).

Actually I've seen it somewhere, at least from scalar to array, but don't remember where. But, the language had some kind of "isArray()" function that told you which "type" it is. Thus, it is still not purely typeless. A purely typeless language would not be able to answer that question.

This is achieved in Revelation basic. An internal data type of Dynamic Array is provided, with convenient syntactic operators. A dynamic array is a series of attribute values delimited by hierarchical separators -

   thisvalue = "abcd"                thisvalue now contains abcd
   thisvalue<2> = "efgh"             thisvalue now contains abcd <level1delimiter> efgh
   thisvalue<2,2> = "ijkl"            thisvalue now contains abcd <level1delimiter> efgh <level2delimiter> ijkl

 so if
   thatvalue = "abcd"
 then
   (thisvalue<1> = thatvalue)   is true
   (thisvalue = thatvalue)      is false but perfectly valid
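For readers without Revelation at hand, here is a hedged Python imitation of the delimited-string idea. The helper names extract and replace are invented for this sketch, and the delimiter characters (chr(254) and chr(253)) are an assumption made for illustration; only the behaviour shown in the example above is taken from this page.

```python
FM = chr(254)  # assumed level-1 delimiter
VM = chr(253)  # assumed level-2 delimiter

def _get_part(text, delim, index):
    """Return the 1-based index-th delimited part, or "" if it is absent."""
    parts = text.split(delim)
    return parts[index - 1] if index <= len(parts) else ""

def _set_part(text, delim, index, new):
    """Replace the 1-based index-th part, padding with empty parts if needed."""
    parts = text.split(delim)
    while len(parts) < index:
        parts.append("")
    parts[index - 1] = new
    return delim.join(parts)

def extract(value, attr, sub=None):
    """Imitates value<attr> and value<attr,sub>."""
    field = _get_part(value, FM, attr)
    return field if sub is None else _get_part(field, VM, sub)

def replace(value, attr, new, sub=None):
    """Imitates value<attr> = new and value<attr,sub> = new."""
    if sub is None:
        return _set_part(value, FM, attr, new)
    field = _set_part(_get_part(value, FM, attr), VM, sub, new)
    return _set_part(value, FM, attr, field)

thisvalue = "abcd"
thisvalue = replace(thisvalue, 2, "efgh")
thisvalue = replace(thisvalue, 2, "ijkl", sub=2)
thatvalue = "abcd"
print(extract(thisvalue, 1) == thatvalue)  # the <1> comparison
print(thisvalue == thatvalue)              # the whole-value comparison
```

The value never stops being a plain string; "array-ness" exists only in how the angle-bracket operators choose to read it.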

(and I think Python dictionaries could be used this way)

(and Adabas/Natural is another which would allow this - maybe somebody could show us how it does it) -- PeterLynch

So using "dot path" notation it really means this?

     value<2,2> = "ijkl"

     value.2.2 = "ijkl"   // "traditional" syntax

I am not sure what you are saying here. In the light, hopefully, of my refactoring of the names above, can you restate this?

Although I suppose it could have an "elementCount()" function that would return 0 if it has no array elements. This would be an indirect way to see if it was an array, or at least used as an array.

Internally, the language could implement such by treating every value as an array or potential array. A pointer to the elements would be zeroed/nulled out for a regular value and filled in with a pointer to the collection if any elements are assigned.

In my draft pet language I took another approach. Special values in the array were set aside (by convention) to serve as the scalar value. Example:

  a = 7
  a['~value'] = 7  // same meaning as above

A scalar assignment is just a shortcut for the "~value" element. Implementation-wise one would probably not want to treat each value as an array for performance reasons, but conceptually they can be treated as being no different. To achieve this, the "~value" element could be internally mapped to a standard scalar "slot". Only if some other element is assigned do we need to use or allocate a full-blown array structure. If we later just used it as a scalar, then the other elements are simply ignored. If we want to free up memory, I suppose we could have a "clearArray" function that removes the elements.
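A minimal sketch of that implementation idea in Python, under stated assumptions: the class name TypelessVar and the methods set, get and clear_array are hypothetical, and reading a missing element as an empty string is my extrapolation of the "empty string if it has none" rule mentioned earlier for scalars.

```python
class TypelessVar:
    """Conceptually always an array; the scalar is just element '~value'."""
    SCALAR_KEY = "~value"

    def __init__(self, scalar=""):
        self._scalar = scalar   # dedicated scalar "slot"
        self._elements = None   # no array structure until an element is assigned

    def __getitem__(self, key):
        if key == self.SCALAR_KEY:
            return self._scalar
        if self._elements is None:
            return ""           # assumption: missing elements read as ""
        return self._elements.get(key, "")

    def __setitem__(self, key, value):
        if key == self.SCALAR_KEY:
            self._scalar = value
            return
        if self._elements is None:
            self._elements = {}  # allocate the full structure only now
        self._elements[key] = value

    def set(self, scalar):
        """a = 7: touches only the scalar slot, other elements are ignored."""
        self._scalar = scalar

    def get(self):
        return self._scalar

    def clear_array(self):
        """Drop the elements and free the array structure; the scalar stays."""
        self._elements = None

a = TypelessVar("foo")   # a = "foo"
a["bar"] = "glob"        # a["bar"] = "glob"
a.set(7)                 # a = 7
print(a.get(), a["~value"], a["bar"])
a.clear_array()
print(repr(a["bar"]))
```

Note that nothing in the public behaviour reveals whether the array structure has been allocated, which is the point: the optimization never makes its existence felt.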

-- top

(Occurrences of "value" fixed. Don't complain, correct.)

First question: why?

I don't understand the question.

Second question: What happens when the user wants to iterate the array? Does ~value pop out unexpectedly?

Generally if you want to iterate, then a function such as "keysToList()" that ignores those marked as "system values" would be used, unless explicitly asked to include them. In the examples I used "~" as the marker, but in practice perhaps something even more obscure should be used (perhaps configurable). Iteration pseudo-code example:

  myList = keysToList(myArray)  // create delimited list (defaults to comma)
  while (listItem = iterateList(myList, listItem)) {
     print('Key: ' . listItem . ' Value: ' . myArray[listItem])
  }
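The same loop as a runnable Python sketch. The function name keys_to_list, its include_system parameter and the SYSTEM_MARK constant are assumptions made for this illustration; a plain dict stands in for the array.

```python
SYSTEM_MARK = "~"  # assumed marker for "system values"

def keys_to_list(array, include_system=False, delimiter=","):
    """Return the keys as a delimited list, skipping system keys by default."""
    keys = [k for k in array
            if include_system or not k.startswith(SYSTEM_MARK)]
    return delimiter.join(keys)

my_array = {"~value": "7", "x": "10", "y": "20"}

for list_item in keys_to_list(my_array).split(","):
    print("Key: " + list_item + " Value: " + my_array[list_item])
```

So "~value" does not pop out unless asked for. A delimited list does break if a key contains the delimiter itself, so a real design would need an escaping rule or a different list representation.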

Third comment: Treating each value as an array isn't any worse than normal DynamicTyping. You still need a type flag, but otherwise every value is just a one-element array. Then it can grow like a normal vector or HashTable.

It removes the need for any kind of internal type flag. I see no need for it, at least not something that shows up in the language's behavior. Maybe internal optimization would use a type flag or something similar, but it never has to make its existence felt. The "a = 7" line in the above example would not cause an error, for example. By the way, a "clone()" function would have to be used if you wanted to copy more than the scalar value. If the array is empty, besides the scalar value, then clone() acts just like an assignment.
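To make the assignment-versus-clone() remark concrete, a small sketch follows. It models a variable as a plain dict whose "~value" key is the scalar; the function names assign and clone are hypothetical stand-ins for the language's "=" and "clone()".

```python
import copy

def assign(source):
    """Plain assignment: only the scalar travels to the target."""
    return {"~value": source.get("~value", "")}

def clone(source):
    """clone(): the scalar and every element are copied."""
    return copy.deepcopy(source)

a = {"~value": "7", "bar": "glob"}
b = assign(a)   # b = a
c = clone(a)    # c = clone(a)
print(b)
print(c)

plain = {"~value": "7"}
print(assign(plain) == clone(plain))  # no elements: both behave the same
```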

Fourth comment: AssemblyLanguage and BeeLanguage are both untyped. ParrotCode is DynamicallyTyped, but lets scalars have "properties", which essentially gives them array-like features (and aside from that, scalars and arrays are both PMCs, so in a sense they're the same type). CeeMinusMinus, LowLevelVirtualMachine (LLVM), and many other VM assembly languages are all untyped too, beyond specifying the maximum number of bits in the value.

You are right that assembler is generally untyped, but I was thinking of higher-level languages. The examples you mention just seem to grab areas of memory. A higher-level language will treat values as atomic such that you cannot accidentally overlay other stuff in memory. They also usually treat everything as a dynamically-sizable string.

On second thought, the assembler I remember did have "types". Operations expected information to be represented in certain ways. For example, there may be an "addInteger", "addDouble", etc. It is typed, just not enforced. A true typeless language would not have such operations built-in.
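The "typed operations over untyped cells" situation can be imitated in Python with the standard struct module. The function names add_integer and add_float are made up to echo the hypothetical "addInteger" and "addDouble" instructions; the cell is four raw bytes with nothing recording how it should be read.

```python
import struct

cell = struct.pack("<f", 1.0)   # four raw bytes; no flag says "float"

def add_integer(a, b):
    """Read both cells as unsigned 32-bit integers and add, wrapping around."""
    x = struct.unpack("<I", a)[0]
    y = struct.unpack("<I", b)[0]
    return struct.pack("<I", (x + y) & 0xFFFFFFFF)

def add_float(a, b):
    """Read both cells as 32-bit floats and add."""
    x = struct.unpack("<f", a)[0]
    y = struct.unpack("<f", b)[0]
    return struct.pack("<f", x + y)

print(struct.unpack("<f", cell)[0])                     # read as a float
print(hex(struct.unpack("<I", cell)[0]))                # same bits as an integer
print(struct.unpack("<f", add_float(cell, cell))[0])    # float add, float read
print(struct.unpack("<f", add_integer(cell, cell))[0])  # integer add, float read
```

Nothing stops the last line from running; the operation expects a format, but nobody enforces it.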

Another example of a truly typeless language would be Forth. ForthLanguage has only cells, the same as BcplLanguage and BeeLanguage. AssemblyLanguage is typeless as well. Having operations that can treat a value as a type does not make it typed, otherwise you're in the position of saying that only LambdaCalculus is typeless. Even then you could reduce it to the absurd argument that lambda calculus is typed because it understands exactly one type. Most of us are interested in a more meaningful and thus operational definition of typelessness.

Okay, I guess I will agree with you with regard to assembly. Perhaps we need to split typeless into subcategories for the case where built-in operations expect certain formats. The difference between high-level typelessness and low-level seems to be that the high level treats everything as independent strings, while the low level treats everything as a starting point in a continuous byte stream with little or no regard to borders. Many assembly languages do allow one to define the size of the target area (variables) and may do some checking based on this. In others, variables are merely an alias for an address point. In other words, in high-level languages independence is maintained. (I need to rework this wording. It seems long-winded.) -- top

Many fans of advanced static type systems do use the term "typeless" to refer to languages such as SmallTalk, which don't perform any type analysis at CompileTime. This practice is somewhat controversial, and frequently seen as pejorative by fans of dynamically typed languages.

If there is a hidden or side "flag" that indicates types even during run-time, I would call it "dynamic typing". If there are no such flags, then I call it "type free". There are also some in-between languages. For example, a hidden flag may indicate whether something is an array or scalar, but scalars are not broken down into types such as numbers or strings, depending on usage instead. -- top

I agree with Top, except that it seems to me that "type free" and "single typed" are synonymous in this context.

But the heart of the point is that "dynamically typed" languages, where values/variables/memory cells do in fact have type flags that are checked at run time, are importantly different from languages that check types neither at runtime nor at compile time (most assemblers, BCPL), yet the statically typed literature often does not distinguish the two. -- DougMerritt

"Check" can be a sticky LaynesLaw trap here. In a type-free (no flag) language or API, an operation may check the parameters. For example, if a language uses a plus sign for addition and say "&" for string concatenation, then in the expression "A + B", the operands are inspected to see that they are proper numbers; that is a string of digits that can be interpreted as a number. Some might call this "type checking", and thus a type-free language actually does have "types". However, it can equally well be viewed as "validation" and the distinction between validation and type-checking is fuzz galore. Do you really want to call RegularExpressions a "type checking" mechanism? -- top

Yes. Validation of function-inputs or variables to ensure they have certain invariant properties IS type-checking, regardless of when or how it is performed. Use of flags is a simple optimization to avoid performing checks where they are unnecessary or have been done before, but it's quite plausible to utilize straightforward and statically determined RegularExpressions as a type-checking mechanism. Dependent type-checking allows one to check arbitrary types, and constraint type-checking can test variable inputs relative to one another to ensure certain invariant conditions are met by the greater set of inputs. Validation is broader than type checking only in that it includes more than checking for invariant properties; e.g. whether a security certificate is valid is a variant property of that certificate. Similarly, whether a mutable ellipse is a circle is a variant property, and cannot be usefully type-checked... unless you can guarantee that the desired properties of the ellipse will remain invariant and intact for as long as you need them to be (e.g. if only the viewing agent is holding it, or if it is locked) because you can't otherwise guarantee that it will remain valid even mere nanoseconds after you finish the check. I suppose, though, that it's important to note that if you must perform explicit validation for a particular invariant property, then that level of type-safety is NOT a property of the language. You measure a language's type-safety by the protection that is provided implicitly.
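For concreteness, here is a sketch of the "A + B" operand inspection in a flag-free setting, using a regular expression as the check. The function names plus and ampersand, and the exact number pattern, are assumptions for illustration. Whether one calls the check "validation" or "type-checking" is the very thing being debated here; the code is the same either way.

```python
import re

# Assumed notion of "a proper number": optional sign, digits, optional fraction.
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")

def plus(a, b):
    """'+' in a flag-free language: both operands must read as numbers."""
    for operand in (a, b):
        if NUMBER.fullmatch(operand) is None:
            raise ValueError("not a number: " + repr(operand))
    result = float(a) + float(b)
    return str(int(result)) if result.is_integer() else str(result)

def ampersand(a, b):
    """'&' concatenates; any text is acceptable."""
    return a + b

print(plus("7", "3"))
print(ampersand("7", "3"))
try:
    plus("7", "three")
except ValueError as err:
    print("rejected:", err)
```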

When is "validation" type-checking and when is it not? I find that to be an overly-broad definition or characterization of type-checking because it would overlap with too many other concepts such that it becomes a watered-down phrase, in my opinion.

Validation of invariant properties of an object or value constitutes type-checking. Validation of variant properties of an object or value is not type-checking. Objects tend to have a great number of variant properties, while values can only have variant extrinsic properties (since values, themselves, are inherently invariant). E.g. if you validate that a certificate is still valid (by comparing the expiration date to the current date) that is a validation of a variant property - one that depends on the environment. Similarly, if you validate that a file exists, that's a variant property of the environment. The relevance of discriminating type-checking as being over invariant rather than variant properties becomes more obvious as you study type-systems that stretch the limits of what is computationally feasible such as dependent-typing systems (essentially arbitrary predication over one variable) and constraint-typing systems (which allow arbitrary predication over 2 or more variables). There are even those that perform advanced state analysis to ensure that certain mutable properties (e.g. whether an object still exists or whether that ellipse is a circle) are both true and remain invariant for some relevant duration.

Technically, there is one more major aspect to type-checking that should be mentioned (though I believe it implicit in the discussion above). Type-checking, in particular, is a validation of intent against approach, both of which are described in a language (usually a programming language). As a prerequisite to type-checking, you must have a pair: (description of intent, description of approach). It is only validation that this description of intent (which is invariant, being a description) is met by the description of approach (also invariant) that constitutes type-checking. As such, type-checking is validation of invariant properties of the program description. The description of intent may be implicit or explicit, inferred or manifest... but languages that cannot provide it (like BrainFuck) do not allow for any sort of typechecking.

Generally, of course, when people refer to type-checking they're referring to automated type-checking. Type-checking can be done by hand... and it's just as painful and error-prone as it sounds. However, that's something that might get you thinking... if you look at a programmer's comment on a procedure, then check that the procedure does what is specified in the comment, you've just become a human type-checker. It'd be nice if you could make the compiler check that your procedures do what your comments say they do, would it not? However, validating complex specification of intent against approach can be computationally infeasible... even undecidable. As a consequence, specification of intent is quite limited in most popular modern programming languages, usually consisting of "I intend that variable x refer to a valid value with properties Y" where the "properties Y" is an (often quite limited) type-descriptor, such as 'int' or a typename. I expect that, as we move towards the ProgrammingLanguageOfTheFuture, language designers will toss their hands into the air and say "hell with it; the programmers can learn what their massive CPUs can handle" and allow far, far more complex specifications of intent... even allowing full proof-of-correctness given a set of contractual assumptions. Even now, given the complex feats one can perform with (for example) the C++ template system, programmers are forcing the system to perform advanced checks and tests. Making that sort of thing convenient is where the real advantage lies.

Deep type-checking allows one to also verify some parts of the description of intent with other parts, allowing something of a fixed-point recursion... verifying the verifiers. If you read intent/approach as 'why/what', this is equivalent of providing 'why/why'... the intent of your intent-description. Since any description by a programmer is finite, there's ultimately an upper limit to regression. (Why this? because of X. Why X? because of Y. Why Y? Just because.) The trick here, though, is developing a means to verify one answer to 'why' with another. The answer, ultimately, is to make the second why a 'what'... every description of intent must also qualify as a description of approach such that why/why is the same as why/what, and that advanced program analysis is possible over the typing system, too.

Such things as validating that an input signal (e.g. from a wire or ethernet or user) does, in fact, represent a value of a particular type does NOT constitute type-checking. It is input validation; the set of signals received by a program is not an invariant of the program description. Such things as validating that a program passes certain unit-tests described outside the program does NOT constitute type-checking. That set is not part of the intent description within the program... though a language that allows unit tests to be described internally DOES allow for unit-tests as type-checking.

Validation and type-checking are still quite usefully distinct.

You seem to be suggesting that types must be invariant because non-invariant validation is not "types". But I don't think many would commit to this. If it is not a rule for types themselves, why would it be a rule just for validation being a type? -- top

Types are invariant because whole and partial descriptions of intent are invariant. Descriptions of intent are invariant because all descriptions are invariant. All descriptions are invariant because descriptions are values in a language, and all values are immutable. All immutable things are invariant. It's a rather intrinsic property. You can, however, describe some rather wild things... like private data isn't sent to untrusted service or integer-variable never decreases or object still live over duration of this call. All of these are invariant properties... one describes temporal invariance (integer-variable never decreases), and one describes temporary invariance (object liveness over specified period), but they are all invariant properties. Further, all of these can be checked statically... but some are much easier to check at runtime.

Validation is neither variant nor invariant... validation is a computational process. It can be a static ('compile-time') process or a dynamic ('run-time') process, but either way it is a process that will take time, energy, money, etc. The result of validation of invariant intent-description on an invariant approach-description will itself be invariant whether it be performed at compile-time or runtime; indeed, in the event everything is well typed, type-checking should not affect runtime behavior at all excepting any runtime computational costs (time, memory, energy). Type-checking will only interfere when something isn't well typed... e.g. causing a compile-time error or runtime exception. Some descriptions of intent are far more computationally feasible to validate at runtime, especially in loosely coupled programs... e.g. dependent typing to ensure that an integer-typed variable carries a prime integer value (e.g. if(!isprime(x)) throw runtime-exception; or assert(isprime(x)); or if(!isprime(x)) handle_error(x); else do_whatever(x);) is vastly easier to check at runtime. Same with checking for object liveness, and pointers being not-NULL... though these can be checked statically. When a programming language possesses a Turing-complete approach-description language but lacks a powerful intent-description language, it is often much, much easier to describe advanced programmer-intent in terms of explicit runtime checks. It's unfortunate that nothing automatically verifies the verification... but you must make do with what's available.
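The inline isprime fragments above can be spelled out as a runnable sketch. The names is_prime, require_prime and do_whatever are illustrative, and trial division is just the simplest check that works, not a recommendation. The check guards an immutable integer, so the property cannot change between the check and the use.

```python
def is_prime(n):
    """Trial division; adequate for a demonstration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def require_prime(x):
    """Explicit runtime check standing in for a 'prime integer' type."""
    if not is_prime(x):
        raise TypeError(str(x) + " is not a prime integer")
    return x

def do_whatever(x):
    p = require_prime(x)   # intent: x carries a prime value
    return p * 2           # approach: code that relies on that intent

print(do_whatever(7))
try:
    do_whatever(8)
except TypeError as err:
    print("rejected:", err)
```

As the paragraph says, nothing verifies the verifier: if is_prime were wrong, no tool would notice.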

Communications validation cannot be type-checking because it was not performed over anything within the program description. It's just plain ol' communications validation. Well... it could be considered type-checking of the value represented in the signal in much the same way that a compiled program is typechecked by another process that reads (as input) and validates it. But it is not type-checking of the program or of properties that should exist within the program. Consider it this way: a communications input validation failure is NOT a type error... at least not by the programmer of the component that is accepting the input. The provider of the input is at fault; no agent can control its inputs from other independent agents.

Validation of variant properties cannot be type-checking unless you can show those properties will hold (i.e. be invariant) for some meaningful duration after the check has occurred. E.g. if you check that an ellipse happens to be a circle, it's a validation, but it's meaningless for type-safety unless you can also guarantee that it will stay that way until you no longer require it. If you cannot guarantee that invariance, then you have not validated the intent against the approach; the change could occur mere microseconds after you completed the check. You'd have wasted your time and energy... and gained nothing for safety purposes. If type-checking is to be meaningful, you must be able to prove that the properties in question are (at least temporarily) invariant.

I haven't a clue what you meant by "a rule just for validation being a type". Validation is a process and a type is a description of properties on part of a program. They aren't at all the same... type.

Perhaps it is time for an illustration. Re: "Types are invariant". Suppose we had a dynamic language with smalltalk-like behavior in which we assume a type "coordinate" which has an X and Y attribute:

  // example Coord-01
  ...
  foo = new object;
  if (glob) {   // is glob defined
    foo.x = 7;
    foo.y = glob.aNumber;
  }
  showCoords(foo);
  ...
  function showCoords(obj) {
    print("X=%, Y=%", obj.x, obj.y);
  }

"foo" is not "invariant". Here, the type "coordinate" is "mental" in that we don't declare a type of "coordinate" in the language, it is implied by the programmer. (Even if we do have to declare it, some dynamic languages can make it happen at run-time.) If you are excluding our coordinate from being a "type", then one must say that some forms of dynamic typing are not typing at all. Note that we could validate the existence of attributes:

  function showCoords(obj) {
    if (!hasAttrib(obj,'x') || !hasAttrib(obj,'y')) {   // check the parameter, not the global "foo"
      throwError("not a coordinate: ",objectDump(obj));
    }
    print("X=%, Y=%", obj.x, obj.y);
  }

In your example, the type coordinate is invariant. You can also show that the necessary properties of "foo" remain invariant for the duration of the call. The structural types here are as follows: (1) procedure showCoords requires that the input have attribute-properties syntactically accessible as ".x" and ".y". (2) Attribute-properties ".x" and ".y" of obj must also be of a type accepted as inputs by 'print'. In summary: (object with printable ".x" and ".y") is the required input type to procedure showCoords. Indeed, you described it yourself: assume a type "coordinate" which has an X and Y attribute.

Anyhow, the type of function showCoords is invariant: (object with ".x" and ".y") -> (procedure returning no value). That is the implicit intent of the programmer. As a type, it can be validated statically, checked dynamically, or be not verified at all. For static checking, since no explicit description of this intent was offered by the programmer, type inference is required: the compiler must infer what the programmer implied.

By use of explicit description, you could say something like:

   type Coordinate describes object with printable ".x" and ".y"
   class coord_object (...)
       -- assume constructor of 'coord_object' class creates object and adds printable ".x" and ".y" before
       -- returning, and blocks (dynamically or at compile-time) their removal or alteration to non-printable
   Coordinate foo_error = new object; -- ERROR! new objects are not coordinates; they lack ".x" and ".y"
   Coordinate foo_good  = new coord_object()
   foo = new object;  foo.x = 42;  foo.y = "Answer to life, universe, everything";
   function showCoords(Coordinate obj) {...}
   pass_to_malicious_agent(foo_good)  -- assume foo can't be deleted while at least one ref remains
   showCoords(foo_good)
   showCoords(foo)
   remove foo.x
   showCoords(foo) -- ERROR!
   foo.x = "Hello, World!"
   pass_to_malicious_agent(foo)
   showCoords(foo) -- ERROR if malicious agent eliminates foo's 'Coordinate' property before or within duration of call

A powerful static typechecker for a flexible language would pass this program except where ERRORs are marked. It wouldn't even need to maintain dynamic types, though it could. In this case, the type associated with Coordinate is made a bit more obvious. One can note that coord_object IS a Coordinate despite lacking any inheritance, delegation, and similar crap. For its first pass to showCoords, foo (despite being a plain object) is ALSO a Coordinate; it is easy to show that foo has printable ".x" and ".y" coords. However, after passing foo to an agent that can (and probably will) remove foo.x and cause other such havoc, the typechecker would refuse... just as if you had done it yourself. OTOH, if the static typechecker could prove that the malicious agent WON'T mess with the '.x', or at least won't get to it before the 'showCoords' call completes, then it'd be okay.
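Python's typing module happens to offer a rough analogue of "object with '.x' and '.y'", which may help make the structural idea concrete. This is only a sketch: the check below runs at call time rather than statically, it tests only that the attributes are present (not that they are "printable"), and the names Coordinate, PlainObject and show_coords are chosen to mirror the example rather than taken from any real library.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Coordinate(Protocol):
    """Structural description: anything that has attributes x and y."""
    x: object
    y: object

class PlainObject:
    """Stands in for 'new object'; it declares nothing about coordinates."""
    pass

def show_coords(obj):
    if not isinstance(obj, Coordinate):   # presence check only
        raise TypeError("not a coordinate")
    return "X=%s, Y=%s" % (obj.x, obj.y)

foo = PlainObject()
foo.x = 42
foo.y = "Answer to life, universe, everything"
print(show_coords(foo))      # foo counts as a Coordinate without inheritance

del foo.x                    # remove foo.x
try:
    show_coords(foo)
except TypeError as err:
    print("ERROR:", err)
```

Note that PlainObject never mentions Coordinate; as in the example above, membership is decided by what the object has, not by what it was declared to be.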

You made an implicit "type" into an explicit one. This seems to come back to the old problem of WhatAreTypes. Your description implies types are a mental thing, not something explicitly in the program. Type "Coordinate" is in your head, not an abstraction explicitly tracked and managed by the program. In example Coord-01 there is nothing programmatic declaring or marking "type" coordinate off as a special thing. To the interpreter it's just a function that prints array elements (x and y). Thus, we are slipping into Subjectivity Land, where endless HolyWars lurk, such as the difference between "types" and input validation.

If you can say what 'Coordinate' is in a comment, you can probably say what it is in an explicit declaration... at least if your type system is expressive enough. Even without an explicit declaration, a good typechecker is capable of learning at least the syntactic 'features' required by parameters and guaranteeing that these features are met by the program (aka TypeInference) - and is thus capable of creating a type consistent with what you're calling 'Coordinate' and that, therefore, IS what you're calling 'Coordinate'. And the rest of your claim is not too sensible. Types and values and words are all abstractions, and all 'abstractions' are in your head. All 'programs' are mental things. But I don't see how that leads necessarily to slipping into this "Subjectivity Land" you describe; please expand on that claim of yours.

It's just a function, like any other function. Are you saying all functions are "types"? If not, which functions are types and which are not? And where did this invariant rule come from? Why should "types" have anything to do with invariance?

Rather than a general philosophy debate again, to focus on something more specific for now, how does "type" differ from "validation" in this case? When is validation not type checking and vice versa?

And here I thought you would "gladly supply details if [I] ask about a given passage". Oh well.

"Type" is never validation. A particular "Type" is an invariant abstraction. This is necessarily true because any language expression that leads to an understanding of an operating definition of a particular "type" is necessarily invariant.
"Type-checking" is a form of validation over language expressions (usually programs or program-fragments). It attempts to determine "TypeSafety" - that the program possesses the properties declared, implied, or otherwise necessary for running without forbidden errors. In the same sense that Humans are never NOT animals, Type checking is never NOT validation.
Loosely speaking, a "Validation" is any process that takes an object and returns a pass/fail or true/false without changing the object (to the degree that this is possible). E.g. one can test whether water is safe to drink, whether the exhaust from a car is within legal limits, or that a program-fragment is consistent with its own declared properties, and these would all be "validation".
There are forms of validation that are not TypeChecking (at least not as the word is commonly used). Most clearly:
- (a) any validation that does not attempt to determine TypeSafety is not TypeChecking
- (b) any validation that is not over language expressions, programs, or program fragments is not TypeChecking
Validation of variant properties (e.g. that a mutable Ellipse object currently qualifies as a circle) fails on (a) unless one can guarantee that the property remains invariant for long enough for the operations that assume the feature to complete. (TypeSafety is easily violated if the property changes.)
Validation of input may fail on either or both (a) and (b).
- Failing on (a): When one says "Validation of input" there is an implied "for property X". This 'X' will often be a test for whether the input is safe to utilize in further processing without forbidden errors (consistent with a TypeSafety check), but it may be a number of other things - tests for properties unrelated to TypeSafety. Security checks, for example: verification that a PKI 'signature' is valid, or that the certificate used to sign the document hasn't been revoked.
- Failing on (b): In particular, there is no guarantee that what qualifies as valid input is just dependent upon language-expressions. Input validation can depend upon state, time, and a number of other things.

Applying the above seems to require first defining and agreeing on the definitions of "type" and "type-safety", which would probably bring us back to definition debates that we've both suffered through already. I am not willing to revisit those at this time.

Despite the 'seeming', your agreement is not required. "TypeSafety" possesses a very formal, mathematical definition that is in use above (and that does not require a definition for "type"). Arguing about "type" can only affect the meaning of the first bullet, but almost certainly won't affect its truth: any particular type used in a system must be described, and descriptions are values, and values are immutable.

Based on the material in TypeSafe, it appears that the definition is indeed heavily tied to "type". Also note that we don't have to verify the existence of the X and Y indexes/attributes in our coordinate example to use them. (The program might crash if they are missing.) I'm working with an "address" table as we speak that has a slot for latitude and longitude. Whether one calls that a "coordinate type" is purely in the head. They're just attributes.

I see that appearances are quite deceiving to the layman. The strict definition of 'TypeSafe' relies only upon the definition of 'invalid operation', not 'type'. The errors found in TypeChecking are only called 'type errors' because they violate rules in the 'type safety system'. That's nominal reference, not dependency. It is more likely that a proper, formal definition of 'type' in computer science depends upon the accepted definitions of 'TypeSafety' and 'TypeChecking' than vice versa. Nominal reference creates a sometimes unintuitive inversion of conceptual dependency among linguistic phrases.

TypeSafety is all about those 'invalid operations' - making sure that the program doesn't crash or otherwise act with undefined results. The more expressive type-systems extend that to preventing semantically invalid operations in addition to the operationally undefined actions (e.g. preventing you from accidentally adding a distance to a volume). In the above examples, strong TypeChecking would make a minimal guarantee that the X and Y attributes aren't missing (and won't disappear mid-operation), and strong static typing would make this guarantee at compile-time. But in a sense the 'type' - the abstraction describing anything that possesses 'X' and 'Y' attributes - may be 'in the mind' whether or not you check for it.

So any function or method that can throw an error based on analysis/computation of input parameters creates a "type"?

Not that broad, no - you leap too far in the right direction. There are some restrictions: Like all validation processes, TypeChecking is intended to be side-effect free (or at least as free as possible), which excludes a rather large class of 'functions and methods' - in particular, all those with notable side-effects. Side-effect free functions that take input(s) and return pass/fail wouldn't be far off from 'type'. However, there is an additional gotcha in the form of time-of-check-to-time-of-use (TOCTTOU) race conditions: any guarantee of TypeSafety requires a guarantee that the 'type' won't change between checking it and using it. This has the effect that not all side-effect free functions will qualify as types. E.g. if two threads can hold references to the same ellipse object and manipulate its attributes, one could not safely treat as a type a side-effect free function that takes as input the ellipse-object and returns whether it is currently a circle. The theoretical outer limit for what can qualify as a 'type' over input parameters, based only on the constraints necessarily imposed by TypeChecking for TypeSafety, is: any side-effect free function that returns pass/fail over input(s) where a guarantee can be made that TOCTTOU errors are avoided or prevented. Avoidance of TOCTTOU can be supported by the language (e.g. preventing you from removing 'X' and 'Y' attributes from an object being treated in a function as a 'coordinate'). Many of the differences between DynamicTyping and StaticTyping can be attributed to the different distances between time-of-check and time-of-use; avoiding TOCTTOU in static type systems tends to require much more language support.
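The ellipse case can be sketched in a few lines of Python. This is only an illustration under assumptions: the names Ellipse, FrozenEllipse, is_circle and demo are invented here, and the second thread is simulated by a plain mutation between the check and the use.

```python
from dataclasses import dataclass


@dataclass
class Ellipse:
    """Mutable ellipse: any holder of a reference may change the radii."""
    rx: float
    ry: float


@dataclass(frozen=True)
class FrozenEllipse:
    """Immutable ellipse: attribute assignment raises an error."""
    rx: float
    ry: float


def is_circle(e):
    """Side-effect free predicate, but its answer depends on mutable state."""
    return e.rx == e.ry


def demo():
    e = Ellipse(2.0, 2.0)
    at_check = is_circle(e)   # time of check: passes
    e.ry = 3.0                # stands in for another thread mutating e
    at_use = is_circle(e)     # time of use: the earlier check no longer holds
    return at_check, at_use
```

With the mutable class, demo() returns (True, False), so is_circle cannot safely serve as a 'type' for Ellipse. With FrozenEllipse the mutation is rejected, which is one way a language or library can rule out the TOCTTOU gap.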

A type system over inputs can't allow more than that, but type systems can easily be less expressive than possible. I don't know of any type system that in practice includes all 'side-effect free', but I have heard of a few type systems that allow all pure functions that return true/false as types. (The relevant difference being: 'Side-effect free' can still depend on side-effects, but can't cause them. 'Pure' functions are referentially transparent - they can't even depend on side-effects... making them much more readily utilized with StaticTyping and SoftTyping.) Most type systems in use today are less expressive: type systems that allow arbitrary predicates are undecidable even at runtime, and many language designers choose decidability at the cost of expressiveness.
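To make the 'pure' versus 'side-effect free' distinction concrete, here is a rough Python sketch. The names is_even, within_limit, limit and require are hypothetical and only serve the illustration; no real type system is being reproduced.

```python
def is_even(n):
    """Pure: the result depends only on the argument."""
    return n % 2 == 0


# Mutable state that some other part of the program may change.
limit = {"max": 10}


def within_limit(n):
    """Side-effect free (it changes nothing), but not pure: it reads mutable state."""
    return n <= limit["max"]


def require(pred, value):
    """Treat a pass/fail predicate as a run-time checked 'type'."""
    if not pred(value):
        raise TypeError(f"{value!r} fails {pred.__name__}")
    return value
```

require(is_even, 4) always gives the same answer for the same input, so it could in principle be checked ahead of time. within_limit(7) flips from True to False once limit["max"] is set to 5, so a check made earlier says nothing reliable about later use.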

And while I dislike confusing the issue, I should note that the above only describes the limit of types for values, parameters, inputs, variables and the like. Identifying 'invalid operation' in type systems can require analysis/computation over more than just 'input parameters', thus more than the above can be 'type'. A casual example would be a type system that makes it an 'invalid operation' (a TypeSafety violation) to pass higher-security data into a lower-security environment. Protocol types, uniqueness types, linear and substructural types, etc. are all in this flavor.

You are getting increasingly obtuse and distant. Rats, you sucked me into another type definition debate. Will.....resist.....replying. -- top

I'm not debating terms, simply explaining them. You know little of the jargon and formal study of type systems. If you feel I'm unclear, ask specific questions masterfully directed to clarify things for you rather than broad, retaliatory questions that possess answers that will just confuse you further. And don't insult or whine; at the moment you have at your fingertips a resource for gaining a greater understanding of a topic that apparently interests you, but that won't survive much abuse.

If you want anybody besides the 8 typing literature savants of the world to know what you are talking about, you better find a way to clarify it. People almost ignored DrCodd at first because he was not a good communicator. Fortunately, he kept at it and others eventually deciphered his stuff.

Perhaps I'm not a good communicator - the vast majority of my training is in talking to machines, though I have written fictions that were very well received. I believe at least half the problem is you - you've a tendency to ask broad questions as though in retaliation or as some sort of attack. Then you complain, repeatedly, when the answer you receive confuses you. It confuses you mostly because you ask questions for which you aren't ready to hear the answers - YouCantLearnSomethingUntilYouAlreadyAlmostKnowIt. If you want better answers, ask better questions. It doesn't take a type-literature savant to understand what I'm talking about, but it certainly takes someone with more than your level of self-education in automated theorem proving, type systems, or programming language design to really grasp how everything fits together without much effort (because they already almost know it). There are a couple other options if you aren't willing to ask smarter questions: shall I continue treating you like a peer and offering with each answer full explanations that inevitably confuse you, or shall I treat you as a child or layman and offer simple answers that will sound authoritarian?

That was really well written and interesting. The guy asking questions was a clear troll. -- MattJohns?

I am offended by that accusation and demand a fair trial. The criteria for vetting academic claims established in BookStop PageAnchor "Vetting" are fair in my opinion for the reasons stated. If you disagree, then so be it. Vocabulary is not necessarily determined by academic theorists, and quite often is not. I am using a practitioner's definition of "type", which is a "side-flag" to a variable's value. (There may be other practitioner definitions. I am not claiming a monopoly.) If C2 is *only* for academics, then perhaps your accusations would have merit. But it ain't. Further, I suggest we move this vocab debate to another topic to keep this topic clean, or at least not mostly about vocab. -- top

[If you want to keep this topic clean, why not use "side-flag" to discuss side-flags? Why, instead, do you fight a battle over definitions with a group of people who understand "type" to mean something other than what you want it to mean? Why do you play HumptyDumpty and insist we remember to mentally translate: "type" when said by TopMind means "side-flag to a variable's value"? Are you really that arrogant, to believe everyone else should learn your habits in order to accommodate your abuse of language? I feel you equivocate: you redefine types then make conclusions about "types" even though your conclusions are not compatible with what ComputerScience describes by use of "type".]

You stubborn academic C2 regulars are just trying to gang up on me because you are trying to justify your existence. HumptyDumpty assumes I am the minority. However, 99% of all working developers don't give a flying sh8t about the silly academic definition, because it's nearly useless to a practitioner (and requires tons of Sherlocking to uncover "intent" which could be ANYTHING). You provide a formal survey comparing side-flags with "intent to classify" if you claim my definition is enough of a minority to cut. Otherwise, the original stands. Kill me with real rules, not silly intimidation games.

[Sorry, but you are in the minority, TopMind. "99% of all working developers don't give a flying sh8t about" your definition, either, and that leaves you vs. the other 1% of developers and scientists... that is: you against tens of thousands, including you versus the people who write books that teach 'type' theory to those who study ComputerScience and wish to understand them. You are playing HumptyDumpty, you are equivocating, and based on your SelfStandingEvidenceDiscussion response here, it is very clear that you are trolling.]

That's not a survey, that's your (biased) personal opinion. Stop voting yourself center of the universe. Practitioners would laugh their arse off if they knew the silly "intent to classify" scam pulled by the academics. It's bullshit in mathy clothing. The jury is still out on whether it's a UselessTruth or a useless-lie. Either way, it's still useless.

[Practitioners won't "laugh their arse off" either way because, as you said earlier, they don't care.]

But if forced to face your "intent" shit, they would laugh. You should be ashamed to back such a RubeGoldberg definition.

[Some people will scoff at anything until they study it. You should be ashamed for being a closed-minded bigot with the arrogance and naivety to think he has the right to redefine ComputerScience to suit himself.]

"Computer science" neither invented the term nor does it have a blank check to re-define it in its own way. The fact that you don't know these shows you are missing the basics. You are mis-educated. Either that, or you are so biased that you find a way to pump yourself up by claiming your field is God. Your assumption is stupid. -- top

[English is a living language with technical jargons to support particular domains. Anybody, including you, has the ability to define a term for use in ComputerScience and programming... but doing so is not free: you need to convince other people to use the term the same way you do, and this takes time, communication, and respect - especially if there is a competing definition for the same term that people already accept. Of time, communication, and respect, you have time in abundance, but you waste communication by focusing it on a few people in C2 who really, really don't respect you. And you know it. Take a serious look at how far your approach has taken you, then decide who is 'stupid': the 'stubborn academic C2 regulars' who don't budge to your will when it comes to definitions, or you, for fighting a battle that everyone knows you will lose.]

[And guess what: those who do decide they care usually have the sense to study what has been written about types rather than listening to hand-waving sophistry from persons such as yourself. (Not that you have such sense... you just scream 'BookStop!' like a deranged troll.)]

Your type definition is deranged. A definition that requires 500 pages is fatally flawed (and still as open ended as a constipated elephant's asshole despite 500).

[I've not stated a definition that takes 500 pages, but I have noticed that you don't hesitate to take a short definition and spend 500 pages waving your hands and complaining about problems that don't exist.]

Don't exist? Bah! It's too open-ended to even test. It out humpty's Mr. Dumpty. Tied to "intent"? Pfffft. (Related: WhatIsIntent)

[This was already countered on other pages. "intent" is just as testable as everything else we ever bother to test, which makes it good enough for science.]

Sorry, I missed it. Use links please. It is only testable in the "behavioral prediction" sense, which also applies to beauty and favorite songs.

[You deny it, but you've already responded to it (with more of your usual hand-waving and fallacy) so you can't have 'missed' it. It seems you're attempting to start the argument up again here with unsubstantiated, objective claims like "it is only testable in..." for which I should simply demand proof. But this page is the wrong place for it.]

[And regarding your ridiculous belief that people should present evidence to "cut" your definition when you don't have any evidence to support it? I'm sorry, but that sort of ShiftingTheBurdenOfProof will have academics and scientists of all sorts (not just ComputerScience) laughing their asses off at you. Fringe opinions and personal definitions are cut by default. If you want other people to follow, it is YOUR job to provide evidence, funding, or at least enough charisma to have a cult following your sophist nonsense. That's how this world is. And despite that you've voted yourself center of this universe, it seems nobody else agrees. The only survey we have of your 'side-flags' opinion (in TypesAreSideFlags) came essentially to the conclusion that you really haven't a clue what you're talking about. But don't take my word for it. Please. Go to another forum to spout your ridiculous views, so they can reject you over there and we don't have to deal with you at WikiWiki.]

JohnReynolds's type definition is a WalledGarden. Stop over-magnifying his place in the universe. He probably meant his work to be mostly about techniques, not definitions anyhow. You just magnify what you want to magnify. Academia is NOT the default. I hope my tax money is not backing your convoluted arrogant MentalMasturbation type shit. Shut down that place and board up the windows or turn it into something useful, like a garbage sorting warehouse. They know what "classification" is.

[Despite your rage against the machine, you are wrong: When it comes to academic subjects like ComputerScience and ProgrammingLanguageTheory and TypeTheory, Academia IS the default source for definitions.]

Bullshit! Claiming that does not make it so.

[I do not claim it in order to make it so. I claim it because it is so. Whether you admit to it or not is irrelevant to everyone but you and me.]

[And when practitioners decide to learn, guess where they go: to books, and articles, written by or translated from interaction with academics. Same goes for medicine. If you don't like it, you're stuffed. Live with it, or rage on like an idiot.]

It's good to study techniques sometimes. Academia *can* be a source of definitions, but has no fiat. Definitions are usually determined by popular (quantity) usage.

[Popular usage of a term among laymen can influence use of a term by professionals. Academia often adapts and refines terms from related subjects. But you cannot use the word 'type' as it is used by biologists (a taxon) or data entry personnel (a verb related to keyboards) and expect much credibility or respect on a ComputerScience site like C2. Nor can you use the word as it is used by laymen. Popularity matters... but only popularity among people in the same field.]

It is NOT solely or mostly a "computer science" site. You are projecting your personal desires into it. You have exposed your bias with that statement. Your internal desire to turn it into one has made you overly aggressive. --top

[The PeopleProjectsAndPatterns discussed in WikiWiki are all software development and CS related. I can say that WikiWiki is not an IT site. You won't find HowTo pages for installing ActiveServerPages on your Windows box. Which other sciences and studies do you believe are represented here, if not ComputerScience?]

Software development is not CS. If you want to hard-limit C2 to just computer science, then you need to make a stronger justification. (CS is not even CS when there is nothing to test against reality. The whole term needs a rethink.)

[Software development is part of CS curricula, Top.]

That does not contradict anything I've said.

[If you have problems with that grouping then I'd be happy to modify my statement to "Software Engineering and Computer Science site like C2". Software development classes, including education in patterns for algorithms, are part of CS curricula needed to graduate the schools I've observed, as opposed to IT classes which (in the four schools I've observed as data points) teach how to use computers and how to use specific software like spreadsheets and webservers and oracle databases. And unless you mean to imply "there is nothing to test against reality" is a common occurrence (a claim that this person, who tested most CS models in a lab during schooling, doesn't believe), then your "CS is not even CS when..." statement is hardly justified.]

Outside of performance and machine resources, current CS has provided almost no scientific help in determining which paradigms or languages are "better". It appears there are simply too many variables to apply the scientific process to.

Let's continue this in TooManyVariablesForScience.

(remaining contents shifted to aforementioned location)

Is the dispute above only about the topic title? If so, perhaps we can work out a mutual agreement. Please supply proposal names if so. I'll start with one. --top

TypeFlagsInDynamicTyping?

Testing For Flags

Here is a non-exhaustive list on how to tell whether a language uses internal type side-flags or not.

Flag-based languages usually have a function such as "typeName" that produces a single answer. Flag-free languages don't. However, both kinds may have functions such as "isNumber" or "isDate" that tells whether a variable can be interpreted as a number, regardless of any flags attached to it.
Lack of operator overloading. Flag-free languages tend to shun operator overloading. (This is merely a clue rather than a guaranteed way.)
Flag-free languages usually don't allow type declarations, such as "int", "long", etc. But, the opposite is not necessarily true. JavaScript is an example: you cannot explicitly declare a type, but it acts as if it has a flag.
If you cannot find any instance where a variable that "prints" the same can act differently in other contexts, then it may be flag-free. Printing the same but acting differently elsewhere implies that there is "extra" or "hidden" information in the variable that does not show up in a Print statement. This extra or hidden information is what is called a "flag" here (although "tag" may be more common in the industry.) Example:

   a = 123;
   b = "123";
   printLn(a + b);
   printLn(b + a);

Flag-based language output:

   246
   123123

Flag-free language:

   246
   246

In the first result, the interpretation of "+" depends on the type flag of the first argument. If the first argument is a number, then many flag-based languages will use that as the basis for determining how to interpret the "+". Of course, this is not a perfect test because of the different ways to implement "+". Above we could also swap the quotes between the "a" and "b" assignments to see if the quotes make any difference. In flag-free languages they usually wouldn't because the existence of the quotes does not leave any indication behind when the variable's value is stored. However, there are exceptions such as:

  a = "0123";
  b = 0123;

Here, the variable "a" would have "0123" stored as the value, but "b" would potentially have just "123" stored. This is because the parser knows that the value in the language code text is a number because it has no quotes. It thus may discard the leading zero. It could be said that the parser may use a temporary "flag" up to the point just before the value storage in RAM. But after storage the flag is discarded.
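The a + b test and the leading-zero exception above can be mimicked with a small Python sketch. It is a toy model, not how any particular interpreter works: flag_based_add, flag_free_add, store_literal and to_number are names made up for this illustration, and the flag-based rule (dispatch on the first operand) is just one of the possible ways to implement "+".

```python
def to_number(text):
    """Parse text as a number; raises ValueError if it does not look like one."""
    n = float(text)
    return int(n) if n.is_integer() else n


def flag_based_add(x, y):
    """'+' consults the hidden type flag of the first operand."""
    if isinstance(x, (int, float)):
        return x + to_number(str(y))   # numeric first operand: add
    return str(x) + str(y)             # string first operand: concatenate


def flag_free_add(x, y):
    """No flags: both values are just text, and '+' always parses them as numbers."""
    return str(to_number(str(x)) + to_number(str(y)))


def store_literal(token, quoted):
    """The parser sees the quotes, but only the resulting text is stored."""
    if quoted:
        return token                   # "0123" is kept verbatim
    return str(to_number(token))       # 0123 is normalized; leading zero is dropped


a = 123
b = "123"
print(flag_based_add(a, b))   # 246
print(flag_based_add(b, a))   # 123123
print(flag_free_add(a, b))    # 246
print(flag_free_add(b, a))    # 246
```

In this model store_literal("0123", quoted=True) keeps "0123" while store_literal("0123", quoted=False) stores "123": the quotes matter at parse time only, and nothing about them survives in the stored value.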

Details on how to experiment with dynamic type behavior to better understand it or test "flag" models are covered under EmpiricalTypeBehaviorAnalysis.

-- top

Partial Emulation of Flag/Tag-Free Typing

"Typeless" typing can also be done by ignoring the "type tags", along the lines of type "variant" in VB or "object" in Java. In other words, one can more or less emulate tag-free variables using a generic or filler type tag. Some languages or language libraries are designed with such usage in mind, while others expect explicit types (other than "variant") in some or all cases. The DEGREE to which they support pretending like types don't exist can vary across the spectrum. Thus, it is not black-and-white.

The fact that a variant (the one I've worked with) can hold multiple types of things, means it is multi-type, not typeless. I'd like to see you implement a type-less math, where there weren't any different types of numbers in math. Would this help math, or make it incomprehensible disorganized gibberish? Are you seriously suggesting we don't distinguish between number types and how will this possibly help humanity? It's like removing procedures from procedural languages and going back to coding without any subroutines.. just a long list of code without any procedures. That would make the language simpler, but why would you do it?

There are different approaches to handling "variant" or "object" types. One approach is to have "variant" mean that the tag is tracked and change-able during run-time. Another is that it has a CF-like typing system (ColdFusionLanguageTypeSystem) where the values are carried as strings (or act like they are), and are only parsed into other types (numbers, dates, etc) if an operator has been asked to perform conversion/parsing to extract or process the value as a number etc. Thus, we have the run-time tag approach and the parse-as-needed approach. -t
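A rough Python sketch of the two approaches, for comparison. TaggedVariant and StringVariant are hypothetical names; neither is meant to mirror the actual internals of VB, Java, or ColdFusion.

```python
class TaggedVariant:
    """Run-time tag approach: the value carries a tag that changes on assignment."""

    def __init__(self, value):
        self.assign(value)

    def assign(self, value):
        self.value = value
        self.tag = type(value).__name__   # the hidden side-flag


class StringVariant:
    """Parse-as-needed approach: only text is stored; operators parse on demand."""

    def __init__(self, value):
        self.text = str(value)            # no tag survives storage

    def as_number(self):
        return float(self.text)           # raises ValueError if not numeric


t = TaggedVariant(7)
print(t.tag)            # int
t.assign("7")
print(t.tag)            # str

s1 = StringVariant(7)
s2 = StringVariant("7")
print(s1.text == s2.text)   # True: the two are indistinguishable once stored
```

Under the first model 7 and "7" remain distinguishable after storage; under the second they do not, and any difference in behavior has to come from the operator applied, not from the variable.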
In terms of your "math humanity" statements, I have no idea what you are talking about. As written on paper, generally the mathematician has to track what "type" a given variable is in their head. The "side tag" is thus kept in their head and not on paper (at least not "with" a given variable occurrence). -t

A language without type tags generally won't have this problem because library writers cannot test the tags to require explicit types simply because the language lacks tags on variables: you can only analyze the value. If you want to test it to make sure "it's a number", then a function that does parsing under the hood would generally be used. It is more akin to validation than type analysis. -top
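As a sketch of what such a validation function might look like, assuming values are carried as text (is_number is a made-up name, and Python's float() is used as the stand-in parser, so it also accepts forms such as "1e5" or "inf"):

```python
def is_number(value):
    """Validation by parsing: asks 'can this text be read as a number?'."""
    try:
        float(str(value).strip())
        return True
    except ValueError:
        return False


print(is_number("123"))     # True
print(is_number("12.5"))    # True
print(is_number("12abc"))   # False
print(is_number(""))        # False
```

Note that the function never asks what the variable "is"; it only inspects the value, so 123 and "123" get the same answer.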

This essay would benefit from greater clarity, depth, and much more investigation into the mechanics of actual language implementations. I award it a D. Please see me in my office if you'd like additional feedback or advice on how to develop your study skills.

I would note it's possible to mirror the behavior of a given language implementation using very different internal implementations. Thus, while knowing how "the guts" work is useful knowledge, it is not necessarily the only possible model or only possible implementation for a given language. -t

That is correct.

Often the model in one's head almost has to be some kind of "mechanical" model. It's difficult to grok the intricacies of types without some kind of physical or concrete representation such that operators have something explicit and clear-cut to operate on while simulating them in one's head. If you can offer a better head model that's easy to explain what's happening to a wide variety of programmers, be my guest...

As far as how to "fix" the topic text, suggestions are welcome. If you have a specific question, ask away.

WTF?°n° What's all this page about? Seems like an extreme example for RefactorMe!

The bulk of the discussion is about two things:

WhatAreTypes (OP should read about that, and reformulate accordingly)
What does it mean to have no types at all (same thing).

And with respect to the latter, most contradictions come from the lack of distinction between:

Type of the variable ("foo refers to a string").
Type of the data ("this memory cell stores a string").

If you distinguish those two concepts, most of this is meaningless. The distinction between types clearly goes somewhere. If it's not annotated somewhere, then it's implicit.

Let's say you work in assembler: You're doing the right operations on the right raw bytes

 add register1 register2 # this will work because I stored two integers... I'm a genius!

then your types are in your head, and the validation is checking yourself ("What did I put in this address??").
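The same point can be shown outside assembler with Python's struct module. This is only an illustrative sketch: the four raw bytes carry no flag, and the "type" is whichever format code the programmer chooses to read them with.

```python
import struct

# Four raw bytes; nothing in them says "integer" or "float".
raw = struct.pack("<i", 0x3F800000)

as_int = struct.unpack("<i", raw)[0]     # read as a 32-bit signed integer
as_float = struct.unpack("<f", raw)[0]   # read the same bytes as a 32-bit float

print(as_int)     # 1065353216
print(as_float)   # 1.0
```

Both readings are "valid" as far as the machine is concerned; only the programmer knows which one was intended.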

Absolutely.

{Types are kind of a Rorschach test of how one models programming, programming languages, and the "ideal system" in their head. There is no easy answer because there are different ways to implement and talk about and define types and none are clearly wrong in all cases. Pondering types will drive you mad, until you come to the final realization that Types==Madness, and the puzzle is finally solved and then you die. --top}

Speak for yourself. Not all of us have that problem.

{Some have a problem communicating clearly what their interpretation of "types" is. Some of the most insistent WikiZens are also some of the worst communicators/writers I've ever encountered, to be honest. So much "education" (in something) yet you walked out of the institution with such sorry writing skills. Seriously, take some writing courses before honking your loud, convoluted horn on "types" etc.}

If you try to use "the most insistent WikiZens" as a substitute for introductory ComputerScience and SoftwareEngineering textbooks, it's not going to work. The "most insistent WikiZens" aren't here to educate you. They're here for other purposes, and sometimes highlight your wrongness in the process.

{There is NO "official" clear canonical definition of types. The language often used is not sufficiently "clinical" to clearly classify "types" from non-types (or type-ativity, for lack of a better word) in an objective and unambiguous way. You just mistake your personal head model for a universal objective truth, and don't realize you are biased and subjective due to a personality flaw. Another compiler writer, DavidClarkd, called you out on your definition play in ValueExistenceProofFour and you wanted to delete the criticism to hide it. Deletion of criticism is for intellectual cowards, delusional bastard!}

Huh? What did I want to delete of DavidClarkd's that is relevant? I trivially contradicted him about values, variables and types -- his misunderstandings were significant -- but I don't recall suggesting deletion of anything he wrote except a completely OffTopic AdHominem rant about FunctionalProgramming programmers and mathematicians. It's limited to a cluster of nine or so bullet points characterised by his comment that "I believe that functional programming is a conspiracy of Mathematicians..." I think Wiki gains nothing by leaving nutty conspiracy theories, but it's still there -- I didn't delete it.

What does an "'official' clear canonical definition of types" have to do with anything? I don't think there's an "official" definition of anything in English. Definitions are agreements among a majority, not official statements. What's a "definition play"?

This sounds like more evidence you need to spend some time reading ComputerScience and SoftwareEngineering textbooks.

You are not qualified to be the final judge about whether DavidClarkd's specific comments were irrelevant and thus should not be the final judge of deletion. -t

I'm not qualified? Really? I would think even a casual reader would be qualified enough to tell that DavidClarkd's specific comments had nothing to do with values, variables and types and everything to do with "a conspiracy of mathematicians" and the evils of FunctionalProgramming. That has what to do with values, variables and types?

Anyway, given that I didn't delete the comments, this argument is moot.

Anyhow if the textbooks CLEARLY back your points of view, then extract and cite the text that clearly does so in an orderly and properly-referenced fashion. Past attempts at such resulted in you interpreting things a specific way and you fail to consider alternative interpretations (from yours) of the given text, using a circular argument similar to "if you knew what they really meant, then you would know what they really meant" (paraphrased). I try and try to dig out the details from your claims and statements, but the end-nodes always end up being variations of ArgumentFromAuthority. -t

I don't think any text snippet will help. They didn't help before. You need to read the whole textbook.

Note, by the way, that all human knowledge is either personal experience or something you'd probably call ArgumentFromAuthority. For example, have you ever been to Siberia? No? Then how do you know it exists?

Or are you claiming one needs to read 1000 CS textbooks before they properly interpret #1001? Even if true, the burden is on you to show that all 1000 back your interpretation of #1001 using careful and documented textual analysis. If that's your stance, at least you admit that NO ONE BOOK on types/values is "self standing". -t

One good textbook -- or even a mediocre one -- is generally enough.

But I suspect a bigger problem is that two parties must agree on a concrete representation to talk about types and values in a meaningful and clear way. Without a concrete representation, the discussion is too "notion-y" to "process" in an objective way. It degenerates into "my notions are better than your notions because I read more CS books". Even if such were true, it's not objectively measurable. Do you agree? Discuss at DiscussingWithoutConcreteRepresentations. -t

What do you mean by "a concrete representation"? A set of axioms?

Moved reply to DiscussingWithoutConcreteRepresentations.

int main() or char a - what is unclear about CeeLanguage types and variables and subroutines? -- ChaunceyGardiner

MayZeroSeven, NovemberZeroSeven, JanuaryZeroNine

CategoryLanguageTyping