Vague Or Arbitrary

Definitions of certain words, such as "types" and "values", are either vague, or arbitrary in that the definition's author selects one specific model or implementation out of the many models or implementations that could serve the same purpose. Thus, as one cranks up the specificity knob, the arbitrariness meter also goes up. The relationship is reminiscent of wave-function collapse in quantum physics.

Now, it may be possible to find a definition that is not VagueOrArbitrary, but so far no such definition exists. --top

They're neither vague nor arbitrary, and they have standard definitions in ComputerScience. A type is a set of values and zero or more associated operators. A value is a representation and a reference to a type. You probably find these definitions vague or arbitrary because they're abstract. Some people who are used to dealing only with concrete implementations are uncomfortable with mathematical abstractions, and may mistake that discomfort for vagueness or arbitrariness in the abstraction itself. If these definitions were genuinely vague and arbitrary, we'd all find them vague and arbitrary. The fact that only you find them vague or arbitrary is evidence that the issue lies with you, and you alone. See VagueVsAbstract.
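
To make the abstraction concrete, here is a minimal sketch in C; the Weekday type and next_day operator are invented purely for illustration and come from no standard library:

  #include <stdio.h>

  /* A hypothetical type "Weekday": the set of values it comprises... */
  typedef enum { MON, TUE, WED, THU, FRI, SAT, SUN } Weekday;

  /* ...and one associated operator over those values. */
  Weekday next_day(Weekday d) {
      return (Weekday)((d + 1) % 7);   /* SUN wraps back around to MON */
  }

  int main(void) {
      printf("%d\n", next_day(SUN));   /* prints 0, i.e. MON */
      return 0;
  }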

As shown in TypesAreTypes, others also find such words messy, at least when applying them to the usual tools of our field. As described in related topics, "associated with" is too open-ended to be practical.

On TypesAreTypes, the conclusion is that understanding types is tricky. There are pitfalls to trap the unwary, as your confusions demonstrate, but that doesn't mean the definitions are vague or arbitrary. "Associated with", by the way, is an abstraction. As an abstraction, it's precisely as specific as it needs to be. Any more specificity would make it no longer an abstraction, but a specification or an implementation.

Re: "A value is a representation and a reference to a type." - Where are you getting this from?

It's a trivial conclusion from reading stacks of ComputerScience, ComputerArchitecture and computer technology journal papers and textbooks, though it's most succinctly and explicitly described in ChrisDate and HughDarwen's texts, since they are particularly careful to distinguish types, values and variables. Their books are replete with citations, by the way, so their exposition is neither personal opinion nor unsubstantiated.

PageAnchor bit-sequence-01

It's also logically evident: Given a "value" 1101011111010110 of unspecified type, what is it?

Without a specified type, the answer can only be "unknown". Thus a value is only meaningful if its type is known. Since each value must be associated with a type in order to be meaningful, it makes logical sense to describe a value as being some representation -- some static string of symbols or states, e.g., 1101011111010110 -- and a type.
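
A minimal C sketch of this point, using the bit pattern from PageAnchor bit-sequence-01; the interpretations chosen are illustrative, and the signed case assumes a two's-complement int16_t:

  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>

  int main(void) {
      uint16_t bits = 0xD7D6;                /* 1101011111010110 */

      int16_t as_signed;
      memcpy(&as_signed, &bits, sizeof bits);  /* reinterpret same bits */

      printf("as uint16_t: %u\n", (unsigned)bits);        /* 55254  */
      printf("as int16_t:  %d\n", (int)as_signed);        /* -10282 */
      printf("as 2 bytes:  %02X %02X\n",                  /* D6 D7  */
             (unsigned)(bits & 0xFF), (unsigned)(bits >> 8));
      return 0;   /* same 16 bits; the chosen type determines the value */
  }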

There are languages that leave it up to the programmer to know a value's "type". It's in the head, not in the machine.

It's in the machine. If you choose the FADD operator for adding floating point numbers, it assumes each value is of type 'float'. In other words, the association between the representation and the type (float) is asserted by each operator. If you use FADD on integers or character strings, you'll get incorrect or meaningless results. The fact that the programmer chose to use FADD, instead of the machine choosing to use FADD, is irrelevant.
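
A rough C analogue of the FADD point: the same bit patterns fed to a float-add versus an integer-add give very different results. The helper name bits_of is invented, and the sketch assumes 32-bit IEEE floats:

  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>

  /* Reinterpret a float's 32-bit representation as an unsigned integer. */
  static uint32_t bits_of(float f) {
      uint32_t u;
      memcpy(&u, &f, sizeof u);
      return u;
  }

  int main(void) {
      float a = 1.5f, b = 2.5f;

      /* The float-add operator treats the representations as floats... */
      printf("float add: %f\n", a + b);                    /* 4.000000 */

      /* ...an integer add on the very same bit patterns does not. */
      uint32_t garbage = bits_of(a) + bits_of(b);
      printf("int add of same bits: 0x%08X\n", garbage);   /* not 4.0's bits */
      return 0;
  }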

The head can be far away from the machine, so it's clear that type info can be far away from the value/representation/data/content or whatever you are calling it today. There is no Pope that forces them to always be together. Compilers often do it also: they discard type info from the EXE file such that only raw "values" are passed around. Type-ness is thus in the source code, but not the EXE.

That's fine, and it's called TypeErasure. There's no notion of "together" or "apart"; all that matters is that there is an association between the representation and a type, such that given any representation the machine knows its type. How, where, and when that association is made doesn't matter. It might be explicit and "close together" at compile-time, but at run-time the association may be asserted (i.e., assumed, as per the above) only by the machine-language operators that were chosen at compile-time.

You can always disassemble the machine language of any binary executable and trivially use the machine language instructions to determine the types of their operands. This proves that the association between representations and types still exists, even in machine language.
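
A small illustration of that claim, in C. Exactly which instructions a compiler emits varies by compiler and target, so the comments describe typical output, not guaranteed output:

  /* After compilation the names and declared types are gone (TypeErasure),
     but the instructions the compiler chose still imply the operand types. */

  int   addi(int a, int b)     { return a + b; }  /* typically an integer ADD        */
  float addf(float a, float b) { return a + b; }  /* typically ADDSS/FADD or similar */

  /* Compile and disassemble (e.g., objdump) to see the differing instructions. */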

That's not true. Perhaps only the reader of the output knows the intended type/use/context of something and the machine language leaves no clue. Besides, even if there are clues in the machine language, it's humans making that determination, not machines. A network router is an example; it's just moving bytes around and may have no info about the bytes' intended use.

Intended use is something else entirely, and it's irrelevant. The machine language will tell us, unambiguously and definitively, whether values are integers (often of various sizes), boolean values, binary coded decimal (BCD) values, floating point values (in machines that support them natively), or (in machines that support them natively) character strings. Types not natively supported by the machine may be opaque to a greater or lesser degree at a machine-language level, but they're always composed from primitive types that are natively supported by the machine, which can always be trivially identified from the machine language.

No, it won't, not in all cases. You are incorrect. And even when clues are available, they may be very indirect; it's more like detective work than a simple look-up of associations or attributes.

No; the documentation for every machine language instruction states, unambiguously, what the types of its operands must be. From that, the type of every operand of every operator is trivially identifiable. Obviously, high-level types may not have a direct mapping to some corresponding machine type, but that doesn't matter. It's sufficient that every value must belong to, or reference, a type.

For example, in a type-tag-free language such as ColdFusion, app X may receive the byte sequence "74" and pass it on to another sub-system Y as-is. That sub-system may do type-related stuff to it, such as adding it (as a number) or appending it (as a string) to something, but X, and/or I as a developer, may not "know" or care what Y does with "74". We cannot answer the "type" question at PageAnchor bit-sequence-01. From app X's perspective, it's merely a package to be delivered. It's a "value" to be moved, NOT interpreted (in X). It still has "meaning" in terms of being a package we are entrusted to deliver. Thus, having meaning does NOT require having type info "known". And it may have meaning to process Y (which we don't see). The "handler" of info and the "processor" of that info may be closely related or close by, far removed, or something in between. App X is treating it as merely a value devoid of type info.
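
A rough C analogue of the X-and-Y scenario; relay and consume are made-up names, and ColdFusion itself isn't shown:

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  /* App X: relays the payload untouched; no type is known or needed here. */
  static void relay(const char *payload, char *out, size_t outsize) {
      strncpy(out, payload, outsize - 1);
      out[outsize - 1] = '\0';
  }

  /* Sub-system Y: only here is "74" given a type-like interpretation. */
  static void consume(const char *payload) {
      printf("as number + 1: %d\n", atoi(payload) + 1);   /* 75    */
      printf("as text appended: id-%s\n", payload);       /* id-74 */
  }

  int main(void) {
      char buf[16];
      relay("74", buf, sizeof buf);
      consume(buf);
      return 0;
  }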

An operation that copies an arbitrary section of raw data -- like a memcpy operation -- is sometimes considered to have an operand of type string. In another sense, memory copy operations are the only truly "typeless" machine operations. However, by definition, such data is opaque. A "copy" merely locates data in a place where operations can be performed on it. We do not consider the types of data that are passed down a network cable, for example, but it is critical for the software at the network endpoints that send and receive the data to know the type of every value. Do not conflate such bulk data copy operations with operations on values.
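
A minimal C sketch of the copy-versus-operate distinction (the point struct is just an example):

  #include <stdio.h>
  #include <string.h>

  struct point { int x; int y; };

  int main(void) {
      struct point src = { 3, 4 }, dst;

      /* The copy treats src as an opaque run of bytes; no field types involved. */
      memcpy(&dst, &src, sizeof src);

      /* Only the subsequent typed operation interprets those bytes as ints. */
      printf("%d\n", dst.x * dst.x + dst.y * dst.y);   /* 25 */
      return 0;
  }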

Our systems typically have what may be called "root types" such as "binary", "bytes", "sequence of bytes", etc. It's imposed by our existing hardware. (I challenge you to implement position-free bytes.) But those root types are at a lower level than the issues under contention.

They're normally called primitive types, but they're not called "binary", "bytes", or "sequence of bytes". Those terms refer to arbitrary data. They're normally called integer (of varying sizes), float (of varying sizes), boolean, and string. All higher-level types are composed from these. As I noted above, "high-level types may not have a direct mapping to some corresponding machine type, but that doesn't matter. It's sufficient that every value must belong to, or reference, a type."
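
For example, a sketch in C of a higher-level type composed from such primitives (the employee record is hypothetical):

  /* A higher-level type composed entirely from machine primitives:
     a fixed-length character string, an integer, and two floats.   */
  struct employee {
      char  name[32];   /* string  */
      int   id;         /* integer */
      float salary;     /* float   */
      float bonus;      /* float   */
  };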


The bottom line is that the typical definitions do not dictate specifics, such as data structure, element relationships in memory, explicitness versus implicitness requirements, etc. That leaves open multiple "implementation" and modeling choices such that those definitions will NOT answer most language-specific and machine-specific questions/issues/conflicts. To pretend they do is wrong, period. There are rough/general patterns to what people call what in specific implementations, but they are just that: rough/general patterns (based on a combination of habit, history, and textbook knowledge).

Of course. The typical definitions are abstractions, which appropriately encompass the widest variety of current and possible implementations.

"Associated with" can be explicitly in the machine's data structures right next to each other, on different machines that run at different times, or some of it purely in the mind of the reader or developer. The definitions don't choose among those. Why exclude the human mind from "computation"? Why should such a definition make a distinction between silicon computation and wet carbon-based computation (brains)? Computing is computing. As is, they put NO limit on "associated with", and that includes human brains far far apart in distance and time.

Of course. That's what "associated with" means. It is sufficient that an association of some sort exist in order to say that an association exists.

R2D2 is merely carrying an encrypted message "1101011111010110" (a "help" call from Princess Leia). The reader and interpreter (understander) of that info may be far, far in the future. The definition as given does not preclude that "association". It does NOT say anything remotely like "the type info must be within 7 feet of the value info". Nada, zilch about that.

If anybody clearly sees something in those definitions that restricts time, distance, wall thickness, and/or processor type (silicon, WetWare, etc.) of and between the "parts", please identify it. Otherwise, it's fair to assume that time, distance, wall thickness, and/or processor type are NOT DICTATED by those definitions.

In light of this, "types" can be viewed as information that helps one interpret a "value" (sometimes called a representation, data, content, etc.). That "type" information can be in the machine OR in the mind. The base definitions do NOT specify the nature or position of such "type" information.

That is essentially correct. However, no machine can function usefully with type information that exists only in the mind. Otherwise, how can any machine (or mind, for that matter) meaningfully perform any operation on 1110101010101011101010101010111001101? Of course, even it has a trivially identifiable type -- bit string of finite length -- which, if nothing else, allows some memcpy operator to copy it from one location to another. However, a machine that only performs memcpy is not particularly useful.

Typically you have Thing T (the type), Thing D (the data/representation), and Thing U (the interpreter as in "understander"). The location of these 3, in a brain or a machine, is not restricted by the traditional definitions. "Stricter" languages tend to put more of them in the machine, and/or rely more on the machine to track them.

That is true. The machine needs to know T in order to perform U, otherwise D is opaque.

Note that these 3 "things" can be reprojected into an operator-and-operand view to fit the traditional definition. It's just a matter of how one views the packaging of what gets labelled an "operator", etc. The three-things view is a better projection for dynamic and loosely typed languages, in my opinion, with the operator/operand version a better fit for "stricter" languages. Which one you choose is a matter of convenience, for they are both just different views of the same thing, somewhat similar to how polymorphism and IF statements are interchangeable for implementing conditionals.

-t

Sorry, that last one lost me.

I'll see if I can think of a better way to word it. The short version is that just about ANY system can be viewed/mapped/re-represented to be "a set of values and zero or more associated operators". However, the human convenience of that view varies widely per tool flavor, usage, etc. Static languages tend to be closer to that view as-is.

No, not any system "can be viewed/mapped/re-represented to be 'a set of values and zero or more associated operators'". How is an IF statement or a WHILE loop "a set of values and zero or more associated operators"? How is the sequence of I/O operations to initialise a printer "a set of values and zero or more associated operators"? How is the 'main' method of a program "a set of values and zero or more associated operators"? How is an arithmetic expression "a set of values and zero or more associated operators"? How is an assignment of a value to a variable "a set of values and zero or more associated operators"? How is a given method of a class "a set of values and zero or more associated operators"? And so on.

I said "system". Most of those examples are not a system as given. We can make a set of IF statements into a system by packaging them into a subroutine or program. As a working definition, I'll define "system" as anything with input, processing, and output. And if something has input, processing, and output, then the input and output can be viewed as "values", and the processing as an "operator". Thus, every system satisfies the definition. -t

How is "input, processing, and output" a "set of values and zero or more associated operators"? It's not sufficient that "input and output can be viewed as 'values'"; the system itself (and a type is not a system, because it doesn't have "input, processing, and output") would have to be a set of values and zero or more associated operators. How does it do that?

Why is it "not sufficient"? Please elaborate.

A type is 'a set of values and zero or more associated operators', so if system <x> is a type, then system <x> must be 'a set of values and zero or more associated operators'. That's not the same thing as system <x> accepts a set of values as input, performs an operation, and produces a set of values as output. To be a type, the system must be a set of values and zero or more associated operators.

To Be Or Not To Be

How is it not the same thing? How exactly is this be-ness applied and objectively verified? This be-ativity thing of yours is confusing the shit out of me. Types are a mental UsefulLie, not an objective property of the universe. The universe does not give a shit about types or sets; those are human mental abstractions (UsefulLies). A program is an operator as much as an ADD instruction in machine language is. It operates on source code and input data (per common colloquial usage). It's not any less is-ificated to operator-ness than the machine instruction.

You're operating under the incorrect assumption that types are only some mental construct, and your misunderstandings appear to follow from that. In a given program, 'int' is not a mental construct; it's a programmed definition of which bit patterns are an 'int' (thus forming a set of values) and a set of operators that receive 'int' as operands and/or return 'int'.

Bullshit! Machines are just molecules bumping around; they don't understand integers etc. If a rock slips off a cliff and happens to smash you, it does not "understand" death or killing, it's just blindly doing what molecules do. "Killing" is a concept in the heads of humans, not rocks. Same with integers. If another human purposely positions the rock to fall on you, that does not change what the rock thinks or knows...nothing.

Huh? What does "understand integers" have to do with what I wrote? I wrote that 'int' is not a mental construct, but a programmed definition of which bit patterns are an 'int' (i.e., a set of values) and a set of associated operators. A programming language type is defined with code, not cognition. 'Int' isn't produced by thought; it's a representation (usually 8, 16, 32 or 64 bits) that defines a set of (256, 65536, 4294967296 or about 1.8446744e+19) values and operators like +(int, int), -(int, int), etc.

What's a "programmed definition" exactly? The definition is in code? Boggle. It appears to be anthropomorphism of machines. You don't "define with code" you implement HUMAN ideas with code. Using our death-rock example, you don't "define" a death device by putting the rock on the ledge, you IMPLEMENT a death device by putting the rock on the ledge. The machines are just dumb savants.

A "programmed definition" is something like 'class p { ... }' which defines 'p'. There's nothing anthropomorphic about it; it's just a term used to indicate that some code has been associated with an identifier.

I'm having difficulty explaining what I wish to explain. I'll have to ponder it a while perhaps.


See VagueVsAbstract


CategoryDefinition

