(I would put this discussion under TopOnTypes rather than create a new topic, but that one is stuck in a vandal loop or EditWar it appears.)
In WhatAreTypes, it is strongly suggested that "types" are tied to "syntax". This struck me as a very odd definition and has led to LaynesLaw battles over the definition of "syntax".
- Did it not strike you that it could be the opposite? Syntax is very often tied to types, and this is quite evident in such things as OperatorOverloading.
- I view op overloading as a form of predicate dispatching, or at least computation, on type flags. To find the "proper" operator, computation is usually done on a structure (an AbstractSyntaxTree in which variable nodes have type flags), not directly on "syntax". (I will agree they are interchangeable, but being interchangeable does not prove your point.)
- OperatorOverloading as 'dispatching' is only correct for some cases... particularly where there is a runtime process determining which of many operator implementations to call, or an optimization of this with partial-evaluation and inlined dispatch result. As a 'static computation' to determine what was intended by the programmer, it is correct to call it a syntactic mechanism... because that's exactly what reading is all about: trying to determine what was intended or meant by the author. Also, types of the return values will determine, again, which of many other overloaded/polymorphic/etc. methods to call, thus further affecting interpretation. In some systems, even order of operations can change based on the types involved. That aside, the syntactic process extends from lexing the very first character to producing the final, topmost, highest-level representation of the program-description that includes all relevant details... and can require an arbitrary amount of computation to reach. Indeed, there are some languages with macro languages that are, themselves, TuringComplete, and expanding macros is a necessary part of the syntactic process, especially in languages that allow macros to expand into sets of components (e.g. one macro into multiple definitions or arguments to a function). Thus OperatorOverloading might require computation, but that fact does NOT support that OperatorOverloading isn't part of syntax.
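A minimal sketch of the "dispatch" view, in Python. The Node class, the type-tag names, and the OVERLOADS table are all hypothetical; a real compiler might do the same lookup statically during parsing/checking (and inline the result) rather than at run time:

 # Pick an implementation of "+" by examining the type tags attached to AST operand nodes.
 class Node:
     def __init__(self, value, type_tag):
         self.value = value        # the operand's value
         self.type_tag = type_tag  # the "side flag" naming its type
 def add_numbers(a, b):
     return a + b
 def concat_strings(a, b):
     return a + b
 # Overload table: (operator, operand type tags) -> implementation.
 OVERLOADS = {
     ("+", "number", "number"): add_numbers,
     ("+", "string", "string"): concat_strings,
 }
 def apply_plus(left, right):
     """Dispatch on the operands' type tags to find the 'proper' operator."""
     impl = OVERLOADS.get(("+", left.type_tag, right.type_tag))
     if impl is None:
         raise TypeError("no overload of + for %s and %s" % (left.type_tag, right.type_tag))
     return impl(left.value, right.value)
 print(apply_plus(Node(2, "number"), Node(3, "number")))        # 5
 print(apply_plus(Node("ab", "string"), Node("cd", "string")))  # abcd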
- Okay, I admit that "predicate dispatching" may indeed be a sloppy description of the process. However, parsing is not necessary to have something that has types. Common, perhaps, but not a requirement.
- Perhaps not, but I don't think it'd be incorrect to say that interpretation and distinguishing in -some- form -is- necessary to "have" types (or at least more than one type), even if it is distinguishing 'blue stones' from 'red stones', 'ducks' from 'dogs', or 'animal' from 'vegetable'. The notion of 'parsing' is somewhat more limited, and one would generally discuss parsing as being associated tightly with the overall Type System for a particular language. For example, parsing languages sensitive to type-context always requires typing as part of the parsing process, and what is parsed in turn influences the typing context. The meaning and type of a particular expression or statement may vary considerably based both upon the types of internal subexpressions AND upon how the expression or statement is being applied externally. Most languages provide very weak support for this in the broader sense, but that fact remains. On the smaller scale, excepting the one-typed or so-called 'un'-typed languages like UntypedLambdaCalculus and UnLambda?, I've never seen a language where types aren't used as part of the parsing and lexing process... starting with the very smallest pieces: lexing integers and numbers as different from symbols, statements as different from expressions, etc.
- Parsing and related checking is merely a necessary and/or convenient step in converting source code into something that the machine likes better. That does not make types about source code and parsing it any more than cargo ships are about tugboats because they almost always require them when docking. ("merely necessary" is something of an oxymoron. And parsing needn't just be performed by machines; humans must also parse that which they attempt to read.)
- I agree that types are not 'about' source code and parsing; rather, the opposite is true. Parsing a language is 'about' identifying and classifying language expressions - distinguishing them, typing them. But to "have" a type is a bit more complicated than the type itself, for 'having' it implies somehow representing it either in one's actions (does different things with blue stones than with red stones) or in one's conception (ability to recognize blue stones vs. red stones). Types are 'about' distinguishing things by certain properties or abstract qualities.
- I tend to agree with the last sentence. However, it is more or less a restatement of TypesAreSideFlags. The "value" by itself does not necessarily tell you what type something is. Its "type" comes from SOME KIND OF COUPLING between the value and some OTHER value or item or context. I.e., a "side flag" in a general sense. Type indicators are like adjectives and adverbs. Perhaps "side flag" is a bit too narrow, but it still captures the essence of such coupling.
- Please explain to me how that is possibly interpreted as a restatement that TypesAreSideFlags. Type comes from THE COUPLING BETWEEN THE VALUE AND ITS INTRINSIC AND EXTRINSIC PROPERTIES, which isn't some vague black-box of "SOME KIND". You can't make something an adjective or adverb by tagging it with "adjective" or "adverb"; rather, it's the other damn way around: you can tag it with "adjective" or "adverb" BECAUSE IT IS ONE. My statement above is a direct counter to the notion that TypesAreSideFlags. 'TypesAreSideFlags' completely butchers the notion of the actual coupling that is occurring; it describes causality as going in the wrong damn direction: type because of tag. It should be tag because of type; the type is there WHETHER OR NOT you flag it or tag it. At this point you should know well enough what I think of 'TypesAreSideFlags' to realize that anything I say that you interpret as a restatement of 'TypesAreSideFlags' must be a gross misinterpretation on your part.
- I guess there is indeed a "gross misinterpretation". That is why I keep trying to steer this to concrete scenarios instead of using English to describe mathematically-imprecise notions with yet more imprecise notions, which is where the fault lies, IMO. English is failing us here. Regarding "the type is there WHETHER OR NOT you flag it or tag it", how does one know? In the case of English grammar (adjectives, adverbs, etc.), it is the *context* that determines that. It is *side* words. It is largely "stuff surrounding the word" (processed with complex algorithms), and perhaps the nature of the word itself. Besides, that is perhaps "classification" and not necessarily "types". In programming languages sometimes it is context and sometimes it is explicit declarations that determine a "type". It is items EXTERNAL to the value itself that affect how the value is labelled or treated.
- External context can affect how you -interpret- a -representation- of a language-expression, and this is very common to context-sensitive languages. However, it does not determine the type of the language-expression. 'cut' the noun expresses a concept of a fundamentally different nature (or 'type') than 'cut' the verb. It is completely irrelevant to the concepts that the word used to represent them ('cut') happens to be the same word interpreted in two different ways based on context - that's a representation issue. In context-sensitive languages, context (and "side words") provide clues as to which high-level abstraction a particular word (or phrase, or sentence, or thesis) is intended to represent. However, this does NOT affect the type of the actual concept expressed in the language, the (in this case) noun or verb. That type is determined by intrinsic and extrinsic properties of the values and concepts so expressed. For example, we name as "nouns" those words with the extrinsic property of identifying objects of some sort (people, places, things, states). Being a person, place, or thing, in turn, is an intrinsic property of objects. And as far as "side words" having some vague relation to "side flags" - are you planning to flag each word with every word to the left and right of it? or did you have a different plan for relating the two? Remember that "side-words" are still content.
- Again, I'm not sure natural-language grammars are an appropriate analogy. But anyhow, how the type designations affect treatment of a tagged value is usually purely up to the language and/or app. Therefore, I don't think that actual different treatment is necessary to be called a "type". Perhaps we could add the clause, "and somewhere in the system there must be at least one different treatment based on the designation (flag)." I'll consider such, but at this point it is a fairly trivial issue. But at least perhaps we can make a distinction, for the sake of communication, between the issue of assigning an item (value/variable) to a type, and what is done with items of a given type.
- Anyhow, the quotes alluded to in WhatAreTypes are more about the TypeSystem; any system that provides meaningful type-safety must provide a mechanism to both describe those types and describe where they shall be applied. This mechanism requires representing these types and their applications, which invariably requires syntax. Thus a TypeSystem is tied to syntax, always always always.
- This is a tautology because everything that humans interact with is going to need some form of representation. Every def in computers would be "about syntax" if this was true. It is possible to have the same compiler or run-time engine using different syntaxes such as C-style, Pascal-style, Ada-style, VB-style, etc.; all being different syntax but representationally equivalent. What matters is the structure. Now if you say a structure *is* syntax, we are right back in non-falsifiable tautology land. You can use structures, but you need to define how we know if a structure has "types" or not. Dual nodes with a side type flag/list? Bingo, that is flag theory again.
- You shouldn't attempt to use tautology as a derogatory term, top, at least not in a field that makes love to math and logic, such as Computer Science. All great truths in math are tautologies, and anything that isn't a tautology is a lie... at best a useful lie or approximation. You might as well have said: What you just said is TRUTH! with a capital T! and T stands for TROUBLE! which is why I disagree. You're being rather silly. That aside, I didn't actually state a tautology. I've read the works of BenjaminPierce, who is one of the quotes alluded to within the WhatAreTypes page, and I've read the quote provided there. I feel confident in stating that his quotation was not an answer to 'WhatAreTypes', but rather an answer to 'WhatIsa? TypeSystem' - i.e. a description of what characterizes a system in which you can utilize types as part of programming.
- A "type system" may be getting ahead of ourselves. Processing and type-checking "types" are not necessary to be called a "type", for types may merely be a spreadsheet that describes the nature of the values on it (date, integer, etc.). It may be for human consumption only. As far as "tautology", I meant your definition of "syntax" appears to cover everything and anything such as to make your definition of it useless.
- A "type system" is more than a set of types and type-descriptors. You can write all the spreadsheets and tables you want, but unless you actually have a mechanism of applying those types to various expressions, they're certainly not part of the system. I do agree that types may exist without a type system.
- And you're wrong about my definition of 'syntax', which certainly does not cover cars and cats, values and types, expressions and statements, truths and logic, inference and decision, and a great many other things. What it does cover is exactly what the word is defined to cover: patterns and structure in the representation of values... which include program descriptors and language expressions. Perhaps you should ask what my definition is before making conclusions about it. It is true that every value representation must have a syntax. If that truth is what you're unhappy with because it makes syntax "apply to everything" in your mind, then change yourself because you sure as heck aren't changing any definitions, and this particular truth is one of those tautologies based on the joint definitions of "representation" and "syntax".
- But my confidence doesn't make it a tautology. As to the fraction of your argument about different syntaxes, I agree. Indeed, so long as a runtime system is both TuringComplete and provides the necessary Communications services and Translation components, it is possible to run a program described in any language. Program descriptions aren't necessarily tied to any particular syntax, and type annotations aren't tied to any particular representation. If you wish to represent type annotations with flags, feel free.
- Languages that lack a syntactic mechanism to describe types or their uses (e.g. BrainfuckLanguage, untyped Assembler, UnLambda?, untyped LambdaCalculus, etc.) will be unable to provide that typing discipline. It won't matter how far you go towards an AbstractSyntaxTree, because even the AST will lack any sort of indicators as to the intended properties of its structure.
- Indicators? Sounds like "flag theory" to me (TypesAreSideFlags). If they have side flags (type indicators associated with each variable), they have "types".
- I think you missed the point on this final one after zoning in on the word "indicators". The point: the language determines whether one even can express association between types and 'variables', and syntax ultimately determines how you can represent this association. The syntax might even forbid explicit or manifest typing in favor of assertions and inference. As to your "flag theory", I've left my criticism on the dedicated page.
- We don't need syntax to have types (unless one uses a loosey-goosey def of syntax, making it useless).
- That depends on what you mean by "have" as applied to values in general. I tend to use it to mean that you "have" a representation of that value upon a physical or logical object that is within your ability to control. E.g. do you "have" a particular value (say the sequence of bits that represents an RIAA-proscribed music description) without controlling a representation thereof? If you use "have" the same way I do, then you certainly do need to "have" syntax to "have" types, for there is simply no means of representing a value without syntax, and there is no means of describing a type without representing a value (since types aren't values but type-descriptions are values). OTOH, if what you mean by "have" is that the types can exist independently of any representation, then you're right. Unfortunately, though, due to some rather fundamental restrictions placed upon us by physics, it is impossible to use types that you've never represented (even implicitly) as part of a computation... thus types you don't "have" in MY sense of the word are simply not part of the TypeSystem.
As a thought experiment, suppose an alien race preferred to program directly in AbstractSyntaxTrees (AST's). (Humans can too, but it has never really caught on so far... though it is often alleged that LispLanguage comes close.)
- Sorry, but even the most esoteric of Alien Races are unlikely to be able to do this. They will, at most, be able to program in a representation of the AbstractSyntaxTree... and this representation will have a syntax.
- Again, this is a tautology. The definition of "syntax" you are using is either so wide that it covers everything and is non-falsifiable (other than a blank sheet of paper), or narrow enough that I can show that syntax is merely interchangeable window dressing. You are logically stuck between a rock and a hard place so far.
- Again, you're confusing math with science; definitions aren't supposed to be "falsifiable" like theories. One might consider them more kin to predicates; you can test a definition against a description and it might or might not apply. If it applies, the word associated with the definition is a valid abstraction of the original description.
- I meant testing against an instance, not in the absolute sense, which seems to be what you are describing. I apologize for not making that scope of testing clear.
- The definition of syntax I use is consistent with the one you'd find in a dictionary for Computer Science or Linguistics: the grammatical rules and structural patterns governing the ordered use of appropriate words and symbols for issuing commands, writing code, etc. (Dictionary.com). I simply apply it against everything it can be applied against: words or bytes in memory, characters on paper, sequences of audio-signals, etc. There certainly are things against which syntax doesn't apply... e.g. to numbers, values, types. Syntax is only important when dealing with representations of these things (e.g. "1" vs. "one" vs. "s(0)" vs. a bit-pattern at a particular alignment and position in memory, vs. implicit as part of an 'inc1' machine-language command), not with the actual concepts. As far as it being "interchangeable window dressing", I doubt it. You might find alternate words with which to replace the word "syntax", but the concept that "syntax" is defined to represent will still be of valid concern.
- The dictionary definitions seem insufficient for computer-science issues, focusing on linear symbols.
- You could easily apply 'syntax' to meaningful patterns and structures of any graph-based language, not just those in which the nodes are organized linearly or upon a plane. Neither the dictionary definition nor the concept of syntax requires that the symbols and words be organized linearly. Physics does tend towards linear natural languages (due to the time-linear nature of the audio medium), but there are many written languages that are organized on a plane, and graph-based programming languages can extend this to any number of dimensions.
- Since most things in software either involve graphs or could be converted into graphs, we are back to the Definition That Ate The World. Sets can be represented as graphs, so are all sets also "types"? What are the rules for looking at a graph and knowing what parts are related to "types" and which are not?
- What did I say that made you fallaciously jump all the way to the conclusion that all graphs are types? You've mistaken my statements on several different levels. Firstly a syntax - a set of patterns and structure over the representation of language expressions - is not, itself, a type of the language-expressions so represented. The type of a language-expression is based on the relevant properties of the language expression in the context of evaluation and execution... e.g. whether it is a command-statement or value-expression, whether it can safely be passed to a particular function or fulfill a role in a particular communications protocol, etc. Secondly, patterns and structure over language-expressions represented in a graph-based or graphical manner are still typed as language-expressions, thus subject to the exact same conditions listed under 'Firstly'; being a graphical language doesn't offer any free rides. What a graph-based language syntax allows is reduced ambiguity and improved ability to represent certain sorts of values (especially graphs, for the obvious reason). This syntax, however, isn't a magic ticket; one doesn't suddenly gain meaningful types over all sorts of graphs that one couldn't have had in a straightforward text-based language.
- As far as whether "all sets are types" - that depends on whether you want them to be. It's perfectly legitimate to distinguish as a type "all values that fall within this set", and, conversely, it's perfectly legitimate to describe a set as "the set of all values x such that x satisfies type-properties X". Thus any set can conceivably be a type-descriptor. Of course, whether sets can represent types is constrained by the language used to meaningfully describe types. Graphs can also represent patterns or properties, and so may also be type-descriptors. While these things are both possible and legal, it's important to remember that just because arbitrary sets or graphs can feasibly be types doesn't mean that they are actually used as types. For sets to be types as part of the type-system for a programming language requires that language-expressions or runtime actions be meaningfully distinguished based upon that set.
- Your response uses too many words that contain natural-language ambiguity. Let me re-ask the question as a scenario. Any language's program can be converted into a graph, you seem to agree. Once it is in graph form, how does one know if there are types and where there are types in the graph? One prints the graph out using big plotter paper, showing lots of boxes and arrows, and your job is to tell somebody how to identify the types or type-related aspects of the graph and mark them with a yellow pen.
- To be clearer: you're being extremely non sequitur with your question. Perhaps you are confusing the syntax used to represent language-expressions with the language-expressions so represented... which exist at two entirely different layers of abstraction. I think you're confusing the two, which makes me think that any direct answer I give you will just confuse you because you don't understand the more foundational concepts it is based upon. Your question, to avoid being extremely non sequitur, should NOT have been about graph conversions of programs, but rather about graph or multi-dimensional representations of programs, which are two entirely different things! To provide some answers: A language's source possessing a syntax in 2D or non-linear space (like BefungeLanguage) does not affect or determine the types of the language-expressions, which are still based upon the properties of said expressions within the language being expressed. Indeed, if the graph syntax allows for manifest type-annotations, they're still manifest - just as manifest type-annotations are when you see them in text. Therefore, if you want to highlight them with a yellow pen, you can still do so... at least if you can find a nice way of organizing a representation of the graph upon a 2D plane so you can print it upon paper. OTOH, if types are mostly implicit in the graph syntax, you'll have just as much difficulty in highlighting them as you would have highlighting implicit types in any other syntax. This would all be quite obvious once you actually are familiar with and literate in the higher-dimension syntax rather than in your current position of trying to wrap your head around the concept.
- (See bottom for reply)
We all agree, I hope, that any language that has "types", including static compile-time-checked types can be represented as an AST.
How does the syntax-centric definition indicate which AST's have "types" and which don't?
- The type system of each language comes with typing rules that define (using a mechanism called structural recursion) which ASTs have a type (are well typed) and which do not. The trivial case for dynamic languages like Smalltalk is that any AST whatsoever is well typed.
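A minimal sketch of such structural recursion over a toy AST, in Python; the tuple-based node shapes and the type names are hypothetical, and real languages (ML and friends) have far richer rules:

 # Typing rules applied by structural recursion: the type of a node is computed
 # from the types of its sub-nodes; a node with no applicable rule is ill typed.
 def type_of(node):
     kind = node[0]
     if kind == "int":                      # ("int", 7)
         return "Int"
     if kind == "str":                      # ("str", "foo")
         return "String"
     if kind == "add":                      # ("add", left, right)
         left, right = type_of(node[1]), type_of(node[2])
         if left == right == "Int":
             return "Int"
         raise TypeError("ill typed: cannot add %s and %s" % (left, right))
     if kind == "concat":                   # ("concat", left, right)
         if type_of(node[1]) == type_of(node[2]) == "String":
             return "String"
         raise TypeError("ill typed: concat wants two Strings")
     raise ValueError("unknown node kind: %r" % kind)
 print(type_of(("add", ("int", 1), ("int", 2))))   # Int -> this AST is well typed
 # type_of(("add", ("int", 1), ("str", "x")))      # raises TypeError -> not well typed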
- But how does this distinguish between non-type language or tree rules and type-related rules? All languages have rules, but not all rules are "type rules".
Or, are types "lost" as soon as we turn it into a tree?
- Of course not. Actually, in the process of generating the actual code, the typing algorithm (if implemented) operates on representations of ASTs, not on the raw stream of characters and not even on the stream of tokens. You could do it differently, but it's simply not the sane thing to do.
Questions that may arise are:
- What is the difference between any tree and AST's?
- What's the difference between "thing" and "car"?
- Is any "executable" tree an AST?
- Assuming there's such a thing as an "executable tree", it might or might not be an AST; it depends on how you define it. However, not many people will pay attention to TopDefinitionForExecutableTree? as it will likely be a superfluous concept that nobody needs.
- I am approaching this topic from a definition-resolution standpoint, not about what is "good" or popular (unless the def is somehow tied to such). How does the "type theory" definition distinguish any executable tree from one that has "types"? Does any AST with rules have types, or are there specific kinds of rules which indicate "types"? See what I am getting at? Can we provide a "types" definition that is precise in terms of AST's? There should be some tree traversal algorithm or math that can identify or verify "types" at this point. If not, then the type theory def is perhaps still too fuzzy, relying on psychological interpretations.
- No. You assume that when you say "executable tree" you refer to some concept that other people are supposed to already know. You might as well say gobbledygook. How do we distinguish any gobbledygook from one that has types? Please answer, TOP.
- Better yet, please make the minimal effort to acquire the basic knowledge and vocabulary used in PL theory; otherwise this page becomes a "tutoring for TOP" exercise (and a very inefficient one at that) from which other WikiReaders cannot take any profit, and therefore a waste of all our time and a waste of wiki space.
- Nothing wrong with a tutorial. Others may find it useful also. The existing literature is written by one of the poorest, most verbose articulators in computer-science history. The early writings of DrCodd were horrible teaching tools and documentation, but fortunately others found ways to better describe his concepts. The same needs to happen with types.
- Why don't YOU write up a definition topic for it from PL theory then? Wiki is about pooling knowledge and ideas, not personal insults. Add info instead of sitting on the sidelines and criticizing like a heckler afraid to come up on the stage. Anyhow, here is my working definition of "executable tree": a tree that contains instructions (operations) and data (operands) such that an "engine" can run the instructions in the tree based on a known set of rules. A mathematical expression parsed into a tree would be a simple example. The evaluator takes the operations and evaluates them, probably from bottom (leaves) to top.
A = B + C * D
 A -----> + -----> B
          |
          +-----> * -----> C
                  |
                  +------> D
(See also AbstractSyntaxTree for another example.)
See "Tree Example 2" below for another example that involves type conversion.
- Chill out, TOP. You pulled a rabbit out of your hat; you were asked to define what it means and why we should consider that concept. Now I see that you meant AbstractSyntaxTree. So we didn't need to talk about "executable trees" in the first place. Adding unnecessary concepts can only create confusion, so don't complain about being criticized; that's the risk you exposed yourself to by contributing here and persisting in NotInventedHere syndrome. If others have studied the problem extensively and written books on it, and it is a fairly well understood domain, you should reuse the framework and the concepts that already exist. --Costin
- A syntax tree does not necessarily have to be imperative. HTML can generate a syntax tree despite not having any direct commands. That is why I asked the "executable" question. And like I said under WhatAreTypes, if it takes a whole book to give a useful definition, then something is wrong with the definition. --top
- An HTML AST is interpreted by a browser and the result is a rendered page. So you don't need to keep pounding on this "executable" feature if you cannot define it clearly, and the distinction that you fail to come up with would have no bearing on this discussion anyway.
- I am sorry it led to your confusion. It was not intentional. I will consider substituting it when we finish this discussion. Anyhow, what is an objective way to determine "types" from an existing AST?
- Are you sure you're asking the right question? It's different algorithms for different languages, so your question does not make sense in general. There's no "objective" way that will work on all ASTs. But if you ask me about an AST for ML, I can direct you to the typing rules of the ML language and the instructions for how to apply them. There will be different rules and different algorithms for each language.
- This implies that "types are defined by the language". Would you agree to such a statement? However, this also implies there is no commonality to the word "type" beyond a vague notion.
- Is the definition of AST also tied to definition of "syntax"?
- AST is a concept that is integrated in the study of syntax and semantics of the programming language. Obviously if you don't have syntax you don't have abstract syntax tree.
- What about the other way around? Remember the "alien" scenario from above. I am more interested in exploring the definition of "type" and its alleged relationship to "syntax" after it is an AST, not as text.
From the little I've read so far in the "formal type theory" references that Costin has provided, I get the impression that the syntax being referenced here is the syntax of type definitions. (I.e. not the syntax of the language that uses the types -- although often the former is a subset of the latter.)
"What is the difference between any tree and AST's?" An AST has to be composed of the valid lexical tokens of the language in question, and the tree has to represent a valid structure according to the rules that are the syntax of the language.
"Is any "executable" tree an AST?" What's an executable tree? A tree can be considered as input to an algorithm; processing a tree in this way could be considered "executing" it. If the tree is "executable" according to a given processor, and the valid input for that processor can be described by a syntax, then I suppose you could say that the tree is an AST for that syntax.
Is the definition of AST also tied to definition of "syntax"? Obviously, hence the name.
You can certainly represent a tree as a (set of) table(s). Consider a syntax as a set of rules for composing lexical tokens into an AST. Aside from structuring your tables so that they can represent the given AST, you'd also probably want to define constraints that reflect the rules of the syntax, disallowing invalid constructs. Typical representations of syntax rules will probably be a lot more clear and succinct than an equivalent expression of the rules using typical representations of database constraints, I think.
- Yes, but if I mention tables, people go nuts, so I am sticking with trees for now. --top
- LOL! -- DanM
-- DanMuller
Suggestions: Clarify which syntax is meant here, the syntax that describes the types of a type system, or the syntax of a language that uses types. Related: Clarify what is meant by an "executable" tree. Since you're talking about types, I assumed the former, but "executable trees" makes me think of the latter. Thanks. -- DanM
I've clarified "executable" as best I can above. I don't think I can give a 100% precise definition because any information (data) is potentially executable. Whether it has useful results or not is defined by the user. For example, one can take a spreadsheet file 'foo.xls', rename it to 'foo.exe' and type that at the DOS command line. The OS will try to run it, probably to the point where it causes a run-time error. (Some OS's can figure out the generating app and launch it from there, but I am not assuming that feature here.) But, we cannot define 'executable' as something that does not cause errors, because even programs with flaws often still have use and almost every complex program probably has at least one flaw, however subtle or unlikely to be encountered. Thus, in an informal sense, an 'executable tree' is a tree that contains instructions of some kind (and perhaps related operands) that were put there with the 'intent' of being executed. Please don't ask me to define intent. --top
Tree Example 2
myNum = 7;
myString = "foo";
...
A = myNum + myString;
 A -----> + -----> myNum {num}
          |
          +------> myString {string}
Suppose this is the syntax tree that would be generated in a dynamic language. The portions in curly braces are the internal type flags often used by interpreters to keep track of types. (Each operand "node" has a "value" and "type" attribute.) Now suppose we lost the source code forever and ever and just have the syntax tree (similar to P-code or MS-CLR). Does this syntax tree have "types"?
The interpreter would generally take such p-code and the plus operator would notice that at least one operand is a string, and thus convert the result to a string.
An interpreter does not have to use type flags. Such languages tend to have operators that explicitly determine types, such that "+" assumes only numbers and "." or "&" assumes strings and converts any numbers to strings, or stores them as strings from the beginning. "+" would dynamically interpret/parse the strings as numbers (and abort if it can't).
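A minimal Python sketch of both styles; the function names are hypothetical. The first mimics a "+" that checks the operands' type flags and coerces to string, the second the flag-free operators that fix the type themselves:

 # Style 1: operands carry a type flag; "+" coerces to string if either side is a string.
 def plus_with_flags(a, a_type, b, b_type):
     if "string" in (a_type, b_type):
         return str(a) + str(b), "string"
     return a + b, "num"
 print(plus_with_flags(7, "num", "foo", "string"))   # ('7foo', 'string')
 # Style 2: no flags; the operator itself determines the type. "+" assumes numbers
 # (parsing strings as numbers, aborting if it can't); "&" assumes strings.
 def plus_numeric(a, b):
     return float(a) + float(b)      # raises ValueError if an operand isn't numeric
 def amp_concat(a, b):
     return str(a) + str(b)
 print(plus_numeric("7", 3))         # 10.0
 print(amp_concat(7, "foo"))         # '7foo'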
PageAnchor: spreadsheet_scenario
Machine Optional
Suppose we had a spreadsheet with the following cells:
Attribute....Type.....Bob...Cindy...Mike
----------------------------------------
Age..........Number....43.....34....51
Shoe-Size....Number....11......7....12
Fav-Color....Text......Red...Blue...Green
Hire-Date....Date.....12/02..8/98...4/01
Etc...
Most would consider this sheet to define "types" even though it may be for human consumption only. No machine may ever use the "Type" column. This is to illustrate that a formal type-processing system (a "type system") is not a prerequisite for having "types" themselves.
[The sheet doesn't define types, it identifies the types to which attribute values belong. I see nothing there that defines the constraints under which we determine whether a sequence of symbols is a Number, or a Text, or a Date, nor anything that describes the operations we can perform on values belonging to those types. That must exist somewhere, if only in your head. Otherwise, it would be meaningless to distinguish one type from another. If we formalise how we describe what distinguishes one type from another, we call it a "type system." Therefore, a type system exists whether you contemplate it or not.]
For the sake of argument, would it change anything if the sheet was supplied with BackusNaurForm descriptions of the referenced type names?
[BNF of the type names??? You mean something like "<Number> ::= 'N' 'u' 'm' 'b' 'e' 'r'"? It's a start. It would allow us to automate distinguishing one type name from another, but it does not describe the nature of these types, or how they interact with values of other types, or how new types can be formed. Can we perform addition on values of Date and Text? How about values of Date and Number? Can we divide a Date by a Number? Can a Text contain spaces? Can a Number have decimal points? Can we combine these types to form new types, and what are the rules for doing so? These are the beginning of the issues you need to consider in designing a type system. You probably already have some mental rules (i.e., a mental type system) -- now familiar to the point of being implicit -- which you've formed from your past experience with non-computing values of these types, and from experience with software implementations of type systems.]
Sorry, I meant "types per name"; the grammar of items on a row with a given name. Just the syntax, not any kind of (formal) processing. Can they still be called "types" in your book without having behavior defined?
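For concreteness, a minimal Python sketch of "just the grammar per type name"; the regex patterns here are hypothetical stand-ins for the BNF mentioned above, and only cover the sample sheet:

 import re
 # One pattern per declared type name, describing what its cells may look like.
 GRAMMARS = {
     "Number": r"\d+",
     "Text":   r"[A-Za-z ]+",
     "Date":   r"\d{1,2}/\d{2}",
 }
 def cell_matches(type_name, cell):
     """Check a cell against the grammar declared for its row's type name."""
     pattern = GRAMMARS.get(type_name)
     return pattern is not None and re.fullmatch(pattern, cell) is not None
 print(cell_matches("Number", "43"))     # True
 print(cell_matches("Date", "12/02"))    # True
 print(cell_matches("Number", "Red"))    # False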
[Sure, but the behaviour must be, at least, implicit. If there is a syntactic mechanism for distinguishing kinds of values, then we have at least the start of a type system. If there is no syntactic mechanism for distinguishing kinds of values, but we still regard them as distinct, then there must be an implicit type system, i.e., the behaviour and properties exist in the mind of a human observer. What if there is no mechanism at all for distinguishing types beyond an arbitrary naming? E.g:]
Attribute....Type.....Bob...Cindy...Mike
-----------------------------------------
Norfle.......Gadoink..3%z...foo.....^^^
Spoont.......Piffle...nN....7uP.....12
Fazoom.......Thwock...50&...Blue....%24
Hoggleboof...Zorg.....*%&...92/98...thoon
Etc...
[Gadoink, Piffle, Thwock, and Zorg only represent types if somewhere there is an associated type system. Otherwise, they are arbitrary and meaningless labels.]
(from above)
Perhaps you are confusing the syntax used to represent language-expressions with the language-expressions so represented...
I was trying to convert it to a different format, graphs, in order to get away from syntax issues.
- You can't escape syntax by moving to graphs or tables or any other representation. You can, at best, simplify syntax. Consider that the moment you move to graphs, you need to describe what means a vertex, what means an edge, how shall you label these things, whether the labels are of any consequence (and whether they shall have their own syntax, or perhaps be graphs in and of themselves), and how the meanings of edges and nodes affect one another. The moment you start having the meanings influencing one another, you have syntax: patterns and structure in the representation. This syntax can be of arbitrary complexity. Indeed, one can even make something equivalent to a macro-language for subgraph expansions. Consider that it is possible to view any normal source-code as being a digraph in one or two dimensions with vertexes each being associated with a character-value and each edge being labeled with an association between characters (e.g. next, down). In that view, existing languages already have macros for subgraph expansion.
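A small illustration of that point in Python; the vertex kinds and edge labels are hypothetical, and the data below encodes nothing more than the expression B + C:

 # Even "just a graph" needs conventions before it represents anything:
 # what a vertex means, what an edge label means, which combinations are legal.
 vertices = {
     "n1": {"kind": "operator", "symbol": "+"},
     "n2": {"kind": "variable", "name": "B"},
     "n3": {"kind": "variable", "name": "C"},
 }
 edges = [
     ("n1", "n2", "left-operand"),    # (from, to, edge label)
     ("n1", "n3", "right-operand"),
 ]
 # Those conventions -- which node kinds exist, which edge labels are allowed,
 # how they may combine -- are the "patterns and structure in the representation"
 # being called syntax above.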
- Your definition of "syntax" appears to be "patterns or rules in structures". I'm okay with that as a local working definition of "syntax" for now, but perhaps not as a global definition because it is too wide to be useful.
- I'd use something closer to: "patterns and structure in the representation of language-expressions". Syntax can't be applied to structures in general... e.g. one can't look at the Eiffel Tower and say, "oh! look at that syntax!". This also excludes communications structure/organization and data/value structures; those aren't syntax unless the patterns internal to them represent something meaningful for yet higher-level languages. Syntax has a very specific area of application that is associated with linguistics: the representation of language-expressions. Also, it isn't "patterns IN structure" - it's "patterns AND structure IN representation". Representation can occur in any medium, though audio and visual are the most common. One needn't actually list both 'pattern' and 'structure' since structure implies a very regular pattern. Finally, "rules" do NOT get a place beside "pattern" - while one can view patterns as being rules for representation, that's only valid if a syntax is being imposed rather than observed. Thus, when describing syntax, you can either use 'patterns' OR you can use 'rules', but you can't really use both. Linguistics as a whole is more often concerned with observation than with imposition, and so the common definition of syntax uses "pattern".
- Re: {one can't look at the Eiffel Tower and say, "oh! look at that syntax!"} Most don't do that with graphs either. It usually applies to linear text or symbols on paper or a display equivalent.
- You've ignored an important clause stated above if you think what you just said is relevant. It would also be incorrect to look at a graph and seek syntax UNLESS that graph was a "representation of a language expression".
- How does one tell the difference in a graph? Or better yet, why should it matter? It shouldn't change the nature of the graph if the original expression/source-code is lost or its existence is disputed. That seems a silly/arbitrary point to hinge a definition on. The graph is what it is regardless of whether it came from parsing BrainFuck, a GUI graph editor, or from Ethel Mertz's air-sickness bag.
- I'll answer with another question: How does one know the difference between a JPEG file and a TXT file and a file produced by dragging 30kB from /dev/rand? or difference between ENGLISH TEXT and JAPANESE TEXT and QUICKBASIC TEXT? How would you tell after discarding the filenames and external clues? The raw data is what it is regardless, so why should it matter? Answer these yourself, then apply the answer to how you might tell the difference between a graph that represents a language expression and a graph that describes the structure of a filesystem and a graph that represents a network with nodes and communications and a pseudorandomly created graph. Meanwhile, you should go read the JPEG aloud at the next poetry convention! After all, it'd be silly to define a poem in terms of something representing natural language text (or any similar form)! And stop expecting graph-based syntax to provide any magic answers.
- Determining file types? In Windows it is the file extension, which is a "side flag". If you by chance lose the extension, sometimes one can tell by codes at the very start of the file (more flags). If that doesn't do it, then one can study the pattern of the contents and guess. At this point the file has lost its types because it has lost its type flag(s); so if we want to designate a type for some purpose, we have to study it and guess. When we have formed a guess, the type flag is in our head until we write it down. Same with written (natural) languages. (Ideally, we don't want to have to guess, so flags are good.) What does guessing the purpose or source of a graph have to do with anything? Someone can type Microsoft CLR codes directly, or they can generate a CLR file by running a parser/compiler. It's still a CLR file regardless of whether you used source code, a graph editor, or a hex editor. As far as why I use graphs here, it is to form a common representation between us to help communicate, as already explained.
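A minimal Python sketch of that lookup order (extension "flag" first, then magic-byte clues, then give up); the helper name and the magic-number table are hypothetical and nowhere near complete:

 import os
 # A few well-known "codes at the very start of the file".
 MAGIC_NUMBERS = {
     b"\xff\xd8\xff": "jpeg",    # JPEG images
     b"\x89PNG":      "png",     # PNG images
     b"MZ":           "exe",     # Windows executables
 }
 def guess_file_type(path):
     ext = os.path.splitext(path)[1].lstrip(".").lower()
     if ext:                              # the "side flag" is still attached
         return ext
     with open(path, "rb") as f:          # flag lost: study the content and guess
         head = f.read(8)
     for magic, kind in MAGIC_NUMBERS.items():
         if head.startswith(magic):
             return kind
     return "unknown"
 # guess_file_type("photo.jpg") returns "jpg" from the flag alone;
 # guess_file_type("photo") has to open the file and sniff the magic bytes.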
- Re: "What does guessing the purpose or source of a graph have to do with anything?" It answers your questions... you know, the ones you asked JUST BEFORE the answer? See that huge list of ways you provided for determining with which type a file might be associated? both before and after losing the context-based information from the environment (e.g. the file extension)? You can apply those just as easily to graphs. That answers "How does one tell the difference in a graph?" As to what the purpose and source have to do with anything? you failed to get that far (The raw data is what it is regardless, so why should it matter?). Purpose and source have everything to do with type. Whether a particular pattern of bits is to be interpreted as an integer or pointer or instruction has everything to do with its source. Someone can type Microsoft CLR codes directly and end up with a file that, interpreted as a JPEG, is proscribed pornography. Similarly, one can drag data out of /dev/random and end up with a Microsoft CLR program that, when run, prints various works of Shakespeare. These things are possible, no matter how unlikely. You describe getting a CLR file "regardless of blah blah blah" - all sources that are intended to produce CLR files. That's all fine. But is a JPEG image that just -happens- to be interpretable as a CLR file -actually- a CLR file? No. It is still a JPEG image. This is not silly, and it is not arbitrary. It is also not silly or arbitrary that the definition for syntax is "hinged upon" representation of language expressions. Even if it is physically possible, one it is incorrect to seek syntax in a fully randomly produced graph, or a graph that represents streets and roads in a major city; these don't represent language expressions. Similarly, one shouldn't search for syntactic patterns or structure in "Ethal Mertz's air-sickness bag" unless one believes that the spread and consistency of vomit communicates something important (a little like reading entrails...).
- Okay, I withdraw the "purpose" statement. While I agree that having the original source may be helpful in interpreting and understanding how a post-parsed result (such as a graph or CLR file) works and what it does, the existence or lack of one does not materially change the nature of the CLR/graph. If you claim it does, please provide a scenario. (I do not claim that having the original source changes the nature of the CLR/graph that results from translating or converting it. I must say, though, that when you started discussing graph-based representation as a "common language for communication", I assumed you were discussing the graph as "the original source". I.e. having the graph IS having the original source. I've certainly seen it done often enough. None of my caveats about graph-based syntaxes change whether it is a target language for mid-layer compilation (like CLR bytecode) or a source language.)
- As far as your goal of using graphs to "form a common representation between us": already told you - it's a lost cause, top. Graph-based syntax does NOT allow for a more natural common representation than does any other syntax. It is NOT a magic solution. To the contrary, even just increasing a couple of dimensions in how much one can associate directly increases the number of possible syntaxes exponentially, reducing the likelihood your graph-based syntax will be similar to someone else's graph-based syntax. Use graph-based syntax to help where it can... reducing ambiguity and need for names, and increasing clarity in the declaration of complexly linked value-structures. Graphs are useful for these things. They, however, will NOT accomplish your goal.
- which exist at two entirely different layers of abstraction. I think you're confusing the two, which makes me think that any direct answer I give you will just confuse you because you don't understand the more foundational concepts it is based upon.
Or, perhaps you are mistaking personal internal subjective notions for external reality. It is a common fault in software engineers. (I know it is rude to externalize my personal judgments of you, but you started the practice and I feel I have a right to return in kind.)
- Perhaps I'm 'mistaking' something, but I doubt it. You provide enough clues to your knowledge in this field that even I, a person who spends much of his time communicating with emotionless machines via stunted programming languages, can detect it. Besides, software engineers might be better than normal at detecting "does not compute" errors. ^_^
- But if there are multiple paths for computation (and defs), they sometimes mistake their favorite as the external truth.
- No. They make their favorite computation or definition into the external truth. Software engineers get to choose both how they define a word in a particular context AND how words are processed. And, so long as that computation is a correct one, and all definitions are consistent, all is well and good. That's what programming is all about. But if they become incorrect or inconsistent, things crash and break or simply do wrong things.
Your question, to avoid being extremely non sequitur, should NOT have been about graph conversions of programs, but rather about graph or multi-dimensional representations of programs, which are two entirely different things!
What is the difference and what is an example of the difference? I suspect there is a miscommunication here.
- You like concrete examples, so I'll make this as concrete as I'm comfortable with. Consider compiling source-code all the way down into machine-code. If your compiler is any good, you have the same program that was expressed in the source code. However, this machine-code needn't have any typing information. It is a correct and lossless translation of the program, not a correct lossless translation of language expression represented in the source-code. Representing a program in a graph-based syntax for a language, however, requires that you maintain and represent all the information relevant to the language-expressions, not just the program. Most relevant here, it means representing type-information that is part of the language. Asking to highlight types in a program converted to a graph is like asking to highlight types in a program converted to assembler... it might be doable (there are typed assemblies), but there is no guarantee of it. What you really want is to transcribe the programming-language-expression of a program into a graph-based syntax.
- False. The compiler may forever toss type info for efficiency/space purposes. Thus, it may only be a subset of the original program, not equivalent. You can't reverse the process and get the same thing. (It also tosses variable names and control structures, translating them into RAM addresses and JUMP's respectively, but that is not important here.) The typical compiler *does* use type flags *when* compiling to ensure the proper substitutions via the "Type DAG" are being used. Thus, during the steps that types exist in the computer (compile process), flags are also there. Coincidence? I think not.
- You are incorrect. A program is a description of actions to perform. You might have a 'program' AFTER compilation of the 'source-code', or you might just have a library or module-component that can be used in a program. We can call source-code for a program 'program source-code'. It's useful to recognize that 'program source-code' can be a description of the program OR a description of how to construct the program OR a description of what the program is supposed to do OR some combination of these... depending on the programming-language paradigm. Thus, even program source-code isn't necessarily itself a program. Additionally, a programming-language can, and often does, express and specify far more than the program, including such things as internal consistency checks, types, and proofs-of-correctness. Source code often expresses even more than the programming-language via use of comments, which aren't part of any programming-language expression. All of these things are higher level than the program itself.
- I don't want to get into a definition battle over "program". Let's take a specific example of C-language program steps. It starts out with type-related info in the source code (such as declarations). The compiler also stores type-related info to know the intended types of variables or values as it is processing the source code. It more or less converts the source-code into a data-structure equivalent of the source code while compiling. But the EXE it generates no longer has type info because it would only get in the way for C's target usage. The flags are gone and one cannot find direct type-related info in the EXE. At best one could guess based on usage (if the machine code multiplies bytes X and Y, then you know they are probably numbers, but that is merely using clues to guess). Now a P-code-based compiler like VB will keep type flags in the P-code, and they exist in the run-time engine because VB is semi-dynamic. (This is one reason why the p-code is slower and perhaps larger than C.) When the flags are gone, as with C's EXE, it no longer has "types", at least not observable ones (they might exist in the machine-code reader's head, which is outside of my and your control).
- It is common that the program lacks types (more than one type); so long as no sort of dynamic type-based dispatch is required, the actual program doesn't need them. This doesn't contradict what I said above. The language represented in the source-code often has types to aid in internal consistency checks, but these language-expressions are not actually part of the program. And they aren't intended to be.
- Let's see a specific example. Compilers for static languages sometimes discard type info because they no longer need it to make an EXE. If it no longer needs types, then it no longer needs types. So be it. Having types at stage X does not necessarily mean we need them at stage Y.
- Are you suggesting I contradicted you somewhere? Yes, compilers do discard type information. They can do so because the program doesn't require it. It might have been required in the source, though... e.g. for determining which function to call in the instance of polymorphism, or to (using the common Pascal example) prevent a count of apples to being assigned to a variable that is a count of oranges.
- The compiler uses type info to check for contradictions, etc. But the EXE does not need to use type info (flags), so they are not included in most C results. (Although they may be useful for a debugger.)
- If we disagree somewhere, it's still in the definition of "program", which I associate with "a precise sequence of instructions"... a definition that, by nature, rejects the possibility that all programming-languages directly express "programs".
- I didn't claim they did. "Program" is an overloaded term; so we should perhaps be clear about the stage (source, compiling, EXE) here.
- With this rejection, there is no cognitive dissonance in saying: Compilers discard type information before they produce the program from the program source-code because the type-information simply isn't used in any of the instructions. Having types at stage X does not necessarily mean they were ever part of the program.
- No disagreement here. The C EXE ain't need no stinkin' types (but they do need badges). Flags are gone.
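A minimal Python sketch of the point this sub-thread seems to agree on: type information is consulted while checking, then simply absent from the emitted instructions. The declarations, helper names, and toy instruction set are hypothetical:

 DECLARED = {"apples": "int", "oranges": "int", "label": "string"}
 def check_assignment(target, source):
     """Compile-time step: the type info is consulted here..."""
     if DECLARED[target] != DECLARED[source]:
         raise TypeError("cannot assign %s to %s" % (source, target))
 def emit_assignment(target, source):
     """...but the emitted instructions carry no trace of it."""
     return [("LOAD", source), ("STORE", target)]
 check_assignment("apples", "oranges")         # passes the check
 print(emit_assignment("apples", "oranges"))   # [('LOAD', 'oranges'), ('STORE', 'apples')]
 # check_assignment("apples", "label")         # would raise TypeError during "compilation"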
To provide some answers: A language's source possessing a syntax in 2D or non-linear space (like BefungeLanguage) does not affect or determine the types of the language-expressions, which are still based upon the properties of said expressions within the language being expressed.
Agreed. The "types" may come about from running the program rather than be directly in the program (pre-run). But if one can x-ray the run-time or compiler engine, types will usually be represented as a kind of side-flag(s). That is how one can objectively observe types being created. More on this below.
- "Usual representations" is not justification for theoretical definition, which is where your "flag-theory" falters.
- If it's 1/10th as complex yet 9/10ths as accurate (especially for common stuff), it may be a very UsefulLie, like Newtonian physics compared to relativity. Only when you are using a weird language or accelerating around the sun at 70% of the speed of light will the "better" model be practically better.
- You've a long way to go to convince me that your pet theory is anywhere near '1/10th' as complex. By the time you figure out how to decide which flags go with the implicitly typed variables, your system will be just as complex as every other typing system out there AND you'll have the extra complexity of dealing with flags... you'll still have that lie in your definition that isn't being at all useful.
- The type system is language-dependent. I am just trying to extract the commonalities between most of them because a def should not depend on instance specifics, and that commonality is cow bell, I mean type flags. How the flags are chomped or used depends on the language. The flag-theory def only dictates that side type flags are there and that they affect execution in some way. Static, dynamic, strong, and weak-typed systems will all use them differently or keep them for differing lengths of times. The def does not have to describe how to implement each of these (unless we're seeking defs for "strong typing", etc., which is outside the scope of this discussion).
- [If a type system is language-dependent, and a language has syntax -- which, by definition, it must -- and types are a component of a type system -- which, by definition, they are -- then are you now agreeing that types are tied to syntax? By the way, I don't see any "flag theory" here at all. At best, your "type flags" or "side flags" are simply the usual type identification mechanism often used in interpreters and compilers. It's commonly known as type tags, and doesn't represent a "theory" at all -- it's an implementation strategy. Perhaps what you're fielding is the hypothesis that all interpreters and compilers will use type tags of some sort? That may be true, but it's hardly of interest, because in and of itself it doesn't allow us to do any logical reasoning about the types themselves. It's merely a compile-time or run-time identification mechanism. It's implementation machinery, not theory.]
- (also in answer to top) The commonalities between type-systems do NOT include "type flags". You're just making that up and sticking with it. Implementations of some type systems may use "type flags" but there ISN'T EVEN ONE accepted type system that uses "type flags" at the theoretical level. I will agree that any implementation of a typechecker must, in some manner, annotate the code with types (though whether this annotation involves any 'flags' is up for grabs)... but here's where we diverge IN A VERY FUNDAMENTAL MANNER: Your theory is that the annotations/flags ARE the types, which means that the expressions had NO TYPE AT ALL before they had annotations. All respected theories result in implementations where annotations REFLECT the types THAT EXIST EVEN BEFORE ANNOTATION, because, in all the respected theories, the language expressions HAVE TYPES DERIVED FROM THEIR INTRINSIC AND EXTRINSIC PROPERTIES; by annotating types, you're annotating those properties that happen to be relevant in your type system. Would you not agree that these are two contradictory viewpoints at the theoretical level? And your continuing to ignore two significant problems with your theory does you no credit at all: (1) how do you assign "flags" to expressions that are not manifestly typed? (2) how do you determine whether there are any violations of manifestly stated type constraints if the types are just "flags"? You'll find that answering either of these requires you abandon your current position that the types are the flags. If the potential need to abandon your position is why you refuse to hammer out an answer consistent with your theory, then I feel more than a little disgust with your approach to theory.
- Regarding (1), see "inferred types" at TypesAreSideFlags. Regarding (2), "violations" are language or run-time-engine specific. There is no need to include that in a general definition. Like I said, flag theory only dictates that the flags are used (differences in flag values can affect outcome), but does not need to specify any details about how they are used. Substitution via type-DAG lookup is ONE technique for using type flags, but is below the definitional level of "types".
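(Illustrative aside on "inferred types", with a made-up expression encoding, in Python: an inference pass walks an expression tree and attaches a flag to every node, including nodes the programmer never annotated. Note that each flag is computed from properties of the sub-expressions; whether that means the flag *is* the type or merely *records* it is exactly the disagreement above.)

 # Toy expressions: ('lit', value), ('add', e1, e2), ('concat', e1, e2)
 def infer(expr):
     # Returns (flag, annotated-expression); no flag is ever written by hand.
     kind = expr[0]
     if kind == 'lit':
         flag = 'int' if isinstance(expr[1], int) else 'string'
         return flag, ('lit', expr[1], flag)
     if kind in ('add', 'concat'):
         want = 'int' if kind == 'add' else 'string'
         f1, e1 = infer(expr[1])
         f2, e2 = infer(expr[2])
         if f1 != want or f2 != want:
             raise TypeError(kind + ' expects ' + want + ' operands')
         return want, (kind, e1, e2, want)
     raise ValueError('unknown expression')

 print(infer(('add', ('lit', 1), ('lit', 2))))
 # ('int', ('add', ('lit', 1, 'int'), ('lit', 2, 'int'), 'int'))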
- Oh? So nothing in your definition of types requires that types ever be used, or even be useful, for the particular property known as TypeSafety, eh?
- No. I've recently modified it to require some kind of usage (changing the flag can potentially change the output), but the details are an implementation/language-specific detail.
- Nor does your definition require any particular means of choosing which flags go with which expressions because the types ARE DEFINED AS the resultant flags and types ARE FUNDAMENTALLY INDEPENDENT OF any rules required to reach them. Therefore, by your theory, a program that just takes an 'ispell' dictionary and attaches flags pseudorandomly stands at an equal level with a program that looks at each expression and assigns 'flags' based on careful testing of properties (e.g. only assigning "integer" to expressions that it first guarantees will produce something with the mathematical properties of an integer). I already reject your definition on the theoretical level; now I'm quite convinced that it is also useless even on the operational level.
- Correct. Sloppy parsing can produce "types" just as well as a nice compiler. The definition is NOT a quality-detection mechanism, nor should it be. Similarly, a poorly-normalized relational database can still be "relational". Quality of usage or of a particular implementation is a lower-level issue. Types officially exist when a clearly-identifiable flag is present, and they disappear if the flag is tossed (like when a C compiler makes an EXE file).
- Oh? When did your pet-theory become official? I want you to expand on the "nor should it be". Prove it to me. Convince me that types require no reasoning at all... and that they shouldn't, and that all type theories that insist they do are wrong. I'll give you a few days. Please make it an essay with a thesis rather than a half-arsed answer here, and link to it.
- Why should the definition be tied to type-system quality? That should be part of techniques, not definition. Most paradigms/definitions in comp.sci are not tied to quality, so why should types be an exception?
- You just stated quite clearly that it's a "duck" because you labeled it "duck", even though it walks like a dog, looks like a dog, and barks like a dog. I'm rather interested in seeing your explanation and justification for this incredibly eccentric vision of what 'types' actually 'are'. As far as type-system quality: if you are not aware now, you should become aware that ALL major type-theories that exist are founded VERY HEAVILY in mathematics and proofs of consistency and correctness. The very definition and nature of "TypeSafety" is founded in verification and deductive proof that a program will not perform actions that have undefined results. Even object and behavior classification comes with concepts such as false-positives and false-negatives. Quality is absolutely part of any type-system, and the concept that a type (like "duck") can be rightly or wrongly applied to an instance (like a dog) is tied into the very nature and definition of types under ALL accepted type-theories. You're attempting to deny all this. So, as the man raging against the establishment, it's YOUR job to answer that question: Why should the definition of 'type' be dissociated from properties and limited to just the name?
- It is unclear to me what the proverbial duck or dog is here. As far as what all type definitions have done in the past, I don't give a fudge. Perhaps they are just doing one-up copycatting. It is purely ArgumentFromAuthority. I will agree that definitions may indeed have a component of ArgumentFromAuthority. However, they fail to provide a relatively simple answer to WhatAreTypes. Consider these three definitions of "rocket":
- A device that uses the directed force of expanding combustion in a uni-directional container to propel itself.
- A device that uses the principles of Dr. Foo to perform according to the criteria set forth by Dr. Foo.
- Any device that a pigeon has flagged with "rocket". The pigeon may or may not be trained.
- The first definition tells the reader what a rocket *is* (more or less) without taking him for an *unnecessary* ride down Academic Lane. It does not get into the difference between a "good" rocket and a "bad" rocket because that is unnecessary for more practical definitions. A bad rocket is still a rocket, even if it blows a hole through the fence and kills the cat.
- I agree. The first definition classifies a 'rocket' in terms of its properties. One can be correct or incorrect in making the classification. And, contrary to your assertion, it does get into the difference between a "good" rocket and a "bad" rocket: a chair makes a very bad rocket, and that is reflected in this definition. What it doesn't get into is morality or directional control: a rocket used to kill a cat through a fence is still a rocket whether you do it by accident or on purpose.
- Please explain how a chair fits "rocket"?
- What for? The purpose of bringing up 'chair' is that it very clearly doesn't fit 'rocket'. Thus one would be incorrect in classifying a chair as a rocket; chairs are "very bad" rockets because they do not fulfill the definition of 'rocket' even a little bit.
- Okay, I guess I misunderstood your original point. However, something that is not a rocket, such as a sky elevator, can still potentially do useful stuff; not being a rocket and not doing well the things rockets often do are not necessarily related.
- Agreed. If you needed a type for: "things that can transport at least 600 kg mass into Low Earth Orbit", some rockets might possess that property, as would realistic space-elevators and space slingshots and (speaking sci-fi) teleportals. That's a different property by which to type or classify objects than 'rocket'-hood.
- The second definition would also be a fine definition for something or other... not sure it would be consistent with "rocket", but you can certainly type things as "this is something that uses the principles of Dr. Foo within the constraints set forth by Dr. Foo". And if Dr. Foo is describing a useful set of principles, there might be some interesting properties you can derive for these particular objects.
- "Dr. Foo's Rocket Implementation Concepts", which is what the existing type work seems to fit.
- "Dr. Foo's Rocket Implementation Concept" can be used as a definition... e.g. an operational definition for a type. Whether other users of the English Language would agree that it has anything to do with what they mean when they say 'rocket' is a different matter. Perhaps they'll describe it: "that idiot Dr. Foo's foolish concept of 'rocket' that happens to classify my spoon, my chair, and my cat all as 'rockets'". If you are going to overload the word 'rocket' to (ambiguously) also reference your formalization of the word, you best make sure that your formalization is either consistent with the existing understanding of 'rocket' OR so damn different that it is very easy to tell which a speaker means by context (e.g. as between 'letter' the thing you drop in the mail and 'letter' the character represented by glyph upon paper). That, however, is a natural-language and naming issue, and is completely independent of the actual types or concepts.
- Being a useful definition for something *specific* is not a problem as long as it is not labelled THE definition of rockets or types. Watt's Steam Engine Design is NOT "The Steam Engine". It appears that what is called "type theory" these days is committing that very sin. We don't have to accept a bad name forever, do we?
- I agree... we, who are not laymen in a field, should cast away bad or informal definitions for words in favor of precise, formal definitions. So, looking at TypeTheory, there exist three main approaches to typing: typing based on set-membership, typing based in category theory, and typing based in predication by first or second-order logic. All of these are equivalent and possess a common set of features on what it means to be a 'type', even when dealing with the outer computational limits where decidability becomes an issue (fixpoint, coinductive, constraint-based, and dependent types). In particular, types of described things are always determined by both their intrinsic properties (e.g. the structure of the description) and their extrinsic properties (e.g. where the description is used, or to whom it is communicated). Thus, it is not merely "useful", it is also quite formal and precise and (very importantly) consistent with every accepted type theory. It is the whole combination, not just utility, that should make it THE definition of types for 'described things' (aka 'language expressions' and 'values'). Though it doesn't yet touch upon classification for mutable objects.
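(A rough illustration of the set-membership/predicate view, in Python with hypothetical predicates rather than a formal theory: a type is treated as a predicate over values, and whether a value "has" that type is decided by testing the value's properties, not by reading a label attached to it.)

 # Types as predicates: a value has a type iff it satisfies the predicate.
 def is_natural(x):
     return isinstance(x, int) and x >= 0

 def is_even(x):
     return isinstance(x, int) and x % 2 == 0

 def has_type(value, type_predicate):
     return type_predicate(value)

 print(has_type(4, is_natural))   # True  -- follows from the value's properties
 print(has_type(4, is_even))      # True  -- one value can have several types
 print(has_type(-3, is_natural))  # False -- no flag can make -3 a natural number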
- Those are implementations. It may be like saying, "all commercial rockets that ever made it to space use bell-shaped combustion chambers, therefore all rockets must have bell-shaped combustion chambers". This also makes them too cryptic for regular IT joe use.
- Not at all. It'd be more like saying: "all things we have ever called 'bell-shaped combustion chambers' have a shape that can be described by a particular equation... call it 'bs'. As experts in the field of rocket science, let's define 'bell-shaped combustion chamber' as any 'combustion chamber' that possesses a shape describable by equation 'bs'. This way, we can know exactly whether the name 'bell-shaped' can be applied to any -particular- combustion chamber in the future."
- Can you confidently describe current Type Theory as a general definition and not a technique? See TypeTheoryParsimonyChallenge.
- And, frankly dear, we type-theorists don't give a damn about regular IT joe-user. And we shouldn't... not while wearing the type theorist hat. Sacrifice of correct communication for ease of explanation is justifiable, but sacrifice of theoretical correctness for ease of definition is not.
- Them not giving a damn certainly shows. I shall try to correct that one way or another. And Flag Theory has not been proven "incorrect".
- [Huh? Flag Theory has been shown to not even be a "theory". At best, it's a terminology equivalence: "Type" = "Side Flag", where both are something associated with values. It introduces a synonym for the word "type", with no description of what a type is, or how types behave in a type system. As a theory, it has no actual content that can be used, applied, or evaluated in any way. You can make no deductions from it, you can make no predictions from it. Therefore, there's nothing there to prove incorrect, because there's nothing there.]
- Do we have to battle over what a "theory" is now also? You have not established that a definition of types should be heavily dependent on "how it behaves". You seem to be walking around with a bag of PREconceived notions on what a definition of types should contain. I challenge your claimed prerequisites! As far as "make no deductions", that is not true. Languages/engines based on type flags behave differently from those that don't. Parse-centric engines behave differently than flag-centric engines, for example, and one can test for that. Flags are a model to explain behavior seen in languages, compilers, and interpreters; and they do it better than any non-cryptic explanation I've ever seen. It is based on years of my observations and looking for a relatively simple way to explain those observations. I've sought out both traits of types and parsimony in their description. (It at least reflects "common" languages. If it falters for oddball languages, that may only matter to oddball language users.)
- [Battle over what a "theory" is? Heavens, no. I'm using the standard definition as it applies to mathematics (and by extension, computer science): A body of related knowledge, consisting of axioms, definitions, theorems, computational methods, etc. The theorems and computational methods, in particular, can be applied to real-world situations, thereby permitting deductions and predictions about the real world. As it stands, "flag theory" consists of a single trivial axiom, which boils down to "every value has a (type = side flag)", without defining "type" or anything associated with the "side flag". So what do we do with side flags? Shouldn't that be part of "flag theory"? If not, "flag theory" is nothing but a terminological replacement for "type". Wherever you see the phrase "a value has a type", replace it with "a value has a side flag." Okay, I've done that. What do we now know that we didn't know before? Nothing. We still need some theory (either a valid "side flag" theory, or type theory) to explain what to do with the side flags. Net gain? Nothing.]
- Your own approach is more akin to the third definition of "rocket" written above, where TypesAreSideFlags. IMO, it's silly and stupid and lacks any justification. But you say this is "Correct. Sloppy parsing can produce 'types' just as well as a nice compiler."
- Please elaborate. Your comparison makes no sense to me.
- You have argued that the types are the side-flags, the tags, no matter how they come about... be it formally, sloppily, or even randomly. Flagging by trained pigeon is decidedly better than random (unless the pigeon was trained wrong). Look above and read your own statements. You can recant if you disagree. Under your concept that TypesAreSideFlags, the object has a type because it has a particular side-flag... so, for example, if a pigeon attaches the flag 'rocket' to something that we'd call a 'chair', who are we to say it's wrong? The pigeon said it's a rocket. It must be a rocket.
- Remember that the flag has to make a potential difference in the outcome. As far as pigeons programming, a "program" can potentially be made by a bunch of pigeons pecking at a keyboard. Whether it produces anything usable by humans or not is another matter. (Maybe such randomness could be useful for making sure EXE crashes don't crash the whole OS. Or, be used as mutations in genetic algorithms.) A misuse of type flags should not be prevented by the definition itself any more than the definition of "rocket" should also prevent misuse of rockets. If someone turns a coffee cup into a rocket by dumping cleaning alcohol into it and lighting it, it is *still* a rocket. If spicy pigeon poop lands in a discarded but hot cigar, combusting to send the cigar 10 feet, it's still a rocket (or could legitimately be called such by a human observer).
- Sure. Just use the flag to determine whether you're going to 'rm -rf *' vs. 'touch hello_world.txt' and it has made a difference, has it not? And I'm not talking about pigeons programming. I'm talking about pigeons helping the compiler or interpreter out by being the mechanism that determines what side-flags certain expressions receive. After all, something needs to determine that, and you've already said it can be random or sloppy. Pigeons qualify. However, you're contradicting your earlier statements. Are you now saying that something is a rocket even if it NEVER received the side-flag "rocket"? That the type exists regardless, because it "could" "legitimately" be called a rocket? (equal emphasis on both those words). If so, then congratulations! I believe we have an agreement!
- I think you are confusing "classification" with "types". I don't consider them the same thing (although there may be some overlap). Classifying real-world nouns and defining "types" are two different issues here. As far as "quality matters", relational does not require normalization, OOP does not require that the OO be "good" to be called OO, programs don't have to be right to be called programs, etc. The "quality" part is treated at a more specific/detailed level. You are not doing a very good job of convincing me that the definition should define a minimum standard of quality. Why should types get special treatment that other defs don't?
- They aren't truly two different issues. There is typing of some sorts of things (like values) and typing of other sorts of things (like objects) and typing of the sorts of things (values vs. objects). The fact that this discussion about types keeps coming back to rockets and chairs is a reflection of this truth. I have not confused the two. Anyhow types, by nature, can be "legitimately" or "illegitimately" applied to an instance. That is part of their nature, and that is why they get a special treatment that OO and Relational do not. 'Object' can be correctly applied to anything that meets the rather loose set of properties required for 'object-hood', and so an implementation of OO can be rather awful but still be called 'OO' if it meets that requirement. 'Relational', similarly, does not require normalization. The very idea of 'types' and 'classification', though, is based around the notion of necessary and accidental properties... e.g. those properties are required for something to legitimately be called 'Relational' or 'OO'.
- Why have or use two different words ("type" and "classification") if they are not different concepts?
- It's a peculiarity of English, which also uses the words 'sort of' and 'kind of' and even 'form of'. I've not educated myself in the etymologies of these different words and word-phrases, but I believe that 'kind' originates more from breeding or taxonomy (e.g. man-kind), and that 'sort' comes, to some degree, from the act of putting different things into different buckets. "Class" or "Classification" seem to be associated with privilege (high class, low class, working class, classified, top-secret), but it also refers to other groupings (e.g. class of students). In the CS field, the use of "type" and "class" in particular have come to refer more often to types of values and types of objects respectively, owing much to the popular languages that used these as keywords. Thus one discusses value-types and object-classification. It helps in conversation to have such distinctions, and it helps emphasize some of the differences between typing of immutable things vs. typing of mutable things. OTOH, that comes at the cost of distracting from the similarities in the underlying concepts. ("kind", "sort", and "form" have also been taken by different users of the CS field, referring respectively to the types of types, algorithms for putting objects into a partial ordering, and HCI where a user provides input to and submits a bunch of fields all at once).
Indeed, if the graph syntax allows for manifest type-annotations, they're still manifest - just as manifest type-annotations are when you see them in text. Therefore, if you want to highlight them with a yellow pen, you can still do so... at least if you can find a nice way of organizing a representation of the graph upon a 2D plane so you can print it upon paper. OTOH, if types are mostly implicit in the graph syntax, you'll have just as much difficulty in highlighting them as you would have highlighting implicit types in any other syntax. This would all be quite obvious once you actually are familiar with and literate in the higher-dimension syntax rather than in your current position of trying to wrap your head around the concept.
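(A minimal Python sketch, with a hypothetical node/edge encoding, of what "highlighting manifest annotations in a graph syntax" could amount to: the program is a graph whose nodes may or may not carry an explicit type annotation, and "highlighting" is just a selection over the nodes that do.)

 # A toy program graph: nodes keyed by id, edges as (parent, child) pairs.
 nodes = {
     1: {'kind': 'var',    'name': 'x',  'annotation': 'int'},  # manifest
     2: {'kind': 'lit',    'value': 42,  'annotation': None},   # implicit
     3: {'kind': 'assign', 'annotation': None},
 }
 edges = [(3, 1), (3, 2)]  # the assignment node points at the var and the literal

 # "Highlighting" the manifest annotations is a simple selection over nodes.
 manifest = {nid: n for nid, n in nodes.items() if n['annotation'] is not None}
 print(manifest)   # {1: {'kind': 'var', 'name': 'x', 'annotation': 'int'}}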
What I am trying to do here is provide a concrete common representation (graph), and then find a way to identify types with that representation. By using a common format/representation, we can get away from things like semi-colons and quotes (syntax). If we need to represent the run-time or compiler engine as a graph also, not just the program, then so be it. (You don't seem to get what I am trying to achieve, and instead focus on the specific suggestions/examples meant to help communicate, and criticize me based on the fact that the examples are not all-encompassing.) If, on the other hand, types are ONLY in the mind, then a common, tight definition is impossible. Mathematicians invented a common notation to help them describe and communicate their stuff: Math. Similarly, I am proposing we use graphs as the common notation to aid in communication. If you have another suggested representation that is not a specific existing language (like C++ or Pascal), please bring it up. (Maybe that's what Reynolds did, but it is a difficult notation and not very wiki-able.)
What I propose is that in at least one of the graphs of the various steps (program, compile, execution, etc.) one will SEE side-flags of some sort in most things that people agree have/use "types". This is hopefully the concrete external objective representation, the Higgs Boson tracks of "types".
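(A minimal sketch of what such an "x-ray" might show for a dynamic run-time, in Python; this is a toy tagged-value cell, not any particular interpreter's actual layout. The observable "track" is the tag stored beside the payload.)

 # A toy dynamically-typed cell, roughly how many interpreters represent values:
 # some payload plus a small tag saying how that payload is to be read.
 cell_a = {'tag': 'INT',    'payload': 7}
 cell_b = {'tag': 'STRING', 'payload': '7'}

 for cell in (cell_a, cell_b):
     print(cell['tag'], '->', repr(cell['payload']))
 # INT -> 7
 # STRING -> '7'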
You cannot escape syntax by moving to a graph-based notation or representation for a language. Do you not see the inherent contradiction there? A graph-based notation or representation IS a syntax. What you CAN do, however, is avoid and simplify various other obtuse or unnatural syntax, especially when it comes to naming of things like variables and types or declaring data with complex associations. (In a graph-syntax, you can often avoid names entirely because nodes can represent named things and certain edges can indicate their placement.) Also, a graph-based syntax can allow for more interesting object and source-browsers, and a better programmer UI. These things are positives, but they aren't magic that resolves the need for or existence of syntax.
Also, moving to a graph-based notation does not create or eliminate any language-features if you're still expressing the same language... using graph-based notation will not give you closures or multiple dispatch or module-support or concurrency-support where they did not exist before. It will NOT eliminate the need for other languages that provide the features that the graph-based language fails to provide. The QuestForThePerfectLanguage might very well lead to a graph-based primary syntax for a language, but the graph/graphical/visual syntax is not, itself, an answer to many real language-design problems (though it might be an answer to ONE real problem, which is suitable and intuitive syntax and representation).
As to the idea of allowing programmers to easily 'see' the annotated program-graph including type annotations ('side-flags'), that's a laudable goal. It just doesn't support "flag theory". The flags there are reflecting the types, not defining them.
Again, the usage of the type flags is usually language- or app-dependent, and thus should not be part of the definition -- except perhaps requiring that the flags make a difference in at least one spot; otherwise they are merely documentation at best. I could agree to that. In our definition, this would ensure that the type indicators are "used", but would not dictate how. Deal?
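(A minimal Python sketch of that "the flags are used" requirement, with hypothetical tags and a toy operation: the same payloads produce different output when only the tag differs, which is the weakest thing the proposal above asks for.)

 def plus(a, b):
     # Dispatch on the side flag: the tag, not the payload, picks the behavior.
     if a['tag'] == 'INT' and b['tag'] == 'INT':
         return {'tag': 'INT', 'payload': a['payload'] + b['payload']}
     # anything else is treated as string concatenation
     return {'tag': 'STRING', 'payload': str(a['payload']) + str(b['payload'])}

 x = {'tag': 'INT', 'payload': 1}
 y = {'tag': 'INT', 'payload': 2}
 print(plus(x, y))                      # {'tag': 'INT', 'payload': 3}
 print(plus(dict(x, tag='STRING'), y))  # {'tag': 'STRING', 'payload': '12'}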
Your theory won't survive on politics and 'Deals', or at least I won't help it do so. So, no deal. You speak of type flags as though they necessarily exist, and I'm not willing to give you even that much.
MetaDiscussion? moved to VaguesDependingOnVaguesDiscussion? (@TheAdjunct)
See also: WhyTypeSyntax
AugustZeroSeven