DeleteWhenCooked
- Last edit 2005-04-03: Is it cooked yet?
This is TOPMIND's working definition for the word "syntax":
{Note that TopMind did not create this topic.}
To me, "syntax" is a string that does not have built-in separators between the "elements". When they are formally separated and the boundaries or isolation is directly encoded, such as put into a tree, it is not "syntax", but a "structure". A structure isolates "atoms" of the expression, while a syntactical expression leaves that to the reader or parser.
Syntax:
x = a + (b * c)
Structure:
Please don't redefine well-established terminology. What you call syntax is ConcreteSyntax?; what you call structure is AbstractSyntax? (or an AbstractSyntaxTree).
<http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?concrete+syntax>
<http://www.cs.dartmouth.edu/~cs68/02w/lecture-01-07a.html>
Aren't they just different manifestations of the same thing? If not, what is the difference?
No, they are not the same things. Concrete syntax requires additional rules to create the tree. Concrete syntax is effectively a string, a syntax tree is a tree. They are not the same thing, and to think they are exposes confusion at a deep level.
No need to create a new topic when SyntaxDefinition is still small.
Oh, but it's very much needed because you deserve your own definitions. Are you afraid of your brand name?
Wiki is not about ArgumentFromAuthority.
Actually, there are numerous definitions of "syntax" both in the literature and in English proper; which one is correct depends on context. The two of you (TopMind and Costin) are being rather unproductive when you argue which definition should apply; 'tis easier to clarify than to invoke LaynesLaw.
{I am not arguing which should be right, I am arguing that the definition is too open-ended to deserve to be in THE definition for types. A definition does not deserve to barge in and shove the other ones to the side unless it is precise and simple. -- top}
- Top's definition, if I understand it correctly, seems to come from introductory compilers class; where the syntax of a language (or of a program written in the language) is often presented as the particulars of the arrangement of tokens/lexemes that make up a program's text - that stream of characters that gets turned into an AbstractSyntaxTree via application of a LALR(1) or similar parser. And while that's a somewhat valid definition for syntax - and appropriate if we are discussing how to create LALR(1) parsers or RecursiveDescentParsers, etc... such a program-text-centric definition is less appropriate for a discussion on TypeTheory. Within the realm of TypeTheory, the most common definition for "syntax" is what Costin uses - a somewhat wider definition of "syntax" as "what the programmer types in". The TypeTheory definition is intended to abstract away such irrelevant (to TypeTheory) details regarding a program's lexical structure and grammar. As the subject matter is TypeTheory and not parser construction, Costin's definition is probably more appropriate.
- Even in the introductory compilers classes, I don't know of any author who came up even closely to the "syntax is a string that ... ". Does he want to say that the result of syntactic analysis is a "string that ..."? Is it a confusion between the result of a lexer and the word "syntax"? Without question, TypeTheory operates on syntactic elements as they appear in AbstractSyntaxTree, for example, but whatever definition Top is thinking of, I cannot make any sense of the above phrases defining "syntax"; it doesn't correlate with anything I know about languages or compilers or anything, so I thought the only solution was to refactor it in TopDefinitionForSyntax and let him sort it out. Let him take responsibility for letting us know in a coherent way what he understands by "syntax". I reject that there's any useful usage for the word "syntax" that defines syntax as a "string that ...". I suspect he might be referring to something else altogether.
- What's the difference between syntax and grammar then? Grammar is the rules for how symbols are combined or not combined. Syntax is the nitty gritty of characters in the string. -- top
Actually, I'd like to see any reference at all to a definition that resembles Top's. I posit that Top just came up with the above in a HumptyDumpty move, and if nobody can produce any definition where "syntax is a string ...", that will just confirm my hypothesis. It's not unproductive to insist that we do not go HumptyDumpty about standard terms in computing science.
What the hell are these "standards", and who decides what gets to be standard? -- top
Standards bodies, for one. The community that adopts de facto standards, for another. Leg snort fungible sand blue clock. Isn't it nice that we all agree on the meanings of words and don't just pick our own? Standards.
Standards bodies generally do not define words, except for specific contexts.
The intellectual marketplace decides for definitions and meaning of terms for another. All the people who followed a formal ComputingScience or SoftwareEngineering curriculum in the past 30 years will have been taught the same or very convergent ideas about basic things related to formal languages, compilers, algorithms, etc. Also, older people or people without formal education but who bought books by reputed authors and studied to improve their knowledge and their value in the marketplace, those people will also have the same ideas about basic notions in computing science.
- I would like more "science", such as surveys and references for your alleged "convergence". If you are going to use ArgumentByVote?, do it right. Further, things might move along more smoothly if you would simply say things such as, "Based on my working definition given in exhibit Foo, I conclude that....".
What's the source where we can trace your definition that "syntax is a string that does not have 'built-in separators' between elements"?. Did you get it from somewhere or did it just pop into your head? In the latter case, you're the
HumptyDumpty of this wiki.
Sorry, I'm new here, but I've read the whole back and forth over the word "syntax" and I'm repressing a need to smack Top upside the head. Here's the dictionary definition of "syntax":
[http://dictionary.reference.com/search?q=syntax]
-
- The study of the rules whereby words or other elements of sentence structure are combined to form grammatical sentences.
- A publication, such as a book, that presents such rules.
- The pattern of formation of sentences or phrases in a language.
- Such a pattern in a particular sentence or discourse.
- Computer Science. The rules governing the formation of statements in a programming language.
- A systematic, orderly arrangement.
Your redefining of it is absurd and is getting in the way of what he's trying to say. A syntax is a set of rules. Specifically, it's the set of rules that make up a grammar. What you're calling "syntax" is a specific kind of syntactic expression. Now, I'm not even clear exactly what you're trying to get at with your redefinition, but your definition of syntax has no basis anywhere in either computing or in English. There's no such thing as a string which is "syntax". It's an abstract, separate from any representation of it. As a
RelationalWeenie, this should be a familiar concept. -- ChrisMellon
?
If you view DataAndCodeAreTheSameThing and believe in SyntaxFollowsSemantics, then something being represented as a string or something else, such as an AbstractSyntaxTree, is a matter of preferred manifestation. Rules alone did not distinguish anything. Lot's of stuff has rules. However, my working definition is really immaterial to Costin's arguments. The real issue being his vague definition, regardless of which manifestation of a grammar is used. I don't care which definition you use as long as it is clear. Depending on a vague global gestalt is not good enough in this biz. -- top
Rules are the definition by which you distinguish something, at least when you're talking about language. A language is defined by its syntax as much as by its semantics.
What is the clear-cut distinction?
[Syntax is the rules that define a valid statement. Semantics is the meaning of that statement.]
Even a meta-language, where you work directly on the syntax elements, still has syntax, and there are other syntaxes involved, such as the syntax used by the query language you use to manipulate the syntax tree. Now, as to human languages, effective communication is based on global consensus of definitions. His definition of "syntax" was not vague - he was arguing for the universal definition, which I've quoted specifically above. What you were claiming was sufficiently out in left field that you might as well have been calling it soccer.
Yes, syntax involves rules, but so do a lot of other things. One needs to find a clear-cut distinction between "syntax rules" (grammar rules?) and other kinds of rules. Having rules might be necessary, but is it sufficient? You are welcome to reject my definition, but we still need a clear alternative to deal with the "type" issue (WhatAreTypes).
[Eh? Now you're confusing me. What rules are you confusing with syntax rules?]
... I've just read SyntaxFollowsSemantics. It's the same problem - you're confusing strings with syntax. Parse trees are syntax. Syntax highlighting in an editor is NOT a new syntax, it's a new textual representation. Looking at rendered HTML rather than HTML source does not change HTML syntax. There's no such thing as a language without syntax - such a thing would be a language without grammar or structure and therefore would be useless for communication. The CLR and IL doesn't have anything to do with removing the need for syntax - IL has its own syntax and you can write directly in it. You can't take IL and generate the original C# code for it, especially if it was actually written in VB.NET. All the languages supported by your hypothetical intermediate form would have to have exactly the same syntactic richness, otherwise you'd lose information going from one to the other. It's a leaky abstraction. Every one of your supported languages would have to have one and only one way of expressing any given concept. Furthermore, much of what makes code usable by humans isn't syntactic - you can auto-generate C from any number of languages using any number of tools, but the resulting code isn't something that's useful for humans to work on. And, of course, it's another 1-way transition - I defy anyone to write a useful C-to-Python compiler.
Re: "Parse trees are syntax"
But are all trees syntax? What is the clear difference between a parse-tree and a non-parse tree? "I know it when I see it" is not good enough for definition issues. I think this was brought up under TopOnTypes IIRC.
[A parse tree is a tree produced by parsing something. I'm not sure why you're confusing them with other types of trees. If you didn't produce it by parsing something, it's not a parse tree. I think you're reading far, far too much into the word "syntax" and want it to take on some sort of existential meta-meaning. Syntax are rules by which you extract meaning (see definition above). A rule that doesn't apply to syntax is obviously not a syntax rule. You don't need extensive navel-gazing here, it's a pretty simple concept.]
But the origin of the tree certainly cannot be part of a useful definition. Such a definition would not say anything about the nature of the thing itself.
There are two issues here:
- Your definition of "syntax" is absurd and wrong and insisting on it is getting in the way of useful discussion. If you use the common terminology, then you can talk on equal ground with other people
- Fine. Ignore it for now. But I would like to see a better replacement definition from the critics.
- Err.... Isn't there already better replacement definition? The definition which other people agree to use, one that you are disagreeing on.
- The idea that there could be (or should be) some sort of meta-language that you use to extract the important bits from a program so you can work "directly" with program code instead of in C, or VB, or whatever. It's an interesting idea, at least, if you avoid muddying the waters by talking about removing syntax. I'm extremely skeptical that it would ever work in practice, and I am certain that it won't work between any 2 arbitrary languages.
- Such meta-languages exist - OpenC++ for example, gives one reflection over the syntax tree of a C++ program, which lets you do anything from defining new keywords (e.g. "persistent class Foo {") to whole program transformation (though this is difficult - OpenC++ really is geared toward MetaObjectProtocol functionality). Tools dedicated to exploring and transforming the structure of other programs can be googled for as "program understanding". There are also "languages" designed to work with semantics more or less directly, with a minimum of actual syntax. Such languages work in a paradigm more or less in its infancy called IntentionalProgramming.
- This kind of stuff is interesting but it's not what I hear touted as the "new way", and what Top seems to want, which is storing all your code in some abstract format and viewing it as whatever language you want. Working with source code both as text and as "code" is of course powerful. I'm skeptical about IntentionalProgramming as well, it has shades of the PR about COBOL to me, but I'm perfectly willing to be proved wrong. What I do not believe is that there will be a magic "code" format which C++ programmers see as C++ and which VB programmers see as VB. Note that .NET does *not* do this, because implementing it as described requires 2-way transforms between all languages.
- Taken up in SeparateMeaningFromPresentation and AdvantagesOfExposingRunTimeEngine. Note that I did not create this topic and wish it was renamed because it has been used to embarrass by making it sound as if I am proposing a new official definition of "syntax". -- top
- The issues above are about definitions, not practicality. There are other topics for that, such as AdvantagesOfExposingRunTimeEngine.
- [I agree but I don't want to delete now that someone else has responded and I'm not sure where to put it.]