It is claimed herein that the syntax of a programming language matters a great deal, for several reasons. (Here, syntax refers to all textual elements of the language, including both lexemes and grammar; no distinction is made between the "lex stuff" and the "yacc stuff".)
- Readability. Some powerful languages are (subjectively) ugly to look at, and it has been argued that this has limited their use; this is often claimed about LispLanguage. (Many Lisp users disagree vehemently with this.)
- OTOH, PerlLanguage is often called a WriteOnlyLanguage; yet this hasn't hurt its popularity one bit. :)
- Yeah it has. I've seen many projects reject PerlLanguage because it almost encourages writing unmaintainable code.
- For many, "readability" means "similar to what I'm already familiar with"; which is why many more modern languages use the Algol/C-style notation.
- I pity the new language that doesn't enclose execution blocks in curly braces. The vast majority of programmers will, even unconsciously, decide it's hard to understand.
- I pity the programmer who thinks everything needs to look like C.
- I'd have them try and write C-like syntax on a German keyboard layout, I wonder how many would appreciate being able to type *end* instead of twisting their pinky finger to Alt-Gr or Shift anytime they want to type a curly brace or a semicolon...
- The trick there is to buy a standard keyboard.
- Excuse me? Standard keyboard? Do you mean US-Keyboard? I'm in Switzerland and when I go to Media Markt, the local equivalent of Best Buy, the Standard Keyboard is a Swiss one. There are braces on the keyboard, and although they would be just slightly difficult for the user of a US keyboard to type the first time, the second time and all subsequent times should be no problem.
- Ahh, so that's why C isn't as popular in foreign speaking Europe countries. It's hard to get people to just buy a keyboard, it is kind of like telling people to just stop reading the internet in French and start using Chinese, for creeps sake - is it that hard!
- Europe is not like what is pictured in the film Amadeus, where everybody walks around in period clothing speaking clear American English.
- It wouldn't be at all difficult to make braces optional, with the offsides rule in use if braces are not.
- HaskellLanguage works this way
- It wouldn't be at all that difficult to make a lot of things optional, like switching off "==" and making it "=" and changing "=" to "==" depending on who likes what. However, this is poor language design, as seen in Ruby and Perl, where there are 246 different ways of doing the same thing. It complicates compiler source just to satisfy some fool who doesn't like consistent elegant code, and prefers "features" that are quite close to "bugs".
- SyntaxMatters because syntax is the "interface" to our psychology and physiology, and PsychologyMatters.
- Writability. Treated separately from readability; many things which are readable are a pain to type (things which are very verbose, in particular). Given that the primary means for entering computer programs still is typing into an editor of some sort, this is also a concern.
- A big component of 'Writability' is 'ExpressivePower' - the ability to express a range of concepts directly. A concept is expressed indirectly if you must write out the mechanism for it at the place where you use it. E.g. a for-loop incrementing an index across an array is an indirect means of writing a 'foreach', and 'foreach' itself is an indirect means of specifying various operations over collections that may be better served or more directly expressed relationally. The ExpressivePower of a language starts with its syntax and its language primitives, but also includes its available components (functions, procedures, methods, libraries). The ExpressivePower of a language can actually be increased if it has a good macro system or extensible syntax (-extensibility-, below).
- Parsability. Some languages have very simple grammars; TuringTarpits like BrainfuckLanguage have exceptionally simple, easy-to-parse grammars, but among the touted expressive languages, the EssExpressions of LispLanguage are arguably among the simplest to parse. Any reasonable programming language must be at least context-free (exclusive of semantics). It is often remarked that Lisp can be parsed with a FiniteStateMachine augmented with a counter; a full PushdownAutomaton is not required to parse EssExpressions. Many languages, unfortunately, have context-sensitive or ambiguous grammars; two of the nastiest languages to parse are CeePlusPlus (which has numerous reduce/reduce ambiguities) and FortranLanguage (which can be excused, as it predates much of the theory in this area). Parsability is important not just because it makes implementing compilers and interpreters simpler, but because it makes implementing lots of other tools simpler. It's much easier to write a RefactoringBrowser or syntax-aware editor for Lisp than for C++, for instance.
- In my professional opinion, I reject the conclusions here on several levels. Parsability is certainly nice for the writers of the compiler and RefactoringBrowser, I agree, but it often comes at the cost of ExpressivePower. A programming language designed for humans should cater to the humans, not the mechanical tools. If that means making the language 'context-sensitive' or even 'unrestricted', forcing those tools to work 10x as hard to understand what was written, it's well worth the price if the programmers could write even twice as fast... and worth far, far more if the programmers could write their ideas directly. Even potential ambiguity is excusable if it leads to more direct expression and there are mechanisms to disambiguate... and that's not even touching on the possibility of -leveraging- ambiguity intentionally where one either doesn't care which solution is taken or wishes to peruse all possible interpretations.
- And if one wishes to write refactoring browsers for difficult-to-parse languages, the easiest mechanism is to create a library shared by both the compilers/interpreters AND the RefactoringBrowsers/editors that has the task of efficiently annotating source-code input (i.e. creating the attribute-annotated abstract syntax tree with all attachment-points to source-text). Essentially, any good RefactoringBrowser needs to be capable of fully parsing the source. However, the comments above imply that this parser needs to be rewritten once per tool, which isn't at all the case.
- CeeLanguage isn't context-free, because whether T has been defined as a type controls whether "T * x;" is a pointer declaration or a discarded multiplication operation. And in any event, plenty of other useful languages aren't, either. Any stack-based language that allows the creation of new parsing words, for instance ....
- Extensibility. Some languages are straightforward to extend (either by the definers of the language, or by programmers); others are notoriously difficult.
- Extensions include both components and macros... in addition to the more direct extensible syntax mechanisms that haven't been touched in over 30 years.
- Grace and Robustness in the face of typos, brain-farts, and similar errata.
- The syntax and typechecker annotations should report errors locally, where they are caused, not in other files entirely (as is often seen in C++ with a missing semicolon or template problems).
- Some languages have the potentially useful (and potentially catastrophic) property that if you make a typo or a simple mental error, it will usually transform a valid (acceptable to the compiler, and runs to completion) and correct (does what you want) program into an invalid program (one that fails to compile, or otherwise fails immediately and obviously upon encountering the error). In other languages, such errors transform a valid and correct program into a valid and incorrect one (a program which runs normally but produces incorrect results) or a program which fails silently and non-obviously. Languages which don't require variable declarations are a notorious example of this--use a misspelled variable name and it creates a new variable, not an error. One classic example of this was the famous FortranLanguage bug that could have caused a rocket to crash (the defect was found in testing, not in use--see http://www-users.cs.york.ac.uk/~susan/cyc/p/fbug.htm).
- Auto-correction (PL/I style) is a bust. It shouldn't be done until we have AIs doing the parsing - something that can obtain a true understanding of intent from context. (see also http://horningtales.blogspot.com/2006/10/my-first-pli-program.html)
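The "FiniteStateMachine augmented with a counter" remark in the Parsability item can be illustrated with a rough Python sketch. The function name `sexpr_is_balanced` is made up for this page, and the sketch only recognizes well-nested parentheses; it assumes the input contains no string literals, comments, or character escapes that could hide a parenthesis, and it does not build a tree.

```python
def sexpr_is_balanced(text: str) -> bool:
    """Return True if every '(' in text has a matching ')'."""
    depth = 0  # the single counter: current nesting depth
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared with nothing open
                return False
    return depth == 0  # everything that was opened has been closed


print(sexpr_is_balanced("(defun sq (x) (* x x))"))  # True
print(sexpr_is_balanced("(defun sq (x) (* x x)"))   # False
```

No stack of pending symbols is kept, only an integer, which is the sense in which a full PushdownAutomaton is not needed for this check.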
What about languages that are effectively "SyntaxFree"?
Forth comes to mind: arg parsing in any conventional sense simply doesn't occur. Arguments are typically left on the ParameterStack, as are results.
Although there is a convention for argument passing, there is no grammar to speak of beyond that imposed by the developer.
I'd say it proves the point. It's effectively impossible for anyone but the most forthified hacker to read Forth (or Postscript or any other StackBasedLanguage) without "playing computer" to try to figure out what the program does.
There is no such thing as a 'SyntaxFree' language. Even Forth must parse words from whitespace and commands from definitions.
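To make the "even Forth must parse" point concrete, here is a toy sketch in Python of a Forth-like evaluator. It is not real Forth: `run_forth` is a hypothetical name, only `+`, `*`, `dup`, integer literals, and `: name ... ;` definitions are handled, and a recursive definition would loop forever. Even so, it has to split words on whitespace and tell a definition apart from a command.

```python
def run_forth(source: str) -> list:
    """Evaluate a tiny Forth-like program and return the final stack."""
    stack = []
    words = {}               # user-defined word -> list of body tokens
    tokens = source.split()  # lexing: words are delimited by whitespace
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == ":":       # definition syntax: ": name body... ;"
            end = tokens.index(";", i)
            words[tokens[i + 1]] = tokens[i + 2:end]
            i = end + 1
            continue
        if tok in words:     # expand a user-defined word in place
            tokens[i:i + 1] = words[tok]
            continue
        if tok == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif tok == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif tok == "dup":
            stack.append(stack[-1])
        else:
            stack.append(int(tok))  # anything else must be an integer literal
        i += 1
    return stack


print(run_forth(": square dup * ; 3 square 4 +"))  # [13]
```

The grammar is tiny, but it is there: whitespace-delimited tokens, plus the `:` ... `;` bracket around definitions.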
This point comes up several times in PythonVsRuby and other LanguagePissingMatches.
Proof that syntax matters: the cumulative number of extra seconds wasted by all programmers of all C-like languages typing the C "for" syntax (when 99.99999999% of all "for" loops count from A to B by 1 and therefore don't need such a generalized syntax) probably adds up to thousands of years.
You mean you never created a macro for that?
Is the ability to create a macro supposed to be an excuse for not putting it into the language?
Certainly! There are a lot of folks who like minimalist feature sets in their languages; and view adding features that can be trivially implemented on top of other features (i.e. non-orthogonal features) to be clutter. That said, the C++ while loop is trivially implemented as a for loop yet has its own syntax. OTOH, the do/while loop is not trivially implemented as a for loop.
If this were C's only "flaw", the world would be a better place. At any rate, "for (foo=0; foo < limit; foo++)" is so common among the CeeIdioms that claims of "thousands of years of lost productivity" stink to high heaven. Other than rank beginners, C programmers don't have any problem with this.
Untrue... if you have to type it at all... even if no mistakes are made, it's costing you time. Typing for(0,limit) is easier and faster than typing for (foo=0; foo < limit; foo++), no matter how much you know about C. [And the number of C, Java, JavaScript, etc. programs ever written is an astronomical multiplier]
Of course, C is terse compared to many languages, so probably it all comes out in the wash. :) Number of keystrokes needed to implement a particular function in the language strikes me as a dubious metric. To repeat the above--C has far more serious issues than this; being indignant over the for loop strikes me as nit-picking. Probably many more programmer-hours have been lost to memory leaks, WildPointers, and other strange and wondrous behavior made possible by C's low-level programming model, than to C's lack of a proper for-next loop. (Or better yet, a foreach loop)
Your reasoning strikes me as odd: "You shouldn't mention this particular bit of overly-baroque syntax because there are worse things in C." Anyway, for one thing I was talking about all languages with that syntax, and for another I thought it made a particularly clear example of "human factors" in language syntax. I guess YMMV.
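For readers who want the loop debate side by side, here is a small Python sketch (Python is used only for illustration; the list `values` and the `total_*` names are invented). It sums the same list four ways, from the fully general init/test/step shape down to stating the intent directly.

```python
values = [3, 1, 4, 1, 5]

# 1. General form, the shape of C's for(init; test; step):
#    three separate places to mistype the loop variable.
total_general = 0
i = 0
while i < len(values):
    total_general += values[i]
    i += 1

# 2. Counted loop, roughly the for(0, limit) wished for above.
total_counted = 0
for i in range(len(values)):
    total_counted += values[i]

# 3. foreach: no index at all.
total_each = 0
for v in values:
    total_each += v

# 4. The operation named directly.
total_direct = sum(values)

print(total_general, total_counted, total_each, total_direct)  # 14 14 14 14
```

Each step down removes a piece of mechanism the programmer would otherwise have to type, and could otherwise get wrong.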
for(int run = 0; run < runs; runs++) do_some_stuff();
Raise a hand if you spotted the bug immediately... Coming from Ruby back to C# is a most painful experience.
- Heck, that's an argument for good variable names as much as anything else
- I spotted it immediately. -JH
- It has the runs.
- There is more than one problem. First, there are not enough type declarations, nor a contract stating what "runs" is, where it came from, and what scope it is in. Without that, one could not easily decipher why those variables are there, and for what reasons, with what bounds, goals, restrictions, purpose, etc. Second, ramming stuff on one line is not recommended. Third, runs and run as variable names are too similar and could cause the eye to be fooled. Fourth, if the compiler had strictly checked for only a single dedicated loop iterator variable, this wouldn't happen.
- RUBY CODE
for i in runs..runs
puts "Value of local variable is #{i}"
end
The output might not be what you were after either.
The trick is to pay attention to what you type. I'd say the syntax matters argument comes down to consistency.
99% of functions are called the same way in C and Java: method(). Oh, the brace :)
Can a user create an object by accident? In Python/Ruby, yep. For a dyslexic/Canadian like me, colour is a constant issue:
color = Color.Red;
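A minimal Python sketch of the accidental-creation problem (the function names `repaint` and `read_typo` are invented for illustration): assigning to a misspelled name quietly creates a second variable, while reading a misspelled name fails loudly.

```python
def repaint():
    colour = "red"
    color = "blue"   # typo: meant `colour`; Python creates a new variable
    return colour    # the intended update never happened


def read_typo():
    colour = "red"
    return color     # typo on a read: no such name, so NameError


print(repaint())     # red
try:
    read_typo()
except NameError as err:
    print("caught:", type(err).__name__)  # caught: NameError
```

The write-side typo is the dangerous one: the program stays valid and just does the wrong thing.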
I'd also say if(TEXT = test = tEst = tEST) is a bit of a personal issue. IMHO, case matters.
As a Delphi dig: adding an else should not mean my if is now bad code.
if(test) then display("test is true");
else display("test is false");
if(test) then display("test is true");
//else display("test is false");
See also: LispLacksVisualCues
CategorySyntax