Most lines of most computer programs have exactly one statement. (Some languages, such as Lisp, Scheme, and SQL, make identifying the beginning and end of statements fairly arbitrary, although Lisp makes identifying the beginning and end of expressions completely precise.)
But often, it is nice to have multi-line commands, or multi-command lines. Different programming languages use different syntaxes to implement this feature. Some programmers prefer one way; some prefer another; some don't care; and some can switch back and forth, but find it temporarily awkward to switch styles.
Some approaches (some of these may be combined)
By default, each line has exactly one statement. Require the use of a line continuation character to have a multi-line statement.
Known uses:
    #define TRY { jmp_buf ex_context; int ex_code; switch (ex_code = setjmp(ex_context)) { case 0:
    #define CATCH(N) break; case (N):
    #define CATCH_ALL break; default:
    #define FINALLY } }
By default, each line has exactly one statement. Require the use of a statement ending character to separate multiple statements in a line.
Known uses:
    if (condition); /* note erroneous semicolon */
        do_something(); /* Always executed */
There's also a subtle distinction between ';' as a statement terminator (C/C++/Java) and as a statement separator (Pascal). Modern Pascal-family languages, such as OberonLanguage, still follow the statement separator interpretation but permit null statements; thus, the semicolon in Oberon can be treated as a statement terminator if you want to.
Known uses:
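The terminator/separator distinction can be sketched in a few lines of Python. This is only an illustration: parse_statements, its mode names, and the allow_null flag are invented here, and a real parser would work on tokens rather than splitting raw text on ';'.

```python
def parse_statements(source, mode, allow_null=False):
    """Split source on ';' under one of two interpretations (illustrative only).

    mode="terminator": every statement must be followed by ';' (C style).
    mode="separator":  ';' sits between statements (classic Pascal style).
    allow_null:        accept empty statements instead of rejecting them.
    """
    parts = [p.strip() for p in source.split(";")]
    if mode == "terminator":
        # The text after the final ';' must be empty, otherwise a ';' is missing.
        if parts[-1] != "":
            raise SyntaxError("missing ';' after last statement")
        parts = parts[:-1]
    elif mode != "separator":
        raise ValueError("mode must be 'terminator' or 'separator'")
    if not allow_null and "" in parts:
        raise SyntaxError("empty statement")
    # Null statements, when allowed, carry no content and are dropped.
    return [p for p in parts if p]
```

With mode="separator" a trailing semicolon is an error unless allow_null is set; once null statements are allowed, the same text parses either way, which is the Oberon situation described above.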
Resulting context: It is hard to copy-and-paste such code from web browsers.
Known uses:
Some ways the IDE or compiler can intelligently figure out what the programmer meant, and reduce the need for these special characters:
Because print-outs are sometimes narrower than the code window, a single-line statement (in the code window) is sometimes printed out as a multiple-line statement (on the printout).
The IDE can print out a special symbol to indicate the wrapped line.
Known Uses:
(basically, if the last char on the line is an operator, or some quote or parens are open, it doesn't look like the end of a statement to the parser). This combines approaches #1 and #3.
Known uses:
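Python, for one, keeps a statement going while a parenthesis, bracket, or brace is still open. The broader heuristic described above (open quote, open bracket, or trailing operator) might be sketched like this. The function names and the operator list are made up for illustration, and the sketch ignores escapes and comments, which a real lexer would have to handle.

```python
OPENERS = "([{"
CLOSERS = ")]}"
# Hypothetical operator set; a real language would list its own.
TRAILING_OPERATORS = ("+", "-", "*", "/", "=", ",", "&&", "||")


def looks_unfinished(text):
    """Guess whether text cannot yet be a complete statement."""
    depth = 0
    quote = None
    for ch in text:
        if quote:
            if ch == quote:          # closing quote (escapes are ignored)
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch in OPENERS:
            depth += 1
        elif ch in CLOSERS:
            depth -= 1
    return quote is not None or depth > 0 or text.rstrip().endswith(TRAILING_OPERATORS)


def join_statements(lines):
    """Merge physical lines into statements using the heuristic above."""
    statements, pending = [], ""
    for line in lines:
        pending = (pending + " " + line.strip()).strip()
        if not pending:
            continue                 # skip blank lines
        if not looks_unfinished(pending):
            statements.append(pending)
            pending = ""
    if pending:                      # unterminated tail, returned as-is
        statements.append(pending)
    return statements
```

For example, the five lines "total = 1 +", "2", "print(total,", "3)", "x = 5" come out as three statements.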
The Sql expressions and Sql clauses in a SQL query can be thought of as statements in a block. Because StructuredQueryLanguage requires its keywords to be in a certain order, clauses do not need clause ending characters.
Known uses:
Resulting context: SqlLineCounts are fairly arbitrary.
A Sql query can be thought of as a long statement.
Known uses:
Resulting context: It encourages big "run-on" sentences, and other SqlFlaws. The language does not encourage any particular indentation of the code within a query. Some SQL IDEs (such as SqlServer) do not preserve the programmer's formatting. Unformatted code is very hard to read.
Statements begin with certain keywords, and cases where those keywords occur in-statement do not cause ambiguities.
Known uses:
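As a rough illustration of keyword-started statements, here is a Python sketch that cuts a token stream wherever a statement keyword appears. The keyword set and the function name are hypothetical, and the approach only works under the condition stated above: the keywords must never occur inside a statement (a string literal containing PRINT would break this naive version).

```python
# Hypothetical keyword set for a small BASIC-like language.
STATEMENT_KEYWORDS = {"LET", "PRINT", "GOTO", "INPUT"}


def split_on_keywords(source):
    """Start a new statement at every statement keyword; line breaks are ignored."""
    statements = []
    for token in source.split():
        if token.upper() in STATEMENT_KEYWORDS or not statements:
            statements.append([token])      # keyword opens a new statement
        else:
            statements[-1].append(token)    # anything else extends the current one
    return [" ".join(tokens) for tokens in statements]
```

Because the keywords alone mark the boundaries, neither semicolons nor newlines are needed to tell the statements apart.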
Known uses:
Discussion:
I don't like semicolons in computer languages. They are anti-WhatYouSeeIsWhatYouGet and a common source of syntax errors in my opinion. I realize that this is a personal preference, but I wonder what the mental steps are in some people that make them like semicolons. Are they just to make parsers simpler (if they do), or do they have some human interaction benefits? Should languages focus on being easier for compilers over human issues?
Semicolons are a human ease trade-off. By having them, you get newline back for your own purposes: You're free to have multi-line commands, or multi-command lines. Both of these are worth the semicolon to me.
And I'm puzzled by how to apply the notion of WYSIWYG to programming. Can you explain what you mean?
Line breaks are immediately visible to the eye. Semicolons are not. More on WYSIWYG below.
From above: Missing/superfluous statement ending characters and missing scoping characters are common sources of syntax errors or hard-to-find semantic bugs. (All C/C++ programmers have run into the following at one point)
I for one have been doing C/C++ for 30 years now and have never seen this issue arise -- MartySchrader
I think the people who are against semicolons never had the experience of using a PunchedCardLanguage like FortranLanguage or CobolLanguage. I can clearly remember when I started using Pascal (TurboPascal 2.0 in 1984). Two years earlier I had been writing Fortran, and to have the ability to format my code as I saw fit, to have the whitespace be irrelevant to the semantics of the program, was nothing short of a revelation.
Instead, the defining experience for these people must have been the frustration of getting "semicolon expected" syntax errors from a compiler and thinking, "Duh, why can't it understand what I meant." But this, surely, is a brief stage in learning a language.
Semicolons are there for the human, to give the human control over where the statements begin and end, and how the code is formatted. They aren't for the compiler or to make the parser simpler; a language like PythonLanguage can be well-defined and parsed without any semicolons at all (and when I first saw Python my immediate reaction was that it was an enormous step backwards; it took some time and experience with it before I grudgingly came to accept it, and now I quite like it).
-- DavidConrad
I'm just wondering where the perception that a line break signals the end of a statement arises. How often (outside programming languages) is that the case? How often have you read a book in which one and only one statement appeared on each line? The text editor I'm using to draft this note is currently inserting a line break every 80 characters as I type, and the note is getting on to being six lines long - with five line breaks. All that will go out of the window once this page is rendered: YMMV.
The perception comes from mathematics, where expressions are generally placed one per line. Historically, mathematical notation is the underpinning of programming languages. It makes sense for programming language design to take inspiration from and build upon mathematical languages, rather than written or spoken languages. Both mathematics and programming languages are trying to express precise meanings with a very compact notation.
In some poetry (but not all) there is one statement per line, but in general the proposition that computer languages should resemble natural language is one that should be questioned.
Now paragraph breaks are another matter; but in many programming languages (particularly those where whitespace is considered insignificant) there is already a convention of using extra blank lines to separate "conceptually atomic" pieces of code. When you're reading, paragraph breaks may be more noticeable, but how often are you consciously aware of the line breaks (which in English at least have no semantic significance)?
I do not state a personal preference here. Mine may be obvious, but irrelevant.
Is it the idea of having an explicit separator character, or the particular punctuation character which is disliked? Does any compiler allow a different statement separator to be specified?
Smalltalk uses periods as statement terminators. This can be considered more "English-like" than using semicolons, as periods are used to end sentences in written English.
Pascal uses semicolons as statement separators and a period at the end of the program. This is a different way of seeing an "English-like" syntax: a program is a single sentence.
Many programmers don't want whitespace characters to be significant to the compiler.
Some people consider extra whitespace characters to be "noise".
Some people consider syntactically significant characters to be by definition "signal", not "noise".
(These are just personal preferences -- if you don't like it, then you don't like it.)
Three ways of thinking of WYSIWYG for programming:
This is maybe the misunderstanding here. A linebreak and SemiColon are equal characters (at least on Unix). In most languages they don't fight for territory. There is either the one or the other. There is no special meaning in a linebreak. One just tends to think there is some special meaning. BTW: the delimiter which I would consider most "natural" would be '.'.
    main = do
      putStrLn $ "Hello world!"
      name <- getLine
      if name == "Bill Gates"
        then do putStrLn $ "Ewww!"
        else do putStrLn $ "OK, you're cool."
But, you are free to use braces and semicolons if you wish:
    main = do {
      putStrLn $ "Hello world!";
      name <- getLine;
      putStrLn $ if name == "Bill Gates" then "Ewww!" else "OK, you're cool.";
    }
You have your choice.
There's a subtlety here you don't mention: The entire body of a Haskell source file is a brace-enclosed semicolon-delimited set of definitions under the where clause of a module declaration, but most people write their toplevel declarations (at least) with automatic layout.
When the parser got to your end brace, it saw that it was on the first column and inserted a semicolon before it, assuming it to be the start of the next declaration. This happens to work because you're in a do block, but it's something to keep in mind. If you want whitespace-agnosticism, write the module declaration and braces yourself:
module Main where { main = do { putStr "blah" } }
Some programmers want the "look" of the program (controlled via whitespace) to be recognized by the compiler without any extra characters, whereas others want explicit tokens so that they can make the program "look" however they want. Both sides believe that their way leads to clearer code and fewer errors.
I know which works best for me. You are not going to convince me that your preference works better on me because you are not me. I have used both approaches extensively, so it is not a matter of "getting used to". Similarly, I agree that your preferences probably make things work smoother for you. This is basically a psychology issue more so than a technical one.
If your lines are often so long that wrapping is frequently needed, then perhaps ResponsibilityDrivenDesignConflictsWithYagni (bloated syntax) plays a role?
What if my lines are only rarely so long that wrapping is infrequently needed?
Semicolons aren't only there to allow long lines to wrap; they're also there to allow short, closely related lines to run together.
    x = 17; y = 23;
Some languages have a character that allows that, but do not otherwise use semicolons. Some BASIC dialects, for example, used a colon to do that. But in some languages one could also use a function for this:
    assign(x, 17, y, 23, foo, "bar")
But I found that I did not need it that often, even when it was available.
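A helper along those lines could look like this in Python. This is a guess at the idea, not anyone's actual library: the assign signature is invented, and since a Python function cannot rebind its caller's local variables, the sketch writes into an explicit dictionary and takes the names as strings.

```python
def assign(namespace, *pairs):
    """Store alternating name/value arguments into namespace (a dict)."""
    if len(pairs) % 2 != 0:
        raise ValueError("expected name/value pairs")
    # pairs[0::2] are the names, pairs[1::2] the matching values.
    for name, value in zip(pairs[0::2], pairs[1::2]):
        namespace[name] = value
    return namespace


env = {}
assign(env, "x", 17, "y", 23, "foo", "bar")
```

Python also has tuple assignment built in (x, y = 17, 23), which covers the short, closely related case without either a semicolon or a helper.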
After using C and Perl for years, I have problems in any language that doesn't allow a semicolon at the end of a line. It was particularly annoying in a language that does allow semicolons to separate statements within a line, as well as blank lines. However, the transition to Python was easy (except that I would forget to use colons at the end of a line).
There are several syntactic features which make a StatementSeparator (whether it's a SemiColon, a newline, or whatever) necessary:
Can't all the punctuation just get along?:
See: SyntacticallySignificantWhitespaceConsideredHarmful, LanguagePissingMatch, PythonWhiteSpaceDiscussion, SqlLineCount, HolyWar