Alternatives To Cee Syntax

This is a place to kick around ideas for a syntactic model to replace C-style syntax yet be close enough for an easier transition. It's motivated by ItsTimeToDumpCeeSyntax. The goal is to toss what stinks about C syntax, but keep what's good.


Candidate #1 - Tee

Borrows what is viewed as the best parts of C, Pascal, and VB syntax. This also considers a way to use the "End-X" style (TheRightWayToDoWordyBlocks) for a VB-like style. If you don't want to consider End-X style, then just ignore End-X-related comments and examples. I figure if we allow both styles, then we can make a much larger proportion of developers happy. (Roughly a third of developers prefer End-X over the C-style in my personal observation.) I'll nick-name this language style "Tee" for now.

 func foo(name: str, rank: int, *serial: str): bool private {
var {x: str; y:dbl; zz,za:real}  // "bundled" version (dbl = double[footnote 1])
var z: str// singular version (both allowed)
var aa, bb, cc: real   // note the commas

if x == y {// no parenthesis needed // curly-braces style block z = 12.4 // semi-colon optional } while x == y begin // if "begin" used, then "end X" style ender is expected. See notes below. end while// endwhile also allowed if (x > 3 or y < 2) && z == 4 { // both boolean operator syntaxes are allowed (English and symbols) } if x == y begin // no "then" needed either for end-x style blocks (treated like "begin" if present, perhaps) end if }

(Note: TabMunging has been damaging the indents. Repairs have not been carefully inspected yet.)

Semicolons are optional unless multiple statements are on the same line. Parentheses can be used to resolve any ambiguity about continuation like in Python.

Note that most of your "C habits" will still work, such as semicolons and parenthesis in IF statements. It's just that they are no longer required. Much of the body of typical C-style code could be imported with relatively minor changes (mostly changes would be in declarations and CASE statements, which don't recognize BREAK. However code that skips curly braces for single-statement blocks would require adding the braces.).

To avoid ambiguity, all blocks require either curly-braces or "end X" enders even if only one statement inside. (There are no single-line versions of end-X blocks. Use curly braces if you want to stuff the entire block on a single line. More on End-X issues and colon versus "begin" later.)

Case statements would be based on IsBreakStatementArchaic examples. I'm leaning toward Example Sally-4. The End-X block version of Sally-4 would be:

 switch(a) begin    // End-X version of Sally-4
case 1,2,3 
          foo()  // must be on next line, like VB
        case 4 
          x = 7
        case 5,6 
  ...
        otherwise 
          write("Else do this")
 end switch

--top

Reminds me somewhat of GoLanguage. You might like GoLanguage, though I expect you'll be horrified by its support for HigherOrderFunctions.

I don't find Go's syntax "instantly intuitive" based on my initial browsing, but maybe it will click later. For example, there is no "while" loop, and it uses ":=" for assignment. Although ":=" is a nice idea, it deviates too much from C other languages, and colon is kind of annoying to type on keyboards, as assignments are common. For close calls in improvements or tricky tradeoffs, we should default to the C standard. For example, although C's FOR loop construct is a bit awkward, most "improvements" either use COBOL-like keyword syntax, or end up looking like C's FOR loop anyhow. Thus, I don't yet see a compelling reason to toss C's FOR loop. Thus, I vote to skip Go's version. Plus, I'm offering the End-X style blocks as an alternative for those who prefer VB-like FOR loops (which I forgot to define). Those who want a bit more English-like style can use the End-X style blocks.

As far as HOF's, this is mostly about syntax, not what features a language includes. I don't "hate" HOF's, I just feel that are not needed in app-side development (as long as API's don't force them on you). Objects are good enough to do the same thing unless you need lots of HOF's, but so far nobody has shown a practical need for high quantities on the typical app-side.

I hadn't more than glanced at Go before about a half hour ago. The more I look, the more I like, and I find it highly intuitive. And it really is quite similar -- in spirit, and to a lesser degree, syntax -- to your code sample above.

Of course there is no practical need for higher-order functions. There's no practical need for objects, structured programming, or even functions. All you really need are I/O and system calls, if statements, variables, expressions (simple ones will do -- nothing compound, please!) that assign to variables, and GOTO. Everything else is sugar.

I meant in terms of utility versus language complexity. Anyhow, this is not the place to argue over such, other than perhaps presenting the syntactical possibility as a reference point or future consideration, but letting implementers decide whether to actually include it.

Utility versus language complexity? Indeed! Personally, I think anything but I/O and system calls, if statements, variables, expressions (simple ones will do -- nothing compound, please!) that assign to variables, and GOTO are way too low utility to be worth their complexity.

Is that a statement for your particular WetWare, or a universal statement about general programmer WetWare? Anyhow, let's approach our draft language like this: IF you wish to include feature X, here's the standardized syntax for feature X.... (such as HOF's). We don't have to get into the argument over whether feature X is "good" to have in an actual language.

Is "nobody has shown a practical need for high quantities [of HOFs] on the typical app-side" a "statement for your particular WetWare, or a universal statement about general programmer WetWare"?

Anyway, why not make every language feature optional, by making it either a function call or a macro directive? Like LISP.

Because we are trying to compete with C here, not Lisp. We want the best "Algol-style" syntax we can discover. If we want to replace C syntax, then we generally have to target the same kinds of usage it typically does, which generally excludes heavy "roll-your-own-block-idioms". (Yes, I know JavaScript is trying to (poorly) be Lisp of late, but this gets back to the repeated argument of whether it's because that's the way the industry wants to go, or because JavaScript is forced upon us because it's de-facto client-programming language due to browser standards; AKA, QwertySyndrome. We already have/had that argument in [I'll get back to you].)

Why not strive for C-like syntax with the flexibility of LISP? That could make this new language considerably more powerful than simply an alternative C syntax.

I'm skeptical it can be done well. Past attempts created too many ways to do the same thing, such as when attempting to produce nested attributes: sometimes square brackets are used, sometimes parenthesis, sometimes curly braces, etc. without rhyme or reason to the structure "user". WaterbedTheory comes into play such that you'd have to reign in the complexity/variation in some spots to get more in others. Doing so would likely de-Algol it.

Are you sure your skepticism isn't premature?

Nothing's for sure. Show me the goods, and I'll change my mind. (Please don't say CoffeeScript.)


More on End-X Blocks

I was thinking of using the colon to mark the beginning of End-X style blocks, but it is already taken in the function declaration such that using it for an "end-x" style function block may not be the ideal. Plus, it seems slightly awkward. Thus, a "begin" keyword is required to start the block. And we'd want the same for internal blocks, such as While and If blocks:

while x == y begin
  if x==8 begin
    // stuff
  else
    // more stuff
  end if
  if x==9 {
    // curly-brace style
  } else {
    // stuff
  }
end while// endwhile also allowed
In related topics, I have suggested that an End-X style language avoid use of secondary words such as BEGIN or THEN. However, if we are mixing curly brace and End-X style, it may be necessary, or at least more informative compiler-message-wise, to use a secondary word in this case. -t

For backward compatibility or to be "habit friendly", perhaps "then" could be allowed in place of "begin" for IF statements.

A possible compromise "end" shortcut is to permit "End If" to optinally skip the "If" (just "End"), but require it for every other block type (loops, functions, etc.). IF is usually the most common block such that it doesn't really hurt the self-documenting benefit of the key-word repetition at the end of the block since loops, functions, etc., still have it. (It may confuse newbies a bit, but they'll get used to it).This may sound like an odd suggestion at first, but it's one that makes more practical sense the more I ponder it.

  if foo begin
    while a < b begin
      if a==7 begin
        blah(a)
      else 
        meef()
      end    // "if" not required
    end while   // loops require block name
  end
An End-X block for declaration groupings has yet to be worked out.

For loops:

 for x = 3 to 21 step 3 begin   // "step" clause is optional
   write(x)
 end for


Equality Versus Assignment

I decided to keep the "=" versus "==" approach of C because I haven't found a better alternative. ":=" is also subject to forgetfulness. Better to keep the devil you know, until an angel is found. I believe Visual Basic uses context to distinguish to avoid having to make a choice, but this limits flexibility, such as assignments in conditions. Unlike VB, I propose "==" also be the convention for the End-X style blocks. -t


Switch case break continue goto

Here a preview of consistent use for switch and so on:

 private func(a, b int; c char) void
switch b
// default handling here
// funniest stuff
case a // a maybe a variable? Need be a constant?
// stuff, no break needed, no fallthrough anyway
case 1 ... 32 // allow ranges
// more stuff
case 0, 100, -1 // allow lists of values (and ranges (and variables?))
// common Initial stuff
if b == 0 and !c
goto break // leave Switch early this way, not stealing break from loops
// funnier stuff
goto case fun(b, c) // allow reswitching this way (maybe only with constant for fallthrough replacement)
case 42
goto label // Standard goto involved here
// cleanup1
 label:
// cleanup2
return

Some of these are considered at IsBreakStatementArchaic. I'd prefer a special keyword for the final default, such as "else", "default", or "otherwise". The empty approach could be mistaken for or created from a typo where one forgot to attach the value. "Reswitch" is a bit odd. I don't see a common enough need for it to justify something that unfamiliar to readers. But then again, if it were available, coders may perhaps find some good uses for it. Jury is still out on that. I think the code would be clearer if an explicit loop is used instead, even if it's a few lines of more code. Loops are too powerful and dangerous to hide in a single "reswitch" statement. We want the Launch Nukes button to stand out, not be the 7th tooth on your comb. -t

Cannot see how Looping is launching the nukes. An innocuous method invocation or a Standard goto could be a Loop too, and that's not reason enough to forbid either, I hope.

There might be value in not using case without values for a Default Label. In that case i would prefer else, so no new keywords are added. Still, there's no Need for any Label at all, just stick it at the start of the Switch Statement, that is unreachable yet, and it makes sense to state how missed cases are handled first.

Taking the rest of switch to IsBreakStatementArchaic as recommended.

But now something which should certainly be considered here: Shall this proposed Syntax for a new language be verbose or rather terse? I hate useless boilerplate code. Shall it be more C or Pascal-like? Shall it be for a Systems-programming-language? And if so, shall it support opt-in to garbage-collection? Seems to me there are opposing Trends on this page. But these decisions are critical to evaluating any proposed syntax.

Something else to think about: Introducing multiple tokens with the same meaning is really bad for understandability. I never can remember those pesky 2 and 3-character substitutes in c. Just enough to avoid what i have to.


I propose using periods for object "paths" instead of "->" (and perhaps maps/tree-path designations, if they are rolled into the same structure/mechanism). However, languages that use periods for concatenation will have to find another symbol, such as the "&" used in the Microsoft world. This may conflict with reference indicators for parameters, depending on the language, but perhaps the syntax context may differentiate. Or, use a key-word for reference indicators, which probably should come after the variable/parameter in this "variable name first" philosophy.

Related: LanguageSymbolAllocation


TypeScript? uses the Pascal-style type declarations ("foo: number"). So somebody agrees.


Footnotes

[1] Because "double" is used often, I propose "dbl" instead. But on second thought, I'd prefer "num". "Double" is goofy. Or perhaps use "num" for both Real and Double Precision, but allow a byte number indicator: num8 = double precision, num4 = single precision (aka "real"). "num" without a digit is equivalent to "num8" and thus normally you'd only specify "num" instead of "num8". This convention would also allow for "num16" if a given language/compiler permits it. If a given language does not support a smaller unit, then the next biggest matching unit is used. For example, if compiler only supports 8-byte floating point as a minimum, then "num8" will be used internally if "num4" is requested. Remember that we are defining a convention, not necessarily a specific language.


See PageAnchor for-loop-ugly under EvidenceDiscussion regarding FOR loops.


See also: DualTypingLanguages (mixed dynamic & static typing)


CategorySyntax, CategorySpeculative, CategoryCee, CategoryLanguageDesign, CategoryPascal


DecemberTwelve


EditText of this page (last edited October 17, 2014) or FindPage with title or text search