Better Syntactic Sugar

To DoTheSimplestThingThatCouldPossiblyWork, you should use the highest-level language that might possibly both work and satisfy your performance requirements.

This would seem to conflict with PrincipleOfLeastPower
Correct. PrincipleOfLeastPower is an optimization principle. It, by nature, conflicts with DoTheSimplestThingThatCouldPossiblyWork.

Actually, you should use the next highest level language, if it's realistic to expect the schedule to allow for profiling the performance-critical sections and reimplementing them in a more efficient language. For example, I recently did a project with scripts implementing the first version of the processing and reporting functionality; after we proved that we correctly parsed and reported the data, we reimplemented in a compiled language, which went very quickly since all our algorithms and data structures were already proven.

The phrase "Better Syntactic Sugar" was originally used in derision about programmers who liked PrettyPrinted, color-coded, hyper-linked, interactively-iconified code editing, by one who thought that this kind of "sugar" was just to cover up a lack of skill at the basics of reading and writing code. It's such a delightful phrase that I decided to use it as a page title, and explore just how sweet automated tools could make the XP development process. Thanks to BenKovitz for extensive refactoring to clean up the page.

-- ChrisBaugh

Examples

XML can be used to make a data file which can be written and read by a variety of tools, that each only have to know the basic rules of XML and the rules for their own specific domain. An XML based development system could have a single XML file with the customer requirements obtained from CRC cards; the customer's prioritization of requirements; the development team's time estimates for completion, risk estimates for difficulty, and spike tests to reduce risk; the unit tests written to verify completion of development; the source code written to satisfy the unit tests; object code generated from the source; the results of test runs of the object code against the UnitTest; programmers' vacation schedules, and anything else relevant to the project.

Rather than a single large file, in real life a relational database might be used; but a DTD for extreme programming would allow for "development project files" to include any or all of these contents. XP developers could have a suite of programs which can each import and export the data as XML, and handle one or more of these types of contents. Compilers and interpreters could read source and write object code; documentation programs could scan for matches between CRC cards, names of source blocks, and results of test runs; project management programs could scan for CRC cards not satisfied by successful object code, customer priorities, and developers' schedules. Source editors could handle the source in a variety of syntactic formats based on the needs and preferences of developers. Here are some examples of the source code portions of "syntactic sugar", but my wish list goes far beyond just "PrettyPrinting" the code.

ChuckMoore has pointed out, with his ColorForth system, that the use of words and punctuation marks to distinguish names, literals, parameters, or the like in code are no longer necessary. Code can be stored in an XML format, representing program elements and their attributes, and displayed in whatever way programmers like best - with curly braces on the same line, curly braces on another line, the words "begin" and "end", or in his case, with changes of color indicating the function of code elements. Other developers have pointed out that XML elements could be used to represent conditional statements and loops, converting on the fly to BASIC, C, or COBOL like display of the structure. Microsoft's VisualStudio automatically looks up available properties and methods for objects whose type has been declared, eliminating the justification for terse and cryptic abbreviations whose sole virtues were that they were quick to type and quick to transmit over a few-hundred-baud Teletype connection. Around 1990, the AmigaVision development environment let the developer add icons to a flowchart. Icons represented such ideas as a conditional test, a request for user input, or a media playback. Oracle has tools that take a model of the relational data structures required and generate all the relevant SQL code.

Unfortunately, that won't work. All languages don't share the same semantics, and things that one language does with ease, others can't do at all. That's why we have different languages to begin with, because they express different ways of programming. You can't represent Lisp macro's in VB, and XML can't get around that. If it were that easy, we wouldn't have languages to begin with.

ColorForth type of editing could be used to decorate names with CSS attributes at display time in order to illustrate the properties of what the names represent. For example, when the mouse hovers over DictionaryBase?, the mouse cursor will be swapped for an icon of an arrow pointing at a 2x2 grid. In the top left cell is an icon of an arrow to a series of ellipses with a trailing slash, followed by (0..3, 0..5). This shows that DictionaryBase? is a pointer to a two-dimensional array of pointers to null terminated strings. Of course the editor would be configurable (by swapping in a new XSL stylesheet) to have a variety of display possibilities for this type of display, including prepending Hungarian abbreviations to the variable name.

Here's a more specific example in the code-editing section: A variable is a pointer to a two-dimensional array of pointers to null-terminated strings. Using the hypothetical DTD for representing C code, the XML may look something like this:

 <type name="stringPointer" type="pointer" destination="nullString" />
 <type name="stringArray" type="array" content="stringPointer" size="3, 5" />
 <variable name="Dictionarybase" type="pointer" destination="stringArray" />

An XSL stylesheet with transformations can be used to display this code in Hungarian Notation for programmers who prefer it, showing the code as:

 prgrgpszDictionaryBase

I would prefer to swap in an XSLT sheet which would highlight Dictionarybase, like all variable names, in a color of my choice; and, whenever I moved the mouse pointer over the name, popped up an icon showing an arrow pointing to a two dimensional box, with the top left box containing an arrow to a series of ellipses closed by a 0, and the information (0..3, 0..5).

Here is another example:

 <loop type="foreach" enumeration="Salaries" current="Salary">
 <statement>this.pay</statement>
 </loop>

This could be automatically represented in VB as:

 for each Salary in Salaries
  Salary.pay
 next

And in C++ as:

 for (int i=0; i<Salaries.count; i++) { Salaries[i].pay }

And in UML as one of those diagram thingies that are all the rage these days.

Note that the "ugly" XML would not necessarily be shown to the programmer; the source code could be presented to the programmer with the syntax of standard C, if desired.

Wouldn't this type of automatic translation of semantics to a variety of syntaxes be useful, for instance, in extending open source collaboration? No. You can translate a loop, but more than a loop you cannot. -- ChrisBaugh

Chris, what you are describing sounds very similar to IntentionalProgramming, as far as I understand it. Maybe you should have a look at that. An extra twist comes in when you make the representation and its mapping to semantics program-accessible - if you "go meta", so to speak. -- FalkBruegmann

Discussion

I don't think that computers are currently powerful enough that we can afford tools that provide enormous ease of programming at the expense of performance. Such tools brings computers to their knees. See EnterpriseJavaBeans as a little example.

Even if we have enough computing power to waste for safer and more 'programmer efficient' languages, we shouldn't replace things like C all over the place. There are areas where no matter how much performance you've got, you always need more.

-- CostinCozianu

Hey, I design microprocessors for a living, and our current biggest concern is that we know how to make computers several X more powerful than now, but nobody wants to buy them... I'm in favor of any tool that provides greater ease of programming... if there are problems that can only be solved via greater ease of programming, we (AMD, Intel) will be happy to provide the performance. -- AndyGlew

Non sequitur. You and Costin are talking about two different things. Your issue is that Intel/AMD/etc have the current problem that CPUs are already more than fast enough for the things that the average consumer or average office worker.

That is completely unrelated to Costin's point: for many uses other than the average consumer's/office worker's, there currently is not enough CPU power. This is trivially true; I would like to simulate the weather over the whole planet down to a resolution of atoms, projecting whether it will rain on San Diego 30 years from today. Barring something like QuantumComputing, we will never have fast enough CPUs to do that. Actually, this is questionable even with QuantumComputing.

Both your point and Costin's point are well known and valid; they just happen to be unrelated.

(Actually, you overstated your point; you're already delivering approximately as much price/performance as you can. To deliver vastly more performance would require vastly higher price, e.g. cluster computing. You can't just wave a magic wand and change the slope of MooresLaw. For that matter, where's that 4Ghz Pentium that at least part of the market would prefer to have but that you decided not to deliver? :-)

For the past few months I have been working on an XML based programming system, using a bottom up approach. One of my intermediate goals was to be able to program using UML. Another was to get fine-grained control over generated code.

I used a bottom-up approach, co-developing a useful program and its CodeGenerator. I quickly realized that the CodeGenerator is, itself, a program, so I BootStrapped it. For the first few weeks I tuned the XML manually, then selected a drawing tool to use as my code editor, using a UML representation of the XML.

I won't get into details now. My point is that XML based code representations do work, and do save effort. By taking the approach of working bottom-up, and then bootstrapping, I maintain tight control of my generated code whilst getting the benefits of a more abstract description. I have separated the program from the implementation details, but only in the same sense as a OO interface hides information. I regularly refactor both the "CodeGenerator" package and my "Application" packages (sometimes both as part of one MetaRefactoring). Both are UML; both are generated into code using the same code generator.

The program is isolated from the code, but the programmer is not.

I think higher level representation is achievable. Not too far from what I've been thinking about. The forward mapping from a high-level grammar to low-level syntax is relatively well defined. It is essentially a compilation activity. The reverse mapping is trickier though would be tremendously valuable. I am extending some inferential techniques that might work but it's a bit half-baked. -- RichardHenderson.

I have difficulty with the idea that the reverse mapping is tremendously valuable. In practice, I've never needed it:

If I'm writing new code, then I write it at the highest level that my current generator-infrastructure supports
If I have legacy code, then I refactor it on an as-needed basis.

Automation may be possible but, as with any automated refactoring, it is difficult for a computer to know which details of a block of code will remain important in the alternate representation. Automation would be nice (say, a MetaRefactoringBrowser?), but manual intervention will always be needed.

Not necessarily. Consider how optimizers work. They are automated refactorers, and they definitely do not want manual intervention when reorganizing the code. Instead, they have either heuristics, algorithms, or metrics to make the right (we hope) decisions. Is your argument really that automated refactorers whose target is improved readability cannot be written? Of course, because the measuring stick for that would be humans. In that case, manual intervention is necessary. -- SunirShah

: Thank you for the clarification: I was not considering optimization to be a refactoring. However, even for optimization, a tool cannot know (for example) to change a linked list into an array (or vice versa) unless it knows the relative importance of lookup vs add vs delete performance in the specific application. -- DW

I could use a tool that gave me an interpreted view of a system. It would just be a tool. I would be suspicious of automatic refactoring, but something that could apply my expertise (or someone else's) in a pattern seeking manner would be valuable to me, and I reckon I could sell it. Fiction still, unfortunately. -- RIH

The most important concept is that your CodeGenerators must be part of your project. Reuse of third-party components is fine (e.g. a compiler), but the mindset should be that they are components, not stand-alone applications. -- DaveWhipp

This may well be true, but doesn't address the fact that after decades of improvement, general purpose CodeGenerators? still suck. Badly. The only thing we can do well are small, domain specific generators like yacc.

: Oddly enough, general purpose programming languages suffer the same fate. But this shouldn't be surprising. Programming languages are CodeGenerators too. cf. LittleLanguages. -- SunirShah
: True, true. Problem is, stacking these things on top of each other almost always makes it worse...

That's why I argue against using general-purpose code generators (except when I can use them as a stepping stone: say, to bootstrap a better CodeGenerator). -- DaveWhipp

Check out http://www.cs.yale.edu/Linda/linda-prog-builder.html

http://www.cs.yale.edu/homes/ahmed/linda-prog-builder.html

See SyntacticSugar ProgrammingLanguagesAreSyntacticSugar

Compare SyntacticSemtex

CategoryProgrammingLanguage