Build Syntax

The "BuildSyntax" is a way EdwardKiser invented to write SchemeLanguage programs out of order. It is inspired by LiterateProgramming, but it only has the advantages of the "tangle" part of the implementation. It is useful for highly parenthesized languages.

It looks like this: [This is an excerpt from real code. It has references to the rest of the code, which has been omitted. I need a shorter example!]

 BUILD (define make-stack (lambda () $make-stack-body)) WHERE
 $make-stack-body = (let* ($vars) $let-body) ;
 $vars = $var-size
   $var-space $var-the-vector $var-respace-to
   $var-ideal-space
   $var-oversize-ok?
   $var-resize-to
   $var-push! $var-pop!
   $var-abs-addr $var-rel-addr
   $var-get-builder $var-set-builder
   $var-rel-get $var-rel-set!
   $var-abs-get $var-abs-set!
   $var-alloc! $var-free!
   $var-get-depth ;
 $let-body = (lambda (msg) $dispatch) ;
 $dispatch = (case msg
   ((push!) push!) ((pop!) pop!)
   ((rel-get) rel-get) ((rel-set!) rel-set!)
   ((abs-get) abs-get) ((abs-set!) abs-set!)
   ((alloc!) alloc!) ((free!) free!)
   ((get-depth) get-depth)
   (else #f)) ;

 $var-size = (size 0) ;
 $var-space = (space 10) ;
 $var-the-vector = (the-vector (make-vector space #f)) ;

 $var-respace-to = (respace-to (lambda (newspace) $body1)) ;
 $body1 = (if $respace-bad $respace-error $body1a) ;
 $respace-bad = (not (and (< size space) (< size newspace))) ;
 $respace-error = (error `("Size:" ,size "Old Space:" ,space
   "New Space:" ,newspace)) ;
 $body1a = (let ($var-newvec) $body2) ;
 $var-newvec = (newvec (make-vector newspace #f)) ;
 $body2 = $copydata $set-value $set-space #t ;
 $copydata = (vector-copy-range the-vector 0
   (min size newspace) newvec 0) ;
 $set-value = (set! the-vector newvec) ;
 $set-space = (set! space newspace) ;

And so forth. Add the occasional comment, and it's almost like writing in WEB.

It's easy to implement. Try writing it using itself, then going through it manually. That's what I did.

I don't think any text editor can do what the Build syntax can do. Build is more than just a syntax; it's a way of thinking. I have written things with Build that I couldn't have written without it. I used to do Building on paper before it occurred to me that I could write an automated tool in Scheme itself. I wasted some time thinking about writing a more advanced version in C++, before I realized Scheme was easier. (I've also thought about doing Build for C++, but there my aim was to avoid duplicate typing between declarations and definitions, things like that, and the syntax was much more complex.)

-- EdwardKiser

That syntax is as ClearAsMud?.

Well, you have to know the SchemeLanguage to make sense of it. I should have mentioned that before you commented; I have already gone back and corrected the description.

Those body1, body2, body1a, etc, still pretty much yell CodeSmell to me.

Looks interesting Edward! Next time I get back into Scheme-like programming I'll have to come back to this page for some ideas. -- RobHarwood

That's pretty cool! I've been a big fan of the SchemeLanguage for years. This looks like using BackusNaurForm to express scheme functions. Can you recurse in your syntax? Or maybe you wouldn't want to.... -- IainLowe

BNF is more complicated than this! In BNF you can say A = B | C | D E, but the BUILD syntax only allows one choice.

I created an extension once where you could express a rule in terms of itself. You could write

 $vars = one two three ;
 $vars = $vars four five six ;

and, being read from the top down, it would be seen as equivalent to

 $old_vars = one two three ;
 $vars = $old_vars four five six ;

. This was less useful than I had hoped. I wanted it to be an error if I accidentally had two substitution variables named $body or $params or something; this made it a non-error. What I really should do instead is have some kind of scoping rules.

But why would you simplify Scheme like that? I mean, couldn't you have written:

 (define *vars* (list 'one 'two 'three))
 (define *vars* (cons *vars* (list 'four 'five 'six)))

Or is the idea that it's just a different syntax for writing in SchemeLanguage?

It's a different syntax. It allows forward references.

I think you can translate a BuildSyntax program into an equivalent letrec as follows.

 BUILD $c WHERE
  $a = "hello"
  $c = ($a $b)
  $b = "world"

would translate into:

  (letrec (
      ($a (lambda () "hello"))
      ($c (lambda () (list ($a) ($b))))
      ($b (lambda () "world")))
    ($c))

However, the BUILD syntax was intended for building code, not data. That only means that you'd have to wrap the whole thing in an eval (and it would also mean that the example I just gave won't work, because ("hello" "world") is not a valid Scheme expression).

It may be possible to write a macro with DefineSyntax that handles the transformation from a more Scheme-like notation into an eval letrec. -- EdwardKiser

Strangely, I haven't been using Build much lately. Most of the reason I used it was to create large single expressions, such as the kind that define 20 or 30 local functions without "polluting the global NameSpace." But lately I've stopped minding about how the global NameSpace is "polluted" by the definitions of auxiliary functions, and I can write much shorter expressions by using define ten or twelve times before I write the expression. Of course, there are some minor problems with this approach. Such as, I have to remember not to step on any of my auxiliary functions; if I redefine one, I'll alter the behavior of my "big" function. -- EdwardKiser [Sun, Apr 8, 2001, 8 AM UTC]

Why not have lots of nested let clauses? They don't pollute NameSpaces and they're just as easy to read as your syntax above... I'm just not sure if you can use a binding inside a special form. -- TaralDragon

You can't run the interpreter inside of a let clause, which you might want to do to test your local functions. You have to make the let clause export everything, and test outside -- or you have to make the let clause out of functions that have already been debugged. -- EdwardKiser [Mon Apr 9 2001 1:30 PM UTC]

Hmm... MIT Scheme has environments, which would fix this, but then you don't have the R4RS macro system. I suppose you could write a macro which would embed the expression inside the let clauses... -- TaralDragon

DefineSyntax implements hygenic macros, which wouldn't work with the BuildSyntax -- at least not directly. -- EdwardKiser

The mechanism I've seen to deal with lots of nested functions is to use nested defines, such as

   (define fact (lambda (n)
     (define fact-aux (lambda (n accum)
        (if (< n 2)
          accum
          (fact-aux (- n 1) (* n accum))))
     (fact-aux n 1))

You can have as many nested defines as you want (although they all have to be at the beginning of their `parent' lambda expression), and they all share scope just like "letrec". In practice, you use the short definition syntax and the code looks like:

   (define (fact n)
     (define (fact-aux n accum)
       (if (< n 2)
         accum
        (fact-aux (- n 1) (* n accum))))
     (fact-aux n 1))

While you're doing testing, you comment out the top-level function and any body:

   ;;(define (fact n)
   (define (fact-aux n accum)
     (if (< n 2)
       accum
       (fact-aux (- n 1) (* n accum))))
   ;;(fact-aux n 1))

All your internal functions are then available in the top-level environment.

There appears to be an implementation of LiterateProgramming for scheme, "simple support for literate programming in [Scheme] Written by John Ramsdell": ftp://ftp.cs.indiana.edu/pub/scheme-repository/utl/schemeweb.tar.gz

See also CodeOrdering.

May 24, 2010. I am considering writing another implementation of this, and I see here that I have not explained it as clearly as I might have.

The Build syntax is a layer on top of Scheme. It allows you to build a Scheme expression without deeply nested parentheses. What you do is create variables and state their values later. The program then substitutes the values in place of the variables. For example, here is a Scheme expression with some deeply nested parentheses:

  (let loop ((i 0)) (if (>= i 10) #t (begin (display i) (display " ") (loop (+ i 1)))))

If you have some kind of substitution mechanism, you can instead write:

  (let loop ((i 0)) $body)
  where $body = (if (>= i 10) #t $body2)
  where $body2 = (begin (display i) (display " ") (loop (+ i 1)))

This is much simpler to deal with because every line closes all the parentheses it opens. The nesting depth at any given point is therefore a lot less, and it is much easier to verify that syntaxes are correctly formed.

The BuildSyntax analyzer substitutes values for variables, and produces the original expression above. (Thus $body and $body2 are eliminated by the substitution process.) I use the dollar sign to distinguish Build variables from Scheme variables.

The above Scheme expression is still relatively simple; the highest nesting depth is 5. It is possible to define an entire program, such as a canonical LR parser generator, in a single, very large expression. The nesting depth would be much greater and the improvement offered by the BuildSyntax would be correspondingly more dramatic.

Once I wrote an automatic tool called "Grayspace." It helped me enforce a rule that each line would be indented two spaces for every parentheses left open before that line. It helped by creating HTML markups for the source code and setting the background color to gray for the first 2N characters of every line, where N was the nesting depth. Any errors with parentheses would be immediately obvious, because some code would be in the gray area, or the gray area would not extend far enough to reach the code.

Some text editors also help with deeply nested parentheses, doing things such as ensuring that indentation is correct when you cut and paste code.

Refactoring Scheme often requires moving pieces of code in ways that require them to change their depth. For example, suppose you have a loop and you want to move a variable assignment out of the loop. The variable assignment itself might be a large expression. With my two-space-per-open-parenthesis rule, the indentation has to be changed as well.

When I started using Grayspace heavily, the BuildSyntax became obsolete.

Right now I am writing a lot of deeply nested Scheme expressions without the benefit of Grayspace, and without the benefit of a helpful text editor, so I am beginning to see the usefulness of the BuildSyntax again.

-- EdwardKiser

CategoryScheme