As a thought experiment, I've been kicking around the idea of a semi-Homoiconic scripting language (HomoiconicLanguage) that somewhat resembles C-syntax. It's not that I love C-syntax, but at least it would help in the acceptance of the language and make it more familiar and perhaps more natural than Lisp. This language is sort of a forced marriage between Lisp and JavaScript, with some simplification of concepts for conceptual conservation.
It is based on nested UniversalStatements (function calls with optional named parameters), and a "tree-map". A tree-map is basically a tree where the nodes are non-positional and addressable like an associative array.
If you want to record positional info, such as with positional parameters, then one generally uses sequence numbers like "1", "2", "3", etc. in the tree. Lisp fans may find this annoying and is admittedly one of the uglier aspects of this language. But every language has ugliness, and one often has to accept the ugly stuff to get the good stuff (WaterbedTheory).
The whole run-time image is one big tree. One can x-ray the tree to study or change any aspect (if interpreter permissions allow). All the variable stacks etc. are part of this tree. Here is an example tree generated from parsing:
// Example 1 - source code print("hello world", device=foo); printLn(foop, device="zert"); printLn("foop"); // note quotes, contrast with prior // Run-time tree generated from Example 1 system.default.main.1.print.1 => "hello world" system.default.main.1.print.device.var => "foo" system.default.main.2.printLn.1.var => "foop" system.default.main.2.printLn.device => "zert" system.default.main.3.printLn.1 => "foop"The "=>" indicates we are showing the value at the given array position given by the map-tree path. The above is essentially part of an array dump where the array is the system tree-map. You could get it by using:
print(dump("system"));Normally one does not have to give the whole path for variables because a scope system is used during execution which would assume function-only scope unless not found there.
Above, "default" is the default name-space. This allows the possibility of multiple name-spaces, although I am not describing such a feature any further and it won't be part of our hypothetical version 1.0.
"main" is the main function, which will be assumed if no explicit "main" function is defined.
There is no difference between variables and map arrays. Variables are just short-cut references. The following are all equivalent:
foo = 7; foo.value = 7; foo["value"] = 7; foo['value'] = 7;The "value" convention makes the formal (tracked) distinction between a variable and an array (tree map) unnecessary. Array dumps normally don't explicitly show the "value" node. (If you wanted to check to see if something "is" a variable or array, you'd see if it had any elements other than "value".)
[How the run-time scoping works will be described later.]
There are a few conventions to allow the syntax to stay almost C-like. First, there is no difference assumed between commas and semi-colons as separators. However, the convention is to use commas to separate parameters and semi-colons to separate statements. (A lint-like utility could find potential violations.)
Second, there is no difference between parentheses and curly braces. However, the convention is to only use curly braces for control statements, such as an "if" statement. The interpreter will (by default) issue warnings if parenthesis are paired with curlies. This helps reduce matching problems and offers visual cues.
Note the difference between a control statement in C versus this language:
// C-style if(condition) { foo(); bar(); } // This language if {condition, foo(); bar(); }The slight difference is because the "if" statement is just a regular function call.
Third, an assignment statement is really a "set" statement. The assignment syntax is simply a shortcut for "set".
Here is another general set of examples and the resulting tree (only partial path shown for simplicity).
// Example 2 if { compare(foo,"<",7), zap(a,"b",waz=2), bar(), x = 12 } // Equivalent to above if ( cmp(foo,"<",7), zap(a,"b",waz=2); bar(); set("x", 12) // see note ) // note not a curly // equivalent assignment statements (partial paths only) if.1.cmp.1.var="foo"; // cmp stands for "compare" if.1.cmp.2="<"; if.1.cmp.3=7; if.2.zap.1.var="a"; if.2.zap.2="b"; if.2.zap.waz=2; if.3.bar.paramCount=0; // place-holder, needs work if.4.set.1="x"; if.4.set.2=12;Note how "var" is used to distinguish between constants and variables.
Miscellanious Notes:
Useful Library Functions
Q: Is 'value' a loopback, or a special tree value of its own?
foo = 23; foo.value = 42; foo.value['value']["value"].value = 108; how does 'foo.value' evaluate?A: That would be an error. Value has to be a leaf. A thoughtful question.
Q: Tree assignment or 'value' assignment?
foo.y = 42; foo.y.answers.1 = "Life"; foo.y.answers.2 = "Universe"; foo.y.answers.3 = "Everything"; foo.x = foo.y // assume 'foo.x' did not exist prior how does 'foo.x' evaluate? how does 'foo.x.answers.2' evaluate?A: print(foo.x) would return "42". Tree/branch copy would need a library function(s). The second one would be an error (unless "@" used).
Q: Tree vs. Object Graph?
foo.y.answers = "Life, Universe, Everything"; foo.x = foo.y foo.x.answers = "For whom the bell tolls"; how does 'foo.y.answers' evaluate?A: As described above, assignment by itself does not do tree/branch copying. See "clone" function above.
Q: Metaprogramming support: How do I quote a program? Homoiconic is somewhat pointless without it. Something like as follows is needed:
foo.x = quote(println(foo.y)); foo.x.println.1 = quote(foo.z); // change from foo.y to foo.z foo.x.println.device = "console"; // make device explicit foo.x(); // print a line system.default.main.1 = foo.x; // change the first behavior of the system.(I'm still refining that. I'll consider this as a fall-back, though. --top)
Q: One of the examples shows how to call the "if" function using assignments, with if.1 containing the condition and if.2 and up containing the body. How is this invocation distinguished from creating an array called "if"? In other words,
if.1 = "This is the first array element" if.2 = "This is the second array element"Looks like it would be interpreted as follows:
if { "This is the first array element", "This is the second array element" }A: Variable assignments are done through the "set" function, and thus will not be on the same tree level. They would actually only directly become part of the tree on the run-time stack (not shown yet), and thus the actual variables are in a different tree-space.
Alternative Control Statements
An approach that may simplify the implementation but complicate the syntax a bit is to have "lite" control statements that are stand-alone functions. Example:
if(cmp(a,"=",b)); foo(a); bar(b); endif();Here there is no direct nesting of the statements "foo" and "bar" within an "if" function. What would happen here is that the interpreter would jump to the appropriate "endif()" function if the statement was false.
The run-time engine could perhaps look for a "jumpto" attribute in the given IF statement's sub-tree. If it does not find one by chance (such as first run), it then scans the function to calculate it. Remember that all statements are numbered within a function. Thus, it just has to find the matching statement number and put that into the "jumpto" attribute.
A simpler run-time engine allows easier custom tweaks.
Discussion
Please implement it. Call us when you're done.
{I'm waiting for you to implement your AI-based language implementor to do it. Howzit comin' by the way? --top}
Since I've neither conceived of nor designed an AI-based language implementor, I'd say it's well into its pre-pre-alpha stages.