Homoiconic Root Structure Discussion

You cannot debate that EssExpressions are powerful.

top, I think you're conflating a few issues here. Common Lisp is based on nested lists, and the programs themselves are built out of nested lists, but there are a variety of structures available to Lisp programmers for keeping data. Common Lisp defines vectors, multi-dimensional arrays, hash tables, association lists, structures, closures, and some other junk that escapes my memory just now. Common Lisp has very powerful data structures. It's only the program structure that's restricted to nested lists. I.e., you can have both powerful structures and simple syntax. Lisp does not sacrifice CollectionOrientedProgramming, and it comes pretty close to offering it.

But then why focus on just lists as the "root" structure? Structures outside of the root structure are no easier in Lisp because they don't map one-to-one to the syntax. Why not pick maps or tables or whatever as the root structure? If you are going to homoicify a language, then pick a better structure than nested lists to build syntax around (I vote tables, surprise!). Did Lisp pick nested lists because it was the best structure or because it is the only syntax that anybody knew how to homoicify? If the first, then why is it the best? If the second, then you are admitting a compromise was made; a case of WhereTheLightIsBetter. (Actually, OOP is homoiconic with maps to a large extent. See ObjectsAreDictionaries.) -- top

Lisp uses nested lists because they're a convenient structure. All programs can be viewed as the composition of functions (there's your LambdaCalculus for you), and one simple representation of a function call is a list whose car is the function and whose cdr is the arguments.

So can maps:

  myMap.funcName = "foo"
  myMap.paramX = "blah"
  myMap.paramY = "Zog"

  <myMap funcName="foo" paramX="blah" paramY="Zog">

In fact, maps better represent named parameters than lists because lists are only positional unless you nest or alternate name and value. This is why some suggest going with XML instead of EssExpressions as the root structure. XML can be used to build IF statements, loops, etc. See XmlProgrammingLanguage. A possible drawback is that XML cannot be nested. A variation that allows nesting may be better than EssExpressions in some ways. That would be homoiconic with nested maps. (It would be fun to experiment with a map-only language.) -- top

Except that KeywordParameterPassing is a really bad default. It's good to have, but when everything needs to be passed by keyword, you quickly find yourself wishing for good 'ole positional parameters. I once did a quick Scheme interpreter for a language with only keyword parameters; it was unusable. The comments on XmlProgrammingLanguage seems to agree that it's a horrible idea too. -- JonathanTang
It is only a suggestion. Fixed parameters could be added to an XML-like language. They would just have the key of "1", "2", etc. if treated as map. Example:

      <myMap "foo" "blah" paramY="Zog">

      OR

      myMap("foo", "bar", paramY="Zog")

      Same As:

      myMap[1] = "foo"
      myMap[2] = "blah"
      myMap['paramY'] = "Zog"

It's certainly not the only possible representation (Scheme started moving away from this with LexicalClosures, and Dylan and Goo both use an ObjectOriented representation), and you'll never be able to prove it's the "best" because you can't even define what "best" is, but it's a convenient one. You could argue that the Haskell/ML strategy of reducing everything to PatternMatching lambda expressions of one argument is a better approach, but they've chosen not to homoiconify their languages, so there's no external interface to manipulate Haskell/ML code fragments. More's the pity; I think a HomoIconic? HaskellLanguage would be really cool.

Maps and tables are horrible structures for code because they don't mimic the structure of the code.

Yeah, that's the problem. Thus, I abandon or reduce dependency on a tight mapping because it only gets me a ticket to limited structures that I don't want to marry.

Actually, maps are very useful in another area - symbol tables - but that ain't what you're talking about. (Tables would be useful there too, but usually you only need to reference a symbol by one key, its name.) Were you paying attention to TupleOrientedProgramming? It's very clumsy, because tables impose a structure on the data that doesn't mimic the actual structure of the code.

That is one reason why I want to abandon or reduce reliance on code. See CodeAvoidance. I feel that one structure cannot fit all well. Thus, use a table browser for stuff that fits tables and a nested-based language for things like Boolean or math-like expressions. Expressions are something that tables are not good at. You have to admit that EssExpressions are not good at tabular data. Columns don't naturally line up. The limit is text. Lisp seems to have a OneSizeFitsAll mentality with regard to data structures, but I don't think that is practical. Some things are best as lists, some as maps, some as trees, and some as tables. I don't want to limit my structures to just one just to satisfy Homoiconic ideology.

Agreed until you said Lisp has a OneSizeFitsAll mentality. Lisp provides pairs, lists, AssociationLists, arrays, vectors, HashTables, records, objects, and full relational algebra support (via SchemeQL). That's far less limited than what you've been proposing. -- JonathanTang

"Limiting"? I don't understand. I have not disputed that nested lists can represent anything, just the convenience (see TuringTarpit). Lisp does nested lists better than anything out there probably. No real disputes there. But it does not do the other structures demonstratably better than other languages. Its Homoiconicness does not improve upon maps, for example, because it is not homoiconic with maps. In fact, it may be a drawback in that area.

I think this:

  myMap(foo=1 bar=2 blah="zog" nork=4726)

Is more natural than something like this:

  (setf myMap '((foo 1)(bar 2)(blah "zog")(nork 4726)))

One thing I like about TCL is that it lets you create just about any syntax you want for the task at hand. If tons of parenthesis gets annoying, one can create a sublanguage that does not need them. However, that also makes it a HackerLanguage in that what I like may bother other's eyes.

Lists were not the only syntax that anybody knew how to homoiconify. MachineCode is homoiconic - that's the whole basis for the stored-program computer. Snobol wasn't around back then, but I'd be surprised if nobody realized that strings could be homoiconic in an interpreter. And Lisp was not designed to be homoiconic - it was designed as a mathematical notation that just happened to be programmable on a computer.

And OOP in general is not homoiconic for any decent definition of homoiconic (haven't we been over this?) Homoiconic doesn't mean that code can be stuffed in a data structure; that applies to any system where an interpreter is available at runtime. It means that code can be manipulated as one of the data types built into the language. You can cdr down a function call in Lisp. You can run a regexp search on Tcl code (and probably in Perl - Perl is an edge case). You can manipulate Goo function calls through a collection of GenericFunctions. You cannot rewrite a Java method and, say, add an if-clause unless you build the full program as a string and pass it to an external Java compiler. -- JonathanTang

But IoLanguage is OOP and homoiconic, isn't it? -- JasonGrossman

(You are right about OOP on a smaller level. More on this later.)

See NestedListsAsDictionaries for an illustration of lists doing tables. -- AnonymousDonor

The issue with this page seems to be that the HomoiconicLanguages page defines program code as being represented in "the language's fundamental data type." But lists are not "the fundamental" data type. A Lisp program can easily be a string, or a number. Maybe there needs to be some sort of PR campaign to get people away from thinking that Lisp is a LISt Programming language. Maybe Lisp should be renamed Sexp. [Oh, I would totally vote for that, it would be much less dorky. -- JJ-S]

Re: A Lisp program can easily be a string, or a number.

Please clarify.

There's a couple senses. The most obvious sense is trivial: 42 is a program. Which evaluates to, not surprisingly, 42. What that might mean is you could wish to conceive of a Lisp which uses things like tables as a code datastructure. It's not my job to think this through, but these are things one can imagine and see if it leads anywhere.

Yes, 42 evaluates to 42 in C as well. Surprise surprise. If you diff the machine code generated from 42 (LispLanguage) and printf("%d\n", 42); (CeeLanguage) it probably helps to understand that LispLanguage is not special in this way.

If you mean that you can create something that references (refers to) something more complex, then this is common in almost any language.

No I am referring to that it has nothing to do with being HomoIconic?...'

I'm going to repeat what someone else already pointed out here: Top, you're seriously confusing two issues. Although Lisp programs are represented as lists, this places no limit whatsoever on the data structures available for use by a program. None. The heart of homoiconicity merely requires that the data structure used to represent programs be one of the data structures easily manipulated by programs written in the language, not that it be the only one.

But if the homoiconic structure is not the one favored by the developer, then a Homoiconic language is not of much use. It is like a Swedish-to-Korean dictionary. If you don't use Korean much, then such a dictionary is not of much use. Nested lists are not the Ultimate Data Structure in my opinion. In short, Lisp "wastes" homoiconicness on a lame data structure. Again, I don't dispute that it can represent any structure, I just dispute that it is the best way to to it.

Source code for most languages is very reasonably represented by a tree structure, and nested lists are a very convenient means of representing trees. We'll just have to agree to disagree on this, because I think you're way off base.
Well, perhaps my problem is that I don't want to work with just tree structures. Trees have their use, but that is not everything. Nested lists for everything is a case of EverythingIsa taken too far in my opinion. I can think of other things I would rather make EverythingIsa out of. As far as nested lists go, Lisp is the king of that. No dispute there. But things that are not as convenient as nested lists are not going to fare as well. The bottom line is that Lisp depends on nested lists being the ultimate structure. If that assumption is not true, then we have known rough spots.

And as for this comment:

"I think this:

  myMap(foo=1 bar=2 blah="zog" nork=4726)

Is more natural than something like this:

  (setf myMap '((foo 1)(bar 2)(blah "zog")(nork 4726)))"

Assuming that the first statement is some kind of constructor, the second example would more likely be something like:

  (defMap (foo 1)(bar 2) (blah "zog") (nork 4726))

  (makeMap '((foo 1)(bar 2) (blah "zog") (nork 4726)))

  (makeMap '((foo 1)
             (bar 2) 
             (blah "zog") 
             (nork 4726)))  <-- which the editor will do automatically for 
                                you, just by hitting enter. A table is, 
                                after all, just a list of rows.

But this can be spaced the same, yet be something completely different:

  (makeMap '((foo 1)
              bar 2 
             (blah) "zog")) 
             (nork 4726)

... but it's hard to know without more specifics on what's intended.

As you say, you think that one is more natural -- but that probably has a lot to do with what you're used to. The two are in fact isomorphic, differ not at all in semantic content (or wouldn't if they were more realistic examples, I suppose), and are to my eye not different in difficulty of writing or reading.

I don't think very many people would select (x y) pairs in parenths as the optimum text representation of maps, even if they never programmed before. Personally, I would select a tabular grid, but the world is still text-happy for some reason.

I think you would have a very hard time defining "optimum" in this context, nor do we particularly care what "many people" would find natural. In fact, I'd wager that for someone that's never programmed before, the concept "map" would be so foreign that any reasonable notation would do. Every technical pursuit has notation conventions that have to be learned; we ask only that they not be unnecessarily baroque, preferring conciseness. A compact two-element list notation meets this criterion.
"Optimum" is probably subjective. I don't want to dictate to others how to process symbols the best for their head.
But when wearing a language designer's hat, you must choose a representation, so the question of what's "optimum" or at least "acceptable" is relevant. I think that (x y) pairs is a reasonable representation, and as someone else demonstrated by adding to the example above, formatting can make all the difference. That's generally the case with all modern, text-based programming languages.
But I think I would accept a bit more complex syntax to get better or more varied structures to tie syntax closely to. I never claimed Lisp's choice was not "reasonable".

Note also that your preferred notation is not at odds with the notion of homoiconicity; the printed representation of a program doesn't necessarily have to directly or obviously reflect its representation as data. Your example statement could also be represented as a list, for instance -- we would require only that the rules for mapping it to one are well defined. In fact, the representation of Lisp source code doesn't always directly correspond to the list structure -- consider for example the quote character's role as shorthand for the QUOTE operator. However, it makes life easier for the programmer if the correspondence is generally close, with simple and few rules for mapping source code to data structure. -- DanMuller

In that view, ALL languages are homoiconic because they all turn into a (sometimes internal) data structure(s) in the end.

If the representation of the code is internal (implying unspecified, or at least implementation-dependent, and possibly even hidden from the programmer), then the language is obviously not homoiconic, because there is no way for a standard program written in the language to generate and evaluate a representation of source code. This should be plain from the definitions discussed on HomoiconicLanguages.
By "internal" I don't mean locked away. Lisp generally does not execute EssExpressions directly, but a byte-code version of them. Thus, it has an "internal structure".
- We are talking about the language, not a specific implementation. As long as there are ways to turn the internal structure to fundamental structure of the language, it's still homoiconic.
But you made a statement about ALL languages above. Most languages do lock away their internal structure. I guess I should have said: "If the only representation ..." And yes, even Lisp implementations typically have an internal structure that's "locked away", and that structure is similarly irrelevant to discussions of homoiconicity.
I have never encountered a language with some real world usage that did not turn syntax into an internal structure of some kind. (In Tcl, the text often is the structure, I would note.)
- I don't think so. Even Tcl is the interpreter, it would have to parsed the source code somehow. and whatever the result of the parse is, it surely is not string. -- AnonymousDonor
  - The result of the "parse" actually 'is' string, until something else happens to it. A Tcl program starts off as one big string. Then it gets evaluated, at which point it gets internally converted to bytecode (it's then both string 'and' bytecode -- this is merely an optimization; conceptually it's still a string). Any substructure in the code is still just a string at that point, and remains a string until/unless it too gets executed. That is, the body of an if command remains just a string until the if command gets executed and its condition is found to be true. -- AnonymousDonor

I still have not received a direct answer to the question: "Are nested lists the ideal data structure to represent most everything else?" If they are, what is the justification? If not, then why make a language syntax around it? Because it is "good enough"? -- top

Yes, they are. Not for all things nor all purposes, but certainly for most things and most purposes. They are at their worst when direct indexing is at its best -- so if you already know you want an array, use an array. But Lisp S-expressions directly represent sequences (including sequences of characters, a.k.a. strings), trees, n-ary trees, and graphs, and does so in a way such that insertion and deletion of nodes is cheap. That covers a huge amount of ground. You can't say the same for arrays, including strings and hash tables -- although again, those have their own strengths. -- DougMerritt

They get ugly for something more complicated than lists or a small tree. I have generally concluded that 3 data structures are sufficient to cover almost all uses:

Lists
Maps (dictionary arrays)
Tables

If you want a homoiconic base stracture(s), use those and life will be better (at least to me). A big fat tangled tree as the only store in town is not it. Text is crappy at representing non-trivial trees. That is why you need dedicated parenthesis processor tools to use Lisp. -- top

Lisp Not Ideal for Boolean Expressions

When I factor to tables what tables best seem to do best, one thing that tends to be left over for code is Boolean expressions. I don't see how Lisp improves on Boolean expressions. When translating from written (English) documents, infix notation better fits the original source. I won't claim that prefix is objectively worse by itself, but it does not fit commonly used conventions and spoken languages. Thus, it does not help what code is best at. -- top

No, it doesn't "better fit", that's subjective. And it's subjective with an English bias.

Yes, but many of us speak English and most computer documentation is also in English.

There are in fact VSO (verb subject object(s)) natural languages, and even English has such constructs; e.g. "Run Jim, away from the wolves! is VSO, and is perfectly grammatical English, although not the most common structure for such sentences.

An objective advantage to prefix (and postfix) over infix is that operator precedence and parenthesis are not necessary. -- DougMerritt

That is only because the parenths force one to include a precedence. One can use them in infix also. Precidence is just a default to reduce typing of parenths. (Personally, I think OR should rank over AND.) Note that I am not saying infix is better for all things, it is just how we learn and document math and logical expressions by linguistical convention.

Not true, at least as stated. While it is true that our linguistic conventions use them, there isn't a single shred of evidence to suggest that that's "how we learn." Quite the contrary, if we didn't use infix notation, I'm willing to bet two pizzas with toppings of your choice that we can shave 5 years, minimum, off the mathematics cirriculum. What I mean is, from kindergarden to senior year in highschool, students would be able to pack in five more years of mathematics education -- perhaps covering topics heretofore only covered in universities.

Five years? I very much doubt that. At the very least, issues of notation were only a tiny aspect of my own mathematics education. How would using prefix or postfix aide in the simplification of formulae? in the distribution of operators? in the teaching of associativity, reflexivity, commutivity, and transitivity? in working with graphs and matrices? in the education regarding proofs that start with an education in Euclidean geometry? How would it considerably aide trignometry, calculus, discrete mathematics, statistics, propositional logic, and calculi? Going the other direction, how would it help with by-hand multiplication and long-division that modern calculators make largely pointless? Teaching polish notations would likely help regarding the teaching of 'order of operations', but most children don't take anywhere near five years to comprehend order-of-operations. I'd take you up on that pizza bet, but I'm highly allergic to cheese.

Most prefix notations require parentheses solely delimit where one list starts and where the same list ends. Precedence occurs as a natural side-effect of the ability to nest lists within lists. If you remove the ability to nest, you remove the need for bracketing. Consider the case where all constructs have permanent "slots" into which you can place code with precise meaning (such as, for instance, guaranteeing mathematical operators are always dyadic):

define f2c x * / - 32 x 180 100

Can you tell me where one concept starts and another ends? It should be relatively easy in this example, because we already know that *, /, and - all take only two arguments, guaranteed. But, can you see how this can quickly get out of hand if we start coming up with more sophisticated examples? Can you answer the question of how to pass more than one argument to a function without using some form of bracketing?

In ForthLanguage, we use ReversePolishNotation, which is just the same as the above, but backwards, and the concatenativity of the language makes it easier to think about, because you don't have to manually associate increasingly nested pairs of parameters with the correct operator:

  : f2c 32 - 180 / 100 * ;

We remove the need for bracketing parameter names by removing the need for naming them, e.g., by just assuming the evaluation stack already holds the required number of parameters. But, despite all this simplification, notice that even Forth still needs a form of bracketing -- the colon brackets the semicolon. Without this, Forth simply has no way of knowing which word names the definition, versus which word implements the definition. Furthermore, unless you're intending on relying on color to eliminate the colon and semicolon brackets (as in ColorForth), there is no way to distinguish when a definition ends, and when its uses begins, particularly when invoking a freshly compiled word.

In summary, Lisp's prefix notation must use parentheses (or something like them, at least) to delimit where a list starts and ends. It has nothing at all to do with enforcing precedence.

I hope this clears some things up.

I forgot about this topic when I started MaspBrainstorming. I just rediscovered it. --top

MayZeroEight