Homoiconic Root Structure Discussion

from HomoiconicLanguageDrawbacks

You cannot debate that EssExpressions are powerful.

top, I think you're conflating a few issues here. Common Lisp is based on nested lists, and the programs themselves are built out of nested lists, but there are a variety of structures available to Lisp programmers for keeping data. Common Lisp defines vectors, multi-dimensional arrays, hash tables, association lists, structures, closures, and some other junk that escapes my memory just now. Common Lisp has very powerful data structures. It's only the program structure that's restricted to nested lists. I.e., you can have both powerful structures and simple syntax. Lisp does not sacrifice CollectionOrientedProgramming, and it comes pretty close to offering it.

But then why focus on just lists as the "root" structure? Structures outside of the root structure are no easier in Lisp because they don't map one-to-one to the syntax. Why not pick maps or tables or whatever as the root structure? If you are going to homoicify a language, then pick a better structure than nested lists to build syntax around (I vote tables, surprise!). Did Lisp pick nested lists because it was the best structure or because it is the only syntax that anybody knew how to homoicify? If the first, then why is it the best? If the second, then you are admitting a compromise was made; a case of WhereTheLightIsBetter. (Actually, OOP is homoiconic with maps to a large extent. See ObjectsAreDictionaries.) -- top

Lisp uses nested lists because they're a convenient structure. All programs can be viewed as the composition of functions (there's your LambdaCalculus for you), and one simple representation of a function call is a list whose car is the function and whose cdr is the arguments.

So can maps:

  myMap.funcName = "foo"
  myMap.paramX = "blah"
  myMap.paramY = "Zog"
Or

  <myMap funcName="foo" paramX="blah" paramY="Zog">
In fact, maps better represent named parameters than lists because lists are only positional unless you nest or alternate name and value. This is why some suggest going with XML instead of EssExpressions as the root structure. XML can be used to build IF statements, loops, etc. See XmlProgrammingLanguage. A possible drawback is that XML cannot be nested. A variation that allows nesting may be better than EssExpressions in some ways. That would be homoiconic with nested maps. (It would be fun to experiment with a map-only language.) -- top

      <myMap "foo" "blah" paramY="Zog">

OR

myMap("foo", "bar", paramY="Zog")

Same As:

myMap[1] = "foo" myMap[2] = "blah" myMap['paramY'] = "Zog"

It's certainly not the only possible representation (Scheme started moving away from this with LexicalClosures, and Dylan and Goo both use an ObjectOriented representation), and you'll never be able to prove it's the "best" because you can't even define what "best" is, but it's a convenient one. You could argue that the Haskell/ML strategy of reducing everything to PatternMatching lambda expressions of one argument is a better approach, but they've chosen not to homoiconify their languages, so there's no external interface to manipulate Haskell/ML code fragments. More's the pity; I think a HomoIconic? HaskellLanguage would be really cool.

Maps and tables are horrible structures for code because they don't mimic the structure of the code.

Yeah, that's the problem. Thus, I abandon or reduce dependency on a tight mapping because it only gets me a ticket to limited structures that I don't want to marry.

Actually, maps are very useful in another area - symbol tables - but that ain't what you're talking about. (Tables would be useful there too, but usually you only need to reference a symbol by one key, its name.) Were you paying attention to TupleOrientedProgramming? It's very clumsy, because tables impose a structure on the data that doesn't mimic the actual structure of the code.

That is one reason why I want to abandon or reduce reliance on code. See CodeAvoidance. I feel that one structure cannot fit all well. Thus, use a table browser for stuff that fits tables and a nested-based language for things like Boolean or math-like expressions. Expressions are something that tables are not good at. You have to admit that EssExpressions are not good at tabular data. Columns don't naturally line up. The limit is text. Lisp seems to have a OneSizeFitsAll mentality with regard to data structures, but I don't think that is practical. Some things are best as lists, some as maps, some as trees, and some as tables. I don't want to limit my structures to just one just to satisfy Homoiconic ideology.

Agreed until you said Lisp has a OneSizeFitsAll mentality. Lisp provides pairs, lists, AssociationLists, arrays, vectors, HashTables, records, objects, and full relational algebra support (via SchemeQL). That's far less limited than what you've been proposing. -- JonathanTang

"Limiting"? I don't understand. I have not disputed that nested lists can represent anything, just the convenience (see TuringTarpit). Lisp does nested lists better than anything out there probably. No real disputes there. But it does not do the other structures demonstratably better than other languages. Its Homoiconicness does not improve upon maps, for example, because it is not homoiconic with maps. In fact, it may be a drawback in that area.

I think this:

  myMap(foo=1 bar=2 blah="zog" nork=4726)

Is more natural than something like this:

  (setf myMap '((foo 1)(bar 2)(blah "zog")(nork 4726)))

One thing I like about TCL is that it lets you create just about any syntax you want for the task at hand. If tons of parenthesis gets annoying, one can create a sublanguage that does not need them. However, that also makes it a HackerLanguage in that what I like may bother other's eyes.

Lists were not the only syntax that anybody knew how to homoiconify. MachineCode is homoiconic - that's the whole basis for the stored-program computer. Snobol wasn't around back then, but I'd be surprised if nobody realized that strings could be homoiconic in an interpreter. And Lisp was not designed to be homoiconic - it was designed as a mathematical notation that just happened to be programmable on a computer.

And OOP in general is not homoiconic for any decent definition of homoiconic (haven't we been over this?) Homoiconic doesn't mean that code can be stuffed in a data structure; that applies to any system where an interpreter is available at runtime. It means that code can be manipulated as one of the data types built into the language. You can cdr down a function call in Lisp. You can run a regexp search on Tcl code (and probably in Perl - Perl is an edge case). You can manipulate Goo function calls through a collection of GenericFunctions. You cannot rewrite a Java method and, say, add an if-clause unless you build the full program as a string and pass it to an external Java compiler. -- JonathanTang

But IoLanguage is OOP and homoiconic, isn't it? -- JasonGrossman

(You are right about OOP on a smaller level. More on this later.)

See NestedListsAsDictionaries for an illustration of lists doing tables. -- AnonymousDonor


The issue with this page seems to be that the HomoiconicLanguages page defines program code as being represented in "the language's fundamental data type." But lists are not "the fundamental" data type. A Lisp program can easily be a string, or a number. Maybe there needs to be some sort of PR campaign to get people away from thinking that Lisp is a LISt Programming language. Maybe Lisp should be renamed Sexp. [Oh, I would totally vote for that, it would be much less dorky. -- JJ-S]

Re: A Lisp program can easily be a string, or a number.

Please clarify.

There's a couple senses. The most obvious sense is trivial: 42 is a program. Which evaluates to, not surprisingly, 42. What that might mean is you could wish to conceive of a Lisp which uses things like tables as a code datastructure. It's not my job to think this through, but these are things one can imagine and see if it leads anywhere.

Yes, 42 evaluates to 42 in C as well. Surprise surprise. If you diff the machine code generated from 42 (LispLanguage) and printf("%d\n", 42); (CeeLanguage) it probably helps to understand that LispLanguage is not special in this way.

If you mean that you can create something that references (refers to) something more complex, then this is common in almost any language.

No I am referring to that it has nothing to do with being HomoIconic?...'


I'm going to repeat what someone else already pointed out here: Top, you're seriously confusing two issues. Although Lisp programs are represented as lists, this places no limit whatsoever on the data structures available for use by a program. None. The heart of homoiconicity merely requires that the data structure used to represent programs be one of the data structures easily manipulated by programs written in the language, not that it be the only one.

But if the homoiconic structure is not the one favored by the developer, then a Homoiconic language is not of much use. It is like a Swedish-to-Korean dictionary. If you don't use Korean much, then such a dictionary is not of much use. Nested lists are not the Ultimate Data Structure in my opinion. In short, Lisp "wastes" homoiconicness on a lame data structure. Again, I don't dispute that it can represent any structure, I just dispute that it is the best way to to it.

And as for this comment:

"I think this:

  myMap(foo=1 bar=2 blah="zog" nork=4726)

Is more natural than something like this:

  (setf myMap '((foo 1)(bar 2)(blah "zog")(nork 4726)))"

Assuming that the first statement is some kind of constructor, the second example would more likely be something like:

  (defMap (foo 1)(bar 2) (blah "zog") (nork 4726))
or

  (makeMap '((foo 1)(bar 2) (blah "zog") (nork 4726)))
or

  (makeMap '((foo 1)
             (bar 2) 
             (blah "zog") 
             (nork 4726)))  <-- which the editor will do automatically for 
                                you, just by hitting enter. A table is, 
                                after all, just a list of rows.
But this can be spaced the same, yet be something completely different:

  (makeMap '((foo 1)
              bar 2 
             (blah) "zog")) 
             (nork 4726)
... but it's hard to know without more specifics on what's intended.

As you say, you think that one is more natural -- but that probably has a lot to do with what you're used to. The two are in fact isomorphic, differ not at all in semantic content (or wouldn't if they were more realistic examples, I suppose), and are to my eye not different in difficulty of writing or reading.

I don't think very many people would select (x y) pairs in parenths as the optimum text representation of maps, even if they never programmed before. Personally, I would select a tabular grid, but the world is still text-happy for some reason.

Note also that your preferred notation is not at odds with the notion of homoiconicity; the printed representation of a program doesn't necessarily have to directly or obviously reflect its representation as data. Your example statement could also be represented as a list, for instance -- we would require only that the rules for mapping it to one are well defined. In fact, the representation of Lisp source code doesn't always directly correspond to the list structure -- consider for example the quote character's role as shorthand for the QUOTE operator. However, it makes life easier for the programmer if the correspondence is generally close, with simple and few rules for mapping source code to data structure. -- DanMuller

In that view, ALL languages are homoiconic because they all turn into a (sometimes internal) data structure(s) in the end.


I still have not received a direct answer to the question: "Are nested lists the ideal data structure to represent most everything else?" If they are, what is the justification? If not, then why make a language syntax around it? Because it is "good enough"? -- top

Yes, they are. Not for all things nor all purposes, but certainly for most things and most purposes. They are at their worst when direct indexing is at its best -- so if you already know you want an array, use an array. But Lisp S-expressions directly represent sequences (including sequences of characters, a.k.a. strings), trees, n-ary trees, and graphs, and does so in a way such that insertion and deletion of nodes is cheap. That covers a huge amount of ground. You can't say the same for arrays, including strings and hash tables -- although again, those have their own strengths. -- DougMerritt

They get ugly for something more complicated than lists or a small tree. I have generally concluded that 3 data structures are sufficient to cover almost all uses:

If you want a homoiconic base stracture(s), use those and life will be better (at least to me). A big fat tangled tree as the only store in town is not it. Text is crappy at representing non-trivial trees. That is why you need dedicated parenthesis processor tools to use Lisp. -- top


Lisp Not Ideal for Boolean Expressions

When I factor to tables what tables best seem to do best, one thing that tends to be left over for code is Boolean expressions. I don't see how Lisp improves on Boolean expressions. When translating from written (English) documents, infix notation better fits the original source. I won't claim that prefix is objectively worse by itself, but it does not fit commonly used conventions and spoken languages. Thus, it does not help what code is best at. -- top

No, it doesn't "better fit", that's subjective. And it's subjective with an English bias.

Yes, but many of us speak English and most computer documentation is also in English.

There are in fact VSO (verb subject object(s)) natural languages, and even English has such constructs; e.g. "Run Jim, away from the wolves! is VSO, and is perfectly grammatical English, although not the most common structure for such sentences.

An objective advantage to prefix (and postfix) over infix is that operator precedence and parenthesis are not necessary. -- DougMerritt

That is only because the parenths force one to include a precedence. One can use them in infix also. Precidence is just a default to reduce typing of parenths. (Personally, I think OR should rank over AND.) Note that I am not saying infix is better for all things, it is just how we learn and document math and logical expressions by linguistical convention.

Not true, at least as stated. While it is true that our linguistic conventions use them, there isn't a single shred of evidence to suggest that that's "how we learn." Quite the contrary, if we didn't use infix notation, I'm willing to bet two pizzas with toppings of your choice that we can shave 5 years, minimum, off the mathematics cirriculum. What I mean is, from kindergarden to senior year in highschool, students would be able to pack in five more years of mathematics education -- perhaps covering topics heretofore only covered in universities.

Most prefix notations require parentheses solely delimit where one list starts and where the same list ends. Precedence occurs as a natural side-effect of the ability to nest lists within lists. If you remove the ability to nest, you remove the need for bracketing. Consider the case where all constructs have permanent "slots" into which you can place code with precise meaning (such as, for instance, guaranteeing mathematical operators are always dyadic):

  define f2c x * / - 32 x 180 100

Can you tell me where one concept starts and another ends? It should be relatively easy in this example, because we already know that *, /, and - all take only two arguments, guaranteed. But, can you see how this can quickly get out of hand if we start coming up with more sophisticated examples? Can you answer the question of how to pass more than one argument to a function without using some form of bracketing?

In ForthLanguage, we use ReversePolishNotation, which is just the same as the above, but backwards, and the concatenativity of the language makes it easier to think about, because you don't have to manually associate increasingly nested pairs of parameters with the correct operator:

  : f2c 32 - 180 / 100 * ;

We remove the need for bracketing parameter names by removing the need for naming them, e.g., by just assuming the evaluation stack already holds the required number of parameters. But, despite all this simplification, notice that even Forth still needs a form of bracketing -- the colon brackets the semicolon. Without this, Forth simply has no way of knowing which word names the definition, versus which word implements the definition. Furthermore, unless you're intending on relying on color to eliminate the colon and semicolon brackets (as in ColorForth), there is no way to distinguish when a definition ends, and when its uses begins, particularly when invoking a freshly compiled word.

In summary, Lisp's prefix notation must use parentheses (or something like them, at least) to delimit where a list starts and ends. It has nothing at all to do with enforcing precedence.

I hope this clears some things up.


I forgot about this topic when I started MaspBrainstorming. I just rediscovered it. --top


See also: HomoiconicLanguages, AdvantagesOfExposingRunTimeEngine, LispLacksVisualCues

MayZeroEight


EditText of this page (last edited November 13, 2014) or FindPage with title or text search