Xml Isa Good Copy Of Ess Expressions

Some people think that XmlIsaPoorCopyOfEssExpressions. But what's "poor" about the copying? One is redundant, the other isn't. That doesn't have to mean "good" and "bad". I do prefer sexps but XML has already become accepted in many projects and communities I work with.

That's not a problem - since sexps and XML are more or less copies of each other, they can easily be translated between each other. This makes xml-tools good for working with lisp, and (especially) the other way around. Lispers should be cheering, not sulking that XmlSucks.
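To make the "easily translated" claim a bit more concrete, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree parser and an SXML-like layout in which attributes go into a leading (@ ...) list; the function names xml_to_sexp and render are made up for this illustration, and comments, processing instructions and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

def xml_to_sexp(elem):
    """Return a nested list: [tag, ['@', [name, value], ...], child, ...]."""
    node = [elem.tag]
    if elem.attrib:
        node.append(["@"] + [[k, v] for k, v in elem.attrib.items()])
    if elem.text and elem.text.strip():
        node.append(elem.text.strip())
    for child in elem:
        node.append(xml_to_sexp(child))
        # Text that follows a child element belongs to the parent.
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node

def render(node):
    """Write the nested list with Lisp-style parentheses."""
    if isinstance(node, str):
        # Text content becomes a quoted string.
        return '"' + node.replace('\\', '\\\\').replace('"', '\\"') + '"'
    head, *rest = node
    return "(" + " ".join([head] + [render(r) for r in rest]) + ")"

doc = ET.fromstring(
    '<window title="Preferences"><button type="switch">OK</button></window>'
)
print(render(xml_to_sexp(doc)))
# prints: (window (@ (title "Preferences")) (button (@ (type "switch")) "OK"))
```

Going the other way is the same walk in reverse, which is roughly why the two notations feel like copies of each other.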

It's a bit amusing to look at emacs configuration files, and web pages, and other weird xml, after realizing that they are somehow the same thing.


From EssExpressions: http://pobox.com/~oleg/ftp/Scheme/xml.html#SXML-spec


Sorry. XML is a lot more than a copy of sexprs. It may have started as such but there is no equivalent for namespaces, encoding and XSDs in sexprs (though I guess this can be put on easily in any ad-hoc way with a very few lines of scheme code).

If you show me how to do this in sexprs I will be quiet, though:

 <?xml version="1.0" encoding="ISO-8859-1"?>
 <example>
  <e:tree xmlns:e="http://www.example.org/example" e:noNamespaceSchemaLocation="example.xsd">
   <e:list>
    <e:entry>Apples</e:entry>
    <e:entry>Bananas</e:entry>
   </e:list>
  </e:tree>
 </example>

That's easy:

 (?sexpr ((version "1.0") (encoding "utf-8")))
 (example
    (e:tree ((xmlns:e "http://www.example.org/example") (e:noNamespaceSchemaLocation "example.xsd"))
       (e:list
          (e:entry "Apples")
          (e:entry "Bananas"))))

It's trivial to write a macro that will then go through, find all the symbols of the form "e:[etc]" and expand them using the symbol "xmlns:e" as the value to do the expansion. For an example, I would refer you to "Let Over Lambda" by Doug Hoyte, in one of the earlier chapters, where Doug describes something similar for gensyms denoted "g![name]". (I don't remember which one off the top of my head).
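The expansion step can be sketched outside Lisp as well. The following is a plain Python function rather than a macro, and it assumes the layout used in the example above: an optional attribute list (a list of (name value) pairs) directly after the element name. The names qualify and expand are hypothetical, and the "{uri}local" output form is just one convenient way to write an expanded name.

```python
def qualify(name, scope):
    """Turn prefix:local into {uri}local if the prefix is bound in scope."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in scope:
        return "{" + scope[prefix] + "}" + local
    return name

def expand(node, scope=None):
    """Expand prefixed names using the xmlns:prefix attributes in scope."""
    if isinstance(node, str):
        return node                      # text content is left alone
    scope = dict(scope or {})            # copy, so siblings are unaffected
    head, *rest = node
    attrs = []
    # An attribute list is a list whose first item is itself a list.
    if rest and isinstance(rest[0], list) and rest[0] and isinstance(rest[0][0], list):
        attrs, rest = rest[0], rest[1:]
    for name, value in attrs:
        if name.startswith("xmlns:"):
            scope[name[len("xmlns:"):]] = value
    out = [qualify(head, scope)]
    kept = [[qualify(n, scope), v] for n, v in attrs if not n.startswith("xmlns:")]
    if kept:
        out.append(kept)
    out.extend(expand(child, scope) for child in rest)
    return out

tree = ["example",
        ["e:tree", [["xmlns:e", "http://www.example.org/example"],
                    ["e:noNamespaceSchemaLocation", "example.xsd"]],
         ["e:list", ["e:entry", "Apples"], ["e:entry", "Bananas"]]]]
expanded = expand(tree)
```

Here expanded[1][0] comes out as "{http://www.example.org/example}tree", and the binding is visible to the nested e:list and e:entry forms because the scope is passed down the recursion.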

Having said that, I consider XML namespaces to be pure evil. Perhaps it's just because Python's lxml library doesn't seem to handle them properly; also, it never made sense to me to have them, when children of an element didn't inherit a given namespace automatically. The only way I've been able to deal with them is to replace each namespace with a distinct string using a RegEx, process the XML as if it had no namespaces, and then restore the namespaces with the original text when finished processing. --Alpheus
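For what it's worth, here is a small sketch of working with namespaces without the regex detour. It uses the standard-library xml.etree.ElementTree (not lxml, so no claim is made about lxml's behaviour here), and the prefix map ns is something the caller supplies. It also illustrates the inheritance question: a prefix binding is in scope for descendants but only applies to names that use the prefix, whereas a default xmlns="..." declaration does apply to unprefixed child elements.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<example>'
    '<e:tree xmlns:e="http://www.example.org/example">'
    '<e:list><e:entry>Apples</e:entry><e:entry>Bananas</e:entry></e:list>'
    '</e:tree>'
    '</example>'
)

# The prefix used in the query is local to this map; only the URI must match.
ns = {"e": "http://www.example.org/example"}
entries = [el.text for el in doc.findall("e:tree/e:list/e:entry", ns)]

# A default namespace is picked up by unprefixed children.
inherited = ET.fromstring('<root xmlns="urn:demo"><child/></root>')
child_tag = inherited[0].tag
```

With this input, entries is ["Apples", "Bananas"] and child_tag is "{urn:demo}child".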


Now what I want is an xml-mode for emacs that exploits this fact and displays XML as sexps.


Well...DTDs seem like a very heavy weakness of XML vs sexprs.

 <currency value="1.00" country="Untied States of America">
  <exchange_rate_date>2004 March 26</exchange_rate_date>
 </currency>
You're going to fix all such typos with DTDs? Because of course it is an error as it stands (no such country).

Forgive my ignorance, but does not this have the same problem?

 '(currency (value "1.00") (country "Untied States of America")(
 (exchange_rate_date ("2004 March 26"))
 ))
This would be something that the app would have to deal with regardless, unless there are only a very few specific valid values. I'm probably demonstrating my lack of knowledge, but DTDs are more about making sure that the 'currency', 'country', and 'exchange_rate_date' bits exist in the right spots, and have the right _type_ of data associated with them, aren't they?

And anyway, I'm sure I wouldn't want that particular verification done in the same manner in which I detect transmission and structural errors. Who's to say that "Untied States of America" is wrong? The contexts of a computer game, social revolution, test data, all come to mind.

Aside: is it technically possible for a DTD to specify that something must match a member of another data structure? If so, then it would be quite feasible to solve this problem using DTDs, and quite applicable to the lisp structure.

--WilliamUnderwood
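The split between structural checks and "is this value sensible" checks can be sketched like this. It is a hand-written check in Python rather than a DTD, and everything in it is an assumption for illustration: the function name check_currency, the required fields, and the idea that the application passes in whatever set of countries is valid in its own context.

```python
import xml.etree.ElementTree as ET

def check_currency(xml_text, known_countries):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    elem = ET.fromstring(xml_text)   # malformed XML raises ParseError here

    # Structural checks: are the expected pieces in the expected places?
    if elem.tag != "currency":
        problems.append("root element is not <currency>")
    for attr in ("value", "country"):
        if attr not in elem.attrib:
            problems.append("missing attribute: " + attr)
    if elem.find("exchange_rate_date") is None:
        problems.append("missing <exchange_rate_date>")

    # Value check: only the application knows which countries exist.
    country = elem.get("country")
    if country is not None and country not in known_countries:
        problems.append("unknown country: " + country)
    return problems

record = ('<currency value="1.00" country="Untied States of America">'
          '<exchange_rate_date>2004 March 26</exchange_rate_date>'
          '</currency>')
real_world = check_currency(record, {"United States of America"})
game_world = check_currency(record, {"Untied States of America"})
```

The same record is flagged in real_world and accepted in game_world, which is the point about context: the structure is identical in both cases, and the same two-stage check would apply to the sexp form.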


I just remembered that Gnome Glade, MozillaXul, Gnustep Renaissance etc use XML. Mmmm, a nice way to do lisp-bindings to those gui systems? I don't know. I ran some of the Renaissance example files through xml2sxml and whoah! That's how I'd like to be able to program GUIs. An example:

 (*TOP* (window
          (@ (title "Preferences"))
          (hbox (box (@ (title "Game preferences"))
                     (vbox (button
                             (@ (type "switch")
                                (title "Enable monsters")
                                (tag "0")
                                (halign "left")
                                (action "changedPrefs:")))
                           (button
                             (@ (type "switch")
                                (title "Enable advanced AI in monsters")
                                (tag "1")
                                (halign "left")
                                (action "changedPrefs:")))
                           (button
                             (@ (type "switch")
                                (title "Enable infinite lives (cheat mode)")
                                (tag "2")
                                (halign "left")
                                (action "changedPrefs:"))))))))


Counterpoint: XML is useful precisely because it has no semantics. There are a lot of XML-RPC libraries for a lot of languages. Each and every one of those languages has slightly different ideas about what things like "integers" or "structures" mean. Some have to hack around the language to allow you into complex structs. Some are able to dynamically construct structures based on the input, like the dynamic languages can.

If XML strongly enforced some sort of semantic meaning, you would not be able to create something like XML-RPC, because the languages don't agree on semantics. S-expressions are not, strictly speaking, LISP specific, but to the extent they carry semantics, they are not equally easily expressible or equally easily manipulable in all languages.

I'd say the vast majority of languages out there - at least the ones that one would use to implement XML applications - are able to express signed 32-bit integers. Of course, if you have examples to the contrary, I'd be interested in seeing those.

This entire page is taking the microscopic case and arguing it to death, when at the microscopic level it doesn't much matter which you use. Explain to me how you would implement S-Exp-RPC, in 20+ languages, without stripping out the putative advantages of S-Exps or embedding non-native semantics into a language (when XML-RPC demonstrates that is not necessary).

You're missing a very important point: The S-expression semantics are a superset of the XML semantics, in the sense that XML can only express strings (XmlIsJustDumbText), while S-expressions can express strings and (say) 32-bit signed integers. Any piece of XML can be mapped to an equivalent S-expression. Thus, whatever you can do with XML, you can do with S-expressions.

No, that is my point. The converse is not true: "Anything you can do with S-expressions, you can do with XML." I am saying this is a good thing for the purposes to which XML is being put, and is a reason why you can't just drop S-expressions in to the same places and make it work. This also answers the previous italicized objection; it's not that XML can do things S-expressions can't, it's that XML can't do things S-expressions can. More power! is not always a good philosophy; the more you load into the language by definition, the more likely it is that another given environment will have conflicting assumptions. (For what it's worth, attempts to add semantics to XML itself have the same problems; not all environments agree.)

So you're saying that problems would arise because there's differing opinions on the semantics of a 32-bit signed integer? Well that's why we have these things called "standards". I'm not saying that such a general encoding format as XML should have built-in support for three kinds of sequences and circular structures; I'm saying that maybe it should have support for basic integers.

Again, that's a microscopic argument on a vast topic. Is everything in your world a 32-bit signed integer?

No, but most of it can be represented by strings, integers and symbols (as evidenced by the fact that everything in XML is represented by just strings and symbols alone).

Across all of the languages that implement an XML-RPC library (see http://www.xmlrpc.com/directory/1568/implementations but I don't think that's a complete list any longer), using that as a more realistic, macroscopic argument, you can't find any agreement on what a "structure" is, beyond "something more than a flat array".

That's not what I'm talking about. I don't think "structures" should be built into the language any more than I think virtual baseclasses should.

You keep picking on integers but you're zooming in on what is almost the only point of agreement there is!

Exactly!

...almost, but not quite even there. Compare "1073741824 * 5" in Python vs. C. Even there, the languages end up imposing their own semantics on ints, which S-Expressions would presumably be overspecifying, assuming the numbers are specified as 32-bit ints in the language definition.
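To put numbers on that comparison, here is a small Python sketch. The helper wrap_int32 is hypothetical and models 32-bit two's-complement wraparound; note that in C itself signed overflow is undefined behaviour, so the wrapped value is only what one commonly observes, not something the language promises.

```python
def wrap_int32(n):
    """Reduce n modulo 2**32 and reinterpret it as a signed 32-bit value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

exact = 1073741824 * 5        # Python ints grow as needed: 5368709120
wrapped = wrap_int32(exact)   # 1073741824 under 32-bit wraparound

print(exact, wrapped)
# prints: 5368709120 1073741824
```

So the same literal expression gives two different answers depending on which integer semantics the reader brings along, before any wire format is involved.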

Well, then until we can agree on a standard for 32-bit integers, let's use IEEE floats.

I really don't think you're getting my point; you keep making arguments that implicitly assume that I think that the problem with S-Expressions is under-specification, and if we just agree to use 32-bit ints, IEEE floats, some standard of structure (none?), and in general keep adding to the S-Expressions that S-Expressions will somehow come out on top in the comparison. My problem is that S-Expressions are over-specified; adding specifications like "we'll agree to IEEE floats and 32-bit ints" doesn't work in many environments, and every addition you propose to S-Expressions makes them heavier. Python has no 32-bit int, in the sense of an int where 1073741824 * 500 is something much less than 536870912000. Other languages may natively support floats with much more precision than IEEE, or much less (embedded). By not specifying so much about the data and focusing almost entirely on the syntax only, XML goes into these environments. S-Expressions don't. For many, if not most, situations that XML is used in (especially actual exchange of data between vastly differing environments), XML is useful (if not perfect), and S-Expressions aren't even in the running; we'd actually prefer flat text or custom binary interfaces, as amply demonstrated in any number of real-world apps. To me, that says that XML is not a "poor copy of S-Expressions" in practice, even if you can argue in theory that it is, just as push-down automata are poor copies of Turing Machines.

In XML, <number>384759374</number> is just a number, not a Float or an Int or any other enforced semantic meaning. In fact it could be a string too. You can just as easily say <number>39285729385723985729385723985748709817360985730.387523637523875</number>, and XML doesn't break. If you have two apps that understand that, then you can use the XML. Just as a reminder, from further up on the page: "S-expressions endow data with type semantics, whereas XML just gives tree structure to pieces of text! In ANSI Common Lisp forms, you have symbols, lists, integers, strings, vectors, multidimensional arrays, complex numbers, rationals, etc." S-Expressions break here if they endow a number with 32-bitness. S-expressions enforce a certain semantic on each and every one of those things (list, symbol, vector, etc). An XML library can be built without also forcing the introduction of a certain type of list, a certain type of integer, etc. XML is better for many uses because it is less specified.

[Compare: (number "384759374"), (number "39285729385723985729385723985748709817360985730.387523637523875"). S-exps don't break, either. You can safely drop 32-bit numbers from S-expressions and they won't even notice. You'll just have to do this, let's say, XML-style, that is, using a symbol and a string. XML is better iff it is convenient to embed tags inside text, as HTML does. Otherwise, S-exps are a more compact (both space- and code-wise, I believe) way of doing exactly the same thing. -- Baczek]
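Either way, what arrives is a run of digits, and the consuming program picks the interpretation. A minimal Python sketch, assuming the standard-library ElementTree and decimal modules (the variable names are only for illustration):

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

xml_text = ("<number>"
            "39285729385723985729385723985748709817360985730.387523637523875"
            "</number>")
text = ET.fromstring(xml_text).text   # just a string at this point

as_decimal = Decimal(text)   # keeps every digit
as_float = float(text)       # rounds to the nearest IEEE double

# The two readings of the same text no longer agree.
lossless = str(as_decimal) == text
same_value = Decimal(as_float) == as_decimal
```

Here lossless is True and same_value is False. The string inside (number "...") would behave identically, since the choice is made by the reader and not by the notation.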

I'm not insisting that XML is better than S-Expressions always; I'm saying that XML is better in many cases, many of which occur in the real world a lot.

[It should be noted, that in addition to cons, strings, symbols, and numbers, a lot of the XML-recast-as-EssExpressions above seems to slyly introduce a lot of metacharacters - there are a lot of @'s and :'s and &'s and the like in the above. Which doesn't make EssExpressions good or bad - but it's additional semantics that the parser has to deal with.]

The : was used to represent attributes. To make the XML people feel at home, @ was sometimes used instead. I didn't see any &, or any other special character for that matter. The only semantics added to S-expressions are attributes, and of course XML consumers already have to deal with this, so I'm afraid I pretty much fail to see your point - that is, if you were trying to make one. But I agree; it is noteworthy.

It's easy to speculate on how wonderful S-Exps would be for all the uses of XML in isolation, but by the time you're done mutilating S-Exps to work in all the ways XML does, you'll have XML again... if you're lucky. It's easy to imagine S-Exp-RPC just "working", too, but I submit that if you actually try to implement it in four or five languages while keeping all the advantages claimed above, you're going to discover it is, as usual, not as easy as you're imagining at first. Carrying around S-Exp semantics into tens of languages is not going to be trivial, and while you're free to demand that all other programming communities bend to your idea of semantics, they're equally free to not.

[As for s-expressions and numbers, yes, you could specify that all numeric literals are 32 bit integers or IEEE floating point numbers, but, in fact, it just ain't that way. As in XML, a program that comes across a sexp such as (salary 59233.12) can and will interpret 59233.12 however it wants.]

Not really: It at least has to interpret it as a number.

[So, I argue that there are no built-in semantics for numeric literals. There is, though, a requirement that strings, numbers, and symbols are not the same thing. (salary 59233.12) is not the same thing as (salary "59233.12"), and I think that's probably a nice thing.]
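A toy reader shows where that distinction lives. This is a sketch in Python, not any particular Lisp's reader: the Symbol class, the token pattern and the rule "try int, then float, else symbol" are all assumptions made for the example, and there is no handling of escapes, quoting or unbalanced parentheses.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Symbol:
    name: str

TOKEN = re.compile(
    r'\s*(?:(?P<open>\()|(?P<close>\))|"(?P<string>[^"]*)"|(?P<atom>[^\s()"]+))'
)

def atom(text):
    """Classify a bare token: number if it parses as one, else a symbol."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return Symbol(text)

def parse(src):
    """Parse one S-expression into lists, Symbols, strs, ints and floats."""
    pos = 0
    stack = [[]]
    while True:
        m = TOKEN.match(src, pos)
        if not m:
            break
        pos = m.end()
        if m.group("open"):
            stack.append([])
        elif m.group("close"):
            done = stack.pop()
            stack[-1].append(done)
        elif m.group("string") is not None:
            stack[-1].append(m.group("string"))   # quoted: stays a string
        else:
            stack[-1].append(atom(m.group("atom")))
    return stack[0][0]

bare = parse('(salary 59233.12)')
quoted = parse('(salary "59233.12")')
```

bare holds a float and quoted holds a str, so the two forms are distinguishable at read time. Which float type (or decimal type) the number becomes is still the reader's decision, which is the "no built-in semantics" half of the argument.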

So, it is conceded that SExprs' touted "semantic" advantage doesn't exist, as evidenced by the arguments that SExprs don't have such strict semantics after all.


It's funny to see all this bickering about how XML would handle <number>39285729385723985729385723985748709817360985730.387523637523875</number>, as compared to an S-Expression. In both cases, we're talking about a data representation. It doesn't matter what the standards are for integers, or floats, or whatever, because in the end, the computer language reading in the data--whether it's stored as plain text, or XML, or S-Expressions, or JSON, or what have you--is going to have to decide what to do with all those digits. And if the computer program doesn't handle those digits the way the original data creator intended, we're just going to get GarbageInGarbageOut, independent of the original implementation!

And I would add that this problem will be an issue regardless of language; that is, when sending information from Python to C to Lisp, we're going to run into issues with how numbers are represented in the three languages regardless of whether we format those numbers as XML, or as sexprs, or as plain text, or even as binary...these issues are data-format agnostic. --Alpheus


CategoryXml LispByTheBackDoor


Last edited September 10, 2013