Lisp Vs Xml

For whatever reason, Lispers want to tell the rest of us how crappy XML is and how wonderful Lisp is. Okay, great -- use this space to do so. Please stop putting references to Lisp all over the XML pages, eh?

Claim: XML is a poor copy of EssExpressions. Discussion moved to XmlIsaPoorCopyOfEssExpressions.

Counter-claim: Lispers don't (or shouldn't) actually care.

Evidence: Languages like XML and HTML are in common use by ordinary users as well as programmers of almost all kinds. <> is a parenthetical notation just like (). In fact, <> + marketing = (). Therefore it not only knocks down the excuse that the human brain is somehow incapable of learning parenthetical notation, but it also trains people to accept it more.

Evidence: Lisp notation can co-opt XML notation's advantages. Despite the claim that </section> is easier to understand than ), comments like ;END SECTION or ;</section> may easily be used in lisp for clarity. The terse readability of ()s still remains an advantage when you don't want this markup. But when you do, lisp has enough flexibility to learn from others.

An indentation style from the {}; languages can also be emulated whenever appropriate.

Evidence: Lisp is a natural language for manipulaing XML, unlike others whose users complain of ImpedanceMismatch. This gives lispers an economic advantage.

The problem with the sexp model is: which sexp does this line close:

[Note: Put your cursor on the parens and hit your text editor's parenthesis-matching command.]

Now, which tag does this close?

  </section>

Nope. You can't be sure of what section you are talking about without keeping exactly the same sort of state you have to for sexprs. XML doesn't add any syntactic power, just makes it very verbose. I'm not a big fan of sexprs, but XML doesn't improve on them syntactically, and loses semantically.

that tag closes a specific element, named "section." There may be many such elements in the XML data stream, but the close tag is related to the one that was most recently opened and hasn't yet been closed.

Ok so this is valid then ? <section><heading>data</section></heading> - and this is unparsable after a refactor? (section (heading "data"))

Of course you haven't seen bad html, when you see a tag </td> in a mess of tables, tr's and td's. And you wish you could just press % on vim so you can go to the opening tag. Making something verbose doesn't makes it readable. See CobolLanguage.

[Instead of XML: Just use comments where needed in any file that you are editing that is not XML. Don't be verbose when you don't have to. Example:

  {Section}
Begin 
Long Code;
Code Long;
Code Long unorganized blocks;
Code Long;
End
  {/Section}

Use no comments on short code blurbs that are self explanatory, Example:

Begin 
Short Code;
Code;
End

XML is a standard and that is great but how many standards? (insert quote from Minix inventor here).

So you can demonstrate either how XML syntax is an improvement, or semantically equivalent (or better)?

The XML file is not well-formed unless all tags are present/balanced. [By the way, DTD or schema enforces this; how is it done in Lisp?]

<Uh, DTD/Schema have nothing to do with well-formedness in XML. See below>

That being so, the only gain from reiterating the element name in the closing tag is redundancy - the ability to check that you don't have a syntax error there. That may be useful if you're generating the file by hand (though not nearly as useful as a structure-based or at least structure-aware editor), but given that most of the XML use that people get excited about is for data interchange between computer programs, it seems like a poor engineering decision.

Moreover, you introduce new syntax errors that are not possible with parentheses, like

  <foo>blah<bar>blah</foo>xyzzy</bar>  [ERROR: somebody's not using his DTD or schema]

[The above is not well-formed XML. DTDs and schemas are irrelevant; even a non-validating parser is required to reject the above.--ScottJohnson]

Using symbol matching to denote structure is verbose, requiring the parser to keep information about the symbols just to parse the syntax. Matching parentheses requires only a counter, or push-down stack.

XML has no syntactic advantage to sexprs. XML is in some sense a collection of poor engineering decisions and repetition of old mistakes. It is possible that its political benefits will outweigh its technical failings though, so not all is lost.

[So what you are saying is that it's okay to have a technical failure, if it is politically acceptable. Sorry, but that doesn't get me through the day when I actually have to work with the crap.]

One small advantage is that you can search for "/section" in an editor and find the appropriate end tag location more easily then if all end tags were identical. Hardly earth-shaking I know, but it does make a difference.

I hope this is a joke. You aren't using Notepad, are you? Real editors can move at the end/beginning of a block, highlight block, copy block. With minimum keystrokes, so again XML loses.

-- -- -- --

Why not just add comments in ? for example:

{ //(start: MyWidgetPrices?) <------------the opening comment 
 Wdgt1 = $123;
 Wdgt2 = xyz;
 Wdgt3 = abc;
}//(end: MyWidgetPrices?)<------------the closing comment,

Using your text editor, search for comments. Comments can describe anything, and they aren't part of the code. Comments are something you can strip afterward, whereas XML you can't strip. XML markup is inside the code, and puts load on the processor. Comments are ignored or stripped out.

-- -- -- --

That works great in an active environment, but please show me how that works on a piece of paper on on my Palm. (As long as we are being ridiculous here.)

How about (section "some text" section)? Then you add a hook to the parser to remove the last item.

Remember where XML came from. Text mark-up. Lots of text, and relatively few tags. When </section> comes 100 lines of text after the opening, it makes a lot of sense to repeat the tag name at the close. But e.g., <month>May</month> is just stupid. My opinion: XML is great for text mark-up. It sucks for data." (which is why I would probably have a tag more akin to this <date month="May" year="2002" day="25" />. A main question with XML is the purpose of CDATA, is it really the "value" of the tag? I don't think so. I use it to store extra markup that can be used when transforming, eg. HTML so there is a readable representation, akin to an image alt tag. I tend to use attributes more than most people do, but they'll come around.)

Is there a standard for s-expressions. It would be really nice to have a standard for things like validation, translators, namespaces, defines, comments, includes... Not that I care much if there isn't one. I'm planning on switching my project from xml to sexp anyway. I would be nice to make it as compatible with scheme as possible, though.

You, sir, are in luck. Go see OlegKiselyov's SXML Specification at http://pobox.com/~oleg/ftp/Scheme/xml.html#SXML-spec.

Ron Rivest wrote an Internet-Draft describing s-expressions in 1997, but I don't know how compatible it is with Scheme or Lisp: http://people.csail.mit.edu/rivest/Sexp.txt

Thank you. It seems to me that the Scheme R^5 with only lists, identifiers, strings, and numbers would work for just about anything. It would also be extremely easy to implement a parser in just about any language. Scheme's full sexp are a little little rich, with quoting and pairs. What we want is octet streams and lists.

There is also ANSI Common Lisp, a real standard. Available in HTML form, under the name "Common Lisp HyperSpec". Google for it.

So where would you start using Lisp on the web and how? I know XML is used not on the web only, but if one wanted to, how would they use lisp on the web? Would it be like python where you pretty much have to have access to the server? No standard "python install" like a "php install"... What about a Lisp install, or in other words; what would it take to get lisp working on the web server?

There's a mod_lisp for Apache, but I don't think it's all that common for it to be pre-installed, particularly on inexpensive hosting services. You can use SISC, found at http://sisc.sourceforge.net/ to run Scheme on any server that allows Java servlets. While not quite as common as PHP-enabled servers, Java servlet hosting services are fairly easy to find.

This is an old discussion, but the best environment I have found for using Lisp on the web, at least getting started, is Emacs + Elnode http://elnode.org/. This means you will be writing EmacsLisp instead of CommonLisp, but for web development that's not necessarily a bad thing. mod_lisp is completely different than mod_php; it doesn't embed a lisp interpreter into Apache, it's more like FastCGI that sends the request to an application server. There is no language other than PHP or CGI that won't require greater than ftp access to the server.

Here is a quick example of a dictionary in XML:

 <dictionary>
   <email>electronic mail</email>
   <html>hypertext transport language</html>
   <xml>extensible markup language</xml>
 </dictionary>

A corresponding s-exp could be as follows:

 (dictionary
   (e-mail "electronic mail")
   (html "hypertext transport language")
   (xml "extensible markup language"))

All the repetition in XML just seems to get in the way here. Imagine 10000 definitions.

Unlike XML, the S-expression is more than just printed characters. It has standard semantics.

Where do you suggest I look to find a description of these alleged standard semantics?

The token DICTIONARY in the above represents a symbol object. An object, not a piece of text.

Just as in XML, it represents an element object. An object, not a piece of text.

So don't use XML when you have "short pieces of text"... only use it when you have things like paragraphs, columns of text, boxes of text, and bigger pieces of text?

For small things .. like objects, stick to code?

The whole notion of data vs. programs sucks. XML is based on that notion, sexprs aren't. This is why sexprs are superior. To illustrate this point, here is a macro in Common Lisp:

 (defmacro dictionary (&rest words)
  `(progn
(format t "<dictionary>~%")
(dolist (word ',words)
(format t "  <word>~%")
(format t "<id>~A</id>~%" (first word))
(format t "<def>~S</def>~%" (second word))
(format t "  </word>~%"))
(format t "</dictionary>~%")))

Now, when I enter the following sexpr...

 (dictionary
(xml "data language")
(java "programming language")
(lisp "programming and data language"))

...I get the following result:

 <dictionary>
<word>
<id>xml</id>
<def>"data language"</def>
</word>
<word>
<id>java</id>
<def>"programming language"</def>
</word>
<word>
<id>lisp</id>
<def>"programming and data language"</def>
</word>
 </dictionary>

Conclusion: sexprs include XML as a subset (or any other representation) qua Lisp. (Note that the macro is also written as an sexpr, and is in fact also just a data structure that is compiled/interpreted by the Lisp runtime. See MetaCircularInterpreter for more insights into the resolution of the prorgram/data dichotomy.)

The whole notion of data vs. programs sucks. XML is based on that notion, sexprs aren't. This is why sexprs are superior.

Interestingly, the exact opposite -- i.e., that s-expressions are based on that notion and XML isn't, and hence that XML is superior to s-expressions -- is equally true (i.e., completely untrue). To illustrate this point, here is a pair of XSLT templates (arguably even more straightforward than the CL macro):

 <xsl:template match="dictionary">
<xsl:text>(dictionary</xsl:text>
<xsl:apply-templates />
<xsl:text>)&#10;</xsl:text>
 </xsl:template>

 <xsl:template match="word">
<xsl:text>&#10; (</xsl:text>
<xsl:value-of select="id" />
<xsl:text> &quot;</xsl:text>
<xsl:value-of select="def" />
<xsl:text>&quot;)</xsl:text>
 </xsl:template>

Now, when I enter the following XML,

 <dictionary>
<word>
<id>xml</id>
<def>data language</def>
</word>
<word>
<id>java</id>
<def>programming language</def>
</word>
<word>
<id>lisp</id>
<def>programming and data language</def>
</word>
 </dictionary>

I get the following result:

 (dictionary
(xml "data language")
(java "programming language")
(lisp "programming and data language"))

Conclusion: XML includes s-expressions (or any other representation) as a subset qua XSLT. (Note that the templates are also written as XML, and are in fact also just data structures that are compiled/interpreted by the XML processor.) [Note in case this gets taken out of context: This was meant as a parody of a corresponding claim for s-expressions and against XML.] -- DanielBrockman

An advantage of this approach is that by stating "xsl" 17 times in 12 lines, the more forgetful readers can remember that they are looking at XSL.

So what's stopping us from going:

 {dictionary}
{word}
 xml:=id;
 data language:=def;
{/word}
{word}
 java:=id;
 programming language:=def;
{/word}
{word}
 lisp:=id;
 programming and data language:=def;
{/word}
 {/dictionary}

...and making some easier to read standard - based on possibly human eye and readability studies versus <HTML> want-to-be syntax? Then, offer a separate trimmed down version later, because they are just optional ways but recommended standards based on hopefully eye studies(whitespace can be used to organize too, and I see that XML lacks it, you know?)

 {dictionary}

{word}
xml:=id;
data language:=def;

{word}
java:=id;
programming language:=def;

{word}
lisp:=id;
programming and data language:=def;

PossibleXmlReplacement ?

HTML and XML are both subsets of SGML and XHTML is a subset of XML. So HTML is an XML want-to-be text, not vice-versa.

More specifically, HTML and XML are applications (or instances) of SGML; XHTML is an application (or instance) of XML. XHTML is a replacement for HTML which is XML-compliant; HTML allows may SGML-ism's like overlapping tags (i.e. <B>Bold text<I>Bold italic text</B>Italic text</I>) which XML (and thus XHTML) outlaws. Taunts like "wannna-be" seem out of place here; the lineage of all of the above protocols is well-known. -- ScottJohnson
- SGML does not allow overlapping elements, and therefore neither does HTML. (What popular web browsers do with mangled input like the above is not a part of HTML as such.) A better example would be the use of SGML's optional tags in HTML.
They all look similar on the surface and have some general consistency by using <dumbtext> and </dumbtext>. But that's a longer sentence and harder to quickly read on a wiki. I think the stem of the problem was using < > and having a rule that every part of the language consists of <something></something>. Maybe we should do eye/readability/maintainability studies before implementing a technology that millions of people are going to be reading and managing. Have studies carried out before many computer formats were implemented? Do people just think up formats and implement them based on gut instincts? Even if they are half-ok markups that sort of do the job (GoodEnough), and even if they still succeed partly, could work better? Redundant <anything></anything> syntax is hard to read for humans and is slower for computers (WorstOfBothWorlds??). Forcing continual use of <something></something> is simply not required. Even the choice between <BR /> or <BR> is annoying itself in HTML, due to the nature of the language. Why not use a combination and pull in more := and ; and | and END and ` and ~ and => into the language?

I disagree (a bit). HTML is an application of SGML, that is true. But XML is a subset, not an application, of SGML. And XHTML is an application of XML. --GodsBoss?

So what you're saying is that XML has reinvented the wheel, yet again? Is XML turing-complete like LISP, or does it just happen to work for that trivial case?

It's more like an unrelated species that has evolved to fit the same niche than a reinvention. XML isn't turing complete because it doesn't do anything on it's own, but XSLT is.

"The whole notion of data vs. programs sucks."

Support that argument, please. Are you in favor of word macro/email viruses (Word/VB script)? Separation of data vs program is very important.

The whole section above is in support of that argument. The (directory ...) sexpr is both a data fragment and a code fragment. So is the (defmacro dictionary ...) sexpr. There is no distinction between data and programs with sexprs in Lisp. Source code is usually seen as a representation of a program, but the compiler for a language needs to treat it as data and transform it to another representation (object files / class files / whatever). In Lisp, you don't need to change the level of abstraction in such drastic ways in order to treat programs as data or vice versa. You can just do it within the same framework. (See DataAndCodeAreTheSameThing)

See OoAndXml

That argument (macro viruses) crops up often, and it seems that a more important misunderstanding is hidden behind it. It is also sometimes brought up by newbies on c.l.l. It might be important to note that while data and code have the same representation, code must still be explicitly (when *real-eval* is nil) EVAL'ed. If one were to create a language that used XML as both its internal and external representation of code, it wouldn't magically be cursed with the uncontrollable execution of user input. If one looks at SQL injection exploits, it becomes clear that the problem of not EVAL'ing unsafe data is a programmer problem, not a language one.