Wiki Interchange Format

Tools to handle this are beginning to emerge.

should provide an open and standard data format for sharing wiki pages between different sub-species of wiki
should be used to serve wiki page data to any inquiring client programs.

A standard format could facilitate

Its function as as a means of conversion to alternate document formats (e.g. Word Processors, HTML).

The WikiInterchangeFormat would exist behind-the-scenes. WikiClones using the format should retain their unique characteristics (including the editing markup syntax). Only those wishing to use the format directly should need to be aware of it.

Q: Would such a format be desirable? Feasible? A: Yes to both questions.

Q: What is the simplest scheme for implementing such a format? A: See the discussion continued below.

Scheme employing XML

An envisioned implementation method for existing WikiClones would be to simply provide an alternate CGI access script (a WikiInterchangeFormat server script) which would convert a requested document from the clone's native format to WikiInterchangeFormat and deliver the document to the client. The client wiki or application could then use the structured WikiInterchangeFormat data directly, or convert to another desired format. An appropriate Xml WikiBrowser might use the format directly (such a built-for-purpose browser could also provide desirable features such as wysiwyg editing). The WikiClone's traditional HTML delivery script would remain unaltered.

An edit script might also be provided for accepting edited content in WikiInterchangeFormat and converting to the clone's native format.

Issues:

A standard XML format (DTD?) would have to be agreed upon. An implementer might choose to add non-standard tags to his pages. He would do this with the understanding that the non-standard tags might be ignored or mis-interpreted by clients.
It would be left to the client to determine how to handle non-standard tags. A good strategy is probably to render unrecognized tags as verbatim text.
I think the markup for internal wiki links should include both the original text of the link (remember, the text may not fit the client's local link pattern), and an absolute URL to the actual wiki page referred to. It would be left to the client to determine what to do with this information (should the link be made local? Should the link text be mangled to local format?).
As implied above, some imported links may need to be mangled to work with the local wiki. This is especially an issue if an edit script is provided.
Many others...

WikiInterchangeFormatExamples

-- TimVoght

What happened to this scheme?

See http://www.usemod.com/cgi-bin/mb.pl?WikiXmlDtd for a related discussion. I will create an XML DTD, I guess within the next months (i.e. Sep'00). But I guess I'll call it WimlWiml?, not plain boring "WIF". :) -- JuergenHermann

I was going to place this on WikiBase, but recent traffic here on the subjects of WikiPortal(s) and a WikiConsortium led me to place it here. -- TimVoght

Seriously Complex

This proposal and the examples seem seriously complex to me. Isn't there something profoundly simple, like wiki itself? Take a look at the problem you are trying to solve: You don't like the syntax of A HREF and you want to replace it with something else. Take a look at the proposed solution: You're trying to define an interchange format and a wiki name server. Perhaps it would be enough to use A HREF links to one central host with a page per wiki with the appropriate redirection (like home.pages.de or the like). The goal is to leverage URL namespace instead of devising YetAnotherNamespace?.

Contributors: RonJeffries, Alistair, AlexSchroeder.

Forward and Reverse Compatibility

Don't slap me for saying this, but XML - or anything like it - allows applications to be forwards, backwards and laterally compatible.

What I mean by that is that if the WikiInterchangeFormat (WIF) changes, or that other Wiki implementors decide to extend the basic format with their own custom tags (like text colour or copyright notations), they can all still attempt to communicate.

Also, if WIF 2.0 is defined in the future, older Wikis won't be cut out of the picture.

If you're going to go and redesign the web (which is what you're doing), might as well do it so we don't have to do it again.

Example of a strategy

An example of where this strategy works really well is in our applications where I work. The object model I defined was designed from the get go to support XML and it turned out to be so wonderfully good that it allowed us to move the same save files back and forth between sibling applications. This is before I was introduced to YouArentGonnaNeedIt.

[We don't actually use XML, but something like it.]

It's may not even be necessary to define a DTD because you don't have to strictly parse the XML. Just anything where the set of attributes is flexible and you have a concept of parent-child relationships is sufficient.

Something like

 <PARENT>
<PARENT-ATTRIBUTE-1>value</PARENT-ATTRIBUTE-1>
<PARENT-ATTRIBUTE-2>value</PARENT-ATTRIBUTE-2>
...
<PARENT-ATTRIBUTE-n>value</PARENT-ATTRIBUTE-n>

<CHILD>
<CHILD-ATTRIBUTE-1>value</CHILD-ATTRIBUTE-1>
<CHILD-ATTRIBUTE-2>value</CHILD-ATTRIBUTE-2>
...
<CHILD-ATTRIBUTE-n>value</CHILD-ATTRIBUTE-n>

<GRANDCHILD>
...
</GRANDCHILD>
</CHILD>
<CHILD>
...
</CHILD>
...
 </PARENT>

can be easily dynamically interpreted from context. A DTD is necessary only if you want to impose more complex forms like value constraints.

-- SunirShah

Problems of Translation of more complex methodologies to simple generic methods

The point here is that many people author their own Wikis because they want to innovate (and usually in the direction of greater complexity). I'm not sure that all the possible innovations can be fitted into the plain Wiki syntax without getting ambiguous parsing. -- DaveHarris

It is not necessary to restrict ourselves to a single format. Plain HTML (ie what the human reads), and plain Wiki (ie what the human edits) could be made available alongside XML. (Whether this is a step towards simplicity or away from it is another matter. I suspect it is the SmallestShield.) -- DaveHarris

Forward Tolerant Standard

One of the tricks of designing a forwards-tolerant standard is to define it so that any version can at least read and parse the content (simple SGML or XML will provide this), and find the boundaries of things. Then when a version finds a 'thing' it doesn't understand, it becomes a local matter how that version handles this 'thing', that it found. In a wiki case, it might just put it up as plain text.

In the case of wiki, the important item to be parsable is the link. Wikis have different rules for allowed characters in the link name, so the mechanism must allow some of that (e.g. this wiki might convert spaces to underlines, or some other character for the hyperlink to be rendered).

Handling Unknowns

A better way to handle unknown XML tags would be to embed them in such a way that if the page is rerendered as XML the unknown data is not lost. This could be a special embedding syntax or something structured on top of a generic comment syntax (like the way Java Script is embedded).

My vote would be wikis in the original style to agree on the form

 [WikiWord: ... ]

as a generic extension syntax and for

 [WikiInterchangeFormat:<SOMETAG>]

to be the embedded tag. By default this tag would not be shown but would be present for editing. -- EricBennett

Using Regular Expressions

Why don't you just embed the LinkPattern RegularExpression in the XML? That way other non-similar Wikis can use that to figure out where to put the links without breaking anything. -- SunirShah

I'd rather not force all Wiki's to be written in Perl, or to have to support Perl's regular expression syntax. -- DaveHarris

It should be noted that Regular Expression syntax is not exclusive to Perl and that it's use is fairly standardized in various Language implementations.

How about because some very "wiki-like" things may have a separate table of link-keywords and translate all occurrences of these words to internal or external links as they are rendered? So there's no LinkPattern to include.

Is WikiInterchangeFormat supposed to be an editable format? Might it be simpler if it were merely displayable, and editing delegated back to the server Wiki? For example, links could be rendered in an HTML syntax and the client Wiki would often not need to parse them at all. -- DaveHarris

WikiInterchangeFormat is impractical. It would require standardizing the markup and LinkPatterns between sites, which is precisely the wrong thing to do. However, on a WikiFarm... -- ss

Each site could maintain its own format. An interchange format would attempt to include all/most features (and deal elegantly with those that did not).

If somebody doesn't DoTheRightThing we could end up with a BetamaxSituation.

Note that there are a lot of tools that process XML now. You can even display it on IE with CSS. Also, you can transform it into other XML formats (e.g. XHTML, WML) via. XSLT: http://www.w3.org/TR/xslt. If you're going to use XML, I would like to suggest XML Schemas (http://www.w3.org/XML/Schema). Personally I prefer DTDs, but people keep telling me that XML Schemas are the way for the future :-)

This looks like a decent and valuable project, and I think it would be a good idea to keep the technologies grounded in current Web architecture. XML serves this purpose very well. -- SeanPalmer

Wiki Interchange Format similarities To Notation3

Actually, something else I just thought of; this WIF thing seems like it could have some similarities to Notation3 (http://www.w3.org/DesignIssues/Notation3). In fact, the inventor on N3 (TimBL, who also invented HTML and the WWW) mentioned a possible similarity between N3 and Wiki Wiki at http://www.w3.org/2000/10/swap/ That's how I found out about Wiki Wiki in the first place... -- SeanPalmer

Publish an Interface?

I like the idea that Wikis publish an interface where you can always get the following pieces of information:

raw format for the page (i.e. what user would see in editor)
plain text list of linkable words for a page

Any Wiki engine (or any piece of software, for that matter) that wants to host a page from an "open" Wiki would render the pages with almost no formatting except for links. Of course, if you are writing a Wiki engine, and you notice that your users tend to view a lot of pages from certain other Wikis, you can add formatting libraries to your engine that work with the format from the other Wikis. You might even only support a subset of features, such as bold and italics, and leave the rest of the page unrendered. See StorageLinkingPresentation too. -- SteveHowell

+1 for that. Getting access to the raw pages would make writing WikiMode much easier (for emacs/w3) as I wouldn't have to strip out the html around the textarea, or unescape chars encoded in the html. -- BrianEwins

Wiki API?

Not sure how many of you are aware of it or still interested in it, but an XmlRpc Wiki API is shaping up over at http://www.jspwiki.org/Wiki.jsp?page=WikiRPCInterface. So far, there are budding implementations for JspWiki, TwikiClone, UseModWiki, and MoinMoin. One of the obvious pairs of methods to have are getPageWIF() and setPageWIF(), WIF being WikiInterchangeFormat. If we actually had a WIF to use, it would be simple to make a WikiConverter? or more direct InterWiki.

-- LesOrchard

Version 1 of the WikiXmlRpcInterface? is pretty much set to stone, but we're thinking about version 2 now. If you're interested, drop by at the above address :-).

-- JanneJalkanen

Why Not just use HTML or a subset My XML knowledge is not that deep - but could we not just take same form of HTML (XHTML, whatever) as a WikiInterchangeFormat? I mean basically we just have br, h1 ... h6, lists, bold and italics, maybe tables and links. I think if a server put out a page just using this subset of xhtml, it should be possible to build the WikiMarkUp?, once the page is retrieved. And if the html were nicely made, even the xml people should be smiling. Maybe one has to place some extra info in the anchors, like page="WikiPage" or page="external".

-- JoergBaach

* I think this is an idea worth discussing.

HTML (or XHTML) has a lot of support. There are good engines and good editors that support HTML. I propose that WikiEngines might accept HTML, but convert the HTML to their own native format. This means that WikiEngines may provide all their extra value (such as reverse links) without surrendering to the whole HTML mess. -- JoaquimBaptista

Yes, I'm tending to agree. We should use a small subset of XHTML Basic (i.e. no nested tables :) ), and then maybe add a few extra elements and attributes (such as for InterWiki links). -- BayleShanks

Low Tech Solutions or Necessary Complexity

Running the fledgling Perl Patterns Repository at http://wiki.slowass.net, it is very frequently in my interest to link here. I've played with three low tech solutions:

URLs that get expanded in-line. Abuse issues but cool effect.
Links of the form: C2 WikiWikiWeb get turned into links to that word here.
Recognizing words defined here but not there from the .cgi here that dumps the catalog index.

Planning for the future can take the form of saving pennies, or futile fretting over an upcoming interview. Categorization is left to the reader. However, jumping wholesale on one solution without playing with it first is avoiding experimentation and therefore avoiding feedback from the system: http://wiki.slowass.net/?PlanningIsNpComplete - a complex solution *might* be required, but *only* if it is NecessaryComplexity?. Excessive complexity will do as much damage as over simplicity of the protocol. Having a vested interest in this, I argue for a simple but rich protocol: providing a standard suite of .cgi programs takes advantage of the HTTP API and uses the rich namespace conventions of the URL scheme. New .cgi programs can always be added that offer more power at the cost of more complexity, and people can support the HTTP based API piecemeal. A simple, initial API could take the form of merely listing, one per line, as HTML links, each word defined on the site. Programatic fetching for pages and doing more complex processing would always be an option for doing more intensive investigation of another Wiki in addition to other machine-specific interfaces. I hope people working on this problem can benefit from the work I've done on the subject. Feel free to contact me. -- ScottWalters

It seems to me that the WikiCreole effort will accomplish many of the goals stated above, but without the XML baggage, and, hopefully, with more attention to preserving WikiNature. -- RaphLevien

CategoryWiki