Writing My Own Wiki

I started with some libraries I'd already written and kept updated over the last 7 or 8 years. The oldest one was a CGI library that I first wrote in 95 or 96. I've spent some time updating it to handle a number of useful CGI features like GET, PUT, POST, File Upload, Virtual Paths etc. There are also some additional libraries that handle Virtual Path based page handler dispatch, cookie management, and other related tasks, but those I have not put into the wiki code yet.

The second of my libraries is an SQL interface with multiple backends (well, two anyway). I'd found an SQL engine called SQLite a year or two back, and had wanted to use it in a CGI app - the engine compiles into your application, so there is no separate SQL server. SQLite handles multiple processes accessing the same database, transactions, most of SQL92 etc. I suspect people don't hear more about it because it's billed as an "embedded" SQL engine, and so not taken quite as seriously as the big boys ("they're hard to set up - they must be better"). Having now done some simple work with SQLite I can say that it is a great tool.

Then it was down to design.

My intention was to eventually set up this engine on its own Web site, and configure Apache to run the entire Web site out of the single CGI executable.

So I created a table called "wiki" with a name, an access count, and a data column for the page data. Viewing a page that does exist in the database gives a wiki page much like the one you are reading now. Trying to view a page that is not in the database gives a wiki page saying "please fill me in". Both have edit links.
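As a rough sketch of that table and the view logic, here it is in Python using its built-in sqlite3 module (not necessarily the language the engine itself is written in). The column names `hits` and `data`, the helper names `open_wiki` and `view_page`, and the placeholder text are assumptions made for illustration.

```python
import sqlite3

# Assumed wording for the "please fill me in" page.
PLACEHOLDER = "Please fill me in."


def open_wiki(path=":memory:"):
    """Open the database and create the wiki table if it is missing."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS wiki ("
        " name TEXT PRIMARY KEY,"            # page name, e.g. AboutThisWiki
        " hits INTEGER NOT NULL DEFAULT 0,"  # access count (assumed column name)
        " data TEXT NOT NULL)"               # page text
    )
    return db


def view_page(db, name):
    """Return (exists, text); bump the access count for existing pages."""
    row = db.execute(
        "SELECT data FROM wiki WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        # Unknown page: show the placeholder, still with an edit link.
        return False, PLACEHOLDER
    db.execute("UPDATE wiki SET hits = hits + 1 WHERE name = ?", (name,))
    db.commit()
    return True, row[0]
```

Both branches return text to render, so the caller can add the edit link in one place whether or not the page exists yet.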

If the page is "http://yoshi/AboutThisWiki", the edit link is "http://yoshi/AboutThisWiki?edit".

The edit code puts up a TEXTAREA box with the page contents in it, ready for editing. The FORM ACTION differs slightly from the edit link: it is "http://yoshi/AboutThisWiki", submitted as a POST. When you hit the save button, the data is posted to the ACTION URL, and I can tell this is not a page view request because the method is POST. I simply save the data to the database and issue a "Location:" header pointing to "http://yoshi/AboutThisWiki", so the saved page is shown straight away. I skip any intermediate review step since I don't have a spell checker, and, to be honest, I've also been a little annoyed at the extra click needed to review my changes.
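The view, edit and save flow described above can be sketched roughly as follows, in Python. The function name `handle`, the form field name `data`, and the dict standing in for the database are all assumptions. The sketch also assumes the POST body has already been decoded into the page text, and it uses a 302 status for the redirect.

```python
import html


def handle(method, path, query, body, pages):
    """Return (status, headers, body) for one request.

    `pages` is a plain dict standing in for the database (an assumption).
    """
    name = path.lstrip("/")
    if method == "POST":
        # A POST is never a page view: save, then redirect to the page.
        pages[name] = body
        return "302 Found", {"Location": path}, ""

    text = pages.get(name, "")
    if query == "edit":
        # Edit view: the form posts back to the page URL without "?edit".
        form = (
            f'<form method="POST" action="{html.escape(path, quote=True)}">'
            f'<textarea name="data">{html.escape(text)}</textarea>'
            '<input type="submit" value="Save"></form>'
        )
        return "200 OK", {"Content-Type": "text/html"}, form

    # Plain view: show the page, or the placeholder if it is empty.
    shown = html.escape(text or "please fill me in")
    return "200 OK", {"Content-Type": "text/html"}, shown
```

Because the save ends in a redirect, reloading the resulting page repeats a GET and not the POST.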

Now this is where I declare the desire to do something a little NonWiki. Since this is in part for me to set up for my own personal pages, I wanted the ability to edit the actual HTML too. I can wiki up my Web pages very quickly, but for some (for example my front page - http://yoshi/) I would want to add images, tables etc.

So I added another handler for each page - if the method is a PUT I store it in the database directly, and I added a new field to the table called "type", which is either "wiki" or "html". "wiki" if the page was saved via a POST, and "html" if it was saved via a PUT.
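A minimal sketch of that rule in Python with sqlite3, where the request method alone decides the value of the "type" field. The helper names `open_db` and `save_page`, and the column names other than `type`, are assumptions.

```python
import sqlite3

# POST comes from the wiki edit form, PUT from the HTML editor's publish.
TYPE_FOR_METHOD = {"POST": "wiki", "PUT": "html"}


def open_db():
    """Create an in-memory database with the extended wiki table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE wiki ("
        " name TEXT PRIMARY KEY,"
        " hits INTEGER NOT NULL DEFAULT 0,"
        " type TEXT NOT NULL,"   # 'wiki' or 'html'
        " data TEXT NOT NULL)"
    )
    return db


def save_page(db, method, name, data):
    """Store the page and record its type based on the HTTP method."""
    page_type = TYPE_FOR_METHOD[method]  # KeyError for any other method
    cur = db.execute(
        "UPDATE wiki SET type = ?, data = ? WHERE name = ?",
        (page_type, data, name),
    )
    if cur.rowcount == 0:
        # No existing row was updated, so this is a new page.
        db.execute(
            "INSERT INTO wiki (name, type, data) VALUES (?, ?, ?)",
            (name, page_type, data),
        )
    db.commit()
    return page_type
```

Updating first and inserting only when nothing matched keeps the access count intact when an existing page is saved again.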

I fired up Mozilla (any Netscape browser since, uhh, is it 3 or 4?, should do) and chose "File->Edit Page". Up popped the HTML editor (which I find perfectly OK in Mozilla these days), and I added some text and hit "File->Publish". Worked first time. I reloaded the page in the browser and the changes had taken effect. I added some coloured text, fine, some tables, fine, some Unicode, fine. Good so far.

It all fell down when I added an image, since that is binary data, and I've been treating everything as text up till this point.

Next on the agenda is changing the table to store a content-type column in place of the simple type column I used above (so I can correctly capture "text/html", "image/jpeg", etc, as well as my own "text/x-wiki" for plain wiki edited pages). I'll detect which content types can be binary, and store them hex-encoded, so I'll add a storage-type column to record that.
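One way this plan might look, sketched in Python. The rule "anything outside text/* is binary" and the storage-type values "hex" and "plain" are assumptions made for illustration, as are the function names. The sketch also assumes that text content arrives as UTF-8.

```python
def is_binary(content_type):
    """Treat everything outside text/* as binary (an assumed rule)."""
    main_type = content_type.split(";")[0].strip().lower()
    return not main_type.startswith("text/")


def encode_for_storage(content_type, payload):
    """Return (storage_type, text) suitable for a text column."""
    if is_binary(content_type):
        # Binary data is stored as a string of hex digits.
        return "hex", payload.hex()
    return "plain", payload.decode("utf-8")


def decode_from_storage(storage_type, stored):
    """Reverse encode_for_storage and return the original bytes."""
    if storage_type == "hex":
        return bytes.fromhex(stored)
    return stored.encode("utf-8")
```

Hex encoding doubles the stored size of every binary page. That is the price of keeping the data column as plain text.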

More as it comes to hand.

(Later)

I now handle binary data coming in from the HTML editor - very messy code at present, so I may have to spend some time refactoring back into code I can stand to look at. Not planning ahead to handle binary data from the start was a mistake.

Currently, a Web page with 10 images will upload all 10 images every time I choose "File->Publish", and I'll have to look into whether I should be supplying more headers (like an ETag, or perhaps just Last-Modified) to prevent this.
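For reference, here is a rough Python sketch of what supplying an ETag could look like on the serving side. It only shows the standard conditional GET exchange. Whether the editor's publish step would take any notice of these headers is exactly the open question, and nothing here settles it. The function names and the choice of a SHA-1 content hash as the tag are assumptions, and the sketch handles only a single tag in If-None-Match.

```python
import hashlib


def make_etag(payload):
    """Derive a quoted ETag from the content (assumed scheme: SHA-1 hex)."""
    return '"' + hashlib.sha1(payload).hexdigest() + '"'


def respond(payload, if_none_match=None):
    """Return (status, headers, body), honouring a single If-None-Match tag."""
    etag = make_etag(payload)
    if if_none_match is not None and if_none_match.strip() == etag:
        # The client already holds this exact content: send no body.
        return "304 Not Modified", {"ETag": etag}, b""
    return "200 OK", {"ETag": etag}, payload
```

A tag derived from the content changes only when the stored bytes change, so no separate modification timestamp has to be kept for it.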

-- Edouard


(last edited February 25, 2006)