Summary of WikiUseful? TextFormattingRegularExpressions:
Quoted emphasis: ('')(.*?)\1 -> <em>$2</em> Quoted strong: (''')(.*?)\1 -> <strong>$2</strong>
Preformatted code (space at beginning of line): ^ +(.*)\n -> <pre>$1</pre> with a subsequent match and replace of </pre><pre> ->
Ordered list (tab-numeral-space at beginning of line): ^\t[0-9]\s*(.*)(?:\n?|$) -> <li>$1</li> with a subsequent match and replace of (?<!</li>)((?:<li>.*</li>)+)(?!<li>) -> <ol>$1</ol> Unordered list (tab-asterisk-space at beginning of line with consideration for previous unordered list parsing): ^\t\*\s*(.*)(?:\n?|$) -> <li>$1</li> with a subsequent match and replace of (?<!</li>|<ol>)((?:<li>.*</li>)+)(?!<li>|</ol>) -> <ul>$1</ul>
standard protocols, not image: (http|news|ftp|mailto)\:(\/\/)?((?:\S(?!\.gif|\.jpg|\.jpeg))+)\s --> <a href="$1:$2$3">$3</a> standard protocols, inline image: (?<!href=")(http|news|ftp|mailto)\:(\/\/)?((?:\S(?!\.gif|\.jpg|\.jpeg|\.png))+\S)(\.gif|\.jpg|\.jpeg|\.png) --> <img src="$1:$2$3$4" border="0">
The following paragraph was moved from TextFormattingRegularExpressions.
This page could have been really, genuinely useful to my endeavors the last two days and I was very happy when I found it. Unfortunately I must say I find many of the expressions here dodgy at best, and just wrong at worst. Could we turn this into a resource for the (many) ways that WikiAuthors? have implemented this? -- SvenNeumann
Let's use this page to discuss the RegularExpressions and use the other page as a document. Great idea! Maybe I'm just a bit slow, but some of these RegExes? give me a splittin' headache :)
The emphasis RegularExpression /'{3}(.*?)'{3}/ looks like it would match situations where the apostrophes contain nothing. Wouldn't /'{3}(.*)'{3}/ be simpler? And horizontal line RegularExpression should only match exactly 4 dashes not 4 or more. -- AdewaleOshineye
Also in HtagWiki I use the backreferencing to make it easy to match several alternate formatting rules like:
(''|//)(.*?)\1for example to allow the slash notation too in one simple expression. Also, I believe matching blank apostrophe parts is accuarate too, as long as the rules are ordered with the expression gobbling the most quotes first. -- sn
Let's try this one:
^ +(.*)\n','<pre>$1</pre>to mark all monospaced lines and
</pre><pre>for deletion after to merge the lines into a single block. is that accurate?
What is an efficient quick and dirty way for the spell checker? I tried a RegularExpression but that took over 14seconds for a small page and a 270k dictionary. A simple search for each word takes only 0.4seconds. Of course indexing would speed this up. But perhaps I'm approaching this from the wrong end too. How is simple spell checking usually implemented? -- sn By the users... it's a wiki, after all.