If you are interested in who is writing and WikiGnomeing the most, you can process all the ChangesInWeek pages as follows:
 for i in `wget -q -O - "http://www.c2.com/cgi/wiki?search=ChangesInWeek"`; do
   sleep 1
   wget -q -O - "http://c2.com/cgi/wiki?$i"
 done |
 awk '/\. \. \./{ split($0, a, ". . . . . . ."); print a[2]; }' |
 sort | uniq -c | sort -nr
This will show the most active WikiZen on top (but may take some time). -- .gz
Please don't do it. I think this doesn't work as stated, and it hammers Ward's server. If you must do it, please don't do it like that.
It fetches just 53 pages, which is not that many (though I agree that they are not the smallest ones). What would you propose? Should I insert a sleep between the fetches? -- .gz
Certainly it needs a sleep between the fetches. You also need to process the first fetch to turn it into a list of pages. As stated, the first fetch returns HTML with links, not page names. It also doesn't take into account the "only the last editor is mentioned" problem. I use the RSS feed for this.
It was a quick hack and worked (it really did). Processing the first page isn't necessary, as the non-links just generate 404s and are filtered out later. The "last editor mentioned" problem isn't really a problem, because it averages out (in a way). What's much worse is that IPs and user names are not matched. It's enough to get an impression. -- .gz
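For anyone who still wants to try this, here is a rough sketch of the gentler variant suggested above: turn the search result into a clean list of page names first, then fetch each page with a pause in between. It assumes the search result links pages as href="wiki?PageName" and that the page names of interest start with ChangesInWeek; both are guesses about the markup, not something confirmed here. The function names extract_pages and fetch_pages are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumption: result links look like href="wiki?ChangesInWeekSomething".

# Read search-result HTML on stdin, print one page name per line, without duplicates.
extract_pages() {
    grep -o 'href="wiki?ChangesInWeek[A-Za-z0-9]*"' |
    sed 's/^href="wiki?//; s/"$//' |
    sort -u
}

# Read page names on stdin and fetch each page, waiting one second before every request.
fetch_pages() {
    while read -r page; do
        sleep 1
        wget -q -O - "http://c2.com/cgi/wiki?$page"
    done
}
```

It could then be wired together as: wget -q -O - "http://www.c2.com/cgi/wiki?search=ChangesInWeek" | extract_pages | fetch_pages, with the awk and sort steps from the original one-liner appended. This only avoids the stray 404 requests and spaces out the fetches; it does nothing about the "last editor" or IP versus user name issues.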