Ancient History -- This bug plagued wiki for a year or so. None of the suggestions below worked. Nor did a half dozen others I tried. The problem went away when I upgraded to perl5 and tied in a different implementation of dbm. -- WardCunningham
One of my long standing WikiWikiBugs really has me stumped. I use Perl's associative array iteration construct,
while (($key, $value) = each %db) {...}to search pages of this database. I report the number of pages visited which is normally in the hundreds. I say normally because sometimes it stops early. This messes up my search command and could be catastrophic in my compress-database script. (Don't worry, I check.)
Does anyone know what is going on? My only other hint is that the problem corrects itself with the addition of another page. -- WardCunningham
Do you lock the database? I don't think UnixDbm? or the PerlDbmInterface? does this automatically. Thus, you may be seeing problems only when the iterator runs simultaneously with some other access. -- MarkEichin
Yes, now I do. When I first saw this problem I interpreted it as a WikiWikiDatabaseSerializationProblem, but now I think otherwise. -- WardCunningham
I suspect that this is related to a problem I've seen with just exactly this type of iteration. Try changing the loop to be a
foreach my $key (keys %hash) { my $value = $hash{$_}; # do the thing with $key and $value ... }When I changed it to this, an unpredictable and undependably irreproducible error that seemed to be related to memory management quit happening, and we were able to handle a lot more entries in the hash without running out of memory.
--- JoeMcMahon
I agree with Joe; use keys rather than each.
According to PERLFUNC(1)'s definition of each:
Calling keys grabs all the keys at once, and stores them in an array; you then iterate over the array. (Storing the keys for a few thousand page names should be okay.) --PaulChisholm
Using keys instead of each is an essential first step. The next step requires locking.
The generally accepted way of locking (via flock) a DBM file is to use a separate lock file. The prior accepted scheme -- extracting a file handle from the DBM and flock'ing the file handle -- has been shown to be unsafe.