Comment 13 for bug 12446

David Relson (relson-osagesoftware) wrote: Re: Bug#293207: bogofilter: last two versions caused db errors

On Wed, 02 Feb 2005 10:40:39 -0600
Karl Schmidt wrote:

> Matthias Andree wrote:

...[snip]...

> Installing bogofilter on a Debian testing box gives us:
>
> ii bogofilter 0.93.5-1 a fast Bayesian spam filter
>
> $ bogofilter -V
> bogofilter version 0.93.5
> Database: Sleepycat Software: Berkeley DB 4.3.27: (December 22, 2004)
>
> I delete all the files in the db directory and run the following script
> (as I've had to rebuild a few times now<g>):
>
> #!/bin/bash
> bogofilter -M -s -d /etc/bogofilter -I /home/karl/mail/zs-archived-spam2004
> bogofilter -M -s -d /etc/bogofilter -I /home/karl/mail/zs-archived-spam2003
> bogofilter -M -s -d /etc/bogofilter -I /home/karl/mail/s-archived-spam
> bogofilter -M -n -d /etc/bogofilter -I /home/karl/mail/z-archived2004
> bogofilter -M -n -d /etc/bogofilter -I /home/karl/mail/archived
> bogofilter -M -n -d /etc/bogofilter -I /home/karl/mail/list-servers/EXIM
> bogofilter -M -n -d /etc/bogofilter -I
> chown Debian-exim.Debian-exim /etc/bogofilter/*

Your script looks reasonable. I, too, make a practice of preserving
sequences of useful commands in scripts.

As a tip, using "-B" instead of "-I" lets you list multiple message
sources (mailboxes, maildirs, etc.) in one command, e.g.:

bogofilter -M -s -d /etc/bogofilter -B /home/karl/mail/*s-archived-spam*
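The same idea can be applied to the whole rebuild script. What follows
is only a minimal sketch: the register function and the BOGOFILTER and
DBDIR variables are names made up for illustration, and the flags are
the ones already used in the commands above.

```shell
#!/bin/bash
# Sketch: one bogofilter run per message class, using -B.
# BOGOFILTER and DBDIR are illustrative shell variables, not settings
# that bogofilter itself reads.
BOGOFILTER="${BOGOFILTER:-bogofilter}"
DBDIR="${DBDIR:-/etc/bogofilter}"

# usage: register -s|-n SOURCE...
# Registers every listed source as spam (-s) or non-spam (-n).
# -B goes last because the remaining arguments are the message sources.
register() {
    class_flag=$1
    shift
    "$BOGOFILTER" -M "$class_flag" -d "$DBDIR" -B "$@"
}
```

With that in place, the spam and non-spam mailboxes each need one call:

    register -s /home/karl/mail/*s-archived-spam*
    register -n /home/karl/mail/z-archived2004 /home/karl/mail/archived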

> Everything works (not sure if it is tagging quite as much spam) then it
> ends up stopping after about 48 hours.
>
> This is on a Tyan MB with ECC memory, Antec power supply - I think a
> quite stable system running bind, dhcp, hylafax, samba, nfs, imap all
> flawlessly. I would suspect flaky hardware at this point except going
> back to the older version fixes things.
>
> Only other thing I can suspect is that exim is threaded - could there be
> a locking problem I'm seeing running two requests at a time? I can
> imagine that 48 hours would be long enough to be filtering two messages
> at the same time. That would explain why most people running in a single
> thread POP service manner would not see this bug.
>
> The basic fact is I am sure I recreated the databases and didn't upgrade
> and try to run the old database (which if I remember would have failed
> at once). Going back to the old version and once again reproducing the
> databases fixes the problem.
>
> I can think that it would be easy to test by running two or three
> instances of bogofilter at the same time on some mail files. One can
> write a script that will fork and you might want to add it to your
> testing procedure. Hope this helps.

If you build bogofilter from source, "make check" runs a series of 40
or so regression tests. The 2 locking tests run multiple copies of
bogofilter at once to exercise the operating system, the database, and
bogofilter itself. If you have the source code, it's probably worth
running, just for grins.
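
For a quick check without the source tree, the "two or three instances
at the same time" idea can be sketched in a few lines of shell. This is
not the "make check" locking test, only a rough approximation of it.
The concurrent_register function and the BOGOFILTER variable are
hypothetical names, and the wordlist directory should be a scratch
directory rather than /etc/bogofilter.

```shell
#!/bin/bash
# Sketch: start several bogofilter registrations at the same moment
# against one wordlist directory and count how many of them fail.
# BOGOFILTER is an illustrative variable, not a bogofilter setting.
BOGOFILTER="${BOGOFILTER:-bogofilter}"

# usage: concurrent_register SCRATCH_DBDIR MBOX [COPIES]
concurrent_register() {
    dbdir=$1
    mbox=$2
    copies=${3:-3}
    pids=""
    failures=0
    i=0
    # Launch all copies in the background so they contend for the database.
    while [ "$i" -lt "$copies" ]; do
        "$BOGOFILTER" -M -s -d "$dbdir" -I "$mbox" &
        pids="$pids $!"
        i=$((i + 1))
    done
    # Collect each exit status; a non-zero status counts as a failure.
    for pid in $pids; do
        wait "$pid" || failures=$((failures + 1))
    done
    echo "$failures of $copies runs failed"
    [ "$failures" -eq 0 ]
}
```

The function returns non-zero if any copy failed, so it can be dropped
into a larger test script. A clean run here does not rule out a locking
problem; it only shows that this particular overlap did not trigger one.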

> I hope I didn't sound off base here and hope I haven't ruffled any
> feathers, but I really do think that these should spend some time in
> unstable.

Nope, no ruffled feathers here. Guess you'll have to try harder :-)

Cheers!

David