#524758 bogofilter: Bogofilter crashes on some mails

Package:
bogofilter
Source:
bogofilter
Description:
fast Bayesian spam filter (meta package)
Submitter:
Sjoerd Simons
Date:
2024-05-27 15:33:06 UTC
Severity:
important
Tags:

#524758#5
Date:
2009-04-19 17:37:57 UTC
From:
To:
It seems like bogofilter crashes on some mails; I get the following backtrace
when rebuilding with debugging symbols:

(gdb) bt full
#0  word_cmp (w1=0x21681f0, w2=0x0) at ../../src/word.c:40
        r = <value optimized out>
#1  0x0000000000417a44 in listsort (list=0x21460e0, compare=
    0x40ed80 <compare_rstats_t>) at ../../src/listsort.c:113
        p = 0x219dd10
        q = 0x219dd50
        e = 0xffffffff
        tail = 0x219dcd0
        insize = 1
        nmerges = <value optimized out>
        psize = 1
        qsize = 1
#2  0x000000000040e73b in rstats_print (unsure=false) at ../../src/rstats.c:142
        robn = <value optimized out>
#3  0x000000000040d08e in write_spam_info () at ../../src/passthrough.c:138
No locals.
#4  0x000000000040d326 in write_message (status=RC_HAM)
    at ../../src/passthrough.c:218
        rf = 0x40ccf0 <read_mem>
        rfarg = 0x7ffff9541630
        text = 0x2167e70
        seen_subj = false
#5  0x0000000000402ba8 in bogofilter (argc=<value optimized out>,
    argv=<value optimized out>) at ../../src/bogofilter.c:132
        spamicity = 0
        w = 0x2146030
        msgcount = 1
        status = RC_HAM
        register_opt = false
        register_aft = false
        write_msg = true
        words = 0x0
#6  0x0000000000404bd9 in bogomain (argc=0, argv=0x7ffff95417c8)
    at ../../src/bogomain.c:67
        status = <value optimized out>
        exitcode = <value optimized out>
#7  0x0000000000402dee in main (argc=5, argv=0x7ffff95417c8)
    at ../../src/main.c:31
        exitcode = <value optimized out>
(gdb) p * (rstats_t *)p
$35 = {next = 0x219dd50, token = 0x21681f0, good = 0, bad = 2108, msgs_good =
    0, msgs_bad = 6757, used = false, prob = -nan(0x8000000000000)}
(gdb) p * (rstats_t *)q
$36 = {next = 0x0, token = 0x0, good = 0, bad = 0, msgs_good = 0, msgs_bad =
    0, used = false, prob = 0}
(gdb)

The compare_rstats_t function doesn't seem to cope well when the probabilities
are incomparable and one of the sides doesn't actually have tokens...
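
To illustrate what "incomparable" means here: in C every ordered comparison
against a NaN is false, so a comparator built only on < and > cannot order the
two records printed above. Below is a minimal sketch of a comparator that gives
NaN and a NULL token a fixed place. The demo_stat type and the demo_*/cmp_*
names are made up for illustration; they are not bogofilter's rstats_t or
compare_rstats_t, and the chosen ordering (NaN last, NULL token first) is an
assumption, not what the real code does.

```c
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for rstats_t; only the fields needed here. */
typedef struct {
    const char *token;  /* may be NULL, as in the q record above */
    double prob;        /* may be NaN, as in the p record above */
} demo_stat;

/* Three-way compare of doubles that gives NaN a fixed place (last). */
static int cmp_prob(double a, double b)
{
    int a_nan = isnan(a) != 0;
    int b_nan = isnan(b) != 0;

    if (a_nan || b_nan)
        return a_nan - b_nan;   /* NaN sorts after ordinary numbers */
    return (a > b) - (a < b);
}

/* Three-way compare of tokens that treats NULL as smaller than any string. */
static int cmp_token(const char *a, const char *b)
{
    if (a == NULL || b == NULL)
        return (a != NULL) - (b != NULL);
    return strcmp(a, b);
}

/* Order by probability first, then by token; never dereferences NULL. */
int demo_compare(const demo_stat *x, const demo_stat *y)
{
    int r = cmp_prob(x->prob, y->prob);
    return r != 0 ? r : cmp_token(x->token, y->token);
}
```

With this, a record holding a NaN probability and one holding a NULL token
still compare consistently in both directions, which is what a merge sort
needs from its callback.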

  Sjoerd

#524758#10
Date:
2009-06-27 00:10:30 UTC
From:
To:
(Adding the bogofilter devs to the loop.)

Sjoerd, would you please supply an example email to help with reproducing the
problem?

Cheers,
Serafeim

Package: bogofilter
Version: 1.2.0-1+b1
Severity: important

It seems like bogofilter crashes on some mails; I get the following backtrace
when rebuilding with debugging symbols:

(gdb) bt full
#0  word_cmp (w1=0x21681f0, w2=0x0) at ../../src/word.c:40
        r = <value optimized out>
#1  0x0000000000417a44 in listsort (list=0x21460e0, compare=
    0x40ed80 <compare_rstats_t>) at ../../src/listsort.c:113
        p = 0x219dd10
        q = 0x219dd50
        e = 0xffffffff
        tail = 0x219dcd0
        insize = 1
        nmerges = <value optimized out>
        psize = 1
        qsize = 1
#2  0x000000000040e73b in rstats_print (unsure=false) at ../../src/rstats.c:142
        robn = <value optimized out>
#3  0x000000000040d08e in write_spam_info () at ../../src/passthrough.c:138
No locals.
#4  0x000000000040d326 in write_message (status=RC_HAM)
    at ../../src/passthrough.c:218
        rf = 0x40ccf0 <read_mem>
        rfarg = 0x7ffff9541630
        text = 0x2167e70
        seen_subj = false
#5  0x0000000000402ba8 in bogofilter (argc=<value optimized out>,
    argv=<value optimized out>) at ../../src/bogofilter.c:132
        spamicity = 0
        w = 0x2146030
        msgcount = 1
        status = RC_HAM
        register_opt = false
        register_aft = false
        write_msg = true
        words = 0x0
#6  0x0000000000404bd9 in bogomain (argc=0, argv=0x7ffff95417c8)
    at ../../src/bogomain.c:67
        status = <value optimized out>
        exitcode = <value optimized out>
#7  0x0000000000402dee in main (argc=5, argv=0x7ffff95417c8)
    at ../../src/main.c:31
        exitcode = <value optimized out>
(gdb) p * (rstats_t *)p
$35 = {next = 0x219dd50, token = 0x21681f0, good = 0, bad = 2108, msgs_good =
    0, msgs_bad = 6757, used = false, prob = -nan(0x8000000000000)}
(gdb) p * (rstats_t *)q
$36 = {next = 0x0, token = 0x0, good = 0, bad = 0, msgs_good = 0, msgs_bad =
    0, used = false, prob = 0}
(gdb)

The compare_rstats_t function doesn't seem to cope well when the probabilities
are incomparable and one of the sides doesn't actually have tokens...

  Sjoerd

#524758#18
Date:
2009-06-27 03:27:17 UTC
From:
To:
...[snip]...

the w2 parameter should never be 0x0

which indicates a problem in listsort

Please submit a sample email (preferably gzipped).

Thank you.

David

#524758#26
Date:
2009-07-20 20:29:53 UTC
From:
To:
Attached

  Sjoerd

#524758#34
Date:
2009-07-21 01:18:51 UTC
From:
To:
On Mon, 20 Jul 2009 21:29:53 +0100 Sjoerd Simons wrote:

Hello Sjoerd,

You've supplied the mime part of an email, not a complete email with
headers.  I added some headers to it and had no problem.  I tested with
bogofilter 1.2.0 on a 32-bit mandriva machine and a 64-bit gentoo
machine.

Can you supply a _complete_ message that demonstrates the problem?

Thanks.

David

#524758#48
Date:
2010-01-15 09:21:22 UTC
From:
To:
...in http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=524758 - could you
provide a full message that shows the crash?

Does the database verify? (You can use bogoutil --db-verify to check.)

Thank you.

#524758#53
Date:
2011-12-01 22:55:10 UTC
From:
To:
Hello.

I believe the crash is caused by empty "ham" or "spam" data in
bogofilter's token database (wordlist).
This can happen when bogofilter has, from the start, only ever learned
spam and never ham, or vice versa.
In such a one-sided state my bogofilter reliably crashes (SIGSEGV)
when run with arguments such as "-vv", even though "bogoutil
--db-verify" always reports the database as fine.
Just a hint.

Jaroslav Rektoris

#524758#58
Date:
2018-10-05 07:29:30 UTC
From:
To:
	Hello, everyone.

Sorry for digging up this old bug, but I was hit by it in the current
stable.
I confirm Jaroslav's finding: this SEGV reproduces with an empty "spam"
list, with the following backtrace:

#0  word_cmp (w1=0x55e6bad0b050, w2=0x0) at ../../src/word.c:40
#1  0x000055e6b8ac24e5 in listsort (list=0x55e6bace5900, compare=compare@entry=0x55e6b8abc690 <compare_rstats_t>)
    at ../../src/listsort.c:114
#2  0x000055e6b8abc853 in rstats_print (unsure=<optimized out>) at ../../src/rstats.c:142
#3  0x000055e6b8abd0dd in msg_print_stats (fp=<optimized out>) at ../../src/score.c:104
#4  0x000055e6b8ab0575 in print_stats (fp=<optimized out>) at ../../src/bogofilter.c:67
#5  0x000055e6b8abae76 in write_spam_info () at ../../src/passthrough.c:99
#6  0x000055e6b8abb384 in write_header (rf=<optimized out>, rfarg=<synthetic pointer>, status=RC_UNSURE)
    at ../../src/passthrough.c:179
#7  write_message (status=status@entry=RC_UNSURE) at ../../src/passthrough.c:239
#8  0x000055e6b8ab0789 in bogofilter (argc=argc@entry=0, argv=<optimized out>) at ../../src/bogofilter.c:132
#9  0x000055e6b8ab28b9 in bogomain (argc=<optimized out>, argv=0x7ffcebb92f88) at ../../src/bogomain.c:67
#10 0x000055e6b8ab0436 in main (argc=5, argv=0x7ffcebb92f88) at ../../src/main.c:31

This can be explained by a classic NULL pointer dereference in word_cmp
here:

 int word_cmp(const word_t *w1, const word_t *w2)
 {
     uint l = min(w1->leng, w2->leng);
     int r = memcmp((const char *)w1->u.text, (const char *)w2->u.text, l);


Either w1 or w2 can be NULL when the "spam" or "ham" list is empty, and this
kind of problem should never reproduce if both lists contain something.

Attached patch solves the issue for me.
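
The attached patch is not reproduced here. As a rough sketch of the kind of
guard that avoids the dereference, assuming a simplified word type: demo_word
and demo_word_cmp are hypothetical names, not bogofilter's word_t or word_cmp,
and sorting NULL before any real word is an arbitrary choice made for
illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for word_t: a length plus a byte buffer. */
typedef struct {
    size_t leng;
    const unsigned char *text;
} demo_word;

/* Three-way compare; NULL sorts before any word, two NULLs compare equal. */
int demo_word_cmp(const demo_word *w1, const demo_word *w2)
{
    /* Guard first, so neither pointer is dereferenced when it is NULL. */
    if (w1 == NULL || w2 == NULL)
        return (w1 != NULL) - (w2 != NULL);

    size_t l = w1->leng < w2->leng ? w1->leng : w2->leng;
    int r = memcmp(w1->text, w2->text, l);
    if (r != 0)
        return r;

    /* Equal common prefix: the shorter word sorts first. */
    return (w1->leng > w2->leng) - (w1->leng < w2->leng);
}
```

Whether the guard belongs in the comparison function or in the caller that
builds the list is a separate design question that this sketch does not settle.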

Sincerely yours, Reco

#524758#63
Date:
2020-04-30 14:24:13 UTC
From:
To:
One simple way to reproduce this appears to be running the following without
a pre-existing wordlist.db:

echo $'\ngood' | bogofilter -n
echo $'\ngood' | bogofilter -Rv

I am adding a related "make check" test upstream as
bogofilter/src/tests/t.debian-bug-524758.

#524758#68
Date:
2020-04-30 14:44:28 UTC
From:
To:
tags 524758 -moreinfo
tags 524758 +confirmed +upstream +fixed-upstream
thanks

I think upstream Git commit 8eaeb85c should fix this.
To appear in the next release after 1.2.5 (which has already been released).

If it does not, please recompile bogofilter with debug log,
provide your test message, your database, your configuration,
the command line, and a backtrace.

#524758#82
Date:
2020-10-31 08:57:26 UTC
From:
To:
With a patch available from Git, I am tagging this accordingly.

#524758#87
Date:
2020-10-31 09:16:02 UTC
From:
To:

#524758#94
Date:
2024-05-27 15:28:09 UTC
From:
To:
Hi!

For several weeks now, bogofilter has not been filtering spam anymore, and I
wanted to know why. There were two bogofilter installations: sqlite and bdb. I
don't care which one I should use, but with _both_ I got core dumps.

From journalctl:

Process 56179 (bogofilter) of user 1000 dumped core.
Stack trace of thread 56179:
#0  0x000055b3b01c564a n/a (bogofilter-bdb + 0x1564a)
#1  0x000055b3b01cc1ed n/a (bogofilter-bdb + 0x1c1ed)
#2  0x000055b3b01c2327 n/a (bogofilter-bdb + 0x12327)
#3  0x000055b3b01c0806 n/a (bogofilter-bdb + 0x10806)
#4  0x000055b3b01c100c n/a (bogofilter-bdb + 0x1100c)
#5  0x000055b3b01b5ae1 n/a (bogofilter-bdb + 0x5ae1)
#6  0x000055b3b01b7a2d n/a (bogofilter-bdb + 0x7a2d)
#7  0x000055b3b01b574f n/a (bogofilter-bdb + 0x574f)
#8  0x00007fec0a042c8a n/a (libc.so.6 + 0x27c8a)
#9  0x00007fec0a042d45 __libc_start_main (libc.so.6 + 0x27d45)
#10 0x000055b3b01b5781 n/a (bogofilter-bdb + 0x5781)
ELF object binary architecture: AMD x86-64

This is only one of the *many* crashes I got. It was _the same_ with
the sqlite version (I have a folder full of spam, so I just had to
remove the db file for the bdb version to create its own when re-parsing
my folder).

There are no debug symbols included, so it's hard to know where it happens,
but if you have a map file it can be calculated :)


The bug report mentions a patch. However, it dates from 2020! That's quite
old, and the bug report still isn't closed... Has there been any progress on
that front?