#175744 moderate messages with SA score positive but under threshold

#175744#5
Date:
2003-01-07 20:31:50 UTC
From:
To:
This is a summary of the spam and legitimate messages I received from
the Debian lists I was subscribed to during the last month, ordered by
spamassassin score:

                        Legitimate         spam     % of spam
Below 0                       5884            5         0.08%
Between 0 and 1                689          123        15.15%
Between 1 and 2                236          227        49.03%
Between 2 and 3                150          174        53.70%
Between 3 and 4                 20          164        89.13%

Total                         6979          693         9.03%

As you can see, the higher the spamassassin score, the greater the
probability that a message is spam (I do not expect this to change in
the near future, even if we use better filtering methods).
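
For anyone who wants to recompute the last column, here is a minimal
sketch. The counts are copied from the table above; the names BUCKETS,
spam_share and summarize are only illustrative.

```python
# Counts copied from the table above: (label, legitimate, spam).
BUCKETS = [
    ("Below 0", 5884, 5),
    ("Between 0 and 1", 689, 123),
    ("Between 1 and 2", 236, 227),
    ("Between 2 and 3", 150, 174),
    ("Between 3 and 4", 20, 164),
]


def spam_share(legit, spam):
    """Return spam as a percentage of all messages in one score range."""
    return 100.0 * spam / (legit + spam)


def summarize(buckets):
    """Return (label, percentage) rows, with a final row for the total."""
    rows = [(label, round(spam_share(legit, spam), 2))
            for label, legit, spam in buckets]
    total_legit = sum(legit for _, legit, _ in buckets)
    total_spam = sum(spam for _, _, spam in buckets)
    rows.append(("Total", round(spam_share(total_legit, total_spam), 2)))
    return rows


if __name__ == "__main__":
    for label, pct in summarize(BUCKETS):
        print(f"{label:<20}{pct:6.2f}%")
```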

This suggests that a very simple but effective way to get rid of a lot
of spam would be to moderate messages over a certain threshold.

For example, by moderating messages having a spamassassin score over
2.0, we would have to moderate only 2.4% of all legitimate traffic,
but we would get rid of 48.8% of all the current spam, i.e. for very
little cost, we could get a very high benefit.
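
The 2.4% and 48.8% figures can be checked with a short sketch like the
one below. It assumes that a threshold of 2.0 holds every message in
the "Between 2 and 3" and "Between 3 and 4" rows; the function name
moderation_tradeoff is only illustrative.

```python
# (lowest score in the range, legitimate, spam), from the table above.
BUCKETS = [
    (float("-inf"), 5884, 5),
    (0.0, 689, 123),
    (1.0, 236, 227),
    (2.0, 150, 174),
    (3.0, 20, 164),
]


def moderation_tradeoff(buckets, threshold):
    """Return (% of legitimate mail held, % of spam held) for a threshold.

    A whole score range is held for moderation when its lowest score is
    at or above the threshold.
    """
    total_legit = sum(legit for _, legit, _ in buckets)
    total_spam = sum(spam for _, _, spam in buckets)
    held_legit = sum(legit for low, legit, _ in buckets if low >= threshold)
    held_spam = sum(spam for low, _, spam in buckets if low >= threshold)
    return (100.0 * held_legit / total_legit,
            100.0 * held_spam / total_spam)


if __name__ == "__main__":
    legit_pct, spam_pct = moderation_tradeoff(BUCKETS, 2.0)
    print(f"legitimate held: {legit_pct:.1f}%, spam held: {spam_pct:.1f}%")
```

Calling it with other thresholds (1.0, 3.0) shows how the moderators'
workload trades off against the amount of spam stopped.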

Additionally, if we assume that most (if not all) of the spam comes
from non-subscribers, we could let messages from subscribers pass
(assuming they aren't caught by spamassassin, that is) and moderate
only messages coming from non-subscribers. This way moderation would
be even easier.
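
As a rough sketch of that policy (not an actual listserver
implementation): the function below assumes the score and the sender
address are already known, that "caught by spamassassin" means a score
of 4.0 or more, and that moderation starts at 2.0. Both limits and all
the names (Action, route, moderate_at, reject_at) are assumptions made
for illustration.

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    MODERATE = "moderate"
    REJECT = "reject"


def route(score, sender, subscribers, moderate_at=2.0, reject_at=4.0):
    """Decide what to do with one message.

    score       -- spamassassin score of the message
    sender      -- address the message was posted from
    subscribers -- set of lowercase subscriber addresses
    moderate_at -- assumed lower limit for moderation
    reject_at   -- assumed limit above which spamassassin catches the mail
    """
    if score >= reject_at:
        return Action.REJECT          # caught by spamassassin as today
    if sender.lower() in subscribers:
        return Action.DELIVER         # subscribers pass unmoderated
    if score >= moderate_at:
        return Action.MODERATE        # non-subscriber with a high score
    return Action.DELIVER
```

For example, route(2.5, "someone@example.net", set()) gives
Action.MODERATE, while the same message from a subscriber is delivered.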

I proposed this on debian-devel about two months ago and there were
people willing to moderate lists in this way, so I think this would work.

As a side effect, we would not have to worry so much about some
spamassassin scores. For example, we could assign NO_REAL_NAME its
original score of 1.285 (which is usual for a standard spamassassin
installation), sparing the list subscribers a lot of spam, and
moderators could take care of the extremely low number of false
positives we would obtain in the range from 2.0 to 4.0.

Thanks.

#175744#14
Date:
2003-02-02 20:25:27 UTC
From:
To:
Hello.

These are the statistics for the approximate amount of ham/spam I
received in January, ordered by spamassassin score:

                               ham         spam      % of spam
Below 0                       5704           14          0.24%
Between 0 and 1                625           27          4.14%
Between 1 and 2                278           62         18.24%
Between 2 and 3                172           83         32.55%
Between 3 and 4                 20           44         68.75%

Total                         6799          230          3.27%

Comments:

* Compared to last month, spam is now approximately 1/3 of what it
used to be. Congratulations.

* There is an increasing number of bogus virus warnings, which I have
excluded from these figures. This is a problem that should also be
addressed.

* Even if we are now using more effective anti-spam filters, it's
still true that as the spamassassin score increases, so does the
probability of the messages being spam; so I still suggest that we
start moderating messages having a high spamassassin score.
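
The first and last comments can be checked against both tables with a
sketch like this one. The counts are copied from the December and
January tables in this report; the function names are only
illustrative.

```python
# (legitimate, spam) per score range, lowest range first.
DECEMBER = [(5884, 5), (689, 123), (236, 227), (150, 174), (20, 164)]
JANUARY = [(5704, 14), (625, 27), (278, 62), (172, 83), (20, 44)]


def spam_shares(month):
    """Return the fraction of spam in each score range."""
    return [spam / (ham + spam) for ham, spam in month]


def total_spam(month):
    """Return the number of spam messages over all score ranges."""
    return sum(spam for _, spam in month)


def rises_with_score(month):
    """Return True if the spam fraction grows from each range to the next."""
    shares = spam_shares(month)
    return all(low < high for low, high in zip(shares, shares[1:]))


if __name__ == "__main__":
    ratio = total_spam(JANUARY) / total_spam(DECEMBER)
    print(f"January spam / December spam = {ratio:.2f}")
    print("rises with score in January:", rises_with_score(JANUARY))
```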

Thanks.

#175744#19
Date:
2003-02-03 03:19:05 UTC
From:
To:
Hi Santiago,

I'm assuming ham is good messages -- I haven't heard that terminology
before.

Thanks for those figures; I'll be sure to use them in the next update we
send out.

I agree; now that I'm back I'll try to work on these some more.
Currently I am thinking of soliciting moderators for each list and
bouncing messages over 3.5 to them for approval.

I'm working on the SA tags and moderating the Chinese lists first,
though.

Regards,
Anand

#175744#24
Date:
2003-06-03 15:07:42 UTC
From:
To:
If you do implement moderation of posts by non-subscribers, I think it
would be a good idea to have some sort of whitelist mechanism for
subscribers that regularly post from an address other than their
subscription address. I'm one such person.

My From header says liw@iki.fi. IKI is a non-commercial forwarding
service that has been in operation since 1995 and which has a high
probability of staying in operation for the next decade, at least. One
way to ensure this is to avoid flooding the mail server with unnecessary
mails. For example, I do not subscribe to mailing lists via liw@iki.fi,
since mailing list subscriptions are easy to change. Thus, whenever I
point liw@iki.fi to another mailbox, all my personal, regular mail goes
there and then I change my mailing list subscriptions as well.

Other people have other reasons for having the subscriber address and
the From header be different. For example, it is somewhat popular, it
seems, to subscribe to lists with addresses of the form
liw+listname@example.com, to ease filtering.

Thus, if Debian lists will become moderated for posts from
non-subscribers, a whitelist feature would be much appreciated by me and
my ilk, and probably also by the moderators, since it would lessen their
workload.
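
To make the request concrete, here is a minimal sketch of such a
check. It is not the implementation from any existing report; it
assumes the subscriber list and the whitelist are plain sets of
addresses, and that a "+tag" in the local part can be dropped when
comparing, which is a common convention but not something every mail
system honours. Lowercasing the local part is also a simplification.

```python
def canonical_address(address):
    """Lowercase an address and drop a '+tag' suffix from the local part."""
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a mail address: {address!r}")
    local = local.split("+", 1)[0]
    return f"{local}@{domain}"


def is_known_poster(sender, subscribers, whitelist):
    """Return True if the sender matches a subscriber or whitelist entry.

    subscribers -- addresses subscribed to the list
    whitelist   -- extra addresses that known subscribers post from
    """
    known = {canonical_address(a) for a in set(subscribers) | set(whitelist)}
    return canonical_address(sender) in known
```

With subscribers = {"liw+listname@example.com"} and
whitelist = {"liw@iki.fi"}, posts from both liw@example.com and
liw@iki.fi would skip moderation.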

#175744#29
Date:
2003-06-03 15:41:04 UTC
From:
To:
A whitelist mechanism is described in Bug #175477.
The report includes an implementation.

#175744#36
Date:
2003-07-19 18:34:27 UTC
From:
To:
I'd only extend the courtesy to subscribers if the SA score of their
messages was < -2 or so.

We definitely don't want spammers to subscribe to the whitelist and then
freely and easily spam every list.

#175744#41
Date:
2003-07-19 18:49:16 UTC
From:
To:
Hi,

You wrote:

I was just wondering, can I hassle you to post your current statistics? :)

BTW, while we're at stats, there's a ~ 1.6% false positive rate on
spamassassinated messages with score between 4.5 and 10 in listmaster
mail (this includes smartlist-generated mail such as failed subscription
requests). Unfortunately I didn't make a note of their exact scores,
but IIRC they were all well under 7. I wonder if there's a similar rate
in the 4 to 5 range...

#175744#46
Date:
2003-07-19 21:34:29 UTC
From:
To:
The reason I temporarily stopped giving you statistics is that June
was extremely bad. Among the mail I received there were 463 spam
messages and 5961 legitimate ones. 463/(463+5961) = 7.2% of spam.

There will be better statistics at the end of this month, but I wish
you would seriously consider using some good DNSBLs on murphy for the
cases where spamd fails as it did in June, so that we do not rely only
on spamassassin to stop spam.

Using some good DNSBLs on murphy would also remove some of the spam
which is sent to @debian.org accounts. Many spammers prefer murphy or
gluck over master as their MX for debian.org just because they think a
non-primary MX will have fewer anti-spam controls.
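
For reference, a DNSBL lookup is just a DNS query: the octets of the
connecting IPv4 address are reversed and prepended to the list's zone,
and any A record in the answer means "listed". A minimal sketch
follows; in practice the MTA would do this itself, and the zone
dnsbl.example.org used in the usage note is a placeholder, not a
recommendation of any real list.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)      # raises ValueError if not IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the DNSBL zone has an A record for the address.

    Any resolution failure (including network trouble) counts as
    "not listed", so a broken resolver never blocks mail.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True
```

For example, dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
returns "99.2.0.192.dnsbl.example.org".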

#175744#51
Date:
2009-04-23 21:24:44 UTC
From:
To:
Hi,

in the meantime we reworked the filtering mechanism so we now use
amavis with spamassassin and we have a whitelist.

Moderation of some 'grey' articles would be a nice thing, but
as that would need moderators and an implementation, this is a
wontfix for now.

As this bug is rather old I am closing it now; feel free to re-open it
with some words about your opinion of the current situation.

Yours,
        Cord, Debian Listmaster of the day

#175744#56
Date:
2009-05-16 17:41:42 UTC
From:
To:
reopen 175744
thanks

If you are still using spamassassin, then it is almost certain that the
initial message in this report still holds: the higher the spamicity of
a message, the greater the probability that the message is spam.

Of course this proposal would need moderators, but there are already
volunteers who report "this message is spam" in the list archives.

So it's not moderators that we lack, but willingness to code an
implementation.

Why not tag this bug "help" instead of "wontfix" then?

I thought it was pretty obvious that it's better to stop the spam
before it's sent than to remove it from the list archives after it has
been distributed to thousands of people.

So if you are campaigning for people to click on the "report spam"
button, please start thinking about this proposal of moderating messages,
as it would be a much better way to avoid spam.

Thanks.

#175744#63
Date:
2009-05-17 06:23:20 UTC
From:
To:
Hello! You (Santiago Vila) wrote:

The 'report as spam' users are unqualified; they are search engines
and other Joe Random list archive users.

The real reviewers (who have to be DDs) currently aren't numerous
enough to do this in the needed quantity and at the needed speed.

About 100 postings would need to be reviewed every day at the moment.

So from my point of view this is a wontfix, because I don't see that
we would get reliable and fast moderation, and I also don't see anyone
who would implement it.

Yours,
        Cord, Debian Listmaster of the day

#175744#74
Date:
2019-10-07 08:28:57 UTC
From:
To:

#175744#79
Date:
2019-10-07 08:37:18 UTC
From:
To:

#175744#84
Date:
2019-11-02 04:42:04 UTC
From:
To:
Hello,



I am contacting you with regards to using your name for funds claim of long
overdue dormant funds for Investment belonging to a late depositor. Let me
know if you are interested for more details.



Best Regards,

#175744#108
Date:
2025-02-08 04:50:29 UTC
From:
To:
Final Notice.

You are among the beneficiaries of 2024/2025 grant for all scam victims and relatives reconfirm your email if active for more details

Thank You.

Regards
Mr. Rowland Cole
( Financial Crimes Enforcement Network)
