#990832 spamass-milter: Postfix has increased unparseable relay failures leading to SPF_FAIL

Package:
spamass-milter
Source:
spamass-milter
Description:
milter for filtering mail through spamassassin
Submitter:
William Haller
Date:
2021-07-19 14:57:03 UTC
Severity:
important
#990832#5
Date:
2021-07-08 20:56:59 UTC
From:
To:
Dear Maintainer,

   * What led up to the situation?

Migration from CentOS 7 to Debian buster. Minor configuration file changes from CentOS due to Postfix chroot, but was working there. Users reported missing mail on Debian which we discovered was being routed to system
junk files due to SPF failures from spamassassin.

   * What exactly did you do (or not do) that was effective (or
     ineffective)?

Removing the spamass-milter from the stream and letting the spamc program be run at delivery time via procmail dropped the number of unparseable relay e-mails (and several subsequent SPF failures) from 2600 or so per day
to around 100 per day. No changes to the spamassassin configuration - just dropping the spamass-milter. Since SPF is failing, it is almost certainly the last hop that is not parsed right (usually the first listed).
A sample is included... I didn't test this message, but I did test some others that failed by running them directly through spamc and they gave a correct report with no unparseable relay errors.

Return-Path: <EmailAddressRemoved@nfpa.org>
Delivered-To: EmailAddressRemoved@autoelect.com
Received: from NFRSProdINF1.us.nfpa.org (NFRSProdINF1.us.nfpa.org [161.47.147.140])
	by postoffice-2.autoelect.com (Postfix) with ESMTP id 5100AA03E0
	for <EmailAddressRemoved@autoelect.com>; Tue,  6 Jul 2021 12:31:48 -0600 (MDT)
DKIM-Filter: OpenDKIM Filter v2.11.0 postoffice-2.autoelect.com 5100AA03E0
Received: from NFRSProdWeb10.us.nfpa.org ([172.24.32.139]) by NFRSProdINF1.us.nfpa.org with Microsoft SMTPSVC(10.0.14393.4169);
	Tue, 6 Jul 2021 14:31:47 -0400
Received: from NFRSProdWeb10 ([127.0.0.1]) by NFRSProdWeb10.us.nfpa.org with Microsoft SMTPSVC(10.0.14393.4169);
	Tue, 6 Jul 2021 14:31:47 -0400
MIME-Version: 1.0
From: EmailAddressRemoved@nfpa.org
To: EmailAddressRemoved@autoelect.com
Date: 6 Jul 2021 14:31:47 -0400
Subject: NFPA  70 Document Alert
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: base64
Message-ID: <NFRSPRODWEB10aKfmW1000815fe@NFRSProdWeb10.us.nfpa.org>
X-OriginalArrivalTime: 06 Jul 2021 18:31:47.0761 (UTC) FILETIME=[2DF78A10:01D77295]
X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-4.5.11 (postoffice-2.autoelect.com [207.109.102.68]); Tue, 06 Jul 2021 12:31:48 -0600 (MDT)
X-Virus-Scanned: clamav-milter 0.103.2 at postoffice-2.autoelect.com
X-Virus-Status: Clean
X-Spam-Status: No, score=3.7 required=7.0 tests=HTML_IMAGE_ONLY_20,
	HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,RDNS_NONE,SPF_FAIL,
	T_REMOTE_IMAGE,UNPARSEABLE_RELAY autolearn=no autolearn_force=no
	version=3.4.2
X-Spam-Report:
	*  0.0 SPF_FAIL SPF: sender does not match SPF record (fail)
	*      [SPF failed: Please see http://www.openspf.org/Why?s=mfrom;id=EmailAddressRemoved%40nfpa.org;ip=172.24.32.139;r=postoffice-2.autoelect.com]
	*  0.0 HTML_MESSAGE BODY: HTML included in message
	*  1.1 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
	*  0.7 HTML_IMAGE_ONLY_20 BODY: HTML: images with 1600-2000 bytes of
	*      words
	*  0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay
	*      lines
	*  1.3 RDNS_NONE Delivered to internal network by a host with no rDNS
	*  0.6 HTML_MIME_NO_HTML_TAG HTML-only message, but there is no HTML
	*      tag
	*  0.0 T_REMOTE_IMAGE Message contains an external image
X-Spam-Level: ***
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on
postoffice-2.autoelect.com

body of message omitted.

   * What was the outcome of this action?
Mail was incorrectly flagged as forged and hid from the user.

   * What outcome did you expect instead?
Mail to be correctly parsed and delivered.

#990832#8
Date:
2021-07-10 23:13:22 UTC
From:
To:
exist at the moment it has to send the header to spamassassin. I'm
suspecting that a configuration issue (or a bug) is causing the
Received: header to not be parsable (or doesn't match localnet in
spamassassin).

Try running spamass-milter with -d spamc; to have it output the received
header it is sending to spamc (and compare with the actual received
header that gets generated).

#990832#13
Date:
2021-07-12 18:30:27 UTC
From:
To:
Just a quick update. I'm doing some logging. To me it looks like DNS
replies may not come fast enough. For some reason, with calls from spamc
from procmail they aren't timing out and moving to the next hop (thereby
marking it as the sender and causing the SPF failures) like happens in
spamass-milter.

It is possible that an IPV6 reply from the first address is also an
issue, but I'm not certain of this. It does appear that it's IPV6
addresses that cause the most issues from a brief perusal. It could also
be that only the bigger e-mail senders are using IPV6 at this point and
when another company uses something like outlook.com as its final relay,
then checking through all the possible includes in that company's SPF
records for which outlook.com is one just takes too long to finish
processing the first time but is then available by the time
procmail/spamc is hitting it.

As long as DNS resolution occurs, I haven't seen any issues with
internal/trusted settings.

I've added dns to received-header for the logging and will post again if
I get something definitive.

#990832#18
Date:
2021-07-13 15:11:49 UTC
From:
To:
SMTP id unknown#015#012#011Mon, 12 Jul 2021 11:17:57
-0600#015#012#011(envelope-from 
<bounce-34_HTML-3203040-15955-514005572-29013@bounce.mc.ihg.com>);#015

output "Received: from mta12.mc.ihg.com (unknown)#015#012#011by
postoffice-2.autoelect.com (8.13.0/8.13.0) with SMTP id
unknown#015#012#011Mon, 12 Jul 2021 11:17:57
-0600#015#012#011(envelope-from 
<bounce-34_HTML-3203040-15955-514005572-29013@bounce.mc.ihg.com>);#015#012"

output "Received: by mta12.mc.ihg.com id hthska2fmd4d for
<EmailAddrRemoved@AUTOELECT.COM>; Mon, 12 Jul 2021 17:17:56 +0000
(envelope-from
<bounce-34_HTML-3203040-15955-514005572-29013@bounce.mc.ihg.com>)#015#012"

yields from spamd

received-header: unparseable: from mta12.mc.ihg.com (unknown) by
postoffice-2.autoelect.com (8.13.0/8.13.0) with SMTP id unknown Mon, 12
Jul 2021 11:17:57 -0600 (envelope-from
<bounce-34_HTML-3203040-15955-514005572-29013@bounce.mc.ihg.com>);
---

but when called via procmail call to spamc gives

received-header: parsed as [ ip=13.111.173.13 rdns=mta12.mc.ihg.com
helo=mta12.mc.ihg.com by=postoffice-2.autoelect.com ident= envfrom=
intl=0 id=050749FFD0 auth= msa=0 ]

received-header: relay 13.111.173.13 trusted? no internal? no msa? no

which has no unparseable relay errors.
----- In the e-mail text the received lines are: Received: from mta12.mc.ihg.com (mta12.mc.ihg.com [13.111.173.13])     (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))     (No client certificate requested)     by postoffice-2.autoelect.com (Postfix) with ESMTPS id 050749FFD0     for <EmailAddrRemoved@AUTOELECT.COM>; Mon, 12 Jul 2021 11:17:56 -0600 (MDT) Other header lines (DKIM-Filter, Authentication-Results: DKIM-Signature:) and then Received: by mta12.mc.ihg.com id hthska2fmd4d for <EmailAddrRemoved@AUTOELECT.COM>; Mon, 12 Jul 2021 17:17:56 +0000 (envelope-from <bounce-34_HTML-3203040-15955-514005572-29013@bounce.mc.ihg.com>)
---- The DKIM verify wouldn't have happened till after spamass-milter saw it, but it looks like the first received line is simply lost and replaced with the second received header which will never have a chance of matching SPF. But as I don't routinely look at the debug output, I may well be misinterpreting it. As far as I know the spamass-milter configuration is very generic OPTIONS="-u spamass-milter -i 127.0.0.1 -d spamc" SOCKET="/var/spool/postfix/spamass/spamass.sock" SOCKETOWNER="postfix:postfix" SOCKETMODE="0660"
#990832#23
Date:
2021-07-14 14:56:05 UTC
From:
To:
Look at moving the two assassin->connected = 1 statements down after you
try to connect. I put them after the catch block when I made the change.
I think setting the flag to 1 early causes the output to fail to buffer
and it is the first line which is frequently not able to be transmitted
to the pipe that is not yet running but marked as connected. I admit I
don't know the code very well and it may cause other issues I don't see,
but this seems to make a huge difference.

If this looks like a fix could you remove the message #18 from the bug
report. Thanks.

Bill

#990832#28
Date:
2021-07-15 16:13:55 UTC
From:
To:
Still see some dropped header information - drastically less, but not 0.
So I'd say there still might be a race condition or error handling that
isn't right.

I tried to move the outputbuffer handling into ::output, moving any
existing outputbuffer contents to a temporary buffer, killing the
outputbuffer, and dumping that temp buffer first once connected was set,
but I was still getting unparseable relays. Then I pulled the connection
attempt for each line completely letting everything be buffered and just
did the connection attempt after the _eoh was buffered to the
outputbuffer (turning on connected before the Connect call as it was
before - just done after everything was buffered). This also gave
unparseable relays, but in both of these cases at least all received
headers were examined by spamd.

So I'm thinking that the dropped lines are one problem - and either of
the above solutions might help with that - I don't know - but that there
is also a received header formation problem that spamd doesn't like the
result of.


A sample header in the text e-mail might be

Received: from mail02.connect.hpe.com (mail02.connect.hpe.com
[204.92.22.193])
     (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
     (No client certificate requested)
     by postoffice-2.autoelect.com (Postfix) with ESMTPS id 9278E9FFB5
     for <EmailRemoved@autoelect.com>; Thu, 15 Jul 2021 09:53:19 -0600 (MDT)

and I see

Received header for spamc: Received: from mail02.connect.hpe.com
(unknown)#015#012#011by postoffice-2.autoelect.com (8.13.0/8.13.0) with
SMTP id unknown#015#012#011Thu, 15 Jul 2021 09:53:19
-0600#015#012#011(envelope-from <bounceback@connect.hpe.com>);#015

output "Received: from mail02.connect.hpe.com (unknown)#015#012#011by
postoffice-2.autoelect.com (8.13.0/8.13.0) with SMTP id
unknown#015#012#011Thu, 15 Jul 2021 09:53:19
-0600#015#012#011(envelope-from <bounceback@connect.hpe.com>);#015#012"

output "Received: from P01SNJ404 (10.32.120.245) by
mail02.connect.hpe.com id hu1cus2q38k7 for <EmailRemoved@autoelect.com>;
Thu, 15 Jul 2021 11:53:18 -0400 (envelope-from
<bounceback@connect.hpe.com>)#015#012"

from spamass-milter logging and


received-header: unparseable: from mail02.connect.hpe.com (unknown) by
postoffice-2.autoelect.com (8.13.0/8.13.0) with SMTP id unknown Thu, 15
Jul 2021 09:53:19 -0600 (envelope-from <bounceback@connect.hpe.com>);

received-header: parsed as [ ip=10.32.120.245 rdns=P01SNJ404
helo=P01SNJ404 by=mail02.connect.hpe.com ident=
envfrom=bounceback@connect.hpe.com intl=0 id=hu1cus2q38k7 auth= msa=0 ]

received-header: relay 10.32.120.245 trusted? yes internal? yes msa? no

from spamd's logging


Thanks, Bill

#990832#33
Date:
2021-07-16 16:26:15 UTC
From:
To:
Applied fixes from here to fedora 34 base for spamass-milter:
https://bz.apache.org/SpamAssassin/attachment.cgi?id=5745&action=diff

With the exception of setting a default macro_r based on whether or not
there was a cipher since r doesn't seem to be a postfix thing.

macro_cipher = smfi_getsymval(ctx, const_cast<char *>("{cipher}"));
if (!macro_cipher)
{
   macro_cipher = "";
   /* Protocol used to receive the message */
   macro_r = smfi_getsymval(ctx, const_cast<char *>("r"));
   if (!macro_r)
   {
     macro_r = "ESMTP";
     warnmacro("r", "ENVRCPT");
   }
}
else {
   /* Protocol used to receive the message */
   macro_r = smfi_getsymval(ctx, const_cast<char *>("r"));
   if (!macro_r)
   {
     macro_r = "ESMTPS";
     warnmacro("r", "ENVRCPT");
   }
}

and including the changes to append so the lines aren't dropped (setting
the connected flag after a successful connect, moving the dump of the
buffer to be after if (!connected) in ::output before the output of
desired line).

Then corrected the fedora 34 base to include the -I option in the debian
code.

Then made the final output for computed last hop to be

rec_header = (string) "Received: from " + macro_s + " (" + macro__ +
")\r\n\t";
if (strlen(macro_tls_version)!=0)
{
   rec_header += (string) "(using ";
   if (strlen(macro_tls_version)!=0) {
     rec_header+=macro_tls_version;
   }
   if (strlen(macro_cipher)!=0) {
     rec_header+=(string) " cipher="+macro_cipher;
   }
   if (strlen(macro_auth_ssf))
   {
     rec_header += (string) " ("+macro_auth_ssf+"/"+macro_auth_ssf+" bits)";
   }
   rec_header += (string) ")\r\n\t";
}
if (strlen(macro_auth_authen)!=0) {
   rec_header += (string) "(Authenticated sender:
"+macro_auth_authen+")\r\n\t";
}
rec_header += (string) "by " + macro_j + " (Postfix) with " +
   macro_r + " id " + macro_i + ";\r\n\t" +
   "for "+envrcpt[0]+"; "+macro_b + "\r\n";

Which gets rid of unparseable relays and dropped lines. I'm not sure
what the best fix would or should be - adding a postfix flag so it would
generate postfix worthy lines for spamassassin or making spamassassin
recognize the lines put out by spamass-milter as "sendmail" lines. But
getting the client ip address for the sender seems to be the big thing.

Bill

#990832#38
Date:
2021-07-19 14:54:35 UTC
From:
To:
Thanks for all of the work and the patching.

The debug output that you're showing looks a lot like some of the
necessary macros aren't enabled in the postfix config; can you compare
what you have to what is listed in README.Debian?

I'll try to take another look at the code that you have which is working
and see what is going on there.