- Package: bugs.debian.org
- Source: bugs.debian.org
- Submitter: Ben Longbons
- Date: 2023-03-26 06:27:03 UTC
- Severity: important
- Tags:
Dear Maintainer, When running e.g. `reportbug -N 853037`, a bunch of base64 is displayed instead of the actual content of the messages.
Dear Maintainer, The same thing occurs when saving a bug report to disk if the bug report contains a non-ASCII character: it is saved as base64 and is then rejected by the bug tracking system if you try to send it later, because the first line doesn't begin with "Package: ".
control: clone 853915 -1
control: reassign 853915 python-debianbts
control: retitle -1 reportbug: base64 encoded reports rejected by bts

Reading and sending base64 messages are two different bugs, so let's split this report. I believe that python-debianbts is supposed to decode a base64 message body, therefore I'm reassigning the base64 reading bug. Please reassign back to reportbug if this assumption is wrong.
Can you please provide a case to reproduce this issue? I'm not sure if this is a problem with python-reportbug. Cheers, Bastian
It seems like the core of the problem is that parts of the header -- i.e.:

Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: base64

are assigned as part of the email body instead of the header. I'm not sure if this problem comes from the BTS which sends wrong SOAP or on my side for parsing it wrong.
My suspicion is that the BTS SOAP interface has trouble with certain types of MIME messages (which can be quite complex...) and sends wrong SOAP. Can we reassign this bug (to check) if that is the case? (Where?)
The message is a multipart message, where Content-Type and Content-Transfer-Encoding are given separately for each part, so they must be part of the email body. Each body part then has its own header lines like this. For most bug log messages, one can read the body text like this: But sometimes something like this is needed:

print(bts.get_bug_log(853037)[0]['message'].get_payload()[0].get_payload(decode=True).decode())
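The per-part headers described above can be illustrated with a small sketch that uses only Python's standard `email` package. The helper name `first_text_part` and the sample message are made up for illustration; this is not code from reportbug or python-debianbts, just one way to find and decode the first text/plain part regardless of whether the message is multipart.

```python
import base64
from email import message_from_string
from email.message import Message


def first_text_part(msg: Message) -> str:
    """Return the decoded text of the first text/plain part, or ''."""
    # walk() yields the message itself first, then every nested part,
    # so this works for both simple and multipart messages.
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            # decode=True undoes base64 / quoted-printable for this part.
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return ""


# Hypothetical signed message: the text part carries its own headers.
encoded = base64.b64encode(b"Hello world\n").decode("ascii")
sample = (
    'Content-Type: multipart/signed; boundary="b"\n'
    "\n"
    "--b\n"
    'Content-Type: text/plain; charset="UTF-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    f"{encoded}\n"
    "--b\n"
    "Content-Type: application/pgp-signature\n"
    "\n"
    "sig\n"
    "--b--\n"
)
text = first_text_part(message_from_string(sample))
```

Walking the parts avoids hard-coding `get_payload()[0]`, which only works when the text happens to be the first subpart.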
Hello,
I'm often hit by this bug, as I have several Debian instances where no e-mail
connection is available. So I save the report of reportbug, transfer
the file to another machine and then I need to massage the e-mail to
get it accepted (currently removing all but the base64 part, running
"base64 -d " on the remaining part and inserting the output back into
the original mail). I'm probably going to script it soon.
So it would be very helpful for me if either temporarily stored
e-mails of reportbug are not stored in base64 format at all or at
least configurable, so that I can always see the contents after "mutt
-H reportbug …".
Since all e-mails sent by mutt at least arrived in the BTS without
problems I do not see the point of encoding e-mails in base64 at all.
(But there may be use cases, so making this configurable would
probably be the best option).
Having the BTS accept the base64 encoded e-mails is suboptimal, as I
can no longer read the e-mails in mutt and sometimes I notice things
just before sending, prompting me to update the report in mutt.
Greetings
Helge
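The manual workaround described above (keep the headers, run the base64 body through a decoder, put the result back) could be scripted roughly as follows. This is only a sketch under several assumptions: the saved file is a single-part message, it uses LF line endings, and its body is entirely base64. The function name `decode_base64_mail` is hypothetical and not part of reportbug.

```python
import base64
import re

# Matches the header line that marks the body as base64-encoded.
_CTE_BASE64 = re.compile(
    rb"^Content-Transfer-Encoding:[ \t]*base64[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def decode_base64_mail(raw: bytes) -> bytes:
    """Return the mail with its base64 body replaced by the decoded text.

    Assumes a single-part message with LF line endings; anything else
    is returned unchanged.
    """
    head, sep, body = raw.partition(b"\n\n")
    if not sep or not _CTE_BASE64.search(head):
        return raw
    # b64decode ignores the line breaks inside the encoded body.
    decoded = base64.b64decode(body)
    # The decoded body may contain non-ASCII bytes, so declare it 8bit.
    head = _CTE_BASE64.sub(b"Content-Transfer-Encoding: 8bit", head)
    return head + b"\n\n" + decoded
```

After this transformation the body again starts with its original first line (for a reportbug draft, the "Package: " pseudo-header), and the file can be read and edited in a mail client before sending.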
control: reassign 853915 debbugs

Summary: Reportbug functionality to browse existing bugs fails with signed messages, where the encoded message is shown instead of decoded message text. While this is not too critical with quoted-printable, it is a real problem with base64 encoding.

Try:
reportbug -N 853037

reportbug internally does to show the first message in a report.

Example bugs where the problem can be seen are: #853037, #820649, #861168

Could the BTS SOAP interface be changed to return the decoded message body of signed messages? Being able to deal with all other kinds of complex MIME messages is not really necessary.
I've been looking at the tools interacting here and am not yet sure where the bug is.

Python-debianbts, when retrieving a bug log via the BTS SOAP interface, receives each buglog element (message) already split into header and body [get_bug_log]. If the body is base64-encoded, it gets decoded before the function returns the bug log. Python-debianbts also attempts to reconstruct something resembling the original full message by using the feedparser, and includes that in the buglog elements (dicts) it returns. I am not sure how reliable that message reconstruction is, but I suspect it is not perfect.

[get_bug_log]: https://github.com/venthur/python-debianbts/blob/master/debianbts/debianbts.py#L298

Now I'd like to understand the constraints better under which python-debianbts is operating: What exactly is the BTS supposed to deliver via SOAP as the message body part of the bug log? If the message is a simple text/plain email, is the body expected to be already decoded or not? If the message is some MIME/multipart construct, is the body then expected to be the main text message part only or should it just be everything that is not part of the main message headers?

I've been trying to look at the debbugs code to find the answer to these questions, but with limited success so far. Looking at lib/Debbugs/SOAP.pm in subroutine get_bug_log, it uses Debbugs::MIME's parse function to split the messages into header and body:

https://salsa.debian.org/debbugs-team/debbugs/-/blob/master/lib/Debbugs/SOAP.pm#L249

`parse` in turn uses `getmailbody`, which definitely tries to extract the main text message part and does not just dump everything that isn't part of the primary message headers. So either something does not work as expected there, or I'm simply looking at the wrong code and should be looking somewhere else.

Ideas?

In the meantime I have come up with a workaround for this in reportbug, but it would still be useful to know if everything else is working as intended or not.
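To make the reconstruction step concrete, here is a minimal sketch of rebuilding a message object from a separately delivered header and body with the standard library's `email.feedparser`. Whether python-debianbts does exactly this is an assumption; the function name `rebuild_message` and the sample strings are invented for illustration.

```python
from email.feedparser import FeedParser
from email.message import Message


def rebuild_message(header: str, body: str) -> Message:
    """Feed header and body to the parser as if they were one raw mail."""
    parser = FeedParser()
    parser.feed(header)
    # A blank line separates the header block from the body.
    parser.feed("\n\n")
    parser.feed(body)
    return parser.close()


# Hypothetical SOAP result in which the body is the raw, undecoded
# multipart body rather than the extracted text.
header = 'Content-Type: multipart/mixed; boundary="b"'
body = "--b\nContent-Type: text/plain\n\nhi\n--b--\n"
msg = rebuild_message(header, body)
```

If the body arrives undecoded, as in this sample, the rebuilt object is a proper multipart message and the text has to be pulled out of its subparts. If the server had already extracted the text, the same header would claim a multipart structure that the body no longer has, which is one reason the reconstruction may be unreliable.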
If this is the correct code, then why is it behaving differently when it is run to serve a SOAP request, as compared to running it directly, for the same email message? This is at least happening for multipart/signed messages, and likely also for other MIME messages.

For testing I am using the initial report mail from #853037. The BTS' SOAP interface delivers the message body as the entire undecoded body of the primary email message, as if the BTS did not understand multipart messages at all. See `reportbug -N 853037` for how the email body is then displayed by reportbug.

However, when I run this same message through Debbugs::MIME's `parse`, it correctly identifies the text/plain subpart and decodes and returns it. I've tested this by downloading the message mbox:

wget -O msg https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=853037;mbox=yes;msg=5

and then using the following perl script to call the `parse` function:

------- m.pl ------
#! /usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Debbugs::MIME qw(parse);

sub make_list {
    return map {(ref($_) eq 'ARRAY')?@{$_}:$_} @_;
}

local $/;
my $lines = <>;
my $message = parse $lines;
my ($header, $body) = map {join("\n", make_list($_))} @{$message}{qw(header body)};
print Dumper({header => $header, body => $body,});
-------------

Command line:

perl -I debbugs/lib/ m.pl < msg

Here the result shows a nicely identified and extracted message text.

Looking now at the code starting here:

https://salsa.debian.org/debbugs-team/debbugs/-/blob/master/lib/Debbugs/MIME.pm#L130

Somehow, when the code is run on the BTS server, the MIME::Parser seems to fail and the `parse` function code is falling back to the legacy pre-MIME code. Why?
otherwise you won't get the mbox.
control: reassign 853915 bugs.debian.org
control: affects 853915 reportbug

Bug summary:
- Browsing bug logs in reportbug is broken for some messages
- Bug log messages retrieved from the BTS via the SOAP interface are supposed to be decoded, but in these cases aren't.
- All MIME multipart messages are affected (e.g., messages with attachments, PGP/MIME signed messages)
- The debbugs code itself seems fine (AFAICS)

To check whether a problem with some old version of libmime-tools-perl could be behind this, I've tested this with the versions in stretch (oldstable) and jessie (old-oldstable), but couldn't reproduce the problem there either.

Thanks to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=910360 it is now also clear that I've looked at the correct code, and the code is actually working as intended in another debbugs installation. (Due to this bug (#853915), #910360 actually currently does not apply to Debian, because `get_bug_log` SOAP queries have been returning complete messages with all attachments since at least 2017.)

So the problem is specific to bugs.debian.org. Reassigning accordingly.
The reason is that the perl code on the BTS server is executed in taint mode, and MIME::Parser fails on multipart messages when run in taint mode. Adding the -T flag to the perl invocation in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=853915#65 reproduces the problem: The message body is not properly decoded.
control: tags 853915 + patch

There is a merge request with the fix on salsa: https://salsa.debian.org/debbugs-team/debbugs/-/merge_requests/10