#999620 pktanon: autopkgtest regression on armhf

#999620#5
Date:
2021-11-13 19:40:16 UTC
From:
To:
Dear maintainer(s),

With a recent upload of pktanon the autopkgtest of pktanon fails in
testing on armhf when that autopkgtest is run with the binary packages
of pktanon from unstable. It passes when run with only packages from
testing. In tabular form:

                        pass            fail
pktanon                from testing    2~git20160407.0.2bde4f2+dfsg-8
all others             from testing    from testing

I copied some of the output at the bottom of this report.

Currently this regression is blocking the migration to testing [1]. Can
you please investigate the situation and fix it?

More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=pktanon

https://ci.debian.net/data/autopkgtest/testing/armhf/p/pktanon/16640926/log.gz
-----------------------------------------------
pktanon --- profile-based traffic anonymization
-----------------------------------------------
initializing PktAnon,  configuration =
/usr/share/doc/pktanon/examples/profiles/profile.xml
unknown element: pktanon-config: 37
unknown element: anonymizations: 102
istream: opened file
/tmp/autopkgtest-lxc.tgw0nfyu/downtmp/build.ybV/src/profiles/sample.pcap
ostream: opened output file ./out.pcap
initialized
Bus error
autopkgtest [23:49:21]: test run-example

#999620#10
Date:
2021-11-14 10:03:37 UTC
From:
To:
Hi Paul,
[...]

I am puzzled. The recent upload only changed the watchfile and updated
Standards-Version, compat level etc -- packaging things. Nothing touched
the code or build rules.

Also, I can't reproduce the bus error when running the offending command
from the autopkgtest on a version I built on a porterbox:

(sid_armhf-dchroot)satta@abel:~/pktanon-2~git20160407.0.2bde4f2+dfsg$
../usr/bin/pktanon -c
../usr/share/doc/pktanon/examples/profiles/profile.xml
profiles/sample.pcap ./out.pcap
-----------------------------------------------
pktanon --- profile-based traffic anonymization
-----------------------------------------------
initializing PktAnon,  configuration =
../usr/share/doc/pktanon/examples/profiles/profile.xml
unknown element: pktanon-config: 37
unknown element: anonymizations: 102
istream: opened file profiles/sample.pcap
ostream: opened output file ./out.pcap
initialized
complete

statistics for input file 'profiles/sample.pcap'
  processed packets: 9
  errors in packets: 0
  elapsed time:      639us
  Mpps:              0.0141

I must admit that being unfamiliar with these architectures and not
really having an idea of where to start, I am tempted to just remove
armhf from the list of supported architectures and have the version with
the broken autopkgtest removed from unstable. Do you perhaps know
someone who might be more knowledgeable about such architecture-specific
issues?

Cheers
Sascha

#999620#15
Date:
2021-11-16 18:34:09 UTC
From:
To:
Hi Sascha,

Well, but maybe your build dependencies have changed. Also, a compat level
bump isn't totally safe in general either (although the issue here doesn't
obviously look like that).

Our armhf host is very powerful: it has 160 cores and 255GB RAM. Maybe
that makes it different enough from the porter box. (Then again, our other
extreme host (ci-worker13; amd64) processes the package fine, but that one
has *only* 48 cores and 256GB.)

We have porters for architecture-specific support. However, I'm not
totally convinced yet that it's architecture-specific.

Is there anything I can try out for you on our armhf host to help debug
the issue? Run the command with more debug options? Grab an output file
from somewhere? I could try to run the test in testing with a rebuild of
the package in testing, would that help?

Paul

#999620#20
Date:
2021-12-20 14:24:20 UTC
From:
To:
Hi Paul,

sorry for the delay in replying, I was quite busy and now I have some
free time over the holidays to follow up.

Yes, I agree it's likely not that.

[...]

Maybe. You noted that it seems to work fine on a machine with the same
architecture but different specs.
Hmmm. Since I am pretty unfamiliar with the source and/or any
assumptions that are being made in the code, a good start would be to
get an idea of where in the code the bus error is triggered. You could
try the -v option but I am not sure it would help much.
I think a real stack trace would help, by running the tests with
valgrind or via gdb. Nothing one would do in a generic test suite :/
How much customization would be possible in the test run?

Maybe, if that does not cost much then please try.

Cheers
Sascha

#999620#25
Date:
2021-12-22 20:56:46 UTC
From:
To:
Hi Sascha,

I think you misunderstood me then. We only have one host where we run
armhf tests (we do have other arm64 hosts, but we use those for arm64
testing).

root@elbrus:/tmp/autopkgtest-lxc.9j0ja_rt/downtmp/build.ssB/real-tree#
pktanon -vc /usr/share/doc/pktanon/examples/profiles/profile.xml
./profiles/sample.pcap ./out.pca
-----------------------------------------------
pktanon --- profile-based traffic anonymization
-----------------------------------------------
initializing PktAnon,  configuration =
/usr/share/doc/pktanon/examples/profiles/profile.xml
parsing configuration file...
unknown element: pktanon-config: 37
unknown element: anonymizations: 102
parsed configuration file.
configuring transformations...
         configuring ethernet packet:
                 mac-source[0]: AnonHashHmacSha1
                 mac-dest[0]: AnonHashHmacSha1
                 ethertype[0]: AnonIdentity
         configuring arp packet:
                 hardware-type[0]: AnonIdentity
                 protocol-type[0]: AnonIdentity
                 hardware-size[0]: AnonIdentity
                 protocol-size[0]: AnonIdentity
                 opcode[0]: AnonIdentity
                 sender-mac[0]: AnonHashHmacSha1
                 sender-ip[0]: AnonHashSha1
                 target-mac[0]: AnonHashHmacSha1
                 target-ip[0]: AnonHashSha1
         configuring ip(v4) packet:
                 tos[0]: AnonConstOverwrite
                 identification[0]: AnonIdentity
                 flags[0]: AnonIdentity
                 fragment[0]: AnonIdentity
                 ttl[0]: AnonWhitenoise
                 protocol[0]: AnonIdentity
                 src-ip[0]: AnonHashSha1
                 dest-ip[0]: AnonHashSha1
                 options[0]: AnonShorten
                         newlen: 0
         configuring ipv6 packet:
                 traffic-class[0]: AnonIdentity
                 flow-label[0]: AnonIdentity
                 next-header[0]: AnonIdentity
                 hop-limit[0]: AnonWhitenoise
                 src-ip[0]: AnonHashSha256
                 dest-ip[0]: AnonHashSha256
         configuring tcp packet:
                 source-port[0]: AnonRandomize
                 dest-port[0]: AnonRandomize
                 seq[0]: AnonWhitenoise
                 ack[0]: AnonWhitenoise
                 flags[0]: AnonIdentity
                 window-size[0]: AnonWhitenoise
                 urgent-pointer[0]: AnonConstOverwrite
                 options[0]: AnonShorten
                         newlen: 0
         configuring udp packet:
                 source-port[0]: AnonRandomize
                 dest-port[0]: AnonRandomize
         configuring icmp(v4) packet:
                 type[0]: AnonIdentity
                 code[0]: AnonIdentity
                 rest[0]: AnonIdentity
         configuring icmp(v6) packet:
                 type[0]: AnonIdentity
                 code[0]: AnonIdentity
                 rest[0]: AnonIdentity
                 target-address[0]: AnonHashSha256
         configuring payload packet:
                 payload[0]: AnonShorten
                         newlen: 0
configured
istream: opened file ./profiles/sample.pcap
ostream: opened output file ./out.pca
initialized
Bus error

(Mind you, this was pktanon/unstable built in testing and installed there).

I am very much not fluent in valgrind and gdb. Please provide the exact
commands you'd want me to execute and I'll see if I can get them.

I have a shell after the failure, so I can run what I want. I just don't
want to have to change packages and rebuild them first.

Paul

#999620#32
Date:
2022-04-13 20:20:49 UTC
From:
To:
Here is a backtrace of the armhf SIGBUS.

Note that not all ARM implementations raise SIGBUS on unaligned accesses,
which is probably why this was not reproducible on the porter machine.

# gdb --args pktanon -c /usr/share/doc/pktanon/examples/profiles/profile.xml ./profiles/sample.pcap ./out.pcap
[...]
Reading symbols from pktanon...
Reading symbols from /usr/lib/debug/.build-id/af/1ac53f46ae133c8898358966960cba95ac7a70.debug...
(gdb) run
Starting program: /usr/bin/pktanon -c /usr/share/doc/pktanon/examples/profiles/profile.xml ./profiles/sample.pcap ./out.pcap
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
-----------------------------------------------
pktanon --- profile-based traffic anonymization
-----------------------------------------------
initializing PktAnon,  configuration = /usr/share/doc/pktanon/examples/profiles/profile.xml
unknown element: pktanon-config: 37
unknown element: anonymizations: 102
istream: opened file ./profiles/sample.pcap
ostream: opened output file ./out.pcap
initialized

Program received signal SIGBUS, Bus error.
pktanon::TcpPacketTransformation::transform (this=<optimized out>, source_buffer=<optimized out>, destination_buffer=0xfffef35a "\212y\262X\335\300l\221", max_packet_length=40) at transformations/TcpPacketTransformation.cpp:88
88	  hton32 (output_header->ack_num);
(gdb) bt
#0  pktanon::TcpPacketTransformation::transform (this=<optimized out>,
    source_buffer=<optimized out>,
    destination_buffer=0xfffef35a "\212y\262X\335\300l\221",
    max_packet_length=40) at transformations/TcpPacketTransformation.cpp:88
#1  0x0040b77c in pktanon::IPv4PacketTransformation::transform (this=0x4b4eb0,
    source_buffer=<optimized out>, destination_buffer=0xfffef346 "E",
    max_packet_length=<optimized out>)
    at transformations/IPv4PacketTransformation.cpp:153
#2  0x0040af64 in pktanon::EthernetPacketTransformation::transform (
    this=0x4ad780, source_buffer=<optimized out>,
    destination_buffer=0xfffef338 "\376\212\a\213\001\254\303\341\372DI\355\b", max_packet_length=74) at transformations/EthernetPacketTransformation.cpp:53
#3  0x00416862 in pktanon::transform_packet (stats=...,
    packet_len=<optimized out>,
    transformed_packet=0xfffef338 "\376\212\a\213\001\254\303\341\372DI\355\b", original_packet=0xfffef438 "", record_header=...) at Utils.h:26
#4  pktanon::IstreamInput::read_packets (this=0x4b3ce0)
    at IstreamRecordsHandler.cpp:121
#5  0x00415130 in pktanon::PktAnonRuntime::run () at PktAnonRuntime.cpp:37
#6  0x00405bfa in main (argc=<optimized out>, argv=<optimized out>)
    at src/Main.cpp:73
(gdb)

So, this is trying to do an hton32() operation on a field that is not 4-byte
aligned.
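
A minimal sketch of one way such an access could be made alignment-safe,
assuming hton32() converts a 32-bit field in place (its upstream definition
is not shown in this report). The helper name hton32_unaligned is
hypothetical and not part of pktanon; it copies the bytes into an aligned
temporary with memcpy, converts with htonl(), and copies the result back:

```cpp
#include <arpa/inet.h>  // htonl
#include <cstdint>
#include <cstring>      // std::memcpy

// Hypothetical helper, not part of pktanon: convert the 32-bit value stored
// at `field` from host to network byte order without assuming that `field`
// is 4-byte aligned.
inline void hton32_unaligned(void *field) {
    std::uint32_t value;
    std::memcpy(&value, field, sizeof value);  // byte-wise load, no alignment requirement
    value = htonl(value);                      // swap on little-endian hosts, no-op on big-endian
    std::memcpy(field, &value, sizeof value);  // byte-wise store back into the packet
}
```

Compilers typically lower a fixed-size memcpy like this to whatever load and
store sequence is safe for the target, so the cost on architectures that
tolerate unaligned access should be small.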

#999620#37
Date:
2022-04-13 20:39:57 UTC
From:
To:
Note that this will consistently fail alignment checks on architectures
which require alignment, because the initial buffer is allocated with
reasonable alignment (32-bit) but the Ethernet header is 14 bytes long, so
the TCP header fields will always be unaligned within the buffer.
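
The arithmetic can be checked against the pointers in the backtrace. A small
sketch, assuming a 20-byte IPv4 header without options (consistent with the
0xfffef346 and 0xfffef35a pointers in frames #1 and #0) and the standard TCP
layout, where the acknowledgment number sits 8 bytes into the header. The
constant names are illustrative only:

```cpp
#include <cstdint>

// Addresses are taken from the gdb backtrace; header sizes are the standard ones.
constexpr std::uint32_t kPacketBuffer   = 0xfffef338u; // frame #2 destination_buffer
constexpr std::uint32_t kEthernetHeader = 14;          // dst MAC (6) + src MAC (6) + ethertype (2)
constexpr std::uint32_t kIpv4Header     = 20;          // assumption: no IP options
constexpr std::uint32_t kTcpAckOffset   = 8;           // ports (4) + sequence number (4)

constexpr std::uint32_t kIpv4Start = kPacketBuffer + kEthernetHeader;
constexpr std::uint32_t kTcpStart  = kIpv4Start + kIpv4Header;
constexpr std::uint32_t kAckField  = kTcpStart + kTcpAckOffset;

static_assert(kPacketBuffer % 4 == 0, "the buffer itself is 4-byte aligned");
static_assert(kIpv4Start == 0xfffef346u, "matches destination_buffer in frame #1");
static_assert(kTcpStart == 0xfffef35au, "matches destination_buffer in frame #0");
static_assert(kAckField % 4 == 2, "ack_num is 2 bytes past a 4-byte boundary");
```

Under these assumptions ack_num lands at 0xfffef362, so every 32-bit TCP
field is offset by 2 bytes from a 4-byte boundary whenever the packet starts
with a 14-byte Ethernet header.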

#999620#42
Date:
2022-04-14 07:09:53 UTC
From:
To:
Hi Steve,

Many thanks for reproducing this and for offering the detailed
explanation. I would be happy to forward your findings to upstream
(however, my previous issues/PRs on upstream's GitHub have gone
unanswered). For the time being, I must admit I unfortunately do not
have the time to fix it via a patch.

Do you think we should wait for this to be fixed? As I said before I
(just from my practical point of view) would be in favor of just
removing the problematic architectures.

Cheers
Sascha

#999620#49
Date:
2022-04-14 20:00:31 UTC
From:
To:
I have no opinion on this.  But if you want the package to be releasable,
you will need to change it so that it is not building a (completely broken
and useless) package on armhf, then get agreement with the ftp team to
remove the existing armhf binaries.

#999620#54
Date:
2022-04-18 17:46:34 UTC
From:
To:
Hi,

Yes, sure. Will file RM bugs right after an upload disabling the builds.

BTW, since you seem to be knowledgeable in the matter, can you think of
any other architectures I would need to exclude here other than armhf?
Just to ensure that I remove a sensible list of affected archs and
reduce potential rounds of additional RMs...

Thanks
Sascha

#999620#59
Date:
2022-04-18 18:14:36 UTC
From:
To:
The other architectures where alignment matters are all obsolete
architectures in Debian.  (alpha, hppa, powerpc, sparc are the ones that
come to mind.)  This could be an issue for running armel binaries on an
arm64 CPU, but I don't see any reason why someone would do that.

#999620#64
Date:
2022-06-05 23:43:55 UTC
From:
To:
Closing this since armel and armhf builds have been disabled and RMs
have been addressed, removing old builds for these archs. Closing this
bug should now unlock testing migration again.

Thanks
Sascha