#1006755 libarmci-mpi-dev: mpich test failures on s390x, mipsel

Package:
libarmci-mpi-dev
Source:
armci-mpi
Description:
ARMCI-MPI (Development version)
Submitter:
Drew Parsons
Date:
2022-03-04 11:39:04 UTC
Severity:
important
#1006755#5
Date:
2022-03-04 11:09:24 UTC
A handful of armci-mpi tests fail with mpich on s390x, mipsel and
mips64el.  The same tests pass with openmpi. The issue has been
raised upstream at https://github.com/pmodels/armci-mpi/issues/35

The failing tests are

FAIL: tests/mpi/test_mpi_dim
FAIL: tests/mpi/test_mpi_indexed_accs
FAIL: tests/mpi/test_mpi_indexed_gets
FAIL: tests/mpi/test_mpi_indexed_puts_gets
FAIL: tests/mpi/test_mpi_subarray_accs

They have much the same error message, e.g.

FAIL: tests/mpi/test_mpi_dim
============================

MPI test program (2 processes)

Testing strided gets and puts
(Only std output for process 0 is printed)
--------array[5]--------
local[1:3] -> remote[0:2] -> local[1:3]
Assertion failed in file src/mpi/datatype/typerep/dataloop/looputil.c at line 815: *lengthp > 0
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x2b3d76) [0x3ff7e2b3d76]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x1fc89e) [0x3ff7e1fc89e]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x1c6774) [0x3ff7e1c6774]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x1cce1c) [0x3ff7e1cce1c]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x256b2e) [0x3ff7e256b2e]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x2598e6) [0x3ff7e2598e6]
/usr/lib/s390x-linux-gnu/libmpich.so.12(+0x25be40) [0x3ff7e25be40]
/usr/lib/s390x-linux-gnu/libmpich.so.12(PMPI_Accumulate+0xa94) [0x3ff7e0f9044]
./tests/mpi/test_mpi_dim(+0x2980) [0x2aa1bf02980]
./tests/mpi/test_mpi_dim(main+0x6a) [0x2aa1bf0123a]
/lib/s390x-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x3ff7de24c5e]
./tests/mpi/test_mpi_dim(+0x1314) [0x2aa1bf01314]
internal ABORT - process 0
FAIL tests/mpi/test_mpi_dim (exit status: 1)



In discussion, upstream recommended running the mpich test/mpi
test suite when building or testing mpich. That might help catch
issues like this on less common architectures.

As a workaround I'll configure debian/tests to "information only"
(drop set -e) on s390x with mpich, so as not to hold up the other
architectures.
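
A minimal sketch of how such a workaround could look in a debian/tests
script. This is not the actual script from the package: the function
name failure_mode and the variables ARCH and MPI_IMPL are hypothetical,
and the fallback to uname -m is only there so the sketch also runs
outside a Debian system.

```shell
#!/bin/sh
# Sketch only: failure_mode, ARCH and MPI_IMPL are hypothetical names,
# not taken from the armci-mpi packaging.

# Print "info-only" when test failures should be reported but not fatal
# (mpich on s390x), and "fatal" for every other combination.
failure_mode() {
    arch="$1"
    mpi="$2"
    if [ "$mpi" = "mpich" ] && [ "$arch" = "s390x" ]; then
        echo "info-only"
    else
        echo "fatal"
    fi
}

# Detect the architecture; fall back to uname -m where dpkg is missing.
ARCH="${ARCH:-$(dpkg --print-architecture 2>/dev/null || uname -m)}"
# Assumed to be set by the caller; defaults to mpich for illustration.
MPI_IMPL="${MPI_IMPL:-mpich}"

# Only abort on the first failing command when failures are fatal.
if [ "$(failure_mode "$ARCH" "$MPI_IMPL")" = "fatal" ]; then
    set -e
fi

# The test commands themselves would follow here.
```

With this shape the failures still show up in the log on s390x with
mpich, but the script carries on instead of aborting at the first
failing command.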