Dear maintainer(s),
With a recent upload of openblas the autopkgtests of dolfin, gemma,
openmolcas, and xtensor-blas fail on ppc64el in testing when their
autopkgtest is run with the binary packages of openblas from unstable.
It passes when run with only packages from testing. In tabular form (for
dolfin):
                       pass            fail
openblas               from testing    0.3.30+ds-3
dolfin                 from testing    2019.2.0~legacy20240219.1c52e83-24
all others             from testing    from testing
I copied some of the output at the bottom of this report.
Currently this regression is blocking the migration of openblas to
testing [1]. Can you please investigate the situation?
Someone pointed me at https://github.com/OpenMathLib/OpenBLAS/pull/5463
which may be completely unrelated, but at least includes changes for ppc64el.
Paul
[1] https://qa.debian.org/excuses.php?package=openblas
https://ci.debian.net/data/autopkgtest/testing/ppc64el/d/dolfin/66420095/log.gz
284s Start 72: demo_singular-poisson_serial
285s 29/49 Test #72: demo_singular-poisson_serial ..............Subprocess aborted***Exception: 1.17 sec
285s terminate called after throwing an instance of 'std::runtime_error'
285s what():
285s
285s *** -------------------------------------------------------------------------
285s *** DOLFIN encountered an error. If you are not able to resolve this issue
285s *** using the information listed below, you can ask for help at
285s ***
285s *** https://fenicsproject.discourse.group/
285s ***
285s *** Remember to include the error message listed below and, if possible,
285s *** include a *minimal* running example to reproduce the error.
285s ***
285s *** -------------------------------------------------------------------------
285s *** Error: Unable to successfully call PETSc function 'KSPSolve'.
285s *** Reason: PETSc error code is: 76 (Error in external library).
285s *** Where: This error was encountered inside ./dolfin/la/PETScKrylovSolver.cpp.
285s *** Process: 0
285s ***
285s *** DOLFIN version: 2019.2.0.64.dev0
285s *** Git changeset: debian_2019.2.0~legacy20240219.1c52e83-24
285s *** -------------------------------------------------------------------------
285s
285s [ci-325-a328d227:14164] *** Process received signal ***
285s [ci-325-a328d227:14164] Signal: Aborted (6)
285s [ci-325-a328d227:14164] Signal code: (-6)
285s [ci-325-a328d227:14164] [ 0]
linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0) [0x3fffbb002494]
285s [ci-325-a328d227:14164] [ 1]
/lib/powerpc64le-linux-gnu/libc.so.6(+0xafd3c) [0x3fffb807fd3c]
285s [ci-325-a328d227:14164] [ 2]
/lib/powerpc64le-linux-gnu/libc.so.6(gsignal+0x2c) [0x3fffb801663c]
285s [ci-325-a328d227:14164] [ 3]
/lib/powerpc64le-linux-gnu/libc.so.6(abort+0x28) [0x3fffb7ff65f0]
285s [ci-325-a328d227:14164] [ 4]
/lib/powerpc64le-linux-gnu/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x158)
[0x3fffb83ae858]
285s [ci-325-a328d227:14164] [ 5]
/lib/powerpc64le-linux-gnu/libstdc++.so.6(+0x119ad4) [0x3fffb83a9ad4]
285s [ci-325-a328d227:14164] [ 6]
/lib/powerpc64le-linux-gnu/libstdc++.so.6(_ZSt9terminatev+0x20)
[0x3fffb835527c]
285s [ci-325-a328d227:14164] [ 7]
/lib/powerpc64le-linux-gnu/libstdc++.so.6(__cxa_throw+0x7c) [0x3fffb83a9fec]
285s [ci-325-a328d227:14164] [ 8]
/lib/powerpc64le-linux-gnu/libdolfin.so.2019.2t64(_ZNK6dolfin6Logger12dolfin_errorENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_S6_i+0xc1c)
[0x3fffbada1bbc]
285s [ci-325-a328d227:14164] [ 9]
/lib/powerpc64le-linux-gnu/libdolfin.so.2019.2t64(_ZN6dolfin12dolfin_errorENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_S5_z+0x184)
[0x3fffbad9e354]
285s [ci-325-a328d227:14164] [10]
/lib/powerpc64le-linux-gnu/libdolfin.so.2019.2t64(_ZN6dolfin11PETScObject11petsc_errorEiNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES6_+0x370)
[0x3fffbad72cc0]
285s [ci-325-a328d227:14164] [11]
/lib/powerpc64le-linux-gnu/libdolfin.so.2019.2t64(_ZN6dolfin17PETScKrylovSolver5solveERNS_11PETScVectorERKS1_b+0x1c44)
[0x3fffbad57ad4]
285s [ci-325-a328d227:14164] [12]
/lib/powerpc64le-linux-gnu/libdolfin.so.2019.2t64(_ZN6dolfin17PETScKrylovSolver5solveERNS_13GenericVectorERKS1_+0x58)
[0x3fffbad58068]
285s [ci-325-a328d227:14164] [13]
/tmp/autopkgtest-lxc.qr0i5fy5/downtmp/build.bm6/src/dolfin-demo/documented/singular-poisson/cpp/demo_singular-poisson(+0xb5c8)
[0x13e65b5c8]
285s [ci-325-a328d227:14164] [14]
/lib/powerpc64le-linux-gnu/libc.so.6(+0x26f0c) [0x3fffb7ff6f0c]
285s [ci-325-a328d227:14164] [15]
/lib/powerpc64le-linux-gnu/libc.so.6(__libc_start_main+0x1ac)
[0x3fffb7ff714c]
285s [ci-325-a328d227:14164] *** End of error message ***
On Saturday 22 November 2025 at 12:03 +0100, Paul Gevers wrote:
I talked to upstream about the problem (in an issue that was initially
about an FTBFS caused by a failure in OpenBLAS’s own test suite, which has
since been fixed):
https://github.com/OpenMathLib/OpenBLAS/issues/5372#issuecomment-3353517450
Unfortunately, upstream does not really know where the test failures in
third-party software come from. In particular, they can’t replicate the
issue (note that they tried with a more recent git snapshot than version
0.3.30), and neither could I with Debian version 0.3.30+ds-3 (tried on the
ppc64el Debian porterbox).
At this point, fixing this issue is beyond my time budget and skills (I
know next to nothing about PowerPC, and the issue is probably due to some
changes to the PowerPC assembly code).
CC’ing the Debian PowerPC porters, in the hope that they can help.
user debian-powerpc@lists.debian.org
usertag 1121177 ppc64el
thanks
Dear ppc64el porters,
We're in dire need of your help: the issue is stalling openblas' migration
to testing and, because it's a key package, autoremoval doesn't work.
Paul
Thanks for the ping. I’m currently reproducing the issue on the ppc64el side and investigating the root cause. Since openblas is a key package, this needs a proper fix rather than a workaround. Let me go through the bug and I’ll update with findings. Thanks, Trupti
Hello,
I tried building the package on different Power systems and observed a
machine-specific failure.
The build completes successfully on a POWER9 (p9) system, but fails
during the test phase on a POWER10 (p10) system.
On p10, the build fails with test errors:
RESULTS: 1522 tests (1518 ok, 4 failed, 0 skipped) ran in 565 ms
make[3]: *** [Makefile:87: run_test] Error 4
make[3]: Leaving directory
'/build/reproducible-path/openblas-0.3.30+ds/0-pthread/utest'
make[2]: *** [Makefile:177: tests] Error 2
make[2]: Leaving directory
'/build/reproducible-path/openblas-0.3.30+ds/0-pthread'
make[1]: *** [debian/rules:165: test_0-pthread] Error 2
make[1]: Leaving directory '/build/reproducible-path/openblas-0.3.30+ds'
make: *** [debian/rules:99: binary-arch] Error 2
dpkg-buildpackage: error: debian/rules binary-arch subprocess failed
with exit status 2
On p9, the package builds successfully, including all tests, and the
binary packages are generated as expected. This indicates that the issue
is specific to POWER10 rather than a general ppc64el failure.
I am currently investigating the failing tests on p10 to identify the
root cause and will share updates once I have more information.
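Since the result depends on the POWER generation of the build machine, it can help to record which generation a given host reports before comparing logs. The following is only a minimal sketch: it assumes that /proc/cpuinfo on ppc64el contains a line of the form "cpu : POWER10 ..." (the exact wording may differ between kernels), and the helper names power_generation and read_cpuinfo are hypothetical.

```python
import re
from typing import Optional


def power_generation(cpuinfo_text: str) -> Optional[int]:
    """Return the POWER generation (e.g. 9 or 10) found in cpuinfo text, else None.

    Assumption: the kernel reports a line such as
    "cpu : POWER10 (architected), altivec supported".
    """
    match = re.search(r"^cpu\s*:\s*POWER(\d+)", cpuinfo_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the cpuinfo file, returning an empty string if it is unavailable."""
    try:
        with open(path, encoding="ascii", errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    # Prints None on hosts that are not POWER machines.
    print("POWER generation:", power_generation(read_cpuinfo()))
```

Running this on the p9 and p10 hosts and keeping the output next to each build log makes it easier to tell afterwards which machine produced which result.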
For p9:
debian/rules override_dh_shlibdeps
make[1]: Entering directory '/path/openblas/openblas-0.3.30+ds'
dh_shlibdeps -plibopenblas0-pthread -plibopenblas0-openmp
-plibopenblas0-serial -- -xlibopenblas0
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 from: /lib64/ld64.so.2
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 to: /lib64/ld64.so.2.usr-is-merged
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 from: /lib64/ld64.so.2
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 to: /lib64/ld64.so.2.usr-is-merged
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 from: /lib64/ld64.so.2
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 to: /lib64/ld64.so.2.usr-is-merged
dh_shlibdeps -plibopenblas64-0-pthread -plibopenblas64-0-openmp
-plibopenblas64-0-serial -- -xlibopenblas64-0
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 from: /lib64/ld64.so.2
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 to: /lib64/ld64.so.2.usr-is-merged
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 from: /lib64/ld64.so.2
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 to: /lib64/ld64.so.2.usr-is-merged
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 from: /lib64/ld64.so.2
dpkg-shlibdeps: warning: diversions involved - output may be incorrect
diversion by libc6 to: /lib64/ld64.so.2.usr-is-merged
dh_shlibdeps --remaining-packages -a
make[1]: Leaving directory '/Path/openblas/openblas-0.3.30+ds'
dh_installdeb
dh_gencontrol
dh_md5sums
dh_builddeb
dpkg-deb: building package 'libopenblas0' in
'../libopenblas0_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas0-pthread' in
'../libopenblas0-pthread_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas0-pthread-dbgsym' in
'../libopenblas0-pthread-dbgsym_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas0-openmp' in
'../libopenblas0-openmp_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas0-openmp-dbgsym' in
'../libopenblas0-openmp-dbgsym_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas0-serial' in
'../libopenblas0-serial_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas0-serial-dbgsym' in
'../libopenblas0-serial-dbgsym_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas-dev' in
'../libopenblas-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas-pthread-dev' in
'../libopenblas-pthread-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas-openmp-dev' in
'../libopenblas-openmp-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0' in
'../libopenblas64-0_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas-serial-dev' in
'../libopenblas-serial-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0-pthread-dbgsym' in
'../libopenblas64-0-pthread-dbgsym_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0-pthread' in
'../libopenblas64-0-pthread_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0-openmp-dbgsym' in
'../libopenblas64-0-openmp-dbgsym_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0-openmp' in
'../libopenblas64-0-openmp_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0-serial' in
'../libopenblas64-0-serial_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-dev' in
'../libopenblas64-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-pthread-dev' in
'../libopenblas64-pthread-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-0-serial-dbgsym' in
'../libopenblas64-0-serial-dbgsym_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-serial-dev' in
'../libopenblas64-serial-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-deb: building package 'libopenblas64-openmp-dev' in
'../libopenblas64-openmp-dev_0.3.30+ds-3_ppc64el.deb'.
dpkg-genbuildinfo -O../openblas_0.3.30+ds-3_ppc64el.buildinfo
dpkg-genchanges -O../openblas_0.3.30+ds-3_ppc64el.changes
dpkg-genchanges: info: not including original source code in upload
dpkg-source --after-build .
dpkg-buildpackage: info: binary and diff upload (original source NOT
included)
Now running lintian openblas_0.3.30+ds-3_ppc64el.changes ...
Finished running lintian.
Thanks,
Trupti
Hello Paul,
I was able to reproduce the autopkgtest failure for xtensor-blas on ppc64el
locally, and I have attached both the failing and the working logs.
[ RUN ] xlinalg.pinv
/tmp/autopkgtest.nrAywe/autopkgtest_tmp/test_linalg.cpp:239: Failure
Value of: allclose(expected, res)
Actual: false
Expected: true
[ FAILED ] xlinalg.pinv (0 ms)
[----------] Global test environment tear-down
[==========] 77 tests from 6 test suites ran. (7 ms total)
[ PASSED ] 76 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] xlinalg.pinv
1 FAILED TEST
make[3]: *** [CMakeFiles/xtest.dir/build.make:70: CMakeFiles/xtest] Error 1
make[2]: *** [CMakeFiles/Makefile2:188: CMakeFiles/xtest.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:195: CMakeFiles/xtest.dir/rule] Error 2
make: *** [Makefile:192: xtest] Error 2
autopkgtest [23:35:52]: test command2: -----------------------]
autopkgtest [23:35:52]: test command2: - - - - - - - - - - results - - - - - - - - - -
command2 FAIL non-zero exit status 2
autopkgtest [23:35:52]: @@@@@@@@@@@@@@@@@@@@ summary
command1 FAIL non-zero exit status 2
command2 FAIL non-zero exit status 2
The failure occurs in the test:
xlinalg.pinv
test/test_linalg.cpp
When running the test locally on ppc64el with OpenBLAS 0.3.30, the maximum
numerical difference between the expected result and the output of
xt::linalg::pinv() is:
max diff ≈ 7.0e-09
mean diff ≈ 2.7e-09
With the current test tolerance (allclose default / 1e-12), the test fails.
When the tolerance is relaxed to 1e-8, the test passes consistently and all
results are numerically stable.
This indicates that the failure is due to the test tolerance rather than a
functional regression. Kindly consider reviewing the test tolerance.
Thanks,
Trupti
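To make the tolerance argument above concrete, here is a small sketch of how a deviation of about 7.0e-09 behaves under an absolute tolerance of 1e-12 versus 1e-8. It assumes a NumPy-style criterion, |actual - expected| <= atol + rtol * |expected|, with rtol set to 0; the exact rule and defaults used by the xtensor-blas test are not reproduced here, and the sample values are hypothetical, chosen only to match the reported maximum difference.

```python
def allclose(actual, expected, rtol=0.0, atol=1e-12):
    """NumPy-style element-wise closeness check (an assumption, see lead-in)."""
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


def max_abs_diff(actual, expected):
    """Largest absolute element-wise difference."""
    return max(abs(a - e) for a, e in zip(actual, expected))


# Hypothetical values: the largest deviation is about 7.0e-09, as in the report.
expected = [0.25, -0.5, 1.0]
actual = [0.25 + 7.0e-9, -0.5 - 1.0e-9, 1.0]

print("max diff:", max_abs_diff(actual, expected))
print("atol=1e-12:", allclose(actual, expected, atol=1e-12))
print("atol=1e-8: ", allclose(actual, expected, atol=1e-8))
```

With these inputs the strict check rejects the result and the relaxed one accepts it, which is the pattern described in the message. Whether 1e-8 is an acceptable bound for pinv() is a decision for the xtensor-blas maintainers.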
Dear Trupti,
On Wednesday 07 January 2026 at 02:19 +0530, Trupti wrote:
Thanks a lot for your investigation and for the recommendation.
If you have the time, could you also check whether the two other
autopkgtest regressions (in src:gemma and src:openmolcas) are
tolerance-related as well? (See https://tracker.debian.org/pkg/openblas for
the list of autopkgtest regressions.)
Yes, I will do it and share the results with you as soon as possible.
Thanks,
Trupti
Hi,
Other possible causes:
- IBM vs IEEE ldbl ABI
- -march=native or -mtune=native
Especially the latter sometimes does surprising things.
Simon
For src:gemma, the autopkgtest failure on ppc64el occurs during the
eigen-decomposition step.
The run reports a warning about many eigenvalues close to zero, followed
by an LU decomposition failure in GSL/LAPACK.
The failure is triggered in the following code path:
// LU decomposition.
void LUDecomp(gsl_matrix *LU, gsl_permutation *p, int *signum) {
  // debug_msg("entering");
  enforce_gsl(gsl_linalg_LU_decomp(LU, p, signum));
  return;
}
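As background for why near-zero eigenvalues and an LU failure tend to appear together, the sketch below runs a plain LU-style elimination with partial pivoting and reports the pivots. It is a generic textbook illustration, not GSL's or gemma's implementation, and the matrices are hypothetical: a rank-deficient matrix yields a zero pivot, and a nearly rank-deficient one yields a tiny pivot whose value is sensitive to rounding in the underlying arithmetic.

```python
def lu_pivots(matrix):
    """Gaussian elimination with partial pivoting; return the pivots in order."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    pivots = []
    for k in range(n):
        # Pick the row with the largest magnitude in column k (partial pivoting).
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        pivot = a[k][k]
        pivots.append(pivot)
        if pivot == 0.0:
            continue  # column is already zero below the diagonal
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return pivots


singular = [[1.0, 2.0], [2.0, 4.0]]               # second row = 2 * first row
nearly_singular = [[1.0, 2.0], [2.0, 4.0 + 1e-12]]

print("singular pivots:       ", lu_pivots(singular))
print("nearly singular pivots:", lu_pivots(nearly_singular))
```

In the nearly singular case the last pivot is of the order of the perturbation, so small differences in how a BLAS kernel rounds intermediate results can decide whether a library treats the matrix as singular.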
For src:openmolcas, the autopkgtest failures on ppc64el are limited to
CASPT2 tests (standard tests 009, 010 and hdf5 test 601).
The logs show floating-point exceptions (IEEE invalid, divide-by-zero,
underflow) followed by CASPT2 convergence failures (_NOT_CONVERGED_ /
_INTERNAL_ERROR_).
All non-CASPT2 tests complete successfully. The CASPT2 output itself
indicates numerical instability and suggests increasing
linear-dependence thresholds.
I have attached the relevant .out and .err files from the failing tests
for reference.
Running test standard: 005... (26%) OK
Running test standard: 006... (31%) OK
Running test standard: 009... (36%) Failed! (caspt2)
Running test standard: 010... (42%) Failed! (caspt2)
Running test standard: 011... (47%) OK
Running test standard: 012... (52%) OK
Running test standard: 014... (57%) OK
Running test standard: 015... (63%) OK
Running test standard: 019... (68%) OK
Running test standard: 023... (73%) OK
Running test standard: 025... (78%) OK
Running test standard: 026... (84%) OK
Running test standard: 028... (89%) OK
Running test standard: 029... (94%) OK
Running test hdf5: 601... (100%) Failed! (caspt2)
----> 009.err:
Note: The following floating-point exceptions are signalling:
IEEE_INVALID_FLAG IEEE_DIVIDE_BY_ZERO IEEE_UNDERFLOW_FLAG
Note: The following floating-point exceptions are signalling:
IEEE_UNDERFLOW_FLAG
Note: The following floating-point exceptions are signalling:
IEEE_INVALID_FLAG IEEE_DIVIDE_BY_ZERO IEEE_UNDERFLOW_FLAG
[ process 0]: xquit (rc = 96): _NOT_CONVERGED_
Note: The following floating-point exceptions are signalling:
IEEE_UNDERFLOW_FLAG
----> 009.out:
ATVX 3 Mu3.0001 Se3.004 -0.00118554
-0.00099545 -0.05076084 0.00005053
.....
.....
Total nr of CASPT2 parameters:
Before reduction: 1582
After reduction: 1488
Computing the right-hand side (RHS) elements
--------------------------------------------
Using conventional MKRHS algorithm
Variance of |WF0>: 0.0892508119
The contributions to the second order correlation energy in atomic
units.
-----------------------------------------------------------------------------------------------------------------------------
IT. VJTU VJTI ATVX AIVX VJAI
BVAT BJAT BJAI TOTAL RNORM
-----------------------------------------------------------------------------------------------------------------------------
1 -0.000526 -0.001147 -0.005220 -0.005485 -0.000292
-0.009164 -0.001570 -0.000613 -0.024018 0.043402
SIGMA D. ICASE1,ISYM1: 1 1
ICASE2,ISYM2: 14 5
Colossal value detected in SIGMA.
This implies that the thresholds used for linear
dependence removal must be increased.
Present values, THRSHN, THRSHS: 1.0000000000000000E-010
1.0000000000000000E-008
Use keyword THRESHOLD in input to increase these
values and then run again.
--- Stop Module: caspt2 at Thu Jan 8 04:02:30 2026 /rc=-6 ---
Thanks,
Trupti
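The CASPT2 message about raising the linear-dependence thresholds can be illustrated with a generic sketch of threshold-based removal. This is not OpenMolcas code: it assumes the common scheme in which eigenvalues s of an overlap matrix below a threshold are discarded and the remaining directions are scaled by 1/sqrt(s), and the eigenvalues are hypothetical. The function name remove_linear_dependence is also hypothetical.

```python
import math


def remove_linear_dependence(eigenvalues, threshold):
    """Keep eigenvalues >= threshold and report the worst-case 1/sqrt(s) scaling."""
    kept = [s for s in eigenvalues if s >= threshold]
    amplification = 1.0 / math.sqrt(min(kept)) if kept else 0.0
    return kept, amplification


# Hypothetical overlap eigenvalues; two of them are close to zero.
overlap_eigenvalues = [1.0, 1.0e-3, 2.0e-8, 5.0e-12]

for threshold in (1.0e-8, 1.0e-6):
    kept, amplification = remove_linear_dependence(overlap_eigenvalues, threshold)
    print(
        f"threshold={threshold:.0e}: kept {len(kept)} of "
        f"{len(overlap_eigenvalues)}, max scaling {amplification:.1f}"
    )
```

An eigenvalue just above the threshold is kept and scaled by a large factor, so a slightly different rounding of that eigenvalue can change which directions survive and how strongly noise is amplified. Raising the threshold removes such borderline directions, which is what the CASPT2 output suggests.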
On Wednesday 07 January 2026 at 23:35 +0530, Trupti wrote:
Do you consider the following a good summary of your analysis: there is
no structural problem in the new OpenBLAS on ppc64el (just slightly
different numerical results, within the usual tolerance of numerical
software), and as a consequence the adjustment needs to be made in the
test suites of the affected reverse dependencies (xtensor-blas, gemma and
openmolcas)?
(That seems clear from what you said about src:xtensor-blas and
src:openmolcas, less so for src:gemma, hence my question.)
Paul: if my statement above is correct, what would be the right course of
action?
Hi Sébastien,
The right action would be to get those packages fixed, e.g. by filing bugs
(cloning and reassigning might be appropriate). Once the bugs are in
place, I can hint src:openblas into testing.
For info: when I see a package break another package, I regularly file a
bug against both packages. I didn't do that in this case because there
seemed to be a pattern, and I believe that often means a problem in the
breaking package. But sometimes it's the other way around.
Paul