#1055750 gfortran: [armhf] Yield SIGBUS when compiling with -fstack-clash-protection

Package:
gfortran
Source:
gfortran
Description:
GNU Fortran 95 compiler
Submitter:
Rafael Laboissière
Date:
2023-12-30 06:33:06 UTC
Severity:
normal
Tags:
#1055750#5
Date:
2023-11-10 14:52:35 UTC
From:
To:
Dear Maintainer,

The motivation for the present bug report comes from Bug#1055228. Since
version 1.22.1 of dpkg-dev was released (on October 30), the plplot
package FTBFS because one of the Fortran examples, which is run as a
unit test during the package build, crashes.

The package built fine previously. The problem is triggered by the change
in dpkg-buildflags, which now includes -fstack-clash-protection in
FFLAGS.
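
To check whether a given build environment is affected, the flag can be
looked up in the output of dpkg-buildflags. This is only a minimal
sketch: the helper name has_stack_clash is hypothetical, and whether the
flag appears depends on the installed dpkg-dev version and on the
architecture.

```shell
#!/bin/sh
# Report whether a flag string contains -fstack-clash-protection.
# (has_stack_clash is a hypothetical helper, not part of dpkg-dev.)
has_stack_clash() {
    case " $1 " in
        *" -fstack-clash-protection "*) echo yes ;;
        *) echo no ;;
    esac
}

# dpkg-buildflags is shipped in dpkg-dev; use an empty string if it is absent.
if command -v dpkg-buildflags >/dev/null 2>&1; then
    fflags=$(dpkg-buildflags --get FFLAGS)
else
    fflags=""
fi

echo "FFLAGS: $fflags"
echo "stack-clash-protection enabled: $(has_stack_clash "$fflags")"
```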

I am attaching to this bug message a shell script that can reliably
trigger the bug on an armhf system. Here is the output:

    $ ./gfortran-stack-clash-protection-armhf-bug.sh
    […]
    Program received signal SIGBUS: Access to an undefined portion of a memory object.

    Backtrace for this error:
    Bus error

Note that the bug does not happen on amd64. Also, it does not happen on
armhf when the option -fstack-clash-protection is not used in the
invocation of gfortran.

As far as I can tell, the problem is due to a global variable (tr) that
is not correctly accessed in a private function (mypltr) of the x09f
program. Here is what gdb tells me:

    $ gdb x09f
    […]
    (gdb) run -dev ps -o /dev/null
    Starting program: /home/rafael/fortran/x09f -dev ps -o /dev/null
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

    Program received signal SIGBUS, Bus error.
    0x00400dfe in x09f::mypltr (x=0, y=1, xt=1, yt=34) at x09f.f90:193
    193             xt = tr(1) * x + tr(2) * y + tr(3)

My knowledge of Fortran and gfortran is too limited for me to debug the
problem any further. There may be a programming error in x09f.f90, or
this may be a problem with gfortran on armhf when the option
-fstack-clash-protection is given.

Any help will be appreciated.

Best,

Rafael Laboissière

#1055750#10
Date:
2023-11-13 16:13:30 UTC
From:
To:
The attached file bug-1055750.tgz contains a minimal example that
triggers the bug on an armhf system:

     $ ./run
     ***** Compile without -fstack-clash-protection & run
     /usr/bin/ld: warning: /tmp/cc9l2GJa.o: requires executable stack (because the .note.GNU-stack section is executable)
     ***** Compile with -fstack-clash-protection & run
     /usr/bin/ld: warning: /tmp/ccP0miz5.o: requires executable stack (because the .note.GNU-stack section is executable)

     Program received signal SIGBUS: Access to an undefined portion of a memory object.

     Backtrace for this error:
     Bus error

* Rafael Laboissière <rafael@debian.org> [2023-11-10 15:52]:

#1055750#15
Date:
2023-11-14 11:01:26 UTC
From:
To:
Hi Rafael,

Thanks! For the record, I can reproduce the issue in an armhf chroot, but
*not* on armel and arm64. The only thing to change in the reproducer is
the -I argument to gfortran:

armhf: /usr/lib/arm-linux-gnueabihf
armel: /usr/lib/arm-linux-gnueabi
arm64: /usr/lib/aarch64-linux-gnu
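
These three paths follow Debian's multiarch layout, /usr/lib/<triplet>.
As a sketch, the directory can be derived instead of being hard-coded.
The helper name multiarch_libdir is hypothetical, and its table covers
only the three architectures listed above.

```shell
#!/bin/sh
# Map a Debian architecture name to its multiarch library directory.
# Only the architectures mentioned in this report are covered.
multiarch_libdir() {
    case "$1" in
        armhf) echo /usr/lib/arm-linux-gnueabihf ;;
        armel) echo /usr/lib/arm-linux-gnueabi ;;
        arm64) echo /usr/lib/aarch64-linux-gnu ;;
        *) echo "unknown architecture: $1" >&2; return 1 ;;
    esac
}

# On a Debian system, dpkg-architecture (from dpkg-dev) reports the
# triplet of the host architecture directly.
if command -v dpkg-architecture >/dev/null 2>&1; then
    echo "/usr/lib/$(dpkg-architecture -qDEB_HOST_MULTIARCH)"
fi
```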

I'll see if I can find out more.

#1055750#20
Date:
2023-11-15 17:06:17 UTC
From:
To:
* Emanuele Rocca <ema@debian.org> [2023-11-14 12:01]:

Thanks for the followup, Emanuele.

I have investigated the issue a little further. Please find attached a
new tarball with a simplified version of the x09f.f90 file that still
triggers the bug. Running the resulting executable under gdb yields the
following:

     (gdb) run -dev ps -o /dev/null
     Starting program: /home/rafael/bug-1055750/x09f -dev ps -o /dev/null
     [Thread debugging using libthread_db enabled]
     Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

     Program received signal SIGSEGV, Segmentation fault.
     x09f::mypltr2 (x=0, y=1, xt=0, yt=7.300765e-43) at x09f.f90:48
     48              yt = tr(1)

In this new code, there are two private functions, mypltr1 and mypltr2.
The bug only happens when the latter is invoked; it assigns a value
coming from the global variable tr to yt. In mypltr1, the assignment is
done to xt instead of yt, and no segfault is triggered.

Best,

Rafael Laboissière

#1055750#25
Date:
2023-11-15 17:37:53 UTC
From:
To:
* Rafael Laboissière <rafael@debian.org> [2023-11-15 18:06]:

Oops, with the correct tarball this time.

Best,

Rafael Laboissière

#1055750#30
Date:
2023-11-15 17:45:10 UTC
From:
To:
* Rafael Laboissière <rafael@debian.org> [2023-11-15 18:37]:

Grrr, with the correct one now, hopefully!

R.

#1055750#35
Date:
2023-11-15 17:47:11 UTC
From:
To:
Hi Emanuele,

Our messages crossed.

* Emanuele Rocca <ema@debian.org> [2023-11-15 18:20]:

Does this mean that the origin of the bug is upstream, or that it may
still be a bug in gfortran?

Best,

Rafael Laboissière

#1055750#40
Date:
2023-11-15 19:11:40 UTC
From:
To:
Hello Rafael!

At this point we know for sure that the issue is not armhf-specific, and
also that it is not caused by stack-clash-protection. On the contrary,
enabling stack-clash-protection on armhf allowed us to discover a
problem that would have otherwise gone unnoticed.

Whether the bug is in plplot or gfortran I really have no idea. I would
tend not to think of a compiler issue unless we have some evidence in
that direction though, and suggest raising the issue with plplot
upstream. Showing them the x86 reproducer with -fsanitize=address should
be a good starting point.
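
As a rough sketch of how the three build variants discussed in this
thread differ, the script below only prints the gfortran command lines
and does not run them. SRC and EXTRA are placeholders: EXTRA would have
to carry the PLplot module path (-I...) and the link flags, which are
not spelled out in this report, and the output names x09f-* are
hypothetical.

```shell
#!/bin/sh
# Print (not execute) the gfortran command line for one build variant.
# SRC and EXTRA are placeholders to be supplied by the caller.
SRC=${SRC:-x09f.f90}
EXTRA=${EXTRA:-}

build_cmd() {
    case "$1" in
        plain) flags="-g" ;;                           # baseline build
        scp)   flags="-g -fstack-clash-protection" ;;  # crashes on armhf
        asan)  flags="-g -fsanitize=address" ;;        # AddressSanitizer build
        *) echo "unknown variant: $1" >&2; return 1 ;;
    esac
    echo "gfortran $flags${EXTRA:+ $EXTRA} -o x09f-$1 $SRC"
}

for variant in plain scp asan; do
    build_cmd "$variant"
done
```

Each resulting binary would then be run as in the gdb session above
(-dev ps -o /dev/null) to compare the behaviour of the variants.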

Thanks,
  Emanuele

#1055750#45
Date:
2023-11-16 06:51:19 UTC
From:
To:
* Emanuele Rocca <ema@debian.org> [2023-11-15 20:11]:

Ok, thanks.

FYI, I can reproduce the segfault on armhf and amd64 with
-fsanitize=address.

My guess is that the bug is in PLplot and not in gfortran, but this is
just a guess. I will inform the PLplot upstream authors about the
issue.

Best,

Rafael Laboissière

#1055750#50
Date:
2023-11-16 07:42:55 UTC
From:
To:
* Rafael Laboissière <rafael@debian.org> [2023-11-16 07:51]:

Done!

R.

#1055750#57
Date:
2023-11-16 08:30:05 UTC
From:
To:
Hi Rafael,

Thank you!

To be honest, I think it's safe to close 1055750 (gfortran) and mark
1055228 (plplot) as forwarded upstream. I don't think we have any
reason to believe the compiler is at fault.

  Emanuele

#1055750#62
Date:
2023-11-21 16:13:20 UTC
From:
To:
* Emanuele Rocca <ema@debian.org> [2023-11-16 09:30]:

Thanks for the suggestion, Emanuele.

I am hereby merging both bugs, which are now both assigned to plplot.

Best,

Rafael Laboissière

#1055750#71
Date:
2023-11-21 16:29:22 UTC
From:
To:
severity 1055750 serious
merge 1055750 1055228