#1060735 glib2.0/experimental: FTBFS on s390x and other 64-bit BE: gdatetime test fails or crashes

#1060735#5
Date:
2024-01-13 15:21:02 UTC
I recently uploaded a snapshot of GLib 2.79.x to experimental (in
preparation for NEW processing) and it failed tests on s390x and on
the 64-bit, big-endian ports ppc64 and sparc64. I suspect this means
it's a general problem with 64-bit BE, rather than specifically s390x.

The 32-bit big-endian powerpc and hppa ports seem to pass this test fine,
although hppa had an unrelated failure in a different test.

On the s390x buildd, the test crashed:

https://buildd.debian.org/status/fetch.php?pkg=glib2.0&arch=s390x&ver=2.79.0%2Bgit20240110%7Eg38f5ba3c-1&stamp=1705088035&raw=0

On ppc64, the same test failed with SIGBUS, but with otherwise similar symptoms.

I can sort of reproduce the failure on s390x porterbox zelenka, but instead
of segfaulting, the test failed with an assertion error involving dates with
a Japanese era marker:
[Everything passes, until...]

The sparc64 buildd saw the same assertion failure as on zelenka. I don't
know whether this is related or unrelated: if unrelated, then we can
clone this bug if necessary.

    smcv

#1060735#10
Date:
2024-01-13 16:21:33 UTC
git bisect says commit df4aea76 "gdatetime: Add support for %E modifier
to g_date_time_format()" is the first bad commit, which would be consistent
with it being...

... something to do with Japanese and Thai eras, and the %E modifier.

    smcv

#1060735#15
Date:
2024-01-13 19:32:56 UTC
Control: forwarded -1 https://gitlab.gnome.org/GNOME/glib/-/issues/3225
Control: tags -1 + help

I can't see anything in the relevant commit[1] that looks like it would be
affected by endianness. Could there be an endianness problem in one of the
glibc APIs that it's calling into, or something like that?

    smcv

[1] https://gitlab.gnome.org/GNOME/glib/-/commit/df4aea76204090f770a8fd90c2b68b51c2cfc2a3

#1060735#24
Date:
2024-01-15 13:12:26 UTC
Control: severity -1 important

[...] dates with the %E modifier (used in Japan and Thailand) on big-endian
64-bit, which reduces the severity of this bug to non-RC.

It looks as though:

- glibc documents nl_langinfo(ERA) as returning a semicolon-delimited list
  of eras

- but in fact it returns a NUL-delimited, double-NUL-terminated list of
  eras, such that parsing the list cannot be done without risking a read
  overrun, unless you either assume that the undocumented
  double-NUL-termination will be present or use the undocumented
  nl_langinfo(_NL_TIME_ERA_NUM_ENTRIES). GLib currently does the latter.

- GLib has, at least for now, prioritized its own usability for Japanese
  and Thai users higher than the design principle that it should not rely
  on undocumented APIs

- this is OK on 32-bit and on little-endian, but glibc's
  nl_langinfo(_NL_TIME_ERA_NUM_ENTRIES) returns what appears to be a
  wrong result on 64-bit big-endian architectures
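
To make the two parsing strategies above concrete, here is a minimal C
sketch. It is not GLib's code: count_entries(), nth_entry() and sample_list
are hypothetical names, and the entry text is made up rather than taken from
any locale. sample_list only mimics the NUL-delimited, double-NUL-terminated
layout that the list returned by nl_langinfo(ERA) is described as having.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for the buffer layout described above: three
 * entries, each ended by a NUL, plus the string literal's own final NUL,
 * which gives the double-NUL terminator. The entry text is made up. */
const char sample_list[] = "era-one\0era-two\0era-three\0";

/* Strategy 1 (hypothetical helper): count entries by walking until the
 * empty string that closes the list. This is only safe if the
 * double-NUL terminator is really present. */
size_t count_entries(const char *list)
{
    size_t n = 0;

    assert(list != NULL);
    while (*list != '\0') {
        list += strlen(list) + 1;   /* skip this entry and its NUL */
        n++;
    }
    return n;
}

/* Strategy 2 (hypothetical helper): return entry number `index`,
 * trusting a caller-supplied entry count as the only bound.
 * Returns NULL if `index` is out of range for that count. */
const char *nth_entry(const char *list, size_t n_entries, size_t index)
{
    assert(list != NULL);
    if (index >= n_entries)
        return NULL;
    while (index-- > 0)
        list += strlen(list) + 1;   /* step over the preceding entries */
    return list;
}
```

With the sample buffer, count_entries(sample_list) finds three entries and
nth_entry(sample_list, 3, 2) points at the last one. The second helper is
only as good as the count it is given: if the count is too large, it walks
past the end of the buffer, which is the read overrun described above.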

Discussion in GLib: https://gitlab.gnome.org/GNOME/glib/-/issues/3225

Workaround in GLib: https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3820

Related glibc bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31030

If there is a safe way to get this information from glibc, then GLib should
use that, but I don't know what that safe way would be.

    smcv