I tried enabling the Salsa CI team pipelines[1] for openjk[2] and found an unexpected result: in the second of two builds done by reprotest, openjk is compiled with an x86_64 compiler but appears to have CMAKE_SYSTEM_PROCESSOR (the equivalent of DEB_HOST_ARCH_CPU) set to i386. In openjk this results in the loadable modules (which have architecture-specific names) being named differently, causing extensive binary differences.

I can't be sure (it isn't currently logged by the salsa-ci-team pipeline[3]) but I think reprotest is using 'setarch i386', equivalent to linux32(1), for the second build.

I don't know which of the packages involved is at fault here, but this is surely a problem with at least one of them:

- maybe reprotest shouldn't be using linux32 to build packages for amd64 when DEB_BUILD_ARCH_CPU and DEB_HOST_ARCH_CPU are both amd64, because this causes uname(2) and DEB_BUILD_ARCH_CPU to be inconsistent, and causes debhelper's is_cross_compiling() to return false, which makes debhelper assume it doesn't need to override things like CMAKE_SYSTEM_PROCESSOR?

- or maybe debhelper's cmake build system should be forcing CMake to set [CMAKE_HOST_SYSTEM_PROCESSOR to a value appropriate for the DEB_BUILD_ARCH_CPU and?] CMAKE_SYSTEM_PROCESSOR to a value appropriate for the DEB_HOST_ARCH_CPU even when is_cross_compiling() returns false, so that CMake will obey those settings instead of calling uname(2)?

- or maybe it's CMake that should be doing something differently?

A similar situation would presumably appear if you used reprotest on other 64-bit architectures that have a non-default 32-bit personality, like aarch64/arm, powerpc64/powerpc and mips64el/mipsel.

Complicating factors:

* CMake cross-compiling terminology is not the same as is used in dpkg, so you can't usefully say "host" when discussing this issue without clarifying whose definition of host you are using.

  dpkg uses the GNU terminology (as seen in Autotools and Meson), where you compile on the build architecture, producing binaries suitable to be run on the host architecture (which might themselves in rare cases be cross-compilers that produce code for the target architecture). In CMake terminology, you compile on the host system (which is the GNU build system), producing binaries suitable to be run on the target system (which is the GNU host system).

* CMake doesn't document its taxonomy of CPUs (unlike for example dpkg, GNU and Meson, which each have a (subtly different!) canonical list), and it appears that it will be different on different OSs; but at least on Linux (and possibly GNU/kFreeBSD and GNU/Hurd?) it appears that in practice it uses (struct utsname).machine, which is the same thing as Linux uname -m.

[1] https://salsa.debian.org/salsa-ci-team/pipeline/
[2] https://tracker.debian.org/pkg/openjk
[3] https://salsa.debian.org/salsa-ci-team/pipeline/issues/56
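To keep the two vocabularies apart, the correspondence described above can be written down as a small lookup. This is only a sketch: the dictionary and function names are hypothetical, and the pairing of dpkg variables with CMake variables is the one suggested in this report, not something taken from debhelper or CMake code.

```python
# GNU role (dpkg, Autotools, Meson) -> CMake's word for the same machine.
GNU_TO_CMAKE_ROLE = {
    "build": "host",    # the machine you compile on
    "host": "target",   # the machine the resulting binaries run on
}

# dpkg variable -> CMake variable that plays the same role.
# Only the roles correspond; the values use different CPU vocabularies
# (for example dpkg says amd64 where uname -m says x86_64).
DPKG_TO_CMAKE_VARIABLE = {
    "DEB_BUILD_ARCH_CPU": "CMAKE_HOST_SYSTEM_PROCESSOR",
    "DEB_HOST_ARCH_CPU": "CMAKE_SYSTEM_PROCESSOR",
}


def cmake_role(gnu_role):
    """Translate a GNU role name into CMake's name for that machine.

    Raises KeyError for roles with no counterpart listed here
    (for example the GNU "target" of a cross-compiler).
    """
    return GNU_TO_CMAKE_ROLE[gnu_role]
```

For example, `cmake_role("host")` gives `"target"`, which is why an unqualified "host" is ambiguous in this discussion.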
Hi Simon,

There are very many (non-cmake) packages that rely on uname -m and many of them will misbuild in such a setting. Of course, every such use of uname -m needs to be worked around for cross building. In a perfect world, build systems wouldn't be using uname -m at all. This is not where we are today.

However, the reverse seems to be somewhat tolerated: performing a native i386 build in an i386 chroot on an amd64 kernel is commonly expected to work. In this setting, a linux32 variation makes some sense. A number of build systems (including meson) have code to handle this. As far as I know, the official buildds do linux32 for i386 to avoid problems.

So I'd first like to understand the rationale for this reprotest behaviour.

Other than that, my general advice would be to prefer $CC -dumpmachine over uname -m, as it avoids a whole host of problems. Getting there seems like a herculean task though.

Helmut
...
Yes, the policy I would have expected goes something like this:
* packages MUST do a successful non-cross build if $(uname -m) agrees with
the dpkg host architecture (dpkg says i386 and uname says i[3456]86, etc.)
* packages SHOULD do a successful non-cross build if $(uname -m) indicates
anything "better than" the baseline for the dpkg architecture (where
x86_64 > i386, mips64el > mipsel, arm64 > armhf > armel and so on)
* packages are not required to do a successful non-cross build if
$(uname -m) indicates something "worse than" or incompatible with the
baseline for the dpkg architecture (e.g. dpkg says amd64 and uname
says i[3456]86, or dpkg says ppc64 and uname says ppc64el or powerpc)
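The three rules above can be sketched as a small classifier. This is an illustration only: the tables cover just the x86 family, the names are hypothetical, and nothing here is taken from dpkg, debhelper or reprotest.

```python
# uname -m values that agree with each dpkg architecture (x86 only).
AGREES = {
    "i386": {"i386", "i486", "i586", "i686"},
    "amd64": {"x86_64"},
}

# uname -m values that are "better than" the baseline of each dpkg
# architecture, following the ordering x86_64 > i386 given above.
BETTER = {
    "i386": {"x86_64"},
    "amd64": set(),
}


def build_requirement(dpkg_arch, uname_machine):
    """Return how strongly a non-cross build is expected to succeed.

    Raises KeyError for dpkg architectures missing from the tables.
    """
    if uname_machine in AGREES[dpkg_arch]:
        return "MUST"
    if uname_machine in BETTER[dpkg_arch]:
        return "SHOULD"
    # Anything "worse than" or incompatible with the baseline.
    return "NOT REQUIRED"
```

Under this sketch, the situation in this bug (dpkg says amd64, uname says i686) falls into the last category: `build_requirement("amd64", "i686")` returns `"NOT REQUIRED"`.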
Yes. If it was up to me, I would recommend for official/production buildds
(which just want working binaries or a RC bug report) to wrap builds
for all 32-bit architectures in linux32. I think it's OK for QA builds
like reprotest, which want pedantic correctness more than working code,
to try doing 32-bit non-cross builds in a 32-bit chroot on a 64-bit build
machine without linux32, as long as it's understood that the resulting
bug report on failure might not be RC.
Is -dumpmachine portable among compilers, or is it a GNU'ism? If it's
specific to gcc, or specific to gcc and compilers like clang that mimic
gcc, or specific to compilers designed with the GNU/Autoconf vocabulary
of CPUs in mind, then I can see why upstreams targeting both GNU and
non-GNU OSs would avoid it.
As far as I can tell, CMake uses uname -m for its vocabulary of Linux CPUs
(but see https://bugs.debian.org/930995), so switching to using the CPU
part of $CC -dumpmachine would be an incompatible change, unless CMake
had a lookup table to map between GNU CPUs and what uname -m would have
said on the relevant machine. I think Meson did this better by having an
explicitly documented table of known CPU names; I would have preferred it
if Meson had reused GNU's vocabulary of CPU names rather than inventing
a new one, but it's too late for that.
Debian::Debhelper::Buildsystem::cmake effectively already does have
a lookup table to map GNU CPUs to uname -m (it's a list of exceptions
rather than a complete table, since in practice they usually match),
but it's currently only used when told to cross-compile, and reprotest's
builds are "officially" not cross-compiling, even if uname -m would
indicate otherwise.
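A lookup of the kind described, applied to the CPU part of $CC -dumpmachine, might look roughly like the sketch below. The exception table is illustrative and incomplete, it is not a copy of debhelper's table, and the function names are hypothetical. It also assumes a compiler that accepts -dumpmachine, which gcc and clang do.

```python
import subprocess

# GNU CPU names whose uname -m spelling differs on Linux.
# Illustrative and incomplete; CPUs not listed are assumed to match.
GNU_CPU_TO_UNAME_EXCEPTIONS = {
    "powerpc": "ppc",
    "powerpc64": "ppc64",
    "powerpc64le": "ppc64le",
}


def compiler_triplet(cc="cc"):
    """Ask the compiler for its target triplet, e.g. x86_64-linux-gnu."""
    result = subprocess.run(
        [cc, "-dumpmachine"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def uname_machine_from_triplet(triplet):
    """Map the CPU part of a GNU triplet to the matching uname -m value."""
    gnu_cpu = triplet.split("-", 1)[0]
    return GNU_CPU_TO_UNAME_EXCEPTIONS.get(gnu_cpu, gnu_cpu)
```

Because the answer comes from the compiler rather than the kernel, `uname_machine_from_triplet(compiler_triplet())` would not change when the build is wrapped in linux32.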
smcv
Simon McVittie: Hi, I am not sure I am any wiser on this bug after reading the bug log, other than that it looks unactionable to me at the moment (hence the moreinfo tag). If the conclusion is that debhelper should always pass -DCMAKE_SYSTEM_PROCESSOR (and/or -DCMAKE_SYSTEM_NAME) then I am happy to implement that after we have confirmed that this is "safe" (doesn't break every cmake package in sid, etc.). However, it is not clear to me whether that is the conclusion we reached. Thanks, ~Niels
Hi Simon and Helmut, Is there any update on this? At the moment, this bug is not actionable and sprayed over three packages. (last reply quoted in full so you don't have to look up the bug) Thanks, ~Niels
Hi Niels, No. I don't think there is any news here. I had hoped that my previous mails would have made my position clear. Let me try to be more explicit: I think that running native amd64 builds in linux32 is broken and that any tool doing so (e.g. reprotest) is buggy. Having debhelper pass CMAKE_SYSTEM_PROCESSOR is a nice idea, but doing so will break a fair number of packages in subtle ways. Therefore I recommend not doing that (other than for cross compilation). This is basically repeating what I already said. If you concur, the logical next step is reassigning to reprotest. Helmut
Helmut Grohne: Thanks for clarifying; I missed that part. Reassigning to reprotest Thanks, ~Niels
Hi reprotest maintainers, I'd like to point out that an additional impact of this bug is that it results in failures for packages that are designed to build on amd64 but not on i386. For example, bazel-bootstrap supports 64-bit processors but (currently) does not support 32-bit processors. This line [1] causes the second build to *almost* always fail because it tries to build a package on i386 even though that is not a supported architecture. For reference, on my amd64 machine running "setarch --list" returns: uname26 linux32 linux64 i386 i486 i586 i686 athlon x86_64 Eliminating x86_64, as line [1] does, gives a very high probability of the resulting architecture being incompatible with 64-bit-only packages. If this is not easy to fix, is there a recommended workaround to prevent false-positive failures? Thanks!
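The size of the problem can be estimated from the "setarch --list" output quoted above. The sketch below assumes that the second build picks uniformly among the remaining entries, which this report does not confirm, and that uname26, linux64 and x86_64 are the only entries that leave uname -m reporting a 64-bit machine. All names in it are hypothetical.

```python
from fractions import Fraction

# Output of "setarch --list" on the amd64 machine quoted above.
SETARCH_LIST = "uname26 linux32 linux64 i386 i486 i586 i686 athlon x86_64".split()

# Assumption: entries that keep uname -m reporting a 64-bit machine.
KEEPS_64BIT = {"uname26", "linux64", "x86_64"}


def fraction_32bit(candidates, native="x86_64"):
    """Fraction of non-native choices that switch to a 32-bit personality."""
    remaining = [name for name in candidates if name != native]
    thirty_two_bit = [name for name in remaining if name not in KEEPS_64BIT]
    return Fraction(len(thirty_two_bit), len(remaining))
```

Under those assumptions, six of the eight remaining entries are 32-bit, so `fraction_32bit(SETARCH_LIST)` is 3/4.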
control: tags -1 +help

Hi Olek, Sadly I have to admit that with these words you nailed the main current problem with reprotest: there are no real reprotest maintainers. I mean we, the reproducible builds folks within Debian, maintain the package more or less nicely; however, there is no one working on it "upstream", fixing bugs like this one. (The last person doing this has left the reproducible builds project.) If someone is looking for a useful FLOSS project to contribute to, reprotest could be it!
While revisiting the package where I originally saw this issue (openjk)
I noticed that reprotest 0.7.18 has this in its changelog:
Implement realistic CPU architecture shuffling
(commit reproducible-builds/reprotest@15e3c653). It looks like that
change might be intended to be a solution to this bug, or to an
independent rediscovery of this bug?
I've re-enabled the reprotest salsa-ci job for openjk and I'll see what
happens.
smcv