#1028251 xen: FTBFS when building xen binary packages for sid on x86_64

Package:
src:xen
Source:
src:xen
Submitter:
Chuck Zmudzinski
Date:
2023-01-14 18:54:02 UTC
Severity:
normal
Tags:
#1028251#5
Date:
2023-01-08 22:18:17 UTC
From:
To:
Dear Maintainer,

Hi,

I needed to test a patch to libxl so I started by trying to build
xen from source on an up-to-date sid installation.

The build failed:

   debian/rules override_dh_missing
make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
dh_missing --list-missing
dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists in debian/tmp but is not installed to anywhere
dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in debian/tmp but is not installed to anywhere

Please note that this output is after replacing the
line in debian/rules that is currently

dh_missing --fail-missing

with

dh_missing --list-missing

so the missing files only induce a warning instead of FTBFS.
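
The difference between the two modes can be modelled roughly as follows. This is a simplified Python sketch, not the debhelper implementation: the function name `dh_missing_model` and its arguments are made up for illustration. It only mirrors the documented behaviour that files present in debian/tmp but not installed into any package are either listed (--list-missing) or treated as a build failure (--fail-missing).

```python
def dh_missing_model(tmp_files, installed_files, fail_missing):
    """Toy model of dh_missing (hypothetical helper, not debhelper code).

    tmp_files:       paths found under debian/tmp
    installed_files: paths that were installed into some binary package
    fail_missing:    True models --fail-missing, False models --list-missing

    Returns (build_ok, messages).
    """
    # A file is "missing" when it exists in debian/tmp but no package took it.
    missing = sorted(set(tmp_files) - set(installed_files))
    messages = [
        f"{path} exists in debian/tmp but is not installed to anywhere"
        for path in missing
    ]
    # --list-missing only reports; --fail-missing turns any report into a failure.
    build_ok = not (fail_missing and missing)
    return build_ok, messages
```

With the same set of leftover files, only the `fail_missing` argument decides whether the build stops, which is why switching the option turns the FTBFS into warnings.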

So the workaround is this patch to debian/rules:
--- a/debian/rules      2023-01-08 16:36:01.605863417 -0500
+++ b/debian/rules      2023-01-08 05:31:24.000000000 -0500
@@ -329,7 +329,7 @@
 # By default, files in debian/tmp which are not handled by anything
 # in rules are ignored.  This lists them.
 override_dh_missing:
-       dh_missing --fail-missing
+       dh_missing --list-missing


 # We are dropping the config file /etc/default/xen which appeared in
-----------snip---------

I presume you know about this and plan to fix it before the
next upload, but perhaps a recent systemd update is causing
this so I am reporting it here.

I also request that if the missing systemd files cannot be
installed properly before the next upload of a new version
you apply a workaround such as this patch or another workaround
until the missing systemd files are installed and configured
correctly.

Kind regards,

Chuck

#1028251#10
Date:
2023-01-08 23:09:57 UTC
From:
To:
Sorry, the patch I posted in the original message will not apply properly.
I forgot I also edited the comment:

Here is the correct patch:
--- rules    2022-12-21 16:34:51.000000000 -0500
+++ rules.new    2023-01-08 05:31:24.000000000 -0500
@@ -327,9 +327,9 @@
         | xargs -0r gzip -9vn

 # By default, files in debian/tmp which are not handled by anything
-# in rules are ignored.  This makes them into errors.
+# in rules are ignored.  This lists them.
 override_dh_missing:
-    dh_missing --fail-missing
+    dh_missing --list-missing


 # We are dropping the config file /etc/default/xen which appeared in
----------snip----------

Thanks for all your work. Apart from this little problem, it appears
Xen 4.17 will work well on Bookworm.

Kind regards,

Chuck

#1028251#15
Date:
2023-01-09 13:09:34 UTC
From:
To:
Hi Chuck,

I cannot reproduce this error here locally and the CI build also succeeds:

https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577

How are you building the packages? In a clean build environment, using
for example sbuild or pbuilder, or in an environment where unrelated
other build dependencies could be present, that are not included in the
xen list, but maybe 'wake up and do something' if they're present?

You can also compare your own build output with the full one from the CI
job:

https://salsa.debian.org/xen-team/debian-xen/-/jobs/3767564/raw

Hans

#1028251#20
Date:
2023-01-09 17:44:50 UTC
From:
To:
thanks

I thought I had a fairly clean sid install, but I think the problem
on my system could be caused by some obscure grandfathered in
setting because the sid I am using was updated from all the way back to
an original install of jessie many years ago...

It might be time for me to refresh my sid with a clean installation.

Out of curiosity and if you have time, can you answer a couple of
questions if you know the answers?

1. Do the builds on a clean environment produce the missing files
listed in my build?

2. Are those systemd service files installed anywhere in the xen
binary packages, either in arch=x86_64 packages or for the arch=all
packages such as xen-utils-common?

If you don't know the answer to these questions I will investigate
myself to find the answers, so you can work on more important things.

As I said, I am building on a sid install that might have some
stuff grandfathered in from old releases going back to jessie.
I also might have some stale stuff around from my private builds
of the traditional device model available from xen that is not
part of the Debian packages. I will investigate these possible causes.

I use debuild as a frontend to dpkg-buildpackage to build the packages.

I will take a look at that when I get a chance.

This is not a real high priority for me, so I am content to let this
be until I get a chance to investigate the quirks of my current
installation of sid, and I also added the moreinfo tag, so you can
ignore this bug if you wish until I do some further research.

Cheers,

Chuck

#1028251#27
Date:
2023-01-09 17:55:55 UTC
From:
To:
Hi!

No, after my local package build, there's no such things in there:

~/build/xen/debian-xen/debian/tmp/usr/lib m (master) 1-$ ll
total 0
drwxr-xr-x 1 knorrie knorrie  110 Jan  8 23:51 debug
drwxr-xr-x 1 knorrie knorrie 2048 Jan  8 23:50 x86_64-linux-gnu
drwxr-xr-x 1 knorrie knorrie   20 Jan  8 23:51 xen-4.17

No, they are not:

https://packages.debian.org/search?searchon=contents&keywords=xenconsoled.service&mode=path&suite=unstable&arch=any

Yes. So (I'm not entirely sure how it works, but as example, just making
something up here): After doing something else first, you might end up
with a system that has for example dh-systemd-yolo-all-the-things-helper
installed. And, it might be that only it being present means that the
package build process changes. It might even be a 'feature' of that
helper... "just add it to your build depends, and it will automatically
do all the things for you!!!~``1"

This is why it is very much recommended to build the packages using
something like sbuild, so that you can be sure that every time it will
start with a super minimal chroot which only has some essential things,
and that the only build dependencies used will be the ones that are
explicitly defined in the debian/control of the package.

Sure, no problem.

Have fun,
Hans

#1028251#32
Date:
2023-01-09 18:08:44 UTC
From:
To:
Thanks for the advice - it is now on my TODO list to learn to use sbuild
or some other tool that makes it easy to do builds in a minimal chroot.

Kind regards,

Chuck

#1028251#37
Date:
2023-01-12 03:58:34 UTC
From:
To:
checking for LIBNL3... yes
checking for SYSTEMD... no
checking for SYSTEMD... yes
checking for SYSTEMD... no
checking for SYSTEMD... yes
checking for bison... /usr/bin/bison

On the CI build:

checking for LIBNL3... no
configure: WARNING: Disabling support for Remus network buffering and COLO.
    Please install libnl3 libraries (including libnl3-route), command line tools and devel
    headers - version 3.2.8 or higher
checking for SYSTEMD... no
checking for SYSTEMD... no
checking for bison... /usr/bin/bison

It looks like my system having libnl3 is the culprit. It is causing extra checks
for systemd that succeed on my system, but all checks for systemd fail on the
CI build. That is why on my system it is trying to install systemd files.
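
A comparison like the one above can be automated once both build logs are saved locally. The following is a minimal Python sketch under that assumption. The function names `configure_checks` and `differing_checks` are hypothetical, and only configure lines of the form "checking for NAME... yes" or "checking for NAME... no" are considered.

```python
import re

# Matches configure lines such as "checking for SYSTEMD... yes".
CHECK_RE = re.compile(r"^checking for (\S+)\.\.\. (yes|no)$")


def configure_checks(log_text):
    """Return {name: [results in log order]} for yes/no 'checking for' lines."""
    results = {}
    for line in log_text.splitlines():
        match = CHECK_RE.match(line.strip())
        if match:
            results.setdefault(match.group(1), []).append(match.group(2))
    return results


def differing_checks(local_log, ci_log):
    """Return {name: (local results, CI results)} for checks that differ."""
    local = configure_checks(local_log)
    ci = configure_checks(ci_log)
    return {
        name: (local.get(name, []), ci.get(name, []))
        for name in sorted(set(local) | set(ci))
        if local.get(name, []) != ci.get(name, [])
    }
```

Calling `differing_checks(open("local.log").read(), open("ci.log").read())` (file names are placeholders) lists every check whose outcome differs, which narrows down which installed development packages changed the build. A differing check is only a candidate cause and still needs to be confirmed.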

On my system:

chuckz@debian:~$ dpkg-query -l \*libnl\*
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                    Version      Architecture Description
+++-=======================-============-============-==========================================================
ii  libnl-3-200:amd64       3.7.0-0.2+b1 amd64        library for dealing with netlink sockets
ii  libnl-3-dev:amd64       3.7.0-0.2+b1 amd64        development library and headers for libnl-3
un  libnl-dev               <none>       <none>       (no description available)
ii  libnl-genl-3-200:amd64  3.7.0-0.2+b1 amd64        library for dealing with netlink sockets - generic netlink
ii  libnl-route-3-200:amd64 3.7.0-0.2+b1 amd64        library for dealing with netlink sockets - route interface
ii  libnl-route-3-dev:amd64 3.7.0-0.2+b1 amd64        development library and headers for libnl-route-3
un  libnl2-dev              <none>       <none>       (no description available)
un  libnl3-dev              <none>       <none>       (no description available)
chuckz@debian:~$ sudo apt-get -s remove libnl-3-200
...
The following packages will be REMOVED:
  ibverbs-providers iw libcephfs2 libibverbs-dev libibverbs1 libiscsi-dev libiscsi7 libnl-3-200 libnl-3-dev
  libnl-genl-3-200 libnl-route-3-200 libnl-route-3-dev librados-dev librados2 librbd-dev librbd1 librdmacm-dev
  librdmacm1 libvirt-glib-1.0-0 libvirt0 powertop virt-viewer wpasupplicant
...
chuckz@debian:~$

So I think this means xen currently suffers from ftbfs if wpa-supplicant or libvirt0 is installed.

libnl-3-200 is not an old package from a prior release; the amd64 version was uploaded on 9-6-2022:

https://snapshot.debian.org/package/libnl3/3.7.0-0.2/#libnl-3-200_3.7.0-0.2:2b:b1

Do you want me to look at the configure settings and see if I can find a
way to fix this?

Chuck

#1028251#44
Date:
2023-01-13 05:58:29 UTC
From:
To:
Regarding the systemd files causing ftbfs, this explains it:

https://salsa.debian.org/xen-team/debian-xen/-/blob/master/m4/systemd.m4#L119

and this:

https://salsa.debian.org/xen-team/debian-xen/-/blob/master/tools/configure.ac#L480

The comments indicate that using AX_AVAILABLE_SYSTEMD() will
by default enable systemd if systemd development files are on the
build system, and AX_ALLOW_SYSTEMD() means --enable-systemd
must explicitly be passed to tools/configure to enable it. Upstream
uses the former, so build systems with systemd development files
by default will ftbfs because that produces missing files that dh_missing
in debian/rules does not like.
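
The difference between the two macros, as described in those comments, can be summarised in a small decision function. This is a toy Python model based only on the description above, not a translation of the m4 code. The function name `systemd_enabled` and its parameters are made up for illustration, and the case where --enable-systemd is passed without the development files being present is deliberately not modelled.

```python
from typing import Optional


def systemd_enabled(macro: str, flag: Optional[bool], dev_files_present: bool) -> bool:
    """Toy model of whether configure ends up enabling systemd support.

    macro: "AX_AVAILABLE_SYSTEMD" or "AX_ALLOW_SYSTEMD"
    flag:  True for --enable-systemd, False for --disable-systemd,
           None when neither option is passed to tools/configure
    dev_files_present: whether systemd development files are on the build system
    """
    if macro not in ("AX_AVAILABLE_SYSTEMD", "AX_ALLOW_SYSTEMD"):
        raise ValueError(f"unknown macro: {macro}")
    if flag is not None:
        # Assumption: an explicit option always wins. A real configure run
        # with --enable-systemd would presumably also need the dev files.
        return flag
    if macro == "AX_AVAILABLE_SYSTEMD":
        # Default follows whatever happens to be installed on the build system.
        return dev_files_present
    # AX_ALLOW_SYSTEMD: off unless explicitly requested.
    return False
```

In this model, `systemd_enabled("AX_AVAILABLE_SYSTEMD", None, True)` is the situation on a build system with systemd development files installed, while passing `False` as the flag corresponds to an explicit --disable-systemd.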

So the reason there is ftbfs on my system is that my system has
the systemd development package installed.

After doing:

chuckz@debian:~$ sudo apt-get remove libsystemd-dev

the build succeeded on my sid system even with

dh_missing --fail-missing

in debian/rules.

Since systemd files are not installed in the packages, there
is no need to build them. So you might consider this patch, which
explicitly disables systemd with the override_dh_auto_configure
setting until the packages include the native systemd startup files.
It fixes this ftbfs even if libsystemd-dev is installed:
--- a/debian/rules    2022-12-21 16:34:51.000000000 -0500
+++ b/debian/rules    2023-01-12 20:49:35.282125205 -0500
@@ -206,6 +206,7 @@
         --with-libexec-leaf-dir=xen-$(upstream_version) \
         --disable-blktap1 \
         --disable-blktap2 \
+        --disable-systemd \
         --disable-qemu-traditional --disable-rombios \
         --with-system-qemu=/usr/libexec/xen-qemu-system-i386 \
         --enable-ovmf --with-system-ovmf=/usr/share/ovmf/OVMF.fd \

So it turns out the libnl-route-3-dev package is *not* the culprit of ftbfs,
so I was wrong about that in my earlier message.

I would consider that if you apply the above patch, you could mark this
bug as done.

I don't know if you consider the existence of the libnl-route-3-dev package
on the build system causing remus/COLO support being added a minor
reproducibility bug:

On my build:

checking for LIBNL3... yes
checking for SYSTEMD... no

On the CI build:

checking for LIBNL3... no
configure: WARNING: Disabling support for Remus network buffering and COLO.
    Please install libnl3 libraries (including libnl3-route), command line tools and devel
    headers - version 3.2.8 or higher
checking for SYSTEMD... no

Since those libnl3 libraries are not Build-Depends, the presence of those
packages on the Build system makes the build not reproducible when compared
to builds in the CI environment. Not a big problem, though, so I think you can
ignore it if you want as long as no one complains the remus/COLO feature is
missing in the Debian Xen packages.

Thanks,

Chuck

#1028251#49
Date:
2023-01-13 12:39:19 UTC
From:
To:
By the way, maybe a better fix would be to pass --enable-systemd, add a libsystemd-dev
build-dep, and list the systemd unit files in the package? They might require patching to
support Debian-specific upgrade machinery, though...

Not installing xendriverdomain.service is one of the things missing for
driver domains support
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=922033).

#1028251#54
Date:
2023-01-13 21:45:14 UTC
From:
To:
Hi Marek,

I wouldn't be against fixing it that way. In fact, I would prefer
that Debian packaged Xen with full support for native systemd units.
I am willing to wait until the package maintainers have
full systemd support in the Xen packages, if and when that happens.

Perhaps this is an opportunity for you to try to fix 922033 again.
I see it has been sitting there for a few years now. Let's see
what Hans thinks.

Kind regards,

Chuck

#1028251#59
Date:
2023-01-13 23:59:04 UTC
From:
To:
Hi,
[...]
Yolo style cutting out lines here...
[...]

Yeah, well, so, the thing here is...

When Debian started to package Xen (thanks! Bastian, in 200X), the
upstream init scripts were copy pasted, and adjusted to have the ability
to have different Hypervisor-ABI-incompatible versions installed at the
same time. Also, this is related to the collection of Makefile patches
we carry around to have ABI-incompatible stuff end up in a directory
like /usr/lib/xen-4.14/ and /usr/lib/xen-4.17/ !

What does this mean? Well, in the most basic sense it means that you
could apt-get (dist-)upgrade and then still be able to xl shutdown a
domU afterwards before doing reboot, because it will choose the right
tools which match with the ABI of the *now* running hypervisor instead
of being left with a dumpster fire, which in the end causes you to shout
curse words and cause you to have to go to the machine and hold the
power button for 5 seconds to force power it off.

This is the thing about where you upgrade from Xen 4.14 to Xen 4.17
during the upgrade from Debian 11/Bullseye to Debian 12/Bookworm, it
will allow you, if booting the whole new thing is a huge failure, to
reset the computer, and in grub, choose to use the previous Xen (and
possibly do that in combination with previous Debian linux kernel) and
then have a system where you can at least start your domUs again
*) and first have a good night of sleep before starting to dig
into what's going wrong.

So, this is exactly the same way of doing stuff like how you can also
reboot back into the previous Linux kernel (ABI-compatible) one during a
system upgrade, even if you're not using Xen at all!

I like this very much. This is the kind of thing that helps admins of
systems that have just local disks and a few domUs. Like, the case where
you support some non-profit organization with their server stuff running
on donated hardware. (Yes, I also do some of those, I do!) And, in case
something does fail (there could always be something like a misbehaving
mpt3sas card in the hardware or anything that no one else spotted yet),
the admin does not have to end up in total panic mode after doing the
upgrade on a Friday afternoon lying upside down inside a broom closet,
but they can just at least recover from the situation and have something
that's running again, and then a day later, or 2 or 3 days or a week
later return on another planned moment to fix it, after asking around.

Upstream Xen stuff doesn't have anything like that.

But, they actually look at us, and they think, ooh, this is actually
nice, we should have that also by default.

The fact that we have these changed/altered/divergent init scripts in
Debian is the main reason that we cannot just enable systemd things
which will put upstream whatever on the system.

So, what could we do about this?

The project plan (that could be drafted on an A4 paper) could look like,
gather around all distro maintainers of Linux distros that are shipping
Xen, and then search for a 'Project owner', which we totally need to be
someone that is actually employed at a company that actually cares about
getting the results of this.

Then we go look at the Debian stuff and fix upstream to do the same
thing, also fixing all the init/systemd stuff that got lost along the
way, and then we can push it down to all other distros as well again.

And after all of that is done, there will be a time where we actually
(at Debian) can have a different response to everything init script
related than "sorry, probably won't happen".

If you look at the init script stuff in Xen upstream already, it quickly
shows that it's just a total dumpster fire. Someone cobbled up something
in the year 2005, and after that, only changes have been made by people
who actually tried to touch it for a few seconds with a 10-foot pole.

So, nobody is really caring about this. There is not an actual person
upstream who is involved in this. As long as that's the case, it's not
a healthy thing for ourselves to start to try fixing all of that from a
downstream POV.

We're currently doing 'good' with the Debian Xen Team, I think. We can
keep up with security updates, and we're in some sort of OK position to
get the essential stuff into place for Bookworm, but to say it in good
Dutch "Nee, het houdt niet over...".

Knorrie

P.S. and if you think "I have no idea what you're rambling about
Knorrie, I have never experienced that problem", well, then thank you
for using Debian ;]

*) Yeah, this works for PV and PVH, but until the #$^$%& qemu stops
using internal unstable function calls any more so that it doesn't have
to hard depend on libxenmisc4.1X any more, you're still screwed for HVM.
BUT! if you just shut down the domUs nicely and reboot and you have to
go back, then dpkg -i or whatever the previous qemu and you can still
start all domUs again instead of going into full panic mode during the
night.

#1028251#64
Date:
2023-01-14 02:08:38 UTC
From:
To:
That is a nice feature of the Xen Debian packages, to have the ability
to manage guests on different versions of the hypervisor.

I understand the problem here.

I have noticed this problem, being a user of Xen. It would be nice if
there was a corporate contributor to Xen who cared about the free
software licensed version. It appears there is not such an entity
these days.

That's unfortunate.

I am quite satisfied with the work of the Debian Xen Team. While I said earlier
I would prefer systemd units, I can see that cannot happen anytime soon due to
circumstances beyond the control of the Debian Xen Team, and I am OK with that.
So, thank you for your efforts, and for taking the time to explain the situation
here. I will leave it to the Debian Xen Team's discretion what to do about
this bug. The ftbfs is not that big a problem to be concerned about, because it
is so easy to work around it.

Kind regards,

Chuck

#1028251#69
Date:
2023-01-14 17:08:22 UTC
From:
To:
As a week-end admin of a non-profit organization running Xen on donated
hardware (and having already spent one or two nights trying to get the
system back online for the morning), I thank you for this invisible work for us.

#1028251#74
Date:
2023-01-14 17:19:28 UTC
From:
To:
While Qubes release upgrade would benefit from this feature a bit (but
see remark at the end), I'm afraid it isn't high enough on our priority
list to dedicate enough time for this... In the long term, I'd rather
invest in making hypervisor ABI itself stable, so libxenX.Y would work
with Xen X.Y-n too. That's rather far away, but AFAIK it is on Xen
upstream roadmap.

IIUC, since Debian ships wrappers for various Xen tools that choose the
right version, just getting native systemd units shouldn't be that hard.
But yes, syncing those init scripts back together is substantially more
work.

FWIW, we have a bodge for 922033 as a package that patches some of those
init scripts:
https://github.com/qubesOS/qubes-vmm-xen-guest
(xendriverdomain.service is shipped via another package, for historical
reasons).

systemd units are likely in significantly better shape. They are
actually used in production at least by Fedora, Qubes and OpenSUSE,
contrary to legacy sysvinit scripts.

Unfortunately the same caveat applies to libvirt, and while qemu uses
only very few functions from the unstable API, with libvirt it's the
whole libxl, so it's very far from dropping that pain point...
So, if you manage Xen domains via libvirt, not xl directly, you're
screwed in PV and PVH case too.

#1028251#79
Date:
2023-01-14 18:50:16 UTC
From:
To:
"Totally need to be someone that is actually employed at a company." I am curious
about that statement. Has Debian given up on the idea that members of the FLOSS
community can band together and solve a problem like this without corporate
backing? I don't think other distros have given up on that idea: For example, Fedora has
its community spins, a KDE spin, for example. Perhaps a Xen spin at Fedora could lead
this project. But why not Debian, with all its derivative distros pitching in to help? I think
maybe the conclusion to draw from the statement that it has to be someone actually
employed at a company is really that there is not enough community support for
Xen to do this, Xen just is not a very big priority for the larger FLOSS community any
more. AFAIK, Qubes is the only project downstream of Xen that is serious about making
Xen work for desktop virtualization. Thanks, Marek, for the great work you have been
doing!

Kind regards,

Chuck