#1005308 Installation Report: Crash of Debian netinst iso in Xen HVM guest

#1005308#5
Date:
2022-02-10 23:47:53 UTC
From:
To:
Boot method: CD/DVD

Image version: First test: debian-11.0.0-amd64-netinst.iso and second
recent test:
https://cdimage.debian.org/cdimage/weekly-builds/amd64/iso-cd/debian-testing-amd64-netinst.iso
downloaded on 10 Feb 2022.

Date: First test Mon, 23 Aug 2021 21:17:18 -0400 and second test Thu 10
Feb 2022 03:01:58 PM -0500

Machine: First test DMI: Xen HVM domU, BIOS 4.14.3-pre 07/30/2021 and
second test DMI: Xen HVM domU, BIOS 4.14.4-pre 10/22/2021 (these are the
outputs of dmesg after successful installation in the Xen HVM)

Processor: Intel core i5-4590S, 4 virtual cpus in the Xen HVM domU guest

Memory: 3 GB allocated to the Virtual machine, 16 GB total on the
desktop system

Partitions: <df -Tl will do; the raw partition table is preferred>
user@bullseye:~$ df -Tl
Filesystem     Type     1K-blocks     Used Available Use% Mounted on
udev           devtmpfs   1496304        0   1496304   0% /dev
tmpfs          tmpfs       302628     2700    299928   1% /run
/dev/xvda3     ext4      99833320 29985520  64753516  32% /
tmpfs          tmpfs      1513128        0   1513128   0% /dev/shm
tmpfs          tmpfs         5120        0      5120   0% /run/lock
/dev/xvda1     vfat        101158     3478     97680   4% /boot/efi
tmpfs          tmpfs       302624      204    302420   1% /run/user/1000

Output of gdisk -l /dev/xvda:

Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/xvda: 209715200 sectors, 100.0 GiB
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): <REDACTED>
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 209715166
Partitions will be aligned on 2048-sector boundaries
Total free space is 4029 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
    1            2048          204799   99.0 MiB    EF00  EFI system partition
    2          204800          206847   1024.0 KiB  EF02  BIOS boot partition
    3          206848       203323391   96.9 GiB    8300  Linux filesystem
    4       203323392       209713151   3.0 GiB     8200  Linux swap

Output of lspci -knn (or lspci -nn):
user@bullseye:~$ lspci -knn
00:00.0 Host bridge [0600]: Intel Corporation 440FX - 82441FX PMC [Natoma] [8086:1237] (rev 02)
     Subsystem: Red Hat, Inc. Qemu virtual machine [1af4:1100]
00:01.0 ISA bridge [0601]: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II] [8086:7000]
     Subsystem: Red Hat, Inc. Qemu virtual machine [1af4:1100]
00:01.1 IDE interface [0101]: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II] [8086:7010]
     Subsystem: Red Hat, Inc. Qemu virtual machine [1af4:1100]
     Kernel driver in use: ata_piix
     Kernel modules: ata_piix, ata_generic
00:01.2 USB controller [0c03]: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] [8086:7020] (rev 01)
     Subsystem: Red Hat, Inc. QEMU Virtual Machine [1af4:1100]
     Kernel driver in use: uhci_hcd
     Kernel modules: uhci_hcd
00:01.3 Bridge [0680]: Intel Corporation 82371AB/EB/MB PIIX4 ACPI [8086:7113] (rev 03)
     Subsystem: Red Hat, Inc. Qemu virtual machine [1af4:1100]
     Kernel modules: i2c_piix4
00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device [5853:0001] (rev 01)
     Subsystem: XenSource, Inc. Xen Platform Device [5853:0001]
     Kernel driver in use: xen-platform-pci
00:03.0 VGA compatible controller [0300]: Device [1234:1111] (rev 02)
     Subsystem: Red Hat, Inc. Device [1af4:1100]
     Kernel driver in use: bochs-drm
     Kernel modules: bochs_drm

Base System Installation Checklist:
[O] = OK, [E] = Error (please elaborate below), [ ] = didn't try it

Initial boot:           [E]
Detect network card:    [ ]
Configure network:      [ ]
Detect media:           [ ]
Load installer modules: [ ]
Detect hard drives:     [ ]
Partition hard drives:  [ ]
Install base system:    [ ]
Clock/timezone setup:   [ ]
User/password setup:    [ ]
Install tasks:          [ ]
Install boot loader:    [ ]
Overall install:        [ ]

Comments/Problems: The problem manifests as an almost immediate crash of
the installer after booting the Xen HVM from the installation iso,
preventing the Debian installer from starting. Using the bullseye
netinst iso, IIRC it was possible to see the error that causes the crash
on the console by escaping into the grub configuration editor when the
grub menu is first displayed (press e, I think) and adding panic=60 to
the Linux kernel command line and then typing F10 (I think) to continue
the boot process. This will display the error message for 60 seconds
before the virtual machine dies.

Using the bookworm (testing) netinst iso it was possible to see the
error that causes the crash on the console by first selecting help from
the grub menu that first appears after booting from the iso and then by
typing install panic=60 at the boot prompt at the bottom of the help
screen and pressing enter. The error message will be displayed for 60
seconds, and then the virtual machine will destroy itself and the
console window will disappear.

The testing iso I actually booted and tested on 10 Feb 2022 displayed
this text at the top of the help screen:

This is a Debian 12 (bookworm) installation CD-ROM.
It was built 20220207-03:24; d-i 20220207-00:01:02.

I also attached a screenshot of the error (screenshot-#983357.png) when
booting the 20220207 Debian 12 installation CD-ROM to this report. It
shows a Cannot allocate memory error when trying to write uevent to
sysfs (this is the main symptom of #983357), and there is a Call Trace
of the resulting kernel panic, and also the version information of the
kernel on the installation CD is printed: 5.15.0-3-amd64 #1  Debian
5.15.15-2

After applying one of the workarounds described at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357#31 or at
https://lists.debian.org/debian-user/2021/08/msg01339.html or at
https://lists.debian.org/debian-user/2021/08/msg00917.html the
installation completes normally. Also, after a successful installation
of bullseye using one of the known workarounds or after a successful
upgrade to bullseye from buster, the system boots normally in a Xen HVM
guest and #983357 only manifests as the same boot error message about
failing to allocate memory when writing uevent to sysfs, but in this
case the uevent error is more or less harmless because there is *not* a
kernel panic and the system does *not* crash.
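
To check whether an installed guest is showing the harmless form of
#983357, the kernel log can be searched for the uevent allocation
failure. The following is a minimal shell sketch, not a definitive test:
the function name has_uevent_enomem is made up for illustration, and the
two search strings are assumptions based only on the description above
(a Cannot allocate memory error when writing uevent), since the exact
message text appears only in the screenshot.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the log text on stdin contains a line
# that mentions both "uevent" and "Cannot allocate memory"
# (case-insensitive). The exact message wording is an assumption.
has_uevent_enomem() {
    grep -i 'uevent' | grep -qi 'cannot allocate memory'
}

# Example use inside the installed guest (not run automatically):
#   sudo dmesg | has_uevent_enomem && echo "uevent ENOMEM message found"
```

A match only indicates that the message is present in the log; it does
not by itself distinguish the harmless case from the installer crash.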

This problem was first reported to BTS as
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357 on Mon, 22 Feb
2021 15:20:44 -0500. It is filed against src:linux and marked as
affecting debian-installer, so the Debian Installer Team should be aware
of this problem. Today I verified the problem still persists in the
latest weekly build of the bookworm amd64 netinst iso image and it
assuredly persists in the latest stable bullseye amd64 netinst iso
image. I am submitting this report to encourage the Debian Installer
Team to include this known bug in the official documentation for the
Debian bullseye installer and also the Debian testing installer as
appropriate.

There is a proposed fix (as opposed to workarounds while we wait for the
official fix) identified in #983357 which consists of an upstream patch
to the Linux kernel, but AFAICT the upstream Linux kernel developers
have not accepted the patch Debian has proposed to fix this bug after
about six months.

Since it does not appear #983357 will be fixed soon, I propose that
#983357 as it manifests itself in bullseye and bookworm debian-installer
iso images be added to the list of errata at
https://www.debian.org/releases/bullseye/debian-installer/index#errata
and to any corresponding pages for testing/bookworm if they exist.
-----Proposed text for the bullseye debian-installer errata page--------

Debian 11 installer using debian-11.0.0-amd64-netinst.iso (and later
versions) crashes in Xen HVM guests
Other iso images that include the debian-installer such as live images
might also be affected.
There are workarounds identified at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983357#31 and at
https://lists.debian.org/debian-user/2021/08/msg01339.html and at
https://lists.debian.org/debian-user/2021/08/msg00917.html
-----End of proposed text for debian-installer errata page--------------

For those unfamiliar with Xen but wanting to reproduce this problem, I
provide this brief tutorial:

1. If the Xen hypervisor and tools are not installed, install the
xen-system metapackage for your architecture. Something like sudo apt
install xen-system-amd64 should work on amd64. It will install
xen-utils, qemu-system-x86, and other necessary packages.

2. Disable/Uninstall other virtualization platforms such as Qemu/KVM,
Virtualbox, VMWare, etc.

3. Use the grub menu to reboot the system on the Xen hypervisor. The
installation of the Xen metapackage should cause grub to default to
booting on Xen. To check that this step is completed correctly, the
command sudo xl list should list Dom0 as a running domain. Dom0 is Xen
terminology for the machine that is responsible for creating and
managing virtual machines running on the Xen hypervisor.

4. Create an xl domain configuration file that boots the netinst iso in
a Xen HVM guest (domU). Unfortunately there is no up-to-date wiki or
howto available from either the Debian Project or the Xen Project. As a
hint, I have included the text of the domain configuration file I used
at the end of this report. That example assumes the Debian installer iso
has been downloaded and a raw image file to serve as the virtual disk
has been created as described at the end of this report.

5. The domain configuration file included in this report below uses vnc
to display the console, so install a vnc viewer to view the display from
the Xen HVM guest. If a GUI is installed in Dom0, install a vnc viewer
in Dom0 and point it at 127.0.0.1:5901 immediately after booting the
virtual machine using the xl create command. Otherwise, view the display
of the virtual machine in a vnc viewer running on another machine
connected to the network by pointing the vnc viewer to the IP address of
the machine that is running Dom0 (*not* the IP address of the Xen HVM
virtual machine!) and port 5901 (192.168.1.254:5901, for example).

6. Use the sudo xl create bookworm-install.cfg command to boot the
debian installer in the Xen HVM virtual environment.

Good places to look for more details about how to configure and run a
Xen HVM virtual machine in modern versions of Xen are the xl man page
and the xl.cfg man page, which are available in Dom0 after the Xen
metapackage is installed.

As a hint, I post here the contents of the domain configuration file,
with comments, named bookworm-install.cfg I used to reproduce #983357
with an up-to-date bookworm testing netinst iso:

user@bullseye:~$ cat bookworm-install.cfg
# Domain configuration to reproduce Debian Bug #983357
# This creates a Xen HVM guest
builder = 'hvm'
# This is virtual firmware to emulate legacy BIOS MBR booting
# Another option is ovmf, which emulates UEFI booting, but it is a
# little buggy
firmware = 'seabios'
memory = '3072'
vcpus = '4'
# This is to install Debian bookworm (testing) using the official
# Debian netinst iso
disk = ['/home/Public/images/bookworm.img,,xvda,w','/home/Public/images/debian-testing-amd64-netinst.iso,,xvdc,cdrom']
name = 'bookworm-install'
# We do not need a network device to reproduce #983357
# vif = [ 'model=e1000' ]
on_poweroff = 'destroy'
# Avoid endless reboot loop with the crash caused by #983357
on_reboot = 'destroy'
on_crash = 'destroy'
# Set CD Drive as first in the boot order
boot = 'dc'
# Some common options for Xen HVM guests
acpi = '1'
apic = '1'
viridian = '1'
xen_platform_pci = '1'
serial = 'pty'
# Use a standard emulated VGA device with 16 MB shared video memory
vga = 'stdvga'
videoram = '16'
# Use the builtin Xen vnc server for the display
vnc = '1'
# Make vnc listen on all IPv4 interfaces
vnclisten = '0.0.0.0'
# Use port 5901 for vnc display connections
vncdisplay = '1'
# Use the Qemu device model mouse/tablet emulated USB device
usb = '1'
device_model_args_hvm = [ '-device', 'usb-tablet' ]
---------End of domain configuration file-------------------

Of course this domain configuration presumes that the
/home/Public/images/bookworm.img and
/home/Public/images/debian-testing-amd64-netinst.iso virtual disk and
virtual CD image files exist. The .iso file is the official Debian 12
testing iso downloaded from Debian's CD installation servers on 10 Feb
2022. The /home/Public/images/bookworm.img can be created using
something like
dd if=/dev/zero of=/home/Public/images/bookworm.img bs=1M count=10000
to make a virtual disk image of about 10 GB.

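
The image-creation and boot steps above can be collected into a small
shell sketch. It is only an illustration under stated assumptions: the
function names make_disk_image and vnc_port are hypothetical, the dd
invocation mirrors the one given in this report (GNU dd syntax), and the
port arithmetic relies on the usual VNC convention that display N
listens on TCP port 5900+N, which matches vncdisplay = '1' and port 5901
above.

```shell
#!/bin/sh
# Hypothetical helper: create a zero-filled raw disk image of SIZE_MIB
# MiB, using the same dd approach as the report. Refuses to overwrite an
# existing file.
make_disk_image() {
    img=$1
    size_mib=$2
    if [ -e "$img" ]; then
        echo "refusing to overwrite $img" >&2
        return 1
    fi
    dd if=/dev/zero of="$img" bs=1M count="$size_mib" 2>/dev/null
}

# Hypothetical helper: map a VNC display number to its TCP port (5900 + N).
vnc_port() {
    echo $((5900 + $1))
}

# Example sequence on Dom0 (not run automatically; paths as in the report):
#   make_disk_image /home/Public/images/bookworm.img 10000
#   sudo xl list                          # Dom0 should be listed as running
#   sudo xl create bookworm-install.cfg   # boots the installer iso
#   echo "connect a vnc viewer to 127.0.0.1:$(vnc_port 1)"
```

The overwrite check is a design choice for safety only; it is not part
of the original instructions.
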
#1005308#10
Date:
2022-08-19 15:18:11 UTC
From:
To:
Elevating severity to important. The bookworm freeze is approaching and this has been ignored for over six months.

Thanks,

Chuck

#1005308#17
Date:
2022-12-26 00:44:46 UTC
From:
To:
This is still a problem to this day and the bookworm freeze is right
around the corner...

#1005308#22
Date:
2023-06-15 17:28:51 UTC
From:
To:
This issue persists in the current Debian 12 installer: debian-12.0.0-amd64-netinst.iso

#1005308#27
Date:
2023-08-21 07:43:48 UTC
From:
To:
On Thu, 15 Jun 2023 18:28:51 +0100 Abhinav Praveen  <abhinav@praveen.org.uk> wrote:
 > This issue persists in the current Debian 12 installer: debian-12.0.0-amd64-netinst.iso
 > --
 > Abhinav Praveen
 >
 >

The issue is also in the latest installer, debian-12.1.0-amd64-netinst.iso