#1012210 linux-image-5.10.0-14-amd64: Kernels of Bullseye and Testing (5.10 and 5.17) hang at boot

Package:
src:linux
Source:
src:linux
Submitter:
Markus Kolb
Date:
2026-03-15 14:37:02 UTC
Severity:
normal
Tags:
#1012210#5
Date:
2022-06-01 10:17:56 UTC
From:
To:
Dear Maintainer,

I've updated an older typewriter computer from Stretch to Buster and
from there to Bullseye.

The kernel linux-image-4.19.0-20-amd64 from Buster works.
This one is used at the moment for reportbug!

But anything from the newer versions hangs during boot.
I've tried the release kernel of Bullseye,
the security-updated linux-image-5.10.0-14-amd64,
the newest one from backports
and the latest version from testing, which has been a 5.17 version.

I've tried to start up with the boot option boot_delay=1000, but then it
already hangs/crashes right after the line about loading the initial
ramdisk. Afterwards I can only power-cycle the computer; even the Num Lock
key is dead.

Without the boot_delay option, the kernel output scrolls by quickly, so
I've filmed it with my camera.

Between working and crashing kernel there is some difference in the SATA
ports. And these are the last lines of output before the screen becomes
black and there isn't any reaction of the computer any longer.

This is a working boot with the Buster-kernel:

[    1.800181] hub 7-0:1.0: USB hub found
[    1.800226] hub 7-0:1.0: 2 ports detected
[    1.816820] scsi host1: ahci
[    1.817122] scsi host3: ahci
[    1.817395] scsi host4: ahci
[    1.817663] scsi host5: ahci
[    1.817939] scsi host6: ahci
[    1.818297] scsi host7: ahci
[    1.818430] ata3: SATA max UDMA/133 abar m2048@0xf01a6000 port
0xf01a6100 irq 25
[    1.818489] ata4: SATA max UDMA/133 abar m2048@0xf01a6000 port
0xf01a6180 irq 25
[    1.818547] ata5: DUMMY
[    1.818582] ata6: DUMMY
[    1.818618] ata7: SATA max UDMA/133 abar m2048@0xf01a6000 port
0xf01a6300 irq 25
[    1.818678] ata8: SATA max UDMA/133 abar m2048@0xf01a6000 port
0xf01a6380 irq 25
[    1.819353] scsi host2: ata_generic
[    1.819448] ata1: PATA max UDMA/100 cmd 0x1218 ctl 0x1240 bmdma
0x1200 irq 18
[    1.819494] ata2: PATA max UDMA/100 cmd 0x1220 ctl 0x1244 bmdma
0x1208 irq 18
[    1.879056] pci 0000:00:00.0: Intel Q35 Chipset
[    1.879122] pci 0000:00:00.0: detected gtt size: 524288K total,
262144K mappable
[    1.879855] pci 0000:00:00.0: detected 8192K stolen memory
[    1.879947] [drm] Replacing VGA console driver
[    1.880476] Console: switching to colour dummy device 80x25
[    1.880940] [drm] ACPI BIOS requests an excessive sleep of 1124034056
ms, using 1500 ms instead
[    1.884726] [drm] Supports vblank timestamp caching Rev 2
(21.10.2013).
[    1.884730] [drm] Driver supports precise vblank timestamp query.


With the Bullseye kernel I can't see the lines for scsi hosts 1 and 3.
But maybe the order is different and they are not visible before the
screen goes black. So maybe it is also a problem with console switching
and i915 graphics?

I'll upload the video to the bug report once it has been opened.

I hope you can help me?! Do you know some boot options for the kernel I
could try to get the newer kernels to work?

br
Markus

#1012210#10
Date:
2022-06-01 10:32:31 UTC
From:
To:
On 01.06.2022 12:17, Markus Kolb wrote:
[...]
[...]

Attached are a screenshot of the last output and the video, in which the
screen goes black and the system crashes at around 0:12.

#1012210#15
Date:
2022-06-01 11:23:26 UTC
From:
To:
Control: found -1 linux/5.17.3-1

Bug https://bugs.debian.org/1006149 was also about a boot failure (and SATA)
and that got fixed in version 5.17.6-1, but due to the openssl transition, that
didn't get into Testing (which currently has 5.17.3-1).
The current version in Sid/Unstable is 5.17.11-1 and it would be useful if you
could test that as well.

This does seem like a different issue, as it happens with much older kernels
than that bug report, so I don't expect it to fix it.
But it's still useful info and it may result in some extra info.

#1012210#22
Date:
2022-06-01 11:34:39 UTC
From:
To:
Could you try to unplug any peripherals that are not strictly needed?
IOW: only attach a keyboard and monitor and see if that makes a difference.

#1012210#27
Date:
2022-06-01 15:22:35 UTC
From:
To:
On 01.06.2022 13:34, Diederik de Haas wrote:

Hey Diederik,

I've just tested
   linux-image-5.17.0-2-amd64            5.17.6-1+b1
but it doesn't boot either.

The computer has only a USB mouse attached. The keyboard is PS/2. Apart
from that, only the Ethernet cable and VGA are connected.

In the meantime I've built
   linux-image-5.10.119
from kernel.org sources and it boots successfully.

I've used localmodconfig while running the
   linux-image-4.19.0-20-amd64           4.19.235-1
and afterwards mostly accepted the defaults for new config options, but I
also deactivated many options and modules that I was quite sure I don't
need, to make the build finish faster.

I've attached dmesg output, config and lsmod output for my 5.10.119.
Maybe it helps to find the right patch.
Next I'll build 5.10.114, which corresponds in date to 5.17.6, and see
whether I can find the version containing the fix by working upwards.
Or does someone already have an idea what it could be? :-)

cu
Markus

#1012210#32
Date:
2022-06-01 16:56:01 UTC
From:
To:
Hi Markus,

I was expecting 5.17.11-1, but that version does have the SATA fix from the
other bug report too. And there's another thing to focus on ...

Ok, that is fine.

... and this is VERY significant (afaict) :-)

The exact implications of this are 'above my pay grade', but hopefully one of
the kernel maintainers (who should understand this) chimes in.

Via https://packages.debian.org/bullseye/linux-config-5.10 I retrieved the
Debian kernel config for 5.10.106-1 (which is likely close enough) and compared
it with the config you attached.

The diff was *huge*, but the fact that you were able to boot your self-built
5.10 kernel while the Debian 5.10 kernel failed, points (strongly) towards a
Debian kernel configuration difference which is the cause of this bug.

I have no idea how to make any intelligent recommendations wrt kernel config
changes, so I have to defer to people 'smarter' than me (wrt this).

Building 5.10.113 with your custom config sounds like a good test case.
Debian's 5.10.113 didn't boot (with Debian's config), but if the (exact) same
version with a different config does work, then it seems almost certain to me
that the bug is in the Debian kernel configuration.

Cheers,
  Diederik

#1012210#37
Date:
2022-06-02 13:42:54 UTC
From:
To:
Because the kernel.org 5.10.113 with my stripped-down config runs
successfully,
I've rebuilt this kernel.org 5.10.113 with the Debian config
/boot/config-5.10.0-14-amd64 via make oldconfig.
There are some differences between kernel.org and Debian in the config;
I've put the config diff bug-1012210-kernel-config-changes.patch in the
attached tar.xz.
This kernel also boots successfully.

I think the problem is introduced by a Debian patch which handles the
config
   CONFIG_INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF=y
because this is not available in kernel.org 5.10.113.

And the Debian kernel
   linux-image-5.10.0-14-amd64           5.10.113-1
boots successfully when I set the boot option
   intel_iommu=on,igfx_off
which is only needed with the Debian kernel and not any version from
kernel.org.

The boot log of the Debian kernel contains these additional lines:
[    0.050433] DMAR: IOMMU enabled
[    0.050434] DMAR: Disable GFX device mapping
...
[    1.373598] DMAR: No ATSR found
[    1.373721] DMAR: dmar2: Using Register based invalidation
[    1.373767] DMAR: dmar0: Using Register based invalidation
[    1.373810] DMAR: dmar3: Using Register based invalidation
...
[    1.391563] DMAR: Intel(R) Virtualization Technology for Directed I/O

The kernel boot logs
   dmesg-5.10.113.txt (kernel.org without any boot options)
   dmesg-5.10.0-14-amd64.txt (Debian with boot option
intel_iommu=on,igfx_off)
are also in the attached tar.xz.

In the patches


https://salsa.debian.org/kernel-team/linux/-/blob/bullseye-security/debian/patches/features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch

https://salsa.debian.org/kernel-team/linux/-/blob/bullseye-security/debian/patches/features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch

the kernel config option
   INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF
is introduced, but it is not handled anywhere in the code.

I think the configuration defaults and the igfx_off and intgpu_off
settings have been mixed up somehow, which results in a wrong setup for
my boot. The intgpu_off boot option by itself doesn't change anything;
with the Debian kernel I need igfx_off.

At

https://salsa.debian.org/kernel-team/linux/-/blob/bullseye-security/debian/patches/features/x86/intel-iommu-add-option-to-exclude-integrated-gpu-only.patch#L66
you should compare 10 characters and not only 8, though that is more or
less a matter of correctness.
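
The length remark can be illustrated with a small user-space sketch. This
is not the kernel's option parser: matches_prefix, match_len8 and
match_len10 are hypothetical helpers, assumed only to mirror a
strncmp()-style comparison with a fixed length as described above.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the first n bytes of arg equal the
 * first n bytes of name, which is how a strncmp()-based option parser
 * would decide that an option matched. */
static int matches_prefix(const char *arg, const char *name, size_t n)
{
	return strncmp(arg, name, n) == 0;
}

/* Comparing only 8 bytes of "intgpu_off" really tests for "intgpu_o". */
static int match_len8(const char *arg)
{
	return matches_prefix(arg, "intgpu_off", 8);
}

/* Comparing all 10 bytes requires the complete option name. */
static int match_len10(const char *arg)
{
	return matches_prefix(arg, "intgpu_off", strlen("intgpu_off"));
}
```

With a length of 8, a mistyped value such as intgpu_on would still be
accepted; with a length of 10 it is rejected. The correctly spelled
option matches in both cases, so the shorter length does not by itself
change the boot behaviour.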

Maybe this
   static int dmar_map_intgpu =
IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON);
at

https://salsa.debian.org/kernel-team/linux/-/blob/bullseye-security/debian/patches/features/x86/intel-iommu-add-kconfig-option-to-exclude-igpu-by-default.patch#L74
should be
   static int dmar_map_intgpu =
IS_ENABLED(CONFIG_INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF);
or the negated value; I'm not sure at the moment what y or n should mean
in this config, or whether the assignments of 0 and 1 are correct everywhere.

#1012210#42
Date:
2022-06-02 15:27:24 UTC
From:
To:
On Thu, 2022-06-02 at 15:42 +0200, Markus Kolb wrote:
[...]

It is handled implicitly.  When that config symbol is enabled, both
INTEL_IOMMU_DEFAULT_ON and INTEL_IOMMU_DEFAULT_OFF are disabled.

Well spotted.  This is because at some point in development I changed
the name of the option from igpu_off to intgpu_off.  The patch
description also has the earlier name.  I'll correct that.

No, the whole point of INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF is to turn
that off by default while still enabling the IOMMU for other devices.
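
As a rough illustration of the implicit handling described above, the
following user-space sketch models the three mutually exclusive choices
and the defaults they would produce. It is not the kernel code: the enum,
the struct and defaults_for are hypothetical names, and the rule used for
dmar_disabled (disabled only for DEFAULT_OFF) is an assumption; only the
dmar_map_intgpu initialiser is quoted in this thread.

```c
/* The three mutually exclusive Kconfig choices discussed in this bug. */
enum iommu_choice {
	CHOICE_DEFAULT_OFF,           /* INTEL_IOMMU_DEFAULT_OFF */
	CHOICE_DEFAULT_ON,            /* INTEL_IOMMU_DEFAULT_ON */
	CHOICE_DEFAULT_ON_INTGPU_OFF  /* INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF */
};

struct iommu_defaults {
	int dmar_disabled;   /* 1: IOMMU stays off unless enabled at boot */
	int dmar_map_intgpu; /* 1: the integrated GPU is mapped as well */
};

/* Hypothetical helper deriving the boot-time defaults from the choice. */
static struct iommu_defaults defaults_for(enum iommu_choice c)
{
	struct iommu_defaults d;

	/* Assumption: the IOMMU is off by default only for DEFAULT_OFF. */
	d.dmar_disabled = (c == CHOICE_DEFAULT_OFF);
	/* As quoted in the thread: follows DEFAULT_ON alone. */
	d.dmar_map_intgpu = (c == CHOICE_DEFAULT_ON);
	return d;
}
```

Under these assumptions, CHOICE_DEFAULT_ON_INTGPU_OFF yields an enabled
IOMMU with the integrated GPU excluded, without the symbol itself ever
being tested directly.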

Based on the log from your self-built kernel, it seems like your system
should work with the kernel parameter "intel_iommu=on".  Can you test
whether that makes a difference with the Debian kernels?

Ben.

#1012210#47
Date:
2022-06-02 18:20:36 UTC
From:
To:
On 02.06.2022 17:27, Ben Hutchings wrote:

[...]
[...]


Yes, I got this, but not sure if the code logic is really correct.
With this INTEL_IOMMU_DEFAULT_ON is "implicitly" falsy and the code is
supposed to run like it would be truthy.


With the Debian kernel it doesn't boot with
   intel_iommu=on
nor with
   intel_iommu=on,intgpu_off (which should be the same as specifying
nothing).

The only combination that really works is intel_iommu=on,igfx_off.

#1012210#52
Date:
2022-06-02 21:26:59 UTC
From:
To:
On 2 June 2022 13:42:54 UTC, Markus Kolb <debian@tower-net.de> wrote:

[...]
[...]
I've patched the kernel.org 5.10.113 with just these 2 Debian patches, and I can at least confirm that these changes are the cause.
However, I don't yet understand what the difference is between intel_iommu=on with the patches and the defaults without them.
I will have a closer look tomorrow.

#1012210#57
Date:
2022-06-03 03:35:15 UTC
From:
To:
I'm running Debian stable in a VM on an Apple MacBook Pro M1 14" 2021.
The software I'm using is UTM, which uses QEMU under the hood.

Yesterday I did a system upgrade:
  aptitude -y update && aptitude -y full-upgrade && apt -y autoremove

I noticed the kernel was upgraded to 5.10.0-14 so I rebooted the VM. After that
Debian was unable to boot. Choosing the previous kernel image available
from Grub (5.10.0-13) allowed Debian to boot normally.

Regards,
Vincent

#1012210#62
Date:
2022-06-03 11:05:09 UTC
From:
To:
Control: clone -1 -2
Control: notfound -2 linux/5.17.3-1
Control: retitle -2 linux-image-5.10.0-14-amd64: boot failure in VM after upgrading from -13
Control: tag -2 moreinfo

This seems to be completely unrelated to that bug, so I've cloned it into a new
bug. When responding please only respond to that new bug report/number.
With 'update' it seems pointless
With 'full-upgrade' you REALLY should review what is about to happen before
agreeing to that as it could remove packages (important for you)
I'd recommend reviewing the 'autoremove' result too before committing it

Bug 1012210 is about a boot failure on a (wide) variety of kernels, likely
related to igpu.
Your issue is a regression from -13 to -14.
I assumed that you're running Debian Stable *in* a VM (on what host OS?).
Please clarify whether that is correct or not. Also provide more info about
YOUR boot failure and send that to the NEW bug number that you should receive.

#1012210#69
Date:
2022-06-03 13:59:20 UTC
From:
To:
On 02.06.2022 23:26, Markus Kolb wrote:
[...]

I've found the difference. I had somehow assumed that with the kernel.org
and Debian Buster kernels dmar_disabled would be false by default, or
that CONFIG_INTEL_IOMMU_DEFAULT_ON would be enabled by default.
But this is not the case: without any boot option, dmar_disabled is true
there.
With the Debian patch in Bullseye and newer, the IOMMU has been enabled
implicitly via CONFIG_INTEL_IOMMU_DEFAULT_ON_INTGPU_OFF=y, so
dmar_disabled became false.
With the older kernels you had to enable it with a boot option; now you
need to disable it.

So I have now added this to drivers/iommu/intel/iommu.c, and my computer
boots the Debian kernels without requiring any kernel boot option:
--- a/drivers/iommu/intel/iommu.c 2022-06-03 14:50:52.248268257 +0200
+++ b/drivers/iommu/intel/iommu.c 2022-06-03 14:48:12.695769217 +0200
@@ -6186,6 +6186,9 @@
 	dmar_map_gfx = 0;
 }
 
+/* Q35 integrated gfx dmar support is totally busted. */
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x29b2, quirk_iommu_igfx);
+
 /* G4x/GM45 integrated gfx dmar support is totally busted. */
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2a40, quirk_iommu_igfx);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e00, quirk_iommu_igfx);

The only related material I have found is this discussion from 5 years
ago, which ended without a result:
https://lore.kernel.org/linux-iommu/20161205215841.GA20819@beast/

And this nearly 4 year old bug report, which received no attention:
https://bugzilla.kernel.org/show_bug.cgi?id=201185

I've opened
https://bugzilla.kernel.org/show_bug.cgi?id=216064

Would you add this patch to Debian kernels?

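The effect of the added fixup entry can be sketched in user space as a
table lookup by vendor and device ID. This is only an illustration under
stated assumptions: struct fixup, fixups, quirk_igfx and run_fixups are
hypothetical stand-ins, and a plain array is used here instead of the
kernel's real fixup registration mechanism. The device IDs come from the
diff in this message, and the vendor ID 0x8086 from the [8086:29b2] pair
shown in the lspci output later in this bug.

```c
#include <stddef.h>

#define VENDOR_INTEL 0x8086

static int dmar_map_gfx = 1; /* 1: graphics device goes through the IOMMU */

/* Stand-in for quirk_iommu_igfx(): disable IOMMU mapping for graphics. */
static void quirk_igfx(void)
{
	dmar_map_gfx = 0;
}

struct fixup {
	unsigned short vendor;
	unsigned short device;
	void (*hook)(void);
};

/* 0x29b2 (Q35) is the proposed addition; the other two already exist. */
static const struct fixup fixups[] = {
	{ VENDOR_INTEL, 0x29b2, quirk_igfx },
	{ VENDOR_INTEL, 0x2a40, quirk_igfx },
	{ VENDOR_INTEL, 0x2e00, quirk_igfx },
};

/* Run every hook whose vendor/device pair matches; return the hit count. */
static int run_fixups(unsigned short vendor, unsigned short device)
{
	int hits = 0;

	for (size_t i = 0; i < sizeof(fixups) / sizeof(fixups[0]); i++) {
		if (fixups[i].vendor == vendor && fixups[i].device == device) {
			fixups[i].hook();
			hits++;
		}
	}
	return hits;
}
```

In this model, a device that is not listed leaves dmar_map_gfx at 1,
while a call with 0x8086 and 0x29b2 clears it, which corresponds to
booting with igfx_off without passing any boot option.
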
#1012210#74
Date:
2025-02-19 15:20:28 UTC
From:
To:
Hi

This bug was filed for a very old kernel or the bug is old itself
without resolution.

If you can reproduce it with

- the current version in unstable/testing
- the latest kernel from backports

please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.

Regards,
Salvatore

#1012210#88
Date:
2026-03-12 00:29:09 UTC
From:
To:
The issue reported in this bug is still reproducible in Debian 13 stable
(trixie) with kernel 6.12.73+deb13-amd64. This occurs with the same
behavior described in the original report, the system hangs during startup.

Disabling VT for Direct I/O in BIOS also seems to mitigate the problem.

#1012210#93
Date:
2026-03-15 14:35:55 UTC
From:
To:
Hi Ariel,

From your followup we do not know whether it is the same issue. Can you
please specify whether yours is also a

00:02.0 VGA compatible controller [0300]: Intel Corporation 82Q35 Express Integrated Graphics Controller [8086:29b2] (rev 02) (prog-if 00 [VGA controller])
	Subsystem: Hewlett-Packard Company 82Q35 Express Integrated Graphics Controller [103c:2818]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at f0100000 (32-bit, non-prefetchable) [size=512K]
	Region 1: I/O ports at 1210 [size=8]
	Region 2: Memory at e0000000 (32-bit, prefetchable) [size=256M]
	Region 3: Memory at f0000000 (32-bit, non-prefetchable) [size=1M]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: <access denied>
	Kernel driver in use: i915
	Kernel modules: i915

If so, can you test the patch from message #69
(https://bugs.debian.org/1012210#69) and confirm if this fixes your
issue?

Regards,
Salvatore