#992304 linux-image-5.10.0-8-amd64: Unable to boot from Dell R340 RAID Controller H330 (MegaRAID SAS-3 3008) after upgrade from Buster

#992304#5
Date:
2021-08-16 22:55:43 UTC
From:
To:
Dear Maintainer,
   * What led up to the situation?
I upgraded a Dell R430 server from Buster to Bullseye. It has a standard RAID1 configuration with two SAS disks.
   * What exactly did you do (or not do) that was effective (or ineffective)?
I updated the server, and after the upgrade process finished the server did not boot properly with kernel 5.10.
I found an error about a missing hard disk drive and suspected that the problem might be the kernel
driver megaraid_sas.ko. I booted the installation media via the iDRAC interface and started system rescue.
The Debian 11.0 (netinstall) media did not find the hard disk and could not mount the root partition. So I
booted Debian 10.7 (netinstall) in rescue mode and I was able to see the disks and to choose the right root
partition.
   * What was the outcome of this action?
I found that the problem is in the new kernel 5.10 so I modified the /etc/default/grub file in order to boot the
old kernel by default.
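
For readers who want to do the same, here is a minimal sketch of how the
old kernel can be selected by default. It assumes the stock Debian layout
(/boot/grub/grub.cfg generated by update-grub); the helper name
list_grub_entries is hypothetical, and the entry titles on a given system
must be read from its own grub.cfg rather than copied from here.

```shell
# Print the titles of all menu entries and submenus in a grub.cfg file.
# Usage: list_grub_entries [FILE]   (FILE defaults to /boot/grub/grub.cfg)
list_grub_entries() {
    file=${1:-/boot/grub/grub.cfg}
    # Titles are the first single-quoted string on menuentry/submenu lines.
    grep -E "^[[:space:]]*(menuentry|submenu) '" "$file" | cut -d "'" -f 2
}

# Typical follow-up, as root (placeholders, not values from this report):
#   1. list_grub_entries
#   2. In /etc/default/grub set
#        GRUB_DEFAULT="<submenu title>><old kernel entry title>"
#   3. update-grub
```

An entry inside a submenu is addressed as "submenu title>entry title",
which is why both kinds of line are listed.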

#992304#10
Date:
2021-09-11 18:58:17 UTC
From:
To:
Hi Nikolay,

Upstream report

https://bugzilla.kernel.org/show_bug.cgi?id=214311

suggests that this depends on whether you are booting with legacy
BIOS or UEFI.

Can you confirm that?

Regards,
Salvatore

#992304#21
Date:
2021-09-18 07:16:03 UTC
From:
To:
Hi Salvatore,
I confirm that in Legacy BIOS mode the disk is not available and in UEFI mode the disk is available.
Kind regards,
Nikolay

    On Saturday, September 11, 2021, 9:58:20 PM GMT+3, Salvatore Bonaccorso <carnil@debian.org> wrote:

 Control: tags -1 + moreinfo
Control: severity -1 important

Hi Nikolay,

Upstream report

https://bugzilla.kernel.org/show_bug.cgi?id=214311

suggests that this depends on whether you are booting with legacy
BIOS or UEFI.

Can you confirm that?

Regards,
Salvatore

#992304#28
Date:
2021-09-18 08:30:39 UTC
From:
To:
Hi Nikolay,

Thanks for confirming. Let's see if upstream replies.

Regards,
Salvatore

#992304#33
Date:
2022-09-26 09:58:08 UTC
From:
To:
Hello,

We are also bitten by this bug. We have a Dell R340, and the
installation breaks when we try to upgrade from buster to bullseye.

I'm using netinstall 11.5 (amd64) as base to build our product on top.
Kernel version: 5.10.140-1.

lspci output:

00:00.0 Host bridge: Intel Corporation 8th/9th Gen Core Processor Host Bridge/DRAM Registers [Coffee Lake] (rev 07)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 07)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) (rev 07)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:16.4 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller #2 (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #1 (rev f0)
00:1c.1 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #2 (rev f0)
00:1e.0 Communication controller: Intel Corporation Cannon Lake PCH Serial IO UART Host Controller (rev 10)
00:1f.0 ISA bridge: Intel Corporation Cannon Point-LP LPC Controller (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
02:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02)
03:00.0 PCI bridge: PLDA PCI Express Bridge (rev 02)
04:00.0 VGA compatible controller: Matrox Electronics Systems Ltd. Integrated Matrox G200eW3 Graphics Controller (rev 04)
05:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
05:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe

The following megaraid_sas failure signature is seen in syslog after
pressing Ctrl+Shift+F2:

Sep 23 03:51:49 kernel: [    1.534186] megaraid_sas 0000:02:00.0: Ignore DCMD timeout: megasas_get_ctrl_info 5335
Sep 23 03:51:49 kernel: [    1.534188] megaraid_sas 0000:02:00.0: Could not get controller info. Fail from megasas_init_adapter_fusion 1865
Sep 23 03:51:49 kernel: [    1.536973] megaraid_sas 0000:02:00.0: Failed from megasas_init_fw 6467
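
To check whether another machine hits the same failure, the messages
above can be searched for in a saved kernel log. This is only a sketch:
the function name has_megaraid_init_failure is hypothetical, and it
assumes each log message sits on a single line, as in an unwrapped
syslog or dmesg dump.

```shell
# Return success if the log contains the megaraid_sas init failure
# messages quoted in this report.
# Usage: has_megaraid_init_failure [FILE]   (reads stdin when FILE is omitted)
has_megaraid_init_failure() {
    grep -Eq 'megaraid_sas [0-9a-f:.]+: (Could not get controller info|Failed from megasas_init_fw)' "${1:--}"
}

# Examples:
#   has_megaraid_init_failure /var/log/syslog && echo "megaraid_sas init failed"
#   dmesg | has_megaraid_init_failure && echo "megaraid_sas init failed"
```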

This breaks upgrade to bullseye, is there any workaround available?

Thank you!

Regards,
Jaikumar

#992304#38
Date:
2022-10-27 13:56:33 UTC
From:
To:
Same problem with the Dell PowerEdge T140 with LSI MegaRAID SAS-3 3008
[Fury] (rev 02) and linux-image-5.10.0-19-amd64 (NO UEFI)

No problem booting with the old 4.19.0-22-amd64 kernel

#992304#43
Date:
2022-10-31 15:12:27 UTC
From:
To:
Update

On Thu, 27 Oct 2022 15:56:33 +0200 "Gabriel Rolland [Res Novae]"  <gabriel@resnovae.it> wrote:
 > Same problem with the Dell PowerEdge T140 with LSI MegaRAID SAS-3 3008
 > [Fury] (rev 02) and linux-image-5.10.0-19-amd64 (NO UEFI)
 >
 > No problem booting with the old 4.19.0-22-amd64 kernel
 >
 >

The problem is still present with
linux-image-5.19.0-0.deb11.2-amd64_5.19.11-1~bpo11+1_amd64

#992304#48
Date:
2022-10-31 15:39:53 UTC
From:
To:
The best way to make progress with this is doing a `git bisect` where
good = v4.19.194 (= 4.19.0-17)* and
bad = v5.10.46
https://wiki.debian.org/DebianKernel/GitBisect

*) I think that's better/faster than v4.19.260 (=4.19.0-22), but I'm not sure.
Hopefully someone can confirm/deny that and offer a better/faster strategy.
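
A rough sketch of how such a bisect could be started, using the good and
bad tags suggested above. It assumes an existing clone of the stable
kernel tree that contains both tags; the path ~/src/linux-stable and the
helper name start_bisect are hypothetical. Building, installing and
booting each candidate kernel on the affected hardware is described on
the wiki page linked above and is not shown here.

```shell
# Start a bisect between a known-good and a known-bad tag.
# Usage: start_bisect REPO_DIR GOOD_TAG BAD_TAG
start_bisect() {
    repo=$1
    good=$2
    bad=$3
    git -C "$repo" bisect start &&
    git -C "$repo" bisect bad "$bad" &&
    git -C "$repo" bisect good "$good"
}

# Example (not run here):
#   start_bisect ~/src/linux-stable v4.19.194 v5.10.46
# After each test boot of the commit git checked out, record the result:
#   git -C ~/src/linux-stable bisect good   # controller initialised, disks visible
#   git -C ~/src/linux-stable bisect bad    # megasas_init_fw failure
# When finished:
#   git -C ~/src/linux-stable bisect reset
```

Because the two tags live on different stable branches, git may first
ask for their common ancestor to be tested before narrowing the range.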

#992304#53
Date:
2022-11-04 20:55:37 UTC
From:
To:
Try using the kernel parameter: intel_iommu=off

#992304#58
Date:
2022-11-06 15:45:23 UTC
From:
To:
It works for me on a Dell T140 with a PERC H330 Adapter

root@zurix:~# uname -a
Linux zurix 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux

#992304#63
Date:
2023-02-10 17:07:50 UTC
From:
To:
I recently ran into this same issue while upgrading a server with a
Perc H310 to Bullseye.

After checking the manual for this Perc series, it turns out that the
H310 controller does not support IOMMU. The remark is on page 85 of the
manual [1].

I can confirm that using intel_iommu=off allows the host to boot without
issues.

[1] https://dl.dell.com/manuals/common/rc_h310_h710_h710p_h810_ug_en-us.pdf
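
For reference, a minimal sketch of making the intel_iommu=off workaround
persistent on a Debian system that boots with GRUB. The helper name
add_kernel_param is hypothetical. It assumes GNU sed and that
/etc/default/grub already contains a double-quoted
GRUB_CMDLINE_LINUX_DEFAULT line; if that line is missing, the file is
left unchanged.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# Usage: add_kernel_param PARAM [FILE]   (FILE defaults to /etc/default/grub)
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Skip the edit if the parameter is already on that line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"/" "$file"
}

# Example, as root (not run here):
#   add_kernel_param intel_iommu=off
#   update-grub
#   reboot
```

For a one-off test before changing any file, the parameter can also be
appended to the "linux" line from the GRUB menu editor (press "e" on the
boot entry).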