#1141183 src:linux: I/O errors on SATA devices related to IOMMU

#1141183#5
Date:
2026-06-30 20:10:08 UTC
From:
To:
Somewhere between 6.18 and 6.19, a bug was introduced that causes one of
my servers to somewhat randomly throw DMA errors on all SATA HDDs.  Teej
has kindly been helping me troubleshoot this issue and here is the summary:

* 6.18.1-1~exp1 works.
* 6.19.1~exp1 throws random I/O errors.
* 6.19.1~exp1 works when booted with "iommu=off".

This issue is still present in the 7.0 trixie-backports kernel, and the
"iommu=off" workaround works on it as well.

Sample error:

Both reads and writes are affected.

Note the 6.18 log has btrfs scrub errors.  These are unrelated in the
sense that it's not a 6.18 issue, but related in the sense that the
on-disk errors were caused by this issue in an earlier test of a kernel
exhibiting this bug.

Attached are kernel logs from each run as well as "lspci -nnk" output.

#1141183#12
Date:
2026-06-30 22:08:52 UTC
From:
To:
I've worked with Chris in the Matrix Debian room over the last two days to
diagnose this issue. The initial report was I/O errors after a kernel
upgrade from v6.12.90 to v7.0.10, with:

[22396.952764] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[22396.952791] ata2.00: failed command: WRITE DMA EXT
[22396.952799] ata2.00: cmd 35/00:b8:02:30:0b/00:1d:00:00:00/e0 tag 26 dma 3895296 out
                        res 50/00:00:01:20:0b/00:00:00:00:00/e0 Emask 0x40 (internal error)
[22396.952822] ata2.00: status: { DRDY }
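
When comparing the attached logs from several runs, it can help to tally the libata "failed command" lines per device. The following is only a minimal Python sketch, assuming dmesg-style "[seconds.micros]" timestamps as in the excerpt above; the function name and the SAMPLE text are illustrative and not part of any existing tool.

```python
import re
from collections import Counter

# Matches lines such as:
#   [22396.952791] ata2.00: failed command: WRITE DMA EXT
# Assumption: dmesg-style timestamps; journalctl output uses another prefix.
FAILED_CMD = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<dev>ata\d+\.\d+): failed command: (?P<cmd>.+?)\s*$"
)


def count_failed_commands(log_text):
    """Count 'failed command' lines, keyed by (device, command)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FAILED_CMD.match(line)
        if match:
            counts[(match.group("dev"), match.group("cmd"))] += 1
    return counts


# Three lines taken from the excerpt above, used as demo input.
SAMPLE = """\
[22396.952764] ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[22396.952791] ata2.00: failed command: WRITE DMA EXT
[22396.952822] ata2.00: status: { DRDY }
"""

if __name__ == "__main__":
    for (dev, cmd), n in sorted(count_failed_commands(SAMPLE).items()):
        print(f"{dev} {cmd}: {n}")
```

Feeding it the full log of each boot gives a quick per-device comparison between a good and a bad kernel.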

We initially focused on the C220 series chipset and the SATA links, and
determined that this HP ProLiant ML10 v2 (J10) server's specific chipset
is the C222. This means that of the 6 SATA ports, 2 are 6 Gbps and 4 are
3 Gbps.

Checking the sysfs attributes for the links, and Chris trying alternate
ports, brought no improvement. We next considered the several BIOS
firmware bugs reported by the kernel, but discounted those since they
are present with the good kernel versions too. Upgrading the BIOS to the
latest version made no difference.
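
The thread does not record which sysfs attributes were inspected. As a hedged illustration only, assuming libata's ata_link class (/sys/class/ata_link/linkN with sata_spd, sata_spd_limit and hw_sata_spd_limit), the negotiated and limit speeds of every link could be collected like this. The function name is made up for this sketch.

```python
from pathlib import Path

# Assumption: these are the libata link attributes of interest; the
# thread itself does not name the attributes that were checked.
LINK_ATTRS = ("sata_spd", "sata_spd_limit", "hw_sata_spd_limit")


def read_link_speeds(root="/sys/class/ata_link"):
    """Return {link_name: {attribute: value or None}} for each ATA link."""
    speeds = {}
    base = Path(root)
    if not base.is_dir():
        return speeds  # not Linux, or libata not loaded
    for link in sorted(base.iterdir()):
        attrs = {}
        for name in LINK_ATTRS:
            try:
                attrs[name] = (link / name).read_text().strip()
            except OSError:
                attrs[name] = None  # attribute missing or unreadable
        speeds[link.name] = attrs
    return speeds


if __name__ == "__main__":
    for link, attrs in read_link_speeds().items():
        print(link, attrs)
```

A link whose sata_spd is lower than expected for its port would point at a negotiation problem rather than at the IOMMU.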

I found b.k.o #220693 "SATA bus goes offline after a while" [0], which
covers several intermingled issues that exhibit very similar symptoms.
It discusses several possible workarounds:

1. ATA LPM ( libata.force=nolpm )
2. maximum transfer size ( libata.force=maxsec1024 )
3. IOMMU ( iommu=off )

The first two didn't show improvements on the failing kernel versions,
but disabling the IOMMU did. Comment 44 [1] of the bug report suggests
that this may indicate an issue with the readahead code.
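
When testing the three workarounds listed above, it is easy to lose track of which parameters a given boot actually used. This minimal Python sketch reports them from a kernel command line such as the contents of /proc/cmdline. The parameter spellings are copied from this thread, the function names are hypothetical, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {key: [values]}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


def active_workarounds(cmdline):
    """Report which of the workarounds tried in this thread are present."""
    params = parse_cmdline(cmdline)
    force = set()
    for value in params.get("libata.force", []):
        if value:
            for item in value.split(","):
                # libata.force entries may carry an "ID:" prefix; keep the value.
                force.add(item.rsplit(":", 1)[-1])
    return {
        "nolpm": "nolpm" in force,
        "maxsec1024": "maxsec1024" in force,  # spelled as in this thread
        "iommu=off": "off" in params.get("iommu", []),
    }


if __name__ == "__main__":
    with open("/proc/cmdline") as fh:
        print(active_workarounds(fh.read()))
```

Recording this output alongside each test log makes the good/bad matrix easier to trust.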

Using this as a clue and looking through v6.12..v7.0 commits I found a
group of related iomap commits introduced at the start of the v6.19
cycle.

That spurred the tests of v6.18, which doesn't exhibit the issue, and
v6.19, which does.

I've built and shared v6.18.37 with Chris since this does not contain
the suspect iomap commits and he'll report back on that later.

Bisecting the iomap commits in v6.19 might be a challenge because
several later changes build on the series, and they might be a red
herring.

So this needs more eyeballs to consider alternative triggers of the bug
and, once we have the v6.18.37 result, forwarding upstream.

Tj.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=220693

[1] https://bugzilla.kernel.org/show_bug.cgi?id=220693#c44

#1141183#17
Date:
2026-06-30 22:32:00 UTC
From:
To:
I forgot to mention that this may be related to the following kernel
bug: https://bugzilla.kernel.org/show_bug.cgi?id=220693

Neither libata.force=maxsec1024 nor libata.force=nolpm mitigates the issue.