Dear Maintainer,

after upgrading from Linux 6.1.158 to 6.1.162, NFS client writes fail
with input/output errors (EIO).

Environment:
- Debian Bookworm
- Kernel: 6.1.0-43-amd64 (6.1.162-1)
- NFSv4.2 (also reproducible with 4.1)
- Default mount options include rsize=1048576,wsize=1048576

Reproducer:
dd if=/dev/zero of=~/testfile bs=1M count=500
or
dd if=/dev/zero of=~/testfile bs=4k count=100000
On different computers and VMs!

Result:
dd: closing output file: Input/output error

Workaround:
Mount with: rsize=65536,wsize=65536
With reduced I/O size, the issue disappears completely.

Impact:
- File writes fail (file >1M)
- KDE Plasma crashes due to corrupted cache/config writes

The issue does NOT occur on kernel 6.1.0-42 (6.1.158).
Hi Maik,
Thanks for your report. I'm currently not able to reproduce it with:
# echo 1048576 > /proc/fs/nfsd/max_block_size
# systemctl restart nfs-server.service
# mount -t nfs -o rsize=1048576,wsize=1048576 127.0.0.1:/srv/data /mnt
127.0.0.1:/srv/data /mnt nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=127.0.0.1,local_lock=none,addr=127.0.0.1 0 0
root@bookworm-amd64:/mnt# dd if=/dev/zero of=testfile bs=4k count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 0.861692 s, 475 MB/s
root@bookworm-amd64:/mnt# dd if=/dev/zero of=testfile bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 0.907189 s, 578 MB/s
root@bookworm-amd64:/mnt# uname -a
Linux bookworm-amd64 6.1.0-43-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.162-1 (2026-02-08) x86_64 GNU/Linux
root@bookworm-amd64:/mnt#
Is automounting involved in your case?
I will try a bit harder to reproduce it. Since it seems to trigger
reliably for you, might you be able to bisect the issue, please?
git clone --single-branch -b linux-6.1.y https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
git checkout v6.1.158
cp /boot/config-$(uname -r) .config
yes '' | make localmodconfig
make savedefconfig
mv defconfig arch/x86/configs/my_defconfig
# test 6.1.158 to ensure this is "good"
make my_defconfig
make -j $(nproc) bindeb-pkg
... install the resulting .deb package and confirm the problem does not exist.
# test 6.1.162 to ensure this is "bad"
git checkout v6.1.162
make my_defconfig
make -j $(nproc) bindeb-pkg
... install the resulting .deb package and confirm the problem exists.
With that confirmed, the bisection can start:
git bisect start
git bisect good v6.1.158
git bisect bad v6.1.162
In each bisection step git checks out a state between the oldest
known-bad and the newest known-good commit. In each step test using:
make my_defconfig
make -j $(nproc) bindeb-pkg
... install, try to boot / verify if problem exists
and if the problem is hit run:
git bisect bad
and if the problem doesn't trigger run:
git bisect good
Please pay attention to always select the just-built kernel for
booting; it won't always be the default kernel picked up by grub.
Iterate until git announces that it has identified the first bad commit.
Then provide the output of
git bisect log
In the course of the bisection you might have to uninstall previous
kernels again to not exhaust the disk space in /boot. Also in the end
uninstall all self-built kernels again.
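
The "verify if problem exists" part of each bisection step could be
scripted along the following lines. This is only a sketch: the function
name nfs_write_test and the test file name are made up for
illustration, and the first argument is assumed to be a directory on
the affected NFS mount. It relies on dd exiting non-zero when closing
the output file fails, as in the reported error.

```shell
# Write a test file into the given directory and print a verdict that
# maps directly to "git bisect good" / "git bisect bad".
#   $1  directory on the NFS mount under test
#   $2  number of 4k blocks to write (defaults to the reproducer's 100000)
nfs_write_test() {
    target="$1/bisect-testfile"
    count="${2:-100000}"
    if dd if=/dev/zero of="$target" bs=4k count="$count" 2>/dev/null; then
        echo good
    else
        echo bad
    fi
    rm -f "$target"
}

# Example (after booting the freshly built kernel and mounting the share):
#   git bisect "$(nfs_write_test /data/share)"
```

The kernel still has to be installed and booted by hand in every step,
so this only saves typing; it does not make the bisection automatic.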
Thank you already!
Regards,
Salvatore
Dear Salvatore,

thanks for your quick response and brilliant tutorial! :)

First: I couldn't reproduce the problem with a Debian (Bookworm) NFS
Server. The bug report referred to shares provided by two different
Dell EMC (Isilon) systems.

Second: Yes, the shares are mounted via automount, but for my tests I
mounted them manually before changing to the directory. The mount was
always performed with "mount -t nfs -o vers=4.2" without specifying
rsize/wsize.

But with the help of git bisect, I was able to narrow down the commit!
(see below)

Now I suspected that inheritance wasn't working, so I explicitly
specified rsize/wsize again when mounting, and then it worked again...
But look: despite 1048576, mount only shows 1047532.

# mount -t nfs -o vers=4.2 nfs-server:/ifs/nas01/share /data/share
# mount
nfs-server:/ifs/nas01/share on /data/share type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=x.x.x.x,local_lock=none,addr=x.x.x.x)
# dd if=/dev/zero of=testfile bs=4k count=100000
dd: closing output file 'testfile': Input/output error

# mount -t nfs -o vers=4.2,rsize=1048576,wsize=1048576 nfs-server:/ifs/nas01/share /data/share
# mount
nfs-server:/ifs/nas01/share on /data/share type nfs4 (rw,relatime,vers=4.2,rsize=1047672,wsize=1047532,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=x.x.x.x,local_lock=none,addr=x.x.x.x)
# dd if=/dev/zero of=testfile bs=4k count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB, 391 MiB) copied, 1.07056 s, 383 MB/s

11:11 $ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [f6e38ae624cf7eb96fb444a8ca2d07caa8d9c8fe] Linux 6.1.158
git bisect good f6e38ae624cf7eb96fb444a8ca2d07caa8d9c8fe
# status: waiting for bad commit, 1 good commit known
# bad: [0182cb5b74ee79448adf4f33a137ca34e209eb30] Linux 6.1.162
git bisect bad 0182cb5b74ee79448adf4f33a137ca34e209eb30
# bad: [0ce72df67e78a68c74bcd64d2f4959c4ec571897] dma/pool: eliminate alloc_pages warning in atomic_pool_expand
git bisect bad 0ce72df67e78a68c74bcd64d2f4959c4ec571897
# good: [4443fc58fcc283a9a7aa73f2f30e427296612ec7] softirq: Add trace points for tasklet entry/exit
git bisect good 4443fc58fcc283a9a7aa73f2f30e427296612ec7
# good: [935ad4b3c325c24fff2c702da403283025ffc722] comedi: pcl818: fix null-ptr-deref in pcl818_ai_cancel()
git bisect good 935ad4b3c325c24fff2c702da403283025ffc722
# good: [1ad2f81a099b8df5f72bce0a3e9f531263a846b8] ocfs2: relax BUG() to ocfs2_error() in __ocfs2_move_extent()
git bisect good 1ad2f81a099b8df5f72bce0a3e9f531263a846b8
# good: [4acd1dd5a1e37ea86f5c4a6c52de2ebfe24ad1e1] drm/amd/display: Fix logical vs bitwise bug in get_embedded_panel_info_v2_1()
git bisect good 4acd1dd5a1e37ea86f5c4a6c52de2ebfe24ad1e1
# good: [fcb91be52eb6e92e00b533ebd7c77fecada537e1] net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
git bisect good fcb91be52eb6e92e00b533ebd7c77fecada537e1
# good: [3df62bf15590d28f9e72916a8aa46bc769228988] Revert "nfs: ignore SB_RDONLY when mounting nfs"
git bisect good 3df62bf15590d28f9e72916a8aa46bc769228988
# bad: [3d26dfc67e0b179ed78b1526b47545026e660b9d] platform/x86: asus-wmi: use brightness_set_blocking() for kbd led
git bisect bad 3d26dfc67e0b179ed78b1526b47545026e660b9d
# good: [bc8c969b62677b8acf3282afc6f779f9e76b598c] Expand the type of nfs_fattr->valid
git bisect good bc8c969b62677b8acf3282afc6f779f9e76b598c
# bad: [c559c99796911ab972e84a68bf4d223b921ce212] fs/nls: Fix inconsistency between utf8_to_utf32() and utf32_to_utf8()
git bisect bad c559c99796911ab972e84a68bf4d223b921ce212
# bad: [732a5be2d49fcf01afa18009cfb6cafe7bd94314] NFS: Fix inheritance of the block sizes when automounting
git bisect bad 732a5be2d49fcf01afa18009cfb6cafe7bd94314
# first bad commit: [732a5be2d49fcf01afa18009cfb6cafe7bd94314] NFS: Fix inheritance of the block sizes when automounting

best regards
Maik
Dear Maik,
Kudos go to Uwe Kleine-Koenig, who wrote the original draft of this
tutorial :)
Oh, so that will not be easy to debug on the server side either :(
Ack, that is important information: we really have no other layer in
between.
That commit somehow stood out while I was checking which commits might
be candidates.
If you roll back to the old kernel and use the same mount options that
trigger the problem, what does mount show for the negotiated wsize and
rsize?
I'm asking because of the following, cf. nfs(5):
If an rsize value is not specified, or if the specified
rsize value is larger than the maximum that either client
or server can support, the client and server negotiate
the largest rsize value that they can both support.
and
If a wsize value is not specified, or if the specified
wsize value is larger than the maximum that either client
or server can support, the client and server negotiate
the largest wsize value that they can both support.
So I suspect the negotiated rsize and wsize are then not 1M, correct?
I.e., what are the negotiated sizes?
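
To read the negotiated values without scanning the full mount output by
eye, something like the following could be used. It is a minimal
sketch: the helper name negotiated_sizes is made up, and it assumes
/proc/mounts-style input (device, mount point, type, options) on stdin.

```shell
# Print the mount point plus the negotiated rsize/wsize for every
# nfs/nfs4 entry found in /proc/mounts-style input on stdin.
negotiated_sizes() {
    awk '$3 ~ /^nfs4?$/ {
        rsize = wsize = "?"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] ~ /^rsize=/) rsize = substr(opts[i], 7)
            if (opts[i] ~ /^wsize=/) wsize = substr(opts[i], 7)
        }
        print $2, "rsize=" rsize, "wsize=" wsize
    }'
}

# Usage on a live system:
#   negotiated_sizes < /proc/mounts
```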
One additional question: Do you have a test system exposing the
problem where you can also try the kernel from trixie, unstable or
experimental, to get an idea whether it still affects newer kernels?
I'm asking because 2b092175f5e3 ("NFS: Fix inheritance of the block
sizes when automounting") is in v6.19-rc1 and got backported to
v6.18.2, v6.17.13, v6.12.63, v6.6.120 and v6.1.160.
Regards,
Salvatore
Hi,

> So I suspect the negotiated rsize and wsize is then not 1M, correct,
> I.e what are the negotiated sizes?

Mounting without specific rsize/wsize:

Kernel 6.1.158 negotiated (from Dell Powerscale OneFS aka Isilon)
rsize=1047672,wsize=1047532 and no errors
Kernel 6.1.162 negotiated
rsize=1048576,wsize=1048576 and Input/output error
Kernel 6.1.158 negotiated (from Debian Bookworm NFS)
rsize=1048576,wsize=1048576 and no errors

Linux trixie-test 6.12.73+deb13-amd64
mount -t nfs -o vers=4.2
Results in "rsize=1048576,wsize=1048576" and Input/output error
mount -t nfs -o vers=4.2,rsize=1048576,wsize=1048576
Results in "rsize=1047672,wsize=1047532" and no errors

I would say that the commit corrected the wrong values (<1M to the
correct 1M) and that in this case there is no problem with the kernel.
It now looks much more like the "Powerscale OneFS" NFS server is
somehow incorrect. <1M works for us, so I think we should just specify
a value for automount and live with it and create a Dell support
ticket.

best regards,
Maik
Hi Maik,

Sorry for the late reply.

Then yes, indeed it looks like an issue on the Powerscale OneFS NFS
server. I'm keeping this bug open a bit longer, as I plan to ask the
upstream NFS developers if they can have a look, and to pose the
question whether this might be worth being detected and mitigated on
the kernel side (or ignored and deferred to Dell). I will update this
bug later (and maybe close it then).

Have you heard something back from Dell yet?

Regards,
Salvatore
Hi,

We are facing the same issue. Dell seems to point to a client issue:
The kernel treats the max size as the NFS payload max size, whereas
OneFS treats the max size as the overall compound packet max size
(everything related to NFS in the call). Hence when OneFS receives a
call with a payload of 1M, the overall NFS packet is slightly bigger
and it returns an NFS4ERR_REQ_TOO_BIG.

So the question is: should max req size/max resp size be treated as the
NFS payload max size or the whole NFS packet max size?

Here below the whole response we got:

Best regards,
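
The sizes reported earlier in this bug fit that explanation. The quick
calculation below shows how far the values negotiated by 6.1.158
against OneFS stay below 1 MiB. Reading that gap as the room OneFS
expects for the rest of the compound is an assumption, not something
confirmed in this thread; the variable names are only illustrative.

```shell
# Sizes reported earlier in this bug.
one_mib=1048576          # rsize/wsize negotiated by 6.1.162
onefs_rsize=1047672      # rsize negotiated by 6.1.158 against OneFS
onefs_wsize=1047532      # wsize negotiated by 6.1.158 against OneFS

# Headroom left below 1 MiB by the older kernel's values.
read_headroom=$((one_mib - onefs_rsize))
write_headroom=$((one_mib - onefs_wsize))

echo "read headroom:  ${read_headroom} bytes"    # 904
echo "write headroom: ${write_headroom} bytes"   # 1044
```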
Hi Trond, hi Anna
In Debian we got reports of an NFS client regression where large
rsize/wsize (1MB) causes EIO after commit 2b092175f5e3 ("NFS: Fix
inheritance of the block sizes when automounting") and its backports
to the stable series. The report in full is at:
https://bugs.debian.org/1128834
Maik reported:
I was not able to reproduce the problem, and it turned out that it
seems to be triggerable when a Dell EMC (Isilon) system is used on the
NFS server side. So the issue was initially not really considered as
being "our" issue.
Valentin SAMIR, a second affected user, also reported the issue to
Dell, and Dell seems to point at a client issue instead. Valentin
writes:
His reply in https://bugs.debian.org/1128834#55 contains a quote from
the response Valentin got from Dell. I'm quoting it in full here for
easier follow-up in case needed:
So the question is: are we behaving correctly here, or is this our
problem, or is the issue still considered to be on Dell's side?
#regzbot introduced: 2b092175f5e301cdaa935093edfef2be9defb6df
#regzbot monitor: https://bugs.debian.org/1128834
How to proceed from here?
Regards,
Salvatore
Hi Salvatore, The Linux NFS client uses the 'maxread' and 'maxwrite' attributes (see RFC8881 Sections 5.8.2.20. and 5.8.2.21.) to decide how big a payload to request/send to the server in a READ/WRITE COMPOUND. If Dell's implementation is returning a size of 1MB, then the Linux client will use that value. It won't cross check with the max request size, because it assumes that since both values derive from the server, there will be no conflict between them.
So maxread and/or maxwrite MUST NOT be larger than the client's maximum RPC size? Maybe add an assert()-like warning to syslog if there is a mismatch? Aurélien
This seems like a wrong interpretation to me. Servers use the max_request_size to properly size their receive buffers, and the client is responsible for adhering to that value. I don't think you can stick a bunch of operations in a request compound and then put a huge WRITE at the end that blows out max_request_size, and expect the server to be OK with that. ISTM the client should clamp the length down to something shorter that allows the request to fit. Maybe drop the last folio and force another request? Performance would suck but it would work. All that said, the server in this case isn't sizing max_request_size with enough overhead for the client to actually achieve a full 1M write, which is just dumb. Dell should fix that.
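
The clamping idea can be illustrated with a few lines of shell
arithmetic. This is only a sketch of the principle, not the kernel's
code: the function name clamp_wsize is made up, and the overhead value
of 1044 bytes is simply the difference between 1 MiB and the wsize of
1047532 seen earlier in this bug, used here as a stand-in for the size
of the non-payload parts of the compound.

```shell
# Clamp a WRITE payload so that payload plus per-request overhead fits
# into the session's maximum request size.
#   $1  maxwrite attribute advertised by the server (bytes)
#   $2  max_request_size negotiated for the session (bytes)
#   $3  assumed overhead of the other COMPOUND parts (bytes, hypothetical)
clamp_wsize() {
    maxwrite=$1
    max_request_size=$2
    overhead=$3
    limit=$((max_request_size - overhead))
    if [ "$limit" -lt 0 ]; then limit=0; fi
    if [ "$maxwrite" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$maxwrite"
    fi
}

clamp_wsize 1048576 1048576 1044   # prints 1047532
clamp_wsize 65536 1048576 1044     # prints 65536
```

With these inputs a 1 MiB maxwrite is cut back to 1047532, while the
65536 workaround value passes through unchanged.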
I'm aware of what the spec says, Jeff. We're not putting "a bunch of operations" before the WRITE. There's a SEQUENCE, PUTFH, WRITE and GETATTR. The point is, we expect the value of maxwrite to be set to a reasonable value w.r.t. max_request_size so that the client doesn't have to sanity check everybody and their dog's server.