The setup we have on openqa.debian.net and its openqa-worker machines is that the workers have a read-only NFS mount of a share containing the .iso images that are used for booting the test VMs.

All kernels that I have available later than 6.12.48 (6.12.57, 6.12.69, 6.12.73 & 6.18.5) cause the workers to fail when they try to apply a "consistent read" lock, giving the error message:

No locks available

One can demonstrate that by running this on the NFS client:

$ flock -e -w 4 /var/lib/openqa/factory/iso/xfce_sid_20260223T140607Z.iso sleep 1
flock: /var/lib/openqa/factory/iso/xfce_sid_20260223T140607Z.iso: No locks available

Meanwhile, running that flock on the server works fine.

BTW, here are the journals from booting various versions of the kernel:

https://hands.com/~phil/tmp/openqa-nfs-lock-issue/

Cheers, Phil.
The /etc/exports entry on the server for the share in question is:

/var/lib/openqa/share *(fsid=0,rw,no_root_squash,sync,no_subtree_check)

and the /etc/fstab entries on the clients (both of which show the behaviour) are:

openqa.debian.net:/var/lib/openqa/share /var/lib/openqa/share nfs ro,fsc,soft,bg 0 0

and:

openqa.debian.net:/var/lib/openqa/share /var/lib/openqa/share nfs nofail,ro,fsc,soft,x-systemd.automount,x-systemd.requires=network-online.target,x-systemd.device-timeout=10s 0 0
Here's a bisect log:

# bad: [8a243ecde1f6447b8e237f2c1c67c0bb67d16d67] Linux 6.12.57
# good: [f1e375d5eb68f990709fce37ee1c0ecae3645b6f] Linux 6.12.48
git bisect start 'v6.12.57' 'v6.12.48'
# good: [28defa35ed158bcca43e0d3d0122e747f57be867] clk: nxp: Fix pll0 rate check condition in LPC18xx CGU driver
git bisect good 28defa35ed158bcca43e0d3d0122e747f57be867
# bad: [3e7b89ed9f07e6864943c4237a9c86e0cf9d3f33] drm/msm/a6xx: Fix PDC sleep sequence
git bisect bad 3e7b89ed9f07e6864943c4237a9c86e0cf9d3f33
# good: [cbcfb32b6aaeebda11c945698634294a48259ac3] memory: samsung: exynos-srom: Fix of_iomap leak in exynos_srom_probe
git bisect good cbcfb32b6aaeebda11c945698634294a48259ac3
# good: [32c258aad47ef9c58c8ae50e160b9c94c43f3829] KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace
git bisect good 32c258aad47ef9c58c8ae50e160b9c94c43f3829
# bad: [e67e3e738f088e6c5ccfab618a29318a3f08db41] sched/fair: Block delayed tasks on throttled hierarchy during dequeue
git bisect bad e67e3e738f088e6c5ccfab618a29318a3f08db41
# bad: [1e059ce9cc7b20f19d3c4b6e39e72ecb42da1ce8] mptcp: pm: in-kernel: usable client side with C-flag
git bisect bad 1e059ce9cc7b20f19d3c4b6e39e72ecb42da1ce8
# bad: [0a1ee3c932dcf6f446e69a0ce67f36550a69a9ed] nfsd: don't use sv_nrthreads in connection limiting calculations.
git bisect bad 0a1ee3c932dcf6f446e69a0ce67f36550a69a9ed
# good: [34ff466f74d0fe1db8956f9c245e2bb2c67f67bf] x86/kvm: Force legacy PCI hole to UC when overriding MTRRs for TDX/SNP
git bisect good 34ff466f74d0fe1db8956f9c245e2bb2c67f67bf
# good: [763d4aa418456afb2e1bdef27216332352813aad] NFSD: Replace use of NFSD_MAY_LOCK in nfsd4_lock()
git bisect good 763d4aa418456afb2e1bdef27216332352813aad
# good: [18744bc56b0ec34b0fe397ee71c2ffdc48c6a0e0] nfsd: refine and rename NFSD_MAY_LOCK
git bisect good 18744bc56b0ec34b0fe397ee71c2ffdc48c6a0e0
# first bad commit: [0a1ee3c932dcf6f446e69a0ce67f36550a69a9ed] nfsd: don't use sv_nrthreads in connection limiting calculations.
Philip Hands <phil@hands.com> writes: ... It seems that ^ was a mistake on my part -- having retested 18744bc56b0ec34 it is in fact bad.
After several tests of the .53...54 NFSD changes I've found that reverting 18744bc56b0ec "nfsd: refine and rename NFSD_MAY_LOCK" appears to resolve it in my debvm-created server test scenario. Likewise, a build before those NFSD changes at 34ff466f74d0f also works fine.
stable v6.12.54 commit 18744bc56b0ec (re)moves checks from
fs/nfsd/vfs.c::nfsd_permission().
This causes NFS clients to see
$ flock -e -w 4 /srv/NAS/test/debian-13.3.0-amd64-netinst.iso sleep 1
flock: /srv/NAS/test/debian-13.3.0-amd64-netinst.iso: No locks available
Keeping the check in nfsd_permission() whilst also copying it to
fs/nfsd/nfsfh.c::__fh_verify() resolves the issue.
This was discovered on the Debian openQA infrastructure server when
upgrading kernel from v6.12.48 to later v6.12.y where worker hosts (with
any earlier or later kernel version) pass NFSv3 mounted ISO images to
qemu-system-x86_64 and it reports:
!!! : qemu-system-x86_64: -device scsi-cd,id=cd0-device,drive=cd0-overlay0,serial=cd0: Failed to get "consistent read" lock: No locks available
QEMU: Is another process using the image [/var/lib/openqa/pool/2/20260223-1-debian-testing-amd64-netinst.iso]?
A simple reproducer with the server using:
# cat /etc/exports.d/test.exports
/srv/NAS/test fdff::/64(fsid=0,rw,no_root_squash,sync,no_subtree_check,auth_nlm)
and clients using:
# mount -t nfs [fdff::2]:/srv/NAS/test /srv/NAS/test -o proto=tcp6,ro,fsc,soft
will trigger the error as shown above:
$ flock -e -w 4 /srv/NAS/test/debian-13.3.0-amd64-netinst.iso sleep 1
flock: /srv/NAS/test/debian-13.3.0-amd64-netinst.iso: No locks available
A simple test program calling fcntl() with the same arguments QEMU uses
also fails in the same way.
$ ./nfs3_range_lock_test /srv/NAS/test/debian-13.3.0-amd64-netinst.{iso,overlay}
Opened base file: /srv/NAS/test/debian-13.3.0-amd64-netinst.iso
Opened overlay file: /srv/NAS/test/debian-13.3.0-amd64-netinst.overlay
Attempting lock at 4 on /srv/NAS/test/debian-13.3.0-amd64-netinst.iso
fcntl(fd, F_GETLK, &fl) failed on base: No locks available
Attempting lock at 8 on /srv/NAS/test/debian-13.3.0-amd64-netinst.overlay
fcntl(fd, F_GETLK, &fl) failed on overlay: No locks available
Follow-up with results of adding dump_stack() to nfsd_permission() revealing the paths that trigger the issue.

[ 133.185579] Call Trace:
[ 133.185580] <TASK>
[ 133.185580] dump_stack_lvl+0x64/0x80
[ 133.185582] nfsd_permission+0x20/0x100 [nfsd]
[ 133.185612] nfsd_access+0xc8/0x140 [nfsd]
[ 133.185639] nfsd4_proc_compound+0x350/0x670 [nfsd]
[ 133.185670] nfsd_dispatch+0x100/0x220 [nfsd]
[ 133.185698] svc_process_common+0x314/0x700 [sunrpc]
[ 133.185733] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[ 133.185762] svc_process+0x131/0x1c0 [sunrpc]
[ 133.185795] svc_recv+0x80a/0x9e0 [sunrpc]
[ 133.185827] ? __pfx_nfsd+0x10/0x10 [nfsd]
[ 133.185856] nfsd+0xa3/0x100 [nfsd]
[ 133.185882] kthread+0xd2/0x100
[ 133.185884] ? __pfx_kthread+0x10/0x10
[ 133.185885] ret_from_fork+0x34/0x50
[ 133.185886] ? __pfx_kthread+0x10/0x10
[ 133.185887] ret_from_fork_asm+0x1a/0x30
[ 133.185890] </TASK>

[ 144.020165] Call Trace:
[ 144.020165] <TASK>
[ 144.020166] dump_stack_lvl+0x64/0x80
[ 144.020168] nfsd_permission+0x20/0x100 [nfsd]
[ 144.020201] nfsd_access+0xc8/0x140 [nfsd]
[ 144.020228] nfsd3_proc_access+0x6c/0x110 [nfsd]
[ 144.020257] nfsd_dispatch+0x100/0x220 [nfsd]
[ 144.020286] svc_process_common+0x314/0x700 [sunrpc]
[ 144.020321] ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[ 144.020350] svc_process+0x131/0x1c0 [sunrpc]
[ 144.020383] svc_recv+0x80a/0x9e0 [sunrpc]
[ 144.020415] ? __pfx_nfsd+0x10/0x10 [nfsd]
[ 144.020445] nfsd+0xa3/0x100 [nfsd]
[ 144.020471] kthread+0xd2/0x100
[ 144.020472] ? __pfx_kthread+0x10/0x10
[ 144.020473] ret_from_fork+0x34/0x50
[ 144.020475] ? __pfx_kthread+0x10/0x10
[ 144.020476] ret_from_fork_asm+0x1a/0x30
[ 144.020478] </TASK>
[CCing a few people and lists] the former. Does this happen on mainline (e.g. 7.0-rc1) as well? Ciao, Thorsten
Control: found -1 6.19.5-1~exp1 Hi, Not tested 7.0-rc2, but the issue is reproducible still in 6.19.5. Regards, Salvatore
On Fri, Feb 27, 2026 at 5:00 AM Thorsten Leemhuis <regressions@leemhuis.info> wrote: I could be wrong but I suspect that this occurs when the user running flock does not have write access to the file it's trying to get an exclusive lock on. Furthermore, it has been noted that the export policy of the server contains "auth_nlm". I ran into this before and there is an email thread titled: "[PATCH 3/3] nfsd: reset access mask for NLM calls in nfsd_permission" which tried to "fix the nfsd: refine and rename NFSD_MAY_LOCK" (this is in the middle of the discussion link https://www.spinics.net/lists/linux-nfs/msg111534.html .. that basically says flock should fail) . There was a period of time related to commit 4cc9b9f2bf4d when such access was allowed until it was not. Change export policy to no_auth_nlm if it's desired that flock gets an exclusive lock on a file without write permissions. Or give write permissions to get
This response was AI-generated by bug-bot. The analysis may contain errors; please verify independently.
---

Bug Summary

Commit 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK"), backported to v6.12.54 as 18744bc56b0ec, removed a critical permission downgrade from nfsd_permission() that affects NLM lock requests. This is a severity: functional regression -- exclusive (and shared) file locking via NLM fails with ENOLCK on files where the requesting user lacks write permission on the inode, such as read-only ISO images served over NFSv3.

Stack Trace Analysis

No stack trace was included in the report; the failure is a user-visible ENOLCK error, not a kernel crash or warning.

Root Cause Analysis

The bug is in the interaction between nlm_fopen() in fs/nfsd/lockd.c and nfsd_permission() in fs/nfsd/vfs.c.

Before 4cc9b9f2bf4d, nfsd_permission() contained this block:

if (acc & NFSD_MAY_LOCK) {
	if (exp->ex_flags & NFSEXP_NOAUTHNLM)
		return 0;
	else
		acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE;
}

This downgraded the permission check for lock requests from MAY_WRITE to MAY_READ, because file locks do not require write access to the file data -- only read access is needed.

Commit 4cc9b9f2bf4d correctly moved the NFSEXP_NOAUTHNLM bypass into __fh_verify() (fs/nfsd/nfsfh.c, line 377) and added explicit NFSD_MAY_OWNER_OVERRIDE and NFSD_MAY_BYPASS_GSS flags in nlm_fopen(). However, it dropped the permission downgrade (the "else" branch) entirely.

The call chain for an exclusive NLM lock is:

nlm_fopen() [fs/nfsd/lockd.c:50]
  access = NFSD_MAY_WRITE | NFSD_MAY_NLM | NFSD_MAY_OWNER_OVERRIDE | NFSD_MAY_BYPASS_GSS
  -> nfsd_open()
  -> __fh_verify()
  -> nfsd_permission()
  -> inode_permission(inode, MAY_WRITE) <-- FAILS with -EACCES
  -> nfsd_open() returns nfserr
  -> nlm_fopen() default case returns nlm_failed
  -> client sees ENOLCK

For files like ISO images (typically mode 0444 or 0644 owned by root), the requesting NFS user does not have write permission, so inode_permission(MAY_WRITE) fails. Previously, the downgrade to MAY_READ would have allowed this to succeed.

The NFSD_MAY_OWNER_OVERRIDE added in nlm_fopen() only helps when the NFS credential matches the file owner (checked at fs/nfsd/vfs.c:2858), which is not the case for files owned by root when accessed by non-root NFS users.

Affected Versions

This is a regression introduced by:
  4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
  Mainline: affected since v6.13-rc1
  Stable: v6.12.54+ (backport 18744bc56b0ec)

Any kernel version >= v6.13 or v6.12.54 is affected. Versions prior to v6.12.54 in the 6.12.y series are not affected.

Relevant Commits and Fixes

Introducing commit (mainline):
  4cc9b9f2bf4d nfsd: refine and rename NFSD_MAY_LOCK
Stable backport:
  18744bc56b0ec nfsd: refine and rename NFSD_MAY_LOCK (v6.12.54)
Predecessor commit that removed NFSD_MAY_LOCK from NFSv4:
  6640556b0c80 NFSD: Replace use of NFSD_MAY_LOCK in nfsd4_lock()
Existing partial fix for a different aspect of the same regression:
  0813c5f01249 nfsd: fix access checking for NLM under XPRTSEC policies
  (Fixes: 4cc9b9f2bf4d, by Olga Kornievskaia -- addresses only the XPRTSEC policy bypass, NOT the permission downgrade issue)

No existing mainline fix for the permission downgrade regression was found.

Prior Discussions

No prior reports of this specific NLM permission downgrade regression were found on lore.kernel.org. The only related discussion is the XPRTSEC fix by Olga Kornievskaia (commit 0813c5f01249), which addresses a different facet of the same 4cc9b9f2bf4d refactoring. The original bug was also reported via Debian bug #1128861.

Adding Neil Brown who authored the original commit 4cc9b9f2bf4d.
Adding Chuck Lever and Jeff Layton as NFSD maintainers.
Adding Olga Kornievskaia who authored the related XPRTSEC fix (0813c5f01249) and is an NFSD reviewer.
Adding Dai Ngo and Tom Talpey as NFSD reviewers.
CC'ing stable@vger.kernel.org as the regression affects v6.12.y.

Suggested Actions

The fix is to restore the permission downgrade for NFSD_MAY_NLM in nfsd_permission() (fs/nfsd/vfs.c). The following patch should resolve the issue:
Linux has two quite different sorts of locks - flock and fcntl.
flocks lock the whole file, shared or exclusive.
fcntl can lock any byte-range (including the whole file), shared or
exclusive. flock and fcntl locks don't conflict.
exclusive flock locks only require read access to the file
exclusive fcntl locks require write access to the file.
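
The difference between the two lock families on a read-only descriptor can be seen locally with a small sketch like the one below. It is an illustration only, not code from this thread: the function names are made up, and it assumes a local Linux filesystem rather than an NFS mount.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: try an exclusive flock() on a descriptor opened
 * O_RDONLY.  Returns 0 on success, otherwise the errno of the failing call.
 */
int flock_ex_on_readonly_fd(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	/* LOCK_NB so the sketch never blocks if someone else holds a lock. */
	int err = (flock(fd, LOCK_EX | LOCK_NB) == 0) ? 0 : errno;

	close(fd);	/* closing the descriptor drops the flock */
	return err;
}

/*
 * Hypothetical helper: try a whole-file fcntl() write lock (F_SETLK with
 * F_WRLCK) on a descriptor opened O_RDONLY.  Returns 0 on success,
 * otherwise the errno of the failing call.
 */
int fcntl_wrlck_on_readonly_fd(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	struct flock fl = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,	/* 0 means "to end of file", i.e. the whole file */
	};
	int err = (fcntl(fd, F_SETLK, &fl) == 0) ? 0 : errno;

	close(fd);
	return err;
}
```

On a local file that nobody else has locked, the first helper should return 0 and the second should return EBADF, since fcntl(2) requires the descriptor to be open for writing before it will place a write lock.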
The NLM protocol only supports one type of byte-range lock. It is
natural to map fcntl locks onto NLM locks. The early Linux NFS
implementation handled flock locks entirely locally so different clients
didn't conflict. This could be confusing but was widely documented and
understood.
Some years ago Linux NFS was enhanced to handle flock locks like
whole-file fcntl locks. This means that clients with flock locks would
conflict (maybe good) but that flock locks and fcntl locks would now
conflict (maybe bad).
You can still get the old behaviour with "-o local_lock=flock".
So if you open a file on NFS read-only and attempt an exclusive flock,
that will be sent to the server as a full-range fcntl lock which should
require write access. If the server finds you don't have write access -
you lose.
It would seem to make sense to tell qemu that the device is read-only.
Then it will hopefully only request a shared lock. Can you try that?
Note that even before my patch, if the filesystem was exported read-only
or mounted read-only on the server, then exclusive flock locks would
fail.
I think that the current behaviour is correct, however I do understand
that it is a regression and maybe that justifies incorrect behaviour.
Maybe Jeff, as locking maintainer, would be willing to do something like
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index dd0214dcb695..6c674fc51bab 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -73,6 +73,14 @@ static inline unsigned int file_hash(struct nfs_fh *f)
int lock_to_openmode(struct file_lock *lock)
{
+ /*
+ * flock only requires READ access and to support
+ * clients which send flock locks via NLM we
+ * report O_RDONLY for full-file locks.
+ */
+ if (lock->fl_start == 0 &&
+ lock->fl_end == NLM4_OFFSET_MAX)
+ return O_RDONLY;
return lock_is_write(lock) ? O_WRONLY : O_RDONLY;
}
But I wouldn't encourage him to.
NeilBrown
Jeff, do you have any opinion on what Neil suggested (see quote below).

But as Neil mentioned, it's a regression, so it must be handled some way. And it looks like this stalled.

Given that the commit that caused this is somewhat old, I wonder: Is that something we expect other people to run into? If yes, I'd say Linus expects us to fix this. And if not: is there something the Debian openQA infra (a) can and (b) is willing to do to work around this regression cleanly (by upgrading Qemu or something like that maybe)? Then we maybe can leave things as they are[1].

Ciao, Thorsten

[1] see the hand-holding aspect mentioned in https://www.kernel.org/doc/html/next/process/handling-regressions.html#on-exceptions-to-the-no-regressions-rule
Yes. NAK on the patch below. It would break legitimate cases where the lock should be denied. Neil is right not to encourage its use. As Neil points out, exclusive locks on NLM require write access. We're constrained by a protocol that doesn't have a provision for flock() style locks. It may technically be a regression since it worked before, but I'm wondering whether it ever should have. Has anyone experiencing this tried using the no_auth_nlm export option on the server? ISTM that that should work around the problem for these folks, even if it's not ideal.
I have to wonder if this is a QEMU bug too: Why is it opening a file read-only and then taking out an exclusive lock on it? What's the point of denying access to other readers?
It turns out that I mis-diagnosed the problem, i.e. I guessed wrong as to what weird thing qemu is doing.

qemu isn't using flock(). It is using fcntl() locking but at this point isn't trying to GET a lock, it is testing if a lock already exists, i.e. F_GETLK or F_OFD_GETLK.

F_GETLK doesn't require WRITE access, even when querying for an exclusive lock. But NFSD does :-)

So maybe this is the fix that we want.

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 255a847ca0b6..67234686ef8c 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -632,7 +632,7 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
goto out;
}
- mode = lock_to_openmode(&lock->fl);
+ mode = O_RDONLY;
locks_init_lock(&conflock->fl);
/* vfs_test_lock only uses start, end, and owner, but tests flc_file */
conflock->fl.c.flc_file = lock->fl.c.flc_file;

????

NeilBrown
Oh! That makes much more sense.
We definitely allow F_GETLK requests on local files when the task
doesn't have write access to the file, so I don't see any issue with
allowing it here. Your fix seems sensible to me.
Looking back, it looks like this may have been broken back in 2021 in:
7f024fcd5c97 ("Keep read and write fds with each nlm_file")
?
Cheers,
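
As a local illustration of the F_GETLK behaviour discussed above, here is a minimal sketch. It is not QEMU's code and not the reporter's nfs3_range_lock_test; the function name is hypothetical and it probes the whole file rather than the specific byte offsets QEMU uses. It asks whether an exclusive lock could be placed, using a descriptor that was only opened for reading.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and ask, via F_GETLK, whether
 * a whole-file write lock would conflict with an existing lock.
 *
 * Returns 0 if the query itself succeeded and stores the resulting l_type
 * in *type_out (F_UNLCK when no conflicting lock exists).  Returns the
 * errno of the failing call otherwise.
 */
int probe_write_lock_readonly(const char *path, short *type_out)
{
	assert(path != NULL && type_out != NULL);

	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	struct flock fl = {
		.l_type = F_WRLCK,	/* "could I take an exclusive lock?" */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,		/* whole file */
	};

	int err = 0;
	if (fcntl(fd, F_GETLK, &fl) == 0)
		*type_out = fl.l_type;
	else
		err = errno;

	close(fd);
	return err;
}
```

On a local filesystem the query is expected to succeed even though the descriptor is read-only. On an NFSv3 mount served by an affected kernel, the same call would be the one reported in this thread as failing with ENOLCK ("No locks available").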
From: NeilBrown <neil@brown.name>
The F_GETLK fcntl can work with either read access or write access or
both. It can query F_RDLCK and F_WRLCK locks in either case.
However lockd currently treats F_GETLK similar to F_SETLK in that read
access is required to query an F_RDLCK lock and write access is required
to query a F_WRLCK lock.
This is wrong and can cause problems - e.g. when qemu accesses a
read-only (e.g. iso) filesystem image over NFS (though why it queries
if it can get a write lock - I don't know. But it does, and this works
with local filesystems).
So we need TEST requests to be handled differently. To do this:
- change nlm_do_fopen() to accept O_RDWR as a mode and in that case
succeed if either a O_RDONLY or O_WRONLY file can be opened.
- change nlm_lookup_file() to accept a mode argument from caller,
instead of deducing it based on lock type, and pass that on to nlm_do_fopen()
- change nlm4svc_retrieve_args() and nlmsvc_retrieve_args() to detect
TEST requests and pass O_RDWR as a mode to nlm_lookup_file, passing
the same mode as before for other requests. Also set
lock->fl.c.flc_file to whichever file is available for TEST requests.
- change nlmsvc_testlock() to also not calculate the mode, but to use
whatever was stored in lock->fl.c.flc_file.
Reported-by: Tj <tj.iam.tj@proton.me>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1128861
Fixes: 7f024fcd5c97 ("Keep read and write fds with each nlm_file")
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/lockd/svc4proc.c | 13 ++++++++++---
fs/lockd/svclock.c | 4 +---
fs/lockd/svcproc.c | 15 ++++++++++++---
fs/lockd/svcsubs.c | 26 +++++++++++++++++---------
include/linux/lockd/lockd.h | 2 +-
5 files changed, 41 insertions(+), 19 deletions(-)
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 4b6f18d97734..75e020a8bfd0 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -26,6 +26,8 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
struct nlm_host *host = NULL;
struct nlm_file *file = NULL;
struct nlm_lock *lock = &argp->lock;
+ bool is_test = (rqstp->rq_proc == NLMPROC_TEST ||
+ rqstp->rq_proc == NLMPROC_TEST_MSG);
__be32 error = 0;
/* nfsd callbacks must have been installed for this procedure */
@@ -46,15 +48,20 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
if (filp != NULL) {
int mode = lock_to_openmode(&lock->fl);
+ if (is_test)
+ mode = O_RDWR;
+
lock->fl.c.flc_flags = FL_POSIX;
- error = nlm_lookup_file(rqstp, &file, lock);
+ error = nlm_lookup_file(rqstp, &file, lock, mode);
if (error)
goto no_locks;
*filp = file;
-
/* Set up the missing parts of the file_lock structure */
- lock->fl.c.flc_file = file->f_file[mode];
+ if (is_test)
+ lock->fl.c.flc_file = nlmsvc_file_file(file);
+ else
+ lock->fl.c.flc_file = file->f_file[mode];
lock->fl.c.flc_pid = current->tgid;
lock->fl.fl_start = (loff_t)lock->lock_start;
lock->fl.fl_end = lock->lock_len ?
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 255a847ca0b6..adfd8c072898 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -614,7 +614,6 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
struct nlm_lock *conflock)
{
int error;
- int mode;
__be32 ret;
dprintk("lockd: nlmsvc_testlock(%s/%ld, ty=%d, %Ld-%Ld)\n",
@@ -632,14 +631,13 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
goto out;
}
- mode = lock_to_openmode(&lock->fl);
locks_init_lock(&conflock->fl);
/* vfs_test_lock only uses start, end, and owner, but tests flc_file */
conflock->fl.c.flc_file = lock->fl.c.flc_file;
conflock->fl.fl_start = lock->fl.fl_start;
conflock->fl.fl_end = lock->fl.fl_end;
conflock->fl.c.flc_owner = lock->fl.c.flc_owner;
- error = vfs_test_lock(file->f_file[mode], &conflock->fl);
+ error = vfs_test_lock(lock->fl.c.flc_file, &conflock->fl);
if (error) {
ret = nlm_lck_denied_nolocks;
goto out;
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 5817ef272332..d98e8d684376 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -55,6 +55,8 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
struct nlm_host *host = NULL;
struct nlm_file *file = NULL;
struct nlm_lock *lock = &argp->lock;
+ bool is_test = (rqstp->rq_proc == NLMPROC_TEST ||
+ rqstp->rq_proc == NLMPROC_TEST_MSG);
int mode;
__be32 error = 0;
@@ -70,15 +72,22 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
/* Obtain file pointer. Not used by FREE_ALL call. */
if (filp != NULL) {
- error = cast_status(nlm_lookup_file(rqstp, &file, lock));
+ mode = lock_to_openmode(&lock->fl);
+
+ if (is_test)
+ mode = O_RDWR;
+
+ error = cast_status(nlm_lookup_file(rqstp, &file, lock, mode));
if (error != 0)
goto no_locks;
*filp = file;
/* Set up the missing parts of the file_lock structure */
- mode = lock_to_openmode(&lock->fl);
lock->fl.c.flc_flags = FL_POSIX;
- lock->fl.c.flc_file = file->f_file[mode];
+ if (is_test)
+ lock->fl.c.flc_file = nlmsvc_file_file(file);
+ else
+ lock->fl.c.flc_file = file->f_file[mode];
lock->fl.c.flc_pid = current->tgid;
lock->fl.fl_lmops = &nlmsvc_lock_operations;
nlmsvc_locks_init_private(&lock->fl, host, (pid_t)lock->svid);
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index dd0214dcb695..b92eb032849f 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -82,18 +82,28 @@ int lock_to_openmode(struct file_lock *lock)
*
* We have to make sure we have the right credential to open
* the file.
+ *
+ * mode can be O_RDONLY(0), O_WRONLY(1) or O_RDWR(2) meaning either
*/
static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
struct nlm_file *file, int mode)
{
- struct file **fp = &file->f_file[mode];
+ struct file **fp;
__be32 nfserr;
+ int m;
- if (*fp)
- return 0;
- nfserr = nlmsvc_ops->fopen(rqstp, &file->f_handle, fp, mode);
- if (nfserr)
- dprintk("lockd: open failed (error %d)\n", nfserr);
+ for (m = O_RDONLY ; m <= O_WRONLY ; m++) {
+ if (mode != O_RDWR && mode != m)
+ continue;
+
+ fp = &file->f_file[m];
+ if (*fp)
+ return 0;
+ nfserr = nlmsvc_ops->fopen(rqstp, &file->f_handle, fp, m);
+ if (!nfserr)
+ return 0;
+ }
+ dprintk("lockd: open failed (error %d)\n", nfserr);
return nfserr;
}
@@ -103,17 +113,15 @@ static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
*/
__be32
nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file **result,
- struct nlm_lock *lock)
+ struct nlm_lock *lock, int mode)
{
struct nlm_file *file;
unsigned int hash;
__be32 nfserr;
- int mode;
nlm_debug_print_fh("nlm_lookup_file", &lock->fh);
hash = file_hash(&lock->fh);
- mode = lock_to_openmode(&lock->fl);
/* Lock file table */
mutex_lock(&nlm_file_mutex);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 330e38776bb2..fe5cdd4d66f4 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -294,7 +294,7 @@ void nlmsvc_locks_init_private(struct file_lock *, struct nlm_host *, pid_t);
* File handling for the server personality
*/
__be32 nlm_lookup_file(struct svc_rqst *, struct nlm_file **,
- struct nlm_lock *);
+ struct nlm_lock *, int);
void nlm_release_file(struct nlm_file *);
void nlmsvc_put_lockowner(struct nlm_lockowner *);
void nlmsvc_release_lockowner(struct nlm_lock *);
Meaning either... ? Reviewed-by: Jeff Layton <jlayton@kernel.org>
Hi Neil, which kernels should this fix apply to?
If O_RDWR is given then either a O_RDONLY file or a O_WRONLY file is provided. I see now that isn't obvious from the text... Thanks, NeilBrown
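
The "O_RDWR means either" convention can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the kernel code: struct cached_file and open_for_mode() are hypothetical stand-ins for struct nlm_file and nlm_do_fopen(), and plain open(2) stands in for nlmsvc_ops->fopen().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* The loop below indexes an array by open mode, as the patch does. */
_Static_assert(O_RDONLY == 0 && O_WRONLY == 1 && O_RDWR == 2,
	       "sketch assumes the Linux values of the access-mode flags");

/* Hypothetical stand-in for the two cached files in struct nlm_file. */
struct cached_file {
	int fd[2];	/* indexed by O_RDONLY / O_WRONLY; -1 means "not open" */
};

/*
 * mode is O_RDONLY, O_WRONLY or O_RDWR.  O_RDWR does NOT mean "open for
 * both"; it means "succeed if EITHER a read-only or a write-only
 * descriptor is (or can be made) available".
 *
 * Returns 0 on success, otherwise the errno of the last failed open().
 */
int open_for_mode(struct cached_file *cf, const char *path, int mode)
{
	assert(cf != NULL && path != NULL);

	int err = EINVAL;	/* returned if mode matches no slot */

	for (int m = O_RDONLY; m <= O_WRONLY; m++) {
		if (mode != O_RDWR && mode != m)
			continue;	/* caller asked for the other slot only */

		if (cf->fd[m] >= 0)
			return 0;	/* already cached */

		cf->fd[m] = open(path, m);
		if (cf->fd[m] >= 0)
			return 0;

		err = errno;		/* remember failure, maybe try next slot */
	}
	return err;
}
```

With a struct cached_file whose slots are initialised to -1, a path that can only be opened for reading yields an error for O_WRONLY but success for O_RDWR, with only the read-only slot filled in. That is the property TEST requests rely on in the patch.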
The Fixes: tag is actually wrong. This bug has been present forever.
However a different bug that
Commit: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
fixed was hiding the bug.
So it should probably be marked
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
with an explanation.
NeilBrown
Assuming that includes upstream, I recommend that I take this into nfsd-testing / nfsd-next and let nature, ah, er, stable automation, take its course. the issue (Fixes: since forever) and then use a "# v6.13+" comment on the Cc: stable to control how far back to backport it. Commit message could mention that 4cc9b9f2bf4d uncovered the issue.
6.12.y is also affected since commit 4cc9b9f2bf4d was backported there (triggering this bug report). Ben.
points out, it will miss older kernels that have backported 4cc9b9f2bf4d. We know which kernel.org kernels that includes, but not what other organisations might maintain for their own purposes. So I think we meet the needs of automation best by saying: Fixes: 4cc9b9f2bf4d even though that didn't introduce the bug but only exposed the bug. NeilBrown
From: NeilBrown <neil@brown.name>
The F_GETLK fcntl can work with either read access or write access or
both. It can query F_RDLCK and F_WRLCK locks in either case.
However lockd currently treats F_GETLK similar to F_SETLK in that read
access is required to query an F_RDLCK lock and write access is required
to query a F_WRLCK lock.
This is wrong and can cause problems - e.g. when qemu accesses a
read-only (e.g. iso) filesystem image over NFS (though why it queries
if it can get a write lock - I don't know. But it does, and this works
with local filesystems).
So we need TEST requests to be handled differently. To do this:
- change nlm_do_fopen() to accept O_RDWR as a mode and in that case
succeed if either a O_RDONLY or O_WRONLY file can be opened.
- change nlm_lookup_file() to accept a mode argument from caller,
instead of deducing it based on lock type, and pass that on to nlm_do_fopen()
- change nlm4svc_retrieve_args() and nlmsvc_retrieve_args() to detect
TEST requests and pass O_RDWR as a mode to nlm_lookup_file, passing
the same mode as before for other requests. Also set
lock->fl.c.flc_file to whichever file is available for TEST requests.
- change nlmsvc_testlock() to also not calculate the mode, but to use
whatever was stored in lock->fl.c.flc_file.
This behaviour of lockd - requesting O_WRONLY access to TEST for
exclusive locks - has been present at least since git history began.
However it was hidden until recently because knfsd ignored the access
requested by lockd and required only READ access for all locking
requests (unless the underlying filesystem provided an f_op->open
function which checked access permissions).
The commit mentioned in Fixes: below changed nfsd_permission() to NOT
override the access request for LOCK requests and this exposed the bug
that we are now fixing.
Note that there is another issue that this patch does not address.
The flock(.., LOCK_EX) call is permitted on a read-only file descriptor.
Linux NFS maps this to NLM locking as whole-file byte-range locks.
nfsd will see this as though it were fcntl( F_SETLK (F_WRLCK)) and will
now require write access, which it might not be able to get.
It is not clear if this is a problem in practice, or what the best
solution might be. So no attempt is made to address it.
Reported-by: Tj <tj.iam.tj@proton.me>
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1128861
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
---
fs/lockd/svc4proc.c | 13 ++++++++++---
fs/lockd/svclock.c | 4 +---
fs/lockd/svcproc.c | 15 ++++++++++++---
fs/lockd/svcsubs.c | 28 +++++++++++++++++++---------
include/linux/lockd/lockd.h | 2 +-
5 files changed, 43 insertions(+), 19 deletions(-)
diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c
index 4b6f18d97734..75e020a8bfd0 100644
--- a/fs/lockd/svc4proc.c
+++ b/fs/lockd/svc4proc.c
@@ -26,6 +26,8 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
struct nlm_host *host = NULL;
struct nlm_file *file = NULL;
struct nlm_lock *lock = &argp->lock;
+ bool is_test = (rqstp->rq_proc == NLMPROC_TEST ||
+ rqstp->rq_proc == NLMPROC_TEST_MSG);
__be32 error = 0;
/* nfsd callbacks must have been installed for this procedure */
@@ -46,15 +48,20 @@ nlm4svc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
if (filp != NULL) {
int mode = lock_to_openmode(&lock->fl);
+ if (is_test)
+ mode = O_RDWR;
+
lock->fl.c.flc_flags = FL_POSIX;
- error = nlm_lookup_file(rqstp, &file, lock);
+ error = nlm_lookup_file(rqstp, &file, lock, mode);
if (error)
goto no_locks;
*filp = file;
-
/* Set up the missing parts of the file_lock structure */
- lock->fl.c.flc_file = file->f_file[mode];
+ if (is_test)
+ lock->fl.c.flc_file = nlmsvc_file_file(file);
+ else
+ lock->fl.c.flc_file = file->f_file[mode];
lock->fl.c.flc_pid = current->tgid;
lock->fl.fl_start = (loff_t)lock->lock_start;
lock->fl.fl_end = lock->lock_len ?
diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 255a847ca0b6..adfd8c072898 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -614,7 +614,6 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
struct nlm_lock *conflock)
{
int error;
- int mode;
__be32 ret;
dprintk("lockd: nlmsvc_testlock(%s/%ld, ty=%d, %Ld-%Ld)\n",
@@ -632,14 +631,13 @@ nlmsvc_testlock(struct svc_rqst *rqstp, struct nlm_file *file,
goto out;
}
- mode = lock_to_openmode(&lock->fl);
locks_init_lock(&conflock->fl);
/* vfs_test_lock only uses start, end, and owner, but tests flc_file */
conflock->fl.c.flc_file = lock->fl.c.flc_file;
conflock->fl.fl_start = lock->fl.fl_start;
conflock->fl.fl_end = lock->fl.fl_end;
conflock->fl.c.flc_owner = lock->fl.c.flc_owner;
- error = vfs_test_lock(file->f_file[mode], &conflock->fl);
+ error = vfs_test_lock(lock->fl.c.flc_file, &conflock->fl);
if (error) {
ret = nlm_lck_denied_nolocks;
goto out;
diff --git a/fs/lockd/svcproc.c b/fs/lockd/svcproc.c
index 5817ef272332..d98e8d684376 100644
--- a/fs/lockd/svcproc.c
+++ b/fs/lockd/svcproc.c
@@ -55,6 +55,8 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
struct nlm_host *host = NULL;
struct nlm_file *file = NULL;
struct nlm_lock *lock = &argp->lock;
+ bool is_test = (rqstp->rq_proc == NLMPROC_TEST ||
+ rqstp->rq_proc == NLMPROC_TEST_MSG);
int mode;
__be32 error = 0;
@@ -70,15 +72,22 @@ nlmsvc_retrieve_args(struct svc_rqst *rqstp, struct nlm_args *argp,
/* Obtain file pointer. Not used by FREE_ALL call. */
if (filp != NULL) {
- error = cast_status(nlm_lookup_file(rqstp, &file, lock));
+ mode = lock_to_openmode(&lock->fl);
+
+ if (is_test)
+ mode = O_RDWR;
+
+ error = cast_status(nlm_lookup_file(rqstp, &file, lock, mode));
if (error != 0)
goto no_locks;
*filp = file;
/* Set up the missing parts of the file_lock structure */
- mode = lock_to_openmode(&lock->fl);
lock->fl.c.flc_flags = FL_POSIX;
- lock->fl.c.flc_file = file->f_file[mode];
+ if (is_test)
+ lock->fl.c.flc_file = nlmsvc_file_file(file);
+ else
+ lock->fl.c.flc_file = file->f_file[mode];
lock->fl.c.flc_pid = current->tgid;
lock->fl.fl_lmops = &nlmsvc_lock_operations;
nlmsvc_locks_init_private(&lock->fl, host, (pid_t)lock->svid);
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index dd0214dcb695..865ff6844743 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -82,18 +82,30 @@ int lock_to_openmode(struct file_lock *lock)
*
* We have to make sure we have the right credential to open
* the file.
+ *
+ * mode can be O_RDONLY(0), O_WRONLY(1) or O_RDWR(2).
+ * The latter means success can be achieved with EITHER O_RDONLY or
+ * O_WRONLY. It does NOT mean both read and write are required.
*/
static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
struct nlm_file *file, int mode)
{
- struct file **fp = &file->f_file[mode];
+ struct file **fp;
__be32 nfserr;
+ int m;
- if (*fp)
- return 0;
- nfserr = nlmsvc_ops->fopen(rqstp, &file->f_handle, fp, mode);
- if (nfserr)
- dprintk("lockd: open failed (error %d)\n", nfserr);
+ for (m = O_RDONLY ; m <= O_WRONLY ; m++) {
+ if (mode != O_RDWR && mode != m)
+ continue;
+
+ fp = &file->f_file[m];
+ if (*fp)
+ return 0;
+ nfserr = nlmsvc_ops->fopen(rqstp, &file->f_handle, fp, m);
+ if (!nfserr)
+ return 0;
+ }
+ dprintk("lockd: open failed (error %d)\n", nfserr);
return nfserr;
}
@@ -103,17 +115,15 @@ static __be32 nlm_do_fopen(struct svc_rqst *rqstp,
*/
__be32
nlm_lookup_file(struct svc_rqst *rqstp, struct nlm_file **result,
- struct nlm_lock *lock)
+ struct nlm_lock *lock, int mode)
{
struct nlm_file *file;
unsigned int hash;
__be32 nfserr;
- int mode;
nlm_debug_print_fh("nlm_lookup_file", &lock->fh);
hash = file_hash(&lock->fh);
- mode = lock_to_openmode(&lock->fl);
/* Lock file table */
mutex_lock(&nlm_file_mutex);
diff --git a/include/linux/lockd/lockd.h b/include/linux/lockd/lockd.h
index 330e38776bb2..fe5cdd4d66f4 100644
--- a/include/linux/lockd/lockd.h
+++ b/include/linux/lockd/lockd.h
@@ -294,7 +294,7 @@ void nlmsvc_locks_init_private(struct file_lock *, struct nlm_host *, pid_t);
* File handling for the server personality
*/
__be32 nlm_lookup_file(struct svc_rqst *, struct nlm_file **,
- struct nlm_lock *);
+ struct nlm_lock *, int);
void nlm_release_file(struct nlm_file *);
void nlmsvc_put_lockowner(struct nlm_lockowner *);
void nlmsvc_release_lockowner(struct nlm_lock *);
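To illustrate the mode handling that the patch comment describes (O_RDWR meaning that either an O_RDONLY or an O_WRONLY open is sufficient), here is a minimal userspace sketch of the loop in the patched nlm_do_fopen(). It is not kernel code: struct model_file, model_fopen(), model_do_fopen() and the can_read/can_write flags are hypothetical stand-ins for struct nlm_file, nlmsvc_ops->fopen() and whatever permissions the server actually grants. It assumes the Linux values O_RDONLY=0, O_WRONLY=1 and O_RDWR=2, as the patch comment does.

```c
#include <assert.h>
#include <fcntl.h>   /* O_RDONLY (0), O_WRONLY (1), O_RDWR (2) on Linux */
#include <stddef.h>

/* Hypothetical stand-in for struct nlm_file: one cached handle per mode. */
struct model_file {
    const char *handle[2];  /* indexed by O_RDONLY / O_WRONLY */
    int can_read;           /* assumed permissions, for illustration only */
    int can_write;
};

/* Hypothetical stand-in for nlmsvc_ops->fopen(): 0 on success, nonzero on error. */
int model_fopen(struct model_file *f, int m)
{
    if (m == O_RDONLY && f->can_read) {
        f->handle[m] = "read handle";
        return 0;
    }
    if (m == O_WRONLY && f->can_write) {
        f->handle[m] = "write handle";
        return 0;
    }
    return -1;
}

/*
 * Mirrors the loop in the patched nlm_do_fopen(): a specific mode tries
 * only that mode, while O_RDWR tries O_RDONLY and then O_WRONLY and
 * succeeds as soon as either one is (or already was) opened.
 */
int model_do_fopen(struct model_file *f, int mode)
{
    int err = -1;

    for (int m = O_RDONLY; m <= O_WRONLY; m++) {
        if (mode != O_RDWR && mode != m)
            continue;           /* caller asked for the other mode */
        if (f->handle[m])
            return 0;           /* already open in this mode */
        err = model_fopen(f, m);
        if (!err)
            return 0;
    }
    return err;                 /* error from the last attempt */
}
```

With a file that can only be opened for reading, model_do_fopen(&f, O_WRONLY) fails, while model_do_fopen(&f, O_RDWR) succeeds through the read handle. That is the behaviour the patch relies on when it sets mode = O_RDWR for TEST requests.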
We have a new problem now (or maybe just I do). I think the stable folks will insist on this fix going into upstream first. However, this version of the fix does not apply to nfsd-testing because that branch has the NLMv4 xdrgen rewrite. That rewrite is not likely to be backported to LTS. So this version will have to be directed to stable once upstream has been fixed. We cannot rely on automation to get upstream's version of the fix backported to v6.19 and earlier.
Hello Chuck, This is not the first time a bug has been fixed by changes that are too intrusive to backport. Usually the stable maintainers can be talked into accepting a small, targeted fix even if it's not upstream. The discussion is easier when people report that they have tested the fix and confirm that it helps. Best regards, Uwe
Sorry I wasn't clear. Neil and I have also been down this road before so I wasn't explicit about my request. I'd like Neil to provide a patch for upstream against nfsd-testing. Once that is merged, he can present the patch from this thread to the stable/LTS maintainers.
We believe that the bug you reported is fixed in the latest version of
linux, which is due to be installed in the Debian FTP archive.
A summary of the changes between this version and the previous one is
attached.
Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to 1128861@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software
pp.
Ben Hutchings <benh@debian.org> (supplier of updated linux package)
(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)
Format: 1.8
Date: Tue, 09 Jun 2026 21:20:58 +0200
Source: linux
Architecture: source
Version: 7.1~rc7-1~exp1
Distribution: experimental
Urgency: medium
Maintainer: Debian Kernel Team <debian-kernel@lists.debian.org>
Changed-By: Ben Hutchings <benh@debian.org>
Closes: 1128861
Changes:
linux (7.1~rc7-1~exp1) experimental; urgency=medium
.
* New upstream release candidate.
- lockd: fix TEST handling when not all permissions are available.
(Closes: #1128861)
.
[ Bastian Blank ]
* Move unsigned vmlinuz out of boot dir.
.
[ Aurelien Jarno ]
* [riscv64] Bump CMA_SIZE_MBYTES to 96 from 64.
Checksums-Sha1:
e67cd5d27697da689e3170f558c4f445371ed574 183133 linux_7.1~rc7-1~exp1.dsc
798b4ab9c7b48ab340ba4d9731ff0fad9e0dc58a 161556604 linux_7.1~rc7.orig.tar.xz
f5e87e129f8bde02586e482ede37fd6e5de71da8 1457580 linux_7.1~rc7-1~exp1.debian.tar.xz
51662ebc82f002326268c2e548c02090133a65f5 6948 linux_7.1~rc7-1~exp1_source.buildinfo
Checksums-Sha256:
8b03ec2d53ac93632ae2b358747e415a53ad9c086464e061f5621001a597db92 183133 linux_7.1~rc7-1~exp1.dsc
4f682b29c6881b2169e1abcdcf9ee8309dd47073af8137f3e70c000f8e6a96ca 161556604 linux_7.1~rc7.orig.tar.xz
500c8bc024ecf82e8493684bd3bf4e69bcff4a459336cf67ce604858b170c10d 1457580 linux_7.1~rc7-1~exp1.debian.tar.xz
b2a1afcb112791cb68bf5df0e26bf6f98ceb288e6a932584de33c8a1a435375d 6948 linux_7.1~rc7-1~exp1_source.buildinfo
Files:
3795dfee719656570c4251ffdcd669f1 183133 kernel optional linux_7.1~rc7-1~exp1.dsc
2829430e84bf4d1b224bcb52d0736968 161556604 kernel optional linux_7.1~rc7.orig.tar.xz
fb4f0c36cf2f0dd6790dee028d1b7a20 1457580 kernel optional linux_7.1~rc7-1~exp1.debian.tar.xz
25985dba0a5749a67db11a4791c743ac 6948 kernel optional linux_7.1~rc7-1~exp1_source.buildinfo