Hi, overlayfs does not support renaming directories when the directories live in the lower filesystem: * Directory renames only allowed on "pure upper" (already created on * upper filesystem, never copied up). Directories which are on lower or * are merged may not be renamed. For these -EXDEV is returned and * userspace has to deal with it. This means, when copying up a * directory we can rely on it and ancestors being stable. https://github.com/torvalds/linux/blob/v4.8-rc2/fs/overlayfs/copy_up.c#L318-L322 Unfortunately this means that dpkg fails at least in the case where a directory is converted into a file: apt 1.3~rc2 moves /usr/share/bug/apt/script to /usr/share/bug/apt . This causes dpkg to fail when running in an overlayfs with the following error: # dpkg -i /var/cache/apt/archives/apt_1.3~rc3_amd64.deb (Reading database ... 11401 files and directories currently installed.) Preparing to unpack .../archives/apt_1.3~rc3_amd64.deb ... Unpacking apt (1.3~rc3) over (1.3~rc2) ... dpkg: error processing archive /var/cache/apt/archives/apt_1.3~rc3_amd64.deb (--install): unable to move aside './usr/share/bug/apt' to install new version: Invalid cross-device link dpkg-deb: error: subprocess paste was killed by signal (Broken pipe) Processing triggers for libc-bin (2.23-5) ... Errors were encountered while processing: /var/cache/apt/archives/apt_1.3~rc3_amd64.deb I don't know what the correct workaround should be. This can break pre-build upgrades in sbuild when the schroot is of overlayfs type.
Hi, thanks for the analysis! I just stumbled across the exact same problem with sbuild in a directory type overlayfs schroot. My workaround: destroy the chroot (you can now use sbuild-destroychroot) and then recreate it with sbuild-createchroot. :( cheers, josch
wrote: or https://github.com/torvalds/linux/blob/v4.8-rc2/fs/overlayfs/copy_up.c#L318-L322 Invalid cross-device link type You can also enter the source chroot and upgrade there, as that skips the overlayfs setup. I used sbuild-update and it worked OK. I discussed this with Guillem on IRC and he doesn't think adding workarounds for this is reasonable, and this should be fixed in overlayfs. Assuming a directory lives in the same device as itself seems reasonable to me, and overlayfs should try hard to pretend it does. Saludos
Hi,
[...]
That's the change at the package level. The operation that fails
at the dpkg level is rename('/usr/share/bug/apt', '/usr/share/bug/apt.dpkg-tmp').
[...]
I got the same failure here but I also saw similar reports in a Kali live
system where overlayfs is used for persistence (and with the upgrade
of gedit). So that error is affecting many Kali users. It's not a
very rare error (ex: https://bugs.kali.org/view.php?id=3473,
https://bugs.kali.org/view.php?id=3476,
https://bugs.kali.org/view.php?id=3361,
https://bugs.kali.org/view.php?id=3365).
I'm putting in copy the overlayfs kernel maintainer... Hello Miklos,
that restriction above (cf
https://github.com/torvalds/linux/blob/v4.8-rc2/fs/overlayfs/copy_up.c#L318-L322)
is very problematic for us.
Do you have plans to get rid of it?
It does not seem very correct to put the burden on user-space to be aware
of overlayfs restrictions such as this one. Renaming a directory is
not something that happens often in practice in the uses cases where
we use overlayfs but it's still frequent enough to deserve better than
a EXDEV error IMO and dpkg trying to rename "foo/" into "foo.dpkg-tmp/"
is in its right to assume that this rename will not cross any device
boundary.
I'm happy to test out kernel patches if I can help you in the process
of getting a fix.
Cheers,
You are the first to report that this quirk actually causes problems in real life. I have plans to fix this by adding a "redirector" attribute so that copied up directories (and perhaps files) can refer to lower directory that is found at a different location relative to the root of the overlay from the location of the upper directory. Maybe an example can make this more clear: lower/foo/bar/baz upper/foo/bar <- whiteout upper/moved/here <- redirects to "foo/bar" will result in the tree: overlay/moved/here/baz The EXDEV trick just works for mv(1), hence this didn't seem to be a big issue in practice. Thanks, Miklos
Hi! That'd be very much appreciated, thanks! In dpkg, the directory is rename(2)d to perform an atomic temporary backup, so that it can be rolled-back in case of errors during unpacking. Having to possibly do a deep-copy and a deep-remove, and then trying to recover from errors inbetween those, would be very unsatisfactory. Thanks, Guillem
Hi, Just for the record, the attached patch is what we use in Kali to work-around this problem. It works but it's not going to be merged in Debian because we really want a fix at the overlayfs level instead (and this is just a hack relying on "mv" to copy the whole-directory). I post it just in case other derivatives have a similar need. Cheers,
Toy patch attached. Passes minimal test. It does not deal with multi-layer setups when one of the lower layers was previously an upper layer (generally overlayfs does support this, so this patch will need to deal with that case). You can help by - review it - improve it - test it - break it - write test cases in xfstests Thanks, Miklos--- fs/overlayfs/dir.c | 42 ++++++++++++++++++++-- fs/overlayfs/overlayfs.h | 1 fs/overlayfs/super.c | 89 +++++++++++++++++++++++++++++++++++++++++------ 3 files changed, 119 insertions(+), 13 deletions(-)--- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -764,6 +764,31 @@ static int ovl_rmdir(struct inode *dir, return ovl_do_remove(dentry, true); } +static bool ovl_can_move(struct dentry *dentry, enum ovl_path_type type) +{ +// return !d_is_dir(dentry) || !OVL_TYPE_MERGE_OR_LOWER(type); + return true; +} + +static int ovl_set_redirect(struct dentry *dentry) +{ + char *buf = __getname(); + char *path; + int err; + + if (!buf) + return -ENOMEM; + + path = dentry_path_raw(dentry, buf, PATH_MAX); + + printk(KERN_WARNING "redirect: <%s>\n", path); + err = ovl_do_setxattr(ovl_dentry_upper(dentry), OVL_XATTR_REDIRECT, + path, strlen(path), 0); + __putname(buf); + + return err; +} + static int ovl_rename2(struct inode *olddir, struct dentry *old, struct inode *newdir, struct dentry *new, unsigned int flags) @@ -798,7 +823,7 @@ static int ovl_rename2(struct inode *old /* Don't copy up directory trees */ old_type = ovl_path_type(old); err = -EXDEV; - if (OVL_TYPE_MERGE_OR_LOWER(old_type) && is_dir) + if (!ovl_can_move(old, old_type)) goto out; if (new->d_inode) { @@ -811,7 +836,7 @@ static int ovl_rename2(struct inode *old new_type = ovl_path_type(new); err = -EXDEV; - if (!overwrite && OVL_TYPE_MERGE_OR_LOWER(new_type) && new_is_dir) + if (!overwrite && !ovl_can_move(new, new_type)) goto out; err = 0; @@ -883,7 +908,6 @@ static int ovl_rename2(struct inode *old trap = lock_rename(new_upperdir, old_upperdir); - olddentry = lookup_one_len(old->d_name.name, old_upperdir, old->d_name.len); err = PTR_ERR(olddentry); @@ -920,11 +944,23 @@ static int ovl_rename2(struct inode *old if (newdentry == trap) goto out_dput; + printk(KERN_WARNING "is_dir: %i, old_opaque: %i, old_type: %i\n", + is_dir, old_opaque, old_type); + if (is_dir && OVL_TYPE_MERGE_OR_LOWER(old_type)) { + err = ovl_set_redirect(old); + if (err) + goto out_dput; + } if (is_dir && !old_opaque && new_opaque) { err = ovl_set_opaque(olddentry); if (err) goto out_dput; } + if (!overwrite && new_is_dir && OVL_TYPE_MERGE_OR_LOWER(new_type)) { + err = ovl_set_redirect(new); + if (err) + goto out_dput; + } if (!overwrite && new_is_dir && old_opaque && !new_opaque) { err = ovl_set_opaque(newdentry); if (err) --- a/fs/overlayfs/overlayfs.h +++ b/fs/overlayfs/overlayfs.h @@ -26,6 +26,7 @@ enum ovl_path_type { #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay." #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque" +#define OVL_XATTR_REDIRECT OVL_XATTR_PREFIX "redirect" #define OVL_ISUPPER_MASK 1UL--- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -285,6 +285,33 @@ static bool ovl_is_opaquedir(struct dent return false; } +static int ovl_check_redirect(struct dentry *dentry, char **bufp) +{ + struct inode *inode = dentry->d_inode; + int res; + + if (!S_ISDIR(inode->i_mode) || !inode->i_op->getxattr) + return false; + + res = inode->i_op->getxattr(dentry, inode, OVL_XATTR_REDIRECT, NULL, 0); + if (res > 0) { + char *buf = kzalloc(res + 1, GFP_KERNEL); + + if (!buf) + return -ENOMEM; + + res = inode->i_op->getxattr(dentry, inode, OVL_XATTR_REDIRECT, + buf, res); + if (res < 0) + return res; + + kfree(*bufp); + *bufp = buf; + } + + return 0; +} + static void ovl_dentry_release(struct dentry *dentry) { struct ovl_entry *oe = dentry->d_fsdata; @@ -465,6 +492,9 @@ int ovl_path_next(int idx, struct dentry return (idx < oe->numlower) ? idx + 1 : -1; } +extern int vfs_path_lookup(struct dentry *, struct vfsmount *, + const char *, unsigned int, struct path *); + struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { @@ -478,6 +508,7 @@ struct dentry *ovl_lookup(struct inode * struct dentry *this, *prev = NULL; unsigned int i; int err; + char *redirect = NULL; upperdir = ovl_upperdentry_dereference(poe); if (upperdir) { @@ -498,11 +529,18 @@ struct dentry *ovl_lookup(struct inode * upperopaque = true; } else if (poe->numlower && ovl_is_opaquedir(this)) { upperopaque = true; + } else if (d_is_dir(this)) { + err = ovl_check_redirect(this, &redirect); + if (err) + goto out; } } upperdentry = prev = this; } + if (redirect) + poe = dentry->d_sb->s_root->d_fsdata; + if (!upperopaque && poe->numlower) { err = -ENOMEM; stack = kcalloc(poe->numlower, sizeof(struct path), GFP_KERNEL); @@ -514,16 +552,45 @@ struct dentry *ovl_lookup(struct inode * bool opaque = false; struct path lowerpath = poe->lowerstack[i]; - this = ovl_lookup_real(dentry->d_sb, - lowerpath.dentry, &dentry->d_name); - err = PTR_ERR(this); - if (IS_ERR(this)) { - /* - * If it's positive, then treat ENAMETOOLONG as ENOENT. - */ - if (err == -ENAMETOOLONG && (upperdentry || ctr)) - continue; - goto out_put; + if (!redirect) { + this = ovl_lookup_real(dentry->d_sb, + lowerpath.dentry, &dentry->d_name); + err = PTR_ERR(this); + if (IS_ERR(this)) { + /* + * If it's positive, then treat ENAMETOOLONG as ENOENT. + */ + if (err == -ENAMETOOLONG && (upperdentry || ctr)) + continue; + goto out_put; + } + } else { + struct path thispath; + + err = vfs_path_lookup(lowerpath.dentry, lowerpath.mnt, + redirect, 0, &thispath); + + if (err) { + if (err == -ENOENT) + this = NULL; + else if (err == -ENAMETOOLONG) + this = NULL; + } else { + this = thispath.dentry; + mntput(thispath.mnt); + if (err == -ENOENT) { + dput(this); + this = NULL; + } else if (!this->d_inode) { + dput(this); + this = NULL; + } else if (ovl_dentry_weird(this)) { + dput(this); + err = -EREMOTE; + } + } + if (err) + goto out_put; } if (!this) continue; @@ -592,6 +659,7 @@ struct dentry *ovl_lookup(struct inode * oe->__upperdentry = upperdentry; memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr); kfree(stack); + kfree(redirect); dentry->d_fsdata = oe; d_add(dentry, inode); @@ -606,6 +674,7 @@ struct dentry *ovl_lookup(struct inode * out_put_upper: dput(upperdentry); out: + kfree(redirect); return ERR_PTR(err); }
Hi! [ Reassigning to linux, where this can be fixed. Upstream has provided a preliminary patch to this bug report, which might need testing and perhaps some polishing? Trimmed irrelevant information. ] Thanks, Guillem
Version: 4.10.0-1~exp1 I verified today that the issue is gone with Linux 4.12 and apparently the appropriate patches have been merged for Linux 4.10 already (merged by linus in commit ff0f962ca3c38239b299a70e7eea27abfbb979c3). So I'm closing this bug. Cheers,
Control: reopen -1 Control: notfixed -1 4.10.0-1~exp1 Control: found -1 4.12.6-1 I don't know how I did my check last time, but I was wrong. The changes merged above fixed issues about tools being confused with the unexpected st_dev values on files but they did not fix the fact that overlayfs might return EXDEV when renaming a directory that is part of the lower layer. So I'm opening the bug again. Cheers,
Hi, Doing some maintenance on open kernel bugs. Is this by now still an issue with recent kernels? If not can we close this bug? Part of the isuses at leas were fixed in 4.10.0-1~exp1, but what about the remaining or followup issue mentioned by Raphael? Regards, Salvatore
Hi,
the EXDEV return value is still reproducible:
$ mkdir -p /tmp/ovl/{lower/dir,upper,work,mount}
$ sudo mount -t overlay -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work none /tmp/ovl/mount
$ strace -e trace=rename,renameat,renameat2 mv /tmp/ovl/mount/dir /tmp/ovl/mount/newname
renameat2(AT_FDCWD, "/tmp/ovl/mount/dir", AT_FDCWD, "/tmp/ovl/mount/newname", RENAME_NOREPLACE) = -1 EXDEV (Invalid cross-device link)
+++ exited with 0 +++
and in dpkg sources I cannot find any special handling for EXDEV. From
my point of view, the bug is still not fixed.
Kind regards,
Nicolas
On Wed, 31 Aug 2016 12:16:10 -0300 Felipe Sateler <fsateler@debian.org> wrote:> unable to move aside './usr/share/bug/apt' to install new version: Invalid cross-device link by overlayfs. Kernel version on host: 5.12.8.arch1-1 It works fine with kernel 5.10.41-1.