#1058752 coreutils: cp/mv changed return code in incompatible way

Package:
coreutils
Source:
coreutils
Description:
GNU core utilities
Submitter:
Chris Hofstaedtler
Date:
2024-02-01 00:39:05 UTC
Severity:
normal
Tags:
#1058752#5
Date:
2023-12-15 15:12:36 UTC
From:
To:
Hi,

cp and mv -n changed their behaviour with regard to their exit code
when doing nothing, and also now write to stderr.

This causes a few problems in existing packages:

1) initramfs-tools autopkgtests fail, and it is unclear if
initramfs images with busybox actually work. Compare #1058630 and
[1]

2) scanbd now fails to install, see #1058749

And maybe other, so far unreported problems.


The upstream bug report appears to be stalled:
https://debbugs.gnu.org/cgi/bugreport.cgi?bug=62572

What will be done about this?


Should all affected packages (which are these?) workaround the
fallout?


Chris



[1] https://ci.debian.net/packages/i/initramfs-tools/testing/amd64/41037580/

#1058752#16
Date:
2023-12-15 15:56:03 UTC
From:
To:
I tend to think this was a serious mistake: it breaks the behavior of
existing scripts with no deprecation period. A stated advantage is
better compatibility with freebsd, but I don't understand why that is
more desirable than compatibility with all deployed gnu/linux systems? I
also don't think it's sufficient to try to lawyer out by saying that the
current behavior was undocumented: the previous documentation said that
-n would "silently do nothing" and that the return code would be zero on 
success. Logically, unless cp fails to "do nothing", it should exit with
a zero code.

Such a drastic change in behavior demands a new flag, not a radical
repurposing of a widely used existing flag.

I was hoping to see more action on this bug, but that hasn't happened.
I'm not sure I see a way forward for debian other than reverting to the
old behavior. I am reluctant to do so as that will likely lead to
divergent behavior between distributions, but breaking scripts without a
compelling reason is also not good. I would encourage coreutils to
reconsider the change and finding a non-breaking way forward.

Michael Stone

#1058752#23
Date:
2023-12-15 18:33:00 UTC
From:
To:
It's an awkward case, and worth discussing.

To summarise:

   coreutils >= 7.1 had -n skip existing in dest (2009)
   coreutils >= 9.2 has -n immediately fail if existing in dest
   coreutils >= 9.3 has --update=none to skip existing in dest

   FreeBSD >= 4.7/macos has -n immediately fail if existing in dest

   bash has noclobber as a file protection mechanism,
   and fails immediately upon trying to overwrite a file.
   This is more consistent with the new coreutils behavior.

I see a reasonable amount of cp -n usage across github:
https://github.com/search?q=/cp+.*+-n+.*/+path:*.sh&type=code

Now it's not clear which behavior these github usages expect,
and the original docs didn't make it clear which behavior to expect.
A quick scan of the github usages also seem mainly to expect
a protection rather than an update use case, so failing
immediately would be the most appropriate action there too.
Also the original coreutils bug report here expected the new behaviour.

So we probably all agree that failing immediately is the
most appropriate / consistent -n behavior,
but GNU had diverged from that so there are about 10 years
of scripts that may expect the silent skip behavior.

Two options I see are:

- Leave as is and fix -n usages that expected the skip behavior
- Deprecate -n entirely and prompt to use --update={fail,none}

Advantages of leaving as is:
We get consistency of "noclobber" behavior across systems / shells.
We fix cases where previously scripts could have proceeded with
stale old files in place.

Disadvantages of leaving as is:
Users expecting the skip behavior, have to change to --update=none.

There is no potential for data loss etc. so it just comes
down to how disruptive it is, or how often -n was used
with the "skip behavior" assumption.

We've not had much push back as of yet,
and my current thinking is it's not that disruptive a change.
So I'd be 55:45 if favor of keeping things as is.

thanks,
Pádraig.

#1058752#28
Date:
2023-12-15 18:49:59 UTC
From:
To:
You don't, unless you ignore the coreutils/linux installed base
entirely. Essentially the current situation is that -n shouldn't be used
if you expect a certain behavior for this case and you are writing a
script for linux systems. Maybe in 10 years you'll be able to assume
the new behavior. Better to just tell people to not use it at all, and
leave the historic behavior alone until everyone has stopped using -n
entirely.

There may not be, strictly speaking, if you look only at cp without
context, but we have absolutely no idea what the impact is on the
unknown number of existing scripts that depend on the historic behavior.
This is causing breakages in practice.

IMO, it should come down to trying to avoid breaking changes in core
system utilities. There's no compelling reason to force this change, so
why break anything that depended on the historic behavior--especially
without any notice or transition period--regardless of arguments over
whether the historic behavior was right?

I suspect that's because it has not yet been widely deployed, which
makes now the time to fix it.

Michael Stone

#1058752#33
Date:
2023-12-15 19:21:06 UTC
From:
To:
Well, certainly nobody compelled us at gunpoint....

Stlll, Pádraig gave a reasonable summary of why the change was made,
despite its incompatibility with previous behavior. (One thing I'd add
is that the FreeBSD behavior is inherently less race-prone.) It seemed
like a good idea at the time all things considered, and to my mind still
does.

Even if we tell people not to use -n at all, that doesn't mean we should
revert to the coreutils 9.1 behavior.

The cat is to some extent out of the bag. Unless one insists on (FreeBSD
| coreutils 9.2-9.4), or insist on coreutils 7.1-9.1, one should not
rely on cp -n failing or silently succeeding when the destination
already exists. This will remain true regardless of whether coreutils
reverts to its 7.1-9.1 behavior.

#1058752#38
Date:
2023-12-15 20:13:16 UTC
From:
To:
I think you underestimate the value of maintaining compatibity with
deployed versions. In the abstract it may have been a nice cleanup, but
there are a lot of dumb things in the posix utilities that have been
dumb for so long it's not worth the pain of changing them. Since this
change hasn't yet hit mainstream debian, ubuntu, rhel, or suse users, I
strongly suspect that this is a case where the absence of complaints is
simply a sign that most of the people who'd be impacted haven't
experienced the change yet.

It does, IMO, as it would be less likely to break scripts written by
existing coreutils users.

Or you use a distribution that has to patch to maintain compatibility
between versions. Ideally upstream would revert the behavior for now,
deprecate as the long term fix, and all distributions would work the
same. The other option is that each distribution decides whether to be
compatible with upstream coreutils or their own previous release.

#1058752#43
Date:
2023-12-17 08:34:11 UTC
From:
To:
I wasn't referring to the internal implementation. I was referring to cp
users. With the newer Coreutils (FreeBSD) behavior, you can reliably
write a script to do something if cp -n didn't copy the file because the
destination already existed. With the older Coreutils behavior you
cannot do that reliably; there will always be a race condition.

#1058752#48
Date:
2023-12-17 12:18:33 UTC
From:
To:
Paul Eggert wrote on Fri, Dec 15, 2023 at 11:21:06AM -0800:

This. Scripts that want to be portable already can't assume cp -n will
do what they want, so at this point it doesn't really matter what
coreutils does in the grand scheme of things.

For distros like debian since even -testing hasn't seen coreutils 9.2,
there's still value in reverting locally (with a warning that it's not
reliable perhaps?), but in general coreutils 9.2 has been out for 9
months (2023 March 20), so many systems can already be considered
affected; but it's a disservice to users to just try to hide the problem
under the rug.


(To give a data point, this did bite us as well, and I was annoyed
enough that I went to look for the old bug report back in September, but
at that point 9.3 had already been out and I had given up without
reporting anything as nothing would change the fact that my scripts
would need updating. For the gory details I also need compatibility
with busybox cp (where -n silently ignores existing files), so
--update=none is not an option, but I for this particular usage I
settled for '-u' (--update=older, that busybox also support as short
option only...), and I since hurried to forget about it)

#1058752#53
Date:
2023-12-17 14:46:03 UTC
From:
To:
To clarify my summary a little, there I said that -n now _immediately_ fails.
I should have said _silently_ fails.  I.e. the complete copy operation
proceeds as before, and only the exit status is at issue here.

Agreed we should improve the docs a bit for this option.
I'll apply this at least:

diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 1f8b356d1..bf0f424d3 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -9057,6 +9057,8 @@ Do not overwrite an existing file; silently fail instead.
  This option overrides a previous
  @option{-i} option.  This option is mutually exclusive with @option{-b} or
  @option{--backup} option.
+See also the @option{--update=none} option which will
+skip existing files but not fail.

  @item -P
  @itemx --no-dereference
diff --git a/src/cp.c b/src/cp.c
index 04a5cbee3..3ccc4c4e6 100644
--- a/src/cp.c
+++ b/src/cp.c
@@ -192,8 +192,8 @@ Copy SOURCE to DEST, or multiple SOURCE(s) to DIRECTORY.\n\
    -L, --dereference            always follow symbolic links in SOURCE\n\
  "), stdout);
        fputs (_("\
-  -n, --no-clobber             do not overwrite an existing file (overrides a\n\
-                                 -u or previous -i option). See also --update\n\
+  -n, --no-clobber             ensure no existing files overwritten, and fail\n\
+                                 silently instead. See also --update\n\
  "), stdout);
        fputs (_("\
    -P, --no-dereference         never follow symbolic links in SOURCE\n\

Well RHEL 6 came out a bit after (2010), and had the --no-clobber change,
while RHEL 5 before that did not.

Taking about distros, it's worth noting that the change is Fedora 39
which has been released for a month now.
We'll keep a close eye on issues, but haven't heard much as
of yet at least.

Thanks for your thoughts,
appreciated as always.

cheers,
Pádraig

#1058752#58
Date:
2023-12-17 14:49:08 UTC
From:
To:
You can now reliably write a script using the new long option. Changing
the behavior of the short option helped nobody.

#1058752#63
Date:
2024-01-02 14:36:42 UTC
From:
To:
We believe that the bug you reported is fixed in the latest version of
coreutils, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 1058752@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Michael Stone <mstone@debian.org> (supplier of updated coreutils package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)
Format: 1.8
Date: Tue, 02 Jan 2024 08:54:03 -0500
Source: coreutils
Binary: coreutils coreutils-dbgsym
Architecture: source amd64
Version: 9.4-3
Distribution: unstable
Urgency: low
Maintainer: Michael Stone <mstone@debian.org>
Changed-By: Michael Stone <mstone@debian.org>
Description:
 coreutils  - GNU core utilities
Closes: 1058752
Changes:
 coreutils (9.4-3) unstable; urgency=low
 .
   * remove arch restriction from libssl-dev build-depend
   * revert cp -n behavior to debian 12 & prior (Closes: #1058752)
   * add deprecation/compatibility warning for above
Checksums-Sha1:
 21349e63dcc1114be875611b9a9f0984360eb018 1860 coreutils_9.4-3.dsc
 820dad0b53fdbd38941ca21a4d7e953ec932c294 29716 coreutils_9.4-3.debian.tar.xz
 75f9b87d874cf3165ce93898f020bae893e11c46 6996564 coreutils-dbgsym_9.4-3_amd64.deb
 1ecaedf62cb98d59f15b921a2dc3d06c1c206992 8362 coreutils_9.4-3_amd64.buildinfo
 a2943e2d642f274c6002ea23d7e3843d1d5b3ce3 2950596 coreutils_9.4-3_amd64.deb
Checksums-Sha256:
 ca3e625d24ed3983ec88b11d80ae0200e89eb70f719f6da352cfa28a64d8b6fe 1860 coreutils_9.4-3.dsc
 a4b4a1ee99cf8114a438b592cd7ffcbb3eb17ff8b8b7ca717509112481527567 29716 coreutils_9.4-3.debian.tar.xz
 10f1155ec6267f64352b2e616dc7e524da83bfc923c4066200b0d9e57a30f7f8 6996564 coreutils-dbgsym_9.4-3_amd64.deb
 19b02edc87ebc73968716111f5a1eaf89826884d7f346361629b700dff815175 8362 coreutils_9.4-3_amd64.buildinfo
 7d13d49e4b1454a680c60e28b08ee589ec8ec022bf9b8bd122eee19b185fd11b 2950596 coreutils_9.4-3_amd64.deb
Files:
 23c32a4b5b4c509c28525523e2a9a589 1860 utils required coreutils_9.4-3.dsc
 55f996d3650ab9ce331f9eab2e84f08e 29716 utils required coreutils_9.4-3.debian.tar.xz
 5002c2925e520cd83d8f2cb536472b7f 6996564 debug optional coreutils-dbgsym_9.4-3_amd64.deb
 5ceeedaf8ee12bb5c7b0b502338157a4 8362 utils required coreutils_9.4-3_amd64.buildinfo
 7adffc2c2d42da93250befe88991851e 2950596 utils required coreutils_9.4-3_amd64.deb
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCgAdFiEEAtUxX/EfGGGGDh4C9hqs3PoR/94FAmWUGPIACgkQ9hqs3PoR
/96k2w/+JWVIX+rMbzji6yUoKe3Rsxz2A29BkCoF/S27XAL2ngZWtoFkxD8/647k
ay5/rJeVHxBdNHUvOGhB/rD46ITok0oEATx+R8IGMvpN/my7OVQ3ZC7lU9WNSnGB
demG3+w1DZO+nwXAGVDqG+xNhfpB0hJTovxLixd3ConRDduyE7aItClDlgF7njkj
Ndpcyx6B3IEiT/reNe3vQPZXWBYW5Zp5HH/ccBm9EHDoPT6bnSNfyZhmcSn2sBrp
g6gazWnR6FWxrBz0Pl1ZmnbjF53Qflf9lMlX++o4gxf+1Bfelmo0Npv7ylEQ4Vhs
QVBJbZGdV4DVBW7qTSET6QSRczplp61r6pGDL2mQ2K83rzMe6LO1zJfbaMGDGgvE
jXOn3ozkoIA6CjS+o+nIpVN8FgxA2TV7gTulPCqGS/AC5k0N4CHdiyly/7KDL271
KNmzgO6BQ8BmJaVGMceetKEECaapzIE0TU8s4cfMIrJSKst+Sxu4X9qSq4PAI97t
6+Q15MKOpBbiWXg/IbKVfhAXww4WUSRLiWTNDn48SQ8XmvQRub3XzlmYrSuPhCla
xoKNKzLbjQI67vSX9N0AIqpv2jGOALkUGjUeuCXMF8YzWlTmjg2+VA6QKdXtNP3S
SDTWz/zcn80qnoC816w0tS8TgAqVEzO1vyZ4xIvJKzWtAGQ2RnE=
=2Bv7
-----END PGP SIGNATURE-----

#1058752#74
Date:
2024-01-29 07:14:14 UTC
From:
To:
I'm not sure reverting would be best. It would introduce more confusion,
and would make coreutils incompatible with FreeBSD again.

The recent Debian change indicates that their intent is to move to the
FreeBSD behavior too. This would improve cross-platform portability and
I don't think we should discourage that.

Yes that's a problem, and I doubt whether we should mimic Debian.

Yes, it's not a good place to be. Surely current coreutils is better
than what Debian is doing.

True, but that's not a deal-killer. No advice that we give can be 100%
portable to all platforms.

Adding an --update=noclobber sounds like a good thing to do.

Another possibility is to add a warning that is emitted only at the end
of 'cp'. The warning would occur only if the exit code differs because
of this cp -n business. We could stretch things a bit and have a
configure-time option --enable-compat-warnings that builders like Debian
could use if they want such warnings.

#1058752#79
Date:
2024-01-29 14:01:10 UTC
From:
To:
Reverting makes more sense than the current situation. I do not
understand why you seem to value FreeBSD compatibility more than
compatibility with the vast majority of installed coreutils/linux
systems.

You've introduced a silent incompatibility and I'm trying to find some
way to make that clear. If upstream would provide a better solution I
would certainly use it. I have despaired of there being such since your
attitude thus far seems to be entirely dismissive of compatibility
concerns.

You'd only emit a notification of a change in behavior if some
(potentially uncommon/rarely encountered) situation arises which would
actually trigger breakage? So people can't prepare ahead of time and
change their script to handle the necessary change in logic, they can
only maybe figure out why something broke at 2am when the uncommon event
occurred?

At the end of the day, -n is basically a useless option with unknowable
semantics which should be avoided by everyone. In the past it was an
option which wasn't portable between coreutils/linux and freebsd systems,
and I guess you've "fixed" that (by making it an option everyone should
avoid entirely), but let's be honest about how common that concern was.

#1058752#84
Date:
2024-01-29 16:11:05 UTC
From:
To:
That's a bit unfair.  The current upstream -n behavior is with a view
to being _more_ compat across all systems.
Now I agree this may not be worth it in this case,
but it is a laudable goal.

Yes I agree with this point, mostly.
Outputting a diagnostic would help users diagnose what's going on,
and help users to fix forward and avoid their problematic -n usage.
But yes, the crux of the issue with fixing issues as they occur,
is it depends on the state of the destination and so can become an issue
at any time.  Now I previously did an audit with github and debian code search
and noticed very few problematic uses of cp -n, but that does miss
the mountain of private code.

Right, that's why I'm still leaning towards my proposal in the last mail.
   - revert to previous exit success -n behavior
   - document -n as deprecated
   - provide --update=noclobber to give exit failure functionality
     - BTW, it probably makes sense to print a diagnostic for each skipped file here
       as it's exceptional behavior, for which we're exiting with failure for.
   - the existing --update=none provides the exit success functionality

With the above in place for the next coreutils release,
then debian could remove its noisy patch.

thanks,
Pádraig

#1058752#89
Date:
2024-01-29 16:44:57 UTC
From:
To:
You are saying that again without explicitly acknowledging that "being
_more_ compat" in this case means "becoming _incompat_ with the vast
majority of installed systems". IMO it could be reasonably phrased as
"being more compatible across all systems in the long term when all
existing legacy systems are gone", but the key here is that I read
"_more_ compat across all systems" as dismissing the coreutils installed
base as part of "all systems". I understand that may not be/have been
the intent, but I also can't help feeling the way that I do when the
benefits of compatability with freebsd are repeatedly emphasized while
the costs of incompatibility with the coreutils installed base are
dismissed with something along the lines of "we'll see what breaks". (If
the costs of incompatibility are really that low in this case, why would
compatability be a worthwhile goal in this case?)

I do wish that more users had noticed the change earlier and that we're
fairly deep into a mess, but it's not always easy to see the impact of
what seems like a relatively minor patch. I do appreciate that the new
version printed some diagnostics when the change was triggered, as that
certainly helped call attention to scripts which were impacted.

I would certainly align with that, and the sooner the better to decrease
the chances that different distributions handle this in different ways
or we get to the point of having to release in an interim state. If you
commit a final version I'll apply that patch if the next release isn't
imminent.

#1058752#94
Date:
2024-01-29 21:44:16 UTC
From:
To:
Well, I won't insist on doing nothing; however, the proposal needs
ironing out and now's a good time to do it before installing changes.

So --update=noclobber would differ in meaning from the deprecated-in-9.5
--no-clobber, but would agree in meaning with 9.4 --no-clobber? That 
sounds pretty confusing for future users. (And a nit: why should one
spelling have a hyphen but the other doesn't?)

Coreutils 9.4 cp -n already does that, no? So I'm not sure what's being
proposed here.

   $ touch a b
   $ cp -n a b; echo $?
   cp: not replacing 'b'
   1

It seems to me that this proposal conflates two questions:

* What rules should cp use to decide whether to update a destination?

* When cp decides not to update a destination, what should it do? Exit
with nonzero status? Output a diagnostic? Both? Neither?

Aren't these independent axes? If so, shouldn't they have independent
options? For example, since we have --update=older, shouldn't there be a
way to say "I want to copy A to B only if B is older than A, and I want
the exit status to be zero only if A was copied to B"?

#1058752#99
Date:
2024-01-30 11:18:12 UTC
From:
To:
That's a fair point; just avoid "no-clobber" entirely.
We could choose a new name so. How about --update=none-fail

Sorry I got confused between your suggestion of added diagnostics,
and FreeBSD's silent behavior here. Looks like we just keep the
existing behavior of diagnosing 'not replacing' iff exiting with failure.

Well they're not entirely independent.
It's more appropriate to output a diagnostic if exiting failure,
and POSIX also advises along the same lines.
We also have --verbose to control diagnostics somewhat
for the non failure case.

So we now have the proposed change as:

   - revert -n to old silent success behavior
   - document -n as deprecated
   - Leave --update=none as is (will be synonymous with -n)
   - Provide --update=none-fail to diagnose and exit failure

thanks,
Pádraig.

#1058752#104
Date:
2024-01-30 18:31:32 UTC
From:
To:
for confusion.

If I understand things correctly, cp --update=none is not synonymous
with the proposed (i.e., old-behavior) cp -n, because -n overrides
previous -i options but --update=none does not. Also, -n overrides
either previous or following --update=UPDATE options, but --update=none
overrides only previous --update=UPDATE options. (For what it's worth,
FreeBSD -n overrides

Some of this complication seems to be for consistency with how mv
behaves with -f, -i, -n, and --update, and similarly with how rm behaves
with -f, -i, -I, and --interactive. To be honest I don't quite
understand the reason for all this complexity, which suggests it should
be documented somewhere (the manual?) if it isn't already.

This raises more questions:

* If we deprecate cp -n, what about mv -n? FreeBSD mv -n behaves like
Coreutils mv -n: it silently does nothing and exits successfully. So
there's no compatibility argument for changing mv -n's behavior.
However, whatever --update option we add to cp (to output a diagnostic
and exit with failure) should surely also be added to mv, to aid
consistency.

* Should cp --update=none be changed so that it really behaves like the
old cp -n, in that it overrides other options in ways that differ from
how the other --update=UPDATE options behave? I'm leaning toward "no" as
this adds complexity that I don't see the use for.

* If we don't change cp --update=none's overriding behavior, is it still
OK to tell users to substitute --update=none for -n even though the two
options are not exactly equivalent? I'm leaning towards "yes" but would
like other opinions.

#1058752#109
Date:
2024-01-31 14:06:08 UTC
From:
To:
Good point.
Well -n is a protection mechanism really, so should override.
Since --update now incorporates a protection mode, it should also override I think.

To my mind the most protective option takes precedence.
So from least to most that would be -f -n --update=none --update=none-fail -i
So the main consideration there is that -n should not override --update=none-fail

Yes I agree.

Yes I think we should still give that advice to deprecate -n,
if we ensure that -n does not override --update=none=fail.

thanks,
Pádraig

#1058752#114
Date:
2024-02-01 00:36:16 UTC
From:
To:
That's not how POSIX works with mv -i and mv -f. The last flag wins. I
assume this is so that people can have aliases or shell scripts that
make -i the default, but you can override by specifying -f on the
command line. E.g., in mymv:

    #!/bin/sh
    mv -i "$@"

then "mymv -f a b" works as expected.

Wouldn't a similar argument apply to cp's --update options?

Or perhaps we should play it safe, and reject any combination of
--update etc. options that are incompatible. We can always change our 
mind later and say that later options override earlier ones, or do
something else that's less conservative.