#601006 /usr/bin/monkeysphere fails miserably when it shouldn't due to irksome set -e

#601006#5
Date:
2010-10-22 12:47:33 UTC
From:
To:
/usr/bin/monkeysphere conains a "set -e" invocation rather
than relying on proper error checking.

This means that arithmatic expressions which return 0 will be
treated as failures.  In order to work around this problem,
you can replace things like

(( nline=0 ))

with

(( nline=0,1 ))

or

(( nline=0 )) || true

and things like

(( nline++ ))

with

(( ++nline ))


Personally, though, I would get rid of "set -e" since it is a
blight on humanity.

#601006#10
Date:
2010-10-22 13:44:22 UTC
From:
To:
Hi, Clint.  Using "set -e" was a deliberate choice.  The reasoning being
that we don't want to just pass over any errors that could potentially
be bad news.  I personally think this is reasonable, and is somewhat in
line with how other programing languages work (e.g. if memory can't be
allocated, or a function call is incorrect, the program terminates).
This forces us to be vigil about catching the return of calls that we
expect to be non-zero.  This seems like a more reasonable task than
catching the return of *all* calls, and checking that they are in fact
zero.

So is this just a theoretical concern, or did you actually run into a
bug caused by the use of "set -e"?

jamie.

#601006#15
Date:
2010-10-23 05:06:10 UTC
From:
To:
I think that that is a false analogy.  I would liken it more to setting
your bicycle to fall apart if ever it is ridden at a speed of exactly
5 miles per hour.

Yes, monkeysphere update-authorized_keys failed because "set -e" considers
((nline=0)) to be an error.  Thus I am running a monkeysphere with the
"set -e" removed.

#601006#20
Date:
2010-10-24 00:03:05 UTC
From:
To:
I think that's an even falser analogy.  Because we're using bash, we're
calling out to a lot of secondary processes.  You really think the
program should just blithely continue running if one of those
subprocesses fails?  I certainly don't.  I think that's probably a
really bad idea, actually, especially when it means potentially locking
someone out of a machine, or opening it up to attack.  To follow your
analogy, that's more like saying that you should just keep riding your
bike even after your tire has just blown out.

The assumption when using "set -e" is that if a call fails, it means
there's a problem.  If we *expect* a call to return non-zero, or to
occasionally return non-zero, then we should be capturing it's return
code and acting accordingly.  If we're not, then that's a bug.

Your proposal to not use "set -e" would make everything much harder.
Not using "set -e" means we either just ignore all return codes, which I
already said I think is a bad idea, or we have to capture the return
code of *every* call, and check to make sure that it's returned what we
expect (which is usually just 0).  That's a much more onerous task, and
one much more prone to failure.  As it is now, we just have to capture
the return of the calls that we *expect* to occasionally return
non-zero.

So the bug here is really that the use of "((nline=0))" is problematic,
not with the overall use of "set -e".  If "((nline=0))" is expected to
return non-zero, then we should be properly capturing it's return code,
or using a different function call.

I think the nline issue has actually been dealt with upstream.

#601006#25
Date:
2010-10-24 22:08:08 UTC
From:
To:
No, and that is why in every other programming language people do
proper error-checking rather than relying on a hack that presumes
the meaning of every single command's exit status.

If this were written in Perl, how would you achieve the same effect?

#601006#30
Date:
2010-10-25 15:09:55 UTC
From:
To:
tags 601006 wontfix
thanks

So the fact of the matter is that turning off "set -e" is a very large
change, with what I see as at most dubious benefit.  The use or not of
"set -e" may provide some benefit, or it may just be a personal
preference, but in any event it guides some pretty fundamental
assumptions in the structure of the code.  Since all of monkeysphere
scripts have been coded with the "set -e" assumptions, removing it would
necessarily entail a lot of work.

We should obviously keep discussing this, and hopefully others will step
up with some input, but until I'm convinced that there is some tangible
benefit to removing "set -e", security or otherwise, I'm not going to
act on it myself.  If someone wants to submit a patch that makes such a
transition, and the transition is demonstrably better, then we'll
definitely consider it.

In the mean time, issues that involve not properly capturing return
codes should be filed as bugs against those particular calls, or against
the effects thereof.

jamie.

#601006#37
Date:
2010-10-25 17:39:08 UTC
From:
To:
From what I can tell, nobody was suggesting that.

Exactly what is being suggested.

Nobody is suggesting that all return codes are ignored, so can we stop
suggesting that as a possibility?

Capturing the return code of every call and making sure it is what is
expected isn't particularly onerous, its actually good practice. The
fact that its not done, and instead "set -e" is used instead, is the
reason why it seems onerous now.

Its just good programming practice to look at the return code generated
and decide what that means. In most cases a simple test to see if the
return code is not zero will suffice.

let a=0; echo $?

The Bash FAQ[0] details one reason why 'set -e' fails in subtle and
insidious ways, but an easy one is this:

#!/bin/bash

set -e

let a=0 ; echo $?

<EOF>

You might expect the following to trigger an error, but it doesn't:

#!/bin/bash

set -e

foo () { false; echo; }

foo | echo

<EOF>

As the FAQ says, the goal of adding automatic error detection to the
shell is non-trivial, and prone to unexpected results, that is because
many commands are supposed to return non-zero. For example:

  if [ -d /foo ]; then
    ...
  else
    ...
  fi

Clearly we don't want to abort when [ -d /foo ] returns non-zero
(because the directory does not exist) -- this script wants to handle
that in the else part. So the implementors decided to make a bunch of
special rules, like "commands that are part of an if test are immune",
or "commands in a pipeline, other than the last one, are immune".

These rules are extremely convoluted, and they still fail to catch even
some remarkably simple cases. Even worse, the rules change from one Bash
version to another, as Bash attempts to track the extremely slippery
POSIX definition of this "feature". When a SubShell is involved, it gets
worse still -- the behavior changes depending on whether Bash is invoked
in POSIX mode. There is an entire wiki that was created JUST for the
purpose of trying to cover this in detail[1], the caveats are
mind-numbingly onerous.

Its far simplier, and far safer, to use your own error checking rather
than try to learn all of these rules.

micah

0. http://mywiki.wooledge.org/BashFAQ/105
1. http://fvue.nl/wiki/Bash:_Error_handling

#601006#42
Date:
2010-10-25 18:14:00 UTC
From:
To:
i believe the current monkeysphere code actually does this in most
circumstances.  set -e is currently there to catch things like failures
within pipelines (we're also using set -o pipefail), and there are
several places we use traps alongside set -e as a shell-based
(POSIX-compliant, i believe) exception-handling mechanism.

My point here is to say that we are not currently blithely ignoring
errors; I believe we're checking them appropriately where mitigating
techniques are possible.  That said, patches are welcome.

#601006#47
Date:
2010-10-25 20:05:12 UTC
From:
To:
Let's say I'm writing a bash script.  It will make 5 calls.  Three of
the calls are expected to return zero on success, one is expected to
return 1 on success, and one whose return code I don't care about.
Here's this script written with set -e:

#!/bin/bash
set -e
call0
call1
call2
call3 || ret=$?
if ((ret != 1)); then
   exit 1
fi
call4 || true

Here's the same script written without the use of set -e, following the
suggestion that all return codes are checked individually

#!/bin/bash
call0 || exit 1
call1 || exit 1
call2 || exit 1
call3
if (($? != 1)); then
   exit 1
fi
call4

How exactly is the second script an improvement over the first?  I'm
seriously asking because I really don't understand.  As far as I can
tell, it is generally agreed upon that calls return zero on success.
Using set -e is equivalent to explicitly making this single simplifying,
and reasonable, assumption.  Clearly there are a couple of circumstances
where this is not true, but it is certainly true in the majority of
cases.  If it's true 90% of the time, then using set -e means we don't
have to explicitly check those 90% of the calls.  If we don't use set -e
then we have to explicitly check those 90% of calls.  AND we still have
to check the return of the remaining 10% whose return we do care about.
As far as I can tell, not using set -e is equivalent to writing a lot
more code for no particular reason.

I'm honestly curious to hear what you guys see as wrong with this
reasoning.

#601006#52
Date:
2010-10-25 21:11:19 UTC
From:
To:
imagine that line 2 of the script is this:

let i=$i+1

set -e FAILS and because this script was designed to protect your life,
you perish.  The end.

Now imagine that your script says this instead:

call0 || die

oh no!

Now imagine that your script says this instead:

call0 || die "Could not stuff!, please edit stuff.conf"
call1 || die "Could not cd to /stuff; check the stuff.conf configuration"

etc.

When you handle errors *yourself* instead of just falling over dead, you
get to be helpful.

#601006#57
Date:
2010-10-25 21:22:22 UTC
From:
To:
Now imagine its three years from now. The POSIX definition of this
feature changes, yet again, as it has been doing often. The next version
of Bash tries its best to track this change, but results in your 'set
-e' doing something different than you expect. Now we are sad because
this change causes our heads to get lopped off, which wouldn't have
happened because we explicitly handled our own error checking.

#601006#67
Date:
2010-11-05 19:25:54 UTC
From:
To:
this has to do with maintainer scripts (eg. preinst, postinst, prerm,
postrm).

micah

#601006#74
Date:
2018-06-13 22:49:13 UTC
From:
To:
Bug #601006 is quite the blast from the past.  I'm pretty sure `set -e`
is strongly endorsed at this point due to security concerns.

While an interesting historic relic, is it worth keeping #601006 around
to point out how attitudes have changed?

#601006#79
Date:
2018-06-20 00:28:24 UTC
From:
To:
I guess I should register for the record that my attitude has not
changed.

#601006#84
Date:
2024-11-28 18:21:16 UTC
From:
To:
Dear submitter,

as the package monkeysphere has just been removed from the Debian archive
unstable we hereby close the associated bug reports.  We are sorry
that we couldn't deal with your issue properly.

For details on the removal, please see https://bugs.debian.org/1085868

The version of this package that was in Debian prior to this removal
can still be found using https://snapshot.debian.org/.

Please note that the changes have been done on the master archive and
will not propagate to any mirrors until the next dinstall run at the
earliest.

This message was generated automatically; if you believe that there is
a problem with it please contact the archive administrators by mailing
ftpmaster@ftp-master.debian.org.

Debian distribution maintenance software
pp.
Thorsten Alteholz (the ftpmaster behind the curtain)