- Package:
- devscripts
- Source:
- devscripts
- Description:
- scripts to make the life of a Debian Package maintainer easier
- Submitter:
- Adrian Bunk
- Date:
- 2019-10-02 19:00:03 UTC
- Severity:
- normal
devscripts.conf(5) says: The two configuration files are /etc/devscripts.conf for system-wide defaults and ~/.devscripts for per-user settings. They are written with bash(1) syntax, but should only have comments and simple variable assignments in them; they are both sourced (if present) by many of the devscripts scripts. This is a hack, not a proper handling of the configuration. As seen in #863101, "they are both sourced" implies that one whitespace too many can have unexpected consequences. The "but should only have" statement implies that users cannot rely on being able todo anything other than variable assignment in them. It would be much more robust against user typos if the configuration files would be properly parsed instead of blindly executed.
fwiw, the maintainers have been in violent agreement with this for years, but it needs someone with tuits to replace the current situation with a solution that's usable from all the languages included in devscripts and maintains the existing semantics (it's not just simple key=value pairs, users are actively relying on being able to use shell within the file in some cases, and that at least needs considering). Regards, Adam
I see. The "yes" command is especially bad but we can think of many other values causing problem: "true" "false" "exit" "kill" if space is following "=". key= value By rejecting such line in ~/.devscripts should prevent such annoying problem. If you really mean to write such line, we can write: ENV_VAR="" command So this kind of error prone line containing set up can be excluded by a simple grep test: egrep -e '[^\s#]*\w+=\s+[^#]" Also "bash -n" test may be also interesting to catch syntax error. Such restriction are reasonable. Osamu
control: tags -1 patch OK, here is my try to do the following. Ugly, yes. But it seems to catch most errors in friendly way. * "egrep" test to catch space after "=" * "set -e" to detect command not found; * space before "=" * "false" in line * "bash -n" test * Report the file name if a bug is found as much. If no objection, I will apply this patch to git. Osamu
The concept looks like a step in the right direction. However, uscan.pl isn't the only script that does this type of config parsing. Also, I would wait until after Stretch is released. Cheers,
Hi,
To be honest, I am not a fun of sourcing config file as BASH script.
Anyway here is a list of packages which needs attention:
$ fgrep ".devscripts'" *.pl
bts.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
dcontrol.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debchange.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debcheckout.pl:my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debcommit.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debc.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debdiff.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debi.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debsnap.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
debuild.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
dget.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
dpkg-depcheck.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
dscverify.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
grep-excuses.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
mass-bug.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
mk-build-deps.pl:my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
plotchangelog.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
rmadison.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
uscan.pl: my @config_files = ('/etc/devscripts.conf', '~/.devscripts');
$ fgrep .devscripts *.sh
debclean.sh: for file in /etc/devscripts.conf ~/.devscripts
debrelease.sh: for file in /etc/devscripts.conf ~/.devscripts
debrsign.sh: for file in /etc/devscripts.conf ~/.devscripts; do
debsign.sh: for file in /etc/devscripts.conf ~/.devscripts
nmudiff.sh: for file in /etc/devscripts.conf ~/.devscripts
pts-subscribe.sh: for file in /etc/devscripts.conf ~/.devscripts
uupdate.sh: for file in /etc/devscripts.conf ~/.devscripts
who-uploads.sh: for file in /etc/devscripts.conf ~/.devscripts
wnpp-alert.sh:OLDCACHEDIR=~/.devscripts_cache
28 scripts.
Questions are:
* Should we allow such bash config?
* Should we consolidate and factorize such scripts to be shared among
all scripts to reduce maintenance burden?
Osamu
Neither am I, but it's been that way for ~15 years now... I'd also prefer not to have /etc/devscripts manipulated in postinst to add new config variables and for it to be a conffile (or not exist at all by default). I think it's a bit too late to change that, unfortunately. Yes. That's been something which has been discussed many times over the years, but no one has done it. Cheers,
Hi. The attached changes fix this bug. They are intrusive and probably introduce new bugs, but they should ease the maintenance in the long run. Regards.
Please ignore previous attachment, and consider this one instead. It brings various improvements, including tests. It also splits the unrelated changes in order to ease review. Thanks.
Hi. Please also consider these two patches. Thanks.
Hello. Since last message, most sources have been reformatted. uscan.pl has been heavily modified, introducing a new Devscripts::Config library similar to Devscripts::Confvar. The attachment rebases all previous patches in this changelog, except the one affecting uscan.pl. Patches 1 to 15 carry various non intrusive suggestions improving details, but not specific to Confvar. Patch 16 lets all tools use Devscripts::Confvar, except uscan.pl. let-uscan-use-confvar.patch contains the unrefreshed changes affecting uscan.pl/2.18.1. In february, the library was passing its automated testsuite, and each tool was reporting correctly with --help. However, the recent manual refresh may have introduced new errors. I suggest the following steps. * choose between Devscripts::Confvar and Devscripts::Config * cherry-pick good ideas from the unselected implementation * redo all testing
Hi, About those, thank you! I've applied most of them here my comments about some: * 0006-dcontrol.pl-fix-noconf-handling.patch I changed the commit message since, despite being correctly handlded, it just wasn't accepted as a parameter. Also, I addded it to the pod (for the manpage). * you removed some $opt_* variables; there is habit of calling all variables coming from getopt as $opt_* - but I found what you did cleared, so I took them. * 0011-Set-e-option-for-some-shell-scripts.patch I'll wait till have some more time before merging this one * 0012-Improve-reporting-of-noconf-misusage.patch Also. I want to check how the documentation around those options is. * 0016-Move-all-parsing-of-configuration-files-to-a-separat.patch this is the huge one, so…
[CC-ing Xavier in case he has missed this conversation] (and let-uscan-use-confvar.patch) Devscripts::Confvar answers the needs of all 25 existing Perl or Shell scripts. For each one, I have checked that settings in ~/.devscripts.conf are reflected by --help. The devscripts test suite is extended to check each function of the module with variables containing various escaped characters. Independently, Xavier has created Devscripts::Config. This module is part of refactoring uscan.pl, and is only tested with it. It has been merged and released. Contributors updating or creating scripts will rightfully use it, as it is designed with generality in mind. This will increase the divergence and make the merge of features and ideas harder and harder. I would quite like an explicit list of requirements for Devscripts::Confvar to be considered too. Rebasing it again and again is boring, error-prone, and a complete waste of time as long as no one reviews it. I have written this patch exactly by comparing for each script the output of --help with the actual option handling. I have then run each one with --usage, and checked the no-conf part. What else can make you feel confident about the change? I have looked for deliberately ignored errors in the whole source of each script. Of course, more reviewers would be welcome. How many? What else can make you feel confident about the change?
When writting Confvar, I have been careful to keep the existing behaviour. This seemed good separation of concerns. Confvar only searches in /etc/devscripts.conf and $HOME/.devscripts. $HOME is escaped by single quotes when invoking the subshell. Config introduces a new --conf[-]file option. It takes its own responsibilities. Confvar returns the correct value when a configuration file contains var=bla\ bla\'bla\"bla\$bla\\bla\\nbla (extract from test/test_confvar_*, which tests both Perl and Shell) I do not understand your point. Could you give an example of unsafety not yet covered by Confvar tests? I fail to understand what you mean by reading your commits. Everyone agrees that the old configuration must go away. My point was that your sentence suggests - that several scripts use Config. Actually, only one does. - that migrating the 24 other scripts is an obvious need. Another option is to migrate the 25 scripts to Confvar. The work is already done and partially tested. Config only targets perl, not shell. It does not anticipate the needs of * debuild.pl (DEBUILD_*_HOOKS, DEBUILD_*_OPTS, DEBUILD_SET_ENVVAR) * rmadison.pl (RMADISON_URL_MAP_*) I am confident that Confvar does because I have extended its interface while migrating and testing a script after the other. Could you detail some of the needed differences? Even if Config brings valuable updates, like parsing of related command line arguments, I think incremental improvements to Confvar would have been, and would still be, better than wholesale replacement. It seemed right to answer a bug requesting help in the bug log, but let us try a merge request instead. https://salsa.debian.org/debian/devscripts/merge_requests/162
My commits were about establishing a protocol between the generated
config parser bash script and the Perl code calling that bash script
such that the Perl code does not have to have knowledge of how shell
quoting works, only of how to establish the protocol from shell.
It does that by avoiding layering violations between shell and Perl;
passing filenames as arguments and returning variables NUL separated.
I count three scripts that use Config (salsa, uscan, mk-origtargz).
A few thoughts:
Don't use an absolute path to bash.
Unsetting the environment represents a change in the interface between
devscripts and people's config files. An example of what could break:
if someone sets HOME in their running zsh to a subdirectory, and in
that directory has a devscripts config that uses ~/ then unsetting HOME
could mean that the devscripts config would use their main home
directory instead of the subdirectory they wanted to use.
Checking the syntax of the config files represents a change in the
interface between devscripts and people's config files; config files
that contain valid variables at the start and syntax errors at the end
would stop setting the valid variables from the start.
Telling bash to exit on errors by passing the -e option to bash
represents a change in the interface between devscripts and people's
config files. It could cause later parts of the config files to
silently stop working if they have a command near the start that does
not succeed.
In the perl code use q{}/qq{} instead of ''/"" for readability; this is
so that you can use quoting characters in the shell code without
escaping them with backslash characters, which reduce the readability.
Use the spawn function from Dpkg::IPC instead of backticks because they
introduce an extra unnecessary shell process while spawn does not use
the shell so there can be no quoting issues no matter what is passed.
Pass the config filenames as arguments to the generated bash script and
load the files from those arguments instead of from quoted strings.
This avoids any possible mismatch between bash/Perl quoting code.
Use NUL characters (\0) to split the output up, this allows all
characters to be represented instead of relying on Perl code to parse
the output of the bash set command.
One enhancement I should have added to the Config module but didn't
think of until now is to ensure that the stderr isn't parsed and that
any output the devscripts config generates also isn't parsed by the
Perl code. Probably the way to do this is to print a few NUL characters
in a row to indicate the start of the exported variables and discard
any output before those NUL characters. Of course the devscripts config
could print the NULs too and thus break the parsing, but outputting
NULs from one's devscripts config seems unlikely to exist today.
This seems like a few separate issues all on one branch, usually they
would be separate merge requests. I'd also squash the new code plus
fixes to and perltidy of the new code into one commit.
Sorry. A year ago, grepping scripts was enough, but uscan is not
alone in lib/ anymore.
3 is still less than 25.
Note that Config.pm uses '/bin/bash'.
Out of curiousity: I was once taught that not relying on $PATH,
aliases and so on was safer. Do you have a reference (the question
seems to vague for a web search)?
Confvar only unsets the variables matching the list (or regex)
expected by the devscript tool.
devscripts.conf(5) requires this behaviour.
Confvar does not modify HOME, or any unrelated setting.
I agree that these are changes in behaviour introduced by Confvar.
They have been approved by a devscripts maintainer in messages 10 to
27 in the same bug log.
qq{} allows ' instead of \' inside literal strings.
q{} allows " instead of \" inside interpolated strings.
None happens in Confvar for now.
This would only raise the barrier for people less fluent in perl like
me.
Done. Thanks.
Done. Thanks.
I could argue that, in order to see that your implementation is safer,
one has to be fluent with perl constructs like
map {qq{"\$$_"}} @key_names
and understand why " \0 \n are passed correctly via
printf '%s\0' "$a"
After many attempts at comparable solutions, I have encountered
scripts accepting a non-static range of variable identifiers defined
by a pattern, or even worst by the content of another user variable. I
have found no way to do this without parsing the output of the 'set'
built-in. The escaping used by 'set' is described in the comments at
the end of Confvar.pm.
For now, only the a=$'...' syntax is ignored (and bash array
variables, but this does not matter here). Bash uses it when the
variable contains control characters like \r \n. It would be possible
to also decode such values (by calling printf or similar), but
investing work in this direction does not seem in line with the
orientation selected by the maintainers in first messages of this bug
log.
As far as I understand, Dpkg::IPC::spawn separates stdout from stderr.
In the messages quoted above, the maintainers are in favor of
requiring only simple affectations in configuration files, and even
removing shell support in the future if ever possible.
It would be way safer to report any unexpected output, and only go on
proceeding with an explicit override (for example if
lib/Devscripts/Output.pm::die_on_error is false).
Done. Thanks.
I hadn't noticed, thanks for pointing that out.
Relying on the current location of the binary isn't a great idea since
it could change or could be different on different operating systems.
For example with the /usr merge stuff it will be in /usr/bin/bash and
/bin/bash will only be a compatibility symlink. On Solaris or BSD it
could be somewhere in /opt/gnu or similar.
Hmm, I could have sworn environment was inherited before, perhaps I
have broken it with my changes.
I see thanks.
Personally I think these increase readability even when the escapes
aren't present because it is easy to see where the string ends instead
of having to parse it in your head.
Another handy one is using s{foo}{bar}g instead of s/foo/bar/g
I consider those to be fairly simple constructs, but a clarifying
comment could help people not very familiar with Perl or shell.
The map call simply applies the first argument as a function to each
element of the array, with $_ being the argument to the function.
The printf builtin using %s as the first argument is the only safe way
to output a string without any mangling by the shell. The printf
builtin applies the first argument to each of the other arguments and
prints them to stdout in turn. The NUL character (\0) is the best
option for separating each variable to avoid doing escaping/etc.
It looks like `compgen -A variable` or `compgen -v` can output the
names of all bash variables, if you then loop through all the variables
printing both the name and value (NUL separated) then you can just pass
out all variable names and values and then match them on the Perl side.
This could even mean that the bash script can be converted to a static
script file instead of a generated script embedded in Perl. Personally
I quite dislike code where one language is embedded in another.
Good to hear.
Woops, I missed the very obvious solution of using a separate file
descriptor to pass out the variables instead of using stdout.
Personally I use the shell support to avoid hard-coding things in
~/.devscripts that are primarily defined elsewhere or vary depending on
aspects of my shell environment. I think other folks do the same.
I think it would be better to just define a safe protocol than try and
handle exceptions. Sure if you want to force people to make their
configs not print anything, that is fine but that is orthogonal to safe
communication between the config script and devscripts, the best way to
do that is to make what the config script does output irrelevant by
ignoring it. Using a separate file descriptor would be the best option
here, unfortunately Dpkg::IPC doesn't seem to provide that but there
are other IPC options that would be able to support this.
I was wrong. devscripts.conf does not require to ignore values from the environment, but currently, scripts do. I prefer to keep `set` because bashisms would break debsign.sh. For the record, Confvar.pm was already doing this. However, while implementing your idea below, I have found that only outputting the modified settings is more simple. The full list of shell variables is only parsed when satisfying debuild and rmadison special requirements. That is obviously a better design, thanks for the idea! I have updated the pull request.