#415567 findutils: Regexps not handled correctly in PRUNEPATHS

Package:
findutils
Source:
findutils
Description:
utilities for finding files--find, xargs
Submitter:
István Váradi
Date:
2024-05-26 12:45:05 UTC
Severity:
normal
Tags:
#415567#5
Date:
2007-03-20 12:07:30 UTC
From:
To:
The updatedb script uses the PRUNEPATHS environment variable without
quoting it. This causes problems when the value contains regular
expressions. For example, an asterisk (*) in a path is expanded by the
shell before the value is used as a regular expression. Thus,
.*/lost+found becomes ../lost+found (and possibly other paths, depending
on how many files or directories in the working directory have names
starting with a dot).
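
The expansion described above can be reproduced outside updatedb. The
following is only a minimal sketch: the scratch directory and the .cache
directory name are made up for the demonstration and are not part of the
report.

```shell
#!/bin/sh
# Demonstration only: scratch directory and file names are hypothetical.
demo_dir=$(mktemp -d) || exit 1
mkdir -p "$demo_dir/work/.cache/lost+found"
cd "$demo_dir/work" || exit 1

PRUNEPATHS='.*/lost+found'

# Unquoted: the shell performs pathname expansion on the value first.
unquoted=$(echo $PRUNEPATHS)

# Quoted: the value reaches echo unchanged.
quoted=$(echo "$PRUNEPATHS")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Here the unquoted value turns into the name of the matching dot-directory
entry, while the quoted value keeps the regular expression intact.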

#415567#10
Date:
2007-03-21 18:21:48 UTC
From:
To:
URL:
  <http://savannah.gnu.org/bugs/?19374>

                 Summary: Insufficient quoting of PRUNEPATHS in updatedb
                 Project: findutils
            Submitted by: ametzler
            Submitted on: Wednesday 21.03.2007 at 19:21
                Category: updatedb
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: István Váradi
        Originator Email:
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 4.2.28
           Fixed Release: None

#415567#15
Date:
2009-01-27 12:21:58 UTC
From:
To:
Follow-up Comment #1, bug #19374 (project findutils):

I second this ticket. Trying to prune directories based on regular
expressions doesn't work. For example:

updatedb --findoptions='-mount' --localpaths='/cygdrive/c' \
  --prunepaths='.*/.svn'

The asterisk in '.*/.svn' is shell-expanded on the line where the pruned
paths are converted to regular expressions. If $PRUNEPATHS on that line is
placed in double quotes, it works.
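
The conversion mentioned above can be sketched as a small function. This
is an illustration, not the script's actual code: the function name
paths_to_regex is hypothetical, and the sed expression is modeled on the
one in updatedb with the backslash escaping adapted for use inside $(...)
rather than backticks.

```shell
#!/bin/sh
# Hypothetical helper: turn a space-separated list of paths into one
# anchored alternation that find's -regex can use.
paths_to_regex() {
    # Quoting "$1" keeps the shell from glob-expanding entries like '.*/.svn'.
    printf '%s\n' "$1" |
        sed -e 's,^,\\(^,' -e 's, ,$\\)\\|\\(^,g' -e 's,$,$\\),'
}

paths_to_regex '/tmp .*/.svn'
# prints: \(^/tmp$\)\|\(^.*/.svn$\)
```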

#415567#18
Date:
2009-02-09 09:22:18 UTC
From:
To:
Follow-up Comment #2, bug #19374 (project findutils):

Here is a patch:
--- updatedb.org 2009-01-27 13:29:28.575086300 +0100
+++ updatedb 2009-02-09 10:19:41.132505500 +0100
@@ -163,7 +163,7 @@
 # Trailing slashes result in regex items that are never matched, which
 # is not what the user will expect. Therefore we now reject such
 # constructs.
-for p in $PRUNEPATHS; do
+for p in "$PRUNEPATHS"; do
     case "$p" in
         /*/) echo "$0: $p: pruned paths should not contain trailing slashes" >&2
              exit 1
@@ -172,7 +172,7 @@
 # The same, in the form of a regex that find can use.
 test -z "$PRUNEREGEX" &&
-  PRUNEREGEX=`echo $PRUNEPATHS|sed -e 's,^,\(^,' -e 's, ,$\)\|\(^,g' -e 's,$,$\),'`
+  PRUNEREGEX=`echo "$PRUNEPATHS"|sed -e 's,^,\(^,' -e 's, ,$\)\|\(^,g' -e 's,$,$\),'`
 # The database file to build.
 : ${LOCATE_DB=/var/locatedb}

#415567#23
Date:
2009-02-21 22:42:52 UTC
From:
To:
This looks like a useful patch, would you please mail it to
bug-findutils@gnu.org and findutils-patches@gnu.org as a git patch
against the current source tree (see
https://savannah.gnu.org/git/?group=findutils) and with updates to the
ChangeLog and NEWS files?

Thanks,
James.

#415567#29
Date:
2010-04-11 11:56:39 UTC
From:
To:
Update of bug #19374 (project findutils):

                  Status:                    None => Postponed

#415567#32
Date:
2010-07-14 20:11:51 UTC
From:
To:
Follow-up Comment #4, bug #19374 (project findutils):

The patches don't work for me when using the command-line option
--prunepaths, as the asterisks are expanded in the "for arg" loop over the
implicit $@.  As a workaround, wrapping my pattern in ( and ) works, e.g.
updatedb --prunepaths='(.*/.svn)'

The patches do work when setting the PRUNEPATHS environment variable.
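
A minimal sketch of splitting an option of the form --name=value with the
argument quoted throughout follows. The function name split_option is
hypothetical, and the expression used for val is an assumption modeled on
the opt line of the script; only the quoting of "$arg" is the point here.

```shell
#!/bin/sh
# Hypothetical helper: split "--name=value" into opt and val without
# letting the shell glob-expand the argument.
split_option() {
    arg=$1
    opt=$(printf '%s\n' "$arg" | sed 's/^\([^=]*\).*/\1/')
    val=$(printf '%s\n' "$arg" | sed 's/^[^=]*=\(.*\)/\1/')
}

split_option "--prunepaths=.*/.svn"
echo "opt=$opt"
echo "val=$val"
```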

#415567#35
Date:
2023-07-26 12:01:39 UTC
From:
To:
Follow-up Comment #5, bug #19374 (project findutils):

Indeed, the patches don't work because they miss a critical unquoted variable,
"arg".

The following patch should work:

#415567#38
Date:
2023-08-01 18:25:30 UTC
From:
To:
Follow-up Comment #6, bug #19374 (project findutils):

This seems correct, but should also be applied to the line before for
consistency:

-  opt=`echo $arg|sed 's/^\([^=]*\).*/\1/'`  || exit 71
+  opt=`echo "$arg"|sed 's/^\([^=]*\).*/\1/'`  || exit 71

Quoting $PRUNEPATHS in the for-loop, however, defeats the purpose of the
loop: it is meant to check whether any of the paths in PRUNEPATHS ends in
'/', while the patch would reduce that check to the last element only
(because quoting turns the list into a single string).

I'd suggest changing this to:

-              exit 1
-    esac
-done
+nl='
+'
+if echo "$PRUNEPATHS" | tr ' ' "$nl" | grep '[^/]/$' >/dev/null; then
+  echo "$0: $p: pruned paths should not contain trailing slashes" >&2
+  exit 1
+fi
,$\\\)\\\|\\\(^,g' -e 's,$,$\\\),'`
,$\\\)\\\|\\\(^,g' -e 's,$,$\\\),'`
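
The trailing-slash check suggested above can be tried in isolation. This
is a sketch under assumptions: the function name trailing_slash_entries is
hypothetical, and since $p is no longer set once the loop is gone, the
message here reports the entries that grep found instead.

```shell
#!/bin/sh
nl='
'
# Hypothetical helper: print every entry of a space-separated list that
# ends in a slash (a bare "/" is not matched); the exit status is
# non-zero if no entry matches.
trailing_slash_entries() {
    printf '%s\n' "$1" | tr ' ' "$nl" | grep '[^/]/$'
}

PRUNEPATHS='/tmp /usr/tmp/ .*/.svn'
if bad=$(trailing_slash_entries "$PRUNEPATHS"); then
    echo "$0: $bad: pruned paths should not contain trailing slashes" >&2
fi
```

Unlike the quoted for-loop, this inspects every entry of the list, and
the quoted "$1" keeps regular expressions from being glob-expanded.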

I see several problems in this area.

updatedb allows PRUNEREGEX to be defined as an environment variable from
outside, and only uses the value of --prunepaths if the former is unset.
Usually, options should override environment variables, not the other way
round.

Furthermore, PRUNEPATHS seems to be defined to allow already-expanded items
only.
If one wants to use regular expressions, then why not define PRUNEREGEX
directly from outside?

In that regard, it would maybe be better to introduce a --pruneregex option
which takes the final pruning expression for find(1).  It could still
override any PRUNEPATHS or --prunepaths value, but shouldn't do so silently.
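
One way the proposed behaviour could look is sketched below. Everything
here is hypothetical: --pruneregex does not exist in updatedb, and the
function name choose_prune_regex, its arguments, and the warning text are
made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the proposed precedence.
# $1: value of a --pruneregex option (may be empty)
# $2: space-separated prune paths from PRUNEPATHS or --prunepaths (may be empty)
choose_prune_regex() {
    if [ -n "$1" ]; then
        # An explicit regex wins, but the override is reported.
        if [ -n "$2" ]; then
            echo "warning: --pruneregex overrides prune paths: $2" >&2
        fi
        printf '%s\n' "$1"
    else
        # Otherwise derive the regex from the path list.
        printf '%s\n' "$2" |
            sed -e 's,^,\\(^,' -e 's, ,$\\)\\|\\(^,g' -e 's,$,$\\),'
    fi
}

PRUNEREGEX=$(choose_prune_regex '.*/\.svn' '/tmp /proc')
echo "$PRUNEREGEX"
```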

#415567#41
Date:
2024-05-26 12:41:45 UTC
From:
To:
Update of bug #19374 (group findutils):

                  Status:               Postponed => None