#586045 File::Find no_chdir misbehaves when $dir is_utf8

Package:
perl
Source:
perl
Description:
Larry Wall's Practical Extraction and Report Language
Submitter:
Joey Hess
Date:
2020-11-15 00:21:08 UTC
Severity:
normal
Tags:
#586045#5
Date:
2010-06-15 22:30:15 UTC
From:
To:
File::Find has a problem in no_chdir mode when the directory name
passed to it has the utf8 flag set and the directory contains a file
with a utf-8 filename.

Test program:

joey@gnu:~>cat test
use utf8;
use Encode;
use File::Find;
my $dir=shift;
if (shift) {
	$dir=decode_utf8($dir);
}
print Encode::is_utf8($dir)."\n";
find({
	wanted => sub {
		my $f=decode_utf8($_);
		$f=~s/ü/mango/g;
		print "$f\n";
	},
	no_chdir => 1},
$dir)

Here it works as expected; the wanted function is able to decode_utf8($_)
and operate on individual unicode characters:

joey@gnu:~>find fooü
fooü
fooü/ü
joey@gnu:~>perl test fooü

foomango
foomango/mango

But what if the directory passed to File::Find has the utf8 flag set?

joey@gnu:~>perl test fooü 1
1
foomango
foomango/ü

What's going on is that File::Find concatenates $dir, which has the
utf8 flag set, with a filename that has not been decoded from utf8.
The resulting string has the utf8 flag set, so when the wanted function
runs decode_utf8 on it, nothing is done, and it remains partially utf8
encoded, and partially not.
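
The concatenation effect can be reproduced without File::Find. The
following is a minimal sketch, not File::Find's actual code: $name
stands in for a raw filename as returned by the filesystem, and the
variable names are made up for illustration. It only inspects string
lengths, since what decode_utf8 does with an already-flagged string
may depend on the Encode version installed.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8 is_utf8);

# Character string with the utf8 flag set, as if the caller had
# already decoded the directory name "fooü".
my $dir = decode_utf8("foo\xc3\xbc");

# Raw bytes: the UTF-8 encoding of "ü", not decoded (assumed to be
# what a directory listing hands back).
my $name = "\xc3\xbc";

# Joining a flagged string with a byte string upgrades the bytes as
# Latin-1, so the two bytes of "ü" become two separate characters.
my $path = $dir . "/" . $name;
printf "mixed: is_utf8=%d length=%d\n", is_utf8($path) ? 1 : 0, length $path;

# Possible workaround: keep everything as bytes until the full path
# is built, then decode once.
my $fixed = decode_utf8(encode_utf8($dir) . "/" . $name);
printf "fixed: is_utf8=%d length=%d\n", is_utf8($fixed) ? 1 : 0, length $fixed;
```

The mixed path comes out 7 characters long instead of the expected 6
("fooü/ü"), because the undecoded "ü" counts as two characters. In the
test program above, the equivalent workaround would be to pass the
undecoded (byte) directory name to find and decode only inside the
wanted function, which is the case shown to work.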