File::Find has a problem in no_chdir mode when the directory name
passed to it has the utf8 flag set and the directory contains a file
with a non-ASCII (UTF-8 encoded) name.
Test program:
joey@gnu:~>cat test
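use utf8;
use Encode;
use File::Find;

my $dir=shift;
# A second (true) argument means: decode the directory name first,
# which turns the utf8 flag on.
if (shift) {
	$dir=decode_utf8($dir);
}
# Prints 1 if $dir has the utf8 flag set, an empty line otherwise.
print Encode::is_utf8($dir)."\n";
find({
	wanted => sub {
		# $_ is the full path, since no_chdir is set.
		my $f=decode_utf8($_);
		$f=~s/ü/mango/g;
		print "$f\n";
	},
	no_chdir => 1},
	$dir);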
When the directory name is passed in as undecoded bytes, it works as
expected; the wanted function is able to decode_utf8($_) and operate on
individual Unicode characters:
joey@gnu:~>find fooü
fooü
fooü/ü
joey@gnu:~>perl test fooü
foomango
foomango/mango
But what if the directory passed to File::Find has the utf8 flag set?
joey@gnu:~>perl test fooü 1
1
foomango
foomango/ü
What's going on is that File::Find concatenates $dir, which has the
utf8 flag set, with a filename that has not been decoded from UTF-8.
The resulting string has the utf8 flag set, so when the wanted function
runs decode_utf8 on it, nothing is done, and the string remains partly
decoded and partly still UTF-8 encoded.
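
The concatenation effect can be reproduced without File::Find at all.
The following is only a minimal sketch of the mechanism, not the code
File::Find actually runs: the variable names are made up for
illustration, and $bytes stands in for a filename as it would come back
from readdir.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# "ü" as raw UTF-8 bytes (utf8 flag off), standing in for a filename
# returned by readdir.
my $bytes = "\xc3\xbc";

# The same name decoded into a character string (utf8 flag on),
# standing in for a decoded $dir.
my $chars = decode_utf8($bytes);

# Joining a flagged string with a byte string upgrades the bytes as if
# they were Latin-1, so each of the two bytes becomes its own character.
my $mixed = $chars . "/" . $bytes;

printf "flag=%d length=%d\n", (utf8::is_utf8($mixed) ? 1 : 0), length($mixed);
# prints "flag=1 length=4": one decoded "ü", the slash, and two
# characters for the still-encoded "ü"
```

The result carries the utf8 flag, yet its second half is still the
undecoded byte sequence, which matches the foomango/ü output above.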