#842634 schroot: fiddles with localhost entries in /etc/hosts creating duplicates

Package:
schroot
Source:
schroot
Description:
Execute commands in a chroot environment
Submitter:
Santiago Vila
Date:
2018-12-23 00:45:04 UTC
Severity:
important
#842634#5
Date:
2016-10-30 22:09:35 UTC
From:
To:
Dear maintainer:

I tried to build this package in stretch with "dpkg-buildpackage -A"
(which is what the "Arch: all" autobuilder would do to build it)
but it failed:
--------------------------------------------------------------------------------
[...]
 debian/rules build-indep
dh build-indep --parallel
   dh_testdir -i -O--parallel
   dh_update_autotools_config -i -O--parallel
   debian/rules override_dh_auto_configure
make[1]: Entering directory '/<<BUILDDIR>>/rustc-1.12.0+dfsg1'
:
SHELL=/bin/sh PATH="$PWD/debian/bin:$PATH" DEB_HOST_ARCH="amd64" \
    ./configure --host=x86_64-unknown-linux-gnu --target=x86_64-unknown-linux-gnu --disable-manage-submodules --release-channel=stable --prefix=/usr --llvm-root=/usr/lib/llvm-3.8 --enable-local-rust --local-rust-root=/usr
configure: looking for configure programs
configure: found program 'cmp'
configure: found program 'mkdir'
configure: found program 'printf'

[... snipped ...]

test time::duration::tests::creation ... ok
test time::duration::tests::div ... ok
test time::duration::tests::mul ... ok
test time::duration::tests::nanos ... ok
test time::duration::tests::secs ... ok
test time::duration::tests::sub ... ok
test time::duration::tests::sub_bad1 ... ok
test time::duration::tests::sub_bad2 ... ok
test time::tests::instant_duration_panic ... thread '<unnamed>' panicked at 'other was less than the current instant', src/libstd/sys/unix/time.rs:276
stack backtrace:
   1:     0x56516b428dbb - std::sys::backtrace::tracing::imp::write::hf5a96839a69ef354
   2:     0x56516b46f3cf - std::panicking::default_hook::_{{closure}}::h722bc1c43176990e
   3:     0x56516b4506c4 - std::panicking::rust_panic_with_hook::h5c990dc76905436a
   4:     0x56516b45016f - std::panicking::begin_panic::h655bdc31a0135cb7
   5:     0x56516b41158a - std::time::tests::instant_duration_panic::hd6821be556a6c8a7
   6:     0x56516b4791eb - _<F as alloc..boxed..FnBox<A>>::call_box::hd5b7ed3c754c2638
   7:     0x56516b4724a8 - std::panicking::try::do_call::h63a911edf59441a1
   8:     0x56516b4bce76 - __rust_maybe_catch_panic
   9:     0x56516b478bb3 - _<F as alloc..boxed..FnBox<A>>::call_box::h3db4d78fb37007ad
  10:     0x56516b4b3202 - std::sys::thread::Thread::new::thread_start::h4c0ad33b336bc6ea
  11:     0x7f7b7df59463 - start_thread
  12:     0x7f7b7da859de - __clone
  13:                0x0 - <unknown>
ok
test time::tests::instant_elapsed ... ok
test time::tests::instant_math ... ok
test time::tests::instant_monotonic ... ok
test time::tests::since_epoch ... ok
test time::tests::system_time_elapsed ... ok
test time::tests::system_time_math ... ok

failures:

failures:
    sys_common::net::tests::no_lookup_host_duplicates

test result: FAILED. 753 passed; 1 failed; 0 ignored; 0 measured

/<<BUILDDIR>>/rustc-1.12.0+dfsg1/mk/tests.mk:423: recipe for target 'tmp/check-stage2-T-x86_64-unknown-linux-gnu-H-x86_64-unknown-linux-gnu-std.ok' failed
make[2]: *** [tmp/check-stage2-T-x86_64-unknown-linux-gnu-H-x86_64-unknown-linux-gnu-std.ok] Error 101
make[2]: Leaving directory '/<<BUILDDIR>>/rustc-1.12.0+dfsg1'
debian/rules:187: recipe for target 'override_dh_auto_test' failed
make[1]: *** [override_dh_auto_test] Error 2
make[1]: Leaving directory '/<<BUILDDIR>>/rustc-1.12.0+dfsg1'
debian/rules:105: recipe for target 'build-indep' failed
make: *** [build-indep] Error 2
dpkg-buildpackage: error: debian/rules build-indep gave error exit status 2
--------------------------------------------------------------------------------

I'm attaching three different build logs.

I wonder what exactly this test "no_lookup_host_duplicates" does.
My autobuilder does not have an FQDN, it has just this in /etc/hosts:

public-ip	skywalker1

Is this a bug in my autobuilder? I hope not.

In either case, I would just forward this upstream and disable the
test which fails.

Thanks.

#842634#10
Date:
2016-11-25 23:00:38 UTC
From:
To:
severity 842634 minor
thanks

I'm just lowering the severity here to unblock the migration, but I honestly
believe your autobuilder is misconfigured. The "no_lookup_host_duplicates"
tries to resolve localhost and verify that there aren't ipv4/ipv6 duplicates,
however your host is missing an /etc/hosts line for localhost and the test
panics.
I don't have any reference at hand, but I'd consider the environment broken.
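
As a rough illustration of the kind of check described above, here is a
minimal Python sketch. It is not the Rust test itself; the function names
lookup_addresses and duplicate_addresses are hypothetical and were chosen
only for this example. It resolves a name through the system resolver and
reports any address that comes back more than once.

```python
import socket
from collections import Counter


def lookup_addresses(host):
    """Return every address the system resolver reports for host, in order."""
    infos = socket.getaddrinfo(host, None, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IPv4 or IPv6 address.
    return [info[4][0] for info in infos]


def duplicate_addresses(addresses):
    """Return the addresses that occur more than once, in first-seen order."""
    counts = Counter(addresses)
    return [addr for addr, n in counts.items() if n > 1]
```

Calling duplicate_addresses(lookup_addresses("localhost")) should give an
empty list when the resolver returns each address once, and a non-empty
list in an environment where the same address is returned twice.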

Ciao, Luca

#842634#17
Date:
2016-11-25 23:11:07 UTC
From:
To:
Hmm, maybe I did not explain it well enough.

By "it has just this" I was only talking about lines containing the hostname,
not that the line above was the full contents of /etc/hosts.

I have a line for localhost, of course.

I'll repeat the build a few more times to see if this still happens,
but "minor" is probably lowering too much if your intention was to let
this package migrate to testing.

Thanks.

#842634#22
Date:
2016-11-26 00:42:26 UTC
From:
To:
Hi.

This is indeed related to localhost entries as you say, not to the
hostname as I believed.

My localhost lines are like this:

127.0.0.1	localhost
::1     localhost ip6-localhost ip6-loopback

but when I enter the chroot, localhost lines inside the chroot become
like this:

127.0.0.1       localhost
127.0.0.1       localhost ip6-localhost ip6-loopback

This is supposed to be controlled by /etc/schroot/default/nssdatabases,
which reads like this:


# System databases to copy into the chroot from the host system.
#
# <database name>
passwd
shadow
group
gshadow
services
protocols
networks
hosts


Maybe this is just a bug in schroot, for the strange way of "copying"
the /etc/hosts file?
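
The duplication in the chroot's file can be checked mechanically. The
following Python sketch is only an illustration (the helper name
addresses_by_name is hypothetical, and the parser is deliberately
simplistic): it groups the addresses in a hosts-format text by name, so a
name listed twice with the same address stands out.

```python
from collections import defaultdict


def addresses_by_name(hosts_text):
    """Map each host name to the list of addresses listed for it."""
    table = defaultdict(list)
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or address without names
        address, names = fields[0], fields[1:]
        for name in names:
            table[name].append(address)
    return dict(table)


# The two localhost lines as seen inside the chroot.
CHROOT_HOSTS = """\
127.0.0.1       localhost
127.0.0.1       localhost ip6-localhost ip6-loopback
"""

# The corresponding lines on the host system.
HOST_HOSTS = """\
127.0.0.1\tlocalhost
::1     localhost ip6-localhost ip6-loopback
"""
```

With these inputs, addresses_by_name(CHROOT_HOSTS)["localhost"] contains
127.0.0.1 twice, while the host's file yields one IPv4 and one IPv6
address for localhost.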

But even in such a case: Should we allow packages to FTBFS when the
autobuilder does not have IPv6? It would seem a gratuitous
requirement to me.

Thanks.

#842634#27
Date:
2016-11-29 09:17:56 UTC
From:
To:
Hi.

After I removed the ::1 line from /etc/hosts, rustc builds ok again.

I wonder, however, why two lines like this in /etc/hosts

127.0.0.1       localhost
127.0.0.1       localhost ip6-localhost ip6-loopback

should make the rustc build fail at all.

Could you please quote the relevant standard?

The tests that are performed after a program has been compiled are
supposed to check that the program is ok, not to check that the
machine is ok, unless, of course, problems in the machine are known to
affect the build itself.

Please tell me: How does the existence of two lines having 127.0.0.1
(both being "localhost") affect any of the other tests?

Thanks.

#842634#32
Date:
2016-11-29 18:07:51 UTC
From:
To:
The test was added in https://github.com/rust-lang/rust/pull/34700 to
guard against regression of a bug where std::net::lookup_host()
incorrectly returned multiple copies of each address.

The /etc/hosts database format certainly allows duplicate entries. It
seems that glibc, at least, does not merge those, but instead returns
the listed n-to-m mappings directly through the `struct hostent`
returned by both gethostbyaddr() and gethostbyname().

Even if it doesn't violate any spec, I agree with Luca that this is a
misconfiguration. The normal application of multiple address entries
is for service failover. Any client connecting to localhost with your
configuration will immediately retry connection failures, which is a
waste of resources.

More practically, I don't see a way to make the test work in your
environment without implementing a separate /etc/hosts parser. I think
that would be a lot of code to justify in this case.

 -r

#842634#37
Date:
2016-12-16 10:21:34 UTC
From:
To:
Ok, but the duplicate localhost entry, as I explained, is not
something which I deliberately chose, it's the result of using
schroot on a machine not having IPv6 enabled.

Unless IPv6 has become mandatory for autobuilders (I hope not),
this is still a bug (i.e. definitely *not* wishlist) in schroot,
for converting ::1 into 127.0.0.1 and causing the FTBFS problem.

I'll reassign it.

Thanks.

#842634#42
Date:
2017-01-30 00:05:08 UTC
From:
To:
reassign 842634 schroot
retitle 842634 schroot: fiddles with localhost entries in /etc/hosts creating duplicates
severity 842634 important
thanks

This could be a misconfiguration, yes, but not deliberately chosen by
the user, and it causes FTBFS bugs in other packages, so I don't think
"minor" is appropriate here.

Reassigning to schroot, which is the program creating duplicate entries.

Dear schroot maintainers, I'll summarize the bug:

On some machines, schroot mangles /etc/hosts as follows:

This is the original /etc/hosts from the host, which is mainly derived
from the one created by debian-installer:
---------------------------------------------------------
127.0.0.1       localhost
192.168.1.101   myhostname

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
---------------------------------------------------------

And this is the /etc/hosts file after entering a chroot with the
schroot command:

---------------------------------------------------------
127.0.0.1       localhost
192.168.1.101   myhostname
127.0.0.1       localhost ip6-localhost ip6-loopback
---------------------------------------------------------

The duplicate 127.0.0.1 entry makes some packages FTBFS (for example,
rustc, the package from which this is being reassigned).

Please make schroot not fiddle with /etc/hosts "more than required".
In particular, it should not convert an /etc/hosts file that allows
rustc to be built without errors into another one which does not allow
building rustc.

For example, dropping ff02::1 and ff02::2 might be ok, but converting
::1 into 127.0.0.1 is not.

Thanks.

#842634#55
Date:
2017-05-14 20:37:17 UTC
From:
To:
This seems correct.
as opposed to one IPv6 and one IPv4 address (or just one IPv4 address).

I can actually reproduce this issue on abel.debian.org (armhf porterbox):

On abel, my gai test program returns ::1, 127.0.0.1.
On abel within an schroot, my gai test program returns 127.0.0.1, 127.0.0.1.

Looking at /etc/hosts within the schroot, I see:
127.0.0.1       localhost
127.0.0.1       localhost ip6-localhost ip6-loopback
172.28.17.11    abel.debian.org abel

Modifying /etc/hosts by replacing ::1 with 127.0.0.1 results in being able
to reproduce the issue on other machines as well.

This has already caused issues in other packages (e.g. rustc), and is
tracked as Debian bug #842634.

Now, the next question is: where does this /etc/hosts come from? The file
is present in the above form directly after unpacking the schroot tarball,
before even entering the schroot:

abel% sessionid=$(schroot -b -c sid)
abel% cat /srv/chroot/schroot-unpack/$sessionid/etc/hosts
127.0.0.1       localhost
127.0.0.1       localhost ip6-localhost ip6-loopback
172.28.17.11    abel.debian.org abel

Running debootstrap does not produce an /etc/hosts in --variant=minbase and
--variant=buildd. When run without --variant, it does produce an
/etc/hosts, but that looks correct:

midna% sudo debootstrap --variant=minbase stretch /tmp/bootstrap
http://deb.debian.org/debian
midna% cat /tmp/bootstrap/etc/hosts
cat: /tmp/bootstrap/etc/hosts: No such file or directory
midna% sudo rm -rf /tmp/bootstrap
midna% sudo debootstrap stretch /tmp/bootstrap http://deb.debian.org/debian
midna% cat /tmp/bootstrap/etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

So, where does the file get mangled? I can’t find any traces in the schroot
and sbuild sources. Does anyone know by chance?

#842634#60
Date:
2017-05-14 22:50:41 UTC
From:
To:
So it's a fully _reproducible_ bug, with a well-defined immediate cause
(even if we haven't identified the indirect cause yet) -- unlike the
original report by Santiago Vila.  Thus, it looks like we have two different
bugs that just happen to trigger the same failure mode.

And thus, even if we fix the schroot issue, Santiago's bug likely won't be
fixed.
[snip]

Even more puzzling: I just recreated the chroot again, and despite using the
very same command to do so as before (last on 2017-05-04) there's no
/etc/hosts in the chroot now, which makes sslh build correctly.

The version from 2017-05-04 has an /etc/hosts, with ::1 replaced by
127.0.0.1 just as you noticed.  And I see no uploads of debootstrap, sbuild,
schroot or a package that looks related in that time period.

Got an unrelated big build running at the moment, once it's done I'll boot
from a snapshot (got backups from 2017-05-01 (plus earlier ones) and dailies
since 2017-05-06) to see if it's a matter of an installed package.

But again, this is probably unrelated to Santiago's bug other than for the
results.


Meow!

#842634#65
Date:
2017-05-14 23:52:09 UTC
From:
To:
Hi,

On 15/05/2017 at 00:50, Adam Borowski wrote:

As this bug is not related to the sslh package itself, I've removed the
pending tag. I'll let Michael revert
https://anonscm.debian.org/cgit/collab-maint/sslh.git/commit/?id=243bb3faa682afa8168664eaf5a4f72cfc21ee27
and I'm closing this bug to disable the autoremoval from testing.

#842634#70
Date:
2017-05-15 00:13:32 UTC
From:
To:
Control: severity -1 important

Well, closing is inappropriate, as we have at least _two_ bugs that result
in sslh hanging during the testsuite -- this one being deterministic means
it clearly is not the cause of random FTBFSes as in the original report.

Let's restore the severity then, as the not-yet-known bug happens only some
of the time.


Meow!

#842634#75
Date:
2017-05-15 05:40:52 UTC
From:
To:
Note that my commit still improves things, regardless of this specific bug
report or others. I think the best outcome in the long run would be to keep
the commit by upstreaming it. I can understand if you’d like to revert it
while we’re in a freeze, but let’s not drop it entirely please :).

#842634#80
Date:
2018-10-06 21:38:59 UTC
From:
To:
I'd like to clarify that most probably there was only one bug here
after all, namely, the one in schroot (#842634).

I initially reported this as "random" because I had a mix of
successful builds and failed builds, but most probably the
autobuilders in which it failed were always the same, the ones in
which the build succeeded were always the same, and I just failed to
recognize the pattern.

Fortunately you have found the real reason for the bug (while I was
missing from the discussion :-), and I believe this was the only
reason it failed for me last year.

Now a simple question: Do you think the workaround you prepared could
still be useful at all (I personally don't think so), or should I just
reassign this to schroot and use "affects"?

Thanks.

#842634#85
Date:
2018-12-23 00:41:13 UTC
From:
To:
The root cause of the bug is glibc's implementation of gethostent. The
chroot's hosts file is filled by schroot by running `getent hosts`, and
this is what prints the wrong information. If you peek inside the source
of glibc's getent program, then this part of it can be extracted to the
following (LGPLv2.1, and formatting is the GNU style so don't blame me):

If you compile and run this on Linux, you will get the same output as
`getent hosts`, with `::1' being turned into `127.0.0.1'. However, on a
BSD (userland) system, you get a more sane output that doesn't rewrite
IPv6 addresses. So, as this demonstrates, the root problem is that
gethostent in glibc mangles its input. What I don't know is if this is
desired behaviour; I guess someone should file a bug report upstream and
see what they say...
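
For readers who want to experiment without a chroot, the transformation
reported in this bug can be modelled in a few lines of Python. This is a
toy model of the observed before/after files only, not glibc's code: the
function name mangle_entries is hypothetical, and the rules (keep IPv4
entries, rewrite the IPv6 loopback to 127.0.0.1, drop other IPv6 entries)
are inferred from the /etc/hosts examples quoted earlier in this report.

```python
import ipaddress


def mangle_entries(hosts_text):
    """Model the observed rewrite of a hosts file as (address, names) pairs."""
    entries = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or address without names
        try:
            address = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # not a valid address; ignored in this model
        names = fields[1:]
        if address.version == 4:
            entries.append((str(address), names))
        elif address.is_loopback:
            # Assumption taken from the report: ::1 shows up as 127.0.0.1.
            entries.append(("127.0.0.1", names))
        # Other IPv6 entries (e.g. ff02::1) are absent from the output.
    return entries


def format_entries(entries):
    """Render entries as hosts-file lines, address padded to 15 columns."""
    return "\n".join(f"{addr:<15} {' '.join(names)}" for addr, names in entries)
```

Feeding it the debian-installer style file from the summary above yields
two 127.0.0.1 entries for localhost, matching what was seen inside the
chroot.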