#582352 dak/apt: deficiencies at handling out-of-sync metadata

Package:
apt
Source:
apt
Description:
commandline package manager
Submitter:
Jonathan Nieder
Date:
2011-09-17 15:37:28 UTC
Severity:
normal
#582352#5
Date:
2010-05-20 06:09:24 UTC
From:
To:
Every once in a while (mirror sync-related?), cupt update is failing:

 # cupt update
 [...]
 Get:6 http://ftp.us.debian.org/debian experimental Release.gpg
 Get:7 http://ftp.us.debian.org/debian experimental/main Packages.bz2 [257KiB]
 54% [7 experimental/main Packages.bz2 0B/257KiB 0%]            | 75.7KiB/s | ETA: 4s
 W: downloading http://ftp.us.debian.org/debian/dists/experimental/main/binary-i386/Packages.bz2 failed: invalid size: expected '262710', got '260345'
 [...]
 Fetched 10.7MiB in 43s.
 #
 # cupt update
 [...]
 Get:6 http://ftp.us.debian.org/debian sid Release.gpg
 100% [6 sid Release.gpg 0B/835B 0%]                            | 77.9KiB/s | ETA: 0s
 W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sid_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny) <ftpmaster@debian.org>
 W: signature verification for 'sid Release' failed
 Get:7 http://ftp.us.debian.org/debian sid/main Packages.bz2 [6495KiB]
 Get:8 http://ftp.us.debian.org/debian sid/main Translation-en_US.bz2
 100% [8 sid/main Translation-en_US.bz2 0B]                          | 0B/s | ETA: 0s
 W: downloading http://ftp.us.debian.org/debian/dists/sid/main/i18n/Translation-en_US.bz2 failed: HTTP response code said error: 404
 [...]
 Fetched 6798KiB in 28s.
 # date -u
 Thu May 20 06:05:55 UTC 2010

I am not sure what is actually behind this but thought I should get
your advice.  Is this a known problem?  Could cupt help diagnose it
more easily?

Workaround: use apt-get update to get the pdiffs.

Jonathan

#582352#10
Date:
2010-05-20 16:33:27 UTC
From:
To:
Jonathan Nieder wrote:
Firstly, I would like to confirm that even in that case, the whole update
thing is run fully, because the next lines should indicate downloading
Packages.gz for the same entry and succeed (or, hm, possibly, fail too).
Is it the case?

No. This is the first time I have seen this error pop up. My first guess is
that the mirror is misbehaving. Personally I haven't used ftp.us.debian.org
for at least a year and some months; I do, however, use several other ones
without errors like this.

the actual size of Packages.{ext}. This may also mean security problems, so
Cupt won't download anything with modified sizes.
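
For illustration, the size check described here can be sketched roughly as
follows. This is a minimal Python sketch, not Cupt's actual code: the function
names parse_release_sha256 and check_index are hypothetical, and the hash
comparison is an extra step beyond the size check mentioned above. It assumes
the usual Release layout, where each line of the SHA256 section holds a
digest, a size and a path.

```python
import hashlib


def parse_release_sha256(release_text):
    """Map path -> (sha256 hex digest, size) from a Release file's SHA256 section."""
    entries = {}
    in_section = False
    for line in release_text.splitlines():
        if line.startswith("SHA256:"):
            in_section = True
            continue
        if in_section:
            if not line.startswith(" "):
                break  # the next top-level field ends the section
            digest, size, path = line.split()
            entries[path] = (digest, int(size))
    return entries


def check_index(entries, path, data):
    """Return None if data matches the Release entry, else an error string."""
    digest, size = entries[path]
    if len(data) != size:
        return "invalid size: expected '%d', got '%d'" % (size, len(data))
    if hashlib.sha256(data).hexdigest() != digest:
        return "hash mismatch"
    return None
```

If Release and Packages come from mirrors in different states, the size
recorded in Release will not match the downloaded file and check_index
returns an error message instead of None.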

Some time ago I asked the FTP team about a spec for detecting that a mirror
update is in progress, by means of a file which they apparently place
temporarily somewhere, but my mail didn't get any answer, and since then I
have not encountered this problem and forgot about it.

So. The first thing I propose to do is verify that the problem is
mirror-dependent, for example, by trying another, non-US mirror for some
time. This would not necessarily mean that the problem is not in Cupt, but
would help to "bisect" what's going on.

#582352#15
Date:
2010-05-20 17:18:39 UTC
From:
To:
Eugene V. Lyubimkin wrote:

Get:1 http://ftp.us.debian.org/debian sid Release
Get:2 http://ftp.us.debian.org/debian experimental Release
Get:3 http://ftp.egr.msu.edu/debian sid Release
Get:4 http://ftp.us.debian.org/debian experimental Release.gpg
Get:5 http://ftp.us.debian.org/debian sid Release.gpg
100% [4 experimental Release.gpg 835B/835B 100%][5 sid Release.| 76.5KiB/s | ETA: 0s
W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_experimental_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny) <ftpmaster@debian.org>
W: signature verification for 'experimental Release' failed
Get:6 http://ftp.egr.msu.edu/debian sid Release.gpg
Get:7 http://ftp.us.debian.org/debian experimental/main Packages.bz2 [254KiB]
54% [7 experimental/main Packages.bz2 0B/254KiB 0%]            | 77.9KiB/s | ETA: 2s
W: downloading http://ftp.us.debian.org/debian/dists/experimental/main/binary-i386/Packages.bz2 failed: invalid size: expected '260345', got '260479'
Get:8 http://ftp.us.debian.org/debian experimental/main Packages.gz [314KiB]
[...]

Trying ‘cupt update’ again:

Get:1 http://ftp.us.debian.org/debian sid Release
Get:2 http://ftp.us.debian.org/debian experimental Release
Get:3 http://ftp.egr.msu.edu/debian sid Release
Get:4 http://ftp.us.debian.org/debian sid Release.gpg
76% [3 sid Release 28.0KiB/101KiB 28%][4 sid Release.gpg 835B/8| 57.0KiB/s | ETA: 0s
W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sid_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny) <ftpmaster@debian.org>
W: signature verification for 'sid Release' failed
Get:5 http://ftp.us.debian.org/debian experimental Release.gpg
Get:6 http://ftp.egr.msu.edu/debian sid Release.gpg
100% [5 experimental Release.gpg 835B/835B 100%]               | 77.5KiB/s | ETA: 0s
W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_experimental_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny) <ftpmaster@debian.org>
W: signature verification for 'experimental Release' failed
Get:7 http://ftp.us.debian.org/debian sid/main Packages.bz2 [6495KiB]
4% [7 sid/main Packages.bz2 0B/6495KiB 0%]                  | 25.0KiB/s | ETA: 1m47s
W: downloading http://ftp.us.debian.org/debian/dists/sid/main/binary-i386/Packages.bz2 failed: invalid size: expected '6651138', got '6650451'
Get:8 http://ftp.us.debian.org/debian sid/main Packages.gz [8542KiB]
3% [8 sid/main Packages.gz 0B/8542KiB 0%]                      | 403B/s | ETA: 2m49s
W: downloading http://ftp.us.debian.org/debian/dists/sid/main/binary-i386/Packages.gz failed: invalid size: expected '8747079', got '8747611'
Get:9 http://ftp.us.debian.org/debian sid/main Packages [30.4MiB]
1% [9 sid/main Packages 0B/30.4MiB 0%]                        | 452B/s | ETA: 10m16s
W: failed to download index for 'sid/main'
W: downloading http://ftp.us.debian.org/debian/dists/sid/main/binary-i386/Packages failed: HTTP response code said error: 404

So it looks like the Release is temporarily “out of sync” with other files.

Meanwhile, I have never run into this problem with apt-get.

... ah, okay, maybe this is it: ftp.us.debian.org uses round-robin DNS
to switch between multiple mirrors.  APT’s HTTP method copes with this
by doing the lookup once and reusing the IP for a number of requests,
whereas it looks like cupt is switching between mirrors too often.
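
The "look up once, reuse the IP" behaviour described above can be sketched as
follows. This is a minimal Python illustration under stated assumptions, not
APT's actual implementation: the class name PinnedResolver and its lookup
parameter are hypothetical, and the lookup is injectable only so the sketch
can be exercised without network access.

```python
import socket


class PinnedResolver:
    """Resolve each hostname once and reuse the same address afterwards."""

    def __init__(self, lookup=socket.getaddrinfo):
        self._lookup = lookup  # injectable so the sketch can be tested offline
        self._pinned = {}

    def address_for(self, hostname, port=80):
        key = (hostname, port)
        if key not in self._pinned:
            results = self._lookup(hostname, port, type=socket.SOCK_STREAM)
            # Each result is (family, type, proto, canonname, sockaddr);
            # keep the IP of the first one for the rest of the run.
            self._pinned[key] = results[0][4][0]
        return self._pinned[key]
```

With one such object shared across the fetches of Release, Release.gpg and
Packages, every request for the same hostname goes to the same address, even
if the DNS server rotates its answers between lookups.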

Selecting a random particular mirror (like mirrors2.kernel.org, the
first one ‘ping’ gave me) does avoid the problem, though that doesn’t
rule out this having just avoided some particular problematic mirror.

Regards,
Jonathan

#582352#20
Date:
2010-05-20 17:41:13 UTC
From:
To:
Jonathan Nieder wrote:
Yup.
You seem to be right. Cupt just passes all network work to Curl library.

Khm. I would argue that it's not a Cupt problem and that the other side,
providing round-robin DNS, should ensure the equality of files. I don't know
whether it is technically possible to enable or disable particular IPs on
the fly... Need to think more, probably.
Hm, given the reason above, using a static mirror (not "changing" content
between calls) should avoid this problem completely, no?

#582352#25
Date:
2010-05-20 18:08:15 UTC
From:
To:
Eugene V. Lyubimkin wrote:

Eh, it might be nice for them to do that, but we would have to either
change them or live with what we have.

In other words, maybe this is not the way DNS is supposed to be used,
but it is an assumption for “APT over HTTP” used on both ends (the
mirrors and the clients).  And this is a simple assumption that would
not be broken by any DNS cache.  It is the client’s responsibility to
use the same IP where it needs consistency.

Yes, I only meant that I have not empirically ruled out other causes.
But I do think we’ve found the problem.

Now to find some time to fix it. :)

Thanks,
Jonathan

#582352#30
Date:
2010-05-20 19:46:33 UTC
From:
To:
package cupt libcupt-perl mirrors
reassign 582352 mirrors
retitle 582352 different mirrors under single DNS should have equal content
affects 582352 + libcupt-perl
thanks

Jonathan Nieder wrote:
Ideally, I disagree. Is there some RFC spec about 'use the same IP if you need
consistency' or the like?
Even given all the above, I'd be in favor of providing a workaround on Cupt's
side. But, in this case, I practically cannot. Curl has no options to control
DNS->IP selection, and Cupt's download system is written in a method- and
host-agnostic way, with a multi-process approach and internal pipelining. The
last two things mean that download methods cannot rely on any DNS cache. Saying
that not all files can be downloaded independently (in this case, Release and
Packages, but that also applies to the download of .debs as well) breaks too
many places of the system, and "fixing" this part needs a major rewrite, less
maintainable/scalable code and hacks to avoid race condition bugs #442189. The
current download system avoids that by fully parallel design.

The possible solutions include a multithreaded instead of a multi-process
design (has its downsides, requires a change of implementation language) or
implementing a file-based DNS cache in Curl instead of the memory-based one
(unlikely to happen, I suppose).
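
To make the second option concrete, here is a rough Python sketch of a
file-based DNS cache that several download processes could share. It is only
an illustration under assumptions: the function name cached_address, the JSON
cache format and the cache_path argument are all hypothetical, nothing like
this exists in Curl or Cupt as far as this thread shows, it relies on
Unix-only fcntl locking, and it has no expiry, which a real cache would need.

```python
import fcntl
import json
import os
import socket


def cached_address(hostname, cache_path, lookup=socket.gethostbyname):
    """Return one address per hostname, shared by every process using cache_path."""
    # Open (creating if needed) and take an exclusive lock, so that two
    # workers cannot both resolve the name and record different addresses.
    fd = os.open(cache_path, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        text = f.read()
        cache = json.loads(text) if text else {}
        if hostname not in cache:
            cache[hostname] = lookup(hostname)
            f.seek(0)
            f.truncate()
            json.dump(cache, f)
            f.flush()
        # The lock is released when the file is closed on leaving the block.
        return cache[hostname]
```

The first worker to ask for a hostname resolves it and writes the result;
every later worker, in any process, reads the same address back from the file.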

Summarizing: sorry, I can't provide a workaround in the near future.

Also, let's see what the maintainers of the 'mirrors' pseudo-package can suggest.

#582352#43
Date:
2010-05-21 00:40:43 UTC
From:
To:
Eugene V. Lyubimkin wrote:

I was too focused on DNS before, and there is no problem there.  Of
course DNS does exactly what we want it to here.

However, a case could be made that the inconsistency between APT
mirrors served under the same hostname breaks HTTP (and perhaps
violates it, though I haven’t found a relevant passage), since an HTTP
proxy could ask the DNS resolver for a new server to connect to at any
time.

(Just to be clear, there is no redirection involved here.)

 $ wget -S http://ftp.us.debian.org/debian/dists/unstable/Release
 --2010-05-20 19:34:35--  http://ftp.us.debian.org/debian/dists/unstable/Release
 Resolving ftp.us.debian.org... 35.9.37.225, 64.50.236.52, 128.30.2.36,
 ...
 Connecting to ftp.us.debian.org|35.9.37.225|:80... connected.
 HTTP request sent, awaiting response...
   HTTP/1.1 200 OK
 [...]

Should apt’s HTTP method be using IP addresses in its requests
instead?  Would this be safe, or do some mirrors use virtual hosts?

Jonathan

#582352#48
Date:
2010-05-21 14:49:57 UTC
From:
To:
Jonathan Nieder wrote:
Erm, this bug is not related to the APT HTTP method. You probably want to
discuss this matter with the APT maintainers.

#582352#53
Date:
2010-05-21 15:27:48 UTC
From:
To:
Hi APT team,

As Eugene noticed, the use of round-robin DNS between out-of-sync
mirrors, by ftp.us.debian.org for example, makes it hard to reliably
fetch and verify the Debian archive's index files. Despite having
similar addresses, Release.gpg, Release, and Packages can end up being
fetched from different mirrors. I suspect it is possible for this to
come up in some proxy setups, too, where the client has no control
over which mirror each file is fetched from.

I suggested that one possible solution would be to force use of IP
addresses for host names in requests made by the APT HTTP method. Of
course this is not ideal, because among other things it breaks virtual
hosts.
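
One way to picture the trade-off: the TCP connection can go to a fixed IP
address while the request still names the mirror in its Host header. The
Python sketch below is only an illustration, assuming the mirror selects its
virtual host from the Host header; the function name build_pinned_request is
hypothetical and this is not how the APT HTTP method is implemented.

```python
def build_pinned_request(ip, hostname, path):
    """Return (connect_address, request_bytes) for an HTTP/1.1 GET.

    The connection targets the pinned IP, while the Host header keeps the
    original hostname so name-based virtual hosts can still be selected.
    """
    request = (
        "GET %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Connection: keep-alive\r\n"
        "\r\n" % (path, hostname)
    )
    return (ip, 80), request.encode("ascii")
```

A client would open a socket to the returned address and send the returned
bytes; putting the IP address into the Host header instead is what would
break virtual hosts.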

Eugene V. Lyubimkin wrote:

This is about the protocol used by apt and other front-ends to
retrieve packages over HTTP, no?

Is your point that the same problem applies to other protocols like
FTP, too? In that case, I would disagree. With FTP, unlike HTTP, it is
easy to arrange for the Release.gpg, Release, and Packages files to be
obtained from a single mirror.

Good idea, thanks. CC-ed.

Thoughts?
Jonathan

#582352#58
Date:
2010-05-21 15:37:59 UTC
From:
To:
No-no, I just didn't realize you wanted to discuss this question protocol-wise.

#582352#63
Date:
2011-06-07 15:31:50 UTC
From:
To:
Hi APT maintainers,

Jonathan Nieder wrote:

Ping?  Would you be interested in a patch for this?