If the HTTP Content-Length header differs from the actual data length, wget disregards the HTTP specification as follows:

1) If Content-Length is greater than the actual data, wget keeps retrying indefinitely to receive the whole file. Using the command-line parameter --ignore-length fixes this, but should it not be on by default?

2) If Content-Length is smaller than the actual data sent by the server, wget happily downloads it all instead of stopping at whatever Content-Length specified. This is contrary to the spec, which strictly states that Content-Length must be obeyed and that the user must be notified that something strange happened. Wget correctly tells the user that it received nnn/mmm bytes, where mmm is Content-Length, but should there not be an error message, too?
tags 143736 + upstream
forwarded 143736 bug-wget@gnu.org
thanks

I reported your bugs to upstream. Thanks for your report.
Noel Koethe <noel@koethe.net> writes:

It doesn't disregard the HTTP specification. As far as I'm aware, HTTP simply specifies that the information provided by Content-Length must be correct. When it is not correct, the protocol has been broken by the server and the best Wget can do is try to make sense of the situation. In both cases you report, Wget's behavior is by design.

Not indefinitely, but until `--tries' attempts (20 by default) have been exhausted.

No. When you're downloading files over a slow or unstable network, you will often get EOF while reading data. Retrying in spite of that EOF has been one of Wget's primary features since the very beginning. So Wget is not disregarding the spec, it is *honoring* it by assuming that the provided Content-Length is correct, as it should be. This feature has made many a download possible. In the cases where the Content-Length header truly is broken, use `--ignore-length'.

Again, this is a feature. Broken CGI scripts often report broken values for `Content-Length'. When more data arrives, it becomes apparent that the reported value is *broken* (unlike in the case when less data arrives). Wget can either dismiss the rest of the data or dismiss the header. I judged the data actually transmitted over the wire to be more important than one obviously broken header.

The exception is when persistent connections are used. In that case, Content-Length is honored to the letter, and the remote server had *better* provide the correct value, or else.

Which spec says that?
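The policy described in the message above can be summarised as a small decision table. The following is only an illustrative Python sketch, not Wget's actual code (Wget is written in C); the names `Action` and `decide` are hypothetical, and `received` is assumed to mean the number of body bytes the server actually sent before the transfer ended.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    COMPLETE = "complete"      # nothing to reconcile: treat the download as done
    RETRY = "retry"            # fewer bytes than declared: assume the network dropped
    KEEP_EXTRA = "keep_extra"  # more bytes than declared: trust the data, not the header
    TRUNCATE = "truncate"      # persistent connection: stop exactly at Content-Length


def decide(declared: Optional[int], received: int,
           persistent: bool = False, ignore_length: bool = False) -> Action:
    """Pick an action for a finished transfer (hypothetical helper)."""
    # No header, or the user asked to ignore it: whatever arrived is the file.
    if declared is None or ignore_length:
        return Action.COMPLETE
    if received < declared:
        # Cannot tell a faulty network from a faulty server, so believe the
        # server and try again (bounded by the number of tries in real life).
        return Action.RETRY
    if received > declared:
        # On a persistent connection the header is the only way to find the
        # end of the body, so it is honored to the letter.
        return Action.TRUNCATE if persistent else Action.KEEP_EXTRA
    return Action.COMPLETE


if __name__ == "__main__":
    for args in [(100, 100), (100, 60), (100, 140)]:
        print(args, decide(*args).value)
```

The `ignore_length` parameter stands in for the `--ignore-length' option mentioned above; the retry limit controlled by `--tries' is deliberately left out of the sketch.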
Hrvoje Niksic wrote:
Quoting from section 7.2.2 of RFC 1945:
When an Entity-Body is included with a message, the length of that
body may be determined in one of two ways. If a Content-Length header
field is present, its value in bytes represents the length of the
Entity-Body. Otherwise, the body length is determined by the closing
of the connection by the server.
Note: Some older servers supply an invalid Content-Length when
sending a document that contains server-side includes dynamically
inserted into the data stream. It must be emphasized that this
will not be tolerated by future versions of HTTP. Unless the
client knows that it is receiving a response from a compliant
server, it should not depend on the Content-Length value being
correct.
Since wget is an HTTP/1.0 client, its behavior is entirely consistent with
the specification. Noel was probably thinking of RFC 2068, which says:
When a Content-Length is given in a message where a message-body is
allowed, its field value MUST exactly match the number of OCTETs in
the message-body. HTTP/1.1 user agents MUST notify the user when an
invalid length is received and detected.
But until wget is upgraded to be a 1.1 client, it does not need to worry
(much) about RFC 2068. Even after the conversion, the only obvious change
that is needed is to include a message about the invalid length in wget's
output, which most users will probably overlook anyway.
Tony
"Tony Lewis" <tlewis@exelana.com> writes: Even so, when less data have been received, it's impossible to detect whether that's because of a faulty network or a faulty server. Wget defaults to believing the server, which is in conformance with HTTP. If your point is that Wget should print a warning when it can *prove* that the Content-Length data it received was faulty, as in the case of having received more data, I agree. We're already printing a similar warning when Last-Modified is invalid, for example.
Hrvoje Niksic wrote:

and T. Berners-Lee what they were thinking. <grin> I was just quoting from RFC 2068: Hypertext Transfer Protocol -- HTTP/1.1.

As for printing a warning only when wget can "prove" that the Content-Length data was faulty, that sounds like a reasonable implementation to me.

Tony
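The warning agreed on in this exchange fires only when the mismatch can be proven, that is, when more data arrived than Content-Length announced. A minimal Python sketch of that check follows; the function name `length_warning` and the wording of the message are assumptions made for illustration and are not taken from Wget.

```python
from typing import Optional


def length_warning(declared: Optional[int], received: int) -> Optional[str]:
    """Return a warning only when the header is provably wrong (hypothetical helper)."""
    # Receiving more than was declared proves the header was faulty.
    if declared is not None and received > declared:
        return (f"warning: Content-Length announced {declared} bytes, "
                f"but {received} bytes were received")
    # A short read proves nothing: it may be the network, so stay silent.
    return None


if __name__ == "__main__":
    print(length_warning(100, 140))
    print(length_warning(100, 60))
```

A short read returns no warning on purpose, since it cannot be distinguished from an interrupted connection.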
Dear Debian "wget" bug reporter,

I just want to let you know why your bug report is still open, or why it looks as though nobody is working on the bugs reported against the wget package (http://bugs.debian.org/wget).

For some months I have not forwarded bugs and problems to the upstream author/mailing list (http://www.gnu.org/software/wget/#mailinglists) because wget currently has no maintainer who is working on it or fixing bugs. :(

http://www.gnu.org/help/help.html
--8<--
We are looking for new maintainers for these GNU packages (contact <maintainers@gnu.org> if you'd like to volunteer):
* ...
* wget (which still has a maintainer, but he would like to step down)
--8<--

This is why reporting bugs to the wget mailing list doesn't make sense right now: nobody will work on them. Once there is a new maintainer I will forward the open bugs in the Debian Bug Tracking System (BTS) to him, but this may take some time (the help request has been on the gnu.org page for a month; maybe you know somebody :)).

Thanks for reporting bugs. :)
Could you please close this bug? Wget does honor Content-Length and the HTTP specification. For more information, see the discussion in the bug report, especially between me and Tony Lewis. A possible improvement (which Tony is asking for) is to print a warning when Wget detects a mismatch between Content-Length and the actual data. This should be filed as a wish list item because it's not even really a bug (HTTP/1.0 doesn't mandate such a warning, HTTP/1.1 does).
severity 143736 wishlist
retitle 143736 print a warning when Content-Length and actual data mismatch
thanks

On Fri, 24.10.2003 at 13:40, Hrvoje Niksic wrote:

Hello Hrvoje,

Sure, but see below (just write to nnnnn-done@bugs.debian.org; there is no ACL in the Debian bug system).

OK, so I will change this one to a wishlist request.