#85509 links: can't handle a big crappy html file

Package:
links
Source:
links2
Description:
Web browser running in text mode
Submitter:
Josip Rodin
Date:
2011-08-19 01:03:06 UTC
Severity:
normal
#85509#5
Date:
2001-02-10 18:51:41 UTC
From:
To:
Hi,

Links doesn't like http://www.formel.hr/cjenik.htm which is a 324KB-large
file with a couple of glitches in HTML. It just hangs there, consuming all
the CPU, and I can't abort it any other way but with ^\ (SIGQUIT).

(I wouldn't complain normally, but Lynx and w3m display the file without
problems.)

#85509#8
Date:
2001-04-03 00:06:15 UTC
From:
To:
Hello,

Here are the latest bug reports from Debian. I hope they are worth the
time spent writing them up here, and hopefully even worth fixing.

===
In rxvt the 'home' and 'end' keys are reversed - the patch was sent to the list

===
Bad URI expansion:
 The start page of the man2html package:
http://localhost/cgi-bin/man2html

 contains references like:

   <A HREF="http:/cgi-bin/manwhatis?1">1. User Commands</A>;

 Following RFC 1808, the browser should resolve this reference to
http://localhost/cgi-bin/manwhatis?1

 but links incorrectly resolves it to

http://localhost/cgi-bin/http:/cgi-bin/manwhatis?1
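
For comparison, here is a minimal Python sketch of the two behaviours. It
assumes the lenient reading in which a reference that carries the same scheme
as the base but no network location is treated as relative; Python's
urllib.parse.urljoin happens to apply that reading. naive_join is a
hypothetical helper that only reproduces the symptom and is not taken from
the links source.

```python
from urllib.parse import urljoin

BASE = "http://localhost/cgi-bin/man2html"
REF = "http:/cgi-bin/manwhatis?1"


def naive_join(base: str, ref: str) -> str:
    """Hypothetical: append the reference, unchanged, to the base directory."""
    base_dir = base.rsplit("/", 1)[0]
    return base_dir + "/" + ref


def lenient_join(base: str, ref: str) -> str:
    """Resolve the reference against the base, inheriting the host."""
    return urljoin(base, ref)


if __name__ == "__main__":
    print(naive_join(BASE, REF))
    print(lenient_join(BASE, REF))
```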

===
Weird URL transformation/unwanted decoding:
 links "http://www.google.com/search?q=abc+41+41+41xyz"

 This results in a search for "abcAAAxyz" instead of "abc 41 41 41xyz"
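
A short Python sketch of the decoding one would expect, where '+' in a query
string stands for a space. The second function is only a guess at a mechanism
that would produce the observed result (treating '+' as an escape introducer
in the same way as '%'); plus_as_escape is a hypothetical name and is not
something found in links.

```python
from urllib.parse import parse_qs, unquote, urlsplit

URL = "http://www.google.com/search?q=abc+41+41+41xyz"


def expected_query(url: str) -> str:
    """Decode the q parameter the usual way: '+' means a space."""
    return parse_qs(urlsplit(url).query)["q"][0]


def plus_as_escape(url: str) -> str:
    """Hypothetical misreading: treat '+' like '%', so '+41' turns into 'A'."""
    raw_value = urlsplit(url).query.split("=", 1)[1]
    return unquote(raw_value.replace("+", "%"))


if __name__ == "__main__":
    print(repr(expected_query(URL)))
    print(repr(plus_as_escape(URL)))
```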

===
Links eats up spaces when browsing the local filesystem:
 $ touch "a b"
 $ links .

 Now try to open the file "a b".  The status line shows that
 links drops the space: the URL is "file:///ab"
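
For reference, a sketch of how a path containing a space is usually turned
into a file: URL, with the space percent-encoded rather than dropped. It uses
Python's urllib.parse; the helper names are made up for this example, and the
path "/a b" is simply the file from above placed at the filesystem root.

```python
from urllib.parse import quote, unquote

PREFIX = "file://"


def to_file_url(path: str) -> str:
    """Percent-encode an absolute path; '/' is left alone, ' ' becomes %20."""
    return PREFIX + quote(path)


def from_file_url(url: str) -> str:
    """Reverse the encoding to get the original path back."""
    return unquote(url[len(PREFIX):])


if __name__ == "__main__":
    url = to_file_url("/a b")
    print(url)
    print(repr(from_file_url(url)))
```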

===
Links chokes on a big file (lynx and w3m handle it):
 Links doesn't like http://www.formel.hr/cjenik.htm which is a 324KB-large
 file with a couple of glitches in HTML. It just hangs there, consuming all
 the CPU, and I can't abort it any other way but with ^\ (SIGQUIT).



Wishlist items:
===
Like 'E' in lynx, "go to URL based on selected link".

===
Bookmark editor: Home/End could work, and Ins could ADD URL and
Del could REMOVE URL.


Thanks,
Peter

#85509#9
Date:
2001-04-03 19:18:54 UTC
From:
To:
Peter Gervai <grin@tolna.net> writes:

I don't think this is correct, but unfortunately I left my RFC 1808 at home.

Yes, that's incredibly irritating.  Links uses it for 'open in new
window', though.  It would be nice if that encoding were turned off by
default and only enabled by a command-line flag.  I get around it by
encoding spaces as %20 instead.

#85509#10
Date:
2001-04-03 19:29:43 UTC
From:
To:
[RFC1808]
2.1.  URL Syntactic Components

   The URL syntax is dependent upon the scheme.  Some schemes use
   reserved characters like "?" and ";" to indicate special components,
   while others just consider them to be part of the path.  However,
   there is enough uniformity in the use of URLs to allow a parser to
   resolve relative URLs based upon a single, generic-RL syntax.  This
   generic-RL syntax consists of six components:

      <scheme>://<net_loc>/<path>;<params>?<query>#<fragment>

   each of which, except <scheme>, may be absent from a particular URL.
   These components are defined as follows (a complete BNF is provided
   in Section 2.2):

      scheme ":"   ::= scheme name, as per Section 2.1 of RFC 1738 [2].

      "//" net_loc ::= network location and login information, as per
                       Section 3.1 of RFC 1738 [2].

      "/" path     ::= URL path, as per Section 3.1 of RFC 1738 [2].

      ";" params   ::= object parameters (e.g., ";type=a" as in
                       Section 3.2.2 of RFC 1738 [2]).

      "?" query    ::= query information, as per Section 3.3 of
                       RFC 1738 [2].

      "#" fragment ::= fragment identifier.


Later there is a BNF description:

   URL         = ( absoluteURL | relativeURL ) [ "#" fragment ]

   absoluteURL = generic-RL | ( scheme ":" *( uchar | reserved ) )

   generic-RL  = scheme ":" relativeURL

   relativeURL = net_path | abs_path | rel_path

   net_path    = "//" net_loc [ abs_path ]
   abs_path    = "/"  rel_path
   rel_path    = [ path ] [ ";" params ] [ "?" query ]

   path        = fsegment *( "/" segment )
   fsegment    = 1*pchar
   segment     =  *pchar

   params      = param *( ";" param )
   param       = *( pchar | "/" )

   scheme      = 1*( alpha | digit | "+" | "-" | "." )
   net_loc     =  *( pchar | ";" | "?" )

[..etc..]
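
To make the six components concrete, here is a small Python sketch that
splits a URL with urllib.parse.urlparse, which returns the same six fields
(it calls net_loc "netloc"). The components helper and the sample URL with
";type=a" and "#top" are made up for illustration; the second URL is the
reference from the man2html report.

```python
from urllib.parse import urlparse


def components(url: str) -> dict:
    """Split a URL into the six generic-RL components of RFC 1808."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,
        "net_loc": parts.netloc,
        "path": parts.path,
        "params": parts.params,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    # Illustrative URL using every component.
    print(components("http://localhost/cgi-bin/manwhatis;type=a?1#top"))
    # The reference from the report: a scheme is present, net_loc is empty.
    print(components("http:/cgi-bin/manwhatis?1"))
```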

I don't really see the point of this encoding. What is it good for?

Peter

#85509#11
Date:
2001-06-02 00:17:52 UTC
From:
To:
It actually displayed the file after 2 minutes :-)

Yes, the handling of many form entries in a large table is inefficient, and
I'm afraid to change it in this stable release.

Mikulas

#85509#20
Date:
2007-12-29 13:20:07 UTC
From:
To:
Hi,

Do you have another example? This one gives me:

The requested URL /cjenik.htm was not found on this server.