lynx -dump fails if filename is not *.html

Bug #1112568 reported by David Biesack on 2013-02-01

This bug affects 1 person

Affects        Status        Importance  Assigned to  Milestone
lynx (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

lynx -dump url is supposed to strip out the HTML markup and format the text,
but it is not working; it is just echoing the output.

The input is simple HTML:

$ cat >| curl.out <<EOF
<html>
<body><p>Some text</p></body>
</html>
EOF

but when I do

$ lynx -d curl.out

lynx just echoes the HTML markup.

It works if I do:

$ mv curl.out test.html
$ lynx -d test.html

I.e. it appears the command requires the file name to end in .html

I can use

$ lynx -dump -stdin < curl.out

as a workaround but the default should work.

(This is in the context of another script which fetches a web resource
via curl, then looks at the file content and does different things based
on the contents; it is not always HTML, which is why my file name is curl.out)
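
For readers with a similar script, here is a minimal sketch of the content-based dispatch described above. It is not the reporter's actual script: the function names looks_like_html and show_resource, the 1024-byte sniffing window, and the pattern used to recognise HTML are all assumptions made for illustration. The only lynx invocation is the -dump -stdin workaround quoted in this report.

```shell
# Hypothetical helper: succeeds when the start of the file looks like HTML.
looks_like_html() {
    # Inspect only the first 1024 bytes (an arbitrary, assumed window) and
    # match a doctype or an opening <html tag, ignoring case.
    head -c 1024 "$1" | grep -Eiq '<!doctype[[:space:]]+html|<html[[:space:]>]'
}

# Hypothetical dispatcher: render HTML through lynx, pass anything else through.
show_resource() {
    if looks_like_html "$1"; then
        # Workaround from the report: feeding the file on stdin avoids
        # relying on the file name to decide how the content is treated.
        lynx -dump -stdin < "$1"
    else
        cat "$1"
    fi
}
```

With this in place, show_resource curl.out would format HTML responses and print everything else unchanged, whatever the file is called.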

I'm on Ubuntu using:

    $ lynx --version
    Lynx Version 2.8.8dev.9 (12 Jun 2011)
    libwww-FM 2.14, SSL-MM 1.4.1, GNUTLS 2.10.5, ncurses 5.9.20110404(wide)
    Built on linux-gnu Nov 19 2012 15:52:46

Thomas Dickey (dickey-his) wrote on 2013-02-02: Re: [Bug 1112568] [NEW] lynx -dump fails if filename is not *.html

#1

On Fri, Feb 01, 2013 at 02:59:16PM -0000, David Biesack wrote:
> Public bug reported:
>
> lynx -dump url is supposed to strip out the HTML markup and format the text,
> but it is not working; it is just echoing the output.
>
> The input is simple HTML:
>
> $ cat >| curl.out <<EOF
> <html>
> <body><p>Some text</p></body>
> </html>
> EOF
>
> but when I do
>
> $ lynx -d curl.out
>
> lynx just echoes the HTML markup.

man lynx:

-force_html
forces the first document to be interpreted as HTML.
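
Applied to the file from the report, the option quoted above can be wrapped as follows. This is a sketch rather than anything from the lynx documentation: the function name dump_as_html and the LYNX override variable are hypothetical, added only so the command line can be inspected without lynx installed. The -dump and -force_html options are the ones named in this thread.

```shell
# Hypothetical wrapper: dump a local file as formatted text, treating it
# as HTML regardless of its file name.
dump_as_html() {
    # LYNX is an assumed override for testing; by default the real lynx runs.
    "${LYNX:-lynx}" -dump -force_html "$1"
}
```

Calling dump_as_html curl.out is then equivalent to running lynx -dump -force_html curl.out.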

--
Thomas E. Dickey <email address hidden>
http://invisible-island.net
ftp://invisible-island.net

David Biesack (david-biesack-sas) wrote on 2013-02-04:

#2

Thanks for the tip, Thomas.

I still find the behavior confusing; i.e. that the name is important rather than the content.
"-dump dumps the formatted output of the default document"
implies the document is HTML, does not mention other possible interpretations or file formats,
and does not mention -force_html or the *.html convention or why there is
a need for -force_html.

If you do not wish to change the behavior, please consider updating the man page
to document the conventions/assumptions that lynx -dump makes.

Thomas Dickey (dickey-his) wrote on 2013-02-05: Re: [Bug 1112568] Re: lynx -dump fails if filename is not *.html

#3

On Mon, Feb 04, 2013 at 02:41:15PM -0000, David Biesack wrote:
> Thanks for the tip, Thomas.
>
> I still find the behavior confusing; i.e. that the name is important rather than the content.
> "-dump dumps the formatted output of the default document"
> implies the document is HTML, does not mention other possible interpretations or file formats,
> and does not mention -force_html or the *.html convention or why there is
> a need for -force_html.
>
> If you do not wish to change the behavior, please consider updating the man page
> to document the conventions/assumptions that lynx -dump makes.

yes - documentation is a good thing to improve :-)

--
Thomas E. Dickey <email address hidden>
http://invisible-island.net
ftp://invisible-island.net

Thomas Dickey (dickey-his) wrote on 2014-02-02:

#4

This was addressed in Debian #254603

Thomas Dickey (dickey-his) wrote on 2018-08-28:

#5

fwiw, that was 4 years ago, and was in the previous (2.8.8) release.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=254603

Changed in lynx (Ubuntu):
status:	New → Fix Released
