make_links_absolute breaks meta http-equiv

Bug #1419354 reported by sylvain zimmer
This bug affects 1 person
Affects: lxml
Status: Fix Released
Importance: Low
Assigned to: scoder

Bug Description

It seems that making iterlinks() return links inside meta refresh tags broke make_links_absolute.

Sample testcase:

from lxml.html import fromstring, tostring

html = """<html><head><meta http-equiv="refresh" content="0; url=http://example.com/"></head><body></body></html>"""
doc = fromstring(html)
doc.make_links_absolute(base_url="http://www.mypage.com/")

# print() with parentheses runs on both Python 2 and Python 3
print(tostring(doc))

lxml 3.3.6:
<html><head><meta http-equiv="refresh" content="0; url=http://example.com/"></head><body></body></html>

lxml 3.4.2:
<html><head><meta http-equiv="refresh" content="0; urlhttp://www.mypage.com/url=http://example.com/"></head><body></body></html>

More info about my system:
>>> print("%-20s: %s" % ('Python', sys.version_info))
Python : sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
>>> print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
lxml.etree : (3, 3, 6, 0)
>>> print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
libxml used : (2, 9, 1)
>>> print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
libxml compiled : (2, 9, 1)
>>> print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_VERSION))
libxslt used : (1, 1, 28)
>>> print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_COMPILED_VERSION))
libxslt compiled : (1, 1, 28)

>>> print("%-20s: %s" % ('Python', sys.version_info))
Python : sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
>>> print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
lxml.etree : (3, 4, 2, 0)
>>> print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
libxml used : (2, 9, 1)
>>> print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
libxml compiled : (2, 9, 1)
>>> print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_VERSION))
libxslt used : (1, 1, 28)
>>> print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_COMPILED_VERSION))
libxslt compiled : (1, 1, 28)

scoder (scoder) wrote :

Thanks for the report. The problem seems to be the space before the "url=". Without it, it works.
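
To illustrate why the whitespace matters, here is a minimal sketch of rewriting the URL part of a meta refresh content value while tolerating optional spaces around "url=". This is not lxml's implementation; the pattern `_REFRESH_RE` and the helper `absolutize_refresh` are hypothetical names made up for this example, and it assumes Python 3 and only the standard library.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern (not taken from lxml): an optional delay, a semicolon,
# optional whitespace, "url=", and then the URL itself.
_REFRESH_RE = re.compile(
    r"^(?P<prefix>\s*[^;]*;\s*url\s*=\s*)(?P<url>.*?)\s*$",
    re.IGNORECASE,
)


def absolutize_refresh(content, base_url):
    """Return `content` with its URL part resolved against `base_url`."""
    match = _REFRESH_RE.match(content)
    if match is None:
        # No "; url=..." part (e.g. a bare delay), so leave the value untouched.
        return content
    # Drop surrounding quotes, if any, before resolving the URL.
    url = match.group("url").strip("'\"")
    # The prefix keeps the delay, the semicolon and the original spacing.
    return match.group("prefix") + urljoin(base_url, url)


base = "http://www.mypage.com/"
print(absolutize_refresh("0; url=http://example.com/", base))
print(absolutize_refresh("0;url=/next", base))
```

Because the prefix (delay, semicolon, spacing, "url=") is captured as one group and written back unchanged, only the URL itself is replaced, so the space after the semicolon cannot shift where the new URL is inserted.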

Changed in lxml:
importance: Undecided → Low
milestone: none → 3.4
status: New → Confirmed
scoder (scoder) wrote :
Changed in lxml:
assignee: nobody → scoder (scoder)
status: Confirmed → Fix Committed
sylvain zimmer (sylvain-sylvainzimmer) wrote :

Wow, that was fast, thanks!

scoder (scoder) wrote :

Fixed in lxml 3.4.3.

Changed in lxml:
status: Fix Committed → Fix Released